Data Mover Light instructions
Data Mover Light (DML) is a Java-based client that can be used in conjunction with the ESG portal to download a large
number of files to a user's personal desktop, thus avoiding the need to click on multiple hyperlinks.
You can use DML on any computer that has Java 1.5 or higher installed, including Unix systems (Linux, Solaris and MacOS 10.X) and Windows systems.
Please note: if you encounter any problem, please contact ESG.
DML is a product of the Scientific Data Management Research Group at LBNL.
- Step 1: Install DML (first time only)
- Download DML from the URL: http://datagrid.lbl.gov/dml/. Click on "download DML". After agreeing to the license conditions, click on the latest dml-x.x.x.tar.gz.
- Untar the distribution dml-2.2.3.tar.gz and put it in any directory you choose. This will create a directory called "dml".
- Please make sure you have JDK 1.5.x installed on your system.
- On Unix systems only (not necessary on Windows), set the following two environment variables:
- setenv JAVA_HOME <path where Java is installed>
- setenv DML_HOME <path where you have untarred the DML distribution file>
- Make sure that the DML installation directory contains a file under the dml/conf directory called lahfs.properties with the following two entries:
- ncar.scd.lahfs.authenticationMode=plain
- ncar.scd.lahfs.loginType=MYPROXY
- Note: the README/instructions file in the DML distribution directory (http://datagrid.lbl.gov/dml/) contains instructions for setting up more complicated transfers through gsiFTP and FTP, which you shouldn't need if you just want to download data through the ESG portal.
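The check of lahfs.properties described in Step 1 can be automated. The following is a minimal sketch, not part of DML: the function names are hypothetical, the parser handles only plain key=value lines (not the full Java properties syntax), and dml_home is assumed to be the "dml" directory created by untarring the distribution.

```python
from pathlib import Path

# The two entries that Step 1 says must be present in dml/conf/lahfs.properties.
REQUIRED_ENTRIES = {
    "ncar.scd.lahfs.authenticationMode": "plain",
    "ncar.scd.lahfs.loginType": "MYPROXY",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def missing_entries(text):
    """Return the required entries that are absent or set to another value."""
    found = parse_properties(text)
    return {k: v for k, v in REQUIRED_ENTRIES.items() if found.get(k) != v}


def check_install(dml_home):
    """Check <dml_home>/conf/lahfs.properties (dml_home: assumed 'dml' directory)."""
    props = Path(dml_home) / "conf" / "lahfs.properties"
    if not props.is_file():
        # No file at all: every required entry is missing.
        return dict(REQUIRED_ENTRIES)
    return missing_entries(props.read_text())
```

An empty result from check_install means both entries are in place; otherwise the returned dictionary lists the lines to add to the file.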
- Step 2: Request data through the ESG portal
- Browse the data catalogs on the ESG portal and select the desired files by clicking on the corresponding checkboxes. The portal will ask you for your login and password.
- When you are done selecting the files, click on "Go To Data Cart" to access the Data Cart management page. Here you can remove older files from your Data Cart, or empty it altogether.
- Click on "Transfer Data" to start formulating a data transfer request. Review your selection (and uncheck any undesired files), then click on "Submit". This will start the retrieval of the files from the remote storage (at NCAR MSS, NERSC or ORNL HPSS) and their transfer to the disk cache at NCAR. Some files might already be available on the cache.
- When all files are ready for download, an email will be sent to you containing a hyperlink to a web page with information about your request.
Click on the hyperlink, and it will ask you again for your login and password.
- If the request was successfully completed, click on "Generate XML" (under bullet no. 2.) to generate an XML file that will be used by DML to download all files to your local machine.
Save it on your computer as a file with the extension .xml; this file is referred to here as <XML files list>. It can be placed anywhere and given any name, as long as the name ends in .xml.
The default name for that file is: dataTransferSerializer.xml. NOTE: depending on which browser you are using, you might need to view and save the web page source to obtain the <XML files list>.
This usually works OK with Microsoft Internet Explorer.
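Because some browsers save a rendered page rather than the raw XML, it can help to confirm that the saved <XML files list> is well-formed before loading it into DML. The sketch below is an illustration only: the function name is hypothetical, and it checks just the file extension and XML well-formedness, making no assumption about the elements DML expects inside the file.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_xml_file_list(path):
    """Return None if the file looks usable, else a short problem description."""
    path = Path(path)
    if path.suffix.lower() != ".xml":
        return "file name does not end in .xml"
    try:
        # Parsing succeeds only if the file is well-formed XML.
        ET.parse(str(path))
    except ET.ParseError as err:
        return f"not well-formed XML: {err}"
    except OSError as err:
        return f"cannot read file: {err}"
    return None
```

If the function reports that the file is not well-formed, saving the page source instead, as described in the note above, is the first thing to try.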
- Step 3: Use DML to download the requested data
- Unix systems: cd to the DML installation directory, then to subdirectory "bin". Then, type dml-gui at the command prompt to start DML.
- Windows systems: use the Windows File Explorer tool to go to the DML installation directory, and then to subdirectory "bin". Double click on the file win-dml.bat to start the DML Graphical User Interface.
- The DML Graphical User Interface will appear.
Use the menu File -> Load to select the <XML files list> containing your data request.
- Enter the target directory at the top of the window, e.g. /tmp or c:\
- Click on the button "Transfer" to start transferring files. You will be prompted for your ESG portal username and password.
(If you wish to avoid the prompt, you can enter your login and password in the lahfs.properties file; make sure to remove the # (comment symbol) in front of the login and password lines.)
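The platform-specific start-up described in Step 3 can be wrapped in a small helper. This is a sketch under assumptions: the function names are hypothetical, dml_home is assumed to be the DML installation directory that contains "bin", and only the script names given above (dml-gui and win-dml.bat) are used.

```python
import os
import platform
import subprocess


def dml_launcher(dml_home, system=None):
    """Return the path of the start script in <dml_home>/bin for this platform."""
    system = system or platform.system()
    # win-dml.bat on Windows, dml-gui on Unix systems, as in Step 3.
    script = "win-dml.bat" if system == "Windows" else "dml-gui"
    return os.path.join(dml_home, "bin", script)


def start_dml(dml_home):
    """Start the DML GUI from its bin directory and return the process handle."""
    launcher = os.path.abspath(dml_launcher(dml_home))
    bin_dir = os.path.dirname(launcher)
    # On Windows the .bat file must be run through the command interpreter.
    use_shell = platform.system() == "Windows"
    return subprocess.Popen(launcher, cwd=bin_dir, shell=use_shell)
```

Calling start_dml with the installation directory opens the same Graphical User Interface as typing dml-gui or double-clicking win-dml.bat; loading the XML file and choosing the target directory still happen in the GUI.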
Hints
- Hint 1: if you have lots of files to download, you can speed up the process by downloading files concurrently. To do that, use the menu Options -> Concurrent Transfers ... and set the number to at most 10 (even though it lets you choose higher numbers).
- Hint 2: after transfer starts, you can see a summary status of your file transfers when you open the file dataTransferSerializer-status.txt in the same directory where you put dataTransferSerializer.xml.
- Hint 3: if you want to save all the information shown in the DML GUI, including the source and target URLs of the files, the file sizes, and the time taken to transfer each file, use menu File -> Save or Save As. If you use Save, it will save a file called dataTransferSerializer-report.xml in the same directory where you put dataTransferSerializer.xml.
- Hint 4: under menu Tools there are options for using your GSI certificate in case you wish to use GridFTP.
- Hint 5: Ignore all other menu Operations and Options items. These are for advanced administrator functions.
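Hints 2 and 3 name two companion files that DML writes next to the XML file. The sketch below locates them and prints the status summary. It assumes, based only on the default file name shown above, that the companions are named by appending -status.txt and -report.xml to the XML file's base name; the function names are hypothetical.

```python
from pathlib import Path


def companion_files(xml_path):
    """Return the (status, report) paths expected next to the XML file list."""
    xml_path = Path(xml_path)
    # Assumed naming pattern, inferred from dataTransferSerializer.xml only.
    status = xml_path.with_name(xml_path.stem + "-status.txt")
    report = xml_path.with_name(xml_path.stem + "-report.xml")
    return status, report


def print_status(xml_path):
    """Print the transfer status summary if DML has written it yet."""
    status, _ = companion_files(xml_path)
    if status.is_file():
        print(status.read_text())
    else:
        print(f"No status file yet: {status}")
```

Running print_status on the path of dataTransferSerializer.xml after the transfer has started shows the same summary as opening the status file by hand.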