You might also try using info:load, which loads things in batches. http://docs.marklogic.com/4.2doc/docapp.xqy#display.xqy?fname=http://pubs/4.2doc/apidoc/info.xml&category=Information%20Studio&function=info:load
-Danny

From: [email protected] [mailto:[email protected]] On Behalf Of Damon Feldman
Sent: Tuesday, April 19, 2011 10:59 AM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Loading xml files in mark logic server

Rajesh,

Each module invocation, such as yours below, runs as a single transaction with all the data in memory. For thousands of XML documents, you should break the work up into smaller chunks.

The InformationStudio flows available in version 4.2 will do this automatically, and also provide a nice GUI for viewing progress, unloading the data later, and checking on errors.

Also, the Java-based RecordLoader utility (http://developer.marklogic.com/code/recordloader, http://marklogic.github.com/recordloader/tutorial.html) will insert documents in smaller chunks. It does not provide all the power of InformationStudio, but can be faster in some instances.

Yours,
Damon

________________________________
From: [email protected] [[email protected]] On Behalf Of Rajesh Marklogic [[email protected]]
Sent: Tuesday, April 19, 2011 1:03 PM
To: [email protected]
Subject: [MarkLogic Dev General] Loading xml files in mark logic server

Hi

We are trying to load 14 million XML files into a MarkLogic database. The xdmp:document-load script below could load a maximum of 5000 XML files at a time. Anything more than 5000 XML files threw memory exceptions.

    xquery version "1.0-ml";
    let $files := xdmp:filesystem-directory("/filePath/")
    for $filepath in $files//dir:entry[1 to 5000]
    return (xdmp:document-load($filepath//dir:pathname,
        <options xmlns="xdmp:document-load">
          <uri>{$filepath//dir:filename/text()}</uri>
          <permissions>{xdmp:default-permissions()}</permissions>
          <format>xml</format>
          <repair>none</repair>
        </options>))

Are there any configuration changes required in the admin settings to load all 14 million XML files in 3 to 4 hours?

The total size of the content will be around 4 GB, and we have a Unix server with 250 GB of memory (RAM).

It would be great if you could suggest the best approach to load all 14 million XML files within the 3-4 hour time frame.

Thanks and Regards
Rajesh
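
For readers sizing the chunked approach Damon describes, here is a minimal client-side sketch in Python. It is not how InformationStudio or RecordLoader work internally. It only illustrates splitting a file list into fixed-size batches, one batch per transaction, and the arithmetic behind the numbers in this thread. The function names (chunked, batch_count, required_rate) are hypothetical, and the step that actually sends each batch to the server is deliberately left out.

```python
from typing import Iterator, List, Sequence


def chunked(paths: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each holding at most `size` items.

    Each yielded list is meant to be loaded as one transaction, so the
    server never holds more than `size` documents in memory at once.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


def batch_count(total: int, size: int) -> int:
    """Number of batches needed to cover `total` items (ceiling division)."""
    return -(-total // size)


def required_rate(total: int, hours: float) -> float:
    """Average documents per second needed to load `total` in `hours`."""
    return total / (hours * 3600)


if __name__ == "__main__":
    # Figures from the original post: 14 million files, and 5000 files
    # as the largest batch that loaded without memory errors.
    print(batch_count(14_000_000, 5000))
    print(round(required_rate(14_000_000, 4)))
    print(round(required_rate(14_000_000, 3)))
```

With the figures from the original post, 5000-document batches mean 2800 separate transactions, and the 3 to 4 hour target works out to roughly 970 to 1,300 documents per second sustained. Whether a single serial loader can hold that rate is not something this thread establishes; running several batches in parallel is the usual reason to reach for a tool such as RecordLoader.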
_______________________________________________ General mailing list [email protected] http://developer.marklogic.com/mailman/listinfo/general
