Hi Danny,

I was able to load the documents using info:load, and it worked fine the first time. On the second run, however, it stopped after loading 100,000 records, and when I tried again it stopped after loading 65K records.

I cleared the forest before each load. Can you help me figure out what is causing this? I couldn't find the log file in opt/marklogic/logs (I am working on a Unix server).

Thanks and Regards
Rajesh

On Tue, Apr 19, 2011 at 11:31 PM, Danny Sokolsky <[email protected]> wrote:

> You might also try using info:load, which loads things in batches.
>
> http://docs.marklogic.com/4.2doc/docapp.xqy#display.xqy?fname=http://pubs/4.2doc/apidoc/info.xml&category=Information%20Studio&function=info:load
>
> -Danny
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Damon Feldman
> *Sent:* Tuesday, April 19, 2011 10:59 AM
> *To:* General MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] Loading xml files in mark logic server
>
> Rajesh,
>
> Each module invoke such as yours below runs as a single transaction with
> all the data in memory. For thousands of XML documents, you should break the
> work up into smaller chunks.
>
> The InformationStudio flows available in version 4.2 will do this
> automatically, and also provide a nice GUI for viewing progress, unloading
> the data later, and checking on errors.
>
> Also, the Java-based RecordLoader utility (
> http://developer.marklogic.com/code/recordloader,
> http://marklogic.github.com/recordloader/tutorial.html) will insert
> documents in smaller chunks. It does not provide all the power of
> InformationStudio, but can be faster in some instances.
>
> Yours,
> Damon
>
> ------------------------------
>
> *From:* [email protected] [[email protected]] On Behalf Of Rajesh Marklogic [[email protected]]
> *Sent:* Tuesday, April 19, 2011 1:03 PM
> *To:* [email protected]
> *Subject:* [MarkLogic Dev General] Loading xml files in mark logic server
>
> Hi
>
> We are trying to load 14 million XML files into a MarkLogic database. The
> xdmp:document-load script below could load a maximum of 5000 XML files at a time.
> Anything more than 5000 XML files threw memory exceptions.
>
> xquery version "1.0-ml";
>
> let $files:=xdmp:filesystem-directory("/filePath/")
> for $filepath in $files//dir:entry[1 to 5000]
> return (xdmp:document-load($filepath//dir:pathname,
>     <options xmlns="xdmp:document-load">
>       <uri>{$filepath//dir:filename/text()}</uri>
>       <permissions>{xdmp:default-permissions()}</permissions>
>       <format>xml</format>
>       <repair>none</repair>
>     </options>))
>
> Are there any configuration changes required in the admin settings to load all
> 14 million XML files in 3 to 4 hours? The total size of the content
> will be around 4GB, and we have a Unix server with 250 GB of memory (RAM).
>
> It would be great if you could suggest the best approach to load all 14
> million XML files within a time frame of 3-4 hours.
>
> Thanks and Regards
>
> Rajesh
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
