Hi Damon,

Using RecordLoader, I was able to load the million XML documents successfully. The total size of the documents is 40 MB, but the forest size has grown to 70 MB.
Any idea why the forest is almost double the actual file size?

Thanks and Regards
Rajesh Govindan

On Tue, Apr 19, 2011 at 11:28 PM, Damon Feldman <[email protected]> wrote:

> Rajesh,
>
> Each module invocation such as yours below runs as a single transaction with
> all the data in memory. For thousands of XML documents, you should break the
> work up into smaller chunks.
>
> The InformationStudio flows available in version 4.2 will do this
> automatically, and also provide a nice GUI for viewing progress, unloading
> the data later, and checking on errors.
>
> Also, the Java-based RecordLoader utility
> (http://developer.marklogic.com/code/recordloader,
> http://marklogic.github.com/recordloader/tutorial.html) will insert
> documents in smaller chunks. It does not provide all the power of
> InformationStudio, but can be faster in some instances.
>
> Yours,
> Damon
>
> ------------------------------
> *From:* [email protected] [[email protected]] On Behalf Of Rajesh Marklogic [[email protected]]
> *Sent:* Tuesday, April 19, 2011 1:03 PM
> *To:* [email protected]
> *Subject:* [MarkLogic Dev General] Loading xml files in mark logic server
>
> Hi
>
> We are trying to load 14 million XML files into a MarkLogic database. The
> xdmp:document-load script below could load at most 5000 XML files at a time.
> Anything more than 5000 XML files threw memory exceptions.
>
> xquery version "1.0-ml";
>
> let $files := xdmp:filesystem-directory("/filePath/")
> for $filepath in $files//dir:entry[1 to 5000]
> return (xdmp:document-load($filepath//dir:pathname,
>     <options xmlns="xdmp:document-load">
>       <uri>{$filepath//dir:filename/text()}</uri>
>       <permissions>{xdmp:default-permissions()}</permissions>
>       <format>xml</format>
>       <repair>none</repair>
>     </options>))
>
> Are any configuration changes required in the admin settings to load all
> 14 million XML files in 3 to 4 hours? The total size of the content
> will be around 4 GB, and we have a Unix server with 250 GB of memory (RAM).
>
> It would be great if you could suggest the best approach to load all
> 14 million XML files within the 3-4 hour time frame.
>
> Thanks and Regards
>
> Rajesh
>
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
