Thanks for your reply.

We haven't configured any indexes for the database yet. Do the default
indexes double the size?

Regards

Rajesh

On Tue, Apr 26, 2011 at 10:30 PM, Jason Hunter <[email protected]> wrote:

> The extra space is for the indexes.
>
> -jh-
>
> On Apr 26, 2011, at 9:51 AM, Rajesh Marklogic wrote:
>
> Hi Damon,
>
> Using RecordLoader, I was able to load the million XML documents
> successfully. The total size of the documents is 40 MB, but the forest size
> has grown to 70 MB.
>
> Any idea why the forest is nearly double the actual file size?
>
> Thanks and Regards
>
> Rajesh Govindan
>
> On Tue, Apr 19, 2011 at 11:28 PM, Damon Feldman <
> [email protected]> wrote:
>
>>  Rajesh,
>>
>> Each module invocation, such as yours below, runs as a single transaction
>> with all the data in memory. For thousands of XML documents, you should
>> break the work up into smaller chunks.
>>
>> The InformationStudio flows available in version 4.2 will do this
>> automatically, and also provide a nice GUI for viewing progress, unloading
>> the data later, and checking on errors.
>>
>> Also, the Java-based RecordLoader utility (
>> http://developer.marklogic.com/code/recordloader,
>> http://marklogic.github.com/recordloader/tutorial.html) will insert
>> documents in smaller chunks. It does not provide all the power of
>> InformationStudio, but can be faster in some instances.
>>
>> Yours,
>> Damon
>>
>>  ------------------------------
>> *From:* [email protected] [
>> [email protected]] On Behalf Of Rajesh Marklogic [
>> [email protected]]
>> *Sent:* Tuesday, April 19, 2011 1:03 PM
>> *To:* [email protected]
>> *Subject:* [MarkLogic Dev General] Loading xml files in mark logic server
>>
>>  Hi
>>
>>  We are trying to load 14 million XML files into a MarkLogic database. The
>> xdmp:document-load script below can load at most 5000 XML files at a time.
>>  Anything more than 5000 XML files throws memory exceptions.
>>
>>  xquery version "1.0-ml";
>>
>>  let $files := xdmp:filesystem-directory("/filePath/")
>>  for $filepath in $files//dir:entry[1 to 5000]
>>  return
>>    xdmp:document-load($filepath//dir:pathname,
>>      <options xmlns="xdmp:document-load">
>>        <uri>{$filepath//dir:filename/text()}</uri>
>>        <permissions>{xdmp:default-permissions()}</permissions>
>>        <format>xml</format>
>>        <repair>none</repair>
>>      </options>)
>>
>>
>>  Are any configuration changes required in the admin settings to load all
>> 14 million XML files in 3 to 4 hours? The total size of the content
>> will be around 4 GB, and we have a Unix server with 250 GB of memory (RAM).
>>
>>  It would be great if you could suggest the best approach for loading all 14
>> million XML files within a time frame of 3-4 hours.
>>
>>  Thanks and Regards
>>
>>  Rajesh
>>
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
>>
>>
