A number of indexes are enabled by default. You have admin control over 
them, but MarkLogic chooses a good default set. A doubling of size after 
loading isn't abnormal. In fact, compared to other products it's actually 
very good: MarkLogic works hard to provide so many rich indexes at such a 
small size. 
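
As a rough check on the figures reported further down this thread (40 MB of source XML, 70 MB of forest), here is a small Python sketch of the arithmetic. The function name `expansion_ratio` is made up for illustration, and carrying the same ratio over to the planned 4 GB load is only an assumption: real index overhead depends on the content and on which indexes are enabled.

```python
def expansion_ratio(source_mb: float, forest_mb: float) -> float:
    """Return on-disk forest size divided by raw source size."""
    if source_mb <= 0:
        raise ValueError("source size must be positive")
    return forest_mb / source_mb


# Figures quoted in this thread: 40 MB of XML became a 70 MB forest.
ratio = expansion_ratio(40, 70)
print(f"expansion ratio: {ratio:.2f}x")

# Assumption: the same ratio holds for the planned 4 GB load.
projected_gb = 4 * ratio
print(f"projected forest size for 4 GB of XML: {projected_gb:.1f} GB")
```

So the reported growth is somewhat under a doubling, which is in line with the point above.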

On Apr 26, 2011, at 8:03 PM, Rajesh Marklogic <[email protected]> 
wrote:

> Thanks for your reply.
> 
> We haven't configured any indexes for the database yet. Do the default 
> indexes double the size?
> 
> regards
> 
> Rajesh
> 
> On Tue, Apr 26, 2011 at 10:30 PM, Jason Hunter <[email protected]> wrote:
> The extra space is for the indexes.
> 
> -jh-
> 
> On Apr 26, 2011, at 9:51 AM, Rajesh Marklogic wrote:
> 
>> Hi Damon,
>> 
>> Using RecordLoader, I was able to upload the million XML documents 
>> successfully. The total size of the documents is 40 MB, but the forest size 
>> has grown to 70 MB.
>> 
>> Any idea why the forest is nearly double the actual file size?
>> 
>> Thanks and Regards
>> 
>> Rajesh Govindan
>> 
>> On Tue, Apr 19, 2011 at 11:28 PM, Damon Feldman 
>> <[email protected]> wrote:
>> Rajesh,
>>  
>> Each module invocation such as yours below runs as a single transaction with all 
>> the data in memory. For thousands of XML documents, you should break the 
>> work up into smaller chunks.
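
A minimal sketch of that chunking idea, written in Python purely as a language-neutral illustration and not as MarkLogic code. The helper `batched` and the callback `load_batch` are hypothetical names: `load_batch` stands in for whatever issues one separate transaction per chunk (for example, one module invocation per batch). The batch size of 500 is an arbitrary assumption, not a recommendation from this thread.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def load_in_batches(paths: Sequence[str],
                    load_batch: Callable[[List[str]], None],
                    size: int = 500) -> int:
    """Call `load_batch` once per chunk so no single call sees every file.

    Returns the number of batches submitted.
    """
    count = 0
    for chunk in batched(paths, size):
        load_batch(chunk)  # hypothetical: one transaction per chunk
        count += 1
    return count


# Stand-in file list and loader, for demonstration only.
demo_paths = [f"/filePath/doc{i}.xml" for i in range(1, 1201)]
batch_sizes: List[int] = []
submitted = load_in_batches(demo_paths, lambda chunk: batch_sizes.append(len(chunk)))
print(submitted, batch_sizes)
```

The point of the design is that each call handles a bounded number of documents, so memory use per transaction stays roughly constant no matter how many files there are in total.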
>>  
>> The InformationStudio flows available in version 4.2 will do this 
>> automatically, and also provide a nice GUI for viewing progress, unloading 
>> the data later, and checking on errors.
>>  
>> Also, the Java-based RecordLoader utility 
>> (http://developer.marklogic.com/code/recordloader, 
>> http://marklogic.github.com/recordloader/tutorial.html) will insert 
>> documents in smaller chunks. It does not provide all the power of 
>> InformationStudio, but can be faster in some instances.
>>  
>> Yours,
>> Damon
>>  
>> From: [email protected] 
>> [[email protected]] On Behalf Of Rajesh Marklogic 
>> [[email protected]]
>> Sent: Tuesday, April 19, 2011 1:03 PM
>> To: [email protected]
>> Subject: [MarkLogic Dev General] Loading xml files in mark logic server
>> 
>> Hi 
>> 
>> We are trying to load 14 million XML files into a MarkLogic database. The 
>> xdmp:document-load script below could load a maximum of 5000 XML files at a 
>> time. Anything more than 5000 XML files threw memory exceptions.
>> 
>> xquery version "1.0-ml";
>> 
>> (: Load the first 5000 entries of the directory in one request. :)
>> let $files := xdmp:filesystem-directory("/filePath/")
>> for $filepath in $files//dir:entry[1 to 5000]
>> return
>>   xdmp:document-load($filepath//dir:pathname,
>>     <options xmlns="xdmp:document-load">
>>       <uri>{$filepath//dir:filename/text()}</uri>
>>       <permissions>{xdmp:default-permissions()}</permissions>
>>       <format>xml</format>
>>       <repair>none</repair>
>>     </options>)
>> 
>> 
>> Are any configuration changes required in the admin settings to load all 
>> 14 million XML files in 3 to 4 hours? The total size of the content will be 
>> around 4 GB, and we have a Unix server with 250 GB of memory (RAM).
>> 
>> It would be great if you could suggest the best approach to load all 14 
>> million XML files in a time frame of 3-4 hours.
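
For a sense of what that target implies, here is a back-of-the-envelope calculation in Python. It uses only the figures from the question (14 million documents, a 3 to 4 hour window); the function name `required_docs_per_second` is made up for illustration, and the result is the sustained average rate, ignoring startup time and any merge or reindex activity.

```python
def required_docs_per_second(total_docs: int, hours: float) -> float:
    """Sustained ingest rate needed to finish `total_docs` in `hours`."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return total_docs / (hours * 3600)


TOTAL_DOCS = 14_000_000  # figure from the question above

for hours in (3, 4):
    rate = required_docs_per_second(TOTAL_DOCS, hours)
    print(f"{hours} h window: about {rate:,.0f} documents/second")
```

That works out to somewhere between roughly 970 and 1,300 documents per second, sustained for the whole load.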
>> 
>> Thanks and Regards
>> 
>> Rajesh 
>> 
>> _______________________________________________
>> General mailing list
>> [email protected]
>> http://developer.marklogic.com/mailman/listinfo/general
