What is the total no. of docs created? I guess it may not be memory
bound; indexing is mostly an IO-bound operation. You may be able to
get better performance if an SSD (solid state drive) is used.
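
As a quick check, the DIH status command should show how far the import
has gotten (a rough sketch, assuming the handler is registered at
/dataimport as in the full-import URL quoted below):

  http://<host>:<port>/solr/dataimport?command=status

The status response reports the documents processed so far and the time
elapsed, so you can work out the indexing rate from that.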

On Fri, May 22, 2009 at 10:46 AM, Jianbin Dai <djian...@yahoo.com> wrote:
>
> Hi Paul,
>
> Thank you so much for answering my questions. It really helped.
> After some adjustment, basically setting mergeFactor to 1000 from the default
> value of 10, I could finish the whole job in 2.5 hours. I checked that during
> the run, only around 18% of memory was being used, and VIRT was always
> 1418m. I am thinking it may be restricted by the JVM memory setting. But I run
> the data import command through the web, i.e.,
> http://<host>:<port>/solr/dataimport?command=full-import, so how can I set the
> memory allocation for the JVM?
> Thanks again!
>
> JB
>
> --- On Thu, 5/21/09, Noble Paul നോബിള്‍  नोब्ळ् <noble.p...@corp.aol.com> 
> wrote:
>
>> From: Noble Paul നോബിള്‍  नोब्ळ् <noble.p...@corp.aol.com>
>> Subject: Re: How to index large set data
>> To: solr-user@lucene.apache.org
>> Date: Thursday, May 21, 2009, 9:57 PM
>> Check the status page of DIH and see if it is working
>> properly, and if it is, what the indexing rate is.
>>
>> On Thu, May 21, 2009 at 11:48 AM, Jianbin Dai <djian...@yahoo.com>
>> wrote:
>> >
>> > Hi,
>> >
>> > I have about 45GB of XML files to be indexed. I am using
>> > DataImportHandler. I started the full import 4 hours ago,
>> > and it's still running....
>> > My computer has 4GB of memory. Any suggestions?
>> > Thanks!
>> >
>> > JB
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>> --
>> -----------------------------------------------------
>> Noble Paul | Principal Engineer| AOL | http://aol.com
>>
>
>
>
>
>
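
On the JVM memory question above: the heap size is not set through the
dataimport URL; it is set on the servlet container that runs Solr. A
rough sketch, assuming the example Jetty that ships with Solr (the -Xms
and -Xmx values here are only illustrative, tune them for your machine):

  java -Xms512m -Xmx2048m -jar start.jar

If Solr runs under Tomcat instead, the same flags go into JAVA_OPTS or
CATALINA_OPTS before starting Tomcat.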



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
