Hi Paul,

Thank you so much for answering my questions. It really helped.
After some adjustment, basically raising mergeFactor from the default 
value of 10 to 1000, I can finish the whole job in 2.5 hours. I checked that 
while the import was running, only around 18% of memory was being used, and 
VIRT was always 1418m. I think it may be restricted by the JVM memory setting. 
But I run the data import command through the web, i.e.,
http://<host>:<port>/solr/dataimport?command=full-import, so how can I set the 
memory allocation for the JVM? 
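
One way to confirm whether the JVM heap limit is really the constraint is to print what the running JVM reports. The following is only a minimal sketch: the class name HeapCheck and the helper toMiB are made up for illustration, and it reports the limits of the JVM it is run in, so to see Solr's own limits the same Runtime calls would have to run inside the servlet container's JVM.

```java
// Hypothetical helper class (not part of Solr) that prints JVM heap limits.
class HeapCheck {
    // Convert a byte count to whole mebibytes.
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most heap the JVM will attempt to use (roughly -Xmx).
        System.out.println("max heap (MiB):   " + toMiB(rt.maxMemory()));
        // totalMemory(): heap currently reserved by the JVM.
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory(): unused part of the currently reserved heap.
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

The heap ceiling is fixed when the JVM is started, not by the HTTP request that triggers the import, so it would be set with the standard -Xmx option on the command that launches the servlet container. For example, assuming Solr is run from the bundled Jetty example, something like "java -Xmx2g -jar start.jar"; the 2g value is only an illustration.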
Thanks again!

JB

--- On Thu, 5/21/09, Noble Paul നോബിള്‍  नोब्ळ् <noble.p...@corp.aol.com> wrote:

> From: Noble Paul നോബിള്‍  नोब्ळ् <noble.p...@corp.aol.com>
> Subject: Re: How to index large set data
> To: solr-user@lucene.apache.org
> Date: Thursday, May 21, 2009, 9:57 PM
> Check the status page of DIH and see
> if it is working properly. And
> if yes, what is the rate of indexing?
> 
> On Thu, May 21, 2009 at 11:48 AM, Jianbin Dai <djian...@yahoo.com>
> wrote:
> >
> > Hi,
> >
> > I have about 45GB xml files to be indexed. I am using
> > DataImportHandler. I started the full import 4 hours ago,
> > and it's still running....
> > My computer has 4GB memory. Any suggestion on the
> > solutions?
> > Thanks!
> >
> > JB
> >
> >
> >
> >
> >
> 
> 
> 
> -- 
> -----------------------------------------------------
> Noble Paul | Principal Engineer| AOL | http://aol.com
> 



