You can start a fixedThreadPool to index all these files in the multhread
way. Every thread execute an index task which could index a part of all the
files. In the index task, when indexing 10000 files, you need execute the
indexWrite.commit() method to flush all the index add operation to disk
file.

If you need index all these files into only one index file, you need to hold
only one indexWriter instance among all the index thread.

Hope it's helpful.



On Wed, Oct 20, 2010 at 1:05 PM, Sahin Buyrukbilen <
sahin.buyrukbi...@gmail.com> wrote:

> Thank you Johnbin,
> do you know which parameter I have to play with?
>
> On Wed, Oct 20, 2010 at 12:59 AM, Johnbin Wang <johnbin.w...@gmail.com
> >wrote:
>
> > I think you can write index file once every 10,000 files or less have
> been
> > read.
> >
> > On Wed, Oct 20, 2010 at 12:11 PM, Sahin Buyrukbilen <
> > sahin.buyrukbi...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I have to index about 4.5Million txt files. When I run the my indexing
> > > application through Eclipse, I get this error : "Exception in thread
> > "main"
> > > java.lang.OutOfMemoryError: Java heap space"
> > >
> > > eclipse -vmargs -Xmx2000m -Xss8192k
> > >
> > > eclipse -vmargs -Xms40M -Xmx2G
> > >
> > >  I tried running Eclipse with above memory parameters, but still had
> the
> > > same error. The architecture of my computer is AMD x2 64bit 2GHz
> > processor,
> > > Ubuntu 10.04 LTS 64bit. java-6-openjdk.
> > >
> > > Anybody has a suggestion?
> > >
> > > thank you.
> > > Sahin.
> > >
> >
> >
> >
> > --
> > cheers,
> > Johnbin Wang
> >
>



-- 
cheers,
Johnbin Wang

Reply via email to