The most recent article about Lucene published on talks exactly about this type of stuff.  It
should answer your questions from this email.


--- Vince Taluskie <[EMAIL PROTECTED]> wrote:
> Howdy All,
> I am interested in several things to improve the speed of my indexing.
> First would be to find out if it's possible (as well as how) to merge
> Lucene indexes of similarly structured documents (same number and type
> of fields), or to coordinate several machines updating the same index.
> For my application (an estimated 360M Lucene documents across 30k
> physical files), I'd like to parallelize the indexing across as many
> CPUs as I can and then merge the results back together, or use a
> MultiSearcher across all the individual indexes if merging is not an
> option.
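
A rough sketch of the "split the work, index each part separately, combine
afterwards" idea from the paragraph above, in plain Java with no Lucene
dependency. The class name ShardedIndexing and the indexShard() method are
hypothetical stand-ins: in a real program each shard would be written by its
own IndexWriter into its own index directory, and the resulting indexes would
then be merged or searched together. Lucene's IndexWriter has offered an
addIndexes(Directory[]) method for that merge step, but check the API docs
for the version you run before relying on it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ShardedIndexing {

    /** Splits the input files round-robin into the requested number of shards. */
    public static List<List<String>> partition(List<String> files, int shards) {
        if (shards <= 0) {
            throw new IllegalArgumentException("shards must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        for (int i = 0; i < shards; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < files.size(); i++) {
            result.get(i % shards).add(files.get(i));
        }
        return result;
    }

    /**
     * Hypothetical stand-in for per-shard indexing. Real code would open an
     * index writer on a per-shard directory and add one document per record.
     * Here it only reports how many files the shard was given.
     */
    static int indexShard(int shardId, List<String> files) {
        return files.size();
    }

    /** Runs one indexing task per shard in parallel; returns total files handled. */
    public static int indexInParallel(List<String> files, int shards) {
        ExecutorService pool = Executors.newFixedThreadPool(shards);
        try {
            List<List<String>> parts = partition(files, shards);
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < parts.size(); i++) {
                final int shardId = i;
                final List<String> part = parts.get(i);
                Callable<Integer> task = () -> indexShard(shardId, part);
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                try {
                    total += future.get();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Round-robin assignment is only one possible split. If the 30k input files
differ a lot in size, balancing shards by size would probably keep the CPUs
busier.
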
> Secondly, I'd like to know more about performing indexing in a
> RAMDirectory and flushing those indexes back out to an FSDirectory.
> I was performing some tests of indexing on a Solaris-based machine, and
> my indexing speed went up by a factor of 3 when I pointed my indexing
> program to store its index in a tmpfs (RAM-based) filesystem rather
> than on a physical disk. So I would imagine that I'd see a similar
> speedup with a RAMDirectory, and it would be portable to non-Solaris
> machines as well. Would it be as simple as getting a list() from the
> RAMDir, then an openFile() on each file and writing that stream out to
> disk?
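
The file-by-file copy described above can be sketched in plain Java as
follows. This is only an illustration of the loop, not Lucene code: a
Map<String, byte[]> stands in for the in-memory directory (an assumption made
so the example runs without Lucene), and the class name RamToDiskCopy is
hypothetical. With Lucene itself, merging the RAM-based index into a
disk-based IndexWriter via addIndexes(Directory[]) is likely the safer route
than copying raw files, but verify that against the API docs for your
version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

public class RamToDiskCopy {

    /**
     * Writes every in-memory file into targetDir, creating the directory if
     * needed, and returns the number of files written.
     * The map (file name to contents) is a stand-in for a RAM-based directory.
     */
    public static int flush(Map<String, byte[]> ramFiles, Path targetDir) {
        try {
            Files.createDirectories(targetDir);
            int written = 0;
            for (Map.Entry<String, byte[]> entry : ramFiles.entrySet()) {
                // One output file per in-memory file, same name, same bytes.
                Files.write(targetDir.resolve(entry.getKey()), entry.getValue());
                written++;
            }
            return written;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that a raw copy like this is only consistent if nothing is still
writing to the in-memory index while the copy runs.
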
> Thanks,
> Vince Taluskie
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]


