You might want to consider using LuSql, a high-performance,
multithreaded, well-documented tool designed specifically for moving
data from a JDBC database into Lucene (you didn't say whether yours is
a JDBC-accessible db...)
 http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

Disclosure: I am the author of LuSql.
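
On the multi-threading question itself: Lucene's IndexWriter is
thread-safe, so several threads can call addDocument() on one shared
writer. Below is a rough sketch of fanning your id ranges out to a
thread pool. It is not LuSql code; the class and method names are made
up for illustration, and indexBatch() is a hypothetical placeholder
where your JDBC query and addDocument() calls would go (here it only
counts ids, so the sketch runs without a database or Lucene).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BatchIndexSketch {

    /** Inclusive id range [from, to] handled by one worker task. */
    public static final class IdRange {
        public final long from;
        public final long to;

        IdRange(long from, long to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Splits [minId, maxId] into consecutive ranges of at most batchSize ids. */
    public static List<IdRange> ranges(long minId, long maxId, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<IdRange> out = new ArrayList<>();
        for (long from = minId; from <= maxId; from += batchSize) {
            out.add(new IdRange(from, Math.min(from + batchSize - 1, maxId)));
        }
        return out;
    }

    /**
     * Hypothetical placeholder (assumption): a real version would query the
     * rows in this id range and add one document per row to a shared
     * IndexWriter. Here it only returns how many ids the range covers.
     */
    public static long indexBatch(IdRange range) {
        return range.to - range.from + 1;
    }

    /** Runs one task per id range on a fixed pool and returns the total count. */
    public static long indexAll(long minId, long maxId, long batchSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (IdRange range : ranges(minId, maxId, batchSize)) {
                Callable<Long> task = () -> indexBatch(range);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get(); // blocks until that batch is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 1 million ids in batches of 10,000, as in your description; 4 threads is a guess.
        System.out.println(indexAll(1, 1_000_000, 10_000, 4));
    }
}
```

In a real version each worker would probably want its own JDBC
connection, and you would still call optimize() once, after all the
tasks have finished.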

-Glen Newton
 http://zzzoot.blogspot.com/
 http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/Glen_Newton


2009/10/22 Paul Taylor <paul_t...@fastmail.fm>:
> I'm building a lucene index from a database, creating about 1 million
> documents; unsurprisingly this takes quite a long time.
> I do this by sending a query to the db over a range of ids (10,000
> records),
> adding these results to Lucene,
> then getting the next 10,000 and so on.
> When indexing is complete I then call optimize()
> I also set  indexWriter.setMaxBufferedDocs(1000) and
>  indexWriter.setMergeFactor(3000) but don't fully understand these values.
> Each document contains about 10 small fields
>
> I'm looking for some ways to improve performance.
>
> This index writing is single threaded; is there a way I can multi-thread
> writing to the index?
> I only call optimize() once at the end; is that the best way to do it?
> I'm going to run a profiler over the code, but are there any rules of thumb
> on the best values to set for MaxBufferedDocs and MergeFactor?
>
> thanks Paul
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


