Indexing rates depend heavily on document size (the amount of text) and on
any pre-indexing processing. Other things, such as the number of fields,
probably matter too.
My application is indexing about 20x faster than Christian's, because I have
small documents (a few hundred bytes) that are extracted from an RDBMS
and submitted in Solr's XML format.
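As a rough illustration of that workflow, here is a minimal Python sketch that turns database rows into a single message in Solr's XML update format (an <add> element wrapping <doc> elements with named <field> children). The field names "id" and "title" and the sample rows are hypothetical; real names depend on your schema.

```python
import xml.etree.ElementTree as ET


def build_add_xml(rows):
    """Serialize an iterable of dicts as one Solr <add> message."""
    add = ET.Element("add")
    for row in rows:
        doc = ET.SubElement(add, "doc")
        for name, value in row.items():
            field = ET.SubElement(doc, "field", name=name)
            # ElementTree escapes characters such as & and < on output.
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# Hypothetical rows, standing in for the result of an RDBMS query.
rows = [
    {"id": 1, "title": "first row"},
    {"id": 2, "title": "R&D notes"},
]
print(build_add_xml(rows))
```

Putting many documents into one <add> message means fewer HTTP requests per document, which is one plausible reason small documents index quickly.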
I am probably seeing something close to the maximum rate at 250 docs/s.
This is on a dual-CPU 3 GHz Xeon, Fedora Core 4, JDK 1.5. A fast RAID
would probably make it go faster, but that is about the only speedup
I can think of.
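For a sense of scale, the rates quoted in this thread can be applied to the 10 million documents in the original question. This is back-of-the-envelope arithmetic only: it assumes each rate stays constant for the whole run and ignores commit and optimize time.

```python
def hours_to_index(num_docs, docs_per_second):
    """Wall-clock hours needed at a constant indexing rate."""
    return num_docs / docs_per_second / 3600


NUM_DOCS = 10_000_000  # from the original question

# Rates as reported in this thread, converted to docs per second.
rates = {
    "250 docs/s": 250,
    "1000 docs/min": 1000 / 60,
    "13 docs/s": 13,
}

for label, rate in rates.items():
    print(f"{label}: {hours_to_index(NUM_DOCS, rate):.1f} hours")
```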
This has been discussed before, so check the mailing list archives.
wunder
On 2/20/07 2:58 AM, Burkamp, Christian [EMAIL PROTECTED] wrote:
I do agree. There's probably no need to go to the index directly.
My current Solr test server has more than 5M documents and an index size of
about 60GB.
I still index at 13 docs per second and this still includes filtering of the
documents.
(If you have your content ready in XML format performance will be even
better).
It seems to me that indexing performance does not drop as the index grows.
Optimizing the index, though, does take a huge amount of time for large
indexes.
--Christian
-Original Message-
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 20, 2007 11:43
To: solr-user@lucene.apache.org
Subject: Re: solr performance
You could build your index using Lucene directly and then point a
Solr instance at it once it's built. My suspicion is that the
overhead of forming a document as an XML string and posting to Solr
via HTTP won't be that much different than indexing with Lucene
directly.
My largest Solr index is currently at 1.4M documents and it takes a max of 3ms
to add a document (according to Solr's console), most of them 1ms.
My single threaded indexer is indexing around 1000 documents per
minute, but I think I can get this number even faster by
parallelizing the indexer.
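One way such parallelizing might look, sketched in Python rather than taken from the indexer described above: a small thread pool posts pre-built XML batches concurrently. The update URL, the worker count, and the function names are assumptions for illustration; adjust them to your installation.

```python
from concurrent.futures import ThreadPoolExecutor
import urllib.request

# Assumed location of the Solr update handler; change for your setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def post_to_solr(xml_body, url=SOLR_UPDATE_URL):
    """POST one XML update message and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),  # supplying data makes this a POST
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_in_parallel(batches, send=post_to_solr, workers=4):
    """Send each batch on a worker thread; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, batches))
```

The send parameter is there so the posting function can be swapped out, for example with a stub when measuring how much time goes into building the XML versus the HTTP round trip. A <commit/> message would still have to be sent afterwards before the new documents become searchable.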
I'm curious what rates others are indexing at.
Erik
On Feb 20, 2007, at 2:21 AM, Jack L wrote:
Hello,
I have a question about Solr's performance when accepting inserts and
indexing. If I have 10 million documents that I'd like to index, I
suppose it will take some time to submit them to Solr. Is there any
faster way to do this than through the web interface?
--
Best regards,
Jack