my 2 cents
My indexing module handles documents with ~15 fields, most of which must
be indexed and stored. Using the GermanAnalyzer I saw the following times:
10 MB ~ 3400 docs --> 6-8 sec
70 MB ~ 5 docs --> 65 sec
So it gives me 500-760 docs/s.
Hi:
Here is my latest testing of the Oracle-Lucene integration (Lucene 2.3.2
binary dist. / Oracle 11g):
http://marceloochoa.blogspot.com/2008/06/new-binary-release-of-lucene-oracle.html
Tested against Spanish Wikipedia Dumps and using Wikipedia Analyzer/Tokenizer.
There are independent times for upl
There is really no "typical". I'm playing with Hadoop (HDFS) and Solr at the
moment, for example, and I'm seeing an indexing rate of about 70 docs/second.
However, the bottleneck there is not indexing, it is reading data from HDFS
(over the network).
I've also seen 500+ docs/second.
It depends on
Of course it depends on analysis, etc., but my experience has been at
least 2x faster, if not up to 4-5 times depending on the docs, etc.
You can use the contrib/benchmark package to try for yourself, of
course!
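
If you just want a rough docs/second figure for your own setup before
reaching for contrib/benchmark, a hand-rolled timer is enough. The sketch
below is not part of Lucene and does not reproduce what contrib/benchmark
does; measure_throughput and index_one are hypothetical names, and
index_one stands in for whatever call actually adds one document to your
index.

```python
import time


def measure_throughput(docs, index_one, clock=time.perf_counter):
    """Time one indexing run.

    docs      -- any iterable of documents
    index_one -- hypothetical callback that indexes a single document
                 (in a real setup it might wrap IndexWriter.addDocument)
    clock     -- injectable time source, so the function can be tested

    Returns (doc_count, elapsed_seconds, docs_per_second).
    """
    start = clock()
    count = 0
    for doc in docs:
        index_one(doc)
        count += 1
    elapsed = clock() - start
    # Guard against a zero-length interval on very small inputs.
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, elapsed, rate


if __name__ == "__main__":
    # Stand-in workload: "indexing" here is just lower-casing a string.
    sample_docs = ["Some Document Text %d" % i for i in range(10000)]
    n, secs, rate = measure_throughput(sample_docs, lambda d: d.lower())
    print("%d docs in %.3f s (%.0f docs/s)" % (n, secs, rate))
```

Note that the timer covers whatever happens inside the loop, so if reading
the documents is slow (network, HDFS, database), that time is counted too.
Timing the read and the index steps separately shows which one is the
bottleneck.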
On Jun 2, 2008, at 7:40 PM, Simon Wistow wrote:
I know this is one of those