Re: Typical Indexing performance

2008-06-06 Thread Konstantyn Smirnov
My 2 cents: my indexing module handles documents with ~15 fields, most of which must be indexed and stored. Using the GermanAnalyzer I saw the following times: 10 MB ~ 3400 docs --> 6-8 sec; 70 MB ~ 5 docs --> 65 sec, so it gives me 500 - 760 doc/s
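
For anyone wanting to compare their own numbers, the docs/s figure is just document count divided by elapsed time. A minimal sketch in Python, using only the first measurement quoted above (3400 docs in 6-8 sec); the helper name `docs_per_second` is made up for illustration and is not part of Lucene:

```python
def docs_per_second(num_docs, seconds):
    """Return indexing throughput in documents per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_docs / seconds

# First measurement from the post: ~3400 docs indexed in 6-8 seconds.
slow = docs_per_second(3400, 8)  # longest run
fast = docs_per_second(3400, 6)  # shortest run
print(f"{slow:.0f} - {fast:.0f} docs/s")
```

On its own, that first measurement works out to roughly 425 to 567 docs/s.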

Re: Typical Indexing performance

2008-06-03 Thread Marcelo Ochoa
Hi: Here is my latest testing of the Oracle-Lucene integration (Lucene 2.3.2 binary dist. / Oracle 11g): http://marceloochoa.blogspot.com/2008/06/new-binary-release-of-lucene-oracle.html Tested against Spanish Wikipedia dumps using the Wikipedia Analyzer/Tokenizer. There are independent times for upl

Re: Typical Indexing performance

2008-06-03 Thread Otis Gospodnetic
There is really no "typical". I'm playing with Hadoop (HDFS) and Solr at the moment, for example, and I'm seeing an indexing rate of approximately 70 docs/second. However, the bottleneck there is not indexing, it is reading data from HDFS (over the network). I've also seen 500+ docs/second. It depends on
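
One way to check whether reading or indexing is the bottleneck, as described above, is to time the two phases separately. A rough sketch in Python, not tied to Hadoop, Solr, or Lucene: `read_doc` and `index_doc` are hypothetical callables standing in for whatever fetches a document and whatever adds it to the index, and the stand-ins below only simulate a slow read with a short sleep.

```python
import time

def measure_phases(sources, read_doc, index_doc):
    """Accumulate time spent reading vs. indexing, per phase."""
    read_time = 0.0
    index_time = 0.0
    count = 0
    for src in sources:
        t0 = time.perf_counter()
        doc = read_doc(src)      # e.g. fetch over the network
        t1 = time.perf_counter()
        index_doc(doc)           # e.g. analyze and add to the index
        t2 = time.perf_counter()
        read_time += t1 - t0
        index_time += t2 - t1
        count += 1
    return count, read_time, index_time

# Hypothetical stand-ins: a slow "remote" read and a cheap "index" step.
def fake_read(src):
    time.sleep(0.005)            # simulated network latency (assumption)
    return f"document body {src}"

index = []

def fake_index(doc):
    index.append(doc.split())    # trivial tokenization as a placeholder

count, read_time, index_time = measure_phases(range(5), fake_read, fake_index)
print(f"{count} docs, read {read_time:.3f}s, index {index_time:.3f}s")
```

If the read total dominates, a faster analyzer or indexer will not raise the overall docs/second much.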

Re: Typical Indexing performance

2008-06-03 Thread Grant Ingersoll
Of course it depends on analysis, etc., but my experience has been at least 2x faster, if not up to 4-5 times depending on the docs, etc. You can use the contrib/benchmark package to try for yourself, of course! On Jun 2, 2008, at 7:40 PM, Simon Wistow wrote: I know this is one of those