Hi List,

Indexing documents with Lucene.Net takes a pretty long time: adding a
thousand documents to the index takes about 3 seconds.  I've used Java
Lucene in the past, and as far as I remember indexing there was about
20 times faster.

Here's the relevant code:

            IndexWriter index_writer = new IndexWriter("index", new StandardAnalyzer(), true);
//            index_writer.SetMergeFactor(10000);
//            index_writer.SetMaxMergeDocs(10000);
//            index_writer.SetMaxBufferedDocs(10000);
            ExecuteSqlQuery("SELECT artist, title FROM songname");
            int count = 0;
            while (reader.Read()) {
                if (count > 0 && count%1000 == 0) {
                    Console.WriteLine(count);
                }
                Document document = new Document();
                document.Add(new Field("artist", reader.GetString("artist"), Field.Store.YES, Field.Index.TOKENIZED));
                document.Add(new Field("title", reader.GetString("title"), Field.Store.YES, Field.Index.TOKENIZED));
                index_writer.AddDocument(document);
                count++;
            }

When I uncomment the commented-out lines, indexing gets about 2x faster,
but that still falls well short of what I expected.

I'd really appreciate your insights about this speed issue.

Thanks in advance!

-- 
Laci  <http://monda.hu>
