Re: reuse of TokenStream

2005-02-18 Thread Harald Kirsch
to get a custom-made Tokenizer or TokenStream object for the given field. Since setting up the TokenStream requires some work, I would rather not repeat that work for every document to be indexed. Harald. -- -------- Harald Kirsch
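The question above is about building a field-specific analysis chain once and reusing it across documents. Below is a minimal sketch of one way this was commonly handled with the Lucene 1.4-era API; the field name "body", the particular token filters, and the class name are illustrative assumptions, not taken from the original mail.

    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.LowerCaseFilter;
    import org.apache.lucene.analysis.PerFieldAnalyzerWrapper;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.WhitespaceTokenizer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;

    public class ReusableAnalyzerSetup {

        /** Build the analyzer once; IndexWriter then reuses it for every document. */
        public static Analyzer build() {
            // Custom analysis chain for one field; only the TokenStream objects
            // themselves are created per document, the configuration is not.
            Analyzer custom = new Analyzer() {
                public TokenStream tokenStream(String fieldName, Reader reader) {
                    return new LowerCaseFilter(new WhitespaceTokenizer(reader));
                }
            };

            // "body" is a hypothetical field name used for illustration.
            PerFieldAnalyzerWrapper perField =
                    new PerFieldAnalyzerWrapper(new StandardAnalyzer());
            perField.addAnalyzer("body", custom);
            return perField;
        }
    }

The wrapper is constructed once and handed to the IndexWriter, so the per-field setup is not redone for each addDocument() call; only the lightweight TokenStream chain is created per document inside tokenStream().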

reuse of TokenStream

2005-02-17 Thread Harald Kirsch
IndexWriter.addDocument() in a loop? Or does addDocument only put the work into a queue where tasks are taken out for parallel indexing by several threads? Thanks, Harald. -- Harald Kirsch | [EMAIL PROTECTED] | +44 (0) 1223/49-2593
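For reference, the indexing loop being asked about looks roughly like the sketch below; the directory path and field name are placeholders. In the Lucene releases of that period, addDocument() analyzed and wrote the document in the calling thread rather than handing the work to a background queue.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class SimpleIndexLoop {
        public static void main(String[] args) throws Exception {
            // "index-dir" and the field name are placeholders for illustration.
            IndexWriter writer =
                    new IndexWriter("index-dir", new StandardAnalyzer(), true);

            for (int i = 0; i < args.length; i++) {
                Document doc = new Document();
                // Field.Text stores and tokenizes the value (Lucene 1.4-style API).
                doc.add(Field.Text("content", args[i]));
                // Analysis and inverted-index construction happen here,
                // synchronously in the calling thread.
                writer.addDocument(doc);
            }
            writer.optimize();
            writer.close();
        }
    }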

Re: Most efficient way to index 14M documents (out of memory/file handles)

2004-07-07 Thread Harald Kirsch
The documents have, however, only three fields. Maybe this helps, Harald. -- -------- Harald Kirsch | [EMAIL PROTECTED] | +44 (0) 1223/49-2593
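The thread is about running out of memory and file handles while indexing roughly 14 million documents. Below is a hedged sketch of the knobs the Lucene 1.4-era IndexWriter exposed for this situation (compound segment files plus the public mergeFactor and minMergeDocs fields); the concrete values are illustrative, not recommendations taken from the thread.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class BulkIndexSettings {
        public static IndexWriter open(String path) throws Exception {
            IndexWriter writer = new IndexWriter(path, new StandardAnalyzer(), true);

            // Compound segment files keep the number of open file handles low.
            writer.setUseCompoundFile(true);

            // In the Lucene 1.4-era API these are public fields; later releases
            // replaced them with setter methods.  A large mergeFactor speeds up
            // indexing but multiplies the number of simultaneously open segments.
            writer.mergeFactor = 10;
            writer.minMergeDocs = 100;   // buffer more documents in RAM per segment

            return writer;
        }
    }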

multiple documents per file, seek and character encoding

2004-07-02 Thread Harald Kirsch
? Harald. -- Harald Kirsch | [EMAIL PROTECTED] | +44 (0) 1223/49-2593
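Only the subject line survives in this snippet, but the topic (several documents stored in one file, addressed via seek, with character encoding in play) maps onto a standard pattern: record byte offsets, seek by bytes, and decode only after reading. The sketch below is plain JDK code under that assumption; the class name, method name, and parameters are hypothetical.

    import java.io.RandomAccessFile;

    public class SeekableDocFile {
        /**
         * Read one document stored at a known byte offset inside a larger file.
         * Offsets must be byte offsets, not character offsets, because a
         * variable-width encoding such as UTF-8 breaks any char-based arithmetic.
         */
        public static String readDocument(String path, long byteOffset, int byteLength)
                throws Exception {
            RandomAccessFile raf = new RandomAccessFile(path, "r");
            try {
                raf.seek(byteOffset);
                byte[] buf = new byte[byteLength];
                raf.readFully(buf);
                return new String(buf, "UTF-8");   // decode only after seeking by bytes
            } finally {
                raf.close();
            }
        }
    }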

best mergeFactor for merging 100 Indexes

2004-06-25 Thread Harald Kirsch
indexing time and mergeFactor interact, I would appreciate a good guess at how to combine these indexes. Which mergeFactor should I use? Should I use a different strategy than the 3 lines shown above? Thanks, Harald. -- -------- Harald Kirsch
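The "3 lines shown above" are not included in this snippet, but combining many existing on-disk indexes into one typically goes through IndexWriter.addIndexes(). A minimal sketch under that assumption follows; the paths and the reliance on the default mergeFactor are illustrative choices, not advice from the thread.

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;

    public class MergeIndexes {
        /** Merge several existing on-disk indexes into one target index. */
        public static void merge(String targetPath, String[] sourcePaths) throws Exception {
            IndexWriter writer =
                    new IndexWriter(targetPath, new StandardAnalyzer(), true);

            // Open each source index read-only (create == false).
            Directory[] sources = new Directory[sourcePaths.length];
            for (int i = 0; i < sourcePaths.length; i++) {
                sources[i] = FSDirectory.getDirectory(sourcePaths[i], false);
            }

            // Merges all source segments into the target index.
            writer.addIndexes(sources);
            writer.close();
        }
    }

For a single bulk merge like this, mergeFactor matters much less than during incremental indexing: in the 1.4-era implementation, addIndexes() optimized the target index as part of the call, leaving a compact result regardless of the merge settings.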