Re: Improving indexing speed

2011-11-17 Thread Wouter Heijke
Hi, we faced a similar problem. The solution was to give the indexer less work and let worker threads do all the heavy lifting. The workers produce pre-processed/analyzed/tokenized Documents that the writer can index without any further processing. Wouter > Hi > > the file to be indexed depends on the
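
The worker-thread approach described above can be sketched roughly as follows. This is a minimal illustration using only java.util.concurrent, not the poster's actual code: `PreparedDoc`, `tokenize` and `run` are hypothetical names, and a plain list stands in for the Lucene writer. Workers do the parsing/tokenizing in parallel; a single writer thread only appends the finished results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class WorkerPipelineSketch {

    // Hypothetical stand-in for a pre-processed document: a name plus its tokens.
    static final class PreparedDoc {
        final String name;
        final List<String> tokens;

        PreparedDoc(String name, List<String> tokens) {
            this.name = name;
            this.tokens = tokens;
        }
    }

    // Sentinel that tells the writer thread no more documents will arrive.
    private static final PreparedDoc POISON = new PreparedDoc("", new ArrayList<String>());

    // Assumed "expensive" step done by the workers: lowercase and split on non-word characters.
    static List<String> tokenize(String raw) {
        List<String> tokens = new ArrayList<>();
        for (String t : raw.toLowerCase().split("\\W+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Workers tokenize in parallel; one writer thread drains the queue.
    // The returned list stands in for the index (an assumption, not a Lucene API).
    static List<PreparedDoc> run(List<String> rawDocs, int workers) {
        final BlockingQueue<PreparedDoc> queue = new LinkedBlockingQueue<>(1000);
        final List<PreparedDoc> index = new ArrayList<>();

        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    PreparedDoc doc = queue.take();
                    if (doc == POISON) {
                        break;
                    }
                    index.add(doc); // cheap: no parsing happens on this thread
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < rawDocs.size(); i++) {
            final int id = i;
            final String raw = rawDocs.get(i);
            pool.execute(() -> {
                try {
                    queue.put(new PreparedDoc("doc-" + id, tokenize(raw)));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            queue.put(POISON); // all workers are done, so the writer can stop
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pipeline", e);
        }
        return index;
    }

    public static void main(String[] args) {
        List<PreparedDoc> index = run(Arrays.asList("Hello Lucene", "Improving indexing speed"), 2);
        System.out.println("indexed " + index.size() + " documents");
    }
}
```

The bounded queue keeps the workers from running far ahead of the writer, so memory use stays predictable. In a real setup the `index.add(doc)` line would be replaced by whatever call hands the prepared document to the index writer.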

Re: Improving indexing speed

2011-11-17 Thread KARTHIK SHIVAKUMAR
Hi, the file to be indexed depends on the type of Document / data extractor. My Document types are usually XML, and every time 2+ million XML files are indexed the time taken is less than 5 minutes. With regards, Karthik On Fri, Nov 11, 2011 at 1:17 AM, Ian Lea wrote: > And how long do

Re: Improving indexing speed

2011-11-10 Thread Ian Lea
And how long does it take just to read and parse the files, without indexing them? Often that is the problem, and it has nothing to do with Lucene. There is plenty of good advice in http://wiki.apache.org/lucene-java/ImproveIndexingSpeed. A good match on the subject of your message! -- Ian. On Thu, Nov
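
One way to answer the question above is to time the read-and-parse step on its own, with no index writer involved. The following is only a sketch under assumptions: `parse` is a hypothetical placeholder for the real format-specific parser, and `measure` simply walks one directory (non-recursively) and reads each regular file as UTF-8.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ParseTimingSketch {

    // Simple holder for the measurement.
    static final class Result {
        final int files;
        final long tokens;
        final long elapsedNanos;

        Result(int files, long tokens, long elapsedNanos) {
            this.files = files;
            this.tokens = tokens;
            this.elapsedNanos = elapsedNanos;
        }
    }

    // Hypothetical parser: counts whitespace-separated tokens.
    // Replace with the real extraction logic to get a meaningful number.
    static long parse(String content) {
        String trimmed = content.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Reads and parses every regular file directly inside dir, without indexing anything.
    static Result measure(Path dir) {
        long start = System.nanoTime();
        int files = 0;
        long tokens = 0;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (!Files.isRegularFile(p)) {
                    continue;
                }
                String content = new String(Files.readAllBytes(p), StandardCharsets.UTF_8);
                tokens += parse(content);
                files++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new Result(files, tokens, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        Result r = measure(dir);
        System.out.println(r.files + " files, " + r.tokens + " tokens, "
                + (r.elapsedNanos / 1_000_000) + " ms for read + parse only");
    }
}
```

If this number is already close to the total indexing time, the bottleneck is I/O or parsing and tuning the index writer will not help much.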

Re: Improving indexing speed

2011-11-10 Thread Simon Willnauer
Can you provide more information about your setup? Things like how much time does it take to index your documents, how many docs do you index, what are your index writer settings, how many cores do you have, where do you read from and write to (disks). Oh, and what version of Lucene are you using? t

Improving indexing speed

2011-11-10 Thread antony jospeh
Hi all, I have a large number of files in a directory that need to be indexed. All the files are in a specific format and need to be parsed to extract information, which I then have to index. A single thread processes one file at a time, so I decided to use multiple threads when the main thread that loops the directo