Hi,
We faced a similar problem.
The solution was to give the indexer less work and let worker threads do
the heavy lifting. The workers produce pre-processed (analyzed, tokenized)
Documents that the writer can index without any further processing.
Wouter
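
A minimal sketch of the approach described above, using only the JDK. It is
not the code from the original setup, which was not posted. The names
PreprocessPipeline, Prepared, Sink and prepare are hypothetical; in a real
Lucene application the sink is where the single writer would be called
(for example via IndexWriter.addDocument), and prepare would hold the actual
parsing and analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

class PreprocessPipeline {
    /** Hypothetical stand-in for a fully pre-processed document. */
    static final class Prepared {
        final String id;
        final List<String> tokens;
        Prepared(String id, List<String> tokens) { this.id = id; this.tokens = tokens; }
    }

    /** Hypothetical sink; assumed to wrap the one index writer. */
    interface Sink { void add(Prepared doc); }

    // Worker-side step: the expensive parsing/tokenizing happens here.
    // This toy version lower-cases the text and splits on non-word characters.
    static Prepared prepare(String id, String raw) {
        List<String> tokens = new ArrayList<>();
        for (String t : raw.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return new Prepared(id, tokens);
    }

    // Workers fill a bounded queue; the calling thread drains it into the sink.
    // Note: a production version needs error handling, because a worker that
    // fails would leave the drain loop waiting for a document that never arrives.
    static int run(List<String> rawDocs, int workers, Sink sink) {
        BlockingQueue<Prepared> queue = new LinkedBlockingQueue<>(1000);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < rawDocs.size(); i++) {
            final int n = i;
            pool.submit(() -> {
                try {
                    queue.put(prepare("doc-" + n, rawDocs.get(n)));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown(); // no new tasks; already submitted ones still run
        int written = 0;
        try {
            while (written < rawDocs.size()) {
                sink.add(queue.take()); // the only thread that touches the sink
                written++;
            }
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while indexing", e);
        }
        return written;
    }
}
```

The bounded queue keeps memory use in check: if the writer falls behind, the
workers block on put() instead of piling up prepared documents.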
Hi
The file to be indexed depends on the type of Document / data extractor.
My Document types are usually XML, and each time 2+ million XML files
are indexed the time taken is less than 5 minutes.
with regards
karthik
On Fri, Nov 11, 2011 at 1:17 AM, Ian Lea wrote:
And how long does it take just to read and parse the files, without
indexing them? Often that is the problem - nothing to do with Lucene.
There is plenty of good advice in
http://wiki.apache.org/lucene-java/ImproveIndexingSpeed. A good match
on the subject of your message!
--
Ian.
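
One way to answer the question above is to time a read-and-parse pass with
indexing left out entirely. The sketch below is only an illustration under
assumptions: ParseOnlyTimer, Result and timeParseOnly are hypothetical names,
and the parser argument stands in for whatever format-specific parsing the
application does.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ParseOnlyTimer {
    /** Number of files handled and elapsed wall-clock time in milliseconds. */
    static final class Result {
        final int files;
        final long millis;
        Result(int files, long millis) { this.files = files; this.millis = millis; }
    }

    // Reads every regular file directly inside dir and hands its bytes to the
    // parser. Nothing is indexed, so the result isolates read + parse cost.
    static Result timeParseOnly(Path dir, Consumer<byte[]> parser) {
        List<Path> files;
        try (Stream<Path> entries = Files.list(dir)) {
            files = entries.filter(Files::isRegularFile).collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        long start = System.nanoTime();
        for (Path f : files) {
            try {
                parser.accept(Files.readAllBytes(f));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        long millis = (System.nanoTime() - start) / 1_000_000;
        return new Result(files.size(), millis);
    }
}
```

If this pass alone accounts for most of the total run time, tuning the index
writer will not help much; the parsing or the disk reads are the place to look.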
On Thu, Nov
Can you provide more information about your setup? Things like how
much time it takes to index your documents, how many docs you index,
what your index writer settings are, how many cores you have, and
where you read from and write to (disks). Oh, and what version
of Lucene are you using?
t
Hi all,
I have a large number of files in a directory that need to be indexed. All
the files are in a specific format and need to be parsed to extract
information, which I then have to index.
A single thread processed one file at a time; then I decided to use multiple
threads when the main thread that loops the directo