Hi all:
did you try increasing IndexWriter.mergeFactor? I increased it to 1000 and indexing speed was about 10 times faster than with the default of 10.
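For example, a minimal sketch of that tuning, assuming Lucene 1.4.x where mergeFactor (and minMergeDocs) are public fields on IndexWriter; newer releases expose setters such as setMergeFactor instead. The index path is a placeholder, and note that a very high mergeFactor keeps many segment files open until the next merge or optimize:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class TuneMergeFactor {
    public static void main(String[] args) throws Exception {
        // Create a new index in a placeholder directory.
        IndexWriter writer = new IndexWriter("/tmp/test-index",
                new StandardAnalyzer(), true);

        // Merge segments far less often (default is 10). Faster indexing,
        // but many files stay open until segments are merged.
        writer.mergeFactor = 1000;

        // Buffer more documents in RAM before flushing a segment to disk.
        writer.minMergeDocs = 1000;

        // ... addDocument() calls go here ...

        writer.optimize(); // merge everything down for faster searching
        writer.close();
    }
}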


Regards

Che Dong
http://www.chedong.com/


Aalap Parikh wrote:
My machine is pretty good and fairly new. The disk is certainly not slow, and I am not indexing large Documents: 27 fields, with each field value being a string no more than 15-20 characters long.

I tried setting the maxFieldLength value of the IndexWriter to a low value, but that didn't help.

Also, I am not using Hibernate at all.

Thanks,
Aalap.

--- Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

That sounds way too long, unless you have veeery slow disks, veeery large Documents (long fields that you analyze, index, and store in Lucene), or some such.
If you have very loooong fiiiiieeeelds you could try setting

http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexWriter.html#maxFieldLength

to a very small number and see if that changes performance drastically. There are other IndexWriter knobs you can fiddle with.
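As a concrete illustration of that knob, here is a minimal sketch assuming the Lucene 1.4.x API, where maxFieldLength is a public field on IndexWriter (newer releases use a setMaxFieldLength setter); the index path and the cap of 100 terms are placeholder values:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class CapFieldLength {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter("/tmp/test-index",
                new StandardAnalyzer(), true);
        // Index only the first 100 terms of each field (default is 10,000).
        // If this makes indexing dramatically faster, long analyzed fields
        // are the bottleneck.
        writer.maxFieldLength = 100;
        // ... addDocument() calls go here ...
        writer.close();
    }
}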

I've seen Hibernate 2.* get sluggish once its Session gets filled up with a lot of objects.

Otis


--- Aalap Parikh <[EMAIL PROTECTED]> wrote:

Hi,

I have similar issues with indexing time.

I am doing a SELECT from the database and getting back 10,000 rows. I then index each row, and hence would have 10,000 documents in my Lucene index. Each doc has 27 fields.

I added some timing code to my indexing process. The DB select call takes around 23 seconds and the indexing process takes 567 seconds. Also, I profiled the app using JProfiler and found that 90% of the time is spent in the IndexWriter.addDocument call. As expected, there were 10,000 invocations of that method (one for each doc) and the profiler showed that the method took 90% of the processing time.

I am concerned that it is taking around 9.5 minutes for 10,000 docs, and I am expecting to have around 600,000 docs to index. At that rate it would take 570 minutes (9-10 hours) to index, which is HUGE!!!

My machine: Pentium 4 CPU 2.40 GHz
           RAM 1 GB
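For reference, a stripped-down sketch of the kind of loop described above, assuming Lucene 1.4.x and plain JDBC (the driver, connection URL, query, and table are placeholders for illustration):

import java.sql.*;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

public class IndexRows {
    public static void main(String[] args) throws Exception {
        Class.forName("com.mysql.jdbc.Driver"); // placeholder JDBC driver
        Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/mydb", "user", "pass");
        IndexWriter writer = new IndexWriter("/tmp/row-index",
                new StandardAnalyzer(), true);

        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("SELECT * FROM products"); // placeholder query
        ResultSetMetaData meta = rs.getMetaData();

        long start = System.currentTimeMillis();
        while (rs.next()) {
            Document doc = new Document();
            // One short string field per column (27 columns in the scenario above).
            for (int i = 1; i <= meta.getColumnCount(); i++) {
                String value = rs.getString(i);
                if (value != null) {
                    doc.add(Field.Text(meta.getColumnName(i), value));
                }
            }
            writer.addDocument(doc); // where ~90% of the time was reported
        }
        System.out.println("Indexing took "
                + (System.currentTimeMillis() - start) + " ms");

        writer.optimize();
        writer.close();
        rs.close();
        stmt.close();
        conn.close();
    }
}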

Any help appreciated.

Thanks,
Aalap.


--- [EMAIL PROTECTED] wrote:

In a message of 20 … 2005 04:07, Mufaddal Khumri wrote:

The 20000 products I mentioned are 20000 rows. I get the products in bulk by using a limit clause.

I am using Hibernate with a MySQL server on a 2.8GHz, 1.00GB RAM machine.

Maybe your session-level cache in Hibernate is growing enormously. Do you call Session.clear() occasionally while indexing?
Here's a link about batching and Hibernate:



http://blog.hibernate.org/cgi-bin/blosxom.cgi/2004/08/
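A rough sketch of that batching pattern against the Hibernate 2.x API (the HQL query, entity name, and batch size of 500 are placeholders; the point is the periodic flush()/clear() so the session-level cache stays small):

import java.util.Iterator;
import net.sf.hibernate.Session;
import net.sf.hibernate.SessionFactory;

public class BatchIndexer {
    public static void indexAll(SessionFactory sessionFactory) throws Exception {
        Session session = sessionFactory.openSession();
        try {
            // Iterate lazily instead of loading every row at once.
            Iterator products = session.iterate("from Product"); // placeholder HQL
            int count = 0;
            while (products.hasNext()) {
                Object product = products.next();
                // ... build the Lucene Document from this object and addDocument() it ...
                if (++count % 500 == 0) {
                    session.flush();  // execute any pending SQL
                    session.clear();  // evict the first-level cache so memory stays flat
                }
            }
        } finally {
            session.close();
        }
    }
}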

