Re: AW: Rebuilding index

Nelson Takashi Omori Fri, 23 Nov 2012 12:34:03 -0800

Thank you, Claus.

I'll try to configure a cluster and see how it works.

Another thing about the index rebuild: my computer's CPU usage stayed at around 16%, and disk access and memory usage were low too, so the machine's resources weren't being fully used. I ran it in debug mode and saw this message many times:

"Executor is under load, will schedule 1987 remaining tasks for 50 ms later"

Digging deeper, I found that Jackrabbit creates each text extraction task as a low-priority task. Execution of these tasks is controlled by the value "maxLoadForLowPriorityTasks" in JackrabbitThreadPool, which is read from the system property "org.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks". If the property is not set, or its value is not between 0 and 100, Jackrabbit uses 75 by default. The value determines whether a low-priority task may run, based on the number of threads that are active at that moment. With the default, if more than 75% of the threads are in use, the task is rescheduled for later.
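
For illustration, the check described above can be sketched roughly as follows. This is not Jackrabbit's actual source: the class and method names (LowPriorityLoadCheck, resolveMaxLoad, shouldDefer) are hypothetical, and only the property name, the 0 to 100 range, the default of 75 and the "0 disables the check" behaviour come from the description in this message.

```java
// Hypothetical sketch of the load check described in this message.
// It is NOT copied from Jackrabbit; only the property name, the valid
// range and the default value are taken from the text above.
class LowPriorityLoadCheck {

    static final String PROPERTY =
        "org.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks";

    static final int DEFAULT_MAX_LOAD = 75;

    // Parse the raw property value; fall back to the default when the value
    // is missing, not a number, or outside 0..100.
    static int resolveMaxLoad(String raw) {
        if (raw == null) {
            return DEFAULT_MAX_LOAD;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return (value >= 0 && value <= 100) ? value : DEFAULT_MAX_LOAD;
        } catch (NumberFormatException e) {
            return DEFAULT_MAX_LOAD;
        }
    }

    // Decide whether a low-priority task should be postponed.
    // A limit of 0 disables the check, so nothing is ever postponed.
    static boolean shouldDefer(int activeThreads, int poolSize, int maxLoad) {
        if (maxLoad == 0 || poolSize <= 0) {
            return false;
        }
        double loadPercent = 100.0 * activeThreads / poolSize;
        return loadPercent > maxLoad;
    }

    public static void main(String[] args) {
        int maxLoad = resolveMaxLoad(System.getProperty(PROPERTY));
        // Example: 7 of 8 threads busy.
        System.out.println("defer low-priority task: " + shouldDefer(7, 8, maxLoad));
    }
}
```

With the default limit, a pool that has 7 of 8 threads busy (87.5%) would postpone text extraction, which matches the "Executor is under load" message repeating while the machine otherwise looks idle.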

So I set the property "org.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks" to "0". With that, Jackrabbit skips the check, and the process was faster: about 2 hours to complete the rebuild. CPU usage fluctuated between 50% and 90%, memory was used up to the limit, and the disk was accessed more steadily. It is probably better to increase the allocated memory before running this.

In my scenario it makes sense to set this value to "0", because my client can't use the system while the rebuild is running, so I can use all available resources to finish as soon as possible. After the rebuild, you should remove the property so that Jackrabbit can control the execution of low-priority tasks again.
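
As a rough sketch of "set it for the rebuild, then remove it", the property could be set programmatically around the rebuild and restored afterwards. The class name RebuildWithFullLoad and the Runnable standing in for "start the repository and wait for the re-index" are hypothetical placeholders. The sketch also assumes the property is read when the repository (and its thread pool) starts, so it has to be set before that point. Passing -Dorg.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks=0 on the JVM command line for that one run has the same effect.

```java
// Hypothetical helper: disables the low-priority load check for the
// duration of a rebuild, then restores the previous setting.
class RebuildWithFullLoad {

    static final String PROPERTY =
        "org.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks";

    // 'rebuild' is a placeholder for whatever starts the repository and
    // waits for the index rebuild to finish; it is not a Jackrabbit API.
    static void runWithLoadCheckDisabled(Runnable rebuild) {
        String previous = System.getProperty(PROPERTY);
        System.setProperty(PROPERTY, "0");
        try {
            rebuild.run();
        } finally {
            // Put things back so low-priority tasks are throttled again.
            if (previous == null) {
                System.clearProperty(PROPERTY);
            } else {
                System.setProperty(PROPERTY, previous);
            }
        }
    }

    public static void main(String[] args) {
        runWithLoadCheckDisabled(() ->
            System.out.println("max load during rebuild: " + System.getProperty(PROPERTY)));
    }
}
```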

Maybe this can help someone who has to rebuild the index as soon as possible and doesn't have the cluster mentioned by Claus.


On 23/11/2012 06:29, KÖLL Claus wrote:

Hi Nelson,

1) Reading the source code, jackrabbit is using LazyTextExtractorField (and 
other classes) to execute the extraction in a separate thread.
Doesn't it do exactly what I want? But, even so I waited 3 hours and the 
repository wasn't initialized and ready to use. Is it normal?

First .. yes this is normal ..
and yes you are right about extraction in a separate thread .. this happens on 
session.save() operation. If you start the repository it will start to re-index 
it if the index is not present.
In that way jackrabbit does not separate between full text indexing and 
"normal" node/property indexing. So the start will take much time
depending on your content.

2)  What I'm planning to do is the best approach? Did anybody make something 
similar?

One way to handle such index recovering is to create a cluster. Let's assume 
you would have 2 cluster members where one is the primary and the other one is 
a hot standby member.
If you have problems with the index on the primary cluster member you could 
copy the index folder from the standby cluster member.
If you like you could re-index the repository on your standby member while the 
primary is running.

greets
claus
