New worker threads are never spawned. That is why losing all the threads is fatal.

Karl
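The behavior Karl describes further down the thread (WorkerThread catches all Throwables and loops; only InterruptedException ends the thread) can be sketched roughly like this. This is a minimal illustration of the pattern, not the actual ManifoldCF code; the class and method names here are invented for the example:

```java
// Sketch of the worker-loop pattern described in this thread: every
// Throwable is caught, logged, and the loop continues, so the only way
// the thread exits is an InterruptedException (a shutdown request).
// WorkerLoop and fetchAndProcessBatch are illustrative names, not the
// real ManifoldCF classes.
public class WorkerLoop implements Runnable {
    public void run() {
        while (true) {
            try {
                fetchAndProcessBatch();
            } catch (InterruptedException e) {
                // Shutdown: the one exception that ends the thread.
                // If this happens unexpectedly, the thread is gone for
                // good, since workers are never respawned.
                return;
            } catch (Throwable t) {
                // Any other failure (including connector bugs) is logged
                // and the loop keeps going; the thread does not die.
                System.err.println("Worker error: " + t);
            }
        }
    }

    protected void fetchAndProcessBatch() throws InterruptedException {
        // Placeholder: in the real agents process this would pull queued
        // documents and hand them to the connector.
        Thread.sleep(100);
    }
}
```

The consequence for debugging is the one stated above: if the worker threads are absent from a thread dump, something delivered an interrupt (or the thread never started), because no other exception path exits the loop.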
On Tue, Oct 9, 2012 at 10:28 AM, Maciej Liżewski <maciej.lizew...@gmail.com> wrote:
> One more thing:
>
> now, running another job connected with the same repository also hangs
> after seeding, and no worker threads are spawned...
>
>
> 2012/10/9 Maciej Liżewski <maciej.lizew...@gmail.com>:
>> 2012/10/9 Karl Wright <daddy...@gmail.com>:
>>> "- all worker threads are gone," ???
>>>
>>> Really??
>>
>> yes... really... this is why I am also writing that this is strange...
>> this is the list of currently active threads:
>>
>> system
>>   Reference Handler                Waiting
>>   Finalizer                        Waiting
>>   Signal Dispatcher                Running
>>   Attach Listener                  Running
>> main
>>   main                             Waiting
>>   Timer-0                          Evaluating...
>>   HashSessionScavenger-0           Waiting
>>   HashSessionScavenger-1           Evaluating...
>>   HashSessionScavenger-2           Waiting
>>   qtp1792381350-157                Evaluating...
>>   qtp1792381350-158                Waiting
>>   qtp1792381350-159                Evaluating...
>>   qtp1792381350-161                Waiting
>>   Job start thread                 Waiting
>>   Startup thread                   Waiting
>>   Delete startup thread            Waiting
>>   Finisher thread                  Waiting
>>   Job notification thread          Waiting
>>   Job delete thread                Waiting
>>   Stuffer thread                   Waiting
>>   Expire stuffer thread            Waiting
>>   Set priority thread              Waiting
>>   Expiration thread '0'            Waiting
>>   Expiration thread '1'            Waiting
>>   Expiration thread '2'            Waiting
>>   Expiration thread '3'            Waiting
>>   Expiration thread '4'            Waiting
>>   Expiration thread '5'            Evaluating...
>>   Expiration thread '6'            Waiting
>>   Expiration thread '7'            Waiting
>>   Expiration thread '8'            Waiting
>>   Expiration thread '9'            Evaluating...
>>   Document cleanup stuffer thread  Waiting
>>   Document cleanup thread '0'      Waiting
>>   Document cleanup thread '1'      Waiting
>>   Document cleanup thread '2'      Waiting
>>   Document cleanup thread '3'      Evaluating...
>>   Document cleanup thread '4'      Waiting
>>   Document cleanup thread '5'      Waiting
>>   Document cleanup thread '6'      Evaluating...
>>   Document cleanup thread '7'      Waiting
>>   Document cleanup thread '8'      Evaluating...
>>   Document cleanup thread '9'      Waiting
>>   Document delete stuffer thread   Waiting
>>   Document delete thread '0'       Waiting
>>   Document delete thread '1'       Waiting
>>   Document delete thread '2'       Evaluating...
>>   Document delete thread '3'       Waiting
>>   Document delete thread '4'       Evaluating...
>>   Document delete thread '5'       Waiting
>>   Document delete thread '6'       Waiting
>>   Document delete thread '7'       Waiting
>>   Document delete thread '8'       Waiting
>>   Document delete thread '9'       Evaluating...
>>   Job reset thread                 Waiting
>>   Seeding thread                   Waiting
>>   Idle cleanup thread              Evaluating...
>>   Connection pool reaper           Sleeping
>>   qtp1792381350-160                Running
>>   qtp1792381350-162                Running
>>   qtp1792381350-163                Running
>>   qtp1792381350-156                Evaluating...
>> derby.daemons
>>   derby.rawStoreDaemon             Waiting
>>
>> but still the UI shows:
>>
>> Name    Status   Start Time                     End Time  Documents  Active  Processed
>> msol-x  Running  Tue Oct 09 15:15:29 CEST 2012            5689       10      5689
>>
>> (10 documents left active to process and status = "running")
>>
>> I checked the code of the connector, and only "ManifoldCFException"s
>> are thrown...
>>
>>
>>> I can think of no scenario where the worker threads disappear except
>>> if shutdown of the agents process is attempted and fails.
>>> WorkerThread catches all Throwables, logs them, and repeats in a
>>> loop. The only kind of exception that can cause the thread to exit is
>>> an InterruptedException.
>>>
>>> This isn't making a heck of a lot of sense...
>>>
>>> Karl
>>>
>>>
>>> On Tue, Oct 9, 2012 at 9:58 AM, Maciej Liżewski
>>> <maciej.lizew...@gmail.com> wrote:
>>>> Just looking at threads... but nothing special:
>>>> - all worker threads are gone,
>>>> - the stuffer thread runs in a loop but finds nothing to do,
>>>> - other threads just wait on 'sleep' commands.
>>>>
>>>> Is there any particular thread I should look at?
>>>>
>>>> I could guess that there was some exception (maybe in my connector,
>>>> but I could not reproduce it) that was not handled and some worker
>>>> thread just disappeared...
>>>>
>>>> How do I enable debug logs so I can see verbose output from the core
>>>> functions?
>>>>
>>>>
>>>> 2012/10/9 Karl Wright <daddy...@gmail.com>:
>>>>> FWIW, getting thread dumps from the process running the agents
>>>>> when it is "hung" may (or may not) help determine the underlying
>>>>> cause.
>>>>>
>>>>> Karl
>>>>>
>>>>> On Tue, Oct 9, 2012 at 9:21 AM, Karl Wright <daddy...@gmail.com> wrote:
>>>>>> What is your deployment model? Is this a multiprocess deployment?
>>>>>> What database are you using?
>>>>>>
>>>>>> There are various load tests for each database, which do far more
>>>>>> than 7000 documents. I am concerned that you are seeing this because
>>>>>> of some kind of cross-process synchronization issue, which might
>>>>>> occur (for instance) if you are using a multiprocess environment
>>>>>> with a single-process properties.xml file.
>>>>>>
>>>>>> Karl
>>>>>>
>>>>>> On Tue, Oct 9, 2012 at 9:12 AM, Maciej Liżewski
>>>>>> <maciej.lizew...@gmail.com> wrote:
>>>>>>> Ok... it is not a getMaxDocumentRequest issue, because I was able
>>>>>>> to reproduce it even with getMaxDocumentRequest=1. It seems to
>>>>>>> occur when indexing large sets of documents (in my case ~7000). It
>>>>>>> also happened once with the CIFS connector (against a samba
>>>>>>> share)...
>>>>>>>
>>>>>>> the result is like this:
>>>>>>>
>>>>>>> Name    Status   Start Time                     End Time  Documents  Active  Processed
>>>>>>> Mantis  Running  Tue Oct 09 13:56:59 CEST 2012            5689       1600    4400
>>>>>>>
>>>>>>> the "active" documents count has been 1600 for about an hour now,
>>>>>>> but there is no server load and nothing changes... it seems to be
>>>>>>> hanging somewhere inside the manifold core.
>>>>>>>
>>>>>>> also - when hitting abort, nothing happens (the job remains in the
>>>>>>> "aborting" state)...
>>>>>>>
>>>>>>> The problem is that it happens irregularly (sometimes 10 documents,
>>>>>>> sometimes 1600, and sometimes all documents are indexed). I tried
>>>>>>> to check it locally, but on the first pass everything went ok...
>>>>>>> really strange...
>>>>>>>
>>>>>>>
>>>>>>> 2012/10/3 Karl Wright <daddy...@gmail.com>:
>>>>>>>> Hi Maciej,
>>>>>>>>
>>>>>>>> It sounds like your loop condition must be somehow incorrect. You
>>>>>>>> may not receive the full number of documents specified by
>>>>>>>> getMaxDocumentRequest(), but rather a number less than that.
>>>>>>>>
>>>>>>>> We have a number of connectors that use document batches > 1,
>>>>>>>> e.g. the LiveLink connector, so this is likely not the problem.
>>>>>>>>
>>>>>>>> I'd recommend adding System.out.println() diagnostics to see
>>>>>>>> exactly what is happening inside both getDocumentVersions() and
>>>>>>>> processDocuments().
>>>>>>>>
>>>>>>>> Karl
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Oct 3, 2012 at 4:30 PM, Maciej Liżewski
>>>>>>>> <maciej.lizew...@gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I have noticed a strange problem with a connector (a new one I am
>>>>>>>>> developing right now) and the getMaxDocumentRequest parameter.
>>>>>>>>> When it returns 1 (the default), everything seems ok, but when I
>>>>>>>>> set it to anything higher (5, 10, 20), the indexing job does not
>>>>>>>>> end but hangs when there are only getMaxDocumentRequest documents
>>>>>>>>> left (when it should process 5 documents in a row, 5 documents
>>>>>>>>> stay "active").
>>>>>>>>> All document-related functions seem to be written ok (they all
>>>>>>>>> iterate through the passed arrays), and there are no exceptions
>>>>>>>>> thrown (at least I do not see any in the console).
>>>>>>>>>
>>>>>>>>> What can be wrong, and what should I look at? Any ideas?
>>>>>>>>>
>>>>>>>>> By the way - the new connector is for the Mantis bug tracker, to
>>>>>>>>> index issues.
>>>>>>>>>
>>>>>>>>> TIA
>>>>>>>>> Redguy
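Karl's point about the loop condition is the usual pitfall with batched processing: the framework may hand processDocuments() fewer identifiers than getMaxDocumentRequest(), so the loop must be bounded by the length of the array actually received, never by the configured maximum. A minimal sketch of both the correct and the buggy bound (the names and method shape are illustrative only, not the exact ManifoldCF connector API):

```java
// Sketch of the batch-iteration pitfall discussed in this thread.
// The final batch of a crawl is typically SHORTER than
// getMaxDocumentRequest(), so a loop bound on the configured maximum
// either reads past the end of the array or mishandles the tail.
// BatchSketch and its methods are invented for the example.
public class BatchSketch {
    static final int MAX_DOCUMENT_REQUEST = 5;

    // Correct: bounded by the batch actually delivered.
    static int processDocuments(String[] documentIdentifiers) {
        int processed = 0;
        for (int i = 0; i < documentIdentifiers.length; i++) {
            // ... fetch and index documentIdentifiers[i] here ...
            processed++;
        }
        return processed;
    }

    // Buggy variant: bounded by the configured maximum. On a final
    // batch of 3 documents this throws ArrayIndexOutOfBoundsException;
    // a wrong bound in the other direction instead leaves tail
    // documents permanently "active", matching the symptom above.
    static int processDocumentsBuggy(String[] documentIdentifiers) {
        int processed = 0;
        for (int i = 0; i < MAX_DOCUMENT_REQUEST; i++) {
            String id = documentIdentifiers[i]; // out of bounds on short batches
            processed++;
        }
        return processed;
    }
}
```

This matches the reported symptom (exactly getMaxDocumentRequest documents left "active"): the last, short batch is the first one where the delivered array length and the configured maximum differ.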