Re: Error: Repeated service interruptions - failure processing document: Read timed out

2021-09-30 Thread Karl Wright
Hi,

You say this is a "Tika error".  Is this Tika as a stand-alone service?  I
do not recognize any ManifoldCF classes whatsoever in this thread dump.

If this is Tika, I suggest contacting the Tika team.

Karl
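
A note on reading the quoted Tika log: the last request is logged at 16:27:26 and the "Could not send Message" warning at 16:30:28. The root EofException is raised while Jetty flushes the response, which usually means the peer had already closed the connection, so the length of that gap may hint at the read timeout configured on the calling side. The Java sketch below only computes the gap between two log lines. The class and method names are invented for illustration and are not part of Tika or ManifoldCF.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

class TikaLogGap {
    // Timestamp layout taken from the log lines in this thread, e.g. "2021-09-28 16:27:26".
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Parses the leading 19-character timestamp of a log line. */
    static LocalDateTime timestampOf(String logLine) {
        return LocalDateTime.parse(logLine.substring(0, 19), TS);
    }

    /** Seconds elapsed between two log lines (later minus earlier). */
    static long gapSeconds(String earlierLine, String laterLine) {
        return Duration.between(timestampOf(earlierLine), timestampOf(laterLine)).getSeconds();
    }

    public static void main(String[] args) {
        String lastRequest =
                "2021-09-28 16:27:26 INFO  TikaResource:484 - tika (application/pdf)";
        String warning =
                "2021-09-28 16:30:28 WARN  PhaseInterceptorChain:468 - Interceptor for "
                + "{http://resource.server.tika.apache.org/}MetadataResource has thrown exception, unwinding now";
        // Prints 182: about three minutes passed before the response write failed.
        System.out.println(gapSeconds(lastRequest, warning));
    }
}
```

If the gap is close to the socket timeout set on the client, raising that timeout or checking why the document takes so long to parse would be natural next steps. This is an interpretation of the log, not something confirmed in the thread.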


On Thu, Sep 30, 2021 at 3:02 AM Bisonti Mario 
wrote:

> Additional info.
>
> I am using the 2.17-dev version.
>
> *From:* Bisonti Mario
> *Sent:* Tuesday, 28 September 2021 17:01
> *To:* user@manifoldcf.apache.org
> *Subject:* Error: Repeated service interruptions - failure processing
> document: Read timed out
>
> Hello
>
> I have an error on a job that parses a network folder.
>
> This is the Tika error:
> 2021-09-28 16:14:50 INFO  Server:415 - Started @1367ms
>
> 2021-09-28 16:14:50 WARN  ContextHandler:1671 - Empty contextPath
>
> 2021-09-28 16:14:50 INFO  ContextHandler:916 - Started
> o.e.j.s.h.ContextHandler@3dd69f5a{/,null,AVAILABLE}
>
> 2021-09-28 16:14:50 INFO  TikaServerCli:413 - Started Apache Tika server
> at http://sengvivv02.vimar.net:9998/
>
> 2021-09-28 16:15:04 INFO  MetadataResource:484 - meta (application/pdf)
>
> 2021-09-28 16:26:46 INFO  MetadataResource:484 - meta (application/pdf)
>
> 2021-09-28 16:26:46 INFO  TikaResource:484 - tika (application/pdf)
>
> 2021-09-28 16:27:23 INFO  MetadataResource:484 - meta (application/pdf)
>
> 2021-09-28 16:27:24 INFO  TikaResource:484 - tika (application/pdf)
>
> 2021-09-28 16:27:26 INFO  MetadataResource:484 - meta (application/pdf)
>
> 2021-09-28 16:27:26 INFO  TikaResource:484 - tika (application/pdf)
>
> 2021-09-28 16:30:28 WARN  PhaseInterceptorChain:468 - Interceptor for {
> http://resource.server.tika.apache.org/}MetadataResource has thrown
> exception, unwinding now
>
> org.apache.cxf.interceptor.Fault: Could not send Message.
>
> at
> org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor.handleMessage(MessageSenderInterceptor.java:67)
>
> at
> org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
>
> at
> org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
>
> at
> org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
>
> at
> org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
>
> at
> org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)
>
> at
> org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
>
> at
> org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
>
> at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>
> at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
>
> at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
>
> at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190)
>
> at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
>
> at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>
> at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
>
> at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
>
> at org.eclipse.jetty.server.Server.handle(Server.java:516)
>
> at
> org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
>
> at
> org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
>
> at
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
>
> at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
>
> at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
>
> at
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
>
> at
> org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
>
> at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
>
> at
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
>
> at java.base/java.lang.Thread.run(Thread.java:834)
>
> Caused by: org.eclipse.jetty.io.EofException
>
> at
> org.eclipse.jetty.io.ChannelEndPoint.flush(Chann

Re: Error: Repeated service interruptions - failure processing document: Read timed out

2021-09-30 Thread Bisonti Mario
Additional info.

I am using the 2.17-dev version.

From: Bisonti Mario
Sent: Tuesday, 28 September 2021 17:01
To: user@manifoldcf.apache.org
Subject: Error: Repeated service interruptions - failure processing document: Read timed out

Hello

I have an error on a job that parses a network folder.

This is the Tika error:
2021-09-28 16:14:50 INFO  Server:415 - Started @1367ms
2021-09-28 16:14:50 WARN  ContextHandler:1671 - Empty contextPath
2021-09-28 16:14:50 INFO  ContextHandler:916 - Started o.e.j.s.h.ContextHandler@3dd69f5a{/,null,AVAILABLE}
2021-09-28 16:14:50 INFO  TikaServerCli:413 - Started Apache Tika server at http://sengvivv02.vimar.net:9998/
2021-09-28 16:15:04 INFO  MetadataResource:484 - meta (application/pdf)
2021-09-28 16:26:46 INFO  MetadataResource:484 - meta (application/pdf)
2021-09-28 16:26:46 INFO  TikaResource:484 - tika (application/pdf)
2021-09-28 16:27:23 INFO  MetadataResource:484 - meta (application/pdf)
2021-09-28 16:27:24 INFO  TikaResource:484 - tika (application/pdf)
2021-09-28 16:27:26 INFO  MetadataResource:484 - meta (application/pdf)
2021-09-28 16:27:26 INFO  TikaResource:484 - tika (application/pdf)
2021-09-28 16:30:28 WARN  PhaseInterceptorChain:468 - Interceptor for {http://resource.server.tika.apache.org/}MetadataResource has thrown exception, unwinding now
org.apache.cxf.interceptor.Fault: Could not send Message.
at org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor.handleMessage(MessageSenderInterceptor.java:67)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)
at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.eclipse.jetty.io.EofException
at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:277)
at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:381)
at org.eclipse.jetty.server.HttpConnection$SendCallback.process(HttpConnection.java:826)
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:223)
at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:550)
at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:915)
at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:987)
at org.eclipse.jetty.server.HttpOutput.channelWrite(HttpOutput.java:285)
at org.eclipse.jetty.server.HttpOutput

Error: Repeated service interruptions - failure processing document: Read timed out

2021-09-28 Thread Bisonti Mario
Hello

I have an error on a job that parses a network folder.

This is the Tika error:
2021-09-28 16:14:50 INFO  Server:415 - Started @1367ms
2021-09-28 16:14:50 WARN  ContextHandler:1671 - Empty contextPath
2021-09-28 16:14:50 INFO  ContextHandler:916 - Started o.e.j.s.h.ContextHandler@3dd69f5a{/,null,AVAILABLE}
2021-09-28 16:14:50 INFO  TikaServerCli:413 - Started Apache Tika server at http://sengvivv02.vimar.net:9998/
2021-09-28 16:15:04 INFO  MetadataResource:484 - meta (application/pdf)
2021-09-28 16:26:46 INFO  MetadataResource:484 - meta (application/pdf)
2021-09-28 16:26:46 INFO  TikaResource:484 - tika (application/pdf)
2021-09-28 16:27:23 INFO  MetadataResource:484 - meta (application/pdf)
2021-09-28 16:27:24 INFO  TikaResource:484 - tika (application/pdf)
2021-09-28 16:27:26 INFO  MetadataResource:484 - meta (application/pdf)
2021-09-28 16:27:26 INFO  TikaResource:484 - tika (application/pdf)
2021-09-28 16:30:28 WARN  PhaseInterceptorChain:468 - Interceptor for {http://resource.server.tika.apache.org/}MetadataResource has thrown exception, unwinding now
org.apache.cxf.interceptor.Fault: Could not send Message.
at org.apache.cxf.interceptor.MessageSenderInterceptor$MessageSenderEndingInterceptor.handleMessage(MessageSenderInterceptor.java:67)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:90)
at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:308)
at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:265)
at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:247)
at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:79)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:190)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.eclipse.jetty.io.EofException
at org.eclipse.jetty.io.ChannelEndPoint.flush(ChannelEndPoint.java:279)
at org.eclipse.jetty.io.WriteFlusher.flush(WriteFlusher.java:422)
at org.eclipse.jetty.io.WriteFlusher.write(WriteFlusher.java:277)
at org.eclipse.jetty.io.AbstractEndPoint.write(AbstractEndPoint.java:381)
at org.eclipse.jetty.server.HttpConnection$SendCallback.process(HttpConnection.java:826)
at org.eclipse.jetty.util.IteratingCallback.processing(IteratingCallback.java:241)
at org.eclipse.jetty.util.IteratingCallback.iterate(IteratingCallback.java:223)
at org.eclipse.jetty.server.HttpConnection.send(HttpConnection.java:550)
at org.eclipse.jetty.server.HttpChannel.sendResponse(HttpChannel.java:915)
at org.eclipse.jetty.server.HttpChannel.write(HttpChannel.java:987)
at org.eclipse.jetty.server.HttpOutput.channelWrite(HttpOutput.java:285)
at org.eclipse.jetty.server.HttpOutput.close(HttpOutput.java:638)
at org.apache.cxf.transport.http_jetty.JettyHTTPDestination$JettyOutputStream.close(JettyHTTPDestination.java:329)
at

Re: Error: Repeated service interruptions - failure processing document: Failed to acquire credits in time

2020-05-21 Thread Karl Wright
So the folder is accessible, but can you open the specific document
itself?  There may be an issue there unrelated to the folder.

If it does open OK, then I'm afraid you're beyond my knowledge of what the
problem might be.  The current JCIFS library comes from a GitHub project
and perhaps you can contact the maintainers to get them to interpret what
it means.  Sometimes just googling the precise error message (not
ManifoldCF's, but the underlying JCIFS error) can help clarify the issue.
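
To illustrate what "the underlying error" means here: ManifoldCF wraps lower-level exceptions, so the message worth searching for sits at the bottom of the cause chain. The Java sketch below walks that chain. It is only an illustration: the class name is invented, and plain JDK exceptions stand in for ManifoldCFException and jcifs.smb.SmbException so that no extra jars are needed.

```java
import java.io.IOException;

class RootCause {
    /** Follows getCause() links down to the innermost exception. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-ins for the real exception types (assumption for this sketch):
        // IOException plays the JCIFS error, RuntimeException the ManifoldCF wrapper.
        Throwable inner = new IOException("Failed to acquire credits in time.");
        Throwable outer = new RuntimeException(
                "Repeated service interruptions - failure processing document", inner);

        Throwable root = rootCause(outer);
        // The root message is the text to search for, not the wrapper's message.
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

In a log file the same information appears as the last "Caused by:" entry of the stack trace.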

Karl


On Thu, May 21, 2020 at 4:00 AM ritika jain 
wrote:

> Reply:-
> The smb exception means that it is coming from the JCIFS library, which is
> trying to find documents and their metadata from your windows shares, and
> is apparently not getting something it needs back promptly. Perhaps the
> user you are using to do the crawl has insufficient privileges? Also, the
> error you are seeing is a new one; I've never seen that before, so the
> connector hasn't either, and it basically doesn't know whether to skip the
> document or hard fail. But what I'd do is try to open the document yourself
> in Windows and find out whether it seems to work or not, for a start.
>
> Many thanks for your reply.
> I will follow only the mail chain from now on.
> I have checked the user privileges; the user has all access rights.
> Manual access to the folders also works fine, and the folder is accessible.
> Is it possible that the Windows shares connector is having a problem
> while connecting (a network issue)?
>
> Thanks
> Ritika
>
> On Tue, May 19, 2020 at 2:39 PM Karl Wright  wrote:
>
>> I commented in the ticket you created.
>> Thanks,
>> Karl
>>
>> On Tue, May 19, 2020 at 3:07 AM ritika jain 
>> wrote:
>>
>>> Hi All,
>>>
>>> I have configured a Units job (ManifoldCF 2.14, ES 7.6.2 and PostgreSQL
>>> 9.6.10) on a server to access files from a Samba SMBv3 server, with
>>> jcifs-ng-2.1.2.jar loaded in ManifoldCF's lib directory.
>>>
>>> After ingesting some records into the index, I got this error in the logs:
>>> Unrecognized SmbException thrown getting document version for
>>> smb://store1.directory.intra/folders/UnitsTag1/Hydraulic Engineering/13 HYE Data/morelis/VSS/MatlabTools/data/s/srca.a
>>> jcifs.smb.SmbException: Failed to acquire credits in time.
>>>
>>> Can anybody please help me understand the possible cause of this error?
>>> Could it be a network connection issue or something else?
>>>
>>> For info: no authority connection or Active Directory has been used so
>>> far. Also, the "Use SID for security" checkbox in the ManifoldCF UI is
>>> unchecked.
>>>
>>> Any help will be greatly appreciated.
>>>
>>> Thanks
>>> Ritika
>>>


Re: Error: Repeated service interruptions - failure processing document: Failed to acquire credits in time

2020-05-21 Thread ritika jain
Reply:-
The smb exception means that it is coming from the JCIFS library, which is
trying to find documents and their metadata from your windows shares, and
is apparently not getting something it needs back promptly. Perhaps the
user you are using to do the crawl has insufficient privileges? Also, the
error you are seeing is a new one; I've never seen that before, so the
connector hasn't either, and it basically doesn't know whether to skip the
document or hard fail. But what I'd do is try to open the document yourself
in Windows and find out whether it seems to work or not, for a start.

Many thanks for your reply.
I will follow only the mail chain from now on.
I have checked the user privileges; the user has all access rights. Manual
access to the folders also works fine, and the folder is accessible. Is it
possible that the Windows shares connector is having a problem while
connecting (a network issue)?

Thanks
Ritika

On Tue, May 19, 2020 at 2:39 PM Karl Wright  wrote:

> I commented in the ticket you created.
> Thanks,
> Karl
>
> On Tue, May 19, 2020 at 3:07 AM ritika jain 
> wrote:
>
>> Hi All,
>>
>> I have configured a Units job (ManifoldCF 2.14, ES 7.6.2 and PostgreSQL
>> 9.6.10) on a server to access files from a Samba SMBv3 server, with
>> jcifs-ng-2.1.2.jar loaded in ManifoldCF's lib directory.
>>
>> After ingesting some records into the index, I got this error in the logs:
>> Unrecognized SmbException thrown getting document version for
>> smb://store1.directory.intra/folders/UnitsTag1/Hydraulic Engineering/13 HYE Data/morelis/VSS/MatlabTools/data/s/srca.a
>> jcifs.smb.SmbException: Failed to acquire credits in time.
>>
>> Can anybody please help me understand the possible cause of this error?
>> Could it be a network connection issue or something else?
>>
>> For info: no authority connection or Active Directory has been used so
>> far. Also, the "Use SID for security" checkbox in the ManifoldCF UI is
>> unchecked.
>>
>> Any help will be greatly appreciated.
>>
>> Thanks
>> Ritika
>>


Re: Error: Repeated service interruptions - failure processing document: Failed to acquire credits in time

2020-05-19 Thread Karl Wright
I commented in the ticket you created.
Thanks,
Karl

On Tue, May 19, 2020 at 3:07 AM ritika jain 
wrote:

> Hi All,
>
> I have configured a Units job (ManifoldCF 2.14, ES 7.6.2 and PostgreSQL
> 9.6.10) on a server to access files from a Samba SMBv3 server, with
> jcifs-ng-2.1.2.jar loaded in ManifoldCF's lib directory.
>
> After ingesting some records into the index, I got this error in the logs:
> Unrecognized SmbException thrown getting document version for
> smb://store1.directory.intra/folders/UnitsTag1/Hydraulic Engineering/13 HYE Data/morelis/VSS/MatlabTools/data/s/srca.a
> jcifs.smb.SmbException: Failed to acquire credits in time.
>
> Can anybody please help me understand the possible cause of this error?
> Could it be a network connection issue or something else?
>
> For info: no authority connection or Active Directory has been used so
> far. Also, the "Use SID for security" checkbox in the ManifoldCF UI is
> unchecked.
>
> Any help will be greatly appreciated.
>
> Thanks
> Ritika
>


Error: Repeated service interruptions - failure processing document: Failed to acquire credits in time

2020-05-19 Thread ritika jain
Hi All,

I have configured a Units job (ManifoldCF 2.14, ES 7.6.2 and PostgreSQL
9.6.10) on a server to access files from a Samba SMBv3 server, with
jcifs-ng-2.1.2.jar loaded in ManifoldCF's lib directory.

After ingesting some records into the index, I got this error in the logs:
Unrecognized SmbException thrown getting document version for
smb://store1.directory.intra/folders/UnitsTag1/Hydraulic Engineering/13 HYE Data/morelis/VSS/MatlabTools/data/s/srca.a
jcifs.smb.SmbException: Failed to acquire credits in time.

Can anybody please help me understand the possible cause of this error?
Could it be a network connection issue or something else?

For info: no authority connection or Active Directory has been used so
far. Also, the "Use SID for security" checkbox in the ManifoldCF UI is
unchecked.

Any help will be greatly appreciated.

Thanks
Ritika


Re: Error: Repeated service interruptions - failure processing document: Read timed out

2013-11-07 Thread Ronny Heylen
Hi,
We have reset throttling to 10 for AD and Solr (2 for the Windows
repository).
The job indexing all pptx to null output has run successfully (162733 documents).
The job indexing all pptx to Solr still fails; manifoldcf.log contains:
 WARN 2013-11-07 14:34:06,502 (Worker thread '29') - JCIFS: Possibly transient exception detected on attempt 1 while getting share security: All pipe instances are busy.
jcifs.smb.SmbException: All pipe instances are busy.
at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:563)
at jcifs.smb.SmbTransport.send(SmbTransport.java:663)
at jcifs.smb.SmbSession.send(SmbSession.java:238)
at jcifs.smb.SmbTree.send(SmbTree.java:119)
at jcifs.smb.SmbFile.send(SmbFile.java:775)
at jcifs.smb.SmbFile.open0(SmbFile.java:989)
at jcifs.smb.SmbFile.open(SmbFile.java:1006)
at jcifs.smb.SmbFileOutputStream.init(SmbFileOutputStream.java:142)
at jcifs.smb.TransactNamedPipeOutputStream.init(TransactNamedPipeOutputStream.java:32)
at jcifs.smb.SmbNamedPipe.getNamedPipeOutputStream(SmbNamedPipe.java:187)
at jcifs.dcerpc.DcerpcPipeHandle.doSendFragment(DcerpcPipeHandle.java:68)
at jcifs.dcerpc.DcerpcHandle.sendrecv(DcerpcHandle.java:190)
at jcifs.dcerpc.DcerpcHandle.bind(DcerpcHandle.java:126)
at jcifs.dcerpc.DcerpcHandle.sendrecv(DcerpcHandle.java:140)
at jcifs.smb.SmbFile.getShareSecurity(SmbFile.java:2943)
at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.getFileShareSecurity(SharedDriveConnector.java:2393)
at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.describeDocumentSecurity(SharedDriveConnector.java:1045)
at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.getDocumentVersions(SharedDriveConnector.java:554)
at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:322)
 WARN 2013-11-07 14:55:45,257 (Worker thread '30') - IO exception during indexing: Read timed out
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:291)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:919)
 WARN 2013-11-07 14:55:45,273 (Worker thread '30') - Service interruption reported for job 1383765534700 connection 'Filesharesrv1': IO exception during indexing: Read timed out
ERROR 2013-11-07 14:55:45,304 (Worker thread '30') - Exception tossed: Repeated service interruptions - failure processing document: Read timed out
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: Read timed out
at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:586)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152
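
For readers unfamiliar with this exception: the frames above show the timeout occurring in SocketInputStream.read while HttpClient is waiting for the response header (receiveResponseHeader), which suggests the request reached Solr but no reply arrived within the socket timeout. The Java sketch below reproduces that condition with JDK classes only. It is a hypothetical demo, not ManifoldCF or SolrJ code; the local server, the class name and the 200 ms timeout are assumptions chosen for the example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {
    /**
     * Connects to a local server that never writes a byte and reports
     * whether the blocking read gave up with a SocketTimeoutException.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // upper bound for each blocking read
            try {
                client.getInputStream().read(); // blocks: the server side stays silent
                return false;
            } catch (SocketTimeoutException e) {
                return true; // same exception type as in the log above
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

A slow response on the receiving side (for example, a large document still being processed) produces the same exception as a network fault, so the log alone does not say which of the two happened.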

Re: Error: Repeated service interruptions - failure processing document: Read timed out

2013-11-07 Thread Karl Wright
 thread '30') - Exception tossed:
 Repeated service interruptions - failure processing document: Read timed out
 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated
 service interruptions - failure processing document: Read timed out
 at
 org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:586)
 Caused by: java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:152)
 at java.net.SocketInputStream.read(SocketInputStream.java:122)
 at
 org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
 at
 org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
 at
 org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
 at
 org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
 at
 org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
 at
 org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
 at
 org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
 at
 org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
 at
 org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
 at
 org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
 at
 org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
 at
 org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
 at
 org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
 at
 org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:291)
 at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
 at
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
 at
 org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:919)
  WARN 2013-11-07 15:06:04,235 (Worker thread '9') - IO exception during
 indexing: Read timed out
 java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:152)
 at java.net.SocketInputStream.read(SocketInputStream.java:122)
 at
 org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
 at
 org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
 at
 org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
 at
 org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
 at
 org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
 at
 org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
 at
 org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
 at
 org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
 at
 org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
 at
 org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
 at
 org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
 at
 org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
 at
 org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
 at
 org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:291)
 at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
 at
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117

Re: Error: Repeated service interruptions - failure processing document: Read timed out

2013-11-07 Thread Ronny Heylen
)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
 at
 org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:291)
 at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
 at
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
 at
 org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:919)
  WARN 2013-11-07 14:55:45,273 (Worker thread '30') - Service interruption
 reported for job 1383765534700 connection 'Filesharesrv1': IO exception
 during indexing: Read timed out
 ERROR 2013-11-07 14:55:45,304 (Worker thread '30') - Exception tossed:
 Repeated service interruptions - failure processing document: Read timed out
 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated
 service interruptions - failure processing document: Read timed out
 at
 org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:586)
 Caused by: java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:152)
 at java.net.SocketInputStream.read(SocketInputStream.java:122)
 at
 org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
 at
 org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
 at
 org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
 at
 org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
 at
 org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
 at
 org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
 at
 org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
 at
 org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
 at
 org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
 at
 org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
 at
 org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
 at
 org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
 at
 org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
 at
 org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
 at
 org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:291)
 at
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
 at
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
 at
 org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:919)
  WARN 2013-11-07 15:06:04,235 (Worker thread '9') - IO exception during
 indexing: Read timed out
 java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:152)
 at java.net.SocketInputStream.read(SocketInputStream.java:122)
 at
 org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
 at
 org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
 at
 org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
 at
 org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
 at
 org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
 at
 org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
 at
 org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
 at
 org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
 at
 org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
 at
 org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
 at
 org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
 at
 org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715

Error: Repeated service interruptions - failure processing document: Read timed out

2013-11-06 Thread Ronny Heylen
Hi,
We use ManifoldCF 1.3 and Solr 4.4 to index a shared network drive with
several hundred thousand documents.
Using a single ManifoldCF job to index the whole drive always produced some
kind of error, so to narrow down where the problem lies, we
made one job to index all *.doc*, another for *.xls*, another for
*.pdf ...
Using the help from the list (thanks!) we set the size limit to 100MB and
all jobs succeed (great) except the one for *.pptx.
The message is:
Error: Repeated service interruptions - failure processing document: Read
timed out
We found no error in the logs we searched: solr.log, ...
Based on some hints found on the Internet, we set the Throttling max
connections setting to 2 (instead of 10) in 3 places:
output connection to Solr
authority connection to the Active Directory
repository connection to the Windows file share
But the problem stays the same.
We tried another machine with Solr 4.5 and ManifoldCF 1.4: same
problem.
We can run the job for all *.PDF, all *.DOC*, or all *.XLS* without
problems, but the same message always appears for *.PPTX.
The last time the job stopped with this message, it displayed (the numbers
differ on each run because the Windows drive is changing) 56311 documents,
with 17466 busy and 38847 processed.
As we find nothing in the logs (though we are probably not looking in the
right place), we don't know what to do.
Thanks for your help,
Ronny and Frédéric
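
Since only the *.pptx job fails, one way to narrow things down is to see how
the files on the share are distributed by extension and size. The following
is a minimal Python sketch, assuming the share is mounted or otherwise
readable as a local path. The function name survey is hypothetical, and
reading the 100MB limit as 100 * 1024 * 1024 bytes is an assumption, not a
documented ManifoldCF value.

```python
import os
from collections import defaultdict

# Assumption: the 100MB job limit is interpreted here as 100 MiB.
SIZE_LIMIT_BYTES = 100 * 1024 * 1024


def survey(root, size_limit=SIZE_LIMIT_BYTES):
    """Walk root and return {extension: (file_count, largest_size, count_over_limit)}.

    Extensions are lower-cased so that .PPTX and .pptx are counted together.
    """
    stats = defaultdict(lambda: [0, 0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            try:
                size = os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            entry = stats[ext]
            entry[0] += 1
            entry[1] = max(entry[1], size)
            if size > size_limit:
                entry[2] += 1
    return {ext: tuple(values) for ext, values in stats.items()}
```

Calling survey on the mounted share and looking at the ".pptx" entry shows
how many such files exist, how large the biggest one is, and how many exceed
the limit. It does not say why indexing times out, only where to look first.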


Re: Error: Repeated service interruptions - failure processing document: Read timed out

2013-11-06 Thread Karl Wright
Hi Ronny,

One minor thing: you should only need to set throttling to 2 for the
Windows repository connection, not for AD or Solr.


As for how to debug this issue, first off you should be looking in the
manifoldcf.log file (or the equivalent).  You should see WARN messages from
the shared file connector under most conditions when there's a service
interruption.  You would probably see "Read timed out" warnings if you
looked there, since that is what aborted the job run, along with a stack
trace.  However, that's not going to add much information to the analysis
at this point.

What might be valuable is to determine whether the problem is happening on
the Windows side or on the Solr side.  At this point I can't tell.  You
could, however, create a null output connection, and create a similar job
that sends its output there, and see if it completes.  Can you do this and
get back to me?

Thanks,
Karl
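
The suggestion to look for WARN messages in manifoldcf.log can be sketched
as a small Python scan. This is only an illustration: the line format the
regular expression expects is inferred from the WARN and ERROR lines quoted
elsewhere in this thread, it assumes each entry sits on one line in the
actual log file (the mail archive wraps them), and summarize is a
hypothetical helper name.

```python
import re
from collections import Counter

# Matches entries such as:
#  WARN 2013-11-07 14:55:45,273 (Worker thread '30') - Service interruption ...
LINE_RE = re.compile(
    r"^\s*(?P<level>WARN|ERROR)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\((?P<thread>[^)]*)\)\s+-\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count WARN/ERROR entries per (level, thread) and collect timeout entries."""
    counts = Counter()
    timeouts = []
    for raw in lines:
        match = LINE_RE.match(raw)
        if not match:
            continue  # stack-trace frames and other log levels are skipped
        counts[(match["level"], match["thread"])] += 1
        if "Read timed out" in match["msg"]:
            timeouts.append((match["ts"], match["thread"], match["msg"]))
    return counts, timeouts
```

Passing an open file handle for manifoldcf.log to summarize gives a count of
warnings per worker thread and the timestamps of every "Read timed out"
entry, which makes it easier to see whether the timeouts cluster around
particular times or documents.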








Re: Error: Repeated service interruptions - failure processing document: Read timed out

2013-11-06 Thread Ronny Heylen
Ok Karl, thanks for the tip and the quick response, we will do this and
come back with the result.







Re: Repeated service interruptions

2012-08-01 Thread Karl Wright
On Wed, Aug 1, 2012 at 5:48 AM, Shinichiro Abe
shinichiro.ab...@gmail.com wrote:
 Hi Karl,

 I still have a problem.
 I reduced the maximum number of connections to 2.
 I rebooted the file server, not the domain controller.
 When I configured the paths[1], the log showed no error
 and the ShareDrive connector crawled the files successfully.
 When I reset the path config to the default (matching *),
 the log showed the "All pipe instances are busy" error.
 Both path configs pointed to the same location.

 Also, when this error occurred, the ingest log showed that
 HttpPoster was waiting for the response stream,
 could not get a response from Solr,
 and threw SocketTimeoutException.
 I increased jcifs.smb.client.responseTimeout
 but the exception was still thrown.
 On the Solr side, Jetty threw SocketException (socket write error).
 I'm checking the Solr logs.
 Solr may be doing something wrong when running /update/extract.


If Solr threw the exception this sounds likely.

 Have you seen anything like this?
 Does the path matching config affect those errors?

 [1]Paths Tab:
 Include  directory(s)  matching  /01*


This should have nothing to do with socket exceptions, except possibly
that the crawler winds up trying to read a file that isn't actually a
file but is something else, like a named pipe or something.  This
typically doesn't happen if the server is a Windows machine but if it
is a Samba server I could imagine something like that happening.

Karl
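
For readers unfamiliar with what "Read timed out" means at the socket level,
the following Python sketch reproduces the same situation in miniature: the
TCP connection is established and the request is sent, but no response
bytes arrive before the client's read timeout expires. It is an analogy
only, not ManifoldCF or Solr code. The function name read_with_timeout is
hypothetical, and the request line merely imitates a post to /update/extract.

```python
import socket


def read_with_timeout(timeout_seconds):
    """Connect to a local listener that never answers and try to read a reply."""
    # The listener is never accept()ed; the OS still completes the TCP
    # handshake from its backlog, so the client connects but hears nothing.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port), timeout=timeout_seconds)
    try:
        client.sendall(b"POST /update/extract HTTP/1.1\r\nHost: localhost\r\n\r\n")
        client.recv(1024)  # blocks until data arrives or the timeout expires
        return "got a response"
    except socket.timeout:
        # Python's counterpart of java.net.SocketTimeoutException: Read timed out
        return "read timed out"
    finally:
        client.close()
        listener.close()
```

The point of the sketch is that a read timeout says nothing about who is at
fault: the client only knows that the other side did not answer in time,
which is why the server-side logs are the next place to look.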

 P.S.
 Thank you for fixing CONNECTORS-494.
 I checked the trunk code and it works well.

 Thank you,
 Shinichiro Abe


Re: Repeated service interruptions

2012-07-24 Thread Karl Wright
Hi Abe-san,

Did you figure out what the problem was?

Karl









Re: Repeated service interruptions

2012-07-19 Thread Karl Wright
Hi Abe-san,

Sometimes what looks like a server error can actually be due to the
domain controller.  I wonder if the domain controller needs to be
rebooted?

Karl

On Thu, Jul 19, 2012 at 5:12 AM, Shinichiro Abe
shinichiro.ab...@gmail.com wrote:
 Hi Karl,
 Thank you for the reply.
 I tried reducing the maximum number of connections from 10
 to 5, but that did not avoid the busy error. I'll try reducing it further.
 Thank you.
 Shinichiro Abe

 On 2012/07/19, at 15:55, Karl Wright wrote:

 Hi Abe-san,

 The "All pipe instances are busy" error is coming from the Windows
 server you are trying to crawl.  I don't know what is happening there,
 but here are some possibilities:

 (1) The Windows server is just overloaded; you can try reducing the
 maximum number of connections to 2 or 3 to see if that helps.
 (2) The Windows server needs rebooting.

 Thanks,
 Karl

 On Wed, Jul 18, 2012 at 10:09 PM, Shinichiro Abe
 shinichiro.ab...@gmail.com wrote:
 Hi,

 I use the Windows shares connector and ran a job.
 The job was aborted instead of finishing normally, and the job's status said:
 Error: Repeated service interruptions - failure processing document: Read
 timed out

 Why was the job aborted? I use ManifoldCF 0.5.1 and the latest version of
 jcifs.jar.
 Is the crawled server busy? The server where MCF is installed does not seem
 to be busy,
 but the other servers that MCF crawls seem to be busy.
 How can I run the job without errors? What's wrong?


 The connector logs:

 WARN 2012-07-12 16:28:52,648 (Worker thread '19') - JCIFS: Possibly 
 transient exception detected on attempt 1 while getting share security: All 
 pipe instances are busy.
at jcifs.smb.SmbTransport.checkStatus(SmbTransport.java:563)
at jcifs.smb.SmbTransport.send(SmbTransport.java:663)
 ..
 WARN 2012-07-12 16:36:37,585 (Worker thread '19') - JCIFS: Possibly 
 transient exception detected on attempt 3 while getting share security: All 
 pipe instances are busy.
 ..
 WARN 2012-07-12 16:36:37,585 (Worker thread '19') - JCIFS: 'Busy' response 
 when getting document version for 
 smb://XX.XX.XX.XX/D$/abcde/1234/123456789/e123456789a.pdf: retrying...
 ..
 WARN 2012-07-12 16:36:37,585 (Worker thread '19') - Pre-ingest service 
 interruption reported for job 1342076182624 connection 'Windows shares': 
 Timeout or other service interruption: All pipe instances are busy.
 ..
 WARN 2012-07-12 19:14:30,335 (Worker thread '19') - Service interruption 
 reported for job 1342076182624 connection 'Windows shares': Ingestion API 
 socket timeout exception waiting for response code: Read timed out; 
 ingestion will be retried again later
 ..
 WARN 2012-07-12 20:43:50,210 (Worker thread '19') - Service interruption 
 reported for job 1342076182624 connection 'Windows shares': Ingestion API 
 socket timeout exception waiting for response code: Read timed out; 
 ingestion will be retried again later
 ..
 ERROR 2012-07-12 20:43:50,210 (Worker thread '19') - Exception tossed: 
 Repeated service interruptions - failure processing document: Read timed out
 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service 
 interruptions - failure processing document: Read timed out
at 
 org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:606)
 Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at 
 org.apache.manifoldcf.agents.output.solr.HttpPoster.readLine(HttpPoster.java:571)
at 
 org.apache.manifoldcf.agents.output.solr.HttpPoster.getResponse(HttpPoster.java:598)

 Thanks in advance,
 Shinichiro Abe