[ https://issues.apache.org/jira/browse/CONNECTORS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271494#comment-15271494 ]
Karl Wright commented on CONNECTORS-1312:
-----------------------------------------
"Connection reset by peer" sounds like something we can look for and retry on.
However, it may also mean one of two things: (1) you are crawling your server
too hard. We recommend that you reduce the maximum number of connections for
JCIFS to the point where you no longer get funky errors like this all over the
place; (2) you have a network switch that disconnects after a certain period
of reading data, or when you transfer too large a file. I have seen that in a
couple of installations.
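
To make the "look for and retry on" idea concrete, here is a minimal standalone
sketch. It is not the connector's actual code: the class ResetRetry, its method
names, and the attempt/backoff parameters are all hypothetical, and matching on
the message text is an assumption based on the message shown in the report
below.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of ManifoldCF or JCIFS. */
final class ResetRetry {
    private ResetRetry() {}

    /** Returns true if any exception in the cause chain mentions "Connection reset". */
    static boolean isConnectionReset(Throwable t) {
        // Bound the walk so that a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            String msg = t.getMessage();
            if (msg != null && msg.contains("Connection reset")) {
                return true;
            }
        }
        return false;
    }

    /**
     * Runs the action, retrying only on connection resets.
     * Any other exception, or a reset on the last attempt, is rethrown.
     */
    static <T> T callWithRetry(Callable<T> action, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!isConnectionReset(e) || attempt >= maxAttempts) {
                    throw e;
                }
                // Linear backoff: wait a little longer after each failed attempt.
                Thread.sleep(backoffMillis * attempt);
            }
        }
    }
}
```

Retrying in place like this only helps with transient resets. If the server is
overloaded, reducing the number of connections is still the better fix.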
> jcifs.smb.SmbException: Connection reset by peer: socket write error
> --------------------------------------------------------------------
>
> Key: CONNECTORS-1312
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1312
> Project: ManifoldCF
> Issue Type: Bug
> Components: JCIFS connector
> Affects Versions: ManifoldCF 2.5
> Environment: Windows x64, java 1.8.x
> Reporter: Konstantin Avdeev
>
> Hi Karl,
> we've found another JCIFS exception: Windows share jobs stop when
> encountering a "Connection reset by peer" error, e.g.:
> {code}
> ERROR 2016-05-03 15:29:24,209 (Worker thread '80') - JCIFS: SmbException
> tossed processing smb://server.domain.com/path/file.ppt
> jcifs.smb.SmbException: Connection reset by peer: socket write error
> java.net.SocketException: Connection reset by peer: socket write error
> at java.net.SocketOutputStream.socketWrite0(Native Method)
> at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
> at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> at jcifs.smb.SmbTransport.doSend(SmbTransport.java:453)
> at jcifs.util.transport.Transport.sendrecv(Transport.java:67)
> at jcifs.smb.SmbTransport.send(SmbTransport.java:655)
> at jcifs.smb.SmbSession.send(SmbSession.java:238)
> at jcifs.smb.SmbTree.send(SmbTree.java:119)
> at jcifs.smb.SmbFile.send(SmbFile.java:775)
> at jcifs.smb.SmbFileInputStream.readDirect(SmbFileInputStream.java:181)
> at jcifs.smb.SmbFileInputStream.read(SmbFileInputStream.java:142)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> at java.io.FilterInputStream.read(FilterInputStream.java:107)
> at java.nio.file.Files.copy(Files.java:2908)
> at java.nio.file.Files.copy(Files.java:3027)
> at org.apache.tika.io.TikaInputStream.getPath(TikaInputStream.java:587)
> at org.apache.tika.io.TikaInputStream.getFile(TikaInputStream.java:615)
> at org.apache.tika.parser.microsoft.POIFSContainerDetector.getTopLevelNames(POIFSContainerDetector.java:358)
> at org.apache.tika.parser.microsoft.POIFSContainerDetector.detect(POIFSContainerDetector.java:424)
> at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:77)
> at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
> at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:48)
> at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:227)
> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3224)
> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3075)
> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2706)
> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
> at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
> at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
> at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:979)
> at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
> {code}
> The current workaround is to start the job again (manually or via the
> scheduler). Clearly, there are many errors for which it makes no sense to
> skip a failed URL and continue the job, e.g.:
> {code}
> Error: SmbAuthException thrown: Logon failure: unknown user name or bad
> password.
> {code}
> I'm thinking about a general solution, such as defining a list (through the
> UI or properties.xml) of non-severe exceptions, like "file busy" or "symlink
> detected", so that admins can specify when the crawler should stop and when
> it should retry, or skip the document and continue.
> What do you think?
> Thank you!
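
The configurable list proposed above could be sketched roughly as follows. This
is only an illustration under stated assumptions: the class ErrorPolicy, the
Action enum, and matching rules against message fragments are hypothetical and
do not correspond to any existing ManifoldCF setting or API. How the rules
would be loaded from the UI or properties.xml is left out.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical admin-defined policy mapping error messages to crawler actions. */
final class ErrorPolicy {
    enum Action { RETRY, SKIP, ABORT }

    // Insertion order is preserved, so the first matching rule wins.
    private final Map<String, Action> rules = new LinkedHashMap<>();
    private final Action defaultAction;

    ErrorPolicy(Action defaultAction) {
        this.defaultAction = defaultAction;
    }

    /** Registers a case-insensitive message fragment and the action to take. */
    ErrorPolicy rule(String messageFragment, Action action) {
        rules.put(messageFragment.toLowerCase(Locale.ROOT), action);
        return this;
    }

    /** Returns the action of the first matching rule, or the default action. */
    Action classify(Throwable t) {
        String msg = t.getMessage() == null
                ? ""
                : t.getMessage().toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Action> e : rules.entrySet()) {
            if (msg.contains(e.getKey())) {
                return e.getValue();
            }
        }
        return defaultAction;
    }
}
```

With ABORT as the default, an unlisted error such as the logon failure quoted
above would still stop the job, while rules for "connection reset", "file busy"
or "symlink detected" could map to RETRY or SKIP.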
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)