Created CONNECTORS-1510 and committed a fix.

Karl

On Mon, Jun 18, 2018 at 2:33 PM Karl Wright <daddy...@gmail.com> wrote:
> It certainly is a particular file -- the mime type is null, and that's
> causing this line to blow up:
>
> final String lowerMimeType = mimeType.toLowerCase(Locale.ROOT);
>
> That code was added a couple of revs back to address a different problem;
> it's a trivial fix:
>
> final String lowerMimeType = (mimeType != null) ? mimeType.toLowerCase(Locale.ROOT) : null;
>
> (This is HttpPoster line 811)
>
> Karl
>
> On Mon, Jun 18, 2018 at 2:30 PM Steph van Schalkwyk <st...@remcam.net> wrote:
>
>> Looks like a particular file may be causing this. Try to find the filename
>> it crashes on and copy that file to a small crawl directory. Repeat the crawl.
>>
>> On Mon, Jun 18, 2018 at 11:34 AM, Bisonti Mario <mario.biso...@vimar.com> wrote:
>>
>>> Hello
>>>
>>> I configured ManifoldCF 2.10 with Tomcat 9.0.8 and Postgres 9.3.
>>>
>>> I configured multiprocess-file-example.
>>>
>>> When I create a job to scan a big Windows share (22,000 documents: Word,
>>> PDF, etc.), ManifoldCF crashes with the message:
>>>
>>> at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) [mcf-pull-agent.jar:?]
>>> FATAL 2018-06-18T18:29:23,676 (Worker thread '36') - Error tossed: null
>>> java.lang.NullPointerException
>>> at org.apache.manifoldcf.agents.output.solr.HttpPoster.checkMimeTypeIndexable(HttpPoster.java:811) ~[?:?]
>>> at org.apache.manifoldcf.agents.output.solr.SolrConnector.checkMimeTypeIndexable(SolrConnector.java:534) ~[?:?]
>>> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineCheckEntryPoint.checkMimeTypeIndexable(IncrementalIngester.java:2937) ~[mcf-agents.jar:?]
>>> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineCheckFanout.checkMimeTypeIndexable(IncrementalIngester.java:2864) ~[mcf-agents.jar:?]
>>> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObject.checkMimeTypeIndexable(IncrementalIngester.java:2589) ~[mcf-agents.jar:?]
>>> at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.checkMimeTypeIndexable(IncrementalIngester.java:273) ~[mcf-agents.jar:?]
>>> at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.checkMimeTypeIndexable(WorkerThread.java:2029) ~[mcf-pull-agent.jar:?]
>>> at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.checkIncludeFile(SharedDriveConnector.java:1439) ~[?:?]
>>> at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector$ProcessDocumentsFilter.accept(SharedDriveConnector.java:4874) ~[?:?]
>>> at jcifs.smb.SmbFile.doFindFirstNext(SmbFile.java:2016) ~[?:?]
>>> at jcifs.smb.SmbFile.doEnum(SmbFile.java:1741) ~[?:?]
>>> at jcifs.smb.SmbFile.listFiles(SmbFile.java:1718) ~[?:?]
>>> at jcifs.smb.SmbFile.listFiles(SmbFile.java:1707) ~[?:?]
>>> at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.fileListFiles(SharedDriveConnector.java:2318) ~[?:?]
>>> at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:798) ~[?:?]
>>> at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) [mcf-pull-agent.jar:?]
>>>
>>> If I use a smaller Windows share, it works.
>>>
>>> Note that with ManifoldCF 2.9.1, HSQLDB, and QuickStart with Jetty, it
>>> worked.
>>>
>>> What could I do?
>>>
>>> Thanks a lot
>>>
>>> Mario
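
For readers who want to see the null guard from the quoted fix in isolation, here is a minimal standalone sketch. It is not the actual HttpPoster code: the class `MimeTypeCheck`, the methods `normalizeMimeType` and `isIndexable`, and the set of included mime types are hypothetical names made up for illustration. Treating a null mime type as "not indexable" is also an assumption; the thread only shows that the lowercasing step has to tolerate null, not what the connector does with the result.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class MimeTypeCheck {

    // Same null guard as the one-line fix quoted in the thread:
    // only lowercase the mime type when it is actually present.
    public static String normalizeMimeType(String mimeType) {
        return (mimeType != null) ? mimeType.toLowerCase(Locale.ROOT) : null;
    }

    // Hypothetical caller. Rejecting a null mime type is an assumption
    // made for this sketch, not behavior confirmed by the thread.
    public static boolean isIndexable(String mimeType, Set<String> includedMimeTypes) {
        final String lowerMimeType = normalizeMimeType(mimeType);
        if (lowerMimeType == null) {
            return false;
        }
        return includedMimeTypes.contains(lowerMimeType);
    }

    public static void main(String[] args) {
        // Illustrative list of accepted types (hypothetical values).
        Set<String> included = new HashSet<>(
            Arrays.asList("application/pdf", "application/msword"));

        System.out.println(isIndexable("Application/PDF", included)); // true
        System.out.println(isIndexable(null, included));              // false, no NullPointerException
    }
}
```

Without the guard, the second call would throw the same java.lang.NullPointerException seen in the stack trace, because toLowerCase would be invoked on a null reference.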