It looks like you are still running out of memory. I would love to know which document was causing that. I suspect it is very large, and for some reason it cannot be streamed.
Karl On Wed, Jul 25, 2018 at 1:13 PM Karl Wright <daddy...@gmail.com> wrote: > Hi Maxence, > > The second exception occurs because processing is still running > while the JVM is shutting down; it can be ignored. > > Karl > > > On Wed, Jul 25, 2018 at 1:01 PM msaunier <msaun...@citya.com> wrote: > >> Hi Karl, >> >> >> >> I have added the snapshot and I’m being spammed with this error: >> >> >> >> FATAL 2018-07-25T16:43:04,599 (Worker thread '0') - Error tossed: >> org/apache/commons/compress/utils/InputStreamStatistics >> >> java.lang.NoClassDefFoundError: >> org/apache/commons/compress/utils/InputStreamStatistics >> >> at >> org.apache.poi.openxml4j.util.ZipArchiveThresholdInputStream.<init>(ZipArchiveThresholdInputStream.java:62) >> ~[?:?] >> >> at >> org.apache.poi.openxml4j.util.ZipSecureFile.getInputStream(ZipSecureFile.java:147) >> ~[?:?] >> >> at >> org.apache.poi.openxml4j.util.ZipSecureFile.getInputStream(ZipSecureFile.java:34) >> ~[?:?] >> >> at >> org.apache.poi.openxml4j.util.ZipFileZipEntrySource.getInputStream(ZipFileZipEntrySource.java:66) >> ~[?:?] >> >> at >> org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:255) >> ~[?:?] >> >> at >> org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:725) ~[?:?] >> >> at >> org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:238) ~[?:?] >> >> at >> org.apache.tika.parser.pkg.ZipContainerDetector.detectOPCBased(ZipContainerDetector.java:197) >> ~[?:?] >> >> at >> org.apache.tika.parser.pkg.ZipContainerDetector.detectZipFormat(ZipContainerDetector.java:127) >> ~[?:?] >> >> at >> org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:88) >> ~[?:?] >> >> at >> org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:84) >> ~[?:?] >> >> at >> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:116) >> ~[?:?] >> >> at >> org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74) >> ~[?:?] 
>> >> at >> org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235) >> ~[?:?] >> >> at >> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) >> ~[mcf-agents.jar:?] >> >> at >> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) >> ~[mcf-agents.jar:?] >> >> at >> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708) >> ~[mcf-agents.jar:?] >> >> at >> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756) >> ~[mcf-agents.jar:?] >> >> at >> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583) >> ~[mcf-pull-agent.jar:?] >> >> at >> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548) >> ~[mcf-pull-agent.jar:?] >> >> at >> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939) >> ~[?:?] >> >> at >> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) >> [mcf-pull-agent.jar:?] >> >> >> >> Maxence, >> >> >> >> >> >> *From:* Karl Wright [mailto:daddy...@gmail.com] >> *Sent:* Wednesday, July 25, 2018 13:12 >> *To:* user@manifoldcf.apache.org >> *Subject:* Re: Out of memory, one file bug i think >> >> >> >> Hi Maxence, >> >> >> >> Tomorrow (7/26) the POI project will be delivering a nightly build which >> should repair the Class Not Found exceptions. You will need to download it >> here: >> >> >> https://builds.apache.org/view/P/view/POI/job/POI-DSL-1.8/lastSuccessfulBuild/artifact/build/dist/ >> >> >> >> ... 
and replace all poi jars with the corresponding ones from the binary >> distribution. I believe the poi jars are all in connector-common-lib. Be >> sure to delete the old ones (or move them somewhere else) first. >> >> >> >> I don't know whether this will fix your out-of-memory problem, however. >> Please let me know what's still not working and I can take it from there. >> >> >> >> Karl >> >> >> >> >> >> On Wed, Jul 25, 2018 at 6:03 AM Karl Wright <daddy...@gmail.com> wrote: >> >> Out of memory errors are fatal, I'm afraid, because they corrupt not only >> the document in question but all others being processed at the same time. >> So those cannot be ignored. >> >> >> >> Tika should ignore documents that it cannot process, however, and that is >> a great enhancement request for them. >> >> >> >> Karl >> >> >> >> >> >> On Wed, Jul 25, 2018 at 3:39 AM msaunier <msaun...@citya.com> wrote: >> >> Hi Karl, >> >> >> >> Okay. So today, I'm going to force ManifoldCF to run so that only the >> documents are left behind. >> >> In the future, could I ignore these errors? They make the >> application crash, which is not great behavior in production. >> >> >> >> Thanks >> >> Maxence, >> >> >> >> >> >> *From:* Karl Wright [mailto:daddy...@gmail.com] >> *Sent:* Tuesday, July 24, 2018 17:53 >> *To:* user@manifoldcf.apache.org >> *Subject:* Re: Out of memory, one file bug i think >> >> >> >> The problem isn't with images in general; it's with certain kinds of >> images. There are optional dependencies in Tika for some kinds of images >> that we cannot include in the MCF distribution because of licensing >> problems. I don't know which kinds these are but apparently you are trying >> to index some of them. >> >> You will need to find and download the right jar and put it in the >> connector-common-lib folder for this to work. 
>> >> >> >> Karl >> >> >> >> >> >> On Tue, Jul 24, 2018 at 11:36 AM msaunier <msaun...@citya.com> wrote: >> >> On another crawl I extract images with the same parameters and I do not have >> problems with images. They are indexed without errors. Images are necessary >> for this job. I will try to recreate my job and test. >> >> >> >> Thanks, >> >> Maxence, >> >> >> >> >> >> >> >> >> >> *From:* Karl Wright [mailto:daddy...@gmail.com] >> *Sent:* Tuesday, July 24, 2018 17:32 >> *To:* user@manifoldcf.apache.org >> *Subject:* Re: Out of memory, one file bug i think >> >> >> >> " java.lang.NoSuchMethodException: >> org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTPictureBaseImpl.<init>(org.apache.xmlbeans.SchemaType, >> boolean)" >> >> >> >> This exception is occurring because you are trying to extract content >> from an image. In order for this to work you need a jar that isn't >> supplied with Tika for licensing reasons. Can you exclude images from your >> crawl? >> >> >> >> Karl >> >> >> >> >> >> On Tue, Jul 24, 2018 at 10:32 AM msaunier <msaun...@citya.com> wrote: >> >> Hi Karl, >> >> >> >> With just connectors in debug I have this information: >> >> >> >> [Thread-269948] INFO org.apache.zookeeper.ZooKeeper - Initiating client >> connection, connectString=kemp-formation-solr:2181 sessionTimeout=60000 >> watcher=org.apache.solr.common.cloud.SolrZkClient$3@3c351b22 >> >> [Thread-269948-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> kemp-formation-solr.citya.local/192.168.37.107:2181. 
Will not attempt to >> authenticate using SASL (unknown error) >> >> [Thread-269948-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Socket connection established to >> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session >> >> [Thread-269948-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Session establishment complete on server >> kemp-formation-solr.citya.local/192.168.37.107:2181, sessionid = >> 0xff00000201970049, negotiated timeout = 40000 >> >> [Thread-269948] INFO org.apache.solr.common.cloud.ZkStateReader - Updated >> live nodes from ZooKeeper... (0) -> (2) >> >> [Thread-269948] INFO >> org.apache.solr.client.solrj.impl.ZkClientClusterStateProvider - Cluster at >> kemp-formation-solr:2181 ready >> >> java.lang.NoSuchMethodException: >> org.openxmlformats.schemas.wordprocessingml.x2006.main.impl.CTPictureBaseImpl.<init>(org.apache.xmlbeans.SchemaType, >> boolean) >> >> at java.lang.Class.getConstructor0(Class.java:3082) >> >> at java.lang.Class.getDeclaredConstructor(Class.java:2178) >> >> at >> org.apache.xmlbeans.impl.schema.SchemaTypeImpl.getJavaImplConstructor2(SchemaTypeImpl.java:1817) >> >> at >> org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createUnattachedSubclass(SchemaTypeImpl.java:1961) >> >> at >> org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createUnattachedNode(SchemaTypeImpl.java:1950) >> >> at >> org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createElementType(SchemaTypeImpl.java:1051) >> >> at >> org.apache.xmlbeans.impl.values.XmlObjectBase.create_element_user(XmlObjectBase.java:938) >> >> at org.apache.xmlbeans.impl.store.Xobj.getUser(Xobj.java:1675) >> >> at org.apache.xmlbeans.impl.store.Cur.getUser(Cur.java:2659) >> >> at org.apache.xmlbeans.impl.store.Cur.getObject(Cur.java:2652) >> >> at >> org.apache.xmlbeans.impl.store.Cursor._getObject(Cursor.java:995) >> >> at >> 
org.apache.xmlbeans.impl.store.Cursor.getObject(Cursor.java:2904) >> >> at >> org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:162) >> >> at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:169) >> >> at >> org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:112) >> >> at >> org.apache.poi.xwpf.extractor.XWPFWordExtractor.<init>(XWPFWordExtractor.java:60) >> >> at >> org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:243) >> >> at >> org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:105) >> >> at >> org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:106) >> >> at >> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) >> >> at >> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) >> >> at >> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) >> >> at >> org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74) >> >> at >> org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:235) >> >> at >> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3226) >> >> at >> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3077) >> >> at >> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2708) >> >> at >> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756) >> >> at >> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583) >> >> at >> 
org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548) >> >> at >> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:939) >> >> at >> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) >> >> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 28024ms for sessionid 0x100000050ae004d >> >> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 28024ms for sessionid 0x100000050ae004d, closing socket >> connection and attempting reconnect >> >> [zkCallback-16-thread-2] WARN >> org.apache.solr.common.cloud.ConnectionManager - Watcher >> org.apache.solr.common.cloud.ConnectionManager@5382340 name: >> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent >> state:Disconnected type:None path:null path: null type: None >> >> [zkCallback-16-thread-2] WARN >> org.apache.solr.common.cloud.ConnectionManager - zkClient has disconnected >> >> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> kemp-formation-solr.citya.local/192.168.37.107:2181. 
Will not attempt to >> authenticate using SASL (unknown error) >> >> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Socket connection established to >> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session >> >> agents process ran out of memory - shutting down >> >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> >> at >> org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:737) >> >> at >> org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:784) >> >> at >> org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457) >> >> at >> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146) >> >> at >> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204) >> >> at >> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837) >> >> at >> org.apache.manifoldcf.crawler.jobs.JobManager.getJobsReadyForInactivity(JobManager.java:8024) >> >> at >> org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:76) >> >> agents process ran out of memory - shutting down >> >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> >> at >> org.postgresql.jdbc.PgConnection.prepareStatement(PgConnection.java:1200) >> >> at >> org.postgresql.jdbc.PgConnection.prepareStatement(PgConnection.java:1583) >> >> at >> org.postgresql.jdbc.PgConnection.prepareStatement(PgConnection.java:372) >> >> at >> org.apache.manifoldcf.core.database.Database.execute(Database.java:896) >> >> at >> org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:696) >> >> [Thread-35854-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Session establishment complete on server >> kemp-formation-solr.citya.local/192.168.37.107:2181, sessionid = >> 
0x100000050ae004d, negotiated timeout = 40000 >> >> [Thread-490] INFO org.eclipse.jetty.server.ServerConnector - Stopped >> ServerConnector@2a640157{HTTP/1.1}{0.0.0.0:8345} >> >> agents process ran out of memory - shutting down >> >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> >> at java.util.HashMap.resize(HashMap.java:704) >> >> at java.util.HashMap.putVal(HashMap.java:629) >> >> at java.util.HashMap.put(HashMap.java:612) >> >> at >> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:154) >> >> at >> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204) >> >> at >> org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837) >> >> at >> org.apache.manifoldcf.crawler.jobs.JobManager.processParentHashSet(JobManager.java:5642) >> >> at >> org.apache.manifoldcf.crawler.jobs.JobManager.calculateAffectedRestoreCarrydownChildren(JobManager.java:5581) >> >> at >> org.apache.manifoldcf.crawler.jobs.JobManager.finishDocuments(JobManager.java:5453) >> >> at >> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:570) >> >> agents process ran out of memory - shutting down >> >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> >> at java.util.Arrays.copyOf(Arrays.java:3308) >> >> at java.util.BitSet.ensureCapacity(BitSet.java:337) >> >> at java.util.BitSet.expandTo(BitSet.java:352) >> >> at java.util.BitSet.set(BitSet.java:447) >> >> at >> de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:267) >> >> at >> org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155) >> >> at >> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) >> >> at >> org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270) >> >> at >> 
org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) >> >> at >> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) >> >> at >> org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146) >> >> at >> org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46) >> >> at >> org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82) >> >> at >> org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140) >> >> at >> org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287) >> >> at >> org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279) >> >> at >> org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:306) >> >> at >> org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator$SheetTextAsHTML.cell(XSSFExcelExtractorDecorator.java:431) >> >> at >> org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.endElement(XSSFSheetXMLHandler.java:380) >> >> at >> org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator$XSSFSheetInterestingPartsCapturer.endElement(XSSFExcelExtractorDecorator.java:520) >> >> at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown >> Source) >> >> at >> org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown >> Source) >> >> at >> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown >> Source) >> >> at >> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown >> Source) >> >> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown >> Source) >> >> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown >> Source) >> >> at org.apache.xerces.parsers.XMLParser.parse(Unknown Source) >> >> at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown >> Source) >> >> at >> 
org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source) >> >> at >> org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.processSheet(XSSFExcelExtractorDecorator.java:344) >> >> at >> org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.buildXHTML(XSSFExcelExtractorDecorator.java:167) >> >> at >> org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor.getXHTML(AbstractOOXMLExtractor.java:135) >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0x100000050ae004e closed >> >> [Thread-257943-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0x100000050ae004e >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0x100000050ae004d closed >> >> [Thread-35854-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0x100000050ae004d >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0x2000000b80d004a closed >> >> [Thread-8765-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0x2000000b80d004a >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0x2000000b80d004b closed >> >> [Thread-35853-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0x2000000b80d004b >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0xff00000201970046 closed >> >> [Thread-6991-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0xff00000201970046 >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0x100000050ae004c closed >> >> [Thread-8699-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0x100000050ae004c >> >> [Thread-490] INFO org.eclipse.jetty.server.handler.ContextHandler - >> Stopped >> 
o.e.j.w.WebAppContext@44d52de2{/mcf-api-service,file:/tmp/jetty-0.0.0.0-8345-mcf-api-service.war-_mcf-api-service-any-559052738855414857.dir/webapp/,UNAVAILABLE}{/opt/manifoldcf-trunk/bin/./../web-proprietary/war/mcf-api-service.war} >> >> [Thread-490] INFO org.eclipse.jetty.server.handler.ContextHandler - >> Stopped >> o.e.j.w.WebAppContext@60410cd{/mcf-authority-service,file:/tmp/jetty-0.0.0.0-8345-mcf-authority-service.war-_mcf-authority-service-any-927770358411352606.dir/webapp/,UNAVAILABLE}{/opt/manifoldcf-trunk/bin/./../web-proprietary/war/mcf-authority-service.war} >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0x2000000b80d004c closed >> >> [Thread-262666-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0x2000000b80d004c >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0xff00000201970048 closed >> >> [Thread-244171-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0xff00000201970048 >> >> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: >> 0xff00000201970049 closed >> >> [Thread-269948-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0xff00000201970049 >> >> >> >> I have deactivated history to improve performance. So, can I find the last >> file with an SQL query? >> >> >> >> Maxence, >> >> >> >> *From:* Karl Wright [mailto:daddy...@gmail.com] >> *Sent:* Tuesday, July 24, 2018 16:04 >> *To:* user@manifoldcf.apache.org >> *Subject:* Re: Out of memory, one file bug i think >> >> >> >> Hi Maxence, >> >> >> >> You would want to turn on connector debugging INSTEAD of the debugging >> you've turned on, which is very noisy and not helpful. 
>> >> >> >> In global properties: org.apache.manifoldcf.connectors value DEBUG >> >> >> >> Karl >> >> >> >> >> >> On Tue, Jul 24, 2018 at 9:12 AM msaunier <msaun...@citya.com> wrote: >> >> With debug: >> >> >> >> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 28034ms for sessionid 0x100000050ae0049 >> >> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 28034ms for sessionid 0x100000050ae0049, closing socket >> connection and attempting reconnect >> >> [Thread-31532-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 27708ms for sessionid 0xff00000201970044 >> >> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 27737ms for sessionid 0xff00000201970043 >> >> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 27737ms for sessionid 0xff00000201970043, closing socket >> connection and attempting reconnect >> >> [Thread-31551-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 28316ms for sessionid 0x100000050ae004b >> >> [Thread-7602-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 28394ms for sessionid 0x2000000b80d0047 >> >> [Thread-7602-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 28394ms for sessionid 
0x2000000b80d0047, closing socket >> connection and attempting reconnect >> >> [Thread-31532-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 27708ms for sessionid 0xff00000201970044, closing socket >> connection and attempting reconnect >> >> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to >> authenticate using SASL (unknown error) >> >> agents process ran out of memory - shutting down >> >> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Socket connection established to >> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session >> >> [Thread-7538-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 36805ms for sessionid 0x2000000b80d0046 >> >> [Thread-7538-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 36805ms for sessionid 0x2000000b80d0046, closing socket >> connection and attempting reconnect >> >> java.lang.OutOfMemoryError: GC overhead limit exceeded >> >> at java.lang.StringBuilder.toString(StringBuilder.java:407) >> >> at >> org.apache.manifoldcf.core.cachemanager.CacheManager.readSharedData(CacheManager.java:849) >> >> at >> org.apache.manifoldcf.core.cachemanager.CacheManager.hasExpired(CacheManager.java:483) >> >> at >> org.apache.manifoldcf.core.cachemanager.CacheManager.lookupObject(CacheManager.java:454) >> >> at >> org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:131) >> >> at >> org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204) >> >> at >> 
org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:862) >> >> at >> org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:236) >> >> at >> org.apache.manifoldcf.crawler.jobs.Jobs.deletingJobsPresent(Jobs.java:3133) >> >> at >> org.apache.manifoldcf.crawler.jobs.JobManager.getNextDeletableDocuments(JobManager.java:1862) >> >> at >> org.apache.manifoldcf.crawler.system.DocumentDeleteStufferThread.run(DocumentDeleteStufferThread.java:108) >> >> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to >> authenticate using SASL (unknown error) >> >> agents process ran out of memory - shutting down >> >> [Thread-7574-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 27763ms for sessionid 0x100000050ae004a >> >> [Thread-7574-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 27763ms for sessionid 0x100000050ae004a, closing socket >> connection and attempting reconnect >> >> [zkCallback-3-thread-7] WARN >> org.apache.solr.common.cloud.ConnectionManager - Watcher >> org.apache.solr.common.cloud.ConnectionManager@7a5c701e name: >> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent >> state:Disconnected type:None path:null path: null type: None >> >> [zkCallback-3-thread-7] WARN >> org.apache.solr.common.cloud.ConnectionManager - zkClient has disconnected >> >> [Thread-31551-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 28316ms for sessionid 0x100000050ae004b, closing socket >> connection and attempting reconnect >> >> 
java.lang.OutOfMemoryError: GC overhead limit exceeded >> >> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Socket connection established to >> kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session >> >> [zkCallback-11-thread-5] WARN >> org.apache.solr.common.cloud.ConnectionManager - Watcher >> org.apache.solr.common.cloud.ConnectionManager@53181a58 name: >> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent >> state:Disconnected type:None path:null path: null type: None >> >> [zkCallback-11-thread-5] WARN >> org.apache.solr.common.cloud.ConnectionManager - zkClient has disconnected >> >> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, >> session 0xff00000201970043 has expired >> >> [Thread-7573-SendThread(kemp-formation-solr.citya.local:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, >> session 0xff00000201970043 has expired, closing socket connection >> >> [Thread-7573-EventThread] INFO org.apache.zookeeper.ClientCnxn - >> EventThread shut down for session: 0xff00000201970043 >> >> [zkCallback-11-thread-2] WARN >> org.apache.solr.common.cloud.ConnectionManager - Watcher >> org.apache.solr.common.cloud.ConnectionManager@53181a58 name: >> ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent >> state:Expired type:None path:null path: null type: None >> >> [zkCallback-11-thread-2] WARN >> org.apache.solr.common.cloud.ConnectionManager - Our previous ZooKeeper >> session was expired. Attempting to reconnect to recover relationship with >> ZooKeeper... 
>> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] WARN org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x100000050ae0049 has expired
>> [Thread-5234-SendThread(kemp-formation-solr.citya.local:2181)] INFO org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x100000050ae0049 has expired, closing socket connection
>> [zkCallback-11-thread-2] WARN org.apache.solr.common.cloud.DefaultConnectionStrategy - Connection expired - starting a new one...
>> [zkCallback-11-thread-2] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=kemp-formation-solr:2181 sessionTimeout=60000 watcher=org.apache.solr.common.cloud.ConnectionManager@53181a58
>> [Thread-5234-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x100000050ae0049
>> [zkCallback-3-thread-4] WARN org.apache.solr.common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager@7a5c701e name: ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent state:Expired type:None path:null path: null type: None
>> [zkCallback-3-thread-4] WARN org.apache.solr.common.cloud.ConnectionManager - Our previous ZooKeeper session was expired. Attempting to reconnect to recover relationship with ZooKeeper...
>> [zkCallback-3-thread-4] WARN org.apache.solr.common.cloud.DefaultConnectionStrategy - Connection expired - starting a new one...
>> [zkCallback-3-thread-4] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=kemp-formation-solr:2181 sessionTimeout=60000 watcher=org.apache.solr.common.cloud.ConnectionManager@7a5c701e
>> [zkCallback-3-thread-4-SendThread(kemp-formation-solr.citya.local:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to authenticate using SASL (unknown error)
>> [zkCallback-11-thread-2-SendThread(kemp-formation-solr.citya.local:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server kemp-formation-solr.citya.local/192.168.37.107:2181. Will not attempt to authenticate using SASL (unknown error)
>> [zkCallback-3-thread-4-SendThread(kemp-formation-solr.citya.local:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session
>> [zkCallback-11-thread-2-SendThread(kemp-formation-solr.citya.local:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to kemp-formation-solr.citya.local/192.168.37.107:2181, initiating session
>> [Thread-490] INFO org.eclipse.jetty.server.ServerConnector - Stopped ServerConnector@2a640157{HTTP/1.1}{0.0.0.0:8345}
>> [zkCallback-3-thread-4-SendThread(kemp-formation-solr.citya.local:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server kemp-formation-solr.citya.local/192.168.37.107:2181, sessionid = 0x2000000b80d0049, negotiated timeout = 40000
>> [zkCallback-11-thread-2-SendThread(kemp-formation-solr.citya.local:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server kemp-formation-solr.citya.local/192.168.37.107:2181, sessionid = 0xff00000201970045, negotiated timeout = 40000
>> agents process ran out of memory - shutting down
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>> agents process ran out of memory - shutting down
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>     at java.util.HashMap.newNode(HashMap.java:1747)
>>     at java.util.HashMap.putVal(HashMap.java:631)
>>     at java.util.HashMap.put(HashMap.java:612)
>>     at jcifs.util.transport.Transport.sendrecv(Transport.java:66)
>>     at jcifs.smb.SmbTransport.send(SmbTransport.java:661)
>>     at jcifs.smb.SmbSession.send(SmbSession.java:238)
>>     at jcifs.smb.SmbTree.send(SmbTree.java:119)
>>     at jcifs.smb.SmbFile.send(SmbFile.java:776)
>>     at jcifs.smb.SmbFileInputStream.readDirect(SmbFileInputStream.java:181)
>>     at jcifs.smb.SmbFileInputStream.read(SmbFileInputStream.java:142)
>>     at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:903)
>>     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
>> [zkCallback-11-thread-2] INFO org.apache.solr.common.cloud.ConnectionManager - Connection with ZooKeeper reestablished.
>> [zkCallback-3-thread-4] INFO org.apache.solr.common.cloud.ConnectionManager - Connection with ZooKeeper reestablished.
>> agents process ran out of memory - shutting down
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>> [zkCallback-11-thread-2] INFO org.apache.solr.common.cloud.DefaultConnectionStrategy - Reconnected to ZooKeeper
>> [zkCallback-11-thread-2] INFO org.apache.solr.common.cloud.ConnectionManager - Connected:true
>> [zkCallback-3-thread-4] INFO org.apache.solr.common.cloud.DefaultConnectionStrategy - Reconnected to ZooKeeper
>> [zkCallback-3-thread-4] INFO org.apache.solr.common.cloud.ConnectionManager - Connected:true
>> [Thread-490] INFO org.apache.zookeeper.ZooKeeper - Session: 0x2000000b80d0046 closed
>> [zkCallback-21-thread-2] WARN org.apache.solr.common.cloud.ConnectionManager - Watcher org.apache.solr.common.cloud.ConnectionManager@381a7557 name: ZooKeeperConnection Watcher:kemp-formation-solr:2181 got event WatchedEvent state:Disconnected type:None path:null path: null type: None
>> [zkCallback-21-thread-2] WARN org.apache.solr.common.cloud.ConnectionManager - zkClient has disconnected
>> [Thread-7538-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 0x2000000b80d0046
>> agents process ran out of memory - shutting down
>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>     at java.util.regex.Matcher.<init>(Matcher.java:225)
>>     at java.util.regex.Pattern.matcher(Pattern.java:1093)
>>     at de.l3s.boilerpipe.util.UnicodeTokenizer.tokenize(UnicodeTokenizer.java:40)
>>     at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.flushBlock(BoilerpipeHTMLContentHandler.java:296)
>>     at de.l3s.boilerpipe.sax.BoilerpipeHTMLContentHandler.characters(BoilerpipeHTMLContentHandler.java:198)
>>     at org.apache.tika.parser.html.BoilerpipeContentHandler.characters(BoilerpipeContentHandler.java:155)
>>     at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>     at org.apache.tika.sax.SecureContentHandler.characters(SecureContentHandler.java:270)
>>     at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>     at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>     at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>     at org.apache.tika.sax.SafeContentHandler.access$001(SafeContentHandler.java:46)
>>     at org.apache.tika.sax.SafeContentHandler$1.write(SafeContentHandler.java:82)
>>     at org.apache.tika.sax.SafeContentHandler.filter(SafeContentHandler.java:140)
>>     at org.apache.tika.sax.SafeContentHandler.characters(SafeContentHandler.java:287)
>>     at org.apache.tika.sax.XHTMLContentHandler.characters(XHTMLContentHandler.java:279)
>>     at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>     at org.apache.tika.sax.xpath.MatchingContentHandler.characters(MatchingContentHandler.java:85)
>>     at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)
>>     at org.apache.tika.sax.ContentHandlerDecorator.characters(ContentHandlerDecorator.java:146)