The thread dump is not helpful. I see no traces for any threads that identify code from MCF at all. That's new and has nothing to do with MCF; it's probably a new Java "feature".
The OutOfMemoryError should, I've confirmed, be rethrown. The Worker Thread
code does in fact catch the error and terminate the process:

    catch (OutOfMemoryError e)
    {
      System.err.println("agents process ran out of memory - shutting down");
      e.printStackTrace(System.err);
      System.exit(-200);
    }

So that is what you should see happening. Are you providing the complete
stack trace from your out-of-memory error? That would help confirm the
picture. It had to have been printed from somewhere; it goes to standard
error, wherever that is directed in your setup.

Karl

On Tue, May 7, 2019 at 6:42 AM Karl Wright <daddy...@gmail.com> wrote:

> Hi Olivier,
>
> Any out-of-memory error that makes it to the top level should cause the
> agents process to shut itself down.
>
> The reason for this is simple: in a multithreaded environment, with lots
> of third-party jars, an out-of-memory condition in one thread is usually
> indicative of an out-of-memory condition in other threads as well, and we
> cannot count on that third-party software being rigorous about letting an
> OutOfMemoryError be thrown to the top level. So we tend to get corrupted
> results that are hard to recover from.
>
> In this case, the exception is being thrown within the JDBCConnector, so
> it ought to rethrow the OutOfMemoryError. Let me confirm that is what the
> code indicates.
>
> Karl
>
>
> On Tue, May 7, 2019 at 6:14 AM Olivier Tavard <olivier.tav...@francelabs.com> wrote:
>
>> Hi Karl,
>>
>> Thank you for your help.
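The catch-and-exit above only protects threads whose run() methods MCF controls. For threads spawned by third-party code, a JVM-wide fallback can be sketched with Thread.setDefaultUncaughtExceptionHandler. This is a hypothetical illustration, not MCF's actual code: the error is thrown directly rather than by exhausting the heap, and the real exit call is left as a comment so the demo can run to completion.

```java
public class OomGuard {
    static volatile boolean sawOom = false;

    public static void main(String[] args) throws InterruptedException {
        // Last-resort handler for throwables escaping any thread's run()
        // method, including threads created by third-party libraries.
        Thread.setDefaultUncaughtExceptionHandler((t, e) -> {
            if (e instanceof OutOfMemoryError) {
                System.err.println("thread " + t.getName()
                        + " ran out of memory - shutting down");
                sawOom = true;
                // A real agents process would call System.exit(-200) here.
            }
        });

        // Simulate an OOM escaping a worker thread; we construct the error
        // directly rather than actually exhausting the heap.
        Thread worker = new Thread(
                () -> { throw new OutOfMemoryError("simulated Java heap space"); },
                "simulated-worker");
        worker.start();
        worker.join();
        System.out.println("sawOom=" + sawOom);  // prints sawOom=true
    }
}
```

Because the handler is process-wide, it fires even for errors that no catch block in MCF's own code ever sees, which is exactly the gap Karl describes with third-party jars.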
>> Indeed I found an OOM error in the logs of the MCF agents process:
>>
>>     java.lang.OutOfMemoryError: Java heap space
>>         at com.mysql.jdbc.Buffer.<init>(Buffer.java:58)
>>         at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1441)
>>         at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:2816)
>>         at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:467)
>>         at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:2510)
>>         at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1746)
>>         at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2135)
>>         at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2542)
>>         at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1734)
>>         at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1885)
>>         at org.apache.manifoldcf.jdbc.JDBCConnection$PreparedStatementQueryThread.run(JDBCConnection.java:1392)
>>
>> But the MCF agents process was not killed; it is still active. Why is the
>> process still active in this case? Since we know that we can get an OOM,
>> couldn't we find an elegant way to notify users of this problem from the
>> MCF admin UI, rather than just having the process endlessly active?
>>
>> As for the OOM, the workaround we decided to use is to increase the Xms
>> and Xmx Java memory values in options.env.unix.
>> I also attached the thread dump to this mail.
>>
>> Thanks,
>> Best regards,
>>
>> Olivier
>>
>> On May 6, 2019, at 12:55, Karl Wright <daddy...@gmail.com> wrote:
>>
>> It sounds like there might be an out-of-memory situation, although if
>> that is the case I would expect that the MCF agents process would just
>> shut itself down.
>>
>> If the process is hung, it should be easy to get a thread dump. That
>> would be the first step.
>>
>> Some thoughts as to what might be happening: there might be an
>> out-of-memory condition that is being silently eaten somewhere.
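The stack trace above points at com.mysql.jdbc.MysqlIO.readSingleRowSet: by default, MySQL Connector/J reads the entire result set into memory before returning it to the caller. One common mitigation, assuming driver properties can be passed through the JDBC connection URL configured in MCF (the host and database name below are illustrative), is to enable cursor-based fetching so rows are retrieved in batches:

```
jdbc:mysql://dbhost:3306/mcftest?useCursorFetch=true&defaultFetchSize=10
```

Alternatively, Connector/J streams rows one at a time when a statement is created with TYPE_FORWARD_ONLY/CONCUR_READ_ONLY and setFetchSize(Integer.MIN_VALUE). Note that neither setting helps when a single row holds a 192 MB LONGBLOB that the driver materializes as one buffer; that still requires heap headroom for the full value, which is consistent with the Buffer.&lt;init&gt; frame at the top of the trace.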
>> MCF relies on its connectors to use streams rather than load documents
>> into memory, BUT we're at the mercy of the JDBC drivers and Tika. So
>> another experiment would be to try to crawl your documents without the
>> Tika extractor (just send them to the null output connector) and see if
>> that succeeds. If it does, then maybe try the external Tika Service
>> connector instead and see what happens then.
>>
>> Karl
>>
>>
>> On Mon, May 6, 2019 at 5:56 AM Olivier Tavard <olivier.tav...@francelabs.com> wrote:
>>
>>> Hi MCF community,
>>>
>>> We have some issues with the JDBC connector when indexing a database
>>> with large LONGBLOBs in it, meaning files of more than 100 MB.
>>> To reproduce the issue, I created a simple database on MySQL 5.7 with
>>> only one table in it, with 2 columns: id (int) and data (longblob).
>>> There are 8 records in the table: two records with very small files
>>> (less than 5 KB) and 6 records with a large file of 192 MB.
>>>
>>> If I include only one big BLOB in my crawl, the crawl is successful.
>>> Seeding query: SELECT id AS $(IDCOLUMN) FROM data WHERE id = 6
>>>
>>> But if I include two or more of them (with a WHERE ... IN condition, or
>>> if I select all the records of the table), the job stays in the running
>>> state but no document is processed.
>>> Seeding query: SELECT id AS $(IDCOLUMN) FROM data WHERE id IN (6,7)
>>>
>>> There is no message in the logs (even with debug mode activated).
>>> The tests were done in a Docker cluster on dedicated servers (4 cores,
>>> 32 GB RAM each).
>>> Regarding MCF, the version is 2.12. The MCF agent has 1 GB of RAM (Xms
>>> and Xmx), the internal database is PostgreSQL 10.1, and we use
>>> Zookeeper-based synchronization.
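Karl's point about streams is the crux here: memory use should be bounded by a fixed buffer, not by document size. With JDBC that means reading BLOB columns via ResultSet.getBinaryStream(...) rather than getBytes(...), which would pull the whole 192 MB value onto the heap at once. A minimal, self-contained sketch of bounded-memory copying (the names are illustrative, not MCF code; a ByteArrayInputStream stands in for the BLOB stream):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {
    // Copy a document in fixed-size chunks: peak memory is bounded by the
    // 64 KB buffer regardless of how large the document is.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[65536];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = new byte[1_000_000];  // stand-in for a large BLOB value
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(doc), sink);
        System.out.println("copied " + copied + " bytes");  // prints "copied 1000000 bytes"
    }
}
```

When a driver or a downstream stage (such as an extractor) buffers the whole value internally, this discipline in the connector cannot help, which is why the experiment of bypassing the in-process extractor is a useful way to isolate where the memory is going.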
>>> We have a dump of the database here:
>>> https://www.datafari.com/files/mcftest.tar.gz
>>>
>>> In other cases, we also encounter this error:
>>> Error: Unexpected jobqueue status - record id 1555190422690, expecting
>>> active status, saw 0
>>>
>>> Any idea what the problem could be? Or at least why, in your view,
>>> there is nothing in the debug log?
>>>
>>> Thank you,
>>> Best regards,
>>>
>>> Olivier
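Beyond raising Xms/Xmx in options.env.unix as Olivier describes, two standard HotSpot flags can make this failure mode visible instead of silent: -XX:+ExitOnOutOfMemoryError (available since JDK 8u92) forces the JVM to exit on the first OutOfMemoryError no matter where it is swallowed, and -XX:+HeapDumpOnOutOfMemoryError records a heap dump for later analysis. A sketch of the relevant lines (the values and the dump path are illustrative, and the exact layout of options.env.unix may differ):

```
-Xms2048m
-Xmx2048m
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/var/log/manifoldcf
```

With the exit flag in place, a supervisor (systemd, Docker restart policy) can restart the agents process cleanly rather than leaving it hung in the state described in this thread.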