The limit is applied in the method that calls
noteTransformationConnectionRegistration. Here it is:

>>>>>>
  /** Note the registration of a transformation connector used by the specified connections.
  * This method will be called when a connector is registered, on which the specified
  * connections depend.
  *@param connectionNames is the set of connection names.
  */
  @Override
  public void noteTransformationConnectorRegistration(String[] connectionNames)
    throws ManifoldCFException
  {
    // For each connection, find the corresponding list of jobs. From these jobs, we want the job id and the status.
    List<String> list = new ArrayList<String>();
    int maxCount = database.findConjunctionClauseMax(new ClauseDescription[]{});
    int currentCount = 0;
    int i = 0;
    while (i < connectionNames.length)
    {
      if (currentCount == maxCount)
      {
        noteTransformationConnectionRegistration(list);
        list.clear();
        currentCount = 0;
      }
      list.add(connectionNames[i++]);
      currentCount++;
    }
    if (currentCount > 0)
      noteTransformationConnectionRegistration(list);
  }
<<<<<<

It looks correct now. Do you see an issue with it?

Karl

On Mon, Jul 30, 2018 at 3:28 PM Mike Hugo <m...@piragua.com> wrote:

> Nice catch, Karl!
>
> I applied that patch, but I'm still getting the same error.
>
> I think the problem is in
> JobManager.noteTransformationConnectionRegistration
>
> If jobs.findJobsMatchingTransformations(list); returns a large list of
> ids (like it is doing in our case - 39,941 ids), the generated query
> string still has a large OR clause in it. I don't see getMaxOrClause
> applied to the query being built inside
> noteTransformationConnectionRegistration
>
> >>>>>>
>   protected void noteTransformationConnectionRegistration(List<String> list)
>     throws ManifoldCFException
>   {
>     // Query for the matching jobs, and then for each job potentially adjust the state
>     Long[] jobIDs = jobs.findJobsMatchingTransformations(list);
>     if (jobIDs.length == 0)
>       return;
>
>     StringBuilder query = new StringBuilder();
>     ArrayList newList = new ArrayList();
>
>     query.append("SELECT ").append(jobs.idField).append(",").append(jobs.statusField)
>       .append(" FROM ").append(jobs.getTableName()).append(" WHERE ")
>       .append(database.buildConjunctionClause(newList,new ClauseDescription[]{
>         new MultiClause(jobs.idField,jobIDs)}))
>       .append(" FOR UPDATE");
>     IResultSet set = database.performQuery(query.toString(),newList,null,null);
>     int i = 0;
>     while (i < set.getRowCount())
>     {
>       IResultRow row = set.getRow(i++);
>       Long jobID = (Long)row.getValue(jobs.idField);
>       int statusValue = jobs.stringToStatus((String)row.getValue(jobs.statusField));
>       jobs.noteTransformationConnectorRegistration(jobID,statusValue);
>     }
>   }
> <<<<<<
>
>
> On Mon, Jul 30, 2018 at 1:55 PM, Karl Wright <daddy...@gmail.com> wrote:
>
>> The Postgresql driver supposedly limits this to 25 clauses at a pop:
>>
>> >>>>>>
>>   @Override
>>   public int getMaxOrClause()
>>   {
>>     return 25;
>>   }
>>
>>   /* Calculate the number of values a particular clause can have, given the values for all the other clauses.
>>   * For example, if in the expression x AND y AND z, x has 2 values and z has 1, find out how many values x can legally have
>>   * when using the buildConjunctionClause() method below.
>>   */
>>   @Override
>>   public int findConjunctionClauseMax(ClauseDescription[] otherClauseDescriptions)
>>   {
>>     // This implementation uses "OR"
>>     return getMaxOrClause();
>>   }
>> <<<<<<
>>
>> The problem is that there was a cut-and-paste error, with just
>> transformation connections, that defeated the limit. I'll create a
>> ticket and attach a patch. CONNECTORS-1520.
>>
>> Karl
>>
>>
>> On Mon, Jul 30, 2018 at 2:29 PM Karl Wright <daddy...@gmail.com> wrote:
>>
>>> Hi Mike,
>>>
>>> This might be the issue indeed. I'll look into it.
>>>
>>> Karl
>>>
>>>
>>> On Mon, Jul 30, 2018 at 2:26 PM Mike Hugo <m...@piragua.com> wrote:
>>>
>>>> I'm not sure what the solution is yet, but I think I may have found
>>>> the culprit:
>>>>
>>>> JobManager.noteTransformationConnectionRegistration(List<String> list)
>>>> is creating a pretty big query:
>>>>
>>>> SELECT id,status FROM jobs WHERE (id=? OR id=? OR id=? OR id=?
>>>> ........ OR id=?) FOR UPDATE
>>>>
>>>> Replace the ellipsis with a list of 39,941 ids (it's a huge query
>>>> when it prints out).
>>>>
>>>> It seems that the database doesn't like that query and closes the
>>>> connection before returning with a response.
>>>>
>>>> As I mentioned, this instance of Manifold has nearly 40,000 web
>>>> crawlers. Is that a high number for Manifold to handle?
>>>>
>>>> On Mon, Jul 30, 2018 at 10:58 AM, Karl Wright <daddy...@gmail.com>
>>>> wrote:
>>>>
>>>>> Well, I have absolutely no idea what is wrong and I've never seen
>>>>> anything like that before. But postgres is complaining because the
>>>>> communication with the JDBC client is being interrupted by something.
>>>>>
>>>>> Karl
>>>>>
>>>>>
>>>>> On Mon, Jul 30, 2018 at 10:39 AM Mike Hugo <m...@piragua.com> wrote:
>>>>>
>>>>>> No, and manifold and postgres run on the same host.
>>>>>>
>>>>>> On Mon, Jul 30, 2018 at 9:35 AM, Karl Wright <daddy...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> ' LOG: incomplete message from client'
>>>>>>>
>>>>>>> This shows a network issue. Did your network configuration change
>>>>>>> recently?
>>>>>>>
>>>>>>> Karl
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 30, 2018 at 9:59 AM Mike Hugo <m...@piragua.com> wrote:
>>>>>>>
>>>>>>>> Tried a postgres vacuum and also a restart, but the problem
>>>>>>>> persists. Here's the log again with some additional logging
>>>>>>>> details added (below)
>>>>>>>>
>>>>>>>> I tried running the last query from the logs against the database
>>>>>>>> and it works fine - I modified it to return a count and that also
>>>>>>>> works.
>>>>>>>>
>>>>>>>> SELECT count(*) FROM jobs t1 WHERE EXISTS(SELECT 'x' FROM
>>>>>>>> jobpipelines WHERE t1.id=ownerid AND transformationname='Tika');
>>>>>>>>  count
>>>>>>>> -------
>>>>>>>>  39941
>>>>>>>> (1 row)
>>>>>>>>
>>>>>>>>
>>>>>>>> Is 39k jobs a high number? I've run some other instances of
>>>>>>>> Manifold with more like 1,000 jobs and those seem to be working
>>>>>>>> fine. That's the only thing I can think of that's different
>>>>>>>> between this instance that won't start and the others. Any ideas?
>>>>>>>>
>>>>>>>> Thanks for your help!
>>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>>>>>> LOG: duration: 0.079 ms parse <unnamed>: SELECT connectionname FROM transformationconnections WHERE classname=$1
>>>>>>>> LOG: duration: 0.079 ms bind <unnamed>: SELECT connectionname FROM transformationconnections WHERE classname=$1
>>>>>>>> DETAIL: parameters: $1 = 'org.apache.manifoldcf.agents.transformation.tika.TikaExtractor'
>>>>>>>> LOG: duration: 0.017 ms execute <unnamed>: SELECT connectionname FROM transformationconnections WHERE classname=$1
>>>>>>>> DETAIL: parameters: $1 = 'org.apache.manifoldcf.agents.transformation.tika.TikaExtractor'
>>>>>>>> LOG: duration: 0.039 ms parse <unnamed>: SELECT * FROM agents
>>>>>>>> LOG: duration: 0.040 ms bind <unnamed>: SELECT * FROM agents
>>>>>>>> LOG: duration: 0.010 ms execute <unnamed>: SELECT * FROM agents
>>>>>>>> LOG: duration: 0.084 ms parse <unnamed>: SELECT id FROM jobs t1 WHERE EXISTS(SELECT 'x' FROM jobpipelines WHERE t1.id=ownerid AND transformationname=$1)
>>>>>>>> LOG: duration: 0.359 ms bind <unnamed>: SELECT id FROM jobs t1 WHERE EXISTS(SELECT 'x' FROM jobpipelines WHERE t1.id=ownerid AND transformationname=$1)
>>>>>>>> DETAIL: parameters: $1 = 'Tika'
>>>>>>>> LOG: duration: 77.622 ms execute <unnamed>: SELECT id FROM jobs t1 WHERE EXISTS(SELECT 'x' FROM jobpipelines WHERE t1.id=ownerid AND transformationname=$1)
>>>>>>>> DETAIL: parameters: $1 = 'Tika'
>>>>>>>> LOG: incomplete message from client
>>>>>>>> LOG: disconnection: session time: 0:00:06.574 user=REMOVED database=REMOVED host=127.0.0.1 port=45356
>>>>>>>> >2018-07-30 12:36:09,415 [main] ERROR org.apache.manifoldcf.root - Exception: This connection has been closed.
>>>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: This connection has been closed.
>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:627) ~[mcf-core.jar:?]
>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.rollbackCurrentTransaction(DBInterfacePostgreSQL.java:1296) ~[mcf-core.jar:?]
>>>>>>>>     at org.apache.manifoldcf.core.database.Database.endTransaction(Database.java:368) ~[mcf-core.jar:?]
>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.endTransaction(DBInterfacePostgreSQL.java:1236) ~[mcf-core.jar:?]
>>>>>>>>     at org.apache.manifoldcf.crawler.system.ManifoldCF.registerConnectors(ManifoldCF.java:605) ~[mcf-pull-agent.jar:?]
>>>>>>>>     at org.apache.manifoldcf.crawler.system.ManifoldCF.reregisterAllConnectors(ManifoldCF.java:160) ~[mcf-pull-agent.jar:?]
>>>>>>>>     at org.apache.manifoldcf.jettyrunner.ManifoldCFJettyRunner.main(ManifoldCFJettyRunner.java:239) [mcf-jetty-runner.jar:?]
>>>>>>>> Caused by: org.postgresql.util.PSQLException: This connection has been closed.
>>>>>>>>     at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:766) ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>     at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:1576) ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>     at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:367) ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>     at org.apache.manifoldcf.core.database.Database.execute(Database.java:873) ~[mcf-core.jar:?]
>>>>>>>>     at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:696) ~[mcf-core.jar:?]
>>>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: This connection has been closed.
>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:627)
>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.rollbackCurrentTransaction(DBInterfacePostgreSQL.java:1296)
>>>>>>>>     at org.apache.manifoldcf.core.database.Database.endTransaction(Database.java:368)
>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.endTransaction(DBInterfacePostgreSQL.java:1236)
>>>>>>>>     at org.apache.manifoldcf.crawler.system.ManifoldCF.registerConnectors(ManifoldCF.java:605)
>>>>>>>>     at org.apache.manifoldcf.crawler.system.ManifoldCF.reregisterAllConnectors(ManifoldCF.java:160)
>>>>>>>>     at org.apache.manifoldcf.jettyrunner.ManifoldCFJettyRunner.main(ManifoldCFJettyRunner.java:239)
>>>>>>>> Caused by: org.postgresql.util.PSQLException: This connection has been closed.
>>>>>>>>     at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:766)
>>>>>>>>     at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:1576)
>>>>>>>>     at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:367)
>>>>>>>>     at org.apache.manifoldcf.core.database.Database.execute(Database.java:873)
>>>>>>>>     at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:696)
>>>>>>>> LOG: disconnection: session time: 0:00:10.677 user=postgres database=template1 host=127.0.0.1 port=45354
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Jul 29, 2018 at 8:09 AM, Karl Wright <daddy...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> It looks to me like your database server is not happy. Maybe it's
>>>>>>>>> out of resources? Not sure but a restart may be in order.
>>>>>>>>>
>>>>>>>>> Karl
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Jul 29, 2018 at 9:06 AM Mike Hugo <m...@piragua.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Recently we started seeing this error when Manifold CF starts
>>>>>>>>>> up. We had been running Manifold CF with many web connectors
>>>>>>>>>> and a few RSS feeds for a while and it had been working fine.
>>>>>>>>>> The server got rebooted and since then we started seeing this
>>>>>>>>>> error. I'm not sure exactly what changed. Any ideas as to where
>>>>>>>>>> to start looking and how to fix this?
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Mike
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Initial repository connections already created.
>>>>>>>>>> Configuration file successfully read
>>>>>>>>>> Successfully unregistered all domains
>>>>>>>>>> Successfully unregistered all output connectors
>>>>>>>>>> Successfully unregistered all transformation connectors
>>>>>>>>>> Successfully unregistered all mapping connectors
>>>>>>>>>> Successfully unregistered all authority connectors
>>>>>>>>>> Successfully unregistered all repository connectors
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.solr.SolrConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.searchblox.SearchBloxConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.nullconnector.NullConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.kafka.KafkaOutputConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.hdfs.HDFSOutputConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.gts.GTSConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.filesystem.FileOutputConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.elasticsearch.ElasticSearchConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered output connector 'org.apache.manifoldcf.agents.output.amazoncloudsearch.AmazonCloudSearchConnector'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> WARNING: there is no transaction in progress
>>>>>>>>>> Successfully registered transformation connector 'org.apache.manifoldcf.agents.transformation.tikaservice.TikaExtractor'
>>>>>>>>>> WARNING: there is already a transaction in progress
>>>>>>>>>> LOG: incomplete message from client
>>>>>>>>>> >2018-07-29 13:02:06,659 [main] ERROR org.apache.manifoldcf.root - Exception: This connection has been closed.
>>>>>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: This connection has been closed.
>>>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:627) ~[mcf-core.jar:?]
>>>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.rollbackCurrentTransaction(DBInterfacePostgreSQL.java:1296) ~[mcf-core.jar:?]
>>>>>>>>>>     at org.apache.manifoldcf.core.database.Database.endTransaction(Database.java:368) ~[mcf-core.jar:?]
>>>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.endTransaction(DBInterfacePostgreSQL.java:1236) ~[mcf-core.jar:?]
>>>>>>>>>>     at org.apache.manifoldcf.crawler.system.ManifoldCF.registerConnectors(ManifoldCF.java:605) ~[mcf-pull-agent.jar:?]
>>>>>>>>>>     at org.apache.manifoldcf.crawler.system.ManifoldCF.reregisterAllConnectors(ManifoldCF.java:160) ~[mcf-pull-agent.jar:?]
>>>>>>>>>>     at org.apache.manifoldcf.jettyrunner.ManifoldCFJettyRunner.main(ManifoldCFJettyRunner.java:239) [mcf-jetty-runner.jar:?]
>>>>>>>>>> Caused by: org.postgresql.util.PSQLException: This connection has been closed.
>>>>>>>>>>     at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:766) ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>     at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:1576) ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>     at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:367) ~[postgresql-42.1.3.jar:42.1.3]
>>>>>>>>>>     at org.apache.manifoldcf.core.database.Database.execute(Database.java:873) ~[mcf-core.jar:?]
>>>>>>>>>>     at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:696) ~[mcf-core.jar:?]
>>>>>>>>>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: This connection has been closed.
>>>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.reinterpretException(DBInterfacePostgreSQL.java:627)
>>>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.rollbackCurrentTransaction(DBInterfacePostgreSQL.java:1296)
>>>>>>>>>>     at org.apache.manifoldcf.core.database.Database.endTransaction(Database.java:368)
>>>>>>>>>>     at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.endTransaction(DBInterfacePostgreSQL.java:1236)
>>>>>>>>>>     at org.apache.manifoldcf.crawler.system.ManifoldCF.registerConnectors(ManifoldCF.java:605)
>>>>>>>>>>     at org.apache.manifoldcf.crawler.system.ManifoldCF.reregisterAllConnectors(ManifoldCF.java:160)
>>>>>>>>>>     at org.apache.manifoldcf.jettyrunner.ManifoldCFJettyRunner.main(ManifoldCFJettyRunner.java:239)
>>>>>>>>>> Caused by: org.postgresql.util.PSQLException: This connection has been closed.
>>>>>>>>>>     at org.postgresql.jdbc.PgConnection.checkClosed(PgConnection.java:766)
>>>>>>>>>>     at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:1576)
>>>>>>>>>>     at org.postgresql.jdbc.PgConnection.createStatement(PgConnection.java:367)
>>>>>>>>>>     at org.apache.manifoldcf.core.database.Database.execute(Database.java:873)
>>>>>>>>>>     at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:696)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>
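
The thread comes down to keeping each generated OR clause under the driver's limit, including for the job ID list that findJobsMatchingTransformations returns. A minimal, standalone sketch of that idea follows. It is not ManifoldCF's code and may differ from whatever patch is attached to CONNECTORS-1520: the class OrClauseBatcher and its methods partition and buildOrClause are hypothetical names made up for illustration, and the limit of 25 is simply the getMaxOrClause() value quoted in the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of ManifoldCF.
public class OrClauseBatcher {

    /** Split ids into consecutive batches of at most maxPerBatch elements. */
    public static <T> List<List<T>> partition(List<T> ids, int maxPerBatch) {
        if (maxPerBatch <= 0) {
            throw new IllegalArgumentException("maxPerBatch must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, ids.size());
            // Copy the sublist so each batch is independent of the input list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    /** Build "(field=? OR field=? ...)" with one placeholder per id in a batch. */
    public static String buildOrClause(String field, int idCount) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < idCount; i++) {
            if (i > 0) {
                sb.append(" OR ");
            }
            sb.append(field).append("=?");
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        // Stand-in for the 39,941 job ids reported in the thread.
        List<Long> jobIDs = new ArrayList<>();
        for (long id = 1; id <= 39941; id++) {
            jobIDs.add(id);
        }
        int maxOrClause = 25; // assumed to match getMaxOrClause() as quoted

        List<List<Long>> batches = partition(jobIDs, maxOrClause);
        System.out.println(batches.size() + " small queries instead of one huge one");

        // Shape of the statement that would be issued once per batch.
        String sql = "SELECT id,status FROM jobs WHERE "
            + buildOrClause("id", batches.get(0).size()) + " FOR UPDATE";
        System.out.println(sql);
    }
}
```

Each batch would then be run as its own SELECT ... FOR UPDATE inside the same transaction, so no single statement carries more than 25 placeholders.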