[jira] [Resolved] (CONNECTORS-249) HSQLDB gets primary key constraint on multiple job cleanup
[ https://issues.apache.org/jira/browse/CONNECTORS-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-249.
Resolution: Fixed

HSQLDB gets primary key constraint on multiple job cleanup

Key: CONNECTORS-249
URL: https://issues.apache.org/jira/browse/CONNECTORS-249
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3

Create two jobs which overlap. Crawl both of them. Then delete one, and then the other. You will get stack traces like this:

ERROR 2011-09-02 07:38:26,485 (Delete startup thread) - Exception tossed: integrity constraint violation: unique constraint or index violation; SYS_PK_10121 table: JOBQUEUE
org.apache.manifoldcf.core.interfaces.ManifoldCFException: integrity constraint violation: unique constraint or index violation; SYS_PK_10121 table: JOBQUEUE
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.reinterpretException(DBInterfaceHSQLDB.java:587)
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performModification(DBInterfaceHSQLDB.java:607)
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performUpdate(DBInterfaceHSQLDB.java:242)
    at org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:88)
    at org.apache.manifoldcf.crawler.jobs.JobQueue.prepareDeleteScan(JobQueue.java:461)
    at org.apache.manifoldcf.crawler.jobs.JobManager.prepareDeleteScan(JobManager.java:4951)
    at org.apache.manifoldcf.crawler.system.StartDeleteThread.run(StartDeleteThread.java:107)
Caused by: java.sql.SQLException: integrity constraint violation: unique constraint or index violation; SYS_PK_10121 table: JOBQUEUE
    at org.hsqldb.jdbc.Util.sqlException(Util.java:255)
    at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(JDBCPreparedStatement.java:4659)
    at org.hsqldb.jdbc.JDBCPreparedStatement.executeUpdate(JDBCPreparedStatement.java:311)
    at org.apache.manifoldcf.core.database.Database.execute(Database.java:606)
    at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:421)
Caused by: org.hsqldb.HsqlException: integrity constraint violation: unique constraint or index violation; SYS_PK_10121 table: JOBQUEUE
    at org.hsqldb.error.Error.error(Error.java:134)
    at org.hsqldb.Constraint.getException(Constraint.java:914)
    at org.hsqldb.index.IndexAVL.insert(IndexAVL.java:731)
    at org.hsqldb.persist.RowStoreAVL.indexRow(RowStoreAVL.java:171)
    at org.hsqldb.persist.RowStoreAVLDisk.indexRow(RowStoreAVLDisk.java:169)
    at org.hsqldb.TransactionManagerMVCC.addInsertAction(TransactionManagerMVCC.java:401)
    at org.hsqldb.Session.addInsertAction(Session.java:434)
    at org.hsqldb.Table.insertSingleRow(Table.java:2553)
    at org.hsqldb.StatementDML.update(StatementDML.java:1032)
    at org.hsqldb.StatementDML.executeUpdateStatement(StatementDML.java:541)
    at org.hsqldb.StatementDML.getResult(StatementDML.java:196)
    at org.hsqldb.StatementDMQL.execute(StatementDMQL.java:190)
    at org.hsqldb.Session.executeCompiledStatement(Session.java:1344)
    at org.hsqldb.Session.execute(Session.java:997)
    at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(JDBCPreparedStatement.java:4651)
    ... 3 more

This only happens on HSQLDB, and is probably related to CONNECTORS-248 and CONNECTORS-246.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098151#comment-13098151 ]

Karl Wright commented on CONNECTORS-202:

Great news! My suggestion is to wait to release this as a formal MCF feature until the current released version of Solr supports it. Otherwise we risk confusing people, and there *is* the workaround of explicitly providing the parameter within the connector's generic parameter feature. So for now, I think updating the end-user documentation would be best, and then when the next rev of Solr is released we can add the explicit feature you suggest. Does this sound reasonable?

SOLR connector suport for commitWithin

Key: CONNECTORS-202
URL: https://issues.apache.org/jira/browse/CONNECTORS-202
Project: ManifoldCF
Issue Type: Improvement
Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 0.2
Reporter: Jan Høydahl
Labels: commit
Fix For: ManifoldCF next

The output connection must support commitWithin (http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22) in addition to sending a commit() at the end of a job. This allows for efficient handling of commits on the Solr side. The parameter should ideally be configurable per job. In that way you could say that for Important job commitWithin=10s while for Big crawl job, commitWithin=600s.
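To make the per-job idea above concrete, here is a rough Java sketch of what a per-job commitWithin could look like in the Solr XML update message that the issue links to. This is not the connector's code: the class, the method names, and the job-to-value map are all made up for illustration, and the only thing taken from Solr itself is the commitWithin attribute on the add element, which is expressed in milliseconds (so the "10s" and "600s" from the description become 10000 and 600000).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: none of these names exist in ManifoldCF or Solr.
public class CommitWithinSketch {

    // Hypothetical per-job settings, in milliseconds as Solr expects.
    static Map<String, Integer> jobCommitWithinMs() {
        Map<String, Integer> settings = new LinkedHashMap<>();
        settings.put("Important job", 10_000);   // "commitWithin=10s" in the issue
        settings.put("Big crawl job", 600_000);  // "commitWithin=600s" in the issue
        return settings;
    }

    // Minimal XML escaping for text content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds a one-document <add> message carrying the commitWithin attribute.
    static String addMessage(String documentId, int commitWithinMs) {
        if (commitWithinMs < 0) {
            throw new IllegalArgumentException("commitWithin must not be negative");
        }
        return "<add commitWithin=\"" + commitWithinMs + "\">"
            + "<doc><field name=\"id\">" + escape(documentId) + "</field></doc>"
            + "</add>";
    }

    public static void main(String[] args) {
        // Print the message each hypothetical job would send for one document.
        for (Map.Entry<String, Integer> job : jobCommitWithinMs().entrySet()) {
            System.out.println(job.getKey() + ": " + addMessage("doc-1", job.getValue()));
        }
    }
}
```

The explicit commit() at the end of a job would still be sent as before; the attribute only bounds how long Solr may wait before committing the added documents.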
[jira] [Resolved] (CONNECTORS-247) Need a set of tests for the scripting language client
[ https://issues.apache.org/jira/browse/CONNECTORS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-247.
Resolution: Fixed

r1165132.

Need a set of tests for the scripting language client

Key: CONNECTORS-247
URL: https://issues.apache.org/jira/browse/CONNECTORS-247
Project: ManifoldCF
Issue Type: Test
Components: Tests
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor
Fix For: ManifoldCF 0.3

We need unit tests for the script language client.
[jira] [Commented] (CONNECTORS-248) File system crawl with HSQLDB aborts with a constraint error
[ https://issues.apache.org/jira/browse/CONNECTORS-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095867#comment-13095867 ]

Karl Wright commented on CONNECTORS-248:

r1164408 may fix this, according to Fred Toussi. He recalls fixing precisely such a race condition at some point recently.

File system crawl with HSQLDB aborts with a constraint error

Key: CONNECTORS-248
URL: https://issues.apache.org/jira/browse/CONNECTORS-248
Project: ManifoldCF
Issue Type: Bug
Components: Framework agents process, Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3

While running two jobs with overlapping files with HSQLDB, I got this error on the second job that aborted it:

Error: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS

The complete exception is here:

ERROR 2011-08-31 21:07:06,029 (Worker thread '34') - Exception tossed: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
org.apache.manifoldcf.core.interfaces.ManifoldCFException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.reinterpretException(DBInterfaceHSQLDB.java:587)
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performModification(DBInterfaceHSQLDB.java:607)
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performUpdate(DBInterfaceHSQLDB.java:242)
    at org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:88)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.updateRowIds(IncrementalIngester.java:628)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentCheckMultiple(IncrementalIngester.java:588)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:653)
Caused by: java.sql.SQLException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
    at org.hsqldb.jdbc.Util.sqlException(Util.java:255)
    at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(JDBCPreparedStatement.java:4659)
    at org.hsqldb.jdbc.JDBCPreparedStatement.executeUpdate(JDBCPreparedStatement.java:311)
    at org.apache.manifoldcf.core.database.Database.execute(Database.java:606)
    at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:421)
Caused by: org.hsqldb.HsqlException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
    at org.hsqldb.error.Error.error(Error.java:134)
    at org.hsqldb.Constraint.getException(Constraint.java:914)
    at org.hsqldb.index.IndexAVL.insert(IndexAVL.java:731)
    at org.hsqldb.persist.RowStoreAVL.indexRow(RowStoreAVL.java:171)
    at org.hsqldb.persist.RowStoreAVLDisk.indexRow(RowStoreAVLDisk.java:169)
    at org.hsqldb.TransactionManagerMVCC.addInsertAction(TransactionManagerMVCC.java:401)
    at org.hsqldb.Session.addInsertAction(Session.java:434)
    at org.hsqldb.Table.insertSingleRow(Table.java:2553)
    at org.hsqldb.StatementDML.update(StatementDML.java:1032)
    at org.hsqldb.StatementDML.executeUpdateStatement(StatementDML.java:541)
    at org.hsqldb.StatementDML.getResult(StatementDML.java:196)
    at org.hsqldb.StatementDMQL.execute(StatementDMQL.java:190)
    at org.hsqldb.Session.executeCompiledStatement(Session.java:1340)
    at org.hsqldb.Session.execute(Session.java:993)
    at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(JDBCPreparedStatement.java:4651)
[jira] [Commented] (CONNECTORS-246) A file crawl exited with an unexpected jobqueue status error under HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095866#comment-13095866 ]

Karl Wright commented on CONNECTORS-246:

r1164408 may fix this, according to Fred Toussi.

A file crawl exited with an unexpected jobqueue status error under HSQLDB

Key: CONNECTORS-246
URL: https://issues.apache.org/jira/browse/CONNECTORS-246
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3

Under HSQLDB, a file crawl terminated with:

Error: Unexpected jobqueue status - record id 1314721269570, expecting active status.

The full trace was:

ERROR 2011-08-30 12:23:48,962 (Worker thread '38') - Exception tossed: Unexpected jobqueue status - record id 1314721269570, expecting active status
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1314721269570, expecting active status
    at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:633)
    at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2386)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:798)
[jira] [Reopened] (CONNECTORS-246) A file crawl exited with an unexpected jobqueue status error under HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright reopened CONNECTORS-246:

New failure indicates problem is likely unresolved

A file crawl exited with an unexpected jobqueue status error under HSQLDB

Key: CONNECTORS-246
URL: https://issues.apache.org/jira/browse/CONNECTORS-246
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3
[jira] [Created] (CONNECTORS-249) HSQLDB gets primary key constraint on multiple job cleanup
HSQLDB gets primary key constraint on multiple job cleanup

Key: CONNECTORS-249
URL: https://issues.apache.org/jira/browse/CONNECTORS-249
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Fix For: ManifoldCF 0.3
[jira] [Reopened] (CONNECTORS-248) File system crawl with HSQLDB aborts with a constraint error
[ https://issues.apache.org/jira/browse/CONNECTORS-248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright reopened CONNECTORS-248:

New failure indicates problem is likely unresolved

File system crawl with HSQLDB aborts with a constraint error

Key: CONNECTORS-248
URL: https://issues.apache.org/jira/browse/CONNECTORS-248
Project: ManifoldCF
Issue Type: Bug
Components: Framework agents process, Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3
[jira] [Assigned] (CONNECTORS-247) Need a set of tests for the scripting language client
[ https://issues.apache.org/jira/browse/CONNECTORS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright reassigned CONNECTORS-247:

Assignee: Karl Wright

Need a set of tests for the scripting language client

Key: CONNECTORS-247
URL: https://issues.apache.org/jira/browse/CONNECTORS-247
Project: ManifoldCF
Issue Type: Test
Components: Tests
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Priority: Minor
Fix For: ManifoldCF 0.3
[jira] [Assigned] (CONNECTORS-249) HSQLDB gets primary key constraint on multiple job cleanup
[ https://issues.apache.org/jira/browse/CONNECTORS-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright reassigned CONNECTORS-249:

Assignee: Karl Wright

HSQLDB gets primary key constraint on multiple job cleanup

Key: CONNECTORS-249
URL: https://issues.apache.org/jira/browse/CONNECTORS-249
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3
[jira] [Commented] (CONNECTORS-248) File system crawl with HSQLDB aborts with a constraint error
[ https://issues.apache.org/jira/browse/CONNECTORS-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096269#comment-13096269 ]

Karl Wright commented on CONNECTORS-248:

I have a script I run that is a good stress test. But the failure details differ from run to run, so I have not made it an official test yet.

File system crawl with HSQLDB aborts with a constraint error

Key: CONNECTORS-248
URL: https://issues.apache.org/jira/browse/CONNECTORS-248
Project: ManifoldCF
Issue Type: Bug
Components: Framework agents process, Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3
[jira] [Commented] (CONNECTORS-248) File system crawl with HSQLDB aborts with a constraint error
[ https://issues.apache.org/jira/browse/CONNECTORS-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095217#comment-13095217 ]

Karl Wright commented on CONNECTORS-248:

Talked with Fred Toussi about this. He's looked at the HSQLDB database settings, and has the following comment:

Your settings indicate this should not have happened. If this is caused by a race condition or similar in the database engine, switching the transaction isolation model from MVCC to LOCKS would prevent it. I would recommend this as a temporary fix for your v. 0.3 until I find the cause.

Regards
Fred

I'm going to wait a bit to see what he comes up with.

File system crawl with HSQLDB aborts with a constraint error

Key: CONNECTORS-248
URL: https://issues.apache.org/jira/browse/CONNECTORS-248
Project: ManifoldCF
Issue Type: Bug
Components: Framework agents process, Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Karl Wright
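For anyone wanting to try the MVCC-to-LOCKS switch suggested in the comment above, the following Java sketch shows the two ways HSQLDB 2.x is generally documented to accept a transaction model: the SET DATABASE TRANSACTION CONTROL statement and the hsqldb.tx connection property. How ManifoldCF itself builds its HSQLDB connection is not shown in this thread, so the class, the method names, and the database path here are hypothetical, and the exact behaviour should be checked against the HSQLDB version in use.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

// Illustrative only: not part of ManifoldCF's DBInterfaceHSQLDB.
public class HsqldbTxModelSketch {

    // Accepts the three models HSQLDB 2.x documents: LOCKS, MVLOCKS, MVCC.
    static String normalize(String model) {
        String upper = model.toUpperCase(Locale.ROOT);
        if (!upper.equals("LOCKS") && !upper.equals("MVLOCKS") && !upper.equals("MVCC")) {
            throw new IllegalArgumentException("Unknown transaction model: " + model);
        }
        return upper;
    }

    // SQL statement that changes the model of an already open database.
    static String transactionControlSql(String model) {
        return "SET DATABASE TRANSACTION CONTROL " + normalize(model);
    }

    // JDBC URL carrying the model as a connection property.
    // The property is applied when a new database is created.
    static String jdbcUrl(String database, String model) {
        return "jdbc:hsqldb:" + database + ";hsqldb.tx=" + normalize(model).toLowerCase(Locale.ROOT);
    }

    // Runs the statement on an existing connection supplied by the caller.
    static void apply(Connection connection, String model) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(transactionControlSql(model));
        }
    }

    public static void main(String[] args) {
        // "file:/tmp/mcfdb" is a made-up database path.
        System.out.println(jdbcUrl("file:/tmp/mcfdb", "locks"));
        System.out.println(transactionControlSql("locks"));
    }
}
```

As Fred notes, this would only be a temporary measure: LOCKS avoids the suspected race by serializing conflicting writers, at some cost in concurrency.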
[jira] [Updated] (CONNECTORS-164) Support Oracle for DBInterface
[ https://issues.apache.org/jira/browse/CONNECTORS-164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-164:

Component/s: (was: API) Framework core
Affects Version/s: ManifoldCF 0.2

Support Oracle for DBInterface

Key: CONNECTORS-164
URL: https://issues.apache.org/jira/browse/CONNECTORS-164
Project: ManifoldCF
Issue Type: New Feature
Components: Framework core
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
Reporter: Jeff Guo
Fix For: ManifoldCF next
Original Estimate: 504h
Remaining Estimate: 504h

The DBInterface currently supports PostgreSQL, Derby, and MySql; Oracle support is needed as well.
[jira] [Updated] (CONNECTORS-164) Support Oracle for DBInterface
[ https://issues.apache.org/jira/browse/CONNECTORS-164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-164: --- Fix Version/s: ManifoldCF next Support Oracle for DBInterface -- Key: CONNECTORS-164 URL: https://issues.apache.org/jira/browse/CONNECTORS-164 Project: ManifoldCF Issue Type: New Feature Components: Framework core Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Jeff Guo Fix For: ManifoldCF next Original Estimate: 504h Remaining Estimate: 504h The DBInterface currently supports PostgreSQL, Derby, and MySQL; Oracle support is needed as well. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-158) SharePoint 2010 evaluation needed; possible changes and removal of custom web service for this version
[ https://issues.apache.org/jira/browse/CONNECTORS-158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-158: --- Fix Version/s: ManifoldCF next SharePoint 2010 evaluation needed; possible changes and removal of custom web service for this version -- Key: CONNECTORS-158 URL: https://issues.apache.org/jira/browse/CONNECTORS-158 Project: ManifoldCF Issue Type: Improvement Components: SharePoint connector Affects Versions: ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next We need to evaluate the SharePoint connector against SharePoint 2010. The goal would be to see if it works, and also to see if Microsoft provides functionality that would make the deployment of the custom MCPermissions web service unnecessary for this version of SharePoint. Modifications may be necessary or desired based on what the research indicates. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-79) Tests and test server for jCIFS connector needed
[ https://issues.apache.org/jira/browse/CONNECTORS-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-79: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Issue Type: Test (was: Bug) Tests and test server for jCIFS connector needed Key: CONNECTORS-79 URL: https://issues.apache.org/jira/browse/CONNECTORS-79 Project: ManifoldCF Issue Type: Test Components: Tests Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next We need test infrastructure and tests for the jCIFS connector. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-83) Tests and test server needed for Meridio connector
[ https://issues.apache.org/jira/browse/CONNECTORS-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-83: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Tests and test server needed for Meridio connector -- Key: CONNECTORS-83 URL: https://issues.apache.org/jira/browse/CONNECTORS-83 Project: ManifoldCF Issue Type: Bug Components: Tests Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next The Meridio connector needs tests, and a Meridio test server to run against. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-82) Tests and test server needed for Memex connector
[ https://issues.apache.org/jira/browse/CONNECTORS-82?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-82: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Issue Type: Test (was: Bug) Tests and test server needed for Memex connector Key: CONNECTORS-82 URL: https://issues.apache.org/jira/browse/CONNECTORS-82 Project: ManifoldCF Issue Type: Test Components: Tests Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next The Memex connector needs tests and a Patriarch server to run against. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-84) Tests and test server needed for SharePoint connector
[ https://issues.apache.org/jira/browse/CONNECTORS-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-84: -- Component/s: (was: SharePoint connector) Tests Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Issue Type: Test (was: Bug) Tests and test server needed for SharePoint connector - Key: CONNECTORS-84 URL: https://issues.apache.org/jira/browse/CONNECTORS-84 Project: ManifoldCF Issue Type: Test Components: Tests Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next We need tests and a SharePoint server to test the SharePoint connector. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-77) Tests and test server needed for FileNet connector
[ https://issues.apache.org/jira/browse/CONNECTORS-77?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-77: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Issue Type: Test (was: Bug) Tests and test server needed for FileNet connector -- Key: CONNECTORS-77 URL: https://issues.apache.org/jira/browse/CONNECTORS-77 Project: ManifoldCF Issue Type: Test Components: Tests Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next We need global testing infrastructure available that would permit a FileNet test to be written. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-87) Connector Framework load test needs to be written
[ https://issues.apache.org/jira/browse/CONNECTORS-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-87: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Issue Type: Test (was: Bug) Connector Framework load test needs to be written - Key: CONNECTORS-87 URL: https://issues.apache.org/jira/browse/CONNECTORS-87 Project: ManifoldCF Issue Type: Test Components: Tests Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next LCF needs a load or performance test, which verifies that the core software is performing as expected. This test can use the file system connector, but must verify that individual throttle bins are getting approximately equal time, and that the system as a whole is behaving efficiently. Furthermore, at least 1,000,000 documents should be crawled by this test. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-13) We should move to eliminate process synchronization via shared file system, and use a process/service instead
[ https://issues.apache.org/jira/browse/CONNECTORS-13?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-13: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Remaining Estimate: (was: 168h) Original Estimate: (was: 168h) We should move to eliminate process synchronization via shared file system, and use a process/service instead - Key: CONNECTORS-13 URL: https://issues.apache.org/jira/browse/CONNECTORS-13 Project: ManifoldCF Issue Type: Improvement Components: Framework core Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next The current implementation relies on the file system to synchronize activity between various LCF processes. This has several downsides: first, it is possible to get the file system into a state that is corrupted (by killing processes); second, this limits the future ability to spread crawler workload over multiple machines. It should be reasonably straightforward, and probably more resilient, to introduce a synchronization process, which all other LCF processes talk to in order to manage locks, shared data, and other synchronization activities. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-63) Add support for reports to API
[ https://issues.apache.org/jira/browse/CONNECTORS-63?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-63: -- Description: The API does not currently have implemented support for any ManifoldCF reports. Add this functionality. was: The API does not currently have implemented support for any LCF reporting. Add this functionality. Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Add support for reports to API -- Key: CONNECTORS-63 URL: https://issues.apache.org/jira/browse/CONNECTORS-63 Project: ManifoldCF Issue Type: Improvement Components: API Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next The API does not currently have implemented support for any ManifoldCF reports. Add this functionality. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-118) Crawled archive files should be expanded into their constituent files
[ https://issues.apache.org/jira/browse/CONNECTORS-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-118: --- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Crawled archive files should be expanded into their constituent files - Key: CONNECTORS-118 URL: https://issues.apache.org/jira/browse/CONNECTORS-118 Project: ManifoldCF Issue Type: New Feature Components: File system connector, JCIFS connector, Web connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Jack Krupansky Fix For: ManifoldCF next Archive files such as zip, mbox, tar, etc. should be expanded into their constituent files during crawling of repositories so that any output connector would output the flattened archive. This could be an option, defaulted to ON, since someone may want to implement a copy connector that maintains crawled files as-is. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-110) Max activity and Max bandwidth reports don't work properly under Derby or HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-110: --- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Max activity and Max bandwidth reports don't work properly under Derby or HSQLDB Key: CONNECTORS-110 URL: https://issues.apache.org/jira/browse/CONNECTORS-110 Project: ManifoldCF Issue Type: Bug Components: Framework crawler agent Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next The reports fail because their queries use the PostgreSQL DISTINCT ON (xxx) syntax, which Derby does not support. Unfortunately, there does not seem to be a way in Derby at present to do anything similar to DISTINCT ON (xxx), and the queries really can't be done without that. One option is to introduce a getCapabilities() method into the database implementation, which would allow ACF to query the database capabilities before even presenting the report in the navigation menu in the UI. Another alternative is to do a sizable chunk of resultset processing within ACF, which would require not only the DISTINCT ON() implementation, but also the enclosing sort and limit stuff. It's the latter that would be most challenging, because of the difficulties with i18n etc. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
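The second alternative in CONNECTORS-110, doing the DISTINCT ON step in the framework rather than in the database, might look roughly like the Java sketch below. It is only an illustration under stated assumptions: the rows are assumed to be already fetched and sorted so that the preferred row for each key comes first, and the `Row` type and `distinctOn` method are hypothetical names, not part of ManifoldCF or of any database driver. The enclosing sort and limit, which the ticket calls the harder part, are not covered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: emulate DISTINCT ON (key) over rows that are already sorted.
public class DistinctOnSketch {
    // Hypothetical row shape: a grouping key plus one measured value.
    public static final class Row {
        public final String key;
        public final long activityCount;

        public Row(String key, long activityCount) {
            this.key = key;
            this.activityCount = activityCount;
        }
    }

    // Keep only the first row seen for each key, preserving the input order.
    public static List<Row> distinctOn(List<Row> sortedRows) {
        Map<String, Row> firstPerKey = new LinkedHashMap<>();
        for (Row row : sortedRows) {
            firstPerKey.putIfAbsent(row.key, row);
        }
        return new ArrayList<>(firstPerKey.values());
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("binA", 9), new Row("binA", 3), new Row("binB", 5));
        for (Row row : distinctOn(rows)) {
            System.out.println(row.key + " " + row.activityCount);
        }
    }
}
```

A `LinkedHashMap` is used here so that the output keeps the order of the sorted input, which is what a surrounding ORDER BY would expect.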
[jira] [Updated] (CONNECTORS-120) Port all connectors to use httpclient 4.x, after we submit our remaining 3.x changes as commons-httpclient tickets
[ https://issues.apache.org/jira/browse/CONNECTORS-120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-120: --- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Port all connectors to use httpclient 4.x, after we submit our remaining 3.x changes as commons-httpclient tickets -- Key: CONNECTORS-120 URL: https://issues.apache.org/jira/browse/CONNECTORS-120 Project: ManifoldCF Issue Type: Task Components: LiveLink connector, Meridio connector, RSS connector, SharePoint connector, Web connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next Now that commons-httpclient has accepted our NTLM patch, we can upgrade our connectors to use their newest 4.x httpclient code. We still need to submit or apply patches for other features first, so this ticket depends on the resolution of that action, covered in CONNECTORS-119. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-119) Submit patch requests for all remaining httpclient customizations
[ https://issues.apache.org/jira/browse/CONNECTORS-119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-119: --- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Submit patch requests for all remaining httpclient customizations - Key: CONNECTORS-119 URL: https://issues.apache.org/jira/browse/CONNECTORS-119 Project: ManifoldCF Issue Type: Task Components: Framework core Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next Now that commons-httpclient has accepted the NTLM patch, we can in theory start to use httpclient 4.x plain-vanilla as a replacement for our customized 3.1 httpclient. But first we should submit any remaining differences as patch requests. Specifically, the cross-path cookie allowance should be submitted. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CONNECTORS-222) Would like support for building and running ManifoldCF under eclipse
[ https://issues.apache.org/jira/browse/CONNECTORS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-222: -- Assignee: Karl Wright Would like support for building and running ManifoldCF under eclipse Key: CONNECTORS-222 URL: https://issues.apache.org/jira/browse/CONNECTORS-222 Project: ManifoldCF Issue Type: New Feature Components: Framework agents process Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.3 Being able to build and run under Eclipse would allow people to develop connectors and patches more readily. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-202) SOLR connector suport for commitWithin
[ https://issues.apache.org/jira/browse/CONNECTORS-202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-202: --- Fix Version/s: ManifoldCF next SOLR connector suport for commitWithin -- Key: CONNECTORS-202 URL: https://issues.apache.org/jira/browse/CONNECTORS-202 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.2 Reporter: Jan Høydahl Labels: commit Fix For: ManifoldCF next The output connection must support commitWithin (http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_for_.22add.22) in addition to sending a commit() at the end of a job. This allows for efficient handling of commits on the Solr side. The parameter should ideally be configurable per job. In that way you could say that for Important job commitWithin=10s while for Big crawl job, commitWithin=600s. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
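As a rough illustration of the request in CONNECTORS-202, a per-job commitWithin value could be carried into the Solr XML update message along the lines below. This is a sketch only: the class and method names are hypothetical and not taken from the ManifoldCF Solr connector, the document body is a placeholder, and it assumes that `commitWithin` is an attribute of the `<add>` element expressed in milliseconds, as described on the Solr wiki page linked in the ticket.

```java
// Hypothetical sketch: wrap a document in an <add> element carrying commitWithin.
public class CommitWithinSketch {
    // commitWithinMillis is assumed to come from a per-job setting.
    public static String wrapAdd(String docXml, long commitWithinMillis) {
        return "<add commitWithin=\"" + commitWithinMillis + "\">" + docXml + "</add>";
    }

    public static void main(String[] args) {
        // An "important" job asking for a commit within 10 seconds.
        System.out.println(wrapAdd("<doc/>", 10_000L));
        // A large crawl that can tolerate 600 seconds.
        System.out.println(wrapAdd("<doc/>", 600_000L));
    }
}
```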
[jira] [Commented] (CONNECTORS-222) Would like support for building and running ManifoldCF under eclipse
[ https://issues.apache.org/jira/browse/CONNECTORS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095239#comment-13095239 ] Karl Wright commented on CONNECTORS-222: The maven support is now significant and probably counts as a resolution to this ticket. So I'm closing it. Would like support for building and running ManifoldCF under eclipse Key: CONNECTORS-222 URL: https://issues.apache.org/jira/browse/CONNECTORS-222 Project: ManifoldCF Issue Type: New Feature Components: Framework agents process Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Fix For: ManifoldCF 0.3 Being able to build and run under Eclipse would allow people to develop connectors and patches more readily. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-237) RSS Connector proxy code doesn't seem to function correctly
[ https://issues.apache.org/jira/browse/CONNECTORS-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-237: --- Priority: Minor (was: Major) Fix Version/s: ManifoldCF next RSS Connector proxy code doesn't seem to function correctly --- Key: CONNECTORS-237 URL: https://issues.apache.org/jira/browse/CONNECTORS-237 Project: ManifoldCF Issue Type: Bug Components: RSS connector Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Trying to crawl through a proxy fails. No activity is recorded but all fetches fail (with timeout errors) and are requeued. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-237) RSS Connector proxy code doesn't seem to function correctly
[ https://issues.apache.org/jira/browse/CONNECTORS-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095241#comment-13095241 ] Karl Wright commented on CONNECTORS-237: This seems to be a problem only with certain proxies. Not quite sure what the criteria are for the problem occurring. But I've confirmed that crawling through most proxies still works. RSS Connector proxy code doesn't seem to function correctly --- Key: CONNECTORS-237 URL: https://issues.apache.org/jira/browse/CONNECTORS-237 Project: ManifoldCF Issue Type: Bug Components: RSS connector Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Fix For: ManifoldCF next Trying to crawl through a proxy fails. No activity is recorded but all fetches fail (with timeout errors) and are requeued. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-233) ManifoldCF would benefit from a generic push agent
[ https://issues.apache.org/jira/browse/CONNECTORS-233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-233: --- Fix Version/s: ManifoldCF next ManifoldCF would benefit from a generic push agent -- Key: CONNECTORS-233 URL: https://issues.apache.org/jira/browse/CONNECTORS-233 Project: ManifoldCF Issue Type: New Feature Components: Framework agents process Affects Versions: ManifoldCF next Reporter: Karl Wright Fix For: ManifoldCF next ManifoldCF has a pull agent which crawls to get what it needs. There are, however, no push agents available. Developing a JSON-based push agent would demonstrate how to write one of these entities, and if done properly would also be useful in many off-the-shelf situations where notification is used. The most common model would involve an API to which change notifications could be reliably posted. A database table would maintain a list of the documents that needed processing, like the jobqueue. Fetching of documents would then need to be performed through a pluggable interface similar in some respects to IRepositoryConnector, but which would differ because version strings are unneeded. Indexing, of course, would proceed through the agents framework. I would anticipate that such an exercise would lead to some changes in the way the agents framework is structured. It is also possible to imagine that instead of a push agent, a notification service could be added to the pull-agent which would effectively do the same thing. Choosing the right approach would be part of this ticket. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-242) Connector support matrix page needs to be updated with both OpenSearchServer supported versions and CMIS supported versions
[ https://issues.apache.org/jira/browse/CONNECTORS-242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-242: --- Fix Version/s: ManifoldCF 0.3 Connector support matrix page needs to be updated with both OpenSearchServer supported versions and CMIS supported versions --- Key: CONNECTORS-242 URL: https://issues.apache.org/jira/browse/CONNECTORS-242 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Fix For: ManifoldCF 0.3 Connector support matrix page needs to be updated with both OpenSearchServer supported versions and CMIS supported versions -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-34) eRoom authority and connector
[ https://issues.apache.org/jira/browse/CONNECTORS-34?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-34: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next eRoom authority and connector - Key: CONNECTORS-34 URL: https://issues.apache.org/jira/browse/CONNECTORS-34 Project: ManifoldCF Issue Type: New Feature Components: eRoom connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Fix For: ManifoldCF next eRoom has a SOAP API which looks like it has enough power to perhaps implement a connector and an authority. The eRoom API url is here (the host is under the Swiss .ch domain, and the url is legit): https://eroom.abraxas.ch/eroomHelp/en/API_Help/Api.htm#home_api.html -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-100) DB lock timeout, and/or indefinite or excessive database activity
[ https://issues.apache.org/jira/browse/CONNECTORS-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-100: --- Affects Version/s: ManifoldCF 0.3 ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next DB lock timeout, and/or indefinite or excessive database activity - Key: CONNECTORS-100 URL: https://issues.apache.org/jira/browse/CONNECTORS-100 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Environment: Running unmodified dist/example from trunk/ using the default configuration. Reporter: Andrzej Bialecki Assignee: Karl Wright Fix For: ManifoldCF next When a job is started and running (via crawler-ui) occasionally it's not possible to display a list of running jobs. The problem persists even after restarting ACF. The following exception is thrown in the console: {code} org.apache.acf.core.interfaces.ACFException: Database exception: Exception doing query: A lock could not be obtained within the time requested at org.apache.acf.core.database.Database.executeViaThread(Database.java:421) at org.apache.acf.core.database.Database.executeUncachedQuery(Database.java:465) at org.apache.acf.core.database.Database$QueryCacheExecutor.create(Database.java:1072) at org.apache.acf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144) at org.apache.acf.core.database.Database.executeQuery(Database.java:167) at org.apache.acf.core.database.DBInterfaceDerby.performQuery(DBInterfaceDerby.java:727) at org.apache.acf.crawler.jobs.JobManager.makeJobStatus(JobManager.java:5611) at org.apache.acf.crawler.jobs.JobManager.getAllStatus(JobManager.java:5549) at org.apache.jsp.showjobstatus_jsp._jspService(showjobstatus_jsp.java:316) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at 
org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:377) at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313) at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Caused by: java.sql.SQLTransactionRollbackException: A lock could not be obtained within the time requested at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) 
at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) at
[jira] [Updated] (CONNECTORS-94) fix common localization traps
[ https://issues.apache.org/jira/browse/CONNECTORS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-94: -- Affects Version/s: ManifoldCF 0.3 ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next fix common localization traps - Key: CONNECTORS-94 URL: https://issues.apache.org/jira/browse/CONNECTORS-94 Project: ManifoldCF Issue Type: Task Components: Framework core Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Robert Muir Assignee: Robert Muir Fix For: ManifoldCF next Searching through the LCF code, I found several uses of the following that appear to be potentially dangerous: * getBytes() with no encoding: this is dangerous as the encoding is completely unspecified. In most places this should likely mean UTF-8 * getBytes(utf-8): this is mostly a nitpick, but this alias is not guaranteed to exist (see Charset docs). I suggest changing these all to UTF-8 * String.toLowerCase()/String.toUpperCase() with no specified Locale, where it appears the text is not used solely for display, but instead for 'caseless matching'. I suggest changing these to use either the root Locale: new Locale() or even easier, Locale.ENGLISH. This way ACF does not have surprising behavior on say a Turkish computer. I can contribute a patch to address these. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
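The traps listed in CONNECTORS-94 can be illustrated with a short Java sketch. The class and method names below are hypothetical and not taken from the ManifoldCF code base. The sketch assumes `StandardCharsets.UTF_8` (Java 7 and later) in place of a charset name string, and uses `Locale.ROOT` as one way to get the root locale; the ticket's other suggestion, `Locale.ENGLISH`, would behave the same way for this purpose.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper, not part of ManifoldCF: illustrates the fixes the ticket suggests.
public class LocalizationSketch {
    // Encode with an explicit charset instead of the platform default used by getBytes().
    public static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Build a key for caseless matching that does not depend on the JVM default locale.
    public static String caselessKey(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");
        // Under Turkish casing rules 'I' lowercases to dotless i (U+0131),
        // so a key built with the Turkish locale differs from the root-locale key.
        boolean same = "TITLE".toLowerCase(turkish).equals(caselessKey("TITLE"));
        System.out.println("Turkish and root-locale keys match: " + same);
        System.out.println("UTF-8 byte count: " + encodeUtf8("\u00e9").length);
    }
}
```

Routing all caseless matching through one helper such as `caselessKey` keeps the locale choice in a single place, which makes it easier to audit than scattered `toLowerCase()` calls.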
[jira] [Commented] (CONNECTORS-241) OpenSearchServer connector needs how-to-build-and-deploy section
[ https://issues.apache.org/jira/browse/CONNECTORS-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095244#comment-13095244 ] Karl Wright commented on CONNECTORS-241: A patch was submitted under CONNECTORS-240, so I'm closing this ticket as a duplicate. OpenSearchServer connector needs how-to-build-and-deploy section Key: CONNECTORS-241 URL: https://issues.apache.org/jira/browse/CONNECTORS-241 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright The how-to-build-and-deploy page needs to be updated to include the OpenSearchServer connector info. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-248) File system crawl with HSQLDB aborts with a constraint error
[ https://issues.apache.org/jira/browse/CONNECTORS-248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-248:
    Fix Version/s: ManifoldCF 0.3
    Assignee: Karl Wright

File system crawl with HSQLDB aborts with a constraint error
    Key: CONNECTORS-248
    URL: https://issues.apache.org/jira/browse/CONNECTORS-248
    Project: ManifoldCF
    Issue Type: Bug
    Components: Framework agents process, Framework core
    Affects Versions: ManifoldCF 0.3
    Reporter: Karl Wright
    Assignee: Karl Wright
    Fix For: ManifoldCF 0.3

While running two jobs with overlapping files with HSQLDB, I got this error on the second job that aborted it:

Error: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS

The complete exception is here:

ERROR 2011-08-31 21:07:06,029 (Worker thread '34') - Exception tossed: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
org.apache.manifoldcf.core.interfaces.ManifoldCFException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
        at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.reinterpretException(DBInterfaceHSQLDB.java:587)
        at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performModification(DBInterfaceHSQLDB.java:607)
        at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performUpdate(DBInterfaceHSQLDB.java:242)
        at org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:88)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.updateRowIds(IncrementalIngester.java:628)
        at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentCheckMultiple(IncrementalIngester.java:588)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:653)
Caused by: java.sql.SQLException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
        at org.hsqldb.jdbc.Util.sqlException(Util.java:255)
        at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(JDBCPreparedStatement.java:4659)
        at org.hsqldb.jdbc.JDBCPreparedStatement.executeUpdate(JDBCPreparedStatement.java:311)
        at org.apache.manifoldcf.core.database.Database.execute(Database.java:606)
        at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:421)
Caused by: org.hsqldb.HsqlException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
        at org.hsqldb.error.Error.error(Error.java:134)
        at org.hsqldb.Constraint.getException(Constraint.java:914)
        at org.hsqldb.index.IndexAVL.insert(IndexAVL.java:731)
        at org.hsqldb.persist.RowStoreAVL.indexRow(RowStoreAVL.java:171)
        at org.hsqldb.persist.RowStoreAVLDisk.indexRow(RowStoreAVLDisk.java:169)
        at org.hsqldb.TransactionManagerMVCC.addInsertAction(TransactionManagerMVCC.java:401)
        at org.hsqldb.Session.addInsertAction(Session.java:434)
        at org.hsqldb.Table.insertSingleRow(Table.java:2553)
        at org.hsqldb.StatementDML.update(StatementDML.java:1032)
        at org.hsqldb.StatementDML.executeUpdateStatement(StatementDML.java:541)
        at org.hsqldb.StatementDML.getResult(StatementDML.java:196)
        at org.hsqldb.StatementDMQL.execute(StatementDMQL.java:190)
        at org.hsqldb.Session.executeCompiledStatement(Session.java:1340)
        at org.hsqldb.Session.execute(Session.java:993)
        at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(JDBCPreparedStatement.java:4651)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-246) A file crawl exited with an unexpected jobqueue status error under HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-246:
    Component/s: Framework core
    Affects Version/s: ManifoldCF 0.3
    Fix Version/s: ManifoldCF 0.3

A file crawl exited with an unexpected jobqueue status error under HSQLDB
    Key: CONNECTORS-246
    URL: https://issues.apache.org/jira/browse/CONNECTORS-246
    Project: ManifoldCF
    Issue Type: Bug
    Components: Framework core
    Affects Versions: ManifoldCF 0.3
    Reporter: Karl Wright
    Assignee: Karl Wright
    Fix For: ManifoldCF 0.3

Under HSQLDB, a file crawl terminated with:

Error: Unexpected jobqueue status - record id 1314721269570, expecting active status.

The full trace was:

ERROR 2011-08-30 12:23:48,962 (Worker thread '38') - Exception tossed: Unexpected jobqueue status - record id 1314721269570, expecting active status
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1314721269570, expecting active status
        at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:633)
        at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2386)
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:798)
[jira] [Updated] (CONNECTORS-240) OpenSearchServer connector needs end-user documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-240: --- Fix Version/s: ManifoldCF 0.3 OpenSearchServer connector needs end-user documentation --- Key: CONNECTORS-240 URL: https://issues.apache.org/jira/browse/CONNECTORS-240 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Fix For: ManifoldCF 0.3 Attachments: oss-mfc-site.patch We need end-user documentation for the OpenSearchServer connector -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-241) OpenSearchServer connector needs how-to-build-and-deploy section
[ https://issues.apache.org/jira/browse/CONNECTORS-241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-241. Resolution: Duplicate Fix Version/s: ManifoldCF 0.3 See CONNECTORS-240. OpenSearchServer connector needs how-to-build-and-deploy section Key: CONNECTORS-241 URL: https://issues.apache.org/jira/browse/CONNECTORS-241 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Fix For: ManifoldCF 0.3 The how-to-build-and-deploy page needs to be updated to include the OpenSearchServer connector info. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-66) Document Active Directory authority configuration API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-66?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-66: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document Active Directory authority configuration API pieces Key: CONNECTORS-66 URL: https://issues.apache.org/jira/browse/CONNECTORS-66 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document Active Directory-specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library
[ https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-19: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Look into converting SOLR connector to use SolrJ java library - Key: CONNECTORS-19 URL: https://issues.apache.org/jira/browse/CONNECTORS-19 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next The SOLR connector currently uses its own multipart post code. It might be a good idea to convert it to use the SolrJ client api jar instead. This would require license confirmation, plus research to make sure there are no jar conflicts as a result, with any other connector. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-75) Document Solr Connector configuration/specification API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-75: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document Solr Connector configuration/specification API pieces -- Key: CONNECTORS-75 URL: https://issues.apache.org/jira/browse/CONNECTORS-75 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document Solr Connector - specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-69) Document JDBC connector configuration/specification API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-69: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document JDBC connector configuration/specification API pieces -- Key: CONNECTORS-69 URL: https://issues.apache.org/jira/browse/CONNECTORS-69 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document JDBC connector -specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-67) Document GTS output connector configuration/specification API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-67?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-67: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document GTS output connector configuration/specification API pieces Key: CONNECTORS-67 URL: https://issues.apache.org/jira/browse/CONNECTORS-67 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document GTS output connector- specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-70) Document LiveLink configuration/specification/command API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-70?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-70: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document LiveLink configuration/specification/command API pieces Key: CONNECTORS-70 URL: https://issues.apache.org/jira/browse/CONNECTORS-70 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document LiveLink connector - specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-64) Document the FileNet configuration/specification/command API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-64?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-64: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document the FileNet configuration/specification/command API pieces --- Key: CONNECTORS-64 URL: https://issues.apache.org/jira/browse/CONNECTORS-64 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document FileNet-specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-68) Document jCIFS connector configuration/specification/command API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-68: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document jCIFS connector configuration/specification/command API pieces --- Key: CONNECTORS-68 URL: https://issues.apache.org/jira/browse/CONNECTORS-68 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document jCIFS connector -specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-72) Document Meridio connector configuration/specification/command API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-72: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document Meridio connector configuration/specification/command API pieces - Key: CONNECTORS-72 URL: https://issues.apache.org/jira/browse/CONNECTORS-72 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document Meridio connector - specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-141) Forrest skin files should be excluded from license check
[ https://issues.apache.org/jira/browse/CONNECTORS-141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-141:
    Affects Version/s: ManifoldCF 0.3
    Fix Version/s: ManifoldCF 0.3
    Assignee: Karl Wright

Forrest skin files should be excluded from license check
    Key: CONNECTORS-141
    URL: https://issues.apache.org/jira/browse/CONNECTORS-141
    Project: ManifoldCF
    Issue Type: Bug
    Components: Build
    Affects Versions: ManifoldCF 0.3
    Reporter: Karl Wright
    Assignee: Karl Wright
    Priority: Minor
    Fix For: ManifoldCF 0.3

The following files are reported by RAT:

[rat:report] Unapproved licenses:
[rat:report]
[rat:report]   C:/wip/mcf-release/release-0.1-branch/site/src/documentation/skins/common/xslt/html/split.xsl
[rat:report]   C:/wip/mcf-release/release-0.1-branch/site/src/documentation/skins/lucene/note.txt
[rat:report]
[rat:report] ***

According to Forrest developers, these files should simply be excluded from the license report.
[jira] [Updated] (CONNECTORS-99) REST API serialization inconsistency
[ https://issues.apache.org/jira/browse/CONNECTORS-99?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-99:
    Affects Version/s: ManifoldCF 0.3, ManifoldCF 0.1, ManifoldCF 0.2
    Fix Version/s: ManifoldCF next

REST API serialization inconsistency
    Key: CONNECTORS-99
    URL: https://issues.apache.org/jira/browse/CONNECTORS-99
    Project: ManifoldCF
    Issue Type: Wish
    Components: API
    Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
    Environment: ACF trunk.
    Reporter: Andrzej Bialecki
    Priority: Minor
    Fix For: ManifoldCF next

There is some inconsistency in the REST APIs that makes the returned values more difficult to process than necessary. It boils down to the fact that lists of values are serialized into JSON arrays only when there is more than 1 element in the list, but they are serialized into plain JSON objects when there is 0 or 1 element in the list.

Example:

* listings of jobs, connectors, connections, repositories etc. all suffer from this symptom:
{code}
* 1 element: {job:{id:1283811504796,description:job 1 ...
* 2 elements: {job:[{id:1283811504796,description:job 1 ...
{code}
* nested elements, such as job metadata:
{code}
1 element: metadata:{_value_:,_attribute_name:jobKey1,_attribute_value:jobVal1}
2 elements: metadata:[{_value_:,_attribute_name:jobKey1,_attribute_value:jobVal1},{_value_:,_attribute_name:jobKey2,_attribute_value:jobVal2}]
{code}

In my opinion, in all the above cases the API should always return a JSON array for those elements that can occur with cardinality greater than 1.
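Until the serialization is made consistent, a client can work around the behavior described above by normalizing each such field itself. The following is only a sketch under stated assumptions, not part of ManifoldCF: the class name JsonCardinality is hypothetical, the response is assumed to have been parsed already by some JSON library into plain Java values (a List for a JSON array, any other object for a single element), and a missing field is assumed to mean zero elements.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical client-side helper; not part of the ManifoldCF API.
final class JsonCardinality {
    private JsonCardinality() {
    }

    // Returns the value of a field such as "job" or "metadata" as a list,
    // whichever of the three shapes the server produced.
    static List<?> asList(Object parsedValue) {
        if (parsedValue == null) {
            // Assumption: an absent field means the list had 0 elements.
            return Collections.emptyList();
        }
        if (parsedValue instanceof List) {
            // 2 or more elements: the server already sent a JSON array.
            return (List<?>) parsedValue;
        }
        // Exactly 1 element: the server sent a bare JSON object.
        return Collections.singletonList(parsedValue);
    }
}
```

With a helper like this, calling code can iterate over JsonCardinality.asList(value) without first checking which shape the server returned.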
[jira] [Updated] (CONNECTORS-144) Apache headers may not be necessary for README and DISCLAIMER
[ https://issues.apache.org/jira/browse/CONNECTORS-144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-144:
    Affects Version/s: ManifoldCF 0.3
    Fix Version/s: ManifoldCF 0.3
    Assignee: Karl Wright

Apache headers may not be necessary for README and DISCLAIMER
    Key: CONNECTORS-144
    URL: https://issues.apache.org/jira/browse/CONNECTORS-144
    Project: ManifoldCF
    Issue Type: Bug
    Components: Documentation
    Affects Versions: ManifoldCF 0.3
    Reporter: Karl Wright
    Assignee: Karl Wright
    Priority: Minor
    Fix For: ManifoldCF 0.3

One last comment is that some of the doc files, like the README and DISCLAIMER, have the Apache License header, which I don't think is necessary. The README is the first thing people look at, so having a big glob of legal text right at the top isn't so attractive.

We'd need to check if this is indeed true. For the DISCLAIMER it may well be true, not sure about the README.
[jira] [Updated] (CONNECTORS-65) Document File Connector configuration/specification/command API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-65?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-65: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document File Connector configuration/specification/command API pieces -- Key: CONNECTORS-65 URL: https://issues.apache.org/jira/browse/CONNECTORS-65 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document File System Connector -specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-73) Document RSS connector configuration/specification API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-73?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-73: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document RSS connector configuration/specification API pieces - Key: CONNECTORS-73 URL: https://issues.apache.org/jira/browse/CONNECTORS-73 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document RSS connector - specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-74) Document SharePoint connector configuration/specification/command API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-74: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document SharePoint connector configuration/specification/command API pieces Key: CONNECTORS-74 URL: https://issues.apache.org/jira/browse/CONNECTORS-74 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document SharePoint connector - specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-76) Document Web Connector configuration/specification API pieces
[ https://issues.apache.org/jira/browse/CONNECTORS-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-76: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Document Web Connector configuration/specification API pieces - Key: CONNECTORS-76 URL: https://issues.apache.org/jira/browse/CONNECTORS-76 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next Need to document web connector - specific API objects and commands. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-27) Add support for observation to the crawler agent
[ https://issues.apache.org/jira/browse/CONNECTORS-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-27: -- Affects Version/s: ManifoldCF 0.1 ManifoldCF 0.2 Fix Version/s: ManifoldCF next Add support for observation to the crawler agent Key: CONNECTORS-27 URL: https://issues.apache.org/jira/browse/CONNECTORS-27 Project: ManifoldCF Issue Type: New Feature Components: Framework crawler agent Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Ralph Benjamin Ruijs Priority: Minor Fix For: ManifoldCF next Attachments: Added_observation_logic_to_the_crawler.patch When crawling a large repository, it could take a lot of time before changes are propagated to Solr. You can add an event listener to the repository, and be notified about changes. The crawler will ensure you have a complete copy in case of missed events. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-176) It might be nice to have a direct link from the site navigation area to the performance tuning page
[ https://issues.apache.org/jira/browse/CONNECTORS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-176: --- Fix Version/s: ManifoldCF 0.3 It might be nice to have a direct link from the site navigation area to the performance tuning page --- Key: CONNECTORS-176 URL: https://issues.apache.org/jira/browse/CONNECTORS-176 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF 0.3 The only way to get to the performance tuning page now is through the developer support page. A direct link might make it easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-178) Implement ability to run ManifoldCF with Derby in multiprocess mode
[ https://issues.apache.org/jira/browse/CONNECTORS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-178:
    Affects Version/s: ManifoldCF 0.1, ManifoldCF 0.2
    Fix Version/s: ManifoldCF next

Implement ability to run ManifoldCF with Derby in multiprocess mode
    Key: CONNECTORS-178
    URL: https://issues.apache.org/jira/browse/CONNECTORS-178
    Project: ManifoldCF
    Issue Type: Bug
    Components: Documentation, Framework core
    Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
    Reporter: Karl Wright
    Priority: Minor
    Fix For: ManifoldCF next

Derby has a standalone server mode, which we can no doubt use if we modify the Derby driver to accept a configuration parameter which allows you to choose between the embedded driver and the client driver. It might be useful to be able to run ManifoldCF with Derby in this manner.
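As a rough illustration of the configuration switch proposed above, the sketch below picks a JDBC URL from a mode setting. Everything named here is an assumption made for the example: the DerbyUrlChooser class, the Mode enum, and the idea that the mode arrives as a single parameter are hypothetical and not taken from the ManifoldCF code. Only the two URL shapes are standard Derby: jdbc:derby:<database> for the embedded driver and jdbc:derby://<host>:<port>/<database> for the network client.

```java
// Hypothetical helper; it only builds URLs and does not load any Derby classes.
final class DerbyUrlChooser {
    // Hypothetical configuration values for the proposed parameter.
    enum Mode { EMBEDDED, CLIENT }

    private DerbyUrlChooser() {
    }

    static String jdbcUrl(Mode mode, String database, String host, int port) {
        switch (mode) {
            case EMBEDDED:
                // In-process database; host and port are ignored.
                return "jdbc:derby:" + database + ";create=true";
            case CLIENT:
                // Separate Derby network server, shared by several processes.
                return "jdbc:derby://" + host + ":" + port + "/" + database + ";create=true";
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }
}
```

In multiprocess mode every ManifoldCF process would be given the CLIENT URL, so that all of them talk to the same Derby server instead of each opening the database files directly.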
[jira] [Updated] (CONNECTORS-180) Connector factories all have a Pool class that should be derived from a base Pool class
[ https://issues.apache.org/jira/browse/CONNECTORS-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-180:
    Affects Version/s: ManifoldCF 0.1, ManifoldCF 0.2
    Fix Version/s: ManifoldCF next

Connector factories all have a Pool class that should be derived from a base Pool class
    Key: CONNECTORS-180
    URL: https://issues.apache.org/jira/browse/CONNECTORS-180
    Project: ManifoldCF
    Issue Type: Improvement
    Components: Framework core
    Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
    Reporter: Karl Wright
    Assignee: Erlend Garåsen
    Priority: Minor
    Fix For: ManifoldCF next

There's a fair bit of duplicated code in the connector factories - RepositoryConnectorFactory, AuthorityConnectorFactory, etc. The duplicated code can be easily eliminated by creating a base factory pool class.
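One possible shape for such a shared base class is sketched below. This is an assumption about the design, not the existing code: the name BasePool, its methods, and the use of a Supplier to create instances are all hypothetical, and the real Pool classes inside the factories may track more state than this.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical generic pool that each connector factory could extend or reuse.
class BasePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> creator;
    private int permits; // how many more instances may be handed out

    BasePool(int maxInstances, Supplier<T> creator) {
        this.permits = maxInstances;
        this.creator = creator;
    }

    // Hands out an idle instance, creates one if none is idle,
    // or returns null when the pool limit has been reached.
    synchronized T tryGrab() {
        if (permits == 0) {
            return null;
        }
        permits--;
        T instance = idle.poll();
        return instance != null ? instance : creator.get();
    }

    // Returns an instance to the pool so a later tryGrab() can reuse it.
    synchronized void release(T instance) {
        idle.push(instance);
        permits++;
    }
}
```

Each factory would then keep only the connector-specific part, namely how to create an instance, and inherit the bookkeeping.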
[jira] [Updated] (CONNECTORS-177) File System Connector has some testing code in it
[ https://issues.apache.org/jira/browse/CONNECTORS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-177: --- Fix Version/s: ManifoldCF 0.3 Assignee: Karl Wright File System Connector has some testing code in it - Key: CONNECTORS-177 URL: https://issues.apache.org/jira/browse/CONNECTORS-177 Project: ManifoldCF Issue Type: Improvement Components: File system connector Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 0.3 The file system connector has testing code in it that should be removed. See getBinNames(). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-184) Active Directory authority could use support for SSL, and checkboxes/pulldowns for selection of authentication modes
[ https://issues.apache.org/jira/browse/CONNECTORS-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-184: --- Fix Version/s: ManifoldCF next Active Directory authority could use support for SSL, and checkboxes/pulldowns for selection of authentication modes Key: CONNECTORS-184 URL: https://issues.apache.org/jira/browse/CONNECTORS-184 Project: ManifoldCF Issue Type: Improvement Components: Active Directory authority Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF next The active directory authority's UI or implementation currently does not support SSL. Also, the selection of security protocol does not help the user at all by giving any hints of what's allowed and what isn't - it's just a simple text box. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-207) ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-minute retry, but may want to abort the job instead
[ https://issues.apache.org/jira/browse/CONNECTORS-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095247#comment-13095247 ]

Karl Wright commented on CONNECTORS-207:

This has been resolved through the use of ServiceInterruption exceptions rather than the REPOSITORY_CONNECTION_ERROR ManifoldCFException type.

ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-minute retry, but may want to abort the job instead
    Key: CONNECTORS-207
    URL: https://issues.apache.org/jira/browse/CONNECTORS-207
    Project: ManifoldCF
    Issue Type: Bug
    Components: Framework crawler agent
    Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
    Reporter: Karl Wright
    Priority: Minor
    Fix For: ManifoldCF 0.3

The way a worker thread treats ManifoldCFException type REPOSITORY_CONNECTION_ERROR is to wait 5 minutes and retry. It might want to just allow the job to be aborted with no retries. The current behavior is not actually *wrong*, but the circumstances under which it was added were the result of severe problems at various sites that were unrelated to ManifoldCF.
[jira] [Resolved] (CONNECTORS-207) ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-minute retry, but may want to abort the job instead
[ https://issues.apache.org/jira/browse/CONNECTORS-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-207.
    Resolution: Duplicate
    Fix Version/s: ManifoldCF 0.3

ManifoldCFException type REPOSITORY_CONNECTION_ERROR causes a five-minute retry, but may want to abort the job instead
    Key: CONNECTORS-207
    URL: https://issues.apache.org/jira/browse/CONNECTORS-207
    Project: ManifoldCF
    Issue Type: Bug
    Components: Framework crawler agent
    Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
    Reporter: Karl Wright
    Priority: Minor
    Fix For: ManifoldCF 0.3

The way a worker thread treats ManifoldCFException type REPOSITORY_CONNECTION_ERROR is to wait 5 minutes and retry. It might want to just allow the job to be aborted with no retries. The current behavior is not actually *wrong*, but the circumstances under which it was added were the result of severe problems at various sites that were unrelated to ManifoldCF.
[jira] [Updated] (CONNECTORS-247) Need a set of tests for the scripting language client
[ https://issues.apache.org/jira/browse/CONNECTORS-247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-247: --- Fix Version/s: ManifoldCF 0.3 Need a set of tests for the scripting language client - Key: CONNECTORS-247 URL: https://issues.apache.org/jira/browse/CONNECTORS-247 Project: ManifoldCF Issue Type: Test Components: Tests Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Priority: Minor Fix For: ManifoldCF 0.3 We need unit tests for the script language client. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-141) Forrest skin files should be excluded from license check
[ https://issues.apache.org/jira/browse/CONNECTORS-141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-141.
    Resolution: Fixed

r1164084.

Forrest skin files should be excluded from license check
    Key: CONNECTORS-141
    URL: https://issues.apache.org/jira/browse/CONNECTORS-141
    Project: ManifoldCF
    Issue Type: Bug
    Components: Build
    Affects Versions: ManifoldCF 0.3
    Reporter: Karl Wright
    Assignee: Karl Wright
    Priority: Minor
    Fix For: ManifoldCF 0.3

The following files are reported by RAT:

[rat:report] Unapproved licenses:
[rat:report]
[rat:report]   C:/wip/mcf-release/release-0.1-branch/site/src/documentation/skins/common/xslt/html/split.xsl
[rat:report]   C:/wip/mcf-release/release-0.1-branch/site/src/documentation/skins/lucene/note.txt
[rat:report]
[rat:report] ***

According to Forrest developers, these files should simply be excluded from the license report.
[jira] [Resolved] (CONNECTORS-144) Apache headers may not be necessary for README and DISCLAIMER
[ https://issues.apache.org/jira/browse/CONNECTORS-144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-144.
    Resolution: Fixed

r1164086.

Apache headers may not be necessary for README and DISCLAIMER
    Key: CONNECTORS-144
    URL: https://issues.apache.org/jira/browse/CONNECTORS-144
    Project: ManifoldCF
    Issue Type: Bug
    Components: Documentation
    Affects Versions: ManifoldCF 0.3
    Reporter: Karl Wright
    Assignee: Karl Wright
    Priority: Minor
    Fix For: ManifoldCF 0.3

One last comment is that some of the doc files, like the README and DISCLAIMER, have the Apache License header, which I don't think is necessary. The README is the first thing people look at, so having a big glob of legal text right at the top isn't so attractive.

We'd need to check if this is indeed true. For the DISCLAIMER it may well be true, not sure about the README.
[jira] [Assigned] (CONNECTORS-240) OpenSearchServer connector needs end-user documentation
[ https://issues.apache.org/jira/browse/CONNECTORS-240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-240: -- Assignee: Karl Wright OpenSearchServer connector needs end-user documentation --- Key: CONNECTORS-240 URL: https://issues.apache.org/jira/browse/CONNECTORS-240 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.3 Attachments: oss-mfc-site.patch We need end-user documentation for the OpenSearchServer connector -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-246) A file crawl exited with an unexpected jobqueue status error under HSQLDB
[ https://issues.apache.org/jira/browse/CONNECTORS-246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094456#comment-13094456 ] Karl Wright commented on CONNECTORS-246: r1163591 for some additional debugging info should this happen again. A file crawl exited with an unexpected jobqueue status error under HSQLDB --- Key: CONNECTORS-246 URL: https://issues.apache.org/jira/browse/CONNECTORS-246 Project: ManifoldCF Issue Type: Bug Reporter: Karl Wright Assignee: Karl Wright Under HSQLDB, a file crawl terminated with: Error: Unexpected jobqueue status - record id 1314721269570, expecting active status. The full trace was:
ERROR 2011-08-30 12:23:48,962 (Worker thread '38') - Exception tossed: Unexpected jobqueue status - record id 1314721269570, expecting active status
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1314721269570, expecting active status
    at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:633)
    at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2386)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:798)
[jira] [Resolved] (CONNECTORS-245) Creating a job through the API with no schedule creates an incorrect schedule entry
[ https://issues.apache.org/jira/browse/CONNECTORS-245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-245. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 r1163744. Creating a job through the API with no schedule creates an incorrect schedule entry --- Key: CONNECTORS-245 URL: https://issues.apache.org/jira/browse/CONNECTORS-245 Project: ManifoldCF Issue Type: Bug Components: API Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.3 When I dump a job using the API that has no schedule, and create another one using the same dumped JSON parameters, I get: Scheduled time: Any day of week at midnight in January on the 1st of the month This is obviously incorrect. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-58) ManifoldCF scripting language, executed via the API, plus example jobs for file system and web crawl
[ https://issues.apache.org/jira/browse/CONNECTORS-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-58. --- Resolution: Fixed Fix Version/s: ManifoldCF 0.3 Assignee: Karl Wright Decided only to include file system crawl script, since it demonstrates pretty near everything important. r1163882 for the last of many commits for this feature. ManifoldCF scripting language, executed via the API, plus example jobs for file system and web crawl --- Key: CONNECTORS-58 URL: https://issues.apache.org/jira/browse/CONNECTORS-58 Project: ManifoldCF Issue Type: Sub-task Components: Examples, Scripting client Reporter: Jack Krupansky Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 0.3 Creating a basic connection setup to do a relatively simple crawl for a file system or web can be a daunting task for someone new to LCF. So, it would be nice to have a scripting file that supports an abbreviated API (subset of the full API discussed in CONNECTORS-56) sufficient to create a default set of connections and example jobs that the new user can choose from. Beyond this initial need, this script format might be a useful form to dump all of the connections and jobs in the LCF database in a form that can be used to recreate an LCF configuration. Kind of a dump and reload capability. That in fact might be how the initial example script gets created. Those are two distinct use cases, but could utilize the same feature. The example script could have example jobs to crawl a subdirectory of LCF, crawl the LCF wiki, etc. There could be more than one script. There might be example scripts for each form of connector. This capability should be available for both QuickStart and the general release of LCF. 
As just one possibility, the script format might be a sequence of JSON expressions, each with an initial string analogous to a servlet path to specify the operation to be performed, followed by the JSON form of the connection or job or other LCF object. Or, some other format might be more suitable. Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-248) File system crawl with HSQLDB aborts with a constraint error
[ https://issues.apache.org/jira/browse/CONNECTORS-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095067#comment-13095067 ] Karl Wright commented on CONNECTORS-248: The constraint being violated appears to be the primary key for the table, which is the id column. This is very strange because the update operation in question is not changing the id column in any way, but rather another column. The update uses the clause ... WHERE id IN (...) to apply the update to multiple rows at a single time. So I can't think of any reason this can possibly result in a unique constraint violation. This is the second bizarre bug I've seen when crawling with HSQLDB. I'm beginning to think that there's a funky sort of race condition in this database. Either that, or we're seeing a misleading error message, which really means something else (maybe that two transactions are trying to write the same record or something). I think I'm going to have to escalate this to the HSQLDB group.
File system crawl with HSQLDB aborts with a constraint error Key: CONNECTORS-248 URL: https://issues.apache.org/jira/browse/CONNECTORS-248 Project: ManifoldCF Issue Type: Bug Components: Framework agents process, Framework core Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright While running two jobs with overlapping files with HSQLDB, I got this error on the second job that aborted it: Error: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS The complete exception is here:
ERROR 2011-08-31 21:07:06,029 (Worker thread '34') - Exception tossed: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
org.apache.manifoldcf.core.interfaces.ManifoldCFException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.reinterpretException(DBInterfaceHSQLDB.java:587)
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performModification(DBInterfaceHSQLDB.java:607)
    at org.apache.manifoldcf.core.database.DBInterfaceHSQLDB.performUpdate(DBInterfaceHSQLDB.java:242)
    at org.apache.manifoldcf.core.database.BaseTable.performUpdate(BaseTable.java:88)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.updateRowIds(IncrementalIngester.java:628)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentCheckMultiple(IncrementalIngester.java:588)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:653)
Caused by: java.sql.SQLException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
    at org.hsqldb.jdbc.Util.sqlException(Util.java:255)
    at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(JDBCPreparedStatement.java:4659)
    at org.hsqldb.jdbc.JDBCPreparedStatement.executeUpdate(JDBCPreparedStatement.java:311)
    at org.apache.manifoldcf.core.database.Database.execute(Database.java:606)
    at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:421)
Caused by: org.hsqldb.HsqlException: integrity constraint violation: unique constraint or index violation; SYS_PK_10041 table: INGESTSTATUS
    at org.hsqldb.error.Error.error(Error.java:134)
    at org.hsqldb.Constraint.getException(Constraint.java:914)
    at org.hsqldb.index.IndexAVL.insert(IndexAVL.java:731)
    at org.hsqldb.persist.RowStoreAVL.indexRow(RowStoreAVL.java:171)
    at org.hsqldb.persist.RowStoreAVLDisk.indexRow(RowStoreAVLDisk.java:169)
    at org.hsqldb.TransactionManagerMVCC.addInsertAction(TransactionManagerMVCC.java:401)
    at org.hsqldb.Session.addInsertAction(Session.java:434)
    at org.hsqldb.Table.insertSingleRow(Table.java:2553)
    at org.hsqldb.StatementDML.update(StatementDML.java:1032)
    at org.hsqldb.StatementDML.executeUpdateStatement(StatementDML.java:541)
    at org.hsqldb.StatementDML.getResult(StatementDML.java:196)
    at org.hsqldb.StatementDMQL.execute(StatementDMQL.java:190)
    at org.hsqldb.Session.executeCompiledStatement(Session.java:1340)
    at org.hsqldb.Session.execute(Session.java:993)
    at org.hsqldb.jdbc.JDBCPreparedStatement.fetchResult(JDBCPreparedStatement.java:4651)
[jira] [Created] (CONNECTORS-244) Derby deadlocks in a new way on the IngestStatus table, which isn't caught and retried
Derby deadlocks in a new way on the IngestStatus table, which isn't caught and retried -- Key: CONNECTORS-244 URL: https://issues.apache.org/jira/browse/CONNECTORS-244 Project: ManifoldCF Issue Type: Bug Components: Framework agents process Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Derby deadlocks when a file system job is run, as follows:
Irrecoverable Derby deadlock at:
    at org.apache.manifoldcf.core.database.DBInterfaceDerby.reinterpretException(DBInterfaceDerby.java:803)
    at org.apache.manifoldcf.core.database.DBInterfaceDerby.performQuery(DBInterfaceDerby.java:961)
    at org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:229)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:388)
    at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:364)
    at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1555)
    at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:283)
    at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:561)
The deadlock needs to be caught, backed off, and retried.
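Editor's note: the "caught, backed off, and retried" pattern requested in CONNECTORS-244 can be sketched roughly as below. This is only an illustration under stated assumptions, not the ManifoldCF implementation: DeadlockException and DeadlockRetry are hypothetical names invented here, and the 5-second cap on the delay is an arbitrary choice.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical marker for a deadlock that the database layer detected and rolled back. */
class DeadlockException extends Exception {
    DeadlockException(String message) {
        super(message);
    }
}

/** Hypothetical helper; not part of ManifoldCF. */
final class DeadlockRetry {
    private DeadlockRetry() {}

    /**
     * Runs the operation and retries it when it reports a deadlock, sleeping a
     * random time (up to a doubling delay) between attempts. The last
     * DeadlockException is rethrown once maxAttempts is exhausted.
     */
    static <T> T run(Callable<T> operation, int maxAttempts, long baseDelayMillis) throws Exception {
        if (maxAttempts < 1 || baseDelayMillis < 0) {
            throw new IllegalArgumentException("maxAttempts must be >= 1 and baseDelayMillis >= 0");
        }
        long delay = baseDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.call();
            } catch (DeadlockException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up: the caller sees the deadlock
                }
                // Randomized back-off so competing threads do not retry in lockstep.
                Thread.sleep(ThreadLocalRandom.current().nextLong(delay + 1));
                delay = Math.min(delay * 2, 5_000L); // assumed cap of 5 seconds
            }
        }
    }
}
```

The operation passed in is expected to be a whole transaction, so that each retry starts from a clean state.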
[jira] [Commented] (CONNECTORS-243) Web crawler must get the Last-Modified HTTP header and pass it as metadata to output
[ https://issues.apache.org/jira/browse/CONNECTORS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088754#comment-13088754 ] Karl Wright commented on CONNECTORS-243: I'll try to have a look at this this evening. Web crawler must get the Last-Modified HTTP header and pass it as metadata to output -- Key: CONNECTORS-243 URL: https://issues.apache.org/jira/browse/CONNECTORS-243 Project: ManifoldCF Issue Type: New Feature Components: Web connector Affects Versions: ManifoldCF 0.2 Reporter: Jan Høydahl Labels: last-modified Last-Modified is important in web search, as it may be used for (de)boosting based on date. In fact, ManifoldCF should have the ability to parse any (or all) HTTP headers from the source document and pass them on.
[jira] [Assigned] (CONNECTORS-243) Web crawler must get the Last-Modified HTTP header and pass it as metadata to output
[ https://issues.apache.org/jira/browse/CONNECTORS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-243: -- Assignee: Karl Wright Web crawler must get the Last-Modified HTTP header and pass it as metadata to output -- Key: CONNECTORS-243 URL: https://issues.apache.org/jira/browse/CONNECTORS-243 Project: ManifoldCF Issue Type: New Feature Components: Web connector Affects Versions: ManifoldCF 0.2 Reporter: Jan Høydahl Assignee: Karl Wright Labels: last-modified Last-Modified is important in web search, as it may be used for (de)boosting based on date. In fact, ManifoldCF should have the ability to parse any (or all) HTTP headers from the source document and pass them on.
[jira] [Commented] (CONNECTORS-243) Web crawler must get the Last-Modified HTTP header and pass it as metadata to output
[ https://issues.apache.org/jira/browse/CONNECTORS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089128#comment-13089128 ] Karl Wright commented on CONNECTORS-243: Looking at this further, there are a number of headers that would be bad to include in metadata. For example, you would not want to include anything authentication-related or session-related. Any transient information should also be excluded, since it would leave ManifoldCF unable to avoid refetching the document on each job run. Here's the list of exclusions I've come up with so far: Age, WWW-Authenticate, Proxy-Authenticate, Date, Set-cookie, Via. Any I've missed? Web crawler must get the Last-Modified HTTP header and pass it as metadata to output -- Key: CONNECTORS-243 URL: https://issues.apache.org/jira/browse/CONNECTORS-243 Project: ManifoldCF Issue Type: New Feature Components: Web connector Affects Versions: ManifoldCF 0.2 Reporter: Jan Høydahl Assignee: Karl Wright Labels: last-modified Last-Modified is important in web search, as it may be used for (de)boosting based on date. In fact, ManifoldCF should have the ability to parse any (or all) HTTP headers from the source document and pass them on.
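Editor's note: a rough sketch of the header filtering discussed in the CONNECTORS-243 comment above. It assumes the response headers arrive as a name-to-values map (the shape returned by java.net.HttpURLConnection.getHeaderFields()); the class and method names are hypothetical and are not taken from the web connector. The exclusion list is the one proposed in the comment, matched case-insensitively.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper; not the ManifoldCF web connector code. */
final class HeaderMetadataFilter {
    // Exclusions proposed in the comment; HTTP header names are case-insensitive.
    private static final Set<String> EXCLUDED = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);

    static {
        EXCLUDED.addAll(Arrays.asList(
                "Age", "WWW-Authenticate", "Proxy-Authenticate", "Date", "Set-Cookie", "Via"));
    }

    private HeaderMetadataFilter() {}

    /** Returns the headers that are stable enough to pass on as document metadata. */
    static Map<String, List<String>> toMetadata(Map<String, List<String>> responseHeaders) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> header : responseHeaders.entrySet()) {
            String name = header.getKey();
            // getHeaderFields() maps the status line to a null key; skip it as well.
            if (name == null || EXCLUDED.contains(name)) {
                continue;
            }
            metadata.put(name, header.getValue());
        }
        return metadata;
    }
}
```

Keeping transient headers such as Date out of the metadata matters because a value that changes on every fetch makes the document look modified on every job run.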
[jira] [Resolved] (CONNECTORS-243) Web crawler must get the Last-Modified HTTP header and pass it as metadata to output
[ https://issues.apache.org/jira/browse/CONNECTORS-243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-243. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 Web crawler must get the Last-Modified HTTP header and pass it as metadata to output -- Key: CONNECTORS-243 URL: https://issues.apache.org/jira/browse/CONNECTORS-243 Project: ManifoldCF Issue Type: New Feature Components: Web connector Affects Versions: ManifoldCF 0.2 Reporter: Jan Høydahl Assignee: Karl Wright Labels: last-modified Fix For: ManifoldCF 0.3 Last-Modified is important in web search, as it may be used for (de)boosting based on date. In fact, ManifoldCF should have the ability to parse any (or all) HTTP headers from the source document and pass them on.
[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086245#comment-13086245 ] Karl Wright commented on CONNECTORS-224: Thanks for the update - I've merged it into the branch. At this point I think we're just about ready to merge the branch into trunk. I agree with the decision to use a synchronizer to prevent leaks, although I'd love it if there was more of a description of the problem in comments in the code. But that can wait until after this connector hits trunk. OpenSearchServer connector -- Key: CONNECTORS-224 URL: https://issues.apache.org/jira/browse/CONNECTORS-224 Project: ManifoldCF Issue Type: New Feature Components: OpenSearchServer connector Affects Versions: ManifoldCF 0.3 Reporter: Emmanuel Keller Assignee: Karl Wright Labels: OpenSearchServer, connector, outputconnector Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch, oss-mfc-rc1.patch Original Estimate: 336h Remaining Estimate: 336h Provide an output connector for [OpenSearchServer|http://www.open-search-server.com]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13086267#comment-13086267 ] Karl Wright commented on CONNECTORS-224: Committed to trunk. r1158655. OpenSearchServer connector -- Key: CONNECTORS-224 URL: https://issues.apache.org/jira/browse/CONNECTORS-224 Project: ManifoldCF Issue Type: New Feature Components: OpenSearchServer connector Affects Versions: ManifoldCF 0.3 Reporter: Emmanuel Keller Assignee: Karl Wright Labels: OpenSearchServer, connector, outputconnector Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch, oss-mfc-rc1.patch Original Estimate: 336h Remaining Estimate: 336h Provide an output connector for [OpenSearchServer|http://www.open-search-server.com]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-224. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 OpenSearchServer connector -- Key: CONNECTORS-224 URL: https://issues.apache.org/jira/browse/CONNECTORS-224 Project: ManifoldCF Issue Type: New Feature Components: OpenSearchServer connector Affects Versions: ManifoldCF 0.3 Reporter: Emmanuel Keller Assignee: Karl Wright Labels: OpenSearchServer, connector, outputconnector Fix For: ManifoldCF 0.3 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch, oss-mfc-rc1.patch Original Estimate: 336h Remaining Estimate: 336h Provide an output connector for [OpenSearchServer|http://www.open-search-server.com]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-241) OpenSearchServer connector needs how-to-build-and-deploy section
OpenSearchServer connector needs how-to-build-and-deploy section Key: CONNECTORS-241 URL: https://issues.apache.org/jira/browse/CONNECTORS-241 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright The how-to-build-and-deploy page needs to be updated to include the OpenSearchServer connector info. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-240) OpenSearchServer connector needs end-user documentation
OpenSearchServer connector needs end-user documentation --- Key: CONNECTORS-240 URL: https://issues.apache.org/jira/browse/CONNECTORS-240 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright We need end-user documentation for the OpenSearchServer connector -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-242) Connector support matrix page needs to be updated with both OpenSearchServer supported versions and CMIS supported versions
Connector support matrix page needs to be updated with both OpenSearchServer supported versions and CMIS supported versions --- Key: CONNECTORS-242 URL: https://issues.apache.org/jira/browse/CONNECTORS-242 Project: ManifoldCF Issue Type: Improvement Components: Documentation Affects Versions: ManifoldCF 0.3 Reporter: Karl Wright Connector support matrix page needs to be updated with both OpenSearchServer supported versions and CMIS supported versions -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-157) Root-relative paths without leading / do not resolve properly
[ https://issues.apache.org/jira/browse/CONNECTORS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085959#comment-13085959 ] Karl Wright commented on CONNECTORS-157: I can see what the problem is right away. The relative URLs are not legitimate according to the W3C. It looks like I will need to replace Java's URL implementation with my own in order to make such a thing work. I will experiment and get back to you. Root-relative paths without leading / do not resolve properly - Key: CONNECTORS-157 URL: https://issues.apache.org/jira/browse/CONNECTORS-157 Project: ManifoldCF Issue Type: Bug Components: Web connector Affects Versions: ManifoldCF 0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. This is apparently a bug, but we need to find a way to work around it properly.
[jira] [Reopened] (CONNECTORS-157) Root-relative paths without leading / do not resolve properly
[ https://issues.apache.org/jira/browse/CONNECTORS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reopened CONNECTORS-157: Going to make this ticket more broad Root-relative paths without leading / do not resolve properly - Key: CONNECTORS-157 URL: https://issues.apache.org/jira/browse/CONNECTORS-157 Project: ManifoldCF Issue Type: Bug Components: Web connector Affects Versions: ManifoldCF 0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. This is apparently a bug, but we need to find a way to work around it properly.
[jira] [Updated] (CONNECTORS-157) Some relative paths without leading / do not resolve properly
[ https://issues.apache.org/jira/browse/CONNECTORS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-157: --- Description: If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. Another similar case is the following: http://foo.com/bar/xyz.asmx?q=hello ... with a relative URL of q=there ... produces: http://foo.com/bar/?q=there, incorrectly losing the last part of the path. This is apparently a bug, but we need to find a way to work around it properly. was: If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. This is apparently a bug, but we need to find a way to work around it properly. Summary: Some relative paths without leading / do not resolve properly (was: Root-relative paths without leading / do not resolve properly) Some relative paths without leading / do not resolve properly - Key: CONNECTORS-157 URL: https://issues.apache.org/jira/browse/CONNECTORS-157 Project: ManifoldCF Issue Type: Bug Components: Web connector Affects Versions: ManifoldCF 0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. Another similar case is the following: http://foo.com/bar/xyz.asmx?q=hello ... with a relative URL of q=there ... produces: http://foo.com/bar/?q=there, incorrectly losing the last part of the path. This is apparently a bug, but we need to find a way to work around it properly.
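Editor's note: one possible workaround for the two cases described in CONNECTORS-157, sketched on top of java.net.URI. This is not the fix committed to the web or RSS connectors; the class name is hypothetical. It assumes hierarchical http/https base URLs, and it assumes that the second case uses a query-only relative reference written as ?q=there (the issue text shows it as q=there). The two special cases follow the RFC 3986 resolution rules: an empty base path is treated as "/", and a query-only reference keeps the base path.

```java
import java.net.URI;

/** Hypothetical helper; assumes hierarchical (http/https) base URLs. */
final class RelativeUrls {
    private RelativeUrls() {}

    static URI resolve(URI base, String relative) {
        String prefix = base.getScheme() + "://" + base.getRawAuthority();

        // Case 1: a bare-domain base such as http://foo.com has an empty path.
        // Treat it as "/" so the relative path cannot be glued onto the host name.
        String rawPath = base.getRawPath();
        if (rawPath == null || rawPath.isEmpty()) {
            rawPath = "/";
        }

        // Case 2: a query-only reference keeps the whole base path and only
        // replaces the query, so the last path segment is not lost.
        if (relative.startsWith("?")) {
            return URI.create(prefix + rawPath + relative);
        }

        // Everything else is left to java.net.URI, using the normalized base.
        String rawQuery = base.getRawQuery();
        URI normalizedBase = URI.create(prefix + rawPath + (rawQuery == null ? "" : "?" + rawQuery));
        return normalizedBase.resolve(relative);
    }
}
```

With this helper, resolving document.pdf against http://foo.com gives http://foo.com/document.pdf, and resolving ?q=there against http://foo.com/bar/xyz.asmx?q=hello gives http://foo.com/bar/xyz.asmx?q=there. Raw (still-escaped) components are used throughout so that percent-escapes in the inputs are not escaped a second time.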
[jira] [Resolved] (CONNECTORS-157) Some relative paths without leading / do not resolve properly
[ https://issues.apache.org/jira/browse/CONNECTORS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-157. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 Some relative paths without leading / do not resolve properly - Key: CONNECTORS-157 URL: https://issues.apache.org/jira/browse/CONNECTORS-157 Project: ManifoldCF Issue Type: Bug Components: Web connector Affects Versions: ManifoldCF 0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.3, ManifoldCF 0.2 If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. Another similar case is the following: http://foo.com/bar/xyz.asmx?q=hello ... with a relative URL of q=there ... produces: http://foo.com/bar/?q=there, incorrectly losing the last part of the path. This is apparently a bug, but we need to find a way to work around it properly.
[jira] [Commented] (CONNECTORS-157) Some relative paths without leading / do not resolve properly
[ https://issues.apache.org/jira/browse/CONNECTORS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086059#comment-13086059 ] Karl Wright commented on CONNECTORS-157: r1158476 to port the same fixes to the RSS connector. Some relative paths without leading / do not resolve properly - Key: CONNECTORS-157 URL: https://issues.apache.org/jira/browse/CONNECTORS-157 Project: ManifoldCF Issue Type: Bug Components: RSS connector, Web connector Affects Versions: ManifoldCF 0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2, ManifoldCF 0.3 If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. Another similar case is the following: http://foo.com/bar/xyz.asmx?q=hello ... with a relative URL of q=there ... produces: http://foo.com/bar/?q=there, incorrectly losing the last part of the path. This is apparently a bug, but we need to find a way to work around it properly.
[jira] [Commented] (CONNECTORS-157) Some relative paths without leading / do not resolve properly
[ https://issues.apache.org/jira/browse/CONNECTORS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086062#comment-13086062 ] Karl Wright commented on CONNECTORS-157: r1158480 to add similar tests to the RSS connector. Some relative paths without leading / do not resolve properly - Key: CONNECTORS-157 URL: https://issues.apache.org/jira/browse/CONNECTORS-157 Project: ManifoldCF Issue Type: Bug Components: RSS connector, Web connector Affects Versions: ManifoldCF 0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2, ManifoldCF 0.3 If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. Another similar case is the following: http://foo.com/bar/xyz.asmx?q=hello ... with a relative URL of q=there ... produces: http://foo.com/bar/?q=there, incorrectly losing the last part of the path. This is apparently a bug, but we need to find a way to work around it properly.
[jira] [Commented] (CONNECTORS-157) Root-relative paths without leading / do not resolve properly
[ https://issues.apache.org/jira/browse/CONNECTORS-157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085171#comment-13085171 ] Karl Wright commented on CONNECTORS-157: What is the URL of the page with the link? Is it http://mysare.sare.org/MySareF?do=searchProjq=*amp;searchmethod=andregion=state=projType=0sortby=1page=1? Because, if so, I don't see where /ProjectReport.aspx is supposed to be coming from. The relative URL composition rules in Java in general adhere to the W3C specifications. The problem is often that browsers do somewhat different things than the W3C spec. So let's work with your specific case and figure out what's happening. The only two inputs are: (a) the URL of the page, and (b) the relative URL of the reference. Since I don't see /ProjectReport.aspx in either one, it must either be in the page URL, or there must be a redirection taking place. Root-relative paths without leading / do not resolve properly - Key: CONNECTORS-157 URL: https://issues.apache.org/jira/browse/CONNECTORS-157 Project: ManifoldCF Issue Type: Bug Components: Web connector Affects Versions: ManifoldCF 0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.2 If a document has a URL which is just the domain, e.g. http://foo.com, the java.net.URI class fails to resolve URLs in that document which have no starting /, e.g. document.pdf. The resolved URI has no path part, e.g. http://foo.comdocument.pdf. This is apparently a bug, but we need to find a way to work around it properly.
[jira] [Created] (CONNECTORS-239) RSS Connector chromed mode does not work properly in the UI
RSS Connector chromed mode does not work properly in the UI --- Key: CONNECTORS-239 URL: https://issues.apache.org/jira/browse/CONNECTORS-239 Project: ManifoldCF Issue Type: Bug Components: RSS connector Affects Versions: ManifoldCF 0.2, ManifoldCF 0.1, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Priority: Minor The top chromed mode button, if selected, becomes unselected when the form is reposted. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-239) RSS Connector chromed mode does not work properly in the UI
[ https://issues.apache.org/jira/browse/CONNECTORS-239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085385#comment-13085385 ] Karl Wright commented on CONNECTORS-239: r1158039 RSS Connector chromed mode does not work properly in the UI --- Key: CONNECTORS-239 URL: https://issues.apache.org/jira/browse/CONNECTORS-239 Project: ManifoldCF Issue Type: Bug Components: RSS connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 0.3 The top chromed mode button, if selected, becomes unselected when the form is reposted.
[jira] [Resolved] (CONNECTORS-239) RSS Connector chromed mode does not work properly in the UI
[ https://issues.apache.org/jira/browse/CONNECTORS-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-239. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 RSS Connector chromed mode does not work properly in the UI --- Key: CONNECTORS-239 URL: https://issues.apache.org/jira/browse/CONNECTORS-239 Project: ManifoldCF Issue Type: Bug Components: RSS connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 0.3 The top chromed mode button, if selected, becomes unselected when the form is reposted.
[jira] [Updated] (CONNECTORS-213) DBInterfaceMySQL Initalization error
[ https://issues.apache.org/jira/browse/CONNECTORS-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-213: --- Resolution: Fixed Status: Resolved (was: Patch Available) DBInterfaceMySQL Initalization error Key: CONNECTORS-213 URL: https://issues.apache.org/jira/browse/CONNECTORS-213 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 0.3 Reporter: Emanuele Lombardi Assignee: Karl Wright Labels: Mysql, bug, manifoldcf Fix For: ManifoldCF 0.3 Attachments: DBInterfaceMySQL.patch When I try to create a new database using DBCreate and the MySQL DBInterface, I get this exception: Configuration file successfully read Exception in thread "main" java.lang.NullPointerException at org.apache.manifoldcf.core.interfaces.CacheManagerFactory.make(CacheManagerFactory.java:40) at org.apache.manifoldcf.core.database.Database.<init>(Database.java:63) at org.apache.manifoldcf.core.database.DBInterfaceMySQL.createUserAndDatabase(DBInterfaceMySQL.java:391) at org.apache.manifoldcf.core.system.ManifoldCF.createSystemDatabase(ManifoldCF.java:656) at org.apache.manifoldcf.core.DBCreate.doExecute(DBCreate.java:51) at org.apache.manifoldcf.core.DBInitializationCommand.execute(DBInitializationCommand.java:54) at org.apache.manifoldcf.core.DBCreate.main(DBCreate.java:80)
[jira] [Commented] (CONNECTORS-213) DBInterfaceMySQL Initalization error
[ https://issues.apache.org/jira/browse/CONNECTORS-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084038#comment-13084038 ] Karl Wright commented on CONNECTORS-213: I haven't heard anything further, so I'm closing this ticket. I'll either reopen it, or open a new one, if there are further MySQL issues/patches reported. DBInterfaceMySQL Initalization error Key: CONNECTORS-213 URL: https://issues.apache.org/jira/browse/CONNECTORS-213 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 0.3 Reporter: Emanuele Lombardi Labels: Mysql, bug, manifoldcf Fix For: ManifoldCF 0.3 Attachments: DBInterfaceMySQL.patch When I try to create a new database using DBCreate and the MySQL DBInterface, I get this exception: Configuration file successfully read Exception in thread "main" java.lang.NullPointerException at org.apache.manifoldcf.core.interfaces.CacheManagerFactory.make(CacheManagerFactory.java:40) at org.apache.manifoldcf.core.database.Database.<init>(Database.java:63) at org.apache.manifoldcf.core.database.DBInterfaceMySQL.createUserAndDatabase(DBInterfaceMySQL.java:391) at org.apache.manifoldcf.core.system.ManifoldCF.createSystemDatabase(ManifoldCF.java:656) at org.apache.manifoldcf.core.DBCreate.doExecute(DBCreate.java:51) at org.apache.manifoldcf.core.DBInitializationCommand.execute(DBInitializationCommand.java:54) at org.apache.manifoldcf.core.DBCreate.main(DBCreate.java:80)
[jira] [Commented] (CONNECTORS-226) Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified
[ https://issues.apache.org/jira/browse/CONNECTORS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084043#comment-13084043 ] Karl Wright commented on CONNECTORS-226: After some thought, I believe the correct approach is to convert Livelink's usage to a ServiceInterruption. Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified Key: CONNECTORS-226 URL: https://issues.apache.org/jira/browse/CONNECTORS-226 Project: ManifoldCF Issue Type: Bug Components: Framework core, JCIFS connector, LiveLink connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright The ManifoldCFException type REPOSITORY_CONNECTION_ERROR seems to be treated by the framework somewhat inconsistently. In some places it is treated as a permanent connection exception, and in others as a temporary connection exception (in lieu of a ServiceInterruption where ServiceInterruption is not possible). Only two connectors use it (LiveLink and jCIFS), and the JCIFS case is not interesting. So really this is currently here to support Livelink. There are two ways forward. The first way is to convert the Livelink connector's exception to a true ServiceInterruption, and revert REPOSITORY_CONNECTION_ERROR to its original meaning, which has now been deprecated as a result of the fact that connect() methods can no longer throw ManifoldCFExceptions at all. The second is to continue the current Livelink-style usage, and make ALL usages consistent with that.
[jira] [Resolved] (CONNECTORS-226) Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified
[ https://issues.apache.org/jira/browse/CONNECTORS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-226. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 r1157065 Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified Key: CONNECTORS-226 URL: https://issues.apache.org/jira/browse/CONNECTORS-226 Project: ManifoldCF Issue Type: Bug Components: Framework core, JCIFS connector, LiveLink connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.3 The ManifoldCFException type REPOSITORY_CONNECTION_ERROR seems to be treated by the framework somewhat inconsistently. In some places it is treated as a permanent connection exception, and in others as a temporary connection exception (in lieu of a ServiceInterruption where ServiceInterruption is not possible). Only two connectors use it (LiveLink and jCIFS), and the JCIFS case is not interesting. So really this is currently here to support Livelink. There are two ways forward. The first way is to convert the Livelink connector's exception to a true ServiceInterruption, and revert REPOSITORY_CONNECTION_ERROR to its original meaning, which has now been deprecated as a result of the fact that connect() methods can no longer throw ManifoldCFExceptions at all. The second is to continue the current Livelink-style usage, and make ALL usages consistent with that.
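The distinction that CONNECTORS-226 turns on, a temporary failure that should be retried versus a permanent one that should not, can be sketched in isolation. The code below does not use the real ManifoldCF classes: RetryLater and PermanentFailure are hypothetical stand-ins for ServiceInterruption and a permanent ManifoldCFException, and treating every IOException as transient is an assumption made only for the example, not something the ticket states.

```java
import java.io.IOException;

// Illustration only: these types are hypothetical stand-ins and do NOT
// reproduce the ManifoldCF ServiceInterruption or ManifoldCFException APIs.
final class TransientFailureSketch {
    private TransientFailureSketch() {}

    // Stand-in for a "try again later" signal carrying the earliest retry time.
    static final class RetryLater extends Exception {
        final long retryTimeMillis;

        RetryLater(String message, Throwable cause, long retryTimeMillis) {
            super(message, cause);
            this.retryTimeMillis = retryTimeMillis;
        }
    }

    // Stand-in for a failure that retrying will not fix.
    static final class PermanentFailure extends Exception {
        PermanentFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Translates a low-level failure into one of the two categories.
    // Assumption for this sketch: I/O errors are transient, anything else
    // is permanent. A real connector would need its own classification.
    static Exception translate(Exception cause, long nowMillis, long retryDelayMillis) {
        if (cause instanceof IOException) {
            return new RetryLater(
                "Repository temporarily unavailable: " + cause.getMessage(),
                cause,
                nowMillis + retryDelayMillis);
        }
        return new PermanentFailure(
            "Repository request failed: " + cause.getMessage(), cause);
    }
}
```

Keeping the two outcomes as separate types means a caller cannot accidentally treat one error code as temporary in one place and permanent in another, which is the inconsistency the ticket describes.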
[jira] [Commented] (CONNECTORS-55) Bundle database server with ManifoldCF packaged product
[ https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084063#comment-13084063 ] Karl Wright commented on CONNECTORS-55: --- HSQLDB support has been added, which works a lot better than Derby does, so I think we've finally hit the necessary criteria for closing out this ticket. Bundle database server with ManifoldCF packaged product --- Key: CONNECTORS-55 URL: https://issues.apache.org/jira/browse/CONNECTORS-55 Project: ManifoldCF Issue Type: Sub-task Components: Installers Reporter: Jack Krupansky Fix For: ManifoldCF 0.3 The current requirement that the user install and deploy a PostgreSQL server complicates the installation and deployment of LCF for the user. Installation and deployment of LCF should be as simple as Solr itself. QuickStart is great for the low-end and basic evaluation, but a comparable level of simplified installation and deployment is still needed for full-blown, high-end environments that need the full performance of a PostgreSQL-class database server. So, PostgreSQL should be bundled with the packaged release of LCF so that installation and deployment of LCF will automatically install and deploy a subset of the full PostgreSQL distribution that is sufficient for the needs of LCF. Starting LCF, with or without the LCF UI, should automatically start the database server. Shutting down LCF should also shutdown the database server process. A typical use case would be for a non-developer who is comfortable with Solr and simply wants to crawl documents from, for example, a SharePoint repository and feed them into Solr. QuickStart should work well for the low end or in the early stages of evaluation, but the user would prefer to evaluate the real thing with something resembling a production crawl of thousands of documents.
Such a user might not be a hard-core developer or be comfortable fiddling with a lot of software components simply to do one conceptually simple operation. It should still be possible for the user to supply database server settings to override the defaults, but the LCF package should have all of the best-practice settings deemed appropriate for use with LCF. One downside is that installation and deployment will be platform-specific since there are multiple processes and PostgreSQL itself requires a platform-specific installation. This proposal presumes that PostgreSQL is the best option for the foreseeable future, but nothing here is intended to preclude support for other database servers in future releases. This proposal should not have any impact on QuickStart packaging or deployment. Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.