[jira] [Resolved] (CONNECTORS-85) RSS tests need to be written
[ https://issues.apache.org/jira/browse/CONNECTORS-85?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-85.
-----------------------------------

Resolution: Fixed
Fix Version/s: (was: ManifoldCF next) ManifoldCF 1.1

We have both an RSS load test and some unit tests for RSS. We could still use an RSS integration test, but that will come over time, hopefully.

RSS tests need to be written

Key: CONNECTORS-85
URL: https://issues.apache.org/jira/browse/CONNECTORS-85
Project: ManifoldCF
Issue Type: Test
Components: Tests
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

RSS connector unit tests, which set up a proper RSS test environment, need to be written.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CONNECTORS-85) RSS tests need to be written
[ https://issues.apache.org/jira/browse/CONNECTORS-85?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright reassigned CONNECTORS-85:
-------------------------------------

Assignee: Karl Wright

RSS tests need to be written

Key: CONNECTORS-85
URL: https://issues.apache.org/jira/browse/CONNECTORS-85
Project: ManifoldCF
Issue Type: Test
Components: Tests
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF next

RSS connector unit tests, which set up a proper RSS test environment, need to be written.
[jira] [Updated] (CONNECTORS-61) Support bundling of LCF with an app
[ https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-61:
----------------------------------

Fix Version/s: (was: ManifoldCF next) ManifoldCF 0.6

Support bundling of LCF with an app
-----------------------------------

Key: CONNECTORS-61
URL: https://issues.apache.org/jira/browse/CONNECTORS-61
Project: ManifoldCF
Issue Type: Sub-task
Components: Documentation, Framework crawler agent
Affects Versions: ManifoldCF 0.3
Reporter: Jack Krupansky
Assignee: Karl Wright
Priority: Minor
Fix For: ManifoldCF 0.6

It should be possible for an application developer to bundle LCF with an application to facilitate installation and deployment of the application in conjunction with LCF. This may (or may not) be as simple as providing appropriate jar files and documentation for how to use them, but there may be other components or scripts needed.

There are two options:
1) include the LCF UI along with the other LCF processes, and
2) exclude the LCF UI and include only the other processes that can be controlled via the full API.

The database server would be included. The web app server would be optional since the application may have its own choice of web app server.

One use case is bundling LCF with Solr or a Solr-based application.

Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.
[jira] [Updated] (CONNECTORS-27) Add support for observation to the crawler agent
[ https://issues.apache.org/jira/browse/CONNECTORS-27?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-27:
----------------------------------

Fix Version/s: (was: ManifoldCF next) ManifoldCF 1.1

Add support for observation to the crawler agent

Key: CONNECTORS-27
URL: https://issues.apache.org/jira/browse/CONNECTORS-27
Project: ManifoldCF
Issue Type: New Feature
Components: Framework crawler agent
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
Reporter: Ralph Benjamin Ruijs
Priority: Minor
Fix For: ManifoldCF 1.1
Attachments: ASF.LICENSE.NOT.GRANTED--Added_observation_logic_to_the_crawler.patch

When crawling a large repository, it could take a lot of time before changes are propagated to Solr. You can add an event listener to the repository, and be notified about changes. The crawler will ensure you have a complete copy in case of missed events.
[jira] [Resolved] (CONNECTORS-15) Documentum Connector testing code references a not-present class
[ https://issues.apache.org/jira/browse/CONNECTORS-15?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-15.
-----------------------------------

Resolution: Won't Fix

This code is legacy and is unlikely to be used.

Documentum Connector testing code references a not-present class

Key: CONNECTORS-15
URL: https://issues.apache.org/jira/browse/CONNECTORS-15
Project: ManifoldCF
Issue Type: Test
Components: Documentum connector
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2
Reporter: Karl Wright
Fix For: ManifoldCF 1.1

The Documentum connector Java testing code references a class from TrinityTechnologies, which was not granted. This class reference should be removed and replaced by direct references to the appropriate DFC methods.
[jira] [Resolved] (CONNECTORS-566) Simple JDBC Authority connector
[ https://issues.apache.org/jira/browse/CONNECTORS-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-566.
------------------------------------

Resolution: Fixed

Simple JDBC Authority connector
-------------------------------

Key: CONNECTORS-566
URL: https://issues.apache.org/jira/browse/CONNECTORS-566
Project: ManifoldCF
Issue Type: Improvement
Components: JDBC connector
Affects Versions: ManifoldCF 1.0.1
Reporter: Maciej Lizewski
Assignee: Maciej Lizewski
Fix For: ManifoldCF 1.1

This connector is for scenarios where privileges are stored in an SQL database, with user records and the authorization tokens (groups, etc.) assigned to them. It could help when you have to index a number of websites that share a common user-group authorization scheme kept in a database. It could also help in showcases, presentations, and demo versions of the solr-manifold-search client to show how authorization works.

(Of course, I have already written this connector as an additional class in the default JDBC connector, so that I could use some of its classes :) )
[jira] [Resolved] (CONNECTORS-563) Extended LDAP authority connector
[ https://issues.apache.org/jira/browse/CONNECTORS-563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-563.
------------------------------------

Resolution: Fixed

Extended LDAP authority connector
---------------------------------

Key: CONNECTORS-563
URL: https://issues.apache.org/jira/browse/CONNECTORS-563
Project: ManifoldCF
Issue Type: Improvement
Components: LDAP authority
Affects Versions: ManifoldCF 1.0.1
Reporter: Maciej Lizewski
Assignee: Maciej Lizewski
Fix For: ManifoldCF 1.1
Attachments: CONNECTORS-563-JapanesetTranslation.patch, CONNECTORS-566-JapanesetTranslation.patch

1. Possibility to include the username in authority tokens. Because tokens are mapped to filesystem privileges, documents may have per-group or per-user rights assigned to them, so it is necessary to check for user permissions as well.
2. Possibility to search groups by user name or by user DN. There are two usage scenarios, one involving GroupOfNames/UniqueGroupOfNames and one involving PosixGroups. The first needs to search by DN, the other by user name/uid.
3. Allow binding to the LDAP server with specified credentials.
[jira] [Created] (CONNECTORS-581) Move OpenSearchServer and ElasticSearch connectors off of commons-httpclient and onto httpcomponents
Karl Wright created CONNECTORS-581:
------------------------------------

Summary: Move OpenSearchServer and ElasticSearch connectors off of commons-httpclient and onto httpcomponents
Key: CONNECTORS-581
URL: https://issues.apache.org/jira/browse/CONNECTORS-581
Project: ManifoldCF
Issue Type: Bug
Components: Elastic Search connector, OpenSearchServer connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

Original port of code onto HttpComponents 4.x overlooked these connectors; finish the job.
[jira] [Created] (CONNECTORS-582) Either wait for httpcomponents 4.2.3, or include local copy of NTLM implementation and work with 4.2.2
Karl Wright created CONNECTORS-582:
------------------------------------

Summary: Either wait for httpcomponents 4.2.3, or include local copy of NTLM implementation and work with 4.2.2
Key: CONNECTORS-582
URL: https://issues.apache.org/jira/browse/CONNECTORS-582
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

The NTLM fixes I submitted to the Httpcomponents project are not yet released. So we have a choice: either delay MCF 1.1 until HttpClient 4.2.3 has shipped, or supply our own NTLM implementation that we can remove at the next chance we have. In any case we need to do something.
[jira] [Resolved] (CONNECTORS-580) Remove commons-httpclient-mcf from dependencies and from build.xml and from pom.xml
[ https://issues.apache.org/jira/browse/CONNECTORS-580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-580.
------------------------------------

Resolution: Fixed

Remove commons-httpclient-mcf from dependencies and from build.xml and from pom.xml
-----------------------------------------------------------------------------------

Key: CONNECTORS-580
URL: https://issues.apache.org/jira/browse/CONNECTORS-580
Project: ManifoldCF
Issue Type: Improvement
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

The commons-httpclient-mcf dependency should no longer be required. Let's get rid of it.
[jira] [Commented] (CONNECTORS-580) Remove commons-httpclient-mcf from dependencies and from build.xml and from pom.xml
[ https://issues.apache.org/jira/browse/CONNECTORS-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527648#comment-13527648 ]

Karl Wright commented on CONNECTORS-580:
-----------------------------------------

r1419177 to redo this fix.

Remove commons-httpclient-mcf from dependencies and from build.xml and from pom.xml
-----------------------------------------------------------------------------------

Key: CONNECTORS-580
URL: https://issues.apache.org/jira/browse/CONNECTORS-580
Project: ManifoldCF
Issue Type: Improvement
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

The commons-httpclient-mcf dependency should no longer be required. Let's get rid of it.
[jira] [Created] (CONNECTORS-583) MySQL locks time out and are not caught
Karl Wright created CONNECTORS-583:
------------------------------------

Summary: MySQL locks time out and are not caught
Key: CONNECTORS-583
URL: https://issues.apache.org/jira/browse/CONNECTORS-583
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

MySQL locks time out and are not caught. The SQL code is 41000. Here's the exception:

012/12/07 23:08:14 ERROR (Stuffer thread) - Stuffer thread aborting and restarting due to database connection reset: Database exception: SQLException doing query (41000): Lock wait timeout exceeded; try restarting transaction
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Database exception: SQLException doing query (41000): Lock wait timeout exceeded; try restarting transaction
    at org.apache.manifoldcf.core.database.Database.executeViaThread(Database.java:681)
    at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:709)
    at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1394)
    at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
    at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:186)
    at org.apache.manifoldcf.core.database.DBInterfaceMySQL.performQuery(DBInterfaceMySQL.java:882)
    at org.apache.manifoldcf.crawler.jobs.JobManager.fetchAndProcessDocuments(JobManager.java:2260)
    at org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:2066)
    at org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:157)
Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3609)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3541)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2002)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2163)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2624)
    at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2127)
    at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:2293)
    at org.apache.manifoldcf.core.database.Database.execute(Database.java:826)
    at org.apache.manifoldcf.core.database.Database$ExecuteQueryThread.run(Database.java:641)
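One way a caller might recognize this condition and retry the work instead of letting the thread abort is sketched below. This is only an illustration under stated assumptions: the class and method names are hypothetical and are not ManifoldCF's code, and the check relies on the driver reporting the SQLState 41000 quoted in the log message of this issue.

```java
import java.sql.SQLException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of ManifoldCF: retries a database action when
// the driver reports the lock-wait-timeout SQLState quoted in this issue.
final class LockTimeoutRetry {
  // Assumption: the SQLState matches the "(41000)" shown in the log message.
  static final String LOCK_WAIT_TIMEOUT_STATE = "41000";

  // True when the exception carries the lock-wait-timeout SQLState.
  static boolean isLockWaitTimeout(SQLException e) {
    return LOCK_WAIT_TIMEOUT_STATE.equals(e.getSQLState());
  }

  // Runs the action, retrying on lock wait timeout up to maxAttempts times.
  // Any other exception is rethrown immediately.
  static <T> T withRetry(Callable<T> action, int maxAttempts) throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    SQLException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return action.call();
      } catch (SQLException e) {
        if (!isLockWaitTimeout(e)) {
          throw e;
        }
        last = e; // remember the timeout and try the transaction again
      }
    }
    throw last; // every attempt timed out
  }
}
```

A real fix would also need to roll back and restart the enclosing transaction before retrying, as the error text suggests; the sketch leaves that to the supplied action.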
[jira] [Created] (CONNECTORS-584) Crawling with MySQL does not use indexes for order-by on critical queries
Karl Wright created CONNECTORS-584:
------------------------------------

Summary: Crawling with MySQL does not use indexes for order-by on critical queries
Key: CONNECTORS-584
URL: https://issues.apache.org/jira/browse/CONNECTORS-584
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

The following ORDER-BY query is taking a long time on MySQL:

{code}
# Time: 121204 16:25:40
# User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1]
# Query_time: 7.240532 Lock_time: 0.000204 Rows_sent: 1200 Rows_examined: 611091
SET timestamp=1354605940;
SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset
  FROM jobqueue t0
  WHERE t0.status IN ('P','G') AND t0.checkaction='R' AND t0.checktime=1354605932817
    AND EXISTS(SELECT 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND t1.id=t0.jobid AND t1.priority=5)
    AND NOT EXISTS(SELECT 'x' FROM jobqueue t2 WHERE t2.dochash=t0.dochash AND t2.status IN ('A','F','a','f','D','d') AND t2.jobid!=t0.jobid)
    AND NOT EXISTS(SELECT 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner AND t3.eventname=t4.name)
  ORDER BY t0.docpriority ASC,t0.status ASC,t0.checkaction ASC,t0.checktime ASC
  LIMIT 1200;
# Time: 121204 16:25:44
# User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1]
# Query_time: 3.064339 Lock_time: 0.84 Rows_sent: 1 Rows_examined: 406359
SET timestamp=1354605944;
SELECT docpriority,jobid,dochash,docid
  FROM jobqueue t0
  WHERE status IN ('P','G') AND checkaction='R' AND checktime=1354605932817
    AND EXISTS(SELECT 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND t1.id=t0.jobid)
  ORDER BY docpriority ASC,status ASC,checkaction ASC,checktime ASC
  LIMIT 1;
---
{code}

I wonder whether these queries make appropriate use of the table's indexes. Running EXPLAIN against the slow query showed a filesort.

There seem to be some conditions under which MySQL does not use an index, depending on the ORDER BY:
- Executing ORDER BY against multiple keys
- When the keys used to select records are different from the keys used by ORDER BY

Since a filesort was happening, the full scan of the records is probably what makes MCF slower. Do you think this could also happen in PostgreSQL or HSQLDB? Do you think the queries could be modified to use the index appropriately?
[jira] [Resolved] (CONNECTORS-583) MySQL locks time out and are not caught
[ https://issues.apache.org/jira/browse/CONNECTORS-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-583.
------------------------------------

Resolution: Fixed

MySQL locks time out and are not caught
---------------------------------------

Key: CONNECTORS-583
URL: https://issues.apache.org/jira/browse/CONNECTORS-583
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

MySQL locks time out and are not caught. The SQL code is 41000.
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528304#comment-13528304 ]

Karl Wright commented on CONNECTORS-578:
-----------------------------------------

You forgot to install and/or download the dependencies.

    ant make-core-deps

Livelink connector needs access to general metadata
---------------------------------------------------

Key: CONNECTORS-578
URL: https://issues.apache.org/jira/browse/CONNECTORS-578
Project: ManifoldCF
Issue Type: New Feature
Components: LiveLink connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

General metadata access requested for the livelink connector.

All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC).

On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree, but I only need a subset of that.

UI display name - prefix + dtree name
Name - ot_name
Description - ot_dcomment
Owner - ot_userid
Creation Date - ot_createdate
Creator - ot_createdby
Modified - ot_modifydate
Modified By - ot_modifiedby

For example, to get the dcomment out of the LLValue, you would have to do this:

    LLValue elem = ( new LLValue() ).setAssocNotSet();
    String dcomment = elem.toString("DCOMMENT");

And that's how you extract the data out of the OpenText structure. It's kinda crappy; you have to know the OT db schema in advance.

Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you'll have is a number). So I would check whether the owner and the creator are the same, so that I only have to make one call.

Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself.

Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. Just append it to the array or map that LivelinkConnector.java returns. IIRC, you can insert the code around line 1055-1160, after you get the cats and atts.

Attached is the GetObjectInfo description from the documentation:

GetObjectInfo

This function returns an Assoc value object containing information about the specified object.

C++ Function Prototype

    LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo );

Java/.NET Method Declaration

    public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo )

Input Parameters

    session     the session handle as returned by the SessionAllocEx function
    volumeID    the volume ID of the object. Specify 0 to identify the object using only the objectID value.
    objectID    the object ID of the object

Output Parameters

    objectInfo  a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes.

Remarks

For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object.

Followup questions/answers:

To make sure I understand this:
(1) There is no way to determine through LAPI the names of all available general attributes without having a specific item's ObjectInfo object, except by describing the dtree table via something like JDBC? (If this is in fact true then I'd prefer to just allow the user to enter names of the columns by hand.)
(2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name.
(3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id).
Please correct me if I'm incorrect in any way. Is there any different kind of additional processing you can envision? Or is lookup of a user name pretty much it?

(1) Correct. You could also cheat and just look at the returned LLValue in the debugger
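The ot_ prefixing scheme discussed in this issue can be sketched as follows. This is a minimal illustration, not the connector's code: a plain Map stands in for the ObjectInfo Assoc (the real LAPI LLValue class is not used here), the class and method names are hypothetical, and the column names are whatever the user chooses to enter by hand. Only DCOMMENT is a column name actually quoted in the issue.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: copies selected general attributes into document
// metadata under an "ot_" prefix so they cannot collide with other fields.
final class GeneralMetadataSketch {
  static final String PREFIX = "ot_";

  // objectInfo: stand-in for the Assoc returned by GetObjectInfo,
  //             keyed by dtree column name (an assumption of this sketch).
  // columns:    the column names the user asked for.
  static Map<String, String> toMetadata(Map<String, ?> objectInfo, List<String> columns) {
    Map<String, String> metadata = new LinkedHashMap<>();
    for (String column : columns) {
      Object value = objectInfo.get(column);
      if (value != null) {
        // e.g. DCOMMENT becomes ot_dcomment
        metadata.put(PREFIX + column.toLowerCase(Locale.ROOT), value.toString());
      }
    }
    return metadata;
  }
}
```

Columns that are absent from the object are skipped rather than emitted as empty values. Attributes that need extra processing, such as resolving a numeric user id to a user name, would need a further lookup step that this sketch does not cover.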
[jira] [Commented] (CONNECTORS-584) Crawling with MySQL does not use indexes for order-by on critical queries
[ https://issues.apache.org/jira/browse/CONNECTORS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528532#comment-13528532 ]

Karl Wright commented on CONNECTORS-584:
-----------------------------------------

Experiments indicate that the FORCE INDEX (index_name) construct may do the right thing. I'm going to try it in the field in any case, to see how it behaves.

Crawling with MySQL does not use indexes for order-by on critical queries
-------------------------------------------------------------------------

Key: CONNECTORS-584
URL: https://issues.apache.org/jira/browse/CONNECTORS-584
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
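As a rough illustration of the FORCE INDEX (index_name) construct mentioned in this thread, the sketch below builds a FROM clause with an optional MySQL index hint placed after the table alias. The class name, method name, and the index name used in the usage note are hypothetical; this is not the change that was committed to ManifoldCF.

```java
// Hypothetical sketch, not ManifoldCF's actual code: builds a FROM clause
// with an optional MySQL index hint following the table alias.
final class IndexHintSketch {
  // Returns "FROM <table> <alias>", followed by " FORCE INDEX (<indexName>)"
  // when an index name is supplied. Other databases would pass null.
  static String fromClause(String table, String alias, String indexName) {
    StringBuilder sb = new StringBuilder("FROM ").append(table).append(' ').append(alias);
    if (indexName != null && !indexName.isEmpty()) {
      sb.append(" FORCE INDEX (").append(indexName).append(')');
    }
    return sb.toString();
  }
}
```

For example, fromClause("jobqueue", "t0", "docpriority_idx") yields a clause that asks MySQL to use the (made-up) docpriority_idx index for the jobqueue scan. Keeping the hint optional lets the same query builder serve PostgreSQL or HSQLDB, which do not accept this syntax.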
[jira] [Commented] (CONNECTORS-584) Crawling with MySQL does not use indexes for order-by on critical queries
[ https://issues.apache.org/jira/browse/CONNECTORS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528556#comment-13528556 ]

Karl Wright commented on CONNECTORS-584:
-----------------------------------------

r1419964 installs FORCE INDEX in the crucial place.

Crawling with MySQL does not use indexes for order-by on critical queries
-------------------------------------------------------------------------

Key: CONNECTORS-584
URL: https://issues.apache.org/jira/browse/CONNECTORS-584
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 1.0.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
[jira] [Commented] (CONNECTORS-582) Either wait for httpcomponents 4.2.3, or include local copy of NTLM implementation and work with 4.2.2
[ https://issues.apache.org/jira/browse/CONNECTORS-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528925#comment-13528925 ]

Karl Wright commented on CONNECTORS-582:
-----------------------------------------

I pulled up the httpclient NTLM fixes to the httpclient 4.2.x branch, so when 4.2.3 goes out it will include them. I'm still waiting, however, for verification that NTLMv2-only systems work properly with the new NTLM code.

Either wait for httpcomponents 4.2.3, or include local copy of NTLM implementation and work with 4.2.2
-------------------------------------------------------------------------------------------------------

Key: CONNECTORS-582
URL: https://issues.apache.org/jira/browse/CONNECTORS-582
Project: ManifoldCF
Issue Type: Bug
Components: Framework core
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

The NTLM fixes I submitted to the Httpcomponents project are not yet released. So we have a choice: either delay MCF 1.1 until HttpClient 4.2.3 has shipped, or supply our own NTLM implementation that we can remove at the next chance we have. In any case we need to do something.
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13528980#comment-13528980 ] Karl Wright commented on CONNECTORS-578: Well, you certainly aren't using the LiveLink connector in the branches/CONNECTORS-578 branch. It's using HttpComponents, and the code you are running is using the older commons-httpclient: C:\wip\mcf\CONNECTORS-578\connectors\livelink\connector\src\main\java\org\apache \manifoldcf\crawler\connectors\livelinkgrep MultiThreadedHttpConnectionManager *.java C:\wip\mcf\CONNECTORS-578\connectors\livelink\connector\src\main\java\org\apache \manifoldcf\crawler\connectors\livelink The exception you posted is the result of incorrect or missing trust store setup. You need to add either the server's cert or a certificate authority's cert in the connection configuration. Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. 
UI display name - prefix + dtree name: Name - ot_name; Description - ot_dcomment; Owner - ot_userid; Creation Date - ot_createdate; Creator - ot_createdby; Modified - ot_modifydate; Modified By - ot_modifiedby. For example, to get the dcomment out of the LLValue you would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString("DCOMMENT"); And that’s how you extract the data out of the OpenText structure. It’s kinda crappy; you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). So I would check whether the owner and the creator are the same; if they are, I only have to make one call. Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around line 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo: This function returns an Assoc value object containing information about the specified object. C++ Function Prototype: LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo ); Java/.NET Method Declaration: public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters: session - the session handle as returned by the SessionAllocEx function; volumeID - the volume ID of the object. Specify 0 to identify the object using only the objectID value; objectID - the object ID of the object. Output Parameters: objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object.
For more information, see ObjectInfo Attributes. Remarks For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? (If this is in fact true then I’d
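The naming and lookup scheme proposed in this issue can be sketched in plain Java. This is only an illustration under stated assumptions, not connector code: a `Map<String, Object>` stands in for the Assoc `LLValue` that `GetObjectInfo` fills in (no LAPI classes are used), the upper-case column names other than `DCOMMENT` are inferred from the `ot_` list in the description, and the `userNames` function is a hypothetical stand-in for whatever extra call resolves a numeric user ID to a user name. Only the owner and creator columns are resolved, since those are the two the description mentions.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

final class GeneralMetadataSketch {
    // Prefix suggested in the issue so general attributes cannot collide
    // with metadata carried by the document itself.
    static final String PREFIX = "ot_";

    // dtree column names, inferred from the ot_ list in the issue (assumption).
    static final String[] COLUMNS = {
        "NAME", "DCOMMENT", "USERID", "CREATEDATE",
        "CREATEDBY", "MODIFYDATE", "MODIFIEDBY"
    };

    // Owner and creator hold numeric user IDs that need a name lookup.
    static boolean isUserColumn(String column) {
        return column.equals("USERID") || column.equals("CREATEDBY");
    }

    // objectInfo: stand-in for the Assoc value returned by GetObjectInfo.
    // userNames: hypothetical lookup from user ID to user name.
    static Map<String, String> toMetadata(Map<String, Object> objectInfo,
                                          Function<Integer, String> userNames) {
        // Cache resolved IDs so owner == creator costs a single lookup.
        Map<Integer, String> resolved = new HashMap<>();
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String column : COLUMNS) {
            Object value = objectInfo.get(column);
            if (value == null) {
                continue; // column not present for this object
            }
            String text;
            if (isUserColumn(column) && value instanceof Number) {
                Integer userId = ((Number) value).intValue();
                text = resolved.computeIfAbsent(userId, userNames);
                if (text == null) {
                    text = userId.toString(); // fall back to the raw ID
                }
            } else {
                text = value.toString();
            }
            metadata.put(PREFIX + column.toLowerCase(Locale.ROOT), text);
        }
        return metadata;
    }

    public static void main(String[] args) {
        Map<String, Object> objectInfo = new HashMap<>();
        objectInfo.put("NAME", "Design spec");
        objectInfo.put("DCOMMENT", "Draft for review");
        objectInfo.put("USERID", 1000);
        objectInfo.put("CREATEDBY", 1000);
        System.out.println(toMetadata(objectInfo, id -> "user-" + id));
    }
}
```

Caching by user ID is one way to get the "only one call when owner and creator are the same" behaviour without comparing the two fields explicitly.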
[jira] [Comment Edited] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528980#comment-13528980 ] Karl Wright edited comment on CONNECTORS-578 at 12/11/12 2:02 PM: -- Well, you certainly aren't using the LiveLink connector in the branches/CONNECTORS-578 branch. It's using HttpComponents, and the code you are running is using the older commons-httpclient: C:\wip\mcf\CONNECTORS-578\connectors\livelink\connector\src\main\java\org\apache\manifoldcf\crawler\connectors\livelink>grep MultiThreadedHttpConnectionManager *.java C:\wip\mcf\CONNECTORS-578\connectors\livelink\connector\src\main\java\org\apache\manifoldcf\crawler\connectors\livelink> The exception you posted is the result of incorrect or missing trust store setup. You need to add either the server's cert or a certificate authority's cert in the connection configuration. However, the screen is not supposed to blank even in that case. So if you can view page source in your browser, and attach it to this ticket, I'd be grateful.
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529003#comment-13529003 ] Karl Wright commented on CONNECTORS-578: The core deps have not changed; you can still download the same ones from http://people.apache.org/~kwright/apache-manifoldcf-1.1-dev .
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529010#comment-13529010 ] Karl Wright commented on CONNECTORS-578: bq. I'm sorry; What branch was I supposed to use? I grabbed the latest from branches/Connectors-578 and built it. Well, if you built it, you aren't running it. Or you attached the wrong log. I know this because there are classes in the trace that do not exist in the CONNECTORS-578 branch. bq. And the cert I used was from another connector that works. If that was true, you should not have seen the error in the log you posted. There's a disconnect somewhere - either the log is not the right one, or the code you are running is not from CONNECTORS-578. The blank screen in the UI might be caused by an exception, but if that happened, you should either see it in the log or written to standard out. The Cert exception was also logged as a WARNING which is not enough to cause the UI to fail to render - usually that would have to be some kind of null pointer exception or some such. I'll explore a bit here and see if I can get a NPE without a livelink server to attach to, but really in general I do need correct info in order to proceed...
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529020#comment-13529020 ] Karl Wright commented on CONNECTORS-578: No luck, everything works fine here as far as I can test. I do not get a blank screen. The status I get is: Connection status: Transient error: Could not access server ... which is pretty much what I would have expected. So in order to proceed we need actual exceptions. I can add code that catches really bad exceptions and prints these to standard out in the UI. That will relieve you of needing to find the right log.
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529022#comment-13529022 ] Karl Wright commented on CONNECTORS-578: Ok, I've committed debugging code to the branch that should dump any fatal errors that occur during connect to standard out. You should see them then.
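The debugging approach described in this comment (catch fatal errors during connect and dump them to standard out) might look roughly like the following. This is a sketch and not the code that was committed to the branch: `ConnectAction` and `runAndDump` are hypothetical names introduced here, and the real connector's connect logic is replaced by a lambda.

```java
import java.io.PrintStream;

final class FatalErrorDump {
    /** Hypothetical stand-in for the connector's connect step. */
    interface ConnectAction {
        void run() throws Exception;
    }

    // Runs the action; if anything is thrown, prints the full stack trace
    // to the given stream (System.out in the UI case) and rethrows it so
    // the caller can still report a connection status.
    static void runAndDump(ConnectAction action, PrintStream out) throws Exception {
        try {
            action.run();
        } catch (Exception | Error e) {
            out.println("Fatal error during connect: " + e);
            e.printStackTrace(out);
            out.flush();
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            runAndDump(() -> {
                throw new NullPointerException("no session");
            }, System.out);
        } catch (Exception e) {
            // Already dumped above; nothing more to do in this demo.
        }
    }
}
```

Writing to standard out rather than a log file means the person reproducing the problem does not have to locate the right log, which was the difficulty in this exchange.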
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529032#comment-13529032 ] Karl Wright commented on CONNECTORS-578: If you sync up and try again you will see the error exception dumped to standard out. Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC). On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. UI display name - prefix + dtree name: Name - ot_name; Description - ot_dcomment; Owner - ot_userid; Creation Date - ot_createdate; Creator - ot_createdby; Modified - ot_modifydate; Modified By - ot_modifiedby. For example, to get the dcomment out of the LLValue you would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString("DCOMMENT"); And that’s how you extract the data out of the OpenText structure. It’s kinda crappy; you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). So I would check whether the owner and the creator are the same; if they are, I only have to make one call.
Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around line 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo: This function returns an Assoc value object containing information about the specified object. C++ Function Prototype: LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo ); Java/.NET Method Declaration: public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters: session - the session handle as returned by the SessionAllocEx function; volumeID - the volume ID of the object. Specify 0 to identify the object using only the objectID value; objectID - the object ID of the object. Output Parameters: objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes. Remarks: For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? (If this is in fact true then I’d prefer to just allow the user to enter names of the columns by hand.)
(2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name. (3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id). Please correct me if I'm incorrect in any way. Is there any different kind of additional processing you can envision? Or is lookup of a user name pretty much it? (1) Correct. You could also cheat and just look at the returned LLValue in the
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529089#comment-13529089 ] Karl Wright commented on CONNECTORS-578: Thanks, this is the exception I'm looking for. It's basically a result of the refactoring, and I will check in a fix shortly.
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529114#comment-13529114 ] Karl Wright commented on CONNECTORS-578: Ok, checked in a fix for this problem. Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. UI display name - prefix + dtree name: Name - ot_name; Description - ot_dcomment; Owner - ot_userid; Creation Date - ot_createdate; Creator - ot_createdby; Modified - ot_modifydate; Modified By - ot_modifiedby. For example, to get the dcomment out of the LLValue, you would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString("DCOMMENT"); And that’s how you extract the data out of the OpenText structure. It’s kinda crappy: you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). So I would check whether the owner and the creator are the same; if they are, I only have to make one call. Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. 
Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around lines 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo: This function returns an Assoc value object containing information about the specified object. C++ Function Prototype: LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo ); Java/.NET Method Declaration: public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters: session - the session handle as returned by the SessionAllocEx function; volumeID - the volume ID of the object (specify 0 to identify the object using only the objectID value); objectID - the object ID of the object. Output Parameters: objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes. Remarks: For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? (If this is in fact true then I’d prefer to just allow the user to enter names of the columns by hand.) (2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name. 
(3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id). Please correct me if I'm incorrect in any way. Is there any different kind of additional processing you can envision? Or is lookup of a user name pretty much it? (1) Correct. You could also cheat and just look at the returned LLValue in the debugger and see all the dtree column values.
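Editorial sketch of the approach described in the issue above: copy selected dtree columns into the document metadata under an ot_ prefix, and look up the user name only once when owner and creator are the same ID. This is a minimal illustration, not the connector's code. The class and method names are hypothetical, a plain Map stands in for the values read out of the LLValue returned by GetObjectInfo, the user-name lookup is a caller-supplied function, and the upper-case column names other than DCOMMENT are assumptions inferred from the ot_ names listed in the issue.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.IntFunction;

/** Sketch only: none of these names come from LAPI or from LivelinkConnector.java. */
class GeneralMetadataSketch {
    /** Prefix suggested in the issue, so OpenText fields do not collide with other document metadata. */
    static final String PREFIX = "ot_";

    /**
     * Copies the requested dtree columns into a new map under "ot_"-prefixed, lower-cased names.
     * Columns that are absent from objectInfo are skipped.
     */
    static Map<String, String> prefixAttributes(Map<String, String> objectInfo, String... columns) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String column : columns) {
            String value = objectInfo.get(column);
            if (value != null) {
                metadata.put(PREFIX + column.toLowerCase(Locale.ROOT), value);
            }
        }
        return metadata;
    }

    /**
     * Resolves owner and creator names from numeric IDs.
     * The (hypothetical) lookup is invoked once when both IDs are equal, otherwise twice.
     */
    static String[] resolveOwnerAndCreator(int ownerId, int creatorId, IntFunction<String> userNameLookup) {
        String owner = userNameLookup.apply(ownerId);
        String creator = (creatorId == ownerId) ? owner : userNameLookup.apply(creatorId);
        return new String[] { owner, creator };
    }

    public static void main(String[] args) {
        // Stand-in for values extracted from the objectInfo Assoc; the values are made up.
        Map<String, String> objectInfo = new LinkedHashMap<>();
        objectInfo.put("NAME", "Quarterly report");
        objectInfo.put("DCOMMENT", "Draft for review");
        System.out.println(prefixAttributes(objectInfo, "NAME", "DCOMMENT"));

        String[] names = resolveOwnerAndCreator(1000, 1000, id -> "user" + id);
        System.out.println(names[0] + " / " + names[1]);
    }
}
```

Passing the column names in explicitly matches the preference stated in the follow-up questions for letting the user enter the column names by hand.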
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529280#comment-13529280 ] Karl Wright commented on CONNECTORS-578: The fix was checked into the branch. If you have time, please try it again. Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. UI display name - prefix + dtree name: Name - ot_name; Description - ot_dcomment; Owner - ot_userid; Creation Date - ot_createdate; Creator - ot_createdby; Modified - ot_modifydate; Modified By - ot_modifiedby. For example, to get the dcomment out of the LLValue, you would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString("DCOMMENT"); And that’s how you extract the data out of the OpenText structure. It’s kinda crappy: you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). So I would check whether the owner and the creator are the same; if they are, I only have to make one call. 
Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around lines 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo: This function returns an Assoc value object containing information about the specified object. C++ Function Prototype: LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo ); Java/.NET Method Declaration: public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters: session - the session handle as returned by the SessionAllocEx function; volumeID - the volume ID of the object (specify 0 to identify the object using only the objectID value); objectID - the object ID of the object. Output Parameters: objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes. Remarks: For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? (If this is in fact true then I’d prefer to just allow the user to enter names of the columns by hand.) 
(2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name. (3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id). Please correct me if I'm incorrect in any way. Is there any different kind of additional processing you can envision? Or is lookup of a user name pretty much it? (1) Correct. You could also cheat and just look at the returned LLValue in the debugger
[jira] [Resolved] (CONNECTORS-585) LivelinkConnector: Even when lapi.jar is supplied, connectors-proprietary.xml does not automatically enable registration
[ https://issues.apache.org/jira/browse/CONNECTORS-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-585. Resolution: Not A Problem Turns out it was operator error; old placeholder file, needed an ant clean. LivelinkConnector: Even when lapi.jar is supplied, connectors-proprietary.xml does not automatically enable registration Key: CONNECTORS-585 URL: https://issues.apache.org/jira/browse/CONNECTORS-585 Project: ManifoldCF Issue Type: Bug Components: Build, LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The livelink connector has the option of the user providing the lapi.jar, and building against that. When that happens, livelink-PLACEHOLDER.txt should not be delivered to connector-lib-proprietary, and the livelink connector should be enabled in the connectors-proprietary.xml file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529501#comment-13529501 ] Karl Wright commented on CONNECTORS-578: Since there is no change I have to presume that the httpcomponents library is broken and cannot accept a host argument and a relative URL, contrary to the spec. So I replaced all uses of that construct with a hand-constructed URL instead. Please try again. Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. UI display name - prefix + dtree name: Name - ot_name; Description - ot_dcomment; Owner - ot_userid; Creation Date - ot_createdate; Creator - ot_createdby; Modified - ot_modifydate; Modified By - ot_modifiedby. For example, to get the dcomment out of the LLValue, you would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString("DCOMMENT"); And that’s how you extract the data out of the OpenText structure. It’s kinda crappy: you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). 
So I would check whether the owner and the creator are the same; if they are, I only have to make one call. Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around lines 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo: This function returns an Assoc value object containing information about the specified object. C++ Function Prototype: LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo ); Java/.NET Method Declaration: public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters: session - the session handle as returned by the SessionAllocEx function; volumeID - the volume ID of the object (specify 0 to identify the object using only the objectID value); objectID - the object ID of the object. Output Parameters: objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes. Remarks: For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? 
(If this is in fact true then I’d prefer to just allow the user to enter names of the columns by hand.) (2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name. (3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id). Please correct me if I'm incorrect in any way. Is there any different
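Editorial sketch for the comment at the top of this message, which mentions replacing the host-plus-relative-URL construct with a hand-constructed URL. The checked-in change is not shown in the thread, so the following is only one way to assemble an absolute URL from its parts, using the JDK's java.net.URI. The class name, method name, and the host, path, and query values in the usage are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Sketch only: not the code checked in for CONNECTORS-578. */
class AbsoluteUrlSketch {
    /**
     * Builds an absolute URL string from separate components.
     * A port of -1 leaves the port out; a null query leaves the query out.
     */
    static String buildAbsoluteUrl(String scheme, String host, int port, String relativePath, String query)
            throws URISyntaxException {
        // An absolute URL needs a path that starts with a slash.
        String path = relativePath.startsWith("/") ? relativePath : "/" + relativePath;
        // The multi-argument URI constructor quotes characters that are illegal in each component.
        return new URI(scheme, null, host, port, path, query, null).toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // Hypothetical host, path, and query, for illustration only.
        System.out.println(buildAbsoluteUrl("http", "livelink.example.com", -1,
                "livelink/livelink.exe", "func=ll&objId=123"));
    }
}
```

The resulting string can then be handed to an HTTP client as a complete request URL, so the client never has to combine a host argument with a relative path.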
[jira] [Commented] (CONNECTORS-584) Crawling with MySQL does not use indexes for order-by on critical queries
[ https://issues.apache.org/jira/browse/CONNECTORS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529590#comment-13529590 ] Karl Wright commented on CONNECTORS-584: r1420515 for the final fix. Using multiple criteria for ORDER BY was what was breaking MySQL. However, I need to verify that the fix does not break any other database; if so we will need an abstraction. Crawling with MySQL does not use indexes for order-by on critical queries - Key: CONNECTORS-584 URL: https://issues.apache.org/jira/browse/CONNECTORS-584 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The following ORDER-BY query is taking a long time on MySQL: {code} # Time: 121204 16:25:40 # User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1] # Query_time: 7.240532 Lock_time: 0.000204 Rows_sent: 1200 Rows_examined: 611091 SET timestamp=1354605940; SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset FROM jobqueue t0 WHERE t0.status IN ('P','G') AND t0.checkaction='R' AND t0.checktime=1354605932817 AND EXISTS(SELECT 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND t1.id=t0.jobid AND t1.priority=5) AND NOT EXISTS(SELECT 'x' FROM jobqueue t2 WHERE t2.dochash=t0.dochash AND t2.status IN ('A','F','a','f','D','d') AND t2.jobid!=t0.jobid) AND NOT EXISTS(SELECT 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner AND t3.eventname=t4.name) ORDER BY t0.docpriority ASC,t0.status ASC,t0.checkaction ASC,t0.checktime ASC LIMIT 1200; # Time: 121204 16:25:44 # User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1] # Query_time: 3.064339 Lock_time: 0.84 Rows_sent: 1 Rows_examined: 406359 SET timestamp=1354605944; SELECT docpriority,jobid,dochash,docid FROM jobqueue t0 WHERE status IN ('P','G') AND checkaction='R' AND checktime=1354605932817 AND EXISTS(SELECT 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND 
t1.id=t0.jobid) ORDER BY docpriority ASC,status ASC,checkaction ASC,checktime ASC LIMIT 1; --- {code} I wonder whether these queries use the table's indexes appropriately. Running EXPLAIN against the slow query showed a filesort. There seem to be some conditions under which MySQL does not use an index for ORDER BY: - when ORDER BY is executed against multiple keys - when the keys used to select records differ from the keys used by ORDER BY. Since a filesort was happening, the full scan of the records is presumably what makes MCF slow. Do you think this could happen in PostgreSQL or HSQLDB as well? Do you think the queries could be modified to use the indexes appropriately?
[jira] [Commented] (CONNECTORS-584) Crawling with MySQL does not use indexes for order-by on critical queries
[ https://issues.apache.org/jira/browse/CONNECTORS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13529780#comment-13529780 ] Karl Wright commented on CONNECTORS-584: Well, it looks like it is HSQLDB (at least) that breaks. I'll therefore be introducing a pertinent abstraction today. Crawling with MySQL does not use indexes for order-by on critical queries - Key: CONNECTORS-584 URL: https://issues.apache.org/jira/browse/CONNECTORS-584 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The following ORDER-BY query is taking a long time on MySQL: {code} # Time: 121204 16:25:40 # User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1] # Query_time: 7.240532 Lock_time: 0.000204 Rows_sent: 1200 Rows_examined: 611091 SET timestamp=1354605940; SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset FROM jobqueue t0 WHERE t0.status IN ('P','G') AND t0.checkaction='R' AND t0.checktime=1354605932817 AND EXISTS(SELECT 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND t1.id=t0.jobid AND t1.priority=5) AND NOT EXISTS(SELECT 'x' FROM jobqueue t2 WHERE t2.dochash=t0.dochash AND t2.status IN ('A','F','a','f','D','d') AND t2.jobid!=t0.jobid) AND NOT EXISTS(SELECT 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner AND t3.eventname=t4.name) ORDER BY t0.docpriority ASC,t0.status ASC,t0.checkaction ASC,t0.checktime ASC LIMIT 1200; # Time: 121204 16:25:44 # User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1] # Query_time: 3.064339 Lock_time: 0.84 Rows_sent: 1 Rows_examined: 406359 SET timestamp=1354605944; SELECT docpriority,jobid,dochash,docid FROM jobqueue t0 WHERE status IN ('P','G') AND checkaction='R' AND checktime=1354605932817 AND EXISTS(SELECT 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND t1.id=t0.jobid) ORDER BY docpriority ASC,status ASC,checkaction ASC,checktime ASC 
LIMIT 1; --- {code} I wonder whether these queries use the table's indexes appropriately. Running EXPLAIN against the slow query showed a filesort. There seem to be some conditions under which MySQL does not use an index for ORDER BY: - when ORDER BY is executed against multiple keys - when the keys used to select records differ from the keys used by ORDER BY. Since a filesort was happening, the full scan of the records is presumably what makes MCF slow. Do you think this could happen in PostgreSQL or HSQLDB as well? Do you think the queries could be modified to use the indexes appropriately?
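Editorial sketch of the kind of abstraction the comment above refers to: letting each database decide how an index-backed ORDER BY clause is written. The thread only establishes that multiple ORDER BY criteria hurt MySQL and that the MySQL-oriented fix broke HSQLDB; it does not show the interface that was introduced. Everything below (the interface, the two strategies, and the assumption that the MySQL-friendly form orders by the leading index column only) is hypothetical.

```java
import java.util.StringJoiner;

/** Hypothetical per-database strategy for writing an ORDER BY clause that matches an index. */
interface OrderByDialect {
    String orderBy(String[] indexColumns, boolean ascending);
}

/** Sketch only: not ManifoldCF's actual database abstraction. */
class OrderBySketch {
    /** Lists every index column, e.g. "ORDER BY docpriority ASC,status ASC". */
    static String allColumns(String[] indexColumns, boolean ascending) {
        String direction = ascending ? " ASC" : " DESC";
        StringJoiner clause = new StringJoiner(",", "ORDER BY ", "");
        for (String column : indexColumns) {
            clause.add(column + direction);
        }
        return clause.toString();
    }

    /** Orders by the leading index column only (assumed here to be the MySQL-friendly form). */
    static String leadingColumn(String[] indexColumns, boolean ascending) {
        return "ORDER BY " + indexColumns[0] + (ascending ? " ASC" : " DESC");
    }

    static final OrderByDialect ALL_COLUMNS = OrderBySketch::allColumns;
    static final OrderByDialect LEADING_COLUMN = OrderBySketch::leadingColumn;

    public static void main(String[] args) {
        // Column names taken from the slow queries quoted in the issue.
        String[] index = { "docpriority", "status", "checkaction", "checktime" };
        System.out.println(ALL_COLUMNS.orderBy(index, true));
        System.out.println(LEADING_COLUMN.orderBy(index, true));
    }
}
```

With a split like this, the query-building code asks the active database implementation for the clause instead of hard-coding one form, which is what lets a MySQL-specific change coexist with databases that need the full column list.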
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13530033#comment-13530033 ] Karl Wright commented on CONNECTORS-578: Any news here? Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. UI display name - prefix + dtree name: Name - ot_name; Description - ot_dcomment; Owner - ot_userid; Creation Date - ot_createdate; Creator - ot_createdby; Modified - ot_modifydate; Modified By - ot_modifiedby. For example, to get the dcomment out of the LLValue, you would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString("DCOMMENT"); And that’s how you extract the data out of the OpenText structure. It’s kinda crappy: you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). So I would check whether the owner and the creator are the same; if they are, I only have to make one call. Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. 
Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around lines 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo: This function returns an Assoc value object containing information about the specified object. C++ Function Prototype: LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo ); Java/.NET Method Declaration: public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters: session - the session handle as returned by the SessionAllocEx function; volumeID - the volume ID of the object (specify 0 to identify the object using only the objectID value); objectID - the object ID of the object. Output Parameters: objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes. Remarks: For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? (If this is in fact true then I’d prefer to just allow the user to enter names of the columns by hand.) (2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name. 
(3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id). Please correct me if I'm incorrect in any way. Is there any different kind of additional processing you can envision? Or is lookup of a user name pretty much it? (1) Correct. You could also cheat and just look at the returned LLValue in the debugger and see all the dtree column values. (2) Correct. (3)
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13530077#comment-13530077 ] Karl Wright commented on CONNECTORS-578: Hi David, The authentication is clearly attempting to establish a connection via Kerberos, rather than NTLM. This is not supported, so I suggest you turn off Kerberos authentication for this particular livelink instance. You would do this in IIS on the server you are trying to fetch documents from. However, even though it fails with Kerberos, it seems to succeed eventually. It must fall back to NTLM or something, because I see no *hard* failure in this log. Specifically, I don't see anything in this log that would explain a failure to start a job. So, a question: (1) How long did you wait for the job to start? and (2) What database is this? You might be able to get more information by looking at the Simple History report to see whether it is in fact making (very slow) progress. If not it may be getting an exception and retrying silently, but it doesn't do that very long before giving up. I'd start by fixing the Kerberos auth issue, and then things should begin to operate in real time again. Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. 
UI display name - prefix + dtree name: Name - ot_name; Description - ot_dcomment; Owner - ot_userid; Creation Date - ot_createdate; Creator - ot_createdby; Modified - ot_modifydate; Modified By - ot_modifiedby. For example, to get the dcomment out of the LLValue, you would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString("DCOMMENT"); And that’s how you extract the data out of the OpenText structure. It’s kinda crappy: you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). So I would check whether the owner and the creator are the same; if they are, I only have to make one call. Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around lines 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo: This function returns an Assoc value object containing information about the specified object. C++ Function Prototype: LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo ); Java/.NET Method Declaration: public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters: session - the session handle as returned by the SessionAllocEx function; volumeID - the volume ID of the object (specify 0 to identify the object using only the objectID value); objectID - the object ID of the object. Output Parameters: objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. 
For more information, see ObjectInfo Attributes. Remarks: For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13530090#comment-13530090 ] Karl Wright commented on CONNECTORS-578: Ok, can you look at this with top (if Linux) or Task Manager (if Windows)? Is the ManifoldCF process using 100% of one CPU? If so, can you get a thread dump? You can get this with jstack from the JDK, or kill -QUIT, or a special magic keystroke on Windows... Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. UI display name - prefix + dtree name: Name - ot_name; Description - ot_dcomment; Owner - ot_userid; Creation Date - ot_createdate; Creator - ot_createdby; Modified - ot_modifydate; Modified By - ot_modifiedby. For example, to get the dcomment out of the LLValue, you would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString("DCOMMENT"); And that’s how you extract the data out of the OpenText structure. It’s kinda crappy: you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). 
So I would check whether the owner and the creator are the same; if they are, I only have to make one call. Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around lines 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo: This function returns an Assoc value object containing information about the specified object. C++ Function Prototype: LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo ); Java/.NET Method Declaration: public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters: session - the session handle as returned by the SessionAllocEx function; volumeID - the volume ID of the object (specify 0 to identify the object using only the objectID value); objectID - the object ID of the object. Output Parameters: objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes. Remarks: For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? 
(If this is in fact true then I’d prefer to just allow the user to enter names of the columns by hand.) (2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name. (3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id). Please correct me if I'm incorrect in any way. Is there any different
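The mapping described in the issue (prefix each dtree column with ot_, and resolve the owner and creator ids to user names with as few lookups as possible) could be sketched roughly as follows. This is only an illustration under stated assumptions: it does not call LAPI at all. A plain Map stands in for the ObjectInfo Assoc, the IntFunction stands in for whatever call resolves a user id to a name, and the class name, method name, and upper-case column keys such as USERID and CREATEDBY are hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical sketch; none of these names come from the connector or from LAPI.
public class GeneralMetadataSketch {

  // objectInfo: stand-in for the Assoc returned by GetObjectInfo (column name -> value).
  // userNameLookup: stand-in for the extra call that turns a numeric user id into a name.
  static Map<String, String> toMetadata(Map<String, String> objectInfo,
                                        IntFunction<String> userNameLookup) {
    Map<String, String> out = new LinkedHashMap<>();
    // Prefix every column with ot_ so it cannot collide with the document's own metadata.
    for (Map.Entry<String, String> e : objectInfo.entrySet()) {
      out.put("ot_" + e.getKey().toLowerCase(Locale.ROOT), e.getValue());
    }
    // Owner and creator are numeric ids; cache lookups so equal ids cost one call.
    Map<Integer, String> cache = new HashMap<>();
    for (String key : new String[] {"ot_userid", "ot_createdby"}) {
      String id = out.get(key);
      if (id != null) {
        out.put(key, cache.computeIfAbsent(Integer.valueOf(id), userNameLookup::apply));
      }
    }
    return out;
  }
}
```

With this shape, an item whose owner and creator are the same user triggers a single name lookup, and two lookups otherwise, which matches the "at most 2 additional calls" estimate above.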
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13530102#comment-13530102 ] Karl Wright commented on CONNECTORS-578: jstack.exe is also available on windows, FYI Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. UI display name - prefix + dtree name Name - ot_name Description - ot_dcomment Owner - ot_userid Creation Date - ot_createdate Creator - ot_createdby Modified - ot_modifydate Modified By - ot_modifiedby For example, to get the dcomment out of the LLValue You would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString(“DCOMMENT”); And that’s how you extract the data out of the opentext structure. It’s kinda crappy, you have to know the OT db schema in advance. Now since owner and creator are sometimes different you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). So I would check to see if the owner and the creator are the same then I only have to make one call. Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. 
Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around line 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo This function returns an Assoc value object containing information about the specified object. C++ Function Prototype LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONGvolumeID, LLLONGobjectID, LLVALUE objectInfo ); Java/.NET Method Declaration public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters session the session handle as returned by the SessionAllocEx function volumeIDthe volume ID of the object. Specify 0 to identify the object using only the objectID value. objectID the object ID of the object Output Parameters objectInfo a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes. Remarks For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? (If this is in fact true then I’d prefer to just allow the user to enter names of the columns by hand.) (2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name. 
(3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id). Please correct me if I'm incorrect in any way. Is there any different kind of additional processing you can envision? Or is lookup of a user name pretty much it? (1) Correct. You could also cheat and just look at the returned LLValue in the debugger and see all the dtree column
[jira] [Commented] (CONNECTORS-579) RSS connector: Add untrusted, unverified SSL support
[ https://issues.apache.org/jira/browse/CONNECTORS-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530135#comment-13530135 ]

Karl Wright commented on CONNECTORS-579:

Hmm - there are a couple of ways to find out what kind of authentication is being used, but it really does sound like SSL is still not establishing a reasonable session. Can you provide the exact error message, and I will chase it down and see what might be missing?

RSS connector: Add untrusted, unverified SSL support
---
Key: CONNECTORS-579
URL: https://issues.apache.org/jira/browse/CONNECTORS-579
Project: ManifoldCF
Issue Type: Improvement
Components: RSS connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1

The RSS connector has never needed SSL support before. But there are some sites that serve up everything through SSL. There's no need for host verification etc., but a simple allow-everything approach might well be useful.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-579) RSS connector: Add untrusted, unverified SSL support
[ https://issues.apache.org/jira/browse/CONNECTORS-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530171#comment-13530171 ]

Karl Wright commented on CONNECTORS-579:

Thanks, this is critical information, because I can see from the stack trace that it is using the wrong certificate verifier, and yet I am supplying

Can you confirm that you are running a recently checked-out version of ManifoldCF trunk? If you are running something you downloaded from people.apache.org, it will not contain the necessary fix. The code in question is here:

{code}
SSLSocketFactory myFactory = new SSLSocketFactory(
  new InterruptibleSocketFactory(httpsSocketFactory,connectionTimeoutMilliseconds),
  new AllowAllHostnameVerifier());
{code}

If you can confirm you are running the right stuff, then I'll go looking inside the httpcomponents implementation for the reason why certificate verification is still active even though I explicitly disabled it.

RSS connector: Add untrusted, unverified SSL support
---
Key: CONNECTORS-579
URL: https://issues.apache.org/jira/browse/CONNECTORS-579
Project: ManifoldCF
Issue Type: Improvement
Components: RSS connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
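For readers unfamiliar with what an "allow everything" SSL setup involves, here is a minimal sketch using only the standard JSSE classes in the JDK. It is not the ManifoldCF code and does not use httpcomponents; the class and member names are hypothetical. It shows the two separate pieces that have to be relaxed: certificate chain trust (the trust manager) and host name verification (the verifier).

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical sketch: accepts any certificate and any host name.
// This removes all protection against impersonation; only use it where that is acceptable.
public class TrustEverythingSketch {

  // Trust manager whose checks never throw, so every certificate chain is accepted.
  static final X509TrustManager TRUST_ALL = new X509TrustManager() {
    @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
    @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
    @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
  };

  // Host name verifier that accepts every host, the JSSE analogue of "allow all".
  static final HostnameVerifier ALLOW_ALL_HOSTS = (hostname, session) -> true;

  // Builds an SSLContext wired to the trust-all manager.
  static SSLContext trustEverythingContext() {
    try {
      SSLContext ctx = SSLContext.getInstance("TLS");
      ctx.init(null, new TrustManager[] {TRUST_ALL}, new SecureRandom());
      return ctx;
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException("Could not set up SSL context", e);
    }
  }
}
```

The socket factory obtained from trustEverythingContext().getSocketFactory() plays the role that httpsSocketFactory appears to play in the snippet above, though that correspondence is an assumption.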
[jira] [Comment Edited] (CONNECTORS-579) RSS connector: Add untrusted, unverified SSL support
[ https://issues.apache.org/jira/browse/CONNECTORS-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530171#comment-13530171 ]

Karl Wright edited comment on CONNECTORS-579 at 12/12/12 6:27 PM:
--

Thanks, this is critical information, because I can see from the stack trace that it is using the wrong certificate verifier.

Can you confirm that you are running a recently checked-out version of ManifoldCF trunk? If you are running something you downloaded from people.apache.org, it will not contain the necessary fix. The code in question is here:

{code}
SSLSocketFactory myFactory = new SSLSocketFactory(
  new InterruptibleSocketFactory(httpsSocketFactory,connectionTimeoutMilliseconds),
  new AllowAllHostnameVerifier());
{code}

If you can confirm you are running the right stuff, then I'll go looking inside the httpcomponents implementation for the reason why certificate verification is still active even though I explicitly disabled it.

was (Author: kwri...@metacarta.com):

Thanks, this is critical information, because I can see from the stack trace that it is using the wrong certificate verifier, and yet I am supplying

Can you confirm that you are running a recently checked-out version of ManifoldCF trunk? If you are running something you downloaded from people.apache.org, it will not contain the necessary fix. The code in question is here:

{code}
SSLSocketFactory myFactory = new SSLSocketFactory(
  new InterruptibleSocketFactory(httpsSocketFactory,connectionTimeoutMilliseconds),
  new AllowAllHostnameVerifier());
{code}

If you can confirm you are running the right stuff, then I'll go looking inside the httpcomponents implementation for the reason why certificate verification is still active even though I explicitly disabled it.

RSS connector: Add untrusted, unverified SSL support
---
Key: CONNECTORS-579
URL: https://issues.apache.org/jira/browse/CONNECTORS-579
Project: ManifoldCF
Issue Type: Improvement
Components: RSS connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
[jira] [Commented] (CONNECTORS-579) RSS connector: Add untrusted, unverified SSL support
[ https://issues.apache.org/jira/browse/CONNECTORS-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530202#comment-13530202 ]

Karl Wright commented on CONNECTORS-579:

Ok. I looked in the httpcomponents code. The basic logic has it fetching the peer certificates and then deciding not to verify them (because we're ignoring cert problems). Unfortunately, in the run-up to not verifying the connection, the following line is executed:

{code}
Certificate[] certs = session.getPeerCertificates();
{code}

It is this line that is throwing the exception that is aborting the connection. So it looks like I will need to develop a patch for this oversight for httpcomponents. The earliest I can do that is this evening, so for now we need to put the RSS connector changes on hold. Maybe instead we can make more progress with Livelink.

RSS connector: Add untrusted, unverified SSL support
---
Key: CONNECTORS-579
URL: https://issues.apache.org/jira/browse/CONNECTORS-579
Project: ManifoldCF
Issue Type: Improvement
Components: RSS connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
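The failure mode described here can be illustrated with plain JSSE types: SSLSession.getPeerCertificates() is declared to throw SSLPeerUnverifiedException when the peer's identity has not been verified, so code that intends to skip verification still has to guard the call. The sketch below is not the httpcomponents patch; the class and method names are hypothetical, and the stub session is built with a dynamic proxy purely so the example runs without a network connection.

```java
import java.lang.reflect.Proxy;
import java.security.cert.Certificate;
import javax.net.ssl.SSLPeerUnverifiedException;
import javax.net.ssl.SSLSession;

// Hypothetical sketch of guarding the peer-certificate fetch when verification is disabled.
public class PeerCertGuardSketch {

  // Returns the peer certificates, or an empty array if the peer is unverified,
  // instead of letting the exception abort the connection.
  static Certificate[] peerCertificatesOrEmpty(SSLSession session) {
    try {
      return session.getPeerCertificates();
    } catch (SSLPeerUnverifiedException e) {
      return new Certificate[0];
    }
  }

  // Stub session that behaves like an unverified peer; every other method is unsupported.
  static SSLSession unverifiedSession() {
    return (SSLSession) Proxy.newProxyInstance(
        PeerCertGuardSketch.class.getClassLoader(),
        new Class<?>[] {SSLSession.class},
        (proxy, method, args) -> {
          if (method.getName().equals("getPeerCertificates")) {
            throw new SSLPeerUnverifiedException("peer not authenticated");
          }
          throw new UnsupportedOperationException(method.getName());
        });
  }
}
```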
[jira] [Commented] (CONNECTORS-579) RSS connector: Add untrusted, unverified SSL support
[ https://issues.apache.org/jira/browse/CONNECTORS-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530218#comment-13530218 ]

Karl Wright commented on CONNECTORS-579:

I take it you redacted the host names?

Anyhow, it is properly fetching the feed, and deciding it is a feed. It's not clear that parsing the feed is yielding any documents, however. It could be feed formatting, or maybe a currently unsupported standard. Can you fetch the feed yourself with curl, and attach it to this ticket? If you want to redact machine names, that is fine, but leave something in the document so I know you've redacted something at that point.

Thanks!
Karl

RSS connector: Add untrusted, unverified SSL support
---
Key: CONNECTORS-579
URL: https://issues.apache.org/jira/browse/CONNECTORS-579
Project: ManifoldCF
Issue Type: Improvement
Components: RSS connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
[jira] [Resolved] (CONNECTORS-579) RSS connector: Add untrusted, unverified SSL support
[ https://issues.apache.org/jira/browse/CONNECTORS-579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-579.

Resolution: Fixed

Sounds like it is working as designed.

RSS connector: Add untrusted, unverified SSL support
---
Key: CONNECTORS-579
URL: https://issues.apache.org/jira/browse/CONNECTORS-579
Project: ManifoldCF
Issue Type: Improvement
Components: RSS connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530360#comment-13530360 ]

Karl Wright commented on CONNECTORS-578:

First thing you need to do is locate the correct pid, which from your procmon dump may be 14508. But you can get it from Task Manager as well.

Then, you invoke jstack.exe from the SAME JDK that you are running the JVM from, using the Command Prompt. For example:

{code}
C:\wip\mcf\trunk>"c:\Program Files\Java\jdk1.6.0_21\bin\jstack.exe"
Usage:
    jstack [-l] <pid>
        (to connect to running process)

Options:
    -l  long listing. Prints additional information about locks
    -h or -help to print this help message

C:\wip\mcf\trunk>
{code}

Now, use jstack with the proper pid:

{code}
C:\wip\mcf\trunk>"c:\Program Files\Java\jdk1.6.0_21\bin\jstack.exe" 14508
14508: no such process

C:\wip\mcf\trunk>
{code}

If you have the right pid, a whole pile of stuff will dump out. Make sure that that happens. Then, when you are sure, do it again and capture it like this:

{code}
C:\wip\mcf\trunk>"c:\Program Files\Java\jdk1.6.0_21\bin\jstack.exe" 14508 2>stderr.dmp 1>stdout.dmp

C:\wip\mcf\trunk>
{code}

Then, send me the output of whichever file has significant output.

Livelink connector needs access to general metadata
---
Key: CONNECTORS-578
URL: https://issues.apache.org/jira/browse/CONNECTORS-578
Project: ManifoldCF
Issue Type: New Feature
Components: LiveLink connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
Attachments: procmon.xlsx
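For comparison, roughly the same information can be gathered from inside a running JVM with the standard Thread.getAllStackTraces() call. The sketch below is not a replacement for jstack, which attaches to another process from the outside; it only helps when you can run code in the process you want to inspect. The class name is hypothetical and the output format is a loose imitation of a thread dump, not what jstack prints.

```java
import java.util.Map;

// Hypothetical sketch: prints a name, state, and stack for every live thread in this JVM.
public class InProcessThreadDump {

  static String dump() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
      Thread t = e.getKey();
      sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
      for (StackTraceElement frame : e.getValue()) {
        sb.append("    at ").append(frame).append('\n');
      }
      sb.append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.print(dump());
  }
}
```

A thread that is spinning at 100% CPU will typically show up as RUNNABLE with the same frames in dump after dump, which is the pattern to look for in the jstack output as well.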
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530364#comment-13530364 ]

Karl Wright commented on CONNECTORS-578:

bq. You have to use jstack from the installed java bin directory. The documentation lied...

Sorry, no. It *must* be from the same jvm you used to fire up ManifoldCF. That is why you got a core dump.

Livelink connector needs access to general metadata
---
Key: CONNECTORS-578
URL: https://issues.apache.org/jira/browse/CONNECTORS-578
Project: ManifoldCF
Issue Type: New Feature
Components: LiveLink connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
Attachments: procmon.xlsx
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530412#comment-13530412 ]

Karl Wright commented on CONNECTORS-578:

I think it's getting an exception from LAPI, throwing an exception, and retrying. I'm putting in code to dump what that exception is. Please synch up and rebuild, and when you start ManifoldCF it should start spewing errors to standard out if I am correct. Please send me one of the traces.

Livelink connector needs access to general metadata
---
Key: CONNECTORS-578
URL: https://issues.apache.org/jira/browse/CONNECTORS-578
Project: ManifoldCF
Issue Type: New Feature
Components: LiveLink connector
Affects Versions: ManifoldCF 1.1
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 1.1
Attachments: procmon.xlsx, stacktrace-startup.txt, stacktrace.txt
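The diagnostic step described in this comment, printing the exception before the operation is retried so that a silent retry loop becomes visible on standard out, can be sketched generically as below. This is not the code that was checked in to the connector; the class name, method name, and the bounded attempt count and delay are all assumptions added for illustration.

```java
import java.util.function.Supplier;

// Hypothetical sketch: retry an operation, but dump each failure to standard out first.
public class LoggedRetrySketch {

  static <T> T callWithLoggedRetries(Supplier<T> operation, int maxAttempts, long delayMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    RuntimeException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return operation.get();
      } catch (RuntimeException e) {
        last = e;
        // Make the otherwise hidden failure visible before trying again.
        System.out.println("Attempt " + attempt + " failed:");
        e.printStackTrace(System.out);
        if (attempt < maxAttempts) {
          try {
            Thread.sleep(delayMillis);
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", ie);
          }
        }
      }
    }
    throw last;
  }
}
```

Bounding the attempts and pausing between them also keeps a persistent failure from pinning a CPU, which is the symptom reported earlier in this thread.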
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530498#comment-13530498 ] Karl Wright commented on CONNECTORS-578: I think I found the problem! Just checked in a fix to the branch. Sorry for the confusion. Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 Attachments: procmon.xlsx, stacktrace-startup.txt, stacktrace.txt

General metadata access requested for the livelink connector.

All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and an LLVALUE objectInfo (which is just for the returned data; it's one of those god-awful OpenText rec arrays, IIRC). On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree, but I only need a subset of that.

UI display name - prefix + dtree name
Name - ot_name
Description - ot_dcomment
Owner - ot_userid
Creation Date - ot_createdate
Creator - ot_createdby
Modified - ot_modifydate
Modified By - ot_modifiedby

For example, to get the dcomment out of the LLValue you would have to do this:

  LLValue elem = ( new LLValue() ).setAssocNotSet();
  String dcomment = elem.toString("DCOMMENT");

And that's how you extract the data out of the OpenText structure. It's kinda crappy: you have to know the OT db schema in advance. Now, since owner and creator are sometimes different, you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you'll have is a number). So I would check to see if the owner and the creator are the same; then I only have to make one call.

Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that LivelinkConnector.java returns. IIRC, you can insert the code around lines 1055-1160, after you get the cats and atts.

Attached is the GetObjectInfo description from the documentation:

GetObjectInfo
This function returns an Assoc value object containing information about the specified object.

C++ Function Prototype
  LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONG volumeID, LLLONG objectID, LLVALUE objectInfo );

Java/.NET Method Declaration
  public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo )

Input Parameters
  session - the session handle as returned by the SessionAllocEx function
  volumeID - the volume ID of the object. Specify 0 to identify the object using only the objectID value.
  objectID - the object ID of the object

Output Parameters
  objectInfo - a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes.

Remarks
For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object.

Followup questions/answers:

To make sure I understand this:
(1) There is no way to determine through LAPI the names of all available general attributes without having a specific item's ObjectInfo object, except by describing the dtree table via something like JDBC? (If this is in fact true then I'd prefer to just allow the user to enter names of the columns by hand.)
(2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name.
(3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id).
Please correct me if I'm incorrect in any way. Is there any different kind of additional processing you can envision? Or is lookup of a user name pretty much it?
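The ot_ prefixing and the "one lookup when owner and creator are the same" idea from the description can be sketched roughly as below. This is only an illustration and not the connector's code: LAPI is proprietary, so a plain Map stands in for the ObjectInfo Assoc and an IntFunction stands in for the extra call that turns a user id into a name. The class and method names are hypothetical, and the upper-case column names other than DCOMMENT are assumptions about the dtree schema.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.IntFunction;

/** Illustrative only: plain maps stand in for the LAPI ObjectInfo Assoc. */
final class GeneralMetadataSketch {
  /** Prefix suggested in the issue so names do not collide with document metadata. */
  static final String PREFIX = "ot_";

  /**
   * Builds the prefixed metadata map.
   * objectInfo is a hypothetical stand-in for the values GetObjectInfo returns;
   * userNameLookup is a hypothetical stand-in for resolving a user id to a name.
   */
  static Map<String, String> buildMetadata(Map<String, Object> objectInfo,
                                           IntFunction<String> userNameLookup) {
    Map<String, String> out = new LinkedHashMap<>();
    // Plain columns are copied under a lower-cased, prefixed name (assumed column names).
    for (String column : new String[] {"NAME", "DCOMMENT", "CREATEDATE", "MODIFYDATE"}) {
      Object value = objectInfo.get(column);
      if (value != null) {
        out.put(PREFIX + column.toLowerCase(Locale.ROOT), value.toString());
      }
    }
    // User columns hold numeric ids. Cache the lookups so that identical ids
    // (for example owner == creator) cost only one lookup.
    Map<Integer, String> cache = new HashMap<>();
    for (String column : new String[] {"USERID", "CREATEDBY"}) {
      Object value = objectInfo.get(column);
      if (value instanceof Integer) {
        String name = cache.computeIfAbsent((Integer) value, id -> userNameLookup.apply(id));
        out.put(PREFIX + column.toLowerCase(Locale.ROOT), name);
      }
    }
    return out;
  }
}
```

In a real connector the lookup function would wrap whatever LAPI call resolves the user; the cache simply keeps the number of those calls at one when both ids match.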
[jira] [Created] (CONNECTORS-586) Maven build does not work with Java 7
Karl Wright created CONNECTORS-586: -- Summary: Maven build does not work with Java 7 Key: CONNECTORS-586 URL: https://issues.apache.org/jira/browse/CONNECTORS-586 Project: ManifoldCF Issue Type: Bug Components: Build Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.2 According to one user, the native2ascii version ManifoldCF is using does not work under all current versions of Java. In particular: {code} [ERROR] Failed to execute goal org.codehaus.mojo:native2ascii-maven-plugin:1.0-alpha-1:native2ascii (native2ascii-utf8) on project mcf-ui-core: Execution native2ascii-utf8 of goal org.codehaus.mojo:native2ascii-maven-plugin:1.0-alpha-1:native2ascii failed: Error starting Sun's native2ascii: sun.tools.native2ascii.Main - [Help 1] {code} It appears that there is a dependency on a Sun/Oracle class that does not exist in Java 7? We should figure out how to get the same thing to work using a more modern plugin. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
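For context, the conversion that the failing plugin delegates to Sun's tool is itself small: characters outside ASCII are rewritten as `\uXXXX` escapes. The sketch below shows that escaping in plain Java, with no dependency on sun.tools.native2ascii.Main. It is not a proposed replacement for the plugin; the class name is hypothetical, and the choice of lower-case hex digits is an assumption rather than something stated in the ticket.

```java
/** Illustrative only: ASCII-escapes a string the way a native2ascii-style tool would. */
final class Native2AsciiSketch {
  static String escape(String input) {
    StringBuilder sb = new StringBuilder(input.length());
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (c < 128) {
        // ASCII characters pass through unchanged.
        sb.append(c);
      } else {
        // Everything else becomes a backslash, the letter u, and four hex digits.
        String hex = Integer.toHexString(c);
        sb.append('\\').append('u');
        for (int pad = hex.length(); pad < 4; pad++) {
          sb.append('0');
        }
        sb.append(hex);
      }
    }
    return sb.toString();
  }
}
```

Characters outside the Basic Multilingual Plane come out as two escapes (one per surrogate), since the loop works on UTF-16 code units.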
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531056#comment-13531056 ] Karl Wright commented on CONNECTORS-578: Hi David, You will need to place your version of lapi.jar in the connectors/livelink/lib-proprietary directory before building. So place it there now, and do the following: ant clean build Then, try again - it should automatically copy lapi.jar to the connector-lib-proprietary directory (and if it doesn't you've done something wrong). Thanks! Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 Attachments: procmon.xlsx, stacktrace-startup.txt, stacktrace.txt
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531151#comment-13531151 ] Karl Wright commented on CONNECTORS-578: Exception has been fixed in branch. Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 Attachments: procmon.xlsx, stacktrace-startup.txt, stacktrace.txt
[jira] [Commented] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531312#comment-13531312 ] Karl Wright commented on CONNECTORS-578: r1421421 Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 Attachments: procmon.xlsx, stacktrace-startup.txt, stacktrace.txt
[jira] [Resolved] (CONNECTORS-578) Livelink connector needs access to general metadata
[ https://issues.apache.org/jira/browse/CONNECTORS-578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-578. Resolution: Fixed Livelink connector needs access to general metadata --- Key: CONNECTORS-578 URL: https://issues.apache.org/jira/browse/CONNECTORS-578 Project: ManifoldCF Issue Type: New Feature Components: LiveLink connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 Attachments: procmon.xlsx, stacktrace-startup.txt, stacktrace.txt General metadata access requested for the livelink connector. All you have to do is call GetObjectInfo with the volumeID (use zero), the docID, and a LLVALUE objectInfo (which is just for the returned data; it's one of those god awful OpenText rec arrays IIRC) On second thought: let me verify this with a code example; but I'm 90% sure this should work. GetObjectInfo should return (in objectInfo) all the data in dtree but I only need a subset of that. UI display name - prefix + dtree name Name - ot_name Description - ot_dcomment Owner - ot_userid Creation Date - ot_createdate Creator - ot_createdby Modified - ot_modifydate Modified By - ot_modifiedby For example, to get the dcomment out of the LLValue You would have to do this: LLValue elem = ( new LLValue() ).setAssocNotSet(); String dcomment = elem.toString(“DCOMMENT”); And that’s how you extract the data out of the opentext structure. It’s kinda crappy, you have to know the OT db schema in advance. Now since owner and creator are sometimes different you would have to make at most 2 additional calls to GetObjectInfo to find out the OpenText user name (because all you’ll have is a number). So I would check to see if the owner and the creator are the same then I only have to make one call. Listing the general attributes: I would suggest prefixing them with ot_ (for OpenText) so the values don't interfere with the general data on the document itself. 
Attached is a picture of the general tab in OpenText. The data itself resides in the dtree table. And just append it to the array or map that the LivelinkConnector.java returns. IIRC, you can insert the code around line 1055-1160 after you get the cats and atts. Attached is the GetObjectInfo description from the documentation: GetObjectInfo This function returns an Assoc value object containing information about the specified object. C++ Function Prototype LLSTATUS LL_GetObjectInfo( LLSESSION session, LLLONGvolumeID, LLLONGobjectID, LLVALUE objectInfo ); Java/.NET Method Declaration public int GetObjectInfo( int volumeID, int objectID, LLValue objectInfo ) Input Parameters session the session handle as returned by the SessionAllocEx function volumeIDthe volume ID of the object. Specify 0 to identify the object using only the objectID value. objectID the object ID of the object Output Parameters objectInfo a value object of type Assoc, initialized using the Value API, containing the attributes for the new object. For more information, see ObjectInfo Attributes. Remarks For Livelink servers using complex attributes, the category field of the objectInfo Assoc is undefined. This function does not retrieve attribute definitions. Any LLVALUE initialized using the Value API can be passed as the objectInfo output parameter. GetObjectInfo assigns the appropriate type and value to it. This function can be performed on any type of object. Followup questions/answers: To make sure I understand this: (1) There is no way to determine through LAPI the names of all available general attributes without having a specific item’s ObjectInfo object except by describing the dtree table – via something like JDBC? (If this is in fact true then I’d prefer to just allow the user to enter names of the columns by hand.) (2) The general attribute info is found, for any specific object, in its associated ObjectInfo structure, and can be retrieved by name. 
(3) SOME general attribute data will require additional processing (e.g. lookup of the actual user name given the user id). Please correct me if I'm incorrect in any way. Is there any different kind of additional processing you can envision? Or is lookup of a user name pretty much it? (1) Correct. You could also cheat and just look at the returned LLValue in the debugger and see all the dtree column values. (2)
[jira] [Commented] (CONNECTORS-587) Seeding end time never recorded in job record on manual job start
[ https://issues.apache.org/jira/browse/CONNECTORS-587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13531556#comment-13531556 ] Karl Wright commented on CONNECTORS-587: r1421570 Seeding end time never recorded in job record on manual job start - Key: CONNECTORS-587 URL: https://issues.apache.org/jira/browse/CONNECTORS-587 Project: ManifoldCF Issue Type: Bug Components: Framework agents process Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The jobs table's lastchecktime field is never updated, except when the SeedingThread does the seeding. This means that subsequent addSeedDocuments() method calls always receive a start time of 0.
[jira] [Commented] (CONNECTORS-584) Crawling with MySQL does not use indexes for order-by on critical queries
[ https://issues.apache.org/jira/browse/CONNECTORS-584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13531626#comment-13531626 ] Karl Wright commented on CONNECTORS-584: OK, looks like all is now working. Closing this ticket. Crawling with MySQL does not use indexes for order-by on critical queries - Key: CONNECTORS-584 URL: https://issues.apache.org/jira/browse/CONNECTORS-584 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The following ORDER-BY query is taking a long time on MySQL: {code} # Time: 121204 16:25:40 # User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1] # Query_time: 7.240532 Lock_time: 0.000204 Rows_sent: 1200 Rows_examined: 611091 SET timestamp=1354605940; SELECT t0.id,t0.jobid,t0.dochash,t0.docid,t0.status,t0.failtime,t0.failcount,t0.priorityset FROM jobqueue t0 WHERE t0.status IN ('P','G') AND t0.checkaction='R' AND t0.checktime=1354605932817 AND EXISTS(SELECT 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND t1.id=t0.jobid AND t1.priority=5) AND NOT EXISTS(SELECT 'x' FROM jobqueue t2 WHERE t2.dochash=t0.dochash AND t2.status IN ('A','F','a','f','D','d') AND t2.jobid!=t0.jobid) AND NOT EXISTS(SELECT 'x' FROM prereqevents t3,events t4 WHERE t0.id=t3.owner AND t3.eventname=t4.name) ORDER BY t0.docpriority ASC,t0.status ASC,t0.checkaction ASC,t0.checktime ASC LIMIT 1200; # Time: 121204 16:25:44 # User@Host: manifoldcf[manifoldcf] @ localhost [127.0.0.1] # Query_time: 3.064339 Lock_time: 0.84 Rows_sent: 1 Rows_examined: 406359 SET timestamp=1354605944; SELECT docpriority,jobid,dochash,docid FROM jobqueue t0 WHERE status IN ('P','G') AND checkaction='R' AND checktime=1354605932817 AND EXISTS(SELECT 'x' FROM jobs t1 WHERE t1.status IN ('A','a') AND t1.id=t0.jobid) ORDER BY docpriority ASC,status ASC,checkaction ASC,checktime ASC LIMIT 1; --- {code} I wonder if the queries appropriately use 
the table's indexes. Running EXPLAIN against the slow query showed a filesort. There seem to be conditions under which MySQL does not use an index for ORDER BY:
- ORDER BY is executed against multiple keys
- the keys used to select records differ from the keys used by ORDER BY
Since a filesort was happening, the full scan of the records is probably what makes MCF slow. Do you think this could also happen in PostgreSQL or HSQLDB? Do you think the queries could be modified to use the indexes appropriately?
[jira] [Resolved] (CONNECTORS-543) Test needed for combined war
[ https://issues.apache.org/jira/browse/CONNECTORS-543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-543. Resolution: Fixed Test needed for combined war Key: CONNECTORS-543 URL: https://issues.apache.org/jira/browse/CONNECTORS-543 Project: ManifoldCF Issue Type: Improvement Components: Tests Affects Versions: ManifoldCF 1.0 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 We need a test that uses the combined war for crawling.
[jira] [Commented] (CONNECTORS-543) Test needed for combined war
[ https://issues.apache.org/jira/browse/CONNECTORS-543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532203#comment-13532203 ] Karl Wright commented on CONNECTORS-543: r1421775 Test needed for combined war Key: CONNECTORS-543 URL: https://issues.apache.org/jira/browse/CONNECTORS-543 Project: ManifoldCF Issue Type: Improvement Components: Tests Affects Versions: ManifoldCF 1.0 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 We need a test that uses the combined war for crawling.
[jira] [Commented] (CONNECTORS-588) ManifoldCFQParser dead locking Solr Core loading.
[ https://issues.apache.org/jira/browse/CONNECTORS-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13533730#comment-13533730 ] Karl Wright commented on CONNECTORS-588: I've looked at the thread dump. First, it is not a traditional deadlock, at least as far as the JVM is concerned. None of the threads are in the ManifoldCFQParserPlugin code either. So if your thread dump was truly captured when your system was deadlocked, it is not the plugin that is doing this. There is a warmup query that is active, and here's the appropriate bit of thread dump: {code} at org.apache.solr.spelling.SpellCheckCollator.collate(SpellCheckCollator.java:112) at org.apache.solr.handler.component.SpellCheckComponent.addCollationsToResponse(SpellCheckComponent.java:203) at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:180) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:206) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1699) {code} As you see, it's in the spellcheckcomponent. No idea what it's doing and still no indication that it is deadlocked. ManifoldCFQParser dead locking Solr Core loading. - Key: CONNECTORS-588 URL: https://issues.apache.org/jira/browse/CONNECTORS-588 Project: ManifoldCF Issue Type: Bug Components: Solr-4.x-component Affects Versions: ManifoldCF 1.0.1 Environment: Solr 4.0 or Solr 4.0 BETA, ManifoldCF 1.0.1 Reporter: Sampo Saarela Attachments: solrconfig.xml, td.log The exact requirements to reproduce the bug are not really clear but at least these are required: Some data in the Solr index with access tokens set. Seems easier to reproduce with at least 1000 documents in the index. Solr QueryParser enabled. Default warm up queries enabled in the SolrConfig.xml At the Solr Core loading when it tries to perform the warm up queries the loading dead locks. 
After this, Solr is deadlocked and the admin interface is locked as well. This can be worked around by enabling ColdStart in SolrConfig, so that the core does not try to warm itself up before all the initialization is done in the ManifoldCFQParser.
[jira] [Commented] (CONNECTORS-588) ManifoldCFQParser dead locking Solr Core loading.
[ https://issues.apache.org/jira/browse/CONNECTORS-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13533732#comment-13533732 ] Karl Wright commented on CONNECTORS-588: My recommendation is to post this information to the solr users list and see what they make of it. I think if the ManifoldCFQParserPlugin is involved, it is only obliquely. In other words, it might be simply helping Solr hurt itself, and for that purpose any plugin will do, not merely ours. ManifoldCFQParser dead locking Solr Core loading. - Key: CONNECTORS-588 URL: https://issues.apache.org/jira/browse/CONNECTORS-588 Project: ManifoldCF Issue Type: Bug Components: Solr-4.x-component Affects Versions: ManifoldCF 1.0.1 Environment: Solr 4.0 or Solr 4.0 BETA, ManifoldCF 1.0.1 Reporter: Sampo Saarela Attachments: solrconfig.xml, td.log
[jira] [Comment Edited] (CONNECTORS-588) ManifoldCFQParser dead locking Solr Core loading.
[ https://issues.apache.org/jira/browse/CONNECTORS-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13533732#comment-13533732 ] Karl Wright edited comment on CONNECTORS-588 at 12/17/12 8:45 AM: -- My recommendation is to post this information to the solr users list and see what they make of it. I think if the ManifoldCFQParserPlugin is involved, it is only obliquely. In other words, it might be simply helping Solr hurt itself, and for that purpose any plugin will do, not only ours. was (Author: kwri...@metacarta.com): My recommendation is to post this information to the solr users list and see what they make of it. I think if the ManifoldCFQParserPlugin is involved, it is only obliquely. In other words, it might be simply helping Solr hurt itself, and for that purpose any plugin will do, not merely ours. ManifoldCFQParser dead locking Solr Core loading. - Key: CONNECTORS-588 URL: https://issues.apache.org/jira/browse/CONNECTORS-588 Project: ManifoldCF Issue Type: Bug Components: Solr-4.x-component Affects Versions: ManifoldCF 1.0.1 Environment: Solr 4.0 or Solr 4.0 BETA, ManifoldCF 1.0.1 Reporter: Sampo Saarela Attachments: solrconfig.xml, td.log
[jira] [Commented] (CONNECTORS-588) ManifoldCFQParser dead locking Solr Core loading.
[ https://issues.apache.org/jira/browse/CONNECTORS-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13533852#comment-13533852 ] Karl Wright commented on CONNECTORS-588: I understand; but unless you have a trace indicating that a thread somewhere is stuck in the ManifoldCFQParserPlugin, it's hard to see how our code is directly involved. ManifoldCFQParser dead locking Solr Core loading. - Key: CONNECTORS-588 URL: https://issues.apache.org/jira/browse/CONNECTORS-588 Project: ManifoldCF Issue Type: Bug Components: Solr-4.x-component Affects Versions: ManifoldCF 1.0.1 Environment: Solr 4.0 or Solr 4.0 BETA, ManifoldCF 1.0.1 Reporter: Sampo Saarela Attachments: solrconfig.xml, td.log
[jira] [Resolved] (CONNECTORS-588) ManifoldCFQParser dead locking Solr Core loading.
[ https://issues.apache.org/jira/browse/CONNECTORS-588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-588. Resolution: Invalid Fix Version/s: ManifoldCF 1.1 Assignee: Karl Wright Apparently not a ManifoldCF issue; will reopen if data points the other way. ManifoldCFQParser dead locking Solr Core loading. - Key: CONNECTORS-588 URL: https://issues.apache.org/jira/browse/CONNECTORS-588 Project: ManifoldCF Issue Type: Bug Components: Solr-4.x-component Affects Versions: ManifoldCF 1.0.1 Environment: Solr 4.0 or Solr 4.0 BETA, ManifoldCF 1.0.1 Reporter: Sampo Saarela Assignee: Karl Wright Fix For: ManifoldCF 1.1 Attachments: solrconfig.xml, td.log The exact requirements to reproduce the bug are not really clear but at least these are required: Some data in the Solr index with access tokens set. Seems easier to reproduce with at least 1000 documents in the index. Solr QueryParser enabled. Default warm up queries enabled in the SolrConfig.xml At the Solr Core loading when it tries to perform the warm up queries the loading dead locks. After this Solr is in dead lock and the admin interface is locked also. This can be worked around by enabling ColdStart from SolrConfig so the core does not try to warm itself up before all the init is done in the ManifoldCFQParser. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-589) For compatibility with IBM portal, feed parser should allow multiple links to be added to the queue, per entry
Karl Wright created CONNECTORS-589: -- Summary: For compatibility with IBM portal, feed parser should allow multiple links to be added to the queue, per entry Key: CONNECTORS-589 URL: https://issues.apache.org/jira/browse/CONNECTORS-589 Project: ManifoldCF Issue Type: Improvement Components: RSS connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The IBM portal apparently generates feeds that have multiple links per entry, as follows:
{code}
<link href="https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/entry" rel="self"></link>
<link href="https://[redacted]/files/app/file/1adf16d8-bbe4-4e70-be09-b002ce5cd816" rel="alternate" type="text/html"></link>
<link href="https://[redacted]/files/basic/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/entry" rel="edit"></link>
<link href="https://[redacted]/files/basic/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/media" rel="edit-media"></link>
<link href="https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/media/web.png" rel="enclosure" type="image/png" title="web.png" hreflang="en" length="4297"></link>
<link href="https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/thumbnail" rel="thumbnail"></link>
<category term="document" scheme="tag:ibm.com,2006:td/type" label="document"></category>
<link href="https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/feed" rel="replies" type="application/atom+xml" thr:count="0"
{code}
Right now, only the last link is processed. It would be better if all of them were processed. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-589) For compatibility with IBM portal, feed parser should allow multiple links to be added to the queue, per entry
[ https://issues.apache.org/jira/browse/CONNECTORS-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13536144#comment-13536144 ] Karl Wright commented on CONNECTORS-589: All link tags are being parsed and included. Before only one was. That is the only change to the RSS connector I made yesterday. So I honestly don't understand what you mean. For compatibility with IBM portal, feed parser should allow multiple links to be added to the queue, per entry -- Key: CONNECTORS-589 URL: https://issues.apache.org/jira/browse/CONNECTORS-589 Project: ManifoldCF Issue Type: Improvement Components: RSS connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The IBM portal apparently generates feeds that have multiple links per entry, as follows: {code} link href=https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/entry; rel=self/link link href=https://[redacted]/files/app/file/1adf16d8-bbe4-4e70-be09-b002ce5cd816; rel=alternate type=text/html/link link href=https://[redacted]/files/basic/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/entry; rel=edit/link link href=https://[redacted]/files/basic/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/media; rel=edit-media/link link href=https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/media/web.png; rel=enclosure type=image/png title=web.png hreflang=en length=4297/link link href=https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/thumbnail; rel=thumbnail/link category term=document scheme=tag:ibm.com,2006:td/type label=document/category link 
href=https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/feed; rel=replies type=application/atom+xml thr:count=0 {code} Right now, only the last link is processed. It would be better if all of them were processed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-589) For compatibility with IBM portal, feed parser should allow multiple links to be added to the queue, per entry
[ https://issues.apache.org/jira/browse/CONNECTORS-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13536178#comment-13536178 ] Karl Wright commented on CONNECTORS-589: Hi David, The links are presumed to point at independent documents, which we then subsequently fetch and send to Solr as individual independent documents. That is what the Atom spec means when it describes the link tag. If you were expecting a single document with lots of metadata for all the link tags, then I'm afraid you will want to review how RSS and Atom actually are supposed to work. For compatibility with IBM portal, feed parser should allow multiple links to be added to the queue, per entry -- Key: CONNECTORS-589 URL: https://issues.apache.org/jira/browse/CONNECTORS-589 Project: ManifoldCF Issue Type: Improvement Components: RSS connector Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The IBM portal apparently generates feeds that have multiple links per entry, as follows: {code} link href=https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/entry; rel=self/link link href=https://[redacted]/files/app/file/1adf16d8-bbe4-4e70-be09-b002ce5cd816; rel=alternate type=text/html/link link href=https://[redacted]/files/basic/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/entry; rel=edit/link link href=https://[redacted]/files/basic/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/media; rel=edit-media/link link href=https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/media/web.png; rel=enclosure type=image/png title=web.png hreflang=en length=4297/link link 
href=https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/thumbnail; rel=thumbnail/link category term=document scheme=tag:ibm.com,2006:td/type label=document/category link href=https://[redacted]/files/basic/anonymous/api/library/e82a8b7b-08ea-42b6-bc5b-094f2bd124a3/document/1adf16d8-bbe4-4e70-be09-b002ce5cd816/feed; rel=replies type=application/atom+xml thr:count=0 {code} Right now, only the last link is processed. It would be better if all of them were processed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
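The behaviour described in this thread (every link in an entry becomes its own document to fetch) can be illustrated with a minimal sketch. This is not the RSS connector's actual parser: the class `EntryLinks` and its `hrefs` method are hypothetical names, and the sketch uses the JDK's DOM parser on a single well-formed Atom entry, assuming the result would then be handed to whatever queues documents for fetching.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper for illustration; not part of the ManifoldCF RSS connector. */
final class EntryLinks {
    private EntryLinks() {}

    /** Returns the href of every link element in the entry, in document order. */
    static List<String> hrefs(String entryXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Feeds are untrusted input, so refuse DOCTYPE declarations.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(entryXml)));

            // Collect all link elements rather than keeping only the last one seen.
            NodeList links = doc.getElementsByTagName("link");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < links.getLength(); i++) {
                Element link = (Element) links.item(i);
                if (link.hasAttribute("href")) {
                    result.add(link.getAttribute("href"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse entry", e);
        }
    }
}
```

Each returned URL would be queued and indexed as an independent document, which matches the explanation given above; a caller that only wants some link types could filter on the rel attribute before queueing.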
[jira] [Created] (CONNECTORS-590) ManifoldCF sometimes throws an unexpected state exception under MySQL
Karl Wright created CONNECTORS-590: -- Summary: ManifoldCF sometimes throws an unexpected state exception under MySQL Key: CONNECTORS-590 URL: https://issues.apache.org/jira/browse/CONNECTORS-590 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Environment: MySQL 5.5.28, for Linux (x86_64), ManifoldCF 1.1-dev Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The following exception occurred under this setup:
{code}
2012/12/21 10:09:37 ERROR (Worker thread '78') - Exception tossed: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0
    at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:742)
    at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2438)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:765)
{code}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-590) ManifoldCF sometimes throws an unexpected state exception under MySQL
[ https://issues.apache.org/jira/browse/CONNECTORS-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537644#comment-13537644 ] Karl Wright commented on CONNECTORS-590: The only hypothesis that may make some sense is that we're seeing a reset due to either an unhandled database exception, or due to an error condition. The bug would be that the reset gets applied not only to documents that actually failed, but also to documents that had been already handed to active worker threads. I would expect there to be some kind of log warning if this took place however, within the last 10 minutes of the crawl before the bad state was seen. ManifoldCF sometimes throws an unexpected state exception under MySQL - Key: CONNECTORS-590 URL: https://issues.apache.org/jira/browse/CONNECTORS-590 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Environment: MySQL 5.5.28, for Linux (x86_64), ManifoldCF 1.1-dev Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The following exception occurred under this setup: {code} 2012/12/21 10:09:37 ERROR (Worker thread '78') - Exception tossed: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0 at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:742) at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2438) at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:765) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-590) ManifoldCF sometimes throws an unexpected state exception under MySQL
[ https://issues.apache.org/jira/browse/CONNECTORS-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13537655#comment-13537655 ] Karl Wright commented on CONNECTORS-590: It looks like the long-running query was the result of a job coming to an end. That, in turn, is because there are no active documents left for the job. The job documents' docpriorities are all cleared, and then the job is put into the inactive state:
{code}
// All the job's documents need to have their docpriority set to null, to clear dead wood out of the docpriority index.
// See CONNECTORS-290.
// We do this BEFORE updating the job state.
jobQueue.clearDocPriorities(jobID);
IJobDescription jobDesc = jobs.load(jobID,true);
modifiedJobs.add(jobDesc);
jobs.finishStopJob(jobID,timestamp);
{code}
The finishStopJob() method takes the job from the shutting-down state to the notification state.
ManifoldCF sometimes throws an unexpected state exception under MySQL - Key: CONNECTORS-590 URL: https://issues.apache.org/jira/browse/CONNECTORS-590 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Environment: MySQL 5.5.28, for Linux (x86_64), ManifoldCF 1.1-dev Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 The following exception occurred under this setup: {code} 2012/12/21 10:09:37 ERROR (Worker thread '78') - Exception tossed: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0 at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:742) at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2438) at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:765) {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-590) ManifoldCF sometimes throws an unexpected state exception under MySQL
[ https://issues.apache.org/jira/browse/CONNECTORS-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-590: --- Fix Version/s: (was: ManifoldCF 1.1) ManifoldCF 1.2 ManifoldCF sometimes throws an unexpected state exception under MySQL - Key: CONNECTORS-590 URL: https://issues.apache.org/jira/browse/CONNECTORS-590 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Environment: MySQL 5.5.28, for Linux (x86_64), ManifoldCF 1.1-dev Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.2 The following exception occurred under this setup: {code} 2012/12/21 10:09:37 ERROR (Worker thread '78') - Exception tossed: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0 at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:742) at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2438) at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:765) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-591) ElasticSearch has moved its download URL
[ https://issues.apache.org/jira/browse/CONNECTORS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539331#comment-13539331 ] Karl Wright commented on CONNECTORS-591: r1425688 ElasticSearch has moved its download URL Key: CONNECTORS-591 URL: https://issues.apache.org/jira/browse/CONNECTORS-591 Project: ManifoldCF Issue Type: Bug Components: Build Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 ElasticSearch download is no longer via github. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-591) ElasticSearch has moved its download URL
[ https://issues.apache.org/jira/browse/CONNECTORS-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-591. Resolution: Fixed ElasticSearch has moved its download URL Key: CONNECTORS-591 URL: https://issues.apache.org/jira/browse/CONNECTORS-591 Project: ManifoldCF Issue Type: Bug Components: Build Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 ElasticSearch download is no longer via github. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-592) Add last updated metadata to the web and jcifs connectors
Karl Wright created CONNECTORS-592: -- Summary: Add last updated metadata to the web and jcifs connectors Key: CONNECTORS-592 URL: https://issues.apache.org/jira/browse/CONNECTORS-592 Project: ManifoldCF Issue Type: Bug Components: JCIFS connector, Web connector Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 I would like to suggest new tiny functionality to Windows Share and Web connection. There are occasion that people want to search by latest update time of contents. It would be nice that MCF could transfer those update times to Solr so that Solr Cell could map them to index. For files in windows server, you can obtain update time through jcifs. For web, you can obtain update time from response header if it contains the date time. I'm attaching modified files for jcifs and web connection which implements to transfer update time as lastModified. You can search the modified code by Kobayashi. (For Windows share connection, the modification also includes code to avoid hidden files.) Thanks for taking your time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
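For the web side of this request, the update time comes from the HTTP Last-Modified response header when the server sends one. The following is a minimal sketch of turning that header value into a timestamp; `LastModifiedHeader` is a hypothetical name and this is not the code in the attached patch. It only handles the RFC 1123 date form, which `java.time` supports directly, and it returns an empty result when the header is absent or unparsable so that the caller can simply omit the metadata.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Optional;

/** Hypothetical helper for illustration; not taken from the web connector. */
final class LastModifiedHeader {
    private LastModifiedHeader() {}

    /** Parses a Last-Modified header value, or returns empty if it is missing or malformed. */
    static Optional<Instant> parse(String headerValue) {
        if (headerValue == null || headerValue.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            ZonedDateTime parsed =
                ZonedDateTime.parse(headerValue.trim(), DateTimeFormatter.RFC_1123_DATE_TIME);
            return Optional.of(parsed.toInstant());
        } catch (DateTimeParseException e) {
            // Servers occasionally send non-standard dates; treat those as "no update time".
            return Optional.empty();
        }
    }
}
```

The resulting instant could then be attached to the document as a lastModified metadata value, as the description proposes; how it is formatted for Solr is left to the output side.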
[jira] [Updated] (CONNECTORS-592) Add last updated metadata to the web and jcifs connectors
[ https://issues.apache.org/jira/browse/CONNECTORS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-592: --- Attachment: connectors.zip Add last updated metadata to the web and jcifs connectors --- Key: CONNECTORS-592 URL: https://issues.apache.org/jira/browse/CONNECTORS-592 Project: ManifoldCF Issue Type: Bug Components: JCIFS connector, Web connector Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 Attachments: connectors.zip I would like to suggest new tiny functionality to Windows Share and Web connection. There are occasion that people want to search by latest update time of contents. It would be nice that MCF could transfer those update times to Solr so that Solr Cell could map them to index. For files in windows server, you can obtain update time through jcifs. For web, you can obtain update time from response header if it contains the date time. I'm attaching modified files for jcifs and web connection which implements to transfer update time as lastModified. You can search the modified code by Kobayashi. (For Windows share connection, the modification also includes code to avoid hidden files.) Thanks for taking your time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-592) Add last updated metadata to the web and jcifs connectors
[ https://issues.apache.org/jira/browse/CONNECTORS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539520#comment-13539520 ] Karl Wright commented on CONNECTORS-592: r1425886 Add last updated metadata to the web and jcifs connectors --- Key: CONNECTORS-592 URL: https://issues.apache.org/jira/browse/CONNECTORS-592 Project: ManifoldCF Issue Type: Bug Components: JCIFS connector, Web connector Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 Attachments: connectors.zip I would like to suggest new tiny functionality to Windows Share and Web connection. There are occasion that people want to search by latest update time of contents. It would be nice that MCF could transfer those update times to Solr so that Solr Cell could map them to index. For files in windows server, you can obtain update time through jcifs. For web, you can obtain update time from response header if it contains the date time. I'm attaching modified files for jcifs and web connection which implements to transfer update time as lastModified. You can search the modified code by Kobayashi. (For Windows share connection, the modification also includes code to avoid hidden files.) Thanks for taking your time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-590) ManifoldCF sometimes throws an unexpected state exception under MySQL
[ https://issues.apache.org/jira/browse/CONNECTORS-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539795#comment-13539795 ] Karl Wright commented on CONNECTORS-590: I created a branch, CONNECTORS-590, which has test code in it that should be able to print forensics when this error happens (when I've completed the code). Then we could see in a straightforward manner what is happening. The downside is that this requires keeping data structures around in memory which represent at least 10 minutes and probably more like an hour worth of database operations to the jobqueue table. So running out of memory may well be possible. ManifoldCF sometimes throws an unexpected state exception under MySQL - Key: CONNECTORS-590 URL: https://issues.apache.org/jira/browse/CONNECTORS-590 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Environment: MySQL 5.5.28, for Linux (x86_64), ManifoldCF 1.1-dev Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.2 The following exception occurred under this setup: {code} 2012/12/21 10:09:37 ERROR (Worker thread '78') - Exception tossed: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Unexpected jobqueue status - record id 1356045273314, expecting active status, saw 0 at org.apache.manifoldcf.crawler.jobs.JobQueue.updateCompletedRecord(JobQueue.java:742) at org.apache.manifoldcf.crawler.jobs.JobManager.markDocumentCompletedMultiple(JobManager.java:2438) at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:765) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
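The memory concern raised above can be bounded by discarding history older than a fixed window. The sketch below is an assumption about how such forensics might be structured, not the code on the CONNECTORS-590 branch: `JobQueueHistory`, `record`, and `operationsFor` are hypothetical names, and the operation is stored as a plain string for simplicity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical time-bounded log of jobqueue operations, for illustration only. */
final class JobQueueHistory {
    private static final class Event {
        final long recordId;
        final String operation;
        final long timestampMillis;

        Event(long recordId, String operation, long timestampMillis) {
            this.recordId = recordId;
            this.operation = operation;
            this.timestampMillis = timestampMillis;
        }
    }

    private final long retentionMillis;
    private final Deque<Event> events = new ArrayDeque<>();

    JobQueueHistory(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    /** Appends an operation and drops everything older than the retention window. */
    synchronized void record(long recordId, String operation, long nowMillis) {
        events.addLast(new Event(recordId, operation, nowMillis));
        long cutoff = nowMillis - retentionMillis;
        while (!events.isEmpty() && events.peekFirst().timestampMillis < cutoff) {
            events.removeFirst();
        }
    }

    /** Returns the retained operations for one record, oldest first. */
    synchronized List<String> operationsFor(long recordId) {
        List<String> result = new ArrayList<>();
        for (Event e : events) {
            if (e.recordId == recordId) {
                result.add(e.operation);
            }
        }
        return result;
    }

    synchronized int size() {
        return events.size();
    }
}
```

When the unexpected-status exception fires, a dump of `operationsFor(recordId)` would show what touched that row recently. Memory use still grows with the operation rate inside the window, so a one-hour retention on a busy crawl may remain expensive.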
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13539925#comment-13539925 ] Karl Wright commented on CONNECTORS-594: I'll have a look at this shortly. In the future, the best way to submit patches is to set up an SVN workarea, based on trunk, and make your changes there, including adding new files with svn add. Then, you can just do: svn diff > CONNECTORS-594.patch ... and then it is simple to read, and to apply. For now, I will try to do the same thing here and commit it into a work branch, branches/CONNECTORS-594. Then everyone can have a look at it. SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Priority: Minor Fix For: ManifoldCF next Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13540254#comment-13540254 ] Karl Wright commented on CONNECTORS-594: I downloaded the dependencies and built the connector. There's an immediate difficulty, though. Here are the dependencies of SolrJ at the moment:
12/27/2012 08:10 PM 163,151 commons-io-2.1.jar
12/27/2012 08:10 PM 352,585 httpclient-4.1.3.jar
12/27/2012 08:10 PM 181,410 httpcore-4.1.4.jar
12/27/2012 08:10 PM 26,938 httpmime-4.1.3.jar
12/27/2012 08:10 PM 17,308 jcl-over-slf4j-1.6.4.jar
12/27/2012 08:10 PM 20,639 log4j-over-slf4j-1.6.4.jar
12/27/2012 08:10 PM 25,962 slf4j-api-1.6.4.jar
12/27/2012 08:10 PM 8,887 slf4j-jdk14-1.6.4.jar
12/27/2012 08:10 PM 520,969 wstx-asl-3.2.7.jar
12/27/2012 08:10 PM 608,239 zookeeper-3.3.6.jar
The versions of httpclient and httpcore we require elsewhere in ManifoldCF are 4.2.3 for trunk, which is a newer release line (4.2.x) than the 4.1.x jars SolrJ uses. This may or may not be a problem, but we'll have to either confirm that the code will work with this upgrade, or isolate SolrJ and its dependencies using a classloader strategy.
SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Priority: Minor Fix For: ManifoldCF next Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13540263#comment-13540263 ] Karl Wright commented on CONNECTORS-594: Ok, I had a first glance at the code itself. There are a number of problems in it, notably exception handling and dealing properly with session setup vs. connection. These, however, are minor enough that I could correct them myself over time. The biggest problem is the potentially incompatible jar dependencies. However, am I correct that basically all you have done is adapt the Solr connector to use the Solr-j library, and add additional parameters and dependencies as required by that library? If that is true, then we need to decide what the right approach should be in dealing with all variants of Solr. The possible approaches are:
- Have a separate connector for Solr 3 and Solr 4 and SolrCloud, as is the case now
- Have one connector that allows you to select which Solr variant to use
The second approach is arguably better for a number of reasons. Specifically, each variant could class-load its own version of Solr-j, and the required dependencies. I'll look into this possibility in more detail and post what I find.
SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Priority: Minor Fix For: ManifoldCF next Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
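The idea of each Solr variant class-loading its own SolrJ and dependencies can be sketched with a plain `URLClassLoader` per jar directory. This is only an illustration under stated assumptions: `VariantLoaders` and `loaderFor` are hypothetical names, the one-directory-per-variant layout is assumed, and nothing here reflects how ManifoldCF actually loads connector classes.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not ManifoldCF's class-loading mechanism. */
final class VariantLoaders {
    private VariantLoaders() {}

    /**
     * Builds a class loader over every jar in libDir (assumed to hold one Solr
     * variant's SolrJ plus its dependencies).
     *
     * URLClassLoader asks its parent first, so the parent passed in must not
     * already expose a conflicting httpclient or httpcore; otherwise the
     * parent's copy wins and nothing is isolated.
     */
    static URLClassLoader loaderFor(Path libDir, ClassLoader parent) throws IOException {
        List<URL> urls = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : jars) {
                urls.add(jar.toUri().toURL());
            }
        }
        return new URLClassLoader(urls.toArray(new URL[0]), parent);
    }
}
```

A connector could keep one such loader per selected variant and obtain its client classes through it. The cost of this design is that objects from different loaders are not type-compatible with each other, so the boundary between the connector and the loaded library has to go through interfaces the parent loader defines.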
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13540396#comment-13540396 ] Karl Wright commented on CONNECTORS-594: I was able to confirm that SolrJ 4.0.0 at least builds against the httpcomponents 4.2.x family of releases. So this opens the possibility that, if SolrJ 4.0.0 is capable of communicating with earlier solr versions such as 3.x and 1.4, we might be able to get away with a lot less work. But I can find no resources anywhere that indicate that solrj 4 is in any way interoperable with earlier solr versions, so I am asking on the lucene dev list about this. SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Priority: Minor Fix For: ManifoldCF next Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13540779#comment-13540779 ] Karl Wright commented on CONNECTORS-594: Thanks Ryan - this is just what I needed. I'll modify the branch to use a mixed set of downloaded dependencies (too bad SolrJ 4.0.0 final does not seem to have been pushed into the Maven repo yet though. ;-) ) Then we can see if the solrcloud connector works for both a 3.x release and a 4.x release. SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Priority: Minor Fix For: ManifoldCF next Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-594: -- Assignee: Karl Wright SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 1.1 Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-594: --- Fix Version/s: (was: ManifoldCF next) SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Priority: Minor Fix For: ManifoldCF 1.1 Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-594: --- Fix Version/s: ManifoldCF 1.1 SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Priority: Minor Fix For: ManifoldCF 1.1, ManifoldCF next Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541557#comment-13541557 ] Karl Wright commented on CONNECTORS-594:

Ok, I've updated the Solr connector in branches/CONNECTORS-594 to use SolrJ. There are, however, a couple of outstanding questions.

(1) What is the right way to interpret UpdateResponse from an add or delete operation? Basically, ManifoldCF receives either an UpdateResponse, a SolrServerException, or an IOException. From that, it needs to decide to classify the response as one of the following:
- Document was successfully accepted and indexed
- Document failed, but might be accepted if tried later
- Document was rejected, and that will not change if retried in the future
- Document failed sufficiently badly that the whole job should be stopped immediately

(2) Prior to using SolrJ, the Solr Connector client required a user to specify the URLs for the following:
- Index (e.g. /update/extract)
- Deletion (e.g. /update)
- Status check (e.g. /admin/ping)

It now seems like only one of these needs to be specified - the index URL. Why don't I have to specify the others anymore? Are they permanently fixed? I thought that they could be moved easily by changing solrconfig.xml. How does SolrJ deal with that?

If I can get answers to these questions, I think I can quickly finish the SolrJ work up and commit the revised connector to trunk.

SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 1.1 Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
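The four-way classification asked about in question (1) of the comment above could be sketched roughly as follows. This is only an illustration, not the logic that ended up in the connector: the class name, the enum, and the choice to key the decision on an HTTP status code (assumed to have been extracted from the Solr reply by whatever means) are all hypothetical, and so is the particular mapping of codes to outcomes.

```java
import java.io.IOException;

// Hypothetical sketch: maps the result of an add/delete call onto the four
// outcomes listed in the comment. Names and thresholds are assumptions.
final class IndexOutcomeSketch {

    enum Outcome {
        ACCEPTED,     // document was accepted and indexed
        RETRY_LATER,  // failed, but might be accepted if tried later
        REJECTED,     // rejected, and retrying will not change that
        ABORT_JOB     // bad enough that the whole job should stop
    }

    // Assumed mapping from an HTTP status code to an outcome.
    static Outcome fromHttpStatus(int status) {
        if (status >= 200 && status < 300) {
            return Outcome.ACCEPTED;
        }
        if (status == 401 || status == 403) {
            // Assumption: bad credentials affect every document, so stop the job.
            return Outcome.ABORT_JOB;
        }
        if (status >= 400 && status < 500) {
            // Assumption: other client errors are specific to this document.
            return Outcome.REJECTED;
        }
        // Assumption: server-side errors (5xx) and anything else may be transient.
        return Outcome.RETRY_LATER;
    }

    // Assumed mapping for the exception path: I/O trouble is treated as
    // transient, anything else as a reason to stop.
    static Outcome fromException(Exception e) {
        return (e instanceof IOException) ? Outcome.RETRY_LATER : Outcome.ABORT_JOB;
    }
}
```

Under these assumptions, `IndexOutcomeSketch.fromHttpStatus(503)` would yield `RETRY_LATER`, while a 400 for a single malformed document would yield `REJECTED`. Where exactly the lines should be drawn is the open question the comment raises.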
[jira] [Created] (CONNECTORS-595) Solr 4.x plugin fails tests against solr-lucene 4.0.0 final
Karl Wright created CONNECTORS-595: -- Summary: Solr 4.x plugin fails tests against solr-lucene 4.0.0 final Key: CONNECTORS-595 URL: https://issues.apache.org/jira/browse/CONNECTORS-595 Project: ManifoldCF Issue Type: Bug Components: Solr-4.x-component Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Priority: Blocker Fix For: ManifoldCF 1.1

When build is updated to point to the lucene-solr 4.0.0 tag, the plugin tests fail as follows:

{code}
[junit4:junit4] JUnit4 says hello! Master seed: 76514493C8D9B3A8
[junit4:junit4] Your default console's encoding may not display certain unicode glyphs: windows-1252
[junit4:junit4] Executing 3 suites with 1 JVM.
[junit4:junit4]
[junit4:junit4] Suite: org.apache.solr.mcf.ManifoldCFSCLoadTest
[junit4:junit4] 2 225 T10 oas.SolrTestCaseJ4.deleteCore ###deleteCore
[junit4:junit4] 2 NOTE: test params are: codec=Lucene3x, sim=RandomSimilarityProvider(queryNorm=false,coord=no): {}, locale=en, timezone=America/Indiana/Vincennes
[junit4:junit4] 2 NOTE: Windows Vista 6.0 x86/Sun Microsystems Inc. 1.6.0_21 (32-bit)/cpus=2,threads=1,free=10125792,total=16252928
[junit4:junit4] 2 NOTE: All tests run in this JVM: [ManifoldCFSCLoadTest]
[junit4:junit4] 2 NOTE: reproduce with: ant test -Dtestcase=ManifoldCFSCLoadTest -Dtests.seed=76514493C8D9B3A8 -Dtests.slow=true -Dtests.locale=en -Dtests.timezone=America/Indiana/Vincennes -Dtests.file.encoding=US-ASCII
[junit4:junit4] ERROR 0.00s | ManifoldCFSCLoadTest (suite)
[junit4:junit4] Throwable #1: java.lang.RuntimeException: Cannot find resource: C:\wip\mcf-integration\solr-4.x\trunk\solr\solr\build\contrib\solr-mcf\test\J0\solr\collection1
[junit4:junit4]    at __randomizedtesting.SeedInfo.seed([76514493C8D9B3A8]:0)
[junit4:junit4]    at org.apache.solr.SolrTestCaseJ4.getFile(SolrTestCaseJ4.java:1417)
[junit4:junit4]    at org.apache.solr.SolrTestCaseJ4.TEST_HOME(SolrTestCaseJ4.java:1422)
[junit4:junit4]    at org.apache.solr.SolrTestCaseJ4.initCore(SolrTestCaseJ4.java:174)
[junit4:junit4]    at org.apache.solr.mcf.ManifoldCFSCLoadTest.beforeClass(ManifoldCFSCLoadTest.java:43)
[junit4:junit4]    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4:junit4]    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit4:junit4]    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit4:junit4]    at java.lang.reflect.Method.invoke(Method.java:597)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1559)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.RandomizedRunner.access$600(RandomizedRunner.java:79)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:677)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:693)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesRestoreRule$1.evaluate(SystemPropertiesRestoreRule.java:53)
[junit4:junit4]    at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
[junit4:junit4]    at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4:junit4]    at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43)
[junit4:junit4]    at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
[junit4:junit4]    at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:70)
[junit4:junit4]    at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
[junit4:junit4]    at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:358)
[jira] [Commented] (CONNECTORS-595) Solr 4.x plugin fails tests against solr-lucene 4.0.0 final
[ https://issues.apache.org/jira/browse/CONNECTORS-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13541598#comment-13541598 ] Karl Wright commented on CONNECTORS-595: r1427332 (trunk) Solr 4.x plugin fails tests against solr-lucene 4.0.0 final --- Key: CONNECTORS-595 URL: https://issues.apache.org/jira/browse/CONNECTORS-595 Project: ManifoldCF Issue Type: Bug Components: Solr-4.x-component Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Priority: Blocker Fix For: ManifoldCF 1.1 When build is updated to point to the lucene-solr 4.0.0 tag, the plugin tests fail with the junit4 output quoted in the original CONNECTORS-595 report.
[jira] [Commented] (CONNECTORS-595) Solr 4.x plugin fails tests against solr-lucene 4.0.0 final
[ https://issues.apache.org/jira/browse/CONNECTORS-595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13541599#comment-13541599 ] Karl Wright commented on CONNECTORS-595: r1427333 (release branch) Solr 4.x plugin fails tests against solr-lucene 4.0.0 final --- Key: CONNECTORS-595 URL: https://issues.apache.org/jira/browse/CONNECTORS-595 Project: ManifoldCF Issue Type: Bug Components: Solr-4.x-component Affects Versions: ManifoldCF 1.1 Reporter: Karl Wright Assignee: Karl Wright Priority: Blocker Fix For: ManifoldCF 1.1 When build is updated to point to the lucene-solr 4.0.0 tag, the plugin tests fail with the junit4 output quoted in the original CONNECTORS-595 report.
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541641#comment-13541641 ] Karl Wright commented on CONNECTORS-594: Answer to the question of how to specify the URLs: The contributed code actually seems to be doing this partly incorrectly. The proper code looks like this: {code} response_object = request_object.process(solrServer); {code} The delete request can be specified as an UpdateRequest, and a URL can be included when that is created. The ping request, on the other hand, has a hard-wired /admin/ping URL, but the request class is very simple, so I just duplicated it locally and permitted a URL to be passed in. SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 1.1 Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13541898#comment-13541898 ] Karl Wright commented on CONNECTORS-594: Spelunking through the SolrJ code, I think I've mostly answered my outstanding questions, and made the corresponding changes to the SolrConnector. I'm going to prepare the branch for commit (mostly by peeling out the SolrCloud connector, since that is now superseded by the revised Solr connector), and then try out the code against a single-instance Solr. The only other remaining work involves getting Japanese translations for the new text I added to the UI. SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 1.1 Attachments: solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library
[ https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-19: -- Fix Version/s: (was: ManifoldCF next) ManifoldCF 1.1 Assignee: Karl Wright Look into converting SOLR connector to use SolrJ java library - Key: CONNECTORS-19 URL: https://issues.apache.org/jira/browse/CONNECTORS-19 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 1.1 The SOLR connector currently uses its own multipart post code. It might be a good idea to convert it to use the SolrJ client api jar instead. This would require license confirmation, plus research to make sure there are no jar conflicts as a result, with any other connector. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library
[ https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-19. --- Resolution: Fixed Resolved by CONNECTORS-594. Look into converting SOLR connector to use SolrJ java library - Key: CONNECTORS-19 URL: https://issues.apache.org/jira/browse/CONNECTORS-19 Project: ManifoldCF Issue Type: Improvement Components: Lucene/SOLR connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2 Reporter: Karl Wright Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 1.1 The SOLR connector currently uses its own multipart post code. It might be a good idea to convert it to use the SolrJ client api jar instead. This would require license confirmation, plus research to make sure there are no jar conflicts as a result, with any other connector. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542148#comment-13542148 ] Karl Wright commented on CONNECTORS-594: r1427803 for translations SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 1.1 Attachments: CONNECTORS-594.properties.patch, solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-594) SolrCloud Output Connector
[ https://issues.apache.org/jira/browse/CONNECTORS-594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-594. Resolution: Fixed SolrCloud Output Connector -- Key: CONNECTORS-594 URL: https://issues.apache.org/jira/browse/CONNECTORS-594 Project: ManifoldCF Issue Type: New Feature Affects Versions: ManifoldCF 1.0.1 Reporter: Minoru Osuka Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 1.1 Attachments: CONNECTORS-594.properties.patch, solrcloud-output-connector.tar.gz The Output Connectors doesn't support SolrCloud currently. Since the Solr 4.0 released, I think there is a need for support for SolrCloud. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CONNECTORS-596) RSS and Web connectors don't strip off namespace when handling feeds
Karl Wright created CONNECTORS-596: -- Summary: RSS and Web connectors don't strip off namespace when handling feeds Key: CONNECTORS-596 URL: https://issues.apache.org/jira/browse/CONNECTORS-596 Project: ManifoldCF Issue Type: Bug Components: RSS connector, Web connector Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 In the RSS and Web connectors, if a tag is qualified with a namespace, the parser is not smart enough to pull off the qualifier to recognize the tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
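To illustrate the kind of handling the report describes, here is a minimal sketch of stripping a namespace qualifier before comparing tag names. It is not the change committed for this issue; the class and method names are hypothetical, and it assumes the qualifier is simply everything up to the first colon.

```java
// Hypothetical helper: compares feed tag names after dropping any
// namespace prefix such as "dc:" or "atom:".
final class QualifiedNameSketch {

    // Returns the part of the tag name after the first ':' (the local name),
    // or the whole name when there is no qualifier.
    static String localName(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? qName : qName.substring(colon + 1);
    }

    // True when the tag, with any qualifier removed, matches the expected
    // name. Case-insensitive matching is an assumption of this sketch.
    static boolean isTag(String qName, String expected) {
        return localName(qName).equalsIgnoreCase(expected);
    }
}
```

With this in place, a parser that only looks for `link` would also recognize `atom:link`, because `QualifiedNameSketch.isTag("atom:link", "link")` compares the local names only.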
[jira] [Resolved] (CONNECTORS-596) RSS and Web connectors don't strip off namespace when handling feeds
[ https://issues.apache.org/jira/browse/CONNECTORS-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-596. Resolution: Fixed RSS and Web connectors don't strip off namespace when handling feeds Key: CONNECTORS-596 URL: https://issues.apache.org/jira/browse/CONNECTORS-596 Project: ManifoldCF Issue Type: Bug Components: RSS connector, Web connector Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 In the RSS and Web connectors, if a tag is qualified with a namespace, the parser is not smart enough to pull off the qualifier to recognize the tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-596) RSS and Web connectors don't strip off namespace when handling feeds
[ https://issues.apache.org/jira/browse/CONNECTORS-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13542663#comment-13542663 ] Karl Wright commented on CONNECTORS-596: r1428143 RSS and Web connectors don't strip off namespace when handling feeds Key: CONNECTORS-596 URL: https://issues.apache.org/jira/browse/CONNECTORS-596 Project: ManifoldCF Issue Type: Bug Components: RSS connector, Web connector Affects Versions: ManifoldCF 1.0.1 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 In the RSS and Web connectors, if a tag is qualified with a namespace, the parser is not smart enough to pull off the qualifier to recognize the tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-598) Add proxy pac files to the RSS connector
[ https://issues.apache.org/jira/browse/CONNECTORS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13543183#comment-13543183 ] Karl Wright commented on CONNECTORS-598: Filtering the documents being specified in a feed based on whatever criterion is currently not part of the RSS connector. The only means of filtering is by including or excluding the feed itself. So it sounds like what you need for this case is NOT to understand a proxy.pac file, but rather to permit discovered URLs to be filtered in some way. Will being able to filter based on regular expressions run against a document URL be sufficient? The web connector uses this strategy, but it seems to me like it would be problematic in an RSS situation. Presumably the mix of links will be changing all the time, as the feeds are regenerated; you might possibly be able to decide via a regexp whether a link was internal or not, but it will be cumbersome to manage this I think. The alternative is to generate the feeds without the documents that you don't want. Please let me know how you want to proceed. Add proxy pac files to the RSS connector Key: CONNECTORS-598 URL: https://issues.apache.org/jira/browse/CONNECTORS-598 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Reporter: David Morana Fix For: ManifoldCF 1.1 I have a public RSS feed on an intranet that lists important bookmarks. The list has many external links in it. So ManifoldCF would need to know when to use the company's proxy to index the external links. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
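The regular-expression filtering suggested in the comment above might look something like the following sketch. It is illustrative only: the RSS connector had no such feature at the time of the comment, and the class name, constructor, and example hosts are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical filter: a discovered document URL is accepted unless it
// matches one of the configured exclusion expressions.
final class UrlRegexFilterSketch {
    private final List<Pattern> excludes = new ArrayList<Pattern>();

    UrlRegexFilterSketch(List<String> excludeRegexes) {
        for (String regex : excludeRegexes) {
            excludes.add(Pattern.compile(regex));
        }
    }

    // Returns false as soon as any exclusion pattern is found in the URL.
    boolean accepts(String url) {
        for (Pattern p : excludes) {
            if (p.matcher(url).find()) {
                return false;
            }
        }
        return true;
    }
}
```

For instance, a filter built with the single expression `^https?://[^/]*\.example\.org/` would keep `http://intranet.example.com/a` and drop `http://www.example.org/b`. As the comment notes, maintaining such expressions could become cumbersome when the mix of links in a feed changes often.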
[jira] [Updated] (CONNECTORS-598) Add proxy pac files to the RSS connector
[ https://issues.apache.org/jira/browse/CONNECTORS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-598: --- Fix Version/s: (was: ManifoldCF 1.1) ManifoldCF next Add proxy pac files to the RSS connector Key: CONNECTORS-598 URL: https://issues.apache.org/jira/browse/CONNECTORS-598 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Reporter: David Morana Fix For: ManifoldCF next I have a public RSS feed on an intranet that lists important bookmarks. The list has many external links in it. So ManifoldCF would need to know when to use the company's proxy to index the external links. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-598) Add mode to use null content if chromed content not found to the RSS connector
[ https://issues.apache.org/jira/browse/CONNECTORS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13544666#comment-13544666 ] Karl Wright commented on CONNECTORS-598: r1429250 to add mode as described. Still need Japanese translations for two radio button entries though. Add mode to use null content if chromed content not found to the RSS connector -- Key: CONNECTORS-598 URL: https://issues.apache.org/jira/browse/CONNECTORS-598 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Reporter: David Morana Fix For: ManifoldCF next I have a public RSS feed on an intranet that lists important bookmarks. The list has many external links in it. So ManifoldCF would need to know when to use the company's proxy to index the external links. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-598) Add mode to use null content if chromed content not found to the RSS connector
[ https://issues.apache.org/jira/browse/CONNECTORS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-598: --- Fix Version/s: (was: ManifoldCF next) ManifoldCF 1.1 Assignee: Karl Wright Add mode to use null content if chromed content not found to the RSS connector -- Key: CONNECTORS-598 URL: https://issues.apache.org/jira/browse/CONNECTORS-598 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Reporter: David Morana Assignee: Karl Wright Fix For: ManifoldCF 1.1 I have a public RSS feed on an intranet that lists important bookmarks. The list has many external links in it. So ManifoldCF would need to know when to use the company's proxy to index the external links.
[jira] [Created] (CONNECTORS-599) Derby stalls, does not perform well, on multi-threaded tests
Karl Wright created CONNECTORS-599: -- Summary: Derby stalls, does not perform well, on multi-threaded tests Key: CONNECTORS-599 URL: https://issues.apache.org/jira/browse/CONNECTORS-599 Project: ManifoldCF Issue Type: Bug Components: Framework core Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.0 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 1.1 Derby has been problematic for a while. One test shows the problem every time: ant run-rss-tests-derby. I've opened a ticket with the Derby project to track the problem (DERBY-6011), but there appears to be little interest in addressing it.
[jira] [Commented] (CONNECTORS-598) Add mode to use null content if chromed content not found to the RSS connector
[ https://issues.apache.org/jira/browse/CONNECTORS-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545730#comment-13545730 ] Karl Wright commented on CONNECTORS-598: Thank you Abe-san! Add mode to use null content if chromed content not found to the RSS connector -- Key: CONNECTORS-598 URL: https://issues.apache.org/jira/browse/CONNECTORS-598 Project: ManifoldCF Issue Type: Improvement Affects Versions: ManifoldCF 1.0.1, ManifoldCF 1.1 Reporter: David Morana Assignee: Karl Wright Fix For: ManifoldCF 1.1 I have a public RSS feed on an intranet that lists important bookmarks. The list has many external links in it. So ManifoldCF would need to know when to use the company's proxy to index the external links.