[jira] [Updated] (CONNECTORS-213) DBInterfaceMySQL Initalization error
[ https://issues.apache.org/jira/browse/CONNECTORS-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-213:

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

DBInterfaceMySQL Initalization error

    Key: CONNECTORS-213
    URL: https://issues.apache.org/jira/browse/CONNECTORS-213
    Project: ManifoldCF
    Issue Type: Bug
    Components: Framework core
    Affects Versions: ManifoldCF 0.3
    Reporter: Emanuele Lombardi
    Assignee: Karl Wright
    Labels: Mysql, bug, manifoldcf
    Fix For: ManifoldCF 0.3
    Attachments: DBInterfaceMySQL.patch

When I try to create a new database using DBCreate and the MySQL DBInterface, I get this exception:

    Configuration file successfully read
    Exception in thread main java.lang.NullPointerException
        at org.apache.manifoldcf.core.interfaces.CacheManagerFactory.make(CacheManagerFactory.java:40)
        at org.apache.manifoldcf.core.database.Database.init(Database.java:63)
        at org.apache.manifoldcf.core.database.DBInterfaceMySQL.createUserAndDatabase(DBInterfaceMySQL.java:391)
        at org.apache.manifoldcf.core.system.ManifoldCF.createSystemDatabase(ManifoldCF.java:656)
        at org.apache.manifoldcf.core.DBCreate.doExecute(DBCreate.java:51)
        at org.apache.manifoldcf.core.DBInitializationCommand.execute(DBInitializationCommand.java:54)
        at org.apache.manifoldcf.core.DBCreate.main(DBCreate.java:80)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-213) DBInterfaceMySQL Initalization error
[ https://issues.apache.org/jira/browse/CONNECTORS-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084038#comment-13084038 ]

Karl Wright commented on CONNECTORS-213:

I haven't heard anything further, so I'm closing this ticket. I'll either reopen it or open a new one if further MySQL issues or patches are reported.

DBInterfaceMySQL Initalization error

    Key: CONNECTORS-213
    URL: https://issues.apache.org/jira/browse/CONNECTORS-213
    Project: ManifoldCF
    Issue Type: Bug
    Components: Framework core
    Affects Versions: ManifoldCF 0.3
    Reporter: Emanuele Lombardi
    Labels: Mysql, bug, manifoldcf
    Fix For: ManifoldCF 0.3
    Attachments: DBInterfaceMySQL.patch

When I try to create a new database using DBCreate and the MySQL DBInterface, I get this exception:

    Configuration file successfully read
    Exception in thread main java.lang.NullPointerException
        at org.apache.manifoldcf.core.interfaces.CacheManagerFactory.make(CacheManagerFactory.java:40)
        at org.apache.manifoldcf.core.database.Database.init(Database.java:63)
        at org.apache.manifoldcf.core.database.DBInterfaceMySQL.createUserAndDatabase(DBInterfaceMySQL.java:391)
        at org.apache.manifoldcf.core.system.ManifoldCF.createSystemDatabase(ManifoldCF.java:656)
        at org.apache.manifoldcf.core.DBCreate.doExecute(DBCreate.java:51)
        at org.apache.manifoldcf.core.DBInitializationCommand.execute(DBInitializationCommand.java:54)
        at org.apache.manifoldcf.core.DBCreate.main(DBCreate.java:80)
[jira] [Commented] (CONNECTORS-226) Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified
[ https://issues.apache.org/jira/browse/CONNECTORS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084043#comment-13084043 ] Karl Wright commented on CONNECTORS-226: After some thought, I believe the correct approach is to convert Livelink's usage to a ServiceInterruption. Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified Key: CONNECTORS-226 URL: https://issues.apache.org/jira/browse/CONNECTORS-226 Project: ManifoldCF Issue Type: Bug Components: Framework core, JCIFS connector, LiveLink connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright The ManifoldCFException type REPOSITORY_CONNECTION_ERROR seems to be treated by the framework somewhat inconsistently. In some places it is treated as a permanent connection exception, and in others as a temporary connection exception (in lieu of a ServiceInterruption where ServiceInterruption is not possible). Only two connectors use it (LiveLink and jCIFS), and the JCIFS case is not interesting. So really this is currently here to support Livelink. There are two ways forward. The first way is to convert the Livelink connector's exception to a true ServiceInterruption, and revert REPOSITORY_CONNECTION_ERROR to its original meaning, which has now been deprecated as a result of the fact that connect() methods can no longer throw ManifoldCFExceptions at all. The second is to continue the current Livelink-style usage, and make ALL usages consistent with that. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
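The distinction discussed in CONNECTORS-226, between a temporary connection problem that should be retried and a permanent one that should not, can be sketched as below. This is only an illustration under stated assumptions: `TransientFailure`, `PermanentFailure`, `RepositoryCall`, and `fetch` are hypothetical stand-ins and are not the ManifoldCF `ServiceInterruption` or `ManifoldCFException` classes, and the choice of which `IOException` subclasses count as temporary is an assumption as well.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.SocketTimeoutException;

// Hypothetical stand-ins for illustration; NOT the ManifoldCF classes.
final class RetryClassificationSketch {

    /** Temporary failure: the caller should try again at retryAtMillis. */
    static final class TransientFailure extends Exception {
        final long retryAtMillis;

        TransientFailure(String message, Throwable cause, long retryAtMillis) {
            super(message, cause);
            this.retryAtMillis = retryAtMillis;
        }
    }

    /** Permanent failure: retrying is not expected to help. */
    static final class PermanentFailure extends Exception {
        PermanentFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical repository operation that may fail with an I/O error. */
    interface RepositoryCall<T> {
        T run() throws IOException;
    }

    /**
     * Runs the call and classifies failures. Connection refusals and timeouts
     * are treated as temporary (an assumption); any other I/O error is
     * treated as permanent.
     */
    static <T> T fetch(RepositoryCall<T> call, long nowMillis, long retryDelayMillis)
            throws TransientFailure, PermanentFailure {
        try {
            return call.run();
        } catch (ConnectException | SocketTimeoutException e) {
            throw new TransientFailure("Repository unavailable: " + e.getMessage(),
                    e, nowMillis + retryDelayMillis);
        } catch (IOException e) {
            throw new PermanentFailure("Repository error: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            fetch(() -> { throw new ConnectException("server down"); },
                    System.currentTimeMillis(), 5 * 60 * 1000L);
        } catch (TransientFailure e) {
            System.out.println("retry at " + e.retryAtMillis);
        }
    }
}
```

Keeping the two cases as separate exception types means a caller cannot mistake one for the other, which is the inconsistency the ticket describes for a single shared error code.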
[jira] [Resolved] (CONNECTORS-226) Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified
[ https://issues.apache.org/jira/browse/CONNECTORS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-226. Resolution: Fixed Fix Version/s: ManifoldCF 0.3 r1157065 Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified Key: CONNECTORS-226 URL: https://issues.apache.org/jira/browse/CONNECTORS-226 Project: ManifoldCF Issue Type: Bug Components: Framework core, JCIFS connector, LiveLink connector Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 Reporter: Karl Wright Assignee: Karl Wright Fix For: ManifoldCF 0.3 The ManifoldCFException type REPOSITORY_CONNECTION_ERROR seems to be treated by the framework somewhat inconsistently. In some places it is treated as a permanent connection exception, and in others as a temporary connection exception (in lieu of a ServiceInterruption where ServiceInterruption is not possible). Only two connectors use it (LiveLink and jCIFS), and the JCIFS case is not interesting. So really this is currently here to support Livelink. There are two ways forward. The first way is to convert the Livelink connector's exception to a true ServiceInterruption, and revert REPOSITORY_CONNECTION_ERROR to its original meaning, which has now been deprecated as a result of the fact that connect() methods can no longer throw ManifoldCFExceptions at all. The second is to continue the current Livelink-style usage, and make ALL usages consistent with that. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-55) Bundle database server with ManifoldCF packaged product
[ https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084063#comment-13084063 ]

Karl Wright commented on CONNECTORS-55:

HSQLDB support has been added, which works a lot better than Derby does, so I think we've finally hit the necessary criteria for closing out this ticket.

Bundle database server with ManifoldCF packaged product

    Key: CONNECTORS-55
    URL: https://issues.apache.org/jira/browse/CONNECTORS-55
    Project: ManifoldCF
    Issue Type: Sub-task
    Components: Installers
    Reporter: Jack Krupansky
    Fix For: ManifoldCF 0.3

The current requirement that the user install and deploy a PostgreSQL server complicates the installation and deployment of LCF for the user. Installation and deployment of LCF should be as simple as Solr itself. QuickStart is great for the low-end and basic evaluation, but a comparable level of simplified installation and deployment is still needed for full-blown, high-end environments that need the full performance of a PostgreSQL-class database server.

So, PostgreSQL should be bundled with the packaged release of LCF so that installation and deployment of LCF will automatically install and deploy a subset of the full PostgreSQL distribution that is sufficient for the needs of LCF. Starting LCF, with or without the LCF UI, should automatically start the database server. Shutting down LCF should also shut down the database server process.

A typical use case would be for a non-developer who is comfortable with Solr and simply wants to crawl documents from, for example, a SharePoint repository and feed them into Solr. QuickStart should work well for the low end or in the early stages of evaluation, but the user would prefer to evaluate the real thing with something resembling a production crawl of thousands of documents. Such a user might not be a hard-core developer or be comfortable fiddling with a lot of software components simply to do one conceptually simple operation.

It should still be possible for the user to supply database server settings to override the defaults, but the LCF package should have all of the best-practice settings deemed appropriate for use with LCF.

One downside is that installation and deployment will be platform-specific since there are multiple processes and PostgreSQL itself requires a platform-specific installation.

This proposal presumes that PostgreSQL is the best option for the foreseeable future, but nothing here is intended to preclude support for other database servers in future releases.

This proposal should not have any impact on QuickStart packaging or deployment.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.
[jira] [Assigned] (CONNECTORS-55) Bundle database server with ManifoldCF packaged product
[ https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright reassigned CONNECTORS-55:

    Assignee: Karl Wright

Bundle database server with ManifoldCF packaged product

    Key: CONNECTORS-55
    URL: https://issues.apache.org/jira/browse/CONNECTORS-55
    Project: ManifoldCF
    Issue Type: Sub-task
    Components: Installers
    Reporter: Jack Krupansky
    Assignee: Karl Wright
    Fix For: ManifoldCF 0.3

The current requirement that the user install and deploy a PostgreSQL server complicates the installation and deployment of LCF for the user. Installation and deployment of LCF should be as simple as Solr itself. QuickStart is great for the low-end and basic evaluation, but a comparable level of simplified installation and deployment is still needed for full-blown, high-end environments that need the full performance of a PostgreSQL-class database server.

So, PostgreSQL should be bundled with the packaged release of LCF so that installation and deployment of LCF will automatically install and deploy a subset of the full PostgreSQL distribution that is sufficient for the needs of LCF. Starting LCF, with or without the LCF UI, should automatically start the database server. Shutting down LCF should also shut down the database server process.

A typical use case would be for a non-developer who is comfortable with Solr and simply wants to crawl documents from, for example, a SharePoint repository and feed them into Solr. QuickStart should work well for the low end or in the early stages of evaluation, but the user would prefer to evaluate the real thing with something resembling a production crawl of thousands of documents. Such a user might not be a hard-core developer or be comfortable fiddling with a lot of software components simply to do one conceptually simple operation.

It should still be possible for the user to supply database server settings to override the defaults, but the LCF package should have all of the best-practice settings deemed appropriate for use with LCF.

One downside is that installation and deployment will be platform-specific since there are multiple processes and PostgreSQL itself requires a platform-specific installation.

This proposal presumes that PostgreSQL is the best option for the foreseeable future, but nothing here is intended to preclude support for other database servers in future releases.

This proposal should not have any impact on QuickStart packaging or deployment.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.
[jira] [Resolved] (CONNECTORS-60) Agent process should be started automatically
[ https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-60. --- Resolution: Fixed Fix Version/s: ManifoldCF 0.1 ManifoldCF 0.2 ManifoldCF 0.3 Agent process should be started automatically - Key: CONNECTORS-60 URL: https://issues.apache.org/jira/browse/CONNECTORS-60 Project: ManifoldCF Issue Type: Sub-task Components: Documentation Reporter: Jack Krupansky Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 0.3, ManifoldCF 0.2, ManifoldCF 0.1 LCF as it exists today is a bit too complex to run for an average user, especially with a separate agent process for crawling. LCF should be as easy to run as Solr is today. QuickStart is a good move in this direction, but the same user-visible simplicity is needed for full LCF. The separate agent process is a reasonable design for execution, but a little too cumbersome for the average user to manage. Unfortunately, it is expected that starting up a multi-process application will require platform-specific scripting. Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue. KDW - this functionality is already present; however the documentation is not adequate to help people figure out how to do it. So I'm moving this to Documentation and treating it as a doc bug. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CONNECTORS-60) Agent process should be started automatically
[ https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright reassigned CONNECTORS-60: - Assignee: Karl Wright Agent process should be started automatically - Key: CONNECTORS-60 URL: https://issues.apache.org/jira/browse/CONNECTORS-60 Project: ManifoldCF Issue Type: Sub-task Components: Documentation Reporter: Jack Krupansky Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 LCF as it exists today is a bit too complex to run for an average user, especially with a separate agent process for crawling. LCF should be as easy to run as Solr is today. QuickStart is a good move in this direction, but the same user-visible simplicity is needed for full LCF. The separate agent process is a reasonable design for execution, but a little too cumbersome for the average user to manage. Unfortunately, it is expected that starting up a multi-process application will require platform-specific scripting. Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue. KDW - this functionality is already present; however the documentation is not adequate to help people figure out how to do it. So I'm moving this to Documentation and treating it as a doc bug. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-60) Agent process should be started automatically
[ https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084067#comment-13084067 ] Karl Wright commented on CONNECTORS-60: --- The Quick Start definitely meets the requirements listed for this task, so I'm closing it. Agent process should be started automatically - Key: CONNECTORS-60 URL: https://issues.apache.org/jira/browse/CONNECTORS-60 Project: ManifoldCF Issue Type: Sub-task Components: Documentation Reporter: Jack Krupansky Priority: Minor Fix For: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3 LCF as it exists today is a bit too complex to run for an average user, especially with a separate agent process for crawling. LCF should be as easy to run as Solr is today. QuickStart is a good move in this direction, but the same user-visible simplicity is needed for full LCF. The separate agent process is a reasonable design for execution, but a little too cumbersome for the average user to manage. Unfortunately, it is expected that starting up a multi-process application will require platform-specific scripting. Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue. KDW - this functionality is already present; however the documentation is not adequate to help people figure out how to do it. So I'm moving this to Documentation and treating it as a doc bug. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CONNECTORS-58) ManifoldCF scripting language, executed via the API, plus example jobs for file system and web crawl
[ https://issues.apache.org/jira/browse/CONNECTORS-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright updated CONNECTORS-58: -- Component/s: Scripting client Adding a new ticket category for scripting client. ManifoldCF scripting language, executed via the API, plus example jobs for file system and web crawl --- Key: CONNECTORS-58 URL: https://issues.apache.org/jira/browse/CONNECTORS-58 Project: ManifoldCF Issue Type: Sub-task Components: Examples, Scripting client Reporter: Jack Krupansky Priority: Minor Creating a basic connection setup to do a relatively simple crawl for a file system or web can be a daunting task for someone new to LCF. So, it would be nice to have a scripting file that supports an abbreviated API (subset of the full API discussed in CONNECTORS-56) sufficient to create a default set of connections and example jobs that the new user can choose from. Beyond this initial need, this script format might be a useful form to dump all of the connections and jobs in the LCF database in a form that can be used to recreate an LCF configuration. Kind of a dump and reload capability. That in fact might be how the initial example script gets created. Those are two distinct use cases, but could utilize the same feature. The example script could have example jobs to crawl a subdirectory of LCF, crawl the LCF wiki, etc. There could be more than one script. There might be example scripts for each form of connector. This capability should be available for both QuickStart and the general release of LCF. As just one possibility, the script format might be a sequence of JSON expressions, each with an initial string analogous to a servlet path to specify the operation to be performed, followed by the JSON form of the connection or job or other LCF object. Or, some other format might be more suitable. Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira
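One reading of the script format floated in CONNECTORS-58 (a sequence of entries, each an initial string analogous to a servlet path followed by a JSON object) is sketched below. Everything concrete here is an assumption made for illustration: one entry per line, `#` comment lines, the example paths, and the `ScriptLineSketch` and `Command` names. The ticket itself leaves the format open, and the sketch only splits each entry into its path and raw JSON text without validating the JSON.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical script reader; the line format is assumed, not specified by the ticket.
final class ScriptLineSketch {

    /** One script entry: an operation path plus the raw JSON payload text. */
    static final class Command {
        final String path;
        final String json;

        Command(String path, String json) {
            this.path = path;
            this.json = json;
        }
    }

    /**
     * Splits a script into commands. Blank lines and lines starting with '#'
     * are skipped (an assumed convention). Each remaining line must contain
     * a non-empty path followed by a JSON object starting with '{'.
     */
    static List<Command> parse(String script) {
        List<Command> commands = new ArrayList<>();
        for (String raw : script.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int brace = line.indexOf('{');
            if (brace <= 0) {
                throw new IllegalArgumentException("Expected '<path> {json}': " + line);
            }
            commands.add(new Command(line.substring(0, brace).trim(), line.substring(brace)));
        }
        return commands;
    }

    public static void main(String[] args) {
        String script = "# example script (hypothetical paths)\n"
                + "outputconnections/solr {\"name\":\"solr\"}\n"
                + "jobs/example {\"description\":\"crawl docs\"}\n";
        for (Command c : parse(script)) {
            System.out.println(c.path + " -> " + c.json);
        }
    }
}
```

Because each entry is self-describing, the same reader could in principle serve both use cases named in the ticket: loading example jobs and reloading a dumped configuration.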
[jira] [Commented] (CONNECTORS-61) Support bundling of LCF with an app
[ https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084074#comment-13084074 ] Karl Wright commented on CONNECTORS-61: --- The current status of this ticket is that it is stalled. What is needed, in my view, to complete it is the following: (1) Make the jetty-runner class include specific methods for programmatically starting and stopping the ManifoldCF quick start from within an enclosing process, and (2) Document those methods in how to build and deploy. Support bundling of LCF with an app --- Key: CONNECTORS-61 URL: https://issues.apache.org/jira/browse/CONNECTORS-61 Project: ManifoldCF Issue Type: Sub-task Components: Documentation, Framework crawler agent Affects Versions: ManifoldCF 0.3 Reporter: Jack Krupansky Priority: Minor It should be possible for an application developer to bundle LCF with an application to facilitate installation and deployment of the application in conjunction with LCF. This may (or may not) be as simple as providing appropriate jar files and documentation for how to use them, but there may be other components or scripts needed. There are two options: 1) include the LCF UI along with the other LCF processes, and 2) exclude the LCF UI and include only the other processes that can be controlled via the full API. The database server would be included. The web app server would be optional since the application may have its own choice of web app server. One use case is bundling LCF with Solr or a Solr-based application. Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
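The start and stop methods proposed in the CONNECTORS-61 comment imply a small lifecycle contract between the enclosing process and the embedded crawler. A minimal sketch of such a contract follows. The `Embedded` interface and `GuardedService` wrapper are hypothetical names invented for this illustration and are not the jetty-runner API; the sketch only shows start and stop made safe to call more than once, with stop placed in a `finally` block.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical lifecycle wrapper; not the ManifoldCF jetty-runner class.
final class EmbeddedLifecycleSketch {

    /** Hypothetical contract an embeddable service might expose. */
    interface Embedded {
        void start();
        void stop();
    }

    /** Makes start/stop idempotent so the enclosing process can call them freely. */
    static final class GuardedService implements Embedded {
        private final Embedded delegate;
        private final AtomicBoolean running = new AtomicBoolean(false);

        GuardedService(Embedded delegate) {
            this.delegate = delegate;
        }

        @Override
        public void start() {
            // Only the first caller actually starts the delegate.
            if (running.compareAndSet(false, true)) {
                try {
                    delegate.start();
                } catch (RuntimeException e) {
                    running.set(false); // startup failed, so allow a later retry
                    throw e;
                }
            }
        }

        @Override
        public void stop() {
            // Stopping a service that is not running is a no-op.
            if (running.compareAndSet(true, false)) {
                delegate.stop();
            }
        }

        boolean isRunning() {
            return running.get();
        }
    }

    public static void main(String[] args) {
        GuardedService service = new GuardedService(new Embedded() {
            @Override public void start() { System.out.println("embedded service started"); }
            @Override public void stop() { System.out.println("embedded service stopped"); }
        });
        service.start();
        try {
            System.out.println("enclosing application does its work here");
        } finally {
            service.stop(); // always release the embedded service
        }
    }
}
```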
[jira] [Updated] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Emmanuel Keller updated CONNECTORS-224: --- Attachment: oss-mfc-beta.patch This new version fixes the previous points (Java 5, missing javascript checks, implementation of the check method, memory leak). We also added a specification tab panel to handle mime types and extension filter, and a limit to the maximum size of documents. Finally, one issue remains: The following methods are never called: checkMimetypeIndexable and checkFileIndexable. Until now we were not able to understand why. Thanks in advance for your help. OpenSearchServer connector -- Key: CONNECTORS-224 URL: https://issues.apache.org/jira/browse/CONNECTORS-224 Project: ManifoldCF Issue Type: New Feature Components: OpenSearchServer connector Affects Versions: ManifoldCF 0.3 Reporter: Emmanuel Keller Assignee: Karl Wright Labels: OpenSearchServer, connector, outputconnector Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch Original Estimate: 336h Remaining Estimate: 336h Provide an output connector for [OpenSearchServer|http://www.open-search-server.com]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084101#comment-13084101 ] Karl Wright commented on CONNECTORS-224: checkMimeTypeIndexable and checkFileIndexable will only be called from certain connectors, which are in a position to prefilter content. Other connectors will not be able to use these. For example, the Web connector calls checkMimeTypeIndexable and the JCIFS connector calls checkFileIndexable. Hope this helps. OpenSearchServer connector -- Key: CONNECTORS-224 URL: https://issues.apache.org/jira/browse/CONNECTORS-224 Project: ManifoldCF Issue Type: New Feature Components: OpenSearchServer connector Affects Versions: ManifoldCF 0.3 Reporter: Emmanuel Keller Assignee: Karl Wright Labels: OpenSearchServer, connector, outputconnector Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch Original Estimate: 336h Remaining Estimate: 336h Provide an output connector for [OpenSearchServer|http://www.open-search-server.com]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
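Karl's point above, that an output connector's filter methods run only when the repository connector is in a position to prefilter, can be illustrated with a small sketch. The method name `checkMimeTypeIndexable` comes from the thread, but the one-argument signature, the `OutputFilter` interface, and the two "connector" methods below are simplified assumptions made for this example and do not reproduce the ManifoldCF interfaces.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Simplified illustration; the interface and signatures are assumed, not ManifoldCF's.
final class PrefilterSketch {

    /** Hypothetical output-side filter. */
    interface OutputFilter {
        boolean checkMimeTypeIndexable(String mimeType);
    }

    /** Accepts only MIME types on an allow list and counts how often it is asked. */
    static final class AllowListFilter implements OutputFilter {
        private final Set<String> allowed = new HashSet<>();
        int calls = 0;

        AllowListFilter(String... mimeTypes) {
            for (String type : mimeTypes) {
                allowed.add(type.toLowerCase(Locale.ROOT));
            }
        }

        @Override
        public boolean checkMimeTypeIndexable(String mimeType) {
            calls++;
            return mimeType != null && allowed.contains(mimeType.toLowerCase(Locale.ROOT));
        }
    }

    /** A connector that knows the MIME type before download can ask the filter. */
    static boolean prefilteringConnectorWouldFetch(OutputFilter filter, String mimeType) {
        return filter.checkMimeTypeIndexable(mimeType);
    }

    /** A connector with no MIME type up front never consults the filter. */
    static boolean nonPrefilteringConnectorWouldFetch(OutputFilter filter) {
        return true;
    }

    public static void main(String[] args) {
        AllowListFilter filter = new AllowListFilter("text/html", "application/pdf");
        System.out.println(prefilteringConnectorWouldFetch(filter, "text/html"));
        System.out.println(nonPrefilteringConnectorWouldFetch(filter));
        System.out.println("filter consulted " + filter.calls + " time(s)");
    }
}
```

With a connector of the second kind, the call counter stays at zero, which matches the symptom reported in the ticket: the filter methods exist but are never invoked.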
[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084111#comment-13084111 ] Emmanuel Keller commented on CONNECTORS-224: It helps, thank you. So I will check with these connectors. OpenSearchServer connector -- Key: CONNECTORS-224 URL: https://issues.apache.org/jira/browse/CONNECTORS-224 Project: ManifoldCF Issue Type: New Feature Components: OpenSearchServer connector Affects Versions: ManifoldCF 0.3 Reporter: Emmanuel Keller Assignee: Karl Wright Labels: OpenSearchServer, connector, outputconnector Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch Original Estimate: 336h Remaining Estimate: 336h Provide an output connector for [OpenSearchServer|http://www.open-search-server.com]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-61) Support bundling of LCF with an app
[ https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084112#comment-13084112 ] Karl Wright commented on CONNECTORS-61: --- r1157095 for the jetty-runner changes. Support bundling of LCF with an app --- Key: CONNECTORS-61 URL: https://issues.apache.org/jira/browse/CONNECTORS-61 Project: ManifoldCF Issue Type: Sub-task Components: Documentation, Framework crawler agent Affects Versions: ManifoldCF 0.3 Reporter: Jack Krupansky Assignee: Karl Wright Priority: Minor It should be possible for an application developer to bundle LCF with an application to facilitate installation and deployment of the application in conjunction with LCF. This may (or may not) be as simple as providing appropriate jar files and documentation for how to use them, but there may be other components or scripts needed. There are two options: 1) include the LCF UI along with the other LCF processes, and 2) exclude the LCF UI and include only the other processes that can be controlled via the full API. The database server would be included. The web app server would be optional since the application may have its own choice of web app server. One use case is bundling LCF with Solr or a Solr-based application. Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084116#comment-13084116 ] Karl Wright commented on CONNECTORS-224: Checked in the latest patch, and I'll try to evaluate it at first available opportunity. OpenSearchServer connector -- Key: CONNECTORS-224 URL: https://issues.apache.org/jira/browse/CONNECTORS-224 Project: ManifoldCF Issue Type: New Feature Components: OpenSearchServer connector Affects Versions: ManifoldCF 0.3 Reporter: Emmanuel Keller Assignee: Karl Wright Labels: OpenSearchServer, connector, outputconnector Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch Original Estimate: 336h Remaining Estimate: 336h Provide an output connector for [OpenSearchServer|http://www.open-search-server.com]. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-61) Support bundling of LCF with an app
[ https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karl Wright resolved CONNECTORS-61. --- Resolution: Fixed Fix Version/s: ManifoldCF next Support bundling of LCF with an app --- Key: CONNECTORS-61 URL: https://issues.apache.org/jira/browse/CONNECTORS-61 Project: ManifoldCF Issue Type: Sub-task Components: Documentation, Framework crawler agent Affects Versions: ManifoldCF 0.3 Reporter: Jack Krupansky Assignee: Karl Wright Priority: Minor Fix For: ManifoldCF next It should be possible for an application developer to bundle LCF with an application to facilitate installation and deployment of the application in conjunction with LCF. This may (or may not) be as simple as providing appropriate jar files and documentation for how to use them, but there may be other components or scripts needed. There are two options: 1) include the LCF UI along with the other LCF processes, and 2) exclude the LCF UI and include only the other processes that can be controlled via the full API. The database server would be included. The web app server would be optional since the application may have its own choice of web app server. One use case is bundling LCF with Solr or a Solr-based application. Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CONNECTORS-50) Proposal for initial two releases of LCF, including packaged product and full API
[ https://issues.apache.org/jira/browse/CONNECTORS-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-50.

    Resolution: Fixed
    Fix Version/s: ManifoldCF 0.3
    Assignee: Karl Wright

Resolved, leaving scripting language request open as a separate ticket.

Proposal for initial two releases of LCF, including packaged product and full API

    Key: CONNECTORS-50
    URL: https://issues.apache.org/jira/browse/CONNECTORS-50
    Project: ManifoldCF
    Issue Type: New Feature
    Reporter: Jack Krupansky
    Assignee: Karl Wright
    Fix For: ManifoldCF 0.3

Currently, LCF has a relatively high bar for evaluation and use, requiring developer expertise. Also, although LCF has a comprehensive UI, it is not currently packaged for use as a crawling engine for advanced applications.

A small set of individual feature requests are needed to address these issues. They are summarized briefly to show how they fit together for two initial releases of LCF, but will be broken out into individual LCF Jira issues.

Goals:
1. LCF as a standalone, downloadable, usable-out-of-the-box product (much as Solr is today)
2. LCF as a toolkit for developers needing customized crawling and repository access
3. An API-based crawling engine that can be integrated with applications (as Aperture is today)

Larger goals:
1. Make it very easy for users to evaluate LCF.
2. Make it very easy for developers to customize LCF.
3. Make it very easy for applications to fully manage and control LCF in operation.

Two phases:
1) Standalone, packaged app that is super-easy to evaluate and deploy. Call it LCF 0.5.
2) API-based crawling engine for applications for which the UI might not be appropriate. Call it LCF 1.0.

Phase 1
-------

LCF 0.5 right out of the box would interface loosely with Solr 1.4 or later.

It would contain roughly the features that are currently in place or currently underway, plus a little more.

Specifically, LCF 0.5 would contain these additional capabilities:
1. Plug-in architecture for connectors (CONNECTORS-40 - DONE)
2. Packaged app ready to run with embedded Jetty app server (CONNECTORS-59)
3. Bundled with database - PostgreSQL or Derby - ready to run without additional manual setup (CONNECTORS-55)
4. Mini-API to initially configure default connections and example jobs for file system and web crawl (CONNECTORS-58)
5. Agent process started automatically (CONNECTORS-60)
6. Solr output connector option to commit at end of job, by default (CONNECTORS-57)

Installation and basic evaluation of LCF would be essentially as simple as Solr is today. The example connections and jobs would permit the user to initiate example crawls of a file system example directory and an example web on the LCF web site with just a couple of clicks (as opposed to the detailed manual setup required today to create repository and output connections and jobs).

It is worth considering whether the SharePoint connector could also be included as part of the default package.

Users could then add additional connectors and repositories and jobs as desired.

Timeframe for release? Level of effort?

Phase 2
-------

The essence of Phase 2 is that LCF would be split to allow direct, full API access to LCF as a crawling engine, in addition to the full LCF UI. Call this LCF 1.0.

Specifically, LCF 1.0 would contain these additional capabilities:
1. Full API for LCF as a crawling engine (CONNECTORS-56)
2. LCF can be bundled within an app (CONNECTORS-61)
3. LCF event and activity notification for full control by an application (CONNECTORS-41)

Overall, LCF will offer roughly the same crawling capabilities as with LCF 0.5, plus whatever bug fixes and minor enhancements might also be added.

Timeframe for release? Level of effort?

Issues:
- Can we package PostgreSQL with LCF so LCF can set it up?
- Or do we need Derby for that purpose?
- Managing multiple processes (UI, database, agent, app processes)
- What exactly would the API look like? (URL, XML, JSON, YAML?)
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CONNECTORS-50) Proposal for initial two releases of LCF, including packaged product and full API
[ https://issues.apache.org/jira/browse/CONNECTORS-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084131#comment-13084131 ]

Karl Wright commented on CONNECTORS-50:
---
I'm going to resolve this ticket, since the planning part is now meaningless, and the only thing that remains is a scripting language, for which there is a separate ticket.

Proposal for initial two releases of LCF, including packaged product and full API
---
Key: CONNECTORS-50
URL: https://issues.apache.org/jira/browse/CONNECTORS-50
Project: ManifoldCF
Issue Type: New Feature
Reporter: Jack Krupansky
Fix For: ManifoldCF 0.3
[jira] [Updated] (CONNECTORS-34) eRoom authority and connector
[ https://issues.apache.org/jira/browse/CONNECTORS-34?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-34:
---
Component/s: eRoom connector

eRoom authority and connector
---
Key: CONNECTORS-34
URL: https://issues.apache.org/jira/browse/CONNECTORS-34
Project: ManifoldCF
Issue Type: New Feature
Components: eRoom connector
Reporter: Karl Wright

eRoom has a SOAP API which looks like it has enough power to implement a connector and an authority. The eRoom API URL is here (and yes, it is a Swiss .ch URL, but it is legit): https://eroom.abraxas.ch/eroomHelp/en/API_Help/Api.htm#home_api.html
[jira] [Commented] (CONNECTORS-54) A Filesystem output connector would be useful and would allow more complete unit tests
[ https://issues.apache.org/jira/browse/CONNECTORS-54?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084140#comment-13084140 ]

Karl Wright commented on CONNECTORS-54:
---
ManifoldCF in Action required a simple file-based repository to be written, and an output connector to that repository as well. It's not so simple, because the metadata and ACL information also need to go into the file system for this to be useful. I'm therefore going to close this ticket, since I can't think of any realistic use for the proposed connector other than testing.

A Filesystem output connector would be useful and would allow more complete unit tests
---
Key: CONNECTORS-54
URL: https://issues.apache.org/jira/browse/CONNECTORS-54
Project: ManifoldCF
Issue Type: Improvement
Reporter: Karl Wright
Fix For: ManifoldCF 0.3

Right now, the unit tests are limited because there is no way to check that the indexed files actually do get indexed. The addition of a filesystem output connector would allow more complete tests to be constructed. In addition, such a connector might well be useful in its own right.

The connector would need to convert URIs into relative file paths, but other than that there's really nothing very tricky about it. Configuration information is minimal; the root path of the output is all that's needed.
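As a rough illustration of the URI-to-relative-path conversion mentioned in the CONNECTORS-54 description, here is a minimal sketch, assuming Java 7 or later (for java.nio.file). The class and method names (UriPathMapper, toOutputPath) are hypothetical and are not ManifoldCF code, and the layout (host name as the first directory, URI path below it) is only one possible mapping. It does not attempt to store metadata or ACLs, which the comment above identifies as the hard part.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of ManifoldCF.
class UriPathMapper {
  // Maps a document URI to a file path under the configured output root.
  static Path toOutputPath(Path root, String documentUri) {
    URI uri = URI.create(documentUri);
    // Opaque or host-less URIs get a placeholder directory (an assumption).
    String host = (uri.getHost() == null) ? "_nohost" : uri.getHost();
    String path = (uri.getPath() == null) ? "" : uri.getPath();
    // Drop leading slashes so the URI path resolves relative to root/host.
    String relative = path.replaceFirst("^/+", "");
    Path base = root.normalize();
    Path result = base.resolve(host).resolve(relative).normalize();
    // Reject URIs whose ".." segments would climb out of the output root.
    if (!result.startsWith(base)) {
      throw new IllegalArgumentException("URI escapes the output root: " + documentUri);
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(toOutputPath(Paths.get("out"), "http://example.com/docs/a.txt"));
  }
}
```

A real connector would also have to decide how to handle query strings, characters that are illegal in file names on the target platform, and URIs that end in a slash; none of that is covered here.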
[jira] [Resolved] (CONNECTORS-54) A Filesystem output connector would be useful and would allow more complete unit tests
[ https://issues.apache.org/jira/browse/CONNECTORS-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-54.
---
Resolution: Won't Fix
Fix Version/s: ManifoldCF 0.3

A Filesystem output connector would be useful and would allow more complete unit tests
---
Key: CONNECTORS-54
URL: https://issues.apache.org/jira/browse/CONNECTORS-54
Project: ManifoldCF
Issue Type: Improvement
Reporter: Karl Wright
Fix For: ManifoldCF 0.3
[jira] [Updated] (CONNECTORS-94) fix common localization traps
[ https://issues.apache.org/jira/browse/CONNECTORS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright updated CONNECTORS-94:
---
Component/s: Framework core

fix common localization traps
---
Key: CONNECTORS-94
URL: https://issues.apache.org/jira/browse/CONNECTORS-94
Project: ManifoldCF
Issue Type: Task
Components: Framework core
Reporter: Robert Muir

Searching thru the LCF code, I found several uses of the following that appear to be potentially dangerous:
* getBytes() with no encoding: this is dangerous because it uses the platform default encoding, which varies from machine to machine. In most places this should likely mean "UTF-8".
* getBytes("utf-8"): this is mostly a nitpick, but this alias is not guaranteed to exist (see Charset docs). I suggest changing these all to "UTF-8".
* String.toLowerCase()/String.toUpperCase() with no specified Locale, where it appears the text is not used solely for display, but instead for 'caseless matching'. I suggest changing these to use either the root Locale, new Locale(""), or even easier, Locale.ENGLISH. This way ACF does not have surprising behavior on, say, a Turkish computer.

I can contribute a patch to address these.
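The traps listed in CONNECTORS-94 can be illustrated with a small sketch. It assumes Java 7 or later, where StandardCharsets.UTF_8 and Locale.forLanguageTag are available; on older JVMs the equivalent would be getBytes("UTF-8") with the checked UnsupportedEncodingException. The class name LocaleSafeText is hypothetical and is not the patch offered in the ticket.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for illustration only.
class LocaleSafeText {
  // Always encode with an explicit charset instead of the platform default.
  static byte[] encode(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // Lowercase with a fixed locale so matching does not depend on the machine's settings.
  static String caselessKey(String s) {
    return s.toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    Locale turkish = Locale.forLanguageTag("tr-TR");
    // Under Turkish rules "I" lowercases to dotless "ı", so this comparison fails.
    System.out.println("TITLE".toLowerCase(turkish).equals("title"));
    // With the root locale the result is the same on every machine.
    System.out.println(caselessKey("TITLE").equals("title"));
  }
}
```

The no-argument String.toLowerCase() uses the JVM's default locale, which is why the first comparison in main is what an unmodified caseless match would see on a Turkish system.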
[jira] [Commented] (CONNECTORS-94) fix common localization traps
[ https://issues.apache.org/jira/browse/CONNECTORS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084142#comment-13084142 ]

Karl Wright commented on CONNECTORS-94:
---
Has there been any update to this ticket?

fix common localization traps
---
Key: CONNECTORS-94
URL: https://issues.apache.org/jira/browse/CONNECTORS-94
Project: ManifoldCF
Issue Type: Task
Components: Framework core
Reporter: Robert Muir
[jira] [Assigned] (CONNECTORS-94) fix common localization traps
[ https://issues.apache.org/jira/browse/CONNECTORS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright reassigned CONNECTORS-94:
---
Assignee: Robert Muir

fix common localization traps
---
Key: CONNECTORS-94
URL: https://issues.apache.org/jira/browse/CONNECTORS-94
Project: ManifoldCF
Issue Type: Task
Components: Framework core
Reporter: Robert Muir
Assignee: Robert Muir
[jira] [Resolved] (CONNECTORS-92) Move from ant to maven or other build system with decent library management
[ https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-92.
---
Resolution: Fixed
Fix Version/s: ManifoldCF 0.3

Move from ant to maven or other build system with decent library management
---
Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: ManifoldCF
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Assignee: Karl Wright
Fix For: ManifoldCF 0.3
Attachments: Screen shot 2010-08-23 at 16.31.07.png, maven-poms-including-start-jar.patch, maven-poms-problem-starting-jetty-and-derby.patch, maven-start-jar.patch, move-to-maven-acf-framework.patch, patch-connectors.zip

I am looking at the current project structure. If we want to make another build tool available, I think we need to change the directory structure. I tried to capture a suggestion in an image. Can you please have a look at it? If we agree that this is a good way to go, then I will continue to work on a patch. That might be a bit hard with all these changing directories, but I'll do my best to at least get an idea of whether it would work.

So I have three questions:
- Do you want to move to maven or put maven next to ant?
- Do you prefer another build mechanism [ant with ivy, gradle, maven3]?
- Do you have an idea about the number of scripts that need to be changed if we change the project structure?

The image of a possible project layout (based on the maven standards) is attached to the issue.
[jira] [Commented] (CONNECTORS-92) Move from ant to maven or other build system with decent library management
[ https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084146#comment-13084146 ]

Karl Wright commented on CONNECTORS-92:
---
We now have a maven build system, in addition to ant, that was contributed elsewhere, so I'm closing this ticket.

Move from ant to maven or other build system with decent library management
---
Key: CONNECTORS-92
URL: https://issues.apache.org/jira/browse/CONNECTORS-92
Project: ManifoldCF
Issue Type: Wish
Components: Build
Reporter: Jettro Coenradie
Assignee: Karl Wright
Fix For: ManifoldCF 0.3
[jira] [Commented] (CONNECTORS-100) DB lock timeout, and/or indefinite or excessive database activity
[ https://issues.apache.org/jira/browse/CONNECTORS-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084147#comment-13084147 ] Karl Wright commented on CONNECTORS-100: I haven't heard anything back from the Derby folks. I'm therefore going to leave this ticket open. HSQLDB works better for the hopcount queries, although it does not work well for the report queries. So I guess you can pick your poison at the moment. DB lock timeout, and/or indefinite or excessive database activity - Key: CONNECTORS-100 URL: https://issues.apache.org/jira/browse/CONNECTORS-100 Project: ManifoldCF Issue Type: Bug Components: Framework core Environment: Running unmodified dist/example from trunk/ using the default configuration. Reporter: Andrzej Bialecki Assignee: Karl Wright When a job is started and running (via crawler-ui) occasionally it's not possible to display a list of running jobs. The problem persists even after restarting ACF. The following exception is thrown in the console: {code} org.apache.acf.core.interfaces.ACFException: Database exception: Exception doing query: A lock could not be obtained within the time requested at org.apache.acf.core.database.Database.executeViaThread(Database.java:421) at org.apache.acf.core.database.Database.executeUncachedQuery(Database.java:465) at org.apache.acf.core.database.Database$QueryCacheExecutor.create(Database.java:1072) at org.apache.acf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144) at org.apache.acf.core.database.Database.executeQuery(Database.java:167) at org.apache.acf.core.database.DBInterfaceDerby.performQuery(DBInterfaceDerby.java:727) at org.apache.acf.crawler.jobs.JobManager.makeJobStatus(JobManager.java:5611) at org.apache.acf.crawler.jobs.JobManager.getAllStatus(JobManager.java:5549) at org.apache.jsp.showjobstatus_jsp._jspService(showjobstatus_jsp.java:316) at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70) at 
javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:377) at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313) at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260) at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) at org.mortbay.jetty.Server.handle(Server.java:326) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547) at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228) at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) Caused by: java.sql.SQLTransactionRollbackException: A lock could not be obtained within the time requested at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source) at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source) at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source) at 
org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source) at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source) at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown
[jira] [Commented] (CONNECTORS-31) For the Solr LCF security filter plugin, establish a concept of session to improve performance
[ https://issues.apache.org/jira/browse/CONNECTORS-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084151#comment-13084151 ]

Karl Wright commented on CONNECTORS-31:
---
We fixed this another way, by using caching within individual authorities. This improvement is therefore likely unneeded, so I'm going to close this ticket for now.

For the Solr LCF security filter plugin, establish a concept of session to improve performance
---
Key: CONNECTORS-31
URL: https://issues.apache.org/jira/browse/CONNECTORS-31
Project: ManifoldCF
Issue Type: Improvement
Components: Solr Security Filter
Reporter: Karl Wright
Fix For: ManifoldCF 0.3

Instead of only allowing an authenticated user name to be passed to the LCFSecurityFilter SearchComponent, improve this to return a security token and optionally receive the security token as well. Then it will be possible for it to make the access tokens sticky, reducing load on the authority service in situations where multiple searches occur in each session.
[jira] [Resolved] (CONNECTORS-31) For the Solr LCF security filter plugin, establish a concept of session to improve performance
[ https://issues.apache.org/jira/browse/CONNECTORS-31?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-31.
---
Resolution: Won't Fix
Fix Version/s: ManifoldCF 0.3

For the Solr LCF security filter plugin, establish a concept of session to improve performance
---
Key: CONNECTORS-31
URL: https://issues.apache.org/jira/browse/CONNECTORS-31
Project: ManifoldCF
Issue Type: Improvement
Components: Solr Security Filter
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3
Re: File Metadata
Curious, since jcifs seems to be free, why is it not included as one of the base connectors and in the base build? PS. Going through the steps of building the jcifs connector. On 8/8/2011 11:53 AM, Karl Wright wrote: PS. My next item is the file owner, so far I'm finding a lot of references to performing JNI per file. The whole goal is to be able to find a set of crawled docs that were modified a date range which belonged to person X. If you can limit yourself to crawling files that are accessible from within a Windows share, you can use the jCIFS connector to get this, I think. But filtering by owner would require a feature addition to the connector, I believe. Karl
Re: File Metadata
jcifs is LGPL licensed, and that is not one of the license types Apache permits for redistribution. Karl On Fri, Aug 12, 2011 at 10:26 AM, Farzad Valad ho...@farzad.net wrote: Curious, since jcifs seems to be free, why is it not included as one of the base connectors and in the base build? PS. Going through the steps of building the jcifs connector. On 8/8/2011 11:53 AM, Karl Wright wrote: PS. My next item is the file owner, so far I'm finding a lot of references to performing JNI per file. The whole goal is to be able to find a set of crawled docs that were modified a date range which belonged to person X. If you can limit yourself to crawling files that are accessible from within a Windows share, you can use the jCIFS connector to get this, I think. But filtering by owner would require a feature addition to the connector, I believe. Karl
[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084220#comment-13084220 ]

Karl Wright commented on CONNECTORS-224:
---
It still doesn't build on 1.5:

compile-connector:
    [javac] C:\wip\mcf\CONNECTORS-224\connectors\opensearchserver\build.xml:10: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 9 source files to C:\wip\mcf\CONNECTORS-224\connectors\opensearchserver\build\connector\classes
    [javac] C:\wip\mcf\CONNECTORS-224\connectors\opensearchserver\connector\src\main\java\org\apache\manifoldcf\agents\output\opensearchserver\OpenSearchServerIndex.java:59: cannot find symbol
    [javac] symbol  : constructor IOException(org.apache.manifoldcf.core.interfaces.ManifoldCFException)
    [javac] location: class java.io.IOException
    [javac]     throw new IOException(e);
    [javac]           ^
    [javac] 1 error

This is easily corrected; I'll check in a fix.

OpenSearchServer connector
---
Key: CONNECTORS-224
URL: https://issues.apache.org/jira/browse/CONNECTORS-224
Project: ManifoldCF
Issue Type: New Feature
Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
Labels: OpenSearchServer, connector, outputconnector
Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch
Original Estimate: 336h
Remaining Estimate: 336h

Provide an output connector for [OpenSearchServer|http://www.open-search-server.com].
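For context on the compile error above: the IOException(Throwable) constructor was only added in Java 6, so it is not available when building against 1.5. One common workaround, sketched below, is to use the message-only constructor and attach the cause with initCause(). This is an illustration under that assumption, not necessarily the fix that was checked in, and the class name IoExceptions is hypothetical.

```java
import java.io.IOException;

// Hypothetical helper for illustration only.
class IoExceptions {
  // Wraps any Throwable in an IOException using only APIs present in Java 1.5.
  static IOException wrap(Throwable cause) {
    IOException ioe = new IOException(cause.getMessage());
    // initCause is defined on Throwable and predates the cause-taking constructors.
    ioe.initCause(cause);
    return ioe;
  }

  public static void main(String[] args) {
    IOException wrapped = wrap(new IllegalStateException("index unavailable"));
    System.out.println(wrapped.getMessage());
    System.out.println(wrapped.getCause().getClass().getName());
  }
}
```

At the failing call site this would read `throw IoExceptions.wrap(e);` in place of `throw new IOException(e);`.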
[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084224#comment-13084224 ]

Karl Wright commented on CONNECTORS-224:
---
Another problem that occurs in this connector with no real connection is that notification never completes. Tons of exceptions:

ERROR 2011-08-12 12:15:00,637 (Job notification thread) - java.net.ConnectException: Connection refused: connect
org.apache.manifoldcf.core.interfaces.ManifoldCFException: java.net.ConnectException: Connection refused: connect
    at org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnection.call(OpenSearchServerConnection.java:102)
    at org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerAction.init(OpenSearchServerAction.java:19)
    at org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnector.noteJobComplete(OpenSearchServerConnector.java:328)
    at org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:115)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at java.net.Socket.connect(Socket.java:478)
    at java.net.Socket.<init>(Socket.java:375)
    at java.net.Socket.<init>(Socket.java:249)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(Unknown Source)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(Unknown Source)
    at org.apache.commons.httpclient.HttpConnection.open(Unknown Source)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(Unknown Source)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(Unknown Source)
    at org.apache.commons.httpclient.HttpClient.executeMethod(Unknown Source)
    at org.apache.commons.httpclient.HttpClient.executeMethod(Unknown Source)
    at org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnection.call(OpenSearchServerConnection.java:93)
    ... 3 more

This is actually a framework problem, I believe; the job is aborted but notification is attempted anyhow. But it is supposed to give up, not retry indefinitely. I'll create a new ticket for that issue.

OpenSearchServer connector
---
Key: CONNECTORS-224
URL: https://issues.apache.org/jira/browse/CONNECTORS-224
Project: ManifoldCF
Issue Type: New Feature
Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
Labels: OpenSearchServer, connector, outputconnector
Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch
Original Estimate: 336h
Remaining Estimate: 336h
[jira] [Created] (CONNECTORS-238) Exception on end notification is not handled properly
Exception on end notification is not handled properly - Key: CONNECTORS-238 URL: https://issues.apache.org/jira/browse/CONNECTORS-238 Project: ManifoldCF Issue Type: Bug Components: Framework agents process Reporter: Karl Wright When an exception occurs during end notification, handling should permit the job to stop. Notification is a nicety, not a requirement, and the notification method is called even when the job is aborted. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
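The behavior requested in CONNECTORS-238 can be sketched as follows. Everything here is hypothetical: the OutputNotifier interface and finishJob method are illustrative stand-ins and not the framework's actual classes (only the method name noteJobComplete is borrowed from the stack trace reported on CONNECTORS-224). The point is simply that a failed notification is recorded and the job is still allowed to stop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the ManifoldCF agents framework.
class NotificationSketch {
  // Stand-in for whatever the framework calls when a job ends.
  interface OutputNotifier {
    void noteJobComplete() throws Exception;
  }

  // Attempts notification once, then lets the job stop regardless of the outcome.
  static List<String> finishJob(OutputNotifier notifier) {
    List<String> log = new ArrayList<String>();
    try {
      notifier.noteJobComplete();
      log.add("notified");
    } catch (Exception e) {
      // Notification is a nicety, not a requirement: record the failure and move on.
      log.add("notification failed: " + e.getMessage());
    }
    log.add("job stopped");
    return log;
  }
}
```

Whether the real fix retries a bounded number of times before giving up is not stated in the ticket; this sketch makes a single attempt.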
Compiling the JCIFS connector
Somehow, when I'm building the core jars (pull-agent, agent, etc.), some of the classes are missing methods needed by the jcifs connector. I'm pretty stumped as to why the Java compiler would not include some of the methods. Thought I'd just raise a flag in case you know what might be happening. For example, the class IFingerprintActivity is missing checkLengthIndexable and checkURLIndexable when I decompile the bytecode. My build process is to build the core mcf jars first, then build the jcifs connector. Thanks!
[jira] [Resolved] (CONNECTORS-238) Exception on end notification is not handled properly
[ https://issues.apache.org/jira/browse/CONNECTORS-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karl Wright resolved CONNECTORS-238.
---
Resolution: Fixed
Fix Version/s: ManifoldCF 0.3
Assignee: Karl Wright

r1157178

Exception on end notification is not handled properly
---
Key: CONNECTORS-238
URL: https://issues.apache.org/jira/browse/CONNECTORS-238
Project: ManifoldCF
Issue Type: Bug
Components: Framework agents process
Reporter: Karl Wright
Assignee: Karl Wright
Fix For: ManifoldCF 0.3
Re: Compiling the JCIFS connector
It sounds like you have old core, agents, and pull-agents jars around. It builds fine here with the ant build.

Karl

On Fri, Aug 12, 2011 at 12:25 PM, Farzad Valad ho...@farzad.net wrote:
Somehow, when I'm building the core jars (pull-agent, agent, etc.), some of the classes are missing methods needed by the jcifs connector. I'm pretty stumped as to why the Java compiler would not include some of the methods. Thought I'd just raise a flag in case you know what might be happening. For example, the class IFingerprintActivity is missing checkLengthIndexable and checkURLIndexable when I decompile the bytecode. My build process is to build the core mcf jars first, then build the jcifs connector. Thanks!
[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector
[ https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084252#comment-13084252 ]

Karl Wright commented on CONNECTORS-224:
---
Ok, I've tested some more. This is a lot better than before. The issues previously described seem to have been fixed. Now we have a new set of issues to consider. I'm listing these below.

(1) In addOrReplaceDocument(), the following code is present:

    OpenSearchServerConfig config = getConfigParameters(null);
    Integer count = addInstance(config);
    synchronized (count) {

Can you explain what the purpose of this synchronizer is? It looks to me like you might be effectively managing your own connection pool here, which is both redundant and would prevent ManifoldCF end users from controlling the size of that pool. Am I correct?

(2) In addOrReplaceDocument(), you close the input stream. You should not do that. The caller closes the stream.

(3) Formatting. Apache guidelines set indenting to 2 spaces, with no tabs. ManifoldCF adheres to this convention. It would be great if you could reformat accordingly.

Thanks again!

OpenSearchServer connector
---
Key: CONNECTORS-224
URL: https://issues.apache.org/jira/browse/CONNECTORS-224
Project: ManifoldCF
Issue Type: New Feature
Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
Labels: OpenSearchServer, connector, outputconnector
Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, oss-mfc-beta.patch, oss-mfc-dev.patch
Original Estimate: 336h
Remaining Estimate: 336h
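Point (2) of the review comment above, that the caller rather than the connector closes the input stream, can be sketched like this. The class and method names (StreamOwnership, consume) are hypothetical and stand in for addOrReplaceDocument(); the sketch only shows the ownership convention, not how the connector actually sends data to OpenSearchServer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch of the "caller closes the stream" convention.
class StreamOwnership {
  // Callee: reads the stream to the end but leaves closing to the caller.
  static long consume(InputStream in) throws IOException {
    byte[] buffer = new byte[8192];
    long total = 0;
    int n;
    while ((n = in.read(buffer)) != -1) {
      total += n;
    }
    return total; // deliberately no in.close() here
  }

  public static void main(String[] args) throws IOException {
    InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3});
    try {
      System.out.println(consume(in));
    } finally {
      in.close(); // the code that opened the stream is the code that closes it
    }
  }
}
```

Keeping ownership with the caller means the same stream can be handed to a connector without the caller having to guess whether it is still usable afterwards.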