[jira] [Updated] (CONNECTORS-213) DBInterfaceMySQL Initalization error

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-213:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 DBInterfaceMySQL Initalization error
 

 Key: CONNECTORS-213
 URL: https://issues.apache.org/jira/browse/CONNECTORS-213
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Emanuele Lombardi
Assignee: Karl Wright
  Labels: Mysql, bug, manifoldcf
 Fix For: ManifoldCF 0.3

 Attachments: DBInterfaceMySQL.patch


 When I try to create a new database using DBCreate and the MySQL DBInterface, I 
 get this exception:
 Configuration file successfully read
 Exception in thread "main" java.lang.NullPointerException
   at org.apache.manifoldcf.core.interfaces.CacheManagerFactory.make(CacheManagerFactory.java:40)
   at org.apache.manifoldcf.core.database.Database.init(Database.java:63)
   at org.apache.manifoldcf.core.database.DBInterfaceMySQL.createUserAndDatabase(DBInterfaceMySQL.java:391)
   at org.apache.manifoldcf.core.system.ManifoldCF.createSystemDatabase(ManifoldCF.java:656)
   at org.apache.manifoldcf.core.DBCreate.doExecute(DBCreate.java:51)
   at org.apache.manifoldcf.core.DBInitializationCommand.execute(DBInitializationCommand.java:54)
   at org.apache.manifoldcf.core.DBCreate.main(DBCreate.java:80)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CONNECTORS-213) DBInterfaceMySQL Initalization error

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084038#comment-13084038
 ] 

Karl Wright commented on CONNECTORS-213:


I haven't heard anything further, so I'm closing this ticket.  I'll either 
reopen it, or open a new one, if there are further mysql issues/patches 
reported.


 DBInterfaceMySQL Initalization error
 

 Key: CONNECTORS-213
 URL: https://issues.apache.org/jira/browse/CONNECTORS-213
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
Affects Versions: ManifoldCF 0.3
Reporter: Emanuele Lombardi
  Labels: Mysql, bug, manifoldcf
 Fix For: ManifoldCF 0.3

 Attachments: DBInterfaceMySQL.patch






[jira] [Commented] (CONNECTORS-226) Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084043#comment-13084043
 ] 

Karl Wright commented on CONNECTORS-226:


After some thought, I believe the correct approach is to convert Livelink's 
usage to a ServiceInterruption.


 Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR 
 needs to be reviewed and clarified
 

 Key: CONNECTORS-226
 URL: https://issues.apache.org/jira/browse/CONNECTORS-226
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core, JCIFS connector, LiveLink connector
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright

 The ManifoldCFException type REPOSITORY_CONNECTION_ERROR seems to be treated 
 inconsistently by the framework.  In some places it is treated as a 
 permanent connection exception, and in others as a temporary connection 
 exception (in lieu of a ServiceInterruption where a ServiceInterruption is not 
 possible).  Only two connectors use it (LiveLink and JCIFS), and the JCIFS 
 case is not interesting, so in practice the type currently exists only to 
 support LiveLink.
 There are two ways forward.  The first is to convert the LiveLink 
 connector's exception to a true ServiceInterruption, and revert 
 REPOSITORY_CONNECTION_ERROR to its original meaning, which has since been 
 deprecated because connect() methods can no longer throw 
 ManifoldCFExceptions at all.  The second is to continue the current 
 LiveLink-style usage, and make ALL usages consistent with that.
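
 As a rough illustration of the distinction described above (a temporary 
 failure that should be retried versus a permanent one that should not), here 
 is a minimal Java sketch. It deliberately does not use the ManifoldCF classes: 
 TemporaryFailure, PermanentFailure, Outcome, Fetch and attempt() are 
 hypothetical stand-ins, and the real framework behavior may differ.

```java
public class InterruptionSketch {
    /** Hypothetical stand-in for a temporary failure: the caller should retry later. */
    public static class TemporaryFailure extends Exception {
        public final long retryDelayMillis;

        public TemporaryFailure(String message, long retryDelayMillis) {
            super(message);
            this.retryDelayMillis = retryDelayMillis;
        }
    }

    /** Hypothetical stand-in for a permanent failure: retrying will not help. */
    public static class PermanentFailure extends Exception {
        public PermanentFailure(String message) {
            super(message);
        }
    }

    /** What a caller might decide after one processing attempt. */
    public enum Outcome { DONE, RETRY_LATER, ABORT }

    /** One unit of repository work that can fail in either way. */
    public interface Fetch {
        void run() throws TemporaryFailure, PermanentFailure;
    }

    /** Maps the two failure kinds to two different reactions. */
    public static Outcome attempt(Fetch fetch) {
        try {
            fetch.run();
            return Outcome.DONE;
        } catch (TemporaryFailure e) {
            // Temporary: keep the work queued and try again after the delay.
            return Outcome.RETRY_LATER;
        } catch (PermanentFailure e) {
            // Permanent: stop, since repeating the call cannot succeed.
            return Outcome.ABORT;
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt(() -> { }));
        System.out.println(attempt(() -> {
            throw new TemporaryFailure("server busy", 300_000L);
        }));
        System.out.println(attempt(() -> {
            throw new PermanentFailure("bad credentials");
        }));
    }
}
```

 The point of the sketch is only that the two cases need distinct types (or 
 distinct type codes) if callers are to treat them consistently; a single type 
 that sometimes means "retry" and sometimes means "give up" forces each caller 
 to guess.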





[jira] [Resolved] (CONNECTORS-226) Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR needs to be reviewed and clarified

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-226.


   Resolution: Fixed
Fix Version/s: ManifoldCF 0.3

r1157065


 Usage and meaning of ManifoldCFException type REPOSITORY_CONNECTION_ERROR 
 needs to be reviewed and clarified
 

 Key: CONNECTORS-226
 URL: https://issues.apache.org/jira/browse/CONNECTORS-226
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core, JCIFS connector, LiveLink connector
Affects Versions: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 0.3






[jira] [Commented] (CONNECTORS-55) Bundle database server with ManifoldCF packaged product

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084063#comment-13084063
 ] 

Karl Wright commented on CONNECTORS-55:
---

HSQLDB support has been added, which works a lot better than Derby does, so I 
think we've finally hit the necessary criteria for closing out this ticket.


 Bundle database server with ManifoldCF packaged product
 ---

 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Installers
Reporter: Jack Krupansky
 Fix For: ManifoldCF 0.3


 The current requirement that the user install and deploy a PostgreSQL server 
 complicates the installation and deployment of LCF for the user. Installation 
 and deployment of LCF should be as simple as Solr itself. QuickStart is great 
 for the low-end and basic evaluation, but a comparable level of simplified 
 installation and deployment is still needed for full-blown, high-end 
 environments that need the full performance of a PostgreSQL-class database 
 server. So, PostgreSQL should be bundled with the packaged release of LCF so 
 that installation and deployment of LCF will automatically install and deploy 
 a subset of the full PostgreSQL distribution that is sufficient for the needs 
 of LCF. Starting LCF, with or without the LCF UI, should automatically start 
 the database server. Shutting down LCF should also shut down the database 
 server process.
 A typical use case would be for a non-developer who is comfortable with Solr 
 and simply wants to crawl documents from, for example, a SharePoint 
 repository and feed them into Solr. QuickStart should work well for the low 
 end or in the early stages of evaluation, but the user would prefer to 
 evaluate the real thing with something resembling a production crawl of 
 thousands of documents. Such a user might not be a hard-core developer or be 
 comfortable fiddling with a lot of software components simply to do one 
 conceptually simple operation.
 It should still be possible for the user to supply database server settings 
 to override the defaults, but the LCF package should have all of the 
 best-practice settings deemed appropriate for use with LCF.
 One downside is that installation and deployment will be platform-specific 
 since there are multiple processes and PostgreSQL itself requires a 
 platform-specific installation.
 This proposal presumes that PostgreSQL is the best option for the foreseeable 
 future, but nothing here is intended to preclude support for other database 
 servers in future releases.
 This proposal should not have any impact on QuickStart packaging or 
 deployment.
 Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.





[jira] [Assigned] (CONNECTORS-55) Bundle database server with ManifoldCF packaged product

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-55:
-

Assignee: Karl Wright

 Bundle database server with ManifoldCF packaged product
 ---

 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Installers
Reporter: Jack Krupansky
Assignee: Karl Wright
 Fix For: ManifoldCF 0.3






[jira] [Resolved] (CONNECTORS-60) Agent process should be started automatically

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-60.
---

   Resolution: Fixed
Fix Version/s: ManifoldCF 0.1
   ManifoldCF 0.2
   ManifoldCF 0.3

 Agent process should be started automatically
 -

 Key: CONNECTORS-60
 URL: https://issues.apache.org/jira/browse/CONNECTORS-60
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Documentation
Reporter: Jack Krupansky
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF 0.3, ManifoldCF 0.2, ManifoldCF 0.1


 LCF as it exists today is a bit too complex to run for an average user, 
 especially with a separate agent process for crawling. LCF should be as easy 
 to run as Solr is today. QuickStart is a good move in this direction, but the 
 same user-visible simplicity is needed for full LCF. The separate agent 
 process is a reasonable design for execution, but a little too cumbersome for 
 the average user to manage.
 Unfortunately, it is expected that starting up a multi-process application 
 will require platform-specific scripting.
 Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.
 KDW - this functionality is already present; however the documentation is not 
 adequate to help people figure out how to do it.  So I'm moving this to 
 Documentation and treating it as a doc bug.





[jira] [Assigned] (CONNECTORS-60) Agent process should be started automatically

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-60:
-

Assignee: Karl Wright

 Agent process should be started automatically
 -

 Key: CONNECTORS-60
 URL: https://issues.apache.org/jira/browse/CONNECTORS-60
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Documentation
Reporter: Jack Krupansky
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3






[jira] [Commented] (CONNECTORS-60) Agent process should be started automatically

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084067#comment-13084067
 ] 

Karl Wright commented on CONNECTORS-60:
---

The Quick Start definitely meets the requirements listed for this task, so I'm 
closing it.


 Agent process should be started automatically
 -

 Key: CONNECTORS-60
 URL: https://issues.apache.org/jira/browse/CONNECTORS-60
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Documentation
Reporter: Jack Krupansky
Priority: Minor
 Fix For: ManifoldCF 0.1, ManifoldCF 0.2, ManifoldCF 0.3






[jira] [Updated] (CONNECTORS-58) ManifoldCF scripting language, executed via the API, plus example jobs for file system and web crawl

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-58:
--

Component/s: Scripting client

Adding a new ticket category for scripting client.

 ManifoldCF scripting language, executed via the API, plus example jobs for 
 file system and web crawl 
 ---

 Key: CONNECTORS-58
 URL: https://issues.apache.org/jira/browse/CONNECTORS-58
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Examples, Scripting client
Reporter: Jack Krupansky
Priority: Minor

 Creating a basic connection setup to do a relatively simple crawl for a file 
 system or web can be a daunting task for someone new to LCF. So, it would be 
 nice to have a scripting file that supports an abbreviated API (subset of the 
 full API discussed in CONNECTORS-56) sufficient to create a default set of 
 connections and example jobs that the new user can choose from.
 Beyond this initial need, this script format might be a useful form to dump 
 all of the connections and jobs in the LCF database in a form that can be 
 used to recreate an LCF configuration. Kind of a dump and reload 
 capability. That in fact might be how the initial example script gets created.
 Those are two distinct use cases, but could utilize the same feature.
 The example script could have example jobs to crawl a subdirectory of LCF, 
 crawl the LCF wiki, etc.
 There could be more than one script. There might be example scripts for each 
 form of connector.
 This capability should be available for both QuickStart and the general 
 release of LCF.
 As just one possibility, the script format might be a sequence of JSON 
 expressions, each with an initial string analogous to a servlet path to 
 specify the operation to be performed, followed by the JSON form of the 
 connection or job or other LCF object. Or, some other format might be more 
 suitable.
 Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.
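
 To make the "path followed by JSON" idea above concrete, here is a minimal 
 Java sketch of how one script line might be split into its two parts. The 
 line layout, the Command class and parseLine() are assumptions made for 
 illustration only; the issue does not fix a format, and the JSON payload is 
 kept as an opaque string rather than parsed.

```java
public class ScriptLineSketch {
    /** One hypothetical script entry: an operation path plus its JSON payload. */
    public static final class Command {
        public final String path;
        public final String json;

        public Command(String path, String json) {
            this.path = path;
            this.json = json;
        }
    }

    /**
     * Splits a line of the assumed form "<path> {json}" at the first '{'.
     * Everything before the brace is the path; the rest is the payload.
     */
    public static Command parseLine(String line) {
        String trimmed = line.trim();
        int brace = trimmed.indexOf('{');
        if (brace <= 0) {
            throw new IllegalArgumentException("expected '<path> {json}': " + line);
        }
        String path = trimmed.substring(0, brace).trim();
        String json = trimmed.substring(brace);
        return new Command(path, json);
    }

    public static void main(String[] args) {
        // The path and field names below are invented for the example.
        Command c = parseLine("jobs/example {\"description\":\"crawl a directory\"}");
        System.out.println(c.path);
        System.out.println(c.json);
    }
}
```

 A dump-and-reload tool along the lines described above could write one such 
 line per connection or job and replay them in order.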





[jira] [Commented] (CONNECTORS-61) Support bundling of LCF with an app

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084074#comment-13084074
 ] 

Karl Wright commented on CONNECTORS-61:
---

The current status of this ticket is that it is stalled.  What is needed, in my 
view, to complete it is the following:

(1) Make the jetty-runner class include specific methods for programmatically 
starting and stopping the ManifoldCF quick start from within an enclosing 
process, and
(2) Document those methods in how to build and deploy.
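
For readers wondering what "programmatically starting and stopping" from an 
enclosing process could look like, here is a minimal Java sketch of the 
general pattern. It is not the jetty-runner class: EmbeddedRunnerSketch, 
start(), stop() and isRunning() are hypothetical names, and the worker thread 
is a placeholder for whatever the real quick start would run.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class EmbeddedRunnerSketch {
    private final AtomicBoolean running = new AtomicBoolean(false);
    private Thread worker;

    /** Starts the placeholder service on a background thread; repeat calls are no-ops. */
    public synchronized void start() {
        if (!running.compareAndSet(false, true)) {
            return;
        }
        worker = new Thread(() -> {
            while (running.get()) {
                try {
                    Thread.sleep(50); // placeholder for serving requests or crawling
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }, "embedded-runner");
        worker.start();
    }

    /** Signals the worker to stop and waits for it to exit; repeat calls are no-ops. */
    public synchronized void stop() {
        if (!running.compareAndSet(true, false)) {
            return;
        }
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status instead of swallowing it.
            Thread.currentThread().interrupt();
        }
    }

    public boolean isRunning() {
        return running.get();
    }

    public static void main(String[] args) {
        EmbeddedRunnerSketch runner = new EmbeddedRunnerSketch();
        runner.start();
        System.out.println("running: " + runner.isRunning());
        runner.stop();
        System.out.println("running: " + runner.isRunning());
    }
}
```

An enclosing application would call start() during its own startup and stop() 
from its shutdown path, which is the kind of contract item (2) would need to 
document.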


 Support bundling of LCF with an app
 ---

 Key: CONNECTORS-61
 URL: https://issues.apache.org/jira/browse/CONNECTORS-61
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Documentation, Framework crawler agent
Affects Versions: ManifoldCF 0.3
Reporter: Jack Krupansky
Priority: Minor

 It should be possible for an application developer to bundle LCF with an 
 application to facilitate installation and deployment of the application in 
 conjunction with LCF. This may (or may not) be as simple as providing 
 appropriate jar files and documentation for how to use them, but there may be 
 other components or scripts needed.
 There are two options: 1) include the LCF UI along with the other LCF 
 processes, and 2) exclude the LCF UI and include only the other processes 
 that can be controlled via the full API.
 The database server would be included.
 The web app server would be optional since the application may have its own 
 choice of web app server.
 One use case is bundling LCF with Solr or a Solr-based application.
 Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.





[jira] [Updated] (CONNECTORS-224) OpenSearchServer connector

2011-08-12 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated CONNECTORS-224:
---

Attachment: oss-mfc-beta.patch

This new version fixes the previously raised points (Java 5, missing JavaScript checks, 
implementation of the check method, memory leak).
We also added a specification tab panel to handle MIME type and extension 
filters, and a limit on the maximum size of documents.
Finally, one issue remains: the following methods are never called: 
checkMimetypeIndexable and checkFileIndexable. So far we have not been able to 
work out why. Thanks in advance for your help.

 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-beta.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084101#comment-13084101
 ] 

Karl Wright commented on CONNECTORS-224:


checkMimeTypeIndexable and checkFileIndexable will only be called from certain 
connectors, which are in a position to prefilter content.  Other connectors 
will not be able to use these.

For example, the Web connector calls checkMimeTypeIndexable and the JCIFS 
connector calls checkFileIndexable.  Hope this helps.
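
The calling pattern described above can be sketched in a few lines of Java. 
This is a simplified illustration, not the ManifoldCF interfaces: the 
OutputChecks interface here takes only a MIME type, and crawl() is a 
hypothetical stand-in for a repository connector that knows each document's 
type before fetching it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PrefilterSketch {
    /** Hypothetical, simplified output-side check (the real signature may differ). */
    public interface OutputChecks {
        boolean checkMimeTypeIndexable(String mimeType);
    }

    /**
     * Hypothetical repository-side loop. Each candidate is {url, mimeType}.
     * The check is only reached because this crawler asks before sending;
     * a crawler that never asks would leave the method uncalled.
     */
    public static List<String> crawl(List<String[]> candidates, OutputChecks output) {
        List<String> sent = new ArrayList<>();
        for (String[] candidate : candidates) {
            String url = candidate[0];
            String mimeType = candidate[1];
            if (output.checkMimeTypeIndexable(mimeType)) {
                sent.add(url);
            }
        }
        return sent;
    }

    public static void main(String[] args) {
        Set<String> accepted = new HashSet<>(Arrays.asList("text/html", "text/plain"));
        OutputChecks output = accepted::contains;
        List<String[]> candidates = Arrays.asList(
            new String[] {"http://example.com/a.html", "text/html"},
            new String[] {"http://example.com/b.zip", "application/zip"});
        System.out.println(crawl(candidates, output));
    }
}
```

In this sketch the output side only declares what it accepts; whether the 
question is ever asked is up to the repository side, which matches the 
observation that the methods are invoked only from some connectors.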



 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-beta.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-12 Thread Emmanuel Keller (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084111#comment-13084111
 ] 

Emmanuel Keller commented on CONNECTORS-224:


It helps, thank you. So I will check with these connectors.

 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-beta.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h





[jira] [Commented] (CONNECTORS-61) Support bundling of LCF with an app

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084112#comment-13084112
 ] 

Karl Wright commented on CONNECTORS-61:
---

r1157095 for the jetty-runner changes.


 Support bundling of LCF with an app
 ---

 Key: CONNECTORS-61
 URL: https://issues.apache.org/jira/browse/CONNECTORS-61
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Documentation, Framework crawler agent
Affects Versions: ManifoldCF 0.3
Reporter: Jack Krupansky
Assignee: Karl Wright
Priority: Minor





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084116#comment-13084116
 ] 

Karl Wright commented on CONNECTORS-224:


Checked in the latest patch, and I'll try to evaluate it at the first available 
opportunity.


 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-beta.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h





[jira] [Resolved] (CONNECTORS-61) Support bundling of LCF with an app

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-61.
---

   Resolution: Fixed
Fix Version/s: ManifoldCF next

 Support bundling of LCF with an app
 ---

 Key: CONNECTORS-61
 URL: https://issues.apache.org/jira/browse/CONNECTORS-61
 Project: ManifoldCF
  Issue Type: Sub-task
  Components: Documentation, Framework crawler agent
Affects Versions: ManifoldCF 0.3
Reporter: Jack Krupansky
Assignee: Karl Wright
Priority: Minor
 Fix For: ManifoldCF next






[jira] [Resolved] (CONNECTORS-50) Proposal for initial two releases of LCF, including packaged product and full API

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-50.
---

   Resolution: Fixed
Fix Version/s: ManifoldCF 0.3
 Assignee: Karl Wright

Resolved, leaving scripting language request open as a separate ticket.


 Proposal for initial two releases of LCF, including packaged product and full 
 API
 -

 Key: CONNECTORS-50
 URL: https://issues.apache.org/jira/browse/CONNECTORS-50
 Project: ManifoldCF
  Issue Type: New Feature
Reporter: Jack Krupansky
Assignee: Karl Wright
 Fix For: ManifoldCF 0.3


 Currently, LCF has a relatively high bar for evaluation and use, requiring 
 developer expertise. Also, although LCF has a comprehensive UI, it is not 
 currently packaged for use as a crawling engine for advanced applications.
 A small set of individual feature requests are needed to address these 
 issues. They are summarized briefly to show how they fit together for two 
 initial releases of LCF, but will be broken out into individual LCF Jira 
 issues.
 Goals:
 1. LCF as a standalone, downloadable, usable-out-of-the-box product (much as 
 Solr is today)
 2. LCF as a toolkit for developers needing customized crawling and repository 
 access
 3. An API-based crawling engine that can be integrated with applications (as 
 Aperture is today)
 Larger goals:
 1. Make it very easy for users to evaluate LCF.
 2. Make it very easy for developers to customize LCF.
 3. Make it very easy for applications to fully manage and control LCF in 
 operation.
 Two phases:
 1) Standalone, packaged app that is super-easy to evaluate and deploy. Call 
 it LCF 0.5.
 2) API-based crawling engine for applications for which the UI might not be 
 appropriate. Call it LCF 1.0.
 Phase 1
 ---
 LCF 0.5 right out of the box would interface loosely with Solr 1.4 or later.
 It would contain roughly the features that are currently in place or 
 currently underway, plus a little more.
 Specifically, LCF 0.5 would contain these additional capabilities:
 1. Plug-in architecture for connectors (CONNECTORS-40 - DONE)
 2. Packaged app ready to run with embedded Jetty app server (CONNECTORS-59)
 3. Bundled with database - PostgreSQL or Derby - ready to run without 
 additional manual setup (CONNECTORS-55)
 4. Mini-API to initially configure default connections and example jobs for 
 file system and web crawl (CONNECTORS-58)
 5. Agent process started automatically (CONNECTORS-60)
 6. Solr output connector option to commit at end of job, by default 
 (CONNECTORS-57)
 Installation and basic evaluation of LCF would be essentially as simple as 
 Solr is today. The example connections and jobs would permit the user to 
 initiate example crawls of a file system example directory and an example 
 web on the LCF web site with just a couple of clicks (as opposed to the 
 detailed manual setup required today to create repository and output 
 connections and jobs).
 It is worth considering whether the SharePoint connector could also be 
 included as part of the default package.
 Users could then add additional connectors and repositories and jobs as 
 desired.
 Timeframe for release? Level of effort?
 Phase 2
 ---
 The essence of Phase 2 is that LCF would be split to allow direct, full API 
 access to LCF as a crawling engine, in addition to the full LCF UI. Call 
 this LCF 1.0.
 Specifically, LCF 1.0 would contain these additional capabilities:
 1. Full API for LCF as a crawling engine (CONNECTORS-56)
 2. LCF can be bundled within an app (CONNECTORS-61)
 3. LCF event and activity notification for full control by an application 
 (CONNECTORS-41)
 Overall, LCF will offer roughly the same crawling capabilities as with LCF 
 0.5, plus whatever bug fixes and minor enhancements might also be added.
 Timeframe for release? Level of effort?
 -
 Issues:
 - Can we package PostgreSQL with LCF so LCF can set it up?
   - Or do we need Derby for that purpose?
 - Managing multiple processes (UI, database, agent, app processes)
 - What exactly would the API look like? (URL, XML, JSON, YAML?)





[jira] [Commented] (CONNECTORS-50) Proposal for initial two releases of LCF, including packaged product and full API

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084131#comment-13084131
 ] 

Karl Wright commented on CONNECTORS-50:
---

I'm going to resolve this ticket, since the planning part is now meaningless, 
and the only thing that remains is a scripting language, for which there is a 
separate ticket.



 Proposal for initial two releases of LCF, including packaged product and full 
 API
 -

 Key: CONNECTORS-50
 URL: https://issues.apache.org/jira/browse/CONNECTORS-50
 Project: ManifoldCF
  Issue Type: New Feature
Reporter: Jack Krupansky
 Fix For: ManifoldCF 0.3






[jira] [Updated] (CONNECTORS-34) eRoom authority and connector

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-34?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-34:
--

Component/s: eRoom connector

 eRoom authority and connector
 -

 Key: CONNECTORS-34
 URL: https://issues.apache.org/jira/browse/CONNECTORS-34
 Project: ManifoldCF
  Issue Type: New Feature
  Components: eRoom connector
Reporter: Karl Wright

 eRoom has a SOAP API which looks like it has enough power to perhaps 
 implement a connector and an authority.  The eRoom API URL is here (and yes, 
 it is a Swiss URL, but is legit):
 https://eroom.abraxas.ch/eroomHelp/en/API_Help/Api.htm#home_api.html





[jira] [Commented] (CONNECTORS-54) A Filesystem output connector would be useful and would allow more complete unit tests

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-54?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084140#comment-13084140
 ] 

Karl Wright commented on CONNECTORS-54:
---

ManifoldCF in Action required a simple file-based repository to be written, and 
an output connector to that repository also.  It's not so simple because the 
metadata and acl information also needs to go into the file system for this to 
be useful.  I'm therefore going to close this ticket since I can't think of any 
realistic use for the proposed connector other than testing.


 A Filesystem output connector would be useful and would allow more complete 
 unit  tests
 ---

 Key: CONNECTORS-54
 URL: https://issues.apache.org/jira/browse/CONNECTORS-54
 Project: ManifoldCF
  Issue Type: Improvement
Reporter: Karl Wright
 Fix For: ManifoldCF 0.3


 Right now, the unit tests are limited because there is no way to check that 
 the indexed files actually do get indexed.  The addition of a filesystem 
 output connector would allow more complete tests to be constructed.  In 
 addition, such a connector might well be useful in its own right.
 The connector would need to convert URI's into relative file paths, but other 
 than that there's really nothing very tricky about it.  Configuration 
 information is minimal; just the root path of the output is all that's needed.
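
 The URI-to-path conversion described above could be sketched roughly as 
 follows. This is only an illustration under stated assumptions: the class 
 name UriPaths and the method toRelativePath are hypothetical and are not 
 part of ManifoldCF, the port and query string are ignored, and nothing here 
 stores metadata or ACL information.

```java
import java.net.URI;

// Hypothetical helper, not a ManifoldCF class: maps a document URI to a
// relative path that could be created beneath a configured output root.
final class UriPaths {
    private UriPaths() {}

    static String toRelativePath(String documentUri) {
        URI uri = URI.create(documentUri);
        StringBuilder sb = new StringBuilder();
        if (uri.getScheme() != null) {
            sb.append(uri.getScheme());
        }
        if (uri.getHost() != null) {
            if (sb.length() > 0) sb.append('/');
            sb.append(uri.getHost());
        }
        String path = uri.getPath(); // decoded path; null for opaque URIs
        if (path != null) {
            for (String segment : path.split("/")) {
                // Empty, "." and ".." segments are dropped (not resolved),
                // so the result can never climb out of the output root.
                if (segment.length() == 0 || segment.equals(".") || segment.equals("..")) {
                    continue;
                }
                if (sb.length() > 0) sb.append('/');
                // Replace anything outside a conservative file-name alphabet.
                sb.append(segment.replaceAll("[^A-Za-z0-9._-]", "_"));
            }
        }
        return sb.toString();
    }
}
```

 With this sketch, "http://example.com/a/b%20c/d.txt" would map to 
 "http/example.com/a/b_c/d.txt". Two different URIs can collide after 
 sanitizing, so a real connector would need a scheme for disambiguation.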





[jira] [Resolved] (CONNECTORS-54) A Filesystem output connector would be useful and would allow more complete unit tests

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-54?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-54.
---

   Resolution: Won't Fix
Fix Version/s: ManifoldCF 0.3

 A Filesystem output connector would be useful and would allow more complete 
 unit  tests
 ---

 Key: CONNECTORS-54
 URL: https://issues.apache.org/jira/browse/CONNECTORS-54
 Project: ManifoldCF
  Issue Type: Improvement
Reporter: Karl Wright
 Fix For: ManifoldCF 0.3






[jira] [Updated] (CONNECTORS-94) fix common localization traps

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-94:
--

Component/s: Framework core

 fix common localization traps
 -

 Key: CONNECTORS-94
 URL: https://issues.apache.org/jira/browse/CONNECTORS-94
 Project: ManifoldCF
  Issue Type: Task
  Components: Framework core
Reporter: Robert Muir

 Searching through the LCF code, I found several uses of the following that 
 appear to be potentially dangerous:
 * getBytes() with no encoding: this is dangerous because the platform 
 default encoding is used, which is completely unspecified. In most places 
 this should likely mean UTF-8.
 * getBytes("utf-8"): this is mostly a nitpick, but this alias is not 
 guaranteed to exist (see Charset docs). I suggest changing these all to 
 "UTF-8".
 * String.toLowerCase()/String.toUpperCase() with no specified Locale, where 
 it appears the text is not used solely for display, but instead for 'caseless 
 matching'. I suggest changing these to use either the root Locale, new 
 Locale(""), or even easier, Locale.ENGLISH. This way ACF does not have 
 surprising behavior on, say, a Turkish computer.
 I can contribute a patch to address these.
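
 The two suggestions above might look like this in code. It is a minimal 
 sketch, not the patch offered in this ticket: the class LocaleSafeText and 
 its method names are hypothetical, and only APIs available on Java 1.5 are 
 used.

```java
import java.io.UnsupportedEncodingException;
import java.util.Locale;

// Hypothetical helper illustrating the suggestions; not ManifoldCF code.
final class LocaleSafeText {
    private LocaleSafeText() {}

    /** Encodes with an explicitly named charset instead of the platform default. */
    static byte[] utf8Bytes(String s) {
        try {
            return s.getBytes("UTF-8"); // canonical charset name
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    /**
     * Lower-cases for caseless matching with a fixed Locale. Under a Turkish
     * default locale, "TITLE".toLowerCase() would yield "t\u0131tle" (dotless i)
     * and would no longer match "title".
     */
    static String caselessKey(String s) {
        return s.toLowerCase(Locale.ENGLISH);
    }
}
```

 Text that is only displayed to the user can still use the default locale; 
 the fixed Locale matters where the result is compared or used as a key.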





[jira] [Commented] (CONNECTORS-94) fix common localization traps

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084142#comment-13084142
 ] 

Karl Wright commented on CONNECTORS-94:
---

Has there been any update to this ticket?


 fix common localization traps
 -

 Key: CONNECTORS-94
 URL: https://issues.apache.org/jira/browse/CONNECTORS-94
 Project: ManifoldCF
  Issue Type: Task
  Components: Framework core
Reporter: Robert Muir





[jira] [Assigned] (CONNECTORS-94) fix common localization traps

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-94:
-

Assignee: Robert Muir

 fix common localization traps
 -

 Key: CONNECTORS-94
 URL: https://issues.apache.org/jira/browse/CONNECTORS-94
 Project: ManifoldCF
  Issue Type: Task
  Components: Framework core
Reporter: Robert Muir
Assignee: Robert Muir





[jira] [Resolved] (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-92.
---

   Resolution: Fixed
Fix Version/s: ManifoldCF 0.3

 Move from ant to maven or other build system with decent library management
 ---

 Key: CONNECTORS-92
 URL: https://issues.apache.org/jira/browse/CONNECTORS-92
 Project: ManifoldCF
  Issue Type: Wish
  Components: Build
Reporter: Jettro Coenradie
Assignee: Karl Wright
 Fix For: ManifoldCF 0.3

 Attachments: Screen shot 2010-08-23 at 16.31.07.png, 
 maven-poms-including-start-jar.patch, 
 maven-poms-problem-starting-jetty-and-derby.patch, maven-start-jar.patch, 
 move-to-maven-acf-framework.patch, patch-connectors.zip


 I am looking at the current project structure. If we want to make another 
 build tool available, I think we need to change the directory structure. I 
 have put a suggestion in an image; can you please have a look at it? If 
 we agree that this is a good way to go, then I will continue to work on a 
 patch. That might be a bit hard with all these changing directories, but 
 I'll do my best to at least get an idea of whether it would work.
 So I have three questions:
 - Do you want to move to Maven, or put Maven next to Ant?
 - Do you prefer another build mechanism (Ant with Ivy, Gradle, Maven 3)?
 - Do you have an idea of the number of scripts that would need to change if 
 we change the project structure?
 The image of a possible project layout (based on the Maven standards) 
 is attached to the issue.





[jira] [Commented] (CONNECTORS-92) Move from ant to maven or other build system with decent library management

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084146#comment-13084146
 ] 

Karl Wright commented on CONNECTORS-92:
---

We now have a maven build system, in addition to ant, that was contributed 
elsewhere, so I'm closing this ticket.


 Move from ant to maven or other build system with decent library management
 ---

 Key: CONNECTORS-92
 URL: https://issues.apache.org/jira/browse/CONNECTORS-92
 Project: ManifoldCF
  Issue Type: Wish
  Components: Build
Reporter: Jettro Coenradie
Assignee: Karl Wright
 Fix For: ManifoldCF 0.3

 Attachments: Screen shot 2010-08-23 at 16.31.07.png, 
 maven-poms-including-start-jar.patch, 
 maven-poms-problem-starting-jetty-and-derby.patch, maven-start-jar.patch, 
 move-to-maven-acf-framework.patch, patch-connectors.zip






[jira] [Commented] (CONNECTORS-100) DB lock timeout, and/or indefinite or excessive database activity

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084147#comment-13084147
 ] 

Karl Wright commented on CONNECTORS-100:


I haven't heard anything back from the Derby folks. I'm therefore going to 
leave this ticket open.  HSQLDB works better for the hopcount queries, although 
it does not work well for the report queries.  So I guess you can pick your 
poison at the moment.


 DB lock timeout, and/or indefinite or excessive database activity
 -

 Key: CONNECTORS-100
 URL: https://issues.apache.org/jira/browse/CONNECTORS-100
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework core
 Environment: Running unmodified dist/example from trunk/ using the 
 default configuration.
Reporter: Andrzej Bialecki 
Assignee: Karl Wright

 When a job is started and running (via crawler-ui) occasionally it's not 
 possible to display a list of running jobs. The problem persists even after 
 restarting ACF. The following exception is thrown in the console:
 {code}
 org.apache.acf.core.interfaces.ACFException: Database exception: Exception doing query: A lock could not be obtained within the time requested
   at org.apache.acf.core.database.Database.executeViaThread(Database.java:421)
   at org.apache.acf.core.database.Database.executeUncachedQuery(Database.java:465)
   at org.apache.acf.core.database.Database$QueryCacheExecutor.create(Database.java:1072)
   at org.apache.acf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
   at org.apache.acf.core.database.Database.executeQuery(Database.java:167)
   at org.apache.acf.core.database.DBInterfaceDerby.performQuery(DBInterfaceDerby.java:727)
   at org.apache.acf.crawler.jobs.JobManager.makeJobStatus(JobManager.java:5611)
   at org.apache.acf.crawler.jobs.JobManager.getAllStatus(JobManager.java:5549)
   at org.apache.jsp.showjobstatus_jsp._jspService(showjobstatus_jsp.java:316)
   at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:70)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:377)
   at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:313)
   at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:260)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:390)
   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:418)
   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:326)
   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
   at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:923)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:547)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
   at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
   at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
 Caused by: java.sql.SQLTransactionRollbackException: A lock could not be obtained within the time requested
   at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
   at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
   at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
   at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
   at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
   at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
   at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown Source)
   at org.apache.derby.impl.jdbc.EmbedStatement.execute(Unknown 

[jira] [Commented] (CONNECTORS-31) For the Solr LCF security filter plugin, establish a concept of session to improve performance

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084151#comment-13084151
 ] 

Karl Wright commented on CONNECTORS-31:
---

We fixed this another way - by using caching within individual authorities.  
This improvement is therefore likely unneeded, so I'm going to close this 
ticket for now.
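
The idea of caching inside an authority can be illustrated with a small 
time-limited cache of access tokens per user. This is a sketch only: 
TokenCache is a hypothetical class, not ManifoldCF's actual cache manager, 
and the clock value is passed in explicitly to keep the example easy to test.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-authority cache: user name -> access tokens, with expiry.
final class TokenCache {
    private static final class Entry {
        final String[] tokens;
        final long expiresAt;
        Entry(String[] tokens, long expiresAt) {
            this.tokens = tokens;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<String, Entry>();
    private final long ttlMillis;

    TokenCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Returns cached tokens, or null if absent or expired at time nowMillis. */
    synchronized String[] get(String user, long nowMillis) {
        Entry e = entries.get(user);
        if (e == null) {
            return null;
        }
        if (nowMillis >= e.expiresAt) {
            entries.remove(user); // expired: caller must ask the repository again
            return null;
        }
        return e.tokens;
    }

    /** Stores tokens fetched at time nowMillis; they expire ttlMillis later. */
    synchronized void put(String user, String[] tokens, long nowMillis) {
        entries.put(user, new Entry(tokens, nowMillis + ttlMillis));
    }
}
```

An authority would consult get() first and contact the repository only on 
a miss, which gives repeated searches by the same user a similar benefit to 
the proposed session without changing the search component.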


 For the Solr LCF security filter plugin, establish a concept of session to 
 improve performance
 --

 Key: CONNECTORS-31
 URL: https://issues.apache.org/jira/browse/CONNECTORS-31
 Project: ManifoldCF
  Issue Type: Improvement
  Components: Solr Security Filter
Reporter: Karl Wright
 Fix For: ManifoldCF 0.3


 Instead of only allowing an authenticated user name to be passed to the 
 LCFSecurityFilter SearchComponent, improve this to return a security token 
 and optionally receive the security token as well.  Then it will be possible 
 for it to make the access tokens sticky, reducing load on the authority 
 service in situations where multiple searches occur in each session.





[jira] [Resolved] (CONNECTORS-31) For the Solr LCF security filter plugin, establish a concept of session to improve performance

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-31?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-31.
---

   Resolution: Won't Fix
Fix Version/s: ManifoldCF 0.3

 For the Solr LCF security filter plugin, establish a concept of session to 
 improve performance
 --

 Key: CONNECTORS-31
 URL: https://issues.apache.org/jira/browse/CONNECTORS-31
 Project: ManifoldCF
  Issue Type: Improvement
  Components: Solr Security Filter
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 0.3






Re: File Metadata

2011-08-12 Thread Farzad Valad
Curious, since jcifs seems to be free, why is it not included as one of 
the base connectors and in the base build?


PS. Going through the steps of building the jcifs connector.

On 8/8/2011 11:53 AM, Karl Wright wrote:

PS. My next item is the file owner; so far I'm finding a lot of references
to performing JNI per file.  The whole goal is to be able to find a set of
crawled docs that were modified within a date range and belonged to person X.


If you can limit yourself to crawling files that are accessible from
within a Windows share, you can use the jCIFS connector to get this, I
think.  But filtering by owner would require a feature addition to the
connector, I believe.

Karl





Re: File Metadata

2011-08-12 Thread Karl Wright
jcifs is LGPL licensed, and that is not one of the license types
Apache permits for redistribution.

Karl


On Fri, Aug 12, 2011 at 10:26 AM, Farzad Valad ho...@farzad.net wrote:
 Curious, since jcifs seems to be free, why is it not included as one of the
 base connectors and in the base build?

 PS. Going through the steps of building the jcifs connector.

 On 8/8/2011 11:53 AM, Karl Wright wrote:

 PS. My next item is the file owner; so far I'm finding a lot of references
 to performing JNI per file.  The whole goal is to be able to find a set of
 crawled docs that were modified within a date range and belonged to person X.

 If you can limit yourself to crawling files that are accessible from
 within a Windows share, you can use the jCIFS connector to get this, I
 think.  But filtering by owner would require a feature addition to the
 connector, I believe.

 Karl





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084220#comment-13084220
 ] 

Karl Wright commented on CONNECTORS-224:


It still doesn't build on Java 1.5:

compile-connector:
[javac] C:\wip\mcf\CONNECTORS-224\connectors\opensearchserver\build.xml:10: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
[javac] Compiling 9 source files to C:\wip\mcf\CONNECTORS-224\connectors\opensearchserver\build\connector\classes
[javac] C:\wip\mcf\CONNECTORS-224\connectors\opensearchserver\connector\src\main\java\org\apache\manifoldcf\agents\output\opensearchserver\OpenSearchServerIndex.java:59: cannot find symbol
[javac] symbol  : constructor IOException(org.apache.manifoldcf.core.interfaces.ManifoldCFException)
[javac] location: class java.io.IOException
[javac] throw new IOException(e);
[javac]   ^
[javac] 1 error

This is easily corrected; I'll check in a fix.
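
For context: the IOException(Throwable) constructor was only added in Java 
6, which is why the line above fails under 1.5. One common way to keep the 
cause while staying 1.5-compatible is sketched below. The helper class 
IoExceptions is hypothetical, and this is not necessarily the fix that was 
checked in.

```java
import java.io.IOException;

// Hypothetical helper: builds an IOException with a cause using only
// constructors and methods that already exist in Java 1.5.
final class IoExceptions {
    private IoExceptions() {}

    static IOException wrap(Throwable cause) {
        IOException ioe = new IOException(cause.getMessage());
        ioe.initCause(cause); // Throwable.initCause is available since Java 1.4
        return ioe;
    }
}
```

The failing statement would then become "throw IoExceptions.wrap(e);", 
which preserves the original exception in the stack trace.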



 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-beta.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13084224#comment-13084224
 ] 

Karl Wright commented on CONNECTORS-224:


Another problem that occurs in this connector with no real connection is that 
notification never completes.  Tons of exceptions:

ERROR 2011-08-12 12:15:00,637 (Job notification thread) - java.net.ConnectException: Connection refused: connect
org.apache.manifoldcf.core.interfaces.ManifoldCFException: java.net.ConnectException: Connection refused: connect
    at org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnection.call(OpenSearchServerConnection.java:102)
    at org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerAction.<init>(OpenSearchServerAction.java:19)
    at org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnector.noteJobComplete(OpenSearchServerConnector.java:328)
    at org.apache.manifoldcf.crawler.system.JobNotificationThread.run(JobNotificationThread.java:115)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at java.net.Socket.connect(Socket.java:478)
    at java.net.Socket.<init>(Socket.java:375)
    at java.net.Socket.<init>(Socket.java:249)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(Unknown Source)
    at org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory.createSocket(Unknown Source)
    at org.apache.commons.httpclient.HttpConnection.open(Unknown Source)
    at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(Unknown Source)
    at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(Unknown Source)
    at org.apache.commons.httpclient.HttpClient.executeMethod(Unknown Source)
    at org.apache.commons.httpclient.HttpClient.executeMethod(Unknown Source)
    at org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnection.call(OpenSearchServerConnection.java:93)
    ... 3 more

This is actually a framework problem, I believe; the job is aborted but 
notification is attempted anyhow.  But it is supposed to give up, not retry 
indefinitely.

I'll create a new ticket for that issue.
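
The "give up rather than retry indefinitely" behavior described above can be sketched as a bounded retry loop. This is only an illustration under stated assumptions, not ManifoldCF's actual code: the class BoundedNotifier, the Notification interface, and the attempt limit are all hypothetical names introduced here.

```java
import java.net.ConnectException;

/** Hypothetical sketch: attempt a job-completion notification a bounded number of times. */
class BoundedNotifier {
  /** Hypothetical stand-in for an output connector's notification call. */
  interface Notification {
    void send() throws Exception;
  }

  /**
   * Tries the notification up to maxAttempts times.
   * Returns true on success, false once the attempts are exhausted.
   */
  static boolean notifyWithLimit(Notification notification, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        notification.send();
        return true;
      } catch (Exception e) {
        // Log and try again; a real implementation would likely wait between attempts.
        System.err.println("Notification attempt " + attempt + " failed: " + e.getMessage());
      }
    }
    return false; // give up instead of looping forever
  }

  public static void main(String[] args) {
    // Simulates a target that always refuses the connection.
    boolean delivered = notifyWithLimit(() -> {
      throw new ConnectException("Connection refused: connect");
    }, 3);
    System.out.println("delivered = " + delivered);
  }
}
```

With a target that never accepts the connection, the loop logs each failure and then returns false, so the calling thread can move on.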


 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-beta.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Created] (CONNECTORS-238) Exception on end notification is not handled properly

2011-08-12 Thread Karl Wright (JIRA)
Exception on end notification is not handled properly
-

 Key: CONNECTORS-238
 URL: https://issues.apache.org/jira/browse/CONNECTORS-238
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework agents process
Reporter: Karl Wright


When an exception occurs during end notification, handling should permit the 
job to stop.  Notification is a nicety, not a requirement, and the notification 
method is called even when the job is aborted.
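
A minimal sketch of the handling described above, treating end notification as a best-effort step so that the job always reaches a stopped state. The names EndNotificationSketch, EndNotifier, and JobState are hypothetical and are not taken from the ManifoldCF framework; only the method name noteJobComplete echoes the connector method seen in the stack trace.

```java
import java.io.IOException;

/** Hypothetical sketch: a failed end notification must not keep a job from stopping. */
class EndNotificationSketch {
  enum JobState { NOTIFYING, INACTIVE }

  /** Hypothetical stand-in for a connector's end-of-job notification hook. */
  interface EndNotifier {
    void noteJobComplete() throws Exception;
  }

  /** Runs the end notification as a best-effort step and always returns the final state. */
  static JobState finishJob(EndNotifier notifier) {
    JobState state = JobState.NOTIFYING;
    try {
      notifier.noteJobComplete();
    } catch (Exception e) {
      // Notification is a nicety: record the failure, but do not rethrow.
      System.err.println("End notification failed: " + e.getMessage());
    } finally {
      state = JobState.INACTIVE; // the job stops whether or not notification succeeded
    }
    return state;
  }

  public static void main(String[] args) {
    JobState result = finishJob(() -> {
      throw new IOException("Connection refused");
    });
    System.out.println(result);
  }
}
```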






Compiling the JCIFS connector

2011-08-12 Thread Farzad Valad
Somehow, when I'm building the core jars (pull-agent, agents, etc.), 
some of the classes are missing methods needed by the jcifs connector.  
I'm pretty stumped as to why the Java compiler would not include some of 
the methods.


Thought I'd just raise a flag in case you know what might be happening.  
For example, the class IFingerprintActivity is missing 
checkLengthIndexable and checkURLIndexable when I decompile the byte code.


My build process is to build the core MCF jars first, then build the jcifs 
connector.  Thanks!


[jira] [Resolved] (CONNECTORS-238) Exception on end notification is not handled properly

2011-08-12 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-238.


   Resolution: Fixed
Fix Version/s: ManifoldCF 0.3
 Assignee: Karl Wright

r1157178


 Exception on end notification is not handled properly
 -

 Key: CONNECTORS-238
 URL: https://issues.apache.org/jira/browse/CONNECTORS-238
 Project: ManifoldCF
  Issue Type: Bug
  Components: Framework agents process
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 0.3


 When an exception occurs during end notification, handling should permit the 
 job to stop.  Notification is a nicety, not a requirement, and the 
 notification method is called even when the job is aborted.





Re: Compiling the JCIFS connector

2011-08-12 Thread Karl Wright
It sounds like you have old core, agents, and pull-agent jars around.
It builds fine here with the ant build.

Karl

On Fri, Aug 12, 2011 at 12:25 PM, Farzad Valad ho...@farzad.net wrote:
 Somehow, when I'm building the core jars (pull-agent, agents, etc.), some
 of the classes are missing methods needed by the jcifs connector.  I'm
 pretty stumped as to why the Java compiler would not include some of the
 methods.

 Thought I'd just raise a flag in case you know what might be happening.  For
 example, the class IFingerprintActivity is missing checkLengthIndexable and
 checkURLIndexable when I decompile the byte code.

 My build process is to build the core MCF jars first, then build the jcifs
 connector.  Thanks!
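
One way to check the stale-jar explanation is to ask the JVM where a class was actually loaded from and whether it exposes the expected methods. The following is a small standalone sketch, not part of ManifoldCF; the class name ClasspathCheck and its helper methods are made up for illustration. It would be run with the same classpath as the failing build, passing the fully qualified name of IFingerprintActivity followed by checkLengthIndexable and checkURLIndexable.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.Arrays;

/** Sketch: report where a class was loaded from and whether it exposes given methods. */
class ClasspathCheck {
  /** Returns the jar or directory the class came from, or "(bootstrap)" if none is recorded. */
  static String locationOf(Class<?> type) {
    CodeSource source = type.getProtectionDomain().getCodeSource();
    return source == null ? "(bootstrap)" : String.valueOf(source.getLocation());
  }

  /** True if the class exposes a public method with the given name (any signature). */
  static boolean hasMethod(Class<?> type, String name) {
    return Arrays.stream(type.getMethods()).map(Method::getName).anyMatch(name::equals);
  }

  public static void main(String[] args) throws ClassNotFoundException {
    // Usage: java ClasspathCheck <fully.qualified.ClassName> <method> [<method> ...]
    if (args.length < 2) {
      System.err.println("usage: ClasspathCheck <class> <method>...");
      return;
    }
    Class<?> type = Class.forName(args[0]);
    System.out.println(type.getName() + " loaded from " + locationOf(type));
    for (int i = 1; i < args.length; i++) {
      System.out.println("  " + args[i] + ": " + (hasMethod(type, args[i]) ? "present" : "MISSING"));
    }
  }
}
```

If the reported location is an older jar than the one just built, the missing methods are explained by the classpath rather than by the compiler.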



[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-12 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084252#comment-13084252
 ] 

Karl Wright commented on CONNECTORS-224:


Ok, I've tested some more.

This is a lot better than before.  The issues previously described seem to have 
been fixed.  Now we have a new set of issues to consider.  I'm listing these 
below.

(1) In addOrReplaceDocument(), the following code is present:


OpenSearchServerConfig config = getConfigParameters(null);
Integer count = addInstance(config);
synchronized (count) {


Can you explain what the purpose of this synchronizer is?  It looks to me like 
you might be effectively managing your own connection pool here, which is both 
redundant and would prevent ManifoldCF end users from controlling the size of 
that pool.  Am I correct?

(2) In addOrReplaceDocument(), you close the input stream.  You should not do 
that.  The caller closes the stream.

(3) Formatting.  Apache guidelines set indenting to 2 spaces, with no tabs.  
ManifoldCF adheres to this convention.  It would be great if you could reformat 
accordingly.
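
To illustrate point (2), here is a minimal sketch of the ownership convention: the method that receives the stream only reads from it, and the code that opened the stream closes it. The class StreamOwnership and the method indexDocument are hypothetical names, not the connector's real API, and the byte counting merely stands in for sending the document to the search server.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: the callee reads the stream, the caller owns and closes it. */
class StreamOwnership {
  /** Consumes the document stream but deliberately does not close it. */
  static long indexDocument(InputStream document) throws IOException {
    byte[] buffer = new byte[8192];
    long total = 0;
    int read;
    while ((read = document.read(buffer)) != -1) {
      total += read; // a real connector would send these bytes to the search server
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    // The caller opens the stream, so the caller closes it (here via try-with-resources).
    try (InputStream in = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8))) {
      System.out.println(indexDocument(in) + " bytes read");
    }
  }
}
```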

Thanks again!


