[jira] Created: (CONNECTORS-46) JDBC connector changes for metadata need to be documented

2010-06-17 Thread Karl Wright (JIRA)
JDBC connector changes for metadata need to be documented
-

 Key: CONNECTORS-46
 URL: https://issues.apache.org/jira/browse/CONNECTORS-46
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Documentation
Reporter: Karl Wright


The JDBC connector now supports metadata.  This should be documented.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (CONNECTORS-47) Framework UI seems to call connector post processing more than needed

2010-06-17 Thread Karl Wright (JIRA)
Framework UI seems to call connector post processing more than needed
-

 Key: CONNECTORS-47
 URL: https://issues.apache.org/jira/browse/CONNECTORS-47
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Framework crawler agent
Reporter: Karl Wright
Priority: Minor


Connector form post processing is currently invoked both in execute.jsp (which 
is the target of all form posts), as well as in individual edit pages (such as 
editconfig.jsp and editjob.jsp).  Unless a reason can be found for why this is 
done, the individual edit page calls should be removed, since they are by 
definition superfluous.

Possible reasons it was done this way were:

(a) that code predates execute.jsp
(b) some other functionality, e.g. copy or posting of certificates, needs it

At any rate, this should be looked at after the bulk of CONNECTORS-40 related 
changes are committed to trunk.





[jira] Created: (CONNECTORS-48) SharePoint rules description is incomplete

2010-06-18 Thread Karl Wright (JIRA)
SharePoint rules description is incomplete
--

 Key: CONNECTORS-48
 URL: https://issues.apache.org/jira/browse/CONNECTORS-48
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright


The description of how SharePoint inclusion and exclusion rules work is 
inadequate for an end user to be able to use the connector effectively.  
Specifically, it does not explain how the connector matches a rule.





[jira] Resolved: (CONNECTORS-48) SharePoint rules description is incomplete

2010-06-18 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-48?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-48.
---

  Assignee: Karl Wright
Resolution: Fixed

Added a section on rule matching and implied rules - hope this helps.


> SharePoint rules description is incomplete
> --
>
> Key: CONNECTORS-48
> URL: https://issues.apache.org/jira/browse/CONNECTORS-48
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>    Assignee: Karl Wright
>
> The description of how SharePoint inclusion and exclusion rules work is 
> inadequate for an end user to be able to use the connector effectively.  
> Specifically, it does not explain how the connector matches a rule.




[jira] Created: (CONNECTORS-49) Solr connector metadata and id field can collide, causing multiple id fields to be passed in

2010-06-22 Thread Karl Wright (JIRA)
Solr connector metadata and id field can collide, causing multiple id fields to 
be passed in


 Key: CONNECTORS-49
 URL: https://issues.apache.org/jira/browse/CONNECTORS-49
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Lucene/SOLR connector
Reporter: Karl Wright


If a document has a metadata field called "id", or "ID", or "Id", or any such 
thing, the Solr connector will blithely send both the document id and the 
metadata id along to Solr, which will then crap out with an error.  The 
solution is to map the metadata "id" field to something else, which should be 
determined by the solr connection definition.
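
The remapping described above could look roughly like the following. This is a minimal sketch, not the connector's code: it assumes metadata arrives as a simple name-to-value map, and the class name, method name, and the replacement name "metadata_id" are all hypothetical. In practice the replacement name would come from the Solr connection definition.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; none of these names are LCF classes.
final class IdFieldRemapper {
    private IdFieldRemapper() {}

    /**
     * Returns a copy of the metadata in which any field whose name matches
     * the Solr id field (ignoring case) is renamed to replacementName.
     */
    static Map<String, String> remap(Map<String, String> metadata,
                                     String idFieldName,
                                     String replacementName) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        for (Map.Entry<String, String> e : metadata.entrySet()) {
            String name = e.getKey();
            if (name.equalsIgnoreCase(idFieldName)) {
                // "id", "ID", "Id" and so on would all collide with the document id.
                name = replacementName;
            }
            result.put(name, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<String, String>();
        metadata.put("ID", "row-17");
        metadata.put("title", "Quarterly report");
        System.out.println(remap(metadata, "id", "metadata_id"));
    }
}
```

If a document carries more than one colliding name (say both "id" and "Id"), the later one overwrites the earlier one in this sketch; a real implementation would need to decide how to handle that.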





[jira] Assigned: (CONNECTORS-49) Solr connector metadata and id field can collide, causing multiple id fields to be passed in

2010-06-22 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-49:
-

Assignee: Karl Wright

> Solr connector metadata and id field can collide, causing multiple id fields 
> to be passed in
> 
>
> Key: CONNECTORS-49
> URL: https://issues.apache.org/jira/browse/CONNECTORS-49
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Lucene/SOLR connector
>    Reporter: Karl Wright
>Assignee: Karl Wright
>
> If a document has a metadata field called "id", or "ID", or "Id", or any such 
> thing, the Solr connector will blithely send both the document id and the 
> metadata id along to Solr, which will then crap out with an error.  The 
> solution is to map the metadata "id" field to something else, which should be 
> determined by the solr connection definition.




[jira] Commented: (CONNECTORS-49) Solr connector metadata and id field can collide, causing multiple id fields to be passed in

2010-06-23 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881604#action_12881604
 ] 

Karl Wright commented on CONNECTORS-49:
---

As per discussions in connectors-user, it's probably important to also provide 
a declaration of the name of the solr "id" field in the configuration, with a 
default value of "id".  Longer term, maybe Solr can learn to accept a generic 
notion of primary key, but that's as yet undecided.
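
A small sketch of what "a declaration with a default value of id" might mean in code. It assumes the connection parameters are available as a string map; the parameter key "idfield" and the class name are hypothetical, not documented LCF settings.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the "idfield" key is an assumption for illustration.
final class SolrIdFieldConfig {
    static final String DEFAULT_ID_FIELD = "id";

    private SolrIdFieldConfig() {}

    /** Returns the configured id field name, or "id" when none is set. */
    static String idFieldName(Map<String, String> connectionParams) {
        String configured = connectionParams.get("idfield");
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_ID_FIELD;
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<String, String>();
        System.out.println(idFieldName(params));
        params.put("idfield", "doc_key");
        System.out.println(idFieldName(params));
    }
}
```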



> Solr connector metadata and id field can collide, causing multiple id fields 
> to be passed in
> 
>
> Key: CONNECTORS-49
> URL: https://issues.apache.org/jira/browse/CONNECTORS-49
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Lucene/SOLR connector
>Reporter: Karl Wright
>Assignee: Karl Wright
>
> If a document has a metadata field called "id", or "ID", or "Id", or any such 
> thing, the Solr connector will blithely send both the document id and the 
> metadata id along to Solr, which will then crap out with an error.  The 
> solution is to map the metadata "id" field to something else, which should be 
> determined by the solr connection definition.




[jira] Commented: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-06-29 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883595#action_12883595
 ] 

Karl Wright commented on CONNECTORS-40:
---

The UI changes have been made, largely hand-tested, and merged into trunk.  
Next steps for this ticket include:

- Updating the wiki page on how to build a connector
- Writing the classloader implementation that will actually allow for plugin 
loading
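
For readers unfamiliar with the approach, a directory-based plugin classloader can be sketched with the standard java.net.URLClassLoader. This is only an illustration of the general technique, not the LCF implementation; the class name and the default directory name "connector-lib" are hypothetical.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical factory; not an LCF class.
final class PluginClassLoaderFactory {
    private PluginClassLoaderFactory() {}

    /** Collects the URLs of all .jar files directly inside pluginDir. */
    static URL[] jarUrls(File pluginDir) throws MalformedURLException {
        List<URL> urls = new ArrayList<URL>();
        File[] files = pluginDir.listFiles(); // null if the directory does not exist
        if (files != null) {
            for (File f : files) {
                if (f.isFile() && f.getName().endsWith(".jar")) {
                    urls.add(f.toURI().toURL());
                }
            }
        }
        return urls.toArray(new URL[0]);
    }

    /** Builds a classloader that looks in the plugin jars after the parent. */
    static URLClassLoader create(File pluginDir, ClassLoader parent)
            throws MalformedURLException {
        return new URLClassLoader(jarUrls(pluginDir), parent);
    }

    public static void main(String[] args) throws Exception {
        File dir = new File(args.length > 0 ? args[0] : "connector-lib");
        URLClassLoader loader =
            create(dir, PluginClassLoaderFactory.class.getClassLoader());
        System.out.println("Plugin jars found: " + loader.getURLs().length);
        // A connector class could then be loaded by name, for example:
        // Class<?> c = Class.forName("com.example.MyConnector", true, loader);
    }
}
```

Because URLClassLoader delegates to its parent first, framework classes stay shared while connector classes are resolved from whatever jars are dropped into the directory.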


> Classloader-based plug-in architecture would permit LCF to be prebuilt
> --
>
> Key: CONNECTORS-40
> URL: https://issues.apache.org/jira/browse/CONNECTORS-40
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>    Reporter: Karl Wright
>
> The LCF architecture at this point requires interaction with the build script 
> in order to add connectors.  This is because the connector JSPs and jars need 
> to be added to the appropriate war files.  However, there is another 
> architectural option that would eliminate this need, which is to use a custom 
> classloader to pull components from jars that are placed in a specific 
> directory or directories.
> In order for this to work, however, the UI components of every connector must 
> become part of a jar.  That implies that they will need to cease being JSPs, 
> and become instead methods of each connector class.  (There is no 
> proscription against using something like Velocity for assembling the 
> necessary output for a connector, however.)  Limiting the 
> backwards-compatibility impact of this change will be difficult, especially 
> after a first release is made, so it seems clear that any change along these 
> lines should be attempted before version 1.0 is released.




[jira] Resolved: (CONNECTORS-45) Solr connector gives no way to specify the solr core name

2010-06-29 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-45.
---

  Assignee: Karl Wright
Resolution: Fixed

Fixed, r959078.


> Solr connector gives no way to specify the solr core name
> -
>
> Key: CONNECTORS-45
> URL: https://issues.apache.org/jira/browse/CONNECTORS-45
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Lucene/SOLR connector
>    Reporter: Karl Wright
>    Assignee: Karl Wright
>
> The Solr Connector allows you to specify everything about the Solr connection 
> except the Solr Core name.  A new configuration field should be added, which 
> is optional and defaults to blank, to allow this field to be set.




[jira] Resolved: (CONNECTORS-49) Solr connector metadata and id field can collide, causing multiple id fields to be passed in

2010-06-29 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-49?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-49.
---

Resolution: Fixed

r959167.  Tested, except in the context of an actual crawl.


> Solr connector metadata and id field can collide, causing multiple id fields 
> to be passed in
> 
>
> Key: CONNECTORS-49
> URL: https://issues.apache.org/jira/browse/CONNECTORS-49
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Lucene/SOLR connector
>    Reporter: Karl Wright
>Assignee: Karl Wright
>
> If a document has a metadata field called "id", or "ID", or "Id", or any such 
> thing, the Solr connector will blithely send both the document id and the 
> metadata id along to Solr, which will then crap out with an error.  The 
> solution is to map the metadata "id" field to something else, which should be 
> determined by the solr connection definition.




[jira] Commented: (CONNECTORS-50) Proposal for initial two releases of LCF, including packaged product and full API

2010-06-30 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883923#action_12883923
 ] 

Karl Wright commented on CONNECTORS-50:
---

I don't think much of "umbrella tickets".  Each ticket should describe a 
reasonably isolated feature or fix, not a wish list.  Can you break this up 
into more specific work items, being careful to check first whether there are 
existing tickets covering the feature/service you are looking for?

I'm also still looking for much greater specificity as to the use cases.  One 
cannot design useful features without use cases.  For example, the word "API" 
is so unspecific as to be essentially meaningless.  If you describe in detail 
what your hoped-for interaction with this hypothetical "API" is, that would go 
a long way towards clarifying the need.  I'm not just interested in the API 
format; I'm interested in how you intend to interact with it.  This is crucial 
because, as I've pointed out in various posts, one key design goal of LCF is to 
make the connector developer provide the UI for their connector, and your 
proposal may well force a violation of that principle, unless you have 
something clever up your sleeve.

There are still a number of points in your document we have discussed in the 
past which remain but whose controversy goes unacknowledged.  It would be good, 
if you create tickets or add to tickets already created, to mention the 
associated issues and why you think they are unimportant or immaterial.  For 
example, I've discussed the limitations of using Derby as the prime database 
for LCF - that should be captured somewhere.



> Proposal for initial two releases of LCF, including packaged product and full 
> API
> -
>
> Key: CONNECTORS-50
> URL: https://issues.apache.org/jira/browse/CONNECTORS-50
> Project: Lucene Connector Framework
>  Issue Type: New Feature
>  Components: Framework core
>Reporter: Jack Krupansky
>   Original Estimate: 3360h
>  Remaining Estimate: 3360h
>
> Currently, LCF has a relatively high bar for evaluation and use, requiring 
> developer expertise. Also, although LCF has a comprehensive UI, it is not 
> currently packaged for use as a crawling engine for advanced applications.
> A small set of individual feature requests are needed to address these 
> issues. They are summarized briefly to show how they fit together for two 
> initial releases of LCF, but will be broken out into individual LCF Jira 
> issues.
> Goals:
> 1. LCF as a standalone, downloadable, usable-out-of-the-box product (much as 
> Solr is today)
> 2. LCF as a toolkit for developers needing customized crawling and repository 
> access
> 3. An API-based crawling engine that can be integrated with applications (as 
> Aperture is today)
> Larger goals:
> 1. Make it very easy for users to evaluate LCF.
> 2. Make it very easy for developers to customize LCF.
> 3. Make it very easy for applications to fully manage and control LCF in 
> operation.
> Two phases:
> 1) Standalone, packaged app that is super-easy to evaluate and deploy. Call 
> it LCF 0.5.
> 2) API-based crawling engine for applications for which the UI might not be 
> appropriate. Call it LCF 1.0.
> Phase 1
> ---
> LCF 0.5 right out of the box would interface loosely with Solr 1.4 or later.
> It would contain roughly the features that are currently in place or 
> currently underway, plus a little more.
> Specifically, LCF 0.5 would contain these additional capabilities:
> 1. Plug-in architecture for connectors (already underway)
> 2. Packaged app ready to run with embedded Jetty app server (I think this has 
> been agreed to)
> 3. Bundled with database - PostgreSQL or derby - ready to run without 
> additional manual setup
> 4. Mini-API to initially configure default connections and "example" jobs for 
> file system and web crawl
> 5. Agent process started automatically (platform-specific startup required)
> 6. Solr output connector option to commit at end of job, by default
> Installation and basic evaluation of LCF would be essentially as simple as 
> Solr is today. The example
> connections and jobs would permit the user to initiate example crawls of a 
> file system example
> directory and an example web on the LCF web site with just a couple of clicks 
> (as opposed to the
> detailed manual setup required today to create repository and output 
> connections and jobs).
> It is worth considering whether the SharePoint connector could also be 
> included as part of the defa

[jira] Resolved: (CONNECTORS-47) Framework UI seems to call connector post processing more than needed

2010-06-30 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-47.
---

  Assignee: Karl Wright
Resolution: Fixed

r959393.  Refactor as needed to solidify the contract between edit pages and 
the execute.jsp post page.


> Framework UI seems to call connector post processing more than needed
> -
>
> Key: CONNECTORS-47
> URL: https://issues.apache.org/jira/browse/CONNECTORS-47
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Framework crawler agent
>    Reporter: Karl Wright
>    Assignee: Karl Wright
>Priority: Minor
>
> Connector form post processing is currently invoked both in execute.jsp 
> (which is the target of all form posts), as well as in individual edit pages 
> (such as editconfig.jsp and editjob.jsp).  Unless a reason can be found for 
> why this is done, the individual edit page calls should be removed, since 
> they are by definition superfluous.
> Possible reasons it was done this way were:
> (a) that code predates execute.jsp
> (b) some other functionality, e.g. copy or posting of certificates, needs it
> At any rate, this should be looked at after the bulk of CONNECTORS-40 related 
> changes are committed to trunk.




[jira] Created: (CONNECTORS-51) Reduce the number of required -D defines by using System.setProperty() in the appropriate places

2010-06-30 Thread Karl Wright (JIRA)
Reduce the number of required -D defines by using System.setProperty() in the 
appropriate places


 Key: CONNECTORS-51
 URL: https://issues.apache.org/jira/browse/CONNECTORS-51
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: JCIFS connector
Reporter: Karl Wright
Priority: Minor


The JCIFS connector requires a fair number of -D switches in the java startup 
in order to do the right things.  This is largely because jcifs.jar is 
constructed this way.  It may be possible, however, to eliminate these -D's by 
judicious static use of System.setProperty() within the appropriate connector 
class, provided we presume that jcifs classes will never be loaded prior to the 
jcifs connector classes being loaded.
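
The idea can be sketched with a static initializer, as below. This is an illustration under assumptions, not the change that was committed: the class name is hypothetical, and the two property names and their values are examples chosen for the sketch rather than the connector's actual list of -D switches.

```java
// Hypothetical holder class; property names and values are illustrative only.
final class JcifsPropertyDefaults {
    static {
        // Runs once, when this class is first initialized. For the approach to
        // work, that must happen before any jcifs class reads its configuration.
        setIfAbsent("jcifs.smb.client.soTimeout", "150000");
        setIfAbsent("jcifs.smb.client.responseTimeout", "120000");
    }

    private JcifsPropertyDefaults() {}

    /** Sets a system property only if no value (for example from -D) exists. */
    static void setIfAbsent(String key, String value) {
        if (System.getProperty(key) == null) {
            System.setProperty(key, value);
        }
    }

    /** Calling this from the connector class forces the static block to run. */
    static void ensureInitialized() {
        // Intentionally empty: class initialization does the work.
    }

    public static void main(String[] args) {
        ensureInitialized();
        System.out.println(System.getProperty("jcifs.smb.client.soTimeout"));
    }
}
```

Setting a property only when it is absent keeps an explicit -D on the command line authoritative, so existing startup scripts would continue to behave the same way.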





[jira] Assigned: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-06-30 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-40:
-

Assignee: Karl Wright

> Classloader-based plug-in architecture would permit LCF to be prebuilt
> --
>
> Key: CONNECTORS-40
> URL: https://issues.apache.org/jira/browse/CONNECTORS-40
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>    Reporter: Karl Wright
>    Assignee: Karl Wright
>
> The LCF architecture at this point requires interaction with the build script 
> in order to add connectors.  This is because the connector JSPs and jars need 
> to be added to the appropriate war files.  However, there is another 
> architectural option that would eliminate this need, which is to use a custom 
> classloader to pull components from jars that are placed in a specific 
> directory or directories.
> In order for this to work, however, the UI components of every connector must 
> become part of a jar.  That implies that they will need to cease being JSPs, 
> and become instead methods of each connector class.  (There is no 
> proscription against using something like Velocity for assembling the 
> necessary output for a connector, however.)  Limiting the 
> backwards-compatibility impact of this change will be difficult, especially 
> after a first release is made, so it seems clear that any change along these 
> lines should be attempted before version 1.0 is released.




[jira] Created: (CONNECTORS-52) Update documentation to reflect changes to Solr Connector

2010-07-01 Thread Karl Wright (JIRA)
Update documentation to reflect changes to Solr Connector
-

 Key: CONNECTORS-52
 URL: https://issues.apache.org/jira/browse/CONNECTORS-52
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Documentation
Reporter: Karl Wright


The Solr Connector has sprouted various new tabs and features lately.  The 
end-user documentation for it should be revamped to match the software.





[jira] Resolved: (CONNECTORS-37) LCF should use an XML configuration file, not the simple name/value config file it currently has

2010-07-01 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-37?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-37.
---

  Assignee: Karl Wright
Resolution: Fixed

r959660.  properties.xml is now the default, rather than properties.ini.  The 
basic format of the file is:


 
 ...
 
 ...


> LCF should use an XML configuration file, not the simple name/value config 
> file it currently has
> 
>
> Key: CONNECTORS-37
> URL: https://issues.apache.org/jira/browse/CONNECTORS-37
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>    Reporter: Karl Wright
>Assignee: Karl Wright
>
> LCF's configuration file is limited in what it can specify, and XML 
> configuration files seem to offer more flexibility and are the modern norm.  
> Before backwards compatibility becomes an issue, it may therefore be worth 
> converting the property file reader to use XML rather than name/value format. 
>  It would also be nice to be able to fold the logging configuration into the 
> same file, if this seems possible.




[jira] Commented: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-07-01 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884285#action_12884285
 ] 

Karl Wright commented on CONNECTORS-40:
---

Classloader has been added, and the configuration file format is now XML.  The 
wiki connector description pages have been updated.  Next:

 - Change the build process and connector delivery model to take advantage of 
the classloader
 - Change the build process wiki document to reflect all changes


> Classloader-based plug-in architecture would permit LCF to be prebuilt
> --
>
> Key: CONNECTORS-40
> URL: https://issues.apache.org/jira/browse/CONNECTORS-40
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>    Reporter: Karl Wright
>Assignee: Karl Wright
>
> The LCF architecture at this point requires interaction with the build script 
> in order to add connectors.  This is because the connector JSPs and jars need 
> to be added to the appropriate war files.  However, there is another 
> architectural option that would eliminate this need, which is to use a custom 
> classloader to pull components from jars that are placed in a specific 
> directory or directories.
> In order for this to work, however, the UI components of every connector must 
> become part of a jar.  That implies that they will need to cease being JSPs, 
> and become instead methods of each connector class.  (There is no 
> proscription against using something like Velocity for assembling the 
> necessary output for a connector, however.)  Limiting the 
> backwards-compatibility impact of this change will be difficult, especially 
> after a first release is made, so it seems clear that any change along these 
> lines should be attempted before version 1.0 is released.




[jira] Resolved: (CONNECTORS-40) Classloader-based plug-in architecture would permit LCF to be prebuilt

2010-07-01 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-40.
---

Resolution: Fixed

All code committed.  Related tickets (such as removing the need for 
connector-specific -D switches) still in progress.


> Classloader-based plug-in architecture would permit LCF to be prebuilt
> --
>
> Key: CONNECTORS-40
> URL: https://issues.apache.org/jira/browse/CONNECTORS-40
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>    Reporter: Karl Wright
>    Assignee: Karl Wright
>
> The LCF architecture at this point requires interaction with the build script 
> in order to add connectors.  This is because the connector JSPs and jars need 
> to be added to the appropriate war files.  However, there is another 
> architectural option that would eliminate this need, which is to use a custom 
> classloader to pull components from jars that are placed in a specific 
> directory or directories.
> In order for this to work, however, the UI components of every connector must 
> become part of a jar.  That implies that they will need to cease being JSPs, 
> and become instead methods of each connector class.  (There is no 
> proscription against using something like Velocity for assembling the 
> necessary output for a connector, however.)  Limiting the 
> backwards-compatibility impact of this change will be difficult, especially 
> after a first release is made, so it seems clear that any change along these 
> lines should be attempted before version 1.0 is released.




[jira] Resolved: (CONNECTORS-51) Reduce the number of required -D defines by using System.setProperty() in the appropriate places

2010-07-01 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-51?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-51.
---

  Assignee: Karl Wright
Resolution: Fixed

r959748.



> Reduce the number of required -D defines by using System.setProperty() in the 
> appropriate places
> 
>
> Key: CONNECTORS-51
> URL: https://issues.apache.org/jira/browse/CONNECTORS-51
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: JCIFS connector
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Minor
>
> The JCIFS connector requires a fair number of -D switches in the java startup 
> in order to do the right things.  This is largely because jcifs.jar is 
> constructed this way.  It may be possible, however, to eliminate these -D's 
> by judicious static use of System.setProperty() within the appropriate 
> connector class, provided we presume that jcifs classes will never be loaded 
> prior to the jcifs connector classes being loaded.




[jira] Commented: (CONNECTORS-38) There should be an LCF startup path that uses Jetty for running lcf-crawler-ui and lcf-authority-service

2010-07-02 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884651#action_12884651
 ] 

Karl Wright commented on CONNECTORS-38:
---

I've started to look at what would be necessary to perform this work.  If the 
"quick-start" implementation will be using embedded derby, then it must run in 
a single process (or derby is not happy at all).  That would include the 
crawler ui, the authority service, and the crawler daemon.

If jetty can be configured to run in such a way as to use "system" classes for 
all of its web applications, then in theory it should be possible to put 
together an LCF which, on startup, spawns the crawler daemon before starting up 
jetty within the same process.  For the classloader issue, there seems to be a 
considerable degree of configuration flexibility, as described here:

http://docs.codehaus.org/display/JETTY/Classloading

The rest of the problem, i.e. starting and stopping jetty programmatically, may 
be doable based on this page:

http://docs.codehaus.org/display/JETTY/Embedding+Jetty

However, (1) it's really not clear what model I should be using.  I basically 
need to be able to fire up two entire web applications, which don't need to be 
in wars necessarily, but which certainly need to contain JSPs, .css files, 
.jpg's, tld's, and other standard webish content.  And (2), it's not clear 
if/how you properly perform Jetty shutdown using the chosen model.  Any advice 
welcome.
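
To make the single-process model concrete, here is a rough sketch of the startup and shutdown ordering only. It deliberately avoids any Jetty calls: the WebServer interface is a hypothetical stand-in for wherever an embedded server would be plugged in, and the class name is likewise made up for illustration.

```java
// Hypothetical launcher; WebServer is a stand-in, not a Jetty type.
final class SingleProcessLauncher {
    interface WebServer {
        void start() throws Exception;
        void stop() throws Exception;
    }

    private final Thread daemon;
    private final WebServer server;

    SingleProcessLauncher(Runnable crawlerDaemon, WebServer server) {
        this.daemon = new Thread(crawlerDaemon, "crawler-daemon");
        this.daemon.setDaemon(true);
        this.server = server;
    }

    /** Spawns the crawler daemon first, then starts the web applications. */
    void start() throws Exception {
        daemon.start();
        server.start();
    }

    /** Stops the web applications, then interrupts and waits for the daemon. */
    void stop() throws Exception {
        server.stop();
        daemon.interrupt();
        daemon.join(5000);
    }

    public static void main(String[] args) throws Exception {
        Runnable crawler = new Runnable() {
            public void run() {
                try {
                    while (true) {
                        Thread.sleep(1000); // placeholder for crawl work
                    }
                } catch (InterruptedException e) {
                    // interrupted by stop(): leave the loop and exit
                }
            }
        };
        WebServer server = new WebServer() {
            public void start() { System.out.println("web applications started"); }
            public void stop() { System.out.println("web applications stopped"); }
        };
        final SingleProcessLauncher launcher =
            new SingleProcessLauncher(crawler, server);
        // One possible answer to the shutdown question: a JVM shutdown hook.
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                try {
                    launcher.stop();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }));
        launcher.start();
    }
}
```

Whether a shutdown hook is the right mechanism for a real embedded server is exactly the open question raised above; the sketch only shows that everything can live in one JVM.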



> There should be an LCF startup path that uses Jetty for running 
> lcf-crawler-ui and lcf-authority-service
> 
>
> Key: CONNECTORS-38
> URL: https://issues.apache.org/jira/browse/CONNECTORS-38
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Karl Wright
>
> Integrating with Jetty would allow LCF to be deployed in simple cases without 
> requiring Tomcat, which would simplify the setup in such cases.  This of 
> course should not be construed as removing the support for Tomcat-style web 
> applications.




[jira] Commented: (CONNECTORS-38) There should be an LCF startup path that uses Jetty for running lcf-crawler-ui and lcf-authority-service

2010-07-06 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-38?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885746#action_12885746
 ] 

Karl Wright commented on CONNECTORS-38:
---

Code complete.  There's now a dist/example directory, and you run lcf with the 
command  -jar start.jar from that directory, just like Solr.

Documentation needs updating, but otherwise this ticket is complete.


> There should be an LCF startup path that uses Jetty for running 
> lcf-crawler-ui and lcf-authority-service
> 
>
> Key: CONNECTORS-38
> URL: https://issues.apache.org/jira/browse/CONNECTORS-38
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Karl Wright
>
> Integrating with Jetty would allow LCF to be deployed in simple cases without 
> requiring Tomcat, which would simplify the setup in such cases.  This of 
> course should not be construed as removing the support for Tomcat-style web 
> applications.




[jira] Resolved: (CONNECTORS-38) There should be an LCF startup path that uses Jetty for running lcf-crawler-ui and lcf-authority-service

2010-07-07 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-38?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-38.
---

  Assignee: Karl Wright
Resolution: Fixed

Documentation now also mentions the quick-start jetty stuff, so this ticket is 
done, I think.





[jira] Resolved: (CONNECTORS-46) JDBC connector changes for metadata need to be documented

2010-07-07 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-46?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-46.
---

  Assignee: Karl Wright
Resolution: Fixed

A paragraph added describing how metadata works with JDBC connector.


> JDBC connector changes for metadata need to be documented
> -
>
> Key: CONNECTORS-46
> URL: https://issues.apache.org/jira/browse/CONNECTORS-46
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Documentation
>    Reporter: Karl Wright
>    Assignee: Karl Wright
>
> The JDBC connector now supports metadata.  This should be documented.




[jira] Resolved: (CONNECTORS-52) Update documentation to reflect changes to Solr Connector

2010-07-08 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-52?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-52.
---

  Assignee: Karl Wright
Resolution: Fixed

Documentation and screen shots have now been updated.


> Update documentation to reflect changes to Solr Connector
> -
>
> Key: CONNECTORS-52
> URL: https://issues.apache.org/jira/browse/CONNECTORS-52
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Documentation
>    Reporter: Karl Wright
>    Assignee: Karl Wright
>
> The Solr Connector has sprouted various new tabs and features lately.  The 
> end-user documentation for it should be revamped to match the software.




[jira] Created: (CONNECTORS-54) A Filesystem output connector would be useful and would allow more complete unit tests

2010-07-08 Thread Karl Wright (JIRA)
A Filesystem output connector would be useful and would allow more complete unit tests
---

 Key: CONNECTORS-54
 URL: https://issues.apache.org/jira/browse/CONNECTORS-54
 Project: Lucene Connector Framework
  Issue Type: Improvement
Reporter: Karl Wright


Right now, the unit tests are limited because there is no way to check that the 
"indexed" files actually do get indexed.  The addition of a filesystem output 
connector would allow more complete tests to be constructed.  In addition, such 
a connector might well be useful in its own right.

The connector would need to convert URIs into relative file paths, but other 
than that there's really nothing very tricky about it.  Configuration 
information is minimal; just the root path of the output is all that's needed.
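
A minimal sketch of the URI-to-relative-path conversion described above, written in Python purely for illustration. The function name uri_to_relative_path and the layout (scheme, then host, then path segments) are assumptions; the ticket does not specify a mapping.

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit


def uri_to_relative_path(uri):
    """Map a document URI to a relative path under an output root.

    Hypothetical layout: <scheme>/<host_port>/<path segments>.
    The query string and fragment are ignored in this sketch.
    """
    parts = urlsplit(uri)
    # A ':' in host:port is awkward in file names on some platforms.
    segments = [parts.scheme, parts.netloc.replace(":", "_")]
    # Decode first, so an encoded "../" cannot slip past the check below.
    for segment in unquote(parts.path).split("/"):
        if segment in ("", ".", ".."):
            continue  # never allow the result to climb out of the root
        segments.append(segment)
    return PurePosixPath(*[s for s in segments if s])
```

The single configuration value mentioned above, the root path, would then simply be joined in front of the result, for example PurePosixPath("/var/lcf-output") / uri_to_relative_path(uri), where the root shown is only a placeholder.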





[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886475#action_12886475
 ] 

Karl Wright commented on CONNECTORS-55:
---

Hi Jack,
This seems to me to be beyond the scope of most open-source installers.  I've 
constructed installers involving Postgres before and the integration 
possibilities are very limited.  Furthermore, you would need a totally 
different installer for Windows, Debian, Red Hat, Solaris, the Mac, etc.  Many 
of these platforms do not work well with bundles but instead use a dependency 
model in any case.

--- original message ---
From: "ext Jack Krupansky (JIRA)" 
Subject: [jira] Created: (CONNECTORS-55) Bundle database server with LCF 
packaged product
Date: July 8, 2010
Time: 4:35:20  PM


Bundle database server with LCF packaged product


 Key: CONNECTORS-55
 URL: https://issues.apache.org/jira/browse/CONNECTORS-55
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Framework core
Reporter: Jack Krupansky


The current requirement that the user install and deploy a PostgreSQL server 
complicates the installation and deployment of LCF for the user. Installation 
and deployment of LCF should be as simple as Solr itself. QuickStart is great 
for the low-end and basic evaluation, but a comparable level of simplified 
installation and deployment is still needed for full-blown, high-end 
environments that need the full performance of a PostgreSQL-class database 
server. So, PostgreSQL should be bundled with the packaged release of LCF so 
that installation and deployment of LCF will automatically install and deploy a 
subset of the full PostgreSQL distribution that is sufficient for the needs of 
LCF. Starting LCF, with or without the LCF UI, should automatically start the 
database server. Shutting down LCF should also shut down the database server 
process.

A typical use case would be for a non-developer who is comfortable with Solr 
and simply wants to crawl documents from, for example, a SharePoint repository 
and feed them into Solr. QuickStart should work well for the low end or in the 
early stages of evaluation, but the user would prefer to evaluate "the real 
thing" with something resembling a production crawl of thousands of documents. 
Such a user might not be a hard-core developer or be comfortable fiddling with 
a lot of software components simply to do one conceptually simple operation.

It should still be possible for the user to supply database server settings to 
override the defaults, but the LCF package should have all of the best-practice 
settings deemed appropriate for use with LCF.

One downside is that installation and deployment will be platform-specific 
since there are multiple processes and PostgreSQL itself requires a 
platform-specific installation.

This proposal presumes that PostgreSQL is the best option for the foreseeable 
future, but nothing here is intended to preclude support for other database 
servers in future releases.

This proposal should not have any impact on QuickStart packaging or deployment.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.



[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886548#action_12886548
 ] 

Karl Wright commented on CONNECTORS-55:
---

Mark, it took most of a month to get Derby working, and to do it I needed to 
disable certain functionality in LCF.  No performance tuning or analysis has 
yet been done on Derby, and I would not be surprised if another month was 
required to complete that.  Point being that it is by no means ever a "plug and 
play" operation to switch databases - there are just way too many side effects 
(e.g. query A performs wonderfully on database X, but you need to use query B 
or you're dead on database Y).  Jack, for example, was extremely surprised to 
learn that embedded Derby would not allow more than one process to access the 
database at a time - and Jack was the one advocating most strongly for Derby 
support!

I therefore strongly suggest a cautious approach when considering introducing 
additional databases.  Testing of any change also becomes much more difficult 
the more supported databases there are.  So, in my view, one really must ask, 
"What unmet scenario do you see that would demand support for this database?", 
before just going ahead and deciding to support whatever may be out there.  I 
realize this cautious approach is diametrically opposed to your stated goal of 
supporting "other java databases".  Perhaps you could clarify your request so 
that we could understand your true goal here.







[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886691#action_12886691
 ] 

Karl Wright commented on CONNECTORS-55:
---

Robert,

I'm not opposed to implementing support for hsqldb, but let's be clear on the 
goals here.

The initial goal for doing the Derby implementation was simply to be able to 
write unit tests, and to make Jack happy.  Later, because of the fact that 
Derby has an embedded JDBC mode of operation, it was possible to construct the 
LCF Quick-Start to use it.  So it made it possible to have an unzip-and-go 
solution.

What would be the goal of using hsqldb?   It seems to support an embedded mode, 
so it could certainly be used instead of Derby, wherever we are currently using 
Derby.  Since it fully supports MVCC it is certainly much closer to Postgresql 
in actual operation than Derby is, so chances are good that we'd find fewer 
issues in scaling than with Derby.  If this is the approach you are suggesting, 
I would suggest dropping support for Derby and simply replacing it with Hsqldb.  
We'd leave the Derby implementation class around, of course, but we'd not tune 
against it or test against it.

FWIW, if hsqldb is sufficiently performant, I could foresee also dropping 
support for postgresql in the future, in the same way.  But that is yet to be 
proven.  And, indeed, that's what the problem is - there's no way to know in 
advance of doing the work how exactly things will pan out.  So if that's the 
true goal, we've got a fair bit of work to do before deciding whether Hsqldb or 
Derby or any other Java database can actually do what we need it to.





[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886701#action_12886701
 ] 

Karl Wright commented on CONNECTORS-55:
---

>>>>>>
Can you help me out and give me more ideas on what particular performance 
problems you are concerned about (e.g. query types or whatever) ?
<<<<<<

Hi Robert,
There are two major determinants of performance for LCF, under Postgresql at 
any rate.  The first is the performance of the queue stuffer query, and how 
that scales to when the queue is extremely large.  This is a complex query, but 
its basic form is:

SELECT ... FROM ... WHERE ... AND NOT EXISTS(...) ORDER BY ... ASC LIMIT ...

Because the queue may be very large, and this query may potentially return ALL 
records in the queue, the query plan MUST wind up reading directly out of the 
priority index, or the query simply will not work.  It simply cannot afford to 
read 20 million records into memory and then sort them!
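
The point about the query plan can be checked mechanically. The sketch below uses Python with SQLite purely as a stand-in (it is not the PostgreSQL setup discussed here); the table queue, its columns, and the composite index on (status, priority) are hypothetical. It asks the planner whether a simplified stuffer-style query would need a separate sort step.

```python
import sqlite3

# Simplified stand-in for the stuffer query; the real one is more complex.
STUFFER_SQL = (
    "SELECT id FROM queue WHERE status = ? "
    "ORDER BY priority ASC LIMIT ?"
)


def make_queue(with_index):
    """Build a small in-memory queue table, optionally with a priority index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE queue (id INTEGER PRIMARY KEY, status TEXT, priority REAL)"
    )
    if with_index:
        # Hypothetical index: equality on status, then ordered by priority.
        conn.execute("CREATE INDEX queue_priority ON queue (status, priority)")
    conn.executemany(
        "INSERT INTO queue (status, priority) VALUES (?, ?)",
        [("pending", float(i % 97)) for i in range(1000)],
    )
    return conn


def needs_sort(conn, sql, params):
    """True if SQLite's plan includes a temporary B-tree for ORDER BY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable plan text is the fourth column of each row.
    return any("TEMP B-TREE" in row[3] for row in rows)
```

With the index in place, needs_sort(make_queue(True), STUFFER_SQL, ("pending", 10)) should report that no sort is needed, because rows come back in index order and the LIMIT can stop early. Without it, the whole candidate set has to be sorted first, which is the failure mode described above.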

The second place performance can be severely impacted is in how parallel writes 
can be.  In postgresql 7.4, for example, everything was single-threaded on 
writes.  This caused web crawling in particular to be poorly performing, 
because every typical web page has a significant number of links that must be 
entered in the queue, and single-threading that process cost some 4x to 10x 
over Postgresql 8.x, which allowed much more parallelism.

Hope this helps.





[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886703#action_12886703
 ] 

Karl Wright commented on CONNECTORS-55:
---

I should add that, even for Postgresql, we've had to mess with the stuffer 
query on pretty near every point release of Postgresql, to guarantee that it 
continues to meet the basic criteria.  The last change that we needed was to 
perform an ANALYZE *every* time before the query was run.  Why?  Because 
Postgresql 8.3 became somehow incredibly sensitive to small changes in 
statistics and would cease to do the right thing very quickly as the database 
changed.  We looked at this and discovered that it took a specific plan 
optimization path when it thought a particular statistic was 100%, and a 
totally different one when the statistic was anything less than 100%.   A bug?  
Well, no, just a sensitivity...  I guess one could call it a design flaw that 
nobody thought about what might happen if the statistics were slightly out of 
date.
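
As a rough illustration of the "ANALYZE before every run" workaround, here is a Python sketch that uses SQLite as a stand-in. The table queue, its columns, and the function fetch_next_batch are hypothetical, and SQLite's ANALYZE is only an analogue of the PostgreSQL command discussed above.

```python
import sqlite3


def make_queue():
    """Create a small in-memory queue table with an index on priority."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE queue (id INTEGER PRIMARY KEY, priority REAL)")
    conn.execute("CREATE INDEX queue_priority ON queue (priority)")
    conn.executemany(
        "INSERT INTO queue (priority) VALUES (?)",
        [(float((i * 37) % 101),) for i in range(500)],
    )
    return conn


def fetch_next_batch(conn, limit):
    """Refresh planner statistics, then read the next batch in priority order."""
    # Statistics are refreshed on every call so the planner never works
    # from stale numbers; this trades some overhead for plan stability.
    conn.execute("ANALYZE")
    return conn.execute(
        "SELECT id, priority FROM queue ORDER BY priority ASC LIMIT ?",
        (limit,),
    ).fetchall()
```

The trade-off is the extra cost of gathering statistics on each call, which is only worthwhile when a bad plan is far more expensive than the ANALYZE itself.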




[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886717#action_12886717
 ] 

Karl Wright commented on CONNECTORS-55:
---

Mark,

If your concern is about installing LCF, read the Quick Start part of the 
build/deploy page.  You check out, build, and run.  Derby-based.  Nothing else 
to install.   Not hard really.






[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886722#action_12886722
 ] 

Karl Wright commented on CONNECTORS-55:
---

>>>>>>
forcing the user to pick the right/acceptable release of PostgreSQL to install 
is error prone and a support headache
<<<<<<

Yup.  It is.  Problem is that products/versions get security fixes, CVEs, 
end-of-life notices, etc.  It is beyond the scope of LCF to try and control all 
that - we'd be buying a whole new level of support headache, believe me.





[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886721#action_12886721
 ] 

Karl Wright commented on CONNECTORS-55:
---

>>>>>>
That is exactly what I was leaning towards - but what kind of hobbled state are 
you in with derby? You said you have to run the db and ui one at a time or 
something? And that many sql queries don't work with derby - that has all been 
addressed already?

<<<<<<

The "hobbling" comes down to three things.  First, you can't sort on some 
report columns that you could sort on when only PostgreSQL was involved.  
Second, no real large-scale performance tests have been done on Derby.  Third, 
you need to use "LIKE" %-based syntax instead of real regular expressions 
whenever you specify regular expressions in your reports.  The quick-start does 
not limit your simultaneous use of UI and crawler - it runs Jetty as the app 
server within the same process.  It *does* limit your ability to use other 
commands simultaneously - but you should not need to do that in normal 
circumstances.
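
To make the LIKE restriction concrete, here is a minimal sketch comparing a regular expression with the closest LIKE pattern.  It uses Python's built-in SQLite purely for convenience, not Derby, and the table name, column name, and sample identifiers are all made up for illustration.

```python
import re
import sqlite3

# Hypothetical sample of document identifiers; not taken from LCF.
identifiers = [
    "http://example.com/a.pdf",
    "http://example.com/b.html",
    "ftp://example.com/c.pdf",
]

# What one would write where real regular expressions are available.
regex = re.compile(r"^http://.*\.pdf$")
regex_hits = [s for s in identifiers if regex.search(s)]

# The closest LIKE equivalent: '%' matches any run of characters,
# '_' matches exactly one, and there are no anchors or alternation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (identifier TEXT)")
conn.executemany("INSERT INTO docs VALUES (?)", [(s,) for s in identifiers])
like_hits = [
    row[0]
    for row in conn.execute(
        "SELECT identifier FROM docs WHERE identifier LIKE ? ORDER BY identifier",
        ("http://%.pdf",),
    )
]
conn.close()
```

Simple prefix/suffix patterns like this one translate directly; anything needing alternation or character classes has no single LIKE equivalent.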

So "that" has indeed already been addressed.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a PostgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shut down the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in future releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886730#action_12886730
 ] 

Karl Wright commented on CONNECTORS-55:
---

The quick-start even takes care of connector registration for you, so 
executecommand is not needed even then.  What you *don't* get to do is use the 
command-based API to control LCF; that's not going to work in the 
single-process model.

By the way, hsqldb is apparently limited to a 16GB database (version 2.0).  
That's not very much.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a PostgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shut down the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in future releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-56) All features should be accessible through an API

2010-07-09 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886744#action_12886744
 ] 

Karl Wright commented on CONNECTORS-56:
---

The general approach for an API that I'd suggest, which would be completely 
compatible with the Quick Start version of LCF, would be to give the API its 
own web application.  That web application would consist entirely of a 
servlet, which would interpret path and argument information as commands and 
posted command arguments, respectively.  A separate web application would 
permit users to control access to the API using standard application server 
access management mechanisms.

All commands would consist of HTTP posts that address the API servlet.  The 
commands themselves would come in broad categories, and be specified as part of 
the "path", as follows:

job/find - get existing job information
job/start - start a job
job/abort - abort a job
job/restart - restart a job
job/delete - delete a job
job/save - save a job
...
report/simplehistory - generate a simple history report
report/documentstatus - generate a document status report
...
etc.

As for arguments and return values, my sense is that we have a choice of either 
XML or JSON here.  It's not clear which is better, but my preference would be 
JSON, because of its relative simplicity, and because otherwise people will be 
tempted to want us to use full SOAP, with WSDLs etc.  That would add a lot of 
overhead to the solution, in my opinion.

The API commands would be stateless, in that there would be no explicit or 
implicit session created.  All arguments are therefore explicit.
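
As a rough illustration of the idea above, the sketch below shows how a path such as "job/start" plus a posted JSON body could be mapped to a command.  It is only a sketch under stated assumptions: the handler functions, the "job_id" argument, and the shape of the responses are hypothetical, since none of them has been settled in this thread.

```python
import json

# Hypothetical handlers; the real API's behaviour is not defined in this thread.
def start_job(args):
    return {"job_id": args["job_id"], "status": "started"}

def find_job(args):
    return {"job_id": args["job_id"], "description": "(not looked up in this sketch)"}

# Maps (category, command) taken from the path to the code that handles it.
HANDLERS = {
    ("job", "start"): start_job,
    ("job", "find"): find_job,
}

def dispatch(path, body):
    """Interpret 'category/command' from the path and JSON arguments from the POST body."""
    category, _, command = path.strip("/").partition("/")
    handler = HANDLERS.get((category, command))
    if handler is None:
        return json.dumps({"error": "unknown command: " + path})
    # Stateless: everything the command needs arrives in this one request.
    return json.dumps(handler(json.loads(body)))

response = dispatch("/job/start", '{"job_id": "123"}')
```

Because no session is kept, each call to dispatch() can be handled in isolation, which is what makes the commands easy to script.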

Connection configuration, document specification, and output specification 
information is the main problem.  This information is easily mappable to 
unstructured XML, and is managed by LCF in this way.  The content of the XML 
is determined wholly by the connector code involved, and is therefore opaque as 
far as the API is concerned.  Embedding such XML as a JSON field is certainly 
possible, or it would even be possible to convert it to embedded or nested 
JSON.  It's really not clear to me what the best approach is here, although 
embedded JSON would require fewer moving parts in the client.
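
To compare the two options, here is a small sketch that takes one piece of configuration XML and produces both an embedded-XML JSON field and a nested-JSON equivalent.  The sample XML and the key names ("_type", "_attributes", "_value", "_children") are invented for illustration; no actual mapping has been decided.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical connector configuration; element and attribute names are invented.
config_xml = (
    '<configuration>'
    '<parameter name="server">sp.example.com</parameter>'
    '</configuration>'
)

def to_nested(element):
    """Convert an XML element into a plain dict, without interpreting its meaning."""
    node = {"_type": element.tag}
    if element.attrib:
        node["_attributes"] = dict(element.attrib)
    text = (element.text or "").strip()
    if text:
        node["_value"] = text
    children = [to_nested(child) for child in element]
    if children:
        node["_children"] = children
    return node

# Option 1: embed the XML as an opaque string; the client also needs an XML parser.
embedded = json.dumps({"configuration": config_xml})

# Option 2: nested JSON; the client needs only a JSON parser.
nested = json.dumps({"configuration": to_nested(ET.fromstring(config_xml))})
```

Either way the API treats the structure as opaque; the conversion is purely mechanical and does not need to know what any connector means by its elements.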

If this approach is to fly, somebody will need to document these opaque 
configuration structures, which will mean that these structures must remain 
backwards compatible as they evolve.

Thoughts?


> All features should be accessible through an API
> 
>
> Key: CONNECTORS-56
> URL: https://issues.apache.org/jira/browse/CONNECTORS-56
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jack Krupansky
>
> LCF consists of a full-featured crawling engine and a full-featured user 
> interface to access the features of that engine, but some applications are 
> better served with a full API that lets the application control the crawling 
> engine, including creation and editing of connections and creation, editing, 
> and control of jobs. Put simply, everything that a user can accomplish via 
> the LCF UI should be doable through an LCF API. All LCF objects should be 
> queryable through the API.
> A primary use case is Solr applications which currently use Aperture for 
> crawling, but would prefer the full-featured capabilities of LCF as a 
> crawling engine over Aperture.
> I do not wish to over-specify the API in this initial description, but I 
> think the LCF API should probably be a traditional REST API, with some of 
> the API elements specified via the context path, some parameters via URL 
> query parameters, and complex, detailed structures as JSON (or similar). The 
> precise details of the API are beyond the scope of this initial description 
> and will be added incrementally once the high-level approach to the API 
> becomes reasonably settled.
> A job status and event reporting scheme is also needed in conjunction with 
> the LCF API. That requirement has already been captured as CONNECTORS-41.
> The intention for the API is to create, edit, access, and control all of the 
> objects managed by LCF. The main focus is on repositories, jobs, and status, 
> and less about document-specific crawling information, but there may be some 
> benefit to querying crawling status for individual documents as well.
> Nothing in this proposal should in any way limit or constrain the features 
> that will be available in the LCF UI. The intent is that LCF should continue 
> to have a full-featured UI, in addition to a full-featured API.
> Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-60) Agent process should be started automatically

2010-07-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887804#action_12887804
 ] 

Karl Wright commented on CONNECTORS-60:
---

I'm tempted to close this because the jetty integration has already addressed 
this issue.



 [ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jack Krupansky updated CONNECTORS-60:
-

Description:
LCF as it exists today is a bit too complex to run for an average user, 
especially with a separate agent process for crawling. LCF should be as easy to 
run as Solr is today. QuickStart is a good move in this direction, but the same 
user-visible simplicity is needed for full LCF. The separate agent process is a 
reasonable design for execution, but a little too cumbersome for the average 
user to manage.

Unfortunately, it is expected that starting up a multi-process application will 
require platform-specific scripting.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.


  was:
LCF as it exists today is a bit too complex to run for an average user, 
especially with a separate agent process for crawling. LCF should be as easy to 
run as Solr is today. QuickStart is a good move in this direction, but the same 
user-visible simplicity is needed for LCF. The separate agent process is a 
reasonable design for execution, but a little too cumbersome for the average 
user to manage.

Unfortunately, it is expected that starting up a multi-process application will 
require platform-specific scripting.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.








> Agent process should be started automatically
> -
>
> Key: CONNECTORS-60
> URL: https://issues.apache.org/jira/browse/CONNECTORS-60
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>Reporter: Jack Krupansky
>
> LCF as it exists today is a bit too complex to run for an average user, 
> especially with a separate agent process for crawling. LCF should be as easy 
> to run as Solr is today. QuickStart is a good move in this direction, but the 
> same user-visible simplicity is needed for full LCF. The separate agent 
> process is a reasonable design for execution, but a little too cumbersome for 
> the average user to manage.
> Unfortunately, it is expected that starting up a multi-process application 
> will require platform-specific scripting.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-61) Support bundling of LCF with an app

2010-07-13 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887806#action_12887806
 ] 

Karl Wright commented on CONNECTORS-61:
---

I'm tempted to close this issue because (a) there is absolutely no reason 
anyone competent cannot bundle lcf with an app today, and (b) it is completely 
unclear what, if anything, the 'fix' would look like.  A specific statement of 
an actual concrete problem is the only thing that will prevent me from closing 
this.


--- original message ---
From: "ext Jack Krupansky (JIRA)" 
Subject: [jira] Created: (CONNECTORS-61) Support bundling of LCF with an app
Date: July 12, 2010
Time: 2:48:11  PM


Support bundling of LCF with an app
---

 Key: CONNECTORS-61
 URL: https://issues.apache.org/jira/browse/CONNECTORS-61
 Project: Lucene Connector Framework
  Issue Type: Sub-task
  Components: Framework core
Reporter: Jack Krupansky


It should be possible for an application developer to bundle LCF with an 
application to facilitate installation and deployment of the application in 
conjunction with LCF. This may (or may not) be as simple as providing 
appropriate jar files and documentation for how to use them, but there may be 
other components or scripts needed.

There are two options: 1) include the LCF UI along with the other LCF 
processes, and 2) exclude the LCF UI and include only the other processes that 
can be controlled via the full API.

The database server would be included.

The web app server would be optional since the application may have its own 
choice of web app server.

One use case is bundling LCF with Solr or a Solr-based application.

Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.






> Support bundling of LCF with an app
> ---
>
> Key: CONNECTORS-61
> URL: https://issues.apache.org/jira/browse/CONNECTORS-61
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Framework core
>Reporter: Jack Krupansky
>
> It should be possible for an application developer to bundle LCF with an 
> application to facilitate installation and deployment of the application in 
> conjunction with LCF. This may (or may not) be as simple as providing 
> appropriate jar files and documentation for how to use them, but there may be 
> other components or scripts needed.
> There are two options: 1) include the LCF UI along with the other LCF 
> processes, and 2) exclude the LCF UI and include only the other processes 
> that can be controlled via the full API.
> The database server would be included.
> The web app server would be optional since the application may have its own 
> choice of web app server.
> One use case is bundling LCF with Solr or a Solr-based application.
> Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.




[jira] Updated: (CONNECTORS-60) Agent process should be started automatically

2010-07-14 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-60:
--

   Priority: Minor  (was: Major)
Description: 
LCF as it exists today is a bit too complex to run for an average user, 
especially with a separate agent process for crawling. LCF should be as easy to 
run as Solr is today. QuickStart is a good move in this direction, but the same 
user-visible simplicity is needed for full LCF. The separate agent process is a 
reasonable design for execution, but a little too cumbersome for the average 
user to manage.

Unfortunately, it is expected that starting up a multi-process application will 
require platform-specific scripting.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.

KDW - this functionality is already present; however the documentation is not 
adequate to help people figure out how to do it.  So I'm moving this to 
Documentation and treating it as a doc bug.


  was:
LCF as it exists today is a bit too complex to run for an average user, 
especially with a separate agent process for crawling. LCF should be as easy to 
run as Solr is today. QuickStart is a good move in this direction, but the same 
user-visible simplicity is needed for full LCF. The separate agent process is a 
reasonable design for execution, but a little too cumbersome for the average 
user to manage.

Unfortunately, it is expected that starting up a multi-process application will 
require platform-specific scripting.

Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.


Component/s: Documentation
 (was: Framework agents process)

> Agent process should be started automatically
> -
>
> Key: CONNECTORS-60
> URL: https://issues.apache.org/jira/browse/CONNECTORS-60
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: Jack Krupansky
>Priority: Minor
>
> LCF as it exists today is a bit too complex to run for an average user, 
> especially with a separate agent process for crawling. LCF should be as easy 
> to run as Solr is today. QuickStart is a good move in this direction, but the 
> same user-visible simplicity is needed for full LCF. The separate agent 
> process is a reasonable design for execution, but a little too cumbersome for 
> the average user to manage.
> Unfortunately, it is expected that starting up a multi-process application 
> will require platform-specific scripting.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.
> KDW - this functionality is already present; however the documentation is not 
> adequate to help people figure out how to do it.  So I'm moving this to 
> Documentation and treating it as a doc bug.




[jira] Resolved: (CONNECTORS-59) Packaged app ready to run with embedded Jetty app server

2010-07-14 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-59?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-59.
---

Resolution: Fixed

I am unaware of any "lingering issues" with the QuickStart work.


> Packaged app ready to run with embedded Jetty app server 
> -
>
> Key: CONNECTORS-59
> URL: https://issues.apache.org/jira/browse/CONNECTORS-59
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Framework core
>Reporter: Jack Krupansky
>
> Many potential users of LCF are not necessarily sophisticated developers who 
> are prepared to "work with code", but are able to install packaged software, 
> much as Solr is currently distributed. QuickStart for LCF is a good move in 
> this direction, but similar packaging is needed for full LCF with a 
> production database server. This issue focuses on assuring that full LCF is 
> released as a packaged app suitable for download and immediate use without 
> any additional software development expertise required.
> Database packaging has already been called out as a distinct issue 
> (CONNECTORS-55), so this issue is more of a catch-all for any lingering work 
> needed to address support for full LCF as a packaged app.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Updated: (CONNECTORS-60) Agent process should be started automatically

2010-07-14 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-60:
--

Component/s: Framework agents process

Moving to more descriptive category.


> Agent process should be started automatically
> -
>
> Key: CONNECTORS-60
> URL: https://issues.apache.org/jira/browse/CONNECTORS-60
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Framework agents process
>Reporter: Jack Krupansky
>
> LCF as it exists today is a bit too complex to run for an average user, 
> especially with a separate agent process for crawling. LCF should be as easy 
> to run as Solr is today. QuickStart is a good move in this direction, but the 
> same user-visible simplicity is needed for full LCF. The separate agent 
> process is a reasonable design for execution, but a little too cumbersome for 
> the average user to manage.
> Unfortunately, it is expected that starting up a multi-process application 
> will require platform-specific scripting.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-60) Agent process should be started automatically

2010-07-14 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-60?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888502#action_12888502
 ] 

Karl Wright commented on CONNECTORS-60:
---

The QuickStart can be used with Postgresql as well.  All you need to do is 
modify the properties.xml file to use the Postgresql implementation rather than 
the Derby one.  That is, the following property should be set:



... instead of the corresponding Derby implementation class.  Then, just start 
QuickStart and everything should be set up, provided you've configured 
Postgresql in the standard way.  That's it.



> Agent process should be started automatically
> -
>
> Key: CONNECTORS-60
> URL: https://issues.apache.org/jira/browse/CONNECTORS-60
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Framework agents process
>Reporter: Jack Krupansky
>
> LCF as it exists today is a bit too complex to run for an average user, 
> especially with a separate agent process for crawling. LCF should be as easy 
> to run as Solr is today. QuickStart is a good move in this direction, but the 
> same user-visible simplicity is needed for full LCF. The separate agent 
> process is a reasonable design for execution, but a little too cumbersome for 
> the average user to manage.
> Unfortunately, it is expected that starting up a multi-process application 
> will require platform-specific scripting.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Created: (CONNECTORS-62) Document the LCF API

2010-07-14 Thread Karl Wright (JIRA)
Document the LCF API


 Key: CONNECTORS-62
 URL: https://issues.apache.org/jira/browse/CONNECTORS-62
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright


Not only does the LCF API itself need documentation, but so do all the 
connector configuration/specification objects, now that they are exposed.  This 
should probably become part of the developer documentation on the main LCF 
website.





[jira] Commented: (CONNECTORS-56) All features should be accessible through an API

2010-07-14 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888508#action_12888508
 ] 

Karl Wright commented on CONNECTORS-56:
---

Checked in code which I claim constitutes the 0.5 version of the API.  This 
includes complete JSON-based operations for the following:

- /job/list
- /job/get
- /job/save
- /job/delete
- /jobstatus/list
- /jobstatus/start
- /jobstatus/abort
- /jobstatus/restart
- /jobstatus/pause
- /jobstatus/resume
- /outputconnection/list
- /outputconnection/get
- /outputconnection/save
- /outputconnection/delete
- /outputconnection/checkstatus
- /repositoryconnection/list
- /repositoryconnection/get
- /repositoryconnection/save
- /repositoryconnection/delete
- /repositoryconnection/checkstatus
- /authorityconnection/list
- /authorityconnection/get
- /authorityconnection/save
- /authorityconnection/delete
- /authorityconnection/checkstatus
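
For anyone wanting to try these, a client call might be assembled roughly as follows.  This is a sketch only: the base URL is hypothetical, the "job_id" argument name is an assumption rather than a documented field, and it assumes commands are sent as HTTP POSTs with a JSON body, as proposed earlier in this issue.  The request is built but deliberately not sent.

```python
import json
import urllib.request

# Hypothetical base URL; the thread does not say where the API servlet is deployed.
BASE_URL = "http://localhost:8080/lcf-api"

def build_request(command, arguments):
    """Build (but do not send) a JSON POST for a command path such as '/jobstatus/start'."""
    body = json.dumps(arguments).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + command,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# The argument name 'job_id' is an assumption, not a documented field.
request = build_request("/jobstatus/start", {"job_id": "123"})
```

Passing the resulting object to urllib.request.urlopen() would actually issue the call against a running server.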

Some minor immediate refinements are possible, specifically around the way 
error handling is done.  Currently, errors that toss exceptions cause a 500 
error (with some helpful text).  It may be that it's better to get the 
exception text back as part of the JSON object being returned.  
Comments/preferences welcome.
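
To make the alternative concrete, here is a minimal sketch of returning the exception text inside the JSON object instead of a bare 500.  The "error" field name, the choice of a 200 status for the error case, and the handler shown are all placeholders for discussion, not what the checked-in code does.

```python
import json

def run_command(handler, arguments):
    """Return (http_status, json_body), reporting exceptions inside the JSON body."""
    try:
        return 200, json.dumps(handler(arguments))
    except Exception as exc:
        # Alternative to a bare 500: surface the exception text in a JSON field.
        # The field name 'error' is a placeholder.
        return 200, json.dumps({"error": str(exc)})

# Hypothetical handler that always fails, to show the error path.
def failing_handler(arguments):
    raise ValueError("no such job: " + arguments["job_id"])

status, body = run_command(failing_handler, {"job_id": "123"})
```

The trade-off is that clients then have to inspect the body for an error field rather than relying on the HTTP status alone.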

Also not yet implemented are the following methods:

- /report/documentstatus
- /report/queuestatus
- /report/simplehistory
- /report/maximumbandwidth
- /report/maximumactivity
- /report/resultsummary

These I'd propose to hold off on right now.

I've created a separate ticket for documentation/examples.  I'm still intending 
to create some tests to exercise at least major portions of the API before 
closing THIS ticket.


> All features should be accessible through an API
> 
>
> Key: CONNECTORS-56
> URL: https://issues.apache.org/jira/browse/CONNECTORS-56
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Framework core
>Reporter: Jack Krupansky
>
> LCF consists of a full-featured crawling engine and a full-featured user 
> interface to access the features of that engine, but some applications are 
> better served with a full API that lets the application control the crawling 
> engine, including creation and editing of connections and creation, editing, 
> and control of jobs. Put simply, everything that a user can accomplish via 
> the LCF UI should be doable through an LCF API. All LCF objects should be 
> queryable through the API.
> A primary use case is Solr applications which currently use Aperture for 
> crawling, but would prefer the full-featured capabilities of LCF as a 
> crawling engine over Aperture.
> I do not wish to over-specify the API in this initial description, but I 
> think the LCF API should probably be a traditional REST API, with some of 
> the API elements specified via the context path, some parameters via URL 
> query parameters, and complex, detailed structures as JSON (or similar). The 
> precise details of the API are beyond the scope of this initial description 
> and will be added incrementally once the high-level approach to the API 
> becomes reasonably settled.
> A job status and event reporting scheme is also needed in conjunction with 
> the LCF API. That requirement has already been captured as CONNECTORS-41.
> The intention for the API is to create, edit, access, and control all of the 
> objects managed by LCF. The main focus is on repositories, jobs, and status, 
> and less about document-specific crawling information, but there may be some 
> benefit to querying crawling status for individual documents as well.
> Nothing in this proposal should in any way limit or constrain the features 
> that will be available in the LCF UI. The intent is that LCF should continue 
> to have a full-featured UI, in addition to a full-featured API.
> Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.




[jira] Created: (CONNECTORS-63) Add support for reports to API

2010-07-16 Thread Karl Wright (JIRA)
Add support for reports to API
--

 Key: CONNECTORS-63
 URL: https://issues.apache.org/jira/browse/CONNECTORS-63
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: API
Reporter: Karl Wright


The API does not currently implement support for any LCF reporting.  Add 
this functionality.





[jira] Updated: (CONNECTORS-61) Support bundling of LCF with an app

2010-07-16 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-61?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-61:
--

   Priority: Minor  (was: Major)
Component/s: Documentation
 (was: Framework core)

Moving to documentation, since that's what this is apparently about.


> Support bundling of LCF with an app
> ---
>
> Key: CONNECTORS-61
> URL: https://issues.apache.org/jira/browse/CONNECTORS-61
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Documentation
>Reporter: Jack Krupansky
>Priority: Minor
>
> It should be possible for an application developer to bundle LCF with an 
> application to facilitate installation and deployment of the application in 
> conjunction with LCF. This may (or may not) be as simple as providing 
> appropriate jar files and documentation for how to use them, but there may be 
> other components or scripts needed.
> There are two options: 1) include the LCF UI along with the other LCF 
> processes, and 2) exclude the LCF UI and include only the other processes 
> that can be controlled via the full API.
> The database server would be included.
> The web app server would be optional since the application may have its own 
> choice of web app server.
> One use case is bundling LCF with Solr or a Solr-based application.
> Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.




[jira] Resolved: (CONNECTORS-56) All features should be accessible through an API

2010-07-16 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-56?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-56.
---

  Assignee: Karl Wright
Resolution: Fixed

I've created tests, and a separate ticket for API documentation and for adding 
report functions to the API.  So I'm going to close this one, and use the other 
tickets to track those other issues.


> All features should be accessible through an API
> 
>
> Key: CONNECTORS-56
> URL: https://issues.apache.org/jira/browse/CONNECTORS-56
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Framework core
>Reporter: Jack Krupansky
>Assignee: Karl Wright
>
> LCF consists of a full-featured crawling engine and a full-featured user 
> interface to access the features of that engine, but some applications are 
> better served with a full API that lets the application control the crawling 
> engine, including creation and editing of connections and creation, editing, 
> and control of jobs. Put simply, everything that a user can accomplish via 
> the LCF UI should be doable through an LCF API. All LCF objects should be 
> queryable through the API.
> A primary use case is Solr applications which currently use Aperture for 
> crawling, but would prefer the full-featured capabilities of LCF as a 
> crawling engine over Aperture.
> I do not wish to over-specify the API in this initial description, but I 
> think the LCF API should probably be a traditional REST API, with some of 
> the API elements specified via the context path, some parameters via URL 
> query parameters, and complex, detailed structures as JSON (or similar). The 
> precise details of the API are beyond the scope of this initial description 
> and will be added incrementally once the high-level approach to the API 
> becomes reasonably settled.
> A job status and event reporting scheme is also needed in conjunction with 
> the LCF API. That requirement has already been captured as CONNECTORS-41.
> The intention for the API is to create, edit, access, and control all of the 
> objects managed by LCF. The main focus is on repositories, jobs, and status, 
> and less about document-specific crawling information, but there may be some 
> benefit to querying crawling status for individual documents as well.
> Nothing in this proposal should in any way limit or constrain the features 
> that will be available in the LCF UI. The intent is that LCF should continue 
> to have a full-featured UI, in addition to a full-featured API.
> Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.
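
As an illustration only: the description above suggests selecting an object via the path, passing simple options as URL query parameters, and sending complex structures as JSON. A minimal Python sketch of that request shape follows. The base URL, the "jobs" resource name, the "verbose" parameter, and the helper build_request are all hypothetical and are not taken from the LCF API; the request is built but never sent.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request

# Hypothetical base URL; the real LCF API location may differ.
BASE_URL = "http://localhost:8080/lcf-api"


def build_request(resource, name=None, query=None, body=None, method="GET"):
    """Build (but do not send) a REST-style request.

    The path selects the object, the query string carries simple
    parameters, and the body carries a complex structure as JSON.
    """
    path = "/" + resource
    if name:
        # Percent-encode the object name so it is safe inside a path.
        path += "/" + quote(name, safe="")
    url = BASE_URL + path
    if query:
        url += "?" + urlencode(query)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(url, data=data, headers=headers, method=method)


# Hypothetical job name, parameter, and payload, for illustration only.
req = build_request(
    "jobs",
    name="nightly crawl",
    query={"verbose": "true"},
    body={"description": "example job"},
    method="POST",
)
```

Here req.full_url combines the base URL, the encoded path, and the query string, while req.data holds the JSON-encoded structure.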




[jira] Updated: (CONNECTORS-58) Mini-API to initially configure default connections and "example" jobs for file system and web crawl

2010-07-16 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-58?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-58:
--

   Priority: Minor  (was: Major)
Component/s: Examples
 (was: Framework core)

I'm going to put this in a new category called "examples".


> Mini-API to initially configure default connections and "example" jobs for 
> file system and web crawl 
> -
>
> Key: CONNECTORS-58
> URL: https://issues.apache.org/jira/browse/CONNECTORS-58
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Examples
>Reporter: Jack Krupansky
>Priority: Minor
>
> Creating a basic connection setup to do a relatively simple crawl for a file 
> system or web can be a daunting task for someone new to LCF. So, it would be 
> nice to have a scripting file that supports an abbreviated API (subset of the 
> full API discussed in CONNECTORS-56) sufficient to create a default set of 
> connections and example jobs that the new user can choose from.
> Beyond this initial need, this script format might be a useful form to "dump" 
> all of the connections and jobs in the LCF database in a form that can be 
> used to recreate an LCF configuration. Kind of a "dump and reload" 
> capability. That in fact might be how the initial example script gets created.
> Those are two distinct use cases, but could utilize the same feature.
> The example script could have example jobs to crawl a subdirectory of LCF, 
> crawl the LCF wiki, etc.
> There could be more than one script. There might be example scripts for each 
> form of connector.
> This capability should be available for both QuickStart and the general 
> release of LCF.
> As just one possibility, the script format might be a sequence of JSON 
> expressions, each with an initial string analogous to a servlet path to 
> specify the operation to be performed, followed by the JSON form of the 
> connection or job or other LCF object. Or, some other format might be more 
> suitable.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.
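
One possible reading of the script format floated above (a path-like operation string followed by the JSON form of an object, repeated) can be sketched in Python as follows. This is only a sketch under assumptions: the operation strings, the field names, and the helpers parse_script and dump_script are hypothetical, and the ticket itself leaves the format open.

```python
import json


def parse_script(text):
    """Parse a sequence of (operation string, JSON object) pairs.

    Each entry is a JSON string naming the operation, analogous to a
    servlet path, followed by a JSON value describing the object.
    """
    decoder = json.JSONDecoder()
    entries = []
    pos = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        operation, pos = decoder.raw_decode(text, pos)
        if not isinstance(operation, str):
            raise ValueError("expected an operation string")
        while pos < len(text) and text[pos].isspace():
            pos += 1
        payload, pos = decoder.raw_decode(text, pos)
        entries.append((operation, payload))
    return entries


def dump_script(entries):
    """Write entries back out, one per line ("dump and reload")."""
    return "\n".join(
        json.dumps(op) + " " + json.dumps(payload) for op, payload in entries
    ) + "\n"


# Hypothetical example script; names and fields are illustrative only.
SAMPLE = '''
"repositoryconnections/files" {"class_name": "FileConnector"}
"jobs/crawl-docs" {"repository_connection": "files", "start_path": "docs"}
'''

entries = parse_script(SAMPLE)
```

Because dump_script writes the same shape that parse_script reads, the same format could serve both the example-script and the dump-and-reload use cases mentioned in the description.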




[jira] Updated: (CONNECTORS-50) Proposal for initial two releases of LCF, including packaged product and full API

2010-07-16 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-50?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-50:
--

Component/s: (was: Framework core)

Moving this out of core, since it's a planning ticket not a software issue.


> Proposal for initial two releases of LCF, including packaged product and full 
> API
> -
>
> Key: CONNECTORS-50
> URL: https://issues.apache.org/jira/browse/CONNECTORS-50
> Project: Lucene Connector Framework
>  Issue Type: New Feature
>Reporter: Jack Krupansky
>
> Currently, LCF has a relatively high bar for evaluation and use, requiring 
> developer expertise. Also, although LCF has a comprehensive UI, it is not 
> currently packaged for use as a crawling engine for advanced applications.
> A small set of individual feature requests are needed to address these 
> issues. They are summarized briefly to show how they fit together for two 
> initial releases of LCF, but will be broken out into individual LCF Jira 
> issues.
> Goals:
> 1. LCF as a standalone, downloadable, usable-out-of-the-box product (much as 
> Solr is today)
> 2. LCF as a toolkit for developers needing customized crawling and repository 
> access
> 3. An API-based crawling engine that can be integrated with applications (as 
> Aperture is today)
> Larger goals:
> 1. Make it very easy for users to evaluate LCF.
> 2. Make it very easy for developers to customize LCF.
> 3. Make it very easy for applications to fully manage and control LCF in 
> operation.
> Two phases:
> 1) Standalone, packaged app that is super-easy to evaluate and deploy. Call 
> it LCF 0.5.
> 2) API-based crawling engine for applications for which the UI might not be 
> appropriate. Call it LCF 1.0.
> Phase 1
> ---
> LCF 0.5 right out of the box would interface loosely with Solr 1.4 or later.
> It would contain roughly the features that are currently in place or 
> currently underway, plus a little more.
> Specifically, LCF 0.5 would contain these additional capabilities:
> 1. Plug-in architecture for connectors (CONNECTORS-40 - DONE)
> 2. Packaged app ready to run with embedded Jetty app server (CONNECTORS-59)
> 3. Bundled with database - PostgreSQL or Derby - ready to run without 
> additional manual setup (CONNECTORS-55)
> 4. Mini-API to initially configure default connections and "example" jobs for 
> file system and web crawl (CONNECTORS-58)
> 5. Agent process started automatically (CONNECTORS-60)
> 6. Solr output connector option to commit at end of job, by default 
> (CONNECTORS-57)
> Installation and basic evaluation of LCF would be essentially as simple as 
> Solr is today. The example connections and jobs would permit the user to 
> initiate example crawls of a file system example directory and an example 
> web on the LCF web site with just a couple of clicks (as opposed to the 
> detailed manual setup required today to create repository and output 
> connections and jobs).
> It is worth considering whether the SharePoint connector could also be 
> included as part of the default package.
> Users could then add additional connectors and repositories and jobs as 
> desired.
> Timeframe for release? Level of effort?
> Phase 2
> ---
> The essence of Phase 2 is that LCF would be split to allow direct, full API 
> access to LCF as a crawling "engine", in addition to the full LCF UI. Call 
> this LCF 1.0.
> Specifically, LCF 1.0 would contain these additional capabilities:
> 1. Full API for LCF as a crawling engine (CONNECTORS-56)
> 2. LCF can be bundled within an app (CONNECTORS-61)
> 3. LCF event and activity notification for full control by an application 
> (CONNECTORS-41)
> Overall, LCF will offer roughly the same crawling capabilities as with LCF 
> 0.5, plus whatever bug
> fixes and minor enhancements might also be added.
> Timeframe for release? Level of effort?
> -
> Issues:
> - Can we package PostgreSQL with LCF so LCF can set it up?
>   - Or do we need Derby for that purpose?
> - Managing multiple processes (UI, database, agent, app processes)
> - What exactly would the API look like? (URL, XML, JSON, YAML?)




[jira] Updated: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-16 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-55:
--

Component/s: Installers
 (was: Framework core)

Moving this to "installers" category.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Installers
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a PostgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shut down the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in future releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-62) Document the LCF API

2010-07-16 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889210#action_12889210
 ] 

Karl Wright commented on CONNECTORS-62:
---

Main API documentation complete, at 
https://cwiki.apache.org/confluence/display/CONNECTORS/Programmatic+Operation+of+LCF
Connector-specific documentation will necessarily take longer, and may not be 
done immediately.


> Document the LCF API
> 
>
> Key: CONNECTORS-62
> URL: https://issues.apache.org/jira/browse/CONNECTORS-62
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>
> Not only does the LCF API itself need documentation, but so do all the 
> connector configuration/specification objects, now that they are exposed.  
> This should probably become part of the developer documentation on the main 
> LCF website.




[jira] Commented: (CONNECTORS-56) All features should be accessible through an API

2010-07-16 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889255#action_12889255
 ] 

Karl Wright commented on CONNECTORS-56:
---

Another potential problem with using PUT is that there is no support in PUT I 
am aware of for large data.  You need multipart POST for that.  Thus, PUT would 
fail for large jobs (e.g. RSS jobs with lots of seeds).


> All features should be accessible through an API
> 
>
> Key: CONNECTORS-56
> URL: https://issues.apache.org/jira/browse/CONNECTORS-56
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Framework core
>Reporter: Jack Krupansky
>Assignee: Karl Wright
>
> LCF consists of a full-featured crawling engine and a full-featured user 
> interface to access the features of that engine, but some applications are 
> better served with a full API that lets the application control the crawling 
> engine, including creation and editing of connections and creation, editing, 
> and control of jobs. Put simply, everything that a user can accomplish via 
> the LCF UI should be doable through an LCF API. All LCF objects should be 
> queryable through the API.
> A primary use case is Solr applications which currently use Aperture for 
> crawling, but would prefer the full-featured capabilities of LCF as a 
> crawling engine over Aperture.
> I do not wish to over-specify the API in this initial description, but I 
> think the LCF API should probably be a traditional REST API, with some of 
> the API elements specified via the context path, some parameters via URL 
> query parameters, and complex, detailed structures as JSON (or similar). The 
> precise details of the API are beyond the scope of this initial description 
> and will be added incrementally once the high-level approach to the API 
> becomes reasonably settled.
> A job status and event reporting scheme is also needed in conjunction with 
> the LCF API. That requirement has already been captured as CONNECTORS-41.
> The intention for the API is to create, edit, access, and control all of the 
> objects managed by LCF. The main focus is on repositories, jobs, and status, 
> and less about document-specific crawling information, but there may be some 
> benefit to querying crawling status for individual documents as well.
> Nothing in this proposal should in any way limit or constrain the features 
> that will be available in the LCF UI. The intent is that LCF should continue 
> to have a full-featured UI, in addition to a full-featured API.
> Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-56) All features should be accessible through an API

2010-07-16 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889251#action_12889251
 ] 

Karl Wright commented on CONNECTORS-56:
---

HTTP methods other than GET or POST are in fact poorly supported in many HTTP 
clients, including Apache Commons HTTPClient.  I am also unsure of whether 
Jetty supports the DELETE method at the servlet level.  I therefore think your 
suggestion would potentially cause a great deal of headache for no tangible 
benefit.


> All features should be accessible through an API
> 
>
> Key: CONNECTORS-56
> URL: https://issues.apache.org/jira/browse/CONNECTORS-56
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Framework core
>Reporter: Jack Krupansky
>Assignee: Karl Wright
>
> LCF consists of a full-featured crawling engine and a full-featured user 
> interface to access the features of that engine, but some applications are 
> better served with a full API that lets the application control the crawling 
> engine, including creation and editing of connections and creation, editing, 
> and control of jobs. Put simply, everything that a user can accomplish via 
> the LCF UI should be doable through an LCF API. All LCF objects should be 
> queryable through the API.
> A primary use case is Solr applications which currently use Aperture for 
> crawling, but would prefer the full-featured capabilities of LCF as a 
> crawling engine over Aperture.
> I do not wish to over-specify the API in this initial description, but I 
> think the LCF API should probably be a traditional REST API, with some of 
> the API elements specified via the context path, some parameters via URL 
> query parameters, and complex, detailed structures as JSON (or similar). The 
> precise details of the API are beyond the scope of this initial description 
> and will be added incrementally once the high-level approach to the API 
> becomes reasonably settled.
> A job status and event reporting scheme is also needed in conjunction with 
> the LCF API. That requirement has already been captured as CONNECTORS-41.
> The intention for the API is to create, edit, access, and control all of the 
> objects managed by LCF. The main focus is on repositories, jobs, and status, 
> and less about document-specific crawling information, but there may be some 
> benefit to querying crawling status for individual documents as well.
> Nothing in this proposal should in any way limit or constrain the features 
> that will be available in the LCF UI. The intent is that LCF should continue 
> to have a full-featured UI, in addition to a full-featured API.
> Note: This issue is part of Phase 2 of the CONNECTORS-50 umbrella issue.




[jira] Created: (CONNECTORS-64) Document the FileNet configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document the FileNet configuration/specification/command API pieces
---

 Key: CONNECTORS-64
 URL: https://issues.apache.org/jira/browse/CONNECTORS-64
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document FileNet-specific API objects and commands.





[jira] Created: (CONNECTORS-65) Document File Connector configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document File Connector configuration/specification/command API pieces
--

 Key: CONNECTORS-65
 URL: https://issues.apache.org/jira/browse/CONNECTORS-65
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document File System connector-specific API objects and commands.





[jira] Commented: (CONNECTORS-62) Document the LCF API

2010-07-19 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-62?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889844#action_12889844
 ] 

Karl Wright commented on CONNECTORS-62:
---

Overall API has been documented in the 'programmatic control of LCF' wiki page 
now.  Still missing individual connector configuration/specification/command 
descriptions.  My sense is that we may be able to postpone the latter until 
post-release, so I'm going to resolve this ticket and create a family of 
related tickets specific to individual connectors.  Then we can decide the 
relative importance of each to the release.


> Document the LCF API
> 
>
> Key: CONNECTORS-62
> URL: https://issues.apache.org/jira/browse/CONNECTORS-62
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Karl Wright
>
> Not only does the LCF API itself need documentation, but so do all the 
> connector configuration/specification objects, now that they are exposed.  
> This should probably become part of the developer documentation on the main 
> LCF website.




[jira] Resolved: (CONNECTORS-62) Document the LCF API

2010-07-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-62.
---

  Assignee: Karl Wright
Resolution: Fixed

Connector-specific API documentation still needs to be written - additional 
tickets to be created.

> Document the LCF API
> 
>
> Key: CONNECTORS-62
> URL: https://issues.apache.org/jira/browse/CONNECTORS-62
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>    Assignee: Karl Wright
>
> Not only does the LCF API itself need documentation, but so do all the 
> connector configuration/specification objects, now that they are exposed.  
> This should probably become part of the developer documentation on the main 
> LCF website.




[jira] Created: (CONNECTORS-66) Document Active Directory authority configuration API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document Active Directory authority configuration API pieces


 Key: CONNECTORS-66
 URL: https://issues.apache.org/jira/browse/CONNECTORS-66
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document Active Directory-specific API objects and commands.





[jira] Created: (CONNECTORS-69) Document JDBC connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document JDBC connector configuration/specification API pieces
--

 Key: CONNECTORS-69
 URL: https://issues.apache.org/jira/browse/CONNECTORS-69
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document JDBC connector-specific API objects and commands.





[jira] Created: (CONNECTORS-67) Document GTS output connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document GTS output connector configuration/specification API pieces


 Key: CONNECTORS-67
 URL: https://issues.apache.org/jira/browse/CONNECTORS-67
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document GTS output connector-specific API objects and commands.





[jira] Created: (CONNECTORS-73) Document RSS connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document RSS connector configuration/specification API pieces
-

 Key: CONNECTORS-73
 URL: https://issues.apache.org/jira/browse/CONNECTORS-73
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document RSS connector-specific API objects and commands.





[jira] Created: (CONNECTORS-68) Document jCIFS connector configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document jCIFS connector configuration/specification/command API pieces
---

 Key: CONNECTORS-68
 URL: https://issues.apache.org/jira/browse/CONNECTORS-68
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document jCIFS connector-specific API objects and commands.





[jira] Created: (CONNECTORS-70) Document LiveLink configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document LiveLink configuration/specification/command API pieces


 Key: CONNECTORS-70
 URL: https://issues.apache.org/jira/browse/CONNECTORS-70
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document LiveLink connector-specific API objects and commands.





[jira] Created: (CONNECTORS-75) Document Solr Connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document Solr Connector configuration/specification API pieces
--

 Key: CONNECTORS-75
 URL: https://issues.apache.org/jira/browse/CONNECTORS-75
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document Solr connector-specific API objects and commands.





[jira] Created: (CONNECTORS-71) Document Memex connector configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document Memex connector configuration/specification/command API pieces
---

 Key: CONNECTORS-71
 URL: https://issues.apache.org/jira/browse/CONNECTORS-71
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document Memex connector-specific API objects and commands.





[jira] Created: (CONNECTORS-72) Document Meridio connector configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document Meridio connector configuration/specification/command API pieces
-

 Key: CONNECTORS-72
 URL: https://issues.apache.org/jira/browse/CONNECTORS-72
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document Meridio connector-specific API objects and commands.





[jira] Created: (CONNECTORS-74) Document SharePoint connector configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document SharePoint connector configuration/specification/command API pieces


 Key: CONNECTORS-74
 URL: https://issues.apache.org/jira/browse/CONNECTORS-74
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document SharePoint connector-specific API objects and commands.





[jira] Created: (CONNECTORS-76) Document Web Connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)
Document Web Connector configuration/specification API pieces
-

 Key: CONNECTORS-76
 URL: https://issues.apache.org/jira/browse/CONNECTORS-76
 Project: Lucene Connector Framework
  Issue Type: Improvement
  Components: Documentation
Reporter: Karl Wright
Priority: Minor


Need to document web connector-specific API objects and commands.





[jira] Updated: (CONNECTORS-65) Document File Connector configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-65?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-65:
--

Fix Version/s: LCF Release 0.5

> Document File Connector configuration/specification/command API pieces
> --
>
> Key: CONNECTORS-65
> URL: https://issues.apache.org/jira/browse/CONNECTORS-65
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>Priority: Minor
> Fix For: LCF Release 0.5
>
>
> Need to document File System connector-specific API objects and commands.




[jira] Updated: (CONNECTORS-75) Document Solr Connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-75?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-75:
--

Fix Version/s: LCF Release 0.5

> Document Solr Connector configuration/specification API pieces
> --
>
> Key: CONNECTORS-75
> URL: https://issues.apache.org/jira/browse/CONNECTORS-75
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>Priority: Minor
> Fix For: LCF Release 0.5
>
>
> Need to document Solr connector-specific API objects and commands.




[jira] Updated: (CONNECTORS-73) Document RSS connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-73?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-73:
--

Fix Version/s: LCF Release 0.5

> Document RSS connector configuration/specification API pieces
> -
>
> Key: CONNECTORS-73
> URL: https://issues.apache.org/jira/browse/CONNECTORS-73
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>Priority: Minor
> Fix For: LCF Release 0.5
>
>
> Need to document RSS connector-specific API objects and commands.




[jira] Updated: (CONNECTORS-74) Document SharePoint connector configuration/specification/command API pieces

2010-07-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-74?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-74:
--

Fix Version/s: LCF Release 0.5

> Document SharePoint connector configuration/specification/command API pieces
> 
>
> Key: CONNECTORS-74
> URL: https://issues.apache.org/jira/browse/CONNECTORS-74
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>Priority: Minor
> Fix For: LCF Release 0.5
>
>
> Need to document SharePoint connector-specific API objects and commands.




[jira] Updated: (CONNECTORS-76) Document Web Connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-76?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-76:
--

Fix Version/s: LCF Release 0.5

> Document Web Connector configuration/specification API pieces
> -
>
> Key: CONNECTORS-76
> URL: https://issues.apache.org/jira/browse/CONNECTORS-76
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>Priority: Minor
> Fix For: LCF Release 0.5
>
>
> Need to document Web connector-specific API objects and commands.




[jira] Updated: (CONNECTORS-69) Document JDBC connector configuration/specification API pieces

2010-07-19 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-69:
--

Fix Version/s: LCF Release 0.5

> Document JDBC connector configuration/specification API pieces
> --
>
> Key: CONNECTORS-69
> URL: https://issues.apache.org/jira/browse/CONNECTORS-69
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Documentation
>    Reporter: Karl Wright
>Priority: Minor
> Fix For: LCF Release 0.5
>
>
> Need to document JDBC connector-specific API objects and commands.




[jira] Created: (CONNECTORS-78) Tests and test infrastructure needed for Documentum connector

2010-07-20 Thread Karl Wright (JIRA)
Tests and test infrastructure needed for Documentum connector
-

 Key: CONNECTORS-78
 URL: https://issues.apache.org/jira/browse/CONNECTORS-78
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


We need tests and testing infrastructure that will allow the Documentum 
connector to be tested.





[jira] Created: (CONNECTORS-80) Tests and test server needed for JDBC connector

2010-07-20 Thread Karl Wright (JIRA)
Tests and test server needed for JDBC connector
---

 Key: CONNECTORS-80
 URL: https://issues.apache.org/jira/browse/CONNECTORS-80
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


The JDBC connector needs tests and a test database server to run against.





[jira] Created: (CONNECTORS-77) Tests and test server needed for FileNet connector

2010-07-20 Thread Karl Wright (JIRA)
Tests and test server needed for FileNet connector
--

 Key: CONNECTORS-77
 URL: https://issues.apache.org/jira/browse/CONNECTORS-77
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


We need global testing infrastructure available that would permit a FileNet 
test to be written.





[jira] Created: (CONNECTORS-84) Tests and test server needed for SharePoint connector

2010-07-20 Thread Karl Wright (JIRA)
Tests and test server needed for SharePoint connector
-

 Key: CONNECTORS-84
 URL: https://issues.apache.org/jira/browse/CONNECTORS-84
 Project: Lucene Connector Framework
  Issue Type: Bug
Reporter: Karl Wright


We need tests and a SharePoint server to test the SharePoint connector.





[jira] Created: (CONNECTORS-79) Tests and test server for jCIFS connector needed

2010-07-20 Thread Karl Wright (JIRA)
Tests and test server for jCIFS connector needed


 Key: CONNECTORS-79
 URL: https://issues.apache.org/jira/browse/CONNECTORS-79
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


We need test infrastructure and tests for the jCIFS connector.





[jira] Created: (CONNECTORS-85) RSS tests need to be written

2010-07-20 Thread Karl Wright (JIRA)
RSS tests need to be written


 Key: CONNECTORS-85
 URL: https://issues.apache.org/jira/browse/CONNECTORS-85
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


RSS connector unit tests, which set up a proper RSS test environment, need to 
be written.





[jira] Created: (CONNECTORS-81) Tests and test server needed for LiveLink connector

2010-07-20 Thread Karl Wright (JIRA)
Tests and test server needed for LiveLink connector
---

 Key: CONNECTORS-81
 URL: https://issues.apache.org/jira/browse/CONNECTORS-81
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


The LiveLink connector needs tests and a test LiveLink server to run against.





[jira] Created: (CONNECTORS-82) Tests and test server needed for Memex connector

2010-07-20 Thread Karl Wright (JIRA)
Tests and test server needed for Memex connector


 Key: CONNECTORS-82
 URL: https://issues.apache.org/jira/browse/CONNECTORS-82
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


The Memex connector needs tests and a Patriarch server to run against.





[jira] Created: (CONNECTORS-83) Tests and test server needed for Meridio connector

2010-07-20 Thread Karl Wright (JIRA)
Tests and test server needed for Meridio connector
--

 Key: CONNECTORS-83
 URL: https://issues.apache.org/jira/browse/CONNECTORS-83
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


The Meridio connector needs tests, and a Meridio test server to run against.





[jira] Created: (CONNECTORS-86) Web connector tests need to be written

2010-07-20 Thread Karl Wright (JIRA)
Web connector tests need to be written
--

 Key: CONNECTORS-86
 URL: https://issues.apache.org/jira/browse/CONNECTORS-86
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


Web connector unit tests, which set up a proper web crawling environment, need 
to be written.





[jira] Created: (CONNECTORS-87) Connector Framework load test needs to be written

2010-07-20 Thread Karl Wright (JIRA)
Connector Framework load test needs to be written
-

 Key: CONNECTORS-87
 URL: https://issues.apache.org/jira/browse/CONNECTORS-87
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Tests
Reporter: Karl Wright


LCF needs a load or performance test, which verifies that the core software is 
performing as expected.  This test can use the file system connector, but must 
verify that individual throttle bins are getting approximately equal time, and 
that the system as a whole is behaving efficiently.  Furthermore, at least 
1,000,000 documents should be crawled by this test.
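
One way the "approximately equal time" check could be expressed is sketched 
below.  This is only an illustration: the class name, the per-bin timing map, 
and the relative tolerance are all assumptions, since the issue does not say 
how bin time would be measured or how close "approximately equal" must be.

```java
import java.util.Map;

/** Hypothetical helper for a load test; not part of LCF. */
final class ThrottleBinFairness {
    private ThrottleBinFairness() {}

    /**
     * Returns true when every bin's measured time is within the given
     * relative tolerance of the mean across all bins.
     * The tolerance (e.g. 0.10 for 10%) is an assumed test parameter.
     */
    static boolean isApproximatelyEqual(Map<String, Long> millisPerBin, double tolerance) {
        if (millisPerBin.isEmpty()) {
            return true; // nothing to compare
        }
        double total = 0;
        for (long millis : millisPerBin.values()) {
            total += millis;
        }
        double mean = total / millisPerBin.size();
        if (mean == 0) {
            return true; // all bins measured zero time
        }
        for (long millis : millisPerBin.values()) {
            // Relative deviation of this bin from the mean.
            if (Math.abs(millis - mean) / mean > tolerance) {
                return false;
            }
        }
        return true;
    }
}
```

A load test could collect the per-bin times during the crawl and fail if this 
check returns false at the end of the run.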





[jira] Created: (CONNECTORS-88) Nightly builds need to be set up, including web publishing of javadoc

2010-07-20 Thread Karl Wright (JIRA)
Nightly builds need to be set up, including web publishing of javadoc
-

 Key: CONNECTORS-88
 URL: https://issues.apache.org/jira/browse/CONNECTORS-88
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Build
Reporter: Karl Wright


LCF needs nightly builds, and web publishing of its javadocs.





[jira] Created: (CONNECTORS-89) Quickstart build should also create a ZIP file

2010-07-20 Thread Karl Wright (JIRA)
Quickstart build should also create a ZIP file
--

 Key: CONNECTORS-89
 URL: https://issues.apache.org/jira/browse/CONNECTORS-89
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Build
Reporter: Karl Wright


The Quickstart example build should create a zip file that's appropriate for 
download.  This should include everything for Quickstart, as well as what is 
needed for all auxiliary processes (namely, the Documentum and FileNet RMI 
processes).





[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-20 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12890572#action_12890572
 ] 

Karl Wright commented on CONNECTORS-55:
---

H2 looks pretty impressive also, featurewise.
I actually already did a hsqldb driver for LCF but simply have not had a moment 
to even try it out.  It should not be hard to attempt one for H2.  Simple 
problems ought to manifest themselves rather quickly under the unit tests.  
Ideally, though, we need a large test to figure out what embedded database to 
choose.

The hardest methods of the driver to write are the "interrogation" methods - 
e.g. finding the definitions for tables and indexes, finding out whether a user 
already exists, etc.  There's not enough standardization on how you do this 
across databases, and the way you do it is almost always not well documented 
either.


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Installers
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a PostgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shut down the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in future releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Created: (CONNECTORS-90) Software fails on non-English systems

2010-07-21 Thread Karl Wright (JIRA)
Software fails on non-English systems
-

 Key: CONNECTORS-90
 URL: https://issues.apache.org/jira/browse/CONNECTORS-90
 Project: Lucene Connector Framework
  Issue Type: Bug
  Components: Framework core
 Environment: Any machine whose locale indicates non-English as the 
language
Reporter: Karl Wright
Priority: Blocker


When LCF is used on non-English systems, database errors that should be 
interpreted as retries are in fact interpreted as hard failures.  This is 
because the database implementations are looking for English text in the error 
message itself.





[jira] Resolved: (CONNECTORS-90) Software fails on non-English systems

2010-07-21 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-90?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-90.
---

Fix Version/s: LCF Release 0.5
   Resolution: Fixed

r966217.  Use SQLState values instead, which should be language independent.
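
As a rough illustration of the idea (not the code committed in r966217), a 
retry decision keyed on SQLState might look like the sketch below.  The class 
name and the particular set of codes are assumptions: 40001 is the SQL 
standard's serialization-failure state, and 40P01 is PostgreSQL's 
deadlock-detected state.

```java
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: decides whether a database error should be retried. */
final class RetryClassifier {
    // SQLState codes treated as transient here (an assumed, illustrative set).
    private static final Set<String> RETRYABLE_STATES =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList("40001", "40P01")));

    private RetryClassifier() {}

    /**
     * Looks only at the SQLState, never at the message text, so the
     * result does not depend on the language of the database's messages.
     */
    static boolean isRetryable(SQLException e) {
        String state = e.getSQLState();
        return state != null && RETRYABLE_STATES.contains(state);
    }
}
```

With this approach a deadlock reported as "FEHLER: Verklemmung entdeckt" is 
classified the same way as one reported in English, because both carry the 
same SQLState.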

> Software fails on non-English systems
> -
>
> Key: CONNECTORS-90
> URL: https://issues.apache.org/jira/browse/CONNECTORS-90
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Framework core
> Environment: Any machine whose locale indicates non-English as the 
> language
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Blocker
> Fix For: LCF Release 0.5
>
>
> When LCF is used on non-English systems, database errors that should be 
> interpreted as retries are in fact interpreted as hard failures.  This is 
> because the database implementations are looking for English text in the 
> error message itself.




[jira] Assigned: (CONNECTORS-90) Software fails on non-English systems

2010-07-21 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-90?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-90:
-

Assignee: Karl Wright

> Software fails on non-English systems
> -
>
> Key: CONNECTORS-90
> URL: https://issues.apache.org/jira/browse/CONNECTORS-90
> Project: Lucene Connector Framework
>  Issue Type: Bug
>  Components: Framework core
> Environment: Any machine whose locale indicates non-English as the 
> language
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Blocker
> Fix For: LCF Release 0.5
>
>
> When LCF is used on non-English systems, database errors that should be 
> interpreted as retries are in fact interpreted as hard failures.  This is 
> because the database implementations are looking for English text in the 
> error message itself.




[jira] Commented: (CONNECTORS-55) Bundle database server with LCF packaged product

2010-07-23 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891533#action_12891533
 ] 

Karl Wright commented on CONNECTORS-55:
---

MVCC is the feature that suggests greater concurrency (and, hence, greater 
performance).


> Bundle database server with LCF packaged product
> 
>
> Key: CONNECTORS-55
> URL: https://issues.apache.org/jira/browse/CONNECTORS-55
> Project: Lucene Connector Framework
>  Issue Type: Sub-task
>  Components: Installers
>Reporter: Jack Krupansky
>
> The current requirement that the user install and deploy a PostgreSQL server 
> complicates the installation and deployment of LCF for the user. Installation 
> and deployment of LCF should be as simple as Solr itself. QuickStart is great 
> for the low-end and basic evaluation, but a comparable level of simplified 
> installation and deployment is still needed for full-blown, high-end 
> environments that need the full performance of a PostgreSQL-class database 
> server. So, PostgreSQL should be bundled with the packaged release of LCF so 
> that installation and deployment of LCF will automatically install and deploy 
> a subset of the full PostgreSQL distribution that is sufficient for the needs 
> of LCF. Starting LCF, with or without the LCF UI, should automatically start 
> the database server. Shutting down LCF should also shut down the database 
> server process.
> A typical use case would be for a non-developer who is comfortable with Solr 
> and simply wants to crawl documents from, for example, a SharePoint 
> repository and feed them into Solr. QuickStart should work well for the low 
> end or in the early stages of evaluation, but the user would prefer to 
> evaluate "the real thing" with something resembling a production crawl of 
> thousands of documents. Such a user might not be a hard-core developer or be 
> comfortable fiddling with a lot of software components simply to do one 
> conceptually simple operation.
> It should still be possible for the user to supply database server settings 
> to override the defaults, but the LCF package should have all of the 
> best-practice settings deemed appropriate for use with LCF.
> One downside is that installation and deployment will be platform-specific 
> since there are multiple processes and PostgreSQL itself requires a 
> platform-specific installation.
> This proposal presumes that PostgreSQL is the best option for the foreseeable 
> future, but nothing here is intended to preclude support for other database 
> servers in future releases.
> This proposal should not have any impact on QuickStart packaging or 
> deployment.
> Note: This issue is part of Phase 1 of the CONNECTORS-50 umbrella issue.




[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-16 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898920#action_12898920
 ] 

Karl Wright commented on CONNECTORS-91:
---

It looks like this is simply using class-inheritance to separate out common 
functionality.  As such, I'm in favor of including this contribution.  Are 
there any subtleties I am missing?
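
For readers without the patch at hand, the general pattern under discussion 
can be sketched as follows.  This is not the content of commandsPatch.patch; 
every class and method name below is hypothetical, and the example command 
only records its name instead of touching a database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical base class holding the logic common to all commands. */
abstract class BaseCommand {
    /** Runs the command and returns a process-style exit code (0 = success). */
    final int run() {
        try {
            doExecute();
            return 0;
        } catch (Exception e) {
            System.err.println(getClass().getSimpleName() + " failed: " + e.getMessage());
            return 1;
        }
    }

    /** The command-specific work, kept out of main() so it can be reused. */
    protected abstract void doExecute() throws Exception;
}

/** Stand-in for a real command; it just records that it ran. */
final class RecordingCommand extends BaseCommand {
    private final String name;
    private final List<String> log;

    RecordingCommand(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override
    protected void doExecute() {
        log.add(name);
    }

    /** Command-line entry point: a thin wrapper around run(). */
    public static void main(String[] args) {
        System.exit(new RecordingCommand("example", new ArrayList<String>()).run());
    }
}

/** Reuses commands programmatically, e.g. to rebuild an environment. */
final class CommandSequence {
    private CommandSequence() {}

    /** Runs commands in order and stops at the first non-zero exit code. */
    static int runAll(List<? extends BaseCommand> commands) {
        for (BaseCommand command : commands) {
            int code = command.run();
            if (code != 0) {
                return code;
            }
        }
        return 0;
    }
}
```

Because main() only delegates to run(), the same command can still be invoked 
from the command line or chained from another class.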


> Making the initialization commands more useable
> ---
>
> Key: CONNECTORS-91
> URL: https://issues.apache.org/jira/browse/CONNECTORS-91
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jettro Coenradie
> Fix For: LCF Release 0.5
>
> Attachments: commandsPatch.patch
>
>
> At the moment LCF comes with some classes that can be run from the command 
> line to interact with the system. Examples are DBCreate, DBDrop and 
> LockClean. I wanted to create a class that rebuilds my complete environment: 
> dropping a database, creating a database, cleaning the synch folder, 
> registering agents, etc. Due to the structure of the classes, with all the 
> logic in the main method, I could not easily reuse these classes. In the 
> patch I submit with this issue I have refactored the current solution into a 
> more reusable solution that can still be called from the command line.




[jira] Commented: (CONNECTORS-19) Look into converting SOLR connector to use SolrJ java library

2010-08-16 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-19?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898921#action_12898921
 ] 

Karl Wright commented on CONNECTORS-19:
---

I am certainly interested.  One thing I want to be certain of, though, is what 
jar dependencies would be necessary for your implementation, and whether the 
connector you have built is indeed as full-featured as the one it would be 
replacing.


> Look into converting SOLR connector to use SolrJ java library
> -
>
> Key: CONNECTORS-19
> URL: https://issues.apache.org/jira/browse/CONNECTORS-19
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Lucene/SOLR connector
>    Reporter: Karl Wright
>Priority: Minor
>
> The SOLR connector currently uses its own multipart post code.  It might be a 
> good idea to convert it to use the SolrJ client api jar instead.  This would 
> require license confirmation, plus research to make sure there are no jar 
> conflicts as a result, with any other connector.




[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-16 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898929#action_12898929
 ] 

Karl Wright commented on CONNECTORS-91:
---

I suggest, then, that you create a patch that is complete, so that I can commit 
it all at once.  For now, I would limit the work to actual "commands", and try 
not to include anything special like the quickstart main class.


> Making the initialization commands more useable
> ---
>
> Key: CONNECTORS-91
> URL: https://issues.apache.org/jira/browse/CONNECTORS-91
> Project: Lucene Connector Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jettro Coenradie
> Fix For: LCF Release 0.5
>
> Attachments: commandsPatch.patch
>
>
> At the moment LCF comes with some classes that can be run from the command 
> line to interact with the system. Examples are DBCreate, DBDrop and 
> LockClean. I wanted to create a class that rebuilds my complete environment: 
> dropping a database, creating a database, cleaning the synch folder, 
> registering agents, etc. Due to the structure of the classes, with all the 
> logic in the main method, I could not easily reuse these classes. In the 
> patch I submit with this issue I have refactored the current solution into a 
> more reusable solution that can still be called from the command line.




[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-20 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900902#action_12900902
 ] 

Karl Wright commented on CONNECTORS-91:
---

It is not clear whether the package name change (org.apache.acf vs 
org.apache.lcf) needs to be decided before your patch.  I am happy to do your 
stuff first.

For a list of classes, look in:

core/org/apache/lcf/core
agents/org/apache/lcf/agents
pull-agent/org/apache/lcf/crawler
pull-agent/org/apache/lcf/authorities

... Or, look at the how-to-build-and-deploy wiki page, which lists them all.

Karl

--- original message ---
From: "ext Jettro Coenradie (JIRA)" 
Subject: [jira] Commented: (CONNECTORS-91) Making the initialization commands 
more useable
Date: August 20, 2010
Time: 4:11:16  PM


[ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900844#action_12900844
 ]

Jettro Coenradie commented on CONNECTORS-91:


I am going through the classes, but they are not all easy to do. It actually is 
a lot of work. Can we split the work, doing the most important classes first 
and maybe the rest later on? You talk about the actual command classes; can you 
provide a list of the classes you mean? That would help me a lot. Also, for the 
patch it would be easier to know whether the package is actually going to be 
changed, now that the name of the project has changed.






> Making the initialization commands more useable
> ---
>
> Key: CONNECTORS-91
> URL: https://issues.apache.org/jira/browse/CONNECTORS-91
> Project: Apache Connectors Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jettro Coenradie
> Fix For: LCF Release 0.5
>
> Attachments: commandsPatch.patch
>
>
> At the moment LCF comes with some classes that can be run from the command 
> line to interact with the system. Examples are DBCreate, DBDrop and 
> LockClean. I wanted to create a class that rebuilds my complete environment: 
> dropping a database, creating a database, cleaning the synch folder, 
> registering agents, etc. Due to the structure of the classes, with all the 
> logic in the main method, I could not easily reuse these classes. In the 
> patch I submit with this issue I have refactored the current solution into a 
> more reusable solution that can still be called from the command line.




[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-22 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901186#action_12901186
 ] 

Karl Wright commented on CONNECTORS-91:
---

The patch failed to apply, fairly catastrophically.  Only some 50% of files 
actually patched successfully.  I don't know why this happened yet.  Are you 
synchronized with the current version of trunk?

Karl


> Making the initialization commands more useable
> ---
>
> Key: CONNECTORS-91
> URL: https://issues.apache.org/jira/browse/CONNECTORS-91
> Project: Apache Connectors Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jettro Coenradie
> Fix For: LCF Release 0.5
>
> Attachments: changesToCommandClasses.patch, commandsPatch.patch
>
>
> At the moment LCF comes with some classes that can be run from the command 
> line to interact with the system. Examples are DBCreate, DBDrop and 
> LockClean. I wanted to create a class that rebuilds my complete environment: 
> dropping a database, creating a database, cleaning the synch folder, 
> registering agents, etc. Due to the structure of the classes, with all the 
> logic in the main method, I could not easily reuse these classes. In the 
> patch I submit with this issue I have refactored the current solution into a 
> more reusable solution that can still be called from the command line.




[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-22 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901193#action_12901193
 ] 

Karl Wright commented on CONNECTORS-91:
---

Yes, your trunk version is too out of date.  For example:

kwri...@osore:~/wip/lcf-command-patch$ svn info 
trunk/modules/framework/agents/org/apache/lcf/agents/SynchronizeAll.java
Path: trunk/modules/framework/agents/org/apache/lcf/agents/SynchronizeAll.java
Name: SynchronizeAll.java
URL: 
https://svn.apache.org/repos/asf/incubator/lcf/trunk/modules/framework/agents/org/apache/lcf/agents/SynchronizeAll.java
Repository Root: https://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 987926
Node Kind: file
Schedule: normal
Last Changed Author: kwright
Last Changed Rev: 921329
Last Changed Date: 2010-03-10 07:44:20 -0500 (Wed, 10 Mar 2010)
Text Last Updated: 2010-08-22 12:50:36 -0400 (Sun, 22 Aug 2010)
Checksum: 42dcca2a9eab1c6d5b48bdcd5c083a0b

... and the patch says:

Index: modules/framework/agents/org/apache/lcf/agents/SynchronizeAll.java
===
--- modules/framework/agents/org/apache/lcf/agents/SynchronizeAll.java  
(revision 987465)
+++ modules/framework/agents/org/apache/lcf/agents/SynchronizeAll.java  
(working copy)


so, I am afraid you will need to synch up and reissue the patch.




> Making the initialization commands more useable
> ---
>
> Key: CONNECTORS-91
> URL: https://issues.apache.org/jira/browse/CONNECTORS-91
> Project: Apache Connectors Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jettro Coenradie
> Fix For: LCF Release 0.5
>
> Attachments: changesToCommandClasses.patch, commandsPatch.patch
>
>
> At the moment LCF comes with some classes that can be run from the command 
> line to interact with the system. Examples are DBCreate, DBDrop and 
> LockClean. I wanted to create a class that rebuilds my complete environment: 
> dropping a database, creating a database, cleaning the synch folder, 
> registering agents, etc. Due to the structure of the classes, with all the 
> logic in the main method, I could not easily reuse these classes. In the 
> patch I submit with this issue I have refactored the current solution into a 
> more reusable solution that can still be called from the command line.




[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-23 Thread Karl Wright (JIRA)

[ https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901309#action_12901309 ]

Karl Wright commented on CONNECTORS-91:
---

This patch file worked properly.
Since the automated tests do not exercise the commands, it would be good to set 
up a database instance from scratch using the changed code.  If you have 
already done this, please let me know and I will go ahead and commit the 
changes.


> Making the initialization commands more useable
> ---
>
> Key: CONNECTORS-91
> URL: https://issues.apache.org/jira/browse/CONNECTORS-91
> Project: Apache Connectors Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jettro Coenradie
> Fix For: LCF Release 0.5
>
> Attachments: change_commands.patch
>
>



[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-23 Thread Karl Wright (JIRA)

[ https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901312#action_12901312 ]

Karl Wright commented on CONNECTORS-91:
---

Another thing I had not noticed before is that this patch removes all stderr 
success confirmation messages for those folks who use the commands, and 
replaces them with log output.  The log output is perfectly fine, but removing 
the feedback that the command was successful is, I think, not great.  If the 
log were going to stderr typically that would be OK, but it typically is not, 
so I think you are going to want to do both.  You would, obviously, want to do 
the stderr output within the main() method.

Would it be possible to fix that up before I commit this?
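
To illustrate the "do both" suggestion, here is a minimal sketch, not the 
actual patch.  The class name ExampleCommand, the execute() and run() methods, 
and the messages are all hypothetical, and java.util.logging stands in for 
whatever logger the patch really uses.  The reusable method only logs; the 
stderr confirmation stays in the command-line wrapper.

```java
import java.io.PrintStream;
import java.util.logging.Logger;

/** Hypothetical command class; the name and the work it does are placeholders. */
class ExampleCommand {
  // Stand-in logger. In a real deployment this would be the framework's
  // configured logger, which usually writes to a file rather than stderr.
  private static final Logger LOG = Logger.getLogger(ExampleCommand.class.getName());

  /** Reusable entry point: does the work and logs, but never prints or exits. */
  void execute() throws Exception {
    // ... the real command logic would go here ...
    LOG.info("Example command completed");
  }

  /** Command-line logic, parameterized on the stream so it can be tested. */
  static int run(String[] args, PrintStream err) {
    if (args.length != 0) {
      err.println("Usage: ExampleCommand");
      return 1;
    }
    try {
      new ExampleCommand().execute();
      // Success feedback for people running the command by hand.
      err.println("Example command succeeded");
      return 0;
    } catch (Exception e) {
      err.println("Example command failed: " + e.getMessage());
      return 2;
    }
  }

  public static void main(String[] args) {
    int status = run(args, System.err);
    if (status != 0) {
      System.exit(status);
    }
  }
}
```

A class that rebuilds a whole environment could then call execute() on each 
command directly, while the command-line behaviour stays as it was.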


> Making the initialization commands more useable
> ---
>
> Key: CONNECTORS-91
> URL: https://issues.apache.org/jira/browse/CONNECTORS-91
> Project: Apache Connectors Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jettro Coenradie
> Fix For: LCF Release 0.5
>
> Attachments: change_commands.patch
>
>



[jira] Commented: (CONNECTORS-91) Making the initialization commands more useable

2010-08-23 Thread Karl Wright (JIRA)

[ https://issues.apache.org/jira/browse/CONNECTORS-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901336#action_12901336 ]

Karl Wright commented on CONNECTORS-91:
---

I looked at this.  The patch seems correct for some classes, but for others it 
is clearly incorrect, e.g. SynchronizeAll:

 {
   System.err.println("Usage: SynchronizeAll");
   System.exit(1);
+  System.err.println("Successfully synchronized all agents");
 }

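Here the success message was added after System.exit(1) inside the usage-error 
branch, so it can never be printed.  Can you review your change for accuracy 
please?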

Also, responding to the logging change - the log settings are global, and we 
are trying for the least amount of setup work necessary to achieve a functional 
system.  Clearly, sending all log messages to stderr is not going to be 
reasonable for people doing real crawls, so we'd need some way to segregate 
command output in order to direct it differently than everything else.  That 
implies at the least you'd want a different logger, and then you'd also want 
to revise the documented log4j properties, if you think we should go that 
route.
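
As a rough sketch of what "a different logger" could look like, the following 
uses java.util.logging rather than log4j, purely so that it is self-contained.  
The logger name "lcf.commands" and the class CommandLogging are hypothetical.  
The idea is a separately named logger with its own handler, detached from the 
global handlers; the log4j equivalent would be a separately named logger with 
its own appender.

```java
import java.io.OutputStream;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;
import java.util.logging.StreamHandler;

/** Hypothetical helper that gives command output its own logger. */
class CommandLogging {
  /** Hypothetical logger name reserved for command-line feedback. */
  static final String COMMAND_LOGGER = "lcf.commands";

  /** Returns a logger whose records go only to the given stream. */
  static Logger commandLogger(OutputStream target) {
    Logger logger = Logger.getLogger(COMMAND_LOGGER);
    // Detach from the global handlers so crawl logging is unaffected.
    logger.setUseParentHandlers(false);
    logger.setLevel(Level.INFO);
    // Drop handlers left over from an earlier call to avoid duplicate output.
    for (Handler old : logger.getHandlers()) {
      logger.removeHandler(old);
    }
    // StreamHandler buffers, so flush after every record.
    StreamHandler handler = new StreamHandler(target, new SimpleFormatter()) {
      @Override
      public synchronized void publish(LogRecord record) {
        super.publish(record);
        flush();
      }
    };
    logger.addHandler(handler);
    return logger;
  }

  public static void main(String[] args) {
    Logger commands = commandLogger(System.err);
    commands.info("Successfully synchronized all agents");
  }
}
```

Everything not logged through this logger would keep following the global 
configuration, which is the segregation described above.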

Re: testing.  The testing you've done so far is the best we can do at the 
moment, unless you'd also like to write some unit tests.  I don't think this 
would be terribly difficult, but once again it would be time consuming. ;-)


> Making the initialization commands more useable
> ---
>
> Key: CONNECTORS-91
> URL: https://issues.apache.org/jira/browse/CONNECTORS-91
> Project: Apache Connectors Framework
>  Issue Type: Improvement
>  Components: Framework core
>Reporter: Jettro Coenradie
> Fix For: LCF Release 0.5
>
> Attachments: change_commands.patch, 
> change_commands_with_system_err_println.patch
>
>


