[jira] [Updated] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated CONNECTORS-224:
---

Comment: was deleted

(was: Here is a first version of the OpenSearchServer connector. The connector 
is not yet finished. This first draft contains the skeleton and the interface 
part. I thought it was interesting to publish it to show how we have separated 
the HTML content from the JAVA code. The HTML code is stored in separated 
files. We are using String replacement to provide dynamic content.)

 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
  Labels: OpenSearchServer, connector, outputconnector
   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated CONNECTORS-224:
---

Attachment: oss-mfc-dev.patch

Here is a first version of the OpenSearchServer connector. The connector is not 
yet finished. This first draft contains the skeleton and the interface part. I 
thought it was interesting to publish it to show how we have separated the HTML 
content from the JAVA code. The HTML code is stored in separated files. We are 
using String replacement to provide dynamic content.

 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080861#comment-13080861
 ] 

Karl Wright commented on CONNECTORS-224:


I've created a branch for further development of this connector until it is 
ready to commit to trunk.  The url for the branch is:

https://svn.apache.org/repos/asf/incubator/lcf/branches/CONNECTORS-224.

For subsequent patches, please submit diffs against this branch.  Thanks!


 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Issue Comment Edited] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080867#comment-13080867
 ] 

Karl Wright edited comment on CONNECTORS-224 at 8/8/11 9:47 AM:


As far as the use of html files, the real question is how resources interact 
with class loaders in Java.  ManifoldCF originally had its HTML code generated 
by JSP files, but had to change this architecture.  The reason ManifoldCF 
pulled its HTML code out of JSP files was because JSPs don't work well with 
class loaders.  Connectors must be self-contained in that loading the connector 
class from whatever library directory is specified in the properties.xml file 
is adequate to locate all resources for the connector.  Can you clarify whether 
or not you think this will work for your html resource files?


  was (Author: kwri...@metacarta.com):
As far as the use of html files, the real question is how resources 
interact with class loaders in Java.  The reason ManifoldCF pulled its HTML 
code out of JSP files was because JSPs don't work well with class loaders.  
Connectors must be self-contained in that loading the connector class from 
whatever library directory is specified in the properties.xml file is adequate 
to locate all resources for the connector.  Can you clarify whether or not you 
think this will work for your html resource files?

  
 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080867#comment-13080867
 ] 

Karl Wright commented on CONNECTORS-224:


As far as the use of html files, the real question is how resources interact 
with class loaders in Java.  The reason ManifoldCF pulled its HTML code out of 
JSP files was because JSPs don't work well with class loaders.  Connectors must 
be self-contained in that loading the connector class from whatever library 
directory is specified in the properties.xml file is adequate to locate all 
resources for the connector.  Can you clarify whether or not you think this 
will work for your html resource files?


 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





The ManifoldCF PPMC welcomes Piergiorgio Lucidi as a new ManifoldCF committer!

2011-08-08 Thread Karl Wright
Please join me in congratulating Piergiorgio!

Karl


Re: The ManifoldCF PPMC welcomes Piergiorgio Lucidi as a new ManifoldCF committer!

2011-08-08 Thread Piergiorgio Lucidi
Thank you very much for this great opportunity!
I'm very happy and honoured to be part of the Apache Community.

I would like to thank everyone involved in the project, especially
Karl and Tommaso for their support during my first contribution.

Regards,
Piergiorgio

2011/8/8 Emmanuel Keller <ekel...@open-search-server.com>

 Congratulations Piergiorgio !

 Welcome to the committer and the musician.


 On 8 August 2011, at 14:17, Karl Wright wrote:

  Please join me in congratulating Piergiorgio!
 
  Karl




-- 
Piergiorgio Lucidi
Web: http://about.me/piergiorgiolucidi


[jira] [Updated] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated CONNECTORS-224:
---

Attachment: oss-mfc-alpha.patch

First alpha version of the output connector for OpenSearchServer.

 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Emmanuel Keller (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080998#comment-13080998
 ] 

Emmanuel Keller commented on CONNECTORS-224:


Thank you for the quick integration. I switched to the new svn branch.

About the resource file, at this time I only checked it with the embedded web 
server (Jetty). I will make additional tests using Tomcat. The files are 
directly loaded by our code using getClass().getResourceAsStream().
Because the connector is visible in the webapp, the containing resources should 
also be available. I suppose it is wise to locate the files in the same 
package/jar.
Finally, we don't rely on the JSP interpretation.
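
The approach described above (HTML kept in separate files, loaded from the classpath, with String replacement for dynamic values) could look roughly like the following. This is only a sketch and not the code from the attached patches: the class name HtmlTemplate, the %KEY% placeholder syntax, and the method names are assumptions made for illustration. Only getResourceAsStream() comes from the comment itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper; the names and the %KEY% syntax are illustrative assumptions.
final class HtmlTemplate {

    /** Reads a resource stored in the same package/jar as the given class. */
    static String load(Class<?> owner, String resourceName) throws IOException {
        try (InputStream in = owner.getResourceAsStream(resourceName)) {
            if (in == null) {
                // getResourceAsStream returns null when the class loader cannot find the file
                throw new IOException("Resource not found: " + resourceName);
            }
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Replaces every %KEY% placeholder with its value (no HTML escaping is done here). */
    static String render(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("%" + entry.getKey() + "%", entry.getValue());
        }
        return result;
    }
}
```

Because the resource is resolved relative to the connector class, it is found by whichever class loader loaded that class, which is the self-containment property asked about earlier in the thread. Values inserted this way would still need escaping before being written into a page.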

 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081000#comment-13081000
 ] 

Karl Wright commented on CONNECTORS-224:


When I build using ant build, this is what I get:

compile-connector:
[mkdir] Created dir: 
/data/mcf/CONNECTORS-224/connectors/opensearchserver/build/connector/classes
[javac] Compiling 5 source files to 
/data/mcf/CONNECTORS-224/connectors/opensearchserver/build/connector/classes
[javac] 
/data/mcf/CONNECTORS-224/connectors/opensearchserver/connector/src/main/java/org/apache/manifoldcf/agents/output/opensearchserver/OpenSearchServerConnector.java:17:
 package 
org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerAction 
does not exist
[javac] import 
org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerAction.CommandEnum;
[javac] 
  ^
[javac] 
/data/mcf/CONNECTORS-224/connectors/opensearchserver/connector/src/main/java/org/apache/manifoldcf/agents/output/opensearchserver/OpenSearchServerConnector.java:181:
 cannot find symbol
[javac] symbol  : class OpenSearchServerAction
[javac] location: class 
org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnector
[javac] OpenSearchServerAction oo = new OpenSearchServerAction(
[javac] ^
[javac] 
/data/mcf/CONNECTORS-224/connectors/opensearchserver/connector/src/main/java/org/apache/manifoldcf/agents/output/opensearchserver/OpenSearchServerConnector.java:181:
 cannot find symbol
[javac] symbol  : class OpenSearchServerAction
[javac] location: class 
org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnector
[javac] OpenSearchServerAction oo = new OpenSearchServerAction(
[javac] ^
[javac] 
/data/mcf/CONNECTORS-224/connectors/opensearchserver/connector/src/main/java/org/apache/manifoldcf/agents/output/opensearchserver/OpenSearchServerConnector.java:182:
 cannot find symbol
[javac] symbol  : variable CommandEnum
[javac] location: class 
org.apache.manifoldcf.agents.output.opensearchserver.OpenSearchServerConnector
[javac] CommandEnum.optimize, 
getParameters(null));
[javac] ^
[javac] 4 errors

BUILD FAILED


 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Updated] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Emmanuel Keller (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Emmanuel Keller updated CONNECTORS-224:
---

Attachment: oss-mfc-alpha2.patch

Sorry, here is the missing file.

 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081010#comment-13081010
 ] 

Karl Wright commented on CONNECTORS-224:


Great - patch 3 committed (and now it builds).


 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





Re: Reseting Manifoldcf

2011-08-08 Thread Farzad Valad
I took some time to read and learn about the API service.  Here are the 
specifics: I want to be able to do the following with one invocation of 
something.  Is that doable?  I'd prefer a batch file, but understand if I 
have to use something else.


1) Delete any jobs in the system  (Purpose: Clear out my data stored in 
my connector db table)
2) Delete the repository connector (Purpose: Clear out the stored logs)
3) Delete the output connector (Purpose: Clear out the stored logs)
4) Unregister my connector (Purpose: Zap the db table)
5) Reregister my connector
6) Define the repository connector
7) Define the output connector
8) Define a job waiting to start.


On 8/3/2011 10:03 AM, Karl Wright wrote:

Can you clarify what you mean by user data?  There's no such data
stored by ManifoldCF in any kind of persistent way.

There are command-line commands which clear out various kinds of
things like jobs and connections.  There's also the ManifoldCF API
Service.  But I can't help further unless you are more specific.

Karl

On Wed, Aug 3, 2011 at 10:57 AM, Farzad Valad <ho...@farzad.net> wrote:

What would be the sequence of commands to automate to reset ManifoldCF and
flush out user data?  I've been doing it through the UI and it is very
tedious.  Much rather have a batch file to do the job.  Thanks!





[jira] [Commented] (CONNECTORS-224) OpenSearchServer connector

2011-08-08 Thread Emmanuel Keller (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081029#comment-13081029
 ] 

Emmanuel Keller commented on CONNECTORS-224:


There is no light version of OpenSearchServer. To bring up a standalone 
OpenSearchServer instance we will need to embed its war file (60 MB).

I can provide some JUnit tests that will rely on an already running 
OpenSearchServer instance.

And yes, I am not (yet) maven-savvy. Thank you for your help.

 OpenSearchServer connector
 --

 Key: CONNECTORS-224
 URL: https://issues.apache.org/jira/browse/CONNECTORS-224
 Project: ManifoldCF
  Issue Type: New Feature
  Components: OpenSearchServer connector
Affects Versions: ManifoldCF 0.3
Reporter: Emmanuel Keller
Assignee: Karl Wright
  Labels: OpenSearchServer, connector, outputconnector
 Attachments: oss-mfc-alpha.patch, oss-mfc-alpha2.patch, 
 oss-mfc-dev.patch

   Original Estimate: 336h
  Remaining Estimate: 336h

 Provide an output connector for 
 [OpenSearchServer|http://www.open-search-server.com].





Re: File Metadata

2011-08-08 Thread Karl Wright
It's not there now, but it would be trivial to add.  If this is
something you need could you create a ticket with your proposal?
Karl

On Mon, Aug 8, 2011 at 12:45 PM, Farzad Valad <ho...@farzad.net> wrote:
 Do you know if the lastModified attribute of a crawled file via the
 FileSystem connector is accessible through the RepositoryDocument class in
 the output connector (addOrReplaceDocument method)?  I didn't see it in the
 connector code, but I'm not sure whether I overlooked something.  Thanks!

 PS. My next item is the file owner; so far I'm finding a lot of references
 to performing JNI per file.  The whole goal is to be able to find a set of
 crawled docs that were modified within a date range and belonged to person X.
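
For the two attributes mentioned in the question, a minimal sketch of reading them with the standard java.nio.file API (Java 7 and later) is shown below. It is not ManifoldCF code and says nothing about how the FileSystem connector or RepositoryDocument would carry these values. The class name FileMetadata is an assumption for illustration, and Files.getOwner may not be supported on every file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.nio.file.attribute.UserPrincipal;

// Hypothetical helper, not part of ManifoldCF.
final class FileMetadata {

    /** Returns the last-modified timestamp reported by the file system. */
    static FileTime lastModified(Path file) throws IOException {
        return Files.getLastModifiedTime(file);
    }

    /** Returns the owner's name as reported by the file system, without any JNI code. */
    static String ownerName(Path file) throws IOException {
        UserPrincipal owner = Files.getOwner(file);
        return owner.getName();
    }
}
```

A connector that wanted to expose these values would still need a place to put them on the document, which is the addition proposed in the reply above.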



Re: Reseting Manifoldcf

2011-08-08 Thread Karl Wright
The code on trunk looks like this:


public static void main(String[] args)
{
  if (args.length < 5)
  {
    System.err.println("Usage: DefineRepositoryConnection connection_name description connector_class authority_name pool_max param1=value1 ...");
    System.exit(1);
  }

  String connectionName = args[0];
  String description = args[1];
  String connectorClass = args[2];
  String authorityName = args[3];
  String poolMax = args[4];




So it requires 5 parameters, not 4.
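
To make the mismatch concrete, here is a small standalone sketch that applies the same length check to the four arguments from the command quoted below and to a five-argument variant. The class name DefineRepositoryConnectionArgs and the authority name "MyAuthority" are made up for illustration; which authority value is appropriate for a given setup is not covered here.

```java
// Hypothetical standalone check; it only mirrors the args.length test from the excerpt above.
final class DefineRepositoryConnectionArgs {
    static final int REQUIRED_ARGS = 5;

    /** True when at least the five fixed arguments are present. */
    static boolean hasRequiredArgs(String[] args) {
        return args.length >= REQUIRED_ARGS;
    }

    public static void main(String[] ignored) {
        String connector = "org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector";

        // The four arguments from the command in the thread: no authority_name.
        String[] fromThread = { "FileShare", "FileShare", connector, "30" };

        // Same call with a placeholder authority_name in the fourth position.
        String[] withAuthority = { "FileShare", "FileShare", connector, "MyAuthority", "30" };

        System.out.println(hasRequiredArgs(fromThread));    // prints false
        System.out.println(hasRequiredArgs(withAuthority)); // prints true
    }
}
```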

Karl

On Mon, Aug 8, 2011 at 6:23 PM, Farzad Valad <ho...@farzad.net> wrote:
 Thanks for the link.  I was reading chapter 3 of the book.  So I'm issuing
 the command with what I think are the 4 basic parms it needs, but I keep
 getting the Usage statement.  The code says if (args.length < 4) and I'm
 counting 4 parms, what gives?

 processes\script\executecommand.bat
 org.apache.manifoldcf.crawler.DefineRepositoryConnection FileShare FileShare
 org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector 30

 Usage: DefineRepositoryConnection connection_name description
 connector_class authority_name pool_max param1=value1 ...


 On 8/8/2011 11:11 AM, Karl Wright wrote:

 Hi Farzad,

 Either the api service or the command are, I believe, capable of doing
 all of these.

 Have a look at this link for some idea of how to do either of these.

 http://incubator.apache.org/connectors/programmatic-operation.html

 Karl

 On Mon, Aug 8, 2011 at 12:05 PM, Farzad Valad <ho...@farzad.net> wrote:

 I took some time to read and learn about the API service.  Here are the
 specifics: I want to be able to do the following with one invocation of
 something.  Is that doable?  I'd prefer a batch file, but understand if I
 have to use something else.

 1) Delete any jobs in the system  (Purpose: Clear out my data stored in
 my
 connector db table)
 2) Delete the repository connector (Purpose: Clear out the stored logs)
 2) Delete the output connector (Purpose: Clear out the stored logs)
 3) Unregister my connector (Purpose: Zap the db table)
 4) Reregister my connector
 5) Define the repository connector
 6) Define the output connector
 7) Define a job waiting to start.


 On 8/3/2011 10:03 AM, Karl Wright wrote:

 Can you clarify what you mean by user data?  There's no such data
 stored by ManifoldCF in any kind of persistent way.

 There are command-line commands which clear out various kinds of
 things like jobs and connections.  There's also the ManifoldCF API
 Service.  But I can't help further unless you are more specific.

 Karl

 On Wed, Aug 3, 2011 at 10:57 AM, Farzad Valad <ho...@farzad.net> wrote:

 What would be the sequence of commands to automate to reset ManifoldCF
 and
 flush out user data?  I've been doing it through the UI and it is very
 tedious.  Much rather have a batch file to do the job.  Thanks!






Re: Defining a job

2011-08-08 Thread Karl Wright
The form of the XML differs whether you are sending in configuration
XML (which has the <configuration> tags) or specification XML (which
has the <specification> tags).
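
One way to see which kind of XML a string is before passing it to a command is to look at its outer element. The sketch below uses only the JDK's DOM parser; the class name XmlRootCheck is an assumption for illustration, and this is not how ManifoldCF itself performs the check.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper using only the JDK's built-in XML parser.
final class XmlRootCheck {

    /** Parses the XML text and returns the name of its outer (root) element. */
    static String rootElementName(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNodeName();
    }
}
```

Calling rootElementName on the empty connection XML quoted below would return "configuration", which is why a command expecting a "specification" outer node rejects it.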

Karl

On Mon, Aug 8, 2011 at 7:19 PM, Farzad Valad <ho...@farzad.net> wrote:
 Having trouble getting the filespec_xml and outputspec_xml.  Used pgAdmin
 and see a column labeled configxml text for my output and repo connector.
 Its content for both is <?xml version="1.0"
 encoding="UTF-8"?><configuration/>  So I issued the following command and
 got errored out, but I used what was in the db.  Thoughts? Thanks!

 processes\script\executecommand.bat org.apache.manifoldcf.crawler.DefineJob
 TestCrawl FileShare DupFinder specified disable neverdelete 0 0 0 5 
 <?xml version='1.0' encoding='UTF-8'?><configuration/> <?xml
 version='1.0' encoding='UTF-8'?><configuration/>
 Configuration file successfully read
 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad xml - outer
 node is not 'specification'
        at
 org.apache.manifoldcf.core.interfaces.Configuration.initializeFromDoc(Configuration.java:652)
        at
 org.apache.manifoldcf.core.interfaces.Configuration.fromXML(Configuration.java:443)
        at org.apache.manifoldcf.crawler.DefineJob.main(DefineJob.java:125)



Re: Defining a job

2011-08-08 Thread Farzad Valad
I changed the word configuration to specification and the command 
returned a job id.  However, I don't have a path defined, which I guess 
is related to the empty config XML.  Where/how do I find the proper 
filespec and outputspec XML definitions?  All that was in the db was the 
empty ones I used.  Didn't have much success digging it out of the crawler UI.


On 8/8/2011 6:23 PM, Karl Wright wrote:

The form of the XML differs whether you are sending in configuration
XML (which has the <configuration> tags) or specification XML (which
has the <specification> tags).

Karl

On Mon, Aug 8, 2011 at 7:19 PM, Farzad Valad <ho...@farzad.net> wrote:

Having trouble getting the filespec_xml and outputspec_xml.  Used pgAdmin
and see a column labeled configxml text for my output and repo connector.
Its content for both is <?xml version="1.0"
encoding="UTF-8"?><configuration/>  So I issued the following command and
got errored out, but I used what was in the db.  Thoughts? Thanks!

processes\script\executecommand.bat org.apache.manifoldcf.crawler.DefineJob
TestCrawl FileShare DupFinder specified disable neverdelete 0 0 0 5 
<?xml version='1.0' encoding='UTF-8'?><configuration/> <?xml
version='1.0' encoding='UTF-8'?><configuration/>
Configuration file successfully read
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad xml - outer
node is not 'specification'
at
org.apache.manifoldcf.core.interfaces.Configuration.initializeFromDoc(Configuration.java:652)
at
org.apache.manifoldcf.core.interfaces.Configuration.fromXML(Configuration.java:443)
at org.apache.manifoldcf.crawler.DefineJob.main(DefineJob.java:125)





RE: Defining a job

2011-08-08 Thread daddy...@gmail.com
The easiest way is to define what you want using the ui, then either look at 
the database or use the api or a command to get the xml.

Karl

Sent from my Nokia phone
-Original Message-
From: Farzad Valad
Sent:  08/08/2011, 7:44  PM
To: connectors-dev@incubator.apache.org
Subject: Re: Defining a job


I changed the word configuration to specification and the command 
returned a job id.  However I don't have a path defined, which my guess 
is related to the empty config xmls.  Where/How do I find the proper 
filespec and outputspec xml defs? All that was in the db was the empty 
ones I used.  Didn't have much success digging it out of the crawler UI.

On 8/8/2011 6:23 PM, Karl Wright wrote:
 The form of the XML differs whether you are sending in configuration
 XML (which has the <configuration> tags) or specification XML (which
 has the <specification> tags).

 Karl

 On Mon, Aug 8, 2011 at 7:19 PM, Farzad Valad <ho...@farzad.net> wrote:
 Having trouble getting the filespec_xml and outputspec_xml.  Used pgAdmin
 and see a column labeled configxml text for my output and repo connector.
 Its content for both is <?xml version="1.0"
 encoding="UTF-8"?><configuration/>  So I issued the following command and
 got errored out, but I used what was in the db.  Thoughts? Thanks!

 processes\script\executecommand.bat org.apache.manifoldcf.crawler.DefineJob
 TestCrawl FileShare DupFinder specified disable neverdelete 0 0 0 5 
 <?xml version='1.0' encoding='UTF-8'?><configuration/> <?xml
 version='1.0' encoding='UTF-8'?><configuration/>
 Configuration file successfully read
 org.apache.manifoldcf.core.interfaces.ManifoldCFException: Bad xml - outer
 node is not 'specification'
 at
 org.apache.manifoldcf.core.interfaces.Configuration.initializeFromDoc(Configuration.java:652)
 at
 org.apache.manifoldcf.core.interfaces.Configuration.fromXML(Configuration.java:443)
 at org.apache.manifoldcf.crawler.DefineJob.main(DefineJob.java:125)