[jira] [Resolved] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Karl Wright (JIRA)

 [ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-235.


   Resolution: Fixed
Fix Version/s: ManifoldCF 0.3
 Assignee: Karl Wright

r1153361. The description field value will now come through as "description" 
metadata, when it is not being used for dechromed content.
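
To make the element in question concrete, here is a minimal sketch that reads
only the rss/channel/item/description elements of a feed and ignores the
channel-level description. It uses the JDK's XML and XPath classes; the class
name ItemDescriptionSketch and the sample feed are made up for illustration,
and this is not the RSS connector's actual code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of ManifoldCF.
final class ItemDescriptionSketch {

  /** Returns the trimmed text of every rss/channel/item/description element, in document order. */
  static List<String> itemDescriptions(String rssXml) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new InputSource(new StringReader(rssXml)));
      XPath xpath = XPathFactory.newInstance().newXPath();
      // The absolute path skips rss/channel/description on purpose.
      NodeList nodes = (NodeList) xpath.evaluate(
          "/rss/channel/item/description", doc, XPathConstants.NODESET);
      List<String> result = new ArrayList<>();
      for (int i = 0; i < nodes.getLength(); i++) {
        result.add(nodes.item(i).getTextContent().trim());
      }
      return result;
    } catch (Exception e) {
      throw new IllegalArgumentException("Feed could not be parsed", e);
    }
  }

  public static void main(String[] args) {
    // Made-up sample feed with both a channel-level and an item-level description.
    String feed =
        "<rss version=\"2.0\"><channel>"
        + "<title>Example feed</title>"
        + "<description>channel-level description</description>"
        + "<item><title>First item</title>"
        + "<description>item-level description</description></item>"
        + "</channel></rss>";
    System.out.println(itemDescriptions(feed));
  }
}
```

Only the item-level text is returned, which is the value the reporter expected
to see in the index.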


> item description element not indexed
> 
>
> Key: CONNECTORS-235
> URL: https://issues.apache.org/jira/browse/CONNECTORS-235
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: RSS connector
>Affects Versions: ManifoldCF 0.2
>Reporter: Kate McGonigal
>Assignee: Karl Wright
> Fix For: ManifoldCF 0.3
>
>
> The RSS feed's *item* description is not written to any field in the Solr 
> index. 
> I have a typical RSS feed with the general structure:
> 
> 
> 
> 
> 
> 
> 
> 
> 
>  *** the description I do want *** 
> 
> 
> 
> 
> 
> Example:
> For the RSS feed: 
> http://www.onemansjazz.ca/component/option,com_rss/feed,RSS2.0/no_html,1/
> the rss/channel/item/description field is not indexed into Solr.
> Example notes:
>   - what does get written to the Solr "description" field is the description 
> metadata from the website, i.e. "Jazz radio show from Winnipeg on CKUW 95.9 
> FM, hosted by Maurice Hogue." in this case.
>   - on the "Dechromed Content" tab of the job, "No dechromed content" is 
> selected. I'm not sure if that is relevant.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CONNECTORS-236) Tests and test server needed for CMIS connector

2011-08-03 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078612#comment-13078612
 ] 

Karl Wright commented on CONNECTORS-236:


If there is any way for the tests to start their own local test server, and 
tear it back down again when done, that's ideal.  Otherwise, we need to come up 
with some sort of "standard testbed" notion for each repository out there.
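
As a rough illustration of the "start a local server, tear it down when done"
pattern, the sketch below uses the JDK's built-in HttpServer as a stand-in.
It is not a CMIS server; the class name LocalTestServerSketch, the /ping path
and the "pong" body are all assumptions made for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical test fixture: a stand-in HTTP server, not a real repository.
final class LocalTestServerSketch {
  private HttpServer server;

  /** Starts the stand-in server on a free loopback port and returns that port. */
  int start() throws IOException {
    server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
    server.createContext("/ping", exchange -> {
      byte[] body = "pong".getBytes(StandardCharsets.UTF_8);
      exchange.sendResponseHeaders(200, body.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(body);
      }
    });
    server.start();
    return server.getAddress().getPort();
  }

  /** Stops the server if it is running; safe to call more than once. */
  void stop() {
    if (server != null) {
      server.stop(0);
      server = null;
    }
  }

  /** Fetches /ping from the server listening on the given port. */
  static String fetch(int port) throws IOException {
    HttpURLConnection conn = (HttpURLConnection)
        URI.create("http://127.0.0.1:" + port + "/ping").toURL().openConnection();
    try (InputStream in = conn.getInputStream()) {
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    } finally {
      conn.disconnect();
    }
  }

  /** Set up, exercise, and tear down, with teardown guaranteed by finally. */
  static String roundTrip() {
    LocalTestServerSketch fixture = new LocalTestServerSketch();
    try {
      return fetch(fixture.start());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      fixture.stop();
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip());
  }
}
```

Binding to port 0 lets the operating system pick a free port, so parallel test
runs do not collide, and the finally block keeps a failed test from leaving
the server running.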


> Tests and test server needed for CMIS connector
> ---
>
> Key: CONNECTORS-236
> URL: https://issues.apache.org/jira/browse/CONNECTORS-236
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: CMIS connector
>Reporter: Piergiorgio Lucidi
>
> The CMIS connector needs tests, and a CMIS test server to run against.





[jira] [Commented] (CONNECTORS-236) Tests and test server needed for CMIS connector

2011-08-03 Thread Piergiorgio Lucidi (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078619#comment-13078619
 ] 

Piergiorgio Lucidi commented on CONNECTORS-236:
---

Yesterday I started configuring the CMIS test server that will be started
before the tests run.

OpenCMIS includes framework components that help us integrate it completely,
and it includes an InMemory Repository dedicated to running tests:

http://chemistry.apache.org/java/developing/repositories/dev-repositories-inmemory.html

The InMemory Repository is distributed as a WAR Maven dependency, which I have
configured in the ManifoldCF test as a simple Maven ArtifactItem.

So the CMIS test server is now managed by the Maven test goal.
I can start the CMIS server from Maven, and over the next two days I hope to
finish a first version of this integration test.

I'm following the same approach used in the end-to-end test of the filesystem
connector, trying to re-use the same methods with test annotations:
createTestArea, removeTestArea.

TODO:
- implement the createTestArea method
- implement the removeTestArea method
- implement the rest of the test class with the CMIS configuration
initialization
- implement all the assertions that check the test execution





[jira] [Commented] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Kate McGonigal (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078764#comment-13078764
 ] 

Kate McGonigal commented on CONNECTORS-235:
---

Thanks! For the record though, I was using PostgreSQL.





[jira] [Commented] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13078766#comment-13078766
 ] 

Karl Wright commented on CONNECTORS-235:


Thanks for the info.  The fix, as structured, should generally apply to 
PostgreSQL too.  Please let me know if it works for you.  But I'll need to 
research how this problem could have gotten past the tests regardless.






Resetting ManifoldCF

2011-08-03 Thread Farzad Valad
What sequence of commands could I automate to reset ManifoldCF and flush
out user data?  I've been doing it through the UI and it is very
tedious.  I would much rather have a batch file do the job.  Thanks!


Re: Resetting ManifoldCF

2011-08-03 Thread Karl Wright
Can you clarify what you mean by "user data"?  There's no such data
stored by ManifoldCF in any kind of persistent way.

There are command-line "commands" which clear out various kinds of
things like jobs and connections.  There's also the ManifoldCF API
Service.  But I can't help further unless you are more specific.

Karl



Re: CMIS Connector - Tests

2011-08-03 Thread Piergiorgio Lucidi
I'm trying to implement the tests, but I ran into a problem setting the
parameters that the CMIS Repository Connector needs: a username, a
password and the endpoint (URL).

I need to know how to create the configuration nodes for the connector. In
the connector code, I handle the configuration parameters in the
processConfigurationPost method like this:

  public String processConfigurationPost(IThreadContext threadContext,
      IPostParameters variableContext, ConfigParams parameters)
      throws ManifoldCFException {
    String username = variableContext.getParameter(CONFIG_PARAM_USERNAME);
    if (StringUtils.isNotEmpty(username))
      parameters.setParameter(CONFIG_PARAM_USERNAME, username);
    String password = variableContext.getParameter(CONFIG_PARAM_PASSWORD);
    if (StringUtils.isNotEmpty(password))
      parameters.setParameter(CONFIG_PARAM_PASSWORD, password);
    String endpoint = variableContext.getParameter(CONFIG_PARAM_ENDPOINT);
    if (StringUtils.isNotEmpty(endpoint) && endpoint.length() > 0)
      parameters.setParameter(CONFIG_PARAM_ENDPOINT, endpoint);
    String repositoryId = variableContext
        .getParameter(CONFIG_PARAM_REPOSITORY_ID);
    if (StringUtils.isNotEmpty(repositoryId))
      parameters.setParameter(CONFIG_PARAM_REPOSITORY_ID, repositoryId);
    return null;
  }


Now I have to set up the same parameters inside my test class, APISanityTest,
which does not accept the following snippet. It works only when the CMIS
parameters are commented out, as shown here:

  @Test
  public void sanityCheck()
    throws Exception
  {
    try
    {
      // Hey, we were able to install the file system connector etc.
      // Now, create a local test job and run it.
      IThreadContext tc = ThreadContextFactory.make();
      int i;
      IJobManager jobManager = JobManagerFactory.make(tc);
      // Create a basic file system connection, and save it.
      ConfigurationNode connectionObject;
      ConfigurationNode child;
      Configuration requestObject;
      Configuration result;

      connectionObject = new ConfigurationNode("repositoryconnection");

      child = new ConfigurationNode("name");
      child.setValue("CMIS Connection");
      connectionObject.addChild(connectionObject.getChildCount(),child);

      child = new ConfigurationNode("class_name");
      child.setValue(
        "org.apache.manifoldcf.crawler.connectors.cmis.CmisRepositoryConnector");
      connectionObject.addChild(connectionObject.getChildCount(),child);

      child = new ConfigurationNode("description");
      child.setValue("CMIS Connection");
      connectionObject.addChild(connectionObject.getChildCount(),child);
      child = new ConfigurationNode("max_connections");
      child.setValue("10");
      connectionObject.addChild(connectionObject.getChildCount(),child);

      //setting the CMIS specific parameters
//      child = new ConfigurationNode("username");
//      child.setValue(CMIS_USERNAME);
//      connectionObject.addChild(connectionObject.getChildCount(),child);
//
//      child = new ConfigurationNode("password");
//      child.setValue(CMIS_PASSWORD);
//      connectionObject.addChild(connectionObject.getChildCount(),child);
//
//      child = new ConfigurationNode("endpoint");
//      child.setValue(CMIS_ENDPOINT_TEST_SERVER);
//      connectionObject.addChild(connectionObject.getChildCount(),child);
      requestObject = new Configuration();
      requestObject.addChild(0,connectionObject);

      result = performAPIPutOperationViaNodes(
        "repositoryconnections/CMIS%20Connection",201,requestObject);


How can I set the username, password and endpoint for the CMIS Repository
Connector parameters in this test class?

Thank you.

Piergiorgio


2011/8/2 Karl Wright 

> Thanks for the status report.  I hope to see your patch soon!
>
> Also, FWIW, once the documentation is also done I'd like to consider
> solidifying the 0.3 release.  It's got a lot of good stuff in it and I
> think as soon as we've finished off the new CMIS connector in all
> dimensions we should go ahead.  Thoughts, anyone?
>
> Karl
>
>
> On Tue, Aug 2, 2011 at 5:00 AM, Piergiorgio Lucidi
>  wrote:
> > Yesterday I started to work on end-to-end integration test for the CMIS
> > Connector and now I have a full running OpenCMIS test server integrated
> with
> > the ManifoldCF Maven build process.
> >
> > Now I have to implement:
> > - a setup method to create the test documents in the CMIS server
> > - a null output connector using the ManifoldCF api
> > - tests using the ManifoldCF api to create a mock configuration against
> the
> > test CMIS server
> >
> > I'll let you know when it works.
> >
> > Regards,
> > Piergiorgio
> >
> > 2011/7/29 Piergiorgio Lucidi 
> >
> >> Hi Karl,
> >>
> >> thank you for the details and as soon as I finish a first version of
> >> integration and/or unit test I will create a ne

Re: CMIS Connector - Tests

2011-08-03 Thread Karl Wright
The ConfigParams class is, I believe, derived from the Configuration
class.  So, you can create a ConfigParams object instead of a
Configuration object if you want to use the API in the manner you
describe.

The reason your commented-out code doesn't work is because the
setParameter() method isn't doing quite what you are expecting.  It's
creating a node named "_PARAMETER_" with a "name" attribute and a
value area, and you are creating nodes named by the parameter name.
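
The difference in shape can be sketched roughly as follows. The Node class
here is a hypothetical, self-contained stand-in written only for this
illustration (it is not ManifoldCF's ConfigurationNode, and it does no XML
escaping); the parameter name "username" and value "admin" are made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in classes; they only mimic the node shape described above.
final class ParameterNodeSketch {

  /** Minimal tree node: a type, optional attributes, an optional value, children. */
  static final class Node {
    final String type;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<Node> children = new ArrayList<>();
    String value;

    Node(String type) {
      this.type = type;
    }

    /** Renders the node as XML-like text (no escaping; illustration only). */
    String toXml() {
      StringBuilder sb = new StringBuilder("<").append(type);
      for (Map.Entry<String, String> a : attributes.entrySet()) {
        sb.append(' ').append(a.getKey()).append("=\"").append(a.getValue()).append('"');
      }
      sb.append('>');
      if (value != null) {
        sb.append(value);
      }
      for (Node child : children) {
        sb.append(child.toXml());
      }
      return sb.append("</").append(type).append('>').toString();
    }
  }

  /** The shape described above: a "_PARAMETER_" node, a "name" attribute, a value. */
  static Node parameter(String name, String value) {
    Node node = new Node("_PARAMETER_");
    node.attributes.put("name", name);
    node.value = value;
    return node;
  }

  /** The shape the commented-out test code builds: a node named after the parameter. */
  static Node namedByParameter(String name, String value) {
    Node node = new Node(name);
    node.value = value;
    return node;
  }

  public static void main(String[] args) {
    System.out.println(parameter("username", "admin").toXml());
    System.out.println(namedByParameter("username", "admin").toXml());
  }
}
```

The first line printed is the "_PARAMETER_" form; the second is the form that
does not match it.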

Karl


Re: CMIS Connector - Tests

2011-08-03 Thread Karl Wright
Another good way to see exactly what you need to do is to call the API
to get configuration information for an existing connection.  Then,
use the toXML() method to convert to XML, or the toJSON() to get it as
JSON.  Either way you will see the structure.  BTW, ManifoldCF in
Action Chapter 3 covers this in great detail as well.
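
A rough sketch of such a lookup over HTTP follows. The path form
"repositoryconnections/CMIS%20Connection" is the one used earlier in this
thread; the base URL in main (http://localhost:8345/mcf-api-service/json) is
an assumption about a local example deployment, and the class name
ConnectionLookupSketch is made up. Adjust both to your installation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ManifoldCF.
final class ConnectionLookupSketch {

  /** Builds the API path for one repository connection, encoding spaces as %20. */
  static String connectionPath(String connectionName) {
    // URLEncoder produces form encoding ("+" for a space); a literal "+" is
    // already encoded as %2B, so replacing "+" afterwards is safe.
    String encoded = URLEncoder.encode(connectionName, StandardCharsets.UTF_8)
        .replace("+", "%20");
    return "repositoryconnections/" + encoded;
  }

  /** Performs a GET against apiBase + "/" + path and returns the response body. */
  static String fetch(String apiBase, String connectionName) {
    try {
      HttpURLConnection conn = (HttpURLConnection)
          URI.create(apiBase + "/" + connectionPath(connectionName)).toURL().openConnection();
      conn.setRequestMethod("GET");
      try (InputStream in = conn.getInputStream()) {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
      } finally {
        conn.disconnect();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(connectionPath("CMIS Connection"));
    // Pass a base URL as the first argument to run the GET, for example
    // http://localhost:8345/mcf-api-service/json (assumed, not verified here).
    if (args.length > 0) {
      System.out.println(fetch(args[0], "CMIS Connection"));
    }
  }
}
```

Inspecting the returned body for a connection created through the UI shows
the node structure that a test needs to reproduce.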

Karl


Re: CMIS Connector - Tests

2011-08-03 Thread Piergiorgio Lucidi
Yes, this is a very quick way to test my configuration code for the
integration test ;)
I'm going to fix this part; I found what I need in the JSON returned by a GET
call against the ManifoldCF API service.

I'll send an update on this tomorrow.

Separately, we have a problem with the OpenCMIS InMemory server. There seems
to be a problem during the execution of CMIS queries. I reported it to the
Chemistry guys (Jens Hubel and Florian Muller), and they are trying to
reproduce the issue so that it can be solved.

In the meantime, I'm continuing development against the public Alfresco CMIS
server at the following address:
http://cmis.alfresco.com

Piergiorgio


[jira] [Commented] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Kate McGonigal (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079085#comment-13079085
 ] 

Kate McGonigal commented on CONNECTORS-235:
---

I'm afraid these problems still exist for me. 

A few hours ago I built the latest from trunk. It is running on PostgreSQL.

Just in case, I also started from a fresh install of Solr 3.3.0.  I'm using the 
example that comes with the distribution. It is thus running on Derby. I 
realize the schema is not optimal for RSS feeds, but it does include a 
"description"  field, which is what I'm interested in at the moment.

Problem 1) When I try running the example job with "Dechromed Content" set to 
"No dechromed content", what shows up in the description field (for all 
documents) is "Jazz radio show from Winnipeg on CKUW 95.9 FM, hosted by Maurice 
Hogue." which is not the item-description in the RSS feed's XML, but rather 
from the website's metadata description element in the HTML.  I have tried 
another RSS feed, with the same result.

Problem 2) When I try running the example job (see original post) with 
"Dechromed Content" set to "if present, in 'description' field" it still hangs 
with the log file showing:
FATAL 2011-08-03 16:08:21,703 (Worker thread '10') - Error tossed: java.lang.String cannot be cast to org.apache.manifoldcf.core.interfaces.CharacterInput
java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.manifoldcf.core.interfaces.CharacterInput
    at org.apache.manifoldcf.crawler.jobs.Carrydown.getDataValuesAsFiles(Carrydown.java:611)
    at org.apache.manifoldcf.crawler.jobs.JobManager.retrieveParentDataAsFiles(JobManager.java:4263)
    at org.apache.manifoldcf.crawler.system.WorkerThread$VersionActivity.retrieveParentDataAsFiles(WorkerThread.java:1221)
    at org.apache.manifoldcf.crawler.connectors.rss.RSSConnector.getDocumentVersions(RSSConnector.java:824)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:321)

And just to be clear on what I am ultimately trying to do: I'd like to be able 
to show my searchers the "description" from the RSS feed for each of the 
documents that match their searches. I actually only need to index the 
item-description field (as opposed to what is at the item link) since my RSS 
feeds are of scientific papers that will have a detailed abstract in the 
item-description.





[jira] [Issue Comment Edited] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Kate McGonigal (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079085#comment-13079085
 ] 

Kate McGonigal edited comment on CONNECTORS-235 at 8/3/11 10:31 PM:


I'm afraid these problems still exist for me. 

A few hours ago I built the latest from trunk. It is running on PostgreSQL.

Just in case, I also started from a fresh install of Solr 3.3.0.  I'm using the 
example that comes with the distribution. It is thus running on Derby. I 
realize the schema is not optimal for RSS feeds, but it does include a 
"description"  field, which is what I'm interested in at the moment.

Problem 1) When I run the example job (see original post) with 
"Dechromed Content" set to "No dechromed content", what shows up in the 
description field (for all documents) is "Jazz radio show from Winnipeg on CKUW 
95.9 FM, hosted by Maurice Hogue." This is not the item description from the 
RSS feed's XML, but the description metadata element from the website's HTML.  
I have tried another RSS feed, with the same result.

Problem 2) When I try running the example job with "Dechromed Content" set to 
"if present, in 'description' field" it still hangs with the log file showing:
{quote}FATAL 2011-08-03 16:08:21,703 (Worker thread '10') - Error tossed: java.lang.String cannot be cast to org.apache.manifoldcf.core.interfaces.CharacterInput
java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.manifoldcf.core.interfaces.CharacterInput
    at org.apache.manifoldcf.crawler.jobs.Carrydown.getDataValuesAsFiles(Carrydown.java:611)
    at org.apache.manifoldcf.crawler.jobs.JobManager.retrieveParentDataAsFiles(JobManager.java:4263)
    at org.apache.manifoldcf.crawler.system.WorkerThread$VersionActivity.retrieveParentDataAsFiles(WorkerThread.java:1221)
    at org.apache.manifoldcf.crawler.connectors.rss.RSSConnector.getDocumentVersions(RSSConnector.java:824)
    at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:321){quote}

And just to be clear on what I am ultimately trying to do: I'd like to be able 
to show my searchers the "description" from the RSS feed for each of the 
documents that match their searches. I actually only need to index the 
item-description field (as opposed to what is at the item link) since my RSS 
feeds are of scientific papers that will have a detailed abstract in the 
item-description.


[jira] [Commented] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079097#comment-13079097
 ] 

Karl Wright commented on CONNECTORS-235:


Hmm, I'm using the very same feed you are, with PostgreSQL, and seeing perfect 
results.
Can you attach a screenshot of the View Job page for the job in question, and 
of the View Connection pages for both the output connection and the 
repository connection?






[jira] [Commented] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079118#comment-13079118
 ] 

Karl Wright commented on CONNECTORS-235:


One problem I found is that, due to a rebuild, I was not using PostgreSQL after 
all. Here's another check-in, r1153702, to fix the handling of streamed 
carrydown info under PostgreSQL.
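
For readers following the stack trace earlier in this thread: the cast failed 
because a value that the code expected to be a stream arrived as a plain 
string. The snippet below is only an illustrative sketch in Python, not the 
ManifoldCF change in r1153702. The function name as_text_reader, and the idea 
that a database driver may hand back either shape, are assumptions made for 
the example.

```python
import io


def as_text_reader(value):
    """Return a file-like text reader for a column value of either shape.

    Hypothetical helper: checks the runtime type instead of assuming
    that every driver returns a stream.
    """
    if isinstance(value, str):
        # The driver returned the text directly; wrap it so callers can read().
        return io.StringIO(value)
    if hasattr(value, "read"):
        # The driver already returned a stream-like object; pass it through.
        return value
    raise TypeError(f"unsupported carrydown value type: {type(value).__name__}")


# Both shapes can then be consumed the same way.
print(as_text_reader("item description text").read())
print(as_text_reader(io.StringIO("item description text")).read())
```

The point of the sketch is only that the type check happens before the value 
is used as a stream, so a string-returning database no longer trips a cast.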





[jira] [Commented] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079134#comment-13079134
 ] 

Karl Wright commented on CONNECTORS-235:


Ok, another mystery solved.  The RSS dechromed content mode of "None" was not 
properly tested because of the inadvertent database switch, and I found that 
recrawling vs. crawling fresh generated incorrect version information.  I've 
fixed that problem but I can't check it in because it causes the following 
error against a plain-vanilla Solr installation:

ERROR: [http://www.onemansjazz.ca/content/view/330/50/] multiple values 
encountered for non multiValued field description: [Jazz radio show from 
Winnipeg on CKUW 95.9 FM, hosted by Maurice Hogue., I have created a Listener 
Survey and if you have the time to complete it, that would be terrific. I'm 
trying to do an evaluation of One Man's Jazz as well as considering some 
new options that have arisen. Your feedback would be most appreciate.This 
survey is in two parts and is a total of twenty parts, most of them just 
require a click of your mouse. Click here 
(http://www.surveymonkey.com/s/C3DZ3JK) for Part One, and here 
(http://www.surveymonkey.com/s/C38FVH8) for Part Two. Thanks again for your 
input. ]

I'm not sure why Solr is interpreting this long field as multivalued, but 
clearly it would be much better if I used a metadata name that wasn't 
"description", since Solr's example configuration has dibs on that.  I'll 
experiment and post further.
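
The error text itself lists two values for description: the first is the 
site's description metadata and the second is the feed item's text. A rough 
Python sketch of that kind of collision follows. It is not Solr or ManifoldCF 
code. The names merge_fields and find_conflicts are hypothetical, and the 
premise that the two values reach the same field from two different sources 
(page metadata and item metadata) is an assumption for illustration only.

```python
from collections import defaultdict

# Fields assumed to accept a single value, as in the error message above.
SINGLE_VALUED = {"description"}


def merge_fields(extracted, literals):
    """Combine values from two sources into one name -> list-of-values map."""
    merged = defaultdict(list)
    for source in (extracted, literals):
        for name, value in source.items():
            merged[name].append(value)
    return dict(merged)


def find_conflicts(merged, single_valued=SINGLE_VALUED):
    """Return the single-valued fields that ended up with several values."""
    return sorted(
        name
        for name, values in merged.items()
        if name in single_valued and len(values) > 1
    )


page_metadata = {
    "description": "Jazz radio show from Winnipeg on CKUW 95.9 FM, "
                   "hosted by Maurice Hogue."
}
# Item text abbreviated here; the full value appears in the error above.
item_as_description = {"description": "I have created a Listener Survey ..."}
item_as_summary = {"summary": "I have created a Listener Survey ..."}

print(find_conflicts(merge_fields(page_metadata, item_as_description)))
print(find_conflicts(merge_fields(page_metadata, item_as_summary)))
```

Under these assumptions, sending the item text under a different field name 
avoids the clash without touching the Solr schema.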






[jira] [Commented] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079137#comment-13079137
 ] 

Karl Wright commented on CONNECTORS-235:


I switched the name to "summary".  r1153705.






[jira] [Commented] (CONNECTORS-235) item description element not indexed

2011-08-03 Thread Karl Wright (JIRA)

[ 
https://issues.apache.org/jira/browse/CONNECTORS-235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13079141#comment-13079141
 ] 

Karl Wright commented on CONNECTORS-235:


Just to be clear, here's an example of the Solr log line for indexing one of 
the documents from the above-mentioned feed.  You can, of course, configure the 
job to map the field names to whatever you like.  This is with no mapping 
whatsoever.

INFO: [] webapp=/solr path=/update/extract 
params={literal.source=http://www.onemansjazz.ca/component/option,com_rss/feed,RSS2.0/no_html,1/&literal.category=Radio+-+Play+lists&literal.summary=I+had+a+lot+of+fun+putting+this+show+together+this+week.+Hope+you+enjoy+it,+too.&literal.id=http://www.onemansjazz.ca/content/view/332/30/&literal.title=July+23,+2011+Playlist&literal.pubdate=1311339967000}
 status=0 QTime=13

I'm pretty certain you must have a metadata value set for "description" in your 
job, because there is absolutely no mechanism (and never was one) for picking 
up the channel description from the feed. So you will have to remove that in 
order to get all this to work for you.
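
The update request in the log line above can be unpacked to see exactly which 
metadata fields were sent. This is a small Python sketch using only the 
standard library. The helper name literal_fields is made up for the example, 
and reading pubdate as milliseconds since the Unix epoch is an assumption 
based on the size of the number, not something stated in this thread.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs

# Query string copied from the params={...} part of the log line above.
LOGGED_PARAMS = (
    "literal.source=http://www.onemansjazz.ca/component/option,com_rss/"
    "feed,RSS2.0/no_html,1/"
    "&literal.category=Radio+-+Play+lists"
    "&literal.summary=I+had+a+lot+of+fun+putting+this+show+together+this+week."
    "+Hope+you+enjoy+it,+too."
    "&literal.id=http://www.onemansjazz.ca/content/view/332/30/"
    "&literal.title=July+23,+2011+Playlist"
    "&literal.pubdate=1311339967000"
)


def literal_fields(query):
    """Map each literal.<name> parameter to its list of decoded values."""
    prefix = "literal."
    return {
        key[len(prefix):]: values
        for key, values in parse_qs(query).items()
        if key.startswith(prefix)
    }


fields = literal_fields(LOGGED_PARAMS)
for name, values in sorted(fields.items()):
    print(name, "=", values)

# Assumption: pubdate is milliseconds since the Unix epoch.
published = datetime.fromtimestamp(int(fields["pubdate"][0]) // 1000, tz=timezone.utc)
print(published.isoformat())
```

Note that the item text travels as summary and that no description parameter 
is sent at all in this request.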
