[jira] [Commented] (CONNECTORS-1633) Exception tossed: Repeated service interruptions - failure processing document: The process cannot access the file because it is being used by another process.

2020-01-24 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022959#comment-17022959
 ] 

Karl Wright commented on CONNECTORS-1633:
-

Hi,
The connector retries this error for a fixed period of time and then gives up 
and aborts the job.  What behavior would you like to see instead?  It could 
skip the file and continue, but I'd worry about that too.
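
To make the two options concrete, here is a minimal sketch of "retry for a window, then either abort or skip".  It is an illustration only, not the connector's actual code: the class, the enums, and the timing values are all hypothetical.

```java
import java.io.IOException;

// Hypothetical sketch; none of these names come from ManifoldCF.
class RetryPolicySketch {
  /** What to do once the retry window has elapsed. */
  enum OnGiveUp { ABORT_JOB, SKIP_DOCUMENT }

  /** Result of trying to process one document. */
  enum Outcome { PROCESSED, SKIPPED }

  /** One attempt at processing a document; may fail with e.g. "file in use". */
  interface DocumentAction { void run() throws IOException; }

  /**
   * Retries the action until it succeeds or the retry window elapses.
   * After the window, either reports a skip or throws to abort the job.
   */
  static Outcome process(DocumentAction action, long windowMillis,
                         long pauseMillis, OnGiveUp onGiveUp) {
    long deadline = System.currentTimeMillis() + windowMillis;
    while (true) {
      try {
        action.run();
        return Outcome.PROCESSED;
      } catch (IOException e) {
        if (System.currentTimeMillis() >= deadline) {
          if (onGiveUp == OnGiveUp.SKIP_DOCUMENT) {
            return Outcome.SKIPPED;
          }
          throw new IllegalStateException(
              "Repeated service interruptions: " + e.getMessage(), e);
        }
        sleepQuietly(pauseMillis);
      }
    }
  }

  private static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting to retry", ie);
    }
  }

  public static void main(String[] args) {
    DocumentAction lockedFile = () -> {
      throw new IOException("The file is being used by another process.");
    };
    System.out.println(process(lockedFile, 50, 10, OnGiveUp.SKIP_DOCUMENT));
  }
}
```

The worry with SKIP_DOCUMENT is visible in the sketch: the caller gets SKIPPED and the job carries on, so a locked file silently never reaches the index unless something records the skip.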



> Exception tossed: Repeated service interruptions - failure processing 
> document: The process cannot access the file because it is being used by 
> another process.
> ---
>
> Key: CONNECTORS-1633
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1633
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: File system connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Michael Cizmar
>Assignee: Karl Wright
>Priority: Major
>
> Seeing this error occurring and I'm working to address it.  If it's not a 
> bug, a better message should be generated.
>  
> {code:java}
> crawl job fails with the following error due to document being in use by 
> another user: 
>  WARN 2019-08-25T15:02:27,416 (Worker thread '11') - Service interruption 
> reported for job 1565115290083 connection 'fs_vwoaahvp319': Timeout or other 
> service interruption: The process cannot access the file because it is being 
> used by another process.
> ERROR 2019-08-25T15:02:27,424 (Worker thread '11') - Exception tossed: 
> Repeated service interruptions - failure processing document: The process 
> cannot access the file because it is being used by another process.
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service 
> interruptions - failure processing document: The process cannot access the 
> file because it is being used by another process.
>         at 
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:489) 
> [mcf-pull-agent.jar:?]
> Caused by: jcifs.smb.SmbException: The process cannot access the file because 
> it is being used by another process.
>         at 
> jcifs.smb.SmbTransportImpl.checkStatus2(SmbTransportImpl.java:1457) ~[?:?]
>         at jcifs.smb.SmbTransportImpl.checkStatus(SmbTransportImpl.java:1568) 
> ~[?:?]
>         at jcifs.smb.SmbTransportImpl.sendrecv(SmbTransportImpl.java:1023) 
> ~[?:?]
>         at jcifs.smb.SmbTransportImpl.send(SmbTransportImpl.java:1539) ~[?:?]
>         at jcifs.smb.SmbSessionImpl.send(SmbSessionImpl.java:409) ~[?:?]
>         at jcifs.smb.SmbTreeImpl.send(SmbTreeImpl.java:472) ~[?:?]
>         at jcifs.smb.SmbTreeConnection.send0(SmbTreeConnection.java:401) 
> ~[?:?]
>         at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:315) ~[?:?]
>         at jcifs.smb.SmbTreeConnection.send(SmbTreeConnection.java:295) ~[?:?]
>         at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:130) ~[?:?]
>         at jcifs.smb.SmbTreeHandleImpl.send(SmbTreeHandleImpl.java:117) ~[?:?]
>         at jcifs.smb.SmbFile.withOpen(SmbFile.java:1741) ~[?:?]
>         at jcifs.smb.SmbFile.withOpen(SmbFile.java:1710) ~[?:?]
>         at jcifs.smb.SmbFile.withOpen(SmbFile.java:1704) ~[?:?]
>         at jcifs.smb.SmbFile.queryPath(SmbFile.java:770) ~[?:?]
>         at jcifs.smb.SmbFile.exists(SmbFile.java:851) ~[?:?]
>         at 
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.fileExists(SharedDriveConnector.java:2188)
>  ~[?:?]
>         at 
> org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:610)
>  ~[?:?]
>         at 
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) 
> ~[mcf-pull-agent.jar:?]
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (CONNECTORS-1633) Exception tossed: Repeated service interruptions - failure processing document: The process cannot access the file because it is being used by another process.

2020-01-24 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1633:
---

Assignee: Karl Wright



[jira] [Commented] (CONNECTORS-1629) Support Solr Kerberos Authentication

2020-01-23 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17022122#comment-17022122
 ] 

Karl Wright commented on CONNECTORS-1629:
-

Hi [~jornfranke], can you include the URL of the pull request here?  Thanks!


> Support Solr Kerberos Authentication
> 
>
> Key: CONNECTORS-1629
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1629
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Solr 7.x component
>Affects Versions: ManifoldCF 2.14
>Reporter: Jörn Franke
>Assignee: Karl Wright
>Priority: Major
>
> Several enterprise deployments of Solr use SolrCloud Kerberos 
> authentication.
> The integration seems to be rather simple, and the goal of this Jira is to 
> evaluate the steps potentially needed to eventually contribute the Kerberos 
> integration to the ManifoldCF project.
> The following steps would be needed:
>  * One can pass the JVM parameter java.security.auth.login.config to the 
> ManifoldCF JVM using -Djava.security.auth.login.config=/path/to/jaas.conf, 
> in which the Kerberos authentication details, such as the keytab and a 
> principal that has the right access to Solr, are configured
>  * A small adaptation to the SolrCloudClient that is used within Manifold 
> needs to be made to enable Kerberos authentication: 
> HttpClientUtil.setConfigurer(new Krb5HttpClientConfigurer());
> Should this be integrated in Manifold, one may want to consider an input 
> field in the configuration UI where one can select, per flow, which user 
> defined in the Jaas conf (you can define multiple ones) should be chosen. By 
> default one may simply select "client" or "SolrJClient" if Jaas.conf is 
> present in the System properties. This does not mean the user needs to be 
> named like this, but the configuration entry referencing any user should be 
> named like this.
> Having a configuration option allows a different user per flow. This might 
> also be needed in case you have multiple Solr clusters. 
> Related discussion 
> [http://mail-archives.apache.org/mod_mbox/manifoldcf-user/201912.mbox/browser]
> SolrJ Kerberos integration: 
> [https://lucene.apache.org/solr/guide/8_3/kerberos-authentication-plugin.html#using-solrj-with-a-kerberized-solr]
> Jaas conf documentation: 
> [https://docs.oracle.com/javase/8/docs/technotes/guides/security/jgss/tutorials/LoginConfigFile.html]





[jira] [Assigned] (CONNECTORS-1629) Support Solr Kerberos Authentication

2020-01-23 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1629:
---

Assignee: Karl Wright



[jira] [Resolved] (CONNECTORS-1631) Sharepoint connction problem

2020-01-16 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1631.
-
Fix Version/s: ManifoldCF 2.15
   Resolution: Fixed

> Sharepoint connction problem
> 
>
> Key: CONNECTORS-1631
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1631
> Project: ManifoldCF
>  Issue Type: Task
>Reporter: Zoltan Farago
>    Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.15
>
> Attachments: Manifold connection.png
>
>
> Hello,
> We are trying to connect to a SharePoint 2016 site which has a default 
> installation. The URL is 
> [http://precogwin02/sites/UKAEAtestSP2016/_layouts/15/start.aspx#/Shared%20Documents/Forms/AllItems.aspx]
>  and from a browser it is fully operational. The site is installed on our 
> local network, so there should be no firewall issues. 
> When we try to connect from ManifoldCF we get this error message: "The 
> site at 
> [http://manifoldsharepoint/sites/UKAEAtestSP2016|http://manifoldsharepoint/sites/UKAEAtestSP2016/Shared%20Documents]
>  did not exist or was external; skipping"
> This Manifold installation is able to connect to a Windows share on the same 
> server, so we think there are no user/password, Active Directory, or similar 
> issues here. 
> We checked forums and documentation but found no solution. 
>  
> Is there any special setting needed in Manifold, SharePoint, etc.? 
>  
> Thank you in advance!





[jira] [Commented] (CONNECTORS-1631) Sharepoint connction problem

2020-01-16 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016840#comment-17016840
 ] 

Karl Wright commented on CONNECTORS-1631:
-

Don't know about SharePoint Online.  For SharePoint 2019 a plugin needs to be 
released that's properly linked against the SharePoint dll.  If you are 
intending to use 2019 please let me know if you can supply the DLL so that we 
can set up and release the plugin.


> Sharepoint connction problem
> 
>
> Key: CONNECTORS-1631
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1631
> Project: ManifoldCF
>  Issue Type: Task
>Reporter: Zoltan Farago
>    Assignee: Karl Wright
>Priority: Major
> Attachments: Manifold connection.png


Re: [jira] [Commented] (CONNECTORS-1631) Sharepoint connction problem

2020-01-16 Thread Karl Wright
Don't know about SharePoint Online.  For SharePoint 2019 a plugin needs to
be released that's properly linked against the SharePoint dll.  If you are
intending to use 2019 please let me know if you can supply the DLL so that
we can set up and release the plugin.

Karl


On Thu, Jan 16, 2020 at 6:54 AM Zoltan Farago (Jira) 
wrote:

>
> [
> https://issues.apache.org/jira/browse/CONNECTORS-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016836#comment-17016836
> ]
>
> Zoltan Farago commented on CONNECTORS-1631:
> ---
>
> [~kwri...@metacarta.com] thank you! Our developer installed it, and he is
> able to connect now.
>
> One more question if you don't mind. Will this plugin work with Sharepoint
> 2019 and Sharepoint Online as well?
>
> If yes SP Online would be tricky to install, is there any detailed
> step-by-step guide?


[jira] [Created] (CONNECTORS-1632) Deprecate SSL and use TLS socket factories everywhere instead

2020-01-15 Thread Karl Wright (Jira)
Karl Wright created CONNECTORS-1632:
---

 Summary: Deprecate SSL and use TLS socket factories everywhere 
instead
 Key: CONNECTORS-1632
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1632
 Project: ManifoldCF
  Issue Type: Task
  Components: Framework core, Lucene/SOLR connector
Reporter: Karl Wright
Assignee: Karl Wright


Servers that serve only TLS apparently no longer work with ManifoldCF's various 
connectors.  Changing the socket factory so that it supports the more modern 
protocols seems indicated.
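
As a rough illustration of the direction, a socket factory can be obtained from an SSLContext requested by the "TLS" name rather than "SSL".  This is a sketch using only the standard JSSE API; the class and method names are hypothetical and it is not the ManifoldCF code that will actually change.

```java
import java.security.KeyManagementException;
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical helper; not ManifoldCF's actual socket factory code.
class TlsSocketFactorySketch {
  /** Builds a "TLS" SSLContext using the JVM's default key and trust material. */
  static SSLContext tlsContext() {
    try {
      SSLContext context = SSLContext.getInstance("TLS");
      // null arguments select the default key managers, trust managers,
      // and source of randomness.
      context.init(null, null, null);
      return context;
    } catch (NoSuchAlgorithmException | KeyManagementException e) {
      throw new IllegalStateException("TLS is not available in this JVM", e);
    }
  }

  /** Socket factory backed by the TLS context above. */
  static SSLSocketFactory tlsSocketFactory() {
    return tlsContext().getSocketFactory();
  }

  public static void main(String[] args) {
    // Lists the protocol versions this JVM enables by default for the context.
    for (String protocol : tlsContext().getDefaultSSLParameters().getProtocols()) {
      System.out.println(protocol);
    }
  }
}
```

Which protocol versions end up enabled depends on the JVM and its security configuration, so the output of main is worth checking on each JDK we support.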






[jira] [Commented] (CONNECTORS-1631) Sharepoint connction problem

2020-01-15 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17016350#comment-17016350
 ] 

Karl Wright commented on CONNECTORS-1631:
-

So did you install the MCF plugin for Sharepoint 2016 on the SharePoint server? 
 If not, remember that this is mandatory.





[jira] [Assigned] (CONNECTORS-1631) Sharepoint connction problem

2020-01-15 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1631:
---

Assignee: Karl Wright



Re: Oracle JDBC Job Error

2020-01-06 Thread Karl Wright
Hi Cihad,

You need to change the query.  The code is perfectly fine.  Follow the
instructions in the error message: try using quotes around the $(IDCOLUMN)
variable, e.g. "$(IDCOLUMN)".
Your query should look like this:

SELECT PERSONID AS "$(IDCOLUMN)" FROM PERSON
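
The reason the quotes matter: Oracle folds an unquoted alias to upper case but keeps a quoted alias exactly as written, and the lookup described in the quoted message below is case-sensitive.  A small simulation of that behavior, with no database involved (the class and helper names are hypothetical; only the "lcf__id" value comes from this thread):

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical simulation; it does not use the JDBC connector's real classes.
class IdColumnCaseSketch {
  // Value reported in this thread for JDBCConstants.idReturnColumnName.
  static final String ID_RETURN_COLUMN = "lcf__id";

  /** Column label as Oracle would report it: unquoted aliases fold to upper case. */
  static String oracleLabel(String alias) {
    boolean quoted = alias.length() >= 2
        && alias.startsWith("\"") && alias.endsWith("\"");
    if (quoted) {
      return alias.substring(1, alias.length() - 1);
    }
    return alias.toUpperCase(Locale.ROOT);
  }

  /** Builds a one-column row keyed by the label, then does a case-sensitive lookup. */
  static Object lookupId(String aliasInQuery, Object value) {
    Map<String, Object> row = new HashMap<>();
    row.put(oracleLabel(aliasInQuery), value);
    return row.get(ID_RETURN_COLUMN);
  }

  public static void main(String[] args) {
    System.out.println(lookupId("lcf__id", 42));      // prints null
    System.out.println(lookupId("\"lcf__id\"", 42));  // prints 42
  }
}
```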

Karl

On Mon, Jan 6, 2020 at 7:50 PM Cihad Guzel  wrote:

> Hi,
>
> I have debugged the MCF-2.15 code and found the problem.
>
> JDBCConnector.java line:270
>
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
>
> if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return $(IDCOLUMN) 
> column.  Try using quotes around $(IDCOLUMN) variable, e.g. \"$(IDCOLUMN)\", 
> or, for MySQL, select \"by label\" in your repository connection.");
>
>
> The column name in the "row" object is "LCF__ID" -> this is an uppercase string
>
> "JDBCConstants.idReturnColumnName" is "lcf__id" -> this is a lowercase string
>
> So the "o" object is null.
>
> I think that Oracle returns the uppercase column name. It is not a bug. How 
> can I fix it? Should I update the seed query in the Query tab? Should we 
> change the code lines?
>
> Regards,
> Cihad Guzel
>
>
> On Sun, Jan 5, 2020 at 20:14, Cihad Guzel wrote:
>
>> Hi,
>>
>> I am trying the JDBC connector with Oracle (version: 11.2.0.4). I added
>> ojdbc6.jar to the classpath. My seed query is as follows:
>>
>> "SELECT PERSONID AS $(IDCOLUMN) FROM PERSON"
>>
>> and I get the following error:
>>
>> "Error: Bad seed query; doesn't return $(IDCOLUMN) column. Try using
>> quotes around $(IDCOLUMN) variable, e.g. "$(IDCOLUMN)", or, for MySQL,
>> select "by label" in your repository connection."
>>
>> I have tried the JDBC connector with MsSQL and MySQL. It ran successfully.
>>
>> How can I fix it?
>>
>> Regards,
>> Cihad Guzel
>>
>


[jira] [Commented] (CONNECTORS-1629) Support Solr Kerberos Authentication

2020-01-03 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17007319#comment-17007319
 ] 

Karl Wright commented on CONNECTORS-1629:
-

Hi,

{quote}
About the ModifiedSolrClient - do I understand you correctly that you would 
prefer to make the ModifiedSolrClient working in this setting as well? Ie by 
creating a new ModifiedSolrClientKerberos and ModifiedLBSolrClientKerberos (not 
touching the ones already in Manifold)? I can look at this, but I wonder if 
this would still be needed as I did not observe any errors. Maybe the multipart 
bit is fixed in higher Solr versions?
{quote}

I wish the multipart code was fixed but I fear it is not; I tried to get the 
HttpClient team to agree to it but there was disagreement and I didn't get past 
that.  It's so long ago now that I don't even remember the discussion well, but 
some team members thought that it was not the client's responsibility to 
properly escape argument names when they were encoded in some cases but not in 
others.  If you are including metadata names and values that would require 
encoding and this is working OK, then maybe this was resolved.  But we should 
evaluate that independently.

The multipart fix was only PART of the reason for ModifiedSolrHttpClient, 
however.  The other reason was that the Solr team essentially deprecated and 
removed support for multipart posts entirely, which meant that streaming of 
large documents to solr was not possible.  I've kept that working and called 
for them to rethink that problem, at which point I was told that nobody should 
be using Solr Cell at all (!)  So that stays until the Solr team figures this 
out.  The conversation there was at least relatively recent.

A GitHub pull request is fine.  Appending ".diff" to its URL produces a diff 
that I can then apply in svn.






> Support Solr Kerberos Authentication
> 
>
> Key: CONNECTORS-1629
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1629
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Solr 7.x component
>Affects Versions: ManifoldCF 2.14
>Reporter: Jörn Franke
>Priority: Major
>


[jira] [Commented] (CONNECTORS-1629) Support Solr Kerberos Authentication

2020-01-02 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17007086#comment-17007086
 ] 

Karl Wright commented on CONNECTORS-1629:
-

Hi,

I suggest we make changes piecemeal.  First, updating the Jetty version, and 
the jars that are included, as described here:

{quote}
You need jetty-client-9.4.25.v20191220.jar (a slightly older 9.4.x version may 
work as well, but the one currently shipped with Manifold does not). Otherwise 
you will get a java.lang.ClassNotFoundException: 
org.eclipse.jetty.client.util.SPNEGOAuthentication error.

I was not sure how to add this jar to the generated ManifoldCF distribution, 
so I copied it into collector-lib and added it to the classpath.
{quote}

To do this, we'd want to update the version of jetty specified in build.xml and 
pom.xml, and add the new jar to the jetty jar list in build.xml.  Then, in 
framework/build.xml, the new jar should be added wherever jetty jars are found.

{quote}
I also had to deactivate the ModifiedLbSolrClient (commented out below); 
otherwise you get a 401 auth error. I believe the reason is that the default 
SPNEGO protocol for HTTP Kerberos always returns 401 (not authorized) first, 
and THEN you are supposed to do the Kerberos authentication, which is what 
SolrJ does.
{quote}

The modified client is present because we need to be sure that the correct 
(overridden) version of the SolrHttpClient class is used, not the default one.  
So in this case you'd want to create a fresh copy of LBSolrClient and modify it 
accordingly.

{quote}
Finally, you need to add to options.env.unix or options.env.win:

-Djava.security.auth.login.config=/path/to/jaas-client.conf
{quote}

I would suggest adding both the config file and the -D switch to all the 
examples, but leave kerberos disabled unless somebody modifies the 
jaas-client.conf file.
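
One possible way to get "disabled unless somebody modifies the file" is sketched below.  It rests on assumptions of mine: that the shipped jaas-client.conf contains only blank lines and // comments, and that any line with an opening brace means a login entry was added.  The class and method names are hypothetical, and /* ... */ comments are not handled.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical check; not part of ManifoldCF.
class JaasConfigCheckSketch {
  static final String LOGIN_CONFIG_PROPERTY = "java.security.auth.login.config";

  /** True if any line that is not blank and not a // comment contains '{'. */
  static boolean hasLoginEntry(List<String> lines) {
    for (String line : lines) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("//")) {
        continue;
      }
      if (trimmed.contains("{")) {
        return true;
      }
    }
    return false;
  }

  /** True only if the -D switch points at a readable file with a login entry. */
  static boolean kerberosConfigured() {
    String path = System.getProperty(LOGIN_CONFIG_PROPERTY);
    if (path == null) {
      return false;
    }
    Path file = Paths.get(path);
    if (!Files.isReadable(file)) {
      return false;
    }
    try {
      return hasLoginEntry(Files.readAllLines(file, StandardCharsets.UTF_8));
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("Kerberos configured: " + kerberosConfigured());
  }
}
```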





Requests for help from Priya Arora

2019-12-30 Thread Karl Wright
Hi all,

Priya has been sending me a ton of requests for help to my personal email,
and I have requested that he/she stop doing that.  I've repeatedly
requested that he/she fix the out-of-memory condition he/she is seeing on
all the ManifoldCF processes on the container setup that is being used, but
he/she has not attempted to correct this.

I've concluded that this person is essentially unfamiliar with the most
basic Java ideas and thus the advice we give is unlikely to be of further
help.  I've sent mail to them requesting that they obtain basic Java
instruction from another source than me or the ManifoldCF lists.

Thanks,
Karl


Tackling JDK 11+

2019-12-26 Thread Karl Wright
Hi folks,

Now that 2.15 is out, it's time to think about what to do about JDK 11 and after.

The transition from JDK 8 to JDK 11 will require significant work and
testing, because JDK 11 *removed* many JDK classes that used to
exist in JDK 8.  The classes can be reincluded as specific dependencies BUT
in order to know what inclusions are needed we are going to need to test
every connector on JDK 11.  In addition, since ManifoldCF distribution
includes a complete execution environment, we will be changing the jars we
include in the binary and lib distributions considerably.

One option is to simply include all components and classes that are no
longer part of JDK 11 but were part of JDK 8, if such a list exists.  This
would be the safest way to proceed, but I have no idea how long this list
of jars is, and what versions of all the component jars we'd need.  It
would be great to use somebody else's work here if it exists.  Does anyone
know a full list of jars and versions that would "convert" a JDK 11 to a
full JDK 8-compatible environment?

Thanks in advance,
Karl
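
As a starting point for that testing, here is a minimal sketch that probes one representative class from each Java EE/CORBA module that JDK 11 dropped (JEP 320), to see which ones are still loadable in the current environment. The class name RemovedModuleProbe and the choice of representative classes are illustrative assumptions, not an exhaustive list of what the connectors need.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RemovedModuleProbe {
    /** One representative class per module removed in JDK 11 (JEP 320). */
    static final String[] REPRESENTATIVE_CLASSES = {
        "javax.xml.bind.JAXBContext",                       // java.xml.bind (JAXB)
        "javax.xml.ws.Service",                             // java.xml.ws (JAX-WS)
        "javax.activation.DataHandler",                     // java.activation (JAF)
        "javax.annotation.PostConstruct",                   // java.xml.ws.annotation
        "javax.transaction.TransactionRolledbackException", // java.transaction
        "org.omg.CORBA.ORB"                                 // java.corba
    };

    /** Returns true if the class can be found without initializing it. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, RemovedModuleProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Probes every representative class and records whether it is available. */
    static Map<String, Boolean> probeAll() {
        Map<String, Boolean> result = new LinkedHashMap<>();
        for (String name : REPRESENTATIVE_CLASSES) {
            result.put(name, isLoadable(name));
        }
        return result;
    }

    public static void main(String[] args) {
        probeAll().forEach((name, ok) ->
            System.out.println((ok ? "present  " : "MISSING  ") + name));
    }
}
```

Running this with a connector's full classpath on JDK 11 would show which replacement jars still need to be added for that connector.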


[RESULT] [VOTE] Release Apache ManifoldCF 2.15, RC0

2019-12-25 Thread Karl Wright
Three +1's, >72 hrs.  Vote passes!

Karl

On Sun, Dec 22, 2019 at 7:13 PM Karl Wright  wrote:

> Please vote on whether to release Apache ManifoldCF 2.15, RC0.  The
> release artifact can be found at:
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.15
> .  There is also a release tag at:
> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.15-RC0 .
>
> This release of ManifoldCF ties up some loose ends in the csws connector,
> and also updates the Solr Connector to the 8.x version of SolrJ.  There are
> also some bug fixes included.
>
> Thanks,
> Karl
>
>


Re: [VOTE] Release Apache ManifoldCF 2.15, RC0

2019-12-24 Thread Karl Wright
Ran all tests.

+1 from me.

Karl


On Sun, Dec 22, 2019 at 7:13 PM Karl Wright  wrote:

> Please vote on whether to release Apache ManifoldCF 2.15, RC0.  The
> release artifact can be found at:
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.15
> .  There is also a release tag at:
> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.15-RC0 .
>
> This release of ManifoldCF ties up some loose ends in the csws connector,
> and also updates the Solr Connector to the 8.x version of SolrJ.  There are
> also some bug fixes included.
>
> Thanks,
> Karl
>
>


[VOTE] Release Apache ManifoldCF 2.15, RC0

2019-12-22 Thread Karl Wright
Please vote on whether to release Apache ManifoldCF 2.15, RC0.  The release
artifact can be found at:
https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.15 .
There is also a release tag at:
https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.15-RC0 .

This release of ManifoldCF ties up some loose ends in the csws connector,
and also updates the Solr Connector to the 8.x version of SolrJ.  There are
also some bug fixes included.

Thanks,
Karl


[jira] [Resolved] (CONNECTORS-1630) Livelink/Opentext connector support REST API

2019-12-22 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1630.
-
Fix Version/s: ManifoldCF 2.14
   Resolution: Fixed

The "csws" connector is the OpenText REST connector.  It shipped with 2.14.


> Livelink/Opentext connector support REST API
> 
>
> Key: CONNECTORS-1630
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1630
> Project: ManifoldCF
> Issue Type: New Feature
> Components: LiveLink connector
> Reporter: Jörn Franke
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> Currently, the Livelink connector is based on the Opentext proprietary APIs 
> lapi.jar/lssl.jar.
> It seems that Opentext/Livelink focuses most of their efforts on the public 
> REST API and that lapi.jar is becoming deprecated. Hence, a new connector 
> should be developed to leverage the REST API.
> This task needs to investigate the minimum REST API version needed to provide 
> the Manifold functionality (Repository/Authority connection) similar to the 
> proprietary APIs.
> One then also needs to identify the configuration options in the UI, such as:
> authority connection:
>  * API base URL
>  * username/password auth (it is not basic auth), NTLM, Kerberos
> repository:
>  * API base URL
>  * API version to use (currently v1 or v2, just in case both versions would 
> provide the needed functionality)
>  * username/password auth (it is not basic auth), NTLM, Kerberos
>  * path to fetch (e.g. by object id of the folder)
>  * recursive fetch (yes/no)
>  * regex pattern for specific filenames
>  * regex pattern for specific (sub-)folders in case of recursive fetch
>  * mapping of username to Livelink username
>  * number of threads for API calls
> Then a plan needs to be developed on how to design the functionality. 
> Multi-threading should be used as much as possible, but should be limited to 
> a certain number of threads, e.g. by using an executor service, as the REST 
> API requires many calls to get all information (e.g. to get document 
> categories one needs to "work recursively its way up").
>  
> References:
>  * OpenText REST APIs Content server: 
> [https://developer.opentext.com/webaccess/#url=%2Fawd%2Fresources%2Fapis%2Fcs-rest-api-for-cs-16-s=501]
>  * OpenText REST API Directory services (this MIGHT be needed for the 
> Authority plugin, but it MAY also be fine just with the content server APIs): 
> [https://developer.opentext.com/webaccess/#url=%2Fawd%2Fresources%2Fapis%2Fotds-16=501]
>  * Executor service fixed thread pool: 
> [https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool(int])
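
The thread-limiting idea in the description above can be sketched with a fixed thread pool. This is only an illustration under stated assumptions: BoundedFetcher and the "node-N" lookups are hypothetical stand-ins for REST calls, and nothing here reflects how the csws connector is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BoundedFetcher {
    /** Runs all tasks on at most maxThreads workers; results come back in submission order. */
    static <T> List<T> fetchAll(List<Callable<T>> tasks, int maxThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("fetch task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (int id = 1; id <= 5; id++) {
            final int nodeId = id;
            // Stand-in for a REST call that looks up one node.
            tasks.add(() -> "node-" + nodeId);
        }
        System.out.println(fetchAll(tasks, 2));
    }
}
```

The pool size would presumably come from the "number of threads for API calls" configuration option listed in the description.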





[jira] [Assigned] (CONNECTORS-1630) Livelink/Opentext connector support REST API

2019-12-22 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1630:
---

Assignee: Karl Wright

> Livelink/Opentext connector support REST API
> 
>
> Key: CONNECTORS-1630
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1630
> Project: ManifoldCF
> Issue Type: New Feature
> Components: LiveLink connector
> Reporter: Jörn Franke
> Assignee: Karl Wright
> Priority: Major
>
> Currently, the Livelink connector is based on the Opentext proprietary APIs 
> lapi.jar/lssl.jar.
> It seems that Opentext/Livelink focuses most of their efforts on the public 
> REST API and that lapi.jar is becoming deprecated. Hence, a new connector 
> should be developed to leverage the REST API.
> This task needs to investigate the minimum REST API version needed to provide 
> the Manifold functionality (Repository/Authority connection) similar to the 
> proprietary APIs.
> One then also needs to identify the configuration options in the UI, such as:
> authority connection:
>  * API base URL
>  * username/password auth (it is not basic auth), NTLM, Kerberos
> repository:
>  * API base URL
>  * API version to use (currently v1 or v2, just in case both versions would 
> provide the needed functionality)
>  * username/password auth (it is not basic auth), NTLM, Kerberos
>  * path to fetch (e.g. by object id of the folder)
>  * recursive fetch (yes/no)
>  * regex pattern for specific filenames
>  * regex pattern for specific (sub-)folders in case of recursive fetch
>  * mapping of username to Livelink username
>  * number of threads for API calls
> Then a plan needs to be developed on how to design the functionality. 
> Multi-threading should be used as much as possible, but should be limited to 
> a certain number of threads, e.g. by using an executor service, as the REST 
> API requires many calls to get all information (e.g. to get document 
> categories one needs to "work recursively its way up").
>  
> References:
>  * OpenText REST APIs Content server: 
> [https://developer.opentext.com/webaccess/#url=%2Fawd%2Fresources%2Fapis%2Fcs-rest-api-for-cs-16-s=501]
>  * OpenText REST API Directory services (this MIGHT be needed for the 
> Authority plugin, but it MAY also be fine just with the content server APIs): 
> [https://developer.opentext.com/webaccess/#url=%2Fawd%2Fresources%2Fapis%2Fotds-16=501]
>  * Executor service fixed thread pool: 
> [https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/Executors.html#newFixedThreadPool(int])





[RESULT] [VOTE] Release Apache ManifoldCF Plugin for Solr 8.x, version 2.2 RC0

2019-12-21 Thread Karl Wright
Four +1's, > 72 hours.  Vote passes!

Karl

On Fri, Dec 20, 2019 at 5:17 AM Markus Schuch  wrote:

> +1 from me
>
> installed the plugin in a fresh solr 8.3.1 instance and successfully ran
> a small smoke test.
>
> Cheers,
> Markus
>
> On 18.12.2019 at 08:39, Karl Wright wrote:
> > Please vote on whether to release the initial version of the Apache
> > ManifoldCF Plugin for Solr 8.x, version 2.2, RC0.
> >
> > The release artifact can be found here:
> >
> >
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-solr-8.x-plugin-2.2
> >
> > There is also a tag at:
> >
> https://svn.apache.org/repos/asf/manifoldcf/integration/solr-8.x/tags/release-2.2-RC0
> > .
> >
> > Note that the version "2.2" describes, in part, compatibility with other
> > Solr plugins, so is appropriate in this case for an initial version.
> >
> > Thanks,
> > Karl
> >
>


Re: What do you think about moving to git?

2019-12-20 Thread Karl Wright
Hi Markus,

All of our release scripts and documentation scripts are written to work
against Subversion.  Changing these represents a non-trivial amount of work
- something I don't have time for at the moment. FWIW, our reliance on
Forrest means that our build process for the site pages performs an svn
checkout of the Forrest codebase (so it can be properly patched to support
CJK fonts), so there's no removing Subversion entirely from our
infrastructure at this time in any case.

Karl


I have no fundamental objection to going to git but this cannot be a
one-person task.

On Fri, Dec 20, 2019 at 5:44 AM Markus Schuch  wrote:

> Hi everyone,
>
> ManifoldCF is the only project I'm working on that is still hosted in
> Subversion.
>
> I would like to start a discussion about whether we could migrate to Git.
>
> Why? I see wasted potential for a more active community and more
> contributions by hiding in Subversion. Github has a lot to offer and
> developers may no longer know how to use Subversion these days.
>
> What do you think?
>
> Cheers,
> Markus
>


Re: [VOTE] Release Apache ManifoldCF Plugin for Solr 8.x, version 2.2 RC0

2019-12-19 Thread Karl Wright
Tests all pass.

+1 from me.

Karl


On Wed, Dec 18, 2019 at 2:39 AM Karl Wright  wrote:

> Please vote on whether to release the initial version of the Apache
> ManifoldCF Plugin for Solr 8.x, version 2.2, RC0.
>
> The release artifact can be found here:
>
>
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-solr-8.x-plugin-2.2
>
> There is also a tag at:
> https://svn.apache.org/repos/asf/manifoldcf/integration/solr-8.x/tags/release-2.2-RC0
> .
>
> Note that the version "2.2" describes, in part, compatibility with other
> Solr plugins, so is appropriate in this case for an initial version.
>
> Thanks,
> Karl
>
>


[VOTE] Release Apache ManifoldCF Plugin for Solr 8.x, version 2.2 RC0

2019-12-17 Thread Karl Wright
Please vote on whether to release the initial version of the Apache
ManifoldCF Plugin for Solr 8.x, version 2.2, RC0.

The release artifact can be found here:

https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-solr-8.x-plugin-2.2

There is also a tag at:
https://svn.apache.org/repos/asf/manifoldcf/integration/solr-8.x/tags/release-2.2-RC0
.

Note that the version "2.2" describes, in part, compatibility with other
Solr plugins, so is appropriate in this case for an initial version.

Thanks,
Karl


[jira] [Commented] (CONNECTORS-1586) Create plugin for Solr 8.0.0 when available

2019-12-17 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16998823#comment-16998823
 ] 

Karl Wright commented on CONNECTORS-1586:
-

There's already a plugin release created, and I'd like to get it released 
before end of year.  It also has to be compatible back to 8.0.0.  See:  
https://svn.apache.org/repos/asf/manifoldcf/integration/solr-8.x/trunk


> Create plugin for Solr 8.0.0 when available
> ---
>
> Key: CONNECTORS-1586
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1586
> Project: ManifoldCF
>  Issue Type: Task
> Reporter: Shinichiro Abe
> Assignee: Karl Wright
> Priority: Minor
> Attachments: CONNECTORS-1568.patch
>
>
> The plugin for Solr 8.0 release.





[jira] [Assigned] (CONNECTORS-1586) Create plugin for Solr 8.0.0 when available

2019-12-17 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1586:
---

Assignee: Karl Wright

> Create plugin for Solr 8.0.0 when available
> ---
>
> Key: CONNECTORS-1586
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1586
> Project: ManifoldCF
>  Issue Type: Task
> Reporter: Shinichiro Abe
> Assignee: Karl Wright
> Priority: Minor
> Attachments: CONNECTORS-1568.patch
>
>
> The plugin for Solr 8.0 release.





Time for another release

2019-12-13 Thread Karl Wright
It's been a relatively quiet four months for a change, but the time has
come again to push out a release.  There are some important bug fixes for
the new Csws connector that really should be in the shipping artifact, for
one thing.

But I'd like to encourage everyone to pull together anything they're
working on and get it into the code base.  I am hoping to spin RC0 sometime
Christmas week.

Thanks in advance,
Karl


[jira] [Commented] (CONNECTORS-1628) Confluence Connector hang on error

2019-11-18 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16976686#comment-16976686
 ] 

Karl Wright commented on CONNECTORS-1628:
-

Hi [~julienFL], this looks fine, please go ahead and commit to trunk.


> Confluence Connector hang on error
> --
>
> Key: CONNECTORS-1628
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1628
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Confluence connector
> Affects Versions: ManifoldCF 2.14
> Reporter: Julien Massiera
> Assignee: Julien Massiera
> Priority: Critical
> Fix For: ManifoldCF 2.15
>
> Attachments: CONNECTORS-1628.diff
>
>
> During a crawling job, if the Confluence connector encounters error(s) on 
> requests, it hangs, and the only way to get it working again is to restart 
> the MCF agent.
> The reason is that the connector does not release the HTTP response if an 
> exception or an HTTP error is encountered during its processing.
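
The failure mode described above (a response that is never released on the error path) can be illustrated with try-with-resources. This is a sketch only: FakeResponse is a hypothetical stand-in for an HTTP response holding a pooled connection, and this is not the code from CONNECTORS-1628.diff.

```java
final class ResponseRelease {
    /** Hypothetical stand-in for an HTTP response that holds a pooled connection until closed. */
    static final class FakeResponse implements AutoCloseable {
        private final int status;
        boolean released = false;

        FakeResponse(int status) {
            this.status = status;
        }

        int status() {
            return status;
        }

        @Override
        public void close() {
            released = true; // a real response would hand its connection back to the pool here
        }
    }

    /** Processes a response; try-with-resources releases it on success, HTTP error, or exception. */
    static String process(FakeResponse response) {
        try (FakeResponse r = response) {
            if (r.status() != 200) {
                throw new IllegalStateException("HTTP error " + r.status());
            }
            return "ok";
        }
    }

    public static void main(String[] args) {
        FakeResponse failing = new FakeResponse(500);
        try {
            process(failing);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + ", released=" + failing.released);
        }
    }
}
```

If the response were only closed on the success path, each failed request would keep a connection checked out, and a bounded pool would eventually block every worker.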





[jira] [Assigned] (CONNECTORS-1628) Confluence Connector hang on error

2019-11-18 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1628:
---

Assignee: Julien Massiera

> Confluence Connector hang on error
> --
>
> Key: CONNECTORS-1628
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1628
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Confluence connector
> Affects Versions: ManifoldCF 2.14
> Reporter: Julien Massiera
> Assignee: Julien Massiera
> Priority: Critical
> Fix For: ManifoldCF 2.15
>
> Attachments: CONNECTORS-1628.diff
>
>
> During a crawling job, if the Confluence connector encounters error(s) on 
> requests, it hangs, and the only way to get it working again is to restart 
> the MCF agent.
> The reason is that the connector does not release the HTTP response if an 
> exception or an HTTP error is encountered during its processing.





[jira] [Resolved] (CONNECTORS-1627) CSWS Connector: Error tossed: null (ownerRights may be null)

2019-10-25 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1627.
-
Fix Version/s: ManifoldCF 2.15
 Assignee: Karl Wright
   Resolution: Fixed

r1868956

> CSWS Connector: Error tossed: null (ownerRights may be null)
> 
>
> Key: CONNECTORS-1627
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1627
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: LiveLink connector
> Reporter: Markus Schuch
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.15
>
> Attachments: CONNECTORS-1627.patch, screenshot-1.png
>
>
> We encounter documents having object rights with {{ownerRights}} = {{null}} 
> leading to:
> {code}
> FATAL 2019-10-25T10:55:03,839 (Worker thread '15') - Error tossed: null
> java.lang.NullPointerException
>   at 
> org.apache.manifoldcf.crawler.connectors.csws.CswsConnector.processDocuments(CswsConnector.java:1276)
>  ~[?:?]
>   at 
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) 
> [mcf-pull-agent.jar:?]
> {code}
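
A generic null guard for this kind of failure might look like the sketch below. Everything here is hypothetical: ObjectRights is a stand-in type, and treating a null ownerRights as "no owner rights" is an assumption, not necessarily what r1868956 does.

```java
import java.util.Collections;
import java.util.List;

final class OwnerRightsGuard {
    /** Hypothetical stand-in for the object-rights structure; ownerRights may be null. */
    static final class ObjectRights {
        final List<String> ownerRights;

        ObjectRights(List<String> ownerRights) {
            this.ownerRights = ownerRights;
        }
    }

    /** Returns the owner rights, or an empty list when they are absent, instead of dereferencing null. */
    static List<String> ownerRightsOrEmpty(ObjectRights rights) {
        if (rights == null || rights.ownerRights == null) {
            return Collections.emptyList();
        }
        return rights.ownerRights;
    }

    public static void main(String[] args) {
        ObjectRights withoutOwner = new ObjectRights(null);
        System.out.println("owner rights: " + ownerRightsOrEmpty(withoutOwner));
    }
}
```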





Re: Build failed in Jenkins: ManifoldCF-ant #719

2019-10-23 Thread Karl Wright
Hmm, did the Nuxeo download go away?  What do we replace it with?
Karl

On Tue, Oct 22, 2019 at 9:39 PM Apache Jenkins Server <
jenk...@builds.apache.org> wrote:

> See <https://builds.apache.org/job/ManifoldCF-ant/719/display/redirect?page=changes>
>
> Changes:
>
> [kwright] Fix the way attributes are indexed to be compatible with LAPI
> connector
>
>
> --
> [...truncated 453.86 KB...]
> AU  site/src/documentation/resources/images/zh_CN/rss-job-security.PNG
> AU  site/src/documentation/resources/images/zh_CN/rss-job-time-values.PNG
> AU  site/src/documentation/resources/images/zh_CN/rss-job-urls.PNG
> AU  site/src/documentation/resources/images/zh_CN/rss-status.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepoint-configure-authoritytype.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepoint-configure-server.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepoint-job-metadata.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepoint-job-paths.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepoint-job-security.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepoint-status.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepointadauthority-configure-cache.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepointadauthority-configure-dc.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepointadauthority-status.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepointnativeauthority-configure-cache.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepointnativeauthority-configure-server.PNG
> AU  site/src/documentation/resources/images/zh_CN/sharepointnativeauthority-status.PNG
> AU  site/src/documentation/resources/images/zh_CN/simple-history-example.PNG
> AU  site/src/documentation/resources/images/zh_CN/simple-history-select-activities.PNG
> AU  site/src/documentation/resources/images/zh_CN/simple-history-select-connection.PNG
> AU  site/src/documentation/resources/images/zh_CN/solr-configure-arguments.PNG
> AU  site/src/documentation/resources/images/zh_CN/solr-configure-commits.PNG
> AU  site/src/documentation/resources/images/zh_CN/solr-configure-documents.PNG
> AU  site/src/documentation/resources/images/zh_CN/solr-configure-schema.PNG
> AU  site/src/documentation/resources/images/zh_CN/solr-configure-server.PNG
> AU  site/src/documentation/resources/images/zh_CN/solr-configure-solr-type.PNG
> AU  site/src/documentation/resources/images/zh_CN/solr-configure-zookeeper.PNG
> AU  site/src/documentation/resources/images/zh_CN/solr-status.PNG
> AU  site/src/documentation/resources/images/zh_CN/tika-job-exceptions.PNG
> AU  site/src/documentation/resources/images/zh_CN/tika-job-field-mapping.PNG
> AU  site/src/documentation/resources/images/zh_CN/transformation-throttling.PNG
> AU  site/src/documentation/resources/images/zh_CN/view-authority-connection.PNG
> AU  site/src/documentation/resources/images/zh_CN/view-job.PNG
> AU  site/src/documentation/resources/images/zh_CN/view-mapping-connection.PNG
> AU  site/src/documentation/resources/images/zh_CN/view-output-connection.PNG
> AU  site/src/documentation/resources/images/zh_CN/view-repository-connection.PNG
> AU  site/src/documentation/resources/images/zh_CN/view-transformation-connection.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-configure-access-credentials-session-form.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-configure-access-credentials-session.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-configure-access-credentials.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-configure-bandwidth.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-configure-certificates.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-configure-email.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-configure-robots.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-job-canonicalization.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-job-exclusions.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-job-hop-filters.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-job-inclusions.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-job-metadata.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-job-security.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-job-seeds.PNG
> AU  site/src/documentation/resources/images/zh_CN/web-status.PNG
> AU  site/src/documentation/resources/images/zh_CN/welcome-screen.PNG
> AU  site/src/documentation/resources/images/zh_CN/wiki-configure-server.PNG
> AU  site/src/documentation/resources/images/ManifoldCF-logo.PNG
> AU  site/src/documentation/resources/images/lucene_outline_200.gif
> AU

[jira] [Assigned] (CONNECTORS-1626) CSWS Authority does no return all user permissions

2019-10-21 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1626:
---

Assignee: Markus Schuch

> CSWS Authority does no return all user permissions
> --
>
> Key: CONNECTORS-1626
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1626
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: LiveLink connector
> Reporter: Markus Schuch
> Assignee: Markus Schuch
> Priority: Major
>
> Currently the CSWS Authority returns tokens only for groups that a user is 
> directly a member of.
> The CSWS Authority does not return 
> - tokens for transitive group memberships 
> - tokens for project group memberships





Re: [jira] [Commented] (CONNECTORS-1625) When processing a specific PDF Manifold goes out of memory

2019-10-11 Thread Karl Wright
If you call Tika yourself, and you aren't using streams, then that would be
an obvious reason why your memory problems occur in that environment.
Karl
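
To illustrate the point about streams: the sketch below consumes an InputStream through a fixed-size buffer, so heap use stays constant however large the document is, in contrast to reading the whole file into a byte array before parsing. StreamingHandOff is a hypothetical name, and the loop body stands in for whatever consumer (such as a parser) the custom connector feeds.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class StreamingHandOff {
    /** Consumes the stream through an 8 KiB buffer and returns the number of bytes seen. */
    static long consume(InputStream in) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int read;
            while ((read = in.read(buffer)) != -1) {
                // A real consumer would process buffer[0..read) here.
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in for a large document; a connector would pass the repository's stream instead.
        InputStream document = new ByteArrayInputStream(new byte[20000]);
        System.out.println("bytes consumed: " + consume(document));
    }
}
```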


On Fri, Oct 11, 2019 at 9:26 AM Donald Van den Driessche (Jira) <
j...@apache.org> wrote:

>
> [
> https://issues.apache.org/jira/browse/CONNECTORS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949443#comment-16949443
> ]
>
> Donald Van den Driessche commented on CONNECTORS-1625:
> --
>
> After running the same process (with the same config) locally, we had no
> issues.
> So, it might be something with the streams.
>
>
>
> We've written a custom connector to fetch the files. It might use the
> wrong way to provide the file to the Tika parser.
>
> > When processing a specific PDF Manifold goes out of memory
> > --
> >
> > > Key: CONNECTORS-1625
> > > URL: https://issues.apache.org/jira/browse/CONNECTORS-1625
> > > Project: ManifoldCF
> > > Issue Type: Bug
> > > Components: Tika extractor
> > > Affects Versions: ManifoldCF 2.12
> > > Reporter: Donald Van den Driessche
> > > Assignee: Karl Wright
> > > Priority: Major
> > > Attachments: abd-serotec-antibodies-uk.pdf
> > >
> > >
> > > When processing the attached file with ManifoldCF 2.12, we keep getting an
> > > out of memory error.
> > > When just parsing it through Tika 1.18, no issues are found.
> > > Can anyone look into it?
> > > Thanks in advance!
>
>
>
>


Re: Multiprocess-ZK- Inside a docker container(linux)

2019-10-10 Thread Karl Wright
There is nothing special about running in Docker.  Something in your setup
is just incorrect.
Karl

On Thu, Oct 10, 2019 at 2:26 AM Priya Arora  wrote:

> I have tested the multiprocess_zk_example outside Docker, and
> it works fine. But has anybody implemented a way to run all the processes
> of the multi-process example inside Docker?
>
> On Thu, Oct 10, 2019 at 11:32 AM Karl Wright  wrote:
>
>> Well, I confirmed that the multiprocess_zk example scripts work fine
>> outside of docker.  You'll have to debug why it's not working for you I'm
>> afraid.
>>
>> Karl
>>
>>
>> On Thu, Oct 10, 2019 at 12:56 AM Priya Arora  wrote:
>>
>>> Yes , it is same, checked again
>>> [image: image.png]
>>>
>>> On Wed, Oct 9, 2019 at 8:06 PM Karl Wright  wrote:
>>>
>>>> JAVA_HOME should point to the jdk root, not to the bin/java directory.
>>>> Karl
>>>>
>>>> On Wed, Oct 9, 2019 at 10:32 AM Priya Arora 
>>>> wrote:
>>>>
>>>>> Yes JAVA_HOME , has been set as usr/local/openjdk-8/bin/java, as the
>>>>> docker image of Java is based on openjdk
>>>>>
>>>>>
>>>>> > On 09-Oct-2019, at 7:57 PM, Karl Wright  wrote:
>>>>> >
>>>>>
>>>>


Re: Multiprocess-ZK- Inside a docker container(linux)

2019-10-10 Thread Karl Wright
Well, I confirmed that the multiprocess_zk example scripts work fine
outside of docker.  You'll have to debug why it's not working for you I'm
afraid.

Karl


On Thu, Oct 10, 2019 at 12:56 AM Priya Arora  wrote:

> Yes , it is same, checked again
> [image: image.png]
>
> On Wed, Oct 9, 2019 at 8:06 PM Karl Wright  wrote:
>
>> JAVA_HOME should point to the jdk root, not to the bin/java directory.
>> Karl
>>
>> On Wed, Oct 9, 2019 at 10:32 AM Priya Arora  wrote:
>>
>>> Yes JAVA_HOME , has been set as usr/local/openjdk-8/bin/java, as the
>>> docker image of Java is based on openjdk
>>>
>>>
>>> > On 09-Oct-2019, at 7:57 PM, Karl Wright  wrote:
>>> >
>>>
>>


Re: Multiprocess-ZK- Inside a docker container(linux)

2019-10-09 Thread Karl Wright
JAVA_HOME should point to the jdk root, not to the bin/java directory.
Karl
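
A quick way to check this is to verify that the configured directory contains bin/java. The sketch below does that; JavaHomeCheck is a hypothetical helper, not part of the ManifoldCF scripts.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class JavaHomeCheck {
    /** Returns true if javaHome is a directory containing bin/java (or bin/java.exe on Windows). */
    static boolean looksLikeJdkRoot(String javaHome) {
        if (javaHome == null || javaHome.isEmpty()) {
            return false;
        }
        Path root = Paths.get(javaHome);
        if (!Files.isDirectory(root)) {
            return false; // e.g. a value ending in bin/java names a file, not the root
        }
        Path bin = root.resolve("bin");
        return Files.isRegularFile(bin.resolve("java"))
            || Files.isRegularFile(bin.resolve("java.exe"));
    }

    public static void main(String[] args) {
        String javaHome = System.getenv("JAVA_HOME");
        System.out.println("JAVA_HOME=" + javaHome
            + " looks like a JDK root: " + looksLikeJdkRoot(javaHome));
    }
}
```

With the value reported in this thread, the check would fail because the path ends in bin/java rather than naming the directory above bin.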

On Wed, Oct 9, 2019 at 10:32 AM Priya Arora  wrote:

> Yes JAVA_HOME , has been set as usr/local/openjdk-8/bin/java, as the
> docker image of Java is based on openjdk
>
>
> > On 09-Oct-2019, at 7:57 PM, Karl Wright  wrote:
> >
>


Re: Multiprocess-ZK- Inside a docker container(linux)

2019-10-09 Thread Karl Wright
Do you have JAVA_HOME set?
Karl


On Wed, Oct 9, 2019 at 5:01 AM Priya Arora  wrote:

> Hi All,
>
> I am getting the error below when running the runzookeeper.sh script to
> start the multi-process ZooKeeper example inside a Docker container:
>
> root@67d5fbd824c4:/usr/share/manifoldcf/multiprocess-zk-example# sh
> runzookeeper.sh
> runzookeeper.sh: 18: runzookeeper.sh: [[: not found
> executecommand.sh: 18: executecommand.sh: [[: not found
>
> Can somebody suggest why I am getting this error?
>
> Thanks
> Priya
>
> On Wed, Oct 9, 2019 at 1:58 PM Priya Arora  wrote:
>
> > Hi All,
> >
> > Any suggestions would be really helpful.
> >
> > Thanks and regards
> > Priya
> >
> > On Thu, Oct 3, 2019 at 10:33 AM Priya Arora  wrote:
> >
> >> Hi Cihad,
> >>
> >> *Do you run all scripts in only one docker container or multiple?* I
> >> have tried the approach of creating one sh file (which includes calls to
> >> all the other sh files required to start the multiprocess example).
> >> *I think these scripts should run in separate containers.* Does
> >> that mean having 6 different containers for all 6 sh files?
> >>
> >> *I recommend to use mysql or postgresql instead of
> >> start-database[.sh|.bat]* I am using Postgres as the database, and have
> >> defined the configuration settings in properties-global.xml. Will the
> >> database start from this file, or do I still need to run the
> >> start-database.sh file?
> >>
> >> *Some docker related config settings are as below:-*
> >>
> >> RUN wget
> >>
> http://apache.mirror.rafal.ca/manifoldcf/apache-manifoldcf-${MANIFOLDCF_VERSION}/apache-manifoldcf-${MANIFOLDCF_VERSION}-bin.tar.gz
> >> && \
> >> wget
> >>
> https://maven.forgerock.org/repo/repo/eu/agno3/jcifs/jcifs-ng/${CIFS_VERSION}/jcifs-ng-${CIFS_VERSION}.jar
> >> && \
> >> tar -xzvf apache-manifoldcf-${MANIFOLDCF_VERSION}-bin.tar.gz && \
> >> cp -R apache-manifoldcf-${MANIFOLDCF_VERSION} /usr/share/manifoldcf
> >> && \
> >> cp jcifs-ng-${CIFS_VERSION}.jar
> >> /usr/share/manifoldcf/connector-lib-proprietary
> >>
> >> EXPOSE 8345
> >>
> >> WORKDIR /usr/share/manifoldcf/multiprocess-zk-example
> >> COPY config/mf/multiprocess/sh.sh
> >> /usr/share/manifoldcf/multiprocess-zk-example
> >>
> >> CMD [config/mf/multiprocess/sh.sh start]
> >> *sh.sh is the file that calls all the other files:*
> >> #!/bin/bash
> >> sh /usr/share/manifoldcf/multiprocess-zk-example/runzookeeper.sh
> >> sh /usr/share/manifoldcf/multiprocess-zk-example/setglobalproperties.sh
> >> sh /usr/share/manifoldcf/multiprocess-zk-example/start-database.sh
> >> sh /usr/share/manifoldcf/multiprocess-zk-example/initialize.sh
> >> sh /usr/share/manifoldcf/multiprocess-zk-example/start-agents.sh
> >> sh /usr/share/manifoldcf/multiprocess-zk-example/start-agents-2.sh
> >> sh /usr/share/manifoldcf/multiprocess-zk-example/start-webapps.sh
> >>
> >>
> >> Any suggestion would be really helpful.
> >>
> >> Thanks and regards
> >> Priya
> >>
> >> On Tue, Oct 1, 2019 at 7:25 PM Cihad Guzel  wrote:
> >>
> >>> Hi Priya,
> >>>
> >>> Do you run all scripts in only one docker container or multiple? How do
> >>> you use it? I think these scripts should run in separate containers.
> >>>
> >>> There is not a single command to run multiprocess-zk-example. Maybe you
> >>> can run manifoldcf in a single process example. I recommend using mysql
> >>> or postgresql instead of start-database[.sh|.bat] if you want to use it
> >>> in a production environment.
> >>>
> >>> Your container restart problem happens because the script in the
> >>> container terminates. If the script terminates, the docker container
> >>> stops, and it can then start again. Please check your Dockerfile, docker
> >>> run command and the runnable script in the container.
> >>>
> >>> You should give more details about your scripts and Dockerfile for
> >>> more comments.
> >>> Kind Regards,
> >>> Cihad Guzel
> >>>
> >>> On Tue, Oct 1, 2019 at 15:10, Priya Arora wrote:
> >>>
> >>> > Hi All,
> >>> >
> >>> > I am trying to run the multi-process ZooKeeper example inside a docker
> >>> > container.
> >>> > Do we need to follow all of these steps to run multi-process?
> >>> >
> >>> >1. *runzookeeper[.sh|.bat]*
> >>> >2. *setglobalproperties[.sh|.bat]*
> >>> >3. *start-database[.sh|.bat]*
> >>> >4. *initialize[.sh|.bat]*
> >>> >5. *start-agents[.sh|.bat]*, and optionally
> >>> >   *start-agents-2[.sh|.bat]*
> >>> >6. *start-webapps[.sh|.bat]*
> >>> >
> >>> > Is there any single command to run multi-process, since inside a
> >>> > Dockerfile we can configure only one command to start up?
> >>> > I have tried creating a single sh/jar file that calls all the
> >>> > required sh files mentioned above, but the container keeps going
> >>> > into restart mode (every 1-2 minutes).
> >>> > Has anybody tried configuring a multi-process environment inside a
> >>> > docker container?
> >>> >
> >>> > Also i manually 

Re: Error in creating authority connection with Active Directory

2019-10-09 Thread Karl Wright
Hi,
Please first of all subscribe to this list if you are going to post here.
Otherwise I have to moderate your posts into it.
Second, the users list might be a better choice.  But first, you'll need to
provide more information, such as:

(1) Providing the information from the connection view page, so we can see
your configuration and the connection status;
(2) Describing anything at all unique about the Active Directory setup.

Thanks,
Karl


On Wed, Oct 9, 2019 at 7:15 AM muthukumar r 
wrote:

> We are trying to use the authority service in ManifoldCF to connect to Active
> Directory to get the user tokens, but we are getting a dead authority error.
> Kindly help me to resolve the issue.
>


[jira] [Resolved] (CONNECTORS-1117) Create a livelink connector using the LiveLink REST API

2019-10-08 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1117.
-
Fix Version/s: (was: ManifoldCF next)
   Resolution: Duplicate

> Create a livelink connector using the LiveLink REST API
> ---
>
> Key: CONNECTORS-1117
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1117
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: LiveLink connector
>    Reporter: Karl Wright
>Assignee: Kishore Kumar
>Priority: Major
>  Labels: gsoc2015, gsoc2016
>
> LAPI is deprecated, so develop a connector that doesn't use it.
> Here's API documentation:
> https://developer.opentext.com/awd/forums/questions/1130441#r1130448



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Technical question on repo connector dev

2019-10-05 Thread Karl Wright
Yes, that is what I suggest.
Karl


On Sat, Oct 5, 2019 at 8:42 AM  wrote:

> Hi Karl,
>
> Thanks for the answer.
>
> Is your suggestion something like :
>
> processDocuments(...) {
>     if (documentIdentifier.isURI) {
>         jsonDocs = getJsonDocsFromURI(documentIdentifier)
>         jsonDocs.foreach(jsonDoc -> {
>             String jsonDocID = "jsonDoc+" + jsonDoc.toJsonString();
>             activities.addDocumentReference(jsonDocID);
>         })
>     } else if (documentIdentifier.isJsonDoc) {
>         jsonDoc = getJsonDoc(documentIdentifier)
>         jsonDocVersion = jsonDoc.getVersion()
>         jsonDocUri = jsonDoc.getUri();
>
>         if (activities.checkDocumentNeedsReindexing(documentIdentifier, jsonDocVersion)) {
>             activities.ingestDocumentWithException(documentIdentifier, jsonDoc, jsonDocUri)
>         }
>     }
> }
>
> ?
>
> Julien
>
> -----Original Message-----
> From: Karl Wright
> Sent: Friday, October 4, 2019 21:07
> To: dev
> Subject: Re: Technical question on repo connector dev
>
> Hi Julien,
>
> The checkDocumentNeedsReindexing() method is meant to be used inside
> processDocuments() for the specific document you are checking.  So you can
> convert your URI to a set of JSON documents, if the document identifier is
> a URI.  But you will probably want to put the actual data for the document
> in carrydown information.  You will also need to create some kind of
> non-URI document ID.
>
> Karl
>
>
> On Fri, Oct 4, 2019 at 1:36 PM  wrote:
>
> > Hi,
> >
> >
> >
> > I am facing a simple technical case that I am not sure how to deal
> > with, concerning the development of a repository connector.
> >
> >
> >
> > I want to develop a repo connector using the ADD_CHANGE_DELETE model
> > that will normally add seed documents, and each seed document will
> > produce several documents.
> > The problem is that each produced document from a seed doc is
> > instantly ingest-able and does not need to be processed.
> >
> >
> >
> > The use case here is that the addSeedDocuments method will call an API
> > that will provide several URIs (seeds).
> >
> > In the processDocuments method, each URI provides a JSON array
> > containing JSON objects and those JSON objects are meant to become
> > repository documents and ingested.
> > So the logic would be to use the activities.addDocumentReference for
> > each JSON object before I can use the
> > activities.checkDocumentNeedsReindexing
> > (each JSON object has an id and a version field) and then ingest the
> > document. But by doing this, I am afraid that the processDocuments
> > method will be called with those newly referenced docs while they do
> > not need to be processed.
> >
> >
> >
> > Any suggestion about how to deal with this use case is welcome.
> >
> >
> >
> > Thanks,
> > Julien
> >
> >
>
>


Re: Technical question on repo connector dev

2019-10-04 Thread Karl Wright
Hi Julien,

The checkDocumentNeedsReindexing() method is meant to be used inside
processDocuments() for the specific document you are checking.  So you can
convert your URI to a set of JSON documents, if the document identifier is
a URI.  But you will probably want to put the actual data for the document
in carrydown information.  You will also need to create some kind of
non-URI document ID.
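
A minimal sketch of that two-phase flow, for illustration only.  The
CrawlActivities interface, the InMemoryActivities class, the "jsondoc:"
prefix and the {id, version, body} layout are all hypothetical stand-ins and
are not the real ManifoldCF interfaces; they only mimic the shape of
"register a child with carrydown data, then check the version before
ingesting".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the framework's activities object; NOT the real ManifoldCF API. */
interface CrawlActivities {
  void addChild(String childId, String parentId, String carrydownData);
  String carrydown(String childId);
  boolean needsReindexing(String docId, String version);
  void ingest(String docId, String version, String body);
}

/** In-memory implementation that exists only to exercise the sketch. */
class InMemoryActivities implements CrawlActivities {
  final Map<String, String> carrydownData = new HashMap<>();
  final Map<String, String> indexedVersions = new HashMap<>();
  final List<String> ingested = new ArrayList<>();

  public void addChild(String childId, String parentId, String data) {
    carrydownData.put(childId, data);
  }
  public String carrydown(String childId) {
    return carrydownData.get(childId);
  }
  public boolean needsReindexing(String docId, String version) {
    return !version.equals(indexedVersions.get(docId));
  }
  public void ingest(String docId, String version, String body) {
    indexedVersions.put(docId, version);
    ingested.add(docId);
  }
}

class TwoPhaseConnectorSketch {
  /** Assumed prefix that marks a non-URI child document identifier. */
  static final String CHILD_PREFIX = "jsondoc:";

  /** seedContents maps a seed URI to its JSON objects, each modelled here as {id, version, body}. */
  static void processDocument(String docId,
                              Map<String, List<String[]>> seedContents,
                              CrawlActivities activities) {
    if (!docId.startsWith(CHILD_PREFIX)) {
      // Phase 1: the identifier is a seed URI. Register one child per JSON object
      // and hand its version and body along as carrydown data.
      List<String[]> objects =
          seedContents.getOrDefault(docId, Collections.<String[]>emptyList());
      for (String[] obj : objects) {
        activities.addChild(CHILD_PREFIX + obj[0], docId, obj[1] + "\n" + obj[2]);
      }
    } else {
      // Phase 2: the identifier is a child. Read the carried data instead of
      // fetching again, and ingest only when the version has changed.
      String carried = activities.carrydown(docId);
      if (carried == null) {
        return;
      }
      int split = carried.indexOf('\n');
      String version = carried.substring(0, split);
      String body = carried.substring(split + 1);
      if (activities.needsReindexing(docId, version)) {
        activities.ingest(docId, version, body);
      }
    }
  }
}
```

In this sketch the second call for a child is cheap: it only unpacks the
carried data and compares versions, so being called again for the newly
referenced documents does not mean fetching them again.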

Karl


On Fri, Oct 4, 2019 at 1:36 PM  wrote:

> Hi,
>
>
>
> I am facing a simple technical case that I am not sure how to deal with,
> concerning the development of a repository connector.
>
>
>
> I want to develop a repo connector using the ADD_CHANGE_DELETE model that
> will normally add seed documents, and each seed document will produce
> several documents.
> The problem is that each produced document from a seed doc is instantly
> ingest-able and does not need to be processed.
>
>
>
> The use case here is that the addSeedDocuments method will call an API that
> will provide several URIs (seeds).
>
> In the processDocuments method, each URI provides a JSON array containing
> JSON objects and those JSON objects are meant to become repository
> documents
> and ingested.
> So the logic would be to use the activities.addDocumentReference for each
> JSON object before I can use the activities.checkDocumentNeedsReindexing
> (each JSON object has an id and a version field) and then ingest the
> document. But by doing this, I am afraid that the processDocuments method
> will be called with those newly referenced docs while they do not need to
> be
> processed.
>
>
>
> Any suggestion about how to deal with this use case is welcome.
>
>
>
> Thanks,
> Julien
>
>


[jira] [Commented] (CONNECTORS-1625) When processing a specific PDF Manifold goes out of memory

2019-10-04 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944386#comment-16944386
 ] 

Karl Wright commented on CONNECTORS-1625:
-

Also, FWIW, the default Java memory sizes on the example are not guaranteed to 
allow processing of N simultaneous Tika extractions (one per worker thread) of 
the sort that require more memory.  Memory sizes allocated to the JVM are 
settable in the start-options files, and the first thing you want to do is 
increase those values to see if the problem goes away for you.
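
A rough sketch of the arithmetic behind this, assuming purely illustrative 
numbers: the worker thread count, the per-extraction memory figure and the 
baseline below are made-up values, not ManifoldCF defaults or measurements.  
The only real API used is Runtime.maxMemory(), which reports the heap limit 
the running JVM was started with.

```java
class HeapBudgetSketch {
  /**
   * Returns true when the heap limit covers the baseline plus one
   * memory-hungry extraction per worker thread running at the same time.
   */
  static boolean heapCovers(long maxHeapBytes, int workerThreads,
                            long perExtractionBytes, long baselineBytes) {
    long needed = baselineBytes + (long) workerThreads * perExtractionBytes;
    return maxHeapBytes >= needed;
  }

  public static void main(String[] args) {
    long maxHeap = Runtime.getRuntime().maxMemory();
    int workerThreads = 30;               // assumed value, check your own configuration
    long perExtraction = 200L << 20;      // assumed 200 MiB for a heavy PDF
    long baseline = 256L << 20;           // assumed 256 MiB for everything else
    System.out.println("JVM max heap (MiB): " + (maxHeap >> 20));
    System.out.println("Covers worst case: "
        + heapCovers(maxHeap, workerThreads, perExtraction, baseline));
  }
}
```

If the check comes out false for realistic numbers, raising the JVM memory 
settings or lowering the worker thread count are the two obvious levers.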


> When processing a specific PDF Manifold goes out of memory
> --
>
> Key: CONNECTORS-1625
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1625
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Tika extractor
>Affects Versions: ManifoldCF 2.12
>Reporter: Donald Van den Driessche
>Assignee: Karl Wright
>Priority: Major
> Attachments: abd-serotec-antibodies-uk.pdf
>
>
> When processing attached file with manifoldcf 2.12, we keep getting an out of 
> memory error.
> When just parsing it throug Tika 1.18, no issues are being found.
> Can anyone look into it?
> Thanks in advance!





[jira] [Comment Edited] (CONNECTORS-1625) When processing a specific PDF Manifold goes out of memory

2019-10-04 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944385#comment-16944385
 ] 

Karl Wright edited comment on CONNECTORS-1625 at 10/4/19 10:29 AM:
---

What version of Manifold is this?  2.12 is pretty old by Tika standards.  We 
pretty much upgrade Tika continuously at this point and if it's not the current 
version you are running old Tika code.




was (Author: kwri...@metacarta.com):
What version of Manifold is this?
We pretty much upgrade Tika continuously at this point and if it's not the 
current version you are running old Tika code.



> When processing a specific PDF Manifold goes out of memory
> --
>
> Key: CONNECTORS-1625
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1625
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Tika extractor
>Affects Versions: ManifoldCF 2.12
>Reporter: Donald Van den Driessche
>Assignee: Karl Wright
>Priority: Major
> Attachments: abd-serotec-antibodies-uk.pdf
>
>
> When processing attached file with manifoldcf 2.12, we keep getting an out of 
> memory error.
> When just parsing it through Tika 1.18, no issues are found.
> Can anyone look into it?
> Thanks in advance!





[jira] [Assigned] (CONNECTORS-1625) When processing a specific PDF Manifold goes out of memory

2019-10-04 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1625:
---

Assignee: Karl Wright

> When processing a specific PDF Manifold goes out of memory
> --
>
> Key: CONNECTORS-1625
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1625
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Tika extractor
>Affects Versions: ManifoldCF 2.12
>Reporter: Donald Van den Driessche
>Assignee: Karl Wright
>Priority: Major
> Attachments: abd-serotec-antibodies-uk.pdf
>
>
> When processing attached file with manifoldcf 2.12, we keep getting an out of 
> memory error.
> When just parsing it through Tika 1.18, no issues are found.
> Can anyone look into it?
> Thanks in advance!





Re: [RESULT] [VOTE] Release Apache ManifoldCF 2.14, RC2

2019-10-03 Thread Karl Wright
The integration tests always run slowly.  They take 45 minutes for me to
complete.
Karl

On Thu, Oct 3, 2019 at 9:18 AM Piergiorgio Lucidi 
wrote:

> It compiles and the tests are running... very slowly at the moment :-P
>
> It seems ok until now!
>
> On Thu, Oct 3, 2019 at 15:15, Karl Wright wrote:
>
> > Three +1's, >72 hours.  Vote passes.
> >
> > On Thu, Oct 3, 2019 at 8:34 AM  wrote:
> >
> > > Hi,
> > >
> > > Compiles and tests ok.
> > > +1
> > >
> > > Julien
> > >
> > > > -----Original Message-----
> > > > From: Karl Wright
> > > > Sent: Thursday, October 3, 2019 11:55
> > > > To: dev
> > > > Subject: Re: [VOTE] Release Apache ManifoldCF 2.14, RC2
> > >
> > > I need one more +1 still.
> > >
> > > Karl
> > >
> > >
> > > On Mon, Sep 30, 2019 at 5:11 PM Furkan KAMACI 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > +1 from me.
> > > >
> > > > I checked:
> > > > - LICENSE and NOTICE are fine
> > > > - NOTICE needs to be updated (check my comment below)
> > > > - No unexpected binary files
> > > > - Checked PGP signatures
> > > > - Checked Checksums
> > > > - Code compiles and tests successfully run
> > > >
> > > > There is only one minor objection from me which is NOTICE file does
> > > > not include up-to-date year for copyright. I've created a PR for it.
> > > >
> > > > Kind Regards,
> > > > Furkan KAMACI
> > > >
> > > > > On Mon, Sep 30, 2019 at 11:22 AM Karl Wright wrote:
> > > > >
> > > > > > All tests pass.  +1 from me.
> > > > > > Can anyone else look at this, especially Cihad?  I'm not expecting
> > > > > > this to fix his import but it will possibly fix his linkage error.
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > >
> > > > > > On Wed, Sep 25, 2019 at 9:28 PM Karl Wright wrote:
> > > > > >
> > > > > > > Please vote on whether to release Apache ManifoldCF 2.14, RC2.
> > > > > > >
> > > > > > > There is a release tag at
> > > > > > > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC2 .
> > > > > > > There is a release artifact at
> > > > > > > https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14 .
> > > > > > >
> > > > > > > Note that this release of ManifoldCF has a new connector -- the
> > > > > > > Content Services Web Services connector.  In order to integrate
> > > > > > > this connector, changes to the dependencies for web services had
> > > > > > > to be made, so I urge those using the Alfresco and CMIS connectors
> > > > > > > to verify this release if at all possible.  Integration tests pass
> > > > > > > but there's nothing like an on-site test.
> > > > > > >
> > > > > > > RC2 also contains a fix for CONNECTORS-1623.
> > > > > > >
> > > > > > > RC2 also removes duplicate jars appearing in connector-lib that are
> > > > > > > also in the root directory.  This could potentially cause runtime
> > > > > > > link errors.
> > > > > >
> > > > > > Thanks,
> > > > > > Karl
> > > > > >
> > > > >
> > > >
> > >
> > >
> >
>
>
> --
> Piergiorgio
>


[RESULT] [VOTE] Release Apache ManifoldCF 2.14, RC2

2019-10-03 Thread Karl Wright
Three +1's, >72 hours.  Vote passes.

On Thu, Oct 3, 2019 at 8:34 AM  wrote:

> Hi,
>
> Compiles and tests ok.
> +1
>
> Julien
>
> -----Original Message-----
> From: Karl Wright
> Sent: Thursday, October 3, 2019 11:55
> To: dev
> Subject: Re: [VOTE] Release Apache ManifoldCF 2.14, RC2
>
> I need one more +1 still.
>
> Karl
>
>
> On Mon, Sep 30, 2019 at 5:11 PM Furkan KAMACI 
> wrote:
>
> > Hi,
> >
> > +1 from me.
> >
> > I checked:
> > - LICENSE and NOTICE are fine
> > - NOTICE needs to be updated (check my comment below)
> > - No unexpected binary files
> > - Checked PGP signatures
> > - Checked Checksums
> > - Code compiles and tests successfully run
> >
> > There is only one minor objection from me which is NOTICE file does
> > not include up-to-date year for copyright. I've created a PR for it.
> >
> > Kind Regards,
> > Furkan KAMACI
> >
> > On Mon, Sep 30, 2019 at 11:22 AM Karl Wright wrote:
> >
> > > All tests pass.  +1 from me.
> > > Can anyone else look at this, especially Cihad?  I'm not expecting
> > > this to fix his import but it will possibly fix his linkage error.
> > >
> > > Karl
> > >
> > >
> > > On Wed, Sep 25, 2019 at 9:28 PM Karl Wright wrote:
> > >
> > > > Please vote on whether to release Apache ManifoldCF 2.14, RC2.
> > > >
> > > > There is a release tag at
> > > > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC2 .
> > > > There is a release artifact at
> > > > https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14 .
> > > >
> > > > Note that this release of ManifoldCF has a new connector -- the
> > > > Content Services Web Services connector.  In order to integrate
> > > > this connector, changes to the dependencies for web services had
> > > > to be made, so I urge those using the Alfresco and CMIS connectors
> > > > to verify this release if at all possible.  Integration tests pass
> > > > but there's nothing like an on-site test.
> > > >
> > > > RC2 also contains a fix for CONNECTORS-1623.
> > > >
> > > > RC2 also removes duplicate jars appearing in connector-lib that are
> > > > also in the root directory.  This could potentially cause runtime
> > > > link errors.
> > > >
> > > > Thanks,
> > > > Karl
> > > >
> > >
> >
>
>


Re: [VOTE] Release Apache ManifoldCF 2.14, RC2

2019-09-30 Thread Karl Wright
All tests pass.  +1 from me.
Can anyone else look at this, especially Cihad?  I'm not expecting this to
fix his import but it will possibly fix his linkage error.

Karl


On Wed, Sep 25, 2019 at 9:28 PM Karl Wright  wrote:

> Please vote on whether to release Apache ManifoldCF 2.14, RC2.
>
> There is a release tag at
> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC2 .
> There is a release artifact at
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14 .
>
> Note that this release of ManifoldCF has a new connector -- the Content
> Services Web Services connector.  In order to integrate this connector,
> changes to the dependencies for web services had to be made, so I urge
> those using the Alfresco and CMIS connectors to verify this release if at
> all possible.  Integration tests pass but there's nothing like an on-site
> test.
>
> RC2 also contains a fix for CONNECTORS-1623.
>
> RC2 also removes duplicate jars appearing in connector-lib that are also in
> the root directory.  This could potentially cause runtime link errors.
>
> Thanks,
> Karl
>


[VOTE] Release Apache ManifoldCF 2.14, RC2

2019-09-25 Thread Karl Wright
Please vote on whether to release Apache ManifoldCF 2.14, RC2.

There is a release tag at
https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC2 .
There is a release artifact at
https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14 .

Note that this release of ManifoldCF has a new connector -- the Content
Services Web Services connector.  In order to integrate this connector,
changes to the dependencies for web services had to be made, so I urge
those using the Alfresco and CMIS connectors to verify this release if at
all possible.  Integration tests pass but there's nothing like an on-site
test.

RC2 also contains a fix for CONNECTORS-1623.

RC2 also removes duplicate jars appearing in connector-lib that are also in
the root directory.  This could potentially cause runtime link errors.

Thanks,
Karl


[CANCEL][VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
Vote canceled to address the duplicate jars Cihad found.
Karl

On Wed, Sep 25, 2019 at 8:53 PM Karl Wright  wrote:

> It's the generic connector that sucks this in.  I'm going to disable it
> and see if everything still works or not.
>
> Karl
>
>
> On Wed, Sep 25, 2019 at 8:42 PM Karl Wright  wrote:
>
>> I looked at the JSP exception.
>>
>> Velocity 1.7 (which is what we include) depends on Commons Collections
>> 3.2.1.  We include version 3.2.2, which should be fine, and it indeed has
>> that class:
>>
>> >>>>>>
>> C:\wip\mcf\trunk\lib>dir commons-collections-3.2.2.jar
>>  Volume in drive C is Windows
>>  Volume Serial Number is F4D8-E4E0
>>
>>  Directory of C:\wip\mcf\trunk\lib
>>
>> 09/06/2019  04:03 PM           588,337 commons-collections-3.2.2.jar
>>                1 File(s)        588,337 bytes
>>                0 Dir(s)  179,446,546,432 bytes free
>>
>> C:\wip\mcf\trunk\lib>tar tf commons-collections-3.2.2.jar | grep ExtendedProperties
>> org/apache/commons/collections/ExtendedProperties$PropertiesTokenizer.class
>> org/apache/commons/collections/ExtendedProperties$1.class
>> org/apache/commons/collections/ExtendedProperties$PropertiesReader.class
>> org/apache/commons/collections/ExtendedProperties.class
>>
>> C:\wip\mcf\trunk\lib>
>> <<<<<<
>>
>> The only other question is whether it's in the right place.  I noted that
>> the velocity jar and the commons-collections jar appear BOTH at the root
>> level AND in connector-lib.  This doesn't seem right; if it's at the root
>> level it should not be in connector-lib too.  Let me see if I can figure
>> out why that is there.
>>
>> Karl
>>
>>
>> On Wed, Sep 25, 2019 at 8:20 PM Karl Wright  wrote:
>>
>>> How was this JSON generated?  By hand?  Or exported?  If exported, by
>>> what version of MCF?
>>>
>>> I ask because there was an issue that was fixed back around 2.10 or so
>>> (if memory serves) having to do with the export of JSON.  The JSON format
>>> previously emitted did not preserve order in some cases; after the change,
>>> it did.  Might be related to API usage in this case; can't be sure.
>>>
>>> At any rate, the problem is coming from viewing a transformation step in
>>> your pipeline.  What transformation connections are you using?
>>>
>>> Karl
>>>
>>>
>>>
>>>
>>> On Wed, Sep 25, 2019 at 7:01 PM Cihad Guzel  wrote:
>>>
>>>> Hi Karl,
>>>>
>>>> Shortly;
>>>> 1- I have created a job via ManifoldCF API. There is not any problem,
>>>> no error, all is well.
>>>> 2- Then I try to open the Edit Job page from UI and UI won't display it
>>>> properly. There is the 500 error on the popup.
>>>>
>>>> I didn't move my current installation. I have downloaded new ManifoldCF
>>>> version and have run in a clean new environment. Then, I have sent the JSON
>>>> object to my new ManifoldCF API.
>>>>
>>>> I agree with you on this:
>>>> "The actual error you are getting is because of a JSP exception. The
>>>> JSP exception is not being handled properly, possibly because of a version
>>>> mismatch between the included version of Velocity and Apache Commons
>>>> Collections. "
>>>>
>>>> Regards,
>>>> Cihad Guzel
>>>>
>>>> On Thu, Sep 26, 2019 at 01:31, Karl Wright wrote:
>>>>
>>>>> The actual error you are getting is because of a JSP exception.  The
>>>>> JSP exception is not being handled properly, possibly because of a version
>>>>> mismatch between the included version of Velocity and Apache Commons
>>>>> Collections.  But if that is real, it's odd because it's occurring when
>>>>> the velocity context is being initialized!! That means it should be
>>>>> happening on every page load in the UI, and it most certainly does not --
>>>>> not for me, not for anyone who ran the tests.  So I suspect a setup problem
>>>>> of some kind, especially if you're trying to move your 2.12 installation to
>>>>> 2.14 in place.  Lots of jars moved around and you might have a jar conflict.
>>>>>
>>>>> Karl
>>>>>
>>>>> On Wed, Sep 25, 2019 at 6:25 PM K

Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
It's the generic connector that sucks this in.  I'm going to disable it and
see if everything still works or not.

Karl


On Wed, Sep 25, 2019 at 8:42 PM Karl Wright  wrote:

> I looked at the JSP exception.
>
> Velocity 1.7 (which is what we include) depends on Commons Collections
> 3.2.1.  We include version 3.2.2, which should be fine, and it indeed has
> that class:
>
> >>>>>>
> C:\wip\mcf\trunk\lib>dir commons-collections-3.2.2.jar
>  Volume in drive C is Windows
>  Volume Serial Number is F4D8-E4E0
>
>  Directory of C:\wip\mcf\trunk\lib
>
> 09/06/2019  04:03 PM           588,337 commons-collections-3.2.2.jar
>                1 File(s)        588,337 bytes
>                0 Dir(s)  179,446,546,432 bytes free
>
> C:\wip\mcf\trunk\lib>tar tf commons-collections-3.2.2.jar | grep ExtendedProperties
> org/apache/commons/collections/ExtendedProperties$PropertiesTokenizer.class
> org/apache/commons/collections/ExtendedProperties$1.class
> org/apache/commons/collections/ExtendedProperties$PropertiesReader.class
> org/apache/commons/collections/ExtendedProperties.class
>
> C:\wip\mcf\trunk\lib>
> <<<<<<
>
> The only other question is whether it's in the right place.  I noted that
> the velocity jar and the commons-collections jar appear BOTH at the root
> level AND in connector-lib.  This doesn't seem right; if it's at the root
> level it should not be in connector-lib too.  Let me see if I can figure
> out why that is there.
>
> Karl
>
>
> On Wed, Sep 25, 2019 at 8:20 PM Karl Wright  wrote:
>
>> How was this JSON generated?  By hand?  Or exported?  If exported, by
>> what version of MCF?
>>
>> I ask because there was an issue that was fixed back around 2.10 or so
>> (if memory serves) having to do with the export of JSON.  The JSON format
>> previously emitted did not preserve order in some cases; after the change,
>> it did.  Might be related to API usage in this case; can't be sure.
>>
>> At any rate, the problem is coming from viewing a transformation step in
>> your pipeline.  What transformation connections are you using?
>>
>> Karl
>>
>>
>>
>>
>> On Wed, Sep 25, 2019 at 7:01 PM Cihad Guzel  wrote:
>>
>>> Hi Karl,
>>>
>>> Shortly;
>>> 1- I have created a job via ManifoldCF API. There is not any problem, no
>>> error, all is well.
>>> 2- Then I try to open the Edit Job page from UI and UI won't display it
>>> properly. There is the 500 error on the popup.
>>>
>>> I didn't move my current installation. I have downloaded new ManifoldCF
>>> version and have run in a clean new environment. Then, I have sent the JSON
>>> object to my new ManifoldCF API.
>>>
>>> I agree with you on this:
>>> "The actual error you are getting is because of a JSP exception. The JSP
>>> exception is not being handled properly, possibly because of a version
>>> mismatch between the included version of Velocity and Apache Commons
>>> Collections. "
>>>
>>> Regards,
>>> Cihad Guzel
>>>
>>> On Thu, Sep 26, 2019 at 01:31, Karl Wright wrote:
>>>
>>>> The actual error you are getting is because of a JSP exception.  The
>>>> JSP exception is not being handled properly, possibly because of a version
>>>> mismatch between the included version of Velocity and Apache Commons
>>>> Collections.  But if that is real, it's odd because it's occurring when the
>>>> velocity context is being initialized!! That means it should be happening
>>>> on every page load in the UI, and it most certainly does not -- not for me,
>>>> not for anyone who ran the tests.  So I suspect a setup problem of some
>>>> kind, especially if you're trying to move your 2.12 installation to 2.14 in
>>>> place.  Lots of jars moved around and you might have a jar conflict.
>>>>
>>>> Karl
>>>>
>>>> On Wed, Sep 25, 2019 at 6:25 PM Karl Wright  wrote:
>>>>
>>>>> Hi Cihad,
>>>>> Using the API should not ever invoke JSP pages, so I do not think you
>>>>> are actually communicating with the API.  Unless what you are saying is
>>>>> that you create the job, and then the UI won't display it properly?
>>>>>
>>>>> Karl
>>>>>
>>>>>
>>>>> On Wed, Sep 25, 2019 at 6:11 PM Cihad Guzel  wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>

Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
I looked at the JSP exception.

Velocity 1.7 (which is what we include) depends on Commons Collections
3.2.1.  We include version 3.2.2, which should be fine, and it indeed has
that class:

>>>>>>
C:\wip\mcf\trunk\lib>dir commons-collections-3.2.2.jar
 Volume in drive C is Windows
 Volume Serial Number is F4D8-E4E0

 Directory of C:\wip\mcf\trunk\lib

09/06/2019  04:03 PM           588,337 commons-collections-3.2.2.jar
               1 File(s)        588,337 bytes
               0 Dir(s)  179,446,546,432 bytes free

C:\wip\mcf\trunk\lib>tar tf commons-collections-3.2.2.jar | grep ExtendedProperties
org/apache/commons/collections/ExtendedProperties$PropertiesTokenizer.class
org/apache/commons/collections/ExtendedProperties$1.class
org/apache/commons/collections/ExtendedProperties$PropertiesReader.class
org/apache/commons/collections/ExtendedProperties.class

C:\wip\mcf\trunk\lib>
<<<<<<

The only other question is whether it's in the right place.  I noted that
the velocity jar and the commons-collections jar appear BOTH at the root
level AND in connector-lib.  This doesn't seem right; if it's at the root
level it should not be in connector-lib too.  Let me see if I can figure
out why that is there.
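
A minimal sketch of one way to look for such duplicates, assuming the two
directories are passed on the command line; the class name and the default
paths "lib" and "connector-lib" are placeholders for illustration, not part
of the ManifoldCF build.  It only compares jar file names, so two copies with
different version suffixes would not be reported.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

class DuplicateJarSketch {
  /** Lists the file names of all *.jar files directly inside dir. */
  static List<String> listJarNames(Path dir) throws IOException {
    List<String> names = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.jar")) {
      for (Path p : stream) {
        names.add(p.getFileName().toString());
      }
    }
    return names;
  }

  /** Returns the names present in both collections, sorted alphabetically. */
  static List<String> duplicates(Collection<String> first, Collection<String> second) {
    TreeSet<String> common = new TreeSet<>(first);
    common.retainAll(new HashSet<>(second));
    return new ArrayList<>(common);
  }

  public static void main(String[] args) throws IOException {
    // Placeholder defaults; pass the real directories as arguments.
    Path root = Paths.get(args.length > 0 ? args[0] : "lib");
    Path connectorLib = Paths.get(args.length > 1 ? args[1] : "connector-lib");
    for (String name : duplicates(listJarNames(root), listJarNames(connectorLib))) {
      System.out.println("present in both: " + name);
    }
  }
}
```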

Karl


On Wed, Sep 25, 2019 at 8:20 PM Karl Wright  wrote:

> How was this JSON generated?  By hand?  Or exported?  If exported, by what
> version of MCF?
>
> I ask because there was an issue that was fixed back around 2.10 or so (if
> memory serves) having to do with the export of JSON.  The JSON format
> previously emitted did not preserve order in some cases; after the change,
> it did.  Might be related to API usage in this case; can't be sure.
>
> At any rate, the problem is coming from viewing a transformation step in
> your pipeline.  What transformation connections are you using?
>
> Karl
>
>
>
>
> On Wed, Sep 25, 2019 at 7:01 PM Cihad Guzel  wrote:
>
>> Hi Karl,
>>
>> Shortly;
>> 1- I have created a job via ManifoldCF API. There is not any problem, no
>> error, all is well.
>> 2- Then I try to open the Edit Job page from UI and UI won't display it
>> properly. There is the 500 error on the popup.
>>
>> I didn't move my current installation. I have downloaded new ManifoldCF
>> version and have run in a clean new environment. Then, I have sent the JSON
>> object to my new ManifoldCF API.
>>
>> I agree with you on this:
>> "The actual error you are getting is because of a JSP exception. The JSP
>> exception is not being handled properly, possibly because of a version
>> mismatch between the included version of Velocity and Apache Commons
>> Collections. "
>>
>> Regards,
>> Cihad Guzel
>>
>> On Thu, Sep 26, 2019 at 01:31, Karl Wright wrote:
>>
>>> The actual error you are getting is because of a JSP exception.  The JSP
>>> exception is not being handled properly, possibly because of a version
>>> mismatch between the included version of Velocity and Apache Commons
>>> Collections.  But if that is real, it's odd because it's occurring when the
>>> velocity context is being initialized!! That means it should be happening
>>> on every page load in the UI, and it most certainly does not -- not for me,
>>> not for anyone who ran the tests.  So I suspect a setup problem of some
>>> kind, especially if you're trying to move your 2.12 installation to 2.14 in
>>> place.  Lots of jars moved around and you might have a jar conflict.
>>>
>>> Karl
>>>
>>> On Wed, Sep 25, 2019 at 6:25 PM Karl Wright  wrote:
>>>
>>>> Hi Cihad,
>>>> Using the API should not ever invoke JSP pages, so I do not think you
>>>> are actually communicating with the API.  Unless what you are saying is
>>>> that you create the job, and then the UI won't display it properly?
>>>>
>>>> Karl
>>>>
>>>>
>>>> On Wed, Sep 25, 2019 at 6:11 PM Cihad Guzel  wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Firstly, thanks @Furkan, all tests now pass with Java 8.
>>>>>
>>>>> On the other hand, I tested the ManifoldCF API and there are some errors.
>>>>> I have created a job using the ManifoldCF API successfully (versions: 2.12,
>>>>> 2.13, 2.14). Then I get an HTTP 500 error if I click
>>>>> "mcf-crawler-ui/editjob.jsp". I get this error only when I use
>>>>> ManifoldCF 2.13 and ManifoldCF 2.14. I don't get any error when I use
>>>>> ManifoldCF 2.12.
>>>>>
>>>>> I se

Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
How was this JSON generated?  By hand?  Or exported?  If exported, by what
version of MCF?

I ask because there was an issue that was fixed back around 2.10 or so (if
memory serves) having to do with the export of JSON.  The JSON format
previously emitted did not preserve order in some cases; after the change,
it did.  Might be related to API usage in this case; can't be sure.

At any rate, the problem is coming from viewing a transformation step in
your pipeline.  What transformation connections are you using?

Karl




On Wed, Sep 25, 2019 at 7:01 PM Cihad Guzel  wrote:

> Hi Karl,
>
> In short:
> 1. I created a job via the ManifoldCF API. There was no problem and no
> error; all is well.
> 2. I then try to open the Edit Job page in the UI, and the UI won't display
> it properly. A 500 error appears in the popup.
>
> I didn't move my existing installation. I downloaded the new ManifoldCF
> version and ran it in a clean new environment. Then I sent the JSON
> object to the new ManifoldCF API.
>
> I agree with you on this:
> "The actual error you are getting is because of a JSP exception. The JSP
> exception is not being handled properly, possibly because of a version
> mismatch between the included version of Velocity and Apache Commons
> Collections. "
>
> Regards,
> Cihad Guzel
>
> On Thu, Sep 26, 2019 at 1:31 AM Karl Wright wrote:
>
>> The actual error you are getting is because of a JSP exception.  The JSP
>> exception is not being handled properly, possibly because of a version
>> mismatch between the included version of Velocity and Apache Commons
>> Collections.  But if that is real, it's odd because it's occurring when the
>> velocity context is being initialized!! That means it should be happening
>> on every page load in the UI, and it most certainly does not -- not for me,
>> not for anyone who ran the tests.  So I suspect a setup problem of some
>> kind, especially if you're trying to move your 2.12 installation to 2.14 in
>> place.  Lots of jars moved around and you might have a jar conflict.
>>
>> Karl
>>
>> On Wed, Sep 25, 2019 at 6:25 PM Karl Wright  wrote:
>>
>>> Hi Cihad,
>>> Using the API should not ever invoke JSP pages, so I do not think you
>>> are actually communicating with the API.  Unless what you are saying is
>>> that you create the job, and then the UI won't display it properly?
>>>
>>> Karl
>>>
>>>
>>> On Wed, Sep 25, 2019 at 6:11 PM Cihad Guzel  wrote:
>>>
>>>> Hi,
>>>>
>>>> First, thanks @Furkan; all tests pass with Java 8 now.
>>>>
>>>> On the other hand, I tested the ManifoldCF API and found some errors. I
>>>> created a job successfully through the ManifoldCF API (versions 2.12, 2.13,
>>>> and 2.14). I then get an HTTP 500 error when I open
>>>> "mcf-crawler-ui/editjob.jsp". I get this error only with ManifoldCF 2.13
>>>> and 2.14; I don't get any error with ManifoldCF 2.12.
>>>>
>>>> I sent the following JSON object to the ManifoldCF API to create the new job.
>>>>
>>>> {"job":{"children":[{"type":"start_mode","value":"manual"},{"type":"hopcount_mode","value":"accurate"},{"type":"run_mode","value":"continuous"},{"type":"recrawl_interval","value":"8640"},{"type":"reseed_interval","value":"360"},{"type":"description","value":"file_filecrawler2"},{"type":"repository_connection","value":"file_filecrawler2"},{"type":"priority","value":"1"},{"type":"pipelinestage","children":[{"type":"stage_id","value":"0"},{"type":"stage_isoutput","value":"false"},{"type":"stage_connectionname","value":"Allowed
>>>>  
>>>> Documents"},{"type":"stage_specification","children":[{"type":"extensions","value":"xlsx\r\nrtf\r\ncsv\r\ndocx\r\npptx\r\ntxt\r\npdf\r\nppt\r\nxml\r\ndoc\r\nodp\r\nxls\r\nodt\r\nods"},{"type":"mimetypes","value":"application\/xml\r\napplication\/msword\r\napplication\/pdf\r\napplication\/vnd.ms-powerpoint\r\napplication\/vnd.openxmlformats-officedocument.wordprocessingml.document\r\napplication\/rtf\r\napplic

Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
The actual error you are getting is because of a JSP exception.  The JSP
exception is not being handled properly, possibly because of a version
mismatch between the included version of Velocity and Apache Commons
Collections.  But if that is real, it's odd because it's occurring when the
velocity context is being initialized!! That means it should be happening
on every page load in the UI, and it most certainly does not -- not for me,
not for anyone who ran the tests.  So I suspect a setup problem of some
kind, especially if you're trying to move your 2.12 installation to 2.14 in
place.  Lots of jars moved around and you might have a jar conflict.

Karl
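
One way to check for the kind of jar conflict described above is to ask the JVM which jar a class was actually loaded from. The following is only a minimal sketch: the ClassOrigin class and its describe() method are hypothetical helpers, not part of ManifoldCF, and the check is only meaningful when it runs with the same classpath as the web application that fails.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper; not part of ManifoldCF.
final class ClassOrigin {

  /** Returns the location a class was loaded from, or a short note if that cannot be determined. */
  static String describe(String className) {
    try {
      // Load without initializing, using the same loader that loaded this helper.
      Class<?> c = Class.forName(className, false, ClassOrigin.class.getClassLoader());
      CodeSource src = c.getProtectionDomain().getCodeSource();
      // Classes supplied by the JDK itself usually have no code source.
      if (src == null || src.getLocation() == null) {
        return "bootstrap/JDK";
      }
      return src.getLocation().toString();
    } catch (ClassNotFoundException | LinkageError e) {
      return "not found";
    }
  }

  public static void main(String[] args) {
    // Default to the class named in the ClassNotFoundException from this thread.
    String name = args.length > 0 ? args[0] : "org.apache.commons.collections.ExtendedProperties";
    System.out.println(name + " -> " + describe(name));
  }
}
```

If the class resolves from an unexpected jar, or from more than one candidate location across deployments, that would be consistent with the suspected setup problem; "not found" matches the ClassNotFoundException reported for the edit-job page.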

On Wed, Sep 25, 2019 at 6:25 PM Karl Wright  wrote:

> Hi Cihad,
> Using the API should not ever invoke JSP pages, so I do not think you are
> actually communicating with the API.  Unless what you are saying is that
> you create the job, and then the UI won't display it properly?
>
> Karl
>
>
> On Wed, Sep 25, 2019 at 6:11 PM Cihad Guzel  wrote:
>
>> Hi,
>>
>> First, thanks @Furkan; all tests pass with Java 8 now.
>>
>> On the other hand, I tested the ManifoldCF API and found some errors. I
>> created a job successfully through the ManifoldCF API (versions 2.12, 2.13,
>> and 2.14). I then get an HTTP 500 error when I open
>> "mcf-crawler-ui/editjob.jsp". I get this error only with ManifoldCF 2.13
>> and 2.14; I don't get any error with ManifoldCF 2.12.
>>
>> I sent the following JSON object to the ManifoldCF API to create the new job.
>>
>> {"job":{"children":[{"type":"start_mode","value":"manual"},{"type":"hopcount_mode","value":"accurate"},{"type":"run_mode","value":"continuous"},{"type":"recrawl_interval","value":"8640"},{"type":"reseed_interval","value":"360"},{"type":"description","value":"file_filecrawler2"},{"type":"repository_connection","value":"file_filecrawler2"},{"type":"priority","value":"1"},{"type":"pipelinestage","children":[{"type":"stage_id","value":"0"},{"type":"stage_isoutput","value":"false"},{"type":"stage_connectionname","value":"Allowed
>>  
>> Documents"},{"type":"stage_specification","children":[{"type":"extensions","value":"xlsx\r\nrtf\r\ncsv\r\ndocx\r\npptx\r\ntxt\r\npdf\r\nppt\r\nxml\r\ndoc\r\nodp\r\nxls\r\nodt\r\nods"},{"type":"mimetypes","value":"application\/xml\r\napplication\/msword\r\napplication\/pdf\r\napplication\/vnd.ms-powerpoint\r\napplication\/vnd.openxmlformats-officedocument.wordprocessingml.document\r\napplication\/rtf\r\napplication\/vnd.oasis.opendocument.text\r\napplication\/vnd.openxmlformats-officedocument.presentationml.presentation\r\napplication\/vnd.openxmlformats-officedocument.spreadsheetml.sheet\r\ntext\/csv\r\napplication\/vnd.oasis.opendocument.presentation\r\napplication\/vnd.ms-excel\r\napplication\/vnd.oasis.opendocument.spreadsheet\r\ntext\/plain"},{"type":"mindate","value":"","attribute_value":"36593"},{"_type":"minlength","value":"","attribute_value":"1"},{"_type":"maxlength","value":"","attribute_value":"1"}]}]},{"_type":"pipelinestage","children":[{"type":"stage_id","value":"1"},{"type":"stage_isoutput","value":"false"},{"type":"stage_connectionname","value":"Source
>>  
>> Type"},{"type":"stage_prerequisite","value":"0"},{"type":"stage_specification","children":[{"type":"expression","attribute_parameter":"source_type","_attribute_value":"Windows
>>  
>> Shares"},{"_type":"expression","attribute_parameter":"crawler_id","_attribute_value":"file_filecrawler2"},{"_type":"keepAllMetadata","value":"","attribute_value":"true"},{"_type":"filterEmpty","value":"","attribute_value":"true&quo

Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
ob_jsp.java:953)
>   ... 26 more
> Caused by: java.lang.ClassNotFoundException: 
> org.apache.commons.collections.ExtendedProperties
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   ... 33 more
>
>
> Regards,
> Cihad Guzel
>
>
> On Wed, Sep 25, 2019 at 3:14 PM Furkan KAMACI wrote:
>
>> Hi Cihad,
>>
>> That package was removed in Java 11. It is possible that you compiled the
>> project with Java 11, which ManifoldCF does not support yet.
>>
>> Kind Regards,
>> Furkan KAMACI
>>
>> On Wed, Sep 25, 2019 at 3:00 PM Cihad Guzel wrote:
>>
>> > Hi Karl,
>> >
>> > I have following errors:
>> >
>> > [INFO] ---------< org.apache.manifoldcf:mcf-connector-common >---------
>> > [INFO] Building ManifoldCF - Connector-Common 2.14 [5/63]
>> > [INFO] --------------------------------[ jar ]---------------------------------
>> > [INFO]
>> > [INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ mcf-connector-common ---
>> > [INFO]
>> > [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ mcf-connector-common ---
>> > [INFO]
>> > [INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ mcf-connector-common ---
>> > [debug] execute contextualize
>> > [INFO] Using 'UTF-8' encoding to copy filtered resources.
>> > [INFO] Copying 1 resource
>> > [INFO] Copying 3 resources
>> > [INFO]
>> > [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ mcf-connector-common ---
>> > [INFO] Compiling 59 source files to /Users/cguzel/Projects/apache/svn/mcf-tag-2.14-RC1/framework/connector-common/target/classes
>> > [INFO] -------------------------------------------------------------
>> > [ERROR] COMPILATION ERROR :
>> > [INFO] -------------------------------------------------------------
>> > [ERROR] /Users/cguzel/Projects/apache/svn/mcf-tag-2.14-RC1/framework/connector-common/src/main/java/org/apache/manifoldcf/connectorcommon/common/CommonsHTTPSender.java:[60,21] error: package javax.xml.soap does not exist
>> > [ERROR] /Users/cguzel/Projects/apache/svn/mcf-tag-2.14-RC1/framework/connector-common/src/main/java/org/apache/manifoldcf/connectorcommon/common/CommonsHTTPSender.java:[61,21] error: package javax.xml.soap does not exist
>> > [ERROR] /Users/cguzel/Projects/apache/svn/mcf-tag-2.14-RC1/framework/connector-common/src/main/java/org/apache/manifoldcf/connectorcommon/common/CommonsHTTPSender.java:[62,21] error: package javax.xml.soap does not exist
>> > [INFO] 3 errors
>> >
>> > Regards,
>> > Cihad Guzel
>> >
>> >
>> > On Wed, Sep 25, 2019 at 2:01 AM Karl Wright wrote:
>> >
>> > > Ran all tests.
>> > >
>> > > +1 from me.
>> > >
>> > > Karl
>> > >
>> > >
>> > > On Tue, Sep 24, 2019 at 1:29 PM Karl Wright 
>> wrote:
>> > >
>> > > > Please vote on whether to release Apache ManifoldCF 2.14, RC1.
>> > > >
>> > > > There is a release tag at
>> > > > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC1 .
>> > > > There is a release artifact at
>> > > >
>> >
>> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14
>> > > .
>> > > >
>> > > > Note that this release of ManifoldCF has a new connector -- the
>> Content
>> > > > Services Web Services connector.  In order to integrate this
>> connector,
>> > > > changes to the dependencies for web services had to be made, so I
>> urge
>> > > > those using the Alfresco and CMIS connectors to verify this release
>> if
>> > at
>> > > > all possible.  Integration tests pass but there's nothing like an
>> > on-site
>> > > > test.
>> > > >
>> > > > RC1 also contains a fix for CONNECTORS-1623.
>> > > >
>> > > > Thanks,
>> > > > Karl
>> > > >
>> > >
>> >
>>
>


Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
I have no idea.
Karl


On Wed, Sep 25, 2019 at 10:55 AM Julien Massiera <
julien.massi...@francelabs.com> wrote:

> It seems the GitHub mirror is no longer synchronized; am I wrong?
>
> On 25/09/2019 15:32, Karl Wright wrote:
> > See:  https://issues.apache.org/jira/browse/CONNECTORS-1624
> >
> > Karl
> >
> >
> > On Wed, Sep 25, 2019 at 9:26 AM Piergiorgio Lucidi <
> piergior...@apache.org>
> > wrote:
> >
> >> Ran all the tests.
> >>
> >> +1 from me.
> >>
> >> Cheers,
> >> PJ
> >>
> >> On Tue, Sep 24, 2019 at 7:30 PM Karl Wright <daddy...@gmail.com> wrote:
> >>
> >>> Please vote on whether to release Apache ManifoldCF 2.14, RC1.
> >>>
> >>> There is a release tag at
> >>> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC1 .
> >>> There is a release artifact at
> >>>
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14
> >> .
> >>> Note that this release of ManifoldCF has a new connector -- the Content
> >>> Services Web Services connector.  In order to integrate this connector,
> >>> changes to the dependencies for web services had to be made, so I urge
> >>> those using the Alfresco and CMIS connectors to verify this release if
> at
> >>> all possible.  Integration tests pass but there's nothing like an
> on-site
> >>> test.
> >>>
> >>> RC1 also contains a fix for CONNECTORS-1623.
> >>>
> >>> Thanks,
> >>> Karl
> >>>
> >>
> >> --
> >> Piergiorgio
> >>
> --
> Julien MASSIERA
> Product Development Director
> France Labs – The Search Experts
> Datafari – Winner of the Big Data 2018 trophy at the Digital Innovation Makers
> Summit
> www.francelabs.com
>
>


Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
See:  https://issues.apache.org/jira/browse/CONNECTORS-1624

Karl


On Wed, Sep 25, 2019 at 9:26 AM Piergiorgio Lucidi 
wrote:

> Ran all the tests.
>
> +1 from me.
>
> Cheers,
> PJ
>
> On Tue, Sep 24, 2019 at 7:30 PM Karl Wright wrote:
>
> > Please vote on whether to release Apache ManifoldCF 2.14, RC1.
> >
> > There is a release tag at
> > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC1 .
> > There is a release artifact at
> > https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14
> .
> >
> > Note that this release of ManifoldCF has a new connector -- the Content
> > Services Web Services connector.  In order to integrate this connector,
> > changes to the dependencies for web services had to be made, so I urge
> > those using the Alfresco and CMIS connectors to verify this release if at
> > all possible.  Integration tests pass but there's nothing like an on-site
> > test.
> >
> > RC1 also contains a fix for CONNECTORS-1623.
> >
> > Thanks,
> > Karl
> >
>
>
> --
> Piergiorgio
>


[jira] [Created] (CONNECTORS-1624) Get ManifoldCF to run under Java 11 or higher

2019-09-25 Thread Karl Wright (Jira)
Karl Wright created CONNECTORS-1624:
---

    Summary: Get ManifoldCF to run under Java 11 or higher
        Key: CONNECTORS-1624
        URL: https://issues.apache.org/jira/browse/CONNECTORS-1624
    Project: ManifoldCF
 Issue Type: Task
 Components: Framework core
   Reporter: Karl Wright
   Assignee: Karl Wright
    Fix For: ManifoldCF 2.15


Java 11 doesn't include a number of classes that Java 8 does.  We need to 
explicitly include jars that provide these classes or ManifoldCF will not 
function under higher Java revs.
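
As a rough illustration of the gap described above, the sketch below probes the running JVM for a few classes from the Java EE modules that shipped with Java 8 but were removed in Java 11. The RemovedApiCheck class and the candidate list are assumptions made for this example; only javax.xml.soap is confirmed by the compile errors in this thread, and the set of classes ManifoldCF actually needs may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of ManifoldCF.
final class RemovedApiCheck {

  // Representative classes from modules present in Java 8 and removed in Java 11.
  // Which of these ManifoldCF depends on is an assumption, apart from javax.xml.soap.
  static final String[] CANDIDATES = {
    "javax.xml.soap.SOAPMessage",
    "javax.xml.bind.JAXBContext",
    "javax.xml.ws.Service",
    "javax.activation.DataHandler"
  };

  /** Returns the subset of the given class names that cannot be loaded on the current classpath. */
  static List<String> missing(String... classNames) {
    List<String> result = new ArrayList<>();
    ClassLoader loader = RemovedApiCheck.class.getClassLoader();
    for (String name : classNames) {
      try {
        // Resolve without running static initializers.
        Class.forName(name, false, loader);
      } catch (ClassNotFoundException | LinkageError e) {
        result.add(name);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println("Java version: " + System.getProperty("java.version"));
    System.out.println("Missing classes: " + missing(CANDIDATES));
  }
}
```

On Java 8 the list would normally come back empty; on Java 11 or later each listed class is reported as missing unless a jar that provides it has been added to the classpath.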




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-25 Thread Karl Wright
Yes, that is correct.
Java 11 will be a breaking change.
Karl

On Wed, Sep 25, 2019 at 8:04 AM Furkan KAMACI 
wrote:

> Hi Cihad,
>
> That package was removed in Java 11. It is possible that you compiled the
> project with Java 11, which ManifoldCF does not support yet.
>
> Kind Regards,
> Furkan KAMACI
>
> On Wed, Sep 25, 2019 at 3:00 PM Cihad Guzel wrote:
>
>> Hi Karl,
>>
>> I have following errors:
>>
>> [INFO] ---------< org.apache.manifoldcf:mcf-connector-common >---------
>> [INFO] Building ManifoldCF - Connector-Common 2.14 [5/63]
>> [INFO] --------------------------------[ jar ]---------------------------------
>> [INFO]
>> [INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ mcf-connector-common ---
>> [INFO]
>> [INFO] --- maven-remote-resources-plugin:1.5:process (default) @ mcf-connector-common ---
>> [INFO]
>> [INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ mcf-connector-common ---
>> [debug] execute contextualize
>> [INFO] Using 'UTF-8' encoding to copy filtered resources.
>> [INFO] Copying 1 resource
>> [INFO] Copying 3 resources
>> [INFO]
>> [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ mcf-connector-common ---
>> [INFO] Compiling 59 source files to /Users/cguzel/Projects/apache/svn/mcf-tag-2.14-RC1/framework/connector-common/target/classes
>> [INFO] -------------------------------------------------------------
>> [ERROR] COMPILATION ERROR :
>> [INFO] -------------------------------------------------------------
>> [ERROR] /Users/cguzel/Projects/apache/svn/mcf-tag-2.14-RC1/framework/connector-common/src/main/java/org/apache/manifoldcf/connectorcommon/common/CommonsHTTPSender.java:[60,21] error: package javax.xml.soap does not exist
>> [ERROR] /Users/cguzel/Projects/apache/svn/mcf-tag-2.14-RC1/framework/connector-common/src/main/java/org/apache/manifoldcf/connectorcommon/common/CommonsHTTPSender.java:[61,21] error: package javax.xml.soap does not exist
>> [ERROR] /Users/cguzel/Projects/apache/svn/mcf-tag-2.14-RC1/framework/connector-common/src/main/java/org/apache/manifoldcf/connectorcommon/common/CommonsHTTPSender.java:[62,21] error: package javax.xml.soap does not exist
>> [INFO] 3 errors
>>
>> Regards,
>> Cihad Guzel
>>
>>
>> On Wed, Sep 25, 2019 at 2:01 AM Karl Wright wrote:
>>
>> > Ran all tests.
>> >
>> > +1 from me.
>> >
>> > Karl
>> >
>> >
>> > On Tue, Sep 24, 2019 at 1:29 PM Karl Wright  wrote:
>> >
>> > > Please vote on whether to release Apache ManifoldCF 2.14, RC1.
>> > >
>> > > There is a release tag at
>> > > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC1 .
>> > > There is a release artifact at
>> > >
>> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14
>> > .
>> > >
>> > > Note that this release of ManifoldCF has a new connector -- the
>> Content
>> > > Services Web Services connector.  In order to integrate this
>> connector,
>> > > changes to the dependencies for web services had to be made, so I urge
>> > > those using the Alfresco and CMIS connectors to verify this release
>> if at
>> > > all possible.  Integration tests pass but there's nothing like an
>> on-site
>> > > test.
>> > >
>> > > RC1 also contains a fix for CONNECTORS-1623.
>> > >
>> > > Thanks,
>> > > Karl
>> > >
>> >
>>
>


Re: [VOTE] Release Apache ManifoldCF 2.14, RC1

2019-09-24 Thread Karl Wright
Ran all tests.

+1 from me.

Karl


On Tue, Sep 24, 2019 at 1:29 PM Karl Wright  wrote:

> Please vote on whether to release Apache ManifoldCF 2.14, RC1.
>
> There is a release tag at
> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC1 .
> There is a release artifact at
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14 .
>
> Note that this release of ManifoldCF has a new connector -- the Content
> Services Web Services connector.  In order to integrate this connector,
> changes to the dependencies for web services had to be made, so I urge
> those using the Alfresco and CMIS connectors to verify this release if at
> all possible.  Integration tests pass but there's nothing like an on-site
> test.
>
> RC1 also contains a fix for CONNECTORS-1623.
>
> Thanks,
> Karl
>


Re: [CANCEL] [VOTE] Release Apache ManifoldCF 2.14, RC0

2019-09-24 Thread Karl Wright
Hi Furkan,

We do not do this as part of the release process.  The tags are versioned
but the artifacts are named the same.  Only one of them ever is released so
this is OK; the older artifacts are put in a different folder labeled
"RC0", "RC1", etc.

This process was originally designed by Jukka Zitting and Grant Ingersoll
when MCF was graduating from the incubator and we have not changed it.

Karl


On Tue, Sep 24, 2019 at 1:36 PM Furkan KAMACI 
wrote:

> Hi Karl,
>
> I didn't want to hijack the latest vote thread. Do we need to add a suffix
> such as RC-1 to the dist file, as is done here:
> https://dist.apache.org/repos/dist/dev/zookeeper/ ?
>
> Kind Regards,
> Furkan KAMACI
>
> On Tue, Sep 24, 2019 at 7:32 PM Karl Wright  wrote:
>
> > CONNECTORS-1623.
> >
> > On Tue, Sep 24, 2019 at 9:04 AM Karl Wright  wrote:
> >
> > > Please vote on whether to release Apache ManifoldCF 2.14, RC0.
> > >
> > > There is a release tag at
> > > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC0 .
> > > There is a release artifact at
> > >
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14
> > .
> > >
> > > Note that this release of ManifoldCF has a new connector -- the Content
> > > Services Web Services connector.  In order to integrate this connector,
> > > changes to the dependencies for web services had to be made, so I urge
> > > those using the Alfresco and CMIS connectors to verify this release if
> at
> > > all possible.  Integration tests pass but there's nothing like an
> on-site
> > > test.
> > >
> > > Thanks,
> > > Karl
> > >
> > >
> >
>


[jira] [Resolved] (CONNECTORS-1623) Script tags not ignored

2019-09-24 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1623.
-
Resolution: Fixed

> Script tags not ignored
> ---
>
>              Key: CONNECTORS-1623
>              URL: https://issues.apache.org/jira/browse/CONNECTORS-1623
>          Project: ManifoldCF
>       Issue Type: Bug
>       Components: Web connector
> Affects Versions: ManifoldCF 2.13
>         Reporter: Julien Massiera
>         Assignee: Karl Wright
>         Priority: Critical
>          Fix For: ManifoldCF 2.14
>
>
> I discovered a problematic behavior in the
> org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState class when
> crawling web pages. This behavior is a problem in particular for
> form-based authentication, as explained below.
> The org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState class, which
> is called by TagParseState on each noteTag() or noteEndTag() call,
> uses the org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
> class to detect whether parsing is inside or outside a 'script' tag and
> to decide whether to act on the incoming data. The problem is that the
> TagParseState class is not aware of the type of tag currently being parsed, so
> it continues to analyze every char it encounters to detect tags, even while it
> is parsing the body of a script tag.
> So imagine a web page contains a script tag with this content:
> {code:java}
> if(myvar <= 9) {...}
> {code}
> When TagParseState parses the '<' char, it considers that a new tag
> has begun and that it lasts until the next '>' char. So in the case above,
> TagParseState never catches the end of the script tag, and thus the
> scriptParseState variable in the ScriptParseState class remains in the
> SCRIPTPARSESTATE_INSCRIPT state and the rest of the web page is not
> handled correctly by the other parsers.
> As a result, if you configure form authentication for your
> crawl and the form's web page contains this kind of script tag before
> the form tag, the form is never handled and authentication
> fails. This was the case I encountered, and I resolved it by forcing
> scriptParseState to SCRIPTPARSESTATE_NORMAL.
> ref:
> [http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201909.mbox/%3CCALUFAGA7eXi_gNBqWv2PRt2FaXuuKW5rTwLiXfceTkUAQfBvVg%40mail.gmail.com%3E]
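
The failure mode described in the issue can be reproduced in isolation with a toy scanner. The sketch below is not the ManifoldCF fuzzyml parser and does not show the committed fix; ScriptAwareTagScan and its methods are hypothetical names. It contrasts a naive scan, which treats every '<' as the start of a tag, with a scan that skips the script body and resumes at the closing script tag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy illustration only; not the ManifoldCF fuzzyml classes.
final class ScriptAwareTagScan {

  /** Naive scan: every '<' opens a tag that runs to the next '>'. */
  static List<String> naiveTags(String html) {
    List<String> tags = new ArrayList<>();
    int i = 0;
    while ((i = html.indexOf('<', i)) >= 0) {
      int end = html.indexOf('>', i);
      if (end < 0) break;
      tags.add(tagName(html.substring(i + 1, end)));
      i = end + 1;
    }
    return tags;
  }

  /** Script-aware scan: after a script start tag, skip ahead to the matching end tag. */
  static List<String> scriptAwareTags(String html) {
    List<String> tags = new ArrayList<>();
    int i = 0;
    while ((i = html.indexOf('<', i)) >= 0) {
      int end = html.indexOf('>', i);
      if (end < 0) break;
      String name = tagName(html.substring(i + 1, end));
      tags.add(name);
      i = end + 1;
      if (name.equals("script")) {
        // Ignore '<' and '>' inside the script body.
        int close = indexOfIgnoreCase(html, "</script", i);
        if (close < 0) break;
        i = close;
      }
    }
    return tags;
  }

  // Extracts a lower-cased tag name such as "form" or "/script"; returns "" if none is present.
  private static String tagName(String inner) {
    String t = inner.trim().toLowerCase(Locale.ROOT);
    int n = 0;
    while (n < t.length() && (t.charAt(n) == '/' || Character.isLetterOrDigit(t.charAt(n)))) n++;
    return t.substring(0, n);
  }

  private static int indexOfIgnoreCase(String s, String needle, int from) {
    for (int i = Math.max(from, 0); i + needle.length() <= s.length(); i++) {
      if (s.regionMatches(true, i, needle, 0, needle.length())) return i;
    }
    return -1;
  }

  public static void main(String[] args) {
    String page = "<script>if(myvar <= 9) {...}</script><form>";
    System.out.println(naiveTags(page));       // prints [script, , form]
    System.out.println(scriptAwareTags(page)); // prints [script, /script, form]
  }
}
```

In the naive output the closing script tag never appears, because the '<' in the comparison swallows everything up to the next '>'. That missing end tag is what would leave a script-state tracker stuck in its in-script state.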





[CANCEL] [VOTE] Release Apache ManifoldCF 2.14, RC0

2019-09-24 Thread Karl Wright
CONNECTORS-1623.

On Tue, Sep 24, 2019 at 9:04 AM Karl Wright  wrote:

> Please vote on whether to release Apache ManifoldCF 2.14, RC0.
>
> There is a release tag at
> https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC0 .
> There is a release artifact at
> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14 .
>
> Note that this release of ManifoldCF has a new connector -- the Content
> Services Web Services connector.  In order to integrate this connector,
> changes to the dependencies for web services had to be made, so I urge
> those using the Alfresco and CMIS connectors to verify this release if at
> all possible.  Integration tests pass but there's nothing like an on-site
> test.
>
> Thanks,
> Karl
>
>


[jira] [Commented] (CONNECTORS-1623) Script tags not ignored

2019-09-24 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936967#comment-16936967
 ] 

Karl Wright commented on CONNECTORS-1623:
-

r1867468 (trunk)
r1867469 (release branch)




[jira] [Commented] (CONNECTORS-1623) Script tags not ignored

2019-09-24 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936964#comment-16936964
 ] 

Karl Wright commented on CONNECTORS-1623:
-

I found a fix; committing it.




[jira] [Commented] (CONNECTORS-1623) Script tags not ignored

2019-09-24 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936934#comment-16936934
 ] 

Karl Wright commented on CONNECTORS-1623:
-

Verification failed, with this unit test failure:

{code}
run-connector-common-tests:
[junit] Testsuite: org.apache.manifoldcf.connectorcommon.fuzzyml.TestFuzzyML
[junit] ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
[junit] Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.344 sec
[junit]
[junit] ------------- Standard Error -----------------
[junit] ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
[junit] ------------- ---------------- ---------------
[junit] Testcase: testTags(org.apache.manifoldcf.connectorcommon.fuzzyml.TestFuzzyML):  FAILED
[junit] null
[junit] junit.framework.AssertionFailedError
[junit] at org.apache.manifoldcf.connectorcommon.fuzzyml.TestFuzzyML.testTags(TestFuzzyML.java:192)
[junit]
[junit]

BUILD FAILED
C:\wip\mcf\trunk\build.xml:290: The following error occurred while executing this line:
C:\wip\mcf\trunk\framework\build.xml:2030: Test org.apache.manifoldcf.connectorcommon.fuzzyml.TestFuzzyML failed
{code}

The test parses a real-world example HTML page, and it fails
because it does not correctly pick up the </script> tag at the end of the
script section.  The reason may be that end tags are still processed within the
script section, which confuses the tag pairing.  That will not be
straightforward to fix.  [~julienFL], awaiting your suggestion for that.


> Script tags not ignored
> ---
>
> Key: CONNECTORS-1623
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1623
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Web connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
> Fix For: ManifoldCF 2.14
>
>
> I discovered a problematic behavior in the 
> org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState class when 
> crawling web pages. This behavior is a problem in particular for 
> form-based authentication, as explained below. 
> The org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState class, which 
> is called by TagParseState on each noteTag() or noteEndTag() call, 
> uses the org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState 
> class to detect whether parsing is inside or outside a 'script' tag and 
> to decide whether to act on the incoming data. The problem is that the 
> TagParseState class is not aware of the type of tag currently being parsed, 
> so it continues to analyze every char it encounters to detect tags even 
> while it is actually parsing a script tag. 
> So let's imagine you have a script tag built like this in a web page: 
> {code:java}
> if(myvar <= 9) {...}
> {code}
> When TagParseState parses the char '<', it considers that a new tag 
> has begun, which lasts until it encounters a '>' char. So in the case above, 
> TagParseState will never catch the end of the script tag, and thus the 
> scriptParseState variable in the ScriptParseState class will remain in the 
> SCRIPTPARSESTATE_INSCRIPT state and the rest of the web page will not be 
> correctly handled by the other parsers. 
> As a result, if you configure, for example, form authentication for your 
> crawl and the form web page contains this kind of script tag before 
> the form tag, the form will never be handled and the authentication will 
> fail. This was the case I encountered, and I resolved it by forcing the 
> scriptParseState to be SCRIPTPARSESTATE_NORMAL.
> ref : 
> [http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201909.mbox/%3CCALUFAGA7eXi_gNBqWv2PRt2FaXuuKW5rTwLiXfceTkUAQfBvVg%40mail.gmail.com%3E]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (CONNECTORS-1623) Script tags not ignored

2019-09-24 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936866#comment-16936866
 ] 

Karl Wright commented on CONNECTORS-1623:
-

I put together a fix but need to verify it.




[jira] [Updated] (CONNECTORS-1623) Script tags not ignored

2019-09-24 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1623:

Fix Version/s: ManifoldCF 2.14



[jira] [Assigned] (CONNECTORS-1623) Script tags not ignored

2019-09-24 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1623:
---

Assignee: Karl Wright

> Script tags not ignored
> ---
>
> Key: CONNECTORS-1623
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1623
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Web connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
>


Re: [VOTE] Release Apache ManifoldCF 2.14, RC0

2019-09-24 Thread Karl Wright
It would also be great if you constructed a patch and attached it to your
ticket.  Then we can have a back-and-forth about improvements if they are
needed, and tests that exercise it.  I am really very pressed for time of
late and medical issues in my family are going to really impact what little
time I have.

Karl


On Tue, Sep 24, 2019 at 9:25 AM Karl Wright wrote:

> Please create a ticket.  I am so busy I cannot keep track of issues
> without tickets.
> Karl
>
>
>
> On Tue, Sep 24, 2019 at 9:23 AM Julien Massiera <
> julien.massi...@francelabs.com> wrote:
>
>> Hi Karl,
>>
>> is it possible to have in this v2.14 a fix for the problem I exposed two
>> weeks ago about the web connector ? ref :
>>
>> http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201909.mbox/%3CCALUFAGA7eXi_gNBqWv2PRt2FaXuuKW5rTwLiXfceTkUAQfBvVg%40mail.gmail.com%3E
>>
>> I didn't create a ticket on this subject but I can if you are ok with it.
>>
>> Julien
>>
>> On 24/09/2019 15:04, Karl Wright wrote:
>> > Please vote on whether to release Apache ManifoldCF 2.14, RC0.
>> >
>> > There is a release tag at
>> > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC0 .
>> > There is a release artifact at
>> >
>> https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14
>> .
>> >
>> > Note that this release of ManifoldCF has a new connector -- the Content
>> > Services Web Services connector.  In order to integrate this connector,
>> > changes to the dependencies for web services had to be made, so I urge
>> > those using the Alfresco and CMIS connectors to verify this release if
>> at
>> > all possible.  Integration tests pass but there's nothing like an
>> on-site
>> > test.
>> >
>> > Thanks,
>> > Karl
>> >
>> --
>> Julien MASSIERA
>> Directeur développement produit
>> France Labs – Les experts du Search
>> Datafari – Vainqueur du trophée Big Data 2018 au Digital Innovation
>> Makers Summit
>> www.francelabs.com
>>
>>


Re: [VOTE] Release Apache ManifoldCF 2.14, RC0

2019-09-24 Thread Karl Wright
Please create a ticket.  I am so busy I cannot keep track of issues without
tickets.
Karl



On Tue, Sep 24, 2019 at 9:23 AM Julien Massiera <
julien.massi...@francelabs.com> wrote:

> Hi Karl,
>
> is it possible to have in this v2.14 a fix for the problem I exposed two
> weeks ago about the web connector ? ref :
>
> http://mail-archives.apache.org/mod_mbox/manifoldcf-dev/201909.mbox/%3CCALUFAGA7eXi_gNBqWv2PRt2FaXuuKW5rTwLiXfceTkUAQfBvVg%40mail.gmail.com%3E
>
> I didn't create a ticket on this subject but I can if you are ok with it.
>
> Julien
>
> On 24/09/2019 15:04, Karl Wright wrote:
> > Please vote on whether to release Apache ManifoldCF 2.14, RC0.
> >
> > There is a release tag at
> > https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC0 .
> > There is a release artifact at
> > https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14
> .
> >
> > Note that this release of ManifoldCF has a new connector -- the Content
> > Services Web Services connector.  In order to integrate this connector,
> > changes to the dependencies for web services had to be made, so I urge
> > those using the Alfresco and CMIS connectors to verify this release if at
> > all possible.  Integration tests pass but there's nothing like an on-site
> > test.
> >
> > Thanks,
> > Karl
> >
> --
> Julien MASSIERA
> Directeur développement produit
> France Labs – Les experts du Search
> Datafari – Vainqueur du trophée Big Data 2018 au Digital Innovation Makers
> Summit
> www.francelabs.com
>
>


[VOTE] Release Apache ManifoldCF 2.14, RC0

2019-09-24 Thread Karl Wright
Please vote on whether to release Apache ManifoldCF 2.14, RC0.

There is a release tag at
https://svn.apache.org/repos/asf/manifoldcf/tags/release-2.14-RC0 .
There is a release artifact at
https://dist.apache.org/repos/dist/dev/manifoldcf/apache-manifoldcf-2.14 .

Note that this release of ManifoldCF has a new connector -- the Content
Services Web Services connector.  In order to integrate this connector,
changes to the dependencies for web services had to be made, so I urge
those using the Alfresco and CMIS connectors to verify this release if at
all possible.  Integration tests pass but there's nothing like an on-site
test.

Thanks,
Karl


[jira] [Resolved] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-24 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1566.
-
Resolution: Fixed

Confirmed that the connector is mostly working for multiple users.  We reserve 
the right to open individual tickets for problems that still need resolution.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-23 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936085#comment-16936085
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], is the latest code working for you?  If so, maybe we can spin the 
new release finally.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-20 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934487#comment-16934487
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], yes, that was the semantic for LAPI; version 0 was the latest 
version.  I believe this is supported in csws or you would not index any 
documents at all.  But I can believe that there are documents that have no 
versions at all, and then you might get this error.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-20 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16934473#comment-16934473
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], I've updated the code to detect this condition and not fail.




[jira] [Comment Edited] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-19 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933472#comment-16933472
 ] 

Karl Wright edited comment on CONNECTORS-1566 at 9/19/19 7:53 PM:
--

Have you looked up the error code?
It sounds like we're trying to get the latest version of a document but it is 
not actually present?

I need to know what getFaultCode() returns for this error.




was (Author: kwri...@metacarta.com):
Have you looked up the error code?
It sounds like we're trying to get the latest version of a document but it is 
not actually present?




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-19 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933472#comment-16933472
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Have you looked up the error code?
It sounds like we're trying to get the latest version of a document but it is 
not actually present?




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-18 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932322#comment-16932322
 ] 

Karl Wright commented on CONNECTORS-1566:
-

I committed the naive patch.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-18 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16932319#comment-16932319
 ] 

Karl Wright commented on CONNECTORS-1566:
-

The naive patch seems as good as any to me.  The question is under what 
conditions you get back a null rights object.  In any case, if it fails it 
would fail in a more restrictive way rather than a less restrictive one.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-17 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931829#comment-16931829
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch]  I added this suffix for all queries performed.  Hopefully that will 
do the trick.  Please let me know.




[jira] [Comment Edited] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-16 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930505#comment-16930505
 ] 

Karl Wright edited comment on CONNECTORS-1566 at 9/16/19 12:43 PM:
---

[~schuch], I do not know of anything done to enable OTSubType; seems that that 
field was in the doc.  The other team DID mention needing to (somehow!) change 
the parent ID to 2004, but I committed code that should have done that, I 
thought.  It's possible that the code needs to look up the CategoriesWS node as 
the parent?  I don't know but can you verify that 2004 does the trick?

The method that should be finding the category workspace from a category path 
is this:

{code}
public RootValue(CswsContext llc, String pathString)
{code}

This peels the prefix off the beginning of the path and assumes it's the 
workspace name.  It then looks up object information for that workspace node, 
and gets the ID, which in this case should be 2004 if my assumptions are 
correct.

Can you verify that the path has the correct prefix, and that the ID returned 
is 2004?
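
A minimal sketch of that prefix-peeling step, under stated assumptions: the 
'/' separator, the class name WorkspacePrefix, and the lookup callback are all 
hypothetical stand-ins, not the connector's actual RootValue code, and the 
workspace name used in the usage note is made up for illustration.

```java
import java.util.function.ToLongFunction;

/** Hypothetical illustration only; NOT the connector's RootValue class. */
final class WorkspacePrefix {
    final String workspaceName;  // first path segment, assumed to name the workspace
    final String remainingPath;  // everything after the first separator

    private WorkspacePrefix(String workspaceName, String remainingPath) {
        this.workspaceName = workspaceName;
        this.remainingPath = remainingPath;
    }

    /** Splits "Workspace/a/b" into "Workspace" and "a/b" (the '/' separator is an assumption). */
    static WorkspacePrefix parse(String pathString) {
        int sep = pathString.indexOf('/');
        if (sep < 0) {
            // No separator: the whole string is taken to be the workspace name.
            return new WorkspacePrefix(pathString, "");
        }
        return new WorkspacePrefix(pathString.substring(0, sep),
                                   pathString.substring(sep + 1));
    }

    /** Resolves the workspace node ID via a caller-supplied lookup (stand-in for the server call). */
    long resolveRootId(ToLongFunction<String> lookup) {
        return lookup.applyAsLong(workspaceName);
    }
}
```

For example, parsing "CategoryWorkspace/Finance/Invoices" yields the prefix 
"CategoryWorkspace"; logging that prefix and the ID the lookup returns for it 
would show whether the expected 2004 comes back.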



was (Author: kwri...@metacarta.com):
[~schuch], I do not know of anything done to enable OTSubType; seems that that 
field was in the doc.  The other team DID mention needing to (somehow!) change 
the parent ID to 2004, but I committed code that should have done that, I 
thought.  It's possible that the code needs to look up the CategoriesWS node as 
the parent?  I don't know but can you verify that 2004 does the trick?



[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-16 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930505#comment-16930505
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], I do not know of anything done to enable OTSubType; seems that that 
field was in the doc.  The other team DID mention needing to (somehow!) change 
the parent ID to 2004, but I committed code that should have done that, I 
thought.  It's possible that the code needs to look up the CategoriesWS node as 
the parent?  I don't know but can you verify that 2004 does the trick?



[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-16 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16930418#comment-16930418
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], the other team was able to see paths suggested.  Can you add 
enough debugging statements to figure out what the problem is in your 
environment? 



[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-13 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929088#comment-16929088
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], if the enterprise workspace name is configurable, doesn't the 
category workspace name also need to be?




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-11 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927614#comment-16927614
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], there are no legal problems with including it.  The only code that 
would at all raise concerns would be the wsdl files, but it turns out those are 
considered interface specifications, and a court case about 5 years ago made 
those non-copyrightable.  Apache Legal signed off on that for the SharePoint 
connector, which also includes wsdls.

So please go ahead and make sure everything I overlooked is fixed.



[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-11 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927563#comment-16927563
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Sounds like a bug, yes.  The same fix should probably be included in the 
repository check() function.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-11 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927501#comment-16927501
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Thanks for the confirmation.
I've committed a fix to trunk for the "only some services are needed" issue.  
I'll keep an eye out for the rest of your commits.  Please be sure to include 
these wsdls in the .xml configuration file so they do not need to be fetched 
from the server.  I patched the ones already included so they would operate in 
a file-based environment, FWIW.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





Re: Error:-Missing Resource Bundle

2019-09-11 Thread Karl Wright
The translations bundle error is completely harmless.  It will not prevent
anything from starting.
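
As an illustration only (this is not taken from the ManifoldCF code base), the
JDK's standard resource bundle lookup can be asked which locales it would try
for en_GB.  The sketch below assumes the default ResourceBundle.Control; the
class name BundleFallbackDemo is made up for the example, and whether the UI
relies on this mechanism or on its own explicit retry (the "trying en" in the
log suggests the latter) is not established here.

```java
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical demo class; not part of ManifoldCF.
class BundleFallbackDemo {
  // Returns the locales the default JDK lookup would try, most specific first.
  static List<Locale> candidates(Locale locale) {
    ResourceBundle.Control control =
        ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_DEFAULT);
    // The base name is the one from the log message; it is only used as a label here.
    return control.getCandidateLocales("org.apache.manifoldcf.ui.i18n.common", locale);
  }

  public static void main(String[] args) {
    // For en_GB the candidates are en_GB, then en, then the root (base) bundle.
    System.out.println(candidates(Locale.UK));
  }
}
```

So a missing en_GB bundle simply means a less specific bundle is used instead.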

Karl


On Wed, Sep 11, 2019 at 3:23 AM Priya Arora  wrote:

> *Regarding resource bundle exception :- *
> No i have not changed anything from default configuration.
> My question here is :-
> "This only means there are no specific en_GB translations for the UI.  It
> is
> harmless."
> Being this an  error, does it allow manifoldcf to work smoothly., while
> using single process(apache-manifoldcf-2.13/example/') and multi process(
> 'apache-manifoldcf-2.13/multiprocess-zk-example') both?
>
> This mail chain does not include any Hardware configurations'
>
> Thanks
> Priya
>
> On Wed, Sep 11, 2019 at 12:46 PM Karl Wright  wrote:
>
> > 'Because when i tried launching manifoldcf multiprocess with
> > zookeeper, it didnt allow manifold UI to load. '
> >
> > WHAT didn't allow manifold UI to load???
> >
> > If you are using a multiprocess deployment, the process that isn't
> getting
> > enough memory is the java-agents process.  Changing the UI process
> > parameters makes no sense.
> >
> > Perhaps instead of telling us your hardware configuration, you tell us
> > precisely how you set up your manifoldcf instance.  What example are you
> > basing it on?  What have you changed from the defaults for the particular
> > example?
> >
> > Thanks,
> > Karl
> >
> >
> > On Wed, Sep 11, 2019 at 2:56 AM Priya Arora  wrote:
> >
> > > Does that means this Error allows ManifoldCF to run smoothly on single
> > > process. Because when i tried launching manifoldcf multiprocess with
> > > zookeeper, it didnt allow manifold UI to load.
> > > Thanks
> > > Priya
> > >
> > > On Wed, Sep 11, 2019 at 12:09 PM Karl Wright 
> wrote:
> > >
> > > > Hi,
> > > > This only means there are no specific en_GB translations for the UI.
> > It
> > > is
> > > > harmless.
> > > >
> > > > Karl
> > > >
> > > >
> > > > On Wed, Sep 11, 2019 at 2:36 AM Priya Arora 
> > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > While ingesting records to Elastic Search via manifoldCF(2.13)
> > > crawler. I
> > > > > am facing this issue ,
> > > > > ERROR 2019-09-11T05:44:49,641 (qtp307829448-510) - Missing resource
> > > > bundle
> > > > > 'org.apache.manifoldcf.ui.i18n.common' for locale 'en_GB': Can't
> find
> > > > > bundle for base name org.apache.manifoldcf.ui.i18n.common, locale
> > > en_GB;
> > > > > trying en
> > > > > java.util.MissingResourceException: Can't find bundle for base name
> > > > > org.apache.manifoldcf.ui.i18n.common, locale en_GB
> > > > > at
> > > > >
> > > >
> > >
> >
> java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1581)
> > > > > ~[?:1.8.0_222]
> > > > >
> > > > > I also research ed its resolution on net, but no help till now.
> > > > > Can anybody suggest something, do i need to add some file, as of
> now
> > i
> > > > > have these files in my project setup.
> > > > >
> > > > > [image: image.png]
> > > > >
> > > > > Thanks and regards
> > > > > Priya
> > > > >
> > > >
> > >
> >
>


Re: Error:-Missing Resource Bundle

2019-09-11 Thread Karl Wright
'Because when i tried launching manifoldcf multiprocess with
zookeeper, it didnt allow manifold UI to load. '

WHAT didn't allow manifold UI to load???

If you are using a multiprocess deployment, the process that isn't getting
enough memory is the java-agents process.  Changing the UI process
parameters makes no sense.

Perhaps instead of telling us your hardware configuration, you tell us
precisely how you set up your manifoldcf instance.  What example are you
basing it on?  What have you changed from the defaults for the particular
example?

Thanks,
Karl


On Wed, Sep 11, 2019 at 2:56 AM Priya Arora  wrote:

> Does that means this Error allows ManifoldCF to run smoothly on single
> process. Because when i tried launching manifoldcf multiprocess with
> zookeeper, it didnt allow manifold UI to load.
> Thanks
> Priya
>
> On Wed, Sep 11, 2019 at 12:09 PM Karl Wright  wrote:
>
> > Hi,
> > This only means there are no specific en_GB translations for the UI.  It
> is
> > harmless.
> >
> > Karl
> >
> >
> > On Wed, Sep 11, 2019 at 2:36 AM Priya Arora  wrote:
> >
> > > Hi All,
> > >
> > > While ingesting records to Elastic Search via manifoldCF(2.13)
> crawler. I
> > > am facing this issue ,
> > > ERROR 2019-09-11T05:44:49,641 (qtp307829448-510) - Missing resource
> > bundle
> > > 'org.apache.manifoldcf.ui.i18n.common' for locale 'en_GB': Can't find
> > > bundle for base name org.apache.manifoldcf.ui.i18n.common, locale
> en_GB;
> > > trying en
> > > java.util.MissingResourceException: Can't find bundle for base name
> > > org.apache.manifoldcf.ui.i18n.common, locale en_GB
> > > at
> > >
> >
> java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1581)
> > > ~[?:1.8.0_222]
> > >
> > > I also research ed its resolution on net, but no help till now.
> > > Can anybody suggest something, do i need to add some file, as of now i
> > > have these files in my project setup.
> > >
> > > [image: image.png]
> > >
> > > Thanks and regards
> > > Priya
> > >
> >
>


[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-11 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927302#comment-16927302
 ] 

Karl Wright commented on CONNECTORS-1566:
-

More analysis:

Locally when I run this WITHOUT the meta-inf that I've been including in the 
jar, I get:

{code}
Caused by: javax.wsdl.WSDLException: WSDLException: faultCode=PARSER_ERROR: 
Problem parsing 'http://localhost:2099/cws/Authentication.svc'.: 
java.net.ConnectException: Connection refused: connect
at com.ibm.wsdl.xml.WSDLReaderImpl.getDocument(Unknown Source)
at com.ibm.wsdl.xml.WSDLReaderImpl.readWSDL(Unknown Source)
at com.ibm.wsdl.xml.WSDLReaderImpl.readWSDL(Unknown Source)
at 
org.apache.cxf.wsdl11.WSDLManagerImpl.loadDefinition(WSDLManagerImpl.java:266)
at 
org.apache.cxf.wsdl11.WSDLManagerImpl.getDefinition(WSDLManagerImpl.java:165)
at 
org.apache.cxf.wsdl11.WSDLServiceFactory.(WSDLServiceFactory.java:85)
... 36 more
Caused by: java.net.ConnectException: Connection refused: connect
{code}

This makes sense since I am not running a livelink instance here.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





Re: Error:-Missing Resource Bundle

2019-09-11 Thread Karl Wright
Hi,
This only means there are no specific en_GB translations for the UI.  It is
harmless.

Karl


On Wed, Sep 11, 2019 at 2:36 AM Priya Arora  wrote:

> Hi All,
>
> While ingesting records into Elasticsearch via the ManifoldCF (2.13) crawler, I
> am facing this issue:
> ERROR 2019-09-11T05:44:49,641 (qtp307829448-510) - Missing resource bundle
> 'org.apache.manifoldcf.ui.i18n.common' for locale 'en_GB': Can't find
> bundle for base name org.apache.manifoldcf.ui.i18n.common, locale en_GB;
> trying en
> java.util.MissingResourceException: Can't find bundle for base name
> org.apache.manifoldcf.ui.i18n.common, locale en_GB
> at
> java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1581)
> ~[?:1.8.0_222]
>
> I also researched a resolution online, but have found no help so far.
> Can anybody suggest something? Do I need to add some file? As of now I
> have these files in my project setup.
>
> [image: image.png]
>
> Thanks and regards
> Priya
>


Re: Unexpected HTTP result code: -1: null

2019-09-11 Thread Karl Wright
. and will get back to you
>>
>> Sent from my iPhone
>>
>> > On 16-Aug-2019, at 6:23 PM, Michael Cizmar 
>> wrote:
>> >
>> > Priya  - Was this right?
>> >
>> > 48GB and 1-Core Intel(R) Xeon(R) CPU
>> >
>> > While not directly related to an out of memory issue, you should have
>> more
>> > cores allocated to ES.
>> >
>> >> On Fri, Aug 16, 2019 at 1:09 AM Priya Arora 
>> wrote:
>> >>
>> >> *Existing Threads/connections configuration is :-*
>> >>
>> >> How many worker threads do you have? - 15 worker threads has been
>> >> allocated(in properties.xml file).
>> >> And the Tika Extractor connections -10 connections are defined.
>> >>
>> >> Is this suggested to reduce the number more.
>> >> If not, what else can be a solution
>> >>
>> >> Thanks
>> >> Priya
>> >>
>> >>
>> >>
>> >>> On Wed, Aug 14, 2019 at 5:32 PM Karl Wright 
>> wrote:
>> >>>
>> >>> How many worker threads do you have?
>> >>> Even if each worker thread is constrained in memory, and they should
>> be,
>> >>> you can easily cause things to run out of memory by giving too many
>> >> worker
>> >>> threads.  Another way to keep Tika's usage constrained would be to
>> reduce
>> >>> the number of Tika Extractor connections, because that effectively
>> limits
>> >>> the number of extractions that can be going on at the same time.
>> >>>
>> >>> Karl
>> >>>
>> >>>
>> >>>> On Wed, Aug 14, 2019 at 7:23 AM Priya Arora 
>> wrote:
>> >>>>
>> >>>> Yes , I am using Tika Extractor. And the version used for manifold is
>> >>> 2.13.
>> >>>> Also I am using postgres as database.
>> >>>>
>> >>>> I have 4 types of jobs
>> >>>> One is accessing/re crawling data from a public site. Other three are
>> >>>> accessing intranet site.
>> >>>> Out of which two are giving me correct output-without any error and
>> >> third
>> >>>> one which is having data more than the other two , and  giving me
>> this
>> >>>> error.
>> >>>>
>> >>>> Is there any possibility with site accessibility issue. Can you
>> please
>> >>>> suggest some solution
>> >>>> Thanks and regards
>> >>>> Priya
>> >>>>
>> >>>> On Wed, Aug 14, 2019 at 3:11 PM Karl Wright 
>> >> wrote:
>> >>>>
>> >>>>> I will need to know more.  Do you have the tika extractor in your
>> >>>>> pipeline?  If so, what version of ManifoldCF are you using?  Tika
>> has
>> >>> had
>> >>>>> bugs related to memory consumption in the past; the out of memory
>> >>>> exception
>> >>>>> may be coming from it and therefore a stack trace is critical to
>> >> have.
>> >>>>>
>> >>>>> Alternatively, you can upgrade to the latest version of MCF (2.13)
>> >> and
>> >>>> that
>> >>>>> has a newer version of Tika without those problem.  But you may need
>> >> to
>> >>>> get
>> >>>>> the agents process more memory.
>> >>>>>
>> >>>>> Another possible cause is that you're using hsqldb in production.
>> >>> HSQLDB
>> >>>>> keeps all of its tables in memory.  If you have a large crawl, you
>> do
>> >>> not
>> >>>>> want to use HSQLDB.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Karl
>> >>>>>
>> >>>>>
>> >>>>> On Wed, Aug 14, 2019 at 3:41 AM Priya Arora 
>> >>> wrote:
>> >>>>>
>> >>>>>> Hi Karl,
>> >>>>>>
>> >>>>>> Manifold CF logs hints out me an error like :
>> >>>>>> agents process ran out of memory - shutting down
>> >>>>>> java.lang.OutOfMemoryError: Java heap space
>> >>>>>>
>> >>>>>> Also I have -Xms1024m ,-Xmx1024m memory allocated in
>> >>>>>> start-options.env.unix, start-options.env.win file.
>> >>>>>> Also Configuration:-
>> >>>>>> 1) For Crawler server - 16 GB RAM and 8-Core Intel(R) Xeon(R) CPU
>> >>>> E5-2660
>> >>>>>> v3 @ 2.60GHz and
>> >>>>>>
>> >>>>>> 2) For Elasticsearch server - 48GB and 1-Core Intel(R) Xeon(R) CPU
>> >>>>> E5-2660
>> >>>>>> v3 @ 2.60GHz and i am using postgres as database.
>> >>>>>>
>> >>>>>> Can you please help me out, what to do in this case.
>> >>>>>>
>> >>>>>> Thanks
>> >>>>>> Priya
>> >>>>>>
>> >>>>>>
>> >>>>>> On Wed, Aug 14, 2019 at 12:33 PM Karl Wright 
>> >>>> wrote:
>> >>>>>>
>> >>>>>>> The error occurs, I believe, as the result of basic connection
>> >>>>> problems,
>> >>>>>>> e.g. the connection is getting rejected.  You can find more
>> >>>> information
>> >>>>>> in
>> >>>>>>> the simple history, and in the manifoldcf log.
>> >>>>>>>
>> >>>>>>> I would like to know the underlying cause, since the connector
>> >>> should
>> >>>>> be
>> >>>>>>> resilient against errors of this kind.
>> >>>>>>>
>> >>>>>>> Karl
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Wed, Aug 14, 2019, 1:46 AM Priya Arora 
>> >>>> wrote:
>> >>>>>>>
>> >>>>>>>> Hi Karl,
>> >>>>>>>>
>> >>>>>>>> I have an web Repository connector(Seeds:- an intranet Site).,
>> >>> and
>> >>>>> job
>> >>>>>> i
>> >>>>>>>> son Production server.
>> >>>>>>>>
>> >>>>>>>> When i ran job on PROD, the job stops itself 2 times with and
>> >>>>>>> error:Error:
>> >>>>>>>> Unexpected HTTP result code: -1: null.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> Can you please provide me an idea, in which it happens so?
>> >>>>>>>>
>> >>>>>>>> Thanks and regards
>> >>>>>>>> Priya Arora
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>
>> >>
>>
>


Re: TagParseState behavior with Web connector

2019-09-09 Thread Karl Wright
If you go the strict override route, then it must be limited to parsing of
HTML, and cannot apply to general parsing of XML.  There is a pathway for
that in the Web Connector but I will need to look at it in depth and I do
not have time this week.  Perhaps this weekend.
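
To make the "limited to HTML" constraint concrete, here is a minimal sketch of
the overridable-hook idea from the quoted message below.  It is deliberately
simplified and uses made-up classes (SimpleTagScanner, HtmlTagScanner); it is
not the actual TagParseState state machine, and it ignores attributes, comments
and quoting.  The only point is that the base scanner's behaviour is unchanged
unless a subclass opts in.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a tag scanner; not ManifoldCF's TagParseState.
class SimpleTagScanner {
  // Hook: subclasses return true while they are inside script content. Default: never.
  protected boolean inScript() { return false; }

  // Called for each recognized tag so subclasses can track context.
  protected void noteTag(String name) {}

  // Returns the bodies of all "tags" recognized in the input, in order.
  public List<String> scan(String input) {
    List<String> tags = new ArrayList<>();
    int i = 0;
    while (i < input.length()) {
      if (input.charAt(i) != '<') { i++; continue; }
      int close = input.indexOf('>', i + 1);
      if (close < 0) break; // no closing angle bracket left: stop scanning
      String body = input.substring(i + 1, close).trim();
      boolean isScriptEnd = body.equalsIgnoreCase("/script");
      // Inside script content, only the closing script tag counts as a tag.
      if (inScript() && !isScriptEnd) { i++; continue; }
      // "<" followed by whitespace (or an empty body) is not a tag.
      if (body.isEmpty() || Character.isWhitespace(input.charAt(i + 1))) { i++; continue; }
      tags.add(body);
      noteTag(body);
      i = close + 1;
    }
    return tags;
  }
}

// HTML-specific subclass: the only place where script context is ever switched on.
class HtmlTagScanner extends SimpleTagScanner {
  private boolean scriptOpen = false;

  @Override protected boolean inScript() { return scriptOpen; }

  @Override protected void noteTag(String name) {
    if (name.equalsIgnoreCase("script")) scriptOpen = true;
    else if (name.equalsIgnoreCase("/script")) scriptOpen = false;
  }
}
```

With the input <p><script>if (a<b && c>d) {go();}</script></p>, the base
scanner still reports a bogus "b && c" tag, while the HTML subclass reports
only p, script, /script and /p.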

Karl


On Mon, Sep 9, 2019 at 5:28 AM  wrote:

> Hi Karl,
>
> I'm not sure we're going in the good direction by trying to apply a strict
> XML parser in the HTML connector. HTML is not mandatorily XML compliant
> (otherwise it is XHTML), and it is therefore not what many web pages are
> made of. Speaking of which, the HTML source code I took as example passes
> the HTML validation.
> I've spent some time understanding how the main browsers handle the script
> tag while creating their DOM representation. As a matter of fact, they
> basically pause the DOM creation when finding it, and hand the scripts over
> to dedicated engines. See for instance this blog explaining it :
> https://hacks.mozilla.org/2017/09/building-the-dom-faster-speculative-parsing-async-defer-and-preload/
> As such, if we want to follow a similar approach, one way I have in mind
> could be the following:
>
> Have a "getScriptParseState" method in the TagParseState class :
>
> protected int getScriptParseState()
> {
>   return 0;
> }
>
> that would be overridden by the FormParseState class :
>
> protected int getScriptParseState()
> {
>   return scriptParseState;
> }
>
> Then use this method in the switch case of the TagParseState class for the
> TAGPARSESTATE_SAWLEFTANGLE case (l271 in MCF v2.12) :
>
> 
> else if (bTagDepth == 0)
> {
>   if (isWhitespace(thisChar) || getScriptParseState() == 1)
>   {
>     // Not a tag.
>     currentState = TAGPARSESTATE_NORMAL;
> 
>
> As the scriptParseState parameter would only be set to 1 in the
> ScriptParseState class which is specific to the web connector, we are sure
> that a connector willing to parse a standard XML file will not be impacted
> by our HTML specific method.
>
> What do you think ?
>
> Julien
>
> -Original Message-
> From: Karl Wright 
> Sent: Friday, September 6, 2019 16:54
> To: dev 
> Subject: Re: TagParseState behavior with Web connector
>
> *IF* you wanted to allow broken XML to be still correctly parsed, the
> first thing you must do is come up with a list of exceptions to standard
> XML parsing that you would want to support.  Presuming that you have a
> browser that you think is doing a good job of handling the broken HTML in
> question, you can certainly experiment to determine what that browser does
> with specific exception cases that you come up with.  Once that is done,
> then the state diagram for the tag parser must be modified in the minimal
> way to permit your exceptions to work.
>
> This is no small task, because you will be forced to consider certain tags
> as applying context, and since you are doing that, you are therefore going
> to necessarily break correct XML parsing in a non-HTML situation.  For
> example:
>
> if a<b {dostuff};
>
> ... would, in a true XML setting, recognize the beginning of a <b> tag,
> and you would not want to break the case where it really was a <b> tag:
>
>  text <b> bold text </b> 
>
> So an exception rule you might propose might be that if you start a tag,
> but don't properly complete it, the tag is not considered valid.  But then
> there's this case:
>
>  if a<b&&c>d {dostuff};
>
> Since the & is an XML entity begin, what do you do here?  The parser will
> correctly detect an invalid entity, but then it also needs to understand
> that it's also an invalid tag.
>
> There are a ton of cases, and they would all have to be handled correctly
> for javascript to consistently and successfully not be interpreted as tags.
>
> I'm willing to look at this but you're going to need to supply that list
> of cases.
>
> Karl
>
>
> On Fri, Sep 6, 2019 at 9:34 AM  wrote:
>
> > Hi Karl,
> >
> > Thanks for your suggestion. Took me some time to think about it, but I
> > think we have two different approaches for this case:
> > 1. In your case, it seems like if a source is problematic, it is its
> > own problem, not the one of the parser/connector, so the latter should
> > just discard the doc 2. In my case, we start from the principle that
> > in many situations (especially in web or enterprise scenarii), sources
> > cannot be changed as we want, be it for instance because they belong
> > to another party that has no interest in changing the code (think any
> > website that does not care who parses it), or because the software is
> > not maintained anymor

Re: Requesting edit permssions for ManifoldCF Confluence Pages

2019-09-07 Thread Karl Wright
I have edit permissions but since the upgrade to latest confluence I no
longer seem to have admin privs.
I think you'll need to create a ticket for getting added.

Karl


On Sat, Sep 7, 2019 at 11:53 AM Markus Schuch  wrote:

> Hi,
>
> I do not have edit permissions in our wiki space. Is somebody able to
> add me to the manifoldcf-committers-group?
>
> My apache username is schuch.
>
> Thanks,
> Markus
>


[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924679#comment-16924679
 ] 

Karl Wright commented on CONNECTORS-1566:
-

I integrated the branch with trunk.
The code is definitely picking up the cxf transport class now.  But the client 
I am developing for says that the connection check still yields this:

{code}
Connection failed: Error initializing web services: 
org.apache.cxf.service.factory.ServiceConstructionException: Failed to create 
service.
{code}

What's the reason behind that, do you think?  Is there a problem in the 
configuration file I've bundled with the jar, perhaps?  [~schuch], if you'd 
like to try it and debug in your environment I think we are *very* close to 
getting this done.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924522#comment-16924522
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Ok, removing that and substituting the cxf version worked for both alfresco and 
cmis.  So I think we have the solution.  Thanks!!

With the metadata stuff committed, all I'm waiting for is confirmation that it 
works properly before integrating with trunk.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924491#comment-16924491
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Hah, that worked, but *only* if I also remove jaxws-rt*.jar from connector-lib. 
 Otherwise it finds the wrong one.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924483#comment-16924483
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Thanks, this is very helpful.
The classloader switch may do the trick here and also allow us to preload the 
wsdls.  I'll give it a shot.
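
For readers unfamiliar with the technique, a minimal sketch of what a
"classloader switch" usually means: temporarily replacing the thread context
classloader around a call and restoring it afterwards.  This is an assumption
about the approach being discussed, not the code that was committed, and the
helper name ContextClassLoaderSwitch is made up for the example.

```java
import java.util.function.Supplier;

// Hypothetical helper; not part of ManifoldCF or CXF.
class ContextClassLoaderSwitch {
  // Runs the action with the given loader as the thread context classloader,
  // then restores whatever loader was in place before, even on failure.
  static <T> T callWith(ClassLoader loader, Supplier<T> action) {
    Thread current = Thread.currentThread();
    ClassLoader previous = current.getContextClassLoader();
    current.setContextClassLoader(loader);
    try {
      return action.get();
    } finally {
      current.setContextClassLoader(previous);
    }
  }
}
```

Libraries that discover providers through the context classloader would then
see the connector's own jars for the duration of the call.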


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





Re: TagParseState behavior with Web connector

2019-09-06 Thread Karl Wright
*IF* you wanted to allow broken XML to be still correctly parsed, the first
thing you must do is come up with a list of exceptions to standard XML
parsing that you would want to support.  Presuming that you have a browser
that you think is doing a good job of handling the broken HTML in question,
you can certainly experiment to determine what that browser does with
specific exception cases that you come up with.  Once that is done, then
the state diagram for the tag parser must be modified in the minimal way to
permit your exceptions to work.

This is no small task, because you will be forced to consider certain tags
as applying context, and since you are doing that, you are therefore going
to necessarily break correct XML parsing in a non-HTML situation.  For
example:

if a<b {dostuff};

... would, in a true XML setting, recognize the beginning of a <b> tag, and
you would not want to break the case where it really was a <b> tag:

 text <b> bold text </b> 

So an exception rule you might propose might be that if you start a tag,
but don't properly complete it, the tag is not considered valid.  But then
there's this case:

 if a<b&&c>d {dostuff};

Since the & is an XML entity begin, what do you do here?  The parser will
correctly detect an invalid entity, but then it also needs to understand
that it's also an invalid tag.

There are a ton of cases, and they would all have to be handled correctly
for javascript to consistently and successfully not be interpreted as tags.

I'm willing to look at this but you're going to need to supply that list of
cases.
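
For illustration, a minimal sketch of the strict-XML behaviour described above,
using only the JDK's built-in DOM parser rather than ManifoldCF's fuzzyml
classes (the class and method names are made up for the example).  The raw
javascript fragment is rejected as malformed, while the entity-escaped and
CDATA forms both parse and yield the same text.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of ManifoldCF.
class StrictXmlCheck {
  // Returns the root element's text if the input is well-formed XML, or null if it is not.
  static String parseText(String xml) {
    try {
      DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
      Document doc = builder.parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getTextContent();
    } catch (Exception e) {
      // Malformed input (or a parser setup problem) ends up here.
      return null;
    }
  }

  public static void main(String[] args) {
    // Raw "<b" is read as the start of a tag, so this is rejected.
    System.out.println(parseText("<s>if a<b {dostuff};</s>"));
    // Entity reference and CDATA section are both accepted.
    System.out.println(parseText("<s>if a&lt;b {dostuff};</s>"));
    System.out.println(parseText("<s><![CDATA[if a<b {dostuff};]]></s>"));
  }
}
```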

Karl


On Fri, Sep 6, 2019 at 9:34 AM  wrote:

> Hi Karl,
>
> Thanks for your suggestion. Took me some time to think about it, but I
> think we have two different approaches for this case:
> 1. In your case, it seems like if a source is problematic, it is its own
> problem, not the one of the parser/connector, so the latter should just
> discard the doc
> 2. In my case, we start from the principle that in many situations
> (especially in web or enterprise scenarii), sources cannot be changed as we
> want, be it for instance because they belong to another party that has no
> interest in changing the code (think any website that does not care who
> parses it), or because the software is not maintained anymore (old versions
> of CMS systems for instance).
>
> The question then is: do we want to enable connectors to be modified so
> that they can handle special non-compliant cases (which is our case), or do
> we want connectors that only and strictly index content that respect given
> specifications.
> The solutions here would be :
> 1. Use CDATA
> 2. Put the javascript code in its own file
> 3. Encode every problematic chars in the javascript
> Each solution requires to modify the source webpage which may be
> impossible or refused by the source owner, and the latter one would make
> the javascript code less readable and easy to understand by developers...
>
> So if I rephrase a bit my question, I would add to what I wrote in my
> first email:
>
> Assuming that the mentioned source document MUST be parsed to manage to
> perform the form based authentication, and assuming that it cannot be
> modified and thus it cannot comply with any of the recommendations exposed
> above, what would be your recommended approach to modify the connector so
> that it may optionally accept to handle such cases where we have spotted a
> given sequence of characters that pose problem ?
>
> Regards,
> Julien
>
> -Original Message-
> From: Karl Wright 
> Sent: Thursday, September 5, 2019 18:30
> To: dev 
> Subject: Re: TagParseState behavior with Web connector
>
> The parser requires that the document being parsed be valid XML.  Data
> within non-CDATA sections is *required* to use entity references to include
> < or > characters.  See:
>
>
> https://stackoverflow.com/questions/330725/use-of-greater-than-symbol-in-xml
>
>
> Karl
>
>
> On Thu, Sep 5, 2019 at 12:10 PM Julien Massiera <
> julien.massi...@francelabs.com> wrote:
>
> > Hi Karl,
> >
> > I discovered a problematic behavior with the
> > org.apache.manifoldcf.connectorcommon.fuzzyml.TagParseState class when
> > crawling web pages. This behavior poses problem in particular for the
> > scenario of form based authentication, as explained further in my email.
> >
> > The org.apache.manifoldcf.connectorcommon.fuzzyml.HTMLParseState class
> > which is called by the TagParseState on each noteTag() or noteEndTag()
> > methods, uses the
> > org.apache.manifoldcf.crawler.connectors.webcrawler.ScriptParseState
> > class to detect if the parsing process is in or out of a 'script' tag
> > and then do something or not with the incoming data.
> >
> > 

[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924294#comment-16924294
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], then this might be a case where there's a provider registry we need 
to initialize somehow.  How does it look up the provider?  What class does 
that, and in what jar is it?  How do these providers register themselves?

> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>    Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.




