[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-13 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929088#comment-16929088
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], if the enterprise workspace name is configurable, doesn't the 
category workspace name also need to be?


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-11 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927614#comment-16927614
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], there are no legal problems with including it.  The only code that 
would at all raise concerns would be the wsdl files, but it turns out those are 
considered interface specifications, and a court case about 5 years ago made 
those non-copyrightable.  Apache Legal signed off on that for the SharePoint 
connector, which also includes wsdls.

So please go ahead and make sure everything I overlooked is fixed.



[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-11 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927563#comment-16927563
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Sounds like a bug, yes.  The same fix should probably be included in the 
repository check() function.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-11 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927501#comment-16927501
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Thanks for the confirmation.
I've committed a fix to trunk for the "only some services are needed" issue.  
I'll keep an eye out for the rest of your commits.  Please be sure to include 
these wsdls in the .xml configuration file so they do not need to be fetched 
from the server.  I patched the ones already included so they would operate in 
a file-based environment, FWIW.
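
A minimal sketch of the "bundle the wsdls so they are not fetched from the server" idea, assuming the WSDL files are packaged as classpath resources. The class name WsdlLocator, the resource path, and the fallback URL are hypothetical and are not taken from the connector's code:

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class WsdlLocator {
  /**
   * Returns the URL of a WSDL bundled on the classpath if one exists,
   * otherwise falls back to the remote URL on the server.
   */
  public static URL resolve(ClassLoader loader, String resourcePath, String remoteUrl)
      throws MalformedURLException {
    // Look for the bundled copy first so no network access is needed.
    URL bundled = loader.getResource(resourcePath);
    if (bundled != null) {
      return bundled;
    }
    // No bundled copy: use the server-side WSDL location.
    return URI.create(remoteUrl).toURL();
  }
}
```

The returned URL could then be handed to whatever constructs the service reference; when the bundled copy is found, a "Connection refused" during WSDL parsing should no longer be possible at that step.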




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-11 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16927302#comment-16927302
 ] 

Karl Wright commented on CONNECTORS-1566:
-

More analysis:

Locally when I run this WITHOUT the meta-inf that I've been including in the 
jar, I get:

{code}
Caused by: javax.wsdl.WSDLException: WSDLException: faultCode=PARSER_ERROR: Problem parsing 'http://localhost:2099/cws/Authentication.svc'.: java.net.ConnectException: Connection refused: connect
    at com.ibm.wsdl.xml.WSDLReaderImpl.getDocument(Unknown Source)
    at com.ibm.wsdl.xml.WSDLReaderImpl.readWSDL(Unknown Source)
    at com.ibm.wsdl.xml.WSDLReaderImpl.readWSDL(Unknown Source)
    at org.apache.cxf.wsdl11.WSDLManagerImpl.loadDefinition(WSDLManagerImpl.java:266)
    at org.apache.cxf.wsdl11.WSDLManagerImpl.getDefinition(WSDLManagerImpl.java:165)
    at org.apache.cxf.wsdl11.WSDLServiceFactory.<init>(WSDLServiceFactory.java:85)
    ... 36 more
Caused by: java.net.ConnectException: Connection refused: connect
{code}

This makes sense since I am not running a LiveLink instance here.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924679#comment-16924679
 ] 

Karl Wright commented on CONNECTORS-1566:
-

I integrated the branch with trunk.
The code is definitely picking up the cxf transport class now.  But the client 
I am developing for says that the connection check still yields this:

{code}
Connection failed: Error initializing web services: 
org.apache.cxf.service.factory.ServiceConstructionException: Failed to create 
service.
{code}

What's the reason behind that, do you think?  Is there a problem in the 
configuration file I've bundled with the jar, perhaps?  [~schuch], if you'd 
like to try it and debug in your environment I think we are *very* close to 
getting this done.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924522#comment-16924522
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Ok, removing that and substituting the cxf version worked for both alfresco and 
cmis.  So I think we have the solution.  Thanks!!

With the metadata stuff committed, all I'm waiting for is confirmation that it 
works properly before integrating with trunk.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924491#comment-16924491
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Hah, that worked, but *only* if I also remove jaxws-rt*.jar from connector-lib. 
 Otherwise it finds the wrong one.
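
When two jars can supply the same class, it helps to see which one actually wins. The following is a small diagnostic sketch, not part of ManifoldCF; the class name WhichJar is hypothetical. It reports the code source a class was loaded from, using only standard JDK calls:

```java
import java.security.CodeSource;

public class WhichJar {
  /** Returns the location (usually a jar URL) that the named class was loaded from. */
  public static String locate(String className) {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    if (loader == null) {
      loader = ClassLoader.getSystemClassLoader();
    }
    try {
      // Do not initialize the class; we only want to know where it lives.
      Class<?> c = Class.forName(className, false, loader);
      CodeSource src = c.getProtectionDomain().getCodeSource();
      if (src == null || src.getLocation() == null) {
        return "(bootstrap or unknown)";
      }
      return src.getLocation().toString();
    } catch (ClassNotFoundException | LinkageError e) {
      return "(not found)";
    }
  }

  public static void main(String[] args) {
    for (String name : args) {
      System.out.println(name + " <- " + locate(name));
    }
  }
}
```

Run with the same classpath as the connector and pass the class names of interest as arguments. Which classes to check is a judgment call; the JAX-WS provider implementation classes would be natural candidates here.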




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924483#comment-16924483
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Thanks, this is very helpful.
The classloader switch may do the trick here and also allow us to preload the 
wsdls.  I'll give it a shot.
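
For reference, a classloader switch of this kind is usually written as a save/set/restore around the call that does the lookup. This is a generic sketch using only java.lang.Thread; the helper name ContextClassLoaderSwitch is hypothetical, and which loader to pass in is an assumption that depends on how the connector is loaded:

```java
import java.util.concurrent.Callable;

public class ContextClassLoaderSwitch {
  /**
   * Runs the action with the given loader installed as the thread context
   * classloader, and always restores the previous loader afterwards.
   */
  public static <T> T callWith(ClassLoader loader, Callable<T> action) throws Exception {
    Thread thread = Thread.currentThread();
    ClassLoader previous = thread.getContextClassLoader();
    thread.setContextClassLoader(loader);
    try {
      return action.call();
    } finally {
      // Restore even if the action throws, so other code is unaffected.
      thread.setContextClassLoader(previous);
    }
  }
}
```

Code that locates providers or resources through the context classloader will then see the connector's jars for the duration of the call only.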




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924294#comment-16924294
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], then this might be a case where there's a provider registry we need 
to initialize somehow.  How does it look up the provider?  What class does 
that, and in what jar is it?  How do these providers register themselves?
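
One way to investigate the registration question, assuming the providers use the standard Java service-provider convention of a META-INF/services/<interface name> file inside each jar: list every such file visible to a classloader along with the implementation classes it names. The class name ProviderRegistrations is hypothetical, and this is a diagnostic sketch rather than anything from the connector:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

public class ProviderRegistrations {
  /** Lists "registration file URL -> implementation class" for one service interface. */
  public static List<String> find(ClassLoader loader, String serviceName) throws IOException {
    List<String> found = new ArrayList<>();
    Enumeration<URL> urls = loader.getResources("META-INF/services/" + serviceName);
    while (urls.hasMoreElements()) {
      URL url = urls.nextElement();
      try (BufferedReader reader = new BufferedReader(
          new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
          // Strip comments and whitespace, as the service file format allows.
          int hash = line.indexOf('#');
          if (hash >= 0) {
            line = line.substring(0, hash);
          }
          line = line.trim();
          if (!line.isEmpty()) {
            found.add(url + " -> " + line);
          }
        }
      }
    }
    return found;
  }
}
```

Calling find(loader, "javax.xml.ws.spi.Provider") with the connector's classloader should show which jars register a JAX-WS provider; the URL of each entry includes the jar it came from.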



[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924053#comment-16924053
 ] 

Karl Wright commented on CONNECTORS-1566:
-

So I have only this now in the dist libraries:
{code}
kawright@1USDKAWRIGHT:/mnt/c/wip/mcf/CONNECTORS-1566/dist$ find -name jaxws*.jar
./connector-lib/jaxws-api-2.2.6.jar
kawright@1USDKAWRIGHT:/mnt/c/wip/mcf/CONNECTORS-1566/dist$
{code}

Didn't help; the connector still gives the same error.  [~schuch], any ideas?




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924044#comment-16924044
 ] 

Karl Wright commented on CONNECTORS-1566:
-

It's included with cmis and alfresco-webscript connectors.  Both of these have 
integration tests so I should be able to determine shortly if we can get rid of 
it.




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-06 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924039#comment-16924039
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Here's the current distribution area jar setup:

{code}
C:\wip\mcf\CONNECTORS-1566\dist>dir lib\jaxws-rt-*.jar
 Volume in drive C is Windows
 Volume Serial Number is F4D8-E4E0

 Directory of C:\wip\mcf\CONNECTORS-1566\dist\lib

File Not Found

C:\wip\mcf\CONNECTORS-1566\dist>dir connector-common-lib\jaxws-rt-*.jar
 Volume in drive C is Windows
 Volume Serial Number is F4D8-E4E0

 Directory of C:\wip\mcf\CONNECTORS-1566\dist\connector-common-lib

File Not Found

C:\wip\mcf\CONNECTORS-1566\dist>dir connector-lib\jaxws-rt-*.jar
 Volume in drive C is Windows
 Volume Serial Number is F4D8-E4E0

 Directory of C:\wip\mcf\CONNECTORS-1566\dist\connector-lib

09/05/2019  07:28 PM 2,168,512 jaxws-rt-2.2.6.jar
   1 File(s)  2,168,512 bytes
   0 Dir(s)  200,186,331,136 bytes free

C:\wip\mcf\CONNECTORS-1566\dist>
{code}

The jar in question is being brought in by another connector.  My question: if 
we make sure it's NOT brought in, and we instead make sure the cxf version is 
available, will the other connector break?  Will the csws connector begin to 
work?  I'll check it out.




[jira] [Comment Edited] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-05 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923830#comment-16923830
 ] 

Karl Wright edited comment on CONNECTORS-1566 at 9/6/19 12:03 AM:
--

Thanks!
Now I'm having problems trying to use the Feature code for Https support.  I 
get the following error:

'Doesn't support any Service specific features'

This is without any https being used; I think it's basically complaining that 
the transport layer doesn't know what to do with the Feature I'm giving it.  
Any idea what to do about that?  Here's the code; without the commented-out 
conduitFeature, I get the error:

{code}
// Build TLSClientParameters
final TLSClientParameters tlsConfig = new TLSClientParameters();
if (keystore != null) {
  tlsConfig.setTrustManagers(keystore.getTrustManagers());
}
// Build configuration for conduit
final HttpConduitConfig config = new HttpConduitConfig();
config.setTlsClientParameters(tlsConfig);
final HttpConduitFeature conduitFeature = new HttpConduitFeature();
conduitFeature.setConduitConfig(config);

// Construct service references from the URLs
try {
  this.authService = new Authentication_Service(new URL(authenticationServiceURL)/*, conduitFeature*/);
  this.documentManagementService = new DocumentManagement_Service(new URL(documentManagementServiceURL)/*, conduitFeature*/);
  this.contentServiceService = new ContentService_Service(new URL(contentServiceServiceURL)/*, conduitFeature*/);
  this.memberServiceService = new MemberService_Service(new URL(memberServiceServiceURL)/*, conduitFeature*/);
  this.searchServiceService = new SearchService_Service(new URL(searchServiceServiceURL)/*, conduitFeature*/);
} catch (javax.xml.ws.WebServiceException e) {
  throw new ManifoldCFException("Error initializing web services: "+e.getMessage(), e);
} catch (MalformedURLException e) {
  throw new ManifoldCFException("Malformed URL: "+e.getMessage(), e);
}
{code}




was (Author: kwri...@metacarta.com):
Thanks!
Now I'm having problems trying to use the Feature code for Https support.  I 
get the following error:

'Doesn't support any Service specific features'

This is without any https being used; I think it's basically complaining that 
the transport layer doesn't know what to do with the Feature I'm giving it.  
Any idea what to do about that?




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-05 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923830#comment-16923830
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Thanks!
Now I'm having problems trying to use the Feature code for Https support.  I 
get the following error:

'Doesn't support any Service specific features'

This is without any https being used; I think it's basically complaining that 
the transport layer doesn't know what to do with the Feature I'm giving it.  
Any idea what to do about that?




[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-09-02 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921084#comment-16921084
 ] 

Karl Wright commented on CONNECTORS-1566:
-

The only thing that is preventing this from going live now is the ability to 
get metadata for documents.  Specifically, I need the following method:

{code}
String[] getAttributeValues(long docID, long catID);
{code}

[~schuch], the engineer I'm working with elsewhere seems to be incapable of 
discovering how this is done.  Do you have anyone where you work who may be 
able to find the answer?





[jira] [Updated] (CONNECTORS-1508) Add support for French Language

2019-09-02 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1508:

Fix Version/s: (was: ManifoldCF 2.14)
   ManifoldCF 2.15

> Add support for French Language
> ---
>
> Key: CONNECTORS-1508
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1508
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Documentation
>Affects Versions: ManifoldCF 2.10
>Reporter: Cedric Ulmer
>Assignee: Karl Wright
>Priority: Minor
> Fix For: ManifoldCF 2.15
>
> Attachments: cedricmanifold_fr.zip
>
>
> Some users may need a French version of the resource bundle. I attached a 
> preliminary translation that France Labs made some time ago (probably around 
> summer 2016), but that we halted due to lack of time (and priority). It is 
> probably almost complete, but some quality checking needs to be done. Note 
> also that I forgot to check the version when I did the translations, so 
> anyone interested would need to check any modifications that may have 
> occurred between this version and the current MCF version.





[jira] [Updated] (CONNECTORS-1521) Documentum Connector users ManifoldCF's local time in queries constraints against the Documentum server without reference to time zones

2019-09-02 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1521:

Fix Version/s: (was: ManifoldCF 2.14)
   ManifoldCF 2.15

> Documentum Connector users ManifoldCF's local time in queries constraints 
> against the Documentum server without reference to time zones
> ---
>
> Key: CONNECTORS-1521
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1521
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Documentum connector
>Affects Versions: ManifoldCF 2.10
>Reporter: James Thomas
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.15
>
>
> I find that the time/date constraints in queries to the Documentum server are 
> based on the "raw" local time of the ManifoldCF server but appear to take no 
> account of the time zones of the two servers.
> This can lead to recently modified files not being transferred to the output 
> repository when you would naturally expect them to be. I'd like the times to 
> be aligned, perhaps by including time zone in the query. In particular, is 
> there a way to use UTC perhaps?
> Here's an example ...
>  * create a folder in Documentum
>  * set up a job to point at the folder and output to the file system
>  * put two documents into a folder in Documentum
>  * Select them, right click and export as CSV (to show the timestamps):
> {noformat}
> 1.png,48489.0,Portable Network Graphics,8/7/2018 9:04 AM,
> 2.png,28620.0,Portable Network Graphics,8/7/2018 9:04 AM,,{noformat}
> Check the local time on the ManifoldCF server machine. Observe that it's 
> reporting consistent time with the DM server:
> {noformat}
> [james@manifold]$ date
> Tue Aug  7 09:07:25 BST 2018{noformat}
> Start the job and look for the query to Documentum in the manifoldcf.log file 
> (line break added for readability):
> {noformat}
> DEBUG 2018-08-07T08:07:47.297Z (Startup thread) - DCTM: About to execute 
> query= (select for READ distinct i_chronicle_id from dm_document where 
> r_modify_date >= date('01/01/1970 00:00:00','mm/dd/yyyy hh:mi:ss') and
> r_modify_date<=date('08/07/2018 08:07:34','mm/dd/yyyy hh:mi:ss') 
> AND (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND 
> r_content_size>0)) AND ( Folder('/Administrator/james', DESCEND) ))
> ^C{noformat}
> Notice that the latest date asked for is *before* the modification date of 
> the files added to DM. (And is an hour out, see footnote.)
>   
>  See whether anything has been output by the File System connector. It hasn't:
> {noformat}
> [james@manifold]$ ls /bigdisc/source/PDFs/timezones/
> [james@manifold]$
> {noformat}
> Now:
>  * change the timezone on the ManifoldCF server machine
>  * restart the ManifoldCF server and the Documentum processes
>  * reseed the job
> Check the local time on the ManifoldCF server machine; it has changed:
> {noformat}
> [james@manifold]$ date
> Tue Aug  7 10:10:29 CEST 2018{noformat}
> Start the job again and notice that the query has changed by an hour, plus 
> the few minutes it took to change the date etc (and is still an hour out, see 
> footnote):
> {noformat}
> r_modify_date<=date('08/07/2018 09:11:02','mm/dd/yyyy hh:mi:ss') 
> {noformat}
> Observe that the range of dates now covers the timestamps on the DM data, and 
> also that some data has now been transferred by the File System connector:
> {noformat}
> [james@manifold]$ ls 
> /bigdisc/source/PDFs/timezones/http/mfserver\:8080/da/component/
> drl?versionLabel=CURRENT=09018000e515
> drl?versionLabel=CURRENT=09018000e516
> {noformat}
>  
>  
> [Footnote] It appears that something is trying to take account of Daylight 
> Saving Time too.
> If I set the server date to a time outside of DST, the query is aligned with 
> the current time:
> {noformat}
> [i2e@i2ehost manifold]$ date
>  Mon Oct 29 00:01:13 CET 2018
> r_modify_date<=date('10/29/2018 00:01:39','mm/dd/yyyy hh:mi:ss') 
> {noformat}
> But if I set the time inside DST, the time is an hour before:
> {noformat}
> [i2e@i2ehost manifold]$ date
>  Sat Oct 27 00:00:06 CEST 2018
> r_modify_date<=date('10/26/2018 23:00:26','mm/dd/yyyy hh:mi:ss') 
> {noformat}
> This is perhaps a Java issue rather than a logic issue in the connector? See 
> e.g. [https://stackoverflow.com/questions/6392/java-time-zone-is-messed-up]
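
The one-hour gap in the first example can be reproduced with plain java.time: the same instant prints differently depending on the zone used to format it. This is only an illustration of the suspected cause, not the connector's code; the class name QueryTimestamp is hypothetical and the pattern mirrors the mm/dd/yyyy hh:mi:ss shape of the query:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class QueryTimestamp {
  // Java pattern equivalent to the date layout seen in the logged query.
  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss");

  /** Formats an instant as wall-clock time in an explicitly chosen zone. */
  public static String format(Instant instant, ZoneId zone) {
    return FORMAT.withZone(zone).format(instant);
  }

  public static void main(String[] args) {
    Instant cutoff = Instant.parse("2018-08-07T08:07:34Z");
    System.out.println("UTC:           " + format(cutoff, ZoneId.of("UTC")));
    System.out.println("Europe/London: " + format(cutoff, ZoneId.of("Europe/London")));
  }
}
```

Formatted in UTC the cutoff reads 08:07:34, which is earlier than the 9:04 AM local (BST) modification times, while the same instant in Europe/London reads 09:07:34. Choosing the zone explicitly on both sides of the comparison, rather than relying on a default, is the design point.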





[jira] [Resolved] (CONNECTORS-1561) Upgrade to Tika 1.20 when available

2019-09-01 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1561.
-
Fix Version/s: ManifoldCF 2.14
 Assignee: Karl Wright
   Resolution: Fixed

> Upgrade to Tika 1.20 when available
> ---
>
> Key: CONNECTORS-1561
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1561
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Tim Allison
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> On TIKA-2776, a ManifoldCF user alerted us to a bad bug in Tika 1.19 and 
> 1.19.1 that causes tika-server to return 503 forever after it hits an OOM.  
> This is bad.  We'll be rolling a fix out in a week or two in Tika 1.20.





[jira] [Commented] (CONNECTORS-1561) Upgrade to Tika 1.20 when available

2019-09-01 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920382#comment-16920382
 ] 

Karl Wright commented on CONNECTORS-1561:
-

[~cguzel], we just upgraded to Tika 1.21 two weeks ago, and we're trying to get 
a release out the door.  It's not just the Tika dependency; there's a ton of 
downstream dependencies that would also need to be upgraded. 
 Please create a new ticket for Tika 1.22.


> Upgrade to Tika 1.20 when available
> ---
>
> Key: CONNECTORS-1561
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1561
> Project: ManifoldCF
>  Issue Type: Improvement
>Reporter: Tim Allison
>Priority: Major
>
> On TIKA-2776, a ManifoldCF user alerted us to a bad bug in Tika 1.19 and 
> 1.19.1 that causes tika-server to return 503 forever after it hits an OOM.  
> This is bad.  We'll be rolling a fix out in a week or two in Tika 1.20.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-08-27 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16917338#comment-16917338
 ] 

Karl Wright commented on CONNECTORS-1566:
-

I hooked up the TLSConfigParameters feature and checked it in.  Hope it works.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-08-27 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916514#comment-16916514
 ] 

Karl Wright commented on CONNECTORS-1566:
-

OK, so I haven't yet found any obvious way of using TlsConfiguration to set an 
SSLSocketFactory, which would be the ideal.  As it works right now, you can set 
a trust store and a key store.  These are not currently available from 
IKeystoreManager, but they could be added at the framework level.  I'm still 
looking for the SSLSocketFactory mechanism, though, because without it there 
are no socket timeouts; we wrap the factory to create those.
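
A minimal sketch of the wrapping idea mentioned above, assuming plain JSSE 
only.  The names TimeoutSSLSocketFactory and timeoutOfFreshSocket are 
hypothetical; this is not the ManifoldCF or CXF code, and how such a factory 
gets handed to the web-service client is exactly the open question here, so 
that part is not shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.ssl.SSLSocketFactory;

// Hypothetical wrapper: delegates to a real SSLSocketFactory and applies a
// read timeout (SO_TIMEOUT) to every socket it hands out.
class TimeoutSSLSocketFactory extends SSLSocketFactory {
  private final SSLSocketFactory delegate;
  private final int readTimeoutMillis;

  TimeoutSSLSocketFactory(SSLSocketFactory delegate, int readTimeoutMillis) {
    this.delegate = delegate;
    this.readTimeoutMillis = readTimeoutMillis;
  }

  // Applies the timeout; called on every socket before it is returned.
  private Socket configure(Socket s) throws IOException {
    s.setSoTimeout(readTimeoutMillis);
    return s;
  }

  @Override public String[] getDefaultCipherSuites() {
    return delegate.getDefaultCipherSuites();
  }

  @Override public String[] getSupportedCipherSuites() {
    return delegate.getSupportedCipherSuites();
  }

  @Override public Socket createSocket() throws IOException {
    return configure(delegate.createSocket());
  }

  @Override public Socket createSocket(Socket s, String host, int port,
      boolean autoClose) throws IOException {
    return configure(delegate.createSocket(s, host, port, autoClose));
  }

  @Override public Socket createSocket(String host, int port) throws IOException {
    return configure(delegate.createSocket(host, port));
  }

  @Override public Socket createSocket(String host, int port,
      InetAddress localHost, int localPort) throws IOException {
    return configure(delegate.createSocket(host, port, localHost, localPort));
  }

  @Override public Socket createSocket(InetAddress host, int port) throws IOException {
    return configure(delegate.createSocket(host, port));
  }

  @Override public Socket createSocket(InetAddress address, int port,
      InetAddress localAddress, int localPort) throws IOException {
    return configure(delegate.createSocket(address, port, localAddress, localPort));
  }

  // Demo helper: builds an unconnected socket from the JVM default factory
  // and reports the timeout that the wrapper applied to it.
  static int timeoutOfFreshSocket(int millis) {
    SSLSocketFactory base = (SSLSocketFactory) SSLSocketFactory.getDefault();
    try (Socket s = new TimeoutSSLSocketFactory(base, millis).createSocket()) {
      return s.getSoTimeout();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The wrapper only sets the read timeout; a connect timeout would need the 
unconnected createSocket() variant plus an explicit connect call, which this 
sketch leaves out.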


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-08-27 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916492#comment-16916492
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Hi [~schuch], if you have any snippets of code that demonstrate how to 
initialize a custom SSLSocketProvider using the HttpConduitConfig mechanism, 
PRIOR to the fetching of the wsdls, that would be fantastic.  I'm still 
struggling with that.

I'm still working with the other client to try and get metadata working 
properly as well.  Stay tuned.

The NodeRights permission is a problem and I have no solution.  We absolutely 
need to be able to construct ACLs for these documents, though.  If you have any 
other suggestions for how to do the same thing, I'd love to hear them.  LAPI 
didn't require full administration to crawl, IIRC, but I really don't know for 
certain, because for my testing setups 15 years ago I always used admin.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1105) Add maven delivery targets to poms

2019-08-26 Thread Karl Wright (Jira)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16916161#comment-16916161
 ] 

Karl Wright commented on CONNECTORS-1105:
-

Unfortunately I'm not in a position to modify my settings.xml for work-related 
reasons, so I cannot try this.  But it would be great to write up the release 
steps and put it into the appropriate confluence document on the wiki.  There's 
already a "how to release" page there; if you add a step-by-step process (and 
explain, for instance, how to get hold of your encrypted LDAP password) that 
should be sufficient.

> Add maven delivery targets to poms
> --
>
> Key: CONNECTORS-1105
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1105
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: ManifoldCF 1.8
>Reporter: Karl Wright
>Assignee: Markus Schuch
>Priority: Major
> Fix For: ManifoldCF next
>
>
> We've been asked to deliver mcf jars and wars to maven central repository by 
> some developers.  This ticket represents that work.





[jira] [Resolved] (CONNECTORS-1621) Fix for CONNECTORS-1482 broke Solr / Tika integration

2019-08-23 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1621.
-
Resolution: Fixed

r1865744


> Fix for CONNECTORS-1482 broke Solr / Tika integration
> -
>
> Key: CONNECTORS-1621
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1621
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Lucene/SOLR connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> When you use ManifoldCF with Tika extraction and Solr indexing via the Update 
> handler, all documents except text documents get rejected by the Solr 
> connector.





[jira] [Created] (CONNECTORS-1621) Fix for CONNECTORS-1482 broke Solr / Tika integration

2019-08-23 Thread Karl Wright (Jira)
Karl Wright created CONNECTORS-1621:
---

 Summary: Fix for CONNECTORS-1482 broke Solr / Tika integration
 Key: CONNECTORS-1621
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1621
 Project: ManifoldCF
  Issue Type: Bug
  Components: Lucene/SOLR connector
Affects Versions: ManifoldCF 2.13
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 2.14


When you use ManifoldCF with Tika extraction and Solr indexing via the Update 
handler, all documents except text documents get rejected by the Solr connector.





[jira] [Resolved] (CONNECTORS-1620) Accept Sitemaps with content type application/xml

2019-08-22 Thread Karl Wright (Jira)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1620.
-
Fix Version/s: ManifoldCF 2.14
   Resolution: Fixed

r1865689


> Accept Sitemaps with content type application/xml
> -
>
> Key: CONNECTORS-1620
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1620
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Web connector
>Reporter: Markus Schuch
>Assignee: Markus Schuch
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> Given an Output Connection that does not accept the MIME type 
> {{application/xml}} for ingestion, it is currently not possible to crawl a 
> sitemap.xml when the web server returns {{application/xml}} as the content 
> type for the sitemap.
> The sitemap is discarded before the links are extracted, because the MIME 
> type {{application/xml}} is not listed in the {{interestingMimeTypeArray}}.
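
The check described above can be sketched roughly as follows.  This is an 
illustration only: the class name, the method name, and the array contents 
are assumptions, and the real {{interestingMimeTypeArray}} in the Web 
connector may hold different entries and be consulted differently.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical stand-in for the connector's MIME type filter.
class SitemapMimeCheck {
  // Assumed contents; the point is that application/xml must be present
  // for a sitemap served with that content type to survive the filter.
  static final String[] INTERESTING_MIME_TYPES = {
    "text/html",
    "text/xml",
    "application/xml"
  };

  // Returns true if the Content-Type header value names a listed MIME type.
  // Parameters such as "; charset=UTF-8" are stripped before comparing.
  static boolean isInteresting(String contentType) {
    if (contentType == null) {
      return false;
    }
    int semi = contentType.indexOf(';');
    String mime = (semi >= 0 ? contentType.substring(0, semi) : contentType)
        .trim()
        .toLowerCase(Locale.ROOT);
    return Arrays.asList(INTERESTING_MIME_TYPES).contains(mime);
  }
}
```

With application/xml removed from the assumed array, the same call returns 
false for such a sitemap, which matches the behavior reported here: the 
document is dropped before any links are extracted.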





[jira] [Commented] (CONNECTORS-1105) Add maven delivery targets to poms

2019-08-14 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16907559#comment-16907559
 ] 

Karl Wright commented on CONNECTORS-1105:
-

Feel free to add the linting fix to the poms that need it.  It's probably a lot 
faster than fixing the javadoc itself for the moment.


> Add maven delivery targets to poms
> --
>
> Key: CONNECTORS-1105
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1105
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: ManifoldCF 1.8
>Reporter: Karl Wright
>Assignee: Markus Schuch
>Priority: Major
> Fix For: ManifoldCF next
>
>
> We've been asked to deliver mcf jars and wars to maven central repository by 
> some developers.  This ticket represents that work.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (CONNECTORS-1105) Add maven delivery targets to poms

2019-08-13 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906746#comment-16906746
 ] 

Karl Wright commented on CONNECTORS-1105:
-

All that I know is summarized in this ticket.  If Maurizio signed up, I never 
heard of it.


> Add maven delivery targets to poms
> --
>
> Key: CONNECTORS-1105
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1105
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: ManifoldCF 1.8
>Reporter: Karl Wright
>Assignee: Markus Schuch
>Priority: Major
> Fix For: ManifoldCF next
>
>
> We've been asked to deliver mcf jars and wars to maven central repository by 
> some developers.  This ticket represents that work.





[jira] [Resolved] (CONNECTORS-1591) RTF comment parsing problem

2019-08-13 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1591.
-
Resolution: Fixed

r1865081


> RTF comment parsing problem
> ---
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
>  Issue Type: Bug
>Reporter: Zoltan Farago
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: comment.rtf, result.txt
>
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF 
> file, the result has no separator. See attachments.





[jira] [Commented] (CONNECTORS-1591) RTF comment parsing problem

2019-08-13 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16906672#comment-16906672
 ] 

Karl Wright commented on CONNECTORS-1591:
-

Ok, I'll look into this update shortly.


> RTF comment parsing problem
> ---
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
>  Issue Type: Bug
>Reporter: Zoltan Farago
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: comment.rtf, result.txt
>
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF 
> file, the result has no separator. See attachments.





[jira] [Resolved] (CONNECTORS-1611) Update MySQL Version

2019-08-06 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1611.
-
   Resolution: Fixed
Fix Version/s: ManifoldCF 2.14

r1864515


> Update MySQL Version
> 
>
> Key: CONNECTORS-1611
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1611
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: JDBC connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Furkan KAMACI
>Assignee: Furkan KAMACI
>Priority: Critical
> Fix For: ManifoldCF 2.14
>
>
> New MySQL versions come with *caching_sha2_password*. mysql-connector before 
> version 5.1.46 does not have such compatibility and users get this error:
> {code:java}
> java.sql.SQLException: Unable to load authentication plugin 
> 'caching_sha2_password'.
> {code}





[jira] [Assigned] (CONNECTORS-1616) Confluence Authority does not handle Confluence API errors

2019-08-01 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1616:
---

   Resolution: Fixed
 Assignee: Karl Wright
Fix Version/s: (was: ManifoldCF 2.12)
   ManifoldCF 2.14

> Confluence Authority does not handle Confluence API errors
> --
>
> Key: CONNECTORS-1616
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1616
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Confluence connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
> Fix For: ManifoldCF 2.14
>
> Attachments: patch-CONNECTORS-1616.diff
>
>
> The Confluence Authority asks for space permissions for each space returned by 
> the getSpaces() method of the ConfluenceClient class.
>  The concerned method getSpacePermissionsForUser(Space space, String 
> username) sends a request to Confluence and triggers a ConfluenceException if 
> the request response is different from 200, otherwise it parses the response 
> as a JSONArray to get the permissions.
> The problem is that even with a 200 request response code, the content of the 
> response may be a simple JSONObject that explains an internal Confluence 
> error. It happens frequently when the username is not allowed to get 
> permissions on certain spaces :
> {code:java}
> {
> "id": null,
> "error": {
> "code": 500,
> "message": "The application was unable to serve your request: 
> com.atlassian.confluence.rpc.NotPermittedException: Only space administrators 
> can view permissions for other users in the space.",
> "data": "com.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the 
> space.\ncom.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the space.\n\tat 
> com.atlassian.confluence.rpc.soap.services.SpacesSoapService.getPermissions(SpacesSoapService.java:188)\n\tat
>  
> com.atlassian.confluence.rpc.soap.XhtmlSoapService.getPermissionsForUser(XhtmlSoapService.java:226)\n\tat
>  sun.reflect.GeneratedMethodAccessor3812.invoke(Unknown Source)\n\tat 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
>  java.lang.reflect.Method.invoke(Method.java:498)\n\tat"
> }
> }
> {code}
> As a username will almost never have the rights to get permissions on ALL 
> spaces, it is problematic because the response parsing process triggers an 
> Exception and so the Confluence Authority response is 
> UNREACHABLEAUTHORITY:confluence, instead of delivering permissions for 
> allowed spaces.
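
A rough sketch of the distinction described above, namely telling a 
permissions array apart from an error object when both arrive with HTTP 200. 
The class and method names are hypothetical, and the string inspection is a 
deliberate simplification to keep the example free of dependencies; a real 
fix would more likely use whatever JSON parser the connector already relies 
on.

```java
// Hypothetical helper, not part of the Confluence connector.
class ConfluenceResponseShape {
  enum Kind { PERMISSION_ARRAY, ERROR_OBJECT, UNKNOWN }

  // Classifies a response body by its outermost JSON structure.
  // Assumption: permissions come back as a JSON array, while internal
  // errors come back as a JSON object carrying an "error" member.
  static Kind classify(String body) {
    if (body == null) {
      return Kind.UNKNOWN;
    }
    String trimmed = body.trim();
    if (trimmed.startsWith("[")) {
      return Kind.PERMISSION_ARRAY;
    }
    if (trimmed.startsWith("{") && trimmed.contains("\"error\"")) {
      return Kind.ERROR_OBJECT;
    }
    return Kind.UNKNOWN;
  }
}
```

What the authority should do once it sees an error object (skip the space, or 
fail outright because the configured user lacks the needed rights) is a 
separate decision that this sketch does not take.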





[jira] [Commented] (CONNECTORS-1616) Confluence Authority does not handle Confluence API errors

2019-08-01 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897946#comment-16897946
 ] 

Karl Wright commented on CONNECTORS-1616:
-

So, the issue here is that you are giving the authority insufficient 
permissions to be able to get individual user permissions.  You can't fix that 
by simply ignoring it.  I don't understand, therefore, why you expect this 
patch to solve anything for you.  It seems to me like it would just obscure the 
fact that you are configuring a user without admin privs.


> Confluence Authority does not handle Confluence API errors
> --
>
> Key: CONNECTORS-1616
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1616
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Confluence connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Priority: Critical
> Fix For: ManifoldCF 2.12
>
> Attachments: patch-CONNECTORS-1616.diff
>
>
> The Confluence Authority asks for space permissions for each space returned by 
> the getSpaces() method of the ConfluenceClient class.
>  The concerned method getSpacePermissionsForUser(Space space, String 
> username) sends a request to Confluence and triggers a ConfluenceException if 
> the request response is different from 200, otherwise it parses the response 
> as a JSONArray to get the permissions.
> The problem is that even with a 200 request response code, the content of the 
> response may be a simple JSONObject that explains an internal Confluence 
> error. It happens frequently when the username is not allowed to get 
> permissions on certain spaces :
> {code:java}
> {
> "id": null,
> "error": {
> "code": 500,
> "message": "The application was unable to serve your request: 
> com.atlassian.confluence.rpc.NotPermittedException: Only space administrators 
> can view permissions for other users in the space.",
> "data": "com.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the 
> space.\ncom.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the space.\n\tat 
> com.atlassian.confluence.rpc.soap.services.SpacesSoapService.getPermissions(SpacesSoapService.java:188)\n\tat
>  
> com.atlassian.confluence.rpc.soap.XhtmlSoapService.getPermissionsForUser(XhtmlSoapService.java:226)\n\tat
>  sun.reflect.GeneratedMethodAccessor3812.invoke(Unknown Source)\n\tat 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
>  java.lang.reflect.Method.invoke(Method.java:498)\n\tat"
> }
> }
> {code}
> As a username will almost never have the rights to get permissions on ALL 
> spaces, it is problematic because the response parsing process triggers an 
> Exception and so the Confluence Authority response is 
> UNREACHABLEAUTHORITY:confluence, instead of delivering permissions for 
> allowed spaces.





[jira] [Commented] (CONNECTORS-1616) Confluence Authority does not handle Confluence API errors

2019-08-01 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897940#comment-16897940
 ] 

Karl Wright commented on CONNECTORS-1616:
-

Ok, there are some problems with it.  Why did you remove stream parsing for the 
JSON?  I'll have to edit.


> Confluence Authority does not handle Confluence API errors
> --
>
> Key: CONNECTORS-1616
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1616
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Confluence connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Priority: Critical
> Fix For: ManifoldCF 2.12
>
> Attachments: patch-CONNECTORS-1616.diff
>
>
> The Confluence Authority asks for space permissions for each space returned by 
> the getSpaces() method of the ConfluenceClient class.
>  The concerned method getSpacePermissionsForUser(Space space, String 
> username) sends a request to Confluence and triggers a ConfluenceException if 
> the request response is different from 200, otherwise it parses the response 
> as a JSONArray to get the permissions.
> The problem is that even with a 200 request response code, the content of the 
> response may be a simple JSONObject that explains an internal Confluence 
> error. It happens frequently when the username is not allowed to get 
> permissions on certain spaces :
> {code:java}
> {
> "id": null,
> "error": {
> "code": 500,
> "message": "The application was unable to serve your request: 
> com.atlassian.confluence.rpc.NotPermittedException: Only space administrators 
> can view permissions for other users in the space.",
> "data": "com.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the 
> space.\ncom.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the space.\n\tat 
> com.atlassian.confluence.rpc.soap.services.SpacesSoapService.getPermissions(SpacesSoapService.java:188)\n\tat
>  
> com.atlassian.confluence.rpc.soap.XhtmlSoapService.getPermissionsForUser(XhtmlSoapService.java:226)\n\tat
>  sun.reflect.GeneratedMethodAccessor3812.invoke(Unknown Source)\n\tat 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
>  java.lang.reflect.Method.invoke(Method.java:498)\n\tat"
> }
> }
> {code}
> As a username will almost never have the rights to get permissions on ALL 
> spaces, it is problematic because the response parsing process triggers an 
> Exception and so the Confluence Authority response is 
> UNREACHABLEAUTHORITY:confluence, instead of delivering permissions for 
> allowed spaces.





[jira] [Commented] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-31 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897372#comment-16897372
 ] 

Karl Wright commented on CONNECTORS-1615:
-

Patches welcome.


> Bad Error Message when IDCOLUMN's value is actually null
> 
>
> Key: CONNECTORS-1615
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: JDBC connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Michael Cizmar
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> In the edge case that the id column is null, the error message doesn't 
> suggest that.
>  
> {code:java}
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
>   if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return 
> $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. 
> \"$(IDCOLUMN)\", or, for MySQL, select \"by label\" in your repository 
> connection.");
>   String idValue = JDBCConnection.readAsString(o);
> {code}
>  
>  
> Also, should it entirely fail if one $IDCOLUMN record is null?





[jira] [Commented] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-31 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897343#comment-16897343
 ] 

Karl Wright commented on CONNECTORS-1615:
-

Right, but as I said, I have no way of detecting the difference between these 
two cases.


> Bad Error Message when IDCOLUMN's value is actually null
> 
>
> Key: CONNECTORS-1615
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: JDBC connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Michael Cizmar
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> In the edge case that the id column is null, the error message doesn't 
> suggest that.
>  
> {code:java}
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
>   if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return 
> $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. 
> \"$(IDCOLUMN)\", or, for MySQL, select \"by label\" in your repository 
> connection.");
>   String idValue = JDBCConnection.readAsString(o);
> {code}
>  
>  
> Also, should it entirely fail if one $IDCOLUMN record is null?





[jira] [Commented] (CONNECTORS-1616) Confluence Authority does not handle Confluence API errors

2019-07-31 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897105#comment-16897105
 ] 

Karl Wright commented on CONNECTORS-1616:
-

Patches welcome.
I'm not the author of the confluence connector so any help you can provide is 
useful.


> Confluence Authority does not handle Confluence API errors
> --
>
> Key: CONNECTORS-1616
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1616
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Confluence connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Priority: Critical
> Fix For: ManifoldCF 2.12
>
> Attachments: patch-CONNECTORS-1616.diff
>
>
> The Confluence Authority asks for space permissions for each space returned by 
> the getSpaces() method of the ConfluenceClient class.
> The concerned method getSpacePermissionsForUser(Space space, String username) 
> sends a request to Confluence and triggers a ConfluenceException if the 
> request response is different from 200, otherwise it parses the response as a 
> JSONArray to get the permissions.
> The problem is that even with a 200 request response code, the content of the 
> response may be a simple JSONObject that explains an internal Confluence 
> error. It happens frequently when the username is not allowed to get 
> permissions on certain spaces :
> {code:java}
> {
> "id": null,
> "error": {
> "code": 500,
> "message": "The application was unable to serve your request: 
> com.atlassian.confluence.rpc.NotPermittedException: Only space administrators 
> can view permissions for other users in the space.",
> "data": "com.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the 
> space.\ncom.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the space.\n\tat 
> com.atlassian.confluence.rpc.soap.services.SpacesSoapService.getPermissions(SpacesSoapService.java:188)\n\tat
>  
> com.atlassian.confluence.rpc.soap.XhtmlSoapService.getPermissionsForUser(XhtmlSoapService.java:226)\n\tat
>  sun.reflect.GeneratedMethodAccessor3812.invoke(Unknown Source)\n\tat 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
>  java.lang.reflect.Method.invoke(Method.java:498)\n\tat"
> }
> }
> {code}
> As a username will almost never have the rights to get permissions on ALL 
> spaces, it is problematic because the response parsing process triggers an 
> Exception and so the Confluence Authority response is 
> UNREACHABLEAUTHORITY:confluence, instead of delivering permissions for 
> allowed spaces.





[jira] [Assigned] (CONNECTORS-1616) Confluence Authority does not handle Confluence API errors

2019-07-31 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1616:
---

Assignee: Karl Wright

> Confluence Authority does not handle Confluence API errors
> --
>
> Key: CONNECTORS-1616
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1616
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Confluence connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
>
> The Confluence Authority asks for space permissions for each space returned by 
> the getSpaces() method of the ConfluenceClient class.
> The concerned method getSpacePermissionsForUser(Space space, String username) 
> sends a request to Confluence and triggers a ConfluenceException if the 
> request response is different from 200, otherwise it parses the response as a 
> JSONArray to get the permissions.
> The problem is that even with a 200 request response code, the content of the 
> response may be a simple JSONObject that explains an internal Confluence 
> error. It happens frequently when the username is not allowed to get 
> permissions on certain spaces :
> {code:java}
> {
> "id": null,
> "error": {
> "code": 500,
> "message": "The application was unable to serve your request: 
> com.atlassian.confluence.rpc.NotPermittedException: Only space administrators 
> can view permissions for other users in the space.",
> "data": "com.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the 
> space.\ncom.atlassian.confluence.rpc.NotPermittedException: Only space 
> administrators can view permissions for other users in the space.\n\tat 
> com.atlassian.confluence.rpc.soap.services.SpacesSoapService.getPermissions(SpacesSoapService.java:188)\n\tat
>  
> com.atlassian.confluence.rpc.soap.XhtmlSoapService.getPermissionsForUser(XhtmlSoapService.java:226)\n\tat
>  sun.reflect.GeneratedMethodAccessor3812.invoke(Unknown Source)\n\tat 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
>  java.lang.reflect.Method.invoke(Method.java:498)\n\tat"
> }
> }
> {code}
> As a username will almost never have the rights to get permissions on ALL 
> spaces, it is problematic because the response parsing process triggers an 
> Exception and so the Confluence Authority response is 
> UNREACHABLEAUTHORITY:confluence, instead of delivering permissions for 
> allowed spaces.





[jira] [Resolved] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-31 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1615.
-
   Resolution: Won't Fix
Fix Version/s: ManifoldCF 2.14

> Bad Error Message when IDCOLUMN's value is actually null
> 
>
> Key: CONNECTORS-1615
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: JDBC connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Michael Cizmar
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> In the edge case that the id column is null, the error message doesn't 
> suggest that.
>  
> {code:java}
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
>   if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return 
> $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. 
> \"$(IDCOLUMN)\", or, for MySQL, select \"by label\" in your repository 
> connection.");
>   String idValue = JDBCConnection.readAsString(o);
> {code}
>  
>  
> Also, should it entirely fail if one $IDCOLUMN record is null?





[jira] [Assigned] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-31 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1615:
---

Assignee: Karl Wright

> Bad Error Message when IDCOLUMN's value is actually null
> 
>
> Key: CONNECTORS-1615
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: JDBC connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Michael Cizmar
>Assignee: Karl Wright
>Priority: Major
>
> In the edge case that the id column is null, the error message doesn't 
> suggest that.
>  
> {code:java}
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
>   if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return 
> $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. 
> \"$(IDCOLUMN)\", or, for MySQL, select \"by label\" in your repository 
> connection.");
>   String idValue = JDBCConnection.readAsString(o);
> {code}
>  
>  
> Also, should it entirely fail if one $IDCOLUMN record is null?





[jira] [Commented] (CONNECTORS-1615) Bad Error Message when IDCOLUMN's value is actually null

2019-07-31 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16897066#comment-16897066
 ] 

Karl Wright commented on CONNECTORS-1615:
-

Your query should never return rows that have null IDs.  ManifoldCF cannot work 
with documents that don't have an ID.

As for the error message being incorrect -- if you accept that null IDs are not 
allowed, then the error message is perfectly fine.  There is no good way, 
common across all the JDBC drivers we support, to distinguish a null value from 
a missing column.
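
The ambiguity described above can be pictured with a plain map standing in for a result row. This is only an illustration under stated assumptions: the `Map` is a hypothetical stand-in, not ManifoldCF's actual row class, and the column name "id" is made up. A bare lookup returns null both when the column is absent and when it holds a SQL NULL, so a single null check cannot say which case occurred.

```java
import java.util.HashMap;
import java.util.Map;

class NullIdAmbiguity {
  // Hypothetical stand-in for a result row; the real connector row type is not modeled.
  static String describe(Map<String, Object> row, String column) {
    Object o = row.get(column);
    if (o != null) {
      return "value";
    }
    // get() returned null: either the column is absent or it is present with a null value.
    // Only a separate presence check can tell the two apart.
    return row.containsKey(column) ? "null value" : "missing column";
  }

  public static void main(String[] args) {
    Map<String, Object> withoutColumn = new HashMap<>();
    Map<String, Object> withNullId = new HashMap<>();
    withNullId.put("id", null);
    System.out.println(describe(withoutColumn, "id"));
    System.out.println(describe(withNullId, "id"));
  }
}
```

Per the comment above, an equivalent presence check is not available uniformly across the supported JDBC drivers, so the practical remedy is for the seed query itself to exclude rows whose ID is null.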


> Bad Error Message when IDCOLUMN's value is actually null
> 
>
> Key: CONNECTORS-1615
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1615
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: JDBC connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Michael Cizmar
>Priority: Major
>
> In the edge case that the id column is null, the error message doesn't 
> suggest that.
>  
> {code:java}
> Object o = row.getValue(JDBCConstants.idReturnColumnName);
> if (o == null)
>   throw new ManifoldCFException("Bad seed query; doesn't return $(IDCOLUMN) column. Try using quotes around $(IDCOLUMN) variable, e.g. \"$(IDCOLUMN)\", or, for MySQL, select \"by label\" in your repository connection.");
> String idValue = JDBCConnection.readAsString(o);
> {code}
>  
>  
> Also, should it entirely fail if one $IDCOLUMN record is null?





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-19 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888581#comment-16888581
 ] 

Karl Wright commented on CONNECTORS-1566:
-

The TLSClientParameters programmatic way of setting things is still mysterious. 
 I see some examples online here:

https://www.programcreek.com/java-api-examples/?api=org.apache.cxf.configuration.jsse.TLSClientParameters

This looks promising because TLSClientParameters does have a way of setting an 
SSLSocketFactory, so if we can get this to be used for the construction of the 
service, we are over the hump.  But somehow we have to get hold of an 
org.apache.cxf.endpoint.Client object in order to be able to apply 
TLSClientParameters to it, and that, too, seems to require an 
already-constructed service in order to work:

https://www.programcreek.com/java-api-examples/?api=org.apache.cxf.endpoint.Client

So this way is blocked too.

The only way this could really work in a non-static way is via a "Feature".  
The service constructors that CXF generates have options for including a list 
of Feature objects.  I do not know what Features are implemented by the JDK's 
transport but maybe somebody knows?




> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-19 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888567#comment-16888567
 ] 

Karl Wright commented on CONNECTORS-1566:
-

It sounds like the standard solution is to enable non-SSL access to the wsdls 
alone in the Livelink environment, from what I can read.  UGH.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Comment Edited] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-19 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888557#comment-16888557
 ] 

Karl Wright edited comment on CONNECTORS-1566 at 7/19/19 6:22 AM:
--

So, here's where things stand.
(1) The checked in code builds a connector jar that properly includes a 
META-INF section that has a jax-ws-catalog.xml and all the wsdls, but they 
don't seem to get picked up.
(2) For standard HTTP access, the code that is checked in nevertheless works, 
because the wsdls are simply accessed from the server with the ?wsdl addition 
to the URL as a fallback.  This is done automatically by the JDK class 
com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.
(3) The user I've been developing for can ONLY use this with SSL.  I've checked 
in code which sets the BindingProvider 
"com.sun.xml.internal.ws.transport.https.client.SSLSocketFactory" property to 
establish the right SSLSocketFactory.  Unfortunately, everything I've read 
indicates that this does not work with server-fetched wsdls, because the code 
that sets the property happens AFTER the constructor for the service.  See code 
snippet below.

{code}
try {
  this.authService = new Authentication_Service(new URL(authenticationServiceURL));
  this.documentManagementService = new DocumentManagement_Service(new URL(documentManagementServiceURL));
  this.contentServiceService = new ContentService_Service(new URL(contentServiceServiceURL));
  this.memberServiceService = new MemberService_Service(new URL(memberServiceServiceURL));
  this.searchServiceService = new SearchService_Service(new URL(searchServiceServiceURL));
} catch (javax.xml.ws.WebServiceException e) {
  throw new ManifoldCFException("Error initializing web services: "+e.getMessage(), e);
} catch (MalformedURLException e) {
  throw new ManifoldCFException("Malformed URL: "+e.getMessage(), e);
}
// Initialize authclient etc.
this.authClientHandle = authService.getBasicHttpBindingAuthentication();
this.documentManagementHandle = documentManagementService.getBasicHttpBindingDocumentManagement();
this.contentServiceHandle = contentServiceService.getBasicHttpBindingContentService();
this.memberServiceHandle = memberServiceService.getBasicHttpBindingMemberService();
this.searchServiceHandle = searchServiceService.getBasicHttpBindingSearchService();

// Set up endpoints
((BindingProvider)authClientHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, authenticationServiceURL);
((BindingProvider)documentManagementHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, documentManagementServiceURL);
((BindingProvider)contentServiceHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, contentServiceServiceURL);
((BindingProvider)memberServiceHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, memberServiceServiceURL);
((BindingProvider)searchServiceHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, searchServiceServiceURL);

// Set SSLSocketFactory's
if (sslSocketFactory != null) {
  ((BindingProvider)authClientHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
  ((BindingProvider)documentManagementHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
  ((BindingProvider)contentServiceHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
  ((BindingProvider)memberServiceHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
  ((BindingProvider)searchServiceHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
}
{code}

So now we're apparently *forced* to figure out how to make runtime access to 
the wsdls we ship work, because the fallback won't work for SSL.

In the absence of somebody who understands this stuff well enough to advise us, 
we have two choices.  The first choice is to use a different supported transport 
than the standard one.  I recall reading that CXF's "async" transport uses 
commons/httpclient, which I know a lot more about and would have a much better 
chance of configuring for our needs.  The second choice is to find out how 
people make SSL work with the ?wsdl fallback; there must be a way, no?

See: 
https://stackoverflow.com/questions/11001102/how-to-programmatically-set-the-sslcontext-of-a-jax-ws-client
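
One possible way around the ordering problem in (3), offered only as a sketch and not as what the connector does: fetch the wsdl ourselves with the desired SSLSocketFactory before any service object exists, store it in a temporary file, and hand that file's URL to the generated service constructor. The class and method names below (`WsdlPrefetch`, `fetchToLocalFile`) are hypothetical; only standard JDK calls are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLSocketFactory;

class WsdlPrefetch {
  /**
   * Downloads the document at wsdlUrl into a temporary file and returns a file: URL for it.
   * If the connection is HTTPS and a factory is supplied, that factory is used for this
   * one connection only; no JVM-wide default is changed.
   */
  static URL fetchToLocalFile(URL wsdlUrl, SSLSocketFactory factory) throws IOException {
    URLConnection conn = wsdlUrl.openConnection();
    if (factory != null && conn instanceof HttpsURLConnection) {
      ((HttpsURLConnection) conn).setSSLSocketFactory(factory);
    }
    Path local = Files.createTempFile("service", ".wsdl");
    local.toFile().deleteOnExit();
    try (InputStream in = conn.getInputStream()) {
      Files.copy(in, local, StandardCopyOption.REPLACE_EXISTING);
    }
    return local.toUri().toURL();
  }
}
```

The returned URL could then be passed to a constructor such as `new Authentication_Service(...)` in place of the server URL, with the endpoint address still set through the BindingProvider as in the snippet above. One known limitation of this sketch: a wsdl that imports other wsdls or schemas by relative location would not resolve from the temporary file.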





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-19 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888561#comment-16888561
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Here's the CXF documentation on the async transport: 
https://cxf.apache.org/docs/asynchronous-client-http-transport.html

One thing here does concern me: enabling this by default is a "Bus-level" 
property, which means it would affect every other connector that uses CXF too.  
That's probably a non-starter in the ManifoldCF environment. HOWEVER, the doc 
here does allude to a different way of setting SSLSocketFactory.  I'm going to 
look into that.

> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: OTCS_IIS.png, OTCS_Tomcat.png, chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-19 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16888557#comment-16888557
 ] 

Karl Wright commented on CONNECTORS-1566:
-

So, here's where things stand.
(1) The checked in code builds a connector jar that properly includes a 
META-INF section that has a jax-ws-catalog.xml and all the wsdls, but they 
don't seem to get picked up.
(2) For standard HTTP access, the code that is checked in nevertheless works, 
because the wsdls are simply accessed from the server with the ?wsdl addition 
to the URL as a fallback.  This is done automatically by the JDK class 
com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.
(3) The user I've been developing for can ONLY use this with SSL.  I've checked 
in code which sets the BindingProvider 
"com.sun.xml.internal.ws.transport.https.client.SSLSocketFactory" property to 
establish the right SSLSocketFactory.  Unfortunately, everything I've read 
indicates that this does not work with server-fetched wsdls, because the code 
that sets the property happens AFTER the constructor for the service.  See code 
snippet below.

{code}
try {
  this.authService = new Authentication_Service(new URL(authenticationServiceURL));
  this.documentManagementService = new DocumentManagement_Service(new URL(documentManagementServiceURL));
  this.contentServiceService = new ContentService_Service(new URL(contentServiceServiceURL));
  this.memberServiceService = new MemberService_Service(new URL(memberServiceServiceURL));
  this.searchServiceService = new SearchService_Service(new URL(searchServiceServiceURL));
} catch (javax.xml.ws.WebServiceException e) {
  throw new ManifoldCFException("Error initializing web services: "+e.getMessage(), e);
} catch (MalformedURLException e) {
  throw new ManifoldCFException("Malformed URL: "+e.getMessage(), e);
}
// Initialize authclient etc.
this.authClientHandle = authService.getBasicHttpBindingAuthentication();
this.documentManagementHandle = documentManagementService.getBasicHttpBindingDocumentManagement();
this.contentServiceHandle = contentServiceService.getBasicHttpBindingContentService();
this.memberServiceHandle = memberServiceService.getBasicHttpBindingMemberService();
this.searchServiceHandle = searchServiceService.getBasicHttpBindingSearchService();

// Set up endpoints
((BindingProvider)authClientHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, authenticationServiceURL);
((BindingProvider)documentManagementHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, documentManagementServiceURL);
((BindingProvider)contentServiceHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, contentServiceServiceURL);
((BindingProvider)memberServiceHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, memberServiceServiceURL);
((BindingProvider)searchServiceHandle).getRequestContext().put(BindingProvider.ENDPOINT_ADDRESS_PROPERTY, searchServiceServiceURL);

// Set SSLSocketFactory's
if (sslSocketFactory != null) {
  ((BindingProvider)authClientHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
  ((BindingProvider)documentManagementHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
  ((BindingProvider)contentServiceHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
  ((BindingProvider)memberServiceHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
  ((BindingProvider)searchServiceHandle).getRequestContext().put(sslSocketFactoryProperty, sslSocketFactory);
}
{code}

So now we're apparently *forced* to figure out how to make runtime access to 
the wsdls we ship work, because the fallback won't work for SSL.

In the absence of somebody who understands this stuff well enough to advise us, 
we have two choices.  The first choice is to use a different supported transport 
than the standard one.  I recall reading that CXF's "async" transport uses 
commons/httpclient, which I know a lot more about and would have a much better 
chance of configuring for our needs.  The second choice is to find out how 
people make SSL work with the ?wsdl fallback; there must be a way, no?



> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> 

[jira] [Resolved] (CONNECTORS-1614) UI bug on parameters deletion on Generic Connector

2019-07-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1614.
-
   Resolution: Fixed
Fix Version/s: ManifoldCF 2.14

r1863226

> UI bug on parameters deletion on Generic Connector
> --
>
> Key: CONNECTORS-1614
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1614
> Project: ManifoldCF
>  Issue Type: Bug
>Affects Versions: ManifoldCF 2.13
>Reporter: Olivier Tavard
>Assignee: Karl Wright
>Priority: Minor
> Fix For: ManifoldCF 2.14
>
> Attachments: patch_generic_connector.txt
>
>
> Hi,
> There is a bug in the job UI of the Generic Connector.
> To reproduce:
> - Add a Generic repository connector
> - Add a job related to this connector
> - Click on the parameters tab
> - Add some parameters
> - Click on the Delete button next to the new parameters: it does not work.
> Please find the patch attached.
> Best regards,
> Olivier Tavard





[jira] [Assigned] (CONNECTORS-1614) UI bug on parameters deletion on Generic Connector

2019-07-17 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1614:
---

Assignee: Karl Wright

> UI bug on parameters deletion on Generic Connector
> --
>
> Key: CONNECTORS-1614
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1614
> Project: ManifoldCF
>  Issue Type: Bug
>Affects Versions: ManifoldCF 2.13
>Reporter: Olivier Tavard
>Assignee: Karl Wright
>Priority: Minor
> Attachments: patch_generic_connector.txt
>
>
> Hi,
> There is a bug in the job UI of the Generic Connector.
> To reproduce:
> - Add a Generic repository connector
> - Add a job related to this connector
> - Click on the parameters tab
> - Add some parameters
> - Click on the Delete button next to the new parameters: it does not work.
> Please find the patch attached.
> Best regards,
> Olivier Tavard





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-17 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886919#comment-16886919
 ] 

Karl Wright commented on CONNECTORS-1566:
-

I had a look at the class com/sun/xml/ws/wsdl/parser/RuntimeWSDLParser.java, 
which is where the error I'm seeing is coming from.  This is a truly ancient 
class.  What it does is try a number of different strategies for locating wsdls 
from a given web service.  I'd hoped to find out how it looked for wsdls in 
resources, but I did not find this.

The important thing to note is that, if all else fails, the wsdl locator takes 
the service URI and appends "?wsdl" to it, as a last check.  I bet this is why 
it's working for [~kishorekumar] but not for me.

So what does this mean?

I think we cannot hope to get the proper classloader passed in as an argument 
to this ancient code -- not without forking it ourselves and thus maintaining 
it.  But I think that in the case of the Livelink web services, they do adhere 
to the convention that RuntimeWSDLParser expects.  So it should be possible to 
make this work.  That's what I'm going to try to do.
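
The last-resort convention described above can be sketched as a small helper. This is an illustration of the behavior as described in this comment, not a copy of the RuntimeWSDLParser source; the class and method names are hypothetical, and the assumption that a URI which already has a query string is left untouched is mine.

```java
import java.net.URI;

class WsdlFallback {
  /**
   * Returns the location to try as a last resort: the service URI with "?wsdl" appended.
   * Assumption: a URI that already carries a query string is returned unchanged.
   */
  static URI withWsdlQuery(URI serviceUri) {
    if (serviceUri.getQuery() != null) {
      return serviceUri;
    }
    return URI.create(serviceUri.toString() + "?wsdl");
  }

  public static void main(String[] args) {
    // Hypothetical service address, for illustration only.
    System.out.println(withWsdlQuery(URI.create("http://localhost/cws/Authentication.svc")));
  }
}
```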


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Comment Edited] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-17 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886745#comment-16886745
 ] 

Karl Wright edited comment on CONNECTORS-1566 at 7/17/19 7:32 AM:
--

Ok, updated in svn.  Please check again; I still see the same error here.



was (Author: kwri...@metacarta.com):
Ok, updated in svn.  Trying locally now.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-17 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886745#comment-16886745
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Ok, updated in svn.  Trying locally now.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-17 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886722#comment-16886722
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~kishorekumar], can you verify the current codebase?  If bundling the wsdls 
within the jar cannot be made to work, it's fine, I guess, to have them 
downloaded from the server.  But I'm unsure which version of the code you 
tried, and whether I broke this after you got it working.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Comment Edited] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-17 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886710#comment-16886710
 ] 

Karl Wright edited comment on CONNECTORS-1566 at 7/17/19 6:24 AM:
--

I'm thinking that the resource loader being invoked is going to look for 
resources using the wrong classloader.  We want the class loader that loaded 
the CswsSession class, but unless there's a way to pass a classloader into the 
constructor for the services, it's going to use the classloader of the base jax 
class that does the resource lookup.  And no, there appears to be no way to 
pass a class loader to any of the generated service constructors. :-(
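
The suspected effect can be reproduced with JDK classes alone. In this sketch a URLClassLoader stands in for the loader that loaded the connector jar, and the system class loader stands in for the loader of the base JAX-WS classes; both roles are assumptions made for illustration, and `LoaderVisibility` is a hypothetical name. A resource placed under the "connector" loader is visible through that loader but not through its parent.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

class LoaderVisibility {
  /** Builds a child loader over a directory, standing in for the connector's class loader. */
  static URLClassLoader loaderFor(Path root) throws IOException {
    // For an existing directory, toUri() ends with '/', which URLClassLoader requires.
    return new URLClassLoader(new URL[] { root.toUri().toURL() },
        ClassLoader.getSystemClassLoader());
  }

  /** Returns true if the named resource can be found through the given loader. */
  static boolean visible(ClassLoader loader, String resource) {
    return loader.getResource(resource) != null;
  }

  public static void main(String[] args) throws IOException {
    Path root = Files.createTempDirectory("connector");
    Path wsdl = root.resolve("wsdls").resolve("Authentication.wsdl");
    Files.createDirectories(wsdl.getParent());
    Files.write(wsdl, "<definitions/>".getBytes());
    try (URLClassLoader connectorLoader = loaderFor(root)) {
      // Child loader sees the bundled resource; the parent loader does not search the child.
      System.out.println(visible(connectorLoader, "wsdls/Authentication.wsdl"));
      System.out.println(visible(ClassLoader.getSystemClassLoader(), "wsdls/Authentication.wsdl"));
    }
  }
}
```

Lookups delegate from child to parent and never the other way, so code that resolves resources through a parent loader cannot see what only the child loader holds.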





was (Author: kwri...@metacarta.com):
I'm thinking that the resource loader being invoked is going to look for 
resources using the wrong classloader.  We want the class loader that loaded 
the CswsSession class, but unless there's a way to pass a classloader into the 
constructor for the services, it's going to use the classloader of the base jax 
class that does the resource lookup.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-17 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886710#comment-16886710
 ] 

Karl Wright commented on CONNECTORS-1566:
-

I'm thinking that the resource loader being invoked is going to look for 
resources using the wrong classloader.  We want the class loader that loaded 
the CswsSession class, but unless there's a way to pass a classloader into the 
constructor for the services, it's going to use the classloader of the base jax 
class that does the resource lookup.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-16 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886576#comment-16886576
 ] 

Karl Wright commented on CONNECTORS-1566:
-

I am still getting the same error.
I think that the difference might be that you're pointing at an actual LiveLink 
instance.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-16 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886541#comment-16886541
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Thanks!  Implemented.  Will debug to the next step now.


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: chrome_cgfC00ujx7.png
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-16 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886456#comment-16886456
 ] 

Karl Wright commented on CONNECTORS-1566:
-

I tried implementing this:

https://stackoverflow.com/questions/4163586/jax-ws-client-whats-the-correct-path-to-access-the-local-wsdl

It still does not work but I do get a different exception: "2 counts of 
InaccessibleWSDLException."

> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-16 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886308#comment-16886308
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Currently, the UI fails in the following way, as soon as the configuration is 
saved:

{code}
Caused by: javax.xml.ws.WebServiceException: Failed to access the WSDL at: file:wsdls/Authentication.wsdl. It failed with: 
wsdls\Authentication.wsdl (The system cannot find the path specified).
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.tryWithMex(RuntimeWSDLParser.java:250)
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.parse(RuntimeWSDLParser.java:231)
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.parse(RuntimeWSDLParser.java:194)
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.parse(RuntimeWSDLParser.java:163)
    at com.sun.xml.internal.ws.client.WSServiceDelegate.parseWSDL(WSServiceDelegate.java:348)
    at com.sun.xml.internal.ws.client.WSServiceDelegate.<init>(WSServiceDelegate.java:306)
    at com.sun.xml.internal.ws.client.WSServiceDelegate.<init>(WSServiceDelegate.java:215)
    at com.sun.xml.internal.ws.client.WSServiceDelegate.<init>(WSServiceDelegate.java:196)
    at com.sun.xml.internal.ws.client.WSServiceDelegate.<init>(WSServiceDelegate.java:192)
    at com.sun.xml.internal.ws.spi.ProviderImpl.createServiceDelegate(ProviderImpl.java:104)
    at javax.xml.ws.Service.<init>(Service.java:77)
{code}

The URL it is using for the wsdl is the one that it was built with, which is a 
local file.  The wsdls *either* have to be specified as all local (safest, but 
it is hard to ensure they would be at the same relative path in all 
circumstances and for all ManifoldCF processes) *or* they need to always be 
relative to the service URL for each particular service.

I currently do not know (and cannot easily find) how to convince Apache CXF to 
either do without the WSDL entirely at runtime (which would be ideal), or use a 
local copy during the build process and a *different* URL-based one at runtime. 
 If you can find the answer to this, please let me know what it is.
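
Why the built-in location is fragile can be seen from how a relative file: URL such as the one in the stack trace is interpreted. The sketch below uses only JDK classes; `RelativeWsdlLocation` and its methods are hypothetical names. It shows that the path inside "file:wsdls/Authentication.wsdl" is relative, so where it points depends on the working directory of whichever process happens to open it.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

class RelativeWsdlLocation {
  /** Returns true if the path carried by the file: URL is relative. */
  static boolean isRelative(String fileUrl) throws MalformedURLException {
    return !Paths.get(new URL(fileUrl).getPath()).isAbsolute();
  }

  /** Shows where the URL's path lands when resolved against the current working directory. */
  static Path resolveAgainstWorkingDir(String fileUrl) throws MalformedURLException {
    return Paths.get(new URL(fileUrl).getPath()).toAbsolutePath();
  }

  public static void main(String[] args) throws MalformedURLException {
    String location = "file:wsdls/Authentication.wsdl";
    System.out.println(isRelative(location));
    // The result differs per process, depending on the directory it was started from.
    System.out.println(resolveAgainstWorkingDir(location));
  }
}
```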


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-16 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886309#comment-16886309
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~rafaharo], do you know offhand a solution to this?


> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-15 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16885287#comment-16885287
 ] 

Karl Wright commented on CONNECTORS-1566:
-

Hi [~schuch], CONNECTORS-1117 is indeed a duplicate, but I've kept it around 
because there's an API doc attached.

As for what version, I don't actually have access to an instance except through 
a third party, who are likely going to try this code out shortly.



> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Commented] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-07-14 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884718#comment-16884718
 ] 

Karl Wright commented on CONNECTORS-1566:
-

[~schuch], the connector is almost completed, save for the following:

- support for SSL;
- control over HTTP connection pooling so at most one HTTP connection is kept 
around per RepositoryConnection;
- Metadata indexing; I have a snippet of code but it isn't a perfect match for 
what I need.



> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
>  Issue Type: Task
>  Components: LiveLink connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (CONNECTORS-1613) Array Index Out of Bounds exception, JDBC connector with attributes

2019-06-24 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1613.
-
Resolution: Fixed

r1862000


> Array Index Out of Bounds exception, JDBC connector with attributes
> ---
>
> Key: CONNECTORS-1613
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1613
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: JDBC connector
>Affects Versions: ManifoldCF 2.13
>Reporter: Karl Wright
>Assignee: Karl Wright
>Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> The following method can throw AIOOBE:
> {code}
>   protected void applyMultiAttributeValues(final RepositoryDocument rd, final Map<String, Set<String>> values)
>     throws ManifoldCFException
>   {
>     if (values == null)
>     {
>       return;
>     }
>     for (final String attributeName : values.keySet())
>     {
>       final Set<String> attributes = values.get(attributeName);
>       final String[] attributeValues = new String[values.size()];
>       int i = 0;
>       for (final String attributeValue : attributes)
>       {
>         attributeValues[i++] = attributeValue;
>       }
>       rd.addField(attributeName, attributeValues);
>     }
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CONNECTORS-1613) Array Index Out of Bounds exception, JDBC connector with attributes

2019-06-24 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-1613:
---

 Summary: Array Index Out of Bounds exception, JDBC connector with 
attributes
 Key: CONNECTORS-1613
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1613
 Project: ManifoldCF
  Issue Type: Bug
  Components: JDBC connector
Affects Versions: ManifoldCF 2.13
Reporter: Karl Wright
Assignee: Karl Wright
 Fix For: ManifoldCF 2.14


The following method can throw AIOOBE:

{code}
  protected void applyMultiAttributeValues(final RepositoryDocument rd, final Map<String, Set<String>> values)
    throws ManifoldCFException
  {
    if (values == null)
    {
      return;
    }

    for (final String attributeName : values.keySet())
    {
      final Set<String> attributes = values.get(attributeName);
      final String[] attributeValues = new String[values.size()];
      int i = 0;
      for (final String attributeValue : attributes)
      {
        attributeValues[i++] = attributeValue;
      }
      rd.addField(attributeName, attributeValues);
    }
  }
{code}
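
The likely cause is that the array is sized with `values.size()` (the number of attribute names) rather than `attributes.size()` (the number of values for the current attribute), so any attribute with more values than there are attribute names overflows the array. A minimal corrected sketch, assuming that diagnosis, follows. `FakeDocument` is a hypothetical stand-in for `RepositoryDocument`, and this is not necessarily the change that was committed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class MultiAttributeExample {
    /** Hypothetical stand-in for RepositoryDocument; only the call used here. */
    static class FakeDocument {
        final Map<String, String[]> fields = new HashMap<>();

        void addField(String name, String[] fieldValues) {
            fields.put(name, fieldValues);
        }
    }

    static void applyMultiAttributeValues(FakeDocument rd, Map<String, Set<String>> values) {
        if (values == null) {
            return;
        }
        for (Map.Entry<String, Set<String>> entry : values.entrySet()) {
            Set<String> attributes = entry.getValue();
            // Size the array from this attribute's own value set,
            // not from the number of attribute names in the map.
            String[] attributeValues = attributes.toArray(new String[0]);
            rd.addField(entry.getKey(), attributeValues);
        }
    }
}
```

With one attribute name carrying three values, the original sizing allocates an array of length 1 and fails on the second value; the version above allocates exactly three slots.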




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CONNECTORS-1519) CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y

2019-06-24 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1519.
-
Resolution: Fixed

r1861998

Thank you, [~glaenen]!


> CLIENTPROTOCOLEXCEPTION   is thrown with 2.10 -> ES 6.x.y
> ---
>
> Key: CONNECTORS-1519
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1519
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Elastic Search connector
>Affects Versions: ManifoldCF 2.10
>Reporter: Steph van Schalkwyk
>Assignee: Steph van Schalkwyk
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: ElasticSearchConnection.diff
>
>
> Investigating CLIENTPROTOCOLEXCEPTION when using 2.10 with ES 6.x.y
> More information to follow.
> Fails when using security, i.e. http://user:password@elasticsearch:9200.
> Remedy:
>  # Disable x-pack security.
>  # Use http://elasticsearch:9200.
>  
> |07-27-2018 17:53:19.010|Indexation (ES)|file:/var/manifoldcf/corpus/14.html|CLIENTPROTOCOLEXCEPTION|38053|23|



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1519) CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y

2019-06-24 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871104#comment-16871104
 ] 

Karl Wright commented on CONNECTORS-1519:
-

Thank you for the patch!
I will integrate shortly.


> CLIENTPROTOCOLEXCEPTION   is thrown with 2.10 -> ES 6.x.y
> ---
>
> Key: CONNECTORS-1519
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1519
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Elastic Search connector
>Affects Versions: ManifoldCF 2.10
>Reporter: Steph van Schalkwyk
>Assignee: Steph van Schalkwyk
>Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: ElasticSearchConnection.diff
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1612) Postpone files in SMBException

2019-06-19 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867466#comment-16867466
 ] 

Karl Wright commented on CONNECTORS-1612:
-

I do not want to add yet more configuration to an already extremely complex 
connector.  If the use case you are describing (long, automatic crawls) is 
where this really is seen, then I think we're good.


> Postpone files in SMBException
> --
>
> Key: CONNECTORS-1612
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1612
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: JCIFS connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
> Fix For: ManifoldCF 2.14
>
>
> When crawling with the JCIFS connector, some unexpected errors raise an 
> "SMBException", which is caught by MCF.
> The current behavior is for the job to abort after a few retries.
> Although SMBException is a generic exception class, we think it is 
> worthwhile, before aborting the job, to postpone the problematic files and 
> try the ones already in the pipeline. This way, the job can move on before 
> developers have to study the particular problems. More precisely, the 
> algorithm could look like the following:
> Whenever a job encounters an error that is not clearly identified:
> 1. It immediately retries once; 
> 2. If it succeeds, the crawl moves on as usual; 
> 3. If it fails, the job moves this document to the current end of the 
> processing pipeline, and crawls the remaining documents. It increments the 
> attempt counter for this document to 2.
> 4. When encountering this document again, the job tries again. If it 
> succeeds, the crawl moves on as usual. If it fails, it moves this document to 
> the current end of the processing pipeline, increments the counter by 1, and 
> doubles the delay between two attempts.
> 5. We iterate until the maximum number of attempts for the problematic 
> document has been reached. If it still fails, abort the crawl. With this 
> behavior, a job is still aborted on critical errors, but at least we will 
> have crawled as many non-problematic documents as possible before the 
> failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CONNECTORS-1612) Postpone files in SMBException

2019-06-18 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1612.
-
   Resolution: Fixed
Fix Version/s: ManifoldCF 2.14

r1861582


> Postpone files in SMBException
> --
>
> Key: CONNECTORS-1612
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1612
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: JCIFS connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
> Fix For: ManifoldCF 2.14
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1612) Postpone files in SMBException

2019-06-18 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866714#comment-16866714
 ] 

Karl Wright commented on CONNECTORS-1612:
-

{quote}
3. If it fails, the job moves this document to the current end of the 
processing pipeline, and crawls the remaining documents. It increments the 
attempt counter for this document to 2.
4. When encountering this document again, the job tries again. If it succeeds, 
the crawl moves on as usual. If it fails, it moves this document to the current 
end of the processing pipeline, increments the counter by 1, and doubles the 
delay between two attempts.
{quote}

This logic is impossible to implement with the current architecture, given the 
way documents are queued and processed.  You will have to make do with the 
standard retry backoff mechanism that is already in place in the framework for 
documents that have retryable errors.  These are not put at the "back of the 
queue" but are instead given a specific time at which they are retried, and will 
not be looked at again until that time occurs.  For the SMB exceptions, we can 
make this time be something on the order of six hours or so; that should cover 
any intermittent problems with infrastructure.
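
To illustrate the difference between "back of the queue" and a scheduled retry time, here is a small in-memory sketch. It is not ManifoldCF's implementation; the class `RetrySchedule`, its method names, and the use of a priority queue are assumptions made only to show the idea that a failed document is ignored until its retry time arrives.

```java
import java.util.Comparator;
import java.util.Optional;
import java.util.PriorityQueue;

public class RetrySchedule {
    /** A document waiting for its retry time. */
    static final class Pending {
        final String documentId;
        final long retryAtMillis;

        Pending(String documentId, long retryAtMillis) {
            this.documentId = documentId;
            this.retryAtMillis = retryAtMillis;
        }
    }

    // Ordered by retry time, not by insertion order.
    private final PriorityQueue<Pending> pending =
        new PriorityQueue<>(Comparator.comparingLong((Pending p) -> p.retryAtMillis));

    /** Record that the document must not be looked at before now + delay. */
    void scheduleRetry(String documentId, long nowMillis, long delayMillis) {
        pending.add(new Pending(documentId, nowMillis + delayMillis));
    }

    /** Return a document whose retry time has arrived, if there is one. */
    Optional<String> nextDue(long nowMillis) {
        Pending head = pending.peek();
        if (head == null || head.retryAtMillis > nowMillis) {
            return Optional.empty();
        }
        return Optional.of(pending.poll().documentId);
    }
}
```

With a six-hour delay, a document scheduled at time 0 is not returned by `nextDue` until the clock reaches six hours, regardless of how many other documents are processed in the meantime.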



> Postpone files in SMBException
> --
>
> Key: CONNECTORS-1612
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1612
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: JCIFS connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CONNECTORS-1612) Postpone files in SMBException

2019-06-18 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1612:
---

Assignee: Karl Wright

> Postpone files in SMBException
> --
>
> Key: CONNECTORS-1612
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1612
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: JCIFS connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CONNECTORS-1609) SharePoint connector ignore 503 errors

2019-05-28 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1609.
-
Resolution: Fixed

> SharePoint connector ignore 503 errors
> --
>
> Key: CONNECTORS-1609
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1609
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: SharePoint connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
> Attachments: CONNECTORS-1609.diff, CONNECTORS-1609.diff
>
>
> During a job, it may happen, for reasons related to the SharePoint server 
> configuration, that some resources of a SP site are not available (for 
> instance, if credentials are required to open a resource). In that case, the 
> SP connector gets a 403 or a 503 response code from SharePoint. 
> The problem is that whenever it gets this kind of response code, the job is 
> aborted with an error. 
> Since the response codes are clearly identified (403 and 503), it would be 
> better if, at least for a 503 error, the connector ignored it, continued 
> the job, and logged something in the repo history instead of aborting the job.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1610) handle error 500 in WindowsShare repository connector

2019-05-28 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16849560#comment-16849560
 ] 

Karl Wright commented on CONNECTORS-1610:
-

ManifoldCF retries based on what the connector tells it to do.  It can either 
retry a certain number of times, or for a specific interval, and the connector 
determines which it's going to be.
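
The two kinds of limit described here can be sketched as follows. This is an illustration only: the class `RetryLimit` and its methods are hypothetical names, not ManifoldCF API, and the convention that -1 means "no limit" is an assumption of the sketch.

```java
public class RetryLimit {
    final int maxRetries;       // -1 means "no limit on the number of retries"
    final long failTimeMillis;  // -1 means "no time limit"

    private RetryLimit(int maxRetries, long failTimeMillis) {
        this.maxRetries = maxRetries;
        this.failTimeMillis = failTimeMillis;
    }

    /** Limit chosen by the connector: give up after a fixed number of retries. */
    static RetryLimit byCount(int maxRetries) {
        return new RetryLimit(maxRetries, -1L);
    }

    /** Limit chosen by the connector: give up once a point in time is reached. */
    static RetryLimit untilTime(long failTimeMillis) {
        return new RetryLimit(-1, failTimeMillis);
    }

    /** True while neither the count limit nor the time limit has been hit. */
    boolean shouldRetry(int retriesSoFar, long nowMillis) {
        if (maxRetries >= 0 && retriesSoFar >= maxRetries) {
            return false;
        }
        if (failTimeMillis >= 0 && nowMillis >= failTimeMillis) {
            return false;
        }
        return true;
    }
}
```

A connector that never sets either limit would, in this sketch, be retried indefinitely, which matches the symptom reported in the issue.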


> handle error 500 in WindowsShare repository connector 
> --
>
> Key: CONNECTORS-1610
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1610
> Project: ManifoldCF
>  Issue Type: Bug
>Reporter: Olivier Tavard
>Priority: Major
>
> Hi,
>  
> I have a question regarding error 500 in the WindowsShare repository 
> connector.
>  
> I recently noticed that I have a problem with a particular file that 
> contains metadata with non-ASCII characters. My pipeline in MCF basically 
> consists of the embedded Tika, and the data is sent to Solr.
>  
> For this particular file (it is an AutoCAD file, btw), an error 500 occurs 
> in Solr. This happens after the embedded Tika in MCF has extracted 
> content+metadata and sent it to Solr.
>  
> The job does not stop and the file is sent many times to Solr which responds 
> with the same error again and again :
> The detail of the error in Solr is :
> null:org.apache.commons.fileupload.FileUploadException: Header section has 
> more than 10240 bytes (maybe it is not properly terminated)
>   at 
> org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:362)
>   at 
> org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:115)
>  
>  
> In the MCF simple history, I can see that the same file is retried endlessly 
> (see below) and the job is still running.
> Is there a way to change this behavior so that the file is skipped in this 
> case, or at least so that the job stops after a certain number of retries?
>  
> PS: I sent an email to the dev mailing list twice, but the emails never 
> showed up, which is why I created this issue directly.
>  
> Thanks,
>  
> Olivier
>  
>  
> {code:java}
> 27/05/19 14:24:48 document ingest (DatafariSolrNoTika) 
> file:/x.x.x.x/testfiler0...
> .dwg
> 500 34 369 Error from server at http://127.0.0.1:8983/solr/FileShare: 
> Expected mime type application/octet-stream but got application/json. { 
> "error":{ "msg":"Header section has more than 10240 bytes (maybe it is not 
> properly terminated)", 
> "trace":"org.apache.commons.fileupload.FileUploadException: Header section 
> has more than 10240 bytes (maybe it is not properly terminated)\n\tat 
> org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:362)\n\tat
>  
> org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:115)\n\tat
>  
> org.apache.solr.servlet.SolrRequestParsers$MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:602)\n\tat
>  
> org.apache.solr.servlet.SolrRequestParsers$StandardRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:784)\n\tat
>  org.apache.solr.servlet.So     27/05/19 14:24:47 extract [Tika] 
> file:/x.x.x.x/testfiler0...
> .dwg
> OK 34 74 
> 27/05/19 14:23:45 document ingest (DatafariSolrNoTika) 
> file:/x.x.x.x/testfiler0...
> .dwg
> 500 34 393 Error from server at http://127.0.0.1:8983/solr/FileShare: 
> Expected mime type application/octet-stream but got application/json. { 
> "error":{ "msg":"Header section has more than 10240 bytes (maybe it is not 
> properly terminated)", 
> "trace":"org.apache.commons.fileupload.FileUploadException: Header section 
> has more than 10240 bytes (maybe it is not properly terminated)\n\tat 
> org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:362)\n\tat
>  
> org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:115)\n\tat
>  
> org.apache.solr.servlet.SolrRequestParsers$MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:602)\n\tat
>  org.apache.solr.servlet{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CONNECTORS-1610) handle error 500 in WindowsShare repository connector

2019-05-28 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1610.
-
Resolution: Won't Fix

> handle error 500 in WindowsShare repository connector 
> --
>
> Key: CONNECTORS-1610
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1610
> Project: ManifoldCF
>  Issue Type: Bug
>Reporter: Olivier Tavard
>Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CONNECTORS-1609) SharePoint connector ignore 503 errors

2019-05-27 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1609:

Attachment: CONNECTORS-1609.diff

> SharePoint connector ignore 503 errors
> --
>
> Key: CONNECTORS-1609
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1609
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: SharePoint connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
> Attachments: CONNECTORS-1609.diff, CONNECTORS-1609.diff
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CONNECTORS-1609) SharePoint connector ignore 503 errors

2019-05-27 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1609:

Attachment: CONNECTORS-1609.diff

> SharePoint connector ignore 503 errors
> --
>
> Key: CONNECTORS-1609
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1609
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: SharePoint connector
>Affects Versions: ManifoldCF 2.12
>Reporter: Julien Massiera
>Assignee: Karl Wright
>Priority: Critical
> Attachments: CONNECTORS-1609.diff
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1609) SharePoint connector ignore 503 errors

2019-05-27 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848719#comment-16848719
 ] 

Karl Wright commented on CONNECTORS-1609:
-

Patch attached.  Please try it and tell me whether it addresses your problem.




[jira] [Comment Edited] (CONNECTORS-1609) SharePoint connector ignore 503 errors

2019-05-27 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848713#comment-16848713
 ] 

Karl Wright edited comment on CONNECTORS-1609 at 5/27/19 8:20 AM:
--

As discussed in email, 403 actually means something already: "permission 
denied".  As such it is returned when the credentials provided are incorrect.  
It would be a bad idea to make the connector just keep going when it receives 
this error code, in my opinion.
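
To illustrate the distinction, here is a minimal Java sketch of a status-code policy that treats 403 as fatal and 503 as skippable. It is only an illustration under stated assumptions: the ResponsePolicy class, its Action enum, and actionFor are hypothetical names, not ManifoldCF API, and this is not the actual connector change. Aborting on every other non-2xx code is also an assumption, chosen to keep the behavior the ticket describes.

```java
import java.net.HttpURLConnection;

/** Hypothetical sketch: maps an HTTP status code to what a crawl job could do next. */
final class ResponsePolicy {
  enum Action { PROCESS, SKIP_AND_LOG, ABORT_JOB }

  static Action actionFor(int statusCode) {
    if (statusCode == HttpURLConnection.HTTP_FORBIDDEN) {
      // 403: "permission denied", typically bad credentials, so keep failing loudly.
      return Action.ABORT_JOB;
    }
    if (statusCode == HttpURLConnection.HTTP_UNAVAILABLE) {
      // 503: resource unavailable; skip this document and record it in the history.
      return Action.SKIP_AND_LOG;
    }
    if (statusCode >= 200 && statusCode < 300) {
      return Action.PROCESS;
    }
    // Assumption: any other code keeps the abort behavior described in the ticket.
    return Action.ABORT_JOB;
  }

  public static void main(String[] args) {
    for (int code : new int[] {200, 403, 503}) {
      System.out.println(code + " -> " + actionFor(code));
    }
  }
}
```

Keeping the decision in one small function makes it easy to review which codes are tolerated, and 403 stays fatal so that incorrect credentials cannot silently produce an empty crawl.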



was (Author: kwri...@metacarta.com):
As discussed in email, 403 actually means something already: "permission 
denied".  As such it is returned when the credentials provided are incorrect.  
It would be a bad idea to make the connector just keep going when it receives 
these error codes, in my opinion.




[jira] [Commented] (CONNECTORS-1609) SharePoint connector ignore 503 errors

2019-05-27 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848713#comment-16848713
 ] 

Karl Wright commented on CONNECTORS-1609:
-

As discussed in email, 403 actually means something already: "permission 
denied".  As such it is returned when the credentials provided are incorrect.  
It would be a bad idea to make the connector just keep going when it receives 
these error codes, in my opinion.




[jira] [Assigned] (CONNECTORS-1609) SharePoint connector ignore 503 errors

2019-05-27 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright reassigned CONNECTORS-1609:
---

Assignee: Karl Wright



[jira] [Resolved] (CONNECTORS-1607) SharePoint ADFS cannot connect to the sharepoint connecter

2019-05-23 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1607.
-
Resolution: Won't Fix

Kerberos will not be supported by the SharePoint connector anytime soon.


> SharePoint ADFS cannot connect to the sharepoint connecter
> --
>
> Key: CONNECTORS-1607
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1607
> Project: ManifoldCF
> Issue Type: Bug
> Components: SharePoint 2013 MCPermissions extension, SharePoint connector
> Environment: SharePoint 2013
> Reporter: NEHAL BHANDARI
> Priority: Major
>
> I am trying to connect to a web application which has ADFS enabled. I have 
> created the repository and given the user name and password in the format 
> (domain\username), but I am getting a 403 error. It works for Windows claims but 
> doesn't work for ADFS. Does the SharePoint connector support this? If yes, how 
> can I make it work for my ADFS-enabled web application?





[jira] [Commented] (CONNECTORS-1607) SharePoint ADFS cannot connect to the sharepoint connecter

2019-05-23 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847078#comment-16847078
 ] 

Karl Wright commented on CONNECTORS-1607:
-

This is not a bug, so I will be closing it as such.  Very probably ADFS is 
based on Kerberos, in that it needs ticket creation as a separate step to 
enable SOAP access to SharePoint.  The connector does not support Kerberos 
access in this way; you need to use NTLM or NTLM + SSL.




[jira] [Resolved] (CONNECTORS-1593) Memory issue on org.apache.fontbox.ttf.GlyphSubstitutionTable.readLangSysTable(GlyphSubstitutionTable.java:147)

2019-05-23 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1593.
-
Resolution: Not A Problem

This wasn't a ManifoldCF problem, but rather a corrupt document being constructed by 
the source repository.


> Memory issue on 
> org.apache.fontbox.ttf.GlyphSubstitutionTable.readLangSysTable(GlyphSubstitutionTable.java:147)
> ---
>
> Key: CONNECTORS-1593
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1593
> Project: ManifoldCF
> Issue Type: Bug
> Components: Tika extractor
> Affects Versions: ManifoldCF 2.12
> Reporter: Donald Van den Driessche
> Assignee: Karl Wright
> Priority: Major
> Attachments: image-2019-03-22-08-57-53-887.png
>
>
> I have created an Issue with fontbox too: 
>  
> When using the internal Tika extractor in a Manifold Job on certain occasions 
> I get an Out of Memory Error.
> {code:java}
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: agents process ran out of 
> memory - shutting down
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: java.lang.OutOfMemoryError: 
> Java heap space
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.fontbox.ttf.GlyphSubstitutionTable.readLangSysTable(GlyphSubstitutionTable.java:147)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.fontbox.ttf.GlyphSubstitutionTable.readScriptTable(GlyphSubstitutionTable.java:129)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.fontbox.ttf.GlyphSubstitutionTable.readScriptList(GlyphSubstitutionTable.java:98)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.fontbox.ttf.GlyphSubstitutionTable.read(GlyphSubstitutionTable.java:78)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.fontbox.ttf.TrueTypeFont.readTable(TrueTypeFont.java:349)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:173)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:150)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:106)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.<init>(PDTrueTypeFont.java:199)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:75)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.pdmodel.PDResources.getFont(PDResources.java:146)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.contentstream.operator.text.SetFontAndSize.process(SetFontAndSize.java:60)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:848)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:503)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:477)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.text.LegacyPDFStreamEngine.processPage(LegacyPDFStreamEngine.java:139)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:391)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.tika.parser.pdf.PDF2XHTML.processPage(PDF2XHTML.java:147)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:319)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:117)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:172)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:74)
> Mar 16 14:20:06 manifold01 manifoldcf[15747]: at 
> 

[jira] [Resolved] (CONNECTORS-1606) Issue related to job run & throttling behaviour

2019-05-22 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1606.
-
Resolution: Not A Problem

Please post this question to the user group rather than creating a ticket.  
This is basically a question, not a bug.


> Issue related to job run & throttling behaviour
> ---
>
> Key: CONNECTORS-1606
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1606
> Project: ManifoldCF
> Issue Type: Task
> Components: Documentum connector
> Affects Versions: ManifoldCF 2.9.1
> Reporter: Shashank Saurabh
> Priority: Major
> Fix For: ManifoldCF 2.9.1
>
>
> Hi Team,
> I have questions regarding crawling content with the Documentum connector. 
> Please help with the issue described below.
> 1) I'm running multiple jobs in parallel, each configured with a different 
> repository connection, and the crawl of documents through these jobs is 
> slowing down to a great extent. I also tried increasing the throttling to 
> 500, but the issue still persists.
> a) Please explain how throttling behaves when more than one connector, 
> configured in different jobs, is running at the same time.





[jira] [Commented] (CONNECTORS-1519) CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y

2019-05-13 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838479#comment-16838479
 ] 

Karl Wright commented on CONNECTORS-1519:
-

[~goovaertsr], as I've said many times before, Elastic Search changes so 
quickly that it is a full-time job keeping up with it.  Our ES guy is Steph, 
which is why I tagged him, but because all of us are volunteers he may no 
longer have the option of helping out here.

What I would suggest is that if you folks have the resources to contribute 
patches to the ES connector, I can make sure they are properly integrated.  
Beyond that, I also have little time.  The connector is quite straightforward, 
and the first step would obviously be to analyze what it is doing that ES 
doesn't like.  Using a tool like curl would allow you to discover that.
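
One hypothesis that could be checked this way, offered as an assumption and not a diagnosis, is whether credentials embedded in the URL (http://user:password@elasticsearch:9200) ever reach ES as an HTTP Basic Authorization header. The Java sketch below derives the header value and the credential-free URL that one could pass to curl or another client for comparison. EsAuthSketch and its methods are hypothetical helper names, not part of the ES connector.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical helper for comparing URL-embedded credentials with an explicit header. */
final class EsAuthSketch {

  /** Returns the Basic Authorization header value for the URL's user info, or null if none. */
  static String basicAuthHeader(URI uri) {
    String userInfo = uri.getUserInfo(); // e.g. "user:password"
    if (userInfo == null) {
      return null;
    }
    byte[] raw = userInfo.getBytes(StandardCharsets.UTF_8);
    return "Basic " + Base64.getEncoder().encodeToString(raw);
  }

  /** Returns the same URI with the user info removed. */
  static URI withoutUserInfo(URI uri) {
    try {
      return new URI(uri.getScheme(), null, uri.getHost(), uri.getPort(),
          uri.getPath(), uri.getQuery(), uri.getFragment());
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    URI uri = URI.create("http://user:password@elasticsearch:9200");
    System.out.println("Authorization: " + basicAuthHeader(uri));
    System.out.println(withoutUserInfo(uri));
  }
}
```

Sending the printed header against the credential-free URL, and then repeating the request without the header, would show whether ES rejects the request itself or only the way the credentials are delivered.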




> CLIENTPROTOCOLEXCEPTION   is thrown with 2.10 -> ES 6.x.y
> ---
>
> Key: CONNECTORS-1519
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1519
> Project: ManifoldCF
> Issue Type: Bug
> Components: Elastic Search connector
> Affects Versions: ManifoldCF 2.10
> Reporter: Steph van Schalkwyk
> Assignee: Steph van Schalkwyk
> Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> Investigating CLIENTPROTOCOLEXCEPTION when using 2.10 with ES 6.x.y.
> More information to follow.
> Fails when using security, i.e. http://user:password@elasticsearch:9200.
> Remedy:
>  # Disable x-pack security.
>  # Use http://elasticsearch:9200.
>  
>  
> |07-27-2018 17:53:19.010|Indexation (ES)|file:/var/manifoldcf/corpus/14.html|CLIENTPROTOCOLEXCEPTION|38053|23|





[jira] [Resolved] (CONNECTORS-1605) Update HTML Extractor connector

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1605.
-
   Resolution: Fixed
Fix Version/s: ManifoldCF 2.14

> Update HTML Extractor connector
> ---
>
> Key: CONNECTORS-1605
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1605
> Project: ManifoldCF
> Issue Type: Improvement
> Affects Versions: ManifoldCF 2.13
> Reporter: Olivier Tavard
> Assignee: Karl Wright
> Priority: Minor
> Fix For: ManifoldCF 2.14
>
> Attachments: fix_englobing_tag_selection.txt, global_patch.txt, 
> html_extractor_transformation_connector.txt, 
> patch_HTML_extractor_connector_05_06_19.txt, 
> patch_html_extractor_08_14_18.txt, patch_html_extractor_fix_logs_08_10_18.txt
>
>
> Hi,
> I developed a transformation connector based on Jsoup. The goal of this code 
> is simply to choose an encompassing tag in an HTML document for text 
> extraction. Inside this tag, the connector allows you to remove subparts 
> that you do not want: for example, all the tags corresponding to declared 
> types or to specific attribute tag names.
> The code is under the Apache V2 licence and is attached.
> It needs some work, including code refactoring, renaming classes, and unit 
> tests, which I will be able to do if you are interested in the code.
> The documentation is here:
> https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/237240321/HTML+Extractor+Transformation+connector
>  
> It does not use any libraries other than the ones already present in the MCF 
> project. It is based on the Jsoup library in the lib folder.
> Best regards,
> Olivier





[jira] [Resolved] (CONNECTORS-1574) Performance tuning of manifold

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1574.
-
Resolution: Fixed

No response from user; closing.


> Performance tuning of manifold
> --
>
> Key: CONNECTORS-1574
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1574
> Project: ManifoldCF
> Issue Type: Bug
> Components: File system connector, JCIFS connector, Solr 6.x component
> Affects Versions: ManifoldCF 2.5
> Environment: Apache manifold installed in Linux machine
> Linux version 3.10.0-327.el7.ppc64le
> Red Hat Enterprise Linux Server release 7.2 (Maipo)
> Reporter: balaji
> Assignee: Karl Wright
> Priority: Critical
> Labels: performance
>
> My team is using *Apache ManifoldCF 2.5 with SOLR Cloud* for indexing 
> data. We currently have 450-500 jobs which need to run simultaneously. 
> We need to index JSON data, and we are using the *file system* connector type 
> with *postgres* as the backend database. 
> We are facing several issues, such as:
> 1. Scheduling works for some jobs and doesn't work for others. 
> 2. Some jobs complete, while others hang and never complete.
> 3. With one job, 6 documents were previously indexed in 15 minutes, but 
> now even a directory path with 5 documents takes 20 minutes or sometimes 
> doesn't complete.
> 4. The "list all jobs" or "status and job management" page sometimes doesn't 
> load. Looking at pg_stat_activity, we observe that 2 queries are in a waiting 
> state, which prevents the page from loading. If we kill those 
> queries or restart manifold, the issue is resolved and the page loads 
> properly.
> Queries getting stuck:
> 1. SELECT ID,FAILTIME, FAILCOUNT, SEEDINGVERSION, STATUS FROM JOBS WHERE 
> (STATUS=$1 OR STATUS=$2) FOR UPDATE
> 2. UPDATE JOBS SET ERRORTEXT=NULL, ENDTIME=NULL, WINDOWEND=NULL, STATUS=$1 
> WHERE ID=$2
> Note: We have deployed manifold on *Linux*. Our major requirement is 
> scheduling jobs to run every 15 minutes.
> Please help us fine-tune manifold so that it runs smoothly and acts as a 
> robust system.





[jira] [Updated] (CONNECTORS-1566) Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1566:

Fix Version/s: (was: ManifoldCF 2.13)
   ManifoldCF 2.14

> Develop CSWS connector as a replacement for deprecated LiveLink LAPI connector
> --
>
> Key: CONNECTORS-1566
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1566
> Project: ManifoldCF
> Issue Type: Task
> Components: LiveLink connector
> Affects Versions: ManifoldCF 2.12
> Reporter: Karl Wright
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> LAPI is being deprecated.  We need to develop a replacement for it using the 
> ContentServer Web Services API.





[jira] [Updated] (CONNECTORS-1591) RTF comment parsing problem

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1591:

Fix Version/s: (was: ManifoldCF 2.13)
   ManifoldCF 2.14

> RTF comment parsing problem
> ---
>
> Key: CONNECTORS-1591
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1591
> Project: ManifoldCF
> Issue Type: Bug
> Reporter: Zoltan Farago
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.14
>
> Attachments: comment.rtf, result.txt
>
>
> We have a problem with Manifold/Tika. When a comment is parsed from an RTF 
> file, the result has no separator. See attachments.





[jira] [Updated] (CONNECTORS-1508) Add support for French Language

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1508:

Fix Version/s: (was: ManifoldCF 2.13)
   ManifoldCF 2.14

> Add support for French Language
> ---
>
> Key: CONNECTORS-1508
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1508
> Project: ManifoldCF
> Issue Type: Improvement
> Components: Documentation
> Affects Versions: ManifoldCF 2.10
> Reporter: Cedric Ulmer
> Assignee: Karl Wright
> Priority: Minor
> Fix For: ManifoldCF 2.14
>
> Attachments: cedricmanifold_fr.zip
>
>
> Some users may need a French version of the resource bundle. I attached a 
> preliminary translation that France Labs made some time ago (probably around 
> summer 2016), but that we halted due to lack of time (and priority). It is 
> probably almost complete, but some quality checking needs to be done. Note 
> also that I forgot to check the version when I did the translations, so 
> anyone interested would need to check any modifications that may have 
> occurred between this version and the current MCF version.





[jira] [Updated] (CONNECTORS-1519) CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1519:

Fix Version/s: (was: ManifoldCF 2.13)
   ManifoldCF 2.14



[jira] [Updated] (CONNECTORS-1521) Documentum Connector users ManifoldCF's local time in queries constraints against the Documentum server without reference to time zones

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1521:

Fix Version/s: (was: ManifoldCF 2.13)
   ManifoldCF 2.14

> Documentum Connector users ManifoldCF's local time in queries constraints 
> against the Documentum server without reference to time zones
> ---
>
> Key: CONNECTORS-1521
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1521
> Project: ManifoldCF
> Issue Type: Bug
> Components: Documentum connector
> Affects Versions: ManifoldCF 2.10
> Reporter: James Thomas
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.14
>
>
> I find that the time/date constraints in queries to the Documentum server are 
> based on the "raw" local time of the ManifoldCF server but appear to take no 
> account of the time zones of the two servers.
> This can lead to recently modified files not being transferred to the output 
> repository when you would naturally expect them to be. I'd like the times to 
> be aligned, perhaps by including time zone in the query. In particular, is 
> there a way to use UTC perhaps?
> Here's an example ...
>  * create a folder in Documentum
>  * set up a job to point at the folder and output to the file system
>  * put two documents into a folder in Documentum
>  * Select them, right click and export as CSV (to show the timestamps):
> {noformat}
> 1.png,48489.0,Portable Network Graphics,8/7/2018 9:04 AM,
> 2.png,28620.0,Portable Network Graphics,8/7/2018 9:04 AM,,{noformat}
> Check the local time on the ManifoldCF server machine. Observe that it's 
> reporting consistent time with the DM server:
> {noformat}
> [james@manifold]$ date
> Tue Aug  7 09:07:25 BST 2018{noformat}
> Start the job and look for the query to Documentum in the manifoldcf.log file 
> (line break added for readability):
> {noformat}
> DEBUG 2018-08-07T08:07:47.297Z (Startup thread) - DCTM: About to execute 
> query= (select for READ distinct i_chronicle_id from dm_document where 
> r_modify_date >= date('01/01/1970 00:00:00','mm/dd/yyyy hh:mi:ss') and
> r_modify_date<=date('08/07/2018 08:07:34','mm/dd/yyyy hh:mi:ss') 
> AND (i_is_deleted=TRUE Or (i_is_deleted=FALSE AND a_full_text=TRUE AND 
> r_content_size>0)) AND ( Folder('/Administrator/james', DESCEND) ))
> ^C{noformat}
> Notice that the latest date asked for is *before* the modification date of 
> the files added to DM. (And is an hour out, see footnote.)
>   
>  See whether anything has been output by the File System connector. It hasn't:
> {noformat}
> [james@manifold]$ ls /bigdisc/source/PDFs/timezones/
> [james@manifold]$
> {noformat}
> Now:
>  * change the timezone on the ManifoldCF server machine
>  * restart the ManifoldCF server and the Documentum processes
>  * reseed the job
> Check the local time on the ManifoldCF server machine; it has changed:
> {noformat}
> [james@manifold]$ date
> Tue Aug  7 10:10:29 CEST 2018{noformat}
> Start the job again and notice that the query has changed by an hour, plus 
> the few minutes it took to change the date etc (and is still an hour out, see 
> footnote):
> {noformat}
> r_modify_date<=date('08/07/2018 09:11:02','mm/dd/yyyy hh:mi:ss') 
> {noformat}
> Observe that the range of dates now covers the timestamps on the DM data, and 
> also that some data has now been transferred by the File System connector:
> {noformat}
> [james@manifold]$ ls 
> /bigdisc/source/PDFs/timezones/http/mfserver\:8080/da/component/
> drl?versionLabel=CURRENT=09018000e515
> drl?versionLabel=CURRENT=09018000e516
> {noformat}
>  
>  
> [Footnote] It appears that something is trying to take account of Daylight 
> Saving Time too.
> If I set the server date to a time outside of DST, the query is aligned with 
> the current time:
> {noformat}
> [i2e@i2ehost manifold]$ date
>  Mon Oct 29 00:01:13 CET 2018
> r_modify_date<=date('10/29/2018 00:01:39','mm/dd/yyyy hh:mi:ss') 
> {noformat}
> But if I set the time inside DST, the time is an hour before:
> {noformat}
> [i2e@i2ehost manifold]$ date
>  Sat Oct 27 00:00:06 CEST 2018
> r_modify_date<=date('10/26/2018 23:00:26','mm/dd/yyyy hh:mi:ss') 
> {noformat}
> This is perhaps a Java issue rather than a logic issue in the connector? See 
> e.g. [https://stackoverflow.com/questions/6392/java-time-zone-is-messed-up]
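
The one-hour offsets reported above are what one would expect when a single instant is rendered as wall-clock text in different time zones. The following Java sketch shows that effect with java.time. It is not the Documentum connector's code: QueryTimeSketch, its format method, and the Java date pattern are assumptions made for illustration, and the zone names were chosen only because they match the BST and CEST outputs quoted in the description.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical sketch: one instant, rendered as query text in several zones. */
final class QueryTimeSketch {
  // Assumed pattern, shaped like the date strings in the logged query.
  private static final DateTimeFormatter QUERY_STYLE =
      DateTimeFormatter.ofPattern("MM/dd/yyyy HH:mm:ss");

  /** Formats the instant as wall-clock time in the given zone. */
  static String format(Instant instant, ZoneId zone) {
    return QUERY_STYLE.withZone(zone).format(instant);
  }

  public static void main(String[] args) {
    // The upper bound from the first logged query, read as a UTC instant.
    Instant upperBound = Instant.parse("2018-08-07T08:07:34Z");
    System.out.println(format(upperBound, ZoneOffset.UTC));              // 08/07/2018 08:07:34
    System.out.println(format(upperBound, ZoneId.of("Europe/London")));  // 08/07/2018 09:07:34
    System.out.println(format(upperBound, ZoneId.of("Europe/Paris")));   // 08/07/2018 10:07:34
  }
}
```

If the text sent in the query is produced in one zone while the Documentum server interprets it in another, the upper bound can fall before the modification time of recently changed documents, which would match the behaviour described in the ticket. Formatting with an explicit zone on both sides removes that ambiguity.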





[jira] [Updated] (CONNECTORS-1605) Update HTML Extractor connector

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1605:

Affects Version/s: (was: ManifoldCF 2.9.1)
   ManifoldCF 2.13

> Update HTML Extractor connector
> ---
>
> Key: CONNECTORS-1605
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1605
> Project: ManifoldCF
> Issue Type: Improvement
> Affects Versions: ManifoldCF 2.13
> Reporter: Olivier Tavard
> Assignee: Karl Wright
> Priority: Minor
> Fix For: ManifoldCF 2.10
>
> Attachments: fix_englobing_tag_selection.txt, global_patch.txt, 
> html_extractor_transformation_connector.txt, 
> patch_HTML_extractor_connector_05_06_19.txt, 
> patch_html_extractor_08_14_18.txt, patch_html_extractor_fix_logs_08_10_18.txt
>
>


[jira] [Updated] (CONNECTORS-1605) Update HTML Extractor connector

2019-05-10 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright updated CONNECTORS-1605:

Fix Version/s: (was: ManifoldCF 2.10)

> Update HTML Extractor connector
> ---
>
> Key: CONNECTORS-1605
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1605
> Project: ManifoldCF
>  Issue Type: Improvement
>Affects Versions: ManifoldCF 2.13
>Reporter: Olivier Tavard
>Assignee: Karl Wright
>Priority: Minor
> Attachments: fix_englobing_tag_selection.txt, global_patch.txt, 
> html_extractor_transformation_connector.txt, 
> patch_HTML_extractor_connector_05_06_19.txt, 
> patch_html_extractor_08_14_18.txt, patch_html_extractor_fix_logs_08_10_18.txt
>
>
> Hi,
> I developed a transformation connector based on Jsoup. The goal of this code 
> is simply to choose an encompassing tag in an HTML document for text 
> extraction. Inside this tag, the connector lets you remove subparts 
> that you do not want: all the tags corresponding to declared types or 
> specific attribute tag names, for example.
> The code is under the Apache License 2.0 and is attached.
> It needs some work, including code refactoring, renaming classes, and unit 
> tests, which I will be able to do if you are interested in the code.
> The documentation is here:
> [https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/237240321/HTML+Extractor+Transformation+connector]
>  
> It does not use any libraries other than the ones already present in the MCF 
> project. It is based on the Jsoup library in the lib folder.
> Best regards,
> Olivier





[jira] [Created] (CONNECTORS-1605) Update HTML Extractor connector

2019-05-10 Thread Karl Wright (JIRA)
Karl Wright created CONNECTORS-1605:
---

 Summary: Update HTML Extractor connector
 Key: CONNECTORS-1605
 URL: https://issues.apache.org/jira/browse/CONNECTORS-1605
 Project: ManifoldCF
  Issue Type: Improvement
Affects Versions: ManifoldCF 2.9.1
Reporter: Olivier Tavard
Assignee: Karl Wright
 Fix For: ManifoldCF 2.10
 Attachments: fix_englobing_tag_selection.txt, global_patch.txt, 
html_extractor_transformation_connector.txt, 
patch_HTML_extractor_connector_05_06_19.txt, patch_html_extractor_08_14_18.txt, 
patch_html_extractor_fix_logs_08_10_18.txt

Hi,

I developed a transformation connector based on Jsoup. The goal of this code is 
simply to choose an encompassing tag in an HTML document for text extraction. 
Inside this tag, the connector lets you remove subparts that you do not want: 
all the tags corresponding to declared types or specific attribute tag names, 
for example.
The code is under the Apache License 2.0 and is attached.
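
To illustrate the idea only, here is a minimal sketch of "keep one enclosing tag, drop unwanted sub-parts". It is not the attached connector code and it does not use Jsoup: it relies on Python's standard html.parser module instead. The names EnclosingTagExtractor and extract_text are hypothetical, and the sketch assumes well-formed HTML in which every non-void element is explicitly closed. It selects by tag name only, not by attribute.

```python
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class EnclosingTagExtractor(HTMLParser):
    """Collect text inside the first `keep_tag` element, skipping `drop_tags`.

    Hypothetical helper for illustration; assumes well-formed HTML.
    """

    def __init__(self, keep_tag, drop_tags=()):
        super().__init__(convert_charrefs=True)
        self.keep_tag = keep_tag
        self.drop_tags = set(drop_tags)
        self.keep_depth = 0  # > 0 while inside the enclosing element
        self.drop_depth = 0  # > 0 while inside an excluded sub-part
        self.done = False    # True once the first enclosing element closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.keep_depth == 0:
            # Still looking for the enclosing element.
            if tag == self.keep_tag:
                self.keep_depth = 1
            return
        self.keep_depth += 1
        # Anything nested inside an excluded element is excluded as well.
        if self.drop_depth > 0 or tag in self.drop_tags:
            self.drop_depth += 1

    def handle_endtag(self, tag):
        if self.done or tag in VOID_TAGS or self.keep_depth == 0:
            return
        if self.drop_depth > 0:
            self.drop_depth -= 1
        self.keep_depth -= 1
        if self.keep_depth == 0:
            self.done = True

    def handle_data(self, data):
        # Keep text only inside the enclosing element and outside dropped parts.
        if self.keep_depth > 0 and self.drop_depth == 0:
            self.chunks.append(data)


def extract_text(html, keep_tag, drop_tags=()):
    """Return whitespace-normalized text of the first `keep_tag` element."""
    parser = EnclosingTagExtractor(keep_tag, drop_tags)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


if __name__ == "__main__":
    sample = ("<html><body><nav>Menu</nav><div><h1>Title</h1>"
              "<script>var x=1;</script><p>Body <b>text</b>.</p>"
              "<footer>Foot</footer></div><p>Outside</p></body></html>")
    print(extract_text(sample, "div", ["script", "footer"]))
```

In this sketch, text outside the chosen enclosing element and text inside any excluded element is discarded; everything else is concatenated and whitespace-normalized.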

It needs some work, including code refactoring, renaming classes, and unit 
tests, which I will be able to do if you are interested in the code.
The documentation is here:

[https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/237240321/HTML+Extractor+Transformation+connector]

 

It does not use any libraries other than the ones already present in the MCF 
project. It is based on the Jsoup library in the lib folder.

Best regards,

Olivier





[jira] [Commented] (CONNECTORS-1500) HTML Extractor transformation connector contribution

2019-05-06 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16833691#comment-16833691
 ] 

Karl Wright commented on CONNECTORS-1500:
-

Hi [~olivierfl], please open a new ticket for further changes to shipping 
connectors.  Thanks!



> HTML Extractor transformation connector contribution
> 
>
> Key: CONNECTORS-1500
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1500
> Project: ManifoldCF
>  Issue Type: Improvement
>Affects Versions: ManifoldCF 2.9.1
>Reporter: Olivier Tavard
>Assignee: Karl Wright
>Priority: Minor
> Fix For: ManifoldCF 2.10
>
> Attachments: fix_englobing_tag_selection.txt, global_patch.txt, 
> html_extractor_transformation_connector.txt, 
> patch_HTML_extractor_connector_05_06_19.txt, 
> patch_html_extractor_08_14_18.txt, patch_html_extractor_fix_logs_08_10_18.txt
>
>
> Hi,
> I developed a transformation connector based on Jsoup. The goal of this code 
> is simply to choose an encompassing tag in an HTML document for text 
> extraction. Inside this tag, the connector lets you remove subparts 
> that you do not want: all the tags corresponding to declared types or 
> specific attribute tag names, for example.
> The code is under the Apache License 2.0 and is attached.
> It needs some work, including code refactoring, renaming classes, and unit 
> tests, which I will be able to do if you are interested in the code.
> The documentation is here:
> [https://datafari.atlassian.net/wiki/spaces/DATAFARI/pages/237240321/HTML+Extractor+Transformation+connector]
>  
> It does not use any libraries other than the ones already present in the MCF 
> project. It is based on the Jsoup library in the lib folder.
> Best regards,
> Olivier





[jira] [Commented] (CONNECTORS-1519) CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y

2019-05-01 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830948#comment-16830948
 ] 

Karl Wright commented on CONNECTORS-1519:
-

[~svanschalkwyk], are you following this?

> CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y
> ---
>
> Key: CONNECTORS-1519
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1519
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Elastic Search connector
>Affects Versions: ManifoldCF 2.10
>Reporter: Steph van Schalkwyk
>Assignee: Steph van Schalkwyk
>Priority: Major
> Fix For: ManifoldCF 2.13
>
>
> Investigating CLIENTPROTOCOLEXCEPTION when using 2.10 with ES 6.x.y.
> More information to follow.
> Fails when using security, i.e. http://user:password@elasticsearch:9200.
> Remedy:
>  # Disable x-pack security.
>  # Use http://elasticsearch:9200.
>  
>  
> |07-27-2018 17:53:19.010|Indexation (ES)|file:/var/manifoldcf/corpus/14.html|CLIENTPROTOCOLEXCEPTION|38053|23|





[jira] [Resolved] (CONNECTORS-1602) Continuous crawling doesn't recrawl everything

2019-05-01 Thread Karl Wright (JIRA)


 [ 
https://issues.apache.org/jira/browse/CONNECTORS-1602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1602.
-
Resolution: Not A Problem

> Continuous crawling doesn't recrawl everything
> --
>
> Key: CONNECTORS-1602
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1602
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Web connector
>Reporter: Donald Van den Driessche
>Priority: Major
>
> When crawling a website in continuous crawling mode, we saw that not all 
> documents are recrawled.
> The site is quite extensive. We figured out that, after crawling, a 
> document/page gets a recrawl timestamp between the recrawl interval and the 
> max recrawl interval.
> But if these timestamps fall within the first crawl, Manifold starts 
> recrawling those documents and seems to ignore the rest of the website. Also, 
> some documents get recrawled 5 times while others don't get recrawled at all, 
> apparently due to the same issue.
>  
> Is it possible to shed a bit more light on continuous crawling?
> Is it a good system to use for crawling an (extensive) website?





[jira] [Commented] (CONNECTORS-1604) Documentation to run Manifold over SSL

2019-04-30 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830682#comment-16830682
 ] 

Karl Wright commented on CONNECTORS-1604:
-

The standard example script for starting Jetty causes jetty.xml to be pulled in 
for configuration.  SSL configuration can be done via that XML file.  You will 
find it in the example directory.



> Documentation to run Manifold over SSL
> --
>
> Key: CONNECTORS-1604
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1604
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Drai
>Priority: Major
>
> The documentation does not talk about running Manifold over SSL anywhere. I am 
> sure a lot of users who deployed Manifold in production configured SSL. I tried 
> to configure SSL without success. Can someone point me to instructions for 
> running Manifold over SSL, please?
> Thanks





[jira] [Commented] (CONNECTORS-1604) Documentation to run Manifold over SSL

2019-04-30 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830459#comment-16830459
 ] 

Karl Wright commented on CONNECTORS-1604:
-

I presume you mean the UI?

The UI runs under Jetty or Tomcat and uses the standard Jetty or Tomcat 
configuration files.  The documentation for SSL is therefore part of Jetty or 
Tomcat, not part of ManifoldCF.


> Documentation to run Manifold over SSL
> --
>
> Key: CONNECTORS-1604
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1604
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Drai
>Priority: Major
>
> The documentation does not talk about running Manifold over SSL anywhere. I am 
> sure a lot of users who deployed Manifold in production configured SSL. I tried 
> to configure SSL without success. Can someone point me to instructions for 
> running Manifold over SSL, please?
> Thanks





[jira] [Commented] (CONNECTORS-1519) CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y

2019-04-29 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16829143#comment-16829143
 ] 

Karl Wright commented on CONNECTORS-1519:
-

[~st...@remcam.net], are you following this?


> CLIENTPROTOCOLEXCEPTION is thrown with 2.10 -> ES 6.x.y
> ---
>
> Key: CONNECTORS-1519
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1519
> Project: ManifoldCF
>  Issue Type: Bug
>  Components: Elastic Search connector
>Affects Versions: ManifoldCF 2.10
>Reporter: Steph van Schalkwyk
>Assignee: Steph van Schalkwyk
>Priority: Major
> Fix For: ManifoldCF 2.13
>
>
> Investigating CLIENTPROTOCOLEXCEPTION when using 2.10 with ES 6.x.y.
> More information to follow.
> Fails when using security, i.e. http://user:password@elasticsearch:9200.
> Remedy:
>  # Disable x-pack security.
>  # Use http://elasticsearch:9200.
>  
>  
> |07-27-2018 17:53:19.010|Indexation (ES)|file:/var/manifoldcf/corpus/14.html|CLIENTPROTOCOLEXCEPTION|38053|23|




