[jira] [Commented] (CONNECTORS-1529) Add "url" output element to ES Output Connector (required when used with the Web Repository Connector)

2018-11-01 Thread Steph van Schalkwyk (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672487#comment-16672487
 ] 

Steph van Schalkwyk commented on CONNECTORS-1529:
-

I added it as an addField in the Web Connector, and allowed for it to be 
renamed in the Elasticsearch Connector. 
So it adds "documentId": "http://localhost:8000/10.pdf" to the metadata 
output by the Web Connector, but as an addField, so it doesn't break anything 
(at least in theory, unless there is a metadata rename with the same field name 
later in the pipeline).
Let me know if this works for you.
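
For readers following along, the caveat above can be illustrated with a minimal Python sketch. It does not use the ManifoldCF API: the dict-of-lists metadata model and the helpers `add_field` and `rename_field` are hypothetical, and the assumption that a rename replaces the target field (rather than merging into it) is mine, not something confirmed in this thread.

```python
from typing import Dict, List

# Assumed model: each metadata field maps to a list of string values.
Metadata = Dict[str, List[str]]


def add_field(metadata: Metadata, name: str, value: str) -> None:
    """Append a value under `name`, leaving every other field untouched."""
    metadata.setdefault(name, []).append(value)


def rename_field(metadata: Metadata, source: str, target: str) -> None:
    """Move the values of `source` to `target`, replacing what `target` held."""
    if source in metadata:
        metadata[target] = metadata.pop(source)


# Hypothetical metadata for one crawled document.
doc: Metadata = {"title": ["Microsoft Word - 48428.doc"]}

# Repository side: add the document URL as an extra field.
add_field(doc, "documentId", "http://localhost:8000/10.pdf")

# Output side: optionally rename the field, for example to "url".
renamed = {k: list(v) for k, v in doc.items()}
rename_field(renamed, "documentId", "url")

# The failure case: a later rename onto the same field name
# replaces the URL that was added earlier.
clobbered = {k: list(v) for k, v in doc.items()}
rename_field(clobbered, "title", "documentId")
```

Because the field is added rather than substituted, existing fields such as `title` are unaffected. Only a later rename that targets `documentId` would lose the URL in this model.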


> Add "url" output element to ES Output Connector (required when used with the 
> Web Repository Connector)
> --
>
> Key: CONNECTORS-1529
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1529
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Affects Versions: ManifoldCF 2.10
>Reporter: Steph van Schalkwyk
>Assignee: Steph van Schalkwyk
>Priority: Major
> Fix For: ManifoldCF 2.12
>
> Attachments: elasticsearch.patch, image-2018-09-06-10-28-45-008.png
>
>
> Add "url" (copy of the _id field) to ES Output.
> ES no longer supports copying from _id (copy-to) in the schema.
> As per 
> !image-2018-09-06-10-28-45-008.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1529) Add "url" output element to ES Output Connector (required when used with the Web Repository Connector)

2018-11-01 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672425#comment-16672425
 ] 

Karl Wright commented on CONNECTORS-1529:
-

As long as it's a new field, seems that backwards compatibility is preserved, 
so I'm OK with it.



[jira] [Commented] (CONNECTORS-1529) Add "url" output element to ES Output Connector (required when used with the Web Repository Connector)

2018-11-01 Thread Steph van Schalkwyk (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16672423#comment-16672423
 ] 

Steph van Schalkwyk commented on CONNECTORS-1529:
-

I have added the "documentId" metatag to the Web Connector:
 * "documentId": "http://localhost:8000/10.pdf"

Will this work for everybody?

Steph

 * "_index": "index_cpt_all",
 * "_type": "catalogline",
 * "_id": "http://localhost:8000/10.pdf",
 * "_version": 1,
 * "_score": 1,
 * "_source": {
 ** "date": "2005-05-05T21:19:55Z",
 ** "pdf:PDFVersion": "1.3",
 ** "pdf:docinfo:title": "Microsoft Word - 48428.doc",
 ** "xmp:CreatorTool": "PScript5.dll Version 5.2",
 ** "Server": "SimpleHTTP/0.6 Python/3.5.2",
 ** "access_permission:modify_annotations": "true",
 ** "access_permission:can_print_degraded": "true",
 ** "dc:creator": "edocslib",
 ** "dcterms:created": "2005-05-05T21:19:55Z",
 ** "Last-Modified": "2005-05-05T21:19:55Z",
 ** "dcterms:modified": "2005-05-05T21:19:55Z",
 ** "dc:format": "application/pdf; version=1.3",
 ** "title": "Microsoft Word - 48428.doc",
 ** "Last-Save-Date": "2005-05-05T21:19:55Z",
 ** "pdf:docinfo:creator_tool": "PScript5.dll Version 5.2",
 ** "access_permission:fill_in_form": "true",
 ** "pdf:docinfo:modified": "2005-05-05T21:19:55Z",
 ** "stream_name": "10.pdf",
 ** "meta:save-date": "2005-05-05T21:19:55Z",
 ** "pdf:encrypted": "false",
 ** "dc:title": "Microsoft Word - 48428.doc",
 ** "modified": "2005-05-05T21:19:55Z",
 ** "Content-Length": "120441",
 ** "Content-Type": "application/pdf",
 ** "stream_size": "120441",
 ** "pdf:docinfo:creator": "edocslib",
 ** "X-Parsed-By": "org.apache.tika.parser.DefaultParser",
 ** "creator": "edocslib",
 ** "meta:author": "edocslib",
 ** "meta:creation-date": "2005-05-05T21:19:55Z",
 ** "created": "Thu May 05 16:19:55 CDT 2005",
 ** "documentId": "http://localhost:8000/10.pdf",
 ** "access_permission:extract_for_accessibility": "true",
 ** "access_permission:assemble_document": "true",
 ** "xmpTPg:NPages": "4",
 ** "Creation-Date": "2005-05-05T21:19:55Z",
 ** "resourceName": "10.pdf",
 ** "access_permission:extract_content": "true",
 ** "access_permission:can_print": "true",
 ** "Content-type": "application/pdf",
 ** "Author": "edocslib",
 ** "producer": "Acrobat Distiller 5.0 (Windows)",
 ** "access_permission:can_modify": "true",
 ** "pdf:docinfo:producer": "Acrobat Distiller 5.0 (Windows)",
 ** "pdf:docinfo:created": "2005-05-05T21:19:55Z",
 ** "indexed": "2018-11-02T00:50:48.053+",
 ** "mime-type": "application/pdf",
 ** "allow_token_document": "__nosecurity__",
 ** "deny_token_document": "__nosecurity__",
 ** "allow_token_share": "__nosecurity__",
 ** "deny_token_share": "__nosecurity__",
 ** "allow_token_parent": "__nosecurity__",
 ** "deny_token_parent": "__nosecurity__",
 ** "content": " Federal Communications Commission DA 05
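
The relationship between `_id` and the new `documentId` field in the hit above can be checked with a short Python sketch. The `hit` dict is a hand-trimmed copy of the response shown, and `document_url` is a hypothetical helper written for illustration; it is not part of any Elasticsearch client.

```python
from typing import Any, Dict, Optional


def document_url(hit: Dict[str, Any], field: str = "documentId") -> Optional[str]:
    """Return the URL stored in `_source[field]`, falling back to `_id`."""
    value = hit.get("_source", {}).get(field)
    if isinstance(value, list):
        # Multi-valued metadata: take the first value, if there is one.
        value = value[0] if value else None
    return value if value is not None else hit.get("_id")


# Trimmed copy of the search hit, keeping only the fields used here.
hit = {
    "_index": "index_cpt_all",
    "_type": "catalogline",
    "_id": "http://localhost:8000/10.pdf",
    "_source": {
        "stream_name": "10.pdf",
        "documentId": "http://localhost:8000/10.pdf",
    },
}

url = document_url(hit)
```

With the field present in `_source`, a consumer no longer needs to rely on `_id` to recover the URL, which is the point of the change.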



[jira] [Commented] (CONNECTORS-1529) Add "url" output element to ES Output Connector (required when used with the Web Repository Connector)

2018-09-06 Thread Steph van Schalkwyk (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606600#comment-16606600
 ] 

Steph van Schalkwyk commented on CONNECTORS-1529:
-

I agree. I can do that in the Web Connector.
The ES change was a stopgap before I realized from where the issue
originated. (Not wishing to end a sentence on a preposition...)



> Add "url" output element to ES Output Connector (required when used with the 
> Web Repository Connector)
> --
>
> Key: CONNECTORS-1529
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1529
> Project: ManifoldCF
>  Issue Type: Improvement
>  Components: Elastic Search connector
>Affects Versions: ManifoldCF 2.10
>Reporter: Steph van Schalkwyk
>Assignee: Steph van Schalkwyk
>Priority: Major
> Fix For: ManifoldCF 2.11
>
> Attachments: elasticsearch.patch, image-2018-09-06-10-28-45-008.png
>
>
> Add "url" (copy of the _id field) to ES Output.
> ES no longer supports copying from _id (copy-to) in the schema.
> As per 
> !image-2018-09-06-10-28-45-008.png!
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1529) Add "url" output element to ES Output Connector (required when used with the Web Repository Connector)

2018-09-06 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606556#comment-16606556
 ] 

Karl Wright commented on CONNECTORS-1529:
-

[~svanschalkwyk], I would opt for including the URI as metadata in the 
appropriate upstream repository connector, as you proposed earlier today.  That 
way it's available to everyone.




[jira] [Commented] (CONNECTORS-1529) Add "url" output element to ES Output Connector (required when used with the Web Repository Connector)

2018-09-06 Thread Steph van Schalkwyk (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606241#comment-16606241
 ] 

Steph van Schalkwyk commented on CONNECTORS-1529:
-

Patch attached.
