[ https://issues.apache.org/jira/browse/CONNECTORS-954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036739#comment-14036739 ]
Karl Wright commented on CONNECTORS-954:
----------------------------------------
Added the field mapping tab: r1603687
Tomorrow I will revamp the Amazon connector to remove the Tika transformer within it.
Still unanswered: (a) whether there's a good way to stream the extracted
content to Amazon, and (b) how to keep removing newline characters, as the
connector currently does. Ideally, we'd construct the JSON on the fly, but I
don't know how realistic that would be. Also, quoting may need to be addressed.
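A rough sketch of what building the JSON on the fly could look like, for
discussion only. It assumes the extracted content arrives as a java.io.Reader
and that the field value can be written directly to a java.io.Writer; the
class and method names (JsonStreamSketch, writeJsonString) are made up and are
not part of the connector. Newlines are replaced with spaces and quotes,
backslashes, and control characters are escaped as each buffer is copied, so
the full content is never held in memory.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

// Hypothetical helper, not ManifoldCF code: streams text into a JSON string literal.
public class JsonStreamSketch {

    /** Copies everything from 'in' to 'out' as a single quoted JSON string. */
    public static void writeJsonString(Reader in, Writer out) throws IOException {
        out.write('"');
        char[] buf = new char[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                char c = buf[i];
                switch (c) {
                    case '\n':
                    case '\r':
                        // Newline removal: each line-break character becomes a space.
                        out.write(' ');
                        break;
                    case '"':
                        out.write("\\\"");
                        break;
                    case '\\':
                        out.write("\\\\");
                        break;
                    default:
                        if (c < 0x20) {
                            // Remaining control characters get a numeric JSON escape.
                            out.write(String.format("\\u%04x", (int) c));
                        } else {
                            out.write(c);
                        }
                }
            }
        }
        out.write('"');
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        writeJsonString(new StringReader("line one\nsaid \"hi\""), out);
        System.out.println(out); // prints "line one said \"hi\""
    }
}
```

Whether this helps in practice depends on whether the HTTP request to Amazon
can be sent without knowing the content length up front, which is still open.
Note that a CRLF pair becomes two spaces in this sketch.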
> Amazon Cloud Search connector's use of Tika should be revisited after
> pipelines are added
> -----------------------------------------------------------------------------------------
>
> Key: CONNECTORS-954
> URL: https://issues.apache.org/jira/browse/CONNECTORS-954
> Project: ManifoldCF
> Issue Type: Task
> Components: Amazon CloudSearch output connector
> Affects Versions: ManifoldCF 1.7
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 1.7
>
>
> Amazon Cloud Search connector uses Tika to extract content from binaries.
> When the pipeline support in CONNECTORS-946 is committed to trunk, we should
> do two things:
> (a) Create a Transformation Connection that extracts binary data into
> metadata, and
> (b) Remove the Tika dependency from the Amazon connector