[ https://issues.apache.org/jira/browse/CONNECTORS-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16057167#comment-16057167 ]
Karl Wright commented on CONNECTORS-1433:
-----------------------------------------
We have not had any issues with Tika's output format so far, and Tika has
been integrated for several years. Tests on documents such as Excel
spreadsheets, PDFs, text files, and Word documents have until now yielded text
output, not Base64. I don't know what has changed, if anything, but if these
formats now generate Base64, it sounds like a contract change of some kind that
we missed.
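
One way to confirm which format a pipeline stage actually emits is to test a
sample of its output. The following is a minimal Python sketch, not part of
ManifoldCF or Tika; the function names looks_like_base64 and describe_output
are hypothetical. The check is only a heuristic: a short single word such as
"test" is also valid Base64, so it is most useful on longer samples.

```python
import base64
import binascii


def looks_like_base64(payload: str) -> bool:
    """Heuristic: True if the payload strictly decodes as Base64.

    Hypothetical helper for diagnosing extractor output; it is not an
    API of ManifoldCF or Tika.
    """
    # Base64 output is often wrapped across lines, so drop line breaks only.
    # Spaces are kept so that ordinary prose fails the strict check below.
    compact = payload.replace("\r", "").replace("\n", "")
    if not compact or len(compact) % 4 != 0:
        return False
    try:
        # validate=True rejects any character outside the Base64 alphabet.
        base64.b64decode(compact, validate=True)
    except (binascii.Error, ValueError):
        return False
    return True


def describe_output(payload: str) -> str:
    """Label a sample of extractor output as 'base64' or 'text'."""
    return "base64" if looks_like_base64(payload) else "text"


if __name__ == "__main__":
    encoded = base64.b64encode("Hello, Tika!".encode("utf-8")).decode("ascii")
    print(describe_output(encoded))
    print(describe_output("Hello, Tika! This is plain text."))
```

Running this against a sample of the extracted content should indicate
whether the text arrives encoded or plain before anyone goes looking for a
contract change.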
> Add CLI options to pipeline modules, e.g. allow Tika to export TEXT, not
> BASE64
> -------------------------------------------------------------------------------
>
> Key: CONNECTORS-1433
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1433
> Project: ManifoldCF
> Issue Type: Wish
> Components: Tika extractor
> Reporter: Steph van Schalkwyk
>
> Would love to have Tika spout TEXT, not BASE64.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)