[ https://issues.apache.org/jira/browse/NIFI-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495843#comment-17495843 ]
ASF subversion and git services commented on NIFI-9647:
-------------------------------------------------------
Commit 4141ed29ecfcd6329e3af0b7f7290a9d6aa7d8c7 in nifi's branch
refs/heads/main from Mike Thomsen
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=4141ed2 ]
NIFI-9647 Added ExtractDocumentText Processor
- Based on https://github.com/tspannhw/nifi-extracttext-processor
This closes #5732
Signed-off-by: David Handermann <[email protected]>
> Add support for full text extraction of binary documents supported by Apache Tika
> ---------------------------------------------------------------------------------
>
> Key: NIFI-9647
> URL: https://issues.apache.org/jira/browse/NIFI-9647
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Mike Thomsen
> Assignee: Mike Thomsen
> Priority: Major
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> This improvement wraps Apache Tika using an updated version of Tim Spann's
> ExtractTextProcessor. I contacted Tim via LinkedIn, and he agreed to make it
> part of the NiFi code base going forward. In addition, this ticket adds the
> include-media profile, which makes it easy to add the NiFi media bundle to a
> custom build of NiFi.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)