[ 
https://issues.apache.org/jira/browse/NIFI-9647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17495843#comment-17495843
 ] 

ASF subversion and git services commented on NIFI-9647:
-------------------------------------------------------

Commit 4141ed29ecfcd6329e3af0b7f7290a9d6aa7d8c7 in nifi's branch 
refs/heads/main from Mike Thomsen
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=4141ed2 ]

NIFI-9647 Added ExtractDocumentText Processor

- Based on https://github.com/tspannhw/nifi-extracttext-processor

This closes #5732

Signed-off-by: David Handermann <[email protected]>


> Add support for full text extraction of binary documents supported by Apache 
> Tika
> ---------------------------------------------------------------------------------
>
>                 Key: NIFI-9647
>                 URL: https://issues.apache.org/jira/browse/NIFI-9647
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Mike Thomsen
>            Assignee: Mike Thomsen
>            Priority: Major
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> This improvement wraps Apache Tika using an updated version of Tim 
> Spann's ExtractTextProcessor. I contacted Tim via LinkedIn, and he 
> agreed to make it part of the NiFi code base going forward. In addition, this 
> ticket adds the include-media profile, which makes it easy to add 
> the NiFi media bundle to a custom build of NiFi.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
