[ 
https://issues.apache.org/jira/browse/TIKA-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980481#comment-16980481
 ] 

Hudson commented on TIKA-2966:
------------------------------

UNSTABLE: Integrated in Jenkins build Tika-trunk #1748 (See 
[https://builds.apache.org/job/Tika-trunk/1748/])
TIKA-2966 -- add drop threshold configurability to PDFParser (tallison: 
[https://github.com/apache/tika/commit/cb3c4ba5c00eb633d42ca446a7d5bb8146701605])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java


> Create a tika-eval SAXHandler
> -----------------------------
>
>                 Key: TIKA-2966
>                 URL: https://issues.apache.org/jira/browse/TIKA-2966
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Major
>
> One of the improvements coming in 1.23 is the decoupling of the text stats 
> calculator from the tika-eval app.  To make this even easier to use, let's 
> add a handler that will calculate the text stats on .endDocument() and record 
> those stats in a metadata object.
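
For illustration only, the handler idea described above might be sketched as follows. This is not the handler added for TIKA-2966: the class name `TextStatsHandler`, the `sketch:` metadata keys, and the use of a plain `Map` as a stand-in for a Tika metadata object are all assumptions made for this example, and whitespace-delimited token counting stands in for whatever statistics tika-eval actually computes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: accumulates character data during the SAX parse and
// computes simple text stats once, when endDocument() fires.
class TextStatsHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    // Assumption: a Map stands in for a Tika metadata object.
    private final Map<String, String> metadata;

    TextStatsHandler(Map<String, String> metadata) {
        this.metadata = metadata;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endDocument() {
        // Count whitespace-delimited tokens; an empty document has zero.
        String trimmed = text.toString().trim();
        int tokens = trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        // The key names below are made up for this sketch.
        metadata.put("sketch:num-tokens", Integer.toString(tokens));
        metadata.put("sketch:num-chars", Integer.toString(text.length()));
    }
}
```

Because the stats are written only in `endDocument()`, a caller would read the map after the parse has finished. Any SAX-based parser that accepts an `org.xml.sax.ContentHandler` could drive a handler of this shape.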



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
