[ https://issues.apache.org/jira/browse/TIKA-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980481#comment-16980481 ]
Hudson commented on TIKA-2966:
------------------------------

UNSTABLE: Integrated in Jenkins build Tika-trunk #1748 (See [https://builds.apache.org/job/Tika-trunk/1748/])
TIKA-2966 -- add drop threshold configurability to PDFParser (tallison: [https://github.com/apache/tika/commit/cb3c4ba5c00eb633d42ca446a7d5bb8146701605])
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java

> Create a tika-eval SAXHandler
> -----------------------------
>
>                 Key: TIKA-2966
>                 URL: https://issues.apache.org/jira/browse/TIKA-2966
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Major
>
> One of the improvements coming in 1.23 is the decoupling of the text stats
> calculator from the tika-eval app. To make this even easier to use, let's
> add a handler that will calculate the text stats on .endDocument() and record
> those stats in a metadata object.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)