[ https://issues.apache.org/jira/browse/TIKA-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17906475#comment-17906475 ]
Hudson commented on TIKA-2342:
------------------------------
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #579 (See
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk17/579/])
TIKA-2342: support PDFBox IgnoreContentStreamSpaceGlyphs; add test; remove
dead code line (tilman:
[https://github.com/apache/tika/commit/c4885fae7111e748b9a7cfeee86cd78ebea7f600])
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/PDFParser.java
* (add) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/test/resources/test-documents/testContentStreamSpaceGlyphs.pdf
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/main/java/org/apache/tika/parser/pdf/PDFParserConfig.java
* (edit) tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-pdf-module/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java
> Broken words
> ------------
>
> Key: TIKA-2342
> URL: https://issues.apache.org/jira/browse/TIKA-2342
> Project: Tika
> Issue Type: Bug
> Components: parser
> Affects Versions: 1.14
> Environment: Tika app and Tika server
> Reporter: Nino Skopac
> Assignee: Tilman Hausherr
> Priority: Major
> Fix For: 3.0.1, 4.0.0
>
>
> Original PDF text: "Each certified or noncertified member"
> Tika extracted text: "Each certifi ed or noncertifi ed member"
--
This message was sent by Atlassian Jira
(v8.20.10#820010)