[ https://issues.apache.org/jira/browse/TIKA-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17656059#comment-17656059 ]
Tika User commented on TIKA-3952:
---------------------------------

[~nick] I ran this command:

java -jar pdfbox-app.2.0.27.jar ExtractText problematicPDF.pdf

The .txt file was created in the same location, but it is empty.

> Content mismatch
> ----------------
>
>                 Key: TIKA-3952
>                 URL: https://issues.apache.org/jira/browse/TIKA-3952
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Tika User
>            Priority: Major
>         Attachments: download.pdf
>
>
> While extracting the content of the attached file, we are seeing the
> content mismatches below.
> Native file content : 95 (1972); Erznoznik v. City of Jacksonville
> Content we got from Tika : 95 (1972); Er{*}e{*}noznik v. City of Jacksonville
>
> Native file content : 438 U.S.\n726
> Content we got from Tika : 438 {*}U-S{*}.\n726

--
This message was sent by Atlassian Jira
(v8.20.10#820010)