> On 23 Mar 2016, at 06:20, Allison, Timothy B. <talli...@mitre.org> wrote:
>
> All,
> We've upgraded to 2.0.0 on Tika. Many thanks again!
> One of our users is interested in continuing to use the
> classic/SequentialParser, or at least having it available as a back-off
> parser for corrupt pdfs [0].

Using the old parser really isn’t a good idea; it’s known to be pretty broken. I think that we would be much better off making sure the new parser can handle truncated files. We already do a lot of repair in the new parser, so this doesn’t seem like too much work? Maybe Andreas can comment further? Do we have some JIRA issues which identify some of these cases?

-- John

> Would you be willing to distribute a shaded/relocated 1.8.x app so that we
> could load both 1.8.x and 2.0.0 in the same jvm without collisions? Or, is
> there a better solution?

I wouldn’t recommend doing that, because you’re going to be stuck with using 1.8 for everything, not just parsing, at least as far as corrupt/truncated files are concerned.

-- John

> Thank you!
>
> Cheers,
>
> Tim
>
> [0]
> https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15208360#comment-15208360
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
> For additional commands, e-mail: dev-h...@pdfbox.apache.org