[ https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13650683#comment-13650683 ]
Andreas Lehmkühler commented on PDFBOX-1512: -------------------------------------------- I investigated some time into this and I still run into the exception. I guess the problem is, that we can't just compare the absolute position of the textpositions to be compared. In some case we have to take the context into account, e.g. if a text contains super- or subscript parts those characters have to be sorted using the x-coordinate and the fact the two characters belong together. In such cases the y-ccordinate doesn't matter. This can lead to the described exception depending on the starting point (number of textposition, starting order etc.) So I'm not sure if there is a solution, maybe we have to implement our own sort algo. > TextPositionComparator is not compatible with Java 7 > ---------------------------------------------------- > > Key: PDFBOX-1512 > URL: https://issues.apache.org/jira/browse/PDFBOX-1512 > Project: PDFBox > Issue Type: Bug > Components: Text extraction > Affects Versions: 1.7.1 > Environment: Java 7 > Reporter: Benjamin Papez > Assignee: Andreas Lehmkühler > Attachments: TextPositionComparator.java > > > The TextPostionCompartor causes the following exception running on Java 7: > Unexpected RuntimeException from > org.apache.tika.parser.ParserDecorator$1@9007fa2 Original cause: Comparison > method violates its general contract! > I think the problem is with this check: > if ( yDifference < .1 || > (pos2YBottom >= pos1YTop && pos2YBottom <= pos1YBottom) || > (pos1YBottom >= pos2YTop && pos1YBottom <= pos2YBottom)) > as it violates the contract requirement: > The implementor must also ensure that the relation is transitive: > ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0. > Finally, the implementor must ensure that compare(x, y)==0 implies that > sgn(compare(x, z))==sgn(compare(y, z)) for all z. > Java 7 now is strict and throws exceptions when the contract is violated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira