[ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152349#comment-14152349
 ] 

Uwe commented on PDFBOX-1512:
-----------------------------

You're right, Tom, in principle we'd need a different solution here to make the 
whole thing watertight (fix transitivity).

But the intention for my patch wasn't to make the comparison perfect. The 
motivation was to simply extend what is the status quo on JDK 6 (an imperfect 
but working text stripper) to JDK 7 and onwards (status quo: unpredictably 
throwing exceptions). People have been living with the imperfect comparator for 
years in their solutions, so they won't notice a change - only that they can 
now use JDK 7 safely. 

Don't get me wrong: if you can provide a comparator that solves the problem, 
please post it here. I'd love to fix this properly.

However, I've been holding back on migrating my project to JDK 7 for over a year 
now because of this issue. I'd much rather implement an imperfect but working 
solution today than wait another year for an ideal fix.

> TextPositionComparator is not compatible with Java 7
> ----------------------------------------------------
>
>                 Key: PDFBOX-1512
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1512
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.7.1
>         Environment: Java 7
>            Reporter: Benjamin Papez
>            Assignee: Andreas Lehmkühler
>         Attachments: FOP-2252.pdf, TextPositionComparator.java, Topo.pdf, 
> Topo.txt, TopoContained.pdf, TopoContained.txt, TopoOverlap.pdf, 
> TopoOverlap.txt, WFI_PDFParser_TextPostionComparator.txt, 
> illustration-of-inconsistent-sorting.png, immo-kurier_arsenal_93x62.pdf, 
> quicksort.patch
>
>
> The TextPositionComparator causes the following exception when running on Java 7: 
> Unexpected RuntimeException from 
> org.apache.tika.parser.ParserDecorator$1@9007fa2 Original cause: Comparison 
> method violates its general contract!
> I think the problem is with this check:
> if ( yDifference < .1 ||
>     (pos2YBottom >= pos1YTop && pos2YBottom <= pos1YBottom) ||
>     (pos1YBottom >= pos2YTop && pos1YBottom <= pos2YBottom))
> as it violates the contract requirement:
> The implementor must also ensure that the relation is transitive: 
> ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.
> Finally, the implementor must ensure that compare(x, y)==0 implies that 
> sgn(compare(x, z))==sgn(compare(y, z)) for all z.
> Java 7 now is strict and throws exceptions when the contract is violated.
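
To illustrate the contract violation described in the issue, here is a minimal
sketch. It is not the PDFBox code: the class name, the compareY helper and the
TOLERANCE value are hypothetical, and the comparison is reduced to a single
y-coordinate. It only shows how a "close enough counts as equal" rule can make
compare(x, y) == 0 while x and y still order differently against a third value.

```java
public class ToleranceComparatorDemo {
    // Hypothetical tolerance, chosen for illustration only.
    static final float TOLERANCE = 1.0f;

    // Treats two y-coordinates as equal when they are closer than TOLERANCE,
    // otherwise orders them numerically.
    static int compareY(float a, float b) {
        if (Math.abs(a - b) < TOLERANCE) {
            return 0;
        }
        return Float.compare(a, b);
    }

    // True when compare(x, y) == 0 but x and y disagree on how they order
    // against z, which the Comparator contract forbids.
    static boolean violatesContract(float x, float y, float z) {
        return compareY(x, y) == 0
                && Integer.signum(compareY(x, z)) != Integer.signum(compareY(y, z));
    }

    public static void main(String[] args) {
        float x = 0.0f, y = 0.6f, z = 1.2f;
        System.out.println("compare(x, y) = " + compareY(x, y));
        System.out.println("compare(y, z) = " + compareY(y, z));
        System.out.println("compare(x, z) = " + compareY(x, z));
        System.out.println("violates contract: " + violatesContract(x, y, z));
    }
}
```

With these values, x equals y and y equals z under the tolerance rule, yet x
sorts before z, so "equality" is not transitive. A sort that relies on the
contract can reach an inconsistent state on such input.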


