[ https://issues.apache.org/jira/browse/PDFBOX-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018906#comment-16018906 ]
Tilman Hausherr commented on PDFBOX-3799:
-----------------------------------------

Re your other wish: I'd prefer not to, because this would use more memory.

> Problem in TextPosition's hashCode
> ----------------------------------
>
>                 Key: PDFBOX-3799
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3799
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 2.0.6
>            Reporter: Miro Mannino
>            Assignee: Tilman Hausherr
>             Fix For: 2.0.7, 3.0.0
>
>
> Just another side effect related to TextPosition's hashCode.
> I am using the hashCode because I want to know the color of each letter. To
> do this, during processTextPosition, I save the current graphics state in
> a map, using the current text position as the key. Then, in writeString, I
> iterate over all the text positions and get the color for each of them
> through this map.
> Of course it would be easier if this information could be saved in the text
> position. But this is just a desired feature.
> I am discovering that between processTextPosition and writeString it
> sometimes happens that the same textPosition has a different unicode. In
> processTextPosition it is just an "x" (char 120), but in writeString the
> unicode of the same textPosition is the x followed by '̄' (char 772).
> Everything else about the textPosition remains the same: same coordinates,
> same System.identityHashCode; the only thing that changes is the unicode,
> which causes a different hashCode to be computed.
> That is causing problems. As a workaround I am now using
> System.identityHashCode instead of TextPosition's current implementation.
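
The workaround in the report (keying on object identity rather than on a content-based hashCode) can be sketched in plain Java. This is only an illustration under stated assumptions: `Glyph` is a hypothetical stand-in class, not PDFBox's TextPosition, and the "red" color value is made up. It shows why a lookup in a HashMap fails once a field that feeds hashCode changes after insertion, and why java.util.IdentityHashMap is not affected.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Objects;

public class IdentityKeyDemo {

    /** Hypothetical stand-in for a text position; NOT PDFBox's TextPosition. */
    static final class Glyph {
        final float x;
        final float y;
        String unicode; // mutable, and it takes part in equals/hashCode

        Glyph(float x, float y, String unicode) {
            this.x = x;
            this.y = y;
            this.unicode = unicode;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof Glyph)) {
                return false;
            }
            Glyph other = (Glyph) o;
            return Float.compare(x, other.x) == 0
                    && Float.compare(y, other.y) == 0
                    && unicode.equals(other.unicode);
        }

        @Override
        public int hashCode() {
            // Content-based hash: changes whenever unicode changes.
            return Objects.hash(x, y, unicode);
        }
    }

    /**
     * Stores a color under a glyph, then changes the glyph's unicode the way
     * the report describes ("x" becomes "x" plus combining macron, char 772),
     * and looks the same object up again.
     */
    static String colorAfterMutation(Map<Glyph, String> colors) {
        Glyph glyph = new Glyph(10f, 20f, "x");
        colors.put(glyph, "red");
        glyph.unicode = "x\u0304"; // same object, different content hash
        return colors.get(glyph);
    }

    public static void main(String[] args) {
        // HashMap searches with the new hash and no longer finds the entry.
        System.out.println("HashMap:         " + colorAfterMutation(new HashMap<>()));
        // IdentityHashMap compares keys with == and uses System.identityHashCode.
        System.out.println("IdentityHashMap: " + colorAfterMutation(new IdentityHashMap<>()));
    }
}
```

An IdentityHashMap stores only references in the side map, so it leaves the memory footprint of each text position unchanged, which is the concern raised in the comment above. The trade-off is that two distinct objects with identical content are treated as different keys.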