[ https://issues.apache.org/jira/browse/PDFBOX-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars Torunski updated PDFBOX-2996:
----------------------------------
    Attachment: diff-delta.png
                artikel1_20_arab.pdf-sorted-qs-recursive.txt
                artikel1_20_arab.pdf-sorted-qs-iterative-withRightPivot.txt
                artikel1_20_arab.pdf-sorted-qs-iterative-withMiddlePivot.txt
                artikel1_20_arab.pdf-sorted-java8-timsort.txt
                artikel1_20_arab.pdf-sorted-java8-legacyMergeSort.txt

I experimented and tested extensively with the spd file and Java 8 on my Mac.

The results in diff-delta.png should be interpreted with care, because, for example, the encoding used during the file comparison could change them.

The expected result is that the recursive quick sort and the iterative quick sort that picks the rightmost index as the pivot produce identical output. The bubble sort output is very close to that of the Java 8 legacy merge sort. The outputs of the Java 8 legacy merge sort and the Java 8 TimSort (introduced in Java 7) are very different.

Tomorrow I will try to reproduce the output files and the test results.

> StackOverflow in Quicksort
> --------------------------
>
>                 Key: PDFBOX-2996
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2996
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.8.10, 2.0.0
>         Environment: Java 7
>            Reporter: Manuel Aristaran
>         Attachments: 001991.pdf, Lars-v0-PDFBOX-2996.patch,
> Lars-v1-PDFBOX-2996.patch, Lars-v2-PDFBOX-2996.patch, QuickSort.java,
> artikel1_20_arab.pdf-sorted-bubble.txt, artikel1_20_arab.pdf-sorted-diff.txt,
> artikel1_20_arab.pdf-sorted-iter-withRightPivot.txt,
> artikel1_20_arab.pdf-sorted-iter.txt,
> artikel1_20_arab.pdf-sorted-java8-legacyMergeSort.txt,
> artikel1_20_arab.pdf-sorted-java8-timsort.txt,
> artikel1_20_arab.pdf-sorted-qs-iterative-withMiddlePivot.txt,
> artikel1_20_arab.pdf-sorted-qs-iterative-withRightPivot.txt,
> artikel1_20_arab.pdf-sorted-qs-recursive.txt,
> artikel1_20_arab.pdf-sorted-rekur.txt, diff-delta.png, failing_sort.pdf,
> quicksort.patch
>
>
> Running PDFTextStripper through ExtractText triggers a StackOverflow
> exception in the QuickSort implementation for [this particular
> document|https://www.dropbox.com/s/6crie7y5gqadwa5/1.pdf?dl=0].
> To reproduce: {{java -jar pdfbox-app-1.8.11-SNAPSHOT.jar ExtractText -sort
> failing_sort.pdf}}
> (Related to PDFBOX-1512)
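
For readers unfamiliar with the "iterative with right pivot" variant mentioned in the comment, here is a minimal sketch of the general technique. It is not the attached QuickSort.java or any of the attached patches. The class name IterativeQuickSortSketch and its methods are hypothetical, and the Lomuto partition scheme is an assumption made for illustration. The point is only that the pending sub-ranges are kept on a heap-allocated stack, so a deep partition chain cannot overflow the call stack.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; not PDFBox code.
final class IterativeQuickSortSketch {

    /** Sorts the list in place, using an explicit stack instead of recursion. */
    static <T> void sort(List<T> list, Comparator<? super T> cmp) {
        if (list.size() < 2) {
            return;
        }
        // Each entry is an inclusive {low, high} range that still has to be sorted.
        Deque<int[]> pending = new ArrayDeque<>();
        pending.push(new int[] {0, list.size() - 1});
        while (!pending.isEmpty()) {
            int[] range = pending.pop();
            int low = range[0];
            int high = range[1];
            if (low >= high) {
                continue; // ranges with zero or one element are already sorted
            }
            int pivotIndex = partition(list, cmp, low, high);
            pending.push(new int[] {low, pivotIndex - 1});
            pending.push(new int[] {pivotIndex + 1, high});
        }
    }

    /** Lomuto partition that uses the rightmost element of the range as the pivot. */
    private static <T> int partition(List<T> list, Comparator<? super T> cmp,
                                     int low, int high) {
        T pivot = list.get(high);
        int boundary = low; // everything left of boundary compares <= pivot
        for (int j = low; j < high; j++) {
            if (cmp.compare(list.get(j), pivot) <= 0) {
                Collections.swap(list, boundary, j);
                boundary++;
            }
        }
        Collections.swap(list, boundary, high); // move the pivot to its final position
        return boundary;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(Arrays.asList(5, 3, 9, 1, 3, 7));
        sort(data, Comparator.naturalOrder());
        System.out.println(data); // prints [1, 3, 3, 5, 7, 9]
    }
}
```

Quick sort is not a stable sort, so elements that compare as equal may end up in a different relative order than with a merge sort. Already sorted input remains the slow case for a rightmost pivot, but in this sketch it costs time rather than call-stack depth.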