Re: Regression Testing

2014-07-07 Thread Petr Slabý
Hi, following is a description of what we are doing in our company. With our software, we run regression tests after each nightly build and sometimes it is a tough fight. If there is a regression, it is not so easy to find which commit caused it, because there are potentially many between the

Paid PDFBox support

2014-07-07 Thread Aleksander Blomskøld
Hi, We're using PDFBox for PDF validation and PDF merging in a backend invoicing system. It's working pretty well most of the time, but right now we're having some unhappy customers because of https://issues.apache.org/jira/browse/PDFBOX-1533. As it's important for us to have this fixed prett

[jira] [Commented] (PDFBOX-2107) Make PDFBox XMP library agnostic

2014-07-07 Thread MH (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053651#comment-14053651 ] MH commented on PDFBOX-2107: It took me a while to figure out why suddenly those 2 methods ar

[jira] [Commented] (PDFBOX-283) Character encoding/appearance issues when filling forms

2014-07-07 Thread Marco Primiceri (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053683#comment-14053683 ] Marco Primiceri commented on PDFBOX-283: Hello [~tilman] Maruan's patch has solved

[jira] [Commented] (PDFBOX-283) Character encoding/appearance issues when filling forms

2014-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053793#comment-14053793 ] Tilman Hausherr commented on PDFBOX-283: Done in rev 1608502 for the 1.8 branch an

[jira] [Comment Edited] (PDFBOX-1695) Improve pdfbox tests

2014-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052896#comment-14052896 ] Tilman Hausherr edited comment on PDFBOX-1695 at 7/7/14 5:33 PM: --

Re: Improving OCR plugin for PDFBox

2014-07-07 Thread Santosh Arakeri
Pl dont send me mail. On Fri, Jun 27, 2014 at 12:28 PM, John Hewson wrote: > Hi Dimuthu > > That’s great. We should wait until closer to the end of the GSoC period to > integrate your work with PDFBox, as ideally we only want to have to do it > once. We’ve not included C++ dependencies before s

Re: Improving OCR plugin for PDFBox

2014-07-07 Thread John Hewson
Santosh, Please don’t e-mail the entire mailing list asking to be unsubscribed, simply send an e-mail to: dev-unsubscr...@pdfbox.apache.org -- John On 7 Jul 2014, at 10:39, Santosh Arakeri wrote: > Pl dont send me mail. > > > On Fri, Jun 27, 2014 at 12:28 PM, John Hewson wrote: > >> Hi D

Custom PDFTextStripper Warning (sometimes)

2014-07-07 Thread -A
Hi everyone; I have a program written that has two PDF function requirements: 1. It must be able to return all of the text from the file 2. It must be able to find red text within the file I have two different types of PDF files. One we can call a Job Output File, which may or may not hav

[jira] [Created] (PDFBOX-2194) Refactor predictor

2014-07-07 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2194: --- Summary: Refactor predictor Key: PDFBOX-2194 URL: https://issues.apache.org/jira/browse/PDFBOX-2194 Project: PDFBox Issue Type: Bug Component

[jira] [Resolved] (PDFBOX-2194) Refactor predictor

2014-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-2194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-2194. Resolution: Fixed. Done in rev 1608530 for the trunk and rev 1608537 for the 1.8 branch.

Jenkins build became unstable: PDFBox-trunk #1124

2014-07-07 Thread Apache Jenkins Server
See

Re: Custom PDFTextStripper Warning (sometimes)

2014-07-07 Thread John Hewson
Hi Aaron, You’re using the operator classes from the “org.apache.pdfbox.util.operator.pagedrawer” package with your custom TextStripper; however, these classes are only for use with a PageDrawer. If you look at the top entry in the stack trace "org.apache.pdfbox.util.operator.pagedrawer.FillEvenOd

Re: Custom PDFTextStripper Warning (sometimes)

2014-07-07 Thread -A
John: Excellent! That fixed it. I appreciate the fast reply. I've been scouring about for any PDFBox resources I could find and unfortunately have not found much. If there are any sites or books that go over the API that you would recommend, then by all means, please do. Thanks again though! -Aa

Jenkins build became unstable: PDFBox-trunk » Apache PDFBox #1124

2014-07-07 Thread Apache Jenkins Server
See

Re: Paid PDFBox support

2014-07-07 Thread Tilman Hausherr
I don't do freelancing and I never looked at the merge code so I'm hardly your guy, but maybe somebody else will come forward. This workaround code worked for me with the files in the JIRA issue: PDDocument doc1 = PDDocument.loadNonSeq(new File("part1.pdf"), null); PDDocument d

Re: Paid PDFBox support

2014-07-07 Thread Maruan Sahyoun
the issue is because part1.pdf in PDFBOX-1533 references the same 2 pages 3 times within the document catalog (/Kids [3 0 R, 3 0 R, 3 0 R]). Could you attach a sample pdf to PDFBOX-1533 to verify that your issue has the same cause or verify it for yourself? We are using PDFBox for merging docum

Re: Paid PDFBox support

2014-07-07 Thread Leonard Rosenthol
FWIW: It's unclear if such a file (with multiple references from the Pages tree) is valid. There is nothing that prevents it, but it's not necessarily an expected thing. Leonard On 7/7/14, 5:05 PM, "Maruan Sahyoun" wrote: >the issue is because part1.pdf in PDFBOX-1533 references the same 2 pages

[jira] [Updated] (PDFBOX-1533) When merging certain PDF's several odd looking empty pages occure in the result

2014-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1533: Description: Unfortunately I cannot attach an input file for this case as it contains conf

RE: Regression Testing

2014-07-07 Thread Allison, Timothy B.
John, My initial plan for TIKA-1302 is very similar to what Tilman outlined, and my understanding/concerns/thoughts were very much in line with what he articulated. The idea is that there should be a small Apache license-able gold truth set like both projects now have for specific unit test

[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes

2014-07-07 Thread Shaola Ren (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054527#comment-14054527 ] Shaola Ren commented on PDFBOX-1915: As I thought at the very beginning, I used a has

[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes

2014-07-07 Thread Tilman Hausherr (JIRA)
[ https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14054580#comment-14054580 ] Tilman Hausherr commented on PDFBOX-1915: - I'll test the new code later today...