On 30.04.2020 at 07:40, Maruan Sahyoun wrote:
Hi,

As part of a project engagement I would be able to use a commercial product to
extract text and generate images from PDFs. We could use these to generate test
files for comparison of our own results.

The background is that the customer is willing to support us in enhancing
PDFBox in this area, since it is also used within the customer's environment
as part of an archiving solution for text extraction and thumbnail generation.

We need to use our own files, official test files.

WDYT?


Hi,

Sorry, I do not really understand... the files we are using for regression testing are mostly real-world "problem" files. It would be difficult to create these from scratch.

Or do you mean this product can extract text and generate images in a different way than we do, e.g. without the flaws we have (small differences when rendering on different computers)?

Tilman


