[jira] [Comment Edited] (PDFBOX-3620) Acroform fields bad encoding

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837287#comment-15837287
 ] 

Tilman Hausherr edited comment on PDFBOX-3620 at 1/25/17 6:51 AM:
--

See their comment
https://bz.apache.org/ooo/show_bug.cgi?id=127294
{quote}
Please:
1) attach the original ODF document used to create this PDF
2) provide information about your operating system
3) try to test with latest build 4.1.3
{quote}


was (Author: tilman):
See their comment
{quote}
Please:
1) attach the original ODF document used to create this PDF
2) provide information about your operating system
3) try to test with latest build 4.1.3
{quote}

> Acroform fields bad encoding
> 
>
> Key: PDFBOX-3620
> URL: https://issues.apache.org/jira/browse/PDFBOX-3620
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.3, 2.0.4
> Environment: Windows 7
>Reporter: Neus
>  Labels: acroform
> Attachments: CONTRACTE.pdf, helloworld-Arial-Symbolic.pdf, 
> helloworld.pdf
>
>
> There's a problem with AcroForm fields filled using Java.
> When I set a string containing characters like ó, á, etc. in an AcroForm field
> and send the PDF to the printer, these characters are replaced by other ones.
> But if the PDF is saved to disk and then opened, the characters are correct,
> not replaced!
> Here is the PDF I'm using:
> https://drive.google.com/file/d/0B1_3_sVPnBolbjc3U21LTUFTeDg/view?usp=sharing
> I have also tried saving the document to a temp file, reading that file with
> PDFBox and sending it to the printer, and characters like ó, é, etc. don't
> print correctly either.
> Here's the code I'm using:
> {code}
> PDDocument pdfDocument;
> try {
>     pdfDocument = PDDocument.load(PrintUtils.class.getClassLoader().
>             getResourceAsStream("\\application\\vistes\\clients\\contractes\\informes\\CONTRACTE.pdf"));
>     PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
>     PDAcroForm acroForm = docCatalog.getAcroForm();
>     acroForm.setNeedAppearances(false);
>     AcroForm.omplirCamp(acroForm, "num_contracte", contracte.getId());
>     AcroForm.omplirCamp(acroForm, "identificacio_caldera",
>             (contracte.getMarca() + " " + contracte.getAparell() + " " +
>             contracte.getNumero_fabricacio()).trim());
>     AcroForm.omplirCamp(acroForm, "data_instalacio",
>             dataFormatter.format(contracte.getDataInstalacio()));
>     AcroForm.omplirCamp(acroForm, "nom_usuari", contracte.getNom_usuari());
>     AcroForm.omplirCamp(acroForm, "nom_usuari2", contracte.getNom_usuari());
>     AcroForm.omplirCamp(acroForm, "nif", contracte.getNif());
>     AcroForm.omplirCamp(acroForm, "carrer", contracte.getDireccio());
>     AcroForm.omplirCamp(acroForm, "num_casa", contracte.getNumeroCasa());
>     AcroForm.omplirCamp(acroForm, "telefon", contracte.getTelefon());
>     AcroForm.omplirCamp(acroForm, "municipi", contracte.getPoblacio());
>     AcroForm.omplirCamp(acroForm, "euros",
>             decimalFormatter.format(contracte.getCost_anual()));
>     //Create AttributeSet
>     PrintRequestAttributeSet pset = new HashPrintRequestAttributeSet();
>     //Add Duplex Option to AttributeSet
>     pset.add(Sides.DUPLEX);
>     PrinterJob job = PrinterJob.getPrinterJob();
>     job.setJobName("Contracte " + (contracte.getId() != null ?
>             contracte.getId() : "nou"));
>     PageFormat format = job.getPageFormat(pset);
>     format.setOrientation(PageFormat.PORTRAIT);
>     Paper paper = new Paper();
>     paper.setSize(20, 14.5);
>     format.setPaper(paper);
>     job.defaultPage(format);
>     job.setPageable(new PDFPageable(pdfDocument));
>     if (job.printDialog(pset)) {
>         job.print(pset);
>     }
>     pdfDocument.close();
> } catch (IOException e) {
>     e.printStackTrace();
> } catch (PrinterException e) {
>     e.printStackTrace();
> }
>
> public static void omplirCamp(PDAcroForm acroForm, String nomCamp, String valor)
>         throws IOException {
>     PDField field = acroForm.getField(nomCamp);
>     if (field != null) {
>         field.setValue(valor != null ? valor : "");
>     }
>     else {
>         System.err.println("No s'ha trobat el camp " + nomCamp + "!");
>     }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-3620) Acroform fields bad encoding

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837287#comment-15837287
 ] 

Tilman Hausherr commented on PDFBOX-3620:
-

See their comment
{quote}
Please:
1) attach the original ODF document used to create this PDF
2) provide information about your operating system
3) try to test with latest build 4.1.3
{quote}




[jira] [Commented] (PDFBOX-1912) Optical Character Recognition (OCR)

2017-01-24 Thread Tim Allison (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836632#comment-15836632
 ] 

Tim Allison commented on PDFBOX-1912:
-

Yes, thank you, [~tilman]. We have two basic strategies. One runs OCR on the
extracted inline images one by one, and the other uses PDFBox to render each
page and then runs Tesseract on the full image. Let us know if/when you have
questions.

See our general [OCR wiki page|https://wiki.apache.org/tika/TikaOCR] and our 
[PDF-specific 
notes|https://wiki.apache.org/tika/PDFParser%20%28Apache%20PDFBox%29#OCR]
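
A minimal sketch of the Tesseract step in the second strategy, assuming the page has already been rendered to an image file by PDFBox and that a `tesseract` executable is on the PATH. The names `TesseractCommandSketch`, `buildCommand` and `ocr`, and the file name `page-1.png`, are hypothetical; this is not Tika's or PDFBox's API. It only shows one way to call the command line from Java.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

final class TesseractCommandSketch {

    // Builds "tesseract <image> stdout -l <lang>". With "stdout" as the output base,
    // Tesseract writes the recognized text to standard output instead of a file.
    static List<String> buildCommand(Path pageImage, String language) {
        return Arrays.asList("tesseract", pageImage.toString(), "stdout", "-l", language);
    }

    // Runs Tesseract on one rendered page image and returns the recognized text.
    static String ocr(Path pageImage, String language) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(pageImage, language))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // pass diagnostics through
                .start();
        ByteArrayOutputStream text = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                text.write(buffer, 0, n);
            }
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return new String(text.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Only prints the command; "page-1.png" is a placeholder for a rendered page.
        System.out.println(buildCommand(Paths.get("page-1.png"), "eng"));
    }
}
```

Calling `ocr` once per rendered page keeps the page order under the caller's control. The first strategy would pass each extracted inline image to the same method instead.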

> Optical Character Recognition (OCR)
> ---
>
> Key: PDFBOX-1912
> URL: https://issues.apache.org/jira/browse/PDFBOX-1912
> Project: PDFBox
>  Issue Type: New Feature
>  Components: Text extraction
>Affects Versions: 2.0.0
> Environment: JDK 6, C/C++
>Reporter: John Hewson
>Assignee: John Hewson
>  Labels: gsoc2014
> Fix For: 2.1.0
>
>
> Brief explanation: The PDFBox library is widely used to extract text from PDF 
> files. However, many PDF files embed text in a malformed manner which renders 
> text extraction useless. There has recently been interest in extracting 
> governmental data from PDF files, the PDF Liberation commons being a notable 
> example, see https://github.com/pdfliberation for more details.
> Many end-users of PDFBox have been making use of OCR tools such as Google's 
> Tesseract https://code.google.com/p/tesseract-ocr/ which are run on the final 
> image generated by PDFBox. We think that by adding a more integrated OCR API 
> to PDFBox it will be possible to do a better job. PDFBox often has access to 
> encoding and positioning information for individual glyphs. Even when their 
> extracted text is meaningless, a character-by-character, or line-by-line OCR 
> could be more accurate. PDFBox also has information such as image orientation 
> which could allow it to better perform OCR on pages such as embedded 
> landscape tables.
> There are existing JNI bindings for Tesseract available at 
> https://code.google.com/p/tesseract-android-tools/
> Expected results: To extend PDFBox with an API which allows external OCR 
> tools to be plugged in, and an implementation of a Tesseract plug-in using 
> either JNI or the command line via Process.exec.
> Knowledge Prerequisite: Java, JNI (C/C++)
> Mentor: John Hewson
> PMC Note: Tesseract is under the Apache License 2.0
> To learn more about PDFBox, please visit http://pdfbox.apache.org/






[jira] [Issue Comment Deleted] (PDFBOX-3650) Merge multiple PDF files along with form fields

2017-01-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-3650:

Comment: was deleted

(was: Closed due to missing feedback. (repeated because of JIRA problem, will 
be deleted next))

> Merge multiple PDF files along with form fields
> ---
>
> Key: PDFBOX-3650
> URL: https://issues.apache.org/jira/browse/PDFBOX-3650
> Project: PDFBox
>  Issue Type: New Feature
>  Components: AcroForm
>Affects Versions: 2.0.3
>Reporter: Ajit Tawade
>Priority: Minor
>  Labels: MergePDFForm
>
> We are trying to merge PDF files along with their form fields using PDFBox 
> 2.0.3.
> We have used the PDFMergerUtility methods to merge the documents, but this 
> does not fulfil the form field merge requirement.
> Please advise how to merge PDF files along with their form fields.






[jira] [Closed] (PDFBOX-3650) Merge multiple PDF files along with form fields

2017-01-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-3650.
---
Resolution: Incomplete




[jira] [Reopened] (PDFBOX-3650) Merge multiple PDF files along with form fields

2017-01-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr reopened PDFBOX-3650:
-




[jira] [Commented] (PDFBOX-3650) Merge multiple PDF files along with form fields

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836599#comment-15836599
 ] 

Tilman Hausherr commented on PDFBOX-3650:
-

Closed due to missing feedback. (repeated because of JIRA problem, will be 
deleted next)




[jira] [Commented] (PDFBOX-1912) Optical Character Recognition (OCR)

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836598#comment-15836598
 ] 

Tilman Hausherr commented on PDFBOX-1912:
-

Alternatively, there's now an OCR component in Tika.




[jira] [Comment Edited] (PDFBOX-1958) image mask outline with shading pattern is invisible

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836308#comment-15836308
 ] 

Tilman Hausherr edited comment on PDFBOX-1958 at 1/24/17 8:35 PM:
--

I've slept on this and thought about it before going to work, and I have decided 
not to put more work into it; the advantages are not worth the extra work. The 
only file I have that is not a test file but a real-world file 
(gs-bugzilla690297.pdf) looks weird even when rendered with Adobe Reader.



> image mask outline with shading pattern is invisible
> 
>
> Key: PDFBOX-1958
> URL: https://issues.apache.org/jira/browse/PDFBOX-1958
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
>  Labels: Stencil, mask, shading, shadingpattern
> Fix For: 2.0.5, 2.1.0
>
> Attachments: cinnebar1.jpg, cinnebar.pdf, cinnebar.ps, 
> gs-bugzilla690297.pdf, PATTYP2.pdf
>
>
> This is also somewhat of a regression: two weeks ago, the attached file had 
> the image rendered in b/w, now it is invisible. I was able to get the image 
> in another (wrong) color by changing one line in BeginInlineImage.java, the 
> one with TODO to
> awtImage = image.getStencilImage(colorSpace.toPaint(color, 
> image.getHeight())); // <--- TODO: pass page height?






[jira] [Comment Edited] (PDFBOX-3000) Transparency Group issues

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836311#comment-15836311
 ] 

Tilman Hausherr edited comment on PDFBOX-3000 at 1/24/17 8:35 PM:
--

[~lehmi] It is true that all changes are also in 2.0, but I didn't set a target 
because not all transparency problems are solved, i.e. I wasn't expecting to 
resolve the issue for 2.0.5. I'd like to keep this issue for a while because it 
helps me to track the improvements in one single place.



> Transparency Group issues
> -
>
> Key: PDFBOX-3000
> URL: https://issues.apache.org/jira/browse/PDFBOX-3000
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: John Hewson
>  Labels: Transparency
> Fix For: 2.1.0
>
> Attachments: 007087-payment-due-p58_reduced2.pdf, blendmodes.pdf, 
> BlendModes-rgb.pdf, circle-simple.pdf, ds-firewall-enterprise-p1_reduced.pdf, 
> gs-bugzilla689309-reduced-bc0.pdf, gs-bugzilla689309-reduced-bc1.pdf, 
> gs-bugzilla689309-reduced.pdf, gs-bugzilla689931_reduced-Multiply.pdf, 
> gs-bugzilla689931_reduced-ScreenBlendPageBackground.pdf, 
> gs-bugzilla689931_reduced-Screen.pdf, gs-bugzilla690022_reduced.pdf, 
> gs-bugzilla690022-reduced-rotations.pdf, gs-bugzilla691157_mod_unc.pdf, 
> gs-bugzilla691157_mod_unc.png, gs-bugzilla691157.pdf, gs-bugzilla691348.pdf, 
> gs-bugzilla691650-2.pdf, gs-bugzilla692217_reduced.pdf, 
> gs-bugzilla693322_reduced.pdf, gs-bugzilla694556-3.pdf, 
> gs-bugzilla695354.pdf, gs-bugzilla695582-transparency-fill-stroke.pdf, 
> gs-bugzilla695582-transparency-fill-stroke.pdf-1.png, 
> PDFBOX-1697-reduced-rotations.pdf, PDFBOX-2182_mod.pdf, 
> PDFBox3359PanelTestEnhanced.java, PDFBOX-3400-RGB.pdf, 
> PDFBOX-3494_reduced_cropX.pdf, PDFBOX-3494_reduced.pdf, PDFBOX-3564-Mask.pdf, 
> PDFJS-2845-p1.pdf, PDFJS-5526-p13_reduced1.pdf, PDFJS-5526-p13_reduced2.pdf, 
> PDFJS-5526-p13_reduced3-nogroup.pdf, PDFJS-5811-2-p3_reduced4.pdf, 
> PDFJS-5811-2-p3_reduced.pdf, PDFJS-5811-2-p4_reduced-rotations.pdf, 
> PDFJS-5811-2.pdf, PDFJS-5853_reduced.pdf, 
> PDFJS-6967_reduced_outside_softmask.pdf, 
> samsung_galaxy_s_4_um-p1_reduced.pdf, snowman-nose-gradient.pdf, 
> snowman-nose-gradient-rgb.pdf, 
> snowman-nose-gradient-rgb_reduced-0.6-bad2.pdf, 
> snowman-nose-gradient-rgb_reduced-0.7-good2.pdf, 
> snowman-nose-gradient-screenshot-comparison-11.12.2016.jpg, 
> snowmen-opacity-clipping-masks-2.0.3.png, 
> snowmen-opacity-clipping-masks-2.1.0-SNAPSHOT-2016-12-07.png, 
> snowmen-opacity-clipping-masks-2.1-SNAPSHOT-10.12.2016.png, 
> snowmen-opacity-clipping-masks-adobe-illustrator.png, 
> snowmen-opacity-clipping-masks.pdf, SoftMask-Clipped.pdf, SoftMask.pdf, 
> softmask-rewrite-alt1.patch, softmask-rewrite.patch
>
>
> This is a follow-up issue for transparency group issues from PDFBOX-2423. 
> More details to come.






[jira] [Comment Edited] (PDFBOX-3653) NegativeArraySizeException thrown when converting PDF to Image (in TilingPaint.java)

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836316#comment-15836316
 ] 

Tilman Hausherr edited comment on PDFBOX-3653 at 1/24/17 8:35 PM:
--

I have set a maximum surface of 5000 x 5000 for a pattern image, and dpi 
isn't even counted.

Initially, this check activated only with your file and another ( 
https://bugs.ghostscript.com/show_bug.cgi?id=693134 ) that doesn't have an 
issue here yet. The values are arbitrary and there might still be trouble with 
a large dpi. Your file renders fine up to 400% (= 288dpi), but not at 1000% 
(720dpi).

I also tried to limit the sizes to the page sizes, but that didn't work with 
several other files; I suspect that at a later time, the calculated large 
patterns get resized so their full surface is needed.

I have also decided to remove the code that sets big (> 32767) XStep and YStep 
values to 0. Rendering with the modified code shows improvements:
- PDFBOX-3447-XStep9.pdf: the pattern repetition at the top (didn't notice 
before, it is on top of the blue part) is gone
- PDFJS-6496-XStep9.pdf: the pattern repetition at the right (didn't notice 
before, the dotted line was double) is gone
- PDFJS-7731-XStep9.pdf: the pattern repetition at the bottom (easy to see) 
is gone

An even better solution for the future would be to paint patterns with big 
sizes only once, i.e. not use TexturePaint.
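
To illustrate why an oversized pattern can end in a NegativeArraySizeException, here is a small sketch. It is not the code in TilingPaint.java: the class and method names, the 30000 x 30000 example size and the 4 bytes per pixel are assumptions for illustration. Multiplying width, height and bytes per pixel in `int` arithmetic can wrap around to a negative number, which then ends up as an array size. Doing the surface check in `long` arithmetic avoids that. The 5000 x 5000 limit and the zoom figures are taken from the comment above.

```java
final class PatternSizeSketch {

    // Upper bound on the pattern surface in pixels (5000 x 5000, from the comment).
    static final long MAX_SURFACE = 5000L * 5000L;

    // int multiplication silently wraps around; this mimics a raster buffer
    // size calculation done without an overflow check.
    static int bufferSizeWithIntArithmetic(int width, int height, int bytesPerPixel) {
        return width * height * bytesPerPixel;
    }

    // The surface is computed as a long, so the comparison itself cannot overflow.
    static boolean exceedsLimit(int width, int height) {
        return (long) width * (long) height > MAX_SURFACE;
    }

    // PDF user space has 72 units per inch, so 100% zoom corresponds to 72 dpi.
    static int dpiForZoom(int zoomPercent) {
        return 72 * zoomPercent / 100;
    }

    public static void main(String[] args) {
        int width = 30000;
        int height = 30000;
        // 30000 * 30000 * 4 does not fit into an int, so the result is negative.
        System.out.println("int buffer size: " + bufferSizeWithIntArithmetic(width, height, 4));
        System.out.println("exceeds limit:   " + exceedsLimit(width, height));
        System.out.println("400% zoom dpi:   " + dpiForZoom(400));
        System.out.println("1000% zoom dpi:  " + dpiForZoom(1000));
    }
}
```

Because the limit does not take dpi into account, a pattern that passes the check at 100% can still become too large once the page is rendered at a high zoom factor.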




[jira] [Comment Edited] (PDFBOX-3653) NegativeArraySizeException thrown when converting PDF to Image (in TilingPaint.java)

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15836320#comment-15836320
 ] 

Tilman Hausherr edited comment on PDFBOX-3653 at 1/24/17 8:35 PM:
--

Snapshot will be at
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.5-SNAPSHOT/
within a few hours.



> NegativeArraySizeException thrown when converting PDF to Image (in 
> TilingPaint.java)
> 
>
> Key: PDFBOX-3653
> URL: https://issues.apache.org/jira/browse/PDFBOX-3653
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.4
>Reporter: Emily Coyne
>Assignee: Tilman Hausherr
> Fix For: 2.0.5, 2.1.0
>
> Attachments: PDFBOX-3653_reduced.pdf, PowerPoint-slides.pdf
>
>
> Specifically page 7 of the PDF document is failing.
> PDF Document:
> http://download.win2pdf.com/samples/PowerPoint-slides.pdf
> (also attached to ticket)
> Stack trace:
> Exception in thread "main" java.lang.NegativeArraySizeException
>     at java.awt.image.DataBufferByte.<init>(DataBufferByte.java:76)
>     at java.awt.image.Raster.createInterleavedRaster(Raster.java:266)
>     at java.awt.image.Raster.createInterleavedRaster(Raster.java:212)
>     at java.awt.image.ComponentColorModel.createCompatibleWritableRaster(ComponentColorModel.java:2825)
>     at org.apache.pdfbox.rendering.TilingPaint.getImage(TilingPaint.java:134)
>     at org.apache.pdfbox.rendering.TilingPaint.<init>(TilingPaint.java:69)
>     at org.apache.pdfbox.rendering.PageDrawer.getPaint(PageDrawer.java:251)
>     at org.apache.pdfbox.rendering.PageDrawer.getNonStrokingPaint(PageDrawer.java:526)
>     at org.apache.pdfbox.rendering.PageDrawer.fillPath(PageDrawer.java:597)
>     at org.apache.pdfbox.contentstream.operator.graphics.FillEvenOddRule.process(FillEvenOddRule.java:36)
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:829)
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:486)
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:460)
>     at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
>     at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:189)
>     at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:145)
>     at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:94)
>     at org.apache.pdfbox.tools.PDFToImage.main(PDFToImage.java:236)
>     at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:94)






[jira] [Comment Edited] (PDFBOX-3620) Acroform fields bad encoding

2017-01-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15834934#comment-15834934
 ] 

Tilman Hausherr edited comment on PDFBOX-3620 at 1/24/17 8:29 PM:
--

Here's some code to correct your file; after running it, the problem is gone. 
The code is a bit long, but all it does is replace the flag for Arial, 
Times and Courier.

Btw, your file has the fields set to read-only, so I wonder why you'd want to 
change the values.

{code}
// Assumes doc is an already loaded PDDocument that has an AcroForm.
Iterator<PDField> it = doc.getDocumentCatalog().getAcroForm().getFieldIterator();
while (it.hasNext())
{
    PDField field = it.next();
    if (field instanceof PDTextField)
    {
        for (PDAnnotationWidget widget : field.getWidgets())
        {
            PDAppearanceDictionary appearance = widget.getAppearance();
            if (appearance == null)
            {
                continue;
            }
            PDAppearanceEntry normalAppearance = appearance.getNormalAppearance();
            // Guard: skip widgets without a normal appearance, or whose
            // normal appearance is a sub-dictionary rather than a stream.
            if (normalAppearance == null || !normalAppearance.isStream())
            {
                continue;
            }
            PDAppearanceStream appearanceStream = normalAppearance.getAppearanceStream();
            if (appearanceStream == null)
            {
                continue;
            }
            PDResources resources = appearanceStream.getResources();
            if (resources == null)
            {
                continue;
            }
            for (COSName name : resources.getFontNames())
            {
                PDFont font = resources.getFont(name);
                if (!(font instanceof PDTrueTypeFont))
                {
                    continue;
                }
                // Only touch the three font families affected in this file.
                if (!font.getName().startsWith("Times") && 
                    !font.getName().startsWith("Arial") && 
                    !font.getName().startsWith("Courier"))
                {
                    continue;
                }
                PDFontDescriptor fontDescriptor = font.getFontDescriptor();
                if (fontDescriptor == null)
                {
                    continue;
                }
                // 4 = Symbolic only; replace it with 32 = Nonsymbolic.
                if (fontDescriptor.getFlags() == 4)
                {
                    fontDescriptor.setFlags(32);
                    System.out.println(name.getName() + " " + font.getName() + 
                        " flag corrected");
                }
            }
        }
    }
}
{code}
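For readers wondering about the numbers 4 and 32 in the code above: in the PDF 
font descriptor /Flags entry, bit 3 (value 4) marks a font as Symbolic and 
bit 6 (value 32) marks it as Nonsymbolic. Below is a minimal sketch in plain 
Java, without PDFBox, of that bit arithmetic. The class and method names are 
hypothetical and only for illustration. Unlike the code above, which only 
rewrites a flags value that is exactly 4, this variant would also keep any 
other bits that happen to be set.

```java
final class FontFlagSketch
{
    // Font descriptor /Flags bits as defined in the PDF specification.
    static final int SYMBOLIC = 1 << 2;    // bit 3, value 4
    static final int NONSYMBOLIC = 1 << 5; // bit 6, value 32

    // True if the Symbolic bit is set in the given flags value.
    static boolean isSymbolic(int flags)
    {
        return (flags & SYMBOLIC) != 0;
    }

    // Clears the Symbolic bit and sets the Nonsymbolic bit,
    // leaving all other bits (e.g. FixedPitch = 1) untouched.
    static int toNonSymbolic(int flags)
    {
        return (flags & ~SYMBOLIC) | NONSYMBOLIC;
    }

    public static void main(String[] args)
    {
        int before = 4;
        int after = toNonSymbolic(before);
        System.out.println(before + " -> " + after); // prints "4 -> 32"
    }
}
```

Whether preserving the other bits is desirable depends on the file; for the 
file in this issue the flags value was exactly 4, so both approaches give 32.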
Your file was created by OpenOffice 4.1.1. I'll try to open an issue there too, 
but I don't expect much.
http://arstechnica.com/information-technology/2016/09/openoffice-after-years-of-neglect-could-shut-down/
Consider updating to LibreOffice.



> Acroform fields bad encoding
> 
>
> Key: PDFBOX-3620
> URL: 

JIRA mails to dev@ missing

2017-01-24 Thread Tilman Hausherr

https://issues.apache.org/jira/browse/INFRA-13374






Re: JIRA thread replies

2017-01-24 Thread Tilman Hausherr

On 24.01.2017 at 12:26, Andreas Lehmkühler wrote:
> I've opened a JIRA ticket
>
> https://issues.apache.org/jira/browse/INFRA-13380

Thanks!

Tilman

> BR
> Andreas
>
> > Andreas Lehmkühler wrote on 18 January 2017 at 12:40:
> >
> > I've forwarded your question to users@infra
> >
> > BR
> > Andreas
> >
> > > Tilman Hausherr wrote on 17 January 2017 at 18:29:
> > >
> > > Did I miss something or is it a setting? I no longer see the
> > > possibility to do thread replies in JIRA, only ordinary replies.
> > >
> > > Tilman





Re: JIRA thread replies

2017-01-24 Thread Andreas Lehmkühler
I've opened a JIRA-ticket 

https://issues.apache.org/jira/browse/INFRA-13380

BR
Andreas

> Andreas Lehmkühler wrote on 18 January 2017 at 12:40:
> 
> 
> I've forwarded your question to users@infra
> 
> BR
> Andreas
> > Tilman Hausherr wrote on 17 January 2017 at 18:29:
> > 
> > 
> > Did I miss something or is it a setting? I no longer see the 
> > possibility to do thread replies in JIRA, only ordinary replies.
> > 
> > Tilman
> > 
> > 