date:20140620

Tilman Hausherr created PDFBOX-2153:
---

 Summary: Setting the correct clipping path for shading
 Key: PDFBOX-2153
 URL: https://issues.apache.org/jira/browse/PDFBOX-2153
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Reporter: Tilman Hausherr


While doing tests with the file eci_altona-test-suite-v2_technical_H.pdf 
(uncompressed) of PDFBOX-1915 I noticed that by removing a W (modifies the 
clipping region) operator of a type 7 shading I got a lot more correct shadings 
(type 6 and lower). It looked like PDFBox had been using the clipping of the 
type 7 when drawing the type 6, which is just a rectangle above in that 
rendering. This resulted in a blank.

By adding 
{code}
graphics.setClip(getGraphicsState().getCurrentClippingPath());
{code}
in PageDrawer.shfill() just before the graphics.fill() I get several files to 
render correctly that I hadn't before.

(Setting null will probably do the same, didn't test that yet).
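
To illustrate the effect, here is a minimal stand-alone sketch using plain Java2D. It is not the PDFBox code: the class name, the coordinates and the colours are made up for the illustration. A clip left behind by an earlier operation swallows a later fill that lies outside it, and resetting the clip before the fill makes the shape appear.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

public class ClipResetSketch {

    /**
     * Paints a red square at (60,60)-(90,90) on a white 100x100 image after an
     * earlier operation has left a clip covering only (0,0)-(40,40).
     */
    static BufferedImage render(boolean resetClip) {
        BufferedImage image = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 100, 100);

        // Stale clip from a previous drawing operation (hypothetical values).
        g.setClip(new Rectangle(0, 0, 40, 40));

        if (resetClip) {
            // null removes the clip entirely; a current clipping shape could be set instead.
            g.setClip(null);
        }

        g.setColor(Color.RED);
        g.fill(new Rectangle(60, 60, 30, 30));
        g.dispose();
        return image;
    }

    /** True if the pixel at (x, y) is pure red. */
    static boolean isRed(BufferedImage image, int x, int y) {
        return (image.getRGB(x, y) & 0xFFFFFF) == 0xFF0000;
    }

    public static void main(String[] args) {
        System.out.println("stale clip kept:  " + isRed(render(false), 75, 75));
        System.out.println("clip reset first: " + isRed(render(true), 75, 75));
    }
}
```

With the stale clip in place the square is not drawn at all, which matches the blank output described above.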

The following PDFs are rendered correctly with the change:
McAfee-ShadingType7.pdf
eci_altona-test-suite-v2_technical_H.pdf
crestron-p9.pdf  (these three found in PDFBOX-1915)
PDFBOX-1451.pdf (alfresco)
PDFBOX-1940.pdf (chart)
PDFBOX-1861-tracemonkey.pdf p.11

Not solved by the change:
PDFBOX-2098-asyTUG.pdf p.6  (this one doesn't use shfill)
PDFBOX-1861-tracemonkey.pdf p.6 (not shading)
PDFBOX-1416.pdf (not shading)
texample-rgb-triangle.pdf (John has an explanation about that one)

WDYT? Is there any reason NOT to set the clipping path in PageDrawer.shFill() ?



--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Updated] (PDFBOX-2153) Setting the correct clipping path for shading


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2153:


Labels: shading shadingpattern  (was: )


[jira] [Commented] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038567#comment-14038567
 ] 

Petr Slaby commented on PDFBOX-2149:


Attached a file which now runs into an NPE in PDFont#isSymbolicFont().
{noformat}
Caused by: java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.font.PDFont.isSymbolicFont(PDFont.java:694)
    at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getGIDForCharacterCode(PDTrueTypeFont.java:408)
    at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getFontWidth(PDTrueTypeFont.java:378)
    at org.apache.pdfbox.pdmodel.font.PDFont.getFontWidth(PDFont.java:312)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:377)
    at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:44)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:508)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:259)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:226)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:209)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:175)
    at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:227)
    at org.apache.pdfbox.rendering.PDFRenderer.renderPageToGraphics(PDFRenderer.java:190)
    at org.apache.pdfbox.rendering.PDFRenderer.renderPageToGraphics(PDFRenderer.java:174)
{noformat}
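
The trace does not show which reference is null at PDFont.java:694. One plausible cause, and purely an assumption here, is a font without a font descriptor. The following sketch uses hypothetical stand-in types (not the PDFBox classes) to show the difference between an unguarded lookup and one that falls back to a default for malformed files.

```java
public class SymbolicFlagSketch {

    /** Hypothetical stand-in for a font descriptor; not the PDFBox class. */
    public static final class Descriptor {
        private final boolean symbolic;

        public Descriptor(boolean symbolic) {
            this.symbolic = symbolic;
        }

        public boolean isSymbolic() {
            return symbolic;
        }
    }

    /** Hypothetical stand-in for a font whose descriptor may be missing. */
    public static final class Font {
        private final Descriptor descriptor; // null for the malformed case

        public Font(Descriptor descriptor) {
            this.descriptor = descriptor;
        }

        /** Throws a NullPointerException when the descriptor is absent. */
        public boolean isSymbolicUnguarded() {
            return descriptor.isSymbolic();
        }

        /** Returns the caller-supplied fallback when the descriptor is absent. */
        public boolean isSymbolicGuarded(boolean fallback) {
            return descriptor != null ? descriptor.isSymbolic() : fallback;
        }
    }

    public static void main(String[] args) {
        Font broken = new Font(null);
        System.out.println(broken.isSymbolicGuarded(false));
    }
}
```

Which fallback value is appropriate depends on the font type and is left open here.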

 Font Refactoring
 

 Key: PDFBOX-2149
 URL: https://issues.apache.org/jira/browse/PDFBOX-2149
 Project: PDFBox
  Issue Type: Improvement
  Components: FontBox, PDModel
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
 Attachments: 000467.pdf


 To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need 
 to sort out long-standing font/text encoding issues. The main issue is that 
 encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, 
 sometimes elsewhere. For example, TTFGlyph2D does its own decoding, and this 
 code is copied and pasted into PDTrueTypeFont. Likewise, PDFont handles both 
 CMaps and Encodings despite the fact that these two encoding methods are 
 mutually exclusive. The end result is that the process of reading 
 Encodings/CMaps often follows rules which are completely invalid for that 
 font type but mostly work by luck.
 Phase 1
 - Refactor PDFont subclasses to remove setXXX methods which allow the object 
 to be corrupted. Proper use of inheritance can remove all cases where public 
 setXXX methods are used during font loading.
 - Clean up TTF loading and the loadTTF in anticipation of Unicode TTF 
 embedding. FontBox's TrueTypeFont class is externally mutable via setXXX 
 methods used only by TTFParser; these can be made package-private.
 - the Encoding class and EncodingManager could do with some cleaning up prior 
 to further refactoring.
 - PDSimpleFont does not do anything, its functionality should be moved into 
 its superclass, PDFont.
 - PDFont#determineEncoding() loads CMaps when only Encodings are applicable, 
 and vice versa. Loading needs to be pushed down into the appropriate 
 subclasses, as a starting point the relevant code should at least be copied 
 into the relevant subclasses ready for further refactoring.
 - TTFGlyph2D does its own decoding of char codes, rather than using the 
 font's #encode method (fair enough because #encode is broken) and there's a 
 copy and pasted version of the same code in PDTrueTypeFont - we need to 
 consolidate this code into PDTrueTypeFont where it belongs.
 Phase 2
 - Refactor loading of CMaps and Encodings from font dictionaries. This will 
 involve changes to PDFont and its subclasses to delegate loading to 
 subclasses where it can be properly encapsulated.
 - May need to alter the class hierarchy w.r.t. CIDFont to facilitate this, as 
 CIDFont isn't really a PDFont - its parent Type0 font is responsible for its 
 CMap. We'll see.
 Phase 3
 - Refactor the decoding of character codes by PDFont and its subclasses, this 
 will involve replacing the #getCodeFromArray, #encode and #encodeToCID 
 methods.
 - Fix decoding of content stream character codes in PDFStreamEngine, using 
 the newly refactored PDFont and using the current font's CMap to determine 
 the code width.
 Phase 4
 - Add support for generating embedded TTFs with Unicode




[jira] [Updated] (PDFBOX-2149) Font Refactoring


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Petr Slaby updated PDFBOX-2149:
---

Attachment: 000467.pdf


[jira] [Updated] (PDFBOX-2149) Font Refactoring


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Petr Slaby updated PDFBOX-2149:
---

Attachment: 39.pdf

Here is another one. Hope this helps.


[jira] [Commented] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038579#comment-14038579
 ] 

Andreas Lehmkühler commented on PDFBOX-2149:


[~jahewson] According to the spec you are totally right, but the real world is 
quite different. There are a lot of PDF generators which don't care about the 
spec. And, more importantly, the user doesn't care about the spec either. If a 
PDF opens in Acrobat, then it has to open in any other PDF reader as well. 
Tilman has an example, Petr as well; have a look at PDFBOX-62 and you'll find 
another one, and I guess there are a lot more in the wild out there triggering 
the described NPE.

Either you revert your changes to reinstate my workaround, or you come up 
with another/better one yourself. We, the PDFBox community, don't like it, but 
we've learned that we have to live with such workarounds.


Re: PDFBox and XMP - retire jempbox

2014-06-20 Thread Andreas Lehmkuehler


Hi,

On 20.06.2014 08:05, Maruan Sahyoun wrote:

Hi,

we currently have two libraries handling XMP metadata: jempbox and xmpbox.

Part of PDFBOX-1187/PDFBOX-2197 was to remove the direct dependency on jempbox, 
as XMP metadata can now be generated by any library and added as a stream. 
This will be available in PDFBox 2.0.0.

I would like to propose that we now retire jempbox, as xmpbox

# is closer to the spec (naming conventions)
# is used for PDF/A validation, where we cannot remove the dependency on XMP 
handling because checking metadata is necessary for PDF/A compliance.

In case there is functionality in jempbox that is missing in xmpbox, it could 
be added at a later stage upon request.

WDYT?

I've nothing to add

+1


BR
Maruan


BR
Andreas Lehmkühler

Re: Travis CI

2014-06-20 Thread Andreas Lehmkuehler


Hi,

On 19.06.2014 22:03, John Hewson wrote:

Hi All

The recent instability of Jenkins prompted me to set up Travis CI to build the 
PDFBox mirror on GitHub. Automatic builds are triggered after every commit, and 
they can often run much faster than on the busy Jenkins server, so this gives 
committers an additional means to quickly determine if their build has problems 
or not.

Good idea.


The builds are public at: https://travis-ci.org/apache/pdfbox

The Jenkins build is still the “ground truth” and passing that is what counts; 
it *might* be possible to pass Travis CI and still fail on Jenkins, so that’s 
something to keep in mind.

Especially as the Travis build uses oraclejdk7 as its compiler. PDFBox has Java 6 
as its minimum requirement, and that configuration may hide incompatibilities 
caused by the chosen Java version.



-- John


BR
Andreas Lehmkühler

[jira] [Commented] (PDFBOX-2153) Setting the correct clipping path for shading


[ 
https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038601#comment-14038601
 ] 

Andreas Lehmkühler commented on PDFBOX-2153:


Hmmm, I thought that it would be sufficient to use the clipping path as the 
argument when calling the fill method, but obviously it isn't. IMHO, go ahead.


[jira] [Commented] (PDFBOX-2153) Setting the correct clipping path for shading


[ 
https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038604#comment-14038604
 ] 

Petr Slaby commented on PDFBOX-2153:


Sounds reasonable. The current clipping path is passed to graphics.fill(), so if 
the graphics still has a clipping path from a previous operation, the two might 
interfere. I vote for setClip(null) because setClip() is a time- and 
memory-consuming operation if called with a complex path.

The change does not show any effect on my test suite documents; it seems that I 
do not have an example that would be affected. 
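
A small Java2D sketch of that reasoning, with made-up names and coordinates rather than PDFBox code: when the region to paint is itself handed to fill(), the painted area is already limited to that region, so the Graphics2D clip only has to be cleared, not set to the same shape again.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.geom.Ellipse2D;
import java.awt.image.BufferedImage;

public class FillShapeAsClipSketch {

    /**
     * Paints blue inside the given region on a white 100x100 image, either by
     * clipping to the region and filling everything, or by clearing the clip
     * and filling the region itself.
     */
    static BufferedImage paint(Shape region, boolean viaSetClip) {
        BufferedImage image = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 100, 100);

        g.setColor(Color.BLUE);
        if (viaSetClip) {
            g.setClip(region);          // clip computed from the (possibly complex) shape
            g.fillRect(0, 0, 100, 100);
        } else {
            g.setClip(null);            // no clip; the filled shape bounds the paint
            g.fill(region);
        }
        g.dispose();
        return image;
    }

    /** True if the pixel at (x, y) is pure blue. */
    static boolean isBlue(BufferedImage image, int x, int y) {
        return (image.getRGB(x, y) & 0xFFFFFF) == 0x0000FF;
    }

    public static void main(String[] args) {
        Shape region = new Ellipse2D.Double(20, 20, 60, 60);
        System.out.println(isBlue(paint(region, true), 50, 50));
        System.out.println(isBlue(paint(region, false), 50, 50));
    }
}
```

Both variants paint the centre of the region and leave the corners untouched; the sketch says nothing about their relative cost.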


Re: PDFBox and XMP - retire jempbox

2014-06-20 Thread Timo Boehme


Hi,

On 20.06.2014 08:05, Maruan Sahyoun wrote:

Hi,

we currently have two libraries handling XMP metadata: jempbox and xmpbox.

Part of PDFBOX-1187/PDFBOX-2197 was to remove the direct dependency on jempbox, 
as XMP metadata can now be generated by any library and added as a stream. 
This will be available in PDFBox 2.0.0.

I would like to propose that we now retire jempbox, as xmpbox

# is closer to the spec (naming conventions)
# is used for PDF/A validation, where we cannot remove the dependency on XMP 
handling because checking metadata is necessary for PDF/A compliance.

In case there is functionality in jempbox that is missing in xmpbox, it could 
be added at a later stage upon request.

WDYT?


+1

Best,
Timo


--

 Timo Boehme
 OntoChem GmbH
 H.-Damerow-Str. 4
 06120 Halle/Saale
 T: +49 345 4780474
 F: +49 345 4780471
 timo.boe...@ontochem.com

_

 OntoChem GmbH
 Managing Director: Dr. Lutz Weber
 Registered office: Halle / Saale
 Court of registration: Stendal
 Registration number: HRB 215461
_

Re: [VOTE] Release Apache PDFBox 1.8.6

2014-06-20 Thread Timo Boehme


Hi,

+1

many thanks for preparing the release.

Best,
Timo


On 19.06.2014 14:28, Andreas Lehmkuehler wrote:

Hi,

a candidate for the PDFBox 1.8.6 release is available at:

 http://people.apache.org/~lehmi/pdfbox/1.8.6/

The release candidate is a zip archive of the sources in:

 http://svn.apache.org/repos/asf/pdfbox/tags/1.8.6/

The SHA1 checksum of the archive is
543c49ebe34a443654a0c3c264f36acc07983cc6.

Please vote on releasing this package as Apache PDFBox 1.8.6.
The vote is open for the next 72 hours and passes if a majority of at
least three +1 PDFBox PMC votes are cast.

 [ ] +1 Release this package as Apache PDFBox 1.8.6
 [ ] -1 Do not release this package because...


Here is my +1

BR
Andreas Lehmkühler




[jira] [Commented] (PDFBOX-2118) Remove ICU4J dependency


[ 
https://issues.apache.org/jira/browse/PDFBOX-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038860#comment-14038860
 ] 

Andreas Lehmkühler commented on PDFBOX-2118:


I've just realized that we don't have to replace the Bidi usage, as it isn't 
used anymore, so I simply removed it from the trunk in revision 
http://svn.apache.org/r1604181. I've marked the deleted method and class as 
deprecated in the 1.8 branch in revision http://svn.apache.org/r1604182.

 Remove ICU4J dependency
 ---

 Key: PDFBOX-2118
 URL: https://issues.apache.org/jira/browse/PDFBOX-2118
 Project: PDFBox
  Issue Type: Improvement
  Components: PDModel
Affects Versions: 2.0.0
Reporter: Andreas Lehmkühler
Assignee: Andreas Lehmkühler
  Labels: ICU4J
 Fix For: 2.0.0


 The ICU4J lib is quite big and we are just using a small part of it. Both 
 features are provided by the JDK (java.text.Normalizer and java.text.Bidi) 
 since 1.6 so that it should be possible to remove the ICU4J dependency.
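
 As a rough sketch of what the two JDK classes offer, the example below folds compatibility (presentation) forms, strips combining marks, and checks whether a string needs bidirectional processing. The class and method names are invented for the illustration, and the chosen normalization forms are an assumption rather than a statement about what PDFBox does.

```java
import java.text.Bidi;
import java.text.Normalizer;

public class JdkTextSketch {

    /** Replaces compatibility characters such as ligatures with their plain equivalents. */
    static String foldCompatibilityForms(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    /** Decomposes the text and removes non-spacing combining marks. */
    static String stripDiacritics(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{Mn}+", "");
    }

    /** True if the text contains right-to-left content that needs bidi handling. */
    static boolean needsBidi(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        System.out.println(foldCompatibilityForms("\uFB01"));   // U+FB01 LATIN SMALL LIGATURE FI
        System.out.println(stripDiacritics("\u00E9"));          // U+00E9 LATIN SMALL LETTER E WITH ACUTE
        System.out.println(needsBidi("abc"));
        System.out.println(needsBidi("\u05D0\u05D1"));          // Hebrew alef, bet
    }
}
```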




Jenkins build is back to normal : PDFBox-trunk #1064

2014-06-20 Thread Apache Jenkins Server

See https://builds.apache.org/job/PDFBox-trunk/1064/changes

Jenkins build is back to normal : PDFBox-trunk » PDFBox parent #1064

2014-06-20 Thread Apache Jenkins Server

See 
https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-parent/1064/changes

[jira] [Commented] (PDFBOX-2118) Remove ICU4J dependency

2014-06-20 Thread Maruan Sahyoun (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038890#comment-14038890
 ] 

Maruan Sahyoun commented on PDFBOX-2118:


[~lehmi] Shouldn’t we deprecate the methods normalizePres() and normalizeDiac() 
in 1.8, as 2.0 uses normalizePresentationForm() and normalizeDiacritic()? It 
might also be beneficial to add both new methods to 1.8, backed by the old code 
(i.e. normalizePresentationForm() calls normalizePres()), so people can start 
using the new methods in 1.8 already.
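
A minimal sketch of that bridging pattern follows. The enclosing class, the String-to-String signatures and the method body are assumptions made for the illustration; only the four method names come from the comment above, and the body is a stand-in rather than the existing 1.8 logic.

```java
import java.text.Normalizer;

public class NormalizerBridgeSketch {

    /** Old entry point, kept so that existing callers continue to compile. */
    @Deprecated
    public static String normalizePres(String text) {
        // Stand-in body for the illustration only.
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    /** New name, available early; it simply forwards to the old code. */
    public static String normalizePresentationForm(String text) {
        return normalizePres(text);
    }

    public static void main(String[] args) {
        String input = "\uFB01";
        System.out.println(normalizePresentationForm(input).equals(normalizePres(input)));
    }
}
```

Callers can migrate to the new name at their own pace, and the old method can be dropped in the next major version without changing behaviour in between.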


[jira] [Commented] (PDFBOX-2118) Remove ICU4J dependency


[ 
https://issues.apache.org/jira/browse/PDFBOX-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038914#comment-14038914
 ] 

Andreas Lehmkühler commented on PDFBOX-2118:


[~msahyoun] I'm not sure if I got your point. Both methods normalizePres() and 
normalizeDiac() aren't used directly. They are called through the 
TextNormalizer class but only if the ICU4J lib is present. I've deprecated the 
whole class which includes both methods.



[jira] [Commented] (PDFBOX-2153) Setting the correct clipping path for shading


[ 
https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038925#comment-14038925
 ] 

Tilman Hausherr commented on PDFBOX-2153:
-

Fixed in rev http://svn.apache.org/r1604192 for the trunk.


[jira] [Commented] (PDFBOX-2118) Remove ICU4J dependency

2014-06-20 Thread Maruan Sahyoun (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038927#comment-14038927
 ] 

Maruan Sahyoun commented on PDFBOX-2118:


[~lehmi] I’ve missed that. As TextNormalize is public one could use it directly 
...

 Remove ICU4J dependency
 ---

 Key: PDFBOX-2118
 URL: https://issues.apache.org/jira/browse/PDFBOX-2118
 Project: PDFBox
  Issue Type: Improvement
  Components: PDModel
Affects Versions: 2.0.0
Reporter: Andreas Lehmkühler
Assignee: Andreas Lehmkühler
  Labels: ICU4J
 Fix For: 2.0.0



[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes


[ 
https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038931#comment-14038931
 ] 

Tilman Hausherr commented on PDFBOX-1915:
-

Sure, go ahead. I'll look at your code later today or this weekend.

I remember I tried inserting a break in your code but took that back for some 
reason.

Please correct PageDrawer.shfill() by inserting graphics.setClip(null); 
before the last line, see PDFBOX-2153. Then try rendering the eci file and 
you'll be pleasantly surprised :-)
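
The effect behind this suggestion can be reproduced with plain Java2D. The sketch below is only an illustration, not PDFBox code: the class and method names (StaleClipDemo, fillWithStaleClip) are hypothetical, and the small rectangle stands in for a clip left over from an earlier drawing operation.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

class StaleClipDemo
{
    /**
     * Fills a red rectangle while an unrelated clip is still set on the
     * Graphics2D. If resetClip is true, the clip is cleared just before the
     * fill. Returns the RGB value of a pixel inside the filled rectangle.
     */
    public static int fillWithStaleClip(boolean resetClip)
    {
        BufferedImage image = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = image.createGraphics();
        graphics.setColor(Color.WHITE);
        graphics.fillRect(0, 0, 100, 100);

        // clip left behind by an earlier operation: top-left corner only
        graphics.setClip(new Rectangle(0, 0, 10, 10));

        if (resetClip)
        {
            // the one-line change suggested in the comment above
            graphics.setClip(null);
        }
        graphics.setColor(Color.RED);
        graphics.fill(new Rectangle(50, 50, 40, 40));
        graphics.dispose();

        // drop the alpha byte so only the RGB value remains
        return image.getRGB(60, 60) & 0xFFFFFF;
    }

    public static void main(String[] args)
    {
        System.out.printf("stale clip: %06X%n", fillWithStaleClip(false));
        System.out.printf("clip reset: %06X%n", fillWithStaleClip(true));
    }
}
```

With the stale clip in place the sampled pixel stays white, i.e. the fill is silently discarded; after resetting the clip it turns red.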

 Implement shading with Coons and tensor-product patch meshes
 

 Key: PDFBOX-1915
 URL: https://issues.apache.org/jira/browse/PDFBOX-1915
 Project: PDFBox
  Issue Type: Improvement
  Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 2.0.0
Reporter: Tilman Hausherr
Assignee: Shaola Ren
  Labels: graphical, gsoc2014, java, math, shading
 Fix For: 2.0.0

 Attachments: CONICAL.pdf, GWG060_Shading_x1a.pdf, 
 GWG060_Shading_x1a_1.png, HSBWHEEL.pdf, McAfee-ShadingType7.pdf, 
 Shadingtype6week1.pdf, TENSOR.pdf, XYZsweep.pdf, 
 _gwg060_shading_x1a.pdf-1.png, _mcafee-shadingtype7.pdf-1.png, 
 asy-coons-but-really-tensor.pdf, asy-tensor-rainbow.pdf, asy-tensor.pdf, 
 coons-function.pdf, coons-function.ps, coons-nofunction-CMYK.pdf, 
 coons-nofunction-CMYK.ps, coons-nofunction-Duotone.pdf, 
 coons-nofunction-Duotone.ps, coons-nofunction-Gray.pdf, 
 coons-nofunction-Gray.ps, coons-nofunction-RGB.pdf, coons-nofunction-RGB.ps, 
 coons2-function.pdf, coons2-function.ps, coons4-function.ps, crestron-p9.pdf, 
 eci_altona-test-suite-v2_technical_H.pdf, failedTest.rar, lamp_cairo.pdf, 
 lamp_cairo7_0.png, lamp_cairo7_1.png, lamp_cairo7_1.png, 
 lineRasterization.jpg, mcafeeU5.pdf, mcafeeU5_1.png, mcafeeu5.pdf-1.png, 
 pass4FlagTest.rar, patchCases.jpg, patchMap.jpg, shading6ContourTest.rar, 
 shading6Done.rar, shading7.rar, tensor-nofunction-RGB.pdf, 
 tensor-nofunction-RGB.ps, tensor-nofunction-RGB_1.png, 
 tensor4-nofunction.pdf, tensor4-nofunction.ps, tensor4-nofunction_1.png, 
 updateshading6ContourTest.rar


 Of the seven shading methods described in the PDF specification, type 6 
 (Coons patch meshes) and type 7 (Tensor-product patch meshes) haven't been 
 implemented. I have done type 1, 4 and 5, but I don't know the math for type 
 6 and 7. My math days are decades away.
 Knowledge prerequisites: 
 - java, although you don't have to be a java ace, just feel comfortable
 - math: you should know what cubic Bézier curves, Degenerate Bézier 
 curves, bilinear interpolation, tensor-product, affine transform 
 matrix and Bernstein polynomials are, or be able to learn it
 - maven (basic)
 - svn (basic)
 - an IDE like Netbeans or Eclipse or IntelliJ (basic)
 - ideally, you are either a math student who likes to program, or a computer 
 science student who is specializing in graphics.
 A first look at PDFBOX: try the command utility here:
 https://pdfbox.apache.org/commandline/#pdfToImage
 and use your favorite PDF, or the PDFs mentioned in PDFBOX-615, these have 
 the shading types that are already implemented.
 Some simple source code to convert to images:
 String filename = "blah.pdf";
 PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
 List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
 int page = 0;
 for (PDPage pdPage : pdPages)
 {
     ++page;
     BufferedImage bim = RenderUtil.convertToImage(pdPage, 
         BufferedImage.TYPE_BYTE_BINARY, 300);
     ImageIO.write(bim, "png", new File(filename + page + ".png"));
 }
 document.close();
 You are not starting from scratch. The implementation of type 4 and 5 shows 
 you how to read parameters from the PDF and set the graphics. You don't have 
 to learn the complete PDF spec, only 15 pages related to the two shading 
 types, and 6 pages about shading in general. The PDF specification is here:
 http://www.adobe.com/devnet/pdf/pdf_reference.html
 The tricky parts are:
 - decide whether a point(x,y) is inside or outside a patch
 - decide the color of a point within the patch
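
 For the second point, a small sketch of bilinear interpolation, one of the 
 prerequisites listed above. It assumes the point has already been mapped to 
 patch parameters (u, v) in the unit square, which is the harder part and is 
 not shown. The names (BilinearColorDemo, bilinear) are hypothetical and this 
 is not the PDFBox implementation.

```java
class BilinearColorDemo
{
    /**
     * Interpolates one colour component from the four corner values of a
     * patch. c00 is the value at (u=0, v=0), c10 at (1, 0), c01 at (0, 1)
     * and c11 at (1, 1); u and v are expected to lie in [0, 1].
     */
    public static float bilinear(float c00, float c10, float c01, float c11,
                                 float u, float v)
    {
        // interpolate along u on the two edges, then along v between them
        float bottom = c00 + (c10 - c00) * u;
        float top = c01 + (c11 - c01) * u;
        return bottom + (top - bottom) * v;
    }

    public static void main(String[] args)
    {
        // centre of a patch whose component runs from 0 (left) to 1 (right)
        System.out.println(bilinear(0f, 1f, 0f, 1f, 0.5f, 0.5f));
    }
}
```

 An RGB colour would call this once per component.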
 To get an idea about the code, look at the classes GouraudTriangle, 
 GouraudShadingContext, Type4ShadingContext and Vertex here
 https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/shading/
 or download the whole project from the repository.
 https://pdfbox.apache.org/downloads.html#scm
 If you want to see the existing code in the debugger with a Gouraud shading, 
 try this file:
 http://asymptote.sourceforge.net/gallery/Gouraud.pdf
 Testing:
 I have attached several example PDFs. To see which one has which shading, 
 open them with an editor like NOTEPAD++, and search for "/ShadingType" 
 (without the quotes). If your images are rendering like the example PDFs, 
 then you were

[jira] [Closed] (PDFBOX-1947) Axial shading doesn't appear


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-1947.
---

Resolution: Duplicate

 Axial shading doesn't appear
 

 Key: PDFBOX-1947
 URL: https://issues.apache.org/jira/browse/PDFBOX-1947
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
  Labels: shading, shadingpattern
 Attachments: PDFBOX-1940.pdf, pdfbox-1940.pdf-1.png


 ShadingType 2 (axial shading) doesn't appear in attached file. Maybe related 
 to PDFBOX-1442.




[jira] [Closed] (PDFBOX-1451) Error in converting a pdf to image using convertToImage


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-1451.
---

Resolution: Duplicate

Shading issue fixed in PDFBOX-2153.

 Error in converting a pdf to image using convertToImage
 ---

 Key: PDFBOX-1451
 URL: https://issues.apache.org/jira/browse/PDFBOX-1451
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.5.0, 1.7.1, 1.8.6, 2.0.0
Reporter: Emanuele Lombardi
Assignee: Andreas Lehmkühler
  Labels: shading, shadingpattern
 Attachments: Alfresco_Enterprise4_Mobile.pdf, 
 Alfresco_Enterprise4_Mobile1.5.0.png, Alfresco_Enterprise4_Mobile1.7.1.png


 Hi,
 I converted a PDF to an image using
 Class: PDPage 
 API: public BufferedImage convertToImage()
 I obtained an image in which the first line of the bulleted list on the right 
 contains strange characters, and:
 with version 1.5.0, the image at the top is missing;
 with version 1.7.1, there is a strange color issue.




[jira] [Commented] (PDFBOX-2153) Setting the correct clipping path for shading


[ 
https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039024#comment-14039024
 ] 

Tilman Hausherr commented on PDFBOX-2153:
-

Fixed in rev 1604211 for the 1.8 branch.

 Setting the correct clipping path for shading
 -

 Key: PDFBOX-2153
 URL: https://issues.apache.org/jira/browse/PDFBOX-2153
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 2.0.0
Reporter: Tilman Hausherr
  Labels: shading, shadingpattern


[jira] [Updated] (PDFBOX-2153) Setting the correct clipping path for shading


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2153:


Affects Version/s: 2.0.0
   1.8.6
   1.8.5

 Setting the correct clipping path for shading
 -

 Key: PDFBOX-2153
 URL: https://issues.apache.org/jira/browse/PDFBOX-2153
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 2.0.0
Reporter: Tilman Hausherr
  Labels: shading, shadingpattern


[jira] [Commented] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039084#comment-14039084
 ] 

Tilman Hausherr commented on PDFBOX-2149:
-

The file from PDFBOX-2059 also triggers the NPE.

 Font Refactoring
 

 Key: PDFBOX-2149
 URL: https://issues.apache.org/jira/browse/PDFBOX-2149
 Project: PDFBox
  Issue Type: Improvement
  Components: FontBox, PDModel
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
 Attachments: 39.pdf, 000467.pdf


 To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need 
 to sort out long-standing font/text encoding issues. The main issue is that 
 encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, 
 sometimes elsewhere. For example TTFGlyph2D does its own decoding, and this 
 code is copy-pasted into PDTrueTypeFont. Likewise, PDFont handles CMaps and 
 Encodings despite the fact that these two encoding methods are mutually 
 exclusive. The end result is that the process of reading Encodings/CMaps is 
 often following rules which are completely invalid for that font type but 
 mostly work by luck.
 Phase 1
 - Refactor PDFont subclasses to remove setXXX methods which allow the object 
 to be corrupted. Proper use of inheritance can remove all cases where public 
 setXXX methods are used during font loading.
 - Clean up TTF loading and the loadTTF in anticipation of Unicode TTF 
 embedding, FontBox's TrueTypeFont class is externally mutable via setXXX 
 methods used only by TTFParser: these can be made package-private.
 - the Encoding class and EncodingManager could do with some cleaning up prior 
 to further refactoring.
 - PDSimpleFont does not do anything, its functionality should be moved into 
 its superclass, PDFont.
 - PDFont#determineEncoding() loads CMaps when only Encodings are applicable, 
 and vice versa. Loading needs to be pushed down into the appropriate 
 subclasses, as a starting point the relevant code should at least be copied 
 into the relevant subclasses ready for further refactoring.
 - TTFGlyph2D does its own decoding of char codes, rather than using the 
 font's #encode method (fair enough because #encode is broken) and there's a 
 copy and pasted version of the same code in PDTrueTypeFont - we need to 
 consolidate this code into PDTrueTypeFont where it belongs.
 Phase 2
 - Refactor loading of CMaps and Encodings from font dictionaries, this will 
 involve changes to PDFont and its subclasses to delegate loading to 
 subclasses where it can be properly encapsulated
 - May need to alter the class hierarchy w.r.t CIDFont to facilitate this, as 
 CIDFont isn't really a PDFont - its parent Type0 font is responsible for its 
 CMap. We'll see.
 Phase 3
 - Refactor the decoding of character codes by PDFont and its subclasses, this 
 will involve replacing the #getCodeFromArray, #encode and #encodeToCID 
 methods.
 - Fix decoding of content stream character codes in PDFStreamEngine, using 
 the newly refactored PDFont and using the current font's CMap to determine 
 the code width.
 Phase 4
 - Add support for generating embedded TTFs with Unicode




Re: log4j

2014-06-20 Thread Tilman Hausherr


PDFBOX-2151



On 17.06.2014 09:36, Simon Steiner wrote:

Hi,

Should PDFBox move the few bits that use log4j to commons logging?

Thanks

[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes


[ 
https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039252#comment-14039252
 ] 

Tilman Hausherr commented on PDFBOX-1915:
-

I ran my tests; some patch boundaries look more like a line than like a curve, 
especially tensor-nofunction-CMYK.pdf and tensor-nofunction-RGB.pdf (all at 
96dpi). The weird thing is that I could observe these effects only with tensor 
patches, not with coons patches. The coons patches are 100% identical.

 Implement shading with Coons and tensor-product patch meshes
 

 Key: PDFBOX-1915
 URL: https://issues.apache.org/jira/browse/PDFBOX-1915
 Project: PDFBox
  Issue Type: Improvement
  Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 2.0.0
Reporter: Tilman Hausherr
Assignee: Shaola Ren
  Labels: graphical, gsoc2014, java, math, shading
 Fix For: 2.0.0

 Attachments: CONICAL.pdf, GWG060_Shading_x1a.pdf, 
 GWG060_Shading_x1a_1.png, HSBWHEEL.pdf, McAfee-ShadingType7.pdf, 
 Shadingtype6week1.pdf, TENSOR.pdf, XYZsweep.pdf, 
 _gwg060_shading_x1a.pdf-1.png, _mcafee-shadingtype7.pdf-1.png, 
 asy-coons-but-really-tensor.pdf, asy-tensor-rainbow.pdf, asy-tensor.pdf, 
 coons-function.pdf, coons-function.ps, coons-nofunction-CMYK.pdf, 
 coons-nofunction-CMYK.ps, coons-nofunction-Duotone.pdf, 
 coons-nofunction-Duotone.ps, coons-nofunction-Gray.pdf, 
 coons-nofunction-Gray.ps, coons-nofunction-RGB.pdf, coons-nofunction-RGB.ps, 
 coons2-function.pdf, coons2-function.ps, coons4-function.ps, crestron-p9.pdf, 
 eci_altona-test-suite-v2_technical_H.pdf, failedTest.rar, lamp_cairo.pdf, 
 lamp_cairo7_0.png, lamp_cairo7_1.png, lamp_cairo7_1.png, 
 lineRasterization.jpg, mcafeeU5.pdf, mcafeeU5_1.png, mcafeeu5.pdf-1.png, 
 pass4FlagTest.rar, patchCases.jpg, patchMap.jpg, shading6ContourTest.rar, 
 shading6Done.rar, shading7.rar, tensor-nofunction-RGB.pdf, 
 tensor-nofunction-RGB.ps, tensor-nofunction-RGB_1.png, 
 tensor4-nofunction.pdf, tensor4-nofunction.ps, tensor4-nofunction_1.png, 
 updateshading6ContourTest.rar


 Of the seven shading methods described in the PDF specification, type 6 
 (Coons patch meshes) and type 7 (Tensor-product patch meshes) haven't been 
 implemented. I have done type 1, 4 and 5, but I don't know the math for type 
 6 and 7. My math days are decades away.
 Knowledge prerequisites: 
 - java, although you don't have to be a java ace, just feel comfortable
 - math: you should know what cubic Bézier curves, Degenerate Bézier 
 curves, bilinear interpolation, tensor-product, affine transform 
 matrix and Bernstein polynomials are, or be able to learn it
 - maven (basic)
 - svn (basic)
 - an IDE like Netbeans or Eclipse or IntelliJ (basic)
 - ideally, you are either a math student who likes to program, or a computer 
 science student who is specializing in graphics.
 A first look at PDFBOX: try the command utility here:
 https://pdfbox.apache.org/commandline/#pdfToImage
 and use your favorite PDF, or the PDFs mentioned in PDFBOX-615, these have 
 the shading types that are already implemented.
 Some simple source code to convert to images:
 String filename = "blah.pdf";
 PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
 List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
 int page = 0;
 for (PDPage pdPage : pdPages)
 {
     ++page;
     BufferedImage bim = RenderUtil.convertToImage(pdPage, 
         BufferedImage.TYPE_BYTE_BINARY, 300);
     ImageIO.write(bim, "png", new File(filename + page + ".png"));
 }
 document.close();
 You are not starting from scratch. The implementation of type 4 and 5 shows 
 you how to read parameters from the PDF and set the graphics. You don't have 
 to learn the complete PDF spec, only 15 pages related to the two shading 
 types, and 6 pages about shading in general. The PDF specification is here:
 http://www.adobe.com/devnet/pdf/pdf_reference.html
 The tricky parts are:
 - decide whether a point(x,y) is inside or outside a patch
 - decide the color of a point within the patch
 To get an idea about the code, look at the classes GouraudTriangle, 
 GouraudShadingContext, Type4ShadingContext and Vertex here
 https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/shading/
 or download the whole project from the repository.
 https://pdfbox.apache.org/downloads.html#scm
 If you want to see the existing code in the debugger with a Gouraud shading, 
 try this file:
 http://asymptote.sourceforge.net/gallery/Gouraud.pdf
 Testing:
 I have attached several example PDFs. To see which one has which shading, 
 open them with an editor like NOTEPAD++, and search for "/ShadingType" 
 (without the quotes). If your images are rendering like the example PDFs, 
 then you were successful.
 Optional:
 Review and

[jira] [Assigned] (PDFBOX-1995) AdobePDFSchema.getProducer() returns empty string

2014-06-20 Thread Guillaume Bailleul (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillaume Bailleul reassigned PDFBOX-1995:
--

Assignee: Guillaume Bailleul

 AdobePDFSchema.getProducer() returns empty string
 -

 Key: PDFBOX-1995
 URL: https://issues.apache.org/jira/browse/PDFBOX-1995
 Project: PDFBox
  Issue Type: Bug
  Components: XmpBox
Affects Versions: 1.8.4
Reporter: Alexandre Garino
Assignee: Guillaume Bailleul

 I experienced this bug during the PDF/A validation process. The document is not 
 considered valid because the producer value is not in sync with 
 PDDocumentInformation.
 {quote}
 PDDocumentInformation.getProducer() = ` ' (one space)
 AdobePDFSchema.getProducer() = `' (empty)
 {quote}
 Below the metadata extracted from the PDF document:
  
 {quote}
 <?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
 <x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
 <rdf:Description rdf:about="" 
 xmlns:xap="http://ns.adobe.com/xap/1.0/">
 <xap:CreatorTool>Canon </xap:CreatorTool>
 <xap:CreateDate>2014-01-23T20:09:45+01:00</xap:CreateDate>
 </rdf:Description>
 <rdf:Description rdf:about=""  
 xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
 <pdf:Producer> </pdf:Producer>
 </rdf:Description>
 <rdf:Description rdf:about="" 
 xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
 <pdfaid:part>1</pdfaid:part>
 <pdfaid:conformance>B</pdfaid:conformance>
 </rdf:Description>
 </rdf:RDF>
 </x:xmpmeta>
 <?xpacket end="w"?>
 {quote}
 As you can see the Producer value should be equal to ` ' (one space).
 The bug is located within the method DomXmpParser.removeComments. This method 
 is invoked during the unmarshalling process and removes much more than 
 comments, text nodes too! 
 I can fix (badly) MY issue by changing the code base from : 
 {quote}
 Text t = (Text) node;
 if (t.getTextContent().trim().length() == 0)
 {
     // XXX is there a better way to remove useless Text ?
     node.getParentNode().removeChild(node);
 }
 {quote}
 into : 
 {quote}
 Text t = (Text) node;
 if (t.getTextContent().startsWith("\n"))
 {
     // XXX is there a better way to remove useless Text ?
     node.getParentNode().removeChild(node);
 }
 {quote}
 But this is not a long term fix.
 IMHO, the unmarshalling process should be reworked.
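
 One possible direction, sketched here under assumptions and not taken from 
 the XmpBox code base or from the fix that was eventually committed: treat a 
 whitespace-only text node as layout only when its parent also has element 
 children, so that a value such as a single space inside a leaf element 
 survives. The names (WhitespaceTextDemo, removeLayoutWhitespace, 
 producerAfterCleanup) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;

class WhitespaceTextDemo
{
    /** Removes whitespace-only text nodes, but only below parents that have element children. */
    public static void removeLayoutWhitespace(Node parent)
    {
        NodeList children = parent.getChildNodes();
        boolean hasElementChild = false;
        for (int i = 0; i < children.getLength(); i++)
        {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE)
            {
                hasElementChild = true;
            }
        }
        // walk backwards: the NodeList is live, so removals shift later indexes
        for (int i = children.getLength() - 1; i >= 0; i--)
        {
            Node node = children.item(i);
            if (node instanceof Text && hasElementChild
                    && node.getTextContent().trim().isEmpty())
            {
                parent.removeChild(node);
            }
            else if (node.getNodeType() == Node.ELEMENT_NODE)
            {
                removeLayoutWhitespace(node);
            }
        }
    }

    /** Parses the XML, cleans it up and returns the text of the first Producer element. */
    public static String producerAfterCleanup(String xml)
    {
        try
        {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            removeLayoutWhitespace(doc.getDocumentElement());
            return doc.getElementsByTagName("Producer").item(0).getTextContent();
        }
        catch (Exception e)
        {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args)
    {
        String value = producerAfterCleanup("<root>\n <Producer> </Producer>\n</root>");
        System.out.println("[" + value + "]");
    }
}
```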




[jira] [Resolved] (PDFBOX-1995) AdobePDFSchema.getProducer() returns empty string

2014-06-20 Thread Guillaume Bailleul (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillaume Bailleul resolved PDFBOX-1995.


   Resolution: Fixed
Fix Version/s: 2.0.0

Fix and new test added in r1604276

 AdobePDFSchema.getProducer() returns empty string
 -

 Key: PDFBOX-1995
 URL: https://issues.apache.org/jira/browse/PDFBOX-1995
 Project: PDFBox
  Issue Type: Bug
  Components: XmpBox
Affects Versions: 1.8.4
Reporter: Alexandre Garino
Assignee: Guillaume Bailleul
 Fix For: 2.0.0



[jira] [Created] (PDFBOX-2154) NPE while rendering files with type3 fonts

Tilman Hausherr created PDFBOX-2154:
---

 Summary: NPE while rendering files with type3 fonts
 Key: PDFBOX-2154
 URL: https://issues.apache.org/jira/browse/PDFBOX-2154
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 1.8.5, 1.8.4, 1.8.3, 1.8.6
Reporter: Tilman Hausherr


I get this NPE with the files of PDFBOX-1145, PDFBOX-1794, PDFBOX-2023 in 1.8 
only:

java.lang.NullPointerException
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:210)
    at org.apache.pdfbox.pdmodel.font.Type3StreamParser.createImage(Type3StreamParser.java:59)
    at org.apache.pdfbox.pdmodel.font.PDType3Font.createImageIfNecessary(PDType3Font.java:80)
    at org.apache.pdfbox.pdmodel.font.PDType3Font.drawString(PDType3Font.java:102)
    at org.apache.pdfbox.pdfviewer.PageDrawer.processTextPosition(PageDrawer.java:256)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:499)
    at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:557)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:135)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:801)
    at org.apache.pdfbox.util.TestPDFToImage.doTestFile(TestPDFToImage.java:232)
    at org.apache.pdfbox.util.TestPDFToImage.testRenderImage(TestPDFToImage.java:344)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:232)
    at junit.framework.TestSuite.run(TestSuite.java:227)
    at junit.textui.TestRunner.doRun(TestRunner.java:116)
    at junit.textui.TestRunner.start(TestRunner.java:180)
    at junit.textui.TestRunner.main(TestRunner.java:138)
    at org.apache.pdfbox.util.TestPDFToImage.main(TestPDFToImage.java:394)


After fixing PDFStreamEngine.processStream() like this
{code}
if (aPage == null)
{
    graphicsState = new PDGraphicsState();
}
else
{
    graphicsState = new PDGraphicsState(aPage.findCropBox());
}
{code}
I get another NPE:

java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.font.Type3StreamParser.createImage(Type3StreamParser.java:60)
    at org.apache.pdfbox.pdmodel.font.PDType3Font.createImageIfNecessary(PDType3Font.java:80)
    at org.apache.pdfbox.pdmodel.font.PDType3Font.drawString(PDType3Font.java:102)
    at org.apache.pdfbox.pdfviewer.PageDrawer.processTextPosition(PageDrawer.java:256)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:506)
    at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:564)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:275)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:242)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:222)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:135)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:801)
    at org.apache.pdfbox.util.TestPDFToImage.doTestFile(TestPDFToImage.java:232)
    at org.apache.pdfbox.util.TestPDFToImage.testRenderImage(TestPDFToImage.java:344)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
at

[jira] [Reopened] (PDFBOX-1940) Faulty pdf-image rendering


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr reopened PDFBOX-1940:
-

  Assignee: Tilman Hausherr  (was: John Hewson)

Reopening to apply the fix to 1.8

 Faulty pdf-image rendering
 ---

 Key: PDFBOX-1940
 URL: https://issues.apache.org/jira/browse/PDFBOX-1940
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
Reporter: Daniel Kozimor
Assignee: Tilman Hausherr
 Fix For: 2.0.0

 Attachments: input.pdf, output.jpg


 A particular PDF is producing improper output jpg.
 The pdf in question, as well as the produced jpg can be found attached to 
 this issue.




[jira] [Updated] (PDFBOX-1940) Faulty pdf-image rendering


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1940:


Attachment: PDFBOX-1940-v1.8.jpg

 Faulty pdf-image rendering
 ---

 Key: PDFBOX-1940
 URL: https://issues.apache.org/jira/browse/PDFBOX-1940
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
Reporter: Daniel Kozimor
Assignee: Tilman Hausherr
 Fix For: 2.0.0

 Attachments: PDFBOX-1940-v1.8.jpg, input.pdf, output.jpg



[jira] [Comment Edited] (PDFBOX-1940) Faulty pdf-image rendering


[ 
https://issues.apache.org/jira/browse/PDFBOX-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039370#comment-14039370
 ] 

Tilman Hausherr edited comment on PDFBOX-1940 at 6/20/14 9:19 PM:
--

Reopening to apply [the 
fix|https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/PageDrawer.java?r1=1571581&r2=1571803&pathrev=1571803&diff_format=h]
 to 1.8



was (Author: tilman):
Reopening to apply the fix to 1.8


[jira] [Commented] (PDFBOX-1940) Faulty pdf-image rendering


[ 
https://issues.apache.org/jira/browse/PDFBOX-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039377#comment-14039377
 ] 

Tilman Hausherr commented on PDFBOX-1940:
-

Done in rev 1604279 for the 1.8 branch.

 Faulty pdf-image rendering
 ---

 Key: PDFBOX-1940
 URL: https://issues.apache.org/jira/browse/PDFBOX-1940
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.8.6, 2.0.0
Reporter: Daniel Kozimor
Assignee: Tilman Hausherr
 Fix For: 2.0.0

 Attachments: PDFBOX-1940-v1.8.jpg, input.pdf, output.jpg


 A particular PDF produces an incorrect JPG when rendered.
 The PDF in question, as well as the resulting JPG, can be found attached to 
 this issue.




[jira] [Updated] (PDFBOX-1940) Faulty pdf-image rendering


 [ 
https://issues.apache.org/jira/browse/PDFBOX-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1940:


Affects Version/s: 1.8.6


Release Apache PDFBox 1.8.6 - API docs

2014-06-20 Thread Maruan Sahyoun

the apidocs for 1.8.6 are available at 
http://pdfbox.staging.apache.org/docs/1.8.6/javadocs/

upon release they will be put into production.

BR

Maruan Sahyoun

On 19.06.2014 at 14:28, Andreas Lehmkuehler andr...@lehmi.de wrote:

 Hi,
 
 a candidate for the PDFBox 1.8.6 release is available at:
 
http://people.apache.org/~lehmi/pdfbox/1.8.6/
 
 The release candidate is a zip archive of the sources in:
 
http://svn.apache.org/repos/asf/pdfbox/tags/1.8.6/
 
 The SHA1 checksum of the archive is 543c49ebe34a443654a0c3c264f36acc07983cc6.
 
 Please vote on releasing this package as Apache PDFBox 1.8.6.
 The vote is open for the next 72 hours and passes if a majority of at
 least three +1 PDFBox PMC votes are cast.
 
[ ] +1 Release this package as Apache PDFBox 1.8.6
[ ] -1 Do not release this package because...
 
 
 Here is my +1
 
 BR
 Andreas Lehmkühler
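
For anyone checking the candidate before voting, the SHA1 comparison can be scripted. The following is only a sketch using the JDK's MessageDigest; the class name Sha1Check and the two command-line arguments (local archive path and expected checksum) are assumptions for illustration, not part of the release tooling.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of PDFBox or its release scripts
class Sha1Check
{
    /** Streams the input through SHA-1 and returns the digest as lowercase hex. */
    static String sha1Hex(InputStream in) throws IOException
    {
        MessageDigest digest;
        try
        {
            digest = MessageDigest.getInstance("SHA-1");
        }
        catch (NoSuchAlgorithmException e)
        {
            // SHA-1 is a required algorithm on every Java platform
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1)
        {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest())
        {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** Convenience overload for in-memory data. */
    static String sha1Hex(byte[] data)
    {
        try
        {
            return sha1Hex(new ByteArrayInputStream(data));
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException
    {
        // args[0]: path to the downloaded archive, args[1]: checksum from the vote mail
        try (InputStream in = Files.newInputStream(Paths.get(args[0])))
        {
            String actual = sha1Hex(in);
            System.out.println(actual.equalsIgnoreCase(args[1])
                    ? "checksum matches" : "MISMATCH: " + actual);
        }
    }
}
```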

[jira] [Commented] (PDFBOX-2141) Shading not applied to text


[ 
https://issues.apache.org/jira/browse/PDFBOX-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039402#comment-14039402
 ] 

Tilman Hausherr commented on PDFBOX-2141:
-

Committed in rev 1604282 for the trunk.

 Shading not applied to text
 ---

 Key: PDFBOX-2141
 URL: https://issues.apache.org/jira/browse/PDFBOX-2141
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
Reporter: Petr Slaby
Priority: Minor
 Attachments: 04_ShadingPatternTextPDF.pdf, PDFBOX-1917.pdf-1.png, 
 PDFBOX-1917.pdf-1.png-diff.png, PDFBOX-1917.pdf-9.png, 
 PDFBOX-1917.pdf-9.png-diff.png, PDFBOX-2135.pdf-2.png, 
 PDFBOX-2135.pdf-2.png-diff.png, PageDrawer.writeFont.java.patch


 The attached PDF draws a text filled with horizontal shading going from red 
 to blue. When rendered via PDFBox, the text is completely filled with red. 
 The problem is that AxialShadingContext#getRaster() gets called with 
 positions that completely fell outside of the range stored in its coords[] 
 field. The fix seems to be to set glyph transform rather than graphics2d 
 transform in PageDrawer#writeText() as shown in the attached patch.
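
The effect described above can be reproduced outside PDFBox. The sketch below uses only java.awt; it is not the attached patch. A unit rectangle stands in for a glyph outline, a GradientPaint stands in for the axial shading, and the class and method names are made up for illustration. When the text transform is set on the Graphics2D, the paint is evaluated in the transformed user space too, so the gradient is stretched far beyond the glyph; when the transform is applied to the outline instead, the paint keeps its page coordinates.

```java
import java.awt.Color;
import java.awt.GradientPaint;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

// Hypothetical illustration, not PDFBox code
class GlyphTransformSketch
{
    static final int WIDTH = 200;
    static final int HEIGHT = 50;

    /** Renders the stand-in glyph and returns the RGB value at device pixel (140, 25). */
    static int render(boolean transformGlyph)
    {
        BufferedImage image = new BufferedImage(WIDTH, HEIGHT, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        // gradient defined across the whole page: red at x=0, blue at x=200
        g.setPaint(new GradientPaint(0f, 0f, Color.RED, WIDTH, 0f, Color.BLUE));

        // unit square as a stand-in for a glyph outline in glyph space
        Shape glyph = new Rectangle2D.Double(0, 0, 1, 1);
        // stand-in text transform: places the glyph at x 50..150, y 10..40
        AffineTransform textTransform = new AffineTransform();
        textTransform.translate(50, 10);
        textTransform.scale(100, 30);

        if (transformGlyph)
        {
            // transform the outline only; the paint stays in page coordinates
            g.fill(textTransform.createTransformedShape(glyph));
        }
        else
        {
            // transform the graphics context; the paint is transformed along with it
            g.transform(textTransform);
            g.fill(glyph);
        }
        g.dispose();
        return image.getRGB(140, 25);
    }

    static int red(int rgb)
    {
        return (rgb >> 16) & 0xFF;
    }

    static int blue(int rgb)
    {
        return rgb & 0xFF;
    }

    public static void main(String[] args)
    {
        int viaGraphics = render(false);
        int viaGlyph = render(true);
        System.out.println("graphics transform: r=" + red(viaGraphics) + " b=" + blue(viaGraphics));
        System.out.println("glyph transform:    r=" + red(viaGlyph) + " b=" + blue(viaGlyph));
    }
}
```

Sampling a pixel near the right edge of the stand-in glyph, render(false) gives an almost purely red pixel, while render(true) gives a pixel that is more blue than red, which is what a red-to-blue page gradient should produce at that position.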




[jira] [Commented] (PDFBOX-2141) Shading not applied to text

[
https://issues.apache.org/jira/browse/PDFBOX-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039490#comment-14039490
]

Tilman Hausherr commented on PDFBOX-2141:
-

Fixed in the 1.8 version in rev 1604297.

While working on the 1.8 version I noticed a comment that relates to
PDFBOX-485. In it, [~vbier] reported printing problems with the hp laserjet 8150
and hp laserjet 1320. [~vbier], are you still using PDFBox and these two
printers? If yes, could you please test a snapshot version? (I will post the
URL tomorrow)


[jira] [Commented] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039594#comment-14039594
 ] 

John Hewson commented on PDFBOX-2149:
-

[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting in a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing its FontDescriptor - this is related to PDFBOX-2140 which I'm trying to 
fix. What we're seeing in these PDFs is that when a font is missing we're 
replacing it with the default font but we should be synthesising a 
FontDescriptor from the default font which we loaded from disk. In other words 
the NPE is actually showing us that there is a bug in PDFBox which needs a fix: 
and returning false is not going to produce the correct results. Just because 
you got rid of an exception doesn't mean that PDFBox's behaviour has been 
corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.
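
The difference between the two approaches can be sketched in a few lines. This is a simplified illustration only: the classes SketchFont and Descriptor, both methods, and the name-based synthesis rule are hypothetical stand-ins and are not PDFBox's API. The only fact relied on is that Symbol and ZapfDingbats are the symbolic fonts among the standard 14.

```java
// Hypothetical illustration, not PDFBox code
class FontDescriptorSketch
{
    /** Stand-in for a font descriptor; only the symbolic flag is modelled. */
    static final class Descriptor
    {
        final String fontName;
        final boolean symbolic;

        Descriptor(String fontName, boolean symbolic)
        {
            this.fontName = fontName;
            this.symbolic = symbolic;
        }
    }

    /** Stand-in for a font whose descriptor may be missing from the PDF. */
    static final class SketchFont
    {
        final String baseName;
        final Descriptor descriptor; // null when the PDF has no FontDescriptor

        SketchFont(String baseName, Descriptor descriptor)
        {
            this.baseName = baseName;
            this.descriptor = descriptor;
        }

        /** Hides the NPE by defaulting to false, which is wrong for symbolic fonts. */
        boolean isSymbolicDefaultingToFalse()
        {
            return descriptor != null && descriptor.symbolic;
        }

        /** Synthesises a descriptor for the substituted font and asks that instead. */
        boolean isSymbolicWithSynthesis()
        {
            Descriptor effective = descriptor != null ? descriptor : synthesise(baseName);
            return effective.symbolic;
        }
    }

    /** Assumed synthesis rule: derive the symbolic flag from the substituted font's name. */
    static Descriptor synthesise(String baseName)
    {
        boolean symbolic = "Symbol".equals(baseName) || "ZapfDingbats".equals(baseName);
        return new Descriptor(baseName, symbolic);
    }

    public static void main(String[] args)
    {
        SketchFont symbol = new SketchFont("Symbol", null);
        System.out.println("defaulting to false: " + symbol.isSymbolicDefaultingToFalse());
        System.out.println("with synthesis:      " + symbol.isSymbolicWithSynthesis());
    }
}
```

For a Symbol font without a descriptor, the first method silently reports a non-symbolic font, while the second reports a symbolic one, which is the case the comment is concerned about.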

 Font Refactoring
 

 Key: PDFBOX-2149
 URL: https://issues.apache.org/jira/browse/PDFBOX-2149
 Project: PDFBox
  Issue Type: Improvement
  Components: FontBox, PDModel
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
 Attachments: 39.pdf, 000467.pdf


 To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need 
 to sort out long-standing font/text encoding issues. The main issue is that 
 encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, 
 sometimes elsewhere. For example TTFGlyph2D does its own decoding, and this 
 code is copy & pasted into PDTrueTypeFont. Likewise, PDFont handles CMaps and 
 Encodings despite the fact that these two encoding methods are mutually 
 exclusive. The end result is that the process of reading Encodings/CMaps is 
 often following rules which are completely invalid for that font type but 
 mostly work by luck.
 Phase 1
 - Refactor PDFont subclasses to remove setXXX methods which allow the object 
 to be corrupted. Proper use of inheritance can remove all cases where public 
 setXXX methods are used during font loading.
 - Clean up TTF loading and the loadTTF in anticipation of Unicode TTF 
 embedding, FontBox's TrueTypeFont class is externally mutable via setXXX 
 methods used only by TTFParser: these can be made package-private.
 - the Encoding class and EncodingManager could do with some cleaning up prior 
 to further refactoring.
 - PDSimpleFont does not do anything, its functionality should be moved into 
 its superclass, PDFont.
 - PDFont#determineEncoding() loads CMaps when only Encodings are applicable, 
 and vice versa. Loading needs to be pushed down into the appropriate 
 subclasses, as a starting point the relevant code should at least be copied 
 into the relevant subclasses ready for further refactoring.
 - TTFGlyph2D does its own decoding of char codes, rather than using the 
 font's #encode method (fair enough because #encode is broken) and there's a 
 copy and pasted version of the same code in PDTrueTypeFont - we need to 
 consolidate this code into PDTrueTypeFont where it belongs.
 Phase 2
 - Refactor loading of CMaps and Encodings from font dictionaries, this will 
 involve changes to PDFont and its subclasses to delegate loading to 
 subclasses where it can be properly encapsulated
 - May need to alter the class hierarchy w.r.t CIDFont to facilitate this, as 
 CIDFont isn't really a PDFont - its parent Type0 font is responsible for its 
 CMap. We'll see.
 Phase 3
 - Refactor the decoding of character codes by PDFont and its subclasses, this 
 will involve replacing the #getCodeFromArray, #encode and #encodeToCID 
 methods.
 - Fix decoding of content stream character codes in PDFStreamEngine, using 
 the newly refactored PDFont and using the current font's CMap to determine 
 the code width.
 Phase 4
 - Add support for generating embedded TTFs with Unicode
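
To make the Phase 2 and Phase 3 idea concrete, here is a rough sketch of what pushing code-width decisions down into subclasses could look like. It is not the planned PDFBox design: the class names, the codeLength method and the single-threshold rule standing in for a CMap's codespace ranges are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not PDFBox code
class CodeWidthSketch
{
    abstract static class SketchFont
    {
        /** Returns how many bytes the character code starting at offset occupies (at least 1). */
        abstract int codeLength(byte[] content, int offset);

        /** Splits a content stream string into character codes using the font's own rule. */
        List<Integer> decode(byte[] content)
        {
            List<Integer> codes = new ArrayList<>();
            int offset = 0;
            while (offset < content.length)
            {
                // never read past the end if the last code is truncated
                int length = Math.min(codeLength(content, offset), content.length - offset);
                int code = 0;
                for (int i = 0; i < length; i++)
                {
                    code = (code << 8) | (content[offset + i] & 0xFF);
                }
                codes.add(code);
                offset += length;
            }
            return codes;
        }
    }

    /** Simple fonts use an Encoding, so every code is exactly one byte. */
    static final class SimpleSketchFont extends SketchFont
    {
        @Override
        int codeLength(byte[] content, int offset)
        {
            return 1;
        }
    }

    /** Composite fonts ask their CMap; here a lead byte at or above a threshold starts a two-byte code. */
    static final class CompositeSketchFont extends SketchFont
    {
        private final int twoByteLeadStart;

        CompositeSketchFont(int twoByteLeadStart)
        {
            this.twoByteLeadStart = twoByteLeadStart;
        }

        @Override
        int codeLength(byte[] content, int offset)
        {
            return (content[offset] & 0xFF) >= twoByteLeadStart ? 2 : 1;
        }
    }

    public static void main(String[] args)
    {
        byte[] content = {0x41, (byte) 0x81, 0x40, 0x42};
        System.out.println(new SimpleSketchFont().decode(content));
        System.out.println(new CompositeSketchFont(0x81).decode(content));
    }
}
```

The point of the design is that the shared decode loop never needs to know whether an Encoding or a CMap is in play; each subclass only answers the code-width question that is valid for its font type.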




[jira] [Comment Edited] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039594#comment-14039594
 ] 

John Hewson edited comment on PDFBOX-2149 at 6/21/14 12:26 AM:
---

[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs in that when a font is missing 
we're replacing it with the default font but we should be synthesising a 
FontDescriptor from the default font which we loaded from disk. In other words 
the NPE is actually showing us that there is a bug in PDFBox which needs a fix: 
and returning false is not going to produce the correct results. Just because 
you got rid of an exception doesn't mean that PDFBox's behaviour has been 
corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.


was (Author: jahewson):
[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting in a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs in that when a font is missing 
we're replacing it with the default font but we should be synthesising a 
FontDescriptor from the default font which we loaded from disk. In other words 
the NPE is actually showing us that there is a bug in PDFBox which needs a fix: 
and returning false is not going to produce the correct results. Just because 
you got rid of an exception doesn't mean that PDFBox's behaviour has been 
corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.


[jira] [Comment Edited] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039594#comment-14039594
 ] 

John Hewson edited comment on PDFBOX-2149 at 6/21/14 12:25 AM:
---

[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting in a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs in that when a font is missing 
we're replacing it with the default font but we should be synthesising a 
FontDescriptor from the default font which we loaded from disk. In other words 
the NPE is actually showing us that there is a bug in PDFBox which needs a fix: 
and returning false is not going to produce the correct results. Just because 
you got rid of an exception doesn't mean that PDFBox's behaviour has been 
corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.


was (Author: jahewson):
[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting in a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing it's FontDescriptor - it's related to PDFBOX-2140 which I'm trying to 
fix. What we're seeing in these PDFs in that when a font is missing we're 
replacing it with the default font but we should be synthesising a 
FontDescriptor from the default font which we loaded from disk. In other words 
the NPE is actually showing us that there is a bug in PDFBox which needs a fix: 
and returning false is not going to produce the correct results. Just because 
you got rid of an exception doesn't mean that PDFBox's behaviour has been 
corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.


[jira] [Comment Edited] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039594#comment-14039594
 ] 

John Hewson edited comment on PDFBOX-2149 at 6/21/14 12:27 AM:
---

[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs in that when a font is missing 
we're substituting it but we should be synthesising a FontDescriptor from the 
substituted font which we loaded from disk. In other words the NPE is actually 
showing us that there is a bug in PDFBox which needs a fix: and returning false 
is not going to produce the correct results. Just because you got rid of an 
exception doesn't mean that PDFBox's behaviour has been corrected: there's more 
work to be done here to synthesise the missing FontDescriptor correctly.


was (Author: jahewson):
[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs in that when a font is missing 
we're replacing it with the default font but we should be synthesising a 
FontDescriptor from the default font which we loaded from disk. In other words 
the NPE is actually showing us that there is a bug in PDFBox which needs a fix: 
and returning false is not going to produce the correct results. Just because 
you got rid of an exception doesn't mean that PDFBox's behaviour has been 
corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.


[jira] [Comment Edited] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039594#comment-14039594
 ] 

John Hewson edited comment on PDFBOX-2149 at 6/21/14 12:27 AM:
---

[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs in that when a font is missing 
and has not FontDescriptor we're substituting it but we're not synthesising a 
FontDescriptor from the substituted font which we loaded from disk. In other 
words the NPE is actually showing us that there is a bug in PDFBox which needs 
a fix: and returning false is not going to produce the correct results. Just 
because you got rid of an exception doesn't mean that PDFBox's behaviour has 
been corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.


was (Author: jahewson):
[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs in that when a font is missing 
we're substituting it but we should be synthesising a FontDescriptor from the 
substituted font which we loaded from disk. In other words the NPE is actually 
showing us that there is a bug in PDFBox which needs a fix: and returning false 
is not going to produce the correct results. Just because you got rid of an 
exception doesn't mean that PDFBox's behaviour has been corrected: there's more 
work to be done here to synthesise the missing FontDescriptor correctly.


[jira] [Comment Edited] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039594#comment-14039594
 ] 

John Hewson edited comment on PDFBOX-2149 at 6/21/14 12:29 AM:
---

[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs is that when a font is missing 
and has no FontDescriptor we're substituting it but we're not synthesising a 
FontDescriptor from the substituted font which we loaded from disk. In other 
words the NPE is actually showing us that there is a bug in PDFBox which needs 
a fix: and returning false is not going to produce the correct results. Just 
because you got rid of an exception doesn't mean that PDFBox's behaviour has 
been corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.


was (Author: jahewson):
[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs in that when a font is missing 
and has not FontDescriptor we're substituting it but we're not synthesising a 
FontDescriptor from the substituted font which we loaded from disk. In other 
words the NPE is actually showing us that there is a bug in PDFBox which needs 
a fix: and returning false is not going to produce the correct results. Just 
because you got rid of an exception doesn't mean that PDFBox's behaviour has 
been corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.

 Font Refactoring
 

 Key: PDFBOX-2149
 URL: https://issues.apache.org/jira/browse/PDFBOX-2149
 Project: PDFBox
  Issue Type: Improvement
  Components: FontBox, PDModel
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
 Attachments: 39.pdf, 000467.pdf


 To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need 
 to sort out long-standing font/text encoding issues. The main issue is that 
 encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, 
 sometimes elsewhere. For example TTFGlyph2D does its own decoding, and this 
 code is copy & pasted into PDTrueTypeFont. Likewise, PDFont handles CMaps and 
 Encodings despite the fact that these two encoding methods are mutually 
 exclusive. The end result is that the process of reading Encodings/CMaps is 
 often following rules which are completely invalid for that font type but 
 mostly work by luck.
 Phase 1
 - Refactor PDFont subclasses to remove setXXX methods which allow the object 
 to be corrupted. Proper use of inheritance can remove all cases where public 
 setXXX methods are used during font loading.
 - Clean up TTF loading and the loadTTF in anticipation of Unicode TTF 
 embedding, FontBox's TrueTypeFont class is externally mutable via setXXX 
 methods used only by TTFParser: these can be made package-private.
 - the Encoding class and EncodingManager could do with some cleaning up prior 
 to further refactoring.
 - PDSimpleFont does not do anything, its functionality should be moved into 
 its superclass, PDFont.
 - PDFont#determineEncoding() loads CMaps when only Encodings are applicable, 
 and vice versa. Loading needs to be pushed down into the appropriate 
 subclasses, as a starting point the relevant code should at least be copied 
 into the relevant subclasses ready for further refactoring.
 - TTFGlyph2D does its own decoding of char codes, rather than using the 
 font's #encode method (fair enough because #encode is broken) and there's a 
 copy and pasted version of the same code in PDTrueTypeFont - we need to 
 consolidate this code into PDTrueTypeFont where it belongs.
 Phase 2
 - Refactor loading of CMaps and Encodings from font dictionaries, this will 
 involve changes to PDFont and its subclasses to delegate loading to 
 subclasses where it can be properly encapsulated
 - May need to alter the class hierarchy w.r.t CIDFont to facilitate this, as 
 CIDFont isn't really a PDFont - its parent Type0 font is responsible for its 
 CMap. We'll

[jira] [Comment Edited] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039594#comment-14039594
 ] 

John Hewson edited comment on PDFBOX-2149 at 6/21/14 12:30 AM:
---

[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs is that when a font is missing 
and has no FontDescriptor we're substituting it but we're not synthesising a 
FontDescriptor from the substituted font which we loaded from disk. In other 
words the NPE is actually showing us that there is a bug in PDFBox which needs 
a fix: returning false is not going to produce the correct results. Just 
because it's possible to get rid of an exception doesn't mean that PDFBox's 
behaviour has been corrected: there's more work to be done here to synthesise 
the missing FontDescriptor correctly.
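
A minimal sketch of the argument above, using hypothetical stand-in classes (SymbolicFontSketch, Font, Descriptor) rather than PDFBox's real API. It contrasts defaulting to false when the descriptor is null with synthesising a descriptor first. The synthesis rule here looks only at the base font name, which is a simplifying assumption; the comment above talks about synthesising it from the substituted font loaded from disk.

```java
/** Hypothetical stand-ins for illustration only; these are not PDFBox classes. */
public class SymbolicFontSketch {
    // Font descriptor flag bits as defined in the PDF specification.
    static final int FLAG_SYMBOLIC = 1 << 2;    // bit 3
    static final int FLAG_NONSYMBOLIC = 1 << 5; // bit 6

    /** Minimal descriptor that only carries the Flags entry. */
    static final class Descriptor {
        final int flags;
        Descriptor(int flags) { this.flags = flags; }
        boolean isSymbolic() { return (flags & FLAG_SYMBOLIC) != 0; }
    }

    /** A font whose dictionary may lack a descriptor (descriptor == null). */
    static final class Font {
        final String baseName;
        final Descriptor descriptor;
        Font(String baseName, Descriptor descriptor) {
            this.baseName = baseName;
            this.descriptor = descriptor;
        }
    }

    /** The workaround under discussion: hide the null and answer false. */
    static boolean isSymbolicDefaultingToFalse(Font font) {
        return font.descriptor != null && font.descriptor.isSymbolic();
    }

    /** Assumed rule for this sketch: Symbol and ZapfDingbats are symbolic fonts. */
    static Descriptor synthesise(String baseName) {
        boolean symbolic = "Symbol".equals(baseName) || "ZapfDingbats".equals(baseName);
        return new Descriptor(symbolic ? FLAG_SYMBOLIC : FLAG_NONSYMBOLIC);
    }

    /** Fill in the missing descriptor first, then ask it. */
    static boolean isSymbolicWithSynthesis(Font font) {
        Descriptor d = font.descriptor != null ? font.descriptor : synthesise(font.baseName);
        return d.isSymbolic();
    }

    public static void main(String[] args) {
        Font symbol = new Font("Symbol", null);
        System.out.println(isSymbolicDefaultingToFalse(symbol)); // false, which is wrong for Symbol
        System.out.println(isSymbolicWithSynthesis(symbol));     // true
    }
}
```

The exception disappears in both variants, but only the second one gives the right answer for the Symbol font.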


was (Author: jahewson):
[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs is that when a font is missing 
and has no FontDescriptor we're substituting it but we're not synthesising a 
FontDescriptor from the substituted font which we loaded from disk. In other 
words the NPE is actually showing us that there is a bug in PDFBox which needs 
a fix: returning false is not going to produce the correct results. Just 
because you got rid of an exception doesn't mean that PDFBox's behaviour has 
been corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.

 Font Refactoring
 

 Key: PDFBOX-2149
 URL: https://issues.apache.org/jira/browse/PDFBOX-2149
 Project: PDFBox
  Issue Type: Improvement
  Components: FontBox, PDModel
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
 Attachments: 39.pdf, 000467.pdf


 To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need 
 to sort out long-standing font/text encoding issues. The main issue is that 
 encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, 
 sometimes elsewhere. For example TTFGlyph2D does its own decoding, and this 
 code is copy & pasted into PDTrueTypeFont. Likewise, PDFont handles CMaps and 
 Encodings despite the fact that these two encoding methods are mutually 
 exclusive. The end result is that the process of reading Encodings/CMaps is 
 often following rules which are completely invalid for that font type but 
 mostly work by luck.
 Phase 1
 - Refactor PDFont subclasses to remove setXXX methods which allow the object 
 to be corrupted. Proper use of inheritance can remove all cases where public 
 setXXX methods are used during font loading.
 - Clean up TTF loading and the loadTTF in anticipation of Unicode TTF 
 embedding, FontBox's TrueTypeFont class is externally mutable via setXXX 
 methods used only by TTFParser: these can be made package-private.
 - the Encoding class and EncodingManager could do with some cleaning up prior 
 to further refactoring.
 - PDSimpleFont does not do anything, its functionality should be moved into 
 its superclass, PDFont.
 - PDFont#determineEncoding() loads CMaps when only Encodings are applicable, 
 and vice versa. Loading needs to be pushed down into the appropriate 
 subclasses, as a starting point the relevant code should at least be copied 
 into the relevant subclasses ready for further refactoring.
 - TTFGlyph2D does its own decoding of char codes, rather than using the 
 font's #encode method (fair enough because #encode is broken) and there's a 
 copy and pasted version of the same code in PDTrueTypeFont - we need to 
 consolidate this code into PDTrueTypeFont where it belongs.
 Phase 2
 - Refactor loading of CMaps and Encodings from font dictionaries, this will 
 involve changes to PDFont and its subclasses to delegate loading to 
 subclasses where it can be properly encapsulated
 - May need to alter the class hierarchy w.r.t CIDFont to facilitate this, as 
 CIDFont isn't really a PDFont - its parent Type0 font is responsible for its 
 CMap.

[jira] [Comment Edited] (PDFBOX-2149) Font Refactoring


[ 
https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039594#comment-14039594
 ] 

John Hewson edited comment on PDFBOX-2149 at 6/21/14 12:29 AM:
---

[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs is that when a font is missing 
and has no FontDescriptor we're substituting it but we're not synthesising a 
FontDescriptor from the substituted font which we loaded from disk. In other 
words the NPE is actually showing us that there is a bug in PDFBox which needs 
a fix: returning false is not going to produce the correct results. Just 
because you got rid of an exception doesn't mean that PDFBox's behaviour has 
been corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.


was (Author: jahewson):
[~lehmi], you've missed the point, PDFBox is already equipped to handle the 
cases where the FontDescriptor is missing by substituting a synthetic 
FontDescriptor, so we shouldn't be seeing cases where the getFontDescriptor() 
returns null. It's a bug in PDFBox. Defaulting to returning false from 
isSymbolicFont() is incorrect, for example if it's the Symbol font which is 
missing a FontDescriptor - this issue is related to PDFBOX-2140 which I'm 
trying to fix. What we're seeing in these PDFs is that when a font is missing 
and has no FontDescriptor we're substituting it but we're not synthesising a 
FontDescriptor from the substituted font which we loaded from disk. In other 
words the NPE is actually showing us that there is a bug in PDFBox which needs 
a fix: and returning false is not going to produce the correct results. Just 
because you got rid of an exception doesn't mean that PDFBox's behaviour has 
been corrected: there's more work to be done here to synthesise the missing 
FontDescriptor correctly.

 Font Refactoring
 

 Key: PDFBOX-2149
 URL: https://issues.apache.org/jira/browse/PDFBOX-2149
 Project: PDFBox
  Issue Type: Improvement
  Components: FontBox, PDModel
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
 Attachments: 39.pdf, 000467.pdf


 To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need 
 to sort out long-standing font/text encoding issues. The main issue is that 
 encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, 
 sometimes elsewhere. For example TTFGlyph2D does its own decoding, and this 
 code is copy & pasted into PDTrueTypeFont. Likewise, PDFont handles CMaps and 
 Encodings despite the fact that these two encoding methods are mutually 
 exclusive. The end result is that the process of reading Encodings/CMaps is 
 often following rules which are completely invalid for that font type but 
 mostly work by luck.
 Phase 1
 - Refactor PDFont subclasses to remove setXXX methods which allow the object 
 to be corrupted. Proper use of inheritance can remove all cases where public 
 setXXX methods are used during font loading.
 - Clean up TTF loading and the loadTTF in anticipation of Unicode TTF 
 embedding, FontBox's TrueTypeFont class is externally mutable via setXXX 
 methods used only by TTFParser: these can be made package-private.
 - the Encoding class and EncodingManager could do with some cleaning up prior 
 to further refactoring.
 - PDSimpleFont does not do anything, its functionality should be moved into 
 its superclass, PDFont.
 - PDFont#determineEncoding() loads CMaps when only Encodings are applicable, 
 and vice versa. Loading needs to be pushed down into the appropriate 
 subclasses, as a starting point the relevant code should at least be copied 
 into the relevant subclasses ready for further refactoring.
 - TTFGlyph2D does its own decoding of char codes, rather than using the 
 font's #encode method (fair enough because #encode is broken) and there's a 
 copy and pasted version of the same code in PDTrueTypeFont - we need to 
 consolidate this code into PDTrueTypeFont where it belongs.
 Phase 2
 - Refactor loading of CMaps and Encodings from font dictionaries, this will 
 involve changes to PDFont and its subclasses to delegate loading to 
 subclasses where it can be properly encapsulated
 - May need to alter the class hierarchy w.r.t CIDFont to facilitate this, as 
 CIDFont isn't really a PDFont - its parent Type0 font is responsible for its 
 CMap. We'll see.
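
 The direction described in the two phases can be sketched roughly as follows. All class names (FontHierarchySketch, SketchFont, SimpleSketchFont, CompositeSketchFont) are hypothetical and are not the PDFBox hierarchy; the sketch only assumes that simple fonts decode through an Encoding and composite fonts through a CMap, so each subclass receives its own table in the constructor and no public setXXX methods remain.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical hierarchy for illustration only; not the PDFBox classes. */
public class FontHierarchySketch {

    /** Base class: no setters, so a loaded font cannot be corrupted later. */
    abstract static class SketchFont {
        private final String name;
        protected SketchFont(String name) { this.name = name; }
        String getName() { return name; }
        /** Each subclass decodes with the mechanism that is valid for it. */
        abstract String toUnicode(int code);
    }

    /** Simple font: only an Encoding table applies. */
    static final class SimpleSketchFont extends SketchFont {
        private final Map<Integer, String> encoding;
        SimpleSketchFont(String name, Map<Integer, String> encoding) {
            super(name);
            this.encoding = Collections.unmodifiableMap(new HashMap<Integer, String>(encoding));
        }
        @Override
        String toUnicode(int code) { return encoding.get(code); }
    }

    /** Composite font: only a CMap (code to CID) applies. */
    static final class CompositeSketchFont extends SketchFont {
        private final Map<Integer, Integer> cmap;
        private final Map<Integer, String> cidToUnicode;
        CompositeSketchFont(String name, Map<Integer, Integer> cmap,
                            Map<Integer, String> cidToUnicode) {
            super(name);
            this.cmap = Collections.unmodifiableMap(new HashMap<Integer, Integer>(cmap));
            this.cidToUnicode = Collections.unmodifiableMap(new HashMap<Integer, String>(cidToUnicode));
        }
        @Override
        String toUnicode(int code) {
            Integer cid = cmap.get(code);
            return cid == null ? null : cidToUnicode.get(cid);
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> enc = new HashMap<Integer, String>();
        enc.put(65, "A");
        SketchFont simple = new SimpleSketchFont("SimpleDemo", enc);

        Map<Integer, Integer> cmap = new HashMap<Integer, Integer>();
        cmap.put(0x0101, 7);
        Map<Integer, String> uni = new HashMap<Integer, String>();
        uni.put(7, "\u00e9");
        SketchFont composite = new CompositeSketchFont("CompositeDemo", cmap, uni);

        System.out.println(simple.toUnicode(65));
        System.out.println(composite.toUnicode(0x0101));
    }
}
```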

[jira] [Closed] (PDFBOX-2094) Add PrintRequestAttributeSet parameter to silentPrint()


 [ 
https://issues.apache.org/jira/browse/PDFBOX-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson closed PDFBOX-2094.
---

Resolution: Fixed

Yes, I see print() was fixed but not silentPrint(), I've now fixed this in 
r1604305.

 Add PrintRequestAttributeSet parameter to silentPrint()
 ---

 Key: PDFBOX-2094
 URL: https://issues.apache.org/jira/browse/PDFBOX-2094
 Project: PDFBox
  Issue Type: Improvement
  Components: PDModel
Affects Versions: 2.0.0
Reporter: senthuran
Assignee: John Hewson
Priority: Minor
 Fix For: 2.0.0


 The current implementation does not allow us to set the printer and paper 
 attributes. Could you please extend silentPrint() to accept a 
 PrintRequestAttributeSet as a parameter? Affected versions: from 
 pdfbox-app-2.0.0-20140506.050443-277jar to 
 pdfbox-app-2.0.0-20140506.050443-301jar.




Re: Travis CI

2014-06-20 Thread John Hewson

On 20 Jun 2014, at 01:24, Andreas Lehmkuehler andr...@lehmi.de wrote:

 Hi,
 
 On 19.06.2014 22:03, John Hewson wrote:
 Hi All
 
 The recent instability of Jenkins prompted me to set up Travis CI to build 
 the PDFBox mirror on GitHub. Automatic builds are triggered after every 
 commit, and they can often run much faster than on the busy Jenkins server, 
 so this gives committers an additional means to quickly determine if their 
 build has problems or not.
 Good idea.
 
 The builds are public at: https://travis-ci.org/apache/pdfbox
 
 The Jenkins build is still the “ground truth” and passing that is what 
 counts, it *might* be possible to pass Travis CI and still fail on Jenkins, 
 so that’s something to keep in mind.
 Especially as the Travis build uses oraclejdk7 as compiler. PDFBox has Java 6 
 as minimum requirement and that configuration may hide incompatibilities 
 because of the chosen Java version.

Good point - I’ve added OpenJDK 6 to the Travis CI build now.

 -- John
 
 BR
 Andreas Lehmkühler
 

— John

[jira] [Commented] (PDFBOX-2153) Setting the correct clipping path for shading


[ 
https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039610#comment-14039610
 ] 

John Hewson commented on PDFBOX-2153:
-

Yep, this fix makes sense because:

{code}
graphics.fill(getGraphicsState().getCurrentClippingPath());
{code}

Fills a shape, which happens to be the current clipping path, but like any 
paint operation it's subject to the current clipping path of the Graphics2D 
which we haven't updated, so it's stale.
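
The effect can be reproduced outside PDFBox with plain Java2D. The following is a minimal sketch, not the PageDrawer code: the class name StaleClipDemo, the rectangle coordinates and the colors are all made up for the demonstration. A clip left over from an earlier operation hides a fill of the "current" shape unless the Graphics2D clip is refreshed first.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

/** Standalone Java2D demonstration; not PDFBox code. */
public class StaleClipDemo {

    /** Renders once and returns the RGB value of the pixel at (70, 70). */
    static int render(boolean refreshClip) {
        BufferedImage img = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = img.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, 100, 100);

        // Clip left over from an earlier drawing operation (the stale one).
        g.setClip(new Rectangle2D.Double(0, 0, 20, 20));

        // The clipping path that is current according to the graphics state.
        Rectangle2D current = new Rectangle2D.Double(50, 50, 40, 40);
        if (refreshClip) {
            g.setClip(current);
        }

        // Like any paint operation, fill() is limited by the Graphics2D clip.
        g.setColor(Color.RED);
        g.fill(current);
        g.dispose();
        return img.getRGB(70, 70) & 0xFFFFFF;
    }

    public static void main(String[] args) {
        System.out.printf("stale clip:     %06X%n", render(false)); // FFFFFF, nothing drawn
        System.out.printf("refreshed clip: %06X%n", render(true));  // FF0000
    }
}
```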

---

One nitpick: can we leave out comments like:

{code}
graphics.setClip(null); // PDFBOX-2153 don't use obsolete clipping path
{code}

because that's what _svn blame_ is for :)

 Setting the correct clipping path for shading
 -

 Key: PDFBOX-2153
 URL: https://issues.apache.org/jira/browse/PDFBOX-2153
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 2.0.0
Reporter: Tilman Hausherr
  Labels: shading, shadingpattern

 While doing tests with the file eci_altona-test-suite-v2_technical_H.pdf 
 (uncompressed) of PDFBOX-1915 I noticed that by removing a W (modifies the 
 clipping region) operator of a type 7 shading I got a lot more correct 
 shadings (type 6 and lower). It looked like PDFBox had been using the 
 clipping of the type 7 when drawing the type 6, which is just a rectangle 
 above in that rendering. This resulted in a blank.
 By adding 
 {code}
 graphics.setClip(getGraphicsState().getCurrentClippingPath());
 {code}
 in PageDrawer.shfill() just before the graphics.fill() I get several files to 
 render correctly that I hadn't before.
 (Setting null will probably do the same, didn't test that yet).
 The following PDFs are rendered correctly with the change:
 McAfee-ShadingType7.pdf
 eci_altona-test-suite-v2_technical_H.pdf
 crestron-p9.pdf  (these three found in PDFBOX-1915)
 PDFBOX-1451.pdf (alfresco)
 PDFBOX-1940.pdf (chart)
 PDFBOX-1861-tracemonkey.pdf p.11
 Not solved by the change:
 PDFBOX-2098-asyTUG.pdf p.6  (this one doesn't use shfill)
 PDFBOX-1861-tracemonkey.pdf p.6 (not shading)
 PDFBOX-1416.pdf (not shading)
 texample-rgb-triangle.pdf (John has an explanation about that one)
 WDYT? Is there any reason NOT to set the clipping path in PageDrawer.shFill() 
 ?




Re: PDFBox and XMP - retire jempbox

2014-06-20 Thread John Hewson

+ 1

-- John

On 19 Jun 2014, at 23:05, Maruan Sahyoun sahy...@fileaffairs.de wrote:

 Hi,
 
 we currently have two libraries handling XMP metadata: jempbox and xmpbox.
 
 Part of PDFBOX-1187/PDFBOX-2197 was to remove the direct dependency on 
 jempbox, as XMP metadata can now be generated by any library and added as a 
 stream. This will be available for PDFBox 2.0.0.
 
 I would like to propose to now retire jempbox as xmpbox
 
 # is closer to the spec (naming conventions)
 # is used for PDF/A validation, where we cannot remove a dependency on XMP 
 handling because checking metadata is necessary for PDF/A compliance. 
 
 If there is functionality in jempbox that is missing from xmpbox, it could 
 be added at a later stage upon request.
 
 WDYT? 
 
 BR
 Maruan

[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes

2014-06-20 Thread Shaola Ren (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039614#comment-14039614
 ] 

Shaola Ren commented on PDFBOX-1915:


Thanks.
Re your last comment, you are absolutely right, hahaha... I thought you might 
point this out. I changed level = 4 to level = 3 to get faster runs on the 
other test cases, especially lamp_cairo and mcafeeU5, while checking that 
nothing is wrong with the code, and I know that when I change the level back 
to 4 everything will look the same as before. In this update, the code for 
shading type 6 changed considerably more than the code for shading type 7; 
for type 7 the level parameter could already more or less be edited at the 
beginning.

 Implement shading with Coons and tensor-product patch meshes
 

 Key: PDFBOX-1915
 URL: https://issues.apache.org/jira/browse/PDFBOX-1915
 Project: PDFBox
  Issue Type: Improvement
  Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 2.0.0
Reporter: Tilman Hausherr
Assignee: Shaola Ren
  Labels: graphical, gsoc2014, java, math, shading
 Fix For: 2.0.0

 Attachments: CONICAL.pdf, GWG060_Shading_x1a.pdf, 
 GWG060_Shading_x1a_1.png, HSBWHEEL.pdf, McAfee-ShadingType7.pdf, 
 Shadingtype6week1.pdf, TENSOR.pdf, XYZsweep.pdf, 
 _gwg060_shading_x1a.pdf-1.png, _mcafee-shadingtype7.pdf-1.png, 
 asy-coons-but-really-tensor.pdf, asy-tensor-rainbow.pdf, asy-tensor.pdf, 
 coons-function.pdf, coons-function.ps, coons-nofunction-CMYK.pdf, 
 coons-nofunction-CMYK.ps, coons-nofunction-Duotone.pdf, 
 coons-nofunction-Duotone.ps, coons-nofunction-Gray.pdf, 
 coons-nofunction-Gray.ps, coons-nofunction-RGB.pdf, coons-nofunction-RGB.ps, 
 coons2-function.pdf, coons2-function.ps, coons4-function.ps, crestron-p9.pdf, 
 eci_altona-test-suite-v2_technical_H.pdf, failedTest.rar, lamp_cairo.pdf, 
 lamp_cairo7_0.png, lamp_cairo7_1.png, lamp_cairo7_1.png, 
 lineRasterization.jpg, mcafeeU5.pdf, mcafeeU5_1.png, mcafeeu5.pdf-1.png, 
 pass4FlagTest.rar, patchCases.jpg, patchMap.jpg, shading6ContourTest.rar, 
 shading6Done.rar, shading7.rar, tensor-nofunction-RGB.pdf, 
 tensor-nofunction-RGB.ps, tensor-nofunction-RGB_1.png, 
 tensor4-nofunction.pdf, tensor4-nofunction.ps, tensor4-nofunction_1.png, 
 updateshading6ContourTest.rar


 Of the seven shading methods described in the PDF specification, type 6 
 (Coons patch meshes) and type 7 (Tensor-product patch meshes) haven't been 
 implemented. I have done type 1, 4 and 5, but I don't know the math for type 
 6 and 7. My math days are decades away.
 Knowledge prerequisites: 
 - java, although you don't have to be a java ace, just feel comfortable
 - math: you should know what cubic Bézier curves, Degenerate Bézier 
 curves, bilinear interpolation, tensor-product, affine transform 
 matrix and Bernstein polynomials are, or be able to learn it
 - maven (basic)
 - svn (basic)
 - an IDE like Netbeans or Eclipse or IntelliJ (basic)
 - ideally, you are either a math student who likes to program, or a computer 
 science student who is specializing in graphics.
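
 As a small illustration of two of the math prerequisites listed above, here is a sketch that evaluates one coordinate of a cubic Bézier curve from its four control values using the cubic Bernstein polynomials. The class and method names are made up for this example and do not come from the PDFBox code base.

```java
/** Illustrative only; class and method names are not from PDFBox. */
public class CubicBezierSketch {

    /**
     * Evaluates one coordinate of a cubic Bezier curve at parameter t in [0, 1].
     * The four weights are the cubic Bernstein polynomials:
     * (1-t)^3, 3(1-t)^2 t, 3(1-t) t^2 and t^3.
     */
    static double bezier(double p0, double p1, double p2, double p3, double t) {
        double s = 1.0 - t;
        return s * s * s * p0
                + 3 * s * s * t * p1
                + 3 * s * t * t * p2
                + t * t * t * p3;
    }

    public static void main(String[] args) {
        // Control points (0,0), (0,1), (1,1), (1,0); x and y are evaluated separately.
        for (int i = 0; i <= 4; i++) {
            double t = i / 4.0;
            double x = bezier(0, 0, 1, 1, t);
            double y = bezier(0, 1, 1, 0, t);
            System.out.println("t=" + t + " -> (" + x + ", " + y + ")");
        }
    }
}
```

 The curve starts at the first control point (t = 0) and ends at the last one (t = 1); the two inner control points only pull the curve towards them.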
 A first look at PDFBOX: try the command utility here:
 https://pdfbox.apache.org/commandline/#pdfToImage
 and use your favorite PDF, or the PDFs mentioned in PDFBOX-615, these have 
 the shading types that are already implemented.
 Some simple source code to convert to images:
 String filename = "blah.pdf";
 PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
 List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
 int page = 0;
 for (PDPage pdPage : pdPages)
 {
     ++page;
     BufferedImage bim = RenderUtil.convertToImage(pdPage, 
         BufferedImage.TYPE_BYTE_BINARY, 300);
     ImageIO.write(bim, "png", new File(filename + page + ".png"));
 }
 document.close();
 You are not starting from scratch. The implementation of type 4 and 5 shows 
 you how to read parameters from the PDF and set the graphics. You don't have 
 to learn the complete PDF spec, only 15 pages related to the two shading 
 types, and 6 pages about shading in general. The PDF specification is here:
 http://www.adobe.com/devnet/pdf/pdf_reference.html
 The tricky parts are:
 - decide whether a point(x,y) is inside or outside a patch
 - decide the color of a point within the patch
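
 For the second item, one possible starting point is bilinear interpolation of the four corner colors over the patch's unit square. The sketch below assumes the point has already been mapped to parameters (u, v) in [0, 1] x [0, 1], which is the harder part and is not shown; the class name, method name and corner naming are hypothetical.

```java
/** Illustrative only; names and corner ordering are assumptions of this sketch. */
public class BilinearColorSketch {

    /**
     * Bilinearly interpolates four corner colors.
     * c00 is the color at (u,v) = (0,0), c10 at (1,0), c01 at (0,1), c11 at (1,1).
     * All arrays must have the same number of color components.
     */
    static float[] interpolate(float[] c00, float[] c10, float[] c01, float[] c11,
                               double u, double v) {
        float[] result = new float[c00.length];
        for (int i = 0; i < result.length; i++) {
            double bottom = (1 - u) * c00[i] + u * c10[i]; // along the edge v = 0
            double top = (1 - u) * c01[i] + u * c11[i];    // along the edge v = 1
            result[i] = (float) ((1 - v) * bottom + v * top);
        }
        return result;
    }

    public static void main(String[] args) {
        float[] red = {1, 0, 0};
        float[] green = {0, 1, 0};
        float[] blue = {0, 0, 1};
        float[] white = {1, 1, 1};
        float[] center = interpolate(red, green, blue, white, 0.5, 0.5);
        System.out.println(center[0] + " " + center[1] + " " + center[2]); // 0.5 0.5 0.5
    }
}
```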
 To get an idea about the code, look at the classes GouraudTriangle, 
 GouraudShadingContext, Type4ShadingContext and Vertex here
 https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/shading/
 or download the whole project from the repository.
 https://pdfbox.apache.org/downloads.html#scm
 If you want to see the existing code in the debugger with a Gouraud shading, 
 try this file:
 http://asymptote.sourceforge.net/gallery/Gouraud.pdf
 Testing:
 I

[jira] [Commented] (PDFBOX-2153) Setting the correct clipping path for shading