[jira] [Commented] (PDFBOX-678) Support missing Text Rendering Modes when rendering a PDF

2014-07-04 Thread Petr Slaby (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052274#comment-14052274
 ] 

Petr Slaby commented on PDFBOX-678:
---

Implementing this seems to be fairly easy in current trunk (with the exception 
of Type3 fonts), see the attached patch. Why not do it?

> Support missing Text Rendering Modes when rendering a PDF
> -
>
> Key: PDFBOX-678
> URL: https://issues.apache.org/jira/browse/PDFBOX-678
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Reporter: Maruan Sahyoun
> Attachments: Java Printing.pdf, TextRenderingModes.java.patch
>
>
> Of the 8 different Text Rendering Modes (0 to 7) only mode 0 (Fill Text) is 
> correctly implemented. Mode 1 (Stroke Text) falls back to Mode 0 and the 
> others are not implemented. I'm looking to implement the missing modes (at 
> least some of them).
> Before doing so I'm proposing a structural change to where rendering really 
> occurs. Currently it's done within the PDxxxFont classes. I'd rather 
> implement the (AWT) text output in PageDrawer (or helper classes within the 
> same package) and use the font classes to return an AWT font by adding a 
> getAwtFont method. Doing so we get a better separation between the PDF 
> related stuff (PDxxx) and applications like PageDrawer. The current rendering 
> specific code within the PDxxxFont classes can be retained for compatibility 
> and marked deprecated at a later stage.
> WDYT?
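
To make the proposal concrete, here is a minimal sketch of how the text rendering modes (numbered 0 to 7 in the PDF specification) could be mapped onto fill, stroke and clip operations on an AWT glyph outline. This is not the attached patch: the class name, the enum and the render helper are hypothetical, and Type3 fonts are left out.

```java
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.Shape;
import java.awt.geom.Area;
import java.awt.image.BufferedImage;

// Hypothetical sketch, not PDFBox code: one flag triple per text rendering mode.
class TextRenderingModesSketch {

    enum Mode {
        FILL(0, true, false, false),
        STROKE(1, false, true, false),
        FILL_STROKE(2, true, true, false),
        INVISIBLE(3, false, false, false),
        FILL_CLIP(4, true, false, true),
        STROKE_CLIP(5, false, true, true),
        FILL_STROKE_CLIP(6, true, true, true),
        CLIP(7, false, false, true);

        final int code;
        final boolean fill;
        final boolean stroke;
        final boolean clip;

        Mode(int code, boolean fill, boolean stroke, boolean clip) {
            this.code = code;
            this.fill = fill;
            this.stroke = stroke;
            this.clip = clip;
        }

        // Look up a mode by the integer operand of the Tr operator.
        static Mode fromCode(int code) {
            for (Mode m : values()) {
                if (m.code == code) {
                    return m;
                }
            }
            throw new IllegalArgumentException("Unknown text rendering mode: " + code);
        }
    }

    // Draws one glyph outline according to the mode; clip modes only collect
    // the outline in textClip, which the caller would apply after the text object.
    static void render(Graphics2D g, Shape glyphOutline, Mode mode, Area textClip) {
        if (mode.fill) {
            g.fill(glyphOutline);
        }
        if (mode.stroke) {
            g.draw(glyphOutline);
        }
        if (mode.clip) {
            textClip.add(new Area(glyphOutline));
        }
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(20, 20, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        Area textClip = new Area();
        // A rectangle stands in for a real glyph outline.
        render(g, new Rectangle(2, 2, 10, 10), Mode.fromCode(6), textClip);
        g.dispose();
        System.out.println("clip collected: " + !textClip.isEmpty());
    }
}
```

The point of the sketch is only that, once the font classes hand back an outline, the drawer can decide per mode what to do with it.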



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (PDFBOX-678) Support missing Text Rendering Modes when rendering a PDF

2014-07-04 Thread Petr Slaby (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Petr Slaby updated PDFBOX-678:
--

Attachment: TextRenderingModes.java.patch



[jira] [Created] (PDFBOX-2185) Rotation and skew not applied on rectangles

2014-07-04 Thread Petr Slaby (JIRA)
Petr Slaby created PDFBOX-2185:
--

 Summary: Rotation and skew not applied on rectangles
 Key: PDFBOX-2185
 URL: https://issues.apache.org/jira/browse/PDFBOX-2185
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
Reporter: Petr Slaby


When rendering the attached example, rotation and skew of rectangles are not 
applied properly. The reason is that AppendRectangleToPath transforms only the 
start and end points and builds a non-rotated, non-skewed rectangle from them. 
Instead, each corner of the rectangle has to be transformed separately, as 
shown in the attached patch.
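
The idea of transforming every corner can be sketched with plain AWT classes. This is not the attached patch and not the PDFBox operator code; the class and method names are hypothetical and the transform is passed in explicitly.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Path2D;
import java.awt.geom.Point2D;

// Hypothetical sketch: build a rectangle path from four transformed corners.
class RectangleTransformSketch {

    static Path2D.Double transformedRectangle(AffineTransform ctm,
                                              double x, double y, double w, double h) {
        // Transform each corner on its own so that rotation and skew survive.
        Point2D p0 = ctm.transform(new Point2D.Double(x, y), null);
        Point2D p1 = ctm.transform(new Point2D.Double(x + w, y), null);
        Point2D p2 = ctm.transform(new Point2D.Double(x + w, y + h), null);
        Point2D p3 = ctm.transform(new Point2D.Double(x, y + h), null);

        Path2D.Double path = new Path2D.Double();
        path.moveTo(p0.getX(), p0.getY());
        path.lineTo(p1.getX(), p1.getY());
        path.lineTo(p2.getX(), p2.getY());
        path.lineTo(p3.getX(), p3.getY());
        path.closePath();
        return path;
    }

    public static void main(String[] args) {
        // A 10 x 10 square rotated by 45 degrees becomes a diamond.
        AffineTransform rotation = AffineTransform.getRotateInstance(Math.PI / 4);
        Path2D.Double path = transformedRectangle(rotation, 0, 0, 10, 10);
        System.out.println(path.getBounds2D());
    }
}
```

A rectangle rebuilt from only two transformed points is always axis-aligned, which is why the rotation and skew get lost.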





[jira] [Updated] (PDFBOX-2185) Rotation and skew not applied on rectangles

2014-07-04 Thread Petr Slaby (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Petr Slaby updated PDFBOX-2185:
---

Attachment: AppendRectangleToPath.java.patch
example_013.pdf



[jira] [Created] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2186:
---

 Summary: java.io.IOException: Catalog cannot be found
 Key: PDFBOX-2186
 URL: https://issues.apache.org/jira/browse/PDFBOX-2186
 Project: PDFBox
  Issue Type: Bug
  Components: Parsing
Affects Versions: 1.8.6, 1.8.7, 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
 Fix For: 1.8.7, 2.0.0


I get this with the attached file:
{code}
Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.PDFParser parseXrefTable
Warnung: invalid xref line: 00 65535f
Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.PDFParser parseXrefTable
Warnung: Count in xref table is 0 at offset 334372
Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.NonSequentialPDFParser initialParse
Warnung: Expected trailer object at position 334373, keep trying
Exception in thread "main" java.io.IOException: Catalog cannot be found
    at org.apache.pdfbox.cos.COSDocument.getCatalog(COSDocument.java:522)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:482)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:757)
    at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1157)
    at org.apache.pdfbox.tools.PDFToImage.main(PDFToImage.java:197)
    at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:89)
{code}

The cause is a TAB in an xref line. The solution is to split on the \s regex 
(any whitespace) instead of on a space only.

I'm not touching the preflight parser (which has the same line of code) 
because I assume that it should not be lenient.
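
The difference between the two ways of splitting can be shown in isolation. This is a minimal sketch, not the parser code from the fix: the class name, the method names and the sample xref entry (with a TAB after the offset) are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch: splitting an xref entry on a space vs. on any whitespace.
class XrefSplitSketch {

    // Strict variant: only a single space counts as a separator.
    static String[] splitOnSpace(String line) {
        return line.split(" ");
    }

    // Lenient variant: any run of whitespace (space, TAB, ...) separates fields.
    static String[] splitOnWhitespace(String line) {
        return line.trim().split("\\s+");
    }

    public static void main(String[] args) {
        // Made-up entry: offset, TAB, generation, space, type.
        String entry = "0000000000\t65535 f";
        System.out.println(Arrays.toString(splitOnSpace(entry)));
        System.out.println(Arrays.toString(splitOnWhitespace(entry)));
    }
}
```

With the strict split, the offset and the generation number stay glued together in one field, so a parser that expects three fields has to reject the line.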





[jira] [Updated] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2186:


Attachment: PDFBOX-2186.pdf



[jira] [Updated] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2186:


Description: 
I get this with the attached file:
{code}
Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.PDFParser parseXrefTable
Warnung: invalid xref line: 00 65535f
Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.PDFParser parseXrefTable
Warnung: Count in xref table is 0 at offset 334372
Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.NonSequentialPDFParser initialParse
Warnung: Expected trailer object at position 334373, keep trying
Exception in thread "main" java.io.IOException: Catalog cannot be found
    at org.apache.pdfbox.cos.COSDocument.getCatalog(COSDocument.java:522)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:482)
    at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:757)
    at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1157)
    at org.apache.pdfbox.tools.PDFToImage.main(PDFToImage.java:197)
    at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:89)
{code}

The cause is a TAB in an xref line. The solution is to split on the \s regex 
(any whitespace) instead of on a space only.

I'm not touching the preflight parser (which has the same line of code) 
because I assume that it should not be lenient.

Source of the file:
http://digitalcorpora.org/corp/nps/files/govdocs1/zipfiles/002.zip  file 959



[jira] [Resolved] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-2186.
-

Resolution: Fixed

Fixed in rev 1607883 for the trunk and rev 1607884 for the 1.8 branch.



Re: Regression Testing

2014-07-04 Thread Tilman Hausherr
Of course I agree with the need for regression tests, however it isn't 
easy: besides the problems of the different JDKs (I use JDK7 Windows 64 
bit), there is the problem that some enhancements create slight changes 
in rendering that are not errors, i.e. both the "before" and the "after" 
files look OK by themselves. This has happened when we changed the text 
rendering recently, and has happened again when the clipping was 
improved. The cause is probably slight changes in color or in boundaries.
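
One way to keep such harmless differences from failing a test is to compare renderings with a per-channel tolerance rather than byte for byte. The following is only a sketch under that assumption; the class name, the method names and the tolerance values are hypothetical and this is not the TestPDFToImage code.

```java
import java.awt.image.BufferedImage;

// Hypothetical sketch: count pixels that differ by more than a tolerance.
class RenderingDiffSketch {

    static int countDifferingPixels(BufferedImage expected, BufferedImage actual, int tolerance) {
        if (expected.getWidth() != actual.getWidth()
                || expected.getHeight() != actual.getHeight()) {
            throw new IllegalArgumentException("Image sizes differ");
        }
        int differing = 0;
        for (int y = 0; y < expected.getHeight(); y++) {
            for (int x = 0; x < expected.getWidth(); x++) {
                if (!closeEnough(expected.getRGB(x, y), actual.getRGB(x, y), tolerance)) {
                    differing++;
                }
            }
        }
        return differing;
    }

    // Compares the blue, green, red and alpha channels of two ARGB values.
    private static boolean closeEnough(int argb1, int argb2, int tolerance) {
        for (int shift = 0; shift <= 24; shift += 8) {
            int c1 = (argb1 >>> shift) & 0xFF;
            int c2 = (argb2 >>> shift) & 0xFF;
            if (Math.abs(c1 - c2) > tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BufferedImage before = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        BufferedImage after = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        before.setRGB(0, 0, 0x101010);
        after.setRGB(0, 0, 0x121212); // slight color shift in one pixel
        System.out.println("strict:  " + countDifferingPixels(before, after, 0));
        System.out.println("lenient: " + countDifferingPixels(before, after, 3));
    }
}
```

A tolerance does not help with shifted boundaries, so a real test would probably also need a limit on how many pixels may differ.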


Copyright is a problem: I'm testing mostly with JIRA attachments that 
I've downloaded over the years. While uploading such files to JIRA might 
count as fair use, I doubt that this would still be true if they are 
included in a distribution. Instead, they should be stored somewhere on 
Apache servers where only committers and build software ("Travis", 
"Jenkins", ...) can access them. The public PDFs that Maruan mentions 
probably don't have all the problem cases that we solved before. However 
I have started working with these files and there are at least 5 recent 
issues that deal with them.


I'm using an improved version of the TestPDFToImage class and I will 
commit it within a few days, but I must clean it up first.


Re preflight: the default mode should be to have the Isartor tests on. 
Individuals could still disable them locally, but the central build 
software should always use them.


Tilman


Am 04.07.2014 08:43, schrieb Maruan Sahyoun:

Hi John,

thanks for bringing this up. This is a very important topic which was also 
discussed at the PDFDays in Germany.

  # Tests #
In addition to rendering we shall be covering metadata and text extraction as 
well as PDF/A validation.

# Testfiles #
Recently there were a number of test sets made available which we can use. 
http://digitalcorpora.org/corpora/files , 
https://github.com/openplanets/format-corpus/tree/master/pdfCabinetOfHorrors …
For PDF/A validation there is the Isartor test suite 
http://www.pdfa.org/2011/08/download-isartor-test-suite/. Some restrictions 
apply there.
In addition we can put additional files into our own repository as you 
suggested.
So there is no shortage of test files.

TIKA-1300/TIKA-1302 has a discussion around the same topic together with some 
development for an infrastructure (VM, Jenkins …). IMHO we should join forces 
with them.

BR

Maruan


Am 04.07.2014 um 02:16 schrieb John Hewson :


Hi All

I’ve been thinking about regression testing recently and how we can improve
our tests for rendering. There are currently two problems:

1) Different JDKs produce slightly different renderings (see PDFBOX-1843).
(I suspect that AWT fonts are a big part of this, so the problem might get
a lot better soon once we render all fonts ourselves).

2) Most PDF test files we have are not under an Apache-friendly license, so
we can’t put the test files into the trunk SVN.

It seems that some of you have your own collections of test PDF files which
you are running regression tests on: that’s great but it would be much better
if we had a central repository of test files and sample renderings.

I’d like to suggest the following solutions to the above issues:

1) We should choose a “blessed” JDK which will be used to perform the
renderings; this should be whatever is a convenient and sensible default for
committers. (My preference would be for Oracle’s JDK 7 because JDK 6 is
deprecated and has known rendering bugs). We should make sure that Jenkins
runs tests using the “blessed” JDK.

   The regression test can then check to see if it is running on the
   “blessed” JDK and if not then the tests can be skipped and we can warn
   the user.

2) We should create a new “regression” branch in SVN which contains only PDF
files for testing and PNG images which contain known-good renderings created
using the “blessed” JDK. This branch would not be part of the source of
PDFBox but will still allow us to version control the test PDFs (it also
simplifies the workflow for adding new test PDFs and new known-good
renderings: simply do an “svn add”).

As far as copyright and licensing are concerned we can put any PDF files
which are available publicly on the web into this branch without too much
worry.
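
The “blessed” JDK check from point 1 could look roughly like the sketch below. The class name, the method name and the matching rules are assumptions made for illustration; in particular, the vendor and version prefixes are a rough heuristic, since other builds may report similar strings.

```java
// Hypothetical sketch: skip rendering regression tests unless on the agreed JDK.
class BlessedJdkCheckSketch {

    // Assumed definition of "blessed": an Oracle vendor string and a 1.7 version.
    static boolean isBlessedJdk(String vendor, String version) {
        return vendor != null && version != null
                && vendor.startsWith("Oracle")
                && version.startsWith("1.7");
    }

    public static void main(String[] args) {
        String vendor = System.getProperty("java.vendor");
        String version = System.getProperty("java.version");
        if (isBlessedJdk(vendor, version)) {
            System.out.println("Running rendering regression tests");
        } else {
            // Warn instead of failing, as suggested in the proposal.
            System.err.println("WARNING: skipping rendering regression tests on "
                    + vendor + " " + version);
        }
    }
}
```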

What does everybody think?

-- John







[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes

2014-07-04 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052537#comment-14052537
 ] 

Tilman Hausherr commented on PDFBOX-1915:
-

Speed is not the most important thing. On my phone, there is a second PDF 
viewer that is faster than Adobe, but its Coons patches do not look "smooth"; 
they look rather like the superhero "The Thing", and I doubt that Mr. Coons 
would have liked that.

Type 6 and 7 shadings are extremely hard to find in the real world. About 95% 
of test files are just that - test files, created to test the feature. The only 
real world examples here are the crestron file and the mcafee files. The real 
world McAfee files are rendered in a decent amount of time, the crestron file 
is very slow but this is due to type 2 shadings.

[~xinshu]:
We shouldn't spend too much time on optimizing 6 and 7. Have a look at it, but 
don't get stuck in that problem. Optimizing 2 and 3 is more important, because 
these shadings occur much more often in the real world. 

If you haven't already, please have a look at the profiler of Netbeans (or 
watch a few videos on youtube). You can start an application in profiler mode, 
and it will show you (depending on the settings) where the time is spent. You 
right click on the left pane and choose "profile", or use the icon on the 
toolbar. Then choose "CPU" and try different settings and see what happens. 
There is no fixed rule on how to use the profiler, but one is to analyse 
slowness at locations you don't expect. Note that the software will be much 
slower if the profiler is enabled.


> Implement shading with Coons and tensor-product patch meshes
> 
>
> Key: PDFBOX-1915
> URL: https://issues.apache.org/jira/browse/PDFBOX-1915
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 1.8.5, 1.8.6, 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Shaola Ren
>  Labels: graphical, gsoc2014, java, math, shading
> Fix For: 2.0.0
>
> Attachments: CIB-coons-vs-tensormesh.pdf, CIB-coonsmesh.pdf, 
> CONICAL.pdf, GWG060_Shading_x1a.pdf, GWG060_Shading_x1a_1.png, HSBWHEEL.pdf, 
> McAfee-ShadingType7.pdf, Shadingtype6week1.pdf, TENSOR.pdf, XYZsweep.pdf, 
> _gwg060_shading_x1a.pdf-1.png, _mcafee-shadingtype7.pdf-1.png, 
> asy-coons-but-really-tensor.pdf, asy-tensor-rainbow.pdf, asy-tensor.pdf, 
> coons-function.pdf, coons-function.ps, coons-nofunction-CMYK.pdf, 
> coons-nofunction-CMYK.ps, coons-nofunction-Duotone.pdf, 
> coons-nofunction-Duotone.ps, coons-nofunction-Gray.pdf, 
> coons-nofunction-Gray.ps, coons-nofunction-RGB.pdf, coons-nofunction-RGB.ps, 
> coons2-function.pdf, coons2-function.ps, coons4-function.ps, crestron-p9.pdf, 
> eci_altona-test-suite-v2_technical_H.pdf, example_030.pdf, failedTest.rar, 
> lamp_cairo.pdf, lamp_cairo7_0.png, lamp_cairo7_1.png, lamp_cairo7_1.png, 
> lineRasterization.jpg, mcafeeU5.pdf, mcafeeU5_1.png, mcafeeu5.pdf-1.png, 
> pass4FlagTest.rar, patchCases.jpg, patchMap.jpg, shading6ContourTest.rar, 
> shading6Done.rar, shading7.rar, tensor-nofunction-RGB.pdf, 
> tensor-nofunction-RGB.ps, tensor-nofunction-RGB_1.png, 
> tensor4-nofunction.pdf, tensor4-nofunction.ps, tensor4-nofunction_1.png, 
> updateshading6ContourTest.rar
>
>
> Of the seven shading methods described in the PDF specification, type 6 
> (Coons patch meshes) and type 7 (Tensor-product patch meshes) haven't been 
> implemented. I have done type 1, 4 and 5, but I don't know the math for type 
> 6 and 7. My math days are decades away.
> Knowledge prerequisites: 
> - java, although you don't have to be a java ace, just feel comfortable
> - math: you should know what "cubic Bézier curves", "Degenerate Bézier 
> curves", "bilinear interpolation", "tensor-product", "affine transform 
> matrix" and "Bernstein polynomials" are, or be able to learn it
> - maven (basic)
> - svn (basic)
> - an IDE like Netbeans or Eclipse or IntelliJ (basic)
> - ideally, you are either a math student who likes to program, or a computer 
> science student who is specializing in graphics.
> A first look at PDFBOX: try the command utility here:
> https://pdfbox.apache.org/commandline/#pdfToImage
> and use your favorite PDF, or the PDFs mentioned in PDFBOX-615, these have 
> the shading types that are already implemented.
> Some simple source code to convert to images:
> String filename = "blah.pdf";
> PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
> // typed list so that the for-each loop below compiles
> List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
> int page = 0;
> for (PDPage pdPage : pdPages)
> {
>     ++page;
>     // render the page as a 1-bit image at resolution 300
>     BufferedImage bim = RenderUtil.convertToImage(pdPage, 
>         BufferedImage.TYPE_BYTE_BINARY, 300);
>     ImageIO.write(bim, "png", new File(filename + page + ".png"));
> }
> document.close();
> You are not starting 

[jira] [Comment Edited] (PDFBOX-2184) Jenkins: CMMException: Invalid profile data

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052137#comment-14052137
 ] 

John Hewson edited comment on PDFBOX-2184 at 7/4/14 4:32 PM:
-

I have a theory that this is an -Open-JDK bug which is affecting this test 
because it is multithreaded. I've altered the test runner in 
[r1607783|http://svn.apache.org/r1607783] to see if the issue goes away.
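
If the theory holds, one generic way to sidestep such a race is to let only one thread at a time into the color conversion. The sketch below shows that idea only; it is not what r1607783 changed (that revision is not shown here), and the class name, the lock and the wrapper method are hypothetical.

```java
import java.awt.color.ColorSpace;

// Hypothetical sketch: serialize color conversion behind a single lock.
class SerializedColorConversionSketch {

    private static final Object CMM_LOCK = new Object();

    // Only one thread at a time may call into the color management module.
    static float[] toRGB(ColorSpace colorSpace, float[] components) {
        synchronized (CMM_LOCK) {
            return colorSpace.toRGB(components);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        final ColorSpace srgb = ColorSpace.getInstance(ColorSpace.CS_sRGB);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    float[] rgb = toRGB(srgb, new float[] { 1f, 0f, 0f });
                    System.out.println(rgb.length + " components converted");
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
    }
}
```

The lock costs some parallelism, so it would only be worth keeping if the intermittent failure really disappears with it.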


was (Author: jahewson):
I have a theory that this is an OpenJDK bug which is affecting this test 
because it is multithreaded. I've altered the test runner in 
[r1607783|http://svn.apache.org/r1607783] to see if the issue goes away.

> Jenkins: CMMException: Invalid profile data
> ---
>
> Key: PDFBOX-2184
> URL: https://issues.apache.org/jira/browse/PDFBOX-2184
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: John Hewson
>
> Jenkins builds are intermittently failing with the error:
> {code}
> java.awt.color.CMMException: Invalid profile data
>   at sun.awt.color.CMM.checkStatus(CMM.java:131)
>   at sun.awt.color.ICC_Transform.<init>(ICC_Transform.java:88)
>   at java.awt.color.ICC_ColorSpace.toRGB(ICC_ColorSpace.java:144)
>   at org.apache.pdfbox.pdmodel.graphics.color.PDDeviceRGB.toRGB(PDDeviceRGB.java:79)
>   at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.toPaint(PDColorSpace.java:255)
>   at org.apache.pdfbox.rendering.PageDrawer.getNonStrokingPaint(PageDrawer.java:666)
>   at org.apache.pdfbox.rendering.PageDrawer.fillPath(PageDrawer.java:739)
>   at org.apache.pdfbox.util.operator.pagedrawer.FillNonZeroRule.process(FillNonZeroRule.java:37)
>   at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:488)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:254)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:221)
>   at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:130)
>   at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:488)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:254)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:221)
>   at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:197)
>   at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:183)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:228)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:160)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:83)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:70)
>   at org.apache.pdfbox.util.TestRendering.render(TestRendering.java:78)
> {code}





[jira] [Comment Edited] (PDFBOX-2184) Jenkins: CMMException: Invalid profile data

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052137#comment-14052137
 ] 

John Hewson edited comment on PDFBOX-2184 at 7/4/14 4:32 PM:
-

I have a theory that this is an -OpenJDK- JDK 6 bug which is affecting this 
test because it is multithreaded. I've altered the test runner in 
[r1607783|http://svn.apache.org/r1607783] to see if the issue goes away.


was (Author: jahewson):
I have a theory that this is an -Open-JDK bug which is affecting this test 
because it is multithreaded. I've altered the test runner in 
[r1607783|http://svn.apache.org/r1607783] to see if the issue goes away.



[jira] [Comment Edited] (PDFBOX-2184) Jenkins: CMMException: Invalid profile data

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052137#comment-14052137
 ] 

John Hewson edited comment on PDFBOX-2184 at 7/4/14 4:32 PM:
-

I have a theory that this is an OpenJDK 6 bug which is affecting this test 
because it is multithreaded. I've altered the test runner in 
[r1607783|http://svn.apache.org/r1607783] to see if the issue goes away.


was (Author: jahewson):
I have a theory that this is an -OpenJDK- JDK 6 bug which is affecting this 
test because it is multithreaded. I've altered the test runner in 
[r1607783|http://svn.apache.org/r1607783] to see if the issue goes away.

> Jenkins: CMMException: Invalid profile data
> ---
>
> Key: PDFBOX-2184
> URL: https://issues.apache.org/jira/browse/PDFBOX-2184
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: John Hewson
>
> Jenkins builds are intermittently failing with the error:
> {code}
> java.awt.color.CMMException: Invalid profile data
>   at sun.awt.color.CMM.checkStatus(CMM.java:131)
>   at sun.awt.color.ICC_Transform.<init>(ICC_Transform.java:88)
>   at java.awt.color.ICC_ColorSpace.toRGB(ICC_ColorSpace.java:144)
>   at org.apache.pdfbox.pdmodel.graphics.color.PDDeviceRGB.toRGB(PDDeviceRGB.java:79)
>   at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.toPaint(PDColorSpace.java:255)
>   at org.apache.pdfbox.rendering.PageDrawer.getNonStrokingPaint(PageDrawer.java:666)
>   at org.apache.pdfbox.rendering.PageDrawer.fillPath(PageDrawer.java:739)
>   at org.apache.pdfbox.util.operator.pagedrawer.FillNonZeroRule.process(FillNonZeroRule.java:37)
>   at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:488)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:254)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:221)
>   at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:130)
>   at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:488)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:254)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:221)
>   at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:197)
>   at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:183)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:228)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:160)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:83)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:70)
>   at org.apache.pdfbox.util.TestRendering.render(TestRendering.java:78)
> {code}





[jira] [Commented] (PDFBOX-678) Support missing Text Rendering Modes when rendering a PDF

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052550#comment-14052550
 ] 

John Hewson commented on PDFBOX-678:


We should do this, but not until after AWT text rendering is removed, which 
will be very soon.

> Support missing Text Rendering Modes when rendering a PDF
> -
>
> Key: PDFBOX-678
> URL: https://issues.apache.org/jira/browse/PDFBOX-678
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Reporter: Maruan Sahyoun
> Attachments: Java Printing.pdf, TextRenderingModes.java.patch
>
>
> Of the 7 different Text Rendering Modes only mode 0 (Fill Text) is correctly 
> implemented. Mode 1 (Stroke Text) falls back to Mode 0 and the others are not 
> implemented. I'm looking to implement the missing modes (at least some of 
> them).
> Before doing so I'm proposing a structural change to when rendering really 
> occurs. Currently it's done within the PDxxxFont classes. I'd rather 
> implement the (AWT) text output in PageDrawer (or helper classes within the 
> same package) and use the font classes to return an AWT font by adding a 
> getAwtFont method. Doing so we get a better separation between the PDF 
> related stuff (PDxxx) and applications like PageDrawer. The current rendering 
> specific code within the PDxxxFont classes can be retained for compatibility 
> and marked deprecated at a later stage.
> WDYT?
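
For orientation, the PDF specification numbers the text rendering modes 0 through 7 (set with the Tr operator). Each mode is a combination of filling, stroking and adding the glyph outlines to the clipping path. The enum below is only a sketch of that table. The type and constant names are made up for illustration and are not PDFBox API.

```java
/** Sketch of the PDF text rendering modes (Tr operator); names are hypothetical. */
enum TextRenderingMode {
    FILL(0, true, false, false),
    STROKE(1, false, true, false),
    FILL_STROKE(2, true, true, false),
    INVISIBLE(3, false, false, false),
    FILL_CLIP(4, true, false, true),
    STROKE_CLIP(5, false, true, true),
    FILL_STROKE_CLIP(6, true, true, true),
    CLIP(7, false, false, true);

    final int value;      // operand of the Tr operator
    final boolean fill;   // paint glyph interiors with the non-stroking color
    final boolean stroke; // paint glyph outlines with the stroking color
    final boolean clip;   // add glyph outlines to the clipping path

    TextRenderingMode(int value, boolean fill, boolean stroke, boolean clip) {
        this.value = value;
        this.fill = fill;
        this.stroke = stroke;
        this.clip = clip;
    }

    /** Looks up a mode by its Tr operand; rejects values outside 0..7. */
    static TextRenderingMode fromInt(int value) {
        for (TextRenderingMode mode : values()) {
            if (mode.value == value) {
                return mode;
            }
        }
        throw new IllegalArgumentException("Unknown text rendering mode: " + value);
    }
}
```

A renderer could branch on the three flags instead of on the numeric mode, which would keep a fallback such as "mode 1 drawn as mode 0" visible in one place.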





[jira] [Created] (PDFBOX-2187) ArrayIndexOutOfBoundsException in TIFFFaxDecoder

2014-07-04 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2187:
---

 Summary: ArrayIndexOutOfBoundsException in TIFFFaxDecoder
 Key: PDFBOX-2187
 URL: https://issues.apache.org/jira/browse/PDFBOX-2187
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.8.6, 1.8.7, 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
 Fix For: 1.8.7, 2.0.0


I get two types of exceptions with several files from the digitalcorpora site:

{code}
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 16
    at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:1002)
    at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:95)
...
{code}

or

{code}
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:916)
    at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:95)
{code}

The fix, which is also used by others who use the same code and which solves 
both exceptions, is to increase w by one in this segment:

{code}
this.prevChangingElems = new int[w];
this.currChangingElems = new int[w];
{code}
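
One reading of that fix, as a minimal sketch: both arrays get one slot more than the row width w. Only the two field names and w are taken from the report. The wrapper class is hypothetical and exists only to make the snippet self-contained. The committed change may differ in detail.

```java
/** Hypothetical holder for the two arrays named in the report. */
final class ChangingElementBuffers {
    final int[] prevChangingElems;
    final int[] currChangingElems;

    ChangingElementBuffers(int w) {
        // Allocate w + 1 entries instead of w, so that one index past the
        // row width no longer falls outside the array.
        this.prevChangingElems = new int[w + 1];
        this.currChangingElems = new int[w + 1];
    }
}
```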





[jira] [Updated] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2186:


Attachment: (was: tiffT6-line1002-002489.pdf)

> java.io.IOException: Catalog cannot be found
> 
>
> Key: PDFBOX-2186
> URL: https://issues.apache.org/jira/browse/PDFBOX-2186
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing
>Affects Versions: 1.8.6, 1.8.7, 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 1.8.7, 2.0.0
>
> Attachments: PDFBOX-2186.pdf
>
>
> I get this with the attached file:
> {code}
> Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.PDFParser parseXrefTable
> Warnung: invalid xref line: 00 65535f
> Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.PDFParser parseXrefTable
> Warnung: Count in xref table is 0 at offset 334372
> Jul 04, 2014 5:41:00 PM org.apache.pdfbox.pdfparser.NonSequentialPDFParser initialParse
> Warnung: Expected trailer object at position 334373, keep trying
> Exception in thread "main" java.io.IOException: Catalog cannot be found
>     at org.apache.pdfbox.cos.COSDocument.getCatalog(COSDocument.java:522)
>     at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.initialParse(NonSequentialPDFParser.java:482)
>     at org.apache.pdfbox.pdfparser.NonSequentialPDFParser.parse(NonSequentialPDFParser.java:757)
>     at org.apache.pdfbox.pdmodel.PDDocument.loadNonSeq(PDDocument.java:1157)
>     at org.apache.pdfbox.tools.PDFToImage.main(PDFToImage.java:197)
>     at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:89)
> {code}
> The cause is a TAB in an xref line. The solution is to split on the regex \s 
> (any whitespace character) instead of on a single space only.
> I'm not touching the preflight parser (which has the same line of code) 
> because I assume that it should not be lenient.
> Source of the file:
> http://digitalcorpora.org/corp/nps/files/govdocs1/zipfiles/002.zip  file 959
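
The difference between the two ways of splitting can be sketched as follows. The sample xref entry and the helper names are invented for illustration. The actual parser code is not shown in the report.

```java
/** Illustrates splitting an xref entry on a space versus on any whitespace. */
final class XrefLineSplit {

    /** Splits on the space character only, so a TAB stays inside a field. */
    static String[] splitOnSpace(String line) {
        return line.split(" ");
    }

    /** Splits on any single whitespace character, including TAB. */
    static String[] splitOnWhitespace(String line) {
        return line.split("\\s");
    }
}
```

For a made-up entry such as "0000000000\t65535 f", the first variant yields two fields because offset and generation number stay glued together, while the second yields the expected three.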





[jira] [Updated] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2186:


Attachment: (was: tiffT6-line1002-002145.pdf)



[jira] [Updated] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2186:


Attachment: (was: tiffT6-line1002-p002145u.pdf)



[jira] [Updated] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2186:


Attachment: tiffT6-line1002-p002145u.pdf
tiffT6-line1002-002489.pdf
tiffT6-line1002-002145.pdf
tiffT6-line916-005541u.pdf



[jira] [Updated] (PDFBOX-2186) java.io.IOException: Catalog cannot be found

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2186:


Attachment: (was: tiffT6-line916-005541u.pdf)



[jira] [Updated] (PDFBOX-2187) ArrayIndexOutOfBoundsException in TIFFFaxDecoder

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2187:


Attachment: tiffT6-line1002-p002145u.pdf
tiffT6-line1002-002489.pdf
tiffT6-line1002-002145.pdf
tiffT6-line916-005541u.pdf

> ArrayIndexOutOfBoundsException in TIFFFaxDecoder
> 
>
> Key: PDFBOX-2187
> URL: https://issues.apache.org/jira/browse/PDFBOX-2187
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.6, 1.8.7, 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
>  Labels: CCITTFaxDecode, ccitt
> Fix For: 1.8.7, 2.0.0
>
> Attachments: tiffT6-line1002-002145.pdf, tiffT6-line1002-002489.pdf, 
> tiffT6-line1002-p002145u.pdf, tiffT6-line916-005541u.pdf
>
>
> I get two types of exceptions with several files from the digitalcorpora site:
> {code}
> Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 16
>     at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:1002)
>     at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:95)
> ...
> {code}
> or
> {code}
> Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 1
>     at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:916)
>     at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:95)
> The fix, which is also used by others who use the same code and which solves 
> both exceptions, is to increase w by one in this segment:
> {code}
> this.prevChangingElems = new int[w];
> this.currChangingElems = new int[w];
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (PDFBOX-2187) ArrayIndexOutOfBoundsException in TIFFFaxDecoder

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-2187.
-

Resolution: Fixed

Fixed in rev 1607891 for the trunk and rev 1607892 for the 1.8 branch.



[jira] [Updated] (PDFBOX-2187) ArrayIndexOutOfBoundsException in TIFFFaxDecoder

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2187:


Description: 
I get ArrayIndexOutOfBoundsExceptions at two locations with several files from 
the digitalcorpora site:

{code}
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 16
    at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:1002)
    at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:95)
...
{code}

or

{code}
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:916)
    at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:95)
{code}

The fix, which is also used by others who use the same code and which solves 
both exceptions, is to increase w by one in this segment:

{code}
this.prevChangingElems = new int[w];
this.currChangingElems = new int[w];
{code}

  was:
I get two types of exceptions with several files from the digitalcorpora site:

{code}
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 16
    at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:1002)
    at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:95)
...
{code}

or

{code}
Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 1
    at org.apache.pdfbox.filter.ccitt.TIFFFaxDecoder.decodeT6(TIFFFaxDecoder.java:916)
    at org.apache.pdfbox.filter.CCITTFaxFilter.decode(CCITTFaxFilter.java:95)
{code}

The fix, which is also used by others who use the same code and which solves 
both exceptions, is to increase w by one in this segment:

{code}
this.prevChangingElems = new int[w];
this.currChangingElems = new int[w];
{code}




Re: [jira] [Commented] (PDFBOX-678) Support missing Text Rendering Modes when rendering a PDF

2014-07-04 Thread Leonard Rosenthol
And the biggest thing to keep in mind is that text rendering still needs
to move the pen even when the current OCG is off.

On 7/4/14, 12:38 PM, "John Hewson (JIRA)"  wrote:

>
>[ 
>https://issues.apache.org/jira/browse/PDFBOX-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052550#comment-14052550 ] 
>
>John Hewson commented on PDFBOX-678:
>
>
>We should do this, but not until after AWT text rendering is removed,
>which will be very soon.
>
>> Support missing Text Rendering Modes when rendering a PDF
>> -
>>
>> Key: PDFBOX-678
>> URL: https://issues.apache.org/jira/browse/PDFBOX-678
>> Project: PDFBox
>>  Issue Type: Improvement
>>  Components: Rendering
>>Reporter: Maruan Sahyoun
>> Attachments: Java Printing.pdf, TextRenderingModes.java.patch
>>
>>
>> Of the 7 different Text Rendering Modes only mode 0 (Fill Text) is
>>correctly implemented. Mode 1 (Stroke Text) falls back to Mode 0 and the
>>others are not implemented. I'm looking to implement the missing modes
>>(at least some of them).
>> Before doing so I'm proposing a structural change to when rendering
>>really occurs. Currently it's done within the PDxxxFont classes. I'd
>>rather implement the (AWT) text output in PageDrawer (or helper classes
>>within the same package) and use the font classes to return an AWT font
>>by adding a getAwtFont method. Doing so we get a better separation
>>between the PDF related stuff (PDxxx) and applications like PageDrawer.
>>The current rendering specific code within the PDxxxFont classes can be
>>retained for compatibility and marked deprecated at a later stage.
>> WDYT?
>
>
>



Re: Regression Testing

2014-07-04 Thread John Hewson
Hi Tilman

Thanks for your thoughts. I think that your concerns are already covered by my 
original proposal; I’ll try to explain why and how:

> Of course I agree with the need for regression tests, however it isn't easy: 
> besides the problems of the different JDKs (I use JDK7 Windows 64 bit), there 
> is the problem that some enhancements create slight changes in rendering that 
> are not errors, i.e. both the "before" and the "after" files look OK by 
> themselves. This has happened when we changed the text rendering recently, and 
> has happened again when the clipping was improved. The cause is probably 
> slight changes in color or in boundaries.

If a rendering has changed then the regression test should fail. When a failure 
occurs the developer needs to manually inspect the differences (we could 
generate a visual diff which highlights what changed to make this easier) and 
if ok then they can replace the known-good PNGs with the ones just rendered. 
Indeed this will be the basic workflow for working with regression tests.
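
A visual diff of the kind mentioned above could look roughly like the following sketch. It assumes the known-good PNG and the fresh rendering have already been loaded as BufferedImage instances. The class and method names are invented for illustration, and an exact pixel comparison is only one possible policy.

```java
import java.awt.image.BufferedImage;

/** Sketch of a pixel-exact visual diff between two renderings. */
final class VisualDiff {
    static final int SAME = 0xFFFFFF;      // white: pixel is identical
    static final int DIFFERENT = 0xFF0000; // red: pixel differs or is missing

    /** Returns an image that marks every differing pixel in red. */
    static BufferedImage diff(BufferedImage expected, BufferedImage actual) {
        int width = Math.max(expected.getWidth(), actual.getWidth());
        int height = Math.max(expected.getHeight(), actual.getHeight());
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Pixels outside either image count as differences.
                boolean same = inside(expected, x, y) && inside(actual, x, y)
                        && expected.getRGB(x, y) == actual.getRGB(x, y);
                out.setRGB(x, y, same ? SAME : DIFFERENT);
            }
        }
        return out;
    }

    private static boolean inside(BufferedImage image, int x, int y) {
        return x < image.getWidth() && y < image.getHeight();
    }

    /** Counts the red pixels in an image produced by diff(). */
    static int countDifferences(BufferedImage diff) {
        int count = 0;
        for (int y = 0; y < diff.getHeight(); y++) {
            for (int x = 0; x < diff.getWidth(); x++) {
                if ((diff.getRGB(x, y) & 0xFFFFFF) == DIFFERENT) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

A regression test could fail whenever countDifferences is non-zero and write the diff image next to the rendering, so that the developer only has to look at the red areas.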

> Copyright is a problem: I'm testing mostly with JIRA attachments that I've 
> downloaded over the years. While uploading such files to JIRA might count as 
> fair use, I doubt that this would still be true if they are included in a 
> distribution. Instead, they should be stored somewhere on Apache servers 
> where only committers and build software ("Travis", "Jenkins", ...) can 
> access them. The public PDFs that Maruan mentions possibly don't have all the 
> problem cases that we solved before. However I have started working with 
> these files and there are at least 5 recent issues that deal with them.

The PDFs won’t be in a distribution. They will just happen to be stored in an 
SVN repo but not our source code repo, in the same way that the website is 
stored in the “cmssite” branch of SVN, or indeed that attachments are stored 
on JIRA. The law doesn’t distinguish between JIRA and SVN, both are publicly 
available via HTTP, so using SVN will simply be a continuation of what we’re 
already doing with JIRA.

The crucial factor is that we’re only storing publicly available PDFs, because 
we have the right to do so, just like Google’s cache, and like we currently do 
with JIRA.

Additionally, the PDFs need to be version controlled otherwise we won’t be able 
to reliably recreate previous builds, so storing the files on a web server 
won’t be practical. Also committers will frequently be updating the renderings 
as bugs are fixed and we’ll need to version-control the rendered PNG files for 
the same reason. Finally, having committers-only files doesn’t fit well with 
the Apache goal of open development and would be unnecessary anyway given that 
all the PDFs are to be taken from public sources only.

In summary, I’m proposing that we just keep doing what we’re currently doing 
with JIRA but we move it into its own SVN repo along with some pre-rendered 
PNGs.

> Re preflight: the default mode should be to have the Isartor tests on. 
> Individuals could still disable them locally, but the central build software 
> should always use them.

Yes - does anybody know why this isn’t the default?

-- John

Re: Regression Testing

2014-07-04 Thread John Hewson
Hi Maruan

Thanks for your thoughts...

> # Tests #
> In addition to rendering we shall be covering metadata and text extraction as 
> well as PDF/A validation. 

Yes, we could add extracted text and validation results to the “regression” SVN 
repo also.

> # Testfiles # 
> Recently there were a number of test sets made available which we can use. […]

Excellent.

> In addition we can put additional files into our own repository as you 
> suggested.
> So there is no shortage of test files. 

Some people seem to have downloaded many (or all) of the JIRA files, I guess we 
could add those too.

> TIKA-1300/TIKA-1302 has a discussion around the same topic together with some 
> development for an infrastructure (VM, Jenkins …). IMHO we should join forces 
> with them.

I see that in TIKA-1302 the Tika developers suggest that PDFBox should set up 
its own regression tests, so I guess that’s our starting point. We should make 
sure that it’s easy to run just the text extraction regression tests using 
maven, and also ask them to give us any test files they have.

-- John

PS. Nice job handling those tough questions at PDFDays, I watched the video.

On 3 Jul 2014, at 23:43, Maruan Sahyoun  wrote:

> Hi John,
> 
> thanks for bringing this up. This is a very important topic which was also 
> discussed at the PDFDays in Germany.
> 
> # Tests #
> In addition to rendering we shall be covering metadata and text extraction as 
> well as PDF/A validation. 
> 
> # Testfiles # 
> Recently there were a number of test sets made available which we can use. 
> http://digitalcorpora.org/corpora/files , 
> https://github.com/openplanets/format-corpus/tree/master/pdfCabinetOfHorrors …
> For PDF/A validation there is the Isartor test suite 
> http://www.pdfa.org/2011/08/download-isartor-test-suite/. Some restrictions 
> apply there.
> In addition we can put additional files into our own repository as you 
> suggested.
> So there is no shortage of test files. 
> 
> TIKA-1300/TIKA-1302 has a discussion around the same topic together with some 
> development for an infrastructure (VM, Jenkins …). IMHO we should join forces 
> with them.
> 
> BR
> 
> Maruan
> 
> 
> Am 04.07.2014 um 02:16 schrieb John Hewson :
> 
>> Hi All
>> 
>> I’ve been thinking about regression testing recently and how we can improve
>> our tests for rendering. There are currently two problems:
>> 
>> 1) Different JDKs produce slightly different renderings (see PDFBOX-1843).
>>   (I suspect that AWT fonts are a big part of this, so the problem might get
>>   a lot better soon once we render all fonts ourselves).
>> 
>> 2) Most PDF test files we have are not under an Apache-friendly license, so
>>   we can’t put the test files into the trunk SVN.
>> 
>> It seems that some of you have your own collections of test PDF files which
>> you are running regression tests on: that’s great but it would be much better
>> if we had a central repository of test files and sample renderings.
>> 
>> I’d like to suggest the following solutions to the above issues:
>> 
>> 1) We should choose a “blessed” JDK which will be used to perform the
>>   renderings; this should be whatever is a convenient and sensible default
>>   for committers. (My preference would be for Oracle’s JDK 7 because JDK 6
>>   is deprecated and has known rendering bugs). We should make sure that
>>   Jenkins runs tests using the “blessed” JDK.
>> 
>>   The regression test can then check to see if it is running on the
>>   “blessed” JDK and if not then the tests can be skipped and we can warn
>>   the user.
>> 
>> 2) We should create a new “regression” branch in SVN which contains only PDF
>>   files for testing and PNG images which contain known-good renderings
>>   created using the “blessed” JDK. This branch would not be part of the
>>   source of PDFBox but will still allow us to version control the test PDFs
>>   (it also simplifies the workflow for adding new test PDFs and new
>>   known-good renderings: simply do an “svn add”).
>> 
>>   As far as copyright and licensing is concerned we can put any PDF files
>>   which are available publicly on the web into this branch without too much
>>   worry.
>> 
>> What does everybody think?
>> 
>> -- John
>> 
> 
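
The “blessed” JDK check from point 1 of the quoted proposal could be sketched along these lines. The version prefix is a placeholder for whatever JDK is eventually chosen, and the class and method names are invented. Comparing java.version alone does not tell Oracle's JDK apart from other builds of the same release, so a real check would probably need further system properties.

```java
/** Sketch of a check for the "blessed" JDK; the constant is a placeholder. */
final class BlessedJdk {
    // Assumption for illustration: JDK 7 reports versions such as "1.7.0_60".
    static final String VERSION_PREFIX = "1.7.";

    /** Returns true if the given java.version string matches the blessed release. */
    static boolean isBlessed(String javaVersion) {
        return javaVersion != null && javaVersion.startsWith(VERSION_PREFIX);
    }

    /** Applies the check to the JVM that is running the tests. */
    static boolean isCurrentJvmBlessed() {
        return isBlessed(System.getProperty("java.version"));
    }
}
```

In a JUnit 4 test the result could be passed to Assume.assumeTrue, so that the rendering comparison is skipped rather than failed on any other JDK, with a warning logged for the user.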



[jira] [Assigned] (PDFBOX-2179) Regression: Some isartor tests are not passing in 2.0.0

2014-07-04 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson reassigned PDFBOX-2179:
---

Assignee: John Hewson

> Regression: Some isartor tests are not passing in 2.0.0
> ---
>
> Key: PDFBOX-2179
> URL: https://issues.apache.org/jira/browse/PDFBOX-2179
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: Guillaume Bailleul
>Assignee: John Hewson
>Priority: Critical
>  Labels: regression
>
> It is possible to check preflight with the isartor files while building 
> pdfbox. The option is not set by default. It can be done with the command 
> line :
> {quote}
> mvn test -Dskip.external.resources=false
> {quote}
> On July 2nd, 9 tests are failing :
> {quote}
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-f.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-h.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-i.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:170 isartor-6-2-7-t01-fail-a.pdf : Invalid error code 
> returned. expected:<2.\[3.2\]> but was:<2\.[1.9\]>
> TestIsartor.validate:159 isartor-6-3-2-t01-fail-a.pdf : Invalid error code 
> returned. Expected 3.2.2, found \[3.1.1 3.3.2 \]
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-f.pdf : NullPointerException 
> raised , message=null
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-g.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-h.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-9-t02-fail-a.pdf : NullPointerException 
> raised , message=null
> {quote}
> All is working fine with the last released version.





[jira] [Resolved] (PDFBOX-2181) Regression: NPE in PreflightContentStream

2014-07-04 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson resolved PDFBOX-2181.
-

   Resolution: Fixed
Fix Version/s: 2.0.0

The problem was trying to use the page BBox instead of the XObject BBox. 
Fixed in [r1607902|http://svn.apache.org/r1607902].

> Regression: NPE in PreflightContentStream
> -
>
> Key: PDFBOX-2181
> URL: https://issues.apache.org/jira/browse/PDFBOX-2181
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
> Attachments: expected.pdf
>
>
> Works in 1.8
> java -cp 
> pdf-box-svn/preflight/target/preflight-2.0.0-SNAPSHOT.jar:pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:pdf-box-svn/xmpbox/target/xmpbox-2.0.0-SNAPSHOT.jar
>  org.apache.pdfbox.preflight.Validator_A1b expected.pdf
> Exception in thread "main" java.lang.NullPointerException
>   at 
> org.apache.pdfbox.preflight.content.PreflightContentStream.validXObjContentStream(PreflightContentStream.java:99)





[jira] [Updated] (PDFBOX-2188) java.io.IOException: Expected a name or array but got: COSObject{1823, 0}

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2188:


Attachment: 000-000324.pdf

> java.io.IOException: Expected a name or array but got: COSObject{1823, 0}
> -
>
> Key: PDFBOX-2188
> URL: https://issues.apache.org/jira/browse/PDFBOX-2188
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
> Attachments: 000-000324.pdf
>
>
> I get this with the attached file:
> {code}
> 04.07.2014 19:20:59.356 ERROR [main] 
> org.apache.pdfbox.pdmodel.PDResources:329 - error while creating a colorspace
> java.io.IOException: Expected a name or array but got: COSObject{1823, 0}
>   at 
> org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:162)
>   at 
> org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:55)
>   at 
> org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:145)
>   at 
> org.apache.pdfbox.pdmodel.PDResources.getColorSpaces(PDResources.java:325)
>   at 
> org.apache.pdfbox.util.operator.SetNonStrokingColorSpace.process(SetNonStrokingColorSpace.java:44)
>   at 
> org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:499)
>   at 
> org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:264)
>   at 
> org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:223)
>   at 
> org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:199)
>   at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:183)
>   at 
> org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:228)
>   at 
> org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:160)
>   at 
> org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:109)
>   at pdfboxziptest.PDFBoxZipTest.doPdf(PDFBoxZipTest.java:101)
>   at pdfboxziptest.PDFBoxZipTest.main(PDFBoxZipTest.java:72)
> {code}
> In the PDF:
> {code}
> 36 0 obj
> [ 
> /Pattern 1823 0 R 
> ]
> endobj
> {code}
> {code}
> 1823 0 obj
> [ 
> /ICCBased 1851 0 R 
> ]
> endobj
> {code}
> I assume that it is indeed a syntax error, but I don't really get how it 
> should have been done, and whether it can be fixed.





[jira] [Created] (PDFBOX-2188) java.io.IOException: Expected a name or array but got: COSObject{1823, 0}

2014-07-04 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2188:
---

 Summary: java.io.IOException: Expected a name or array but got: 
COSObject{1823, 0}
 Key: PDFBOX-2188
 URL: https://issues.apache.org/jira/browse/PDFBOX-2188
 Project: PDFBox
  Issue Type: Bug
  Components: Parsing
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
 Attachments: 000-000324.pdf

I get this with the attached file:

{code}
04.07.2014 19:20:59.356 ERROR [main] org.apache.pdfbox.pdmodel.PDResources:329 
- error while creating a colorspace
java.io.IOException: Expected a name or array but got: COSObject{1823, 0}
at 
org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:162)
at 
org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:55)
at 
org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:145)
at 
org.apache.pdfbox.pdmodel.PDResources.getColorSpaces(PDResources.java:325)
at 
org.apache.pdfbox.util.operator.SetNonStrokingColorSpace.process(SetNonStrokingColorSpace.java:44)
at 
org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:499)
at 
org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:264)
at 
org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:223)
at 
org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:199)
at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:183)
at 
org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:228)
at 
org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:160)
at 
org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:109)
at pdfboxziptest.PDFBoxZipTest.doPdf(PDFBoxZipTest.java:101)
at pdfboxziptest.PDFBoxZipTest.main(PDFBoxZipTest.java:72)
{code}

In the PDF:
{code}
36 0 obj
[ 
/Pattern 1823 0 R 
]
endobj
{code}

{code}
1823 0 obj
[ 
/ICCBased 1851 0 R 
]
endobj
{code}

I assume that it is indeed a syntax error, but I don't really get how it should 
have been done, and whether it can be fixed.
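
One way to read the error message is that the colour space code was handed the indirect reference (1823 0 R) itself rather than the object it points to. The sketch below illustrates that idea only. It assumes the nested reference is meant to be followed, and it uses small hypothetical stand-in types (Ref, an object table, names as String, arrays as List) instead of the PDFBox COS classes; it is not the fix that was applied.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ColorSpaceResolveSketch {
    /** Hypothetical stand-in for an indirect reference such as "1823 0 R". */
    public static final class Ref {
        final int number;
        public Ref(int number) { this.number = number; }
    }

    /** Hypothetical object table: object number -> object. */
    private final Map<Integer, Object> objects = new HashMap<Integer, Object>();

    public void put(int number, Object value) { objects.put(number, value); }

    /** Follows indirect references until a direct object (or null) is reached. */
    public Object resolve(Object o) {
        int hops = 0;
        while (o instanceof Ref) {
            if (++hops > 32) {
                throw new IllegalStateException("reference chain too long");
            }
            o = objects.get(((Ref) o).number);
        }
        return o;
    }

    /** Describes a colour space given as a name (String) or an array (List). */
    public String describe(Object cs) {
        Object direct = resolve(cs); // dereference before inspecting the type
        if (direct instanceof String) {
            return (String) direct;
        }
        if (direct instanceof List) {
            List<?> array = (List<?>) direct;
            String name = (String) resolve(array.get(0));
            if ("Pattern".equals(name) && array.size() > 1) {
                return "Pattern(" + describe(array.get(1)) + ")";
            }
            return name;
        }
        throw new IllegalArgumentException("Expected a name or array but got: " + direct);
    }

    public static void main(String[] args) {
        ColorSpaceResolveSketch doc = new ColorSpaceResolveSketch();
        // Mirrors objects 36 and 1823 from the report.
        doc.put(36, Arrays.<Object>asList("Pattern", new Ref(1823)));
        doc.put(1823, Arrays.<Object>asList("ICCBased", new Ref(1851)));
        System.out.println(doc.describe(new Ref(36)));
    }
}
```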





[jira] [Commented] (PDFBOX-2179) Regression: Some isartor tests are not passing in 2.0.0

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052605#comment-14052605
 ] 

John Hewson commented on PDFBOX-2179:
-

The two NullPointerException(s) have been fixed by PDFBOX-2181.

> Regression: Some isartor tests are not passing in 2.0.0
> ---
>
> Key: PDFBOX-2179
> URL: https://issues.apache.org/jira/browse/PDFBOX-2179
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: Guillaume Bailleul
>Assignee: John Hewson
>Priority: Critical
>  Labels: regression
>
> It is possible to check preflight with the isartor files while building 
> pdfbox. The option is not set by default. It can be done with the command 
> line :
> {quote}
> mvn test -Dskip.external.resources=false
> {quote}
> On July 2nd, 9 tests are failing :
> {quote}
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-f.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-h.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-i.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:170 isartor-6-2-7-t01-fail-a.pdf : Invalid error code 
> returned. expected:<2.\[3.2\]> but was:<2\.[1.9\]>
> TestIsartor.validate:159 isartor-6-3-2-t01-fail-a.pdf : Invalid error code 
> returned. Expected 3.2.2, found \[3.1.1 3.3.2 \]
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-f.pdf : NullPointerException 
> raised , message=null
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-g.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-h.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-9-t02-fail-a.pdf : NullPointerException 
> raised , message=null
> {quote}
> All is working fine with the last released version.





[jira] [Comment Edited] (PDFBOX-2181) Regression: NPE in PreflightContentStream

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052604#comment-14052604
 ] 

John Hewson edited comment on PDFBOX-2181 at 7/4/14 6:20 PM:
-

The problem was trying to use the page BBox instead of the XObject BBox. Fixed 
in [r1607902|http://svn.apache.org/r1607902].


was (Author: jahewson):
The problem was using trying to use the page BBox instead of the XObject BBox. 
Fixed in [r1607902|http://svn.apache.org/r1607902].

> Regression: NPE in PreflightContentStream
> -
>
> Key: PDFBOX-2181
> URL: https://issues.apache.org/jira/browse/PDFBOX-2181
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
> Attachments: expected.pdf
>
>
> Works in 1.8
> java -cp 
> pdf-box-svn/preflight/target/preflight-2.0.0-SNAPSHOT.jar:pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:pdf-box-svn/xmpbox/target/xmpbox-2.0.0-SNAPSHOT.jar
>  org.apache.pdfbox.preflight.Validator_A1b expected.pdf
> Exception in thread "main" java.lang.NullPointerException
>   at 
> org.apache.pdfbox.preflight.content.PreflightContentStream.validXObjContentStream(PreflightContentStream.java:99)





[jira] [Updated] (PDFBOX-2181) Regression: NPE in PreflightContentStream

2014-07-04 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-2181:


Summary: Regression: NPE in PreflightContentStream  (was: Regression NPE in 
PreflightContentStream)

> Regression: NPE in PreflightContentStream
> -
>
> Key: PDFBOX-2181
> URL: https://issues.apache.org/jira/browse/PDFBOX-2181
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
> Attachments: expected.pdf
>
>
> Works in 1.8
> java -cp 
> pdf-box-svn/preflight/target/preflight-2.0.0-SNAPSHOT.jar:pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:pdf-box-svn/xmpbox/target/xmpbox-2.0.0-SNAPSHOT.jar
>  org.apache.pdfbox.preflight.Validator_A1b expected.pdf
> Exception in thread "main" java.lang.NullPointerException
>   at 
> org.apache.pdfbox.preflight.content.PreflightContentStream.validXObjContentStream(PreflightContentStream.java:99)





[jira] [Commented] (PDFBOX-2179) Regression: Some isartor tests are not passing in 2.0.0

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052613#comment-14052613
 ] 

John Hewson commented on PDFBOX-2179:
-

I've fixed the five "IllegalStateException: Call to processSubStream() before 
processStream() or initStream()" errors in 
[r1607905|http://svn.apache.org/r1607905]. This was caused by preflight calling 
internal methods of PDFStreamEngine in a way which breaks their intended 
purpose.
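
The exception message suggests a guard of the following kind: sub-stream processing is only allowed once the engine has been initialised. The class below is a hypothetical miniature, not the real PDFStreamEngine; the field and counter are assumptions made for illustration, and only the method names and the message text are taken from the report.

```java
public class StreamEngineStateSketch {
    private boolean initialised;
    private int processedSubStreams;

    /** Prepares the engine; must happen before any sub-stream is processed. */
    public void initStream() {
        initialised = true;
    }

    /** Public entry point: initialises the engine, then processes the content. */
    public void processStream(String content) {
        initStream();
        processSubStream(content);
    }

    /** Internal step; calling it on an uninitialised engine is a usage error. */
    public void processSubStream(String content) {
        if (!initialised) {
            throw new IllegalStateException(
                    "Call to processSubStream() before processStream() or initStream()");
        }
        processedSubStreams++;
    }

    public int getProcessedSubStreams() {
        return processedSubStreams;
    }

    public static void main(String[] args) {
        StreamEngineStateSketch engine = new StreamEngineStateSketch();
        try {
            engine.processSubStream("q Q"); // wrong order: engine not initialised
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
        engine.processStream("q Q"); // intended entry point
        System.out.println("Sub-streams processed: " + engine.getProcessedSubStreams());
    }
}
```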

> Regression: Some isartor tests are not passing in 2.0.0
> ---
>
> Key: PDFBOX-2179
> URL: https://issues.apache.org/jira/browse/PDFBOX-2179
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: Guillaume Bailleul
>Assignee: John Hewson
>Priority: Critical
>  Labels: regression
>
> It is possible to check preflight with the isartor files while building 
> pdfbox. The option is not set by default. It can be done with the command 
> line :
> {quote}
> mvn test -Dskip.external.resources=false
> {quote}
> On July 2nd, 9 tests are failing :
> {quote}
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-f.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-h.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-i.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:170 isartor-6-2-7-t01-fail-a.pdf : Invalid error code 
> returned. expected:<2.\[3.2\]> but was:<2\.[1.9\]>
> TestIsartor.validate:159 isartor-6-3-2-t01-fail-a.pdf : Invalid error code 
> returned. Expected 3.2.2, found \[3.1.1 3.3.2 \]
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-f.pdf : NullPointerException 
> raised , message=null
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-g.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-h.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-9-t02-fail-a.pdf : NullPointerException 
> raised , message=null
> {quote}
> All is working fine with the last released version.





[jira] [Created] (PDFBOX-2189) java.awt.geom.IllegalPathStateException: missing initial moveto in path definition

2014-07-04 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-2189:
---

 Summary: java.awt.geom.IllegalPathStateException: missing initial 
moveto in path definition
 Key: PDFBOX-2189
 URL: https://issues.apache.org/jira/browse/PDFBOX-2189
 Project: PDFBox
  Issue Type: Bug
  Components: FontBox
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
 Fix For: 2.0.0


I get this with the attached file:
{code}
java.awt.geom.IllegalPathStateException: missing initial moveto in path 
definition
at java.awt.geom.Path2D$Float.needRoom(Path2D.java:280)
at java.awt.geom.Path2D$Float.needRoom(Path2D.java:280)
at java.awt.geom.Path2D.closePath(Path2D.java:1769)
{code}

I missed that one when I fixed PDFBOX-2158. The fix will be to do nothing and 
put out a warning if there's no current point.
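
A minimal sketch of the described approach (do nothing and warn when there is no current point), written against plain java.awt.geom rather than the FontBox code, which is not shown in this thread. The class and method names are hypothetical.

```java
import java.awt.geom.GeneralPath;
import java.util.logging.Logger;

public class SafeClosePath {
    private static final Logger LOG = Logger.getLogger(SafeClosePath.class.getName());

    /**
     * Closes the current subpath only if the path has a current point.
     * Returns true if closePath() was called, false if the call was skipped.
     */
    public static boolean closeIfPossible(GeneralPath path) {
        if (path.getCurrentPoint() == null) {
            // An empty path has no moveto yet; closePath() would throw
            // IllegalPathStateException, so warn and ignore the operator.
            LOG.warning("closepath without initial moveto, ignoring");
            return false;
        }
        path.closePath();
        return true;
    }

    public static void main(String[] args) {
        GeneralPath empty = new GeneralPath();
        System.out.println(closeIfPossible(empty)); // skipped, warning logged

        GeneralPath line = new GeneralPath();
        line.moveTo(0f, 0f);
        line.lineTo(10f, 10f);
        System.out.println(closeIfPossible(line));
    }
}
```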





[jira] [Comment Edited] (PDFBOX-2179) Regression: Some isartor tests are not passing in 2.0.0

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052613#comment-14052613
 ] 

John Hewson edited comment on PDFBOX-2179 at 7/4/14 6:49 PM:
-

I've fixed the five "IllegalStateException: Call to processSubStream() before 
processStream() or initStream()" errors in 
[r1607905|http://svn.apache.org/r1607905]. This was caused by preflight calling 
internal methods of PDFStreamEngine in a way which breaks their intended 
purpose.

All exceptions are now fixed, but there are two remaining test failures.


was (Author: jahewson):
I've fixed the five "IllegalStateException: Call to processSubStream() before 
processStream() or initStream()" errors in 
[r1607905|http://svn.apache.org/r1607905]. This was caused by preflight calling 
internal methods of PDFStreamEngine in a way which breaks their intended 
purpose.

> Regression: Some isartor tests are not passing in 2.0.0
> ---
>
> Key: PDFBOX-2179
> URL: https://issues.apache.org/jira/browse/PDFBOX-2179
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: Guillaume Bailleul
>Assignee: John Hewson
>Priority: Critical
>  Labels: regression
>
> It is possible to check preflight with the isartor files while building 
> pdfbox. The option is not set by default. It can be done with the command 
> line :
> {quote}
> mvn test -Dskip.external.resources=false
> {quote}
> On July 2nd, 9 tests are failing :
> {quote}
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-f.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-h.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-i.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:170 isartor-6-2-7-t01-fail-a.pdf : Invalid error code 
> returned. expected:<2.\[3.2\]> but was:<2\.[1.9\]>
> TestIsartor.validate:159 isartor-6-3-2-t01-fail-a.pdf : Invalid error code 
> returned. Expected 3.2.2, found \[3.1.1 3.3.2 \]
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-f.pdf : NullPointerException 
> raised , message=null
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-g.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-h.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-9-t02-fail-a.pdf : NullPointerException 
> raised , message=null
> {quote}
> All is working fine with the last released version.





[jira] [Commented] (PDFBOX-2179) Regression: Some isartor tests are not passing in 2.0.0

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052617#comment-14052617
 ] 

John Hewson commented on PDFBOX-2179:
-

The first test failure is fixed in [r1607907|http://svn.apache.org/r1607907]. 
It was caused by PostScript XObjects not being handled correctly. I have no 
idea where this regression came from; I'd speculate that the XObject resource 
wasn't being parsed or read somehow before.

> Regression: Some isartor tests are not passing in 2.0.0
> ---
>
> Key: PDFBOX-2179
> URL: https://issues.apache.org/jira/browse/PDFBOX-2179
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: Guillaume Bailleul
>Assignee: John Hewson
>Priority: Critical
>  Labels: regression
>
> It is possible to check preflight with the isartor files while building 
> pdfbox. The option is not set by default. It can be done with the command 
> line :
> {quote}
> mvn test -Dskip.external.resources=false
> {quote}
> On July 2nd, 9 tests are failing :
> {quote}
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-f.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-h.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-i.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:170 isartor-6-2-7-t01-fail-a.pdf : Invalid error code 
> returned. expected:<2.\[3.2\]> but was:<2\.[1.9\]>
> TestIsartor.validate:159 isartor-6-3-2-t01-fail-a.pdf : Invalid error code 
> returned. Expected 3.2.2, found \[3.1.1 3.3.2 \]
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-f.pdf : NullPointerException 
> raised , message=null
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-g.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-h.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-9-t02-fail-a.pdf : NullPointerException 
> raised , message=null
> {quote}
> All is working fine with the last released version.





[jira] [Updated] (PDFBOX-2189) java.awt.geom.IllegalPathStateException: missing initial moveto in path definition

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2189:


Attachment: 002-002195.pdf

> java.awt.geom.IllegalPathStateException: missing initial moveto in path 
> definition
> --
>
> Key: PDFBOX-2189
> URL: https://issues.apache.org/jira/browse/PDFBOX-2189
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 2.0.0
>
> Attachments: 002-002195.pdf
>
>
> I get this with the attached file:
> {code}
> java.awt.geom.IllegalPathStateException: missing initial moveto in path 
> definition
>   at java.awt.geom.Path2D$Float.needRoom(Path2D.java:280)
>   at java.awt.geom.Path2D$Float.needRoom(Path2D.java:280)
>   at java.awt.geom.Path2D.closePath(Path2D.java:1769)
> {code}
> I missed that one when I fixed PDFBOX-2158. The fix will be to do nothing and 
> put out a warning if there's no current point.





Re: PDFDays

2014-07-04 Thread Tilman Hausherr

On 04.07.2014 19:50, John Hewson wrote:

PS. Nice job handling those tough questions at PDFDays, I watched the video.


Nobody expects the Spanish Inquisition:
https://www.youtube.com/watch?v=Tym0MObFpTI

I found what one of the inquisitors was referring to:
http://www.haskellforall.com/2014/04/worst-practices-are-viral-for-wrong.html

Btw, there have been fewer questions on Stack Overflow recently. Either the 
newbies are on vacation, or the software is getting better :)


Tilman




[jira] [Resolved] (PDFBOX-2189) java.awt.geom.IllegalPathStateException: missing initial moveto in path definition

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-2189.
-

Resolution: Fixed

fixed in rev 1607914 for the trunk.

> java.awt.geom.IllegalPathStateException: missing initial moveto in path 
> definition
> --
>
> Key: PDFBOX-2189
> URL: https://issues.apache.org/jira/browse/PDFBOX-2189
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 2.0.0
>
> Attachments: 002-002195.pdf
>
>
> I get this with the attached file:
> {code}
> java.awt.geom.IllegalPathStateException: missing initial moveto in path 
> definition
>   at java.awt.geom.Path2D$Float.needRoom(Path2D.java:280)
>   at java.awt.geom.Path2D$Float.needRoom(Path2D.java:280)
>   at java.awt.geom.Path2D.closePath(Path2D.java:1769)
> {code}
> I missed that one when I fixed PDFBOX-2158. The fix will be to do nothing and 
> put out a warning if there's no current point.





[jira] [Resolved] (PDFBOX-2179) Regression: Some isartor tests are not passing in 2.0.0

2014-07-04 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson resolved PDFBOX-2179.
-

   Resolution: Fixed
Fix Version/s: 2.0.0

The second test failure is fixed in [r1607917|http://svn.apache.org/r1607917]. 
It was introduced by PDFBOX-2149, which moved font loading into the PDFont 
constructor, so IOExceptions for damaged fonts needed to be caught earlier and 
translated into format-specific validation errors.
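
The pattern described here (catch the IOException where the font is loaded and report it as a validation error instead of letting it escape) might be sketched as follows. Everything in this block is hypothetical: the FontLoader interface, the error code and the string-based error list are stand-ins, since the thread does not show the preflight classes involved.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FontValidationSketch {
    /** Hypothetical stand-in for constructing a font, which may now throw. */
    public interface FontLoader {
        void load() throws IOException;
    }

    /** Hypothetical error code; the real preflight codes are not shown here. */
    public static final String ERROR_FONT_DAMAGED = "FONT_DAMAGED";

    /** Loads the font and converts a load failure into a validation error. */
    public static List<String> validate(FontLoader loader) {
        List<String> errors = new ArrayList<String>();
        try {
            loader.load();
        } catch (IOException e) {
            // A damaged font is a validation result, not a crash of the validator.
            errors.add(ERROR_FONT_DAMAGED + ": " + e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> errors = validate(new FontLoader() {
            public void load() throws IOException {
                throw new IOException("truncated font program");
            }
        });
        System.out.println(errors);
    }
}
```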

> Regression: Some isartor tests are not passing in 2.0.0
> ---
>
> Key: PDFBOX-2179
> URL: https://issues.apache.org/jira/browse/PDFBOX-2179
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: Guillaume Bailleul
>Assignee: John Hewson
>Priority: Critical
>  Labels: regression
> Fix For: 2.0.0
>
>
> It is possible to check preflight with the isartor files while building 
> pdfbox. The option is not set by default. It can be done with the command 
> line :
> {quote}
> mvn test -Dskip.external.resources=false
> {quote}
> On July 2nd, 9 tests are failing :
> {quote}
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-f.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-h.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-i.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:170 isartor-6-2-7-t01-fail-a.pdf : Invalid error code 
> returned. expected:<2.\[3.2\]> but was:<2\.[1.9\]>
> TestIsartor.validate:159 isartor-6-3-2-t01-fail-a.pdf : Invalid error code 
> returned. Expected 3.2.2, found \[3.1.1 3.3.2 \]
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-f.pdf : NullPointerException 
> raised , message=null
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-g.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-h.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-9-t02-fail-a.pdf : NullPointerException 
> raised , message=null
> {quote}
> All is working fine with the last released version.





Re: [jira] [Closed] (PDFBOX-1384) Proposals for a new PDNameTreeNode and PDNumberTreeNode

2014-07-04 Thread dnt
I'm wondering if anyone had a closer look at it to think about an integration 
into 2.0. It is a bit older by now, but not necessarily outdated. If you 
think the changes made sense, I can create a patch that can be applied to 2.0.
The same holds true for my other proposals that I submitted on the same day.

Dominic

On Thursday, 3 July 2014, 01:27:25, John Hewson wrote:
>  [
> https://issues.apache.org/jira/browse/PDFBOX-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
>  ]
> 
> John Hewson closed PDFBOX-1384.
> ---
> 
> Resolution: Won't Fix
> 
> Closing due to the age of this patch.
> 
> > Proposals for a new PDNameTreeNode and PDNumberTreeNode
> > ---
> > 
> > Key: PDFBOX-1384
> > URL: https://issues.apache.org/jira/browse/PDFBOX-1384
> > Project: PDFBox
> >  Issue Type: Improvement
> >Reporter: Dominic Tubach
> >Priority: Minor
> > Fix For: 2.0.0
> > Attachments: DTPDNameTreeNode.java, DTPDNameTreeNodeTest.java,
> > DTPDNumberTreeNode.java, DTPDNumberTreeNodeTest.java
> >
> > Attached are proposals for a new PDNameTreeNode and a new
> > PDNumberTreeNode. (As both are very similar, I put them in one instead of
> > two issues.) Main differences:
> > - type safety through generics.
> > - it's always clear which types of objects the array holds.
> > - flexible object conversion through COSBaseConverter.
> > - remove method.
> > - size and isEmpty method.
> > - correct updating of limits (even in parent nodes) when setting kids,
> > names or removing values. (Does not set limits in root node as defined by
> > the PDF spec.)
> > - removes empty child nodes.
> > Drawbacks:
> > - replacing the existing classes would require changes in existing code.
> > - requires (as of now) Java 1.6 (It might be enough to remove the
> > @Override annotations for Java 1.5 compatibility.)
> > The required COSBaseConverter can be found in issue #PDFBOX-1383
> > (To avoid conflicts with the existing classes I prefixed everything with
> > my initials.)




[jira] [Created] (PDFBOX-2190) Disable console logging for preflight Isartor tests

2014-07-04 Thread John Hewson (JIRA)
John Hewson created PDFBOX-2190:
---

 Summary: Disable console logging for preflight Isartor tests
 Key: PDFBOX-2190
 URL: https://issues.apache.org/jira/browse/PDFBOX-2190
 Project: PDFBox
  Issue Type: Wish
Reporter: John Hewson
Priority: Minor


The preflight Isartor test suite writes PDFBox's internal LOG messages out to 
the build console, which pollutes it. My solution is just to disable logging; 
it can always be re-enabled for debugging.
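
As an illustration of switching log output off and back on, here is a sketch using java.util.logging. This is an assumption about the mechanism: PDFBox logs through Commons Logging, so the effective backend depends on the classpath, and the issue does not say how logging was actually disabled. The class name is hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class QuietPdfBoxLogging {
    // Keep a strong reference: java.util.logging holds loggers weakly, so a
    // level set on an unreferenced logger can be lost after garbage collection.
    private static final Logger PDFBOX_LOGGER = Logger.getLogger("org.apache.pdfbox");

    /** Silences the org.apache.pdfbox logger and, by inheritance, its children. */
    public static void disable() {
        PDFBOX_LOGGER.setLevel(Level.OFF);
    }

    /** Re-enables logging at the given level, e.g. for debugging a failing test. */
    public static void enable(Level level) {
        PDFBOX_LOGGER.setLevel(level);
    }

    public static void main(String[] args) {
        disable();
        Logger child = Logger.getLogger("org.apache.pdfbox.preflight");
        child.severe("not printed while logging is disabled");
        enable(Level.INFO);
        child.info("printed again after re-enabling");
    }
}
```

In a test suite, disable() would typically be called once in the set-up code so that the setting can be reverted in one place.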





[jira] [Resolved] (PDFBOX-2190) Disable console logging for preflight Isartor tests

2014-07-04 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson resolved PDFBOX-2190.
-

   Resolution: Fixed
Fix Version/s: 2.0.0

Done in r1607923

> Disable console logging for preflight Isartor tests
> ---
>
> Key: PDFBOX-2190
> URL: https://issues.apache.org/jira/browse/PDFBOX-2190
> Project: PDFBox
>  Issue Type: Wish
>Reporter: John Hewson
>Priority: Minor
> Fix For: 2.0.0
>
>
> The preflight Isartor test suite writes PDFBox's internal LOG messages out to 
> the build console, which pollutes it. My solution is just to disable logging, 
> it can always be re-enabled for debugging.





[jira] [Commented] (PDFBOX-2179) Regression: Some isartor tests are not passing in 2.0.0

2014-07-04 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052641#comment-14052641
 ] 

John Hewson commented on PDFBOX-2179:
-

Looking at PDFStreamEngine, part of the problem that resulted in the "Call to 
processSubStream() before processStream() or initStream()" error was that 
PDFStreamEngine#resetEngine() did not really do what its name suggested. As this 
API was only used by TextStripper, I've made it private in that class only, to 
avoid future confusion. Done in [r1607927|http://svn.apache.org/r1607927].

> Regression: Some isartor tests are not passing in 2.0.0
> ---
>
> Key: PDFBOX-2179
> URL: https://issues.apache.org/jira/browse/PDFBOX-2179
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: Guillaume Bailleul
>Assignee: John Hewson
>Priority: Critical
>  Labels: regression
> Fix For: 2.0.0
>
>
> It is possible to check preflight with the isartor files while building 
> pdfbox. The option is not set by default. It can be done with the command 
> line :
> {quote}
> mvn test -Dskip.external.resources=false
> {quote}
> On July 2nd, 9 tests are failing :
> {quote}
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-f.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-h.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:175 isartor-6-2-3-3-t02-fail-i.pdf : 
> IllegalStateException raised , message=Call to processSubStream() before 
> processStream() or initStream()
> TestIsartor.validate:170 isartor-6-2-7-t01-fail-a.pdf : Invalid error code 
> returned. expected:<2.\[3.2\]> but was:<2\.[1.9\]>
> TestIsartor.validate:159 isartor-6-3-2-t01-fail-a.pdf : Invalid error code 
> returned. Expected 3.2.2, found \[3.1.1 3.3.2 \]
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-f.pdf : NullPointerException 
> raised , message=null
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-g.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-3-4-t01-fail-h.pdf : IllegalStateException 
> raised , message=Call to processSubStream() before processStream() or 
> initStream()
> TestIsartor.validate:175 isartor-6-9-t02-fail-a.pdf : NullPointerException 
> raised , message=null
> {quote}
> All is working fine with the last released version.





[jira] [Resolved] (PDFBOX-2188) java.io.IOException: Expected a name or array but got: COSObject{1823, 0}

2014-07-04 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson resolved PDFBOX-2188.
-

   Resolution: Fixed
Fix Version/s: 2.0.0

Actually, this is legitimate: we forgot to handle indirect objects. I've fixed 
this in [r1607933|http://svn.apache.org/r1607933].

> java.io.IOException: Expected a name or array but got: COSObject{1823, 0}
> -
>
> Key: PDFBOX-2188
> URL: https://issues.apache.org/jira/browse/PDFBOX-2188
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
> Fix For: 2.0.0
>
> Attachments: 000-000324.pdf
>
>
> I get this with the attached file:
> {code}
> 04.07.2014 19:20:59.356 ERROR [main] org.apache.pdfbox.pdmodel.PDResources:329 - error while creating a colorspace
> java.io.IOException: Expected a name or array but got: COSObject{1823, 0}
>   at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:162)
>   at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:55)
>   at org.apache.pdfbox.pdmodel.graphics.color.PDColorSpace.create(PDColorSpace.java:145)
>   at org.apache.pdfbox.pdmodel.PDResources.getColorSpaces(PDResources.java:325)
>   at org.apache.pdfbox.util.operator.SetNonStrokingColorSpace.process(SetNonStrokingColorSpace.java:44)
>   at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:499)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:264)
>   at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:223)
>   at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:199)
>   at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:183)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderPage(PDFRenderer.java:228)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:160)
>   at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:109)
>   at pdfboxziptest.PDFBoxZipTest.doPdf(PDFBoxZipTest.java:101)
>   at pdfboxziptest.PDFBoxZipTest.main(PDFBoxZipTest.java:72)
> {code}
> In the PDF:
> {code}
> 36 0 obj
> [ 
> /Pattern 1823 0 R 
> ]
> endobj
> {code}
> {code}
> 1823 0 obj
> [ 
> /ICCBased 1851 0 R 
> ]
> endobj
> {code}
> I assume that it is indeed a syntax error, but I don't really get how it 
> should have been done, and whether it can be fixed.
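
The situation in the two objects above (an array whose second element is a
reference to another array) can be illustrated with a small stand-alone model.
This is only a sketch of the general idea of resolving indirect references
before inspecting a colour space entry; it is not the code from r1607933, and
Ref, resolve and describe are hypothetical names, not PDFBox classes or methods.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndirectColorSpaceSketch {
    // Hypothetical stand-in for an indirect reference such as "1823 0 R".
    public static final class Ref {
        final int number;
        public Ref(int number) { this.number = number; }
    }

    // Follow references until a direct object (name or array) is reached.
    public static Object resolve(Object o, Map<Integer, Object> objects) {
        while (o instanceof Ref) {
            o = objects.get(((Ref) o).number);
        }
        return o;
    }

    // Names are modelled as String, arrays as List; anything else is an error.
    public static String describe(Object colorSpace, Map<Integer, Object> objects) {
        Object cs = resolve(colorSpace, objects);
        if (cs instanceof String) {
            return (String) cs;
        }
        if (cs instanceof List) {
            List<?> array = (List<?>) cs;
            String family = (String) resolve(array.get(0), objects);
            if ("Pattern".equals(family) && array.size() > 1) {
                // The underlying colour space may itself be an indirect object.
                return "Pattern(" + describe(array.get(1), objects) + ")";
            }
            return family;
        }
        throw new IllegalArgumentException("Expected a name or array but got: " + cs);
    }

    public static void main(String[] args) {
        Map<Integer, Object> objects = new HashMap<Integer, Object>();
        objects.put(36, Arrays.<Object>asList("Pattern", new Ref(1823)));
        objects.put(1823, Arrays.<Object>asList("ICCBased", new Ref(1851)));
        System.out.println(describe(new Ref(36), objects));
    }
}
```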





Re: [jira] [Closed] (PDFBOX-1384) Proposals for a new PDNameTreeNode and PDNumberTreeNode

2014-07-04 Thread John Hewson
That’s certainly possible. I think part of the issue is that (speaking for 
myself) I’d rather see number and name trees handled more abstractly: the PD 
model should hide the details rather than exposing them all. Ideally we would 
end up with much less code.

-- John

On 4 Jul 2014, at 12:40, dnt  wrote:

> I'm wondering whether anyone has taken a closer look at it with a view to 
> integrating it into 2.0. It is a bit old by now, but not necessarily outdated. 
> If you think the changes make sense, I can create a patch that can be applied 
> to 2.0. The same holds true for the other proposals that I submitted on the 
> same day.
> 
> Dominic
> 
> On Thursday, 3 July 2014, 01:27:25, John Hewson wrote:
>> [
>> https://issues.apache.org/jira/browse/PDFBOX-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>> 
>> John Hewson closed PDFBOX-1384.
>> ---
>> 
>>Resolution: Won't Fix
>> 
>> Closing due to the age of this patch.
>> 
>>> Proposals for a new PDNameTreeNode and PDNumberTreeNode
>>> ---
>>> 
>>>Key: PDFBOX-1384
>>>URL: https://issues.apache.org/jira/browse/PDFBOX-1384
>>> 
>>>Project: PDFBox
>>> 
>>> Issue Type: Improvement
>>> 
>>>   Reporter: Dominic Tubach
>>>   Priority: Minor
>>> 
>>>Fix For: 2.0.0
>>> 
>>>Attachments: DTPDNameTreeNode.java, DTPDNameTreeNodeTest.java,
>>>DTPDNumberTreeNode.java, DTPDNumberTreeNodeTest.java
>>> 
>>> Attached are proposals for a new PDNameTreeNode and a new
>>> PDNumberTreeNode. (As both are very similar, I put them in one instead of
>>> two issues.) Main differences:
>>> - type safety through generics.
>>> - it's always clear which types of objects the array holds.
>>> - flexible object conversion through COSBaseConverter.
>>> - remove method.
>>> - size and isEmpty method.
>>> - correct updating of limits (even in parent nodes) when setting kids,
>>> names or removing values. (Does not set limits in root node as defined by
>>> the PDF spec.)
>>> - removes empty child nodes.
>>> Drawbacks:
>>> - replacing the existing classes would require changes in existing code.
>>> - requires (as of now) Java 1.6. (It might be enough to remove the
>>> @Override annotations for Java 1.5 compatibility.)
>>> The required COSBaseConverter can be found in issue #PDFBOX-1383.
>>> (To avoid conflicts with the existing classes, I prefixed everything with
>>> my initials.)



[jira] [Updated] (PDFBOX-2185) Rotation and skew not applied on rectangles

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2185:


Affects Version/s: 1.8.7
   1.8.6

> Rotation and skew not applied on rectangles
> ---
>
> Key: PDFBOX-2185
> URL: https://issues.apache.org/jira/browse/PDFBOX-2185
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.6, 1.8.7, 2.0.0
>Reporter: Petr Slaby
> Attachments: AppendRectangleToPath.java.patch, example_013.pdf
>
>
> When rendering the attached example, rotation and skew of rectangles are not 
> applied properly. The reason is that AppendRectangleToPath transforms only 
> the start and end points and builds a non-rotated, non-skewed result from them. 
> Instead, each corner of the rectangle has to be transformed separately, as 
> shown in the attached patch.
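
The corner-by-corner approach described above can be sketched with plain AWT
classes. This is an illustrative sketch, not the attached patch: the class and
method names are hypothetical, and the current transformation matrix is assumed
to be available as a java.awt.geom.AffineTransform.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.GeneralPath;
import java.awt.geom.Point2D;

public class RectangleCorners {
    // Build the path of a rectangle (x, y, w, h) by transforming all four
    // corners, so that rotation and skew in the matrix are preserved.
    public static GeneralPath transformedRectangle(AffineTransform ctm,
            double x, double y, double w, double h) {
        Point2D p0 = ctm.transform(new Point2D.Double(x, y), null);
        Point2D p1 = ctm.transform(new Point2D.Double(x + w, y), null);
        Point2D p2 = ctm.transform(new Point2D.Double(x + w, y + h), null);
        Point2D p3 = ctm.transform(new Point2D.Double(x, y + h), null);

        GeneralPath path = new GeneralPath();
        path.moveTo(p0.getX(), p0.getY());
        path.lineTo(p1.getX(), p1.getY());
        path.lineTo(p2.getX(), p2.getY());
        path.lineTo(p3.getX(), p3.getY());
        path.closePath();
        return path;
    }

    public static void main(String[] args) {
        // A 2 x 1 rectangle rotated by 90 degrees should become 1 x 2.
        AffineTransform rotate = AffineTransform.getRotateInstance(Math.PI / 2);
        System.out.println(transformedRectangle(rotate, 0, 0, 2, 1).getBounds2D());
    }
}
```

Transforming only two opposite corners and rebuilding an axis-aligned rectangle
from them would lose exactly the rotation and skew this issue is about.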





[jira] [Resolved] (PDFBOX-2185) Rotation and skew not applied on rectangles

2014-07-04 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-2185.
-

   Resolution: Fixed
Fix Version/s: 2.0.0
   1.8.7
 Assignee: Tilman Hausherr

Thanks for finding and solving this! I committed your patch in rev 1607970 for 
the trunk and rev 1607971 for the 1.8 branch.

> Rotation and skew not applied on rectangles
> ---
>
> Key: PDFBOX-2185
> URL: https://issues.apache.org/jira/browse/PDFBOX-2185
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.6, 1.8.7, 2.0.0
>Reporter: Petr Slaby
>Assignee: Tilman Hausherr
> Fix For: 1.8.7, 2.0.0
>
> Attachments: AppendRectangleToPath.java.patch, example_013.pdf
>
>
> When rendering the attached example, rotation and skew of rectangles are not 
> applied properly. The reason is that AppendRectangleToPath transforms only 
> the start and end points and builds a non-rotated, non-skewed result from them. 
> Instead, each corner of the rectangle has to be transformed separately, as 
> shown in the attached patch.





[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes

2014-07-04 Thread Shaola Ren (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052793#comment-14052793
 ] 

Shaola Ren commented on PDFBOX-1915:


Thanks, glad to hear this. I hope there is some fun content in type 2 and 3 
shading; they look more real. I'll put type 6 and 7 shading aside for now, 
not because I'm stuck but because any work takes time. If you want, you can 
start a new thread about type 2 and 3 shading; it won't really matter.

> Implement shading with Coons and tensor-product patch meshes
> 
>
> Key: PDFBOX-1915
> URL: https://issues.apache.org/jira/browse/PDFBOX-1915
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 1.8.5, 1.8.6, 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Shaola Ren
>  Labels: graphical, gsoc2014, java, math, shading
> Fix For: 2.0.0
>
> Attachments: CIB-coons-vs-tensormesh.pdf, CIB-coonsmesh.pdf, 
> CONICAL.pdf, GWG060_Shading_x1a.pdf, GWG060_Shading_x1a_1.png, HSBWHEEL.pdf, 
> McAfee-ShadingType7.pdf, Shadingtype6week1.pdf, TENSOR.pdf, XYZsweep.pdf, 
> _gwg060_shading_x1a.pdf-1.png, _mcafee-shadingtype7.pdf-1.png, 
> asy-coons-but-really-tensor.pdf, asy-tensor-rainbow.pdf, asy-tensor.pdf, 
> coons-function.pdf, coons-function.ps, coons-nofunction-CMYK.pdf, 
> coons-nofunction-CMYK.ps, coons-nofunction-Duotone.pdf, 
> coons-nofunction-Duotone.ps, coons-nofunction-Gray.pdf, 
> coons-nofunction-Gray.ps, coons-nofunction-RGB.pdf, coons-nofunction-RGB.ps, 
> coons2-function.pdf, coons2-function.ps, coons4-function.ps, crestron-p9.pdf, 
> eci_altona-test-suite-v2_technical_H.pdf, example_030.pdf, failedTest.rar, 
> lamp_cairo.pdf, lamp_cairo7_0.png, lamp_cairo7_1.png, lamp_cairo7_1.png, 
> lineRasterization.jpg, mcafeeU5.pdf, mcafeeU5_1.png, mcafeeu5.pdf-1.png, 
> pass4FlagTest.rar, patchCases.jpg, patchMap.jpg, shading6ContourTest.rar, 
> shading6Done.rar, shading7.rar, tensor-nofunction-RGB.pdf, 
> tensor-nofunction-RGB.ps, tensor-nofunction-RGB_1.png, 
> tensor4-nofunction.pdf, tensor4-nofunction.ps, tensor4-nofunction_1.png, 
> updateshading6ContourTest.rar
>
>
> Of the seven shading methods described in the PDF specification, type 6 
> (Coons patch meshes) and type 7 (Tensor-product patch meshes) haven't been 
> implemented. I have done type 1, 4 and 5, but I don't know the math for type 
> 6 and 7. My math days are decades away.
> Knowledge prerequisites: 
> - java, although you don't have to be a java ace, just feel comfortable
> - math: you should know what "cubic Bézier curves", "Degenerate Bézier 
> curves", "bilinear interpolation", "tensor-product", "affine transform 
> matrix" and "Bernstein polynomials" are, or be able to learn them
> - maven (basic)
> - svn (basic)
> - an IDE like Netbeans or Eclipse or IntelliJ (basic)
> - ideally, you are either a math student who likes to program, or a computer 
> science student who is specializing in graphics.
> A first look at PDFBOX: try the command utility here:
> https://pdfbox.apache.org/commandline/#pdfToImage
> and use your favorite PDF, or the PDFs mentioned in PDFBOX-615, these have 
> the shading types that are already implemented.
> Some simple source code to convert to images:
> String filename = "blah.pdf";
> PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
> List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
> int page = 0;
> for (PDPage pdPage : pdPages)
> {
>     ++page;
>     BufferedImage bim = RenderUtil.convertToImage(pdPage,
>             BufferedImage.TYPE_BYTE_BINARY, 300);
>     ImageIO.write(bim, "png", new File(filename + page + ".png"));
> }
> document.close();
> You are not starting from scratch. The implementation of type 4 and 5 shows 
> you how to read parameters from the PDF and set the graphics. You don't have 
> to learn the complete PDF spec, only 15 pages related to the two shading 
> types, and 6 pages about shading in general. The PDF specification is here:
> http://www.adobe.com/devnet/pdf/pdf_reference.html
> The tricky parts are:
> - decide whether a point(x,y) is inside or outside a patch
> - decide the color of a point within the patch
> To get an idea about the code, look at the classes GouraudTriangle, 
> GouraudShadingContext, Type4ShadingContext and Vertex here
> https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/shading/
> or download the whole project from the repository.
> https://pdfbox.apache.org/downloads.html#scm
> If you want to see the existing code in the debugger with a Gouraud shading, 
> try this file:
> http://asymptote.sourceforge.net/gallery/Gouraud.pdf
> Testing:
> I have attached several example PDFs. To see which one has which shading, 
> open them wit

[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes

2014-07-04 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052814#comment-14052814
 ] 

Tilman Hausherr commented on PDFBOX-1915:
-

Have a look at PDFBOX-2117, but please don't read the dialog yet, only look at 
the attached PDF files, not the java files, and not the dialog after them.

This issue was opened by "power user" Petr who has done a lot for the project 
in the last few weeks but who didn't know about GSoC2014. I would ask you to 
first use the profiler with the files, to look at the source code for 
optimization possibilities. Maybe you'll have the same ideas as in the java 
files / the dialog there, maybe you have different ones.

Note that optimization may be needed not only in the shading package but 
possibly also in the function package. Most of it involves function type 2.
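
For context, a type 2 function in the PDF specification is an exponential
interpolation, f(x) = C0 + x^N * (C1 - C0), evaluated per colour component. The
following is a minimal sketch of such an evaluation with one possible shortcut
for the linear case N = 1. It is not PDFBox's implementation: the class and
method names are hypothetical, and x is assumed to be already clipped to the
function's domain.

```java
import java.util.Arrays;

public class ExponentialInterpolationSketch {
    // Evaluate C0 + x^N * (C1 - C0) for every output component.
    public static float[] eval(float[] c0, float[] c1, float n, float x) {
        // Skip the comparatively expensive Math.pow call when N is exactly 1.
        float xToN = (n == 1f) ? x : (float) Math.pow(x, n);
        float[] result = new float[c0.length];
        for (int i = 0; i < c0.length; i++) {
            result[i] = c0[i] + xToN * (c1[i] - c0[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        float[] c0 = {0f, 0f, 0f};
        float[] c1 = {1f, 0.5f, 0f};
        System.out.println(Arrays.toString(eval(c0, c1, 1f, 0.5f)));
    }
}
```

Whether a shortcut like this pays off is something the profiler would have to
show; it is meant only as an example of the kind of hot spot to look for.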

> Implement shading with Coons and tensor-product patch meshes
> 
>
> Key: PDFBOX-1915
> URL: https://issues.apache.org/jira/browse/PDFBOX-1915
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 1.8.5, 1.8.6, 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Shaola Ren
>  Labels: graphical, gsoc2014, java, math, shading
> Fix For: 2.0.0
>
> Attachments: CIB-coons-vs-tensormesh.pdf, CIB-coonsmesh.pdf, 
> CONICAL.pdf, GWG060_Shading_x1a.pdf, GWG060_Shading_x1a_1.png, HSBWHEEL.pdf, 
> McAfee-ShadingType7.pdf, Shadingtype6week1.pdf, TENSOR.pdf, XYZsweep.pdf, 
> _gwg060_shading_x1a.pdf-1.png, _mcafee-shadingtype7.pdf-1.png, 
> asy-coons-but-really-tensor.pdf, asy-tensor-rainbow.pdf, asy-tensor.pdf, 
> coons-function.pdf, coons-function.ps, coons-nofunction-CMYK.pdf, 
> coons-nofunction-CMYK.ps, coons-nofunction-Duotone.pdf, 
> coons-nofunction-Duotone.ps, coons-nofunction-Gray.pdf, 
> coons-nofunction-Gray.ps, coons-nofunction-RGB.pdf, coons-nofunction-RGB.ps, 
> coons2-function.pdf, coons2-function.ps, coons4-function.ps, crestron-p9.pdf, 
> eci_altona-test-suite-v2_technical_H.pdf, example_030.pdf, failedTest.rar, 
> lamp_cairo.pdf, lamp_cairo7_0.png, lamp_cairo7_1.png, lamp_cairo7_1.png, 
> lineRasterization.jpg, mcafeeU5.pdf, mcafeeU5_1.png, mcafeeu5.pdf-1.png, 
> pass4FlagTest.rar, patchCases.jpg, patchMap.jpg, shading6ContourTest.rar, 
> shading6Done.rar, shading7.rar, tensor-nofunction-RGB.pdf, 
> tensor-nofunction-RGB.ps, tensor-nofunction-RGB_1.png, 
> tensor4-nofunction.pdf, tensor4-nofunction.ps, tensor4-nofunction_1.png, 
> updateshading6ContourTest.rar
>
>
> Of the seven shading methods described in the PDF specification, type 6 
> (Coons patch meshes) and type 7 (Tensor-product patch meshes) haven't been 
> implemented. I have done type 1, 4 and 5, but I don't know the math for type 
> 6 and 7. My math days are decades away.
> Knowledge prerequisites: 
> - java, although you don't have to be a java ace, just feel comfortable
> - math: you should know what "cubic Bézier curves", "Degenerate Bézier 
> curves", "bilinear interpolation", "tensor-product", "affine transform 
> matrix" and "Bernstein polynomials" are, or be able to learn them
> - maven (basic)
> - svn (basic)
> - an IDE like Netbeans or Eclipse or IntelliJ (basic)
> - ideally, you are either a math student who likes to program, or a computer 
> science student who is specializing in graphics.
> A first look at PDFBOX: try the command utility here:
> https://pdfbox.apache.org/commandline/#pdfToImage
> and use your favorite PDF, or the PDFs mentioned in PDFBOX-615, these have 
> the shading types that are already implemented.
> Some simple source code to convert to images:
> String filename = "blah.pdf";
> PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
> List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
> int page = 0;
> for (PDPage pdPage : pdPages)
> {
>     ++page;
>     BufferedImage bim = RenderUtil.convertToImage(pdPage,
>             BufferedImage.TYPE_BYTE_BINARY, 300);
>     ImageIO.write(bim, "png", new File(filename + page + ".png"));
> }
> document.close();
> You are not starting from scratch. The implementation of type 4 and 5 shows 
> you how to read parameters from the PDF and set the graphics. You don't have 
> to learn the complete PDF spec, only 15 pages related to the two shading 
> types, and 6 pages about shading in general. The PDF specification is here:
> http://www.adobe.com/devnet/pdf/pdf_reference.html
> The tricky parts are:
> - decide whether a point(x,y) is inside or outside a patch
> - decide the color of a point within the patch
> To get an idea about the code, look at the classes GouraudTriangle, 
> GouraudShadingContext, Type4ShadingContext and Vertex here
> https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/shading/
> or download the w