[jira] [Commented] (PDFBOX-1999) JBIG2Filter - FlateDecoded Globals Table

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946224#comment-13946224
 ] 

Tilman Hausherr commented on PDFBOX-1999:
-

Please send me the PDF to tilman at snafu dot de.

> JBIG2Filter - FlateDecoded Globals Table
> 
>
> Key: PDFBOX-1999
> URL: https://issues.apache.org/jira/browse/PDFBOX-1999
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Dave Smith
> Attachments: pdfbox.patch
>
>
> When rendering a JBIG2 image whose Globals table has a filter (in this case 
> it is compressed), JBIG2Filter was calling getFilteredStream, which sounds 
> correct but in fact returns the raw, still-encoded data rather than the 
> decoded data. It needs to call getUnfilteredStream(). 
> I will submit a patch. I have a PDF to test it on, but it is not public, so 
> the test will have to be done off list.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-1998.
-

   Resolution: Fixed
Fix Version/s: 1.8.5

Fixed in rev 1581244.

> PDF rendering with reversed colors
> --
>
> Key: PDFBOX-1998
> URL: https://issues.apache.org/jira/browse/PDFBOX-1998
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.4, 1.8.5
>Reporter: Tilman Hausherr
>  Labels: mask
> Fix For: 1.8.5
>
> Attachments: PDFBOX-1998.PDF
>
>
> The attached PDF (from Étienne Landry on the user mailing list) is rendered 
> in w/b instead of b/w. This does not happen in the 2.0 version.





[jira] [Updated] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1998:


Labels: mask  (was: )

> PDF rendering with reversed colors
> --
>
> Key: PDFBOX-1998
> URL: https://issues.apache.org/jira/browse/PDFBOX-1998
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.4, 1.8.5
>Reporter: Tilman Hausherr
>  Labels: mask
> Fix For: 1.8.5
>
> Attachments: PDFBOX-1998.PDF
>
>
> The attached PDF (from Étienne Landry on the user mailing list) is rendered 
> in w/b instead of b/w. This does not happen in the 2.0 version.





[jira] [Commented] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread Dave Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946020#comment-13946020
 ] 

Dave Smith commented on PDFBOX-1996:


The pdf is not public. I can send it to you off list.

My first thought was to optimize the function, however there is more than one.

dup, 0, mul, exch, dup, 0, mul, exch, dup, 0, mul, exch, 1, mul
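
For reference, the operator sequence above can be traced by hand with a tiny 
stack evaluator. This is only a sketch: it assumes the function receives a 
single tint value on the stack and that dup, exch and mul have their usual 
PostScript meaning. TintFunctionTrace and eval are names made up for this 
illustration; they are not PDFBox classes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical helper for tracing the operator list by hand; not part of PDFBox.
final class TintFunctionTrace {

    // Evaluates a comma-separated operator list on a stack seeded with one tint value.
    // Returns the final stack contents, bottom first.
    static float[] eval(String program, float tint) {
        Deque<Float> stack = new ArrayDeque<>();
        stack.push(tint);
        for (String token : program.split(",")) {
            String op = token.trim();
            switch (op) {
                case "dup":
                    // duplicate the top element
                    stack.push(stack.peek());
                    break;
                case "exch": {
                    // swap the two topmost elements
                    float top = stack.pop();
                    float below = stack.pop();
                    stack.push(top);
                    stack.push(below);
                    break;
                }
                case "mul": {
                    // replace the two topmost elements with their product
                    float b = stack.pop();
                    float a = stack.pop();
                    stack.push(a * b);
                    break;
                }
                default:
                    // anything else is treated as a numeric literal
                    stack.push(Float.parseFloat(op));
            }
        }
        // ArrayDeque iterates from the top of the stack, so fill the array backwards.
        float[] result = new float[stack.size()];
        int i = result.length;
        for (float value : stack) {
            result[--i] = value;
        }
        return result;
    }

    public static void main(String[] args) {
        String program = "dup, 0, mul, exch, dup, 0, mul, exch, dup, 0, mul, exch, 1, mul";
        System.out.println(Arrays.toString(eval(program, 0.5f)));
    }
}
```

Under those assumptions a tint t comes out as (0, 0, 0, t): three components 
are multiplied by zero and the last one is passed through unchanged.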



> PDSeparation optimization
> -
>
> Key: PDFBOX-1996
> URL: https://issues.apache.org/jira/browse/PDFBOX-1996
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Dave Smith
>Priority: Minor
> Attachments: pdfbox.patch
>
>
> I have a 4-page black-and-white PDF that takes 32 seconds (8 seconds a page) 
> to render. It uses a Separation color space and has to run numerous 
> functions per pixel, which causes the slowdown. I have a patch that 
> precalculates the black and white pixels and caches them instead of 
> calculating them every time. This optimization gets page rendering down to 
> less than a second a page. I will attach my patch. Going forward, I could see 
> caching all calculated colours, but floats in hash maps are tricky.





[jira] [Created] (PDFBOX-1999) JBIG2Filter - FlateDecoded Globals Table

2014-03-24 Thread Dave Smith (JIRA)
Dave Smith created PDFBOX-1999:
--

 Summary: JBIG2Filter - FlateDecoded Globals Table
 Key: PDFBOX-1999
 URL: https://issues.apache.org/jira/browse/PDFBOX-1999
 Project: PDFBox
  Issue Type: Bug
Affects Versions: 2.0.0
Reporter: Dave Smith
 Attachments: pdfbox.patch

When rendering a JBIG2 image whose Globals table has a filter (in this case it 
is compressed), JBIG2Filter was calling getFilteredStream, which sounds correct 
but in fact returns the raw, still-encoded data rather than the decoded data. 
It needs to call getUnfilteredStream().

I will submit a patch. I have a PDF to test it on, but it is not public, so the 
test will have to be done off list.
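
The naming trap described above can be shown without PDFBox at all. The sketch 
below uses only java.util.zip; FlateStreamSketch and its filtered() and 
unfiltered() methods are hypothetical stand-ins that mirror the naming in the 
report, not the real COSStream API. "Filtered" means the filter is still 
applied (compressed bytes), "unfiltered" means it has been undone.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical stand-in for a Flate-encoded PDF stream; not a PDFBox class.
final class FlateStreamSketch {
    // Bytes as they would be stored in the file (Flate-compressed).
    private final byte[] encoded;

    FlateStreamSketch(byte[] plain) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            out.write(plain);
        }
        this.encoded = buffer.toByteArray();
    }

    // "Filtered" view: the filter is still applied, so this yields the encoded bytes.
    InputStream filtered() {
        return new ByteArrayInputStream(encoded);
    }

    // "Unfiltered" view: the filter is undone, so this yields the decoded bytes.
    InputStream unfiltered() {
        return new InflaterInputStream(new ByteArrayInputStream(encoded));
    }

    // Reads a stream to the end and returns its content.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] globals = "pretend JBIG2 globals segment".getBytes(StandardCharsets.US_ASCII);
        FlateStreamSketch stream = new FlateStreamSketch(globals);
        System.out.println("filtered bytes:   " + readAll(stream.filtered()).length);
        System.out.println("unfiltered bytes: " + readAll(stream.unfiltered()).length);
    }
}
```

A decoder that expects plain JBIG2 globals data would need the second view; 
handing it the first one passes compressed bytes it cannot interpret.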





[jira] [Updated] (PDFBOX-1999) JBIG2Filter - FlateDecoded Globals Table

2014-03-24 Thread Dave Smith (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Smith updated PDFBOX-1999:
---

Attachment: pdfbox.patch

Here is the  fix...

> JBIG2Filter - FlateDecoded Globals Table
> 
>
> Key: PDFBOX-1999
> URL: https://issues.apache.org/jira/browse/PDFBOX-1999
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Dave Smith
> Attachments: pdfbox.patch
>
>
> When rendering a JBIG2 image whose Globals table has a filter (in this case 
> it is compressed), JBIG2Filter was calling getFilteredStream, which sounds 
> correct but in fact returns the raw, still-encoded data rather than the 
> decoded data. It needs to call getUnfilteredStream(). 
> I will submit a patch. I have a PDF to test it on, but it is not public, so 
> the test will have to be done off list.





[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945856#comment-13945856
 ] 

John Hewson commented on PDFBOX-1998:
-

Looks good to me, the ImageMask is on the alpha channel, so destination in/out 
is effectively inverting the mask.

> PDF rendering with reversed colors
> --
>
> Key: PDFBOX-1998
> URL: https://issues.apache.org/jira/browse/PDFBOX-1998
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.4, 1.8.5
>Reporter: Tilman Hausherr
> Attachments: PDFBOX-1998.PDF
>
>
> The attached PDF (from Étienne Landry on the user mailing list) is rendered 
> in w/b instead of b/w. This does not happen in the 2.0 version.





[jira] [Updated] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-1996:


Priority: Minor  (was: Major)

> PDSeparation optimization
> -
>
> Key: PDFBOX-1996
> URL: https://issues.apache.org/jira/browse/PDFBOX-1996
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Dave Smith
>Priority: Minor
> Attachments: pdfbox.patch
>
>
> I have a 4-page black-and-white PDF that takes 32 seconds (8 seconds a page) 
> to render. It uses a Separation color space and has to run numerous 
> functions per pixel, which causes the slowdown. I have a patch that 
> precalculates the black and white pixels and caches them instead of 
> calculating them every time. This optimization gets page rendering down to 
> less than a second a page. I will attach my patch. Going forward, I could see 
> caching all calculated colours, but floats in hash maps are tricky.





[jira] [Commented] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945852#comment-13945852
 ] 

John Hewson commented on PDFBOX-1996:
-

Which type of function does your PDF use for the tint transform? (i.e. which 
subclass of PDFunction is used?). It might be possible to speed up the 
underlying function instead so that RGB images will be faster too.

> PDSeparation optimization
> -
>
> Key: PDFBOX-1996
> URL: https://issues.apache.org/jira/browse/PDFBOX-1996
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Dave Smith
> Attachments: pdfbox.patch
>
>
> I have a 4-page black-and-white PDF that takes 32 seconds (8 seconds a page) 
> to render. It uses a Separation color space and has to run numerous 
> functions per pixel, which causes the slowdown. I have a patch that 
> precalculates the black and white pixels and caches them instead of 
> calculating them every time. This optimization gets page rendering down to 
> less than a second a page. I will attach my patch. Going forward, I could see 
> caching all calculated colours, but floats in hash maps are tricky.





[jira] [Comment Edited] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945764#comment-13945764
 ] 

Tilman Hausherr edited comment on PDFBOX-1998 at 3/24/14 11:08 PM:
---

Heh heh :-)  I found this in PDXObjectImage.java:
{code}
// assume default values ([0,1]) for the DecodeArray
// TODO DecodeArray == [1,0]
graphics.setComposite(AlphaComposite.DstIn);
{code}
I tried to replace that with the following lines and it works, but I'd like to 
hear another opinion because my day job doesn't involve [Porter/Duff 
rules|http://ssp.impulsetrain.com/porterduff.html]:
{code}
COSArray decode = getDecode();
if (decode != null && decode.getInt(0) == 1)
    graphics.setComposite(AlphaComposite.DstOut);
else
    graphics.setComposite(AlphaComposite.DstIn);
{code}
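
The effect of the two rules can be checked in isolation with plain java.awt, 
outside PDFBox. A minimal sketch, assuming a fully opaque destination pixel 
and a mask pixel that carries its coverage only in the alpha channel; 
MaskCompositeSketch and alphaAfter are hypothetical names used for this 
illustration.

```java
import java.awt.AlphaComposite;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical 1x1 experiment with the two Porter/Duff rules; not PDFBox code.
final class MaskCompositeSketch {

    // Starts with one opaque black destination pixel, draws a mask pixel with the
    // given alpha (0..255) over it using the given rule, and returns the
    // destination alpha that remains.
    static int alphaAfter(AlphaComposite rule, int maskAlpha) {
        BufferedImage dest = new BufferedImage(1, 1, BufferedImage.TYPE_INT_ARGB);
        dest.setRGB(0, 0, 0xFF000000);
        BufferedImage mask = new BufferedImage(1, 1, BufferedImage.TYPE_INT_ARGB);
        mask.setRGB(0, 0, maskAlpha << 24);

        Graphics2D g = dest.createGraphics();
        try {
            g.setComposite(rule);
            g.drawImage(mask, 0, 0, null);
        } finally {
            g.dispose();
        }
        return dest.getRGB(0, 0) >>> 24;
    }

    public static void main(String[] args) {
        System.out.println("DstIn,  opaque mask:      " + alphaAfter(AlphaComposite.DstIn, 255));
        System.out.println("DstIn,  transparent mask: " + alphaAfter(AlphaComposite.DstIn, 0));
        System.out.println("DstOut, opaque mask:      " + alphaAfter(AlphaComposite.DstOut, 255));
        System.out.println("DstOut, transparent mask: " + alphaAfter(AlphaComposite.DstOut, 0));
    }
}
```

With DstIn the destination survives where the mask is opaque; with DstOut it 
survives where the mask is transparent, so switching rules inverts the mask.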



was (Author: tilman):
Heh heh :-)  I found this in PDXObjectImage.java:
{code}
// assume default values ([0,1]) for the DecodeArray
// TODO DecodeArray == [1,0]
{code}
I tried to replace the line below that one with this and it works, but I'd like 
to hear another opinion because my day job doesn't involve [Porter/Duff 
rules|http://ssp.impulsetrain.com/porterduff.html]:
{code}
COSArray decode = getDecode();
if (decode != null && decode.getInt(0) == 1)
    graphics.setComposite(AlphaComposite.DstOut);
else
    graphics.setComposite(AlphaComposite.DstIn);
{code}


> PDF rendering with reversed colors
> --
>
> Key: PDFBOX-1998
> URL: https://issues.apache.org/jira/browse/PDFBOX-1998
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.4, 1.8.5
>Reporter: Tilman Hausherr
> Attachments: PDFBOX-1998.PDF
>
>
> The attached PDF (from Étienne Landry on the user mailing list) is rendered 
> in w/b instead of b/w. This does not happen in the 2.0 version.





[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945764#comment-13945764
 ] 

Tilman Hausherr commented on PDFBOX-1998:
-

Heh heh :-)  I found this in PDXObjectImage.java:
{code}
// assume default values ([0,1]) for the DecodeArray
// TODO DecodeArray == [1,0]
{code}
I tried to replace the line below that one with this and it works, but I'd like 
to hear another opinion because my day job doesn't involve [Porter/Duff 
rules|http://ssp.impulsetrain.com/porterduff.html]:
{code}
COSArray decode = getDecode();
if (decode != null && decode.getInt(0) == 1)
    graphics.setComposite(AlphaComposite.DstOut);
else
    graphics.setComposite(AlphaComposite.DstIn);
{code}


> PDF rendering with reversed colors
> --
>
> Key: PDFBOX-1998
> URL: https://issues.apache.org/jira/browse/PDFBOX-1998
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.4, 1.8.5
>Reporter: Tilman Hausherr
> Attachments: PDFBOX-1998.PDF
>
>
> The attached PDF (from Étienne Landry on the user mailing list) is rendered 
> in w/b instead of b/w. This does not happen in the 2.0 version.





[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945578#comment-13945578
 ] 

Tilman Hausherr commented on PDFBOX-1998:
-

What I did notice: the PDF has this:

/ImageMask true /Decode [ 1 0 ]

changing it to 

/ImageMask true /Decode [ 0 1 ]

changes the Adobe Viewer rendering, but not the rendering with PDFBox, which 
suggests that the Decode array is ignored.

> PDF rendering with reversed colors
> --
>
> Key: PDFBOX-1998
> URL: https://issues.apache.org/jira/browse/PDFBOX-1998
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.4, 1.8.5
>Reporter: Tilman Hausherr
> Attachments: PDFBOX-1998.PDF
>
>
> The attached PDF (from Étienne Landry on the user mailing list) is rendered 
> in w/b instead of b/w. This does not happen in the 2.0 version.





[jira] [Updated] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1998:


Attachment: PDFBOX-1998.PDF

> PDF rendering with reversed colors
> --
>
> Key: PDFBOX-1998
> URL: https://issues.apache.org/jira/browse/PDFBOX-1998
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 1.8.4, 1.8.5
>Reporter: Tilman Hausherr
> Attachments: PDFBOX-1998.PDF
>
>
> The attached PDF (from Étienne Landry on the user mailing list) is rendered 
> in w/b instead of b/w. This does not happen in the 2.0 version.





[jira] [Created] (PDFBOX-1998) PDF rendering with reversed colors

2014-03-24 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-1998:
---

 Summary: PDF rendering with reversed colors
 Key: PDFBOX-1998
 URL: https://issues.apache.org/jira/browse/PDFBOX-1998
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 1.8.4, 1.8.5
Reporter: Tilman Hausherr


The attached PDF (from Étienne Landry on the user mailing list) is rendered in 
w/b instead of b/w. This does not happen in the 2.0 version.





[jira] [Commented] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945463#comment-13945463
 ] 

Tilman Hausherr commented on PDFBOX-1996:
-

While I'm not the one who will commit your patch (I don't know enough of that 
topic), do you have a non-confidential PDF that would use your patch, so that 
we can see that the result is the same before and after?

> PDSeparation optimization
> -
>
> Key: PDFBOX-1996
> URL: https://issues.apache.org/jira/browse/PDFBOX-1996
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Dave Smith
> Attachments: pdfbox.patch
>
>
> I have a 4-page black-and-white PDF that takes 32 seconds (8 seconds a page) 
> to render. It uses a Separation color space and has to run numerous 
> functions per pixel, which causes the slowdown. I have a patch that 
> precalculates the black and white pixels and caches them instead of 
> calculating them every time. This optimization gets page rendering down to 
> less than a second a page. I will attach my patch. Going forward, I could see 
> caching all calculated colours, but floats in hash maps are tricky.





[jira] [Updated] (PDFBOX-1996) PDSeparation optimization

2014-03-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1996:


Summary: PDSeparation optimization  (was: PDSeparation separation 
optimization)

> PDSeparation optimization
> -
>
> Key: PDFBOX-1996
> URL: https://issues.apache.org/jira/browse/PDFBOX-1996
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Dave Smith
> Attachments: pdfbox.patch
>
>
> I have a 4-page black-and-white PDF that takes 32 seconds (8 seconds a page) 
> to render. It uses a Separation color space and has to run numerous 
> functions per pixel, which causes the slowdown. I have a patch that 
> precalculates the black and white pixels and caches them instead of 
> calculating them every time. This optimization gets page rendering down to 
> less than a second a page. I will attach my patch. Going forward, I could see 
> caching all calculated colours, but floats in hash maps are tricky.





[jira] [Updated] (PDFBOX-1996) PDSeparation separation optimization

2014-03-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1996:


Summary: PDSeparation separation optimization  (was: PDSeparation separtion 
optimization)

> PDSeparation separation optimization
> 
>
> Key: PDFBOX-1996
> URL: https://issues.apache.org/jira/browse/PDFBOX-1996
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Dave Smith
> Attachments: pdfbox.patch
>
>
> I have a 4-page black-and-white PDF that takes 32 seconds (8 seconds a page) 
> to render. It uses a Separation color space and has to run numerous 
> functions per pixel, which causes the slowdown. I have a patch that 
> precalculates the black and white pixels and caches them instead of 
> calculating them every time. This optimization gets page rendering down to 
> less than a second a page. I will attach my patch. Going forward, I could see 
> caching all calculated colours, but floats in hash maps are tricky.





[jira] [Updated] (PDFBOX-1997) CIE LAB item missing in rendering

2014-03-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1997:


Summary: CIE LAB item missing in rendering  (was: CIA LAB item missing in 
rendering)

> CIE LAB item missing in rendering
> -
>
> Key: PDFBOX-1997
> URL: https://issues.apache.org/jira/browse/PDFBOX-1997
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
> Attachments: text_graphic_image.pdf, text_graphic_image.pdf-1.png
>
>
> The file from PDFBOX-1681 is missing the "CIELAB" output, it was there a few 
> weeks ago.





[jira] [Created] (PDFBOX-1997) CIA LAB item missing in rendering

2014-03-24 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-1997:
---

 Summary: CIA LAB item missing in rendering
 Key: PDFBOX-1997
 URL: https://issues.apache.org/jira/browse/PDFBOX-1997
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
Reporter: Tilman Hausherr


The file from PDFBOX-1681 is missing the "CIELAB" output, it was there a few 
weeks ago.





[jira] [Updated] (PDFBOX-1997) CIA LAB item missing in rendering

2014-03-24 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-1997:


Attachment: text_graphic_image.pdf-1.png
text_graphic_image.pdf

> CIA LAB item missing in rendering
> -
>
> Key: PDFBOX-1997
> URL: https://issues.apache.org/jira/browse/PDFBOX-1997
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
> Attachments: text_graphic_image.pdf, text_graphic_image.pdf-1.png
>
>
> The file from PDFBOX-1681 is missing the "CIELAB" output, it was there a few 
> weeks ago.





[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945386#comment-13945386
 ] 

Tilman Hausherr commented on PDFBOX-1994:
-

What happens if you use the PDF app with the PDFReader option?
http://www.apache.org/dyn/closer.cgi/pdfbox/1.8.4/pdfbox-app-1.8.4.jar

> PDDocument.load(filename.pdf) hangs for pdf files having size
> -
>
> Key: PDFBOX-1994
> URL: https://issues.apache.org/jira/browse/PDFBOX-1994
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.4
>Reporter: brijesh
>
> I am using the code below to load my PDF. The file is not zero-sized, has 
> full permissions and is not corrupt, but I didn't get any error from the 
> code; it just hangs. 
> It works locally, but not on the server.
> (I created jar files and then an exe; the .exe is executed on the server.)
> Using Java 1.4.
> PDDocument pdf=PDDocument.load("d:\\filename.pdf");
> pdf.print();
> Please tell me why the same code does not work on the server.





[jira] [Updated] (PDFBOX-1996) PDSeparation separtion optimization

2014-03-24 Thread Dave Smith (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Smith updated PDFBOX-1996:
---

Attachment: pdfbox.patch

Patch that caches black and white values

> PDSeparation separtion optimization
> ---
>
> Key: PDFBOX-1996
> URL: https://issues.apache.org/jira/browse/PDFBOX-1996
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: Dave Smith
> Attachments: pdfbox.patch
>
>
> I have a 4-page black-and-white PDF that takes 32 seconds (8 seconds a page) 
> to render. It uses a Separation color space and has to run numerous 
> functions per pixel, which causes the slowdown. I have a patch that 
> precalculates the black and white pixels and caches them instead of 
> calculating them every time. This optimization gets page rendering down to 
> less than a second a page. I will attach my patch. Going forward, I could see 
> caching all calculated colours, but floats in hash maps are tricky.





[jira] [Created] (PDFBOX-1996) PDSeparation separtion optimization

2014-03-24 Thread Dave Smith (JIRA)
Dave Smith created PDFBOX-1996:
--

 Summary: PDSeparation separtion optimization
 Key: PDFBOX-1996
 URL: https://issues.apache.org/jira/browse/PDFBOX-1996
 Project: PDFBox
  Issue Type: Improvement
  Components: Rendering
Affects Versions: 2.0.0
Reporter: Dave Smith


I have a 4-page black-and-white PDF that takes 32 seconds (8 seconds a page) to 
render. It uses a Separation color space and has to run numerous functions per 
pixel, which causes the slowdown. I have a patch that precalculates the black 
and white pixels and caches them instead of calculating them every time. This 
optimization gets page rendering down to less than a second a page. I will 
attach my patch. Going forward, I could see caching all calculated colours, but 
floats in hash maps are tricky.
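
The remark about floats in hash maps suggests one way around the problem: if 
the samples are 8-bit, the cache can be indexed by the integer sample value 
instead of the float tint. The sketch below works under that assumption. It is 
not what the attached patch does (the patch caches only the black and white 
values), and TintCacheSketch, lookup and the tintTransform function are 
hypothetical names, not PDFBox API.

```java
import java.util.function.Function;

// Hypothetical cache keyed by the raw 8-bit sample instead of the float tint.
final class TintCacheSketch {
    // Stand-in for the expensive per-pixel tint transform (assumption).
    private final Function<Float, float[]> tintTransform;
    // One slot per possible 8-bit sample; null means "not computed yet".
    private final float[][] cache = new float[256][];
    // Counts how often the expensive function actually ran.
    int misses = 0;

    TintCacheSketch(Function<Float, float[]> tintTransform) {
        this.tintTransform = tintTransform;
    }

    // Returns the transformed colour for a sample in the range 0..255,
    // computing it at most once per distinct sample value.
    float[] lookup(int sample) {
        float[] cached = cache[sample];
        if (cached == null) {
            misses++;
            cached = tintTransform.apply(sample / 255f);
            cache[sample] = cached;
        }
        return cached;
    }

    public static void main(String[] args) {
        TintCacheSketch cache = new TintCacheSketch(t -> new float[] {0f, 0f, 0f, t});
        for (int i = 0; i < 1_000_000; i++) {
            cache.lookup(i % 2 == 0 ? 0 : 255); // a black-and-white page
        }
        System.out.println("transform evaluations: " + cache.misses);
    }
}
```

Because the key is an int index, no float equality or hashing is involved; the 
trade-off is that this only works when the number of distinct input samples is 
small and known in advance.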





[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944983#comment-13944983
 ] 

Tilman Hausherr commented on PDFBOX-1994:
-

The problem is that as long as you insist on using 1.4 we won't know whether 
the problem is related to that or to another cause.

Enter java -version to find out what's really running. 

Btw it could still be a corrupt file even if you can open it with Adobe, so 
please try with different files. 

There's also jstack in the JDK bin directory; search for it to find out how to 
get a thread dump.
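
When the program is wrapped in an .exe it may not be obvious which JRE gets 
picked up, so the same check can also be made from inside the program. A small 
sketch using standard system properties; RuntimeInfoSketch and describe are 
made-up names, and logging this at startup is only a suggestion.

```java
// Hypothetical startup check; prints which Java runtime is actually executing.
final class RuntimeInfoSketch {

    // Builds a one-line description from standard JVM system properties.
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.vendor=" + System.getProperty("java.vendor")
                + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```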



> PDDocument.load(filename.pdf) hangs for pdf files having size
> -
>
> Key: PDFBOX-1994
> URL: https://issues.apache.org/jira/browse/PDFBOX-1994
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.4
>Reporter: brijesh
>
> I am using the code below to load my PDF. The file is not zero-sized, has 
> full permissions and is not corrupt, but I didn't get any error from the 
> code; it just hangs. 
> It works locally, but not on the server.
> (I created jar files and then an exe; the .exe is executed on the server.)
> Using Java 1.4.
> PDDocument pdf=PDDocument.load("d:\\filename.pdf");
> pdf.print();
> Please tell me why the same code does not work on the server.





[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread brijesh (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944946#comment-13944946
 ] 

brijesh commented on PDFBOX-1994:
-

Hi,
I am using PDFBox 1.8 and Java 1.4 locally, and it works for me. 

> PDDocument.load(filename.pdf) hangs for pdf files having size
> -
>
> Key: PDFBOX-1994
> URL: https://issues.apache.org/jira/browse/PDFBOX-1994
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.4
>Reporter: brijesh
>
> I am using the code below to load my PDF. The file is not zero-sized, has 
> full permissions and is not corrupt, but I didn't get any error from the 
> code; it just hangs. 
> It works locally, but not on the server.
> (I created jar files and then an exe; the .exe is executed on the server.)
> Using Java 1.4.
> PDDocument pdf=PDDocument.load("d:\\filename.pdf");
> pdf.print();
> Please tell me why the same code does not work on the server.





[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Andreas Lehmkühler (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944942#comment-13944942
 ] 

Andreas Lehmkühler commented on PDFBOX-1994:


- The stack trace is explicit: at least one class is missing. You somehow mixed 
up your environment. This has nothing to do with PDFBox. 
- As Tilman already mentioned, PDFBox 1.8.x requires Java 1.5. So either you 
are not using PDFBox 1.8.x or you are not using a Java 1.4 environment.
- You should ask someone familiar with Launch4J how to configure it correctly 
to get a working executable or, even better, think about using the JDK 
directly without a jar->exe converter.


> PDDocument.load(filename.pdf) hangs for pdf files having size
> -
>
> Key: PDFBOX-1994
> URL: https://issues.apache.org/jira/browse/PDFBOX-1994
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.4
>Reporter: brijesh
>
> I am using the code below to load my PDF. The file is not zero-sized, has 
> full permissions and is not corrupt, but I didn't get any error from the 
> code; it just hangs. 
> It works locally, but not on the server.
> (I created jar files and then an exe; the .exe is executed on the server.)
> Using Java 1.4.
> PDDocument pdf=PDDocument.load("d:\\filename.pdf");
> pdf.print();
> Please tell me why the same code does not work on the server.





[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread brijesh (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944932#comment-13944932
 ] 

brijesh commented on PDFBOX-1994:
-

Yes, I added println statements in between. That is how I know it hangs on 
that single line.

> PDDocument.load(filename.pdf) hangs for pdf files having size
> -
>
> Key: PDFBOX-1994
> URL: https://issues.apache.org/jira/browse/PDFBOX-1994
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.4
>Reporter: brijesh
>
> I am using the code below to load my PDF. The file is not zero-sized, has 
> full permissions and is not corrupt, but I didn't get any error from the 
> code; it just hangs. 
> It works locally, but not on the server.
> (I created jar files and then an exe; the .exe is executed on the server.)
> Using Java 1.4.
> PDDocument pdf=PDDocument.load("d:\\filename.pdf");
> pdf.print();
> Please tell me why the same code does not work on the server.





[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944924#comment-13944924
 ] 

Tilman Hausherr commented on PDFBOX-1994:
-

Could you add a println before and after each PDFBox-related call so that you 
can tell which one hangs?



> PDDocument.load(filename.pdf) hangs for pdf files having size
> -
>
> Key: PDFBOX-1994
> URL: https://issues.apache.org/jira/browse/PDFBOX-1994
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.4
>Reporter: brijesh
>
> I am using the code below to load my PDF. The file is not zero-sized, has 
> full permissions and is not corrupt, but I didn't get any error from the 
> code; it just hangs. 
> It works locally, but not on the server.
> (I created jar files and then an exe; the .exe is executed on the server.)
> Using Java 1.4.
> PDDocument pdf=PDDocument.load("d:\\filename.pdf");
> pdf.print();
> Please tell me why the same code does not work on the server.





[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Timo Boehme (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944919#comment-13944919
 ] 

Timo Boehme commented on PDFBOX-1994:
-

If you are using a UNIX-style server system, run
kill -3 PROCESS_PID (or kill -QUIT PROCESS_PID)
on the server Java process.
This will give you a stack trace of the Java VM on stdout/stderr(?); maybe this 
is redirected to a log file in your case. You will then see where it hangs. If 
it hangs in PDFBox you can provide us with this stack trace.
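
If sending a signal is not an option (for example on Windows), roughly the 
same information can be collected from inside the JVM. This is a sketch only: 
it relies on Thread.getAllStackTraces(), which needs Java 5 or later, and 
ThreadDumpSketch and dumpAllThreads are made-up names. In a real program it 
would have to be called from a second thread, such as a watchdog timer, since 
the hanging thread cannot call it itself.

```java
import java.util.Map;

// Hypothetical helper that formats the stack traces of all live threads.
final class ThreadDumpSketch {

    // Returns one block per live thread: name, state, then its stack frames.
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append("\" state=")
              .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.err.print(dumpAllThreads());
    }
}
```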



[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread brijesh (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944913#comment-13944913
 ] 

brijesh commented on PDFBOX-1994:
-

Hi,
I tested with PDDocument doc = PDDocument.loadNonSeq(file, null);
but the issue is the same: it hangs.

1- I checked the PDF file size (it is not zero).
2- I added all permissions to the specified folder and files.
3- It is not a corrupt file either.
I still don't understand where exactly the issue is.



[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944897#comment-13944897
 ] 

Tilman Hausherr commented on PDFBOX-1994:
-

Btw, the second parameter of loadNonSeq can be null.

It is important to get rid of external factors like that exe packer and then 
approach the real problem step by step. Currently your main class cannot be 
found. 





[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread Timo Boehme (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944880#comment-13944880
 ] 

Timo Boehme commented on PDFBOX-1994:
-

Looks like you are missing a number of libraries in your jar test case, but 
none of them belong to PDFBox.
It seems to me that in general this is not a PDFBox issue but an issue with 
your server environment. I would propose adding more test code to your server 
version (first test whether the file is readable, etc.; after loading, do some 
logging before trying to print the document, ...) and using 
PDDocument.loadNonSeq instead of PDDocument.load. 
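
A sketch of such a pre-load check, using only java.io.File so that it runs without PDFBox on the classpath. The PreflightCheck class is a hypothetical name, and the default path is simply the one from the report; the PDFBox call itself appears only as a comment.

```java
import java.io.File;

// Hypothetical helper: reports what the server process can actually see of
// the file before any PDFBox call is made.
class PreflightCheck {
    static String describe(File file) {
        return "path=" + file.getAbsolutePath()
                + " exists=" + file.exists()
                + " isFile=" + file.isFile()
                + " canRead=" + file.canRead()
                + " length=" + file.length();
    }

    public static void main(String[] args) {
        File file = new File(args.length > 0 ? args[0] : "d:\\filename.pdf");
        System.err.println(describe(file));
        // Only continue if the file is readable and not empty, e.g.:
        // PDDocument doc = PDDocument.loadNonSeq(file, null);
    }
}
```

If the server reports exists=false or canRead=false for a path that works locally, the problem lies in the environment (drive mapping, user account, working directory) rather than in the PDF parsing.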



[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size

2014-03-24 Thread brijesh (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944868#comment-13944868
 ] 

brijesh commented on PDFBOX-1994:
-

Hi,
while testing the .jar, I get this exception:

Exception in thread "main" java.lang.NoClassDefFoundError: com/jgoodies/looks/plastic/PlasticTheme
Caused by: java.lang.ClassNotFoundException: com.jgoodies.looks.plastic.PlasticTheme
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
at java.lang.ClassLoader.loadClass(Unknown Source)
Could not find the main class: specgas.MainGas. Program will exit.



So I can't test the .jar.


- But I am confident that this works fine locally with Java 1.4, and the 
other code and the corresponding .jar work fine on the server, except for the 
'PDDocument.load' statement, where control hangs with no error message.
- Can you please suggest another method for converting the .jar to an .exe?




Re: [GSoC 2014]Optical Character Recognition project - Introduction

2014-03-24 Thread DImuthu Upeksha
Hi John,

I looked at the processTextPosition method in PDFTextStripper, but I
couldn't understand what actually happens inside the method. What
should the input to that method be? In my case I have words with the
coordinates of their bounding boxes. How can I make that data compatible
with the input of the processTextPosition method? Also, what is the
output of the method?

Thanks
Dimuthu

On Wed, Mar 19, 2014 at 11:19 PM, John Hewson wrote:
> Hi Dimuthu
>
>> 1 Print those data into PDDocument again and pass through TextStripper
>> of PDFBox. This could reduce the performance of overall process.
>
> This was what I had in mind, but rather than printing the text into the
> PDDocument you can inject it directly into PDFTextStripper as TextPosition
> instances. I mentioned something like this a while ago:
>
>> You could subclass PDFTextStripper and override the startDocument method and 
>> use it to create a PDFRenderer and store it in a field. Then override the 
>> processPage method and use the previously created PDFRenderer to render the 
>> current page to a buffered image and perform OCR on the image. Once you have 
>> the OCR text + positions, instead of calling processStream you can call 
>> processTextPosition once for each character + position.
>
> Let's see how well it works and then re-evaluate.
>
> -- John
>
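
One possible way to get from word-level OCR boxes to the "once for each character + position" input suggested above is sketched here, assuming every character in a word gets an equal share of the word's width. WordBoxSplitter and CharBox are hypothetical names, not PDFBox classes; building real TextPosition instances from these boxes is left out because the TextPosition constructor differs between PDFBox versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits one OCR word box into per-character boxes.
class WordBoxSplitter {
    static final class CharBox {
        final char ch;
        final float x, y, width, height;

        CharBox(char ch, float x, float y, float width, float height) {
            this.ch = ch;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    // Assumption: uniform character width, which is only an approximation
    // for proportional fonts.
    static List<CharBox> split(String word, float x, float y,
                               float width, float height) {
        List<CharBox> boxes = new ArrayList<CharBox>();
        if (word.length() == 0) {
            return boxes;
        }
        float charWidth = width / word.length();
        for (int i = 0; i < word.length(); i++) {
            boxes.add(new CharBox(word.charAt(i), x + i * charWidth, y,
                                  charWidth, height));
        }
        return boxes;
    }
}
```

Each resulting CharBox would then correspond to one processTextPosition call in the PDFTextStripper subclass.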



-- 
Regards

W.Dimuthu Upeksha
Undergraduate

Department of Computer Science And Engineering

University of Moratuwa, Sri Lanka