[jira] [Commented] (PDFBOX-1999) JBIG2Filter - FlateDecoded Globals Table
[ https://issues.apache.org/jira/browse/PDFBOX-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946224#comment-13946224 ] Tilman Hausherr commented on PDFBOX-1999: - Please send me the PDF to tilman at snafu dot de. > JBIG2Filter - FlateDecoded Globals Table > > > Key: PDFBOX-1999 > URL: https://issues.apache.org/jira/browse/PDFBOX-1999 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Dave Smith > Attachments: pdfbox.patch > > > When rendering a jbig2 with a Globals table that has a filter (in this case > compressed) JBIG2Filter was calling getFilteredStream which sounds correct > but in fact is not filtered but the raw data. It needs to be > getUnfilteredStream() . > I will submit a patch. I have a pdf to test it on but it is public so the > test will have to be done off list -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (PDFBOX-1998) PDF rendering with reversed colors
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-1998. - Resolution: Fixed Fix Version/s: 1.8.5 Fixed in rev 1581244. > PDF rendering with reversed colors > -- > > Key: PDFBOX-1998 > URL: https://issues.apache.org/jira/browse/PDFBOX-1998 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 1.8.4, 1.8.5 >Reporter: Tilman Hausherr > Labels: mask > Fix For: 1.8.5 > > Attachments: PDFBOX-1998.PDF > > > The attached PDF (from Étienne Landry on the user mailing list) is rendered > in w/b instead of b/w. This does not happen in the 2.0 version. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1998) PDF rendering with reversed colors
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1998: Labels: mask (was: ) > PDF rendering with reversed colors > -- > > Key: PDFBOX-1998 > URL: https://issues.apache.org/jira/browse/PDFBOX-1998 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 1.8.4, 1.8.5 >Reporter: Tilman Hausherr > Labels: mask > Fix For: 1.8.5 > > Attachments: PDFBOX-1998.PDF > > > The attached PDF (from Étienne Landry on the user mailing list) is rendered > in w/b instead of b/w. This does not happen in the 2.0 version. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1996) PDSeparation optimization
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946020#comment-13946020 ] Dave Smith commented on PDFBOX-1996: The pdf is not public. I can send it to you off list. My first thought was to optimize the function, however there is more than one. dup, 0, mul, exch, dup, 0, mul, exch, dup, 0, mul, exch, 1, mul > PDSeparation optimization > - > > Key: PDFBOX-1996 > URL: https://issues.apache.org/jira/browse/PDFBOX-1996 > Project: PDFBox > Issue Type: Improvement > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Dave Smith >Priority: Minor > Attachments: pdfbox.patch > > > I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) > to render. It uses a Separation color space and it has to run numerous > functions per pixel that is causing the slow down. I have a patch where I pre > calculate the black and white pixels and cache them instead of calculating > them every time. This optimization gets the page rendering down to less than > a second a page. I will attach my patch. I could see going forward caching > all calculated colours , but floats in hash maps are tricky. -- This message was sent by Atlassian JIRA (v6.2#6252)
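For readers who want to trace the operator sequence quoted in the comment above, here is a minimal sketch of a stack evaluator. It is not PDFBox code: the class name `TinyPostScript` is hypothetical, and it supports only the operators that appear in the comment (`dup`, `exch`, `mul`) plus numeric literals. Under those assumptions the program leaves four values on the stack, three zeros and the unchanged input, which would be consistent with a tint transform into a four-component space such as DeviceCMYK (the thread does not say which alternate space is used).

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class TinyPostScript {
    /** Runs a comma- or whitespace-separated program on one input; returns the stack, bottom first. */
    static float[] run(String program, float input) {
        Deque<Float> stack = new ArrayDeque<Float>();
        stack.push(input);
        for (String token : program.trim().split("[,\\s]+")) {
            if (token.equals("dup")) {
                stack.push(stack.peek());          // duplicate the top element
            } else if (token.equals("exch")) {
                Float top = stack.pop();           // swap the two topmost elements
                Float below = stack.pop();
                stack.push(top);
                stack.push(below);
            } else if (token.equals("mul")) {
                float b = stack.pop();             // multiply the two topmost elements
                float a = stack.pop();
                stack.push(a * b);
            } else {
                stack.push(Float.parseFloat(token)); // anything else is treated as a number
            }
        }
        float[] result = new float[stack.size()];
        int i = result.length;
        for (Float value : stack) {                // ArrayDeque iterates from the top of the stack
            result[--i] = value;
        }
        return result;
    }

    public static void main(String[] args) {
        String program = "dup, 0, mul, exch, dup, 0, mul, exch, dup, 0, mul, exch, 1, mul";
        System.out.println(Arrays.toString(run(program, 0.25f))); // prints [0.0, 0.0, 0.0, 0.25]
    }
}
```

Since the output depends only on the single input value, a function like this is a natural candidate for precomputing or caching per distinct sample value.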
[jira] [Created] (PDFBOX-1999) JBIG2Filter - FlateDecoded Globals Table
Dave Smith created PDFBOX-1999: -- Summary: JBIG2Filter - FlateDecoded Globals Table Key: PDFBOX-1999 URL: https://issues.apache.org/jira/browse/PDFBOX-1999 Project: PDFBox Issue Type: Bug Affects Versions: 2.0.0 Reporter: Dave Smith Attachments: pdfbox.patch When rendering a jbig2 with a Globals table that has a filter (in this case compressed) JBIG2Filter was calling getFilteredStream which sounds correct but in fact is not filtered but the raw data. It needs to be getUnfilteredStream() . I will submit a patch. I have a pdf to test it on but it is public so the test will have to be done off list -- This message was sent by Atlassian JIRA (v6.2#6252)
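The naming confusion described above (the "filtered" stream being the stored, still-encoded bytes) can be illustrated without PDFBox. The following is a minimal sketch using only `java.util.zip`; the class `GlobalsDecodeSketch` and its method names are hypothetical stand-ins, not PDFBox API. It shows that Flate-encoded bytes differ from the decoded bytes, so a JBIG2 decoder handed the stored form of a compressed Globals stream would not see valid segment data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

final class GlobalsDecodeSketch {
    /** Stand-in for the stored ("filtered") form: the bytes as they sit in the file. */
    static byte[] flateEncode(byte[] plain) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DeflaterOutputStream out = new DeflaterOutputStream(buffer);
        out.write(plain);
        out.close(); // finishes the zlib stream
        return buffer.toByteArray();
    }

    /** Stand-in for the "unfiltered" form: the bytes after the Flate filter is undone. */
    static byte[] flateDecode(byte[] stored) throws IOException {
        InputStream in = new InflaterInputStream(new ByteArrayInputStream(stored));
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        in.close();
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder payload; real JBIG2 globals are binary segment data.
        byte[] globals = "placeholder JBIG2 globals segment".getBytes(StandardCharsets.ISO_8859_1);
        byte[] stored = flateEncode(globals);
        System.out.println("stored equals original:  " + Arrays.equals(stored, globals));
        System.out.println("decoded equals original: " + Arrays.equals(flateDecode(stored), globals));
    }
}
```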
[jira] [Updated] (PDFBOX-1999) JBIG2Filter - FlateDecoded Globals Table
[ https://issues.apache.org/jira/browse/PDFBOX-1999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Smith updated PDFBOX-1999: --- Attachment: pdfbox.patch Here is the fix... > JBIG2Filter - FlateDecoded Globals Table > > > Key: PDFBOX-1999 > URL: https://issues.apache.org/jira/browse/PDFBOX-1999 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Dave Smith > Attachments: pdfbox.patch > > > When rendering a jbig2 with a Globals table that has a filter (in this case > compressed) JBIG2Filter was calling getFilteredStream which sounds correct > but in fact is not filtered but the raw data. It needs to be > getUnfilteredStream() . > I will submit a patch. I have a pdf to test it on but it is public so the > test will have to be done off list -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945856#comment-13945856 ] John Hewson commented on PDFBOX-1998: - Looks good to me, the ImageMask is on the alpha channel, so destination in/out is effectively inverting the mask. > PDF rendering with reversed colors > -- > > Key: PDFBOX-1998 > URL: https://issues.apache.org/jira/browse/PDFBOX-1998 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 1.8.4, 1.8.5 >Reporter: Tilman Hausherr > Attachments: PDFBOX-1998.PDF > > > The attached PDF (from Étienne Landry on the user mailing list) is rendered > in w/b instead of b/w. This does not happen in the 2.0 version. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1996) PDSeparation optimization
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Hewson updated PDFBOX-1996: Priority: Minor (was: Major) > PDSeparation optimization > - > > Key: PDFBOX-1996 > URL: https://issues.apache.org/jira/browse/PDFBOX-1996 > Project: PDFBox > Issue Type: Improvement > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Dave Smith >Priority: Minor > Attachments: pdfbox.patch > > > I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) > to render. It uses a Separation color space and it has to run numerous > functions per pixel that is causing the slow down. I have a patch where I pre > calculate the black and white pixels and cache them instead of calculating > them every time. This optimization gets the page rendering down to less than > a second a page. I will attach my patch. I could see going forward caching > all calculated colours , but floats in hash maps are tricky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1996) PDSeparation optimization
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945852#comment-13945852 ] John Hewson commented on PDFBOX-1996: - Which type of function does your PDF use for the tint transform? (i.e. which subclass of PDFunction is used?). It might be possible to speed up the underlying function instead so that RGB images will be faster too. > PDSeparation optimization > - > > Key: PDFBOX-1996 > URL: https://issues.apache.org/jira/browse/PDFBOX-1996 > Project: PDFBox > Issue Type: Improvement > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Dave Smith > Attachments: pdfbox.patch > > > I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) > to render. It uses a Separation color space and it has to run numerous > functions per pixel that is causing the slow down. I have a patch where I pre > calculate the black and white pixels and cache them instead of calculating > them every time. This optimization gets the page rendering down to less than > a second a page. I will attach my patch. I could see going forward caching > all calculated colours , but floats in hash maps are tricky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-1998) PDF rendering with reversed colors
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945764#comment-13945764 ] Tilman Hausherr edited comment on PDFBOX-1998 at 3/24/14 11:08 PM: --- Heh heh :-) I found this in PDXObjectImage.java:
{code}
// assume default values ([0,1]) for the DecodeArray
// TODO DecodeArray == [1,0]
graphics.setComposite(AlphaComposite.DstIn);
{code}
I tried to replace that with the following lines and it works, but I'd like to hear another opinion because my day job doesn't involve [Porter/Duff rules|http://ssp.impulsetrain.com/porterduff.html]:
{code}
COSArray decode = getDecode();
if (decode != null && decode.getInt(0) == 1)
    graphics.setComposite(AlphaComposite.DstOut);
else
    graphics.setComposite(AlphaComposite.DstIn);
{code}
was (Author: tilman): Heh heh :-) I found this in PDXObjectImage.java:
{code}
// assume default values ([0,1]) for the DecodeArray
// TODO DecodeArray == [1,0]
{code}
I tried to replace the line below that one with this and it works, but I'd like to hear another opinion because my day job doesn't involve [Porter/Duff rules|http://ssp.impulsetrain.com/porterduff.html]:
{code}
COSArray decode = getDecode();
if (decode != null && decode.getInt(0) == 1)
    graphics.setComposite(AlphaComposite.DstOut);
else
    graphics.setComposite(AlphaComposite.DstIn);
{code}
> PDF rendering with reversed colors > -- > > Key: PDFBOX-1998 > URL: https://issues.apache.org/jira/browse/PDFBOX-1998 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 1.8.4, 1.8.5 >Reporter: Tilman Hausherr > Attachments: PDFBOX-1998.PDF > > > The attached PDF (from Étienne Landry on the user mailing list) is rendered > in w/b instead of b/w. This does not happen in the 2.0 version. -- This message was sent by Atlassian JIRA (v6.2#6252)
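To make the effect of the two Porter/Duff rules in the comment above concrete, here is a minimal sketch, assuming the `/Decode` array is available as a plain `int[]` (the patch itself reads it from a `COSArray`). The class `MaskCompositeSketch` and both method names are hypothetical. The alpha formulas are the ones documented for `java.awt.AlphaComposite`: `DST_IN` keeps the destination where the source is opaque, `DST_OUT` keeps it where the source is transparent, so switching between them inverts the mask.

```java
import java.awt.AlphaComposite;

final class MaskCompositeSketch {
    /** Picks the rule for a stencil mask from its /Decode array (null means the default [0 1]). */
    static AlphaComposite compositeFor(int[] decode) {
        boolean inverted = decode != null && decode.length >= 2 && decode[0] == 1;
        return inverted ? AlphaComposite.DstOut : AlphaComposite.DstIn;
    }

    /** Resulting destination alpha for the two rules used here (extra composite alpha ignored). */
    static float resultAlpha(AlphaComposite composite, float dstAlpha, float srcAlpha) {
        switch (composite.getRule()) {
            case AlphaComposite.DST_IN:
                return dstAlpha * srcAlpha;        // destination survives inside the source
            case AlphaComposite.DST_OUT:
                return dstAlpha * (1f - srcAlpha); // destination survives outside the source
            default:
                throw new IllegalArgumentException("rule not covered by this sketch: " + composite.getRule());
        }
    }

    public static void main(String[] args) {
        AlphaComposite normal = compositeFor(new int[] {0, 1});
        AlphaComposite inverted = compositeFor(new int[] {1, 0});
        // Opaque destination pixel under an opaque mask pixel:
        System.out.println("Decode [0 1]: alpha = " + resultAlpha(normal, 1f, 1f));
        System.out.println("Decode [1 0]: alpha = " + resultAlpha(inverted, 1f, 1f));
    }
}
```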
[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945764#comment-13945764 ] Tilman Hausherr commented on PDFBOX-1998: - Heh heh :-) I found this in PDXObjectImage.java:
{code}
// assume default values ([0,1]) for the DecodeArray
// TODO DecodeArray == [1,0]
{code}
I tried to replace the line below that one with this and it works, but I'd like to hear another opinion because my day job doesn't involve [Porter/Duff rules|http://ssp.impulsetrain.com/porterduff.html]:
{code}
COSArray decode = getDecode();
if (decode != null && decode.getInt(0) == 1)
    graphics.setComposite(AlphaComposite.DstOut);
else
    graphics.setComposite(AlphaComposite.DstIn);
{code}
> PDF rendering with reversed colors > -- > > Key: PDFBOX-1998 > URL: https://issues.apache.org/jira/browse/PDFBOX-1998 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 1.8.4, 1.8.5 >Reporter: Tilman Hausherr > Attachments: PDFBOX-1998.PDF > > > The attached PDF (from Étienne Landry on the user mailing list) is rendered > in w/b instead of b/w. This does not happen in the 2.0 version. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1998) PDF rendering with reversed colors
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945578#comment-13945578 ] Tilman Hausherr commented on PDFBOX-1998: - What I did notice: the PDF has this: /ImageMask true /Decode [ 1 0 ] changing it to /ImageMask true /Decode [ 0 1 ] changes the Adobe Viewer rendering, but not the rendering with PDFBOX. Which suggests that the Decode is ignored. > PDF rendering with reversed colors > -- > > Key: PDFBOX-1998 > URL: https://issues.apache.org/jira/browse/PDFBOX-1998 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 1.8.4, 1.8.5 >Reporter: Tilman Hausherr > Attachments: PDFBOX-1998.PDF > > > The attached PDF (from Étienne Landry on the user mailing list) is rendered > in w/b instead of b/w. This does not happen in the 2.0 version. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1998) PDF rendering with reversed colors
[ https://issues.apache.org/jira/browse/PDFBOX-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1998: Attachment: PDFBOX-1998.PDF > PDF rendering with reversed colors > -- > > Key: PDFBOX-1998 > URL: https://issues.apache.org/jira/browse/PDFBOX-1998 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 1.8.4, 1.8.5 >Reporter: Tilman Hausherr > Attachments: PDFBOX-1998.PDF > > > The attached PDF (from Étienne Landry on the user mailing list) is rendered > in w/b instead of b/w. This does not happen in the 2.0 version. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (PDFBOX-1998) PDF rendering with reversed colors
Tilman Hausherr created PDFBOX-1998: --- Summary: PDF rendering with reversed colors Key: PDFBOX-1998 URL: https://issues.apache.org/jira/browse/PDFBOX-1998 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.4, 1.8.5 Reporter: Tilman Hausherr The attached PDF (from Étienne Landry on the user mailing list) is rendered in w/b instead of b/w. This does not happen in the 2.0 version. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1996) PDSeparation optimization
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945463#comment-13945463 ] Tilman Hausherr commented on PDFBOX-1996: - While I'm not the one who will commit your patch (I don't know enough of that topic), do you have a non-confidential PDF that would use your patch, so that we can see that the result is the same before and after? > PDSeparation optimization > - > > Key: PDFBOX-1996 > URL: https://issues.apache.org/jira/browse/PDFBOX-1996 > Project: PDFBox > Issue Type: Improvement > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Dave Smith > Attachments: pdfbox.patch > > > I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) > to render. It uses a Separation color space and it has to run numerous > functions per pixel that is causing the slow down. I have a patch where I pre > calculate the black and white pixels and cache them instead of calculating > them every time. This optimization gets the page rendering down to less than > a second a page. I will attach my patch. I could see going forward caching > all calculated colours , but floats in hash maps are tricky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1996) PDSeparation optimization
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1996: Summary: PDSeparation optimization (was: PDSeparation separation optimization) > PDSeparation optimization > - > > Key: PDFBOX-1996 > URL: https://issues.apache.org/jira/browse/PDFBOX-1996 > Project: PDFBox > Issue Type: Improvement > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Dave Smith > Attachments: pdfbox.patch > > > I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) > to render. It uses a Separation color space and it has to run numerous > functions per pixel that is causing the slow down. I have a patch where I pre > calculate the black and white pixels and cache them instead of calculating > them every time. This optimization gets the page rendering down to less than > a second a page. I will attach my patch. I could see going forward caching > all calculated colours , but floats in hash maps are tricky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1996) PDSeparation separation optimization
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1996: Summary: PDSeparation separation optimization (was: PDSeparation separtion optimization) > PDSeparation separation optimization > > > Key: PDFBOX-1996 > URL: https://issues.apache.org/jira/browse/PDFBOX-1996 > Project: PDFBox > Issue Type: Improvement > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Dave Smith > Attachments: pdfbox.patch > > > I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) > to render. It uses a Separation color space and it has to run numerous > functions per pixel that is causing the slow down. I have a patch where I pre > calculate the black and white pixels and cache them instead of calculating > them every time. This optimization gets the page rendering down to less than > a second a page. I will attach my patch. I could see going forward caching > all calculated colours , but floats in hash maps are tricky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1997) CIE LAB item missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1997: Summary: CIE LAB item missing in rendering (was: CIA LAB item missing in rendering) > CIE LAB item missing in rendering > - > > Key: PDFBOX-1997 > URL: https://issues.apache.org/jira/browse/PDFBOX-1997 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Tilman Hausherr > Attachments: text_graphic_image.pdf, text_graphic_image.pdf-1.png > > > The file from PDFBOX-1681 is missing the "CIELAB" output, it was there a few > weeks ago. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (PDFBOX-1997) CIA LAB item missing in rendering
Tilman Hausherr created PDFBOX-1997: --- Summary: CIA LAB item missing in rendering Key: PDFBOX-1997 URL: https://issues.apache.org/jira/browse/PDFBOX-1997 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 2.0.0 Reporter: Tilman Hausherr The file from PDFBOX-1681 is missing the "CIELAB" output, it was there a few weeks ago. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1997) CIA LAB item missing in rendering
[ https://issues.apache.org/jira/browse/PDFBOX-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1997: Attachment: text_graphic_image.pdf-1.png text_graphic_image.pdf > CIA LAB item missing in rendering > - > > Key: PDFBOX-1997 > URL: https://issues.apache.org/jira/browse/PDFBOX-1997 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Tilman Hausherr > Attachments: text_graphic_image.pdf, text_graphic_image.pdf-1.png > > > The file from PDFBOX-1681 is missing the "CIELAB" output, it was there a few > weeks ago. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945386#comment-13945386 ] Tilman Hausherr commented on PDFBOX-1994: - What happens if you use the PDF app with the PDFReader option? http://www.apache.org/dyn/closer.cgi/pdfbox/1.8.4/pdfbox-app-1.8.4.jar > PDDocument.load(filename.pdf) hangs for pdf files having size > - > > Key: PDFBOX-1994 > URL: https://issues.apache.org/jira/browse/PDFBOX-1994 > Project: PDFBox > Issue Type: Bug >Affects Versions: 1.8.4 >Reporter: brijesh > > The below code i am using for loading my pdf. but my pdf file is not a zero > sized files and having full permission and it is not a corrupt file also. but > i ddint get any error after code. it just hangs. > it is working in local, but not working in server . > (created ,jar files and then exe, then the .exe will excuted in the server) > java using 1,4 > PDDocument pdf=PDDocument.load("d:\\filename.pdf"); > pdf.print(); > please provide me why the same code is not working in server. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1996) PDSeparation separtion optimization
[ https://issues.apache.org/jira/browse/PDFBOX-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dave Smith updated PDFBOX-1996: --- Attachment: pdfbox.patch Patch that caches black and white values > PDSeparation separtion optimization > --- > > Key: PDFBOX-1996 > URL: https://issues.apache.org/jira/browse/PDFBOX-1996 > Project: PDFBox > Issue Type: Improvement > Components: Rendering >Affects Versions: 2.0.0 >Reporter: Dave Smith > Attachments: pdfbox.patch > > > I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) > to render. It uses a Separation color space and it has to run numerous > functions per pixel that is causing the slow down. I have a patch where I pre > calculate the black and white pixels and cache them instead of calculating > them every time. This optimization gets the page rendering down to less than > a second a page. I will attach my patch. I could see going forward caching > all calculated colours , but floats in hash maps are tricky. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (PDFBOX-1996) PDSeparation separtion optimization
Dave Smith created PDFBOX-1996: -- Summary: PDSeparation separtion optimization Key: PDFBOX-1996 URL: https://issues.apache.org/jira/browse/PDFBOX-1996 Project: PDFBox Issue Type: Improvement Components: Rendering Affects Versions: 2.0.0 Reporter: Dave Smith I have a 4 page black and white pdf that takes 32 seconds (8 seconds a page) to render. It uses a Separation color space and it has to run numerous functions per pixel that is causing the slow down. I have a patch where I pre calculate the black and white pixels and cache them instead of calculating them every time. This optimization gets the page rendering down to less than a second a page. I will attach my patch. I could see going forward caching all calculated colours , but floats in hash maps are tricky. -- This message was sent by Atlassian JIRA (v6.2#6252)
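One way to sidestep the "floats in hash maps" problem mentioned above is to key the cache on the integer sample value rather than on the float tint. The following is a minimal sketch of that idea, not the attached patch: it assumes 8-bit samples, and `TintCacheSketch` and `TintTransform` are hypothetical names standing in for whatever expensive function the colour space evaluates.

```java
final class TintCacheSketch {
    /** Hypothetical stand-in for an expensive tint transform (e.g. a PostScript function). */
    interface TintTransform {
        float[] eval(float tint);
    }

    private final TintTransform transform;
    private final float[][] cache = new float[256][]; // one slot per 8-bit sample value
    int misses; // exposed only to show how often the slow path runs

    TintCacheSketch(TintTransform transform) {
        this.transform = transform;
    }

    /** Returns the colour for an 8-bit sample, evaluating the transform at most once per value. */
    float[] colorFor(int sample) {
        int index = sample & 0xFF;
        float[] cached = cache[index];
        if (cached == null) {
            misses++;
            cached = transform.eval(index / 255f);
            cache[index] = cached;
        }
        return cached; // shared array: callers must not modify it
    }

    public static void main(String[] args) {
        TintCacheSketch cache = new TintCacheSketch(new TintTransform() {
            public float[] eval(float tint) {
                return new float[] {0f, 0f, 0f, tint}; // placeholder transform
            }
        });
        for (int pixel = 0; pixel < 1000000; pixel++) {
            cache.colorFor(pixel % 2 == 0 ? 0 : 255); // a black-and-white page has two sample values
        }
        System.out.println("transform evaluations: " + cache.misses);
    }
}
```

With an integer index there is no float equality or hashing involved, and a black-and-white image touches only two of the 256 slots, which matches the case described in the report.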
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944983#comment-13944983 ] Tilman Hausherr commented on PDFBOX-1994: - The problem is that as long as you insist on using 1.4 we won't know whether the problem is related to that or to another cause. Enter java -version to find out what's really running. Btw it could still be a corrupt file even if you can open it with Adobe, so please try with different files. There's also jstack in the jdk bin directory, Google for it on how to get a thread dump. > PDDocument.load(filename.pdf) hangs for pdf files having size > - > > Key: PDFBOX-1994 > URL: https://issues.apache.org/jira/browse/PDFBOX-1994 > Project: PDFBox > Issue Type: Bug >Affects Versions: 1.8.4 >Reporter: brijesh > > The below code i am using for loading my pdf. but my pdf file is not a zero > sized files and having full permission and it is not a corrupt file also. but > i ddint get any error after code. it just hangs. > it is working in local, but not working in server . > (created ,jar files and then exe, then the .exe will excuted in the server) > java using 1,4 > PDDocument pdf=PDDocument.load("d:\\filename.pdf"); > pdf.print(); > please provide me why the same code is not working in server. -- This message was sent by Atlassian JIRA (v6.2#6252)
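When running `java -version` or `jstack` on the server is awkward (for example behind an exe wrapper), similar information can be printed from inside the program. This is a minimal sketch with a hypothetical class name, `RuntimeReport`; note that `Thread.getAllStackTraces()` needs Java 5 or later, so it would not compile or run on a genuine 1.4 runtime, which is itself a useful check.

```java
import java.util.Map;

final class RuntimeReport {
    /** Describes the JVM that is actually executing this code. */
    static String describeJvm() {
        return System.getProperty("java.version") + " (" + System.getProperty("java.vendor") + ")";
    }

    /** Builds a plain-text dump of all live threads, similar in spirit to jstack output. */
    static String threadDump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append("\" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Java " + describeJvm());
        System.out.print(threadDump());
    }
}
```

To inspect a hang, `threadDump()` would have to be called from a second thread (for instance a timer started before the load call), since the hanging thread cannot report on itself.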
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944946#comment-13944946 ] brijesh commented on PDFBOX-1994: - Hi, I am using PDFBox 1.8 and Java 1.4 locally, and it is working for me. > PDDocument.load(filename.pdf) hangs for pdf files having size > - > > Key: PDFBOX-1994 > URL: https://issues.apache.org/jira/browse/PDFBOX-1994 > Project: PDFBox > Issue Type: Bug >Affects Versions: 1.8.4 >Reporter: brijesh > > The below code i am using for loading my pdf. but my pdf file is not a zero > sized files and having full permission and it is not a corrupt file also. but > i ddint get any error after code. it just hangs. > it is working in local, but not working in server . > (created ,jar files and then exe, then the .exe will excuted in the server) > java using 1,4 > PDDocument pdf=PDDocument.load("d:\\filename.pdf"); > pdf.print(); > please provide me why the same code is not working in server. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944942#comment-13944942 ] Andreas Lehmkühler commented on PDFBOX-1994: - The stack trace is explicit: at least one class is missing. You have somehow mixed up your environment. This has nothing to do with PDFBox. - As Tilman already mentioned, PDFBox 1.8.x requires Java 1.5. So either you are not using PDFBox 1.8.x or you are not using a Java 1.4 environment. - You should ask someone familiar with Launch4J how to configure it correctly to get a working executable or, even better, think about using the JDK directly without a jar->exe converter. > PDDocument.load(filename.pdf) hangs for pdf files having size > - > > Key: PDFBOX-1994 > URL: https://issues.apache.org/jira/browse/PDFBOX-1994 > Project: PDFBox > Issue Type: Bug >Affects Versions: 1.8.4 >Reporter: brijesh > > The below code i am using for loading my pdf. but my pdf file is not a zero > sized files and having full permission and it is not a corrupt file also. but > i ddint get any error after code. it just hangs. > it is working in local, but not working in server . > (created ,jar files and then exe, then the .exe will excuted in the server) > java using 1,4 > PDDocument pdf=PDDocument.load("d:\\filename.pdf"); > pdf.print(); > please provide me why the same code is not working in server. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944932#comment-13944932 ] brijesh commented on PDFBOX-1994: - Yes, I added println statements in between. That is how I know it hangs on that single line. > PDDocument.load(filename.pdf) hangs for pdf files having size > - > > Key: PDFBOX-1994 > URL: https://issues.apache.org/jira/browse/PDFBOX-1994 > Project: PDFBox > Issue Type: Bug >Affects Versions: 1.8.4 >Reporter: brijesh > > The below code i am using for loading my pdf. but my pdf file is not a zero > sized files and having full permission and it is not a corrupt file also. but > i ddint get any error after code. it just hangs. > it is working in local, but not working in server . > (created ,jar files and then exe, then the .exe will excuted in the server) > java using 1,4 > PDDocument pdf=PDDocument.load("d:\\filename.pdf"); > pdf.print(); > please provide me why the same code is not working in server. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944924#comment-13944924 ] Tilman Hausherr commented on PDFBOX-1994: - Could you add a println before and after each PDFBox-related call so that you can tell which one hangs? > PDDocument.load(filename.pdf) hangs for pdf files having size > - > > Key: PDFBOX-1994 > URL: https://issues.apache.org/jira/browse/PDFBOX-1994 > Project: PDFBox > Issue Type: Bug >Affects Versions: 1.8.4 >Reporter: brijesh > > The below code i am using for loading my pdf. but my pdf file is not a zero > sized files and having full permission and it is not a corrupt file also. but > i ddint get any error after code. it just hangs. > it is working in local, but not working in server . > (created ,jar files and then exe, then the .exe will excuted in the server) > java using 1,4 > PDDocument pdf=PDDocument.load("d:\\filename.pdf"); > pdf.print(); > please provide me why the same code is not working in server. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944919#comment-13944919 ] Timo Boehme commented on PDFBOX-1994: - If you are using a UNIX-style server system, run kill -3 PROCESS_PID (or kill -QUIT PROCESS_PID) on the server Java process. This will give you a stack trace of the Java VM on stdout/stderr(?); maybe this is redirected to a log file in your case. You will then see where it hangs. If it hangs in PDFBox you can provide us with this stack trace. > PDDocument.load(filename.pdf) hangs for pdf files having size > - > > Key: PDFBOX-1994 > URL: https://issues.apache.org/jira/browse/PDFBOX-1994 > Project: PDFBox > Issue Type: Bug >Affects Versions: 1.8.4 >Reporter: brijesh > > The below code i am using for loading my pdf. but my pdf file is not a zero > sized files and having full permission and it is not a corrupt file also. but > i ddint get any error after code. it just hangs. > it is working in local, but not working in server . > (created ,jar files and then exe, then the .exe will excuted in the server) > java using 1,4 > PDDocument pdf=PDDocument.load("d:\\filename.pdf"); > pdf.print(); > please provide me why the same code is not working in server. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944913#comment-13944913 ] brijesh commented on PDFBOX-1994: - Hi, I tested with PDDocument doc = PDDocument.loadNonSeq(file, null); but the issue is the same: it hangs. 1- I tested the PDF file size (it is not zero). 2- I added all permissions to the specified folder/files. 3- It is not a corrupt file either. I still don't understand where exactly the issue is. > PDDocument.load(filename.pdf) hangs for pdf files having size > - > > Key: PDFBOX-1994 > URL: https://issues.apache.org/jira/browse/PDFBOX-1994 > Project: PDFBox > Issue Type: Bug >Affects Versions: 1.8.4 >Reporter: brijesh > > The below code i am using for loading my pdf. but my pdf file is not a zero > sized files and having full permission and it is not a corrupt file also. but > i ddint get any error after code. it just hangs. > it is working in local, but not working in server . > (created ,jar files and then exe, then the .exe will excuted in the server) > java using 1,4 > PDDocument pdf=PDDocument.load("d:\\filename.pdf"); > pdf.print(); > please provide me why the same code is not working in server. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944897#comment-13944897 ]

Tilman Hausherr commented on PDFBOX-1994:
-----------------------------------------

By the way, the second parameter of loadNonSeq can be null.

It is important to eliminate external factors such as that exe packer first and then approach the real problem step by step. Currently your main class cannot be found.
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944880#comment-13944880 ]

Timo Boehme commented on PDFBOX-1994:
-------------------------------------

It looks like a number of libraries are missing in your jar test case, but none of them belong to PDFBox. It seems to me that this is not a PDFBox issue in general but an issue with your server environment.

I would propose adding more test code to your server version (first test whether the file is readable etc.; after loading, do some logging before trying to print the document, ...) and using PDDocument.loadNonSeq instead of PDDocument.load.
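The "test first if file is readable" step suggested above could look roughly like the sketch below. It uses only java.io.File; the class name PdfPreflightSketch, the check method and the problem messages are hypothetical and chosen for illustration. The PDFBox call itself appears only in a comment, since it needs the PDFBox jar on the classpath.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight check, not part of PDFBox.
final class PdfPreflightSketch {

    // Returns a list of problems; an empty list means the file looks loadable.
    static List<String> check(File f) {
        List<String> problems = new ArrayList<>();
        if (!f.exists()) {
            problems.add("does not exist: " + f.getAbsolutePath());
            return problems;
        }
        if (!f.isFile()) {
            problems.add("not a regular file");
        }
        if (!f.canRead()) {
            problems.add("not readable by this process");
        }
        if (f.length() == 0L) {
            problems.add("file is empty");
        }
        return problems;
    }

    public static void main(String[] args) {
        File f = new File(args.length > 0 ? args[0] : "filename.pdf");
        List<String> problems = check(f);
        // Log before loading, so the log shows how far the program got.
        System.err.println("pre-flight for " + f.getAbsolutePath() + ": "
                + (problems.isEmpty() ? "ok" : problems.toString()));
        // If the check passes, load the document here (for example with
        // PDDocument.loadNonSeq(f, null) in PDFBox 1.8.x), log again,
        // and only then attempt to print.
    }
}
```

Logging before and after each step narrows down whether the hang occurs while loading or while printing.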
[jira] [Commented] (PDFBOX-1994) PDDocument.load(filename.pdf) hangs for pdf files having size
[ https://issues.apache.org/jira/browse/PDFBOX-1994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944868#comment-13944868 ]

brijesh commented on PDFBOX-1994:
---------------------------------

Hi,

While testing the .jar I get this exception:

Exception in thread "main" java.lang.NoClassDefFoundError: com/jgoodies/looks/plastic/PlasticTheme
Caused by: java.lang.ClassNotFoundException: com.jgoodies.looks.plastic.PlasticTheme
        at java.net.URLClassLoader$1.run(Unknown Source)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
        at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
        at java.lang.ClassLoader.loadClass(Unknown Source)
Could not find the main class: specgas.MainGas. Program will exit.

So I cannot test the .jar.

But I am confident that this works fine locally with Java 1.4. The other code in the corresponding .jar also works fine on the server, except for the 'PDDocument.load' statement, where control hangs with no error message.

Can you please suggest another method for converting the .jar to an .exe?
Re: [GSoC 2014]Optical Character Recognition project - Introduction
Hi John,

I looked at the processTextPosition method in PDFTextStripper, but I could not understand what actually happens inside it. What should the input to that method be? In my case I have words with the coordinates of their bounding boxes. How can I make that data compatible with the input of processTextPosition? Also, what is the output of the method?

Thanks,
Dimuthu

On Wed, Mar 19, 2014 at 11:19 PM, John Hewson wrote:
> Hi Dimuthu
>
>> 1 Print those data into PDDocument again and pass through TextStripper
>> of PDFBox. This could reduce the performance of overall process.
>
> This was what I had in mind, but rather than printing the text into the
> PDDocument you can inject it directly into PDFTextStripper as TextPosition
> instances. I mentioned something like this a while ago:
>
>> You could subclass PDFTextStripper and override the startDocument method and
>> use it to create a PDFRenderer and store it in a field. Then override the
>> processPage method and use the previously created PDFRenderer to render the
>> current page to a buffered image and perform OCR on the image. Once you have
>> the OCR text + positions, instead of calling processStream you can call
>> processTextPosition once for each character + position.
>
> Let's see how well it works and then re-evaluate.
>
> -- John

--
Regards
W.Dimuthu Upeksha
Undergraduate
Department of Computer Science And Engineering
University of Moratuwa, Sri Lanka
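Since the quoted suggestion calls processTextPosition once per character while the OCR output here is per word, one possible bridge is to estimate a box for each character from the word's bounding box. The sketch below does this by dividing the word width evenly, which is a simplifying assumption (real glyph widths differ). OcrGlyphSketch, Glyph and splitWord are hypothetical names and not PDFBox classes; building the actual TextPosition from each Glyph is left out because its constructor differs between PDFBox versions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class OcrGlyphSketch {

    // One character with its estimated bounding box.
    static final class Glyph {
        final char ch;
        final double x, y, width, height;

        Glyph(char ch, double x, double y, double width, double height) {
            this.ch = ch;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    // Splits a word's bounding box evenly across its characters, left to right.
    static List<Glyph> splitWord(String word, double x, double y, double width, double height) {
        List<Glyph> glyphs = new ArrayList<>();
        if (word.isEmpty()) {
            return glyphs;
        }
        double charWidth = width / word.length();
        for (int i = 0; i < word.length(); i++) {
            glyphs.add(new Glyph(word.charAt(i), x + i * charWidth, y, charWidth, height));
        }
        return glyphs;
    }
}
```

For example, splitWord("abcd", 10, 20, 40, 12) yields four glyphs, each 10 units wide, starting at x = 10, 20, 30 and 40. OCR coordinates are usually image pixels with the origin at the top left, so they would still need to be converted to PDF page coordinates before being handed to the stripper.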