[jira] [Commented] (PDFBOX-2524) [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages

2014-12-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246260#comment-14246260
 ] 

ASF subversion and git services commented on PDFBOX-2524:
-

Commit 1645553 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1645553 ]

PDFBOX-2524: Adobe Reader requires well-formed CMaps

> [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages
> --
>
> Key: PDFBOX-2524
> URL: https://issues.apache.org/jira/browse/PDFBOX-2524
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Writing
>Affects Versions: 2.0.0
>Reporter: Keiji Suzuki
>Assignee: John Hewson
> Attachments: PDCIDFontType2.patch, Type0.java, Type0CJK.java, 
> Type0Unicode.java, check-embeddability.patch, cidtype0.diff, cidtype2.diff, 
> format14.patch, two-new-fonts.diff, type0bom.pdf, type0nobom.pdf
>
>
> I made two PDFont classes for creating PDF documents in CJK and 
> non-ISO-8859-1 languages.
> One is PDType0CJKFont. This is for using the CJK fonts included in the Asian 
> font package of Adobe Reader. This font doesn't require the target font at 
> the time the PDF document is created. This font uses UTF-16 as a text code 
> and supports surrogate pair characters.
> The other is PDType0UnicodeFont. This is for using a TrueType Type0 font that 
> can handle any Unicode character, such as ArialUnicodeMS. Only the 
> characters that are actually used in the document are embedded. To achieve 
> this, you have to call the PDType0Unicode.reloadFont() method just before 
> closing the PDPageContentStream. I think this requirement is ugly, but I 
> could not think of a suitable way to remove it. This font uses the 
> original glyph code of the embedded font as a text code and supports 
> surrogate pair characters too.
> Example programs using these two fonts are also attached.
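
For readers unfamiliar with "UTF-16 as a text code" and surrogate pairs, here is a minimal JDK-only sketch. It is not taken from the attached patch; the class and method names (Utf16TextCode, toHexTextCode) are made up for illustration, and writing the text code as hex digits is only an assumption about how one might show the bytes.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of PDFBox or the attached patch. */
final class Utf16TextCode {

    /** Encodes text as UTF-16BE (no byte order mark) and returns upper-case hex digits. */
    static String toHexTextCode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_16BE);
        StringBuilder hex = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // U+65E5 lies in the Basic Multilingual Plane: one 16-bit code unit.
        System.out.println(toHexTextCode("\u65E5"));

        // U+20BB7 lies outside the BMP, so Java stores it as a surrogate pair
        // (two char values) and UTF-16BE encodes it as four bytes.
        String supplementary = new String(Character.toChars(0x20BB7));
        System.out.println(supplementary.length());
        System.out.println(toHexTextCode(supplementary));
    }
}
```

StandardCharsets.UTF_16BE is used rather than the "UTF-16" charset because the latter writes a byte order mark when encoding.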



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (PDFBOX-2524) [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages

2014-12-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246257#comment-14246257
 ] 

ASF subversion and git services commented on PDFBOX-2524:
-

Commit 1645551 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1645551 ]

PDFBOX-2524: Workaround because 0 is actually a valid char



[jira] [Commented] (PDFBOX-2524) [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages

2014-12-14 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246230#comment-14246230
 ] 

John Hewson commented on PDFBOX-2524:
-

Looks like an encoding issue. Until recently our source was ISO-8859-1, but it 
is now UTF-8. If you're using an IDE, restart it and try a full rebuild. If 
you're using Maven, try a "mvn clean install" on the top-level pdfbox 
project.
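
To illustrate the kind of mismatch suspected here, the following JDK-only sketch shows what happens when UTF-8 bytes are decoded as ISO-8859-1. It is an assumption that the failing test involves this exact mechanism; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration of a UTF-8 / ISO-8859-1 mix-up; not PDFBox code. */
final class SourceEncodingMismatch {

    /** Decodes the UTF-8 bytes of the text as if they were ISO-8859-1. */
    static String misreadAsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    /** Reverses misreadAsLatin1; lossless because ISO-8859-1 maps every byte value to a char. */
    static String repair(String misread) {
        return new String(misread.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this file independent of the compiler's -encoding setting.
        String original = "Ti\u1EBFng Vi\u1EC7t";
        String misread = misreadAsLatin1(original);

        // Each three-byte UTF-8 sequence turns into three separate Latin-1 characters.
        System.out.println(original.length() + " chars become " + misread.length());
        System.out.println(repair(misread).equals(original));
    }
}
```

A stale build compiled under the old encoding setting would bake such misread literals into the class files, which is why a clean rebuild is the first thing to try.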



[jira] [Commented] (PDFBOX-2524) [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages

2014-12-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246156#comment-14246156
 ] 

Tilman Hausherr commented on PDFBOX-2524:
-

{code}
Running org.apache.pdfbox.pdmodel.font.TestFontEmbedding
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.227 sec <<< 
FAILURE! - in org.apache.pdfbox.pdmodel.font.TestFontEmbedding
testCIDFontType2(org.apache.pdfbox.pdmodel.font.TestFontEmbedding)  Time 
elapsed: 0.227 sec  <<< FAILURE!
junit.framework.ComparisonFailure: expected:<...  Ti?ng Vi?t[]
> but was:<...  Ti?ng Vi?t[
]
>
at junit.framework.Assert.assertEquals(Assert.java:100)
at junit.framework.Assert.assertEquals(Assert.java:107)
at junit.framework.TestCase.assertEquals(TestCase.java:269)
at 
org.apache.pdfbox.pdmodel.font.TestFontEmbedding.testCIDFontType2(TestFontEmbedding.java:70)
{code}



[jira] [Commented] (PDFBOX-2567) Only one page found while the document actually contains two pages

2014-12-14 Thread Siegfried Goeschl (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246155#comment-14246155
 ] 

Siegfried Goeschl commented on PDFBOX-2567:
---

Thanks for the quick response  - I will check tomorrow :-)

> Only one page found while the document actually contains two pages
> --
>
> Key: PDFBOX-2567
> URL: https://issues.apache.org/jira/browse/PDFBOX-2567
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 1.8.7
>Reporter: Siegfried Goeschl
> Attachments: first-page-lost-01.pdf
>
>
> I'm currently converting a lot of PDF documents to images. For this 
> particular document I'm only able to extract one page:
> {noformat}
> List pages = pdDocument.getDocumentCatalog().getAllPages();
> {noformat}
> Using Mac OS Preview I see that the document actually contains two pages.
> Please note that I have permission from my customer to upload the document.





[jira] [Commented] (PDFBOX-2567) Only one page found while the document actually contains two pages

2014-12-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246146#comment-14246146
 ] 

Tilman Hausherr commented on PDFBOX-2567:
-

And update to 1.8.8 :-)



[jira] [Commented] (PDFBOX-2567) Only one page found while the document actually contains two pages

2014-12-14 Thread Andreas Lehmkühler (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246142#comment-14246142
 ] 

Andreas Lehmkühler commented on PDFBOX-2567:


The PDF was updated, so you have to use the non-sequential parser 
(PDDocument#loadNonSeq instead of PDDocument#load).



[jira] [Commented] (PDFBOX-2524) [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages

2014-12-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246139#comment-14246139
 ] 

ASF subversion and git services commented on PDFBOX-2524:
-

Commit 1645526 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1645526 ]

PDFBOX-2524: Adobe Reader requires a "UCS" CMap



[jira] [Updated] (PDFBOX-2567) Only one page found while the document actually contains two pages

2014-12-14 Thread Siegfried Goeschl (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siegfried Goeschl updated PDFBOX-2567:
--
Attachment: first-page-lost-01.pdf

PDF causing the problem - please note that I have the permission from the 
customer to upload the file to JIRA



[jira] [Created] (PDFBOX-2567) Only one page found while the document actually contains two pages

2014-12-14 Thread Siegfried Goeschl (JIRA)
Siegfried Goeschl created PDFBOX-2567:
-

 Summary: Only one page found while the document actually contains 
two pages
 Key: PDFBOX-2567
 URL: https://issues.apache.org/jira/browse/PDFBOX-2567
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 1.8.7
Reporter: Siegfried Goeschl


I'm currently converting a lot of PDF documents to images. For this particular 
document I'm only able to extract one page:

{noformat}
List pages = pdDocument.getDocumentCatalog().getAllPages();
{noformat}

Using Mac OS Preview I see that the document actually contains two pages.

Please note that I have permission from my customer to upload the document.






[jira] [Commented] (PDFBOX-2524) [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages

2014-12-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246128#comment-14246128
 ] 

ASF subversion and git services commented on PDFBOX-2524:
-

Commit 1645523 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1645523 ]

PDFBOX-2524: Added unit test for CIDFontType2 embedding



[jira] [Updated] (PDFBOX-2383) PDFBox tests include copyright files

2014-12-14 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-2383:

Priority: Blocker  (was: Major)

> PDFBox tests include copyright files
> 
>
> Key: PDFBOX-2383
> URL: https://issues.apache.org/jira/browse/PDFBOX-2383
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.7, 2.0.0
>Reporter: John Hewson
>Priority: Blocker
> Fix For: 2.0.0
>
>
> The test files for PDFBox, FontBox, and Preflight include several files under 
> copyright which we probably don't have permission to redistribute, and need 
> to be removed (or preferably replaced):
> pdfbox/src/test/resources/org/apache/pdfbox/
>   - ttf/ArialMT.ttf (This is actually Bitstream Vera Sans - the license on 
> this might be ok though?)
>   - pdfparser/gdb-refcard.pdf (GPL licensed)
>   - pdmodel/page_label.pdf (Edited by Foxit PDF for Evaluation Only)
>   - pdmodel/font/256.pdf (Copyright 2004 Journal of Combinatorics)
> fontbox/src/test/resources/ttf/
> - testTrueType.ttf (NewBaskerville, Copyright © 2002 Veronika Elsner)
> preflight/src/test/resources/org/apache/padaf/preflight/font/
> - true_type.ttf (Subset of Microsoft Arial)





[jira] [Commented] (PDFBOX-2383) PDFBox tests include copyright files

2014-12-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246127#comment-14246127
 ] 

ASF subversion and git services commented on PDFBOX-2383:
-

Commit 1645522 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1645522 ]

PDFBOX-2383: Replaced "ArialMT" with Liberation Sans

> PDFBox tests include copyright files
> 
>
> Key: PDFBOX-2383
> URL: https://issues.apache.org/jira/browse/PDFBOX-2383
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.7, 2.0.0
>Reporter: John Hewson
> Fix For: 2.0.0
>
>
> The test files for PDFBox, FontBox, and Preflight include several files under 
> copyright which we probably don't have permission to redistribute, and need 
> to be removed (or preferably replaced):
> pdfbox/src/test/resources/org/apache/pdfbox/
>   - ttf/ArialMT.ttf (This is actually Bitstream Vera Sans - the license on 
> this might be ok though?)
>   - pdfparser/gdb-refcard.pdf (GPL licensed)
>   - pdmodel/page_label.pdf (Edited by Foxit PDF for Evaluation Only)
>   - pdmodel/font/256.pdf (Copyright 2004 Journal of Combinatorics)
> fontbox/src/test/resources/ttf/
> - testTrueType.ttf (NewBaskerville, Copyright © 2002 Veronika Elsner)
> preflight/src/test/resources/org/apache/padaf/preflight/font/
> - true_type.ttf (Subset of Microsoft Arial)





[jira] [Commented] (PDFBOX-2524) [PATCH] Two PDFont to create PDF documents in CJK and non-ISO-8859-1 languages

2014-12-14 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246103#comment-14246103
 ] 

John Hewson commented on PDFBOX-2524:
-

I'm working on PDFBOX-2565 currently, so don't worry about sending a patch. 
Let's open a new issue for Type0/CIDFontType0 fonts, as these are CFF instead 
of TrueType.



[jira] [Resolved] (PDFBOX-2553) CalRGB colors different

2014-12-14 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-2553.
-
   Resolution: Fixed
Fix Version/s: 2.0.0
 Assignee: Tilman Hausherr

[~ssteiner1] thanks for providing this file!

I am setting this to resolved although we've not handled different whitepoint 
cases. However I have not yet found any real world example of such a file.

> CalRGB colors different
> ---
>
> Key: PDFBOX-2553
> URL: https://issues.apache.org/jira/browse/PDFBOX-2553
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: Tilman Hausherr
>  Labels: CalRGB
> Fix For: 2.0.0
>
> Attachments: PDFBOX-2553.pdf-1.png, anna-iptimgx330.pdf-1.png, 
> pdfbox-2553.pdf-1-NEWIMPROVED.png
>
>
> http://acroeng.adobe.com/Test_Files/images/jpeg2000//Anna-IptImgx330.pdf
> java -cp 
> pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:/path/jai_imageio.jar 
> org.apache.pdfbox.tools.PDFToImage Anna-IptImgx330.pdf





[jira] [Commented] (PDFBOX-2553) CalRGB colors different

2014-12-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246011#comment-14246011
 ] 

ASF subversion and git services commented on PDFBOX-2553:
-

Commit 1645478 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1645478 ]

PDFBOX-2553: fix utf8 character

> CalRGB colors different
> ---
>
> Key: PDFBOX-2553
> URL: https://issues.apache.org/jira/browse/PDFBOX-2553
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
>  Labels: CalRGB
> Attachments: PDFBOX-2553.pdf-1.png, anna-iptimgx330.pdf-1.png, 
> pdfbox-2553.pdf-1-NEWIMPROVED.png
>
>
> http://acroeng.adobe.com/Test_Files/images/jpeg2000//Anna-IptImgx330.pdf
> java -cp 
> pdf-box-svn/app/target/pdfbox-app-2.0.0-SNAPSHOT.jar:/path/jai_imageio.jar 
> org.apache.pdfbox.tools.PDFToImage Anna-IptImgx330.pdf





[jira] [Commented] (PDFBOX-2553) CalRGB colors different

2014-12-14 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246009#comment-14246009
 ] 

ASF subversion and git services commented on PDFBOX-2553:
-

Commit 1645477 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1645477 ]

PDFBOX-2553: use Adobe algorithm for conversion from CalRGB to CIEXYZ 
colorspace when whitepoint is (1 1 1); move CIEXYZ to RGB conversion to CIE 
base class



[jira] [Commented] (PDFBOX-2553) CalRGB colors different

2014-12-14 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14245950#comment-14245950
 ] 

Tilman Hausherr commented on PDFBOX-2553:
-

This matter is more complex, sadly. The old code works fine for whitepoint 
(0.9505 1.0 1.089), also known as D65. The test file has no whitepoint, so it 
is (1 1 1), and there the code above works. What we can't handle are 
whitepoints that are neither D65 nor (1 1 1), as in 
http://bugs.ghostscript.com/show_bug.cgi?id=686749 . Adobe doesn't say how to 
handle the whitepoint; they say it is "beyond the scope of this document". So 
I'll keep the "hack" and use the "new" method for whitepoint (1 1 1) only. The 
only file with whitepoint (1 1 1) that I found is PDFBOX-2307-159827.pdf, on 
pages 4, 10 and 16. All my other test files have the D65 whitepoint. (I haven't 
looked at the digitalcorpora files.)
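
The case split described above can be sketched as follows. This is not the committed code: the class, the method names and the strategy labels are hypothetical, and toXyz is the CalRGB-to-XYZ formula as I understand it from the PDF specification (components raised to the Gamma entries, then multiplied by the Matrix entries), which may or may not match what the commit does in detail.

```java
/** Hypothetical sketch of the whitepoint case split; not the PDFBox implementation. */
final class CalRgbSketch {

    static final float[] D65 = {0.9505f, 1.0f, 1.089f};
    static final float[] UNIT = {1f, 1f, 1f};

    /** Compares two whitepoints component by component with a small tolerance. */
    static boolean sameWhitepoint(float[] a, float[] b) {
        for (int i = 0; i < 3; i++) {
            if (Math.abs(a[i] - b[i]) > 1e-4f) {
                return false;
            }
        }
        return true;
    }

    /** Labels are made up: (1 1 1) uses the formula below, D65 the old code, others are unhandled. */
    static String strategyFor(float[] whitepoint) {
        if (sameWhitepoint(whitepoint, UNIT)) {
            return "formula";
        }
        if (sameWhitepoint(whitepoint, D65)) {
            return "legacy";
        }
        return "unsupported";
    }

    /**
     * gamma = {GR, GG, GB}; matrix = {XA, YA, ZA, XB, YB, ZB, XC, YC, ZC}.
     * Returns {X, Y, Z} for the CalRGB components a, b, c in the range 0..1.
     */
    static double[] toXyz(double a, double b, double c, double[] gamma, double[] matrix) {
        double pa = Math.pow(a, gamma[0]);
        double pb = Math.pow(b, gamma[1]);
        double pc = Math.pow(c, gamma[2]);
        return new double[] {
            matrix[0] * pa + matrix[3] * pb + matrix[6] * pc,
            matrix[1] * pa + matrix[4] * pb + matrix[7] * pc,
            matrix[2] * pa + matrix[5] * pb + matrix[8] * pc
        };
    }

    public static void main(String[] args) {
        double[] identity = {1, 0, 0, 0, 1, 0, 0, 0, 1};
        double[] xyz = toXyz(0.5, 0.25, 1.0, new double[] {2.0, 1.0, 1.0}, identity);
        System.out.println(xyz[0] + " " + xyz[1] + " " + xyz[2]);
        System.out.println(strategyFor(new float[] {0.9505f, 1.0f, 1.089f}));
    }
}
```

The sketch deliberately stops at XYZ; the further conversion from XYZ to RGB is outside its scope.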



[jira] [Commented] (PDFBOX-2512) OutOfMemory while signing large documents

2014-12-14 Thread Thomas Chojecki (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14245909#comment-14245909
 ] 

Thomas Chojecki commented on PDFBOX-2512:
-

I will try to apply the fix to the trunk next year and take a look at 
PDFBOX-2515. 

The last test with the nonSeq parser and the large document linked in the issue 
description worked more or less and created a valid signature out of the box. 
But the size was more than twice that of the original document.

> OutOfMemory while signing large documents
> -
>
> Key: PDFBOX-2512
> URL: https://issues.apache.org/jira/browse/PDFBOX-2512
> Project: PDFBox
>  Issue Type: Bug
>  Components: Parsing, Signing
>Affects Versions: 1.8.7
>Reporter: Thomas Chojecki
>Assignee: Thomas Chojecki
> Fix For: 1.8.8
>
> Attachments: keystore.p12
>
>
> While working with large documents, we found some memory issues.
> 1. The close() method in COSDocument clones the objectpool and does not 
> clean it up properly. The cloning in getObjects() causes an OutOfMemory 
> exception.
> 2. The COSWriter copies the whole PDF into memory for signing and does not 
> use a BufferedInputStream for the FileInputStream, which also has a big 
> performance impact. (PDFBOX-1798)
> 3. The cloning of COSStreams causes an OutOfMemory exception.
> I used the CreateSignature example with a document of about 150 MB from here:
> https://cdn-reichelt.de/bilder/downloads/reichelt_01-2015_DE_B_HQ.pdf
> Additionally, I added a RandomAccessFile to the PDDocument.load call in the 
> CreateSignature class:
> PDDocument doc = PDDocument.load(document,new RandomAccessFile(new 
> File("d:\\temp.bin"), "rw")); (this prevents the OOM for the third case)
> Using a BufferedInputStream in case two reduces the signing time 
> from more than 5 minutes to less than 1 minute. 
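
As a rough illustration of the second point, here is a JDK-only sketch of wrapping a FileInputStream in a BufferedInputStream. It is not the COSWriter code; the class and method names are hypothetical, and hashing the file byte by byte is just a stand-in for whatever the signing code reads.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical illustration of buffered file reads; this is not the COSWriter code. */
final class BufferedDigestSketch {

    /** Hashes a file one byte at a time; the buffer serves most of the per-byte reads. */
    static byte[] sha256(Path file) {
        try (InputStream in = new BufferedInputStream(new FileInputStream(file.toFile()))) {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            int b;
            // Without the BufferedInputStream wrapper, every read() would go to the file.
            while ((b = in.read()) != -1) {
                digest.update((byte) b);
            }
            return digest.digest();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Writes the given number of zero bytes to a temporary file and returns its path. */
    static Path writeTempFile(int size) {
        try {
            Path temp = Files.createTempFile("buffered-digest", ".bin");
            Files.write(temp, new byte[size]);
            return temp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path temp = writeTempFile(64 * 1024);
        try {
            System.out.println(sha256(temp).length + " byte digest");
        } finally {
            temp.toFile().delete();
        }
    }
}
```

The wrapper matters most for small reads like the single-byte read() above; code that already reads in large chunks gains much less from it.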


