[ 
https://issues.apache.org/jira/browse/PDFBOX-4189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438383#comment-16438383
 ] 

Tilman Hausherr commented on PDFBOX-4189:
-----------------------------------------

Your patch is very much appreciated of course, thank you. It will probably 
result in thousands new users / usages. This is a complex patch so expect this 
to take some time before it is committed. See PDFBOX-4106 for an example of a 
complex patch and the discussion.

Can you please also add an apache header to the .txt files? See the file 
"pdfbox\src\main\resources\org\apache\pdfbox\resources\glyphlist\additional.txt".
 I'd also need to know where this file came from, or whether you created it 
yourself from other data; if yes, please include a comment how, and/or the code 
that created the file.

About the commits:
 - {color:#333333}What is the story of having different data for jdk7 and 
jdk8?{color}
 - BengaliPdfGenerationHelloWorld should be integrated into the 
EmbeddedFonts.java example
 - why a log4j2.xml ? We don't use log4j2 except in preflight where log4j is 
used in Tests
 - I think I understand why my example didn't work. You disabled subsetting. 
But with subsetting the subsetter should "know" which glyphs are used. But we 
do need subsetting because otherwise files might get huge
 - The generated PDF file has trouble with text extraction: "আমি কোন পথ􀆶 􀂧ীর􀆶র 
ল􀂩ী ষ􀄞 পুতুল 􀅠পো গ􀃄া ঋষি" i.e. there are some unknown glyphs.
 - The move of GlyphsubstitutionTable breaks the API. Like I said in the PR, if 
you keep the API as it is (only expand. not change existing methods) then your 
change could be used for 2.0 too. The release of 3.0 could take years. The 
release of 2.0.10 only a few months.
 - There is a lot of logging done ("WARNUNG: oldValue: [52, 114] will be 
overridden with newValue: [114, 52]"). This is scary and should be changed or 
removed, It scares users and they create issues, thinking that something got 
wrong. If you change it to debug, please include a comment what this is about. 
See also the discussion in PDFBOX-4106, about{color:#333333} "Trying to 
un-substitute a never-before-seen gid"{color}.
 - Loosening scope restrictions is a bit of a no-no, as done in 
[TTFDataStream.java|https://github.com/apache/pdfbox/pull/46/files#diff-894ae790d373c62634ceed941b264dc3]
 , 
[TTFTable.java|https://github.com/apache/pdfbox/pull/46/files#diff-355fd8e3330f392bdae0778f942dc124]
 , and maybe elsewhere. As preached by "Effective Java", item 15: "make each 
class or member as inaccessible as possible".
 - Public methods should have a javadoc, same for classes. It doesn't have to 
be big, just make it good enough for other people to understand what is done. 
See also [https://pdfbox.apache.org/codingconventions.html] , I think most 
conventions are already respected.

I have no yet done a review review of the code (looking side-by-side), so more 
questions may be coming.

> Enable rendering of Indian languages, by reading and utilizing the GSUB table
> -----------------------------------------------------------------------------
>
>                 Key: PDFBOX-4189
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4189
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: FontBox, PDModel
>            Reporter: Palash Ray
>            Priority: Major
>         Attachments: Bengali-text-after.pdf, Bengali-text-before.pdf
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Implemented proper rendering of Indian languages, which need extensive Glyph 
> substitution. The GSUB table has been read and used effectively to replace 
> some compound words with their respective Glyphs. All tests are passing. I 
> have tested this for the Bengali font. Please review these changes and let me 
> know if it makes sense to incorporate these.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to