Attached is a zip file with a modified version of DocumentFont.java
implementing the naive solution and a diff of the changes. As I mentioned, a
more elegant approach might be to change the IntHashtable implementation.
Cheers, and thanks for taking the time to look into this.
Gavin
----- Original Message ----
From: Paulo Soares <[EMAIL PROTECTED]>
To: Post all your questions about iText here
<itext-questions@lists.sourceforge.net>
Sent: Thursday, March 13, 2008 2:57:18 PM
Subject: Re: [iText-questions] Bug in DocumentFont when loading Differences
Send me your changes and we'll go from there.
Paulo
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On
> Behalf Of Gavin Disney
> Sent: Thursday, March 13, 2008 6:42 PM
> To: iText Questions
> Subject: Re: [iText-questions] Bug in DocumentFont when
> loading Differences
>
> Okay, and thanks for the quick response.
>
> The problem occurs when a substitution is being made that
> replaces a high unicode value with a low unicode value.
> Consider the font dictionary:
>
> key: /R10 value: Dictionary of type: /Font
> key: /BaseFont value: /Times-Roman
> key: /Type value: /Font
> key: /Subtype value: /Type1
> key: /Encoding value: Dictionary of type: /Encoding
> key: /Type value: /Encoding
> key: /Differences value: [39, /quotesingle]
>
> In the standard encoding, this replaces quoteright (unicode
> value 8217) at position 39 with quotesingle (unicode value
> 39). As the font is loaded by doType1TT() the encoding is
> determined and then "filled" by populating the IntHashtable
> uni2byte with the unicode values and their position. So,
> after initial population, uni2byte holds a key of 8217
> (quoteright) for position 39. Then the differences are
> processed and the mapping of unicode 39 (quotesingle) to
> position 39 is added to the map. Now uni2byte holds 2
> mappings pointing to position 39.
>
> The BaseFont is then created to instantiate metrics, and the
> widths array populated. And this is where we run into the
> problem. The keys are extracted from the map ordered
> ascending and used to populate the widths array using this loop:
>
> <snip>
> int e[] = uni2byte.toOrderedKeys();
> for (int k = 0; k < e.length; ++k) {
>     int n = uni2byte.get(e[k]);
>     widths[n] = bf.getRawWidth(n, GlyphList.unicodeToName(e[k]));
> }
> </snip>
>
> Since uni2byte holds two mappings for n=39 and the keys are
> iterated in ascending order, widths[39] is first populated with
> the width for quotesingle (unicode 39), then overwritten with
> the width for quoteright (unicode 8217). The width for position
> 39 is now incorrect (333 in this case, as opposed to 180 for
> quotesingle).
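
To make the overwrite concrete, here is a minimal, self-contained sketch of the
same effect. It is not iText code: a java.util.TreeMap stands in for the
IntHashtable uni2byte (its ascending key order mimics toOrderedKeys()), and
rawWidth() is a hypothetical stand-in for bf.getRawWidth() that knows only the
two widths quoted above.

```java
import java.util.Map;
import java.util.TreeMap;

final class WidthOverwriteDemo {
    // Hypothetical stand-in for bf.getRawWidth(): only the two widths from the example.
    static int rawWidth(int unicode) {
        switch (unicode) {
            case 39:   return 180; // quotesingle
            case 8217: return 333; // quoteright
            default:   return 0;
        }
    }

    // Same shape as the widths loop: keys visited in ascending order,
    // each one writing widths[position].
    static int[] fillWidths(TreeMap<Integer, Integer> uni2byte) {
        int[] widths = new int[256];
        for (Map.Entry<Integer, Integer> e : uni2byte.entrySet()) {
            widths[e.getValue()] = rawWidth(e.getKey());
        }
        return widths;
    }

    static int widthAt39(boolean removeStaleMapping) {
        TreeMap<Integer, Integer> uni2byte = new TreeMap<>();
        uni2byte.put(8217, 39);      // base encoding: quoteright at position 39
        if (removeStaleMapping) {
            uni2byte.remove(8217);   // drop the mapping that is being replaced
        }
        uni2byte.put(39, 39);        // Differences: [39, /quotesingle]
        return fillWidths(uni2byte)[39];
    }

    public static void main(String[] args) {
        System.out.println(widthAt39(false)); // prints 333
        System.out.println(widthAt39(true));  // prints 180
    }
}
```

With the stale 8217 mapping left in place, the later key wins and position 39
ends up with 333; removing it first yields 180.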
>
> A simple (maybe naive) fix is to populate a second map
> (byte2uni) when filling the encoding, and then query this map
> when processing the differences: if byte2uni already maps a
> character at that position, remove that mapping from uni2byte
> before adding the substitution, e.g.:
>
> <snip>
> for (int k = 0; k < dif.size(); ++k) {
>     PdfObject obj = (PdfObject)dif.get(k);
>     if (obj.isNumber())
>         currentNumber = ((PdfNumber)obj).intValue();
>     else {
>         int c[] = GlyphList.nameToUnicode(
>             PdfName.decodeName(((PdfName)obj).toString()));
>         if (c != null && c.length > 0) {
>             // Remove prior mapping for position being substituted
>             if (byte2uni.containsKey(currentNumber)) {
>                 uni2byte.remove(byte2uni.get(currentNumber));
>             }
>             uni2byte.put(c[0], currentNumber);
>             byte2uni.put(currentNumber, c[0]);
>         }
>         ++currentNumber;
>     }
> }
> </snip>
>
> Another solution is to alter the IntHashtable mechanics to
> allow easy removal of a value - enabling something like
> "uni2byte.removeValue(currentNumber);" or maybe "if
> (uni2byte.contains(currentNumber))
> uni2byte.remove(uni2byte.getKey(currentNumber));".
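
As a rough illustration of what a removeValue operation would have to do, here
is a sketch on a plain java.util.Map rather than on IntHashtable itself. The
removeValue name comes from the suggestion above; the class, the signature and
the return value are assumptions made for this example only.

```java
import java.util.HashMap;
import java.util.Map;

final class RemoveValueSketch {
    // Hypothetical helper: drop every key that currently maps to the given
    // position and report how many mappings were removed.
    static int removeValue(Map<Integer, Integer> uni2byte, int position) {
        int before = uni2byte.size();
        uni2byte.values().removeIf(v -> v == position);
        return before - uni2byte.size();
    }

    public static void main(String[] args) {
        Map<Integer, Integer> uni2byte = new HashMap<>();
        uni2byte.put(8217, 39);        // base encoding: quoteright at position 39
        removeValue(uni2byte, 39);     // clear the position before substituting
        uni2byte.put(39, 39);          // Differences: [39, /quotesingle]
        System.out.println(uni2byte);  // prints {39=39}
    }
}
```

Written this way, each call scans the whole map, whereas the byte2uni approach
finds the stale key with a single lookup at the cost of keeping a second map.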
>
> I have a modified copy of DocumentFont.java that implements
> the naive fix, and am happy to send it to you. I'm still
> playing around with changes to the IntHashtable class as a
> solution. It's not clear to me which approach would impact
> performance least.
>
> Cheers,
> Gavin Disney
>
>
> ----- Original Message ----
> From: Paulo Soares <[EMAIL PROTECTED]>
> To: Post all your questions about iText here
> <itext-questions@lists.sourceforge.net>
> Sent: Thursday, March 13, 2008 1:52:25 PM
> Subject: Re: [iText-questions] Bug in DocumentFont when
> loading Differences
>
>
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On
> > Behalf Of Gavin Disney
> > Sent: Thursday, March 13, 2008 5:39 PM
> > To: iText-questions@lists.sourceforge.net
> > Subject: [iText-questions] Bug in DocumentFont when loading
> > Differences
> >
> > Hi iText Developers
> >
> > There seems to be a minor bug in the processing of
> > Differences from the Encoding dictionary when creating a
> > DocumentFont for Type 1 fonts. The bug is quite subtle, and
> > depends on the substitutions being made and the encoding
> > being used. It is straightforward to correct - I'd like to
> > discuss a couple of possible fixes.
> >
> > Is this the appropriate forum?
> >
>
> Yes.
>
> Paulo