Re: BUG: Mapping of ascii minus character in PS renderer
Jeremias Maerki wrote:
> [..] I'll put your fix in but I can't guarantee that it'll be before
> Christian does the release.

Bug #15936 is still an open issue... I have mixed feelings about committing patches at this stage of the release, but it's OK if they are as simple as this one. (I'm just thinking about committing the Named Destination patch...)

Christian

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: BUG: Mapping of ascii minus character in PS renderer
Jeremias Maerki wrote:
> What we probably need is a custom encoding scheme like Acrobat Reader
> uses when converting PDF to PostScript (PDFEncoding). That'll be some
> work...

Yes. Guess why I didn't do that yesterday. 8-) Even if you use some PS code to copy the standard encoding and just change the characters you want. Years ago, I did that for IBM codepage 850. Typing and testing took the better part of 2 days, if I remember correctly.

If your goal were to write the best possible PS renderer for FOP, I suppose you would want to do the character mapping entirely in the PS renderer and use re-encoding in the PS code only where necessary to get at characters not in the standard encoding. This would minimize PS file size (OK, only a relatively small absolute saving). Also, I guess you would save on typing time, as you could use an integer mapping table instead of all those long symbolic names in the PS code.

BTW: During the night, a FOP with the fix has already rendered about 50,000 pages. I checked this morning - looks good.

--
Cappelino Informationstechnologie GmbH
Arnd Beißner
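Arnd's integer-mapping-table idea could be sketched in Java roughly as follows. This is a hypothetical illustration, not FOP code: the class name, method, and the single table entry are invented for the example; only the 45 -> 173 (octal 255) remapping comes from the patch in this thread.

```java
// Hypothetical sketch of an integer mapping table: for each character
// code the renderer wants to emit, store the code point it occupies in
// the target (ISOLatin1) encoding. -1 means "no remap needed".
public class EncodingRemap {
    private final int[] remap = new int[256];

    public EncodingRemap() {
        for (int i = 0; i < 256; i++) {
            remap[i] = -1;            // default: pass through unchanged
        }
        // ASCII 45 ('-') should come out as the ISOLatin1 hyphen at
        // octal 255 (decimal 173), not the minus that ISOLatin1 has at 45.
        remap[45] = 173;
    }

    /** Returns the code to emit for the given input code. */
    public int map(int code) {
        int target = (code >= 0 && code < 256) ? remap[code] : -1;
        return (target >= 0) ? target : code;
    }
}
```

Compared with emitting symbolic glyph names in the PS prologue, a table like this keeps all the mapping knowledge on the Java side, which is the direction Arnd suggests above.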
Re: BUG: Mapping of ascii minus character in PS renderer
On 22.01.2003 23:55:14 Arnd Beißner wrote:
> Hello there,
>
> after some research I found and fixed a bug in the PS renderer
> that can be a real nuisance.

Yeah, one that I never got around to fixing.

> The problem is as follows: The ascii (and Unicode) minus character is
> mapped to the hyphen character by the PDF renderer. The PostScript
> renderer instead maps it to the minus character. This happens because
> the generated PS code re-encodes the fonts to ISO Latin 1 encoding,
> which handles ascii code 45 differently from the standard PS font
> encoding.
>
> Typographically, the character at 45 in ISOLatin1 is a real minus, and
> the character at 45 in Standard Encoding is a hyphen, which is about
> half as wide as the minus in your average font. The difference in your
> PS output can be quite destructive, as FOP always formats assuming the
> width of the hyphen character...
>
> A "patch" follows. The reason I'm not yet submitting a real diff to
> Bugzilla is that I am a) extremely overloaded right now and b) this
> really needs to be discussed:
>
> Some thoughts on this (by 'FOP' I mean formatter + PDF renderer code):
>
> 1. Who's right and who's wrong? Either FOP - or - the PS renderer is
> right, but who?

I'm sure that the PS renderer is wrong. When I wrote it, I used ISOLatin1 encoding because it got more characters right than StandardEncoding did. :-) I didn't want to spend too much time on this because at that time the PS renderer was merely a proof of concept.

> 2. If FOP is right, then the PS renderer must be fixed. This can be
> done either by fixing the method renderWordArea or by changing the PS
> procedures. However, the latter would increase PS file size (the ISO
> Latin 1 encoding can't be copied the way the standard encoding can),
> so I opted for changing renderWordArea.

Not happy with that in the long run. For fixing this immediately, it's OK. When I rewrite the PS renderer for the redesign, I intend to get this right from the beginning.
The problem is not just the hyphen character. There are others. The problem is that the base14 fonts are set to WinAnsiEncoding (see org.apache.fop.render.pdf.fonts.Helvetica) while the PS renderer uses ISOLatin1. So, depending on the characters used, you get multiple mismatches, not just the hyphen character. What we probably need is a custom encoding scheme like Acrobat Reader uses when converting PDF to PostScript (PDFEncoding). That'll be some work...

> 3. If FOP is wrong, then probably someone else must fix it - I suppose
> I won't find the right place for the fix easily.

FOP is right.

> Personally I think the PS renderer is wrong, since the original Adobe
> PS character encoding maps ascii 45 to the hyphen character, and Adobe
> usually knows what they're doing. Still, at that point in time, Unicode
> wasn't there yet, so...
>
> This is an issue that we may possibly want to solve before 0.20.5 goes
> final. Personally, I won't have time before the weekend to check with
> the Unicode and/or XSL spec.
>
> Any comments/ideas?
>
> --- temp fix that I use ---

I'll put your fix in but I can't guarantee that it'll be before Christian does the release.

Jeremias Maerki
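To make the WinAnsi/ISOLatin1 mismatch Jeremias describes concrete, here is a small Java sketch. The glyph names are taken from Adobe's published encoding tables; the class and method names are invented for illustration and exist nowhere in FOP.

```java
import java.util.HashMap;
import java.util.Map;

// Glyph names at the disputed code points in the two encodings involved,
// per Adobe's encoding tables. This shows why text formatted against
// WinAnsi metrics breaks when the PS output re-encodes to ISOLatin1.
public class EncodingMismatch {
    static final Map<Integer, String> WIN_ANSI = new HashMap<>();
    static final Map<Integer, String> ISO_LATIN_1 = new HashMap<>();

    static {
        WIN_ANSI.put(45, "hyphen");      // what FOP's metrics assume
        ISO_LATIN_1.put(45, "minus");    // what the PS output produces
        ISO_LATIN_1.put(173, "hyphen");  // where ISOLatin1 keeps the hyphen
    }

    /** True if both encodings put the same glyph at this code point. */
    public static boolean glyphsAgree(int code) {
        return WIN_ANSI.getOrDefault(code, "")
                .equals(ISO_LATIN_1.getOrDefault(code, ""));
    }
}
```

A full "PDFEncoding"-style solution would amount to building this comparison for all 256 code points and re-encoding wherever `glyphsAgree` is false.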
BUG: Mapping of ascii minus character in PS renderer
Hello there,

after some research I found and fixed a bug in the PS renderer that can be a real nuisance.

The problem is as follows: The ascii (and Unicode) minus character is mapped to the hyphen character by the PDF renderer. The PostScript renderer instead maps it to the minus character. This happens because the generated PS code re-encodes the fonts to ISO Latin 1 encoding, which handles ascii code 45 differently from the standard PS font encoding.

Typographically, the character at 45 in ISOLatin1 is a real minus, and the character at 45 in Standard Encoding is a hyphen, which is about half as wide as the minus in your average font. The difference in your PS output can be quite destructive, as FOP always formats assuming the width of the hyphen character...

A "patch" follows. The reason I'm not yet submitting a real diff to Bugzilla is that I am a) extremely overloaded right now and b) this really needs to be discussed:

Some thoughts on this (by 'FOP' I mean formatter + PDF renderer code):

1. Who's right and who's wrong? Either FOP - or - the PS renderer is right, but who?

2. If FOP is right, then the PS renderer must be fixed. This can be done either by fixing the method renderWordArea or by changing the PS procedures. However, the latter would increase PS file size (the ISO Latin 1 encoding can't be copied the way the standard encoding can), so I opted for changing renderWordArea.

3. If FOP is wrong, then probably someone else must fix it - I suppose I won't find the right place for the fix easily.

Personally I think the PS renderer is wrong, since the original Adobe PS character encoding maps ascii 45 to the hyphen character, and Adobe usually knows what they're doing. Still, at that point in time, Unicode wasn't there yet, so...

This is an issue that we may possibly want to solve before 0.20.5 goes final. Personally, I won't have time before the weekend to check with the Unicode and/or XSL spec.

Any comments/ideas?
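The "quite destructive" width difference can be sketched numerically. This is a hypothetical illustration; the 1/1000-em widths used are approximately Helvetica's (hyphen 333, minus 584 per its AFM metrics), and the class is invented for the example.

```java
// Hypothetical illustration: FOP computes line breaks with the hyphen
// width, but the printed PS output draws the (wider) minus glyph, so
// every '-' makes the printed line longer than the formatted line.
public class WidthMismatch {
    static final int HYPHEN_WIDTH = 333; // 1/1000 em, approx. Helvetica
    static final int MINUS_WIDTH = 584;  // 1/1000 em, approx. Helvetica

    /** Extra width (in 1/1000 em) the printed line gains per '-'. */
    public static int overflowPerLine(String line) {
        int dashes = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == '-') {
                dashes++;
            }
        }
        return dashes * (MINUS_WIDTH - HYPHEN_WIDTH);
    }
}
```

At these widths every dash adds roughly a quarter em of unplanned width, which is why dash-heavy justified text visibly overflows.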
--- temp fix that I use ---

PSRenderer, method renderWordArea:

    for (int i = 0; i < l; i++) {
        char ch = s.charAt(i);
        char mch = fs.mapChar(ch);
        // temp fix abe: map ascii '-' to the ISOLatin1 hyphen character
        // (decimal 173 = octal 255)
        if (mch == '-') {
            sb = sb.append("\\" + Integer.toOctalString(173));
        } else /* fix ends */ if (mch > 127) {
            // non-ASCII characters go out as octal escapes
            sb = sb.append("\\" + Integer.toOctalString(mch));
        } else {
            // escape PS string delimiters and the backslash itself
            String escape = "\\()[]{}";
            if (escape.indexOf(mch) >= 0) {
                sb.append("\\");
            }
            sb = sb.append(mch);
        }
    }

--
Cappelino Informationstechnologie GmbH
Arnd Beißner
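For reference, the same escaping logic can be pulled out into a self-contained helper so it can be exercised in isolation. This is a sketch, not FOP code: in the real renderWordArea the input character first goes through the font state's mapChar(), which is omitted here, and the class name is invented.

```java
// Self-contained sketch of the patched escaping loop from renderWordArea.
// Input is assumed to be the already-mapped character (fs.mapChar(ch)
// in the real renderer).
public class PSStringEscaper {
    public static String escape(String s) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            char mch = s.charAt(i);
            if (mch == '-') {
                // map ascii '-' to the ISOLatin1 hyphen at octal 255
                sb.append("\\" + Integer.toOctalString(173));
            } else if (mch > 127) {
                // non-ASCII characters go out as octal escapes
                sb.append("\\" + Integer.toOctalString(mch));
            } else {
                // escape PS string delimiters and the backslash itself
                if ("\\()[]{}".indexOf(mch) >= 0) {
                    sb.append("\\");
                }
                sb.append(mch);
            }
        }
        return sb.toString();
    }
}
```

So "a-b" becomes the PS string content a\255b, and parentheses come out backslash-escaped as PostScript string syntax requires.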