Re: BUG: Mapping of ascii minus character in PS renderer

2003-01-24 Thread Christian Geisert
Jeremias Maerki wrote:

[..]


I'll put your fix in but I can't guarantee that it'll be before
Christian does the release.


Bug #15936 is still an open issue ...

I have mixed feelings about committing patches at this stage of the
release, but it's OK if they are as simple as this one.
(I'm just thinking about committing the Named Destination patch...)

Christian


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, email: [EMAIL PROTECTED]




Re: BUG: Mapping of ascii minus character in PS renderer

2003-01-23 Thread Arnd Beißner
Jeremias Maerki wrote:
> is a custom encoding scheme like Acrobat Reader
> uses when converting PDF to PostScript
> (PDFEncoding). That'll be some work...

Yes. Guess why I didn't do that yesterday. 8-)

It's a lot of work even if you use some PS code to copy the
standard encoding and just change the characters you want.
Years ago I did that for IBM codepage 850; typing and testing
took the better part of 2 days, if I remember correctly.

If your goal were to write the best possible
PS renderer for FOP, I suppose you would want to
do the character mapping entirely in the PS renderer
and use reencoding in the PS code only where necessary
to get at characters not in the standard encoding.
This would minimize PS file size (OK, only a
relatively small absolute saving). Also, I guess
you would save on typing time, as you could use an
integer mapping table instead of all these
long symbolic names in the PS code.
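The integer mapping table idea above could be sketched roughly like this (a hypothetical illustration, not FOP code; the class and method names are mine, and only the one mismatch discussed in this thread is filled in):

```java
// Hypothetical sketch: remap output byte codes in the renderer itself
// instead of emitting symbolic /glyphname reencoding in the PS prolog.
public class GlyphRemapTable {
    private final int[] remap = new int[256];

    public GlyphRemapTable() {
        for (int i = 0; i < 256; i++) {
            remap[i] = i;           // identity mapping by default
        }
        // ASCII hyphen-minus (0x2D) -> ISOLatin1 hyphen at octal 255 (0xAD)
        remap[0x2D] = 0xAD;
        // ...further entries for other mismatched code points...
    }

    public int map(int code) {
        return remap[code & 0xFF];
    }
}
```

The renderer would then write the remapped code (as an octal escape where needed) instead of the original character.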

BTW: During the night, a FOP build with the fix already
rendered about 50,000 pages. I checked this morning
- looks good.

--
Cappelino Informationstechnologie GmbH
Arnd Beißner






Re: BUG: Mapping of ascii minus character in PS renderer

2003-01-23 Thread Jeremias Maerki

On 22.01.2003 23:55:14 Arnd Beißner wrote:
> Hello there,
> 
> after some research I found and fixed a bug in the PS renderer
> that can be a real nuisance.

Yeah, one that I never got round to fixing.

> The problem is as follows: The ascii (and Unicode) minus
> character is mapped to the hyphen character by the PDF
> renderer. The PostScript renderer instead maps it to the
> minus character. This happens because the generated
> PS code reencodes the fonts to ISO Latin 1 encoding, which
> handles ascii code 45 differently from the standard PS font
> encoding.
> 
> Typographically, the character at 45 in ISOLatin1 is a real minus,
> and the character at 45 in Standard Encoding is a hyphen, which
> is about half as wide as the minus in your average font. The
> difference in your PS output can be quite destructive, as FOP
> always formats assuming the width of the hyphen character...
> 
> A "patch" follows. The reason I'm not yet submitting a real diff
> to Bugzilla is that I am a) extremely overloaded right now and
> b) this really needs to be discussed:
> 
> Some thoughts on this
> (by 'FOP' I mean formatter+PDF renderer code):
> 
> 1. Who's right and who's wrong?
> Either FOP or the PS renderer is right - but which?

I'm sure that the PS renderer is wrong. When I wrote it, I used
ISOLatin1 encoding because it got more characters right than
StandardEncoding did. :-) I didn't want to spend too much time on this
because at that time the PS renderer was merely a proof of concept.

> 2. If FOP is right, then the PS renderer must be
> fixed. This can be done either by fixing the method
> renderWordArea or by changing the PS procedures.
> However, the latter would increase PS file size
> (can't copy the ISO Latin 1 encoding as opposed
> to the standard encoding), so I opted for changing
> renderWordArea.

I'm not happy with that in the long run, but as an immediate fix it's OK.
When I rewrite the PS renderer for the redesign I intend to get this
right from the beginning. The problem is not just the hyphen character;
there are others. The base14 fonts are set to WinAnsiEncoding (see
org.apache.fop.render.pdf.fonts.Helvetica) while the PS renderer uses
ISOLatin1, so depending on the characters used you get multiple
mismatches, not just the hyphen. What we probably need is a custom
encoding scheme like the one Acrobat Reader uses when converting PDF
to PostScript (PDFEncoding). That'll be some work...
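To make the mismatch concrete, here is a minimal illustration (plain Java, not FOP code; the glyph names at code point 0x2D are taken from the descriptions in this thread):

```java
import java.util.Map;

// Glyph assigned to code 0x2D ('-') in each encoding, as described in this
// thread: StandardEncoding and WinAnsiEncoding use the narrow /hyphen, while
// ISOLatin1Encoding uses the wider /minus (its /hyphen sits at octal 255).
public class EncodingMismatch {
    static final Map<String, String> GLYPH_AT_0X2D = Map.of(
            "StandardEncoding", "hyphen",
            "WinAnsiEncoding", "hyphen",
            "ISOLatin1Encoding", "minus");

    // True if the same byte code selects different glyphs in the two encodings.
    public static boolean mismatch(String pdfEncoding, String psEncoding) {
        return !GLYPH_AT_0X2D.get(pdfEncoding).equals(GLYPH_AT_0X2D.get(psEncoding));
    }
}
```

With the PDF side on WinAnsiEncoding and the PS renderer on ISOLatin1Encoding, code 0x2D selects different glyphs, which is exactly the width mismatch Arnd observed.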

> 3. If FOP is wrong, then probably someone else
> must fix it - I suppose I won't find the right place
> for the fix easily.

FOP is right.

> Personally I think the PS renderer is wrong, since
> the original Adobe PS character encoding maps
> ascii 45 to the hyphen character and Adobe usually
> knows what they're doing. Still, at that point in time,
> Unicode wasn't there yet, so...
> 
> This is an issue that we may possibly want
> to solve before 0.20.5 goes final. Personally, I won't
> have time before the weekend to check with
> the Unicode and/or XSL spec.
> 
> Any comments/ideas?
> 
> --- temp fix that I use ---


I'll put your fix in but I can't guarantee that it'll be before
Christian does the release.

Jeremias Maerki






BUG: Mapping of ascii minus character in PS renderer

2003-01-22 Thread Arnd Beißner
Hello there,

after some research I found and fixed a bug in the PS renderer
that can be a real nuisance.

The problem is as follows: The ascii (and Unicode) minus
character is mapped to the hyphen character by the PDF
renderer. The PostScript renderer instead maps it to the
minus character. This happens because the generated
PS code reencodes the fonts to ISO Latin 1 encoding, which
handles ascii code 45 differently from the standard PS font
encoding.

Typographically, the character at 45 in ISOLatin1 is a real minus,
and the character at 45 in Standard Encoding is a hyphen, which
is about half as wide as the minus in your average font. The
difference in your PS output can be quite destructive, as FOP
always formats assuming the width of the hyphen character...

A "patch" follows. The reason I'm not yet submitting a real diff
to Bugzilla is that I am a) extremely overloaded right now and
b) this really needs to be discussed:

Some thoughts on this
(by 'FOP' I mean formatter+PDF renderer code):

1. Who's right and who's wrong?
Either FOP or the PS renderer is right - but which?

2. If FOP is right, then the PS renderer must be
fixed. This can be done either by fixing the method
renderWordArea or by changing the PS procedures.
However, the latter would increase PS file size
(can't copy the ISO Latin 1 encoding as opposed
to the standard encoding), so I opted for changing
renderWordArea.

3. If FOP is wrong, then probably someone else
must fix it - I suppose I won't find the right place
for the fix easily.

Personally I think the PS renderer is wrong, since
the original Adobe PS character encoding maps
ascii 45 to the hyphen character and Adobe usually
knows what they're doing. Still, at that point in time,
Unicode wasn't there yet, so...

This is an issue that we may possibly want
to solve before 0.20.5 goes final. Personally, I won't
have time before the weekend to check with
the Unicode and/or XSL spec.

Any comments/ideas?

--- temp fix that I use ---
PSRenderer, method renderWordArea:

    for (int i = 0; i < l; i++) {
        char ch = s.charAt(i);
        char mch = fs.mapChar(ch);

        // temp fix abe: map ascii '-' to ISO latin 1 hyphen char (octal 255)
        if (mch == '-') {
            sb = sb.append("\\" + Integer.toOctalString(173));
        } else /* fix ends */ if (mch > 127) {
            // write non-ASCII codes as octal escapes
            sb = sb.append("\\" + Integer.toOctalString(mch));
        } else {
            // escape backslash and PS string delimiters
            String escape = "\\()[]{}";
            if (escape.indexOf(mch) >= 0) {
                sb.append("\\");
            }
            sb = sb.append(mch);
        }
    }
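For reference, the escaping logic of the temp fix can be pulled out into a standalone sketch (FontState.mapChar is FOP-specific and omitted here; input characters are assumed to be already-mapped glyph codes):

```java
// Standalone sketch of the temp fix's string escaping for PS output.
// Note: Integer.toOctalString(173) yields "255", i.e. the ISOLatin1 hyphen.
public class PsStringEscaper {
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char mch = s.charAt(i);
            if (mch == '-') {
                // map ascii hyphen-minus to ISOLatin1 hyphen (octal 255)
                sb.append('\\').append(Integer.toOctalString(173));
            } else if (mch > 127) {
                // non-ASCII codes become octal escapes in the PS string
                sb.append('\\').append(Integer.toOctalString(mch));
            } else {
                // backslash and PS string delimiters must be escaped
                if ("\\()[]{}".indexOf(mch) >= 0) {
                    sb.append('\\');
                }
                sb.append(mch);
            }
        }
        return sb.toString();
    }
}
```

For example, escape("a-b(") produces a\255b\( - the hyphen becomes an octal escape and the parenthesis is backslash-escaped.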
--
Cappelino Informationstechnologie GmbH
Arnd Beißner

