On Fri, 2016-03-18 at 22:53 +0100, Laszlo Ersek wrote:
> It happens to display Michał's name correctly, because it fits in latin2.

Ah, OK. You got lucky on that one. Lots of names *don't* fit in
ISO8859-2.
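
To make that concrete, here is a minimal Python sketch of a "does it fit?" check. The helper name `fits_in` and the sample strings other than Michał's name are illustrative assumptions, not anything from this thread.

```python
def fits_in(text: str, encoding: str = "iso-8859-2") -> bool:
    """Return True if every character of `text` is representable in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Illustrative samples: a Polish name, a Cyrillic name, and an em dash (U+2014).
samples = ["Micha\u0142", "\u0412\u043b\u0430\u0434\u0438\u043c\u0438\u0440", "\u2014"]
for sample in samples:
    print(repr(sample), fits_in(sample))
```

Only the first sample passes; the same three strings all pass when the check is run with `encoding="utf-8"`.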

> The extreme lengths that I had to go to were necessary to convince
> git-send-email not to mess up Michał's CC in the email headers, picking
> it up from the commit message. The commit message was all right, but the
> email header got mis-encoded (it's a git bug; it thought the CC line was
> UTF-8, despite knowing that the full commit message was latin2).

I'd be interested in a reference to the upstream bug report for that
one.

It's going to be caused by the use of 'i18n.commitencoding=latin2'.

Hm... did you realise that when you use that setup and you commit and
push directly, that means you are putting latin2-encoded objects into
the upstream EDK2 git repository?

I'm surprised that even works without corrupting things for everyone —
it looks like your commits actually carry an explicit label marking
them as latin2.

However, I really do think that's best avoided.
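
For reference, the label in question is the optional "encoding" header that git stores in a commit object when the commit encoding is not UTF-8. The following is a rough Python sketch of how a reader is expected to honour it. The function name `decode_commit_message` and the sample commit bytes are made up for illustration, and this is not git's own implementation.

```python
def decode_commit_message(raw: bytes, default: str = "utf-8") -> str:
    """Decode the message of a raw commit object, honouring its encoding header."""
    # Headers and message are separated by the first blank line.
    header, _, body = raw.partition(b"\n\n")
    encoding = default
    for line in header.split(b"\n"):
        if line.startswith(b"encoding "):
            encoding = line[len(b"encoding "):].decode("ascii")
    return body.decode(encoding)


# Hypothetical commit object with a latin2 label and a latin2-encoded message.
SAMPLE = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A U Thor <author@example.com> 1458338000 +0100\n"
    b"committer A U Thor <author@example.com> 1458338000 +0100\n"
    b"encoding latin2\n"
    b"\n"
    + "Cc: Micha\u0142 <m@example.com>\n".encode("iso-8859-2")
)

print(decode_commit_message(SAMPLE))
```

Any consumer that takes the message bytes without looking at that header falls back to the UTF-8 default, and the same bytes no longer decode.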

The lesson we learned in the 20th century was that labelling character
sets just doesn't work very well. The labels fall off, assumptions get
made, and things get interpreted in the "default" character set. So your
non-standard latin2 commits *are* at some point going to be wrongly
interpreted as UTF-8. In fact, wait a minute... you just *gave* me an
example of that happening :)

The labelling problem hasn't gone away entirely in the 21st century —
everyone using a modern setup still ought to be explicitly labelling
their output as UTF-8, and correctly converting from legacy on the way
in.

But in a system where everything is UTF-8 all of the time, mislabelling
errors are completely harmless. Errors like the one above — while they
might happen — don't actually cause a problem. Something was UTF-8 and
you mistakenly interpreted it as UTF-8 because you misplaced its label?
Oh well, nobody noticed :)
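
A small Python sketch of that asymmetry, assuming a reader that has lost the label and falls back to UTF-8 (the helper name `read_unlabelled` is hypothetical):

```python
name = "Micha\u0142"

latin2_bytes = name.encode("iso-8859-2")  # the l-with-stroke becomes the single byte 0xB3
utf8_bytes = name.encode("utf-8")         # the l-with-stroke becomes the two bytes 0xC5 0x82


def read_unlabelled(data: bytes) -> str:
    """Decode bytes whose label was lost, assuming the UTF-8 default."""
    # errors="replace" substitutes U+FFFD for anything that is not valid UTF-8.
    return data.decode("utf-8", errors="replace")


print(read_unlabelled(utf8_bytes))    # label lost, text intact
print(read_unlabelled(latin2_bytes))  # label lost, last character replaced by U+FFFD
```

With UTF-8 data the lost label makes no difference; with latin2 data the same mistake visibly damages the name.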

So yes, I'm interested in the bug because it should be fixed. But
basically, you brought it upon yourself by operating in a mode that is
*known* to invite such errors, and was abandoned by most other people a
*long* time ago. And you're pushing that choice into the EDK2 history
so that others are exposed to the same class of bugs. :(

> emdash doesn't display correctly for me indeed, in the output of "git
> log". It's not a big annoyance, but if I am to apply a patch, I try to
> prevent it.
> 
> > 
> > I strongly suspect the latter, and that you only noticed because you
> > were looking closely at the encoding because of that Evolution bug?
> No. I tend to notice glyphs by the naked eye that are not ascii / latin1
> / latin2. Here's another example:
> 
> http://thread.gmane.org/gmane.comp.bios.edk2.devel/8563/focus=8566

I'll see that one and raise you a 'There is nothing you can put in the
man page source to make it output the correct dash on all devices' :)

https://bugzilla.redhat.com/show_bug.cgi?id=1173619


> Okay. I've googled the use of emdash in the English language, and it
> seems to be more or less interchangeable with parens. Is that okay?

In some usages, yes — but probably not this example.

Two dashes is often considered an acceptable rendering of the emdash,
if you *really* must... but in this case since it isn't actually
necessary, I'd very much prefer that you leave it as it is.

-- 
dwmw2

