On Fri, Jul 17, 2009 at 10:19 PM, Edward K. Ream<[email protected]> wrote:

>> We already have support for explicit selection of encoding with
>> external files.
>
> I assume you mean lines like:
>
> @first # -*- coding: utf-8 -*-

No, I mean stuff like:

@encoding utf-8

>> For .leo files, it helps to only support one format.
>
> I don't see why.  .leo files start with:
>
> <?xml version="1.0" encoding="utf-8"?>
>
> This matches the production at: http://www.w3.org/TR/REC-xml/#NT-XMLDecl,
> namely:
>
> XMLDecl   ::=   '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
>
> In other words, an encoding spec, while optional, is a normal part of xml
> documents.
>
> I am assuming the w3c folks had a reason for such encoding declarations, and
> that being so, I see no reason to decree that the encoding shall be utf-8.

XML is rather "open-ended", and it's understandable why they didn't
want to lock down the format. However, leo writes its own xml files,
so it can decide that every file it writes out will be utf-8. That may
give a less compact representation for non-European scripts, but
that's a very small nit when you weigh it against the benefit of
having an encoding you can rely on.
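
To put a number on that nit, here is a minimal Python 3 sketch that
compares encoded sizes. The sample strings and the encoded_sizes helper
are made up for illustration; nothing here comes from leo itself.

```python
# Compare how many bytes the same text needs in UTF-8 and in UTF-16.
# The sample headlines are illustrative only.
samples = {
    "english": "headline",
    "german": "Überschrift",
    "japanese": "見出し",
}


def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count without a BOM)."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


for name, text in samples.items():
    utf8_size, utf16_size = encoded_sizes(text)
    print(name, "utf-8:", utf8_size, "utf-16:", utf16_size)
```

Only the Japanese sample comes out larger in utf-8; ASCII-heavy text,
which includes all of the xml markup, is half the size of utf-16.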

This doesn't mean leo shouldn't read non-utf8 xml files, but that's
something we get for free from the sax parser anyway. What I'm saying
is that it's a bad idea to write anything other than utf-8.
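
A rough sketch of that "read anything, write utf-8" policy, using the
standard library's xml.sax in Python 3. The TextCollector class, the
read_text helper and the <v> element are hypothetical and are not leo's
actual reader or file format.

```python
import xml.sax
from xml.sax.saxutils import escape


class TextCollector(xml.sax.ContentHandler):
    """Collect all character data; the parser hands it over as str."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def read_text(xml_bytes):
    """Parse XML bytes, honouring whatever encoding the declaration names."""
    handler = TextCollector()
    xml.sax.parseString(xml_bytes, handler)
    return "".join(handler.chunks)


# Input declared (and encoded) as iso-8859-1.
latin1_doc = (
    '<?xml version="1.0" encoding="iso-8859-1"?><v>Überschrift</v>'
).encode("iso-8859-1")

text = read_text(latin1_doc)

# Output is always utf-8, whatever the input was.
utf8_doc = (
    '<?xml version="1.0" encoding="utf-8"?><v>%s</v>' % escape(text)
).encode("utf-8")
```

The parser does the decoding from the declaration, so the reading side
needs no encoding logic of its own; only the writing side fixes a
choice.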

>> (specifically, I believe the bug reported here, umlauts not working, is
>> caused by some encoding glitch).
>
> Naturally, any time there are representational issues, the cause must surely
> be an encoding problem.  It seems to me that such problems are the reason
> why we do indeed want to be able to specify encodings!

If utf-8 won't work for some reason, I'm hard pressed to imagine how
another encoding would work better (for .leo files). It's easy to
imagine why other encodings would work worse, though.

I think the flexibility of encodings is somewhat similar to the
positions-vs-vnodes thing - the fact that the option is available and
documented creates the illusion that it could somehow be useful,
whereas it generally won't be. There is one contorted use case where I
can imagine the ISO-8859-* (Latin) encodings being useful: editing the
.leo xml file in an editor that doesn't support utf-8. However, even
those editors "sort of" support utf-8, only with junk characters in
place of the non-ascii characters, and this (somewhat futile) use case
is better served by users doing some kind of file format conversion
with an external tool.
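
Such a conversion could look roughly like the following Python 3
sketch. The transcode_xml function and the sample document are
hypothetical; this is not a tool that ships with leo. It assumes the
file starts with an xml declaration that names its encoding.

```python
import re


def transcode_xml(data, source="utf-8", target="iso-8859-1"):
    """Re-encode XML bytes and update the encoding in the declaration.

    Characters the target encoding cannot represent are written as
    numeric character references, so no information is lost.
    """
    text = data.decode(source)
    # Rewrite only the first encoding="..." found in the XML declaration.
    text = re.sub(
        r'(<\?xml[^>]*encoding=")[^"]*(")',
        lambda m: m.group(1) + target + m.group(2),
        text,
        count=1,
    )
    return text.encode(target, errors="xmlcharrefreplace")


original = '<?xml version="1.0" encoding="utf-8"?><v h="Über 見"/>'.encode("utf-8")
converted = transcode_xml(original)
print(converted)
```

The same function with source and target swapped turns the file back
into utf-8, so the editor-specific copy never has to be the one leo
writes.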

-- 
Ville M. Vainio
http://tinyurl.com/vainio
