Re: enc=utf-8 does not force menu titles to utf-8

Tony Mechelynck Thu, 19 Jun 2008 19:01:55 -0700

On 20/06/08 03:02, Ben Schmidt wrote:
[...]
>> Have you got any suggestions how to deal with scripts that fail to set
>> :scriptencoding?
>
> The only thing I can think of that would be worth doing would be to
> check that when Vim converts a script that uses :scriptencoding, it uses
> ++bad=drop or ++bad=X and not ++bad=keep. Then as long as 'encoding'
> isn't changed, menus will be valid. If 'encoding' changes, the help
> already says they need reloading.


++bad=drop or ++bad=X could apply in two cases: when Vim meets a
character that is invalid for 'encoding' before ":scriptencoding" has
been used (or after it has been used with no argument), and when, after
":scriptencoding" has been used with a nonempty argument, it meets a
character that is either invalid for that scriptencoding or cannot be
translated into 'encoding'. The script should not change 'encoding',
because the data in already loaded buffers (for instance, in split
windows), in some options (such as 'listchars') and in other menus might
become invalid. The "official" way is to leave 'encoding' unchanged and
to declare near the top of the script, in a ":scriptencoding" statement,
which 'fileencoding' the script was written in, except that UTF-8
doesn't need to be declared if there is a BOM.
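
To make the difference between the "drop" and "replace" policies
concrete, here is a minimal sketch in Python. It is only an analogy, not
Vim's conversion code: the function name convert_menu_title and its
"bad" parameter are made up for illustration, and Python's "ignore" and
"replace" error handlers stand in for ++bad=drop and ++bad=X.

```python
def convert_menu_title(raw: bytes, script_enc: str, target_enc: str,
                       bad: str = "drop") -> bytes:
    """Convert raw script bytes from script_enc to target_enc.

    Hypothetical helper: bad="drop" discards anything that is invalid in
    script_enc or unrepresentable in target_enc, bad="replace"
    substitutes a placeholder. Either way the result is valid in
    target_enc, which is the property wanted for menu titles.
    """
    errors = {"drop": "ignore", "replace": "replace"}[bad]
    text = raw.decode(script_enc, errors=errors)
    return text.encode(target_enc, errors=errors)


# A UTF-8 title containing a character that Latin1 cannot represent.
title = "bœuf".encode("utf-8")
dropped = convert_menu_title(title, "utf-8", "latin-1", bad="drop")
replaced = convert_menu_title(title, "utf-8", "latin-1", bad="replace")
```

In both cases the output can be decoded again as Latin1 without error,
whereas keeping the original bytes unchanged would leave a UTF-8
sequence in a string that is supposed to be Latin1.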

>
>> It seems to me this is a real problem since it makes
>> the encoding of (e.g.) menu titles unpredictable, and it's thus up to
>> the OS APIs to deal with the issue (not very reliable since each OS
>> behaves differently).
>
> I think it's merely the responsibility to ensure the application doesn't
> crash when passed invalid data, which is always a possibility. This is
> OS specific. But what kind of garbage you should get is surely
> undefined.
>
> Ben.

The menus should be in a language for which the current 'encoding' is
valid. For instance, it makes no sense to set up Chinese menus in hanzi
while 'encoding' is latin1. If 'encoding' is euc-cn and the
scriptencoding is GB18030 or UTF-8, it is somewhat more ticklish: both
of the latter can be used for Chinese text, but they can also represent
some (rare) Chinese characters which have no representation in euc-cn.
The idea then is either to have several menu scripts (one per encoding
to be used), or to use the lowest common denominator in the menu. For
instance, in French the œ ligature appears in such common words as
"œuf" (egg), "bœuf" (ox, beef), "sœur" (sister) and "nœud" (knot,
hitch), as well as in quite a number of rarer words of Greek origin;
however, it cannot be represented in Latin1. A French menu including
such words might then have to exist in a Latin1 version (with œ written
as the two letters oe) and a Windows-1252 version (with œ as a single
character). Alternatively, a synonymous expression not using that
ligature could be searched for, and if one is found the menu could be
published in Latin1 only. The same applies, if needed, to other
encodings, including CJK.
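
The "lowest common denominator" idea can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not something
Vim provides: the names representable, FALLBACKS and menu_title_for are
hypothetical, and Python's "gb2312" codec is used as a stand-in for
euc-cn.

```python
def representable(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Hypothetical table of spellings that survive in a poorer encoding.
FALLBACKS = {"œ": "oe", "Œ": "OE"}


def menu_title_for(text: str, encoding: str) -> str:
    """Pick the richest spelling of a menu title that encoding allows."""
    if representable(text, encoding):
        return text
    degraded = "".join(FALLBACKS.get(ch, ch) for ch in text)
    if representable(degraded, encoding):
        return degraded
    raise ValueError(f"{text!r} has no spelling valid in {encoding}")


windows_title = menu_title_for("Bœuf", "cp1252")
latin1_title = menu_title_for("Bœuf", "latin-1")
```

With these definitions the Windows-1252 menu keeps the single œ
character, while the Latin1 menu falls back to the two letters oe. The
same representable() check would tell a Chinese menu author whether a
given hanzi that is available in GB18030 also exists in the smaller
euc-cn repertoire.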


Best regards,
Tony.
-- 
"... one of the main causes of the fall of the Roman Empire was that,
lacking zero, they had no way to indicate successful termination of
their C programs."
                -- Robert Firth
