On Tue, Mar 16, 2004 at 10:17:57PM +0100, Karl Brodowsky wrote:
: With FFFE and FEFF this seems obvious. In case of #! it would not be clear
: to me if this defaults to ISO-8859-1 (latin-1) or to utf-8. See HTML
: vs. XHTML as an example where the default has been changed.
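As a rough illustration of the byte-order-mark case mentioned above, here is a minimal Python sketch (not Perl 6, and not something proposed in the thread) that sniffs the leading bytes of a file. The function name detect_bom is hypothetical. A file that starts with #! carries no such signal, so the function returns None and the default has to be decided some other way.

```python
import codecs

def detect_bom(data: bytes):
    """Return a Python codec name if data starts with a known BOM, else None."""
    # UTF-32 is deliberately left out of this sketch; note that a UTF-32-LE
    # BOM (FF FE 00 00) would be reported as UTF-16-LE by the checks below.
    if data.startswith(codecs.BOM_UTF8):        # bytes EF BB BF
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):    # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):    # bytes FE FF
        return "utf-16-be"
    return None                                 # no BOM: caller must pick a default
```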
Perl 6 would certainly
ns that Unicode is not fulfilling its design goals.
Yes, we can consider any file to be Unicode with some encoding. That is
how the Java guys do it, with the restriction that they don't easily let
you choose anything other than latin-1 + \ucafe-stuff for non-latin-1
characters (or maybe I didn
Karl Brodowsky wrote:
Mark J. Reed wrote:
The UTF-8 encoding is not so attractive in locales that make
heavy use of characters which require several bytes to encode therein, or
relatively little use of characters in the ASCII range;
utf-8 is fine for languages like German, Polish, Norwegian, Spanis
On 2004-03-16 at 00:28:32, Karl Brodowsky wrote:
> Mark J. Reed wrote:
>
> >Unicode per se doesn't do anything to file sizes; it's all in how you
> >encode it.
>
> Yes. And basically there are common ways to encode this: utf-8 and utf-16
> (or similar variants requiring >= 2 bytes per character)
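To make the size point concrete, here is a small Python sketch that counts bytes for the same text in UTF-8 and UTF-16. The helper name encoded_sizes and the sample strings are my own, chosen only for illustration; the BOM is left out of the UTF-16 count.

```python
def encoded_sizes(text: str):
    """Return (utf8_bytes, utf16_bytes) for text, not counting any BOM."""
    # "utf-16-le" encodes without prepending a byte-order mark.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

for sample in ("Hello", "Grüße", "αβγ", "日本語"):
    utf8, utf16 = encoded_sizes(sample)
    print(f"{sample!r}: UTF-8 {utf8} bytes, UTF-16 {utf16} bytes")
```

Latin-script text with a few accented letters stays close to one byte per character in UTF-8, the Greek sample breaks even, and the Japanese sample is smaller in UTF-16.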
Another possibility is to use a UTF-8 extended system where you use values over
0x10 to encode temporary code block swaps in the encoding. I.e.,
some magic value means the one byte UTF-8 codes now mean the Greek block
instead of the ASCII block. But you would need broad agreement for that t
At 11:36 PM + 3/15/04, [EMAIL PROTECTED] wrote:
Another possibility is to use a UTF-8 extended system where you use
values over 0x10 to encode temporary code block swaps in the
encoding. I.e.,
some magic value means the one byte UTF-8 codes now mean the Greek block
instead of the ASCII b
At 12:28 AM +0100 3/16/04, Karl Brodowsky wrote:
Anyway, it will be necessary to specify the encoding of Unicode in
some way, which could possibly even allow specifying some
non-Unicode charsets.
While I'll skip diving deeper into the swamp that is character sets
and encoding (I'm already up
Mark J. Reed wrote:
Unicode per se doesn't do anything to file sizes; it's all in how you
encode it.
Yes. And basically there are common ways to encode this: utf-8 and utf-16
(or similar variants requiring >= 2 bytes per character)
The UTF-8 encoding is not so attractive in locales that make
heav
On 2004-03-13 at 09:02:50, Karl Brodowsky wrote:
> For these guys Unicode is not so attractive, because it kind of doubles the
> size of their files,
Unicode per se doesn't do anything to file sizes; it's all in how you
encode it. The UTF-8 encoding is not so attractive in locales that make
heav
And I do think people would rebel at using Latin-1 for that one.
I get enough grief for »...«. :-)
I can imagine that these cause some trouble with people using a charset
other than ISO-8859-1 (Latin-1) that works well with 8 bit, like Greek,
Arabic, Cyrillic and Hebrew.
For these guys Unicode is
10 matches