Re: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-23 Thread Philippe Verdy
> Of course, indeed I just said that! If it were true then that would imply > that '\x' == '\u' making the \u and \U escapes rather pointless. That's not pointless: - '\x' is interpreted by C compilers as '\xNN' and two uppercase letters N, where '\xNN' is compiled according to the so

Re: Web Form: Other Question, Problem, or Feedback

2003-10-23 Thread Chris Jacobs
What is "japanese signs"? If it is JIS or Shift JIS then import it in SC UniPad, use the File Import function, not the File Open function. (if it is some flavor of Unicode do use the File Open function) In UniPad then use Edit Select all Edit Convert. If it is handwritten calligraphy with Indian

Re: Encoding for Fun (was Line Separator)

2003-10-23 Thread Doug Ewell
Arcane Jill wrote: > I'm going to risk the wrath of the group because I hereby place this > in the public domain. Now you can't patent it! :-) But I can implement it. > The only invented encoding which got any real use was the following > (currently nameless) one: Jill referred to it later in

Re: [OT] RE: GDP by language

2003-10-23 Thread Mark Davis
I want to caution people that the chart should *not* be taken as an exact guide. The percentage of language speakers within a country, and the percent of GDP ascribable to those language speakers are all pretty fuzzy. In addition, I had excluded countries that were at or below 0.05% of world GDP, j

Re: FW: Web Form: Other Question, Problem, or Feedback

2003-10-23 Thread Markus Scherer
If this is in C/C++ and your text is in Unicode, and you convert to a legacy (non-Unicode) codepage, then you could use the ICU conversion API. It has an option to turn non-mappable characters into numeric character references for HTML/XML. Please see http://oss.software.ibm.com/icu/userguide/co
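A minimal sketch of the idea described above: when converting Unicode text to a legacy codepage, replace non-mappable characters with numeric character references. It does not use the ICU conversion API the message refers to; it assumes Python's built-in "xmlcharrefreplace" codec error handler as a stand-in, and the function name and the default codepage are illustrative only.

```python
def to_legacy_with_ncr(text, codepage="latin-1"):
    """Encode text to a legacy codepage (hypothetical helper).

    Characters the codepage cannot represent are emitted as decimal
    numeric character references (&#NNNN;) instead of raising an error.
    """
    return text.encode(codepage, errors="xmlcharrefreplace")


if __name__ == "__main__":
    # U+00E9 exists in Latin-1 and is kept as one byte;
    # U+20AC (euro sign) does not and becomes a character reference.
    print(to_legacy_with_ncr("\u00e9 \u20ac"))
```

The output is only meaningful where an HTML or XML parser will later expand the references; in any other context the references remain literal text.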

Re: unicode on Linux

2003-10-23 Thread Markus Scherer
Stefan Persson wrote: Stephane Bortzmeyer wrote: I do not agree. It would mean *each* application has to normalize because it cannot rely on the kernel. It has huge security implications (two file names with the same name in NFC, so visually impossible to distinguish, but two different strings of c
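The concern quoted above (two file names that are identical in NFC but are different strings of code points) can be illustrated with a short sketch. It assumes Python's standard unicodedata module; the file names and the helper name are made up for the example and do not come from the thread.

```python
import unicodedata

# Two spellings of the same visible name (illustrative values):
composed = "caf\u00e9.txt"     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301.txt"  # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_after_nfc(a, b):
    """Return True if the two strings are equal once both are normalized to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(composed == decomposed)                # the raw code point strings differ
    print(same_after_nfc(composed, decomposed))  # the NFC forms compare equal
```

A file system that stores names as uninterpreted strings would treat these as two distinct names, which is the situation the message describes.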

Re: Abkhaz letters

2003-10-23 Thread Peter Kirk
On 23/10/2003 12:25, Michael Everson wrote: This is extremely interesting (especially that KU) and I will look into it when I get home from the States in early November. At 18:14 +0100 2003-10-22, Anto'nio Martins-Tuva'lkin wrote: I was asked (or rather challenged) to transcribe into a web page

Re: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-23 Thread John Cowan
[EMAIL PROTECTED] scripsit: > But if ('\n'=='\u000A') should always be true, because ISO 14882 defines \n as > LF and defines \u as "that character whose short name in ISO/IEC 10646 is > " and the character whose short name in ISO/IEC 10646 is A is > LF. It's not clear to m

Re: [OT] RE: GDP by language

2003-10-23 Thread Peter Kirk
On 23/10/2003 01:24, [EMAIL PROTECTED] wrote: no countries as far as I know using Arabic script but not Arabic, Persian or Urdu as official languages (except perhaps Pashto in Afghanistan). Equating countries and languages is fraught with danger... Currently: Hausa, Kashmiri, Kurdish (writte

Re: GDP by language

2003-10-23 Thread Peter Kirk
On 23/10/2003 02:27, Marco Cimarosti wrote: ... But there certainly is a correlation between GDP and what people can buy, including software. But by no means a simple one, for software. This doesn't really apply at all to open source software and other freeware. And it doesn't always apply t

Re: unicode on Linux

2003-10-23 Thread Stefan Persson
Stephane Bortzmeyer wrote: I do not agree. It would mean *each* application has to normalize because it cannot rely on the kernel. It has huge security implications (two file names with the same name in NFC, so visually impossible to distinguish, but two different strings of code points). Couldn't

Re: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-23 Thread jon
> From: <[EMAIL PROTECTED]> > > However because the universal-character-name escapes (\u and > \U) > > are defined relative to a particular encoding, namely ISO 10646, it would > be an > > error if ('\n' != '\u000A' || '\r' != '\u000D'). Whether this is > implemented by > > using the va

RE: Abkhaz letters

2003-10-23 Thread Michael Everson
At 12:53 -0700 2003-10-23, Peter Constable wrote: > - Reversed sigma -- is it a variant of U+01B7 : LATIN CAPITAL LETTER EZH? This typeform is used in Ghana, and for that context I concluded that it is a variant of 01B7. But that is Latin text, not Cyrillic. U+04E0 -- Michael Everson * * Evers

RE: GDP by language

2003-10-23 Thread Marco Cimarosti
Mark Davis wrote: > Marco, I certainly wouldn't draw that conclusion. This is not > the appropriate forum for a political or ethical discussion, Of course. I just noticed that those numbers reflect a sad fact of life: that rich people get more than poor people. As this fact is so obvious to anyon

RE: Abkhaz letters

2003-10-23 Thread Peter Constable
> -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On > Behalf Of Anto'nio Martins-Tuva'lkin > My doubts refer to: > > - Reversed sigma -- is it a variant of U+01B7 : LATIN CAPITAL LETTER > EZH? This typeform is used in Ghana, and for that context I concluded tha

FW: Web Form: Other Question, Problem, or Feedback

2003-10-23 Thread Magda Danish (Unicode)
Mr. Nikolai, I am forwarding your email to the unicode mailing list http://www.unicode.org/consortium/distlist.html for a possible answer from one of the list subscribers. Regards, Magda Danish Administrative Director The Unicode Consortium 650-693-3921 > -Original Message- > Date/Ti

Re: Abkhaz letters

2003-10-23 Thread Michael Everson
This is extremely interesting (especially that KU) and I will look into it when I get home from the States in early November. At 18:14 +0100 2003-10-22, Anto'nio Martins-Tuva'lkin wrote: I was asked (or rather challenged) to transcribe into a web page the writings on the emblem of the Abkhazian

Re[2]: GDP by language

2003-10-23 Thread Alexander Savenkov
Hello everyone, 2003-10-22T21:53:44Z Peter Kirk <[EMAIL PROTECTED]> wrote: ...snip... > The data doesn't support addition to this degree of accuracy because of > the effect of the "others" area. Cyrillic may even overtake Arabic, > because there are several countries using the Cyrillic alphabet

Re: unicode on Linux

2003-10-23 Thread Owen Taylor
On Thu, 2003-10-23 at 04:54, Stephane Bortzmeyer wrote: > On Tue, Oct 21, 2003 at 09:56:16AM -0700, > Peter Kirk <[EMAIL PROTECTED]> wrote > a message of 22 lines which said: > > > In this page, Markus Kuhn is damaging his credibility by continuing to > > refer in several places to Unicode 3.0

RE: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-23 Thread Jill Ramonsky
Are we completely sure about this? I mean - maybe the confusion is about what constitutes a "text file", not about what constitutes a line break. I would argue that a complete, valid, text file, must contain an integral number of lines. However, were I to take a text file, and split it into ten

Re: unicode on Linux

2003-10-23 Thread Stephane Bortzmeyer
On Tue, Oct 21, 2003 at 11:32:28AM -0400, Edward H. Trager <[EMAIL PROTECTED]> wrote a message of 118 lines which said: > I think there can be big debates about whether a Linux (or any *nix > kernel, for that matter) has any business normalizing file names. > Personally I think Unicode normaliz

[OT] RE: GDP by language

2003-10-23 Thread jarkko.hietaniemi
> no countries as far as I know using Arabic script but not Arabic, Persian > or Urdu as official languages (except perhaps Pashto in Afghanistan). Equating countries and languages is fraught with danger... Currently: Hausa, Kashmiri, Kurdish (written in Latin, Cyrillic, and Arabic), Sindhi. In

Abkhaz letters

2003-10-23 Thread Anto'nio Martins-Tuva'lkin
I was asked (or rather challenged) to transcribe into a web page the writings on the emblem of the Abkhazian ASSR approved in 1925.04.01, which were set using the 1909-1926 orthography. I put it at < http://www.tuvalkin.web.pt/unicode/su)geab.jpg > and started working on it. Minutes later I ended u

Re: unicode on Linux

2003-10-23 Thread Stephane Bortzmeyer
On Tue, Oct 21, 2003 at 09:56:16AM -0700, Peter Kirk <[EMAIL PROTECTED]> wrote a message of 22 lines which said: > In this page, Markus Kuhn is damaging his credibility by continuing to > refer in several places to Unicode 3.0, although the page was updated > some time after the release of Un