RE: internationalization assumption

2004-09-30 Thread jarkko.hietaniemi
> Very tantalising! > > What characters needed by French are missing from Latin-1? http://www.cs.tut.fi/~jkorpela/latin9.html >

RE: [hebrew] Re: [BULK] - Re: markup on combining characters

2004-09-14 Thread jarkko.hietaniemi
> > A lot like Tengwar, really. > > > This is hardly surprising, because I am sure that JRR Tolkien > was aware > of how Hebrew script works. Indeed according to one site I found > (http://www.cv81pl.freeserve.co.uk/tolkien.htm) he was a > Hebrew scholar, > but this doesn't strike me as very r

RE: FW: Looking for transcription or transliteration standards latin- >arabic

2004-07-08 Thread jarkko.hietaniemi
Uttered a word and spoke thus, Curtis Clark: > Añd whàt shåll wë câll thë ãddítiõn of dîacrìtícs bÿ spämmêrs, ïñ ân > ättëmpt tò fóòl spåm fîltêrs? http://en.wikipedia.org/wiki/Heavy_metal_umlaut

RE: what combining diacritical mark suits d and l with stroke ?

2004-06-29 Thread jarkko.hietaniemi
> Perhaps it would help if you indicated why you are looking for > versions of these letters using combining diacritical marks. > > If the issue is weighting the letters for sorting, so that > they sort as a type of d or a type of l, then that can be > accomplished by the appropriate tailoring of

RE: LETTER OWL, COMBINING SEAGULL...

2004-06-11 Thread jarkko.hietaniemi
> > What next? http://news.bbc.co.uk/2/hi/uk_news/3775799.stm Why, LETTER COLLIE, COMBINING PANT, of course: http://news.bbc.co.uk/1/hi/sci/tech/3794079.stm >

RE: ZX80 (was: Fixed Width Spaces (was: Printing and Displaying DependentVowels))

2004-04-01 Thread jarkko.hietaniemi
> Gosh, that brings me back. All those characters that were BASIC keywords > compressed into one octet. How could we have neglected to encode such important > legacy characters, this unnecessarily complicates round-trip conversion between > ZX80s and Unicode. Indeed. http://www.howell1964.freeserv

RE: vertical direction control

2004-03-24 Thread jarkko.hietaniemi
> Which scripts are written bottom to top in vertical layout? The Unicode bidi faq tells that ancient Numidian was written bottom to top, and the Egyptian hieroglyphics could go basically in any direction. Then again, it also says that developers shouldn't really worry overmuch about these. > I

RE: Python and Unicode (was Re: Three new Technical Notes posted)

2004-01-26 Thread jarkko.hietaniemi
> On Sun, Jan 25, 2004 at 03:10:02PM +0900, > Jungshik Shin <[EMAIL PROTECTED]> wrote > a message of 20 lines which said: > > > I was a bit surprised to find that Python was listed as > using UTF-16 > > (for the internal representation) > > Indeed, this is wrong, it is a compilation option.

[slightly OT] representing Arabic (in ASCII) in IM

2003-12-30 Thread jarkko.hietaniemi
Not Unicode, but... http://www.ascusc.org/jcmc/vol9/issue1/palfreyman.html

RE: American English translation of character names

2003-12-19 Thread jarkko.hietaniemi
> particularly the 1930s, 40s, 50s, and 60s sections, and > follow the many > links from each entry. In particular, you can see the basic > character set > of the IBM 360 (as generated by the IBM 29 Card Punch) here: > > http://www.columbia.edu/acis/history/029.html > > (scroll down a bit af

RE: [OT] CJK -> CJC (Re: Corea?)

2003-12-17 Thread jarkko.hietaniemi
>Or even "Aix-la-Chapelle" to "Aachen" because that's its _current_ German name (the >French name was official in the history, and is still used in French). > You better tell the Bundespost about this :-) AFAIK (not being a German) Aachen is very much the current German name. (go

RE: meteorological symbols

2003-12-04 Thread jarkko.hietaniemi
I guess the Official Standard documents are property of WMO (=please open your wallet for a copy of the standard)... but one can find things like this online: http://adds.aviationweather.noaa.gov/metars/wxSymbols_anno1.pdf http://www.met.fsu.edu/Classes/Met1010L/wxsymtbl.html http://www.met.fsu.edu

RE: UTF-16 inside UTF-8

2003-12-03 Thread jarkko.hietaniemi
> We're not speaking about the same thing: I was not discussing the > representation of individual characters (yes it's simple to make > wchar_t 32-bit with UCS4), but the encoding of large amounts of > strings for general text processing. That's where UTF-16 is be

RE: Hexadecimal digits?

2003-11-10 Thread jarkko.hietaniemi
> >Well, obviously I support this totally, since I suggested the same > >thing myself on this list earlier this year (see > >http://groups.yahoo.com/group/unicode/message/20789). > > > >I am 100% in favor of adding hex digits to Unicode. I speak as a > >programmer, and as a designer of software

[OT] RE: GDP by language

2003-10-23 Thread jarkko.hietaniemi
> no countries as far as I know using Arabic script but not Arabic, Persian > or Urdu as official languages (except perhaps Pashto in Afghanistan). Equating countries and languages is fraught with danger... Currently: Hausa, Kashmiri, Kurdish (written in Latin, Cyrillic, and Arabic), Sindhi. In

RE: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-22 Thread jarkko.hietaniemi
> > > all of the CR LF CRLF LFCR should mark an "end of line".) > > All of CR, LF, , NEL, LS, PS, and EOF(!). (Assuming that the I was still staying within the ASCII and \r \n discussion, but yes, if one goes Latin 1 / Unicode the NEL and LS PS (why not FF, then?), and of course EOF. > encoding

RE: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-22 Thread jarkko.hietaniemi
> So this legacy encoding of end-of-lines is now quite obsolete > even on MacOS. I don't think it can be called "obsolete" as long as files generated using that line end convention exist. Or, at least, applications that have an operation for "read a line" will have to cope with it. (In other wo
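
A "read a line" operation that copes with the conventions named in this thread (CR, LF, CRLF, NEL, LS, PS) might be sketched as below. This is only an illustration, not anything posted to the list: the name `split_lines` is hypothetical, and LFCR is deliberately not treated as a single terminator, so it would count as two line ends here.

```python
import re

# CRLF is listed first so that it is consumed as one terminator
# instead of being counted as a CR followed by an LF.
# \u0085 is NEL, \u2028 is LS, \u2029 is PS.
LINE_END = re.compile("\r\n|[\r\n\u0085\u2028\u2029]")


def split_lines(text: str) -> list[str]:
    """Split text on any of the line-end conventions above."""
    lines = LINE_END.split(text)
    # A terminator at the very end closes the last line; it does not
    # start a new, empty one.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


if __name__ == "__main__":
    sample = "old Mac\rUnix\nDOS\r\nNEL\u0085LS\u2028PS\u2029last"
    for line in split_lines(sample):
        print(repr(line))
```

Python's built-in `str.splitlines()` recognizes the same terminators plus a few more (VT, FF, and the FS/GS/RS separators), so the explicit pattern is mainly useful when the accepted set has to be controlled.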

RE: Damn'd fools

2003-07-28 Thread jarkko.hietaniemi
> Everyone who sounds French, because they speak French, is not French. > Ask any French speaking Canadian or Swiss, or any Swedish > speaking Finn. If a duke living in (arguably) French territory (he was a vassal of the king of France) and speaking (arguably) French crosses the Channel and gets

RE: Damn'd fools

2003-07-28 Thread jarkko.hietaniemi
> Thanks for the corrections -- see I told you :-) When was the next meeting of Pedants Anonymous again? :-) > > England was never ruled by the French! Please! I dunno, William the Conqueror, Duke of Normandy, sounds pretty French to me :-) (Of course it's a good question when do 'France' and 'Fr

RE: French group separators

2003-07-08 Thread jarkko.hietaniemi
> > Don't call me Mr. Roberts is my name. > > > > Don't call me Mr. Roberts is my name. > > In European English Mr is generally not followed by a full stop, > because the abbreviation contains the first and last letter of the > word. (In Finland that would be M:r.) Ummm...? No. Abbreviat

RE: Letterforms based on p

2003-06-09 Thread jarkko.hietaniemi
> - what is the symbol used for [most lists are silent on this question] Yes, I can understand that having to browse through the whole of OED to find out where e.g. the hypolemniscus is being used sounds a little bit, uhm, tedious... > - what is the symbol not used for [a subtle but importantly d

RE: Letterforms based on p

2003-06-09 Thread jarkko.hietaniemi
> It also appears along with other symbols used in the OED at > http://dictionary.oed.com/public/help/Advanced/symbols.htm#mod1letter. > (Again, not all these symbols are currently part of Unicode.) To state the obvious (and "random email does not official character proposals make") Unicode havi

RE: Not snazzy (was: New Unicode Savvy Logo)

2003-05-30 Thread jarkko.hietaniemi
> I wonder how a character standardizer would like it if a bunch of > graphic artists criticized her character encoding. ☺ A professional of any kind will listen to critique.

RE: Not snazzy (was: New Unicode Savvy Logo)

2003-05-30 Thread jarkko.hietaniemi
> 2. It is unlikely that the Unicode *logo* itself (i.e. the thing at > http://www.unicode.org/webscripts/logo60s2.gif) will be incorporated > directly in any image that people are allowed to put on their > websites, > because to put the Unicode logo on a product or whatever requires a > licens

RE: IPA Null Consonant

2003-05-30 Thread jarkko.hietaniemi
> > > - Ø [LATIN CAPITAL LETTER O WITH STROKE] and ø [LATIN > > SMALL LETTER O > > > WITH STROKE] are both ruled out as their semantics is > > totally wrong. > > Not at all (as seen by example Jarkko quoted!). In Danish > and Norwegian, > yes. But in Swedish and Finnish that vowel is wri

RE: Not snazzy (was: New Unicode Savvy Logo)

2003-05-28 Thread jarkko.hietaniemi
> A logo with a yellow or light blue or pale green background > would be more appealing on various bright backgrounds. I also > think that the grey logo is too dark and difficult to read, > and the pink logo is quite strange. > > The red of the checkmark should contrast more by using > asaturat

RE: IPA Null Consonant

2003-05-28 Thread jarkko.hietaniemi
> [EMAIL PROTECTED] wrote: > > > Well, in truth we did write { instead, but that is another > sad story. > > Is that ISO 646-SE? > > Stefan ISO 646-FI, which was identical to the -SE.

RE: New contribution

2003-04-02 Thread jarkko.hietaniemi
> Not necessarily, but given the three jokers who submitted it, > it might be a fair assumption. *backspaces to heart-dot all the i's* > > --Ken > > P.S. For next February 14, I am looking forward to the proposal > for the COMBINING PENETRATING NORTH EAST ARROW OVERLAY (aka > Cupid's arrow), so t

Zhongwen

2003-03-21 Thread jarkko.hietaniemi
This might be of interest for people interested in Chinese characters: http://zhongwen.com/ Makes heavy use of image maps (not Unicode, I am afraid :-) and frames, but to rather cool results: http://zhongwen.com/dao.htm (Disclaimer: the people-press link in the Dao De Jing page has little to do with eit

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread jarkko.hietaniemi
> One can get daily news in Latin, too: http://www.yle.fi/fbc/latini/ Correction: a weekly review.

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread jarkko.hietaniemi
> The same people consider Latin a dead language, suitable only for > study of ancient documents, which is clearly not the view taken > at the Vatican, which continues to produce new documents in that language. > In recent encyclicals, however, at least as published at www.vatican.va, > the æ an

RE: Impossible combinations?

2003-03-03 Thread jarkko.hietaniemi
> I'm working on a Latin-based font that's got a large number > of kerning pairs already defined and I'm trying to pare this > list of pairs down to the bare minimum. There seem to be many > pairs which are unlikely ever to be used. These pairs all involve > a lowercase on the left with an upperc

RE: traditional vs simplified chinese

2003-02-14 Thread jarkko.hietaniemi
> > I know little about Chinese, but I have the impression that it is much more > > common for several traditional characters to correspond to one simplified > > character than vice versa. If that's true, it seems to me that it would make > > most sense to fold to simplified. > > Hmmm ... Suppose I

Medieval Unicode Font Initiative

2002-11-18 Thread jarkko.hietaniemi
I happened upon this: http://www.hit.uib.no/mufi/ Allocating PUA ranges, precomposed characters, ligatures, oh my. But off-hand I assume they are headed in the right direction: > Unicode proposal in planning > > People working in the Unicode community have encouraged us to present an off

RE: Character identities

2002-10-29 Thread jarkko.hietaniemi
> Unicode captures the ice-age during the global warming era! > > Do we have codepoints for images found on the walls of caves? > > :) CRO-MAGNON PAINTING HUMAN SPEARING A MAMMOTH CRO-MAGNON PAINTING MAMMOTH STOMPING A HUMAN ...

RE: [ANN] World Address Project starts and relies on Unicode heavily

2002-10-07 Thread jarkko.hietaniemi
> A welcome initiative! Indeed. > I especially hope that your FAQ, when it will be > ready, will contain useful suggestions. Googling for "international address formats" brings up some nice starting points: http://www.bitboost.com/ref/international-address-formats.html http://www.bitboost.com

RE: glyph selection for Unicode in browsers

2002-10-01 Thread jarkko.hietaniemi
> Sniffing isn't a good idea in the long term. It may work > for simple web page serving, but as soon as you go XML and > start to move data around without the user having a chance > to see it frequently, you'll end up with a big mess. > > Also, 'guessing' is very ill-defined. You might serve >

RE: glyph selection for Unicode in browsers

2002-09-25 Thread jarkko.hietaniemi
> I cannot help the wrong result. (I guess some browsers might do better > work at sniffing the content of the page, but at least IE6 and Opera 6.05 > on Win32 seem to believe the server rather than the (HTML of the) page. After some experimentation it seems that I blamed Opera 6.05/Win32 wrongl

RE: glyph selection for Unicode in browsers

2002-09-25 Thread jarkko.hietaniemi
*sigh* Time for me to call it a day and go home, it seems. Opera 6.05/Win32 does *not* get it right if you have it on View -> Encoding -> Automatic detection. Why I was fooled in the below message was that the Encoding setting seems to stick even if I exit and restart Opera, that's why my tes

RE: glyph selection for Unicode in browsers

2002-09-25 Thread jarkko.hietaniemi
> You would be happy, but others might not- the standard specifically says > that the http charset takes precedence. > http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2 Yup. I guess I could argue both ways. The server admins want control; the users want control, the latter lose :-) > Howeve
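
The precedence described above (the HTTP charset wins over the page's own meta declaration) can be sketched roughly as follows. This is a simplified illustration, not how any particular browser does it: the names `charset_from_content_type` and `pick_encoding` are hypothetical, the meta sniffing is a crude regular expression instead of a real HTML parser, and the ISO-8859-1 fallback is an assumption made for the example.

```python
import re

# Crude sniffing of <meta ... charset=...>; an assumption for this sketch,
# real user agents use a proper prescan of the document.
META_CHARSET = re.compile(
    rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9._:-]+)""", re.IGNORECASE
)


def charset_from_content_type(header):
    """Return the charset parameter of a Content-Type header, or None."""
    if not header:
        return None
    for param in header.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip("\"'")
    return None


def pick_encoding(content_type, body, default="ISO-8859-1"):
    """HTTP header first, then the meta declaration, then the default."""
    from_header = charset_from_content_type(content_type)
    if from_header:
        return from_header
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii")
    return default


if __name__ == "__main__":
    page = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    # The server's claim overrides the page's own declaration.
    print(pick_encoding("text/html; charset=ISO-8859-1", page))
    # Without a charset in the header, the meta declaration is used.
    print(pick_encoding("text/html", page))
```

With this ordering, a UTF-8 page served with an ISO 8859-1 header is decoded as ISO 8859-1, which is the behaviour the thread complains about.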

RE: glyph selection for Unicode in browsers

2002-09-25 Thread jarkko.hietaniemi
I would be happy if just this would be enough to convince the browsers that the page is in UTF-8... It isn't if the HTTP server claims that the pages it serves are in ISO 8859-1. A sample of this is http://www.iki.fi/jhi/jp_utf8.html, it does have the meta charset, but since the webserver (www

Old Hungarian?

2002-09-25 Thread jarkko.hietaniemi
> You can look at http://www.unicode.org/unicode/alloc/Pipeline.html to see > what's in the pipeline, but note that code points are not yet definite. > There will be a beta period, beginning in January I believe. Whatever happened to Old Hungarian, aka Hungarian Runic, aka rovasiras? (sorry for

RE: Latin vowels?

2002-09-09 Thread jarkko.hietaniemi
> glides, and GOK what else. Of course, Mark was only differentiating > between "vowels" and "non-vowels," but that may not make things much > easier; I still wouldn't know where to put English "y". Off-hand, it seems that in English "y" mostly* is [j] if in initial position, otherwise it's eith

RE: Romanized Cyrillic bibliographic data--viable fonts?)

2002-08-30 Thread jarkko.hietaniemi
> Kudos do not pay the rent. And altruism can run out when the rent > needs to be paid ;-) Very true. But you make the hasty assumption that font designing is the activity creating the money for paying the rent.

RE: Romanized Cyrillic bibliographic data--viable fonts?)

2002-08-30 Thread jarkko.hietaniemi
> And who pays the poor font designer for his work? U+0041 U+006C U+0074 U+0072 U+0075 U+0069 U+0073 U+006D U+0020 U+006F U+0072 U+0020 U+006B U+0075 U+0064 U+006F U+0073 U+002C U+0020 U+006D U+0061 U+0079 U+0062 U+0065 U+003F
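
For readers who would rather not decode the reply by hand, a minimal sketch follows. The helper name `decode_code_points` is hypothetical; it simply maps each U+XXXX token to its character.

```python
def decode_code_points(text: str) -> str:
    """Turn a whitespace-separated run of U+XXXX tokens into a string."""
    chars = []
    for token in text.split():
        if not token.upper().startswith("U+"):
            raise ValueError(f"not a U+XXXX token: {token!r}")
        # The part after "U+" is the code point in hexadecimal.
        chars.append(chr(int(token[2:], 16)))
    return "".join(chars)


REPLY = (
    "U+0041 U+006C U+0074 U+0072 U+0075 U+0069 U+0073 U+006D U+0020 "
    "U+006F U+0072 U+0020 U+006B U+0075 U+0064 U+006F U+0073 U+002C "
    "U+0020 U+006D U+0061 U+0079 U+0062 U+0065 U+003F"
)

if __name__ == "__main__":
    print(decode_code_points(REPLY))  # prints: Altruism or kudos, maybe?
```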

RE: FW: New version of TR29:

2002-08-20 Thread jarkko.hietaniemi
As another datapoint, the following details the use of the apostrophe in Finnish: http://www.cs.tut.fi/~jkorpela/kielikello/merkit.html#heittomerkki It's in Finnish :-) so allow me to summarize: (1) if consonant gradation (the change or even elision of consonants) would cause two same vowels belong

RE: FW: New version of TR29:

2002-08-20 Thread jarkko.hietaniemi
> Then there are two uses of apostrophes in quoting: within secondary quotation > marks Urk. I meant "within quotation marks as secondary quotation marks".