> Very tantalising!
>
> What characters needed by French are missing from Latin-1?
http://www.cs.tut.fi/~jkorpela/latin9.html
>
> > A lot like Tengwar, really.
> >
> This is hardly surprising, because I am sure that JRR Tolkien was aware
> of how Hebrew script works. Indeed according to one site I found
> (http://www.cv81pl.freeserve.co.uk/tolkien.htm) he was a Hebrew scholar,
> but this doesn't strike me as very r
Uttered a word, spoke thus, Curtis Clark:
> Añd whàt shåll wë câll thë ãddítiõn of dîacrìtícs bÿ spämmêrs, ïñ ân
> ättëmpt tò fóòl spåm fîltêrs?
http://en.wikipedia.org/wiki/Heavy_metal_umlaut
> Perhaps it would help if you indicated why you are looking for
> versions of these letters using combining diacritical marks.
>
> If the issue is weighting the letters for sorting, so that
> they sort as a type of d or a type of l, then that can be
> accomplished by the appropriate tailoring of
>
> What next? http://news.bbc.co.uk/2/hi/uk_news/3775799.stm
Why, LETTER COLLIE, COMBINING PANT, of course:
http://news.bbc.co.uk/1/hi/sci/tech/3794079.stm
>
> Gosh, that brings me back. All those characters that were BASIC keywords
> compressed into one octet. How could we have neglected to encode such important
> legacy characters, this unnecessarily complicates round-trip conversion between
> ZX80s and Unicode.
Indeed. http://www.howell1964.freeserv
> Which scripts are written bottom to top in vertical layout?
The Unicode bidi FAQ says that ancient Numidian was written bottom to top,
and that Egyptian hieroglyphs could go in basically any direction. Then again,
it also says that developers shouldn't really worry overmuch about these.
> I
> On Sun, Jan 25, 2004 at 03:10:02PM +0900,
> Jungshik Shin <[EMAIL PROTECTED]> wrote
> a message of 20 lines which said:
>
> > I was a bit surprised to find that Python was listed as using UTF-16
> > (for the internal representation)
>
> Indeed, this is wrong, it is a compilation option.
Not Unicode, but... http://www.ascusc.org/jcmc/vol9/issue1/palfreyman.html
> particularly the 1930s, 40s, 50s, and 60s sections, and follow the many
> links from each entry. In particular, you can see the basic character set
> of the IBM 360 (as generated by the IBM 29 Card Punch) here:
>
> http://www.columbia.edu/acis/history/029.html
>
> (scroll down a bit af
>Or even "Aix-la-Chapelle" to "Aachen" because that's its _current_ German name (the
>French name was official in the history, and is still used in French).
>
You better tell the Bundespost about this :-) AFAIK (not being a German)
Aachen is very much the current German name.
(go
I guess the Official Standard documents are property of WMO
(=please open your wallet for a copy of the standard)...
but one can find things like this online:
http://adds.aviationweather.noaa.gov/metars/wxSymbols_anno1.pdf
http://www.met.fsu.edu/Classes/Met1010L/wxsymtbl.html
http://www.met.fsu.edu
> We're not speaking about the same thing: I was not discussing the
> representation of individual characters (yes it's simple to make
> wchar_t 32-bit with UCS4), but the encoding of large amounts of
> strings for general text processing. That's where UTF-16 is be
> >Well, obviously I support this totally, since I suggested the same
> >thing myself on this list earlier this year (see
> >http://groups.yahoo.com/group/unicode/message/20789).
> >
> >I am 100% in favor of adding hex digits to Unicode. I speak as a
> >programmer, and as a designer of software
> no countries as far as I know using Arabic script but not Arabic, Persian
> or Urdu as official languages (except perhaps Pashto in Afghanistan).
Equating countries and languages is fraught with danger...
Currently: Hausa, Kashmiri, Kurdish (written in Latin, Cyrillic, and Arabic), Sindhi.
In
>
> > all of the CR LF CRLF LFCR should mark an "end of line".)
>
> All of CR, LF, , NEL, LS, PS, and EOF(!). (Assuming that the
I was still staying within the ASCII and \r \n discussion, but yes,
if one goes Latin 1 / Unicode the NEL and LS PS (why not FF, then?),
and of course EOF.
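
For what it's worth, here is a minimal sketch of a "split into lines" operation that copes with all of those conventions at once. It leans on Python's built-in str.splitlines(), which treats CR, LF, CRLF, NEL, LS and PS (and, as it happens, FF) as line boundaries; the split_lines name and the sample string are only illustrations, not anything from this thread.

```python
def split_lines(text):
    """Split text on every line-end convention str.splitlines() knows about."""
    # Covers \n, \r, \r\n, NEL (U+0085), LS (U+2028), PS (U+2029), FF and a few more.
    return text.splitlines()


# One string mixing Unix, old Mac, DOS, NEL, LS and PS line ends.
sample = "unix\nold mac\rdos\r\nnel\x85ls\u2028ps\u2029last"
print(split_lines(sample))
```

Note that LFCR is not recognised as a single line end here: "\n\r" counts as two boundaries and yields an empty line in between.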
> encoding
> So this legacy encoding of end-of-lines is now quite obsolete
> even on MacOS.
I don't think it can be called "obsolete" as long as files generated using
that line end convention exist. Or, at least, applications that have an
operation for "read a line" will have to cope with it. (In other wo
> Everyone who sounds French, because they speak French, is not French.
> Ask any French speaking Canadian or Swiss, or any Swedish
> speaking Finn.
If a duke living in (arguably) French territory (he was a vassal of the king of France)
and speaking (arguably) French crosses the Channel and gets
> Thanks for the corrections -- see I told you :-)
When was the next meeting of Pedants Anonymous again? :-)
> > England was never ruled by the French! Please!
I dunno, William the Conqueror, the Duke of Normandy, sounds pretty French to me :-)
(Of course it's a good question when do 'France' and 'Fr
> > Don't call me Mr. Roberts is my name.
> >
> > Don't call me Mr. Roberts is my name.
>
> In European English Mr is generally not followed by a full stop,
> because the abbreviation contains the first and last letter of the
> word. (In Finland that would be M:r.)
Ummm...? No.
Abbreviat
> - what is the symbol used for [most lists are silent on this question]
Yes, I can understand that having to browse through the whole of OED to find out
where e.g. the hypolemniscus is being used sounds a little bit, uhm, tedious...
> - what is the symbol not used for [a subtle but importantly d
> It also appears along with other symbols used in the OED at
> http://dictionary.oed.com/public/help/Advanced/symbols.htm#mod1letter.
> (Again, not all these symbols are currently part of Unicode.)
To state the obvious (and "random email does not official character proposals make")
Unicode havi
> I wonder how a character standardizer would like it if a bunch of
> graphic artists criticized her character encoding. ☺
A professional of any kind will listen to critique.
> 2. It is unlikely that the Unicode *logo* itself (i.e. the thing at
> http://www.unicode.org/webscripts/logo60s2.gif) will be incorporated
> directly in any image that people are allowed to put on their
> websites,
> because to put the Unicode logo on a product or whatever requires a
> licens
> > > - Ø [LATIN CAPITAL LETTER O WITH STROKE] and ø [LATIN SMALL LETTER O
> > > WITH STROKE] are both ruled out as their semantics is totally wrong.
>
> Not at all (as seen in the example Jarkko quoted!). In Danish and Norwegian,
> yes. But in Swedish and Finnish that vowel is wri
> A logo with a yellow or light blue or pale green background
> would be more appealing on various bright backgrounds. I also
> think that the grey logo is too dark and difficult to read,
> and the pink logo is quite strange.
>
> The red of the checkmark should contrast more by using
> asaturat
> [EMAIL PROTECTED] wrote:
>
> > Well, in truth we did write { instead, but that is another sad story.
>
> Is that ISO 646-SE?
>
> Stefan
ISO 646-FI, which was identical to the -SE.
> Not necessarily, but given the three jokers who submitted it,
> it might be a fair assumption. *backspaces to heart-dot all the i's*
>
> --Ken
>
> P.S. For next February 14, I am looking forward to the proposal
> for the COMBINING PENETRATING NORTH EAST ARROW OVERLAY (aka
> Cupid's arrow), so t
This might be of interest to people interested in Chinese characters:
http://zhongwen.com/
Uses image maps heavily (not Unicode, I am afraid :-) and frames,
but to rather cool results:
http://zhongwen.com/dao.htm
(Disclaimer: the people-press link in the Dao De Jing page has little to do with eit
> One can get daily news in Latin, too: http://www.yle.fi/fbc/latini/
Correction: a weekly review.
> The same people consider Latin a dead language, suitable only for
> study of ancient documents, which is clearly not the view taken
> at the Vatican, which continues to produce new documents in that language.
> In recent encyclicals, however, at least as published at www.vatican.va,
> the æ an
> I'm working on a Latin-based font that's got a large number
> of kerning pairs already defined and I'm trying to pare this
> list of pairs down to the bare minimum. There seem to be many
> pairs which are unlikely ever to be used. These pairs all involve
> a lowercase on the left with an upperc
> > I know little about Chinese, but I have the impression that it is much more
> > common for several traditional characters to correspond to one simplified
> > character than vice versa. If that's true, it seems to me that it would make
> > most sense to fold to simplified.
>
> Hmmm ... Suppose I
I happened upon this:
http://www.hit.uib.no/mufi/
Allocating PUA ranges, precomposed characters, ligatures, oh my.
But off-hand I assume they are headed in the right direction:
> Unicode proposal in planning
>
> People working in the Unicode community have encouraged us to present an off
> Unicode captures the ice-age during the global warming era!
>
> Do we have codepoints for images found on the walls of caves?
>
> :)
CRO-MAGNON PAINTING HUMAN SPEARING A MAMMOTH
CRO-MAGNON PAINTING MAMMOTH STOMPING A HUMAN
...
> A welcome initiative!
Indeed.
> I especially hope that your FAQ, when it will be
> ready, will contain useful suggestions.
Googling for "international address formats" brings up some nice starting points:
http://www.bitboost.com/ref/international-address-formats.html
http://www.bitboost.com
> Sniffing isn't a good idea in the long term. It may work
> for simple web page serving, but as soon as you go XML and
> start to move data around without the user having a chance
> to see it frequently, you'll end up with a big mess.
>
> Also, 'guessing' is very ill-defined. You might serve
>
> I cannot help the wrong result. (I guess some browsers might do better
> work at sniffing the content of the page, but at least IE6 and Opera 6.05
> on Win32 seem to believe the server rather than the (HTML of the) page.)
After some experimentation it seems that I blamed Opera 6.05/Win32 wrongl
*sigh* Time for me to call it a day and go home, it seems. Opera 6.05/Win32
does *not* get it right if you have it on View -> Encoding -> Automatic detection.
What fooled me in the message below was that the Encoding setting seems to
stick even if I exit and restart Opera, that's why my tes
> You would be happy, but others might not- the standard specifically says
> that the http charset takes precedence.
> http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2
Yup. I guess I could argue both ways. The server admins want control;
the users want control, the latter lose :-)
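
To make the precedence concrete, here is a rough sketch of how a client might pick the charset: take the charset parameter of the HTTP Content-Type header if there is one, and fall back to the meta declaration only otherwise. The function names (http_charset, meta_charset, effective_charset), the regular expression and the 1024-byte prescan are my own assumptions for illustration; no real browser is claimed to work exactly this way.

```python
import re
from email.message import Message

# Naive pattern for a charset declared inside a <meta> tag (illustrative only).
META_RE = re.compile(rb"""<meta[^>]+charset\s*=\s*["']?([A-Za-z0-9._:-]+)""",
                     re.IGNORECASE)


def http_charset(content_type):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, None if no charset parameter


def meta_charset(html_bytes):
    """Return the charset declared in a <meta> tag near the top, or None."""
    match = META_RE.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type, html_bytes):
    """HTTP header wins; the meta declaration is only a fallback."""
    return http_charset(content_type) or meta_charset(html_bytes)


page = b'<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
print(effective_charset("text/html; charset=ISO-8859-1", page))
print(effective_charset("text/html", page))
```

With the server claiming ISO-8859-1 the meta declaration is never consulted, which is exactly the situation complained about here.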
> Howeve
I would be happy if just this
would be enough to convince the browsers that the page is in UTF-8...
It isn't if the HTTP server claims that the pages it serves are in
ISO 8859-1. A sample of this is http://www.iki.fi/jhi/jp_utf8.html,
it does have the meta charset, but since the webserver (www
> You can look at http://www.unicode.org/unicode/alloc/Pipeline.html to see
> what's in the pipeline, but note that code points are not yet definite.
> There will be a beta period, beginning in January I believe.
Whatever happened to Old Hungarian, aka Hungarian Runic, aka rovasiras?
(sorry for
> glides, and GOK what else. Of course, Mark was only differentiating
> between "vowels" and "non-vowels," but that may not make things much
> easier; I still wouldn't know where to put English "y".
Off-hand, it seems that in English "y" mostly* is [j] if in initial position,
otherwise it's eith
> Kudos do not pay the rent. And altruism can run out when the rent
> needs to be paid ;-)
Very true. But you make the hasty assumption that font designing is the
activity creating the money for paying the rent.
> And who pays the poor font designer for his work?
U+0041 U+006C U+0074 U+0072 U+0075 U+0069 U+0073 U+006D U+0020 U+006F U+0072 U+0020
U+006B U+0075 U+0064 U+006F U+0073 U+002C U+0020 U+006D U+0061 U+0079 U+0062 U+0065
U+003F
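
For anyone who does not read code points fluently, a small sketch that turns the U+XXXX notation above back into text. The decode_code_points helper is just an illustrative name; it assumes whitespace-separated tokens of the form U+ followed by hex digits.

```python
def decode_code_points(text):
    """Turn whitespace-separated 'U+XXXX' tokens into the string they spell."""
    return "".join(chr(int(token[2:], 16)) for token in text.split())


message = """
U+0041 U+006C U+0074 U+0072 U+0075 U+0069 U+0073 U+006D U+0020 U+006F U+0072 U+0020
U+006B U+0075 U+0064 U+006F U+0073 U+002C U+0020 U+006D U+0061 U+0079 U+0062 U+0065
U+003F
"""
print(decode_code_points(message))  # Altruism or kudos, maybe?
```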
As another datapoint the following details the use of the apostrophe in Finnish
http://www.cs.tut.fi/~jkorpela/kielikello/merkit.html#heittomerkki
It's in Finnish :-) so allow me to summarize:
(1) if consonant gradation (the change or even elision of consonants) would cause
two same vowels belong
> Then there are two uses of apostrophes in quoting: within secondary quotation
> marks
Urk. I meant "within quotation marks as secondary quotation marks".