Intended just as friendly fuel for this cheery and interesting fire:
Well, English has become 'Chinese-like' (i.e. more like
isolating languages and less like inflecting languages)
recently(?) with less and less inflection.
I'm not an expert, but I don't think that's
It (English) seems to me like a kind of ultimate, permanent creole,
Sorry to reply to my own message, but this reminds me of another question:
have there been any attempts (apart from taggers and one art project
that I know of) to design/propagate ideograph-based writing systems
for English?
I think I'd like bijective too, if I knew what it meant. Someone?
It would be a lot more fun to answer this question in plain-text
Unicode (using math notation) than in ASCII.
Informally:
"Bijective" describes a mapping between two sets. Every element of
the source set ("the
[EMAIL PROTECTED] wrote:
"Unicode is a character set encoding standard which currently provides for
its entire character repertoire to be represented using 8-bit, 16-bit or
32-bit encodings."
Please say "encoding forms".
There are three distinct terms, that sound similar, and
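To make the "encoding forms" point concrete, here is a minimal sketch in Python. The sample character (U+1D11E) and the variable names are my own choices for illustration and are not taken from the thread; the sketch only shows that one code point is represented with a different number of 8-bit, 16-bit or 32-bit code units depending on the encoding form.

```python
# One code point, three encoding forms.
# The sample character is an arbitrary choice for illustration.
ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF

utf8_units = len(ch.encode("utf-8"))             # 8-bit code units
utf16_units = len(ch.encode("utf-16-be")) // 2   # 16-bit code units
utf32_units = len(ch.encode("utf-32-be")) // 4   # 32-bit code units

print(f"U+{ord(ch):04X}: UTF-8={utf8_units}, "
      f"UTF-16={utf16_units}, UTF-32={utf32_units} code units")
```

The character is the same in all three cases; only the width and count of the code units change.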
What exactly _would_ be wrong with calling UNICODE a
thirty-two bit encoding
If I have a 32-bit integer type, holding a Unicode code point, I have
11 bits left over to hold other data. That's worth knowing.
Btw, saying approximately 20.087 bits (Am I calculating that
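The arithmetic in this message can be checked with a short Python sketch. It assumes the Unicode codespace ends at U+10FFFF, which is not stated in the snippet itself; the constant and variable names are hypothetical.

```python
import math

# Assumption: the codespace runs from U+0000 through U+10FFFF.
MAX_CODE_POINT = 0x10FFFF
CODESPACE_SIZE = MAX_CODE_POINT + 1  # number of code points

# Whole bits needed to store any code point in an integer.
bits_needed = MAX_CODE_POINT.bit_length()

# Bits left over in a 32-bit integer type.
spare_bits = 32 - bits_needed

# Information-theoretic size of the codespace, in (fractional) bits.
fractional_bits = math.log2(CODESPACE_SIZE)

print(bits_needed, spare_bits, round(fractional_bits, 3))
```

Under that assumption, the "11 bits left over" figure and the "approximately 20.087 bits" figure both come out as the message states.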
We've seen several posts about the perception that Unicode is a
16-bit character set encoding. Among those, we've heard anecdotes
about the problems people have introducing newcomers to Unicode.
Here is a chapter of a reference manual I've been working on.
The original manual can be found at
Because of the widespread belief that Unicode stops at U+,
many fonts and applications that claim to support Unicode can
only handle basic characters, not supplementary characters.
Right. (Is it really a widespread belief? That's something I've
been wondering.)
So
It has proven difficult to come up with convenient terms for
the Unicode characters encoded at U+1 and beyond.
[]
2. A 'basic' code point, which may represent a 'basic
character', can range from U+ through U+.
For what purpose is such a
We are distributing an efficient, open source regular expression
pattern matcher in a C library. It implements the regular expression
language specified by W3C, in "XML Schema Part 2: Datatypes".
Our software can be retrieved from:
http://www.regexps.com
Hackerlab Rx-XML
We are distributing an open source C library that contains a
programming interface for accessing information taken from
"unidata.txt" and other Unicode databases. It provides space and time
efficient access to various character properties.
Our library does not contain all of the information