This is a test. No need to read it.

2001-05-06 Thread 11digitboy

ichi ni san yon

*** JUUICHIKETAJIN ***




___
Get your own FREE Bolt Onebox - FREE voicemail, email, and
fax, all in one place - sign up at http://www.bolt.com





Google is a major U+3070 U+304B (was: Re: Searchable web page ?!!)

2001-05-05 Thread 11digitboy



*** JUUICHIKETAJIN ***








- Original Message -
From: <[EMAIL PROTECTED]>
To: "James Kass" <[EMAIL PROTECTED]>
Sent: Friday, May 04, 2001 4:46 PM
Subject: Re: Searchable web page ?!!


> I don't know about other search engines, but the way
> Google seems to handle some charsets seems to make
> me think it is a
>
> U+3070 U+304B

ばか is Japanese for fool or idiot.  "Vaca" is pronounced
about the same and is the Spanish word for "cow".  Guess
cows aren't very smart.  A lot of times on Google, the
description for the page found says something like
"this page contains characters that can't be displayed
in the current character set...", which is kind of dumb
because all they would have to do at Google is make the
character set Unicode!

>
> and if it case folds, why not kana fold?
> Are search engines sensitive to characters, or only
> byte sequences? I mean, can it tell that -- OK, let's
> pick a good one -- U+304D and SJIS-82AB are the same
> thing?
>
> A big problem might be languages like Greek which use
> the second half of the possible byte list.
>
> Are the search engines smart enough to tell an alpha
> is an alpha is an alpha?
>

I don't know very much about the search engines, and I wonder
if you meant to send this letter to the Unicode list?

With best regards,

James Kass.
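
On the quoted question of whether U+304D and SJIS-82AB are "the same thing", and why a search engine might kana fold as well as case fold, here is a minimal Python sketch. It assumes Python's built-in shift_jis codec; the fold_kana helper and its constants are made up for illustration, and nothing here describes what Google or any other search engine actually does.

```python
# Sketch only: checks that a Shift_JIS byte pair and a Unicode code point
# name the same character, then folds hiragana to katakana.
HIRAGANA_FIRST = 0x3041  # HIRAGANA LETTER SMALL A
HIRAGANA_LAST = 0x3096   # HIRAGANA LETTER SMALL KE
KANA_OFFSET = 0x60       # the katakana block sits 0x60 above hiragana


def fold_kana(text: str) -> str:
    """Hypothetical helper: map each hiragana letter to its katakana twin."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if HIRAGANA_FIRST <= ord(ch) <= HIRAGANA_LAST else ch
        for ch in text
    )


# Decode the legacy bytes first; after that, comparison is on characters.
ki_from_sjis = b"\x82\xab".decode("shift_jis")
same_character = ki_from_sjis == "\u304d"

baka = "\u3070\u304b"     # the two code points from the subject line
folded = fold_kana(baka)  # katakana spelling of the same word
```

The point of the sketch is that a tool can only tell the two are the same if it knows the encoding of the bytes it was given. Once both sides are decoded, folding is a simple per-character mapping.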








Hex Unicode refs DO work in JavaScript

2001-05-01 Thread 11digitboy

Here is an example I wrote.

*** JUUICHIKETAJIN ***





Title: Han clock test








Han clock (mostly) by Juuitchan





Apostrophe vs. single quotes.

2001-05-01 Thread 11digitboy

"Don't you ever call me 'baby' again!" she yelled.
    ^                   ^    ^
    |                   |    |
These are three separate abstract characters, but I
use one glyph for all of them. What are the three codepoints
I use for them in Unicode? (Unicode encodes abstract
characters, I hear.)
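
One possible answer, sketched in Python. The choice of code points is an assumption on my part (U+2019 for the apostrophe and the closing quote, U+2018 for the opening quote, with U+0027 as the typewriter character that stands in for all three); the thread itself does not settle it. The script only looks up the official character names with the standard unicodedata module.

```python
import unicodedata

# Assumed mapping, for illustration only.
candidates = {
    "apostrophe in Don't": "\u2019",
    "opening single quote": "\u2018",
    "closing single quote": "\u2019",
    "typewriter glyph": "\u0027",
}

# Look up the formal Unicode name of each candidate character.
names = {label: unicodedata.name(ch) for label, ch in candidates.items()}

for label, ch in candidates.items():
    print(f"{label}: U+{ord(ch):04X} {names[label]}")
```

Note that under this assumption the apostrophe and the closing quote share one code point, so the three uses map to two code points, not three.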

*** JUUICHIKETAJIN ***









Terminology questions

2001-04-30 Thread 11digitboy

The attachment is in UTF-8.

Sorry if there is any mojibake.

*** JUUICHIKETAJIN ***






Two questions:

Question 1:
This is uppercase → SHOUJO ANIME
This is lowercase → shoujo anime
This is mixed case → Shoujo Anime
This is hiragana → しょうじょあにめ
This is katakana → ショウジョアニメ
What is this? → しょうじょアニメ

Question 2:
What is the English name for mojibake?



Decimal Unicodepoints

2001-04-24 Thread 11digitboy

I have a clock at

http://www.geocities.co.jp/AnimeComic-Pen/9973/index.html

(works best in MSIE)

that would have been MUCH easier to make if only I
had decimal Unicodepoints handy. I mostly worked from
your online standard to make it (I was at school, without
my Unicode book).
Why don't you make the next print edition of the Unicode
standard (not to mention online) with Unicodepoints
in decimal as well as hex?
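
Until such a table exists, the conversion is easy to script. A minimal Python sketch follows; the to_decimal helper is a made-up name, and U+6642 (the character for "time" or "hour") is only a sample value.

```python
def to_decimal(code_point: str) -> int:
    """Hypothetical helper: turn 'U+6642' or '6642' into a decimal integer."""
    digits = code_point[2:] if code_point.upper().startswith("U+") else code_point
    return int(digits, 16)


hour_decimal = to_decimal("U+6642")
hour_reference = f"&#{hour_decimal};"  # decimal form, as used in HTML
hour_character = chr(hour_decimal)

print(hour_decimal, hour_reference, hour_character)
```

Running a list of code points through a helper like this avoids the copying mistakes that come from converting by hand.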

Yes, I cheated and looked at decompositions for PARENTHESIZED
IDEOGRAPH FIRE and the like and may have made a mistake
or two copying numbers.

The clock is in Japanese, or should be.

*** JUUICHIKETAJIN ***









21-bit unicode

2001-04-18 Thread 11digitboy

21 = 3 * 7
so could you "flatten" it to 7-bit ASCII?
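
The arithmetic can be sketched in Python as below. This is only an illustration of splitting 21 bits into three 7-bit groups, with made-up function names; it is not any existing Unicode encoding form.

```python
def split_21_bits(code_point: int) -> tuple[int, int, int]:
    """Split a code point below 2**21 into three 7-bit values, high to low."""
    if not 0 <= code_point < (1 << 21):
        raise ValueError("code point does not fit in 21 bits")
    return ((code_point >> 14) & 0x7F, (code_point >> 7) & 0x7F, code_point & 0x7F)


def join_21_bits(groups: tuple[int, int, int]) -> int:
    """Inverse of split_21_bits."""
    high, middle, low = groups
    return (high << 14) | (middle << 7) | low


groups = split_21_bits(0x6642)
round_trip = join_21_bits(groups)
```

One catch with this sketch: the 7-bit values cover the whole ASCII range, control characters included, so the output would not be safe to treat as plain ASCII text, and every character (even a plain "A") would cost three bytes.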

*** JUUICHIKETAJIN ***









Re: FYI...

2001-04-12 Thread 11digitboy

Greater minds than mine it doth perplex
Innocent mail banned for containing "sex"
I hope they know where to draw the line
I'd hate to see censorship of figures "69". 

*** JUUICHIKETAJIN ***




 Sarasvati <[EMAIL PROTECTED]> wrote:
> I just got a bounce message from a machine in the
> 
> delivery path to one of our list users. The bounce
> said:
> 
> >  SMTP error from remote mailer after end of data:
> >  host mail.admiralsys.com [207.243.11.4]:
> >  550 Banned text appeared in header:
> >  'sex'
> 
> So I examined the bounced message... Can you guess
> where the banned text occurred? In a "Received" header:
> 
> >  Received: from ossex1.ossinc.net (corp.webb.net
> [207.182.160.15])
> 
> I suppose there is a lesson in this.  Would anyone
> care to supply a piquant aphorism?
> 
> Cheery regards from your,
>   -- Sarasvati
> 
> 
> 






The Code 2000 font.

2001-04-06 Thread 11digitboy

To the author of the Code 2000 font:

1) The top stroke of a capital J does not require serifs.
It is a serif.

2) The digit 4 could use some work. A digital-clock
4 would be a great improvement.

3) Japanese is fixed-width, I think.


To the rest of you:

Here is a kana sample from the font. Let it speak for
itself. Do not kill me; it is 2555 bytes (if that is
indeed a digit 5).

Oh yeah, it is missing too many Han to use for much.
Like U+6642 (character for "time" or "hour").

I guess maybe the kana would be good for some kind
of poster though.

*** JUUICHIKETAJIN ***





 c2ksamp.gif


Too many Han (was: Re: How many noncharacters, unassigned and private area code points in 3.1.)

2001-04-06 Thread 11digitboy

Too many Han.
How do we keep Han from eating up all the codepoints?

I mean, if we said no Han we'd get a lot of irate Chinese,
but still

If your browser (word processor, paint program, etc.)
takes U+2FF0 thru U+2FFB and actually tries to *DRAW*
the character, is that OK by Unicode rules?

Trying to list all the Han characters is probably like
trying to list all the songs ever written or something.
I doubt there will ever exist a complete list of all
Han characters.

What is the Chinese equivalent of the Jouyou Kanji,
anyway?

*** JUUICHIKETAJIN ***









Re: Iranian Rial sign proposal

2001-04-03 Thread 11digitboy

If they don't put it in this minute, there is something
WRONG. It is a CURRENCY symbol, for Pete's sake! I
mean, DOLLAR SIGN is not LATIN LETTER S WITH STROKE.
And it is *UNI*code.

Oh. You didn't tell us whether it goes to the left
or to the right of the digits, did you?
And it is considerably more important than KAWAII MYRIADS
HEART (for counting pinball points).

*** JUUICHIKETAJIN ***




 Roozbeh Pournader <[EMAIL PROTECTED]> wrote:
> 
> Dear friends,
> 
> You can find a proposal for encoding Iranian Rial
> sign in Unicode at:
> 
>   http://developer.sharif.edu/farsiweb/proposal/rial.html
> 
> We really appreciate your ideas and comments. Please
> send them personally
> to me (or to the list if it may be beneficial for
> others). I will send it
> to UTC for consideration after that.
> 
> --roozbeh
> 
> 
> 






Bytes per character

2001-03-29 Thread 11digitboy

この文は、２６バイトです。 ("This sentence is 26 bytes.")


Oh cool! I was expecting 39!
Why still only 26?
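
A short Python check of the byte counts, assuming the sentence in question is この文は、２６バイトです。 (13 characters) and using Python's built-in codecs. It is a sketch for comparison only.

```python
sentence = "この文は、２６バイトです。"

# Encoded length of the same 13 characters in three encodings.
sizes = {
    encoding: len(sentence.encode(encoding))
    for encoding in ("shift_jis", "utf-16-le", "utf-8")
}

character_count = len(sentence)
for encoding, size in sizes.items():
    print(f"{encoding}: {size} bytes for {character_count} characters")
```

If the message went out as Shift_JIS or as UTF-16, each of these characters takes two bytes, which gives 26. The expected 39 only appears with UTF-8, where each of them takes three bytes.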

*** JUUICHIKETAJIN ***




 Michael Everson <[EMAIL PROTECTED]> wrote:
> At 11:58 -0800 2001-03-23, Richard Cook wrote:
> >Another web page, for your collective amusement:
> >
> >http://linguistics.berkeley.edu/~rscook/html/Unicode-tetralog.html
> 
> Why, o why, does this thoroughly enjoyable exchange
> sound so very, 
> very much like one of the conversations I have with
> my own 
> beer-buying friends?
> 
> -- 
> Michael Everson  **  Everson Gunn Teoranta  **  
> http://www.egt.ie
> 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath
> 2; Éire/Ireland
> Mob +353 86 807 9169 ** Fax +353 1 478 2597 ** Vox
> +353 1 478 2597
> 27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co.
> Átha Cliath; Éire
> 
> 






[unicode] Reading mojibake

2001-03-23 Thread 11digitboy

I taught myself to read a bit of SJIS mojibake, partly
from studying the scrambled output of my clock program
with fullwidth digits (glitch plus O = 0, glitch plus
P = 1, etc., I think). Can anyone else here read mojibake?
What is the English word for mojibake?
Isn't Unicode mojibake three mojibake per character,
rather than just two like in SJIS? How do you fix this,
anyway? Like if you have a lot of Unicode text, so
you don't need the extra byte.
Maybe 2 1/2 bytes per character would be good. I mean
for the extra planes and all. You guys could make a
script for this in 10 minutes.
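
The "glitch plus O = 0" pattern can be reproduced with a few lines of Python. This sketch assumes the scrambled output came from Shift_JIS bytes being displayed as Windows-1252; the variable names are made up.

```python
FULLWIDTH_DIGITS = "０１２３４５６７８９"

# Second byte of each fullwidth digit in Shift_JIS, shown as an ASCII letter.
trail_letters = {d: chr(d.encode("shift_jis")[1]) for d in FULLWIDTH_DIGITS}

# Mojibake on purpose: Shift_JIS bytes misread as Windows-1252.
misread_zero = "０".encode("shift_jis").decode("cp1252")

# The same digit needs three bytes in UTF-8, hence three garbage glyphs.
utf8_length = len("０".encode("utf-8"))
```

Under that assumption the first byte of every fullwidth digit is the same "glitch" and the second byte runs through the letters O to X, which is why the digits stay readable once you know the trick.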
*** JUUICHIKETAJIN ***









Unicode font repository?

2001-03-12 Thread 11digitboy

Anybody know of a Unicode font repository?
Is there a Unicode version of Fraktur? I can't help
but wonder how the hiragana would look. Some, like
"ya" would doubtless adapt very prettily, but like
"nu"?

*** JUUICHIKETAJIN ***







Hex numbers in kanji??

2001-03-08 Thread 11digitboy

If you want to use Shodou for a report cover or something,
here is something I made up:

䆁䆁䆁侎䆁沎䆁�䆁媘䆁떎䆁ꪔ䆁ôÜ䆁抍䆁뎉䆁뢕䆁骒䆁䆁좌䆁岏䆁岏䆁岏䆁掁䆁岏좌䆁岏䆁岏䆁掁䆁좌岏䆁좌岏좌䆁厕䆁掁਍

It is in SJIS because I don't know how to post this
in Unicode.


| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \






sayounara

2000-12-14 Thread 11digitboy

Make maru (the round one) a kanji, though. It IS a
kanji.

| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \


 Sarasvati <[EMAIL PROTECTED]> wrote:
> Hello 11 Digit Boy...  Well well, it seems like I
> do
> quite a bit of extra housekeeping with you around,
> and
> I receive numerous complaints when you send messages
> like
> the one below.  Please either control your convulsions
> or sign off the list.  Now, say five "Hail Sarasvatis"
> and
> pound your head on the wall three times.
> 
> Your vigilant,
>   -- Sarasvati
> 
> > Re: Transcriptions of Unicode
> > 
> > Who needs those mungers? Let's nuke them straight
> to
> > HELL. WITH a nuke. Or at least a couple hundred
> hand
> > grenades.
> > 
> > | ||\ __/__  |   |  _/_   | ||   /
> > | _|_  ,--, /   \  /_|  -+- / --- | /
> > |V T_)| |   |\   |   ||/ _
> >  \_/   T /  \   /  __/   |   /---  \_/ L/ \
> 
> 
> 
> 





Re: Transcriptions of Unicode

2000-12-13 Thread 11digitboy

Who needs those mungers? Let's nuke them straight to
HELL. WITH a nuke. Or at least a couple hundred hand
grenades.

| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \


 Sarasvati <[EMAIL PROTECTED]> wrote:
> Michka wrote:
> 
> > Ok, it happened again. I can send mail to other
> people and the
> > encoding stays intact. Just the Unicode List is
> losing it.
> > Does anyone have any ideas on this?
> 
> Sarasvati contends that you're probably sending raw
> 8-bit mail
> over an SMTP connection without any indication of
> the encoding,
> nor any MIME headers in your message.  The raw message
> that was
> received by Unicode.ORG was _ALREADY_ munged into
> 7-bits, so the
> fault does not lie with Unicode.ORG.  Your original
> mail had this
> interesting header in it, which might be of some
> interest...
> 
> > Received: from 157.54.9.108 by mail5.microsoft.com
> (InterScan E-Mail VirusWall NT); Tue, 12 Dec 2000
> 10:20:42 -0800 (Pacific Standard Time)
> > Received: by inet-imc-05.redmond.corp.microsoft.com
> with Internet Mail Service (5.5.2651.58)
> >id ; Tue, 12 Dec 2000 10:20:41
> -0800
> 
> Probably someone else is munging your mail on its
> way to me.
> 
>   -- Sarasvati
> 





Did I do this right?

2000-12-07 Thread 11digitboy

I put some japanese text up on the web
http://11digitboy.stormloader.com
and i think i did it right. Is the &#n; format
correct where n is a unicode code in decimal? do
i set netscape to utf-8 to see it? what about msie?
please excuse the 1 handed typing.
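
For reference, a small Python sketch of the &#n; format with n in decimal. The to_decimal_ncr helper is a made-up name and the sample text is only an example; html.unescape from the standard library is used to check that the references decode back to the same characters.

```python
import html


def to_decimal_ncr(text: str) -> str:
    """Hypothetical helper: write every character as a decimal &#n; reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


sample = "日本語"
encoded = to_decimal_ncr(sample)
decoded = html.unescape(encoded)

print(encoded)
```

Because the references themselves are plain ASCII, a page written this way should not depend on which character encoding the browser is set to.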

| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \






Re: Fwd: Direct dispatch from London

2000-11-30 Thread 11digitboy



| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \


 Alain LaBonté  <[EMAIL PROTECTED]> wrote:
> Actual author unknown (anonymous)...
> 
> Enjoy!
> 
> Alain
> _
> >NOTICE OF REVOCATION OF INDEPENDENCE
> >
> >To the citizens of the United States of America,
> >
> >In the light of your failure to elect a President
> of the USA and  thus
> >to govern yourselves, we hereby give notice of the
> revocation of your
> >independence, effective today.
> >
> >Her Sovereign Majesty Queen Elizabeth II will resume
> monarchial duties
> >over all states,  commonwealths and other territories.
> Except Utah,
> >which she does not fancy.
> >Your new prime minister (The rt. hon. Tony Blair,
> MP for the 97.85% of
> >you who have until now been unaware that there is
> a world outside your
> >borders) will appoint a minister for America without
> the need for
> >further elections. Congress and the Senate will
> be disbanded. A
> >questionnaire will be circulated next year to determine
> whether any of
> >you
> >noticed.
> >
> >To aid in the transition to a British Crown Dependency,
> the following
> >rules are introduced with immediate effect:
> >
> >1. You should look up "revocation" in the Oxford
> English Dictionary.
> >  Then look up "aluminium". Check the pronunciation
> guide. You will be
> >amazed at  just how wrongly you have been pronouncing
> it. Generally, you
> >should raise your vocabulary to acceptable levels.
> Look up "vocabulary".
> >Using  the same twenty seven words interspersed
> with filler noises such
> >as "like" and "you know" is an unacceptable and
> inefficient form of
> >communication. Look up "interspersed".
> >
> >2. There is no such thing as "US English". We will
> let Microsoft know on
> >your behalf.
> >
> >3. You should learn to distinguish the English and
> Australian accents.
> >It really isn't that hard.
> >
> >4. Hollywood will be required occasionally to cast
> English actors as the
> >good guys.
> >
> >5. You should relearn your original national anthem,
> "God Save The
> >Queen", but only after fully carrying out task 1.
> We would not want you
> >to get confused and give up half way through.
> >
> >6. You should stop playing American "football".
> There is only one kind
> >of football. What you refer to as American "football"
> is not a very good
> >game. The 2.15% of you who are aware that there
> is a world outside your
> >borders may have noticed that no one else plays
> "American" football. You
> >will no longer be allowed to play it, and should
> instead play proper
> >football. Initially, it would be best if you played
> with the girls. It
> >is a difficult game. Those of you brave enough will,
> in time, be allowed
> >to play rugby (which is similar to American "football",
> but does not
> >involve stopping for a rest every twenty seconds
> or wearing full kevlar
> >body armour like nancies). We are hoping to get
> together at least a US
> >rugby sevens side by 2005.
> >
> >7. You should declare war on Quebec and France,
> using nuclear weapons if
> >they give you any merde. The 97.85% of you who were
> not aware that there
> >is a world outside your borders should count yourselves
> lucky. The
> >Russians have never been the bad guys. "Merde" is
> French for "shit".
> >
> >8. July 4th is no longer a public holiday. November
> 8th will be a new
> >national holiday, but only in England. It will be
> called "Indecisive
> >Day".
> >
> >9. All American cars are hereby banned. They are
> crap and it is for your
> >own good. When we show you German cars

and German music?

, you will
> understand what we
> >mean.
> >
> >10. Please tell us who killed JFK. It's been driving
> us crazy.
> >
> >Thank you for your cooperation.
> >
> >Her Sovereign Majesty Queen Elizabeth II
> 
> 





Japanese collation sequence?

2000-11-30 Thread 11digitboy

What is the Japanese collation sequence? Oh yeah, there
are a bunch of Roman letters thrown in. And digits
too.

Yeah, anime CDs.

Do I just katakanize the roman letters? And is "Sanzenin"
"sa-n-se-n-i-n" or "3-0-0-0-i-n"? And how do I do long
vowel mark?

| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \






Re: Kana and Case (was [totally OT] Unicode terminology)

2000-11-27 Thread 11digitboy



| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \


 Deborah Goldsmith <[EMAIL PROTECTED]> wrote:
> on 11/22/00 3:00 PM, Kenneth Whistler <[EMAIL PROTECTED]>
> wrote:
> 
> > It isn't really nonce usage, but rather the adoption
> of the formal
> > spelling mechanism of Katakana into Hiragana to
> indicate prosodic
> > length. The place you'll see this usage of the
> prolonged sound
> > mark fairly frequently is in Japanese comics, which
> are rather
> > loose and inventive in their use of spellings and
> "paraspellings"
> > to convey tone of voice and other prosodic information.
> 
> Another example is the use of dakuten on characters
> they're not normally
> applied to (e.g. U+3042 U+3099).
> 

You mean this?

_|_ ||
 |_
/| \/
\|_/\

I'm sorry, I can't draw.
How do you *say* it?


> Deborah Goldsmith
> Manager, International Toolbox Group
> Apple Computer, Inc.
> [EMAIL PROTECTED]
> 
> 





Re: Kana and Case (was [totally OT] Unicode terminology)

2000-11-22 Thread 11digitboy

Okay. Get out your copy of the lyrics to the Ranma
1/2 Complete Vocal Collection Vol. 1. Now look at
the lyrics to Ranbada Ranma (that's Track 12) and
tell me that the long vowel mark is not used with
hiragana.

| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \


 Thomas Chan <[EMAIL PROTECTED]> wrote:
> On Wed, 22 Nov 2000 [EMAIL PROTECTED] wrote:
> 
> > If the difference between "A" and "a" is called
> "case",
> > what is the difference between HIRAGANA LETTER
> YA
> > and KATAKANA LETTER YA called? (I think either
> of
> > those letters would do to describe this with
> the
> > new code pages. The description would be enhanced
> > by liberal application of HIRAGANA-KATAKANA LONG
> > VOWEL MARK.)
> 
> Maybe you should also be asking what the difference
> between U+0041 LATIN
> CAPITAL LETTER A, U+0391 GREEK CAPITAL LETTER ALPHA,
> and U+0410 CYRILLIC
> CAPITAL LETTER A is called.
>  
> However, although U+3084 HIRAGANA LETTER YA and
> U+30E4 KATAKANA LETTER YA
> are both derived from U+4E54 (the former from a
> cursive form; the latter
> from a simplification of the print form), it doesn't
> hold for most other
> kana, such as U+3042 HIRAGANA LETTER A and U+30A2
> KATAKANA LETTER A, which
> are derived from a cursive form of U+5B89 and a
> simplification of the
> print form of U+963F, respectively.
> 
> I don't get what you mean by "new code pages".
>  Who's creating those
> anymore?
>  
> Hiragana, unlike katakana, doesn't use U+30FC KATAKANA-HIRAGANA
> PROLONGED SOUND MARK for writing long vowels. 
> (Why does it have this name
> in Unicode?)  What's this "HIRAGANA-KATAKANA LONG
> VOWEL MARK"?--I see no
> such thing.
> 
>  
> > I like "Astral Planes" better.
> > Will they include INUKTITUT VIGESIMAL DIGITs?
> 
> I don't.  I write in Cantonese and some of contents
> of Plane 2 are very
> much down-to-earth for me.  Are you a musician?
>  If so, then Plane 1 would
> be important to you, too.
> 
> Throwing around terms like "Astral Planes", whether
> official or not, will
> just engender lack of credibility for Unicode,
> which has already happened
> to some extent among people who heard about some
> "Klingon" (in the Private
> Use Area) in Unicode.
> 
> 
> Thomas Chan
> [EMAIL PROTECTED]
> 
> 
> 





Re: [totally OT] Unicode terminology (was Re: string vs. char [was Re: Java and Unicode])

2000-11-22 Thread 11digitboy

If the difference between "A" and "a" is called "case",
what is the difference between HIRAGANA LETTER YA
and KATAKANA LETTER YA called? (I think either of
those letters would do to describe this with the
new code pages. The description would be enhanced
by liberal application of HIRAGANA-KATAKANA LONG
VOWEL MARK.)

I like "Astral Planes" better.
Will they include INUKTITUT VIGESIMAL DIGITs?

I should have voted Sarasvati for US President. Instead
I voted for Saotome Nodoka.


| ||\ __/__  |   |  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / --- | /
|V T_)| |   |\   |   ||/ _
 \_/   T /  \   /  __/   |   /---  \_/ L/ \


 Marco Cimarosti <[EMAIL PROTECTED]>
wrote:
> David Starner wrote:
> > Sent: 20 Nov 2000, Mon 16.18
> > To: Unicode List
> > Subject: Re: string vs. char [was Re: Java and
> Unicode]
> >
> > On Mon, Nov 20, 2000 at 06:54:27AM -0800, Michael
> (michka)
> > Kaplan wrote:
> > > From: "Marco Cimarosti" <[EMAIL PROTECTED]>
> > >
> > > > the Surrogate (aka "Astral") Planes.
> > >
> > > I believe the UTC has deprecated the term Astral
> planes with extreme
> > > prejudice. HTH!
> >
> > The UTC has chosen not to use the term Astral Plane.
> Keeping
> > that in mind,
> > I can choose to use whatever terms I want, realizing
> of course
> > that some
> > may not get my point across. The UTC chose Surrogate
> Planes
> > for perceived
> > functionality and translatability; I chose Astral
> Planes for
> > perceived grace and beauty.
> 
> Well, I am not as angrily pro "Astral Planes" as
> David is, but I too find
> the humorous term prettier than the official one.
> And I used it because I
> think that a few people on this list may still
> find it clearer than the
> official "Surrogate Planes" -- which is more serious
> and descriptive, but
> still relatively new to many.
> 
> Moreover, although my attitude towards the UTC

I thought UTC meant Coordinated Universal Time,
like this: UTC 2000a11l22d13h02m.

> (the "government" of Unicode)
> is much more friendly than my attitude towards
> real governments out there
> (if people like J. Jenkins or M. Davis were the
> President of the USA this
> would be a much nicer world!), still I don't feel
> quite like obeying any
> government's orders, prohibitions or deprecations
> without opposing the due
> resistance.
> 
> 8-) (<-- smiley wearing anti-tear-gas glasses)
> 
> _ Marco
> 
> __
> La mia e-mail è ora: My e-mail is now:
> >>>   marco.cimarostiªeurope.com   <<<
> (Cambiare "ª" in "@")  (Change "ª" to "@")
>  
> 
> __
> FREE Personalized Email at Mail.com
> Sign up at http://www.mail.com/?sr=signup
> 





Character counter?

2000-11-11 Thread 11digitboy

Is there a program that will count characters in
a text file?
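
A few lines of Python will do it, provided you know the file's encoding. This is a minimal sketch: count_characters is a made-up name, UTF-8 is only the assumed default, and "characters" here means Unicode code points, not glyphs on screen.

```python
import tempfile
from pathlib import Path


def count_characters(path, encoding: str = "utf-8") -> dict:
    """Return the byte count and code point count of a text file."""
    raw = Path(path).read_bytes()
    text = raw.decode(encoding)
    return {"bytes": len(raw), "code_points": len(text)}


# Small demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("ばか cow", encoding="utf-8")
    counts = count_characters(sample)

print(counts)
```

The two numbers differ whenever the file holds anything outside ASCII, which is the usual reason a plain byte counter gives a surprising answer.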

| ||\ __/__  /   \  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / ,-. | /
|V T_)| |   |\   |   ||/ _
 \_/   + /  `'  '  __/   \   /`-'  \_/ L/ \






Possibility for internationalization of domain names

2000-11-10 Thread 11digitboy

This plan is for Plane 0 values only, though odds
are it could be extended to other planes.

Instead of base 16, use base 32. That means the digits
are 0, 1, 2, 3, ... 9, A, B, ... V. So each is 5
bits. 3 together is 15 bits. Have 2 (not 1) escape
characters and kabam! your 16 bits, in only 4 bytes
of URI.
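
To make the plan concrete, here is a rough Python sketch. The two escape characters are an assumption: I picked X and Z because the base-32 digits only run from 0 to V, so they cannot be confused with a digit. The function names are made up as well, and this is not any actual domain-name standard.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUV"  # base 32, five bits per digit
ESCAPES = ("X", "Z")  # hypothetical escape characters; the choice carries bit 16


def encode_bmp(code_point: int) -> str:
    """Encode one Plane 0 code point as an escape character plus three digits."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("only Plane 0 values are covered by this sketch")
    high_bit, low15 = code_point >> 15, code_point & 0x7FFF
    return (
        ESCAPES[high_bit]
        + DIGITS[low15 >> 10]
        + DIGITS[(low15 >> 5) & 31]
        + DIGITS[low15 & 31]
    )


def decode_bmp(token: str) -> int:
    """Inverse of encode_bmp for a single four-character token."""
    high_bit = ESCAPES.index(token[0])
    low15 = 0
    for digit in token[1:4]:
        low15 = low15 * 32 + DIGITS.index(digit)
    return (high_bit << 15) | low15


ki_token = encode_bmp(0x304D)
ki_back = decode_bmp(ki_token)
```

Each character costs exactly four bytes this way, and the result stays within letters and digits.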

Is this OK?

| ||\ __/__  /   \  _/_   | ||   /
| _|_  ,--, /   \  /_|  -+- / ,-. | /
|V T_)| |   |\   |   ||/ _
 \_/   + /  `'  '  __/   \   /`-'  \_/ L/ \






RE: Number separators

2000-10-31 Thread 11digitboy

About those numbers: (^_^)

Sometimes I like to write this for a pinball score:
34`8614`7040

Once I saw a Japanese person use a format like this: 42,9496,7296
but that is probably non-standard.
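
Both formats above group the digits in fours from the right. A minimal Python sketch of that grouping follows; group_digits is a made-up helper, and it only handles non-negative integers.

```python
def group_digits(number: int, group_size: int = 4, separator: str = ",") -> str:
    """Insert a separator every group_size digits, counting from the right."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    digits = str(number)
    groups = []
    while digits:
        groups.append(digits[-group_size:])
        digits = digits[:-group_size]
    return separator.join(reversed(groups))


by_myriads = group_digits(4294967296)
pinball = group_digits(3486147040, separator="`")
by_thousands = group_digits(4294967296, group_size=3)
```

Grouping by four lines up with counting in myriads (10,000s), while the usual Western grouping by three lines up with thousands.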





Colours

2000-10-19 Thread 11digitboy

Are there languages you might need to encode where
colour is important? (such as, if a certain shape
in red is one letter, but in blue it is a different
letter)





Re: "Giga Character Set": Nothing but noise

2000-10-15 Thread 11digitboy

It seems to me that if not for that, how could anyone
make a Chinese font? Who is going to sit down and
draw a *myriad* or more characters? Since elements
recur, this reduces the amount of labour required
greatly.
..
..

[OT] Are there any character-encoding schemes that
have CENTESIMAL DIGIT TEN, CENTESIMAL DIGIT ELEVEN,
... CENTESIMAL DIGIT NINETY-NINE? I had a clock with
SEXAGESIMAL DIGITs ZERO through FIFTY-NINE on one
wheel. Then I got sick of the noise it made sometimes
and ripped the digits out.  It seems to me that
in vertical text, what would be better than

san
zen
go
hyaku
yon
juu
hachi
nin

or

3
5
4
8
nin

would be

35
48
nin

but this is not allowed, is it?




 Jon Babcock <[EMAIL PROTECTED]> wrote:
> 
> "Carl W. Brown" <[EMAIL PROTECTED]> wrote:
> 
> > If you were to start all over again with no interest
> in
> > compatibility with existing code pages, you could
> drop the preformed
> > characters.
> 
> Since I've commented about the possibility of using
> a set of less than
> 2000 or so characters to represent all Chinese
> graphs more than once
> on this mailing list over the past few years, I'll
> be brief this time.
> 
> Such a system was developed nearly fifty years
> ago by Peter
> A. Boodberg, at the Department of Oriental Languages
> at the University
> of California, Berkeley. His work was based directly
> on a study of
> Chinese sources, especially the Shuowenjiezi Dictionary.
>  I was
> fortunate to be able to study under Professor Boodberg
> during his last
> couple years at Berkeley, shortly before his death
> in 1972. I've
> rewritten some of his ideas and placed them on
> my web site (kanji.com)
> under the name of CHA (Chinese Hemigram Annotation).
>  And because it
> is difficult to find his original writings on this
> subject, I intend
> to host a few of Boodberg's key 'cedules' soon.
> 
> When I first heard about Unicode (probably in late
> 1991), I naively
> assumed that it would employ some version of the
> Boodberg approach,
> i.e., the use of a 'small' subset of Chinese from
> which the entirety
> is composed. But, as has been stated many times
> on this list, the
> preferred approach was to base the Unicode Han
> repertoire on lists of
> precomposed hanzi/hanja/kanji that were actually
> in use in computers
> and, for the most part, were sanctioned by national
> governments. This
> was natural given the fact that the details (and
> here the details mean
> everything) of a system such as the one Dr. Boodberg
> envisioned were
> probably not available to the Unicode people, not
> were they in use by
> any national, commercial, or even academic body.
> In other words, it
> would have meant that such an approach would have
> had to have been
> developed by what came to be known as the Unicode
> Consortium itself.
> 
> Although difficult, I believe that within the decade,
> the composition
> of the Chinese script will be recognized and well-understood,
> and the
> option to treat each of the tens of thousands of
> Chinese graphs,
> including new ones but excluding of course the
> 300 or so unsegmentable
> wen, as a digraph that can be decomposed into hemigrams
> will be made
> available, perhaps even in Unicode.
> 
> In the meantime, vis-a-vis Unicode and the Han
> repertoire, it's a case
> of 'get over it'. I had to.
> 
> Jon
> 
> -- 
> Jon Babcock <[EMAIL PROTECTED]>
> 





Re: Any one done an Arabic site?

2000-10-10 Thread 11digitboy

I think one of the entrance pages of www.grownupgirl.com
(a porn site) has something like that. But the site
won't load well for me now, although it loaded yesterday.
It has the "Ha ha, let's display the words 'last
updated' followed by JavaScript to show the date
and see how many people we fool" trick. I think that's
a classic trick.



--
Robert Lozyniak
Accusplit pedometer manufacturers can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 George Zeigler <[EMAIL PROTECTED]> wrote:
> Hello,
> 
>has anyone on this list ever built a website
> which pulls text from a
> database into the html for both right to left and
> left to right languages. 
> Such as Arabic and English for instance.
> 
> Thanks
> George
> 





Forget what I just said

2000-10-10 Thread 11digitboy

They're images (I should have known not to trust
anybody who uses that date trick!)

--
Robert Lozyniak
Accusplit pedometer manufacturers can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







U+3007 not a Hanzi?

2000-10-03 Thread 11digitboy

I have seen U+3007 classified as a Hanzi. Why is
it not considered a Hanzi in Unicode? Because it
is the only Hanzi that uses that stroke??
If it is not a Hanzi, what, then, is it?

--
Robert Lozyniak
Accusplit pedometer manufacturers can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Why are kana called "letters", not "syllables"?

2000-09-22 Thread 11digitboy

Why, for instance, is HIRAGANA LETTER ME called HIRAGANA
LETTER ME and not HIRAGANA SYLLABLE ME? It might
be explained by how they are used, for instance,
Japanese palindromes, I hear, work kana by kana.
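
For comparison, here is a small sketch that prints the formal names of one character from each of three scripts. It assumes Python's standard `unicodedata` module; the helper `char_name` and the choice of the Hangul and Ethiopic examples are mine, picked only because those scripts do use SYLLABLE in their names.

```python
import unicodedata

def char_name(ch):
    """Return the formal Unicode name of a single character."""
    return unicodedata.name(ch)

# Kana are named LETTER; Hangul and Ethiopic characters are named SYLLABLE.
for ch in ("\u3081", "\uac00", "\u1200"):
    print(f"U+{ord(ch):04X}", char_name(ch))
```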

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Re: when a language dies six butterflies disappear from the conscious

2000-09-16 Thread 11digitboy

Are there languages with systematic color-naming
schemes, like computer hex codes for colours?

This reminds me of a certain all-vowel Japanese word,
and I think you know which word I mean.

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Timothy Greenwood <[EMAIL PROTECTED]>
wrote:
> You may care to take a brief break from language
> identifiers to appreciate
> these lines. They are the final two paragraphs
> from the essay "The Last
> Word. Can the World's Small Languages Be Saved?"
> by Earl Shorris in the
> August edition of Harper's magazine.
> 
> " I think now that every language has its Ellam
> Yua. The consolation the old
> men sought existed only in Maya. Every epithet
> implied a unique set of
> attributes, every sound described a unique Being.
> It is not merely a
> writer's conceit to think that the human world
> is made of words and to
> remember that no two words in all the world's languages
> are alike. Of all
> the arts and sciences made by man, none equals
> a language, for only a
> language in its living entirety can describe a
> unique and irreplaceable
> world. I saw this once, in the forest of southern
> Mexico, when a butterfly
> settled beside me. The color of it was a blue unlike
> any I had ever seen,
> hue and intensity beyond naming, a test for the
> possibilities of metaphor.
> In the distance lay the ruined Maya city of Palenque,
> where the glyphs that
> speak of the reign of the great lord Pacal are
> carved in stone. The glyphs
> can be deciphered now. Perhaps. Only perhaps, for
> no one knows what words
> were spoken, what sounds were made when Pacal the
> Conqueror reigned. It may
> seem cryptic or even Socratic to say, but, in truth,
> only spoken words can
> be heard.
>  There are nine different words in Maya for the
> color blue in the
> comprehensive Porrúa Spanish-Maya Dictionary but
> just three Spanish
> translations, leaving six butterflies that can
> be seen only by the Maya,
> proving beyond doubt that when a language dies
> six butterflies disappear
> from the consciousness of the earth."
> 
> Tim
> 
>  
> 





Re: surrogate terminology

2000-09-12 Thread 11digitboy

So what notation do you use? 0x8000 is just another
way to say 32768.

By the way, how was the conference?

Um... give him a REALLY high plane, like in 
oh, I don't know how high.
You can't keep giving people their own planes, because
then I'll want one, and then you'll rip my skin away
from my body for wanting one. And Sarasvati will
make sure no woman ever wants me, not that that would
change anything.

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 [EMAIL PROTECTED] wrote:
> 
> On 09/12/2000 02:59:38 PM Kenneth Whistler wrote:
> 
> [snip]
> 
> I think Ken's comments on planes is good.
> 
> 
> >3. The term "surrogate character" should be eschewed
> altogether, because
> >   of the confusion it causes. "Surrogate code
> point" can continue to
> >   be used as it currently is, and the term "surrogate
> pair" is also
> >   useful. But the other terminology related to
> characters...
> 
> The other terminology Ken discussed had to do with
> the plane in which a
> character is found. What I think is still open
> is how d800 - dfff get
> referred to. Ken indicated that "surrogate code
> point" can continue in use
> as is; I don't recall exactly how TUS 3.0 uses
> it. (Would have made for a
> rather challenging trivia question :-) My biggest
> concern here is that
> people should not be referring to U+d800 - U+dfff
> as characters. (I'd be
> willing to accept code point, provided there is
> a clear statement as to
> what is meant by a code point.) For that matter,
> I'd be inclined to say
> that the U+ notation should not be used here -
> U+ should be reserved for
> use to refer to encoded characters in terms of
> their Unicode scalar values.
> So, 0xd800 is OK, but U+d800 would be wrong.
> 
> 
> 
> - Peter
> 
> 
> ---
> Peter Constable
> 
> Non-Roman Script Initiative, SIL International
> 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
> Tel: +1 972 708 7485
> E-mail: <[EMAIL PROTECTED]>
> 
> 
> 
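
The distinction Peter draws between the values d800 - dfff and Unicode scalar values can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the pair arithmetic is the usual UTF-16 calculation rather than anything quoted from the thread.

```python
def is_surrogate(cp):
    """True if cp lies in the surrogate range 0xD800..0xDFFF."""
    return 0xD800 <= cp <= 0xDFFF

def is_scalar_value(cp):
    """True if cp is a Unicode scalar value (in range and not a surrogate)."""
    return 0 <= cp <= 0x10FFFF and not is_surrogate(cp)

def to_surrogate_pair(cp):
    """Split a supplementary scalar value into its (high, low) surrogate pair."""
    if not (0x10000 <= cp <= 0x10FFFF):
        raise ValueError("only values above the BMP need a surrogate pair")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

# Hex and decimal are just two spellings of the same number.
print(0x8000 == 32768)
print([hex(v) for v in to_surrogate_pair(0x10000)])
```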





Re: FW: Date Controls

2000-08-17 Thread 11digitboy

Let's see: Could it, for instance, show the date
in the Hebrew calendar and the time in hours and
halakim?

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 "Magda Danish (Unicode)" <[EMAIL PROTECTED]>
wrote:
> 
> 
> -Original Message-
> From: Faheem Ahmed Khan [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, August 17, 2000 4:20 AM
> To: [EMAIL PROTECTED]
> Subject: Date Controls 
> 
> 
> Hi,
> 
> We are planning for a date control which would
> show up dates/time in any
> character. Now, how do we find out how many characters
> does a date in each
> language take up?
> For example: to display AM in Arabic/Japanese, how many
> bytes are required?
> similarly for everything else.
> 
> I'm sure there must be some standard to this. Or
> is it enough if I make it
> Unicode compliant and it will take care of the
> problem?
> 
> Thanx
> rgds -Faheem
> 
> 
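
On the "how many bytes" question: the answer depends on the encoding, not just on the language. A rough sketch, assuming Python; the AM marker strings below are illustrative assumptions, not data from any particular locale database, and `encoded_length` is a hypothetical helper.

```python
# Illustrative AM markers; these strings are assumptions for the sketch.
AM_MARKERS = {
    "English": "AM",
    "Japanese": "\u5348\u524d",  # "gozen"
    "Arabic": "\u0635",          # abbreviation commonly used for a.m.
}

def encoded_length(text, encoding):
    """Return the byte count of text in encoding, or None if it cannot be encoded."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None

for language, marker in AM_MARKERS.items():
    sizes = {enc: encoded_length(marker, enc)
             for enc in ("utf-8", "utf-16-le", "shift_jis")}
    print(language, len(marker), "character(s)", sizes)
```

So a control that reserves space in characters (or better, in rendered width) avoids having to pick a byte count per language.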





Mixing alphabets (was: sorting my CD collection)

2000-08-10 Thread 11digitboy

You have a good point:  does nu-alpha-tau-alpha-sigma-alpha
spell "Natasa" or "Natasha"? The Greek letters given
are obviously an attempt to write "Natasha" in Greek,
but they romanize to "Natasa".

And a, b, c, d, e, f, g, h, ... HATES a, i, u, e,
o, ka, ki, ku, ...

Maybe I should just capitalize everything (except
Georgian? ... not that I have any Georgian CDs, or
am likely to... I bet few things would be rarer than,
say, a Georgian female rap CD in the US!!) and from
there, just sort by codepoint number... no good,
"Á" would come after "Z"...

Would somebody PLEASE tell me, IN THE DEFAULT UNICODE
COLLATION ALGORITHM, WHAT COMES AFTER WHAT?! I could
use a list of Unicode characters in proper collation
order, with "ties" labeled!!
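
To make the "Á comes after Z" problem concrete, here is a small sketch. It assumes Python, and `crude_key` is a hypothetical helper of mine: it strips accents and folds case before comparing, which is NOT the Unicode Collation Algorithm, only a stand-in to show the difference from raw code point order.

```python
import unicodedata

def crude_key(s):
    """Sort key: accent-stripped, case-folded text, then the original as tie-breaker."""
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), s)

titles = ["Zebra", "\u00c1pple", "Apple"]

# Plain sorting compares code points, so U+00C1 lands after "Z".
print(sorted(titles))
# With the crude key, the accented title sorts next to its unaccented twin.
print(sorted(titles, key=crude_key))
```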

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Antoine Leca <[EMAIL PROTECTED]> wrote:
> Robert Lozyniak wrote:
> >
> > How do you sort text with some in Roman and some
> > in non-Roman alphabets?
> 
> I never sort texts, only lists of items (words,
> names, titles, whatever).
> 
> Depending of the ratios, I see two main solutions:
> 
> - if Latin is the most current, _and_ only other
> Greek-
> derived scripts are used, _and_ the intended audience
> is proficient enough, I may intersperse the non-Roman
> letters as if all the Greek-derived alphabets shared
> a common order (so Greek alpha sorts just after
> Latin a,
> Cyrillic ve after Cyrillic be which follows Greek
> beta
> which follows Latin b, Greek xi after the o's and
> before
> the p's, etc.)
> 
> - in other cases, I sort the scripts separately.
> 
> 
> > Currently, I'm just romanizing
> > everything but I don't know if that is that good.
> 
> Hmmm. I won't do that. It would take me much too
> long
> to find something that begin with beta at the V
> section,
> while something that begin with mu+pi at the B
> section...
> For Cyrillic, I expect U+0427 to romanize as tcha,
> and U+0429 as chtcha, and I am not sure you will
> (or
> vice-versa).
> 
> Things are different if you actually transliterate,
> i.e. if the items are presented in Latin script.
> 
> 
> > It is probably bad to kanize digits, because
> they
> > would sort 1, 9, 5, and so on, or some other
> mixed-up
> > order.
> 
> It is always a problem to sort the digits, anyway.
> Since they are usually only a few of them, I believe
> the
> best place is the foremost, so the search does
> not takes
> too long. But if they are more than a bunch, that
> is
> pretty always a brain damage.
> 
> 
> Antoine
> 





Organizing your CD collection

2000-08-10 Thread 11digitboy

How do you sort text with some in Roman and some
in non-Roman alphabets? Currently, I'm just romanizing
everything but I don't know if that is that good.
Should I just kanize Japanese?
I would love a system that just goes by characters,
and I would much prefer it if the Han digits collated
in numerical order.
So far, I have 3 alphabets: Latin, Greek, and Japanese.
It is probably bad to kanize digits, because they
would sort 1, 9, 5, and so on, or some other mixed-up
order.
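
A sketch of the Han digit problem, assuming Python. The value table is written out by hand for illustration (it is my own assumption, not taken from any library), and it shows that sorting by code point scrambles the digits while sorting by value does not.

```python
# Han digits one through nine, listed in numerical order.
HAN_DIGITS = "\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d"

# Hand-written value table for the sketch: first digit is 1, second is 2, ...
HAN_DIGIT_VALUES = {ch: i + 1 for i, ch in enumerate(HAN_DIGITS)}

def by_code_point(chars):
    """Sort characters by raw code point."""
    return "".join(sorted(chars))

def by_value(chars):
    """Sort Han digits by the number they stand for."""
    return "".join(sorted(chars, key=HAN_DIGIT_VALUES.__getitem__))

# Code point order gives the values 1, 7, 3, 9, 2, 5, 8, 6, 4.
print([HAN_DIGIT_VALUES[ch] for ch in by_code_point(HAN_DIGITS)])
print([HAN_DIGIT_VALUES[ch] for ch in by_value(HAN_DIGITS)])
```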

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Why not to move characters (was: is there any way to change already defined character codes?)

2000-08-08 Thread 11digitboy

You don't want to move characters because then you
could change the meaning of a sentence that way.
I don't want to price something at 1000 cows when
I mean 1000 yen. Or worse, 100 yen.





RE: is there any way to change already defined character codes?

2000-08-08 Thread 11digitboy



--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 [EMAIL PROTECTED] wrote:
> Sandro Karumidze wrote:
> > The issue is that in Unicode there is a  sequence
> of Georgian 
> > characters different
> > from what these people think it should be.
> > [...] In beginning of this century 5 characters
> were dropped
> > [...]
> > In Unicode this 5 characters follow 33. There
> is a different 
> > point of view that those 5 should be included
> among the
> > others.
> 
> (You definitely need an official reply, but let's
> go on with some more
> informal chatting.)
> 
> I foresee that this would not be considered a good
> reason to change
> anything.
> 
> The order of characters in Unicode (or in any other
> character encoding) is
> not important. The scope of a character set is
> to assign a unique number to
> each character, not to define an "alphabetical
> order".
> 
Yeah. Just look at the kanji digits!

> If you notice, the situation that you describe
> is true for *all* the
> alphabets in Unicode.
> 
> E.g., if you look at the Latin part, you see that
> the 26 letters used in
> modern English are all contiguously ordered in
> two areas: U+0041 to U+005A
> (uppercase) and U+0061 to U+007A (lowercase).

Yeah, but so what? All you gotta do is turn the 6th
bit off and there you go!
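
A quick sketch of that bit trick, assuming Python; `CASE_BIT` and `ascii_upper` are names I made up for the illustration. It only works for the 26 basic Latin letters, which is rather Marco's point about everything else.

```python
CASE_BIT = 0x20  # bit 5 counting from 0, i.e. the sixth bit counting from 1

def ascii_upper(ch):
    """Uppercase one character by clearing the case bit; ASCII a-z only."""
    if "a" <= ch <= "z":
        return chr(ord(ch) & ~CASE_BIT)
    return ch  # anything else is left untouched

print("".join(ascii_upper(c) for c in "hello"))
print(hex(ord("a")), hex(ord("A")), hex(ord("a") ^ ord("A")))
```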
> 
> But that's the end of the story! All the other
> 100's Latin letters are
> scattered all over, using no consistent order.
> 
Too bad Unicode values can't be fractions!!

> The same is true for Cyrillic, Greek, Hebrew, Arabic,
> and so on. Have a look
> at those blocks: the basic letters for post-czar
> Russian, modern Greek,
> Israeli Hebrew, modern Arabic etc. are consistently
> ordered, but the letters
> for other languages that use the same alphabets
> (or ancient letters for the
> same languages) are scattered all over with no
> specific order.
> 
> The reason why no one cares about the order of
> characters is that it is
> *impossible* to determine a "correct" order.
> 
> In alphabet used by more than one language (e.g.
> Latin, Cyrillic, Arabic,
> Devanagari, etc.), the alphabetic order is normally
> different for each
> language.
> 
> Moreover, many languages have more than one alphabetic
> order, all equally
> valid and in current usage.
> 
> For this reason the problem of "alphabetic order"
> has been pulled apart from
> character sets, and addressed separately.
> 
> In Unicode, the issue of "collation" is handled
> by ad-hoc optional
> algorithm, that is part of the standard but is
> separated from the encoding
> issue itself.
> 
> The algorithm is titled "Unicode Technical Report
> #10: Unicode Collation
> Algorithm", and you can find it here:
> http://www.unicode.org/unicode/reports/tr10/ .
> 
> *That* is the place to check whether Georgian Letters
> are in the correct
> order or not. And if they are not, you have two
> options:
> 
> 1) Ask Unicode to change it: here you *do* have
> some chances to be listened,
> if you have valid arguments.
> 
> 2) Change it yourself: unlike the character values,
> the collation algorithm
> is designed to be flexible and customizable.
> 
> Regards,
> _ Marco
> 





GEORGIAN DIGITs

2000-08-08 Thread 11digitboy

Where are the Georgian digits? I want a set of Georgian
digits so I can use them as counter digits.

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Re: FW: Unicode - Exponent and indication sign

2000-08-08 Thread 11digitboy

Yes. Try the middle of the "20__" range of characters.
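
To check which of these characters ISO 8859-1 itself can carry, one can simply try to encode them. A minimal sketch, assuming Python and its built-in `latin-1` codec; `fits_latin1` is a hypothetical helper name, and the sample characters are my own picks.

```python
def fits_latin1(ch):
    """True if ch can be encoded in ISO 8859-1 (Latin-1)."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

# Superscript two and the masculine ordinal are in Latin-1;
# superscript four lives in the U+2070 block and is not.
for ch in ("\u00b2", "\u00ba", "\u2074"):
    print(f"U+{ord(ch):04X}", fits_latin1(ch))
```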

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 "Magda Danish (Unicode)" <[EMAIL PROTECTED]>
wrote:
>  
> -Original Message-
> From: Marchand, Gilles [mailto:[EMAIL PROTECTED]]
> Sent: Monday, August 07, 2000 6:33 AM
> To: '[EMAIL PROTECTED]'
> Subject: Unicode - Exponent and indication sign
> 
> 
> 
> 
> 
> 
> Hello, 
> 
> 
> we plan to use the ISO LATIN 8859-1
> as our default character
> set. A question from a user was: does it support
>exponentiation N2, or
> the indication sign O4 ? If so where can I find
>  the how to use method?
> 
> 
> 
> thank you for listening to me. 
> 
> 
> 
> Gilles Marchand 
> UQAM - Library system 
> [EMAIL PROTECTED] 
> 
> 





Re: Off-topic: digraphs and trigraphs

2000-08-04 Thread 11digitboy



--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Doug Ewell <[EMAIL PROTECTED]> wrote:
> Does anyone know of a commonly used or commonly
> accepted collective
> term for multi-character sequences (e.g. digraphs,
> trigraphs, etc.)?
> I'm thinking that the word "multigraph" would be
> appropriate, but I
> don't want to invent my own term if one already
> exists.  (Besides, it
> sounds like an early '80s software package.)
> 
> Along the same lines, would the term "quadrigraph"
> be appropriate for
> a four-character sequence?
> 
---

My guess would be "tetragraph" (Greek root for four).

---
> Thanks in advance for any tips.  Please respond
> privately unless you
> feel your response may be of interest to the list
> at large.

Well, we ought to agree on what to call them, no?

> 
> -Doug Ewell
>  Fullerton, California
> 





What is ` (U+0060) for?

2000-08-02 Thread 11digitboy

What's ` for? To space what? I pretty much just use
it for writing big numbers, like 42`9496`7296. 

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Fonts for I, L, 1

2000-08-02 Thread 11digitboy

 I don't need the word "ill" to look like Roman numeral
three. OKAY, ANYBODY KNOW OF SOME GOOD, BIG FONTS
WHERE NO TWO CHARACTERS LOOK ALIKE? This would especially
be good for capital pee and capital rho. These fonts
would ideally have this property in a "normal", small
size.

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







That also goes for ichi and hyphen.

2000-08-02 Thread 11digitboy

That also goes for ichi (the kanji corresponding
to our digit 1), and the kanji hyphen. I don't want
those to look alike. You don't want them to either,
ne?

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Re: Euro

2000-07-29 Thread 11digitboy

Yeah, how WOULD you make a serifed, rounded E that
doesn't look silly and doesn't look like a C with
an extra line? Well, maybe you can, I dunno. Anyone
who can do that, I'd like to see it. 

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Asmus Freytag <[EMAIL PROTECTED]> wrote:
> At 12:13 PM 7/28/00 -0800, Roozbeh Pournader wrote:
> >I was not talking about the shape. I think all
> of us have seen it, and
> >many have also read the documents which define
> its exact shape using a
> >ruler and a compass. I was talking about the origin
> of the shape.
> 
> In some sense, except for purists, this discussion
> is rapidly becoming 
> moot. The 'euro glyphs' have been out in the wild,
> on shop displays, in 
> newsprint etc. for well over a year now.
> 
> If you will, the 'common man's' idea of what a
> proper Euro glyph is, is 
> fast becoming influenced by what he sees on a daily
> basis, not by the 
> origin of the glyph or by the logo (which is prescribed
> only for its 
> appearance on the currency itself).
> 
> Given the name, I'm sure even the 'non-European'
> font designers that Werner 
> likes to blame aren't suggesting that the logo
> for the 'e'uro is based on a 
> 'c'. However, when you try to put the thing together
> with the serifs used 
> in many of the common type faces, the result can
> indeed look a bit like a 
> 'c'. This seems particularly true for monospaced
> fonts.
> 
> A./
> 
> 





Re: Euro

2000-07-28 Thread 11digitboy

Maybe this is how to settle the Euro glyph thing:
Instead of talking about it, write the Euro glyph
on a piece of paper with a pen, scan it in, and post
it. If we STILL can't agree on what it looks like,
very politely request that one of the people in CHARGE
of this Euro thing please write the Euro glyph with
a pen, and then scan in what she wrote and post it.
Historically, aren't typeface glyphs based on handwritten
glyphs? Why should some typeface designer decide
what the glyph should look like?!

--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Roozbeh Pournader <[EMAIL PROTECTED]>
wrote:
> 
> 
> On Wed, 26 Jul 2000, Werner LEMBERG wrote:
> 
> > [...] the origin of the Euro glyph is a Greek
> small epsilon.
> 
> Any reference for this? I once heard that this
> is a curved E.
> It was Peter Flynn in TUGboat I think...
> 
> 
> 





RE: Digits (Was: What a difference a glyph makes...)

2000-07-27 Thread 11digitboy

I started this thing about DIGIT SEVEN WITH STROKE
to poke fun at the number of times glyphs appear
to be duplicated with slight variations. But now
it appears to have taken a turn to something else:

1) Sometimes short-ranging figures and long-ranging
figures (I can never remember which is which; are
short-ranging the "capital" digits?) are mixed in
a document. This is a formatting issue, right?

2) Does Unicode have code points for bold, italic, etc.
text? Sometimes that is important to the meaning
of a text.


--
Robert Lozyniak
Accusplit pedometer IS A FILTHY PIECE OF GARBAGE
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 "Figge, Donald" <[EMAIL PROTECTED]>
wrote:
> . . . and still another digit one, non-tabular,
> for fine typography. And, of
> course, there are the old-style digits.
> 
> Don
> //
> 
> -Original Message-
> From: Valeriy E. Ushakov [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, July 26, 2000 3:19 PM
> To: Unicode List
> Subject: Digits (Was: What a difference a glyph
> makes...)
> 
> 
> On Wed, Jul 26, 2000 at 12:02:15 -0800, [EMAIL PROTECTED]
> wrote:
> 
> > This reminds me of "Are DIGIT SEVEN and DIGIT
> SEVEN
> > WITH STROKE distinct characters?" Yeah, our decimal
> > number system has at least thirteen digits:
> 
> > DIGIT ONE
> 
> Add another ONE here: digit one with bottom stroke:
> 
>/|
>_|_
> 
> This bottom stroke in ONE was mandatory, just like
> slashed zero, for
> submitting punching jobs (you know, in those batch
> days when punched
> cards were still in active use and you had an option
> to submit a
> handwritten text of your program to be punched
> for you).
> 
> 
> SY, Uwe
> -- 
> [EMAIL PROTECTED] | 
>  Zu Grunde kommen
> http://www.ptc.spbu.ru/~uwe/| 
>  Ist zu Grunde gehen
> 





RE: What a difference a glyph makes...

2000-07-26 Thread 11digitboy

This reminds me of "Are DIGIT SEVEN and DIGIT SEVEN
WITH STROKE distinct characters?" Yeah, our decimal
number system has at least thirteen digits:
DIGIT ZERO
DIGIT ZERO WITH STROKE
DIGIT ONE
DIGIT TWO
DIGIT THREE
CLOSED DIGIT FOUR
OPEN DIGIT FOUR
DIGIT FIVE
DIGIT SIX
DIGIT SEVEN
DIGIT SEVEN WITH STROKE
DIGIT EIGHT
DIGIT NINE

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 "Alistair Vining" <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> >
> > Notice to British and Irish Unicoders:
> >
> > U+00A3 (POUND SIGN) is a cursive "L" with *one*
> bar
> > through it (cmp. http://charts.unicode.org/Web/U0080.html).
> > U+20A4 (LIRA SIGN) is a cursive "L" with *two*
> bars
> > through it (cmp. http://charts.unicode.org/Web/U20A0.html).
> >
> > Please, watch out carefully your next tax form,
> and remember
> > who posted this.
> 
> I assume you're joking here (the internet irony
> firewall is still up).  An L
> with two bars is an acceptable glyph for UK pounds
> as well.  They're both
> the same (libra) sign.  Or are you saying that
> an L with one bar would be
> (completely) unacceptable for (Italian) lire?
> 
> Have people started writing the Euro with only
> one bar yet?  The issue is,
> after all, rapidly disappearing for the Irish and
> Italians.
> 
> Al.
> 
> 
> 





Making Unicode characters

2000-07-25 Thread 11digitboy

How do I make U+5973, for instance? I want to make
it so I can see it on the screen. I want to do that
without cheating by e.g. using Paint.
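
If a programming language counts as not cheating, here is a rough sketch in Python that turns the hex code point into the character, its UTF-8 bytes, and an HTML numeric character reference. The helper name `from_code_point` is hypothetical, and whether the glyph actually shows up still depends on having a font that covers it.

```python
def from_code_point(hex_digits):
    """Turn a hex code point string such as '5973' into the character itself."""
    return chr(int(hex_digits, 16))

ch = from_code_point("5973")
print(ch)                          # the character, if the terminal font has it
print(ch.encode("utf-8").hex())    # its UTF-8 bytes in hex
print(f"&#x{ord(ch):X};")          # numeric character reference for HTML
```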

Magda, if you're in programming, make it (the keyboard
utility) yourself. Only you're not, right?

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Re: Links on Unicode site

2000-07-25 Thread 11digitboy

What do you think of the "Any Damn Browser" method
of site design? (i.e. "This page best viewed with
any damn browser")

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Mark Davis <[EMAIL PROTECTED]> wrote:
> I hope people have had a chance to try out the
> new Unicode site. I
> wanted to get some feedback from people on the
> links that should be
> most easily available: those on the home page,
> and those in the
> drop-down list of quicklinks in the top right corner.
> (Currently the
> drop-down list is a slightly culled list of the
> ones on the home page,
> plus all the technical reports. It is not accessible
> from Netscape
> Navigator.)
> 
> The question is: are there other links that you
> feel should be
> accessible from the home page (and/or drop-down)?
> 
> 





127 strokes beyond the radical?!

2000-07-20 Thread 11digitboy

On page 876, the character U+6B8B is listed as being
127 strokes beyond the radical. I'd say it's more
like 6 strokes beyond the radical. I do not suppose
that characters of 128+ strokes are indeed possible,
due to the fact that the paper would get quite soggy
from the repeated strokes.

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Ethiopic "digits"

2000-07-20 Thread 11digitboy

Look at page 92 in the book. Then look at this:
http://www.cyberethiopia.com/ethiopic/counter.htm
Especially the part about no zero.

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Depends on the language

2000-07-20 Thread 11digitboy

In English, it's ['junIkowd]. Think "unicycle" or
"unilateral" or "universal". And the "code" part
is the root word "code".
As for "unique", well, why doesn't "one" rhyme with
"stone", "bone", and "alone"?
I wonder what our fine lady friend (she knows exactly
who she is) has to say about this?

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







U+3358

2000-07-19 Thread 11digitboy

This says reiten, not reiji. Why?! Shouldn't it say
REIJI??!!! Or am I going to look like a total fool
when I find out that it SHOULD say REITEN?
If the thing said REIJI, it and its friends could
be used to shorten encoding of times in text.
Why are there no reifun, ippun, nifun, ... gojuukyuufun
codepoints?

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







Re: Subset of Unicode to represent Japanese Kanji?

2000-07-12 Thread 11digitboy

Correct me if I'm wrong, but I think you NEED kana
for Japanese. How can you even write "desu" ("is")
without it??

Am I right in supposing that Japanese people hate
that their kana take up 3 bytes per character, while
the Roman letters I am using now take up only 1 byte
apiece? If I were Japanese, I'd say this sucks.
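
The 3-bytes-versus-1 figure is a property of UTF-8 specifically. A small sketch, assuming Python (`bytes_per_char` is a hypothetical helper), comparing "desu" in Roman letters and in hiragana under two encodings:

```python
def bytes_per_char(text, encoding):
    """Return the encoded size in bytes of each character of text."""
    return [len(ch.encode(encoding)) for ch in text]

romaji = "desu"
kana = "\u3067\u3059"  # "desu" in hiragana

for encoding in ("utf-8", "utf-16-le"):
    print(encoding, bytes_per_char(romaji, encoding), bytes_per_char(kana, encoding))
```

In UTF-16 both scripts cost two bytes per character, so the complaint applies to UTF-8 rather than to Unicode as a whole.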

To give an example of a katakana word that is an
international word: "nidoran".

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Otto Stolz <[EMAIL PROTECTED]> wrote:
> Am 2000-07-11 um 7:02 h hat Michael Martin geschrieben:
> > English, Dutch, French, German, Italian, Japanese,
> Portuguese, and Spanish.
> > It is my understanding that all of these languages
> except Japanese can be
> > supported with the basic Latin and Latin Supplement
> subset of Unicode
> (U+0000 ... U+00FF [...]).
> 
> Latin-1 was invented to support those languages,
> but falls short of doing
> so, adequately. You will need additional characters
> from the following
> ranges:
> - Latin Extended A (e. g. U+0152 and U+0153 for
> French, U+0133 for Dutch,
>   perhaps U+017F for German (if you want to cover
> Fraktur fonts, that is))
> - general punctuation (e. g. U+201E and U+201A
> for German; U+2019 and
>   probably U+2010 through U+2015 for all of those
> Languages)
> - Currency Symbols (at least U+20AC, perhaps also
> U+20A3, U+20A4,
>   U+20A7; note also U+00A3, U+00A5 in the Latin-1,
> and U+0192 in the
>   Latin Extended-B regions, respectively)
> - Depending on the application envisaged, you may
> also wish to include
>   characters from the following areas:
>   - Number Forms (U+2150 through U+218F), particularly
> fractions
>   - Arrows (U+2190 through 21FF); Box Drawing,
> Block Elements, and
> Geometric Shapes (U+2500 through U+25FF)
>   - Mathematical Operators and Miscellaneous technical
> (U+2200 through
> U+23FF); Miscellaneous Symbols and Dingbats
> (U+2600 through U+27BF)
> - Depending on the technology used, you may have
> to include characters
>   from the following ranges:
>   - Superscripts and Subscripts (U+2070 through
> U+209F)
>   - Presentation Forms (e.g. ligatures U+FB00 through
> U+FB06)
>   - The Replacement Character U+FFFD
> to name just a few :-)
> 
> Good starting points for your consideration could
> be
> - the EES, cf. ,
> - Microsoft's WGL 4 character set, cf.
>   .
> 
> > The Japanese I must support is the Kanji form.
> [...] I cannot support
> > Unicode in its entirety due to memory constraints.
> 
> If I am not mistaken, Kanji is ideographic characters,
> which would take
> the lion's share of memory to implement. Probably,
> you have to support
> kana (hiragana or katakana).
> 
> I do not know Japanese, so others may jump in.
> 
> Best wishes,
>Otto Stolz
> 





Re: Names of planes, and request for sneak preview

2000-07-11 Thread 11digitboy

Okay, 0x10FFDE different characters. But what of
planes 15 and 16?

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Asmus Freytag <[EMAIL PROTECTED]> wrote:
> At 12:18 PM 7/11/00 -0800, [EMAIL PROTECTED]
> wrote:
> >What about F? I was told that there are 0x10
> >possible characters?
> >Oh, by the way, if 12 is a dozen and 144 is a
> gross,
> >what are 16 and 256?
> 
> 
> There are 0x10 - 34 possible characters!
> 
> All code values ending in 0xFFFE and 0xFFFF do
> *not* refer to characters. 
> They are not just temporarily unassigned, but permanently
> reserved as 
> non-characters.
> 
> A./
> 
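The arithmetic in this exchange can be checked with a short sketch. It assumes 17 planes (0 through 16) of 0x10000 code points each and counts only the noncharacters described in the message above, that is, values whose low 16 bits are FFFE or FFFF. The function name is hypothetical.

```python
PLANES = 17                      # planes 0 through 16
CODE_POINTS = PLANES * 0x10000   # 0x110000 code points in total


def is_plane_end_noncharacter(cp):
    """True if the low 16 bits of cp are FFFE or FFFF."""
    return (cp & 0xFFFE) == 0xFFFE


noncharacters = sum(
    1 for cp in range(CODE_POINTS) if is_plane_end_noncharacter(cp)
)

print(noncharacters)                     # 34, two per plane
print(hex(CODE_POINTS - noncharacters))  # 0x10ffde
```

The result matches the figure of 0x10FFDE given in the reply at the top of this message.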





Re: Names of planes, and request for sneak preview

2000-07-11 Thread 11digitboy

What about F? I was told that there are 0x10
possible characters?
Oh, by the way, if 12 is a dozen and 144 is a gross,
what are 16 and 256?

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Kenneth Whistler <[EMAIL PROTECTED]> wrote:
> Mark responded reconditely:
> 
> > 
> > I ALY FND ANMs HRD2 DL WTH. WD PFR NML WDS.
> > 
> > Michael Everson wrote:
> > 
> > > Ar 07:53 -0800 2000-07-11, scríobh John H.
> Jenkins:
> > >
> > > >At the same time, it would be nice to have
> a Unicodally correct way
> > > >of referring to planes 1 and 2, since there
> is an important boundary
> > > >between them.
> > >
> > > Just use the acronyms BMP, SMP, and SIP.
> > >
> 
> From the practice that is developing in the relevant
> committees,
> and the discussion on this list, it would appear
> that the
> practical consensus seems to be heading towards:
> 
> 0000..FFFF   "The BMP"
> 10000..1FFFF "Plane 1"
> 20000..2FFFF "Plane 2"
> E0000..EFFFF "Plane 14"
> 
> Those are in fact the terms that most people are
> using. It is quite
> unlikely that "SMP" and "SIP" and "SPP" are going
> to catch on
> very widely, given the difficulty of keeping them
> straight, or
> separate from other TLA's and FLA's like SMTP,
> TCPIP, etc. (SPP
> also means Southwest Power Pool, Science and Policy
> Programs,
> School of Public Policy, Society for Philosophy
> and Psychology,
> Student Protector Plan, Sandy's Pattern Pantry,
> Self-Publishing Partners,
> and the Santiago Park Plaza...) "Plane 14" is actually
> a *much* better
> term -- if you do an Internet search on that, all
> the pertinent
> Unicode-related stuff actually pops right up to
> the top of the search.
> 
> And despite Mark's disclaimer about the validity
> of any boundaries
> past the FFFF / 10000 boundary, the Plane boundaries
> do have some
> importance. They are likely to figure prominently
> in trie structures
> for accessing properties of characters past FFFF;
> the planes themselves
> have some uniformity of their properties, since
> different things
> are being isolated to Plane 2 or Plane 14 as opposed
> to Plane 1.
> 
> Also, in favor of John Cowan's terminology, one
> might also note
> that all of the "Astral Planes" are self-naming
> by their initial
> hex digit.
> 
> --Ken
> 
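The plane naming discussed above follows directly from the code point value: the plane number is whatever remains after dropping the low 16 bits. A minimal sketch, with hypothetical helper names and only the four names from the quoted message in the lookup table:

```python
# Names taken from the quoted message; other planes fall back to "Plane N".
PLANE_NAMES = {0: "The BMP", 1: "Plane 1", 2: "Plane 2", 14: "Plane 14"}


def plane_of(cp):
    """Return the plane number (0..16) of a code point."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp:#x}")
    return cp >> 16


def plane_name(cp):
    """Return the informal plane name used in the discussion."""
    plane = plane_of(cp)
    return PLANE_NAMES.get(plane, f"Plane {plane}")


print(plane_name(0x0041))        # The BMP
print(plane_name(0x10000))       # Plane 1
print(plane_name(0xE0001))       # Plane 14
print(f"{plane_of(0xE0001):X}")  # E, the leading hex digit of E0001
```

The last line shows the "self-naming" remark: for planes 1 through 15, the plane number in hex is the first digit of a five-digit code point.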





Re: Han character names?

2000-07-11 Thread 11digitboy

Yes, let us call it the MDIP. What would that be
in French?

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Kenneth Whistler <[EMAIL PROTECTED]> wrote:
> John Cowan asked:
> 
> > "John H. Jenkins" wrote:
> > 
> > > Maybe they'll start on the Shuowen next and
> then move
> > > back to pre-Zhou stuff.  :-)
> > 
> > If they did, would the SIP overflow?
> 
> Quite possibly, depending on what one does in terms
> of unifications.
> But that is what Plane 3 is for. MDIP ("More damn
> ideographs plane") ??
> 
> --Ken
> 
> 





Re: Holy Moly! You can have THOSE letters in host names?!

2000-07-10 Thread 11digitboy

So do we want to be converting Han numerals to European
ones? Or do we want to SHOW  as , but have it access the same place as
?
(Is it called ? What DO you call
it?)

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Kenneth Whistler <[EMAIL PROTECTED]> wrote:
> Robert,
> 
> You still can't. The point of this discussion is
> to figure
> out how to make it possible without breaking things.
> 
> --Ken
> 
> > 
> > NOBODY TOLD ME YOU COULD HAVE THOSE LETTERS IN
> A
> > HOST NAME!!! So you could have  MI> > LETTER TI>.com for a lady
> named
> > Michiko, or something?
> > 
> > --
> > Robert Lozyniak
> 





Han character names?

2000-07-10 Thread 11digitboy

Do the Han characters have names, such as  or  or ?
So would my tattoo be ?

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax







What is this "case folding"?

2000-07-10 Thread 11digitboy

If it is what I think it is, I don't want it in English.
How could it tell "aids" from "AIDS", for instance?
Or "joy" from "Joy"(name)?
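For what it is worth, case folding is normally applied only when comparing strings, not when storing or displaying them. A small sketch using Python's built-in str.casefold (the helper name same_ignoring_case is hypothetical) shows that the folded forms do collide, exactly as feared, while the original spellings stay distinct:

```python
def same_ignoring_case(a, b):
    """Compare two strings after case folding; the inputs are not modified."""
    return a.casefold() == b.casefold()


for left, right in (("aids", "AIDS"), ("joy", "Joy")):
    print(left, right, left == right, same_ignoring_case(left, right))
```

Whether a search or a host name lookup should use the folded comparison or the exact one is a policy choice; the folding itself does not alter the text.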

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] wrote:
> > - Can these mutations only occur after a determinative,
> or can they also
> be
> > at the beginning of a sentence?
> I don't believe they can occur at the beginning
> of a sentence. The most 
> common construct occurs after "na" (meaning "of");
> "Ambasáid na hÉireann" 
> (Embassy of Ireland) is an example commonly encountered
> outside Ireland. 
> However, they can occur after other words.
> 
> > - Is this automatically implemented in the case
> folding function of
> > localized word processors?
> No, not unless some new word processor has been
> launched in the past year 
> or so.
> 
> B=





Holy Moly! You can have THOSE letters in host names?!

2000-07-08 Thread 11digitboy

NOBODY TOLD ME YOU COULD HAVE THOSE LETTERS IN A
HOST NAME!!! So you could have .com for a lady named
Michiko, or something?

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Jonathan Rosenne <[EMAIL PROTECTED]> wrote:
> I did not mention Arabic vowels and Shadda because
> I don't feel qualified
> to.
> 
> Jony
> 
> > -Original Message-
> > From: Paul Hoffman / IMC [mailto:[EMAIL PROTECTED]]
> > Sent: Saturday, July 08, 2000 8:06 PM
> > To: Jonathan Rosenne; [EMAIL PROTECTED]
> > Subject: RE: [idn] Preparation of Internationalized
> Host Names - Hebrew
> >
> >
> > At 12:43 PM +0300 7/8/00, Jonathan Rosenne wrote:
> > >  > Please note that not all punctuation is
> prohibited. The rules for the
> > >>  specific kinds of punctuation that are prohibited
> are in the document.
> > >  > U+05C0, which looks just like the ASCII
> "vertical bar", is probably
> > >>  acceptable (since vertical bar is acceptable).
> U+05C3 looks just like
> > >>  a colon and is therefore not acceptable;
> thanks for pointing this
> > >>  out. (And I have noted it to the Unicode
> folks for when they update
> > >>  the standard).
> > >
> > >Its meaning is punctuation, like comma or full
> stop, never mind
> > its shape.
> >
> > Exactly my point. At present, we do *not* prohibit
> all punctuation.
> > The only prohibited punctuation characters
> are those that are reserved
> > or delimiters in URLs [RFC2396] and [RFC2732].
> If this group decides
> > to prohibit all punctuation, certainly we would
> then prohibit U+05C0.
> > Or, we might prohibit all punctuation other than
> a certain small
> > group of characters (which would be pretty difficult
> to choose
> > correctly...). But, for now, we only prohibit
> a small set.
> >
> > >  > >2. Cantillation Marks
> > >  > >0591 to 05af
> > >  > >
> > >  > >These should be either prohibited or ignored
> since they do
> > not affect
> > >>  >pronunciation, similar to ignoring case
> differences.
> > >>  >
> > >>  >Personally, I would rather prohibit them
> since their presence is
> > >>  most likely
> > >>  >to be an error.
> > >>
> > >>  If they never appear in personal names, company
> names, or spoken
> > >>  phrases, then they can safely be prohibited.
> Is that true for all of
> > >>  them?
> > >
> > >They never appear in common use, they are only
> used in biblical texts.
> >
> > Thanks, that's what I wanted to hear. I'll prohibit
> them in the
> > next draft.
> >
> > >  > >2. Points
> > >>  >05b0 to 05c4
> > >>  >
> > >>  >These should be either prohibited or ignored
> since they are
> > optional. In
> > >>  >modern Hebrew they are seldom used, not
> all systems support
> > >>  them, and it is
> > >>  >valid to omit them.
> > >>  >
> > >>  >Personally, I would rather ignore them because
> a user may enter
> > >>  them and why
> > >>  >not let him.
> > >>
> > >>  This is much more problematic. We do not
> currently have any "ignored"
> > >>  characters. If I understand this correctly,
> the host name  > >>  LETTER HE>.com looks
> and sounds different than
> > >>  .com,
> but could be considered
> > >>  the same for a host name. If so, I think
> we would have to prohibit
> > >>  them, not ignore them. Does that sound correct?
> > >
> > >They do sound different, but do not necessarily
> look different
> > because it is
> > >not mandatory to display points.
> > >
> > >Just like you ignore case in English, in Hebrew
> you should ignore points.
> >
> >  From my (very limited) understanding of Hebrew,
> this makes sense.
> > However, it means that we will have to make such
> other "ignoring"
> > rules for a variety of scripts. I'm happy to
> do that if the group
> > wants, but it certainly makes the name preparation
> harder. (Just to
> > be clear: my personal preference would have been
> not to ignore case,
> > but that decision was made *long* ago and cannot
> be reversed.) Doing
> > so would require an extra step, probably between
> checking for
> > prohibited characters and folding case, that
> says "look for any
> > characters on this list and throw them away".
> >
> > How does the group feel about this? What other
> characters in scripts
> > other than Hebrew would go here?
> >
> > --Paul Hoffman, Director
> > --Internet Mail Consortium
> 
> 
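The extra step proposed in the quoted message (check for prohibited characters, then throw away ignorable ones, then fold case) can be sketched as follows. This is only an illustration, not the draft's actual algorithm: the function name prepare_label and the set names are hypothetical, the ranges are the ones given in the thread, and excluding the punctuation marks U+05BE, U+05C0 and U+05C3 from the "points" range is an assumption.

```python
# Ranges as given in the thread; all names here are hypothetical.
CANTILLATION = range(0x0591, 0x05B0)         # U+0591..U+05AF, to be prohibited
POINTS = range(0x05B0, 0x05C5)               # U+05B0..U+05C4, to be ignored
PUNCTUATION_IN_POINTS = {0x05BE, 0x05C0, 0x05C3}  # assumed not to be points
PROHIBITED_PUNCTUATION = {0x05C3}            # looks like a colon


def prepare_label(label):
    """Prohibit, then drop ignorable points, then fold case."""
    # Step 1: reject prohibited characters outright.
    for ch in label:
        cp = ord(ch)
        if cp in CANTILLATION or cp in PROHIBITED_PUNCTUATION:
            raise ValueError(f"prohibited character U+{cp:04X}")
    # Step 2: throw away the optional points.
    kept = [
        ch for ch in label
        if not (ord(ch) in POINTS and ord(ch) not in PUNCTUATION_IN_POINTS)
    ]
    # Step 3: fold case (a no-op for Hebrew, needed for Latin labels).
    return "".join(kept).casefold()


pointed = "\u05e9\u05b8\u05dc\u05d5\u05b9\u05dd"   # shalom with points
unpointed = "\u05e9\u05dc\u05d5\u05dd"             # shalom without points
print(prepare_label(pointed) == prepare_label(unpointed))
```

With this ordering, the pointed and unpointed spellings of a name map to the same host label, which is the behaviour argued for in the thread.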





Re: How-To handle i18n when you don't know charset?

2000-07-07 Thread 11digitboy



--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 "Michael \(michka\) Kaplan" <[EMAIL PROTECTED]>
wrote:
> I would not say that override should be impossible.
> I was merely saying that
> if the given charset is specified and is correct,
> and you change it to
> something invalid then it is their fault if
> the results are bad.
> 
> michka
> 
> 
> - Original Message -
> From: "Jonathan Rosenne" <[EMAIL PROTECTED]>
> To: "Unicode List" <[EMAIL PROTECTED]>
> Sent: Friday, July 07, 2000 9:37 AM
> Subject: RE: How-To handle i18n when you don't
> know charset?
> 
> 
> > Unfortunately, there are many Hebrew pages wrongly
> marked as 8859-1, and
> many more unmarked. So letting the user override
> the charset specification
> is necessary. I was told similar situations are
> known in Russia and Greece.
> >
> > Jony
> >
> > > -Original Message-
> > > From: Antoine Leca [mailto:[EMAIL PROTECTED]]
> > > Sent: Friday, July 07, 2000 2:06 PM
> > > To: Unicode List
> > > Subject: Re: How-To handle i18n when you don't
> know charset?
> > >
> > >
> > > Michael Kaplan wrote:
> > > >
> > > > > My experimentation indicated that if the
> user did not have
> > > their browser
> > > > > set to auto-select encoding, or if they
> manually overrode the
> encoding
> > > > > selection, the form data would be sent
> in whatever they had chosen,
> > > > > regardless of what charset may be in the
>  > > http-equiv="Content-Type"
> > > > > ...> in the HTML document head.
> > > >
> > > > My general feeling of people who specifically
> change settings
> > > so that the
> > > > text was rendered properly and then they
> specifically changed it is
> as
> > > > follows:
> > > >
> > > > THEY DESERVE WHATEVER THEY GET.
> > >
> > > My own experimentation (and extensive practice,
> *UNFORTUNATELY*) is that
> > > having to manually specify the encoding
> is a hack, there to
> > > make up for the initial oversight of authoring software
> that does not enforce
> > > a uniform *and* practical encoding scheme
> (either "all should be
> > > Unicode",
> > > or "the day you use something outside ASCII,
> it should be tagged").
> > >
> > > The problem is worse in some cases (mainly Cyrillic),
> because a number of
> > > charsets are equally in common use, mainly for
> historical reasons.
> > > And the behaviour of Microsoft in this area
> is not necessarily of help...
> > >
> > >
> > > Now, most of the time I run with "default"
> on. Sometimes, I need
> > > to change.
> > > And when I change, I let it in the changed
> position (Yes, I'm quite
> lazy),
> > > unless there is a nuisance. So quite a time,
> I am running in "changed"
> > > position...
> > >
> > > > The GIGO (garbage in, garbage out) philosophy
> is the best way
> > > to go here,
> > > > IMHO. How much more can you do other than
> provide a Java applet
> > > that will
> > > > have a big hand come out of the screen and
> slap them silly?
> > >
> > > And *I* would be quite upset if, when I answer
> in French (using French
> > > accents) in an application that only offers
> English as UI and asks for
> > > e.g. my profession, the application:
> > > - either refuses to handle my accented profession
> > > - or, perhaps worse, misinterprets it because
> the server side
> > > insists on using
> > > its charset instead of whatever characters I
> really need.
> > >
> > > But this is what happens every day, because
> the (U.S. based) programmer
> is
> > > expecting everyone to use ASCII, of course.
> Here we cannot
> > > distinguish GIGO
> > > for laziness or plain ignorance.
> > >
> > > Now take the case of my friend M. Lebœuf,
> whose name includes a
> > > character not easily available in common charsets,
> trying to answer such
> > > a form included in an iso-8859-1 HTML page...
> I am not sure he will
> > > appreciate seeing his name treated as garbage...
> > >
---
Over here, his name looks like garbage.
What is that? Ell ee bee something something you
eff.
---
> > >
> > > Antoine
> > >
> >
> >
> 
> 
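Both problems raised in this thread can be reproduced in a few lines. The sketch below uses Python's standard codecs; the helper name can_encode and the sample Hebrew word are illustrative choices, not anything from the original messages.

```python
def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The "oe" ligature in this name has no slot in ISO-8859-1.
name = "Leb\u0153uf"
for charset in ("iso-8859-1", "utf-8"):
    print(charset, can_encode(name, charset))

# A Hebrew page whose bytes are ISO-8859-8 but which is labelled 8859-1.
hebrew = "\u05e9\u05dc\u05d5\u05dd"
raw = hebrew.encode("iso-8859-8")
print(raw.decode("iso-8859-1"))   # what a reader sees with the wrong label
print(raw.decode("iso-8859-8"))   # what a manual override recovers
```

The second half is the case for letting users override a declared charset: the bytes are fine, only the label is wrong.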





Re: Japanese pronunciation of hex digits?

2000-07-06 Thread 11digitboy

If he was to use those IPA characters, how would
he type them?
Also, how would Tanaka-san write his full name in
Japanese? He has an English first name and Japanese
middle and last names. I hear the Japanese have no
middle names, so my guess is he is an immigrant and
adopted an English first name to fit in, his Japanese
given name moving to the middle position as a consequence.

--
Robert Lozyniak
Accusplit pedometer, purchased about 2000a07l01d19h45mZ,
has NOT FLIPPED
My page: http://walk.to/11
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 Linus Toshihiro Tanaka <[EMAIL PROTECTED]>
wrote:
> Thank you, Ken!  You are right.  I also should
> have used U+0283 instead
> of "sh".
> 
> Linus
> 
> 
> Kenneth Whistler wrote:
> > 
> > Tanaka-san,
> > 
> > > [e0u] or [e0] or [efu] or [ef]
> > >
> > > The consonant of [e0u] (and [e0]) doesn't exist
> in English (I heard that
> > > a similar sound exists in Greek but not sure).
>  Japanese characters for
> > > [0u] are U+3075, U+30D5 and U+FF8C.
> > 
> > The Japanese consonant sound is represented in
> IPA by U+0278.
> > (a voiceless bilabial fricative)
> > 
> > --Ken
> 
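The code points mentioned in this exchange can be looked up with Python's unicodedata module. The sketch below only prints their names and shows that compatibility normalization (NFKC) maps the halfwidth katakana form to the ordinary one; the loop and variable names are illustrative.

```python
import unicodedata

# Code points cited in the messages above.
for cp in (0x0278, 0x0283, 0x3075, 0x30D5, 0xFF8C):
    ch = chr(cp)
    print(f"U+{cp:04X} {ch} {unicodedata.name(ch)}")

# The halfwidth katakana is a compatibility variant of U+30D5.
halfwidth = "\uff8c"
print(unicodedata.normalize("NFKC", halfwidth) == "\u30d5")
```

As for typing them: in source code or HTML, an escape such as \u0278 or a numeric character reference avoids the need for a keyboard that has the character.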





Planes 1 and 2

2000-07-05 Thread 11digitboy

Where can I find charts for plane 1 and plane 2 of Unicode? Please give
me the URL or else tell me the name of the link(s) to follow. I can only
find Plane 0 on Unicode's site.







The real problem?

2000-07-02 Thread 11digitboy

The REAL problem with this may be that the people discussing this issue
are not native speakers of Japanese. Truth be told, neither am I. All
I know is, furigana are a BIG help when you don't know many kanji.





Should furigana be considered part of "plain text"?

2000-07-01 Thread 11digitboy



-- 
[EMAIL PROTECTED] - email
(917) 421-3909 x1133 - voicemail/fax



 John Hudson <[EMAIL PROTECTED]> wrote:
> At 04:04 AM 7/1/00 -0800, you wrote:
> >Furigana codes would simply mark certain text as furigana, meaning
> to
> >the text-display device, "These characters are not to be displayed
> on
> >the main line of text, but rather above it and in smaller type". There
> >ought to be  and  codes, or the equivalent,
> >in HTML; at least that is my opinion. The tag  would
> >indicate the start of the characters that the furigana is to be placed
> >over. The input kana="" would tell the browser what the kana are.
> >The  tag would indicate the end of the characters to be given
> >furigana.
> 
> I'm presuming, from your description, that Furigana is another term
> for
> Ruby. There is a <ruby> OpenType layout feature, which will be published
> with the next version of the OpenType spec, and this provides font
> support
> for Ruby/Furigana text. I think it would be the responsibility of
> application and markup language developers and standards bodies to
> decide
> how to tag this kind of text, and obviously such tagging could work
> with
> the OT feature in line layout and glyph positioning.
> 
> Note that this is a text tagging issue, not a Unicode issue, unless
> you
> feel that there is some need to indicate Ruby/Furigana in plain text.
> At
> some point, plain text ceases to be plain if you decide that layout
> information needs to be encoded.
> 
> John Hudson
> 
> Tiro Typeworks
> Vancouver, BC
> www.tiro.com
> [EMAIL PROTECTED]


Anybody willing to comment on this??? 





What I meant by furigana codes

2000-07-01 Thread 11digitboy

Furigana codes would simply mark certain text as furigana, meaning to
the text-display device, "These characters are not to be displayed on
the main line of text, but rather above it and in smaller type". There
ought to be  and  codes, or the equivalent,
in HTML; at least that is my opinion. The tag  would
indicate the start of the characters that the furigana is to be placed
over. The input kana="" would tell the browser what the kana are.
The  tag would indicate the end of the characters to be given
furigana.
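HTML ruby markup covers the same idea as the tags proposed above: a ruby element wraps the base characters, an rt element holds the reading, and rp elements supply fallback parentheses for browsers that do not lay out ruby. The sketch below generates such markup; the function name ruby is hypothetical, and it is not the tag syntax proposed in this message.

```python
from html import escape


def ruby(base, kana):
    """Wrap base text and its reading in HTML ruby markup."""
    return (
        f"<ruby>{escape(base)}"
        f"<rp>(</rp><rt>{escape(kana)}</rt><rp>)</rp>"
        f"</ruby>"
    )


# Kanji for "kanji" with its hiragana reading.
print(ruby("\u6f22\u5b57", "\u304b\u3093\u3058"))
```

A browser without ruby support would show the reading in parentheses after the base text instead of above it.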





Furigana codes?

2000-07-01 Thread 11digitboy

Are there furigana codes? If not, there darn well need to be.
Like: BEGIN WHAT THE FURIGANA IS FOR, then START FURIGANA, then END FURIGANA.
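Unicode does define three characters with roughly this shape: U+FFF9 INTERLINEAR ANNOTATION ANCHOR, U+FFFA INTERLINEAR ANNOTATION SEPARATOR and U+FFFB INTERLINEAR ANNOTATION TERMINATOR. The sketch below shows how they would bracket a base text and its furigana. The helper names annotate and strip_annotation are hypothetical, and whether these characters are appropriate in interchanged plain text is a separate question from the mechanics shown here.

```python
import re
import unicodedata

ANCHOR = "\ufff9"      # begins the text that the furigana is for
SEPARATOR = "\ufffa"   # starts the furigana
TERMINATOR = "\ufffb"  # ends the furigana

for ch in (ANCHOR, SEPARATOR, TERMINATOR):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")


def annotate(base, furigana):
    """Bracket base text and its reading with the annotation characters."""
    return f"{ANCHOR}{base}{SEPARATOR}{furigana}{TERMINATOR}"


def strip_annotation(text):
    """Keep only the base text, dropping the annotation and the controls."""
    pattern = f"{ANCHOR}([^{SEPARATOR}]*){SEPARATOR}[^{TERMINATOR}]*{TERMINATOR}"
    return re.sub(pattern, r"\1", text)


marked = annotate("\u6f22\u5b57", "\u304b\u3093\u3058")
print(strip_annotation(marked) == "\u6f22\u5b57")
```

A display that cannot render annotations could apply strip_annotation and still show readable base text.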
