Re: How to convert non latin characters?

2002-06-06 Thread Joel Rees

This is a long response. I trimmed as much as I thought I could.

Hmm. The short story is that you might want to invest some time going
through the table by hand and editing the place names to a standard form.

Note that I'm not going to suggest a standard form. Check Google. I
suspect you'll want multiple sources.
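
To narrow down the hand editing, a short script can list only the names that contain characters outside plain ASCII. This is a minimal sketch in Python, assuming the names have already been read from the table into a list; the function name `names_needing_review` and the sample rows are made up for illustration.

```python
def names_needing_review(names):
    """Return (name, odd_chars) pairs for names containing non-ASCII characters."""
    flagged = []
    for name in names:
        # Collect each distinct non-ASCII character, in order of first appearance.
        odd = []
        for ch in name:
            if ord(ch) > 127 and ch not in odd:
                odd.append(ch)
        if odd:
            flagged.append((name, "".join(odd)))
    return flagged


if __name__ == "__main__":
    sample = ["Berlin", "Tðkyð", "Reykjavík"]  # hypothetical rows
    for name, odd in names_needing_review(sample):
        print(f"{name}: {odd}")
```

Not every flagged name is wrong (the accent in the Icelandic example is legitimate), so the output is a review list, not a list of errors.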

Explanation mixed in below.

> Hi there,
> 
> I have a table containing cities of the world. Now I am running into
> problems because people are starting to complain that some major cities,
> like Tokyo, are not in the db. After checking my table I discovered that
> lots of cities are written with some non-Latin characters.
> 
> Tokyo is written like this: Tðkyð

Well, it looks like my browser may not preserve those characters, so it
may not be easy for lurkers to see what I'm talking about. (Got to learn
how to use this new browser someday.)

As someone mentioned, those characters were borrowed from an unrelated
character set, probably on the whim of whoever wrote the stuff you got
your lists from. The reason they borrowed unrelated characters may help
you answer your question. I can't talk about other languages, but I can
explain a little about Japanese.

When writing Japanese with Latin characters (i.e., romanizing Japanese),
one custom is to write an overscore (a macron) on vowels that are doubled
or lengthened. (There are other customs, as well.) For some reason, the
JIS character set in common use (JIS X 0208) does not contain vowels with
overscores. So, I would assume that the non-Latin characters in this case
are an attempt to indicate the lengthening of the vowels. (Both "o"s are
lengthened in Tokyo, which means it has four syllables, strictly four
morae, when spoken in "standard" Japanese.)
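
If a list does carry real overscored vowels, they can be reduced to plain Latin letters mechanically. The sketch below is one way to do it with Python's standard `unicodedata` module, not something the original lists are known to use; the function name `strip_diacritics` is made up for illustration.

```python
import unicodedata


def strip_diacritics(text):
    """Drop combining marks (such as the macron) after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# U+014D is "o" with a macron, so this prints the plain spelling.
print(strip_diacritics("T\u014dky\u014d"))
```

This does nothing for a borrowed character such as ð, which is a letter in its own right and has no decomposition, so those names still need a separate substitution.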

Another common custom is to simply repeat the vowel or to transliterate
the kana spelling, which would give you either "Tookyoo" or "Toukyou",
neither of which will be recognizable to anyone except (perhaps) a
non-native resident of Japan, or a Japanese person with considerable
experience in foreign countries. Another approach which I've seen would
result in "Tohkyoh", which may be slightly more recognizable.
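
The spellings named above can be generated from the overscored form, which is handy if you want a search for any of them to find the same row. This is a rough sketch, assuming the name is available with precomposed overscored vowels; it covers only the long "o" discussed here, and the names `LONG_O_SPELLINGS` and `spelling_variants` are hypothetical.

```python
# Spellings of a long "o" named in the discussion above. Other long vowels
# (and the capital letter) would need their own entries.
LONG_O = "\u014d"  # "o" with a macron
LONG_O_SPELLINGS = {"plain": "o", "doubled": "oo", "kana": "ou", "h": "oh"}


def spelling_variants(name_with_macrons):
    """Map each scheme name to the name respelled consistently in that scheme."""
    return {
        scheme: name_with_macrons.replace(LONG_O, spelling)
        for scheme, spelling in LONG_O_SPELLINGS.items()
    }


for scheme, spelling in spelling_variants("T\u014dky\u014d").items():
    print(scheme, spelling)
```

Each variant uses one scheme throughout; mixed spellings such as "Tookyou" are not produced.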

> 
> Should I convert these characters into Latin ones, and if so, how? And what
> happens if somebody from Japan with a different keyboard tries to find
> Tokyo?

As I said above, you will probably want to edit the names by hand to a
standard form. But, the place where you got the lists may have a tool
for filtering the names to a standard form. You might want to ask.

I was going to say I wouldn't want to write such a tool myself, but it
might not be so hard if you go country-by-country. Maybe. Depends on how
consistent the authors were.
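
A country-by-country filter could be as small as one substitution table per country. The sketch below assumes each row carries a country code and that the Japanese names use "ð" wherever a long "o" was meant, as in "Tðkyð"; both are guesses about the data, and `COUNTRY_FIXES` and `standardize` are made-up names.

```python
# Hypothetical per-country substitution tables. The "JP" entry assumes the
# source list used "ð" wherever a long "o" was meant, as in "Tðkyð".
COUNTRY_FIXES = {
    "JP": {"ð": "o"},
}


def standardize(name, country_code):
    """Apply the substitutions recorded for one country; leave others alone."""
    fixes = COUNTRY_FIXES.get(country_code, {})
    return name.translate(str.maketrans(fixes))


print(standardize("Tðkyð", "JP"))        # substitution applied
print(standardize("Ísafjörður", "IS"))   # no table for "IS", name unchanged
```

Keeping the tables separate per country matters because the same character can be a stand-in in one country's names and a genuine letter in another's.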

Japanese has five vowels, a, i, u, e, and o. People don't usually do
anything strange with the consonants, at least not in the last hundred
years or so. 

But that doesn't help for Tibet or Myanmar, of course.

Unicode is helping bring these sorts of tools a lot closer to reality.
You can help, too, if you want.

-- 
Joel Rees <[EMAIL PROTECTED]>

sql, query << filter fodder forgotten first time again


-
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/   (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php




Re: How to convert non latin characters?

2002-06-05 Thread jon.ingason


No, ð is not a Japanese letter. It is the Icelandic letter eth, which has
the code 0xF0 (240 decimal) in ISO 8859-1.
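
A quick way to check this for any stray character is to ask Python's standard library, shown here as a small sketch:

```python
import unicodedata

ch = "ð"
print(unicodedata.name(ch))      # official Unicode character name
print(ord(ch), hex(ord(ch)))     # code point in decimal and hexadecimal
print(ch.encode("iso-8859-1"))   # the single byte used in ISO 8859-1
```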

Jon Ingason


From: "andy" (@gmx.de)
Date: 2002-06-05 10:44
Subject: How to convert non latin characters?
Please respond to: "andy"

> What type of character is this 'ð' anyway? It does not look
> Japanese to me :-)
> 
> Thank you for any help on that,
> 
> Andy
> query
> 
> http://www.globosapiens.net
> Global Travellers Network!






How to convert non latin characters?

2002-06-05 Thread andy

Hi there,

I have a table containing cities of the world. Now I am running into
problems because people are starting to complain that some major cities,
like Tokyo, are not in the db. After checking my table I discovered that
lots of cities are written with some non-Latin characters.

Tokyo is written like this: Tðkyð

Should I convert these characters into Latin ones, and if so, how? And what
happens if somebody from Japan with a different keyboard tries to find
Tokyo? What type of character is this 'ð' anyway? It does not look
Japanese to me :-)

I also heard that it is possible to compile MySQL with only a Latin
character set. Would this be helpful for me?


Thank you for any help on that,

Andy
query


http://www.globosapiens.net
Global Travellers Network!

