Daniel Heiserer <[EMAIL PROTECTED]> writes:
>Hi,
>my experience in Unicode is very small, so my questions might look
>stupid.....
>
>I would like to use perl for a dictionary. As different languages use
>different letters (even the European ones like Spanish, German, etc.)
>I would like to use these.

Most western European languages are covered by the 8-bit encoding iso8859-1.
(The same one you use for German.)
Other 8-bit encodings in the iso8859-* family cover Greek and the languages
that need Cyrillic.

But if you want to show phonetics then you need a wider repertoire.
(Needing phonetics was the reason I started on the UTF-8 road myself...)
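
To make the difference concrete, here is a minimal sketch, assuming a perl
that has the Encode module (it ships with perl 5.8 and later). It encodes
u-umlaut (U+00FC) both ways, then shows that a phonetic character such as
schwa (U+0259) has no place in iso8859-1. The variable names are only for
illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00FC (u with diaeresis) is in the iso8859-1 repertoire;
# U+0259 (schwa, used in phonetic notation) is not.
my $u_umlaut = "\x{FC}";
my $schwa    = "\x{259}";

my $latin1 = encode("iso-8859-1", $u_umlaut);   # a single byte
my $utf8   = encode("UTF-8",      $u_umlaut);   # two bytes

print "iso8859-1: ", unpack("H*", $latin1), "\n";
print "UTF-8:     ", unpack("H*", $utf8),   "\n";

# With FB_CROAK, encode() dies when the target encoding lacks a character.
# A copy is passed because encode() may modify its argument when a
# CHECK value is given.
my $copy = $schwa;
my $fits = eval { encode("iso-8859-1", $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
print "schwa fits in iso8859-1: ", ($fits ? "yes" : "no"), "\n";
```

The single byte versus two bytes is why an 8-bit encoding is the easy path
as long as its repertoire is enough for your dictionary.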

>Using the terminal in linux/unix for input/output I guess that I would
>need a utf-8 enabled terminal, right?

Don't know for sure - terminals on Linux are new to me.

>
>Assume I use a gui like perl-tk for input and output. How can I ensure
>that utf-8 is supported there?

The short answer is that no _released_ perl/Tk does UTF-8 yet.
I am working on it - but as most of the snags are in the perl part
I have been working on perl5.7.* rather than Tk803.*

But _ALL_ Tk's will support iso8859-1, which has the characters you need
for most (western) European languages, and can display text in other 8-bit
encodings if you tell them to use an appropriately encoded X font.

>Do I need utf-8 fonts for perl-tk?

The Tcl/Tk code that does UTF-8 attempts to display character glyphs
by hunting through the available fonts looking for one that has 
the glyph it needs. This works after a fashion. The snag is that the
process can take a long time (10s of seconds on a 300MHz machine),
and often gets glyphs which don't match the "style" of the others in the
string.

So perl/Tk is likely to modify this to try iso10646-1 fonts before 
(or instead of) doing that.

>Do these exist?

As far as I know there are no fonts encoded in UTF-8. There are
16-bit fonts encoded in iso10646-1 (which has the same codepoints as
Unicode). From a perl/Tk perspective the distinction should not matter
to user code (it is up to the Tk core to convert the UTF-8 encoded stuff
that perl gives it to a 16-bit font index).
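
The conversion itself is simple enough to sketch in perl. This is only an
illustration of the idea, not the Tk core code (which is C); it assumes the
Encode module, and the helper name utf8_to_font_index is made up for the
example. It decodes UTF-8 bytes to codepoints and packs each one as a
16-bit big-endian value, the byte order a 2-byte X font is indexed by.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: turn UTF-8 encoded bytes into a string of
# 16-bit big-endian indices, one per character.
sub utf8_to_font_index {
    my ($bytes) = @_;
    my $chars      = decode("UTF-8", $bytes);
    my @codepoints = map { ord($_) } split(//, $chars);

    # Codepoints above 0xFFFF cannot be expressed as a 16-bit index.
    die "character outside 16-bit range\n"
        if grep { $_ > 0xFFFF } @codepoints;

    return pack("n*", @codepoints);
}

# UTF-8 bytes for u-umlaut (U+00FC) followed by schwa (U+0259).
my $indices = utf8_to_font_index("\xC3\xBC\xC9\x99");
print unpack("H*", $indices), "\n";    # prints 00fc0259
```

User code never sees the 16-bit form; it hands over UTF-8 and the toolkit
does this step when it draws.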

>Where can I get them? 

See Markus's excellent intro:

http://www.cl.cam.ac.uk/~mgk25/unicode.html

>How can I input data then (I only have a keyboard covering the latin
>characters, do I need a special keyboard driver?)?

Linux can use two schemes (compose key and dead keys) to input 
characters in iso8859-1 or the local "locale" character encoding.

>Can I cut and paste input and output?

iso8859-1 is fine. For UTF-8 there is a proposal which I will implement
in the UTF-8 aware perl/Tk - other applications probably will
not support that yet.

>
>thanks daniel
-- 
Nick Ing-Simmons <[EMAIL PROTECTED]>
Via, but not speaking for: Texas Instruments Ltd.
