Boriss Mejias wrote :
Hi all,
Characters from other Western languages are not supported either. I
hadn't noticed it until now even though I'm a native Spanish speaker, but
I'm so used to the lack of support that I never even tried until now.
Anyway... Here are the tests I just made:
{Browse "á"} -> [195 161] two codes
{Browse "ñ"} -> [195 177] two codes too, which is not the same as
{Browse "n~"} -> [110 126]
I assume 195 is a composition code
Then, the last two tests:
{Browse &a} -> 97
{Browse &á} -> error
%************************** lexical error ***********************
%**
%** illegal character
%**
%** in file "Oz", line 7, column 10
%** ------------------ rejected (1 error)
cheers
Boriss
Here is my understanding of it.
Basically, Oz doesn't really use iso8859-1 or any other encoding. From
the emulator point of view there are only bytes. For these bytes to be
interpreted as text a charset/encoding is needed. The following
charsets/encodings are of interest:
1) The one of the console (for console oriented Oz programs)
2) The one that Tk uses (for Tk and QTk based Oz GUIs)
3) The one your source Oz file is written in
4) The one the C library will use (this one might depend on
compile-time settings of the virtual machine and on settings at
its runtime)
The only real constraint is that 3) must be ASCII-compatible. This means
that (most) byte values between 0 and 127 must have ASCII semantics
(e.g. 65=A). If not, you might have difficulties writing keywords that
the compiler will recognize! It also means that bytes with a value
between 0 and 127 should never be interpreted differently, even if they
are part of a special sequence or preceded by some "shifting" character
etc., since the VM interprets files as simple streams of bytes and will
recognize an Oz keyword even in a supposedly "shifted" state.
In practice the following charsets/encodings should be usable for Oz
source code:
all ISO 8859-x (including ISO 8859-5)
all KOI8 encodings (including KOI8-R and KOI8-U)
Most DOS and Windows codepages (including CP850 and CP1251)
UTF8
EUC-JP
These are definitely unusable:
UTF7 (at least it would be extremely difficult to use)
UTF16 (lots of embedded 0 bytes)
UTF32 (idem)
Most (all?) versions of EBCDIC (not ASCII-compatible: you would need to
write @ for | and even letters would need transpositions!)
ISO2022 (you could have some Japanese text being interpreted as a
closing quote, a keyword, etc.)
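
As a rough way to check the two lists above, here is a small sketch in Python (used instead of Oz only because it makes raw bytes easy to inspect). It tests the property described earlier: ASCII text must keep its ASCII bytes, and non-ASCII text must never produce bytes below 128. The function name `ascii_transparent` and the sample strings are made up for illustration; the codec names are those of Python's standard library. One sample per codec is only a spot check, not a proof.

```python
def ascii_transparent(codec, sample):
    """Return True if `codec` looks safe for Oz source in the sense above."""
    keyword = "functor"
    # ASCII text must be encoded as the very same ASCII bytes.
    if keyword.encode(codec) != keyword.encode("ascii"):
        return False
    # Non-ASCII text must never yield bytes in the ASCII range (0..127).
    return all(byte >= 128 for byte in sample.encode(codec))


# (codec, non-ASCII sample text); the samples are arbitrary illustrations.
CANDIDATES = [
    ("iso8859_5", "АБВГД"),
    ("koi8_r", "АБВГД"),
    ("cp1251", "АБВГД"),
    ("cp850", "áñ"),
    ("utf_8", "АБВГД"),
    ("euc_jp", "日本語"),
    ("utf_7", "á"),
    ("utf_16_le", "á"),
    ("utf_32_le", "á"),
    ("cp500", "á"),          # one EBCDIC variant
    ("iso2022_jp", "日本語"),
]

if __name__ == "__main__":
    for codec, sample in CANDIDATES:
        verdict = "usable" if ascii_transparent(codec, sample) else "unusable"
        print(f"{codec:12} {verdict}")
```

With these samples the first six codecs come out as usable and the last five as unusable, which matches the two lists.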
Of course there are still some restrictions. Since the machine
interprets files as streams of bytes, the semantics of the &x construct
is the value of the byte following the byte with value 38 (& in ASCII). If
that byte is part of a multibyte character, only the first byte will be
taken and the remaining ones are most likely to lead to a syntax error.
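
To make the byte-level reading of &x concrete, here is a minimal Python model of it. This is not the Mozart lexer, only a sketch of the behaviour as I understand it; the function name `char_literal` is hypothetical and "é" is just an example character.

```python
def char_literal(source_bytes):
    """Model `&x` on raw bytes: return (value, leftover bytes)."""
    if source_bytes[:1] != b"&" or len(source_bytes) < 2:
        raise ValueError("expected '&' followed by at least one byte")
    value = source_bytes[1]      # the single byte right after '&'
    leftover = source_bytes[2:]  # what a byte-oriented lexer still has to parse
    return value, leftover


print(char_literal("&a".encode("utf-8")))    # (97, b'')
print(char_literal("&é".encode("utf-8")))    # (195, b'\xa9')
print(char_literal("&é".encode("latin-1")))  # (233, b'')
```

In the UTF-8 case one byte is left over after the literal, and it is that stray byte that a byte-oriented lexer would then trip over. In a single-byte encoding such as ISO 8859-1 nothing is left over.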
Charset/encoding 4) determines the semantic of operations in the Char
module (I think) and can be otherwise ignored.
If the charsets/encodings 1), 2) and 3) are not all the same, your
application might need to do some explicit conversions.
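
As an illustration of such an explicit conversion, here is a Python sketch that re-encodes bytes from the charset of the source file to the charset of the console. It assumes, purely as an example, a source file saved in CP1251 and a console using CP866; neither has been confirmed in this thread. The helper name `convert` is hypothetical.

```python
def convert(data, source_codec, target_codec):
    """Re-encode raw bytes from one charset to another via Unicode."""
    return data.decode(source_codec).encode(target_codec)


# Bytes as they would be stored in a CP1251 source file (assumption).
source_bytes = "АБВГД".encode("cp1251")
print(list(source_bytes))                # [192, 193, 194, 195, 196]

# A CP866 console shows these same bytes as box-drawing characters.
misread = source_bytes.decode("cp866")
print(misread == "└┴┬├─")                # True

# After conversion the console receives the bytes it expects.
console_bytes = convert(source_bytes, "cp1251", "cp866")
print(list(console_bytes))               # [128, 129, 130, 131, 132]
```

Under these assumptions the unconverted bytes show up as pseudo-graphics, which is the kind of output you get when 1) and 3) differ.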
In case you need to do explicit conversions or to have operations of the
type provided by the Char module but for another charset/encoding than
4), the easiest way will be to create a binding to some Unicode library
such as libICU or to use one of the projects already mentioned.
Emacs should ask for the charset/encoding when saving a file which is
not pure ASCII. On Linux, most recent distributions use UTF8 for the console.
You should be able to determine what all these charsets/encodings are by
experimenting with strings made explicitly of integers (such as [72
105]="Hi" in ASCII) according to the potential charsets/encodings.
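
The same experiment can be mimicked outside Oz. The Python sketch below decodes a list of byte values under several candidate charsets, so you can see which interpretation gives the text you expected. The function name `interpretations` and the default list of codecs are illustrative choices, and [195, 169] is just an example of a two-byte sequence.

```python
def interpretations(codes, codecs=("ascii", "utf_8", "latin_1", "cp1251", "cp850")):
    """Decode a list of byte values under each candidate codec."""
    data = bytes(codes)
    result = {}
    for codec in codecs:
        try:
            result[codec] = data.decode(codec)
        except UnicodeDecodeError:
            result[codec] = None  # these bytes are not valid in this codec
    return result


if __name__ == "__main__":
    for codes in ([72, 105], [195, 169]):
        for codec, text in interpretations(codes).items():
            print(codes, codec, repr(text))
```

[72, 105] reads as "Hi" under every codec listed, since they are all ASCII-compatible. [195, 169] is invalid as ASCII, reads as "é" in UTF-8 and as two separate characters in the single-byte charsets.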
Yves
Dmitry Negius wrote:
I did as you said on my Windows XP Service Pack 3 computer.
I wrote and compiled the following:
functor
import
Application
System
define
{System.showInfo "АБВГД"}
{Application.exit 0}
end
The output from this program is pseudo-graphic garbage, not the Russian
letters.
So the compiler, the interpreter, or both have errors in their Cyrillic support.
This does not mean that the Mozart OPI is correct. The Mozart OPI has a bug
too, because an Oz program written in the FAR Commander editor, where the
Russian letters are displayed correctly, is displayed wrongly in the Mozart OPI.
2009/8/17 Wolfgang Meyer <[email protected]>
Hi,
actually, ASCII only defines the codes 0-127.
Oz uses the ISO/IEC 8859-1 charset, which covers Western European
languages.
However, as long as you only use normal input and output and no GUI,
it might still work with Cyrillic symbols on a computer which uses a
Cyrillic codepage.
To test whether the problem is with Emacs or with Oz, you could
write a little program like
functor
import
Application
System
define
{System.showInfo "some Cyrillic text"}
{Application.exit 0}
end
Compile it with "oz -c filename.oz" and execute it in a
shell/DOS-Box with "ozengine filename.ozf".
If this works, we know that the problem is either with Emacs or with
the Oz-Emacs-interface.
Cheers,
Wolfgang
> Cyrillic symbols are situated in the high part of the ASCII table and
> have codes lower than 256.
> An Oz program with Cyrillic symbols is ASCII text, but is displayed
> wrongly by the Emacs OPI.
> Question is still open :-)
>
> 2009/8/17 Torsten Anders <[email protected]>
>
> > Dear Dmitry,
> >
> > On 17 Aug 2009, at 14:25, Dmitry Negius wrote:
> >
> > Hello.
> > I am studying Mozart - Oz now and found a problem with Cyrillic input
> > in the Emacs OPI.
> > None of the 3 Cyrillic input modes works in Emacs: input letters are
> > displayed incorrectly.
> >
> > Is it a Mozart or an Emacs bug, and is there a workaround for this
> > problem?
> >
> >
> > As far as I know, Mozart source must be ASCII. Unicode support was
> > discussed before (check the mailing list archive) but is not part of
> > the language. Mozart extensions for Unicode are proposed by
> >
> > * http://www.snowlion.nl/mozart/
> > * http://www.mozart-oz.org/mogul/info/fkonvick/unicode.html
> >
> > Hope I understood your question..
> >
> > Best
> > Torsten
> >
_________________________________________________________________________________
mozart-users mailing list
[email protected]
http://www.mozart-oz.org/mailman/listinfo/mozart-users