Re: Problem with beaver editor and UTF-8/ISO-8859-2 encoding

J.C. Roberts Thu, 14 Jan 2010 15:07:50 -0800

On Thu, 14 Jan 2010 07:20:48 +0100 Tomas Bodzar
<tomas.bod...@gmail.com> wrote:


> I invoked 'xterm -lc' then 'setxkbmap -layout "us,cz" -option
> "grp:shifts_toggle,grp_led:scroll"'.
> 
> $ beaver
> 
> entered some text in cz and save with name 'file'.
> 
> $ file file
> file: UTF-8 Unicode text, with no line terminators
> 
> $ beaver file   (no error during opening)
> 
> Now I can see text, but diacritic characters are right of letters
> instead of above them and one letter 'k' is even missing. gvim shows
> garbage. It's same (output with badly placed diacritic) for these
> options too :
> 
> $ luit -v -encoding 'UTF-8' beaver file
> UTF-8, non-ISO-2022 encoding.
> 
> $ luit -v -encoding 'ISO 8859-2' beaver file
> G0 is ASCII, G1 is Unknown (94), G2 is ISO 8859-2, G3 is Unknown (94).
> GL is G0, GR is G2.
> 
> Maybe I'm missing something obvious or I'm interpreting info from man
> pages in a bad way.


You'll need to forgive me for not knowing how to read, speak or even
type in Czech. Yesterday, I looked up the keyboard mapping for Czech
keyboards on Wikipedia, built the beaver port, and tried to recreate
your results, but what I got was different.

The "with no line terminators" message from file(1) is normal and
expected if, and only if, you created a file without any new lines
(i.e. your file contains only a single line of text).
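
If you want to check that by hand, here is a minimal sketch in Python.
It is my own illustration and only approximates what file(1) reports;
it is not how file(1) is implemented, and the function name is made up
for the example.

```python
def has_line_terminator(data: bytes) -> bool:
    """Return True if the bytes contain at least one LF or CR."""
    return b"\n" in data or b"\r" in data


# Hypothetical sample contents: one line without a trailing newline,
# and two lines that each end in a newline.
single_line = "příliš žluťoučký kůň".encode("utf-8")
multi_line = "první řádek\ndruhý řádek\n".encode("utf-8")

print(has_line_terminator(single_line))
print(has_line_terminator(multi_line))
```

Multi-byte UTF-8 sequences never contain the bytes 0x0A or 0x0D, so a
plain byte search is safe for UTF-8 text.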

I still haven't figured out how using multiple keyboard mappings really
works (i.e. your `setxkbmap -layout "us,cz"`). The testing I did was
with just the "cz" layout applied (i.e. using `setxkbmap cz`, which is
the same as `setxkbmap -layout "cz"`).

I run OpenBSD -current, and since you didn't post your dmesg with this
problem, I have no clue what you are running.

Do you want to use UTF-8 ?

Or do you want to use 8-bit, single byte, ISO 8859-2 ?
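
The difference matters at the byte level. As a small illustration
(Python, my own example, using one Czech letter), the same character is
stored as two bytes in UTF-8 and as one byte in ISO 8859-2:

```python
text = "č"  # U+010D LATIN SMALL LETTER C WITH CARON

utf8_bytes = text.encode("utf-8")        # multi-byte encoding
latin2_bytes = text.encode("iso8859_2")  # 8-bit, single-byte encoding

print(utf8_bytes.hex(), len(utf8_bytes))      # c48d 2
print(latin2_bytes.hex(), len(latin2_bytes))  # e8 1
```

A program that assumes one encoding while the file is in the other will
misinterpret every non-ASCII character.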


As the luit(1) man page states:
       Luit  is  a  filter  that  can be run between an arbitrary
       application and a UTF-8 terminal emulator.  It  will  con-
       vert  application  output  from the locale's encoding into
       UTF-8, and convert terminal  input  from  UTF-8  into  the
       locale's encoding.

       An  application  may also request switching to a different
       output  encoding  using  ISO 2022  and   ISO 6429   escape
       sequences.   Use of this feature is discouraged: multilin-
       gual applications should be modified to directly  generate
       UTF-8 instead.

By using luit(1), you are attempting to do a conversion from the
encoding of your locale *_to_* UTF-8. I have no clue if this beaver
editor can even handle UTF-8, so even if luit(1) is successful in doing
the conversion to UTF-8 when saving, you might not be able to open the
file afterwards. *** THIS seems to be the problem.
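
To make the two directions in the man page excerpt concrete, here is a
rough sketch in Python. It only shows the idea of the conversion; it is
not luit's implementation, it ignores the ISO 2022 handling, and the
function names are hypothetical.

```python
def app_output_to_terminal(data: bytes, locale_encoding: str) -> bytes:
    """Application output: locale encoding -> UTF-8 for the terminal."""
    return data.decode(locale_encoding).encode("utf-8")


def terminal_input_to_app(data: bytes, locale_encoding: str) -> bytes:
    """Terminal input: UTF-8 -> locale encoding for the application."""
    return data.decode("utf-8").encode(locale_encoding)


# Assumption for the example: the locale encoding is ISO 8859-2.
from_app = "kůň".encode("iso8859_2")
to_terminal = app_output_to_terminal(from_app, "iso8859_2")
back_to_app = terminal_input_to_app(to_terminal, "iso8859_2")

print(to_terminal.decode("utf-8"))
print(back_to_app == from_app)
```

Note that both directions concern the stream between the application
and the terminal, which is all the man page excerpt describes.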

If you want UTF-8 output, it's far better to use an application
that actually has proper UTF-* multilingual support.

With vim/gvim you can easily set your desired encoding.
        $ gvim
        :set encoding=utf-8

If you want UTF-8 support/output in your terminal emulator (xterm), the
best answer is to use uxterm(1). Similar to the -en option of xterm(1),
using uxterm(1) will handle setting up your locale properly (i.e.
setting the environment variables).

------------------------------------------------------
Test #1

  $ setxkbmap cz
  $ uxterm
  $ beaver

I created a new file and typed some of the accented/diacritic
characters in the "cz" layout, across a few lines of text. Then I saved
the file as "test4.txt" and tested it with file(1).

  $ file test4.txt
  test4.txt: UTF-8 Unicode text

So the creation of the file within beaver worked. BUT if I reopen the
file with beaver, the accents/diacritics *FOLLOW* the characters rather
than being above the characters.  --This is wrong and seems to be what
you are reporting.
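
If you want a second opinion besides file(1) on whether the saved bytes
are valid UTF-8, a strict decode is enough. This is a minimal sketch in
Python; the function name is made up, and file(1) uses its own
heuristics rather than this check.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8("č".encode("utf-8")))      # two-byte UTF-8 sequence
print(is_valid_utf8("č".encode("iso8859_2")))  # lone byte 0xE8
```

To run it on a real file, read the file in binary mode and pass the
bytes in. Plain ASCII also passes, since ASCII is a subset of UTF-8.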

Now, if I take that same exact file, "test4.txt", and open it in gvim,
it *initially* looks like a mess, but if I set the encoding to UTF-8,
then everything looks fine.

  $ gvim test4.txt
  :set encoding=utf-8

There is nothing wrong with the UTF-8 file that was created, but since
beaver cannot read UTF-8 files, that's where your problem is.
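
As a general illustration of how a perfectly good UTF-8 file can look
like a mess, here is what happens when UTF-8 bytes are interpreted as a
single-byte encoding (Python, my own example). I don't know whether
this is exactly what beaver or gvim do internally; ISO 8859-2 is just
an assumption for the demonstration.

```python
utf8_bytes = "č".encode("utf-8")  # the two bytes 0xC4 0x8D

# Wrong interpretation: each byte is treated as a separate character.
misread = utf8_bytes.decode("iso8859_2")
print(len(misread))    # 2 characters instead of 1
print(ascii(misread))

# The bytes themselves are undamaged; decoding them as UTF-8 still works.
print(utf8_bytes.decode("utf-8"))
```

The data on disk is the same in both cases; only the interpretation
differs, which is why setting the right encoding in the editor fixes
the display.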

------------------------------------------------------
Test #2

Doing all of the above, but using `xterm -lc` rather than `uxterm` has
the exact same results, namely, beaver can create a valid UTF-8 file,
but it cannot open the resulting UTF-8 file.

------------------------------------------------------

The short answer is that this 'beaver' program is not able to handle
UTF-8 files. There are plenty of other text editors in the ports tree,
and many of the big "desktop" packages (kde, gnome, xfce, ...) have
their own text editor.

-- jon
