On Sat, Mar 23, 2024 at 07:41:45PM +0800, Sadeep Madurange wrote:
> Initially, LANG was unset and LC_CTYPE="C". The character encoding was
> US-ASCII. I changed these variables (i.e., LANG, LC_CTYPE and locale
> settings) to en_US.UTF-8. Then the ? changed to ?. So, looks like you
> are on to something. I will check this with OpenBSD community as well.
> 
> In Xdefaults, I have set XTerm*utf-8 setting to true as well.

Your problem is that these settings are not consistent (and you still
have this problem, because the "solution" proposed by Sirius is
incorrect--even if it appears to have solved your issue).  By having
LANG unset, you've told your shell (and therefore everything started
by it) to use ASCII, but you've explicitly told xterm to use Unicode.
That's wrong.

The TL;DR of this is:

1. You should NEVER need to set Mutt's charset explicitly. [*]
2. Your shell, Mutt, and X should all inherit what they need from your
   LANG environment variable, assuming it is set properly for your
   system and environment (it definitely isn't in your case).
3. Setting Mutt's charset may appear to "work" but it's not the
   correct solution, because your shell and terminal settings are
   still inconsistent.  You'll have trouble with other things later if
   you don't fix this.
________________________________________________________________
[*] Except in extremely rare and completely esoteric cases that apply
    only to experts... and by now should really apply to no one.


The unfortunately lengthy details:
----------------------------------

Displaying characters properly is actually tricky business on modern
computers, because of the legacy methods by which we tried to
accommodate different languages, and the (relatively) recent advent of
Unicode to unify that mess.  All of the following must be set
consistently:  Your shell, your terminal program (or your operating
system's console), your font, all of your application programs, and
when appropriate, the X window system.  If any of these are not
consistently set, you can, and eventually WILL, have trouble.  Most
modern systems have the concept of a default locale, which is
typically set for you at install time, and which every process you
start inherits, unless you configure your user environment
differently.

Fortunately, there is a very simple mechanism by which this happens:
the LANG environment variable.  There are additional ancillary
environment variables whose names start with "LC_", but you usually
should not have to set any of these, because they take their value
from LANG if they are not explicitly set.  When you run the locale
command, values enclosed in quotes are implied (normally inherited
from LANG), and values NOT enclosed in quotes have been set explicitly:

    $ locale
    LANG=en_US.UTF-8
    LANGUAGE=
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC="en_US.UTF-8"
    LC_TIME="en_US.UTF-8"
    LC_COLLATE=C
    LC_MONETARY="en_US.UTF-8"
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER="en_US.UTF-8"
    LC_NAME="en_US.UTF-8"
    LC_ADDRESS="en_US.UTF-8"
    LC_TELEPHONE="en_US.UTF-8"
    LC_MEASUREMENT="en_US.UTF-8"
    LC_IDENTIFICATION="en_US.UTF-8"
    LC_ALL=
    
Here you can see that I explicitly set LC_COLLATE=C.  The rest are
inherited from LANG.  Most users will want to leave all of the LC_*
variables unset, and inherit from LANG.
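
If you want to see which of these are actually set in your environment
(rather than implied), something like the following should do it.  This
is only a sketch; the function name explicit_locale_vars is my own
invention, not a standard command.

```shell
# Print the locale-related variables that are explicitly present in
# the environment.  Anything missing from this list is being implied
# (normally from LANG) rather than set.
explicit_locale_vars() {
    env | awk '/^(LANG|LANGUAGE|LC_[A-Z_]+)=/' | sort
}

explicit_locale_vars
```

Ideally this prints a single LANG= line and little or nothing else.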

I haven't tried a *BSD in a really long while, but if it doesn't ask
you for your default locale during install, or if you made a mistake
setting it up, then you should add the settings manually to your login
shell environment.  If you're using UTF-8 (which you should be--by now
every modern OS uses it by default), the value of LANG should reflect
that.  Pretty much no one should be using ASCII anymore (i.e. LANG
should NEVER be unset).  The most portable way to set it is to
include the following in BOTH .profile and .kshrc (or whatever file
you've set ENV to):

    LANG=en_US.UTF-8
    export LANG

[See the Invocation section of the ksh man page for exact details of
which files you should put this in, but in general it's the ones I
said.]

Of course if you are not using English, change en_US to whatever your
default language is, but you'll want to retain the ".UTF-8" portion.
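
After logging in again, a quick sanity check along these lines can
tell you whether the setting took effect.  It is a sketch that relies
only on "locale charmap", which prints the character encoding your
current locale settings resolve to; the helper name is_utf8_locale is
made up for this example.

```shell
# Succeeds if the current locale settings resolve to a UTF-8 codeset.
# An unset or invalid LANG normally resolves to an ASCII-based codeset
# instead, whose exact name varies between systems.
is_utf8_locale() {
    [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}

if is_utf8_locale; then
    echo "locale is UTF-8"
else
    echo "locale is NOT UTF-8; check LANG and the LC_* variables"
fi
```

Run it both in your login shell and in a fresh xterm; if the two
disagree, your X session is not picking up the same environment.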

That *should* be sufficient to handle everything... however, there may
be additional places you'll need to add it for your X applications,
depending on exactly what OpenBSD does to initialize users' X
sessions.  In general, the X startup stuff is supposed to make sure
that it sources the user's environment so that you don't need to
figure out which of the 17 different files you actually need to put
this stuff in... but over the years most vendors have bastardized how
X sessions start up, so you may have to look up or trace out how your
system does it to make sure everything works correctly.  So you might
have to also add it to .xinitrc or .xsession or similar.  Or, if it
does not already, you could simply have your X init thingy source your
.profile or .kshrc or whatever--and probably should.
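
As a rough illustration only (I have not checked how OpenBSD starts X
sessions, so treat the file name and the placeholder at the end as
assumptions), having .xsession source your .profile could look like
this:

```shell
# ~/.xsession (sketch): pull in the login-shell environment so that X
# clients, including xterm, see the same LANG as your shell does.
if [ -r "$HOME/.profile" ]; then
    . "$HOME/.profile"
fi

# Start your window manager or session here exactly as you did before.
# This is left as a comment because it depends entirely on your setup.
# exec your-window-manager
```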

As for this:

On Sun, Mar 24, 2024 at 08:11:29AM +0800, Sadeep Madurange wrote:
> On 2024-03-23 12:52:40, Sirius Rayner-Karlsson via Mutt-users wrote:
> > It may be that you just need to pop in the "set charset="utf-8"" in
> > your mutt config and you are good to go.

As I said, you should NEVER need to do this... if your environment is
set up properly, Mutt will CORRECTLY inherit its charset and language
settings from that.  The only time you should ever need to mess with
Mutt's idea of your charset is if you have some very esoteric edge
case use that requires things to be different from your default
language.  Since the proliferation of Unicode, this is now EXTREMELY
rare, and if you don't know what you're doing, you almost certainly
shouldn't be doing this.

> Thanks for sharing your config. You're right. I needed to add "set
> charset=utf-8" to muttrc and set LC_CTYPE. The problem was I had set the
> latter in my .kshrc when I needed to set it in .xsession. It's all good
> now.

Your settings may "work" but they are not correct.  You might not have
trouble with Mutt displaying ??? where it shouldn't anymore, but if
you don't take care to set up your locale correctly, you will
eventually run into related issues elsewhere, inside and outside
Mutt.

You should really remove that charset setting in Mutt, to confirm that
your other settings are now correct.  You don't need it, and you
shouldn't use it.  It's just more likely to cause problems later.
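
One way to confirm it, once the line is gone from your muttrc, is to
ask Mutt what it ended up with.  Mutt's -Q option queries a
configuration variable; the wrapper function here is just a
hypothetical convenience so the check fails gracefully where mutt is
not installed.

```shell
# Print the charset Mutt derived for itself.  Returns non-zero if
# mutt is not in PATH.
show_mutt_charset() {
    command -v mutt >/dev/null 2>&1 && mutt -Q charset
}

show_mutt_charset || echo "skipped: could not query mutt" >&2
```

If your locale is set up consistently, the value reported should be
UTF-8 without you having set it anywhere in Mutt's configuration.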

-- 
Derek D. Martin    http://www.pizzashack.org/   GPG Key ID: 0xDFBEAD02
-=-=-=-=-
This message is posted from an invalid address.  Replying to it will result in
undeliverable mail due to spam prevention.  Sorry for the inconvenience.
