On Tue, Aug 27, 2002 at 01:37:30PM -0700, Sam Peterson wrote:
> > I'm using 1.4. I think LC_CTYPE and charset are set correctly in both
> > cases. BTW, why does Mutt use both of these variables? I would find
> > logical if charset have overridden LC_CTYPE. I can't see the rationale
> > behind the current implementation, and find it counter-intuitive. Could
> > anyone please shed some light on this, too?
>
> Hmm, from my experience, charset does override LC_CTYPE. I'll have to
> double check that. To my knowledge, charset is set based on your
> locale if you don't explicitly set it in your muttrc. Anyone?

Example: let's take a message encoded in iso-8859-1 and containing the
character 'ç' (LATIN SMALL LETTER C WITH CEDILLA, octal code 347). The
following table outlines how the message is displayed with different
LC_CTYPE and charset values on a system with (hopefully) properly
configured locales (mutt 1.4.0-2 on Debian unstable):

LC_CTYPE          charset     user wants  mutt shows  user expects
----------------  ----------  ----------  ----------  ------------
""                ""          unknown     \347        ?
""                iso-8859-1  iso-8859-1  \347        ccedilla
""                us-ascii    us-ascii    ?           ?
en_US.US-ASCII    ""          us-ascii    \347        ?
en_US.US-ASCII    iso-8859-1  iso-8859-1  \347        ccedilla
en_US.US-ASCII    us-ascii    us-ascii    ?           ?
en_US.ISO-8859-1  ""          iso-8859-1  ccedilla    ccedilla
en_US.ISO-8859-1  iso-8859-1  iso-8859-1  ccedilla    ccedilla
en_US.ISO-8859-1  us-ascii    us-ascii    ?           ?

Here LC_CTYPE and charset are the environment and mutt variables,
respectively. The "mutt shows" column lists what mutt actually
displays. "User wants" and "user expects" show what I call "override":
the charset to be assumed by mutt given the two variable values, and
the character that, IMHO, should be displayed by mutt according to the
charset assumed.

Apparently, mutt does the following:

* display ccedilla if it can be displayed both with LC_CTYPE and
  charset;
* display ? if the char can't be displayed with charset, regardless of
  the LC_CTYPE value;
* display \347 if the char can be displayed with charset, but can't be
  displayed with LC_CTYPE.

At the moment, I can see neither a useful application of the
distinction between ? and \347, nor the relation between LC_CTYPE and
charset -- I certainly cannot call it "override". It seems to me to be
some side effect of code layering inside mutt.

And the logic I propose is:

1. if charset is set, assumed_charset = charset;
   otherwise, if LC_CTYPE is set, assumed_charset = LC_CTYPE;
   otherwise, assumed_charset = "us-ascii".

2. * display ccedilla if it can be displayed with assumed_charset;
   * display ? if it can't.

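To make the proposed logic concrete, here is a minimal Python sketch of it. This is not mutt code. The names assumed_charset and render are made up for illustration, and two things are my own assumptions: "assumed_charset = LC_CTYPE" is read as "the codeset part of LC_CTYPE after the dot", and an LC_CTYPE value without a codeset part falls back to us-ascii. "Can be displayed" is approximated by asking Python's codecs whether the character is encodable in the assumed charset.

```python
# Illustrative sketch of the proposed override logic; not taken from mutt.

C_CEDILLA = "\u00e7"  # LATIN SMALL LETTER C WITH CEDILLA, octal 347 in iso-8859-1


def assumed_charset(lc_ctype: str, charset: str) -> str:
    """Step 1: charset wins, then LC_CTYPE, then us-ascii."""
    if charset:
        return charset
    # Assumption: LC_CTYPE looks like "en_US.ISO-8859-1" or
    # "de_DE.ISO-8859-15@euro"; keep only the codeset between "." and "@".
    codeset = lc_ctype.partition(".")[2].partition("@")[0]
    return codeset or "us-ascii"


def render(ch: str, lc_ctype: str, charset: str) -> str:
    """Step 2: show the character if the assumed charset has it, else '?'."""
    try:
        ch.encode(assumed_charset(lc_ctype, charset))
    except (UnicodeEncodeError, LookupError):
        # Not representable, or a charset name Python does not know.
        return "?"
    return ch


if __name__ == "__main__":
    for lc in ("", "en_US.US-ASCII", "en_US.ISO-8859-1"):
        for cs in ("", "iso-8859-1", "us-ascii"):
            print(repr(lc), repr(cs), render(C_CEDILLA, lc, cs))
```

Run over the nine LC_CTYPE/charset combinations from the table, this sketch gives ccedilla exactly where the "user expects" column says ccedilla and ? everywhere else.
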
I don't insist that my scheme is "better", since I don't know the
rationale behind the current design. However, I see much user
confusion with this issue and think it could be made more simple, more
stupid the way I've described.

What do you think?

With kind regards,
Baurjan.

P.S. I can see the full address in index mode using "@". Can I see the
full subject?