Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-08-02 Thread pmaydell
Oliver Kiddle wrote:
>This can mostly all be done now. I made some changes that are in 1.2
>that use iconv for converting encoded headers. These default to LC_CTYPE
>where MM_CHARSET is not set. And using mime makes it easy to compose
>messages in other charsets.

Really? How do you do that? I had in mind a process where when
you say 'repl' nmh does:
 * convert message being replied to to your local charset
 * quote it as usual to construct the draft
 * you edit in your local charset
 * at some point in sending nmh converts to the charset for sending
 * 'list' at whatnow prompt should display in local charset

but AFAIK this is certainly not 'out of the box' even if it's theoretically
possible to lash together scripts to do it.
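
As a rough illustration of the round trip listed above (not how nmh behaves today, and not code from this thread), here is a minimal Python sketch. The names `quote_for_reply` and `encode_for_sending` and the charset preference order are assumptions made up for the example; a Python `str` stands in for "text in the local charset".

```python
def quote_for_reply(body: bytes, declared_charset: str) -> str:
    """Decode the original body and quote it, one '> ' per line."""
    # Undecodable bytes become U+FFFD rather than aborting the reply.
    text = body.decode(declared_charset, errors="replace")
    return "".join("> " + line for line in text.splitlines(keepends=True))


def encode_for_sending(draft: str, preferred=("us-ascii", "iso-8859-1", "utf-8")):
    """Return (bytes, charset) using the first charset that can hold the draft."""
    for charset in preferred:
        try:
            return draft.encode(charset), charset
        except UnicodeEncodeError:
            continue
    raise ValueError("no preferred charset can represent the draft")


# Hypothetical message body, as it might arrive labelled iso-8859-1.
original = "Grüße\nPhilipp\n".encode("iso-8859-1")
draft = quote_for_reply(original, "iso-8859-1") + "Danke!\n"
payload, charset = encode_for_sending(draft)
print(charset)
```

Trying the narrowest charset first is only one possible policy; replying in the charset of the original message would be another.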

>For viewing messages, it is easy to add a few profile entries to do the
>conversion:
>  mhshow-charset-utf-8: csconv utf-8 '%s'
>  mhshow-charset-iso-8859-15: csconv iso-8859-15 '%s'
>  and so on...

Yes, I have a few of those too. But it's obviously impossible
to put in entries for every charset iconv can cope with.

>That isn't perfect, however. One difficulty lies with things like HTML
>e-mails - the charset can be in an HTML meta tag instead of the MIME
>header. The same can apply to XML files. None of w3m, links, lynx or
>html2text seem to cope well with the different charsets.

I think that is really up to the HTML viewer, though.

>Would you want to hardcode use of iconv for message bodies within
>mhshow?

Yes. For backwards compatibility and oddball stuff you'd want to continue
to support the existing stuff but I think that nmh should just do the
Right Thing without requiring configuration and external utilities.
Ideally, plain old show should do this too. [mhshow has the disadvantage
that you don't get headers plus body in a single pager instance.]

-- PMM


___
Nmh-workers mailing list
Nmh-workers@nongnu.org
http://lists.nongnu.org/mailman/listinfo/nmh-workers


Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-08-02 Thread Oliver Kiddle
[EMAIL PROTECTED] wrote:
> Of course this is all rather mishandled anyway. What we ought to be
> doing is letting the user read and compose messages in their local
> charset (as defined by LC_CTYPE) and using iconv to automatically
> convert to/from whatever charset is required for the email being

This can mostly all be done now. I made some changes that are in 1.2
that use iconv for converting encoded headers. These default to LC_CTYPE
where MM_CHARSET is not set. And using mime makes it easy to compose
messages in other charsets.

For viewing messages, it is easy to add a few profile entries to do the
conversion:
  mhshow-charset-utf-8: csconv utf-8 '%s'
  mhshow-charset-iso-8859-15: csconv iso-8859-15 '%s'
  and so on...

This is at least flexible enough that you can do things like:
  mhshow-charset-us-ascii: csconv windows-1252 '%s'
  mhshow-charset-iso-8859-1: csconv windows-1252 '%s'
to cope with lying Windows mailers (windows-1252 is a superset of the
other charsets). Originally, this mhshow profile entry was intended to
launch xterm with the different font but I just call iconv or recode.

I've got a few different variants of the csconv script. For example:
eval "$2" | iconv -f "$1" -t "${MM_CHARSET:-UTF-8}//TRANSLIT"
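
For readers who want to experiment outside the shell, the same idea can be sketched in Python. This is only an approximation under stated assumptions: the function name `csconv` is borrowed from the script above, it takes bytes rather than running a command, and Python has no equivalent of iconv's `//TRANSLIT`, so unencodable characters are replaced with "?" instead of being transliterated.

```python
import os


def csconv(data: bytes, source_charset: str, target_charset=None) -> bytes:
    """Re-encode data from source_charset into the reader's charset."""
    # Mirrors ${MM_CHARSET:-UTF-8}: fall back when unset or empty.
    target = target_charset or os.environ.get("MM_CHARSET") or "UTF-8"
    # Bytes that are invalid in the source charset become U+FFFD.
    text = data.decode(source_charset, errors="replace")
    # No //TRANSLIT here: characters the target cannot hold become "?".
    return text.encode(target, errors="replace")


print(csconv("Umlaut: ä".encode("iso-8859-15"), "iso-8859-15", "utf-8"))
```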

That isn't perfect, however. One difficulty lies with things like HTML
e-mails - the charset can be in an HTML meta tag instead of the MIME
header. The same can apply to XML files. None of w3m, links, lynx or
html2text seem to cope well with the different charsets.

Would you want to hardcode use of iconv for message bodies within
mhshow?

Oliver




Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-08-02 Thread Valdis . Kletnieks
On Wed, 02 Aug 2006 13:00:54 BST, [EMAIL PROTECTED] said:

> Did we ever come up with a good plan for this? It's currently
> at the top of my 'list of things that annoy me about nmh'...

Whatever we do, we need to remember to include a check for the rather common
case of "Tagged as utf8/8859/2022/etc, but only contains actual ascii" and
downgrade the labeling as per the RFCs.  One subtle point is that we need to
do this *before* a "What next?" program or something tries to attach a digital
signature to the thing.
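
A minimal Python sketch of such a check, as an illustration only: the function name `effective_charset` is hypothetical, and the test assumes the declared charset is ASCII-compatible (as the ones named above are). Encodings such as UTF-7 or UTF-16 would need separate handling.

```python
def effective_charset(body: bytes, declared: str) -> str:
    """Return 'us-ascii' if the body is plain ASCII, else the declared label."""
    # Allow tab, LF, CR and printable ASCII only. Rejecting ESC, SO and SI
    # keeps 7-bit ISO-2022 text from being mislabelled as ASCII.
    def is_plain(byte: int) -> bool:
        return byte in (0x09, 0x0A, 0x0D) or 0x20 <= byte <= 0x7E

    if all(is_plain(b) for b in body):
        return "us-ascii"
    return declared


print(effective_charset(b"hello\n", "utf-8"))
print(effective_charset(b"\x1b$B", "iso-2022-jp"))
```

The relabelling has to run before signing because the charset label is part of the content a signature covers.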





Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-08-02 Thread pmaydell
[EMAIL PROTECTED] wrote:
>Browsing through sbr/check_charset.c indicated that the following are valid:
>US-ASCII, ISO-8859- or UTF-8. Upper/lower case doesn't seem to matter.

Of course this is all rather mishandled anyway. What we ought to be
doing is letting the user read and compose messages in their local
charset (as defined by LC_CTYPE) and using iconv to automatically
convert to/from whatever charset is required for the email being
read/sent. (These days wanting to send mail in something other
than the local charset is becoming more common as people move to
UTF-8 but still want to read/write messages using legacy encoding
when corresponding with people using older mail clients.)

Did we ever come up with a good plan for this? It's currently
at the top of my 'list of things that annoy me about nmh'...

-- PMM




Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-07-23 Thread der_wachtmeister
> > > This is a real problem: when I MIMEfy a message (on the "What now?" prompt)
> > > it destroys german Umlauts.
> > > Check it out, here's a =8A (ae), =9A (oe), =9F (ue) and =A7 (sharp s).
> > > How can I tell nmh to leave Umlauts alone?
> > 
> > Is this portion of the mhbuild man page relevant?
> > ...
> >set  is  of  the  type  given  by the environment variable
> >MM_CHARSET.  If this environment variable is not set, then
> >the character set will be labeled as "x-unknown".
> 
> What are valid values for MM_CHARSET?

Browsing through sbr/check_charset.c indicated that the following are valid:
US-ASCII, ISO-8859- or UTF-8. Upper/lower case doesn't seem to matter.

Philipp




Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-07-23 Thread Valdis . Kletnieks
On Sun, 23 Jul 2006 06:13:41 +0200, [EMAIL PROTECTED] said:

> What are valid values for MM_CHARSET?

As a practical matter, the valid values are "Some single-width charset for
which you have proper locale support installed on the system". "iso8859-1"
works on my system, and most other iso8859-* should work if you have the right
font installed.

I admit to not being very confident of it doing iso2022-* or utf-8 correctly,
but would enjoy being pleasantly surprised to be told I'm wrong.




Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-07-22 Thread der_wachtmeister
> > This is a real problem: when I MIMEfy a message (on the "What now?" prompt)
> > it destroys german Umlauts.
> > Check it out, here's a =8A (ae), =9A (oe), =9F (ue) and =A7 (sharp s).
> > How can I tell nmh to leave Umlauts alone?
> 
> Is this portion of the mhbuild man page relevant?
> ...
>set  is  of  the  type  given  by the environment variable
>MM_CHARSET.  If this environment variable is not set, then
>the character set will be labeled as "x-unknown".

What are valid values for MM_CHARSET?

Philipp




Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-07-18 Thread David Levine
> This is a real problem: when I MIMEfy a message (on the "What now?" prompt)
> it destroys german Umlauts.
> Check it out, here's a =8A (ae), =9A (oe), =9F (ue) and =A7 (sharp s).
> How can I tell nmh to leave Umlauts alone?

Is this portion of the mhbuild man page relevant?

   When composing a text content, you may indicate the  rele-
   vant  character  set  by adding the "charset" parameter to
   the directive.



Re: [Nmh-workers] mhbuild destroying german Umlauts

2006-07-18 Thread Valdis . Kletnieks
On Tue, 18 Jul 2006 16:36:20 +0200, [EMAIL PROTECTED] said:
> This is a real problem: when I MIMEfy a message (on the "What now?" prompt)
> it destroys german Umlauts.
> Check it out, here's a Š (ae), š (oe), Ÿ (ue) and § (sharp s).
> How can I tell nmh to leave Umlauts alone?

Ick.  Your message showed up with:

Content-type: text/plain; charset=US-ASCII

charset=iso8859-1  or iso8859-15 or utf-8 would have been more correct,
depending on your display glyph religion.  But US-ASCII is just *wrong*.
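
To see why the label matters, the four bytes in question (0x8A, 0x9A, 0x9F, 0xA7, as they appear quoted-printable in the replies) can be decoded under different charsets. The short Python sketch below is only an illustration; that the sender's system used MacRoman is an assumption, not something stated in this thread, though it is consistent with the "(ae), (oe), (ue), (sharp s)" annotations.

```python
raw = bytes([0x8A, 0x9A, 0x9F, 0xA7])

# Assumption: the sender typed on a MacRoman system.
as_mac_roman = raw.decode("mac_roman")
# What a reader treating the bytes as windows-1252 would display.
as_cp1252 = raw.decode("cp1252")

print(as_mac_roman)
print(as_cp1252)

# None of these bytes is valid US-ASCII, so that label cannot be right.
try:
    raw.decode("us-ascii")
except UnicodeDecodeError as err:
    print("not ASCII:", err.reason)
```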






[Nmh-workers] mhbuild destroying german Umlauts

2006-07-18 Thread der_wachtmeister
This is a real problem: when I MIMEfy a message (on the "What now?" prompt)
it destroys german Umlauts.
Check it out, here's a Š (ae), š (oe), Ÿ (ue) and § (sharp s).
How can I tell nmh to leave Umlauts alone?

Philipp

