Hi,
On Sun, Jan 11, 2026 at 08:35:13PM +0100, Thorsten Glaser wrote:
> On Sun, 11 Jan 2026, Andreas Mohr via Lynx-dev wrote:
>
> >- "outer-scope" MIME multi-part attribution is utf-8
>
> You can ignore that.
For purposes of document scope, indeed.
One could argue that
a MIME attribution could be made to
extend/govern a document's encoding config state, but OTOH it *is*
the document scope proper which is b0rken -
there's no denying or arguing that.
> >- the HTML document body/content is UTF-8-based
> > (as can be verified via
> > iconv -f utf-8 -t utf-8 <file>)
> >- the document (the authoritative container scope unit) declares
> > iso-8859-1 encoding for its body/content
>
> Yes, this is a bug… in the eMail, not in lynx.
Yup indeed.
The only remaining complaint might be
that lynx is perhaps not flexible enough to
offer a post-mortem workaround (a charset override) for such a trainwreck.
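For reference, the mismatch described above (UTF-8 body, iso-8859-1 declaration) can be checked with something along these lines. This is only a rough sketch: the helper name check_charset_mismatch is made up for illustration, and it assumes an iconv that exits non-zero on invalid input and a grep that supports -a and -o (GNU and BSD grep do). Note that a pure-ASCII file also passes the UTF-8 check.

```shell
# Hypothetical helper: report whether FILE decodes cleanly as UTF-8
# and which charset (if any) the document itself declares.
check_charset_mismatch() {
  file=$1
  # iconv utf-8 -> utf-8 fails on byte sequences that are not valid UTF-8
  if iconv -f utf-8 -t utf-8 "$file" >/dev/null 2>&1; then
    body=utf-8-clean
  else
    body=not-utf-8
  fi
  # first "charset=..." token found in the file, e.g. from a <meta> tag
  declared=$(grep -a -i -o 'charset=[^";> ]*' "$file" | head -n 1 || true)
  printf 'body: %s, declared: %s\n' "$body" "${declared:-none}"
}
```

Calling it as `check_charset_mismatch <file>` prints one summary line. A "utf-8-clean" body next to a declared iso-8859-1 is the broken combination discussed here.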
> Overriding the charset is not easy, you have to edit the document
> for that. (It gets even funnier if an XML PI with a charset is
> present… not.)
Works, somewhat unexpectedly.
> For the case of declared latin1, contains utf-8, you could do
> a rather evil thing of temporarily switching the display charset
> to latin1 and “Raw 8-bit” to ON. That might just work, if you
> use lynx in the C.UTF-8 locale and don’t have any nōn-ASCII UI
> strings.
Hmm, I cannot quite follow the [weirdly twisted?] processing chain here
(though I have to admit that I did not try it either).
> For your scenario of…
>
> >- mailcap entry
> > text/html; lynx -assume_charset=%{charset} -display_charset=utf-8
> > -collapse_br_tags -dump %s; nametemplate=andi_%s.html; copiousoutput
>
> … you could do something like…
>
> text/html; <%s perl -0pe
> 's!<meta\s+http-equiv="Content-Type"\s+content="[^"]*"\s*/?>!!ig;' | lynx
> -assume_charset=%{charset} -display_charset=utf-8 -collapse_br_tags -dump
> -stdin; nametemplate=andi_%s.html; copiousoutput
>
> … to automatically remove such charset declaration.
Ah, right, or possibly some sed -e 's/...' alter{c|n}ation...
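A sed variant of the quoted perl filter might look roughly like this. It is an untested-in-mailcap sketch that assumes GNU sed (for -z, which slurps the whole input so that a tag spanning several lines still matches, and for the case-insensitive I flag). The function name strip_meta_charset is made up for illustration.

```shell
# Hypothetical filter: drop <meta http-equiv="Content-Type" content="..."> tags
# from HTML on stdin, so that the document no longer declares its own charset.
# Assumes GNU sed: -z treats the input as one NUL-delimited record,
# -E enables extended regexes, and the trailing I makes the match case-insensitive.
strip_meta_charset() {
  sed -z -E 's!<meta[[:space:]]+http-equiv="Content-Type"[[:space:]]+content="[^"]*"[[:space:]]*/?>!!Ig'
}
```

In the mailcap entry, the same sed invocation would presumably take the place of the perl -0pe command in front of "| lynx ... -dump -stdin". Like the perl version, it only catches the attribute order http-equiv before content.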
For the [e]links side, I had filed the "same" issue at
"0.18.0: broken HTML file (charset declaration *wrong*) - override
possibility??"
https://github.com/rkd77/elinks/issues/417
> (EN: “[…]uhr.gz is a reason to install mksh on every system.”)
Oh wow indeed!
Greetings
Andreas Mohr
--
Klimaverschandel - weil weniger Wirtschaftskrieg einfach uncool ist.