Re: [Lynx-dev] Missing First Letter?
On Mon, Apr 05, 2021 at 09:49:35AM -0500, Tim Chase wrote:
> On 2021-04-05 09:19, Tim Chase wrote:
> > That's odd. I get the inverse behavior from what you describe. If
> > I use
> >
> >   $ LANG=C lynx chime.html
> >
> > I get the unicode placeholder character for the opening
> > fancy-double-quote and lynx displays the full "Hello", but if I do
> >
> >   $ LANG=en_US.UTF-8 lynx chime.html
> >
> > I get the vanishing "H".

I don't see this, but when I test locales, I usually use this:

    #!/bin/sh
    # $Id: with-locale,v 1.7 2015/08/16 21:20:39 tom Exp $
    unset LANG
    unset LC_ALL
    unset LC_CTYPE
    unset LESSCHARSET
    LANG=$1
    LC_ALL=$1
    GDM_LANG=$1
    export LANG
    export LC_ALL
    export GDM_LANG
    if test $# != 0
    then
        shift 1
        exec "$@"
    fi

...and in a quick check, I did

    with-locale C sh
    $ LANG=en_US.UTF-8 lynx chime.html

I haven't seen a combination which makes that "H" vanish, though the
double-quote can be lost...

> To provide additional context, this is in an xterm on FreeBSD 12.2p4
>

It may depend on what other locale-related environment variables you
have set. FreeBSD's manpage for setlocale says of LANG:

    LANG    Sets the generic locale category for native language,
            local customs and coded character set in the absence of
            more specific locale variables.

but LC_ALL and LC_CTYPE are more specific. On my Debian/testing, the
manpage gives more details:

    If locale is an empty string, "", each part of the locale that
    should be modified is set according to the environment variables.
    The details are implementation-dependent. For glibc, first
    (regardless of category), the environment variable LC_ALL is
    inspected, next the environment variable with the same name as
    the category (see the table above), and finally the environment
    variable LANG. The first existing environment variable is used.
    If its value is not a valid locale specification, the locale is
    unchanged, and setlocale() returns NULL.

> $ ident `which xterm`
> /usr/local/bin/xterm:
>      $FreeBSD: releng/12.2/lib/csu/amd64/reloc.c 339351 2018-10-13 23:52:55Z kib $
>      $FreeBSD: releng/12.2/lib/csu/amd64/crt1.c 339351 2018-10-13 23:52:55Z kib $
>      $FreeBSD: releng/12.2/lib/csu/common/ignore_init.c 339351 2018-10-13 23:52:55Z kib $
>      $FreeBSD: releng/12.2/lib/csu/amd64/crti.S 217105 2011-01-07 16:07:51Z kib $
>      $FreeBSD: releng/12.2/lib/csu/common/crtbrand.c 366954 2020-10-23 00:00:52Z gjb $
>      $FreeBSD: releng/12.2/lib/csu/amd64/crtn.S 217105 2011-01-07 16:07:51Z kib $
>
> (and for context, I have LANG=en_US.UTF-8 in my default environment,
> so that 2nd example was only to be explicit about what would
> otherwise be default behavior)
>
> -tim

-- 
Thomas E. Dickey
https://invisible-island.net
ftp://ftp.invisible-island.net
___
Lynx-dev mailing list
Lynx-dev@nongnu.org
https://lists.nongnu.org/mailman/listinfo/lynx-dev
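The lookup order quoted from the glibc manpage can be sketched as a small shell helper. This is only an illustration: `effective_ctype` is a hypothetical name, not part of lynx or of the with-locale script, and it assumes that an empty value is treated the same as an unset variable.

```sh
#!/bin/sh
# Hypothetical helper: print which environment variable would decide the
# character-type locale under the order quoted above
# (LC_ALL first, then LC_CTYPE, then LANG).
# Assumption: an empty value counts as "not set".
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        echo "LC_ALL=$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        echo "LC_CTYPE=$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        echo "LANG=$LANG"
    else
        echo "none set"
    fi
}
```

Under this order, a shell started with `with-locale C sh` has LC_ALL=C exported, so a later `LANG=en_US.UTF-8 lynx chime.html` in that shell would still resolve to LC_ALL=C. Running `effective_ctype` in the same shell as lynx is one way to check which variable is actually in charge.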
Re: [Lynx-dev] Missing First Letter?
On 2021-04-05 09:19, Tim Chase wrote:
> That's odd. I get the inverse behavior from what you describe. If
> I use
>
>   $ LANG=C lynx chime.html
>
> I get the unicode placeholder character for the opening
> fancy-double-quote and lynx displays the full "Hello", but if I do
>
>   $ LANG=en_US.UTF-8 lynx chime.html
>
> I get the vanishing "H".

To provide additional context, this is in an xterm on FreeBSD 12.2p4

  $ ident `which xterm`
  /usr/local/bin/xterm:
       $FreeBSD: releng/12.2/lib/csu/amd64/reloc.c 339351 2018-10-13 23:52:55Z kib $
       $FreeBSD: releng/12.2/lib/csu/amd64/crt1.c 339351 2018-10-13 23:52:55Z kib $
       $FreeBSD: releng/12.2/lib/csu/common/ignore_init.c 339351 2018-10-13 23:52:55Z kib $
       $FreeBSD: releng/12.2/lib/csu/amd64/crti.S 217105 2011-01-07 16:07:51Z kib $
       $FreeBSD: releng/12.2/lib/csu/common/crtbrand.c 366954 2020-10-23 00:00:52Z gjb $
       $FreeBSD: releng/12.2/lib/csu/amd64/crtn.S 217105 2011-01-07 16:07:51Z kib $

(and for context, I have LANG=en_US.UTF-8 in my default environment,
so that 2nd example was only to be explicit about what would
otherwise be default behavior)

-tim
Re: [Lynx-dev] Missing First Letter?
On 2021-03-31 19:45, Thomas Dickey wrote:
> From: "Tim Chase"
>> $ xxd -r > chime.html << EOF
>> 00000000: 3c68 746d 6c3e 3c62 6f64 793e e280 9c48  <html><body>...H
>> 00000010: 656c 6c6f 3c2f 626f 6479 3e3c 2f68 746d  ello</body></htm
>> [...]
>> there's a UTF-8 double-quote before the "Hello" as marked by the
>> bytes 0xE2, 0x80, 0x9C.
>
> you'll get that behavior if your locale is set to non-UTF-8, e.g.,
> "C" (using "en_US" rather than "en_US.UTF-8" may also look like
> this, depending on the terminal)

That's odd. I get the inverse behavior from what you describe. If
I use

  $ LANG=C lynx chime.html

I get the unicode placeholder character for the opening
fancy-double-quote and lynx displays the full "Hello", but if I do

  $ LANG=en_US.UTF-8 lynx chime.html

I get the vanishing "H".

-tim
Re: [Lynx-dev] Missing First Letter?
- Original Message -
| From: "Tim Chase"
| To: "Thorsten Glaser"
| Cc: "Chime Hart", "lynx-dev"
| Sent: Wednesday, March 31, 2021 3:06:12 PM
| Subject: Re: [Lynx-dev] Missing First Letter?

| On 2021-03-31 18:36, Thorsten Glaser wrote:
|> This helps nothing without a way to reproduce this locally,
|> for example a URL in question.
|
| The source seems to have been the text/html component of an email.
| However, here's a reproduction case:
|
| $ xxd chime.html
| 00000000: 3c68 746d 6c3e 3c62 6f64 793e e280 9c48  <html><body>...H
| 00000010: 656c 6c6f 3c2f 626f 6479 3e3c 2f68 746d  ello</body></htm
| [...]
| $ xxd -r > chime.html << EOF
| 00000000: 3c68 746d 6c3e 3c62 6f64 793e e280 9c48  <html><body>...H
| 00000010: 656c 6c6f 3c2f 626f 6479 3e3c 2f68 746d  ello</body></htm
| [...]
| there's a UTF-8 double-quote before the "Hello" as marked by the
| bytes 0xE2, 0x80, 0x9C.

you'll get that behavior if your locale is set to non-UTF-8, e.g.,
"C" (using "en_US" rather than "en_US.UTF-8" may also look like
this, depending on the terminal)

-- 
Thomas E. Dickey
http://invisible-island.net
ftp://ftp.invisible-island.net
Re: [Lynx-dev] Missing First Letter?
On 2021-03-31 18:36, Thorsten Glaser wrote:
> This helps nothing without a way to reproduce this locally,
> for example a URL in question.

The source seems to have been the text/html component of an email.
However, here's a reproduction case:

  $ xxd chime.html
  00000000: 3c68 746d 6c3e 3c62 6f64 793e e280 9c48  <html><body>...H
  00000010: 656c 6c6f 3c2f 626f 6479 3e3c 2f68 746d  ello</body></htm
  [...]
  $ xxd -r > chime.html << EOF
  00000000: 3c68 746d 6c3e 3c62 6f64 793e e280 9c48  <html><body>...H
  00000010: 656c 6c6f 3c2f 626f 6479 3e3c 2f68 746d  ello</body></htm
  [...]

there's a UTF-8 double-quote before the "Hello" as marked by the
bytes 0xE2, 0x80, 0x9C.
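For anyone without xxd at hand, a similar test file can probably be produced with printf alone. The sketch below is not the exact file from the message: the dump above is cut off after "</htm", so the closing "l>" and trailing newline are an assumption, and `make_chime` and `first_bytes` are hypothetical helper names.

```sh
#!/bin/sh
# Write a small HTML file whose body starts with U+201C
# (UTF-8 bytes 0xE2 0x80 0x9C, written here as octal 342 200 234)
# followed by "Hello".
# Assumption: the file ends with "</html>" plus a newline.
make_chime() {
    printf '<html><body>\342\200\234Hello</body></html>\n' > "$1"
}

# Print the first 16 bytes of a file as space-separated hex pairs,
# for comparison with the first line of the xxd dump.
first_bytes() {
    od -An -tx1 -N16 "$1" | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}
```

A possible use is `make_chime chime.html`, then `first_bytes chime.html` to confirm the e2 80 9c sequence sits directly before the 48 ("H"), then `lynx chime.html` under the locale being tested.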
Re: [Lynx-dev] Missing First Letter?
Well Thorsten, these are either newsletters I am signed up to or items
I receive in Alpine, which I view in Lynx as it's a smoother read.
Chime
Re: [Lynx-dev] Missing First Letter?
Chime Hart dixit:

> Well Thorsten, this goes beyond Politico. Here are 2 lines from a story in
> Hollywood Reporter. Notice a w is missing from the first word, also what

This helps nothing without a way to reproduce this locally,
for example a URL in question.

Sorry,
//mirabilos
Re: [Lynx-dev] Missing First Letter?
Hi Tim: Glad you received that sample. Also, thanks for your analysis.
I also bounced that over to Shellworld, but reading over there, instead
of a missing beginning letter, there is an "a" at the beginning of a
word. Both sequences are annoying.
Thanks
Chime
Re: [Lynx-dev] Missing First Letter?
I received the sample Politico email you forwarded. Looking at the
underlying HTML, it looks like there are "fancy quotes" before the
characters that I noticed missing. For example, I see a fancy-quote
(0x201c) followed by the display of "eter Navarro" where the
underlying HTML has the fancy-quote followed by "Peter Navarro".

When using "\" to toggle the view-source, I see the same symptoms
within [...]

So the document at least contains the proper characters. It looks
like Lynx is rendering them (both in regular and view-source) in such
a fashion that the fancy-quote eats the following character.

Bug?

-tkc

On 2021-03-31 09:54, Chime Hart wrote:
> Hi Tim: Well, short of bouncing or forwarding my next Politico
> newsletter your way? Anyway, I am in Debian SID in Linux.
> lynx is already the newest version (2.9.0dev.6-2).
> You can just imagine how annoying it would be to cut-and-paste an
> article with these sort of inconsistencies.
> Chime
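To check whether a saved page contains the fancy-quote at all, separately from how Lynx displays it, a byte-level search may help. This is a rough sketch: `count_fancy_quotes` is a hypothetical helper name, and it only looks for the opening quote U+201C in its UTF-8 form (0xE2 0x80 0x9C), not for other typographic characters.

```sh
#!/bin/sh
# Count the lines of a file that contain U+201C encoded as UTF-8
# (bytes 0xE2 0x80 0x9C, octal 342 200 234).
# LC_ALL=C makes grep compare raw bytes regardless of the user's locale.
count_fancy_quotes() {
    quote=$(printf '\342\200\234')
    LC_ALL=C grep -c "$quote" "$1"
}
```

For example, `count_fancy_quotes saved-mail.html` (a hypothetical file name) prints a line count. A non-zero result on a page that shows dropped letters would be consistent with the observation above, though it does not by itself show where the fault lies.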
Re: [Lynx-dev] Missing First Letter?
Hi Tim: Well, short of bouncing or forwarding my next Politico
newsletter your way? Anyway, I am in Debian SID in Linux.
lynx is already the newest version (2.9.0dev.6-2).
You can just imagine how annoying it would be to cut-and-paste an
article with these sort of inconsistencies.
Chime
Re: [Lynx-dev] Missing First Letter?
Well Thorsten, this goes beyond Politico. Here are 2 lines from a story
in Hollywood Reporter. Notice a w is missing from the first word, also
what finishes line 2:

hat is a big deal. It is a big deal to the city of Boston, it is a big deal to the United States, it is a big deal for Black and brown communities, BNC president and CEO Princell Hair tells The Hollywood Reporter. hat is what you get on BNC that you won get anywhere else.^J

Back again live: my 2 lines in nano became 4 lines in Alpine. The only
thing I can try would be to bounce the next Politico over to
Shellworld, where I have never had this issue, although they are, I
think, on 2.8.9.
Chime
Re: [Lynx-dev] Missing First Letter?
Do you have a particular URL that demonstrates the problem?

This sounds suspiciously like the traditional "drop cap" (which can be
done in CSS alone, keeping the text uninterrupted, but before it
became popular to do in CSS, folks would do it by swapping out the
first real letter with a fancy image of that letter).

I'm not sure about that caret-J aspect of things (it's a common way to
render control characters, and ^J is a newline, so it's not uncommon).
I poked around at multiple Politico pages but didn't encounter any
issues.

It might also help to know which version of lynx you're using and in
what environment (at a console, in the Mac terminal, in Windows, in an
xterm/rxvt/urxvt/st/gnome-terminal/whatever on Linux or a BSD, etc.),
which might also help track down any encoding issues.

-Tim

On 2021-03-31 09:09, Chime Hart wrote:
> Well, this seems strange? On my local machine running the latest
> Debian Lynx 2.9.0dev.6. When reading sites such as Politico, many
> times the first letter of a word is missing. Many times this surrounds
> a bracketed link, but more often, it's the first word of a line.
> Speakup seems to realize there is a symbol as it makes a sound as I
> arrow over it. Has anyone ever seen this? Do I need to switch to
> another character set? I also notice if the page has no HTML, then
> it is seemingly perfect. Some lines of these Politico pages end
> with a caret-J. Thanks so much in advance for any guidance or
> items I can change. Chime
Re: [Lynx-dev] Missing First Letter?
Hi Chime,

I don't know Politico, but I assume they replaced the first letter by
a picture of an intricately designed letter like in old monks' books
and forgot to degrade gracefully for text browsers. What you describe
certainly sounds like that.

bye,
//mirabilos
-- 
22:20⎜ The crazy that persists in his craziness becomes a master
22:21⎜ And the distance between the craziness and geniality is
only measured by the success
18:35⎜ "Psychotics are consistently inconsistent. The essence of
sanity is to be inconsistently inconsistent
[Lynx-dev] Missing First Letter?
Well, this seems strange. On my local machine, running the latest
Debian Lynx (2.9.0dev.6), when reading sites such as Politico, many
times the first letter of a word is missing. Many times this surrounds
a bracketed link, but more often, it's the first word of a line.
Speakup seems to realize there is a symbol, as it makes a sound as I
arrow over it. Has anyone ever seen this? Do I need to switch to
another character set? I also notice that if the page has no HTML, then
it is seemingly perfect. Some lines of these Politico pages end with a
caret-J. Thanks so much in advance for any guidance or items I can
change.
Chime