Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread Chime Hart
Well, Karen, I certainly agree: figuring out some seamless way of processing 
sites would be wonderful. I mean, as we've discussed on this list, there are 
sites I go to nearly every day which give that 403 error, so I either run an 
L Y N X external, or I just see a blank page, go into the options menu, and 
uncheck the user agent box. I shouldn't have to do all of that.
Now to answer your question, wget is mostly used to grab files from URLs which 
begin with http. wget had an easier time with that article, so I processed it 
locally.

Chime




Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread Karen Lewellen

I have no idea what wget is, Chime, so I cannot provide an opinion.
What I am wondering, though, is whether l y n x should expand the default 
user agent options to help bypass issues of this kind?

Karen



On Mon, 2 May 2022, Chime Hart wrote:

Well, thanks Karen, I've always had issues understanding links and elinks. 
Meanwhile, I did finally have some success the long way around. I ran wget 
with that url, changed the name from index.html.1 to gri.html. Then I ran 
rdrview -b lynx gri.html  which gave me a clear 52 lines of an article 
without any tool bars. The funny thing was that wget would even grab that 
story. Now, I am wondering if adding wget as an external in LYNX would help?

Chime







Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread Karen Lewellen

David,
How can anyone meet a legal obligation by testing with a single tool like 
Jaws?
The technical baseline clause in at least some editions of WCAG 
discourages that sort of thing, unless you are providing the tool in 
question.
After all, major sites doing business across borders would have a hard 
time using Jaws alone as their basis.

And what about other populations who are equally entitled to access?
Karen



On Tue, 3 May 2022, David Woolley wrote:


On 03/05/2022 00:15, Chime Hart wrote:

 Since I really have little understanding of user agent strings, is there
 anything specific I can type in L Y N X to process this article


User agents have been a mess for a long time, although it looks like the 
original standard is beginning to be reinstated.


In the very early days of graphical browsers, sites used to discriminate 
against some browsers and as a result Internet Explorer started claiming to 
be Netscape and adding its real identity as a comment.  It looks like the 
latest versions still do that.


As such if you are up against browser discrimination, you just have to 
experiment, to work out the rules the site uses.  There is a database of them 
here: , which looks 
like it isn't Ajax based.


The other big problem for Lynx is that many sites, probably most of the 
major sites, do not send the common browsers a document, but rather a program 
which loads data and formats it into the page.  You can sometimes recognize 
these on graphical browsers, either because you get a placeholder format, 
with grey blocks for notional text, before the actual text appears, or by the 
way that individual parts of the page fail independently.


Of the ones I've looked at in some detail, both ancestry.com and Nextdoor 
normally operate in this way.  The main reason they might create a simple 
text version is for search engines, not for end users. If you are very lucky 
they may also consider text-only browsers, but there is probably no 
commercial imperative for this, as they can generally meet their legal 
obligations by creating pages that work with Jaws and a graphical browser.


A lot more probably do this as well.  Maybe most of the well known sites; I 
haven't looked at how the typical small business site (the one page sites 
with sections that replace each other as you scroll through the page) work, 
when it comes to document mode fall back.


From what is going on under the hood on the site you are looking at, they 
have absolutely no interest in providing a static document.  The site is 
behaving like one that is there to attract people to the adverts.








Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread Chime Hart
Well, thanks Karen, I've always had issues understanding links and elinks. 
Meanwhile, I did finally have some success the long way around. I ran wget with 
that url, changed the name from index.html.1 to gri.html. Then I ran rdrview -b 
lynx gri.html  which gave me a clear 52 lines of an article without any tool 
bars. The funny thing was that wget would even grab that story. Now, I am 
wondering if adding wget as an external in LYNX would help?

Chime
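The long way around described above (fetch the page to a local file, then read the file in a text browser) can be sketched roughly as follows. This is only an illustration: the user-agent value is a guess, the function names are made up for the example, and the rdrview step is left out because its options are not confirmed here. The lynx options -dump and -nolist are standard ones.

```python
import subprocess
import urllib.request
from pathlib import Path

# Assumed value for illustration; the thread does not say which string the site accepts.
ASSUMED_UA = "Wget/1.21"


def save_page(url: str, dest: Path, user_agent: str = ASSUMED_UA) -> Path:
    """Download url to dest, roughly what `wget -O dest url` does."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        dest.write_bytes(response.read())
    return dest


def lynx_dump_command(path: Path) -> list[str]:
    """Build a command that renders a local HTML file as plain text with lynx."""
    return ["lynx", "-dump", "-nolist", str(path)]


def render(path: Path) -> str:
    """Run lynx on the saved file and return the text it prints."""
    result = subprocess.run(
        lynx_dump_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling save_page(url, Path("gri.html")) and then render(Path("gri.html")) would mirror the two manual steps, assuming lynx is installed and the server accepts the request.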




Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread David Woolley

On 03/05/2022 00:15, Chime Hart wrote:
Since I really have little understanding of user agent strings, is there 
anything specific I can type in L Y N X to process this article


User agents have been a mess for a long time, although it looks like the 
original standard is beginning to be reinstated.


In the very early days of graphical browsers, sites used to discriminate 
against some browsers and as a result Internet Explorer started claiming 
to be Netscape and adding its real identity as a comment.  It looks like 
the latest versions still do that.
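To make the "claiming to be Netscape" point concrete, here is a small sketch that splits a user-agent string into its product tokens and the comment in parentheses. The example string is of the kind Internet Explorer 6 sent; the parser itself is a simplification written for this illustration, not a full implementation of the header grammar.

```python
import re

# One product token such as "Mozilla/4.0", optionally followed by a "(...)" comment.
TOKEN = re.compile(
    r"(?P<product>[^\s/()]+)(?:/(?P<version>[^\s()]+))?(?:\s*\((?P<comment>[^)]*)\))?"
)


def parse_user_agent(value: str) -> list[dict]:
    """Split a User-Agent header into product/version/comment parts."""
    parts = []
    for match in TOKEN.finditer(value):
        comment = match.group("comment")
        parts.append({
            "product": match.group("product"),
            "version": match.group("version"),
            "comment": [c.strip() for c in comment.split(";")] if comment else [],
        })
    return parts


example = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
parsed = parse_user_agent(example)
# The leading product says "Mozilla"; the real identity sits in the comment.
print(parsed[0]["product"], parsed[0]["comment"])
```

A site that only checks the leading product token sees "Mozilla", which is how the disguise got past the early checks.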


As such if you are up against browser discrimination, you just have to 
experiment, to work out the rules the site uses.  There is a database of 
them here: , 
which looks like it isn't Ajax based.
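The experimenting can be automated to a degree. The sketch below requests the same URL with several user-agent strings and records the HTTP status each one gets. The candidate strings are examples only, none is known to be accepted by the site in question, and the function names are invented for this illustration.

```python
import urllib.error
import urllib.request

# Hypothetical candidates; none of these is known to be accepted by the site.
CANDIDATES = [
    "Lynx/2.9.0",
    "Links (2.25; Linux x86_64; text)",
    "Mozilla/5.0 (X11; Linux x86_64; rv:100.0) Gecko/20100101 Firefox/100.0",
]


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a GET request carrying the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def status_for(url: str, user_agent: str) -> int:
    """Return the HTTP status the server gives for this user agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # for example 403 when the server rejects the agent


def probe(url: str, agents=CANDIDATES) -> dict:
    """Map each candidate user agent to the status code it receives."""
    return {agent: status_for(url, agent) for agent in agents}
```

Only HTTP-level refusals are caught here. A failure below HTTP, such as a TLS connection being cut off, would surface as an exception instead of a status code.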


The other big problem for Lynx is that many sites, probably most of the 
major sites, do not send the common browsers a document, but rather a 
program which loads data and formats it into the page.  You can 
sometimes recognize these on graphical browsers, either because you get 
a placeholder format, with grey blocks for notional text, before the 
actual text appears, or by the way that individual parts of the page 
fail independently.
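A rough way to check for this without a graphical browser is to look at the HTML the server actually sends: if it contains script elements but almost no readable text, the page is probably assembled in the browser. The sketch below is a heuristic only, and the 200-character threshold is an arbitrary assumption.

```python
from html.parser import HTMLParser


class TextVersusScript(HTMLParser):
    """Count visible text characters and script elements in a page."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.script_tags = 0
        self._skip = None  # name of the script/style element we are inside, if any

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_tags += 1
        if tag in ("script", "style"):
            self._skip = tag

    def handle_endtag(self, tag):
        if tag == self._skip:
            self._skip = None

    def handle_data(self, data):
        # Only text outside script and style elements counts as readable.
        if self._skip is None:
            self.text_chars += len(data.strip())


def looks_script_built(html: str, min_text: int = 200) -> bool:
    """Guess that a page is built by JavaScript when it carries little text."""
    counter = TextVersusScript()
    counter.feed(html)
    counter.close()
    return counter.script_tags > 0 and counter.text_chars < min_text
```

A page that trips this check will most likely look empty in Lynx no matter which user agent is sent.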


Of the ones I've looked at in some detail, both ancestry.com and 
Nextdoor normally operate in this way.  The main reason they might 
create a simple text version is for search engines, not for end users. 
If you are very lucky they may also consider text-only browsers, but 
there is probably no commercial imperative for this, as they can 
generally meet their legal obligations by creating pages that work with 
Jaws and a graphical browser.


A lot more probably do this as well.  Maybe most of the well known 
sites; I haven't looked at how the typical small business site (the one 
page sites with sections that replace each other as you scroll through 
the page) work, when it comes to document mode fall back.


From what is going on under the hood on the site you are looking at, 
they have absolutely no interest in providing a static document.  The 
site is behaving like one that is there to attract people to the adverts.





Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread Karen Lewellen

Hi Chime,
The site does work when using links the chain,
Elinks as well.
If you truly want the file, it can be saved in links the chain and then 
likely run in l y n x.
Granted, I did not seek the story itself, just visited the main page in 
links the chain.
When on content like an article, press the escape key, then arrow down 
until you hear "save formatted document".
Press the enter key; you will either have a file name to remove first, or 
a blank.

Save the file as
whatever.html
choosing the file name you wish.
Then run lynx whatever.html,
which should give you a file in lynx that you can save to something else.
Does that resonate?
Karen



On Mon, 2 May 2022, Chime Hart wrote:

Thank you David for your analysis. Since I really have little understanding 
of user agent strings, is there anything specific I can type in L Y N X to 
process this article, which I not only want to save the text of, but 
eventually explore what else is there? Thanks so much in advance

Chime







Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread Thomas Dickey
On Mon, May 02, 2022 at 03:22:43PM -0700, Chime Hart wrote:
> Hi All: 1 of my google news-alerts pointed me to an article which first gave
> me a 403 error, but even after adjusting an option for user agent, still says
> "cannot connect". This does work in elinks, also LINKS, and w3m. However,
> there is something else I am not understanding. In w3m there are maybe 171
> lines on 1 page, while both other browsers have at least 2 pages. I have my
> screen set at 180 lines by 270 columns. Here is a URL:
> https://globalriskinsights.com/2022/05/the-russo-ukrainian-war-and-nagorno-karabakhs-faltering-ceasefire/
> I would rather access this in L Y N X as I better understand how to save
> files. I also want to run this story through rdrview. I've also just tried
> the main site, same result. Thanks so much in advance.
> Chime

There are two problems:

a) the 403 is returned because of Lynx's user-agent string.

b) with the user-agent string suppressed, gnutls (or lynx) returns an error:
   
HTTP: Hit unexpected network read error; aborting connection; status 0:The TLS 
connection was non-properly terminated..

That might be some particular detail of the algorithms used.
It could be a bug in lynx -- or like the user-agent...

Lynx's trace says

Secure 256-bit TLS1.3 (ECDHE_RSA_AES_256_GCM_SHA384) HTTP connection

links and Elinks' "=" screen show similar results - but not identical:

SSL Cipher: TLS1.3 - ECDHE-RSA - AES-256-GCM - AEAD - X.509  (compr: NULL)

For instance, that AEAD might be relevant.
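For a further data point when comparing these, the negotiated TLS version and cipher suite can be read from outside any browser. The sketch below uses Python's ssl module, which is built on OpenSSL and not gnutls, so it shows what the server agrees to with a different TLS library and does not reproduce lynx's handshake. The function names are invented for this illustration. Note that AES-256-GCM is itself an AEAD mode, so the two summaries above may well describe the same suite.

```python
import socket
import ssl


def describe(cipher: tuple, version: str) -> str:
    """Format the (name, protocol, bits) tuple returned by SSLSocket.cipher()."""
    name, _protocol, bits = cipher
    return f"{version} {name} ({bits}-bit)"


def negotiated(host: str, port: int = 443) -> str:
    """Connect to host and report the TLS version and cipher suite agreed on."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=30) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return describe(tls.cipher(), tls.version())


# Usage (needs network access): print(negotiated("globalriskinsights.com"))
```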

w3m's "=" screen shows a lot about the signature, but no summary like those.

(by the way, links (not elinks) also says it's using brotli compression,
but -- I added that in lynx recently -- it doesn't affect the result).

-- 
Thomas E. Dickey 
https://invisible-island.net
ftp://ftp.invisible-island.net




Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread Chime Hart
Thank you David for your analysis. Since I really have little understanding of 
user agent strings, is there anything specific I can type in L Y N X to process 
this article, which I not only want to save the text of, but eventually explore 
what else is there? Thanks so much in advance

Chime




Re: [Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread David Woolley

On 02/05/2022 23:22, Chime Hart wrote:
I would rather access this in L Y N X as I better understand how to save 
files. I also want to run this story through rdrview. I've also just 
tried the main site, same result. Thanks so much in advance.


That may well be the problem.  Unfortunately Lynx has a reputation for 
being used for grabbing content, which site owners will often treat as 
information theft.  It is possible that the Links user agent is being 
treated as an online reader.


Accessing it on Firefox, it seems to be very bandwidth hogging, as it is 
continually fetching multiple files, even when you are not doing anything. 
It wouldn't surprise me if it normally constructs pages on the fly, in 
the browser, but has a fall back mode for text browsers it considers 
safe.  It was updating so much that I couldn't really get at the key 
files to see if the real content was Ajax data.


You might want to try using Links' native user agent string.  The 
standard fake strings may make it assume you are using an Ajax platform.




[Lynx-dev] Why Won't this Site Work in LYNX?

2022-05-02 Thread Chime Hart
Hi All: 1 of my google news-alerts pointed me to an article which first gave me 
a 403 error, but even after adjusting an option for user agent, still says 
"cannot connect". This does work in elinks, also LINKS, and w3m. However, there 
is something else I am not understanding. In w3m there are maybe 171 lines on 
1 page, while both other browsers have at least 2 pages. I have my screen set 
at 180 lines by 270 columns. Here is a URL:

https://globalriskinsights.com/2022/05/the-russo-ukrainian-war-and-nagorno-karabakhs-faltering-ceasefire/
I would rather access this in L Y N X as I better understand how to save files. 
I also want to run this story through rdrview. I've also just tried the main 
site, same result. Thanks so much in advance.

Chime



Re: [Lynx-dev] 'File exists. Overwrite? (y/n)'

2022-05-02 Thread Chime Hart

Well, maybe best of both worlds, y/n/a for append.
Chime




Re: [Lynx-dev] 'File exists. Overwrite? (y/n)'

2022-05-02 Thread Karen Lewellen

I, on the other hand, love love love this tool just as it is.
As I manage information, sometimes on the same subject but from different 
sources, this lets me keep files I want to download later organized.
Then there are the many times a week when I am updating a reading source, say 
a new chapter of fanfiction.
I can overwrite the story file with the one including the new chapter, which is 
better than my prior method of creating a file name that includes the 
chapter title.

One man's pleasure is another man's poison, I dare say.



On Mon, 2 May 2022, russellb...@gmail.com wrote:


When I get this, I always say no, but that returns me to the
prompt to enter the name of the target file.  I have to hit ctrl-u to
clear it to get out.  I wish this prompt gave me the choice of
yes/no/cancel.  I *know* I can hit ctrl-g, but that still leaves me at
the save prompt.

russell bell






[Lynx-dev] 'File exists. Overwrite? (y/n)'

2022-05-02 Thread russellbell
When I get this, I always say no, but that returns me to the
prompt to enter the name of the target file.  I have to hit ctrl-u to
clear it to get out.  I wish this prompt gave me the choice of
yes/no/cancel.  I *know* I can hit ctrl-g, but that still leaves me at
the save prompt.

russell bell
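The wished-for choices, together with the append option suggested elsewhere in this thread, could be modelled as below. This is purely an illustration of the decision logic. It is not how Lynx implements its prompt, and the key letters are assumptions.

```python
from enum import Enum


class Choice(Enum):
    OVERWRITE = "y"
    RETRY_NAME = "n"
    APPEND = "a"
    CANCEL = "c"


def parse_choice(reply: str):
    """Map a one-key reply to a Choice, or None if the key is not recognized."""
    key = reply.strip().lower()[:1]
    for choice in Choice:
        if choice.value == key:
            return choice
    return None


def next_step(choice: Choice) -> str:
    """Say what a save dialog would do next for each choice."""
    return {
        Choice.OVERWRITE: "replace the existing file",
        Choice.RETRY_NAME: "return to the file name prompt",
        Choice.APPEND: "add the new text to the end of the existing file",
        Choice.CANCEL: "leave the save dialog entirely",
    }[choice]
```

The point of a separate cancel choice is that "no" and "get me out of here" stop being the same answer.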