[Lynx-dev] Can I use Lynx to archive-as-I-browse in a queue-based workflow?

James R. Haigh (+ML.NonGNU.Lynx subaddress) via Lynx-dev Wed, 03 Dec 2025 06:14:37 -0800

Hello,
    What I'd like to be able to do is to browse, but as I am browsing, to keep 
every downloaded resource in an archive.  Further to this, I do not want to 
visit hyperlinks as soon as I see them, but rather to put them into a queue of 
things to read, and simply to keep reading the very next thing in the queue by 
default, because I think that that would be a much better workflow for avoiding 
getting trapped in rabbitholes -- and there's a sound computer science argument 
to support this view.  I'm not saying to do all of my browsing in a single 
queue, but generally the workflow of a single project should be in its own 
queue, so that I can adjust priorities between queues.  I plan to implement 
this queuing system as a wrapper around Lynx, because I may wish to extend it 
outside of the scope of mere Web-browsing (or even Gopher-browsing).  But I do 
at least need a way to hook into Lynx's GET & POST actions, and I did not see a 
way to do this in the manual page.


    Currently I use W3M which caches everything, and it's easy enough to 
archive this cache, but a problem that I have with it is that it lacks metadata 
-- particularly the URL that was used to download it is not necessarily 
contained within the downloaded resource, and the HTTP status code is not kept 
either.  Unfortunately W3M does not have options for running multiple instances 
from distinct files like Lynx does, so I'm going to try Lynx again for the more 
advanced things that I plan to do here.
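
    A minimal sketch of the kind of archive record that would keep this 
missing metadata, assuming the wrapper script already has the response body, 
URL and HTTP status in hand.  The function name, the content-hash file naming 
and the ".meta.jsonl" sidecar format are all assumptions made for illustration; 
none of this is something that W3M or Lynx provides.

```python
import hashlib
import json
from pathlib import Path


def archive_resource(archive_dir, url, status, body):
    """Store a downloaded body and append its URL and HTTP status to a sidecar.

    Hypothetical layout: the body is saved under its SHA-256 digest, and each
    fetch appends one JSON line to "<digest>.meta.jsonl", so identical content
    fetched from several URLs keeps one record per fetch.
    """
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256(body).hexdigest()
    data_path = archive / digest
    data_path.write_bytes(body)

    record = {"url": url, "status": status, "sha256": digest}
    meta_path = archive / (digest + ".meta.jsonl")
    with meta_path.open("a", encoding="utf-8") as meta_file:
        meta_file.write(json.dumps(record) + "\n")

    return data_path
```

Calling archive_resource("archive", url, 200, body) returns the path of the 
stored body; the sidecar next to it carries the URL and status code that the 
cache alone would lose.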

    Possibly the simplest way to implement this from a wrapper script is to 
call Lynx with an existing local HTML file -- an offline homepage, probably one 
that is updated with a history or index of hyperlinks contained within the 
local archive, thus somewhat mitigating the need for any proprietary & 
privacy-invading search engine as a service -- and then to have Lynx simply 
return the URL of any hyperlink selected or any Goto URL entered, on standard 
output, such that the wrapper script can read this URL string, retrieve it, add 
it to the reading queue, and then open the next resource in the queue -- with 
Lynx if the resource is hypertext, or with whatever other defined external 
viewer applies if the next resource in the queue is an image (e.g. SXIV), a 
video (e.g. MPV), a PDF (e.g. Atril), etc.
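
    The viewer selection described above could be sketched roughly as follows. 
This assumes that the wrapper picks a viewer from the archived file's name 
using Python's standard mimetypes module; the mapping tables and the function 
name are hypothetical, and a real wrapper might prefer a Content-Type recorded 
at download time over guessing from the extension.

```python
import mimetypes

# Viewer names are the examples given in the message; the mapping is assumed.
EXACT_VIEWERS = {
    "text/html": "lynx",
    "application/pdf": "atril",
}
PREFIX_VIEWERS = {
    "image/": "sxiv",
    "video/": "mpv",
}


def viewer_command(path):
    """Return the argv list for the viewer that should open `path`."""
    mime, _encoding = mimetypes.guess_type(path)
    if mime in EXACT_VIEWERS:
        return [EXACT_VIEWERS[mime], path]
    if mime is not None:
        for prefix, viewer in PREFIX_VIEWERS.items():
            if mime.startswith(prefix):
                return [viewer, path]
    # Unknown or unmapped types fall back to Lynx (an arbitrary choice here).
    return ["lynx", path]
```

The returned list can be passed directly to subprocess.run() by the wrapper 
when it opens the next item in the queue.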

    Even if that were possible, it would still not be quite right, because it 
would interrupt reading of the current hypertext document at the 1st hyperlink 
that I wish to (later) visit, by switching to the next resource in the queue 
before having finished reading the current hypertext.  I could try making my 
wrapper script relaunch Lynx with the same hypertext & cursor location as 
before, such as to resume reading at the hyperlink just downloaded, and only 
move to the next resource if Lynx exits with the cursor on the last character 
position.  This would probably be a nice workflow, but would have an annoying 
flicker, may not be very robust, and large downloads would block reading until 
complete.  Although this would avoid the need for hooks where Lynx runs scripts 
itself, Lynx currently does not offer any way that I can see in the manual page 
to return with a cursor position and/or URL, as if from a menu.  So it doesn't 
really help.

    Probably the best way to implement this is if Lynx had a way to hook 
arbitrary aspects of its user interface to external commands or scripts.  I'd 
like to hook goto/visit to a script that 1st runs a background subshell in 
which the URL given as an argument is archived, and then appends the archived 
local file to a reading queue for this project/workflow.  I also wish to be 
able to record which archived document I discovered each hyperlink from, and 
to disambiguate duplicate hyperlinks by noting their hyperlink numbers.  The 
local file URL & hyperlink number should suffice to reconstruct the tree or 
DAG of one's browsing at a later date.
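
    As a rough illustration of why the local file URL and hyperlink number are 
enough, here is a sketch that rebuilds the browsing graph from such records. 
The Discovery record and build_graph function are hypothetical names; the 
sketch assumes only that each queued item was logged with the document it was 
found in and the number of the hyperlink that was followed.

```python
from collections import defaultdict
from typing import NamedTuple


class Discovery(NamedTuple):
    parent: str       # local file URL of the document being read
    link_number: int  # number of the hyperlink within that document
    child: str        # local file URL of the archived target


def build_graph(discoveries):
    """Map each parent document to its (link_number, child) edges, in order.

    Two hyperlinks in one document that point at the same target remain
    distinct edges because their link numbers differ.
    """
    graph = defaultdict(list)
    ordered = sorted(discoveries, key=lambda d: (d.parent, d.link_number))
    for d in ordered:
        graph[d.parent].append((d.link_number, d.child))
    return dict(graph)
```

A document reached from two different parents simply appears as the child of 
both, which is what turns the tree into a DAG.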

    This would still mean that I'd have to quit Lynx each time I was done with 
the current hypertext resource and wished to look at the next item in the 
resource queue.  I would like to be able to hook into some menus in Lynx, such 
as to have a keybinding that displays the remaining queue with the next item 
selected, such that hitting Enter would view this next item, but equally I 
could just as well skip to a later item in the queue.  Maybe even to have 
keybindings that move an item up or down, or remove it from the reading list.  
What's the minimum that Lynx could provide here that would enable me to write 
scripts that implement this sort of thing?
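
    The queue operations mentioned above (view the next item by default, skip 
ahead, move an item up or down, remove it) are small enough to sketch.  This 
is an in-memory illustration only, with hypothetical class and method names; 
how such operations would be bound to keys inside Lynx is exactly the open 
question.

```python
class ReadingQueue:
    """A per-project reading list; index 0 is the next thing to read."""

    def __init__(self, items=()):
        self.items = list(items)

    def add(self, item):
        """Append a newly archived resource to the end of the queue."""
        self.items.append(item)

    def take(self, index=0):
        """Remove and return an item for viewing: the next one by default,
        or a later one when the reader chooses to skip ahead."""
        return self.items.pop(index)

    def move(self, index, offset):
        """Move an item up (offset -1) or down (offset +1), clamped to the
        ends of the queue. Returns the item's new index."""
        target = max(0, min(len(self.items) - 1, index + offset))
        self.items.insert(target, self.items.pop(index))
        return target

    def remove(self, index):
        """Drop an item from the reading list without viewing it."""
        del self.items[index]
```

Keeping one ReadingQueue per project matches the idea of adjusting priorities 
between queues rather than within a single global list.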

    Something that is important to me is that any of this custom script-writing 
is preset at or before the launch of Lynx.  I'm not asking to run arbitrary 
code at runtime, like with clientside JavaScript that is now prolific & 
intrusive on most of the mainstream Web.  I actually do not agree with 
Turing-complete code being included in documents.  What I'm asking for is more 
akin to a plugin or addon to the browser itself, that is lockable to not allow 
any other code than that which was preset at invocation or even installed into 
the system by a privileged user -- so that it can be made to be both secure, 
and a stable interface that does not keep changing and breaking adaptations.

    Please let me know what Lynx can do, what ideas you have about all this, 
and what minimum amount of easy changes to Lynx might allow a much more 
elegant implementation of this workflow.

Kind regards,
James.
-- 
Wealth doesn't bring happiness, but poverty brings sadness.
Sent from Debian with Claws Mail, using email subaddressing as an AI-free 
alternative to error-prone heuristical spam filtering.
