On 26/01/2026 21:30, Gianluca Cannata wrote:
On Mon, 26 Jan 2026 at 08:38, Amos Jeffries wrote:
On 26/01/2026 04:30, Gianluca Cannata wrote:
Hi Amos, Samuel, The Hurd Team,
this is a follow-up to my previous email; it tries to handle HTTP redirects.
There are two files: the first simply adds a User-Agent header, because some
websites do not like HTTP/1.0 requests without one;
the second implements a simple redirect mechanism for when the first
HEAD request returns a response with a Location header.
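As a rough illustration of the second step, the Location lookup could look something like the sketch below. find_location is a hypothetical helper, not the code in the attached patch; it assumes the response headers are already in a NUL-terminated buffer with CRLF line endings, and it ignores folded (multi-line) header values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>            /* strncasecmp (POSIX) */

/* Hypothetical helper, not part of httpfs: copy the value of the first
   Location header in HEADERS into OUT.
   Returns 1 if found, 0 if absent, -1 if the value does not fit in OUT. */
static int
find_location (const char *headers, char *out, size_t out_len)
{
  static const char name[] = "Location:";
  const size_t name_len = sizeof name - 1;
  const char *line = headers;

  assert (headers != NULL && out != NULL && out_len > 0);

  while (*line != '\0')
    {
      const char *eol = strstr (line, "\r\n");
      size_t line_len = eol ? (size_t) (eol - line) : strlen (line);

      if (line_len == 0)
        break;                  /* Blank line: end of the header block. */

      /* Header field names are case-insensitive. */
      if (line_len >= name_len && strncasecmp (line, name, name_len) == 0)
        {
          const char *value = line + name_len;
          size_t value_len = line_len - name_len;

          /* Trim optional whitespace around the value. */
          while (value_len > 0 && (*value == ' ' || *value == '\t'))
            {
              value++;
              value_len--;
            }
          while (value_len > 0 && (value[value_len - 1] == ' '
                                   || value[value_len - 1] == '\t'))
            value_len--;

          if (value_len >= out_len)
            return -1;          /* Refuse to truncate a URL silently. */
          memcpy (out, value, value_len);
          out[value_len] = '\0';
          return 1;
        }

      if (eol == NULL)
        break;                  /* Last line had no CRLF terminator. */
      line = eol + 2;
    }
  return 0;
}
```

The caller would still need to split the returned value into host and path before reconnecting; that part is left out here.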
Tried with:
settrans -facg /tmp/site ./httpfs -D -L 1 gnu.org/
In the HTML parser for parsing tmp
Connecting to gnu.org via gnu.org:80
HTTP Protocol Verified. Status: 301
Connecting to www.gnu.org via www.gnu.org:80
HTTP Protocol Verified. Status: 200
entering the main loop
ls -1 /tmp/site/
filling out dir tmp
index.html
cat /tmp/site/index.html
Connecting to www.gnu.org via www.gnu.org:80
HTTP Protocol Verified. Status: 200
I have a question: in the next patch, should I focus on removing the
HEAD and using only a GET? This patch does not handle the
case where the GET request itself replies with a Location header.
PS: the output after the ls command is much longer, but I cut it for brevity.
Sincerely,
Gianluca
Looking at just the HEAD handling loop ...
Index: httpfs/http.c
===================================================================
--- httpfs.orig/http.c
+++ httpfs/http.c
@@ -187,72 +187,142 @@ error_t open_connection(struct netnode *
size_t towrite;
char buffer[4096];
ssize_t bytes_read;
+ int redirects_followed = 0;
- /* 1. Target selection.
- * If ip_addr (proxy global variable) is set, we use it.
- * Otherwise we use the node URL.
- */
- const char *target_host = (strcmp (ip_addr, "0.0.0.0") != 0) ? ip_addr : node->url;
-
- /* 2. Agnostic resolution (IPv4/IPv6) */
- if ((err = lookup_host (target_host, &server_addr, &addr_len, &sock_type, &protocol)) != 0) {
- fprintf (stderr, "Cannot resolve host: %s\n", target_host);
- return err;
- }
+ while (1) {
+ if (redirects_followed > max_redirects)
+ return ELOOP;
+
+ /* 1. Target selection.
+ * If ip_addr (proxy global variable) is set, we use it.
+ * Otherwise we use the node URL.
+ */
+ const char *target_host = (strcmp (ip_addr, "0.0.0.0") != 0) ? ip_addr : node->url;
+
+ /* 2. Agnostic resolution (IPv4/IPv6) */
+ if ((err = lookup_host (target_host, &server_addr, &addr_len, &sock_type, &protocol)) != 0) {
+ fprintf (stderr, "Cannot resolve host: %s\n", target_host);
+ return err;
+ }
Might be better to have this as a do {...} while (!err) loop, since at
least one attempt should always happen.
I may be wrong, but isn't while (1) always true, so that it will iterate at
least once, just like a do-while?
Both iterate at least once, but a do-while is much harder to turn into an
infinite loop by accident.
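To make the difference concrete, here is a minimal sketch of a bounded do-while redirect loop. It is a variant of the suggestion above rather than the patch itself: the loop footer carries both the "was that a redirect?" test and the redirect budget, so the only early exit is a failed request. follow_redirects, head_fn, struct script and scripted_head are all hypothetical names made up for illustration; scripted_head is a test double that replays canned status codes instead of talking to a server.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one HEAD request: stores the HTTP status in
   *STATUS and returns 0, or returns an errno value on failure. */
typedef int (*head_fn) (void *state, int *status);

/* Status codes that ask the client to retry at another URL. */
static int
is_redirect_status (int status)
{
  return status == 301 || status == 302 || status == 303
         || status == 307 || status == 308;
}

/* Issue at least one request, then follow at most MAX_REDIRECTS redirects.
   Returns 0 on a non-redirect response, ELOOP if the budget runs out,
   or the error reported by DO_HEAD. */
static int
follow_redirects (head_fn do_head, void *state, int max_redirects)
{
  int redirects_followed = 0;
  int status = 0;
  int redirected;
  int err;

  assert (do_head != NULL && max_redirects >= 0);

  do
    {
      err = do_head (state, &status);
      if (err)
        return err;
      redirected = is_redirect_status (status);
    }
  while (redirected && ++redirects_followed <= max_redirects);

  return redirected ? ELOOP : 0;
}

/* Test double: replays a fixed list of status codes. */
struct script
{
  const int *statuses;
  size_t count;
  size_t pos;
};

static int
scripted_head (void *state, int *status)
{
  struct script *s = state;

  if (s->pos >= s->count)
    return EIO;                 /* Script exhausted. */
  *status = s->statuses[s->pos++];
  return 0;
}
```

With a script of { 301, 200 } and a budget of 1, this makes two requests and returns 0, which mirrors the -L 1 run against gnu.org earlier in the thread; with a budget of 0 it stops after the first 301 and returns ELOOP.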
HTH
Amos