On 26/01/2026 21:30, Gianluca Cannata wrote:
On Mon, 26 Jan 2026 at 08:38, Amos Jeffries wrote:

On 26/01/2026 04:30, Gianluca Cannata wrote:
Hi Amos, Samuel, The Hurd Team,

this is a follow-up to my previous email; it attempts to handle HTTP redirects.

There are two files: the first simply adds a User-Agent header, because some
websites do not like HTTP/1.0 requests without one; the second implements a
simple redirect mechanism for when the first HEAD request returns a response
with a Location header.
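
As a rough illustration of the Location lookup described above (this is not
the code from the patch), the sketch below scans a raw response header block
for a Location header and copies its value. The name find_location and its
signature are hypothetical, and it assumes the whole header block is already
in one NUL-terminated buffer.

```c
#include <string.h>
#include <strings.h>

/* Hypothetical helper, not taken from the patch: copy the value of the
 * Location header from a raw HTTP response header block into OUT.
 * Returns 0 on success, -1 if there is no Location header or the value
 * does not fit in OUT_LEN bytes.  */
int
find_location (const char *headers, char *out, size_t out_len)
{
  const char *line = headers;

  while (line != NULL && *line != '\0')
    {
      /* An empty line ends the header block; stop before the body.  */
      if (line[0] == '\r' || line[0] == '\n')
        break;

      /* Header names are case-insensitive, so compare accordingly.  */
      if (strncasecmp (line, "Location:", 9) == 0)
        {
          const char *value = line + 9;
          size_t len;

          /* Skip optional whitespace after the colon.  */
          while (*value == ' ' || *value == '\t')
            value++;

          len = strcspn (value, "\r\n");
          if (len == 0 || len >= out_len)
            return -1;

          memcpy (out, value, len);
          out[len] = '\0';
          return 0;
        }

      /* Advance to the start of the next line.  */
      line = strchr (line, '\n');
      if (line != NULL)
        line++;
    }

  return -1;
}
```

Called on a 301 response whose headers contain "Location: http://www.gnu.org/",
this would fill the output buffer with that URL; the caller would still have
to split it into host and path before reconnecting.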

Tried with:

settrans -facg /tmp/site ./httpfs -D -L 1 gnu.org/
In the HTML parser for parsing tmp
Connecting to gnu.org via gnu.org:80
HTTP Protocol Verified. Status: 301
Connecting to www.gnu.org via www.gnu.org:80
HTTP Protocol Verified. Status: 200
entering the main loop

ls -1 /tmp/site/
filling out dir tmp
index.html

cat /tmp/site/index.html
Connecting to www.gnu.org via www.gnu.org:80
HTTP Protocol Verified. Status: 200

I have a question: in the next patch, should I focus on removing the
HEAD request and using only a GET? I ask because this patch does not
handle the case where the GET request itself replies with a Location header.

PS: the output after the ls command is much longer, but I cut it off for brevity.

Sincerely,

Gianluca


Looking at just the HEAD handling loop ...


Index: httpfs/http.c
===================================================================
--- httpfs.orig/http.c
+++ httpfs/http.c
@@ -187,72 +187,142 @@ error_t open_connection(struct netnode *
         size_t towrite;
         char buffer[4096];
         ssize_t bytes_read;
+       int redirects_followed = 0;

-       /* 1. Target selection.
-        * If ip_addr (proxy global variable) is set, we use it.
-        * Otherwise we use the node URL.
-        */
-       const char *target_host = (strcmp (ip_addr, "0.0.0.0") != 0) ? ip_addr : node->url;
-
-       /* 2. Agnostic resolution (IPv4/IPv6) */
-       if ((err = lookup_host (target_host, &server_addr, &addr_len, &sock_type, &protocol)) != 0) {
-               fprintf (stderr, "Cannot resolve host: %s\n", target_host);
-               return err;
-       }
+       while (1) {
+               if (redirects_followed > max_redirects)
+                       return ELOOP;
+
+               /* 1. Target selection.
+                * If ip_addr (proxy global variable) is set, we use it.
+                * Otherwise we use the node URL.
+                */
+               const char *target_host = (strcmp (ip_addr, "0.0.0.0") != 0) ? ip_addr : node->url;
+
+               /* 2. Agnostic resolution (IPv4/IPv6) */
+               if ((err = lookup_host (target_host, &server_addr, &addr_len, &sock_type, &protocol)) != 0) {
+                       fprintf (stderr, "Cannot resolve host: %s\n", target_host);
+                       return err;
+               }

Might be better to have this as a do {...} while (!err) loop, since at
least one attempt should always happen.

I may be wrong, but isn't while (1) always true, so it will iterate at
least once just like a do-while?


Both iterate at least once, but a do-while is much harder to turn into an infinite loop by accident.
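
A minimal sketch of that shape, for illustration only: the exit condition of
the do-while names both the redirect test and the hop limit, so the loop
cannot spin forever. redirects_followed and max_redirects come from the
patch; follow_redirects, request_fn, struct script and scripted_request are
hypothetical names, and the scripted callback merely stands in for the real
network request.

```c
#include <errno.h>

/* Hypothetical callback type: performs one request for the current
 * target and returns the HTTP status code, or a negative value on
 * failure.  */
typedef int (*request_fn) (void *state);

/* Assumption: only these status codes are treated as redirects.  */
static int
is_redirect (int status)
{
  return status == 301 || status == 302 || status == 303
         || status == 307 || status == 308;
}

/* Sketch only: issue at least one request, then keep going while the
 * reply is a redirect and the hop limit has not been exceeded.
 * Returns 0 on a non-redirect status, ELOOP when the limit is
 * exceeded, EIO when the request itself fails.  */
int
follow_redirects (request_fn request, void *state, int max_redirects,
                  int *final_status)
{
  int redirects_followed = 0;
  int status;

  do
    {
      status = request (state);
      if (status < 0)
        return EIO;
    }
  while (is_redirect (status) && ++redirects_followed <= max_redirects);

  *final_status = status;
  return is_redirect (status) ? ELOOP : 0;
}

/* Stand-in for the network: replays a fixed list of status codes.  */
struct script
{
  const int *codes;
  int count;
  int next;
};

int
scripted_request (void *state)
{
  struct script *s = state;
  return (s->next < s->count) ? s->codes[s->next++] : -1;
}
```

With max_redirects set to 1, a scripted 301 followed by a 200 ends normally
after two requests, while an endless run of 301 replies stops after the
second request with ELOOP.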


HTH
Amos
