Encoding URL paths changes the requested URL and therefore may yield
different responses (opposed to an unencoded URL), solely depending on
how the server implements de/encoding.

Thus it is imperative to inform users about the factually requested URL
in case servers behave unexpectedly.  Consider this example:

        $ ftp http://openbsd.org/[
        Trying 64:ff9b::8180:5c2...
        Requesting http://openbsd.org/[
        Redirected to https://www.openbsd.org/[
        Trying 64:ff9b::8180:5c2...
        Requesting https://www.openbsd.org/[
        ftp: Error retrieving https://www.openbsd.org/[: 404 Not Found

ftp(1) actually encodes "[" into "%5b" before the request, but it
pretends to have sent "/[" in the GET request as per its output.


Fix this and always print exactly what was requested to help users debug
(server related) de/encoding issues:

        ./obj/ftp http://openbsd.org/[
        Trying 64:ff9b::8180:5c2...
        Requesting http://openbsd.org/%5b
        Redirected to https://www.openbsd.org/[
        Trying 64:ff9b::8180:5c2...
        Requesting https://www.openbsd.org/%5b
        ftp: Error retrieving https://www.openbsd.org/%5b: 404 Not Found

This matches exactly what is seen on the wire, e.g. with tshark(1).

In this case of openbsd.org, the server's 301 Moved Permanently's
Location header does contain the decoded path "/[", so that output is
indeed correct and ftp encodes it again as expected before following it.

Other servers, e.g. ccc.de, respont with 302 Moved Temporarily and retain
the decoded URL path:

        $ ./obj/ftp http://ccc.de/[
        Trying 2001:67c:20a0:2:0:164:0:39...
        Requesting http://ccc.de/%5b
        Redirected to https://www.ccc.de/%5b
        Trying 2001:67c:20a0:2:0:164:0:39...
        Requesting https://www.ccc.de/%5b
        ftp: Error retrieving https://www.ccc.de/%5b: 404 Not Found

ftp follows the Location header but has nothing to (re)encode and with
this diff above about also matches exactly what is seen on the wire.

Feeback? Objections? OK?
NB:  I'd do media tests before committing since ftp grows a little and
this code lands in the SMALL version as well.

Index: fetch.c
===================================================================
RCS file: /cvs/src/usr.bin/ftp/fetch.c,v
retrieving revision 1.206
diff -u -p -r1.206 fetch.c
--- fetch.c     6 Nov 2021 14:27:45 -0000       1.206
+++ fetch.c     6 Nov 2021 16:58:14 -0000
@@ -320,6 +320,7 @@ url_get(const char *origline, const char
        int isunavail = 0, retryafter = -1;
        struct addrinfo hints, *res0, *res;
        const char *savefile;
+       char *eurl = NULL;
        char *pathbuf = NULL;
        char *proxyurl = NULL;
        char *credentials = NULL, *proxy_credentials = NULL;
@@ -713,10 +714,17 @@ noslash:
 #endif /* !NOSSL */
 
        epath = url_encode(path);
+       if (asprintf(&eurl, "%s%s%s%s/%s",
+           scheme,
+           full_host,
+           portnum ? ":" : "",
+           portnum ? portnum : "",
+           epath) == -1)
+               errx(1, "Cannot build encoded URL");
        if (proxyurl) {
                if (verbose) {
                        fprintf(ttyout, "Requesting %s (via %s)\n",
-                           origline, proxyurl);
+                           eurl, proxyurl);
                }
                /*
                 * Host: directive must use the destination host address for
@@ -735,7 +743,7 @@ noslash:
                ftp_printf(fin, "\r\n");
        } else {
                if (verbose)
-                       fprintf(ttyout, "Requesting %s\n", origline);
+                       fprintf(ttyout, "Requesting %s\n", eurl);
 #ifndef SMALL
                if (resume || timestamp) {
                        if (stat(savefile, &stbuf) == 0) {
@@ -833,7 +841,7 @@ noslash:
        status = strtonum(ststr, 200, 503, &errstr);
        if (errstr) {
                strnvis(gerror, cp, sizeof gerror, VIS_SAFE);
-               warnx("Error retrieving %s: %s", origline, gerror);
+               warnx("Error retrieving %s: %s", eurl, gerror);
                goto cleanup_url_get;
        }
 
@@ -879,7 +887,7 @@ noslash:
                break;
        default:
                strnvis(gerror, cp, sizeof gerror, VIS_SAFE);
-               warnx("Error retrieving %s: %s", origline, gerror);
+               warnx("Error retrieving %s: %s", eurl, gerror);
                goto cleanup_url_get;
        }
 
@@ -1012,7 +1020,7 @@ noslash:
        if (isunavail) {
                if (retried || retryafter != 0)
                        warnx("Error retrieving %s: 503 Service Unavailable",
-                           origline);
+                           eurl);
                else {
                        if (verbose)
                                fprintf(ttyout, "Retrying %s\n", origline);
@@ -1134,6 +1142,7 @@ improper:
        warnx("Improper response from %s", host);
 
 cleanup_url_get:
+       free(eurl);
 #ifndef SMALL
        free(full_host);
 #endif /* !SMALL */

Reply via email to