Hi Jeff,

Thank you for your reply.
The colon has nothing to do with the issue. If we remove the colon, the
issue still persists:

*curl* "a;bc@xyz"
curl: (6) Could not resolve host: xyz

*wget* "a;bc@xyz"
wget: unable to resolve host address ‘a;bc@xyz’

*wget* "abc@xyz"
wget: unable to resolve host address ‘xyz’

So, when the semicolon is included in *userinfo*, wget treats *userinfo* as
part of the hostname.

You can replicate this after disconnecting from your network first.

Thank you,
Bachir

On Mon, Feb 5, 2024 at 10:08 PM Jeffrey Walton <[email protected]> wrote:

> On Mon, Feb 5, 2024 at 4:57 PM Bachir Bendrissou <[email protected]>
> wrote:
> >
> > The url attached example contains a semicolon in the userinfo segment.
> >
> > Wget rejects this url with the following error message:
> >
> > *Bad port number.*
> >
> > It seems that Wget sees "c" as a port number. When "c" is replaced by a
> > digit, Wget accepts the url and attempts to resolve "xyz".
> >
> > It's worth noting that both curl and aria2 accept the url example.
> >
> > Why is the semicolon not allowed in userinfo, despite the fact that other
> > special characters are allowed?
>
> A colon in the userinfo is deprecated but not forbidden. However, an
> application can choose to reject it. From RFC 3986, Uniform Resource
> Identifier (URI): Generic Syntax, Section 3.2,
> <https://datatracker.ietf.org/doc/html/rfc3986#section-3.2>.
>
>    The userinfo subcomponent may consist of a user name and, optionally,
>    scheme-specific information about how to gain authorization to access
>    the resource. The user information, if present, is followed by a
>    commercial at-sign ("@") that delimits it from the host.
>
>       userinfo    = *( unreserved / pct-encoded / sub-delims / ":" )
>
>    Use of the format "user:password" in the userinfo field is
>    deprecated. Applications should not render as clear text any data
>    after the first colon (":") character found within a userinfo
>    subcomponent unless the data after the colon is the empty string
>    (indicating no password). Applications may choose to ignore or
>    reject such data when it is received as part of a reference and
>    should reject the storage of such data in unencrypted form.
>
> According to the BNF in Appendix A, the semicolon ';' is allowed as a
> <sub-delims> token. It does not need to be percent encoded.
>
> Jeff
>
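
For reference, here is a minimal Python sketch of how I read the RFC 3986
grammar quoted above. It is not Wget's or curl's actual parsing code;
split_authority is a hypothetical helper, the input "a;b:c@xyz" is only an
illustrative string, and the sketch deliberately ignores IPv6 literals. It
splits the authority at the last "@" and checks the userinfo against the
userinfo rule, so the ";" stays in userinfo and "xyz" is the host.

```python
import re

# Character classes from RFC 3986, Appendix A.
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"

# userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
USERINFO_RE = re.compile(rf"(?:[{UNRESERVED}{SUB_DELIMS}:]|{PCT_ENCODED})*")


def split_authority(authority):
    """Split 'userinfo@host[:port]' into (userinfo, host, port).

    Hypothetical helper for illustration only. It does not handle
    IPv6 literals such as '[::1]'.
    """
    # Everything before the last "@" is userinfo, per the quoted text.
    userinfo, at, hostport = authority.rpartition("@")
    if not at:
        userinfo = None  # no "@", so there is no userinfo at all
    elif not USERINFO_RE.fullmatch(userinfo):
        raise ValueError(f"invalid userinfo: {userinfo!r}")
    # Only a colon after the "@" can introduce a port.
    host, colon, port = hostport.partition(":")
    return userinfo, host, (port if colon else None)


if __name__ == "__main__":
    for example in ("a;bc@xyz", "a;b:c@xyz", "abc@xyz"):
        print(example, "->", split_authority(example))
```

Under this reading, a colon inside userinfo never produces a port, and the
host is "xyz" in all three cases.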
