A canonical URL host name dilemma

2021-10-09 Thread Daniel Stenberg via curl-library
Hello friends. Let me take you through a bug, my current work and the little dilemma I'm facing with regard to how to "canonicalize" host names in URLs! I'll end the mail with a question about a possible solution I've thought of. # Not parsing percent-encoded host names in URLs $ curl htt

Re: A canonical URL host name dilemma

2021-10-09 Thread Henrik Holst via curl-library
#D would most likely be the preferred way if it's possible, however it sounds both brittle and the "works differently if not built with IDN support" gives it the kind of "it depends" quality that one perhaps does not want from this API? In essence I think it boils down to the use case of extracting the

Re: A canonical URL host name dilemma

2021-10-09 Thread Daniel Stenberg via curl-library
On Sat, 9 Oct 2021, Henrik Holst wrote: Thanks for your thoughts! > #D would most likely be the preferred way if it's possible, however it sounds both brittle and the "works differently if not built with IDN support" gives it the kind of "it depends" quality that one perhaps does not want from this

Re: A canonical URL host name dilemma

2021-10-10 Thread Daniel Stenberg via curl-library
On Sat, 9 Oct 2021, Daniel Stenberg via curl-library wrote: The question is perhaps then if that new option should rather be A) "don't URL encode host names" or B) "don't URL encode host names that are valid IDN names". Making it A) is way simpler and makes for more predictable behavior. I rea

Re: A canonical URL host name dilemma

2021-10-10 Thread Ray Satiro via curl-library
On 10/10/2021 4:48 AM, Daniel Stenberg via curl-library wrote: > Stick to returning the name *un*-encoded by default in URLs and > introduce a new option that percent-encodes the host name when the URL > is retrieved. > > This, to maintain the existing behavior to a larger extent. Parsing a > URL w

Re: A canonical URL host name dilemma

2021-10-10 Thread Daniel Stenberg via curl-library
On Sun, 10 Oct 2021, Ray Satiro via curl-library wrote: If someone passes https://%63url.se/ what is the disadvantage to storing it as https://curl.se/ and only returning it like that? I don't think there is any disadvantage for that case. The possible disadvantage rather comes when you use n
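
As a rough illustration of the case Ray raises, the sketch below percent-decodes the host name of https://%63url.se/ to show that it names the same host as https://curl.se/. It uses Python's standard urllib.parse rather than libcurl's URL API, and the helper name decode_host is hypothetical. It says nothing about how libcurl itself stores or returns the host.

```python
from urllib.parse import urlsplit, unquote


def decode_host(url):
    """Return the host name of `url` with percent-encoding removed.

    Hypothetical helper for illustration only; this is not libcurl behavior.
    """
    # urlsplit() leaves percent-escapes in the host untouched (it only lowercases).
    host = urlsplit(url).hostname
    if host is None:
        return None
    # unquote() turns sequences such as %63 into the character they encode.
    return unquote(host)


if __name__ == "__main__":
    print(decode_host("https://%63url.se/"))
    print(decode_host("https://curl.se/"))
```

Both calls yield the same plain ASCII host name. IDN host names, mentioned earlier in the thread, are outside the scope of this sketch.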