OK, I'm convinced that these are not "real" hosts, and that breaking them will not result in actual user-visible breakage. Would it be possible to ship this with a server-side flag that'd enable us to quickly revert in case we're wrong?
On Thu, Aug 19, 2021 at 6:09 PM Matt Menke <mme...@google.com> wrote:

> So given information on this thread, I think the ways these can succeed are possibly:
> 1) Entries in the HOSTS file.
> 2) Intermediary DNS servers or MitMs typo squatting, or actively attacking the user.
> 3) Local DNS servers providing additional DNS results.
> 4) Suffix search (which would trigger on, e.g., "1533.67.89" but not "67.89.1533")
> 5) mDNS
> 6) Local tools injecting DNS lookup results.
>
> Unfortunately, we don't have a good way to gather hard data on which are more common. Given that there are potential security implications here, I'm reluctant to wait for another round of data gathering, though we could probably distinguish cases 4) and 5) from the others, and 1) as well, at least on some platforms. Also not sure how useful just knowing about those cases would be.
>
> On Thursday, August 19, 2021 at 11:54:29 AM UTC-4 Harald Alvestrand wrote:
>
>> Thanks - I misremembered which end of the class B network got jammed together. 67.89.31533 is indeed the one that is "legal" syntax, and so of course 67.89.1 is too.
>>
>> On Thu, Aug 19, 2021 at 3:20 PM Matt Menke <mme...@google.com> wrote:
>>
>>> Note that 127.0.1 is mapped to 127.0.0.1 by GURL, following the URL spec, so that would generally not make it to the DNS resolver (unless something tried to resolve it directly). Per the URL spec, "31533.67.89" would not be normalized by GURL, but "67.89.31533" would be converted to 67.89.123.45. My instrumentation was at the DNS layer, so "127.0.1" and "67.89.31533" would not show up as problematic hostnames in my metrics, though "31533.67.89" would.
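For readers following along, the normalization described in the message above (127.0.1 becoming 127.0.0.1, 67.89.31533 becoming 67.89.123.45, and 31533.67.89 being left alone) can be sketched roughly as below. This is a minimal Python sketch of my reading of the URL spec's IPv4 number parsing and its updated "ends in a number" check, not GURL's actual code. The function names (parse_ipv4_number, ends_in_number, parse_ipv4, classify) are made up for illustration, and percent-decoding, IDNA and forbidden-character checks are deliberately left out.

```python
def parse_ipv4_number(part):
    """Parse one dotted component as decimal, octal (leading 0) or hex (0x).

    Returns an int, or None if the component is not a number.
    """
    if part == "":
        return None
    radix = 10
    if part[:2].lower() == "0x":
        part, radix = part[2:], 16
    elif len(part) >= 2 and part[0] == "0":
        part, radix = part[1:], 8
    if part == "":
        return 0  # a bare "0x" counts as zero
    digits = "0123456789abcdef"[:radix]
    if any(c not in digits for c in part.lower()):
        return None
    return int(part, radix)


def ends_in_number(host):
    """True if the last label (ignoring one trailing dot) looks numeric."""
    parts = host.split(".")
    if parts[-1] == "":
        if len(parts) == 1:
            return False
        parts.pop()
    last = parts[-1]
    if last != "" and last.isascii() and last.isdigit():
        return True
    return parse_ipv4_number(last) is not None


def parse_ipv4(host):
    """Return the 32-bit address for a (possibly shortened) IPv4 host, else None."""
    parts = host.split(".")
    if parts[-1] == "" and len(parts) > 1:
        parts.pop()
    if len(parts) > 4:
        return None
    numbers = [parse_ipv4_number(p) for p in parts]
    if any(n is None for n in numbers):
        return None
    if any(n > 255 for n in numbers[:-1]):
        return None
    # The last component fills all remaining bytes of the address.
    if numbers[-1] >= 256 ** (5 - len(numbers)):
        return None
    ip = numbers[-1]
    for i, n in enumerate(numbers[:-1]):
        ip += n * 256 ** (3 - i)
    return ip


def classify(host):
    """Classify a host under the updated rule: ipv4, invalid, or domain."""
    if not ends_in_number(host):
        return ("domain", host)
    ip = parse_ipv4(host)
    if ip is None:
        return ("invalid", None)
    return ("ipv4", ".".join(str((ip >> s) & 255) for s in (24, 16, 8, 0)))


for h in ["127.0.1", "67.89.31533", "31533.67.89", "foo.127.1", "127.1",
          "127.0.0.0.1", "10.0.0.09", "foo.1.", "foo.1..", "3com.com"]:
    print(h, "->", classify(h))
```

In this sketch, "31533.67.89", "foo.127.1", "127.0.0.0.1" and "10.0.0.09" all come out as invalid, which is the class of hostname this intent proposes to reject. Before the spec change they would have fallen through to the "domain" branch and been handed to DNS.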
>>>
>>> On Thu, Aug 19, 2021 at 9:10 AM Harald Alvestrand <h...@google.com> wrote:
>>>
>>>> An interesting property of all-numeric hostnames is that they *may* be legitimate IPv4 addresses using highly archaic IP address formats - we're so used to the 123.45.67.89 syntax that we forget that 31533.67.89 once was regarded as a legitimate way to encode the same address (class B notation).
>>>>
>>>> There are also certain resolvers that will "helpfully" map an all-numeric hostname presented in DNS to an IP address without asking anyone.
>>>> So if those two bugs (or "archaic features") occur together, the result may be a successful resolution.
>>>>
>>>> No idea why it would occur more often on Android than on Windows, though. And my Linux boxes don't resolve 127.0.1 to anything.
>>>>
>>>> On Thu, Aug 19, 2021 at 2:42 PM Yoav Weiss <yoav...@chromium.org> wrote:
>>>>
>>>>> Interesting! What happens then in the "successful resolution" case Matt mentioned?
>>>>>
>>>>> On Thu, Aug 19, 2021 at 2:39 PM Harald Alvestrand <h...@google.com> wrote:
>>>>>
>>>>>> Department of odd facts:
>>>>>>
>>>>>> - The ICANN rules for new TLDs forbid all top level domain names that start with a digit
>>>>>> - The IDNA rules for bidirectional scripts forbid domain names that start with a digit (Unicode bidi aficionados will know why)
>>>>>> - The only real reason why leading digits aren't outlawed in domain names at the second level is 3com.
>>>>>>
>>>>>> It seems safe to say that no legitimate fully qualified hostname will ever have a last component consisting only of digits.
>>>>>> That means the only time we could get a legitimate hostname is for something that has to be resolved via a search path.
>>>>>>
>>>>>> On Thu, Aug 19, 2021 at 2:33 PM Yoav Weiss <yoav...@chromium.org> wrote:
>>>>>>
>>>>>>> On Thu, Aug 19, 2021 at 2:28 PM Matt Menke <mme...@google.com> wrote:
>>>>>>>
>>>>>>>> I created the title using Chrome Status's deprecation template, so any confusion should be blamed on that.
>>>>>>>
>>>>>>> +Jason Robbins - on the title issues.
>>>>>>>
>>>>>>>> I used the "Draft Intent to Deprecate and Remove email" button, and assume I'd need to do a "Draft Intent to Ship email" before shipping to stable, after a 50% trial on prerelease channels.
>>>>>>>
>>>>>>> There's no need for 2 emails for removals. We can discuss the full deprecation, experimentation/trials and removal on stable here.
>>>>>>>
>>>>>>>> On Thu, Aug 19, 2021 at 3:15 AM Yoav Weiss <yoav...@chromium.org> wrote:
>>>>>>>>
>>>>>>>>> On Wed, Aug 18, 2021 at 11:36 PM Matt Menke <mme...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Wed, Aug 18, 2021 at 5:23 PM Yoav Weiss <yoav...@chromium.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 18, 2021 at 11:18 PM Matt Menke <mme...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Aug 18, 2021 at 4:53 PM Yoav Weiss <yoav...@chromium.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Aug 18, 2021 at 8:47 PM 'Matt Menke' via blink-dev <blin...@chromium.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Contact emails: mme...@google.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Explainer: None
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Specification: https://url.spec.whatwg.org/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Summary
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Most hostnames that aren't valid IPv4 addresses, but end in numbers are treated as valid, and looked up via DNS (e.g., http://foo.127.1/).
>>>>>>>>>>>>>> Per the Public Suffix List spec, the eTLD+1 of the hostname in that URL should be "127.1". If that is ever fed back into a URL, "http://127.1/" is mapped to "http://127.0.0.1/" by the URL spec, which seems potentially dangerous. "127.0.0.0.1" could also potentially be used to confuse users. We want to reject URLs with these hostnames.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Blink component: Internals>Network>DNS <https://bugs.chromium.org/p/chromium/issues/list?q=component:Internals%3ENetwork%3EDNS>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Motivation
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Most hostnames that aren't valid IPv4 addresses, but end in numbers are treated as valid hostnames, and looked up via DNS. Example hostnames: 127.0.0.0.1, foo.0.1, 10.0.0.09, 08.1.2.3. These can be problematic for the following reasons:
>>>>>>>>>>>>>> * "http://foo.127.1/" has an eTLD+1 of "127.1", per the public suffix list spec. If that's ever used as the hostname in a new URL, however, as in "http://127.1", it will then get mapped to "http://127.0.0.1/", per the URL spec, which is a different host, which is not safe.
>>>>>>>>>>>>>> * "http://127.0.0.0.1" and "http://1.2.3.09", both of which are looked up via DNS rather than failing or being treated as IPv4 hostnames, also seem potentially confusing.
>>>>>>>>>>>>>> While no exploit is currently known here, we want to remove support for these as a preventative security measure.
>>>>>>>>>>>>>> The URL spec has been updated so that any URL with a hostname ending in a number that's not an IPv4 address (including, e.g., http://foo.1./, but not http://foo.1../) is considered invalid. Since this is part of the URL spec, not the DNS spec, we want to reject these URLs at the GURL layer, for URLs with appropriate protocols (http, https, ws, wss, file). For consistency, we should also fail DNS lookup attempts of these sorts of hostnames.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Initial public proposal: https://github.com/whatwg/url/pull/619
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> TAG review: Not required for an Intent to Deprecate, I believe.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> TAG review status: Not applicable
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Risks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Interoperability and Compatibility
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any URL with an affected hostname will fail to load, and will need to be migrated to another hostname. URLs of this form do appear to be in use, though it's not clear under what circumstances. No entry in the public suffix list is affected. Affected URLs make up no more than 0.0003% of hostnames looked up via the host resolver on any platform, and are basically not used in any file URLs, according to our metrics.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do we have reason to believe these hostnames are not legitimate ones?
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately, we have no insight into them - they could be mistyped URLs sent to typo squatting ISPs that OSX lets through but the Windows host resolver blocks, and various flavors of Linux treat differently. Or they could be mapped via a hosts file, or they could be hostnames that only resolve on public networks. Could be some network tool that uses them when installed locally, but is only available on certain platforms. No reason to think one possibility is more likely than the others.
>>>>>>>>>>>
>>>>>>>>>>> Do we have UKM for them that would enable us to test a random sample?
>>>>>>>>>>> I'm concerned about blocking those hostnames if they are legitimate, as that's something that a web developer can't do anything about.
>>>>>>>>>>> So even if the number of hosts is small, I'd like to get more certainty that they are *not* legitimate hosts before blocking them.
>>>>>>>>>>
>>>>>>>>>> We have UKM on their number (0.0003% of DNS lookups on OSX, less elsewhere - we can't meaningfully instrument percent of created GURLs), but we don't have their hostnames, what they resolve to, or know anything else about them, unfortunately.
>>>>>>>>>>
>>>>>>>>>> Navigation to a subset of these as frame URLs was broken at one point - I'm pretty sure the breakage even made it to stable: https://crbug.com/1173238. There were no reports of problems. Only non-IPv4 URLs where the last two components were broken, though, and it didn't affect subresources.
>>>>>>>>>> On OSX and Android, over 99% of successfully resolved problematic hostnames fit into that bucket, though on Linux, only about 2% do.
>>>>>>>>>>
>>>>>>>>>> That doesn't give us any hard conclusions, except they're either not deliberate navigations on OSX/Android, or they're not navigations.
>>>>>>>>>
>>>>>>>>> :|
>>>>>>>>>
>>>>>>>>> One more question: Is this an intent to Prototype or an intent to deprecate? The title is a bit unclear.
>>>>>>>>>
>>>>>>>>>>>>>> On OSX and Android, about 90% of host resolver lookups for these hostnames succeed, 60% do on Linux, and 2% on Windows and ChromeOS.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you know where those failures are coming from?
>>>>>>>>>>>>
>>>>>>>>>>>> Could be typos, could be the Windows and ChromeOS host resolvers don't let them through. Since we've had no filed bugs about them, I suspect the failures are not deliberate navigations or intended network requests. I'm much more interested in where the successes are coming from, myself.
>>>>>>>>>>>>
>>>>>>>>>>>>>> To allow for emergency disabling in case of wider than expected breakage, I intend to add a feature for it, and do a 50% field trial on pre-release channels, though plan to just enable the feature, rather than do a gradual rollout to stable, given the low usage.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Gecko: Positive (https://github.com/whatwg/url/pull/619#issuecomment-890826499)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you file an official position request? https://bit.ly/blink-signals
>>>>>>>>>>>>
>>>>>>>>>>>> Done for Mozilla: https://github.com/mozilla/standards-positions/issues/568
>>>>>>>>>>>>
>>>>>>>>>>>> Should I also do this for WebKit as well? They have in process CLs, so not sure if it's still needed.
>>>>>>>>>>>
>>>>>>>>>>> Agree that in-flight patches for WebKit are a sufficient positive signal.
>>>>>>>>>>>
>>>>>>>>>>>>>> WebKit: In development (https://bugs.webkit.org/show_bug.cgi?id=228826)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Web developers: No signals
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Activation
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This breaks anything using one of these domains, and requires migrating to other hostnames.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Security
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> None
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Debuggability
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These will act like any other invalid URL. Behavior is context dependent. Since this is logic deep within GURL, and GURLs are created in a great many places, console warnings specifically for this seem not practical.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is this feature fully tested by web-platform-tests <https://chromium.googlesource.com/chromium/src/+/master/docs/testing/web_platform_tests.md>? No.
>>>>>>>>>>>>>> JavaScript URL construction is tested, but URLs are used in a great many other places, which don't have test coverage, since DNS lookups for these domains must succeed in the first place for the tests to be meaningful.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Flag name
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Requires code in //chrome? False
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tracking bug: https://crbug.com/1237032
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Estimated milestones
>>>>>>>>>>>>>> DevTrial on desktop 95
>>>>>>>>>>>>>> DevTrial on Webview 95
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Link to entry on the Chrome Platform Status
>>>>>>>>>>>>>> https://www.chromestatus.com/feature/5679790780579840
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This intent message was generated by Chrome Platform Status <https://www.chromestatus.com/>.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> You received this message because you are subscribed to the Google Groups "blink-dev" group.
>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
>>>>>>>>>>>>>> To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAEK7mvq%2Bfnau%3DE%2BONhe0kr9HOpN84eCpoub84%3DswKzPkrGzi5A%40mail.gmail.com.