OK, I'm convinced that these are not "real" hosts, and that breaking them
will not result in actual user-visible breakage. Would it be possible to
ship this with a server-side flag that'd enable us to quickly revert in
case we're wrong?

On Thu, Aug 19, 2021 at 6:09 PM Matt Menke <mme...@google.com> wrote:

> So given information on this thread, I think the ways these can succeed
> are possibly:
> 1)  Entries in the HOSTS file.
> 2)  Intermediary DNS servers or MitMs typo squatting, or actively
> attacking the user.
> 3)  Local DNS servers providing additional DNS results.
> 4)  Suffix search (which would trigger on, e.g., "1533.67.89" but not
> "67.89.1533")
> 5)  mDNS
> 6)  Local tools injecting DNS lookup results.
>
> Unfortunately, we don't have a good way to gather hard data on which are
> more common.  Given that there are potential security implications here,
> I'm reluctant to wait for another round of data gathering, though we could
> probably distinguish cases 4) and 5) from the others, and 1) as well, at
> least on some platforms.  Also not sure how useful just knowing about those
> cases would be.
> On Thursday, August 19, 2021 at 11:54:29 AM UTC-4 Harald Alvestrand wrote:
>
>> Thanks - I misremembered which end of the class B network got jammed
>> together. 67.89.31533 is indeed the one that is "legal" syntax, and so of
>> course 67.89.1 is too.
>>
>>
>> On Thu, Aug 19, 2021 at 3:20 PM Matt Menke <mme...@google.com> wrote:
>>
>>> Note that 127.0.1 is mapped to 127.0.0.1 by GURL, following the URL
>>> spec, so that would generally not make it to the DNS resolver (unless
>>> something tried to resolve it directly).  Per the URL spec, "31533.67.89"
>>> would not be normalized by GURL, but "67.89.31533" would be converted to
>>> 67.89.123.45.  My instrumentation was at the DNS layer, so "127.0.1" and
>>> "67.89.31533" would not show up as problematic hostnames in my metrics,
>>> though "31533.67.89" would.
>>>
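The mapping described above (127.0.1 to 127.0.0.1, 67.89.31533 to 67.89.123.45) can be sketched roughly as follows. This is a minimal illustration of the URL spec's shorthand IPv4 parsing, not GURL's actual code; the names parse_ipv4_number and canonicalize_ipv4 are hypothetical.

```python
def parse_ipv4_number(part: str) -> int:
    """Parse one component: '0x..' is hex, a leading '0' is octal, else decimal."""
    if part == "":
        raise ValueError("empty component")
    if part[:2].lower() == "0x":
        digits, base = part[2:], 16
    elif len(part) > 1 and part[0] == "0":
        digits, base = part[1:], 8
    else:
        digits, base = part, 10
    if digits == "":
        return 0  # a bare "0x" counts as zero
    allowed = "0123456789abcdef"[:base]
    if any(ch not in allowed for ch in digits.lower()):
        raise ValueError(f"not a number in base {base}: {part!r}")
    return int(digits, base)


def canonicalize_ipv4(host: str) -> str:
    """Expand a 1- to 4-component numeric host into dotted-quad form."""
    parts = host.split(".")
    if len(parts) > 1 and parts[-1] == "":
        parts.pop()  # a single trailing dot is tolerated
    if len(parts) > 4:
        raise ValueError("too many components")
    numbers = [parse_ipv4_number(p) for p in parts]
    # Every component except the last must fit in one byte.
    if any(n > 255 for n in numbers[:-1]):
        raise ValueError("leading component out of range")
    # The last component fills all remaining bytes of the 32-bit address.
    if numbers[-1] >= 256 ** (5 - len(numbers)):
        raise ValueError("last component out of range")
    value = numbers[-1]
    for i, n in enumerate(numbers[:-1]):
        value += n * 256 ** (3 - i)
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


print(canonicalize_ipv4("127.0.1"))      # 127.0.0.1
print(canonicalize_ipv4("67.89.31533"))  # 67.89.123.45
```

In this sketch "31533.67.89" raises ValueError because only the last component may exceed 255, which is why that form is not treated as an IPv4 address and would instead reach the DNS layer as a hostname.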
>>> On Thu, Aug 19, 2021 at 9:10 AM Harald Alvestrand <h...@google.com>
>>> wrote:
>>>
>>>> An interesting property of all-numeric hostnames is that they *may* be
>>>> legitimate IPv4 addresses using highly archaic IP address formats - we're
>>>> so used to the 123.45.67.89 syntax that we forget that 31533.67.89 once was
>>>> regarded as a legitimate way to encode the same address (class B notation).
>>>>
>>>> There are also certain resolvers that will "helpfully" map an
>>>> all-numeric hostname presented in DNS to an IP address without asking
>>>> anyone.
>>>> So if those two bugs (or "archaic features") occur together, the result
>>>> may be a successful resolution.
>>>>
>>>> No idea why it would occur more often on Android than on Windows,
>>>> though. And my Linux boxes don't resolve 127.0.1 to anything.
>>>>
>>>>
>>>>
>>>> On Thu, Aug 19, 2021 at 2:42 PM Yoav Weiss <yoav...@chromium.org>
>>>> wrote:
>>>>
>>>>> Interesting! What happens then in the "successful resolution" case
>>>>> Matt mentioned?
>>>>>
>>>>> On Thu, Aug 19, 2021 at 2:39 PM Harald Alvestrand <h...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Department of odd facts:
>>>>>>
>>>>>> - The ICANN rules for new TLDs forbid all top level domain names that
>>>>>> start with a digit
>>>>>> - The IDNA rules for bidirectional scripts forbid domain names that
>>>>>> start with a digit (Unicode bidi aficionados will know why)
>>>>>> - The only real reason why leading digits aren't outlawed in domain
>>>>>> names at the second level is 3com.
>>>>>>
>>>>>> It seems safe to say that no legitimate fully qualified hostname will
>>>>>> ever have a last component consisting only of digits.
>>>>>> That means the only time we could get a legitimate hostname is for
>>>>>> something that has to be resolved via a search path.
>>>>>>
>>>>>>
>>>>>> On Thu, Aug 19, 2021 at 2:33 PM Yoav Weiss <yoav...@chromium.org>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Aug 19, 2021 at 2:28 PM Matt Menke <mme...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I created the title using Chrome Status's deprecation template, so
>>>>>>>> any confusion should be blamed on that.
>>>>>>>>
>>>>>>>
>>>>>>> +Jason Robbins - on the title issues.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> I used the "Draft Intent to Deprecate and Remove email" button, and
>>>>>>>> assume I'd need to do a "Draft Intent to Ship email" before shipping to
>>>>>>>> stable, after a 50% trial on prerelease channels.
>>>>>>>>
>>>>>>>
>>>>>>> There's no need for 2 emails for removals. We can discuss the full
>>>>>>> deprecation, experimentation/trials and removal on stable here.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Aug 19, 2021 at 3:15 AM Yoav Weiss <yoav...@chromium.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Aug 18, 2021 at 11:36 PM Matt Menke <mme...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Aug 18, 2021 at 5:23 PM Yoav Weiss <yoav...@chromium.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 18, 2021 at 11:18 PM Matt Menke <mme...@google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Aug 18, 2021 at 4:53 PM Yoav Weiss <
>>>>>>>>>>>> yoav...@chromium.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Aug 18, 2021 at 8:47 PM 'Matt Menke' via blink-dev <
>>>>>>>>>>>>> blin...@chromium.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Contact emails: mme...@google.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Explainer: None
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Specification: https://url.spec.whatwg.org/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Summary
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Most hostnames that aren't valid IPv4 addresses but end in
>>>>>>>>>>>>>> numbers are treated as valid and looked up via DNS (e.g.,
>>>>>>>>>>>>>> http://foo.127.1/). Per the Public Suffix List spec, the
>>>>>>>>>>>>>> eTLD+1 of the hostname in that URL should be "127.1". If that
>>>>>>>>>>>>>> is ever fed back into a URL, "http://127.1/" is mapped to
>>>>>>>>>>>>>> "http://127.0.0.1/" by the URL spec, which seems potentially
>>>>>>>>>>>>>> dangerous. "127.0.0.0.1" could also potentially be used to
>>>>>>>>>>>>>> confuse users. We want to reject URLs with these hostnames.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Blink component: Internals>Network>DNS
>>>>>>>>>>>>>> <https://bugs.chromium.org/p/chromium/issues/list?q=component:Internals%3ENetwork%3EDNS>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Motivation
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Most hostnames that aren't valid IPv4 addresses but end in
>>>>>>>>>>>>>> numbers are treated as valid hostnames and looked up via DNS.
>>>>>>>>>>>>>> Example hostnames: 127.0.0.0.1, foo.0.1, 10.0.0.09, 08.1.2.3.
>>>>>>>>>>>>>> These can be problematic for the following reasons:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> * "http://foo.127.1/" has an eTLD+1 of "127.1", per the public
>>>>>>>>>>>>>>   suffix list spec. If that's ever used as the hostname in a new
>>>>>>>>>>>>>>   URL, however, as in "http://127.1", it will then get mapped to
>>>>>>>>>>>>>>   "http://127.0.0.1/", per the URL spec. That is a different
>>>>>>>>>>>>>>   host, which is not safe.
>>>>>>>>>>>>>> * "http://127.0.0.0.1" and "http://1.2.3.09", both of which are
>>>>>>>>>>>>>>   looked up via DNS rather than failing or being treated as IPv4
>>>>>>>>>>>>>>   hostnames, also seem potentially confusing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> While no exploit is currently known here, we want to remove
>>>>>>>>>>>>>> support for these as a preventative security measure. The URL
>>>>>>>>>>>>>> spec has been updated so that any URL with a hostname ending in
>>>>>>>>>>>>>> a number that's not an IPv4 address (including, e.g.,
>>>>>>>>>>>>>> http://foo.1./, but not http://foo.1../) is considered invalid.
>>>>>>>>>>>>>> Since this is part of the URL spec, not the DNS spec, we want to
>>>>>>>>>>>>>> reject these URLs at the GURL layer, for URLs with appropriate
>>>>>>>>>>>>>> protocols (http, https, ws, wss, file). For consistency, we
>>>>>>>>>>>>>> should also fail DNS lookup attempts of these sorts of
>>>>>>>>>>>>>> hostnames.
>>>>>>>>>>>>>>
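The "hostname ending in a number" test mentioned above can be sketched roughly as follows. This is a minimal illustration modelled on the URL spec's description, not Chromium's implementation; the function name ends_in_a_number is chosen here for illustration.

```python
import re


def ends_in_a_number(host: str) -> bool:
    """Return True if the last label of host looks numeric.

    One trailing dot is ignored, so "foo.1." still counts, while
    "foo.1.." ends in an empty label and does not.
    """
    parts = host.split(".")
    if parts[-1] == "":
        if len(parts) == 1:
            return False  # the host was the empty string
        parts.pop()       # ignore exactly one trailing dot
    last = parts[-1]
    # All-ASCII-digit labels count, even ones like "09" that are not valid octal.
    if re.fullmatch(r"[0-9]+", last):
        return True
    # Hexadecimal labels such as "0x7f" also parse as IPv4 numbers.
    return re.fullmatch(r"0[xX][0-9a-fA-F]*", last) is not None


for host in ["foo.127.1", "127.0.0.0.1", "10.0.0.09", "foo.1.", "foo.1..", "3com.com"]:
    print(host, ends_in_a_number(host))
```

Under the updated rule, a host for which this check returns True must then parse as an IPv4 address, otherwise the URL is invalid. Hosts such as "3com.com" return False and are unaffected.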
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Initial public proposal: https://github.com/whatwg/url/pull/619
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> TAG review: Not required for an Intent to Deprecate, I believe.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> TAG review status: Not applicable
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Risks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Interoperability and Compatibility
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any URL with an affected hostname will fail to load, and will
>>>>>>>>>>>>>> need to be migrated to another hostname. URLs of this form do
>>>>>>>>>>>>>> appear to be in use, though it's not clear under what
>>>>>>>>>>>>>> circumstances. No entry in the public suffix list is affected.
>>>>>>>>>>>>>> Affected URLs make up no more than 0.0003% of hostnames looked
>>>>>>>>>>>>>> up via the host resolver on any platform, and are basically not
>>>>>>>>>>>>>> used in any file URLs, according to our metrics.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do we have reason to believe these hostnames are not
>>>>>>>>>>>>> legitimate ones?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Unfortunately, we have no insight into them - they could be
>>>>>>>>>>>> mistyped URLs sent to typo squatting ISPs that OSX lets through
>>>>>>>>>>>> but the Windows host resolver blocks, and various flavors of
>>>>>>>>>>>> Linux treat differently.  Or they could be mapped via a hosts
>>>>>>>>>>>> file, or they could be hostnames that only resolve on public
>>>>>>>>>>>> networks.  Could be some network tool that uses them when
>>>>>>>>>>>> installed locally, but is only available on certain platforms.
>>>>>>>>>>>> No reason to think one possibility is more likely than the
>>>>>>>>>>>> others.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Do we have UKM for them that would enable us to test a random
>>>>>>>>>>> sample?
>>>>>>>>>>> I'm concerned about blocking those hostnames if they are
>>>>>>>>>>> legitimate, as that's something that a web developer can't do
>>>>>>>>>>> anything about.
>>>>>>>>>>> So even if the number of hosts is small, I'd like to get more
>>>>>>>>>>> certainty that they are *not* legitimate hosts before blocking them.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We have UKM on their number (0.0003% of DNS lookups on OSX, less
>>>>>>>>>> elsewhere - we can't meaningfully instrument percent of created
>>>>>>>>>> GURLs), but we don't have their hostnames, what they resolve to,
>>>>>>>>>> or know anything else about them, unfortunately.
>>>>>>>>>>
>>>>>>>>>> Navigation to a subset of these as frame URLs was broken at one
>>>>>>>>>> point - I'm pretty sure the breakage even made it to stable:
>>>>>>>>>> https://crbug.com/1173238.  There were no reports of problems.
>>>>>>>>>> Only non-IPv4 URLs where the last two components were broken,
>>>>>>>>>> though, and it didn't affect subresources.  On OSX and Android,
>>>>>>>>>> over 99% of successfully resolved problematic hostnames fit into
>>>>>>>>>> that bucket, though on Linux, only about 2% do.
>>>>>>>>>>
>>>>>>>>>> That doesn't give us any hard conclusions, except they're either
>>>>>>>>>> not deliberate navigations on OSX/Android, or they're not
>>>>>>>>>> navigations.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> :|
>>>>>>>>>
>>>>>>>>> One more question: Is this an Intent to Prototype or an Intent to
>>>>>>>>> Deprecate? The title is a bit unclear.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On OSX and Android, about 90% of host resolver lookups for
>>>>>>>>>>>>>> these hostnames succeed, 60% do on Linux, and 2% on Windows
>>>>>>>>>>>>>> and ChromeOS.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you know where those failures are coming from?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Could be typos, could be the Windows and ChromeOS host
>>>>>>>>>>>> resolvers don't let them through.  Since we've had no filed
>>>>>>>>>>>> bugs about them, I suspect the failures are not deliberate
>>>>>>>>>>>> navigations or intended network requests.  I'm much more
>>>>>>>>>>>> interested in where the successes are coming from, myself.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> To allow for emergency disabling in case of wider than
>>>>>>>>>>>>>> expected breakage, I intend to add a feature for it, and do a
>>>>>>>>>>>>>> 50% field trial on pre-release channels, though plan to just
>>>>>>>>>>>>>> enable the feature, rather than do a gradual rollout to
>>>>>>>>>>>>>> stable, given the low usage.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Gecko: Positive (
>>>>>>>>>>>>>> https://github.com/whatwg/url/pull/619#issuecomment-890826499)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you file an official position request?
>>>>>>>>>>>>> https://bit.ly/blink-signals
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Done for Mozilla:
>>>>>>>>>>>> https://github.com/mozilla/standards-positions/issues/568
>>>>>>>>>>>>
>>>>>>>>>>>> Should I do this for WebKit as well?  They have in process
>>>>>>>>>>>> CLs, so not sure if it's still needed.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Agree that in-flight patches for WebKit are a sufficient
>>>>>>>>>>> positive signal.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> WebKit: In development (
>>>>>>>>>>>>>> https://bugs.webkit.org/show_bug.cgi?id=228826)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Web developers: No signals
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Activation
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This breaks anything using one of these domains, and requires
>>>>>>>>>>>>>> migrating to other hostnames.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Security
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> None
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Debuggability
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These will act like any other invalid URL. Behavior is
>>>>>>>>>>>>>> context dependent. Since this is logic deep within GURL, and
>>>>>>>>>>>>>> GURLs are created in a great many places, console warnings
>>>>>>>>>>>>>> specifically for this seem not practical.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is this feature fully tested by web-platform-tests
>>>>>>>>>>>>>> <https://chromium.googlesource.com/chromium/src/+/master/docs/testing/web_platform_tests.md>
>>>>>>>>>>>>>> ? No.  JavaScript URL construction is tested, but URLs are
>>>>>>>>>>>>>> used in a great many other places, which don't have test
>>>>>>>>>>>>>> coverage, since DNS lookups for these domains must succeed in
>>>>>>>>>>>>>> the first place for the tests to be meaningful.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Flag name
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Requires code in //chrome? False
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tracking bug: https://crbug.com/1237032
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Estimated milestones
>>>>>>>>>>>>>> DevTrial on desktop 95
>>>>>>>>>>>>>> DevTrial on Webview 95
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Link to entry on the Chrome Platform Status
>>>>>>>>>>>>>> https://www.chromestatus.com/feature/5679790780579840
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This intent message was generated by Chrome Platform Status
>>>>>>>>>>>>>> <https://www.chromestatus.com/>.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>>>>> Google Groups "blink-dev" group.
>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>>>>> it, send an email to blink-dev+...@chromium.org.
>>>>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>>>>> https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAEK7mvq%2Bfnau%3DE%2BONhe0kr9HOpN84eCpoub84%3DswKzPkrGzi5A%40mail.gmail.com
>>>>>>>>>>>>>> <https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAEK7mvq%2Bfnau%3DE%2BONhe0kr9HOpN84eCpoub84%3DswKzPkrGzi5A%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>
>>>>>>>
>>>>>>

