On Jul 28, 2014, at 12:36 PM, Bill Woodcock <wo...@pch.net> wrote:
>
> On Jul 28, 2014, at 9:28 AM, William Herrin <b...@herrin.us> wrote:
>> The data set suffers three flaws:
>
> Depending on your point of view, a lot more than three, undoubtedly.
>
>> 1. It is not representative of the actual traffic flows on the Internet.
>
> There are an infinite number of things it’s not representative of, but it
> also doesn’t claim to be representative of them. Traffic flows on the
> Internet is a different survey of a different thing, but if someone can
> figure out how to do it well, I would be very supportive of their effort.
> It's a _much_ more difficult survey to do, since it requires getting people
> to pony up their unanonymized netflow data, which they’re a lot less likely
> to do, en masse, than their peering data. We’ve been trying to figure out a
> way to do it on a large and representative enough scale to matter for twenty
> years, without too much headway. The larger the Internet gets, the more
> difficult it is to survey well, so the problem gets harder with time, rather
> than easier.
This most likely won’t happen unless it becomes some sort of international treaty obligation, and even then it would end up in the courts for a long time. Leaving aside the data privacy requirements many carriers have, most companies guard their traffic information rather zealously for some reason.

-dorian