On 18/08/2020 06:44, Daniel Sterling wrote:
...is it possible to identify (and thus classify)
plain old bulk downloads, as separate from video streams? They're both
going to use http / https (or possibly QUIC) -- and they're both
likely to come from CDN networks... I can't think of a simple way to
tell them apart.

If there were an easy way to do it, I would already have done so. We are unfortunately hamstrung by some bad design and deployment around Diffserv, which might otherwise provide a useful end-to-end visible signal here.

Is this enough of a problem that people would try to make a list of
netblocks / prefixes that belong to video vs other CDN content?

It's possible that someone is doing this, but I don't specifically know of such a source of information. It would of course be better to find a solution that didn't rely on white/black lists, which have a distressing habit of going stale.

But one of the more reliable ways might be to use Autonomous System (AS) information. An AS is an organisational unit used for assigning IP address ranges and for routing, and usually corresponds to a more-or-less significant Internet organisation. It should be feasible to map an observed IP address to an AS, then look up the address blocks assigned to that AS, thereby capturing a whole range of related IP addresses.
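
As a rough illustration of that lookup, here is a minimal Python sketch. Everything in the table is made up for the example: the prefixes are reserved documentation ranges, the AS numbers come from the documentation range, and the "video"/"bulk" labels are hypothetical. A real deployment would have to populate the table from a routing or registry data source, which this sketch does not attempt.

```python
import ipaddress

# Hypothetical prefix table: prefix -> (AS number, traffic label).
# Prefixes and AS numbers are documentation-only values, not real CDN data.
PREFIX_TABLE = {
    "192.0.2.0/24": (64496, "video"),
    "203.0.113.0/24": (64496, "video"),
    "2001:db8::/32": (64496, "video"),
    "198.51.100.0/24": (64497, "bulk"),
}

NETWORKS = [
    (ipaddress.ip_network(prefix), asn, label)
    for prefix, (asn, label) in PREFIX_TABLE.items()
]


def lookup_as(addr):
    """Return (asn, label) for the longest matching prefix, or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, asn, label in NETWORKS:
        # Only compare addresses and networks of the same IP version.
        if ip.version == net.version and ip in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, asn, label)
    return None if best is None else (best[1], best[2])


def prefixes_for_as(asn):
    """Return every known prefix assigned to the given AS."""
    return [str(net) for net, a, _ in NETWORKS if a == asn]


if __name__ == "__main__":
    hit = lookup_as("192.0.2.17")
    print(hit)
    if hit is not None:
        print(prefixes_for_as(hit[0]))
```

Seeing one address then pulls in all the other blocks belonging to the same AS, which is the point of the approach. The linear scan is only for clarity; a table of realistic size would want a proper longest-prefix-match structure.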

I do notice video streams are much more bursty than plain downloads
for me, but that may not hold for all users.

That is, for me at least, a video stream may average 5mbps over, say,
1 minute, but it will sit at 0mbps for a while and then burst at
20mbps for a bit.

Correct, YouTube at least likes to fetch a big block of data from disk and send it all at once, then rely on the client buffer to tide it over while the disk services other requests. It makes some sense when you consider how slow disk seeks are relative to the number of clients they need to support, each of which will generally be watching a different video (or at least a different part of the same one).

However, this burstiness disappears on the wire just when you would like to use it to identify traffic, i.e. when the video traffic saturates the bandwidth available to it. If there's only just enough bandwidth, or even *less* than is required, then YouTube sends data continuously into the client buffer, trying to keep it as full as possible.
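
For anyone who wants to experiment with the burstiness signal anyway, a minimal Python sketch follows. It assumes you already have per-second throughput samples for a flow (how you collect them is not covered here), and the function names and thresholds are arbitrary illustrative choices, not tuned values. As noted above, the signal vanishes once the stream saturates the link, so a flow that does not look bursty could still be video.

```python
def burstiness(samples_mbps, idle_threshold_mbps=0.1):
    """Return (peak-to-mean ratio, fraction of idle samples) for a flow.

    samples_mbps: per-interval throughput samples in Mbit/s.
    idle_threshold_mbps: hypothetical cutoff below which an interval
    counts as idle.
    """
    if not samples_mbps:
        raise ValueError("need at least one sample")
    mean = sum(samples_mbps) / len(samples_mbps)
    peak = max(samples_mbps)
    idle = sum(1 for s in samples_mbps if s <= idle_threshold_mbps)
    idle_fraction = idle / len(samples_mbps)
    ratio = peak / mean if mean > 0 else 0.0
    return ratio, idle_fraction


def looks_bursty(samples_mbps, min_peak_to_mean=2.0, min_idle_fraction=0.25):
    """Guess whether a flow is bursty; both thresholds are assumptions."""
    ratio, idle_fraction = burstiness(samples_mbps)
    return ratio >= min_peak_to_mean and idle_fraction >= min_idle_fraction


if __name__ == "__main__":
    # One minute averaging 5 Mbit/s: idle for 45 s, then 20 Mbit/s for 15 s.
    video_like = [0.0] * 45 + [20.0] * 15
    # One minute of a steady 5 Mbit/s download.
    steady = [5.0] * 60
    print(burstiness(video_like), looks_bursty(video_like))
    print(burstiness(steady), looks_bursty(steady))
```

Both example flows average the same rate over the minute; only the shape differs, which is all this heuristic looks at.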

There are no easy answers here. But I've suggested some things to look for and try out.

 - Jonathan Morton
_______________________________________________
Bloat mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/bloat
