As a decent-sized North American ISP, I have to totally agree with this 
post. There simply is no economically justifiable reason to collect 
customer data: doing so is expensive and, unless you are trying to 
traffic-shape like a cell carrier, it has zero economic benefit. In our 
case we take 1:4000 netflow samples and that is literally it; we use that 
data for peering analytics and failure modeling.

This is true for both large ISPs I've been involved with, and in both 
cases I would have been the one overseeing the policy.
        
What I see in this thread is a bunch of folks guessing who clearly have 
not been involved in large eyeball-ISP operations.


-----Original Message-----
From: NANOG <nanog-bounces+john=vanoppen....@nanog.org> On Behalf Of Saku Ytti
Sent: Tuesday, May 16, 2023 7:56 AM
To: Tom Beecher <beec...@beecher.cc>
Cc: nanog@nanog.org
Subject: Re: Do ISP's collect and analyze traffic of users?

I can't tell what "large" is. But I've worked for enterprise and consumer 
ISPs, and none of the shops I worked for had the capability to monetise the 
information they had. And the information they had was increasingly 
low-resolution. Infrastructure providers are notoriously bad even at 
monetising their infra.

I'm sure some do monetise. But generally service providers are not 
interesting to investors and don't have active shareholders, so there is 
very little pressure to make more money; hence firesales happen all the 
time, with infrastructure increasingly seen as a liability, not an asset. 
They are generally boring companies, and internally no one has an incentive 
to monetise data, as it wouldn't improve their personal compensation. And 
regulations like GDPR create problems people would rather not solve, unless 
pressured.

Technically, most people started 20 years ago with some netflow sampling 
ratio, and they still use the same ratio despite many orders of magnitude 
more packets. This means the share of flows captured used to be orders of 
magnitude higher than it is today; now only very few flows are seen in 
typical applications, and netflow is used largely for volumetric DDoS 
detection and high-level ingressAS=>egressAS metrics.

Hardware offerings increasingly do IPFIX as if it were sflow, that is, 
zero cache, each sample exported immediately after it is taken, because 
you'd need something like 1:100 or a higher sampling resolution to have any 
significant chance of hitting the same flow twice. PTX has stopped 
supporting flow-cache entirely for this reason: at a sampling rate where 
the cache would do something, the cache would overflow.

Of course there are other monetisation opportunities via mechanisms other 
than data-in-the-wire, like DNS.


On Tue, 16 May 2023 at 15:57, Tom Beecher <beec...@beecher.cc> wrote:
>
> Two simple rules for most large ISPs.
>
> 1. If they can see it, as long as they are not legally prohibited, they'll 
> collect it.
> 2. If they can legally profit from that information, in any way, they will.
>
> Now, their privacy policies will always include lots of nice-sounding 
> clauses, such as 'We don't sell your personally identifiable 
> information'. This of course allows them to sell 'anonymized' sets of 
> that data, which sounds great, except, as researchers have proven, it's 
> pretty trivial to scoop up multiple discrete anonymized data sets and 
> cross-reference them to identify individuals. Netflow data may not be as 
> directly 'valuable' as other types of data, but it can be used in the 
> blender too.
>
> Information is the currency of the realm.
>
>
>
> On Mon, May 15, 2023 at 7:00 PM Michael Thomas <m...@mtcc.com> wrote:
>>
>>
>> And maybe try to monetize it? I'm pretty sure that they can be 
>> compelled to do that, but do they do it for their own reasons too? Or 
>> is this way too much overhead to be doing en masse? (I vaguely recall 
>> that netflow, for example, can make routers unhappy if there is too 
>> much "flow".)
>>
>> Obviously this is likely to depend on local laws but since this is 
>> NANOG we can limit it to here.
>>
>> Mike
>>


--
  ++ytti
