On 2020-04-01 09:58, REMI DRUILHE wrote:
You are right, most of the time this kind of de-anonymisation is extreme.
And right again when you say that there is no personal data stored in
Prometheus.

I am also not a lawyer, but I know from my DPO that the national data
protection authority (NDPA) can be extremely meticulous,
especially in my domain of video processing. We had several meetings
about it and we had to review our data processing multiple times. I was
just looking for a way to delete data with a hard deadline in case the NDPA
says that the current solution (the storage.tsdb.retention.time option)
is not good enough. I think it is better to come with
an answer than to say that we did not think about it.

Unfortunately there are no guarantees around deletion.

In addition to the fuzziness around exactly when a block might be removed, you can also end up with data files hanging around in certain error scenarios (e.g. tmp files if there are issues loading the WAL on startup or during block rotation).
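
If an explicit deletion step is still wanted on top of storage.tsdb.retention.time, one thing to look at is the TSDB admin API (delete_series followed by clean_tombstones). The sketch below only builds the two request URLs. It assumes the admin API has been enabled with --web.enable-admin-api, reuses the host from the query_range example in this thread, and uses an illustrative cutoff date. Both endpoints are called with POST. The caveat above still applies: this requests deletion, it does not guarantee that no file is left behind.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Host reused from the query_range example in this thread.
BASE_URL = "http://172.22.0.15:9090"


def delete_series_url(base_url, matcher, end):
    """Build the delete_series URL for all samples of `matcher` up to `end`."""
    params = urlencode({
        "match[]": matcher,
        "end": end.strftime("%Y-%m-%dT%H:%M:%SZ"),
    })
    return f"{base_url}/api/v1/admin/tsdb/delete_series?{params}"


def clean_tombstones_url(base_url):
    """Build the URL that asks Prometheus to remove deleted data from disk."""
    return f"{base_url}/api/v1/admin/tsdb/clean_tombstones"


# Illustrative cutoff: everything before 31 March 2020, 00:00 UTC.
cutoff = datetime(2020, 3, 31, 0, 0, 0, tzinfo=timezone.utc)

# Both URLs are meant to be sent with POST, e.g. `curl -X POST '<url>'`.
print(delete_series_url(BASE_URL, "bea_nb_request", cutoff))
print(clean_tombstones_url(BASE_URL))
```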


On Tuesday, 31 March 2020 17:51:31 UTC+2, Stuart Clark wrote:

No that sounds fairly normal. One thing to note is that those
timestamps are not the times the methods were called. They are when
Prometheus scraped your application. So if you scrape once a minute
the actual call could have been at any point during that minute.
Equally if there are multiple calls during that minute you'd have no
idea when they happened either.
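
To illustrate the point, here is a small simulation (not Prometheus code) of a counter being scraped once a minute. The call times and scrape times are made-up values in seconds. The scraped samples only carry the scrape times, so the actual call times cannot be read back from them.

```python
import bisect


def scrape(call_times, scrape_times):
    """Return (scrape_time, counter_value) pairs as a scraper would see them."""
    calls = sorted(call_times)
    # The counter value at a scrape is the number of calls made so far.
    return [(t, bisect.bisect_right(calls, t)) for t in scrape_times]


# Hypothetical call times: one call in the first minute, two in the second.
call_times = [12, 75, 110]
scrape_times = [0, 60, 120, 180]

samples = scrape(call_times, scrape_times)
print(samples)  # [(0, 0), (60, 1), (120, 3), (180, 3)]
```

The two calls at 75 s and 110 s show up only as a jump from 1 to 3 at the 120 s scrape.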

I'm not a lawyer or GDPR expert, but I think the type of extreme
de-anonymisation you are suggesting is not generally something you'd
be expected to be worrying about. Equally even if you do have an
idea of who might have called an API there still isn't any personal
data in Prometheus.

On 31 March 2020 15:27:36 BST, REMI DRUILHE <remi....@atos.net>
wrote:
In our code, we are using a counter to count the accesses to the
various methods of the API. We have one counter per method. We do
not store the timestamp. But when we ask Prometheus with a
"query_range" (see request below), it returns the list of all the
methods that have been accessed.

curl 'http://172.22.0.15:9090/api/v1/query_range?query=bea_nb_request&start=2020-03-31T00:01:00.000Z&end=2020-03-31T17:00:00.000Z&step=60s'

For each of our API methods, it also returns a list of key-value
pairs where the key is the timestamp and the value is the value of the
counter at that time (see example below). Thus, in some way, you are
able to track when the method has been called. And if our system is
used by a single user, then it is easy to follow which methods he
called. It is a bit twisted, but the national data protection
authority might also be twisted sometimes... But according to your
previous answers, maybe we did not use the counter in a proper way
and we should change the way it is designed.

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "__name__": "bea_nb_request",
          "action": "my_api_method",
          "instance": "bea:8081",
          "job": "bea"
        },
        "values": [
          [1585663440, "1"],
          [1585663500, "2"],
          [1585663560, "3"],
          [1585663620, "3"],
          [1585663680, "3"],
          [1585663740, "3"],
          [1585663800, "3"],
          [1585663860, "3"]
        ]
      },
      others_api_methods...
      }
    ]
  }
}
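
As a rough sketch of what can actually be derived from such a response, the snippet below walks the "values" list and reports the steps in which the counter increased. The embedded JSON is a trimmed copy of the example above (first five samples, elided methods left out), and the helper name increases is made up for this illustration. Counter resets (negative differences) are simply ignored.

```python
import json

# Trimmed copy of the query_range response shown above.
response_text = """
{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {"__name__": "bea_nb_request", "action": "my_api_method"},
        "values": [
          [1585663440, "1"],
          [1585663500, "2"],
          [1585663560, "3"],
          [1585663620, "3"],
          [1585663680, "3"]
        ]
      }
    ]
  }
}
"""


def increases(values):
    """Return (window_start, window_end, delta) for each step where the counter grew."""
    found = []
    for (t0, v0), (t1, v1) in zip(values, values[1:]):
        delta = float(v1) - float(v0)  # sample values arrive as strings
        if delta > 0:
            found.append((t0, t1, delta))
    return found


for series in json.loads(response_text)["data"]["result"]:
    action = series["metric"]["action"]
    for start, end, delta in increases(series["values"]):
        print(f"{action}: {delta:g} call(s) between {start} and {end}")
```

Each reported window is one 60 s step wide, so a call can only be placed within that window, not at an exact time.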

On Tuesday, 31 March 2020 13:40:03 UTC+2, Stuart Clark wrote:
How are you storing the timestamp? Is that in a label or a metric
value as the last call to the API?

In general this sounds like you are trying to store events
within Prometheus rather than metrics. Normally you would not store a
timestamp but a counter of the number of calls to the API.

On 31 March 2020 12:27:38 BST, REMI DRUILHE <remi....@atos.net>
wrote:

On Monday, 30 March 2020 16:37:11 UTC+2, Brian Candler wrote:
On Monday, 30 March 2020 09:34:01 UTC+1, REMI DRUILHE wrote:
In our context, Prometheus is storing system metrics and business
metrics, especially the number of accesses to the methods of our
API.

That presumably is an aggregate of all calls to a particular method.

If you recorded counts as separate metrics labelled by source IP
address or username, then that would be identifiable.  But Prometheus
does not work well with such high cardinality metrics anyway.
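
A back-of-the-envelope illustration of the cardinality point: every distinct combination of label values is its own time series. The method names and the number of users below are invented for the example.

```python
from itertools import product

# Hypothetical label values, for illustration only.
methods = ["my_api_method", "other_method"]
users = [f"user{i}" for i in range(1000)]

# One series per method: the aggregate counter described above.
per_method_series = {("bea_nb_request", m) for m in methods}

# One series per (method, user) pair: identifiable, and far more series.
per_user_series = {("bea_nb_request", m, u) for m, u in product(methods, users)}

print(len(per_method_series), len(per_user_series))  # 2 2000
```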

Yeah, it is just the timestamp of the call that is stored, not the IP
or the user name. Thus, it is not identifiable with Prometheus only.
But, the system aims at being used by 1 or 2 persons at the same time
in a closed network. In this context, I think it could be easy for
someone to associate the timestamp with the person that was using the
application at a specific time.

Anyway, I will figure out another way to achieve what we would like to
do.

Thanks for the help.

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


 --
You received this message because you are subscribed to the Google
Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to prometheus-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/prometheus-users/29f1e59c-1d72-436a-9883-c81c71e0cbd9%40googlegroups.com
[2].


Links:
------
[1]
http://172.22.0.15:9090/api/v1/query_range?query=bea_nb_request&start=2020-03-31T00:01:00.000Z&end=2020-03-31T17:00:00.000Z&step=60s
[2]
https://groups.google.com/d/msgid/prometheus-users/29f1e59c-1d72-436a-9883-c81c71e0cbd9%40googlegroups.com?utm_medium=email&utm_source=footer

--
Stuart Clark

