On 2020-04-26 10:08, Michał Górny wrote:
> What do you think?  Do you foresee other problems?  Do you have
> other needs?  Can you think of better solutions?

While I would really like to have data, I think it's impossible to get
correct data. We therefore shouldn't collect any at all, because the
invalid data we would collect would be misused or misinterpreted.

Let's start with your first example:

> the primary goal of the project would be to gather statistics on
> popularity of packages, in order to help us prioritize our attention
> and make decisions on what to keep and what to remove

Let's assume we get reports that app-misc/foo is installed only 20
times. If you judge based on this data ("Obviously, nobody is using
that package, it's stuck on <whatever>... safe to remove"), your view
is biased:

Because reporting will never be mandatory, we don't know whether
app-misc/foo is just unlucky because most of its users haven't opted in
to reporting (you can assume something like this for people with
tor-related programs, for example).

Now think about large installations which are probably not allowed to
"phone home", use a private local mirror and may even use build
hosts. I am aware of *multiple* large Gentoo deployments -- for servers.
You will never get data from these installations. Instead, stats will be
drowned out by home users, who are more likely to submit data.
Not to mention the new containerized world...
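
To make the bias concrete, here is a minimal sketch with made-up numbers.
The package app-misc/bar, the install counts and the opt-in rates are all
hypothetical, not measurements; the only thing taken from the example
above is that app-misc/foo shows up with 20 reports.

```python
# Hypothetical numbers, purely for illustration. None of them come from
# real Gentoo data.
packages = {
    # name: (true installs, share of installs that opt in to reporting)
    "app-misc/foo": (5000, 0.004),  # e.g. servers behind private mirrors
    "app-misc/bar": (400, 0.25),    # e.g. home desktops
}


def expected_reports(true_installs, opt_in_rate):
    """Expected number of reports if every install opts in at the same rate."""
    return round(true_installs * opt_in_rate)


reported = {name: expected_reports(n, rate) for name, (n, rate) in packages.items()}

for name, count in reported.items():
    true_installs = packages[name][0]
    print(f"{name}: {count} reports, {true_installs} real installs")
```

With these assumed rates the stats would rank app-misc/bar well ahead of
app-misc/foo, although foo has more than ten times as many real installs.
Since nobody knows the real opt-in rate per package, there is no way to
correct for this afterwards.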

It's the problem you all should know from Mozilla, Google, Microsoft
*duck*: They all do 'data-driven development'. The problem: *We* are
power users. We use several features most normal users don't even
know about. However, most of us are also aware of privacy issues and
disable stats. The result: These companies are killing popular power
user features just because their data indicates that nobody is using
them.

Please don't create pressure on users to opt in to gentoostats just to
prevent something like this from happening to Gentoo.

My point is: I'll strongly object to *any* decision based on this
project because the data will *always be wrong*. Therefore the data is
useless and I wouldn't even consider collecting it in the first place.
Where there is a trough the pigs gather... and at some point people will
start to ignore that the data is useless just to underline *their* point
in their current situation. :/

Thomas Deutschmann / Gentoo Linux Developer
C4DD 695F A713 8F24 2AA1 5638 5849 7EE5 1D5D 74A5
