Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Couldn't the stats job you want run on toolserver? Peter Gervai wrote: Hello, I wasn't subscribed to this list, since I usually try to avoid the politics around it. I was notified, however, that some interesting claims were made and some steps taken (again) without any discussion whatsoever. First, let me say here again - as I have said on a different list - that I am extremely disappointed by the lack of discussion before someone from outside seriously interferes with another project based on, as it turns out, incorrect information. In the past people with privileges (if we ever considered them that way instead of people with work to be done) were more cautious. I would like all you fast-handed guys to slow down and talk first, get informed, and act later. I already commented elsewhere on vls; in summary, I miss the discussion and I do not believe the case actually breached any privacy, but this isn't my concern now (as I'm in a bit of a hurry). Regarding huwp, it would have been pretty easy to find out whom to ask. Apart from the obvious choice of anyone with any flags on huwp, it would have been easy to identify who made the changes, and ask them. Like, for example, me. As far as I can see, a lot of energy is being wasted, with people planning how to block javascript, how to block counters, etc. That is the wrong way. The right way, and I'm repeating myself, is FIRST to find out WHY these scripts are there in the first place, what problem they have to solve. This is a crucial step, fellows, which you neglected to take. (And we all know that the reason is to create usage stats.) The next step should be examining whether this violates anything, like the Privacy Policy. In the case of Google this is debatable, since I don't know the scope of the data retention. However, I know all about the Hungarian stats.
Let me share the real information here, briefly, since I have to go soon, but I do not want to let you destroy something you're not aware of. The stats (which have, surprise, a dedicated domain under the hu wikipedia domain) run on a dedicated server, with nothing else on it. Its sole purpose is to gather and publish the stats. Basically nobody has permission to log in to the server but me, and since I happen to be a checkuser as well, it wouldn't even be entertaining to read the logs, even if they weren't big enough to make this useless. I happen to be the one who created the Hungarian checkuser policy, which is, as far as I know, the strictest one in WMF projects, and it's no joke, and I intend to follow it. (And for those who are unfamiliar with me, I happen to be the founder of huwp as well, apart from my job in computer security.) If you had gathered this knowledge (which means that the server is closed and run by a user identified to the WMF), then you could have started the discussion. As is obvious, not making any interfering moves while discussing it for days, or even weeks, wouldn't have changed anything. What have you achieved by removing the code? You killed our stats, which provide us with the statistics the WMF originally provided (same data content) but later killed off. We (huwp) will propose some solutions to the problem, but I really have to go now. Tgr can help discuss it, and I thank him for his help in advance. :-) So, think about these over the weekend; I'm back on Monday. I hope there can be a _useful_ discussion, with thinking people and not people acting on impulse. Peter Gervai Hungary ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Hi! Couldn't the stats job you want run on toolserver? Really, this isn't much of foundation-l issue - we have been collecting and providing detailed article viewership statistics for over a year. People are building various applications on top of that data, like http://wikirank.com/en/Jimmy_Wales - and we already handle the data processing task. *shrug*, if anyone wants better standards, better interfaces, etc - it all can be achieved, in one way or another, without sacrificing privacy of our users. As I've stated and will state again, we will err towards privacy, if we have to err. toolserver could be vehicle for some of data analysis and aggregation, but currently users on it are not supposed to get private data either, nor it is able to scale with overall content delivery infrastructure. I'd like everyone to understand, that 'who reads what' is 1000x more data (and hence more privacy issues) than 'who edits what'. Just those who come and read about whatever they want to read, do not have representation on this mailing list, so we have to have that in mind too. We're building service for much much larger group of people, and interests of few should not be sacrificing privacy of everyone else. BR, Domas ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
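As a rough illustration of the distinction drawn above between published article viewership statistics and raw "who reads what" data, the following minimal sketch counts views per article per hour and throws the client address away. The log line format, the `aggregate_page_views` name, and the sample values are all assumptions made for the example; this is not a description of how the WMF pipeline actually works.

```python
from collections import Counter
from datetime import datetime


def aggregate_page_views(log_lines):
    """Count views per (project, page, hour), discarding the client address.

    Assumes a hypothetical line format:
        "<ISO timestamp> <client_ip> <project> <page_title>"
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip malformed lines
        timestamp, _client_ip, project, page = parts
        # Keep only the hour; the client address is never stored.
        hour = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S").strftime("%Y-%m-%dT%H")
        counts[(project, page, hour)] += 1
    return counts


# Made-up sample lines using documentation-reserved IP ranges.
sample = [
    "2009-06-07T10:15:00 192.0.2.10 en Jimmy_Wales",
    "2009-06-07T10:45:30 198.51.100.7 en Jimmy_Wales",
    "2009-06-07T11:05:00 192.0.2.10 hu Budapest",
]
hourly = aggregate_page_views(sample)
```

The result answers "how often was this page read" but can no longer answer "who read it", which is the property the thread is arguing about.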
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Discussing something as a general social concern is one thing; claiming that it is a WMF legal issue is something different. John Michael Snow wrote: John at Darkstar wrote: Are the developers lawyers? A developer claiming something has an unwanted privacy issue is very different from making claims about something being a legal issue on behalf of the Foundation. Simply don't do it. Privacy is not simply a legal issue, it's a general social concern. Our privacy policy should not be treated as merely a legal document, it's an effort to express part of our social compact. Unless it's a matter of interpreting legal regulations somewhere else, I consider the developers equally competent to address privacy issues. If Brion or Tim or Domas identify an issue, they don't need to run to Mike every time to check that it really is something that should be addressed. --Michael Snow ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
It is a WMF legal issue, in addition to being a social issue. No claim is being made that it's a legal issue; it's just a fact. On Sun, Jun 7, 2009 at 2:43 AM, John at Darkstar vac...@jeb.no wrote: Discussing something as a general social concern is one thing; claiming that it is a WMF legal issue is something different. John Michael Snow wrote: John at Darkstar wrote: Are the developers lawyers? A developer claiming something has an unwanted privacy issue is very different from making claims about something being a legal issue on behalf of the Foundation. Simply don't do it. Privacy is not simply a legal issue, it's a general social concern. Our privacy policy should not be treated as merely a legal document, it's an effort to express part of our social compact. Unless it's a matter of interpreting legal regulations somewhere else, I consider the developers equally competent to address privacy issues. If Brion or Tim or Domas identify an issue, they don't need to run to Mike every time to check that it really is something that should be addressed. --Michael Snow ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Just to be clear, it has been claimed in this thread that the CheckUser right also gives those admins the right to collect additional data on users and analyze it. I've just read the privacy policy and that is not true. You'll also find [[Privacy policy]] interesting, although you might decide to edit it to make sure that it no longer mentions the word legal so that your argument can continue to be true, at least to you. On Sun, Jun 7, 2009 at 9:54 AM, Brian brian.min...@colorado.edu wrote: It is a WMF legal issue, in addition to being a social issue. No claim is being made that its a legal issue, it's just a fact. On Sun, Jun 7, 2009 at 2:43 AM, John at Darkstar vac...@jeb.no wrote: Discussing something as a general social concern is one thing, claiming that it is a wmf legal issue is something different. John Michael Snow skrev: John at Darkstar wrote: Are the developers lawyers? A developer claiming something has an unwanted privacy issue is very different from making claims about something being a legal issue on the behalf of Foundation. Simply don't do it. Privacy is not simply a legal issue, it's a general social concern. Our privacy policy should not be treated as merely a legal document, it's an effort to express part of our social compact. Unless it's a matter of interpreting legal regulations somewhere else, I consider the developers equally competent to address privacy issues. If Brion or Tim or Domas identify an issue, they don't need to run to Mike every time to check that it really is something that should be addressed. --Michael Snow ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
This might be going off topic, and not really helpful in finding a solution (along the lines of ramping up WMF stats capabilities in the near future or reinstating the huwiki solution in a way acceptable to the WMF and the hu.wp community and possibly benefiting other communities as well): On Sun, Jun 7, 2009 at 6:44 PM, Brian brian.min...@colorado.edu wrote: Just to be clear, it has been claimed in this thread that the CheckUser right also gives those admins the right to collect additional data on users and analyze it. I've just read the privacy policy and that is not true. I believe there was no such claim; if anything, it was pointed out that setting up the stats engine didn't give access to information that was not accessible before by the Checkusers (even if logged), and that most fears of the data being handled by the wrong hands are mitigated by the fact that the data was handled by a CheckUser (and thus a) a person already with access to said data and b) a person identified to the WMF and trusted by the community*). (*Not that I would want to introduce community trust into the argument, just pointing out the inherent properties of being a CU.) Best regards, Bence Damokos ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
I'm going off of statements like this: I happen to be the one who have created the Hungarian checkuser policy, which is, as far as I know, the strictest one in WMF projects, and it's no joke, and I intend to follow it. On Sun, Jun 7, 2009 at 11:13 AM, Bence Damokos bdamo...@gmail.com wrote: This might be going off topic, and not really helpful in finding a solution (along the lines of wamping up WMF stats capabilities in the near future or reinstating the huwiki solution in a way accpetable to the WMF and the hu.wp community and possibly benefitting other communities, as well): On Sun, Jun 7, 2009 at 6:44 PM, Brian brian.min...@colorado.edu wrote: Just to be clear, it has been claimed in this thread that the CheckUser right also gives those admins the right to collect additional data on users and analyze it. I've just read the privacy policy and that is not true. I believe there was no such claim, if anything, it was pointed out that setting up the stats engine didn't give access to information that was not accessible before by the Checkusers (even if logged), and that most fears of data being handled by the wrong hands are mitigated by the facts that the data was handled by a CheckUser (and thus a) a person already with access to said data and b) a person identified to the WMF and trusted by the community*). (*Not that I would want to introduce community trust into the argument, just pointing out the inherent properties of being a CU.) Best regards, Bence Damokos ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Hi! Are the developers lawyers? IANAL. A developer claiming something has an unwanted privacy issue is very different from making claims about something being a legal issue on the behalf of Foundation. Simply don't do it. I failed to phrase what I wanted to write you in a way, that I wouldn't make me look like an arrogant prick, so I will not write it. Let me tell something else, instead. Anyway, WMF has always been standing for privacy of our users. I wholeheartedly approve the privacy stance, which means that we don't even consider exceptions when it comes to giving away private data. We just don't give it away. This is why we opt out of phorm, this is why we don't facilitate numerous researchers (or whomever hide behind those names), and we don't even keep most of private data ourselves. Someone on this thread said, that WMF keeps private data internally. We don't have readership data, there're no such thing as access logs in our farm, the closest one to the concept is one out of overall requests, which doesn't have long retention, and is used for short term operational purposes. Every other private data point is the one that is visible by checkusers, has both audit trail, and quite restricted access to information (at least there are verification procedures). So, we tend to understand data privacy policies internally quite well, that was incentive of written down privacy policy, and that has been part of constant internal dialogue how to handle overall privacy. We know that our reader privacy is quite good (especially if people use TOR and HTTPS :), we know that we have to balance our contributor privacy issues in order to be what we are. We err to the side of privacy, as that is where we would have highest damages. Anyway, I answered your question, IANAL, but I'm in one way or another part of organization that has one. We asked the lawyer to describe our intent and position, and he did. We're happy to enforce it. 
And, Brian, Volunteer admins cannot take user privacy into their own hands, under their own interpretation. That's just not how it works! You don't seem to have a sufficient understanding of how it works. :( Domas ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Hi! I believe there was no such claim, if anything, it was pointed out that setting up the stats engine didn't give access to information that was not accessible before by the Checkusers (even if logged), and that most fears of data being handled by the wrong hands are mitigated by the facts that the data was handled by a CheckUser (and thus a) a person already with access to said data and b) a person identified to the WMF and trusted by the community*). checkusers don't have readership data. Domas ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
On Sun, Jun 7, 2009 at 11:17 AM, Domas Mituzas midom.li...@gmail.com wrote: And, Brian, Volunteer admins cannot take user privacy into their own hands, under their own interpretation. That's just not how it works! You don't seem to have a sufficient understanding of how it works. :( Domas Assuming you're not taking this out of context, please explain the difference between how it works and my conception of how it works. Here we have someone who violated the privacy policy and tried to rationalize it. I explained how. ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Hi! Assuming you're not taking this out of context, please explain the difference between how it works and my conception of how it works. Sorry, I misread your statement. I took Volunteer admins as Volunteer sysadmins - my greatest apology. BR, Domas ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
I don't think that "any random admin on one of the projects should be able to insert a web bug into Common.js" is what he suggests. The Hungarian situation seems to have been in place with the support of the Hungarian community, at least at the start. Frankly, I'd rather see private sensitive data on an external server run by a couple of wiki volunteers than on an external server run by a contracted 3rd party supplier. I wish you well, teun On Sat, Jun 6, 2009 at 5:08 AM, Brian brian.min...@colorado.edu wrote: On Fri, Jun 5, 2009 at 8:46 PM, Samuel Klein meta...@gmail.com wrote: Peter said that he could run whatever was being done on an external server on a WMF machine that [core] developers have access to. What does this have to do with being Foundation staff? He is trying to rationalize his previous behavior by stating that he thinks any random admin on one of the projects should be able to insert a web bug into Common.js that logs user data to a non-Foundation server. I'm not sure what keywords to use but I seem to recall this issue coming up a couple of years ago (this exact case). At any rate it seems that he was aware of the controversy from the beginning. ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
On Sat, Jun 6, 2009 at 3:05 AM, Robert Rohde raro...@gmail.com wrote: On Fri, Jun 5, 2009 at 4:30 PM, Peter Gervaigrin...@gmail.com wrote: snip The community cannot decide that Random_user1 and Random_user2 etc will agree with the communities view on the stats being passed to an external server. As you are aware it's not really random user, so what you write is more rhetoric and less facts. I debate your statement as I believe the community can pretty much decide anything unless it violates some higher level policy, and it's been told this predates the PP. And I tend to disagree in its violation, but it's an open debate. snip The wording of Privacy Policy has always been rather vague and mushy (something I've complained about in the past). However the spirit of the policy, and the way it has been applied, might be summarized thusly: Personally identifiable data does not leave the WMF's control without the WMF's express permission. In general, the circumstances where people have access to such data and the purposes for which they can use it are explicitly defined in advance. You may not be aware, but the relaying of page view data to third party analysis platforms has been tried on a number of occasions in the past and consistently shutdown. (I think this even includes cases before the Privacy Policy was adopted.) However, to my recollection there has never been a case that quite mirrors yours since we are talking about a privately hosted server administered by a highly trusted community member. Given the situation with Wikimedia DE and the toolserver cluster etc., I think it should be possible in principle for the WMF to reach an agreement that allows data to be communicated to servers operated by Wikimedia chapters for purposes that benefit Wikimedia. In light of current sentiment and Foundation practice though, I think that any such arrangements should require prior approval. 
That your setup has existed for years can provide some confidence in its reasonableness and security, but I wouldn't support turning it back on until people have looked at and reviewed the details. Sorry for the abrupt way that things were handled, but erring on the side of protecting user privacy is generally a good thing. Now that you are here discussing the matter, I'd hope a reasonable solution can be found. This is an idea I would wholly support; please don't think I'm against the whole idea of stats. All I'm arguing against here is your current interpretation of the privacy policy. If you were to sign an agreement with the WMF about this, then I would support these stats and be very interested in what they provide and can be adapted to do. -Robert Rohde ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
This is another e-mail on this subject that just strikes me as flawed. These are not vague privacy fears - they are real privacy fears. I see a fundamental failure by those involved in this controversy to understand this point. On Sat, Jun 6, 2009 at 1:31 AM, Tisza Gergő gti...@gmail.com wrote: Robert Rohde raro...@... writes: You may not be aware, but the relaying of page view data to third party analysis platforms has been tried on a number of occasions in the past and consistently shutdown. (I think this even includes cases before the Privacy Policy was adopted.) However, to my recollection there has never been a case that quite mirrors yours since we are talking about a privately hosted server administered by a highly trusted community member. The (WM-DE-owned) toolserver ran a statistics script called WikiCharts for a few years, which worked with data relayed by Common.js from several wikipedias, including de and en. While that is not exactly the same situation (as the WMF has access to the toolserver), I think it proves my point that passing IP data to an (in the strict organizational sense) third-party server does not necessarily violate the privacy policy, neither letter nor spirit, as long as that server remains within the larger WM community. It is important to understand that this is a much more general question than that of web statistics: any third-party service that interacts with the standard wiki user interface receives private data, whether it needs it or not, because the user interface (the HTML page) is executed in the user's browser, and the browser has to contact the third-party service, and it cannot hide its IP in that process. For example, we considered setting up some sort of spell checking service for hu.wp. That is something that cannot be done well centrally - there is too much difference between languages. 
And if you do it with a local server, it has to communicate with the user's browser, and could in theory log requests and correlate them with edits on the wiki, thus it has to conform with the privacy guidelines. It would be a shame if all such uses would be blindly forbidden because of vague privacy fears. ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
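The point above, that any server contacted by the reader's browser necessarily sees the reader's IP address and could log it, can be made concrete with a small sketch of data minimization on the receiving side. Everything here is an assumption for illustration: the function names `anonymize_ip` and `minimal_log_record` are hypothetical, the /24 and /48 truncation widths are an arbitrary choice, and nothing in the thread says the huwp stats server or any proposed spell-checking service worked this way.

```python
import ipaddress
from urllib.parse import urlsplit


def anonymize_ip(addr):
    """Truncate an address to its /24 (IPv4) or /48 (IPv6) network.

    The prefix lengths are an assumption for this sketch, not a standard.
    """
    ip = ipaddress.ip_address(addr)
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def minimal_log_record(remote_addr, referer):
    """Build the only record a hypothetical helper service would keep.

    The full client address and the query string are dropped before storage.
    """
    return {
        "client": anonymize_ip(remote_addr),
        "page": urlsplit(referer).path,
    }


# Made-up request data using a documentation-reserved address.
record = minimal_log_record(
    "192.0.2.123",
    "https://hu.wikipedia.org/wiki/Budapest?action=edit",
)
```

Truncating an address reduces what a log can reveal but is not full anonymization, so a service like this would presumably still need to satisfy the privacy policy being debated.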
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
I also have not seen a clear explanation of what those who would like to generate statistics using web bugs plan to do with that data. How do they plan to use the data, and why aren't the plethora of statistics now made officially available by the WMF not satisfactory? You have bypassed the correct procedure. The amount of time that it takes the WMF to accomplish goals can be frustrating. Getting them to make your goal their goal can be frustrating. But it all has to start with you presenting them with a coherent goal that takes all the constraints into account. Then you need to get WMF approval which often involves getting community approval. Let's be clear that the privacy policy is a legal issue for the WMF. Volunteer admins cannot take user privacy into their own hands, under their own interpretation. That's just not how it works! 2009/6/6 Brian brian.min...@colorado.edu This is another e-mail on this subject that just strikes me as flawed. These are not vague privacy fears - they are real privacy fears. I see a fundamental failure by those involved in this controversy to understand this point. On Sat, Jun 6, 2009 at 1:31 AM, Tisza Gergő gti...@gmail.com wrote: Robert Rohde raro...@... writes: You may not be aware, but the relaying of page view data to third party analysis platforms has been tried on a number of occasions in the past and consistently shutdown. (I think this even includes cases before the Privacy Policy was adopted.) However, to my recollection there has never been a case that quite mirrors yours since we are talking about a privately hosted server administered by a highly trusted community member. The (WM-DE-owned) toolserver ran a statistics script called WikiCharts for a few years, which worked with data relayed by Common.js from several wikipedias, including de and en. 
While that is not exactly the same situation (as the WMF has access to the toolserver), I think it proves my point that passing IP data to an (in the strict organizational sense) third-party server does not necessarily violate the privacy policy, neither letter nor spirit, as long as that server remains within the larger WM community. It is important to understand that this is a much more general question than that of web statistics: any third-party service that interacts with the standard wiki user interface receives private data, whether it needs it or not, because the user interface (the HTML page) is executed in the user's browser, and the browser has to contact the third-party service, and it cannot hide its IP in that process. For example, we considered setting up some sort of spell checking service for hu.wp. That is something that cannot be done well centrally - there is too much difference between languages. And if you do it with a local server, it has to communicate with the user's browser, and could in theory log requests and correlate them with edits on the wiki, thus it has to conform with the privacy guidelines. It would be a shame if all such uses would be blindly forbidden because of vague privacy fears. ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
* clap - clap * John Peter Gervai wrote: Hello, I wasn't subscribed to this list, since I usually try to avoid the politics around it. I was notified, however, that some interesting claims were made and some steps taken (again) without any discussion whatsoever. First, let me say here again - as I have said on a different list - that I am extremely disappointed by the lack of discussion before someone from outside seriously interferes with another project based on, as it turns out, incorrect information. In the past people with privileges (if we ever considered them that way instead of people with work to be done) were more cautious. I would like all you fast-handed guys to slow down and talk first, get informed, and act later. I already commented elsewhere on vls; in summary, I miss the discussion and I do not believe the case actually breached any privacy, but this isn't my concern now (as I'm in a bit of a hurry). Regarding huwp, it would have been pretty easy to find out whom to ask. Apart from the obvious choice of anyone with any flags on huwp, it would have been easy to identify who made the changes, and ask them. Like, for example, me. As far as I can see, a lot of energy is being wasted, with people planning how to block javascript, how to block counters, etc. That is the wrong way. The right way, and I'm repeating myself, is FIRST to find out WHY these scripts are there in the first place, what problem they have to solve. This is a crucial step, fellows, which you neglected to take. (And we all know that the reason is to create usage stats.) The next step should be examining whether this violates anything, like the Privacy Policy. In the case of Google this is debatable, since I don't know the scope of the data retention. However, I know all about the Hungarian stats. Let me share the real information here, briefly, since I have to go soon, but I do not want to let you destroy something you're not aware of.
The stats (which have, surprise, a dedicated domain under the hu wikipedia domain) run on a dedicated server, with nothing else on it. Its sole purpose is to gather and publish the stats. Basically nobody has permission to log in to the server but me, and since I happen to be a checkuser as well, it wouldn't even be entertaining to read the logs, even if they weren't big enough to make this useless. I happen to be the one who created the Hungarian checkuser policy, which is, as far as I know, the strictest one in WMF projects, and it's no joke, and I intend to follow it. (And for those who are unfamiliar with me, I happen to be the founder of huwp as well, apart from my job in computer security.) If you had gathered this knowledge (which means that the server is closed and run by a user identified to the WMF), then you could have started the discussion. As is obvious, not making any interfering moves while discussing it for days, or even weeks, wouldn't have changed anything. What have you achieved by removing the code? You killed our stats, which provide us with the statistics the WMF originally provided (same data content) but later killed off. We (huwp) will propose some solutions to the problem, but I really have to go now. Tgr can help discuss it, and I thank him for his help in advance. :-) So, think about these over the weekend; I'm back on Monday. I hope there can be a _useful_ discussion, with thinking people and not people acting on impulse. Peter Gervai Hungary ___ foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
The strange thing is, some such servers seem to be outside discussion while others are not. ;)

John

Tisza Gergő wrote:

Nathan nawr...@... writes:

Others have since discussed more centralised and secure methods for providing these statistics via the WMF - this is the ideal outcome, and one that might have been achieved earlier had you proposed your method rather than simply going ahead alone.

Setting up an off-the-shelf awstats with an invisible pixel is web statistics 101, not something that needs to be invented. The reason nothing similar got implemented is not that nobody thought of this method, but that it wouldn't work with enwiki, so nobody cared. Actually, the old knams stat (which also collected referrers, so it was in some respects superior) could easily have been kept working by filtering out enwiki, and maybe the next few largest projects; again, nobody cared. Features that only benefit the smaller projects rarely get enough developer interest, which is understandable, but then it is only natural that those smaller projects try to solve their issues for themselves.

And we did it with privacy in mind - we would obviously have preferred Google Analytics ourselves, but we didn't switch because we didn't want the logs to leak to servers not controlled by the WM community, and because it shows data that can be used to identify IP addresses.
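To make the "invisible pixel" approach above a little more concrete, here is a minimal sketch in Python. It is not awstats and not the actual huwp setup; the pixel path `/pixel.gif`, the `page` query parameter and the function name `count_page_views` are all assumptions made up for illustration. It only shows that per-page counts can be derived from pixel requests in a Common Log Format access log without keeping the client address in the result. The web server still sees the address in its raw log, which is exactly the point under discussion in this thread.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Common Log Format: host ident user [time] "METHOD target PROTO" status size
LOG_LINE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD) (\S+) [^"]*" (\d{3}) \S+'
)

PIXEL_PATH = "/pixel.gif"  # hypothetical path of the invisible image


def count_page_views(lines):
    """Count successful pixel requests per page.

    The client address is parsed but never stored in the result.
    """
    views = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip malformed lines
        _host, target, status = match.groups()  # _host (the IP) is dropped here
        url = urlsplit(target)
        if url.path != PIXEL_PATH or status != "200":
            continue  # not a successful pixel request
        # "page" is an assumed query parameter carrying the article title
        page = parse_qs(url.query).get("page", [""])[0]
        if page:
            views[page] += 1
    return views


sample = [
    '192.0.2.1 - - [05/Jun/2009:20:15:00 +0200] "GET /pixel.gif?page=Budapest HTTP/1.1" 200 43',
    '192.0.2.2 - - [05/Jun/2009:20:15:02 +0200] "GET /pixel.gif?page=Budapest HTTP/1.1" 200 43',
    '192.0.2.3 - - [05/Jun/2009:20:15:05 +0200] "GET /pixel.gif?page=Duna HTTP/1.1" 200 43',
    '192.0.2.4 - - [05/Jun/2009:20:15:09 +0200] "GET /favicon.ico HTTP/1.1" 404 0',
]
print(count_page_views(sample))
```

The published output contains only titles and counts; whether the raw log behind it may live on a non-WMF server is the policy question, and the sketch does not answer it.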
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
You can make claims about what you yourself want or believe, but do *not* claim that your personal beliefs reflect legal issues for the Foundation. If the Foundation needs to make claims about what is and what is not a legal issue, then such claims should be made by Mike.

John

Brian wrote:

I also have not seen a clear explanation of what those who would like to generate statistics using web bugs plan to do with that data. How do they plan to use the data, and why is the plethora of statistics now made officially available by the WMF not satisfactory?

You have bypassed the correct procedure. The amount of time that it takes the WMF to accomplish goals can be frustrating. Getting them to make your goal their goal can be frustrating. But it all has to start with you presenting them with a coherent goal that takes all the constraints into account. Then you need to get WMF approval, which often involves getting community approval.

Let's be clear that the privacy policy is a legal issue for the WMF. Volunteer admins cannot take user privacy into their own hands, under their own interpretation. That's just not how it works!

2009/6/6 Brian brian.min...@colorado.edu:

This is another e-mail on this subject that just strikes me as flawed. These are not vague privacy fears - they are real privacy fears. I see a fundamental failure by those involved in this controversy to understand this point.

On Sat, Jun 6, 2009 at 1:31 AM, Tisza Gergő gti...@gmail.com wrote:

Robert Rohde raro...@... writes:

You may not be aware, but the relaying of page view data to third party analysis platforms has been tried on a number of occasions in the past and consistently shut down. (I think this even includes cases before the Privacy Policy was adopted.) However, to my recollection there has never been a case that quite mirrors yours, since we are talking about a privately hosted server administered by a highly trusted community member.

The (WM-DE-owned) toolserver ran a statistics script called WikiCharts for a few years, which worked with data relayed by Common.js from several wikipedias, including de and en. While that is not exactly the same situation (as the WMF has access to the toolserver), I think it proves my point that passing IP data to an (in the strict organizational sense) third-party server does not necessarily violate the privacy policy, in either letter or spirit, as long as that server remains within the larger WM community.

It is important to understand that this is a much more general question than that of web statistics: any third-party service that interacts with the standard wiki user interface receives private data, whether it needs it or not, because the user interface (the HTML page) is executed in the user's browser, and the browser has to contact the third-party service, and it cannot hide its IP in that process.

For example, we considered setting up some sort of spell checking service for hu.wp. That is something that cannot be done well centrally - there is too much difference between languages. And if you do it with a local server, it has to communicate with the user's browser, and could in theory log requests and correlate them with edits on the wiki, so it has to conform to the privacy guidelines. It would be a shame if all such uses were blindly forbidden because of vague privacy fears.
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Or by one of the WMF developers removing the web bug.

2009/6/6 John at Darkstar vac...@jeb.no:

You can make claims about what you yourself want or believe, but do *not* claim that your personal beliefs reflect legal issues for the Foundation. If the Foundation needs to make claims about what is and what is not a legal issue, then such claims should be made by Mike.

John

[snip]
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Are the developers lawyers? A developer claiming that something has an unwanted privacy issue is very different from making claims about something being a legal issue on behalf of the Foundation. Simply don't do it.

John

Brian wrote:

Or by one of the WMF developers removing the web bug.

[snip]
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
John at Darkstar wrote:

Are the developers lawyers? A developer claiming something has an unwanted privacy issue is very different from making claims about something being a legal issue on the behalf of Foundation. Simply don't do it.

Privacy is not simply a legal issue; it's a general social concern. Our privacy policy should not be treated as merely a legal document; it's an effort to express part of our social compact. Unless it's a matter of interpreting legal regulations somewhere else, I consider the developers equally competent to address privacy issues. If Brion or Tim or Domas identify an issue, they don't need to run to Mike every time to check that it really is something that should be addressed.

--Michael Snow
[Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Hello,

I wasn't subscribed to this list, since I usually try to avoid the politics around it. I was notified, however, that some interesting claims were made and some steps taken (again) without any discussion whatsoever.

First, let me say it here again - as I have said on a different list - that I am extremely disappointed by the lack of discussion before someone from outside seriously interfered with another project based on, as it turns out, incorrect information. In the past, people with privileges (if we ever considered them that way instead of as people with work to be done) were more cautious. I would like all you fast-handed guys to slow down and talk first, get informed, and act later.

I already commented elsewhere on vls; in summary, I miss the discussion and I do not believe the case actually breached any privacy, but this isn't my concern now (as I'm in a bit of a hurry).

Regarding huwp, it would have been pretty easy to find out whom to ask. Apart from the obvious choice of anyone with any flags on huwp, it would have been easy to identify who made the changes, and ask them. Like, for example, me.

As far as I can see, a lot of energy is being wasted, like people planning how to block javascript, how to block counters, etc. That is the wrong way. The right way, and I'm repeating myself again, is FIRST to get to know WHY these scripts are there in the first place, what problem they are meant to solve. This is a crucial step, fellows, which you neglected to take. (And we all know that the reason is to create usage stats.)

The next step should be examining whether there is anything this violates, like the Privacy Policy. In the case of Google this is debatable, since I don't know what the scope of the data retention is. However, I know all about the Hungarian stats.

Let me share the real information here, briefly, since I have to go soon, but I do not want to let you destroy something you're not aware of.

The stats (which have, surprisingly enough, a dedicated domain under the hu wikipedia domain) run on a dedicated server, with nothing else on it. Its sole purpose is to gather and publish the stats. Basically nobody has permission to log in to the server but me, and since I happen to be a checkuser as well, it wouldn't even be entertaining to read the logs, even if they weren't so big as to make that useless. I happen to be the one who created the Hungarian checkuser policy, which is, as far as I know, the strictest one in WMF projects, and it's no joke, and I intend to follow it. (And for those who are unfamiliar with me, I happen to be the founder of huwp as well, apart from my job in computer security.)

If you had gathered this knowledge (namely that the server is closed and run by a user identified to the WMF), then you could have started the discussion. Obviously, not making any interfering moves while discussing it for days, or even weeks, wouldn't have changed anything.

What have you achieved by removing the code? You killed our stats, which provided us with the statistics the WMF originally provided (same data content) but later killed off.

We (huwp) will propose some solutions to the problem, but I really have to go now. Tgr can help discuss it, and I thank him for his help in advance. :-)

So, think about these things over the weekend; I'm back on Monday. I hope there can be a _useful_ discussion, with thinking people and not people acting on impulse.

Peter Gervai
Hungary
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
I can understand your frustration, Peter, but perhaps hu.wp could also have taken a more collaborative approach. If you would like to use a method for collecting statistics that others will view as violating the privacy policy, or as presenting risks normally not considered throughout the rest of the Wikimedia community of projects, then you should propose your method for consideration prior to simply implementing it. As you note ("some steps taken (again)"), this has happened before, so some consultation with the rest of the community before going out on a limb is advised.

Others have since discussed more centralised and secure methods for providing these statistics via the WMF - this is the ideal outcome, and one that might have been achieved earlier had you proposed your method rather than simply going ahead alone.

Nathan
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Nathan wrote: [snip]

From what I'm reading, the Foundation already collects raw data containing the information collected by most normal websites (IP, I guess), and such data can only be released under special circumstances. External stats appear, to me, to violate the privacy policy. (http://meta.wikimedia.org/wiki/Meta:Privacy_policy)
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
I'd like to note, in the interest of facts, that the huwp stats have been in place (without complaint till now, June 2009) since October 2006; the current version of the privacy policy has been available in English since October 2008.

I think it might not be very productive to judge the implementation of a stats engine in light of a privacy policy that was adopted after the implementation, nor might it be fruitful to shift blame for not discussing something three years ago (which could even have been discussed in some way).

Best regards,
Bence Damokos

On Fri, Jun 5, 2009 at 8:15 PM, Nathan nawr...@gmail.com wrote:

Others have since discussed more centralised and secure methods for providing these statistics via the WMF - this is the ideal outcome, and one that might have been achieved earlier had you proposed your method rather than simply going ahead alone.
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
2009/6/5 Peter Gervai grin...@gmail.com:

[snip]

The stats (which have, surprisingly enough, a dedicated domain under the hu wikipedia domain) run on a dedicated server, with nothing else on it. Its sole purpose is to gather and publish the stats. Basically nobody has permission to log in to the server but me, and since I happen to be a checkuser as well, it wouldn't even be entertaining to read the logs, even if they weren't so big as to make that useless. I happen to be the one who created the Hungarian checkuser policy, which is, as far as I know, the strictest one in WMF projects, and it's no joke, and I intend to follow it. (And for those who are unfamiliar with me, I happen to be the founder of huwp as well, apart from my job in computer security.)

[snip]

Just a remark on the checkuser argument. Checkuser actions and checks are logged, and can be double-checked by other checkusers and stewards. This server cannot. I can imagine that this would pose a problem.

eia
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
effe iets anders wrote:

[snip]

Just a remark on the checkuser argument. Checkuser actions and checks are logged, and can be double-checked by other checkusers and stewards. This server cannot. I can imagine that this would pose a problem.

Checkuser also only stores the data for a known period of time (3 months) and, with the fairly recent exception of user-to-user email, only records actions that are publicly logged by MediaWiki (edits and other logged actions), not individual pageviews.

--
Alex (wikipedia:en:User:Mr.Z-man)
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
And that without any complaint from 2005 onward (practically from the beginning of huwiki's real existence).

B.

-Original Message-

It is linked from the statistics page and other relevant places, not exactly a secret.)
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
On Fri, Jun 5, 2009 at 9:49 PM, Tisza Gergő gti...@gmail.com wrote:

Bence Damokos bdamo...@... writes:

I'd like to note in the interest of facts that the Huwp stats have been implemented (without complaint till now, June 2009) since October 2006; the current version of the privacy policy has been available in English since October 2008.

It was implemented in October 2005, actually (not long after the knams stats stopped, IIRC); MediaWiki:Lastmodifiedat replaced an earlier message in 2006, which is why the page history doesn't go back further.

More importantly, the privacy policy explicitly states that developers might have access to the raw logs. The stat is thus in compliance with the letter of the privacy policy, and I don't see why it would be contrary to its spirit. (As stated, the only purpose is to provide statistics which include no personally identifiable information; the operator is one of the most trusted users of the hu.wp community, the founder of the community, the head of Wikimedia Hungary, admin, bureaucrat, checkuser, whatnot; and the stat server was operated with the knowledge and consent of the community. It is linked from the statistics page and other relevant places, not exactly a secret.)

There are a few issues with this. Devs have access to logs on WMF servers, not random external servers. The community cannot decide on behalf of Random_user1, Random_user2, etc. that they agree with the community's view on the stats being passed to an external server. Also, there *may* be issues with the security of that server that mean it could be compromised, and it could probably be accessed by the web hosting company if they so wished.

I still fail to see how, at this point (not before, when there was no policy), this can be considered acceptable. IP information etc. is still being passed to an external server, regardless of who operates it. Looking at the policy at http://meta.wikimedia.org/wiki/Privacy (copied below), I don't see where this is acceptable.

Release: Policy on Release of Data

It is the policy of Wikimedia that personally identifiable data collected in the server logs, or through records in the database via the CheckUser feature, or through other non-publicly-available methods, may be released by Wikimedia volunteers or staff, in any of the following situations:

1. In response to a valid subpoena or other compulsory request from law enforcement,
2. With permission of the affected user,
3. When necessary for investigation of abuse complaints,
4. Where the information pertains to page views generated by a spider or bot and its dissemination is necessary to illustrate or resolve technical issues,
5. Where the user has been vandalizing articles or persistently behaving in a disruptive way, data may be released to a service provider, carrier, or other third-party entity to assist in the targeting of IP blocks, or to assist in the formulation of a complaint to relevant Internet Service Providers,
6. Where it is reasonably necessary to protect the rights, property or safety of the Wikimedia Foundation, its users or the public.

Except as described above, Wikimedia policy does not permit distribution of personally identifiable information under any circumstances.

Regards
Mark
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Mark (Markie) wrote:

I still fail to see how, at this point (not before when there was no policy) this can be considered to be acceptable.

As I understand it, nobody is arguing that it's considered acceptable at this point. People involved in the Hungarian Wikipedia have been explaining the background, trying to establish that they shouldn't be blamed for having this in place. That's understandable as well, and I have no interest in seeing blame attached to anyone here. Let's just make sure these external trackers are removed, and that we work on our internal resources to collect information in a way consistent with the privacy policy.

--Michael Snow
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Apologies for this, I'm getting confused between multiple threads on this.

Regards
Mark

On Fri, Jun 5, 2009 at 10:22 PM, Michael Snow wikipe...@verizon.net wrote: [snip]
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
On Fri, Jun 5, 2009 at 5:22 PM, Michael Snow wikipe...@verizon.net wrote:

As I understand it, nobody is arguing that it's considered acceptable at this point.

Peter Gervai seemed to argue exactly that, unless I badly misread him:

"someone from outside seriously interfere with other project based on, as it turns out, incorrect information . . . I do not believe the case actually breached any privacy . . ."

And so did Tisza Gergő:

"More importantly, the privacy policy explicitly states that developers might have access to the raw logs. The stat is thus in compliance with the letter of the privacy policy, and I don't see why it would be contrary to its spirit."

The privacy policy clearly prohibits release of data to outside sources for the purpose of statistical analysis, since that doesn't fall within the six enumerated points under "Release: Policy on Release of Data". I suppose it's arguable, by the letter of the policy, that sending the data to a server which only a single Wikipedian has access to isn't "release". However, I think it's clear that the intent of the policy was otherwise, and Domas acted in accordance with established policy and with full understanding of the nature of the script he was removing.

It might be worth defining "release" more clearly to avoid any confusion in the future. Would it have been any different if the data was being sent to the toolserver instead of a totally third-party server, for instance? I'd think not, but it's not fully clear from reading the policy. How about a checkuser downloading some data to his computer for analysis beyond that permitted by the web-based interface? Why is that not "release" if downloading it to a server is? Does that depend on the amount, intent, or some other factor? (Or is it "release"? If so, why is it different from downloading web pages so you can view them in your browser?)

Also, there are multiple places where the policy vaguely and redundantly states that logs will not be publicized, in multiple ways: "is not made public", "will not be published", "is not reproduced publicly". In general, there's a lot of repetition that makes it hard to draw firm conclusions from the policy. If you just saw those mentions, you might think it was just fine to reproduce the data as long as it wasn't actually *public*. The policy could use more precise and condensed wording.
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
On Fri, Jun 5, 2009 at 5:58 PM, Tisza Gergő gti...@gmail.com wrote:
I do argue that it is not in violation of the privacy policy (whether the people here find it acceptable is another question).
It may be within the letter of the privacy policy. I think that's entirely arguable, since the policy is so vague. However, it's very clearly against the *intent* of the privacy policy as dictated by the Board. Domas Mituzas and Michael Snow are both Board members and have both made it clear that they think there's no question that the script in question violated the privacy policy. I believe the major problems with the script are
1) It sent data to a server not directly controlled by the Wikimedia Foundation. No personally identifiable information should be sent in bulk to any non-Wikimedia server. Operation of any server hosting significant amounts of sensitive information must be directly and immediately accountable to Wikimedia's normal chain of command.
2) This use of data was not specifically authorized by the Wikimedia Foundation, via either the Board or appropriate officers. Peter may be a checkuser, but that gives him authorization only to use checkuser functions, not to collect or harvest other types of data. As has been noted, the data collected includes much more than checkusers can access in the course of using their checkuser rights.
Neither of these points is made clear in the written privacy policy, however, if they are in fact intended.
Last I heard, Erik Zachte is working on improved statistics for all Wikimedia projects. These are running on Wikimedia servers and specifically approved by Wikimedia. It seems like the best course of action would be for people to point out what they think is lacking in his statistics, and perhaps offer to help improve them.
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Aryeh Gregor wrote:
On Fri, Jun 5, 2009 at 5:22 PM, Michael Snow wikipe...@verizon.net wrote:
As I understand it, nobody is arguing that it's considered acceptable at this point.
Peter Gervai seemed to argue exactly that, unless I badly misread him:
And so did Tisza Gergő:
Maybe it's just the lawyer in me, but I read those comments primarily as a defense against a perceived prosecution for allegedly violating the privacy policy. Not, and this is the distinction I was trying to get at, as positive arguments that this particular approach should be accepted going forward.
I suppose it's arguable by the letter of the policy that sending the data to a server which only a single Wikipedian has access to isn't release. However, I think it's clear that the intent of the policy was otherwise, and Domas acted in accordance with established policy and with full understanding of the nature of the script he was removing.
I agree that regardless of whether there was a technical policy violation, the setup was problematic, and I trust Domas's judgment in addressing the situation.
Also, there are multiple places where the policy vaguely and redundantly states that logs will not be publicized, in multiple ways: "is not made public", "will not be published", "is not reproduced publicly". In general, there's a lot of repetition that makes the policy hard to draw firm conclusions from. If you just saw those mentions, you might think it was just fine to reproduce it as long as it wasn't actually *public*. It could use more precise and condensed wording.
Policies being what they are, at some level they must state principles and will not be able to anticipate every single case. Implementation then depends on people exercising judgment when those cases arise. Some of the redundancy is possibly for emphasis, or out of an abundance of caution, so that people don't think an exception arises when something is not explicitly stated. That being said, suggestions for particular improvements are always welcome.
--Michael Snow
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Just a few sidenotes now.
2009/6/5 Mark (Markie) newsmar...@googlemail.com:
There are a few issues with this. Devs have access to logs on WMF servers, not random external servers.
This is a good suggestion; basically you say that I should ask the Foundation to provide me a server inside WMF with developer access. I don't mind that (as long as it has Debian installed). This is a good (though a bit expensive) _temporary_ solution, since it only serves huwp. It is not impossible to provide the service to other projects, but definitely not to any wp above huwp's size, since the current solution is a hack and does not scale. (And I could process squid logs, naturally, which is a better way to do it.) The final solution would be either to create a modified awstats that handles the stuff better or to write custom code for it. I don't really have the time to do either right now.
The community cannot decide that Random_user1 and Random_user2 etc will agree with the community's view on the stats being passed to an external server.
As you are aware, it's not really a random user, so what you write is more rhetoric and less fact. I dispute your statement, as I believe the community can pretty much decide anything unless it violates some higher-level policy, and it has been said that this predates the PP. And I tend to disagree that it is a violation, but it's an open debate.
Also there *may* be issues with the security of that server that mean it could be compromised and could probably be accessed by the web hosting company if they so wished.
Sure, but I happen to be the web hosting company as well. You are guessing instead of trying to get informed, as others around here do. As I told you, the only person accessing the site is myself. And security-wise there is no 100% security, and it's quite possible that Wikimedia servers tunnel all the data to the Chinese secret service. You may trust me to know my job as well. :-)
I still fail to see how, at this point (not before when there was no policy) this can be considered to be acceptable. IP information etc is still being [...]
Let me help.
Release: Policy on Release of Data
It is the policy of Wikimedia that personally identifiable data collected in the server logs, or through records in the database via the CheckUser feature, or through other non-publicly-available methods, may be released by Wikimedia volunteers or staff, in any of the following situations:
It is not the server log, it is not database records, and it is not some other non-publicly-available method used by the staff. So the data was not released by the staff to us. (And we did not happen to steal it from them either.) This complies with the policy.
Now, let's look at the volunteer part. We're volunteers, and some may argue that we are using a non-publicly-available method (even if the original intent was, in my opinion, clearly to cover methods used on the WMF servers, and _not_ this); in that case the policy requires us (the volunteers) not to release the identifiable data. And we comply, since we do not release any personally identifiable data. Do you see now?
Except as described above, Wikimedia policy does not permit distribution of personally identifiable information under any circumstances.
And it's great that you quoted that, since it shows nicely that we comply here as well, since we do not distribute p.i.i. under any circumstances.
But we - as huwp - aren't tied to this server, as I mentioned, and I'd gladly put it up on WMF servers, even if this does not really mean or change anything. But I find it unacceptable that anyone kills off the stats, which have been running for plenty of years now, without even trying to look around. I see that it's pretty easy, since none of you use it; it's somebody else's problem. Try for a moment to see it as if it were not.
And since it was okay for the past 5 years, I'd be glad if you would continue the discussion WHILE reverting your changes. I don't believe a few days would make a difference.
And another sidenote: if a newspaper makes a few false statements, what is the correct course of action? Telling them [kindly] that they're stupid, or interfering with your own projects and fellow editors? And which is easier?
Peter
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
This argument - which is effectively that community members should be considered Wikimedia Foundation staff members - is very brittle. It is neither sound nor valid. Do yourself a favor and consider the logic of the other side. It will save you from confusion later when you realize that you were the only person who didn't see it earlier.
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
On Fri, Jun 5, 2009 at 4:30 PM, Peter Gervai grin...@gmail.com wrote:
snip
The community cannot decide that Random_user1 and Random_user2 etc will agree with the community's view on the stats being passed to an external server.
As you are aware, it's not really a random user, so what you write is more rhetoric and less fact. I dispute your statement, as I believe the community can pretty much decide anything unless it violates some higher-level policy, and it has been said that this predates the PP. And I tend to disagree that it is a violation, but it's an open debate.
snip
The wording of the Privacy Policy has always been rather vague and mushy (something I've complained about in the past). However, the spirit of the policy, and the way it has been applied, might be summarized thusly: Personally identifiable data does not leave the WMF's control without the WMF's express permission. In general, the circumstances where people have access to such data and the purposes for which they can use it are explicitly defined in advance.
You may not be aware, but the relaying of page view data to third party analysis platforms has been tried on a number of occasions in the past and consistently shut down. (I think this even includes cases before the Privacy Policy was adopted.) However, to my recollection there has never been a case that quite mirrors yours, since we are talking about a privately hosted server administered by a highly trusted community member.
Given the situation with Wikimedia DE and the toolserver cluster etc., I think it should be possible in principle for the WMF to reach an agreement that allows data to be communicated to servers operated by Wikimedia chapters for purposes that benefit Wikimedia. In light of current sentiment and Foundation practice though, I think that any such arrangements should require prior approval. That your setup has existed for years can provide some confidence in its reasonableness and security, but I wouldn't support turning it back on until people have looked at and reviewed the details.
Sorry for the abrupt way that things were handled, but erring on the side of protecting user privacy is generally a good thing. Now that you are here discussing the matter, I'd hope a reasonable solution can be found.
-Robert Rohde
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
On Fri, Jun 5, 2009 at 6:22 PM, Tisza Gergő gti...@gmail.com wrote:
Tisza Gergő gti...@... writes:
I do argue that it is not in violation of the privacy policy (whether the people here find it acceptable is another question).
Just to make it clear, I don't think accordance with the privacy policy automatically entitles one to do something. The PP is a minimum set of requirements strong enough to assure users and weak enough to not hinder ourselves (as it is difficult to change it); if something is permitted by the policy, but the WMF or the developers or the relevant community is against it, then it will not be done.
That's a reasonable view.
So instead of talking about the privacy policy (which would be routinely violated if spread of IP data to non-WMF-owned servers would indeed be a violation - consider WikiMiniAtlas, for example) it would be more productive to talk about whether such a use is acceptable, and if not, what can be done to make it so.
Agreed. This is a matter of a local project wanting to maintain a long-standing feature or service without adversely affecting anyone, violating shared meta-community norms, or having to wait for bottlenecks in centralized implementation. It is very wiki to want to find ways to fix things yourself.
(For example, would it help if WM-HU took ownership? We could also write a complementary privacy policy for it, stating that it will never be used for any other reason than statistics, who has access, how long the raw logs are kept etc.)
Perhaps other messages in this thread will shed light here... I hear people outside of hu:wp expressing a desire to centralize and maintain a bottleneck for the simple reason that a bottleneck is easier to police.
SJ
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
Michael Snow writes:
Maybe it's just the lawyer in me, but I read those comments primarily as a defense against a perceived prosecution for allegedly violating the privacy policy.
I don't read them that way - rather as saying "This isn't clearly in violation; it has been working for a long time and has been publicly discussed before, ending in [default] acceptance; we weren't given any notice. What gives?"
Brian brian.min...@colorado.edu writes:
This argument - which is effectively that community members should be considered Wikimedia Foundation staff members - is very brittle. It is neither sound nor valid. Do yourself a favor and consider the logic of the other side. It will save you from confusion later when you realize that you were the only person who didn't see it earlier.
Peter said that he could run whatever was being done on an external server on a WMF machine that [core] developers have access to. What does this have to do with being Foundation staff?
Peter Gervai writes:
But we - as huwp - aren't tied to this server, as I mentioned, and I'd gladly put it up on WMF servers, even if this does not really mean or change anything. But I find it unacceptable that anyone kills off the stats, which have been running for plenty of years now, without even trying to look around. I see that it's pretty easy, since none of you use it; it's somebody else's problem. Try for a moment to see it as if it were not. And since it was okay for the past 5 years, I'd be glad if you would continue the discussion WHILE reverting your changes. I don't believe a few days would make a difference.
This seems like the heart of the matter. It sounds as though hu:wp wants to find a way to continue having access to stats and is happy to make this happen in a way that other devs are comfortable with (and willing to help with), but feels slighted.
Robert Rohde writes:
Sorry for the abrupt way that things were handled, but erring on the side of protecting user privacy is generally a good thing. Now that you are here discussing the matter, I'd hope a reasonable solution can be found.
You said it. While f-l isn't the place to find a technical solution (though this thread looks promising - http://lists.wikimedia.org/pipermail/wikitech-l/2009-June/043335.html), it may be the right place to discuss how to foreshadow and discuss changes that address the power balance between local and global projects. I can imagine similar changes resulting from adding a global wikimedia policy that is known to contradict policies on a few mid-sized wikis, and then instantly implementing the result. [Peter: would you have considered a mention on this list notice? on wikitech? on hu:wp?]
SJ
Re: [Foundation-l] Wikipedia tracks user behaviour via third party companies #2
On Fri, Jun 5, 2009 at 8:46 PM, Samuel Klein meta...@gmail.com wrote:
Peter said that he could run whatever was being done on an external server on a WMF machine that [core] developers have access to. What does this have to do with being Foundation staff?
He is trying to rationalize his previous behavior by stating that he thinks any random admin on one of the projects should be able to insert a web bug into Common.js that logs user data to a non-Foundation server. I'm not sure what keywords to use, but I seem to recall this issue coming up a couple of years ago (this exact case). At any rate, it seems that he was aware of the controversy from the beginning.