On Fri, Aug 12, 2011 at 5:35 PM, Dennis E. Hamilton <[email protected]> wrote:
> +1
>
> on no data collected before we have a clear need for it and are prepared to
> deal with it responsibly
>
I agree, but I think we will want to enable analytics as soon as the new
wiki and website are live. So we should get this into the privacy policy
from the start.

If you know anything about web analytics, you know that this is not
something where you ask a question today, turn it on, and have the
answer an hour later. It requires that you collect the data over a
sustained period of time. If you want to make informed decisions, and
have a statistically sound basis for using the data, you need to have
collected it in advance. You are typically asking what the effect of a
change has been on access patterns. So you need baseline data, as well
as post-change data. And often you want retrospective data to inform a
decision today.

Remember, we're not collecting any personally identifying information
here. It is aggregate information about what countries visitors are
coming from, at what times of the day, what pages are most frequently
visited, how long they are lingering on various pages, what websites
are referring the most visitors, what browsers most of them are using,
etc.

> - Dennis
>
> -----Original Message-----
> From: Simon Phipps [mailto:[email protected]]
> Sent: Friday, August 12, 2011 14:15
> To: [email protected]
> Subject: Re: [WWW] Web analytics
>
>
> On 12 Aug 2011, at 22:01, Rob Weir wrote:
>
>> On Fri, Aug 12, 2011 at 4:30 PM, Simon Phipps <[email protected]> wrote:
>>>
>>> I suggest the right question is "which project members need which data and
>>> why". The answer today may well be "none", since we don't actually have any
>>> resources to visit yet. This is also likely to change over time, and we'll
>>> need to add analytics as and when people request (and justify) according to
>>> their specific needs and remove them when they're no longer justified.
>>>
>>> I suggest we resist the idea of capturing bulk analytics "just because",
>>> and instead devise a lightweight process for justifying and requesting
>>> collection of data.
>>> I'd guess there is already a process to copy somewhere
>>> in Apache - any mentors with suggestions where to look?
>>>
>>
>> I see that tracking code is used with the websites of most of the
>> groups you are affiliated with: LibreOffice (Piwik), ForgeRock
>> (Google Analytics, including in the community pages) and OSI (Google
>> Analytics). And as was mentioned before, OpenOffice.org uses Google
>> Analytics currently.
>
> It is indeed endemic. We have a unique opportunity to address the issue
> thoughtfully. And by the way I am delighted you're paying such close
> attention to my career.
>
>>
>> Have you given them similar advice?
>
> Where possible, yes. Abuse of personal data is something which concerns me
> greatly.
>
>> Or is there something special
>> about OpenOffice at Apache
>
> Yes. At the moment as far as I am aware AOOo has no significant resources of
> interest to non-project-members and no groups of members with active
> applications for the data from analytics. Both situations will certainly
> change, but on best YAGNI principles I suggest doing what's needed when it's
> needed, on the basis of actual documented requirements.
>
>> that suggests that we should not be
>> optimizing our website based on visitor stats like others, including
>> LibreOffice, are?
>
> When we have an end-user website capable of optimisation along with project
> members stepping forward to harvest the data, process it and act upon it,
> that will be a fine thing to do. What's needed, when it's needed.
>
> S.
