What is missing is the disaggregated statistics (referrers to specific pages, etc.), and possibly a lot more: I just pulled a couple of examples off the top of my head. I am not a GA specialist; it is just one of many things I do in my overall job. The specific metrics available will actually depend on the version of the GA tracker being run, on the options enabled in GA Admin, etc.
And if people don't see the value in having more detailed statistics, I will not waste my time on doing it. I have no commercial interest riding on the decision.

My understanding was that the analytics was in place but there was nobody volunteering to leverage it, so we were paying the "information leakage tax" without getting anything out of it. I've offered to solve the "nobody volunteered" part to - at least - have a fully informed discussion.

This conversation feels like it is veering towards a formal vote on the "information leakage tax". If that's actually what we want to do, I am +0 on keeping it for at least 3 months for Lucene and +1 for having it for Solr, with a review at the end of that.

Regards,
   Alex.

On Wed, 3 Mar 2021 at 09:41, Robert Muir <[email protected]> wrote:
>
> I'm not trying to come across as anti-analytics, I'm not. But I feel a lot of
> those questions can be answered by the aggregate stats already provided by
> apache (presumably from httpd access_log), without adding
> privacy-invading-google-tracker javascripts and cookies. So, while your
> answers are good, they don't justify google analytics in my eyes.
>
> As an example, let's look at
> https://uls.apache.org/exports/lucene.apache.org.yaml and consider your list
> 1. You can see breakdown of pageviews and "visitors" by day. I don't know how
> they determine unique "visitor" since it isn't cookie tracking: maybe some
> combo of (IP address, TLS session ID, user agent), but whatever they have is
> good enough for me.
> 2. I can see most popular pages and your 6.6 ref guide stuff
> 3. Top referrers gives you a rough idea of where people are coming from
> (including internal referrers). So people are clicking links on those pages.
> 4. see #1.
> 5. see #3. Google provides no additional magic here, this is referer (sic)
> header either way.
> 6. I think the download process is actually hacked up/convoluted just to
> force some GA tracking. At least I know if I disable javascript, the download
> buttons still work.
> 7. what is missing?
>
>
> On Wed, Mar 3, 2021 at 9:15 AM Alexandre Rafalovitch <[email protected]>
> wrote:
>>
>> I block any analytics I can find. I am with you on the overall positioning.
>> And yes, the absolute numbers lie.
>>
>> At the same time, we can get a lot of relative numbers and trends that are
>> valuable in other ways.
>>
>> For example:
>> 1) Do the social media announcements of new releases drive people to
>> download Solr?
>> 2) Which Ref Guide pages (if we had GA there) are most popular and why can't
>> we convince users to use the latest version instead of 6.6 (looking at
>> referrals). My specific peeve is that I think URPs page should be a lot more
>> visible, I would love to see if my assumptions are true by seeing if people
>> discover that page, relative to other pages.
>> 3) What is the page flow on the website? Are there any pages that are
>> completely invisible because of how we linked to them? Are there super popular
>> pages that are completely out of date?
>> 4) Do we have an increase or decrease in traffic matching specific events?
>> 5) Is there a specific partner/agency site that is driving a lot of
>> attention to Solr; can we replicate that with others?
>> 6) Do we even count downloads in GA? Because GA is for HTML pages only by
>> default
>> 7) If any of this is valuable, but we want to pull out GA anyway, this would
>> help to know what tracking information we would like from Apache Infra?
>>
>> In general, these kinds of questions are the domain of Developer
>> Relationships role. Lucene/Solr project does not have one as such, which may
>> explain why not many people understand the values of modern analytics
>> solutions. I am offering my time to make the value of analytics concrete, so
>> we are making the next decision based on reality rather than our collective
>> imagination of what analytics actually does or does not do.
>>
>> Regards,
>> Alex.
>>
>> On Wed., Mar. 3, 2021, 8:40 a.m. Robert Muir, <[email protected]> wrote:
>>>
>>> On Wed, Mar 3, 2021 at 8:35 AM Michael Sokolov <[email protected]> wrote:
>>>>
>>>> Before you look, should we have a betting pool on the number of
>>>> downloads/day? I will arrange for a bottle of some excellent liquid to
>>>> be sent to the closest guess at the number of redirects to the mirror
>>>> sites, as determined by Alexandre. Also, has it been increasing over
>>>> the last year? Finally, if we can predict these trends using activity
>>>> on the main apache site, maybe we don't need to track independently.
>>>
>>>
>>> Why do we even care?
>>>
>>> How many users are downloading lucene tgz from the site versus using an
>>> artifact in maven repositories (via maven, gradle, etc)? How many users are
>>> downloading solr tgz from the site versus using solr official image from
>>> docker hub?
>>>
>>> I'm just asking these questions to try to understand the need for the
>>> google tracking.
>>>
