I'm not trying to come across as anti-analytics; I'm not. But I feel a lot
of those questions can be answered by the aggregate stats already provided
by Apache (presumably from the httpd access_log), without adding
privacy-invading Google tracker JavaScript and cookies. So, while your
answers are good, they don't justify Google Analytics in my eyes.

As an example, let's look at
https://uls.apache.org/exports/lucene.apache.org.yaml and consider your list:
1. You can see a breakdown of pageviews and "visitors" by day. I don't know
how they determine a unique "visitor" since it isn't cookie tracking: maybe
some combo of (IP address, TLS session ID, user agent), but whatever they
have is good enough for me.
2. I can see the most popular pages and your 6.6 ref guide stuff.
3. Top referrers gives you a rough idea of where people are coming from
(including internal referrers). So people are clicking links on those
pages.
4. See #1.
5. See #3. Google provides no additional magic here; this is the referer (sic)
header either way.
6. I think the download process is actually hacked up/convoluted just to
force some GA tracking. At least I know that if I disable JavaScript, the
download buttons still work.
7. what is missing?
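
For illustration, here is a minimal sketch of how aggregates like the ones in
points 1-3 could be derived from a plain httpd access_log. It is only an
assumption about the general approach, not how the Apache stats are actually
produced. It assumes the "combined" log format, approximates a "visitor" as a
distinct (IP address, user agent) pair per day (TLS session IDs are not in
that log format, so they are left out), and uses made-up sample lines. The
names summarize, LINE_RE and SAMPLE are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumed input: Apache httpd "combined" log format.
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Aggregate pageviews, approximate visitors, top pages and referrers."""
    pageviews = Counter()          # day -> number of successful requests
    visitors = defaultdict(set)    # day -> set of (ip, user agent) pairs
    pages = Counter()              # path -> hits
    referrers = Counter()          # Referer header value -> hits
    for line in lines:
        m = LINE_RE.match(line)
        if m is None or m["status"] != "200":
            continue               # skip unparseable lines and non-200s
        day = m["day"]
        pageviews[day] += 1
        # Assumption: (IP, user agent) stands in for a unique visitor.
        visitors[day].add((m["ip"], m["agent"]))
        pages[m["path"]] += 1
        if m["referer"] not in ("", "-"):
            referrers[m["referer"]] += 1
    return {
        "pageviews": dict(pageviews),
        "visitors": {d: len(v) for d, v in visitors.items()},
        "top_pages": pages.most_common(5),
        "top_referrers": referrers.most_common(5),
    }


# Made-up sample lines (documentation IP ranges), for illustration only.
SAMPLE = [
    '192.0.2.1 - - [03/Mar/2021:09:15:00 +0000] "GET /solr/downloads.html '
    'HTTP/1.1" 200 5120 "https://twitter.com/" "Mozilla/5.0"',
    '192.0.2.1 - - [03/Mar/2021:09:16:00 +0000] "GET /solr/guide/ '
    'HTTP/1.1" 200 2048 "https://lucene.apache.org/solr/downloads.html" '
    '"Mozilla/5.0"',
    '198.51.100.7 - - [03/Mar/2021:10:00:00 +0000] "GET /solr/downloads.html '
    'HTTP/1.1" 200 5120 "-" "curl/7.68.0"',
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

None of this needs JavaScript or cookies on the client; everything comes from
headers the server sees on every request anyway.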


On Wed, Mar 3, 2021 at 9:15 AM Alexandre Rafalovitch <arafa...@gmail.com>
wrote:

> I block any analytics I can find. I am with you on the overall
> positioning. And yes, the absolute numbers lie.
>
> At the same time, we can get a lot of relative numbers and trends that are
> valuable in other ways.
>
> For example:
> 1) Do the social media announcements of new releases drive people to
> download Solr?
> 2) Which Ref Guide pages (if we had GA there) are most popular, and why
> can't we convince users to use the latest version instead of 6.6 (looking
> at referrals)? My specific peeve is that I think the URPs page should be a
> lot more visible; I would love to see if my assumptions are true by seeing
> if people discover that page, relative to other pages.
> 3) What is the page flow on the website? Are there any pages that are
> completely invisible because of how we linked to them? Are there super
> popular pages that are completely out of date?
> 4) Do we have an increase or decrease in traffic matching specific events?
> 5) Is there a specific partner/agency site that is driving a lot of
> attention to Solr; can we replicate that with others?
> 6) Do we even count downloads in GA? Because GA is for HTML pages only by
> default.
> 7) If any of this is valuable, but we want to pull out GA anyway, this
> would help us know what tracking information we would like from Apache
> Infra.
>
> In general, these kinds of questions are the domain of a Developer
> Relations role. The Lucene/Solr project does not have one as such, which
> may explain why not many people understand the value of modern analytics
> solutions. I am offering my time to make the value of analytics concrete,
> so that we make the next decision based on reality rather than on our
> collective imagination of what analytics actually does or does not do.
>
> Regards,
>    Alex.
>
>
>
>
> On Wed., Mar. 3, 2021, 8:40 a.m. Robert Muir, <rcm...@gmail.com> wrote:
>
>>
>>
>> On Wed, Mar 3, 2021 at 8:35 AM Michael Sokolov <msoko...@gmail.com>
>> wrote:
>>
>>> Before you look, should we have a betting pool on the number of
>>> downloads/day? I will arrange for a bottle of some excellent liquid to
>>> be sent to the closest guess at the number of redirects to the mirror
>>> sites, as determined by Alexandre. Also, has it been increasing over
>>> the last year? Finally, if we can predict these trends using activity
>>> on the main apache site, maybe we don't need to track independently.
>>>
>>
>> Why do we even care?
>>
>> How many users are downloading lucene tgz from the site versus using an
>> artifact in maven repositories (via maven, gradle, etc)? How many users are
>> downloading solr tgz from the site versus using solr official image from
>> docker hub?
>>
>> I'm just asking these questions to try to understand the need for the
>> google tracking.
>>
>>
