> Thanks for the dump of data Nuria. I assume these all add up to 100%
> (roughly) and are global?
Roughly and global. Yes to both.  As I said, "very" preliminary data. Good
enough to triage bugs though.

> So if I understand correctly, if I get the above access and follow your
> instructions I can get this data whenever I need it, until we have some nice
> page I can visit to retrieve it :).
Yes. I will also be happy to re-run it if need be.
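
(Until you have Hive access, a rough approximation is also possible over the
sampled logs with the WMUtils helpers Oliver uses further down this thread;
the sketch below is untested, and the dates, cutoff and file name are just
illustrative.)

library(WMUtils)
library(data.table)

# A month of sampled request logs, one file per day, bound into one table.
logs <- do.call("rbind", lapply(seq(20140901, 20140930, 1), sampled_logs))

# Drop junk requests, parse the user agents and count per OS/browser pair.
uas <- as.data.table(ua_parse(data_sieve(logs)$user_agent))
report <- uas[, list(requests = .N), by = c("os", "browser")]

# Express as a percentage of parsed requests and apply the 0.05% cutoff.
report[, percent := 100 * requests / sum(requests)]
report <- report[percent >= 0.05][order(-percent)]
write.table(report, file = "ua_report.tsv", sep = "\t",
            row.names = FALSE, quote = TRUE)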

On Fri, Oct 17, 2014 at 11:30 AM, Jon Robson <jrob...@wikimedia.org> wrote:

> On Wed, Oct 15, 2014 at 12:12 PM, Andrew Otto <ao...@wikimedia.org> wrote:
> > Jon,
> >
> > Recent unsampled webrequest logs are available for querying in Hive now!
> >
> > https://wikitech.wikimedia.org/wiki/Analytics/Cluster
> >
> > :)
> >
> > If you don’t already have access for this, submit an RT request to get
> > access to stat1002 and the analytics-privatedata-group.
> >
>
> That's good to know. Thanks. I'm not sure if I have stat1002 access
> but every time you mention RT I shudder ;-)
>
> Thanks for the dump of data Nuria. I assume these all add up to 100%
> (roughly) and are global? So if I understand correctly, if I get the
> above access and follow your instructions I can get this data whenever I
> need it, until we have some nice page I can visit to retrieve it :).
>
> It's good to know where to go when we have these sorts of questions, so
> thanks a bunch. We are currently interested in phablet traffic (big-screen
> mobile devices), so this should be useful information for us. Thanks!
>
>
> On Thu, Oct 16, 2014 at 7:15 PM, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>And I have no idea what our traffic for
> >>Android 2.1 and 2.2 is and if it is significant e.g. more than 1% of
> >>our traffic.
> > So the answer to this question (with preliminary data) is that neither 2.1
> > nor 2.2 amounts to even 0.05% of traffic to the mobile site.
> >
> > I have attached the list of user agents and devices (with percentages) for
> > the last 30 days. I did not include any device/browser combo with less
> > than 0.05% of traffic.
> >
> > For about 4% of traffic we could not identify the browser; this might be
> > because the user agent was missing or because ua-parser could not figure it
> > out. I understand this is not ideal, but I am sending it anyway because I
> > feel this list provides quite a bit of value and should help you triage
> > bugs.
> >
> > iOS takes the cake, which never ceases to amaze me.
> >
> > I described what I did to gather the data here (anyone with access to
> > stat1002 can reproduce it):
> > https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive/QueryUsingUDF
> >
> >
> > On Wed, Oct 15, 2014 at 12:15 PM, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>
> >> >And I have no idea what our traffic for
> >> >Android 2.1 and 2.2 is and if it is significant e.g. more than 1% of
> >> >our traffic.
> >> Understood, it is hard for you guys to work without knowing this data. I
> >> will try to get a user agent list for data from the last month but, as I
> >> mentioned earlier, I think providing this data on a regular basis
> >> (monthly?) is a good goal for us.
> >>
> >> On Wed, Oct 15, 2014 at 10:35 AM, Jon Robson <jrob...@wikimedia.org>
> >> wrote:
> >>>
> >>> Anything would be useful. I just hit this situation again. I was
> >>> reviewing some code and someone used JSON.stringify - this is not
> >>> available in Android < 2.3, and I have no idea what our traffic for
> >>> Android 2.1 and 2.2 is or whether it is significant, e.g. more than 1%
> >>> of our traffic.
> >>>
> >>> In the meantime, while I don't have a fancy place to find the answers to
> >>> this, how can I get them?
> >>> Should I mail the analytics mailing list to ask these questions? Cc a
> >>> point person on bugzilla with the question? Ping someone privately?
> >>>
> >>> Jon
> >>>
> >>>
> >>>
> >>> On Tue, Oct 14, 2014 at 10:30 AM, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>> >>Woah! Nice :D How are definitions updates handled?
> >>> > Since we talked about this on IRC, restating here to keep the archives
> >>> > happy.
> >>> > We pull the ua-parser jar from our Archiva depot; an update involves
> >>> > building a new jar, uploading it to Archiva and updating our dependency
> >>> > file (pom.xml) to point to the newly updated version.
> >>> >
> >>> >
> >>> >
> >>> > On Fri, Oct 10, 2014 at 9:59 PM, Oliver Keyes <oke...@wikimedia.org>
> >>> > wrote:
> >>> >>
> >>> >> Woah! Nice :D How are definitions updates handled?
> >>> >>
> >>> >> On 10 October 2014 18:58, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>> >>>
> >>> >>> >1. A UDF for ua-parser or whatever we decide to use (this will
> >>> >>> >possibly be necessary for pageviews, but not necessarily - it
> >>> >>> >depends on our spider/automaton detection strategy)
> >>> >>> We got this one ready today:
> >>> >>> https://gerrit.wikimedia.org/r/#/c/166142/
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> On Fri, Oct 10, 2014 at 3:55 PM, Oliver Keyes <oke...@wikimedia.org>
> >>> >>> wrote:
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>> On 10 October 2014 16:02, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>> >>>>>
> >>> >>>>> >At some point I believe we hope to just, you know. Have a
> >>> >>>>> > regularly
> >>> >>>>> > updated browser matrix somewhere.
> >>> >>>>> I REALLY think this should make it into our goals; if it cannot be
> >>> >>>>> done this quarter it should for sure be done next quarter.
> >>> >>>>>
> >>> >>>>
> >>> >>>> I agree it would be nice. It's one of those things that will either
> >>> >>>> come as a side-effect of other stuff, OR require substantially more
> >>> >>>> work, and nothing in-between. Things we need for it:
> >>> >>>>
> >>> >>>> 1. A UDF for ua-parser or whatever we decide to use (this will
> >>> >>>> possibly be necessary for pageviews, but not necessarily - it
> >>> >>>> depends on our spider/automaton detection strategy)
> >>> >>>> 2. Pageviews data
> >>> >>>> 3. A table somewhere.
> >>> >>>>
> >>> >>>> Take 1, apply to 2, stick in 3. Maybe grab the same data for
> >>> >>>> text/html
> >>> >>>> requests overall (depends on query runtime), maybe don't.
> >>> >>>>
> >>> >>>> The ideal implementation, obviously, is to pair this up with a site
> >>> >>>> that automatically parses the results into HTML. That should be the
> >>> >>>> end goal, but in terms of engineering support we can get most of the
> >>> >>>> way there simply by ensuring we always have a recent snapshot to
> >>> >>>> hand. I can probably put something together over the sampled logs
> >>> >>>> and throw it in SQL if there are urgent needs.
> >>> >>>>
> >>> >>>>>
> >>> >>>>> Do we not have more recent data than May?
> >>> >>>>
> >>> >>>>
> >>> >>>> We don't, but thanks to the utilities library I built, the code for
> >>> >>>> generating it would literally run:
> >>> >>>>
> >>> >>>> library(WMUtils)
> >>> >>>> library(data.table)
> >>> >>>>
> >>> >>>> uas <- as.data.table(ua_parse(data_sieve(do.call("rbind",
> >>> >>>>   lapply(seq(20140901, 20140930, 1), sampled_logs)))$user_agent))
> >>> >>>> uas <- uas[, list(requests = .N), by = c("os", "browser")]
> >>> >>>>
> >>> >>>> write.table(uas, file = "uas_for_jon.tsv", sep = "\t",
> >>> >>>>   row.names = FALSE, quote = TRUE)
> >>> >>>>
> >>> >>>> ...assuming we didn't care about readability.
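> >>> >>>>
> >>> >>>> (If we care about shares rather than raw counts, which is what the
> >>> >>>> "more than 1% of our traffic" question really needs, one extra line
> >>> >>>> on the end would do it; a sketch I haven't actually run:
> >>> >>>>
> >>> >>>> uas[, percent := 100 * requests / sum(requests)]
> >>> >>>>
> >>> >>>> ...then sort or filter on "percent" as needed.)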
> >>> >>>>
> >>> >>>> Point is, in the time until we have the new parser built into Hadoop
> >>> >>>> and all of that set up, we can totally generate interim data from
> >>> >>>> the sampled logs using the same parser at a tiny cost in
> >>> >>>> research/programming time, iff (the mathematical if) we need it
> >>> >>>> enough that we're cool with the sampling, and people can convince
> >>> >>>> [[Dario|Our Great Leader]] to authorise me to spend 15 minutes of my
> >>> >>>> time on it.
> >>> >>>>
> >>> >>>>>
> >>> >>>>>
> >>> >>>>> On Fri, Oct 10, 2014 at 12:45 PM, Oliver Keyes
> >>> >>>>> <oke...@wikimedia.org>
> >>> >>>>> wrote:
> >>> >>>>>>
> >>> >>>>>> Email Dario and me; if he prioritises it I'll run a check on more
> >>> >>>>>> recent data.
> >>> >>>>>>
> >>> >>>>>> At some point I believe we hope to just, you know. Have a
> >>> >>>>>> regularly
> >>> >>>>>> updated browser matrix somewhere. This comes some time after
> >>> >>>>>> pageviews
> >>> >>>>>> though.
> >>> >>>>>>
> >>> >>>>>> On 10 October 2014 14:38, Toby Negrin <tneg...@wikimedia.org>
> >>> >>>>>> wrote:
> >>> >>>>>>>
> >>> >>>>>>> Hi Jon -- I'm sure other folks will have more information, but
> >>> >>>>>>> here's a link to a slide with some data from May [1]. We don't
> >>> >>>>>>> see a lot of Windows Phone traffic.
> >>> >>>>>>>
> >>> >>>>>>> -Toby
> >>> >>>>>>>
> >>> >>>>>>> [1]
> >>> >>>>>>> https://docs.google.com/a/wikimedia.org/presentation/d/19tZgTi6VUG04wfGWVzcaZKY26oQiXhPaHI9g2tBmMKE/edit#slide=id.g382406373_08
> >>> >>>>>>>
> >>> >>>>>>> On Fri, Oct 10, 2014 at 11:17 AM, Jon Robson
> >>> >>>>>>> <jrob...@wikimedia.org>
> >>> >>>>>>> wrote:
> >>> >>>>>>>>
> >>> >>>>>>>> I was going through our backlog again today, and I noticed a bug
> >>> >>>>>>>> about supporting editing on Windows Phones with IE9 [1].
> >>> >>>>>>>>
> >>> >>>>>>>> Yet again, I wondered 'how many of our users are using IE9', and
> >>> >>>>>>>> whether, because of this lack of support, we are losing out on
> >>> >>>>>>>> lots of potential editors.
> >>> >>>>>>>>
> >>> >>>>>>>> What's the easiest way to get this information now? Is it
> >>> >>>>>>>> available?
> >>> >>>>>>>>
> >>> >>>>>>>> [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=55599
> >>> >>>>>>>>
> >>> >>>>>>>
> >>> >>>>>>>
> >>> >>>>>>
> >>> >>>>>>
> >>> >>>>>>
> >>> >>>>>> --
> >>> >>>>>> Oliver Keyes
> >>> >>>>>> Research Analyst
> >>> >>>>>> Wikimedia Foundation
> >>> >>>>>>
> >>> >>>>>>
> >>> >>>>>
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>> --
> >>> >>>> Oliver Keyes
> >>> >>>> Research Analyst
> >>> >>>> Wikimedia Foundation
> >>> >>>
> >>> >>>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Oliver Keyes
> >>> >> Research Analyst
> >>> >> Wikimedia Foundation
> >>> >
> >>> >
> >>
> >>
> >
>
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
