>Thanks for the dump of data Nuria. I assume these all add up to 100% (roughly) and are global?
Roughly and global. Yes to both. As I said, "very" preliminary data. Good enough to triage bugs though.
>So if I understand correctly, if I get the above access and follow your
>instructions I can get this data when I do need it until we have some nice
>page I can go to to retrieve it :).
Yes. I will also be happy to re-run it if need be.

On Fri, Oct 17, 2014 at 11:30 AM, Jon Robson <jrob...@wikimedia.org> wrote:

> On Wed, Oct 15, 2014 at 12:12 PM, Andrew Otto <ao...@wikimedia.org> wrote:
> > Jon,
> >
> > Recent unsampled webrequest logs are available for querying in Hive now!
> >
> > https://wikitech.wikimedia.org/wiki/Analytics/Cluster
> >
> > :)
> >
> > If you don’t already have access for this, submit an RT request to get
> > access to stat1002 and the analytics-privatedata-group.
> >
>
> That's good to know. Thanks. I'm not sure if I have stat1002 access
> but every time you mention RT I shudder ;-)
>
> Thanks for the dump of data Nuria. I assume these all add up to 100%
> (roughly) and are global? So if I understand correctly, if I get the
> above access and follow your instructions I can get this data when I
> do need it until we have some nice page I can go to to retrieve it :).
>
> This is good to know when we have these sort of questions so thanks a
> bunch. We are currently interested in phablet traffic (big screen
> mobile devices) so this should be useful information for us thanks!
>
>
> On Thu, Oct 16, 2014 at 7:15 PM, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>And I have no idea what our traffic for
> >>Android 2.1 and 2.2 is and if it is significant e.g. more than 1% of
> >>our traffic.
> > So the answer to this question (with preliminary data) is that neither 2.1
> > nor 2.2 amount to 0.05% of traffic to the mobile site.
> >
> > I have attached the list of user agents and devices (with percentages) for
> > the last 30 days. I did not include any device/browser combo with less than
> > 0.05% of traffic.
> >
> > For about 4% of traffic we could not identify the browser. This might be
> > because the user agent was not there or because ua-parser could not figure
> > it out. I understand this is not ideal, but I am sending this because I feel
> > this list provides quite a bit of value and should help you triage bugs.
> >
> > iOS takes the cake, which does not cease to amaze me.
> >
> > I described what I did to gather the data here (anyone with access to
> > stat1002 can repro):
> > https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Hive/QueryUsingUDF
> >
> >
> > On Wed, Oct 15, 2014 at 12:15 PM, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>
> >> >And I have no idea what our traffic for
> >> >Android 2.1 and 2.2 is and if it is significant e.g. more than 1% of
> >> >our traffic.
> >> Understood, it is hard for you guys to work without knowing this data. I
> >> will try to get a user agent list for data from last month but, as I
> >> mentioned earlier, I think providing this data on a regular basis
> >> (monthly?) is a good goal for us.
> >>
> >> On Wed, Oct 15, 2014 at 10:35 AM, Jon Robson <jrob...@wikimedia.org>
> >> wrote:
> >>>
> >>> Anything would be useful. I just hit this situation again. I was
> >>> reviewing some code and someone used JSON.stringify - this is not
> >>> available in Android < 2.3 and I have no idea what our traffic for
> >>> Android 2.1 and 2.2 is and if it is significant e.g. more than 1% of
> >>> our traffic.
> >>>
> >>> In the mean time while I don't have a fancy place to find out the
> >>> answers to this how can I get these answers?
> >>> Should I mail the analytics mailing list to ask these questions? Cc a
> >>> point person on bugzilla with the question? Ping someone privately?
> >>>
> >>> Jon
> >>>
> >>>
> >>>
> >>> On Tue, Oct 14, 2014 at 10:30 AM, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>> >>Woah! Nice :D How are definitions updates handled?
> >>> > Since we talked about this on IRC, restating here to keep the archives
> >>> > happy.
> >>> > We pull the ua parser jar from our archiva depot, an update will involve
> >>> > building a new jar, uploading it to archiva and updating our dependency
> >>> > file (pom.xml) to point to the newly updated version.
> >>> >
> >>> >
> >>> >
> >>> > On Fri, Oct 10, 2014 at 9:59 PM, Oliver Keyes <oke...@wikimedia.org>
> >>> > wrote:
> >>> >>
> >>> >> Woah! Nice :D How are definitions updates handled?
> >>> >>
> >>> >> On 10 October 2014 18:58, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>> >>>
> >>> >>> >1. A UDF for ua-parser or whatever we decide to use (this will
> >>> >>> >possibly be necessary for pageviews, but not necessarily - it depends
> >>> >>> >on our spider/automaton detection strategy)
> >>> >>> We got this one ready today:
> >>> >>> https://gerrit.wikimedia.org/r/#/c/166142/
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>>
> >>> >>> On Fri, Oct 10, 2014 at 3:55 PM, Oliver Keyes <oke...@wikimedia.org>
> >>> >>> wrote:
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>> On 10 October 2014 16:02, Nuria Ruiz <nu...@wikimedia.org> wrote:
> >>> >>>>>
> >>> >>>>> >At some point I believe we hope to just, you know. Have a
> >>> >>>>> >regularly updated browser matrix somewhere.
> >>> >>>>> I REALLY think this should make it into our goals, if it cannot be
> >>> >>>>> done this quarter it should for sure be done this quarter.
> >>> >>>>>
> >>> >>>>
> >>> >>>> I agree it would be nice. It's one of those things that will either
> >>> >>>> come as a side-effect of other stuff, OR require substantially more
> >>> >>>> work, and nothing in-between. Things we need for it:
> >>> >>>>
> >>> >>>> 1. A UDF for ua-parser or whatever we decide to use (this will
> >>> >>>> possibly be necessary for pageviews, but not necessarily - it depends
> >>> >>>> on our spider/automaton detection strategy)
> >>> >>>> 2. Pageviews data
> >>> >>>> 3. A table somewhere.
> >>> >>>>
> >>> >>>> Take 1, apply to 2, stick in 3. Maybe grab the same data for
> >>> >>>> text/html requests overall (depends on query runtime), maybe don't.
> >>> >>>>
> >>> >>>> The ideal implementation, obviously, is to pair this up with a site
> >>> >>>> that automatically parses the results into HTML. That should be the
> >>> >>>> end goal, but in terms of engineering support we can get most of the
> >>> >>>> way there simply by ensuring we always have a recent snapshot to
> >>> >>>> hand. I can probably put something together over the sampled logs
> >>> >>>> and throw it in SQL if there are urgent needs.
> >>> >>>>
> >>> >>>>>
> >>> >>>>> Do we not have more recent data than May?
> >>> >>>>
> >>> >>>>
> >>> >>>> We don't, but thanks to the utilities library I built, the code for
> >>> >>>> generating it would literally run:
> >>> >>>>
> >>> >>>> library(WMUtils)
> >>> >>>> uas <- as.data.table(ua_parse(data_sieve(do.call("rbind", lapply(seq(20140901, 20140930, 1), sampled_logs)))$user_agent))
> >>> >>>>
> >>> >>>> uas <- uas[, j = list(requests = .N), by = c("os", "browser")]
> >>> >>>>
> >>> >>>> write.table(uas, file = "uas_for_jon.tsv", sep = "\t", row.names = FALSE, quote = TRUE)
> >>> >>>>
> >>> >>>> ...assuming we didn't care about readability.
> >>> >>>>
> >>> >>>> Point is, in the time until we have the new parser built into Hadoop
> >>> >>>> and that setup, we can totally generate interim data from the
> >>> >>>> sampled logs using the same parser at a tiny cost in
> >>> >>>> research/programming time, iff (the mathematical if) we need it
> >>> >>>> enough that we're cool with the sampling, and people can convince
> >>> >>>> [[Dario|Our Great Leader]] to authorise me to spend 15 minutes of my
> >>> >>>> time on it.
> >>> >>>>
> >>> >>>>>
> >>> >>>>>
> >>> >>>>> On Fri, Oct 10, 2014 at 12:45 PM, Oliver Keyes
> >>> >>>>> <oke...@wikimedia.org> wrote:
> >>> >>>>>>
> >>> >>>>>> Email Dario and I, if he prioritises it I'll run a check on more
> >>> >>>>>> recent data.
> >>> >>>>>>
> >>> >>>>>> At some point I believe we hope to just, you know. Have a
> >>> >>>>>> regularly updated browser matrix somewhere. This comes some time
> >>> >>>>>> after pageviews though.
> >>> >>>>>>
> >>> >>>>>> On 10 October 2014 14:38, Toby Negrin <tneg...@wikimedia.org>
> >>> >>>>>> wrote:
> >>> >>>>>>>
> >>> >>>>>>> Hi Jon -- I'm sure other folks will have more information but
> >>> >>>>>>> here's a link to a slide with some data from May[1]. We don't see
> >>> >>>>>>> a lot of Windows phone traffic.
> >>> >>>>>>>
> >>> >>>>>>> -Toby
> >>> >>>>>>>
> >>> >>>>>>> [1]
> >>> >>>>>>> https://docs.google.com/a/wikimedia.org/presentation/d/19tZgTi6VUG04wfGWVzcaZKY26oQiXhPaHI9g2tBmMKE/edit#slide=id.g382406373_08
> >>> >>>>>>>
> >>> >>>>>>> On Fri, Oct 10, 2014 at 11:17 AM, Jon Robson
> >>> >>>>>>> <jrob...@wikimedia.org> wrote:
> >>> >>>>>>>>
> >>> >>>>>>>> I was going through our backlog again today, and I noticed a bug
> >>> >>>>>>>> about supporting editing on Windows Phones with IE9 [1]
> >>> >>>>>>>>
> >>> >>>>>>>> Yet again, I wondered 'how many of our users are using IE9' as I
> >>> >>>>>>>> wondered if because of this lack of support we are losing out on
> >>> >>>>>>>> lots of potential editors.
> >>> >>>>>>>>
> >>> >>>>>>>> What's the easiest way to get this information now? Is it
> >>> >>>>>>>> available?
> >>> >>>>>>>>
> >>> >>>>>>>> [1] https://bugzilla.wikimedia.org/show_bug.cgi?id=55599
> >>> >>>>>>>>
> >>> >>>>>>>> _______________________________________________
> >>> >>>>>>>> Analytics mailing list
> >>> >>>>>>>> Analytics@lists.wikimedia.org
> >>> >>>>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
> >>> >>>>>>>
> >>> >>>>>>>
> >>> >>>>>>
> >>> >>>>>>
> >>> >>>>>>
> >>> >>>>>> --
> >>> >>>>>> Oliver Keyes
> >>> >>>>>> Research Analyst
> >>> >>>>>> Wikimedia Foundation
> >>> >>>>>>
> >>> >>>>>> _______________________________________________
> >>> >>>>>> Analytics mailing list
> >>> >>>>>> Analytics@lists.wikimedia.org
> >>> >>>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
> >>> >>>>>>
> >>> >>>>>
> >>> >>>>
> >>> >>>>
> >>> >>>>
> >>> >>>> --
> >>> >>>> Oliver Keyes
> >>> >>>> Research Analyst
> >>> >>>> Wikimedia Foundation
> >>> >>>
> >>> >>>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Oliver Keyes
> >>> >> Research Analyst
> >>> >> Wikimedia Foundation
> >>> >
> >>> >
> >>
> >>
> >
>
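
The tabulation discussed in this thread (parse each user agent, count requests per OS/browser combination, report each combination as a percentage of all requests, leave out anything under 0.05%) can be sketched roughly as below. This is only an illustration in Python, not the Hive query or the R code that was actually run. The function name browser_share, its min_percent parameter and the sample counts are made up for the example, and the input is assumed to be (os, browser) pairs that a user-agent parser has already produced.

```python
from collections import Counter


def browser_share(pairs, min_percent=0.05):
    """Return (os, browser, percent) rows, largest share first.

    pairs: iterable of (os, browser) tuples, one per request. They are
    assumed to come from a user-agent parser run beforehand.
    min_percent: hypothetical cutoff; combinations below it are dropped.
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    if total == 0:
        return []
    rows = [
        (os_name, browser, 100.0 * n / total)
        for (os_name, browser), n in counts.items()
    ]
    # Percentages are computed against ALL requests before filtering,
    # so the rows that survive sum to a bit less than 100.
    rows = [row for row in rows if row[2] >= min_percent]
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Invented sample: 10,000 requests, not real traffic figures.
sample = (
    [("iOS", "Mobile Safari")] * 6000
    + [("Android", "Chrome Mobile")] * 3996
    + [("Android", "Android 2.2")] * 4
)

for os_name, browser, percent in browser_share(sample):
    print(f"{os_name} / {browser}: {percent:.2f}%")
```

In this made-up sample the third combination falls under the cutoff and is left out, so the shares that remain total slightly less than 100%. That is one reason a list built this way would only "roughly" add up to 100%.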