>I see it is quite complicated to work with this data.

Rather, the data you are asking for does not exist.
>See also https://phabricator.wikimedia.org/T119352, which is proposing to track time on site / page in general.

Right, this is a proposal on how to calculate "time on site" that requires (to Marcel's point) instrumenting MediaWiki. The "time on site" metric could be reported for browsers that support sendBeacon and the Page Visibility API.

On Fri, Jul 1, 2016 at 8:33 AM, Gabriel Wicke <gwi...@wikimedia.org> wrote:

> See also https://phabricator.wikimedia.org/T119352, which is proposing to track time on site / page in general.
>
> On Jul 1, 2016 4:24 PM, "Marcel Ruiz Forns" <mfo...@wikimedia.org> wrote:
>
> If we were doing this internally, a possibility would be to instrument MediaWiki and send sampled events with the time on page to EventLogging. This would not be retroactive though; we would have to wait a couple of months to collect significant data. In any case, I'm not sure if this would be possible with an NDA?
>
> On Fri, Jul 1, 2016 at 11:52 AM, Marc Miquel <marcmiq...@gmail.com> wrote:
>
>> I see it is quite complicated to work with this data. It is a pity considering that valuable insights could be derived from readers' behaviors. I will think about what can be useful for the study.
>>
>> Thanks for the answers, Nuria and Marcel! :)
>> Cheers,
>>
>> Marc
>>
>> On Thu, 30 June 2016 at 14:16, Marcel Ruiz Forns (<mfo...@wikimedia.org>) wrote:
>>
>>> Marc, I also see what Nuria says. Also please consider that the majority of Wikipedia sessions have only one pageview. So in the majority of sessions it would not be possible to approximate the time spent on page with boundaries with Joseph's alternative.
>>>
>>> On Thu, Jun 30, 2016 at 2:02 PM, Nuria Ruiz <nu...@wikimedia.org> wrote:
>>>
>>>> >Aye, as Joseph says, the time-on-page or time-leaving is not collected, except as an extension of session reconstruction work. If you want a concrete time, you're not gonna get it.
>>>>
>>>> I was about to make the same point. The data set that will most closely answer your questions is the one Oliver mentioned; otherwise we do not keep any information related to time on site and page requests, so there is no "approximation" possible that will work on overall data. Even if you calculate signatures with IP-hash + user agent to approximate users (a method with known issues), there is no way for you to distinguish someone reading a page for an hour from someone who came to Wikipedia twice in the same hour and spent a minute each time. Hopefully my example makes things clearer.
>>>>
>>>> Thanks,
>>>>
>>>> Nuria
>>>>
>>>> On Wed, Jun 29, 2016 at 4:58 AM, Oliver Keyes <ironho...@gmail.com> wrote:
>>>>
>>>>> Aye, as Joseph says, the time-on-page or time-leaving is not collected, except as an extension of session reconstruction work. If you want a concrete time, you're not gonna get it.
>>>>>
>>>>> While PC-based data is more reliable than mobile, that does not necessarily mean "reliable". I'm sort of confused, I guess, as to why the datasets I linked (unless I'm misremembering them?) don't help: you would have to do the calculation yourself but they should contain all the data necessary to make that calculation (unless you want to have the pageID or title associated with the time-on-page, in which case...yeah, that's an issue).
>>>>>
>>>>> On Wed, Jun 29, 2016 at 3:16 AM, Marc Miquel <marcmiq...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for the answer, Oliver. But I am not sure it answers my questions. I'd like to study aspects like how much time is spent on certain pages, as a proxy of how content is approached/read/understood. I'd be happy with time of entering the page, time of leaving. This is not entirely centered on 'user activity', but I said that because I imagined data would be stored in a similar way to editor sessions, or in a database, and I would need to do the time calculations.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Marc
>>>>>>
>>>>>> On Wed, 29 June 2016 at 03:11, Oliver Keyes <ironho...@gmail.com> wrote:
>>>>>>
>>>>>>> If historic data is okay, there's already a dataset released (https://figshare.com/articles/Activity_Sessions_datasets/1291033) that was designed specifically to answer questions around how to best calculate session length with regards to Wikipedia (http://arxiv.org/abs/1411.2878)
>>>>>>>
>>>>>>> On Tue, Jun 28, 2016 at 3:42 PM, Marc Miquel <marcmiq...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> I was thinking about user sessions, yes, so this would mean aggregating pageviews visited by a user during a short amount of time (I should check the cutoff, but it could be around an hour or less).
>>>>>>>>
>>>>>>>> I am particularly interested in understanding the order in which pages are seen (start, end), duration, etc.
>>>>>>>> I wouldn't need data from a long period either, but I think data from multiple languages would be helpful.
>>>>>>>>
>>>>>>>> I imagined reader data could be sensitive to privacy, but would an NDA with my university and some sort of data encoding help with this? As I said, it is for a scientific purpose.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Marc
>>>>>>>>
>>>>>>>> On Tue, 28 June 2016 at 21:09, Nuria Ruiz (<nu...@wikimedia.org>) wrote:
>>>>>>>>
>>>>>>>>> Hello!
>>>>>>>>>
>>>>>>>>> >I am considering studying reader engagement for different article topics in different languages. Because of this, I would like to know if there is any plan to make available pageviews dumps detailing activity log at session level per user - in a similar way to editor sessions.
>>>>>>>>>
>>>>>>>>> Are you thinking of "all-pageviews-visited-by-a-certain-user"? If so, no, we do not have any projects to provide that data, as due to privacy concerns we neither have nor keep that information.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Nuria
>>>>>>>>>
>>>>>>>>> On Tue, Jun 28, 2016 at 6:55 PM, Leila Zia <le...@wikimedia.org> wrote:
>>>>>>>>>
>>>>>>>>>> + Analytics
>>>>>>>>>>
>>>>>>>>>> On Tue, Jun 28, 2016 at 6:36 AM, Marc Miquel <marcmiq...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I have a question for you regarding pageviews datadumps.
>>>>>>>>>>>
>>>>>>>>>>> I am considering studying reader engagement for different article topics in different languages. Because of this, I would like to know if there is any plan to make available pageviews dumps detailing activity log at session level per user - in a similar way to editor sessions.
>>>>>>>>>>>
>>>>>>>>>>> Since this would be for a research project I might ask funding for, I would like to know if I could count on that, what is the nature of the available data, what would be the procedure to obtain this data, and if there would be any implication because of privacy concerns.
>>>>>>>>>>>
>>>>>>>>>>> Thank you very much!
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>>
>>>>>>>>>>> Marc Miquel
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Wiki-research-l mailing list
>>>>>>>>>>> wiki-researc...@lists.wikimedia.org
>>>>>>>>>>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>>
>>> --
>>> *Marcel Ruiz Forns*
>>> Analytics Developer
>>> Wikimedia Foundation
>
> --
> *Marcel Ruiz Forns*
> Analytics Developer
> Wikimedia Foundation
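To make the limitation discussed in this thread concrete, here is a minimal Python sketch of the kind of approximation that was considered: group pageviews by a user signature, and treat the gap to the next pageview as an estimate of time on page. Everything in it is an assumption for illustration only: the signature strings, the page names, the timestamps, the function name `estimate_time_on_page`, and the one-hour cutoff (taken loosely from the "around an hour or less" mentioned above). It is not how any Wikimedia dataset is actually produced.

```python
from datetime import datetime, timedelta
from itertools import groupby
from operator import itemgetter

# Assumed inactivity cutoff; the thread only says "around an hour or less".
CUTOFF = timedelta(hours=1)


def estimate_time_on_page(pageviews, cutoff=CUTOFF):
    """Estimate time on page from (signature, page, timestamp) tuples.

    The estimate for a pageview is the gap in seconds to the next pageview
    with the same signature, if that gap is within the cutoff. Otherwise
    the estimate is None, because nothing marks when the reader left.
    """
    ordered = sorted(pageviews, key=lambda pv: (pv[0], pv[2]))
    estimates = []
    for signature, group in groupby(ordered, key=itemgetter(0)):
        views = list(group)
        # Pair each view with the following one; the last view has no successor.
        for current, following in zip(views, views[1:] + [None]):
            seconds = None
            if following is not None:
                gap = following[2] - current[2]
                if gap <= cutoff:
                    seconds = gap.total_seconds()
            estimates.append((signature, current[1], seconds))
    return estimates


# Hypothetical data: signature "a" views two pages 40 minutes apart,
# signature "b" has a single pageview.
start = datetime(2016, 6, 30, 12, 0)
views = [
    ("a", "Page_1", start),
    ("a", "Page_2", start + timedelta(minutes=40)),
    ("b", "Page_3", start),
]
estimates = estimate_time_on_page(views)
for row in estimates:
    print(row)
```

The single-pageview signature gets no estimate at all, which is Marcel's point about most sessions. The 40-minute gap for signature "a" illustrates Nuria's point: the data cannot tell whether that was 40 minutes of reading or two short visits 40 minutes apart.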
_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics