> It does still seem to me that the data to determine secondary api requests
> should already be present in the existing log line. If the value of the
> page param in an action=mobileview api request matches the page in the
> referrer (perhaps with normalization), it's a secondary request as per case
> 1 below.  Otherwise, it's a pageview as per case 2.  Difficult or expensive
> to reconcile?  Not when you're doing distributed log analysis via hadoop.
>
I did look into this before writing the RFC. The issue is that a lot of
API referrers don't contain the query string. I don't know what triggers
this, but if we can fix it then we can definitely derive the secondary
pageview request from the referrer field.
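
To make the referrer idea concrete, here is a rough sketch of the check,
not what our log pipeline actually does. The function names are made up,
the "/wiki/" and "title=" referrer shapes are assumptions about what the
referrer field contains, and the normalization is deliberately minimal
(underscores vs. spaces only). A referrer with no recoverable title is
reported as "unknown" instead of being guessed at:

```python
from urllib.parse import parse_qs, unquote, urlsplit


def normalize_title(title):
    # Minimal normalization (an assumption): underscores and spaces are equivalent.
    return title.replace("_", " ").strip()


def page_from_referrer(referrer):
    """Return the article title found in the referrer, or None if there is none."""
    if not referrer or referrer == "-":
        return None
    parts = urlsplit(referrer)
    if parts.path.startswith("/wiki/"):
        return normalize_title(unquote(parts.path[len("/wiki/"):]))
    # Fall back to index.php?title=...; absent when the query string was dropped.
    titles = parse_qs(parts.query).get("title")
    return normalize_title(titles[0]) if titles else None


def classify(request_url, referrer):
    """Label a request: 'secondary', 'pageview', 'unknown' or 'not_mobileview'."""
    params = parse_qs(urlsplit(request_url).query)
    if params.get("action") != ["mobileview"] or "page" not in params:
        return "not_mobileview"
    ref_page = page_from_referrer(referrer)
    if ref_page is None:
        return "unknown"  # referrer lacks the information we need
    page = normalize_title(params["page"][0])
    return "secondary" if page == ref_page else "pageview"
```

Anything that comes back as "unknown" is exactly the set of requests
affected by the missing query string problem described above.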
D



> On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards <aricha...@wikimedia.org> wrote:
>
> > Thanks, Jon. To try and clarify a bit more about the API requests... they
> > are not made on a per-section basis. As I mentioned earlier, there are two
> > cases in which article content gets loaded by the API:
> >
> > 1) Going directly to a page (eg clicking a link from a Google search) will
> > result in the backend serving a page with ONLY summary section content and
> > section headers. The rest of the page is lazily loaded via API request once
> > the JS for the page gets loaded. The idea is to increase responsiveness by
> > reducing the delay for an article to load (further details in the article
> > Jon previously linked to). The API request looks like:
> >
> >
> > http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> >
> > 2) Loading an article entirely via Javascript - like when a link is clicked
> > in an article to another article, or an article is loaded via search. This
> > will make ONE call to the API to load article content. API request looks
> > like:
> >
> >
> > http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> >
> > These API requests are identical, but only #2 should be counted as a
> > 'pageview' - #1 is a secondary API request and should not be counted as a
> > 'pageview'. You could make the argument that we just count all of these API
> > requests as pageviews, but there are cases when we can't load article
> > content from the API (like devices that do not support JS), so we need to
> > be able to count the traditional page request as a pageview - thus we need
> > a way to differentiate the types of API requests being made when they
> > otherwise share the same URL.
> >
> >
> >
> > On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson <jdlrob...@gmail.com> wrote:
> >
> > > I'm a bit worried that now we are asking why pages are lazy loaded
> > > rather than focusing on the fact that they currently __are doing
> > > this___ and how we can log these (if we want to discuss this further
> > > let's start another thread as I'm getting extremely confused doing so
> > > on this one).
> > >
> > > Lazy loading sections
> > > ################
> > > The motivation behind moving MobileFrontend in the direction of lazy
> > > loading section content and subsequent pages can be found here [1]. I
> > > just gave it a refresh as it was a little out of date.
> > >
> > > In summary the reasons are to
> > > 1) make the app feel more responsive by simply loading content rather
> > > than reloading the entire interface
> > > 2) reduce the payload sent to a device.
> > >
> > > Session Tracking
> > > ################
> > >
> > > Going back to the discussion of tracking mobile page views, it sounds
> > > like a header stating whether a page is being viewed in alpha, beta or
> > > stable works fine for standard page views.
> > >
> > > As for the situations where an entire page is loaded via the api it
> > > makes no difference to us whether we
> > > 1) send the same header (set via javascript) or
> > > 2) add a query string parameter.
> > >
> > > The only advantage I can see of using a header is that an initial page
> > > load of the article San Francisco currently uses the same api url as a
> > > page load of the article San Francisco via javascript (e.g. I click a
> > > link to 'San Francisco' on the California article).
> > >
> > > In this new method they would use different urls (as the data sent is
> > > different). I'm not sure how that would affect caching.
> > >
> > > Let us know which method is preferred. From my perspective
> > > implementation of either is easy.
> > >
> > > [1] http://www.mediawiki.org/wiki/MobileFrontend/Dynamic_Sections
> > >
> > > On Mon, Feb 11, 2013 at 12:50 PM, Asher Feldman <afeld...@wikimedia.org> wrote:
> > > > Max - good answers re: caching concerns.  That leaves studying if the bytes
> > > > transferred on average mobile article view increases or decreases with lazy
> > > > section loading.  If it increases, I'd say this isn't a positive direction
> > > > to go in and stop there.  If it decreases, then we should look at the
> > > > effect on total latency, number of requests required per pageview, and the
> > > > impact on backend apache utilization which I'd expect to be > 0.
> > > >
> > > > Does the mobile team have specific goals that this project aims to
> > > > accomplish?  If so, we can use those as the measure against which to
> > > > compare an impact analysis.
> > > >
> > > > On Mon, Feb 11, 2013 at 12:21 PM, Max Semenik <maxsem.w...@gmail.com> wrote:
> > > >
> > > >> On 11.02.2013, 22:11 Asher wrote:
> > > >>
> > > >> > And then I'd wonder about the server side implementation. How will frontend
> > > >> > cache invalidation work? Are we going to need to purge every individual
> > > >> > article section relative to /w/api.php on edit?
> > > >>
> > > >> Since the API doesn't require pretty URLs, we could simply append the
> > > >> current revision ID to the mobileview URLs.
> > > >>
> > > >> > Article HTML in memcached
> > > >> > (parser cache), mobile processed HTML in memcached.. Now individual
> > > >> > sections in memcached? If so, should we calculate memcached space needs for
> > > >> > article text as 3x the current parser cache utilization? More memcached
> > > >> > usage is great, not asking to dissuade its use but because it's better to
> > > >> > capacity plan than to react.
> > > >>
> > > >> action=mobileview caches pages only in full and serves
> > > >> only sections requested, so no changes in request patterns will result
> > > >> in increased memcached usage.
> > > >>
> > > >> --
> > > >> Best regards,
> > > >>   Max Semenik ([[User:MaxSem]])
> > > >>
> > > >>
> > > >> _______________________________________________
> > > >> Wikitech-l mailing list
> > > >> Wikitech-l@lists.wikimedia.org
> > > >> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > > >>
> > >
> > >
> > >
> > > --
> > > Jon Robson
> > > http://jonrobson.me.uk
> > > @rakugojon
> > >
> > >
> >
> >
> >
> > --
> > Arthur Richards
> > Software Engineer, Mobile
> > [[User:Awjrichards]]
> > IRC: awjr
> > +1-415-839-6885 x6687
> >
>
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l