> It does still seem to me that the data to determine secondary api requests
> should already be present in the existing log line. If the value of the
> page param in an action=mobileview api request matches the page in the
> referrer (perhaps with normalization), it's a secondary request as per case
> 1 below. Otherwise, it's a pageview as per case 2. Difficult or expensive
> to reconcile? Not when you're doing distributed log analysis via hadoop.
>

So I did look into this prior to writing the RFC, and the issue is that a lot of API referrers don't contain the querystring. I don't know what triggers this, so if we can fix it, then we can definitely derive the secondary pageview request from the referrer field.

D
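For what it's worth, the matching rule quoted above could be sketched roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, the referrer is assumed to look like either /wiki/Title or index.php?title=Title, and the normalization only treats underscores and spaces as equivalent (real title normalization is more involved). A referrer with no usable title, such as one whose querystring is missing, is reported as "unknown" rather than guessed.

```python
from urllib.parse import urlsplit, parse_qs, unquote


def normalize_title(title):
    # Simplified, assumed normalization: underscores and spaces are equivalent.
    return title.replace("_", " ").strip()


def title_from_referrer(referrer):
    """Return the page title named by the referrer, or None if it cannot be recovered."""
    parts = urlsplit(referrer)
    if parts.path.startswith("/wiki/"):
        title = normalize_title(unquote(parts.path[len("/wiki/"):]))
        return title or None
    # Fall back to ?title=...; this fails when the querystring was dropped.
    titles = parse_qs(parts.query).get("title")
    if titles:
        return normalize_title(titles[0]) or None
    return None


def classify_mobileview(request_url, referrer):
    """Classify an API request as 'secondary', 'pageview', 'unknown' or 'not_mobileview'."""
    query = parse_qs(urlsplit(request_url).query)
    if query.get("action") != ["mobileview"] or "page" not in query:
        return "not_mobileview"
    ref_title = title_from_referrer(referrer) if referrer else None
    if ref_title is None:
        # No title in the referrer: the two cases cannot be told apart.
        return "unknown"
    page = normalize_title(query["page"][0])  # parse_qs already decodes '+' and %xx
    # Same page as the referrer: lazy load of remaining sections (case 1).
    # Different page: article loaded entirely via JavaScript (case 2).
    return "secondary" if page == ref_title else "pageview"
```

The share of requests that come back as "unknown" would give a rough measure of how often the referrer is missing the information we need.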
> On Mon, Feb 11, 2013 at 7:11 PM, Arthur Richards <aricha...@wikimedia.org> wrote:
>
> > Thanks, Jon. To try and clarify a bit more about the API requests... they
> > are not made on a per-section basis. As I mentioned earlier, there are two
> > cases in which article content gets loaded by the API:
> >
> > 1) Going directly to a page (eg clicking a link from a Google search) will
> > result in the backend serving a page with ONLY summary section content and
> > section headers. The rest of the page is lazily loaded via API request once
> > the JS for the page gets loaded. The idea is to increase responsiveness by
> > reducing the delay for an article to load (further details in the article
> > Jon previously linked to). The API request looks like:
> >
> > http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> >
> > 2) Loading an article entirely via Javascript - like when a link is clicked
> > in an article to another article, or an article is loaded via search. This
> > will make ONE call to the API to load article content. API request looks
> > like:
> >
> > http://en.m.wikipedia.org/w/api.php?format=json&action=mobileview&page=Liverpool+F.C.+in+European+football&variant=en&redirects=yes&prop=sections%7Ctext&noheadings=yes&sectionprop=level%7Cline%7Canchor&sections=all
> >
> > These API requests are identical, but only #2 should be counted as a
> > 'pageview' - #1 is a secondary API request and should not be counted as a
> > 'pageview'.
> > You could make the argument that we just count all of these API
> > requests as pageviews, but there are cases when we can't load article
> > content from the API (like devices that do not support JS), so we need to
> > be able to count the traditional page request as a pageview - thus we need
> > a way to differentiate the types of API requests being made when they
> > otherwise share the same URL.
> >
> > On Mon, Feb 11, 2013 at 6:42 PM, Jon Robson <jdlrob...@gmail.com> wrote:
> >
> > > I'm a bit worried that now we are asking why pages are lazy loaded
> > > rather than focusing on the fact that they currently __are doing
> > > this__ and how we can log these (if we want to discuss this further
> > > let's start another thread as I'm getting extremely confused doing so
> > > on this one).
> > >
> > > Lazy loading sections
> > > ################
> > > The motivation behind moving MobileFrontend into the direction of lazy
> > > loading section content and subsequent pages can be found here [1]; I
> > > just gave it a refresh as it was a little out of date.
> > >
> > > In summary the reason is to
> > > 1) make the app feel more responsive by simply loading content rather
> > > than reloading the entire interface
> > > 2) reduce the payload sent to a device.
> > >
> > > Session Tracking
> > > ################
> > >
> > > Going back to the discussion of tracking mobile page views, it sounds
> > > like a header stating whether a page is being viewed in alpha, beta or
> > > stable works fine for standard page views.
> > >
> > > As for the situations where an entire page is loaded via the api it
> > > makes no difference to us whether we
> > > 1) send the same header (set via javascript) or
> > > 2) add a query string parameter.
> > >
> > > The only advantage I can see of using a header is that an initial page
> > > load of the article San Francisco currently uses the same api url as a
> > > page load of the article San Francisco via javascript (e.g.
> > > I click a link to 'San Francisco' on the California article).
> > >
> > > In this new method they would use different urls (as the data sent is
> > > different). I'm not sure how that would affect caching.
> > >
> > > Let us know which method is preferred. From my perspective
> > > implementation of either is easy.
> > >
> > > [1] http://www.mediawiki.org/wiki/MobileFrontend/Dynamic_Sections
> > >
> > > On Mon, Feb 11, 2013 at 12:50 PM, Asher Feldman <afeld...@wikimedia.org>
> > > wrote:
> > > > Max - good answers re: caching concerns. That leaves studying if the bytes
> > > > transferred on average mobile article view increases or decreases with lazy
> > > > section loading. If it increases, I'd say this isn't a positive direction
> > > > to go in and stop there. If it decreases, then we should look at the
> > > > effect on total latency, number of requests required per pageview, and the
> > > > impact on backend apache utilization which I'd expect to be > 0.
> > > >
> > > > Does the mobile team have specific goals that this project aims to
> > > > accomplish? If so, we can use those as the measure against which to
> > > > compare an impact analysis.
> > > >
> > > > On Mon, Feb 11, 2013 at 12:21 PM, Max Semenik <maxsem.w...@gmail.com>
> > > > wrote:
> > > >
> > > >> On 11.02.2013, 22:11 Asher wrote:
> > > >>
> > > >> > And then I'd wonder about the server side implementation. How will frontend
> > > >> > cache invalidation work? Are we going to need to purge every individual
> > > >> > article section relative to /w/api.php on edit?
> > > >>
> > > >> Since the API doesn't require pretty URLs, we could simply append the
> > > >> current revision ID to the mobileview URLs.
> > > >>
> > > >> > Article HTML in memcached
> > > >> > (parser cache), mobile processed HTML in memcached.. Now individual
> > > >> > sections in memcached?
> > > >> > If so, should we calculate memcached space needs for
> > > >> > article text as 3x the current parser cache utilization? More memcached
> > > >> > usage is great, not asking to dissuade its use but because it's better to
> > > >> > capacity plan than to react.
> > > >>
> > > >> action=mobileview caches pages only in full and serves
> > > >> only sections requested, so no changes in request patterns will result
> > > >> in increased memcached usage.
> > > >>
> > > >> --
> > > >> Best regards,
> > > >> Max Semenik ([[User:MaxSem]])
> > > >>
> > > >> _______________________________________________
> > > >> Wikitech-l mailing list
> > > >> Wikitech-l@lists.wikimedia.org
> > > >> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > >
> > > --
> > > Jon Robson
> > > http://jonrobson.me.uk
> > > @rakugojon
> >
> > --
> > Arthur Richards
> > Software Engineer, Mobile
> > [[User:Awjrichards]]
> > IRC: awjr
> > +1-415-839-6885 x6687