> On Jan 7, 2015, at 6:42 AM, Gilles Dubuc <gil...@wikimedia.org> wrote:
> 
>> Right -- couldn't we just tag the URL?
> 
> The event of the user actually viewing the image is completely disconnected 
> from the URL hit in Media Viewer, which is why we need EL and can't rely on 
> existing server logs.
>  
>> Eventlogging data currently does go to files, as well as to the DB.
> 
> Great, then I guess it's a matter of only making the data go to files and not 
> to DB for the particular schema we'll create. Does that sound like something 
> feasible? How much work would be required to set it up?

This is a feature that other teams have requested in the past, and I agree it
would be very helpful. In an ideal world, we would be able to specify the log
configuration (where to write the data, pruning requirements, schema ownership)
directly via a JSON object associated with the main schema.
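
To make the idea concrete, here is a minimal sketch of what such a per-schema
log configuration could look like and how it might be validated. Everything in
it is an assumption for illustration: the field names (outputs, retention_days,
owner), the schema name MediaViewerImageView and the parse_log_config helper
are hypothetical and are not part of EventLogging.

```python
import json
from dataclasses import dataclass

# Hypothetical per-schema log configuration. These field names and the
# schema name are invented for illustration; EventLogging does not define them.
EXAMPLE_CONFIG = """
{
  "schema": "MediaViewerImageView",
  "owner": "multimedia",
  "outputs": ["file"],
  "retention_days": 90
}
"""

KNOWN_OUTPUTS = {"file", "db"}


@dataclass(frozen=True)
class LogConfig:
    schema: str
    owner: str
    outputs: frozenset
    retention_days: int


def parse_log_config(text: str) -> LogConfig:
    """Parse and validate a JSON log configuration for one schema."""
    raw = json.loads(text)
    outputs = frozenset(raw["outputs"])
    unknown = outputs - KNOWN_OUTPUTS
    if unknown:
        raise ValueError(f"unknown outputs: {sorted(unknown)}")
    if not outputs:
        raise ValueError("at least one output is required")
    retention = int(raw["retention_days"])
    if retention <= 0:
        raise ValueError("retention_days must be positive")
    return LogConfig(raw["schema"], raw["owner"], outputs, retention)


config = parse_log_config(EXAMPLE_CONFIG)
print(config.schema, sorted(config.outputs), config.retention_days)
```

With a configuration like this, a schema that lists only "file" under outputs
would never be written to the database, and the retention value could drive
pruning of old log files.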

Dario

> On Tue, Jan 6, 2015 at 7:45 PM, Andrew Otto <ao...@wikimedia.org> wrote:
> Eventlogging data currently does go to files, as well as to the DB.  Check it 
> out on stat1003 at /srv/eventlogging/archive.
> 
> If you need something with higher throughput than eventlogging itself 
> supports…then let’s talk :D
> 
> -Ao
> 
> 
> 
> 
>> On Jan 6, 2015, at 13:28, Erik Zachte <ezac...@wikimedia.org> wrote:
>> 
>> You mean attach an X-analytics parameter, for extra images beyond the one 
>> the user initially requested?
>>  
>> But then we would undercount, basically missing all image views from 
>> clicking the right arrow in the image viewer.
>> I'm not sure how much we would miss then.
>> IIRC Gilles said this browsing feature was used quite a lot, but I'm not 
>> sure.
>>  
>> From: analytics-boun...@lists.wikimedia.org 
>> [mailto:analytics-boun...@lists.wikimedia.org] On Behalf Of Toby Negrin
>> Sent: Tuesday, January 06, 2015 19:16
>> To: A mailing list for the Analytics Team at WMF and everybody who has an 
>> interest in Wikipedia and analytics.
>> Subject: Re: [Analytics] Making EventLogging output to a log file instead of 
>> the DB
>> 
>>  
>> 
>> Right -- couldn't we just tag the URL?
>> 
>>  
>> 
>> On Tue, Jan 6, 2015 at 10:10 AM, Erik Zachte <ezac...@wikimedia.org> wrote:
>> 
>> Just to clarify, this is about prefetched images which have not been shown 
>> to the public.
>> 
>> They were sent to the browser ahead of a possible request to speed things up 
>> but in many cases never actually requested.
>> 
>> https://www.mediawiki.org/wiki/Requests_for_comment/Media_file_request_counts#Prefetched_images
>> 
>> - Erik
>> 
>>  
>> 
>> From: analytics-boun...@lists.wikimedia.org 
>> [mailto:analytics-boun...@lists.wikimedia.org] On Behalf Of Toby Negrin
>> Sent: Tuesday, January 06, 2015 18:49
>> To: A mailing list for the Analytics Team at WMF and everybody who has an 
>> interest in Wikipedia and analytics.
>> Subject: Re: [Analytics] Making EventLogging output to a log file instead of 
>> the DB
>> 
>>  
>> 
>> Hi Gilles -- why won't the page view logs work by themselves for this 
>> purpose? EL can be configured to write into Hadoop, which is probably the 
>> best way to get the throughput you need, but it seems overcomplicated.
>> 
>>  
>> 
>> -Toby
>> 
>>  
>> 
>> On Tue, Jan 6, 2015 at 9:41 AM, Gilles Dubuc <gil...@wikimedia.org> wrote:
>> 
>> This depends on [1], so we're not going to need that immediately, but in 
>> order to help Erik Zachte with his RfC [2] to track unique media views in 
>> Media Viewer, I'm going to need to use something almost exactly like 
>> EventLogging. The main difference is that it should skip writing to the 
>> database and write to a log file instead.
>> 
>> That's because we'll be recording around 20-25M image views per day, which 
>> would needlessly overload EventLogging for little purpose since the data 
>> will be used for offline stats generation and doesn't need to be made 
>> available in a relational database. Of course if storage space and 
>> EventLogging capacity were no object, we could just use EL and keep the 
>> ever-growing table forever, but I have the impression that we want to be 
>> reasonable here and only write to a log, since that's what Erik needs.
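
For a sense of scale, the daily figure above can be converted into an average
event rate. This is only a back-of-the-envelope sketch: it assumes events are
spread evenly over the day, which real traffic is not, so peak rates would be
higher. The function name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_events_per_second(events_per_day: int) -> float:
    """Average rate, assuming events are spread evenly across the day."""
    return events_per_day / SECONDS_PER_DAY


# The two ends of the 20-25M image views per day estimate.
for daily in (20_000_000, 25_000_000):
    rate = average_events_per_second(daily)
    print(f"{daily:,} events/day is about {rate:.0f} events/s on average")
```

That works out to roughly 230 to 290 events per second on average.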
>> 
>> So here's the question: for a specific schema, can EventLogging work the way 
>> it does but only record hits to a log file (maybe it already does that 
>> before hitting the DB?) and not write to the DB? If not, how difficult would 
>> it be to make EL capable of doing that?
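
The behaviour being asked for can be sketched as follows. This is not how
EventLogging is implemented; it is a minimal illustration under stated
assumptions. The FILE_ONLY_SCHEMAS set, the route_event function and the schema
names are hypothetical, an in-memory buffer stands in for the log file, and a
plain list stands in for the database.

```python
import io
import json

# Hypothetical set of schemas whose events should go to the log file only.
FILE_ONLY_SCHEMAS = {"MediaViewerImageView"}


def route_event(event: dict, log_file, db_rows: list) -> None:
    """Append every event to the log; skip the DB for file-only schemas."""
    log_file.write(json.dumps(event, sort_keys=True) + "\n")
    if event["schema"] not in FILE_ONLY_SCHEMAS:
        db_rows.append(event)  # stand-in for a real database insert


log_file = io.StringIO()  # stand-in for an append-only log file
db_rows = []

route_event({"schema": "MediaViewerImageView", "event": {"file": "Example.jpg"}},
            log_file, db_rows)
route_event({"schema": "OtherSchema", "event": {"action": "click"}},
            log_file, db_rows)

print(len(log_file.getvalue().splitlines()), "lines logged,",
      len(db_rows), "rows stored")
```

Both events end up in the log, while only the event from the schema that is
not marked file-only reaches the database stand-in.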
>> 
>> 
>> [1] https://phabricator.wikimedia.org/T44815
>> [2] 
>> https://www.mediawiki.org/wiki/Requests_for_comment/Media_file_request_counts
>> 
>> _______________________________________________
>> Analytics mailing list
>> Analytics@lists.wikimedia.org <mailto:Analytics@lists.wikimedia.org>
>> https://lists.wikimedia.org/mailman/listinfo/analytics 
>> <https://lists.wikimedia.org/mailman/listinfo/analytics>

