Forwarding to the analytics list for reference.

---------- Forwarded message ---------
From: Ho Chung <chungho4...@gmail.com>
Date: Mon, Mar 15, 2021 at 11:45 AM
Subject: Re: [Analytics] About: refine_webrequest.hql
To: Joseph Allemandou <jalleman...@wikimedia.org>


Hello,


Thanks for your reply.

I have been researching your Analytics team's public discussion history and
the Wikitech pages about the webrequest timestamp:


https://wikitech.wikimedia.org/wiki/Analytics/Data_Lake/Traffic/Webrequest

https://phabricator.wikimedia.org/T212529

I had doubts at the time: you use Java technology, but your Hive version
did not support Java before October 2018.

wmf.webrequest is stored in Hive.

My question was whether, when readers' private data is collected, the
timestamp comes from the reader's computer system clock rather than the
Wikipedia server clock at the time the page is read and browsed.

Now it is clearer to me. On your Analytics team's public discussion page,
Ottomata said that all times are in UTC.

It is just that you technicians do not want to unify how the timestamp
format is written, but in fact all of them use UTC.
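
To make the point concrete, here is a minimal sketch of how a `dt` string
with no timezone marker can still be read as UTC. It assumes the value has
the shape implied by the "yyyy-MM-dd'T'HH:mm:ss" pattern in
refine_webrequest.hql; the helper name `parse_dt_as_utc` and the sample
value are made up for illustration and are not taken from the refinery code
or from real webrequest data.

```python
from datetime import datetime, timezone

# Python equivalent of the Java-style pattern "yyyy-MM-dd'T'HH:mm:ss"
# that appears in the refine_webrequest.hql snippet quoted in this thread.
DT_FORMAT = "%Y-%m-%dT%H:%M:%S"


def parse_dt_as_utc(dt: str) -> datetime:
    """Hypothetical helper: parse a zone-less dt string and label it as UTC."""
    # strptime returns a naive datetime because the string carries no zone.
    naive = datetime.strptime(dt, DT_FORMAT)
    # Attach UTC explicitly instead of letting the local system clock decide.
    return naive.replace(tzinfo=timezone.utc)


# Made-up sample value, used only to show the behaviour.
sample = "2021-03-15T08:36:00"
parsed = parse_dt_as_utc(sample)
print(parsed.isoformat())  # 2021-03-15T08:36:00+00:00
```

The string itself has no Z or +/- offset, so the UTC meaning comes only
from the convention stated on the list, not from the format.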

On Mon, Mar 15, 2021 at 16:14, Joseph Allemandou <jalleman...@wikimedia.org> wrote:

> Hi,
> the `dt` field is the time in UTC (no timezone specified) at which the
> request ends being processed by Varnish.
> Cheers
> Joseph
>
> On Mon, Mar 15, 2021 at 8:36 AM Luca Toscano <ltosc...@wikimedia.org>
> wrote:
>
>> +A mailing list for the Analytics Team at WMF and everybody who has an
>> interest in Wikipedia and analytics. <analytics@lists.wikimedia.org>
>>
>> Hi!
>>
>> I added the Analytics mailing list in Cc so other people can chime in,
>> this is the canonical way to follow up with us and the community, please
>> avoid direct email if possible :)
>>
>> Thanks!
>>
>> Luca
>>
>>
>>
>> On Sat, Mar 13, 2021 at 10:57 PM Ho Chung <chungho4...@gmail.com> wrote:
>>
>>> Hello
>>>
>>> I have some questions about refine_webrequest.hql.
>>>
>>>
>>> Does the timestamp in this file use UTC?
>>>
>>> Does this file connect wmf_raw.webrequest and wmf.webrequest?
>>>
>>> I ask because I can't see the code adding a Z or a +/- time zone offset.
>>>
>>>
>>>
>>>  -- Hack to get a correct timestamp because of hive inconsistent conversion
>>>
>>>  CAST(unix_timestamp(dt, "yyyy-MM-dd'T'HH:mm:ss") * 1.0 as timestamp) as ts,
>>>
>>>
>>> https://github.com/wikimedia/analytics-refinery/blob/master/oozie/webrequest/load/refine_webrequest.hql
>>>
>>> I emailed wiki legal with this request 3 months ago, and they are not
>>> sure. Can you give me a clear answer?
>>>
>>> If it does not use UTC, does it use your server clock or my computer clock?
>>>
>>>
>>> _______________________________________________
>> Analytics mailing list
>> Analytics@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>
>
> --
> Joseph Allemandou (joal) (he / him)
> Staff Data Engineer
> Wikimedia Foundation
>


-- 
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation
