> It will create significant discrepancies with the existing geolocation
> data we record for pageviews
If you only need country (or whatever is in the cookie), then the output
dataset would likely include only country when selecting from pageviews.
If you need more than country (it sounded like you didn’t), then we can
get into doing the IP geocoding in EventLogging, but there are a few
technical complications here, and we’d prefer not to do this if we don’t
have to.
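
For reference, the client-side option discussed further down the thread
(copying the country out of the GeoIP cookie into the event before it is
sent) could look roughly like the sketch below. This is only an
illustration under assumptions: the function names and the event shape are
hypothetical, and the cookie layout (colon-separated, country code first)
is assumed rather than confirmed in this thread.

```typescript
// Hypothetical event shape; real schemas define their own fields.
interface PreviewEvent {
  [key: string]: unknown;
  country?: string;
}

// Extract the country code from a raw cookie string such as document.cookie.
// ASSUMPTION: the GeoIP cookie value is colon-separated with the country
// code first, e.g. "GeoIP=US:CA:San_Francisco:37.77:-122.42:v4".
function countryFromCookies(cookieString: string): string | null {
  const prefix = "GeoIP=";
  for (const part of cookieString.split(";")) {
    const trimmed = part.trim();
    if (trimmed.indexOf(prefix) !== 0) {
      continue;
    }
    const country = trimmed.slice(prefix.length).split(":")[0];
    // Accept only a two-letter uppercase code; otherwise treat as unknown.
    return /^[A-Z]{2}$/.test(country) ? country : null;
  }
  return null;
}

// Return a copy of the event with `country` set when the cookie provides one.
// The original event object is left unmodified.
function withCountry(event: PreviewEvent, cookieString: string): PreviewEvent {
  const country = countryFromCookies(cookieString);
  return country === null ? event : { ...event, country };
}
```

In a browser one would pass document.cookie as the second argument and
then hand the result to whatever emits the event. Note that a value taken
from the cookie is not guaranteed to match the country derived from the IP
on the pageview side, which is the discrepancy raised in the quoted
message below.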

On Wed, Feb 7, 2018 at 12:09 PM, Tilman Bayer <tba...@wikimedia.org> wrote:

> Thanks everyone! Separate from Sam's mapping out the frontend
> instrumentation work at https://phabricator.wikimedia.org/T184793 , I
> have created a task for the backend work at
> https://phabricator.wikimedia.org/T186728 based on this thread.
>
> Regarding the last few posts about the geolocation information, from the
> data analysis perspective, there is indeed another, more serious concern
> about using the GeoIP cookie: It will create significant discrepancies with
> the existing geolocation data we record for pageviews, where we have chosen
> to derive this information from the IP instead. (Remember the overarching
> goal here of measuring page previews the same way we measure page views
> currently; the basic principle is that if a reader visits a page and then
> uses the page preview feature on that page to read preview cards, all the
> metadata that is recorded for both should have identical values for both
> the preview and the pageview.) Therefore, we should go with the kind of
> solution Andrew outlined above (adapting/reusing GetGeoDataUDF or such).
>
> On Thu, Feb 1, 2018 at 7:36 AM, Andrew Otto <o...@wikimedia.org> wrote:
>
>> Wow Sam, yeah, if this cookie works for you, it will make many things
>> much easier for us.  Check it out and let us know.  If it doesn’t work for
>> some reason, we can figure out the backend geocoding part.
>>
>>
>>
>> On Thu, Feb 1, 2018 at 2:43 AM, Sam Smith <samsm...@wikimedia.org> wrote:
>>
>>> On Tue, Jan 30, 2018 at 8:02 AM, Andrew Otto <o...@wikimedia.org> wrote:
>>>
>>>> > Using the GeoIP cookie will require reconfiguring the EventLogging
>>>> varnishkafka instance [0]
>>>>
>>>> I’m not familiar with this cookie, but, if we used it, I thought it
>>>> would be sent back by the client in the event. E.g. event.country =
>>>> response.headers.country; EventLogging.emit(event);
>>>>
>>>> That way, there’s no additional special logic needed on the server side
>>>> to geocode or populate the country in the event.
>>>>
>>>
>>> Hah! I didn't think about accessing the GeoIP cookie on the client. As
>>> you say, the implementation is quite easy.
>>>
>>> My only concern with this approach is the duplication of the value
>>> between the cookie, which is sent in every HTTP request to the
>>> /beacon/event endpoint, and the event itself. This duplication seems
>>> reasonable when balanced against capturing either: the client IP and then
>>> doing similar geocoding further along in the pipeline; or the cookie for
>>> all requests to that endpoint and then discarding them further along in the
>>> pipeline. It also reflects a seemingly core principle of the EventLogging
>>> system: that it doesn't capture potentially PII by default.
>>>
>>> -Sam
>>>
>>>
>>>
>>> _______________________________________________
>>> Analytics mailing list
>>> Analytics@lists.wikimedia.org
>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>>
>>>
>>
>
>
> --
> Tilman Bayer
> Senior Analyst
> Wikimedia Foundation
> IRC (Freenode): HaeB
>