Re: [Analytics] Maybe Analytics project in Phabricator

2015-04-27 Thread Andre Klapper
On Fri, 2015-04-17 at 18:15 -0700, Grace Gellerman wrote:
 The project is intended for Analytics customers to alert Analytics of
 work in their products that they think might intersect with ours. It's
 a way of giving Analytics an early heads-up so that Analytics can
 either say, "Thanks for the early warning!" or "Thanks, but this does
 not touch Analytics."
 
 
 We can remind participants at Scrum-of-Scrums that they can use this
 project.

Isn't that pretty much what
https://phabricator.wikimedia.org/tag/blocked-on-analytics/ is for? 
Both projects should receive urgent triage anyway (and hence a decision
whether a task is actually Analytics territory or not), but I see zero
folks listed under Watchers [1] on either project page?

 So for now, please do not archive it.  Thanks!

I would like to archive that project soon, given my comment above.
Furthermore, that project has been entirely unused (maybe because nobody
has ever heard of that project...).

If I imagined every project to have a corresponding maybe-project, we'd
just create unneeded abstraction layers.
Newly created tasks should receive triage. One triage step is deciding
whether the task is associated with the right project(s). No "maybe" needed.

Cheers,
andre

[1] 
https://www.mediawiki.org/wiki/Phabricator/Help#Receiving_updates_and_notifications
-- 
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/


___
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics


Re: [Analytics] [Technical] WMF-Last-Access

2015-04-27 Thread Marcel Ruiz Forns
+1 'last'


Re: [Analytics] [Technical] WMF-Last-Access

2015-04-27 Thread Oliver Keyes
+1 for ISO dates. They're also more parsable by researchers.

On 27 April 2015 at 18:57, Dario Taraborelli dtarabore...@wikimedia.org wrote:
 I also noticed the cookie stores a string with a 3-letter month 
 (27-Apr-2015), any reason not to use a shorter ISO date instead (2015-04-27)?

 On Apr 27, 2015, at 3:00 PM, Marcel Ruiz Forns mfo...@wikimedia.org wrote:

 +1 'last'



-- 
Oliver Keyes
Research Analyst
Wikimedia Foundation



Re: [Analytics] [Technical] WMF-Last-Access

2015-04-27 Thread Dan Andreescu
Gonna stop this ISO date fancy bandwagon right here :)

We could do it with a bunch of VCL code but that affects performance of the
site and we'd rather take the hit in analytics.  We could look into making
a UDF that deals with this and other common date code we'd want to DRY.
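A minimal sketch of the kind of date helper mentioned above, written in Python for illustration rather than as an actual Hive UDF. The function name `last_access_to_iso` and the month table are hypothetical; the only thing taken from this thread is the cookie's example value, 27-Apr-2015, and the ISO form 2015-04-27.

```python
import re
from datetime import date
from typing import Optional

# Assumption: the cookie value looks like "27-Apr-2015" (day, English
# 3-letter month, 4-digit year), as in the example quoted in this thread.
_MONTHS = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}
_COOKIE_DATE = re.compile(r"^(\d{1,2})-([A-Za-z]{3})-(\d{4})$")


def last_access_to_iso(raw: str) -> Optional[str]:
    """Convert '27-Apr-2015' to '2015-04-27'; return None if unparsable."""
    match = _COOKIE_DATE.match(raw.strip())
    if match is None:
        return None
    day, month_name, year = match.groups()
    month = _MONTHS.get(month_name.lower())
    if month is None:
        return None
    try:
        # date() rejects impossible dates such as 31-Feb-2015.
        return date(int(year), month, int(day)).isoformat()
    except ValueError:
        return None
```

The month table is spelled out by hand so the result does not depend on the locale of the machine running it, which `strptime` with `%b` would.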

On Mon, Apr 27, 2015 at 4:02 PM, Oliver Keyes oke...@wikimedia.org wrote:

 +1 for ISO dates. They're also more parsable by researchers.

 On 27 April 2015 at 18:57, Dario Taraborelli dtarabore...@wikimedia.org
 wrote:
  I also noticed the cookie stores a string with a 3-letter month
 (27-Apr-2015), any reason not to use a shorter ISO date instead
 (2015-04-27)?
 
  On Apr 27, 2015, at 3:00 PM, Marcel Ruiz Forns mfo...@wikimedia.org
 wrote:
 
  +1 'last'



 --
 Oliver Keyes
 Research Analyst
 Wikimedia Foundation



Re: [Analytics] [Ops] udp2log shutdown (for analytics instances) next week

2015-04-27 Thread Jeff Green

Ok thanks for the heads up!

On Mon, 27 Apr 2015, Andrew Otto wrote:


Hi again!
Today I turned off most udp2log webrequest filters.  For now, I have left the
Fundraising filters, as well as the 5xx and sampled-1000 filters, running.  All
of these filters are now running on erbium.  oxygen's udp2log instance has been
shut off.

Instead of constantly updating this thread, I will track this here: 
https://phabricator.wikimedia.org/T97294

Thanks!

On Tue, Apr 21, 2015 at 3:49 PM, Andrew Otto ao...@wikimedia.org wrote:
  Hi all!

  Now that all data that is generated by udp2log is also being generated by
  the Analytics Cluster, we are finally ready to turn off analytics udp2log
  instances.  I will start with the ones that are used to generate the logs
  on stat1002 at /a/squid/archive.  The (identical) cluster generated logs
  can be found on stat1002 at /a/log/webrequest/archive.  I will paste the
  contents of the README file in /a/squid/archive describing the differences
  at the bottom of this email.

  If you use any of the logs in /a/squid/archive for regular statistics, you
  will need to switch your code to use files in /a/log/webrequest/archive
  instead.  I plan to start turning off udp2log instances on Monday April 27th
  (that’s next week!).


  From the README:

  [@stat1002:/a/squid/archive] $ cat README.migrate-to-hive.2015-02-17
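  ***********************************************************************
  *                                                                     *
  *  This directory will run stale once udp2log will get turned off.    *
  *  Please use the corresponding TSVs from /a/log/webrequest/archive/  *
  *  instead.                                                           *
  *                                                                     *
  ***********************************************************************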



  The TSV files in this directory underneath /a/squid/archive get
  generated by udp2log and suffer from

  * Sub-par data quality (E.g.: udp2log had an inherent loss).
  * Lack of a way to backfill/fix data.
  * Some files consuming https requests twice, which made filtering
    necessary.
  * Confusing naming scheme, where each file covered 24 hours, not from
    midnight to midnight but from ~06:30 the previous day to ~06:30 the
    current day.

  The new TSVs at /a/log/webrequest/archive/ contain the same
  information but get generated by Hive, and address the above four
  issues:

  * By using Hive's webrequest table as input, the inherent loss is
    gone. Also statistics on the hour's data quality are available.
  * Hive data allows backfilling/fixing data.
  * Only data from the varnishes gets picked up. So https traffic no
    longer gets duplicated.
  * The files now cover 24 hours from midnight to midnight. No more
    stitching/cutting is needed to get the logs for a given day.


  Please migrate to using the Hive-generated TSVs from

    /a/log/webrequest/archive/


  Thanks!  I’ll keep you updated as this happens.

  -Andrew Otto
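For anyone with scripts that hard-code the old directory, here is a minimal sketch of rewriting paths to the new location. It assumes, loudly, that a file keeps the same relative path under /a/log/webrequest/archive as it had under /a/squid/archive; the README above only says "corresponding TSVs", so check the actual layout before relying on this. The function name `migrate_path` and the example file name are hypothetical.

```python
from pathlib import PurePosixPath

# The two directories named in the README above.
OLD_ROOT = PurePosixPath("/a/squid/archive")
NEW_ROOT = PurePosixPath("/a/log/webrequest/archive")


def migrate_path(path: str) -> str:
    """Map a path under OLD_ROOT to the same relative path under NEW_ROOT.

    Assumption: the relative layout is identical in both directories.
    Paths outside OLD_ROOT are returned unchanged.
    """
    try:
        relative = PurePosixPath(path).relative_to(OLD_ROOT)
    except ValueError:
        # Not under the old archive directory; leave it alone.
        return path
    return str(NEW_ROOT / relative)
```

For example, `migrate_path("/a/squid/archive/example/file.tsv.gz")` would give `/a/log/webrequest/archive/example/file.tsv.gz`, where `example/file.tsv.gz` is a made-up name.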







Re: [Analytics] Maybe Analytics project in Phabricator

2015-04-27 Thread Kevin Leduc
+1 to Dan

On Monday, April 27, 2015, Dan Andreescu dandree...@wikimedia.org wrote:
 Sounds to me like the nuance we were trying to go for is causing
 confusion.  This is unintended and my opinion is that we should remove
 maybe-analytics and just tell everyone to use blocked-on-analytics as
 liberally as they wish.

 On Mon, Apr 27, 2015 at 1:45 AM, Andre Klapper aklap...@wikimedia.org
 wrote:

 On Fri, 2015-04-17 at 18:15 -0700, Grace Gellerman wrote:
  The project is intended for Analytics customers to alert Analytics of
  work in their products that they think might intersect with ours. It's
  a way of giving Analytics an early heads-up so that Analytics can
  either say, "Thanks for the early warning!" or "Thanks, but this does
  not touch Analytics."
 
 
  We can remind participants at Scrum-of-Scrums that they can use this
  project.

 Isn't that pretty much what
 https://phabricator.wikimedia.org/tag/blocked-on-analytics/ is for?
 Both projects should receive urgent triage anyway (and hence a decision
 whether a task is actually Analytics territory or not), but I see zero
 folks listed under Watchers [1] on either project page?

  So for now, please do not archive it.  Thanks!

 I would like to archive that project soon, given my comment above.
 Furthermore, that project has been entirely unused (maybe because nobody
 has ever heard of that project...).

 If I imagined every project to have a corresponding maybe-project, we'd
 just create unneeded abstraction layers.
 Newly created tasks should receive triage. One triage step is deciding
 whether the task is associated with the right project(s). No "maybe" needed.

 Cheers,
 andre

 [1]
https://www.mediawiki.org/wiki/Phabricator/Help#Receiving_updates_and_notifications
 --
 Andre Klapper | Wikimedia Bugwrangler
 http://blogs.gnome.org/aklapper/




Re: [Analytics] [Technical] WMF-Last-Access

2015-04-27 Thread Dario Taraborelli
I also noticed the cookie stores a string with a 3-letter month (27-Apr-2015), 
any reason not to use a shorter ISO date instead (2015-04-27)?

 On Apr 27, 2015, at 3:00 PM, Marcel Ruiz Forns mfo...@wikimedia.org wrote:
 
 +1 'last'