Hi!
I'd like to hear from stakeholders about purging old data from the
eventlogging database. Yes, no, why [not], etc.
I understand from Ori that there is a 90 day retention policy, and that
purging has been discussed previously but not addressed for various
reasons. Certainly there are many time
On May 20, 2014, at 10:09 PM, Sean Pringle wrote:
> Hi!
>
> I'd like to hear from stakeholders about purging old data from the
> eventlogging database. Yes, no, why [not], etc.
>
> I understand from Ori that there is a 90 day retention policy, and that
> purging has been discussed previously
On Tue, May 20, 2014 at 10:36 PM, Dario Taraborelli <
dtarabore...@wikimedia.org> wrote:
> On May 20, 2014, at 10:09 PM, Sean Pringle wrote:
>
> Hi!
>
> I'd like to hear from stakeholders about purging old data from the
> eventlogging database. Yes, no, why [not], etc.
>
> I understand from Ori t
>Not to hijack the thread, but: to do this in the schema itself confuses the
>structure of the data
>with the mechanics of its use. I think having a couple of helpers in
>JavaScript and PHP
> for simple random sampling is sufficient.
Much agree with Ori here. We would be bloating schema with prop
> The motivation behind your proposal is (I think) a desire to have a unified
> configuration interface for data collection jobs. This makes total sense and
> it's worth pursuing. I just don't think we should stuff everything into the
> schema. The schema is just that: a schema. It's a data mode
On Wed, May 21, 2014 at 5:03 PM, Ori Livneh wrote:
>
> On Tue, May 20, 2014 at 10:36 PM, Dario Taraborelli <
> dtarabore...@wikimedia.org> wrote:
>
>> On May 20, 2014, at 10:09 PM, Sean Pringle
>> wrote:
>>
>
>
>> *Existing schemas* would need to be audited on a case by case basis.
>>
>
> By who
On May 27, 2014, at 7:49 PM, Sean Pringle wrote:
> On Wed, May 21, 2014 at 5:03 PM, Ori Livneh wrote:
>
> On Tue, May 20, 2014 at 10:36 PM, Dario Taraborelli
> wrote:
> On May 20, 2014, at 10:09 PM, Sean Pringle wrote:
>
> Existing schemas would need to be audited on a case by case basis.
Second Dario for NavigationTiming data. Before archiving it I would like us to
have a project for processing it.
Also, graphs directly query the EL data store in many instances.
Removing the data would mean we will only be showing 90 days of data on
dashboards, that will send many complaints o
I just announced this potential change in Scrum of Scrums and the Mobile
team said they also would like to keep old data, but not for all of their
schemas. They're cleaning up their graphs and we should check with them
when we start deleting.
On Wed, May 28, 2014 at 2:56 AM, Nuria wrote:
> Sec
On Wed, May 28, 2014 at 10:50 AM, Dan Andreescu
wrote:
> I just announced this potential change in Scrum of Scrums and the Mobile
> team said they also would like to keep old data, but not for all of their
> schemas. They're cleaning up their graphs and we should check with them
> when we start
+1 to Dario's mention of the many schemas that just capture production DB
stuff in a better way.
Re. growth: Old growth experiment schemas continue to be a great resource
for checking old work and sometimes even new hypotheses. When Dario and
Kevin get around to us, I'll have a complete list of s
On Wed, May 28, 2014 at 11:26 PM, Steven Walling
wrote:
> My main question is what the rationale is. Is it to improve query
> performance on analytics dbs?
>
I imagine it will help, but it's probably not the primary reason. I imagine
Sean would like to have the database in a state of equilibrium
On Fri, May 30, 2014 at 3:28 PM, Ori Livneh wrote:
> On Wed, May 28, 2014 at 11:26 PM, Steven Walling
> wrote:
>
>> My main question is what the rationale is. Is it to improve query
>> performance on analytics dbs?
>>
>
> I imagine it will help, but it's probably not the primary reason. I
> imag
I see, I thought the concern was privacy rather than capacity. In that case we
should put in our backlog an item to sort out schemas and find the ones whose
data can be deleted. I will file an item to this effect.
In the future we will hopefully have this metadata about the schema available
somewhere.
Nuria, I believe that Dario already did that[1].
1. https://trello.com/c/F0DsiSXn/305-audit-historical-el-data-for-retention
On Fri, May 30, 2014 at 1:33 AM, Nuria wrote:
> I see, I thought the concern was privacy rather than capacity. In that case we
> should put in our backlog an item to sort o
On Thu, May 29, 2014 at 11:03 PM, Sean Pringle
wrote:
> On Fri, May 30, 2014 at 3:28 PM, Ori Livneh wrote:
>
>> On Wed, May 28, 2014 at 11:26 PM, Steven Walling
>> wrote:
>>
>>> My main question is what the rationale is. Is it to improve query
>>> performance on analytics dbs?
>>>
>>
>> I imagi