[ 
https://issues.apache.org/jira/browse/PIO-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15875107#comment-15875107
 ] 

Pat Ferrel commented on PIO-45:
-------------------------------

The latest SelfCleaningDatasource still has some issues. I created a template 
that only cleans the EventServer and created an integration test to illustrate.

1) The following $sets, in pseudocode:
{code:java}
Nexus,$set, "categories": ["Tablets"], today
Nexus,$set, "categories": ["Tablets", "Electronics"], today - 2 days
Nexus,$set, "categories": ["Tablets", "Electronics", "Google"], today - 6 days
{code}

should aggregate today into 
{code:java}
Nexus,$set, "categories": ["Tablets"]
{code}

But the actual value comes out as:
{code:java}
Nexus,$set, "categories": ["Tablets", "Electronics", "Google"]
{code}

The aggregate is basically the last/most recent $set/$unset for any named 
property.

For a given object, the aggregate contains every property touched over all 
time, but not every value each property has taken on. Each value comes only 
from the most recent $set.

So unique properties accumulate until $delete of the object. But the most 
recent $set wins in aggregation.
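
To make the expected behavior concrete, here is a minimal Python sketch of 
the aggregation rule described above. It is not the SelfCleaningDatasource 
code; the function name, the (event, properties, time) tuple layout, and the 
fixed date standing in for "today" are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def aggregate_properties(events):
    """Fold the $set/$unset/$delete events of ONE object into its properties.

    `events` is a hypothetical list of (event_name, properties, event_time)
    tuples; this layout is an assumption, not the PredictionIO event schema.
    """
    props = {}
    # Apply events oldest first so that later events overwrite earlier ones.
    for name, properties, _ in sorted(events, key=lambda e: e[2]):
        if name == "$set":
            props.update(properties)   # most recent value wins per property
        elif name == "$unset":
            for key in properties:
                props.pop(key, None)   # drop only the named properties
        elif name == "$delete":
            props = {}                 # object removed; nothing carries over
    return props


today = datetime(2017, 2, 20)  # arbitrary fixed date standing in for "today"
nexus_events = [
    ("$set", {"categories": ["Tablets"]}, today),
    ("$set", {"categories": ["Tablets", "Electronics"]},
     today - timedelta(days=2)),
    ("$set", {"categories": ["Tablets", "Electronics", "Google"]},
     today - timedelta(days=6)),
]

print(aggregate_properties(nexus_events))  # {'categories': ['Tablets']}
```

The events are sorted by time before folding, so the order in which they 
are stored or read back does not affect the result.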

This seems super important to get right, since this is the only method to trim 
and compact the EventStore; without it working correctly, events accumulate 
forever.

> SelfCleaningDatasource erases all data
> --------------------------------------
>
>                 Key: PIO-45
>                 URL: https://issues.apache.org/jira/browse/PIO-45
>             Project: PredictionIO
>          Issue Type: Bug
>    Affects Versions: 0.10.0-incubating
>            Reporter: Pat Ferrel
>            Assignee: Alexander  Merritt
>            Priority: Blocker
>             Fix For: 0.11.0
>
>         Attachments: import_handmade_simple.py, 
> sample-time-window-and-downsample-data.txt
>
>
> as integrated into the UR, in the integration-test, the SelfCleaningDataset 
> erases all data. This feature works fine in the AML version of PIO.
> Although not tested one could assume that this would be true with any other 
> Datasource in other templates.
> [~emergentorder] can you check to see if the PIO merge was done correctly.



