Matthew Hayes created DATAFU-13:
-----------------------------------

             Summary: Hourglass fixed-length windows should be robust to reappearing data
                 Key: DATAFU-13
                 URL: https://issues.apache.org/jira/browse/DATAFU-13
             Project: DataFu
          Issue Type: Improvement
            Reporter: Matthew Hayes
            Assignee: Matthew Hayes


For a fixed-length window where output is being reused, the oldest day is 
"subtracted" from the previous output as the window advances. However, a day 
may be missing when the output is computed. If that day's data reappears 
later, it will eventually be subtracted even though it was never included in 
the previous output. This can yield "negative" values.
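
The failure mode above can be reproduced with a small sketch. This is not 
Hourglass code: the function advance_naive, the three-day window, and the 
per-day counts are all made up for illustration, and the output is assumed to 
be a simple sum of per-day counts.

```python
def advance_naive(prev_output, available, oldest_day, newest_day):
    """Advance the window by one day with no coverage tracking:
    add the newest day and subtract the oldest day unconditionally."""
    return (prev_output
            + available.get(newest_day, 0)
            - available.get(oldest_day, 0))


# Day 1 is missing when the first window (days 1..3) is computed.
available = {2: 1, 3: 1}
first_output = sum(available.get(day, 0) for day in range(1, 4))

# Later, day 1 reappears with 5 events and day 4 arrives.
available[1] = 5
available[4] = 1

# Advancing to days 2..4 subtracts day 1, which was never added.
second_output = advance_naive(first_output, available,
                              oldest_day=1, newest_day=4)

# What the output for days 2..4 should have been.
expected_output = sum(available[day] for day in range(2, 5))

print(first_output, second_output, expected_output)
```

In this example the reused output goes negative, while recomputing days 2..4 
from scratch gives a positive sum.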

Hourglass doesn't currently track the intervals included in the output, only 
the start and end time. This could easily be solved by also recording the 
interval coverage with the output. Tracking coverage would also mean that when 
missing data reappears inside the window, it can be included in the next 
output.
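
A minimal sketch of the coverage idea, assuming the output is a sum of per-day 
counts and that the coverage is stored as a set of day numbers next to the 
value. The function advance_with_coverage and this representation are 
hypothetical and do not reflect how Hourglass actually stores its output 
metadata. The sketch also assumes that a day which was previously included is 
still readable when it has to be subtracted.

```python
def advance_with_coverage(prev_value, prev_days, available,
                          window_start, window_end):
    """Advance a fixed-length window using the recorded coverage.

    prev_days is the set of days actually included in prev_value.
    Only days that were included are subtracted, and any day in the
    new window that is available but not yet included is added.
    """
    window = set(range(window_start, window_end + 1))
    to_subtract = prev_days - window
    to_add = {day for day in window - prev_days if day in available}
    value = (prev_value
             - sum(available[day] for day in to_subtract)
             + sum(available[day] for day in to_add))
    return value, (prev_days - to_subtract) | to_add


# Case 1: day 1 is missing for window 1..3, then reappears after the
# window has moved to 2..4. It was never included, so it is not subtracted.
available = {2: 1, 3: 1}
value_a, days_a = advance_with_coverage(0, set(), available, 1, 3)
available.update({1: 5, 4: 1})
value_b, days_b = advance_with_coverage(value_a, days_a, available, 2, 4)

# Case 2: day 2 is missing for window 1..3, then reappears while still
# inside window 2..4. It is picked up in the next output.
available2 = {1: 1, 3: 1}
value_c, days_c = advance_with_coverage(0, set(), available2, 1, 3)
available2.update({2: 5, 4: 1})
value_d, days_d = advance_with_coverage(value_c, days_c, available2, 2, 4)

print(value_b, sorted(days_b), value_d, sorted(days_d))
```

Storing the coverage as an explicit set keeps the example short; a list of 
date ranges would serve the same purpose for long windows.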

From: [Issue #76 on GitHub|https://github.com/linkedin/datafu/issues/76]




--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
