Matthew Hayes created DATAFU-13:
-----------------------------------
Summary: Hourglass fixed-length windows should be robust to
reappearing data
Key: DATAFU-13
URL: https://issues.apache.org/jira/browse/DATAFU-13
Project: DataFu
Issue Type: Improvement
Reporter: Matthew Hayes
Assignee: Matthew Hayes
For a fixed-length window where output is being reused, the oldest day is
"subtracted" from the previous output as the window advances. However, a day
may be missing when the output is computed. If that day's data reappears
later, it will eventually be subtracted off as it leaves the window, even
though it was never included in the output. This could yield "negative" values.
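
To make the failure mode concrete, here is a minimal sketch in Python. It is not Hourglass code: the function `advance_naive`, the 2-day window, and the per-day counts are all hypothetical, chosen only to illustrate how subtracting a day that was never added drives the total below zero.

```python
def advance_naive(prev_total, available, oldest_day, newest_day):
    """Reuse the previous output: add the newest day, subtract the oldest.

    `available` maps day -> count for the days whose data exists right now.
    This hypothetical helper does NOT know which days went into prev_total.
    """
    return prev_total + available.get(newest_day, 0) - available.get(oldest_day, 0)


# Hypothetical per-day counts.
true_counts = {1: 5, 2: 7, 3: 0, 4: 0}

# Day 2 is missing when the window [1, 2] is first computed.
available = {1: 5}
total_1_2 = sum(available.get(day, 0) for day in (1, 2))

# Day 2 reappears before the window advances again.
available = dict(true_counts)

# Window [2, 3]: day 1 is subtracted, day 3 is added. Day 2 is never added.
total_2_3 = advance_naive(total_1_2, available, oldest_day=1, newest_day=3)

# Window [3, 4]: day 2 is subtracted although it was never included.
total_3_4 = advance_naive(total_2_3, available, oldest_day=2, newest_day=4)

print(total_1_2, total_2_3, total_3_4)  # 5 0 -7
```

The last window ends up at -7 because day 2's count of 7 is subtracted without ever having been added.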
Hourglass currently tracks only the start and end time of the output, not
the intervals actually included in it. This could easily be solved by
recording the interval coverage as well. Tracking coverage also means that
when the data reappears it can be included in the next output.
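
One way the proposed coverage tracking could look is sketched below in Python. This is an assumption-laden illustration, not the Hourglass implementation: `advance_with_coverage`, the set-of-days representation of coverage, and the sample counts are all hypothetical. The idea is to subtract only days that were actually included, and to add any in-window day that is available but not yet covered.

```python
def advance_with_coverage(prev_total, prev_covered, available, window_days):
    """Advance a fixed-length window while tracking which days are included.

    prev_covered: set of days actually included in prev_total.
    available:    dict of day -> count for the days whose data exists now.
    window_days:  the days of the new window.
    Returns (new_total, new_covered). All names here are hypothetical.
    """
    window = set(window_days)
    covered = set(prev_covered)
    leaving = covered - window

    # If a previously included day can no longer be read, it cannot be
    # subtracted, so fall back to recomputing the window from scratch.
    if any(day not in available for day in leaving):
        covered = {day for day in window if day in available}
        return sum(available[day] for day in covered), covered

    total = prev_total
    # Subtract only days that were really part of the previous output.
    for day in leaving:
        total -= available[day]
        covered.discard(day)
    # Add in-window days that are available but not yet covered,
    # including days that were missing earlier and have reappeared.
    for day in window - covered:
        if day in available:
            total += available[day]
            covered.add(day)
    return total, covered


# Hypothetical per-day counts; day 2 is missing for the first output.
true_counts = {1: 5, 2: 7, 3: 0, 4: 0}

t1, c1 = advance_with_coverage(0, set(), {1: 5}, [1, 2])
t2, c2 = advance_with_coverage(t1, c1, true_counts, [2, 3])  # day 2 is back
t3, c3 = advance_with_coverage(t2, c2, true_counts, [3, 4])

print(t1, t2, t3)  # 5 7 0
```

In this sketch the reappearing day 2 is picked up in the second output, and it is subtracted only once it has actually been counted, so the total never goes negative.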
From: [Issue #76 on GitHub|https://github.com/linkedin/datafu/issues/76]