Github user rtudoran commented on the issue:
https://github.com/apache/flink/pull/3641
@fhueske Thanks for the example. First of all i am happy that we agree that
we need to emit something for every input :). I was scared that this will not
be the case
Now regarding the 2 options for the output - i would still have a
preference for the first option. The reason for this is that at least in the
case of proctime there is no 2 events that are simultaneous in absolute terms.
There is an implicit serialization (also based on how flink engine works...one
event will arrive before the other). It might simply be a coincidence that the
granularity that the computers now cannot measure system time with more fine
granularity...and because of that we would have 2 events which are apparently
with the same proc time stamp...
However, as you mentioned several times (and i agree) having a uniform
behavior might be a key point. In this case we would indeed implement the
behavior you suggested. - Let me know again if you are sure about this. I
raised my concerns but i am open to accept what you suggest.
In case you suggest to implement it with proctimes, i have 2 questions:
1) do you know if there is some example for timer in proc time (if not, now
problem - i will figure it out)
2) in the case of harness tests - the correct implementation of the
behavior might not match the test. If i write a test that does
{code}
testHarness.setProcessingTime(3)
testHarness.processElement(new StreamRecord(rInput, 1001)) // let us
assume that the system time is 1490906681
testHarness.processElement(new StreamRecord(rInput, 2002)) // let us
assume that the system time is still 1490906681
+ testHarness.processElement(new StreamRecord(rInput, 2003)) // let us
assume that the system time is now 1490906682
+ testHarness.processElement(new StreamRecord(rInput, 2004)) // let us
assume that the system time is now 1490906682
{code}
in this case if we do a Count, than the output would be
ev1 2
ev2 2
ev3 2
ev4 2
...instead of having all with an associated count of 4 (which would be in
the behavior you mention). This would be because i would assume that the timer
+1ms would be triggered based on system time.
In such a case - should we just build a limited test that would work?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---