[GitHub] flink issue #5342: [FLINK-8479] Timebounded stream join

2018-07-05 Thread kl0u
Github user kl0u commented on the issue:

https://github.com/apache/flink/pull/5342
  
Thanks for the work @florianschmidt1994 ! Merging this.


---


[GitHub] flink issue #5342: [FLINK-8479] Timebounded stream join

2018-06-13 Thread kl0u
Github user kl0u commented on the issue:

https://github.com/apache/flink/pull/5342
  
Thanks @florianschmidt1994 . I will, but may be not today.


---


[GitHub] flink issue #5342: [FLINK-8479] Timebounded stream join

2018-06-13 Thread florianschmidt1994
Github user florianschmidt1994 commented on the issue:

https://github.com/apache/flink/pull/5342
  
@kl0u I made some changes on how I handle events on either side of the 
stream. By introducing some generic methods we can now reuse large parts of the 
code for either input stream and remove a lot of code duplication. Could you 
have another look at this?


---


[GitHub] flink issue #5342: [FLINK-8479] Timebounded stream join

2018-03-08 Thread kl0u
Github user kl0u commented on the issue:

https://github.com/apache/flink/pull/5342
  
Changes look good to me! I will let it run on Travis and then merge.


---


[GitHub] flink issue #5342: [FLINK-8479] Timebounded stream join

2018-01-30 Thread florianschmidt1994
Github user florianschmidt1994 commented on the issue:

https://github.com/apache/flink/pull/5342
  
@hequn8128 Thank you for the review. Regarding your concern about delaying 
the watermark I added some sketches and description about my thought process to 
the design document.


---


[GitHub] flink issue #5342: [FLINK-8479] Timebounded stream join

2018-01-29 Thread florianschmidt1994
Github user florianschmidt1994 commented on the issue:

https://github.com/apache/flink/pull/5342
  
@bowenli86 the document should now be be public for everyone to comment on. 
Yes, it caches data on either side, and for each incoming element it looks 
up eligible records from the other side, and joins and emits those if they 
fulfil the user criteria. Entries get removed from the cache whenever they are 
too old to be joined, which is determined by a combination of the current 
watermark and the time boundary defined by the user.


---


[GitHub] flink issue #5342: [FLINK-8479] Timebounded stream join

2018-01-26 Thread bowenli86
Github user bowenli86 commented on the issue:

https://github.com/apache/flink/pull/5342
  
Very interesting! two things:
1. can you make the google doc publicly viewable? I cannot access it right 
now
2. how does it handle event time window joins of two streams, where data in 
one stream always quite late than the other? For example, we are joining stream 
A and B on a 10 min event-time tumbling window from 12:00 -12:10, 12:10 - 
12:20 data in stream B always arrive 30 mins later than the data in stream 
A. How does the operators handle that? Does it cache A's data until B's data 
arrives, do the join, and remove A's data from cache?   (I haven't read the 
code in detail, just try to get a general idea of the overall design)


---


[GitHub] flink issue #5342: [FLINK-8479] Timebounded stream join

2018-01-24 Thread florianschmidt1994
Github user florianschmidt1994 commented on the issue:

https://github.com/apache/flink/pull/5342
  
Yeah seems like I made a typo there. It's actually 
https://issues.apache.org/jira/browse/FLINK-8479, I just fixed the title. 


---