Re: Handling burst I/O when using tumbling/sliding windows

2018-10-15 Thread Piotr Nowojski
Hi, Just to sum up this thread: the discussion in the aforementioned design doc concluded that there is no need for API changes. The semantics that Rong was asking for can be achieved by implementing a custom WindowAssigner that mimics one of the existing ones, like TumblingEventTimeWindows, and
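
To illustrate the kind of custom assigner discussed in this thread, here is a minimal stand-alone sketch of the window arithmetic only. It is an assumption-laden illustration, not the implementation from the design doc: the class `StaggeredTumblingWindows` and its methods are hypothetical names, it does not extend Flink's `WindowAssigner`, and picking one random offset per assigner instance is just one way to relax the "same key, same offset" requirement mentioned further down the thread.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-alone sketch: plain Java, no Flink dependency.
// It only shows how a fixed offset shifts tumbling window boundaries.
class StaggeredTumblingWindows {
    private final long sizeMillis;   // window length
    private final long offsetMillis; // shift applied to every window boundary

    StaggeredTumblingWindows(long sizeMillis, long offsetMillis) {
        if (sizeMillis <= 0) {
            throw new IllegalArgumentException("window size must be positive");
        }
        if (offsetMillis < 0 || offsetMillis >= sizeMillis) {
            throw new IllegalArgumentException("offset must be in [0, size)");
        }
        this.sizeMillis = sizeMillis;
        this.offsetMillis = offsetMillis;
    }

    // Assumption: each assigner instance draws its offset once, so different
    // instances fire their windows at different points within the interval.
    static StaggeredTumblingWindows withRandomOffset(long sizeMillis) {
        long offset = ThreadLocalRandom.current().nextLong(sizeMillis);
        return new StaggeredTumblingWindows(sizeMillis, offset);
    }

    // Start of the window that contains the given timestamp.
    long windowStart(long timestamp) {
        return timestamp - Math.floorMod(timestamp - offsetMillis, sizeMillis);
    }

    // Returns {start, end} with an exclusive end.
    long[] assignWindow(long timestamp) {
        long start = windowStart(timestamp);
        return new long[] {start, start + sizeMillis};
    }
}
```

With a 10,000 ms window and a 3,000 ms offset, the timestamp 12,500 falls into the window [3000, 13000), whereas an aligned assigner (offset 0) would place it in [10000, 20000). Spreading offsets this way means not all windows close at the same instant, which is the burst the thread is trying to avoid.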

Re: Handling burst I/O when using tumbling/sliding windows

2018-10-10 Thread Rong Rong
Hi Piotrek, Thanks for the feedback and reviews. Yes, as I explained previously in reply to point (2B), I think it is possible to create our own customized window assigner without any API change if we eliminate the requirement that *"the same key should always result in the same offset"* I

Re: Handling burst I/O when using tumbling/sliding windows

2018-10-09 Thread Piotr Nowojski
Hi, Sorry for getting back so late and thanks for the improved document :) I think I get your idea now. You are trying (or have you already done it?) to implement a custom window assigner that would work as in [Figure 3] from your document? I think that should indeed be possible

Re: Handling burst I/O when using tumbling/sliding windows

2018-10-01 Thread Rong Rong
Hi Piotrek, Thanks for the quick response. To follow up on the questions: Re 1). Yes, it is causing network I/O issues on Kafka itself. Re 2a). Actually, I thought about it last weekend and I think there's a workaround: we directly duplicated the key extraction logic in our window

Re: Handling burst I/O when using tumbling/sliding windows

2018-09-28 Thread Piotr Nowojski
Hi, Thanks for the response again :) Re 1). Do you mean that this extra burst of external I/O network traffic is disturbing other systems reading from/writing to Kafka? Or Kafka itself? Re 2a) Yes, it should be relatively simple; however, any new brick makes the overall component

Re: Handling burst I/O when using tumbling/sliding windows

2018-09-28 Thread Rong Rong
Hi Piotrek, Thanks for getting back to me so quickly. Let me explain. Re 1). As I explained in the doc, we are using a basic Kafka-in, Kafka-out system with the same number of partitions on both sides. It is causing degraded performance in external I/O network traffic. It is definitely possible to

Re: Handling burst I/O when using tumbling/sliding windows

2018-09-28 Thread Piotr Nowojski
Hi, Re 1. Can you be more specific? What system are you using, what’s happening, and how does it break? While delaying window firing is probably the most cost-effective solution for this particular problem, it has some disadvantages: a) putting even more logic into an already complicated component

Re: Handling burst I/O when using tumbling/sliding windows

2018-09-27 Thread Rong Rong
Hi Piotrek, Yes, to be clearer: 1) the network I/O issue I am referring to is between Flink and the external sink. We did not see issues between operators. 2) Yes, we've considered rate-limiting sink functions as well, which is also mentioned in the doc, along with some of the pros and cons we

Re: Handling burst I/O when using tumbling/sliding windows

2018-09-27 Thread Piotr Nowojski
Hi, Thanks for the proposal. Could you provide more background/explanation/motivation on why you need such a feature? What do you mean by “network I/O” degradation? On their own, burst writes shouldn’t cause problems within Flink. If they do, we might want to fix the original underlying problem

Handling burst I/O when using tumbling/sliding windows

2018-09-26 Thread Rong Rong
Hi Dev, I was wondering if there's been any previous discussion regarding how to handle burst network I/O when deploying Flink applications with window operators. We've recently seen some significant network I/O degradation when trying to use sliding windows to perform rolling aggregations. The pattern