Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-02-07 Thread Rui Fan
Thanks, Emre, for the feedback! I still think max/mean is simpler and easier for users to understand, but I don’t have a strong opinion about it. This proposal is definitely useful for Flink users! To make sure it delivers value to users, would you mind if we wait for a while and check if there
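
As an illustration only, here is a minimal sketch of the max/mean ratio mentioned above, assuming the inputs are per-subtask record rates such as the numRecordsInPerSecond values discussed in this thread. The function name `max_mean_skew`, the zero-traffic convention, and the sample numbers are hypothetical and are not taken from the FLIP.

```python
from statistics import mean


def max_mean_skew(rates):
    """Return max(rates) / mean(rates); 1.0 means perfectly balanced.

    `rates` is a hypothetical list of per-subtask record rates.
    """
    if not rates:
        raise ValueError("need at least one subtask")
    avg = mean(rates)
    if avg == 0:
        # No traffic at all: treat as balanced (an assumption, not from the FLIP).
        return 1.0
    return max(rates) / avg


balanced = max_mean_skew([10, 10, 10, 10])  # every subtask gets the same rate
skewed = max_mean_skew([40, 0, 0, 0])       # one subtask receives everything
```

With this measure a balanced vertex scores 1.0 and a vertex where one of four subtasks receives everything scores 4.0, so the upper bound grows with parallelism.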

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-02-01 Thread Kartoglu, Emre
Hi Rui, Thanks for the useful feedback and for caring about the user experience. I will update the FLIP based on one comment; I consider this a minor update. Please find my detailed responses below. "numRecordsInPerSecond makes sense to me, and I think it's necessary to mention it in the FLIP

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-31 Thread Rui Fan
> I was thinking about using the existing numRecordsInPerSecond metric. numRecordsInPerSecond makes sense to me, and I think it's necessary to mention it in the FLIP wiki so that other developers can easily understand it. WDYT? BTW, that's why I asked whether the data skew score means total

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-31 Thread Kartoglu, Emre
Hi Rui, "and provide the total and current score in the detailed tab. I didn't see the detailed design in the FLIP; would you mind improving the design doc? Thanks". It will essentially be a basic list view similar to the "Checkpoints" tab. I only briefly mentioned this in the FLIP because it

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-31 Thread Rui Fan
Sorry for the late reply. > So you would have a high data skew while 1 subtask is receiving all the data, but on average (say over 1-2 days) data skew would come down to 0 because all subtasks would have received their portion of the data. > I'm inclined to think that the current proposal might
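
The scenario quoted above can be sketched as follows. This is a hypothetical illustration, not the FLIP's design: it uses the max/mean ratio mentioned elsewhere in this thread as the skew measure (where 1.0 means balanced), and the helper `max_mean` and the per-period record counts are made up for the example.

```python
def max_mean(values):
    """max/mean ratio; 1.0 means evenly spread (zero traffic is treated as even)."""
    total = sum(values)
    if total == 0:
        return 1.0
    return max(values) * len(values) / total


# Hypothetical records received per subtask in four consecutive periods:
# in each period, a different subtask receives all the data.
periods = [
    [100, 0, 0, 0],
    [0, 100, 0, 0],
    [0, 0, 100, 0],
    [0, 0, 0, 100],
]

# Skew computed on the most recent period only.
latest_skew = max_mean(periods[-1])

# Skew computed on totals accumulated since the job started.
totals = [sum(per_subtask) for per_subtask in zip(*periods)]
total_skew = max_mean(totals)
```

Here the latest period looks fully skewed while the accumulated totals look perfectly balanced, which is why the choice between total and recent counts matters.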

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-23 Thread Kartoglu, Emre
Hi Krzysztof, Thank you for the feedback! Please find my comments below. 1. Configurability: Adding a feature flag / configuration to enable this is still on the table as far as I am concerned. However, I believe adding a new metric shouldn't warrant a flag/configuration. One might argue that

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-23 Thread Krzysztof Dziołak
Hi Emre, Thank you for driving this proposal. I've got two questions about the extensions to the proposal that are not captured in the FLIP. 1. Configurability - what kind of configuration would you propose to maintain for this feature? Would an on/off switch and/or aggregated period length

Re: Re:[DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-16 Thread Kartoglu, Emre
Hi Xuyang, Thanks for the feedback! Please find my response below. > 1. How will the colors of vertices with high data skew scores be unified with > existing backpressure and high busyness colors on the UI? Users should be able to distinguish at a glance which vertices in the entire job graph is

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-16 Thread Kartoglu, Emre
Hi Rui, Thanks for the feedback. Please find my response below: > The number_of_records_received_by_each_subtask is the total received records, > right? No, it's not the total. I understand why this is confusing. I had initially wanted to name it "the list of number of records received by each

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-15 Thread Rui Fan
Thanks, Emre, for driving this proposal! It's very useful for troubleshooting. I have a question: The number_of_records_received_by_each_subtask is the total received records, right? I'm not sure whether we should check data skew based on the latest duration period. In production, I found

Re:[DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-15 Thread Xuyang
Hi, Emre. In large-scale production jobs, data skew often occurs. Having a metric on the UI that reflects data skew without the need for manual inspection of each vertex by clicking on them would be quite cool. This could help users quickly identify problematic nodes,