Hi Xuyang,

Thanks for the feedback! Please find my responses below.

> 1. How will the colors of vertices with high data skew scores be unified with
> existing backpressure and high busyness colors on the UI? Users should be able
> to distinguish at a glance which vertices in the entire job graph are skewed.

The current proposal does not suggest changing the colours of the vertices based on data skew. In another exchange with Rui, we touched on why data skew might not necessarily be bad (for instance, if data skew is the designed behaviour). The colours are currently dedicated to the Busy/Backpressure metrics. I would not be keen on introducing another colour or reusing the same colours for data skew, as I am not sure whether that would help or confuse users. I am also keen to keep the scope of this FLIP as minimal as possible, with as few contentious points as possible. We could revisit this point in future FLIPs, if it does not become a blocker for this one. Please let me know your thoughts.

> 2. Can you tell me whether you prefer to unify the Data Skew Score and the
> Exception tab? In my opinion, Data Skew Score is in the same category as the
> existing Backpressured and Busy metrics.

The FLIP does not propose to unify the Data Skew tab and the Exception tab. The proposed Data Skew tab would sit next to the Exception tab (but I'm not too opinionated on where it sits). Backpressure and Busy metrics are somewhat special in that they have high visibility thanks to the vertices changing colours based on their values. I agree that Data Skew is in the same category in that it can be used as an indicator of the job's health. I'm not sure whether the suggestion here is then not to introduce a tab for data skew. I'd appreciate some clarification.

Look forward to hearing your thoughts.

Emre

On 16/01/2024, 06:05, "Xuyang" <xyzhong...@163.com> wrote:

Hi, Emre.

In large-scale production jobs, the phenomenon of data skew often occurs. Having a metric on the UI that reflects data skew without the need for manual inspection of each vertex by clicking on them would be quite cool. This could help users quickly identify problematic nodes, simplifying development and operations.

I'm mainly curious about two minor points:

1. How will the colors of vertices with high data skew scores be unified with existing backpressure and high busyness colors on the UI? Users should be able to distinguish at a glance which vertices in the entire job graph are skewed.
2. Can you tell me whether you prefer to unify the Data Skew Score and the Exception tab? In my opinion, Data Skew Score is in the same category as the existing Backpressured and Busy metrics.

Looking forward to your reply.

--
Best!
Xuyang

At 2024-01-16 00:59:57, "Kartoglu, Emre" <kar...@amazon.co.uk.invaLID> wrote:
>Hello,
>
>I’m opening this thread to discuss a FLIP[1] to make data skew more visible on
>Flink Dashboard.
>
>Data skew is currently not as visible as it should be. Users have to click
>each operator and check how much data each sub-task is processing and compare
>the sub-tasks against each other. This is especially cumbersome and
>error-prone for jobs with big job graphs and high parallelism. I’m proposing
>this FLIP to improve this.
>
>Kind regards,
>Emre
>
>[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard