[RESULT] [VOTE] FLIP-418: Show data skew score on Flink Dashboard

2024-02-15 Thread Kartoglu, Emre
I am happy to announce that “FLIP-418: Show data skew score on Flink Dashboard”
has been accepted by consensus.

FLIP: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard

Votes:


  *   Aleksandr Pilipenko +1 (non-binding)
  *   Danny Cranmer +1 (binding)
  *   Hong Liang +1 (binding)
  *   Rui Fan +1 (binding)
  *   Yuepeng Pan +1 (non-binding)

There are no disapproving votes.

Thanks all!

Emre


RE: [VOTE] FLIP-418: Show data skew score on Flink Dashboard

2024-02-15 Thread Kartoglu, Emre
Thanks all, this vote is now closed. I will announce the results on a separate 
thread.

On 2024/01/29 10:09:10 "Kartoglu, Emre" wrote:
> Hello,
>
> I'd like to call votes on FLIP-418: Show data skew score on Flink Dashboard.
>
> FLIP: 
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard
> Discussion: https://lists.apache.org/thread/m5ockoork0h2zr78h77dcrn71rbt35ql
>
> Kind regards,
> Emre
>
>


Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-02-01 Thread Kartoglu, Emre
Hi Rui,

Thanks for the useful feedback and for caring about the user experience.
I will update the FLIP based on one comment. I consider this a minor update.

Please find my detailed responses below. 

"numRecordsInPerSecond sounds make sense to me, and I think
it's necessary to mention it in the FLIP wiki. It will let other developers
to easily understand. WDYT?"

I feel like this might be touching on implementation details. No objections though:
I will update the FLIP with this as one of the ways in which we can achieve
the proposal.


"After I detailed read the FLIP and Average_absolute_deviation, we know
0% is the best, 100% is worst."

Correct.


"I guess it is difficult for users who have not read the documentation to
know the meaning of 50%. We hope that the designed Data skew will
be easy for users to understand without reading or learning a series
of backgrounds."

I think I understand where you're coming from. My thought is that the user
won't have to know exactly how the skew percentage/score is calculated. This
score will act as a warning sign for them. Upon seeing a skew score of 80% for
an operator, as a user I would click on the operator and see that many of my
subtasks are not receiving any data at all.
So it acts as a metric that draws the user's attention to the skewed operator
so they can fix the issue.


"For example, as you mentioned before, flink has a metric:
numRecordsInPerSecond.
I believe users know what numRecordsInPerSecond means even if they
didn't read any documentation."

The FLIP suggests that we will provide an explanation of the data skew score
under the proposed Data Skew tab. I would like the exact wording to be left to
the code review process, to prevent it from blocking the implementation
work/progress.
This will be a user-friendly explanation with an option for the curious user to 
see the exact formula.


Kind regards,
Emre


On 01/02/2024, 03:26, "Rui Fan" <1996fan...@gmail.com> wrote:


> I was thinking about using the existing numRecordsInPerSecond metric


numRecordsInPerSecond sounds reasonable to me, and I think
it's necessary to mention it in the FLIP wiki. It will let other developers
understand it easily. WDYT?


BTW, that's why I asked whether the data skew score means the total
received records.


> this would always give you a score higher than 1, with no way to cap the
score.


Yeah, you are right. max/mean is not a score, it's the data skew multiple.
And I guess max/mean is easier to understand than
Average_absolute_deviation.


> I'm more used to working with percentages. The problem with the max/mean
metric is that I wouldn't immediately know whether a score of 300 is bad,
for instance.
> Whereas if users saw a score above 50%, as suggested in the FLIP, they
would consider taking action. I'm tempted to push back on this
suggestion. Happy to discuss further; there is a chance I'm not seeing the
downside of the proposed percentage-based metric yet. Please let me know.


After reading the FLIP and Average_absolute_deviation in detail, I understand that
0% is the best and 100% is the worst.


I guess it is difficult for users who have not read the documentation to
know what 50% means. We hope that the data skew score will
be easy for users to understand without having to read or learn a lot
of background.


For example, as you mentioned before, Flink has a metric:
numRecordsInPerSecond.
I believe users know what numRecordsInPerSecond means even if they
haven't read any documentation.


Of course, I'm open to it. I may have missed something. I'd like to hear
more feedback from the community.


Best,
Rui


On Thu, Feb 1, 2024 at 4:13 AM Kartoglu, Emre <kar...@amazon.co.uk.INVALID>
wrote:


> Hi Rui,
>
> " and provide the total and current score in the detailed tab. I didn't
> see the detailed design in the FLIP, would you mind
> improve the design doc? Thanks".
>
> It will essentially be a basic list view similar to the "Checkpoints" tab.
> I only briefly mentioned this in the FLIP because it will be a basic list
> view.
> No problem though, I will update the FLIP.
>
>
> Please find my responses below the quotations.
>
> " 1. About the current skew score, I still don't understand how to get
> the list_of_number_of_records_received_by_each_subtask for
> each subtask.
>
> the list_of_number_of_records_received_by_each_subtask of subtask1
> is
>
> total received records of subtask 1 from beginning to now -
> total received records of subtask 1 from beginning to (now - 1min), right?"
>
> Yes, essential

Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-31 Thread Kartoglu, Emre
Hi Rui,

" and provide the total and current score in the detailed tab. I didn't see the 
detailed design in the FLIP, would you mind
improve the design doc? Thanks".

It will essentially be a basic list view similar to the "Checkpoints" tab. I 
only briefly mentioned this in the FLIP because it will be a basic list view.
No problem though, I will update the FLIP.


Please find my responses below the quotations.

" 1. About the current skew score, I still don't understand how to get
the list_of_number_of_records_received_by_each_subtask for
each subtask.

the list_of_number_of_records_received_by_each_subtask of subtask1
is 

total received records of subtask 1 from beginning to now -
total received records of subtask 1 from beginning to (now - 1min), right?"

Yes, essentially correct. I was thinking about using the existing
numRecordsInPerSecond metric (see
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/), which
would give us per-second granularity and would be more "current/live" than
per-minute.
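
For illustration only (this class and its inputs are made up, and how the
metrics are fetched is out of scope here): the "current" per-subtask counts are
simply the delta between two cumulative numRecordsIn samples taken one window apart.

```java
import java.util.Arrays;

/**
 * Sketch only: derive the "current" per-subtask record counts from two cumulative
 * numRecordsIn samples taken one window (e.g. one minute) apart.
 */
final class CurrentWindowCounts {

    /** Records received by each subtask during the last window. */
    static long[] lastWindow(long[] totalsAtWindowStart, long[] totalsNow) {
        long[] window = new long[totalsNow.length];
        for (int i = 0; i < totalsNow.length; i++) {
            window[i] = totalsNow[i] - totalsAtWindowStart[i];
        }
        return window;
    }

    public static void main(String[] args) {
        long[] oneMinuteAgo = {1_000, 1_000, 1_000}; // cumulative numRecordsIn at (now - 1min)
        long[] now = {1_010, 1_010, 1_100};          // cumulative numRecordsIn at now
        // The third subtask (index 2) received far more records than the others in the last minute.
        System.out.println(Arrays.toString(lastWindow(oneMinuteAgo, now))); // [10, 10, 100]
    }
}
```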


"IIUC, you proposed score is between 0% to 100%, and 0% is the best.
And the 100% is the worst."

Correct.


" For data skew, I'm not sure whether a multiple value is more intuitive.
It means data skew score = max / mean.
 The data skew score is between 1 and infinity. 1 is the best, and
the bigger the worse."

I'm not sure I follow you here. Yes, this would always give you a score higher
than 1, with no way to cap the score.
I'm more used to working with percentages. The problem with the max/mean metric
is that I wouldn't immediately know whether a score of 300 is bad, for instance.
Whereas if users saw a score above 50%, as suggested in the FLIP, they
would consider taking action. I'm tempted to push back on this suggestion.
Happy to discuss further; there is a chance I'm not seeing the downside of the
proposed percentage-based metric yet. Please let me know.

Kind regards,
Emre

On 31/01/2024, 10:57, "Rui Fan" <1996fan...@gmail.com> wrote:



Sorry for the late reply.




> So you would have a high data skew while 1 subtask is receiving all the
data, but on average (say over 1-2 days) data skew would come down to 0
because all subtasks would have received their portion of the data.
> I'm inclined to think that the current proposal might still be fair, as
you do indeed have a skew by definition (but an intentional one). We can
have a few ways forward:
>
> 0) We can keep the behaviour as proposed. My thoughts are that data skew
is data skew, however intentional it may be. It is not necessarily bad,
like in your example.


It makes sense to me. Flink should show data skew correctly
regardless of whether the skew is intentional or not.




> 1) Show data skew based on the beginning of time (not a live/current score).
I mentioned some downsides to this in the FLIP: If you broke or fixed your
data skew recently, the historical data might hide the recent fix/breakage,
and it is inconsistent with the other metrics shown on the vertices, e.g.
Backpressure/Busy metrics show the live/current score.
>
> 2) We can choose not to put the data skew score on the vertices on the job
graph, and instead just use the newly proposed Data Skew tab, which could show
the live/current skew score and the total data skew score from the beginning of
the job.


It makes sense; we can show the current skew score in the DAG WebUI by default,
and provide both the total and current score in the detailed tab.


I didn't see the detailed design in the FLIP, would you mind
improving the design doc? Thanks


Also, I have 2 questions for now:


1. About the current skew score, I still don't understand how to get
the list_of_number_of_records_received_by_each_subtask for
each subtask.


the list_of_number_of_records_received_by_each_subtask of subtask1
is total received records of subtask 1 from beginning to now -
total received records of subtask 1 from beginning to (now - 1min), right?


Note: 1min is an example. 30s or 2min is fine for me.


2. The skew score is a percentage


I'm not sure whether showing the score in percent format is reasonable.
For the busy ratio or backpressure ratio, showing them in percent format
is intuitive.


IIUC, your proposed score is between 0% and 100%, where 0% is the best
and 100% is the worst.


For data skew, I'm not sure whether a multiple would be more intuitive.
It means data skew score = max / mean.


For example, we have 5 subtasks, the received record numbers are
[10, 10, 10, 100, 10].
data skew score = max / mean = 100 / (140/5) = 100 / 28 = 3.57.


The data skew score is between 1 and infinity. 1 is the best, and
the bigger it is, the worse.
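
To make the two candidates concrete, here is a small standalone sketch that
computes both for the example above. The max/mean multiple matches the
arithmetic above; the percentage variant uses an assumed normalization of the
average absolute deviation (0% = perfectly even, 100% = all records on one
subtask) and is not necessarily the exact formula in the FLIP wiki.

```java
import java.util.Arrays;

/** Sketch only: two candidate data skew scores for one set of per-subtask record counts. */
final class SkewScoreSketch {

    /** Data skew as a multiple: max / mean. 1 is best; the bigger, the worse. */
    static double maxOverMean(long[] counts) {
        double mean = Arrays.stream(counts).average().orElse(0);
        long max = Arrays.stream(counts).max().orElse(0);
        return mean == 0 ? 1 : max / mean;
    }

    /**
     * ASSUMED normalization, not necessarily the FLIP's exact formula: the average
     * absolute deviation from the mean, divided by its theoretical maximum (all
     * records landing on one subtask), so the result always falls in [0%, 100%].
     */
    static double skewPercent(long[] counts) {
        int n = counts.length;
        double mean = Arrays.stream(counts).average().orElse(0);
        if (n < 2 || mean == 0) {
            return 0;
        }
        double avgAbsDev =
                Arrays.stream(counts).mapToDouble(c -> Math.abs(c - mean)).average().orElse(0);
        double maxPossibleAvgAbsDev = 2.0 * (n - 1) / n * mean;
        return 100.0 * avgAbsDev / maxPossibleAvgAbsDev;
    }

    public static void main(String[] args) {
        long[] counts = {10, 10, 10, 100, 10};
        System.out.printf("max/mean multiple: %.2f%n", maxOverMean(counts));   // ~3.57
        System.out.printf("skew percentage:   %.1f%%%n", skewPercent(counts)); // ~64.3% under the assumed normalization
    }
}
```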

[VOTE] FLIP-418: Show data skew score on Flink Dashboard

2024-01-29 Thread Kartoglu, Emre
Hello,

I'd like to call votes on FLIP-418: Show data skew score on Flink Dashboard.

FLIP: 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard
Discussion: https://lists.apache.org/thread/m5ockoork0h2zr78h77dcrn71rbt35ql

Kind regards,
Emre



Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-23 Thread Kartoglu, Emre
Hi Krzysztof,

Thank you for the feedback! Please find my comments below.

1. Configurability

Adding a feature flag/configuration to enable this is still on the table as
far as I am concerned. However, I believe adding a new metric shouldn't warrant
a flag/configuration. One might argue that we should have one for showing the
metrics on the Flink UI, and I'd appreciate input on this. My default position
is not to have a configuration/flag unless there is a good reason (e.g. it
turns out there is an impact on the Flink UI for a so-far-unknown reason). This
is because the proposed change should only improve the experience without any
unwanted side effects.

2. Metrics

I agree the new metrics should be compatible with the rest of the Flink metric 
reporting mechanism. I will update the FLIP and propose names for the metrics.
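
Purely as a sketch of the reporting path (the metric name below is a
placeholder, and the real FLIP-418 score would presumably be computed by the
runtime rather than in user code): anything registered as a regular Gauge is
picked up by whatever metric reporters the cluster has configured.

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Gauge;

/** Sketch only: a gauge registered via the MetricGroup API flows through the configured reporters. */
public class SkewGaugeExample extends RichMapFunction<String, String> {

    private volatile double dataSkewScore; // placeholder value, updated elsewhere

    @Override
    public void open(Configuration parameters) {
        getRuntimeContext()
                .getMetricGroup()
                .gauge("dataSkewScore", new Gauge<Double>() { // "dataSkewScore" is a made-up name
                    @Override
                    public Double getValue() {
                        return dataSkewScore;
                    }
                });
    }

    @Override
    public String map(String value) {
        return value;
    }
}
```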

Kind regards,
Emre

On 23/01/2024, 10:31, "Krzysztof Dziołak" <kdzio...@live.com> wrote:



Hi Emre,


Thank you for driving this proposal. I've got two questions about the 
extensions to the proposal that are not captured in the FLIP.




1. Configurability - what kind of configuration would you propose to maintain
for this feature? Would an on/off switch and/or the aggregation period length
be configurable? Should we capture the toggles in the FLIP?
2. Metrics - are we planning to emit the skew metric via the metric reporter
mechanism? Should we capture the proposed metric schema in the FLIP?


Kind regards,
Krzysztof


____
From: Kartoglu, Emre <kar...@amazon.co.uk.INVALID>
Sent: Monday, January 15, 2024 4:59 PM
To: dev@flink.apache.org
Subject: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard


Hello,


I’m opening this thread to discuss a FLIP[1] to make data skew more visible on 
Flink Dashboard.


Data skew is currently not as visible as it should be. Users have to click each 
operator and check how much data each sub-task is processing and compare the 
sub-tasks against each other. This is especially cumbersome and error-prone for 
jobs with big job graphs and high parallelism. I’m proposing this FLIP to 
improve this.


Kind regards,
Emre


[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard











Re: Re:[DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-16 Thread Kartoglu, Emre
Hi Xuyang,

Thanks for the feedback! Please find my response below.

> 1. How will the colors of vertices with high data skew scores be unified with
> existing backpressure and high busyness
colors on the UI? Users should be able to distinguish at a glance which vertices
in the entire job graph are skewed.

The current proposal does not suggest changing the colours of the vertices
based on data skew. In another exchange with Rui, we touch on why data skew
might not necessarily be bad (for instance if data skew is the designed
behaviour). The colours are currently dedicated to the Busy/Backpressure
metrics. I would not be keen on introducing another colour or using the same
colours for data skew, as I am not sure whether that would help or confuse
users. I am also keen to keep the scope of this FLIP as minimal as possible,
with as few contentious points as possible. We could also revisit this point in
future FLIPs, if it does not become a blocker for this one. Please let me know
your thoughts.

2. Could you tell me whether you prefer to unify the Data Skew Score and Exception tab?
In my opinion, Data Skew Score is in
the same category as the existing Backpressured and Busy metrics.

The FLIP does not propose to unify the Data Skew tab and the Exception tab. The 
proposed Data Skew tab would sit next to the Exception tab (but I'm not too 
opinionated on where it sits). Backpressure and Busy metrics are somewhat 
special in that they have high visibility thanks to the vertices changing 
colours based on their value. I agree that Data Skew is in the same category in 
that it can be used as an indicator of the job's health. I'm not sure whether the
suggestion here, then, is not to introduce a tab for data skew? I'd appreciate
some clarification here.

Look forward to hearing your thoughts.

Emre


On 16/01/2024, 06:05, "Xuyang" <xyzhong...@163.com> wrote:



Hi, Emre.




In large-scale production jobs, the phenomenon of data skew often occurs.
Having a metric on the UI that
reflects data skew, without the need for manual inspection of each vertex by
clicking on them, would be quite cool.
This could help users quickly identify problematic nodes, simplifying
development and operations.




I'm mainly curious about two minor points:
1. How will the colors of vertices with high data skew scores be unified with
existing backpressure and high busyness
colors on the UI? Users should be able to distinguish at a glance which vertices
in the entire job graph are skewed.
2. Could you tell me whether you prefer to unify the Data Skew Score and Exception tab?
In my opinion, Data Skew Score is in
the same category as the existing Backpressured and Busy metrics.




Looking forward to your reply.






--


Best!
Xuyang










At 2024-01-16 00:59:57, "Kartoglu, Emre" <kar...@amazon.co.uk.INVALID> wrote:
>Hello,
>
>I’m opening this thread to discuss a FLIP[1] to make data skew more visible on 
>Flink Dashboard.
>
>Data skew is currently not as visible as it should be. Users have to click 
>each operator and check how much data each sub-task is processing and compare 
>the sub-tasks against each other. This is especially cumbersome and 
>error-prone for jobs with big job graphs and high parallelism. I’m proposing 
>this FLIP to improve this.
>
>Kind regards,
>Emre
>
>[1] 
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard
>
>
>





Re: [DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-16 Thread Kartoglu, Emre
Hi Rui,

Thanks for the feedback. Please find my response below:

> The number_of_records_received_by_each_subtask is the total received records, 
> right?

No, it's not the total. I understand why this is confusing. I had initially
wanted to name it "the list of number of records received by each subtask", so
its type is a list. Example: [10, 10, 10] => 3 sub-tasks, and each one received
10 records.

In your example, you have subtasks with each one designed to receive records at 
different times of the day. I hadn't thought about this use case! 
So you would have a high data skew while 1 subtask is receiving all the data, 
but on average (say over 1-2 days) data skew would come down to 0 because all 
subtasks would have received their portion of the data.
I'm inclined to think that the current proposal might still be fair, as you do 
indeed have a skew by definition (but an intentional one). We can have a few 
ways forward:

0) We can keep the behaviour as proposed. My thoughts are that data skew is 
data skew, however intentional it may be. It is not necessarily bad, like in 
your example.

1) Show data skew based on the beginning of time (not a live/current score). I
mentioned some downsides to this in the FLIP: If you broke or fixed your data
skew recently, the historical data might hide the recent fix/breakage, and it
is inconsistent with the other metrics shown on the vertices, e.g.
Backpressure/Busy metrics show the live/current score.

2) We can choose not to put the data skew score on the vertices on the job
graph, and instead just use the newly proposed Data Skew tab, which could show
the live/current skew score and the total data skew score from the beginning of
the job.
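
For illustration (the names below are placeholders, and the score function is
just the max/mean multiple from the discussion with Rui, not the FLIP's
formula): option 2 boils down to keeping two views of the same counters, one
cumulative since job start and one over the last window.

```java
/**
 * Sketch only: the "total" score uses cumulative counts since job start,
 * while the "current" score uses only the last window's delta.
 */
final class TotalVsCurrentSkew {

    // Placeholder score: the max/mean multiple. The FLIP's percentage formula would plug in here.
    static double score(long[] counts) {
        double sum = 0;
        long max = 0;
        for (long c : counts) {
            sum += c;
            max = Math.max(max, c);
        }
        double mean = sum / counts.length;
        return mean == 0 ? 1 : max / mean;
    }

    public static void main(String[] args) {
        // Hypothetical hourly keyBy job after two hours: totals are balanced...
        long[] totalSinceStart = {1_000, 1_000};
        // ...but in the last hour only one subtask received data.
        long[] lastHour = {0, 1_000};

        System.out.println("total skew score:   " + score(totalSinceStart)); // 1.0 (looks fine)
        System.out.println("current skew score: " + score(lastHour));        // 2.0 (skewed right now)
    }
}
```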

Keen to hear your thoughts.

Kind regards,
Emre


On 16/01/2024, 06:44, "Rui Fan" <1996fan...@gmail.com> wrote:



Thanks Emre for driving this proposal!


It's very useful for troubleshooting.


I have a question:


The number_of_records_received_by_each_subtask is the
total received records, right?


I'm not sure whether we should check data skew based on
the latest time period.


In production, I found that the total received records of
all subtasks are balanced, but in each time period they
are skewed.


For example, a Flink job has a `group by` or `keyBy` based on an
hour field. It means:
- In the 0-1 o'clock hour, subtaskA is busy and the rest of the subtasks are idle.
- In the 1-2 o'clock hour, subtaskB is busy and the rest of the subtasks are idle.
- In the next hour, the busy subtask changes.


Looking forward to your opinions~


Best,
Rui


On Tue, Jan 16, 2024 at 2:05 PM Xuyang <xyzhong...@163.com> wrote:


> Hi, Emre.
>
>
> In large-scale production jobs, the phenomenon of data skew often occurs.
> Having a metric on the UI that
> reflects data skew, without the need for manual inspection of each vertex
> by clicking on them, would be quite cool.
> This could help users quickly identify problematic nodes, simplifying
> development and operations.
>
>
> I'm mainly curious about two minor points:
> 1. How will the colors of vertices with high data skew scores be unified
> with existing backpressure and high busyness
> colors on the UI? Users should be able to distinguish at a glance which
> vertices in the entire job graph are skewed.
> 2. Could you tell me whether you prefer to unify the Data Skew Score and Exception
> tab? In my opinion, Data Skew Score is in
> the same category as the existing Backpressured and Busy metrics.
>
>
> Looking forward to your reply.
>
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> At 2024-01-16 00:59:57, "Kartoglu, Emre" <kar...@amazon.co.uk.INVALID>
> wrote:
> >Hello,
> >
> >I’m opening this thread to discuss a FLIP[1] to make data skew more
> visible on Flink Dashboard.
> >
> >Data skew is currently not as visible as it should be. Users have to
> click each operator and check how much data each sub-task is processing and
> compare the sub-tasks against each other. This is especially cumbersome and
> error-prone for jobs with big job graphs and high parallelism. I’m
> proposing this FLIP to improve this.
> >
> >Kind regards,
> >Emre
> >
> >[1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard
> >
> >
> >
>





[DISCUSS] FLIP-418: Show data skew score on Flink Dashboard

2024-01-15 Thread Kartoglu, Emre
Hello,

I’m opening this thread to discuss a FLIP[1] to make data skew more visible on 
Flink Dashboard.

Data skew is currently not as visible as it should be. Users have to click each 
operator and check how much data each sub-task is processing and compare the 
sub-tasks against each other. This is especially cumbersome and error-prone for 
jobs with big job graphs and high parallelism. I’m proposing this FLIP to 
improve this.

Kind regards,
Emre

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-418%3A+Show+data+skew+score+on+Flink+Dashboard





Show data skew score on Flink Dashboard?

2024-01-05 Thread Kartoglu, Emre
Hello,

Is there a reason why a type of data skew score (probably just a percentage) is 
not shown on the Flink Dashboard / UI?

Currently users have to click on each operator and check how much data each
subtask is processing to tell if there is skew. This is not efficient; it is
especially cumbersome and error-prone for big job graphs.

It would be useful to have this shown on the operator. Possibly also a warning 
message at the top or somewhere “more meta” if a significant amount of skew is 
detected (so that users don’t have to zoom in on each and every operator to 
check the skew score).

I’d be happy to create a ticket for it, if there are no objections?

Kind regards,
Emre


Re: Maven plugin to detect issues early on

2023-05-22 Thread Kartoglu, Emre
Hi Jing,

The proposed plugin would be used by Flink application developers when they
are writing their Flink jobs. It would trigger during compilation/packaging and
would look for known incompatibilities, bad practices, or bugs.
For instance, one cause of frustration for our customers is connector
incompatibilities (specifically Kafka and Kinesis) with certain Flink versions.
This plugin would be a quick way to maintain a list of known incompatibilities,
bugs, and bad practices, so customers get errors during compilation/packaging
rather than after they've deployed their Flink job.
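
To make the mechanism concrete, a minimal sketch of what such a check could
look like as a Maven Mojo (the goal name, the rule, and the version string
below are placeholders, not the actual rule set):

```java
import org.apache.maven.model.Dependency;
import org.apache.maven.plugin.AbstractMojo;
import org.apache.maven.plugin.MojoFailureException;
import org.apache.maven.plugins.annotations.LifecyclePhase;
import org.apache.maven.plugins.annotations.Mojo;
import org.apache.maven.plugins.annotations.Parameter;
import org.apache.maven.project.MavenProject;

/** Sketch only: fail the build when a dependency matches a known-bad entry. */
@Mojo(name = "check", defaultPhase = LifecyclePhase.VERIFY)
public class FlinkCompatibilityCheckMojo extends AbstractMojo {

    @Parameter(defaultValue = "${project}", readonly = true, required = true)
    private MavenProject project;

    @Override
    public void execute() throws MojoFailureException {
        for (Dependency dep : project.getDependencies()) {
            // Placeholder rule: the real plugin would consult a maintained list of
            // known connector/Flink incompatibilities instead of this hard-coded check.
            if ("org.apache.flink".equals(dep.getGroupId())
                    && dep.getArtifactId().startsWith("flink-connector-")
                    && "0.0.0-known-bad".equals(dep.getVersion())) {
                throw new MojoFailureException(
                        "Known incompatibility: " + dep.getArtifactId() + ":" + dep.getVersion());
            }
        }
        getLog().info("No known Flink connector incompatibilities found.");
    }
}
```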

From what you're saying, the FLIP route might not be the best way to go. We 
might publish this plugin in our own GitHub namespace/group first, and then get 
community acknowledgement/support for it. I believe working with the Flink 
community on this is key as we'd need their support/opinion to do this the 
right way and reach more Flink users.

Thanks
Emre

On 21/05/2023, 16:48, "Jing Ge" <j...@ververica.com.INVALID> wrote:



Hi Emre,


Thanks for your proposal. It looks very interesting! Please note
that most connectors have been externalized. Will your proposed plugin be
used for building Flink connectors or Flink itself? Furthermore, it would
be great if you could elaborate on the features w.r.t. best practices so that
we could understand how the plugin will help us.


Afaik, FLIP is recommended for improvement ideas that will change public
APIs. I am not sure if a new Maven plugin belongs in that category.


Best regards,
Jing


On Tue, May 16, 2023 at 11:29 AM Kartoglu, Emre <kar...@amazon.co.uk.INVALID>
wrote:


> Hello all,
>
> Myself and 2 colleagues developed a Maven plugin (no support for Gradle or
> other build tools yet) that we use internally to detect potential issues in
> Flink apps at compilation/packaging stage:
>
>
> * Known connector version incompatibilities – so far covering Kafka
> and Kinesis
> * Best practices e.g. setting operator IDs
>
> We’d like to make this open-source. Ideally with the Flink community’s
> support/mention of it on the Flink website, so more people use it.
>
> Going forward, I believe we have at least the following options:
>
> * Get community support: Create a FLIP to discuss where the plugin
> should live, what kind of problems it should detect etc.
> * We still open-source it but without the community support (if the
> community has objections to officially supporting it for instance).
>
> Just wanted to gauge the feeling/thoughts towards this tool from the
> community before going ahead.
>
> Thanks,
> Emre
>
>





RE: Call for help on the Web UI (In-Place Rescaling)

2023-05-19 Thread Kartoglu, Emre
Hi David,

This looks awesome. I am no expert on UI/UX, but still have opinions 😊

I normally use the Overview tab for monitoring Flink jobs, and having control 
inputs there breaks my assumption that Overview is “read-only” and for 
“watching”.
Having said that, for “educational purposes” that might actually be a good place
- I am imagining there would be an “educationalMode: true” flag or something
somewhere to enable these buttons (and other educational bits in the future).

The “educational purpose” bit makes me a lot more relaxed about having those 
buttons as they are in the video!

Couple other things to consider:


  *   Confirming the new parallelism before actually applying it, e.g. having a
“Deploy/Commit/Save” button
  *   Allowing users to enter the parallelism without having to increment/decrement
it one by one

Thanks,
Emre

On 2023/05/19 06:49:08 David Morávek wrote:
> Hi Everyone,
>
> In FLINK-31471, we've introduced new "in-place rescaling features" to the
> Web UI that show up when the scheduler supports FLIP-291 REST endpoints.
>
> I expect this to be a significant feature for user education (they have an
> easy way to try out how rescaling behaves, especially in combination with a
> backpressure monitor) and marketing (read as "we can do fancy demos").
>
> However, the current sketch is not optimal due to my lack of UI/UX skills.
>
> Are there any volunteers that could and would like to help polish this?
>
> Here is a short demo [2] of what the current implementation can do.
>
> [1] https://issues.apache.org/jira/browse/FLINK-31471
> [2] https://www.youtube.com/watch?v=B1NVDTazsZY
>
> Best,
> D.
>


Maven plugin to detect issues early on

2023-05-16 Thread Kartoglu, Emre
Hello all,

Myself and 2 colleagues developed a Maven plugin (no support for Gradle or 
other build tools yet) that we use internally to detect potential issues in 
Flink apps at compilation/packaging stage:


  *   Known connector version incompatibilities – so far covering Kafka and 
Kinesis
  *   Best practices e.g. setting operator IDs

We’d like to make this open-source. Ideally with the Flink community’s 
support/mention of it on the Flink website, so more people use it.

Going forward, I believe we have at least the following options:

  *   Get community support: Create a FLIP to discuss where the plugin should 
live, what kind of problems it should detect etc.
  *   We still open-source it but without the community support (if the 
community has objections to officially supporting it for instance).

Just wanted to gauge the feeling/thoughts towards this tool from the community 
before going ahead.

Thanks,
Emre