Re: [DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-27 Thread Mason Chen
Hi Leonard, Thanks for the review! See my responses below: 1. Let's take for example the case that a split is a file (i.e. what I call bounded). To calculate the pending records, the connector needs to know how many splits are left and the number of records in each split. If a reader only knows i
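To illustrate point 1 above, here is a minimal sketch of the arithmetic for the bounded (file split) case: the pending record count is the sum of the record counts of the splits that have not been read yet. The class and method names are hypothetical and not part of Flink or the FLIP; the sketch assumes the component doing the calculation already knows the record count of every remaining split.

```java
import java.util.List;

// Hypothetical helper for illustration only; not a Flink API.
public class PendingRecordsSketch {

    /**
     * Sums the record counts of all splits that are still unread.
     * Assumes each entry is the known record count of one remaining split.
     */
    public static long pendingRecords(List<Long> recordsPerRemainingSplit) {
        long total = 0L;
        for (long count : recordsPerRemainingSplit) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Two unread file splits with 1000 and 2500 records respectively.
        System.out.println(pendingRecords(List.of(1000L, 2500L))); // prints 3500
    }
}
```

The calculation only works where both inputs are visible at once, namely the list of remaining splits and the size of each split.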

Re: [DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-18 Thread Leonard Xu
Thanks Mason for starting this discussion thread; generally +1 for the motivation and proposal. I have some questions about the details after reading the FLIP. 1. The FLIP says "However, pendingRecords is currently only reported by the SourceReader and doesn’t cover the case for sources that onl

Re: [DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-17 Thread Rui Fan
Thanks Mason for your feedback and update! The sources you listed look good to me, +1 for this proposal! Best, Rui On Sat, Nov 18, 2023 at 3:38 AM Mason Chen wrote: > Also, it looks like externalizing the Hive connector is unblocked based on the past email thread. https://issues.apache.org/j

Re: [DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-17 Thread Mason Chen
Also, it looks like externalizing the Hive connector is unblocked based on the past email thread. https://issues.apache.org/jira/browse/FLINK-30064 seems to have some progress and perhaps we shouldn't touch it for now. On Fri, Nov 17, 2023 at 11:00 AM Mason Chen wrote: > Hi Rui and Max, > > Than

Re: [DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-17 Thread Mason Chen
Hi Rui and Max, Thanks for the feedback! > If yes, I suggest this FLIP includes registering metric part, otherwise these metrics still cannot work. Yup, you understood it correctly. I'll add that to the required list of work. Note that I'll include only FLIP-27 sources in the Flink repo: FileSou

Re: [DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-17 Thread Maximilian Michels
Hi Mason, Thank you for the proposal. This is a highly requested feature to make the source scaling of Flink Autoscaling generic across all sources. The current implementation handles every source individually, and if we don't find any backlog metrics, we default to using busy time only. At this p

Re: [DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-16 Thread Rui Fan
Hi Mason, Thank you for driving this proposal! Currently, the Autoscaler only supports the maximum source parallelism of KafkaSource. Introducing a generic metric to support it sounds good to me, +1 for this proposal. I have a question: You added the metric in the flink repo, and Autoscaler will fetch

[DISCUSS] FLIP-394: Add Metrics for Connector Agnostic Autoscaling

2023-11-16 Thread Mason Chen
Hi all, I would like to start a discussion on FLIP-394: Add Metrics for Connector Agnostic Autoscaling [1]. This FLIP recommends adding two metrics to make autoscaling work for bounded split source implementations like IcebergSource. These metrics are required by the Flink Kubernetes Operator aut