[jira] [Commented] (FLINK-32124) Add option to enable partition alignment for sources
[ https://issues.apache.org/jira/browse/FLINK-32124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17724125#comment-17724125 ]

Zhanghao Chen commented on FLINK-32124:
---
Hi [~gyfora]. I confirmed that the operator does align it; an internal code change in our production environment broke it. Sorry for the confusion, I'll close this ticket.

> Add option to enable partition alignment for sources
>
> Key: FLINK-32124
> URL: https://issues.apache.org/jira/browse/FLINK-32124
> Project: Flink
> Issue Type: Improvement
> Components: Autoscaler
> Reporter: Zhanghao Chen
> Priority: Major
>
> Currently, the autoscaler does not consider balancing partitions among source
> tasks. In our production environment, partition skew has proven to be a severe
> problem for many jobs. Especially in a job topology with only forward or
> rescale shuffles, partition skew on the source side can further lead to data
> imbalance in later operators.
> We should add an option to enable partition alignment for sources, disabled
> by default: the partition count usually has only a few factors, so enabling
> alignment will greatly limit our scaling choices. Also, if data is imbalanced
> among partitions in the first place, partition alignment won't help either
> (this is not a common case inside our company, though).

-- This message was sent by Atlassian Jira (v8.20.10#820010)
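To make the partition skew described in the issue concrete, here is a minimal sketch in Python. It assumes partitions are handed to source subtasks round-robin by partition index, which is a simplification and not necessarily how any particular Flink connector assigns them. The function names `partitions_per_subtask` and `partition_skew` are hypothetical and exist only for this illustration.

```python
from typing import List


def partitions_per_subtask(num_partitions: int, parallelism: int) -> List[int]:
    """Count how many partitions each source subtask reads.

    Assumption: partition p goes to subtask (p % parallelism).
    """
    counts = [0] * parallelism
    for p in range(num_partitions):
        counts[p % parallelism] += 1
    return counts


def partition_skew(num_partitions: int, parallelism: int) -> float:
    """Ratio of the busiest subtask's partition count to the average count.

    1.0 means perfectly balanced; larger values mean more skew.
    """
    counts = partitions_per_subtask(num_partitions, parallelism)
    return max(counts) * parallelism / num_partitions


if __name__ == "__main__":
    # 12 partitions on 5 subtasks: 5 does not divide 12, so two subtasks
    # read 3 partitions while the others read 2.
    print(partitions_per_subtask(12, 5), partition_skew(12, 5))
    # 12 partitions on 4 subtasks: 4 divides 12, so every subtask reads 3.
    print(partitions_per_subtask(12, 4), partition_skew(12, 4))
```

With forward or rescale connections, downstream subtasks inherit whatever imbalance the source subtasks have, which is the propagation effect the issue mentions.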
[jira] [Commented] (FLINK-32124) Add option to enable partition alignment for sources
[ https://issues.apache.org/jira/browse/FLINK-32124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723821#comment-17723821 ]

Gyula Fora commented on FLINK-32124:
[~Zhanghao Chen] can you please confirm that the current behaviour is actually always partition alignment? How could we align it even better?
[jira] [Commented] (FLINK-32124) Add option to enable partition alignment for sources
[ https://issues.apache.org/jira/browse/FLINK-32124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723820#comment-17723820 ]

Gyula Fora commented on FLINK-32124:
My bad, this is what I meant: https://issues.apache.org/jira/browse/FLINK-32119
[jira] [Commented] (FLINK-32124) Add option to enable partition alignment for sources
[ https://issues.apache.org/jira/browse/FLINK-32124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723819#comment-17723819 ]

Fang Yong commented on FLINK-32124:
---
[~gyfora] Is the link https://issues.apache.org/jira/browse/FLINK-32124 wrong? It's just the link to the current issue.
[jira] [Commented] (FLINK-32124) Add option to enable partition alignment for sources
[ https://issues.apache.org/jira/browse/FLINK-32124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723801#comment-17723801 ]

Zhanghao Chen commented on FLINK-32124:
---
Thanks [~gyfora]. I'll follow up there.
[jira] [Commented] (FLINK-32124) Add option to enable partition alignment for sources
[ https://issues.apache.org/jira/browse/FLINK-32124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723796#comment-17723796 ]

Gyula Fora commented on FLINK-32124:
This is related to https://issues.apache.org/jira/browse/FLINK-32124 But I think currently we actually consider partition balance. We set the max parallelism to the number of partitions and only select parallelisms that are divisors of this (so there is always balance)
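The divisor rule described in this comment could be sketched roughly as follows. This is an illustration under stated assumptions, not the autoscaler's actual code: the helper names `divisors` and `aligned_parallelism` are hypothetical, and rounding a desired parallelism up to the next divisor is one plausible selection policy among several.

```python
from typing import List


def divisors(n: int) -> List[int]:
    """All positive divisors of n, in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def aligned_parallelism(target: int, num_partitions: int) -> int:
    """Pick a source parallelism that divides the partition count evenly.

    Assumption: the max parallelism equals num_partitions, and the target
    is rounded up to the smallest divisor that is at least as large.
    Targets above num_partitions are capped at num_partitions.
    """
    for d in divisors(num_partitions):
        if d >= target:
            return d
    return num_partitions


if __name__ == "__main__":
    # 12 partitions allow six balanced parallelisms.
    print(divisors(12))                # [1, 2, 3, 4, 6, 12]
    print(aligned_parallelism(5, 12))  # 6
    # A prime partition count leaves only two choices: 1 or 7.
    print(divisors(7))                 # [1, 7]
    print(aligned_parallelism(2, 7))   # 7
```

The prime case shows the trade-off raised in the issue description: when the partition count has few factors, restricting parallelism to divisors can force a much larger scale-up than the load requires.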