[GitHub] [spark] HeartSaVioR edited a comment on pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-27 Thread GitBox
HeartSaVioR edited a comment on pull request #31355: URL: https://github.com/apache/spark/pull/31355#issuecomment-768801016 > AQE won't kick in if users specify the number of partitions, e.g. df.repartition(5). I think the same applies here if the sink requires a certain number of partitions. The ca

[GitHub] [spark] HeartSaVioR edited a comment on pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-27 Thread GitBox
HeartSaVioR edited a comment on pull request #31355: URL: https://github.com/apache/spark/pull/31355#issuecomment-768780252 The discussion would probably be more constructive if each idea came with an explanation of the actual case: which storage is expected to get benefit on

[GitHub] [spark] HeartSaVioR edited a comment on pull request #31355: [SPARK-34255][SQL] Support partitioning with static number on required distribution and ordering on V2 write

2021-01-27 Thread GitBox
HeartSaVioR edited a comment on pull request #31355: URL: https://github.com/apache/spark/pull/31355#issuecomment-768107101 Actually, the proposal is more about letting the data source force a static number of partitions regardless of the output data. I see valid concerns about drawb
