[ https://issues.apache.org/jira/browse/SPARK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Rosen resolved SPARK-5763.
-------------------------------
    Resolution: Won't Fix

Resolving as "Won't Fix" for now, given the discussion on the PR about this functionality being provided as part of Spark SQL / DataFrames.

> Sort-based Groupby and Join to resolve skewed data
> --------------------------------------------------
>
>                 Key: SPARK-5763
>                 URL: https://issues.apache.org/jira/browse/SPARK-5763
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>            Reporter: Lianhui Wang
>
> SPARK-4644 provides a way to handle skewed data. But when more keys are
> skewed, I think the approach in SPARK-4644 is inappropriate. Instead, we
> can use sort-merge to handle skewed groupby and skewed join. Because
> SPARK-2926 implements merge-sort, we can build sort-merge for skewed data
> on top of SPARK-2926. I have implemented sort-merge groupby, and it works
> very well for skewed data in my tests. Later I will implement sort-merge
> join to handle skewed joins.
> [~rxin] [~sandyr] [~andrewor14] what are your opinions on this?
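
For readers unfamiliar with the idea, the sort-based groupby described in the issue can be sketched roughly as follows. This is a minimal illustration in plain Python, not Spark code and not the reporter's implementation: the names `sort_merge_groupby`, `sorted_runs`, and `reduce_fn` are hypothetical. It assumes each input run is already sorted by key (as spilled shuffle runs would be), merges the runs lazily, and streams the values of each key into a reducer, so a heavily skewed key never has to be held in memory as a single collection.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def sort_merge_groupby(sorted_runs, reduce_fn):
    """Group (key, value) pairs from several key-sorted runs.

    sorted_runs: iterable of iterables of (key, value), each sorted by key
                 (hypothetical stand-in for sorted shuffle spill files).
    reduce_fn:   callable that consumes an iterator of values for one key.
    """
    # Lazily merge the sorted runs into one key-ordered stream.
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    # Equal keys are now adjacent, so each group can be streamed.
    for key, group in groupby(merged, key=itemgetter(0)):
        values = (value for _, value in group)
        # reduce_fn consumes the values before the next group is read.
        yield key, reduce_fn(values)


if __name__ == "__main__":
    # "a" plays the role of a skewed key spread across both runs.
    run1 = [("a", 1), ("a", 3), ("b", 2)]
    run2 = [("a", 5), ("c", 7)]
    for key, total in sort_merge_groupby([run1, run2], sum):
        print(key, total)
```

A sort-merge join follows the same pattern: both sides are sorted by key and advanced together, so matching keys are paired without building a hash table for the skewed side.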