Ohad Raviv created SPARK-41277:
----------------------------------

             Summary: Save and leverage shuffle key in tblproperties
                 Key: SPARK-41277
                 URL: https://issues.apache.org/jira/browse/SPARK-41277
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.3.1
            Reporter: Ohad Raviv

I may be missing something trivial here. In a typical process, many datasets get materialized, and many of them are written right after a shuffle (e.g. a join). They are then involved in further actions, often on the same key. Wouldn't it make sense to save the shuffle key along with the table, to avoid unnecessary shuffles? The implementation also seems quite straightforward: just leverage the bucketing mechanism.
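
To make the proposal concrete, here is a minimal toy model of the idea in plain Python. It is only a sketch, not Spark code: the `Table` class, the two property names, and the helpers `save_with_shuffle_key` and `needs_shuffle` are all hypothetical and exist only for illustration. It assumes a shuffle can be skipped when the saved key columns and partition count both match what the next operation asks for.

```python
from dataclasses import dataclass, field

# Hypothetical property names, chosen for this sketch only.
# Spark does not define these keys.
SHUFFLE_KEY_PROP = "shuffle.key.columns"
SHUFFLE_PARTS_PROP = "shuffle.num.partitions"


@dataclass
class Table:
    """Toy stand-in for a catalog table: a name plus its tblproperties."""
    name: str
    tblproperties: dict = field(default_factory=dict)


def save_with_shuffle_key(name, key_columns, num_partitions):
    """Materialize a table and record how its rows were last shuffled."""
    return Table(name, {
        SHUFFLE_KEY_PROP: ",".join(key_columns),
        SHUFFLE_PARTS_PROP: str(num_partitions),
    })


def needs_shuffle(table, join_keys, num_partitions):
    """Return True unless the saved layout already matches the requested one."""
    saved_keys = table.tblproperties.get(SHUFFLE_KEY_PROP)
    saved_parts = table.tblproperties.get(SHUFFLE_PARTS_PROP)
    if saved_keys is None or saved_parts is None:
        return True  # no metadata recorded, so nothing can be assumed
    same_keys = saved_keys.split(",") == list(join_keys)
    same_parts = int(saved_parts) == num_partitions
    return not (same_keys and same_parts)


orders = save_with_shuffle_key("orders", ["user_id"], 200)
print(needs_shuffle(orders, ["user_id"], 200))   # same key and partition count
print(needs_shuffle(orders, ["order_id"], 200))  # different key
```

For comparison, the existing bucketing mechanism already requires the writer to opt in explicitly, with `DataFrameWriter.bucketBy(...)` followed by `saveAsTable(...)`. The sketch differs in that the key would be recorded automatically from the last shuffle.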