Hi all,

I’d like to start a vote for SPIP: Storage Partitioned Join for Data Source V2.

The proposal is to support a new type of join, the storage partitioned join,
which covers bucket join support for DataSourceV2 but is more general. The
goal is to let Spark leverage the distribution properties reported by data
sources and eliminate shuffles whenever possible.
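
To make the idea concrete, here is a minimal sketch in plain Python. It is
not Spark's API and is not taken from the design doc. It assumes both sides
are already stored bucketed by the join key with the same partitioning
function and bucket count, so bucket i of one side only needs to be joined
with bucket i of the other and no rows have to be redistributed. All names
(NUM_BUCKETS, bucket_of, write_bucketed, storage_partitioned_join) are
hypothetical.

```python
# Toy illustration only: plain Python, not Spark's API.
from collections import defaultdict

NUM_BUCKETS = 4  # assumption: both sources report the same bucket count


def bucket_of(key: int) -> int:
    """Hypothetical partitioning function shared by both sources."""
    return key % NUM_BUCKETS


def write_bucketed(rows):
    """Simulate a source that stores rows bucketed by the join key."""
    buckets = defaultdict(list)
    for key, value in rows:
        buckets[bucket_of(key)].append((key, value))
    return buckets


def storage_partitioned_join(left, right):
    """Join bucket i of left only with bucket i of right (no re-shuffle)."""
    out = []
    for b in range(NUM_BUCKETS):
        # Build a per-bucket lookup table for the right side.
        right_by_key = defaultdict(list)
        for key, value in right.get(b, []):
            right_by_key[key].append(value)
        # Probe it with the rows of the matching left bucket.
        for key, lvalue in left.get(b, []):
            for rvalue in right_by_key.get(key, []):
                out.append((key, lvalue, rvalue))
    return out


orders = write_bucketed([(1, "order-a"), (2, "order-b"), (5, "order-c")])
users = write_bucketed([(1, "alice"), (5, "bob"), (7, "carol")])
joined = storage_partitioned_join(orders, users)
```

If the two sides were bucketed differently, this per-bucket pairing would
no longer be valid and one side would first have to be redistributed, which
is the shuffle the proposal aims to avoid where it can.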

Please also refer to:

   - Previous discussion on the dev mailing list: [DISCUSS] SPIP: Storage
     Partitioned Join for Data Source V2
     <https://lists.apache.org/thread.html/r7dc67c3db280a8b2e65855cb0b1c86b524d4e6ae1ed9db9ca12cb2e6%40%3Cdev.spark.apache.org%3E>
   - JIRA: SPARK-37166 <https://issues.apache.org/jira/browse/SPARK-37166>
   - Design doc
     <https://docs.google.com/document/d/1foTkDSM91VxKgkEcBMsuAvEjNybjja-uHk-r3vtXWFE>

Please vote on the SPIP for the next 72 hours:

[ ] +1: Accept the proposal as an official SPIP
[ ] +0
[ ] -1: I don’t think this is a good idea because …
