Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-24 Thread huaxin gao
+1. Thanks for lifting the current restrictions on bucket join and making this more generalized.
On Sun, Oct 24, 2021 at 9:33 AM Ryan Blue wrote:
> +1 from me as well. Thanks Chao for doing so much to get it to this point!
>
> On Sat, Oct 23, 2021 at 11:29 PM DB Tsai wrote:
>
>> +1 on this

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-24 Thread Ryan Blue
+1 from me as well. Thanks Chao for doing so much to get it to this point!
On Sat, Oct 23, 2021 at 11:29 PM DB Tsai wrote:
> +1 on this SPIP.
>
> This is a more generalized version of bucketed tables and bucketed
> joins which can eliminate very expensive data shuffles during joins, and
> many

Re: [DISCUSS] SPIP: Storage Partitioned Join for Data Source V2

2021-10-24 Thread DB Tsai
+1 on this SPIP. This is a more generalized version of bucketed tables and bucketed joins, which can eliminate very expensive data shuffles during joins, and many users in the Apache Spark community have wanted this feature for a long time! Thank you, Ryan and Chao, for working on this, and I look