Re: Spark CBO

2015-07-31 Thread Olivier Girardot
Hi, if I'm not mistaken, there is already one cost-based analyzer implemented in Spark SQL, for join operations: if one side of the join is a small dataset, Spark SQL's strategy is to automatically broadcast that small dataset instead of shuffling. I guess you have something
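
To illustrate the size-based choice described above, here is a minimal sketch in plain Python. It is a toy model, not Spark's planner code: `Relation`, `choose_join_strategy`, and the table names are hypothetical, and the only detail taken from Spark is the `spark.sql.autoBroadcastJoinThreshold` setting, whose default is assumed here to be 10 MB and for which a value of -1 disables broadcasting.

```python
from dataclasses import dataclass

# Assumed to mirror the default of spark.sql.autoBroadcastJoinThreshold (10 MB).
AUTO_BROADCAST_THRESHOLD_BYTES = 10 * 1024 * 1024


@dataclass
class Relation:
    """Hypothetical stand-in for one side of a join and its size estimate."""
    name: str
    estimated_size_bytes: int


def choose_join_strategy(left, right, threshold=AUTO_BROADCAST_THRESHOLD_BYTES):
    """Pick 'broadcast(<name>)' when the smaller side fits under the threshold.

    A negative threshold turns broadcasting off, so the join falls back
    to a shuffle.
    """
    smaller = min(left, right, key=lambda r: r.estimated_size_bytes)
    if threshold >= 0 and smaller.estimated_size_bytes <= threshold:
        return f"broadcast({smaller.name})"
    return "shuffle"


if __name__ == "__main__":
    orders = Relation("orders", 5 * 1024**3)      # about 5 GB
    countries = Relation("countries", 20 * 1024)  # about 20 KB
    print(choose_join_strategy(orders, countries))
```

The decision depends only on an estimated size, which is why it counts as a (very limited) cost-based choice: no other plan properties are compared.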

Spark CBO

2015-07-31 Thread burakkk
Hi everyone, I'm wondering whether there is any plan to implement a cost-based optimizer for Spark SQL? Best regards... -- *BURAK ISIKLI* | *http://burakisikli.wordpress.com*