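You are right. By default the CBO is not enabled: spark.sql.cbo.enabled
defaults to false, whereas AQE (spark.sql.adaptive.enabled) is enabled by
default in recent releases. The CBO was never the default optimizer, and AQE
does not replace it. The two are complementary: the CBO uses table and
column statistics at planning time, while AQE re-optimizes the plan at
runtime using statistics gathered during execution.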
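
For anyone who wants to check this on their own session, here is a minimal
sketch in Python. It assumes a SparkSession-like object whose conf exposes
get(key, default) and set(key, value), as PySpark's spark.conf does. The
helper names optimizer_flags and enable_cbo are my own, and the fallback
values are the defaults discussed in this thread, not something read from
your cluster.

```python
# Fallbacks used only when a key has not been set explicitly.
# These are assumptions based on the SQLConf defaults discussed in this thread.
ASSUMED_DEFAULTS = {
    "spark.sql.adaptive.enabled": "true",   # AQE
    "spark.sql.cbo.enabled": "false",       # CBO
}


def optimizer_flags(spark):
    """Return the AQE and CBO flags as booleans for a SparkSession-like object."""
    return {
        key: str(spark.conf.get(key, default)).lower() == "true"
        for key, default in ASSUMED_DEFAULTS.items()
    }


def enable_cbo(spark):
    """Turn the cost-based optimizer on for the current session."""
    spark.conf.set("spark.sql.cbo.enabled", "true")


# Usage with a real session (requires pyspark, so left commented out):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.getOrCreate()
# print(optimizer_flags(spark))
# enable_cbo(spark)
```

Turning the flag on is only half of it. The CBO relies on statistics, so the
tables involved would also need ANALYZE TABLE <table> COMPUTE STATISTICS run
against them.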

spark.sql.cbo.strategy

As far as I can tell, spark.sql.cbo.strategy is not a Spark configuration
property. I cannot find it in SQLConf.scala or in the Spark configuration
documentation, so there are no CBO, CBO-Like or AUTO values to choose
between. What does hold for the CBO is the following:

   - CBO (Cost-Based Optimization) analyzes the query plan and estimates the
     execution cost associated with each operation. It uses table and
     column statistics to guide its decisions, selecting the plan with the
     lowest estimated cost.

   - It is switched on with spark.sql.cbo.enabled, which defaults to false.
     Related switches such as spark.sql.cbo.joinReorder.enabled also
     default to false.

   - Its estimates depend on the availability of statistics, which are
     collected with ANALYZE TABLE <table> COMPUTE STATISTICS.


Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom




On Mon, 11 Dec 2023 at 17:11, Nicholas Chammas <nicholas.cham...@gmail.com>
wrote:

>
> On Dec 11, 2023, at 6:40 AM, Mich Talebzadeh <mich.talebza...@gmail.com>
> wrote:
>
> By default, the CBO is enabled in Spark.
>
>
> Note that this is not correct. AQE is enabled
> <https://github.com/apache/spark/blob/8235f1d56bf232bb713fe24ff6f2ffdaf49d2fcc/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L664-L669>
> by default, but CBO isn’t
> <https://github.com/apache/spark/blob/8235f1d56bf232bb713fe24ff6f2ffdaf49d2fcc/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L2694-L2699>.
>
