Thanks, Mich, for your reply. I am curious about one thing: Hive uses a CBO
that takes CPU cost into account. Does the Hive optimizer have any advantage
over the Spark Catalyst optimizer?
Regards,
Srinivasan Hariharan
+91-9940395830
From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
Sent: Friday, June 10, 2016 3:27 PM
To: Srinivasan Hariharan02 <srinivasan_...@infosys.com>
Cc: Takeshi Yamamuro <linguin@gmail.com>; user@spark.apache.org
Subject: Re: Catalyst optimizer cpu/Io cost
In an SMP system such as Oracle or Sybase, the CBO takes logical I/O (LIO),
physical I/O (PIO) and CPU costs into account, or uses some empirical costing.
In a distributed system like Spark, with so many nodes, that may not be so
easy, or its contribution to the Catalyst decision may vary so much that it is
not worthwhile.
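To illustrate the kind of costing described above, here is a minimal sketch of a weighted cost formula that combines LIO, PIO and CPU estimates and picks the cheapest plan. The weights, plan names and numbers are all made up for illustration; they are not taken from Oracle, Sybase or Spark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanEstimate:
    name: str
    lio: int      # estimated logical I/Os (page reads served from cache)
    pio: int      # estimated physical I/Os (page reads that hit disk)
    cpu_ops: int  # estimated CPU operations


# Hypothetical weights: one physical read is assumed to cost far more than
# one logical read, and one CPU operation far less.
LIO_WEIGHT = 1.0
PIO_WEIGHT = 25.0
CPU_WEIGHT = 0.001


def plan_cost(plan: PlanEstimate) -> float:
    """Collapse the three estimates into a single comparable number."""
    return (LIO_WEIGHT * plan.lio
            + PIO_WEIGHT * plan.pio
            + CPU_WEIGHT * plan.cpu_ops)


def cheapest(plans):
    """Return the candidate plan with the lowest combined cost."""
    return min(plans, key=plan_cost)


candidates = [
    PlanEstimate("full_table_scan", lio=10_000, pio=2_000, cpu_ops=1_000_000),
    PlanEstimate("index_range_scan", lio=1_200, pio=300, cpu_ops=150_000),
]
best = cheapest(candidates)
print(best.name, plan_cost(best))
```

On a single machine these weights can be calibrated once. Across many nodes with different disks, caches and network links, a single set of weights is much harder to justify, which is the difficulty mentioned above.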
HTH
Dr Mich Talebzadeh
LinkedIn
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
On 10 June 2016 at 10:45, Srinivasan Hariharan02
<srinivasan_...@infosys.com> wrote:
Thanks, Takeshi. Is there any reason the Catalyst optimizer does not use I/O
and CPU cost? Some SQL engines that leverage Apache Calcite have a cost-based
planner such as VolcanoPlanner, which takes CPU and I/O cost into account for
plan optimization.
Regards,
Srinivasan Hariharan
+91-9940395830
From: Takeshi Yamamuro [mailto:linguin@gmail.com]
Sent: Friday, June 10, 2016 2:38 PM
To: Srinivasan Hariharan02 <srinivasan_...@infosys.com>
Cc: user@spark.apache.org
Subject: Re: Catalyst optimizer cpu/Io cost
Hi,
There is no way to retrieve that information in Spark.
In fact, the current optimizer only considers the estimated output size in
bytes of each LogicalPlan.
Related code can be found in
https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala#L90
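As a rough illustration of what a size-only statistic means for planning, here is a toy sketch in plain Python. It is not Spark code. It assumes, as I understand the linked code, that a leaf relation reports its own size, that a non-leaf node falls back to the product of its children's sizes, and that a join side at or below a broadcast threshold is broadcast. The `Node` class, `choose_join` function and table sizes are hypothetical; the 10 MB constant mirrors the default of `spark.sql.autoBroadcastJoinThreshold`.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List

# Mirrors the default of spark.sql.autoBroadcastJoinThreshold (10 MB).
BROADCAST_THRESHOLD = 10 * 1024 * 1024


@dataclass
class Node:
    """Hypothetical stand-in for a logical plan node."""
    name: str
    children: List["Node"] = field(default_factory=list)
    leaf_size: int = 0  # bytes; only meaningful for leaf relations

    def size_in_bytes(self) -> int:
        # Leaf: report the known size of the underlying relation.
        if not self.children:
            return self.leaf_size
        # Non-leaf: pessimistic default, the product of the children's sizes.
        return prod(child.size_in_bytes() for child in self.children)


def choose_join(left: Node, right: Node) -> str:
    """Pick a join strategy from byte sizes alone; no CPU or I/O cost."""
    smaller = min(left.size_in_bytes(), right.size_in_bytes())
    return "broadcast" if smaller <= BROADCAST_THRESHOLD else "shuffle"


orders = Node("orders", leaf_size=5 * 1024**3)        # 5 GB
countries = Node("countries", leaf_size=2 * 1024**2)  # 2 MB
lineitem = Node("lineitem", leaf_size=20 * 1024**3)   # 20 GB

print(choose_join(orders, countries))  # small side fits under the threshold
print(choose_join(orders, lineitem))   # both sides are too large
print(Node("join", [orders, countries]).size_in_bytes())
```

The only input to the decision is a byte count, so two plans with the same output size look identical to this kind of optimizer no matter how much CPU or I/O each would consume.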
If you want to know more about Catalyst, you can check Yin Huai's slides from
Spark Summit 2016.
https://spark-summit.org/2016/speakers/yin-huai/
# Note: the slides are not available yet; it seems they will be in a few weeks.
// maropu
On Fri, Jun 10, 2016 at 3:29 PM, Srinivasan Hariharan02
<srinivasan_...@infosys.com> wrote:
Hi,
How can I get the CPU and I/O cost of a Spark SQL query after it has been
optimized into the best logical plan? Is there any API to retrieve this
information? Could anyone point me to the code in the Catalyst module where
CPU and I/O cost are actually computed?
Regards,
Srinivasan Hariharan
+91-9940395830
--
---
Takeshi Yamamuro