[ https://issues.apache.org/jira/browse/SPARK-12255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ant_nebula updated SPARK-12255:
-------------------------------
Description:
Why does "cache table a as select * from b" trigger a shuffle and create 2 stages?
Example:
The table "ods_pay_consume" comes from "KafkaUtils.createDirectStream".
{code}
hiveContext.sql("cache table dwd_pay_consume as select * from ods_pay_consume")
{code}
This creates the DAG shown in pic1.jpg.
{code}
hiveContext.sql("cache table dw_game_server_recharge as select * from dwd_pay_consume")
{code}
This creates the DAG shown in pic2.jpg, and the computation again starts from the
beginning, so "cache table dwd_pay_consume" has no effect.
was:
Why does "cache table a as select * from b" trigger a shuffle and create 2 stage?
Example:
The table "ods_pay_consume" comes from "KafkaUtils.createDirectStream".
{code}
hiveContext.sql("cache table dwd_pay_consume as select * from ods_pay_consume")
{code}
This creates the DAG shown in pic1.jpg.
{code}
hiveContext.sql("cache table dw_game_server_recharge as select * from dwd_pay_consume")
{code}
This creates the DAG shown in pic2.jpg, and the computation again starts from the
beginning, so "cache table dwd_pay_consume" has no effect.
Summary: why "cache table a as select * from b" will do shuffle,and
create 2 stages (was: why "cache table a as select * from b" will do
shuffle,and create 2 stage)
> why "cache table a as select * from b" will do shuffle,and create 2 stages
> --------------------------------------------------------------------------
>
> Key: SPARK-12255
> URL: https://issues.apache.org/jira/browse/SPARK-12255
> Project: Spark
> Issue Type: Question
> Environment: spark-1.4.1-bin-hadoop2.4
> Reporter: ant_nebula
> Attachments: pic1.jpg, pic2.jpg
>
>
> Why does "cache table a as select * from b" trigger a shuffle and create 2 stages?
> Example:
> The table "ods_pay_consume" comes from "KafkaUtils.createDirectStream".
> {code}
> hiveContext.sql("cache table dwd_pay_consume as select * from ods_pay_consume")
> {code}
> This creates the DAG shown in pic1.jpg.
> {code}
> hiveContext.sql("cache table dw_game_server_recharge as select * from dwd_pay_consume")
> {code}
> This creates the DAG shown in pic2.jpg, and the computation again starts from the
> beginning, so "cache table dwd_pay_consume" has no effect.