Why does "cache table a as select * from b" cause a shuffle and create 2 stages?
Example:
The table "ods_pay_consume" is built from a stream created with "KafkaUtils.createDirectStream".
hiveContext.sql("cache table dwd_pay_consume as select * from ods_pay_consume")
This statement produces 2 stages in the DAG.
hive