*I have run a query many times, there will be two results without regular.*
*One is 36834699 and other is 18464706.*

*The query is *
set spark.yarn.queue=soft.high;
set hive.execution.engine=spark;
select /*+mapjoin(t3,t4,t5)*/
  count(1)
from
  (
  select
    coalesce(t11.qua,t12.qua,t13.qua) qua,
    coalesce(t11.scene,t12.lanmu_id,t13.lanmu_id) scene,
    coalesce(t11.app_id,t12.appid,t13.app_id) app_id,
    expos_pv,
    expos_uv,
    dload_pv,
    dload_uv,
    dload_cnt,
    dload_user,
    evil_dload_cnt,
    evil_dload_user,
    update_dcnt,
    update_duser,
    hand_suc_incnt,
    hand_suc_inuser,
    day_hand_suc_incnt,
    day_hand_suc_inuser
  from
    (select * from t_ed_soft_assist_useraction_stat where ds=20160126)t11
    full outer join
    (select * from t_md_soft_lanmu_app_dload_detail where ds=20160126)t12
    on t11.qua=t12.qua and t11.app_id=t12.appid and t11.scene=t12.lanmu_id
    full outer join
    (select * from t_md_soft_client_install_lanmu  where ds=20160126)t13
    on t11.qua=t13.qua and t11.app_id=t13.app_id and t11.scene=t13.lanmu_id
  )t1
  left outer join t_rd_qua t3 on t3.ds=20160126 and t1.qua=t3.qua
  left outer join t_rd_soft_appnew_last t4 on t4.ds=20160126 and
t1.app_id=t4.app_id
  left outer join t_rd_soft_page_conf t5 on t5.ds=20160126 and
t1.scene=t5.pageid and t3.client_type_id=t5.ismtt;


*Explain query is*
STAGE DEPENDENCIES:
  Stage-2 is a root stage
  Stage-1 depends on stages: Stage-2
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-2
    Spark
      DagName: mqq_20160127151826_e8197f40-18d7-430c-9fc8-993facb74534:2
      Vertices:
        Map 6
            Map Operator Tree:
                TableScan
                  alias: t3
                  Statistics: Num rows: 1051 Data size: 113569 Basic stats:
COMPLETE Column stats: NONE
                  Spark HashTable Sink Operator
                    keys:
                      0 _col0 (type: string)
                      1 qua (type: string)
            Local Work:
              Map Reduce Local Work
        Map 7
            Map Operator Tree:
                TableScan
                  alias: t4
                  Statistics: Num rows: 2542751 Data size: 220433659 Basic
stats: COMPLETE Column stats: NONE
                  Spark HashTable Sink Operator
                    keys:
                      0 UDFToDouble(_col2) (type: double)
                      1 UDFToDouble(app_id) (type: double)
            Local Work:
              Map Reduce Local Work
        Map 8
            Map Operator Tree:
                TableScan
                  alias: t5
                  Statistics: Num rows: 143 Data size: 28605 Basic stats:
COMPLETE Column stats: NONE
                  Spark HashTable Sink Operator
                    keys:
                      0 _col1 (type: string), UDFToDouble(_col20) (type:
double)
                      1 pageid (type: string), UDFToDouble(ismtt) (type:
double)
            Local Work:
              Map Reduce Local Work

  Stage: Stage-1
    Spark
      Edges:
        Reducer 2 <- Map 1 (PARTITION-LEVEL SORT, 5), Map 4
(PARTITION-LEVEL SORT, 5), Map 5 (PARTITION-LEVEL SORT, 5)
        Reducer 3 <- Reducer 2 (GROUP, 1)
      DagName: mqq_20160127151826_e8197f40-18d7-430c-9fc8-993facb74534:1
      Vertices:
        Map 1
            Map Operator Tree:
                TableScan
                  alias: t_ed_soft_assist_useraction_stat
                  Statistics: Num rows: 16368107 Data size: 651461220 Basic
stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: qua (type: string), scene (type: string),
app_id (type: string)
                    outputColumnNames: _col0, _col1, _col2
                    Statistics: Num rows: 16368107 Data size: 651461220
Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: _col0 (type: string),
UDFToDouble(_col2) (type: double), _col1 (type: string)
                      sort order: +++
                      Map-reduce partition columns: _col0 (type: string),
UDFToDouble(_col2) (type: double), _col1 (type: string)
                      Statistics: Num rows: 16368107 Data size: 651461220
Basic stats: COMPLETE Column stats: NONE
                      value expressions: _col2 (type: string)
        Map 4
            Map Operator Tree:
                TableScan
                  alias: t_md_soft_lanmu_app_dload_detail
                  Statistics: Num rows: 2503976 Data size: 203324640 Basic
stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: qua (type: string), appid (type: bigint),
lanmu_id (type: string)
                    outputColumnNames: _col2, _col3, _col4
                    Statistics: Num rows: 2503976 Data size: 203324640
Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: _col2 (type: string),
UDFToDouble(_col3) (type: double), _col4 (type: string)
                      sort order: +++
                      Map-reduce partition columns: _col2 (type: string),
UDFToDouble(_col3) (type: double), _col4 (type: string)
                      Statistics: Num rows: 2503976 Data size: 203324640
Basic stats: COMPLETE Column stats: NONE
                      value expressions: _col3 (type: bigint)
        Map 5
            Map Operator Tree:
                TableScan
                  alias: t_md_soft_client_install_lanmu
                  Statistics: Num rows: 2143605 Data size: 178502520 Basic
stats: COMPLETE Column stats: NONE
                  Select Operator
                    expressions: lanmu_id (type: string), qua (type:
string), app_id (type: int)
                    outputColumnNames: _col13, _col2, _col3
                    Statistics: Num rows: 2143605 Data size: 178502520
Basic stats: COMPLETE Column stats: NONE
                    Reduce Output Operator
                      key expressions: _col2 (type: string),
UDFToDouble(_col3) (type: double), _col13 (type: string)
                      sort order: +++
                      Map-reduce partition columns: _col2 (type: string),
UDFToDouble(_col3) (type: double), _col13 (type: string)
                      Statistics: Num rows: 2143605 Data size: 178502520
Basic stats: COMPLETE Column stats: NONE
                      value expressions: _col3 (type: int)
        Reducer 2
            Local Work:
              Map Reduce Local Work
            Reduce Operator Tree:
              Join Operator
                condition map:
                     Outer Join 0 to 1
                     Outer Join 0 to 2
                keys:
                  0 _col0 (type: string), UDFToDouble(_col2) (type:
double), _col1 (type: string)
                  1 _col2 (type: string), UDFToDouble(_col3) (type:
double), _col4 (type: string)
                  2 _col2 (type: string), UDFToDouble(_col3) (type:
double), _col13 (type: string)
                outputColumnNames: _col0, _col1, _col2, _col10, _col11,
_col12, _col26, _col27, _col37
                Statistics: Num rows: 36009836 Data size: 1433214715 Basic
stats: COMPLETE Column stats: NONE
                Select Operator
                  expressions: COALESCE(_col0,_col10,_col26) (type:
string), COALESCE(_col1,_col12,_col37) (type: string),
COALESCE(_col2,_col11,_col27) (type: string)
                  outputColumnNames: _col0, _col1, _col2
                  Statistics: Num rows: 36009836 Data size: 1433214715
Basic stats: COMPLETE Column stats: NONE
                  Map Join Operator
                    condition map:
                         Left Outer Join0 to 1
                    keys:
                      0 _col0 (type: string)
                      1 qua (type: string)
                    outputColumnNames: _col1, _col2, _col20
                    input vertices:
                      1 Map 6
                    Statistics: Num rows: 39610820 Data size: 1576536220
Basic stats: COMPLETE Column stats: NONE
                    Map Join Operator
                      condition map:
                           Left Outer Join0 to 1
                      keys:
                        0 UDFToDouble(_col2) (type: double)
                        1 UDFToDouble(app_id) (type: double)
                      outputColumnNames: _col1, _col20
                      input vertices:
                        1 Map 7
                      Statistics: Num rows: 43571902 Data size: 1734189879
Basic stats: COMPLETE Column stats: NONE
                      Map Join Operator
                        condition map:
                             Left Outer Join0 to 1
                        keys:
                          0 _col1 (type: string), UDFToDouble(_col20)
(type: double)
                          1 pageid (type: string), UDFToDouble(ismtt)
(type: double)
                        input vertices:
                          1 Map 8
                        Statistics: Num rows: 47929093 Data size:
1907608908 Basic stats: COMPLETE Column stats: NONE
                        Group By Operator
                          aggregations: count(1)
                          mode: hash
                          outputColumnNames: _col0
                          Statistics: Num rows: 1 Data size: 8 Basic stats:
COMPLETE Column stats: NONE
                          Reduce Output Operator
                            sort order:
                            Statistics: Num rows: 1 Data size: 8 Basic
stats: COMPLETE Column stats: NONE
                            value expressions: _col0 (type: bigint)
        Reducer 3
            Reduce Operator Tree:
              Group By Operator
                aggregations: count(VALUE._col0)
                mode: mergepartial
                outputColumnNames: _col0
                Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE
Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 1 Data size: 8 Basic stats:
COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format:
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      serde:
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink

Time taken: 3.723 seconds, Fetched: 173 row(s)



*Besh wishes.*
*Thank for any help.*

Reply via email to