[ https://issues.apache.org/jira/browse/HIVE-22181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
qingfa zhou updated HIVE-22181: ------------------------------- Attachment: image-2019-09-10-12-40-53-338.png > Same query but different result occasionally > -------------------------------------------- > > Key: HIVE-22181 > URL: https://issues.apache.org/jira/browse/HIVE-22181 > Project: Hive > Issue Type: Bug > Components: Hive > Affects Versions: 2.3.0 > Environment: hadoop:2.7.3.2.6.1.0-129 > hive:2.3.0 > tez:0.8.4 > java:1.8.0_212 > Reporter: qingfa zhou > Priority: Major > Attachments: WX20190910-123604.png, WX20190910-123621.png, > WX20190910-123641.png, image-2019-09-10-12-40-34-958.png, > image-2019-09-10-12-40-53-338.png > > > h3. 1.Phenomenon > Running the same SQL(hive on tez) several times will result in > inconsistent results . The following results are generated respectively > 1) > OK > 3951864 808 > 2) > OK > 1822979 353 > But the first result is true. > h3. 2.SQL > select > count(1),count(distinct store_code) > from > ( > select > store_code, > all_pay_id, > payable_price, > row_number()over(partition by store_code order by payable_price desc) as rw, > count(1)over(partition by store_code) as store_user > from > ( > select > store_code, > all_pay_id, > sum(payable_price) as payable_price > from data_promotion.mdw_user_promotion_shopping_behaivor_four_week_detail a > where dt='20190904' > group by store_code,all_pay_id > ) a > order by rw > ) a > where rw/store_user<=0.8 > ; > > h3. 3.Troubleshooting > I lookup task result from tez web ui,give the result as follows: > the first: > the second: > !image-2019-09-10-12-18-08-769.png! > so , The second RECORDS_OUT_INTERMEDIATE_Reducer_4's result is much > lower than the first. > This is detailed info of the second result from reducer_3 to reducer_4 > !image-2019-09-10-12-26-41-681.png! > This is execution plan of the sql: > OK > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (SIMPLE_EDGE) > Reducer 5 <- Reducer 4 (SIMPLE_EDGE) > Reducer 6 <- Reducer 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 6 > File Output Operator [FS_24] > Group By Operator [GBY_22] (rows=1 width=128) > Output:["_col0","_col1"],aggregations:["count(VALUE._col0)","count(DISTINCT > KEY._col0:0._col0)"] > <-Reducer 5 [SIMPLE_EDGE] > SHUFFLE [RS_21] > Group By Operator [GBY_20] (rows=5619870 width=2328) > Output:["_col0","_col1","_col2"],aggregations:["count(1)","count(DISTINCT > _col0)"],keys:_col0 > Select Operator [SEL_17] (rows=5619870 width=2328) > Output:["_col0"] > <-Reducer 4 [SIMPLE_EDGE] > SHUFFLE [RS_16] > Select Operator [SEL_14] (rows=5619870 width=2328) > Output:["_col0","_col3"] > Filter Operator [FIL_25] (rows=5619870 width=2328) > predicate:((_col0 / count_window_1) <= 0.8) > PTF Operator [PTF_13] (rows=16859610 width=2328) > Function definitions:[{},\\{"name:":"windowingtablefunction","order > by:":"_col1 ASC NULLS FIRST","partition by:":"_col1"}] > Select Operator [SEL_12] (rows=16859610 width=2328) > Output:["_col0","_col1"] > <-Reducer 3 [SIMPLE_EDGE] > SHUFFLE [RS_11] > PartitionCols:_col0 > Select Operator [SEL_10] (rows=16859610 width=2328) > Output:["_col0","row_number_window_0"] > PTF Operator [PTF_9] (rows=16859610 width=2328) > Function definitions:[{},\\{"name:":"windowingtablefunction","order > by:":"_col2 DESC NULLS LAST","partition by:":"_col0"}] > Select Operator [SEL_8] (rows=16859610 width=2328) > Output:["_col0","_col2"] > <-Reducer 2 [SIMPLE_EDGE] > SHUFFLE [RS_7] > PartitionCols:_col0 > Select Operator [SEL_6] (rows=16859610 width=2328) > Output:["_col0","_col2"] > Group By Operator [GBY_5] (rows=16859610 width=2328) > > Output:["_col0","_col1","_col2"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, > KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=33719220 width=2328) > > Output:["_col0","_col1","_col2"],aggregations:["sum(payable_price)"],keys:store_code, > all_pay_id > Select Operator [SEL_2] (rows=33719220 width=2328) > Output:["store_code","all_pay_id","payable_price"] > TableScan [TS_0] (rows=33719220 width=2328) > > data_promotion@mdw_user_promotion_shopping_behaivor_four_week_detail,a,Tbl:COMPLETE,Col:NONE,Output:["all_pay_id","store_code","payable_price"] > > Reducer 4 show PTF , so I don't know if this result has anything to do > with PTF? Can you help me. > > -- This message was sent by Atlassian Jira (v8.3.2#803003)