[ 
https://issues.apache.org/jira/browse/HIVE-8207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao updated HIVE-8207:
-----------------------
    Description: 
Now that multi-table insertion is committed to branch, we should enable those 
related qtests.

Here is a list of qfiles that should be activated (some of them may already be 
activated).
The list may not be comprehensive.

{noformat}
add_part_multiple.q
auto_smb_mapjoin_14.q
bucket5.q
column_access_stats.q
date_udf.q
groupby10.q
groupby11.q
groupby3_map_multi_distinct.q
groupby3_map.q
groupby3_map_skew.q
groupby3_noskew_multi_distinct.q
groupby3_noskew.q
groupby7_map_multi_single_reducer.q
groupby7_map.q
groupby7_map_skew.q
groupby7_noskew_multi_single_reducer.q
groupby7_noskew.q
groupby7.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby_complex_types_multi_single_reducer.q
groupby_complex_types.q
groupby_cube1.q
groupby_map_ppr_multi_distinct.q
groupby_map_ppr.q
groupby_multi_insert_common_distinct.q
groupby_multi_single_reducer2.q
groupby_multi_single_reducer3.q
groupby_multi_single_reducer.q
groupby_position.q
groupby_ppr.q
groupby_rollup1.q
groupby_sort_1_23.q
groupby_sort_1.q
groupby_sort_skew_1_23.q
infer_bucket_sort_multi_insert.q
innerjoin.q
input12_hadoop20.q
input12.q
input13.q
input14.q
input17.q
input18.q
input1_limit.q
input_part2.q
insert_into3.q
join_nullsafe.q
load_dyn_part8.q
metadata_only_queries_with_filters.q
multigroupby_singlemr.q
multi_insert_gby2.q
multi_insert_gby3.q
multi_insert_gby.q
multi_insert_lateral_view.q
multi_insert_move_tasks_share_dependencies.q
multi_insert.q
parallel.q
partition_date2.q
pcr.q
ppd_multi_insert.q
ppd_transform.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
stats4.q
subquery_multiinsert.q
table_access_keys_stats.q
tez_dml.q
udaf_percentile_approx_20.q
udaf_percentile_approx_23.q
union17.q
union18.q
union19.q
{noformat}                                                                      
        

There are some tests that cannot be enabled right now, due to various reasons:

1. ForwardOperator Issue, including
{noformat}
groupby7_noskew_multi_single_reducer.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby10.q
groupby_multi_insert_common_distinct.q 
union17.q
{noformat}

*Reason*: currently, if the node to break in the operator tree is a 
ForwardOperator, we simply do nothing. However, we may have the following case:

{noformat}
      ...
      RS_0
       |
      FOR
       |
     /   \
   GBY_1  GBY_2
    |     |
   ...   ...
    |     |
   RS_1  RS_2
    |     |
   ...   ...
    |     |
   FS_1  FS_2
{noformat}

which may result in:
{noformat}
          RW
         /  \
       RW    RW
{noformat}

and because of the issues in HIVE-7731 and HIVE-8118, both downstream branches 
will get duplicate (identical) inputs.
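
To make the shape above concrete, here is a minimal toy sketch in Python. It is 
not Hive's actual compiler code. It assumes that a work is cut after every 
ReduceSink and that a ForwardOperator is left in place; {{Op}}, {{Work}} and 
{{split_at_reduce_sinks}} are hypothetical names made up for this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Op:
    """Toy operator node; the RS/FOR/GBY/FS name prefixes mirror the diagram."""
    name: str
    children: list[Op] = field(default_factory=list)

    @property
    def is_reduce_sink(self) -> bool:
        return self.name.startswith("RS")


@dataclass
class Work:
    """Toy unit of work: the operators it runs and the works it feeds."""
    ops: list[str] = field(default_factory=list)
    children: list[Work] = field(default_factory=list)


def split_at_reduce_sinks(root: Op) -> Work:
    """Assumed rule: a ReduceSink ends the current work, and each of its
    children starts a new child work. A ForwardOperator is not treated
    specially, so all of its branches stay in the same work."""
    work = Work()
    stack = [root]
    while stack:
        op = stack.pop()
        work.ops.append(op.name)
        if op.is_reduce_sink:
            for child in op.children:
                work.children.append(split_at_reduce_sinks(child))
        else:
            # Reverse so that children are visited left to right.
            stack.extend(reversed(op.children))
    return work


# The part of the operator tree below RS_0 from the diagram ("..." omitted).
fs_1, fs_2 = Op("FS_1"), Op("FS_2")
rs_1, rs_2 = Op("RS_1", [fs_1]), Op("RS_2", [fs_2])
gby_1, gby_2 = Op("GBY_1", [rs_1]), Op("GBY_2", [rs_2])
forward = Op("FOR", [gby_1, gby_2])

root_work = split_at_reduce_sinks(forward)
print(root_work.ops)                         # operators in the parent work
print([w.ops for w in root_work.children])   # operators in each child work
```

In this toy model the parent work keeps FOR together with both GBY/RS branches, 
and each FS lands in its own child work, which is the one-parent, two-children 
RW shape shown above.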

2. Stats issue, including:
{noformat}
bucket5.q
infer_bucket_sort_multi_insert.q
stats4.q
smb_mapjoin_13.q
smb_mapjoin_15.q
{noformat}

*Reason*: In these tests, I get a diff error because {{numRows}} and 
{{rawDataSize}} are -1, but they are expected to be positive values. I 
don't think this is related to multi-insertion.

3. Join/SMB Join Issue, including
{noformat}
auto_smb_mapjoin_14.q
auto_sortmerge_join_13.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
{noformat}

*Reason*: These tests failed either with an exception or with a diff. I think 
it's because SMB Join (HIVE-8202) isn't supported right now.

4. Result doesn't match, including
{noformat}
groupby3_map_skew.q
groupby_map_ppr_multi_distinct.q
groupby_complex_types_multi_single_reducer.q
groupby_map_ppr.q
partition_date2.q
udaf_percentile_approx_23.q
{noformat}

*Reason*: The results of these tests differ from MR's. For instance, the 
test for groupby3_map_skew.q failed with this diff:

{noformat}
< 130091.0      260.182 256.10355987055016      98.0    0.0     142.92680950752379      143.06995106518903      20428.07288     20469.0109
---
> 130091.0      260.182 256.10355987055016      98.0    0.0     142.9268095075238       143.06995106518906      20428.07288     20469.0109
{noformat}
I don't know why this happens, but I think it may not be related to 
multi-insertion.
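
One possible explanation, which is not confirmed for these tests, is that 
floating-point addition is not associative, so aggregating the same doubles in 
a different order can change the last digits. The sketch below is plain Python, 
not Hive code. It shows the effect on three constants and checks that the two 
pairs of values from the diff above agree within a relative tolerance (the 
1e-12 tolerance is an arbitrary choice for illustration).

```python
import math

# Same three numbers, two different grouping orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)  # False

# The two value pairs that differ in the groupby3_map_skew.q diff.
diff_pairs = [
    (142.92680950752379, 142.9268095075238),
    (143.06995106518903, 143.06995106518906),
]

# A tolerance-based comparison treats both pairs as equal.
all_close = all(math.isclose(a, b, rel_tol=1e-12) for a, b in diff_pairs)
print(all_close)  # True
```

If this is indeed the cause, the mismatch would come from the order in which 
partial aggregates are combined rather than from multi-insertion itself.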


  was:
Now that multi-table insertion is committed to branch, we should enable those 
related qtests.

Here is a list of qfiles that should be activated (some of them may already be 
activated).
The list may not be comprehensive.

{noformat}
add_part_multiple.q
auto_smb_mapjoin_14.q
bucket5.q
column_access_stats.q
date_udf.q
groupby10.q
groupby11.q
groupby3_map_multi_distinct.q
groupby3_map.q
groupby3_map_skew.q
groupby3_noskew_multi_distinct.q
groupby3_noskew.q
groupby7_map_multi_single_reducer.q
groupby7_map.q
groupby7_map_skew.q
groupby7_noskew_multi_single_reducer.q
groupby7_noskew.q
groupby7.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby_complex_types_multi_single_reducer.q
groupby_complex_types.q
groupby_cube1.q
groupby_map_ppr_multi_distinct.q
groupby_map_ppr.q
groupby_multi_insert_common_distinct.q
groupby_multi_single_reducer2.q
groupby_multi_single_reducer3.q
groupby_multi_single_reducer.q
groupby_position.q
groupby_ppr.q
groupby_rollup1.q
groupby_sort_1_23.q
groupby_sort_1.q
groupby_sort_skew_1_23.q
infer_bucket_sort_multi_insert.q
innerjoin.q
input12_hadoop20.q
input12.q
input13.q
input14.q
input17.q
input18.q
input1_limit.q
input_part2.q
insert_into3.q
join_nullsafe.q
load_dyn_part8.q
metadata_only_queries_with_filters.q
multigroupby_singlemr.q
multi_insert_gby2.q
multi_insert_gby3.q
multi_insert_gby.q
multi_insert_lateral_view.q
multi_insert_move_tasks_share_dependencies.q
multi_insert.q
parallel.q
partition_date2.q
pcr.q
ppd_multi_insert.q
ppd_transform.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
stats4.q
subquery_multiinsert.q
table_access_keys_stats.q
tez_dml.q
udaf_percentile_approx_20.q
udaf_percentile_approx_23.q
union17.q
union18.q
union19.q
{noformat}                                                                      
        

There are some tests that cannot be enabled right now, due to various reasons:

1. ForwardOperator Issue, including
{noformat}
groupby7_noskew_multi_single_reducer.q
groupby8_map.q
groupby8_map_skew.q
groupby8_noskew.q
groupby8.q
groupby9.q
groupby10.q
groupby_multi_insert_common_distinct.q 
union17.q
{noformat}

*Reason*: currently, if the node to break in the operator tree is a 
ForwardOperator, we simply do nothing. However, we may have the following case:

{noformat}
    ......  FOR -> RS_0 -> RS_1
                        \
                         -> RS_2
{noformat}

Here, {{RS_0}} leads to both {{RS_1}} and {{RS_2}}, and because of the issues in 
HIVE-7731 and HIVE-8118, both downstream branches will get duplicated results.

2. Stats issue, including:
{noformat}
bucket5.q
infer_bucket_sort_multi_insert.q
stats4.q
smb_mapjoin_13.q
smb_mapjoin_15.q
{noformat}

*Reason*: In these tests, I get a diff error because {{numRows}} and 
{{rawDataSize}} are -1, but they are expected to be positive values. I 
don't think this is related to multi-insertion.

3. Join/SMB Join Issue, including
{noformat}
auto_smb_mapjoin_14.q
auto_sortmerge_join_13.q
smb_mapjoin_11.q
smb_mapjoin_12.q
smb_mapjoin_13.q
smb_mapjoin_15.q
smb_mapjoin_16.q
{noformat}

*Reason*: These tests failed either with an exception or with a diff. I think 
it's because SMB Join (HIVE-8202) isn't supported right now.

4. Result doesn't match, including
{noformat}
groupby3_map_skew.q
groupby_map_ppr_multi_distinct.q
groupby_complex_types_multi_single_reducer.q
groupby_map_ppr.q
partition_date2.q
udaf_percentile_approx_23.q
{noformat}

*Reason*: The results of these tests differ from MR's. For instance, the 
test for groupby3_map_skew.q failed with this diff:

{noformat}
< 130091.0      260.182 256.10355987055016      98.0    0.0     142.92680950752379      143.06995106518903      20428.07288     20469.0109
---
> 130091.0      260.182 256.10355987055016      98.0    0.0     142.9268095075238       143.06995106518906      20428.07288     20469.0109
{noformat}
I don't know why this happens, but I think it may not be related to 
multi-insertion.



> Add .q tests for multi-table insertion [Spark Branch]
> -----------------------------------------------------
>
>                 Key: HIVE-8207
>                 URL: https://issues.apache.org/jira/browse/HIVE-8207
>             Project: Hive
>          Issue Type: Test
>          Components: Spark
>            Reporter: Chao
>            Assignee: Chao
>         Attachments: HIVE-8207.1-spark.patch
>
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
