[ https://issues.apache.org/jira/browse/IMPALA-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Attila Jeges resolved IMPALA-9094.
----------------------------------
    Fix Version/s: Impala 3.4.0
       Resolution: Fixed

Commit 476e1b12e79c83674ec5cd0983e30e9d47c65e8f in impala's branch refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=476e1b1 ]

IMPALA-9047: Bump CDP_BUILD_NUMBER to 1471450

This patch bumps CDP_BUILD_NUMBER to 1471450. The new GBN upgrades Ranger
from 1.2 to 2.0, which includes the change to the default Ranger policies
described in https://issues.apache.org/jira/browse/RANGER-2536.

Some of the Ranger tests fail because they assume the older behavior. To
address this issue, this patch temporarily disables the affected Ranger
tests. Specifically, the affected tests in the following test files are
disabled for now.
1. test_authorized_proxy.py
2. test_ranger.py
3. AuthorizationStmtTest.java
4. RangerAuditLogTest.java

IMPALA-8842 part 2: (Hive3) Use 'engine' field in HMS stat API

The new CDP GBN includes the fix for HIVE-22046. HIVE-22046 added an
'engine' column to the TAB_COL_STATS and PART_COL_STATS HMS tables. The new
column is used to differentiate among column stats computed by different
engines. The related HMS API calls were changed accordingly.

Part of this patch is Step 4 in a series of steps to coordinate the
introduction of HMS API changes to Hive3 and Impala. For more information
see IMPALA-8842 part 1.

Step 4 replaces *V2 calls with *. The *V2 names were introduced temporarily
and will be removed from the HMS API in the near future.

Testing:
This patch passes the affected Ranger tests listed above on a local machine.
E2E tests were added to make sure that column statistics are differentiated
by engine for partitioned and non-partitioned tables. The tests are executed
for transactional and non-transactional tables.
Change-Id: I962423cf202ad632b5817669500b3e3479f1a454
Reviewed-on: http://gerrit.cloudera.org:8080/14576
Reviewed-by: Joe McDonnell <joemcdonn...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> Update test_hms_integration.py test_compute_stats_get_to_hive to account for separate Hive/Impala statistics
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-9094
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9094
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>   Affects Versions: Impala 3.4.0
>            Reporter: Joe McDonnell
>            Assignee: Attila Jeges
>            Priority: Blocker
>             Fix For: Impala 3.4.0
>
>
> With newer Hive versions, Impala and Hive stats are kept separately and won't
> overwrite each other. test_hms_integration.py test_compute_stats_get_to_hive
> expects that Hive stats change when Impala does compute stats.
> test_compute_stats_get_to_impala expects that Impala stats change when Hive
> does compute stats. These tests need to be revised.
> Here are the example test failures:
> {noformat}
> metadata/test_hms_integration.py:486: in test_compute_stats_get_to_hive
>     assert hive_stats != self.hive_column_stats(table_name, 'x')
> E   assert {'# col_name': 'data_type', 'col_name': 'data_type', 'x': 'int'} != {'# col_name': 'data_type', 'col_name': 'data_type', 'x': 'int'}
> E    +  where {'# col_name': 'data_type', 'col_name': 'data_type', 'x': 'int'} = <bound method TestHmsIntegration.hive_column_stats of <test_hms_integration.TestHmsIntegration object at 0xe260e50>>('zbberubbydyldirc.fkqzvzekyqsjnflk', 'x')
> E    +    where <bound method TestHmsIntegration.hive_column_stats of <test_hms_integration.TestHmsIntegration object at 0xe260e50>> = <test_hms_integration.TestHmsIntegration object at 0xe260e50>.hive_column_stats
> {noformat}
> If my theory is right, we should flip the test to make sure that Impala
> compute stats doesn't impact Hive and vice versa.
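
The flipped expectation described in the issue can be illustrated with a small, self-contained sketch. This is not the actual change made to test_hms_integration.py and it does not call the HMS API: `FakeColumnStatsStore` and `check_stats_isolated` are hypothetical names, and the store is only a toy model that keys column stats by engine, loosely mirroring the 'engine' column added by HIVE-22046. The stat values are made up for illustration.

```python
from typing import Dict, Tuple

# (engine, table, column); a hypothetical key layout for this sketch only.
StatsKey = Tuple[str, str, str]


class FakeColumnStatsStore:
    """Toy stand-in for per-engine column stats. Not the real HMS API."""

    def __init__(self) -> None:
        self._stats: Dict[StatsKey, Dict[str, str]] = {}

    def compute_stats(self, engine: str, table: str, column: str,
                      stats: Dict[str, str]) -> None:
        # Writing stats for one engine only touches that engine's entry.
        self._stats[(engine, table, column)] = dict(stats)

    def get_stats(self, engine: str, table: str, column: str) -> Dict[str, str]:
        # Return a copy so callers cannot mutate the stored stats.
        return dict(self._stats.get((engine, table, column), {}))


def check_stats_isolated(store: FakeColumnStatsStore, table: str,
                         column: str) -> bool:
    """Assert that neither engine's compute stats changes the other's stats."""
    hive_before = store.get_stats("hive", table, column)
    store.compute_stats("impala", table, column,
                        {"distinct_count": "2", "num_nulls": "0"})
    # Flipped check: Hive stats must be unchanged after Impala computes stats.
    assert store.get_stats("hive", table, column) == hive_before

    impala_before = store.get_stats("impala", table, column)
    store.compute_stats("hive", table, column,
                        {"distinct_count": "3", "num_nulls": "1"})
    # And vice versa: Impala stats must be unchanged after Hive computes stats.
    assert store.get_stats("impala", table, column) == impala_before
    return True
```

In the real test, the two `get_stats` calls would presumably be replaced by the existing helpers that read stats from Hive and Impala; the point of the sketch is only that the original `!=` assertion becomes an equality check in both directions.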