Stamatis Zampetakis created HIVE-26168:
------------------------------------------
             Summary: EXPLAIN DDL command output is not deterministic
                 Key: HIVE-26168
                 URL: https://issues.apache.org/jira/browse/HIVE-26168
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
            Reporter: Stamatis Zampetakis


The EXPLAIN DDL command (HIVE-24596) can be used to recreate the schema for a given query in order to debug planner issues. This is achieved by fetching information from the metastore and outputting a series of DDL commands.

The output commands, though, may appear in a different order across runs, since there is no mechanism to enforce an explicit order.

Consider for instance the following scenario.

{code:sql}
CREATE TABLE customer
(
    `c_custkey` bigint,
    `c_name` string,
    `c_address` string
);
INSERT INTO customer VALUES (1, 'Bob', '12 avenue Mansart'), (2, 'Alice', '24 avenue Mansart');

EXPLAIN DDL SELECT c_custkey FROM customer WHERE c_name = 'Bob';
{code}

+Result 1+
{noformat}
ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAIChJLg1AGD1aCNBg==
{noformat}

+Result 2+
{noformat}
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAIChJLg1AGD1aCNBg==
{noformat}

The two results are equivalent, but the statements appear in a different order. This is not a big issue because the results remain correct, but it may lead to test flakiness, so it might be worth addressing.
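
Until an explicit order is enforced, a test could compare two EXPLAIN DDL outputs without depending on statement order. The following is only a rough sketch of that idea and not Hive code: the helper name {{canonicalize_ddl}} is hypothetical, and it assumes that every statement fits on one line and that a {{--}} comment line belongs to the statement directly above it, as in the results shown above.

```python
from typing import List


def canonicalize_ddl(output: str) -> List[str]:
    """Return the statements of an EXPLAIN DDL output in a stable order.

    Assumptions (not guaranteed by Hive): one statement per line, and each
    '--' comment line belongs to the statement immediately before it.
    """
    groups: List[List[str]] = []
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("--") and groups:
            # Keep the comment attached to the statement it describes.
            groups[-1].append(line)
        else:
            groups.append([line])
    # Sort whole groups so a comment never gets separated from its statement.
    return ["\n".join(group) for group in sorted(groups)]


# Two abbreviated runs with the same statements in a different order.
run_a = (
    "ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );\n"
    "ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );\n"
    "-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET.\n"
)
run_b = (
    "ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );\n"
    "-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET.\n"
    "ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );\n"
)

same = canonicalize_ddl(run_a) == canonicalize_ddl(run_b)
```

A plain lexicographic sort is enough for comparing outputs, but it is not a replay order: it would, for example, place ALTER statements before a CREATE statement. A fix inside Hive would more likely need to emit the statements in a fixed, dependency-aware order.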