Csaba Ringhofer created IMPALA-8409:
---------------------------------------
Summary: STRINGs without stats have too low row-size in explain plan
Key: IMPALA-8409
URL: https://issues.apache.org/jira/browse/IMPALA-8409
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 3.2.0
Reporter: Csaba Ringhofer

STRING columns without the avg_size statistic are counted as 11 bytes in the row-size, while they take 12 bytes in the tuple (plus more elsewhere in memory if they are not empty). The issue is caused by adding -1 (meaning unknown) to the 12 byte slot size.

I think that this doesn't cause problems, as the estimation is probably way off without statistics anyway, but row-size >= tuple size seems like a meaningful invariant that we shouldn't break.

Reproduce:
{code}
create table test_row_size (s string);
explain select * from test_row_size;
Result:
...
WARNING: The following tables are missing relevant table and/or column statistics.
default.test_row_size
...
00:SCAN HDFS [default.test_row_size]
   partitions=1/1 files=0 size=0B
   row-size=11B cardinality=0
{code}
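
The arithmetic described in the report can be sketched as follows. This is only an illustration, not Impala's frontend code: the names `STRING_SLOT_SIZE`, `UNKNOWN_AVG_SIZE`, `row_size_buggy` and `row_size_clamped` are hypothetical, and clamping the unknown value to 0 is just one assumed way to keep row-size >= tuple size.

```python
# Hypothetical names throughout; this is not Impala's actual frontend code.
STRING_SLOT_SIZE = 12   # bytes a STRING slot takes in the tuple (from the report)
UNKNOWN_AVG_SIZE = -1   # sentinel meaning "no avg_size statistic" (from the report)


def row_size_buggy(avg_size: float) -> float:
    """Adds avg_size to the slot size even when it is the -1 sentinel."""
    return STRING_SLOT_SIZE + avg_size


def row_size_clamped(avg_size: float) -> float:
    """Assumed fix: an unknown (negative) avg_size contributes 0 bytes."""
    return STRING_SLOT_SIZE + max(avg_size, 0)


buggy = row_size_buggy(UNKNOWN_AVG_SIZE)      # 12 + (-1) = 11, below the slot size
clamped = row_size_clamped(UNKNOWN_AVG_SIZE)  # 12 + 0 = 12, invariant holds
```

With a known statistic both functions agree (for example an avg_size of 5 gives 17), so the clamp only changes the result when the statistic is missing.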