[ https://issues.apache.org/jira/browse/IMPALA-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Quanlong Huang updated IMPALA-8409:
-----------------------------------
    Fix Version/s:     (was: Impala 4.0)
                   Impala 3.3.0

> STRINGs without stats have too low row-size in explain plan
> -----------------------------------------------------------
>
>                 Key: IMPALA-8409
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8409
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.2.0
>            Reporter: Csaba Ringhofer
>            Assignee: Csaba Ringhofer
>            Priority: Minor
>              Labels: explain, statistics
>             Fix For: Impala 3.3.0
>
>
> STRING columns without an avg_size statistic are counted in the row-size as
> 11 bytes, while they take 12 bytes in the tuple (plus more elsewhere in
> memory if they are not empty). The issue is caused by adding -1 (meaning
> unknown) to the 12-byte slot size.
> I think this doesn't cause problems, as the estimate is probably way
> off without statistics anyway, but row-size >= tuple size seems like a
> meaningful invariant that we shouldn't break.
> Reproduce:
> {code}
> create table test_row_size (s string);
> explain select * from test_row_size;
> Result:
> ...
> WARNING: The following tables are missing relevant table and/or column
> statistics.
> default.test_row_size
> ...
> 00:SCAN HDFS [default.test_row_size]
>    partitions=1/1 files=0 size=0B
>    row-size=11B cardinality=0
> {code}
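The arithmetic behind the 11-byte figure can be sketched in a few lines. This is only an illustration of the description above, not Impala's frontend code: the function and constant names are hypothetical, and clamping the unknown value to zero is one possible guard, not necessarily the fix that was committed.

```python
# Illustrative sketch only; all names here are hypothetical, not Impala's.
STRING_SLOT_SIZE = 12   # bytes a STRING slot takes in the tuple (per the issue)
UNKNOWN_AVG_SIZE = -1   # sentinel meaning "no avg_size statistic" (per the issue)


def buggy_string_row_size(avg_size: int) -> int:
    """Add avg_size to the slot size without checking for the sentinel."""
    return STRING_SLOT_SIZE + avg_size


def guarded_string_row_size(avg_size: int) -> int:
    """Treat an unknown (negative) avg_size as 0 so row-size >= slot size."""
    return STRING_SLOT_SIZE + max(avg_size, 0)


print(buggy_string_row_size(UNKNOWN_AVG_SIZE))    # 11, below the 12-byte slot
print(guarded_string_row_size(UNKNOWN_AVG_SIZE))  # 12, invariant holds
```

With the guard in place, a column without statistics contributes exactly its slot size, so the row-size can no longer fall below the tuple size.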