[ https://issues.apache.org/jira/browse/IMPALA-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17705722#comment-17705722 ]
ASF subversion and git services commented on IMPALA-8205:
---------------------------------------------------------

Commit c6223b2aeb8ae23a094551aa2abc8fab75e13165 in impala's branch refs/heads/branch-4.1.2 from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=c6223b2ae ]

IMPALA-11953: Declare num_trues and num_falses in TIntermediateColumnStats as optional

TIntermediateColumnStats is the representation of incremental stats which are stored in HMS partition properties using keys like "impala_intermediate_stats_chunk0", "impala_intermediate_stats_chunk1", "impala_intermediate_stats_chunk2", etc. Fields in TIntermediateColumnStats should be optional to ensure backward compatibility.

IMPALA-8205 adds two required fields, num_trues and num_falses, in TIntermediateColumnStats. This breaks the incremental stats loading in higher versions of Impala if the stats are generated by older Impala versions (< 4.0). This patch changes the fields to be optional.

Tests:
- Verified the incremental stats generated by CDH Impala cluster can be loaded by CDP Impala cluster with this fix.
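
To illustrate why required fields break loading of older stats, here is a minimal Python sketch. It is not Impala's or Thrift's code: the `decode` helper, the `MissingRequiredField` exception, and the `num_nulls` field are hypothetical stand-ins. Only `num_trues` and `num_falses` come from the commit message.

```python
from typing import Any, Dict, Optional, Set


class MissingRequiredField(Exception):
    """Raised when a stored record lacks a field the reader requires."""


def decode(record: Dict[str, Any],
           required: Set[str],
           optional: Set[str]) -> Dict[str, Optional[Any]]:
    # A strict reader rejects any record that lacks a required field.
    missing = required - set(record)
    if missing:
        raise MissingRequiredField(", ".join(sorted(missing)))
    # Fields that are absent but optional are simply left unset (None).
    return {name: record.get(name) for name in sorted(required | optional)}


# Stats written by an older writer that knows nothing about the new fields.
# "num_nulls" is a hypothetical pre-existing field used only for illustration.
old_record = {"num_nulls": 3}

# Reader schema with the two new fields declared as required: loading fails.
try:
    decode(old_record,
           required={"num_trues", "num_falses"},
           optional={"num_nulls"})
    strict_ok = True
except MissingRequiredField:
    strict_ok = False

# Reader schema with the new fields declared as optional: loading succeeds.
lenient = decode(old_record,
                 required=set(),
                 optional={"num_nulls", "num_trues", "num_falses"})
```

In this sketch the strict reader rejects the old record, while the lenient reader accepts it and leaves `num_trues` and `num_falses` unset, which is the behaviour the patch aims for.
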
Change-Id: I4f74d5d0676e7ce9eb4ea8061a15610846db3ca5
Reviewed-on: http://gerrit.cloudera.org:8080/19555
Reviewed-by: Riza Suminto <riza.sumi...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>

> Illegal statistics for numFalse and numTrue
> -------------------------------------------
>
>                 Key: IMPALA-8205
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8205
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: wuchang
>            Assignee: wuchang
>            Priority: Major
>              Labels: impala, numFalse, numTrue, statistics
>             Fix For: Impala 4.0.0
>
>
> When Impala computes statistics, it sets *numFalse = -1* and *numTrue = 1* when the statistic is missing.
> A value of *-1* for *numFalse* breaks some query engines such as Presto, and a PR already exists that reports and hotfixes it: [presto-11859|https://github.com/prestodb/presto/pull/11859]
> A value of *1* for *numTrue* is also unreasonable, because it is unclear whether it is the real numTrue statistic or a missing one.
> Previously, *nullCount* also used -1 to indicate absence, which caused problems for Presto as well. Presto had to add a hotfix for it ([presto-11549|https://github.com/prestodb/presto/pull/11549]). Fortunately, Impala has since fixed that bug.
> These statistics should be set to null when they are absent, instead of -1 and 1.
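
The sentinel problem described in the quoted issue can be sketched in a few lines of Python. This is only an illustration of the general idea, not code from Impala or Presto; the `false_fraction` helper and the row counts are hypothetical.

```python
from typing import Optional


def false_fraction(num_falses: Optional[int], row_count: int) -> Optional[float]:
    """Return the share of FALSE values, or None when the statistic is absent."""
    if num_falses is None or row_count == 0:
        return None
    return num_falses / row_count


# With -1 as the "missing" sentinel, a consumer that does not know about the
# convention computes a negative fraction from it.
from_sentinel = false_fraction(-1, 100)

# With None (null) for a missing statistic, absence is explicit and the
# consumer can fall back to a default estimate.
from_null = false_fraction(None, 100)

# A real statistic is unaffected by the change of encoding.
from_real = false_fraction(25, 100)
```

Here `from_sentinel` is a meaningless negative number, `from_null` is `None`, and `from_real` is the expected fraction.
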