[ https://issues.apache.org/jira/browse/IMPALA-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Becker resolved IMPALA-12783.
------------------------------------
    Resolution: Fixed

> Nested struct with varlen data crashes
> --------------------------------------
>
>                 Key: IMPALA-12783
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12783
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Daniel Becker
>            Assignee: Daniel Becker
>            Priority: Major
>
> If a struct ("main") is within an array and contains two child structs ("s1"
> and "s2") which both contain strings (or other varlen data), it crashes when
> re-materialised (for example in a sort with limit) if codegen is enabled.
>
> To reproduce:
>
> In Hive:
> {code:java}
> create table nested (arr ARRAY<STRUCT<s1: STRUCT<str1: STRING>, s2: STRUCT<str2: STRING>>>) stored as parquet;
> insert into nested values (array( named_struct("s1", named_struct("str1", "A string that is long"), "s2", named_struct("str2", "Another string that is long") )));{code}
>
> In Impala:
> {code:java}
> select 1, arr from nested order by 1 limit 1;{code}
>
> This seems to be because in the codegen'd code, when checking if the strings
> ("str1" and "str2" in the example) are NULL, we incorrectly calculate the
> offset of the null indicator byte from the memory address of their containing
> struct, not from the beginning of the "master tuple", which in this case is
> the item tuple of the array.
>
> Note that the null indicators of the struct members are at the end of the
> tuple containing the struct (recursively), i.e. the master tuple.
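
The offset mistake described in the issue can be illustrated with a minimal C++ sketch. This is not Impala's actual code or tuple layout: the slot offsets, the tuple size, and the names `NullIndicator`, `IsNull` and `CheckStr2Null` are all hypothetical and chosen only for illustration. The sketch assumes a flat item tuple in which the child structs occupy the leading bytes and the null indicator byte sits after them, and it compares resolving the null indicator offset against the master tuple with resolving the same offset against the child struct's address.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout of the array's item tuple (the "master tuple").
constexpr std::size_t kTupleSize = 64;       // assumed total size in bytes
constexpr std::size_t kS2Offset = 16;        // assumed start of child struct s2
constexpr std::size_t kNullByteOffset = 32;  // assumed null indicator byte

// Location of a null bit: the byte offset is meant to be relative to the
// start of the master tuple, not to the struct that contains the slot.
struct NullIndicator {
  std::size_t byte_offset;
  std::uint8_t bit_mask;
};

// Tests the null bit at `byte_offset` bytes past `base`.
bool IsNull(const std::uint8_t* base, NullIndicator ind) {
  return (base[ind.byte_offset] & ind.bit_mask) != 0;
}

struct DemoResult {
  bool from_master_tuple;  // offset applied to the master tuple
  bool from_struct_base;   // same offset applied to the address of s2
};

DemoResult CheckStr2Null() {
  std::array<std::uint8_t, kTupleSize> tuple{};  // all null bits cleared
  // Unrelated bytes that the mis-based lookup ends up reading.
  tuple[kS2Offset + kNullByteOffset] = 0xFF;

  const NullIndicator str2_null{kNullByteOffset, 0x02};
  const std::uint8_t* master = tuple.data();
  const std::uint8_t* s2_base = master + kS2Offset;

  // Keep the sketch in bounds; a real mis-based read has no such guarantee.
  assert(s2_base + str2_null.byte_offset < master + kTupleSize);

  return {IsNull(master, str2_null), IsNull(s2_base, str2_null)};
}
```

In this sketch the lookup based on the master tuple reports that "str2" is not NULL, while the lookup based on the address of "s2" reads a byte 16 positions further on and reports the opposite. With a real tuple, the mis-based read could also land outside the tuple altogether.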