[jira] [Commented] (HIVE-4022) Structs and struct fields cannot be NULL in INSERT statements
[ https://issues.apache.org/jira/browse/HIVE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14268625#comment-14268625 ] Alexander Behm commented on HIVE-4022: -- Easier workaround: IF(false, named_struct(a, 1), NULL) Structs and struct fields cannot be NULL in INSERT statements - Key: HIVE-4022 URL: https://issues.apache.org/jira/browse/HIVE-4022 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Michael Malak Originally thought to be Avro-specific, and first noted with respect to HIVE-3528 Avro SerDe doesn't handle serializing Nullable types that require access to a Schema, it turns out even native Hive tables cannot store NULL in a STRUCT field or for the entire STRUCT itself, at least when the NULL is specified directly in the INSERT statement. Again, this affects both Avro-backed tables and native Hive tables. ***For native Hive tables: The following: echo 1,2 twovalues.csv hive CREATE TABLE tc (x INT, y INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; LOAD DATA LOCAL INPATH 'twovalues.csv' INTO TABLE tc; CREATE TABLE oc (z STRUCTa: int, b: int); INSERT INTO TABLE oc SELECT null FROM tc; produces the error FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'oc': Cannot convert column 0 from void to structa:int,b:int. The following: INSERT INTO TABLE oc SELECT named_struct('a', null, 'b', null) FROM tc; produces the error: FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'oc': Cannot convert column 0 from structa:void,b:void to structa:int,b:int. ***For Avro: In HIVE-3528, there is in fact a null-struct test case in line 14 of https://github.com/apache/hive/blob/15cc604bf10f4c2502cb88fb8bb3dcd45647cf2c/data/files/csv.txt The test script at https://github.com/apache/hive/blob/12d6f3e7d21f94e8b8490b7c6d291c9f4cac8a4f/ql/src/test/queries/clientpositive/avro_nullable_fields.q does indeed work. But in that test, the query gets all of its data from a test table verbatim: INSERT OVERWRITE TABLE as_avro SELECT * FROM test_serializer; If instead we stick in a hard-coded null for the struct directly into the query, it fails: INSERT OVERWRITE TABLE as_avro SELECT string1, int1, tinyint1, smallint1, bigint1, boolean1, float1, double1, list1, map1, null, enum1, nullableint, bytes1, fixed1 FROM test_serializer; with the following error: FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different 'as_avro': Cannot convert column 10 from void to structsint:int,sboolean:boolean,sstring:string. Note, though, that substituting a hard-coded null for string1 (and restoring struct1 into the query) does work: INSERT OVERWRITE TABLE as_avro SELECT null, int1, tinyint1, smallint1, bigint1, boolean1, float1, double1, list1, map1, struct1, enum1, nullableint, bytes1, fixed1 FROM test_serializer; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4022) Structs and struct fields cannot be NULL in INSERT statements
[ https://issues.apache.org/jira/browse/HIVE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13817769#comment-13817769 ] Adrian Hains commented on HIVE-4022: I found a workaround to get me past this restriction. I had a need to add some struct columns to a table t1 by way of copying the data to a new table t2 with the correct updated schema. Trying to insert directly to t2 by selecting from t1 with null literals failed for me as described in this jira ticket. To work around this I created an additional table t2copy that has the same schema as t2. Then I did an insert to t1 selecting from t2 left outer join t2copy, and referencing the t2copy.newStructColumn instance to have a table-sourced null value pass to t1. This worked. It may be that t2copy having the same struct definition is unnecessary, and a simple empty table with a bogus struct column definition would have worked just as well. Structs and struct fields cannot be NULL in INSERT statements - Key: HIVE-4022 URL: https://issues.apache.org/jira/browse/HIVE-4022 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Michael Malak Originally thought to be Avro-specific, and first noted with respect to HIVE-3528 Avro SerDe doesn't handle serializing Nullable types that require access to a Schema, it turns out even native Hive tables cannot store NULL in a STRUCT field or for the entire STRUCT itself, at least when the NULL is specified directly in the INSERT statement. Again, this affects both Avro-backed tables and native Hive tables. ***For native Hive tables: The following: echo 1,2 twovalues.csv hive CREATE TABLE tc (x INT, y INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; LOAD DATA LOCAL INPATH 'twovalues.csv' INTO TABLE tc; CREATE TABLE oc (z STRUCTa: int, b: int); INSERT INTO TABLE oc SELECT null FROM tc; produces the error FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'oc': Cannot convert column 0 from void to structa:int,b:int. The following: INSERT INTO TABLE oc SELECT named_struct('a', null, 'b', null) FROM tc; produces the error: FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'oc': Cannot convert column 0 from structa:void,b:void to structa:int,b:int. ***For Avro: In HIVE-3528, there is in fact a null-struct test case in line 14 of https://github.com/apache/hive/blob/15cc604bf10f4c2502cb88fb8bb3dcd45647cf2c/data/files/csv.txt The test script at https://github.com/apache/hive/blob/12d6f3e7d21f94e8b8490b7c6d291c9f4cac8a4f/ql/src/test/queries/clientpositive/avro_nullable_fields.q does indeed work. But in that test, the query gets all of its data from a test table verbatim: INSERT OVERWRITE TABLE as_avro SELECT * FROM test_serializer; If instead we stick in a hard-coded null for the struct directly into the query, it fails: INSERT OVERWRITE TABLE as_avro SELECT string1, int1, tinyint1, smallint1, bigint1, boolean1, float1, double1, list1, map1, null, enum1, nullableint, bytes1, fixed1 FROM test_serializer; with the following error: FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different 'as_avro': Cannot convert column 10 from void to structsint:int,sboolean:boolean,sstring:string. Note, though, that substituting a hard-coded null for string1 (and restoring struct1 into the query) does work: INSERT OVERWRITE TABLE as_avro SELECT null, int1, tinyint1, smallint1, bigint1, boolean1, float1, double1, list1, map1, struct1, enum1, nullableint, bytes1, fixed1 FROM test_serializer; -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HIVE-4022) Structs and struct fields cannot be NULL in INSERT statements
[ https://issues.apache.org/jira/browse/HIVE-4022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13582662#comment-13582662 ] Michael Malak commented on HIVE-4022: - Note that there is a workaround for the case of setting STRUCT fields to NULL, but not for setting the whole STRUCT to a NULL. The following workaround does work: INSERT INT TABLE oc SELECT named_struct('a', cast(null as int), 'b', cast(null as int)) FROM tc; But there is no equivalent workaround to casting the whole STRUCT to NULL, as noted in the first comment of https://issues.apache.org/jira/browse/HIVE-1287 Structs and struct fields cannot be NULL in INSERT statements - Key: HIVE-4022 URL: https://issues.apache.org/jira/browse/HIVE-4022 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Michael Malak Originally thought to be Avro-specific, and first noted with respect to HIVE-3528 Avro SerDe doesn't handle serializing Nullable types that require access to a Schema, it turns out even native Hive tables cannot store NULL in a STRUCT field or for the entire STRUCT itself, at least when the NULL is specified directly in the INSERT statement. Again, this affects both Avro-backed tables and native Hive tables. ***For native Hive tables: The following: echo 1,2 twovalues.csv hive CREATE TABLE tc (x INT, y INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','; LOAD DATA LOCAL INPATH 'twovalues.csv' INTO TABLE tc; CREATE TABLE oc (z STRUCTa: int, b: int); INSERT INTO TABLE oc SELECT null FROM tc; produces the error FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'oc': Cannot convert column 0 from void to structa:int,b:int. The following: INSERT INTO TABLE oc SELECT named_struct('a', null, 'b', null) FROM tc; produces the error: FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'oc': Cannot convert column 0 from structa:void,b:void to structa:int,b:int. ***For Avro: In HIVE-3528, there is in fact a null-struct test case in line 14 of https://github.com/apache/hive/blob/15cc604bf10f4c2502cb88fb8bb3dcd45647cf2c/data/files/csv.txt The test script at https://github.com/apache/hive/blob/12d6f3e7d21f94e8b8490b7c6d291c9f4cac8a4f/ql/src/test/queries/clientpositive/avro_nullable_fields.q does indeed work. But in that test, the query gets all of its data from a test table verbatim: INSERT OVERWRITE TABLE as_avro SELECT * FROM test_serializer; If instead we stick in a hard-coded null for the struct directly into the query, it fails: INSERT OVERWRITE TABLE as_avro SELECT string1, int1, tinyint1, smallint1, bigint1, boolean1, float1, double1, list1, map1, null, enum1, nullableint, bytes1, fixed1 FROM test_serializer; with the following error: FAILED: SemanticException [Error 10044]: Line 1:23 Cannot insert into target table because column number/types are different 'as_avro': Cannot convert column 10 from void to structsint:int,sboolean:boolean,sstring:string. Note, though, that substituting a hard-coded null for string1 (and restoring struct1 into the query) does work: INSERT OVERWRITE TABLE as_avro SELECT null, int1, tinyint1, smallint1, bigint1, boolean1, float1, double1, list1, map1, struct1, enum1, nullableint, bytes1, fixed1 FROM test_serializer; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira