[jira] [Updated] (HIVE-7174) Do not accept string as scale and precision when reading Avro schema
[ https://issues.apache.org/jira/browse/HIVE-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xuefu Zhang updated HIVE-7174:
    Resolution: Fixed
    Status: Resolved (was: Patch Available)

Patch committed to trunk. Thanks Jarcec for the contribution.

Do not accept string as scale and precision when reading Avro schema

Key: HIVE-7174
URL: https://issues.apache.org/jira/browse/HIVE-7174
Project: Hive
Issue Type: Bug
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 0.14.0
Attachments: HIVE-7174.patch, dec.avro

I've noticed that the current AvroSerde will happily accept a schema that uses a string instead of an integer for scale and precision, e.g. the fragment {{"precision":"4","scale":"1"}} from the following table:

{code}
CREATE TABLE `avro_dec1`(
  `name` string COMMENT 'from deserializer',
  `value` decimal(4,1) COMMENT 'from deserializer')
COMMENT 'just drop the schema right into the HQL'
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
  'numFiles'='1',
  'avro.schema.literal'='{\"namespace\":\"com.howdy\",\"name\":\"some_schema\",\"type\":\"record\",\"fields\":[{\"name\":\"name\",\"type\":\"string\"},{\"name\":\"value\",\"type\":{\"type\":\"bytes\",\"logicalType\":\"decimal\",\"precision\":\"4\",\"scale\":\"1\"}}]}'
);
{code}

However, the Decimal spec defined in AVRO-1402 requires integers there and hence allows only the following fragment instead: {{"precision":4,"scale":1}} (i.e. no double quotes around the numbers).

As Hive can propagate this incorrect schema to new files, and hence create files with an invalid schema, I think that we should alter the behavior and insist on the correct schema.

--
This message was sent by Atlassian JIRA (v6.2#6252)
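The distinction drawn in the issue description (quoted versus unquoted precision and scale) can be checked mechanically. The following is a minimal Python sketch, not Hive's or Avro's implementation: the helper name `find_bad_decimal_attrs` and the path notation are hypothetical, and it only flags attributes that are present with a non-integer JSON type.

```python
import json


def find_bad_decimal_attrs(schema, path="$"):
    """Return (path, attribute, value) triples for every decimal logical type
    whose precision or scale is present but is not a JSON integer.
    Hypothetical helper for illustration only."""
    problems = []
    if isinstance(schema, dict):
        if schema.get("logicalType") == "decimal":
            for attr in ("precision", "scale"):
                if attr in schema:
                    value = schema[attr]
                    # bool is a subclass of int in Python, so exclude it explicitly.
                    if isinstance(value, bool) or not isinstance(value, int):
                        problems.append((path, attr, value))
        for key, child in schema.items():
            problems.extend(find_bad_decimal_attrs(child, f"{path}.{key}"))
    elif isinstance(schema, list):
        for index, child in enumerate(schema):
            problems.extend(find_bad_decimal_attrs(child, f"{path}[{index}]"))
    return problems


# The schema from the table above, with precision and scale written as strings.
QUOTED = (
    '{"namespace":"com.howdy","name":"some_schema","type":"record",'
    '"fields":[{"name":"name","type":"string"},'
    '{"name":"value","type":{"type":"bytes","logicalType":"decimal",'
    '"precision":"4","scale":"1"}}]}'
)
# The same schema with integer precision and scale.
UNQUOTED = QUOTED.replace('"4"', "4").replace('"1"', "1")

print(find_bad_decimal_attrs(json.loads(QUOTED)))
print(find_bad_decimal_attrs(json.loads(UNQUOTED)))
```

The first call reports both attributes of the `value` field; the second returns an empty list.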
[jira] [Updated] (HIVE-7174) Do not accept string as scale and precision when reading Avro schema
[ https://issues.apache.org/jira/browse/HIVE-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho updated HIVE-7174:
    Attachment: dec.avro

Do not accept string as scale and precision when reading Avro schema

Key: HIVE-7174
URL: https://issues.apache.org/jira/browse/HIVE-7174
Project: Hive
Issue Type: Bug
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 0.14.0
Attachments: HIVE-7174.patch, dec.avro
[jira] [Updated] (HIVE-7174) Do not accept string as scale and precision when reading Avro schema
[ https://issues.apache.org/jira/browse/HIVE-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho updated HIVE-7174:
    Attachment: HIVE-7174.patch

Attaching a patch that replaces the method call {{getValueAsInt}}, which is allowed to do type transformations (String -> Integer), with {{getIntValue}}, which will fail if the user supplies a string instead. I've verified that Hive now does not accept the incorrect schema.

Do not accept string as scale and precision when reading Avro schema

Key: HIVE-7174
URL: https://issues.apache.org/jira/browse/HIVE-7174
Project: Hive
Issue Type: Bug
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Attachments: HIVE-7174.patch
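The patch note in this message contrasts an accessor that coerces types with one that does not. As a rough illustration of that contrast, here is a small Python sketch. The functions `lenient_int` and `strict_int` are hypothetical stand-ins written for this example; they are not the {{getValueAsInt}} and {{getIntValue}} methods themselves, and the exact way the real strict accessor reacts to a string is not reproduced here.

```python
import json


def lenient_int(node, default=0):
    """Coerce integers and numeric strings to int, falling back to a default.
    Hypothetical stand-in for a type-transforming accessor."""
    if isinstance(node, bool):
        return default
    if isinstance(node, int):
        return node
    if isinstance(node, str):
        try:
            return int(node.strip())
        except ValueError:
            return default
    return default


def strict_int(node):
    """Accept only a genuine JSON integer and raise otherwise.
    Hypothetical stand-in for a non-coercing accessor."""
    if isinstance(node, bool) or not isinstance(node, int):
        raise TypeError(
            f"expected a JSON integer, got {type(node).__name__}: {node!r}"
        )
    return node


fragment = json.loads('{"precision": "4", "scale": "1"}')

# The lenient accessor silently turns the string "4" into the integer 4.
print(lenient_int(fragment["precision"]))

# The strict accessor refuses the same value.
try:
    strict_int(fragment["precision"])
except TypeError as err:
    print("rejected:", err)
```

With the lenient accessor the quoted schema passes unnoticed and can be written back out unchanged; with the strict one the problem surfaces when the schema is read.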
[jira] [Updated] (HIVE-7174) Do not accept string as scale and precision when reading Avro schema
[ https://issues.apache.org/jira/browse/HIVE-7174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho updated HIVE-7174:
    Fix Version/s: 0.14.0
    Status: Patch Available (was: Open)

Do not accept string as scale and precision when reading Avro schema

Key: HIVE-7174
URL: https://issues.apache.org/jira/browse/HIVE-7174
Project: Hive
Issue Type: Bug
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 0.14.0
Attachments: HIVE-7174.patch