Re: Review Request 24085: HIVE-7446: Add support to ALTER TABLE .. ADD COLUMN to Avro backed tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24085/#review50049 --- Ship it! Ship It! - Tom White On Aug. 8, 2014, midnight, Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24085/ --- (Updated Aug. 8, 2014, midnight) Review request for hive. Bugs: HIVE-7446 https://issues.apache.org/jira/browse/HIVE-7446 Repository: hive-git Description --- HIVE-7446: Add support to ALTER TABLE .. ADD COLUMN to Avro backed tables Diffs - ql/src/test/queries/clientpositive/avro_add_column.q PRE-CREATION ql/src/test/queries/clientpositive/avro_add_column2.q PRE-CREATION ql/src/test/queries/clientpositive/avro_add_column3.q PRE-CREATION ql/src/test/results/clientpositive/avro_add_column.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_add_column2.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_add_column3.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java 915f01679183904d0d93b9b8a88dc1a64ac2af78 serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java 722bdf9f8452fe8632db7d9167182310e467281d serde/src/test/resources/avro-nested-struct.avsc 785af83cd01fe91626741b3d7659d8f515854774 serde/src/test/resources/avro-struct.avsc 313c74f6140615d2737ef1a49a2777656f35f4e3 Diff: https://reviews.apache.org/r/24085/diff/ Testing --- qTests Thanks, Ashish Singh
Re: Review Request 24085: HIVE-7446: Add support to ALTER TABLE .. ADD COLUMN to Avro backed tables
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24085/#review49224 --- Ship it! Ship It! - Tom White On July 30, 2014, 2:33 a.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24085/ --- (Updated July 30, 2014, 2:33 a.m.) Review request for hive. Bugs: HIVE-7446 https://issues.apache.org/jira/browse/HIVE-7446 Repository: hive-git Description --- HIVE-7446: Add support to ALTER TABLE .. ADD COLUMN to Avro backed tables Diffs - ql/src/test/queries/clientpositive/avro_add_column.q PRE-CREATION ql/src/test/queries/clientpositive/avro_add_column2.q PRE-CREATION ql/src/test/results/clientpositive/avro_add_column.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_add_column2.q.out PRE-CREATION Diff: https://reviews.apache.org/r/24085/diff/ Testing --- qTests Thanks, Ashish Singh
Re: Review Request 23387: HIVE-6806: Native avro support
On July 18, 2014, 1:57 p.m., Tom White wrote:
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java, line 112
https://reviews.apache.org/r/23387/diff/9/?file=634614#file634614line112

    Is it possible to default name to table name, namespace to database name, and doc to table comment?

Ashish Singh wrote:
    I was planning to do this, but it slipped my mind. Thanks for pointing this out. I don't think it is possible to retrieve the database name inside the serde. Addressed name and doc.

Tom White wrote:
    Thanks for fixing this. There's no test that name and comment are picked up from the table definition - perhaps you could add one, or at least confirm it manually. I couldn't see where in Hive they get set... Otherwise, +1 from me - thanks for addressing all my comments. This is a great feature to add.

Ashish Singh wrote:
    Tom, actually it's tested by all the unit tests now. Look at the diffs in https://reviews.apache.org/r/23387/diff/8-9/#index_header.

Tom White wrote:
    Unless I am missing something, the unit tests in TestTypeInfoToSchema don't test this, since they hardcode the table name to "avrotest" and the table comment to "This is to test hive-avro". Perhaps this is tested indirectly through the ql tests, since Avro schema resolution rules mean that a record schema's name must match for both the reader and writer. However, this isn't true for comments (Avro schema doc), and it would be good to confirm that inserting data into an Avro-backed Hive table creates Avro files with the expected top-level name and comment.

Ashish Singh wrote:
    Tom, my bad. I thought we were talking about having the Hive typeinfo in the doc of the corresponding Avro schema. I did verify that the top-level name and comment are being created as expected before posting the patch here. I log the created Avro schema in AvroSerDe.java, and that came in handy to verify this.

OK, thanks for clarifying. +1 from me.

- Tom

--- This is an automatically generated e-mail. 
To reply, visit: https://reviews.apache.org/r/23387/#review48120 --- On July 22, 2014, 5:13 p.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/ --- (Updated July 22, 2014, 5:13 p.m.) Review request for hive. Bugs: HIVE-6806 https://issues.apache.org/jira/browse/HIVE-6806 Repository: hive-git Description --- HIVE-6806: Native avro support Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/AvroStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 1bae0a8fee04049f90b16d813ff4c96707b349c8 ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor a23ff115512da5fe3167835a88d582c427585b8e ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java d53ebc65174d66bfeee25fd2891c69c78f9137ee ql/src/test/queries/clientpositive/avro_compression_enabled_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_decimal_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_joins_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_partitioned_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_schema_evolution_native.q PRE-CREATION ql/src/test/results/clientpositive/avro_compression_enabled_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_decimal_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_joins_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_partitioned_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_schema_evolution_native.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 0db12437406170686a21b6055d83156fe5d6a55f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 1fe31e0034f8988d03a0c51a90904bb93e7cb157 
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 4564e75d9bfc73f8e10f160e2535f1a08b90ff79 serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java PRE-CREATION serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java PRE-CREATION serde/src/test/resources/avro-nested-struct.avsc PRE-CREATION serde/src/test/resources/avro-struct.avsc PRE-CREATION Diff: https://reviews.apache.org/r/23387/diff/ Testing --- Added qTests and unit tests Thanks, Ashish Singh
Re: Review Request 23387: HIVE-6806: Native avro support
On July 18, 2014, 1:57 p.m., Tom White wrote:
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java, line 112
https://reviews.apache.org/r/23387/diff/9/?file=634614#file634614line112

    Is it possible to default name to table name, namespace to database name, and doc to table comment?

Ashish Singh wrote:
    I was planning to do this, but it slipped my mind. Thanks for pointing this out. I don't think it is possible to retrieve the database name inside the serde. Addressed name and doc.

Tom White wrote:
    Thanks for fixing this. There's no test that name and comment are picked up from the table definition - perhaps you could add one, or at least confirm it manually. I couldn't see where in Hive they get set... Otherwise, +1 from me - thanks for addressing all my comments. This is a great feature to add.

Ashish Singh wrote:
    Tom, actually it's tested by all the unit tests now. Look at the diffs in https://reviews.apache.org/r/23387/diff/8-9/#index_header.

Unless I am missing something, the unit tests in TestTypeInfoToSchema don't test this, since they hardcode the table name to "avrotest" and the table comment to "This is to test hive-avro". Perhaps this is tested indirectly through the ql tests, since Avro schema resolution rules mean that a record schema's name must match for both the reader and writer. However, this isn't true for comments (Avro schema doc), and it would be good to confirm that inserting data into an Avro-backed Hive table creates Avro files with the expected top-level name and comment.

- Tom

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/#review48120 ---

On July 19, 2014, 5:11 a.m., Ashish Singh wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/ ---
(Updated July 19, 2014, 5:11 a.m.)

Review request for hive. 
Bugs: HIVE-6806 https://issues.apache.org/jira/browse/HIVE-6806 Repository: hive-git Description --- HIVE-6806: Native avro support Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/AvroStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 1bae0a8fee04049f90b16d813ff4c96707b349c8 ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor a23ff115512da5fe3167835a88d582c427585b8e ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java d53ebc65174d66bfeee25fd2891c69c78f9137ee ql/src/test/queries/clientpositive/avro_compression_enabled_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_decimal_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_joins_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_partitioned_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_schema_evolution_native.q PRE-CREATION ql/src/test/results/clientpositive/avro_compression_enabled_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_decimal_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_joins_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_partitioned_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_schema_evolution_native.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 0db12437406170686a21b6055d83156fe5d6a55f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 1fe31e0034f8988d03a0c51a90904bb93e7cb157 serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerdeUtils.java 4564e75d9bfc73f8e10f160e2535f1a08b90ff79 serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java PRE-CREATION serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java PRE-CREATION Diff: 
https://reviews.apache.org/r/23387/diff/ Testing --- Added qTests and unit tests Thanks, Ashish Singh
Re: Review Request 23387: HIVE-6806: Native avro support
On July 18, 2014, 1:57 p.m., Tom White wrote:
serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java, line 112
https://reviews.apache.org/r/23387/diff/9/?file=634614#file634614line112

    Is it possible to default name to table name, namespace to database name, and doc to table comment?

Ashish Singh wrote:
    I was planning to do this, but it slipped my mind. Thanks for pointing this out. I don't think it is possible to retrieve the database name inside the serde. Addressed name and doc.

Thanks for fixing this. There's no test that name and comment are picked up from the table definition - perhaps you could add one, or at least confirm it manually. I couldn't see where in Hive they get set... Otherwise, +1 from me - thanks for addressing all my comments. This is a great feature to add.

- Tom

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/#review48120 ---

On July 19, 2014, 5:11 a.m., Ashish Singh wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/ ---
(Updated July 19, 2014, 5:11 a.m.)

Review request for hive. 
Re: Review Request 23387: HIVE-6806: Native avro support
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/#review48120 ---

Getting closer - a few more comments.

ql/src/test/queries/clientpositive/avro_schema_evolution_native.q
https://reviews.apache.org/r/23387/#comment84420

    It would be nice to support ALTER TABLE t ADD COLUMN ..., but that is an enhancement - file another JIRA for that?

serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java
https://reviews.apache.org/r/23387/#comment84417

    Is it possible to default name to table name, namespace to database name, and doc to table comment?

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84416

    Remove this now.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84418

    Should be null, not "", for no doc.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84419

    Null rather than "" to indicate the lack of a namespace. (This is fixed in https://issues.apache.org/jira/browse/AVRO-1535, but that hasn't been released yet.)

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84413

    This should add null to the start of the list. Also, it needs a test for this case.

- Tom White

On July 17, 2014, 9:10 p.m., Ashish Singh wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/ ---
(Updated July 17, 2014, 9:10 p.m.)

Review request for hive. 
Bugs: HIVE-6806 https://issues.apache.org/jira/browse/HIVE-6806 Repository: hive-git Description --- HIVE-6806: Native avro support Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/AvroStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 1bae0a8fee04049f90b16d813ff4c96707b349c8 ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor a23ff115512da5fe3167835a88d582c427585b8e ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java d53ebc65174d66bfeee25fd2891c69c78f9137ee ql/src/test/queries/clientpositive/avro_compression_enabled_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_decimal_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_joins_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_partitioned_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_schema_evolution_native.q PRE-CREATION ql/src/test/results/clientpositive/avro_compression_enabled_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_decimal_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_joins_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_partitioned_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_schema_evolution_native.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 0db12437406170686a21b6055d83156fe5d6a55f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 1fe31e0034f8988d03a0c51a90904bb93e7cb157 serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java PRE-CREATION serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java PRE-CREATION Diff: https://reviews.apache.org/r/23387/diff/ Testing --- Added qTests and unit tests Thanks, Ashish Singh
Re: Review Request 23387: HIVE-6806: Native avro support
On July 17, 2014, 1:49 p.m., Tom White wrote:
serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java, line 229
https://reviews.apache.org/r/23387/diff/8/?file=634160#file634160line229

    It would be simpler to make sure that NULL is included (and is the first branch in the union) in the createAvroUnion() method, and just fall through here.

Ashish Singh wrote:
    I don't think this is desirable or feasible without redesigning many parts, for no obvious gain. createAvroUnion() only creates a schema for a union, based on the union typeinfo passed to it. If I hack it to add null to all unions, I will still need to handle a union differently here, as a union of unions is not possible.

That's OK. What you have now is much better, although I found a small bug where the null isn't being added to the start of the list (see other comment).

- Tom

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/#review48004 ---

On July 17, 2014, 9:10 p.m., Ashish Singh wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/ ---
(Updated July 17, 2014, 9:10 p.m.)

Review request for hive. 
Bugs: HIVE-6806 https://issues.apache.org/jira/browse/HIVE-6806 Repository: hive-git Description --- HIVE-6806: Native avro support Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/AvroStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 1bae0a8fee04049f90b16d813ff4c96707b349c8 ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor a23ff115512da5fe3167835a88d582c427585b8e ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java d53ebc65174d66bfeee25fd2891c69c78f9137ee ql/src/test/queries/clientpositive/avro_compression_enabled_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_decimal_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_joins_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_partitioned_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_schema_evolution_native.q PRE-CREATION ql/src/test/results/clientpositive/avro_compression_enabled_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_decimal_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_joins_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_partitioned_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_schema_evolution_native.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 0db12437406170686a21b6055d83156fe5d6a55f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 1fe31e0034f8988d03a0c51a90904bb93e7cb157 serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java PRE-CREATION serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java PRE-CREATION Diff: https://reviews.apache.org/r/23387/diff/ Testing --- Added qTests and unit tests Thanks, Ashish Singh
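The union handling discussed in this exchange (include NULL exactly once, and as the first branch) can be sketched as follows. This is a minimal illustration, not the code from the patch: the class and method names are hypothetical, and branch types are modeled as plain strings rather than org.apache.avro.Schema objects so the example runs without Avro on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hive or Avro. Union branches are
// represented by their Avro type names ("null", "int", ...) as an assumption
// made only to keep the sketch self-contained.
final class NullableUnionSketch {
    private NullableUnionSketch() {}

    // Returns a new branch list that starts with "null" and contains it
    // exactly once, preserving the order of all other branches.
    static List<String> withLeadingNull(List<String> branches) {
        List<String> result = new ArrayList<>();
        result.add("null");
        for (String branch : branches) {
            if (!"null".equals(branch)) {
                result.add(branch);
            }
        }
        return result;
    }
}
```

The same function covers the three cases raised in the review: a union without NULL, a union that already starts with NULL, and a union where NULL appears in a later branch.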
Re: Review Request 23387: HIVE-6806: Native avro support
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/#review48004 ---

Ashish, thanks for addressing my feedback. Here's a bit more.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84252

    Still need to pass the Hive column definition here as the field comment.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84253

    It would be simpler to make sure that NULL is included (and is the first branch in the union) in the createAvroUnion() method, and just fall through here.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84256

    If you made all records have names then this case statement wouldn't be needed as the default case would be used. Also, having non-deterministic schemas is something we should avoid, since otherwise files in different partitions or written at different times would have schemas that differed only in the record names. Instead, use a counter for gensym - this will work since one instance of TypeInfoToSchema is only used to convert one schema (although it might be a good idea to enforce that).

serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84254

    Also need to test the case when the union includes NULL, to check it's not included twice. Also, when it's included but not in the first branch of the union.

- Tom White

On July 17, 2014, 2:50 a.m., Ashish Singh wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/ ---
(Updated July 17, 2014, 2:50 a.m.)

Review request for hive. 
Bugs: HIVE-6806 https://issues.apache.org/jira/browse/HIVE-6806 Repository: hive-git Description --- HIVE-6806: Native avro support Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/AvroStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 1bae0a8fee04049f90b16d813ff4c96707b349c8 ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor a23ff115512da5fe3167835a88d582c427585b8e ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java d53ebc65174d66bfeee25fd2891c69c78f9137ee ql/src/test/queries/clientpositive/avro_compression_enabled_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_decimal_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_joins_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_partitioned_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_schema_evolution_native.q PRE-CREATION ql/src/test/results/clientpositive/avro_compression_enabled_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_decimal_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_joins_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_partitioned_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_schema_evolution_native.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroDeserializer.java 0db12437406170686a21b6055d83156fe5d6a55f serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 1fe31e0034f8988d03a0c51a90904bb93e7cb157 serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java PRE-CREATION serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java PRE-CREATION Diff: https://reviews.apache.org/r/23387/diff/ Testing --- Added qTests and unit tests Thanks, Ashish Singh
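The "counter for gensym" suggestion above can be illustrated with a small sketch. The class name, the prefix, and the "prefix_N" naming scheme are assumptions for illustration only; the thread does not say what names the patch ends up generating.

```java
// Hypothetical name generator, not taken from TypeInfoToSchema. Each instance
// hands out names from its own counter, so converting the same schema twice
// with two fresh instances yields the same record names.
final class RecordNameGenerator {
    private final String prefix;
    private int counter = 0;

    RecordNameGenerator(String prefix) {
        this.prefix = prefix;
    }

    // Returns prefix_0, prefix_1, ... in call order.
    String next() {
        String name = prefix + "_" + counter;
        counter++;
        return name;
    }
}
```

Because the sequence depends only on the order of calls, this stays deterministic as long as one generator instance is used for exactly one schema conversion, which is the condition the review comment points out.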
Re: Review Request 23387: HIVE-6806: Native avro support
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/#review47894 ---

Overall looks good. I added a few comments regarding the Avro schema mapping. It would also be useful to update the documentation (https://cwiki.apache.org/confluence/display/Hive/AvroSerDe) to cover this feature once it is released, especially the type mappings table.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84124

    The doc string here could be the Hive column definition, which would be helpful for debugging.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84121

    VOID is missing - it maps to Schema.Type.NULL in Avro. Also, SHORT and BYTE could map to INT. And CHAR, VARCHAR could map to STRING.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84123

    Are optional fields supported properly? All schemas should probably be wrapped in an Avro null union to allow values to be null:

    private static Schema optional(Schema schema) {
      return Schema.createUnion(Arrays.asList(Schema.create(Schema.Type.NULL), schema));
    }

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84119

    This should be BINARY. BYTE can be mapped to Avro Schema.Type.INT.

serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84122

    Throw an exception if the key type isn't string.

serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84126

    COLUMN_NAMES

serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java
https://reviews.apache.org/r/23387/#comment84125

    Pass this as a parameter to getAvroSchemaString() - it's a bit odd to set it in tests then read it in getAvroSchemaString(). 
serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java https://reviews.apache.org/r/23387/#comment84127 Add a field for each Hive type to test they can all work in the context of a record. Also add a nested record. - Tom White On July 16, 2014, 3:35 a.m., Ashish Singh wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23387/ --- (Updated July 16, 2014, 3:35 a.m.) Review request for hive. Bugs: HIVE-6806 https://issues.apache.org/jira/browse/HIVE-6806 Repository: hive-git Description --- HIVE-6806: Native avro support Diffs - ql/src/java/org/apache/hadoop/hive/ql/io/AvroStorageFormatDescriptor.java PRE-CREATION ql/src/java/org/apache/hadoop/hive/ql/io/IOConstants.java 1bae0a8fee04049f90b16d813ff4c96707b349c8 ql/src/main/resources/META-INF/services/org.apache.hadoop.hive.ql.io.StorageFormatDescriptor a23ff115512da5fe3167835a88d582c427585b8e ql/src/test/org/apache/hadoop/hive/ql/io/TestStorageFormatDescriptor.java d53ebc65174d66bfeee25fd2891c69c78f9137ee ql/src/test/queries/clientpositive/avro_compression_enabled_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_decimal_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_joins_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_partitioned_native.q PRE-CREATION ql/src/test/queries/clientpositive/avro_schema_evolution_native.q PRE-CREATION ql/src/test/results/clientpositive/avro_compression_enabled_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_decimal_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_joins_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_partitioned_native.q.out PRE-CREATION ql/src/test/results/clientpositive/avro_schema_evolution_native.q.out PRE-CREATION serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroSerDe.java 
1fe31e0034f8988d03a0c51a90904bb93e7cb157 serde/src/java/org/apache/hadoop/hive/serde2/avro/TypeInfoToSchema.java PRE-CREATION serde/src/test/org/apache/hadoop/hive/serde2/avro/TestTypeInfoToSchema.java PRE-CREATION Diff: https://reviews.apache.org/r/23387/diff/ Testing --- Added qTests and unit tests Thanks, Ashish Singh
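The type-mapping comments in this review (VOID to NULL, BYTE and SHORT to INT, CHAR and VARCHAR to STRING, and a string-only map key) can be summarized in a short sketch. It is not the TypeInfoToSchema code: the enum below is a local stand-in listing only the categories named in the review, Avro types are given by their type names as strings, and mapping BINARY to Avro "bytes" is an assumption inferred from the BINARY/BYTE comment rather than something the thread states outright.

```java
// Hypothetical stand-in for the mapping discussed in the review; it does not
// use Hive's or Avro's classes so that it compiles on its own.
final class HiveToAvroTypeSketch {
    private HiveToAvroTypeSketch() {}

    // Only the Hive primitive categories mentioned in the review comments.
    enum HiveCategory { VOID, BYTE, SHORT, INT, STRING, CHAR, VARCHAR, BINARY }

    // Returns the Avro primitive type name for a Hive category.
    static String avroTypeName(HiveCategory category) {
        switch (category) {
            case VOID:
                return "null";
            case BYTE:
            case SHORT:
            case INT:
                return "int";
            case STRING:
            case CHAR:
            case VARCHAR:
                return "string";
            case BINARY:
                return "bytes"; // assumption, see the note above
            default:
                throw new IllegalArgumentException("Unmapped category: " + category);
        }
    }

    // Avro map keys are always strings, so any other key category is rejected.
    static void requireStringKey(HiveCategory keyCategory) {
        if (keyCategory != HiveCategory.STRING) {
            throw new IllegalArgumentException(
                "Avro maps require string keys, got: " + keyCategory);
        }
    }
}
```

Whether CHAR or VARCHAR keys should also be accepted is left open here; the sketch follows the review comment literally and accepts only STRING.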
Re: Review Request: HIVE-2468 - Make Hive compile against Hadoop 0.22
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2762/ --- (Updated 2011-11-30 16:57:27.344021) Review request for hive. Summary --- See https://issues.apache.org/jira/browse/HIVE-2468 This addresses bug HIVE-2468. https://issues.apache.org/jira/browse/HIVE-2468 Diffs (updated) - trunk/bin/hive 1203794 trunk/build-common.xml 1203794 trunk/build.properties 1203794 trunk/build.xml 1203794 trunk/conf/hive-default.xml 1203794 trunk/contrib/build.xml 1203794 trunk/hbase-handler/build.xml 1203794 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 1203794 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableOutputFormat.java 1203794 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java 1203794 trunk/hwi/build.xml 1203794 trunk/jdbc/build.xml 1203794 trunk/ql/build.xml 1203794 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1203794 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JobTrackerURLResolver.java 1203794 trunk/service/build.xml 1203794 trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java 1203794 trunk/shims/build.xml 1203794 trunk/shims/src/0.20/java/org/apache/hadoop/fs/ProxyFileSystem.java 1203794 trunk/shims/src/0.20/java/org/apache/hadoop/fs/ProxyLocalFileSystem.java 1203794 trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 1203794 trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 1203794 trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Jetty23Shims.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/thrift/DelegationTokenIdentifier23.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/thrift/DelegationTokenSelector23.java PRE-CREATION trunk/shims/src/common/java/org/apache/hadoop/fs/ProxyFileSystem.java PRE-CREATION 
trunk/shims/src/common/java/org/apache/hadoop/fs/ProxyLocalFileSystem.java PRE-CREATION trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java 1203794 trunk/shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java 1203794 Diff: https://reviews.apache.org/r/2762/diff Testing --- Thanks, Tom
[jira] [Updated] (HIVE-2468) Make Hive compile against Hadoop 0.23
[ https://issues.apache.org/jira/browse/HIVE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HIVE-2468: Attachment: HIVE-2468.patch This patch fixes org.apache.hadoop.hive.ql.io.TestSymlinkTextInputFormat (the CombineFileSplit problem), and the classpath problems. I ran the whole unit test suite against Hadoop 0.23 and over 95% of the tests pass now. Make Hive compile against Hadoop 0.23 - Key: HIVE-2468 URL: https://issues.apache.org/jira/browse/HIVE-2468 Project: Hive Issue Type: Task Reporter: Konstantin Shvachko Assignee: Tom White Attachments: HIVE-2468.patch, HIVE-2468.patch, HIVE-2468.patch, HIVE-2468.patch Due to restructure of Hadoop 0.22 branch compared to Hadoop 0.20 Hive does not compile against 0.22 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2468) Make Hive compile against Hadoop 0.23
[ https://issues.apache.org/jira/browse/HIVE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13155593#comment-13155593 ] Tom White commented on HIVE-2468: - Thanks for the report, Amareshwari. I cleared my Ivy cache and hit the same problem (it was compiling fine against 0.23.0 before). I'll create a new patch. Make Hive compile against Hadoop 0.23 - Key: HIVE-2468 URL: https://issues.apache.org/jira/browse/HIVE-2468 Project: Hive Issue Type: Task Reporter: Konstantin Shvachko Assignee: Tom White Attachments: HIVE-2468.patch Due to restructure of Hadoop 0.22 branch compared to Hadoop 0.20 Hive does not compile against 0.22
[jira] [Updated] (HIVE-2468) Make Hive compile against Hadoop 0.23
[ https://issues.apache.org/jira/browse/HIVE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HIVE-2468: Attachment: HIVE-2468.patch It's not complete yet, but here's an updated patch. I still need to fix shim tests, and the missing ProxyLocalFileSystem in 0.23. Make Hive compile against Hadoop 0.23 - Key: HIVE-2468 URL: https://issues.apache.org/jira/browse/HIVE-2468 Project: Hive Issue Type: Task Reporter: Konstantin Shvachko Assignee: Tom White Attachments: HIVE-2468.patch, HIVE-2468.patch Due to restructure of Hadoop 0.22 branch compared to Hadoop 0.20 Hive does not compile against 0.22
[jira] [Commented] (HIVE-2468) Make Hive compile against Hadoop 0.23
[ https://issues.apache.org/jira/browse/HIVE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153401#comment-13153401 ] Tom White commented on HIVE-2468: - All tests pass against 0.20. Make Hive compile against Hadoop 0.23 - Key: HIVE-2468 URL: https://issues.apache.org/jira/browse/HIVE-2468 Project: Hive Issue Type: Task Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: HIVE-2468.patch Due to restructure of Hadoop 0.22 branch compared to Hadoop 0.20 Hive does not compile against 0.22
Re: Review Request: HIVE-2468 - Make Hive compile against Hadoop 0.22
On 2011-11-17 07:33:27, Carl Steinbach wrote:
trunk/build.properties, line 13
https://reviews.apache.org/r/2762/diff/1/?file=56790#file56790line13

For the time being we need to stick with 0.20.1 as the default Hadoop version against which Hive is tested, but we should start using 0.23.0 for the security tests, and should ensure that all tests pass when run against 0.23.0. Please set hadoop.version to 0.20.1 and hadoop.security.version to 0.23.0, and ensure that all tests pass when run as ant test -Dhadoop.version=0.23.0. Also, I added 0.23.0 to the cloudera and facebook hive-deps mirrors, so please reference those locations instead.

Thanks for taking a look Carl. I agree that we should leave 0.20.1 as the default. I think security tests should be left at the default for now too. This issue was to get Hive compiling against Hadoop 0.23.0, and I imagined any unit test failures against 0.23.0 could be fixed in follow up issues. Does that sound reasonable?

- Tom

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2762/#review3316 ---

On 2011-11-07 23:55:47, Tom White wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2762/ --- (Updated 2011-11-07 23:55:47) Review request for hive. Summary --- See https://issues.apache.org/jira/browse/HIVE-2468 This addresses bug HIVE-2468.
https://issues.apache.org/jira/browse/HIVE-2468 Diffs - trunk/build.properties 1196775 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 1196775 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableOutputFormat.java 1196775 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java 1196775 trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java 1196775 trunk/shims/build.xml 1196775 trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 1196775 trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 1196775 trunk/shims/src/0.23/java/org/apache/hadoop/fs/ProxyFileSystem.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/fs/ProxyLocalFileSystem.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/EmptyShim.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/HiveHarFileSystem.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Jetty23Shims.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/thrift/DelegationTokenIdentifier.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/thrift/DelegationTokenSelector.java PRE-CREATION trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java 1196775 trunk/shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java 1196775 Diff: https://reviews.apache.org/r/2762/diff Testing --- Thanks, Tom
Review Request: HIVE-2468 - Make Hive compile against Hadoop 0.22
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/2762/ --- Review request for hive. Summary --- See https://issues.apache.org/jira/browse/HIVE-2468 This addresses bug HIVE-2468. https://issues.apache.org/jira/browse/HIVE-2468 Diffs - trunk/build.properties 1196775 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableInputFormat.java 1196775 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHBaseTableOutputFormat.java 1196775 trunk/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java 1196775 trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java 1196775 trunk/shims/build.xml 1196775 trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java 1196775 trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 1196775 trunk/shims/src/0.23/java/org/apache/hadoop/fs/ProxyFileSystem.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/fs/ProxyLocalFileSystem.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/EmptyShim.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/HiveHarFileSystem.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/shims/Jetty23Shims.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/thrift/DelegationTokenIdentifier.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java PRE-CREATION trunk/shims/src/0.23/java/org/apache/hadoop/hive/thrift/DelegationTokenSelector.java PRE-CREATION trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java 1196775 trunk/shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java 1196775 Diff: https://reviews.apache.org/r/2762/diff Testing --- Thanks, Tom
[jira] [Updated] (HIVE-2468) Make Hive compile against Hadoop 0.22
[ https://issues.apache.org/jira/browse/HIVE-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HIVE-2468: Attachment: HIVE-2468.patch Here's a patch that Roman Shaposhnik and I wrote to get Hive compiling against Hadoop 0.23 (I haven't tried it against 0.22). Make Hive compile against Hadoop 0.22 - Key: HIVE-2468 URL: https://issues.apache.org/jira/browse/HIVE-2468 Project: Hive Issue Type: Task Reporter: Konstantin Shvachko Assignee: Konstantin Shvachko Attachments: HIVE-2468.patch Due to restructure of Hadoop 0.22 branch compared to Hadoop 0.20 Hive does not compile against 0.22
Re: Review Request: HIVE-2457. Files in Avro-backed Hive tables do not have a .avro extension
On 2011-09-21 00:26:28, Carl Steinbach wrote:
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 892
https://reviews.apache.org/r/1989/diff/1/?file=2#file2line892

Please add this configuration property to HiveConf and hive-default.xml

Tom White wrote: Does Hive have the concept of private configuration properties? This is one that would be set by SerDes, not by users, which is why I didn't add it to HiveConf/hive-default.xml.

Carl Steinbach wrote: No, it doesn't, but it should. I'll file a JIRA. In the meantime this property should still be included in hive-default and HiveConf.

OK, I added the property to hive-default and HiveConf in the latest patch.

- Tom

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1989/#review1982 ---

On 2011-09-20 22:28:53, Carl Steinbach wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1989/ --- (Updated 2011-09-20 22:28:53) Review request for hive. Summary --- Review for HIVE-2457 This addresses bug HIVE-2457. https://issues.apache.org/jira/browse/HIVE-2457 Diffs - trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 1173340 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1173340 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 1173340 trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java PRE-CREATION Diff: https://reviews.apache.org/r/1989/diff Testing --- Thanks, Carl
[jira] [Updated] (HIVE-2457) Files in Avro-backed Hive tables do not have a .avro extension
[ https://issues.apache.org/jira/browse/HIVE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HIVE-2457: Attachment: HIVE-2457.patch Here's a new patch which addresses all of Carl's feedback. (I don't seem to be able to upload to RB since I didn't create the original review.) Files in Avro-backed Hive tables do not have a .avro extension Key: HIVE-2457 URL: https://issues.apache.org/jira/browse/HIVE-2457 Project: Hive Issue Type: Improvement Components: Query Processor, Serializers/Deserializers Reporter: Tom White Assignee: Tom White Attachments: HIVE-2457.patch, HIVE-2457.patch When using the Avro SerDe (see HIVE-895, https://github.com/jghoman/haivvreo) the files created for an Avro table do not have a .avro extension, which causes problems for tools like Avro MapReduce or Sqoop which expect the extension. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2457) Files in Avro-backed Hive tables do not have a .avro extension
[ https://issues.apache.org/jira/browse/HIVE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HIVE-2457: Status: Patch Available (was: Open) Files in Avro-backed Hive tables do not have a .avro extension Key: HIVE-2457 URL: https://issues.apache.org/jira/browse/HIVE-2457 Project: Hive Issue Type: Improvement Components: Query Processor, Serializers/Deserializers Reporter: Tom White Assignee: Tom White Attachments: HIVE-2457.patch, HIVE-2457.patch When using the Avro SerDe (see HIVE-895, https://github.com/jghoman/haivvreo) the files created for an Avro table do not have a .avro extension, which causes problems for tools like Avro MapReduce or Sqoop which expect the extension.
[jira] [Updated] (HIVE-2457) Files in Avro-backed Hive tables do not have a .avro extension
[ https://issues.apache.org/jira/browse/HIVE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HIVE-2457: Attachment: HIVE-2457.patch Here's a patch which introduces a new property, {{hive.output.file.extension}}, that SerDes can set to control the file extension. If not set, it falls back to the current rule: add a codec extension (e.g. .gz) only in the case of text files. I've tested it by modifying Haivvreo to set the new property (https://github.com/tomwhite/haivvreo/commit/899a2c44cd25aca6c469d6c2fdb8419e66dda380). With this change I see that new files in an Avro-backed table created by Hive have a '.avro' extension. Files in Avro-backed Hive tables do not have a .avro extension Key: HIVE-2457 URL: https://issues.apache.org/jira/browse/HIVE-2457 Project: Hive Issue Type: Improvement Components: Query Processor, Serializers/Deserializers Reporter: Tom White Attachments: HIVE-2457.patch When using the Avro SerDe (see HIVE-895, https://github.com/jghoman/haivvreo) the files created for an Avro table do not have a .avro extension, which causes problems for tools like Avro MapReduce or Sqoop which expect the extension.
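The fallback rule described in the comment above can be sketched roughly as follows. This is an illustration only, not the code from the attached patch: the class name, the method name, the isTextOutput flag and the use of a plain Map in place of a Hadoop configuration object are all hypothetical. Only the property name hive.output.file.extension comes from the comment.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not the code from the HIVE-2457 patch. */
public class OutputFileExtension {

    // Property name as given in the JIRA comment.
    static final String EXTENSION_PROPERTY = "hive.output.file.extension";

    /**
     * Picks the extension to append to an output file name.
     *
     * @param conf           job configuration, modelled here as a plain map (assumption)
     * @param isTextOutput   whether the output is a plain text file (hypothetical flag)
     * @param codecExtension extension of the compression codec, e.g. ".gz",
     *                       or null when the output is not compressed
     */
    static String resolveExtension(Map<String, String> conf,
                                   boolean isTextOutput,
                                   String codecExtension) {
        // An extension set explicitly (for example by a SerDe) always wins.
        String configured = conf.get(EXTENSION_PROPERTY);
        if (configured != null) {
            return configured;
        }
        // Fallback: add the codec extension only for compressed text files.
        if (isTextOutput && codecExtension != null) {
            return codecExtension;
        }
        return "";
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Property unset, compressed text output: falls back to the codec extension.
        System.out.println(resolveExtension(conf, true, ".gz"));
        // A SerDe sets the property: the configured extension is used.
        conf.put(EXTENSION_PROPERTY, ".avro");
        System.out.println(resolveExtension(conf, false, null));
    }
}
```

Keeping the decision in one small function means a SerDe only has to set a single property, and tables that never set it keep the existing behaviour.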
[jira] [Updated] (HIVE-2457) Files in Avro-backed Hive tables do not have a .avro extension
[ https://issues.apache.org/jira/browse/HIVE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom White updated HIVE-2457: Status: Patch Available (was: Open) Files in Avro-backed Hive tables do not have a .avro extension Key: HIVE-2457 URL: https://issues.apache.org/jira/browse/HIVE-2457 Project: Hive Issue Type: Improvement Components: Query Processor, Serializers/Deserializers Reporter: Tom White Attachments: HIVE-2457.patch When using the Avro SerDe (see HIVE-895, https://github.com/jghoman/haivvreo) the files created for an Avro table do not have a .avro extension, which causes problems for tools like Avro MapReduce or Sqoop which expect the extension.
Re: Review Request: HIVE-2457. Files in Avro-backed Hive tables do not have a .avro extension
On 2011-09-21 00:26:28, Carl Steinbach wrote:

Thanks for the review!

On 2011-09-21 00:26:28, Carl Steinbach wrote:
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java, line 892
https://reviews.apache.org/r/1989/diff/1/?file=2#file2line892

Please add this configuration property to HiveConf and hive-default.xml

Does Hive have the concept of private configuration properties? This is one that would be set by SerDes, not by users, which is why I didn't add it to HiveConf/hive-default.xml.

On 2011-09-21 00:26:28, Carl Steinbach wrote:
trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java, line 28
https://reviews.apache.org/r/1989/diff/1/?file=4#file4line28

In addition to the unit test it would also be nice to test this via TestCliDriver. It should be possible to verify this from the CLI by doing something like this:

-- Set the filename suffix property. Then create a new table and stream
-- data into it. Then use the dfs cat command to dump the contents of
-- the raw files in the warehouse to stdout
hive dfs -cat ${hiveconf:hive.metastore.warehouse.dir}/tablename/*.avro;

Good idea. I'll look at adding a test like this.

- Tom

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1989/#review1982 ---

On 2011-09-20 22:28:53, Carl Steinbach wrote:
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1989/ --- (Updated 2011-09-20 22:28:53) Review request for hive. Summary --- Review for HIVE-2457 This addresses bug HIVE-2457. https://issues.apache.org/jira/browse/HIVE-2457 Diffs - trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 1173340 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1173340 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 1173340 trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java PRE-CREATION Diff: https://reviews.apache.org/r/1989/diff Testing --- Thanks, Carl