-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18254/
-----------------------------------------------------------

(Updated Feb. 21, 2014, 1:31 a.m.)


Review request for hive.


Changes
-------

Fix test. My previous attempt did not make the output fully deterministic: srcbucket needs to be sorted by both key and value to get a deterministic result. Decided to use table 'src' instead, which has unique key,value pairs.


Bugs: HIVE-6375
    https://issues.apache.org/jira/browse/HIVE-6375


Repository: hive-git


Description
-------

There is a Hive bug in SemanticAnalyzer that chooses different names for columns in the CreateTable task and the FileSink task. columnInfo.getInternalName() was used in one place, while fieldSchema still used columnInfo.getAlias() if it is available. This change makes both consistent, favoring columnInfo.getAlias() if it is available.

This was not revealed before because other file formats like RCFile seem to use column-ordinal position, and Avro stores the schema separately altogether.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a01aa0e
  ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION
  ql/src/test/results/clientpositive/ctas.q.out 9668855
  ql/src/test/results/clientpositive/ctas_hadoop20.q.out 2c0059d
  ql/src/test/results/clientpositive/merge3.q.out ae7dc71
  ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/18254/diff/


Testing
-------

Added parquet_ctas.q. Covers cases where the column name is taken directly from the input table (implied alias), where the name is auto-generated, where the name is specified as an alias, and a mix of the three.


Thanks,

Szehon Ho