-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18254/
-----------------------------------------------------------

(Updated Feb. 21, 2014, 1:31 a.m.)


Review request for hive.


Changes
-------

Fix test.  My previous attempt did not make the output fully deterministic: 
the srcbucket rows needed to be sorted by both key and value to get a 
deterministic result.

Decided to use table 'src' instead, which has unique key,value pairs.
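
To illustrate why sorting on the key alone was not enough, here is a minimal 
Java sketch.  It is not Hive code: the class name, helper names and the rows 
are made up for illustration and are not the contents of srcbucket.  It only 
shows that when keys repeat, a key-only sort leaves the order of tied rows 
dependent on arrival order, while a key,value sort does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; rows are {key, value} pairs.
final class DeterministicSortSketch {

    // Sort on key only: rows with equal keys keep their arrival order,
    // so the result depends on the order in which rows were produced.
    static List<String[]> sortByKey(List<String[]> rows) {
        List<String[]> out = new ArrayList<>(rows);
        out.sort(Comparator.comparing((String[] r) -> r[0]));
        return out;
    }

    // Sort on key, then value: ties on key are broken by value, so any
    // arrival order of the same rows gives the same result.
    static List<String[]> sortByKeyAndValue(List<String[]> rows) {
        List<String[]> out = new ArrayList<>(rows);
        out.sort(Comparator.comparing((String[] r) -> r[0])
                .thenComparing((String[] r) -> r[1]));
        return out;
    }

    // Render rows as "key=value" joined by commas, for easy comparison.
    static String render(List<String[]> rows) {
        List<String> parts = new ArrayList<>();
        for (String[] r : rows) {
            parts.add(r[0] + "=" + r[1]);
        }
        return String.join(",", parts);
    }

    public static void main(String[] args) {
        // The same three rows, arriving in two different orders.
        List<String[]> first = Arrays.asList(
                new String[] {"1", "b"}, new String[] {"1", "a"}, new String[] {"0", "c"});
        List<String[]> second = Arrays.asList(
                new String[] {"1", "a"}, new String[] {"1", "b"}, new String[] {"0", "c"});

        System.out.println("key only:  " + render(sortByKey(first))
                + " vs " + render(sortByKey(second)));
        System.out.println("key,value: " + render(sortByKeyAndValue(first))
                + " vs " + render(sortByKeyAndValue(second)));
    }
}
```

With unique key,value pairs, as in 'src', the second form always has a single 
possible ordering.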


Bugs: HIVE-6375
    https://issues.apache.org/jira/browse/HIVE-6375


Repository: hive-git


Description
-------

There is a Hive bug in SemanticAnalyzer: it chooses different names for the 
same columns in the CreateTable task and the FileSink task.  
columnInfo.getInternalName() was used in one place, while fieldSchema used 
columnInfo.getAlias() if it was available.  This change makes both consistent, 
favoring columnInfo.getAlias() if it is available.
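
The naming rule described above can be sketched roughly as follows.  This is 
not the patch itself: the Column class is a hypothetical stand-in for Hive's 
ColumnInfo that models only the two accessors mentioned here, chooseName and 
namesFor are invented helper names, and the sample column names are 
illustrative.  The point is that both consumers derive names through one rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual SemanticAnalyzer code.
final class ColumnNameSketch {

    // Stand-in for ColumnInfo with only the two accessors discussed here.
    static final class Column {
        private final String internalName;
        private final String alias; // null when no alias is available

        Column(String internalName, String alias) {
            this.internalName = internalName;
            this.alias = alias;
        }

        String getInternalName() { return internalName; }
        String getAlias() { return alias; }
    }

    // Single rule: prefer the alias, fall back to the internal name.
    static String chooseName(Column c) {
        return c.getAlias() != null ? c.getAlias() : c.getInternalName();
    }

    // Any consumer (table schema or file writer) that needs column names
    // calls this, so the two can no longer disagree.
    static List<String> namesFor(List<Column> columns) {
        List<String> names = new ArrayList<>();
        for (Column c : columns) {
            names.add(chooseName(c));
        }
        return names;
    }

    public static void main(String[] args) {
        List<Column> cols = new ArrayList<>();
        cols.add(new Column("_col0", "key")); // alias available
        cols.add(new Column("_col1", null));  // no alias, internal name used
        System.out.println(namesFor(cols));
    }
}
```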

This was not revealed before because other file formats like RCFile seem to 
use column-ordinal position, and Avro stores the schema separately altogether.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a01aa0e 
  ql/src/test/queries/clientpositive/parquet_ctas.q PRE-CREATION 
  ql/src/test/results/clientpositive/ctas.q.out 9668855 
  ql/src/test/results/clientpositive/ctas_hadoop20.q.out 2c0059d 
  ql/src/test/results/clientpositive/merge3.q.out ae7dc71 
  ql/src/test/results/clientpositive/parquet_ctas.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/18254/diff/


Testing
-------

Added parquet_ctas.q.  Covers cases where the column name is taken directly 
from the input table (implied alias), where the name is auto-generated, where 
the name is specified as an alias, and a mix of the three.


Thanks,

Szehon Ho
