[ https://issues.apache.org/jira/browse/HIVE-3753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506924#comment-13506924 ]
Matthew Rathbone commented on HIVE-3753:
----------------------------------------

We thought of that too. Unfortunately it makes no difference, and the same issue persists.

> 'CTAS' and INSERT OVERWRITE send different column names to the underlying SerDe
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-3753
>                 URL: https://issues.apache.org/jira/browse/HIVE-3753
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.9.0
>            Reporter: Matthew Rathbone
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> A good example is with a JSON SerDe (https://github.com/rathboma/Hive-JSON-Serde-1).
> Here is a simple example of how the two results differ:
>
> CREATE TABLE foo ROW FORMAT SERDE '....JsonSerDe' AS SELECT host FROM table1;
> generates => {"_col0": "localhost"}
>
> CREATE TABLE foo(host string) ROW FORMAT SERDE '....JsonSerDe';
> INSERT OVERWRITE TABLE foo SELECT host FROM table1;
> generates => {"host": "localhost"}
>
> The SerDe gets passed column names in two places:
> 1) The property Constants.LIST_COLUMNS
> 2) The StructObjectInspector it is passed on serialize
>
> In the CTAS example above, both of these contain '_col0' as the column name.
> This is not true in the second example, where the LIST_COLUMNS property contains
> the real column names.
>
> I'd be happy to help out with this change, but I fear that the solution lies
> somewhere in SemanticAnalyzer.java, and I'm having a hard time finding my way
> around.
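
To make the difference between the two cases concrete, here is a minimal sketch of how a SerDe could inspect the column list it receives through the LIST_COLUMNS property. It deliberately avoids any Hive dependency: it assumes that Constants.LIST_COLUMNS resolves to the key "columns" and that the value is a comma-separated list, and it treats any name matching `_col<N>` as an internal placeholder. The class and method names (ColumnNameCheck, parseColumns, hasInternalNames) are hypothetical, and this is not code from the JSON SerDe linked in the issue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.regex.Pattern;

public class ColumnNameCheck {
    // Assumption: this is the key behind Hive's Constants.LIST_COLUMNS.
    public static final String LIST_COLUMNS = "columns";

    // Placeholder names as seen in the CTAS output: _col0, _col1, ...
    private static final Pattern INTERNAL_NAME = Pattern.compile("_col\\d+");

    /** Splits the comma-separated column list from the table properties. */
    public static List<String> parseColumns(Properties tableProperties) {
        String raw = tableProperties.getProperty(LIST_COLUMNS, "");
        List<String> names = new ArrayList<>();
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    /** Returns true if any column name looks like an internal placeholder. */
    public static boolean hasInternalNames(List<String> names) {
        for (String name : names) {
            if (INTERNAL_NAME.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // What the SerDe sees in the CTAS case described in the issue.
        Properties ctas = new Properties();
        ctas.setProperty(LIST_COLUMNS, "_col0");

        // What it sees for CREATE TABLE followed by INSERT OVERWRITE.
        Properties insertOverwrite = new Properties();
        insertOverwrite.setProperty(LIST_COLUMNS, "host");

        System.out.println("CTAS has internal names: "
                + hasInternalNames(parseColumns(ctas)));
        System.out.println("INSERT OVERWRITE has internal names: "
                + hasInternalNames(parseColumns(insertOverwrite)));
    }
}
```

A check like this only detects the symptom inside the SerDe. It cannot recover the real column names, which is why the issue points at the query planner instead.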