GitHub user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/17776
  
    When users do not provide an alias name in a SELECT query, we call 
`toPrettySQL` to generate the alias name. 
    
    For example, the string `get_json_object(jstring, '$.f1')` becomes the 
alias name for the function call in the statement 
    ```SQL
    SELECT key, get_json_object(jstring, '$.f1') FROM tempView
    ``` 
    
    This is not an issue for SELECT query statements. However, for CTAS, 
we hit an issue caused by a bug in the Hive metastore. The Hive metastore 
does not accept column names containing commas and returns a confusing error 
message, like:
    ```
    17/04/26 23:12:56 ERROR [hive.log(397) -- main]: error in initSerDe: org.apache.hadoop.hive.serde2.SerDeException org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe: columns has 2 elements while columns.types has 1 elements!
    org.apache.hadoop.hive.serde2.SerDeException: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe: columns has 2 elements while columns.types has 1 elements!
    ```
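
A rough way to see why the two counts diverge is the sketch below. It is a simplified model under stated assumptions, not Hive's actual code: it assumes the column names reach the SerDe as one comma-separated string, and the helper names `split_columns` and `check_schema` are hypothetical. A single string column is assumed so that the counts match the log above.

```python
# Simplified model (an assumption, not Hive's real implementation) of a
# SerDe that receives all column names as one comma-separated string.

def split_columns(columns_property):
    """Split a comma-separated column-name property into individual names."""
    return [name for name in columns_property.split(",") if name]


def check_schema(column_names, column_types):
    """Join names with commas, split them back, and compare with the type count."""
    columns_property = ",".join(column_names)
    parsed = split_columns(columns_property)
    if len(parsed) != len(column_types):
        return "columns has %d elements while columns.types has %d elements!" % (
            len(parsed), len(column_types))
    return "ok"


# The generated alias contains a comma, so one column is parsed as two names.
generated_alias = "get_json_object(jstring, '$.f1')"
print(check_schema([generated_alias], ["string"]))

# A user-given alias without a comma round-trips cleanly.
print(check_schema(["f1"], ["string"]))
```

In this model the comma inside the generated alias is indistinguishable from a column separator, which is why a comma-free alias avoids the mismatch.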
    
    Thus, this PR removes the commas from generated alias names so that 
Spark SQL users can run CTAS on function calls that contain commas but have 
no user-given alias names. 
    
    BTW, I also added this explanation to the PR description.

