Shrikant Prasad created SPARK-42655:
---------------------------------------

             Summary: Incorrect ambiguous column reference error
                 Key: SPARK-42655
                 URL: https://issues.apache.org/jira/browse/SPARK-42655
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 3.2.0
            Reporter: Shrikant Prasad


val df1 = sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4","col5")
val op_cols_same_case = List("id","col2","col3","col4","col5","id")
val df2 = df1.select(op_cols_same_case.head, op_cols_same_case.tail: _*)
df2.select("id").show()
 
This query runs fine.
 
But when we change the casing in op_cols so that the list contains both
lower- and upper-case names ("id" and "ID"), the final select throws an
ambiguous column reference error:
 
val df1 = sc.parallelize(List((1,2,3,4,5),(1,2,3,4,5))).toDF("id","col2","col3","col4","col5")
val op_cols_same_case = List("id","col2","col3","col4","col5","ID")
val df2 = df1.select(op_cols_same_case.head, op_cols_same_case.tail: _*)
df2.select("id").show()
 
Since Spark resolves column names case-insensitively by default, the second
case should also work when the column list contains both upper- and
lower-case variants of the same column name.
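
Until this is fixed, one possible workaround is to drop case-insensitive
duplicates from the column list before calling select, so that only a single
"id" column reaches df2. The sketch below is an assumption on my part, not
something Spark provides: dedupIgnoreCase is a hypothetical helper, and it
only fits when the duplicated column is not actually needed twice in the
output.

```scala
import java.util.Locale
import scala.collection.mutable

// Hypothetical helper (not part of Spark): keep the first occurrence of each
// column name, comparing names case-insensitively.
def dedupIgnoreCase(cols: Seq[String]): Seq[String] = {
  val seen = mutable.Set.empty[String]
  // Set.add returns true only the first time a given key is inserted.
  cols.filter(c => seen.add(c.toLowerCase(Locale.ROOT)))
}

val opCols = List("id", "col2", "col3", "col4", "col5", "ID")
val uniqueCols = dedupIgnoreCase(opCols)

// In spark-shell, with df1 defined as in the report:
// val df2 = df1.select(uniqueCols.head, uniqueCols.tail: _*)
// df2.select("id").show()
```

This only sidesteps the error; it does not address why "id" and "ID" are
treated differently from "id" and "id" during resolution.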
 


