Simon Reon created SPARK-29240:
----------------------------------

             Summary: PySpark 2.4 about sql function 'element_at' param 'extraction'
                 Key: SPARK-29240
                 URL: https://issues.apache.org/jira/browse/SPARK-29240
             Project: Spark
          Issue Type: Improvement
          Components: PySpark
    Affects Versions: 2.4.0
            Reporter: Simon Reon


I was trying to translate {color:#FF0000}Scala{color} code into 
{color:#FF0000}Python{color} with {color:#FF0000}PySpark 2.4.0{color}. The code 
below aims to extract a value from column '{color:#FF0000}list{color}', using column 
'{color:#FF0000}num{color}' as the index.

 
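{code:python}
# Lists (not tuples) so that 'list' is inferred as an array column rather than a struct.
x = spark.createDataFrame([([1, 2, 3], 1), ([4, 5, 6], 2), ([7, 8, 9], 3)], ['list', 'num'])
x.show(){code}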
 
||list||num||
|[1,2,3]|1|
|[4,5,6]|2|
|[7,8,9]|3|

I expected to be able to use the new function '{color:#FF0000}element_at{color}' in 2.4.0, but it 
raises an error:
{code:python}
from pyspark.sql import functions as F

x.withColumn('aa', F.element_at('list', x.num.cast('int')))
{code}
_TypeError: Column is not iterable_

 

In the end, I had to use a {color:#FF0000}udf{color} to work around this problem.
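
For illustration, a minimal sketch of what such a udf workaround could look like. The names `element_at_1_based` and `make_element_at_udf` are hypothetical and not part of PySpark, and the return type `LongType` assumes the array holds Python ints. Unlike the built-in function, this sketch simply returns null for any index outside `1..len(values)`.

```python
def element_at_1_based(values, index):
    """Return the element at a 1-based index, or None if it cannot be resolved."""
    if values is None or index is None:
        return None
    if index < 1 or index > len(values):
        return None
    return values[index - 1]


def make_element_at_udf():
    """Wrap the helper above as a Spark udf (hypothetical helper name)."""
    # Imported lazily so the pure-Python helper can be used without Spark installed.
    from pyspark.sql import functions as F
    from pyspark.sql.types import LongType

    # Assumption: the array elements are Python ints, inferred by Spark as bigint.
    return F.udf(element_at_1_based, LongType())
```

Under these assumptions, it could be applied as `x.withColumn('aa', make_element_at_udf()('list', 'num'))`.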

In Scala, however, it works when the second parameter '{color:#FF0000}extraction{color}' 
of the function '{color:#FF0000}element_at{color}' is a column of int type: 
{code:scala}
//Scala
val y = x.withColumn("aa",element_at('list,'num.cast("int")))
y.show(){code}
||list||num||aa||
|[1,2,3]|1|1|
|[4,5,6]|2|5|
|[7,8,9]|3|9|
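
Since the Scala call above accepts a column as the index, one possible way to avoid a udf in PySpark 2.4 may be to go through the SQL form of the function with `F.expr`. The following is a sketch under that assumption; the helper names `element_at_sql` and `with_element_at` are hypothetical, and the backticks are only there to quote the column names.

```python
def element_at_sql(array_col, index_col):
    """Build a SQL expression string that indexes array_col by index_col."""
    # The explicit cast mirrors the cast('int') used in the Scala call.
    return "element_at(`{}`, cast(`{}` as int))".format(array_col, index_col)


def with_element_at(df, array_col, index_col, out_col):
    """Add out_col to df using the SQL element_at function (hypothetical helper)."""
    # Imported lazily so the string builder above can be used without Spark installed.
    from pyspark.sql import functions as F

    return df.withColumn(out_col, F.expr(element_at_sql(array_col, index_col)))
```

With the DataFrame from this report, `with_element_at(x, 'list', 'num', 'aa')` would be expected to mirror the Scala result, provided 'list' is an array column.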

I hope this can be fixed in the latest version.



