[ https://issues.apache.org/jira/browse/SPARK-29627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-29627.
----------------------------------
    Fix Version/s: 3.0.0
       Resolution: Fixed

Issue resolved by pull request 26288
[https://github.com/apache/spark/pull/26288]

> array_contains should allow column instances in PySpark
> -------------------------------------------------------
>
>                 Key: SPARK-29627
>                 URL: https://issues.apache.org/jira/browse/SPARK-29627
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, SQL
>    Affects Versions: 3.0.0
>            Reporter: Hyukjin Kwon
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> The Scala API works well with column instances:
> {code}
> import org.apache.spark.sql.functions._
> val df = Seq(Array("a", "b", "c"), Array.empty[String]).toDF("data")
> df.select(array_contains($"data", lit("a"))).collect()
> {code}
> {code}
> Array[org.apache.spark.sql.Row] = Array([true], [false])
> {code}
> However, the PySpark API does not appear to accept a column instance as the value:
> {code}
> from pyspark.sql.functions import array_contains, lit
> df = spark.createDataFrame([(["a", "b", "c"],), ([],)], ['data'])
> df.select(array_contains(df.data, lit("a"))).show()
> {code}
> {code}
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/.../spark/python/pyspark/sql/functions.py", line 1950, in array_contains
>     return Column(sc._jvm.functions.array_contains(_to_java_column(col), value))
>   File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1277, in __call__
>   File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1241, in _build_args
>   File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1228, in _get_args
>   File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_collections.py", line 500, in convert
>   File "/.../spark/python/pyspark/sql/column.py", line 344, in __iter__
>     raise TypeError("Column is not iterable")
> TypeError: Column is not iterable
> {code}
> PySpark should accept column instances here as well.
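
The traceback above suggests that the Python wrapper hands the `value` argument to Py4J unchanged, and Py4J then tries to iterate the Column while converting it. One way this could be handled is to unwrap a Column into its JVM-side handle before the call and to pass plain literals through as they are. The following is a minimal sketch of that idea only, not necessarily what pull request 26288 does. The `Column` class here is a hypothetical stand-in so the snippet runs without Spark, the `_jc` attribute name is an assumption about how the wrapper stores its JVM handle, and `unwrap_value` is a made-up helper name.

```python
class Column:
    """Hypothetical stand-in for a PySpark Column (assumption, not the real class)."""

    def __init__(self, jc):
        # Assumed attribute holding the JVM-side column handle.
        self._jc = jc

    def __iter__(self):
        # Mirrors the error shown in the traceback.
        raise TypeError("Column is not iterable")


def unwrap_value(value):
    """Return the JVM handle for a Column, or the value itself for a literal."""
    return value._jc if isinstance(value, Column) else value


# A Column is unwrapped; a plain Python literal is passed through untouched.
jvm_arg_from_column = unwrap_value(Column("jvm-handle-for-lit-a"))
jvm_arg_from_literal = unwrap_value("a")
```

With a helper along these lines, the wrapper would pass `unwrap_value(value)` instead of `value` to the JVM function, so both `array_contains(df.data, "a")` and `array_contains(df.data, lit("a"))` could reach the same Scala overloads.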