Maciej Szymkiewicz created SPARK-34136:
------------------------------------------

             Summary: Support complex types in pyspark.sql.functions.lit
                 Key: SPARK-34136
                 URL: https://issues.apache.org/jira/browse/SPARK-34136
             Project: Spark
          Issue Type: Improvement
          Components: PySpark, SQL
    Affects Versions: 3.2.0, 3.1.1
            Reporter: Maciej Szymkiewicz


At the moment, Python users have to use dedicated functions to create complex 
literal columns. For example, to create an array:

{code:python}
from pyspark.sql.functions import array, lit

xs = [1, 2, 3]
array(*[lit(x) for x in xs])
{code}

or a map:

{code:python}
from pyspark.sql.functions import create_map, lit, map_from_arrays
from itertools import chain

kvs = {"a": 1, "b": 2}

create_map(*chain.from_iterable(
    (lit(k), lit(v)) for k, v in kvs.items()
))

# or

map_from_arrays(
    array(*[lit(k) for k in kvs.keys()]),
    array(*[lit(v) for v in kvs.values()])
)
{code}

This is very verbose for such a simple task.

In Scala we have {{typedLit}}, which addresses such cases:

{code:scala}
scala> typedLit(Map("a" -> 1, "b" -> 2))
res0: org.apache.spark.sql.Column = keys: [a,b], values: [1,2]

scala> typedLit(Array(1, 2, 3))
res1: org.apache.spark.sql.Column = [1,2,3]
{code}


Unfortunately, its API depends on Scala {{TypeTag}}s, so it is not Python-friendly.

It would be nice if {{lit}} could cover at least basic complex types.
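As a rough sketch of the desired behavior (hypothetical, not the current API), {{lit}} could accept plain Python lists and dicts directly, so the snippets above would collapse to:

{code:python}
from pyspark.sql.functions import lit

# Hypothetical: build an array literal column from a Python list
lit([1, 2, 3])

# Hypothetical: build a map literal column from a Python dict
lit({"a": 1, "b": 2})
{code}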


