Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19630#discussion_r151042155

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2247,16 +2142,20 @@ def pandas_udf(f=None, returnType=StringType()):
            |         8|      JOHN DOE|          22|
            +----------+--------------+------------+

    -    2. A `pandas.DataFrame` -> A `pandas.DataFrame`
    +    2. GROUP_MAP

    -       This udf is only used with :meth:`pyspark.sql.GroupedData.apply`.
    +       A group map UDF defines transformation: A `pandas.DataFrame` -> A `pandas.DataFrame`
            The returnType should be a :class:`StructType` describing the schema of the returned
            `pandas.DataFrame`.
    +       The length of the returned `pandas.DataFrame` can arbitrary.
    --- End diff --

    nit: `can arbitrary` -> `can be arbitrary`?
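
For context on what the docstring line is describing, here is a minimal pandas-only sketch of the group map idea: a function takes one group's `pandas.DataFrame` and returns a `pandas.DataFrame` whose length need not match the input. This is not Spark's API; `apply_group_map`, `above_group_mean`, and the column names `id` and `v` are hypothetical names chosen for illustration, and the driver loop only approximates what `GroupedData.apply` does per group.

```python
import pandas as pd


def above_group_mean(pdf):
    # Hypothetical group map function: DataFrame in, DataFrame out.
    # Keeps only rows whose "v" exceeds the group's mean, so the
    # output can have fewer rows than the input group.
    return pdf[pdf["v"] > pdf["v"].mean()]


def apply_group_map(df, key, func):
    # Hypothetical stand-in for applying a group map UDF: split the frame
    # by key, run func on each group, and concatenate the results.
    pieces = [func(group) for _, group in df.groupby(key)]
    return pd.concat(pieces, ignore_index=True)


df = pd.DataFrame({"id": [1, 1, 2, 2, 2], "v": [1.0, 2.0, 3.0, 5.0, 10.0]})
result = apply_group_map(df, "id", above_group_mean)
print(result)
```

Here the five input rows reduce to two output rows, which is the "length can be arbitrary" property the docstring is trying to state.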