[
https://issues.apache.org/jira/browse/SPARK-54380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18040787#comment-18040787
]
Raisa commented on SPARK-54380:
-------------------------------
Hi [~databach], thank you for your interest in helping with this issue.
* {*}Use case justification{*}: I am working on implementing this functionality in
the Narwhals library ([https://github.com/narwhals-dev/narwhals]), where PySpark
is one of the backends that users can choose. To align the behaviour
with other libraries, it would be very helpful to have this flag. For
example, in Polars the default behaviour for this function is to sort lists
with NULLS FIRST
([https://docs.pola.rs/api/python/dev/reference/expressions/api/polars.Expr.list.sort.html]).
In Narwhals we promise that there should be little overhead when converting
between different libraries, so implementing this flag would help a lot.
* {*}Backward compatibility{*}: Yes, implementing the flag with the default
`nulls_first=False` to maintain the existing behaviour makes perfect sense.
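To make the requested semantics concrete, here is a minimal pure-Python sketch of what the flag would do. It is only an illustration and does not use PySpark: `sort_with_nulls` is a hypothetical helper, and the `nulls_first` parameter name is taken from the proposal above, not from any existing Spark API.

```python
from typing import List, Optional, Sequence


def sort_with_nulls(
    values: Sequence[Optional[int]], nulls_first: bool = False
) -> List[Optional[int]]:
    """Hypothetical helper (not part of PySpark) showing the proposed ordering.

    Non-null values are sorted ascending; None values are grouped either
    before them (nulls_first=True) or after them (nulls_first=False).
    """
    non_null = sorted(v for v in values if v is not None)
    nulls = [None] * (len(values) - len(non_null))
    return nulls + non_null if nulls_first else non_null + nulls


data = [3, None, 1, None, 2]
print(sort_with_nulls(data))                    # [1, 2, 3, None, None]
print(sort_with_nulls(data, nulls_first=True))  # [None, None, 1, 2, 3]
```

With the default `nulls_first=False` the result matches the current nulls-last ordering described in the issue, so existing callers would see no change.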
> Adding NULLS_FIRST flag to `array_sort`
> ---------------------------------------
>
> Key: SPARK-54380
> URL: https://issues.apache.org/jira/browse/SPARK-54380
> Project: Spark
> Issue Type: Request
> Components: PySpark
> Affects Versions: 4.0.1
> Reporter: Raisa
> Priority: Minor
>
> Could you please consider adding an option to place NULLS first to
> `pyspark.sql.functions.array_sort` as currently NULLS are placed last?