[
https://issues.apache.org/jira/browse/SPARK-55821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SPARK-55821:
-----------------------------------
Labels: pull-request-available (was: )
> Enforce keyword-only arguments in serializer __init__ methods
> -------------------------------------------------------------
>
> Key: SPARK-55821
> URL: https://issues.apache.org/jira/browse/SPARK-55821
> Project: Spark
> Issue Type: Improvement
> Components: PySpark
> Affects Versions: 4.2.0
> Reporter: Yicong Huang
> Priority: Minor
> Labels: pull-request-available
>
> The serializer classes in `pyspark.sql.pandas.serializers` accept many
> positional arguments in their `__init__` methods, making call sites
> error-prone and hard to read.
> For example, `ArrowStreamPandasUDFSerializer.__init__` takes 12 parameters,
> `ApplyInPandasWithStateSerializer.__init__` takes 7 parameters, etc. When
> these are called with positional arguments, it is very easy to mix up the
> order.
> We should enforce keyword-only arguments (by placing a bare `*` separator
> after `self`) in serializer `__init__` methods, so that call sites are
> easier to read and positional-argument mistakes raise a `TypeError`.
> All call sites in `worker.py` and within `serializers.py` (subclass
> `super().__init__` calls) must also be updated to use keyword arguments.
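
As a rough illustration of the proposal, the following minimal sketch shows the bare `*` separator in a base class and a subclass. The class names (`BaseSerializer`, `UDFSerializer`) and parameters (`timezone`, `safecheck`, `batch_size`) are hypothetical stand-ins, not the actual signatures in `pyspark.sql.pandas.serializers`.

```python
class BaseSerializer:
    """Hypothetical stand-in for a serializer base class."""

    def __init__(self, *, timezone, safecheck):
        # Everything after the bare `*` must be passed by keyword.
        self.timezone = timezone
        self.safecheck = safecheck


class UDFSerializer(BaseSerializer):
    """Hypothetical subclass; its super().__init__ call also uses keywords."""

    def __init__(self, *, timezone, safecheck, batch_size=10000):
        super().__init__(timezone=timezone, safecheck=safecheck)
        self.batch_size = batch_size


# Keyword call site: argument order no longer matters.
ser = UDFSerializer(safecheck=True, timezone="UTC")

# Positional call site: rejected with TypeError instead of silently
# binding values to the wrong parameters.
try:
    UDFSerializer("UTC", True)
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

With this pattern, a call that swaps two arguments can no longer succeed unnoticed: each value is bound by name, and any leftover positional call fails at construction time.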