[ 
https://issues.apache.org/jira/browse/SPARK-34629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300901#comment-17300901
 ] 

Maciej Szymkiewicz commented on SPARK-34629:
--------------------------------------------

{quote}
Would love to help out here if I may. Does it help if I start identifying which 
APIs are still missing from the hints?
{quote}

That would be helpful [~chilltake]. However, please keep in mind that certain 
parts of the code are intentionally not covered. These are explicitly ignored in 
[mypy.ini|https://github.com/apache/spark/blob/master/python/mypy.ini].

In general, other parts should be covered as long as the code was in use in 
tests or examples, so false negatives might actually be shadowed by other 
definitions and/or hit some deficiency of the type checker.

More likely than missing hints, we'll have missing overloads (some of these 
are in the log provided by [~hyukjin.kwon]). These can be tricky to handle 
without negative controls. I am still thinking about bringing the data tests 
from pyspark-stubs here, which would be helpful in such cases, but they are 
hard to maintain.
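
As a rough illustration of the missing-overload case (a minimal sketch; 
{{repeat}} is a hypothetical function, not part of PySpark): the call with a 
list works at runtime, but a type checker such as mypy would reject it because 
no matching overload is declared.

```python
from typing import Any, overload


# Declared overloads: only int and str inputs are visible to the type checker.
@overload
def repeat(value: int, times: int) -> int: ...
@overload
def repeat(value: str, times: int) -> str: ...
def repeat(value: Any, times: int) -> Any:
    # Runtime implementation: accepts anything that supports `*` with an int.
    return value * times


print(repeat(2, 3))     # matches the int overload
print(repeat("ab", 2))  # matches the str overload
# Works at runtime, but no list overload is declared, so a type checker
# reports that no overload variant matches (the "missing overload" case).
print(repeat([1], 2))
```

A negative control here would be a test asserting that a call which should be 
rejected (for example, {{repeat(None, 2)}}) is indeed flagged by the checker, 
so that adding a new overload does not silently make the annotations too 
permissive.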

> Python type hints improvement
> -----------------------------
>
>                 Key: SPARK-34629
>                 URL: https://issues.apache.org/jira/browse/SPARK-34629
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 3.1.2
>            Reporter: Hyukjin Kwon
>            Priority: Critical
>
> We added PySpark type hints in SPARK-32681.
> However, it looks like there are still many missing APIs to type. I maintain a 
> project called [Koalas](https://github.com/databricks/koalas), and I found 
> these errors 
> https://gist.github.com/HyukjinKwon/9faabc5f2680b56007d71ef7cf0ad400
> For example, {{pyspark.__version__}} and {{pyspark.sql.Column.contains}} are 
> missing in the type hints.
> I believe the same applies to other projects that enable mypy 
> (presumably also given SPARK-34544).
> This umbrella JIRA aims to identify such cases and improve Python type 
> hints in PySpark.



