[ 
https://issues.apache.org/jira/browse/ARROW-17335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17577392#comment-17577392
 ] 

Jorrick Sleijster edited comment on ARROW-17335 at 8/9/22 12:32 PM:
--------------------------------------------------------------------

I think you make a good point Joris but as you mention, I don't think we can 
use inline type annotations :'(. Therefore, we'd have to use generated stubs, 
which we can't use for checking whether the underlying code actually has the 
right types.

I think we will therefore have to wait (or take action ourselves upstream) 
until mypy or Cython implements decent support for Python stub generation.

Hence, I think it's better to treat them separately for now and start off with 
stub generation, which can then later be replaced by a better implementation 
once available.
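
To illustrate the limitation with stubs, here is a minimal sketch in plain 
Python. It is not PyArrow code: `parse_port` and the stub text are 
hypothetical, and the untyped function merely stands in for compiled Cython 
code. A type checker that reads the stub trusts the declared signature and 
does not analyze the implementation, so the two can drift apart unnoticed.

```python
import inspect

# Hypothetical stub text, as it would appear in a separate `.pyi` file.
# It promises that parse_port returns an int.
STUB_SOURCE = "def parse_port(value: str) -> int: ..."


# Hypothetical implementation standing in for compiled (Cython) code.
# It carries no annotations, and it has a bug: the result stays a str.
def parse_port(value):
    return value.strip()


# At runtime nothing ties the stub to the implementation.
result = parse_port(" 8080 ")
signature = inspect.signature(parse_port)

# The stub declared int, but the implementation returns something else.
print(type(result).__name__)
# The implementation itself exposes no return annotation to check against.
print(signature.return_annotation is inspect.Signature.empty)
```

With inline annotations on `parse_port`, the same checker would analyze the 
function body as well and could flag the mismatch.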



> [Python] Type checking support
> ------------------------------
>
>                 Key: ARROW-17335
>                 URL: https://issues.apache.org/jira/browse/ARROW-17335
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>            Reporter: Jorrick Sleijster
>            Priority: Major
>   Original Estimate: 10h
>  Remaining Estimate: 10h
>
> h1. mypy and static type checking
> Since Python 3.5 (PEP 484), it has been possible to add type hints to 
> Python code, and the practice became immensely popular in a short period of 
> time. The tool `mypy` has since become the industry standard for static type 
> checking in Python. It checks for invalid types quickly enough to run as a 
> pre-commit hook. It has caught many bugs that I did not see myself and has 
> been a very valuable tool.
> h2. Now what does this mean for PyArrow?
> When you run mypy on code that uses PyArrow, you get error messages like the 
> following:
> ```
> some_util_using_pyarrow/hdfs_utils.py:5: error: Skipping analyzing "pyarrow": 
> module is installed, but missing library stubs or py.typed marker
> some_util_using_pyarrow/hdfs_utils.py:9: error: Skipping analyzing "pyarrow": 
> module is installed, but missing library stubs or py.typed marker
> some_util_using_pyarrow/hdfs_utils.py:11: error: Skipping analyzing 
> "pyarrow.fs": module is installed, but missing library stubs or py.typed 
> marker
> ```
> More information is available here: 
> [https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-library-stubs-or-py-typed-marker]
> h2. You can solve this in three ways:
>  # Ignore the message. This, however, makes mypy treat every PyArrow type as 
> `Any`, so it cannot find user errors in code that uses the PyArrow library.
>  # Create a Python stub file. This used to be the standard, but it is no 
> longer a popular option, because stubs are extra files that live next to the 
> source code, whereas type hints can also be written inline in the code, 
> which brings me to the third option.
>  # Create a `py.typed` file and use inline type hints. This is the most 
> popular option today because it requires no extra files (except for the 
> py.typed file), keeps all the type hints with the code (like they are now in 
> the documentation) and gives type hints (and in-IDE hints about issues) not 
> only to your users but also to the developers of the library themselves.
>  
> My personal opinion already shines through in the options: I prefer option 
> 3, as it has quickly become the industry standard since its introduction.
> h2. What should we do?
> I'd very much like to work on this; however, I don't want to waste time. 
> Therefore, I am raising this ticket to see if this has been considered 
> before or if we just didn't get to it yet.
> I'd like to open the discussion here:
>  # Do you agree with option 3 for type hints?
>  # Should we remove the type information from the documentation, given that 
> it will be in the function signatures? Or should we keep it there and also 
> specify it in the code, which would duplicate it?
>  
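
Option 3 from the quoted issue can be sketched roughly as follows. This is a 
minimal illustration, not PyArrow's layout: the package name `mypkg` and the 
function `row_count` are hypothetical. It builds a throwaway package in a 
temporary directory with inline type hints and the empty `py.typed` marker 
file that PEP 561 uses to signal that a package ships type information.

```python
import tempfile
from pathlib import Path

# Hypothetical package name, used only for this illustration.
PACKAGE_NAME = "mypkg"

# Module source with inline type hints (option 3).
MODULE_SOURCE = '''\
def row_count(rows: list[dict[str, int]]) -> int:
    """Return the number of rows."""
    return len(rows)
'''


def create_typed_package(root: Path) -> Path:
    """Create a package directory with inline hints and a py.typed marker."""
    package_dir = root / PACKAGE_NAME
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(MODULE_SOURCE)
    # PEP 561 marker: an empty file named py.typed inside the package.
    (package_dir / "py.typed").touch()
    return package_dir


with tempfile.TemporaryDirectory() as tmp:
    package_dir = create_typed_package(Path(tmp))
    created = sorted(p.name for p in package_dir.iterdir())
    marker_size = (package_dir / "py.typed").stat().st_size

print(created)
```

For installed copies to be recognized by a type checker, the marker would 
also have to be included in the built distribution, for example as package 
data.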



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
