[ https://issues.apache.org/jira/browse/SPARK-36146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-36146:
------------------------------------

    Assignee: Apache Spark

> Upgrade Python version from 3.6 to higher version in GitHub linter
> ------------------------------------------------------------------
>
>                 Key: SPARK-36146
>                 URL: https://issues.apache.org/jira/browse/SPARK-36146
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Project Infra, PySpark
>    Affects Versions: 3.3.0
>            Reporter: Hyukjin Kwon
>            Assignee: Apache Spark
>            Priority: Major
>
> MyPy checks fail with higher Python versions. For example, with Python 3.8:
> {code}
> python/pyspark/sql/pandas/_typing/protocols/frame.pyi:64: error: Name "np.ndarray" is not defined
> python/pyspark/sql/pandas/_typing/protocols/frame.pyi:91: error: Name "np.recarray" is not defined
> python/pyspark/sql/pandas/_typing/protocols/frame.pyi:165: error: Name "np.ndarray" is not defined
> python/pyspark/pandas/categorical.py:82: error: Item "dtype[Any]" of "Union[dtype[Any], Any]" has no attribute "categories"
> python/pyspark/pandas/categorical.py:109: error: Item "dtype[Any]" of "Union[dtype[Any], Any]" has no attribute "ordered"
> python/pyspark/ml/linalg/__init__.pyi:184: error: Return type "ndarray[Any, Any]" of "toArray" incompatible with return type "NoReturn" in supertype "Matrix"
> python/pyspark/ml/linalg/__init__.pyi:217: error: Return type "ndarray[Any, Any]" of "toArray" incompatible with return type "NoReturn" in supertype "Matrix"
> python/pyspark/pandas/typedef/typehints.py:163: error: Module has no attribute "bool"; maybe "bool_" or "bool8"?
> python/pyspark/pandas/typedef/typehints.py:174: error: Module has no attribute "float"; maybe "float_", "cfloat", or "float96"?
> python/pyspark/pandas/typedef/typehints.py:180: error: Module has no attribute "int"; maybe "uint", "rint", or "intp"?
> python/pyspark/pandas/ml.py:81: error: Value of type variable "_DTypeScalar_co" of "dtype" cannot be "object"
> python/pyspark/pandas/indexing.py:1649: error: Module has no attribute "int"; maybe "uint", "rint", or "intp"?
> python/pyspark/pandas/indexing.py:1656: error: Module has no attribute "int"; maybe "uint", "rint", or "intp"?
> python/pyspark/pandas/frame.py:4969: error: Function "numpy.array" is not valid as a type
> python/pyspark/pandas/frame.py:4969: note: Perhaps you need "Callable[...]" or a callback protocol?
> python/pyspark/pandas/frame.py:4970: error: Function "numpy.array" is not valid as a type
> python/pyspark/pandas/frame.py:4970: note: Perhaps you need "Callable[...]" or a callback protocol?
> python/pyspark/pandas/frame.py:7402: error: "List[Any]" has no attribute "tolist"
> python/pyspark/pandas/series.py:1030: error: Module has no attribute "_NoValue"
> python/pyspark/pandas/series.py:1031: error: Module has no attribute "_NoValue"
> python/pyspark/pandas/indexes/category.py:159: error: Item "dtype[Any]" of "Union[dtype[Any], Any]" has no attribute "categories"
> python/pyspark/pandas/indexes/category.py:180: error: Item "dtype[Any]" of "Union[dtype[Any], Any]" has no attribute "ordered"
> python/pyspark/pandas/namespace.py:2036: error: Argument 1 to "column_name" has incompatible type "float"; expected "str"
> python/pyspark/pandas/mlflow.py:59: error: Incompatible types in assignment (expression has type "Type[floating[Any]]", variable has type "str")
> python/pyspark/pandas/data_type_ops/categorical_ops.py:43: error: Item "dtype[Any]" of "Union[dtype[Any], Any]" has no attribute "categories"
> python/pyspark/pandas/data_type_ops/categorical_ops.py:43: error: Item "dtype[Any]" of "Union[dtype[Any], Any]" has no attribute "ordered"
> python/pyspark/pandas/data_type_ops/categorical_ops.py:56: error: Item "dtype[Any]" of "Union[dtype[Any], Any]" has no attribute "categories"
> python/pyspark/pandas/tests/test_typedef.py:70: error: Name "np.float" is not defined
> python/pyspark/pandas/tests/test_typedef.py:77: error: Name "np.float" is not defined
> python/pyspark/pandas/tests/test_typedef.py:85: error: Name "np.float" is not defined
> python/pyspark/pandas/tests/test_typedef.py:100: error: Name "np.float" is not defined
> python/pyspark/pandas/tests/test_typedef.py:108: error: Name "np.float" is not defined
> python/pyspark/mllib/clustering.pyi:152: error: Incompatible types in assignment (expression has type "ndarray[Any, Any]", base class "KMeansModel" defined the type as "List[ndarray[Any, Any]]")
> python/pyspark/mllib/classification.pyi:93: error: Signature of "predict" incompatible with supertype "LinearClassificationModel"
> Found 32 errors in 15 files (checked 315 source files)
> 1
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
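
For anyone triaging the output above, here is a minimal sketch of tallying MyPy errors per file, which makes it easy to check a log against its "Found N errors in M files" summary. It is not part of the issue or of Spark's tooling: the names `ERROR_RE` and `tally_errors` are hypothetical, the sample lines are copied from the log above, and the pattern assumes the plain `path:line: error: message` form with no drive letters or column numbers. Lines marked `note:` are skipped, which is consistent with the summary in the log counting only the `error:` lines.

```python
import re
from collections import Counter

# Hypothetical helper, not part of Spark: matches "path:line: error: message".
# "note:" lines do not match, so only errors are counted.
ERROR_RE = re.compile(r"^(?P<path>[^:\s]+):(?P<line>\d+): error: (?P<msg>.*)$")


def tally_errors(output: str) -> Counter:
    """Return a Counter mapping each file path to its number of error lines."""
    counts = Counter()
    for raw in output.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            counts[match.group("path")] += 1
    return counts


# A few lines copied from the log above, used only as sample input.
sample = """\
python/pyspark/pandas/frame.py:4969: error: Function "numpy.array" is not valid as a type
python/pyspark/pandas/frame.py:4969: note: Perhaps you need "Callable[...]" or a callback protocol?
python/pyspark/pandas/frame.py:7402: error: "List[Any]" has no attribute "tolist"
python/pyspark/pandas/series.py:1030: error: Module has no attribute "_NoValue"
"""

counts = tally_errors(sample)
print(sum(counts.values()), "errors in", len(counts), "files")
```

Run against the full log, the same function should report totals that can be compared with the summary line at the end of the `{code}` block.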