[GitHub] [spark] HyukjinKwon commented on a change in pull request #29121: [SPARK-32319][PYSPARK] Remove unused imports

2020-07-29 Thread GitBox


HyukjinKwon commented on a change in pull request #29121:
URL: https://github.com/apache/spark/pull/29121#discussion_r461360724



##
File path: python/pyspark/ml/tests/test_stat.py
##
@@ -40,7 +40,7 @@ def test_chisquaretest(self):
 
 
 if __name__ == "__main__":
-from pyspark.ml.tests.test_stat import *
+from pyspark.ml.tests.test_stat import *  # noqa: F401

Review comment:
   Ah, these imports actually makes the output file paths correct. For 
example, if you run the unittests, the console outputs are like:
   
   ```
   test_blahblah (pyspark.sql.blahblah.BlahBlahTests) ... ok
   
   --
   Ran 3 tests in 0.001s
   
   OK
   ```
   
   if you remove the imports, the output becomes:
   
   ```
   test_blahblah (__main__.BlahBlahTests) ... ok
   
   --
   Ran 3 tests in 0.001s
   
   OK
   ```
   
   It's related to who we run the tests via our custom script 
`python/run-tests.py`. Let's leave them as are for now.

##
File path: python/pyspark/ml/tests/test_wrapper.py
##
@@ -116,9 +116,8 @@ def test_new_java_array(self):
 java_array = JavaWrapper._new_java_array(str_list, java_class)
 self.assertEqual(_java2py(self.sc, java_array), expected_str_list)
 
-if __name__ == "__main__":
-from pyspark.ml.tests.test_wrapper import *

Review comment:
   So .. let's don't remove this.

##
File path: python/pyspark/ml/tests/test_stat.py
##
@@ -40,7 +40,7 @@ def test_chisquaretest(self):
 
 
 if __name__ == "__main__":
-from pyspark.ml.tests.test_stat import *
+from pyspark.ml.tests.test_stat import *  # noqa: F401

Review comment:
   Ah, these imports actually make the output file paths correct. For 
example, if you run the unittests, the console outputs are like:
   
   ```
   test_blahblah (pyspark.sql.blahblah.BlahBlahTests) ... ok
   
   --
   Ran 3 tests in 0.001s
   
   OK
   ```
   
   if you remove the imports, the output becomes:
   
   ```
   test_blahblah (__main__.BlahBlahTests) ... ok
   
   --
   Ran 3 tests in 0.001s
   
   OK
   ```
   
   It's related to how we run the tests via our custom script 
`python/run-tests.py`. Let's leave them as are for now.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #29121: [SPARK-32319][PYSPARK] Remove unused imports

2020-07-24 Thread GitBox


HyukjinKwon commented on a change in pull request #29121:
URL: https://github.com/apache/spark/pull/29121#discussion_r460362434



##
File path: python/pyspark/ml/tests/test_stat.py
##
@@ -40,7 +40,7 @@ def test_chisquaretest(self):
 
 
 if __name__ == "__main__":
-from pyspark.ml.tests.test_stat import *
+from pyspark.ml.tests.test_stat import *  # noqa: F401

Review comment:
   No, no. don't remove this. It will keep the source code path in console 
output. It is also used in coverage report as well. XML reporter is used in 
Jenkins.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #29121: [SPARK-32319][PYSPARK] Remove unused imports

2020-07-24 Thread GitBox


HyukjinKwon commented on a change in pull request #29121:
URL: https://github.com/apache/spark/pull/29121#discussion_r460362184



##
File path: python/pyspark/heapq3.py
##
@@ -865,7 +865,7 @@ def nlargest(n, iterable, key=None):
 
 # If available, use C implementation
 try:
-from _heapq import *
+from _heapq import *  # noqa: F401

Review comment:
   This too. It was ported back from CPython 3, and is excluded in other 
linter checks. Can we exclude this file as well?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #29121: [SPARK-32319][PYSPARK] Remove unused imports

2020-07-24 Thread GitBox


HyukjinKwon commented on a change in pull request #29121:
URL: https://github.com/apache/spark/pull/29121#discussion_r460361924



##
File path: python/pyspark/cloudpickle/cloudpickle.py
##
@@ -57,7 +57,6 @@
 from .compat import pickle
 from typing import Generic, Union, Tuple, Callable
 from pickle import _getattribute
-from importlib._bootstrap import _find_spec

Review comment:
   @Fokko, can we exclude this file in the linter checking? We're already 
skipping other lint checks (see `dev/tox.ini`) and this is usually exactly 
matched to the release of cloudpickle.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #29121: [SPARK-32319][PYSPARK] Remove unused imports

2020-07-16 Thread GitBox


HyukjinKwon commented on a change in pull request #29121:
URL: https://github.com/apache/spark/pull/29121#discussion_r455781637



##
File path: python/pyspark/mllib/classification.py
##
@@ -102,6 +102,7 @@ class LogisticRegressionModel(LinearClassificationModel):
   in Multinomial Logistic Regression. By default, it is binary
   logistic regression so numClasses will be set to 2.
 
+>>> from pyspark.mllib.linalg import SparseVector

Review comment:
   I don't think PySpark examples import relevant PySpark classes within 
doctests.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon commented on a change in pull request #29121: [SPARK-32319][PYSPARK] Remove unused imports

2020-07-15 Thread GitBox


HyukjinKwon commented on a change in pull request #29121:
URL: https://github.com/apache/spark/pull/29121#discussion_r454870938



##
File path: python/pyspark/sql/functions.py
##
@@ -29,9 +29,9 @@
 from pyspark.sql.dataframe import DataFrame
 from pyspark.sql.types import StringType, DataType
 # Keep UserDefinedFunction import for backwards compatible import; moved in 
SPARK-22409
-from pyspark.sql.udf import UserDefinedFunction, _create_udf

Review comment:
   `UserDefinedFunction` should be kept as described above.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org