Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20487#discussion_r165865284

```diff
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -1923,6 +1923,9 @@ def toPandas(self):
         0    2  Alice
         1    5    Bob
         """
+        from pyspark.sql.utils import require_minimum_pandas_version
```

--- End diff --

Ah, that one is for PyArrow whereas this one is for Pandas. I wanted to produce a proper error message before `import pandas as pd` runs :-).

The case above (https://github.com/apache/spark/pull/20487/files#r165714499) is when Pandas is older than 0.19.2. When Pandas is missing, it shows something like:

```
>>> spark.range(1).toPandas()
```

before:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/dataframe.py", line 1975, in toPandas
    import pandas as pd
ImportError: No module named pandas
```

after:

```
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/dataframe.py", line 1927, in toPandas
    require_minimum_pandas_version()
  File "/.../spark/python/pyspark/sql/utils.py", line 125, in require_minimum_pandas_version
    "it was not found." % minimum_pandas_version)
ImportError: Pandas >= 0.19.2 must be installed; however, it was not found.
```