[ 
https://issues.apache.org/jira/browse/SPARK-37730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464522#comment-17464522
 ] 

Apache Spark commented on SPARK-37730:
--------------------------------------

User 'mslapek' has created a pull request for this issue:
https://github.com/apache/spark/pull/35000

> plot.hist throws AttributeError on pandas=1.3.5
> -----------------------------------------------
>
>                 Key: SPARK-37730
>                 URL: https://issues.apache.org/jira/browse/SPARK-37730
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.2.0, 3.3.0
>         Environment: Conda environment.yml (also tested with 3.3.0-SNAPSHOT):
> {{name: testenv}}
> {{channels:}}
> {{  - conda-forge}}
> {{dependencies:}}
> {{  - python=3.9.9}}
>
> {{  - numpy=1.21.5}}
> {{  - pandas=1.3.5}}
> {{  - matplotlib=3.5.1}}
>
> {{  - pyspark=3.2.0}}
>  
>            Reporter: Michał Słapek
>            Priority: Major
>
> plot.hist from PySpark throws an AttributeError when pyspark.pandas is 
> used with pandas=1.3.5.
> The pandas commit 
> [https://github.com/pandas-dev/pandas/commit/029907c9d69a0260401b78a016a6c4515d8f1c40]
> replaced MPLPlot._add_legend_handle with 
> MPLPlot._append_legend_handles_labels.
> I've attached a PR on GitHub that replaces the use of 
> MPLPlot._add_legend_handle in PySpark with 
> MPLPlot._append_legend_handles_labels.
> Code:
> {{import pyspark.pandas as ps}}
> {{from matplotlib import pyplot as plt}}
> {{ps.set_option("plotting.backend", "matplotlib")}}
> {{df = ps.DataFrame({'data': [4, 5, 5, 6, 8, 9]})}}
> {{df['data'].plot.hist()}}
> {{plt.show()}}
>  
> Truncated traceback:
> {{Traceback (most recent call last): }}
> {{File "/home/develop/Documents/sparkbug/code.py", line 6, in <module>}}
> {{df['data'].plot.hist()}}
> {{...}}
> {{File "/mnt/transient/develop/miniconda3/envs/testenv/lib/python3.9/site-packages/pyspark/pandas/plot/matplotlib.py", line 403, in _make_plot}}
> {{self._add_legend_handle(artists[0], label, index=i)}}
> {{AttributeError: 'PandasOnSparkHistPlot' object has no attribute '_add_legend_handle'}}
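
The rename described above could also be handled by feature detection rather than a direct replacement. The following is only a minimal sketch of that idea and is not what the linked PR does: OldStylePlot, NewStylePlot, and register_legend_entry are hypothetical stand-ins invented for illustration, and they mimic only the two method names from the report, not the real pandas MPLPlot class.

```python
# Stand-in classes (hypothetical): they copy only the method names mentioned
# in the report, not the real pandas MPLPlot implementation.
class OldStylePlot:
    """Mimics a plot class that still has _add_legend_handle."""

    def __init__(self):
        self.legend_handles = []
        self.legend_labels = []

    def _add_legend_handle(self, handle, label, index=None):
        self.legend_handles.append(handle)
        self.legend_labels.append(label)


class NewStylePlot:
    """Mimics a plot class that only has _append_legend_handles_labels."""

    def __init__(self):
        self.legend_handles = []
        self.legend_labels = []

    def _append_legend_handles_labels(self, handle, label):
        self.legend_handles.append(handle)
        self.legend_labels.append(label)


def register_legend_entry(plot, handle, label, index=None):
    """Hypothetical helper: call whichever legend method the object provides."""
    if hasattr(plot, "_append_legend_handles_labels"):
        # Newer method name: takes no index argument.
        plot._append_legend_handles_labels(handle, label)
    else:
        # Older method name: forwards the index as before.
        plot._add_legend_handle(handle, label, index=index)


old_plot, new_plot = OldStylePlot(), NewStylePlot()
register_legend_entry(old_plot, "artist", "data", index=0)
register_legend_entry(new_plot, "artist", "data", index=0)
```

Both stand-in objects end up with the same legend entry, so the calling code does not need to know which method name is available.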



--
This message was sent by Atlassian Jira
(v8.20.1#820001)