[ 
https://issues.apache.org/jira/browse/SPARK-43282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haejoon Lee resolved SPARK-43282.
---------------------------------
    Resolution: Won't Fix

> Investigate DataFrame.sort_values with pandas behavior.
> -------------------------------------------------------
>
>                 Key: SPARK-43282
>                 URL: https://issues.apache.org/jira/browse/SPARK-43282
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Pandas API on Spark
>    Affects Versions: 4.0.0
>            Reporter: Haejoon Lee
>            Priority: Major
>
> {code:python}
> import pandas as pd
> pdf = pd.DataFrame(
>     {
>         "a": pd.Categorical([1, 2, 3, 1, 2, 3]),
>         "b": pd.Categorical(
>             ["b", "a", "c", "c", "b", "a"], categories=["c", "b", "d", "a"]
>         ),
>     },
> )
> pdf.groupby("a").apply(lambda x: x).sort_values(["a"])
> Traceback (most recent call last):
> ...
> ValueError: 'a' is both an index level and a column label, which is ambiguous.
> {code}
> We should investigate whether this is intended behavior or just a bug in
> pandas.
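
The error quoted in the description can be reproduced without groupby. The snippet below is a minimal sketch, not taken from the issue: it assumes the error arises whenever an index level and a column share a name, and it builds such a frame by hand (the frame, the index values, and the variable names are hypothetical). It also shows one possible workaround, dropping the index name before sorting.

```python
import pandas as pd

# Hypothetical minimal frame: the index level and the column are both named "a".
df = pd.DataFrame({"a": [3, 1, 2]}, index=pd.Index([10, 11, 12], name="a"))

# Sorting by "a" is ambiguous, so pandas raises ValueError.
try:
    df.sort_values(["a"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# One possible workaround (an assumption, not the resolution of this issue):
# remove the index name so that "a" refers only to the column.
sorted_df = df.rename_axis(None).sort_values(["a"])
print(error_message)
print(sorted_df)
```

Whether the frame returned by groupby(...).apply(...) in the report carries both the index level and the column depends on the pandas version and on the group_keys setting, so this sketch only illustrates the ambiguity itself.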



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
