[ https://issues.apache.org/jira/browse/SPARK-43282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haejoon Lee resolved SPARK-43282.
---------------------------------
    Resolution: Won't Fix

> Investigate DataFrame.sort_values with pandas behavior.
> -------------------------------------------------------
>
>                 Key: SPARK-43282
>                 URL: https://issues.apache.org/jira/browse/SPARK-43282
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Pandas API on Spark
>    Affects Versions: 4.0.0
>            Reporter: Haejoon Lee
>            Priority: Major
>
> {code:java}
> import pandas as pd
> pdf = pd.DataFrame(
>     {
>         "a": pd.Categorical([1, 2, 3, 1, 2, 3]),
>         "b": pd.Categorical(
>             ["b", "a", "c", "c", "b", "a"], categories=["c", "b", "d", "a"]
>         ),
>     },
> )
> pdf.groupby("a").apply(lambda x: x).sort_values(["a"])
> Traceback (most recent call last):
> ...
> ValueError: 'a' is both an index level and a column label, which is
> ambiguous. {code}
> We should investigate whether this is intended behavior or just a bug in
> pandas.
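
For readers trying to reproduce the error, here is a minimal sketch of the same ambiguity without the groupby step. The frame, the index values, and the rename_axis(None) workaround are illustrative assumptions and are not taken from the issue. The result of groupby(...).apply(...) varies across pandas versions, so the sketch builds the name clash directly: an index level and a column that are both called "a".

```python
import pandas as pd

# Hypothetical minimal frame (not from the issue): the index is named "a"
# and there is also a column named "a".
df = pd.DataFrame(
    {"a": [3, 1, 2], "b": ["x", "y", "z"]},
    index=pd.Index([10, 20, 30], name="a"),
)

# Sorting by "a" is ambiguous: the name matches both the index level
# and the column label, so pandas raises ValueError.
try:
    df.sort_values(["a"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# One possible workaround (an assumption, not proposed in the issue):
# clear the index name so that "a" refers only to the column.
sorted_df = df.rename_axis(None).sort_values(["a"])
print(error_message)
print(sorted_df)
```

Clearing the index name keeps the original index values. Calling reset_index(drop=True) instead would discard them, which may or may not be acceptable depending on the use case.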