[ 
https://issues.apache.org/jira/browse/SPARK-40265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haejoon Lee updated SPARK-40265:
--------------------------------
    Description: 
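Index.intersection in pandas API on Spark behaves inconsistently with pandas 
when other is a list of tuples. In the example below, pidx is a pandas Index 
and psidx is its pandas-on-Spark counterpart:
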
{code:python}
>>> other = [(1, 2), (3, 4)]
>>> pidx
Int64Index([1, 2, 3, 4], dtype='int64', name='Koalas')
>>> psidx
Int64Index([1, 2, 3, 4], dtype='int64', name='Koalas')
>>> psidx.intersection(other).sort_values()
MultiIndex([], )
>>> pidx.intersection(other).sort_values()
Traceback (most recent call last):
...
ValueError: Names should be list-like for a MultiIndex
{code}

We should fix pandas API on Spark to follow the pandas behavior.

  was:
There is inconsistent behavior on Index.intersection for pandas API on Spark as 
below:


{code:python}
>>> pidx
Int64Index([1, 2, 3, 4], dtype='int64', name='Koalas')
>>> psidx
Int64Index([1, 2, 3, 4], dtype='int64', name='Koalas')
>>> psidx.intersection(other).sort_values()
MultiIndex([], )
>>> pidx.intersection(other).sort_values()
Traceback (most recent call last):
...
ValueError: Names should be list-like for a MultiIndex
{code}

We should fix it to follow pandas.


> Fix the inconsistent behavior for Index.intersection.
> -----------------------------------------------------
>
>                 Key: SPARK-40265
>                 URL: https://issues.apache.org/jira/browse/SPARK-40265
>             Project: Spark
>          Issue Type: Test
>          Components: Pandas API on Spark
>    Affects Versions: 3.4.0
>            Reporter: Haejoon Lee
>            Priority: Major
>
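> Index.intersection in pandas API on Spark behaves inconsistently with pandas 
> when other is a list of tuples. In the example below, pidx is a pandas Index 
> and psidx is its pandas-on-Spark counterpart: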
> {code:python}
> >>> other = [(1, 2), (3, 4)]
> >>> pidx
> Int64Index([1, 2, 3, 4], dtype='int64', name='Koalas')
> >>> psidx
> Int64Index([1, 2, 3, 4], dtype='int64', name='Koalas')
> >>> psidx.intersection(other).sort_values()
> MultiIndex([], )
> >>> pidx.intersection(other).sort_values()
> Traceback (most recent call last):
> ...
> ValueError: Names should be list-like for a MultiIndex
> {code}
> We should fix pandas API on Spark to follow the pandas behavior.
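
For illustration only, here is a minimal sketch of how such a mismatch could be checked in a test. The helper names outcome and is_consistent are hypothetical and not part of pandas or PySpark, and the two stand-in functions only mimic the outcomes quoted in the report (an exception on the pandas side, an empty result on the pandas-on-Spark side), so the snippet runs without either library installed.

```python
from typing import Any, Callable, Tuple


def outcome(fn: Callable[[], Any]) -> Tuple[str, str]:
    """Run fn and summarize what happened as (kind, type name)."""
    try:
        result = fn()
    except Exception as exc:
        # Record the exception type instead of propagating it.
        return ("raises", type(exc).__name__)
    return ("returns", type(result).__name__)


def is_consistent(pandas_call: Callable[[], Any],
                  spark_call: Callable[[], Any]) -> bool:
    """True when both calls raise the same error type or return the same type."""
    return outcome(pandas_call) == outcome(spark_call)


# Stand-ins (assumptions) that mimic the two outcomes shown in the report.
def pandas_like() -> Any:
    raise ValueError("Names should be list-like for a MultiIndex")


def spark_like() -> Any:
    return []  # stands in for the empty MultiIndex


print(outcome(pandas_like))                     # ('raises', 'ValueError')
print(outcome(spark_like))                      # ('returns', 'list')
print(is_consistent(pandas_like, spark_like))   # False
```

With both libraries available, the stand-ins could be replaced by lambda: pidx.intersection(other) and lambda: psidx.intersection(other). The comparison is deliberately coarse: it looks only at result and exception type names, not at index contents.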



