[
https://issues.apache.org/jira/browse/SPARK-47998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18075032#comment-18075032
]
Bach Truong Tan Phat edited comment on SPARK-47998 at 4/21/26 4:39 AM:
-----------------------------------------------------------------------
We’d like to work on this issue. We plan to investigate the type validation in
pandas-on-Spark concat, improve handling for native pandas DataFrame/Series
inputs where conversion is appropriate, improve the resulting error message for
unsupported inputs, and add regression tests.
> pandas-on-spark DataFrame.concat will not join a Pandas dataframe and raises
> a misleading error
> -----------------------------------------------------------------------------------------------
>
> Key: SPARK-47998
> URL: https://issues.apache.org/jira/browse/SPARK-47998
> Project: Spark
> Issue Type: Bug
> Components: Pandas API on Spark
> Affects Versions: 3.4.3
> Reporter: Philip Kahn
> Priority: Minor
>
> The `concat` method has a strict type check that raises a misleading error:
> !image-2024-04-25-11-33-29-208.png!
> Note that the type reported is that of `objs` rather than `obj`, so a list of
> various objects produces an error saying that it cannot concatenate objects of
> type list, rather than naming the types of the elements that failed the check.
>
> Additionally, this strictly checks for pandas-on-Spark Series and DataFrames;
> since both classes will happily convert a native pandas object, something like
>
> objs = [DataFrame(x) if isinstance(x, pd.DataFrame) else Series(x) if
> isinstance(x, pd.Series) else x for x in objs]
> would trivially make this work in those cases and prevent a separate, strange
> error reporting that a dataframe wasn't valid in a dataframe concatenation.
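The two points in the report (convert inputs where a conversion exists, and name the offending element's type rather than the container's) can be sketched roughly as below. This is only an illustration, not the actual pandas-on-Spark code: `coerce_concat_inputs`, `accepted`, and `converters` are hypothetical names, and the types are passed in as parameters so the sketch runs without Spark installed.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def coerce_concat_inputs(
    objs: Iterable[Any],
    accepted: Tuple[type, ...],
    converters: Dict[type, Callable[[Any], Any]],
) -> List[Any]:
    """Hypothetical helper: validate and convert the inputs to a concat call.

    accepted:   types that can be concatenated as-is.
    converters: maps a foreign type to a callable that turns an instance
                of it into one of the accepted types.
    """
    result = []
    for position, obj in enumerate(objs):
        if isinstance(obj, accepted):
            result.append(obj)
            continue
        for foreign_type, convert in converters.items():
            if isinstance(obj, foreign_type):
                result.append(convert(obj))
                break
        else:
            # Report the type of the element that failed, not type(objs).
            expected = ", ".join(t.__name__ for t in (*accepted, *converters))
            raise TypeError(
                f"cannot concatenate object of type '{type(obj).__name__}' "
                f"at position {position}; expected one of: {expected}"
            )
    return result


# Assumed wiring for pandas-on-Spark (not executed here):
#   import pandas as pd
#   import pyspark.pandas as ps
#   objs = coerce_concat_inputs(
#       objs,
#       accepted=(ps.DataFrame, ps.Series),
#       converters={pd.DataFrame: ps.DataFrame, pd.Series: ps.Series},
#   )
```

Passing the converters as a mapping keeps the conversion rule and the error message in one place; whether silent conversion of large native pandas objects is desirable is a separate design question for the fix.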