[ 
https://issues.apache.org/jira/browse/SPARK-47998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18075032#comment-18075032
 ] 

Bach Truong Tan Phat edited comment on SPARK-47998 at 4/21/26 4:39 AM:
-----------------------------------------------------------------------

We’d like to work on this issue. We plan to investigate the type validation in 
pandas-on-Spark concat, improve handling for native pandas DataFrame/Series 
inputs where conversion is appropriate, improve the resulting error message for 
unsupported inputs, and add regression tests.


was (Author: JIRAUSER313068):
We’d like to work on this issue. We plan to investigate the type validation in 
pandas-on-Spark concat, improve handling for native pandas DataFrame/Series 
inputs where conversion is appropriate, improve the resulting error message for 
unsupported inputs, and add regression tests.

> pandas-on-spark DataFrame.concat will not join a Pandas dataframe and raises 
> a misleading error
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-47998
>                 URL: https://issues.apache.org/jira/browse/SPARK-47998
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.4.3
>            Reporter: Philip Kahn
>            Priority: Minor
>
> The `concat` method has a strict type check that raises a misleading error:
> !image-2024-04-25-11-33-29-208.png!
> Note that the type reported in the error is that of `objs` (the whole 
> argument) rather than of the offending `obj`, so a list of various objects 
> will say that it cannot concatenate objects of type list, rather than naming 
> the types of the elements that actually failed the check.
>  
> Additionally, this strictly checks for pandas-on-Spark Series and DataFrames. 
> Since both classes will happily convert a native pandas object, something like
>  
> objs = [DataFrame(x) if isinstance(x, pd.DataFrame) else Series(x) if 
> isinstance(x, pd.Series) else x for x in objs] 
> would trivially make this work in those cases and prevent another strange 
> error, one reporting that a dataframe wasn't valid in a dataframe concatenation.



