[ 
https://issues.apache.org/jira/browse/SPARK-47998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18076638#comment-18076638
 ] 

Bach Truong Tan Phat commented on SPARK-47998:
----------------------------------------------

We've opened a pull request: https://github.com/apache/spark/pull/55561/

> pandas-on-spark DataFrame.concat will not join a Pandas dataframe and raises 
> a misleading error
> -----------------------------------------------------------------------------------------------
>
>                 Key: SPARK-47998
>                 URL: https://issues.apache.org/jira/browse/SPARK-47998
>             Project: Spark
>          Issue Type: Bug
>          Components: Pandas API on Spark
>    Affects Versions: 3.4.3
>            Reporter: Philip Kahn
>            Priority: Minor
>              Labels: pull-request-available
>
> The `concat` method has a strict type check that raises a misleading error:
> !image-2024-04-25-11-33-29-208.png!
> Note that the type reported is that of `objs`, rather than of the offending 
> `obj`, so a list of various objects will say that it cannot concatenate 
> objects of type list, rather than naming the element types that failed.
>  
> Additionally, this strictly checks for pandas-on-Spark Series and DataFrames. 
> Since both constructors will happily convert a plain pandas object, something 
> like
>  
> objs = [DataFrame(x) if isinstance(x, pd.DataFrame) else Series(x) if 
> isinstance(x, pd.Series) else x for x in objs] 
> would trivially make this work in those cases and prevent a different strange 
> error reporting that a DataFrame wasn't valid in a DataFrame concatenation.
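
For illustration, the two points in the description (coerce plain pandas
objects first, then report the type of the offending element rather than of
the whole list) might be sketched as follows. This is only a sketch under
stated assumptions: PandasFrame, PandasSeries, SparkFrame, SparkSeries,
coerce and check_objs are hypothetical stand-ins so that the snippet runs
without pandas or Spark installed. It is not the code in pyspark.pandas and
not necessarily what the pull request does.

```python
# Hypothetical stand-ins for pd.DataFrame / pd.Series.
class PandasFrame:
    def __init__(self, data):
        self.data = data


class PandasSeries:
    def __init__(self, data):
        self.data = data


# Hypothetical stand-ins for the pandas-on-Spark DataFrame / Series.
# Like the real constructors described in the issue, they accept a plain
# pandas-like object and wrap its data.
class SparkFrame:
    def __init__(self, data):
        self.data = data.data if isinstance(data, PandasFrame) else data


class SparkSeries:
    def __init__(self, data):
        self.data = data.data if isinstance(data, PandasSeries) else data


def coerce(obj):
    """Convert plain pandas-like objects; leave everything else untouched."""
    if isinstance(obj, PandasFrame):
        return SparkFrame(obj)
    if isinstance(obj, PandasSeries):
        return SparkSeries(obj)
    return obj


def check_objs(objs):
    """Coerce each element, then validate element by element."""
    objs = [coerce(obj) for obj in objs]
    for obj in objs:
        if not isinstance(obj, (SparkFrame, SparkSeries)):
            # Name the type of the failing element (obj), not of the list (objs).
            raise TypeError(
                "cannot concatenate object of type '{}'; only Series and "
                "DataFrame objects are valid".format(type(obj).__name__)
            )
    return objs
```

With this shape, a mixed list of stand-in pandas and pandas-on-Spark objects
passes the check, while a list containing, say, a string fails with a message
that names 'str' instead of 'list'.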



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
