Philip Kahn created SPARK-47998:
-----------------------------------

             Summary: pandas-on-spark DataFrame.concat will not join a Pandas 
dataframe and raises a misleading error
                 Key: SPARK-47998
                 URL: https://issues.apache.org/jira/browse/SPARK-47998
             Project: Spark
          Issue Type: Bug
          Components: Pandas API on Spark
    Affects Versions: 3.4.3
            Reporter: Philip Kahn


The `concat` method has a strict type check that raises a misleading error:

!image-2024-04-25-11-33-29-208.png!
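Note that the type reported in the error is that of `objs` (the whole 
argument) rather than `obj` (the offending element), so a list of mixed 
objects produces a message saying that objects of type list cannot be 
concatenated, instead of naming the element types that failed the check.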
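
To illustrate the reporting problem, here is a minimal sketch in plain Python. It is not the actual pyspark code: `validate_concat_inputs` and `FakeFrame` are hypothetical names made up for this example. The only point is that the message names the type of the offending element rather than the type of the container.

```python
class FakeFrame:
    """Hypothetical stand-in for a pandas-on-Spark DataFrame."""


def validate_concat_inputs(objs, allowed_types):
    """Raise TypeError naming the first element that is not an allowed type."""
    for position, obj in enumerate(objs):
        if not isinstance(obj, allowed_types):
            # Report type(obj), the element, not type(objs), the container.
            raise TypeError(
                f"cannot concatenate object of type '{type(obj).__name__}' "
                f"at position {position}; expected one of "
                f"{[t.__name__ for t in allowed_types]}"
            )


try:
    validate_concat_inputs([FakeFrame(), {"a": 1}], (FakeFrame,))
except TypeError as exc:
    message = str(exc)
    print(message)
```

With a check shaped like this, a mixed list fails with a message about the `dict` element instead of a message about `list`.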

 

Additionally, the check accepts only pandas-on-Spark Series and DataFrames. 
Since both constructors will happily convert a plain pandas object, something 
like

 

objs = [DataFrame(x) if isinstance(x, pd.DataFrame)
        else Series(x) if isinstance(x, pd.Series)
        else x for x in objs]

would trivially make this work in those cases and avoid a second confusing 
error, one that reports a DataFrame as invalid input to a DataFrame 
concatenation.
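
The coercion step suggested above could be sketched as a small helper. This is an illustration under stated assumptions, not a patch against pyspark: `coerce_inputs`, `PlainFrame` and `SparkFrame` are hypothetical names, and the converters are passed in so the sketch runs without a Spark session.

```python
def coerce_inputs(objs, rules):
    """Convert each element with the first matching (source_type, convert) rule.

    Elements matching no rule pass through unchanged, so a later type check
    can still reject them with an element-level message.
    """
    coerced = []
    for obj in objs:
        for source_type, convert in rules:
            if isinstance(obj, source_type):
                obj = convert(obj)
                break
        coerced.append(obj)
    return coerced


class PlainFrame:
    """Hypothetical stand-in for a plain pandas DataFrame."""


class SparkFrame:
    """Hypothetical stand-in for a pandas-on-Spark DataFrame."""

    def __init__(self, data):
        self.data = data


plain = PlainFrame()
existing = SparkFrame(None)
result = coerce_inputs([plain, existing, "other"], [(PlainFrame, SparkFrame)])
print([type(x).__name__ for x in result])
```

In a real session the rules would presumably pair `pd.DataFrame` with the pandas-on-Spark `DataFrame` constructor and `pd.Series` with the pandas-on-Spark `Series` constructor, as the one-liner in this report does.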



