dlindelof commented on issue #26747: [SPARK-29188][PYTHON] toPandas (without Arrow) gets wrong dtypes when applied on empty DF
URL: https://github.com/apache/spark/pull/26747#issuecomment-561519797

@srowen This illustrates the current behaviour: an empty Spark DataFrame with a column of type `LongType` becomes a pandas DataFrame whose column has dtype `object` (the generic dtype pandas also uses for strings):

```
In [62]: foo = spark.sql("SELECT CAST(1 AS LONG) AS bar WHERE 1 = 0")

In [63]: foo
Out[63]: DataFrame[bar: bigint]

In [64]: foo.toPandas().dtypes
Out[64]:
bar    object
dtype: object
```

When the DataFrame is not empty, this is what you see:

```
In [65]: foo = spark.sql("SELECT CAST(1 AS LONG) AS bar WHERE 1 = 1")

In [66]: foo.toPandas().dtypes
Out[66]:
bar    int64
dtype: object
```
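
The same effect can be reproduced without a Spark session. The sketch below is pandas-only and assumes the non-Arrow path builds the frame from the collected rows, so with zero rows pandas has nothing to infer a dtype from. The `schema` dict is a hypothetical stand-in for the Spark schema, not anything taken from the patch, and the `astype` call is only one possible way to restore the dtype.

```python
import pandas as pd

# Hypothetical stand-in for the Spark schema: column name -> intended pandas dtype.
schema = {"bar": "int64"}

# No rows: pandas cannot infer a dtype from values and falls back to object.
empty = pd.DataFrame.from_records([], columns=list(schema))
print(empty.dtypes)

# Casting to the dtypes implied by the schema restores int64 on the empty frame.
fixed = empty.astype(schema)
print(fixed.dtypes)

# With at least one row, inference alone already yields int64.
non_empty = pd.DataFrame.from_records([(1,)], columns=list(schema))
print(non_empty.dtypes)
```

Casting after construction keeps the empty and non-empty cases consistent, which is the behaviour the comment above is asking for.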