[GitHub] [spark] dlindelof commented on issue #26747: [SPARK-29188][PYTHON] toPandas (without Arrow) gets wrong dtypes when applied on empty DF

GitBox Tue, 03 Dec 2019 23:49:24 -0800

dlindelof commented on issue #26747: [SPARK-29188][PYTHON] toPandas (without 
Arrow) gets wrong dtypes when applied on empty DF
URL: https://github.com/apache/spark/pull/26747#issuecomment-561519797
 
 
   @srowen This illustrates the current behaviour: an empty Spark DataFrame 
with a column of type `LongType` becomes a pandas DataFrame whose column has 
dtype `object` (the generic Python-object dtype that pandas also uses for 
strings) rather than `int64`:
   
   ```
   In [62]: foo = spark.sql("SELECT CAST(1 AS LONG) AS bar WHERE 1 = 0")
   
   In [63]: foo
   Out[63]: DataFrame[bar: bigint]
   
   In [64]: foo.toPandas().dtypes
   Out[64]:
   bar    object
   dtype: object
   ```
   
   When the DataFrame is not empty, the column gets the expected `int64` dtype:
   
   ```
   In [65]: foo = spark.sql("SELECT CAST(1 AS LONG) AS bar WHERE 1 = 1")
   
   In [66]: foo.toPandas().dtypes
   Out[66]:
   bar    int64
   dtype: object
   ```
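
The same difference can be reproduced with pandas alone, which may help narrow down where the dtype is lost. This is only a sketch: it assumes that the non-Arrow path of `toPandas` builds the frame from the collected rows with `pd.DataFrame.from_records`, which is my reading of the PySpark code and not something shown above. The row data and the final `astype` cast are hypothetical stand-ins for illustration, not the fix proposed in the PR.

```python
import pandas as pd

# Hypothetical stand-in for the Spark column names (assumption for illustration).
columns = ["bar"]

# With no rows, pandas has no values to infer a dtype from and falls back to object.
empty_pdf = pd.DataFrame.from_records([], columns=columns)

# With at least one row of Python ints, pandas infers int64.
nonempty_pdf = pd.DataFrame.from_records([(1,)], columns=columns)

print(empty_pdf.dtypes["bar"])     # object
print(nonempty_pdf.dtypes["bar"])  # int64

# One possible remedy, sketched here only: cast explicitly using a dtype
# derived from the schema instead of relying on inference.
fixed = empty_pdf.astype({"bar": "int64"})
print(fixed.dtypes["bar"])         # int64
```

The point of the sketch is that inference needs at least one value, so any correction for the empty case has to come from the Spark schema rather than from the data.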


