Github user logannc commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18945#discussion_r140148777
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -1761,12 +1761,37 @@ def toPandas(self):
                     raise ImportError("%s\n%s" % (e.message, msg))
             else:
                 dtype = {}
    +            columns_with_null_int = {}
    +            def null_handler(rows, columns_with_null_int):
    +                for row in rows:
    +                    row = row.asDict()
    +                    for column in columns_with_null_int:
    +                        val = row[column]
    +                        dt = dtype[column]
    +                        if val is not None:
    +                            if abs(val) > 16777216: # Max value before np.float32 loses precision.
    +                                val = np.float64(val)
    +                                if np.float64 != dt:
    --- End diff ---
    
    No, not strictly necessary, but it's also hardly harmful, and it may future-proof things a bit...? Anyway, it can be removed if you think it should be.
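
    As background for the `16777216` threshold in the diff above: `np.float32` has a 24-bit significand, so every integer with magnitude up to 2**24 = 16777216 is representable exactly, while larger integers start to collide. A quick sketch demonstrating why the diff promotes such values to `np.float64` (the variable names here are illustrative, not from the PR):

    ```python
    import numpy as np

    # float32 carries a 24-bit significand, so 2**24 = 16777216 is the
    # largest contiguous integer range it represents exactly.
    threshold = 2 ** 24  # the constant used in the diff

    print(np.float32(threshold) == threshold)          # True: still exact
    print(np.float32(threshold + 1) == threshold)      # True: 16777217 rounds down
    print(np.float64(threshold + 1) == threshold + 1)  # True: float64 is exact here
    ```

    Beyond the threshold, `float32` can no longer distinguish adjacent integers, which is why the handler falls back to `np.float64`.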

