[GitHub] spark pull request #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream forma...

HyukjinKwon Mon, 23 Jul 2018 01:31:28 -0700

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21546#discussion_r204319644
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -2095,9 +2095,11 @@ def toPandas(self):
                             _check_dataframe_localize_timestamps
                         import pyarrow
     
    -                    tables = self._collectAsArrow()
    -                    if tables:
    -                        table = pyarrow.concat_tables(tables)
    +                    # Collect un-ordered list of batches, and list of correct order indices
    +                    batches, batch_order = self._collectAsArrow()
    +                    if batches:
    --- End diff --
    
    Not a big deal at all, just a personal preference: I would write this check as `len(batches) > 0`.
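
To illustrate the two styles being compared, here is a minimal sketch. It assumes `batches` is a plain Python list; the helper names `has_batches_truthy` and `has_batches_len` are hypothetical and are not part of the pull request.

```python
def has_batches_truthy(batches):
    """Emptiness check in the style used in the diff: `if batches:`."""
    return bool(batches)


def has_batches_len(batches):
    """Emptiness check in the style suggested here: `len(batches) > 0`."""
    return len(batches) > 0


if __name__ == "__main__":
    # Placeholder strings stand in for Arrow record batches (assumption).
    for sample in ([], ["batch-0"], ["batch-0", "batch-1"]):
        print(sample, has_batches_truthy(sample), has_batches_len(sample))
```

For a list the two checks give the same answer. They differ only for a value without a length: `bool(None)` is `False`, while `len(None)` raises `TypeError`, so the explicit form also documents that a sized collection is expected.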



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
