[GitHub] spark pull request #21546: [SPARK-23030][SQL][PYTHON] Use Arrow stream forma...

BryanCutler Mon, 23 Jul 2018 09:54:47 -0700

Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21546#discussion_r204479024
  
    --- Diff: python/pyspark/sql/dataframe.py ---
    @@ -2095,9 +2095,11 @@ def toPandas(self):
                             _check_dataframe_localize_timestamps
                         import pyarrow
     
    -                    tables = self._collectAsArrow()
    -                    if tables:
    -                        table = pyarrow.concat_tables(tables)
    +                    # Collect un-ordered list of batches, and list of correct order indices
    +                    batches, batch_order = self._collectAsArrow()
    +                    if batches:
    --- End diff --
    
    Sure. I was experimenting with making this an iterator, but I will change it, since it is a list now.
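
    A rough sketch of the two points in this thread, under stated assumptions: `batch_order[i]` is taken to be the position in `batches` of the batch that belongs at index `i` (the diff above does not show how the indices are applied), plain strings stand in for Arrow record batches, and `reorder_batches` is a hypothetical helper, not part of PySpark. It also shows why a truthiness check such as `if batches:` is only meaningful when the collected result is a list rather than an iterator.

```python
def reorder_batches(batches, batch_order):
    """Return the batches arranged in their original order.

    Assumption: batch_order[i] is the index into `batches` of the batch
    that belongs at position i. The real PR may define it differently.
    """
    return [batches[i] for i in batch_order]


# Stand-in values; in the PR these would be Arrow record batches.
batches = ["batch-2", "batch-0", "batch-1"]   # arrival order
batch_order = [1, 2, 0]                       # where each ordered batch sits
ordered = reorder_batches(batches, batch_order)
print(ordered)

# An empty list is falsy, so `if batches:` skips the empty case.
print(bool([]))

# An iterator object is truthy even when it will yield nothing,
# so the same check would not detect an empty result.
print(bool(iter([])))
```

    With a list, `if batches:` and `if len(batches) > 0:` behave the same; with an iterator, neither check would work without consuming it first.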



