[GitHub] [spark] HyukjinKwon commented on a change in pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

GitBox Thu, 24 Jun 2021 02:09:55 -0700


HyukjinKwon commented on a change in pull request #33054:
URL: https://github.com/apache/spark/pull/33054#discussion_r657770758




##########
File path: python/pyspark/sql/dataframe.py
##########
@@ -2695,6 +2695,62 @@ def writeTo(self, table):
         """
         return DataFrameWriterV2(self, table)
 
+    def to_pandas_on_spark(self, index_col=None):
+        """
+        Converts the existing DataFrame into a pandas-on-Spark DataFrame.
+
+        If a pandas-on-Spark DataFrame is converted to a Spark DataFrame and then back
+        to pandas-on-Spark, it will lose the index information and the original index
+        will be turned into a normal column.
+
+        Parameters
+        ----------
+        index_col: str or list of str, optional, default: None
+            Index column of table in Spark.
+
+        See Also
+        --------
+        DataFrame.to_spark
+
+        Examples
+        --------
+        >>> df = spark.createDataFrame([("a", 1), ("b", 2), ("c",  3)], ["Col1", "Col2"])
+
+        >>> psdf = df.to_pandas_on_spark()

Review comment:
       I think you can skip those tests for now. See `PandasConversionMixin.toPandas` as an example.
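
For context, one common way to skip a docstring example is the standard `doctest` `+SKIP` directive. The sketch below assumes that this is the pattern the comment has in mind; the docstring text and the helper `run_docstring_examples` are hypothetical stand-ins, not code from the pull request. The skipped line refers to an undefined `df`, so it would raise a `NameError` if doctest actually executed it.

```python
import doctest

# Hypothetical docstring: the second example needs a live Spark session,
# so it is marked with the standard doctest SKIP directive.
DOCSTRING = """
Examples
--------
>>> 1 + 1
2
>>> psdf = df.to_pandas_on_spark()  # doctest: +SKIP
"""


def run_docstring_examples(text):
    """Parse and run the doctest examples in `text`; return the doctest results."""
    parser = doctest.DocTestParser()
    # Empty globals: `df` is deliberately undefined here.
    test = parser.get_doctest(text, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_docstring_examples(DOCSTRING)
print(results.failed)
```

With the directive in place the run reports no failures, because doctest never evaluates the skipped line. Removing the directive makes the same example fail on the undefined `df`.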



