HyukjinKwon commented on a change in pull request #33054: URL: https://github.com/apache/spark/pull/33054#discussion_r657770758
##########
File path: python/pyspark/sql/dataframe.py
##########
@@ -2695,6 +2695,62 @@ def writeTo(self, table):
         """
         return DataFrameWriterV2(self, table)

+    def to_pandas_on_spark(self, index_col=None):
+        """
+        Converts the existing DataFrame into a pandas-on-Spark DataFrame.
+
+        If a pandas-on-Spark DataFrame is converted to a Spark DataFrame and then back
+        to pandas-on-Spark, it will lose the index information and the original index
+        will be turned into a normal column.
+
+        Parameters
+        ----------
+        index_col: str or list of str, optional, default: None
+            Index column of table in Spark.
+
+        See Also
+        --------
+        DataFrame.to_spark
+
+        Examples
+        --------
+        >>> df = spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["Col1", "Col2"])
+
+        >>> psdf = df.to_pandas_on_spark()

Review comment:
I think you can skip those tests for now. See `PandasConversionMixin.toPandas` as an example.
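For context, one common way to skip a docstring example is the standard `# doctest: +SKIP` directive, which is presumably what the pointer to `PandasConversionMixin.toPandas` refers to (an assumption; the comment does not say so explicitly). The sketch below is not PySpark code: `to_pandas_like` and `run_examples` are hypothetical names used only to show that a skipped example is never executed while the other examples in the same docstring still run.

```python
import doctest


def to_pandas_like(rows):
    """Hypothetical stand-in for a conversion method with an optional dependency.

    Examples
    --------
    >>> to_pandas_like([("a", 1)])  # doctest: +SKIP
      Col1  Col2
    0    a     1
    >>> len([("a", 1)])
    1
    """
    # Calling this would fail, so the first example only passes because it is skipped.
    raise ImportError("optional dependency not installed")


def run_examples():
    """Run the docstring examples above and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        to_pandas_like.__doc__,               # docstring containing the examples
        {"to_pandas_like": to_pandas_like},   # globals visible to the examples
        "to_pandas_like",                     # test name used in failure reports
        None,                                 # no source file
        0,                                    # starting line number
    )
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


if __name__ == "__main__":
    failed, attempted = run_examples()
    print(f"failed={failed}, attempted={attempted}")
```

Applied to the hunk above, the same directive would go at the end of the `>>>` lines that need a running pandas-on-Spark environment, so the examples remain in the documentation without being executed by the doctest run.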