Hyukjin Kwon created SPARK-40005:
------------------------------------

             Summary: Self-contained examples with parameter descriptions in PySpark documentation
                 Key: SPARK-40005
                 URL: https://issues.apache.org/jira/browse/SPARK-40005
             Project: Spark
          Issue Type: Umbrella
          Components: Documentation, PySpark
    Affects Versions: 3.4.0
            Reporter: Hyukjin Kwon

This JIRA aims to improve the PySpark documentation in:
- {{pyspark}}
- {{pyspark.ml}}
- {{pyspark.sql}}
- {{pyspark.sql.streaming}}

We should:
- Make the examples self-contained, e.g., https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html
- Document {{Parameters}}, e.g., https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html#pandas.DataFrame.pivot

Many PySpark APIs are missing parameter descriptions, e.g., [DataFrame.union|https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.DataFrame.union.html#pyspark.sql.DataFrame.union]

If a file is large, e.g., dataframe.py, we should split the work on it into separate subtasks and improve the documentation in each.
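
To illustrate the two goals above, here is a minimal sketch of a docstring that documents its parameters and carries a self-contained example. It assumes the numpydoc section layout used by the linked pandas page. The function {{union_rows}} is a hypothetical stand-in written with the standard library only so that it runs anywhere; it is not a PySpark API, and an actual PySpark docstring would build its input DataFrames inside the example instead of using plain lists.

```python
import doctest


def union_rows(left, right):
    """Return a new list with the rows of ``left`` followed by the rows of ``right``.

    Hypothetical helper used only to illustrate the docstring layout;
    it is not part of PySpark.

    Parameters
    ----------
    left : list of tuple
        Rows of the first dataset.
    right : list of tuple
        Rows of the second dataset, appended after ``left``.

    Returns
    -------
    list of tuple
        All rows of ``left`` and ``right``; duplicates are kept.

    Examples
    --------
    The example creates its own input, so it can be copied and run as-is.

    >>> left = [(1, "a"), (2, "b")]
    >>> right = [(3, "c")]
    >>> union_rows(left, right)
    [(1, 'a'), (2, 'b'), (3, 'c')]
    """
    return list(left) + list(right)


if __name__ == "__main__":
    # Execute the docstring example as a test and report the outcome.
    print(doctest.testmod())
```

Because the example defines every value it uses, a reader can paste it into a fresh interpreter, and a doctest run can check it without hidden fixtures.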