Github user mortada commented on the issue:

    https://github.com/apache/spark/pull/15053

@HyukjinKwon I may still be confused about something - first of all what do you mean by the package level docstring? Do you mean here: https://github.com/apache/spark/blob/master/python/pyspark/sql/dataframe.py#L1559 or here: https://github.com/apache/spark/blob/master/python/pyspark/sql/dataframe.py#L18?

Also, is the idea that we would define `df` globally, and then for the docstring of each function we would *not* do:

```
>>> df = spark.createDataFrame([('Alice', 2), ('Bob', 5)], ['name', 'age'])
>>> df.select(when(df['age'] == 2, 3).otherwise(4).alias("age")).collect()
[Row(age=3), Row(age=4)]
```

and instead we do:

```
>>> df.show()
+-----+---+
| name|age|
+-----+---+
|Alice|  2|
|  Bob|  5|
+-----+---+
>>> df.select(when(df['age'] == 2, 3).otherwise(4).alias("age")).collect()
[Row(age=3), Row(age=4)]
```

therefore not showing the user how to construct the DataFrame?
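
To make the "define `df` globally" option concrete, here is a minimal sketch that uses only the standard-library `doctest` module, not Spark. The names `ages`, `rows`, and `shared` are made up for illustration, and `rows` merely stands in for a shared `df`. It is an assumption about the general mechanism, not a copy of how the PySpark test runner is actually wired up.

```python
import doctest


def ages(rows):
    """Return the ages from a list of (name, age) tuples.

    The example below never constructs `rows`; it relies on the test
    runner to supply it, just as a docstring could rely on a shared `df`.

    >>> ages(rows)
    [2, 5]
    """
    return [age for _, age in rows]


# Hypothetical shared namespace: every docstring example sees these names
# without having to build them itself.
shared = {"rows": [("Alice", 2), ("Bob", 5)], "ages": ages}

# Parse the docstring into a DocTest bound to the shared namespace, then run it.
test = doctest.DocTestParser().get_doctest(ages.__doc__, shared, "ages", None, 0)
runner = doctest.DocTestRunner(verbose=False)
result = runner.run(test)

print(result.failed, result.attempted)
```

The trade-off is the one raised above: the docstring stays short and the example still passes, but a reader looking only at the rendered documentation cannot see where `rows` (or `df`) came from.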