Mortada Mehyar created SPARK-18069:
--------------------------------------

             Summary: Many examples in Python docstrings are incomplete
                 Key: SPARK-18069
                 URL: https://issues.apache.org/jira/browse/SPARK-18069
             Project: Spark
          Issue Type: Documentation
          Components: Documentation
    Affects Versions: 2.0.1
            Reporter: Mortada Mehyar
            Priority: Minor



Many of the Python API functions have incomplete usage examples: the
docstring shows output without defining the input DataFrame, which makes
the example confusing to understand or follow.

For instance, the docstring for the `DataFrame.dtypes` property is currently


{code}
     def dtypes(self):
         """Returns all column names and their data types as a list.
 
         >>> df.dtypes
         [('age', 'int'), ('name', 'string')]
         """
{code}

when it should really be
{code}
     def dtypes(self):
         """Returns all column names and their data types as a list.
 
         >>> df = spark.createDataFrame([(2, 'Alice'), (5, 'Bob')], 'age: int, name: string')
         >>> df.dtypes
         [('age', 'int'), ('name', 'string')]
         """
{code}
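
One way to spot examples like this is to run each docstring with an empty namespace, so any name the example does not define itself raises a NameError. The sketch below is only an illustration, not part of the PR: `incomplete`, `complete`, and `count_failures` are hypothetical names, and plain Python lists stand in for a DataFrame so it runs without Spark.

```python
import doctest


def incomplete():
    """Example that relies on a name defined elsewhere.

    >>> sorted(rows)
    [('Alice', 2), ('Bob', 5)]
    """


def complete():
    """Example that defines its own input first.

    >>> rows = [('Bob', 5), ('Alice', 2)]
    >>> sorted(rows)
    [('Alice', 2), ('Bob', 5)]
    """


def count_failures(func):
    """Run func's docstring examples in an empty namespace; return failures."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = 0
    # globs={} means the examples see no pre-defined names such as `rows`.
    for test in finder.find(func, name=func.__name__, globs={}):
        # Swallow the failure report; only the count matters here.
        result = runner.run(test, out=lambda text: None)
        failed += result.failed
    return failed


if __name__ == "__main__":
    print("incomplete:", count_failures(incomplete))
    print("complete:", count_failures(complete))
```

The same idea could be applied to a real module by passing its functions through `count_failures`, although examples that need a live `spark` session would still have to create one themselves.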

I have a pending PR for fixing many of these occurrences here: 
https://github.com/apache/spark/pull/15053 


