chris snow created SPARK-11658:
----------------------------------

             Summary: simplify documentation for PySpark combineByKey
                 Key: SPARK-11658
                 URL: https://issues.apache.org/jira/browse/SPARK-11658
             Project: Spark
          Issue Type: Improvement
          Components: Documentation, PySpark
    Affects Versions: 1.5.1
            Reporter: chris snow
            Priority: Minor


The current documentation for combineByKey looks like this:

{code}
        >>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
        >>> def f(x): return x
        >>> def add(a, b): return a + str(b)
        >>> sorted(x.combineByKey(str, add, add).collect())
        [('a', '11'), ('b', '1')]
        """
{code}

I think it could be simplified as follows (dropping the unused f helper and the sorted() wrapper):

{code}
        >>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
        >>> def add(a, b): return a + str(b)
        >>> x.combineByKey(str, add, add).collect()
        [('a', '11'), ('b', '1')]
        """
{code}
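
For context, here is a rough sketch of what each of combineByKey's three arguments does, assuming a live SparkContext sc (e.g. from a pyspark shell). The helper names are illustrative only, not part of the docstring:

{code}
# Sketch of the three combineByKey arguments; assumes a SparkContext `sc`
# is already available. Helper names below are illustrative only.

def create_combiner(value):
    # Called for the first value seen for a key within a partition.
    return str(value)

def merge_value(combined, value):
    # Folds each further value for that key into the per-partition combiner.
    return combined + str(value)

def merge_combiners(c1, c2):
    # Merges the per-partition combiners for the same key.
    return c1 + c2

rdd = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
sorted(rdd.combineByKey(create_combiner, merge_value, merge_combiners).collect())
# [('a', '11'), ('b', '1')]
{code}

In the docstring example, str fills the createCombiner role and add serves as both merge functions.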

I'll add a patch for this shortly.


