chris snow created SPARK-11658:
----------------------------------

             Summary: simplify documentation for PySpark combineByKey
                 Key: SPARK-11658
                 URL: https://issues.apache.org/jira/browse/SPARK-11658
             Project: Spark
          Issue Type: Improvement
          Components: Documentation, PySpark
    Affects Versions: 1.5.1
            Reporter: chris snow
            Priority: Minor
The current documentation for combineByKey looks like this:

{code}
>>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
>>> def f(x): return x
>>> def add(a, b): return a + str(b)
>>> sorted(x.combineByKey(str, add, add).collect())
[('a', '11'), ('b', '1')]
{code}

The helper {{f}} is defined but never used, so I think the example could be simplified to:

{code}
>>> x = sc.parallelize([("a", 1), ("b", 1), ("a", 1)])
>>> def add(a, b): return a + str(b)
>>> x.combineByKey(str, add, add).collect()
[('a', '11'), ('b', '1')]
{code}

I'll shortly add a patch for this.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
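For reviewers unfamiliar with the three arguments to combineByKey (createCombiner, mergeValue, mergeCombiners), here is a minimal pure-Python sketch of the per-key combining logic that the docstring example exercises. This is an illustration only, not the Spark implementation: {{combine_by_key}} is a hypothetical helper, and it treats the input as a single partition, so mergeCombiners is accepted but never invoked here.

```python
def combine_by_key(pairs, create_combiner, merge_value, merge_combiners):
    """Sketch of combineByKey semantics over one partition (illustrative only).

    In real Spark, create_combiner/merge_value run within each partition and
    merge_combiners merges the partial results across partitions. With a
    single partition, merge_combiners is never called.
    """
    combined = {}
    for key, value in pairs:
        if key not in combined:
            # First value seen for this key: turn it into a combiner.
            combined[key] = create_combiner(value)
        else:
            # Subsequent values: fold into the existing combiner.
            combined[key] = merge_value(combined[key], value)
    return sorted(combined.items())


if __name__ == "__main__":
    x = [("a", 1), ("b", 1), ("a", 1)]
    add = lambda acc, v: acc + str(v)
    # Mirrors the docstring example: str builds "1", add appends "1" -> "11"
    print(combine_by_key(x, str, add, add))  # [('a', '11'), ('b', '1')]
```

This also shows why {{sorted(...)}} appears in the current docstring: dictionary/partition ordering is not guaranteed in general, so the doctest sorts for a deterministic expected output.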