Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22011#discussion_r208093660

    --- Diff: python/pyspark/rdd.py ---
    @@ -2429,6 +2441,29 @@ def _wrap_function(sc, func, deserializer, serializer, profiler=None):
                                  sc.pythonVer, broadcast_vars, sc._javaAccumulator)


    +class RDDBarrier(object):
    +
    +    """
    +    .. note:: Experimental
    +
    +    An RDDBarrier turns an RDD into a barrier RDD, which forces Spark to launch tasks of the stage
    +    contains this RDD together.
    +    """
    +
    +    def __init__(self, rdd):
    +        self.rdd = rdd
    +        self._jrdd = rdd._jrdd
    +
    +    def mapPartitions(self, f, preservesPartitioning=False):
    +        """
    +        Return a new RDD by applying a function to each partition of this RDD.
    +        """
    --- End diff --

    shall we match the documentation, or why is it different? FWIW, for coding block, just `` `blabla` `` should be good enough. Nicer if linked properly by like `` :class:`ClassName` ``.
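For context, the one-line docstring under review describes the usual `mapPartitions` contract: the function receives an iterator over one partition and yields the output elements for that partition. A minimal sketch of those semantics in plain Python (no Spark required; `map_partitions`, the sample partitions, and `f` are illustrative names, not part of the PR):

```python
from itertools import chain

def map_partitions(partitions, f):
    # Apply f to each partition's iterator and flatten the per-partition
    # results, mirroring what RDD.mapPartitions does across tasks.
    return list(chain.from_iterable(f(iter(p)) for p in partitions))

# Two partitions; f collapses each partition to a single sum.
partitions = [[1, 2], [3, 4]]

def f(iterator):
    yield sum(iterator)

print(map_partitions(partitions, f))  # [3, 7]
```

In the barrier case the per-partition function is the same; what changes is scheduling (all tasks of the stage launch together), which is why the reviewer asks whether the docstring should match `RDD.mapPartitions` or explain the difference.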
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org