Python implementation of RDD interface

Sven Kreiss Fri, 29 May 2015 08:31:23 -0700

I wanted to share a Python implementation of RDDs: pysparkling.

http://trivial.io/post/120179819751/pysparkling-is-a-native-implementation-of-the


The benefit is that the same code you run in PySpark on large datasets
also runs in pysparkling on small datasets or single documents. Running
with pysparkling requires neither the Java Virtual Machine nor Hadoop.
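
As a minimal sketch of that idea, the function below uses only RDD-style
calls (parallelize, flatMap, map, reduceByKey, collect), so it should accept
either a PySpark SparkContext or a pysparkling Context. The names
word_counts and demo are hypothetical, chosen for illustration, and the
sketch assumes pysparkling is installed and exposes a Context class.

```python
def word_counts(sc, lines):
    """Count words using only RDD-style calls.

    `sc` is assumed to be a context object with a `parallelize` method,
    for example a pyspark.SparkContext or a pysparkling.Context.
    """
    pairs = (
        sc.parallelize(lines)
        .flatMap(lambda line: line.split())   # one record per word
        .map(lambda word: (word, 1))          # (word, 1) pairs
        .reduceByKey(lambda a, b: a + b)      # sum the counts per word
        .collect()                            # bring results back as a list
    )
    return dict(pairs)


def demo():
    # Assumption: pysparkling is installed (pip install pysparkling).
    # No JVM or Hadoop is involved on this code path.
    from pysparkling import Context

    return word_counts(Context(), ["to be or", "not to be"])
```

Calling demo() runs the count locally in plain Python; passing a PySpark
SparkContext to word_counts instead would run the same logic on a cluster.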

Sven
