A JavaRDD is just a wrapper around a normal RDD defined in Scala, which is
stored in the "rdd" field.  You can access everything that way.  The
JavaRDD wrappers just provide some interfaces that are a bit easier to work
with in Java.

In case that's not entirely convincing, here's me demonstrating it inside the
spark-shell (yes, it's Scala, but I'm using the Java API):

scala> val jsc = new JavaSparkContext(sc)
jsc: org.apache.spark.api.java.JavaSparkContext = org.apache.spark.api.java.JavaSparkContext@7d365529

scala> val data = jsc.parallelize(java.util.Arrays.asList(Array("a", "b", "c")))
data: org.apache.spark.api.java.JavaRDD[Array[String]] = ParallelCollectionRDD[0] at parallelize at <console>:15

scala> data.rdd.partitioner
res0: Option[org.apache.spark.Partitioner] = None
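
From a standalone Java program, the same lookup would look roughly like the
sketch below. The only part that matters for the question is
pairs.rdd().partitioner(); the pair RDD and the HashPartitioner are just
hypothetical setup so there is actually a partitioner to find, and the class
name PartitionerCheck is made up for the example.

import java.util.Arrays;

import org.apache.spark.HashPartitioner;
import org.apache.spark.Partitioner;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Option;
import scala.Tuple2;

public class PartitionerCheck {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "partitioner-check");

    // Build a pair RDD and partition it, so there is a partitioner to look up.
    JavaPairRDD<String, Integer> pairs = jsc
        .parallelizePairs(Arrays.asList(
            new Tuple2<String, Integer>("a", 1),
            new Tuple2<String, Integer>("b", 2)))
        .partitionBy(new HashPartitioner(4));

    // The wrapped Scala RDD is reachable through rdd();
    // its partitioner is exposed as a scala.Option.
    Option<Partitioner> partitioner = pairs.rdd().partitioner();
    if (partitioner.isDefined()) {
      System.out.println("Partitioned by: " + partitioner.get());
    } else {
      System.out.println("No partitioner set");
    }

    jsc.stop();
  }
}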


On Tue, Feb 17, 2015 at 3:44 PM, Darin McBeath <ddmcbe...@yahoo.com.invalid>
wrote:

> In an 'early release' of the Learning Spark book, there is the following
> reference:
>
> In Scala and Java, you can determine how an RDD is partitioned using its
> partitioner property (or partitioner() method in Java)
>
> However, I don't see the mentioned 'partitioner()' method in Spark 1.2 or
> a way of getting this information.
>
> I'm curious if anyone has any suggestions for how I might go about finding
> how an RDD is partitioned in a Java program.
>
> Thanks.
>
> Darin.
>
