Re: How do you get the partitioner for an RDD in Java?

2015-02-17 Thread Darin McBeath


Thanks Imran.  That's exactly what I needed to know.

Darin.


From: Imran Rashid iras...@cloudera.com
To: Darin McBeath ddmcbe...@yahoo.com 
Cc: User user@spark.apache.org 
Sent: Tuesday, February 17, 2015 8:35 PM
Subject: Re: How do you get the partitioner for an RDD in Java?



A JavaRDD is just a wrapper around a normal RDD defined in Scala, which is 
stored in the rdd field.  You can access everything that way.  The JavaRDD 
wrappers just provide some interfaces that are a bit easier to work with in 
Java.

If this is at all convincing, here's me demonstrating it inside the spark-shell 
(yes, it's Scala, but I'm using the Java API):

scala> val jsc = new JavaSparkContext(sc)
jsc: org.apache.spark.api.java.JavaSparkContext = 
org.apache.spark.api.java.JavaSparkContext@7d365529

scala> val data = jsc.parallelize(java.util.Arrays.asList(Array("a", "b", "c")))
data: org.apache.spark.api.java.JavaRDD[Array[String]] = 
ParallelCollectionRDD[0] at parallelize at <console>:15

scala> data.rdd.partitioner
res0: Option[org.apache.spark.Partitioner] = None
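
In plain Java, the same lookup would look something like the sketch below 
(untested; the class name, app name, and variable names are made up).  The key 
point is that the Scala val partitioner surfaces to Java as the no-arg method 
partitioner() on the underlying RDD:

import java.util.Arrays;

import org.apache.spark.Partitioner;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Option;

public class PartitionerLookup {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "partitioner-lookup");

    JavaRDD<String> data = jsc.parallelize(Arrays.asList("a", "b", "c"));

    // JavaRDD.rdd() exposes the wrapped Scala RDD; a Scala val compiles
    // to a getter, so the RDD.partitioner val is called as partitioner().
    Option<Partitioner> partitioner = data.rdd().partitioner();

    System.out.println(partitioner);  // None for a plain parallelized RDD

    jsc.stop();
  }
}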




On Tue, Feb 17, 2015 at 3:44 PM, Darin McBeath ddmcbe...@yahoo.com.invalid 
wrote:

In an 'early release' of the Learning Spark book, there is the following 
reference:

"In Scala and Java, you can determine how an RDD is partitioned using its 
partitioner property (or partitioner() method in Java)."

However, I don't see the mentioned 'partitioner()' method in Spark 1.2 or a 
way of getting this information.

I'm curious if anyone has any suggestions for how I might go about finding how 
an RDD is partitioned in a Java program.

Thanks.

Darin.


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: How do you get the partitioner for an RDD in Java?

2015-02-17 Thread Imran Rashid
A JavaRDD is just a wrapper around a normal RDD defined in Scala, which is
stored in the rdd field.  You can access everything that way.  The
JavaRDD wrappers just provide some interfaces that are a bit easier to work
with in Java.

If this is at all convincing, here's me demonstrating it inside the
spark-shell (yes, it's Scala, but I'm using the Java API):

scala> val jsc = new JavaSparkContext(sc)
jsc: org.apache.spark.api.java.JavaSparkContext =
org.apache.spark.api.java.JavaSparkContext@7d365529

scala> val data = jsc.parallelize(java.util.Arrays.asList(Array("a", "b", "c")))
data: org.apache.spark.api.java.JavaRDD[Array[String]] =
ParallelCollectionRDD[0] at parallelize at <console>:15

scala> data.rdd.partitioner
res0: Option[org.apache.spark.Partitioner] = None
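
Where an RDD actually has a partitioner, the same access pattern applies from
Java.  Here is a minimal sketch (untested; class and variable names are made
up) that hash-partitions a pair RDD first, so partitioner() comes back defined
instead of None:

import java.util.Arrays;

import org.apache.spark.HashPartitioner;
import org.apache.spark.Partitioner;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Option;
import scala.Tuple2;

public class PartitionedLookup {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "partitioned-lookup");

    JavaPairRDD<String, Integer> pairs = jsc.parallelizePairs(
        Arrays.asList(new Tuple2<>("a", 1), new Tuple2<>("b", 2)));

    // partitionBy installs an explicit partitioner on the RDD.
    JavaPairRDD<String, Integer> partitioned =
        pairs.partitionBy(new HashPartitioner(4));

    // Now partitioner() returns Some(HashPartitioner) rather than None.
    Option<Partitioner> partitioner = partitioned.rdd().partitioner();
    System.out.println(partitioner);

    jsc.stop();
  }
}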


On Tue, Feb 17, 2015 at 3:44 PM, Darin McBeath ddmcbe...@yahoo.com.invalid
wrote:

 In an 'early release' of the Learning Spark book, there is the following
 reference:

 "In Scala and Java, you can determine how an RDD is partitioned using its
 partitioner property (or partitioner() method in Java)."

 However, I don't see the mentioned 'partitioner()' method in Spark 1.2 or
 a way of getting this information.

 I'm curious if anyone has any suggestions for how I might go about finding
 how an RDD is partitioned in a Java program.

 Thanks.

 Darin.

 -
 To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
 For additional commands, e-mail: user-h...@spark.apache.org




RE: How do you get the partitioner for an RDD in Java?

2015-02-17 Thread Mohammed Guller
Where did you look?

BTW, it is defined in the RDD class as a val:

val partitioner: Option[Partitioner]
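
Because a Scala val compiles to a getter method, Java code reaches it through
the RDD wrapped by the JavaRDD.  A minimal sketch (the helper class and method
names here are made up):

import org.apache.spark.Partitioner;
import org.apache.spark.api.java.JavaRDD;

import scala.Option;

public class PartitionerAccess {
  // The Scala val `partitioner` on RDD surfaces to Java as the
  // no-arg method partitioner() on the RDD wrapped by a JavaRDD.
  public static Option<Partitioner> partitionerOf(JavaRDD<?> javaRdd) {
    return javaRdd.rdd().partitioner();
  }
}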


Mohammed

-Original Message-
From: Darin McBeath [mailto:ddmcbe...@yahoo.com.INVALID] 
Sent: Tuesday, February 17, 2015 1:45 PM
To: User
Subject: How do you get the partitioner for an RDD in Java?

In an 'early release' of the Learning Spark book, there is the following 
reference:

"In Scala and Java, you can determine how an RDD is partitioned using its 
partitioner property (or partitioner() method in Java)."

However, I don't see the mentioned 'partitioner()' method in Spark 1.2 or a way 
of getting this information.

I'm curious if anyone has any suggestions for how I might go about finding how 
an RDD is partitioned in a Java program.

Thanks.

Darin.


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org