Re: Implementing custom RDD in Java
I know it isn't exactly what you are asking for, but you could solve it like this: have the driver program query Dynamo for the S3 file keys, call sc.textFile on each of the file keys, and .union them all together to make your RDD. You could wrap that up in a function and it wouldn't be too painful to reuse. I don't personally know about creating custom RDDs in Java.

On Mon, May 25, 2015 at 10:37 PM, Swaranga Sarma sarma.swara...@gmail.com wrote:

> My data is in S3 and is indexed in Dynamo. For example, if I want to load data for a given time range, I first need to query Dynamo for the S3 file keys for that range and then load them in Spark. The files may not always share the same S3 path prefix, so sc.textFile("s3://directory_path/") won't work. I am looking for pointers on how to implement something analogous to HadoopRDD or JdbcRDD, but in Java. I am looking to do something similar to what they have done here: https://github.com/lagerspetz/TimeSeriesSpark/blob/master/src/spark/timeseries/dynamodb/DynamoDbRDD.scala. That one reads its data from Dynamo; my custom RDD would query DynamoDB for the S3 file keys and then load them from S3.

On Mon, May 25, 2015 at 8:19 PM, Alex Robbins alexander.j.robb...@gmail.com wrote:

> If a Hadoop InputFormat already exists for your data source, you can load it from there. Otherwise, maybe you can dump your data source out as text and load it from there. Without more detail on what your data source is, it'll be hard for anyone to help.

On Mon, May 25, 2015 at 5:00 PM, swaranga sarma.swara...@gmail.com wrote:

> Hello, I have a custom data source and I want to load the data into Spark to perform some computations. For this, I see that I might need to implement a new RDD for my data source. I am a complete Scala noob, and I am hoping that I can implement the RDD in Java only. I looked around the internet and could not find any resources. Any pointers?
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Implementing-custom-RDD-in-Java-tp23026.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
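[Editor's note: the driver-side approach Alex describes above can be sketched as follows. This is a hedged sketch, not code from the thread: the bucket name, the key format, and the idea of doing the Dynamo lookup before building any RDDs are illustrative assumptions, and the Spark calls are shown in comments so the helper stands alone.]

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the driver-side approach: query Dynamo for S3 object keys first,
// then turn each key into a full path that sc.textFile() understands, and
// union the per-file RDDs together. The real key lookup would use the AWS
// SDK's DynamoDB client; it is omitted here.
public class S3PathUnion {

    // Turn bare S3 object keys into full s3n:// paths. The "s3n" scheme and
    // the single-bucket layout are assumptions for illustration.
    public static List<String> buildS3Paths(String bucket, List<String> keys) {
        List<String> paths = new ArrayList<String>();
        for (String key : keys) {
            paths.add("s3n://" + bucket + "/" + key);
        }
        return paths;
    }

    // With a JavaSparkContext in hand, the union itself is a simple fold:
    //
    //   JavaRDD<String> result = null;
    //   for (String path : buildS3Paths(bucket, keys)) {
    //       JavaRDD<String> part = sc.textFile(path);
    //       result = (result == null) ? part : result.union(part);
    //   }
    //
    // Note that textFile also accepts a comma-separated list of paths (it is
    // backed by Hadoop's FileInputFormat), which avoids the explicit loop for
    // a moderate number of files.

    public static void main(String[] args) {
        List<String> keys = new ArrayList<String>();
        keys.add("2015/05/25/part-0000.gz");
        System.out.println(buildS3Paths("my-bucket", keys));
    }
}
```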
Implementing custom RDD in Java
Hello, I have a custom data source and I want to load the data into Spark to perform some computations. For this, I see that I might need to implement a new RDD for my data source. I am a complete Scala noob, and I am hoping that I can implement the RDD in Java only. I looked around the internet and could not find any resources. Any pointers?
Re: Implementing custom RDD in Java
My data is in S3 and is indexed in Dynamo. For example, if I want to load data for a given time range, I first need to query Dynamo for the S3 file keys for that range and then load them in Spark. The files may not always share the same S3 path prefix, so sc.textFile("s3://directory_path/") won't work. I am looking for pointers on how to implement something analogous to HadoopRDD or JdbcRDD, but in Java. I am looking to do something similar to what they have done here: https://github.com/lagerspetz/TimeSeriesSpark/blob/master/src/spark/timeseries/dynamodb/DynamoDbRDD.scala. That one reads its data from Dynamo; my custom RDD would query DynamoDB for the S3 file keys and then load them from S3.

On Mon, May 25, 2015 at 8:19 PM, Alex Robbins alexander.j.robb...@gmail.com wrote:

> If a Hadoop InputFormat already exists for your data source, you can load it from there. Otherwise, maybe you can dump your data source out as text and load it from there. Without more detail on what your data source is, it'll be hard for anyone to help.

On Mon, May 25, 2015 at 5:00 PM, swaranga sarma.swara...@gmail.com wrote:

> Hello, I have a custom data source and I want to load the data into Spark to perform some computations. For this, I see that I might need to implement a new RDD for my data source. I am a complete Scala noob, and I am hoping that I can implement the RDD in Java only. I looked around the internet and could not find any resources. Any pointers?
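[Editor's note: for readers who want the HadoopRDD/JdbcRDD-style route asked about above, extending org.apache.spark.rdd.RDD directly from Java looks roughly like the sketch below. It targets the Spark 1.x / Scala 2.10–2.11 era of this thread, requires Spark on the classpath, and is not a tested implementation: the class name S3KeyRDD, the readLines helper, and the one-partition-per-key layout are all illustrative assumptions.]

```java
import java.util.Collections;
import java.util.List;

import org.apache.spark.Dependency;
import org.apache.spark.Partition;
import org.apache.spark.SparkContext;
import org.apache.spark.TaskContext;
import org.apache.spark.rdd.RDD;

import scala.collection.JavaConversions;
import scala.reflect.ClassTag;
import scala.reflect.ClassTag$;

// Sketch: an RDD written in Java whose partitions are S3 file keys. The
// DynamoDB query is assumed to run on the driver before the RDD is built,
// producing the list of keys passed to the constructor.
public class S3KeyRDD extends RDD<String> {

    // Scala's RDD constructor takes an implicit ClassTag; from Java we
    // construct one explicitly.
    private static final ClassTag<String> STRING_TAG =
            ClassTag$.MODULE$.apply(String.class);

    private final List<String> s3Keys;

    public S3KeyRDD(SparkContext sc, List<String> s3Keys) {
        // An input RDD has no parent dependencies, so pass an empty Seq.
        super(sc,
              JavaConversions.asScalaBuffer(
                      Collections.<Dependency<?>>emptyList()),
              STRING_TAG);
        this.s3Keys = s3Keys;
    }

    // One partition per S3 key; a real implementation might group small
    // files into fewer partitions.
    static class KeyPartition implements Partition {
        private final int index;
        final String key;

        KeyPartition(int index, String key) {
            this.index = index;
            this.key = key;
        }

        @Override
        public int index() {
            return index;
        }
    }

    @Override
    public Partition[] getPartitions() {
        Partition[] parts = new Partition[s3Keys.size()];
        for (int i = 0; i < s3Keys.size(); i++) {
            parts[i] = new KeyPartition(i, s3Keys.get(i));
        }
        return parts;
    }

    @Override
    public scala.collection.Iterator<String> compute(Partition split,
                                                     TaskContext context) {
        KeyPartition p = (KeyPartition) split;
        // Hypothetical helper: a real version would stream lines from the
        // S3 object (e.g. via the AWS SDK's AmazonS3#getObject).
        java.util.Iterator<String> lines = readLines(p.key);
        return JavaConversions.asScalaIterator(lines);
    }

    private java.util.Iterator<String> readLines(String key) {
        // Placeholder so the sketch stands alone.
        return Collections.<String>emptyList().iterator();
    }
}
```

To use the result with the Java API, the Scala RDD can be wrapped, e.g. with JavaRDD.fromRDD(new S3KeyRDD(sc, keys), STRING_TAG). For many workloads, though, the textFile-plus-union approach earlier in the thread achieves the same end with much less machinery.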