mapping JavaRDD to jdbc DataFrame

2015-05-04 Thread Lior Chaga
Hi,

I'd like to use a JavaRDD containing parameters for an SQL query, and use
Spark SQL's JDBC source to load data from MySQL.

Consider the following pseudo code:

JavaRDD<String> namesRdd = ... ;
...
options.put("url", "jdbc:mysql://mysql?user=usr");
options.put("password", "pass");
options.put("dbtable", "(SELECT * FROM mytable WHERE userName = ?) sp_campaigns");
DataFrame myTableDF = m_sqlContext.load("jdbc", options);


I'm looking for a way to map over namesRdd and get, for each name, the
result of the query, without losing the Spark context.

Using a mapping function doesn't seem like an option, because I don't have
the SQLContext inside it.
I can only think of using collect, and then iterating over the strings in
the RDD and executing the queries, but that would run in the driver program.

Any suggestions?

Thanks,
Lior


Re: mapping JavaRDD to jdbc DataFrame

2015-05-04 Thread ayan guha
You can use applySchema
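A minimal sketch of that idea, assuming Spark 1.3's Java API: applySchema turns the names RDD into a single-column DataFrame, the table is loaded once over JDBC, and a join replaces the per-name queries (the column name "userName" and the options map are taken from the original post; sqlContext and namesRdd are assumed to exist):

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Wrap each name in a Row so the RDD can be given a schema.
JavaRDD<Row> nameRows = namesRdd.map(new Function<String, Row>() {
  public Row call(String name) {
    return RowFactory.create(name);
  }
});

// Single-column schema matching the join key in mytable.
StructType schema = DataTypes.createStructType(new StructField[] {
  DataTypes.createStructField("userName", DataTypes.StringType, false)
});
DataFrame namesDF = sqlContext.applySchema(nameRows, schema);

// Load the table (or a subquery) once via the JDBC source.
DataFrame myTableDF = sqlContext.load("jdbc", options);

// Join on userName instead of issuing one query per name;
// this stays distributed, so nothing is collected to the driver.
DataFrame result = myTableDF.join(
    namesDF, myTableDF.col("userName").equalTo(namesDF.col("userName")));
```

This avoids collect entirely: the filtering happens as a distributed join rather than as driver-side queries.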

On Mon, May 4, 2015 at 10:16 PM, Lior Chaga lio...@taboola.com wrote:


-- 
Best Regards,
Ayan Guha