Re: Using data frames to join separate RDDs in spark streaming

2016-06-05 Thread Cyril Scetbon
Problem solved by creating only one RDD.

> On Jun 1, 2016, at 14:05, Cyril Scetbon wrote:
>
> It seems that to join a DStream with a RDD I can use :
>
> mgs.transform(rdd => rdd.join(rdd1))
>
> or
>
> mgs.foreachRDD(rdd => rdd.join(rdd1))
>
> But, I can't see why

Re: Using data frames to join separate RDDs in spark streaming

2016-06-01 Thread Cyril Scetbon
It seems that to join a DStream with a RDD I can use :

mgs.transform(rdd => rdd.join(rdd1))

or

mgs.foreachRDD(rdd => rdd.join(rdd1))

But, I can't see why rdd1.toDF("id","aid") really causes SPARK-5063

> On Jun 1, 2016, at 12:00, Cyril Scetbon wrote:
>
> Hi guys,
>
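
For readers following the thread, here is a minimal sketch of the two calls mentioned above. The local master, the sample data, and the use of queueStream as a stand-in for the real input are assumptions made for illustration; only the names mgs and rdd1 come from the message. The functions passed to transform and foreachRDD run on the driver, so referencing rdd1 there is allowed. SPARK-5063 is raised when an RDD is used inside a closure that runs on executors, such as the body of a map. Also note that foreachRDD returns Unit, so the join result is only visible if an action is called inside it.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

object JoinStreamWithRdd {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("join-sketch")
    val ssc = new StreamingContext(conf, Seconds(1))
    val sc = ssc.sparkContext

    // Hypothetical static data standing in for rdd1 (id -> aid).
    val rdd1: RDD[(String, String)] =
      sc.parallelize(Seq("1" -> "a", "2" -> "b"))

    // Hypothetical input standing in for mgs: a single batch of (id, message) pairs.
    val queue = mutable.Queue[RDD[(String, String)]](
      sc.parallelize(Seq("1" -> "msg1", "3" -> "msg3")))
    val mgs = ssc.queueStream(queue)

    // transform returns a new DStream, so the joined pairs can be used downstream.
    val joined = mgs.transform(rdd => rdd.join(rdd1))
    joined.print()

    // foreachRDD returns Unit; without an action inside, the join is never evaluated.
    mgs.foreachRDD { rdd =>
      rdd.join(rdd1).collect().foreach(println)
    }

    ssc.start()
    ssc.awaitTerminationOrTimeout(5000) // run for a few seconds, then shut down
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

Use transform when the joined data feeds further stream operations, and foreachRDD when each batch only needs to be written out or inspected.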

Using data frames to join separate RDDs in spark streaming

2016-06-01 Thread Cyril Scetbon
Hi guys,

I have 2 input data streams that I want to join using DataFrames, and unfortunately I get the message produced by https://issues.apache.org/jira/browse/SPARK-5063 as I can't reference rdd1 in (2) :

(1)
val rdd1 = sc.esRDD(es_resource.toLowerCase, query)
  .map(r