Re: Joining an RDD to a DataFrame

2016-05-13 Thread Xinh Huynh
Hi Cyril,
In the case where there are no documents, it looks like there is a typo in "addresses" (check the number of "d"s):
| scala> df.select(explode(df("addresses.id")).as("aid"), df("id")) <== addresses
| org.apache.spark.sql.AnalysisException: Cannot resolve column name "id" among
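
To make the failure mode concrete, here is a minimal spark-shell sketch; df stands for any DataFrame with a nested addresses array of structs, as in the original question further down, and the misspelling shown is only hypothetical:

import org.apache.spark.sql.functions.explode

// A nested path that does not match the schema fails as soon as the column is resolved:
//   df("adresses.id")   // AnalysisException: Cannot resolve column name ...
// Spelled exactly as the schema spells it, the same select works:
df.select(explode(df("addresses.id")).as("aid"), df("id"))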

Re: Joining an RDD to a DataFrame

2016-05-12 Thread Cyril Scetbon
Nobody has the answer? Another thing I've seen is that if I have no documents at all:
scala> df.select(explode(df("addresses.id")).as("aid")).collect
res27: Array[org.apache.spark.sql.Row] = Array()
Then
scala> df.select(explode(df("addresses.id")).as("aid"), df("id"))

Re: Joining an RDD to a DataFrame

2016-05-08 Thread Cyril Scetbon
Hi Ashish,
The issue is not related to converting an RDD to a DF. I did that; I was just asking whether I should do it differently. The issue concerns the exception thrown when using array_contains with a sql.Column instead of a value. I found another way to do it using explode, as follows:
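
Cyril's snippet is cut off before his code, so what follows is only a sketch of what an explode-based rewrite could look like; the column names are taken from the other messages in this thread, df stands for the E/S DataFrame with the nested addresses array, and the element type of the streaming RDD is assumed to be a plain string id:

import sqlContext.implicits._
import org.apache.spark.sql.functions.explode

// Ids produced by the streaming job, as a one-column DataFrame (assumed shape).
val idsDF = sc.parallelize(Seq("a1", "a3")).toDF("aid")

// Flattening the nested array turns the membership test into an ordinary equi-join,
// instead of calling array_contains(<array column>, <other column>), which threw:
val flat = df.select(df("id"), explode(df("addresses.id")).as("aid"))
val enriched = idsDF.join(flat, "aid")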

Re: Joining an RDD to a DataFrame

2016-05-08 Thread Ashish Dubey
Is there any reason you don't want to convert this? I don't think a join between an RDD and a DF is supported.
On Sat, May 7, 2016 at 11:41 PM, Cyril Scetbon wrote:
> Hi,
>
> I have an RDD built during a Spark Streaming job and I'd like to join it to
> a DataFrame (E/S input) to
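
A minimal sketch of the conversion Ashish is suggesting, using the Spark 1.x shell; the element type of the streaming RDD is not given in the thread, so plain string ids are assumed:

import sqlContext.implicits._                 // enables rdd.toDF in the Spark 1.x shell
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical stand-in for the RDD built by the streaming job.
val rdd = sc.parallelize(Seq("a1", "a3"))

// 1) Via the implicits, for simple element types or case classes:
val rddDF = rdd.toDF("aid")

// 2) Via an explicit schema, when more control over the types is needed:
val schema = StructType(Seq(StructField("aid", StringType, nullable = false)))
val rddDF2 = sqlContext.createDataFrame(rdd.map(Row(_)), schema)

// Once both sides are DataFrames, a regular join is possible.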

Joining an RDD to a DataFrame

2016-05-08 Thread Cyril Scetbon
Hi,
I have an RDD built during a Spark Streaming job and I'd like to join it to a DataFrame (E/S input) to enrich it. It seems that I can't join the RDD and the DF without first converting the RDD to a DF (tell me if I'm wrong). Here are the schemas of both DFs:
scala> df
res32:
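
The schema output is cut off above, so here is a toy reconstruction of the two inputs, modelling only the columns that appear elsewhere in this thread (id and addresses.id); the real df is read from E/S rather than built in memory:

// Hypothetical stand-ins for the two sides of the join.
case class Address(id: String)
case class Doc(id: String, addresses: Seq[Address])

val df = sqlContext.createDataFrame(Seq(
  Doc("d1", Seq(Address("a1"), Address("a2"))),
  Doc("d2", Seq(Address("a3")))
))
df.printSchema()   // id: string, addresses: array<struct<id: string>>

// The RDD built inside the streaming job, reduced here to the ids used for enrichment:
val rdd = sc.parallelize(Seq("a1", "a3"))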