Re: Analysis Exception after join

2017-07-04 Thread Bernard Jesop
It seems to be caused by this issue: https://issues.apache.org/jira/browse/SPARK-10925. I added a checkpoint, as suggested, to break the lineage, and it worked. Best regards, 2017-07-04 17:26 GMT+02:00 Bernard Jesop : > Thanks Didac, > > My bad, actually this code is
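
For readers following along, here is a minimal sketch of what checkpointing before the join could look like. The thread does not say where the checkpoint was placed, so putting it on dfAgg is an assumption, as are the sample data, the count aggregation, the object name and the checkpoint directory. Only the S_ID column and the join call come from the original post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.count

object CheckpointBeforeJoin {
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Hypothetical directory; Dataset.checkpoint() needs one to be configured.
    spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")

    // Hypothetical sample data; only the S_ID column name comes from the thread.
    val df = Seq((1L, "a"), (1L, "b"), (2L, "c")).toDF("S_ID", "value")

    // Assumed aggregation (the thread elides it as agg(...)).
    // checkpoint() is eager by default: it materializes the data and
    // returns a Dataset whose logical plan no longer carries the lineage.
    val dfAgg = df.groupBy("S_ID")
      .agg(count("*").as("row_count"))
      .checkpoint()

    // Same join call as in the original post.
    df.join(dfAgg, Seq("S_ID"), "left_outer")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("checkpoint-before-join")
      .master("local[*]")
      .getOrCreate()
    run(spark).show()
    spark.stop()
  }
}
```

Checkpointing writes the intermediate result to storage, so it trades some I/O for a shorter plan. Whether that trade is acceptable depends on the job.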

Re: Analysis Exception after join

2017-07-04 Thread Bernard Jesop
Thanks Didac, My bad, this code is actually incomplete; it should have been: - dfAgg = df.groupBy("S_ID").agg(...). I want to access the aggregated values (of dfAgg) for each row of 'df', which is why I do a left outer join. Also, regarding the second parameter, I am using this signature of
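
The pattern described here, attaching per-group aggregates to every original row, can be sketched as follows. The score column, the avg and max aggregations and the sample rows are hypothetical stand-ins for the elided agg(...). The join call is the one from the original post, which passes the key columns as a Seq of names.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, max}

object AggregateThenJoin {
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // Hypothetical data; only S_ID is taken from the thread.
    val df = Seq((1L, 10.0), (1L, 20.0), (2L, 5.0)).toDF("S_ID", "score")

    // Hypothetical aggregations standing in for agg(...).
    val dfAgg = df.groupBy("S_ID")
      .agg(avg("score").as("avg_score"), max("score").as("max_score"))

    // Joining on a Seq of column names keeps a single S_ID column in the
    // result, followed by the remaining left and right columns.
    df.join(dfAgg, Seq("S_ID"), "left_outer")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("aggregate-then-join")
      .master("local[*]")
      .getOrCreate()
    run(spark).show()
    spark.stop()
  }
}
```

Each input row keeps its own score and gains the aggregates of its S_ID group.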

Re: Analysis Exception after join

2017-07-03 Thread Didac Gil
With the left join, you are joining two tables. In your case, df is the left table and dfAgg is the right table. The second parameter should be the joining condition, right? For instance, dfRes = df.join(dfAgg, $"userName" === $"name", "left_outer"), having a field in df called userName, and another
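
A small sketch of the condition-based join suggested above, for the case where the key columns have different names on each side. The two DataFrames and their contents are hypothetical. Only the userName and name columns and the "left_outer" join type come from the message.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object JoinOnCondition {
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._ // enables the $"col" syntax and toDF

    // Hypothetical left and right tables.
    val users = Seq(("alice", 1), ("bob", 2)).toDF("userName", "visits")
    val profiles = Seq(("alice", "FR")).toDF("name", "country")

    // The second argument is a Column expression, not a list of names.
    // Both key columns (userName and name) stay in the result.
    users.join(profiles, $"userName" === $"name", "left_outer")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("join-on-condition")
      .master("local[*]")
      .getOrCreate()
    run(spark).show()
    spark.stop()
  }
}
```

Left rows without a match are kept, with nulls in the right-hand columns.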

Analysis Exception after join

2017-07-03 Thread Bernard Jesop
Hello, I don't understand my error message. Basically, all I am doing is: - dfAgg = df.groupBy("S_ID") - dfRes = df.join(dfAgg, Seq("S_ID"), "left_outer") However, I get this AnalysisException: " Exception in thread "main" org.apache.spark.sql.AnalysisException: resolved attribute(s) S_ID#1903L