Thanks Michael, we'll update then.
Evert

On Sep 11, 2015 20:59, "Michael Armbrust" <mich...@databricks.com> wrote:

> Here is what I get on branch-1.5:
>
> x = sc.parallelize([dict(k=1, v="Evert"), dict(k=2, v="Erik")]).toDF()
> y = sc.parallelize([dict(k=1, v="Ruud"), dict(k=3, v="Vincent")]).toDF()
> x.registerTempTable('x')
> y.registerTempTable('y')
> sqlContext.sql("select y.v, x.v FROM x INNER JOIN y ON x.k=y.k").collect()
>
> Out[1]: [Row(v=u'Ruud', v=u'Evert')]
>
> On Fri, Sep 11, 2015 at 3:14 AM, Evert Lammerts <evert.lamme...@gmail.com> wrote:
>
>> Am I overlooking something? This doesn't seem right:
>>
>> x = sc.parallelize([dict(k=1, v="Evert"), dict(k=2, v="Erik")]).toDF()
>> y = sc.parallelize([dict(k=1, v="Ruud"), dict(k=3, v="Vincent")]).toDF()
>> x.registerTempTable('x')
>> y.registerTempTable('y')
>> sqlContext.sql("select y.v, x.v FROM x INNER JOIN y ON x.k=y.k").collect()
>>
>> Out[26]: [Row(v=u'Evert', v=u'Evert')]
>>
>> May just be because I'm behind; I'm on:
>>
>> Spark 1.5.0-SNAPSHOT (git revision 27ef854) built for Hadoop 2.6.0
>> Build flags: -Pyarn -Psparkr -Phadoop-2.6 -Dhadoop.version=2.6.0 -Phive
>> -Phive-thriftserver -DskipTests
>>
>> Can somebody check whether the above code does work on the latest release?
>>
>> Thanks!
>> Evert
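For reference, the result Michael reports on branch-1.5 can be sanity-checked with a plain-Python rendering of the same inner join (a sketch only, no Spark involved; `x` and `y` mirror the rows in the RDDs above):

```python
# Rows from the thread above, as plain dicts.
x = [dict(k=1, v="Evert"), dict(k=2, v="Erik")]
y = [dict(k=1, v="Ruud"), dict(k=3, v="Vincent")]

# Inner join on k, projecting (y.v, x.v) in that order,
# matching the SQL: select y.v, x.v FROM x INNER JOIN y ON x.k=y.k
result = [(ry["v"], rx["v"]) for rx in x for ry in y if rx["k"] == ry["k"]]
# → [("Ruud", "Evert")]
```

This agrees with the branch-1.5 output `[Row(v=u'Ruud', v=u'Evert')]`; the snapshot build instead returned `x.v` for both columns. Since both output fields are named `v`, aliasing them (e.g. `SELECT y.v AS yv, x.v AS xv`) also makes the result unambiguous regardless of column order.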