Re: selecting columns with the same name in a join

2015-09-13 Thread Evert Lammerts
t(k=1, v="Ruud"), dict(k=3, v="Vincent")]).toDF()
> x.registerTempTable('x')
> y.registerTempTable('y')
> sqlContext.sql("select y.v, x.v FROM x INNER JOIN y ON x.k=y.k").collect()
>
> Out[1]: [Row(v=u'Ruud', v=u'Evert')]
>
> On

selecting columns with the same name in a join

2015-09-11 Thread Evert Lammerts
Am I overlooking something? This doesn't seem right:
x = sc.parallelize([dict(k=1, v="Evert"), dict(k=2, v="Erik")]).toDF()
y = sc.parallelize([dict(k=1, v="Ruud"), dict(k=3, v="Vincent")]).toDF()
x.registerTempTable('x')
y.registerTempTable('y')
sqlContext.sql("select y.v, x.v FROM x INNER JOIN y
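A minimal sketch of one common way to keep same-named columns apart in a join result: give each selected column an explicit alias. It uses Python's built-in sqlite3 module as a stand-in for the Spark temp tables so that it runs without a cluster. That substitution is an assumption of this sketch, not something the thread used, and the aliases y_v and x_v are made up for illustration.

```python
import sqlite3

# Stand-in for the two Spark temp tables: an in-memory SQLite database
# holding the same rows as x and y in the question.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows expose their column names via keys()
conn.executescript("""
    CREATE TABLE x (k INTEGER, v TEXT);
    CREATE TABLE y (k INTEGER, v TEXT);
    INSERT INTO x VALUES (1, 'Evert'), (2, 'Erik');
    INSERT INTO y VALUES (1, 'Ruud'), (3, 'Vincent');
""")

# Same query shape as in the question: both selected columns come from
# a source column called "v", so the result has no distinguishing names.
plain = conn.execute(
    "SELECT y.v, x.v FROM x INNER JOIN y ON x.k = y.k"
).fetchone()
plain_values = tuple(plain)

# With explicit aliases (hypothetical names y_v and x_v) each result
# column can be addressed unambiguously by name.
aliased = conn.execute(
    "SELECT y.v AS y_v, x.v AS x_v FROM x INNER JOIN y ON x.k = y.k"
).fetchone()
aliased_dict = dict(aliased)

conn.close()
```

Only the row with k=1 exists in both tables, so the join yields a single row. The aliased form lets a caller read aliased_dict["x_v"] instead of relying on column position.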

Re: Querying registered RDD (AsTable) using JDBC

2014-12-19 Thread Evert Lammerts
Yes you can, using HiveContext, a metastore and the thriftserver. The metastore persists information about your SchemaRDD, and the HiveContext, initialised with information on the metastore, can interact with the metastore. The thriftserver provides JDBC connections using the metastore. Using MySQ