Re: Spark SQL DSL for joins?

2014-12-21 Thread Cheng Lian
On 12/17/14 1:43 PM, Jerry Raj wrote: Hi, I'm using the Scala DSL for Spark SQL, but I'm not able to do joins. I have two tables (backed by Parquet files) and I need to do a join across them using a common field (user_id). This works fine using standard SQL but not using the

Spark SQL DSL for joins?

2014-12-16 Thread Jerry Raj
Hi, I'm using the Scala DSL for Spark SQL, but I'm not able to do joins. I have two tables (backed by Parquet files) and I need to do a join across them using a common field (user_id). This works fine using standard SQL, but not using the language-integrated DSL: neither t1.join(t2, on =
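[A minimal sketch of what the language-integrated join can look like on the Spark 1.1/1.2 SchemaRDD API. The file paths, the sqlCtx name, and the a/b aliases are assumptions for illustration; the key points are that the join condition goes in the "on" option as a Catalyst expression built with ===, and that the two sides need aliases (via SchemaRDD.as) so the shared user_id column can be qualified.]

    val sqlCtx = new org.apache.spark.sql.SQLContext(sc)
    import sqlCtx._  // brings in the Symbol/String -> Catalyst expression implicits

    // Hypothetical Parquet-backed tables, aliased so user_id can be qualified
    val t1 = sqlCtx.parquetFile("/path/to/t1.parquet").as('a)
    val t2 = sqlCtx.parquetFile("/path/to/t2.parquet").as('b)

    // Inner join (the default joinType) on the common user_id column;
    // === builds a Catalyst EqualTo expression, Scala's == would not
    val joined = t1.join(t2, on = Some("a.user_id".attr === "b.user_id".attr))
    joined.count()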

Re: Spark SQL DSL for joins?

2014-12-16 Thread Jerry Raj
Another problem with the DSL: t1.where('term == "dmin").count() returns zero. But sqlCtx.sql("select * from t1 where term = 'dmin'").count() returns 700, which I know is correct from the data. Is there something wrong with how I'm using the DSL? Thanks On 17/12/14 11:13 am, Jerry Raj wrote:

Re: Spark SQL DSL for joins?

2014-12-16 Thread Tobias Pfeiffer
Jerry, On Wed, Dec 17, 2014 at 3:35 PM, Jerry Raj jerry@gmail.com wrote: Another problem with the DSL: t1.where('term == "dmin").count() returns zero. Looks like you need ===: https://spark.apache.org/docs/1.1.0/api/scala/index.html#org.apache.spark.sql.SchemaRDD Tobias
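[For reference, a small sketch of the difference, assuming the same t1 table and sqlCtx as in the messages above: Scala's == compares the Symbol 'term with the string and yields a plain Boolean (always false), which presumably ends up as a constant-false filter, while === builds a Catalyst equality expression that is evaluated against every row.]

    import sqlCtx._  // Symbol -> attribute and String/Boolean -> literal implicits

    // Always zero: 'term == "dmin" is ordinary Scala equality between a
    // Symbol and a String, so the filter becomes the constant false
    t1.where('term == "dmin").count()

    // Evaluated per row: === builds a Catalyst EqualTo expression
    t1.where('term === "dmin").count()

    // Equivalent SQL, as in the original message
    sqlCtx.sql("select * from t1 where term = 'dmin'").count()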