Re: Spark SQL -- more than two tables for join

2014-10-07 Thread TANG Gen
Hi, the same problem happens when I try several joins together, such as 'SELECT * FROM sales INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY INNER JOIN eans ON (sales.BARC_KEY = eans.BARC_KEY and magasin.FORM_KEY = eans.FORM_KEY)'. The error information is as follows:

Re: Spark SQL -- more than two tables for join

2014-10-07 Thread Gen
Hi, in fact, the same problem happens when I try several joins together: SELECT * FROM sales INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY INNER JOIN eans ON (sales.BARC_KEY = eans.BARC_KEY and magasin.FORM_KEY = eans.FORM_KEY)
py4j.protocol.Py4JJavaError: An error occurred while
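As a sanity check independent of Spark, the chained three-table INNER JOIN in the query above is ordinary standard SQL. Here is a minimal sketch running the same shape of query against the stdlib `sqlite3` engine; only the table and column names (sales, magasin, eans, STO_KEY, BARC_KEY, FORM_KEY) come from the thread, while the schemas and rows are invented for illustration:

```python
import sqlite3

# Toy tables with made-up contents; the real schemas are not in the thread.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE sales (STO_KEY INTEGER, BARC_KEY INTEGER);
CREATE TABLE magasin (STO_KEY INTEGER, FORM_KEY INTEGER);
CREATE TABLE eans (BARC_KEY INTEGER, FORM_KEY INTEGER);
INSERT INTO sales VALUES (1, 10), (2, 20);
INSERT INTO magasin VALUES (1, 100), (2, 200);
INSERT INTO eans VALUES (10, 100), (20, 999);
""")

# The same chained-join shape as the query in the thread.
rows = cur.execute(
    "SELECT * FROM sales "
    "INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY "
    "INNER JOIN eans ON (sales.BARC_KEY = eans.BARC_KEY "
    "AND magasin.FORM_KEY = eans.FORM_KEY)"
).fetchall()
print(rows)  # only the STO_KEY=1 sale matches on both join keys
```

That the query runs fine on another engine suggests the failure reported here is in the Spark-side SQL parser rather than in the query itself.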

Re: Spark SQL -- more than two tables for join

2014-10-07 Thread Matei Zaharia
The issue is that you're using SQLContext instead of HiveContext. SQLContext implements a smaller subset of the SQL language and so you're getting a SQL parse error because it doesn't support the syntax you have. Look at how you'd write this in HiveQL, and then try doing that with HiveContext.
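A dialect-neutral way to see the distinction: a chain of INNER JOINs can always be rewritten as nested pairwise joins over a subquery. Whether the 2014-era SQLContext parser accepted this nested form is not established in the thread; the sketch below (toy tables assumed, `sqlite3` used only as a self-contained stdlib SQL engine) shows only that the two forms are equivalent in standard SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE sales (STO_KEY INTEGER, BARC_KEY INTEGER);
CREATE TABLE magasin (STO_KEY INTEGER, FORM_KEY INTEGER);
CREATE TABLE eans (BARC_KEY INTEGER, FORM_KEY INTEGER);
INSERT INTO sales VALUES (1, 10), (2, 20);
INSERT INTO magasin VALUES (1, 100), (2, 200);
INSERT INTO eans VALUES (10, 100), (20, 999);
""")

# Chained form, as written in the thread.
chained = cur.execute("""
    SELECT sales.STO_KEY, eans.BARC_KEY, eans.FORM_KEY
    FROM sales
    INNER JOIN magasin ON sales.STO_KEY = magasin.STO_KEY
    INNER JOIN eans ON (sales.BARC_KEY = eans.BARC_KEY
                        AND magasin.FORM_KEY = eans.FORM_KEY)
""").fetchall()

# Nested form: join sales+magasin first, then join that result with eans.
nested = cur.execute("""
    SELECT sm.STO_KEY, eans.BARC_KEY, eans.FORM_KEY
    FROM (SELECT sales.STO_KEY, sales.BARC_KEY, magasin.FORM_KEY
          FROM sales INNER JOIN magasin
          ON sales.STO_KEY = magasin.STO_KEY) AS sm
    INNER JOIN eans ON (sm.BARC_KEY = eans.BARC_KEY
                        AND sm.FORM_KEY = eans.FORM_KEY)
""").fetchall()

print(chained == nested)  # True: both return the one fully matching row
```

Since the query itself is valid, switching to HiveContext (which parses the fuller HiveQL dialect) is the fix being recommended here, rather than rewriting the query.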

Re: Re: Spark SQL -- more than two tables for join

2014-09-11 Thread Yin Huai
...@gmail.com arthur.hk.c...@gmail.com; CC: arunshell87 shell.a...@gmail.com; u...@spark.incubator.apache.org; Subject: Re: Spark SQL -- more than two tables for join. What version of Spark SQL are you running here? I think a lot of your concerns have likely been addressed in more recent

Re: Spark SQL -- more than two tables for join

2014-09-10 Thread arunshell87
Hi, I too had tried SQL queries with joins, MINUS, subqueries, etc., but they did not work in Spark SQL. I did not find any documentation on which queries work and which do not in Spark SQL; maybe we have to wait for the Spark book to be released in Feb-2015. I believe you can try HiveQL in
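On the MINUS point: MINUS is Oracle's spelling of the standard-SQL set operator EXCEPT, and which spelling (if either) a given 2014-era Spark dialect accepted is not settled in this thread. The semantics are easy to show with the stdlib `sqlite3` engine, which supports EXCEPT (toy data invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE a (x INTEGER)")
cur.execute("CREATE TABLE b (x INTEGER)")
cur.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
cur.executemany("INSERT INTO b VALUES (?)", [(2,)])

# EXCEPT: distinct rows of a that do not appear in b (Oracle's MINUS).
rows = cur.execute("SELECT x FROM a EXCEPT SELECT x FROM b").fetchall()
print(rows)
```

If EXCEPT/MINUS is unavailable in a given dialect, the same result can be expressed with a LEFT JOIN filtered on NULL or a NOT IN subquery, subject to the same parser-support caveats discussed in this thread.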

Re: Spark SQL -- more than two tables for join

2014-09-10 Thread arthur.hk.c...@gmail.com
Hi, maybe you can take a look at the following: http://databricks.com/blog/2014/03/26/spark-sql-manipulating-structured-data-using-spark-2.html Good luck. Arthur On 10 Sep 2014, at 9:09 pm, arunshell87 shell.a...@gmail.com wrote: Hi, I too had tried SQL queries with joins, MINUS,

Re: Spark SQL -- more than two tables for join

2014-09-10 Thread Michael Armbrust
What version of Spark SQL are you running here? I think a lot of your concerns have likely been addressed in more recent versions of the code / documentation. (Spark 1.1 should be published in the next few days) In particular, for serious applications you should use a HiveContext and HiveQL as

Re: Re: Spark SQL -- more than two tables for join

2014-09-10 Thread boyingk...@163.com
) at org.apache.spark.examples.sql.SparkSQLHBaseRelation.main(SparkSQLHBaseRelation.scala) boyingk...@163.com. From: Michael Armbrust; Date: 2014-09-11 00:28; To: arthur.hk.c...@gmail.com; CC: arunshell87; u...@spark.incubator.apache.org; Subject: Re: Spark SQL -- more than two tables for join. What version of Spark SQL are you running here? I think a lot