Hi,

Some findings:

1)  Spark SQL does not support joins of more than two tables
2)  Spark SQL left join has a performance issue
3)  Spark SQL's cache table does not support two-tier queries
4)  Spark SQL does not support repartition
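
For point 1, one possible workaround might be to break the join into steps: join two tables first, keep the intermediate result, then join that with the third table. Below is a minimal sketch of the idea in plain Python rather than Spark. The tables, the column names, and the inner_join helper are all hypothetical; they only illustrate the staging and are not part of any Spark API.

```python
from collections import defaultdict


def inner_join(left, right, key):
    """Inner-join two lists of dict rows on a shared key column (hash join)."""
    # Build a lookup from key value to the matching rows on the right side.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    joined = []
    for lrow in left:
        # Rows without a match on the right side are dropped (inner join).
        for rrow in index.get(lrow[key], []):
            merged = dict(lrow)
            merged.update(rrow)
            joined.append(merged)
    return joined


# Hypothetical tables, for illustration only.
orders = [
    {"order_id": 1, "cust_id": 10, "item_id": 100},
    {"order_id": 2, "cust_id": 11, "item_id": 101},
]
customers = [{"cust_id": 10, "name": "Ann"}, {"cust_id": 11, "name": "Bob"}]
items = [{"item_id": 100, "title": "pen"}, {"item_id": 101, "title": "ink"}]

# Step 1: join the first two tables and keep the intermediate result.
orders_customers = inner_join(orders, customers, "cust_id")

# Step 2: join the intermediate result with the third table.
result = inner_join(orders_customers, items, "item_id")
```

Whether the same staging works in a given Spark SQL version, for example by saving the result of the first query as a table before running the second, would need to be checked.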

Arthur

On 10 Sep, 2014, at 10:22 pm, arthur.hk.c...@gmail.com 
<arthur.hk.c...@gmail.com> wrote:

> Hi,
> 
> Maybe you can take a look at the following.
> 
> http://databricks.com/blog/2014/03/26/spark-sql-manipulating-structured-data-using-spark-2.html
> 
> Good luck.
> Arthur
> 
> On 10 Sep, 2014, at 9:09 pm, arunshell87 <shell.a...@gmail.com> wrote:
> 
>> 
>> Hi,
>> 
>> I too tried SQL queries with joins, MINUS, subqueries, etc., but they did
>> not work in Spark SQL.
>> 
>> I did not find any documentation on which queries work and which do not
>> in Spark SQL. Maybe we have to wait for the Spark book to be released in
>> Feb-2015.
>> 
>> I believe you can try HiveQL in Spark for your requirement.
>> 
>> Thanks,
>> Arun
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-more-than-two-tables-for-join-tp13865p13877.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>> 
> 


