Re: SparkSQL: Nested Query error

2014-10-30 Thread SK
Hi,

 I am getting an error in the Query Plan when I use the SQL statement
exactly as you have suggested. Is that the exact SQL statement I should be
using (I am not very familiar with SQL syntax)?


I also tried using the SchemaRDD's subtract method to perform this query:
usersRDD.subtract(deviceRDD).count(). The count comes out to be 1, but
there are many UIDs in tusers that are not in device, so the result is
not correct.
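
One thing worth checking here: subtract compares whole rows, so it only expresses "UIDs in tusers but not in device" if both sides are first reduced to the UID column. The sketch below shows that idea with plain Scala collections standing in for the two RDDs. The row shapes (UserRow, DeviceRow and their second fields) are hypothetical, since the real schemas are not shown in this thread. On SchemaRDDs the analogous call would presumably be usersRDD.map(_.getString(0)).subtract(deviceRDD.map(_.getString(0))).count(), assuming the UID is a string in the first column of both tables.

```scala
object SubtractByKey {
  // Hypothetical row shapes; only u_uid and d_uid come from the thread.
  case class UserRow(u_uid: String, name: String)
  case class DeviceRow(d_uid: String, model: String)

  // Project both sides down to the UID before subtracting, so rows are
  // compared by UID only and not by their other columns.
  def usersWithoutDevice(users: Seq[UserRow], devices: Seq[DeviceRow]): Int = {
    val deviceIds = devices.map(_.d_uid).toSet
    users.map(_.u_uid).filterNot(deviceIds.contains).size
  }

  def main(args: Array[String]): Unit = {
    val users = Seq(UserRow("u1", "Ann"), UserRow("u2", "Bob"), UserRow("u3", "Cy"))
    val devices = Seq(DeviceRow("u1", "phone"))
    println(usersWithoutDevice(users, devices)) // prints 2
  }
}
```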

I would like to know the right way to frame this query in SparkSQL.

thanks



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-Nested-Query-error-tp17691p17705.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



SparkSQL: Nested Query error

2014-10-29 Thread SK
Hi,

I am using Spark 1.1.0. I have the following SQL statement where I am trying
to count the number of UIDs that are in the tusers table but not in the
device table.

val users_with_no_device = sql_cxt.sql("SELECT COUNT (u_uid) FROM tusers
WHERE tusers.u_uid NOT IN (SELECT d_uid FROM device)")

I am getting the following error:
Exception in thread "main" java.lang.RuntimeException: [1.61] failure:
string literal expected
SELECT COUNT (u_uid) FROM tusers WHERE tusers.u_uid NOT IN (SELECT d_uid
FROM device)

I am not sure if every subquery has to be a string, so I tried to enclose
the subquery as a string literal as follows:
val users_with_no_device = sql_cxt.sql("SELECT COUNT (u_uid) FROM tusers
WHERE tusers.u_uid NOT IN ("SELECT d_uid FROM device")")
But that resulted in a compilation error.

What is the right way to frame the above query in Spark SQL?

thanks




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-Nested-Query-error-tp17691.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: SparkSQL: Nested Query error

2014-10-29 Thread Sanjiv Mittal
You may use a left outer join instead:

select count(u_uid) from tusers a left outer join device b on (a.u_uid =
 b.d_uid) where b.d_uid is null
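
To make the intent of this join concrete, here is a minimal sketch of the same logic on plain Scala collections rather than Spark tables. A left outer join pairs every user ID with its matching device ID or with nothing (modeled as None in place of SQL NULL), and the "is null" filter keeps only the unmatched users. The object and function names are made up for illustration, and this says nothing about whether the SQL parser in Spark 1.1.0 accepts the statement as written.

```scala
object AntiJoinSketch {
  // Left outer join on UID: each user ID is paired with every matching
  // device ID, or with None when there is no match (SQL NULL).
  def leftOuterJoin(userIds: Seq[String],
                    deviceIds: Seq[String]): Seq[(String, Option[String])] =
    userIds.flatMap { u =>
      val matches = deviceIds.filter(_ == u)
      if (matches.isEmpty) Seq((u, Option.empty[String]))
      else matches.map(d => (u, Option(d)))
    }

  // "where b.d_uid is null" keeps the unmatched rows; count tallies them.
  def countUsersWithoutDevice(userIds: Seq[String], deviceIds: Seq[String]): Int =
    leftOuterJoin(userIds, deviceIds).count { case (_, d) => d.isEmpty }

  def main(args: Array[String]): Unit = {
    val tusers = Seq("u1", "u2", "u3")
    val device = Seq("u1", "u1")
    println(countUsersWithoutDevice(tusers, device)) // prints 2
  }
}
```

Note that a user with several devices appears several times in the joined result, but never in the filtered one, so the final count is unaffected by duplicate device rows.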
