Re: SparkSQL: Nested Query error
Hi,

I am getting an error in the query plan when I use the SQL statement exactly as you have suggested. Is that the exact SQL statement I should be using? (I am not very familiar with SQL syntax.)

I also tried using the SchemaRDD's subtract method to perform this query: usersRDD.subtract(deviceRDD).count(). The count comes out to be 1, but there are many UIDs in tusers that are not in device, so the result is not correct.

I would like to know the right way to frame this query in SparkSQL.

thanks

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-Nested-Query-error-tp17691p17705.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
SparkSQL: Nested Query error
Hi,

I am using Spark 1.1.0. I have the following SQL statement where I am trying to count the number of UIDs that are in the tusers table but not in the device table:

    val users_with_no_device = sql_cxt.sql("SELECT COUNT (u_uid) FROM tusers WHERE tusers.u_uid NOT IN (SELECT d_uid FROM device)")

I am getting the following error:

    Exception in thread "main" java.lang.RuntimeException: [1.61] failure: string literal expected
    SELECT COUNT (u_uid) FROM tusers WHERE tusers.u_uid NOT IN (SELECT d_uid FROM device)

I am not sure if every subquery has to be a string, so I tried to enclose the subquery as a string literal as follows:

    val users_with_no_device = sql_cxt.sql("SELECT COUNT (u_uid) FROM tusers WHERE tusers.u_uid NOT IN ("SELECT d_uid FROM device")")

But that resulted in a compilation error.

What is the right way to frame the above query in Spark SQL?

thanks

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/SparkSQL-Nested-Query-error-tp17691.html
Re: SparkSQL: Nested Query error
You may use:

    select count(u_uid) from tusers a left outer join device b on (a.u_uid = b.d_uid) where b.d_uid is null

On Wed, Oct 29, 2014 at 5:45 PM, SK skrishna...@gmail.com wrote:
> What is the right way to frame the above query in Spark SQL?
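
To illustrate what the suggested left outer join with the "is null" filter computes, here is a minimal sketch in plain Scala collections rather than Spark itself. The case classes User and Device, the helper names leftOuterJoin and countUsersWithNoDevice, and the sample UIDs are all hypothetical stand-ins for the tusers and device tables; the sketch also assumes there are no NULL UIDs in either table.

```scala
// Hypothetical in-memory stand-ins for the tusers and device tables.
case class User(u_uid: String)
case class Device(d_uid: String)

// LEFT OUTER JOIN ... ON (a.u_uid = b.d_uid):
// every user appears at least once; users with no matching device
// are paired with None (the SQL NULL on the right-hand side).
def leftOuterJoin(users: Seq[User], devices: Seq[Device]): Seq[(User, Option[Device])] =
  users.flatMap { u =>
    val matches: Seq[Option[Device]] =
      devices.filter(_.d_uid == u.u_uid).map(Option(_))
    val padded = if (matches.isEmpty) Seq(Option.empty[Device]) else matches
    padded.map(d => (u, d))
  }

// WHERE b.d_uid IS NULL keeps only the unmatched users,
// and COUNT(u_uid) counts the rows that remain.
def countUsersWithNoDevice(users: Seq[User], devices: Seq[Device]): Int =
  leftOuterJoin(users, devices).count { case (_, d) => d.isEmpty }

// Sample data (made up for illustration): u2 has two devices, u1 and u3 have none.
val users = Seq(User("u1"), User("u2"), User("u3"))
val devices = Seq(Device("u2"), Device("u2"), Device("u9"))
```

With this sample data, countUsersWithNoDevice(users, devices) evaluates to 2 (u1 and u3). A user with several devices produces several joined rows, but none of them survive the null filter, so the count matches what the NOT IN query was meant to return.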