Re: Filtering records based on empty value of column in SparkSql

2015-12-09 Thread Fengdong Yu
val req_logs_with_dpid = req_logs.filter(req_logs("req_info.pid") != "")

Azuryy Yu, Sr. Infrastructure Engineer

Re: Filtering records based on empty value of column in SparkSql

2015-12-09 Thread Prashant Bhardwaj
Already tried it, but I'm getting the following error:

overloaded method value filter with alternatives:
  (conditionExpr: String)org.apache.spark.sql.DataFrame
  (condition: org.apache.spark.sql.Column)org.apache.spark.sql.DataFrame
cannot be applied to (Boolean)

Also tried: val req_logs_with_dpid =
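The compile error arises because != here is Scala's standard Any.!=, which returns a plain Boolean, and neither filter overload accepts a Boolean. A minimal sketch of the Column-based and string-based alternatives (DataFrame and column names taken from the thread; Spark 1.5-era API assumed):

```scala
// != compares against "" with Scala's Any.!= and yields a Boolean, which matches
// neither filter(conditionExpr: String) nor filter(condition: Column).
// The Column operator !== builds a Column expression instead:
val byColumn = req_logs.filter(req_logs("req_info.dpid") !== "")

// Equivalent SQL-string form, accepted by the (conditionExpr: String) overload:
val byString = req_logs.filter("req_info.dpid != ''")
```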

Filtering records based on empty value of column in SparkSql

2015-12-09 Thread Prashant Bhardwaj
Hi,

I have two columns in my JSON which can have null, empty, or non-empty strings as values. I know how to filter records which have a non-null value using the following:

val req_logs = sqlContext.read.json(filePath)
val req_logs_with_dpid = req_logs.filter("req_info.dpid is not null or
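The condition string above is cut off in the archive; a hedged sketch of how a non-null-and-non-empty filter could be written (the text after "or" in the original is unknown, so this predicate is an assumption, not the original code):

```scala
val req_logs = sqlContext.read.json(filePath)
// Keep rows whose req_info.dpid is neither null nor the empty string
// (assumed intent; the original truncated condition may have differed):
val req_logs_with_dpid =
  req_logs.filter("req_info.dpid is not null and req_info.dpid != ''")
```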

Re: Filtering records based on empty value of column in SparkSql

2015-12-09 Thread Gokula Krishnan D
Hello Prashant,

Can you please try like this? For instance, the input file name is "student_detail.txt" and it contains:

ID,Name,Sex,Age
===
101,Alfred,Male,30
102,Benjamin,Male,31
103,Charlie,Female,30
104,Julie,Female,30
105,Maven,Male,30
106,Dexter,Male,30
107,Lundy,Male,32
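The code in this message was lost in the archive; a hedged sketch of the kind of RDD-based approach the sample data suggests (the variable names and exact logic are assumptions, and Prashant's follow-up implies the example selected the empty records):

```scala
val students = sc.textFile("student_detail.txt")
// Split each CSV line and keep rows where some field is empty:
val emptyField = students
  .map(_.split(",", -1))            // -1 keeps trailing empty fields
  .filter(_.exists(_.trim.isEmpty))
```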

Re: Filtering records based on empty value of column in SparkSql

2015-12-09 Thread Gokula Krishnan D
Ok, then you can slightly change it like this: [image: Inline image 1]

Thanks & Regards,
Gokula Krishnan (Gokul)

Re: Filtering records based on empty value of column in SparkSql

2015-12-09 Thread Prashant Bhardwaj
Anyway, I got it. I have to use !== instead of ===. Thanks BTW.
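A sketch of the resolved filter, with names taken from the original question. Note that in Spark 2.0 and later, !== was deprecated in favor of =!= (worth checking against the Spark version in use):

```scala
// !== is the Column inequality operator, so filter receives a Column, not a Boolean:
val req_logs_with_dpid = req_logs.filter(req_logs("req_info.dpid") !== "")

// Spark 2.x spelling of the same filter:
// val req_logs_with_dpid = req_logs.filter(req_logs("req_info.dpid") =!= "")
```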

Re: Filtering records based on empty value of column in SparkSql

2015-12-09 Thread Prashant Bhardwaj
I have to do the opposite of what you're doing: I have to filter non-empty records.

Re: Filtering records based on empty value of column in SparkSql

2015-12-09 Thread Gokula Krishnan D
Please refer to the link below; drop() in DataFrameNaFunctions provides features to drop rows with null values in given columns. Hope it also helps.

https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.sql.DataFrameNaFunctions

Thanks & Regards,
Gokula Krishnan (Gokul)
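A minimal sketch of the na.drop approach from the linked DataFrameNaFunctions docs (whether drop accepts a nested field like req_info.dpid is an assumption; it may require a top-level or flattened column):

```scala
// na returns DataFrameNaFunctions; drop(cols) removes rows that are null
// in any of the named columns:
val withDpid = req_logs.na.drop(Seq("req_info.dpid"))
```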