Hi Muhammet,

Python also supports SQL queries: http://spark.apache.org/docs/latest/sql-programming-guide.html#running-sql-queries-programmatically
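As a rough sketch of the SQL approach from the quoted reply below: rather than hard-coding one predicate per column, the NaN filter can be generated for any list of columns and then passed to `spark.sql`. The table and column names here are hypothetical, and `isnan` is Spark SQL's built-in NaN test:

```python
# Sketch: build a Spark SQL query string that filters out rows where any
# of the given columns is NaN. Table/column names are illustrative only.
def nan_filter_query(table, columns):
    # One "NOT isnan(col)" predicate per column, joined with AND
    predicates = " AND ".join(f"NOT isnan({c})" for c in columns)
    return f"SELECT {', '.join(columns)} FROM {table} WHERE {predicates}"

query = nan_filter_query("table1", ["field1", "field2", "field3"])
# The resulting string could then be executed with spark.sql(query)
```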
Regards,
--
Bedrytski Aliaksandr
sp...@bedryt.ski

On Mon, Sep 26, 2016, at 10:01, muhammet pakyürek wrote:
> but my request is related to Python, because I have designed a preprocess
> for data which looks for rows including NaN values. If the number of
> NaNs is above the threshold, the row is deleted; otherwise it is filled with a
> predictive value. Therefore I need a Python version of this process.
>
> *From:* Bedrytski Aliaksandr <sp...@bedryt.ski>
> *Sent:* Monday, September 26, 2016 7:53 AM
> *To:* muhammet pakyürek
> *Cc:* user@spark.apache.org
> *Subject:* Re: how to find NaN values of each row of a Spark dataframe to decide whether the row is dropped or not
>
> Hi Muhammet,
>
> have you tried to use SQL queries?
>
>> spark.sql("""
>>     SELECT
>>         field1,
>>         field2,
>>         field3
>>     FROM table1
>>     WHERE
>>         NOT isnan(field1)
>>         AND NOT isnan(field2)
>>         AND NOT isnan(field3)
>> """)
>
> This query filters out rows containing NaN for a table with 3 columns.
>
> Regards,
> --
> Bedrytski Aliaksandr
> sp...@bedryt.ski
>
> On Mon, Sep 26, 2016, at 09:30, muhammet pakyürek wrote:
>> is there any way to do this directly? If not, is there any way to do
>> this indirectly using another data structure of Spark?
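For the drop-or-fill preprocessing described in the quoted message, PySpark's DataFrame API already offers `dropna(thresh=...)` (keep rows with at least `thresh` non-null values) followed by `fillna(...)`. The core per-row decision can be sketched in plain Python; the threshold and fill value below are illustrative assumptions, not values from the thread:

```python
import math

# Sketch of the per-row rule described above: drop a row when its NaN count
# exceeds a threshold, otherwise replace each NaN with a substitute value.
# In PySpark, a similar effect comes from df.dropna(thresh=...) + df.fillna(...).
def drop_or_fill(row, max_nan, fill_value):
    is_nan = lambda v: isinstance(v, float) and math.isnan(v)
    nan_count = sum(1 for v in row if is_nan(v))
    if nan_count > max_nan:
        return None  # signal that the row should be dropped
    return [fill_value if is_nan(v) else v for v in row]

rows = [[1.0, float("nan"), 3.0],
        [float("nan"), float("nan"), float("nan")]]
cleaned = [r for r in (drop_or_fill(r, max_nan=1, fill_value=0.0) for r in rows)
           if r is not None]
# cleaned == [[1.0, 0.0, 3.0]]
```

In a real pipeline, `fill_value` would come from the predictive model the poster mentions (e.g. a per-column estimate) rather than a constant.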