Also take a look at this API: https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.DataFrameNaFunctions
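That API (`df.na.drop(...)` / `df.na.fill(...)`, exposed in PySpark as `DataFrame.dropna` / `DataFrame.fillna`) covers the drop side directly: `df.na.drop(thresh=k)` keeps only rows with at least `k` non-null/non-NaN values. For the "drop if too many NaNs, otherwise fill" rule you described, here is a minimal plain-Python sketch of the logic (names like `clean_rows`, `max_nan_fraction`, and `fill_value` are illustrative, not Spark API):

```python
import math

def clean_rows(rows, max_nan_fraction, fill_value):
    """Drop a row if its fraction of NaN cells exceeds max_nan_fraction,
    otherwise replace each NaN cell with fill_value."""
    cleaned = []
    for row in rows:
        nan_count = sum(1 for v in row if isinstance(v, float) and math.isnan(v))
        if nan_count / len(row) > max_nan_fraction:
            continue  # too many NaNs: drop the row
        cleaned.append([fill_value if isinstance(v, float) and math.isnan(v) else v
                        for v in row])
    return cleaned

nan = float("nan")
rows = [[1.0, 2.0, 3.0],   # no NaN: kept as-is
        [1.0, nan, 3.0],   # 1 NaN out of 3 (below threshold): filled
        [nan, nan, 3.0]]   # 2 NaNs out of 3 (above threshold): dropped
print(clean_rows(rows, max_nan_fraction=0.5, fill_value=0.0))
# → [[1.0, 2.0, 3.0], [1.0, 0.0, 3.0]]
```

In Spark itself the equivalent two-step pipeline would be `df.na.drop(thresh=k)` followed by `df.na.fill(value)`, with `k` chosen from your threshold; for a predictive (rather than constant) fill you would compute the replacement per column first.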
On Mon, Sep 26, 2016 at 1:09 AM, Bedrytski Aliaksandr <sp...@bedryt.ski> wrote:

> Hi Muhammet,
>
> Python also supports SQL queries:
> http://spark.apache.org/docs/latest/sql-programming-guide.html#running-sql-queries-programmatically
>
> Regards,
> --
> Bedrytski Aliaksandr
> sp...@bedryt.ski
>
> On Mon, Sep 26, 2016, at 10:01, muhammet pakyürek wrote:
>
>> But my request is related to Python, because I have designed a
>> preprocessing step that looks for rows containing NaN values: if the
>> number of NaNs in a row is above the threshold, the row is deleted;
>> otherwise it is filled with a predictive value. Therefore I need a
>> Python version of this process.
>>
>> ------------------------------
>> *From:* Bedrytski Aliaksandr <sp...@bedryt.ski>
>> *Sent:* Monday, September 26, 2016 7:53 AM
>> *To:* muhammet pakyürek
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: how to find NaN values in each row of a Spark DataFrame
>> to decide whether the row is dropped or not
>>
>> Hi Muhammet,
>>
>> have you tried to use SQL queries?
>>
>> spark.sql("""
>>     SELECT
>>         field1,
>>         field2,
>>         field3
>>     FROM table1
>>     WHERE field1 != 'NaN'
>>       AND field2 != 'NaN'
>>       AND field3 != 'NaN'
>> """)
>>
>> This query filters out rows containing NaN for a table with 3 columns.
>>
>> Regards,
>> --
>> Bedrytski Aliaksandr
>> sp...@bedryt.ski
>>
>> On Mon, Sep 26, 2016, at 09:30, muhammet pakyürek wrote:
>>
>>> Is there any way to do this directly? If not, is there any way to do
>>> this indirectly using other data structures of Spark?