I don't see this behavior in the current Spark master:

scala> val df = Seq("m_123", "m_111", "m_145", "m_098", "m_666").toDF("msrid")
df: org.apache.spark.sql.DataFrame = [msrid: string]

scala> df.filter($"msrid".isin("m_123")).count
res0: Long = 1

scala> df.filter($"msrid".isin("m_123","m_111","m_145")).count
res1: Long = 3

On Mon, Apr 17, 2017 at 10:50 AM, nayan sharma <nayansharm...@gmail.com> wrote:

> Thanks for responding.
> df.filter($"msrid"==="m_123" || $"msrid"==="m_111")
>
> There are lots of workarounds to my question, but can you let me know what's
> wrong with the "isin" query?
>
> Regards,
> Nayan
>
> Begin forwarded message:
>
> From: ayan guha <guha.a...@gmail.com>
> Subject: Re: isin query
> Date: 17 April 2017 at 8:13:24 PM IST
> To: nayan sharma <nayansharm...@gmail.com>, user@spark.apache.org
>
> How about using OR operator in filter?
>
> On Tue, 18 Apr 2017 at 12:35 am, nayan sharma <nayansharm...@gmail.com>
> wrote:
>
>> Dataframe (df) having column msrid(String) having values
>> m_123,m_111,m_145,m_098,m_666
>>
>> I wanted to filter out rows which are having values m_123,m_111,m_145
>>
>> df.filter($"msrid".isin("m_123","m_111","m_145")).count
>> count =0
>> while
>> df.filter($"msrid".isin("m_123")).count
>> count=121212
>> I have tried using queries like
>> df.filter($"msrid" isin (List("m_123","m_111","m_145"):_*))
>> count =0
>> but
>>
>> df.filter($"msrid" isin (List("m_123"):_*))
>> count=121212
>>
>> Any suggestion will do a great help to me.
>>
>> Thanks,
>> Nayan
>>
> --
> Best Regards,
> Ayan Guha