Take a look at the following methods:

  /**
   * Filters rows using the given condition.
   * {{{
   *   // The following are equivalent:
   *   peopleDf.filter($"age" > 15)
   *   peopleDf.where($"age" > 15)
   * }}}
   * @group dfops
   * @since 1.3.0
   */
  def filter(condition: Column): DataFrame = Filter(condition.expr, logicalPlan)

  /**
   * Filters rows using the given SQL expression.
   * {{{
   *   peopleDf.filter("age > 15")
   * }}}
   * @group dfops
   * @since 1.3.0
   */
  def filter(conditionExpr: String): DataFrame = {

Cheers

On Wed, Sep 9, 2015 at 8:04 PM, prachicsa <prachi...@gmail.com> wrote:

> I want to apply filter based on a list of values in Spark. This is how I get
> the list:
>
>     DataFrame df = sqlContext.read().json("../sample.json");
>
>     df.groupBy("token").count().show();
>
>     Tokens = df.select("token").collect();
>     for(int i = 0; i < Tokens.length; i++){
>         System.out.println(Tokens[i].get(0)); // Need to apply filter for Token[i].get(0)
>     }
>
> Rdd on which I want apply filter is this:
>
>     JavaRDD<String> file = context.textFile(args[0]);
>
> I figured out a way to filter in java:
>
>     private static final Function<String, Boolean> Filter =
>         new Function<String, Boolean>() {
>             @Override
>             public Boolean call(String s) {
>                 return s.contains("Set");
>             }
>         };
>
> How do I go about it?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Filtering-an-rdd-depending-upon-a-list-of-values-in-Spark-tp24631.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
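
For the original question (keeping only the lines that match any value from a list), here is a minimal stand-alone sketch of the predicate itself. It uses plain Java collections so it runs without Spark: the list of lines stands in for the JavaRDD<String>, and the set of tokens stands in for the values collected from the "token" column. The class and method names (TokenFilterSketch, containsAnyToken, keepLinesWithAnyToken) and the sample data are made up for illustration and are not part of Spark.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class TokenFilterSketch {

    // Builds a predicate that is true when a line contains at least one token.
    // (Hypothetical helper; generalizes the single s.contains("Set") check.)
    static Predicate<String> containsAnyToken(Set<String> tokens) {
        return line -> tokens.stream().anyMatch(line::contains);
    }

    // Stand-in for filtering the RDD: keeps only the matching lines, in order.
    static List<String> keepLinesWithAnyToken(List<String> lines, Set<String> tokens) {
        return lines.stream()
                .filter(containsAnyToken(tokens))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Assumed sample data; in the real job the tokens would come from
        // df.select("token").collect() and the lines from context.textFile(...).
        Set<String> tokens = new HashSet<>(Arrays.asList("Set", "Get"));
        List<String> lines = Arrays.asList("Set x = 1", "Delete y", "Get z");

        System.out.println(keepLinesWithAnyToken(lines, tokens));
    }
}
```

In a Spark job the same check would presumably go inside the Function<String, Boolean> passed to JavaRDD.filter, with the token set built once on the driver before the filter is applied. The set has to be serializable for that to work, which HashSet is.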