Take a look at the following methods:

  /**
   * Filters rows using the given condition.
   * {{{
   *   // The following are equivalent:
   *   peopleDf.filter($"age" > 15)
   *   peopleDf.where($"age" > 15)
   * }}}
   * @group dfops
   * @since 1.3.0
   */
  def filter(condition: Column): DataFrame = Filter(condition.expr, logicalPlan)

  /**
   * Filters rows using the given SQL expression.
   * {{{
   *   peopleDf.filter("age > 15")
   * }}}
   * @group dfops
   * @since 1.3.0
   */
  def filter(conditionExpr: String): DataFrame = {
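
For the RDD side of the question, the test against a list of values can be sketched in plain Java without any Spark dependency. This is only a sketch: TokenFilter, containsAny and keepMatching are hypothetical names, not part of Spark, and it assumes "matching" means the line contains at least one token as a substring. In Spark, the same loop would go in the body of call() in the Function<String, Boolean> passed to JavaRDD.filter, with the token values collected to the driver first.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical helper (not a Spark class): builds a line predicate from a list of tokens.
final class TokenFilter {
    private TokenFilter() {}

    // Returns a predicate that is true when the line contains at least one token.
    static Predicate<String> containsAny(List<String> tokens) {
        // Copy into a set so duplicate tokens are checked only once.
        final Set<String> wanted = new HashSet<>(tokens);
        return line -> {
            for (String token : wanted) {
                if (line.contains(token)) {
                    return true;
                }
            }
            return false;
        };
    }

    // Keeps only the lines that match at least one token, preserving input order.
    static List<String> keepMatching(List<String> lines, List<String> tokens) {
        return lines.stream()
                .filter(containsAny(tokens))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Example values are made up for illustration.
        List<String> tokens = Arrays.asList("Set", "Get");
        List<String> lines = Arrays.asList("Set a", "b", "Get c");
        System.out.println(keepMatching(lines, tokens));
    }
}
```

If the token list is large, broadcasting the set to the executors instead of capturing it in the closure may be worth considering.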

Cheers

On Wed, Sep 9, 2015 at 8:04 PM, prachicsa <prachi...@gmail.com> wrote:

>
>
> I want to apply a filter based on a list of values in Spark. This is how I
> get the list:
>
> DataFrame df = sqlContext.read().json("../sample.json");
>
>         df.groupBy("token").count().show();
>
>         Row[] Tokens = df.select("token").collect();
>         for(int i = 0; i < Tokens.length; i++){
>             // Need to apply filter for Tokens[i].get(0)
>             System.out.println(Tokens[i].get(0));
>         }
>
> The RDD on which I want to apply the filter is this:
>
> JavaRDD<String> file = context.textFile(args[0]);
>
> I figured out a way to filter on a single fixed string in Java:
>
> private static final Function<String, Boolean> Filter =
>             new Function<String, Boolean>() {
>                 @Override
>                 public Boolean call(String s) {
>                     return s.contains("Set");
>                 }
>             };
>
> How do I go about it?
