Please see
http://spark.apache.org/docs/latest/programming-guide.html#local-vs-cluster-modes
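In short: the anonymous function is serialized and each task works on its own deserialized copy of the closure, so a field like `count` only ever counts records seen by one partition attempt, never globally. If counting within a task is really the goal, `mapPartitions` makes that scope explicit with a plain local variable. Here is a minimal sketch of that pattern outside Spark (hypothetical `parsePartition` standing in for the function body you'd pass to `mapPartitions`, and a plain `double[]` in place of `LabeledPoint` so it runs standalone):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class PartitionCount {
    // Sketch of the mapPartitions pattern: the function is called once per
    // partition with an iterator over its records, so an ordinary local
    // counter covers exactly that partition's data.
    static List<double[]> parsePartition(Iterator<String> lines) {
        List<double[]> out = new ArrayList<>();
        long count = 0L;  // local to this partition/task, no closure capture
        while (lines.hasNext()) {
            String[] parts = lines.next().split(",");
            String[] features = parts[1].split(" ");
            double[] v = new double[features.length];
            for (int i = 0; i < features.length; i++)
                v[i] = Double.parseDouble(features[i]);
            count++;
            if (count == 50) {
                // emit something else here, as in your flatMap
            }
            out.add(v);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> data = List.of("1.0,0.5 0.6", "0.0,0.1 0.2");
        List<double[]> parsed = parsePartition(data.iterator());
        System.out.println(parsed.size());     // prints 2
        System.out.println(parsed.get(0)[1]);  // prints 0.6
    }
}
```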

Cheers

On Mon, Jul 20, 2015 at 3:21 PM, <dlmar...@comcast.net> wrote:

>
>
> I’m trying to keep track of some information in an RDD.flatMap() function
> (using the Java API in 1.4.0). I have two longs in the function, and I am
> incrementing them when appropriate and checking their values to determine
> how many objects to output from the function. I’m not trying to read the
> values in the driver or use them as a global counter, just trying to count
> within the task. This doesn’t appear to be working. Should I expect it to
> work? If so, any pointers as to what I might be doing wrong? The code looks
> something like:
>
>
>
> JavaRDD<LabeledPoint> parsedData = data.map(new Function<String, LabeledPoint>() {
>
>      Long count = 0L;
>
>      public LabeledPoint call(String line) {
>        count++;
>        String[] parts = line.split(",");
>        String[] features = parts[1].split(" ");
>        double[] v = new double[features.length];
>        for (int i = 0; i < features.length - 1; i++)
>          v[i] = Double.parseDouble(features[i]);
>        if (count == 50) {
>          //return something else
>        }
>        return new LabeledPoint(Double.parseDouble(parts[0]), Vectors.dense(v));
>      }
>
>    });
>
>
>
