I'm trying to keep track of some state inside an RDD.flatMap() function
(using the Java API in Spark 1.4.0). I have two longs in the function; I
increment them when appropriate and check their values to determine how
many objects to output from the function. I'm not trying to read the
values in the driver or use them as a global counter, just trying to count
within the task. This does not appear to be working. Should I expect this
to work? If so, any pointers as to what I might be doing wrong? The code
looks something like this (shown here with map() and a single counter):

 

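import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.mllib.linalg.Vectors;
import org.apache.spark.mllib.regression.LabeledPoint;

JavaRDD<LabeledPoint> parsedData = data.map(new Function<String, LabeledPoint>() {
    // Counter kept as a field on the function object, incremented once per record
    long count = 0L;

    public LabeledPoint call(String line) {
        count++;
        // Input format: "<label>,<f1> <f2> ... <fn>"
        String[] parts = line.split(",");
        String[] features = parts[1].split(" ");
        double[] v = new double[features.length];
        // Parse every feature (the earlier bound of length - 1 left the last slot at 0.0)
        for (int i = 0; i < features.length; i++) {
            v[i] = Double.parseDouble(features[i]);
        }
        if (count == 50) {
            // return something else
        }
        return new LabeledPoint(Double.parseDouble(parts[0]), Vectors.dense(v));
    }
});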
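
For what it's worth, my understanding is that Spark serializes the function object and each task deserializes its own copy, so a field like count would start over in every task (one task per partition) rather than counting across the whole RDD. Below is a minimal sketch of that behavior without Spark, using plain Java serialization. The names PerTaskCounterSketch, CountingFunction, runTask and the "SPECIAL:" marker are hypothetical and only for illustration; this is not Spark's actual code path.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerTaskCounterSketch {

    /** Hypothetical stand-in for the anonymous Function in the post. */
    public static class CountingFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        long count = 0L;

        public String call(String line) {
            count++;
            // Do something different on the 2nd record this copy has seen
            return (count == 2) ? "SPECIAL:" + line : line;
        }
    }

    /** Serialize the function once, as the driver would before shipping it. */
    public static byte[] serialize(CountingFunction f) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(f);
            out.close();
            return bytes.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Simulated task: deserialize a private copy, then run it over one partition. */
    public static List<String> runTask(byte[] functionBytes, List<String> partition) {
        try {
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(functionBytes));
            CountingFunction f = (CountingFunction) in.readObject();
            in.close();
            List<String> result = new ArrayList<String>();
            for (String line : partition) {
                result.add(f.call(line));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] shipped = serialize(new CountingFunction());
        // Two "partitions", each handled by its own copy of the function
        System.out.println(runTask(shipped, Arrays.asList("a", "b", "c")));
        System.out.println(runTask(shipped, Arrays.asList("d", "e")));
    }
}
```

In this sketch each simulated task marks its own second record, because every deserialized copy starts with count at 0. If that matches what Spark does, a check like count == 50 would fire once per partition that has at least 50 records, not once for the whole RDD.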

 
