I'm trying to keep track of some information in an RDD.flatMap() function (using the Java API in Spark 1.4.0). I have two longs in the function; I increment them when appropriate and check their values to determine how many objects to output from the function. I'm not trying to read the values in the driver or use them as a global counter, just to count within the task. This does not appear to be working. Should I expect it to work? If so, any pointers as to what I might be doing wrong? The code looks something like this:
JavaRDD<LabeledPoint> parsedData = data.map(new Function<String, LabeledPoint>() {
    Long count = 0L;

    public LabeledPoint call(String line) {
        count++;
        String[] parts = line.split(",");
        String[] features = parts[1].split(" ");
        double[] v = new double[features.length];
        for (int i = 0; i < features.length - 1; i++)
            v[i] = Double.parseDouble(features[i]);
        if (count == 50) {
            // return something else
        }
        return new LabeledPoint(Double.parseDouble(parts[0]), Vectors.dense(v));
    }
});
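A minimal sketch of what counting within a task might look like when the counter is a local variable scoped to one partition, rather than a field on the function object. It uses plain Java with no Spark dependency, so it only illustrates the shape of the logic. The class `PerPartitionCounter`, the method `processPartition`, and the `limit` parameter are hypothetical names introduced for illustration. In Spark, the same body could presumably be placed inside the function passed to `mapPartitions`, which receives an iterator over one partition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PerPartitionCounter {

    // Hypothetical helper: consumes the lines of ONE partition and emits
    // at most `limit` of them. The counter is a local variable, so it
    // starts at zero for every call, i.e. for every partition.
    public static List<String> processPartition(Iterator<String> lines, long limit) {
        List<String> out = new ArrayList<>();
        long count = 0L;
        while (lines.hasNext()) {
            String line = lines.next();
            count++;
            if (count > limit) {
                break; // per-partition limit reached, stop emitting
            }
            out.add(line.trim());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two made-up "partitions" standing in for the splits of an RDD.
        List<String> partitionA = Arrays.asList("1,0.5 0.2", "0,0.1 0.9", "1,0.3 0.3");
        List<String> partitionB = Arrays.asList("0,0.7 0.7");

        // Each call counts independently of the other.
        System.out.println(processPartition(partitionA.iterator(), 2L).size());
        System.out.println(processPartition(partitionB.iterator(), 2L).size());
    }
}
```

Because the counter restarts for each partition, any threshold such as the `count == 50` check above would apply per partition and not across the whole dataset.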