I posted several examples in Java at http://lordjoesoftware.blogspot.com/
Generally code like this works, and I show how to accumulate more complex
values.
// Make two accumulators using Statistics
final Accumulator<Integer> totalLetters = ctx.accumulator(0, "ttl");
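A fuller, runnable sketch of the accumulator pattern the fragment above shows (the class name, sample data, and `local[2]` master are assumptions for illustration; the Spark 1.x `JavaSparkContext.accumulator(int, String)` API is used, matching the thread):

```java
import java.util.Arrays;

import org.apache.spark.Accumulator;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class AccumulatorExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("AccumulatorExample")
                .setMaster("local[2]"); // assumed local master for the sketch
        JavaSparkContext ctx = new JavaSparkContext(conf);

        // named accumulator - the name shows up in the Spark web UI
        final Accumulator<Integer> totalLetters = ctx.accumulator(0, "totalLetters");

        JavaRDD<String> words =
                ctx.parallelize(Arrays.asList("spark", "java", "accumulator"));

        // side-effecting count while the "business logic" runs
        words.foreach(w -> totalLetters.add(w.length()));

        System.out.println("total letters = " + totalLetters.value()); // 5 + 4 + 11 = 20
        ctx.stop();
    }
}
```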
Map the key/value into a (key, Tuple2<key, value>) pair and process that.
Also ask the Spark maintainers for a version of keyed operations where the
key is passed in as an argument - I run into these cases all the time.
/**
 * map a tuple into a key-tuple pair to ensure subsequent processing has
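The re-keying described above might look like the following sketch (class name, sample data, and master are assumptions; `parallelizePairs` and `mapToPair` are standard Spark Java API calls):

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class KeyTupleExample {
    public static void main(String[] args) {
        JavaSparkContext ctx = new JavaSparkContext(
                new SparkConf().setAppName("keyTuple").setMaster("local[2]"));

        JavaPairRDD<String, Integer> pairs = ctx.parallelizePairs(
                Arrays.asList(new Tuple2<>("a", 1), new Tuple2<>("b", 2)));

        // re-key each (key, value) as (key, Tuple2<key, value>) so downstream
        // stages that only see the value still have access to the key
        JavaPairRDD<String, Tuple2<String, Integer>> keyed =
                pairs.mapToPair(t -> new Tuple2<>(t._1(), t));

        System.out.println(keyed.collect());
        ctx.stop();
    }
}
```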
I have been playing with accumulators (despite the possible errors when a
task runs multiple attempts). They provide a convenient way to gather some
numbers while still performing business logic.
I posted some sample code at http://lordjoesoftware.blogspot.com/.
Even if accumulators are not perfect today -
public static void main(String[] args) throws Exception {
    System.out.println("Set Log to Warn");
    Logger rootLogger = Logger.getRootLogger();
    rootLogger.setLevel(Level.WARN);
    ...
works for me
What I have been doing is building a JavaSparkContext the first time it is
needed and keeping it in a ThreadLocal - all my code uses
SparkUtilities.getCurrentContext(). On a slave machine you build a new
context and don't have to serialize it.
The code is in a large project at
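The ThreadLocal pattern described above might be sketched like this (the `SparkUtilities` class and `getCurrentContext()` name come from the thread; the lazy-initialization details and the `spark.master` fallback are assumptions):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Sketch of a per-thread context holder: the context is built lazily the
// first time it is needed on each thread, so slave-side code can just call
// getCurrentContext() instead of trying to serialize a context.
public class SparkUtilities {
    private static final ThreadLocal<JavaSparkContext> CURRENT = new ThreadLocal<>();

    public static JavaSparkContext getCurrentContext() {
        JavaSparkContext ctx = CURRENT.get();
        if (ctx == null) {
            SparkConf conf = new SparkConf()
                    .setAppName("SparkUtilities")
                    // assumed fallback so the sketch also runs locally
                    .setMaster(System.getProperty("spark.master", "local[*]"));
            ctx = new JavaSparkContext(conf);
            CURRENT.set(ctx);
        }
        return ctx;
    }
}
```

Repeated calls on the same thread return the same instance; a different thread builds its own.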
A rather more general question: assume I have a JavaRDD<K> which is
sorted.
How can I convert this into a JavaPairRDD<Integer, K> where the Integer is
the index - 0...N - 1?
Easy to do on one machine
JavaRDD<K> values = ... // create here
JavaPairRDD<Integer, K> positions =
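In a distributed setting, Spark's `JavaRDD.zipWithIndex()` (available since Spark 1.0, returning `(value, Long index)` pairs) can do this; a runnable sketch with a concrete element type, class name, and sample data assumed for illustration:

```java
import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class ZipWithIndexExample {
    public static void main(String[] args) {
        JavaSparkContext ctx = new JavaSparkContext(
                new SparkConf().setAppName("zip").setMaster("local[2]"));

        JavaRDD<String> values = ctx.parallelize(Arrays.asList("a", "b", "c", "d"));

        // zipWithIndex yields (value, Long index); swap the pair and
        // narrow the Long to an Integer to get (index, value)
        JavaPairRDD<Integer, String> positions = values.zipWithIndex()
                .mapToPair(t -> new Tuple2<>(t._2().intValue(), t._1()));

        System.out.println(positions.collect());
        ctx.stop();
    }
}
```

Note that `zipWithIndex` assigns indices by partition order, so it matches the sorted order of the RDD.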