Hi,

I would like to create multiple key-value pairs from each input line, so that
all keys can still be reduced afterwards. For instance, I have the following 2 lines:
A,B,C
B,D

I would like to return the following pairs for the first line:
A,B
A,C
B,A
B,C
C,A
C,B
And for the second line:
B,D
D,B

After a reduce by key, I want to end up with:
A,<B,C>
B,<A,C,D>
C,<A,B>
D,<B>
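
To make the intended transformation concrete, here is a minimal plain-Java sketch of it, independent of Hadoop and Spark. The class name PairSketch and the helpers pairsOf and groupByKey are hypothetical names made up for this illustration; it only shows which pairs each line should emit and what the grouped result looks like, and it sticks to Java 7 syntax.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration class, not part of any Hadoop or Spark API.
public class PairSketch {

    // Emit one (key, value) pair for every ordered combination of two
    // different positions in a comma-separated line.
    public static List<Map.Entry<String, String>> pairsOf(String line) {
        String[] local = line.split(",");
        List<Map.Entry<String, String>> out =
                new ArrayList<Map.Entry<String, String>>();
        for (int i = 0; i < local.length; i++) {
            for (int j = 0; j < local.length; j++) {
                if (i != j) {
                    out.add(new SimpleEntry<String, String>(local[i], local[j]));
                }
            }
        }
        return out;
    }

    // Collect the values of all emitted pairs under their key.
    // Sorted collections are used only to make the printed order stable.
    public static Map<String, TreeSet<String>> groupByKey(List<String> lines) {
        Map<String, TreeSet<String>> grouped =
                new TreeMap<String, TreeSet<String>>();
        for (String line : lines) {
            for (Map.Entry<String, String> pair : pairsOf(line)) {
                TreeSet<String> values = grouped.get(pair.getKey());
                if (values == null) {
                    values = new TreeSet<String>();
                    grouped.put(pair.getKey(), values);
                }
                values.add(pair.getValue());
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<String>();
        lines.add("A,B,C");
        lines.add("B,D");
        // prints {A=[B, C], B=[A, C, D], C=[A, B], D=[B]}
        System.out.println(groupByKey(lines));
    }
}
```

The point of the sketch is that a single input line produces a whole list of pairs rather than one pair, and that the grouping step then works on the individual keys.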

In Hadoop, I used a list and a for-loop to write multiple times, like below:
context.write(new Text(local[i]), new Text(local[j]));

In Spark, I was thinking of using mapToPair to get a JavaPairRDD, but the
function only returns 1 Tuple2. I know I can return a <key, list<value>>, but
then I could only reduce on A|B|C, not on all keys.
JavaPairRDD<String, String> tuples = actors.mapToPair(
  new PairFunction<String, String, String>() {
    public Tuple2<String, String> call(String w) {
      return new Tuple2<String, String>(w, "1");
    }
});

Thanks!

P.S. No need to fill in the function, I'm just interested in the return type.
P.S.2 I'm using Java 7, so I can't use lambdas :)


