How does the Spark Accumulator work under the covers?
Hello,

I was wondering what the Spark accumulator does under the covers. I've implemented my own associative addInPlace function for the accumulator; where is this function actually run? Say you call something like myRdd.map(x => sum += x): is "sum" being accumulated locally in any way, per element, per partition, or per node? Is "sum" a broadcast variable, or does it only exist on the driver node? How does the driver node get access to "sum"?

Thanks,
Areg
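Roughly, each task accumulates into its own local copy (starting from a zero value), and the driver merges the per-task partials with your addInPlace as tasks complete; "sum" is not a broadcast variable, and only the driver can read the final value. A minimal sketch of that flow in plain Scala (not Spark's actual implementation; all names here are illustrative):

```scala
object AccumulatorSketch {
  // A user-supplied associative merge, in the spirit of
  // AccumulatorParam.addInPlace.
  def addInPlace(a: Int, b: Int): Int = a + b

  def main(args: Array[String]): Unit = {
    // Pretend each inner Seq is one partition's data on an executor.
    val partitions = Seq(Seq(1, 2, 3), Seq(4, 5), Seq(6))

    // On each executor: the task folds into a local copy, starting
    // from zero. No cross-node communication happens per element.
    val partials = partitions.map(p => p.foldLeft(0)(addInPlace))

    // On the driver: per-task partials are merged as tasks finish.
    val total = partials.foldLeft(0)(addInPlace)

    println(total) // 21
  }
}
```

The key point the sketch shows is why addInPlace must be associative: the driver merges partition results in whatever order tasks happen to complete.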
Sorting a Table in Spark RDD
Hello,

So I have created a table in an RDD in Spark, in this format:

   col1  col2
1.  10    11
2.  12     8
3.   9    13
4.   2     3

The RDD is distributed by rows (rows 1 and 2 on one node, rows 3 and 4 on another). I want to sort each column of the table independently, so that the output is the following:

   col1  col2
1.   2     3
2.   9     8
3.  10    11
4.  12    13

Is there an easy way to do this with a Spark RDD? The only way I can think of so far is to somehow transpose the table.

Thanks,
Areg
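One possible approach, sketched here with plain Scala collections: sort each column on its own, then rejoin the sorted columns by position. In Spark you could do the same per column with RDD.sortBy and RDD.zipWithIndex, then join the indexed columns on their index (this is an assumption about a workable strategy, not an official recipe, and the names below are illustrative):

```scala
object ColumnSortSketch {
  // Sort each column independently, then pair values back up by rank.
  def sortColumns(rows: Seq[(Int, Int)]): Seq[(Int, Int)] = {
    val col1 = rows.map(_._1).sorted // in Spark: rdd.map(_._1).sortBy(identity)
    val col2 = rows.map(_._2).sorted
    col1.zip(col2)                   // in Spark: zipWithIndex + join on index
  }

  def main(args: Array[String]): Unit = {
    val table = Seq((10, 11), (12, 8), (9, 13), (2, 3))
    println(sortColumns(table)) // List((2,3), (9,8), (10,11), (12,13))
  }
}
```

This avoids transposing the table entirely; the cost is one sort (a shuffle, in Spark) per column.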
Sorting a table in Spark
Hello,