Hi,
I would like to do multiple groupBy's on an RDD followed by a single reduce. In
Java, I would need to give up type safety if I want this done multiple
times (dynamically).
Is there such an example? Would the following work?
JavaPairRDD&lt;String, List&lt;Row&gt;&gt; ret = all.groupBy(new
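One way to keep the compiler happy across several dynamic groupings is to always key by `String`, so every `groupBy` produces the same pair type. Below is a minimal local sketch of that idea in plain Java (no cluster; the `Row` class and its fields are hypothetical stand-ins, and the `groupBy` helper only mimics the grouping semantics of Spark's `RDD.groupBy`):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class GroupBySketch {
    // Hypothetical stand-in for a data row
    static class Row {
        final String country;
        final String city;
        Row(String country, String city) { this.country = country; this.city = city; }
    }

    // Same shape as a groupBy: a key function produces a map from key to list of rows
    static Map<String, List<Row>> groupBy(List<Row> rows, Function<Row, String> keyFn) {
        Map<String, List<Row>> grouped = new HashMap<>();
        for (Row r : rows) {
            grouped.computeIfAbsent(keyFn.apply(r), k -> new ArrayList<>()).add(r);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Row> all = Arrays.asList(
            new Row("NL", "Amsterdam"),
            new Row("NL", "Utrecht"),
            new Row("DE", "Berlin"));

        // Keying by String keeps the result type fixed no matter which field we group on,
        // so the same variable type works for every (dynamically chosen) grouping.
        Map<String, List<Row>> byCountry = groupBy(all, r -> r.country);
        Map<String, List<Row>> byCity    = groupBy(all, r -> r.city);

        System.out.println(byCountry.get("NL").size()); // 2
        System.out.println(byCity.get("Berlin").size()); // 1
    }
}
```

With Spark's Java API the shape is analogous: passing a `Function<Row, String>` to `groupBy` yields a `JavaPairRDD<String, List<Row>>` regardless of which field you group on, so the loss of type safety is confined to the key extraction.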
I am trying to start 12 workers, with 2 cores each, on every node, using the following:
In spark-env.sh (copied to every slave) I have set:
SPARK_WORKER_INSTANCES=12
SPARK_WORKER_CORES=2
I start the Scala console with:
SPARK_WORKER_CORES=2 SPARK_MEM=3g MASTER=spark://x:7077
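For reference, a spark-env.sh along these lines should offer 12 workers x 2 cores = 24 cores per slave to the standalone master (the 3g figure is an assumption carried over from the SPARK_MEM setting above; adjust to your machines):

```shell
# spark-env.sh (same file on every slave)
SPARK_WORKER_INSTANCES=12   # number of worker JVMs started per node
SPARK_WORKER_CORES=2        # cores offered by each worker, so 24 total per node
SPARK_WORKER_MEMORY=3g      # memory offered by each worker JVM (assumed value)
```

Note that each worker is a separate JVM, so per-node memory use is roughly SPARK_WORKER_INSTANCES x SPARK_WORKER_MEMORY.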
Hi,
I have a cluster of 20 servers, each having 24 cores and 30GB of RAM allocated
to Spark. Spark runs in a STANDALONE mode.
I am trying to load some 200+ GB of files and cache the rows using .cache().
What I would like to do is the following (at the moment from the Scala console):
-Evenly load the files