Hi All,
We recently moved from Spark 0.9 to 1.1 for an application that handles a
fair number of very large datasets partitioned across multiple nodes. About
half of each of these large datasets is stored in off-heap byte arrays and
about half on the standard Java heap.
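The original code isn't shown, so purely as an illustration, here is a minimal hypothetical sketch (class name, sizes, and indexing scheme all invented here) of one common way to split a dataset between an on-heap byte[] and an off-heap direct ByteBuffer:

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: half the data on the Java heap (byte[]),
// half off-heap in a direct ByteBuffer, behind one index space.
public class SplitStorage {
    static final int HALF = 1 << 20;                      // 1 MiB per half (arbitrary)
    static byte[] onHeap = new byte[HALF];                // heap half
    static ByteBuffer offHeap = ByteBuffer.allocateDirect(HALF); // off-heap half

    static void put(int i, byte v) {
        if (i < HALF) onHeap[i] = v;                      // low indices -> heap
        else offHeap.put(i - HALF, v);                    // high indices -> off-heap
    }

    static byte get(int i) {
        return (i < HALF) ? onHeap[i] : offHeap.get(i - HALF);
    }

    public static void main(String[] args) {
        put(42, (byte) 7);            // lands on the heap
        put(HALF + 5, (byte) 9);      // lands off-heap
        System.out.println(get(42));        // prints 7
        System.out.println(get(HALF + 5));  // prints 9
    }
}
```

Direct ByteBuffers keep the bulk data out of the garbage collector's reach, which is one reason a layout like this is used for very large partitioned datasets.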
While these datasets are
Hi All,
I am not sure whether this is a 0.9.0 problem to be fixed in 0.9.1, and so
perhaps already being addressed, but I am having a devil of a time building a
Spark 0.9.0 client jar for Hadoop 2.x. If I go to the site and download:
- Download binaries for Hadoop 2 (HDP2, CDH5): find an Apache mirror

For a Maven build, you could try:

mvn -Pyarn -Dhadoop.version=2.3.0 -Dyarn.version=2.3.0 -DskipTests clean package
And from http://spark.apache.org/docs/latest/running-on-yarn.html, for an
sbt build, you could try:
SPARK_HADOOP_VERSION=2.3.0 SPARK_YARN=true sbt/sbt assembly
Thanks,
Rahul Singhal
From: Erik Freed erikjfr