I am calling the Spark Dataset API (the map method) from Clojure using
standard JVM interop syntax, and I am getting exceptions on deserialization
of task results.
This gist has a tiny Clojure program that reproduces the problem, as well as
the corresponding (working) Scala program.
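For reference, here is a minimal sketch of what the working Scala side of such a program typically looks like (this is an illustrative assumption, not the poster's actual gist; it assumes Spark 2.x running in local mode). In Scala, the map lambda compiles to a serializable class that is present on both driver and executors; from Clojure, the function object handed to map must likewise be serializable and its class available on the executor classpath, or task (and result) deserialization can fail.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

object DatasetMapSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dataset-map-sketch")
      .getOrCreate()
    import spark.implicits._   // brings in the Encoders map needs

    // A trivial Dataset[Long]; the lambda below becomes a
    // serializable class shipped to the executors.
    val ds: Dataset[Long] = spark.range(5)
    val doubled = ds.map(_ * 2)
    doubled.collect().foreach(println)

    spark.stop()
  }
}
```

From Clojure, the equivalent call usually goes through the Java overload `map(MapFunction, Encoder)`, which is where serialization of the Clojure function object becomes the sticking point.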
Hello All,
In the Spark documentation under "Hardware Requirements" it very clearly
states:
We recommend having *4-8 disks* per node, configured *without* RAID (just
as separate mount points)
My question is: why not RAID? What is the argument/reason for not using RAID?
Thanks!
-Eddie
Hi,
Why am I getting this error, which prevents my KMeans clustering algorithm
from working inside Spark? I'm trying to run a sample Scala model from the
Databricks website on my single-node Cloudera Spark local VM. For
completeness, the Scala program is as follows: Thanks
import
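The program above is truncated, so as a point of comparison, here is a minimal self-contained KMeans sketch against the spark.ml API (an assumption on my part, not the Databricks sample itself; it uses hypothetical toy data and assumes Spark 2.x in local mode — the older RDD-based `org.apache.spark.mllib` API differs).

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

object KMeansSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("kmeans-sketch")
      .getOrCreate()

    // Hypothetical toy data: two well-separated clusters.
    val data = Seq(
      Vectors.dense(0.0, 0.0), Vectors.dense(0.1, 0.1),
      Vectors.dense(9.0, 9.0), Vectors.dense(9.1, 9.1)
    ).map(Tuple1.apply)
    val df = spark.createDataFrame(data).toDF("features")

    // Fit k = 2 clusters and print the learned centers.
    val kmeans = new KMeans().setK(2).setSeed(1L)
    val model = kmeans.fit(df)
    model.clusterCenters.foreach(println)

    spark.stop()
  }
}
```

If the sketch runs but the original sample does not, the difference is often a version mismatch between the sample's Spark API and the Spark shipped in the Cloudera VM.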