Those logs you included are from the Spark executor processes, as opposed
to the YARN NodeManager processes.
If you don't think you have access to the NodeManager logs, I would try
setting spark.yarn.executor.memoryOverhead to something like 1024 or 2048
and seeing if that helps. If it does, it's likely that your executors were
being killed for exceeding the memory limits of their YARN containers.
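(For reference, a minimal sketch of where that setting could go if it is
configured programmatically rather than on the spark-submit command line;
the app name and values below are illustrative, not Mike's actual setup:)

  import org.apache.spark.{SparkConf, SparkContext}

  val conf = new SparkConf()
    .setAppName("NaiveBayesTraining")             // illustrative app name
    .set("spark.executor.memory", "2g")
    // Extra off-heap headroom (in MB) YARN grants on top of the executor heap
    .set("spark.yarn.executor.memoryOverhead", "1024")
  val sc = new SparkContext(conf)

The same value can also be passed at launch time, e.g.
spark-submit --conf spark.yarn.executor.memoryOverhead=1024.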
Hello Xiangrui and Sandy,
Thanks for jumping in to help.
So, first thing... After my email last night I reran my code using 10
executors, 2G each, and everything ran okay. So, that's good, but I'm
still curious as to what I was doing wrong.
For Xiangrui's questions:
My training set is 49174
Hi Mike,
Do you have access to your YARN NodeManager logs? When executors die
randomly on YARN, it's often because they use more memory than allowed for
their YARN container. You would see messages to the effect of "container
killed because physical memory limits exceeded".
-Sandy
The cost depends on the feature dimension, number of instances, number
of classes, and number of partitions. Do you mind sharing those
numbers? -Xiangrui
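(A quick way to pull those four numbers out of a training set; this assumes
a variable named `training: RDD[LabeledPoint]`, which is a placeholder, not
Mike's actual code:)

  import org.apache.spark.mllib.regression.LabeledPoint
  import org.apache.spark.rdd.RDD

  def describeTrainingSet(training: RDD[LabeledPoint]): Unit = {
    val numInstances  = training.count()
    val numFeatures   = training.first().features.size
    val numClasses    = training.map(_.label).distinct().count()
    val numPartitions = training.partitions.length
    println(s"instances=$numInstances, features=$numFeatures, " +
            s"classes=$numClasses, partitions=$numPartitions")
  }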
On Wed, Oct 1, 2014 at 6:31 PM, Mike Bernico wrote:
Hi Everyone,
I'm working on training MLlib's Naive Bayes to classify TF/IDF vectorized
docs using Spark 1.1.0.
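(For context, a rough sketch of that kind of TF/IDF + Naive Bayes flow using
the 1.1.0 MLlib APIs; `docs`, the function name, and the lambda value are
placeholders, not Mike's actual code:)

  import org.apache.spark.mllib.classification.NaiveBayes
  import org.apache.spark.mllib.feature.{HashingTF, IDF}
  import org.apache.spark.mllib.linalg.Vector
  import org.apache.spark.mllib.regression.LabeledPoint
  import org.apache.spark.rdd.RDD

  // docs: RDD of (label, tokenized document), e.g. built from a text corpus
  def trainModel(docs: RDD[(Double, Seq[String])]) = {
    val hashingTF = new HashingTF()               // defaults to 2^20 features
    val tf: RDD[Vector] = hashingTF.transform(docs.map(_._2))
    tf.cache()
    val tfidf: RDD[Vector] = new IDF().fit(tf).transform(tf)

    // zip keeps each label aligned with its TF/IDF vector (same partitioning)
    val training = docs.map(_._1).zip(tfidf).map {
      case (label, vec) => LabeledPoint(label, vec)
    }
    NaiveBayes.train(training, lambda = 1.0)
  }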
I've gotten this to work fine on a smaller set of data, but when I increase
the number of vectorized documents I get hung up on training. The only
messages I'm seeing are below. I'm pr