As far as I can tell, there is no data to broadcast (unless something internal to mllib needs to be broadcast). I've coalesced the input RDDs to keep the number of partitions limited. When running, I've tried to get up to 500 concurrent stages, and with the RDDs coalesced down to 2 partitions each, that's about 1000 tasks. Despite having over 500 threads in the threadpool working on mllib jobs, total CPU usage never really goes above 150%. I've tried increasing 'spark.akka.threads', but that doesn't seem to do anything.
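For reference, here's roughly what the driver side looks like. This is a simplified sketch, not my actual code: trainAll, the 500-thread pool size, the coalesce factor, and the iteration count are stand-ins for illustration.

import java.util.concurrent.Executors

import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Simplified sketch of my setup: dispatch many small mllib training
// jobs concurrently from a threadpool on the driver.
def trainAll(datasets: Seq[RDD[LabeledPoint]]): Seq[SVMModel] = {
  // Large pool so hundreds of Spark jobs can be in flight at once.
  implicit val ec = ExecutionContext.fromExecutorService(
    Executors.newFixedThreadPool(500))

  val futures = datasets.map { data =>
    Future {
      // Coalesce down to 2 partitions to cap the task count per job.
      val small = data.coalesce(2).cache()
      SVMWithSGD.train(small, numIterations = 100)  // placeholder iteration count
    }
  }
  futures.map(f => Await.result(f, Duration.Inf))
}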
My one thought is that, since I'm using MLUtils.kFold to generate the RDDs, I have so many tasks working off RDDs that are permutations of the original RDDs that it may be creating some sort of dependency bottleneck (a sketch of how I'd test this is below the quoted thread).

Kyle

On Thu, Jun 26, 2014 at 6:35 PM, Aaron Davidson <[email protected]> wrote:

> I don't have specific solutions for you, but the general things to try
> are:
>
> - Decrease task size by broadcasting any non-trivial objects.
> - Increase the duration of tasks by making them less fine-grained.
>
> How many tasks are you sending? I've seen in the past something like 25
> seconds for ~10k total medium-sized tasks.
>
>
> On Thu, Jun 26, 2014 at 12:06 PM, Kyle Ellrott <[email protected]>
> wrote:
>
>> I'm working to set up a calculation that involves calling
>> mllib's SVMWithSGD.train several thousand times on different permutations
>> of the data. I'm trying to run the separate jobs using a threadpool to
>> dispatch the different requests to a spark context connected to a Mesos
>> cluster, using coarse scheduling, and a max of 2000 cores on Spark 1.0.
>> Total utilization of the system is terrible. Most of the 'aggregate at
>> GradientDescent.scala:178' stages (where mllib spends most of its time)
>> take about 3 seconds, but have ~25 seconds of scheduler delay.
>> What kind of things can I do to improve this?
>>
>> Kyle
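P.S. Here's the sketch I mentioned above. It's illustrative only (the fold count, seed, coalesce factor, and iteration count are placeholders): each (training, validation) pair that MLUtils.kFold returns is a lazy sampled view over the same parent RDD, so caching and counting each fold before training should show whether that shared lineage is what's bottlenecking the concurrent jobs.

import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.util.MLUtils
import org.apache.spark.rdd.RDD

// Each fold from kFold is a lazy view over the same parent RDD; if that
// shared lineage is the bottleneck, materializing each fold up front
// (before dispatching training jobs) should change the behavior.
def trainFolds(data: RDD[LabeledPoint], numFolds: Int, seed: Int): Array[SVMModel] = {
  MLUtils.kFold(data, numFolds, seed).map { case (train, _) =>
    val fold = train.coalesce(2).cache()
    fold.count()  // force the sampled split to compute once, up front
    SVMWithSGD.train(fold, numIterations = 100)  // placeholder iteration count
  }
}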
