Hi Kyle,

A few questions:

1) Did you use `setIntercept(true)`?
2) How many features?

I'm a little worried about the load on the driver, because the final
aggregation and the weight update happen there. Did you check the driver's
memory usage as well?
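
To make the concern concrete, here is a rough back-of-envelope sketch of how much gradient data could reach the driver per SGD iteration. It assumes each partition sends back one dense gradient vector of 8-byte doubles and that all concurrent jobs aggregate at the same moment. It ignores the loss term, the intercept, and serialization overhead. The function name and the feature count of 100,000 are placeholders of mine; the 500 concurrent jobs and 2 partitions come from your message.

```python
def driver_bytes_per_iteration(num_features, partitions_per_job,
                               concurrent_jobs, bytes_per_value=8):
    """Rough bytes of partial gradients arriving at the driver per iteration.

    Assumption: one dense vector of doubles per partition, all jobs
    aggregating at once. Real usage depends on serialization and timing.
    """
    per_partition = num_features * bytes_per_value
    return per_partition * partitions_per_job * concurrent_jobs


if __name__ == "__main__":
    # 100_000 features is a made-up placeholder, not a number from this thread.
    estimate = driver_bytes_per_iteration(
        num_features=100_000, partitions_per_job=2, concurrent_jobs=500)
    print(f"{estimate / 1e6:.1f} MB per iteration")
```

Under those assumptions the estimate scales linearly with the number of features, so plugging in your real feature count should show quickly whether the driver could be the bottleneck.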


On Fri, Jun 27, 2014 at 8:10 AM, Kyle Ellrott <kellr...@soe.ucsc.edu> wrote:
> As far as I can tell there is no data to broadcast (unless there is
> something internal to mllib that needs to be broadcast). I've coalesced the
> input RDDs to keep the number of partitions limited. When running, I've
> tried to get up to 500 concurrent stages, and I've coalesced the RDDs down
> to 2 partitions, so about 1000 tasks.
> Despite having over 500 threads in the threadpool working on mllib tasks,
> the total CPU usage never really goes above 150%.
> I've tried increasing 'spark.akka.threads' but that doesn't seem to do
> anything.
> My one thought is that, because I'm using MLUtils.kFold to generate the
> RDDs, so many tasks are working off RDDs that are permutations of the
> original RDDs that this may be creating some sort of dependency
> bottleneck.
> Kyle
> On Thu, Jun 26, 2014 at 6:35 PM, Aaron Davidson <ilike...@gmail.com> wrote:
>> I don't have specific solutions for you, but the general things to try
>> are:
>> - Decrease task size by broadcasting any non-trivial objects.
>> - Increase duration of tasks by making them less fine-grained.
>> How many tasks are you sending? I've seen in the past something like 25
>> seconds for ~10k total medium-sized tasks.
>> On Thu, Jun 26, 2014 at 12:06 PM, Kyle Ellrott <kellr...@soe.ucsc.edu>
>> wrote:
>>> I'm working to set up a calculation that involves calling mllib's
>>> SVMWithSGD.train several thousand times on different permutations of the
>>> data. I'm trying to run the separate jobs using a threadpool to dispatch the
>>> different requests to a spark context connected to a Mesos cluster, using
>>> coarse-grained scheduling, and a max of 2000 cores on Spark 1.0.
>>> Total utilization of the system is terrible. Most of the 'aggregate at
>>> GradientDescent.scala:178' stages (where mllib spends most of its time) take
>>> about 3 seconds, but have ~25 seconds of scheduler delay time.
>>> What kind of things can I do to improve this?
>>> Kyle
