Hi Tim,

That would be awesome. We have seen some really disparate Mesos allocations for our Spark Streaming jobs, e.g. (7,4,1) cores over 3 executors for 4 Kafka consumers instead of the ideal (3,3,3,3). For network-dependent consumers, an even deployment would make streaming job execution reliable and reproducible from a performance point of view.

We're deploying in coarse-grained mode. I'm not sure Spark Streaming would work well in fine-grained mode, given the added latency to acquire a worker.
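To make the imbalance mentioned above concrete, here is a minimal Python sketch. It is illustrative only: `even_split` and `spread` are hypothetical helper names, not part of Spark or Mesos, and the sketch assumes one executor per Kafka consumer is the target layout.

```python
def even_split(total_cores, num_executors):
    """Spread total_cores as evenly as possible over num_executors."""
    base, extra = divmod(total_cores, num_executors)
    # The first `extra` executors get one additional core.
    return [base + 1 if i < extra else base for i in range(num_executors)]


def spread(allocation):
    """Gap between the largest and smallest per-executor core count."""
    return max(allocation) - min(allocation)


observed = [7, 4, 1]                  # cores per executor, as in the example above
ideal = even_split(sum(observed), 4)  # assumption: one executor per Kafka consumer

print(ideal, spread(observed), spread(ideal))  # [3, 3, 3, 3] 6 0
```

A spread of 0 (or 1 when the cores do not divide evenly) is what an even deployment would look like; the observed allocation has a spread of 6.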
You mention that you're changing the Mesos scheduler. Is there a Jira where this work is being tracked?

-kr, Gerard.

On Mon, Dec 22, 2014 at 6:01 PM, Timothy Chen <tnac...@gmail.com> wrote:

> Hi Gerard,
>
> Really nice guide!
>
> I'm particularly interested in the Mesos scheduling side, to distribute
> cores more evenly across the cluster.
>
> I wonder if you are using coarse-grained or fine-grained mode?
>
> I'm making changes to the Spark Mesos scheduler and I think we can propose
> the best way to achieve what you mentioned.
>
> Tim
>
> > On Dec 22, 2014, at 8:33 AM, Gerard Maas <gerard.m...@gmail.com> wrote:
> >
> > Hi,
> >
> > After facing issues with the performance of some of our Spark Streaming
> > jobs, we invested quite some effort figuring out the factors that affect
> > the performance characteristics of a Streaming job. We defined an
> > empirical model that helps us reason about Streaming jobs and applied it
> > to tune the jobs in order to maximize throughput.
> >
> > We have summarized our findings in a blog post with the intention of
> > collecting feedback and hoping that it is useful to other Spark Streaming
> > users facing similar issues.
> >
> > http://www.virdata.com/tuning-spark/
> >
> > Your feedback is welcome.
> >
> > With kind regards,
> >
> > Gerard.
> > Data Processing Team Lead
> > Virdata.com
> > @maasg