There is also spark-perf <https://github.com/amplab/spark-perf>.


On Sat, Oct 12, 2013 at 2:22 PM, Christopher Nguyen <[email protected]> wrote:

> Roman, an area that I think (a) would have high impact and (b) is relatively
> poorly covered is performance analysis. I'm sure most teams are doing
> this internally at their respective companies, but there is no shared code
> base and shared wisdom about what we're finding/improving.
>
> For example, consider the task of loading a table from disk into memory by
> Shark. We're getting conflicting data about how much of this is CPU-bound
> vs I/O-bound. Our effort to track this down should be sharable somehow, and
> would benefit from others' findings. Of course this is dependent on the
> particular configuration, but there is a lot of test harness code/scripts
> that can be shared. And individual findings, even if/especially if they are
> conflicting, are very valuable if well documented.
>
> There is a Benchmark effort covered here
> https://amplab.cs.berkeley.edu/benchmark/, but it addresses a slightly
> different goal. You could consider this Perf-Analysis as part of that, or
> as its own effort.
>
> This may be more than you were looking to own, but given your stated
> enthusiasm :) I want to throw the idea out there.
>
> --
> Christopher T. Nguyen
> Co-founder & CEO, Adatao <http://adatao.com>
> linkedin.com/in/ctnguyen
>
>
>
> On Sat, Oct 12, 2013 at 1:48 PM, Роман Ткаленко <[email protected]> wrote:
>
> > Hello.
> > I'm trying to dive into Spark's sources on a deeper-than-mere-glance level
> > and I find beginning with writing unit tests a good way to do it. So,
> > basically, I'm wondering if there are points to which I could specifically
> > apply my enthusiasm, i.e. are there some uncovered or insufficiently
> > covered parts for which I could write some tests?
> > I'm also wondering about the state of the Apache-hosted JIRA for Spark: I
> > currently can't see any entries there. Should I look for them in the GitHub
> > mirror, or still in the previous JIRA instance at
> > http://spark-project.atlassian.net/?
> > Regards,
> > Roman.
> >
>
