+1 I'd love to extend this to design as well. I'll dig into this and come back.
- Jeremy

...........................
Jeremy Anderson
Github: https://github.com/objectadjective
Twitter: https://twitter.com/ObjectAdjective
LinkedIn: http://www.linkedin.com/in/objectadjective

On 6 January 2017 at 12:12, Mike Dusenberry <dusenberr...@gmail.com> wrote:

> +1 We should definitely submit a few good project proposals, and
> particularly those that aim to improve the ability of the user to work
> on a wide range of ML problems in a simple and easy manner on top of
> Spark. This could include: building out a full ML demo to solve a real,
> large-scale problem that would benefit from a distributed approach;
> overall performance improvements that address a full class, or wider
> area, of ML algorithms, rather than a single, specific script;
> infrastructure for [performance] testing, and identification of wide
> areas of improvement (your example proposal fits here, and is quite
> nice!); helping with building out fully-featured, clean, well-tested
> DSLs in Python & Scala (we've started, but it would be good to continue
> stressing them -- we could even aim to replace DML with the DSLs); etc.
> I like the example proposal that you've given since it would be
> beneficial to the entire project, rather than a single, isolated area.
>
> - Mike
>
> --
>
> Michael W. Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> On Fri, Jan 6, 2017 at 11:57 AM, Madison Myers <madisonjmy...@gmail.com> wrote:
>
> > +1 I think it's a great idea, Felix
> >
> > On Fri, Jan 6, 2017 at 11:54 AM, <fschue...@posteo.de> wrote:
> >
> > > Hi all,
> > >
> > > as it just came up on the ML, I want to bring this up again for
> > > general discussion. I think we should try to get at least one or
> > > two students for this year's GSOC.
> > > If you have never heard of GSOC, look here:
> > > http://write.flossmanuals.net/gsoc-mentoring/what-is-gsoc/ and here:
> > > https://developers.google.com/open-source/gsoc/
> > >
> > > Applications for organizations open on January 19th, and it is a
> > > great way of introducing new people to SystemML development and
> > > getting more contributors.
> > > To apply, we need to propose projects for a 4-month period in which
> > > a student works on them full time (May - August). Each proposed
> > > project needs one community member to mentor it; in the end Google
> > > decides how many students each project gets, depending on the
> > > quality of the proposed ideas.
> > > To successfully apply we need (1) good ideas for projects and (2)
> > > people willing to mentor those ideas.
> > > For an initial brainstorming I suggest that we first figure out if
> > > we want to participate (which mainly means we need to find people
> > > willing to mentor projects) and then start collecting ideas. Ideas
> > > can be anything from infrastructure to core development or
> > > implementation of new algorithms.
> > >
> > > Here is a quick example of what a project proposal could look like:
> > >
> > >
> > > Title: Performance Benchmarks and Experiments
> > >
> > > Description: To make decisions about new features and the
> > > evaluation of old assumptions we need up-to-date performance
> > > statistics on multiple levels of the system and on different
> > > architectures (local, distributed, GPU). Performance can be
> > > evaluated systematically with performance tests and
> > > micro-benchmarks. In this way, changes to the project or
> > > alternative implementations (e.g. for low-level linear algebra
> > > backends) can be systematically evaluated and compared. (Semi-)
> > > Automated benchmarks can help make these decisions and challenge
> > > assumptions that were made during earlier development.
> > > In the course of this project, the student should build a
> > > benchmark infrastructure and conduct experiments that compare
> > > different choices in critical parts (sparsity thresholds, BLAS
> > > backends, optimization decisions, etc.).
> > >
> > > Expected Outcome: A benchmark suite that can be used to detect
> > > regressions or improvements in critical components of the system.
> > >
> > > Skills required: Java/Scala, some knowledge of benchmarking;
> > > preferred: knowledge about high-performance-computing and/or
> > > distributed systems.
> > >
> > > Possible Mentors: Matthias, Niketan, Nakul, Felix
> > >
> > >
> > > Let's decide whether we want to apply as an organization!
> > >
> > > - Felix
> >
> > --
> > *Madison J. Myers*
> > *--------------------------*
> > *Spark Technology Center, IBM Watson*
> > *UC Berkeley, Master of Information & Data Science '17*
> >
> > *King's College London, MA Political Science '14*
> > *New York University, BA Political Science '12*
> >
> > -
> > LinkedIn <http://linkedin.com/in/madisonjmyers>