Re: [DISCUSS] Roadmap SystemML 1.0

Luciano Resende Mon, 16 Jan 2017 19:36:08 -0800

Instead of an epic, we could use the target release? Also, we have a roadmap
page on the site; we should either keep it up to date or get rid of it and
use the roadmap on JIRA.


On Mon, Jan 16, 2017 at 6:20 PM <dusenberr...@gmail.com> wrote:

> Now that we've had some discussion here, it would be good to transfer this
> discussion into a JIRA epic, containing sub tasks. That way, we can
> properly track our progress on these items and facilitate contributions
> from the community.  Note that some of the sub tasks may already exist as
> individual issues.
>
>
>
> Would anyone in the community like to volunteer for creating these issues?
>
>
>
> - Mike
>
>
>
> --
>
>
>
> Mike Dusenberry
>
> GitHub: github.com/dusenberrymw
>
> LinkedIn: linkedin.com/in/mikedusenberry
>
>
>
> Sent from my iPhone.
>
>
>
>
>
> > On Jan 4, 2017, at 6:00 PM, dusenberr...@gmail.com wrote:
>
> >
>
> > Overall, this is a good list of items that should be worked on,
> particularly because it contains several user-facing items.  However, to
> echo what Luciano said, I'm also concerned about the timeline.  At this
> stage, I agree that we need to release more often, and with a more
> user-oriented "product" focus as a guide for timelines.  I.e. we should
> orient our release timelines around items that focus on the "product" of
> allowing the user to work on a wide range of ML problems in a simple and
> easy manner on top of Spark.
>
> >
>
> > With that in mind, I agree that a focus on a subset of (1) and (2) would
> be good for an immediate release, with a particular focus on Spark 2.0
> support as a priority.
>
> >
>
> > How about we aim for a February 1st release date for the initial items?
>
> >
>
> > -Mike
>
> >
>
> > --
>
> >
>
> > Mike Dusenberry
>
> > GitHub: github.com/dusenberrymw
>
> > LinkedIn: linkedin.com/in/mikedusenberry
>
> >
>
> > Sent from my iPhone.
>
> >
>
> >
>
> >> On Jan 3, 2017, at 4:17 PM, Niketan Pansare <npan...@us.ibm.com> wrote:
>
> >>
>
> >> Hi Matthias,
>
> >>
>
> >> Thanks for the detailed roadmap.
>
> >>
>
> >> +1 for all the items with a few modifications.
>
> >>
>
> >> 1) APIs and Language:
>
> >> * Cleanup new MLContext (matrix/frame data types, move tests, etc)
>
> >> >> Ensure Python and Scala MLContext have same API capability.
>
> >>
>
> >> * Remove old MLContext
>
> >> * Consolidate MLContext and JMLC
>
> >> * Full support for Scala/Python DSLs
>
> >> >> +1 for Python DSL except for push-down of loop structures and
> functions.
>
> >>
>
> >> * Remove old file-based transform
>
> >> * Scala/Python wrappers for all existing algorithms
>
> >> * Data converters (additional formats: e.g., libsvm; performance)
>
> >>
>
> >> 2) Updated Dependencies:
>
> >> * Spark 2.0 support
>
> >> * Matrix block library (isolated jar)
>
> >>
>
> >> 3) Compiler/Runtime Features:
>
> >> * GPU support (full compiler and runtime support)
>
> >> >> Can we break this down into phases:
> https://issues.apache.org/jira/browse/SYSTEMML-445 ? We can discuss the
> timeline of the phases in the JIRA.
>
> >>
>
> >> * Compressed linear algebra v2
>
> >> * Code generation (automatic operator fusion)
>
> >> * Extended parfor (full spark exploitation, micro-batch support)
>
> >> * Scale-up architecture (large dense blocks, numa)?
>
> >>
>
> >> 4) Tools
>
> >> * Extended stats (task locality, shuffle, etc)
>
> >> * Cloud resource advisor (extended resource optimizer)?
>
> >>
>
> >> 5) Algorithms
>
> >> * Graduate "staging" algorithms (robustness/performance)
>
> >> * Perftest: include all algorithms into automated performance tests
>
> >> >> via spark-submit + via Scala/Python wrappers
>
> >>
>
> >> * Simplify usage decision trees, random forest, mlogreg, msvm
>
> >> (preprocessing, label representation, etc)
>
> >> >> + command-line variable naming. For example: maxi, maxiter, etc.
>
> >>
>
> >> Thanks,
>
> >>
>
> >> Niketan Pansare
>
> >> IBM Almaden Research Center
>
> >> E-mail: npansar At us.ibm.com
>
> >> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
>
> >>
>
> >> Matthias Boehm ---01/03/2017 02:44:39 PM---Yes indeed, most of (3) and
> (4) can be done incrementally. For (5), some of the changes might also
>
> >>
>
> >> From: Matthias Boehm <mboe...@googlemail.com>
>
> >> To: dev@systemml.incubator.apache.org
>
> >> Date: 01/03/2017 02:44 PM
>
> >> Subject: Re: [DISCUSS] Roadmap SystemML 1.0
>
> >>
>
> >>
>
> >>
>
> >>
>
> >> Yes indeed, most of (3) and (4) can be done incrementally. For (5), some
>
> >> of the changes might also modify the signature of algorithms (i.e.,
>
> >> parameters and required input data) but it would help, for example with
>
> >> decision trees, as users no longer need to dummy code their inputs.
>
> >>
>
> >> Generally, I'm fine with making (3), (4), and part of (5) optional and
>
> >> let the "must-have" features from (1) and (2) determine the timeline.
>
> >>
>
> >> Regards,
>
> >> Matthias
>
> >>
>
> >> On 1/3/2017 11:27 PM, Luciano Resende wrote:
>
> >> > On Tue, Jan 3, 2017 at 11:50 AM, Matthias Boehm <
> mboe...@googlemail.com>
>
> >> > wrote:
>
> >> >
>
> >> >> I'd like to initiate the discussion of a concrete roadmap for our
> next
>
> >> >> release. According to previous discussions, I'd think it's fair to
> say
>
> >> >> that we agree on calling it SystemML 1.0. We should carefully plan
> this
>
> >> >> release as it's an opportunity to change APIs and remove some older
>
> >> >> deprecated features. I'd like to encourage not just developers but
> also the
>
> >> >> broader community to participate in this discussion.
>
> >> >>
>
> >> >> Personally, I think a target date of Q2/2017 is realistic. Let's
> start
>
> >> >> with collecting the major features and changes that potentially
> affect
>
> >> >> users. Here is an initial list, but please feel free to add and up-
> or
>
> >> >> down-vote the individual items.
>
> >> >>
>
> >> >> 1) APIs and Language:
>
> >> >> * Cleanup new MLContext (matrix/frame data types, move tests, etc)
>
> >> >> * Remove old MLContext
>
> >> >> * Consolidate MLContext and JMLC
>
> >> >> * Full support for Scala/Python DSLs
>
> >> >> * Remove old file-based transform
>
> >> >> * Scala/Python wrappers for all existing algorithms
>
> >> >> * Data converters (additional formats: e.g., libsvm; performance)
>
> >> >>
>
> >> >> 2) Updated Dependencies:
>
> >> >> * Spark 2.0 support
>
> >> >> * Matrix block library (isolated jar)
>
> >> >>
>
> >> >> 3) Compiler/Runtime Features:
>
> >> >> * GPU support (full compiler and runtime support)
>
> >> >> * Compressed linear algebra v2
>
> >> >> * Code generation (automatic operator fusion)
>
> >> >> * Extended parfor (full spark exploitation, micro-batch support)
>
> >> >> * Scale-up architecture (large dense blocks, numa)?
>
> >> >>
>
> >> >> 4) Tools
>
> >> >> * Extended stats (task locality, shuffle, etc)
>
> >> >> * Cloud resource advisor (extended resource optimizer)?
>
> >> >>
>
> >> >> 5) Algorithms
>
> >> >> * Graduate "staging" algorithms (robustness/performance)
>
> >> >> * Perftest: include all algorithms into automated performance tests
>
> >> >> * Simplify usage decision trees, random forest, mlogreg, msvm
>
> >> >> (preprocessing, label representation, etc)
>
> >> >>
>
> >> >> Items marked with a ? can potentially be moved out to subsequent
> releases.
>
> >> >>
>
> >> >>
>
> >> >> Regards,
>
> >> >> Matthias
>
> >> >>
>
> >> >
>
> >> > My understanding is that most of the items in 1 and 2 are going to
> break
>
> >> > backward compatibility, while the others can be done incrementally.
> Is this
>
> >> > assumption correct? If so, can we finish 1 and 2 and do a 1.0
> release, and
>
> >> > then continue with 3, 4, 5, etc.? I don't think we should wait for
>
> >> > 2017/Q2 to do a 1.0 release. I believe in release early, release
> often,
>
> >> > particularly to attract new users, who can help verify and
> contribute
>
> >> > to specific releases.
>
> >> >
>
> >> > Thoughts ?
>
> >> >
>
> >>
>
> >>
>
> >>
>
> >>
>
> --
Sent from my Mobile device
