To make this roadmap more concrete, I created the following epics for the target release 1.0, with about 50 subtasks, and linked related existing issues. Given the discussion on a short release cycle, the bare minimum would be SYSTEMML-1299 (which includes all changes that affect external behavior) and a subset of SYSTEMML-1308 (especially features that address proper cleanup and robustness against OOMs).
SYSTEMML-1299 Language feature updates
SYSTEMML-1321 Compiler feature extensions
SYSTEMML-1308 Runtime feature extensions
SYSTEMML-1284 Code generation for operator fusion
SYSTEMML-1328 Perftest extensions

I did not touch GPUs, Deep Learning, DSLs, and algorithms yet. So please have a look, and update or create them if necessary.

Regards,
Matthias

On Mon, Jan 16, 2017 at 8:14 PM, <dusenberr...@gmail.com> wrote:

> Yeah using the target release would be good. Actually, with that in mind, I believe that we have been marking closed issues since the 0.11 release as targeting an upcoming "1.0" release, but it would probably be more correct to update those to "0.12" since we decided to release 0.12. In addition, we should set the target of the Spark 2.x support issue to "0.13".
>
> As for the roadmap, it would be good to update the website with a high-level overview, with links to associated JIRA issues.
>
> --
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.
>
> > On Jan 16, 2017, at 7:35 PM, Luciano Resende <luckbr1...@gmail.com> wrote:
> >
> > Instead of Epic, we could use the target release? Also, we have a roadmap page on the site and we should keep that up to date, or get rid of that and use roadmap on jira.
> >
> >> On Mon, Jan 16, 2017 at 6:20 PM <dusenberr...@gmail.com> wrote:
> >>
> >> Now that we've had some discussion here, it would be good to transfer this discussion into a JIRA epic, containing sub tasks. That way, we can properly track our progress on these items and facilitate contributions from the community. Note that some of the sub tasks may already exist as individual issues.
> >>
> >> Would anyone in the community like to volunteer for creating these issues?
> >>
> >> - Mike
> >>
> >>> On Jan 4, 2017, at 6:00 PM, dusenberr...@gmail.com wrote:
> >>>
> >>> Overall, this is a good list of items that should be worked on, particularly because it contains several user-facing items. However, to echo what Luciano said, I'm also concerned about the timeline. At this stage, I agree that we need to release more often, and with a more user-oriented "product" focus as a guide for timelines. I.e. we should orient our release timelines around items that focus on the "product" of allowing the user to work on a wide range of ML problems in a simple and easy manner on top of Spark.
> >>>
> >>> With that in mind, I agree that a focus on a subset of (1) and (2) would be good for an immediate release, with a particular focus on Spark 2.0 support as a priority.
> >>>
> >>> How about we aim for a February 1st release date for the initial items?
> >>>
> >>> -Mike
> >>>
> >>>> On Jan 3, 2017, at 4:17 PM, Niketan Pansare <npan...@us.ibm.com> wrote:
> >>>>
> >>>> Hi Matthias,
> >>>>
> >>>> Thanks for the detailed roadmap.
> >>>>
> >>>> +1 for all the items with few modifications.
> >>>>
> >>>> 1) APIs and Language:
> >>>> * Cleanup new MLContext (matrix/frame data types, move tests, etc)
> >>>>>> Ensure Python and Scala MLContext have same API capability.
> >>>> * Remove old MLContext
> >>>> * Consolidate MLContext and JMLC
> >>>> * Full support for Scala/Python DSLs
> >>>>>> +1 for Python DSL except for push-down of loop structures and functions.
> >>>> * Remove old file-based transform
> >>>> * Scala/Python wrappers for all existing algorithms
> >>>> * Data converters (additional formats: e.g., libsvm; performance)
> >>>>
> >>>> 2) Updated Dependencies:
> >>>> * Spark 2.0 support
> >>>> * Matrix block library (isolated jar)
> >>>>
> >>>> 3) Compiler/Runtime Features:
> >>>> * GPU support (full compiler and runtime support)
> >>>>>> Can we break this down into phases: https://issues.apache.org/jira/browse/SYSTEMML-445 ? We can discuss the timeline of the phases in the JIRA.
> >>>> * Compressed linear algebra v2
> >>>> * Code generation (automatic operator fusion)
> >>>> * Extended parfor (full spark exploitation, micro-batch support)
> >>>> * Scale-up architecture (large dense blocks, numa)?
> >>>>
> >>>> 4) Tools
> >>>> * Extended stats (task locality, shuffle, etc)
> >>>> * Cloud resource advisor (extended resource optimizer)?
> >>>>
> >>>> 5) Algorithms
> >>>> * Graduate "staging" algorithms (robustness/performance)
> >>>> * Perftest: include all algorithms into automated performance tests
> >>>>>> via spark-submit + via Scala/Python wrappers
> >>>> * Simplify usage decision trees, random forest, mlogreg, msvm (preprocessing, label representation, etc)
> >>>>>> + command-line variable naming. For example: maxi, maxiter, etc.
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Niketan Pansare
> >>>> IBM Almaden Research Center
> >>>> E-mail: npansar At us.ibm.com
> >>>> http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >>>>
> >>>> From: Matthias Boehm <mboe...@googlemail.com>
> >>>> To: dev@systemml.incubator.apache.org
> >>>> Date: 01/03/2017 02:44 PM
> >>>> Subject: Re: [DISCUSS] Roadmap SystemML 1.0
> >>>>
> >>>> Yes indeed, most of (3) and (4) can be done incrementally. For (5), some of the changes might also modify the signature of algorithms (i.e., parameters and required input data) but it would help, for example with decision trees, as users no longer need to dummy code their inputs.
> >>>>
> >>>> Generally, I'm fine with making (3), (4), and part of (5) optional and let the "must-have" features from (1) and (2) determine the timeline.
> >>>>
> >>>> Regards,
> >>>> Matthias
> >>>>
> >>>> On 1/3/2017 11:27 PM, Luciano Resende wrote:
> >>>>> On Tue, Jan 3, 2017 at 11:50 AM, Matthias Boehm <mboe...@googlemail.com> wrote:
> >>>>>
> >>>>>> I'd like to initiate the discussion of a concrete roadmap for our next release. According to previous discussions, I'd think it's fair to say that we agree on calling it SystemML 1.0. We should carefully plan this release as it's an opportunity to change APIs and remove some older deprecated features. I'd like to encourage not just developers but also the broader community to participate in this discussion.
> >>>>>>
> >>>>>> Personally, I think a target date of Q2/2017 is realistic. Let's start with collecting the major features and changes that potentially affect users.
> >>>>>> Here is an initial list, but please feel free to add and up- or down-vote the individual items.
> >>>>>>
> >>>>>> 1) APIs and Language:
> >>>>>> * Cleanup new MLContext (matrix/frame data types, move tests, etc)
> >>>>>> * Remove old MLContext
> >>>>>> * Consolidate MLContext and JMLC
> >>>>>> * Full support for Scala/Python DSLs
> >>>>>> * Remove old file-based transform
> >>>>>> * Scala/Python wrappers for all existing algorithms
> >>>>>> * Data converters (additional formats: e.g., libsvm; performance)
> >>>>>>
> >>>>>> 2) Updated Dependencies:
> >>>>>> * Spark 2.0 support
> >>>>>> * Matrix block library (isolated jar)
> >>>>>>
> >>>>>> 3) Compiler/Runtime Features:
> >>>>>> * GPU support (full compiler and runtime support)
> >>>>>> * Compressed linear algebra v2
> >>>>>> * Code generation (automatic operator fusion)
> >>>>>> * Extended parfor (full spark exploitation, micro-batch support)
> >>>>>> * Scale-up architecture (large dense blocks, numa)?
> >>>>>>
> >>>>>> 4) Tools
> >>>>>> * Extended stats (task locality, shuffle, etc)
> >>>>>> * Cloud resource advisor (extended resource optimizer)?
> >>>>>>
> >>>>>> 5) Algorithms
> >>>>>> * Graduate "staging" algorithms (robustness/performance)
> >>>>>> * Perftest: include all algorithms into automated performance tests
> >>>>>> * Simplify usage decision trees, random forest, mlogreg, msvm (preprocessing, label representation, etc)
> >>>>>>
> >>>>>> Items marked with a ? can potentially be moved out to subsequent releases.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Matthias
> >>>>>
> >>>>> My understanding is that most of the items in 1 and 2 are going to break backward compatibility, while the others can be done incrementally. Is this assumption correct? If so, can we finish 1 and 2 and do a 1.0 release, and then continue with 3, 4, 5, etc.? I don't think we should wait for 2017/Q2 to do a 1.0 release. I believe in release early, release often, particularly to attract new users, who can help verify and contribute to specific releases.
> >>>>>
> >>>>> Thoughts?
> >
> > --
> > Sent from my Mobile device