I agree with this sentiment, that one big drop could be more than people could/would devote time to, and that small proposals/prototypes would be more digestible.
It would also be easier to steer the course as we go.

> On Apr 5, 2014, at 8:30 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>
> PS. I personally don't think there would be significant hiccups with the
> review process. There's a very good chance things are either resolvable or
> insignificant enough to be forgone due to "power of do" Apache principle.
> However, please keep in mind the costs of committers' time -- the best way
> is to do things in smaller steps. We also need some time to collect some
> input from users of Mahout APIs, not just internally in the project -- if
> there's any change to such APIs.
>
> -d
>
>
>> On Sat, Apr 5, 2014 at 8:04 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>>
>>
>>> On Fri, Apr 4, 2014 at 2:13 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>>
>>> To add to Sri's comments:
>>>
>>> This code is intended for contribution if the
>>> objections of one committer are over-come by the concrete results of the
>>> prototype.
>>
>> I would like to comment that there are no concerns against making this
>> contribution -- not at this point anyway.
>>
>> There is a technicality concern based solely on vague and very
>> non-specific communication of intended contributor. However, since
>> prototype is not made available to Mahout community, there's no way to
>> either confirm, refute or resolve this -- or any other -- concern at this
>> point.
>>
>> No physical & tangible contribution -- no concerns. Can't be.
>>
>> There are of course plenty of cases when closed project becomes open, but
>> usually this either goes through Apache incubation process, or there's a
>> legitimate reason to keep it closed (e.g. novel methodology and patent or
>> publication pending).
>>
>> If none of these apply, I would respectfully urge the prospective
>> contributors to submit their work for early review, assuming everyone is
>> holding Mahout community interests dear first.
>>
>> The reasons to make prototype and TDD available early include:
>>
>> -- eliminate all sorts of speculative thinking per above. The sooner we do
>> that, the less speculation we'll produce in waiting.
>> -- it is hard for committers to do a quality review on super-massive
>> commit dumps due to time constraints. It is much easier to do so in steps
>> and portions.
>> -- failure to engage community into the effort: No coder alone making any
>> changes to Mahout code could reliably assert that they are not creating
>> problems for Mahout and/or outside users, since no one has the entire
>> Mahout picture in his or her head. We need the entire community to assert
>> benign nature of Mahout code modifications or additions.
>> -- it is also more expensive to resolve architectural problems once a
>> significant amount of changes is made; it would be a bit of a "my way or
>> the highway" way of offering things.
>> -- development of intended open software contribution that is available
>> only to corporate entities, is not, well, open by definition.
>>
>>
>>> On Fri, Apr 4, 2014 at 6:47 PM, SriSatish Ambati <srisat...@0xdata.com> wrote:
>>>
>>>> Grant,
>>>> On 0xdata / H2O front:
>>>>
>>>> We feel very excited at making Apache Mahout the principal platform for
>>>> scalable machine learning and are rapidly prototyping an initial
>>>> integration with the Matrix API. Ted (apache.org), Cliff Click
>>>> (acm.org/0xdata), Anand Avati (Redhat) and Michal Malohava (0xdata) are
>>>> heads down on that & making brisk progress. We hope to get the discussions
>>>> restarted in the JIRAs and google hangouts as soon as we get past the first
>>>> cut.
>>>>
>>>> We also chose to have the first-level integration with Mahout be a
>>>> maven dependency -
>>>> That way we can flesh things out without major interruption and the grant
>>>> work.
>>>>
>>>> In parallel, several members and teams have been reworking the core
>>>> architecture to get a clean separation on the Algorithms & Core, an
>>>> in-memory (mr/task) API and a decent client framework with data read/write.
>>>> This will allow Apache Mahout and other ML libraries to use Spark,
>>>> Stratosphere or other engines for performance and extensibility.
>>>>
>>>> This is the state of the union at the moment -
>>>> I'm very enthusiastic at making this a win for the ardent Community of
>>>> Machine Learning users and developers.
>>>> We are very grateful for the warmth, welcome, attention and impassionate
>>>> reviews we received from the Apache community. Thank you for that.
>>>> We should have more to report in the month ahead.
>>>>
>>>> Looking forward, Sri
>>>>
>>>>
>>>> On Fri, Apr 4, 2014 at 6:44 AM, Grant Ingersoll <gsing...@apache.org> wrote:
>>>>
>>>>> Can someone summarize the 0xData and the Spark work for me for the board
>>>>> report? I've unfortunately been too busy to keep up on the threads on it,
>>>>> but need to write the board report for this month.
>>>>>
>>>>> You can either summarize here or add it to the community section at
>>>>> https://svn.apache.org/repos/asf/mahout/pmc/board-reports/2014/board-report-apr.txt
>>>>>
>>>>> Also, assuming we are going ahead w/ the 0xData stuff, we likely need to
>>>>> do a software grant for that.
>>>>>
>>>>> Thanks,
>>>>> Grant
>>>>>
>>>>> --------------------------------------------
>>>>> Grant Ingersoll | @gsingers
>>>>> http://www.lucidworks.com
>>>>
>>>>
>>>> --
>>>> ceo & co-founder, 0 <http://www.0xdata.com/>*x*data Inc
>>>> +1-408.316.8192