I agree with this sentiment, that one big drop could be more than people 
could/would devote time to, and that small proposals/prototypes would be more 
digestible.

It would also be easier to steer course as we go.

> On Apr 5, 2014, at 8:30 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> 
> PS. I personally don't think there would be significant hiccups with the
> review process. There's a very good chance things are either resolvable or
> insignificant enough to be foregone due to "power of do" Apache principle.
> However, please keep in mind the costs of committers' time -- the best way
> is to do things in smaller steps. We also need some time to collect some
> input from users of Mahout APIs, not just internally in the project -- if
> there's any change to such apis.
> 
> -d
> 
> 
>> On Sat, Apr 5, 2014 at 8:04 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>> 
>> 
>> 
>> 
>>> On Fri, Apr 4, 2014 at 2:13 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>>> 
>>> To add to Sri's comments:
>> 
>>> This code is intended for contribution if the
>>> 
>>> objections of one committer are over-come by the concrete results of the
>>> prototype.
>> 
>> I would like to comment that there are no concerns against making this
>> contribution -- not at this point anyway.
>> 
>> There is a technicality concern based solely on the vague and very
>> non-specific communication from the intended contributor. However, since
>> the prototype has not been made available to the Mahout community, there's
>> no way to confirm, refute or resolve this -- or any other -- concern at
>> this point.
>> 
>> No physical & tangible contribution -- no concerns. Can't be.
>> 
>> There are of course plenty of cases when a closed project becomes open, but
>> usually this either goes through the Apache incubation process, or there's a
>> legitimate reason to keep it closed (e.g. novel methodology and patent or
>> publication pending).
>> 
>> If none of these apply, I would respectfully urge the prospective
>> contributors to submit their work for early review, assuming everyone is
>> holding Mahout community interests dear first.
>> 
>> The reasons to make prototype and TDD available early include:
>> 
>> -- eliminate all sorts of speculative thinking per above. The sooner we do
>> that, the less speculation we'll produce in waiting.
>> -- it is hard for committers to do a quality review on super-massive
>> commit dumps due to time constraints. It is much easier to do so in steps
>> and portions.
>> -- failure to engage community into the effort: No coder alone making any
>> changes to Mahout code could reliably assert that they are not creating
>> problems for Mahout and/or outside users, since no one has the entire
>> Mahout picture in his or her head.  We need the entire community to assert
>> benign nature of Mahout code modifications or additions.
>> -- it is also more expensive to resolve architectural problems once a
>> significant amount of changes is made; it would be a bit of a "my way or
>> the highway" way of offering things.
>> -- development of intended open software contribution that is available
>> only to corporate entities, is not, well, open by definition.
>> 
>> 
>> 
>>> 
>>> 
>>> On Fri, Apr 4, 2014 at 6:47 PM, SriSatish Ambati <srisat...@0xdata.com>
>>> wrote:
>>> 
>>>> Grant,
>>>> On 0xdata / H2O front:
>>>> 
>>>> We feel very excited at making Apache Mahout the principal platform for
>>>> scalable machine learning and are rapidly prototyping an initial
>>>> integration with the Matrix API. Ted (apache.org), Cliff Click (
>>>> acm.org/0xdata), Anand Avati (Redhat) and Michal Malohava (0xdata) are
>>>> heads down on that & making brisk progress. We hope to get the
>>>> discussions restarted in the JIRAs and google hangouts as soon as we
>>>> get past the first cut.
>>>> 
>>>> We also chose to make the first-level integration with Mahout a
>>>> Maven dependency -
>>>> That way we can flesh things out without major interruption and the
>>>> grant work.
>>>> 
>>>> In parallel, several members and teams have been reworking the core
>>>> architecture to get a clean separation on the Algorithms & Core, an
>>>> in-memory (mr/task) API and a decent client framework with data
>>>> read/write.
>>>> This will allow Apache Mahout and other ML libraries to use Spark,
>>>> Stratosphere or other engines for performance and extensibility.
>>>> 
>>>> This is the state of the union at the moment -
>>>> I'm very enthusiastic at making this a win for the ardent Community of
>>>> Machine Learning users and developers.
>>>> We are very grateful for the warmth, welcome, attention and impassionate
>>>> reviews we received from the Apache community.  Thank you for that.
>>>> We should have more to report in the month ahead.
>>>> 
>>>> Looking forward, Sri
>>>> 
>>>> 
>>>> 
>>>> On Fri, Apr 4, 2014 at 6:44 AM, Grant Ingersoll <gsing...@apache.org>
>>>> wrote:
>>>> 
>>>>> Can someone summarize the 0xData and the Spark work for me for the
>>>>> board report?  I've unfortunately been too busy to keep up on the
>>>>> threads on it, but need to write the board report for this month.
>>>>> 
>>>>> You can either summarize here or add it to the community section at
>>>>> https://svn.apache.org/repos/asf/mahout/pmc/board-reports/2014/board-report-apr.txt
>>>>> 
>>>>> Also, assuming we are going ahead w/ the 0xData stuff, we likely need
>>>>> to do a software grant for that.
>>>>> 
>>>>> Thanks,
>>>>> Grant
>>>>> 
>>>>> --------------------------------------------
>>>>> Grant Ingersoll | @gsingers
>>>>> http://www.lucidworks.com
>>>> 
>>>> 
>>>> --
>>>> ceo & co-founder, 0xdata Inc <http://www.0xdata.com/>
>>>> +1-408.316.8192
>> 
>> 
