On Tue, Nov 12, 2013 at 10:18 AM, Sebastian Schelter <[email protected]> wrote:

>
>
> @Sean
>
> However, I also cannot understand why Cloudera and you need to start a
> new open source project that in many ways mirrors what mahout offers.
> Why not contribute the algorithm implementations (the computation layer)
> to mahout and build the serving layer as a project on top of that? I
> don't see what would have prevented this, I would think it would have
> been warmly welcomed by this community.
>
> I can also understand Ted's worries about Cloudera's attitude towards
> open source, after having heard Impala's view of "open source" at the
> last Buzzwords (the lead developer of Impala answered the question
> whether Impala accepts patches with the statement that Impala is
> developed by Cloudera engineers and others can only look at the source
> code on github...). I hope that Oryx chooses another path (I also hope
> this for Impala).
>
> It's a very bad day for mahout today.
>

Second that -- it's all about control. It has always been.  Even when
rivals of Cloudera's model have been forced to work together on a project,
it has always been about committership clout. The next logical step is to
cut all foreign committers and claim (sell) the utter domain expertise. It
is not bad, it's just a business model. There are tons of companies that
try to build Drill-like brute force machine clones too today, even if they
open some or all of what they do.

Even Amplab is sort of that way, keeping control over architecture during
decision times and then dumping it on the community once all important
calls are already made. Again, nothing wrong with it, it is all open and
free after all. And a healthy share of dictatorship cuts the dev effort.

However, let's call it out for what it is.  Open source usually means
community. Just because something got opened doesn't mean there's a
community outside its initial effort, but I suppose customers still like
the "open" word without making much of a distinction over the "community"
part of it.

-d


>
> --sebastian
>
> PS:
>
> I still have to comment to this statement: "I don't think the current
> state of the code means it's feasible to truly evolve it towards things
> like Hadoop 2, Spark, real-time."
>
> To me this sounds like a marketing statement, "look, we can give you
> something better than mahout". Porting mahout's algorithms to spark is
> something that can be done with very little effort, I ported
> RowSimilarityJob in a single evening recently as a getting started with
> Spark exercise. Making the codebase ready is only a matter of will to
> invest time and efforts.
>

+1111. The statement Sebastian is referring to couldn't be further from the
truth. We use Mahout as building blocks in the Spark framework, including
Pregel and GraphX (the latter is still under dev at Amplab). We are, and
will be, on the hook to contribute that back to Mahout. (Well, we are still
rehashing what we move and what we do not, but I am moving some parts of it
to GitHub per Mahout issue.)

The whole difference is really whether one decides to contribute it, subject
to peer review, or just keeps saying "it's not possible".


>
>
>
> On 12.11.2013 16:54, Sean Owen wrote:
> > On Tue, Nov 12, 2013 at 2:13 PM, Ted Dunning <[email protected]> wrote:
> >> Cloudera's primary influence is to get you to ask to go emeritus, i.e.
> >> stop contributing.
> >>
> >> You have contributed in the past.  That's great.  And now you work for
> >> Cloudera.
> >
> > I started building on a new code base and left the PMC from about mid
> > 2012 and began at Cloudera in July 2013. Right -- check the archives?
> > I mean... it doesn't add up even time-wise.
> >
> > It's only relevant in that I hope to expose and defuse this suggestion
> > of some kind of plot. Certainly, it's best to steer clear of what
> > might be perceived as vendor stone-throwing... I am sure it's not
> > relevant to dev@.
> >
> >
> >> Getting a paycheck is also a legitimate reason for you to do this.  And it
> >> should be recognized where the paycheck comes from and what is really
> >> going on.
> >
> > A plot so deep even the plotters are unaware! I am definitely paid to
> > write open source code as are a lot of people here and it's a Good
> > Thing. Surely we do not suggest otherwise?
> >
> >
> >> Well, I think that it is a hypocrisy fail going on.  I get criticized all
> >> the time by Cloudera employees for "not being open".  And now the shoe is
> >> on the other foot where Cloudera decides it is better to not contribute to
> >> an existing open source project and, indeed, even hires away a key
> >> developer of same.
> >
> > I don't understand the equivalence -- was it not clear that Oryx is
> > open source not proprietary? -- but pursuing it is just going to look
> > like vendor spat.
> >
> > I don't understand the idea that contributing to one open source
> > project is wrong, but to another is right. Mahout is not more sacred
> > than any other, nor more open or important by having an Apache badge.
> > It can't be that, because Mahout exists, nobody else should try to
> > write anything like ML on Hadoop.
> >
> > Ted sorry to be on your black list -- a lesson to anyone else thinking
> > of leaving an Apache project? ay, you know where I live! I am happy to
> > be accused of working on another open project now, but hope nobody
> > agrees with the other suggestions. I'd feel bad if it were read widely
> > this way.
> >
>
>
