Re: Creating Spark Extras project, was Re: SPARK-13843 and future of streaming backends

Steve Loughran Sat, 16 Apr 2016 04:47:39 -0700

On 15/04/2016, 17:41, "Mattmann, Chris A (3980)" 
<[email protected]> wrote:

>Yeah in support of this statement I think that my primary interest in
>this Spark Extras and the good work by Luciano here is that anytime we
>take bits out of a code base and “move it to GitHub” I see a bad precedent
>being set.
>
>Creating this project at the ASF creates a synergy between *Apache Spark*
>which is *at the ASF*.
>
>We welcome comments and as Luciano said, this is meant to invite and be
>open to those in the Apache Spark PMC to join and help.
>
>Cheers,
>Chris

As one of the people named, here's my rationale:

Throwing stuff into GitHub creates that world of branches, and it's no longer 
something that could be managed through the ASF, where "managed" means governance, 
participation and a release process that includes auditing dependencies, 
code sign-off, etc.

As an example, there's a mutant Hive JAR which Spark uses. So far it has 
evolved between my repo and Patrick Wendell's; now that Josh 
Rosen has taken on the bold task of "trying to move spark and twill to Kryo 3", 
he's going to own that code, and the reference branch will move somewhere 
else.

In contrast, if there was an ASF location for this, then it'd be something 
anyone with commit rights could maintain and publish.

(Actually, I've just realised life is hard here, as that Hive JAR is a fork of ASF 
Hive; really the Spark branch should be a separate branch in Hive's own repo. 
But the concept is the same: those bits of the codebase which are core 
parts of the Spark project should really live in or near it.)

If everyone on the Spark commit list gets write access to this extras repo, 
moving things is straightforward. Release-wise, things could/should be in sync.

If there's a risk, it's the eternal problem of the contrib/ dir: stuff ends 
up there that never gets maintained. I don't see that being any worse than if 
things were thrown to the wind of a thousand GitHub repos: at least now there'd 
be a central issue tracking location.
