As a subproject of Hadoop, Chimera could maintain its own release cadence.
There's also no reason why it should maintain dependencies on other
parts of Hadoop, if those are separable. How is this solution
inadequate?

If Chimera is not successful as an independent project or stalls,
Hadoop and/or Spark and/or $project will have to reabsorb it as
maintainers. Projects have high mortality in early life, and a fight
over inheritance/maintenance is something we'd like to avoid. If, on
the other hand, it develops enough of a community that it is
obviously viable, then we can (and should) break it out as a TLP (as
we have before). If other Apache projects take a dependency on
Chimera, we're open to adding them to security@hadoop.

Unlike Yetus, which was largely rewritten right before it was made
into a TLP, security in Hadoop has a complicated pedigree. If Chimera
eventually becomes a TLP, it seems fair to include those who work on
it while it is a subproject. Declared upfront, that criterion is
fairer than any post hoc justification, and will lead to a more
accurate account of its community than a subset of the Hadoop
PMC/committers that volunteer. -C


On Mon, Feb 1, 2016 at 9:29 PM, Chen, Haifeng <haifeng.c...@intel.com> wrote:
> Thanks to all the folks providing feedback and participating in the discussion.
>
> @Owen, do you still have any concerns about going forward in the direction of 
> Apache Commons (or another option, such as a TLP)?
>
> Thanks,
> Haifeng
>
> -----Original Message-----
> From: Chen, Haifeng [mailto:haifeng.c...@intel.com]
> Sent: Saturday, January 30, 2016 10:52 AM
> To: hdfs-dev@hadoop.apache.org
> Subject: RE: Hadoop encryption module as Apache Chimera incubator project
>
>>> I believe encryption is becoming a core part of Hadoop. I think that
>>> moving core components out of Hadoop is bad from a project management 
>>> perspective.
>
>> Although it's certainly true that encryption capabilities (in HDFS, YARN, 
>> etc.) are becoming core to Hadoop, I don't think that should really 
>> influence whether or not the non-Hadoop-specific encryption routines should 
>> be part of the Hadoop code base, or part of the code base of another project 
>> that Hadoop depends on. If Chimera had existed as a library hosted at ASF 
>> when HDFS encryption was first developed, HDFS probably would have just 
>> added that as a dependency and been done with it. I don't think we would've 
>> copy/pasted the code for Chimera into the Hadoop code base.
>
> Agree with ATM. I also want to make an additional clarification. I agree that 
> the encryption capabilities are becoming core to Hadoop. This effort, though, 
> is to put the common, shareable encryption routines, such as the crypto stream 
> implementations, into a scope where they can be widely shared across the 
> Apache ecosystem. It doesn't move Hadoop encryption out of Hadoop (that is not 
> possible).
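>
> For illustration, here is a minimal sketch of the kind of crypto stream 
> such a library provides, written against plain JCE classes rather than 
> Chimera's actual API (the class and method names below are hypothetical). 
> AES/CTR/NoPadding is the transformation HDFS encryption uses; a shared 
> library's value-add is an AES-NI accelerated implementation behind the 
> same stream shape:
>
>   import javax.crypto.Cipher;
>   import javax.crypto.CipherInputStream;
>   import javax.crypto.spec.IvParameterSpec;
>   import javax.crypto.spec.SecretKeySpec;
>   import java.io.InputStream;
>
>   public class CtrStreams {
>       // Wrap a ciphertext stream so callers transparently read plaintext.
>       public static InputStream decrypting(InputStream in, byte[] key, byte[] iv)
>               throws Exception {
>           Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
>           cipher.init(Cipher.DECRYPT_MODE,
>                   new SecretKeySpec(key, "AES"),
>                   new IvParameterSpec(iv));
>           return new CipherInputStream(in, cipher);
>       }
>   }
>
> Because nothing in that shape is Hadoop-specific, any project can consume 
> it as an ordinary jar dependency.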
>
> Agreed that making Chimera a separately and independently released project 
> within Hadoop goes a step further than the existing approach and solves some 
> issues (such as the libhadoop.so problem). Frankly speaking, I think it is not 
> the best option we can try. I also expect that an independent release project 
> within Hadoop core would complicate Hadoop's existing release model.
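>
> To make the libhadoop.so point concrete: one common fix is to bundle the 
> native library inside the project's own jar and extract it at load time, 
> so nothing extra has to be pre-installed on the nodes. A rough sketch, 
> with illustrative names rather than Chimera's actual code:
>
>   import java.io.InputStream;
>   import java.nio.file.Files;
>   import java.nio.file.Path;
>   import java.nio.file.StandardCopyOption;
>
>   public final class NativeLoader {
>       // Copy the bundled .so out of the jar and load it via JNI.
>       public static void load(String resource) {
>           try (InputStream in = NativeLoader.class.getResourceAsStream(resource)) {
>               if (in == null) {
>                   throw new UnsatisfiedLinkError("not bundled: " + resource);
>               }
>               Path tmp = Files.createTempFile("chimera", ".so");
>               tmp.toFile().deleteOnExit();
>               Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
>               System.load(tmp.toAbsolutePath().toString());
>           } catch (java.io.IOException e) {
>               throw new UnsatisfiedLinkError("failed to load " + resource + ": " + e);
>           }
>       }
>   }
>
> A separately released library can ship per-platform binaries this way, 
> which a component tied to Hadoop's release train cannot easily do.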
>
> Thanks,
> Haifeng
>
> -----Original Message-----
> From: Aaron T. Myers [mailto:a...@cloudera.com]
> Sent: Friday, January 29, 2016 9:51 AM
> To: hdfs-dev@hadoop.apache.org
> Subject: Re: Hadoop encryption module as Apache Chimera incubator project
>
> On Wed, Jan 27, 2016 at 11:31 AM, Owen O'Malley <omal...@apache.org> wrote:
>
>> I believe encryption is becoming a core part of Hadoop. I think that
>> moving core components out of Hadoop is bad from a project management 
>> perspective.
>>
>
> Although it's certainly true that encryption capabilities (in HDFS, YARN,
> etc.) are becoming core to Hadoop, I don't think that should really influence 
> whether or not the non-Hadoop-specific encryption routines should be part of 
> the Hadoop code base, or part of the code base of another project that Hadoop 
> depends on. If Chimera had existed as a library hosted at ASF when HDFS 
> encryption was first developed, HDFS probably would have just added that as a 
> dependency and been done with it. I don't think we would've copy/pasted the 
> code for Chimera into the Hadoop code base.
>
>
>> To put it another way, a bug in the encryption routines will likely
>> become a security problem that security@hadoop needs to hear about.
>>
>> I don't think
>> adding a separate project in the middle of that communication chain is
>> a good idea. The same applies to data corruption problems, and so on...
>>
>
> Isn't the same true of all the libraries that Hadoop currently depends upon? 
> If the commons-httpclient library (or commons-codec, or commons-io, or guava, 
> or...) has a security vulnerability, we need to know about it so that we can 
> update our dependency to a fixed version. This case doesn't seem materially 
> different than that.
>
>
>>
>>
>> > It may be good to keep it at a generalized place (as in the discussion, we
>> > thought that place could be Apache Commons).
>>
>>
>> Apache Commons is a collection of *Java* projects, so Chimera as a
>> JNI-based library isn't a natural fit.
>>
>
> Could very well be that Apache Commons's charter would preclude Chimera.
> You probably know better than I do about that.
>
>
>> Furthermore, Apache Commons doesn't
>> have its own security list so problems will go to the generic
>> secur...@apache.org.
>>
>
> That seems easy enough to remedy, if they wanted to, and besides, I'm not sure 
> why that would influence this discussion. In my experience projects that 
> don't have a separate security@project.a.o mailing list tend to just handle 
> security issues on their private@project.a.o mailing list, which seems fine 
> to me.
>
>
>>
>> Why do you think that Apache Commons is a better home than Hadoop?
>>
>
> I'm certainly not at all wedded to Apache Commons, that just seemed like a 
> natural place to put it to me. Could be that a brand new TLP might make more 
> sense.
>
> I *do* think that if other non-Hadoop projects want to make use of Chimera, 
> which as I understand it is the goal which started this thread, then Chimera 
> should exist outside of Hadoop so that:
>
> a) Projects that have nothing to do with Hadoop can just depend directly on 
> Chimera, which has nothing Hadoop-specific in there.
>
> b) The Hadoop project doesn't have to export/maintain/concern itself with yet 
> another publicly-consumed interface.
>
> c) Chimera can have its own (presumably much faster) release cadence 
> completely separate from Hadoop.
>
> --
> Aaron T. Myers
> Software Engineer, Cloudera
