It's not really easier; they've been working on getting this released
for 2.5 years or more. What I think will make it easier is having more
of a precedent. In the government, it's always easier to say no than
yes. Showing that it can be done and done successfully will push them
to develop a consistent process. Hopefully in the future it will take
less than 2.5 years to go public :)

-Joey

On Fri, Sep 2, 2011 at 3:37 PM, Ted Yu <[email protected]> wrote:
> Thanks for the update Joey.
> Could someone close to NSA disclose what has changed recently that makes
> contributing to Open Source easier?
>
> On Fri, Sep 2, 2011 at 12:30 PM, Joey Echeverria <[email protected]> wrote:
>
>> To add to what Todd said, I actually worked with those guys for the
>> last 3 years and have used Accumulo in production. It's true that it
>> would have been better if they had been able to contribute to HBase
>> rather than go on their own, but it's not easy to contribute to open
>> source, either officially or unofficially when you work at NSA. I
>> think there is precedent for competing and/or "duplicate" Apache
>> projects, Avro/Thrift and HBase/Cassandra come to mind. I'm mostly
>> interested in this project setting a precedent for other work at NSA
>> to be developed as open source.
>>
>> -Joey
>>
>> On Fri, Sep 2, 2011 at 3:09 PM, Todd Lipcon <[email protected]> wrote:
>> > Hey folks,
>> >
>> > <wearing my Todd hat and not my Cloudera hat!>
>> >
>> > I've been in touch with this team for the last 18 months or so.
>> > They're good people, smart, and have a healthy respect for HBase and
>> > our team. Though they haven't contributed code or participated on the
>> > lists, I can vouch that they do follow our development and generally
>> > do understand HBase as well as what makes their system different. In
>> > the context of the incubator proposal, they're trying to explain why
>> > their system is different than HBase, and not trying to knock our
>> > project. They do borrow our ideas, and in the future we'll be able to
>> > borrow some of theirs. Iterator trees, for example, are distinct from
>> > coprocessors and have some really nice capabilities which I'm looking
>> > forward to adapting into HBase.
>> >
>> > There are a couple things to keep in mind about the story here:
>> > - they first evaluated HBase 3 years ago. HBase at that point was not
>> > usable for their application - I think several of us here remember the
>> > state of HBase at the time and might have made the same decision. So,
>> > they started their own project with an internal team of 5-6 people.
>> > - contributing to open source from within the NSA is not easy, for
>> > obvious reasons. They've jumped through many hoops to open source
>> > this, and we should be thankful for that. Now that they're out in open
>> > source land, I think we'll see them collaborating with us much more
>> > openly.
>> >
>> > I for one look forward to working with these folks, and maybe merging
>> > the projects some time down the road as the feature lists converge.
>> >
>> > -Todd
>> >
>> > On Fri, Sep 2, 2011 at 11:40 AM, Gary Helmling <[email protected]> wrote:
>> >> Some comments on the proposal and differentiation vs HBase:
>> >>
>> >> Access Labels:
>> >>
>> >> The proposal claims that this is "unlikely to be adopted [in HBase]".  This
>> >> is completely untrue.  This has been discussed many times in the past in
>> >> relation to our security implementation.  It's just been deferred at the
>> >> moment due to a need to focus on the initial implementation.  But it's
>> >> certainly viewed as a potentially important feature for a future iteration.
>> >> Contributions always welcome!
>> >>
>> >> see HBASE-3435: Provide per-column-qualifier and per-key-value security for
>> >> HBASE-3025
>> >>
>> >>
>> >> Iterators:
>> >>
>> >> What do these provide that RegionObservers don't?  I'm speculating since the
>> >> proposal provides little in the way of details, but if these are "unlikely
>> >> to be adopted" it's only because coprocessors already offer more extensive
>> >> functionality.
>> >>
>> >>
>> >> "Flexibility" aka online schema changes and locality groups
>> >>
>> >> Locality groups seem to be the only meaningful differentiation in this
>> >> entire comparison.
>> >>
>> >>
>> >> Testing
>> >>
>> >> Performance under "some configurations and conditions" and unsubstantiated
>> >> "greater data integrity" is not meaningful differentiation.
>> >>
>> >>
>> >> Apache Brand
>> >>
>> >> Claims a relationship with HBase.  Is there overlapping code or is this just
>> >> the duplication of functionality?  There's no community relationship that
>> >> I'm aware of.  I haven't seen any of the proposed committers on the HBase
>> >> user and dev lists to this point, so that doesn't set much of a precedent
>> >> for community interaction.
>> >>
>> >>
>> >> Overall I see no meaningful differentiation vs HBase as an existing project,
>> >> no past attempts to interact with the most relevant Apache community, and
>> >> only an, until now, private "community" of government users.  I think it's
>> >> great that they want to open source this.  I don't want to discourage that
>> >> -- go for it!  But I don't see what the benefit is of ASF incubating this.
>> >> I only see the potential for community fragmentation and market confusion
>> >> over such closely similar projects.
>> >>
>> >>
>> >> Gary
>> >>
>> >>
>> >> On Fri, Sep 2, 2011 at 11:06 AM, Stack <[email protected]> wrote:
>> >>
>> >>> See here for the incubator proposal:
>> >>> http://wiki.apache.org/incubator/AccumuloProposal
>> >>>
>> >>> Reactions probably better belong over on the incubator mailing list
>> >>> but I thought a discussion here first might be useful developing a
>> >>> stance.
>> >>>
>> >>> Initial reaction, not having seen the code, is that it seems to be close to
>> >>> HBase; so close, they call HBase out explicitly in their proposal.
>> >>>
>> >>> The cell based 'access labels' seem like a matter of adding
>> >>> an extra field to KV and their Iterators seem like a specialization on
>> >>> Coprocessors.  The ability to add column families on the fly seems too
>> >>> minor a difference to call out especially if online schema edits are
>> >>> now (soon) supported.  They talk of locality group like functionality
>> >>> too -- that could be a significant difference.  We would have to see the
>> >>> code but at first blush, differences look small.
>> >>>
>> >>> Yet another BT implementation further divides this contended space.
>> >>> If there were to be an effort integrating HBase into Accumulo or vice
>> >>> versa, it's likely to distract significantly from project forward motion (If
>> >>> the Accumulo fellows were interested in integrating the two projects,
>> >>> I'd have thought they'd have tried to talk to us before this so that's
>> >>> probably not their intent).
>> >>>
>> >>> On other hand, if their once-secret project is out in the open, we can
>> >>> steal the Apache-licensed good bits and....
>> >>>
>> >>> What do folks think?
>> >>>
>> >>> St.Ack
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > Todd Lipcon
>> > Software Engineer, Cloudera
>> >
>>
>>
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434
>>
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
