Re: [PROPOSAL] : Airflow

2016-03-19 Thread Scott Deboy
Ll
On Mar 17, 2016 6:51 AM, "Jake Farrell"  wrote:

> Hi Siddharth
> Thanks for drafting a proposal and looking to bring Airflow to the Apache
> Incubator. Overall the proposal looks good, just a couple comments. The
> proposal has the incubator listed as the sponsor and a quick check shows
> Chris Riccomini is not on the IPMC or a member currently, details on roles
> are available at [1]. Not a blocker as you have members listed in your
> group of mentors and can ask that they step up as the champion.
>
> Mailing lists are in the old format, they should be
> - priv...@airflow.incubator.apache.org (moderated)
> - d...@airflow.incubator.apache.org
> - comm...@airflow.incubator.apache.org
>
> Github shows 109 contributors, but the proposal lists only 6 initial
> committers, were any of the other existing contributors considered for the
> proposal?
>
> When you submit the proposal please make sure to include the entire
> proposal in the email. If you have any questions please let us know
>
> -Jake
>
>
> [1]:
>
> http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion
>
> On Wed, Mar 16, 2016 at 8:28 PM, Siddharth Anand  >
> wrote:
>
> > https://wiki.apache.org/incubator/AirflowProposal
> >
> > Thoughts and comments are welcome!
> > -s (Sid)
> >
>


Re: Update on Apache Toree and LGPL dependency

2016-03-19 Thread Chip Senkbeil
Just wanted to give a status update with this one. JeroMQ is down to just
four contributors that have not responded. The current, active committers
for JeroMQ have reverted the commits for one of the contributors here:

https://github.com/zeromq/jeromq/pull/333

So, progress is still being made on this one!

> +1
>
> > On Mar 6, 2016, at 6:58 PM, Gino Bustelo  wrote:
> >
> > @john The 0mq ecosystem is made up of many projects of different sizes and maturity.
> > In the case of JeroMQ, the committers are showing an overwhelming momentum to transition to
> > MPL. I don't see any reason for us to consider any other alternative at this juncture.
> >
> > Gino B.
> >
> >> On Mar 5, 2016, at 11:42 PM, Henri Yandell  wrote:
> >>
> >> Having chatted around the 0mq community in the past; I've confidence in
> >> their desire to move to MPL; and 26/32 committers is a great step forward.
> >> You raise a good reservation though John - if you remove the blocker on the
> >> usage side, it's easy for the licensing to remain as is.
> >>
> >>
> >> I'm +1 for releasing, with a prominent note of the LGPL dependency (along
> >> with a note of the resolution plan). It might be that the Toree committers
> >> may be motivated to rewrite code over at 0mq if there ends up being any
> >> committers who are unavailable or unwilling to relicense.
> >>
> >> Hen
> >>
> >>> On Sat, Mar 5, 2016 at 3:45 PM, John D. Ament wrote:
> >>>
> >>> Sorry, misread the revision I was looking at.  The intent to move to MPL
> >>> was done on March 22 2014, 2 years ago this month, not December 2013.
> >>>
> >>> John
> >>>
> >>> On Sat, Mar 5, 2016 at 6:41 PM John D. Ament 
> >>> wrote:
> >>>
> >>>> I have some reservations with what you're proposing, and would like you to
> >>>> consult w/ legal-discuss on this first.
> >>>>
> >>>> There's a difference between what Mynewt did and what you're proposing.
> >>>> Specifically, this was a transitive dependency that they relied upon
> >>>> indirectly, so it's more of a call out for the library that was leveraging
> >>>> it.  They also intended to replace the library.
> >>>>
> >>>> In your case, you're directly tied to a presently LGPL'd library.  You
> >>>> have no intentions (from what I can see) of moving off of the library.
> >>>>
> >>>> I'm also doubting their long term goals of moving to MPL.  If you look at
> >>>> [1], you'll see that the page hasn't been updated since October 2014.  In
> >>>> addition, looking at the page's revision history (the beauty of wikis), the
> >>>> intent to move to MPL was published in December 2013, making the statement
> >>>> over 2 years old.
> >>>>
> >>>> I think while this might be OK for an initial incubator release, the
> >>>> project needs to weigh very heavily if it wants to continue to leverage
> >>>> ZeroMQ or not going forward.
> >>>>
> >>>> [1]: http://zeromq.org/area:licensing
> 
> 
> > On Sat, Mar 5, 2016 at 5:06 PM Gino Bustelo wrote:
> >
> > Wanted to give folks an update on our progress with dealing with JeroMQ, an
> > LGPL package that enables us to communicate via 0MQ. The 0MQ community is
> > very aware of the issues with LGPL (LGPLv3 + static link exception) and it
> > is their intention to try to move projects to MPL v2. This is not an easy
> > task depending on the age and size of the projects.
> >
> > Apache Toree's API access point is through the 0MQ transport layer (using
> > JeroMQ) and that is how Apache Toree connects out-of-the-box with Jupyter,
> > a very common way of consuming Apache Toree that is already in production.
> >
> > At this point, the JeroMQ project is still released under LGPL, but our
> > team initiated communications in mid-February with members of the JeroMQ
> > community to begin their transition to MPL v2
> > (https://github.com/zeromq/jeromq/issues/326). The JeroMQ community reacted
> > very positively and quickly began the process of collecting votes from
> > their committers (https://github.com/zeromq/jeromq/issues/327). After 15
> > days, the current tally stands at 26 out of 32 committers having agreed to
> > switch license.
> >
> > Apache Toree has a JIRA (https://issues.apache.org/jira/browse/TOREE-262)
> > where we keep all the relevant links and update with the latest
> > information. As that process is underway, we will move forward with plans
> > to release a 0.1.0 version of Apache Toree based on the precedent set by
> > Apache Mynewt
> > (http://mail-archives.apache.org/mod_mbox/incubator-general/201602.mbox/%3C5F118AA0-4ADA-403B-A6EB-4A85F0B30651%40me.com%3E).
> >
> > Thanks,
> > Gino
> >>>
> >
> > -
> > To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
> > For additional commands, e-mail: general-h...@i

Re: Advice on binary NOTICE

2016-03-19 Thread Justin Mclean
Hi,

I’m also not a lawyer, but I can't see anything in that NOTICE that would need 
to be copied across. If the notice was well formed it would have a copyright line, 
so perhaps just adding that would follow the intent?

Remember the NOTICE contents affect downstream projects and need to be 
kept as short as possible. Don’t put anything in that’s not required. [1]

Thanks,
Justin

1. http://www.apache.org/dev/licensing-howto.html#mod-notice
-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Please add me to the incubator's contributors group

2016-03-19 Thread Siddharth Anand
Siddharth Anand


Re: Please add me to the incubator's contributors group

2016-03-19 Thread Siddharth Anand
Hmm.. I agree the space can be problematic.. so I am trying to change my
user name under settings, but it looks like it needs me to change my email
address as well.

On Wed, Mar 16, 2016 at 5:06 PM, Siddharth Anand  wrote:

> Yes. Thanks.
>
> I had tried to follow the recommended convention of "FirstnameLastname",
> but it was taken. So, I used "Siddharth Anand" (with a space) and it
> worked.
>
>
>
>
> On Wed, Mar 16, 2016 at 4:57 PM, Nick Burch  wrote:
>
>> On Wed, 16 Mar 2016, Siddharth Anand wrote:
>>
>>> Siddharth Anand
>>>
>>
>> If you can tell us what username you signed up to the incubator wiki
>> with, then one of us can do so for you
>>
>> Nick
>>
>>
>>
>


Advice on binary NOTICE

2016-03-19 Thread Stephen Mallette
The Jackson JSON processing lib which is Apache 2.0 licensed carries this
NOTICE file:

--
# Jackson JSON processor

Jackson is a high-performance, Free/Open Source JSON processing library.
It was originally written by Tatu Saloranta (tatu.salora...@iki.fi), and has
been in development since 2007.
It is currently developed by a community of developers, as well as supported
commercially by FasterXML.com.

## Licensing

Jackson core and extension components may be licensed under different
licenses.
To find the details that apply to this artifact see the accompanying
LICENSE file.
For more information, including possible other licensing options, contact
FasterXML.com (http://fasterxml.com).

## Credits

A list of contributors may be found from CREDITS file, which is included
in some artifacts (usually source distributions); but is always available
from the source code management (SCM) system project uses.
--

Does anyone have any advice on what portion of this is relevant for
inclusion in a binary NOTICE file? Should it all be included perhaps?

Perhaps more generally, given

http://www.apache.org/dev/licensing-howto.html#alv2-dep

where it says,

"If the dependency supplies a NOTICE file, its contents must be analyzed
and the relevant portions bubbled up into the top-level NOTICE file."

is there any more detailed information on how "relevant portions" get
determined?

Thanks,

Stephen


Request for permission to edit wiki

2016-03-19 Thread Siddharth Anand
Hi, I'd like to get permission to edit wiki.
My user name is Siddharth Anand

Thank you


Allowed Champions on podlings

2016-03-19 Thread John D. Ament
All,

It was recently pointed out that some of our docs are a bit inconsistent
around who can champion a candidate podling.

http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion
http://incubator.apache.org/incubation/Incubation_Policy.html#Champion

In the former, officers and members are allowed to, but in the latter only
members.  I'd like to propose to make these two consistent, and allow both
officers and members to be champions.  I'll make this change in about 72
hours via lazy consensus unless someone comes up with a $reason why it
should be members only.

John


Re: Subject: [RESULT][VOTE] MADlib v1.9alpha-rc2

2016-03-19 Thread Frank McQuillan
Release is located at
https://dist.apache.org/repos/dist/release/incubator/madlib/1.9alpha-incubating/


On Fri, Mar 11, 2016 at 5:23 PM, Frank McQuillan 
wrote:

> The vote has PASSED with 3 +1 binding votes from the Incubator PMC
> members, and no 0 or -1 votes:
>
> +1 Justin Mclean
> +1 Roman Shaposhnik
> +1 Konstantin Boudnik
>
> Thread:
>
> http://mail-archives.apache.org/mod_mbox/incubator-general/201603.mbox/%3CCAKBQfzQh%3DJ3DrFSgFEY8teRDpEf5Yz3r7eBffTZVVN_9evpBJg%40mail.gmail.com%3E
>
> On behalf of the MADlib community, thank you to all who reviewed and voted
> on this release candidate.
>
> We will proceed with promoting this release candidate.
>
> Regards,
> Frank McQuillan
>


Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Josh Elser

+1

Glad to see this hit incubator. Feel free to add me as a mentor if you'd 
like another.


Daniel Dai wrote:

Hi,

I would like to propose Omid as an Apache Incubator project:

https://wiki.apache.org/incubator/OmidProposal

I've posted the text of the proposal below:

Thanks,
Daniel

= Omid Proposal =

=== Abstract ===

Omid is a flexible, reliable, highly performant and scalable ACID
transactional framework that allows client applications to execute
transactions on top of MVCC key/value-based NoSQL datastores
(currently Apache HBase) providing Snapshot Isolation guarantees on
the accessed data.


=== Proposal ===

Omid is a flexible open-source transactional framework that provides
ACID transactions with Snapshot Isolation guarantees on top of NoSQL
datastores. In particular, the current codebase brings the concept of
transactions to the popular Apache HBase datastore. Omid offers great
performance, it is highly available, and scalable. Omid's current
version is able to scale to thousands of clients triggering concurrent
transactions on application data stored in HBase. Omid can scale
beyond 100K transactions per second on mid-range hardware while
incurring minimal impact on the speed of data access in the
datastore. We’re currently experimenting with a prototype version that
can improve the performance up to ~380K TPS.


Omid has been publicly available as an open-source project in Github
under Apache License Version 2.0 since 2011 [1]. During these years,
it has generated certain interest in the open source community,
especially since the public presentation of the first version in
Hadoop Summit 2013 [2]. Currently the Github project has 241 Stars and
93 forks. Yahoo Inc. submits this proposal to the Apache Software
Foundation with the aim to transfer the Omid project -including its
source code and documentation- to Apache in order to start the build
of a stable open source community around it.


[1] https://github.com/yahoo/omid

[2] Omid presentation at Hadoop Summit 2013:
https://www.youtube.com/watch?v=Rhdmo9pVGgU&index=68&list=PLSAiKuajRe2luyqLU464Nxz4aQe7EPBus


=== Background ===

An Omid prototype was first released as an open-source project back in
2011. Inspired by Google Percolator [1], it offered a lock-free
approach to transactions in NoSQL datastores (See [2]). However,
during these years, the design of Omid has evolved significantly.
Whilst the current open-sourced version maintains many aspects of the
original implementation, it is the result of a major redesign of the
first prototype released in 2011.


Omid now has a more decentralized design that does not sacrifice the
consistency and performance of the original version. The current
design also enables Omid to scale to thousands of clients executing
transactions concurrently on application data stored in HBase.
Internally, Omid still utilizes a lock-free approach to support
multiple concurrent clients. Its design also relies on a centralized
conflict detection component, the TSO, which now resolves in an
efficient manner writeset collisions among concurrent transactions
without having to piggyback commit information to the clients. Another
important benefit of Omid is that it doesn't require any modification
of the underlying key-value datastore, HBase in this case. Moreover,
the recently added high availability algorithm makes it possible to eliminate the
single point of failure represented by the TSO in those system
deployments requiring a higher degree of dependability. Last but not
least, the provided user API is very simple, mimicking transaction
managers in the relational world: begin, commit, rollback.
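
The conflict-detection idea described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Omid's actual API or protocol: the `TimestampOracle` class, its `begin`/`commit` methods, and the row keys are hypothetical names invented for illustration. It shows the general snapshot-isolation rule that a transaction aborts when another transaction has committed a write to one of its keys after its snapshot was taken.

```python
import itertools


class TimestampOracle:
    """Toy centralized conflict detector (hypothetical; not Omid's API)."""

    def __init__(self):
        self._clock = itertools.count(1)
        self._last_commit = {}  # key -> commit timestamp of the latest writer

    def begin(self):
        # The start timestamp defines the snapshot the transaction reads from.
        return next(self._clock)

    def commit(self, start_ts, write_set):
        # Abort if any key in the writeset was committed after our snapshot.
        for key in write_set:
            if self._last_commit.get(key, 0) > start_ts:
                return None  # conflict: the caller must roll back
        commit_ts = next(self._clock)
        for key in write_set:
            self._last_commit[key] = commit_ts
        return commit_ts


tso = TimestampOracle()
t1 = tso.begin()
t2 = tso.begin()
first = tso.commit(t1, {"row-a"})           # succeeds
second = tso.commit(t2, {"row-a"})          # conflicts with t1's commit
third = tso.commit(tso.begin(), {"row-a"})  # new snapshot, succeeds
```

In this sketch the two overlapping transactions both write `row-a`, so the second one to commit is rejected, while a transaction that starts after the first commit goes through.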


Omid is used internally at Yahoo. Sieve, Yahoo’s web-scale content
management platform powering some next-generation search and
personalization products, is using Omid as a transaction manager in its
processing pipeline. Sieve essentially acts as a huge processing hub
between content feeds and serving systems. It provides an environment
for highly customizable, real-time, streamed information processing,
with typical discovery-to-service latencies of just a few seconds. In
terms of scale and availability, Omid’s new design was largely driven
by Sieve’s requirements.


At Yahoo, we are also making an effort to disseminate the current
status of the project through blog entries (See [3], [4] and [5]) and
submissions to technical and academic conferences such as ATC 2016,
Hadoop Summit 2016, HBaseConf 2016. Last but not least, Omid also
appeared in a TechCrunch article in the last quarter of 2015 (See [6])


[1] D. Peng and F. Dabek, Large-scale Incremental Processing Using
Distributed Transactions and Notifications. USENIX Symposium on
Operating Systems Design and Implementation, 2010

[2] D. Gomez-Ferro, F. Junqueira, I. Kelly, B. Reed, and M. Yabandeh.
Omid: Lock-free transactional support for distributed data stores. In
Proc. of ICDE, 2013.

[3] 
http://yahoohadoop.tumblr.com/post/129089878751/introducing-omid-transaction

Request for Administrator Wiki Access: ChrisNauroth

2016-03-19 Thread Chris Nauroth
Can I please be added to the incubator wiki Administrators group?  My wiki 
login is ChrisNauroth.  I would like administrator access so that I can field 
requests from contributors to get edit access on incubator proposals.

Thank you,
—Chris





Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Nick Burch

On Fri, 18 Mar 2016, Greg Trasuk wrote:
> I don’t think it’s the Incubator’s job to choose which competing 
> projects should join the foundation.  All we’re here to do is to make 
> sure that a community knows how to act like an Apache community, and 
> that the artifacts are licensed properly.


This is only my view, and I know that some key incubator folks think 
it's too prescriptive, but I have seen it work


TL;DR - Alternate ideas and approaches Good, Confusion or Corporatism Bad

Where we have two different communities, working in the same space, but in 
different languages or different approaches, then that's fine. The ASF 
doesn't pick "winners", it picks "runners". So, having a Batch 
implementation of the Foo protocol in C, and having a proposed podling for 
a Streaming implementation of the Foo protocol in Java is fine.


Where we have two different companies doing rival implementations who 
refuse to co-operate, that's an issue. Two companies who are competitors, 
who both read the "Foo protocol" spec / Foo paper, and who found rival 
projects to implement Foo in Java, is a problem. They don't have a 
technical distinction, just a refusal to co-operate and a refusal to take 
off $DAYJOB hats and a refusal to work for the best interests of the 
community. That's an issue for the incubator and the ASF


If we have a similar proposed project coming in, I would expect the 
proposed project to have a chat with the existing one to see if a merger 
is possible. If they're in the same language, and take similar approaches, 
then a merger could deliver a better community with more features, which 
would be better for everyone.


However, if the two communities had a chat, and decided they really were 
different + could explain that, then in my book that's fine. Document and 
explain those differences, so potential new community members can pick the 
"right" one for them. Maybe collaborate on some common code / tests / etc, 
don't bad-mouth each other, and help new people pick the appropriate one for 
them.


AcmeCorp and Contoso both want to bring a Java project for doing Foo, and 
won't co-operate because they're competitors = red flag


AcmeCorp founds a Ruby project for Foo, grows it, brings it to the ASF, then 
a formerly Contoso-backed Java project for Foo comes: fine.


AcmeCorp did a "Foo for 1-3 machines that's easy to get started with" and 
want to bring that, while Contoso have been working on a Foo that's a bit 
tough to setup for small clusters, but scales brilliantly past 3 racks, 
that's fine. They can share some Foo compliance tests, and new community 
members can consider their deployment sizes and pick the "right" one to 
join for them



Only my view, though at least some others share it, I hope that helps at 
least a little?


Nick


Re: Please add me to the incubator's contributors group

2016-03-19 Thread Nick Burch

On Wed, 16 Mar 2016, Siddharth Anand wrote:

Siddharth Anand


If you can tell us what username you signed up to the incubator wiki with, 
then one of us can do so for you


Nick




Re: [PROPOSAL] : Airflow

2016-03-19 Thread Chris Riccomini
> The proposal has the incubator listed as the sponsor and a quick check
> shows Chris Riccomini is not on the IPMC or a member currently, details on
> roles are available at [1].

I was under the impression that I could sponsor because I am an officer:

"A candidate project shall be sponsored by an Officer or Member of the Foundation"

The officer page lists me as a VP of Apache Samza. Perhaps I'm misreading?

> Mailing lists are in the old format

Fixed.

> Github shows 109 contributors, but the proposal lists only 6 initial
> committers, were any of the other existing contributors considered for the
> proposal?

In terms of the high ratio of contributors to committers (109:6), the
reason is as follows. Most contributions are to the fringes of the system
and are in the form of 3rd party contributed hooks/operators (for 3rd party
integrations). For example, if you want to use Airflow to speak to
Cassandra, you may want to contribute a Cassandra Hook and CRUD operators.
Airflow allows for this extensibility and allows for our users to adapt
Airflow to their needs. However, the committer list is limited to folks who
will touch the internals of the system (e.g. scheduler, metadata
management, etc...). Over time, we would like other contributors to get
familiar with the internals to the point where they can also become
maintainers, but they don't currently have the expertise.


On Thu, Mar 17, 2016 at 8:16 AM, Scott Deboy  wrote:

> Ll
> On Mar 17, 2016 6:51 AM, "Jake Farrell"  wrote:
>
> > Hi Siddharth
> > Thanks for drafting a proposal and looking to bring Airflow to the Apache
> > Incubator. Overall the proposal looks good, just a couple comments. The
> > proposal has the incubator listed as the sponsor and a quick check shows
> > Chris Riccomini is not on the IPMC or a member currently, details on roles
> > are available at [1]. Not a blocker as you have members listed in your
> > group of mentors and can ask that they step up as the champion.
> >
> > Mailing lists are in the old format, they should be
> > - priv...@airflow.incubator.apache.org (moderated)
> > - d...@airflow.incubator.apache.org
> > - comm...@airflow.incubator.apache.org
> >
> > Github shows 109 contributors, but the proposal lists only 6 initial
> > committers, were any of the other existing contributors considered for the
> > proposal?
> >
> > When you submit the proposal please make sure to include the entire
> > proposal in the email. If you have any questions please let us know
> >
> > -Jake
> >
> >
> > [1]:
> >
> >
> http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion
> >
> > On Wed, Mar 16, 2016 at 8:28 PM, Siddharth Anand
>  > >
> > wrote:
> >
> > > https://wiki.apache.org/incubator/AirflowProposal
> > >
> > > Thoughts and comments are welcome!
> > > -s (Sid)
> > >
> >
>


Request for Wiki Edit Access: SiddharthAnand

2016-03-19 Thread Siddharth Anand
Can I please be added to the incubator wiki Contributors group?  My wiki
login is SiddharthAnand.  I would like to edit my Airflow Incubator
Proposal (https://wiki.apache.org/incubator/AirflowProposal) in response to
feedback.

Thank you,
-s (Sid)


Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Flavio Junqueira
I understand the concern, so let me try to offer some facts and see if we can 
make progress from there. 

Omid has been around for some time now, and its initial design appeared in a 
couple of research papers that I actually co-authored. The architecture is 
based on the idea of having a centralized transaction status oracle that shares 
transaction status data with clients for scalability. The current Omid project 
evolved out of that initial work and it is a much improved version over that 
first iteration, with the improvements focusing on scalability. It currently 
runs in production at scale at Yahoo! and there is interest from other 
companies according to the proposal. There is a series of blog posts about the 
experience in the project proposal.

Tephra has a very similar architecture. The description here says that it has a 
transaction server, which sounds like the TSO in the original Omid papers. I 
haven't spent enough time understanding the precise protocol they use, but I 
must say that the protocol is very important for correctness and scalability. 
Having two protocols with different properties could justify the presence of 
two projects, but they both promise snapshot isolation so I suspect they will 
be doing very similar things.

Overall, as I see it, it would be very unfair to reject the Omid proposal on 
the basis that Tephra was incubated a couple of weeks ago. I'd much rather see 
how the two communities evolve and have the mentors of the projects fostering 
collaboration and possibly a merge of the two projects before graduation. Why 
not think of a general transaction status oracle with different protocol 
implementations assuming it makes sense? I wouldn't like to see either of the two 
blocked upfront on the basis that they are in the same space, though. We could 
postpone this decision until graduation when we'll have more knowledge about 
the projects and the growth of the two communities.

-Flavio

> On 18 Mar 2016, at 23:19, Henry Saputra  wrote:
> 
> I know Apache incubator does not play favorites but it is getting awkward
> that TWO transaction engines for HBase are coming to incubator at the same time.
> 
> As most people know, the other one is Tephra, which just came to incubator
> a few weeks ago.
> 
> As a member of the IPMC, I would like to see Omid provide a more detailed
> comparison of the differences that the project brings, in terms of
> approach and possible integrations with other ASF projects.
> 
> If possible, I would prefer to see the Omid team work together with Tephra
> to make one solid transaction engine for HBase and
> later NoSQL databases.
> 
> 
> - Henry
> 
> On Thu, Mar 17, 2016 at 1:17 PM, Daniel Dai  wrote:
> 
>> Hi,
>> 
>> I would like to propose Omid as an Apache Incubator project:
>> 
>> https://wiki.apache.org/incubator/OmidProposal
>> 
>> I've posted the text of the proposal below:
>> 
>> Thanks,
>> Daniel
>> 
>> = Omid Proposal =
>> 
>> === Abstract ===
>> 
>> Omid is a flexible, reliable, highly performant and scalable ACID
>> transactional framework that allows client applications to execute
>> transactions on top of MVCC key/value-based NoSQL datastores
>> (currently Apache HBase) providing Snapshot Isolation guarantees on
>> the accessed data.
>> 
>> 
>> === Proposal ===
>> 
>> Omid is a flexible open-source transactional framework that provides
>> ACID transactions with Snapshot Isolation guarantees on top of NoSQL
>> datastores. In particular, the current codebase brings the concept of
>> transactions to the popular Apache HBase datastore. Omid offers great
>> performance, it is highly available, and scalable. Omid's current
>> version is able to scale to thousands of clients triggering concurrent
>> transactions on application data stored in HBase. Omid can scale
>> beyond 100K transactions per second on mid-range hardware while
>> incurring minimal impact on the speed of data access in the
>> datastore. We’re currently experimenting with a prototype version that
>> can improve the performance up to ~380K TPS.
>> 
>> 
>> Omid has been publicly available as an open-source project in Github
>> under Apache License Version 2.0 since 2011 [1]. During these years,
>> it has generated certain interest in the open source community,
>> especially since the public presentation of the first version in
>> Hadoop Summit 2013 [2]. Currently the Github project has 241 Stars and
>> 93 forks. Yahoo Inc. submits this proposal to the Apache Software
>> Foundation with the aim to transfer the Omid project -including its
>> source code and documentation- to Apache in order to start the build
>> of a stable open source community around it.
>> 
>> 
>> [1] https://github.com/yahoo/omid
>> 
>> [2] Omid presentation at Hadoop Summit 2013:
>> 
>> https://www.youtube.com/watch?v=Rhdmo9pVGgU&index=68&list=PLSAiKuajRe2luyqLU464Nxz4aQe7EPBus
>> 
>> 
>> === Background ===
>> 
>> An Omid prototype was first released as an open-source project b

Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Flavio Junqueira
Those are great observations, Nick. Merging projects is often hard even if the 
developers are willing to do it, though. The projects could be already running 
in production and major changes could be disruptive to the point of making the 
merged project not viable.

Attempting to merge projects through a conversation between the two parties 
sounds reasonable, but I suspect that many times the two parties will prefer to 
at least start independently. Perhaps the incubator can do the job of bridging 
the conversation and making sure that the differences are sorted out before 
graduation and assess if a merge is possible during the process.

-Flavio 


> On 19 Mar 2016, at 11:23, Nick Burch  wrote:
> 
> On Fri, 18 Mar 2016, Greg Trasuk wrote:
>> I don’t think it’s the Incubator’s job to choose which competing projects 
>> should join the foundation.  All we’re here to do is to make sure that a 
>> community knows how to act like an Apache community, and that the artifacts 
>> are licensed properly.
> 
> This is only my view, and I know that some key incubator folks think it's too 
> prescriptive, but I have seen it work
> 
> TL;DR - Alternate ideas and approaches Good, Confusion or Corporatism Bad
> 
> Where we have two different communities, working in the same space, but in 
> different languages or different approaches, then that's fine. The ASF 
> doesn't pick "winners", it picks "runners". So, having a Batch implementation 
> of the Foo protocol in C, and having a proposed podling for a Streaming 
> implementation of the Foo protocol in Java is fine.
> 
> Where we have two different companies doing rival implementations who refuse 
> to co-operate, that's an issue. Two companies who are competitors, who both 
> read the "Foo protocol" spec / Foo paper, and who found rival projects to 
> implement Foo in Java, is a problem. They don't have a technical distinction, 
> just a refusal to co-operate and a refusal to take off $DAYJOB hats and a 
> refusal to work for the best interests of the community. That's an issue for 
> the incubator and the ASF
> 
> If we have a similar proposed project coming in, I would expect the proposed 
> project to have a chat with the existing one to see if a merger is possible. 
> If they're in the same language, and take similar approaches, then a merger 
> could deliver a better community with more features, which would be better 
> for everyone.
> 
> However, if the two communities had a chat, and decided they really were 
> different + could explain that, then in my book that's fine. Document and 
> explain those, so potential new community members can pick the "right" one 
> for them. Maybe collaborate on some common code / tests / etc, don't 
> bad-mouth each other, and help new people pick the appropriate one for them, 
> then that's fine.
> 
> AcmeCorp and Contoso both want to bring a Java project for doing Foo, and 
> won't co-operate because they're competitors = red flag
> 
> AcmeCorp found a Ruby project for Foo, grow it, bring it to the ASF, then a 
> formerly Contoso backed Java project for Foo comes, fine.
> 
> AcmeCorp did a "Foo for 1-3 machines that's easy to get started with" and 
> want to bring that, while Contoso have been working on a Foo that's a bit 
> tough to setup for small clusters, but scales brilliantly past 3 racks, 
> that's fine. They can share some Foo compliance tests, and new community 
> members can consider their deployment sizes and pick the "right" one to join 
> for them
> 
> 
> Only my view, though at least some others share it, I hope that helps at 
> least a little?
> 
> Nick
> 





Re: Advice on binary NOTICE

2016-03-19 Thread Marvin Humphrey
On Fri, Mar 18, 2016 at 7:57 AM, Stephen Mallette  wrote:
> The Jackson JSON processing lib which is Apache 2.0 licensed carries this
> NOTICE file:

>8 snip 8<

> Does anyone have any advice on what portion of this is relevant for
> inclusion in a binary NOTICE file? Should it all be included perhaps?

Ultimately, this is up to the judgment of the supplier of the
convenience binary, and IANAL.  That said...

It seems to me that a good, simple rule can be applied to any
ALv2-licensed dependency which originates from outside the ASF and
provides a NOTICE file -- regardless of length: Propagate the entire
content of the dependency NOTICE file into the top-level NOTICE file.

I think this more conservative approach is justified because it is
hard to know whether the dependency's copyright holders would agree
with downstream analysis as to which parts of the NOTICE may be
omitted and which may not.
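
As a rough illustration of that rule, here is a minimal sketch in Python. It
is not ASF tooling or policy: the function name merge_notices and the
"--- NOTICE from ... ---" separator line are made up for this example, and
the input is assumed to be the NOTICE text of each dependency keyed by a
dependency name of your choosing.

```python
from typing import Mapping


def merge_notices(top_level: str, dependency_notices: Mapping[str, str]) -> str:
    """Append each dependency's complete NOTICE text to the top-level NOTICE.

    Nothing is filtered or summarized: the whole dependency NOTICE is carried
    over, which is the conservative approach described above.
    """
    parts = [top_level.rstrip()]
    # Sort by name so the generated file is stable from build to build.
    for name in sorted(dependency_notices):
        text = dependency_notices[name].strip()
        if not text:
            continue  # this dependency has nothing to propagate
        # The separator format is an assumption of this sketch.
        parts.append(f"--- NOTICE from {name} ---\n{text}")
    return "\n\n".join(parts) + "\n"
```

The output would still need a human read-through before release, since the
judgment call on what a binary NOTICE must carry stays with the supplier.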

> Perhaps more generally, given
>
> http://www.apache.org/dev/licensing-howto.html#alv2-dep
>
> where it says,
>
> "If the dependency supplies a NOTICE file, its contents must be analyzed
> and the relevant portions bubbled up into the top-level NOTICE file."
>
> is there any more detailed information on how "relevant portions" get
> determined?

Well, the definitive resources for determining licensing obligations
are the literal license texts. That's more detail than you're asking
for, of course.

For binaries, we don't provide significant guidance at this time. It's
challenging to get our official source releases compliant with our own
policies consistently, let alone convenience binaries. There's a lot
of work to be done.

Marvin Humphrey




Re: Allowed Champions on podlings

2016-03-19 Thread Tim Williams
On Fri, Mar 18, 2016 at 12:33 AM, John D. Ament  wrote:
> All,
>
> It was recently pointed out that some of our docs are a bit inconsistent
> around who can champion a candidate podling.
>
> http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion
> http://incubator.apache.org/incubation/Incubation_Policy.html#Champion
>
> In the former, officers and members are allowed to, but in the latter only
> members.  I'd like to propose to make these two consistent, and allow both
> officers and members to be champions.  I'll make this change in about 72
> hours via lazy consensus unless someone comes up with a $reason why it
> should be members only.

Without judging the goodness of it... I'd just point out that
currently in the non-Member Officer case, they must be a member of the
Sponsoring PMC.  I thought that was true of Members as well, btw, but
thought I'd point out that it's not simply Members and Officers.

--tim




Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Gangumalla, Uma
Proposal looks very good and clear. This project should be a good addition.

+1 (non-binding)

Regards,
Uma

On 3/17/16, 1:17 PM, "Daniel Dai"  wrote:

>Hi,
>
>I would like to propose Omid as an Apache Incubator project:
>
>https://wiki.apache.org/incubator/OmidProposal
>
>I've posted the text of the proposal below:
>
>Thanks,
>Daniel
>
>= Omid Proposal =
>
>=== Abstract ===
>
>Omid is a flexible, reliable, high performant and scalable ACID
>transactional framework that allows client applications to execute
>transactions on top of MVCC key/value-based NoSQL datastores
>(currently Apache HBase) providing Snapshot Isolation guarantees on
>the accessed data.
>
>
>=== Proposal ===
>
>Omid is a flexible open-source transactional framework that provides
>ACID transactions with Snapshot Isolation guarantees on top of NoSQL
>datastores. In particular, the current codebase brings the concept of
>transactions to the popular Apache HBase datastore. Omid offers great
>performance, it is highly available, and scalable. Omid's current
>version is able to scale to thousands of clients triggering concurrent
>transactions on application data stored in HBase. Omid can scale
>beyond 100K transactions per second on mid-range hardware while
>incurring in a minimal impact on the speed of data access in the
>datastore. We're currently experimenting with a prototype version that
>can improve the performance up to ~380K TPS.
>
>
>Omid has been publicly available as an open-source project in Github
>under Apache License Version 2.0 since 2011 [1]. During these years,
>it has generated certain interest in the open source community,
>especially since the public presentation of the first version in
>Hadoop Summit 2013 [2]. Currently the Github project has 241 Stars and
>93 forks. Yahoo Inc. submits this proposal to the Apache Software
>Foundation with the aim to transfer the Omid project -including its
>source code and documentation- to Apache in order to start the build
>of a stable open source community around it.
>
>
>[1] https://github.com/yahoo/omid
>
>[2] Omid presentation at Hadoop Summit 2013:
>https://www.youtube.com/watch?v=Rhdmo9pVGgU&index=68&list=PLSAiKuajRe2luyqLU464Nxz4aQe7EPBus
>
>
>=== Background ===
>
>An Omid prototype was first released as an open-source project back in
>2011. Inspired by Google Percolator [1], it offered a lock-free
>approach to transactions in NoSQL datastores (See [2]). However,
>during these years, the design of Omid has evolved significantly.
>Whilst the current open-sourced version maintains many aspects of the
>original implementation, it is the result of a major redesign of the
>first prototype released in 2011.
>
>
>Omid has now a more decentralized design that does not sacrifice the
>consistency and performance of the original version. The current
>design also enables Omid to scale to thousands of clients executing
>transactions concurrently on application data stored in HBase.
>Internally, Omid still utilizes a lock-free approach to support
>multiple concurrent clients. Its design also relies on a centralized
>conflict detection component, the TSO, which now resolves in an
>efficient manner writeset collisions among concurrent transactions
>without having to piggyback commit information to the clients. Another
>important benefit of Omid is that it doesn't require any modification
>of the underlying key-value datastore, HBase in this case. Moreover,
>the recently added high availability algorithm allows to eliminate the
>single point of failure represented by the TSO in those system
>deployments requiring a higher degree of dependability. Last but not
>least, the provided user API is very simple, mimicking transaction
>managers in the relational world: begin, commit, rollback.
>
>
>Omid is used internally at Yahoo. Sieve, Yahoo's web-scale content
>management platform powering some of next-generation search and
>personalization products is using Omid as a transaction manager in its
>processing pipeline. Sieve essentially acts as a huge processing hub
>between content feeds and serving systems. It provides an environment
>for highly customizable, real-time, streamed information processing,
>with typical discovery-to-service latencies of just a few seconds. In
>terms of scale and availability, Omid's new design was largely driven
>by Sieve's requirements.
>
>
>At Yahoo, we are also making an effort to disseminate the current
>status of the project through blog entries (See [3], [4] and [5]) and
>submissions to technical and academic conferences such as ATC 2016,
>Hadoop Summit 2016, HBaseConf 2016. Last but not least, Omid also
>appeared in a TechCrunch article in the last quarter of 2015 (See [6])
>
>
>[1] D. Peng and F. Dabek, Large-scale Incremental Processing Using
>Distributed Transactions and Notifications. USENIX Symposium on
>Operating Systems Design and Implementation, 2010
>
>[2] D. Gomez-Ferro, F. Junqueira, I. Kelly, B. Reed, and M. Yabandeh.
>Omid: Lock-free transactio

Re: [VOTE] Accept Tephra into the Apache Incubator

2016-03-19 Thread Phillip Rhodes
+1


This message optimized for indexing by NSA PRISM

On Fri, Mar 18, 2016 at 2:53 PM, Stack  wrote:

> I'm late, but let me add my +1 anyways.
> St.Ack
>
> On Thu, Mar 3, 2016 at 5:29 PM, Poorna Chandra  wrote:
>
> > Hi All,
> >
> > Tephra proposal was sent out for discussion last week. The proposal is
> > available at https://wiki.apache.org/incubator/TephraProposal
> >
> > Please vote to accept Tephra into the Apache Incubator. The vote will be
> > open for the next 72 hours.
> >
> > [ ] +1 Accept Tephra as an Apache Incubator podling.
> > [ ] +0 Abstain.
> > [ ] -1 Don’t accept Tephra as an Apache Incubator podling because ...
> >
> > Thanks,
> > Poorna.
> >
> > --
> >
> > = Abstract =
> >
> > Tephra is a system for providing globally consistent transactions on
> > top of Apache HBase and other storage engines.
> >
> > = Proposal =
> >
> > Tephra is a transaction engine for distributed data stores like Apache
> > HBase. It provides ACID semantics for concurrent data operations that
> > span over region boundaries in HBase using Optimistic Concurrency
> > Control.
> >
> > = Background =
> >
> > HBase provides strong consistency with row- or region-level ACID
> > operations. However, it sacrifices cross-region and cross-table
> > consistency in favor of scalability. This trade-off requires application
> > developers to handle  the complexity of ensuring consistency when their
> > modifications span region boundaries. By providing support for global
> > transactions that span regions, tables, or multiple RPCs,
> > Tephra simplifies application development on top of HBase, without a
> > significant impact on performance or scalability for many workloads.
> >
> > Tephra leverages HBase’s native data versioning to provide
> > multi-versioned concurrency control (MVCC) for transactional reads and
> > writes. With MVCC capability, each transaction sees its own consistent
> > “snapshot” of data, providing snapshot isolation of concurrent
> > transactions. MVCC along with conflict detection and handling enables
> > Optimistic Concurrency Control.
> >
> > Tephra consists of three main components:
> >  * Transaction Server – maintains global view of transaction state,
> >    assigns new transaction IDs and performs conflict detection;
> >  * Transaction Client – coordinates start, commit, and rollback of
> >    transactions; and
> >  * Transaction Processor Coprocessor – applies filtering to the data
> >    read (based on a given transaction’s state) and cleans up any data
> >    from old (no longer visible) transactions.
> >
> > Although Tephra only supports HBase now, it can be extended to support
> > transactions on any store that has multi-versioning and rollback
> > support. The transactions can span over multiple stores and storage
> > paradigms.
> >
> > = Rationale =
> >
> > Tephra has simple abstractions which can be used by an application to
> > add transaction support over HBase. By abstracting away transaction
> > handling using Tephra, the application is freed of transaction logic,
> > and the application developer can focus on the use case. Also, Tephra
> > can be extended to support transactions on data sources other than
> > HBase.
> >
> > By making Tephra an Apache open source project, we believe that there
> > will be wider adoption and more opportunities for Tephra to be
> > integrated into other Apache projects.
> >
> > = Current Status =
> >
> > Tephra was built at Cask Data Inc. initially as part of
> > open-source framework Cask Data Application Platform (CDAP)
> > [[http://cdap.io/]].
> > It was later converted into an independent open source project with
> > Apache 2.0 License [[https://github.com/caskdata/tephra]].
> >
> > Tephra is used in CDAP as the transaction engine. As part of CDAP, Tephra
> > has been deployed at multiple companies.
> >
> > Apache Phoenix is using Tephra as transaction engine in the next release.
> >
> > == Meritocracy ==
> >
> > Our intent with this incubator proposal is to start building a diverse
> > developer community around Tephra following the Apache meritocracy model.
> > Since Tephra was initially developed in early 2013, we have had fast
> > adoption and contributions within Cask Data. We are looking forward to
> > new contributors. We wish to build a community based on Apache's
> > meritocracy principles, working with those who contribute significantly
> > to the project and welcoming them to be committers both during the
> > incubation process and beyond.
> >
> > == Community ==
> >
> > Core developers of Tephra are at Cask Data. Recently the developer
> > community has expanded to include folks from Apache Phoenix. We hope to
> > extend our contributor base significantly and we will invite all who are
> > interested in working on a distributed transaction engine.
> >
> > == Core Developers ==
> >
> > A few engineers from Cask Data and outside have developed Tephra:
> > Andreas Neumann, Terence Yim, Gary Helmling, Andrew

Re: [PROPOSAL] : Airflow

2016-03-19 Thread Jake Farrell
Hey Chris
As an officer you can be a champion if the project you chair (Samza) is
sponsoring the incoming podling with the intent of merging the two
communities: "Where the Champion is not a Member of the Foundation (i.e.
is an Officer only), the Champion shall be a member of the PMC of the
Sponsor." I did not think this was the case based on the proposal.

Makes sense about the committers and developing the community during
incubation. I was just curious because there were some people with 20+
commits not in the initial committers list. I was not looking for a change,
just checking that there had been some thought behind this exclusion and
about how the project intended to start developing that community.

-Jake

On Thu, Mar 17, 2016 at 11:58 AM, Chris Riccomini 
wrote:

> > The proposal has the incubator listed as the sponsor and a quick check
> > shows Chris Riccomini is not on the IPMC or a member currently, details
> > on roles are available at [1].
>
> I was under the impression that I could sponsor because I am an officer:
>
> "A candidate project shall be sponsored by an Officer or Member of the
> Foundation"
>
> The officer page lists me as a VP of Apache Samza. Perhaps I'm misreading?
>
> > Mailing lists are in the old format
>
> Fixed.
>
> > Github shows 109 contributors, but the proposal lists only 6 initial
> > committers, were any of the other existing contributors considered for
> > the proposal?
>
> In terms of the high ratio of contributors to committers (109:6), the
> reason is as follows. Most contributions are to the fringes of the system
> and are in the form of 3rd party contributed hooks/operators (for 3rd party
> integrations). For example, if you want to use Airflow to speak to
> Cassandra, you may want to contribute a Cassandra Hook and CRUD operators.
> Airflow allows for this extensibility and allows for our users to adapt
> Airflow to their needs. However, the committer list is limited to folks who
> will touch the internals of the system (e.g. scheduler, metadata
> management, etc...). Over time, we would like other contributors to get
> familiar with the internals to the point where they can also become
> maintainers, but they don't currently have the expertise.
>
>
> On Thu, Mar 17, 2016 at 8:16 AM, Scott Deboy 
> wrote:
>
> > Ll
> > On Mar 17, 2016 6:51 AM, "Jake Farrell"  wrote:
> >
> > > > Hi Siddharth
> > > > Thanks for drafting a proposal and looking to bring Airflow to the
> > > > Apache Incubator. Overall the proposal looks good, just a couple
> > > > comments. The proposal has the incubator listed as the sponsor and a
> > > > quick check shows Chris Riccomini is not on the IPMC or a member
> > > > currently, details on roles are available at [1]. Not a blocker as
> > > > you have members listed in your group of mentors and can ask that
> > > > they step up as the champion.
> > > >
> > > > Mailing lists are in the old format, they should be
> > > > - priv...@airflow.incubator.apache.org (moderated)
> > > > - d...@airflow.incubator.apache.org
> > > > - comm...@airflow.incubator.apache.org
> > > >
> > > > Github shows 109 contributors, but the proposal lists only 6 initial
> > > > committers, were any of the other existing contributors considered
> > > > for the proposal?
> > > >
> > > > When you submit the proposal please make sure to include the entire
> > > > proposal in the email. If you have any questions please let us know
> > > >
> > > > -Jake
> > > >
> > > >
> > > > [1]:
> > > > http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion
> > > >
> > > > On Wed, Mar 16, 2016 at 8:28 PM, Siddharth Anand wrote:
> > >
> > > > https://wiki.apache.org/incubator/AirflowProposal
> > > >
> > > > Thoughts and comments are welcome!
> > > > -s (Sid)
> > > >
> > >
> >
>


RE: Allowed Champions on podlings

2016-03-19 Thread Jean-Baptiste Onofré


+1
Regards
JB



-------- Original message --------
From: "John D. Ament"  
Date: 18/03/2016  01:33  (GMT+01:00) 
To: general@incubator.apache.org 
Subject: Allowed Champions on podlings 

All,

It was recently pointed out that some of our docs are a bit inconsistent
around who can champion a candidate podling.

http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion
http://incubator.apache.org/incubation/Incubation_Policy.html#Champion

In the former, officers and members are allowed to, but in the latter only
members.  I'd like to propose to make these two consistent, and allow both
officers and members to be champions.  I'll make this change in about 72
hours via lazy consensus unless someone comes up with a $reason why it
should be members only.

John


Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Henry Saputra
Thanks for the great explanation, Flavio.

As many have mentioned before, it is definitely OK to have similar projects
in the ASF. There is precedent for it, and I didn't expect the incubator to
reject good projects coming in.

My intention was to avoid a split of resources where both projects have
very similar goals and approaches. But maybe the two projects have subtle
differences that make each worth pursuing as an independent effort.

Just playing devil's advocate a bit to see if there is potential to
collaborate.

- Henry

On Saturday, March 19, 2016, Flavio Junqueira  wrote:

> I understand the concern, so let me try to offer some facts and see if we
> can make progress from there.
>
> Omid has been around for some time now, and its initial design appeared in
> a couple of research papers that I actually co-authored. The architecture
> is based on the idea of having a centralized transaction status oracle that
> shares transaction status data with clients for scalability. The current
> Omid project evolved out of that initial work and it is a much improved
> version over that first iteration, with the improvements focusing on
> scalability. It currently runs in production at scale at Yahoo! and there
> is interest from other companies according to the proposal. There is a
> series of blog posts about the experience in the project proposal.
>
> Tephra has a very similar architecture. The description here says that it
> has a transaction server, which sounds like the TSO in the original Omid
> papers. I haven't spent enough time understanding the precise protocol they
> use, but I must say that the protocol is very important for correctness and
> scalability. Having two protocols with different properties could justify
> the presence of two projects, but they both promise snapshot isolation so I
> suspect they will be doing very similar things.
>
> Overall, as I see it, it would be very unfair to reject the Omid proposal
> on the basis that Tephra was incubated a couple of weeks ago. I'd much
> rather see how the two communities evolve and have the mentors of the
> projects fostering collaboration and possibly a merge of the two projects
> before graduation. Why not think of a general transaction status oracle
> with different protocol implementations assuming it makes sense? I wouldn't
> like to see any of the two blocked upfront on the basis that they are in
> the same space, though. We could postpone this decision until graduation
> when we'll have more knowledge about the projects and the growth of the two
> communities.
>
> -Flavio
>
> > On 18 Mar 2016, at 23:19, Henry Saputra wrote:
> >
> > I know the Apache incubator does not play favorites, but it is getting
> > awkward that TWO transaction engines for HBase are coming to the
> > incubator at the same time.
> >
> > As most people know, the other one is Tephra, which came to the
> > incubator just a few weeks ago.
> >
> > As a member of the IPMC, I would like to see Omid provide a more detailed
> > comparison of the differences that the project brings, in terms of
> > approach and possible integrations with other ASF projects.
> >
> > If possible, I would prefer to see the Omid team work together with
> > Tephra to make one solid transaction engine for HBase and later NoSQL
> > databases.
> >
> >
> > - Henry
> >
> > On Thu, Mar 17, 2016 at 1:17 PM, Daniel Dai wrote:
> >
> >> Hi,
> >>
> >> I would like to propose Omid as an Apache Incubator project:
> >>
> >> https://wiki.apache.org/incubator/OmidProposal
> >>
> >> I've posted the text of the proposal below:
> >>
> >> Thanks,
> >> Daniel
> >>
> >> = Omid Proposal =
> >>
> >> === Abstract ===
> >>
> >> Omid is a flexible, reliable, high performant and scalable ACID
> >> transactional framework that allows client applications to execute
> >> transactions on top of MVCC key/value-based NoSQL datastores
> >> (currently Apache HBase) providing Snapshot Isolation guarantees on
> >> the accessed data.
> >>
> >>
> >> === Proposal ===
> >>
> >> Omid is a flexible open-source transactional framework that provides
> >> ACID transactions with Snapshot Isolation guarantees on top of NoSQL
> >> datastores. In particular, the current codebase brings the concept of
> >> transactions to the popular Apache HBase datastore. Omid offers great
> >> performance, it is highly available, and scalable. Omid's current
> >> version is able to scale to thousands of clients triggering concurrent
> >> transactions on application data stored in HBase. Omid can scale
> >> beyond 100K transactions per second on mid-range hardware while
> >> incurring in a minimal impact on the speed of data access in the
> >> datastore. We’re currently experimenting with a prototype version that
> >> can improve the performance up to ~380K TPS.
> >>
> >>
> >> Omid has been publicly available as an open-source project in Github
> >> under Apache License Version 2.0 since 2011 [1]. During these years,
> >> it has generated certain interest in the open source c

Re: Please add me to the incubator's contributors group

2016-03-19 Thread Siddharth Anand
Yes. Thanks.

I had tried to follow the recommended convention of "FirstnameLastname",
but it was taken. So, I used "Siddharth Anand" (with a space) and it
worked.




On Wed, Mar 16, 2016 at 4:57 PM, Nick Burch  wrote:

> On Wed, 16 Mar 2016, Siddharth Anand wrote:
>
>> Siddharth Anand
>>
>
> If you can tell us what username you signed up to the incubator wiki with,
> then one of us can do so for you
>
> Nick
>
>
>


Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Andrew Purtell
Apache Phoenix just released version 4.7.0 with big news: transaction support,
using Tephra. So there is already some interest in a successful Tephra
incubation beyond the podling itself. That said, the new code in Phoenix can be
made pluggable to support more than one transaction oracle, and Omid might be
able to provide a workable integration to stand in for Tephra. Collaboration
between, or even a joining of, the two communities could be good, but even if
that doesn't happen, as a potential downstream consumer it's good to have
options (provided the number of alternatives is bounded within reason, of
course). I think it would be good to see Omid get in. I think an Omid podling
would find interested collaborators in the Phoenix and HBase communities right
away.
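
To make "pluggable" a little more concrete, here is a minimal sketch in
Python of what such a plug-in point could look like. Everything in it is
hypothetical: TransactionProvider, InMemoryProvider, register_provider and
get_provider are names invented for this example and are not the Phoenix,
Tephra, or Omid API. The only idea taken from the thread is a small
begin/commit/rollback surface that more than one backend could implement.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict


class TransactionProvider(ABC):
    """Hypothetical plug-in point (not the real Phoenix, Tephra, or Omid API)."""

    @abstractmethod
    def begin(self) -> int:
        """Start a transaction and return its id."""

    @abstractmethod
    def commit(self, tx_id: int) -> None:
        """Commit the given transaction."""

    @abstractmethod
    def rollback(self, tx_id: int) -> None:
        """Abort the given transaction."""


class InMemoryProvider(TransactionProvider):
    """Stand-in backend, used only to show that backends are interchangeable."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._next_id = 0
        self.open_txs = set()

    def begin(self) -> int:
        self._next_id += 1
        self.open_txs.add(self._next_id)
        return self._next_id

    def commit(self, tx_id: int) -> None:
        self.open_txs.remove(tx_id)

    def rollback(self, tx_id: int) -> None:
        self.open_txs.discard(tx_id)


# Registry keyed by a configuration string, so callers never name a backend class.
_PROVIDERS: Dict[str, Callable[[], TransactionProvider]] = {}


def register_provider(name: str, factory: Callable[[], TransactionProvider]) -> None:
    _PROVIDERS[name] = factory


def get_provider(name: str) -> TransactionProvider:
    return _PROVIDERS[name]()


# Two interchangeable (fake) backends selected by name.
register_provider("tephra-like", lambda: InMemoryProvider("tephra-like"))
register_provider("omid-like", lambda: InMemoryProvider("omid-like"))
```

A caller would pick the backend from configuration, for example
get_provider("omid-like"), and use only begin, commit and rollback from
there on.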


> On Mar 19, 2016, at 12:20 PM, Henry Saputra  wrote:
> 
> Thanks for the great explanation, Flavio.
> 
> As many have mentioned before, it is definitely ok to have similar projects
> in ASF. We have prior acts before and I didn't expect incubator to reject
> good projects coming in.
> 
> My intention was to avoid split of resources where both projects have
> very similar goal and approach. But maybe both projects have different
> subtle differences that worthy to be done as independent effort.
> 
> Just being devil advocate a bit to see if potential to collaborate.
> 
> - Henry
> 
>> On Saturday, March 19, 2016, Flavio Junqueira  wrote:
>> 
>> I understand the concern, so let me try to offer some facts and see if we
>> can make progress from there.
>> 
>> Omid has been around for some time now, and its initial design appeared in
>> a couple of research papers that I actually co-authored. The architecture
>> is based on the idea of having a centralized transaction status oracle that
>> shares transaction status data with clients for scalability. The current
>> Omid project evolved out of that initial work and it is a much improved
>> version over that first iteration, with the improvements focusing on
>> scalability. It currently runs in production at scale at Yahoo! and there
>> is interest from other companies according to the proposal. There is a
>> series of blog posts about the experience in the project proposal.
>> 
>> Tephra has a very similar architecture. The description here says that it
>> has a transaction server, which sounds like the TSO in the original Omid
>> papers. I haven't spent enough time understanding the precise protocol they
>> use, but I must say that the protocol is very important for correctness and
>> scalability. Having two protocols with different properties could justify
>> the presence of two projects, but they both promise snapshot isolation so I
>> suspect they will be doing very similar things.
>> 
>> Overall, as I see it, it would be very unfair to reject the Omid proposal
>> on the basis that Tephra was incubated a couple of weeks ago. I'd much
>> rather see how the two communities evolve and have the mentors of the
>> projects fostering collaboration and possibly a merge of the two projects
>> before graduation. Why not think of a general transaction status oracle
>> with different protocol implementations assuming it makes sense? I wouldn't
>> like to see any of the two blocked upfront on the basis that they are in
>> the same space, though. We could postpone this decision until graduation
>> when we'll have more knowledge about the projects and the growth of the two
>> communities.
>> 
>> -Flavio
>> 
>>>> On 18 Mar 2016, at 23:19, Henry Saputra wrote:
>>> 
>>> I know the Apache incubator does not play favorites, but it is getting
>>> awkward that TWO transaction engines for HBase are coming to the
>>> incubator at the same time.
>>>
>>> As most people know, the other one is Tephra, which came to the
>>> incubator just a few weeks ago.
>>>
>>> As a member of the IPMC, I would like to see Omid provide a more detailed
>>> comparison of the differences that the project brings, in terms of
>>> approach and possible integrations with other ASF projects.
>>>
>>> If possible, I would prefer to see the Omid team work together with
>>> Tephra to make one solid transaction engine for HBase and later NoSQL
>>> databases.
>>> 
>>> 
>>> - Henry
>>> 
>>>> On Thu, Mar 17, 2016 at 1:17 PM, Daniel Dai wrote:
>>>
>>>> Hi,
>>>>
>>>> I would like to propose Omid as an Apache Incubator project:
>>>>
>>>> https://wiki.apache.org/incubator/OmidProposal
>>>>
>>>> I've posted the text of the proposal below:
>>>>
>>>> Thanks,
>>>> Daniel
>>>>
>>>> = Omid Proposal =
>>>>
>>>> === Abstract ===
>>>>
>>>> Omid is a flexible, reliable, high performant and scalable ACID
>>>> transactional framework that allows client applications to execute
>>>> transactions on top of MVCC key/value-based NoSQL datastores
>>>> (currently Apache HBase) providing Snapshot Isolation guarantees on
>>>> the accessed data.
>>>>
>>>>
>>>> === Proposal ===
>>>>
>>>> Omid is a flexible open-source transactional framework that provides
>>>> ACID transac

Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Jean-Baptiste Onofré

+1 (binding)

Regards
JB

On 03/17/2016 09:17 PM, Daniel Dai wrote:

Hi,

I would like to propose Omid as an Apache Incubator project:

https://wiki.apache.org/incubator/OmidProposal

I've posted the text of the proposal below:

Thanks,
Daniel

= Omid Proposal =

=== Abstract ===

Omid is a flexible, reliable, high performant and scalable ACID
transactional framework that allows client applications to execute
transactions on top of MVCC key/value-based NoSQL datastores
(currently Apache HBase) providing Snapshot Isolation guarantees on
the accessed data.


=== Proposal ===

Omid is a flexible open-source transactional framework that provides
ACID transactions with Snapshot Isolation guarantees on top of NoSQL
datastores. In particular, the current codebase brings the concept of
transactions to the popular Apache HBase datastore. Omid offers great
performance, it is highly available, and scalable. Omid's current
version is able to scale to thousands of clients triggering concurrent
transactions on application data stored in HBase. Omid can scale
beyond 100K transactions per second on mid-range hardware while
incurring in a minimal impact on the speed of data access in the
datastore. We’re currently experimenting with a prototype version that
can improve the performance up to ~380K TPS.


Omid has been publicly available as an open-source project in Github
under Apache License Version 2.0 since 2011 [1]. During these years,
it has generated certain interest in the open source community,
especially since the public presentation of the first version in
Hadoop Summit 2013 [2]. Currently the Github project has 241 Stars and
93 forks. Yahoo Inc. submits this proposal to the Apache Software
Foundation with the aim to transfer the Omid project -including its
source code and documentation- to Apache in order to start the build
of a stable open source community around it.


[1] https://github.com/yahoo/omid

[2] Omid presentation at Hadoop Summit 2013:
https://www.youtube.com/watch?v=Rhdmo9pVGgU&index=68&list=PLSAiKuajRe2luyqLU464Nxz4aQe7EPBus


=== Background ===

An Omid prototype was first released as an open-source project back in
2011. Inspired by Google Percolator [1], it offered a lock-free
approach to transactions in NoSQL datastores (See [2]). However,
during these years, the design of Omid has evolved significantly.
Whilst the current open-sourced version maintains many aspects of the
original implementation, it is the result of a major redesign of the
first prototype released in 2011.


Omid now has a more decentralized design that does not sacrifice the
consistency and performance of the original version. The current
design also enables Omid to scale to thousands of clients executing
transactions concurrently on application data stored in HBase.
Internally, Omid still uses a lock-free approach to support
multiple concurrent clients. Its design also relies on a centralized
conflict detection component, the TSO, which now efficiently resolves
writeset collisions among concurrent transactions without having to
piggyback commit information to the clients. Another
important benefit of Omid is that it doesn't require any modification
of the underlying key-value datastore, HBase in this case. Moreover,
the recently added high availability algorithm eliminates the
single point of failure represented by the TSO in those system
deployments requiring a higher degree of dependability. Last but not
least, the provided user API is very simple, mimicking transaction
managers in the relational world: begin, commit, rollback.
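
To make the idea of centralized writeset conflict detection concrete,
here is a minimal in-memory sketch in Python. It is not Omid's code or
API: the names ToyTSO, ToyTransaction, begin, try_commit and put are
hypothetical, and the rule shown (abort if any key in the writeset was
committed after the transaction's start timestamp) is the generic
Snapshot Isolation write-write check, assumed here for illustration.

```python
# Toy, single-process sketch; every name below is hypothetical, not Omid's API.
import itertools


class ToyTSO:
    """Central timestamp oracle that also detects write-write conflicts."""

    def __init__(self):
        self._clock = itertools.count(1)  # monotonically increasing timestamps
        self._last_commit = {}            # key -> commit timestamp of last writer

    def begin(self):
        """Hand out a start timestamp; it defines the transaction's snapshot."""
        return next(self._clock)

    def try_commit(self, start_ts, write_set):
        """Return a commit timestamp, or None if the transaction must abort."""
        # Conflict: some key in the writeset was committed after our snapshot.
        if any(self._last_commit.get(key, 0) > start_ts for key in write_set):
            return None
        commit_ts = next(self._clock)
        for key in write_set:
            self._last_commit[key] = commit_ts
        return commit_ts


class ToyTransaction:
    """Client-side handle exposing the begin/commit/rollback shape."""

    def __init__(self, tso):
        self._tso = tso
        self.start_ts = tso.begin()  # "begin" happens on construction
        self.writes = {}             # buffered writes: key -> value

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Only the set of written keys is sent to the oracle.
        return self._tso.try_commit(self.start_ts, set(self.writes))

    def rollback(self):
        self.writes.clear()


tso = ToyTSO()
t1, t2 = ToyTransaction(tso), ToyTransaction(tso)
t1.put("row1", "a")
t2.put("row1", "b")
first = t1.commit()   # succeeds: nobody committed row1 after t1 started
second = t2.commit()  # aborts: row1 was committed after t2's snapshot
```

In this sketch no locks are taken on the data; the first committer wins
and the later overlapping writer is told to abort. A real deployment
would also have to persist data and commit information in the
datastore, which the sketch leaves out.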


Omid is used internally at Yahoo. Sieve, Yahoo’s web-scale content
management platform, which powers some next-generation search and
personalization products, uses Omid as a transaction manager in its
processing pipeline. Sieve essentially acts as a huge processing hub
between content feeds and serving systems. It provides an environment
for highly customizable, real-time, streamed information processing,
with typical discovery-to-service latencies of just a few seconds. In
terms of scale and availability, Omid’s new design was largely driven
by Sieve’s requirements.


At Yahoo, we are also making an effort to disseminate the current
status of the project through blog entries (See [3], [4] and [5]) and
submissions to technical and academic conferences such as ATC 2016,
Hadoop Summit 2016 and HBaseConf 2016. Last but not least, Omid also
appeared in a TechCrunch article in the last quarter of 2015 (See [6]).


[1] D. Peng and F. Dabek, Large-scale Incremental Processing Using
Distributed Transactions and Notifications. USENIX Symposium on
Operating Systems Design and Implementation, 2010

[2] D. Gomez-Ferro, F. Junqueira, I. Kelly, B. Reed, and M. Yabandeh.
Omid: Lock-free transactional support for distributed data stores. In
Proc. of ICDE, 2013.

[3] 
http://yahoohadoop.tumblr.com/post/129089878751/introducing-omid-transaction-processing-for

[4] 
http://yahoohadoop.tum

Please add me to the incubator's contributors group

2016-03-19 Thread Siddharth Anand
Siddharth Anand

My wiki username is above. I would like to add my incubator proposal as I
have a champion and 3 mentors and have received the go-ahead from them.

-s


Re: [PROPOSAL] : Airflow

2016-03-19 Thread Chris Riccomini
@Jake, may I join the IPMC? :)

On Thu, Mar 17, 2016 at 9:06 AM, Jake Farrell  wrote:

> Hey Chris
> As an officer you can be a champion if the project you chair (Samza) is
> sponsoring the incoming podling with the intent of merging the two
> communities - "Where the Champion is not a Member of the Foundation (i.e.
> is an Officer only), the Champion shall be a member of the PMC of the
> Sponsor.", did not think this was the case based on the proposal.
>
> Makes sense about the committers and developing the community during
> incubation, was just curious because there were some people with 20+
> commits not in the initial committers list. Was not looking for a change,
> more that there had been some thought behind this exclusion and how the
> project intended to start developing that community
>
> -Jake
>
> On Thu, Mar 17, 2016 at 11:58 AM, Chris Riccomini 
> wrote:
>
> > > The proposal has the incubator listed as the sponsor and a quick check
> > > shows Chris Riccomini is not on the IPMC or a member currently, details
> > > on roles are available at [1].
> >
> > I was under the impression that I could sponsor because I am an officer:
> >
> > "A candidate project shall be sponsored by an Officer or Member of the
> > Foundation"
> >
> > The officer page lists me as a VP of Apache Samza. Perhaps I'm
> > misreading?
> >
> > > Mailing lists are in the old format
> >
> > Fixed.
> >
> > > Github shows 109 contributors, but the proposal lists only 6 initial
> > > committers, were any of the other existing contributors considered for
> > > the proposal?
> >
> > In terms of the high ratio of contributors to committers (109:6), the
> > reason is as follows. Most contributions are to the fringes of the system
> > and are in the form of 3rd party contributed hooks/operators (for 3rd
> > party integrations). For example, if you want to use Airflow to speak to
> > Cassandra, you may want to contribute a Cassandra Hook and CRUD
> > operators. Airflow allows for this extensibility and allows for our users
> > to adapt Airflow to their needs. However, the committer list is limited
> > to folks who will touch the internals of the system (e.g. scheduler,
> > metadata management, etc...). Over time, we would like other contributors
> > to get familiar with the internals to the point where they can also
> > become maintainers, but they don't currently have the expertise.
> >
> >
> > On Thu, Mar 17, 2016 at 8:16 AM, Scott Deboy 
> > wrote:
> >
> > > Ll
> > > On Mar 17, 2016 6:51 AM, "Jake Farrell"  wrote:
> > >
> > > > Hi Siddharth
> > > > Thanks for drafting a proposal and looking to bring Airflow to the
> > > > Apache Incubator. Overall the proposal looks good, just a couple
> > > > comments. The proposal has the incubator listed as the sponsor and a
> > > > quick check shows Chris Riccomini is not on the IPMC or a member
> > > > currently, details on roles are available at [1]. Not a blocker as you
> > > > have members listed in your group of mentors and can ask that they
> > > > step up as the champion.
> > > >
> > > > Mailing lists are in the old format, they should be
> > > > - priv...@airflow.incubator.apache.org (moderated)
> > > > - d...@airflow.incubator.apache.org
> > > > - comm...@airflow.incubator.apache.org
> > > >
> > > > Github shows 109 contributors, but the proposal lists only 6 initial
> > > > committers, were any of the other existing contributors considered
> > > > for the proposal?
> > > >
> > > > When you submit the proposal please make sure to include the entire
> > > > proposal in the email. If you have any questions please let us know
> > > >
> > > > -Jake
> > > >
> > > >
> > > > [1]:
> > > > http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion
> > > >
> > > > On Wed, Mar 16, 2016 at 8:28 PM, Siddharth Anand wrote:
> > > >
> > > > > https://wiki.apache.org/incubator/AirflowProposal
> > > > >
> > > > > Thoughts and comments are welcome!
> > > > > -s (Sid)
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] [PROPOSAL] Omid for Apache Incubator

2016-03-19 Thread Chris Nauroth
+1 (binding)

--Chris Nauroth




On 3/17/16, 1:17 PM, "Daniel Dai"  wrote:

>Hi,
>
>I would like to propose Omid as an Apache Incubator project:
>
>https://wiki.apache.org/incubator/OmidProposal
>
>I've posted the text of the proposal below:
>
>Thanks,
>Daniel
>
>= Omid Proposal =
>
>=== Abstract ===
>
>Omid is a flexible, reliable, highly performant and scalable ACID
>transactional framework that allows client applications to execute
>transactions on top of MVCC key/value-based NoSQL datastores
>(currently Apache HBase) providing Snapshot Isolation guarantees on
>the accessed data.
>
>
>=== Proposal ===
>
>Omid is a flexible open-source transactional framework that provides
>ACID transactions with Snapshot Isolation guarantees on top of NoSQL
>datastores. In particular, the current codebase brings the concept of
>transactions to the popular Apache HBase datastore. Omid offers great
>performance, it is highly available, and scalable. Omid's current
>version is able to scale to thousands of clients triggering concurrent
>transactions on application data stored in HBase. Omid can scale
>beyond 100K transactions per second on mid-range hardware while
>incurring minimal impact on the speed of data access in the
>datastore. We’re currently experimenting with a prototype version that
>can improve the performance up to ~380K TPS.
>
>
>Omid has been publicly available as an open-source project on GitHub
>under Apache License Version 2.0 since 2011 [1]. Since then, it has
>generated some interest in the open source community, especially after
>the public presentation of the first version at Hadoop Summit 2013 [2].
>Currently the GitHub project has 241 stars and 93 forks. Yahoo Inc.
>submits this proposal to the Apache Software Foundation with the aim of
>transferring the Omid project, including its source code and
>documentation, to Apache in order to start building a stable open
>source community around it.
>
>
>[1] https://github.com/yahoo/omid
>
>[2] Omid presentation at Hadoop Summit 2013:
>https://www.youtube.com/watch?v=Rhdmo9pVGgU&index=68&list=PLSAiKuajRe2luyqLU464Nxz4aQe7EPBus
>
>
>=== Background ===
>
>An Omid prototype was first released as an open-source project back in
>2011. Inspired by Google Percolator [1], it offered a lock-free
>approach to transactions in NoSQL datastores (See [2]). However,
>during these years, the design of Omid has evolved significantly.
>Whilst the current open-sourced version maintains many aspects of the
>original implementation, it is the result of a major redesign of the
>first prototype released in 2011.
>
>
>Omid now has a more decentralized design that does not sacrifice the
>consistency and performance of the original version. The current
>design also enables Omid to scale to thousands of clients executing
>transactions concurrently on application data stored in HBase.
>Internally, Omid still uses a lock-free approach to support
>multiple concurrent clients. Its design also relies on a centralized
>conflict detection component, the TSO, which now efficiently resolves
>writeset collisions among concurrent transactions without having to
>piggyback commit information to the clients. Another
>important benefit of Omid is that it doesn't require any modification
>of the underlying key-value datastore, HBase in this case. Moreover,
>the recently added high availability algorithm eliminates the
>single point of failure represented by the TSO in those system
>deployments requiring a higher degree of dependability. Last but not
>least, the provided user API is very simple, mimicking transaction
>managers in the relational world: begin, commit, rollback.
>
>
>Omid is used internally at Yahoo. Sieve, Yahoo’s web-scale content
>management platform, which powers some next-generation search and
>personalization products, uses Omid as a transaction manager in its
>processing pipeline. Sieve essentially acts as a huge processing hub
>between content feeds and serving systems. It provides an environment
>for highly customizable, real-time, streamed information processing,
>with typical discovery-to-service latencies of just a few seconds. In
>terms of scale and availability, Omid’s new design was largely driven
>by Sieve’s requirements.
>
>
>At Yahoo, we are also making an effort to disseminate the current
>status of the project through blog entries (See [3], [4] and [5]) and
>submissions to technical and academic conferences such as ATC 2016,
>Hadoop Summit 2016 and HBaseConf 2016. Last but not least, Omid also
>appeared in a TechCrunch article in the last quarter of 2015 (See [6]).
>
>
>[1] D. Peng and F. Dabek, Large-scale Incremental Processing Using
>Distributed Transactions and Notifications. USENIX Symposium on
>Operating Systems Design and Implementation, 2010
>
>[2] D. Gomez-Ferro, F. Junqueira, I. Kelly, B. Reed, and M. Yabandeh.
>Omid: Lock-free transactional support for distributed data stores. In
>Proc. of ICDE, 2013.
>
>[3] 
>

[PROPOSAL] : Airflow

2016-03-19 Thread Siddharth Anand
https://wiki.apache.org/incubator/AirflowProposal

Thoughts and comments are welcome!
-s (Sid)


Re: [PROPOSAL] : Airflow

2016-03-19 Thread Jake Farrell
Hi Siddharth
Thanks for drafting a proposal and looking to bring Airflow to the Apache
Incubator. Overall the proposal looks good, just a couple comments. The
proposal has the incubator listed as the sponsor and a quick check shows
Chris Riccomini is not on the IPMC or a member currently, details on roles
are available at [1]. Not a blocker as you have members listed in your
group of mentors and can ask that they step up as the champion.

Mailing lists are in the old format, they should be
- priv...@airflow.incubator.apache.org (moderated)
- d...@airflow.incubator.apache.org
- comm...@airflow.incubator.apache.org

Github shows 109 contributors, but the proposal lists only 6 initial
committers, were any of the other existing contributors considered for the
proposal?

When you submit the proposal please make sure to include the entire
proposal in the email. If you have any questions please let us know

-Jake


[1]:
http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Champion

On Wed, Mar 16, 2016 at 8:28 PM, Siddharth Anand 
wrote:

> https://wiki.apache.org/incubator/AirflowProposal
>
> Thoughts and comments are welcome!
> -s (Sid)
>


Re: daijy for incubator wiki access

2016-03-19 Thread Marvin Humphrey
On Wed, Mar 16, 2016 at 10:14 PM, Daniel Dai  wrote:
> Please add daijy to incubator contributors group so I can edit wiki.

Done.

Marvin Humphrey

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org



Re: Please add me to the incubator's contributors group

2016-03-19 Thread Siddharth Anand
I have changed my user name to `r39132` (the same as my twitter and github
handle)

Please add that user name to the group that can edit wiki. Sorry for any
inconvenience.

On Wed, Mar 16, 2016 at 5:09 PM, Siddharth Anand  wrote:

> Hmn.. I agree the space can be problematic.. so I am trying to change my
> user name under settings, but it looks like it needs me to change my email
> address as well.
>
> On Wed, Mar 16, 2016 at 5:06 PM, Siddharth Anand  wrote:
>
>> Yes. Thanks.
>>
>> I had tried to follow the recommended convention of
>> "FirstnameLastname", but it was taken. So, I used "Siddharth Anand" (with a
>> space) and it worked.
>>
>>
>>
>>
>> On Wed, Mar 16, 2016 at 4:57 PM, Nick Burch  wrote:
>>
>>> On Wed, 16 Mar 2016, Siddharth Anand wrote:
>>>
>>>> Siddharth Anand
>>>>
>>>
>>> If you can tell us what username you signed up to the incubator wiki
>>> with, then one of us can do so for you
>>>
>>> Nick
>>
>