RE: [VOTE] Phoenix for incubator project

Vasudevan, Ramkrishna S Fri, 06 Dec 2013 02:06:19 -0800

+1 from me.

Regards
Ram


-----Original Message-----
From: Bruno Mahé [mailto:bm...@apache.org] 
Sent: Friday, December 06, 2013 3:30 PM
To: general@incubator.apache.org
Subject: Re: [VOTE] Phoenix for incubator project

On 12/05/2013 01:43 PM, Stack wrote:
> Discussion of the Phoenix proposal has settled since its original 
> posting on November 7th.  Feedback has been incorporated.
>
> Let us now move to a vote.
>
> Should Phoenix become an Apache incubator project?
>
> [] +1 Accept Phoenix into the Incubator
> [] +0 Don't care whether or which
> [] -1 Do not accept Phoenix into the Incubator because...
>
> The latest version of the proposal can be found here [1].  It is also 
> posted below for your convenience.
>
> Let the vote run 72 hours.
>
> Thank you,
> St.Ack
>
> 1. https://wiki.apache.org/incubator/PhoenixProposal
>
>
>
>
> Abstract
>
> Phoenix is an open source SQL query engine for Apache HBase, a NoSQL 
> data store. It is accessed as a JDBC driver and enables querying and 
> managing HBase tables using SQL.
>
> Proposal
>
> Phoenix is an open source SQL skin over HBase delivered as a 
> client-embedded JDBC driver targeting low latency queries over HBase data.
> Phoenix takes your SQL query, compiles it into a series of HBase 
> scans, and orchestrates the running of those scans to produce regular 
> JDBC result sets. The table metadata is stored in an HBase table and 
> versioned, such that snapshot queries over prior versions will 
> automatically use the correct schema. Direct use of the HBase API, 
> along with coprocessors and custom filters, results in performance on 
> the order of milliseconds for small queries, or seconds for tens of 
> millions of rows. Phoenix interfaces with both Pig and MapReduce for the 
> input and output of data.
>
> Background
>
> Phoenix initially started as an internal project at Salesforce.com to 
> efficiently analyze big data stored in HBase. It was open sourced on 
> Github about a year ago in Jan 2013. Over time Phoenix, together with 
> HBase as the storage tier, has begun to evolve into a general SQL 
> database with support for metadata management, secondary indexes, 
> joins, query optimization, and multi-tenancy. This is expected to 
> continue as Phoenix implements a cost-based query optimizer and 
> potentially transaction support, and surfaces new HBase security 
> features such as encryption and cell-level security. Phoenix's 
> developer community has also grown to include additional companies 
> such as Intel, who have contributed join support to Phoenix, as well 
> as Hortonworks, who are in the process of porting Phoenix to the 0.96 release 
> of HBase.
>
> Rationale
>
> As usage and the number of contributors to Phoenix have grown, we have 
> sought a long-term home for the project, and we believe the Apache 
> foundation would be a great fit. Joining Apache would ensure that 
> tried and true processes and procedures are in place for the growing 
> number of organizations interested in contributing to Phoenix. Phoenix 
> is also a good fit for the Apache foundation: Phoenix already 
> interoperates with several existing Apache projects (HBase, Hadoop, 
> Pig, Bigtop). The Phoenix team is familiar with the Apache process and 
> believes in the Apache mission - the team already includes multiple 
> Apache committers.
>
> Initial Goals
>
> The initial goals will be to move the existing codebase to Apache and 
> integrate with the Apache development process. Once this is 
> accomplished, we plan for incremental development and releases that 
> follow the Apache guidelines.
>
> Current Status
>
> Phoenix has undergone two major and three minor releases (1.0, 1.1, 
> 1.2, 2.0, and 2.1) as well as many patch releases. Phoenix is being 
> used in production by Salesforce.com as well as at other 
> organizations. The Phoenix codebase is currently hosted at github.com, 
> which will form the basis of the Apache git repository.
>
> Meritocracy
>
> The Phoenix project already operates on meritocratic principles. 
> Phoenix has several developers from various organizations outside of 
> Salesforce.com who have contributed major new features. While this 
> process has remained mostly informal, as we do not have an official 
> committer list, an implicit organization exists in which individuals 
> who contribute major components act as maintainers for those modules. 
> If accepted, the Phoenix project would include several of these 
> participants as initial committers. We will work to identify all 
> committers and PPMC members for the project and to operate under the ASF 
> meritocratic principles.
>
> Community
>
> Acceptance into the Apache foundation would bolster the already strong 
> user and developer community around Phoenix. That community includes 
> many contributors from various other companies, and an active mailing 
> list composed of hundreds of users.
>
> Core Developers
>
> The core developers of our project are listed in our contributors and 
> initial PPMC below. Though many are employed at Salesforce.com, there 
> is a representative cross sampling of other organizations including 
> Intel, Hortonworks, and Cloudera.
>
> Alignment
>
> Our proposed Phoenix effort aligns closely with Apache HBase. The 
> HBase project perimeter is denoted by simple byte-array based 
> Create, Read, Update, Delete and Scan APIs, with no current plans to 
> extend beyond these bounds. Phoenix complements this with a higher 
> level API in SQL with which many are already familiar. At first 
> glance, it may seem that Phoenix should just be folded into HBase as a 
> new module. However, the focus of the two projects will be quite 
> different, especially as Phoenix matures. With secondary indexing and 
> joins just having been introduced into Phoenix, the next big frontier 
> will be to implement a cost-based query optimizer. This is the 
> heart-and-soul of most relational databases and can take a lifetime to 
> get right.
>
> HBase is focused on being a scalable data store agnostic to types and 
> schema. Phoenix would layer typing, and relational facilities on top 
> of this scalable store. By keeping Apache HBase and Phoenix separate, 
> both may evolve independently and at different rates. Though the focus 
> of the two projects is different, the relationship between them is 
> very positive and mutually beneficial. New features in HBase will be 
> leveraged in Phoenix as it makes sense to surface these in a SQL 
> paradigm. In addition, Phoenix may drive new features in HBase, as 
> evidenced by the new type system recently introduced into HBase. This 
> will enable better interoperability between Apache Hive, standalone 
> HBase use cases, and Phoenix by defining a standard serialization format.
>
> Phoenix can be divided into a front end and a back end. The front end 
> is delivered as a JDBC driver and contains, among other things, the 
> SQL parser and query planner. The front end is currently written for 
> the HBase client API but could be extended to support other data stores in 
> the Apache family.
>
> The back end is, currently, HBase specific components for pushing as 
> much work to the server as possible. However, if there were sufficient 
> interest to build them, contributions to Phoenix of new back ends for 
> other data stores in the Apache family would be feasible.
>
> Other projects exist that perform SQL over HBase data (such as Apache 
> Hive); however, these products do not provide the same low latency 
> query capabilities as Phoenix. Instead, they are more oriented around 
> maximizing throughput for batched operations. Phoenix opens the door 
> to a completely new set of use cases for Apache HBase that demand a 
> more interactive user experience.
>
> There are also a number of related Apache projects and dependencies 
> that are mentioned in the Relationship with Other Apache Products section.
>
> Known Risks
>
> Orphaned Products
>
> Given the current level of investment in Phoenix, the risk of the 
> project being abandoned is minimal. All current and planned HBase use 
> cases at Salesforce.com go through Phoenix. In addition, both Intel 
> and Hortonworks plan to include Phoenix in their distributions. Other 
> companies have devoted significant internal infrastructure investment in 
> Phoenix.
>
> Inexperience with Open Source
>
> Phoenix has existed as a healthy open source project for almost a year.
> During that time, James, Mujtaba, and others have successfully 
> fostered an open-source community, attracting users and developers 
> from a diverse group of companies including Intel, Intuit, Bloomberg, Tagged, 
> and Hortonworks.
> Although neither is a committer on other Apache projects, both James 
> and Mujtaba have experience working with and contributing to other 
> Apache projects.
>
> Homogenous Developers
>
> The initial list of committers includes developers from several 
> institutions, including Salesforce, Intel, and Hortonworks.
>
> Reliance on Salaried Developers
>
> Like most open source projects, Phoenix receives substantial support 
> from salaried developers. A large fraction of Phoenix development is 
> supported by Salesforce.com. In addition, those working from within 
> corporations and universities often devote “after hours” or spare time 
> to the project. We will continue our efforts to ensure that stewardship 
> of the project is independent of salaried developers.
>
> Relationship with Other Apache Products
>
> Although Phoenix provides a higher level abstraction than Apache HBase 
> by hiding its client APIs, Phoenix relies on Apache HBase for both 
> storing and retrieving data. It also inter-operates with Apache HBase 
> by allowing existing data, not created by Phoenix, to be queried. In 
> addition, both Apache Pig and Hadoop are supported for data input and 
> output. Finally, Phoenix is included and installable through 
> Apache Bigtop and the build and test suite are run through Apache Maven.
>
> Phoenix offers an alternative query engine to Apache Hadoop (MapReduce).
> Unlike MapReduce, Phoenix is designed for lower-latency, OLTP, and 
> interactive workloads. This makes the projects complementary as users 
> may run MapReduce and Phoenix side-by-side.
>
> We plan to increase the interoperability between Phoenix, Apache Hive, 
> and standalone Apache HBase usage by standardizing on a new type 
> system that has been introduced in the current major release of HBase. 
> By all these products adopting this new serialization format, 
> interoperability between them will take a big step forward.
>
> In addition, we plan to explore providing lower level APIs for other 
> products such as Apache Drill to plug into when querying HBase data so 
> that they get the performance benefits of using Phoenix.
>
> An Excessive Fascination with the Apache Brand
>
> Phoenix is already a healthy and relatively well known open source project.
> This proposal is not for the purpose of generating publicity. Rather, 
> the primary benefits to joining Apache are those outlined in the 
> Rationale section.
>
> Documentation
>
> Additional documentation on Phoenix may be found on its github website:
>
> Phoenix overview:
> https://github.com/forcedotcom/phoenix/blob/master/README.md
>
> Phoenix wiki: https://github.com/forcedotcom/phoenix/wiki
>
> Phoenix road map: https://github.com/forcedotcom/phoenix/wiki#roadmap
>
> Phoenix issue tracking:
> https://github.com/forcedotcom/phoenix/issues?direction=desc&sort=updated&state=open
>
> Phoenix codebase: https://github.com/forcedotcom/phoenix
>
> Phoenix SQL language reference: http://forcedotcom.github.io/phoenix/
>
> Phoenix performance:
> https://github.com/forcedotcom/phoenix/wiki/Performance#phoenix-vs-related-products
>
> User group: https://groups.google.com/group/phoenix-hbase-user
>
> Initial Source
>
> The Phoenix codebase is currently hosted on Github:
> https://github.com/forcedotcom/phoenix.
>
> Source and Intellectual Property Submission Plan
>
> Currently, the Phoenix codebase is distributed under a BSD license. 
> Upon entering Apache, the Phoenix license will be migrated to the 
> Apache 2.0 License.
>
> External Dependencies
>
> Beyond relying on Apache HBase, Phoenix has the following external
> dependencies:
>
> ANTLR 3.5 (BSD license: http://www.antlr3.org/license.html)
>
> Sqlline 1.1.2 (BSD license:
> https://github.com/julianhyde/sqlline/blob/master/LICENSE)
>
> Open CSV 2.3 (Apache 2.0 license)
>
> Upon acceptance to the incubator, we would begin a thorough analysis 
> of all transitive dependencies to verify this information and 
> introduce license checking into the build and release process by integrating 
> with Apache Rat.
>
> Required Resources
>
> Mailing list
>
> We will migrate the existing Phoenix mailing lists as follows:
>
> phoenix-hbase-u...@googlegroups.com --> 
> us...@phoenix.incubator.apache.org
>
> phoenix-hbase-...@googlegroups.com --> 
> d...@phoenix.incubator.apache.org
>
> priv...@phoenix.incubator.apache.org for IPMC members
>
> comm...@phoenix.incubator.apache.org
>
> The latter is to be consistent with the new PIAO naming scheme for podlings.
>
> Source control
>
> The Phoenix team would like to use Git for source control, due to our 
> current use of Git. We request a writeable Git repo for Phoenix, and 
> mirroring to be set up to Github through INFRA.
>
> Issue Tracking
>
> Phoenix currently uses the github issue tracking system associated 
> with its github repo:
> https://github.com/forcedotcom/phoenix/issues?direction=desc&sort=updated&state=open.
> We will migrate to the Apache JIRA:
> http://issues.apache.org/jira/browse/PHOENIX
>
> Other Resources
>
> Jenkins/Hudson for builds and test running.
> Wiki for documentation purposes
> Blog to improve project dissemination
>
> Initial Committers
>
> James Taylor <jtaylor at salesforce dot com>
>
> Mujtaba Chohan <mchohan at salesforce dot com>
>
> Jesse Yates <jyates at apache dot org>
>
> Eli Levine <elevine at salesforce dot com>
>
> Simon Toens <stoens at salesforce dot com>
>
> Maryann Xue <wei.xue at intel dot com>
>
> Anoop Sam John <anoopsamjohn at apache dot org>
>
> Ramkrishna S Vasudevan <ramkrishna at apache dot org>
>
> Jeffrey Zhong <jeffreyz at apache dot org>
>
> Nick Dimiduk <ndimiduk at apache dot org>
>
> Affiliations
>
> The initial committers are from three organizations: Salesforce.com, 
> Intel, and Hortonworks.
>
> James Taylor (Salesforce.com)
> Mujtaba Chohan (Salesforce.com)
> Jesse Yates (Salesforce.com)
> Eli Levine (Salesforce.com)
> Simon Toens (Salesforce.com)
> Maryann Xue (Intel)
> Anoop Sam John (Intel)
> Ramkrishna S Vasudevan (Intel)
> Jeffrey Zhong (Hortonworks)
> Nick Dimiduk (Hortonworks)
>
> Sponsors
>
> Champion
>
> Michael Stack
>
> Nominated Mentors
>
> Michael Stack
> Lars Hofhansl
> Andrew Purtell
> Devaraj Das
> Enis Soztutar
> Steven Noels
>
> Sponsoring Entity
>
> The Apache Incubator
>


+1

Thanks,
Bruno

---------------------------------------------------------------------
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org


