Note that there is an existing database product called Palo - an open source OLAP engine by the German company Jedox[1]. There is a high likelihood that Palo would have to change its name during incubation, if accepted.
Julian

[1] https://en.wikipedia.org/wiki/Palo_(OLAP_database)

> On Jun 10, 2018, at 3:49 AM, Han Luke <luke...@gmail.com> wrote:
>
> Cool Dave, it’s great to have you as the Champion.
>
>
> ________________________________
> From: Tan,Zhongyi <tanzhon...@baidu.com>
> Sent: Saturday, June 9, 2018 8:16:28 AM
> To: general@incubator.apache.org
> Subject: Re: Looking for Champion
>
> Thanks, Willem.
>
> We really appreciate it.
>
>> On Jun 8, 2018, at 23:03, Willem Jiang <willem.ji...@gmail.com> wrote:
>>
>> Hi,
>>
>> I'm willing to be the Mentor.
>> Please count me in.
>>
>>
>>
>> Willem Jiang
>>
>> Twitter: willemjiang
>> Weibo: 姜宁willem
>>
>>> On Fri, Jun 8, 2018 at 8:59 PM, Dave Fisher <dave2w...@comcast.net> wrote:
>>>
>>> Hi -
>>>
>>> I’m willing to Champion and Mentor. I have a couple of comments inline.
>>> I’ll look at dependency licenses later today. It’s early for me.
>>>
>>>
>>>> On Jun 7, 2018, at 9:45 PM, Li,De(BDG) <l...@baidu.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I am Reed, a developer working with the team on Palo (an MPP-based interactive SQL data warehouse).
>>>> https://github.com/baidu/palo/wiki/Palo-Overview
>>>>
>>>> We propose to contribute Palo as an Apache Incubator project, and we are still looking for a possible Champion if anyone would like to volunteer. Thanks a lot.
>>>>
>>>> Best Regards,
>>>> Reed
>>>>
>>>> ===================
>>>> The draft of the proposal is below:
>>>>
>>>> #Apache Palo
>>>>
>>>> ##Abstract
>>>>
>>>> Palo is an MPP-based interactive SQL data warehouse for reporting and analysis.
>>>>
>>>> ##Proposal
>>>>
>>>> We propose to contribute the Palo codebase and associated artifacts (e.g. documentation, web-site content etc.)
>>>> to the Apache Software Foundation with the intent of forming a productive, meritocratic and open community around Palo’s continued development, according to the ‘Apache Way’.
>>>>
>>>> Baidu owns several trademarks regarding Palo, and proposes to transfer ownership of those trademarks in full to the ASF.
>>>>
>>>> ###Overview of Palo
>>>>
>>>> Palo’s implementation consists of two daemons: Frontend (FE) and Backend (BE).
>>>>
>>>> **Frontend daemon** consists of a query coordinator and a catalog manager. The query coordinator is responsible for receiving users’ SQL queries, compiling queries and managing query execution. The catalog manager is responsible for managing metadata such as databases, tables, partitions, and replicas. Several frontend daemons can be deployed to provide fault tolerance and load balancing.
>>>>
>>>> **Backend daemon** stores the data and executes the query fragments. Many backend daemons can also be deployed to provide scalability and fault tolerance.
>>>>
>>>> A typical Palo cluster is generally composed of several frontend daemons and dozens to hundreds of backend daemons.
>>>>
>>>> Users can use MySQL client tools to connect to any frontend daemon to submit SQL queries. The Frontend receives the query and compiles it into query plans executable by the Backend. The Frontend then sends the query plan fragments to the Backend. The Backend builds a query execution DAG. Data is fetched and pipelined into the DAG. The final result is sent to the client via the Frontend. The distribution of query fragment execution has minimizing data movement and maximizing scan locality as its main goals.
>>>>
>>>> ##Background
>>>>
>>>> At Baidu, prior to Palo, different tools were deployed to solve diverse requirements in many ways.
>>>> And when a use case required the simultaneous availability of capabilities that could not all be provided by a single tool, users were forced to build hybrid architectures that stitch multiple tools together, but we believe that they shouldn’t need to accept such inherent complexity. A storage system built to provide great performance across a broad range of workloads provides a more elegant solution to the problems that hybrid architectures aim to solve. Palo is the solution.
>>>>
>>>> Palo is designed to be a simple, single, tightly coupled system that does not depend on other systems. Palo provides high-concurrency, low-latency point query performance, and also high-throughput ad-hoc analytical queries. Palo provides bulk-batch data loading, and also near real-time mini-batch data loading. Palo also provides high availability, reliability, fault tolerance, and scalability.
>>>>
>>>> ##Rationale
>>>>
>>>> Palo mainly integrates the technology of Google Mesa and Apache Impala.
>>>>
>>>> Mesa is a highly scalable analytic data storage system that stores critical measurement data related to Google's Internet advertising business. Mesa is designed to satisfy a complex and challenging set of user and system requirements, including near real-time data ingestion and query ability, as well as high availability, reliability, fault tolerance, and scalability for large data and query volumes.
>>>>
>>>> Impala is a modern, open-source MPP SQL engine architected from the ground up for the Hadoop data processing environment. At present, by virtue of its superior performance and rich functionality, Impala is comparable to many commercial MPP database query engines.
>>>> Mesa can satisfy many of our storage requirements; however, Mesa itself does not provide a SQL query engine. Impala is a very good MPP SQL query engine, but it lacks a perfect distributed storage engine. So in the end we chose to combine these two technologies.
>>>>
>>>> Learning from Mesa’s data model, we developed a distributed storage engine. Unlike Mesa, this storage engine does not rely on any distributed file system. We then deeply integrated this storage engine with the Impala query engine. Query compilation, query execution coordination and catalog management of the storage engine are integrated into the frontend daemon; query execution and data storage are integrated into the backend daemon. With this integration, we implemented a single, full-featured, high-performance, state-of-the-art MPP database while maintaining simplicity.
>>>>
>>>> ##Current Status
>>>>
>>>> Palo has been an open source project on GitHub (https://github.com/baidu/palo).
>>>>
>>>> ###Meritocracy
>>>>
>>>> Palo has been deployed in production at Baidu and is applied in more than 200 lines of business. It has demonstrated great performance benefits and has proved to be a better way to do reporting and analysis on big data. Still, we look forward to growing a rich user and developer community.
>>>>
>>>> ###Community
>>>>
>>>> Palo seeks to develop developer and user communities during incubation.
>>>>
>>>> ###Core Developers
>>>>
>>>> * Ruyue Ma (https://github.com/maruyue, maru...@baidu.com)
>>>> * Chun Zhao (https://github.com/imay, buaa.zh...@gmail.com)
>>>> * Mingyu Chen (https://github.com/morningman, chenmin...@baidu.com)
>>>> * De Li (https://github.com/lide-reed, mailtol...@sina.com)
>>>> * Hao Chen (https://github.com/chenhao7253886, chenha...@baidu.com)
>>>> * Chaoyong Li (https://github.com/cyongli, lichaoy...@baidu.com)
>>>> * Bin Lin (https://github.com/lingbin, lingbi...@gmail.com)
>>>>
>>>> ###Alignment
>>>>
>>>> Palo is related to several other Apache projects:
>>>>
>>>> * Palo can also read data stored in Apache Hadoop clusters powered by the HDFS filesystem.
>>>> * Palo is closely integrated with Impala, which is also being proposed to the Incubator.
>>>
>>> Apache Impala has completed Incubation. Jim Apple is VP, Impala.
>>>
>>>> * Palo uses Apache Thrift as its RPC and serialization framework of choice.
>>>>
>>>> ##Known Risks
>>>>
>>>> ###Orphaned Products
>>>>
>>>> The core developers of the Palo team plan to work full time on this project. There is very little risk of Palo getting orphaned since at least one large company (Baidu) is extensively using it in production. For example, there are currently more than 200 use cases using Palo in production. Furthermore, since Palo was open sourced at the beginning of October 2017, it has received more than 660 stars and been forked nearly 170 times. We plan to extend and diversify this community further through Apache.
>>>>
>>>> ###Inexperience with Open Source
>>>>
>>>> The core developers are all active users and followers of open source. They are already committers and contributors to the Palo GitHub project.
>>>> All have been involved with source code that has been released under an open source license, and several of them also have experience developing code in an open source environment. Though the core set of developers do not have Apache open source experience, there are plans to onboard individuals with Apache open source experience on to the project.
>>>>
>>>> ###Homogenous Developers
>>>>
>>>> Most of the core developers are from Baidu, but after Palo was open sourced, it received a lot of bug fixes and enhancements from other developers not working at Baidu.
>>>>
>>>> ###Reliance on Salaried Developers
>>>>
>>>> Baidu invested in Palo as its OLAP solution and some of its key engineers are working full time on the project. In addition, since there is a growing Big Data need for scalable OLAP solutions, we look forward to other Apache developers and researchers contributing to the project. Also key to addressing the risk associated with relying on salaried developers from a single entity is to increase the diversity of the contributors and actively lobby for domain experts in the BI space to contribute. Apache Palo intends to do this.
>>>>
>>>> ###An Excessive Fascination with the Apache Brand
>>>>
>>>> Palo is proposing to enter incubation at Apache in order to help efforts to diversify the committer base, not so much to capitalize on the Apache brand. The Palo project is in production use already inside Baidu, but is not expected to be a Baidu product for external customers. As such, the Palo project is not seeking to use the Apache brand as a marketing tool.
>>>>
>>>> ##Documentation
>>>>
>>>> Information about Palo can be found at https://github.com/baidu/palo.
>>>> The following links provide more information about Palo in open source:
>>>>
>>>> * Palo wiki site: https://github.com/baidu/palo/wiki
>>>> * Codebase at GitHub: https://github.com/baidu/palo
>>>> * Issue Tracking: https://github.com/baidu/palo/issues
>>>> * Overview: https://github.com/baidu/palo/wiki/Palo-Overview
>>>> * FAQ: https://github.com/baidu/palo/wiki/Palo-FAQ
>>>>
>>>> ##Initial Source
>>>>
>>>> Palo has been under development since 2017 by a team of engineers at Baidu Inc. It is currently hosted on GitHub under an Apache license at https://github.com/baidu/palo.
>>>>
>>>> ##External Dependencies
>>>>
>>>> Palo has the following external dependencies.
>>>>
>>>> * Google gflags (BSD)
>>>> * Google glog (BSD)
>>>> * Apache Thrift (Apache Software License v2.0)
>>>> * Apache Commons (Apache Software License v2.0)
>>>> * Boost (Boost Software License)
>>>> * OpenLdap (OpenLDAP Software License)
>>>> * rapidjson (Tencent)
>>>> * Google RE2 (BSD-style)
>>>> * lz4 (BSD)
>>>> * snappy (BSD)
>>>> * cyrus-sasl (CMU License)
>>>> * Twitter Bootstrap (Apache Software License v2.0)
>>>> * d3 (BSD)
>>>> * LLVM (BSD-like)
>>>>
>>>> Build and test dependencies:
>>>>
>>>> * ant (Apache Software License v2.0)
>>>> * Apache Maven (Apache Software License v2.0)
>>>> * cmake (BSD)
>>>> * clang (BSD)
>>>> * Google gtest (Apache Software License v2.0)
>>>>
>>>> ##Required Resources
>>>>
>>>> ###Mailing List
>>>>
>>>> There are currently no mailing lists. The usual mailing lists are expected to be set up when entering incubation:
>>>>
>>>> private@palo.incubator.apache.org
>>>> d...@palo.incubator.apache.org
>>>> commits@palo.incubator.apache.org
>>>>
>>>> ###Subversion Directory
>>>>
>>>> Upon entering incubation: https://github.com/baidu/palo.
>>>> After incubation, we want to move the existing repo from https://github.com/baidu/palo to Apache infrastructure.
>>>>
>>>> ###Issue Tracking
>>>>
>>>> Palo currently uses GitHub to track issues. We would like to continue to do so while we discuss migration possibilities with the ASF Infra committee.
>>>>
>>>> ###Other Resources
>>>>
>>>> The existing code already has unit tests, so we will make use of the existing Apache continuous testing infrastructure. The resulting load should not be very large.
>>>>
>>>> ##Initial Committers
>>>>
>>>> * Ruyue Ma (https://github.com/maruyue, maru...@baidu.com)
>>>> * Chun Zhao (https://github.com/imay, buaa.zh...@gmail.com)
>>>> * Mingyu Chen (https://github.com/morningman, chenmin...@baidu.com)
>>>> * De Li (https://github.com/lide-reed, mailtol...@sina.com)
>>>> * Hao Chen (https://github.com/chenhao7253886, chenha...@baidu.com)
>>>> * Chaoyong Li (https://github.com/cyongli, lichaoy...@baidu.com)
>>>> * Bin Lin (https://github.com/lingbin, lingbi...@gmail.com)
>>>>
>>>> ##Affiliations
>>>>
>>>> The initial committers are employees of Baidu Inc. The nominated mentors are employees of TODO.
>>>>
>>>> ##Sponsors
>>>>
>>>> ###Champion
>>>>
>>>> TODO
>>>>
>>>> ###Nominated Mentors
>>>>
>>>> * Sijie Guo, guosi...@gmail.com
>>>> * Luke Han, luke...@apache.org
>>>> * Zheng Shao, zs...@apache.org
>>>
>>> Mentors must be members of the IPMC and almost always Members of the ASF.
>>>
>>> At this moment only Luke Han is qualified.
>>>
>>> Regards,
>>> Dave
>>>
>>>>
>>>> ###Sponsoring Entity
>>>>
>>>> We are requesting the Incubator to sponsor this project.