+1 (non-binding)

Tim
> On Oct 11, 2015, at 4:59 PM, Luke Han <luke...@gmail.com> wrote:
>
> +1 (non-binding)
>
>
> Best Regards!
> ---------------------
>
> Luke Han
>
> On Mon, Oct 12, 2015 at 4:33 AM, Alan D. Cabrera <l...@toolazydogs.com>
> wrote:
>
>> +1 - binding
>>
>>
>> Regards,
>> Alan
>>
>>> On Oct 9, 2015, at 8:55 AM, Atri Sharma <a...@apache.org> wrote:
>>>
>>> Hi all,
>>>
>>> Following the discussion about Concerted I would like to call a vote for
>>> accepting Concerted as a new incubator project.
>>>
>>> The proposal text is included below, and available on the wiki:
>>>
>>> https://wiki.apache.org/incubator/ConcertedProposal
>>>
>>> The vote is open for 72 hours:
>>>
>>> [ ] +1 accept Concerted in the Incubator
>>> [ ] ±0
>>> [ ] -1 (please give reason)
>>>
>>> Regards,
>>>
>>> Atri
>>>
>>> = Abstract =
>>>
>>> Concerted is an in-memory, write-less, read-more engine that aims to
>>> provide extreme read performance with a very high degree of concurrency
>>> and scalability, with a focus on minimizing its own resource footprint.
>>>
>>> = Proposal =
>>> Concerted is built on the principle that a new type of workload is
>>> dominating the scene and now needs to be supported: large-data-set
>>> analytical workloads run on large clusters or high-powered machines.
>>> Large analytical workloads depend on the ability to query large data sets
>>> efficiently and with high concurrency while maintaining semantics such as
>>> immediate consistency. An in-memory engine designed to support extreme
>>> read queries while providing support for aggregation through various
>>> features (such as multidimensional representation of tuples) will
>>> accelerate many use cases around large-scale analytics.
>>>
>>> Concerted believes that the best understanding of a user application lies
>>> with the user application developer.
>>> Massive read scaling should be available on demand and flexible enough
>>> that the user can decide which representation of, and access to, the data
>>> suits his/her current requirements.
>>> Hence, Concerted is not built on a traditional client/server model.
>>> Concerted provides users with an API which can be used to load, read,
>>> update and delete data. The user chooses which data structure to use for
>>> the current requirements. All API access is covered by Concerted's
>>> internal systems, such as the lock manager, transaction manager and cache
>>> manager, which ensure that reads scale to a high level in every API call.
>>>
>>> Concerted is a do-it-yourself in-memory platform for building in-memory
>>> supporting engines. The use case we have in mind is supporting big data
>>> warehouses like Hive, but there are endless use cases for a custom,
>>> highly scalable in-memory platform.
>>>
>>> The goal of this proposal is to leverage an existing code base, available
>>> on GitHub and licensed under the Apache License 2.0, to build a community
>>> around the project. Currently the community consists of existing hackers
>>> of Concerted, people who have been following and associated with the
>>> project for a while, and database experts who are excited about building
>>> a project like this. We are hoping that entering Apache will help us
>>> attract more contributors as well as connect with existing big data
>>> projects like Apache Hive, Apache HAWQ, Apache Storm, Apache Tajo, Apache
>>> Spark and Apache Geode, to leverage their community base while assisting
>>> in their use cases with Concerted. We had a discussion with the founders
>>> of Apache Tajo and they showed interest in using Concerted for some of
>>> their use cases.
>>> = Background =
>>> Relational databases were built with the cost of physical memory in mind.
>>> The cost is no longer very relevant and physical memory is now available
>>> on demand.
>>> Another driving factor behind Concerted is the paradigm shift that has
>>> come with big data. Disk IO speeds are more of a bottleneck than ever
>>> before. Combining the read dominance of analytical workloads with the
>>> speed of in-memory structures, Concerted fits the current scene. Also,
>>> supporting OLAP workloads with in-memory support for faster read constant
>>> queries and joins will be useful.
>>>
>>> = Rationale =
>>> As explained above, large analytical workloads need a lightweight
>>> in-memory engine which supports massive read concurrency, ground-level
>>> support for aggregations and analytics, extreme scalability and high read
>>> performance, along with the engine being very light itself. Concerted
>>> aims to meet these needs. Concerted is designed and built with three
>>> goals as objectives:
>>>
>>>
>>> Performance
>>>   To provide high-performance access to data from a large number of rows,
>>>   Concerted uses an efficient representation and in-memory indexing of
>>>   data, coupled with high-performance transactions, custom transactions,
>>>   lightweight locking and lockless techniques, and an intelligent locking
>>>   manager.
>>>
>>> Scalability
>>>   Concerted is built with extreme concurrency and scalability in mind.
>>>
>>> Efficiency
>>>   Concerted aims to give the expected performance under a vast variety of
>>>   workloads and aims to have as low a footprint as possible.
>>>
>>> = Initial Goals =
>>> The initial goal is to leverage an existing code base and invest in
>>> building a community around the project. We anticipate a lot of initial
>>> restructuring of the existing code so that it becomes easier to include
>>> new contributors and minimize ramp-up time. We plan to approach this
>>> refactoring in a fully transparent, community-driven way, thus starting
>>> to practice the "Apache Way" governance model from the get-go.
>>>
>>> Various contributors are getting individual changes into branches in the
>>> GitHub repository, and our initial major goal will be to merge all those
>>> changes into the master repository.
>>>
>>> = Current Status =
>>> Concerted is currently under restructuring to suit the needs of an open
>>> source project. The current source is available at
>>> https://github.com/atris/Concerted (please note that the updated codebase
>>> is not yet present on GitHub). Concerted is currently licensed under the
>>> Apache License 2.0. Most of the code base is implemented in C and C++ and
>>> has external dependencies listed later.
>>>
>>> == Meritocracy ==
>>>
>>> We plan to drive the technical roadmap and implementation in a fully
>>> transparent, community-driven way, soliciting feedback from all of the
>>> community members and building a consensus-driven approach to evolving
>>> the code base and the community itself. Users and new contributors will
>>> be treated with respect and welcomed. By participating in the community
>>> and providing quality patches/support that move the project forward,
>>> contributors will earn merit. They will also be encouraged to provide
>>> non-code contributions (documentation, events, community management,
>>> etc.) and will gain merit for doing so. Those with a proven support and
>>> quality track record will be encouraged to become committers.
>>>
>>> == Community ==
>>> In-memory is the new cutting-edge thing, and a new community around
>>> performance-oriented systems and around enhancing relational database
>>> performance with complete in-memory OLTP engines will greatly benefit
>>> performance. So we expect interest from data warehousing projects and
>>> communities as well as from projects and companies looking for high OLTP
>>> performance. In addition, Ingenium Data Systems is building products
>>> around Concerted and will have salaried developers contribute to the
>>> project as part of their job responsibilities.
>>>
>>> == Core Developers ==
>>> Core developers are a diverse group of developers, many of whom are very
>>> experienced in open source and the Apache Hadoop ecosystem. Specifically,
>>> Atri is an Apache Apex committer, and Atri and Pavel are major
>>> contributors to the PostgreSQL project. Atri is also a committer on other
>>> open source projects.
>>>
>>> * Amrish <amrishs AT ingeniumsys DOT com>
>>> * Nupur S <nupurs AT ingeniumsys DOT com>
>>> * Pavel Stehule <pavel DOT stehule AT gmail.com>
>>> * Atri Sharma <atri AT apache DOT org>
>>> * Nishith Singhal <nishsinghal AT gmail DOT com>
>>> * Michael Down <michael AT dowuk DOT com>
>>> * Vijayakumar Ramdoss <vijayakumar DOT ramdoss AT emc DOT com>
>>> * Wang Albert <albertwang87 AT gmail DOT com>
>>> * Hans-Jurgen Schonig <postgres AT cybertec DOT at>
>>> * Kris Popat <krispopat AT apache DOT org>
>>> * Ayrton Gomesz <com DOT ayrton AT gmail DOT com>
>>>
>>> == Alignment ==
>>> Concerted will be helpful to systems like Tajo, which can benefit from
>>> in-memory structures optimized for heavy reads and joins (dimension
>>> tables). In addition, Concerted will benefit projects looking for an
>>> in-memory relational database as a metadata store, which is the case for
>>> most of the Apache Big Data projects. We expect Apache HAWQ (incubating),
>>> Apache Hive, Apache Storm and Apache Tajo to utilize Concerted as a
>>> supporting engine. For example, a data warehouse built on HAWQ, Hive or
>>> Tajo can utilize Concerted as an in-memory engine for querying and
>>> joining dimensional tables.
>>>
>>> = Known Risks =
>>>
>>> == Orphaned Products ==
>>> Most of the code is developed by a small group of core developers, and
>>> this may pose a risk of an orphaned product.
>>> However, the code base is simple compared to other open source projects,
>>> and the interest level in Concerted has risen exponentially over the
>>> years, with many computer professionals expressing interest in the
>>> project and building some use cases on it. Specifically, there were some
>>> projects done around Concerted at JIIT, Noida (an engineering school),
>>> and Wang is a student at Lehigh University who has been following
>>> Concerted's progress for many years. The core developers are aligned with
>>> this project, and since the code base is simple, future committers will
>>> have a quick ramp-up and the risk will be mitigated. Besides, Ingenium
>>> Data Systems is launching a product based on Concerted and will have all
>>> its salaried developers contribute to Concerted as a part of their job
>>> functions.
>>>
>>> == Inexperience with Open Source ==
>>> Most of the initial committers have experience working on open source
>>> projects. In particular, Atri is an active member of many open source
>>> projects.
>>>
>>> == Homogeneous Developers ==
>>> Although the initial core developers were based in India, the community
>>> now consists of computer professionals from various parts of the world,
>>> hence diversity should not be an issue. In addition, we will be
>>> documenting the internals of the project in public-facing documents,
>>> which will allow more contributors to join in.
>>>
>>> == Reliance on Salaried Developers ==
>>> It is expected that Concerted development will occur on both salaried
>>> time and volunteer time. Nupur and Amrish belong to Ingenium and are
>>> committed to building this project along with their team. Atri, as the
>>> originator of this project, will be actively working on the project and
>>> is now pushing Concerted into major data warehousing projects, since he
>>> is involved in the architecture of data platforms. Developers are
>>> expected to contribute in their volunteer time.
>>> In addition, we will be working with various open source projects which
>>> will benefit from Concerted and will be involving those communities in
>>> Concerted's development as well. For example, Apache Tajo has shown
>>> interest and will be supporting development of the project.
>>>
>>> == Relationships with Other Apache Products ==
>>> Concerted has some overlapping function with Apache Geode (incubating).
>>> However, Geode is an in-memory key-value store whereas Concerted is a
>>> write-less, read-many engine. Concerted will complement Geode and
>>> increase the use cases Geode can support with Concerted's help.
>>>
>>> A major objective for Concerted is supporting OLAP workloads and data
>>> warehouses with in-memory performance and highly performant reads and
>>> joins. Concerted will be collaborating with many open source projects,
>>> such as Apache HAWQ (incubating), Apache Hive and Apache Tajo, to support
>>> their OLAP workloads, hence enabling them to support a larger set of use
>>> cases with better throughput. For example, a star schema in Hive will
>>> benefit from having dimension tables in Concerted, where reads are highly
>>> efficient and scalable and joins will be very fast. A similar workload
>>> applies to Tajo.
>>>
>>> Concerted will fit many other use cases in the Apache spectrum as well.
>>> For example, Concerted can be used with Apache Geode for in-memory
>>> aggregation indexing. Concerted can also be used with Apache Flink to
>>> stream real-time data into memory, perform in-memory aggregation and then
>>> perform batch processing for efficiency.
>>>
>>>
>>> == An Excessive Fascination with the Apache Brand ==
>>> We believe that the "Apache Way" governance model will provide additional
>>> help to us in finding contributors and growing the community. The
>>> community and development process will make this project more stable and
>>> help establish ubiquitous APIs.
>>> In addition, Concerted is looking to support multiple Apache projects in
>>> their use cases and accelerate their performance while soliciting their
>>> support in development of the project.
>>> We will not be using the Apache brand for excessive branding or with any
>>> commercial aspects of Concerted. The Apache brand will primarily be used
>>> for community building.
>>>
>>> = Documentation =
>>> Public documents are currently in development and will be published soon.
>>>
>>> = Initial Source =
>>> The initial source is written in C++ and is heavily in development. It
>>> will be restructured and released publicly.
>>> We understand that there might be concerns around the GitHub source being
>>> developed by only a single person and development not happening after
>>> 2013. The source on GitHub is only the source initially developed as an
>>> independent project, hence the limitation. However, because the project
>>> has been present on GitHub for a while now, it has attracted attention,
>>> and people have been using and developing it locally. For example,
>>> Ingenium Data Systems took an interest in the project, developed it
>>> locally and used it in an upcoming product they are going to release
>>> soon. The project now wants to bring together all independent development
>>> efforts and help attract people to grow the community and project. We are
>>> currently in the process of updating the GitHub repository and making
>>> branches for all local development efforts.
>>>
>>> = Source and Intellectual Property Submission Plan =
>>>
>>> We intend the entire code base to be licensed under the Apache License,
>>> Version 2.0.
>>>
>>> = External Dependencies =
>>> Currently, Concerted depends only on the g++ compiler and pthreads.
>>> pthreads will be replaced by Boost in the next release.
>>>
>>> = Cryptography =
>>>
>>> N/A
>>>
>>> = Required Resources =
>>> == Mailing List ==
>>> * priv...@concerted.incubator.apache.org (moderated subscriptions)
>>> * comm...@concerted.incubator.apache.org
>>> * d...@concerted.incubator.apache.org
>>> * iss...@concerted.incubator.apache.org
>>>
>>> == Git Repository ==
>>>
>>> https://git-wip-us.apache.org/repos/asf/incubator-concerted.git
>>>
>>> == Issue Tracking ==
>>> Jira Concerted (CONCERTED)
>>>
>>> == Other Resources ==
>>> * Continuous Integration
>>>  * Jenkins
>>> * Wiki
>>>  * cwiki.apache.org/confluence/display/CONCERTED
>>>
>>> = Initial Committers =
>>> * Roman Shaposhnik <rvs AT apache DOT org>
>>> * Daniel Dai <daijy AT apache DOT org>
>>> * Jake Farrell <jfarrell AT apache DOT org>
>>> * Lars Hofhansl <larsh AT apache DOT org>
>>> * Julian Hyde <jhyde AT apache DOT org>
>>> * Chris Nauroth <cnauroth AT hortonworks DOT com>
>>> * Pavel Stehule <pavel DOT stehule AT gmail.com>
>>> * Amrish <amrishs AT ingeniumsys DOT com>
>>> * Nupur S <nupurs AT ingeniumsys DOT com>
>>> * Atri Sharma <atri AT apache DOT org>
>>> * Nishith Singhal <nishsinghal AT gmail DOT com>
>>> * Michael Down <michael AT dowuk DOT com>
>>> * Vijayakumar Ramdoss <vijayakumar DOT ramdoss AT emc DOT com>
>>> * Wang Albert <albertwang87 AT gmail DOT com>
>>> * Hans-Jurgen Schonig <postgres AT cybertec DOT at>
>>> * Kris Popat <krispopat AT apache DOT org>
>>> * Ayrton Gomesz <com DOT ayrton AT gmail DOT com>
>>>
>>> = Affiliations =
>>> * Roman Shaposhnik (Pivotal)
>>> * Daniel Dai (HortonWorks)
>>> * Jake Farrell (Acquia)
>>> * Lars Hofhansl (Salesforce)
>>> * Julian Hyde (HortonWorks)
>>> * Chris Nauroth (HortonWorks)
>>> * Pavel Stehule (GoodData)
>>> * Amrish (Ingenium Data Systems)
>>> * Nupur S (Ingenium Data Systems)
>>> * Atri Sharma (Barclays)
>>> * Nishith Singhal (Wipro)
>>> * Michael Down (Barclays)
>>> * Vijayakumar Ramdoss (EMC)
>>> * Wang Albert (Lehigh University)
>>> * Hans-Jurgen Schonig (CyberTec)
>>> * Kris Popat (CETIS LLP)
>>> * Ayrton Gomesz (IQLabs)
>>>
>>> The nominated mentors are employees of HortonWorks, Acquia, and
>>> Salesforce.
>>>
>>> * Daniel Dai (HortonWorks)
>>> * Jake Farrell (Acquia)
>>> * Lars Hofhansl (Salesforce)
>>> * Julian Hyde (HortonWorks)
>>> * Chris Nauroth (HortonWorks)
>>>
>>> = Sponsors =
>>>
>>> == Champion ==
>>>
>>> * Roman Shaposhnik (rvs AT apache DOT org)
>>>
>>> == Nominated Mentors ==
>>>
>>> * Daniel Dai <daijy AT apache DOT org>
>>> * Jake Farrell <jfarrell AT apache DOT org>
>>> * Lars Hofhansl <larsh AT apache DOT org>
>>> * Julian Hyde <jhyde AT apache DOT org>
>>> * Chris Nauroth <cnauroth AT hortonworks DOT com>
>>>
>>> == Sponsoring Entity ==
>>> Apache Incubator
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org