Re: [VOTE] Accept Apache Atlas into Apache Incubator
Hi Jake,

Sorry that I missed your comment, and for the delay in my response. Thanks for the heads up; I will take this up with the podling name search JIRA.

Thanks!

On Sun, May 3, 2015 at 6:52 PM, Jake Farrell jfarr...@apache.org wrote:

Sorry I missed the discussion thread for this proposed podling. The name for this project may have an issue with Netflix Atlas [1] when it comes time to graduate; it may be worth discussing a name switch, if voted in, before any infra resources are set up.

-Jake

[1]: http://techblog.netflix.com/2014/12/introducing-atlas-netflixs-primary.html
[2]: https://github.com/netflix/atlas

On Fri, May 1, 2015 at 3:26 AM, Seetharam Venkatesh venkat...@innerzeal.com wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal

Also, the text of the latest wiki proposal is included at the bottom of this email.

The VOTE is open for at least the next 72 hours:

[ ] +1 accept Apache Atlas into the Apache Incubator
[ ] ±0 Abstain
[ ] -1 because...

Of course I am +1! (non-binding)

Thanks!

= Apache Atlas Proposal =

== Abstract ==

Apache Atlas is a scalable and extensible set of core foundational governance services that enables enterprises to effectively and efficiently meet their compliance requirements within Hadoop and allows integration with the complete enterprise data ecosystem.

== Proposal ==

Apache Atlas allows agnostic governance visibility into Hadoop; these abilities are enabled through a set of core foundational services powered by a flexible metadata repository. These services include:

* Search and Lineage for datasets
* Metadata driven data access control
* Indexed and Searchable Centralized Auditing operational Events
* Data lifecycle management – ingestion to disposition
* Metadata interchange with other metadata tools

== Background ==

Hadoop is one of many platforms in the modern enterprise data ecosystem and requires governance controls commensurate with this reality. Currently, there is no easy or complete way to provide comprehensive visibility and control into Hadoop audit, lineage, and security for workflows that require Hadoop and non-Hadoop processing. Many solutions are usually point based, and require a monolithic application workflow. Multi-tenancy and concurrency are problematic as these offerings are not aware of activity outside of their narrow focus. As Hadoop gains greater popularity, governance concerns will become increasingly vital to increasing maturity and furthering adoption. It is a particular barrier to expanding enterprise data under management.

== Rationale ==

Atlas will address issues previously discussed by providing governance capabilities in Hadoop -- using both a prescriptive and forensic model enriched by business taxonomical metadata. Atlas, at its core, is designed to exchange metadata with other tools and processes within and outside of the Hadoop stack -- enabling governance controls that are truly platform agnostic and effectively (and defensibly) address compliance concerns. Initially working with a group of leading partners in several industries, Atlas is built to solve specific real world governance problems that accelerate product maturity and time to value. Atlas aims to grow a community to help build a widely adopted pattern for governance, metadata modeling and exchange in Hadoop – which will advance the interests for the whole community.

== Current Status ==

An initial version with a valuable set of features has been developed by the list of initial committers and is hosted on GitHub.

=== Meritocracy ===

Our intent with this proposal is to start building a diverse developer community around Atlas following the Apache meritocracy model. We have wanted to make the project open source and encourage contributors from multiple organizations from the start. We plan to provide plenty of support to new developers and to quickly recruit those who make solid contributions to committer status.

=== Community ===

We are happy to report that the initial team already represents multiple organizations. We hope to extend the user and developer base further in the future and build a solid open source community around Atlas.

=== Core Developers ===

Atlas development is currently being led by engineers from Hortonworks – Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the engineers have deep expertise in Hadoop and are quite familiar with the Hadoop Ecosystem.

=== Alignment ===

The ASF is a natural host for Atlas given that it is already the home of Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and
[RESULT][VOTE] Accept Apache Atlas into Apache Incubator
Hi folks,

With nine +1 binding votes, six +1 non-binding votes (fifteen +1 votes total), and NO +/-0 or -1 votes, this VOTE PASSES. Thanks to all who voted!

Here's a tally of +1 binding votes:

Jitendra Pandey
Alan Gates
Jan I
Taylor Goetz
Arun C Murthy
Jean-Baptiste Onofre
Suresh Srinivas
Sharad Agarwal
Amareshwari

Non-binding:

Dilli Arumugam
Shwetha Shivalingamurthy
Venkatesh Seetharam
Ajay Yadav
Venkat Ranganathan
Suma Shivaprasad

Thanks!

On Fri, May 1, 2015 at 12:26 AM, Seetharam Venkatesh venkat...@innerzeal.com wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal

Also, the text of the latest wiki proposal is included at the bottom of this email.

The VOTE is open for at least the next 72 hours:

[ ] +1 accept Apache Atlas into the Apache Incubator
[ ] ±0 Abstain
[ ] -1 because...

Of course I am +1! (non-binding)

Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (non binding)

Minor nit: Please update my email address to venkatran...@apache.org

Thanks
Venkat

On 5/1/15, 12:26 AM, Seetharam Venkatesh venkat...@innerzeal.com wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal

Also, the text of the latest wiki proposal is included at the bottom of this email.

The VOTE is open for at least the next 72 hours:

[ ] +1 accept Apache Atlas into the Apache Incubator
[ ] ±0 Abstain
[ ] -1 because...

Of course I am +1! (non-binding)

Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (non binding)

Thanks
Suma

On Tue, May 5, 2015 at 8:23 PM, Venkat Ranganathan vranganat...@hortonworks.com wrote:

+1 (non binding)

Minor nit: Please update my email address to venkatran...@apache.org

Thanks
Venkat

On 5/1/15, 12:26 AM, Seetharam Venkatesh venkat...@innerzeal.com wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal

Also, the text of the latest wiki proposal is included at the bottom of this email.

The VOTE is open for at least the next 72 hours:

[ ] +1 accept Apache Atlas into the Apache Incubator
[ ] ±0 Abstain
[ ] -1 because...

Of course I am +1! (non-binding)

Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding)

Thanks
Amareshwari

On Fri, May 1, 2015 at 12:56 PM, Seetharam Venkatesh venkat...@innerzeal.com wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal

Also, the text of the latest wiki proposal is included at the bottom of this email.

The VOTE is open for at least the next 72 hours:

[ ] +1 accept Apache Atlas into the Apache Incubator
[ ] ±0 Abstain
[ ] -1 because...

Of course I am +1! (non-binding)

Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (non-binding)

Cheers
Ajay Yadava

On Tue, May 5, 2015 at 10:04 AM, Sharad Agarwal sha...@apache.org wrote:

+1 (binding)

On Fri, May 1, 2015 at 12:56 PM, Seetharam Venkatesh venkat...@innerzeal.com wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal

Also, the text of the latest wiki proposal is included at the bottom of this email.

The VOTE is open for at least the next 72 hours:

[ ] +1 accept Apache Atlas into the Apache Incubator
[ ] ±0 Abstain
[ ] -1 because...

Of course I am +1! (non-binding)

Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (non-binding) On 01/05/15 12:56 pm, Seetharam Venkatesh venkat...@innerzeal.com wrote: Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks! = Apache Atlas Proposal = == Abstract == Apache Atlas is a scalable and extensible set of core foundational governance services that enables enterprises to effectively and efficiently meet their compliance requirements within Hadoop and allows integration with the complete enterprise data ecosystem. == Proposal == Apache Atlas allows agnostic governance visibility into Hadoop, these abilities are enabled through a set of core foundational services powered by a flexible metadata repository. These services include: * Search and Lineage for datasets * Metadata driven data access control * Indexed and Searchable Centralized Auditing operational Events * Data lifecycle management ingestion to disposition * Metadata interchange with other metadata tools == Background == Hadoop is one of many platforms in the modern enterprise data ecosystem and requires governance controls commensurate with this reality. Currently, there is no easy or complete way to provide comprehensive visibility and control into Hadoop audit, lineage, and security for workflows that require Hadoop and non-Hadoop processing. Many solutions are usually point based, and require a monolithic application workflow. Multi-tenancy and concurrency are problematic as these offerings are not aware of activity outside of their narrow focus. 
As Hadoop gains greater popularity, governance concerns will become increasingly vital to increasing maturity and furthering adoption. It is a particular barrier to expanding enterprise data under management. == Rationale == Atlas will address issues previously discussed by providing governance capabilities in Hadoop -- using both a prescriptive and forensic model enriched by business taxonomical metadata.Atlas, at its core, is designed to exchange metadata with other tools and processes within and outside of the Hadoop stack -- enable governance controls that are truly platform agnostic and effectively (and defensibly) address compliance concerns. Initially working with a group of leading partners in several industries, Atlas is built to solve specific real world governance problems that accelerate product maturity and time to value. Atlas aims to grow a community to help build a widely adopted pattern for governance, metadata modeling and exchange in Hadoop which will advance the interests for the whole community. == Current Status == An initial version with a valuable set of features is developed by the list of initial committers and is hosted on github. === Meritocracy === Our intent with this proposal is to start building a diverse developer community around Atlas following the Apache meritocracy model. We have wanted to make the project open source and encourage contributors from multiple organizations from the start. We plan to provide plenty of support to new developers and to quickly recruit those who make solid contributions to committer status. === Community === We are happy to report that the initial team already represents multiple organizations. We hope to extend the user and developer base further in the future and build a solid open source community around Atlas. === Core Developers === Atlas development is currently being led by engineers from Hortonworks Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. 
All the engineers have deep expertise in Hadoop and are quite familiar with the Hadoop Ecosystem. === Alignment === The ASF is a natural host for Atlas given that it is already the home of Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big data” software projects. Atlas has been designed to solve the data governance challenges and opportunities of the Hadoop ecosystem family of products as well as integration with the traditional Enterprise Data ecosystem. Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas of data governance and compliance management. == Known Risks == === Orphaned products & Reliance on Salaried Developers === The core developers plan to work full time on the project. There is very little risk of Atlas getting orphaned. A prototype of Atlas is in use and being actively developed by several companies that have a vested interest in its continued vitality and adoption. === Inexperience with Open Source === Many of the core developers are PMC members and committers of Apache. Harish Butani is PMC
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding) @Venkatesh: I'm interested in helping there. If you want, don't hesitate to add me to the initial committer set. Regards JB On 05/01/2015 09:26 AM, Seetharam Venkatesh wrote: Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding) On 4 May 2015 at 13:29, Suresh Srinivas sur...@hortonworks.com wrote: +1 (binding). (Minor nit: Please use sur...@apache.org as my email address). Regards, Suresh From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam Venkatesh venkat...@innerzeal.com Sent: Friday, May 01, 2015 12:26 AM To: general@incubator.apache.org Subject: [VOTE] Accept Apache Atlas into Apache Incubator Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding) On Fri, May 1, 2015 at 12:56 PM, Seetharam Venkatesh venkat...@innerzeal.com wrote: Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding). (Minor nit: Please use sur...@apache.org as my email address). Regards, Suresh From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam Venkatesh venkat...@innerzeal.com Sent: Friday, May 01, 2015 12:26 AM To: general@incubator.apache.org Subject: [VOTE] Accept Apache Atlas into Apache Incubator Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding) thanks, Arun On May 1, 2015, at 12:26 AM, Seetharam Venkatesh venkat...@innerzeal.com wrote: Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
Sorry I missed the discussion thread for this proposed podling, the name for this project may have an issue with Netflix Atlas [1] when it comes time to graduate, may be worth the discussion of switching names if voted in before any infra resources are setup -Jake [1]: http://techblog.netflix.com/2014/12/introducing-atlas-netflixs-primary.html [2]: https://github.com/netflix/atlas On Fri, May 1, 2015 at 3:26 AM, Seetharam Venkatesh venkat...@innerzeal.com wrote: Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding). Alan. Seetharam Venkatesh venkat...@innerzeal.com May 1, 2015 at 0:26 Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 On 5/1/15 10:52 AM, Jitendra Pandey jiten...@hortonworks.com wrote: +1 Thanks Venkatesh for the proposal. Apache Atlas is a great addition to Apache Hadoop ecosystem, and it fills a significant void in the area of data governance in open source. From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam Venkatesh venkat...@innerzeal.com Sent: Friday, May 01, 2015 12:26 AM To: general@incubator.apache.org Subject: [VOTE] Accept Apache Atlas into Apache Incubator Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding) have fun jan i. On Friday, May 1, 2015, Dilli Arumugam darumu...@hortonworks.com wrote: +1 On 5/1/15 10:52 AM, Jitendra Pandey jiten...@hortonworks.com wrote: +1 Thanks Venkatesh for the proposal. Apache Atlas is a great addition to Apache Hadoop ecosystem, and it fills a significant void in the area of data governance in open source. From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam Venkatesh venkat...@innerzeal.com Sent: Friday, May 01, 2015 12:26 AM To: general@incubator.apache.org Subject: [VOTE] Accept Apache Atlas into Apache Incubator Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 (binding) -Taylor On May 1, 2015, at 3:26 AM, Seetharam Venkatesh venkat...@innerzeal.com wrote: Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
Re: [VOTE] Accept Apache Atlas into Apache Incubator
+1 Thanks Venkatesh for the proposal. Apache Atlas is a great addition to Apache Hadoop ecosystem, and it fills a significant void in the area of data governance in open source. From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam Venkatesh venkat...@innerzeal.com Sent: Friday, May 01, 2015 12:26 AM To: general@incubator.apache.org Subject: [VOTE] Accept Apache Atlas into Apache Incubator Hello folks, Following the discussion earlier in the thread: http://s.apache.org/r2 I would like to call a VOTE for accepting Apache Atlas as a new incubator project. The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal Also, the text of the latest wiki proposal is included at the bottom of this email. The VOTE is open for at least the next 72 hours: [ ] +1 accept Apache Atlas into the Apache Incubator [ ] ±0 Abstain [ ] -1 because... Of course I am +1! (non-binding) Thanks!
[VOTE] Accept Apache Atlas into Apache Incubator
Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator project.

The proposal is available at: https://wiki.apache.org/incubator/AtlasProposal

Also, the text of the latest wiki proposal is included at the bottom of this email.

The VOTE is open for at least the next 72 hours:

[ ] +1 accept Apache Atlas into the Apache Incubator
[ ] ±0 Abstain
[ ] -1 because...

Of course I am +1! (non-binding)

Thanks!

= Apache Atlas Proposal =

== Abstract ==
Apache Atlas is a scalable and extensible set of core foundational governance services that enables enterprises to effectively and efficiently meet their compliance requirements within Hadoop and allows integration with the complete enterprise data ecosystem.

== Proposal ==
Apache Atlas allows agnostic governance visibility into Hadoop; these abilities are enabled through a set of core foundational services powered by a flexible metadata repository. These services include:
 * Search and Lineage for datasets
 * Metadata driven data access control
 * Indexed and Searchable Centralized Auditing operational Events
 * Data lifecycle management – ingestion to disposition
 * Metadata interchange with other metadata tools

== Background ==
Hadoop is one of many platforms in the modern enterprise data ecosystem and requires governance controls commensurate with this reality. Currently, there is no easy or complete way to provide comprehensive visibility and control into Hadoop audit, lineage, and security for workflows that require Hadoop and non-Hadoop processing. Many solutions are point based and require a monolithic application workflow. Multi-tenancy and concurrency are problematic as these offerings are not aware of activity outside of their narrow focus. As Hadoop gains greater popularity, governance concerns will become increasingly vital to increasing maturity and furthering adoption. These concerns are a particular barrier to expanding enterprise data under management.

== Rationale ==
Atlas will address the issues discussed above by providing governance capabilities in Hadoop, using both a prescriptive and forensic model enriched by business taxonomical metadata. Atlas, at its core, is designed to exchange metadata with other tools and processes within and outside of the Hadoop stack, enabling governance controls that are truly platform agnostic and that effectively (and defensibly) address compliance concerns. Initially working with a group of leading partners in several industries, Atlas is built to solve specific real world governance problems that accelerate product maturity and time to value. Atlas aims to grow a community to help build a widely adopted pattern for governance, metadata modeling and exchange in Hadoop, which will advance the interests of the whole community.

== Current Status ==
An initial version with a valuable set of features has been developed by the list of initial committers and is hosted on GitHub.

=== Meritocracy ===
Our intent with this proposal is to start building a diverse developer community around Atlas following the Apache meritocracy model. We have wanted to make the project open source and encourage contributors from multiple organizations from the start. We plan to provide plenty of support to new developers and to quickly recruit those who make solid contributions to committer status.

=== Community ===
We are happy to report that the initial team already represents multiple organizations. We hope to extend the user and developer base further in the future and build a solid open source community around Atlas.

=== Core Developers ===
Atlas development is currently being led by engineers from Hortonworks – Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the engineers have deep expertise in Hadoop and are quite familiar with the Hadoop Ecosystem.

=== Alignment ===
The ASF is a natural host for Atlas given that it is already the home of Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big data” software projects. Atlas has been designed to solve the data governance challenges and opportunities of the Hadoop ecosystem family of products as well as integration with the traditional Enterprise Data ecosystem. Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas of data governance and compliance management.

== Known Risks ==

=== Orphaned products & Reliance on Salaried Developers ===
The core developers plan to work full time on the project. There is very little risk of Atlas getting orphaned. A prototype of Atlas is in use and being actively developed by several companies that have a vested interest in its continued vitality and adoption.

=== Inexperience with Open Source ===
Many of the core developers are PMC members and committers at Apache. Harish Butani is PMC on Apache Hive, Venkatesh Seetharam is PMC on Apache Falcon and Apache Knox, Shwetha GS is PMC