Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-08 Thread Seetharam Venkatesh
Hi Jake,

Sorry that I missed your comment, and sorry for the delay in my response. Thanks
for the heads-up; I will take this up in the podling name search JIRA.

Thanks!

On Sun, May 3, 2015 at 6:52 PM, Jake Farrell jfarr...@apache.org wrote:

 Sorry I missed the discussion thread for this proposed podling. The name
 for this project may have an issue with Netflix Atlas [1] when it comes
 time to graduate. It may be worth discussing a name change, if voted in,
 before any infra resources are set up.

 -Jake

 [1]: http://techblog.netflix.com/2014/12/introducing-atlas-netflixs-primary.html
 [2]: https://github.com/netflix/atlas


 On Fri, May 1, 2015 at 3:26 AM, Seetharam Venkatesh 
 venkat...@innerzeal.com
  wrote:

  Hello folks,
 
  Following the discussion earlier in the thread: http://s.apache.org/r2
 
  I would like to call a VOTE for accepting Apache Atlas as a new incubator
  project.
 
  The proposal is available at:
  https://wiki.apache.org/incubator/AtlasProposal
  Also, the text of the latest wiki proposal is included at the bottom of
  this email.
 
  The VOTE is open for at least the next 72 hours:
 
   [ ] +1 accept Apache Atlas into the Apache Incubator
   [ ] ±0 Abstain
   [ ] -1 because...
 
  Of course I am +1! (non-binding)
 
  Thanks!
 
 
  = Apache Atlas Proposal =
 
  == Abstract ==
 
  Apache Atlas is a scalable and extensible set of core foundational
  governance services that enables enterprises to effectively and efficiently
  meet their compliance requirements within Hadoop and allows integration
  with the complete enterprise data ecosystem.
 
  == Proposal ==
 
  Apache Atlas allows agnostic governance visibility into Hadoop. These
  abilities are enabled through a set of core foundational services powered
  by a flexible metadata repository.
 
  These services include:
 
   * Search and Lineage for datasets
   * Metadata driven data access control
   * Indexed and Searchable Centralized Auditing operational Events
   * Data lifecycle management – ingestion to disposition
   * Metadata interchange with other metadata tools
 
  == Background ==
 
  Hadoop is one of many platforms in the modern enterprise data ecosystem and
  requires governance controls commensurate with this reality.
 
  Currently, there is no easy or complete way to provide comprehensive
  visibility and control into Hadoop audit, lineage, and security for
  workflows that require Hadoop and non-Hadoop processing.
 
  Many solutions are usually point based, and require a monolithic
  application workflow.  Multi-tenancy and concurrency are problematic as
  these offerings are not aware of activity outside of their narrow focus.
 
  As Hadoop gains greater popularity, governance concerns will become
  increasingly vital to increasing maturity and furthering adoption. It is a
  particular barrier to expanding enterprise data under management.
 
  == Rationale ==
 
  Atlas will address issues previously discussed by providing governance
  capabilities in Hadoop -- using both a prescriptive and forensic model
  enriched by business taxonomical metadata. Atlas, at its core, is
  designed to exchange metadata with other tools and processes within and
  outside of the Hadoop stack -- enabling governance controls that are truly
  platform agnostic and effectively (and defensibly) address compliance
  concerns.
 
  Initially working with a group of leading partners in several industries,
  Atlas is built to solve specific real world governance problems that
  accelerate product maturity and time to value.
 
  Atlas aims to grow a community to help build a widely adopted pattern for
  governance, metadata modeling and exchange in Hadoop – which will advance
  the interests for the whole community.
 
  == Current Status ==
 
  An initial version with a valuable set of features has been developed by the
  list of initial committers and is hosted on GitHub.
 
  === Meritocracy ===
 
  Our intent with this proposal is to start building a diverse developer
  community around Atlas following the Apache meritocracy model. We have
  wanted to make the project open source and encourage contributors from
  multiple organizations from the start.
 
  We plan to provide plenty of support to new developers and to quickly
  recruit those who make solid contributions to committer status.
 
  === Community ===
 
  We are happy to report that the initial team already represents multiple
  organizations. We hope to extend the user and developer base further in the
  future and build a solid open source community around Atlas.
 
  === Core Developers ===
 
  Atlas development is currently being led by engineers from Hortonworks –
  Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
  engineers have deep expertise in Hadoop and are quite familiar with the
  Hadoop Ecosystem.
 
  === Alignment ===
 
  The ASF is a natural host for Atlas given that it is already the home of
  Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and 

[RESULT][VOTE] Accept Apache Atlas into Apache Incubator

2015-05-05 Thread Seetharam Venkatesh
Hi folks,

With nine +1 binding votes and six +1 non-binding votes (fifteen +1 votes
total), and no ±0 or -1 votes, this VOTE PASSES.

Thanks to all who voted! Here's a tally of +1 binding votes:

Jitendra Pandey
Alan Gates
Jan I
Taylor Goetz
Arun C Murthy
Jean-Baptiste Onofre
Suresh Srinivas
Sharad Agarwal
Amareshwari

Non-binding:
Dilli Arumugam
Shwetha Shivalingamurthy
Venkatesh Seetharam
Ajay Yadav
Venkat Ranganathan
Suma Shivaprasad

Thanks!

On Fri, May 1, 2015 at 12:26 AM, Seetharam Venkatesh 
venkat...@innerzeal.com wrote:

 Hello folks,

 Following the discussion earlier in the thread: http://s.apache.org/r2

 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.

 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.

 The VOTE is open for at least the next 72 hours:

  [ ] +1 accept Apache Atlas into the Apache Incubator
  [ ] ±0 Abstain
  [ ] -1 because...

 Of course I am +1! (non-binding)

 Thanks!


 = Apache Atlas Proposal =

 == Abstract ==

 Apache Atlas is a scalable and extensible set of core foundational
 governance services that enables enterprises to effectively and efficiently
 meet their compliance requirements within Hadoop and allows integration
 with the complete enterprise data ecosystem.

 == Proposal ==

 Apache Atlas allows agnostic governance visibility into Hadoop. These
 abilities are enabled through a set of core foundational services powered
 by a flexible metadata repository.

 These services include:

  * Search and Lineage for datasets
  * Metadata driven data access control
  * Indexed and Searchable Centralized Auditing operational Events
  * Data lifecycle management – ingestion to disposition
  * Metadata interchange with other metadata tools

 == Background ==

 Hadoop is one of many platforms in the modern enterprise data ecosystem
 and requires governance controls commensurate with this reality.

 Currently, there is no easy or complete way to provide comprehensive
 visibility and control into Hadoop audit, lineage, and security for
 workflows that require Hadoop and non-Hadoop processing.

 Many solutions are usually point based, and require a monolithic
 application workflow.  Multi-tenancy and concurrency are problematic as
 these offerings are not aware of activity outside of their narrow focus.

 As Hadoop gains greater popularity, governance concerns will become
 increasingly vital to increasing maturity and furthering adoption. It is a
 particular barrier to expanding enterprise data under management.

 == Rationale ==

 Atlas will address issues previously discussed by providing governance
 capabilities in Hadoop -- using both a prescriptive and forensic model
 enriched by business taxonomical metadata. Atlas, at its core, is
 designed to exchange metadata with other tools and processes within and
 outside of the Hadoop stack -- enabling governance controls that are truly
 platform agnostic and effectively (and defensibly) address compliance
 concerns.

 Initially working with a group of leading partners in several industries,
 Atlas is built to solve specific real world governance problems that
 accelerate product maturity and time to value.

 Atlas aims to grow a community to help build a widely adopted pattern for
 governance, metadata modeling and exchange in Hadoop – which will advance
 the interests for the whole community.

 == Current Status ==

 An initial version with a valuable set of features has been developed by the
 list of initial committers and is hosted on GitHub.

 === Meritocracy ===

 Our intent with this proposal is to start building a diverse developer
 community around Atlas following the Apache meritocracy model. We have
 wanted to make the project open source and encourage contributors from
 multiple organizations from the start.

 We plan to provide plenty of support to new developers and to quickly
 recruit those who make solid contributions to committer status.

 === Community ===

 We are happy to report that the initial team already represents multiple
 organizations. We hope to extend the user and developer base further in the
 future and build a solid open source community around Atlas.

 === Core Developers ===

 Atlas development is currently being led by engineers from Hortonworks –
 Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
 engineers have deep expertise in Hadoop and are quite familiar with the
 Hadoop Ecosystem.

 === Alignment ===

 The ASF is a natural host for Atlas given that it is already the home of
 Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
 data” software projects.

 Atlas has been designed to solve the data governance challenges and
 opportunities of the Hadoop ecosystem family of products as well as
 integration with the traditional Enterprise Data ecosystem.

 Atlas fills the gap that the Hadoop Ecosystem has been 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-05 Thread Venkat Ranganathan
+1 (non-binding)

Minor nit: Please update my email address to venkatran...@apache.org

Thanks

Venkat



On 5/1/15, 12:26 AM, Seetharam Venkatesh venkat...@innerzeal.com wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/AtlasProposal
Also, the text of the latest wiki proposal is included at the bottom of
this email.

The VOTE is open for at least the next 72 hours:

 [ ] +1 accept Apache Atlas into the Apache Incubator
 [ ] ±0 Abstain
 [ ] -1 because...

Of course I am +1! (non-binding)

Thanks!


= Apache Atlas Proposal =

== Abstract ==

Apache Atlas is a scalable and extensible set of core foundational
governance services that enables enterprises to effectively and efficiently
meet their compliance requirements within Hadoop and allows integration
with the complete enterprise data ecosystem.

== Proposal ==

Apache Atlas allows agnostic governance visibility into Hadoop. These
abilities are enabled through a set of core foundational services powered
by a flexible metadata repository.

These services include:

 * Search and Lineage for datasets
 * Metadata driven data access control
 * Indexed and Searchable Centralized Auditing operational Events
 * Data lifecycle management – ingestion to disposition
 * Metadata interchange with other metadata tools

== Background ==

Hadoop is one of many platforms in the modern enterprise data ecosystem and
requires governance controls commensurate with this reality.

Currently, there is no easy or complete way to provide comprehensive
visibility and control into Hadoop audit, lineage, and security for
workflows that require Hadoop and non-Hadoop processing.

Many solutions are usually point based, and require a monolithic
application workflow.  Multi-tenancy and concurrency are problematic as
these offerings are not aware of activity outside of their narrow focus.

As Hadoop gains greater popularity, governance concerns will become
increasingly vital to increasing maturity and furthering adoption. It is a
particular barrier to expanding enterprise data under management.

== Rationale ==

Atlas will address issues previously discussed by providing governance
capabilities in Hadoop -- using both a prescriptive and forensic model
enriched by business taxonomical metadata. Atlas, at its core, is
designed to exchange metadata with other tools and processes within and
outside of the Hadoop stack -- enabling governance controls that are truly
platform agnostic and effectively (and defensibly) address compliance
concerns.

Initially working with a group of leading partners in several industries,
Atlas is built to solve specific real world governance problems that
accelerate product maturity and time to value.

Atlas aims to grow a community to help build a widely adopted pattern for
governance, metadata modeling and exchange in Hadoop – which will advance
the interests for the whole community.

== Current Status ==

An initial version with a valuable set of features has been developed by the
list of initial committers and is hosted on GitHub.

=== Meritocracy ===

Our intent with this proposal is to start building a diverse developer
community around Atlas following the Apache meritocracy model. We have
wanted to make the project open source and encourage contributors from
multiple organizations from the start.

We plan to provide plenty of support to new developers and to quickly
recruit those who make solid contributions to committer status.

=== Community ===

We are happy to report that the initial team already represents multiple
organizations. We hope to extend the user and developer base further in the
future and build a solid open source community around Atlas.

=== Core Developers ===

Atlas development is currently being led by engineers from Hortonworks –
Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
engineers have deep expertise in Hadoop and are quite familiar with the
Hadoop Ecosystem.

=== Alignment ===

The ASF is a natural host for Atlas given that it is already the home of
Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
data” software projects.

Atlas has been designed to solve the data governance challenges and
opportunities of the Hadoop ecosystem family of products as well as
integration with the traditional Enterprise Data ecosystem.

Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
of data governance and compliance management.

== Known Risks ==

=== Orphaned products & Reliance on Salaried Developers ===
The core developers plan to work full time on the project. There is very
little risk of Atlas getting orphaned.  A prototype of Atlas is in use and
being actively developed by several companies that have a vested interest in
its continued vitality and adoption.

=== Inexperience with Open Source 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-05 Thread Suma Shivaprasad
+1 (non-binding)

Thanks
Suma

On Tue, May 5, 2015 at 8:23 PM, Venkat Ranganathan 
vranganat...@hortonworks.com wrote:

 +1 (non-binding)

 Minor nit: Please update my email address to venkatran...@apache.org

 Thanks

 Venkat



 On 5/1/15, 12:26 AM, Seetharam Venkatesh venkat...@innerzeal.com
 wrote:

 Hello folks,
 
 Following the discussion earlier in the thread: http://s.apache.org/r2
 
 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.
 
 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.
 
 The VOTE is open for at least the next 72 hours:
 
  [ ] +1 accept Apache Atlas into the Apache Incubator
  [ ] ±0 Abstain
  [ ] -1 because...
 
 Of course I am +1! (non-binding)
 
 Thanks!
 
 
 = Apache Atlas Proposal =
 
 == Abstract ==
 
 Apache Atlas is a scalable and extensible set of core foundational
 governance services that enables enterprises to effectively and efficiently
 meet their compliance requirements within Hadoop and allows integration
 with the complete enterprise data ecosystem.
 
 == Proposal ==
 
 Apache Atlas allows agnostic governance visibility into Hadoop. These
 abilities are enabled through a set of core foundational services powered
 by a flexible metadata repository.
 
 These services include:
 
  * Search and Lineage for datasets
  * Metadata driven data access control
  * Indexed and Searchable Centralized Auditing operational Events
  * Data lifecycle management – ingestion to disposition
  * Metadata interchange with other metadata tools
 
 == Background ==
 
 Hadoop is one of many platforms in the modern enterprise data ecosystem and
 requires governance controls commensurate with this reality.
 
 Currently, there is no easy or complete way to provide comprehensive
 visibility and control into Hadoop audit, lineage, and security for
 workflows that require Hadoop and non-Hadoop processing.
 
 Many solutions are usually point based, and require a monolithic
 application workflow.  Multi-tenancy and concurrency are problematic as
 these offerings are not aware of activity outside of their narrow focus.
 
 As Hadoop gains greater popularity, governance concerns will become
 increasingly vital to increasing maturity and furthering adoption. It is a
 particular barrier to expanding enterprise data under management.
 
 == Rationale ==
 
 Atlas will address issues previously discussed by providing governance
 capabilities in Hadoop -- using both a prescriptive and forensic model
 enriched by business taxonomical metadata. Atlas, at its core, is
 designed to exchange metadata with other tools and processes within and
 outside of the Hadoop stack -- enabling governance controls that are truly
 platform agnostic and effectively (and defensibly) address compliance
 concerns.
 
 Initially working with a group of leading partners in several industries,
 Atlas is built to solve specific real world governance problems that
 accelerate product maturity and time to value.
 
 Atlas aims to grow a community to help build a widely adopted pattern for
 governance, metadata modeling and exchange in Hadoop – which will advance
 the interests for the whole community.
 
 == Current Status ==
 
 An initial version with a valuable set of features has been developed by the
 list of initial committers and is hosted on GitHub.
 
 === Meritocracy ===
 
 Our intent with this proposal is to start building a diverse developer
 community around Atlas following the Apache meritocracy model. We have
 wanted to make the project open source and encourage contributors from
 multiple organizations from the start.
 
 We plan to provide plenty of support to new developers and to quickly
 recruit those who make solid contributions to committer status.
 
 === Community ===
 
 We are happy to report that the initial team already represents multiple
 organizations. We hope to extend the user and developer base further in the
 future and build a solid open source community around Atlas.
 
 === Core Developers ===
 
 Atlas development is currently being led by engineers from Hortonworks –
 Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
 engineers have deep expertise in Hadoop and are quite familiar with the
 Hadoop Ecosystem.
 
 === Alignment ===
 
 The ASF is a natural host for Atlas given that it is already the home of
 Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
 data” software projects.
 
 Atlas has been designed to solve the data governance challenges and
 opportunities of the Hadoop ecosystem family of products as well as
 integration with the traditional Enterprise Data ecosystem.
 
 Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
 of data governance and compliance management.
 
 == Known Risks ==
 
 === Orphaned products & Reliance on Salaried Developers ===
 The core developers plan to work full 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-05 Thread amareshwarisr .
+1 (binding)

Thanks
Amareshwari

On Fri, May 1, 2015 at 12:56 PM, Seetharam Venkatesh 
venkat...@innerzeal.com wrote:

 Hello folks,

 Following the discussion earlier in the thread: http://s.apache.org/r2

 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.

 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.

 The VOTE is open for at least the next 72 hours:

  [ ] +1 accept Apache Atlas into the Apache Incubator
  [ ] ±0 Abstain
  [ ] -1 because...

 Of course I am +1! (non-binding)

 Thanks!


 = Apache Atlas Proposal =

 == Abstract ==

 Apache Atlas is a scalable and extensible set of core foundational
 governance services that enables enterprises to effectively and efficiently
 meet their compliance requirements within Hadoop and allows integration
 with the complete enterprise data ecosystem.

 == Proposal ==

 Apache Atlas allows agnostic governance visibility into Hadoop. These
 abilities are enabled through a set of core foundational services powered
 by a flexible metadata repository.

 These services include:

  * Search and Lineage for datasets
  * Metadata driven data access control
  * Indexed and Searchable Centralized Auditing operational Events
  * Data lifecycle management – ingestion to disposition
  * Metadata interchange with other metadata tools

 == Background ==

 Hadoop is one of many platforms in the modern enterprise data ecosystem and
 requires governance controls commensurate with this reality.

 Currently, there is no easy or complete way to provide comprehensive
 visibility and control into Hadoop audit, lineage, and security for
 workflows that require Hadoop and non-Hadoop processing.

 Many solutions are usually point based, and require a monolithic
 application workflow.  Multi-tenancy and concurrency are problematic as
 these offerings are not aware of activity outside of their narrow focus.

 As Hadoop gains greater popularity, governance concerns will become
 increasingly vital to increasing maturity and furthering adoption. It is a
 particular barrier to expanding enterprise data under management.

 == Rationale ==

 Atlas will address issues previously discussed by providing governance
 capabilities in Hadoop -- using both a prescriptive and forensic model
 enriched by business taxonomical metadata. Atlas, at its core, is
 designed to exchange metadata with other tools and processes within and
 outside of the Hadoop stack -- enabling governance controls that are truly
 platform agnostic and effectively (and defensibly) address compliance
 concerns.

 Initially working with a group of leading partners in several industries,
 Atlas is built to solve specific real world governance problems that
 accelerate product maturity and time to value.

 Atlas aims to grow a community to help build a widely adopted pattern for
 governance, metadata modeling and exchange in Hadoop – which will advance
 the interests for the whole community.

 == Current Status ==

 An initial version with a valuable set of features has been developed by the
 list of initial committers and is hosted on GitHub.

 === Meritocracy ===

 Our intent with this proposal is to start building a diverse developer
 community around Atlas following the Apache meritocracy model. We have
 wanted to make the project open source and encourage contributors from
 multiple organizations from the start.

 We plan to provide plenty of support to new developers and to quickly
 recruit those who make solid contributions to committer status.

 === Community ===

 We are happy to report that the initial team already represents multiple
 organizations. We hope to extend the user and developer base further in the
 future and build a solid open source community around Atlas.

 === Core Developers ===

 Atlas development is currently being led by engineers from Hortonworks –
 Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
 engineers have deep expertise in Hadoop and are quite familiar with the
 Hadoop Ecosystem.

 === Alignment ===

 The ASF is a natural host for Atlas given that it is already the home of
 Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
 data” software projects.

 Atlas has been designed to solve the data governance challenges and
 opportunities of the Hadoop ecosystem family of products as well as
 integration with the traditional Enterprise Data ecosystem.

 Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
 of data governance and compliance management.

 == Known Risks ==

 === Orphaned products & Reliance on Salaried Developers ===
 The core developers plan to work full time on the project. There is very
 little risk of Atlas getting orphaned.  A prototype of Atlas is in use and
 being actively developed by several companies that have a vested interest in
 its continued vitality and adoption.

 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-05 Thread Ajay Yadav
+1 (non-binding)

Cheers
Ajay Yadava

On Tue, May 5, 2015 at 10:04 AM, Sharad Agarwal sha...@apache.org wrote:

 +1 (binding)

 On Fri, May 1, 2015 at 12:56 PM, Seetharam Venkatesh 
 venkat...@innerzeal.com wrote:

  Hello folks,
 
  Following the discussion earlier in the thread: http://s.apache.org/r2
 
  I would like to call a VOTE for accepting Apache Atlas as a new incubator
  project.
 
  The proposal is available at:
  https://wiki.apache.org/incubator/AtlasProposal
  Also, the text of the latest wiki proposal is included at the bottom of
  this email.
 
  The VOTE is open for at least the next 72 hours:
 
   [ ] +1 accept Apache Atlas into the Apache Incubator
   [ ] ±0 Abstain
   [ ] -1 because...
 
  Of course I am +1! (non-binding)
 
  Thanks!
 
 
  = Apache Atlas Proposal =
 
  == Abstract ==
 
  Apache Atlas is a scalable and extensible set of core foundational
  governance services that enables enterprises to effectively and efficiently
  meet their compliance requirements within Hadoop and allows integration
  with the complete enterprise data ecosystem.
 
  == Proposal ==
 
  Apache Atlas allows agnostic governance visibility into Hadoop. These
  abilities are enabled through a set of core foundational services powered
  by a flexible metadata repository.
 
  These services include:
 
   * Search and Lineage for datasets
   * Metadata driven data access control
   * Indexed and Searchable Centralized Auditing operational Events
   * Data lifecycle management – ingestion to disposition
   * Metadata interchange with other metadata tools
 
  == Background ==
 
  Hadoop is one of many platforms in the modern enterprise data ecosystem and
  requires governance controls commensurate with this reality.
 
  Currently, there is no easy or complete way to provide comprehensive
  visibility and control into Hadoop audit, lineage, and security for
  workflows that require Hadoop and non-Hadoop processing.
 
  Many solutions are usually point based, and require a monolithic
  application workflow.  Multi-tenancy and concurrency are problematic as
  these offerings are not aware of activity outside of their narrow focus.
 
  As Hadoop gains greater popularity, governance concerns will become
  increasingly vital to increasing maturity and furthering adoption. It is a
  particular barrier to expanding enterprise data under management.
 
  == Rationale ==
 
  Atlas will address issues previously discussed by providing governance
  capabilities in Hadoop -- using both a prescriptive and forensic model
  enriched by business taxonomical metadata. Atlas, at its core, is
  designed to exchange metadata with other tools and processes within and
  outside of the Hadoop stack -- enabling governance controls that are truly
  platform agnostic and effectively (and defensibly) address compliance
  concerns.
 
  Initially working with a group of leading partners in several industries,
  Atlas is built to solve specific real world governance problems that
  accelerate product maturity and time to value.
 
  Atlas aims to grow a community to help build a widely adopted pattern for
  governance, metadata modeling and exchange in Hadoop – which will advance
  the interests for the whole community.
 
  == Current Status ==
 
  An initial version with a valuable set of features has been developed by the
  list of initial committers and is hosted on GitHub.
 
  === Meritocracy ===
 
  Our intent with this proposal is to start building a diverse developer
  community around Atlas following the Apache meritocracy model. We have
  wanted to make the project open source and encourage contributors from
  multiple organizations from the start.
 
  We plan to provide plenty of support to new developers and to quickly
  recruit those who make solid contributions to committer status.
 
  === Community ===
 
  We are happy to report that the initial team already represents multiple
  organizations. We hope to extend the user and developer base further in the
  future and build a solid open source community around Atlas.
 
  === Core Developers ===
 
  Atlas development is currently being led by engineers from Hortonworks –
  Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
  engineers have deep expertise in Hadoop and are quite familiar with the
  Hadoop Ecosystem.
 
  === Alignment ===
 
  The ASF is a natural host for Atlas given that it is already the home of
  Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
  data” software projects.
 
  Atlas has been designed to solve the data governance challenges and
  opportunities of the Hadoop ecosystem family of products as well as
  integration with the traditional Enterprise Data ecosystem.
 
  Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
  of data governance and compliance management.
 
  == Known Risks ==
 
  === Orphaned products & Reliance on Salaried Developers ===
  The core developers plan to work full 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-04 Thread Shwetha Shivalingamurthy
+1 (non-binding)

On 01/05/15 12:56 pm, Seetharam Venkatesh venkat...@innerzeal.com
wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/AtlasProposal
Also, the text of the latest wiki proposal is included at the bottom of
this email.

The VOTE is open for at least the next 72 hours:

 [ ] +1 accept Apache Atlas into the Apache Incubator
 [ ] ±0 Abstain
 [ ] -1 because...

Of course I am +1! (non-binding)

Thanks!


= Apache Atlas Proposal =

== Abstract ==

Apache Atlas is a scalable and extensible set of core foundational
governance services that enables enterprises to effectively and efficiently
meet their compliance requirements within Hadoop and allows integration
with the complete enterprise data ecosystem.

== Proposal ==

Apache Atlas allows agnostic governance visibility into Hadoop. These
abilities are enabled through a set of core foundational services powered
by a flexible metadata repository.

These services include:

 * Search and Lineage for datasets
 * Metadata driven data access control
 * Indexed and Searchable Centralized Auditing operational Events
 * Data lifecycle management – ingestion to disposition
 * Metadata interchange with other metadata tools

== Background ==

Hadoop is one of many platforms in the modern enterprise data ecosystem and
requires governance controls commensurate with this reality.

Currently, there is no easy or complete way to provide comprehensive
visibility and control into Hadoop audit, lineage, and security for
workflows that require Hadoop and non-Hadoop processing.

Many solutions are usually point based, and require a monolithic
application workflow.  Multi-tenancy and concurrency are problematic as
these offerings are not aware of activity outside of their narrow focus.

As Hadoop gains greater popularity, governance concerns will become
increasingly vital to increasing maturity and furthering adoption. It is a
particular barrier to expanding enterprise data under management.

== Rationale ==

Atlas will address the issues discussed above by providing governance
capabilities in Hadoop -- using both a prescriptive and forensic model
enriched by business taxonomical metadata. Atlas, at its core, is
designed to exchange metadata with other tools and processes within and
outside of the Hadoop stack -- enabling governance controls that are truly
platform agnostic and that effectively (and defensibly) address compliance
concerns.

Initially working with a group of leading partners in several industries,
Atlas is built to solve specific real world governance problems that
accelerate product maturity and time to value.

Atlas aims to grow a community to help build a widely adopted pattern for
governance, metadata modeling and exchange in Hadoop – which will advance
the interests of the whole community.

== Current Status ==

An initial version with a valuable set of features has been developed by the
list of initial committers and is hosted on GitHub.

=== Meritocracy ===

Our intent with this proposal is to start building a diverse developer
community around Atlas following the Apache meritocracy model. We have
wanted to make the project open source and encourage contributors from
multiple organizations from the start.

We plan to provide plenty of support to new developers and to quickly
recruit those who make solid contributions to committer status.

=== Community ===

We are happy to report that the initial team already represents multiple
organizations. We hope to extend the user and developer base further in the
future and build a solid open source community around Atlas.

=== Core Developers ===

Atlas development is currently being led by engineers from Hortonworks –
Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
engineers have deep expertise in Hadoop and are quite familiar with the
Hadoop Ecosystem.

=== Alignment ===

The ASF is a natural host for Atlas given that it is already the home of
Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
data” software projects.

Atlas has been designed to solve the data governance challenges and
opportunities of the Hadoop ecosystem family of products as well as
integration with the traditional Enterprise Data ecosystem.

Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
of data governance and compliance management.

== Known Risks ==

=== Orphaned products & Reliance on Salaried Developers ===
The core developers plan to work full time on the project. There is very
little risk of Atlas getting orphaned. A prototype of Atlas is in use and
being actively developed by several companies, which have a vested interest
in its continued vitality and adoption.

=== Inexperience with Open Source ===
Many of the core developers are PMC members and committers of Apache. Harish
Butani is PMC 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-04 Thread Jean-Baptiste Onofré

+1 (binding)

@Venkatesh: I'm interested in helping here. If you want, don't hesitate 
to add me to the initial committer set.


Regards
JB

On 05/01/2015 09:26 AM, Seetharam Venkatesh wrote:

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/AtlasProposal
Also, the text of the latest wiki proposal is included at the bottom of
this email.

The VOTE is open for at least the next 72 hours:

  [ ] +1 accept Apache Atlas into the Apache Incubator
  [ ] ±0 Abstain
  [ ] -1 because...

Of course I am +1! (non-binding)

Thanks!



Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-04 Thread Jakob Homan
+1 (binding)

On 4 May 2015 at 13:29, Suresh Srinivas sur...@hortonworks.com wrote:
 +1 (binding).

 (Minor nit: Please use sur...@apache.org as my email address).

 Regards,
 Suresh
 
 From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam 
 Venkatesh venkat...@innerzeal.com
 Sent: Friday, May 01, 2015 12:26 AM
 To: general@incubator.apache.org
 Subject: [VOTE] Accept Apache Atlas into Apache Incubator

 Hello folks,

 Following the discussion earlier in the thread: http://s.apache.org/r2

 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.

 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.

 The VOTE is open for at least the next 72 hours:

  [ ] +1 accept Apache Atlas into the Apache Incubator
  [ ] ±0 Abstain
  [ ] -1 because...

 Of course I am +1! (non-binding)

 Thanks!



Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-04 Thread Sharad Agarwal
+1 (binding)

On Fri, May 1, 2015 at 12:56 PM, Seetharam Venkatesh 
venkat...@innerzeal.com wrote:

 Hello folks,

 Following the discussion earlier in the thread: http://s.apache.org/r2

 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.

 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.

 The VOTE is open for at least the next 72 hours:

  [ ] +1 accept Apache Atlas into the Apache Incubator
  [ ] ±0 Abstain
  [ ] -1 because...

 Of course I am +1! (non-binding)

 Thanks!



Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-04 Thread Suresh Srinivas
+1 (binding). 

(Minor nit: Please use sur...@apache.org as my email address).

Regards,
Suresh

From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam 
Venkatesh venkat...@innerzeal.com
Sent: Friday, May 01, 2015 12:26 AM
To: general@incubator.apache.org
Subject: [VOTE] Accept Apache Atlas into Apache Incubator

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/AtlasProposal
Also, the text of the latest wiki proposal is included at the bottom of
this email.

The VOTE is open for at least the next 72 hours:

 [ ] +1 accept Apache Atlas into the Apache Incubator
 [ ] ±0 Abstain
 [ ] -1 because...

Of course I am +1! (non-binding)

Thanks!



Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-03 Thread Arun C Murthy
+1 (binding)

thanks,
Arun

On May 1, 2015, at 12:26 AM, Seetharam Venkatesh venkat...@innerzeal.com 
wrote:

 Hello folks,
 
 Following the discussion earlier in the thread: http://s.apache.org/r2
 
 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.
 
 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.
 
 The VOTE is open for at least the next 72 hours:
 
 [ ] +1 accept Apache Atlas into the Apache Incubator
 [ ] ±0 Abstain
 [ ] -1 because...
 
 Of course I am +1! (non-binding)
 
 Thanks!
 
 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-03 Thread Jake Farrell
Sorry I missed the discussion thread for this proposed podling. The name
for this project may have an issue with Netflix Atlas [1] when it comes
time to graduate. It may be worth discussing a name change, if voted
in, before any infra resources are set up.

-Jake

[1]: http://techblog.netflix.com/2014/12/introducing-atlas-netflixs-primary.html
[2]: https://github.com/netflix/atlas


On Fri, May 1, 2015 at 3:26 AM, Seetharam Venkatesh venkat...@innerzeal.com
 wrote:

 Hello folks,

 Following the discussion earlier in the thread: http://s.apache.org/r2

 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.

 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.

 The VOTE is open for at least the next 72 hours:

  [ ] +1 accept Apache Atlas into the Apache Incubator
  [ ] ±0 Abstain
  [ ] -1 because...

 Of course I am +1! (non-binding)

 Thanks!


 = Apache Atlas Proposal =

 == Abstract ==

 Apache Atlas is a scalable and extensible set of core foundational
 governance services that enables enterprises to effectively and efficiently
 meet their compliance requirements within Hadoop and allows integration
 with the complete enterprise data ecosystem.

 == Proposal ==

 Apache Atlas allows agnostic governance visibility into Hadoop, these
 abilities are enabled through a set of core foundational services powered
 by a flexible metadata repository.

 These services include:

  * Search and Lineage for datasets
  * Metadata driven data access control
  * Indexed and Searchable Centralized Auditing operational Events
  * Data lifecycle management – ingestion to disposition
  * Metadata interchange with other metadata tools

 == Background ==

 Hadoop is one of many platforms in the modern enterprise data ecosystem and
 requires governance controls commensurate with this reality.

 Currently, there is no easy or complete way to provide comprehensive
 visibility and control into Hadoop audit, lineage, and security for
 workflows that require Hadoop and non-Hadoop processing.

 Many solutions are usually point based, and require a monolithic
 application workflow.  Multi-tenancy and concurrency are problematic as
 these offerings are not aware of activity outside of their narrow focus.

 As Hadoop gains greater popularity, governance concerns will become
 increasingly vital to increasing maturity and furthering adoption. It is a
 particular barrier to expanding enterprise data under management.

 == Rationale ==

 Atlas will address issues previously discussed by providing governance
 capabilities in Hadoop -- using both a prescriptive and forensic model
 enriched by business taxonomical metadata.Atlas, at its core, is
 designed to exchange metadata with other tools and processes within and
 outside of the Hadoop stack -- enable governance controls that are truly
 platform agnostic and effectively (and defensibly) address compliance
 concerns.

 Initially working with a group of leading partners in several industries,
 Atlas is built to solve specific real world governance problems that
 accelerate product maturity and time to value.

 Atlas aims to grow a community to help build a widely adopted pattern for
 governance, metadata modeling and exchange in Hadoop – which will advance
 the interests for the whole community.

 == Current Status ==

 An initial version with a valuable set of features is developed by the list
 of initial committers and is hosted on github.

 === Meritocracy ===

 Our intent with this proposal is to start building a diverse  developer
 community around Atlas following the Apache meritocracy model. We have
 wanted to make the project open source and encourage contributors from
 multiple organizations from the start.

 We plan to provide plenty of support to new developers and to quickly
 recruit those who make solid contributions to committer status.

 === Community ===

 We are happy to report that the initial team already represents multiple
 organizations. We hope to extend the user and developer base further in the
 future and build a solid open source community around Atlas.

 === Core Developers ===

 Atlas development is currently being led by engineers from Hortonworks –
 Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
 engineers have deep expertise in Hadoop and are quite familiar with the
 Hadoop Ecosystem.

 === Alignment ===

 The ASF is a natural host for Atlas given that it is already the home of
 Hadoop, Falcon, Hive,  Pig, Oozie, Knox, Ranger, and other emerging “big
 data” software projects.

 Atlas has been designed to solve the data governance challenges and
 opportunities of the Hadoop ecosystem family of products as well as
 integration to the traditional Enterprise Data ecosystem.

 Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
 of data governance and compliance 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-02 Thread Alan Gates

+1 (binding).

Alan.


Seetharam Venkatesh venkat...@innerzeal.com
May 1, 2015 at 0:26
Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/AtlasProposal
Also, the text of the latest wiki proposal is included at the bottom of
this email.

The VOTE is open for at least the next 72 hours:

[ ] +1 accept Apache Atlas into the Apache Incubator
[ ] ±0 Abstain
[ ] -1 because...

Of course I am +1! (non-binding)

Thanks!


= Apache Atlas Proposal =

== Abstract ==

Apache Atlas is a scalable and extensible set of core foundational
governance services that enables enterprises to effectively and efficiently
meet their compliance requirements within Hadoop and allows integration
with the complete enterprise data ecosystem.

== Proposal ==

Apache Atlas allows agnostic governance visibility into Hadoop; these
abilities are enabled through a set of core foundational services powered
by a flexible metadata repository.

These services include:

* Search and Lineage for datasets
* Metadata driven data access control
* Indexed and Searchable Centralized Auditing operational Events
* Data lifecycle management – ingestion to disposition
* Metadata interchange with other metadata tools

== Background ==

Hadoop is one of many platforms in the modern enterprise data ecosystem and
requires governance controls commensurate with this reality.

Currently, there is no easy or complete way to provide comprehensive
visibility and control into Hadoop audit, lineage, and security for
workflows that require Hadoop and non-Hadoop processing.

Many solutions are usually point based, and require a monolithic
application workflow. Multi-tenancy and concurrency are problematic as
these offerings are not aware of activity outside of their narrow focus.

As Hadoop gains greater popularity, governance concerns will become
increasingly vital to increasing maturity and furthering adoption. It is a
particular barrier to expanding enterprise data under management.

== Rationale ==

Atlas will address issues previously discussed by providing governance
capabilities in Hadoop -- using both a prescriptive and forensic model
enriched by business taxonomical metadata. Atlas, at its core, is
designed to exchange metadata with other tools and processes within and
outside of the Hadoop stack -- enable governance controls that are truly
platform agnostic and effectively (and defensibly) address compliance
concerns.

Initially working with a group of leading partners in several industries,
Atlas is built to solve specific real world governance problems that
accelerate product maturity and time to value.

Atlas aims to grow a community to help build a widely adopted pattern for
governance, metadata modeling and exchange in Hadoop – which will advance
the interests for the whole community.

== Current Status ==

An initial version with a valuable set of features is developed by the list
of initial committers and is hosted on github.

=== Meritocracy ===

Our intent with this proposal is to start building a diverse developer
community around Atlas following the Apache meritocracy model. We have
wanted to make the project open source and encourage contributors from
multiple organizations from the start.

We plan to provide plenty of support to new developers and to quickly
recruit those who make solid contributions to committer status.

=== Community ===

We are happy to report that the initial team already represents multiple
organizations. We hope to extend the user and developer base further in the
future and build a solid open source community around Atlas.

=== Core Developers ===

Atlas development is currently being led by engineers from Hortonworks –
Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
engineers have deep expertise in Hadoop and are quite familiar with the
Hadoop Ecosystem.

=== Alignment ===

The ASF is a natural host for Atlas given that it is already the home of
Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
data” software projects.

Atlas has been designed to solve the data governance challenges and
opportunities of the Hadoop ecosystem family of products as well as
integration to the traditional Enterprise Data ecosystem.

Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
of data governance and compliance management.

== Known Risks ==

=== Orphaned products & Reliance on Salaried Developers ===
The core developers plan to work full time on the project. There is very
little risk of Atlas getting orphaned. A prototype of Atlas is in use and
being actively developed by several companies that have a vested interest in
its continued vitality and adoption.

=== Inexperience with Open Source ===
Many of the core developers are Apache PMC members and committers. Harish
Butani is PMC

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-02 Thread Dilli Arumugam
+1

On 5/1/15 10:52 AM, Jitendra Pandey jiten...@hortonworks.com wrote:

+1

Thanks Venkatesh for the proposal. Apache Atlas is a great addition to
Apache Hadoop ecosystem, and it fills a significant void in the area of
data governance in open source.

From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam
Venkatesh venkat...@innerzeal.com
Sent: Friday, May 01, 2015 12:26 AM
To: general@incubator.apache.org
Subject: [VOTE] Accept Apache Atlas into Apache Incubator

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/AtlasProposal
Also, the text of the latest wiki proposal is included at the bottom of
this email.

The VOTE is open for at least the next 72 hours:

 [ ] +1 accept Apache Atlas into the Apache Incubator
 [ ] ±0 Abstain
 [ ] -1 because...

Of course I am +1! (non-binding)

Thanks!


= Apache Atlas Proposal =

== Abstract ==

Apache Atlas is a scalable and extensible set of core foundational
governance services that enables enterprises to effectively and efficiently
meet their compliance requirements within Hadoop and allows integration
with the complete enterprise data ecosystem.

== Proposal ==

Apache Atlas allows agnostic governance visibility into Hadoop; these
abilities are enabled through a set of core foundational services powered
by a flexible metadata repository.

These services include:

 * Search and Lineage for datasets
 * Metadata driven data access control
 * Indexed and Searchable Centralized Auditing operational Events
 * Data lifecycle management – ingestion to disposition
 * Metadata interchange with other metadata tools

== Background ==

Hadoop is one of many platforms in the modern enterprise data ecosystem and
requires governance controls commensurate with this reality.

Currently, there is no easy or complete way to provide comprehensive
visibility and control into Hadoop audit, lineage, and security for
workflows that require Hadoop and non-Hadoop processing.

Many solutions are usually point based, and require a monolithic
application workflow.  Multi-tenancy and concurrency are problematic as
these offerings are not aware of activity outside of their narrow focus.

As Hadoop gains greater popularity, governance concerns will become
increasingly vital to increasing maturity and furthering adoption. It is a
particular barrier to expanding enterprise data under management.

== Rationale ==

Atlas will address issues previously discussed by providing governance
capabilities in Hadoop -- using both a prescriptive and forensic model
enriched by business taxonomical metadata. Atlas, at its core, is
designed to exchange metadata with other tools and processes within and
outside of the Hadoop stack -- enable governance controls that are truly
platform agnostic and effectively (and defensibly) address compliance
concerns.

Initially working with a group of leading partners in several industries,
Atlas is built to solve specific real world governance problems that
accelerate product maturity and time to value.

Atlas aims to grow a community to help build a widely adopted pattern for
governance, metadata modeling and exchange in Hadoop – which will advance
the interests for the whole community.

== Current Status ==

An initial version with a valuable set of features is developed by the list
of initial committers and is hosted on github.

=== Meritocracy ===

Our intent with this proposal is to start building a diverse  developer
community around Atlas following the Apache meritocracy model. We have
wanted to make the project open source and encourage contributors from
multiple organizations from the start.

We plan to provide plenty of support to new developers and to quickly
recruit those who make solid contributions to committer status.

=== Community ===

We are happy to report that the initial team already represents multiple
organizations. We hope to extend the user and developer base further in the
future and build a solid open source community around Atlas.

=== Core Developers ===

Atlas development is currently being led by engineers from Hortonworks –
Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
engineers have deep expertise in Hadoop and are quite familiar with the
Hadoop Ecosystem.

=== Alignment ===

The ASF is a natural host for Atlas given that it is already the home of
Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
data” software projects.

Atlas has been designed to solve the data governance challenges and
opportunities of the Hadoop ecosystem family of products as well as
integration to the traditional Enterprise Data ecosystem.

Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
of data governance and compliance management.

== Known Risks

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-02 Thread jan i
+1 (binding)

have fun
jan i.

On Friday, May 1, 2015, Dilli Arumugam darumu...@hortonworks.com wrote:

 +1

 On 5/1/15 10:52 AM, Jitendra Pandey jiten...@hortonworks.com wrote:

 +1
 
 Thanks Venkatesh for the proposal. Apache Atlas is a great addition to
 Apache Hadoop ecosystem, and it fills a significant void in the area of
 data governance in open source.
 
 From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam
 Venkatesh venkat...@innerzeal.com
 Sent: Friday, May 01, 2015 12:26 AM
 To: general@incubator.apache.org
 Subject: [VOTE] Accept Apache Atlas into Apache Incubator
 
 Hello folks,
 
 Following the discussion earlier in the thread: http://s.apache.org/r2
 
 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.
 
 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.
 
 The VOTE is open for at least the next 72 hours:
 
  [ ] +1 accept Apache Atlas into the Apache Incubator
  [ ] ±0 Abstain
  [ ] -1 because...
 
 Of course I am +1! (non-binding)
 
 Thanks!
 
 
 = Apache Atlas Proposal =
 
 == Abstract ==
 
 Apache Atlas is a scalable and extensible set of core foundational
 governance services that enables enterprises to effectively and efficiently
 meet their compliance requirements within Hadoop and allows integration
 with the complete enterprise data ecosystem.
 
 == Proposal ==
 
 Apache Atlas allows agnostic governance visibility into Hadoop; these
 abilities are enabled through a set of core foundational services powered
 by a flexible metadata repository.
 
 These services include:
 
  * Search and Lineage for datasets
  * Metadata driven data access control
  * Indexed and Searchable Centralized Auditing operational Events
  * Data lifecycle management – ingestion to disposition
  * Metadata interchange with other metadata tools
 
 == Background ==
 
 Hadoop is one of many platforms in the modern enterprise data ecosystem and
 requires governance controls commensurate with this reality.
 
 Currently, there is no easy or complete way to provide comprehensive
 visibility and control into Hadoop audit, lineage, and security for
 workflows that require Hadoop and non-Hadoop processing.
 
 Many solutions are usually point based, and require a monolithic
 application workflow.  Multi-tenancy and concurrency are problematic as
 these offerings are not aware of activity outside of their narrow focus.
 
 As Hadoop gains greater popularity, governance concerns will become
 increasingly vital to increasing maturity and furthering adoption. It is a
 particular barrier to expanding enterprise data under management.
 
 == Rationale ==
 
 Atlas will address issues previously discussed by providing governance
 capabilities in Hadoop -- using both a prescriptive and forensic model
 enriched by business taxonomical metadata. Atlas, at its core, is
 designed to exchange metadata with other tools and processes within and
 outside of the Hadoop stack -- enable governance controls that are truly
 platform agnostic and effectively (and defensibly) address compliance
 concerns.
 
 Initially working with a group of leading partners in several industries,
 Atlas is built to solve specific real world governance problems that
 accelerate product maturity and time to value.
 
 Atlas aims to grow a community to help build a widely adopted pattern for
 governance, metadata modeling and exchange in Hadoop – which will advance
 the interests for the whole community.
 
 == Current Status ==
 
 An initial version with a valuable set of features is developed by the list
 of initial committers and is hosted on github.
 
 === Meritocracy ===
 
 Our intent with this proposal is to start building a diverse  developer
 community around Atlas following the Apache meritocracy model. We have
 wanted to make the project open source and encourage contributors from
 multiple organizations from the start.
 
 We plan to provide plenty of support to new developers and to quickly
 recruit those who make solid contributions to committer status.
 
 === Community ===
 
 We are happy to report that the initial team already represents multiple
 organizations. We hope to extend the user and developer base further in the
 future and build a solid open source community around Atlas.
 
 === Core Developers ===
 
 Atlas development is currently being led by engineers from Hortonworks –
 Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
 engineers have deep expertise in Hadoop and are quite familiar with the
 Hadoop Ecosystem.
 
 === Alignment ===
 
 The ASF is a natural host for Atlas given that it is already the home of
 Hadoop, Falcon, Hive, Pig, Oozie, Knox, Ranger, and other emerging “big
 data” software projects.
 
 Atlas has been designed to solve

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-02 Thread P. Taylor Goetz
+1 (binding)

-Taylor


 On May 1, 2015, at 3:26 AM, Seetharam Venkatesh venkat...@innerzeal.com 
 wrote:
 
 Hello folks,
 
 Following the discussion earlier in the thread: http://s.apache.org/r2
 
 I would like to call a VOTE for accepting Apache Atlas as a new incubator
 project.
 
 The proposal is available at:
 https://wiki.apache.org/incubator/AtlasProposal
 Also, the text of the latest wiki proposal is included at the bottom of
 this email.
 
 The VOTE is open for at least the next 72 hours:
 
 [ ] +1 accept Apache Atlas into the Apache Incubator
 [ ] ±0 Abstain
 [ ] -1 because...
 
 Of course I am +1! (non-binding)
 
 Thanks!
 
 
 = Apache Atlas Proposal =
 
 == Abstract ==
 
 Apache Atlas is a scalable and extensible set of core foundational
 governance services that enables enterprises to effectively and efficiently
 meet their compliance requirements within Hadoop and allows integration
 with the complete enterprise data ecosystem.
 
 == Proposal ==
 
 Apache Atlas allows agnostic governance visibility into Hadoop; these
 abilities are enabled through a set of core foundational services powered
 by a flexible metadata repository.
 
 These services include:
 
 * Search and Lineage for datasets
 * Metadata driven data access control
 * Indexed and Searchable Centralized Auditing operational Events
 * Data lifecycle management – ingestion to disposition
 * Metadata interchange with other metadata tools
 
 == Background ==
 
 Hadoop is one of many platforms in the modern enterprise data ecosystem and
 requires governance controls commensurate with this reality.
 
 Currently, there is no easy or complete way to provide comprehensive
 visibility and control into Hadoop audit, lineage, and security for
 workflows that require Hadoop and non-Hadoop processing.
 
 Many solutions are usually point based, and require a monolithic
 application workflow.  Multi-tenancy and concurrency are problematic as
 these offerings are not aware of activity outside of their narrow focus.
 
 As Hadoop gains greater popularity, governance concerns will become
 increasingly vital to increasing maturity and furthering adoption. It is a
 particular barrier to expanding enterprise data under management.
 
 == Rationale ==
 
 Atlas will address issues previously discussed by providing governance
 capabilities in Hadoop -- using both a prescriptive and forensic model
 enriched by business taxonomical metadata. Atlas, at its core, is
 designed to exchange metadata with other tools and processes within and
 outside of the Hadoop stack -- enable governance controls that are truly
 platform agnostic and effectively (and defensibly) address compliance
 concerns.
 
 Initially working with a group of leading partners in several industries,
 Atlas is built to solve specific real world governance problems that
 accelerate product maturity and time to value.
 
 Atlas aims to grow a community to help build a widely adopted pattern for
 governance, metadata modeling and exchange in Hadoop – which will advance
 the interests for the whole community.
 
 == Current Status ==
 
 An initial version with a valuable set of features is developed by the list
 of initial committers and is hosted on github.
 
 === Meritocracy ===
 
 Our intent with this proposal is to start building a diverse  developer
 community around Atlas following the Apache meritocracy model. We have
 wanted to make the project open source and encourage contributors from
 multiple organizations from the start.
 
 We plan to provide plenty of support to new developers and to quickly
 recruit those who make solid contributions to committer status.
 
 === Community ===
 
 We are happy to report that the initial team already represents multiple
 organizations. We hope to extend the user and developer base further in the
 future and build a solid open source community around Atlas.
 
 === Core Developers ===
 
 Atlas development is currently being led by engineers from Hortonworks –
 Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
 engineers have deep expertise in Hadoop and are quite familiar with the
 Hadoop Ecosystem.
 
 === Alignment ===
 
 The ASF is a natural host for Atlas given that it is already the home of
 Hadoop, Falcon, Hive,  Pig, Oozie, Knox, Ranger, and other emerging “big
 data” software projects.
 
 Atlas has been designed to solve the data governance challenges and
 opportunities of the Hadoop ecosystem family of products as well as
 integration to the traditional Enterprise Data ecosystem.
 
 Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
 of data governance and compliance management.
 
 == Known Risks ==
 
 === Orphaned products & Reliance on Salaried Developers ===
 The core developers plan to work full time on the project. There is very
 little risk of Atlas getting orphaned.  A prototype of Atlas is in use and
 being actively developed by several companies that have a vested interest in
 its continued 

Re: [VOTE] Accept Apache Atlas into Apache Incubator

2015-05-01 Thread Jitendra Pandey
+1

Thanks Venkatesh for the proposal. Apache Atlas is a great addition to Apache 
Hadoop ecosystem, and it fills a significant void in the area of data 
governance in open source.

From: vseetha...@gmail.com vseetha...@gmail.com on behalf of Seetharam 
Venkatesh venkat...@innerzeal.com
Sent: Friday, May 01, 2015 12:26 AM
To: general@incubator.apache.org
Subject: [VOTE] Accept Apache Atlas into Apache Incubator

Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/AtlasProposal
Also, the text of the latest wiki proposal is included at the bottom of
this email.

The VOTE is open for at least the next 72 hours:

 [ ] +1 accept Apache Atlas into the Apache Incubator
 [ ] ±0 Abstain
 [ ] -1 because...

Of course I am +1! (non-binding)

Thanks!


= Apache Atlas Proposal =

== Abstract ==

Apache Atlas is a scalable and extensible set of core foundational
governance services that enables enterprises to effectively and efficiently
meet their compliance requirements within Hadoop and allows integration
with the complete enterprise data ecosystem.

== Proposal ==

Apache Atlas allows agnostic governance visibility into Hadoop; these
abilities are enabled through a set of core foundational services powered
by a flexible metadata repository.

These services include:

 * Search and Lineage for datasets
 * Metadata driven data access control
 * Indexed and Searchable Centralized Auditing operational Events
 * Data lifecycle management – ingestion to disposition
 * Metadata interchange with other metadata tools

== Background ==

Hadoop is one of many platforms in the modern enterprise data ecosystem and
requires governance controls commensurate with this reality.

Currently, there is no easy or complete way to provide comprehensive
visibility and control into Hadoop audit, lineage, and security for
workflows that require Hadoop and non-Hadoop processing.

Many solutions are usually point based, and require a monolithic
application workflow.  Multi-tenancy and concurrency are problematic as
these offerings are not aware of activity outside of their narrow focus.

As Hadoop gains greater popularity, governance concerns will become
increasingly vital to increasing maturity and furthering adoption. It is a
particular barrier to expanding enterprise data under management.

== Rationale ==

Atlas will address issues previously discussed by providing governance
capabilities in Hadoop -- using both a prescriptive and forensic model
enriched by business taxonomical metadata. Atlas, at its core, is
designed to exchange metadata with other tools and processes within and
outside of the Hadoop stack -- enable governance controls that are truly
platform agnostic and effectively (and defensibly) address compliance
concerns.

Initially working with a group of leading partners in several industries,
Atlas is built to solve specific real world governance problems that
accelerate product maturity and time to value.

Atlas aims to grow a community to help build a widely adopted pattern for
governance, metadata modeling and exchange in Hadoop – which will advance
the interests for the whole community.

== Current Status ==

An initial version with a valuable set of features is developed by the list
of initial committers and is hosted on github.

=== Meritocracy ===

Our intent with this proposal is to start building a diverse  developer
community around Atlas following the Apache meritocracy model. We have
wanted to make the project open source and encourage contributors from
multiple organizations from the start.

We plan to provide plenty of support to new developers and to quickly
recruit those who make solid contributions to committer status.

=== Community ===

We are happy to report that the initial team already represents multiple
organizations. We hope to extend the user and developer base further in the
future and build a solid open source community around Atlas.

=== Core Developers ===

Atlas development is currently being led by engineers from Hortonworks –
Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
engineers have deep expertise in Hadoop and are quite familiar with the
Hadoop Ecosystem.

=== Alignment ===

The ASF is a natural host for Atlas given that it is already the home of
Hadoop, Falcon, Hive,  Pig, Oozie, Knox, Ranger, and other emerging “big
data” software projects.

Atlas has been designed to solve the data governance challenges and
opportunities of the Hadoop ecosystem family of products as well as
integration to the traditional Enterprise Data ecosystem.

Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
of data governance and compliance management.

== Known Risks ==

=== Orphaned products & Reliance on Salaried Developers ===
The core

[VOTE] Accept Apache Atlas into Apache Incubator

2015-05-01 Thread Seetharam Venkatesh
Hello folks,

Following the discussion earlier in the thread: http://s.apache.org/r2

I would like to call a VOTE for accepting Apache Atlas as a new incubator
project.

The proposal is available at:
https://wiki.apache.org/incubator/AtlasProposal
Also, the text of the latest wiki proposal is included at the bottom of
this email.

The VOTE is open for at least the next 72 hours:

 [ ] +1 accept Apache Atlas into the Apache Incubator
 [ ] ±0 Abstain
 [ ] -1 because...

Of course I am +1! (non-binding)

Thanks!


= Apache Atlas Proposal =

== Abstract ==

Apache Atlas is a scalable and extensible set of core foundational
governance services that enables enterprises to effectively and efficiently
meet their compliance requirements within Hadoop and allows integration
with the complete enterprise data ecosystem.

== Proposal ==

Apache Atlas allows agnostic governance visibility into Hadoop; these
abilities are enabled through a set of core foundational services powered
by a flexible metadata repository.

These services include:

 * Search and Lineage for datasets
 * Metadata driven data access control
 * Indexed and Searchable Centralized Auditing operational Events
 * Data lifecycle management – ingestion to disposition
 * Metadata interchange with other metadata tools

== Background ==

Hadoop is one of many platforms in the modern enterprise data ecosystem and
requires governance controls commensurate with this reality.

Currently, there is no easy or complete way to provide comprehensive
visibility and control into Hadoop audit, lineage, and security for
workflows that require Hadoop and non-Hadoop processing.

Many solutions are usually point based, and require a monolithic
application workflow.  Multi-tenancy and concurrency are problematic as
these offerings are not aware of activity outside of their narrow focus.

As Hadoop gains greater popularity, governance concerns will become
increasingly vital to increasing maturity and furthering adoption. It is a
particular barrier to expanding enterprise data under management.

== Rationale ==

Atlas will address issues previously discussed by providing governance
capabilities in Hadoop -- using both a prescriptive and forensic model
enriched by business taxonomical metadata. Atlas, at its core, is
designed to exchange metadata with other tools and processes within and
outside of the Hadoop stack -- enable governance controls that are truly
platform agnostic and effectively (and defensibly) address compliance
concerns.

Initially working with a group of leading partners in several industries,
Atlas is built to solve specific real world governance problems that
accelerate product maturity and time to value.

Atlas aims to grow a community to help build a widely adopted pattern for
governance, metadata modeling and exchange in Hadoop – which will advance
the interests for the whole community.

== Current Status ==

An initial version with a valuable set of features is developed by the list
of initial committers and is hosted on github.

=== Meritocracy ===

Our intent with this proposal is to start building a diverse  developer
community around Atlas following the Apache meritocracy model. We have
wanted to make the project open source and encourage contributors from
multiple organizations from the start.

We plan to provide plenty of support to new developers and to quickly
recruit those who make solid contributions to committer status.

=== Community ===

We are happy to report that the initial team already represents multiple
organizations. We hope to extend the user and developer base further in the
future and build a solid open source community around Atlas.

=== Core Developers ===

Atlas development is currently being led by engineers from Hortonworks –
Harish Butani, Venkatesh Seetharam, Shwetha G S, and Jon Maron. All the
engineers have deep expertise in Hadoop and are quite familiar with the
Hadoop Ecosystem.

=== Alignment ===

The ASF is a natural host for Atlas given that it is already the home of
Hadoop, Falcon, Hive,  Pig, Oozie, Knox, Ranger, and other emerging “big
data” software projects.

Atlas has been designed to solve the data governance challenges and
opportunities of the Hadoop ecosystem family of products as well as
integration to the traditional Enterprise Data ecosystem.

Atlas fills the gap that the Hadoop Ecosystem has been lacking in the areas
of data governance and compliance management.

== Known Risks ==

=== Orphaned products & Reliance on Salaried Developers ===
The core developers plan to work full time on the project. There is very
little risk of Atlas getting orphaned.  A prototype of Atlas is in use and
being actively developed by several companies that have a vested interest in
its continued vitality and adoption.

=== Inexperience with Open Source ===
Many of the core developers are Apache PMC members and committers. Harish
Butani is on the Apache Hive PMC, Venkatesh Seetharam is on the Apache Falcon
and Apache Knox PMCs, Shwetha GS is PMC