Re: [VOTE] Accept Eagle into Apache Incubation

2015-11-03 Thread Henry Saputra
Follow-up announcement: the Apache Eagle incubating mailing lists are
now available:

• d...@eagle.incubator.apache.org (subscribe by sending email to
dev-subscr...@eagle.incubator.apache.org)
• comm...@eagle.incubator.apache.org (subscribe by sending email to
commits-subscr...@eagle.incubator.apache.org)
• u...@eagle.incubator.apache.org (subscribe by sending email to
user-subscr...@eagle.incubator.apache.org)


Thanks,

Henry


On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun  wrote:
> Hello Everyone,
>
> Thanks for all the feedback on the Eagle Proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an incubation 
> project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a Monitoring solution for Hadoop to instantly identify access to 
> sensitive data, recognize attacks, malicious activities and take actions in 
> real time. Eagle supports a wide variety of policies on HDFS data and Hive. 
> Eagle also provides machine learning models for detecting anomalous user 
> behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
> 
>
> Eagle
>
> Abstract
> Eagle is an open source monitoring solution for Hadoop that instantly
> identifies access to sensitive data, recognizes attacks and malicious
> activities in Hadoop, and takes action.
>
> Proposal
> Eagle audits access to HDFS files and to Hive and HBase tables in real time,
> enforces policies defined on sensitive data access, and alerts on or blocks
> a user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import sensitive
> data information classified by external classification engines to help define
> its policies.
>
> Overview of Eagle
> Eagle has 3 main parts.
> 1. Data collection and storage - Eagle collects data from various Hadoop logs
> in real time using Kafka and the YARN API, and uses HDFS and HBase for storage.
> 2. Data processing and policy engine - Eagle allows users to create policies
> based on various metadata properties on HDFS, Hive and HBase data.
> 3. Eagle services - Eagle services include the policy manager, the query
> service and the visualization component. Eagle provides an intuitive user
> interface to administer Eagle and an alert dashboard to respond to real-time
> alerts.
>
> Data Collection and Storage:
> Eagle provides a programming API for extending Eagle to integrate any data
> source into the Eagle policy evaluation framework. For example, Eagle HDFS
> audit monitoring collects data from Kafka, which is populated from the
> NameNode log4j appender or from a Logstash agent. Eagle Hive monitoring
> collects Hive query logs from running jobs through the YARN API, which is
> designed to be scalable and fault-tolerant. Eagle uses HBase to store metadata
> and metrics data, and also supports a relational database through a
> configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides a stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data transformation,
> filtering, external data joins, etc. without being physically bound to a
> specific streaming platform. The Eagle streaming API allows developers to
> easily integrate business logic with the Eagle policy engine; internally, the
> Eagle framework compiles the business logic execution DAG into program
> primitives of the underlying stream infrastructure, e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit logs from the NameNode into
> objects and joins sensitivity metadata and security zone metadata, which are
> generated by external programs or configured by the user. Eagle Hive
> monitoring filters running jobs to get the Hive query string, parses the query
> string into an object, and then joins sensitivity metadata.
> Alerting Framework: The Eagle Alert Framework includes a stream metadata API,
> a scalable policy engine framework, and an extensible policy engine framework.
> The stream metadata API allows developers to declare an event schema,
> including what attributes constitute an event, what the type of each attribute
> is, and how to dynamically resolve attribute values at runtime when the user
> configures a policy. The scalable policy engine framework allows policies to
> be executed on different physical nodes in parallel. It also lets you define
> your own policy partitioner class. The policy engine framework, together with
> the streaming partitioning capability provided by all streaming platforms,
> will make sure policies and events can be evaluated in a fully distributed
> way. The extensible policy engine framework allows

Re: [VOTE] Accept Eagle into Apache Incubation

2015-11-03 Thread Zhang, Edward (GDI Hadoop)
Thank you very much, Henry. It is great that we now have these mailing lists
so we can communicate well with the community.

Thanks
Edward Zhang

On 11/3/15, 16:46, "Henry Saputra"  wrote:

>Follow up announcement, the Apache Eagle incubating mailing lists are
>now available:
>
>• d...@eagle.incubator.apache.org (subscribe by sending email to
>dev-subscr...@eagle.incubator.apache.org)
>• comm...@eagle.incubator.apache.org (subscribe by sending email to
>commits-subscr...@eagle.incubator.apache.org)
>• u...@eagle.incubator.apache.org (subscribe by sending email to
>user-subscr...@eagle.incubator.apache.org)
>
>
>Thanks,
>
>Henry

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-25 Thread Li Yang
+1 (non-binding)

On Mon, Oct 26, 2015 at 10:50 AM, hongbin ma  wrote:

> +1 (non binding)
>
> On Mon, Oct 26, 2015 at 12:20 AM, Ralph Goers 
> wrote:
>
> > +1 (binding)
> >
> > Ralph

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-25 Thread Don Bosco Durai
+1 non binding 
Bosco 



_
From: Li Yang <liy...@apache.org>
Sent: Sunday, October 25, 2015 8:13 PM
Subject: Re: [VOTE] Accept Eagle into Apache Incubation
To:  <general@incubator.apache.org>


+1 (non-binding)

On Mon, Oct 26, 2015 at 10:50 AM, hongbin ma <mahong...@apache.org> wrote:

> +1 (non binding)
>
> On Mon, Oct 26, 2015 at 12:20 AM, Ralph Goers <ralph.go...@dslextreme.com>
> wrote:
>
> > +1 (binding)
> >
> > Ralph
> >
> > > On Oct 23, 2015, at 7:11 AM, Manoharan, Arun <armanoha...@ebay.com>
> > wrote:
> > >
> > > Hello Everyone,
> > >
> > > Thanks for all the feedback on the Eagle Proposal.
> > >
> > > I would like to call for a [VOTE] on Eagle joining the ASF as an
> > incubation project.
> > >
> > > The vote is open for 72 hours:
> > >
> > > [ ] +1 accept Eagle in the Incubator
> > > [ ] ±0
> > > [ ] -1 (please give reason)
> > >
> > > Eagle is a Monitoring solution for Hadoop to instantly identify access
> > to sensitive data, recognize attacks, malicious activities and take
> actions
> > in real time. Eagle supports a wide variety of policies on HDFS data and
> > Hive. Eagle also provides machine learning models for detecting anomalous
> > user behavior in Hadoop.
> > >
> > > The proposal is available on the wiki here:
> > > https://wiki.apache.org/incubator/EagleProposal
> > >
> > > The text of the proposal is also available at the end of this email.
> > >
> > > Thanks for your time and help.
> > >
> > > Thanks,
> > > Arun
> > >
> > > 
> > >
> > > Eagle
> > >
> > > Abstract
> > > Eagle is an Open Source Monitoring solution for Hadoop to instantly
> > identify access to sensitive data, recognize attacks, malicious
> activities
> > in hadoop and take actions.
> > >
> > > Proposal
> > > Eagle audits access to HDFS files, Hive and HBase tables in real time,
> > enforces policies defined on sensitive data access and alerts or blocks
> > user’s access to that sensitive data in real time. Eagle also creates
> user
> > profiles based on the typical access behaviour for HDFS and Hive and
> sends
> > alerts when anomalous behaviour is detected. Eagle can also import
> > sensitive data information classified by external classification engines
> to
> > help define its policies.
> > >
> > > Overview of Eagle
> > > Eagle has 3 main parts.
> > > 1.Data collection and storage - Eagle collects data from various hadoop
> > logs in real time using Kafka/Yarn API and uses HDFS and HBase for
> storage.
> > > 2.Data processing and policy engine - Eagle allows users to create
> > policies based on various metadata properties on HDFS, Hive and HBase
> data.
> > > 3.Eagle services - Eagle services include policy manager, query service
> > and the visualization component. Eagle provides intuitive user interface
> to
> > administer Eagle and an alert dashboard to respond to real time alerts.
> > >
> > > Data Collection and Storage:
> > > Eagle provides programming API for extending Eagle to integrate any
> data
> > source into Eagle policy evaluation framework. For example, Eagle hdfs
> > audit monitoring collects data from Kafka which is populated from
> namenode
> > log4j appender or from logstash agent. Eagle hive monitoring collects
> hive
> > query logs from running job through YARN API, which is designed to be
> > scalable and fault-tolerant. Eagle uses HBase as storage for storing
> > metadata and metrics data, and also supports relational database through
> > configuration change.
> > >
> > > Data Processing and Policy Engine:
> > > Processing Engine: Eagle provides stream processing API which is an
> > abstraction of Apache Storm. It can also be extended to other streaming
> > engines. This abstraction allows developers to assemble data
> > transformation, filtering, external data join etc. without physically
> bound
> > to a specific streaming platform. Eagle streaming API allows developers
> to
> > easily integrate business logic with Eagle policy engine and internally
> > Eagle framework compiles business logic execution DAG into program
> > primitives of underlying stream infrastructure e.g. Apache Storm. For
> > example, Eagle HDFS monitoring transforms audit log from Namenode to
> object
> > and joins sensitivity metadata, security zone me

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-25 Thread hongbin ma
+1 (non binding)

On Mon, Oct 26, 2015 at 12:20 AM, Ralph Goers 
wrote:

> +1 (binding)
>
> Ralph
>
> > On Oct 23, 2015, at 7:11 AM, Manoharan, Arun 
> wrote:
> >
> > Hello Everyone,
> >
> > Thanks for all the feedback on the Eagle Proposal.
> >
> > I would like to call for a [VOTE] on Eagle joining the ASF as an
> incubation project.
> >
> > The vote is open for 72 hours:
> >
> > [ ] +1 accept Eagle in the Incubator
> > [ ] ±0
> > [ ] -1 (please give reason)
> >
> > Eagle is a Monitoring solution for Hadoop to instantly identify access
> to sensitive data, recognize attacks, malicious activities and take actions
> in real time. Eagle supports a wide variety of policies on HDFS data and
> Hive. Eagle also provides machine learning models for detecting anomalous
> user behavior in Hadoop.
> >
> > The proposal is available on the wiki here:
> > https://wiki.apache.org/incubator/EagleProposal
> >
> > The text of the proposal is also available at the end of this email.
> >
> > Thanks for your time and help.
> >
> > Thanks,
> > Arun
> >
> > 
> >
> > Eagle
> >
> > Abstract
> > Eagle is an Open Source Monitoring solution for Hadoop to instantly
> identify access to sensitive data, recognize attacks, malicious activities
> in hadoop and take actions.
> >
> > Proposal
> > Eagle audits access to HDFS files, Hive and HBase tables in real time,
> enforces policies defined on sensitive data access and alerts or blocks
> user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on the typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import
> sensitive data information classified by external classification engines to
> help define its policies.
> >
> > Overview of Eagle
> > Eagle has 3 main parts.
> > 1.Data collection and storage - Eagle collects data from various hadoop
> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> > 2.Data processing and policy engine - Eagle allows users to create
> policies based on various metadata properties on HDFS, Hive and HBase data.
> > 3.Eagle services - Eagle services include policy manager, query service
> and the visualization component. Eagle provides intuitive user interface to
> administer Eagle and an alert dashboard to respond to real time alerts.
> >
> > Data Collection and Storage:
> > Eagle provides programming API for extending Eagle to integrate any data
> source into Eagle policy evaluation framework. For example, Eagle hdfs
> audit monitoring collects data from Kafka which is populated from namenode
> log4j appender or from logstash agent. Eagle hive monitoring collects hive
> query logs from running job through YARN API, which is designed to be
> scalable and fault-tolerant. Eagle uses HBase as storage for storing
> metadata and metrics data, and also supports relational database through
> configuration change.
> >
> > Data Processing and Policy Engine:
> > Processing Engine: Eagle provides stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data
> transformation, filtering, external data join etc. without physically bound
> to a specific streaming platform. Eagle streaming API allows developers to
> easily integrate business logic with Eagle policy engine and internally
> Eagle framework compiles business logic execution DAG into program
> primitives of underlying stream infrastructure e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit log from Namenode to object
> and joins sensitivity metadata, security zone metadata which are generated
> from external programs or configured by user. Eagle hive monitoring filters
> running jobs to get hive query string and parses query string into object
> and then joins sensitivity metadata.
> > Alerting Framework: Eagle Alert Framework includes stream metadata API,
> scalable policy engine framework, extensible policy engine framework.
> Stream metadata API allows developers to declare event schema including
> what attributes constitute an event, what is the type for each attribute,
> and how to dynamically resolve attribute value in runtime when user
> configures policy. Scalable policy engine framework allows policies to be
> executed on different physical nodes in parallel. It is also used to define
> your own policy partitioner class. Policy engine framework together with
> streaming partitioning capability provided by all streaming platforms will
> make sure policies and events can be evaluated in a fully distributed way.
> The extensible policy engine framework allows developers to plug in a new
> policy engine with a few lines of code. The WSO2 Siddhi CEP engine is the
> policy engine which Eagle supports as a first-class citizen.
> > Machine Learning module: Eagle provides capabilities to define user
> activity patterns or 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-25 Thread Ralph Goers
+1 (binding)

Ralph

> On Oct 23, 2015, at 7:11 AM, Manoharan, Arun  wrote:
> 
> Hello Everyone,
> 
> Thanks for all the feedback on the Eagle Proposal.
> 
> I would like to call for a [VOTE] on Eagle joining the ASF as an incubation 
> project.
> 
> The vote is open for 72 hours:
> 
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
> 
> Eagle is a Monitoring solution for Hadoop to instantly identify access to 
> sensitive data, recognize attacks, malicious activities and take actions in 
> real time. Eagle supports a wide variety of policies on HDFS data and Hive. 
> Eagle also provides machine learning models for detecting anomalous user 
> behavior in Hadoop.
> 
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
> 
> The text of the proposal is also available at the end of this email.
> 
> Thanks for your time and help.
> 
> Thanks,
> Arun
> 
> 
> 
> Eagle
> 
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly identify 
> access to sensitive data, recognize attacks, malicious activities in hadoop 
> and take actions.
> 
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time, 
> enforces policies defined on sensitive data access and alerts or blocks 
> user’s access to that sensitive data in real time. Eagle also creates user 
> profiles based on the typical access behaviour for HDFS and Hive and sends 
> alerts when anomalous behaviour is detected. Eagle can also import sensitive 
> data information classified by external classification engines to help define 
> its policies.
> 
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop logs 
> in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create policies 
> based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service and 
> the visualization component. Eagle provides intuitive user interface to 
> administer Eagle and an alert dashboard to respond to real time alerts.
> 
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data 
> source into Eagle policy evaluation framework. For example, Eagle hdfs audit 
> monitoring collects data from Kafka which is populated from namenode log4j 
> appender or from logstash agent. Eagle hive monitoring collects hive query 
> logs from running job through YARN API, which is designed to be scalable and 
> fault-tolerant. Eagle uses HBase as storage for storing metadata and metrics 
> data, and also supports relational database through configuration change.
> 
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an 
> abstraction of Apache Storm. It can also be extended to other streaming 
> engines. This abstraction allows developers to assemble data transformation, 
> filtering, external data join etc. without physically bound to a specific 
> streaming platform. Eagle streaming API allows developers to easily integrate 
> business logic with Eagle policy engine and internally Eagle framework 
> compiles business logic execution DAG into program primitives of underlying 
> stream infrastructure e.g. Apache Storm. For example, Eagle HDFS monitoring 
> transforms audit log from Namenode to object and joins sensitivity metadata, 
> security zone metadata which are generated from external programs or 
> configured by user. Eagle hive monitoring filters running jobs to get hive 
> query string and parses query string into object and then joins sensitivity 
> metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API, 
> scalable policy engine framework, extensible policy engine framework. Stream 
> metadata API allows developers to declare event schema including what 
> attributes constitute an event, what is the type for each attribute, and how 
> to dynamically resolve attribute value in runtime when user configures 
> policy. Scalable policy engine framework allows policies to be executed on 
> different physical nodes in parallel. It is also used to define your own 
> policy partitioner class. Policy engine framework together with streaming 
> partitioning capability provided by all streaming platforms will make sure 
> policies and events can be evaluated in a fully distributed way. The
> extensible policy engine framework allows developers to plug in a new policy
> engine with a few lines of code. The WSO2 Siddhi CEP engine is the policy
> engine which Eagle supports as a first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user activity 
> patterns or user profiles for Hadoop users based on the user behaviour in the 
> platform. These user profiles are modeled using Machine Learning algorithms 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-24 Thread Amareshwari Sriramdasu
+1 (binding)

On Fri, Oct 23, 2015 at 7:41 PM, Manoharan, Arun 
wrote:

> Hello Everyone,
>
> Thanks for all the feedback on the Eagle Proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an
> incubation project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a Monitoring solution for Hadoop to instantly identify access to
> sensitive data, recognize attacks, malicious activities and take actions in
> real time. Eagle supports a wide variety of policies on HDFS data and Hive.
> Eagle also provides machine learning models for detecting anomalous user
> behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
> 
>
> Eagle
>
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> identify access to sensitive data, recognize attacks, malicious activities
> in hadoop and take actions.
>
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time,
> enforces policies defined on sensitive data access and alerts or blocks
> user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on the typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import
> sensitive data information classified by external classification engines to
> help define its policies.
>
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop
> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create
> policies based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service
> and the visualization component. Eagle provides intuitive user interface to
> administer Eagle and an alert dashboard to respond to real time alerts.
>
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data
> source into Eagle policy evaluation framework. For example, Eagle hdfs
> audit monitoring collects data from Kafka which is populated from namenode
> log4j appender or from logstash agent. Eagle hive monitoring collects hive
> query logs from running job through YARN API, which is designed to be
> scalable and fault-tolerant. Eagle uses HBase as storage for storing
> metadata and metrics data, and also supports relational database through
> configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data
> transformation, filtering, external data join etc. without physically bound
> to a specific streaming platform. Eagle streaming API allows developers to
> easily integrate business logic with Eagle policy engine and internally
> Eagle framework compiles business logic execution DAG into program
> primitives of underlying stream infrastructure e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit log from Namenode to object
> and joins sensitivity metadata, security zone metadata which are generated
> from external programs or configured by user. Eagle hive monitoring filters
> running jobs to get hive query string and parses query string into object
> and then joins sensitivity metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API,
> scalable policy engine framework, extensible policy engine framework.
> Stream metadata API allows developers to declare event schema including
> what attributes constitute an event, what is the type for each attribute,
> and how to dynamically resolve attribute value in runtime when user
> configures policy. Scalable policy engine framework allows policies to be
> executed on different physical nodes in parallel. It is also used to define
> your own policy partitioner class. Policy engine framework together with
> streaming partitioning capability provided by all streaming platforms will
> make sure policies and events can be evaluated in a fully distributed way.
> Extensible policy engine framework allows developer to plugin a new policy
> engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
> engine which Eagle supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user
> activity patterns or user profiles for Hadoop users based on the user
> behaviour in the platform. These user profiles are modeled using Machine
> Learning algorithms and used for detection of anomalous users activities.
> Eagle uses 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Owen O'Malley
+1 (binding)

On Fri, Oct 23, 2015 at 8:42 AM, wp chun  wrote:

> +1
> wp_c...@hotmail.com
> >
> > On 10/23/15, 11:26 PM, "P. Taylor Goetz"  wrote:
> >
> > >+1 (binding)
> > >
> > >-Taylor
> > >
> > >> On Oct 23, 2015, at 10:11 AM, Manoharan, Arun 
> > >>wrote:
> > >>
> > >> Hello Everyone,
> > >>
> > >> Thanks for all the feedback on the Eagle Proposal.
> > >>
> > >> I would like to call for a [VOTE] on Eagle joining the ASF as an
> > >>incubation project.
> > >>
> > >> The vote is open for 72 hours:
> > >>
> > >> [ ] +1 accept Eagle in the Incubator
> > >> [ ] ±0
> > >> [ ] -1 (please give reason)
> > >>
> > >> Eagle is a Monitoring solution for Hadoop to instantly identify access
> > >>to sensitive data, recognize attacks, malicious activities and take
> > >>actions in real time. Eagle supports a wide variety of policies on HDFS
> > >>data and Hive. Eagle also provides machine learning models for
> > >>detecting anomalous user behavior in Hadoop.
> > >>
> > >> The proposal is available on the wiki here:
> > >> https://wiki.apache.org/incubator/EagleProposal
> > >>
> > >> The text of the proposal is also available at the end of this email.
> > >>
> > >> Thanks for your time and help.
> > >>
> > >> Thanks,
> > >> Arun
> > >>
> > >> 
> > >>
> > >> Eagle
> > >>
> > >> Abstract
> > >> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> > >>identify access to sensitive data, recognize attacks, malicious
> > >>activities in hadoop and take actions.
> > >>
> > >> Proposal
> > >> Eagle audits access to HDFS files, Hive and HBase tables in real time,
> > >>enforces policies defined on sensitive data access and alerts or blocks
> > >>user’s access to that sensitive data in real time. Eagle also creates
> > >>user profiles based on the typical access behaviour for HDFS and Hive
> > >>and sends alerts when anomalous behaviour is detected. Eagle can also
> > >>import sensitive data information classified by external classification
> > >>engines to help define its policies.
> > >>
> > >> Overview of Eagle
> > >> Eagle has 3 main parts.
> > >> 1.Data collection and storage - Eagle collects data from various
> > >>hadoop logs in real time using Kafka/Yarn API and uses HDFS and HBase for
> > >>storage.
> > >> 2.Data processing and policy engine - Eagle allows users to create
> > >>policies based on various metadata properties on HDFS, Hive and HBase
> > >>data.
> > >> 3.Eagle services - Eagle services include policy manager, query
> > >>service and the visualization component. Eagle provides intuitive user
> > >>interface to administer Eagle and an alert dashboard to respond to real
> > >>time alerts.
> > >>
> > >> Data Collection and Storage:
> > >> Eagle provides programming API for extending Eagle to integrate any
> > >>data source into Eagle policy evaluation framework. For example, Eagle
> > >>hdfs audit monitoring collects data from Kafka which is populated from
> > >>namenode log4j appender or from logstash agent. Eagle hive monitoring
> > >>collects hive query logs from running job through YARN API, which is
> > >>designed to be scalable and fault-tolerant. Eagle uses HBase as storage
> > >>for storing metadata and metrics data, and also supports relational
> > >>database through configuration change.
> > >>
> > >> Data Processing and Policy Engine:
> > >> Processing Engine: Eagle provides stream processing API which is an
> > >>abstraction of Apache Storm. It can also be extended to other streaming
> > >>engines. This abstraction allows developers to assemble data
> > >>transformation, filtering, external data join etc. without physically
> > >>bound to a specific streaming platform. Eagle streaming API allows
> > >>developers to easily integrate business logic with Eagle policy engine
> > >>and internally Eagle framework compiles business logic execution DAG
> > >>into program primitives of underlying stream infrastructure e.g. Apache
> > >>Storm. For example, Eagle HDFS monitoring transforms audit log from
> > >>Namenode to object and joins sensitivity metadata, security zone
> > >>metadata which are generated from external programs or configured by
> > >>user. Eagle hive monitoring filters running jobs to get hive query
> > >>string and parses query string into object and then joins sensitivity
> > >>metadata.
> > >> Alerting Framework: Eagle Alert Framework includes stream metadata
> > >>API, scalable policy engine framework, extensible policy engine
> > >>framework.
> > >>Stream metadata API allows developers to declare event schema including
> > >>what attributes constitute an event, what is the type for each
> > >>attribute, and how to dynamically resolve attribute value in runtime
> > >>when user configures policy. Scalable policy engine framework allows
> > >>policies to be executed on different physical nodes in parallel. It is
> > >>also used to define your own policy partitioner class. Policy engine
> > >>framework 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Luke Han
+1 (non-binding)


Best Regards!
-

Luke Han

On Fri, Oct 23, 2015 at 11:26 PM, P. Taylor Goetz  wrote:

> +1 (binding)
>
> -Taylor
>
> > On Oct 23, 2015, at 10:11 AM, Manoharan, Arun wrote:
> >
> > Hello Everyone,
> >
> > Thanks for all the feedback on the Eagle Proposal.
> >
> > I would like to call for a [VOTE] on Eagle joining the ASF as an
> > incubation project.
> >
> > The vote is open for 72 hours:
> >
> > [ ] +1 accept Eagle in the Incubator
> > [ ] ±0
> > [ ] -1 (please give reason)
> >
> > Eagle is a Monitoring solution for Hadoop to instantly identify access
> > to sensitive data, recognize attacks, malicious activities and take actions
> > in real time. Eagle supports a wide variety of policies on HDFS data and
> > Hive. Eagle also provides machine learning models for detecting anomalous
> > user behavior in Hadoop.
> >
> > The proposal is available on the wiki here:
> > https://wiki.apache.org/incubator/EagleProposal
> >
> > The text of the proposal is also available at the end of this email.
> >
> > Thanks for your time and help.
> >
> > Thanks,
> > Arun
> >
> > 
> >
> > Eagle
> >
> > Abstract
> > Eagle is an Open Source Monitoring solution for Hadoop to instantly
> > identify access to sensitive data, recognize attacks, malicious activities
> > in hadoop and take actions.
> >
> > Proposal
> > Eagle audits access to HDFS files, Hive and HBase tables in real time,
> > enforces policies defined on sensitive data access and alerts or blocks
> > user’s access to that sensitive data in real time. Eagle also creates user
> > profiles based on the typical access behaviour for HDFS and Hive and sends
> > alerts when anomalous behaviour is detected. Eagle can also import
> > sensitive data information classified by external classification engines to
> > help define its policies.
> >
> > Overview of Eagle
> > Eagle has 3 main parts.
> > 1.Data collection and storage - Eagle collects data from various hadoop
> > logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> > 2.Data processing and policy engine - Eagle allows users to create
> > policies based on various metadata properties on HDFS, Hive and HBase data.
> > 3.Eagle services - Eagle services include policy manager, query service
> > and the visualization component. Eagle provides intuitive user interface to
> > administer Eagle and an alert dashboard to respond to real time alerts.
> >
> > Data Collection and Storage:
> > Eagle provides programming API for extending Eagle to integrate any data
> > source into Eagle policy evaluation framework. For example, Eagle hdfs
> > audit monitoring collects data from Kafka which is populated from namenode
> > log4j appender or from logstash agent. Eagle hive monitoring collects hive
> > query logs from running job through YARN API, which is designed to be
> > scalable and fault-tolerant. Eagle uses HBase as storage for storing
> > metadata and metrics data, and also supports relational database through
> > configuration change.
> >
> > Data Processing and Policy Engine:
> > Processing Engine: Eagle provides stream processing API which is an
> > abstraction of Apache Storm. It can also be extended to other streaming
> > engines. This abstraction allows developers to assemble data
> > transformation, filtering, external data join etc. without physically bound
> > to a specific streaming platform. Eagle streaming API allows developers to
> > easily integrate business logic with Eagle policy engine and internally
> > Eagle framework compiles business logic execution DAG into program
> > primitives of underlying stream infrastructure e.g. Apache Storm. For
> > example, Eagle HDFS monitoring transforms audit log from Namenode to object
> > and joins sensitivity metadata, security zone metadata which are generated
> > from external programs or configured by user. Eagle hive monitoring filters
> > running jobs to get hive query string and parses query string into object
> > and then joins sensitivity metadata.
> > Alerting Framework: Eagle Alert Framework includes stream metadata API,
> > scalable policy engine framework, extensible policy engine framework.
> > Stream metadata API allows developers to declare event schema including
> > what attributes constitute an event, what is the type for each attribute,
> > and how to dynamically resolve attribute value in runtime when user
> > configures policy. Scalable policy engine framework allows policies to be
> > executed on different physical nodes in parallel. It is also used to define
> > your own policy partitioner class. Policy engine framework together with
> > streaming partitioning capability provided by all streaming platforms will
> > make sure policies and events can be evaluated in a fully distributed way.
> > Extensible policy engine framework allows developer to plugin a new policy
> > engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
> > engine which Eagle supports as first-class citizen.
> > Machine Learning module: Eagle provides 

RE: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread wp chun
+1
wp_c...@hotmail.com
> 
> On 10/23/15, 11:26 PM, "P. Taylor Goetz"  wrote:
> 
> >+1 (binding)
> >
> >-Taylor
> >
> >> On Oct 23, 2015, at 10:11 AM, Manoharan, Arun 
> >>wrote:
> >> 
> >> Hello Everyone,
> >> 
> >> Thanks for all the feedback on the Eagle Proposal.
> >> 
> >> I would like to call for a [VOTE] on Eagle joining the ASF as an
> >>incubation project.
> >> 
> >> The vote is open for 72 hours:
> >> 
> >> [ ] +1 accept Eagle in the Incubator
> >> [ ] ±0
> >> [ ] -1 (please give reason)
> >> 
> >> Eagle is a Monitoring solution for Hadoop to instantly identify access
> >>to sensitive data, recognize attacks, malicious activities and take
> >>actions in real time. Eagle supports a wide variety of policies on HDFS
> >>data and Hive. Eagle also provides machine learning models for detecting
> >>anomalous user behavior in Hadoop.
> >> 
> >> The proposal is available on the wiki here:
> >> https://wiki.apache.org/incubator/EagleProposal
> >> 
> >> The text of the proposal is also available at the end of this email.
> >> 
> >> Thanks for your time and help.
> >> 
> >> Thanks,
> >> Arun
> >> 
> >> 
> >> 
> >> Eagle
> >> 
> >> Abstract
> >> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> >>identify access to sensitive data, recognize attacks, malicious
> >>activities in hadoop and take actions.
> >> 
> >> Proposal
> >> Eagle audits access to HDFS files, Hive and HBase tables in real time,
> >>enforces policies defined on sensitive data access and alerts or blocks
> >>user’s access to that sensitive data in real time. Eagle also creates
> >>user profiles based on the typical access behaviour for HDFS and Hive
> >>and sends alerts when anomalous behaviour is detected. Eagle can also
> >>import sensitive data information classified by external classification
> >>engines to help define its policies.
> >> 
> >> Overview of Eagle
> >> Eagle has 3 main parts.
> >> 1.Data collection and storage - Eagle collects data from various hadoop
> >>logs in real time using Kafka/Yarn API and uses HDFS and HBase for
> >>storage.
> >> 2.Data processing and policy engine - Eagle allows users to create
> >>policies based on various metadata properties on HDFS, Hive and HBase
> >>data.
> >> 3.Eagle services - Eagle services include policy manager, query service
> >>and the visualization component. Eagle provides intuitive user interface
> >>to administer Eagle and an alert dashboard to respond to real time
> >>alerts.
> >> 
> >> Data Collection and Storage:
> >> Eagle provides programming API for extending Eagle to integrate any
> >>data source into Eagle policy evaluation framework. For example, Eagle
> >>hdfs audit monitoring collects data from Kafka which is populated from
> >>namenode log4j appender or from logstash agent. Eagle hive monitoring
> >>collects hive query logs from running job through YARN API, which is
> >>designed to be scalable and fault-tolerant. Eagle uses HBase as storage
> >>for storing metadata and metrics data, and also supports relational
> >>database through configuration change.
> >> 
> >> Data Processing and Policy Engine:
> >> Processing Engine: Eagle provides stream processing API which is an
> >>abstraction of Apache Storm. It can also be extended to other streaming
> >>engines. This abstraction allows developers to assemble data
> >>transformation, filtering, external data join etc. without physically
> >>bound to a specific streaming platform. Eagle streaming API allows
> >>developers to easily integrate business logic with Eagle policy engine
> >>and internally Eagle framework compiles business logic execution DAG
> >>into program primitives of underlying stream infrastructure e.g. Apache
> >>Storm. For example, Eagle HDFS monitoring transforms audit log from
> >>Namenode to object and joins sensitivity metadata, security zone
> >>metadata which are generated from external programs or configured by
> >>user. Eagle hive monitoring filters running jobs to get hive query
> >>string and parses query string into object and then joins sensitivity
> >>metadata.
> >> Alerting Framework: Eagle Alert Framework includes stream metadata API,
> >>scalable policy engine framework, extensible policy engine framework.
> >>Stream metadata API allows developers to declare event schema including
> >>what attributes constitute an event, what is the type for each
> >>attribute, and how to dynamically resolve attribute value in runtime
> >>when user configures policy. Scalable policy engine framework allows
> >>policies to be executed on different physical nodes in parallel. It is
> >>also used to define your own policy partitioner class. Policy engine
> >>framework together with streaming partitioning capability provided by
> >>all streaming platforms will make sure policies and events can be
> >>evaluated in a fully distributed way. Extensible policy engine framework
> >>allows developer to plugin a new policy engine with a few lines of
> >>codes. 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Libin Sun
+1 (non-binding)

2015-10-23 23:50 GMT+08:00 Owen O'Malley :

> +1 (binding)
>
> On Fri, Oct 23, 2015 at 8:42 AM, wp chun  wrote:
>
> > +1
> > wp_c...@hotmail.com
> > >
> > > On 10/23/15, 11:26 PM, "P. Taylor Goetz"  wrote:
> > >
> > > >+1 (binding)
> > > >
> > > >-Taylor
> > > >
> > > > >> On Oct 23, 2015, at 10:11 AM, Manoharan, Arun
> > > > >>wrote:
> > > >>
> > > >> Hello Everyone,
> > > >>
> > > >> Thanks for all the feedback on the Eagle Proposal.
> > > >>
> > > >> I would like to call for a [VOTE] on Eagle joining the ASF as an
> > > >>incubation project.
> > > >>
> > > >> The vote is open for 72 hours:
> > > >>
> > > >> [ ] +1 accept Eagle in the Incubator
> > > >> [ ] ±0
> > > >> [ ] -1 (please give reason)
> > > >>
> > > > >> Eagle is a Monitoring solution for Hadoop to instantly identify
> > > > >>access to sensitive data, recognize attacks, malicious activities and
> > > > >>take actions in real time. Eagle supports a wide variety of policies on
> > > > >>HDFS data and Hive. Eagle also provides machine learning models for
> > > > >>detecting anomalous user behavior in Hadoop.
> > > >>
> > > >> The proposal is available on the wiki here:
> > > >> https://wiki.apache.org/incubator/EagleProposal
> > > >>
> > > >> The text of the proposal is also available at the end of this email.
> > > >>
> > > >> Thanks for your time and help.
> > > >>
> > > >> Thanks,
> > > >> Arun
> > > >>
> > > >> 
> > > >>
> > > >> Eagle
> > > >>
> > > >> Abstract
> > > >> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> > > >>identify access to sensitive data, recognize attacks, malicious
> > > >>activities in hadoop and take actions.
> > > >>
> > > >> Proposal
> > > > >> Eagle audits access to HDFS files, Hive and HBase tables in real
> > > > >>time, enforces policies defined on sensitive data access and alerts or
> > > > >>blocks user’s access to that sensitive data in real time. Eagle also
> > > > >>creates user profiles based on the typical access behaviour for HDFS
> > > > >>and Hive and sends alerts when anomalous behaviour is detected. Eagle
> > > > >>can also import sensitive data information classified by external
> > > > >>classification engines to help define its policies.
> > > >>
> > > >> Overview of Eagle
> > > >> Eagle has 3 main parts.
> > > > >> 1.Data collection and storage - Eagle collects data from various
> > > > >>hadoop logs in real time using Kafka/Yarn API and uses HDFS and HBase
> > > > >>for storage.
> > > >> 2.Data processing and policy engine - Eagle allows users to create
> > > >>policies based on various metadata properties on HDFS, Hive and HBase
> > > >>data.
> > > > >> 3.Eagle services - Eagle services include policy manager, query
> > > > >>service and the visualization component. Eagle provides intuitive user
> > > > >>interface to administer Eagle and an alert dashboard to respond to real
> > > > >>time alerts.
> > > >>
> > > >> Data Collection and Storage:
> > > >> Eagle provides programming API for extending Eagle to integrate any
> > > > >>data source into Eagle policy evaluation framework. For example, Eagle
> > > > >>hdfs audit monitoring collects data from Kafka which is populated from
> > > > >>namenode log4j appender or from logstash agent. Eagle hive monitoring
> > > > >>collects hive query logs from running job through YARN API, which is
> > > > >>designed to be scalable and fault-tolerant. Eagle uses HBase as storage
> > > > >>for storing metadata and metrics data, and also supports relational
> > > > >>database through configuration change.
> > > >>
> > > >> Data Processing and Policy Engine:
> > > >> Processing Engine: Eagle provides stream processing API which is an
> > > > >>abstraction of Apache Storm. It can also be extended to other streaming
> > > > >>engines. This abstraction allows developers to assemble data
> > > > >>transformation, filtering, external data join etc. without physically
> > > > >>bound to a specific streaming platform. Eagle streaming API allows
> > > > >>developers to easily integrate business logic with Eagle policy engine
> > > > >>and internally Eagle framework compiles business logic execution DAG
> > > > >>into program primitives of underlying stream infrastructure e.g. Apache
> > > > >>Storm. For example, Eagle HDFS monitoring transforms audit log from
> > > >>Namenode to object and joins sensitivity metadata, security zone
> > > >>metadata which are generated from external programs or configured by
> > > >>user. Eagle hive monitoring filters running jobs to get hive query
> > > >>string and parses query string into object and then joins sensitivity
> > > >>metadata.
> > > > >> Alerting Framework: Eagle Alert Framework includes stream metadata
> > > > >>API, scalable policy engine framework, extensible policy engine
> > > > >>framework. Stream metadata API allows developers to declare event schema
> > > > >>including what attributes constitute an event, what is the type 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Ted Dunning
+1 (binding)


On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun wrote:

> Hello Everyone,
>
> Thanks for all the feedback on the Eagle Proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an
> incubation project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a Monitoring solution for Hadoop to instantly identify access to
> sensitive data, recognize attacks, malicious activities and take actions in
> real time. Eagle supports a wide variety of policies on HDFS data and Hive.
> Eagle also provides machine learning models for detecting anomalous user
> behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
> 
>
> Eagle
>
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> identify access to sensitive data, recognize attacks, malicious activities
> in hadoop and take actions.
>
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time,
> enforces policies defined on sensitive data access and alerts or blocks
> user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on the typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import
> sensitive data information classified by external classification engines to
> help define its policies.
>
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop
> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create
> policies based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service
> and the visualization component. Eagle provides intuitive user interface to
> administer Eagle and an alert dashboard to respond to real time alerts.
>
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data
> source into Eagle policy evaluation framework. For example, Eagle hdfs
> audit monitoring collects data from Kafka which is populated from namenode
> log4j appender or from logstash agent. Eagle hive monitoring collects hive
> query logs from running job through YARN API, which is designed to be
> scalable and fault-tolerant. Eagle uses HBase as storage for storing
> metadata and metrics data, and also supports relational database through
> configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data
> transformation, filtering, external data join etc. without physically bound
> to a specific streaming platform. Eagle streaming API allows developers to
> easily integrate business logic with Eagle policy engine and internally
> Eagle framework compiles business logic execution DAG into program
> primitives of underlying stream infrastructure e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit log from Namenode to object
> and joins sensitivity metadata, security zone metadata which are generated
> from external programs or configured by user. Eagle hive monitoring filters
> running jobs to get hive query string and parses query string into object
> and then joins sensitivity metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API,
> scalable policy engine framework, extensible policy engine framework.
> Stream metadata API allows developers to declare event schema including
> what attributes constitute an event, what is the type for each attribute,
> and how to dynamically resolve attribute value in runtime when user
> configures policy. Scalable policy engine framework allows policies to be
> executed on different physical nodes in parallel. It is also used to define
> your own policy partitioner class. Policy engine framework together with
> streaming partitioning capability provided by all streaming platforms will
> make sure policies and events can be evaluated in a fully distributed way.
> Extensible policy engine framework allows developer to plugin a new policy
> engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
> engine which Eagle supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user
> activity patterns or user profiles for Hadoop users based on the user
> behaviour in the platform. These user profiles are modeled using Machine
> Learning algorithms and used for detection of anomalous users activities.
> Eagle uses 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Hao Chen
+1 (non-binding)

On Fri, Oct 23, 2015 at 10:11 PM, Manoharan, Arun wrote:

> Hello Everyone,
>
> Thanks for all the feedback on the Eagle Proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an
> incubation project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a Monitoring solution for Hadoop to instantly identify access to
> sensitive data, recognize attacks, malicious activities and take actions in
> real time. Eagle supports a wide variety of policies on HDFS data and Hive.
> Eagle also provides machine learning models for detecting anomalous user
> behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
> 
>
> Eagle
>
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> identify access to sensitive data, recognize attacks, malicious activities
> in hadoop and take actions.
>
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time,
> enforces policies defined on sensitive data access and alerts or blocks
> user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on the typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import
> sensitive data information classified by external classification engines to
> help define its policies.
>
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop
> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create
> policies based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service
> and the visualization component. Eagle provides intuitive user interface to
> administer Eagle and an alert dashboard to respond to real time alerts.
>
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data
> source into Eagle policy evaluation framework. For example, Eagle hdfs
> audit monitoring collects data from Kafka which is populated from namenode
> log4j appender or from logstash agent. Eagle hive monitoring collects hive
> query logs from running job through YARN API, which is designed to be
> scalable and fault-tolerant. Eagle uses HBase as storage for storing
> metadata and metrics data, and also supports relational database through
> configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data
> transformation, filtering, external data join etc. without physically bound
> to a specific streaming platform. Eagle streaming API allows developers to
> easily integrate business logic with Eagle policy engine and internally
> Eagle framework compiles business logic execution DAG into program
> primitives of underlying stream infrastructure e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit log from Namenode to object
> and joins sensitivity metadata, security zone metadata which are generated
> from external programs or configured by user. Eagle hive monitoring filters
> running jobs to get hive query string and parses query string into object
> and then joins sensitivity metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API,
> scalable policy engine framework, extensible policy engine framework.
> Stream metadata API allows developers to declare event schema including
> what attributes constitute an event, what is the type for each attribute,
> and how to dynamically resolve attribute value in runtime when user
> configures policy. Scalable policy engine framework allows policies to be
> executed on different physical nodes in parallel. It is also used to define
> your own policy partitioner class. Policy engine framework together with
> streaming partitioning capability provided by all streaming platforms will
> make sure policies and events can be evaluated in a fully distributed way.
> Extensible policy engine framework allows developer to plugin a new policy
> engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
> engine which Eagle supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user
> activity patterns or user profiles for Hadoop users based on the user
> behaviour in the platform. These user profiles are modeled using Machine
> Learning algorithms and used for detection of anomalous users activities.
> Eagle 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Hitesh Shah
+1 (binding)

— Hitesh

On Oct 23, 2015, at 7:11 AM, Manoharan, Arun  wrote:

> Hello Everyone,
> 
> Thanks for all the feedback on the Eagle Proposal.
> 
> I would like to call for a [VOTE] on Eagle joining the ASF as an incubation 
> project.
> 
> The vote is open for 72 hours:
> 
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
> 
> Eagle is a Monitoring solution for Hadoop to instantly identify access to 
> sensitive data, recognize attacks, malicious activities and take actions in 
> real time. Eagle supports a wide variety of policies on HDFS data and Hive. 
> Eagle also provides machine learning models for detecting anomalous user 
> behavior in Hadoop.
> 
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
> 
> The text of the proposal is also available at the end of this email.
> 
> Thanks for your time and help.
> 
> Thanks,
> Arun
> 
> 
> 
> Eagle
> 
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly identify 
> access to sensitive data, recognize attacks, malicious activities in hadoop 
> and take actions.
> 
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time, 
> enforces policies defined on sensitive data access and alerts or blocks 
> user’s access to that sensitive data in real time. Eagle also creates user 
> profiles based on the typical access behaviour for HDFS and Hive and sends 
> alerts when anomalous behaviour is detected. Eagle can also import sensitive 
> data information classified by external classification engines to help define 
> its policies.
> 
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop logs 
> in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create policies 
> based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service and 
> the visualization component. Eagle provides intuitive user interface to 
> administer Eagle and an alert dashboard to respond to real time alerts.
> 
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data 
> source into Eagle policy evaluation framework. For example, Eagle hdfs audit 
> monitoring collects data from Kafka which is populated from namenode log4j 
> appender or from logstash agent. Eagle hive monitoring collects hive query 
> logs from running job through YARN API, which is designed to be scalable and 
> fault-tolerant. Eagle uses HBase as storage for storing metadata and metrics 
> data, and also supports relational database through configuration change.
> 
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an 
> abstraction of Apache Storm. It can also be extended to other streaming 
> engines. This abstraction allows developers to assemble data transformation, 
> filtering, external data join etc. without physically bound to a specific 
> streaming platform. Eagle streaming API allows developers to easily integrate 
> business logic with Eagle policy engine and internally Eagle framework 
> compiles business logic execution DAG into program primitives of underlying 
> stream infrastructure e.g. Apache Storm. For example, Eagle HDFS monitoring 
> transforms audit log from Namenode to object and joins sensitivity metadata, 
> security zone metadata which are generated from external programs or 
> configured by user. Eagle hive monitoring filters running jobs to get hive 
> query string and parses query string into object and then joins sensitivity 
> metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API, 
> scalable policy engine framework, extensible policy engine framework. Stream 
> metadata API allows developers to declare event schema including what 
> attributes constitute an event, what is the type for each attribute, and how 
> to dynamically resolve attribute value in runtime when user configures 
> policy. Scalable policy engine framework allows policies to be executed on 
> different physical nodes in parallel. It is also used to define your own 
> policy partitioner class. Policy engine framework together with streaming 
> partitioning capability provided by all streaming platforms will make sure 
> policies and events can be evaluated in a fully distributed way. Extensible 
> policy engine framework allows developer to plugin a new policy engine with a 
> few lines of codes. WSO2 Siddhi CEP engine is the policy engine which Eagle 
> supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user activity 
> patterns or user profiles for Hadoop users based on the user behaviour in the 
> platform. These user profiles are modeled using Machine Learning algorithms 
> 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread John D. Ament
+1
On Oct 23, 2015 10:11, "Manoharan, Arun"  wrote:

> Hello Everyone,
>
> Thanks for all the feedback on the Eagle Proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an
> incubation project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a Monitoring solution for Hadoop to instantly identify access to
> sensitive data, recognize attacks, malicious activities and take actions in
> real time. Eagle supports a wide variety of policies on HDFS data and Hive.
> Eagle also provides machine learning models for detecting anomalous user
> behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
> 
>
> Eagle
>
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> identify access to sensitive data, recognize attacks, malicious activities
> in hadoop and take actions.
>
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time,
> enforces policies defined on sensitive data access and alerts or blocks
> user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on the typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import
> sensitive data information classified by external classification engines to
> help define its policies.
>
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop
> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create
> policies based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service
> and the visualization component. Eagle provides intuitive user interface to
> administer Eagle and an alert dashboard to respond to real time alerts.
>
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data
> source into Eagle policy evaluation framework. For example, Eagle hdfs
> audit monitoring collects data from Kafka which is populated from namenode
> log4j appender or from logstash agent. Eagle hive monitoring collects hive
> query logs from running job through YARN API, which is designed to be
> scalable and fault-tolerant. Eagle uses HBase as storage for storing
> metadata and metrics data, and also supports relational database through
> configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data
> transformation, filtering, external data join etc. without physically bound
> to a specific streaming platform. Eagle streaming API allows developers to
> easily integrate business logic with Eagle policy engine and internally
> Eagle framework compiles business logic execution DAG into program
> primitives of underlying stream infrastructure e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit log from Namenode to object
> and joins sensitivity metadata, security zone metadata which are generated
> from external programs or configured by user. Eagle hive monitoring filters
> running jobs to get hive query string and parses query string into object
> and then joins sensitivity metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API,
> scalable policy engine framework, extensible policy engine framework.
> Stream metadata API allows developers to declare event schema including
> what attributes constitute an event, what is the type for each attribute,
> and how to dynamically resolve attribute value in runtime when user
> configures policy. Scalable policy engine framework allows policies to be
> executed on different physical nodes in parallel. It is also used to define
> your own policy partitioner class. Policy engine framework together with
> streaming partitioning capability provided by all streaming platforms will
> make sure policies and events can be evaluated in a fully distributed way.
> Extensible policy engine framework allows developer to plugin a new policy
> engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
> engine which Eagle supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user
> activity patterns or user profiles for Hadoop users based on the user
> behaviour in the platform. These user profiles are modeled using Machine
> Learning algorithms and used for detection of anomalous users activities.
> Eagle uses Eigen Value 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Adunuthula, Seshu
+1 (non binding)

On 10/23/15, 9:52 AM, "Hitesh Shah"  wrote:

>+1 (binding)
>
>-- Hitesh
>
>On Oct 23, 2015, at 7:11 AM, Manoharan, Arun  wrote:
>
>> Hello Everyone,
>> 
>> Thanks for all the feedback on the Eagle Proposal.
>> 
>> I would like to call for a [VOTE] on Eagle joining the ASF as an
>>incubation project.
>> 
>> The vote is open for 72 hours:
>> 
>> [ ] +1 accept Eagle in the Incubator
>> [ ] ±0
>> [ ] -1 (please give reason)
>> 
>> Eagle is a Monitoring solution for Hadoop to instantly identify access
>>to sensitive data, recognize attacks, malicious activities and take
>>actions in real time. Eagle supports a wide variety of policies on HDFS
>>data and Hive. Eagle also provides machine learning models for detecting
>>anomalous user behavior in Hadoop.
>> 
>> The proposal is available on the wiki here:
>> https://wiki.apache.org/incubator/EagleProposal
>> 
>> The text of the proposal is also available at the end of this email.
>> 
>> Thanks for your time and help.
>> 
>> Thanks,
>> Arun
>> 
>> 
>> 
>> Eagle
>> 
>> Abstract
>> Eagle is an Open Source Monitoring solution for Hadoop to instantly
>>identify access to sensitive data, recognize attacks, malicious
>>activities in hadoop and take actions.
>> 
>> Proposal
>> Eagle audits access to HDFS files, Hive and HBase tables in real time,
>>enforces policies defined on sensitive data access and alerts or blocks
>>user’s access to that sensitive data in real time. Eagle also creates
>>user profiles based on the typical access behaviour for HDFS and Hive
>>and sends alerts when anomalous behaviour is detected. Eagle can also
>>import sensitive data information classified by external classification
>>engines to help define its policies.
>> 
>> Overview of Eagle
>> Eagle has 3 main parts.
>> 1.Data collection and storage - Eagle collects data from various hadoop
>>logs in real time using Kafka/Yarn API and uses HDFS and HBase for
>>storage.
>> 2.Data processing and policy engine - Eagle allows users to create
>>policies based on various metadata properties on HDFS, Hive and HBase
>>data.
>> 3.Eagle services - Eagle services include policy manager, query service
>>and the visualization component. Eagle provides intuitive user interface
>>to administer Eagle and an alert dashboard to respond to real time
>>alerts.
>> 
>> Data Collection and Storage:
>> Eagle provides programming API for extending Eagle to integrate any
>>data source into Eagle policy evaluation framework. For example, Eagle
>>hdfs audit monitoring collects data from Kafka which is populated from
>>namenode log4j appender or from logstash agent. Eagle hive monitoring
>>collects hive query logs from running job through YARN API, which is
>>designed to be scalable and fault-tolerant. Eagle uses HBase as storage
>>for storing metadata and metrics data, and also supports relational
>>database through configuration change.
>> 
>> Data Processing and Policy Engine:
>> Processing Engine: Eagle provides stream processing API which is an
>>abstraction of Apache Storm. It can also be extended to other streaming
>>engines. This abstraction allows developers to assemble data
>>transformation, filtering, external data join etc. without physically
>>bound to a specific streaming platform. Eagle streaming API allows
>>developers to easily integrate business logic with Eagle policy engine
>>and internally Eagle framework compiles business logic execution DAG
>>into program primitives of underlying stream infrastructure e.g. Apache
>>Storm. For example, Eagle HDFS monitoring transforms audit log from
>>Namenode to object and joins sensitivity metadata, security zone
>>metadata which are generated from external programs or configured by
>>user. Eagle hive monitoring filters running jobs to get hive query
>>string and parses query string into object and then joins sensitivity
>>metadata.
>> Alerting Framework: Eagle Alert Framework includes stream metadata API,
>>scalable policy engine framework, extensible policy engine framework.
>>Stream metadata API allows developers to declare event schema including
>>what attributes constitute an event, what is the type for each
>>attribute, and how to dynamically resolve attribute value in runtime
>>when user configures policy. Scalable policy engine framework allows
>>policies to be executed on different physical nodes in parallel. It is
>>also used to define your own policy partitioner class. Policy engine
>>framework together with streaming partitioning capability provided by
>>all streaming platforms will make sure policies and events can be
>>evaluated in a fully distributed way. Extensible policy engine framework
>>allows developer to plugin a new policy engine with a few lines of
>>codes. WSO2 Siddhi CEP engine is the policy engine which Eagle supports
>>as first-class citizen.
>> Machine Learning module: Eagle provides capabilities to define user
>>activity patterns or user profiles for Hadoop users based on the user

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Julian Hyde
+1 (binding)

> On Oct 23, 2015, at 10:13 AM, John D. Ament  wrote:
> 
> +1
> On Oct 23, 2015 10:11, "Manoharan, Arun"  wrote:
> 
>> Hello Everyone,
>> 
>> Thanks for all the feedback on the Eagle Proposal.
>> 
>> I would like to call for a [VOTE] on Eagle joining the ASF as an
>> incubation project.
>> 
>> The vote is open for 72 hours:
>> 
>> [ ] +1 accept Eagle in the Incubator
>> [ ] ±0
>> [ ] -1 (please give reason)
>> 
>> Eagle is a Monitoring solution for Hadoop to instantly identify access to
>> sensitive data, recognize attacks, malicious activities and take actions in
>> real time. Eagle supports a wide variety of policies on HDFS data and Hive.
>> Eagle also provides machine learning models for detecting anomalous user
>> behavior in Hadoop.
>> 
>> The proposal is available on the wiki here:
>> https://wiki.apache.org/incubator/EagleProposal
>> 
>> The text of the proposal is also available at the end of this email.
>> 
>> Thanks for your time and help.
>> 
>> Thanks,
>> Arun
>> 
>> 
>> 
>> Eagle
>> 
>> Abstract
>> Eagle is an Open Source Monitoring solution for Hadoop to instantly
>> identify access to sensitive data, recognize attacks, malicious activities
>> in hadoop and take actions.
>> 
>> Proposal
>> Eagle audits access to HDFS files, Hive and HBase tables in real time,
>> enforces policies defined on sensitive data access and alerts or blocks
>> user’s access to that sensitive data in real time. Eagle also creates user
>> profiles based on the typical access behaviour for HDFS and Hive and sends
>> alerts when anomalous behaviour is detected. Eagle can also import
>> sensitive data information classified by external classification engines to
>> help define its policies.
>> 
>> Overview of Eagle
>> Eagle has 3 main parts.
>> 1.Data collection and storage - Eagle collects data from various hadoop
>> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
>> 2.Data processing and policy engine - Eagle allows users to create
>> policies based on various metadata properties on HDFS, Hive and HBase data.
>> 3.Eagle services - Eagle services include policy manager, query service
>> and the visualization component. Eagle provides intuitive user interface to
>> administer Eagle and an alert dashboard to respond to real time alerts.
>> 
>> Data Collection and Storage:
>> Eagle provides programming API for extending Eagle to integrate any data
>> source into Eagle policy evaluation framework. For example, Eagle hdfs
>> audit monitoring collects data from Kafka which is populated from namenode
>> log4j appender or from logstash agent. Eagle hive monitoring collects hive
>> query logs from running job through YARN API, which is designed to be
>> scalable and fault-tolerant. Eagle uses HBase as storage for storing
>> metadata and metrics data, and also supports relational database through
>> configuration change.
>> 
>> Data Processing and Policy Engine:
>> Processing Engine: Eagle provides stream processing API which is an
>> abstraction of Apache Storm. It can also be extended to other streaming
>> engines. This abstraction allows developers to assemble data
>> transformation, filtering, external data join etc. without physically bound
>> to a specific streaming platform. Eagle streaming API allows developers to
>> easily integrate business logic with Eagle policy engine and internally
>> Eagle framework compiles business logic execution DAG into program
>> primitives of underlying stream infrastructure e.g. Apache Storm. For
>> example, Eagle HDFS monitoring transforms audit log from Namenode to object
>> and joins sensitivity metadata, security zone metadata which are generated
>> from external programs or configured by user. Eagle hive monitoring filters
>> running jobs to get hive query string and parses query string into object
>> and then joins sensitivity metadata.
>> Alerting Framework: Eagle Alert Framework includes stream metadata API,
>> scalable policy engine framework, extensible policy engine framework.
>> Stream metadata API allows developers to declare event schema including
>> what attributes constitute an event, what is the type for each attribute,
>> and how to dynamically resolve attribute value in runtime when user
>> configures policy. Scalable policy engine framework allows policies to be
>> executed on different physical nodes in parallel. It is also used to define
>> your own policy partitioner class. Policy engine framework together with
>> streaming partitioning capability provided by all streaming platforms will
>> make sure policies and events can be evaluated in a fully distributed way.
>> Extensible policy engine framework allows developer to plugin a new policy
>> engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
>> engine which Eagle supports as first-class citizen.
>> Machine Learning module: Eagle provides capabilities to define user
>> activity patterns or user profiles for 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Chris Nauroth
+1 (binding)

--Chris Nauroth




On 10/23/15, 7:11 AM, "Manoharan, Arun"  wrote:

>Hello Everyone,
>
>Thanks for all the feedback on the Eagle Proposal.
>
>I would like to call for a [VOTE] on Eagle joining the ASF as an
>incubation project.
>
>The vote is open for 72 hours:
>
>[ ] +1 accept Eagle in the Incubator
>[ ] ±0
>[ ] -1 (please give reason)
>
>Eagle is a Monitoring solution for Hadoop to instantly identify access to
>sensitive data, recognize attacks, malicious activities and take actions
>in real time. Eagle supports a wide variety of policies on HDFS data and
>Hive. Eagle also provides machine learning models for detecting anomalous
>user behavior in Hadoop.
>
>The proposal is available on the wiki here:
>https://wiki.apache.org/incubator/EagleProposal
>
>The text of the proposal is also available at the end of this email.
>
>Thanks for your time and help.
>
>Thanks,
>Arun
>
>
>
>Eagle
>
>Abstract
>Eagle is an Open Source Monitoring solution for Hadoop to instantly
>identify access to sensitive data, recognize attacks, malicious
>activities in hadoop and take actions.
>
>Proposal
>Eagle audits access to HDFS files, Hive and HBase tables in real time,
>enforces policies defined on sensitive data access and alerts or blocks
>user’s access to that sensitive data in real time. Eagle also creates
>user profiles based on the typical access behaviour for HDFS and Hive and
>sends alerts when anomalous behaviour is detected. Eagle can also import
>sensitive data information classified by external classification engines
>to help define its policies.
>
>Overview of Eagle
>Eagle has 3 main parts.
>1.Data collection and storage - Eagle collects data from various hadoop
>logs in real time using Kafka/Yarn API and uses HDFS and HBase for
>storage.
>2.Data processing and policy engine - Eagle allows users to create
>policies based on various metadata properties on HDFS, Hive and HBase
>data.
>3.Eagle services - Eagle services include policy manager, query service
>and the visualization component. Eagle provides intuitive user interface
>to administer Eagle and an alert dashboard to respond to real time alerts.
>
>Data Collection and Storage:
>Eagle provides programming API for extending Eagle to integrate any data
>source into Eagle policy evaluation framework. For example, Eagle hdfs
>audit monitoring collects data from Kafka which is populated from
>namenode log4j appender or from logstash agent. Eagle hive monitoring
>collects hive query logs from running job through YARN API, which is
>designed to be scalable and fault-tolerant. Eagle uses HBase as storage
>for storing metadata and metrics data, and also supports relational
>database through configuration change.
>
>Data Processing and Policy Engine:
>Processing Engine: Eagle provides stream processing API which is an
>abstraction of Apache Storm. It can also be extended to other streaming
>engines. This abstraction allows developers to assemble data
>transformation, filtering, external data join etc. without physically
>bound to a specific streaming platform. Eagle streaming API allows
>developers to easily integrate business logic with Eagle policy engine
>and internally Eagle framework compiles business logic execution DAG into
>program primitives of underlying stream infrastructure e.g. Apache Storm.
>For example, Eagle HDFS monitoring transforms audit log from Namenode to
>object and joins sensitivity metadata, security zone metadata which are
>generated from external programs or configured by user. Eagle hive
>monitoring filters running jobs to get hive query string and parses query
>string into object and then joins sensitivity metadata.
>Alerting Framework: Eagle Alert Framework includes stream metadata API,
>scalable policy engine framework, extensible policy engine framework.
>Stream metadata API allows developers to declare event schema including
>what attributes constitute an event, what is the type for each attribute,
>and how to dynamically resolve attribute value in runtime when user
>configures policy. Scalable policy engine framework allows policies to be
>executed on different physical nodes in parallel. It is also used to
>define your own policy partitioner class. Policy engine framework
>together with streaming partitioning capability provided by all streaming
>platforms will make sure policies and events can be evaluated in a fully
>distributed way. Extensible policy engine framework allows developer to
>plugin a new policy engine with a few lines of codes. WSO2 Siddhi CEP
>engine is the policy engine which Eagle supports as first-class citizen.
>Machine Learning module: Eagle provides capabilities to define user
>activity patterns or user profiles for Hadoop users based on the user
>behaviour in the platform. These user profiles are modeled using Machine
>Learning algorithms and used for detection of anomalous users activities.
>Eagle uses Eigen Value Decomposition, and Density Estimation algorithms
>for 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Balaji Ganesan
+1

On Fri, Oct 23, 2015 at 12:26 PM, Chris Nauroth 
wrote:

> +1 (binding)
>
> --Chris Nauroth
>
>
>
>
> On 10/23/15, 7:11 AM, "Manoharan, Arun"  wrote:
>
> >Hello Everyone,
> >
> >Thanks for all the feedback on the Eagle Proposal.
> >
> >I would like to call for a [VOTE] on Eagle joining the ASF as an
> >incubation project.
> >
> >The vote is open for 72 hours:
> >
> >[ ] +1 accept Eagle in the Incubator
> >[ ] ±0
> >[ ] -1 (please give reason)
> >
> >Eagle is a Monitoring solution for Hadoop to instantly identify access to
> >sensitive data, recognize attacks, malicious activities and take actions
> >in real time. Eagle supports a wide variety of policies on HDFS data and
> >Hive. Eagle also provides machine learning models for detecting anomalous
> >user behavior in Hadoop.
> >
> >The proposal is available on the wiki here:
> >https://wiki.apache.org/incubator/EagleProposal
> >
> >The text of the proposal is also available at the end of this email.
> >
> >Thanks for your time and help.
> >
> >Thanks,
> >Arun
> >
> >
> >
> >Eagle
> >
> >Abstract
> >Eagle is an Open Source Monitoring solution for Hadoop to instantly
> >identify access to sensitive data, recognize attacks, malicious
> >activities in hadoop and take actions.
> >
> >Proposal
> >Eagle audits access to HDFS files, Hive and HBase tables in real time,
> >enforces policies defined on sensitive data access and alerts or blocks
> >user’s access to that sensitive data in real time. Eagle also creates
> >user profiles based on the typical access behaviour for HDFS and Hive and
> >sends alerts when anomalous behaviour is detected. Eagle can also import
> >sensitive data information classified by external classification engines
> >to help define its policies.
> >
> >Overview of Eagle
> >Eagle has 3 main parts.
> >1.Data collection and storage - Eagle collects data from various hadoop
> >logs in real time using Kafka/Yarn API and uses HDFS and HBase for
> >storage.
> >2.Data processing and policy engine - Eagle allows users to create
> >policies based on various metadata properties on HDFS, Hive and HBase
> >data.
> >3.Eagle services - Eagle services include policy manager, query service
> >and the visualization component. Eagle provides intuitive user interface
> >to administer Eagle and an alert dashboard to respond to real time alerts.
> >
> >Data Collection and Storage:
> >Eagle provides programming API for extending Eagle to integrate any data
> >source into Eagle policy evaluation framework. For example, Eagle hdfs
> >audit monitoring collects data from Kafka which is populated from
> >namenode log4j appender or from logstash agent. Eagle hive monitoring
> >collects hive query logs from running job through YARN API, which is
> >designed to be scalable and fault-tolerant. Eagle uses HBase as storage
> >for storing metadata and metrics data, and also supports relational
> >database through configuration change.
> >
> >Data Processing and Policy Engine:
> >Processing Engine: Eagle provides stream processing API which is an
> >abstraction of Apache Storm. It can also be extended to other streaming
> >engines. This abstraction allows developers to assemble data
> >transformation, filtering, external data join etc. without physically
> >bound to a specific streaming platform. Eagle streaming API allows
> >developers to easily integrate business logic with Eagle policy engine
> >and internally Eagle framework compiles business logic execution DAG into
> >program primitives of underlying stream infrastructure e.g. Apache Storm.
> >For example, Eagle HDFS monitoring transforms audit log from Namenode to
> >object and joins sensitivity metadata, security zone metadata which are
> >generated from external programs or configured by user. Eagle hive
> >monitoring filters running jobs to get hive query string and parses query
> >string into object and then joins sensitivity metadata.
> >Alerting Framework: Eagle Alert Framework includes stream metadata API,
> >scalable policy engine framework, extensible policy engine framework.
> >Stream metadata API allows developers to declare event schema including
> >what attributes constitute an event, what is the type for each attribute,
> >and how to dynamically resolve attribute value in runtime when user
> >configures policy. Scalable policy engine framework allows policies to be
> >executed on different physical nodes in parallel. It is also used to
> >define your own policy partitioner class. Policy engine framework
> >together with streaming partitioning capability provided by all streaming
> >platforms will make sure policies and events can be evaluated in a fully
> >distributed way. Extensible policy engine framework allows developer to
> >plugin a new policy engine with a few lines of codes. WSO2 Siddhi CEP
> >engine is the policy engine which Eagle supports as first-class citizen.
> >Machine Learning module: Eagle provides capabilities to define user
> >activity 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread 周千昊
+1

On Sat, Oct 24, 2015 at 08:40, Shaofeng Shi wrote:

> +1 (non-binding)
>
> "Manoharan, Arun" wrote:
>
> >Hello Everyone,
> >
> >Thanks for all the feedback on the Eagle Proposal.
> >
> >I would like to call for a [VOTE] on Eagle joining the ASF as an
> incubation project.
> >
> >The vote is open for 72 hours:
> >
> >[ ] +1 accept Eagle in the Incubator
> >[ ] ±0
> >[ ] -1 (please give reason)
> >
> >Eagle is a Monitoring solution for Hadoop to instantly identify access to
> sensitive data, recognize attacks, malicious activities and take actions in
> real time. Eagle supports a wide variety of policies on HDFS data and Hive.
> Eagle also provides machine learning models for detecting anomalous user
> behavior in Hadoop.
> >
> >The proposal is available on the wiki here:
> >https://wiki.apache.org/incubator/EagleProposal
> >
> >The text of the proposal is also available at the end of this email.
> >
> >Thanks for your time and help.
> >
> >Thanks,
> >Arun
> >
> >
> >
> >Eagle
> >
> >Abstract
> >Eagle is an Open Source Monitoring solution for Hadoop to instantly
> identify access to sensitive data, recognize attacks, malicious activities
> in hadoop and take actions.
> >
> >Proposal
> >Eagle audits access to HDFS files, Hive and HBase tables in real time,
> enforces policies defined on sensitive data access and alerts or blocks
> user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on the typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import
> sensitive data information classified by external classification engines to
> help define its policies.
> >
> >Overview of Eagle
> >Eagle has 3 main parts.
> >1.Data collection and storage - Eagle collects data from various hadoop
> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> >2.Data processing and policy engine - Eagle allows users to create
> policies based on various metadata properties on HDFS, Hive and HBase data.
> >3.Eagle services - Eagle services include policy manager, query service
> and the visualization component. Eagle provides intuitive user interface to
> administer Eagle and an alert dashboard to respond to real time alerts.
> >
> >Data Collection and Storage:
> >Eagle provides programming API for extending Eagle to integrate any data
> source into Eagle policy evaluation framework. For example, Eagle hdfs
> audit monitoring collects data from Kafka which is populated from namenode
> log4j appender or from logstash agent. Eagle hive monitoring collects hive
> query logs from running job through YARN API, which is designed to be
> scalable and fault-tolerant. Eagle uses HBase as storage for storing
> metadata and metrics data, and also supports relational database through
> configuration change.
> >
> >Data Processing and Policy Engine:
> >Processing Engine: Eagle provides stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data
> transformation, filtering, external data join etc. without physically bound
> to a specific streaming platform. Eagle streaming API allows developers to
> easily integrate business logic with Eagle policy engine and internally
> Eagle framework compiles business logic execution DAG into program
> primitives of underlying stream infrastructure e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit log from Namenode to object
> and joins sensitivity metadata, security zone metadata which are generated
> from external programs or configured by user. Eagle hive monitoring filters
> running jobs to get hive query string and parses query string into object
> and then joins sensitivity metadata.
> >Alerting Framework: Eagle Alert Framework includes stream metadata API,
> scalable policy engine framework, extensible policy engine framework.
> Stream metadata API allows developers to declare event schema including
> what attributes constitute an event, what is the type for each attribute,
> and how to dynamically resolve attribute value in runtime when user
> configures policy. Scalable policy engine framework allows policies to be
> executed on different physical nodes in parallel. It is also used to define
> your own policy partitioner class. Policy engine framework together with
> streaming partitioning capability provided by all streaming platforms will
> make sure policies and events can be evaluated in a fully distributed way.
> Extensible policy engine framework allows developer to plugin a new policy
> engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
> engine which Eagle supports as first-class citizen.
> >Machine Learning module: Eagle provides capabilities to define user
> activity patterns or user profiles for Hadoop users based on the user
> behaviour in the platform. These user profiles are modeled 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Luciano Resende
+1 (binding)

On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun wrote:

> Hello Everyone,
>
> Thanks for all the feedback on the Eagle Proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an
> incubation project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a Monitoring solution for Hadoop to instantly identify access to
> sensitive data, recognize attacks, malicious activities and take actions in
> real time. Eagle supports a wide variety of policies on HDFS data and Hive.
> Eagle also provides machine learning models for detecting anomalous user
> behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
> 
>
> Eagle
>
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> identify access to sensitive data, recognize attacks, malicious activities
> in hadoop and take actions.
>
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time,
> enforces policies defined on sensitive data access and alerts or blocks
> user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on the typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import
> sensitive data information classified by external classification engines to
> help define its policies.
>
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop
> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create
> policies based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service
> and the visualization component. Eagle provides intuitive user interface to
> administer Eagle and an alert dashboard to respond to real time alerts.
>
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data
> source into Eagle policy evaluation framework. For example, Eagle hdfs
> audit monitoring collects data from Kafka which is populated from namenode
> log4j appender or from logstash agent. Eagle hive monitoring collects hive
> query logs from running job through YARN API, which is designed to be
> scalable and fault-tolerant. Eagle uses HBase as storage for storing
> metadata and metrics data, and also supports relational database through
> configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data
> transformation, filtering, external data join etc. without physically bound
> to a specific streaming platform. Eagle streaming API allows developers to
> easily integrate business logic with Eagle policy engine and internally
> Eagle framework compiles business logic execution DAG into program
> primitives of underlying stream infrastructure e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit log from Namenode to object
> and joins sensitivity metadata, security zone metadata which are generated
> from external programs or configured by user. Eagle hive monitoring filters
> running jobs to get hive query string and parses query string into object
> and then joins sensitivity metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API,
> scalable policy engine framework, extensible policy engine framework.
> Stream metadata API allows developers to declare event schema including
> what attributes constitute an event, what is the type for each attribute,
> and how to dynamically resolve attribute value in runtime when user
> configures policy. Scalable policy engine framework allows policies to be
> executed on different physical nodes in parallel. It is also used to define
> your own policy partitioner class. Policy engine framework together with
> streaming partitioning capability provided by all streaming platforms will
> make sure policies and events can be evaluated in a fully distributed way.
> Extensible policy engine framework allows developer to plugin a new policy
> engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
> engine which Eagle supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user
> activity patterns or user profiles for Hadoop users based on the user
> behaviour in the platform. These user profiles are modeled using Machine
> Learning algorithms and used for detection of anomalous users activities.
> Eagle uses 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Henry Saputra
+1 (binding)

On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun  wrote:
> Hello Everyone,
>
> Thanks for all the feedback on the Eagle Proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an incubation 
> project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a Monitoring solution for Hadoop to instantly identify access to 
> sensitive data, recognize attacks, malicious activities and take actions in 
> real time. Eagle supports a wide variety of policies on HDFS data and Hive. 
> Eagle also provides machine learning models for detecting anomalous user 
> behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
> 
>
> Eagle
>
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly identify 
> access to sensitive data, recognize attacks, malicious activities in hadoop 
> and take actions.
>
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time, 
> enforces policies defined on sensitive data access and alerts or blocks 
> user’s access to that sensitive data in real time. Eagle also creates user 
> profiles based on the typical access behaviour for HDFS and Hive and sends 
> alerts when anomalous behaviour is detected. Eagle can also import sensitive 
> data information classified by external classification engines to help define 
> its policies.
>
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop logs 
> in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create policies 
> based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service and 
> the visualization component. Eagle provides intuitive user interface to 
> administer Eagle and an alert dashboard to respond to real time alerts.
>
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data 
> source into Eagle policy evaluation framework. For example, Eagle hdfs audit 
> monitoring collects data from Kafka which is populated from namenode log4j 
> appender or from logstash agent. Eagle hive monitoring collects hive query 
> logs from running job through YARN API, which is designed to be scalable and 
> fault-tolerant. Eagle uses HBase as storage for storing metadata and metrics 
> data, and also supports relational database through configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an 
> abstraction of Apache Storm. It can also be extended to other streaming 
> engines. This abstraction allows developers to assemble data transformation, 
> filtering, external data join etc. without physically bound to a specific 
> streaming platform. Eagle streaming API allows developers to easily integrate 
> business logic with Eagle policy engine and internally Eagle framework 
> compiles business logic execution DAG into program primitives of underlying 
> stream infrastructure e.g. Apache Storm. For example, Eagle HDFS monitoring 
> transforms audit log from Namenode to object and joins sensitivity metadata, 
> security zone metadata which are generated from external programs or 
> configured by user. Eagle hive monitoring filters running jobs to get hive 
> query string and parses query string into object and then joins sensitivity 
> metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API, 
> scalable policy engine framework, extensible policy engine framework. Stream 
> metadata API allows developers to declare event schema including what 
> attributes constitute an event, what is the type for each attribute, and how 
> to dynamically resolve attribute value in runtime when user configures 
> policy. Scalable policy engine framework allows policies to be executed on 
> different physical nodes in parallel. It is also used to define your own 
> policy partitioner class. Policy engine framework together with streaming 
> partitioning capability provided by all streaming platforms will make sure 
> policies and events can be evaluated in a fully distributed way. Extensible 
> policy engine framework allows developer to plugin a new policy engine with a 
> few lines of codes. WSO2 Siddhi CEP engine is the policy engine which Eagle 
> supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user activity 
> patterns or user profiles for Hadoop users based on the user behaviour in the 
> platform. These user profiles are modeled using Machine Learning algorithms 
> and used for detection 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread larry mccay
+1 (non-binding)

On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun wrote:

> Hello Everyone,
>
> Thanks for all the feedback on the Eagle Proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an
> incubation project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a Monitoring solution for Hadoop to instantly identify access to
> sensitive data, recognize attacks, malicious activities and take actions in
> real time. Eagle supports a wide variety of policies on HDFS data and Hive.
> Eagle also provides machine learning models for detecting anomalous user
> behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
> 
>
> Eagle
>
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly
> identify access to sensitive data, recognize attacks, malicious activities
> in hadoop and take actions.
>
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time,
> enforces policies defined on sensitive data access and alerts or blocks
> user’s access to that sensitive data in real time. Eagle also creates user
> profiles based on the typical access behaviour for HDFS and Hive and sends
> alerts when anomalous behaviour is detected. Eagle can also import
> sensitive data information classified by external classification engines to
> help define its policies.
>
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop
> logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create
> policies based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service
> and the visualization component. Eagle provides intuitive user interface to
> administer Eagle and an alert dashboard to respond to real time alerts.
>
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data
> source into Eagle policy evaluation framework. For example, Eagle hdfs
> audit monitoring collects data from Kafka which is populated from namenode
> log4j appender or from logstash agent. Eagle hive monitoring collects hive
> query logs from running job through YARN API, which is designed to be
> scalable and fault-tolerant. Eagle uses HBase as storage for storing
> metadata and metrics data, and also supports relational database through
> configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an
> abstraction of Apache Storm. It can also be extended to other streaming
> engines. This abstraction allows developers to assemble data
> transformation, filtering, external data join etc. without physically bound
> to a specific streaming platform. Eagle streaming API allows developers to
> easily integrate business logic with Eagle policy engine and internally
> Eagle framework compiles business logic execution DAG into program
> primitives of underlying stream infrastructure e.g. Apache Storm. For
> example, Eagle HDFS monitoring transforms audit log from Namenode to object
> and joins sensitivity metadata, security zone metadata which are generated
> from external programs or configured by user. Eagle hive monitoring filters
> running jobs to get hive query string and parses query string into object
> and then joins sensitivity metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API,
> scalable policy engine framework, extensible policy engine framework.
> Stream metadata API allows developers to declare event schema including
> what attributes constitute an event, what is the type for each attribute,
> and how to dynamically resolve attribute value in runtime when user
> configures policy. Scalable policy engine framework allows policies to be
> executed on different physical nodes in parallel. It is also used to define
> your own policy partitioner class. Policy engine framework together with
> streaming partitioning capability provided by all streaming platforms will
> make sure policies and events can be evaluated in a fully distributed way.
> Extensible policy engine framework allows developer to plugin a new policy
> engine with a few lines of codes. WSO2 Siddhi CEP engine is the policy
> engine which Eagle supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user
> activity patterns or user profiles for Hadoop users based on the user
> behaviour in the platform. These user profiles are modeled using Machine
> Learning algorithms and used for detection of anomalous users activities.
> Eagle 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread P. Taylor Goetz
+1 (binding)

-Taylor

> On Oct 23, 2015, at 10:11 AM, Manoharan, Arun  wrote:
> 
> Hello Everyone,
> 
> Thanks for all the feedback on the Eagle Proposal.
> 
> I would like to call for a [VOTE] on Eagle joining the ASF as an incubation 
> project.
> 
> The vote is open for 72 hours:
> 
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
> 
> Eagle is a Monitoring solution for Hadoop to instantly identify access to 
> sensitive data, recognize attacks, malicious activities and take actions in 
> real time. Eagle supports a wide variety of policies on HDFS data and Hive. 
> Eagle also provides machine learning models for detecting anomalous user 
> behavior in Hadoop.
> 
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
> 
> The text of the proposal is also available at the end of this email.
> 
> Thanks for your time and help.
> 
> Thanks,
> Arun
> 
> 
> 
> Eagle
> 
> Abstract
> Eagle is an Open Source Monitoring solution for Hadoop to instantly identify 
> access to sensitive data, recognize attacks, malicious activities in hadoop 
> and take actions.
> 
> Proposal
> Eagle audits access to HDFS files, Hive and HBase tables in real time, 
> enforces policies defined on sensitive data access and alerts or blocks 
> user’s access to that sensitive data in real time. Eagle also creates user 
> profiles based on the typical access behaviour for HDFS and Hive and sends 
> alerts when anomalous behaviour is detected. Eagle can also import sensitive 
> data information classified by external classification engines to help define 
> its policies.
> 
> Overview of Eagle
> Eagle has 3 main parts.
> 1.Data collection and storage - Eagle collects data from various hadoop logs 
> in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
> 2.Data processing and policy engine - Eagle allows users to create policies 
> based on various metadata properties on HDFS, Hive and HBase data.
> 3.Eagle services - Eagle services include policy manager, query service and 
> the visualization component. Eagle provides intuitive user interface to 
> administer Eagle and an alert dashboard to respond to real time alerts.
> 
> Data Collection and Storage:
> Eagle provides programming API for extending Eagle to integrate any data 
> source into Eagle policy evaluation framework. For example, Eagle hdfs audit 
> monitoring collects data from Kafka which is populated from namenode log4j 
> appender or from logstash agent. Eagle hive monitoring collects hive query 
> logs from running job through YARN API, which is designed to be scalable and 
> fault-tolerant. Eagle uses HBase as storage for storing metadata and metrics 
> data, and also supports relational database through configuration change.
> 
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides stream processing API which is an 
> abstraction of Apache Storm. It can also be extended to other streaming 
> engines. This abstraction allows developers to assemble data transformation, 
> filtering, external data join etc. without physically bound to a specific 
> streaming platform. Eagle streaming API allows developers to easily integrate 
> business logic with Eagle policy engine and internally Eagle framework 
> compiles business logic execution DAG into program primitives of underlying 
> stream infrastructure e.g. Apache Storm. For example, Eagle HDFS monitoring 
> transforms audit log from Namenode to object and joins sensitivity metadata, 
> security zone metadata which are generated from external programs or 
> configured by user. Eagle hive monitoring filters running jobs to get hive 
> query string and parses query string into object and then joins sensitivity 
> metadata.
> Alerting Framework: Eagle Alert Framework includes stream metadata API, 
> scalable policy engine framework, extensible policy engine framework. Stream 
> metadata API allows developers to declare event schema including what 
> attributes constitute an event, what is the type for each attribute, and how 
> to dynamically resolve attribute value in runtime when user configures 
> policy. Scalable policy engine framework allows policies to be executed on 
> different physical nodes in parallel. It is also used to define your own 
> policy partitioner class. Policy engine framework together with streaming 
> partitioning capability provided by all streaming platforms will make sure 
> policies and events can be evaluated in a fully distributed way. Extensible 
> policy engine framework allows developer to plugin a new policy engine with a 
> few lines of codes. WSO2 Siddhi CEP engine is the policy engine which Eagle 
> supports as first-class citizen.
> Machine Learning module: Eagle provides capabilities to define user activity 
> patterns or user profiles for Hadoop users based on the user behaviour in the 
> platform. These user profiles are modeled using Machine Learning 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Samant, Medha
+1
-Medha

On 10/23/15, 1:14 PM, "Balaji Ganesan"  wrote:

>+1
>
>On Fri, Oct 23, 2015 at 12:26 PM, Chris Nauroth wrote:
>
>> +1 (binding)
>>
>> --Chris Nauroth
>>
>>
>>
>>
>> On 10/23/15, 7:11 AM, "Manoharan, Arun"  wrote:
>>
>> >Hello Everyone,
>> >
>> >Thanks for all the feedback on the Eagle Proposal.
>> >
>> >I would like to call for a [VOTE] on Eagle joining the ASF as an
>> >incubation project.
>> >
>> >The vote is open for 72 hours:
>> >
>> >[ ] +1 accept Eagle in the Incubator
>> >[ ] ±0
>> >[ ] -1 (please give reason)
>> >
>> >Eagle is a Monitoring solution for Hadoop to instantly identify access to
>> >sensitive data, recognize attacks, malicious activities and take actions
>> >in real time. Eagle supports a wide variety of policies on HDFS data and
>> >Hive. Eagle also provides machine learning models for detecting anomalous
>> >user behavior in Hadoop.
>> >
>> >The proposal is available on the wiki here:
>> >https://wiki.apache.org/incubator/EagleProposal
>> >
>> >The text of the proposal is also available at the end of this email.
>> >
>> >Thanks for your time and help.
>> >
>> >Thanks,
>> >Arun
>> >
>> >
>> >
>> >Eagle
>> >
>> >Abstract
>> >Eagle is an Open Source Monitoring solution for Hadoop to instantly
>> >identify access to sensitive data, recognize attacks, malicious
>> >activities in hadoop and take actions.
>> >
>> >Proposal
>> >Eagle audits access to HDFS files, Hive and HBase tables in real time,
>> >enforces policies defined on sensitive data access and alerts or blocks
>> >user's access to that sensitive data in real time. Eagle also creates
>> >user profiles based on the typical access behaviour for HDFS and Hive and
>> >sends alerts when anomalous behaviour is detected. Eagle can also import
>> >sensitive data information classified by external classification engines
>> >to help define its policies.
>> >
>> >Overview of Eagle
>> >Eagle has 3 main parts.
>> >1.Data collection and storage - Eagle collects data from various hadoop
>> >logs in real time using Kafka/Yarn API and uses HDFS and HBase for
>> >storage.
>> >2.Data processing and policy engine - Eagle allows users to create
>> >policies based on various metadata properties on HDFS, Hive and HBase
>> >data.
>> >3.Eagle services - Eagle services include policy manager, query service
>> >and the visualization component. Eagle provides intuitive user interface
>> >to administer Eagle and an alert dashboard to respond to real time alerts.
>> >
>> >Data Collection and Storage:
>> >Eagle provides programming API for extending Eagle to integrate any data
>> >source into Eagle policy evaluation framework. For example, Eagle hdfs
>> >audit monitoring collects data from Kafka which is populated from
>> >namenode log4j appender or from logstash agent. Eagle hive monitoring
>> >collects hive query logs from running job through YARN API, which is
>> >designed to be scalable and fault-tolerant. Eagle uses HBase as storage
>> >for storing metadata and metrics data, and also supports relational
>> >database through configuration change.
>> >
>> >Data Processing and Policy Engine:
>> >Processing Engine: Eagle provides stream processing API which is an
>> >abstraction of Apache Storm. It can also be extended to other streaming
>> >engines. This abstraction allows developers to assemble data
>> >transformation, filtering, external data join etc. without physically
>> >bound to a specific streaming platform. Eagle streaming API allows
>> >developers to easily integrate business logic with Eagle policy engine
>> >and internally Eagle framework compiles business logic execution DAG into
>> >program primitives of underlying stream infrastructure e.g. Apache Storm.
>> >For example, Eagle HDFS monitoring transforms audit log from Namenode to
>> >object and joins sensitivity metadata, security zone metadata which are
>> >generated from external programs or configured by user. Eagle hive
>> >monitoring filters running jobs to get hive query string and parses query
>> >string into object and then joins sensitivity metadata.
>> >Alerting Framework: Eagle Alert Framework includes stream metadata API,
>> >scalable policy engine framework, extensible policy engine framework.
>> >Stream metadata API allows developers to declare event schema including
>> >what attributes constitute an event, what is the type for each attribute,
>> >and how to dynamically resolve attribute value in runtime when user
>> >configures policy. Scalable policy engine framework allows policies to be
>> >executed on different physical nodes in parallel. It is also used to
>> >define your own policy partitioner class. Policy engine framework
>> >together with streaming partitioning capability provided by all streaming
>> >platforms will make sure policies and events can be evaluated in a fully
>> >distributed way. Extensible policy engine framework allows developer to
>> 

Re: [VOTE] Accept Eagle into Apache Incubation

2015-10-23 Thread Shaofeng Shi
+1 (non-binding)

"Manoharan, Arun" wrote:

>Hello Everyone,
>
>Thanks for all the feedback on the Eagle Proposal.
>
>I would like to call for a [VOTE] on Eagle joining the ASF as an incubation 
>project.
>
>The vote is open for 72 hours:
>
>[ ] +1 accept Eagle in the Incubator
>[ ] ±0
>[ ] -1 (please give reason)
>
>Eagle is a Monitoring solution for Hadoop to instantly identify access to 
>sensitive data, recognize attacks, malicious activities and take actions in 
>real time. Eagle supports a wide variety of policies on HDFS data and Hive. 
>Eagle also provides machine learning models for detecting anomalous user 
>behavior in Hadoop.
>
>The proposal is available on the wiki here:
>https://wiki.apache.org/incubator/EagleProposal
>
>The text of the proposal is also available at the end of this email.
>
>Thanks for your time and help.
>
>Thanks,
>Arun
>
>
>
>Eagle
>
>Abstract
>Eagle is an Open Source Monitoring solution for Hadoop to instantly identify 
>access to sensitive data, recognize attacks, malicious activities in hadoop 
>and take actions.
>
>Proposal
>Eagle audits access to HDFS files, Hive and HBase tables in real time, 
>enforces policies defined on sensitive data access and alerts or blocks user’s 
>access to that sensitive data in real time. Eagle also creates user profiles 
>based on the typical access behaviour for HDFS and Hive and sends alerts when 
>anomalous behaviour is detected. Eagle can also import sensitive data 
>information classified by external classification engines to help define its 
>policies.
>
>Overview of Eagle
>Eagle has 3 main parts.
>1.Data collection and storage - Eagle collects data from various hadoop logs 
>in real time using Kafka/Yarn API and uses HDFS and HBase for storage.
>2.Data processing and policy engine - Eagle allows users to create policies 
>based on various metadata properties on HDFS, Hive and HBase data.
>3.Eagle services - Eagle services include policy manager, query service and 
>the visualization component. Eagle provides intuitive user interface to 
>administer Eagle and an alert dashboard to respond to real time alerts.
>
>Data Collection and Storage:
>Eagle provides programming API for extending Eagle to integrate any data 
>source into Eagle policy evaluation framework. For example, Eagle hdfs audit 
>monitoring collects data from Kafka which is populated from namenode log4j 
>appender or from logstash agent. Eagle hive monitoring collects hive query 
>logs from running job through YARN API, which is designed to be scalable and 
>fault-tolerant. Eagle uses HBase as storage for storing metadata and metrics 
>data, and also supports relational database through configuration change.
>
>Data Processing and Policy Engine:
>Processing Engine: Eagle provides a stream processing API that is an 
>abstraction over Apache Storm and can also be extended to other streaming 
>engines. This abstraction allows developers to assemble data transformation, 
>filtering, external data joins, etc. without being physically bound to a 
>specific streaming platform. The Eagle streaming API allows developers to 
>easily integrate business logic with the Eagle policy engine; internally, the 
>Eagle framework compiles the business logic execution DAG into program 
>primitives of the underlying stream infrastructure, e.g. Apache Storm. For 
>example, Eagle HDFS monitoring transforms audit logs from the NameNode into 
>objects and joins them with sensitivity metadata and security zone metadata, 
>which are generated by external programs or configured by the user. Eagle 
>Hive monitoring filters running jobs to get the Hive query string, parses the 
>query string into an object, and then joins it with sensitivity metadata.
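
The transform, filter and join steps described in the Processing Engine paragraph can be pictured with a small, platform-independent sketch. This is not Eagle's API: the `Pipeline` class, `parse_audit_line`, the simplified `key=value` audit line format and the `SENSITIVITY` table are all hypothetical names chosen for illustration, and the "DAG" here is just a linear chain of steps run in-process rather than compiled to Storm.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Event = Dict[str, str]
Step = Callable[[Iterator[Any]], Iterator[Any]]


class Pipeline:
    """Hypothetical, engine-agnostic chain of stream steps (a linear DAG)."""

    def __init__(self) -> None:
        self._steps: List[Step] = []

    def map(self, fn: Callable[[Any], Any]) -> "Pipeline":
        # Transformation: apply fn to every element of the stream.
        self._steps.append(lambda events: (fn(e) for e in events))
        return self

    def filter(self, pred: Callable[[Any], bool]) -> "Pipeline":
        # Filtering: keep only elements for which pred is true.
        self._steps.append(lambda events: (e for e in events if pred(e)))
        return self

    def join(self, key: str, table: Dict[str, str], target: str) -> "Pipeline":
        # External data join: look up event[key] in a side table and
        # attach the result under `target` ("UNKNOWN" if not found).
        def step(events: Iterator[Event]) -> Iterator[Event]:
            for e in events:
                yield {**e, target: table.get(e.get(key, ""), "UNKNOWN")}

        self._steps.append(step)
        return self

    def run(self, source: Iterable[Any]) -> List[Any]:
        # A real framework would compile the steps to the underlying
        # streaming engine; here they are simply chained lazily.
        stream: Iterator[Any] = iter(source)
        for step in self._steps:
            stream = step(stream)
        return list(stream)


def parse_audit_line(line: str) -> Event:
    # Assumed simplified audit format: space-separated key=value pairs.
    return dict(pair.split("=", 1) for pair in line.split())


# Assumed sensitivity metadata, e.g. produced by an external classifier.
SENSITIVITY = {"/data/pii/customers.csv": "PII"}

lines = [
    "user=alice cmd=open src=/data/pii/customers.csv",
    "user=bob cmd=listStatus src=/tmp",
    "user=carol cmd=open src=/tmp/scratch.txt",
]

pipeline = (
    Pipeline()
    .map(parse_audit_line)
    .filter(lambda e: e["cmd"] == "open")
    .join("src", SENSITIVITY, "sensitivity")
)
results = pipeline.run(lines)
```

Because the business logic is expressed only through `map`, `filter` and `join`, the same chain could in principle be handed to a different backend without changing the caller, which is the point of the abstraction described above.
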
>Alerting Framework: The Eagle Alert Framework includes a stream metadata API, 
>a scalable policy engine framework, and an extensible policy engine 
>framework. The stream metadata API allows developers to declare an event 
>schema, including which attributes constitute an event, the type of each 
>attribute, and how to dynamically resolve attribute values at runtime when a 
>user configures a policy. The scalable policy engine framework allows 
>policies to be executed on different physical nodes in parallel. It also lets 
>developers define their own policy partitioner class. The policy engine 
>framework, together with the stream partitioning capability provided by all 
>streaming platforms, ensures that policies and events can be evaluated in a 
>fully distributed way. The extensible policy engine framework allows 
>developers to plug in a new policy engine with a few lines of code. The WSO2 
>Siddhi CEP engine is the policy engine that Eagle supports as a first-class 
>citizen.
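
As a rough illustration of the policy partitioning idea in the Alerting Framework paragraph, the sketch below spreads policies over a fixed number of partitions by hashing the policy id, then evaluates an event against every partition. Everything here is an assumption made for the example: `partition_for`, `assign_policies`, `evaluate`, the CRC32 hashing scheme and the Python predicates standing in for a real policy engine are not taken from Eagle.

```python
import zlib
from typing import Callable, Dict, List

Event = Dict[str, str]
Policy = Callable[[Event], bool]


def partition_for(policy_id: str, num_partitions: int) -> int:
    # Hypothetical partitioner: a stable hash of the policy id, so the
    # same policy always lands on the same partition.
    return zlib.crc32(policy_id.encode("utf-8")) % num_partitions


def assign_policies(
    policies: Dict[str, Policy], num_partitions: int
) -> List[Dict[str, Policy]]:
    # Each partition would live on a different physical node in a
    # distributed deployment; here they are plain dictionaries.
    partitions: List[Dict[str, Policy]] = [{} for _ in range(num_partitions)]
    for policy_id, policy in policies.items():
        partitions[partition_for(policy_id, num_partitions)][policy_id] = policy
    return partitions


def evaluate(event: Event, partitions: List[Dict[str, Policy]]) -> List[str]:
    # Collect the ids of all policies that match the event. The loop over
    # partitions is sequential here but independent per partition.
    alerts: List[str] = []
    for partition in partitions:
        alerts.extend(pid for pid, policy in partition.items() if policy(event))
    return sorted(alerts)


# Assumed example policies, written as predicates instead of CEP queries.
policies: Dict[str, Policy] = {
    "sensitive-read": lambda e: e.get("sensitivity") == "PII"
    and e.get("cmd") == "open",
    "tmp-delete": lambda e: e.get("cmd") == "delete"
    and e.get("src", "").startswith("/tmp"),
}

partitions = assign_policies(policies, num_partitions=3)
event = {
    "user": "alice",
    "cmd": "open",
    "src": "/data/pii/customers.csv",
    "sensitivity": "PII",
}
alerts = evaluate(event, partitions)
```

A custom partitioner would replace `partition_for`, for example to co-locate policies that read the same stream.
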
>Machine Learning module: Eagle provides capabilities to define user activity 
>patterns or user profiles for Hadoop users based on user behaviour in the 
>platform. These user profiles are modeled using machine learning algorithms 
>and used to detect anomalous user activities. Eagle uses Eigenvalue 
>Decomposition and Density Estimation algorithms for