Re: [VOTE] Accept Eagle into Apache Incubation
Follow-up announcement: the Apache Eagle incubating mailing lists are now available:

• d...@eagle.incubator.apache.org (subscribe by sending email to dev-subscr...@eagle.incubator.apache.org)
• comm...@eagle.incubator.apache.org (subscribe by sending email to commits-subscr...@eagle.incubator.apache.org)
• u...@eagle.incubator.apache.org (subscribe by sending email to user-subscr...@eagle.incubator.apache.org)

Thanks,
Henry

On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun wrote:
> Hello Everyone,
>
> Thanks for all the feedback on the Eagle proposal.
>
> I would like to call for a [VOTE] on Eagle joining the ASF as an incubation project.
>
> The vote is open for 72 hours:
>
> [ ] +1 accept Eagle in the Incubator
> [ ] ±0
> [ ] -1 (please give reason)
>
> Eagle is a monitoring solution for Hadoop that instantly identifies access to sensitive data, recognizes attacks and malicious activities, and takes action in real time. Eagle supports a wide variety of policies on HDFS data and Hive. Eagle also provides machine learning models for detecting anomalous user behavior in Hadoop.
>
> The proposal is available on the wiki here:
> https://wiki.apache.org/incubator/EagleProposal
>
> The text of the proposal is also available at the end of this email.
>
> Thanks for your time and help.
>
> Thanks,
> Arun
>
>
> Eagle
>
> Abstract
> Eagle is an open-source monitoring solution for Hadoop that instantly identifies access to sensitive data, recognizes attacks and malicious activities in Hadoop, and takes action.
>
> Proposal
> Eagle audits access to HDFS files and to Hive and HBase tables in real time, enforces policies defined on sensitive data access, and alerts on or blocks a user's access to that sensitive data in real time. Eagle also builds user profiles from typical HDFS and Hive access behaviour and sends alerts when anomalous behaviour is detected.
> Eagle can also import sensitive-data classifications produced by external classification engines to help define its policies.
>
> Overview of Eagle
> Eagle has three main parts:
> 1. Data collection and storage - Eagle collects data from various Hadoop logs in real time using Kafka and the YARN API, and uses HDFS and HBase for storage.
> 2. Data processing and policy engine - Eagle allows users to create policies based on various metadata properties of HDFS, Hive, and HBase data.
> 3. Eagle services - Eagle services include the policy manager, the query service, and the visualization component. Eagle provides an intuitive user interface for administering Eagle and an alert dashboard for responding to real-time alerts.
>
> Data Collection and Storage:
> Eagle provides a programming API for integrating any data source into the Eagle policy evaluation framework. For example, Eagle HDFS audit monitoring collects data from Kafka, which is populated from a NameNode log4j appender or from a Logstash agent. Eagle Hive monitoring collects Hive query logs from running jobs through the YARN API and is designed to be scalable and fault-tolerant. Eagle uses HBase to store metadata and metrics data, and also supports relational databases through a configuration change.
>
> Data Processing and Policy Engine:
> Processing Engine: Eagle provides a stream processing API that is an abstraction over Apache Storm; it can also be extended to other streaming engines. This abstraction lets developers assemble data transformation, filtering, external data joins, etc. without being physically bound to a specific streaming platform. The Eagle streaming API allows developers to easily integrate business logic with the Eagle policy engine; internally, the Eagle framework compiles the business-logic execution DAG into the program primitives of the underlying stream infrastructure, e.g. Apache Storm.
> For example, Eagle HDFS monitoring transforms the audit log from the NameNode into objects and joins in sensitivity metadata and security-zone metadata, which are generated by external programs or configured by the user. Eagle Hive monitoring filters running jobs to obtain the Hive query string, parses it into an object, and then joins in sensitivity metadata.
> Alerting Framework: The Eagle alert framework includes a stream metadata API, a scalable policy engine framework, and an extensible policy engine framework. The stream metadata API allows developers to declare an event schema: which attributes constitute an event, the type of each attribute, and how to dynamically resolve attribute values at runtime when a user configures a policy. The scalable policy engine framework allows policies to be executed on different physical nodes in parallel, and also lets you define your own policy partitioner class. The policy engine framework, together with the stream-partitioning capability provided by all streaming platforms, ensures that policies and events can be evaluated in a fully distributed way. The extensible policy engine framework allows a developer to plug in a new policy engine with a few lines of code; the WSO2 Siddhi CEP engine is the policy engine that Eagle supports as a first-class citizen.
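[Editor's note: as a rough illustration of the alerting framework described in the proposal above, the sketch below shows how an event schema, a policy partitioner, and a pluggable policy predicate might fit together. All names here are hypothetical; Eagle's actual APIs are in Java and differ, and a real deployment would plug in a CEP engine such as WSO2 Siddhi rather than plain predicates.]

```python
from dataclasses import dataclass

# Stream metadata: declare which attributes constitute an event and their types.
# (Illustrative schema; not Eagle's actual HDFS audit schema.)
HDFS_AUDIT_SCHEMA = {"user": str, "cmd": str, "path": str, "sensitivity": str}

@dataclass
class Event:
    attrs: dict

    def validate(self, schema: dict) -> bool:
        # Every declared attribute must be present with the declared type.
        return all(isinstance(self.attrs.get(k), t) for k, t in schema.items())

def partition(event: Event, num_nodes: int) -> int:
    # Policy partitioner: route events by user so policies can be
    # evaluated on different nodes in parallel, fully distributed.
    return hash(event.attrs["user"]) % num_nodes

def evaluate(policies: dict, event: Event) -> list:
    # Pluggable policy engine: here a policy is just a named predicate.
    return [name for name, pred in policies.items() if pred(event)]

policies = {
    "sensitive-delete": lambda e: e.attrs["cmd"] == "delete"
                                  and e.attrs["sensitivity"] == "PII",
}

evt = Event({"user": "alice", "cmd": "delete",
             "path": "/data/pii/users.db", "sensitivity": "PII"})
assert evt.validate(HDFS_AUDIT_SCHEMA)
alerts = evaluate(policies, evt)
print(alerts)  # -> ['sensitive-delete']
```

The partitioner mirrors the fields-grouping style of partitioning that streaming platforms like Storm provide: events for the same user always land on the same policy-engine node.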
Re: [VOTE] Accept Eagle into Apache Incubation
Thanks very much, Henry. It is great that we have these mailing lists so we can communicate with the community.

Thanks,
Edward Zhang

On 11/3/15, 16:46, "Henry Saputra" wrote:
>Follow-up announcement: the Apache Eagle incubating mailing lists are
>now available:
>
>• d...@eagle.incubator.apache.org (subscribe by sending email to
>dev-subscr...@eagle.incubator.apache.org)
>• comm...@eagle.incubator.apache.org (subscribe by sending email to
>commits-subscr...@eagle.incubator.apache.org)
>• u...@eagle.incubator.apache.org (subscribe by sending email to
>user-subscr...@eagle.incubator.apache.org)
>
>Thanks,
>
>Henry
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (non-binding)

On Mon, Oct 26, 2015 at 10:50 AM, hongbin ma wrote:
> +1 (non binding)
>
> On Mon, Oct 26, 2015 at 12:20 AM, Ralph Goers wrote:
> > +1 (binding)
> >
> > Ralph
Re: [VOTE] Accept Eagle into Apache Incubation
+1 non binding

Bosco

From: Li Yang <liy...@apache.org>
Sent: Sunday, October 25, 2015 8:13 PM
Subject: Re: [VOTE] Accept Eagle into Apache Incubation
To: <general@incubator.apache.org>

+1 (non-binding)

On Mon, Oct 26, 2015 at 10:50 AM, hongbin ma <mahong...@apache.org> wrote:
> +1 (non binding)
>
> On Mon, Oct 26, 2015 at 12:20 AM, Ralph Goers <ralph.go...@dslextreme.com> wrote:
> > +1 (binding)
> >
> > Ralph
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (non binding)

On Mon, Oct 26, 2015 at 12:20 AM, Ralph Goers wrote:
> +1 (binding)
>
> Ralph
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding)

Ralph
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding) On Fri, Oct 23, 2015 at 7:41 PM, Manoharan, Arunwrote: > Hello Everyone, > > Thanks for all the feedback on the Eagle Proposal. > > I would like to call for a [VOTE] on Eagle joining the ASF as an > incubation project. > > The vote is open for 72 hours: > > [ ] +1 accept Eagle in the Incubator > [ ] ±0 > [ ] -1 (please give reason) > > Eagle is a Monitoring solution for Hadoop to instantly identify access to > sensitive data, recognize attacks, malicious activities and take actions in > real time. Eagle supports a wide variety of policies on HDFS data and Hive. > Eagle also provides machine learning models for detecting anomalous user > behavior in Hadoop. > > The proposal is available on the wiki here: > https://wiki.apache.org/incubator/EagleProposal > > The text of the proposal is also available at the end of this email. > > Thanks for your time and help. > > Thanks, > Arun > > > > Eagle > > Abstract > Eagle is an Open Source Monitoring solution for Hadoop to instantly > identify access to sensitive data, recognize attacks, malicious activities > in hadoop and take actions. > > Proposal > Eagle audits access to HDFS files, Hive and HBase tables in real time, > enforces policies defined on sensitive data access and alerts or blocks > user’s access to that sensitive data in real time. Eagle also creates user > profiles based on the typical access behaviour for HDFS and Hive and sends > alerts when anomalous behaviour is detected. Eagle can also import > sensitive data information classified by external classification engines to > help define its policies. > > Overview of Eagle > Eagle has 3 main parts. > 1.Data collection and storage - Eagle collects data from various hadoop > logs in real time using Kafka/Yarn API and uses HDFS and HBase for storage. > 2.Data processing and policy engine - Eagle allows users to create > policies based on various metadata properties on HDFS, Hive and HBase data. 
> 3.Eagle services - Eagle services include policy manager, query service > and the visualization component. Eagle provides intuitive user interface to > administer Eagle and an alert dashboard to respond to real time alerts. > > Data Collection and Storage: > Eagle provides programming API for extending Eagle to integrate any data > source into Eagle policy evaluation framework. For example, Eagle hdfs > audit monitoring collects data from Kafka which is populated from namenode > log4j appender or from logstash agent. Eagle hive monitoring collects hive > query logs from running job through YARN API, which is designed to be > scalable and fault-tolerant. Eagle uses HBase as storage for storing > metadata and metrics data, and also supports relational database through > configuration change. > > Data Processing and Policy Engine: > Processing Engine: Eagle provides stream processing API which is an > abstraction of Apache Storm. It can also be extended to other streaming > engines. This abstraction allows developers to assemble data > transformation, filtering, external data join etc. without physically bound > to a specific streaming platform. Eagle streaming API allows developers to > easily integrate business logic with Eagle policy engine and internally > Eagle framework compiles business logic execution DAG into program > primitives of underlying stream infrastructure e.g. Apache Storm. For > example, Eagle HDFS monitoring transforms audit log from Namenode to object > and joins sensitivity metadata, security zone metadata which are generated > from external programs or configured by user. Eagle hive monitoring filters > running jobs to get hive query string and parses query string into object > and then joins sensitivity metadata. > Alerting Framework: Eagle Alert Framework includes stream metadata API, > scalable policy engine framework, extensible policy engine framework. 
> The stream metadata API allows developers to declare the event schema, including > what attributes constitute an event, what the type of each attribute is, > and how to dynamically resolve attribute values at runtime when a user > configures a policy. The scalable policy engine framework allows policies to be > executed on different physical nodes in parallel. It can also be used to define > your own policy partitioner class. The policy engine framework, together with > the stream partitioning capability provided by all streaming platforms, will > ensure policies and events can be evaluated in a fully distributed way. The > extensible policy engine framework allows developers to plug in a new policy > engine with a few lines of code. The WSO2 Siddhi CEP engine is the policy > engine which Eagle supports as a first-class citizen. > Machine Learning module: Eagle provides capabilities to define user > activity patterns or user profiles for Hadoop users based on the user's > behaviour in the platform. These user profiles are modeled using machine > learning algorithms and used for detection of anomalous user activities. > Eagle uses
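[Editor's note: the alerting framework described in the proposal above — a declared event schema, a pluggable policy predicate, and a partitioner that routes events so policies can run on parallel nodes — can be sketched roughly as follows. This is an illustrative Python sketch only; all names and structures here are hypothetical and do not reflect Eagle's actual API.]

```python
# Hypothetical sketch of the proposal's alerting ideas: stream metadata
# (event schema), pluggable policies, and a policy partitioner.
from dataclasses import dataclass
from typing import Callable, Dict, List

# Stream metadata: which attributes constitute an event, and their types.
HDFS_AUDIT_SCHEMA = {"user": str, "path": str, "cmd": str}

@dataclass
class Policy:
    name: str
    # Predicate returns True when the event violates the policy.
    predicate: Callable[[Dict], bool]

def validate(event: Dict, schema: Dict) -> None:
    """Reject events that do not match the declared schema."""
    for attr, typ in schema.items():
        if not isinstance(event.get(attr), typ):
            raise ValueError(f"attribute {attr!r} missing or not {typ.__name__}")

def partition(event: Dict, num_nodes: int) -> int:
    """Toy policy partitioner: shard by user, so one node sees a user's full stream."""
    return hash(event["user"]) % num_nodes

def evaluate(event: Dict, policies: List[Policy]) -> List[str]:
    """Validate an event against the schema, then return names of violated policies."""
    validate(event, HDFS_AUDIT_SCHEMA)
    return [p.name for p in policies if p.predicate(event)]

policies = [
    Policy("sensitive-read",
           lambda e: e["cmd"] == "open" and e["path"].startswith("/secure/")),
]

alerts = evaluate({"user": "alice", "path": "/secure/pii.db", "cmd": "open"}, policies)
print(alerts)  # ['sensitive-read']
```

In Eagle itself this role is filled by the WSO2 Siddhi CEP engine running inside Storm bolts, with partitioning handled by the streaming platform's stream groupings rather than a hand-rolled hash.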
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding) On Fri, Oct 23, 2015 at 8:42 AM, wp chun wrote: > +1 > wp_c...@hotmail.com > > On 10/23/15, 11:26 PM, "P. Taylor Goetz" wrote: > >+1 (binding) > > > >-Taylor
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (non-binding) Best Regards! - Luke Han On Fri, Oct 23, 2015 at 11:26 PM, P. Taylor Goetz wrote: > +1 (binding) > > -Taylor
RE: [VOTE] Accept Eagle into Apache Incubation
+1 wp_c...@hotmail.com > On 10/23/15, 11:26 PM, "P. Taylor Goetz" wrote: > >+1 (binding) > > > >-Taylor
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (non-binding) 2015-10-23 23:50 GMT+08:00 Owen O'Malley: > +1 (binding) > > On Fri, Oct 23, 2015 at 8:42 AM, wp chun wrote: > > +1 > > wp_c...@hotmail.com > > > > On 10/23/15, 11:26 PM, "P. Taylor Goetz" wrote: > > >+1 (binding) > > > > > >-Taylor
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding) On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun wrote: > Hello Everyone, > > Thanks for all the feedback on the Eagle Proposal. > > I would like to call for a [VOTE] on Eagle joining the ASF as an > incubation project.
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (non-binding) On Fri, Oct 23, 2015 at 10:11 PM, Manoharan, Arun wrote: > Hello Everyone, > > Thanks for all the feedback on the Eagle Proposal. > > I would like to call for a [VOTE] on Eagle joining the ASF as an > incubation project.
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding) — Hitesh On Oct 23, 2015, at 7:11 AM, Manoharan, Arun wrote: > [quoted proposal trimmed]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 On Oct 23, 2015 10:11, "Manoharan, Arun" wrote: > [quoted proposal trimmed]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (non-binding) On 10/23/15, 9:52 AM, "Hitesh Shah" wrote: >+1 (binding) > >— Hitesh > >On Oct 23, 2015, at 7:11 AM, Manoharan, Arun wrote: >> [quoted proposal trimmed]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding) > On Oct 23, 2015, at 10:13 AM, John D. Ament wrote: > > +1 > On Oct 23, 2015 10:11, "Manoharan, Arun" wrote: >> [quoted proposal trimmed]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding) --Chris Nauroth On 10/23/15, 7:11 AM, "Manoharan, Arun" wrote: >[quoted proposal trimmed]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 On Fri, Oct 23, 2015 at 12:26 PM, Chris Nauroth wrote: > +1 (binding) > > --Chris Nauroth > > On 10/23/15, 7:11 AM, "Manoharan, Arun" wrote: > >[quoted proposal trimmed]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 On Sat, Oct 24, 2015 at 08:40, Shaofeng Shi wrote: > +1 (non-binding) > > "Manoharan, Arun" wrote: > >[quoted proposal trimmed]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding)

On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun wrote:
> [...]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding)

On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun wrote:
> [...]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (non-binding)

On Fri, Oct 23, 2015 at 7:11 AM, Manoharan, Arun wrote:
> [...]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (binding)

-Taylor

> On Oct 23, 2015, at 10:11 AM, Manoharan, Arun wrote:
> [...]
Re: [VOTE] Accept Eagle into Apache Incubation
+1

-Medha

On 10/23/15, 1:14 PM, "Balaji Ganesan" wrote:
>+1
>
>On Fri, Oct 23, 2015 at 12:26 PM, Chris Nauroth wrote:
>> +1 (binding)
>>
>> --Chris Nauroth
>>
>> On 10/23/15, 7:11 AM, "Manoharan, Arun" wrote:
>> [...]
Re: [VOTE] Accept Eagle into Apache Incubation
+1 (non-binding)

"Manoharan, Arun" wrote:
> [...]