http://git-wip-us.apache.org/repos/asf/metron/blob/ae1d3eb9/site/current-book/metron-analytics/metron-profiler/index.html ---------------------------------------------------------------------- diff --git a/site/current-book/metron-analytics/metron-profiler/index.html b/site/current-book/metron-analytics/metron-profiler/index.html index c3bf7ad..fe498af 100644 --- a/site/current-book/metron-analytics/metron-profiler/index.html +++ b/site/current-book/metron-analytics/metron-profiler/index.html @@ -1,295 +1,168 @@ <!DOCTYPE html> <!-- - | Generated by Apache Maven Doxia at 2018-01-03 - | Rendered using Apache Maven Fluido Skin 1.3.0 + | Generated by Apache Maven Doxia Site Renderer 1.8 from src/site/markdown/metron-analytics/metron-profiler/index.md at 2018-06-07 + | Rendered using Apache Maven Fluido Skin 1.7 --> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta charset="UTF-8" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> - <meta name="Date-Revision-yyyymmdd" content="20180103" /> + <meta name="Date-Revision-yyyymmdd" content="20180607" /> <meta http-equiv="Content-Language" content="en" /> <title>Metron – Metron Profiler</title> - <link rel="stylesheet" href="../../css/apache-maven-fluido-1.3.0.min.css" /> + <link rel="stylesheet" href="../../css/apache-maven-fluido-1.7.min.css" /> <link rel="stylesheet" href="../../css/site.css" /> <link rel="stylesheet" href="../../css/print.css" media="print" /> - - - <script type="text/javascript" src="../../js/apache-maven-fluido-1.3.0.min.js"></script> - - - -<script type="text/javascript">$( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } );</script> - - </head> - <body class="topBarDisabled"> - - - - - <div class="container-fluid"> - <div id="banner"> - <div class="pull-left"> - <a href="http://metron.apache.org/" id="bannerLeft"> - <img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" height="48px"/> - </a> - 
</div> - <div class="pull-right"> </div> + <script type="text/javascript" src="../../js/apache-maven-fluido-1.7.min.js"></script> +<script type="text/javascript"> + $( document ).ready( function() { $( '.carousel' ).carousel( { interval: 3500 } ) } ); + </script> + </head> + <body class="topBarDisabled"> + <div class="container-fluid"> + <div id="banner"> + <div class="pull-left"><a href="http://metron.apache.org/" id="bannerLeft"><img src="../../images/metron-logo.png" alt="Apache Metron" width="148px" height="48px"/></a></div> + <div class="pull-right"></div> <div class="clear"><hr/></div> </div> <div id="breadcrumbs"> <ul class="breadcrumb"> - - - <li class=""> - <a href="http://www.apache.org" class="externalLink" title="Apache"> - Apache</a> - </li> - <li class="divider ">/</li> - <li class=""> - <a href="http://metron.apache.org/" class="externalLink" title="Metron"> - Metron</a> - </li> - <li class="divider ">/</li> - <li class=""> - <a href="../../index.html" title="Documentation"> - Documentation</a> - </li> - <li class="divider ">/</li> - <li class="">Metron Profiler</li> - - - - <li id="publishDate" class="pull-right">Last Published: 2018-01-03</li> <li class="divider pull-right">|</li> - <li id="projectVersion" class="pull-right">Version: 0.4.2</li> - - </ul> + <li class=""><a href="http://www.apache.org" class="externalLink" title="Apache">Apache</a><span class="divider">/</span></li> + <li class=""><a href="http://metron.apache.org/" class="externalLink" title="Metron">Metron</a><span class="divider">/</span></li> + <li class=""><a href="../../index.html" title="Documentation">Documentation</a><span class="divider">/</span></li> + <li class="active ">Metron Profiler</li> + <li id="publishDate" class="pull-right"><span class="divider">|</span> Last Published: 2018-06-07</li> + <li id="projectVersion" class="pull-right">Version: 0.5.0</li> + </ul> </div> - - <div class="row-fluid"> - <div id="leftColumn" class="span3"> + <div id="leftColumn" 
class="span2"> <div class="well sidebar-nav"> - - - <ul class="nav nav-list"> - <li class="nav-header">User Documentation</li> - - <li> - - <a href="../../index.html" title="Metron"> - <i class="icon-chevron-down"></i> - Metron</a> - <ul class="nav nav-list"> - - <li> - - <a href="../../Upgrading.html" title="Upgrading"> - <i class="none"></i> - Upgrading</a> - </li> - - <li> - - <a href="../../metron-analytics/index.html" title="Analytics"> - <i class="icon-chevron-down"></i> - Analytics</a> - <ul class="nav nav-list"> - - <li> - - <a href="../../metron-analytics/metron-maas-service/index.html" title="Maas-service"> - <i class="none"></i> - Maas-service</a> - </li> - - <li class="active"> - - <a href="#"><i class="none"></i>Profiler</a> - </li> - - <li> - - <a href="../../metron-analytics/metron-profiler-client/index.html" title="Profiler-client"> - <i class="none"></i> - Profiler-client</a> - </li> - - <li> - - <a href="../../metron-analytics/metron-statistics/index.html" title="Statistics"> - <i class="icon-chevron-right"></i> - Statistics</a> - </li> - </ul> - </li> - - <li> - - <a href="../../metron-contrib/metron-docker/index.html" title="Docker"> - <i class="none"></i> - Docker</a> - </li> - - <li> - - <a href="../../metron-deployment/index.html" title="Deployment"> - <i class="icon-chevron-right"></i> - Deployment</a> - </li> - - <li> - - <a href="../../metron-interface/metron-alerts/index.html" title="Alerts"> - <i class="none"></i> - Alerts</a> - </li> - - <li> - - <a href="../../metron-interface/metron-config/index.html" title="Config"> - <i class="none"></i> - Config</a> - </li> - - <li> - - <a href="../../metron-interface/metron-rest/index.html" title="Rest"> - <i class="none"></i> - Rest</a> - </li> - - <li> - - <a href="../../metron-platform/index.html" title="Platform"> - <i class="icon-chevron-right"></i> - Platform</a> - </li> - - <li> - - <a href="../../metron-sensors/index.html" title="Sensors"> - <i class="icon-chevron-right"></i> - Sensors</a> 
- </li> - - <li> - - <a href="../../metron-stellar/stellar-3rd-party-example/index.html" title="Stellar-3rd-party-example"> - <i class="none"></i> - Stellar-3rd-party-example</a> - </li> - - <li> - - <a href="../../metron-stellar/stellar-common/index.html" title="Stellar-common"> - <i class="icon-chevron-right"></i> - Stellar-common</a> - </li> - - <li> - - <a href="../../use-cases/index.html" title="Use-cases"> - <i class="icon-chevron-right"></i> - Use-cases</a> - </li> - </ul> - </li> - </ul> - - - - <hr class="divider" /> - - <div id="poweredBy"> - <div class="clear"></div> - <div class="clear"></div> - <div class="clear"></div> - <a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"> - <img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" /> - </a> - </div> + <ul class="nav nav-list"> + <li class="nav-header">User Documentation</li> + <li><a href="../../index.html" title="Metron"><span class="icon-chevron-down"></span>Metron</a> + <ul class="nav nav-list"> + <li><a href="../../CONTRIBUTING.html" title="CONTRIBUTING"><span class="none"></span>CONTRIBUTING</a></li> + <li><a href="../../Upgrading.html" title="Upgrading"><span class="none"></span>Upgrading</a></li> + <li><a href="../../metron-analytics/index.html" title="Analytics"><span class="icon-chevron-down"></span>Analytics</a> + <ul class="nav nav-list"> + <li><a href="../../metron-analytics/metron-maas-service/index.html" title="Maas-service"><span class="none"></span>Maas-service</a></li> + <li class="active"><a href="#"><span class="none"></span>Profiler</a></li> + <li><a href="../../metron-analytics/metron-profiler-client/index.html" title="Profiler-client"><span class="none"></span>Profiler-client</a></li> + <li><a href="../../metron-analytics/metron-statistics/index.html" title="Statistics"><span class="icon-chevron-right"></span>Statistics</a></li> + </ul> +</li> + <li><a href="../../metron-contrib/metron-docker/index.html" title="Docker"><span 
class="none"></span>Docker</a></li> + <li><a href="../../metron-contrib/metron-performance/index.html" title="Performance"><span class="none"></span>Performance</a></li> + <li><a href="../../metron-deployment/index.html" title="Deployment"><span class="icon-chevron-right"></span>Deployment</a></li> + <li><a href="../../metron-interface/metron-alerts/index.html" title="Alerts"><span class="none"></span>Alerts</a></li> + <li><a href="../../metron-interface/metron-config/index.html" title="Config"><span class="none"></span>Config</a></li> + <li><a href="../../metron-interface/metron-rest/index.html" title="Rest"><span class="none"></span>Rest</a></li> + <li><a href="../../metron-platform/index.html" title="Platform"><span class="icon-chevron-right"></span>Platform</a></li> + <li><a href="../../metron-sensors/index.html" title="Sensors"><span class="icon-chevron-right"></span>Sensors</a></li> + <li><a href="../../metron-stellar/stellar-3rd-party-example/index.html" title="Stellar-3rd-party-example"><span class="none"></span>Stellar-3rd-party-example</a></li> + <li><a href="../../metron-stellar/stellar-common/index.html" title="Stellar-common"><span class="icon-chevron-right"></span>Stellar-common</a></li> + <li><a href="../../metron-stellar/stellar-zeppelin/index.html" title="Stellar-zeppelin"><span class="none"></span>Stellar-zeppelin</a></li> + <li><a href="../../use-cases/index.html" title="Use-cases"><span class="icon-chevron-right"></span>Use-cases</a></li> + </ul> +</li> +</ul> + <hr /> + <div id="poweredBy"> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> + <div class="clear"></div> +<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy"><img class="builtBy" alt="Built by Maven" src="../../images/logos/maven-feather.png" /></a> + </div> </div> </div> - - - <div id="bodyColumn" class="span9" > - - <h1>Metron Profiler</h1> -<p><a name="Metron_Profiler"></a></p> -<p>The Profiler is a feature extraction 
mechanism that can generate a profile describing the behavior of an entity. An entity might be a server, user, subnet or application. Once a profile has been generated defining what normal behavior looks-like, models can be built that identify anomalous behavior.</p> -<p>This is achieved by summarizing the streaming telemetry data consumed by Metron over sliding windows. A summary statistic is applied to the data received within a given window. Collecting this summary across many windows results in a time series that is useful for analysis.</p> -<p>Any field contained within a message can be used to generate a profile. A profile can even be produced by combining fields that originate in different data sources. A user has considerable power to transform the data used in a profile by leveraging the Stellar language. A user only need configure the desired profiles and ensure that the Profiler topology is running.</p> + <div id="bodyColumn" class="span10" > +<!-- +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +--> +<h1>Metron Profiler</h1> +<p><a name="Metron_Profiler"></a></p> +<p>The Profiler is a feature extraction mechanism that can generate a profile describing the behavior of an entity. An entity might be a server, user, subnet or application. 
Once a profile has been generated defining what normal behavior looks like, models can be built that identify anomalous behavior.</p> +<p>This is achieved by summarizing the streaming telemetry data consumed by Metron over sliding windows. A summary statistic is applied to the data received within a given window. Collecting this summary across many windows results in a time series that is useful for analysis.</p> +<p>Any field contained within a message can be used to generate a profile. A profile can even be produced by combining fields that originate in different data sources. A user has considerable power to transform the data used in a profile by leveraging the Stellar language. A user need only configure the desired profiles and ensure that the Profiler topology is running.</p> <ul> - + <li><a href="#Installation">Installation</a></li> - <li><a href="#Creating_Profiles">Creating Profiles</a></li> - <li><a href="#Deploying_Profiles">Deploying Profiles</a></li> - <li><a href="#Anatomy_of_a_Profile">Anatomy of a Profile</a></li> - <li><a href="#Configuring_the_Profiler">Configuring the Profiler</a></li> - <li><a href="#Examples">Examples</a></li> - <li><a href="#Implementation">Implementation</a></li> </ul> <div class="section"> <h2><a name="Installation"></a>Installation</h2> <p>The Profiler can be installed with either of these two methods.</p> - <ul> - + <li><a href="#Ambari_Installation">Ambari Installation</a></li> - <li><a href="#Manual_Installation">Manual Installation</a></li> </ul> <div class="section"> <h3><a name="Ambari_Installation"></a>Ambari Installation</h3> -<p>The Metron Profiler is installed automatically when installing Metron using the Ambari MPack. You can skip the <a href="#Installation">Installation</a> section and move ahead to <a href="#Creating_Profiles">Creating Profiles</a> should this be the case.</p></div> +<p>The Metron Profiler is installed automatically when installing Metron using the Ambari MPack. 
You can skip the <a href="#Installation">Installation</a> section and move ahead to <a href="#Creating_Profiles">Creating Profiles</a> should this be the case.</p></div> <div class="section"> <h3><a name="Manual_Installation"></a>Manual Installation</h3> -<p>This section will describe the steps necessary to manually install the Profiler on an RPM-based Linux distribution. This assumes that core Metron has already been installed and validated. If you installed Metron using the <a href="#Ambari_MPack">Ambari MPack</a>, then the Profiler has already been installed and you can skip this section.</p> - +<p>This section will describe the steps necessary to manually install the Profiler on an RPM-based Linux distribution. This assumes that core Metron has already been installed and validated. If you installed Metron using the <a href="#Ambari_MPack">Ambari MPack</a>, then the Profiler has already been installed and you can skip this section.</p> <ol style="list-style-type: decimal"> - + <li> + <p>Build the Metron RPMs (see Building the <a href="../../metron-deployment/index.html#RPMs">RPMs</a>).</p> <p>You may have already built the Metron RPMs when core Metron was installed.</p> - -<div class="source"> -<div class="source"> -<pre>$ find metron-deployment/ -name "metron-profiler*.rpm" + +<div> +<div> +<pre class="source">$ find metron-deployment/ -name "metron-profiler*.rpm" metron-deployment//packaging/docker/rpm-docker/RPMS/noarch/metron-profiler-0.4.1-201707131420.noarch.rpm -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Copy the Profiler RPM to the installation host. </p> -<p>The installation host must be the same host on which core Metron was installed. 
Depending on how you installed Metron, the Profiler RPM might have already been copied to this host with the other Metron RPMs.</p> - -<div class="source"> -<div class="source"> -<pre>[root@node1 ~]# find /localrepo/ -name "metron-profiler*.rpm" + +<p>Copy the Profiler RPM to the installation host.</p> +<p>The installation host must be the same host on which core Metron was installed. Depending on how you installed Metron, the Profiler RPM might have already been copied to this host with the other Metron RPMs.</p> + +<div> +<div> +<pre class="source">[root@node1 ~]# find /localrepo/ -name "metron-profiler*.rpm" /localrepo/metron-profiler-0.4.1-201707112313.noarch.rpm -</pre></div></div></li> - +</pre></div></div> +</li> <li> + <p>Install the RPM.</p> - -<div class="source"> -<div class="source"> -<pre>[root@node1 ~]# rpm -ivh metron-profiler-*.noarch.rpm + +<div> +<div> +<pre class="source">[root@node1 ~]# rpm -ivh metron-profiler-*.noarch.rpm Preparing... ########################################### [100%] 1:metron-profiler ########################################### [100%] </pre></div></div> - -<div class="source"> -<div class="source"> -<pre>[root@node1 ~]# rpm -ql metron-profiler + +<div> +<div> +<pre class="source">[root@node1 ~]# rpm -ql metron-profiler /usr/metron /usr/metron/0.4.2 /usr/metron/0.4.2/bin @@ -301,67 +174,71 @@ Preparing... ########################################### [100%] /usr/metron/0.4.2/flux/profiler/remote.yaml /usr/metron/0.4.2/lib /usr/metron/0.4.2/lib/metron-profiler-0.4.2-uber.jar -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Edit the configuration file located at <tt>$METRON_HOME/config/profiler.properties</tt>. 
</p> - -<div class="source"> -<div class="source"> -<pre>kafka.zk=node1:2181 + +<p>Edit the configuration file located at <tt>$METRON_HOME/config/profiler.properties</tt>.</p> + +<div> +<div> +<pre class="source">kafka.zk=node1:2181 kafka.broker=node1:6667 </pre></div></div> - + <ul> - -<li></li> - -<li></li> - </ul></li> - + +<li>Change <tt>kafka.zk</tt> to refer to Zookeeper in your environment.</li> +<li>Change <tt>kafka.broker</tt> to refer to a Kafka Broker in your environment.</li> +</ul> +</li> <li> -<p>Create a table within HBase that will store the profile data. By default, the table is named <tt>profiler</tt> with a column family <tt>P</tt>. The table name and column family must match the Profiler’s configuration (see <a href="#Configuring_the_Profiler">Configuring the Profiler</a>). </p> - -<div class="source"> -<div class="source"> -<pre>$ /usr/hdp/current/hbase-client/bin/hbase shell + +<p>Create a table within HBase that will store the profile data. By default, the table is named <tt>profiler</tt> with a column family <tt>P</tt>. The table name and column family must match the Profiler’s configuration (see <a href="#Configuring_the_Profiler">Configuring the Profiler</a>).</p> + +<div> +<div> +<pre class="source">$ /usr/hdp/current/hbase-client/bin/hbase shell hbase(main):001:0> create 'profiler', 'P' -</pre></div></div></li> - +</pre></div></div> +</li> <li> + <p>Start the Profiler topology.</p> - -<div class="source"> -<div class="source"> -<pre>$ cd $METRON_HOME + +<div> +<div> +<pre class="source">$ cd $METRON_HOME $ bin/start_profiler_topology.sh -</pre></div></div></li> +</pre></div></div> +</li> </ol> -<p>At this point the Profiler is running and consuming telemetry messages. We have not defined any profiles yet, so it is not doing anything very useful. The next section walks you through the steps to create your very first “Hello, World!” profile.</p></div></div> +<p>At this point the Profiler is running and consuming telemetry messages. 
We have not defined any profiles yet, so it is not doing anything very useful. The next section walks you through the steps to create your very first “Hello, World!” profile.</p></div></div> <div class="section"> <h2><a name="Creating_Profiles"></a>Creating Profiles</h2> -<p>This section will describe how to create your very first “Hello, World” profile. It will also outline a useful workflow for creating, testing, and deploying profiles.</p> -<p>Creating and refining profiles is an iterative process. Iterating against a live stream of data is slow, difficult and error prone. The Profile Debugger was created to provide a controlled and isolated execution environment to create, refine and troubleshoot profiles.</p> - +<p>This section will describe how to create your very first “Hello, World” profile. It will also outline a useful workflow for creating, testing, and deploying profiles.</p> +<p>Creating and refining profiles is an iterative process. Iterating against a live stream of data is slow, difficult and error prone. The Profile Debugger was created to provide a controlled and isolated execution environment to create, refine and troubleshoot profiles.</p> <ol style="list-style-type: decimal"> - + <li> -<p>Launch the Stellar Shell. We will leverage the Profiler Debugger from within the Stellar Shell.</p> - -<div class="source"> -<div class="source"> -<pre>[root@node1 ~]# $METRON_HOME/bin/stellar + +<p>Launch the Stellar Shell. We will leverage the Profiler Debugger from within the Stellar Shell.</p> + +<div> +<div> +<pre class="source">[root@node1 ~]# $METRON_HOME/bin/stellar Stellar, Go! [Stellar]>>> %functions PROFILER PROFILER_APPLY, PROFILER_FLUSH, PROFILER_INIT -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Create a simple <tt>hello-world</tt> profile that will count the number of messages for each <tt>ip_src_addr</tt>. 
The <tt>SHELL_EDIT</tt> function will open an editor in which you can copy/paste the following Profiler configuration.</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> conf := SHELL_EDIT() + +<p>Create a simple <tt>hello-world</tt> profile that will count the number of messages for each <tt>ip_src_addr</tt>. The <tt>SHELL_EDIT</tt> function will open an editor in which you can copy/paste the following Profiler configuration.</p> + +<div> +<div> +<pre class="source">[Stellar]>>> conf := SHELL_EDIT() [Stellar]>>> conf { "profiles": [ @@ -375,123 +252,126 @@ PROFILER_APPLY, PROFILER_FLUSH, PROFILER_INIT } ] } -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Create a Profile execution environment; the Profile Debugger. </p> -<p>The Profiler will output the number of profiles that have been defined, the number of messages that have been applied and the number of routes that have been followed. </p> + +<p>Create a Profile execution environment; the Profile Debugger.</p> +<p>The Profiler will output the number of profiles that have been defined, the number of messages that have been applied and the number of routes that have been followed.</p> <p>A route is defined when a message is applied to a specific profile.</p> - <ul> - + <li>If a message is not needed by any profile, then there are no routes.</li> - <li>If a message is needed by one profile, then one route has been followed.</li> - <li>If a message is needed by two profiles, then two routes have been followed.</li> - </ul> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> p := PROFILER_INIT(conf) -[Stellar]>>> p +</ul> + +<div> +<div> +<pre class="source">[Stellar]>>> profiler := PROFILER_INIT(conf) +[Stellar]>>> profiler Profiler{1 profile(s), 0 messages(s), 0 route(s)} -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Create a message that mimics the telemetry that your profile will consume. 
</p> -<p>This message can be as simple or complex as you like. For the <tt>hello-world</tt> profile, all you need is a message containing an <tt>ip_src_addr</tt> field.</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> msg := SHELL_EDIT() + +<p>Create a message that mimics the telemetry that your profile will consume.</p> +<p>This message can be as simple or complex as you like. For the <tt>hello-world</tt> profile, all you need is a message containing an <tt>ip_src_addr</tt> field.</p> + +<div> +<div> +<pre class="source">[Stellar]>>> msg := SHELL_EDIT() [Stellar]>>> msg { "ip_src_addr": "10.0.0.1" } -</pre></div></div></li> - +</pre></div></div> +</li> <li> + <p>Apply the message to your Profiler, as many times as you like.</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> PROFILER_APPLY(msg, p) + +<div> +<div> +<pre class="source">[Stellar]>>> PROFILER_APPLY(msg, profiler) Profiler{1 profile(s), 1 messages(s), 1 route(s)} </pre></div></div> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> PROFILER_APPLY(msg, p) + +<div> +<div> +<pre class="source">[Stellar]>>> PROFILER_APPLY(msg, profiler) Profiler{1 profile(s), 2 messages(s), 2 route(s)} -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Flush the Profiler. </p> -<p>A flush is what occurs at the end of each 15 minute period in the Profiler. The result is a list of Profile Measurements. Each measurement is a map containing detailed information about the profile data that has been generated. The <tt>value</tt> field is what is written to HBase when running this profile in the Profiler topology. </p> -<p>There will always be one measurement for each [profile, entity] pair. This profile simply counts the number of messages by IP source address. 
Notice that the value is ‘3’ for the entity ‘10.0.0.1’ as we applied 3 messages with an ‘ip_src_addr’ of ’10.0.0.1’.</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> values := PROFILER_FLUSH(profiler) + +<p>Flush the Profiler.</p> +<p>A flush is what occurs at the end of each 15 minute period in the Profiler. The result is a list of Profile Measurements. Each measurement is a map containing detailed information about the profile data that has been generated. The <tt>value</tt> field is what is written to HBase when running this profile in the Profiler topology.</p> +<p>There will always be one measurement for each [profile, entity] pair. This profile simply counts the number of messages by IP source address. Notice that the value is ‘3’ for the entity ‘10.0.0.1’ as we applied 3 messages with an ‘ip_src_addr’ of ’10.0.0.1’.</p> + +<div> +<div> +<pre class="source">[Stellar]>>> values := PROFILER_FLUSH(profiler) [Stellar]>>> values [{period={duration=900000, period=1669628, start=1502665200000, end=1502666100000}, profile=hello-world, groups=[], value=3, entity=10.0.0.1}] -</pre></div></div></li> - +</pre></div></div> +</li> <li> + <p>Apply real, live telemetry to your profile.</p> -<p>Once you are happy with your profile against a controlled data set, it can be useful to introduce more complex, live data. This example extracts 10 messages of live, enriched telemetry to test your profile(s).</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> %define bootstrap.servers := "node1:6667" +<p>Once you are happy with your profile against a controlled data set, it can be useful to introduce more complex, live data. 
This example extracts 10 messages of live, enriched telemetry to test your profile(s).</p> + +<div> +<div> +<pre class="source">[Stellar]>>> %define bootstrap.servers := "node1:6667" node1:6667 -[Stellar]>>> msgs := KAFKA_GET("indexing", 10) +[Stellar]>>> msgs := KAFKA_GET("indexing", 10) [Stellar]>>> LENGTH(msgs) 10 </pre></div></div> + <p>Apply those 10 messages to your profile(s).</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> PROFILER_APPLY(msgs, p) + +<div> +<div> +<pre class="source">[Stellar]>>> PROFILER_APPLY(msgs, profiler) Profiler{1 profile(s), 10 messages(s), 10 route(s)} -</pre></div></div></li> +</pre></div></div> +</li> </ol></div> <div class="section"> <h2><a name="Deploying_Profiles"></a>Deploying Profiles</h2> -<p>This section will describe the steps required to get your first “Hello, World!”" profile running. This assumes that you have a successful Profiler <a href="#Installation">Installation</a> and have it running. You can deploy profiles in two different ways.</p> - +<p>This section will describe the steps required to get your first “Hello, World!” profile running. This assumes that you have a successful Profiler <a href="#Installation">Installation</a> and have it running. You can deploy profiles in two different ways.</p> <ul> - + <li><a href="#Deploying_Profiles_with_the_Stellar_Shell">Deploying Profiles with the Stellar Shell</a></li> - <li><a href="#Deploying_Profiles_from_the_Command_Line">Deploying Profiles from the Command Line</a></li> </ul> <div class="section"> <h3><a name="Deploying_Profiles_with_the_Stellar_Shell"></a>Deploying Profiles with the Stellar Shell</h3> -<p>Continuing the previous running example, at this point, you have seen how your profile behaves against real, live telemetry in a controlled execution environment. 
The next step is to deploy your profile to the live, actively running Profiler topology.</p> - +<p>Continuing the previous running example, at this point, you have seen how your profile behaves against real, live telemetry in a controlled execution environment. The next step is to deploy your profile to the live, actively running Profiler topology.</p> <ol style="list-style-type: decimal"> - + <li> -<p>Start the Stellar Shell with the <tt>-z ZK:2181</tt> command line argument. This is required when deploying a new profile to the active Profiler topology. Replace <tt>ZK:2181</tt> with a URL that is appropriate to your environment.</p> - -<div class="source"> -<div class="source"> -<pre>[root@node1 ~]# $METRON_HOME/bin/stellar -z ZK:2181 + +<p>Start the Stellar Shell with the <tt>-z ZK:2181</tt> command line argument. This is required when deploying a new profile to the active Profiler topology. Replace <tt>ZK:2181</tt> with a URL that is appropriate to your environment.</p> + +<div> +<div> +<pre class="source">[root@node1 ~]# $METRON_HOME/bin/stellar -z ZK:2181 Stellar, Go! [Stellar]>>> [Stellar]>>> %functions CONFIG CONFIG_GET, CONFIG_PUT -</pre></div></div></li> - +</pre></div></div> +</li> <li> + <p>If you haven’t already, define your profile.</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> conf := SHELL_EDIT() + +<div> +<div> +<pre class="source">[Stellar]>>> conf := SHELL_EDIT() [Stellar]>>> conf { "profiles": [ @@ -505,37 +385,40 @@ Stellar, Go! } ] } -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Check what is already deployed. </p> -<p>Pushing a new profile configuration is destructive. It will overwrite any existing configuration. Check what you have out there. 
Manually merge the existing configuration with your new profile definition.</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> existing := CONFIG_GET("PROFILER") -</pre></div></div></li> - + +<p>Check what is already deployed.</p> +<p>Pushing a new profile configuration is destructive. It will overwrite any existing configuration. Check what you have out there. Manually merge the existing configuration with your new profile definition.</p> + +<div> +<div> +<pre class="source">[Stellar]>>> existing := CONFIG_GET("PROFILER") +</pre></div></div> +</li> <li> -<p>Deploy your profile. This will push the configuration to to the live, actively running Profiler topology. This will overwrite any existing profile definitions.</p> - -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> CONFIG_PUT("PROFILER", conf) -</pre></div></div></li> + +<p>Deploy your profile. This will push the configuration to the live, actively running Profiler topology. This will overwrite any existing profile definitions.</p> + +<div> +<div> +<pre class="source">[Stellar]>>> CONFIG_PUT("PROFILER", conf) +</pre></div></div> +</li> </ol></div> <div class="section"> <h3><a name="Deploying_Profiles_from_the_Command_Line"></a>Deploying Profiles from the Command Line</h3> - <ol style="list-style-type: decimal"> - + <li> -<p>Create the profile definition in a file located at <tt>$METRON_HOME/config/zookeeper/profiler.json</tt>. This file will likely not exist, if you have never created Profiles before.</p> + +<p>Create the profile definition in a file located at <tt>$METRON_HOME/config/zookeeper/profiler.json</tt>. This file will likely not exist if you have never created profiles before.</p> <p>The following example will create a profile that simply counts the number of messages per <tt>ip_src_addr</tt>.</p> - -<div class="source"> -<div class="source"> -<pre>{ + +<div> +<div> +<pre class="source">{ "profiles": [ { "profile": "hello-world", @@ -547,21 +430,23 @@ Stellar, Go! 
} ] } -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Upload the profile definition to Zookeeper. Change <tt>node1:2181</tt> to refer to the actual Zookeeper host in your environment.</p> - -<div class="source"> -<div class="source"> -<pre>$ cd $METRON_HOME + +<p>Upload the profile definition to Zookeeper. Change <tt>node1:2181</tt> to refer to the actual Zookeeper host in your environment.</p> + +<div> +<div> +<pre class="source">$ cd $METRON_HOME $ bin/zk_load_configs.sh -m PUSH -i config/zookeeper/ -z node1:2181 </pre></div></div> + <p>You can validate this by reading back the Metron configuration from Zookeeper using the same script. The result should look like the following.</p> - -<div class="source"> -<div class="source"> -<pre>$ bin/zk_load_configs.sh -m DUMP -z node1:2181 + +<div> +<div> +<pre class="source">$ bin/zk_load_configs.sh -m DUMP -z node1:2181 ... PROFILER Config: profiler { @@ -576,378 +461,386 @@ PROFILER Config: profiler } ] } -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Ensure that test messages are being sent to the Profiler’s input topic in Kafka. The Profiler will consume messages from the input topic defined in the Profiler’s configuration (see <a href="#Configuring_the_Profiler">Configuring the Profiler</a>). By default this is the <tt>indexing</tt> topic.</p></li> - + +<p>Ensure that test messages are being sent to the Profiler’s input topic in Kafka. The Profiler will consume messages from the input topic defined in the Profiler’s configuration (see <a href="#Configuring_the_Profiler">Configuring the Profiler</a>). By default this is the <tt>indexing</tt> topic.</p> +</li> <li> -<p>Check the HBase table to validate that the Profiler is writing the profile. Remember that the Profiler is flushing the profile every 15 minutes. 
You will need to wait at least this long to start seeing profile data in HBase.</p> - -<div class="source"> -<div class="source"> -<pre>$ /usr/hdp/current/hbase-client/bin/hbase shell + +<p>Check the HBase table to validate that the Profiler is writing the profile. Remember that the Profiler is flushing the profile every 15 minutes. You will need to wait at least this long to start seeing profile data in HBase.</p> + +<div> +<div> +<pre class="source">$ /usr/hdp/current/hbase-client/bin/hbase shell hbase(main):001:0> count 'profiler' -</pre></div></div></li> - +</pre></div></div> +</li> <li> -<p>Use the <a href="../metron-profiler-client/index.html">Profiler Client</a> to read the profile data. The following <tt>PROFILE_GET</tt> command will read the data written by the <tt>hello-world</tt> profile. This assumes that <tt>10.0.0.1</tt> is one of the values for <tt>ip_src_addr</tt> contained within the telemetry consumed by the Profiler.</p> - -<div class="source"> -<div class="source"> -<pre>$ bin/stellar -z node1:2181 + +<p>Use the <a href="../metron-profiler-client/index.html">Profiler Client</a> to read the profile data. The following <tt>PROFILE_GET</tt> command will read the data written by the <tt>hello-world</tt> profile. This assumes that <tt>10.0.0.1</tt> is one of the values for <tt>ip_src_addr</tt> contained within the telemetry consumed by the Profiler.</p> + +<div> +<div> +<pre class="source">$ bin/stellar -z node1:2181 [Stellar]>>> PROFILE_GET( "hello-world", "10.0.0.1", PROFILE_FIXED(30, "MINUTES")) [451, 448] </pre></div></div> -<p>This result indicates that over the past 30 minutes, the Profiler stored two values related to the source IP address “10.0.0.1”. In the first 15 minute period, the IP <tt>10.0.0.1</tt> was seen in 451 telemetry messages. 
In the second 15 minute period, the same IP was seen in 448 telemetry messages.</p> -<p>It is assumed that the <tt>PROFILE_GET</tt> client is correctly configured to match the Profile configuration before using it to read that Profile. More information on configuring and using the Profiler client can be found <a href="../metron-profiler-client/index.html">here</a>. </p></li> + +<p>This result indicates that over the past 30 minutes, the Profiler stored two values related to the source IP address “10.0.0.1”. In the first 15 minute period, the IP <tt>10.0.0.1</tt> was seen in 451 telemetry messages. In the second 15 minute period, the same IP was seen in 448 telemetry messages.</p> +<p>It is assumed that the <tt>PROFILE_GET</tt> client is correctly configured to match the Profile configuration before using it to read that Profile. More information on configuring and using the Profiler client can be found <a href="../metron-profiler-client/index.html">here</a>.</p> +</li> </ol></div></div> <div class="section"> <h2><a name="Anatomy_of_a_Profile"></a>Anatomy of a Profile</h2> -<p>A profile definition requires a JSON-formatted set of elements, many of which can contain Stellar code. The specification contains the following elements. (For the impatient, skip ahead to the <a href="#Examples">Examples</a>.)</p> +<div class="section"> +<h3><a name="Profiler"></a>Profiler</h3> +<p>The Profiler configuration contains only two fields; only one of which is required.</p> + +<div> +<div> +<pre class="source">{ + "profiles": [ + { "profile": "one", ... }, + { "profile": "two", ... 
} + ], + "timestampField": "timestamp" +} +</pre></div></div> <table border="0" class="table table-striped"> - <thead> - +<thead> + <tr class="a"> - -<th>Name </th> - -<th> </th> - -<th>Description</th> - </tr> - </thead> - <tbody> - +<th> Name </th> +<th> </th> +<th> Description</th></tr> +</thead><tbody> + <tr class="b"> - -<td><a href="#profile">profile</a> </td> - -<td>Required </td> - -<td>Unique name identifying the profile.</td> - </tr> - +<td> <a href="#profiles">profiles</a> </td> +<td> Required </td> +<td> A list of zero or more Profile definitions.</td></tr> <tr class="a"> - -<td><a href="#foreach">foreach</a> </td> - -<td>Required </td> - -<td>A separate profile is maintained “for each” of these.</td> - </tr> - +<td> <a href="#timestampField">timestampField</a> </td> +<td> Optional </td> +<td> Indicates whether processing time or event time should be used. By default, processing time is enabled.</td></tr> +</tbody> +</table> +<div class="section"> +<h4><a name="profiles"></a><tt>profiles</tt></h4> +<p><i>Required</i></p> +<p>A list of zero or more Profile definitions.</p></div> +<div class="section"> +<h4><a name="timestampField"></a><tt>timestampField</tt></h4> +<p><i>Optional</i></p> +<p>Indicates whether processing time or event time is used. By default, processing time is enabled.</p> +<div class="section"> +<h5><a name="Processing_Time"></a>Processing Time</h5> +<p>By default, no <tt>timestampField</tt> is defined. In this case, the Profiler uses system time when generating profiles. This means that the profiles are generated based on when the data has been processed by the Profiler. This is also known as ‘processing time’.</p> +<p>This is the simplest mode of operation, but has some drawbacks. If the Profiler is consuming live data and all is well, the processing and event times will likely remain similar and consistent. 
If processing time diverges from event time, then the Profiler will generate skewed profiles.</p> +<p>There are a few scenarios that might cause skewed profiles when using processing time. For example, when a system has undergone a scheduled maintenance window and is restarted, a high volume of messages will need to be processed by the Profiler. The output of the Profiler might indicate an increase in activity during this time, although no change in activity actually occurred on the target network. The same situation could occur if an upstream system which provides telemetry undergoes an outage.</p> +<p><a href="#Event_Time">Event Time</a> can be used to mitigate these problems.</p></div> +<div class="section"> +<h5><a name="Event_Time"></a>Event Time</h5> +<p>Alternatively, a <tt>timestampField</tt> can be defined. This must be the name of a field contained within the telemetry processed by the Profiler. The Profiler will extract and use the timestamp contained within this field.</p> +<ul> + +<li> + +<p>If a message does not contain this field, it will be dropped.</p> +</li> +<li> + +<p>The field must contain a timestamp in epoch milliseconds expressed as either a number or a string. Otherwise, the message will be dropped.</p> +</li> +<li> + +<p>The Profiler will use the same field across all telemetry sources and for all profiles.</p> +</li> +<li> + +<p>Be aware of clock skew across telemetry sources. If your profile is processing telemetry from multiple sources where the clock differs significantly, the Profiler may assume that some of those messages are late and ignore them. 
Adjusting the <a href="#profiler.window.duration"><tt>profiler.window.duration</tt></a> and <a href="#profiler.window.lag"><tt>profiler.window.lag</tt></a> can help accommodate skewed clocks.</p> +</li> +</ul></div></div></div> +<div class="section"> +<h3><a name="Profiles"></a>Profiles</h3> +<p>A profile definition requires a JSON-formatted set of elements, many of which can contain Stellar code. The specification contains the following elements. (For the impatient, skip ahead to the <a href="#Examples">Examples</a>.)</p> +<table border="0" class="table table-striped"> +<thead> + +<tr class="a"> +<th> Name </th> +<th> </th> +<th> Description</th></tr> +</thead><tbody> + <tr class="b"> - -<td><a href="#onlyif">onlyif</a> </td> - -<td>Optional </td> - -<td>Boolean expression that determines if a message should be applied to the profile.</td> - </tr> - +<td> <a href="#profile">profile</a> </td> +<td> Required </td> +<td> Unique name identifying the profile.</td></tr> <tr class="a"> - -<td><a href="#groupBy">groupBy</a> </td> - -<td>Optional </td> - -<td>One or more Stellar expressions used to group the profile measurements when persisted.</td> - </tr> - +<td> <a href="#foreach">foreach</a> </td> +<td> Required </td> +<td> A separate profile is maintained “for each” of these.</td></tr> <tr class="b"> - -<td><a href="#init">init</a> </td> - -<td>Optional </td> - -<td>One or more expressions executed at the start of a window period.</td> - </tr> - +<td> <a href="#onlyif">onlyif</a> </td> +<td> Optional </td> +<td> Boolean expression that determines if a message should be applied to the profile.</td></tr> <tr class="a"> - -<td><a href="#update">update</a> </td> - -<td>Required </td> - -<td>One or more expressions executed when a message is applied to the profile.</td> - </tr> - +<td> <a href="#groupBy">groupBy</a> </td> +<td> Optional </td> +<td> One or more Stellar expressions used to group the profile measurements when persisted.</td></tr> <tr class="b"> - -<td><a 
href="#result">result</a> </td> - -<td>Required </td> - -<td>Stellar expressions that are executed when the window period expires.</td> - </tr> - +<td> <a href="#init">init</a> </td> +<td> Optional </td> +<td> One or more expressions executed at the start of a window period.</td></tr> <tr class="a"> - -<td><a href="#expires">expires</a> </td> - -<td>Optional </td> - -<td>Profile data is purged after this period of time, specified in days.</td> - </tr> - </tbody> -</table> +<td> <a href="#update">update</a> </td> +<td> Required </td> +<td> One or more expressions executed when a message is applied to the profile.</td></tr> +<tr class="b"> +<td> <a href="#result">result</a> </td> +<td> Required </td> +<td> Stellar expressions that are executed when the window period expires.</td></tr> +<tr class="a"> +<td> <a href="#expires">expires</a> </td> +<td> Optional </td> +<td> Profile data is purged after this period of time, specified in days.</td></tr> +</tbody> +</table></div> <div class="section"> <h3><a name="profile"></a><tt>profile</tt></h3> <p><i>Required</i></p> -<p>A unique name identifying the profile. The field is treated as a string.</p></div> +<p>A unique name identifying the profile. The field is treated as a string.</p></div> <div class="section"> <h3><a name="foreach"></a><tt>foreach</tt></h3> <p><i>Required</i></p> -<p>A separate profile is maintained ‘for each’ of these. This is effectively the entity that the profile is describing. The field is expected to contain a Stellar expression whose result is the entity name. </p> +<p>A separate profile is maintained ‘for each’ of these. This is effectively the entity that the profile is describing. 
The field is expected to contain a Stellar expression whose result is the entity name.</p> <p>For example, if <tt>ip_src_addr</tt> then a separate profile would be maintained for each unique IP source address in the data; 10.0.0.1, 10.0.0.2, etc.</p></div> <div class="section"> <h3><a name="onlyif"></a><tt>onlyif</tt></h3> <p><i>Optional</i></p> -<p>An expression that determines if a message should be applied to the profile. A Stellar expression that returns a Boolean is expected. A message is only applied to a profile if this expression is true. This allows a profile to filter the messages that get applied to it.</p></div> +<p>An expression that determines if a message should be applied to the profile. A Stellar expression that returns a Boolean is expected. A message is only applied to a profile if this expression is true. This allows a profile to filter the messages that get applied to it.</p></div> <div class="section"> <h3><a name="groupBy"></a><tt>groupBy</tt></h3> <p><i>Optional</i></p> -<p>One or more Stellar expressions used to group the profile measurements when persisted. This can be used to sort the Profile data to allow for a contiguous scan when accessing subsets of the data. This is also one way to deal with calendar effects. For example, where activity on a weekday can be very different from a weekend.</p> -<p>A common use case would be grouping by day of week. This allows a contiguous scan to access all profile data for Mondays only. Using the following definition would achieve this.</p> +<p>One or more Stellar expressions used to group the profile measurements when persisted. This can be used to sort the Profile data to allow for a contiguous scan when accessing subsets of the data. This is also one way to deal with calendar effects. For example, where activity on a weekday can be very different from a weekend.</p> +<p>A common use case would be grouping by day of week. This allows a contiguous scan to access all profile data for Mondays only. 
Using the following definition would achieve this.</p> -<div class="source"> -<div class="source"> -<pre>"groupBy": [ "DAY_OF_WEEK(start)" ] +<div> +<div> +<pre class="source">"groupBy": [ "DAY_OF_WEEK(start)" ] </pre></div></div> -<p>The expression can reference any of these variables.</p> +<p>The expression can reference any of these variables.</p> <ul> - + <li>Any variable defined by the profile in its <tt>init</tt> or <tt>update</tt> expressions.</li> - <li><tt>profile</tt> The name of the profile.</li> - <li><tt>entity</tt> The name of the entity being profiled.</li> - <li><tt>start</tt> The start time of the profile period in epoch milliseconds.</li> - <li><tt>end</tt> The end time of the profile period in epoch milliseconds.</li> - <li><tt>duration</tt> The duration of the profile period in milliseconds.</li> - <li><tt>result</tt> The result of executing the <tt>result</tt> expression.</li> </ul></div> <div class="section"> <h3><a name="init"></a><tt>init</tt></h3> <p><i>Optional</i></p> -<p>One or more expressions executed at the start of a window period. A map is expected where the key is the variable name and the value is a Stellar expression. The map can contain zero or more variable:expression pairs. At the start of each window period, each expression is executed once and stored in the given variable. Note that constant init values such as “0” must be in quotes regardless of their type, as the init value must be a string to be executed by Stellar.</p> +<p>One or more expressions executed at the start of a window period. A map is expected where the key is the variable name and the value is a Stellar expression. The map can contain zero or more variable:expression pairs. At the start of each window period, each expression is executed once and stored in the given variable. 
Note that constant init values such as “0” must be in quotes regardless of their type, as the init value must be a string to be executed by Stellar.</p> -<div class="source"> -<div class="source"> -<pre>"init": { +<div> +<div> +<pre class="source">"init": { "var1": "0", "var2": "1" } -</pre></div></div></div> +</pre></div></div> +</div> <div class="section"> <h3><a name="update"></a><tt>update</tt></h3> <p><i>Required</i></p> -<p>One or more expressions executed when a message is applied to the profile. A map is expected where the key is the variable name and the value is a Stellar expression. The map can include 0 or more variables/expressions. When each message is applied to the profile, the expression is executed and stored in a variable with the given name.</p> +<p>One or more expressions executed when a message is applied to the profile. A map is expected where the key is the variable name and the value is a Stellar expression. The map can include 0 or more variables/expressions. When each message is applied to the profile, the expression is executed and stored in a variable with the given name.</p> -<div class="source"> -<div class="source"> -<pre>"update": { +<div> +<div> +<pre class="source">"update": { "var1": "var1 + 1", "var2": "var2 + 1" } -</pre></div></div></div> +</pre></div></div> +</div> <div class="section"> <h3><a name="result"></a><tt>result</tt></h3> <p><i>Required</i></p> -<p>Stellar expressions that are executed when the window period expires. The expressions are expected to summarize the messages that were applied to the profile over the window period. In the most basic form a single result is persisted for later retrieval.</p> +<p>Stellar expressions that are executed when the window period expires. The expressions are expected to summarize the messages that were applied to the profile over the window period. 
In the most basic form a single result is persisted for later retrieval.</p> -<div class="source"> -<div class="source"> -<pre>"result": "var1 + var2" +<div> +<div> +<pre class="source">"result": "var1 + var2" </pre></div></div> -<p>For more advanced use cases, a profile can generate two types of results. A profile can define one or both of these result types at the same time.</p> +<p>For more advanced use cases, a profile can generate two types of results. A profile can define one or both of these result types at the same time.</p> <ul> - -<li><tt>profile</tt>: A required expression that defines a value that is persisted for later retrieval.</li> - + +<li><tt>profile</tt>: A required expression that defines a value that is persisted for later retrieval.</li> <li><tt>triage</tt>: An optional expression that defines values that are accessible within the Threat Triage process.</li> </ul> <p><b>profile</b></p> -<p>A required Stellar expression that results in a value that is persisted in the profile store for later retrieval. The expression can result in any object that is Kryo serializable. These values can be retrieved for later use with the <a href="../metron-profiler-client/index.html">Profiler Client</a>.</p> +<p>A required Stellar expression that results in a value that is persisted in the profile store for later retrieval. The expression can result in any object that is Kryo serializable. These values can be retrieved for later use with the <a href="../metron-profiler-client/index.html">Profiler Client</a>.</p> -<div class="source"> -<div class="source"> -<pre>"result": { +<div> +<div> +<pre class="source">"result": { "profile": "2 + 2" } </pre></div></div> + <p>An alternative, simplified form is also acceptable.</p> -<div class="source"> -<div class="source"> -<pre>"result": "2 + 2" +<div> +<div> +<pre class="source">"result": "2 + 2" </pre></div></div> + <p><b>triage</b></p> -<p>An optional map of one or more Stellar expressions. 
The value of each expression is made available to the Threat Triage process under the given name. Each expression must result in either a primitive type, like an integer, long, or short, or a String. All other types will result in an error.</p> -<p>In the following example, three values, the minimum, the maximum and the mean are appended to a message. This message is consumed by Metron, like other sources of telemetry, and each of these values is accessible from within the Threat Triage process using the given field names: <tt>min</tt>, <tt>max</tt>, and <tt>mean</tt>.</p> +<p>An optional map of one or more Stellar expressions. The value of each expression is made available to the Threat Triage process under the given name. Each expression must result in either a primitive type, like an integer, long, or short, or a String. All other types will result in an error.</p> +<p>In the following example, three values, the minimum, the maximum and the mean are appended to a message. This message is consumed by Metron, like other sources of telemetry, and each of these values is accessible from within the Threat Triage process using the given field names: <tt>min</tt>, <tt>max</tt>, and <tt>mean</tt>.</p> -<div class="source"> -<div class="source"> -<pre>"result": { +<div> +<div> +<pre class="source">"result": { "triage": { "min": "STATS_MIN(stats)", "max": "STATS_MAX(stats)", "mean": "STATS_MEAN(stats)" } } -</pre></div></div></div> +</pre></div></div> +</div> <div class="section"> <h3><a name="expires"></a><tt>expires</tt></h3> <p><i>Optional</i></p> -<p>A numeric value that defines how many days the profile data is retained. After this time, the data expires and is no longer accessible. If no value is defined, the data does not expire.</p> +<p>A numeric value that defines how many days the profile data is retained. After this time, the data expires and is no longer accessible. 
If no value is defined, the data does not expire.</p> <p>The REPL can be a powerful tool for developing profiles. Read all about <a href="../metron-profiler-client/index.html#developing_profiles">Developing Profiles</a>.</p></div></div> <div class="section"> <h2><a name="Configuring_the_Profiler"></a>Configuring the Profiler</h2> -<p>The Profiler runs as an independent Storm topology. The configuration for the Profiler topology is stored on the local filesystem at <tt>$METRON_HOME/config/profiler.properties</tt>. The values can be changed on disk and then the Profiler topology must be restarted.</p> - +<p>The Profiler runs as an independent Storm topology. The configuration for the Profiler topology is stored on the local filesystem at <tt>$METRON_HOME/config/profiler.properties</tt>. After changing these values, the Profiler topology must be restarted for the changes to take effect.</p> <table border="0" class="table table-striped"> - <thead> - +<thead> + +<tr class="a"> +<th> Setting </th> +<th> Description</th></tr> +</thead><tbody> + +<tr class="b"> +<td> <a href="#profiler.input.topic"><tt>profiler.input.topic</tt></a> </td> +<td> The name of the input Kafka topic.</td></tr> +<tr class="a"> +<td> <a href="#profiler.output.topic"><tt>profiler.output.topic</tt></a> </td> +<td> The name of the output Kafka topic.</td></tr> +<tr class="b"> +<td> <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a> </td> +<td> The duration of each profile period.</td></tr> <tr class="a"> - -<th>Setting </th> - -<th>Description</th> - </tr> - </thead> - <tbody> - +<td> <a href="#profiler.period.duration.units"><tt>profiler.period.duration.units</tt></a> </td> +<td> The units used to specify the <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a>.</td></tr> <tr class="b"> - -<td><a href="#profiler.input.topic"><tt>profiler.input.topic</tt></a> </td> - -<td>The name of the Kafka topic from which to consume data.</td> - </tr> - +<td> <a 
href="#profiler.window.duration"><tt>profiler.window.duration</tt></a> </td> +<td> The duration of each profile window.</td></tr> <tr class="a"> - -<td><a href="#profiler.output.topic"><tt>profiler.output.topic</tt></a> </td> - -<td>The name of the Kafka topic to which profile data is written. Only used with profiles that define the <a href="#result"><tt>triage</tt> result field</a>.</td> - </tr> - +<td> <a href="#profilerpwindowdurationunits"><tt>profiler.window.duration.units</tt></a> </td> +<td> The units used to specify the <a href="#profiler.window.duration"><tt>profiler.window.duration</tt></a>.</td></tr> <tr class="b"> - -<td><a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a> </td> - -<td>The duration of each profile period.</td> - </tr> - +<td> <a href="#profiler.window.lag"><tt>profiler.window.lag</tt></a> </td> +<td> The maximum time lag for timestamps.</td></tr> <tr class="a"> - -<td><a href="#profiler.period.duration.units"><tt>profiler.period.duration.units</tt></a> </td> - -<td>The units used to specify the <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a>.</td> - </tr> - +<td> <a href="#profilerpwindowlagunits"><tt>profiler.window.lag.units</tt></a> </td> +<td> The units used to specify the <a href="#profiler.window.lag"><tt>profiler.window.lag</tt></a>.</td></tr> <tr class="b"> - -<td><a href="#profiler.workers"><tt>profiler.workers</tt></a> </td> - -<td>The number of worker processes for the topology.</td> - </tr> - +<td> <a href="#profiler.workers"><tt>profiler.workers</tt></a> </td> +<td> The number of worker processes for the topology.</td></tr> <tr class="a"> - -<td><a href="#profiler.executors"><tt>profiler.executors</tt></a> </td> - -<td>The number of executors to spawn per component.</td> - </tr> - +<td> <a href="#profiler.executors"><tt>profiler.executors</tt></a> </td> +<td> The number of executors to spawn per component.</td></tr> <tr class="b"> - -<td><a 
href="#profiler.ttl"><tt>profiler.ttl</tt></a> </td> - -<td>If a message has not been applied to a Profile in this period of time, the Profile will be forgotten and its resources will be cleaned up.</td> - </tr> - +<td> <a href="#profiler.ttl"><tt>profiler.ttl</tt></a> </td> +<td> If a message has not been applied to a Profile in this period of time, the Profile will be forgotten and its resources will be cleaned up.</td></tr> <tr class="a"> - -<td><a href="#profiler.ttl.units"><tt>profiler.ttl.units</tt></a> </td> - -<td>The units used to specify the <tt>profiler.ttl</tt>.</td> - </tr> - +<td> <a href="#profiler.ttl.units"><tt>profiler.ttl.units</tt></a> </td> +<td> The units used to specify the <tt>profiler.ttl</tt>.</td></tr> <tr class="b"> - -<td><a href="#profiler.hbase.salt.divisor"><tt>profiler.hbase.salt.divisor</tt></a> </td> - -<td>A salt is prepended to the row key to help prevent hotspotting.</td> - </tr> - +<td> <a href="#profiler.hbase.salt.divisor"><tt>profiler.hbase.salt.divisor</tt></a> </td> +<td> A salt is prepended to the row key to help prevent hot-spotting.</td></tr> <tr class="a"> - -<td><a href="#profiler.hbase.table"><tt>profiler.hbase.table</tt></a> </td> - -<td>The name of the HBase table that profiles are written to.</td> - </tr> - +<td> <a href="#profiler.hbase.table"><tt>profiler.hbase.table</tt></a> </td> +<td> The name of the HBase table that profiles are written to.</td></tr> <tr class="b"> - -<td><a href="#profiler.hbase.column.family"><tt>profiler.hbase.column.family</tt></a> </td> - -<td>The column family used to store profiles.</td> - </tr> - +<td> <a href="#profiler.hbase.column.family"><tt>profiler.hbase.column.family</tt></a> </td> +<td> The column family used to store profiles.</td></tr> <tr class="a"> - -<td><a href="#profiler.hbase.batch"><tt>profiler.hbase.batch</tt></a> </td> - -<td>The number of puts that are written to HBase in a single batch.</td> - </tr> - +<td> <a 
href="#profiler.hbase.batch"><tt>profiler.hbase.batch</tt></a> </td> +<td> The number of puts that are written to HBase in a single batch.</td></tr> <tr class="b"> - -<td><a href="#profiler.hbase.flush.interval.seconds"><tt>profiler.hbase.flush.interval.seconds</tt></a> </td> - -<td>The maximum number of seconds between batch writes to HBase.</td> - </tr> - </tbody> +<td> <a href="#profiler.hbase.flush.interval.seconds"><tt>profiler.hbase.flush.interval.seconds</tt></a> </td> +<td> The maximum number of seconds between batch writes to HBase.</td></tr> +<tr class="a"> +<td> <a href="#topology.kryo.register"><tt>topology.kryo.register</tt></a> </td> +<td> Storm will use Kryo serialization for these classes.</td></tr> +</tbody> </table> <div class="section"> <h3><a name="profiler.input.topic"></a><tt>profiler.input.topic</tt></h3> <p><i>Default</i>: indexing</p> -<p>The name of the Kafka topic from which to consume data. By default, the Profiler consumes data from the <tt>indexing</tt> topic so that it has access to fully enriched telemetry.</p></div> +<p>The name of the Kafka topic from which to consume data. By default, the Profiler consumes data from the <tt>indexing</tt> topic so that it has access to fully enriched telemetry.</p></div> <div class="section"> <h3><a name="profiler.output.topic"></a><tt>profiler.output.topic</tt></h3> <p><i>Default</i>: enrichments</p> -<p>The name of the Kafka topic to which profile data is written. This property is only applicable to profiles that define the <a href="#result"><tt>result</tt> <tt>triage</tt> field</a>. This allows Profile data to be selectively triaged like any other source of telemetry in Metron.</p></div> +<p>The name of the Kafka topic to which profile data is written. This property is only applicable to profiles that define the <a href="#result"><tt>result</tt> <tt>triage</tt> field</a>. 
This allows Profile data to be selectively triaged like any other source of telemetry in Metron.</p></div> <div class="section"> <h3><a name="profiler.period.duration"></a><tt>profiler.period.duration</tt></h3> <p><i>Default</i>: 15</p> -<p>The duration of each profile period. This value should be defined along with <a href="#profiler.period.duration.units"><tt>profiler.period.duration.units</tt></a>.</p> -<p><i>Important</i>: To read a profile using the <a href="metron-analytics/metron-profiler-client/index.html">Profiler Client</a>, the Profiler Client’s <tt>profiler.client.period.duration</tt> property must match this value. Otherwise, the Profiler Client will be unable to read the profile data. </p></div> +<p>The duration of each profile period. This value should be defined along with <a href="#profiler.period.duration.units"><tt>profiler.period.duration.units</tt></a>.</p> +<p><i>Important</i>: To read a profile using the <a href="metron-analytics/metron-profiler-client/index.html">Profiler Client</a>, the Profiler Client’s <tt>profiler.client.period.duration</tt> property must match this value. Otherwise, the Profiler Client will be unable to read the profile data.</p></div> <div class="section"> <h3><a name="profiler.period.duration.units"></a><tt>profiler.period.duration.units</tt></h3> <p><i>Default</i>: MINUTES</p> -<p>The units used to specify the <tt>profiler.period.duration</tt>. This value should be defined along with <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a>.</p> -<p><i>Important</i>: To read a profile using the Profiler Client, the Profiler Client’s <tt>profiler.client.period.duration.units</tt> property must match this value. Otherwise, the <a href="metron-analytics/metron-profiler-client/index.html">Profiler Client</a> will be unable to read the profile data.</p></div> +<p>The units used to specify the <tt>profiler.period.duration</tt>. 
This value should be defined along with <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a>.</p> +<p><i>Important</i>: To read a profile using the Profiler Client, the Profiler Client’s <tt>profiler.client.period.duration.units</tt> property must match this value. Otherwise, the <a href="metron-analytics/metron-profiler-client/index.html">Profiler Client</a> will be unable to read the profile data.</p></div> +<div class="section"> +<h3><a name="profiler.window.duration"></a><tt>profiler.window.duration</tt></h3> +<p><i>Default</i>: 30</p> +<p>The duration of each profile window. Telemetry that arrives within a slice of time is processed within a single window.</p> +<p>Many windows of telemetry will be processed during a single profile period. This does not change the output of the Profiler, it only changes how the Profiler processes data. The window defines how much data the Profiler processes in a single pass.</p> +<p>This value should be defined along with <a href="#profiler.window.duration.units"><tt>profiler.window.duration.units</tt></a>.</p> +<p>This value must be less than the period duration as defined by <a href="#profiler.period.duration"><tt>profiler.period.duration</tt></a> and <a href="#profiler.period.duration.units"><tt>profiler.period.duration.units</tt></a>.</p></div> +<div class="section"> +<h3><a name="profiler.window.duration.units"></a><tt>profiler.window.duration.units</tt></h3> +<p><i>Default</i>: SECONDS</p> +<p>The units used to specify the <tt>profiler.window.duration</tt>. This value should be defined along with <a href="#profiler.window.duration"><tt>profiler.window.duration</tt></a>.</p></div> +<div class="section"> +<h3><a name="profiler.window.lag"></a><tt>profiler.window.lag</tt></h3> +<p><i>Default</i>: 1</p> +<p>The maximum time lag for timestamps. Timestamps cannot arrive out-of-order by more than this amount. 
This value should be defined along with <a href="#profiler.window.lag.units"><tt>profiler.window.lag.units</tt></a>.</p></div> +<div class="section"> +<h3><a name="profiler.window.lag.units"></a><tt>profiler.window.lag.units</tt></h3> +<p><i>Default</i>: SECONDS</p> +<p>The units used to specify the <tt>profiler.window.lag</tt>. This value should be defined along with <a href="#profiler.window.lag"><tt>profiler.window.lag</tt></a>.</p></div> <div class="section"> <h3><a name="profiler.workers"></a><tt>profiler.workers</tt></h3> <p><i>Default</i>: 1</p> -<p>The number of worker processes to create for the Profiler topology. This property is useful for performance tuning the Profiler.</p></div> +<p>The number of worker processes to create for the Profiler topology. This property is useful for performance tuning the Profiler.</p></div> <div class="section"> <h3><a name="profiler.executors"></a><tt>profiler.executors</tt></h3> <p><i>Default</i>: 0</p> -<p>The number of executors to spawn per component for the Profiler topology. This property is useful for performance tuning the Profiler.</p></div> +<p>The number of executors to spawn per component for the Profiler topology. This property is useful for performance tuning the Profiler.</p></div> <div class="section"> <h3><a name="profiler.ttl"></a><tt>profiler.ttl</tt></h3> <p><i>Default</i>: 30</p> <p>If a message has not been applied to a Profile in this period of time, the Profile will be terminated and its resources will be cleaned up. This value should be defined along with <a href="#profiler.ttl.units"><tt>profiler.ttl.units</tt></a>.</p> -<p>This time-to-live does not affect the persisted Profile data in HBase. It only affects the state stored in memory during the execution of the latest profile period. This state will be deleted if the time-to-live is exceeded.</p></div> +<p>This time-to-live does not affect the persisted Profile data in HBase. 
It only affects the state stored in memory during the execution of the latest profile period. This state will be deleted if the time-to-live is exceeded.</p></div> <div class="section"> <h3><a name="profiler.ttl.units"></a><tt>profiler.ttl.units</tt></h3> <p><i>Default</i>: MINUTES</p> @@ -955,11 +848,11 @@ hbase(main):001:0> count 'profiler' <div class="section"> <h3><a name="profiler.hbase.salt.divisor"></a><tt>profiler.hbase.salt.divisor</tt></h3> <p><i>Default</i>: 1000</p> -<p>A salt is prepended to the row key to help prevent hotspotting. This constant is used to generate the salt. This constant should be roughly equal to the number of nodes in the HBase cluster to ensure even distribution of data.</p></div> +<p>A salt is prepended to the row key to help prevent hotspotting. This constant is used to generate the salt. This constant should be roughly equal to the number of nodes in the HBase cluster to ensure even distribution of data.</p></div> <div class="section"> <h3><a name="profiler.hbase.table"></a><tt>profiler.hbase.table</tt></h3> <p><i>Default</i>: profiler</p> -<p>The name of the HBase table that profile data is written to. The Profiler expects that the table exists and is writable. It will not create the table.</p></div> +<p>The name of the HBase table that profile data is written to. The Profiler expects that the table exists and is writable. 
It will not create the table.</p></div> <div class="section"> <h3><a name="profiler.hbase.column.family"></a><tt>profiler.hbase.column.family</tt></h3> <p><i>Default</i>: P</p> @@ -971,15 +864,37 @@ hbase(main):001:0> count 'profiler' <div class="section"> <h3><a name="profiler.hbase.flush.interval.seconds"></a><tt>profiler.hbase.flush.interval.seconds</tt></h3> <p><i>Default</i>: 30</p> -<p>The maximum number of seconds between batch writes to HBase.</p></div></div> +<p>The maximum number of seconds between batch writes to HBase.</p></div> +<div class="section"> +<h3><a name="topology.kryo.register"></a><tt>topology.kryo.register</tt></h3> +<p><i>Default</i>:</p> + +<div> +<div> +<pre class="source">[ org.apache.metron.profiler.ProfileMeasurement, \ + org.apache.metron.profiler.ProfilePeriod, \ + org.apache.metron.common.configuration.profiler.ProfileResult, \ + org.apache.metron.common.configuration.profiler.ProfileResultExpressions, \ + org.apache.metron.common.configuration.profiler.ProfileTriageExpressions, \ + org.apache.metron.common.configuration.profiler.ProfilerConfig, \ + org.apache.metron.common.configuration.profiler.ProfileConfig, \ + org.json.simple.JSONObject, \ + java.util.LinkedHashMap, \ + org.apache.metron.statistics.OnlineStatisticsProvider ] +</pre></div></div> + +<p>Storm will use Kryo serialization for these classes. Kryo serialization is more performant than Java serialization, in most cases.</p> +<p>For these classes, Storm will use Kryo’s <tt>FieldSerializer</tt> as defined in the <a class="externalLink" href="http://storm.apache.org/releases/1.1.2/Serialization.html">Storm Serialization docs</a>. For all other classes not in this list, Storm defaults to using Java serialization, which is slower and not recommended for a production topology.</p> +<p>This value should only need to be altered if you have defined a profile that results in a non-primitive, user-defined type that is not in this list. 
If the class is not defined in this list, Java serialization will be used and the class must adhere to Java’s serialization requirements.</p> +<p>The performance of the entire Profiler topology can be negatively impacted if any profile produces results that undergo Java serialization.</p></div></div> <div class="section"> <h2><a name="Examples"></a>Examples</h2> <p>The following examples are intended to highlight the functionality provided by the Profiler. Try out these examples easily in the Stellar Shell as described in the <a href="#Creating_Profiles">Creating Profiles</a> section.</p> -<p>These examples assume a fictitious input message stream that looks like the following. </p> +<p>These examples assume a fictitious input message stream that looks like the following.</p> -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> msgs := SHELL_EDIT() +<div> +<div> +<pre class="source">[Stellar]>>> msgs := SHELL_EDIT() [Stellar]>>> msgs [ { @@ -1002,13 +917,14 @@ hbase(main):001:0> count 'profiler' } ] </pre></div></div> + <div class="section"> <h3><a name="Example_1"></a>Example 1</h3> <p>The total number of bytes of HTTP data for each host. 
The following configuration would be used to generate this profile.</p> -<div class="source"> -<div class="source"> -<pre>{ +<div> +<div> +<pre class="source">{ "profiles": [ { "profile": "example1", @@ -1026,31 +942,25 @@ hbase(main):001:0> count 'profiler' ] } </pre></div></div> -<p>This creates a profile…</p> +<p>This creates a profile…</p> <ul> - + <li>Named ‘example1’</li> - <li>That for each IP source address</li> - <li>Only if the ‘protocol’ field equals ‘HTTP’</li> - <li>Initializes a counter ‘total_bytes’ to zero</li> - <li>Adds to ‘total_bytes’ the value of the message’s ‘bytes_in’ field</li> - <li>Returns ‘total_bytes’ as the result</li> - <li>The profile data will expire in 30 days</li> </ul></div> <div class="section"> <h3><a name="Example_2"></a>Example 2</h3> <p>The ratio of DNS traffic to HTTP traffic for each host. The following configuration would be used to generate this profile.</p> -<div class="source"> -<div class="source"> -<pre>{ +<div> +<div> +<pre class="source">{ "profiles": [ { "profile": "example2", @@ -1069,29 +979,24 @@ hbase(main):001:0> count 'profiler' ] } </pre></div></div> -<p>This creates a profile…</p> +<p>This creates a profile…</p> <ul> - + <li>Named ‘example2’</li> - <li>That for each IP source address</li> - <li>Only if the ‘protocol’ field equals ‘HTTP’ or ‘DNS’</li> - <li>Accumulates the number of DNS requests</li> - <li>Accumulates the number of HTTP requests</li> - <li>Returns the ratio of these as the result</li> </ul></div> <div class="section"> <h3><a name="Example_3"></a>Example 3</h3> <p>The average of the <tt>length</tt> field of HTTP traffic. 
The following configuration would be used to generate this profile.</p> -<div class="source"> -<div class="source"> -<pre>{ +<div> +<div> +<pre class="source">{ "profiles": [ { "profile": "example3", @@ -1103,28 +1008,24 @@ hbase(main):001:0> count 'profiler' ] } </pre></div></div> -<p>This creates a profile…</p> +<p>This creates a profile…</p> <ul> - + <li>Named ‘example3’</li> - <li>That for each IP source address</li> - <li>Only if the ‘protocol’ field is ‘HTTP’</li> - <li>Adds the <tt>length</tt> field from each message</li> - <li>Calculates the average as the result</li> </ul></div> <div class="section"> <h3><a name="Example_4"></a>Example 4</h3> -<p>It is important to note that the Profiler can persist any serializable Object, not just numeric values. An alternative to the previous example could take advantage of this. </p> -<p>Instead of storing the mean of the lengths, the profile could store a statistical summarization of the lengths. This summary can then be used at a later time to calculate the mean, min, max, percentiles, or any other sensible metric. This provides a much greater degree of flexibility.</p> +<p>It is important to note that the Profiler can persist any serializable Object, not just numeric values. An alternative to the previous example could take advantage of this.</p> +<p>Instead of storing the mean of the lengths, the profile could store a statistical summarization of the lengths. This summary can then be used at a later time to calculate the mean, min, max, percentiles, or any other sensible metric. This provides a much greater degree of flexibility.</p> -<div class="source"> -<div class="source"> -<pre>{ +<div> +<div> +<pre class="source">{ "profiles": [ { "profile": "example4", @@ -1136,94 +1037,101 @@ hbase(main):001:0> count 'profiler' ] } </pre></div></div> + <p>The following Stellar REPL session shows how you might use this summary to calculate different metrics with the same underlying profile data. 
It is assumed that the PROFILE_GET client is configured as described <a href="../metron-profiler-client/index.html">here</a>.</p> <p>Retrieve the last 30 minutes of profile measurements for a specific host.</p> -<div class="source"> -<div class="source"> -<pre>$ bin/stellar -z node1:2181 +<div> +<div> +<pre class="source">$ bin/stellar -z node1:2181 [Stellar]>>> stats := PROFILE_GET( "example4", "10.0.0.1", PROFILE_FIXED(30, "MINUTES")) [Stellar]>>> stats [org.apache.metron.common.math.stats.OnlineStatisticsProvider@79fe4ab9, ...] </pre></div></div> + <p>Calculate different metrics with the same profile data.</p> -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> STATS_MEAN( GET_FIRST( stats)) +<div> +<div> +<pre class="source">[Stellar]>>> STATS_MEAN( GET_FIRST( stats)) 15979.0625 [Stellar]>>> STATS_PERCENTILE( GET_FIRST(stats), 90) 30310.958 </pre></div></div> + <p>Merge all of the profile measurements over the past 30 minutes into a single summary and calculate the 90th percentile.</p> -<div class="source"> -<div class="source"> -<pre>[Stellar]>>> merged := STATS_MERGE( stats) +<div> +<div> +<pre class="source">[Stellar]>>> merged := STATS_MERGE( stats) [Stellar]>>> STATS_PERCENTILE(merged, 90) 29810.992 </pre></div></div> + <p>More information on accessing profile data can be found in the <a href="../metron-profiler-client/index.html">Profiler Client</a>.</p> <p>More information on using the <a href="../../metron-platform/metron-common/index.html"><tt>STATS_*</tt> functions in Stellar can be found here</a>.</p></div></div> <div class="section"> <h2><a name="Implementation"></a>Implementation</h2></div> <div class="section"> <h2><a name="Key_Classes"></a>Key Classes</h2> - <ul> - + <li> -<p><tt>ProfileMeasurement</tt> - Represents a single data point within a Profile. A Profile is effectively a time series. 
To this end a Profile is composed of many ProfileMeasurement values which in aggregate form a time series.</p></li> - + +<p><tt>ProfileMeasurement</tt> - Represents a single data point within a Profile. A Profile is effectively a time series. To this end a Profile is composed of many ProfileMeasurement values which in aggregate form a time series.</p> +</li> <li> -<p><tt>ProfilePeriod</tt> - The Profiler captures one <tt>ProfileMeasurement</tt> each <tt>ProfilePeriod</tt>. A <tt>ProfilePeriod</tt> will occur at fixed, deterministic points in time. This allows for efficient retrieval of profile data.</p></li> - + +<p><tt>ProfilePeriod</tt> - The Profiler captures one <tt>ProfileMeasurement</tt> each <tt>ProfilePeriod</tt>. A <tt>ProfilePeriod</tt> will occur at fixed, deterministic points in time. This allows for efficient retrieval of profile data.</p> +</li> <li> -<p><tt>RowKeyBuilder</tt> - Builds row keys that can be used to read or write profile data to HBase.</p></li> - + +<p><tt>RowKeyBuilder</tt> - Builds row keys that can be used to read or write profile data to HBase.</p> +</li> <li> -<p><tt>ColumnBuilder</tt> - Defines the columns of data stored with a profile measurement.</p></li> - + +<p><tt>ColumnBuilder</tt> - Defines the columns of data stored with a profile measurement.</p> +</li> <li> -<p><tt>ProfileHBaseMapper</tt> - Defines for the <tt>HBaseBolt</tt> how profile measurements are stored in HBase. This class leverages a <tt>RowKeyBuilder</tt> and <tt>ColumnBuilder</tt>.</p></li> + +<p><tt>ProfileHBaseMapper</tt> - Defines for the <tt>HBaseBolt</tt> how profile measurements are stored in HBase. 
This class leverages a <tt>RowKeyBuilder</tt> and <tt>ColumnBuilder</tt>.</p> +</li> </ul></div> <div class="section"> <h2><a name="Storm_Topology"></a>Storm Topology</h2> <p>The Profiler is implemented as a Storm topology using the following bolts and spouts.</p> - <ul> - + <li> -<p><tt>KafkaSpout</tt> - A spout that consumes messages from a single Kafka topic. In most cases, the Profiler topology will consume messages from the <tt>indexing</tt> topic. This topic contains fully enriched messages that are ready to be indexed. This ensures that profiles can take advantage of all the available data elements.</p></li> - + +<p><tt>KafkaSpout</tt> - A spout that consumes messages from a single Kafka topic. In most cases, the Profiler topology will consume messages from the <tt>indexing</tt> topic. This topic contains fully enriched messages that are ready to be indexed. This ensures that profiles can take advantage of all the available data elements.</p> +</li> <li> -<p><tt>ProfileSplitterBolt</tt> - The bolt responsible for filtering incoming messages and directing each to the one or more downstream bolts that are responsible for building a profile. Each message may be needed by 0, 1 or even many profiles. Each emitted tuple contain
<TRUNCATED>