metron git commit: METRON-1050 Improve Docs of `profiler.period.duration` (nickwallen) closes apache/metron#656

nickallen Tue, 25 Jul 2017 14:57:42 -0700

Repository: metron
Updated Branches:
  refs/heads/master 577ff80e3 -> 5c0ac32d1



METRON-1050 Improve Docs of `profiler.period.duration`  (nickwallen) closes apache/metron#656


Project: http://git-wip-us.apache.org/repos/asf/metron/repo
Commit: http://git-wip-us.apache.org/repos/asf/metron/commit/5c0ac32d
Tree: http://git-wip-us.apache.org/repos/asf/metron/tree/5c0ac32d
Diff: http://git-wip-us.apache.org/repos/asf/metron/diff/5c0ac32d

Branch: refs/heads/master
Commit: 5c0ac32d19e3805ec4b7ac587ed196e0431f8b35
Parents: 577ff80
Author: nickwallen <n...@nickallen.org>
Authored: Tue Jul 25 17:56:36 2017 -0400
Committer: nickallen <nickal...@apache.org>
Committed: Tue Jul 25 17:56:36 2017 -0400

----------------------------------------------------------------------
 .../metron-profiler-client/README.md            |   1 -
 metron-analytics/metron-profiler/README.md      | 138 +++++++++++++++----
 .../src/main/config/profiler.properties         |   4 +-
 metron-deployment/README.md                     |   5 +-
 site-book/bin/fix-md-dialect.py                 |  11 +-
 5 files changed, 120 insertions(+), 39 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/metron/blob/5c0ac32d/metron-analytics/metron-profiler-client/README.md
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler-client/README.md b/metron-analytics/metron-profiler-client/README.md
index 220dd5d..dcf30f6 100644
--- a/metron-analytics/metron-profiler-client/README.md
+++ b/metron-analytics/metron-profiler-client/README.md
@@ -60,7 +60,6 @@ want to change the global Client configuration so as not to disrupt the work of
 | profiler.client.salt.divisor          | The salt divisor used to store profile data.                                                                                       | Optional | 1000     |
 | hbase.provider.impl                   | The name of the HBaseTableProvider implementation class.                                                                           | Optional |          |
 
-
 ### Profile Selectors
 
 You will notice that the third argument for `PROFILE_GET` is a list of `ProfilePeriod` objects.  This list is expected to

http://git-wip-us.apache.org/repos/asf/metron/blob/5c0ac32d/metron-analytics/metron-profiler/README.md
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler/README.md b/metron-analytics/metron-profiler/README.md
index 7a68969..66c5557 100644
--- a/metron-analytics/metron-profiler/README.md
+++ b/metron-analytics/metron-profiler/README.md
@@ -17,7 +17,7 @@ Any field contained within a message can be used to generate a profile.  A profi
 
 Follow these instructions to install the Profiler.  This assumes that core 
Metron has already been installed and validated.  
 
-1. Build the Metron RPMs by [following these instructions](../../metron-deployment#rpm).  
+1. Build the Metron RPMs (see Building the [RPMs](../../metron-deployment#rpms)).  
 
     You may have already built the Metron RPMs when core Metron was installed.
 
@@ -58,7 +58,7 @@ Follow these instructions to install the Profiler.  This assumes that core Metro
     /usr/metron/0.4.1/lib/metron-profiler-0.4.0-uber.jar
     ```
 
-1. Create a table within HBase that will store the profile data. The table name and column family must match the [Profiler's configuration](#configuring-the-profiler).  By default, the table is named `profiler` with a column family `P`.
+1. Create a table within HBase that will store the profile data. By default, the table is named `profiler` with a column family `P`.  The table name and column family must match the Profiler's configuration (see [Configuring the Profiler](#configuring-the-profiler)).  
 
     ```
     $ /usr/hdp/current/hbase-client/bin/hbase shell
@@ -83,9 +83,9 @@ At this point the Profiler is running and consuming telemetry messages.  We have
 
 ## Getting Started
 
-This section will describe the steps required to get your first "Hello, 
World!"" profile running.  This assumes that you have successfully [installed 
the Profiler](#installation) and have it running.
+This section will describe the steps required to get your first "Hello, 
World!"" profile running.  This assumes that you have a successful Profiler 
[Installation](#installation) and have it running.
 
-1. Create the profile definition in a file located at 
`$METRON_HOME/config/zookeeper/profiler.json`.  
+1. Create the profile definition in a file located at 
`$METRON_HOME/config/zookeeper/profiler.json`.  This file will likely not 
exist, if you have never created Profiles before.
 
     The following example will create a profile that simply counts the number 
of messages per `ip_src_addr`.
     ```
@@ -129,7 +129,7 @@ This section will describe the steps required to get your first "Hello, World!""
     }
     ```
 
-1. Ensure that test messages are being sent to the Profiler's input topic in 
Kafka.  The Profiler will consume messages from the `inputTopic` defined in the 
[Profiler's configuration](#configuring-the-profiler).  By default this is the 
`indexing` topic.
+1. Ensure that test messages are being sent to the Profiler's input topic in 
Kafka.  The Profiler will consume messages from the input topic defined in the 
Profiler's configuration (see [Configuring the 
Profiler](#configuring-the-profiler)).  By default this is the `indexing` topic.
 
 1. Check the HBase table to validate that the Profiler is writing the profile. 
 Remember that the Profiler is flushing the profile every 15 minutes.  You will 
need to wait at least this long to start seeing profile data in HBase.
     ```
@@ -137,15 +137,19 @@ This section will describe the steps required to get your first "Hello, World!""
     hbase(main):001:0> count 'profiler'
     ```
 
-1. Use the Profiler Client to read the profile data.  The below example 
`PROFILE_GET` command will read data written by the sample profile given above, 
if 10.0.0.1 is one of the input values for `ip_src_addr`.
-More information on configuring and using the client can be found 
[here](../metron-profiler-client).
-It is assumed that the `PROFILE_GET` client is correctly configured before 
using it.
+1. Use the [Profiler Client](../metron-profiler-client) to read the profile 
data.  The following `PROFILE_GET` command will read the data written by the 
`hello-world` profile. This assumes that `10.0.0.1` is one of the values for 
`ip_src_addr` contained within the telemetry consumed by the Profiler.
+
     ```
     $ bin/stellar -z node1:2181
     [Stellar]>>> PROFILE_GET( "hello-world", "10.0.0.1", PROFILE_FIXED(30, "MINUTES"))
     [451, 448]
     ```
 
+    This result indicates that over the past 30 minutes, the Profiler stored 
two values related to the source IP address "10.0.0.1".  In the first 15 minute 
period, the IP `10.0.0.1` was seen in 451 telemetry messages.  In the second 15 
minute period, the same IP was seen in 448 telemetry messages.
+
+    It is assumed that the `PROFILE_GET` client is correctly configured to 
match the Profile configuration before using it to read that Profile.  More 
information on configuring and using the Profiler client can be found 
[here](../metron-profiler-client).  
+
+
 ## Creating Profiles
 
 The Profiler specification requires a JSON-formatted set of elements, many of 
which can contain Stellar code.  The specification contains the following 
elements.  (For the impatient, skip ahead to the [Examples](#examples).)
@@ -275,27 +279,105 @@ The Profiler runs as an independent Storm topology.  The configuration for the P
 The values can be changed on disk and then the Profiler topology must be restarted.
 
 
-| Setting                               | Description
-|---                                    |---
-| profiler.workers                      | The number of worker processes to create for the topology.
-| profiler.executors                    | The number of executors to spawn per component.
-| profiler.input.topic                  | The name of the Kafka topic from which to consume data.
-| profiler.output.topic                 | The name of the Kafka topic to which profile data is written.  Only used with profiles that use the [`triage` result field](#result).
-| profiler.period.duration              | The duration of each profile period.  This value should be defined along with `profiler.period.duration.units`.
-| profiler.period.duration.units        | The units used to specify the `profiler.period.duration`.
-| profiler.ttl                          | If a message has not been applied to a Profile in this period of time, the Profile will be forgotten and its resources will be cleaned up. This value should be defined along with `profiler.ttl.units`.
-| profiler.ttl.units                    | The units used to specify the `profiler.ttl`.
-| profiler.hbase.salt.divisor           | A salt is prepended to the row key to help prevent hotspotting.  This constant is used to generate the salt.  Ideally, this constant should be roughly equal to the number of nodes in the Hbase cluster.
-| profiler.hbase.table                  | The name of the HBase table that profiles are written to.
-| profiler.hbase.column.family          | The column family used to store profiles.
-| profiler.hbase.batch                  | The number of puts that are written in a single batch.
-| profiler.hbase.flush.interval.seconds | The maximum number of seconds between batch writes to HBase.
-
-After altering the configuration, start the Profiler.
+| Setting                                                                       | Description
+|---                                                                            |---
+| [`profiler.input.topic`](#profilerinputtopic)                                 | The name of the Kafka topic from which to consume data.
+| [`profiler.output.topic`](#profileroutputtopic)                               | The name of the Kafka topic to which profile data is written.  Only used with profiles that define the [`triage` result field](#result).
+| [`profiler.period.duration`](#profilerperiodduration)                         | The duration of each profile period.  
+| [`profiler.period.duration.units`](#profilerperioddurationunits)              | The units used to specify the [`profiler.period.duration`](#profilerperiodduration).  
+| [`profiler.workers`](#profilerworkers)                                        | The number of worker processes for the topology.
+| [`profiler.executors`](#profilerexecutors)                                    | The number of executors to spawn per component.
+| [`profiler.ttl`](#profilerttl)                                                | If a message has not been applied to a Profile in this period of time, the Profile will be forgotten and its resources will be cleaned up.
+| [`profiler.ttl.units`](#profilerttlunits)                                     | The units used to specify the `profiler.ttl`.
+| [`profiler.hbase.salt.divisor`](#profilerhbasesaltdivisor)                    | A salt is prepended to the row key to help prevent hotspotting.
+| [`profiler.hbase.table`](#profilerhbasetable)                                 | The name of the HBase table that profiles are written to.
+| [`profiler.hbase.column.family`](#profilerhbasecolumnfamily)                  | The column family used to store profiles.
+| [`profiler.hbase.batch`](#profilerhbasebatch)                                 | The number of puts that are written to HBase in a single batch.
+| [`profiler.hbase.flush.interval.seconds`](#profilerhbaseflushintervalseconds) | The maximum number of seconds between batch writes to HBase.
 
-```
-$ $METRON_HOME/start_profiler_topology.sh
-```
+### `profiler.input.topic`
+
+*Default*: indexing
+
+The name of the Kafka topic from which to consume data.  By default, the 
Profiler consumes data from the `indexing` topic so that it has access to fully 
enriched telemetry.
+
+### `profiler.output.topic`
+
+*Default*: enrichments
+
+The name of the Kafka topic to which profile data is written.  This property 
is only applicable to profiles that define  the [`result` `triage` 
field](#result).  This allows Profile data to be selectively triaged like any 
other source of telemetry in Metron.
+
+### `profiler.period.duration`
+
+*Default*: 15
+
+The duration of each profile period.  This value should be defined along with 
[`profiler.period.duration.units`](#profilerperioddurationunits).
+
+*Important*: To read a profile using the [Profiler 
Client](metron-analytics/metron-profiler-client), the Profiler Client's 
`profiler.client.period.duration` property must match this value.  Otherwise, 
the Profiler Client will be unable to read the profile data.  
+
+### `profiler.period.duration.units`
+
+*Default*: MINUTES
+
+The units used to specify the `profiler.period.duration`.  This value should 
be defined along with [`profiler.period.duration`](#profilerperiodduration).
+
+*Important*: To read a profile using the Profiler Client, the Profiler 
Client's `profiler.client.period.duration.units` property must match this 
value.  Otherwise, the [Profiler 
Client](metron-analytics/metron-profiler-client) will be unable to read the 
profile data.
+
+### `profiler.workers`
+
+*Default*: 1
+
+The number of worker processes to create for the Profiler topology.  This 
property is useful for performance tuning the Profiler.
+
+### `profiler.executors`
+
+*Default*: 0
+
+The number of executors to spawn per component for the Profiler topology.  
This property is useful for performance tuning the Profiler.
+
+### `profiler.ttl`
+
+*Default*: 30
+
+ If a message has not been applied to a Profile in this period of time, the 
Profile will be terminated and its resources will be cleaned up. This value 
should be defined along with [`profiler.ttl.units`](#profilerttlunits).
+
+ This time-to-live does not affect the persisted Profile data in HBase.  It 
only affects the state stored in memory during the execution of the latest 
profile period.  This state will be deleted if the time-to-live is exceeded.
+
+### `profiler.ttl.units`
+
+*Default*: MINUTES
+
+The units used to specify the [`profiler.ttl`](#profilerttl).
+
+### `profiler.hbase.salt.divisor`
+
+*Default*: 1000
+
+A salt is prepended to the row key to help prevent hotspotting.  This constant 
is used to generate the salt.  This constant should be roughly equal to the 
number of nodes in the Hbase cluster to ensure even distribution of data.
+
+### `profiler.hbase.table`
+
+*Default*: profiler
+
+The name of the HBase table that profile data is written to.  The Profiler 
expects that the table exists and is writable.  It will not create the table.
+
+### `profiler.hbase.column.family`
+
+*Default*: P
+
+The column family used to store profile data in HBase.
+
+### `profiler.hbase.batch`
+
+*Default*: 10
+
+The number of puts that are written to HBase in a single batch.
+
+### `profiler.hbase.flush.interval.seconds`
+
+*Default*: 30
+
+The maximum number of seconds between batch writes to HBase.
 
 ## Examples
 

http://git-wip-us.apache.org/repos/asf/metron/blob/5c0ac32d/metron-analytics/metron-profiler/src/main/config/profiler.properties
----------------------------------------------------------------------
diff --git a/metron-analytics/metron-profiler/src/main/config/profiler.properties b/metron-analytics/metron-profiler/src/main/config/profiler.properties
index f020b30..873c837 100644
--- a/metron-analytics/metron-profiler/src/main/config/profiler.properties
+++ b/metron-analytics/metron-profiler/src/main/config/profiler.properties
@@ -24,12 +24,12 @@ topology.worker.childopts=
 
 ##### Profiler #####
 
-profiler.workers=1
-profiler.executors=0
 profiler.input.topic=indexing
 profiler.output.topic=enrichments
 profiler.period.duration=15
 profiler.period.duration.units=MINUTES
+profiler.workers=1
+profiler.executors=0
 profiler.ttl=30
 profiler.ttl.units=MINUTES
 profiler.hbase.salt.divisor=1000
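
The README change above stresses that the Profiler Client's `profiler.client.period.duration` and `profiler.client.period.duration.units` must match the Profiler's own period settings. A minimal sketch of such a consistency check follows. The helper names `parse_properties` and `period_mismatches` are hypothetical and not part of Metron, and the client settings are assumed to be available as a plain dictionary.

```python
def parse_properties(text):
    """Parse simple key=value properties text, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Property pairs that the README says must agree (Profiler side, Client side).
PERIOD_PAIRS = [
    ("profiler.period.duration", "profiler.client.period.duration"),
    ("profiler.period.duration.units", "profiler.client.period.duration.units"),
]


def period_mismatches(profiler_props, client_config):
    """Return the (profiler_key, client_key) pairs whose values differ."""
    return [
        (p_key, c_key)
        for p_key, c_key in PERIOD_PAIRS
        if str(profiler_props.get(p_key)) != str(client_config.get(c_key))
    ]


# Values taken from the profiler.properties hunk above.
PROFILER_PROPERTIES = """
##### Profiler #####
profiler.input.topic=indexing
profiler.period.duration=15
profiler.period.duration.units=MINUTES
"""

profiler_props = parse_properties(PROFILER_PROPERTIES)

# Hypothetical client settings used only for illustration.
matching_client = {
    "profiler.client.period.duration": "15",
    "profiler.client.period.duration.units": "MINUTES",
}
stale_client = {
    "profiler.client.period.duration": "30",
    "profiler.client.period.duration.units": "MINUTES",
}

print(period_mismatches(profiler_props, matching_client))
print(period_mismatches(profiler_props, stale_client))
```

An empty result means the two sides agree. Any returned pair names a setting that, per the README, would leave the client unable to read the profile data.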

http://git-wip-us.apache.org/repos/asf/metron/blob/5c0ac32d/metron-deployment/README.md
----------------------------------------------------------------------
diff --git a/metron-deployment/README.md b/metron-deployment/README.md
index dd3f510..9470fb5 100644
--- a/metron-deployment/README.md
+++ b/metron-deployment/README.md
@@ -61,7 +61,7 @@ This will set up
 ### Prerequisites
 - A cluster managed by Ambari 2.4.2+
 - Metron RPMs available on the cluster in the /localrepo directory.  See [RPM](#rpm) for further information.
-- [Node.js](https://nodejs.org/en/download/package-manager/) repository installed on the Management UI host 
+- [Node.js](https://nodejs.org/en/download/package-manager/) repository installed on the Management UI host
 
 ### Building Management Pack
 From `metron-deployment` run
@@ -104,7 +104,7 @@ There are a set of limitations that should be addressed based to improve the cur
 - Several configuration parameters used when installing the Metron service 
could (and should) be grabbed from Ambari.  Install will require them to be 
manually entered.
 - Need to handle upgrading Metron
 
-## RPM
+## RPMs
 RPMs can be built to install the components in metron-platform. These RPMs are 
built in a Docker container and placed into `target`.
 
 Components in the RPMs:
@@ -178,4 +178,3 @@ Using the MPack is preferred, but instructions for Kerberizing manually can be f
 
 ## TODO
 - Support Ubuntu deployments
-

http://git-wip-us.apache.org/repos/asf/metron/blob/5c0ac32d/site-book/bin/fix-md-dialect.py
----------------------------------------------------------------------
diff --git a/site-book/bin/fix-md-dialect.py b/site-book/bin/fix-md-dialect.py
index 5e6db3e..02be2fb 100755
--- a/site-book/bin/fix-md-dialect.py
+++ b/site-book/bin/fix-md-dialect.py
@@ -59,7 +59,8 @@ import inspect
 import re
 
 # These are the characters excluded by Markdown from use in auto-generated anchor text for Headings.
-EXCLUDED_CHARS_REGEX = r'[^\w\-]'   # all non-alphanumerics except "-" and "_".  Whitespace are previously converted.
+EXCLUDED_CHARS_REGEX_GHM = r'[^\w\-]'   # all non-alphanumerics except "-" and "_".  Whitespace are previously converted.
+EXCLUDED_CHARS_REGEX_DOX = r'[^\w\.\-]'   # all non-alphanumerics except "-", "_", and ".".  Whitespace are previously converted.
 
 def report_error(s) :
     print >>sys.stderr, "ERROR: " + s 
@@ -242,12 +243,12 @@ def rewrite_relative_links() :
             trace('labeltext = "' + labeltext + '"')
             scratch = labeltext.lower()                  # Github-MD forces all anchors to lowercase
             scratch = re.sub(r'[\s]', "-", scratch)      # convert whitespace to "-"
-            scratch = re.sub(EXCLUDED_CHARS_REGEX, "", scratch)  # strip non-alphanumerics
+            scratch = re.sub(EXCLUDED_CHARS_REGEX_GHM, "", scratch)  # strip non-alphanumerics
             if (scratch == named_anchor) :
                 trace("Found a rewritable case")
                 scratch = labeltext                      # Doxia-markdown doesn't change case
                 scratch = re.sub(r'[\s]', "_", scratch)  # convert whitespace to "_"
-                scratch = re.sub(EXCLUDED_CHARS_REGEX, "", scratch)  # strip non-alphanumerics
+                scratch = re.sub(EXCLUDED_CHARS_REGEX_DOX, "", scratch)  # strip non-alphanumerics except "."
                 href = re.sub("#" + named_anchor, "#" + scratch, href)
 
         trace("After anchor rewrite, href is: " + href)
@@ -372,9 +373,9 @@ for FILENAME in sys.argv[1:] :
             active_type = "none"
             indent_stack.init_indent()
             if re.search(r'^#[^#]', inputline) :
-                # First-level headers ("H1") need explicit anchor inserted.  This fixes problem #6.
+                # First-level headers ("H1") need explicit anchor inserted (Doxia style).  This fixes problem #6.
                 anchor_name = re.sub(r' ', "_", inputline[1:].strip())
-                anchor_name = re.sub(EXCLUDED_CHARS_REGEX, "", anchor_name)
+                anchor_name = re.sub(EXCLUDED_CHARS_REGEX_DOX, "", anchor_name)
                 anchor_text = '<a name="' + anchor_name + '"></a>'
                 if H1_COUNT == 0 :
                     # Treat the first header differently - put the header after instead of before
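
The two regexes introduced in `fix-md-dialect.py` differ only in whether "." survives in the generated anchor. The sketch below, written for Python 3, copies the two patterns from the diff and wraps the steps shown in the hunk in two helpers. The function names `github_anchor` and `doxia_anchor` are hypothetical and do not exist in the script, and passing the heading text with its backticks is an assumption about what `labeltext` contains.

```python
import re

# Patterns copied from the diff above.
EXCLUDED_CHARS_REGEX_GHM = r'[^\w\-]'    # keeps alphanumerics, "_" and "-"
EXCLUDED_CHARS_REGEX_DOX = r'[^\w\.\-]'  # additionally keeps "."


def github_anchor(labeltext):
    """GitHub-style anchor: lowercase, whitespace to "-", strip excluded chars."""
    scratch = labeltext.lower()
    scratch = re.sub(r'[\s]', "-", scratch)
    return re.sub(EXCLUDED_CHARS_REGEX_GHM, "", scratch)


def doxia_anchor(labeltext):
    """Doxia-style anchor: case preserved, whitespace to "_", "." retained."""
    scratch = re.sub(r'[\s]', "_", labeltext)
    return re.sub(EXCLUDED_CHARS_REGEX_DOX, "", scratch)


for heading in ["`profiler.period.duration`", "Configuring the Profiler"]:
    print(heading, "->", github_anchor(heading), "|", doxia_anchor(heading))
```

Under these assumptions the heading `profiler.period.duration` yields the GitHub anchor `profilerperiodduration`, which is the form used by the new README links, while the Doxia form keeps the dots.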
