ajothomas Wed, 18 Jan 2023 11:34:08 -0800

Modified: samza/site/learn/documentation/latest/operations/monitoring.html
URL: 
http://svn.apache.org/viewvc/samza/site/learn/documentation/latest/operations/monitoring.html?rev=1906774&r1=1906773&r2=1906774&view=diff
==============================================================================
--- samza/site/learn/documentation/latest/operations/monitoring.html (original)
+++ samza/site/learn/documentation/latest/operations/monitoring.html Wed Jan 18 
19:33:25 2023
@@ -227,6 +227,12 @@
     
       
         
+      <a class="side-navigation__group-item" data-match-active="" 
href="/releases/1.8.0">1.8.0</a>
+      
+        
+      <a class="side-navigation__group-item" data-match-active="" 
href="/releases/1.7.0">1.7.0</a>
+      
+        
       <a class="side-navigation__group-item" data-match-active="" 
href="/releases/1.6.0">1.6.0</a>
       
         
@@ -538,6 +544,14 @@
               
               
 
+              <li class="hide"><a 
href="/learn/documentation/1.8.0/operations/monitoring">1.8.0</a></li>
+
+              
+
+              <li class="hide"><a 
href="/learn/documentation/1.7.0/operations/monitoring">1.7.0</a></li>
+
+              
+
               <li class="hide"><a 
href="/learn/documentation/1.6.0/operations/monitoring">1.6.0</a></li>
 
               
@@ -646,184 +660,213 @@
 <p>Like any other production software, it is critical to monitor the health of 
our Samza jobs. Samza relies on metrics for monitoring and includes an 
extensible metrics library. While a few standard metrics are provided 
out-of-the-box, it is easy to define metrics specific to your application.</p>
 
 <ul>
-<li><a href="#a-metrics-reporters">A. Metrics Reporters</a>
-
-<ul>
-<li><a href="#jmxreporter">A.1 Reporting Metrics to JMX (JMX Reporter)</a>
-
-<ul>
-<li><a href="#enablejmxreporter">Enabling the JMX Reporter</a></li>
-<li><a href="#jmxreporter">Using the JMX Reporter</a></li>
-</ul></li>
-<li><a href="#snapshotreporter">A.2 Reporting Metrics to Kafka 
(MetricsSnapshot Reporter)</a>
-
-<ul>
-<li><a href="#enablesnapshotreporter">Enabling the MetricsSnapshot 
Reporter</a></li>
-</ul></li>
-<li><a href="#customreporter">A.3 Creating a Custom MetricsReporter</a></li>
-</ul></li>
-<li><a href="#metrictypes">B. Metric Types in Samza</a></li>
-<li><a href="#userdefinedmetrics">C. Adding User-Defined Metrics</a>
-
-<ul>
-<li><a href="#lowlevelapi">Low Level Task API</a></li>
-<li><a href="#highlevelapi">High Level Streams API</a></li>
-</ul></li>
-<li><a href="#keyinternalsamzametrics">D. Key Internal Samza Metrics</a>
-
-<ul>
-<li><a href="#vitalmetrics">D.1 Vital Metrics</a></li>
-<li><a href="#storemetrics">D.2 Store Metrics</a></li>
-<li><a href="#operatormetrics">D.3 Operator Metrics</a></li>
-</ul></li>
-<li><a href="#metricssheet">E. Metrics Reference Sheet</a></li>
+  <li><a href="#a-metrics-reporters">A. Metrics Reporters</a>
+    <ul>
+      <li><a href="#jmxreporter">A.1 Reporting Metrics to JMX (JMX 
Reporter)</a>
+        <ul>
+          <li><a href="#enablejmxreporter">Enabling the JMX Reporter</a></li>
+          <li><a href="#usejmxreporter">Using the JMX Reporter</a></li>
+        </ul>
+      </li>
+      <li><a href="#snapshotreporter">A.2 Reporting Metrics to Kafka 
(MetricsSnapshot Reporter)</a>
+        <ul>
+          <li><a href="#enablesnapshotreporter">Enabling the MetricsSnapshot 
Reporter</a></li>
+        </ul>
+      </li>
+      <li><a href="#customreporter">A.3 Creating a Custom 
MetricsReporter</a></li>
+    </ul>
+  </li>
+  <li><a href="#metrictypes">B. Metric Types in Samza</a></li>
+  <li><a href="#userdefinedmetrics">C. Adding User-Defined Metrics</a>
+    <ul>
+      <li><a href="#lowlevelapi">Low Level Task API</a></li>
+      <li><a href="#highlevelapi">High Level Streams API</a></li>
+    </ul>
+  </li>
+  <li><a href="#keyinternalsamzametrics">D. Key Internal Samza Metrics</a>
+    <ul>
+      <li><a href="#vitalmetrics">D.1 Vital Metrics</a></li>
+      <li><a href="#storemetrics">D.2 Store Metrics</a></li>
+      <li><a href="#operatormetrics">D.3 Operator Metrics</a></li>
+    </ul>
+  </li>
+  <li><a href="#metricssheet">E. Metrics Reference Sheet</a></li>
 </ul>
 
 <h2 id="a-metrics-reporters">A. Metrics Reporters</h2>
 
-<p>Samza&rsquo;s metrics library encapsulates the metrics collection and 
sampling logic. Metrics Reporters in Samza are responsible for emitting metrics 
to external services which may archive, process, visualize the metrics&rsquo; 
values, or trigger alerts based on them.</p>
+<p>Samza's metrics library encapsulates the metrics collection and sampling 
logic. Metrics Reporters in Samza are responsible for emitting metrics to 
external services which may archive, process, visualize the metrics' values, or 
trigger alerts based on them.</p>
 
 <p>Samza includes default implementations for two such Metrics Reporters:</p>
 
 <ol>
-<li><p>a) A <em>JMXReporter</em> (detailed <a href="#jmxreporter">below</a>) 
which allows using standard JMX clients for probing containers to retrieve 
metrics encoded as JMX MBeans. Visualization tools such as <a 
href="https://grafana.com/dashboards/3457">Grafana</a> could also be used to 
visualize this JMX data.</p></li>
-<li><p>b) A <em>MetricsSnapshot</em> reporter (detailed <a 
href="#snapshotreporter">below</a>) which allows periodically publishing all 
metrics to Kafka. A downstream Samza job could then consume and publish these 
metrics to other metrics management systems such as <a 
href="https://prometheus.io/">Prometheus</a> and <a 
href="https://graphiteapp.org/">Graphite</a>.</p></li>
+  <li>
+    <p>a) A <em>JMXReporter</em> (detailed <a href="#jmxreporter">below</a>) 
which allows using standard JMX clients for probing containers to retrieve 
metrics encoded as JMX MBeans. Visualization tools such as <a 
href="https://grafana.com/dashboards/3457">Grafana</a> could also be used to 
visualize this JMX data.</p>
+  </li>
+  <li>
+    <p>b) A <em>MetricsSnapshot</em> reporter (detailed <a 
href="#snapshotreporter">below</a>) which allows periodically publishing all 
metrics to Kafka. A downstream Samza job could then consume and publish these 
metrics to other metrics management systems such as <a 
href="https://prometheus.io/">Prometheus</a> and <a 
href="https://graphiteapp.org/">Graphite</a>.</p>
+  </li>
 </ol>
 
 <p>Note that Samza allows multiple Metrics Reporters to be used 
simultaneously.</p>
 
-<h3 id="a-1-reporting-metrics-to-jmx-jmx-reporter"><a name="jmxreporter"></a> 
A.1 Reporting Metrics to JMX (JMX Reporter)</h3>
-
-<p>This reporter encodes all its internal and user-defined metrics as JMX 
MBeans and hosts a JMX MBean server. Standard JMX clients (such as JConsole, 
VisualVM) can thus be used to probe Samza&rsquo;s containers and 
YARN-ApplicationMaster to retrieve these metrics&rsquo; values. JMX also 
provides additional profiling capabilities (e.g., for CPU and memory 
utilization), which are also enabled by this reporter.</p>
+<h3 id="-a1-reporting-metrics-to-jmx-jmx-reporter"><a name="jmxreporter"></a> 
A.1 Reporting Metrics to JMX (JMX Reporter)</h3>
 
-<h4 id="enabling-the-jmx-reporter"><a name="enablejmxreporter"></a> Enabling 
the JMX Reporter</h4>
+<p>This reporter encodes all its internal and user-defined metrics as JMX 
MBeans and hosts a JMX MBean server. Standard JMX clients (such as JConsole, 
VisualVM) can thus be used to probe Samza's containers and 
YARN-ApplicationMaster to retrieve these metrics' values. JMX also provides 
additional profiling capabilities (e.g., for CPU and memory utilization), which 
are also enabled by this reporter.</p>
 
+<h4 id="-enabling-the-jmx-reporter"><a name="enablejmxreporter"></a> Enabling 
the JMX Reporter</h4>
 <p>JMXReporter can be enabled by adding the following configuration.</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>#Define a Samza metrics reporter called 
&quot;jmx&quot;, which publishes to JMX
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>#Define a Samza metrics reporter called "jmx", which 
publishes to JMX
 metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
 
 # Use the jmx reporter (if using multiple reporters, separate them with commas)
 metrics.reporters=jmx
-</code></pre></div>
-<h4 id="using-the-jmx-reporter"><a name="usejmxreporter"></a> Using the JMX 
Reporter</h4>
+
+</code></pre></div></div>
+
+<h4 id="-using-the-jmx-reporter"><a name="usejmxreporter"></a> Using the JMX 
Reporter</h4>
 
 <p>To connect to the JMX MBean server, first obtain the JMX Server URL and 
port, published in the container logs:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>2018-08-14 11:30:49.888 [main] JmxServer [INFO] 
Started JmxServer registry port=54661 server port=54662 
url=service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi
-</code></pre></div>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>
+2018-08-14 11:30:49.888 [main] JmxServer [INFO] Started JmxServer registry 
port=54661 server port=54662 
url=service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi
+
+</code></pre></div></div>
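<p>The same URL can also be used from code rather than from a GUI client. The following is a minimal sketch, not part of Samza: the class name <code>JmxUrlFromLog</code> and its helper methods are hypothetical, and the log line format is assumed to match the example above. It pulls the service URL out of the log line and shows how the standard <code>javax.management.remote</code> API could be used to list the MBeans of a running container.</p>

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, for illustration only.
class JmxUrlFromLog {
    // Captures the "url=service:jmx:..." token of the JmxServer log line.
    private static final Pattern URL = Pattern.compile("url=(service:jmx:\\S+)");

    // Returns the JMX service URL found in the log line, or null if there is none.
    static String extractUrl(String logLine) {
        Matcher m = URL.matcher(logLine);
        return m.find() ? m.group(1) : null;
    }

    // Connects to the MBean server and returns the names of all registered MBeans.
    // Requires a container that is actually running at the given URL.
    static Set<ObjectName> listMBeans(String serviceUrl) throws Exception {
        try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(serviceUrl))) {
            MBeanServerConnection mbeans = connector.getMBeanServerConnection();
            return mbeans.queryNames(null, null);
        }
    }

    public static void main(String[] args) throws Exception {
        String line = "2018-08-14 11:30:49.888 [main] JmxServer [INFO] Started JmxServer registry "
            + "port=54661 server port=54662 "
            + "url=service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi";
        String url = extractUrl(line);
        System.out.println(url);
        // Parsing only validates the URL syntax; it does not open a connection.
        JMXServiceURL parsed = new JMXServiceURL(url);
        System.out.println(parsed.getHost() + ":" + parsed.getPort());
    }
}
```

<p>Calling <code>listMBeans</code> with the extracted URL only works while the container is up, since the ports are chosen at startup and change between runs.</p>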
+
 <p>If using the <strong>JConsole</strong> JMX client, launch it with the 
service URL as:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>jconsole 
service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi
-</code></pre></div>
-<p><img src="/img/latest/learn/documentation/operations/jconsole.png" 
alt="JConsole" class="diagram-large"></p>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>jconsole 
service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi
+</code></pre></div></div>
+
+<p><img src="/img/latest/learn/documentation/operations/jconsole.png" 
alt="JConsole" class="diagram-large" /></p>
 
 <p>If using the VisualVM JMX client, run:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>jvisualvm
-</code></pre></div>
-<p>After <strong>VisualVM</strong> starts, click the &ldquo;Add JMX 
Connection&rdquo; button and paste in your JMX server URL (obtained from the 
logs).
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>jvisualvm
+</code></pre></div></div>
+
+<p>After <strong>VisualVM</strong> starts, click the "Add JMX Connection" 
button and paste in your JMX server URL (obtained from the logs).
 Install the VisualVM-MBeans plugin (Tools-&gt;Plugin) to view the metrics 
MBeans.</p>
 
-<p><img src="/img/latest/learn/documentation/operations/visualvm.png" 
alt="VisualVM" class="diagram-small"></p>
+<p><img src="/img/latest/learn/documentation/operations/visualvm.png" 
alt="VisualVM" class="diagram-small" /></p>
 
-<h3 id="a-2-reporting-metrics-to-kafka-metricssnapshot-reporter"><a 
name="snapshotreporter"></a> A.2 Reporting Metrics to Kafka (MetricsSnapshot 
Reporter)</h3>
+<h3 id="-a2-reporting-metrics-to-kafka-metricssnapshot-reporter"><a 
name="snapshotreporter"></a> A.2 Reporting Metrics to Kafka (MetricsSnapshot 
Reporter)</h3>
 
 <p>This reporter publishes metrics to Kafka.</p>
 
-<h4 id="enabling-the-metricssnapshot-reporter"><a 
name="enablesnapshotreporter"></a> Enabling the MetricsSnapshot Reporter</h4>
+<h4 id="-enabling-the-metricssnapshot-reporter"><a 
name="enablesnapshotreporter"></a> Enabling the MetricsSnapshot Reporter</h4>
+<p>To enable this reporter, simply append the following to your job's 
configuration.</p>
 
-<p>To enable this reporter, simply append the following to your job&rsquo;s 
configuration.</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>#Define a metrics reporter called 
&quot;snapshot&quot;
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>#Define a metrics reporter called "snapshot"
 metrics.reporters=snapshot
 
metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
-</code></pre></div>
+</code></pre></div></div>
+
 <p>Specify the Kafka topic to which the reporter should publish:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>metrics.reporter.snapshot.stream=kafka.metrics
-</code></pre></div>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>metrics.reporter.snapshot.stream=kafka.metrics
+</code></pre></div></div>
+
 <p>Specify the serializer to be used for the metrics data</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
 systems.kafka.streams.metrics.samza.msg.serde=metrics
-</code></pre></div>
+</code></pre></div></div>
 <p>With this configuration, all containers (including the 
YARN-ApplicationMaster) will publish their JSON-encoded metrics 
-to a Kafka topic called &ldquo;metrics&rdquo; every 60 seconds.
+to a Kafka topic called "metrics" every 60 seconds.
 The following is an example of such a metrics message:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>{
-  &quot;header&quot;: {
-    &quot;container-name&quot;: &quot;samza-container-0&quot;,
-
-    &quot;exec-env-container-id&quot;: &quot;YARN-generated containerID&quot;,
-    &quot;host&quot;: &quot;samza-grid-1234.example.com&quot;,
-    &quot;job-id&quot;: &quot;1&quot;,
-    &quot;job-name&quot;: &quot;my-samza-job&quot;,
-    &quot;reset-time&quot;: 1401729000347,
-    &quot;samza-version&quot;: &quot;0.0.1&quot;,
-    &quot;source&quot;: &quot;TaskName-Partition1&quot;,
-    &quot;time&quot;: 1401729420566,
-    &quot;version&quot;: &quot;0.0.1&quot;
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>{
+  "header": {
+    "container-name": "samza-container-0",
+
+    "exec-env-container-id": "YARN-generated containerID",
+    "host": "samza-grid-1234.example.com",
+    "job-id": "1",
+    "job-name": "my-samza-job",
+    "reset-time": 1401729000347,
+    "samza-version": "0.0.1",
+    "source": "TaskName-Partition1",
+    "time": 1401729420566,
+    "version": "0.0.1"
   },
-  &quot;metrics&quot;: {
-    &quot;org.apache.samza.container.TaskInstanceMetrics&quot;: {
-      &quot;commit-calls&quot;: 1,
-      &quot;window-calls&quot;: 0,
-      &quot;process-calls&quot;: 14,
-
-      &quot;messages-actually-processed&quot;: 14,
-      &quot;send-calls&quot;: 0,
-
-      &quot;flush-calls&quot;: 1,
-      &quot;pending-messages&quot;: 0,
-      &quot;messages-in-flight&quot;: 0,
-      &quot;async-callback-complete-calls&quot;: 14,
-        &quot;wikipedia-#en.wikipedia-0-offset&quot;: 8979,
+  "metrics": {
+    "org.apache.samza.container.TaskInstanceMetrics": {
+      "commit-calls": 1,
+      "window-calls": 0,
+      "process-calls": 14,
+
+      "messages-actually-processed": 14,
+      "send-calls": 0,
+
+      "flush-calls": 1,
+      "pending-messages": 0,
+      "messages-in-flight": 0,
+      "async-callback-complete-calls": 14,
+      "wikipedia-#en.wikipedia-0-offset": 8979
     }
   }
 }
-</code></pre></div>
+</code></pre></div></div>
+
 <p>Each message contains a header which includes information about the job, 
time, and container from which the metrics were obtained. 
 The remainder of the message contains the metric values, grouped by their 
types, such as TaskInstanceMetrics, SamzaContainerMetrics,  
KeyValueStoreMetrics, JVMMetrics, etc. Detailed descriptions of the various 
metric categories and metrics are available <a 
href="#metricssheet">here</a>.</p>
 
 <p>It is possible to configure the MetricsSnapshot reporter to use a different 
serializer using this configuration</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>serializers.registry.metrics.class=&lt;classpath-to-my-custom-serializer-factory&gt;
-</code></pre></div>
-<p>To configure the reporter to publish with a different frequency (default 60 
seconds), add the following to your job&rsquo;s configuration</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>metrics.reporter.snapshot.interval=&lt;publish 
frequency in seconds&gt;
-</code></pre></div>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>serializers.registry.metrics.class=&lt;classpath-to-my-custom-serializer-factory&gt;
+</code></pre></div></div>
+
+<p>To configure the reporter to publish with a different frequency (default 60 
seconds), add the following to your job's configuration</p>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>metrics.reporter.snapshot.interval=&lt;publish 
frequency in seconds&gt;
+</code></pre></div></div>
+
 <p>Similarly, to limit the set of metrics emitted you can use the regex based 
blacklist supported by this reporter. For example, to limit it to publishing 
only SamzaContainerMetrics use:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>metrics.reporter.snapshot.blacklist=^(?!.\*?(?:SamzaContainerMetrics)).\*$
-</code></pre></div>
-<h3 id="a-3-creating-a-custom-metricsreporter"><a name="customreporter"></a> 
A.3 Creating a Custom MetricsReporter</h3>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>metrics.reporter.snapshot.blacklist=^(?!.*?(?:SamzaContainerMetrics)).*$
+</code></pre></div></div>
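<p>To see why this pattern keeps only SamzaContainerMetrics, it helps to run it against a few names. The sketch below is an illustration under assumptions: the pattern is written as a plain Java regular expression (with <code>.*</code> unescaped), the class <code>SnapshotBlacklistDemo</code> is hypothetical, and it is assumed that the reporter drops a metric when its full name matches the blacklist. The negative lookahead makes the pattern match every name that does <em>not</em> contain <code>SamzaContainerMetrics</code>.</p>

```java
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class SnapshotBlacklistDemo {
    // Matches any name that does not contain "SamzaContainerMetrics".
    static final Pattern BLACKLIST =
        Pattern.compile("^(?!.*?(?:SamzaContainerMetrics)).*$");

    // Assumption: a metric is dropped when its full name matches the blacklist.
    static boolean isBlacklisted(String fullMetricName) {
        return BLACKLIST.matcher(fullMetricName).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.samza.container.SamzaContainerMetrics.process-ns",
            "org.apache.samza.container.TaskInstanceMetrics.commit-calls",
            "org.apache.samza.metrics.JvmMetrics.mem-heap-used-mb"
        };
        for (String name : names) {
            System.out.println(name + " -> " + (isBlacklisted(name) ? "dropped" : "published"));
        }
    }
}
```

<p>Only the first name survives, because it is the only one the lookahead rejects.</p>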
+
+<h3 id="-a3-creating-a-custom-metricsreporter"><a name="customreporter"></a> 
A.3 Creating a Custom MetricsReporter</h3>
 
 <p>Creating a custom MetricsReporter entails implementing the MetricsReporter 
interface. The lifecycle of Metrics Reporters is managed by Samza and is 
aligned with the Samza container lifecycle. Metrics Reporters can poll metric 
values and can receive callbacks when new metrics are added at runtime, e.g., 
user-defined metrics. Metrics Reporters are responsible for maintaining 
executor pools, IO connections, and any in-memory state that they require in 
order to export metrics to the desired external system, and managing the 
lifecycles of such components.</p>
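<p>The lifecycle described above can be sketched as follows. This is a simplified illustration, not Samza's API: <code>SketchReporter</code> is a hypothetical stand-in for the real MetricsReporter interface so that the example compiles without Samza on the classpath, metrics are modelled as plain suppliers, and an in-memory buffer stands in for the external system. The point is the shape of the contract: the reporter owns its executor, remembers each registered source, polls values on a schedule, and releases its resources on stop.</p>

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical stand-in for the reporter contract: start, register, stop.
interface SketchReporter {
    void start();
    void register(String source, Map<String, Supplier<Object>> metrics);
    void stop();
}

// Polls all registered metrics periodically and appends them to an in-memory sink.
class LoggingReporter implements SketchReporter {
    private final Map<String, Map<String, Supplier<Object>>> sources = new ConcurrentHashMap<>();
    private final ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
    private final StringBuilder sink = new StringBuilder(); // stands in for an external system
    private final long intervalSeconds;

    LoggingReporter(long intervalSeconds) {
        this.intervalSeconds = intervalSeconds;
    }

    @Override
    public void start() {
        // The reporter manages its own polling thread.
        executor.scheduleAtFixedRate(this::flush, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
    }

    @Override
    public void register(String source, Map<String, Supplier<Object>> metrics) {
        sources.put(source, metrics);
    }

    @Override
    public void stop() {
        // Release the executor when the container shuts down.
        executor.shutdownNow();
    }

    // Reads the current value of every metric and writes one line per metric.
    synchronized void flush() {
        for (Map.Entry<String, Map<String, Supplier<Object>>> source : sources.entrySet()) {
            for (Map.Entry<String, Supplier<Object>> metric : source.getValue().entrySet()) {
                sink.append(source.getKey()).append('.').append(metric.getKey())
                    .append('=').append(metric.getValue().get()).append('\n');
            }
        }
    }

    synchronized String exported() {
        return sink.toString();
    }

    public static void main(String[] args) {
        Supplier<Object> count = () -> 14;
        LoggingReporter reporter = new LoggingReporter(60);
        reporter.register("TaskName-Partition1", Collections.singletonMap("message-count", count));
        reporter.start();
        reporter.flush(); // force one export instead of waiting for the interval
        System.out.print(reporter.exported());
        reporter.stop();
    }
}
```

<p>A real reporter would implement Samza's MetricsReporter interface and be created by a factory class, whose name is what the configuration below refers to.</p>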
 
-<p>After implementation, a custom reporter can be enabled by appending the 
following to the Samza job&rsquo;s configuration:</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>#Define a metrics reporter with a desired name
+<p>After implementation, a custom reporter can be enabled by appending the 
following to the Samza job's configuration:</p>
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>#Define a metrics reporter with a desired name
 
metrics.reporter.&lt;my-custom-reporter-name&gt;.class=&lt;classpath-of-my-custom-reporter-factory&gt;
 
 
 #Enable its use for metrics reporting
 metrics.reporters=&lt;my-custom-reporter-name&gt;
-</code></pre></div>
-<h2 id="b-metric-types-in-samza"><a name="metrictypes"></a> B. Metric Types in 
Samza</h2>
+</code></pre></div></div>
 
-<p>Metrics in Samza are divided into three types &ndash; <em>Gauges</em>, 
<em>Counters</em>, and <em>Timers</em>.</p>
+<h2 id="-b-metric-types-in-samza"><a name="metrictypes"></a> B. Metric Types 
in Samza</h2>
+
+<p>Metrics in Samza are divided into three types &ndash; <em>Gauges</em>, 
<em>Counters</em>, and <em>Timers</em>.</p>
 
 <p><em>Gauges</em> are useful when measuring the magnitude of a certain system 
property, e.g., the current queue length, or a buffer size.</p>
 
 <p><em>Counters</em> are useful in measuring metrics that are cumulative 
values, e.g., the number of messages processed since container startup. Certain 
counters are also useful when visualized with their rate-of-change, e.g., the 
rate of message processing.</p>
 
-<p><em>Timers</em> are useful for storing and reporting a sliding-window of 
timing values. </p>
+<p><em>Timers</em> are useful for storing and reporting a sliding-window of 
timing values.</p>
 
-<h2 id="c-adding-user-defined-metrics"><a name="userdefinedmetrics"></a> C. 
Adding User-Defined Metrics</h2>
+<h2 id="-c-adding-user-defined-metrics"><a name="userdefinedmetrics"></a> C. 
Adding User-Defined Metrics</h2>
 
 <p>To add a new metric, you can simply use the <em>MetricsRegistry</em> in the 
provided TaskContext of 
 the init() method to register new metrics. The code snippets below show 
examples of registering and updating a user-defined
  Counter metric. Timers and gauges can similarly be used from within your task 
class.</p>
 
-<h3 id="low-level-task-api"><a name="lowlevelapi"></a> Low Level Task API</h3>
+<h3 id="-low-level-task-api"><a name="lowlevelapi"></a> Low Level Task API</h3>
 
 <p>Simply have your task implement the InitableTask interface and access the 
MetricsRegistry from the TaskContext.</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>public class MyJavaStreamTask implements 
StreamTask, InitableTask {
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>public class MyJavaStreamTask implements StreamTask, 
InitableTask {
 
   private Counter messageCount;
   public void init(Config config, TaskContext context) {
-    this.messageCount = 
context.getMetricsRegistry().newCounter(getClass().getName(), 
&quot;message-count&quot;);
+    this.messageCount = 
context.getMetricsRegistry().newCounter(getClass().getName(), "message-count");
 
   }
 
@@ -832,11 +875,13 @@ the init() method to register new metric
   }
 
 }
-</code></pre></div>
-<h3 id="high-level-streams-api"><a name="highlevelapi"></a> High Level Streams 
API</h3>
+</code></pre></div></div>
+
+<h3 id="-high-level-streams-api"><a name="highlevelapi"></a> High Level 
Streams API</h3>
 
 <p>In the High Level Streams API, you can define a ContextManager and access 
the MetricsRegistry from the TaskContext; use it to add and update your 
metrics.</p>
-<div class="highlight"><pre><code class="language-text" 
data-lang="text"><span></span>public class MyJavaStreamApp implements 
StreamApplication {
+
+<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre 
class="highlight"><code>public class MyJavaStreamApp implements 
StreamApplication {
 
   private Counter messageCount = null;
 
@@ -854,7 +899,7 @@ the init() method to register new metric
   @Override
   public void init(Config config, TaskContext context) {
       messageCount = context.getMetricsRegistry().
-      newCounter(getClass().getName(), &quot;message-count&quot;);
+      newCounter(getClass().getName(), "message-count");
   }
 
   private IndexedRecord countMessage(IndexedRecord value) {
@@ -866,1174 +911,1276 @@ the init() method to register new metric
   public void close() { }
 
   }
-</code></pre></div>
-<h2 id="d-key-internal-samza-metrics"><a name="keyinternalsamzametrics"></a> 
D. Key Internal Samza Metrics</h2>
+</code></pre></div></div>
+
+<h2 id="-d-key-internal-samza-metrics"><a name="keyinternalsamzametrics"></a> 
D. Key Internal Samza Metrics</h2>
 
-<p>Samza&rsquo;s internal metrics allow for detailed monitoring of a Samza job 
and all its components. Detailed descriptions 
+<p>Samza's internal metrics allow for detailed monitoring of a Samza job and 
all its components. Detailed descriptions 
 of all internal metrics are listed in a reference sheet <a 
href="#e-metrics-reference-sheet">here</a>. 
 However, a small subset of internal metrics facilitates easy high-level 
monitoring of a job.</p>
 
-<p>These key metrics can be grouped into three categories: <em>Vital 
metrics</em>, <em>Store</em><em>metrics</em>, and <em>Operator metrics</em>. 
+<p>These key metrics can be grouped into three categories: <em>Vital 
metrics</em>, <em>Store metrics</em>, and <em>Operator metrics</em>. 
 We explain each of these categories in detail below.</p>
 
-<h3 id="d-1-vital-metrics"><a name="vitalmetrics"></a> D.1. Vital Metrics</h3>
+<h3 id="-d1-vital-metrics"><a name="vitalmetrics"></a> D.1. Vital Metrics</h3>
 
-<p>These metrics indicate the vital signs of a Samza job&rsquo;s health. Note 
that these metrics are categorized into different groups based on the Samza 
component they are emitted by, (e.g. SamzaContainerMetrics, 
TaskInstanceMetrics, ApplicationMaster metrics, etc).</p>
+<p>These metrics indicate the vital signs of a Samza job's health. Note that 
these metrics are categorized into different groups based on the Samza 
component that emits them (e.g., SamzaContainerMetrics, 
TaskInstanceMetrics, ApplicationMaster metrics).</p>
 
-<table><thead>
-<tr>
-<th><strong>Metric Name</strong></th>
-<th><strong>Group</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>Availability &ndash; Are there any resource failures impacting my 
job?</strong></td>
-<td></td>
-<td></td>
-</tr>
-<tr>
-<td>job-healthy</td>
-<td>ContainerProcessManagerMetrics</td>
-<td>A binary value, where 1 indicates that all the required containers 
configured for a job are running, 0 otherwise.</td>
-</tr>
-<tr>
-<td>failed-containers</td>
-<td>ContainerProcessManagerMetrics</td>
-<td>Number of containers that have failed in the job&rsquo;s lifetime</td>
-</tr>
-<tr>
-<td><strong>Input Processing Lag &ndash; Is my job lagging ?</strong></td>
-<td></td>
-<td></td>
-</tr>
-<tr>
-<td>&lt;Topic&gt;-&lt;Partition&gt;-messages-behind-high-watermark</td>
-<td></td>
-<td></td>
-</tr>
-<tr>
-<td>KafkaSystemConsumerMetrics</td>
-<td>Number of input messages waiting to be processed on an input 
topic-partition</td>
-<td></td>
-</tr>
-<tr>
-<td>consumptionLagMs</td>
-<td>EventHubSystemConsumer</td>
-<td>Time difference between the processing and enqueuing (into EventHub)  of 
input events</td>
-</tr>
-<tr>
-<td>millisBehindLatest</td>
-<td>KinesisSystemConsumerMetrics</td>
-<td>Current processing lag measured from the tip of the stream, expressed in 
milliseconds.</td>
-</tr>
-<tr>
-<td><strong>Output/Produce Errors &ndash; Is my job failing to produce 
output?</strong></td>
-<td></td>
-<td></td>
-</tr>
-<tr>
-<td>producer-send-failed</td>
-<td>KafkaSystemProducerMetrics</td>
-<td>Number of send requests to Kafka (e.g., output topics) that failed due to 
unrecoverable errors</td>
-</tr>
-<tr>
-<td>flush-failed</td>
-<td>HdfsSystemProducerMetrics</td>
-<td>Number of failed flushes to HDFS</td>
-</tr>
-<tr>
-<td><strong>Processing Time &ndash; Is my job spending too much time 
processing inputs?</strong></td>
-<td></td>
-<td></td>
-</tr>
-<tr>
-<td>process-ns</td>
-<td>SamzaContainerMetrics</td>
-<td>Amount of time the job is spending in processing each input</td>
-</tr>
-<tr>
-<td>commit-ns</td>
-<td>SamzaContainerMetrics</td>
-<td>Amount of time the job is spending in checkpointing inputs (and flushing 
producers, checkpointing KV stores, flushing side input stores).</td>
-</tr>
-<tr>
-<td>The frequency of this function is configured using 
<em>task.commit.ms</em></td>
-<td></td>
-<td></td>
-</tr>
-<tr>
-<td>window-ns</td>
-<td>SamzaContainerMetrics</td>
-<td>In case of WindowableTasks being used, amount of time the job is spending 
in its window() operations</td>
-</tr>
-</tbody></table>
+<table>
+  <thead>
+    <tr>
+      <th><strong>Metric Name</strong></th>
+      <th><strong>Group</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>Availability &ndash; Are there any resource failures impacting 
my job?</strong></td>
+      <td>&nbsp;</td>
+      <td>&nbsp;</td>
+    </tr>
+    <tr>
+      <td>job-healthy</td>
+      <td>ContainerProcessManagerMetrics</td>
+      <td>A binary value, where 1 indicates that all the required containers 
configured for a job are running, 0 otherwise.</td>
+    </tr>
+    <tr>
+      <td>failed-containers</td>
+      <td>ContainerProcessManagerMetrics</td>
+      <td>Number of containers that have failed in the job's lifetime</td>
+    </tr>
+    <tr>
+      <td><strong>Input Processing Lag &ndash; Is my job lagging?</strong></td>
+      <td>&nbsp;</td>
+      <td>&nbsp;</td>
+    </tr>
+    <tr>
+      <td>&lt;Topic&gt;-&lt;Partition&gt;-messages-behind-high-watermark</td>
+      <td>KafkaSystemConsumerMetrics</td>
+      <td>Number of input messages waiting to be processed on an input 
topic-partition</td>
+    </tr>
+    <tr>
+      <td>consumptionLagMs</td>
+      <td>EventHubSystemConsumer</td>
+      <td>Time difference between the processing and enqueuing (into EventHub) 
 of input events</td>
+    </tr>
+    <tr>
+      <td>millisBehindLatest</td>
+      <td>KinesisSystemConsumerMetrics</td>
+      <td>Current processing lag measured from the tip of the stream, 
expressed in milliseconds.</td>
+    </tr>
+    <tr>
+      <td><strong>Output/Produce Errors &ndash; Is my job failing to produce 
output?</strong></td>
+      <td>&nbsp;</td>
+      <td>&nbsp;</td>
+    </tr>
+    <tr>
+      <td>producer-send-failed</td>
+      <td>KafkaSystemProducerMetrics</td>
+      <td>Number of send requests to Kafka (e.g., output topics) that failed 
due to unrecoverable errors</td>
+    </tr>
+    <tr>
+      <td>flush-failed</td>
+      <td>HdfsSystemProducerMetrics</td>
+      <td>Number of failed flushes to HDFS</td>
+    </tr>
+    <tr>
+      <td><strong>Processing Time &ndash; Is my job spending too much time 
processing inputs?</strong></td>
+      <td>&nbsp;</td>
+      <td>&nbsp;</td>
+    </tr>
+    <tr>
+      <td>process-ns</td>
+      <td>SamzaContainerMetrics</td>
+      <td>Amount of time the job is spending in processing each input</td>
+    </tr>
+    <tr>
+      <td>commit-ns</td>
+      <td>SamzaContainerMetrics</td>
+      <td>Amount of time the job is spending in checkpointing inputs (and 
flushing producers, checkpointing KV stores, flushing side input stores). 
The frequency of this function is configured using 
<em>task.commit.ms</em></td>
+    </tr>
+    <tr>
+      <td>window-ns</td>
+      <td>SamzaContainerMetrics</td>
+      <td>In case of WindowableTasks being used, amount of time the job is 
spending in its window() operations</td>
+    </tr>
+  </tbody>
+</table>
 
-<h3 id="d-2-store-metrics"><a name="storemetrics"></a>  D.2. Store Metrics</h3>
+<h3 id="--d2-store-metrics"><a name="storemetrics"></a>  D.2. Store 
Metrics</h3>
 
 <p>Stateful Samza jobs typically use RocksDB backed KV stores for storing 
state. Therefore, timing metrics associated with 
 KV stores can be useful for monitoring input lag. These are some key metrics 
for KV stores. 
 The metrics reference sheet <a href="#e-metrics-reference-sheet">here</a> 
details all metrics for KV stores.</p>
 
-<table><thead>
-<tr>
-<th><strong>Metric name</strong></th>
-<th><strong>Group</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td>get-ns, put-ns, delete-ns, all-ns</td>
-<td>KeyValueStorageEngineMetrics</td>
-<td>Time spent performing respective KV store operations</td>
-</tr>
-</tbody></table>
+<table>
+  <thead>
+    <tr>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Group</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>get-ns, put-ns, delete-ns, all-ns</td>
+      <td>KeyValueStorageEngineMetrics</td>
+      <td>Time spent performing respective KV store operations</td>
+    </tr>
+  </tbody>
+</table>
 
-<h3 id="d-3-operator-metrics"><a name="operatormetrics"></a>  D.3. Operator 
Metrics</h3>
+<h3 id="--d3-operator-metrics"><a name="operatormetrics"></a>  D.3. Operator 
Metrics</h3>
 
-<p>If your Samza job uses Samza&rsquo;s Fluent API or Samza-SQL, Samza creates 
a DAG (directed acyclic graph) of 
+<p>If your Samza job uses Samza's Fluent API or Samza-SQL, Samza creates a DAG 
(directed acyclic graph) of 
 <em>operators</em> to form the required data processing pipeline. In such 
cases, operator metrics allow fine-grained 
 monitoring of such operators. Key operator metrics are listed below, while a 
detailed list is present 
 in the metrics reference sheet.</p>
 
-<table><thead>
-<tr>
-<th><strong>Metric name</strong></th>
-<th><strong>Group</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><Operator-ID\>-handle-message-ns</td>
-<td>WindowOperatorImpl, PartialJoinOperatorImpl, StreamOperatorImpl, 
StreamTableJoinOperatorImpl, etc</td>
-<td>Time spent handling a given input message by the operator</td>
-</tr>
-</tbody></table>
-
-<h2 id="e-metrics-reference-sheet"><a name="metricssheet"></a>  E. Metrics 
Reference Sheet</h2>
+<table>
+  <thead>
+    <tr>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Group</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>&lt;Operator-ID&gt;-handle-message-ns</td>
+      <td>WindowOperatorImpl, PartialJoinOperatorImpl, StreamOperatorImpl, 
StreamTableJoinOperatorImpl, etc</td>
+      <td>Time spent handling a given input message by the operator</td>
+    </tr>
+  </tbody>
+</table>
 
-<p>Suffixes &ldquo;-ms&rdquo; and &ldquo;-ns&rdquo; to metric names indicated 
milliseconds and nanoseconds respectively. All &ldquo;average time&rdquo; 
metrics are calculated over a sliding time window of 300 seconds.</p>
+<h2 id="--e-metrics-reference-sheet"><a name="metricssheet"></a>  E. Metrics 
Reference Sheet</h2>
+<p>The suffixes "-ms" and "-ns" on metric names indicate milliseconds and 
nanoseconds, respectively. All "average time" metrics are calculated over a 
sliding time window of 300 seconds.</p>
 
 <p>All &lt;system&gt;, &lt;stream&gt;, &lt;partition&gt;, &lt;store-name&gt;, 
&lt;topic&gt;, are populated with the corresponding actual values at 
runtime.</p>
 
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>ContainerProcessManagerMetrics</strong></td>
-<td>running-containers</td>
-<td>Total number of running containers.</td>
-</tr>
-<tr>
-<td></td>
-<td>needed-containers</td>
-<td>Number of containers needed for the job to be declared healthy.</td>
-</tr>
-<tr>
-<td></td>
-<td>completed-containers</td>
-<td>Number of containers that have completed their execution and exited.</td>
-</tr>
-<tr>
-<td></td>
-<td>failed-containers</td>
-<td>Number of containers that have failed in the job&rsquo;s lifetime.</td>
-</tr>
-<tr>
-<td></td>
-<td>released-containers</td>
-<td>Number of containers released due to overallocation by the 
YARN-ResourceManager.</td>
-</tr>
-<tr>
-<td></td>
-<td>container-count</td>
-<td>Number of containers configured for the job.</td>
-</tr>
-<tr>
-<td></td>
-<td>redundant-notifications</td>
-<td>Number of redundant onResourceCompleted callbacks received from the RM 
after container shutdown.</td>
-</tr>
-<tr>
-<td></td>
-<td>job-healthy</td>
-<td>A binary value, where 1 indicates that all the required containers 
configured for a job are running, 0 otherwise.</td>
-</tr>
-<tr>
-<td></td>
-<td>preferred-host-requests</td>
-<td>Number of container resource-requests for a preferred host received by the 
cluster manager.</td>
-</tr>
-<tr>
-<td></td>
-<td>any-host-requests</td>
-<td>Number of container resource-requests for <em>any</em> host received by 
the cluster manager</td>
-</tr>
-<tr>
-<td></td>
-<td>expired-preferred-host-requests</td>
-<td>Number of expired resource requests for a preferred host received by the 
cluster manager.</td>
-</tr>
-<tr>
-<td></td>
-<td>expired-any-host-requests</td>
-<td>Number of expired resource requests for any host received by the cluster 
manager.</td>
-</tr>
-<tr>
-<td></td>
-<td>host-affinity-match-pct</td>
-<td>Percentage of non-expired preferred host requests. This measures the % of 
resource-requests for which host-affinity provided the preferred host.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>SamzaContainerMetrics (Timer metrics)</strong></td>
-<td>choose-ns</td>
-<td>Average time spent by a task instance for choosing the input to process; 
this includes time spent waiting for input, selecting one in case of multiple 
inputs, and deserializing input.</td>
-</tr>
-<tr>
-<td></td>
-<td>window-ns</td>
-<td>In case of WindowableTasks being used, average time a task instance is 
spending in its window() operations.</td>
-</tr>
-<tr>
-<td></td>
-<td>timer-ns</td>
-<td>Average time spent in the timer-callback when a timer registered with 
TaskContext fires.</td>
-</tr>
-<tr>
-<td></td>
-<td>process-ns</td>
-<td>Average time the job is spending in processing each input.</td>
-</tr>
-<tr>
-<td></td>
-<td>commit-ns</td>
-<td>Average time the job is spending in checkpointing inputs (and flushing 
producers, checkpointing KV stores, flushing side input stores). The frequency 
of this function is configured using <em>task.commit.ms.</em></td>
-</tr>
-<tr>
-<td></td>
-<td>block-ns</td>
-<td>Average time the run loop is blocked because all task instances are busy 
processing input; could indicate lag accumulating.</td>
-</tr>
-<tr>
-<td></td>
-<td>container-startup-time</td>
-<td>Time spent in starting the container. This includes time to start the JMX 
server, starting metrics reporters, starting system producers, consumers, 
system admins, offset manager, locality manager, disk space manager, security 
manager, statistics manager, and initializing all task instances.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>SamzaContainerMetrics (Counters and Gauges)</strong></td>
-<td>commit-calls</td>
-<td>Number of commits. Each commit includes input checkpointing, flushing 
producers, checkpointing KV stores, flushing side input stores, etc.</td>
-</tr>
-<tr>
-<td></td>
-<td>window-calls</td>
-<td>In case of WindowableTask, this measures the number of window 
invocations.</td>
-</tr>
-<tr>
-<td></td>
-<td>timer-calls</td>
-<td>Number of timer callbacks.</td>
-</tr>
-<tr>
-<td></td>
-<td>process-calls</td>
-<td>Number of process method invocations.</td>
-</tr>
-<tr>
-<td></td>
-<td>process-envelopers</td>
-<td>Number of input message envelopes processed.</td>
-</tr>
-<tr>
-<td></td>
-<td>process-null-envelopes</td>
-<td>Number of times no input message envelopes was available for the run loop 
to process.</td>
-</tr>
-<tr>
-<td></td>
-<td>event-loop-utilization</td>
-<td>The duty-cycle of the event loop. That is, the fraction of time of each 
event loop iteration that is spent in process(), window(), and commit.</td>
-</tr>
-<tr>
-<td></td>
-<td>disk-usage-bytes</td>
-<td>Total disk space size used by key-value stores (in bytes).</td>
-</tr>
-<tr>
-<td></td>
-<td>disk-quota-bytes</td>
-<td>Disk memory usage quota for key-value stores (in bytes).</td>
-</tr>
-<tr>
-<td></td>
-<td>executor-work-factor</td>
-<td>The work factor of the run loop. A work factor of 1 indicates full 
throughput, while a work factor of less than 1 will introduce delays into the 
execution to approximate the requested work factor. The work factor is set by 
the disk space monitor in accordance with the disk quota policy. Given the 
latest percentage of available disk quota, this policy returns the work factor 
that should be applied.</td>
-</tr>
-<tr>
-<td></td>
-<td>physical-memory-mb</td>
-<td>The physical memory used by the Samza container process (native + on heap) 
(in MBs).</td>
-</tr>
-<tr>
-<td></td>
-<td>physical-memory-utilization</td>
-<td>The ratio between the physical memory used by the Samza container process 
(native + on heap) and the total physical memory of the Samza container.</td>
-</tr>
-<tr>
-<td></td>
-<td><TaskName\>-<StoreName\>-restore-time</td>
-<td>Time taken to restore task stores (per task store).</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>Job-Coordinator Metrics (Gauge)</strong></td>
-<td>&lt;system&gt;-&lt;stream&gt;-partitionCount</td>
-<td>The current number of partitions detected by the Stream Partition Count 
Monitor. This can be enabled by configuring 
<em>job.coordinator.monitor-partition-change</em> to true.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>TaskInstance Metrics (Counters and Gauges)</strong></td>
-<td>&lt;system&gt;-&lt;stream&gt;-&lt;partition&gt;-offset</td>
-<td>The offset of the last processed message on the given 
system-stream-partition input.</td>
-</tr>
-<tr>
-<td></td>
-<td>commit-calls</td>
-<td>Number of commit calls for the task. Each commit call involves 
checkpointing inputs (and flushing producers, checkpointing KV stores, flushing 
side input stores).</td>
-</tr>
-<tr>
-<td></td>
-<td>window-calls</td>
-<td>In case of WindowableTask, the number of window() invocations on the 
task.</td>
-</tr>
-<tr>
-<td></td>
-<td>process-calls</td>
-<td>Number of process method calls.</td>
-</tr>
-<tr>
-<td></td>
-<td>send-calls</td>
-<td>Number of send method calls (representing number of messages that were 
sent to the underlying SystemProducers)</td>
-</tr>
-<tr>
-<td></td>
-<td>flush-calls</td>
-<td>Number of times the underlying system producers were flushed.</td>
-</tr>
-<tr>
-<td></td>
-<td>messages-actually-processed</td>
-<td>Number of messages processed by the task.</td>
-</tr>
-<tr>
-<td></td>
-<td>pending-messages</td>
-<td>Number of pending messages in the pending envelope queue</td>
-</tr>
-<tr>
-<td></td>
-<td>messages-in-flight</td>
-<td>Number of input messages currently being processed. This is impacted by 
the task.max.concurrency configuration.</td>
-</tr>
-<tr>
-<td></td>
-<td>async-callback-complete-calls</td>
-<td>Number of processAsync invocations that have completed (applicable to 
AsyncStreamTasks).</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td>OffsetManagerMetrics (Gauge)</td>
-<td>&lt;system&gt;-&lt;stream&gt;-&lt;partition&gt;-checkpointed-offset</td>
-<td>Latest checkpointed offsets for each input system-stream-partition.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>JvmMetrics (Timers)</strong></td>
-<td>gc-time-millis</td>
-<td>Total time spent in GC.</td>
-</tr>
-<tr>
-<td></td>
-<td><gc-name\>-time-millis</td>
-<td>Total time spent in garbage collection (for each garbage collector) (in 
milliseconds)</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>JvmMetrics (Counters and Gauges)</strong></td>
-<td>gc-count</td>
-<td>Number of GC invocations.</td>
-</tr>
-<tr>
-<td></td>
-<td>mem-heap-committed-mb</td>
-<td>Size of committed heap memory (in MBs). Because the guest allocates memory 
lazily to the JVM heap and because the difference between Free and Used memory 
is opaque to the guest, the guest commits memory to the JVM heap as it is 
required. The Committed memory, therefore, is a measure of how much memory the 
JVM heap is really consuming in the guest. <a 
href="https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html">https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html</a></td>
-</tr>
-<tr>
-<td></td>
-<td>mem-heap-used-mb</td>
-<td>Used memory from the perspective of the JVM is (Working set + Garbage) and 
Free memory is (Current heap size - Used memory).</td>
-</tr>
-<tr>
-<td></td>
-<td>mem-heap-max-mb</td>
-<td>Size of maximum heap memory (in MBs). This is defined by the -Xmx 
option.</td>
-</tr>
-<tr>
-<td></td>
-<td>mem-nonheap-committed-mb</td>
-<td>Size of non-heap memory committed in MBs.</td>
-</tr>
-<tr>
-<td></td>
-<td>mem-nonheap-used-mb</td>
-<td>Size of non-heap memory used in MBs.</td>
-</tr>
-<tr>
-<td></td>
-<td>mem-nonheap-max-mb</td>
-<td>Size of non-heap memory in MBs. This can be changed using the 
-XX:MaxPermSize VM option.</td>
-</tr>
-<tr>
-<td></td>
-<td>threads-new</td>
-<td>Number of threads not started at that instant.</td>
-</tr>
-<tr>
-<td></td>
-<td>threads-runnable</td>
-<td>Number of running threads at that instant.</td>
-</tr>
-<tr>
-<td></td>
-<td>threads-timed-waiting</td>
-<td>Current number of threads in the TIMED_WAITING state at that instant. The 
TIMED_WAITING state is defined as: &ldquo;A thread that is waiting for another 
thread to perform an action for up to a specified waiting time is in this 
state.&rdquo;</td>
-</tr>
-<tr>
-<td></td>
-<td>threads-waiting</td>
-<td>Current number of waiting threads.</td>
-</tr>
-<tr>
-<td></td>
-<td>threads-blocked</td>
-<td>Current number of blocked threads.</td>
-</tr>
-<tr>
-<td></td>
-<td>threads-terminated</td>
-<td>Current number of terminated threads.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;gc-name&gt;-gc-count</td>
-<td>Number of garbage collection calls (for each garbage collector).</td>
-</tr>
-<tr>
-<td><strong>(Emitted only if the OS supports it)</strong></td>
-<td>process-cpu-usage</td>
-<td>Returns the &ldquo;recent cpu usage&rdquo; for the Java Virtual Machine 
process.</td>
-</tr>
-<tr>
-<td><strong>(Emitted only if the OS supports it)</strong></td>
-<td>system-cpu-usage</td>
-<td>Returns the &ldquo;recent cpu usage&rdquo; for the whole system.</td>
-</tr>
-<tr>
-<td><strong>(Emitted only if the OS supports it)</strong></td>
-<td>open-file-descriptor-count</td>
-<td>Count of open file descriptors.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>SystemConsumersMetrics (Counters and Gauges)</strong> <br/> These 
metrics are emitted when multiplexing and coordinating between per-system 
consumers and message choosers for polling</td>
-<td>chose-null</td>
-<td>Number of times the message chooser returned a null message envelope. This 
is typically indicative of low input traffic on one or more input 
partitions.</td>
-</tr>
-<tr>
-<td></td>
-<td>chose-object</td>
-<td>Number of times the message chooser returned a non-null message 
envelope.</td>
-</tr>
-<tr>
-<td></td>
-<td>deserialization-error</td>
-<td>Number of times an incoming message was not deserialized successfully.</td>
-</tr>
-<tr>
-<td></td>
-<td>ssps-needed-by-chooser</td>
-<td>Number of systems for which no buffered message exists, and hence these 
systems need to be polled (to obtain a message).</td>
-</tr>
-<tr>
-<td></td>
-<td>poll-timeout</td>
-<td>The timeout for polling at that instant.</td>
-</tr>
-<tr>
-<td></td>
-<td>unprocessed-messages</td>
-<td>Number of unprocessed messages buffered in SystemConsumers.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-polls</td>
-<td>Number of times the given system was polled</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-ssp-fetches-per-poll</td>
-<td>Number of partitions of the given system polled at that instant.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-messages-per-poll</td>
-<td>Number of times the SystemConsumer for the underlying system was polled to 
get new messages.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-&lt;stream&gt;-&lt;partition&gt;-messages-chosen</td>
-<td>Number of messages that were chosen by the MessageChooser for particular 
system stream partition.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>SystemConsumersMetrics (Timers)</strong></td>
-<td>poll-ns</td>
-<td>Average time spent polling all underlying systems for new messages (in 
nanoseconds).</td>
-</tr>
-<tr>
-<td></td>
-<td>deserialization-ns</td>
-<td>Average time spent deserializing incoming messages (in nanoseconds).</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>KafkaSystemConsumersMetrics (Timers)</strong></td>
-<td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-offset-change</td>
-<td>The next offset to be read for this topic and partition.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-bytes-read</td>
-<td>Total size of all messages read for a topic partition (payload + key 
size).</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-messages-read</td>
-<td>Number of messages read for a topic partition.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-high-watermark</td>
-<td>Offset of the last committed message in Kafka&rsquo;s topic partition.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-messages-behind-high-watermark</td>
-<td>Number of input messages waiting to be processed on an input 
topic-partition. That is, the difference between high watermark and next 
offset.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-<host\>-<port\>-reconnects</td>
-<td>Number of reconnects to a broker on a particular host and port.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-<host\>-<port\>-bytes-read</td>
-<td>Total size of all messages read from a broker on a particular host and 
port.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-<host\>-<port\>-messages-read</td>
-<td>Number of times the consumer used a broker on a particular host and port 
to get new messages.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-<host\>-<port\>-skipped-fetch-requests</td>
-<td>Number of times the fetchMessage method is called but no topic/partitions 
needed new messages.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-<host\>-<port\>-topic-partitions</td>
-<td>Number of broker&rsquo;s topic partitions which are being consumed.</td>
-</tr>
-<tr>
-<td></td>
-<td>poll-count</td>
-<td>Number of polls the KafkaSystemConsumer performed to get new messages.</td>
-</tr>
-<tr>
-<td></td>
-<td>no-more-messages-SystemStreamPartition [&lt;system&gt;, &lt;stream&gt;, 
&lt;partition&gt;]</td>
-<td>Indicates if the Kafka consumer is at the head for particular partition. 1 
if it is caught up, 0 otherwise.</td>
-</tr>
-<tr>
-<td></td>
-<td>blocking-poll-count-SystemStreamPartition [&lt;system&gt;, &lt;stream&gt;, 
&lt;partition&gt;]</td>
-<td>Number of times a blocking poll is executed (polling until we get at least 
one message, or until we catch up to the head of the stream) (per 
partition).</td>
-</tr>
-<tr>
-<td></td>
-<td>blocking-poll-timeout-count-SystemStreamPartition [&lt;system&gt;, 
&lt;stream&gt;, &lt;partition&gt;]</td>
-<td>Number of times a blocking poll has timed out (polling until we get at 
least one message within a timeout period) (per partition).</td>
-</tr>
-<tr>
-<td></td>
-<td>buffered-message-count-SystemStreamPartition [&lt;system&gt;, 
&lt;stream&gt;, &lt;partition&gt;]</td>
-<td>Current number of messages in queue (per partition).</td>
-</tr>
-<tr>
-<td></td>
-<td>buffered-message-size-SystemStreamPartition [&lt;system&gt;, 
&lt;stream&gt;, &lt;partition&gt;]</td>
-<td>Current size of messages in queue (if 
systems.system.samza.fetch.threshold.bytes is defined) (per partition).</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-offset-change</td>
-<td>The next offset to be read for a topic partition.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-bytes-read</td>
-<td>Total size of all messages read for a topic partition (payload + key 
size).</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>SystemProducersMetrics (Counters and Gauges)</strong> <br/>These 
metrics are aggregated across Producers.</td>
-<td>sends</td>
-<td>Number of send method calls. Representing total number of sent 
messages.</td>
-</tr>
-<tr>
-<td></td>
-<td>flushes</td>
-<td>Number of flush method calls for all registered producers.</td>
-</tr>
-<tr>
-<td></td>
-<td><source\>-sends</td>
-<td>Number of sent messages for a particular source (task instance).</td>
-</tr>
-<tr>
-<td></td>
-<td><source\>-flushes</td>
-<td>Number of flushes for particular source (task instance).</td>
-</tr>
-<tr>
-<td></td>
-<td>serialization error</td>
-<td>Number of errors that occurred while serializing envelopes before sending.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>KafkaSystemProducersMetrics (Counters)</strong></td>
-<td>&lt;system&gt;-producer-sends</td>
-<td>Number of send invocations to the KafkaSystemProducer.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-producer-send-success</td>
-<td>Number of send requests that were successfully completed by the 
KafkaSystemProducer.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-producer-send-failed</td>
-<td>Number of send requests to Kafka (e.g., output topics) that failed due to 
unrecoverable errors</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-flushes</td>
-<td>Number of calls made to flush in the KafkaSystemProducer.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-flush-failed</td>
-<td>Number of times flush operation failed.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>KafkaSystemProducersMetrics (Timers)</strong></td>
-<td>&lt;system&gt;-flush-ns</td>
-<td>Represents average time the flush call takes to complete (in 
nanoseconds).</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>KeyValueStorageEngineMetrics (Counters)</strong> <br/> These 
metrics provide insight into the type and number of KV Store operations taking 
place</td>
-<td><store-name\>-puts</td>
-<td>Total number of put operations on the given KV store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-put-alls</td>
-<td>Total number of putAll operations on the given KV store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-gets</td>
-<td>Total number of get operations on the given KV store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-get-alls</td>
-<td>Total number of getAll operations on the given KV store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-alls</td>
-<td>Total number of accesses to the iterator on the given KV store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-ranges</td>
-<td>Total number of accesses to a sorted-range iterator on the given KV 
store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-deletes</td>
-<td>Total number of delete operations on the given KV store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-delete-alls</td>
-<td>Total number of deleteAll operations on the given KV store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-flushes</td>
-<td>Total number of flush operations on the given KV store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-restored-messages</td>
-<td>Number of entries in the KV store restored from the changelog for that 
store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-restored-bytes</td>
-<td>Size in bytes of entries in the KV store restored from the changelog for 
that store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-snapshots</td>
-<td>Total number of snapshot operations on the given KV store.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>KeyValueStorageEngineMetrics (Timers)</strong> <br/> These metrics 
provide insight into the latencies of KV Store operations</td>
-<td><store-name\>-get-ns</td>
-<td>Average duration of the get operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-get-all-ns</td>
-<td>Average duration of the getAll operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-put-ns</td>
-<td>Average duration of the put operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-put-all-ns</td>
-<td>Average duration of the putAll operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-delete-ns</td>
-<td>Average duration of the delete operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-delete-all-ns</td>
-<td>Average duration of the deleteAll operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-flush-ns</td>
-<td>Average duration of the flush operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-all-ns</td>
-<td>Average duration of obtaining an iterator (using the all operation) on the 
given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-range-ns</td>
-<td>Average duration of obtaining a sorted-range iterator (using the range 
operation) on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td><store-name\>-snapshot-ns</td>
-<td>Average duration of the snapshot operation on the given KV Store.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>KeyValueStoreMetrics (Counters)</strong> <br/> These metrics are 
measured at the App-facing layer for different KV Stores, e.g., RocksDBStore, 
InMemoryKVStore.</td>
-<td><store-name\>-gets, <store-name\>-getAlls, <store-name\>-puts, 
<store-name\>-putAlls, <store-name\>-deletes, <store-name\>-deleteAlls, 
<store-name\>-alls, <store-name\>-ranges, <store-name\>-flushes</td>
-<td>Total number of the specified operation on the given KV Store. (These 
metrics are equivalent to the respective ones under 
KeyValueStorageEngineMetrics.)</td>
-</tr>
-<tr>
-<td></td>
-<td>bytes-read</td>
-<td>Total number of bytes read (when serving reads &ndash; gets, getAlls, and 
iterations).</td>
-</tr>
-<tr>
-<td></td>
-<td>bytes-written</td>
-<td>Total number of bytes written (when serving writes &ndash; puts, 
putAlls).</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>SerializedKeyValueStoreMetrics (Counters)</strong> <br/> These 
metrics are measured at the serialization layer.</td>
-<td><store-name\>-gets, <store-name\>-getAlls, <store-name\>-puts, 
<store-name\>-putAlls, <store-name\>-deletes, <store-name\>-deleteAlls, 
<store-name\>-alls, <store-name\>-ranges, <store-name\>-flushes</td>
-<td>Total number of the specified operation on the given KV Store. (These 
metrics are equivalent to the respective ones under 
KeyValueStorageEngineMetrics.)</td>
-</tr>
-<tr>
-<td></td>
-<td>bytes-deserialized</td>
-<td>Total number of bytes deserialized (when serving reads &ndash; gets, 
getAlls, and iterations).</td>
-</tr>
-<tr>
-<td></td>
-<td>bytes-serialized</td>
-<td>Total number of bytes serialized (when serving reads and writes &ndash; 
gets, getAlls, puts, putAlls). In addition to writes, serialization is also 
done during reads to serialize key to bytes for lookup in the underlying 
store.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>LoggedStoreMetrics (Counters)</strong> <br/> These metrics are 
measured at the changeLog-backup layer for KV stores.</td>
-<td><store-name\>-gets, <store-name\>-puts, <store-name\>-alls, 
<store-name\>-deletes, <store-name\>-flushes, <store-name\>-ranges,</td>
-<td>Total number of the specified operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td></td>
-<td></td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>CachedStoreMetrics (Counters and Gauges)</strong> <br/> These 
metrics are measured at the caching layer for RocksDB-backed KV stores.</td>
-<td><store-name\>-gets, <store-name\>-puts, <store-name\>-alls, 
<store-name\>-deletes, <store-name\>-flushes, <store-name\>-ranges,</td>
-<td>Total number of the specified operation on the given KV Store.</td>
-</tr>
-<tr>
-<td></td>
-<td>cache-hits</td>
-<td>Total number of get and getAll operations that hit cached entries.</td>
-</tr>
-<tr>
-<td></td>
-<td>put-all-dirty-entries-batch-size</td>
-<td>Total number of dirty KV-entries written-back to the underlying store.</td>
-</tr>
-<tr>
-<td></td>
-<td>dirty-count</td>
-<td>Number of entries in the cache marked dirty at that instant.</td>
-</tr>
-<tr>
-<td></td>
-<td>cache-count</td>
-<td>Number of entries in the cache at that instant.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>RoundRobinChooserMetrics (Counters)</strong></td>
-<td>buffered-messages</td>
-<td>Size of the queue with potential messages to process.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>BatchingChooserMetrics (Counters and gauges)</strong></td>
-<td>batch-resets</td>
-<td>Number of times the batch was reset because it exceeded the max batch size 
limit.</td>
-</tr>
-<tr>
-<td></td>
-<td>batched-envelopes</td>
-<td>Number of envelopes in the batch at the current instant.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>BootstrappingChooserMetrics (Gauges)</strong></td>
-<td>lagging-batch-streams</td>
-<td>Number of bootstrapping streams that are lagging.</td>
-</tr>
-<tr>
-<td></td>
-<td>&lt;system&gt;-&lt;stream&gt;-lagging-partitions</td>
-<td>Number of lagging partitions in the stream (for each stream marked as 
bootstrapping stream).</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>HdfsSystemProducerMetrics (Counters)</strong></td>
-<td>system-producer-sends</td>
-<td>Total number of attempts to write to HDFS.</td>
-</tr>
-<tr>
-<td></td>
-<td>system-send-success</td>
-<td>Total number of successful writes to HDFS.</td>
-</tr>
-<tr>
-<td></td>
-<td>system-send-failed</td>
-<td>Total number of failures while sending envelopes to HDFS.</td>
-</tr>
-<tr>
-<td></td>
-<td>system-flushes</td>
-<td>Total number of attempts to flush data to HDFS.</td>
-</tr>
-<tr>
-<td></td>
-<td>system-flush-success</td>
-<td>Total number of successful flushes of all written data to HDFS.</td>
-</tr>
-<tr>
-<td></td>
-<td>system-flush-failed</td>
-<td>Total number of failures while flushing data to HDFS.</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>HdfsSystemProducerMetrics (Timers)</strong></td>
-<td>system-send-ms</td>
-<td>Average time spent for writing messages to HDFS (in milliseconds).</td>
-</tr>
-<tr>
-<td></td>
-<td>system-flush-ms</td>
-<td>Average time spent for flushing messages to HDFS (in milliseconds).</td>
-</tr>
-</tbody></table>
-
-<table><thead>
-<tr>
-<th><strong>Group</strong></th>
-<th><strong>Metric name</strong></th>
-<th><strong>Meaning</strong></th>
-</tr>
-</thead><tbody>
-<tr>
-<td><strong>ElasticsearchSystemProducerMetrics (Counters)</strong></td>
-<td>system-bulk-send-success</td>
-<td>Total number of successful bulk requests.</td>
-</tr>
-<tr>
-<td></td>
-<td>system-docs-inserted</td>
-<td>Total number of documents created.</td>
-</tr>
-<tr>
-<td></td>
-<td>system-docs-updated</td>
-<td>Total number of documents updated.</td>
-</tr>
-<tr>
-<td></td>
-<td>system-version-conflicts</td>
-<td>Number of requests that failed due to conflicts with the current state of the document.</td>
-</tr>
-</tbody></table>
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>ContainerProcessManagerMetrics</strong></td>
+      <td>running-containers</td>
+      <td>Total number of running containers.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>needed-containers</td>
+      <td>Number of containers needed for the job to be declared healthy.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>completed-containers</td>
+      <td>Number of containers that have completed their execution and 
exited.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>failed-containers</td>
+      <td>Number of containers that have failed in the job's lifetime.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>released-containers</td>
+      <td>Number of containers released due to overallocation by the 
YARN-ResourceManager.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>container-count</td>
+      <td>Number of containers configured for the job.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>redundant-notifications</td>
+      <td>Number of redundant onResourceCompleted callbacks received from the RM after container shutdown.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>job-healthy</td>
+      <td>A binary value, where 1 indicates that all the required containers 
configured for a job are running, 0 otherwise.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>preferred-host-requests</td>
+      <td>Number of container resource-requests for a preferred host received 
by the cluster manager.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>any-host-requests</td>
+      <td>Number of container resource-requests for <em>any</em> host received by the cluster manager.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>expired-preferred-host-requests</td>
+      <td>Number of expired resource-requests for a preferred host received by the cluster manager.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>expired-any-host-requests</td>
+      <td>Number of expired resource-requests for <em>any</em> host received by the cluster manager.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>host-affinity-match-pct</td>
+      <td>Percentage of non-expired preferred host requests. This measures the 
% of resource-requests for which host-affinity provided the preferred host.</td>
+    </tr>
+  </tbody>
+</table>
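As a rough illustration of how two of the gauges above relate to the counters, the following Java sketch derives a `job-healthy`-style flag and a `host-affinity-match-pct`-style percentage. The class `ClusterHealthSketch` and its method names are hypothetical and not part of Samza. The formulas are only one reading of the table's descriptions; in particular, treating "non-expired" as preferred-host requests minus expired ones is an assumption.

```java
// Hypothetical helper for illustration only; not part of the Samza API.
final class ClusterHealthSketch {
    private ClusterHealthSketch() {}

    /** Returns 1 when at least the configured number of containers is running, else 0. */
    static int jobHealthy(int runningContainers, int containerCount) {
        return runningContainers >= containerCount ? 1 : 0;
    }

    /**
     * Assumed formula: share of preferred-host requests that did not expire,
     * expressed as a percentage. Returns 0 when no preferred-host request was made.
     */
    static double hostAffinityMatchPct(long preferredHostRequests, long expiredPreferredHostRequests) {
        if (preferredHostRequests <= 0) {
            return 0.0;
        }
        long nonExpired = preferredHostRequests - expiredPreferredHostRequests;
        return 100.0 * nonExpired / preferredHostRequests;
    }

    public static void main(String[] args) {
        System.out.println(jobHealthy(4, 4));
        System.out.println(hostAffinityMatchPct(20, 5));
    }
}
```

An alert built on such values would typically fire when the health flag stays at 0 for longer than the expected container restart time.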
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>SamzaContainerMetrics (Timer metrics)</strong></td>
+      <td>choose-ns</td>
+      <td>Average time spent by a task instance for choosing the input to 
process; this includes time spent waiting for input, selecting one in case of 
multiple inputs, and deserializing input.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>window-ns</td>
+      <td>When WindowableTasks are used, the average time a task instance spends in its window() operations.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>timer-ns</td>
+      <td>Average time spent in the timer-callback when a timer registered 
with TaskContext fires.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>process-ns</td>
+      <td>Average time the job spends processing each input.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>commit-ns</td>
+      <td>Average time the job spends checkpointing inputs (and flushing producers, checkpointing KV stores, flushing side input stores). The frequency of commits is configured using <em>task.commit.ms</em>.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>block-ns</td>
+      <td>Average time the run loop is blocked because all task instances are 
busy processing input; could indicate lag accumulating.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>container-startup-time</td>
+      <td>Time spent starting the container. This includes the time to start the JMX server, metrics reporters, system producers, consumers, system admins, offset manager, locality manager, disk space manager, security manager, and statistics manager, and to initialize all task instances.</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>SamzaContainerMetrics (Counters and Gauges)</strong></td>
+      <td>commit-calls</td>
+      <td>Number of commits. Each commit includes input checkpointing, 
flushing producers, checkpointing KV stores, flushing side input stores, 
etc.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>window-calls</td>
+      <td>In case of WindowableTask, this measures the number of window 
invocations.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>timer-calls</td>
+      <td>Number of timer callbacks.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>process-calls</td>
+      <td>Number of process method invocations.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>process-envelopes</td>
+      <td>Number of input message envelopes processed.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>process-null-envelopes</td>
+      <td>Number of times no input message envelope was available for the run loop to process.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>event-loop-utilization</td>
+      <td>The duty-cycle of the event loop. That is, the fraction of time of 
each event loop iteration that is spent in process(), window(), and commit.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>disk-usage-bytes</td>
+      <td>Total disk space used by key-value stores (in bytes).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>disk-quota-bytes</td>
+      <td>Disk space usage quota for key-value stores (in bytes).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>executor-work-factor</td>
+      <td>The work factor of the run loop. A work factor of 1 indicates full 
throughput, while a work factor of less than 1 will introduce delays into the 
execution to approximate the requested work factor. The work factor is set by 
the disk space monitor in accordance with the disk quota policy. Given the 
latest percentage of available disk quota, this policy returns the work factor 
that should be applied.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>total-process-cpu-usage</td>
+      <td>The CPU usage percentage (in the [0, 100] interval) of the Samza container process and all its child processes.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>physical-memory-mb</td>
+      <td>The physical memory used by the Samza container process (native + on-heap), in MBs.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>physical-memory-utilization</td>
+      <td>The ratio between the physical memory used by the Samza container 
process (native + on heap) and the total physical memory of the Samza 
container.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>container-thread-pool-size</td>
+      <td>The current size of a Samza container's thread pool. It may or may not be the same as job.container.thread.pool.size, depending on the implementation.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>container-active-threads</td>
+      <td>The approximate number of actively used threads in a Samza container's thread pool.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;TaskName&gt;-&lt;StoreName&gt;-restore-time</td>
+      <td>Time taken to restore task stores (per task store).</td>
+    </tr>
+  </tbody>
+</table>
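The two ratio-style gauges in the table above (`physical-memory-utilization` and `event-loop-utilization`) can be read as simple quotients. The Java sketch below shows that arithmetic under stated assumptions: the class `ContainerGaugeSketch`, its method names, and its parameters are hypothetical, and the code only mirrors the table's descriptions rather than Samza's actual implementation.

```java
// Hypothetical helper for illustration only; not part of the Samza API.
final class ContainerGaugeSketch {
    private ContainerGaugeSketch() {}

    /** Ratio of physical memory used by the container process to the container's total physical memory. */
    static double physicalMemoryUtilization(double usedMb, double totalMb) {
        if (totalMb <= 0.0) {
            throw new IllegalArgumentException("totalMb must be positive");
        }
        return usedMb / totalMb;
    }

    /**
     * Fraction of one event loop iteration spent in process(), window(), and commit.
     * Returns 0 for an empty or invalid iteration duration (an assumption of this sketch).
     */
    static double eventLoopUtilization(long processNs, long windowNs, long commitNs, long iterationNs) {
        if (iterationNs <= 0L) {
            return 0.0;
        }
        return (double) (processNs + windowNs + commitNs) / iterationNs;
    }

    public static void main(String[] args) {
        System.out.println(physicalMemoryUtilization(512.0, 2048.0));
        System.out.println(eventLoopUtilization(600L, 100L, 50L, 1000L));
    }
}
```

A utilization that stays close to 1 suggests the run loop has little idle time left, which is worth correlating with `block-ns` and consumer lag.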
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>Job-Coordinator Metrics (Gauge)</strong></td>
+      <td>&lt;system&gt;-&lt;stream&gt;-partitionCount</td>
+      <td>The current number of partitions detected by the Stream Partition 
Count Monitor. This can be enabled by configuring 
<em>job.coordinator.monitor-partition-change</em> to true.</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>TaskInstance Metrics (Counters and Gauges)</strong></td>
+      <td>&lt;system&gt;-&lt;stream&gt;-&lt;partition&gt;-offset</td>
+      <td>The offset of the last processed message on the given 
system-stream-partition input.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>commit-calls</td>
+      <td>Number of commit calls for the task. Each commit call involves 
checkpointing inputs (and flushing producers, checkpointing KV stores, flushing 
side input stores).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>window-calls</td>
+      <td>In case of WindowableTask, the number of window() invocations on the task.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>process-calls</td>
+      <td>Number of process method calls.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>send-calls</td>
+      <td>Number of send method calls (representing the number of messages that were sent to the underlying SystemProducers).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>flush-calls</td>
+      <td>Number of times the underlying system producers were flushed.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>messages-actually-processed</td>
+      <td>Number of messages processed by the task.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>pending-messages</td>
+      <td>Number of pending messages in the pending envelope queue.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>messages-in-flight</td>
+      <td>Number of input messages currently being processed. This is impacted 
by the task.max.concurrency configuration.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>async-callback-complete-calls</td>
+      <td>Number of processAsync invocations that have completed (applicable 
to AsyncStreamTasks).</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>OffsetManagerMetrics (Gauge)</strong></td>
+      
<td>&lt;system&gt;-&lt;stream&gt;-&lt;partition&gt;-checkpointed-offset</td>
+      <td>Latest checkpointed offsets for each input 
system-stream-partition.</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>JvmMetrics (Timers)</strong></td>
+      <td>gc-time-millis</td>
+      <td>Total time spent in GC.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;gc-name&gt;-time-millis</td>
+      <td>Total time spent in garbage collection, for each garbage collector (in milliseconds).</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>JvmMetrics (Counters and Gauges)</strong></td>
+      <td>gc-count</td>
+      <td>Number of GC invocations.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>mem-heap-committed-mb</td>
+      <td>Size of committed heap memory (in MBs). Because the guest allocates memory lazily to the JVM heap and because the difference between Free and Used memory is opaque to the guest, the guest commits memory to the JVM heap as it is required. The Committed memory, therefore, is a measure of how much memory the JVM heap is really consuming in the guest. <a href="https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html">https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html</a></td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>mem-heap-used-mb</td>
+      <td>Size of used heap memory (in MBs). Used memory from the perspective of the JVM is (Working set + Garbage) and Free memory is (Current heap size - Used memory).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>mem-heap-max-mb</td>
+      <td>Size of maximum heap memory (in MBs). This is defined by the -Xmx option.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>mem-nonheap-committed-mb</td>
+      <td>Size of non-heap memory committed in MBs.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>mem-nonheap-used-mb</td>
+      <td>Size of non-heap memory used in MBs.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>mem-nonheap-max-mb</td>
+      <td>Maximum size of non-heap memory in MBs. This can be changed using the -XX:MaxPermSize VM option.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>threads-new</td>
+      <td>Number of threads not started at that instant.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>threads-runnable</td>
+      <td>Number of running threads at that instant.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>threads-timed-waiting</td>
+      <td>Current number of threads in the TIMED_WAITING state at that instant. The TIMED_WAITING state is described as: "A thread that is waiting for another thread to perform an action for up to a specified waiting time is in this state."</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>threads-waiting</td>
+      <td>Current number of waiting threads.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>threads-blocked</td>
+      <td>Current number of blocked threads.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>threads-terminated</td>
+      <td>Current number of terminated threads.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;gc-name&gt;-gc-count</td>
+      <td>Number of garbage collection calls (for each garbage collector).</td>
+    </tr>
+    <tr>
+      <td><strong>(Emitted only if the OS supports it)</strong></td>
+      <td>process-cpu-usage</td>
+      <td>Returns the "recent CPU usage" for the Java Virtual Machine process.</td>
+    </tr>
+    <tr>
+      <td><strong>(Emitted only if the OS supports it)</strong></td>
+      <td>system-cpu-usage</td>
+      <td>Returns the "recent CPU usage" for the whole system.</td>
+    </tr>
+    <tr>
+      <td><strong>(Emitted only if the OS supports it)</strong></td>
+      <td>open-file-descriptor-count</td>
+      <td>Count of open file descriptors.</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>SystemConsumersMetrics (Counters and Gauges)</strong> <br /> These metrics are emitted when multiplexing and coordinating between per-system consumers and message choosers for polling.</td>
+      <td>chose-null</td>
+      <td>Number of times the message chooser returned a null message 
envelope. This is typically indicative of low input traffic on one or more 
input partitions.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>chose-object</td>
+      <td>Number of times the message chooser returned a non-null message 
envelope.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>deserialization-error</td>
+      <td>Number of times an incoming message was not deserialized 
successfully.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>ssps-needed-by-chooser</td>
+      <td>Number of systems for which no buffered message exists, and hence 
these systems need to be polled (to obtain a message).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>poll-timeout</td>
+      <td>The timeout for polling at that instant.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>unprocessed-messages</td>
+      <td>Number of unprocessed messages buffered in SystemConsumers.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-polls</td>
+      <td>Number of times the given system was polled.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-ssp-fetches-per-poll</td>
+      <td>Number of partitions of the given system polled at that instant.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-messages-per-poll</td>
+      <td>Number of times the SystemConsumer for the underlying system was 
polled to get new messages.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;stream&gt;-&lt;partition&gt;-messages-chosen</td>
+      <td>Number of messages that were chosen by the MessageChooser for a particular system-stream-partition.</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>SystemConsumersMetrics (Timers)</strong></td>
+      <td>poll-ns</td>
+      <td>Average time spent polling all underlying systems for new messages 
(in nanoseconds).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>deserialization-ns</td>
+      <td>Average time spent deserializing incoming messages (in 
nanoseconds).</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>KafkaSystemConsumersMetrics (Timers)</strong></td>
+      <td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-offset-change</td>
+      <td>The next offset to be read for this topic and partition.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-bytes-read</td>
+      <td>Total size of all messages read for a topic partition (payload + key 
size).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-messages-read</td>
+      <td>Number of messages read for a topic partition.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-high-watermark</td>
+      <td>Offset of the last committed message in Kafka's topic partition.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      
<td>&lt;system&gt;-&lt;topic&gt;-&lt;partition&gt;-messages-behind-high-watermark</td>
+      <td>Number of input messages waiting to be processed on an input 
topic-partition. That is, the difference between high watermark and next 
offset.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;host&gt;-&lt;port&gt;-reconnects</td>
+      <td>Number of reconnects to a broker on a particular host and port.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;host&gt;-&lt;port&gt;-bytes-read</td>
+      <td>Total size of all messages read from a broker on a particular host 
and port.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;host&gt;-&lt;port&gt;-messages-read</td>
+      <td>Number of times the consumer used a broker on a particular host and 
port to get new messages.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;host&gt;-&lt;port&gt;-skipped-fetch-requests</td>
+      <td>Number of times the fetchMessage method is called but no 
topic/partitions needed new messages.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-&lt;host&gt;-&lt;port&gt;-topic-partitions</td>
+      <td>Number of broker's topic partitions which are being consumed.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>poll-count</td>
+      <td>Number of polls the KafkaSystemConsumer performed to get new 
messages.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>no-more-messages-SystemStreamPartition [&lt;system&gt;, &lt;stream&gt;, &lt;partition&gt;]</td>
+      <td>Indicates whether the Kafka consumer is at the head of a particular partition: 1 if it is caught up, 0 otherwise.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>blocking-poll-count-SystemStreamPartition [&lt;system&gt;, 
&lt;stream&gt;, &lt;partition&gt;]</td>
+      <td>Number of times a blocking poll is executed (polling until we get at 
least one message, or until we catch up to the head of the stream) (per 
partition).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>blocking-poll-timeout-count-SystemStreamPartition [&lt;system&gt;, 
&lt;stream&gt;, &lt;partition&gt;]</td>
+      <td>Number of times a blocking poll has timed out (polling until we get 
at least one message within a timeout period) (per partition).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>buffered-message-count-SystemStreamPartition [&lt;system&gt;, 
&lt;stream&gt;, &lt;partition&gt;]</td>
+      <td>Current number of messages in queue (per partition).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>buffered-message-size-SystemStreamPartition [&lt;system&gt;, 
&lt;stream&gt;, &lt;partition&gt;]</td>
+      <td>Current size of messages in queue (if 
systems.system.samza.fetch.threshold.bytes is defined) (per partition).</td>
+    </tr>
+  </tbody>
+</table>
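The table above describes `messages-behind-high-watermark` as the difference between the high watermark and the next offset to be read. A minimal Java sketch of that arithmetic follows; the class `KafkaLagSketch` and its method are hypothetical names used only for illustration, and clamping negative values to zero is an assumption of this sketch rather than documented Samza behavior.

```java
// Hypothetical helper for illustration only; not part of the Samza API.
final class KafkaLagSketch {
    private KafkaLagSketch() {}

    /**
     * Lag for one topic partition: high watermark minus the next offset to be read.
     * Negative results are clamped to zero (an assumption of this sketch).
     */
    static long messagesBehindHighWatermark(long highWatermark, long nextOffset) {
        return Math.max(0L, highWatermark - nextOffset);
    }

    public static void main(String[] args) {
        System.out.println(messagesBehindHighWatermark(1_500L, 1_200L));
    }
}
```

A lag that keeps growing across successive readings usually indicates that the job is processing more slowly than the input is produced.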
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>SystemProducersMetrics (Counters and Gauges)</strong> <br 
/>These metrics are aggregated across Producers.</td>
+      <td>sends</td>
+      <td>Number of send method calls, representing the total number of sent messages.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>flushes</td>
+      <td>Number of flush method calls for all registered producers.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;source&gt;-sends</td>
+      <td>Number of sent messages for a particular source (task instance).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;source&gt;-flushes</td>
+      <td>Number of flushes for a particular source (task instance).</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>serialization error</td>
+      <td>Number of errors that occurred while serializing envelopes before sending.</td>
+    </tr>
+  </tbody>
+</table>
+
+<table>
+  <thead>
+    <tr>
+      <th><strong>Group</strong></th>
+      <th><strong>Metric name</strong></th>
+      <th><strong>Meaning</strong></th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td><strong>KafkaSystemProducersMetrics (Counters)</strong></td>
+      <td>&lt;system&gt;-producer-sends</td>
+      <td>Number of send invocations to the KafkaSystemProducer.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-producer-send-success</td>
+      <td>Number of send requests that were successfully completed by the 
KafkaSystemProducer.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-producer-send-failed</td>
+      <td>Number of send requests to Kafka (e.g., output topics) that failed due to unrecoverable errors.</td>
+    </tr>
+    <tr>
+      <td>&nbsp;</td>
+      <td>&lt;system&gt;-flushes</td>


[... 434 lines stripped ...]
