[ 
https://issues.apache.org/jira/browse/STORM-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009634#comment-15009634
 ] 

ASF GitHub Bot commented on STORM-885:
--------------------------------------

Github user d2r commented on a diff in the pull request:

    https://github.com/apache/storm/pull/838#discussion_r45128342
  
    --- Diff: docs/documentation/Pacemaker.md ---
    @@ -0,0 +1,89 @@
    +# Pacemaker
    +
    +### Introduction
    +Pacemaker is a Storm daemon designed to process heartbeats from workers. 
As Storm is scaled up, ZooKeeper begins to become a bottleneck due to high 
volumes of writes from workers doing heartbeats. Lots of writes to disk and 
traffic across the network are generated as ZooKeeper tries to maintain 
consistency.
    +
    +Because heartbeats are of an ephemeral nature, they do not need to be 
persisted to disk or synced across nodes; an in-memory store will do. This is 
the role of Pacemaker. Pacemaker functions as a simple in-memory key/value 
store with ZooKeeper-like, directory-style keys and byte array values.
    +
    +The corresponding Pacemaker client is a plugin for the `ClusterState` 
interface, `org.apache.storm.pacemaker.pacemaker_state_factory`. Heartbeat 
calls are funneled by the `ClusterState` produced by `pacemaker_state_factory` 
into the Pacemaker daemon, while other set/get operations are forwarded to 
ZooKeeper.
    +
    +------
    +
    +### Configuration
    +
    + - `pacemaker.host` : The host that the Pacemaker daemon is running on
    + - `pacemaker.port` : The port that Pacemaker will listen on
    + - `pacemaker.max.threads` : Maximum number of threads Pacemaker daemon 
will use to handle requests.
    + - `pacemaker.childopts` : Any JVM parameters that need to go to the 
Pacemaker. (used by storm-deploy project)
    + - `pacemaker.auth.method` : The authentication method that is used (more 
info below)
    +
    +#### Example
    +
    +To get Pacemaker up and running, set the following option in the cluster 
config on all nodes:
    +```
    +storm.cluster.state.store: 
"org.apache.storm.pacemaker.pacemaker_state_factory"
    +```
    +
    +The Pacemaker host also needs to be set on all nodes:
    +```
    +pacemaker.host: somehost.mycompany.com
    +```
    +
    +And then start all of your daemons
    +
    +(including Pacemaker):
    +```
    +$ storm pacemaker
    +```
    +
    +The Storm cluster should now be pushing all worker heartbeats through 
Pacemaker.
    +
    +### Security
    +
    +Currently digest (password-based) and Kerberos security are supported. 
Security currently covers only reads, not writes. Writes may be performed by 
anyone, whereas reads may only be performed by authorized and authenticated 
users. This is an area for future development, as it leaves the cluster open to 
DoS attacks, but it prevents any sensitive information from reaching 
unauthorized eyes, which was the main goal.
    +
    +#### Digest
    +To configure digest authentication, set `pacemaker.auth.method: DIGEST` in 
the cluster config on the nodes hosting Nimbus and Pacemaker.
    +The nodes must also have `java.security.auth.login.config` set to point to 
a jaas config file containing the following structure:
    --- End diff --
    
    `JAAS` (I think this is all caps?)
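
For readers skimming the Introduction in this diff, the split it describes (heartbeat calls go to an in-memory store, all other set/get operations go to ZooKeeper) can be sketched roughly as below. This is only an illustration under stated assumptions: the class names, method names, and paths are hypothetical and are not Storm's actual `ClusterState` API or the implementation in this pull request.

```python
class InMemoryStore:
    """Hypothetical key/value store with directory-style keys and byte values.

    Nothing is persisted or replicated, which is the property the
    Introduction relies on for heartbeats.
    """

    def __init__(self):
        self._data = {}

    def set(self, path, value):
        self._data[path] = value

    def get(self, path):
        return self._data.get(path)

    def children(self, path):
        # List the immediate child names under a directory-style key.
        prefix = path.rstrip("/") + "/"
        names = {
            key[len(prefix):].split("/", 1)[0]
            for key in self._data
            if key.startswith(prefix)
        }
        return sorted(names)


class RoutingClusterState:
    """Hypothetical router: heartbeats to one store, everything else to another."""

    def __init__(self, heartbeat_store, durable_store):
        self._heartbeats = heartbeat_store  # stands in for the Pacemaker daemon
        self._durable = durable_store       # stands in for ZooKeeper

    def set_worker_hb(self, path, data):
        self._heartbeats.set(path, data)

    def get_worker_hb(self, path):
        return self._heartbeats.get(path)

    def set_data(self, path, data):
        self._durable.set(path, data)

    def get_data(self, path):
        return self._durable.get(path)
```

With two `InMemoryStore` instances standing in for the two backends, a heartbeat written through `set_worker_hb` never reaches the store passed as `durable_store`, which is the point of the design: the high-volume, ephemeral writes stay off the consistent store.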


> Heartbeat Server (Pacemaker)
> ----------------------------
>
>                 Key: STORM-885
>                 URL: https://issues.apache.org/jira/browse/STORM-885
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>            Reporter: Robert Joseph Evans
>            Assignee: Kyle Nusbaum
>
> Large, highly connected topologies and large clusters write a lot of data into 
> ZooKeeper.  The heartbeats, which make up the majority of this data, do not 
> need to be persisted to disk.  Pacemaker is intended to be a secure 
> replacement for storing the heartbeats without changing anything within the 
> heartbeats.  In the future, as more metrics are added, we may want to look 
> into switching it over to look more like Heron, where a metrics server is 
> running for each node/topology and can be used to aggregate/pre-aggregate 
> them in a more scalable manner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
