Repository: incubator-distributedlog
Updated Branches:
  refs/heads/master 945c14a99 -> 3469fc878


http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/user_guide/implementation/storage.rst
----------------------------------------------------------------------
diff --git 
a/website/docs/0.4.0-incubating/user_guide/implementation/storage.rst 
b/website/docs/0.4.0-incubating/user_guide/implementation/storage.rst
new file mode 100644
index 0000000..1fa3de0
--- /dev/null
+++ b/website/docs/0.4.0-incubating/user_guide/implementation/storage.rst
@@ -0,0 +1,326 @@
+---
+layout: default
+
+# Sub-level navigation
+sub-nav-group: user-guide
+sub-nav-parent: implementation
+sub-nav-pos: 1
+sub-nav-title: Storage
+
+---
+
+.. contents:: Storage
+
+Storage
+=======
+
+This page describes the implementation details of the storage layer.
+
+Ensemble Placement Policy
+-------------------------
+
+`EnsemblePlacementPolicy` encapsulates the algorithm that the bookkeeper client uses to select a number of bookies from
+the cluster as an ensemble for storing data. The algorithm is typically based on the data input as well as the network
+topology properties.
+
+By default, BookKeeper offers a `RackawareEnsemblePlacementPolicy` for placing 
the data across racks within a
+datacenter, and a `RegionAwareEnsemblePlacementPolicy` for placing the data 
across multiple datacenters.
+
+How does EnsemblePlacementPolicy work?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The interface of `EnsemblePlacementPolicy` is described below.
+
+::
+
+    public interface EnsemblePlacementPolicy {
+
+        /**
+         * Initialize the policy.
+         *
+         * @param conf client configuration
+         * @param optionalDnsResolver dns resolver
+         * @param hashedWheelTimer timer
+         * @param featureProvider feature provider
+         * @param statsLogger stats logger
+         * @param alertStatsLogger stats logger for alerts
+         */
+        public EnsemblePlacementPolicy initialize(ClientConfiguration conf,
+                                                  Optional<DNSToSwitchMapping> optionalDnsResolver,
+                                                  HashedWheelTimer hashedWheelTimer,
+                                                  FeatureProvider featureProvider,
+                                                  StatsLogger statsLogger,
+                                                  AlertStatsLogger alertStatsLogger);
+
+        /**
+         * Uninitialize the policy
+         */
+        public void uninitalize();
+
+        /**
+         * A consistent view of the cluster (what bookies are available as 
writable, what bookies are available as
+         * readonly) is updated when any changes happen in the cluster.
+         *
+         * @param writableBookies
+         *          All the bookies in the cluster available for write/read.
+         * @param readOnlyBookies
+         *          All the bookies in the cluster available for readonly.
+         * @return the dead bookies during this cluster change.
+         */
+        public Set<BookieSocketAddress> onClusterChanged(Set<BookieSocketAddress> writableBookies,
+                                                         Set<BookieSocketAddress> readOnlyBookies);
+
+        /**
+         * Choose <i>ensembleSize</i> bookies for an ensemble. If the count is more than the number of available
+         * nodes, {@link BKNotEnoughBookiesException} is thrown.
+         *
+         * @param ensembleSize
+         *          Ensemble Size
+         * @param writeQuorumSize
+         *          Write Quorum Size
+         * @param ackQuorumSize
+         *          Ack Quorum Size
+         * @param excludeBookies
+         *          Bookies that should not be considered as targets.
+         * @return list of bookies chosen as targets.
+         * @throws BKNotEnoughBookiesException if not enough bookies available.
+         */
+        public ArrayList<BookieSocketAddress> newEnsemble(int ensembleSize, int writeQuorumSize, int ackQuorumSize,
+                                                          Set<BookieSocketAddress> excludeBookies) throws BKNotEnoughBookiesException;
+
+        /**
+         * Choose a new bookie to replace <i>bookieToReplace</i>. If no bookie is available in the cluster,
+         * {@link BKNotEnoughBookiesException} is thrown.
+         *
+         * @param ensembleSize
+         *          ensemble size
+         * @param writeQuorumSize
+         *          write quorum size
+         * @param ackQuorumSize
+         *          ack quorum size
+         * @param currentEnsemble
+         *          the current ensemble that the bookie belongs to
+         * @param bookieToReplace
+         *          bookie to replace
+         * @param excludeBookies
+         *          bookies that should not be considered as candidates.
+         * @return the bookie chosen as target.
+         * @throws BKNotEnoughBookiesException
+         */
+        public BookieSocketAddress replaceBookie(int ensembleSize, int writeQuorumSize, int ackQuorumSize,
+                                                 Collection<BookieSocketAddress> currentEnsemble,
+                                                 BookieSocketAddress bookieToReplace,
+                                                 Set<BookieSocketAddress> excludeBookies) throws BKNotEnoughBookiesException;
+
+        /**
+         * Reorder the read sequence of a given write quorum <i>writeSet</i>.
+         *
+         * @param ensemble
+         *          Ensemble to read entries.
+         * @param writeSet
+         *          Write quorum to read entries.
+         * @param bookieFailureHistory
+         *          Observed failures on the bookies
+         * @return read sequence of bookies
+         */
+        public List<Integer> reorderReadSequence(ArrayList<BookieSocketAddress> ensemble,
+                                                 List<Integer> writeSet,
+                                                 Map<BookieSocketAddress, Long> bookieFailureHistory);
+
+
+        /**
+         * Reorder the read last add confirmed sequence of a given write quorum <i>writeSet</i>.
+         *
+         * @param ensemble
+         *          Ensemble to read entries.
+         * @param writeSet
+         *          Write quorum to read entries.
+         * @param bookieFailureHistory
+         *          Observed failures on the bookies
+         * @return read sequence of bookies
+         */
+        public List<Integer> reorderReadLACSequence(ArrayList<BookieSocketAddress> ensemble,
+                                                    List<Integer> writeSet,
+                                                    Map<BookieSocketAddress, Long> bookieFailureHistory);
+    }
+
+The methods in this interface cover three parts - 1) initialization and uninitialization; 2) how to choose bookies to
+place data; and 3) how to choose bookies for speculative reads.
+
+Initialization and uninitialization
+___________________________________
+
+The ensemble placement policy is constructed via JVM reflection when the bookkeeper client is constructed. After the
+`EnsemblePlacementPolicy` is constructed, the bookkeeper client calls `#initialize` to initialize the placement policy.
+
+The `#initialize` method takes a few resources from bookkeeper to instantiate itself. These resources include:
+
+1. `ClientConfiguration` : The client configuration used for constructing the bookkeeper client. The implementation of the placement policy could obtain its settings from this configuration.
+2. `DNSToSwitchMapping`: The DNS resolver for the ensemble policy to build the network topology of the bookie cluster. It is optional.
+3. `HashedWheelTimer`: A hashed wheel timer that could be used for timing related work. For example, a stabilized network topology could use it to delay network topology changes, reducing the impact of flapping bookie registrations caused by zookeeper session expiry.
+4. `FeatureProvider`: A feature provider that the policy could use for enabling or disabling its offered features. For example, a region-aware placement policy could offer features to disable placing data to a specific region at runtime.
+5. `StatsLogger`: A stats logger for exposing stats.
+6. `AlertStatsLogger`: An alert stats logger for exposing critical stats that need alerting.
+
+The ensemble placement policy is a single instance per bookkeeper client. The instance is uninitialized via
+`#uninitalize` when the bookkeeper client is closed. The implementation of a placement policy is responsible for
+releasing all the resources allocated during `#initialize`.
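+
+For illustration, here is a minimal sketch of a custom policy skeleton, focusing only on the lifecycle concerns
+above; the class name `SimplePlacementPolicy` is hypothetical and the placement methods are elided.
+
+::
+
+    // A hedged sketch of a policy skeleton, not a production implementation.
+    // Declared abstract because the placement methods are left out.
+    public abstract class SimplePlacementPolicy implements EnsemblePlacementPolicy {
+
+        private HashedWheelTimer timer;
+        private StatsLogger statsLogger;
+
+        @Override
+        public EnsemblePlacementPolicy initialize(ClientConfiguration conf,
+                                                  Optional<DNSToSwitchMapping> optionalDnsResolver,
+                                                  HashedWheelTimer hashedWheelTimer,
+                                                  FeatureProvider featureProvider,
+                                                  StatsLogger statsLogger,
+                                                  AlertStatsLogger alertStatsLogger) {
+            // Keep references to the resources needed at runtime.
+            this.timer = hashedWheelTimer;
+            this.statsLogger = statsLogger;
+            return this;
+        }
+
+        @Override
+        public void uninitalize() {
+            // Release everything allocated during #initialize.
+            this.timer = null;
+            this.statsLogger = null;
+        }
+
+        // onClusterChanged, newEnsemble, replaceBookie, reorderReadSequence
+        // and reorderReadLACSequence are elided from this sketch.
+    }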
+
+How to choose bookies to place
+______________________________
+
+The bookkeeper client discovers the list of bookies from zookeeper via `BookieWatcher` - whenever there are bookie changes,
+the ensemble placement policy is notified with the new list of bookies via `onClusterChanged(writableBookie, readOnlyBookies)`.
+The implementation of the ensemble placement policy reacts to those changes by building a new network topology. Subsequent
+operations like `newEnsemble` or `replaceBookie` hence operate on the new network topology.
+
+newEnsemble(ensembleSize, writeQuorumSize, ackQuorumSize, excludeBookies)
+    Choose `ensembleSize` bookies for an ensemble. If the count is more than the number of available nodes,
+    `BKNotEnoughBookiesException` is thrown.
+
+replaceBookie(ensembleSize, writeQuorumSize, ackQuorumSize, currentEnsemble, bookieToReplace, excludeBookies)
+    Choose a new bookie to replace `bookieToReplace`. If no bookie is available in the cluster,
+    `BKNotEnoughBookiesException` is thrown.
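+
+For illustration, a hedged sketch of how a caller might drive these two methods; the `policy` and
+`excludeBookies` arguments are assumed inputs and the quorum sizes (3, 3, 2) are arbitrary.
+
+::
+
+    // A hedged usage sketch, not actual bookkeeper client code.
+    void placeAndRepair(EnsemblePlacementPolicy policy,
+                        Set<BookieSocketAddress> excludeBookies) {
+        try {
+            ArrayList<BookieSocketAddress> ensemble =
+                policy.newEnsemble(3, 3, 2, excludeBookies);
+            // ... write to the ensemble; when a bookie fails, replace it ...
+            BookieSocketAddress replacement =
+                policy.replaceBookie(3, 3, 2, ensemble, ensemble.get(0), excludeBookies);
+        } catch (BKNotEnoughBookiesException e) {
+            // Not enough bookies in the cluster to satisfy the request.
+        }
+    }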
+
+
+Both `RackAware` and `RegionAware` placement policies are `TopologyAware` policies. They build a `NetworkTopology` in
+response to bookie changes, use it for ensemble placement, and ensure rack/region coverage for write quorums - a write
+quorum should be covered by at least two racks or regions.
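+
+A hedged sketch of that coverage rule expressed as a predicate (this helper is hypothetical and not part of the
+policy interface):
+
+::
+
+    // Hypothetical helper: true if the bookies of a write quorum span at
+    // least two distinct racks (network locations).
+    static boolean coversAtLeastTwoRacks(List<BookieSocketAddress> writeQuorum,
+                                         Map<BookieSocketAddress, String> bookieToRack) {
+        Set<String> racks = new HashSet<>();
+        for (BookieSocketAddress bookie : writeQuorum) {
+            racks.add(bookieToRack.get(bookie));
+        }
+        return racks.size() >= 2;
+    }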
+
+Network Topology
+^^^^^^^^^^^^^^^^
+
+The network topology presents a cluster of bookies in a hierarchical tree structure. For example, a bookie cluster
+may consist of many data centers (aka regions) filled with racks of machines. In this tree structure, leaves
+represent bookies and inner nodes represent switches/routers that manage traffic in/out of regions or racks.
+
+For example, suppose there are 3 bookies in region `A`: `bk1`, `bk2` and `bk3`, with network locations
+`/region-a/rack-1/bk1`, `/region-a/rack-1/bk2` and `/region-a/rack-2/bk3`. The network topology will look like below:
+
+::
+
+              root
+               |
+           region-a
+             /  \
+        rack-1  rack-2
+         /  \       \
+       bk1  bk2     bk3
+
+As another example, suppose there are 4 bookies spanning two regions `A` and `B`: `bk1`, `bk2`, `bk3` and `bk4`, with
+network locations `/region-a/rack-1/bk1`, `/region-a/rack-1/bk2`, `/region-b/rack-2/bk3` and `/region-b/rack-2/bk4`.
+The network topology will look like below:
+
+::
+
+                    root
+                    /  \
+             region-a  region-b
+                |         |
+              rack-1    rack-2
+               / \       / \
+             bk1  bk2  bk3  bk4
+
+The network location of each bookie is resolved by a `DNSResolver` (the interface is described below). The `DNSResolver`
+resolves a list of DNS-names or IP-addresses into a list of network locations. The network location that is returned
+must be a network path of the form `/region/rack`, where `/` is the root, and `region` is the region id representing
+the data center where `rack` is located. The network topology of the bookie cluster would determine the number of
+components in the network path.
+
+::
+
+    /**
+     * An interface that must be implemented to allow pluggable
+     * DNS-name/IP-address to RackID resolvers.
+     *
+     */
+    @Beta
+    public interface DNSToSwitchMapping {
+        /**
+         * Resolves a list of DNS-names/IP-addresses and returns back a list of
+         * switch information (network paths). One-to-one correspondence must be
+         * maintained between the elements in the lists.
+         * Consider an element in the argument list - x.y.com. The switch information
+         * that is returned must be a network path of the form /foo/rack,
+         * where / is the root, and 'foo' is the switch where 'rack' is connected.
+         * Note the hostname/ip-address is not part of the returned path.
+         * The network topology of the cluster would determine the number of
+         * components in the network path.
+         * <p/>
+         *
+         * If a name cannot be resolved to a rack, the implementation
+         * should return {@link NetworkTopology#DEFAULT_RACK}. This
+         * is what the bundled implementations do, though it is not a formal requirement.
+         *
+         * @param names the list of hosts to resolve (can be empty)
+         * @return list of resolved network paths.
+         * If <i>names</i> is empty, the returned list is also empty
+         */
+        public List<String> resolve(List<String> names);
+
+        /**
+         * Reload all of the cached mappings.
+         *
+         * If there is a cache, this method will clear it, so that future accesses
+         * will get a chance to see the new data.
+         */
+        public void reloadCachedMappings();
+    }
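+
+For example, a hedged sketch of a static table-based resolver (the class and its host-to-location table are
+hypothetical) could implement this interface as below:
+
+::
+
+    // A hypothetical table-based resolver: maps known hosts to /region/rack
+    // locations and falls back to the default rack for unknown hosts.
+    public class StaticDNSResolver implements DNSToSwitchMapping {
+
+        private final Map<String, String> hostToLocation = new HashMap<>();
+
+        public StaticDNSResolver() {
+            hostToLocation.put("bk1", "/region-a/rack-1");
+            hostToLocation.put("bk2", "/region-a/rack-1");
+            hostToLocation.put("bk3", "/region-a/rack-2");
+        }
+
+        @Override
+        public List<String> resolve(List<String> names) {
+            // Maintain one-to-one correspondence with the input list.
+            List<String> locations = new ArrayList<>(names.size());
+            for (String name : names) {
+                locations.add(hostToLocation.getOrDefault(name, NetworkTopology.DEFAULT_RACK));
+            }
+            return locations;
+        }
+
+        @Override
+        public void reloadCachedMappings() {
+            // No cache to reload in this static sketch.
+        }
+    }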
+
+By default, the network topology responds to bookie changes immediately: if a bookie's znode appears in or
+disappears from zookeeper, the network topology adds or removes the bookie immediately. This introduces
+instability when a bookie's zookeeper registration flaps. To address this, there is a `StabilizeNetworkTopology`
+which delays removing bookies from the network topology when they disappear from zookeeper. It can be enabled by
+setting the following option.
+
+::
+
+    # enable stabilize network topology by setting it to a positive value.
+    bkc.networkTopologyStabilizePeriodSeconds=10
+
+
+RackAware and RegionAware
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The `RackAware` placement policy chooses bookies from different racks in the built network topology. It
+guarantees that a write quorum covers at least two racks.
+
+The `RegionAware` placement policy is a hierarchical placement policy: it chooses an equal number of bookies from each
+region, and within each region it uses the `RackAware` placement policy to choose bookies from racks. For example, if
+there are 3 regions - `region-a`, `region-b` and `region-c` - and an application wants to allocate a 15-bookie
+ensemble, the policy first figures out that there are 3 regions and it should allocate 5 bookies from each region;
+then, for each region, it uses the `RackAware` placement policy to choose those 5 bookies.
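+
+A hedged sketch of that per-region split (a hypothetical helper, not the actual policy code):
+
+::
+
+    // Hypothetical sketch of the hierarchical split: divide the ensemble
+    // evenly across regions, giving earlier regions the remainder.
+    static int bookiesForRegion(int ensembleSize, int numRegions, int regionIndex) {
+        int base = ensembleSize / numRegions;       // e.g. 15 / 3 = 5
+        int remainder = ensembleSize % numRegions;  // e.g. 15 % 3 = 0
+        return base + (regionIndex < remainder ? 1 : 0);
+    }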
+
+How to choose bookies to do speculative reads?
+______________________________________________
+
+`reorderReadSequence` and `reorderReadLACSequence` are two methods exposed by the placement policy to help the client
+determine a better read sequence according to the network topology and the bookie failure history.
+
+In the `RackAware` placement policy, reads are tried in the following sequence:
+
+- bookies that are writable and didn't experience failures before
+- bookies that are writable and experienced failures before
+- bookies that are readonly
+- bookies that already disappeared from the network topology
+
+In the `RegionAware` placement policy, reads are tried in a sequence similar to the `RackAware` placement policy.
+There is a slight difference when trying writable bookies: after trying every 2 bookies from the local region, it
+tries a bookie from a remote region. Hence it can achieve low latency even when there are network issues within the
+local region.
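+
+The following is a hedged illustration of the four tiers described above (a hypothetical helper; the real policies
+operate on write-set indices rather than addresses):
+
+::
+
+    // Hypothetical sketch: order read candidates by the four tiers above.
+    // `writable`, `readonly` and `failedBefore` are assumed inputs.
+    static List<BookieSocketAddress> orderForRead(List<BookieSocketAddress> candidates,
+                                                  Set<BookieSocketAddress> writable,
+                                                  Set<BookieSocketAddress> readonly,
+                                                  Set<BookieSocketAddress> failedBefore) {
+        List<BookieSocketAddress> ordered = new ArrayList<>();
+        for (BookieSocketAddress b : candidates) {   // tier 1: writable, no failures
+            if (writable.contains(b) && !failedBefore.contains(b)) ordered.add(b);
+        }
+        for (BookieSocketAddress b : candidates) {   // tier 2: writable, failed before
+            if (writable.contains(b) && failedBefore.contains(b)) ordered.add(b);
+        }
+        for (BookieSocketAddress b : candidates) {   // tier 3: readonly
+            if (readonly.contains(b)) ordered.add(b);
+        }
+        for (BookieSocketAddress b : candidates) {   // tier 4: gone from the topology
+            if (!writable.contains(b) && !readonly.contains(b)) ordered.add(b);
+        }
+        return ordered;
+    }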
+
+How to enable different EnsemblePlacementPolicy?
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Users can configure different ensemble placement policies by setting the following options in the distributedlog
+configuration files.
+
+::
+
+    # enable rack-aware ensemble placement policy
+    bkc.ensemblePlacementPolicy=org.apache.bookkeeper.client.RackawareEnsemblePlacementPolicy
+    # enable region-aware ensemble placement policy
+    bkc.ensemblePlacementPolicy=org.apache.bookkeeper.client.RegionAwareEnsemblePlacementPolicy
+
+The network topology of bookies used by either `RackawareEnsemblePlacementPolicy` or `RegionAwareEnsemblePlacementPolicy`
+is built via a `DNSResolver`. The default `DNSResolver` is a script-based DNS resolver. It reads the configuration
+parameters, executes any defined script, handles errors and resolves domain names to network locations. The script
+is configured via the following setting in the distributedlog configuration.
+
+::
+
+    bkc.networkTopologyScriptFileName=/path/to/dns/resolver/script
+
+Alternatively, the `DNSResolver` can be configured via the following setting and loaded via reflection.
+`DNSResolverForRacks` is a good example to check out when customizing a DNS resolver for your network environment.
+
+::
+
+    bkEnsemblePlacementDnsResolverClass=org.apache.distributedlog.net.DNSResolverForRacks
+

http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/user_guide/implementation/writeproxy.rst
----------------------------------------------------------------------
diff --git 
a/website/docs/0.4.0-incubating/user_guide/implementation/writeproxy.rst 
b/website/docs/0.4.0-incubating/user_guide/implementation/writeproxy.rst
new file mode 100644
index 0000000..bc9feca
--- /dev/null
+++ b/website/docs/0.4.0-incubating/user_guide/implementation/writeproxy.rst
@@ -0,0 +1,5 @@
+---
+layout: default
+---
+
+.. contents:: Write Proxy

http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/user_guide/main.rst
----------------------------------------------------------------------
diff --git a/website/docs/0.4.0-incubating/user_guide/main.rst 
b/website/docs/0.4.0-incubating/user_guide/main.rst
new file mode 100644
index 0000000..37eacbc
--- /dev/null
+++ b/website/docs/0.4.0-incubating/user_guide/main.rst
@@ -0,0 +1,13 @@
+---
+title: "User Guide"
+layout: guide
+
+# Sub-level navigation
+sub-nav-group: user-guide
+sub-nav-id: user-guide
+sub-nav-parent: _root_
+sub-nav-group-title: User Guide
+sub-nav-pos: 1
+sub-nav-title: User Guide
+
+---

http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/user_guide/references/features.rst
----------------------------------------------------------------------
diff --git a/website/docs/0.4.0-incubating/user_guide/references/features.rst 
b/website/docs/0.4.0-incubating/user_guide/references/features.rst
new file mode 100644
index 0000000..c92ae3f
--- /dev/null
+++ b/website/docs/0.4.0-incubating/user_guide/references/features.rst
@@ -0,0 +1,42 @@
+---
+layout: default
+
+# Sub-level navigation
+sub-nav-group: user-guide
+sub-nav-parent: references
+sub-nav-pos: 3
+sub-nav-title: Available Features
+---
+
+.. contents:: Available Features
+
+Features
+========
+
+BookKeeper Features
+-------------------
+
+*<scope>* is the scope value of the FeatureProvider passed to the BookKeeperClient builder. In the DistributedLog write proxy, the *<scope>* is 'bkc'.
+
+- *<scope>.repp_disable_durability_enforcement*: Feature to disable durability enforcement on the region-aware data placement policy. It applies to global replicated logs only. If the availability value is larger than zero, the region-aware data placement policy will *NOT* enforce region-wise durability. For example, suppose a *Log* is writing to regions A, B and C with write quorum size *15* and ack quorum size *9*. If the availability value of this feature is zero, it requires *9*
+  acknowledgements from bookies in at least two regions. If the availability value of this feature is larger than zero, the enforcement is *disabled* and a write can be acknowledged after receiving *9* acknowledgements from any regions. By default the availability is zero. Turn this feature on to tolerate multiple region failures.
+
+- *<scope>.disable_ensemble_change*: Feature to disable ensemble change on DistributedLog writers. If the availability value of this feature is larger than zero, it disables ensemble changes on writers. It can be used for tolerating a zookeeper outage (see the sketch after this list).
+
+- *<scope>.<region>.disallow_bookie_placement*: Feature to disallow choosing a bookie replacement from a given *region* when changing ensembles. It applies to global replicated logs only. If the availability value is larger than zero, the writer (write proxy) stops choosing a bookie from *<region>* when changing ensembles. It is useful for blacking out a region dynamically.
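+
+For illustration, a hedged sketch of checking one of these features through BookKeeper's `FeatureProvider` API;
+`SettableFeatureProvider` is used here purely for the sketch, and exact constructor signatures may vary across
+bookkeeper versions.
+
+::
+
+    // A hedged sketch: check at runtime whether ensemble change is disabled.
+    boolean shouldSkipEnsembleChange() {
+        FeatureProvider provider = new SettableFeatureProvider("bkc", 0);
+        Feature disableEnsembleChange = provider.getFeature("disable_ensemble_change");
+        // availability > 0 means the feature is on and ensemble change is disabled.
+        return disableEnsembleChange.isAvailable();
+    }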
+
+DistributedLog Features
+-----------------------
+
+*<scope>* is the scope value of the FeatureProvider passed to the DistributedLogNamespace builder. In the DistributedLog write proxy, the *<scope>* is 'dl'.
+
+- *<scope>.disable_logsegment_rolling*: Feature to disable log segment rolling. If the availability value is larger than zero, the writer (write proxy) stops rolling to new log segments and keeps writing to the current log segments. It is a useful feature for tolerating a zookeeper outage.
+
+- *<scope>.disable_write_limit*: Feature to disable write limiting. If the availability value is larger than zero, the writer (write proxy) disables write limiting. It is used to control write limiting dynamically.
+
+Write Proxy Features
+--------------------
+
+- *region_stop_accept_new_stream*: Feature to disable accepting new streams in the current region. It applies to global replicated logs only. If the availability value is larger than zero, the write proxies stop accepting new streams and throw a RegionUnavailable exception to clients. Hence clients know this region has stopped accepting new streams and are forced to send requests to other regions. It is a feature used for ownership failover between regions.
+- *service_rate_limit_disabled*: Feature to disable service rate limiting. If the availability value is larger than zero, the write proxies disable rate limiting.
+- *service_checksum_disabled*: Feature to disable service request checksum validation. If the availability value is larger than zero, the write proxies disable request checksum validation.

http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/user_guide/references/main.rst
----------------------------------------------------------------------
diff --git a/website/docs/0.4.0-incubating/user_guide/references/main.rst 
b/website/docs/0.4.0-incubating/user_guide/references/main.rst
new file mode 100644
index 0000000..b0e62eb
--- /dev/null
+++ b/website/docs/0.4.0-incubating/user_guide/references/main.rst
@@ -0,0 +1,28 @@
+---
+layout: default
+
+# Top navigation
+top-nav-group: user-guide
+top-nav-pos: 9
+top-nav-title: References
+
+# Sub-level navigation
+sub-nav-group: user-guide
+sub-nav-parent: user-guide
+sub-nav-id: references
+sub-nav-pos: 9
+sub-nav-title: References
+---
+
+References
+===========
+
+This page keeps references on configuration settings, metrics and features that are exposed in DistributedLog.
+
+- `Metrics`_
+
+.. _Metrics: ./metrics
+
+- `Available Features`_
+
+.. _Available Features: ./features

http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/user_guide/references/metrics.rst
----------------------------------------------------------------------
diff --git a/website/docs/0.4.0-incubating/user_guide/references/metrics.rst 
b/website/docs/0.4.0-incubating/user_guide/references/metrics.rst
new file mode 100644
index 0000000..9f3c9a3
--- /dev/null
+++ b/website/docs/0.4.0-incubating/user_guide/references/metrics.rst
@@ -0,0 +1,492 @@
+---
+layout: default
+
+# Sub-level navigation
+sub-nav-group: user-guide
+sub-nav-parent: references
+sub-nav-pos: 2
+sub-nav-title: Metrics
+
+---
+
+.. contents:: Metrics
+
+Metrics
+=======
+
+This section lists the metrics exposed by the main classes.
+
+({scope} refers to the current scope value of the StatsLogger that is passed in.)
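+
+As a hedged illustration of how these scoped names come about (assuming BookKeeper's `StatsLogger` API; exact
+signatures vary across versions):
+
+::
+
+    // A hedged sketch: metrics are registered against a scoped StatsLogger,
+    // so a logger at {scope} exposes e.g. {scope}/zk/get_data as an OpStats.
+    void timedGetData(StatsLogger statsLogger) throws Exception {
+        OpStatsLogger getDataStats = statsLogger.scope("zk").getOpStatsLogger("get_data");
+        long startNanos = System.nanoTime();
+        // ... perform the zookeeper get_data operation here ...
+        getDataStats.registerSuccessfulEvent(System.nanoTime() - startNanos, TimeUnit.NANOSECONDS);
+    }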
+
+MonitoredFuturePool
+-------------------
+
+**{scope}/tasks_pending**
+
+Gauge. The number of tasks pending in this future pool. If this value becomes high, it means that
+the future pool execution rate couldn't keep up with the submission rate. That would cause a high
+*task_pending_time*, hence affecting the callers that use this future pool.
+It could also cause heavy JVM GC if this pool keeps building up.
+
+**{scope}/task_pending_time**
+
+OpStats. It measures the time that tasks spend waiting to be executed.
+It becomes high when either *tasks_pending* is building up or *task_execution_time* is high, blocking other
+tasks from executing.
+
+**{scope}/task_execution_time**
+
+OpStats. It measures the time that tasks spend executing. If it becomes high,
+it blocks other tasks from executing when there aren't enough threads in this executor,
+hence causing a high *task_pending_time* and impacting end-user latency.
+
+**{scope}/task_enqueue_time**
+
+OpStats. The time that tasks spend on submission. The submission time also impacts end-user latency.
+
+MonitoredScheduledThreadPoolExecutor
+------------------------------------
+
+**{scope}/pending_tasks**
+
+Gauge. The number of tasks pending in this thread pool executor. If this value becomes high, it means that
+the thread pool executor execution rate couldn't keep up with the submission rate. That would cause a high
+*task_pending_time*, hence affecting the callers that use this executor. It could also cause heavy JVM GC if
+the queue keeps building up.
+
+**{scope}/completed_tasks**
+
+Gauge. The number of tasks completed in this thread pool executor.
+
+**{scope}/total_tasks**
+
+Gauge. The number of tasks submitted to this thread pool executor.
+
+**{scope}/task_pending_time**
+
+OpStats. It measures the time that tasks spend waiting to be executed.
+It becomes high when either *pending_tasks* is building up or *task_execution_time* is high, blocking other
+tasks from executing.
+
+**{scope}/task_execution_time**
+
+OpStats. It measures the characteristics about the time that tasks spent on 
execution. If it becomes high,
+it would block other tasks to execute if there isn't enough threads in this 
executor, hence cause high
+*task_pending_time* and impact user end latency.
+
+OrderedScheduler
+----------------
+
+OrderedScheduler is a thread pool based *ScheduledExecutorService*. It is composed of multiple
+MonitoredScheduledThreadPoolExecutor_ instances, and each MonitoredScheduledThreadPoolExecutor_ is wrapped into a
+MonitoredFuturePool_. So both aggregated stats and per-executor stats are exposed.
+
+Aggregated Stats
+~~~~~~~~~~~~~~~~
+
+**{scope}/task_pending_time**
+
+OpStats. It measures the time that tasks spend waiting to be executed.
+It becomes high when either *pending_tasks* is building up or *task_execution_time* is high, blocking other
+tasks from executing.
+
+**{scope}/task_execution_time**
+
+OpStats. It measures the characteristics about the time that tasks spent on 
execution. If it becomes high,
+it would block other tasks to execute if there isn't enough threads in this 
executor, hence cause high
+*task_pending_time* and impact user end latency.
+
+**{scope}/futurepool/tasks_pending**
+
+Gauge. The number of tasks pending in this future pool. If this value becomes high, it means that
+the future pool execution rate couldn't keep up with the submission rate. That would cause a high
+*task_pending_time*, hence affecting the callers that use this future pool.
+It could also cause heavy JVM GC if this pool keeps building up.
+
+**{scope}/futurepool/task_pending_time**
+
+OpStats. It measures the time that tasks spend waiting to be executed.
+It becomes high when either *tasks_pending* is building up or *task_execution_time* is high, blocking other
+tasks from executing.
+
+**{scope}/futurepool/task_execution_time**
+
+OpStats. It measures the characteristics about the time that tasks spent on 
execution. If it becomes high,
+it would block other tasks to execute if there isn't enough threads in this 
executor, hence cause high
+*task_pending_time* and impact user end latency.
+
+**{scope}/futurepool/task_enqueue_time**
+
+OpStats. The time that tasks spend on submission. The submission time also impacts end-user latency.
+
+Per Executor Stats
+~~~~~~~~~~~~~~~~~~
+
+Stats about individual executors are exposed under 
*{scope}/{name}-executor-{id}-0*. *{name}* is the scheduler
+name and *{id}* is the index of the executor in the pool. The corresponding 
stats of its futurepool are exposed
+under *{scope}/{name}-executor-{id}-0/futurepool*. See 
MonitoredScheduledThreadPoolExecutor_ and MonitoredFuturePool_
+for more details.
+
+ZooKeeperClient
+---------------
+
+Operation Stats
+~~~~~~~~~~~~~~~
+
+All operation stats are exposed under {scope}/zk. The stats are **latency** 
*OpStats*
+on zookeeper operations.
+
+**{scope}/zk/{op}**
+
+latency stats on operations.
+these operations are *create_client*, *get_data*, *set_data*, *delete*, 
*get_children*, *multi*, *get_acl*, *set_acl* and *sync*.
+
+Watched Event Stats
+~~~~~~~~~~~~~~~~~~~
+
+All stats on zookeeper watched events are exposed under {scope}/watcher. The stats are *Counters* for the watched events that this client received:
+
+**{scope}/watcher/state/{keeper_state}**
+
+The number of `KeeperState` changes that this client received. The states are *Disconnected*, *SyncConnected*,
+*AuthFailed*, *ConnectedReadOnly*, *SaslAuthenticated* and *Expired*. Monitoring metrics like *SyncConnected*
+or *Expired* helps in understanding the health of this zookeeper client.
+
+**{scope}/watcher/events/{event}**
+
+The number of `Watcher.Event` notifications received by this client. Those events are *None*, *NodeCreated*,
+*NodeDeleted*, *NodeDataChanged* and *NodeChildrenChanged*.
+
+Watcher Manager Stats
+~~~~~~~~~~~~~~~~~~~~~
+
+This ZooKeeperClient provides a watcher manager to manage watchers for applications. It tracks the mapping between
+paths and watchers, which provides the ability to remove watchers. The stats are *Gauges* for the number
+of watchers managed by this zookeeper client.
+
+**{scope}/watcher_manager/total_watches**
+
+Total number of watches managed by this watcher manager. If it keeps growing, it usually means that
+watchers are leaking (resources aren't closed properly), which will eventually cause OOM.
+
+**{scope}/watcher_manager/num_child_watches**
+
+Total number of paths watched by this watcher manager.
+
+BookKeeperClient
+----------------
+
+TODO: add bookkeeper stats here
+
+DistributedReentrantLock
+------------------------
+
+All stats related to locks are exposed under {scope}/lock.
+
+**{scope}/acquire**
+
+OpStats. It measures the time spent acquiring locks.
+
+**{scope}/release**
+
+OpStats. It measures the time spent releasing locks.
+
+**{scope}/reacquire**
+
+OpStats. The lock expires when the underlying zookeeper session expires. The
+reentrant lock automatically attempts to re-acquire the lock on session expiry.
+This metric measures the time spent re-acquiring locks.
+
+**{scope}/internalTryRetries**
+
+Counter. The number of retries that locks spend on re-creating internal locks. Typically,
+a new internal lock is created when a session expires.
+
+**{scope}/acquireTimeouts**
+
+Counter. The number of timeouts that the caller experienced when acquiring locks.
+
+**{scope}/tryAcquire**
+
+OpStats. It measures the time that each internal lock spends on acquiring.
+
+**{scope}/tryTimeouts**
+
+Counter. The number of timeouts encountered while internal locks try acquiring.
+
+**{scope}/unlock**
+
+OpStats. It measures the time that the caller spends on unlocking internal locks.
+
+BKLogHandler
+------------
+
+The log handler is a base class for managing log segments, so all the metrics in this class are
+related to log segment retrieval and are exposed under {scope}/logsegments. They are all `OpStats` in
+the format of `{scope}/logsegments/{op}`. Those operations are:
+
+* force_get_list: force a fetch of the list of log segments.
+* get_list: get the list of log segments; it might be served from the local log segment cache.
+* get_filtered_list: get the filtered list of log segments.
+* get_full_list: get the full list of log segments.
+* get_inprogress_segment: time between an inprogress log segment being created and the handler reading it.
+* get_completed_segment: time between a log segment being completed and the handler reading it.
+* negative_get_inprogress_segment: record the negative values for `get_inprogress_segment`.
+* negative_get_completed_segment: record the negative values for `get_completed_segment`.
+* recover_last_entry: recovering the last entry from a log segment.
+* recover_scanned_entries: the number of entries scanned during recovery.
+
+See BKLogWriteHandler_ for write handlers.
+
+See BKLogReadHandler_ for read handlers.
+
+BKLogReadHandler
+----------------
+
+The core logic in the log read handler is the readahead worker. Most of the readahead stats are exposed under
+{scope}/readahead_worker.
+
+**{scope}/readahead_worker/wait**
+
+Counter. The number of times the readahead worker has waited. If this keeps increasing, it usually means
+the readahead cache keeps getting full because the reader has slowed down.
+
+**{scope}/readahead_worker/repositions**
+
+Counter. The number of repositions the readahead worker encounters. A reposition means the readahead worker
+finds that it isn't advancing to a new log segment and forces a re-position.
+
+**{scope}/readahead_worker/entry_piggy_back_hits**
+
+Counter. It increases when the last add confirmed is advanced because of the piggy-backed LAC.
+
+**{scope}/readahead_worker/entry_piggy_back_misses**
+
+Counter. It increases when the last add confirmed isn't advanced by a read entry because the entry doesn't
+piggy-back a newer LAC.
+
+**{scope}/readahead_worker/read_entries**
+
+OpStats. Stats on number of entries read per readahead read batch.
+
+**{scope}/readahead_worker/read_lac_counter**
+
+Counter. Stats on the number of readLastConfirmed operations.
+
+**{scope}/readahead_worker/read_lac_and_entry_counter**
+
+Counter. Stats on the number of readLastConfirmedAndEntry operations.
+
+**{scope}/readahead_worker/cache_full**
+
+Counter. It increases each time the readahead worker finds the cache has become full. If it keeps increasing,
+it means the reader has slowed down.
+
+**{scope}/readahead_worker/resume**
+
+OpStats. Stats on readahead worker resuming reading from wait state.
+
+**{scope}/readahead_worker/long_poll_interruption**
+
+OpStats. Stats on the number of interruptions to long polls. The interruptions are usually
+caused by receiving zookeeper notifications.
+
+**{scope}/readahead_worker/notification_execution**
+
+OpStats. Stats on executions over the notifications received from zookeeper.
+
+**{scope}/readahead_worker/metadata_reinitialization**
+
+OpStats. Stats on metadata reinitialization after receiving notifications of log segment updates.
+
+**{scope}/readahead_worker/idle_reader_warn**
+
+Counter. It increases each time the readahead worker detects itself becoming 
idle.
+
+BKLogWriteHandler
+-----------------
+
+Log write handlers are responsible for log segment creation/deletion. All the metrics are exposed under
+{scope}/segments.
+
+**{scope}/segments/open**
+
+OpStats. Latency characteristics on starting a new log segment.
+
+**{scope}/segments/close**
+
+OpStats. Latency characteristics on completing an inprogress log segment.
+
+**{scope}/segments/recover**
+
+OpStats. Latency characteristics on recovering a log segment.
+
+**{scope}/segments/delete**
+
+OpStats. Latency characteristics on deleting a log segment.
+
+BKAsyncLogWriter
+----------------
+
+**{scope}/log_writer/write**
+
+OpStats. Latency characteristics of write operations.
+
+**{scope}/log_writer/write/queued**
+
+OpStats. Latency characteristics of the time that write operations spend in the queue.
+A high `{scope}/log_writer/write` latency might be caused by write operations pending
+in the queue for a long time due to log segment rolling.
+
+**{scope}/log_writer/bulk_write**
+
+OpStats. Latency characteristics of bulk_write operations.
+
+**{scope}/log_writer/bulk_write/queued**
+
+OpStats. Latency characteristics of the time that bulk_write operations spend in the queue.
+A high `{scope}/log_writer/bulk_write` latency might be caused by write operations pending
+in the queue for a long time due to log segment rolling.
+
+**{scope}/log_writer/get_writer**
+
+OpStats. The time spent getting the writer. It could spike when log segment rolling
+happens while getting the writer. It is a good stat to look into when the latency is caused by
+queuing time.
+
+**{scope}/log_writer/pending_request_dispatch**
+
+Counter. The number of queued operations dispatched after a log segment is rolled. It is
+a metric measuring how many operations have been queued because of log segment rolling.
+
+BKAsyncLogReader
+----------------
+
+**{scope}/async_reader/future_set**
+
+OpStats. Time spent satisfying futures of read requests. If it is high, it means the caller
+takes a long time processing the results of read requests; the side effect is blocking subsequent reads.
+
+**{scope}/async_reader/schedule**
+
+OpStats. Time spent on scheduling next reads.
+
+**{scope}/async_reader/background_read**
+
+OpStats. Time spent on background reads.
+
+**{scope}/async_reader/read_next_exec**
+
+OpStats. Time spent on executing `reader#readNext()`.
+
+**{scope}/async_reader/time_between_read_next**
+
+OpStats. Time between two consecutive `reader#readNext()` calls. If it is high, it means that
+the caller is slowing down in calling `reader#readNext()`.
+
+**{scope}/async_reader/delay_until_promise_satisfied**
+
+OpStats. Total latency for the read requests.
+
+**{scope}/async_reader/idle_reader_error**
+
+Counter. The number of idle reader errors.
+
+BKDistributedLogManager
+-----------------------
+
+Future Pools
+~~~~~~~~~~~~
+
+The stats about future pools used by writers are exposed under {scope}/writer_future_pool,
+while the stats about future pools used by readers are exposed under {scope}/reader_future_pool.
+See MonitoredFuturePool_ for detailed stats.
+
+Distributed Locks
+~~~~~~~~~~~~~~~~~
+
+The stats about the locks used by writers are exposed under {scope}/lock, while those used by readers
+are exposed under {scope}/read_lock/lock. See DistributedReentrantLock_ for detailed stats.
+
+Log Handlers
+~~~~~~~~~~~~
+
+**{scope}/logsegments**
+
+All basic stats of log handlers are exposed under {scope}/logsegments. See BKLogHandler_ for detailed stats.
+
+**{scope}/segments**
+
+The stats about write log handlers are exposed under {scope}/segments. See BKLogWriteHandler_ for detailed stats.
+
+**{scope}/readahead_worker**
+
+The stats about read log handlers are exposed under {scope}/readahead_worker.
+See BKLogReadHandler_ for detailed stats.
+
+Writers
+~~~~~~~
+
+All writer related metrics are exposed under {scope}/log_writer. See BKAsyncLogWriter_ for detailed stats.
+
+Readers
+~~~~~~~
+
+All reader related metrics are exposed under {scope}/async_reader. See BKAsyncLogReader_ for detailed stats.
+
+BKDistributedLogNamespace
+-------------------------
+
+ZooKeeper Clients
+~~~~~~~~~~~~~~~~~
+
+There are various zookeeper clients created per namespace for different purposes:
+
+**{scope}/dlzk_factory_writer_shared**
+
+Stats about the zookeeper client shared by all DL writers.
+
+**{scope}/dlzk_factory_reader_shared**
+
+Stats about the zookeeper client shared by all DL readers.
+
+**{scope}/bkzk_factory_writer_shared**
+
+Stats about the zookeeper client used by the bookkeeper client that is shared by all DL writers.
+
+**{scope}/bkzk_factory_reader_shared**
+
+Stats about the zookeeper client used by the bookkeeper client that is shared by all DL readers.
+
+See ZooKeeperClient_ for detailed zookeeper stats.
+
+BookKeeper Clients
+~~~~~~~~~~~~~~~~~~
+
+All the bookkeeper client related stats are exposed directly under the current {scope}. See BookKeeperClient_
+for detailed stats.
+
+Utils
+~~~~~
+
+**{scope}/factory/thread_pool**
+
+Stats about the ordered scheduler used by this namespace. See OrderedScheduler_ for detailed stats.
+
+**{scope}/factory/readahead_thread_pool**
+
+Stats about the readahead thread pool executor used by this namespace. See MonitoredScheduledThreadPoolExecutor_
+for detailed stats.
+
+**{scope}/writeLimiter**
+
+Stats about the global write limiter used by this namespace.
+
+DistributedLogManager
+~~~~~~~~~~~~~~~~~~~~~
+
+All the core stats about readers and writers are exposed under the current {scope} via BKDistributedLogManager_.
+
+

http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/latest
----------------------------------------------------------------------
diff --git a/website/docs/latest b/website/docs/latest
new file mode 120000
index 0000000..92a7f82
--- /dev/null
+++ b/website/docs/latest
@@ -0,0 +1 @@
+../../docs
\ No newline at end of file

