http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/bookkeeper.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/bookkeeper.rst b/website/docs/0.4.0-incubating/admin_guide/bookkeeper.rst new file mode 100644 index 0000000..c60fbf1 --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/bookkeeper.rst @@ -0,0 +1,213 @@ +--- +layout: default + +# Top navigation +top-nav-group: admin-guide +top-nav-pos: 6 +top-nav-title: BookKeeper + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-parent: admin-guide +sub-nav-id: bookkeeper +sub-nav-pos: 6 +sub-nav-title: BookKeeper +--- + +.. contents:: BookKeeper + +BookKeeper +========== + +For reliable BookKeeper service, you should deploy BookKeeper in a cluster. + +Run from bookkeeper source +-------------------------- + +The version of BookKeeper that DistributedLog depends on is not the official opensource version. +It is twitter's production version `4.3.4-TWTTR`, which is available in `https://github.com/twitter/bookkeeper`. +We are working actively with BookKeeper community to merge all twitter's changes back to the community. + +The major changes in Twitter's bookkeeper includes: + +- BOOKKEEPER-670_: Long poll reads and LastAddConfirmed piggyback. It is to reduce the tailing read latency. +- BOOKKEEPER-759_: Delay ensemble change if it doesn't break ack quorum constraint. It is to reduce the write latency on bookie failures. +- BOOKKEEPER-757_: Ledger recovery improvements, to reduce the latency on ledger recovery. +- Misc improvements on bookie recovery and bookie storage. + +.. _BOOKKEEPER-670: https://issues.apache.org/jira/browse/BOOKKEEPER-670 +.. _BOOKKEEPER-759: https://issues.apache.org/jira/browse/BOOKKEEPER-759 +.. _BOOKKEEPER-757: https://issues.apache.org/jira/browse/BOOKKEEPER-757 + +To build bookkeeper, run: + +1. First checkout the bookkeeper source code from twitter's branch. + +.. code-block:: bash + + $ git clone https://github.com/twitter/bookkeeper.git bookkeeper + + +2. Build the bookkeeper package: + +.. code-block:: bash + + $ cd bookkeeper + $ mvn clean package assembly:single -DskipTests + +However, since `bookkeeper-server` is one of the dependency of `distributedlog-service`. +You could simply run bookkeeper using same set of scripts provided in `distributedlog-service`. +In the following sections, we will describe how to run bookkeeper using the scripts provided in +`distributedlog-service`. + +Run from distributedlog source +------------------------------ + +Build ++++++ + +First of all, build DistributedLog: + +.. code-block:: bash + + $ mvn clean install -DskipTests + + +Configuration ++++++++++++++ + +The configuration file `bookie.conf` under `distributedlog-service/conf` is a template of production +configuration to run a bookie node. Most of the configuration settings are good for production usage. +You might need to configure following settings according to your environment and hardware platform. + +Port +^^^^ + +By default, the service port is `3181`, where the bookie server listens on. You can change the port +to whatever port you like by modifying the following setting. + +:: + + bookiePort=3181 + + +Disks +^^^^^ + +You need to configure following settings according to the disk layout of your hardware. It is recommended +to put `journalDirectory` under a separated disk from others for performance. It is okay to set +`indexDirectories` to be same as `ledgerDirectories`. However, it is recommended to put `indexDirectories` +to a SSD driver for better performance. + +:: + + # Directory Bookkeeper outputs its write ahead log + journalDirectory=/tmp/data/bk/journal + + # Directory Bookkeeper outputs ledger snapshots + ledgerDirectories=/tmp/data/bk/ledgers + + # Directory in which index files will be stored. + indexDirectories=/tmp/data/bk/ledgers + + +To better understand how bookie nodes work, please check bookkeeper_ website for more details. + +ZooKeeper +^^^^^^^^^ + +You need to configure following settings to point the bookie to the zookeeper server that it is using. +You need to make sure `zkLedgersRootPath` exists before starting the bookies. + +:: + + # Root zookeeper path to store ledger metadata + # This parameter is used by zookeeper-based ledger manager as a root znode to + # store all ledgers. + zkLedgersRootPath=/messaging/bookkeeper/ledgers + # A list of one of more servers on which zookeeper is running. + zkServers=localhost:2181 + + +Stats Provider +^^^^^^^^^^^^^^ + +Bookies use `StatsProvider` to expose its metrics. The `StatsProvider` is a pluggable library to +adopt to various stats collecting systems. Please check monitoring_ for more details. + +.. _monitoring: ./monitoring + +:: + + # stats provide - use `codahale` metrics library + statsProviderClass=org.apache.bookkeeper.stats.CodahaleMetricsServletProvider + + ### Following settings are stats provider related settings + + # Exporting codahale stats in http port `9001` + codahaleStatsHttpPort=9001 + + +Index Settings +^^^^^^^^^^^^^^ + +- `pageSize`: size of a index page in ledger cache, in bytes. If there are large number + of ledgers and each ledger has fewer entries, smaller index page would improve memory usage. +- `pageLimit`: The maximum number of index pages in ledger cache. If nummber of index pages + reaches the limitation, bookie server starts to swap some ledgers from memory to disk. + Increase this value when swap becomes more frequent. But make sure `pageLimit*pageSize` + should not be more than JVM max memory limitation. + + +Journal Settings +^^^^^^^^^^^^^^^^ + +- `journalMaxGroupWaitMSec`: The maximum wait time for group commit. It is valid only when + `journalFlushWhenQueueEmpty` is false. +- `journalFlushWhenQueueEmpty`: Flag indicates whether to flush/sync journal. If it is `true`, + bookie server will sync journal when there is no other writes in the journal queue. +- `journalBufferedWritesThreshold`: The maximum buffered writes for group commit, in bytes. + It is valid only when `journalFlushWhenQueueEmpty` is false. +- `journalBufferedEntriesThreshold`: The maximum buffered writes for group commit, in entries. + It is valid only when `journalFlushWhenQueueEmpty` is false. + +Setting `journalFlushWhenQueueEmpty` to `true` will produce low latency when the traffic is low. +However, the latency varies a lost when the traffic is increased. So it is recommended to set +`journalMaxGroupWaitMSec`, `journalBufferedEntriesThreshold` and `journalBufferedWritesThreshold` +to reduce the number of fsyncs made to journal disk, to achieve sustained low latency. + +Thread Settings +^^^^^^^^^^^^^^^ + +It is recommended to configure following settings to align with the cpu cores of the hardware. + +:: + + numAddWorkerThreads=4 + numJournalCallbackThreads=4 + numReadWorkerThreads=4 + numLongPollWorkerThreads=4 + +Run ++++ + +As `bookkeeper-server` is shipped as part of `distributedlog-service`, you could use the `dlog-daemon.sh` +script to start `bookie` as daemon thread. + +Start the bookie: + +.. code-block:: bash + + $ ./distributedlog-service/bin/dlog-daemon.sh start bookie --conf /path/to/bookie/conf + + +Stop the bookie: + +.. code-block:: bash + + $ ./distributedlog-service/bin/dlog-daemon.sh stop bookie + + +Please check bookkeeper_ website for more details. + +.. _bookkeeper: http://bookkeeper.apache.org/
http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/hardware.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/hardware.rst b/website/docs/0.4.0-incubating/admin_guide/hardware.rst new file mode 100644 index 0000000..ab1dd91 --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/hardware.rst @@ -0,0 +1,138 @@ +--- +layout: default + +# Top navigation +top-nav-group: admin-guide +top-nav-pos: 3 +top-nav-title: Hardware + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-parent: admin-guide +sub-nav-id: hardware +sub-nav-pos: 3 +sub-nav-title: Hardware +--- + +.. contents:: Hardware + +Hardware +======== + +Figure 1 describes the data flow of DistributedLog. Write traffic comes to `Write Proxy` +and the data is replicated in `RF` (replication factor) ways to `BookKeeper`. BookKeeper +stores the replicated data and keeps the data for a given retention period. The data is +read by `Read Proxy` and fanout to readers. + +In such layered architecture, each layer has its own responsibilities and different resource +requirements. It makes the capacity and cost model much clear and users could scale +different layers independently. + +.. figure:: ../images/costmodel.png + :align: center + + Figure 1. DistributedLog Cost Model + +Metrics +~~~~~~~ + +There are different metrics measuring the capability of a service instance in each layer +(e.g a `write proxy` node, a `bookie` storage node, a `read proxy` node and such). These metrics +can be `rps` (requests per second), `bps` (bits per second), `number of streams` that a instance +can support, and latency requirements. `bps` is the best and simple factor on measuring the +capability of current distributedlog architecture. + +Write Proxy +~~~~~~~~~~~ + +Write Proxy (WP) is a stateless serving service that writes and replicates fan-in traffic into BookKeeper. +The capability of a write proxy instance is purely dominated by the *OUTBOUND* network bandwidth, +which is reflected as incoming `Write Throughput` and `Replication Factor`. + +Calculating the capacity of Write Proxy (number of instances of write proxies) is pretty straightforward. +The formula is listed as below. + +:: + + Number of Write Proxies = (Write Throughput) * (Replication Factor) / (Write Proxy Outbound Bandwidth) + +As it is bandwidth bound, we'd recommend using machines that have high network bandwith (e.g 10Gb NIC). + +The cost estimation is also straightforward. + +:: + + Bandwidth TCO ($/day/MB) = (Write Proxy TCO) / (Write Proxy Outbound Bandwidth) + Cost of write proxies = (Write Throughput) * (Replication Factor) / (Bandwidth TCO) + +CPUs +^^^^ + +DistributedLog is not CPU bound. You can run an instance with 8 or 12 cores just fine. + +Memories +^^^^^^^^ + +There's a fair bit of caching. Consider running with at least 8GB of memory. + +Disks +^^^^^ + +This is a stateless process, disk performances are not relevant. + +Network +^^^^^^^ + +Depending on your throughput, you might be better off running this with 10Gb NIC. In this scenario, you can easily achieves 350MBps of writes. + + +BookKeeper +~~~~~~~~~~ + +BookKeeper is the log segment store, which is a stateful service. There are two factors to measure the +capability of a Bookie instance: `bandwidth` and `storage`. The bandwidth is majorly dominated by the +outbound traffic from write proxy, which is `(Write Throughput) * (Replication Factor)`. The storage is +majorly dominated by the traffic and also `Retention Period`. + +Calculating the capacity of BookKeeper (number of instances of bookies) is a bit more complicated than Write +Proxy. The total number of instances is the maximum number of the instances of bookies calculated using +`bandwidth` and `storage`. + +:: + + Number of bookies based on bandwidth = (Write Throughput) * (Replication Factor) / (Bookie Inbound Bandwidth) + Number of bookies based on storage = (Write Throughput) * (Replication Factor) * (Replication Factor) / (Bookie disk space) + Number of bookies = maximum((number of bookies based on bandwidth), (number of bookies based on storage)) + +We should consider both bandwidth and storage when choosing the hardware for bookies. There are several rules to follow: +- A bookie should have multiple disks. +- The number of disks used as journal disks should have similar I/O bandwidth as its *INBOUND* network bandwidth. For example, if you plan to use a disk for journal which I/O bandwidth is around 100MBps, a 1Gb NIC is a better choice than 10Gb NIC. +- The number of disks used as ledger disks should be large enough to hold data if retention period is typical long. + +The cost estimation is straightforward based on the number of bookies estimated above. + +:: + + Cost of bookies = (Number of bookies) * (Bookie TCO) + +Read Proxy +~~~~~~~~~~ + +Similar as Write Proxy, Read Proxy is also dominated by *OUTBOUND* bandwidth, which is reflected as incoming `Write Throughput` and `Fanout Factor`. + +Calculating the capacity of Read Proxy (number of instances of read proxies) is also pretty straightforward. +The formula is listed as below. + +:: + + Number of Read Proxies = (Write Throughput) * (Fanout Factor) / (Read Proxy Outbound Bandwidth) + +As it is bandwidth bound, we'd recommend using machines that have high network bandwith (e.g 10Gb NIC). + +The cost estimation is also straightforward. + +:: + + Bandwidth TCO ($/day/MB) = (Read Proxy TCO) / (Read Proxy Outbound Bandwidth) + Cost of read proxies = (Write Throughput) * (Fanout Factor) / (Bandwidth TCO) + http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/loadtest.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/loadtest.rst b/website/docs/0.4.0-incubating/admin_guide/loadtest.rst new file mode 100644 index 0000000..c7591eb --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/loadtest.rst @@ -0,0 +1,100 @@ +--- +layout: default + +# Top navigation +top-nav-group: admin-guide +top-nav-pos: 2 +top-nav-title: Load Test + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-parent: admin-guide +sub-nav-id: performance +sub-nav-pos: 2 +sub-nav-title: Load Test +--- + +.. contents:: Load Test + +Load Test +========= + +Overview +-------- + +Under distributedlog-benchmark you will find a set of applications intended for generating large amounts of load in a distributedlog cluster. These applications are suitable for load testing, performance testing, benchmarking, or even simply smoke testing a distributedlog cluster. + +The dbench script can run in several modes: + +1. bkwrite - Benchmark the distributedlog write path using the core library + +2. write - Benchmark the distributedlog write path, via write proxy, using the thin client + +3. read - Benchmark the distributedlog read path using the core library + + +Running Dbench +-------------- + +The distributedlog-benchmark binary dbench is intended to be run simultaneously from many machines with identical settings. Together, all instances of dbench comprise a benchmark job. How you launch a benchmark job will depend on your operating environment. We recommend using a cluster scheduler like aurora or kubernetes to simplify the process, but tools like capistrano can also simplify this process greatly. + +The benchmark script can be found at + +:: + + distributedlog-benchmark/bin/dbench + +Arguments may be passed to this script via environment variables. The available arguments depend on the execution mode. For an up to date list, check the script itself. + + +Write to Proxy with Thin Client +------------------------------- + +The proxy write test (mode = `write`) can be used to send writes to a proxy cluster to be written to a set of streams. + +For example to use the proxy write test to generate 10000 requests per second across 10 streams using 50 machines, run the following command on each machine. + +:: + + STREAM_NAME_PREFIX=loadtest_ + BENCHMARK_DURATION=60 # minutes + DL_NAMESPACE=<dl namespace> + NUM_STREAMS=10 + INITIAL_RATE=200 + distributedlog-benchmark/bin/dbench write + + +Write to BookKeeper with Core Library +------------------------------------- + +The core library write test (mode = `bkwrite`) can be used to send writes to directly to bookkeeper using the core library. + +For example to use the core library write test to generate 100MBps across 10 streams using 100 machines, run the following command on each machine. + +:: + + STREAM_NAME_PREFIX=loadtest_ + BENCHMARK_DURATION=60 # minutes + DL_NAMESPACE=<dl namespace> + NUM_STREAMS=10 + INITIAL_RATE=1024 + MSG_SIZE=1024 + distributedlog-benchmark/bin/dbench bkwrite + + +Read from BookKeeper with Core Library +-------------------------------------- + +The core library read test (mode = `read`) can be used to read directly from bookkeeper using the core library. + +For example to use the core library read test to read from 10 streams on 100 instances, run the following command on each machine. + +:: + + STREAM_NAME_PREFIX=loadtest_ + BENCHMARK_DURATION=60 # minutes + DL_NAMESPACE=<dl namespace> + MAX_STREAM_ID=9 + NUM_READERS_PER_STREAM=5 + TRUNCATION_INTERVAL=60 # seconds + distributedlog-benchmark/bin/dbench read http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/main.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/main.rst b/website/docs/0.4.0-incubating/admin_guide/main.rst new file mode 100644 index 0000000..5fb7a3a --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/main.rst @@ -0,0 +1,13 @@ +--- +title: "Admin Guide" +layout: guide + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-id: admin-guide +sub-nav-parent: _root_ +sub-nav-group-title: Admin Guide +sub-nav-pos: 1 +sub-nav-title: Admin Guide + +--- http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/monitoring.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/monitoring.rst b/website/docs/0.4.0-incubating/admin_guide/monitoring.rst new file mode 100644 index 0000000..59645c1 --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/monitoring.rst @@ -0,0 +1,398 @@ +--- +layout: default + +# Top navigation +top-nav-group: admin-guide +top-nav-pos: 4 +top-nav-title: Monitoring + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-parent: admin-guide +sub-nav-id: monitoring +sub-nav-pos: 4 +sub-nav-title: Monitoring +--- + +.. contents:: Monitoring + +Monitoring +========== + +DistributedLog uses the stats library provided by Apache BookKeeper for reporting metrics in +both the server and the client. This can be configured to report stats using pluggable stats +provider to integrate with your monitoring system. + +Stats Provider +~~~~~~~~~~~~~~ + +`StatsProvider` is a provider that provides different kinds of stats logger for different scopes. +The provider is also responsible for reporting its managed metrics. + +:: + + // Create the stats provider + StatsProvider statsProvider = ...; + // Start the stats provider + statsProvider.start(conf); + // Stop the stats provider + statsProvider.stop(); + +Stats Logger +____________ + +A scoped `StatsLogger` is a stats logger that records 3 kinds of statistics +under a given `scope`. + +A `StatsLogger` could be either created by obtaining from stats provider with +the scope name: + +:: + + StatsProvider statsProvider = ...; + StatsLogger statsLogger = statsProvider.scope("test-scope"); + +Or created by obtaining from a stats logger with a sub scope name: + +:: + + StatsLogger rootStatsLogger = ...; + StatsLogger subStatsLogger = rootStatsLogger.scope("sub-scope"); + +All the metrics in a stats provider are managed in a hierarchical of scopes. + +:: + + // all stats recorded by `rootStatsLogger` are under 'root' + StatsLogger rootStatsLogger = statsProvider.scope("root"); + // all stats recorded by 'subStatsLogger1` are under 'root/scope1' + StatsLogger subStatsLogger1 = statsProvider.scope("scope1"); + // all stats recorded by 'subStatsLogger2` are under 'root/scope2' + StatsLogger subStatsLogger2 = statsProvider.scope("scope2"); + +Counters +++++++++ + +A `Counter` is a cumulative metric that represents a single numerical value. A **counter** +is typically used to count requests served, tasks completed, errors occurred, etc. Counters +should not be used to expose current counts of items whose number can also go down, e.g. +the number of currently running tasks. Use `Gauges` for this use case. + +To change a counter, use: + +:: + + StatsLogger statsLogger = ...; + Counter births = statsLogger.getCounter("births"); + // increment the counter + births.inc(); + // decrement the counter + births.dec(); + // change the counter by delta + births.add(-10); + // reset the counter + births.reset(); + +Gauges +++++++ + +A `Gauge` is a metric that represents a single numerical value that can arbitrarily go up and down. + +Gauges are typically used for measured values like temperatures or current memory usage, but also +"counts" that can go up and down, like the number of running tasks. + +To define a gauge, stick the following code somewhere in the initialization: + +:: + + final AtomicLong numPendingRequests = new AtomicLong(0L); + StatsLogger statsLogger = ...; + statsLogger.registerGauge( + "num_pending_requests", + new Gauge<Number>() { + @Override + public Number getDefaultValue() { + return 0; + } + @Override + public Number getSample() { + return numPendingRequests.get(); + } + }); + +The gauge must always return a numerical value when sampling. + +Metrics (OpStats) ++++++++++++++++++ + +A `OpStats` is a set of metrics that represents the statistics of an `operation`. Those metrics +include `success` or `failure` of the operations and its distribution (also known as `Histogram`). +It is usually used for timing. + +:: + + StatsLogger statsLogger = ...; + OpStatsLogger writeStats = statsLogger.getOpStatsLogger("writes"); + long writeLatency = ...; + + // register success op + writeStats.registerSuccessfulEvent(writeLatency); + + // register failure op + writeStats.registerFailedEvent(writeLatency); + +Available Stats Providers +~~~~~~~~~~~~~~~~~~~~~~~~~ + +All the available stats providers are listed as below: + +* Twitter Science Stats (deprecated) +* Twitter Ostrich Stats (deprecated) +* Twitter Finagle Stats +* Codahale Stats + +Twitter Science Stats +_____________________ + +Use following dependency to enable Twitter science stats provider. + +:: + + <dependency> + <groupId>org.apache.bookkeeper.stats</groupId> + <artifactId>twitter-science-provider</artifactId> + <version>${bookkeeper.version}</version> + </dependency> + +Construct the stats provider for clients. + +:: + + StatsProvider statsProvider = new TwitterStatsProvider(); + DistributedLogConfiguration conf = ...; + + // starts the stats provider (optional) + statsProvider.start(conf); + + // all the dl related stats are exposed under "dlog" + StatsLogger statsLogger = statsProvider.getStatsLogger("dlog"); + DistributedLogNamespace namespace = DistributedLogNamespaceBuilder.newBuilder() + .uri(...) + .conf(conf) + .statsLogger(statsLogger) + .build(); + + ... + + // stop the stats provider (optional) + statsProvider.stop(); + + +Expose the stats collected by the stats provider by configuring following settings: + +:: + + // enable exporting the stats + statsExport=true + // exporting the stats at port 8080 + statsHttpPort=8080 + + +If exporting stats is enabled, all the stats are exported by the http endpoint. +You could curl the http endpoint to check the stats. + +:: + + curl -s <host>:8080/vars + + +check ScienceStats_ for more details. + +.. _ScienceStats: https://github.com/twitter/commons/tree/master/src/java/com/twitter/common/stats + +Twitter Ostrich Stats +_____________________ + +Use following dependency to enable Twitter ostrich stats provider. + +:: + + <dependency> + <groupId>org.apache.bookkeeper.stats</groupId> + <artifactId>twitter-ostrich-provider</artifactId> + <version>${bookkeeper.version}</version> + </dependency> + +Construct the stats provider for clients. + +:: + + StatsProvider statsProvider = new TwitterOstrichProvider(); + DistributedLogConfiguration conf = ...; + + // starts the stats provider (optional) + statsProvider.start(conf); + + // all the dl related stats are exposed under "dlog" + StatsLogger statsLogger = statsProvider.getStatsLogger("dlog"); + DistributedLogNamespace namespace = DistributedLogNamespaceBuilder.newBuilder() + .uri(...) + .conf(conf) + .statsLogger(statsLogger) + .build(); + + ... + + // stop the stats provider (optional) + statsProvider.stop(); + + +Expose the stats collected by the stats provider by configuring following settings: + +:: + + // enable exporting the stats + statsExport=true + // exporting the stats at port 8080 + statsHttpPort=8080 + + +If exporting stats is enabled, all the stats are exported by the http endpoint. +You could curl the http endpoint to check the stats. + +:: + + curl -s <host>:8080/stats.txt + + +check Ostrich_ for more details. + +.. _Ostrich: https://github.com/twitter/ostrich + +Twitter Finagle Metrics +_______________________ + +Use following dependency to enable bridging finagle stats receiver to bookkeeper's stats provider. +All the stats exposed by the stats provider will be collected by finagle stats receiver and exposed +by Twitter's admin service. + +:: + + <dependency> + <groupId>org.apache.bookkeeper.stats</groupId> + <artifactId>twitter-finagle-provider</artifactId> + <version>${bookkeeper.version}</version> + </dependency> + +Construct the stats provider for clients. + +:: + + StatsReceiver statsReceiver = ...; // finagle stats receiver + StatsProvider statsProvider = new FinagleStatsProvider(statsReceiver); + DistributedLogConfiguration conf = ...; + + // the stats provider does nothing on start. + statsProvider.start(conf); + + // all the dl related stats are exposed under "dlog" + StatsLogger statsLogger = statsProvider.getStatsLogger("dlog"); + DistributedLogNamespace namespace = DistributedLogNamespaceBuilder.newBuilder() + .uri(...) + .conf(conf) + .statsLogger(statsLogger) + .build(); + + ... + + // the stats provider does nothing on stop. + statsProvider.stop(); + + +check `finagle metrics library`__ for more details on how to expose the stats. + +.. _TwitterServer: https://twitter.github.io/twitter-server/Migration.html + +__ TwitterServer_ + +Codahale Metrics +________________ + +Use following dependency to enable Twitter ostrich stats provider. + +:: + + <dependency> + <groupId>org.apache.bookkeeper.stats</groupId> + <artifactId>codahale-metrics-provider</artifactId> + <version>${bookkeeper.version}</version> + </dependency> + +Construct the stats provider for clients. + +:: + + StatsProvider statsProvider = new CodahaleMetricsProvider(); + DistributedLogConfiguration conf = ...; + + // starts the stats provider (optional) + statsProvider.start(conf); + + // all the dl related stats are exposed under "dlog" + StatsLogger statsLogger = statsProvider.getStatsLogger("dlog"); + DistributedLogNamespace namespace = DistributedLogNamespaceBuilder.newBuilder() + .uri(...) + .conf(conf) + .statsLogger(statsLogger) + .build(); + + ... + + // stop the stats provider (optional) + statsProvider.stop(); + + +Expose the stats collected by the stats provider in different ways by configuring following settings. +Check Codehale_ on how to configuring report endpoints. + +:: + + // How frequent report the stats + codahaleStatsOutputFrequencySeconds=... + // The prefix string of codahale stats + codahaleStatsPrefix=... + + // + // Report Endpoints + // + + // expose the stats to Graphite + codahaleStatsGraphiteEndpoint=... + // expose the stats to CSV files + codahaleStatsCSVEndpoint=... + // expose the stats to Slf4j logging + codahaleStatsSlf4jEndpoint=... + // expose the stats to JMX endpoint + codahaleStatsJmxEndpoint=... + + +check Codehale_ for more details. + +.. _Codehale: https://dropwizard.github.io/metrics/3.1.0/ + +Enable Stats Provider on Bookie Servers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The stats provider used by *Bookie Servers* is configured by setting the following option. + +:: + + // class of stats provider + statsProviderClass="org.apache.bookkeeper.stats.CodahaleMetricsProvider" + +Metrics +~~~~~~~ + +Check the Metrics_ reference page for the metrics exposed by DistributedLog. + +.. _Metrics: ../user_guide/references/metrics http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/operations.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/operations.rst b/website/docs/0.4.0-incubating/admin_guide/operations.rst new file mode 100644 index 0000000..e56c596 --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/operations.rst @@ -0,0 +1,224 @@ +--- +layout: default + +# Top navigation +top-nav-group: admin-guide +top-nav-pos: 1 +top-nav-title: Operations + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-parent: admin-guide +sub-nav-id: operations +sub-nav-pos: 1 +sub-nav-title: Operations +--- + +.. contents:: DistributedLog Operations + +DistributedLog Operations +========================= + +Feature Provider +~~~~~~~~~~~~~~~~ + +DistributedLog uses a `feature-provider` library provided by Apache BookKeeper for managing features +dynamically at runtime. It is a feature-flag_ system used to proportionally control what features +are enabled for the system. In other words, it is a way of altering the control in a system without +restarting it. It can be used during all stages of development, its most visible use case is on +production. For instance, during a production release, you can enable or disable individual features, +control the data flow through the system, thereby minimizing risk of system failure in real time. + +.. _feature-flag: https://en.wikipedia.org/wiki/Feature_toggle + +This `feature-provider` interface is pluggable and easy to integrate with any configuration management +system. + +API +___ + +`FeatureProvider` is a provider that manages features under different scopes. The provider is responsible +for loading features dynamically at runtime. A `Feature` is a numeric flag that control how much percentage +of this feature will be available to the system - the number is called `availability`. + +:: + + Feature.name() => returns the name of this feature + Feature.availability() => returns the availability of this feature + Feature.isAvailable() => returns true if its availability is larger than 0; otherwise false + + +It is easy to obtain a feature from the provider by just providing a feature name. + +:: + + FeatureProvider provider = ...; + Feature feature = provider.getFeature("feature1"); // returns the feature named 'feature1' + + +The `FeatureProvider` is scopable to allow creating features in a hierarchical way. For example, if a system +is comprised of two subsystems, one is *cache*, while the other one is *storage*. so the features belong to +different subsystems can be created under different scopes. + +:: + + FeatureProvider provider = ...; + FeatureProvider cacheFeatureProvider = provider.scope("cache"); + FeatureProvider storageFeatureProvider = provider.scope("storage"); + Feature writeThroughFeature = cacheFeatureProvider.getFeature("write_through"); + Feature duralWriteFeature = storageFeatureProvider.getFeature("dural_write"); + + // so the available features under `provider` are: (assume scopes are separated by '.') + // - 'cache.write_through' + // - 'storage.dural_write' + + +The feature provider could be passed to `DistributedLogNamespaceBuilder` when building the namespace, +thereby it would be used for controlling the features exposed under `DistributedLogNamespace`. + +:: + + FeatureProvider rootProvider = ...; + FeatureProvider dlFeatureProvider = rootProvider.scope("dlog"); + DistributedLogNamespace namespace = DistributedLogNamespaceBuilder.newBuilder() + .uri(uri) + .conf(conf) + .featureProvider(dlFeatureProvider) + .build(); + + +The feature provider is loaded by reflection on distributedlog write proxy server. You could specify +the feature provider class name as below. Otherwise it would use `DefaultFeatureProvider`, which disables +all the features by default. + +:: + + featureProviderClass=org.apache.distributedlog.feature.DynamicConfigurationFeatureProvider + + + +Configuration Based Feature Provider +____________________________________ + +Beside `DefaultFeatureProvider`, distributedlog also provides a file-based feature provider - it loads +the features from properties files. + +All the features and their availabilities are configured in properties file format. For example, + +:: + + cache.write_through=100 + storage.dural_write=0 + + +You could configure `featureProviderClass` in distributedlog configuration file by setting it to +`org.apache.distributedlog.feature.DynamicConfigurationFeatureProvider` to enable file-based feature +provider. The feature provider will load the features from two files, one is base config file configured +by `fileFeatureProviderBaseConfigPath`, while the other one is overlay config file configured by +`fileFeatureProviderOverlayConfigPath`. Current implementation doesn't differentiate these two files +too much other than the `overlay` config will override the settings in `base` config. It is recommended +to have a base config file for storing the default availability values for your system and dynamically +adjust the availability values in overlay config file. + +:: + + featureProviderClass=org.apache.distributedlog.feature.DynamicConfigurationFeatureProvider + fileFeatureProviderBaseConfigPath=/path/to/base/config + fileFeatureProviderOverlayConfigPath=/path/to/overlay/config + // how frequent we reload the config files + dynamicConfigReloadIntervalSec=60 + + +Available Features +__________________ + +Check the Features_ reference page for the features exposed by DistributedLog. + +.. _Features: ../user_guide/references/features + +`dlog` +~~~~~~ + +A CLI is provided for inspecting DistributedLog streams and metadata. + +.. code:: bash + + dlog + JMX enabled by default + Usage: dlog <command> + where command is one of: + local Run distributedlog sandbox + example Run distributedlog example + tool Run distributedlog tool + proxy_tool Run distributedlog proxy tool to interact with proxies + balancer Run distributedlog balancer + admin Run distributedlog admin tool + help This help message + + or command is the full name of a class with a defined main() method. + + Environment variables: + DLOG_LOG_CONF Log4j configuration file (default $HOME/src/distributedlog/distributedlog-service/conf/log4j.properties) + DLOG_EXTRA_OPTS Extra options to be passed to the jvm + DLOG_EXTRA_CLASSPATH Add extra paths to the dlog classpath + +These variable can also be set in conf/dlogenv.sh + +Create a stream +_______________ + +To create a stream: + +.. code:: bash + + dlog tool create -u <DL URI> -r <STREAM PREFIX> -e <STREAM EXPRESSION> + + +List the streams +________________ + +To list all the streams under a given DistributedLog namespace: + +.. code:: bash + + dlog tool list -u <DL URI> + +Show stream's information +_________________________ + +To view the metadata associated with a stream: + +.. code:: bash + + dlog tool show -u <DL URI> -s <STREAM NAME> + + +Dump a stream +_____________ + +To dump the items inside a stream: + +.. code:: bash + + dlog tool dump -u <DL URI> -s <STREAM NAME> -o <START TXN ID> -l <NUM RECORDS> + +Delete a stream +_______________ + +To delete a stream, run: + +.. code:: bash + + dlog tool delete -u <DL URI> -s <STREAM NAME> + + +Truncate a stream +_________________ + +Truncate the streams under a given DistributedLog namespace. You could specify a filter to match the streams that you want to truncate. + +There is a difference between the ``truncate`` and ``delete`` command. When you issue a ``truncate``, the data will be purge without removing the streams. A ``delete`` will delete the stream. You can pass the flag ``-delete`` to the ``truncate`` command to also delete the streams. + +.. code:: bash + + dlog tool truncate -u <DL URI> http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/performance.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/performance.rst b/website/docs/0.4.0-incubating/admin_guide/performance.rst new file mode 100644 index 0000000..fa36660 --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/performance.rst @@ -0,0 +1,22 @@ +--- +layout: default + +# Top navigation +top-nav-group: admin-guide +top-nav-pos: 2 +top-nav-title: Performance Tuning + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-parent: admin-guide +sub-nav-id: performance +sub-nav-pos: 2 +sub-nav-title: Performance Tuning +--- + +.. contents:: Performance Tuning + +Performance Tuning +================== + +(describe how to tune performance, critical settings) http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/vagrant.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/vagrant.rst b/website/docs/0.4.0-incubating/admin_guide/vagrant.rst new file mode 100644 index 0000000..e4ed574 --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/vagrant.rst @@ -0,0 +1,18 @@ +Sample Deployment using vagrant +================================ + +This file explains vagrant deployment. + +Prerequesites +-------------- +1. Vagrant: From https://www.vagrantup.com/downloads.html +2. vagrant-hostmanager plugin: From https://github.com/devopsgroup-io/vagrant-hostmanager + +Steps +----- +1. Create a snapshot using ./scripts/snapshot +2. Run vagrant up from the root directory of the enlistment +3. Vagrant brings up a zookeepers with machine names: zk1,zk2, .... with IP addresses 192.168.50.11,192.168.50.12,.... +4. Vagrant brings the bookies with machine names: bk1,bk2, ... with IP addresses 192.168.50.51,192.168.50.52,.... +5. The script will also start writeproxies at distributedlog://$PUBLIC_ZOOKEEPER_ADDRESSES/messaging/distributedlog/mynamespace as the namespace. If you want it to point to a different namespace/port/shard, please update the ./vagrant/bk.sh script. +6. If you want to run the client on the host machine, please add these node names and their IP addresses to the /etc/hosts file on the host. http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/admin_guide/zookeeper.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/admin_guide/zookeeper.rst b/website/docs/0.4.0-incubating/admin_guide/zookeeper.rst new file mode 100644 index 0000000..0fee9de --- /dev/null +++ b/website/docs/0.4.0-incubating/admin_guide/zookeeper.rst @@ -0,0 +1,104 @@ +--- +layout: default + +# Top navigation +top-nav-group: admin-guide +top-nav-pos: 5 +top-nav-title: ZooKeeper + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-parent: admin-guide +sub-nav-id: zookeeper +sub-nav-pos: 5 +sub-nav-title: ZooKeeper +--- + +ZooKeeper +========= + +To run a DistributedLog ensemble, you'll need a set of Zookeeper +nodes. There is no constraints on the number of Zookeeper nodes you +need. One node is enough to run your cluster, but for reliability +purpose, you should run at least 3 nodes. + +Version +------- + +DistributedLog leverages zookeepr `multi` operations for metadata updates. +So the minimum version of zookeeper is 3.4.*. We recommend to run stable +zookeeper version `3.4.8`. + +Run ZooKeeper from distributedlog source +---------------------------------------- + +Since `zookeeper` is one of the dependency of `distributedlog-service`. You could simply +run `zookeeper` servers using same set of scripts provided in `distributedlog-service`. +In the following sections, we will describe how to run zookeeper using the scripts provided +in `distributedlog-service`. + +Build ++++++ + +First of all, build DistributedLog: + +.. code-block:: bash + + $ mvn clean install -DskipTests + +Configuration ++++++++++++++ + +The configuration file `zookeeper.conf.template` under `distributedlog-service/conf` is a template of +production configuration to run a zookeeper node. Most of the configuration settings are good for +production usage. You might need to configure following settings according to your environment and +hardware platform. + +Ensemble +^^^^^^^^ + +You need to configure the zookeeper servers form this ensemble as below: + +:: + + server.1=127.0.0.1:2710:3710:participant;0.0.0.0:2181 + + +Please check zookeeper_ website for more configurations. + +Disks +^^^^^ + +You need to configure following settings according to the disk layout of your hardware. +It is recommended to put `dataLogDir` under a separated disk from others for performance. + +:: + + # the directory where the snapshot is stored. + dataDir=/tmp/data/zookeeper + + # where txlog are written + dataLogDir=/tmp/data/zookeeper/txlog + + +Run ++++ + +As `zookeeper` is shipped as part of `distributedlog-service`, you could use the `dlog-daemon.sh` +script to start `zookeeper` as daemon thread. + +Start the zookeeper: + +.. code-block:: bash + + $ ./distributedlog-service/bin/dlog-daemon.sh start zookeeper /path/to/zookeeper.conf + +Stop the zookeeper: + +.. code-block:: bash + + $ ./distributedlog-service/bin/dlog-daemon.sh stop zookeeper + +Please check zookeeper_ website for more details. + +.. _zookeeper: http://zookeeper.apache.org/ http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/basics/introduction.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/basics/introduction.rst b/website/docs/0.4.0-incubating/basics/introduction.rst new file mode 100644 index 0000000..13ee120 --- /dev/null +++ b/website/docs/0.4.0-incubating/basics/introduction.rst @@ -0,0 +1,138 @@ +--- +# Top navigation +top-nav-group: user-guide +top-nav-pos: 1 +top-nav-title: Introduction + +# Sub-level navigation +sub-nav-group: user-guide +sub-nav-parent: user-guide +sub-nav-id: introduction +sub-nav-pos: 1 +sub-nav-title: Introduction + +layout: default +--- + +.. contents:: DistributedLog Overview + +Introduction +============ + +DistributedLog (DL) is a high performance replicated log service. +It offers durability, replication and strong consistency, which provides a fundamental building block +for building reliable distributed systems, e.g replicated-state-machines, general pub/sub systems, +distributed databases, distributed queues and etc. + +DistributedLog maintains sequences of records in categories called *Logs* (aka *Log Streams*). +The processes that write records to a DL log are *writers*, while the processes that read +from logs and process the records are *readers*. + + +.. figure:: ../images/softwarestack.png + :align: center + + Figure 1. DistributedLog Software Stack + +Logs +---- + +A **log** is an ordered, immutable sequence of *log records*. + +.. figure:: ../images/datamodel.png + :align: center + + Figure 2. Anatomy of a log stream + +Log Records +~~~~~~~~~~~ + +Each **log record** is a sequence of bytes. +**Log records** are written sequentially into a *log stream*, and will be assigned with +a unique sequence number *called* **DLSN** (DistributedLog Sequence Number). Besides *DLSN*, +applications could assign its own sequence number while constructing log records. The +application defined sequence number is called **TransactionID** (*txid*). Either *DLSN* +or *TransactionID* could be used for positioning readers to start reading from a specific +*log record*. + +Log Segments +~~~~~~~~~~~~ + +A **log** is broken down into *segments*, which each log segment contains its subset of +records. **Log segments** are distributed and stored in a log segment store (e.g Apache BookKeeper). +DistributedLog rolls the log segments based on configured rolling policy - either a configurable +period of time (e.g. every 2 hours) or a configurable maximum size (e.g. every 128MB). +So the data of logs will be divided into equal-sized *log segments* and distributed evenly +across log segment storage nodes. It allows the log to scale beyond a size that will fit on +a single server and also spread read traffic among the cluster. + +The data of logs will either be kept forever until application *explicitly* truncates or be retained +for a configurable period of time. **Explicit Truncation** is useful for building replicated +state machines such as distributed databases. They usually require strong controls over when +the data could be truncated. **Time-based Retention** is useful for real-time analytics. They only +care about the data within a period of time. + +Namespaces +~~~~~~~~~~ + +The *log streams* belong to same organization are usually categorized and managed under +a **namespace**. A DL **namespace** is basically for applications to locate where the +*log streams* are. Applications could *create* and *delete* streams under a namespace, +and also be able to *truncate* a stream to given sequence number (either *DLSN* or *TransactionID*). + +Writers +------- + +Writers write data into the logs of their choice. All the records are +appended into the logs in order. The sequencing is done by the writer, +which means there is only one active writer for a log at a given time. +DL guarantees correctness when two writers attempt writing to +to a same log when network partition happens - via fencing mechanism +in log segment store. + +The log writers are served and managed in a service tier called *Write Proxy*. +The *Write Proxy* is used for accepting fan-in writes from large number +of clients. Details on **Fan-in and Fan-out** can be found further into this doc. + +Readers +------- + +Readers read records from the logs of their choice, starting from a provided +position. The provided position could be either *DLSN* or *TransactionID*. +The readers will read records in strict order from the logs. Different readers +could read records starting from different positions in a same log. + +Unlike other pub/sub systems, DistributedLog doesn't record/manage readers' positions. +It leaves the tracking responsibility to applications, as different applications +might have different requirements on tracking and coordinating positions. It is hard +to get it right with a single approach. For example, distributed databases might store +the reader positions along with SSTables, so they would resume applying transactions +from the positions stored in SSTables. Tracking reader positions could easily be done +in application level using various stores (e.g. ZooKeeper, file system, or key/value stores). + +The log records could be cached in a service tier called *Read Proxy*, to serve +a large number of readers. + +Fan-in and Fan-out +------------------ + +The core of DistributedLog supports single-writer, multiple-readers semantics. The service layer +built on top of the *DistributedLog Core* to support large scale of number of writers and readers. +The service layer includes **Write Proxy** and **Read Proxy**. **Write Proxy** manages +the writers of logs and fail over them when machines are failed. It allows supporting +which don't care about the log ownership by aggregating writes from many sources (aka *Fan-in*). +**Read Proxy** optimize reader path by caching log records in cases where hundreds or +thousands of readers are consuming a same log stream. + +Guarantees +---------- + +At a high level, DistributedLog gives the following guarantees: + +* Records written by a writer to a log will be appended in the order they are written. That is, if a record *R1* is written by same writer as a record *R2*, *R1* will have a smaller sequence number than *R2*. +* Readers see records in the same order they were written to the log. +* All records are persisted on disk before acknowledgments, to gurantee durability. +* For a log with replication factor of N, DistributedLog tolerates up to N-1 server failures without losing any records. + +More details on these guarantees are given in the [design section](http://distributedlog.incubator.apache.org/docs/0.4.0-incubating/user_guide/design/main.html). + http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/css/main.scss ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/css/main.scss b/website/docs/0.4.0-incubating/css/main.scss new file mode 100644 index 0000000..f2e566e --- /dev/null +++ b/website/docs/0.4.0-incubating/css/main.scss @@ -0,0 +1,53 @@ +--- +# Only the main Sass file needs front matter (the dashes are enough) +--- +@charset "utf-8"; + + + +// Our variables +$base-font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; +$base-font-size: 16px; +$base-font-weight: 400; +$small-font-size: $base-font-size * 0.875; +$base-line-height: 1.5; + +$spacing-unit: 30px; + +$text-color: #111; +$background-color: #fdfdfd; +$brand-color: #2a7ae2; + +$grey-color: #828282; +$grey-color-light: lighten($grey-color, 40%); +$grey-color-dark: darken($grey-color, 25%); + +// Width of the content area +$content-width: 800px; + +$on-palm: 600px; +$on-laptop: 800px; + + + +// Use media queries like this: +// @include media-query($on-palm) { +// .wrapper { +// padding-right: $spacing-unit / 2; +// padding-left: $spacing-unit / 2; +// } +// } +@mixin media-query($device) { + @media screen and (max-width: $device) { + @content; + } +} + + + +// Import partials from `sass_dir` (defaults to `_sass`) +@import + "base", + "layout", + "syntax-highlighting" +; http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/css/theme.css ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/css/theme.css b/website/docs/0.4.0-incubating/css/theme.css new file mode 100644 index 0000000..c0dd5a4 --- /dev/null +++ b/website/docs/0.4.0-incubating/css/theme.css @@ -0,0 +1,21 @@ +body { + padding-top: 70px; + padding-bottom: 30px; + font-family: 'Roboto', sans-serif; +} + +.theme-dropdown .dropdown-menu { + position: static; + display: block; + margin-bottom: 20px; +} + +.theme-showcase > p > .btn { + margin: 5px 0; +} + +.theme-showcase .navbar .container { + width: auto; +} + +@import url(https://fonts.googleapis.com/css?family=Roboto:400,300); http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/deployment/cluster.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/deployment/cluster.rst b/website/docs/0.4.0-incubating/deployment/cluster.rst new file mode 100644 index 0000000..b4caf3f --- /dev/null +++ b/website/docs/0.4.0-incubating/deployment/cluster.rst @@ -0,0 +1,561 @@ +--- +title: Cluster Setup + +top-nav-group: deployment +top-nav-pos: 1 +top-nav-title: Cluster Setup + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-id: cluster-deployment +sub-nav-parent: admin-guide +sub-nav-group-title: Cluster Setup +sub-nav-pos: 1 +sub-nav-title: Cluster Setup + +layout: default +--- + +.. contents:: This page provides instructions on how to run **DistributedLog** in a fully distributed fashion. + +Cluster Setup & Deployment +========================== + +This section describes how to run DistributedLog in `distributed` mode. +To run a cluster with DistributedLog, you need a Zookeeper cluster and a Bookkeeper cluster. + +Build +----- + +To build DistributedLog, run: + +.. code-block:: bash + + mvn clean install -DskipTests + + +Or run `./scripts/snapshot` to build the release packages from current source. The released +packages contain the binaries for running `distributedlog-service`, `distributedlog-benchmark` +and `distributedlog-tutorials`. + +NOTE: we run the following instructions from distributedlog source code after +running `mvn clean install`. And assume `DL_HOME` is the directory of +distributedlog source. + +Zookeeper +--------- + +(If you already have a zookeeper cluster running, you could skip this section.) + +We could use the `dlog-daemon.sh` and the `zookeeper.conf.template` to demonstrate run a 1-node +zookeeper ensemble locally. + +Create a `zookeeper.conf` from the `zookeeper.conf.template`. + +.. code-block:: bash + + $ cp distributedlog-service/conf/zookeeper.conf.template distributedlog-service/conf/zookeeper.conf + +Configure the settings in `zookeeper.conf`. By default, it will use `/tmp/data/zookeeper` for storing +the zookeeper data. Let's create the data directories for zookeeper. + +.. code-block:: bash + + $ mkdir -p /tmp/data/zookeeper/txlog + +Once the data directory is created, we need to assign `myid` for this zookeeper node. + +.. code-block:: bash + + $ echo "1" > /tmp/data/zookeeper/myid + +Start the zookeeper daemon using `dlog-daemon.sh`. + +.. code-block:: bash + + $ ./distributedlog-service/bin/dlog-daemon.sh start zookeeper ${DL_HOME}/distributedlog-service/conf/zookeeper.conf + +You could verify the zookeeper setup using `zkshell`. + +.. code-block:: bash + + // ./distributedlog-service/bin/dlog zkshell ${zkservers} + $ ./distributedlog-service/bin/dlog zkshell localhost:2181 + Connecting to localhost:2181 + Welcome to ZooKeeper! + JLine support is enabled + + WATCHER:: + + WatchedEvent state:SyncConnected type:None path:null + [zk: localhost:2181(CONNECTED) 0] ls / + [zookeeper] + [zk: localhost:2181(CONNECTED) 1] + +Please refer to the `ZooKeeper Guide`_ for more details on setting up zookeeper cluster. + +.. _ZooKeeper Guide: ../admin_guide/zookeeper + +Bookkeeper +---------- + +(If you already have a bookkeeper cluster running, you could skip this section.) + +We could use the `dlog-daemon.sh` and the `bookie.conf.template` to demonstrate run a 3-nodes +bookkeeper cluster locally. + +Create a `bookie.conf` from the `bookie.conf.template`. Since we are going to run a 3-nodes +bookkeeper cluster locally. Let's make three copies of `bookie.conf.template`. + +.. code-block:: bash + + $ cp distributedlog-service/conf/bookie.conf.template distributedlog-service/conf/bookie-1.conf + $ cp distributedlog-service/conf/bookie.conf.template distributedlog-service/conf/bookie-2.conf + $ cp distributedlog-service/conf/bookie.conf.template distributedlog-service/conf/bookie-3.conf + +Configure the settings in the bookie configuraiont files. + +First of all, choose the zookeeper cluster that the bookies will use and set `zkServers` in +the configuration files. + +:: + + zkServers=localhost:2181 + + +Choose the zookeeper path to store bookkeeper metadata and set `zkLedgersRootPath` in the configuration +files. Let's use `/messaging/bookkeeper/ledgers` in this instruction. + +:: + + zkLedgersRootPath=/messaging/bookkeeper/ledgers + + +Format bookkeeper metadata +++++++++++++++++++++++++++ + +(NOTE: only format bookkeeper metadata when first time setting up the bookkeeper cluster.) + +The bookkeeper shell doesn't automatically create the `zkLedgersRootPath` when running `metaformat`. +So using `zkshell` to create the `zkLedgersRootPath`. + +:: + + $ ./distributedlog-service/bin/dlog zkshell localhost:2181 + Connecting to localhost:2181 + Welcome to ZooKeeper! + JLine support is enabled + + WATCHER:: + + WatchedEvent state:SyncConnected type:None path:null + [zk: localhost:2181(CONNECTED) 0] create /messaging '' + Created /messaging + [zk: localhost:2181(CONNECTED) 1] create /messaging/bookkeeper '' + Created /messaging/bookkeeper + [zk: localhost:2181(CONNECTED) 2] create /messaging/bookkeeper/ledgers '' + Created /messaging/bookkeeper/ledgers + [zk: localhost:2181(CONNECTED) 3] + + +If the `zkLedgersRootPath`, run `metaformat` to format the bookkeeper metadata. + +:: + + $ BOOKIE_CONF=${DL_HOME}/distributedlog-service/conf/bookie-1.conf ./distributedlog-service/bin/dlog bkshell metaformat + Are you sure to format bookkeeper metadata ? (Y or N) Y + +Add Bookies ++++++++++++ + +Once the bookkeeper metadata is formatted, it is ready to add bookie nodes to the cluster. + +Configure Ports +^^^^^^^^^^^^^^^ + +Configure the ports that used by bookies. + +bookie-1: + +:: + + # Port that bookie server listen on + bookiePort=3181 + # Exporting codahale stats + 185 codahaleStatsHttpPort=9001 + +bookie-2: + +:: + + # Port that bookie server listen on + bookiePort=3182 + # Exporting codahale stats + 185 codahaleStatsHttpPort=9002 + +bookie-3: + +:: + + # Port that bookie server listen on + bookiePort=3183 + # Exporting codahale stats + 185 codahaleStatsHttpPort=9003 + +Configure Disk Layout +^^^^^^^^^^^^^^^^^^^^^ + +Configure the disk directories used by a bookie server by setting following options. + +:: + + # Directory Bookkeeper outputs its write ahead log + journalDirectory=/tmp/data/bk/journal + # Directory Bookkeeper outputs ledger snapshots + ledgerDirectories=/tmp/data/bk/ledgers + # Directory in which index files will be stored. + indexDirectories=/tmp/data/bk/ledgers + +As we are configuring a 3-nodes bookkeeper cluster, we modify the following settings as below: + +bookie-1: + +:: + + # Directory Bookkeeper outputs its write ahead log + journalDirectory=/tmp/data/bk-1/journal + # Directory Bookkeeper outputs ledger snapshots + ledgerDirectories=/tmp/data/bk-1/ledgers + # Directory in which index files will be stored. + indexDirectories=/tmp/data/bk-1/ledgers + +bookie-2: + +:: + + # Directory Bookkeeper outputs its write ahead log + journalDirectory=/tmp/data/bk-2/journal + # Directory Bookkeeper outputs ledger snapshots + ledgerDirectories=/tmp/data/bk-2/ledgers + # Directory in which index files will be stored. + indexDirectories=/tmp/data/bk-2/ledgers + +bookie-3: + +:: + + # Directory Bookkeeper outputs its write ahead log + journalDirectory=/tmp/data/bk-3/journal + # Directory Bookkeeper outputs ledger snapshots + ledgerDirectories=/tmp/data/bk-3/ledgers + # Directory in which index files will be stored. + indexDirectories=/tmp/data/bk-3/ledgers + +Format bookie +^^^^^^^^^^^^^ + +Once the disk directories are configured correctly in the configuration file, use +`bkshell bookieformat` to format the bookie. + +:: + + BOOKIE_CONF=${DL_HOME}/distributedlog-service/conf/bookie-1.conf ./distributedlog-service/bin/dlog bkshell bookieformat + BOOKIE_CONF=${DL_HOME}/distributedlog-service/conf/bookie-2.conf ./distributedlog-service/bin/dlog bkshell bookieformat + BOOKIE_CONF=${DL_HOME}/distributedlog-service/conf/bookie-3.conf ./distributedlog-service/bin/dlog bkshell bookieformat + + +Start bookie +^^^^^^^^^^^^ + +Start the bookie using `dlog-daemon.sh`. + +:: + + SERVICE_PORT=3181 ./distributedlog-service/bin/dlog-daemon.sh start bookie --conf ${DL_HOME}/distributedlog-service/conf/bookie-1.conf + SERVICE_PORT=3182 ./distributedlog-service/bin/dlog-daemon.sh start bookie --conf ${DL_HOME}/distributedlog-service/conf/bookie-2.conf + SERVICE_PORT=3183 ./distributedlog-service/bin/dlog-daemon.sh start bookie --conf ${DL_HOME}/distributedlog-service/conf/bookie-3.conf + +Verify whether the bookie is setup correctly. You could simply check whether the bookie is showed up in +zookeeper `zkLedgersRootPath`/available znode. + +:: + + $ ./distributedlog-service/bin/dlog zkshell localhost:2181 + Connecting to localhost:2181 + Welcome to ZooKeeper! + JLine support is enabled + + WATCHER:: + + WatchedEvent state:SyncConnected type:None path:null + [zk: localhost:2181(CONNECTED) 0] ls /messaging/bookkeeper/ledgers/available + [127.0.0.1:3181, 127.0.0.1:3182, 127.0.0.1:3183, readonly] + [zk: localhost:2181(CONNECTED) 1] + + +Or check if the bookie is exposing the stats at port `codahaleStatsHttpPort`. + +:: + + // ping the service + $ curl localhost:9001/ping + pong + // checking the stats + curl localhost:9001/metrics?pretty=true + +Stop bookie +^^^^^^^^^^^ + +Stop the bookie using `dlog-daemon.sh`. + +:: + + $ ./distributedlog-service/bin/dlog-daemon.sh stop bookie + // Example: + $ SERVICE_PORT=3181 ./distributedlog-service/bin/dlog-daemon.sh stop bookie + doing stop bookie ... + stopping bookie + Shutdown is in progress... Please wait... + Shutdown completed. + +Turn bookie to readonly +^^^^^^^^^^^^^^^^^^^^^^^ + +Start the bookie in `readonly` mode. + +:: + + $ SERVICE_PORT=3181 ./distributedlog-service/bin/dlog-daemon.sh start bookie --conf ${DL_HOME}/distributedlog-service/conf/bookie-1.conf --readonly + +Verify if the bookie is running in `readonly` mode. + +:: + + $ ./distributedlog-service/bin/dlog zkshell localhost:2181 + Connecting to localhost:2181 + Welcome to ZooKeeper! + JLine support is enabled + + WATCHER:: + + WatchedEvent state:SyncConnected type:None path:null + [zk: localhost:2181(CONNECTED) 0] ls /messaging/bookkeeper/ledgers/available + [127.0.0.1:3182, 127.0.0.1:3183, readonly] + [zk: localhost:2181(CONNECTED) 1] ls /messaging/bookkeeper/ledgers/available/readonly + [127.0.0.1:3181] + [zk: localhost:2181(CONNECTED) 2] + +Please refer to the `BookKeeper Guide`_ for more details on setting up bookkeeper cluster. + +.. _BookKeeper Guide: ../admin_guide/bookkeeper + +Create Namespace +---------------- + +After setting up a zookeeper cluster and a bookkeeper cluster, you could provision DL namespaces +for applications to use. + +Provisioning a DistributedLog namespace is accomplished via the `bind` command available in `dlog tool`. + +Namespace is bound by writing bookkeeper environment settings (e.g. the ledger path, bkLedgersZkPath, +or the set of Zookeeper servers used by bookkeeper, bkZkServers) as metadata in the zookeeper path of +the namespace DL URI. The DL library resolves the DL URI to determine which bookkeeper cluster it +should read and write to. + +The namespace binding has following features: + +- `Inheritance`: suppose `distributedlog://<zkservers>/messaging/distributedlog` is bound to bookkeeper + cluster `X`. All the streams created under `distributedlog://<zkservers>/messaging/distributedlog`, + will write to bookkeeper cluster `X`. +- `Override`: suppose `distributedlog://<zkservers>/messaging/distributedlog` is bound to bookkeeper + cluster `X`. You want streams under `distributedlog://<zkservers>/messaging/distributedlog/S` write + to bookkeeper cluster `Y`. You could just bind `distributedlog://<zkservers>/messaging/distributedlog/S` + to bookkeeper cluster `Y`. The binding to `distributedlog://<zkservers>/messaging/distributedlog/S` + only affects streams under `distributedlog://<zkservers>/messaging/distributedlog/S`. + +Create namespace binding using `dlog tool`. For example, we create a namespace +`distributedlog://127.0.0.1:2181/messaging/distributedlog/mynamespace` pointing to the +bookkeeper cluster we just created above. + +:: + + $ distributedlog-service/bin/dlog admin bind \\ + -dlzr 127.0.0.1:2181 \\ + -dlzw 127.0.0.1:2181 \\ + -s 127.0.0.1:2181 \\ + -bkzr 127.0.0.1:2181 \\ + -l /messaging/bookkeeper/ledgers \\ + -i false \\ + -r true \\ + -c \\ + distributedlog://127.0.0.1:2181/messaging/distributedlog/mynamespace + + No bookkeeper is bound to distributedlog://127.0.0.1:2181/messaging/distributedlog/mynamespace + Created binding on distributedlog://127.0.0.1:2181/messaging/distributedlog/mynamespace. + + +- Configure the zookeeper cluster used for storing DistributedLog metadata: `-dlzr` and `-dlzw`. + Ideally `-dlzr` and `-dlzw` would be same the zookeeper server in distributedlog namespace uri. + However to scale zookeeper reads, the zookeeper observers sometimes are added in a different + domain name than participants. In such case, configuring `-dlzr` and `-dlzw` to different + zookeeper domain names would help isolating zookeeper write and read traffic. +- Configure the zookeeper cluster used by bookkeeper for storing the metadata : `-bkzr` and `-s`. + Similar as `-dlzr` and `-dlzw`, you could configure the namespace to use different zookeeper + domain names for readers and writers to access bookkeeper metadatadata. +- Configure the bookkeeper ledgers path: `-l`. +- Configure the zookeeper path to store DistributedLog metadata. It is implicitly included as part + of namespace URI. + +Write Proxy +----------- + +A write proxy consists of multiple write proxies. They don't store any state locally. So they are +mostly stateless and can be run as many as you can. + +Configuration ++++++++++++++ + +Different from bookkeeper, DistributedLog tries not to configure any environment related settings +in configuration files. Any environment related settings are stored and configured via `namespace binding`. +The configuration file should contain non-environment related settings. + +There is a `write_proxy.conf` template file available under `distributedlog-service` module. + +Run write proxy ++++++++++++++++ + +A write proxy could be started using `dlog-daemon.sh` script under `distributedlog-service`. + +:: + + WP_SHARD_ID=${WP_SHARD_ID} WP_SERVICE_PORT=${WP_SERVICE_PORT} WP_STATS_PORT=${WP_STATS_PORT} ./distributedlog-service/bin/dlog-daemon.sh start writeproxy + +- `WP_SHARD_ID`: A non-negative integer. You don't need to guarantee uniqueness of shard id, as it is just an + indicator to the client for routing the requests. If you are running the `write proxy` using a cluster scheduler + like `aurora`, you could easily obtain a shard id and use that to configure `WP_SHARD_ID`. +- `WP_SERVICE_PORT`: The port that write proxy listens on. +- `WP_STATS_PORT`: The port that write proxy exposes stats to a http endpoint. + +Please check `distributedlog-service/conf/dlogenv.sh` for more environment variables on configuring write proxy. + +- `WP_CONF_FILE`: The path to the write proxy configuration file. +- `WP_NAMESPACE`: The distributedlog namespace that the write proxy is serving for. + +For example, we start 3 write proxies locally and point to the namespace created above. + +:: + + $ WP_SHARD_ID=1 WP_SERVICE_PORT=4181 WP_STATS_PORT=20001 ./distributedlog-service/bin/dlog-daemon.sh start writeproxy + $ WP_SHARD_ID=2 WP_SERVICE_PORT=4182 WP_STATS_PORT=20002 ./distributedlog-service/bin/dlog-daemon.sh start writeproxy + $ WP_SHARD_ID=3 WP_SERVICE_PORT=4183 WP_STATS_PORT=20003 ./distributedlog-service/bin/dlog-daemon.sh start writeproxy + +The write proxy will announce itself to the zookeeper path `.write_proxy` under the dl namespace path. + +We could verify that the write proxy is running correctly by checking the zookeeper path or checking its stats port. + +:: + + $ ./distributedlog-service/bin/dlog zkshell localhost:2181 + Connecting to localhost:2181 + Welcome to ZooKeeper! + JLine support is enabled + + WATCHER:: + + WatchedEvent state:SyncConnected type:None path:null + [zk: localhost:2181(CONNECTED) 0] ls /messaging/distributedlog/mynamespace/.write_proxy + [member_0000000000, member_0000000001, member_0000000002] + + +:: + + $ curl localhost:20001/ping + pong + + +Add and Remove Write Proxies +++++++++++++++++++++++++++++ + +Removing a write proxy is pretty straightforward by just killing the process. + +:: + + WP_SHARD_ID=1 WP_SERVICE_PORT=4181 WP_STATS_PORT=10001 ./distributedlog-service/bin/dlog-daemon.sh stop writeproxy + + +Adding a new write proxy is just adding a new host and starting the write proxy +process as described above. + +Write Proxy Naming +++++++++++++++++++ + +The `dlog-daemon.sh` script starts the write proxy by announcing it to the `.write_proxy` path under +the dl namespace. So you could use uri in the distributedlog client builder to access the write proxy cluster. + +Verify the setup +++++++++++++++++ + +You could verify the write proxy cluster by running tutorials over the setup cluster. + +Create 10 streams. + +:: + + $ ./distributedlog-service/bin/dlog tool create -u distributedlog://127.0.0.1:2181/messaging/distributedlog/mynamespace -r stream- -e 0-10 + You are going to create streams : [stream-0, stream-1, stream-2, stream-3, stream-4, stream-5, stream-6, stream-7, stream-8, stream-9, stream-10] (Y or N) Y + + +Tail read from the 10 streams. + +:: + + $ ./distributedlog-tutorials/distributedlog-basic/bin/runner run org.apache.distributedlog.basic.MultiReader distributedlog://127.0.0.1:2181/messaging/distributedlog/mynamespace stream-0,stream-1,stream-2,stream-3,stream-4,stream-5,stream-6,stream-7,stream-8,stream-9,stream-10 + + +Run record generator over some streams + +:: + + $ ./distributedlog-tutorials/distributedlog-basic/bin/runner run org.apache.distributedlog.basic.RecordGenerator 'zk!127.0.0.1:2181!/messaging/distributedlog/mynamespace/.write_proxy' stream-0 100 + $ ./distributedlog-tutorials/distributedlog-basic/bin/runner run org.apache.distributedlog.basic.RecordGenerator 'zk!127.0.0.1:2181!/messaging/distributedlog/mynamespace/.write_proxy' stream-1 100 + + +Check the terminal running `MultiReader`. You will see similar output as below: + +:: + + """ + Received record DLSN{logSegmentSequenceNo=1, entryId=21044, slotId=0} from stream stream-0 + """ + record-1464085079105 + """ + Received record DLSN{logSegmentSequenceNo=1, entryId=21046, slotId=0} from stream stream-0 + """ + record-1464085079113 + """ + Received record DLSN{logSegmentSequenceNo=1, entryId=9636, slotId=0} from stream stream-1 + """ + record-1464085079110 + """ + Received record DLSN{logSegmentSequenceNo=1, entryId=21048, slotId=0} from stream stream-0 + """ + record-1464085079125 + """ + Received record DLSN{logSegmentSequenceNo=1, entryId=9638, slotId=0} from stream stream-1 + """ + record-1464085079121 + """ + Received record DLSN{logSegmentSequenceNo=1, entryId=21050, slotId=0} from stream stream-0 + """ + record-1464085079133 + """ + Received record DLSN{logSegmentSequenceNo=1, entryId=9640, slotId=0} from stream stream-1 + """ + record-1464085079130 + """ + + + +Please refer to the Performance_ page for more details on tuning performance. + +.. _Performance: ../admin_guide/performance http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/deployment/docker.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/deployment/docker.rst b/website/docs/0.4.0-incubating/deployment/docker.rst new file mode 100644 index 0000000..d9fd87a --- /dev/null +++ b/website/docs/0.4.0-incubating/deployment/docker.rst @@ -0,0 +1,49 @@ +--- +title: Docker +top-nav-group: deployment +top-nav-pos: 2 +top-nav-title: Docker +layout: default +--- + +.. contents:: This page provides instructions on how to deploy **DistributedLog** using docker. + +Docker Setup +============ + +Prerequesites +------------- +1. Docker + +Steps +----- +1. Create a snapshot using + +.. code-block:: bash + + ./scripts/snapshot + + +2. Create your own docker image using + +.. code-block:: bash + + docker build -t <your image name> . + + +3. You can run the docker container using + +.. code-block:: bash + + docker run -e ZK_SERVERS=<zk server list> -e DEPLOY_BK=<true|false> -e DEPLOY_WP=<true|false> <your image name> + + +Environment variables +--------------------- + +Following are the environment variables which can change how the docker container runs. + +1. **ZK_SERVERS**: ZK servers running exernally (the container does not run a zookeeper) +2. **DEPLOY_BOTH**: Deploys writeproxies as well as the bookies +3. **DEPLOY_WP**: Flag to notify that a writeproxy needs to be deployed +4. **DEPLOY_BK**: Flag to notify that a bookie needs to be deployed http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/deployment/global-cluster.rst ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/deployment/global-cluster.rst b/website/docs/0.4.0-incubating/deployment/global-cluster.rst new file mode 100644 index 0000000..a8c1523 --- /dev/null +++ b/website/docs/0.4.0-incubating/deployment/global-cluster.rst @@ -0,0 +1,113 @@ +--- +title: Global Cluster Setup + +top-nav-group: deployment +top-nav-pos: 1 +top-nav-title: Global Cluster Setup + +# Sub-level navigation +sub-nav-group: admin-guide +sub-nav-id: cluster-deployment +sub-nav-parent: admin-guide +sub-nav-group-title: Global Cluster Setup +sub-nav-pos: 1 +sub-nav-title: Global Cluster Setup + +layout: default +--- + +.. contents:: This page provides instructions on how to run **DistributedLog** across multiple regions. + + +Cluster Setup & Deployment +========================== + +Setting up `globally replicated DistributedLog <../user_guide/globalreplicatedlog/main>`_ is very similar to setting up local DistributedLog. +The most important change is use a ZooKeeper cluster configured across multiple-regions. Once set up, DistributedLog +and BookKeeper are configured to use the global ZK cluster for all metadata storage, and the system will more or +less work. The remaining steps are necessary to ensure things like durability in the face of total region failure. + +The key differences with standard cluster setup are summarized below: + +- The zookeeper cluster must be running across all of the target regions, say A, B, C. + +- Region aware placement policy and a few other options must be configured in DL config. + +- DistributedLog clients should be configured to talk to all regions. + +We elaborate on these steps in the following sections. + + +Global Zookeeper +---------------- + +When defining your server and participant lists in zookeeper configuration, a sufficient number of nodes from each +region must be included. + +Please consult the ZooKeeper documentation for detailed Zookeeper setup instructions. + + +DistributedLog Configuration +---------------------------- + +In multi-region DistributedLog several DL config changes are needed. + +Placement Policy +++++++++++++++++ + +The region-aware placement policy must be configured. Below, it is configured to place replicas across 3 regions, A, B, and C. + +:: + + # placement policy + bkc.ensemblePlacementPolicy=org.apache.bookkeeper.client.RegionAwareEnsemblePlacementPolicy + bkc.reppRegionsToWrite=A;B;C + bkc.reppMinimumRegionsForDurability=2 + bkc.reppEnableDurabilityEnforcementInReplace=true + bkc.reppEnableValidation=true + +Connection Timeouts ++++++++++++++++++++ + +In global replicated mode, the proxy nodes will be writing to the entire ensemble, which exists in multiple regions. +If cross-region latency is higher than local region latency (i.e. if its truly cross-region) then it is advisable to +use a higher BookKeeper client connection tiemout. + +:: + + # setting connect timeout to 1 second for global cluster + bkc.connectTimeoutMillis=1000 + +Quorum Size ++++++++++++ + +It is advisable to run with a larger ensemble to ensure cluster health in the event of region loss (the ensemble +will be split across all regions). + +The values of these settings will depend on your operational and durability requirements. + +:: + + ensemble-size=9 + write-quorum-size=9 + ack-quorum-size=5 + +Client Configuration +-------------------- + +Although not required, it is recommended to configure the write client to use all available regions. Several methods +in DistributedLogClientBuilder can be used to achieve this. + +.. code-block:: java + + DistributedLogClientBuilder.serverSets + DistributedLogClientBuilder.finagleNameStrs + + +Additional Steps +================ + +Other clients settings may need to be tuned - for example in the write client, timeouts will likely need to be +increased. + +Aside from this however, cluster setup is exactly the same as `single region setup <cluster>`_. http://git-wip-us.apache.org/repos/asf/incubator-distributedlog/blob/3469fc87/website/docs/0.4.0-incubating/fonts/bootstrap/glyphicons-halflings-regular.eot ---------------------------------------------------------------------- diff --git a/website/docs/0.4.0-incubating/fonts/bootstrap/glyphicons-halflings-regular.eot b/website/docs/0.4.0-incubating/fonts/bootstrap/glyphicons-halflings-regular.eot new file mode 100755 index 0000000..b93a495 Binary files /dev/null and b/website/docs/0.4.0-incubating/fonts/bootstrap/glyphicons-halflings-regular.eot differ