con…

mauzhang Wed, 18 Jan 2017 23:22:26 -0800

[GEARPUMP-213] Fix EditOnGitHub link and rename docs/docs to docs/conâ¦


â¦tents

Author: manuzhang <[email protected]>

Closes #135 from manuzhang/doc.


Project: http://git-wip-us.apache.org/repos/asf/incubator-gearpump/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-gearpump/commit/5f90b70f
Tree: http://git-wip-us.apache.org/repos/asf/incubator-gearpump/tree/5f90b70f
Diff: http://git-wip-us.apache.org/repos/asf/incubator-gearpump/diff/5f90b70f

Branch: refs/heads/master
Commit: 5f90b70f98a5aec68be5e6cbd46805916281d3f2
Parents: ab640f7
Author: manuzhang <[email protected]>
Authored: Thu Jan 19 15:20:51 2017 +0800
Committer: manuzhang <[email protected]>
Committed: Thu Jan 19 15:21:01 2017 +0800

----------------------------------------------------------------------
 docs/build_doc.sh                               |    6 +-
 docs/contents/api/java.md                       |    1 +
 docs/contents/api/scala.md                      |    1 +
 .../deployment/deployment-configuration.md      |   84 ++
 docs/contents/deployment/deployment-docker.md   |    5 +
 docs/contents/deployment/deployment-ha.md       |   75 ++
 docs/contents/deployment/deployment-local.md    |   34 +
 .../deployment/deployment-msg-delivery.md       |   60 +
 .../deployment/deployment-resource-isolation.md |  112 ++
 docs/contents/deployment/deployment-security.md |   80 ++
 .../deployment/deployment-standalone.md         |   59 +
 .../deployment/deployment-ui-authentication.md  |  290 +++++
 docs/contents/deployment/deployment-yarn.md     |  135 +++
 .../deployment/get-gearpump-distribution.md     |   83 ++
 .../contents/deployment/hardware-requirement.md |   30 +
 docs/contents/dev/dev-connectors.md             |  237 ++++
 docs/contents/dev/dev-custom-serializer.md      |  137 +++
 docs/contents/dev/dev-ide-setup.md              |   29 +
 docs/contents/dev/dev-non-streaming-example.md  |  133 +++
 docs/contents/dev/dev-rest-api.md               | 1083 ++++++++++++++++++
 docs/contents/dev/dev-storm.md                  |  214 ++++
 docs/contents/dev/dev-write-1st-app.md          |  370 ++++++
 docs/contents/img/actor_hierarchy.png           |  Bin 0 -> 109855 bytes
 docs/contents/img/checkpoint_equation.png       |  Bin 0 -> 1663 bytes
 .../img/checkpoint_interval_equation.png        |  Bin 0 -> 16637 bytes
 docs/contents/img/checkpointing.png             |  Bin 0 -> 104471 bytes
 docs/contents/img/checkpointing_interval.png    |  Bin 0 -> 44284 bytes
 docs/contents/img/clock.png                     |  Bin 0 -> 33147 bytes
 docs/contents/img/dag.png                       |  Bin 0 -> 18263 bytes
 docs/contents/img/dashboard.gif                 |  Bin 0 -> 152314 bytes
 docs/contents/img/dashboard.png                 |  Bin 0 -> 29434 bytes
 docs/contents/img/dashboard_3.png               |  Bin 0 -> 42684 bytes
 docs/contents/img/download.jpg                  |  Bin 0 -> 3999 bytes
 docs/contents/img/dynamic.png                   |  Bin 0 -> 40091 bytes
 docs/contents/img/exact.png                     |  Bin 0 -> 104471 bytes
 docs/contents/img/failures.png                  |  Bin 0 -> 95078 bytes
 docs/contents/img/flow_control.png              |  Bin 0 -> 61928 bytes
 docs/contents/img/flowcontrol.png               |  Bin 0 -> 61928 bytes
 docs/contents/img/ha.png                        |  Bin 0 -> 47152 bytes
 docs/contents/img/kafka_wordcount.png           |  Bin 0 -> 14520 bytes
 docs/contents/img/layout.png                    |  Bin 0 -> 126947 bytes
 docs/contents/img/logo.png                      |  Bin 0 -> 2053 bytes
 docs/contents/img/logo.svg                      |   71 ++
 docs/contents/img/logo2.png                     |  Bin 0 -> 7970 bytes
 docs/contents/img/messageLoss.png               |  Bin 0 -> 37166 bytes
 docs/contents/img/netty_transport.png           |  Bin 0 -> 65010 bytes
 docs/contents/img/replay.png                    |  Bin 0 -> 37255 bytes
 docs/contents/img/shuffle.png                   |  Bin 0 -> 23550 bytes
 docs/contents/img/storm_gearpump_cluster.png    |  Bin 0 -> 30930 bytes
 docs/contents/img/storm_gearpump_dag.png        |  Bin 0 -> 54000 bytes
 docs/contents/img/submit.png                    |  Bin 0 -> 32954 bytes
 docs/contents/img/submit2.png                   |  Bin 0 -> 51933 bytes
 docs/contents/img/through_vs_message_size.png   |  Bin 0 -> 20965 bytes
 docs/contents/index.md                          |   35 +
 docs/contents/introduction/basic-concepts.md    |   46 +
 docs/contents/introduction/commandline.md       |   84 ++
 docs/contents/introduction/features.md          |   67 ++
 .../contents/introduction/gearpump-internals.md |  228 ++++
 docs/contents/introduction/message-delivery.md  |   47 +
 .../contents/introduction/performance-report.md |   34 +
 .../introduction/submit-your-1st-application.md |   39 +
 docs/docs/api/java.md                           |    1 -
 docs/docs/api/scala.md                          |    1 -
 .../docs/deployment/deployment-configuration.md |   84 --
 docs/docs/deployment/deployment-docker.md       |    5 -
 docs/docs/deployment/deployment-ha.md           |   75 --
 docs/docs/deployment/deployment-local.md        |   34 -
 docs/docs/deployment/deployment-msg-delivery.md |   60 -
 .../deployment/deployment-resource-isolation.md |  112 --
 docs/docs/deployment/deployment-security.md     |   80 --
 docs/docs/deployment/deployment-standalone.md   |   59 -
 .../deployment/deployment-ui-authentication.md  |  290 -----
 docs/docs/deployment/deployment-yarn.md         |  135 ---
 .../deployment/get-gearpump-distribution.md     |   83 --
 docs/docs/deployment/hardware-requirement.md    |   30 -
 docs/docs/dev/dev-connectors.md                 |  237 ----
 docs/docs/dev/dev-custom-serializer.md          |  137 ---
 docs/docs/dev/dev-ide-setup.md                  |   29 -
 docs/docs/dev/dev-non-streaming-example.md      |  133 ---
 docs/docs/dev/dev-rest-api.md                   | 1083 ------------------
 docs/docs/dev/dev-storm.md                      |  214 ----
 docs/docs/dev/dev-write-1st-app.md              |  370 ------
 docs/docs/img/actor_hierarchy.png               |  Bin 109855 -> 0 bytes
 docs/docs/img/checkpoint_equation.png           |  Bin 1663 -> 0 bytes
 docs/docs/img/checkpoint_interval_equation.png  |  Bin 16637 -> 0 bytes
 docs/docs/img/checkpointing.png                 |  Bin 104471 -> 0 bytes
 docs/docs/img/checkpointing_interval.png        |  Bin 44284 -> 0 bytes
 docs/docs/img/clock.png                         |  Bin 33147 -> 0 bytes
 docs/docs/img/dag.png                           |  Bin 18263 -> 0 bytes
 docs/docs/img/dashboard.gif                     |  Bin 152314 -> 0 bytes
 docs/docs/img/dashboard.png                     |  Bin 29434 -> 0 bytes
 docs/docs/img/dashboard_3.png                   |  Bin 42684 -> 0 bytes
 docs/docs/img/download.jpg                      |  Bin 3999 -> 0 bytes
 docs/docs/img/dynamic.png                       |  Bin 40091 -> 0 bytes
 docs/docs/img/exact.png                         |  Bin 104471 -> 0 bytes
 docs/docs/img/failures.png                      |  Bin 95078 -> 0 bytes
 docs/docs/img/flow_control.png                  |  Bin 61928 -> 0 bytes
 docs/docs/img/flowcontrol.png                   |  Bin 61928 -> 0 bytes
 docs/docs/img/ha.png                            |  Bin 47152 -> 0 bytes
 docs/docs/img/kafka_wordcount.png               |  Bin 14520 -> 0 bytes
 docs/docs/img/layout.png                        |  Bin 126947 -> 0 bytes
 docs/docs/img/logo.png                          |  Bin 2053 -> 0 bytes
 docs/docs/img/logo.svg                          |   71 --
 docs/docs/img/logo2.png                         |  Bin 7970 -> 0 bytes
 docs/docs/img/messageLoss.png                   |  Bin 37166 -> 0 bytes
 docs/docs/img/netty_transport.png               |  Bin 65010 -> 0 bytes
 docs/docs/img/replay.png                        |  Bin 37255 -> 0 bytes
 docs/docs/img/shuffle.png                       |  Bin 23550 -> 0 bytes
 docs/docs/img/storm_gearpump_cluster.png        |  Bin 30930 -> 0 bytes
 docs/docs/img/storm_gearpump_dag.png            |  Bin 54000 -> 0 bytes
 docs/docs/img/submit.png                        |  Bin 32954 -> 0 bytes
 docs/docs/img/submit2.png                       |  Bin 51933 -> 0 bytes
 docs/docs/img/through_vs_message_size.png       |  Bin 20965 -> 0 bytes
 docs/docs/index.md                              |   35 -
 docs/docs/introduction/basic-concepts.md        |   46 -
 docs/docs/introduction/commandline.md           |   84 --
 docs/docs/introduction/features.md              |   67 --
 docs/docs/introduction/gearpump-internals.md    |  228 ----
 docs/docs/introduction/message-delivery.md      |   47 -
 docs/docs/introduction/performance-report.md    |   34 -
 .../introduction/submit-your-1st-application.md |   39 -
 docs/mkdocs.yml                                 |    2 +
 122 files changed, 3909 insertions(+), 3905 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/build_doc.sh
----------------------------------------------------------------------
diff --git a/docs/build_doc.sh b/docs/build_doc.sh
index 2c30372..a3e70de 100755
--- a/docs/build_doc.sh
+++ b/docs/build_doc.sh
@@ -64,8 +64,10 @@ export BUILD_API=$2
 # render file templates
 echo "Rendering file templates using mustache..."
 TEMP_DIR="tmp"
-rm -rf $TEMP_DIR
-copy_dir docs $TEMP_DIR
+if [ -d "$TEMP_DIR" ]; then
+       rm -rf "$TEMP_DIR"
+fi
+copy_dir contents $TEMP_DIR
 render_files version.yml "$TEMP_DIR/introduction $TEMP_DIR/dev 
$TEMP_DIR/deployment $TEMP_DIR/api $TEMP_DIR/index.md"
 
 # generate site documents

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/api/java.md
----------------------------------------------------------------------
diff --git a/docs/contents/api/java.md b/docs/contents/api/java.md
new file mode 100644
index 0000000..3b94f91
--- /dev/null
+++ b/docs/contents/api/java.md
@@ -0,0 +1 @@
+Placeholder

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/api/scala.md
----------------------------------------------------------------------
diff --git a/docs/contents/api/scala.md b/docs/contents/api/scala.md
new file mode 100644
index 0000000..3b94f91
--- /dev/null
+++ b/docs/contents/api/scala.md
@@ -0,0 +1 @@
+Placeholder

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-configuration.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-configuration.md 
b/docs/contents/deployment/deployment-configuration.md
new file mode 100644
index 0000000..1dadbd7
--- /dev/null
+++ b/docs/contents/deployment/deployment-configuration.md
@@ -0,0 +1,84 @@
+## Master and Worker configuration
+
+Master and Worker daemons will only read configuration from `conf/gear.conf`.
+
+Master reads configuration from section master and gearpump:
+
+       :::bash
+       master {
+       }
+       gearpump{
+       }
+
+
+Worker reads configuration from section worker and gearpump:
+
+       :::bash
+       worker {
+       }
+       gearpump{
+       }
+       
+
+## Configuration for user submitted application job
+
+For user application job, it will read configuration file `gear.conf` and 
`application.conf` from classpath, while `application.conf` has higher priority.
+The default classpath contains:
+
+1. `conf/`
+2. current working directory.
+
+For example, you can put a `application.conf` on your working directory, and 
then it will be effective when you submit a new job application.
+
+## Logging
+
+To change the log level, you need to change both `gear.conf`, and 
`log4j.properties`.
+
+### To change the log level for master and worker daemon
+
+Please change `log4j.rootLevel` in `log4j.properties`, 
`gearpump-master.akka.loglevel` and `gearpump-worker.akka.loglevel` in 
`gear.conf`.
+
+### To change the log level for application job
+
+Please change `log4j.rootLevel` in `log4j.properties`, and `akka.loglevel` in 
`gear.conf` or `application.conf`.
+
+## Gearpump Default Configuration
+
+This is the default configuration for `gear.conf`.
+
+| config item    | default value  | description      |
+| -------------- | -------------- | ---------------- |
+| gearpump.hostname | "127.0.0.1" | hostname of current machine. If you are 
using local mode, then set this to 127.0.0.1. If you are using cluster mode, 
make sure this hostname can be accessed by other machines. |
+| gearpump.cluster.masters | ["127.0.0.1:3000"] | Config to set the master 
nodes of the cluster. If there are multiple master in the list, then the master 
nodes runs in HA mode. For example, you may start three master, on node1: 
`bin/master -ip node1 -port 3000`, on node2: `bin/master -ip node2 -port 3000`, 
on node3: `bin/master -ip node3 -port 3000`, then you need to set  
`gearpump.cluster.masters = ["node1:3000","node2:3000","node3:3000"]` |
+| gearpump.task-dispatcher | "gearpump.shared-thread-pool-dispatcher" | 
default dispatcher for task actor |
+| gearpump.metrics.enabled | true | flag to enable the metrics system |
+| gearpump.metrics.sample-rate | 1 | We will take one sample every 
`gearpump.metrics.sample-rate` data points. Note it may have impact that the 
statistics on UI portal is not accurate. Change it to 1 if you want accurate 
metrics in UI |
+| gearpump.metrics.report-interval-ms | 15000 | we will report once every 15 
seconds |
+| gearpump.metrics.reporter  | "akka" | available value: "graphite", "akka", 
"logfile" which write the metrics data to different places. |
+| gearpump.retainHistoryData.hours | 72 | max hours of history data to retain, 
Note: Due to implementation limitation(we store all history in memory), please 
don't set this to too big which may exhaust memory. |
+| gearpump.retainHistoryData.intervalMs | 3600000 |  time interval between two 
data points for history data (unit: ms). Usually this is set to a big value so 
that we only store coarse-grain data |
+| gearpump.retainRecentData.seconds | 300 | max seconds of recent data to 
retain. This is for the fine-grain data |
+| gearpump.retainRecentData.intervalMs | 15000 | time interval between two 
data points for recent data (unit: ms) |
+| gearpump.log.daemon.dir | "logs" | The log directory for daemon 
processes(relative to current working directory) |
+| gearpump.log.application.dir | "logs" | The log directory for 
applications(relative to current working directory) |
+| gearpump.serializers | a map | custom serializer for streaming application, 
e.g. `"scala.Array" = ""` |
+| gearpump.worker.slots | 1000 | How many slots each worker contains |
+| gearpump.appmaster.vmargs | "-server  -Xss1M -XX:+HeapDumpOnOutOfMemoryError 
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseParNewGC 
-XX:NewRatio=3 -Djava.rmi.server.hostname=localhost" | JVM arguments for 
AppMaster |
+| gearpump.appmaster.extraClasspath | "" | JVM default class path for 
AppMaster |
+| gearpump.executor.vmargs | "-server -Xss1M -XX:+HeapDumpOnOutOfMemoryError 
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseParNewGC 
-XX:NewRatio=3  -Djava.rmi.server.hostname=localhost" | JVM arguments for 
executor |
+| gearpump.executor.extraClasspath | "" | JVM default class path for executor |
+| gearpump.jarstore.rootpath | "jarstore/" |   Define where the submitted jar 
file will be stored. This path follows the hadoop path schema. For HDFS, use 
`hdfs://host:port/path/`, and HDFS HA, `hdfs://namespace/path/`; if you want to 
store on master nodes, then use local directory. `jarstore.rootpath = 
"jarstore/"` will point to relative directory where master is started. 
`jarstore.rootpath = "/jarstore/"` will point to absolute directory on master 
server |
+| gearpump.scheduling.scheduler-class 
|"org.apache.gearpump.cluster.scheduler.PriorityScheduler" | Class to schedule 
the applications. |
+| gearpump.services.host | "127.0.0.1" | dashboard UI host address |
+| gearpump.services.port | 8090 | dashboard UI host port |
+| gearpump.netty.buffer-size | 5242880 | netty connection buffer size |
+| gearpump.netty.max-retries | 30 | maximum number of retries for a netty 
client to connect to remote host |
+| gearpump.netty.base-sleep-ms | 100 | base sleep time for a netty client to 
retry a connection. Actual sleep time is a multiple of this value |
+| gearpump.netty.max-sleep-ms | 1000 | maximum sleep time for a netty client 
to retry a connection |
+| gearpump.netty.message-batch-size | 262144 | netty max batch size |
+| gearpump.netty.flush-check-interval | 10 | max flush interval for the netty 
layer, in milliseconds |
+| gearpump.netty.dispatcher | "gearpump.shared-thread-pool-dispatcher" | 
default dispatcher for netty client and server |
+| gearpump.shared-thread-pool-dispatcher | default Dispatcher with 
"fork-join-executor" | default shared thread pool dispatcher |
+| gearpump.single-thread-dispatcher | PinnedDispatcher | default single thread 
dispatcher |
+| gearpump.serialization-framework | 
"org.apache.gearpump.serializer.FastKryoSerializationFramework" | Gearpump has 
built-in serialization framework using Kryo. Users are allowed to use a 
different serialization framework, like Protobuf. See 
`org.apache.gearpump.serializer.FastKryoSerializationFramework` to find how a 
custom serialization framework can be defined |
+| worker.executor-share-same-jvm-as-worker | false | whether the executor 
actor is started in the same jvm(process) from which running the worker actor, 
the intention of this setting is for the convenience of single machine 
debugging, however, the app jar need to be added to the worker's classpath when 
you set it true and have a 'real' worker in the cluster |
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-docker.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-docker.md 
b/docs/contents/deployment/deployment-docker.md
new file mode 100644
index 0000000..c71ed9d
--- /dev/null
+++ b/docs/contents/deployment/deployment-docker.md
@@ -0,0 +1,5 @@
+## Gearpump Docker Container
+
+There is pre-built docker container available at [Docker 
Repo](https://hub.docker.com/r/gearpump/gearpump/)
+
+Check the documents there to find how to launch a Gearpump cluster in one line.

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-ha.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-ha.md 
b/docs/contents/deployment/deployment-ha.md
new file mode 100644
index 0000000..9e907c0
--- /dev/null
+++ b/docs/contents/deployment/deployment-ha.md
@@ -0,0 +1,75 @@
+To support HA, we allow to start master on multiple nodes. They will form a 
quorum to decide consistency. For example, if we start master on 5 nodes and 2 
nodes are down, then the cluster is still consistent and functional.
+
+Here are the steps to enable the HA mode:
+
+### 1. Configure.
+
+#### Select master machines
+
+Distribute the package to all nodes. Modify `conf/gear.conf` on all nodes. You 
MUST configure
+
+       :::bash
+       gearpump.hostname
+
+to make it point to your hostname(or ip), and
+
+       :::bash
+       gearpump.cluster.masters
+       
+to a list of master nodes. For example, if I have 3 master nodes (node1, 
node2, and node3),  then the `gearpump.cluster.masters` can be set as
+
+       :::bash
+       gearpump.cluster {
+         masters = ["node1:3000", "node2:3000", "node3:3000"]
+       }
+       
+
+#### Configure distributed storage to store application jars.
+In `conf/gear.conf`, For entry `gearpump.jarstore.rootpath`, please choose the 
storage folder for application jars. You need to make sure this jar storage is 
highly available. We support two storage systems:
+
+  1). HDFS
+  
+  You need to configure the `gearpump.jarstore.rootpath` like this
+
+       :::bash
+    hdfs://host:port/path/
+
+
+  For HDFS HA,
+  
+       :::bash
+   hdfs://namespace/path/
+
+
+  2). Shared NFS folder
+  
+  First you need to map the NFS directory to local directory(same path) on all 
machines of master nodes.
+Then you need to set the `gearpump.jarstore.rootpath` like this:
+
+       :::bash
+       file:///your_nfs_mapping_directory
+
+
+  3). If you don't set this value, we will use the local directory of master 
node.
+  NOTE! There is no HA guarantee in this case, which means we are unable to 
recover running applications when master goes down.
+
+### 2. Start Daemon.
+
+On node1, node2, node3, Start Master
+
+       :::bash
+       ## on node1
+       bin/master -ip node1 -port 3000
+
+       ## on node2
+       bin/master -ip node2 -port 3000
+
+       ## on node3
+       bin/master -ip node3 -port 3000
+  
+
+### 3. Done!
+
+Now you have a highly available HA cluster. You can kill any node, the master 
HA will take effect.
+
+**NOTE**: It can take up to 15 seconds for master node to fail-over. You can 
change the fail-over timeout time by adding config in `gear.conf` 
`gearpump-master.akka.cluster.auto-down-unreachable-after=10s` or set it to a 
smaller value

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-local.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-local.md 
b/docs/contents/deployment/deployment-local.md
new file mode 100644
index 0000000..81e6029
--- /dev/null
+++ b/docs/contents/deployment/deployment-local.md
@@ -0,0 +1,34 @@
+You can start the Gearpump service in a single JVM(local mode), or in a 
distributed cluster(cluster mode). To start the cluster in local mode, you can 
use the local /local.bat helper scripts, it is very useful for developing or 
troubleshooting.
+
+Below are the steps to start a Gearpump service in **Local** mode:
+
+### Step 1: Get your Gearpump binary ready
+To get your Gearpump service running in local mode, you first need to have a 
Gearpump distribution binary ready.
+Please follow [this guide](get-gearpump-distribution) to have the binary.  
+
+### Step 2: Start the cluster
+You can start a local mode cluster in single line
+
+       :::bash
+       ## start the master and 2 workers in single JVM. The master will listen 
on 3000
+       ## you can Ctrl+C to kill the local cluster after you finished the 
startup tutorial.
+       bin/local
+       
+
+**NOTE:** You may need to execute `chmod +x bin/*` in shell to make the script 
file `local` executable.
+
+**NOTE:** You can change the default port by changing config 
`gearpump.cluster.masters` in `conf/gear.conf`.
+
+**NOTE: Change the working directory**. Log files by default will be generated 
under current working directory. So, please "cd" to required working directly 
before running the shell commands.
+
+**NOTE: Run as Daemon**. You can run it as a background process. For example, 
use [nohup](http://linux.die.net/man/1/nohup) on Linux.
+
+### Step 3: Start the Web UI server
+Open another shell,
+
+       :::bash
+       bin/services
+       
+You can manage the applications in UI 
[http://127.0.0.1:8090](http://127.0.0.1:8090) or by [Command Line 
tool](../introduction/commandline).
+The default username and password is "admin:admin", you can check
+[UI Authentication](../deployment/deployment-ui-authentication) to find how to 
manage users.

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-msg-delivery.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-msg-delivery.md 
b/docs/contents/deployment/deployment-msg-delivery.md
new file mode 100644
index 0000000..53e8c2e
--- /dev/null
+++ b/docs/contents/deployment/deployment-msg-delivery.md
@@ -0,0 +1,60 @@
+## How to deploy for At Least Once Message Delivery?
+
+As introduced in the [What is At Least Once Message 
Delivery](../introduction/message-delivery#what-is-at-least-once-message-delivery),
 Gearpump has a built in KafkaSource. To get at least once message delivery, 
users should deploy a Kafka cluster as the offset store along with the Gearpump 
cluster. 
+
+Here's an example to deploy a local Kafka cluster. 
+
+1. download the latest Kafka from the official website and extract to a local 
directory (`$KAFKA_HOME`)
+
+2. Boot up the single-node Zookeeper instance packaged with Kafka. 
+
+       :::bash
+       $KAFKA_HOME/bin/zookeeper-server-start.sh 
$KAFKA_HOME/config/zookeeper.properties
+    
+ 
+3. Start a Kafka broker
+
+           :::bash
+           $KAFKA_HOME/bin/kafka-server-start.sh 
$KAFKA_HOME/config/kafka.properties
+             
+
+4. When creating a offset store for `KafkaSource`, set the zookeeper connect 
string to `localhost:2181` and broker list to `localhost:9092` in 
`KafkaStorageFactory`.
+
+           :::scala
+           val offsetStorageFactory = new 
KafkaStorageFactory("localhost:2181", "localhost:9092")
+           val source = new KafkaSource("topic1", "localhost:2181", 
offsetStorageFactory)
+           
+
+## How to deploy for Exactly Once Message Delivery?
+
+Exactly Once Message Delivery requires both an offset store and a checkpoint 
store. For the offset store, a Kafka cluster should be deployed as in the 
previous section. As for the checkpoint store, Gearpump has built-in support 
for Hadoop file systems, like HDFS. Hence, users should deploy a HDFS cluster 
alongside the Gearpump cluster. 
+
+Here's an example to deploy a local HDFS cluster.
+
+1. download Hadoop 2.6 from the official website and extracts it to a local 
directory `HADOOP_HOME`
+
+2. add following configuration to `$HADOOP_HOME/etc/core-site.xml`
+
+           :::xml
+           <configuration>
+             <property>
+               <name>fs.defaultFS</name>
+               <value>hdfs://localhost:9000</value>
+             </property>
+           </configuration>
+           
+
+3. start HDFS
+
+           :::bash
+           $HADOOP_HOME/sbin/start-dfs.sh
+           
+   
+4. When creating a `HadoopCheckpointStore`, set the hadoop configuration as in 
the `core-site.xml`
+
+               :::scala   
+       val hadoopConfig = new Configuration
+       hadoopConfig.set("fs.defaultFS", "hdfs://localhost:9000")
+       val checkpointStoreFactory = new 
HadoopCheckpointStoreFactory("MessageCount", hadoopConfig, new 
FileSizeRotation(1000))
+
+    
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-resource-isolation.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-resource-isolation.md 
b/docs/contents/deployment/deployment-resource-isolation.md
new file mode 100644
index 0000000..ee47802
--- /dev/null
+++ b/docs/contents/deployment/deployment-resource-isolation.md
@@ -0,0 +1,112 @@
+CGroup (abbreviated from control groups) is a Linux kernel feature to limit, 
account, and isolate resource usage (CPU, memory, disk I/O, etc.) of process 
groups.In Gearpump, we use cgroup to manage CPU resources.
+
+## Start CGroup Service 
+
+CGroup feature is only supported by Linux whose kernel version is larger than 
2.6.18. Please also make sure the SELinux is disabled before start CGroup.
+
+The following steps are supposed to be executed by root user.
+
+1. Check `/etc/cgconfig.conf` exist or not. If not exists, please `yum install 
libcgroup`.
+
+2. Run following command to see whether the **cpu** subsystem is already 
mounted to the file system.
+ 
+               :::bash
+               lssubsys -m
+               
+    Each subsystem in CGroup will have a corresponding mount file path in 
local file system. For example, the following output shows that **cpu** 
subsystem is mounted to file path `/sys/fs/cgroup/cpu`
+   
+               :::bash
+               cpu /sys/fs/cgroup/cpu
+               net_cls /sys/fs/cgroup/net_cls
+               blkio /sys/fs/cgroup/blkio
+               perf_event /sys/fs/cgroup/perf_event
+  
+   
+3. If you want to assign permission to user **gear** to launch Gearpump Worker 
and applications with resource isolation enabled, you need to check gear's uid 
and gid in `/etc/passwd` file, let's take **500** for example.
+
+4. Add following content to `/etc/cgconfig.conf`
+    
+               
+               # The mount point of cpu subsystem.
+               # If your system already mounted it, this segment should be 
eliminated.
+               mount {    
+                 cpu = /cgroup/cpu;
+               }
+                   
+               # Here the group name "gearpump" represents a node in CGroup's 
hierarchy tree.
+               # When the CGroup service is started, there will be a folder 
generated under the mount point of cpu subsystem,
+               # whose name is "gearpump".
+                   
+               group gearpump {
+                  perm {
+                      task {
+                          uid = 500;
+                          gid = 500;
+                       }
+                      admin {
+                          uid = 500;
+                          gid = 500;
+                      }
+                  }
+                  cpu {
+                  }
+               }
+          
+   
+   Please note that if the output of step 2 shows that **cpu** subsystem is 
already mounted, then the `mount` segment should not be included.
+   
+4. Then Start cgroup service
+   
+               :::bash
+               sudo service cgconfig restart 
+   
+   
+5. There should be a folder **gearpump** generated under the mount point of 
cpu subsystem and its owner is **gear:gear**.  
+  
+6. Repeat the above-mentioned steps on each machine where you want to launch 
Gearpump.   
+
+## Enable Cgroups in Gearpump 
+1. Login into the machine which has CGroup prepared with user **gear**.
+
+               :::bash
+               ssh gear@node
+   
+
+2. Enter into Gearpump's home folder, edit gear.conf under folder 
`${GEARPUMP_HOME}/conf/`
+
+               :::bash
+               gearpump.worker.executor-process-launcher = 
"org.apache.gearpump.cluster.worker.CGroupProcessLauncher"
+   
+               gearpump.cgroup.root = "gearpump"
+   
+
+   Please note the gearpump.cgroup.root **gearpump** must be consistent with 
the group name in /etc/cgconfig.conf.
+
+3. Repeat the above-mentioned steps on each machine where you want to launch 
Gearpump
+
+4. Start the Gearpump cluster, please refer to [Deploy Gearpump in Standalone 
Mode](deployment-standalone)
+
+## Launch Application From Command Line
+1. Login into the machine which has Gearpump distribution.
+
+2. Enter into Gearpump's home folder, edit gear.conf under folder 
`${GEARPUMP_HOME}/conf/`
+   
+               :::bash
+               gearpump.cgroup.cpu-core-limit-per-executor = 
${your_preferred_int_num}
+   
+  
+   Here the configuration is the number of CPU cores per executor can use and 
-1 means no limitation
+
+3. Submit application
+
+               :::bash
+               bin/gear app -jar 
examples/sol-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}-assembly.jar 
-streamProducer 10 -streamProcessor 10 
+   
+
+4. Then you can run command `top` to monitor the cpu usage.
+
+## Launch Application From Dashboard
+If you want to submit the application from dashboard, by default the 
`gearpump.cgroup.cpu-core-limit-per-executor` is inherited from Worker's 
configuration. You can provide your own conf file to override it.
+
+## Limitations
+Windows and Mac OS X don't support CGroup, so the resource isolation will not 
work even if you turn it on. There will not be any limitation for single 
executor's cpu usage.

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-security.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-security.md 
b/docs/contents/deployment/deployment-security.md
new file mode 100644
index 0000000..e20fc67
--- /dev/null
+++ b/docs/contents/deployment/deployment-security.md
@@ -0,0 +1,80 @@
+Until now Gearpump supports deployment in a secured Yarn cluster and writing 
to secured HBase, where "secured" means Kerberos enabled. 
+Further security related feature is in progress.
+
+## How to launch Gearpump in a secured Yarn cluster
+Suppose user `gear` will launch gearpump on YARN, then the corresponding 
principal `gear` should be created in KDC server.
+
+1. Create Kerberos principal for user `gear`, on the KDC machine
+ 
+               :::bash 
+               sudo kadmin.local
+   
+       In the kadmin.local or kadmin shell, create the principal
+   
+               :::bash
+               kadmin:  addprinc 
gear/[email protected]
+   
+       Remember that user `gear` must exist on every node of Yarn. 
+
+2. Upload the gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip to 
remote HDFS Folder, suggest to put it under 
`/usr/lib/gearpump/gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip`
+
+3. Create HDFS folder /user/gear/, make sure all read-write rights are granted 
for user `gear`
+
+               :::bash
+               drwxr-xr-x - gear gear 0 2015-11-27 14:03 /user/gear
+   
+   
+4. Put the YARN configurations under classpath.
+  Before calling `yarnclient launch`, make sure you have put all yarn 
configuration files under classpath. Typically, you can just copy all files 
under `$HADOOP_HOME/etc/hadoop` from one of the YARN cluster machine to 
`conf/yarnconf` of gearpump. `$HADOOP_HOME` points to the Hadoop installation 
directory. 
+  
+5. Get Kerberos credentials to submit the job:
+
+               :::bash
+               kinit gearpump/[email protected]
+   
+   
+       Here you can login with keytab or password. Please refer Kerberos's 
document for details.
+    
+               :::bash
+               yarnclient launch -package 
/usr/lib/gearpump/gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip
+   
+  
+## How to write to secured HBase
+When the remote HBase is security enabled, a kerberos keytab and the 
corresponding principal name need to be
+provided for the gearpump-hbase connector. Specifically, the `UserConfig` 
object passed into the HBaseSink should contain
+`{("gearpump.keytab.file", "\\$keytab"), ("gearpump.kerberos.principal", 
"\\$principal")}`. example code of writing to secured HBase:
+
+       :::scala
+       val principal = "gearpump/[email protected]"
+       val keytabContent = Files.toByteArray(new File("path_to_keytab_file"))
+       val appConfig = UserConfig.empty
+             .withString("gearpump.kerberos.principal", principal)
+             .withBytes("gearpump.keytab.file", keytabContent)
+       val sink = new HBaseSink(appConfig, "$tableName")
+       val sinkProcessor = DataSinkProcessor(sink, "$sinkNum")
+       val split = Processor[Split]("$splitNum")
+       val computation = split ~> sinkProcessor
+       val application = StreamApplication("HBase", Graph(computation), 
UserConfig.empty)
+
+
+Note here the keytab file set into config should be a byte array.
+
+## Future Plan
+
+### More external components support
+1. HDFS
+2. Kafka
+
+### Authentication(Kerberos)
+Since Gearpumpâs Master-Worker structure is similar to HDFSâs 
NameNode-DataNode and Yarnâs ResourceManager-NodeManager, we may follow the 
way they use.
+
+1. User creates kerberos principal and keytab for Gearpump.
+2. Deploy the keytab files to all the cluster nodes.
+3. Configure Gearpumpâs conf file, specify kerberos principal and local 
keytab file location.
+4. Start Master and Worker.
+
+Every application has a submitter/user. We will separate the application from 
different users, like different log folders for different applications. 
+Only authenticated users can submit the application to Gearpump's Master.
+
+### Authorization
+Hopefully more on this soon

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-standalone.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-standalone.md 
b/docs/contents/deployment/deployment-standalone.md
new file mode 100644
index 0000000..c9d5549
--- /dev/null
+++ b/docs/contents/deployment/deployment-standalone.md
@@ -0,0 +1,59 @@
+Standalone mode is a distributed cluster mode. That is, Gearpump runs as 
service without the help from other services (e.g. YARN).
+
+To deploy Gearpump in cluster mode, please first check that the 
[Pre-requisites](hardware-requirement) are met.
+
+### How to Install
+You need to have Gearpump binary at hand. Please refer to [How to get gearpump 
distribution](get-gearpump-distribution) to get the Gearpump binary.
+
+You are suggested to unzip the package to same directory path on every machine 
you planned to install Gearpump.
+To install Gearpump, you at least need to change the configuration in 
`conf/gear.conf`.
+
+Config | Default value | Description
+------------ | ---------------|------------
+gearpump.hostname      | "127.0.0.1"    | Host or IP address of current 
machine. The ip/host need to be reachable from other machines in the cluster.
+gearpump.cluster.masters |     ["127.0.0.1:3000"] |    List of all master 
nodes, with each item represents host and port of one master.
+gearpump.worker.slots   | 1000 | how many slots this worker has
+
+Besides this, there are other optional configurations related with logs, 
metrics, transports, ui. You can refer to [Configuration 
Guide](deployment-configuration) for more details.
+
+### Start the Cluster Daemons in Standlone mode
+In Standalone mode, you can start master and worker in different JVMs.
+
+##### To start master:
+
+       :::bash
+       bin/master -ip xx -port xx
+
+The ip and port will be checked against settings under `conf/gear.conf`, so 
you need to make sure they are consistent.
+
+**NOTE:** You may need to execute `chmod +x bin/*` in shell to make the script 
file `master` executable.
+
+**NOTE**: for high availability, please check [Master HA Guide](deployment-ha)
+
+##### To start worker:
+
+       :::bash
+       bin/worker
+
+### Start UI
+
+       :::bash
+       bin/services
+       
+
+After UI is started, you can browse to `http://{web_ui_host}:8090` to view the 
cluster status.
+The default username and password is "admin:admin", you can check
+[UI Authentication](deployment-ui-authentication) to find how to manage users.
+
+![Dashboard](../img/dashboard.gif)
+
+**NOTE:** The UI port can be configured in `gear.conf`. Check [Configuration 
Guide](deployment-configuration) for information.
+
+### Bash tool to start cluster
+
+There is a bash tool `bin/start-cluster.sh` can launch the cluster 
conveniently. You need to change the file `conf/masters`, `conf/workers` and 
`conf/dashboard` to specify the corresponding machines.
+Before running the bash tool, please make sure the Gearpump package is already 
unzipped to the same directory path on every machine.
+`bin/stop-cluster.sh` is used to stop the whole cluster of course.
+
+The bash tool is able to launch the cluster without changing the 
`conf/gear.conf` on every machine. The bash sets the `gearpump.cluster.masters` 
and other configurations using JAVA_OPTS.
+However, please note when you log into any these unconfigured machine and try 
to launch the dashboard or submit the application, you still need to modify 
`conf/gear.conf` manually because the JAVA_OPTS is missing.

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-ui-authentication.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-ui-authentication.md 
b/docs/contents/deployment/deployment-ui-authentication.md
new file mode 100644
index 0000000..0990192
--- /dev/null
+++ b/docs/contents/deployment/deployment-ui-authentication.md
@@ -0,0 +1,290 @@
+## What is this about?
+
+## How to enable UI authentication?
+
+1. Change config file gear.conf, find entry 
`gearpump-ui.gearpump.ui-security.authentication-enabled`, change the value to 
true
+
+       :::bash 
+       gearpump-ui.gearpump.ui-security.authentication-enabled = true
+       
+   
+    Restart the UI dashboard, and then the UI authentication is enabled. It 
will prompt for user name and password.
+
+## How many authentication methods Gearpump UI server support?
+
+Currently, It supports:
+
+1. Username-Password based authentication and
+2. OAuth2 based authentication.
+
+User-Password based authentication is enabled when 
`gearpump-ui.gearpump.ui-security.authentication-enabled`,
+ and **CANNOT** be disabled.
+
+UI server admin can also choose to enable **auxiliary** OAuth2 authentication 
channel.
+
+## User-Password based authentication
+
+   User-Password based authentication covers all authentication scenarios 
which requires
+   user to enter an explicit username and password.
+
+   Gearpump provides a built-in ConfigFileBasedAuthenticator which verify user 
name and password
+   against password hashcode stored in config files.
+
+   However, developer can choose to extends the 
`org.apache.gearpump.security.Authenticator` to provide a custom
+   User-Password based authenticator, to support LDAP, Kerberos, and 
Database-based authentication...
+
+### ConfigFileBasedAuthenticator: built-in User-Password Authenticator
+
+ConfigFileBasedAuthenticator store all user name and password hashcode in 
configuration file gear.conf. Here
+is the steps to configure ConfigFileBasedAuthenticator.
+
+#### How to add or remove user?
+
+For the default authentication plugin, it has three categories of users: 
admins, users, and guests.
+
+* admins: have unlimited permission, like shutdown a cluster, add/remove 
machines.
+* users: have limited permission to submit an application and etc..
+* guests: can not submit/kill applications, but can view the application 
status.
+
+System administrator can add or remove user by updating config file 
`conf/gear.conf`.  
+
+Suppose we want to add user jerry as an administrator, here are the steps:
+  
+1. Pick a password, and generate the digest for this password. Suppose we use 
password `ilovegearpump`, 
+   to generate the digest:
+   
+       :::bash
+       bin/gear org.apache.gearpump.security.PasswordUtil -password  
ilovegearpump
+    
+   
+    It will generate a digest value like this:
+   
+       :::bash
+       CgGxGOxlU8ggNdOXejCeLxy+isrCv0TrS37HwA==
+    
+
+2. Change config file conf/gear.conf at path 
`gearpump-ui.gearpump.ui-security.config-file-based-authenticator.admins`,
+   add user `jerry` in this list:
+   
+       :::bash
+           admins = {
+             ## Default Admin. Username: admin, password: admin
+             ## !!! Please replace this builtin account for production cluster 
for security reason. !!!
+             "admin" = "AeGxGOxlU8QENdOXejCeLxy+isrCv0TrS37HwA=="
+             "jerry" = "CgGxGOxlU8ggNdOXejCeLxy+isrCv0TrS37HwA=="
+           }
+    
+
+3. Restart the UI dashboard by `bin/services` to make the change effective.
+
+4. Group "admins" have very unlimited permission, you may want to restrict the 
permission. In that case 
+   you can modify 
`gearpump-ui.gearpump.ui-security.config-file-based-authenticator.users` or
+   `gearpump-ui.gearpump.ui-security.config-file-based-authenticator.guests`.
+
+5. See description at `conf/gear.conf` to find more information.   
+   
+#### What is the default user and password?
+
+For ConfigFileBasedAuthenticator, Gearpump distribution is shipped with two 
default users:
+
+1. username: admin, password: admin
+2. username: guest, password: guest
+
+User `admin` has unlimited permissions, while `guest` can only view the 
application status.
+
+For security reason, you need to remove the default users `admin` and `guest` 
for cluster in production.
+
+#### Is this secure?
+
+Firstly, we will NOT store any user password in any way so only the user 
himself knows the password. 
+We will use one-way hash digest to verify the user input password.
+
+### How to develop a custom User-Password Authenticator for LDAP, Database, 
and etc..
+
+If developer choose to define his/her own User-Password based authenticator, 
it is required that user
+    modify configuration option:
+
+       :::bash
+       ## Replace "org.apache.gearpump.security.CustomAuthenticator" with your 
real authenticator class.
+       gearpump.ui-security.authenticator = 
"org.apache.gearpump.security.CustomAuthenticator"
+               
+
+Make sure CustomAuthenticator extends interface:
+
+       :::scala
+       trait Authenticator {
+       
+         def authenticate(user: String, password: String, ec: 
ExecutionContext): Future[AuthenticationResult]
+       }
+
+
+## OAuth2 based authentication
+
+OAuth2 based authentication is commonly use to achieve social login with 
social network account.
+
+Gearpump provides generic OAuth2 Authentication support which allow user to 
extend to support new authentication sources.
+
+Basically, OAuth2 based Authentication contains these steps:
+ 1. User accesses Gearpump UI website, and choose to login with OAuth2 server.
+ 2. Gearpump UI website redirects user to OAuth2 server domain authorization 
endpoint.
+ 3. End user complete the authorization in the domain of OAuth2 server.
+ 4. OAuth2 server redirects user back to Gearpump UI server.
+ 5. Gearpump UI server verify the tokens and extract credentials from query
+ parameters and form fields.
+
+### Terminologies
+
+For terms like client Id, and client secret, please refers to guide [RFC 
6749](https://tools.ietf.org/html/rfc6749)
+
+### Enable web proxy for UI server
+
+To enable OAuth2 authentication, the Gearpump UI server should have network 
access to OAuth2 server, as
+ some requests are initiated directly inside Gearpump UI server. So, if you 
are behind a firewall, make
+ sure you have configured the proxy properly for UI server.
+
+#### If you are on Windows
+
+       :::bash
+       set JAVA_OPTS=-Dhttp.proxyHost=xx.com -Dhttp.proxyPort=8088 
-Dhttps.proxyHost=xx.com -Dhttps.proxyPort=8088
+       bin/services
+
+
+#### If you are on Linux
+
+       :::bash
+    export JAVA_OPTS="-Dhttp.proxyHost=xx.com -Dhttp.proxyPort=8088 
-Dhttps.proxyHost=xx.com -Dhttps.proxyPort=8088"
+    bin/services
+
+
+### Google Plus OAuth2 Authenticator
+
+Google Plus OAuth2 Authenticator does authentication with Google OAuth2 
service. It extracts the email address
+from Google user profile as credentials.
+
+To use Google OAuth2 Authenticator, there are several steps:
+
+1. Register your application (Gearpump UI server here) as an application to 
Google developer console.
+2. Configure the Google OAuth2 information in gear.conf
+3. Configure network proxy for Gearpump UI server if applies.
+
+#### Step1: Register your website as an OAuth2 Application on Google
+
+1. Create an application representing your website at 
[https://console.developers.google.com](https://console.developers.google.com)
+2. In "API Manager" of your created application, enable API "Google+ API"
+3. Create OAuth client ID for this application. In "Credentials" tab of "API 
Manager",
+choose "Create credentials", and then select OAuth client ID. Follow the wizard
+to set callback URL, and generate client ID, and client Secret.
+
+**NOTE:** Callback URL is NOT optional.
+
+#### Step2: Configure the OAuth2 information in gear.conf
+
+1. Enable OAuth2 authentication by setting 
`gearpump.ui-security.oauth2-authenticator-enabled`
+as true.
+2. Configure section `gearpump.ui-security.oauth2-authenticators.google` in 
gear.conf. Please make sure
+class name, client ID, client Secret, and callback URL are set properly.
+
+**NOTE:** Callback URL set here should match what is configured on Google in 
step1.
+
+#### Step3: Configure the network proxy if applies.
+
+To enable OAuth2 authentication, the Gearpump UI server should have network 
access to Google service, as
+ some requests are initiated directly inside Gearpump UI server. So, if you 
are behind a firewall, make
+ sure you have configured the proxy properly for UI server.
+
+For guide of how to configure web proxy for UI server, please refer to section 
"Enable web proxy for UI server" above.
+
+#### Step4: Restart the UI server and try to click the Google login icon on UI 
server.
+
+### CloudFoundry UAA server OAuth2 Authenticator
+
+CloudFoundryUaaAuthenticator does authentication by using CloudFoundry UAA 
OAuth2 service. It extracts the email address
+ from Google user profile as credentials.
+
+For what is UAA (User Account and Authentication Service), please see guide: 
[UAA](https://github.com/cloudfoundry/uaa)
+
+To use Google OAuth2 Authenticator, there are several steps:
+
+1. Register your application (Gearpump UI server here) as an application to 
UAA with helper tool `uaac`.
+2. Configure the Google OAuth2 information in gear.conf
+3. Configure network proxy for Gearpump UI server if applies.
+
+#### Step1: Register your application to UAA with `uaac`
+
+1. Check tutorial on uaac at 
[https://docs.cloudfoundry.org/adminguide/uaa-user-management.html](https://docs.cloudfoundry.org/adminguide/uaa-user-management.html)
+2. Open a bash shell, set the UAA server by command `uaac target`
+
+           :::bash
+           uaac target [your uaa server url]
+    
+    
+3. Login in as user admin by
+
+           :::bash
+           uaac token client get admin -s MyAdminPassword
+           
+    
+4. Create a new Application (Client) in UAA,
+    
+           :::bash
+           uaac client add [your_client_id]
+             --scope "openid cloud_controller.read"
+             --authorized_grant_types "authorization_code client_credentials 
refresh_token"
+             --authorities "openid cloud_controller.read"
+             --redirect_uri [your_redirect_url]
+             --autoapprove true
+             --secret [your_client_secret]
+           
+
+#### Step2: Configure the OAuth2 information in gear.conf
+
+1. Enable OAuth2 authentication by setting 
`gearpump.ui-security.oauth2-authenticator-enabled` as true.
+2. Navigate to section 
`gearpump.ui-security.oauth2-authenticators.cloudfoundryuaa`
+3. Config gear.conf 
`gearpump.ui-security.oauth2-authenticators.cloudfoundryuaa` section.
+Please make sure class name, client ID, client Secret, and callback URL are 
set properly.
+
+**NOTE:** The callback URL here should match what you set on CloudFoundry UAA 
in step1.
+
+#### Step3: Configure network proxy for Gearpump UI server if applies
+
+To enable OAuth2 authentication, the Gearpump UI server should have network 
access to Google service, as
+ some requests are initiated directly inside Gearpump UI server. So, if you 
are behind a firewall, make
+ sure you have configured the proxy properly for UI server.
+
+For guide of how to configure web proxy for UI server, please refer to please 
refer to section "Enable web proxy for UI server" above.
+
+#### Step4: Restart the UI server and try to click the CloudFoundry login icon 
on UI server.
+
+#### Step5: You can also enable additional authenticator for CloudFoundry UAA 
by setting config:
+
+       :::bash
+       additional-authenticator-enabled = true
+       
+
+Please see description in gear.conf for more information.
+
+#### Extends OAuth2Authenticator to support new Authorization service like 
Facebook, or Twitter.
+
+You can follow the Google OAuth2 example code to define a custom 
OAuth2Authenticator. Basically, the steps includes:
+
+1. Define an OAuth2Authenticator implementation.
+
+2. Add an configuration entry under 
`gearpump.ui-security.oauth2-authenticators`. For example:
+
+               ## name of this authenticator     
+       "socialnetworkx" {
+         "class" = 
"org.apache.gearpump.services.security.oauth2.impl.SocialNetworkXAuthenticator"
+
+         ## Please make sure this URL matches the name       
+         "callback" = 
"http://127.0.0.1:8090/login/oauth2/socialnetworkx/callback";
+
+         "clientId" = "gearpump_test2"
+         "clientSecret" = "gearpump_test2"
+         "defaultUserRole" = "guest"
+
+         ## Make sure socialnetworkx.png exists under dashboard/icons          
       
+         "icon" = "/icons/socialnetworkx.png"
+       }           
+       
+    
+   The configuration entry is supposed to be used by class 
`SocialNetworkXAuthenticator`.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/deployment-yarn.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/deployment-yarn.md 
b/docs/contents/deployment/deployment-yarn.md
new file mode 100644
index 0000000..401fa46
--- /dev/null
+++ b/docs/contents/deployment/deployment-yarn.md
@@ -0,0 +1,135 @@
+## How to launch a Gearpump cluster on YARN
+
+1. Upload the `gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip` to 
remote HDFS Folder, suggest to put it under 
`/usr/lib/gearpump/gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip`
+
+2. Make sure the home directory on HDFS is already created and all read-write 
rights are granted for user. For example, user gear's home directory is 
`/user/gear`
+
+3. Put the YARN configurations under classpath.
+  Before calling `yarnclient launch`, make sure you have put all yarn 
configuration files under classpath. Typically, you can just copy all files 
under `$HADOOP_HOME/etc/hadoop` from one of the YARN Cluster machine to 
`conf/yarnconf` of gearpump. `$HADOOP_HOME` points to the Hadoop installation 
directory. 
+
+4. Launch the gearpump cluster on YARN
+
+               :::bash
+       yarnclient launch -package 
/usr/lib/gearpump/gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip
+    
+
+    If you don't specify package path, it will read default package-path 
(`gearpump.yarn.client.package-path`) from `gear.conf`.
+
+    **NOTE:** You may need to execute `chmod +x bin/*` in shell to make the 
script file `yarnclient` executable.
+   
+5. After launching, you can browser the Gearpump UI via YARN resource manager 
dashboard.
+
+## How to configure the resource limitation of Gearpump cluster
+
+Before launching a Gearpump cluster, please change configuration section 
`gearpump.yarn` in `gear.conf` to configure the resource limitation, like:
+
+1. The number of worker containers. 
+2. The YARN container memory size for worker and master.
+
+## How to submit a application to Gearpump cluster.
+
+To submit the jar to the Gearpump cluster, we first need to know the Master 
address, so we need to get
+a active configuration file first.
+
+There are two ways to get an active configuration file:
+
+1. Option 1: specify "-output" option when you launch the cluster.
+
+       :::bash
+       yarnclient launch -package 
/usr/lib/gearpump/gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip 
-output /tmp/mycluster.conf
+    
+
+    It will return in console like this:
+    
+       :::bash
+       ==Application Id: application_1449802454214_0034
+    
+    
+
+2. Option 2: Query the active configuration file
+
+       :::bash
+       yarnclient getconfig -appid <yarn application id> -output 
/tmp/mycluster.conf
+    
+
+    yarn application id can be found from the output of step1 or from YARN 
dashboard.
+
+3. After you downloaded the configuration file, you can launch application 
with that config file.
+
+       :::bash
+       gear app -jar 
examples/wordcount-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.jar -conf 
/tmp/mycluster.conf
+    
+  
+4. To run Storm application over Gearpump on YARN, please store the 
configuration file with `-output application.conf` 
+   and then launch Storm application with
+
+       :::bash
+       storm -jar 
examples/storm-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.jar 
storm.starter.ExclamationTopology exclamation
+    
+  
+5. Now the application is running. To check this:
+
+       :::bash
+       gear info -conf /tmp/mycluster.conf
+    
+
+6. To Start a UI server, please do:
+    
+       :::bash
+       services -conf /tmp/mycluster.conf
+    
+    
+    The default username and password is "admin:admin", you can check [UI 
Authentication](deployment-ui-authentication) to find how to manage users.
+
+   
+## How to add/remove machines dynamically.
+
+Gearpump yarn tool allows to dynamically add/remove machines. Here is the 
steps:
+
+1. First, query to get active resources.
+
+       :::bash
+       yarnclient query -appid <yarn application id>
+    
+
+    The console output will shows how many workers and masters there are. For 
example, I have output like this:
+
+       :::bash
+       masters:
+       container_1449802454214_0034_01_000002(IDHV22-01:35712)
+       workers:
+       container_1449802454214_0034_01_000003(IDHV22-01:35712)
+       container_1449802454214_0034_01_000006(IDHV22-01:35712)
+   
+
+2. To add a new worker machine, you can do:
+
+       :::bash
+       yarnclient addworker -appid <yarn application id> -count 2
+    
+
+    This will add two new workers machines. Run the command in first step to 
check whether the change is effective.
+
+3. To remove old machines, use:
+
+       :::bash
+       yarnclient removeworker -appid <yarn application id> -container <worker 
container id>
+    
+
+    The worker container id can be found from the output of step 1. For 
example "container_1449802454214_0034_01_000006" is a good container id.
+
+## Other usage:
+ 
+1. To kill a cluster,
+
+       :::bash
+       yarnclient kill -appid <yarn application id>
+    
+
+    **NOTE:** If the application is not launched successfully, then this 
command won't work. Please use "yarn application -kill <appId>" instead.
+
+2. To check the Gearpump version
+
+       :::bash
+       yarnclient version -appid <yarn application id>
+    

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/get-gearpump-distribution.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/get-gearpump-distribution.md 
b/docs/contents/deployment/get-gearpump-distribution.md
new file mode 100644
index 0000000..8a31113
--- /dev/null
+++ b/docs/contents/deployment/get-gearpump-distribution.md
@@ -0,0 +1,83 @@
+### Prepare the binary
+You can either download pre-build release package or choose to build from 
source code.
+
+#### Download Release Binary
+
+If you choose to use pre-build package, then you don't need to build from 
source code. The release package can be downloaded from:
+
+##### [Download page](http://gearpump.incubator.apache.org/downloads.html)
+
+#### Build from Source code
+
+If you choose to build the package from source code yourself, you can follow 
these steps:
+
+1). Clone the Gearpump repository
+
+       :::bash
+       git clone https://github.com/apache/incubator-gearpump.git
+  cd gearpump
+
+
+2). Build package
+
+       :::bash
+       ## Please use scala 2.11
+       ## The target package path: 
output/target/gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip
+       sbt clean assembly packArchiveZip
+
+
+  After the build, there will be a package file 
gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip generated under 
output/target/ folder.
+
+  **NOTE:**
+  Please set JAVA_HOME environment before the build.
+
+  On linux:
+
+       :::bash
+       export JAVA_HOME={path/to/jdk/root/path}
+       
+
+  On Windows:
+
+       :::bash
+       set JAVA_HOME={path/to/jdk/root/path}
+
+
+  **NOTE:**
+The build requires network connection. If you are behind an enterprise proxy, 
make sure you have set the proxy in your env before running the build commands.
+For windows:
+
+       :::bash
+       set HTTP_PROXY=http://host:port
+       set HTTPS_PROXY= http://host:port
+
+
+For Linux:
+
+       :::bash
+       export HTTP_PROXY=http://host:port
+       export HTTPS_PROXY= http://host:port
+
+
+### Gearpump package structure
+
+You need to flatten the `.zip` file to use it. On Linux, you can
+
+       :::bash
+       unzip gearpump-{{SCALA_BINARY_VERSION}}-{{GEARPUMP_VERSION}}.zip
+
+
+After decompression, the directory structure looks like picture 1.
+
+![Layout](../img/layout.png)
+
+Under bin/ folder, there are script files for Linux(bash script) and 
Windows(.bat script).
+
+script | function
+--------|------------
+local | You can start the Gearpump cluster in single JVM(local mode), or in a 
distributed cluster(cluster mode). To start the cluster in local mode, you can 
use the local /local.bat helper scripts, it is very useful for developing or 
troubleshooting.
+master | To start Gearpump in cluster mode, you need to start one or more 
master nodes, which represent the global resource management center. 
master/master.bat is launcher script to boot the master node.
+worker | To start Gearpump in cluster mode, you also need to start several 
workers, with each worker represent a set of local resources. worker/worker.bat 
is launcher script to start the worker node.
+services | This script is used to start backend REST service and other 
services for frontend UI dashboard (Default user "admin, admin").
+
+Please check [Command Line Syntax](../introduction/commandline) for more 
information for each script.

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/deployment/hardware-requirement.md
----------------------------------------------------------------------
diff --git a/docs/contents/deployment/hardware-requirement.md 
b/docs/contents/deployment/hardware-requirement.md
new file mode 100644
index 0000000..dfe765b
--- /dev/null
+++ b/docs/contents/deployment/hardware-requirement.md
@@ -0,0 +1,30 @@
+### Pre-requisite
+
+Gearpump cluster can be installed on Windows OS and Linux.
+
+Before installation, you need to decide how many machines are used to run this 
cluster.
+
+For each machine, the requirements are listed in table below.
+
+**Table: Environment requirement on single machine**
+
+Resource | Requirements
+------------ | ---------------------------
+Memory       | 2GB free memory is required to run the cluster. For any 
production system, 32GB memory is recommended.
+Java          | JRE 6 or above
+User permission | Root permission is not required
+Network        Ethernet |(TCP/IP)
+CPU    | Nothing special
+HDFS installation      | Default is not required. You only need to install it 
when you want to store the application jars in HDFS.
+Kafka installation |   Default is not required. You need to install Kafka when 
you want the at-least once message delivery feature. Currently, the only 
supported data source for this feature is Kafka
+
+**Table: The default port used in Gearpump:**
+
+| usage        | Port |        Description |
+------------ | ---------------|------------
+  Dashboard UI | 8090  | Web UI.
+Dashboard web socket service | 8091 |  UI backend web socket service for long 
connection.
+Master port |  3000 |  Every other role like worker, appmaster, executor, user 
use this port to communicate with Master.
+
+You need to ensure that your firewall has not banned these ports to ensure 
Gearpump can work correctly.
+And you can modify the port configuration. Check 
[Configuration](deployment-configuration) section for details.  

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/dev/dev-connectors.md
----------------------------------------------------------------------
diff --git a/docs/contents/dev/dev-connectors.md 
b/docs/contents/dev/dev-connectors.md
new file mode 100644
index 0000000..01deb3e
--- /dev/null
+++ b/docs/contents/dev/dev-connectors.md
@@ -0,0 +1,237 @@
+## Basic Concepts
+`DataSource` and `DataSink` are the two main concepts Gearpump use to connect 
with the outside world.
+
+### DataSource
+`DataSource` is the start point of a streaming processing flow. 
+
+
+### DataSink
+`DataSink` is the end point of a streaming processing flow.
+
+## Implemented Connectors
+
+### `DataSource` implemented
+Currently, we have following `DataSource` supported.
+
+Name | Description
+-----| ----------
+`CollectionDataSource` | Convert a collection to a recursive data source. E.g. 
`seq(1, 2, 3)` will output `1,2,3,1,2,3...`.
+`KafkaSource` | Read from Kafka.
+
+### `DataSink` implemented
+Currently, we have following `DataSink` supported.
+
+Name | Description
+-----| ----------
+`HBaseSink` | Write the message to HBase. The message to write must be HBase 
`Put` or a tuple of `(rowKey, family, column, value)`.
+`KafkaSink` | Write to Kafka.
+
+## Use of Connectors
+
+### Use of Kafka connectors
+
+To use Kafka connectors in your application, you first need to add the 
`gearpump-external-kafka` library dependency in your application:
+
+#### SBT
+
+       :::sbt
+       "org.apache.gearpump" %% "gearpump-external-kafka" % 
{{GEARPUMP_VERSION}}
+
+#### XML
+
+       :::xml
+       <dependency>
+         <groupId>org.apache.gearpump</groupId>
+         <artifactId>gearpump-external-kafka</artifactId>
+         <version>{{GEARPUMP_VERSION}}</version>
+       </dependency>
+
+
+This is a simple example to read from Kafka and write it back using 
`KafkaSource` and `KafkaSink`. Users can optionally set a 
`CheckpointStoreFactory` such that Kafka offsets are checkpointed and 
at-least-once message delivery is guaranteed. 
+
+#### Low level API 
+
+       :::scala
+       val appConfig = UserConfig.empty
+       val props = new Properties
+       props.put(KafkaConfig.ZOOKEEPER_CONNECT_CONFIG, zookeeperConnect)
+       props.put(KafkaConfig.BOOTSTRAP_SERVERS_CONFIG, brokerList)
+       props.put(KafkaConfig.CHECKPOINT_STORE_NAME_PREFIX_CONFIG, appName)
+       val source = new KafkaSource(sourceTopic, props)
+       val checkpointStoreFactory = new KafkaStoreFactory(props)
+       source.setCheckpointStore(checkpointStoreFactory)
+       val sourceProcessor = DataSourceProcessor(source, sourceNum)
+       val sink = new KafkaSink(sinkTopic, props)
+       val sinkProcessor = DataSinkProcessor(sink, sinkNum)
+       val partitioner = new ShufflePartitioner
+       val computation = sourceProcessor ~ partitioner ~> sinkProcessor
+       val app = StreamApplication(appName, Graph(computation), appConfig)
+
+#### High level API
+
+       :::scala
+       val props = new Properties
+       val appName = "KafkaDSL"
+       props.put(KafkaConfig.ZOOKEEPER_CONNECT_CONFIG, zookeeperConnect)
+       props.put(KafkaConfig.BOOTSTRAP_SERVERS_CONFIG, brokerList)
+       props.put(KafkaConfig.CHECKPOINT_STORE_NAME_PREFIX_CONFIG, appName)
+               
+       val app = StreamApp(appName, context)
+               
+       if (atLeastOnce) {
+         val checkpointStoreFactory = new KafkaStoreFactory(props)
+         KafkaDSL.createAtLeastOnceStream(app, sourceTopic, 
checkpointStoreFactory, props, sourceNum)
+             .writeToKafka(sinkTopic, props, sinkNum)
+       } else {
+         KafkaDSL.createAtMostOnceStream(app, sourceTopic, props, sourceNum)
+             .writeToKafka(sinkTopic, props, sinkNum)
+       }
+               
+
+In the above example, configurations are set through Java properties and 
shared by `KafkaSource`, `KafkaSink` and `KafkaCheckpointStoreFactory`.
+Their configurations can be defined differently as below. 
+
+#### `KafkaSource` configurations
+
+Name | Descriptions | Type | Default 
+---- | ------------ | ---- | -------
+`KafkaConfig.ZOOKEEPER_CONNECT_CONFIG` | Zookeeper connect string for Kafka 
topics management | String 
+`KafkaConfig.CLIENT_ID_CONFIG` | An id string to pass to the server when 
making requests | String | ""  
+`KafkaConfig.GROUP_ID_CONFIG` | A string that uniquely identifies a set of 
consumers within the same consumer group | "" 
+`KafkaConfig.FETCH_SLEEP_MS_CONFIG` | The amount of time(ms) to sleep when 
hitting fetch.threshold | Int | 100 
+`KafkaConfig.FETCH_THRESHOLD_CONFIG` | Size of internal queue to keep Kafka 
messages. Stop fetching and go to sleep when hitting the threshold | Int | 
10000 
+`KafkaConfig.PARTITION_GROUPER_CLASS_CONFIG` | Partition grouper class to 
group partitions among source tasks |  Class | DefaultPartitionGrouper 
+`KafkaConfig.MESSAGE_DECODER_CLASS_CONFIG` | Message decoder class to decode 
raw bytes from Kafka | Class | DefaultMessageDecoder 
+`KafkaConfig.TIMESTAMP_FILTER_CLASS_CONFIG` | Timestamp filter class to filter 
out late messages | Class | DefaultTimeStampFilter 
+
+
+#### `KafkaSink` configurations
+
+Name | Descriptions | Type | Default 
+---- | ------------ | ---- | ------- 
+`KafkaConfig.BOOTSTRAP_SERVERS_CONFIG` | A list of host/port pairs to use for 
establishing the initial connection to the Kafka cluster | String |  
+`KafkaConfig.CLIENT_ID_CONFIG` | An id string to pass to the server when 
making requests | String | ""  
+
+#### `KafkaCheckpointStoreFactory` configurations
+
+Name | Descriptions | Type | Default 
+---- | ------------ | ---- | ------- 
+`KafkaConfig.ZOOKEEPER_CONNECT_CONFIG` | Zookeeper connect string for Kafka 
topics management | String | 
+`KafkaConfig.BOOTSTRAP_SERVERS_CONFIG` | A list of host/port pairs to use for 
establishing the initial connection to the Kafka cluster | String | 
+`KafkaConfig.CHECKPOINT_STORE_NAME_PREFIX` | Name prefix for checkpoint store 
| String | "" 
+`KafkaConfig.REPLICATION_FACTOR` | Replication factor for checkpoint store 
topic | Int | 1 
+
+### Use of `HBaseSink`
+
+To use `HBaseSink` in your application, you first need to add the 
`gearpump-external-hbase` library dependency in your application:
+
+#### SBT
+
+       :::sbt
+       "org.apache.gearpump" %% "gearpump-external-hbase" % 
{{GEARPUMP_VERSION}}
+
+#### XML
+       :::xml
+       <dependency>
+         <groupId>org.apache.gearpump</groupId>
+         <artifactId>gearpump-external-hbase</artifactId>
+         <version>{{GEARPUMP_VERSION}}</version>
+       </dependency>
+
+
+To connect to HBase, you need to provide following info:
+  
+  * the HBase configuration to tell which HBase service to connect
+  * the table name (you must create the table yourself, see the [HBase 
documentation](https://hbase.apache.org/book.html))
+
+Then, you can use `HBaseSink` in your application:
+
+       :::scala
+       //create the HBase data sink
+       val sink = HBaseSink(UserConfig.empty, tableName, 
HBaseConfiguration.create())
+       
+       //create Gearpump Processor
+       val sinkProcessor = DataSinkProcessor(sink, parallelism)
+
+
+       :::scala
+       //assume stream is a normal `Stream` in DSL
+       stream.writeToHbase(UserConfig.empty, tableName, parallelism, "write to 
HBase")
+
+
+You can tune the connection to HBase via the HBase configuration passed in. If 
not passed, Gearpump will try to check local classpath to find a valid HBase 
configuration (`hbase-site.xml`).
+
+Attention, due to the issue discussed 
[here](http://stackoverflow.com/questions/24456484/hbase-managed-zookeeper-suddenly-trying-to-connect-to-localhost-instead-of-zooke)
 you may need to create additional configuration for your HBase sink:
+
+       :::scala
+       def hadoopConfig = {
+        val conf = new Configuration()
+        conf.set("hbase.zookeeper.quorum", "zookeeperHost")
+        conf.set("hbase.zookeeper.property.clientPort", "2181")
+        conf
+       }
+       val sink = HBaseSink(UserConfig.empty, tableName, hadoopConfig)
+
+
+## How to implement your own `DataSource`
+
+To implement your own `DataSource`, you need to implement two things:
+
+1. The data source itself
+2. a helper class to easy the usage in a DSL
+
+### Implement your own `DataSource`
+You need to implement a class derived from 
`org.apache.gearpump.streaming.transaction.api.TimeReplayableSource`.
+
+### Implement DSL helper (Optional)
+If you would like to have a DSL at hand you may start with this customized 
stream; it is better if you can implement your own DSL helper.
+You can refer `KafkaDSLUtil` as an example in Gearpump source.
+
+Below is some code snippet from `KafkaDSLUtil`:
+
+       :::scala
+       object KafkaDSLUtil {
+       
+         def createStream[T](
+             app: StreamApp,
+             topics: String,
+             parallelism: Int,
+             description: String,
+             properties: Properties): dsl.Stream[T] = {
+           app.source[T](new KafkaSource(topics, properties), parallelism, 
description)
+         }
+       }
+
+
+## How to implement your own `DataSink`
+To implement your own `DataSink`, you need to implement two things:
+
+1. The data sink itself
+2. a helper class to make it easy use in DSL
+
+### Implement your own `DataSink`
+You need to implement a class derived from 
`org.apache.gearpump.streaming.sink.DataSink`.
+
+### Implement DSL helper (Optional)
+If you would like to have a DSL at hand you may start with this customized 
stream; it is better if you can implement your own DSL helper.
+You can refer `HBaseDSLSink` as an example in Gearpump source.
+
+Below is some code snippet from `HBaseDSLSink`:
+
+       :::scala
+       class HBaseDSLSink[T](stream: Stream[T]) {
+         def writeToHbase(userConfig: UserConfig, table: String, parallism: 
Int, description: String): Stream[T] = {
+           stream.sink(HBaseSink[T](userConfig, table), parallism, userConfig, 
description)
+         }
+         
+         def writeToHbase(userConfig: UserConfig, configuration: 
Configuration, table: String, parallism: Int, description: String): Stream[T] = 
{
+           stream.sink(HBaseSink[T](userConfig, table, configuration), 
parallism, userConfig, description)
+         }  
+       }
+       
+       object HBaseDSLSink {
+         implicit def streamToHBaseDSLSink[T](stream: Stream[T]): 
HBaseDSLSink[T] = {
+           new HBaseDSLSink[T](stream)
+         }
+       }
+

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/dev/dev-custom-serializer.md
----------------------------------------------------------------------
diff --git a/docs/contents/dev/dev-custom-serializer.md 
b/docs/contents/dev/dev-custom-serializer.md
new file mode 100644
index 0000000..b1abeda
--- /dev/null
+++ b/docs/contents/dev/dev-custom-serializer.md
@@ -0,0 +1,137 @@
+Gearpump has a built-in serialization framework with a shaded Kryo version, 
which allows you to customize how a specific message type can be serialized. 
+
+#### Register a class before serialization.
+
+Note, to use built-in kryo serialization framework, Gearpump requires all 
classes to be registered explicitly before using, no matter you want to use a 
custom serializer or not. If not using custom serializer, Gearpump will use 
default com.esotericsoftware.kryo.serializers.FieldSerializer to serialize the 
class. 
+
+To register a class, you need to change the configuration file gear.conf(or 
application.conf if you want it only take effect for single application).
+
+       :::json
+       gearpump {
+         serializers {
+           ## We will use default FieldSerializer to serialize this class type
+           "org.apache.gearpump.UserMessage" = ""
+           
+           ## we will use custom serializer to serialize this class type
+           "org.apache.gearpump.UserMessage2" = 
"org.apache.gearpump.UserMessageSerializer"
+         }
+       }
+       
+
+#### How to define a custom serializer for built-in kryo serialization 
framework
+
+When you decide that you want to define a custom serializer, you can do this 
in two ways.
+
+Please note that Gearpump shaded the original Kryo dependency. The package 
name ```com.esotericsoftware``` was relocated to 
```org.apache.gearpump.esotericsoftware```. So in the following customization, 
you should import corresponding shaded classes, the example code will show that 
part.
+
+In general you should use the shaded version of a library whenever possible in 
order to avoid binary incompatibilities, eg don't use:
+
+       :::scala
+       import com.google.common.io.Files
+
+
+but rather
+
+       :::scala
+       import org.apache.gearpump.google.common.io.Files
+
+
+##### System Level Serializer
+
+If the serializer is widely used, you can define a global serializer which is 
available to all applications(or worker or master) in the system.
+
+###### Step1: you first need to develop a java library which contains the 
custom serializer class. here is an example:
+
+       :::scala
+       package org.apache.gearpump
+       
+       import org.apache.gearpump.esotericsoftware.kryo.{Kryo, Serializer}
+       import org.apache.gearpump.esotericsoftware.kryo.io.{Input, Output}
+       
+       class UserMessage(longField: Long, intField: Int)
+       
+       class UserMessageSerializer extends Serializer[UserMessage] {
+         override def write(kryo: Kryo, output: Output, obj: UserMessage) = {
+           output.writeLong(obj.longField)
+           output.writeInt(obj.intField)
+         }
+       
+         override def read(kryo: Kryo, input: Input, typ: Class[UserMessage]): 
UserMessage = {
+           val longField = input.readLong()
+           val intField = input.readInt()
+           new UserMessage(longField, intField)
+         }
+       }
+
+
+###### Step2: Distribute the libraries
+
+Distribute the jar file to lib/ folder of every Gearpump installation in the 
cluster.
+
+###### Step3: change gear.conf on every machine of the cluster:
+
+       :::json
+       gearpump {
+         serializers {
+           "org.apache.gearpump.UserMessage" = 
"org.apache.gearpump.UserMessageSerializer"
+         }
+       }
+       
+
+###### All set!
+
+##### Define Application level custom serializer
+If all you want is to define an application level serializer, which is only 
visible to current application AppMaster and Executors(including tasks), you 
can follow a different approach.
+
+###### Step1: Define your custom Serializer class
+
+You should include the Serializer class in your application jar. Here is an 
example to define a custom serializer:
+
+       :::scala
+       package org.apache.gearpump
+       
+       import org.apache.gearpump.esotericsoftware.kryo.{Kryo, Serializer}
+       import org.apache.gearpump.esotericsoftware.kryo.io.{Input, Output}
+       
+       class UserMessage(longField: Long, intField: Int)
+       
+       class UserMessageSerializer extends Serializer[UserMessage] {
+         override def write(kryo: Kryo, output: Output, obj: UserMessage) = {
+           output.writeLong(obj.longField)
+           output.writeInt(obj.intField)
+         }
+       
+         override def read(kryo: Kryo, input: Input, typ: Class[UserMessage]): 
UserMessage = {
+           val longField = input.readLong()
+           val intField = input.readInt()
+           new UserMessage(longField, intField)
+         }
+       }
+
+
+###### Step2: Put a application.conf in your classpath on Client machine where 
you submit the application, 
+
+       :::json
+       ### content of application.conf
+       gearpump {
+         serializers {
+           "org.apache.gearpump.UserMessage" = 
"org.apache.gearpump.UserMessageSerializer"
+         }
+       }
+       
+
+###### Step3: All set!
+
+#### Advanced: Choose another serialization framework
+
+Note: This is only for advanced user which require deep customization of 
Gearpump platform.
+
+There are other serialization framework besides Kryo, like Protobuf. If user 
don't want to use the built-in kryo serialization framework, he can customize a 
new serialization framework. 
+
+basically, user need to define in gear.conf(or application.conf for single 
application's scope) file like this:
+
+       :::bash
+       gearpump.serialization-framework = 
"org.apache.gearpump.serializer.CustomSerializationFramework"
+       
+
+Please find an example in gearpump storm module, search 
"StormSerializationFramework" in source code.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/dev/dev-ide-setup.md
----------------------------------------------------------------------
diff --git a/docs/contents/dev/dev-ide-setup.md 
b/docs/contents/dev/dev-ide-setup.md
new file mode 100644
index 0000000..fa983ba
--- /dev/null
+++ b/docs/contents/dev/dev-ide-setup.md
@@ -0,0 +1,29 @@
+### Intellij IDE Setup
+
+1. In Intellij, download scala plugin.  We are using scala version 
{{SCALA_BINARY_VERSION}}
+2. Open menu "File->Open" to open Gearpump root project, then choose the 
Gearpump source folder.
+3. All set.
+
+**NOTE:** Intellij Scala plugin is already bundled with sbt. If you have Scala 
plugin installed, please don't install additional sbt plugin. Check your 
settings at "Settings -> Plugins"
+**NOTE:** If you are behind a proxy, to speed up the build, please set the 
proxy for sbt in "Settings -> Build Tools > SBT". in input field "VM 
parameters", add 
+
+       :::bash
+       -Dhttp.proxyHost=<proxy host>
+       -Dhttp.proxyPort=<port like 911>
+       -Dhttps.proxyHost=<proxy host>
+       -Dhttps.proxyPort=<port like 911>
+       
+
+### Eclipse IDE Setup
+
+I will show how to do this in eclipse LUNA.
+
+There is a sbt-eclipse plugin to generate eclipse project files, but seems 
there are some bugs, and some manual fix is still required. Here is the steps 
that works for me:
+
+1. Install latest version eclipse luna
+2. Install latest scala-IDE http://scala-ide.org/download/current.html   I use 
update site address: 
http://download.scala-ide.org/sdk/lithium/e44/scala211/stable/site
+3. Open a sbt shell under the root folder of Gearpump. enter "eclipse", then 
we get all eclipse project file generated.
+4. Use eclipse import wizard. File->Import->Existing projects into Workspace, 
make sure to tick the option "Search for nested projects"
+5. Then it may starts to complain about encoding error, like "IO error while 
decoding". You need to fix the eclipse default text encoding by changing 
configuration at "Window->Preference->General->Workspace->Text file encoding" 
to UTF-8.
+6. Then the project gearpump-external-kafka may still cannot compile. The 
reason is that there is some dependencies missing in generated .classpath file 
by sbt-eclipse. We need to do some manual fix. Right click on project icon of 
gearpump-external-kafka in eclipse, then choose menu "Build Path->Configure 
Build Path". A window will popup. Under the tab "projects", click add, choose 
"gearpump-streaming"
+7. All set. Now the project should compile OK in eclipse.

http://git-wip-us.apache.org/repos/asf/incubator-gearpump/blob/5f90b70f/docs/contents/dev/dev-non-streaming-example.md
----------------------------------------------------------------------
diff --git a/docs/contents/dev/dev-non-streaming-example.md 
b/docs/contents/dev/dev-non-streaming-example.md
new file mode 100644
index 0000000..3d1d5e0
--- /dev/null
+++ b/docs/contents/dev/dev-non-streaming-example.md
@@ -0,0 +1,133 @@
+We'll use [Distributed 
Shell](https://github.com/apache/incubator-gearpump/blob/master/examples/distributedshell)
 as an example to illustrate how to do that.
+
+What Distributed Shell do is that user send a shell command to the cluster and 
the command will the executed on each node, then the result will be return to 
user.
+
+### Maven/Sbt Settings
+
+Repository and library dependencies can be found at [Maven 
Setting](http://gearpump.incubator.apache.org/downloads.html#maven-dependencies)
+
+### Define Executor Class
+
+       :::scala
+       class ShellExecutor(executorContext: ExecutorContext, userConf : 
UserConfig) extends Actor{
+         import executorContext._
+       
+         override def receive: Receive = {
+           case ShellCommand(command, args) =>
+             val process = Try(s"$command $args" !!)
+             val result = process match {
+               case Success(msg) => msg
+               case Failure(ex) => ex.getMessage
+             }
+             sender ! ShellCommandResult(executorId, result)
+         }
+       }
+
+So ShellExecutor just receive the ShellCommand and try to execute it and 
return the result to the sender, which is quite simple.
+
+### Define AppMaster Class
+For a non-streaming application, you have to write your own AppMaster.
+
+Here is a typical user defined AppMaster, please note that some trivial codes 
are omitted.
+
+       :::scala
+       class DistShellAppMaster(appContext : AppMasterContext, app : 
Application) extends ApplicationMaster {
+         protected var currentExecutorId = 0
+       
+         override def preStart(): Unit = {
+           ActorUtil.launchExecutorOnEachWorker(masterProxy, 
getExecutorJvmConfig, self)
+         }
+       
+         override def receive: Receive = {
+           case ExecutorSystemStarted(executorSystem) =>
+             import executorSystem.{address, worker, resource => 
executorResource}
+             val executorContext = ExecutorContext(currentExecutorId, 
worker.workerId, appId, self, executorResource)
+             val executor = context.actorOf(Props(classOf[ShellExecutor], 
executorContext, app.userConfig)
+                 .withDeploy(Deploy(scope = RemoteScope(address))), 
currentExecutorId.toString)
+             executorSystem.bindLifeCycleWith(executor)
+             currentExecutorId += 1
+           case StartExecutorSystemTimeout =>
+             masterProxy ! ShutdownApplication(appId)
+             context.stop(self)
+           case msg: ShellCommand =>
+             Future.fold(context.children.map(_ ? msg))(new 
ShellCommandResultAggregator) { (aggregator, response) =>
+               aggregator.aggregate(response.asInstanceOf[ShellCommandResult])
+             }.map(_.toString()) pipeTo sender
+         }
+       
+         private def getExecutorJvmConfig: ExecutorSystemJvmConfig = {
+           val config: Config = 
Option(app.clusterConfig).map(_.getConfig).getOrElse(ConfigFactory.empty())
+           val jvmSetting = 
Util.resolveJvmSetting(config.withFallback(context.system.settings.config)).executor
+           ExecutorSystemJvmConfig(jvmSetting.classPath, jvmSetting.vmargs,
+             appJar, username, config)
+         }
+       }
+       
+
+So when this `DistShellAppMaster` started, first it will request resources to 
launch one executor on each node, which is done in method `preStart`
+
+Then the DistShellAppMaster's receive handler will handle the allocated 
resource to launch the `ShellExecutor` we want. If you want to write your 
application, you can just use this part of code. The only thing needed is 
replacing the Executor class.
+
+There may be a situation that the resource allocation failed which will bring 
the message `StartExecutorSystemTimeout`, the normal pattern to handle that is 
just what we do: shut down the application.
+
+The real application logic part is in `ShellCommand` message handler, which is 
specific to different applications. Here we distribute the shell command to 
each executor and aggregate the results to the client.
+
+For method `getExecutorJvmConfig`, you can just use this part of code in your 
own application.
+
+### Define Application
+Now its time to launch the application.
+
+       :::scala
+       object DistributedShell extends App with ArgumentsParser {
+         private val LOG: Logger = LogUtil.getLogger(getClass)
+       
+         override val options: Array[(String, CLIOption[Any])] = Array.empty
+       
+         LOG.info(s"Distributed shell submitting application...")
+         val context = ClientContext()
+         val appId = 
context.submit(Application[DistShellAppMaster]("DistributedShell", 
UserConfig.empty))
+         context.close()
+         LOG.info(s"Distributed Shell Application started with appId $appId !")
+       }
+
+The application class extends `App` and `ArgumentsParser which make it easier 
to parse arguments and run main functions. This part is similar to the 
streaming applications.
+
+The main class `DistributeShell` will submit an application to `Master`, whose 
`AppMaster` is `DistShellAppMaster`.
+
+### Define an optional Client class
+
+Now, we can define a `Client` class to talk with `AppMaster` to pass our 
commands to it.
+
+       :::scala
+       object DistributedShellClient extends App with ArgumentsParser  {
+         implicit val timeout = Constants.FUTURE_TIMEOUT
+         import scala.concurrent.ExecutionContext.Implicits.global
+         private val LOG: Logger = LoggerFactory.getLogger(getClass)
+       
+         override val options: Array[(String, CLIOption[Any])] = Array(
+           "master" -> 
CLIOption[String]("<host1:port1,host2:port2,host3:port3>", required = true),
+           "appid" -> CLIOption[Int]("<the distributed shell appid>", required 
= true),
+           "command" -> CLIOption[String]("<shell command>", required = true),
+           "args" -> CLIOption[String]("<shell arguments>", required = true)
+         )
+       
+         val config = parse(args)
+         val context = ClientContext(config.getString("master"))
+         val appid = config.getInt("appid")
+         val command = config.getString("command")
+         val arguments = config.getString("args")
+         val appMaster = context.resolveAppID(appid)
+         (appMaster ? ShellCommand(command, arguments)).map { reslut =>
+           LOG.info(s"Result: $reslut")
+           context.close()
+         }
+       }
+       
+
+In the `DistributedShellClient`, it will resolve the appid to the real 
appmaster(the application id will be printed when launching `DistributedShell`).
+
+Once we got the `AppMaster`, then we can send `ShellCommand` to it and wait 
for the result.
+
+### Submit application
+
+After all these, you need to package everything into a uber jar and submit the 
jar to Gearpump Cluster. Please check [Application submission 
tool](../introduction/commandline) to command line tool syntax.

[5/5] incubator-gearpump git commit: [GEARPUMP-213] Fix EditOnGitHub link and rename docs/docs to docs/con…

Reply via email to