Repository: samza
Updated Branches:
  refs/heads/master 88c1442c4 -> 00cbee8c0


SAMZA-927: added docs for split deployment


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/00cbee8c
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/00cbee8c
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/00cbee8c

Branch: refs/heads/master
Commit: 00cbee8c05ef54dc2983df2f1329a57fcd53d220
Parents: 88c1442
Author: Boris Shkolnik <[email protected]>
Authored: Mon Sep 26 18:07:10 2016 -0700
Committer: Xinyu Liu <[email protected]>
Committed: Mon Sep 26 18:07:10 2016 -0700

----------------------------------------------------------------------
 docs/learn/documentation/versioned/index.html   |  1 +
 .../versioned/jobs/split-deployment.md          | 59 ++++++++++++++++++++
 2 files changed, 60 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/00cbee8c/docs/learn/documentation/versioned/index.html
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/index.html 
b/docs/learn/documentation/versioned/index.html
index a997445..d0b14ec 100644
--- a/docs/learn/documentation/versioned/index.html
+++ b/docs/learn/documentation/versioned/index.html
@@ -72,6 +72,7 @@ title: Documentation
   <li><a href="jobs/logging.html">Logging</a></li>
   <li><a href="jobs/reprocessing.html">Reprocessing</a></li>
   <li><a href="jobs/web-ui-rest-api.html">Web UI and REST API</a></li>
+  <li><a href="jobs/split-deployment.html">Separating Samza Framework and Jobs 
Deployment</a></li>
 </ul>
 
 <h4>YARN</h4>

http://git-wip-us.apache.org/repos/asf/samza/blob/00cbee8c/docs/learn/documentation/versioned/jobs/split-deployment.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/jobs/split-deployment.md 
b/docs/learn/documentation/versioned/jobs/split-deployment.md
new file mode 100644
index 0000000..ebab670
--- /dev/null
+++ b/docs/learn/documentation/versioned/jobs/split-deployment.md
@@ -0,0 +1,59 @@
+---
+layout: page
+title: Separating Samza Framework and Jobs Deployment
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+
+### Motivation
+Currently all Samza jobs are deployed as a single unit/package which combines 
all the Samza libraries, user code and configs together. Typically in a large 
organization the team that manages the Samza cluster is not the same as the 
teams that are running applications on top of Samza. In this case, the current 
way of deployment presents two major problems:
+
+* **Samza software releases**:
+Every time the Samza team releases a new version (for example, a bug fix), the 
only way to deploy it is to rebuild all user packages and redeploy them. It 
would be much more efficient if the team could release the Samza framework 
separately, at its own cadence, and a simple job restart would pick up the new 
version.
+
+* **Package incompatibilities**:
+If both Samza and a job depend on the same software, but on different 
(especially backward-incompatible) versions, they cannot be released together, 
because doing so will most likely cause runtime issues. Ideally, each of them 
would load the packages it needs separately.
+<b>NOTE.</b> This problem is not addressed here.
+
+To address the first problem, we separate the deployment of the Samza 
framework from user jobs by defining two deployable units:
+
+* **Samza framework** - This contains Samza libraries only, and is deployed 
separately to all the machines in a cluster.
+* **User's job** - This contains user code only, and uses the pre-deployed 
Samza framework to run.
+
+Split deployment allows upgrading the Samza framework without forcing 
developers to explicitly upgrade their running applications. It also allows 
switching between different versions of the Samza framework with simple config 
changes. This means we can support the canary, upgrade and rollback scenarios commonly
+required in organizations that run tens or hundreds of jobs.
+
+### Deployment sequence
+
+#### Pre-requisite for split deployment
+Each deployment will now consist of two separate packages:<p>
+
+1. **Samza framework** - This includes all Samza libraries and scripts, such 
as samza-api, samza-core, samza-log4j, samza-kafka, samza-yarn, samza-kv, 
samza-kv-inmemory, samza-kv-rocksdb, samza-shell, samza-hdfs and all their 
dependencies.
+2. **User's job** - This includes the job package: all user code for the 
StreamTask implementation, configs, and other libraries required by the job. 
The job's package should depend only on samza-api and no other Samza libraries. 
The package cannot start by itself; it needs the Samza framework in order to 
run.
+
+#### Deployment steps
+To run a job in split deployment mode:
+
+1. **Deploy the framework**:
+The Samza framework package should be deployed to ALL the machines of a 
cluster, into a predefined, fixed location. This can be done by simply copying 
the jars, or by creating a meta package that deploys all of them. Let's 
assume that the 'samza-framework' package is installed into the 
'/.../samza-fwk/0.11.0' directory.
+
+2. **Create symbolic link**:
+A symbolic link needs to be created for the **stable** version of the 
framework to point to the framework location, e.g.: {% highlight bash %} ln -s 
/.../samza-fwk/0.11.0 /.../samza-fwk/STABLE {% endhighlight %}
+
+3. **Deploy user job**:
+In the job's config, the following property is required to enable split 
deployment, e.g. for Samza framework path at '/.../samza-fwk': {% highlight 
jproperties %} samza.fwk.path=/.../samza-fwk {% endhighlight %} By default 
Samza will look for the **stable** link inside the folder to find the 
framework. You can also override the version by configuring: {% highlight 
jproperties %} samza.fwk.version=0.11.1 {% endhighlight %} In this case Samza 
will pick '/.../samza-fwk/0.11.1' as the framework location. This way users can 
canary, upgrade and roll back their jobs easily by changing the version in 
the config.
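
The deployment steps above can be tried out in a throwaway directory. The sketch below is only an illustration under stated assumptions: the temporary directory stands in for the elided '/.../samza-fwk' path, empty directories stand in for the framework jars, and `resolve_fwk` is a hypothetical helper (not part of Samza) that mimics the lookup described in step 3, where no version means "follow the STABLE link" and an explicit version selects that directory.

```shell
#!/bin/sh
# Hypothetical sandbox root standing in for the elided '/.../samza-fwk' path.
FWK_ROOT="$(mktemp -d)/samza-fwk"

# Step 1: "deploy" two framework versions (empty directories stand in for jars).
mkdir -p "$FWK_ROOT/0.11.0" "$FWK_ROOT/0.11.1"

# Step 2: create the STABLE link pointing at the current stable version.
ln -s "$FWK_ROOT/0.11.0" "$FWK_ROOT/STABLE"

# Illustrative helper, NOT a Samza API: imitates the lookup from step 3.
# With no argument it follows STABLE (as when only samza.fwk.path is set);
# with an argument it uses that version (as when samza.fwk.version is set).
resolve_fwk() {
  version="${1:-STABLE}"
  # pwd -P resolves the symlink so the real versioned directory is printed.
  (cd "$FWK_ROOT/$version" && pwd -P)
}

resolve_fwk          # follows STABLE, which currently points at 0.11.0
resolve_fwk 0.11.1   # canary: one job opts in to 0.11.1 explicitly

# Upgrade: repoint STABLE; jobs that follow it would pick this up on restart.
ln -sfn "$FWK_ROOT/0.11.1" "$FWK_ROOT/STABLE"
resolve_fwk

# Rollback: point STABLE back at the previous version.
ln -sfn "$FWK_ROOT/0.11.0" "$FWK_ROOT/STABLE"
resolve_fwk
```

Repointing the link changes the framework for every job that relies on STABLE, while a job that sets an explicit version keeps that version until its config changes; this is what makes per-job canaries possible.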
