Added: knox/site/books/knox-0-9-1/dev-guide.html
URL: 
http://svn.apache.org/viewvc/knox/site/books/knox-0-9-1/dev-guide.html?rev=1755109&view=auto
==============================================================================
--- knox/site/books/knox-0-9-1/dev-guide.html (added)
+++ knox/site/books/knox-0-9-1/dev-guide.html Wed Aug  3 19:44:13 2016
@@ -0,0 +1,1492 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+--><p><link href="book.css" rel="stylesheet"/></p><p><img src="knox-logo.gif" 
alt="Knox"/> <img src="apache-logo.gif" align="right" alt="Apache"/></p><h1><a 
id="Apache+Knox+Gateway+0.9.x+Developer's+Guide">Apache Knox Gateway 0.9.x 
Developer&rsquo;s Guide</a> <a 
href="#Apache+Knox+Gateway+0.9.x+Developer's+Guide"><img 
src="markbook-section-link.png"/></a></h1><h2><a id="Table+Of+Contents">Table 
Of Contents</a> <a href="#Table+Of+Contents"><img 
src="markbook-section-link.png"/></a></h2>
+<ul>
+  <li><a href="#Overview">Overview</a></li>
+  <li><a href="#Architecture+Overview">Architecture Overview</a></li>
+  <li><a href="#Project+Overview">Project Overview</a></li>
+  <li><a href="#Behavior">Behavior</a></li>
+  <li><a href="#Runtime+Behavior">Runtime Behavior</a></li>
+  <li><a href="#Deployment+Behavior">Deployment Behavior</a></li>
+  <li><a href="#Extension+Points">Extension Points</a></li>
+  <li><a href="#Providers">Providers</a></li>
+  <li><a href="#Services">Services</a></li>
+  <li><a href="#Standard+Providers">Standard Providers</a></li>
+  <li><a href="#Rewrite+Provider">Rewrite Provider</a></li>
+  <li><a href="#Gateway+Services">Gateway Services</a></li>
+  <li><a href="#KnoxSSO+Integration">KnoxSSO Integration</a></li>
+  <li><a href="#Auditing">Auditing</a></li>
+  <li><a href="#Logging">Logging</a></li>
+  <li><a href="#Internationalization">Internationalization</a></li>
+</ul><h2><a id="Overview">Overview</a> <a href="#Overview"><img 
src="markbook-section-link.png"/></a></h2><p>Apache Knox gateway is a 
specialized reverse proxy gateway for various Hadoop REST APIs. However, the 
gateway is built entirely upon a fairly generic framework. This framework is 
used to &ldquo;plug-in&rdquo; all of the behavior that makes it specific to 
Hadoop in general and any particular Hadoop REST API. It would be equally as 
possible to create a customized reverse proxy for other non-Hadoop HTTP 
endpoints. This approach is taken to ensure that the Apache Knox gateway can 
scale with the rapidly evolving Hadoop ecosystem.</p><p>Throughout this guide 
we will be using a publicly available REST API to demonstrate the development 
of various extension mechanisms. <a 
href="http://openweathermap.org/";>http://openweathermap.org/</a></p><h3><a 
id="Architecture+Overview">Architecture Overview</a> <a 
href="#Architecture+Overview"><img 
src="markbook-section-link.png"/></a></h3><p>The 
 gateway itself is a layer over an embedded Jetty JEE server. At the very 
highest level the gateway processes requests by using request URLs to lookup 
specific JEE Servlet Filter chain that is used to process the request. The 
gateway framework provides extensible mechanisms to assemble chains of custom 
filters that support secured access to services.</p><p>The gateway has two 
primary extensibility mechanisms: Service and Provider. The Service 
extensibility framework provides a way to add support for new HTTP/REST 
endpoints. For example, the support for WebHdfs is plugged into the Knox 
gateway as a Service. The Provider extensibility framework allows adding new 
features to the gateway that can be used across Services. An example of a 
Provider is an authentication provider. Providers can also expose APIs that 
other service and provider extensions can utilize.</p><p>Service and Provider 
integrations interact with the gateway framework in two distinct phases: 
Deployment and Runtime. The 
 gateway framework can be thought of as a layer over the JEE Servlet framework. 
Specifically all runtime processing within the gateway is performed by JEE 
Servlet Filters. The two phases interact with this JEE Servlet Filter based 
model in very different ways. The first phase, Deployment, is responsible for 
converting a fairly simple-to-understand configuration, called a topology, into 
JEE Web Archive (WAR) based implementation details. The second phase, Runtime, is
the processing of requests via a set of Filters configured in the 
WAR.</p><p>From an &ldquo;ethos&rdquo; perspective, Service and Provider 
extensions should attempt to incur complexity associated with configuration in 
the deployment phase. This should allow for very streamlined request processing 
that is very high performance and easily testable. The preference at runtime, 
in OO style, is for small classes that perform a specific function. The ideal 
set of implementation classes is then assembled by the Service and Provider 
plugins during deployment.</p><p>A second critical design consideration is 
streaming. The processing infrastructure is built around JEE Servlet Filters as 
they provide a natural streaming interception model. All Provider 
implementations should make every attempt to maintain this streaming 
characteristic.</p><h3><a id="Project+Overview">Project Overview</a> <a
href="#Project+Overview"><img src="markbook-section-link.png"/></a></h3><p>The 
table below describes the purpose of the current modules in the project. Of 
particular importance are the root pom.xml and the gateway-release module. The 
root pom.xml is critical because this is where all dependency versions must be
declared. There should be no dependency version information in module pom.xml 
files. The gateway-release module is critical because the dependencies declared 
there essentially define the classpath of the released gateway server. This is 
also true of the other -release modules in the project.</p>
+<table>
+  <thead>
+    <tr>
+      <th>File/Module </th>
+      <th>Description </th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>LICENSE </td>
+      <td>The license for all source files in the release. </td>
+    </tr>
+    <tr>
+      <td>NOTICE </td>
+      <td>Attributions required by dependencies. </td>
+    </tr>
+    <tr>
+      <td>README </td>
+      <td>A brief overview of the Knox project. </td>
+    </tr>
+    <tr>
+      <td>CHANGES </td>
+      <td>A description of the changes for each release. </td>
+    </tr>
+    <tr>
+      <td>ISSUES </td>
+      <td>The knox issues for the current release. </td>
+    </tr>
+    <tr>
+      <td>gateway-util-common </td>
+      <td>Common low level utilities used by many modules. </td>
+    </tr>
+    <tr>
+      <td>gateway-util-launcher </td>
+      <td>The launcher framework. </td>
+    </tr>
+    <tr>
+      <td>gateway-util-urltemplate </td>
+      <td>The URL template and rewrite utilities. </td>
+    </tr>
+    <tr>
+      <td>gateway-i18n </td>
+      <td>The i18n logging and resource framework. </td>
+    </tr>
+    <tr>
+      <td>gateway-i18n-logging-log4j </td>
+      <td>The integration of i18n logging with log4j. </td>
+    </tr>
+    <tr>
+      <td>gateway-i18n-logging-sl4j </td>
+      <td>The integration of i18n logging with SLF4J. </td>
+    </tr>
+    <tr>
+      <td>gateway-spi </td>
+      <td>The SPI for service and provider extensions. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-identity-assertion-common </td>
+      <td>The identity assertion provider base. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-identity-assertion-concat </td>
+      <td>An identity assertion provider that facilitates prefix and suffix 
concatenation.</td>
+    </tr>
+    <tr>
+      <td>gateway-provider-identity-assertion-pseudo </td>
+      <td>The default identity assertion provider. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-jersey </td>
+      <td>The jersey display provider. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-rewrite </td>
+      <td>The URL rewrite provider. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-rewrite-func-hostmap-static </td>
+      <td>Host mapping function extension to rewrite. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-rewrite-func-service-registry </td>
+      <td>Service registry function extension to rewrite. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-rewrite-step-secure-query </td>
+      <td>Crypto step extension to rewrite. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-security-authz-acls </td>
+      <td>Service level authorization. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-security-jwt </td>
+      <td>JSON Web Token utilities. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-security-preauth </td>
+      <td>Preauthenticated SSO header support. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-security-shiro </td>
+      <td>Shiro authentication integration. </td>
+    </tr>
+    <tr>
+      <td>gateway-provider-security-webappsec </td>
+      <td>Filters to prevent common webapp security issues. </td>
+    </tr>
+    <tr>
+      <td>gateway-service-as </td>
+      <td>The implementation of the Access service POC. </td>
+    </tr>
+    <tr>
+      <td>gateway-service-definitions </td>
+      <td>The implementation of the Service definition and rewrite files. </td>
+    </tr>
+    <tr>
+      <td>gateway-service-hbase </td>
+      <td>The implementation of the HBase service. </td>
+    </tr>
+    <tr>
+      <td>gateway-service-hive </td>
+      <td>The implementation of the Hive service. </td>
+    </tr>
+    <tr>
+      <td>gateway-service-oozie </td>
+      <td>The implementation of the Oozie service. </td>
+    </tr>
+    <tr>
+      <td>gateway-service-tgs </td>
+      <td>The implementation of the Ticket Granting service POC. </td>
+    </tr>
+    <tr>
+      <td>gateway-service-webhdfs </td>
+      <td>The implementation of the WebHdfs service. </td>
+    </tr>
+    <tr>
+      <td>gateway-server </td>
+      <td>The implementation of the Knox gateway server. </td>
+    </tr>
+    <tr>
+      <td>gateway-shell </td>
+      <td>The implementation of the Knox Groovy shell. </td>
+    </tr>
+    <tr>
+      <td>gateway-test-ldap </td>
+      <td>Pulls in all of the dependencies of the test LDAP server. </td>
+    </tr>
+    <tr>
+      <td>gateway-server-launcher </td>
+      <td>The launcher definition for the gateway. </td>
+    </tr>
+    <tr>
+      <td>gateway-shell-launcher </td>
+      <td>The launcher definition for the shell. </td>
+    </tr>
+    <tr>
+      <td>knox-cli-launcher </td>
+      <td>A module to pull in all of the dependencies of the CLI. </td>
+    </tr>
+    <tr>
+      <td>gateway-test-ldap-launcher </td>
+      <td>The launcher definition for the test LDAP server. </td>
+    </tr>
+    <tr>
+      <td>gateway-release </td>
+      <td>The definition of the gateway binary release. Contains content and 
dependencies to be included in binary gateway package. </td>
+    </tr>
+    <tr>
+      <td>gateway-test-utils </td>
+      <td>Various utilities used in unit and system tests. </td>
+    </tr>
+    <tr>
+      <td>gateway-test </td>
+      <td>The functional tests. </td>
+    </tr>
+    <tr>
+      <td>pom.xml </td>
+      <td>The top level pom. </td>
+    </tr>
+    <tr>
+      <td>build.xml </td>
+      <td>A collection of utilities for building and releasing. </td>
+    </tr>
+  </tbody>
+</table><h3><a id="Development+Processes">Development Processes</a> <a 
href="#Development+Processes"><img 
src="markbook-section-link.png"/></a></h3><p>The project uses Maven in general 
with a few convenience Ant targets.</p><p>Building the project can be built via 
Maven or Ant. The two commands below are equivalent.</p>
+<pre><code>mvn clean install
+ant
+</code></pre><p>A more complete build can be done that also generates 
the unsigned ZIP release artifacts. You will find these in the target/{version} 
directory (e.g. target/0.10.0-SNAPSHOT).</p>
+<pre><code>mvn -Prelease clean install
+ant release
+</code></pre><p>There are a few other Ant targets that are especially 
convenient for testing.</p><p>This command installs the gateway into the 
<code>install</code> directory of the project. Note that this command does not 
first build the project.</p>
+<pre><code>ant install-test-home
+</code></pre><p>This command starts the gateway and LDAP servers installed by 
the command above in a test GATEWAY_HOME (i.e. install). Note that this 
command does not first install the test home.</p>
+<pre><code>ant start-test-servers
+</code></pre><p>Putting things together, the following Ant command will 
build a release, install it, and start the servers ready for manual testing.</p>
+<pre><code>ant release install-test-home start-test-servers
+</code></pre><h2><a id="Behavior">Behavior</a> <a href="#Behavior"><img 
src="markbook-section-link.png"/></a></h2><p>There are two distinct phases in 
the behavior of the gateway. These are the deployment and runtime phases. The 
deployment phase is responsible for converting topology descriptors into an 
executable JEE style WAR. The runtime phase is the processing of requests via 
the WAR created during the deployment phase.</p><p>The deployment phase is arguably
the more complex of the two phases. This is because runtime relies on well 
known JEE constructs while deployment introduces new framework concepts. The 
base concept of the deployment framework is that of a 
&ldquo;contributor&rdquo;. In the framework, contributors are pluggable 
components responsible for generating JEE WAR artifacts from topology
files.</p><h3><a id="Deployment+Behavior">Deployment Behavior</a> <a 
href="#Deployment+Behavior"><img 
src="markbook-section-link.png"/></a></h3><p>The goal of the deployment phase 
is to ta
 ke easy to understand topology descriptions and convert them into optimized 
runtime artifacts. Our goal is not only should the topology descriptors be easy 
to understand, but have them be easy for a management system (e.g. Ambari) to 
generate. Think of deployment as compiling an assembly descriptor into a JEE 
WAR. WARs are then deployed to an embedded JEE container (i.e. 
Jetty).</p><p>Consider the results of starting the gateway the first time. 
There are two sets of files that are relevant for deployment. The first is the 
topology file <code>&lt;GATEWAY_HOME&gt;/conf/topologies/sandbox.xml</code>. 
The second set is the WAR structure created during the deployment of the 
topology file.</p>
+<pre><code>data/deployments/sandbox.war.143bfef07f0/WEB-INF
+  web.xml
+  gateway.xml
+  shiro.ini
+  rewrite.xml
+  hostmap.txt
+</code></pre><p>Notice that the directory <code>sandbox.war.143bfef07f0</code> 
is an &ldquo;unzipped&rdquo; representation of a JEE WAR file. This 
specifically means that it contains a <code>WEB-INF</code> directory which 
contains a <code>web.xml</code> file. For the curious, the strange number (i.e. 
143bfef07f0) in the name of the WAR directory is an encoded timestamp. This is 
the timestamp of the topology file (i.e. sandbox.xml) at the time the 
deployment occurred. This value is used to determine when topology files have 
changed and redeployment is required (a quick way to decode such a value is 
sketched after the file overview below).</p><p>Here is a brief overview of the 
purpose of each file in the WAR structure.</p>
+<dl><dt>web.xml</dt><dd>A standard JEE WAR descriptor. In this case a built-in 
GatewayServlet is mapped to the URL pattern /*.</dd><dt>gateway.xml</dt><dd>The 
configuration file for the GatewayServlet. Defines the filter chain that will 
be applied to each service&rsquo;s various URLs.</dd><dt>shiro.ini</dt><dd>The 
configuration file for the Shiro authentication provider&rsquo;s filters. This 
information is derived from the information in the provider section of the 
topology file.</dd><dt>rewrite.xml</dt><dd>The configuration file for the 
rewrite provider&rsquo;s filter. This captures all of the rewrite rules for the 
services. These rules are contributed by the contributors for each 
service.</dd><dt>hostmap.txt</dt><dd>The configuration file for the hostmap 
provider&rsquo;s filter. This information is derived from the information in 
the provider section of the topology file.</dd>
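+</dl><p>As an aside, the encoded timestamp mentioned above can be decoded with 
a few lines of Java. This is a minimal sketch that assumes the value is the 
topology file's modification time rendered in hexadecimal, which matches the 
example directory name shown earlier; the class name is ours for illustration.</p>
+<pre><code class="java">public class DecodeDeploymentTimestamp {
+  public static void main( String[] args ) {
+    // The WAR directory suffix (e.g. 143bfef07f0) is assumed to be the
+    // topology file's modification time encoded in hexadecimal.
+    long timestamp = Long.parseLong( &quot;143bfef07f0&quot;, 16 );
+    System.out.println( new java.util.Date( timestamp ) ); // a date in early 2014
+  }
+}
+</code></pre>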
+<p>The deployment framework follows &ldquo;visitor&rdquo; style patterns. 
Each topology file is parsed and the various constructs within it are 
&ldquo;visited&rdquo;. The appropriate contributor for each visited construct 
is selected by the framework. The contributor is then passed the construct from 
the topology file and asked to update the JEE WAR artifacts. Each contributor 
is free to inspect and modify any portion of the WAR artifacts.</p><p>The 
diagram below provides an overview of the deployment processing. Detailed 
descriptions of each step follow the diagram.</p><p><img 
src='deployment-overview.png'/></p>
+<ol>
+  <li><p>The gateway server loads a topology file from conf/topologies into an 
internal structure.</p></li>
+  <li><p>The gateway server delegates to a deployment factory to create the 
JEE WAR structure.</p></li>
+  <li><p>The deployment factory first creates a basic WAR structure with 
WEB-INF/web.xml.</p></li>
+  <li><p>Each provider and service in the topology is visited and the 
appropriate deployment contributor invoked. Each contributor is passed the 
appropriate information from the topology and modifies the WAR 
structure.</p></li>
+  <li><p>A complete WAR structure is returned to the gateway server.</p></li>
+  <li><p>The gateway server uses internal container APIs to dynamically deploy 
the WAR.</p></li>
+</ol><p>The Java method below is the actual code from the DeploymentFactory 
that implements this behavior. You will note the initialize, contribute, 
finalize sequence. Each contributor is given three opportunities to interact 
with the topology and archive. This allows the various contributors to interact 
if required. For example, the service contributors use the deployment 
descriptor added to the WAR by the rewrite provider.</p>
+<pre><code class="java">public static WebArchive createDeployment( 
GatewayConfig config, Topology topology ) {
+  Map&lt;String,List&lt;ProviderDeploymentContributor&gt;&gt; providers;
+  Map&lt;String,List&lt;ServiceDeploymentContributor&gt;&gt; services;
+  DeploymentContext context;
+
+  providers = selectContextProviders( topology );
+  services = selectContextServices( topology );
+  context = createDeploymentContext( config, topology.getName(), topology, 
providers, services );
+
+  initialize( context, providers, services );
+  contribute( context, providers, services );
+  finalize( context, providers, services );
+
+  return context.getWebArchive();
+}
+</code></pre><p>Below is a diagram that provides more detail. This diagram 
focuses on the interactions between the deployment factory and the service 
deployment contributors. Detailed description of each step follow the 
diagram.</p><p><img src='deployment-service.png'/></p>
+<ol>
+  <li><p>The gateway server loads global configuration (i.e. 
&lt;GATEWAY_HOME&gt;/conf/gateway-site.xml).</p></li>
+  <li><p>The gateway server loads a topology descriptor file.</p></li>
+  <li><p>The gateway server delegates to the deployment factory to create a 
deployable WAR structure.</p></li>
+  <li><p>The deployment factory creates a runtime descriptor to configure the 
gateway servlet.</p></li>
+  <li><p>The deployment factory creates a basic WAR structure and adds the 
gateway servlet runtime descriptor to it.</p></li>
+  <li><p>The deployment factory creates a deployment context object and adds 
the WAR structure to it.</p></li>
+  <li><p>For each service defined in the topology descriptor file the 
appropriate service deployment contributor is selected and invoked. The correct 
service deployment contributor is determined by matching the role of a service 
in the topology descriptor to a value provided by the getRole() method of the 
ServiceDeploymentContributor interface. The initializeContribution method from 
<em>each</em> service identified in the topology is called. Each service 
deployment contributor is expected to set up any runtime artifacts in the WAR 
that other services or providers may need.</p></li>
+  <li><p>The contributeService method from <em>each</em> service identified in 
the topology is called. This is where the service deployment contributors will 
modify any runtime descriptors.</p></li>
+  <li><p>One of the ways that a service deployment contributor can modify the 
runtime descriptors is by asking the framework to contribute filters. This is 
how services are loosely coupled to the providers of features. For example, a 
service deployment contributor might ask the framework to contribute the 
filters required for authorization. The deployment framework will then delegate 
to the correct provider deployment contributor to add filters for that 
feature.</p></li>
+  <li><p>Finally the finalizeContribution method for each service is invoked. 
This provides an opportunity to react to anything done via the 
contributeService invocations and tie up any loose ends.</p></li>
+  <li><p>The populated WAR is returned to the gateway server.</p></li>
+</ol><p>The following diagram provides expanded detail on the behavior of 
provider deployment contributors. Much of the beginning and end of the sequence 
shown overlaps with the service deployment sequence above. Those steps (i.e. 
1-6, 17) will not be described below for brevity. The remaining steps have
detailed descriptions following the diagram.</p><p><img 
src='deployment-provider.png'/></p>
+<ol>
+  <li><p>For each provider the appropriate provider deployment contributor is 
selected and invoked. The correct provider deployment contributor is determined 
by first matching the role of a provider in the topology descriptor to a value 
provided by the getRole() method of the ProviderDeploymentContributor 
interface. If this is ambiguous, the name from the topology is used to match 
the value provided by the getName() method of the ProviderDeploymentContributor 
interface. The initializeContribution method from <em>each</em> provider 
identified in the topology is called. Each provider deployment contributor is 
expected to set up any runtime artifacts in the WAR that other services or 
providers may need. Note: In addition, other providers not explicitly 
referenced in the topology may have their initializeContribution method called. 
If this is the case only one default instance for each role declared via the 
getRole() method will be used. The method used to determine the default 
instance is non-deterministic so it is best to select a particular named 
instance of a provider for each role.</p></li>
+  <li><p>Each provider deployment contributor will typically add any runtime 
deployment descriptors it requires for operation. These descriptors are added 
to the WAR structure within the deployment context.</p></li>
+  <li><p>The contributeProvider method of each configured or default provider 
deployment contributor is invoked.</p></li>
+  <li><p>Each provider deployment contributor populates any runtime deployment 
descriptors based on information in the topology.</p></li>
+  <li><p>Provider deployment contributors are never asked to contribute to the 
deployment directly. Instead a service deployment contributor will ask to have 
a particular provider role (e.g. authentication) contribute to the 
deployment.</p></li>
+  <li><p>A service deployment contributor asks the framework to contribute 
filters for a given provider role.</p></li>
+  <li><p>The framework selects the appropriate provider deployment contributor 
and invokes its contributeFilter method.</p></li>
+  <li><p>During this invocation the provider deployment contributor populates 
service-specific information. In particular it will add filters to the 
gateway servlet&rsquo;s runtime descriptor by adding JEE Servlet Filters. These 
filters will be added to the resources (or URLs) identified by the service 
deployment contributor.</p></li>
+  <li><p>The finalizeContribution method of all referenced and default provider 
deployment contributors is invoked.</p></li>
+  <li><p>The provider deployment contributor is expected to perform any final 
modifications to the runtime descriptors in the WAR structure.</p></li>
+</ol><h3><a id="Runtime+Behavior">Runtime Behavior</a> <a 
href="#Runtime+Behavior"><img src="markbook-section-link.png"/></a></h3><p>The 
runtime behavior of the gateway is somewhat simpler as it more or less follows 
well known JEE models. There is one significant wrinkle. The filter chains are 
managed within the GatewayServlet as opposed to being managed by the JEE 
container. This is the result of an early decision made in the project. The 
intention is to allow more powerful URL matching than is provided by the JEE 
Servlet mapping mechanisms.</p><p>The diagram below provides a high level 
overview of the runtime processing. An explanation for each step is provided 
after the diagram.</p><p><img src='runtime-overview.png'/></p>
+<ol>
+  <li><p>A REST client makes an HTTP request that is received by the embedded 
JEE container.</p></li>
+  <li><p>A filter chain is looked up in a map of URLs to filter 
chains.</p></li>
+  <li><p>The filter chain, which is itself a filter, is invoked.</p></li>
+  <li><p>Each filter invokes the filters that follow it in the chain. The 
request and response objects can be wrapped in typical JEE Filter fashion (see 
the sketch after this list). A filter may also choose not to continue chain 
processing and simply return, if that is appropriate.</p></li>
+  <li><p>Eventually the last filter in the chain is invoked. 
Typically this is a special &ldquo;dispatch&rdquo; filter that is responsible 
for dispatching the request to the ultimate endpoint. Dispatch filters are also 
responsible for reading the response.</p></li>
+  <li><p>The response may be in the form of a number of content types (e.g. 
application/json, text/xml).</p></li>
+  <li><p>The response entity is streamed through the various response wrappers 
added by the filters. These response wrappers may rewrite various portions of 
the headers and body as per their configuration.</p></li>
+  <li><p>The return of the response entity to the client is ultimately 
&ldquo;pulled through&rdquo; the filter response wrapper by the 
container.</p></li>
+  <li><p>The response entity is returned to the original client.</p></li>
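+</ol><p>The minimal filter below sketches the wrapping model from step 4. It 
is illustrative only (the class name is hypothetical and not taken from the 
project); the idea is that a filter wraps the response so the entity can be 
rewritten as it streams back, then continues the chain.</p>
+<pre><code class="java">import java.io.IOException;
+import javax.servlet.*;
+import javax.servlet.http.HttpServletResponse;
+import javax.servlet.http.HttpServletResponseWrapper;
+
+public class SampleWrappingFilter implements Filter {
+  public void doFilter( ServletRequest request, ServletResponse response, FilterChain chain )
+      throws IOException, ServletException {
+    // Wrap the response so the entity can be intercepted as it streams back.
+    HttpServletResponseWrapper wrapper =
+        new HttpServletResponseWrapper( (HttpServletResponse)response ) {
+          // Override getOutputStream()/getWriter() here to rewrite the body.
+        };
+    // Continue the chain; a filter may instead return without calling
+    // chain.doFilter(...) to stop processing.
+    chain.doFilter( request, wrapper );
+  }
+  public void init( FilterConfig config ) {}
+  public void destroy() {}
+}
+</code></pre>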
+<p>This diagram provides a more detailed breakdown of the request 
processing. Again, descriptions of each step follow the diagram.</p><p><img 
src='runtime-request-processing.png'/></p>
+<ol>
+  <li><p>A REST client makes an HTTP request that is received by the embedded 
JEE container.</p></li>
+  <li><p>The embedded container looks up the servlet mapped to the URL and 
invokes the service method. In our case the GatewayServlet is mapped to /* 
and therefore receives all requests for a given topology. Keep in mind that the 
WAR itself is deployed on a root context path that typically contains a level 
for the gateway and the name of the topology. This means that there is a single 
GatewayServlet per topology and it is effectively mapped to 
&lt;gateway&gt;/&lt;topology&gt;/*.</p></li>
+  <li><p>The GatewayServlet holds a single reference to a GatewayFilter which 
is a specialized JEE Servlet Filter. This choice was made to allow the 
GatewayServlet to dynamically deploy modified topologies. This is done by 
building a new GatewayFilter instance and replacing the old one in an atomic 
fashion, as sketched after this list.</p></li>
+  <li><p>The GatewayFilter contains another layer of URL mapping as defined in 
the gateway.xml runtime descriptor. The various service deployment contributors 
added these mappings at deployment time. Each service may add a number of 
different sub-URLs depending on its requirements. These sub-URLs will all be 
mapped to independently configured filter chains.</p></li>
+  <li><p>The GatewayFilter invokes the doFilter method on the selected 
chain.</p></li>
+  <li><p>The chain invokes the doFilter method of the first filter in the 
chain.</p></li>
+  <li><p>Each filter in the chain continues processing by invoking the 
doFilter on the next filter in the chain. Ultimately a dispatch filter forwards 
the request to the real service instead of invoking another filter. This is 
sometimes referred to as pivoting.</p></li>
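+</ol><p>The atomic filter replacement described in step 3 can be sketched as 
follows. This is a simplified illustration, not the project's actual code; the 
holder class and method names are hypothetical.</p>
+<pre><code class="java">import java.util.concurrent.atomic.AtomicReference;
+import javax.servlet.Filter;
+
+public class AtomicFilterHolder {
+  // The current chain; replaced wholesale when a topology is redeployed.
+  private final AtomicReference&lt;Filter&gt; current = new AtomicReference&lt;Filter&gt;();
+
+  // Called with a freshly built filter when a modified topology is deployed.
+  public void replace( Filter newFilter ) {
+    current.set( newFilter ); // new requests immediately see the new chain
+  }
+
+  // Called by the servlet's service method for each request.
+  public Filter get() {
+    return current.get();
+  }
+}
+</code></pre>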
+</ol><h2><a id="Gateway+Servlet+&+Gateway+Filter">Gateway Servlet &amp; 
Gateway Filter</a> <a href="#Gateway+Servlet+&+Gateway+Filter"><img 
src="markbook-section-link.png"/></a></h2><p>TODO</p>
+<pre><code class="xml">&lt;web-app&gt;
+
+  &lt;servlet&gt;
+    &lt;servlet-name&gt;sandbox&lt;/servlet-name&gt;
+    
&lt;servlet-class&gt;org.apache.hadoop.gateway.GatewayServlet&lt;/servlet-class&gt;
+    &lt;init-param&gt;
+      &lt;param-name&gt;gatewayDescriptorLocation&lt;/param-name&gt;
+      &lt;param-value&gt;gateway.xml&lt;/param-value&gt;
+    &lt;/init-param&gt;
+  &lt;/servlet&gt;
+
+  &lt;servlet-mapping&gt;
+    &lt;servlet-name&gt;sandbox&lt;/servlet-name&gt;
+    &lt;url-pattern&gt;/*&lt;/url-pattern&gt;
+  &lt;/servlet-mapping&gt;
+
+  &lt;listener&gt;
+    
&lt;listener-class&gt;org.apache.hadoop.gateway.services.GatewayServicesContextListener&lt;/listener-class&gt;
+  &lt;/listener&gt;
+
+  ...
+
+&lt;/web-app&gt;
+</code></pre>
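+<p>The gateway.xml descriptor referenced by the gatewayDescriptorLocation init 
parameter above then defines resources and the filter chains applied to them, 
for example:</p>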
+<pre><code class="xml">&lt;gateway&gt;
+
+  &lt;resource&gt;
+    &lt;role&gt;WEATHER&lt;/role&gt;
+    &lt;pattern&gt;/weather/**?**&lt;/pattern&gt;
+
+    &lt;filter&gt;
+      &lt;role&gt;authentication&lt;/role&gt;
+      &lt;name&gt;sample&lt;/name&gt;
+      &lt;class&gt;...&lt;/class&gt;
+    &lt;/filter&gt;
+
+    &lt;filter&gt;...&lt;/filter&gt;*
+
+  &lt;/resource&gt;
+
+&lt;/gateway&gt;
+</code></pre>
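+<p>The unit test below demonstrates how a resource pattern like the one above 
is matched against a request URL to select a filter chain. The Template, Parser 
and Matcher classes come from the project's URL template utilities (the 
gateway-util-urltemplate module).</p>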
+<pre><code class="java">@Test
+public void testDevGuideSample() throws Exception {
+  Template pattern, input;
+  Matcher&lt;String&gt; matcher;
+  Matcher&lt;String&gt;.Match match;
+
+  // GET http://api.openweathermap.org/data/2.5/weather?q=Palo+Alto
+  pattern = Parser.parse( &quot;/weather/**?**&quot; );
+  input = Parser.parse( &quot;/weather/2.5?q=Palo+Alto&quot; );
+
+  matcher = new Matcher&lt;String&gt;();
+  matcher.add( pattern, &quot;fake-chain&quot; );
+  match = matcher.match( input );
+
+  assertThat( match.getValue(), is( &quot;fake-chain&quot;) );
+}
+</code></pre><h2><a id="Extension+Logistics">Extension Logistics</a> <a 
href="#Extension+Logistics"><img 
src="markbook-section-link.png"/></a></h2><p>There are a number of extension 
points available in the gateway: services, providers, rewrite steps and 
functions, etc. All of these use the Java ServiceLoader mechanism for their 
discovery. There are two ways to make these extensions available on the class 
path at runtime. The first way to add a new module to the project and have the 
extension &ldquo;built-in&rdquo;. The second is to add the extension to the 
class path of the server after it is installed. Both mechanism are described in 
more detail below.</p><h3><a id="Service+Loaders">Service Loaders</a> <a 
href="#Service+Loaders"><img 
src="markbook-section-link.png"/></a></h3><p>Extensions are discovered via 
Java&rsquo;s [Service 
Loader|http://docs.oracle.com/javase/6/docs/api/java/util/ServiceLoader.html] 
mechanism. There are good [tutorials|http://docs.oracle.com/javase/tutorial/e
 xt/basics/spi.html] available for learning more about this. The basics come 
town to two things.</p>
+<ol>
+  <li><p>Implement the service contract interface (e.g. 
ServiceDeploymentContributor, ProviderDeploymentContributor)</p></li>
+  <li><p>Create a file in META-INF/services of the JAR that will contain the 
extension. This file will be named as the fully qualified name of the contract 
interface (e.g. 
org.apache.hadoop.gateway.deploy.ProviderDeploymentContributor). The contents 
of the file will be the fully qualified names of any implementations of that 
contract interface in that JAR (an example follows this list).</p></li>
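+</ol><p>For example, a JAR containing the ShiroDeploymentContributor 
referenced in the test below might include the file:</p>
+<pre><code>META-INF/services/org.apache.hadoop.gateway.deploy.ProviderDeploymentContributor
+</code></pre><p>whose contents would be a single line naming the 
implementation (the package shown here is illustrative):</p>
+<pre><code>org.apache.hadoop.gateway.shiro.ShiroDeploymentContributor
+</code></pre>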
+<p>One tip is to include a simple test with each of your extensions to 
ensure that it will be properly discovered. This is very helpful in situations 
where a refactoring fails to update a class name in the META-INF/services 
files. An example of one such test from the project is shown below.</p>
+<pre><code class="java">  @Test
+  public void testServiceLoader() throws Exception {
+    ServiceLoader loader = ServiceLoader.load( 
ProviderDeploymentContributor.class );
+    Iterator iterator = loader.iterator();
+    assertThat( &quot;Service iterator empty.&quot;, iterator.hasNext() );
+    while( iterator.hasNext() ) {
+      Object object = iterator.next();
+      if( object instanceof ShiroDeploymentContributor ) {
+        return;
+      }
+    }
+    fail( &quot;Failed to find &quot; + 
ShiroDeploymentContributor.class.getName() + &quot; via service loader.&quot; );
+  }
+</code></pre><h3><a id="Class+Path">Class Path</a> <a href="#Class+Path"><img 
src="markbook-section-link.png"/></a></h3><p>One way to extend the 
functionality of the server without having to recompile is to add the extension 
JARs to the servers class path. As an extensible server this is made straight 
forward but it requires some understanding of how the server&rsquo;s classpath 
is setup. In the <GATEWAY_HOME> directory there are four class path related 
directories (i.e. bin, lib, dep, ext).</p><p>The bin directory contains very 
small &ldquo;launcher&rdquo; jars that contain only enough code to read 
configuration and setup a class path. By default the configuration of a 
launcher is embedded with the launcher JAR but it may also be extracted into a 
.cfg file. In that file you will see how the class path is defined.</p>
+<pre><code>class.path=../lib/*.jar,../dep/*.jar;../ext;../ext/*.jar
+</code></pre><p>The paths are all relative to the directory that contains the 
launcher JAR.</p>
+<dl><dt>../lib/*.jar</dt><dd>These are the &ldquo;built-in&rdquo; jars that 
are part of the project itself. Information is provided elsewhere in this 
document for how to integrate a built-in 
extension.</dd><dt>../dep/*.jar</dt><dd>These are the JARs for all of the 
external dependencies of the project. This separation between the generated 
JARs and dependencies helps keep licensing issues 
straight.</dd><dt>../ext</dt><dd>This directory is for post-install extensions
and is empty by default. Including the directory (vs *.jar) allows for 
individual classes to be placed in this directory.
+</dd><dt>../ext/*.jar</dt><dd>This would pick up all extension JARs placed in 
the ext directory.</dd>
+</dl><p>Note that order is significant. The lib JARs take precedence over dep 
JARs and they take precedence over ext classes and JARs.</p><h3><a 
id="Maven+Module">Maven Module</a> <a href="#Maven+Module"><img 
src="markbook-section-link.png"/></a></h3><p>Integrating an extension into the 
project follows well established Maven patterns for adding modules. Below are 
several points that are somewhat unique to the Knox project.</p>
+<ol>
+  <li><p>Add the module to the root pom.xml file&rsquo;s &lt;modules&gt; list. 
Take care to ensure that the module is in the correct place in the list based 
on its dependencies. Note: In general, modules should not have non-test 
dependencies on gateway-server but rather on gateway-spi.</p></li>
+  <li><p>Any new dependencies must be represented in the root pom.xml 
file&rsquo;s &lt;dependencyManagement&gt; section. The required version of the 
dependencies will be declared there. The new sub-module&rsquo;s pom.xml file 
must not include dependency version information. This helps prevent dependency 
version conflict issues.</p></li>
+  <li><p>If the extension is to be &ldquo;built into&rdquo; the released 
gateway server it needs to be added as a dependency to the gateway-release 
module. This is done by adding it to the &lt;dependencies&gt; section of the 
gateway-release module&rsquo;s pom.xml file. If this isn&rsquo;t done the JARs 
for the module will not be automatically packaged into the release artifacts. 
Leaving the dependency out can be useful while an extension is under 
development but not yet ready for inclusion in the release.</p></li>
+</ol><p>More detailed examples of adding both a service and a provider 
extension are provided in subsequent sections.</p><h3><a 
id="Services">Services</a> <a href="#Services"><img 
src="markbook-section-link.png"/></a></h3><p>Services are extensions that are 
responsible for converting information in the topology file to runtime 
descriptors. Typically services do not require their own runtime descriptors. 
Rather, they modify either the gateway runtime descriptor (i.e. gateway.xml) or 
descriptors of other providers (e.g. rewrite.xml).</p><p>The service provider 
interface for a Service is ServiceDeploymentContributor and is shown below.</p>
+<pre><code class="java">package org.apache.hadoop.gateway.deploy;
+import org.apache.hadoop.gateway.topology.Service;
+public interface ServiceDeploymentContributor {
+  String getRole();
+  void initializeContribution( DeploymentContext context );
+  void contributeService( DeploymentContext context, Service service ) throws 
Exception;
+  void finalizeContribution( DeploymentContext context );
+}
+</code></pre><p>Each service provides an implementation of this interface that 
is discovered via the ServiceLoader mechanism previously described. The meaning 
of this is best understood in the context of the structure of the topology 
file. A fragment of a topology file is shown below.</p>
+<pre><code class="xml">&lt;topology&gt;
+    &lt;gateway&gt;
+        ....
+    &lt;/gateway&gt;
+    &lt;service&gt;
+        &lt;role&gt;WEATHER&lt;/role&gt;
+        &lt;url&gt;http://api.openweathermap.org/data&lt;/url&gt;
+    &lt;/service&gt;
+    ....
+&lt;/topology&gt;
+</code></pre><p>With these two things in mind, a more detailed description of 
the purpose of each ServiceDeploymentContributor method should be helpful.</p>
+<dl><dt>String getRole();</dt><dd>This is the value the framework uses to 
associate a given <code>&lt;service&gt;&lt;role&gt;</code> with a particular 
ServiceDeploymentContributor implementation. See below how the example 
WeatherDeploymentContributor implementation returns the role WEATHER that 
matches the value in the topology file. This will result in the 
WeatherDeploymentContributor&rsquo;s methods being invoked when a WEATHER 
service is encountered in the topology file.</dd>
+</dl>
+<pre><code class="java">public class WeatherDeploymentContributor extends 
ServiceDeploymentContributorBase {
+  private static final String ROLE = &quot;WEATHER&quot;;
+  @Override
+  public String getRole() {
+    return ROLE;
+  }
+  ...
+}
+</code></pre>
+<dl><dt>void initializeContribution( DeploymentContext context );</dt><dd>In 
this method a contributor would create, initialize and add any descriptors it 
was responsible for to the deployment context. For the weather service example 
this isn&rsquo;t required so the empty method isn&rsquo;t shown 
here.</dd><dt>void contributeService( DeploymentContext context, Service 
service ) throws Exception;</dt><dd>In this method a service contributor 
typically adds and configures any features it requires. This method will be 
dissected in more detail below.</dd><dt>void finalizeContribution( 
DeploymentContext context );</dt><dd>In this method a contributor would 
finalize any descriptors it added to the deployment context. For 
the weather service example this isn&rsquo;t required so the empty method 
isn&rsquo;t shown here.</dd>
+</dl><h4><a id="Service+Contribution+Behavior">Service Contribution 
Behavior</a> <a href="#Service+Contribution+Behavior"><img 
src="markbook-section-link.png"/></a></h4><p>In order to understand the job of 
the ServiceDeploymentContributor a few runtime descriptors need to be 
introduced.</p>
+<dl><dt>Gateway Runtime Descriptor: WEB-INF/gateway.xml</dt><dd>This runtime 
descriptor controls the behavior of the GatewayFilter. It defines a mapping 
between resources (i.e. URL patterns) and filter chains. The sample gateway 
runtime descriptor helps illustrate.</dd>
+</dl>
+<pre><code class="xml">&lt;gateway&gt;
+  &lt;resource&gt;
+    &lt;role&gt;WEATHER&lt;/role&gt;
+    &lt;pattern&gt;/weather/**?**&lt;/pattern&gt;
+    &lt;filter&gt;
+      &lt;role&gt;authentication&lt;/role&gt;
+      &lt;name&gt;sample&lt;/name&gt;
+      &lt;class&gt;...&lt;/class&gt;
+    &lt;/filter&gt;
+    &lt;filter&gt;...&lt;/filter&gt;*
+    ...
+  &lt;/resource&gt;
+&lt;/gateway&gt;
+</code></pre>
+<dl><dt>Rewrite Provider Runtime Descriptor: WEB-INF/rewrite.xml</dt><dd>The 
rewrite provider runtime descriptor controls the behavior of the rewrite 
filter. Each service contributor is responsible for adding the rules required 
to control the URL rewriting required by that service. Later sections will 
provide more detail about the capabilities of the rewrite provider.</dd>
+</dl>
+<pre><code class="xml">&lt;rules&gt;
+  &lt;rule dir=&quot;IN&quot; 
name=&quot;WEATHER/openweathermap/inbound/versioned/file&quot;
+      pattern=&quot;*://*:*/**/weather/{version}?{**}&quot;&gt;
+    &lt;rewrite 
template=&quot;{$serviceUrl[WEATHER]}/{version}/weather?{**}&quot;/&gt;
+  &lt;/rule&gt;
+&lt;/rules&gt;
+</code></pre><p>With these two descriptors in mind a detailed breakdown of the 
WeatherDeploymentContributor&rsquo;s contributeService method will make more 
sense. At a high level the important concept is that contributeService is 
invoked by the framework for each &lt;service&gt; in the topology file.</p>
+<pre><code class="java">public class WeatherDeploymentContributor extends 
ServiceDeploymentContributorBase {
+  ...
+  @Override
+  public void contributeService( DeploymentContext context, Service service ) 
throws Exception {
+    contributeResources( context, service );
+    contributeRewriteRules( context );
+  }
+
+  private void contributeResources( DeploymentContext context, Service service 
) throws URISyntaxException {
+    ResourceDescriptor resource = context.getGatewayDescriptor().addResource();
+    resource.role( service.getRole() );
+    resource.pattern( &quot;/weather/**?**&quot; );
+    addAuthenticationFilter( context, service, resource );
+    addRewriteFilter( context, service, resource );
+    addDispatchFilter( context, service, resource );
+  }
+
+  private void contributeRewriteRules( DeploymentContext context ) throws 
IOException {
+    UrlRewriteRulesDescriptor allRules = context.getDescriptor( 
&quot;rewrite&quot; );
+    UrlRewriteRulesDescriptor newRules = loadRulesFromClassPath();
+    allRules.addRules( newRules );
+  }
+
+  ...
+}
+</code></pre><p>The DeploymentContext parameter contains information about the 
deployment as well as the WAR structure being created via deployment. The 
Service parameter is the object representation of the &lt;service&gt; element in the
topology file. Details about particularly important lines follow the code 
block.</p>
+<dl><dt>ResourceDescriptor resource = 
context.getGatewayDescriptor().addResource();</dt><dd>Obtains a reference to 
the gateway runtime descriptor and adds a new resource element. Note that many 
of the APIs in the deployment framework follow a fluent vs bean 
style.</dd><dt>resource.role( service.getRole() );</dt><dd>Sets the role for a 
particular resource. Many of the filters may need access to this role 
information in order to make runtime decisions.</dd><dt>resource.pattern( 
&ldquo;/weather/**?**&rdquo; );</dt><dd>Sets the URL pattern to which the 
filter chain that follows will be mapped within the 
GatewayFilter.</dd><dt>add*Filter( context, service, resource );</dt><dd>These
are taken from a base class. A representation of the implementation of that 
method from the base class is shown below. Notice how this essentially 
delegates back to the framework to add the filters required by a particular 
provider role (e.g. &ldquo;rewrite&rdquo;).</dd>
+</dl>
+<pre><code class="java">  protected void addRewriteFilter( DeploymentContext 
context, Service service, ResourceDescriptor resource ) {
+    context.contributeFilter( service, resource, &quot;rewrite&quot;, null, 
null );
+  }
+</code></pre>
+<dl><dt>UrlRewriteRulesDescriptor allRules = context.getDescriptor( 
&ldquo;rewrite&rdquo; );</dt><dd>Here the rewrite provider runtime descriptor 
is obtained by name from the deployment context. This does represent a tight 
coupling in this case between this service and the default rewrite provider. 
The rewrite provider, however, is unlikely to be replaced with an alternate 
implementation.</dd><dt>UrlRewriteRulesDescriptor newRules = 
loadRulesFromClassPath();</dt><dd>This is a convenience method for loading
partial rewrite descriptor information from the classpath. Developing and 
maintaining these rewrite rules is far easier as an external resource. The 
rewrite descriptor API could however have been used to achieve the same 
result.</dd><dt>allRules.addRules( newRules );</dt><dd>Here the rewrite rules 
for the weather service are merged into the larger set of rewrite rules.</dd>
+</dl>
+<pre><code class="xml">&lt;project&gt;
+    &lt;modelVersion&gt;4.0.0&lt;/modelVersion&gt;
+    &lt;parent&gt;
+        &lt;groupId&gt;org.apache.hadoop&lt;/groupId&gt;
+        &lt;artifactId&gt;gateway&lt;/artifactId&gt;
+        &lt;version&gt;0.10.0-SNAPSHOT&lt;/version&gt;
+    &lt;/parent&gt;
+
+    &lt;artifactId&gt;gateway-service-weather&lt;/artifactId&gt;
+    &lt;name&gt;gateway-service-weather&lt;/name&gt;
+    &lt;description&gt;A sample extension to the gateway for a weather REST 
API.&lt;/description&gt;
+
+    &lt;licenses&gt;
+        &lt;license&gt;
+            &lt;name&gt;The Apache Software License, Version 2.0&lt;/name&gt;
+            
&lt;url&gt;http://www.apache.org/licenses/LICENSE-2.0.txt&lt;/url&gt;
+            &lt;distribution&gt;repo&lt;/distribution&gt;
+        &lt;/license&gt;
+    &lt;/licenses&gt;
+
+    &lt;dependencies&gt;
+        &lt;dependency&gt;
+            &lt;groupId&gt;${gateway-group}&lt;/groupId&gt;
+            &lt;artifactId&gt;gateway-spi&lt;/artifactId&gt;
+        &lt;/dependency&gt;
+        &lt;dependency&gt;
+            &lt;groupId&gt;${gateway-group}&lt;/groupId&gt;
+            &lt;artifactId&gt;gateway-provider-rewrite&lt;/artifactId&gt;
+        &lt;/dependency&gt;
+
+        ... Test Dependencies ...
+
+    &lt;/dependencies&gt;
+
+&lt;/project&gt;
+</code></pre><h4><a id="Service+Definition+Files">Service Definition Files</a> 
<a href="#Service+Definition+Files"><img 
src="markbook-section-link.png"/></a></h4><p>As of release 0.6.0, the gateway 
now also supports a declarative way of plugging-in a new Service. A Service can 
be defined with a combination of two files, these are:</p>
+<pre><code>service.xml
+rewrite.xml
+</code></pre><p>The rewrite.xml file contains the rewrite rules as defined in 
other sections of this guide, and the service.xml file contains the various 
routes (paths) to be provided by the Service and the rewrite rule bindings to 
those paths. This will be described in further detail in this 
section.</p><p>While the service.xml file is absolutely required, the 
rewrite.xml file in theory is optional (though it is highly unlikely that no 
rewrite rules are needed).</p><p>To add a new service, simply add a service.xml 
and rewrite.xml file in an appropriate directory (see <a 
href="#Service+Definition+Directory+Structure">Service Definition Directory 
Structure</a>) in the module gateway-service-definitions to make the new 
service part of the Knox build.</p><h5><a id="service.xml">service.xml</a> <a 
href="#service.xml"><img src="markbook-section-link.png"/></a></h5><p>Below is 
a sample of a very simple service.xml file, using the same weather API 
example.</p>
+<pre><code class="xml">&lt;service role=&quot;WEATHER&quot; 
name=&quot;weather&quot; version=&quot;0.1.0&quot;&gt;
+    &lt;routes&gt;
+        &lt;route path=&quot;/weather/**?**&quot;/&gt;
+    &lt;/routes&gt;
+&lt;/service&gt;
+</code></pre>
+<dl><dt><strong>service</strong></dt><dd>The root tag is &lsquo;service&rsquo;, 
which has three required attributes: <em>role</em>, <em>name</em> and 
<em>version</em>. These three values disambiguate this service definition from 
others. To ensure the exact same service definition is being used in a topology 
file, all values should be specified.</dd>
+</dl>
+<pre><code class="xml">&lt;topology&gt;
+    &lt;gateway&gt;
+        ....
+    &lt;/gateway&gt;
+    &lt;service&gt;
+        &lt;role&gt;WEATHER&lt;/role&gt;
+        &lt;name&gt;weather&lt;/name&gt;
+        &lt;version&gt;0.1.0&lt;/version&gt;
+        &lt;url&gt;http://api.openweathermap.org/data&lt;/url&gt;
+    &lt;/service&gt;
+    ....
+&lt;/topology&gt;
+</code></pre><p>If only <em>role</em> is specified in the topology file (the 
only required element other than <em>url</em>) then the service definition 
found with that role and the highest version will be used. Similarly, if only 
the <em>version</em> is omitted from the topology specification of the service, 
the service definition with the highest version for that role and name will be 
used. It is therefore important to specify a version for a service if it is 
desired that a topology be locked down to a specific version of a service.</p>
+<dl><dt><strong>routes</strong></dt><dd>Wrapper element for one or more 
routes.</dd><dt><strong>route</strong></dt><dd>A route specifies the 
<em>path</em> that the service is routing as well as any rewrite bindings or 
policy bindings. Another child element that may be used here is a 
<em>dispatch</em> element.</dd><dt><strong>rewrite</strong></dt><dd>A rewrite 
rule or function that is to be applied to the path. A rewrite element contains 
an <em>apply</em> attribute that references the rewrite function or rule by
name. Along with the <em>apply</em> attribute, a <em>to</em> attribute must be 
used. The <em>to</em> specifies what part of the request or response to 
rewrite. The valid values for the <em>to</em> attribute are:</dd>
+</dl>
+<ul>
+  <li>request.url</li>
+  <li>request.headers</li>
+  <li>request.cookies</li>
+  <li>request.body</li>
+  <li>response.headers</li>
+  <li>response.cookies</li>
+  <li>response.body</li>
+</ul><p>Below is an example of a snippet from the WebHDFS service 
definition.</p>
+<pre><code class="xml">    &lt;route path=&quot;/webhdfs/v1/**?**&quot;&gt;
+        &lt;rewrite apply=&quot;WEBHDFS/webhdfs/inbound/namenode/file&quot; 
to=&quot;request.url&quot;/&gt;
+        &lt;rewrite 
apply=&quot;WEBHDFS/webhdfs/outbound/namenode/headers&quot; 
to=&quot;response.headers&quot;/&gt;
+    &lt;/route&gt;
+</code></pre>
+<dl><dt><strong>dispatch</strong></dt><dd>The dispatch element can be used to 
plug-in a custom dispatch class. The interface for Dispatch can be found in the 
module gateway-spi, org.apache.hadoop.gateway.dispatch.Dispatch.</dd>
+</dl><p>This element can be used at the service level (i.e. as a child of the 
service tag) or at the route level. A dispatch specified at the route level 
takes precedence over a dispatch specified at the service level. By default the 
dispatch used is org.apache.hadoop.gateway.dispatch.DefaultDispatch.</p><p>The 
dispatch tag has four attributes that can be 
specified.</p><p><em>contributor-name</em> : This attribute can be used to 
specify a deployment contributor to be invoked for a custom 
dispatch.</p><p><em>classname</em> : This attribute can be used to specify a 
custom dispatch class.</p><p><em>ha-contributor-name</em> : This attribute can 
be used to specify a deployment contributor to be invoked for custom HA 
dispatch functionality.</p><p><em>ha-classname</em> : This attribute can be 
used to specify a custom dispatch class with HA functionality.</p><p>Only one 
of contributor-name or classname should be specified, and likewise only one of 
ha-contributor-name or ha-classname should be 
specified.</p><p>If providing a custom dispatch, either a JAR should be 
provided (see <a href="#Class+Path">Class Path</a>) or a <a 
href="#Maven+Module">Maven Module</a> should be created.</p>
+<dl><dt><strong>policies</strong></dt><dd>This is a wrapper tag for 
<em>policy</em> elements and can be a child of the <em>service</em> tag or the 
<em>route</em> tag. Once again, just like with dispatch, the route level 
policies defined override the ones at the service level.</dd>
+</dl><p>This element can contain one or more <em>policy</em> elements. The 
order of the <em>policy</em> elements is important as that will be the order of 
execution.</p>
+<dl><dt><strong>policy</strong></dt><dd>At this time the policy element just 
has two attributes, <em>role</em> and <em>name</em>. These are used to execute 
a deployment contributor by that role and name. Therefore new policies must be 
added by using the deployment contributor mechanism.</dd>
+</dl><p>For example,</p>
+<pre><code class="xml">&lt;service role=&quot;FOO&quot; name=&quot;foo&quot; 
version=&quot;1.0.0&quot;&gt;
+    &lt;policies&gt;
+        &lt;policy role=&quot;webappsec&quot;/&gt;
+        &lt;policy role=&quot;authentication&quot;/&gt;
+        &lt;policy role=&quot;rewrite&quot;/&gt;
+        &lt;policy role=&quot;identity-assertion&quot;/&gt;
+        &lt;policy role=&quot;authorization&quot;/&gt;
+    &lt;/policies&gt;
+    &lt;routes&gt;
+        &lt;route path=&quot;/foo/?**&quot;&gt;
+            &lt;rewrite apply=&quot;FOO/foo/inbound&quot; 
to=&quot;request.url&quot;/&gt;
+            &lt;policies&gt;
+                &lt;policy role=&quot;webappsec&quot;/&gt;
+                &lt;policy role=&quot;federation&quot;/&gt;
+                &lt;policy role=&quot;identity-assertion&quot;/&gt;
+                &lt;policy role=&quot;authorization&quot;/&gt;
+                &lt;policy role=&quot;rewrite&quot;/&gt;
+            &lt;/policies&gt;
+            &lt;dispatch contributor-name=&quot;http-client&quot; /&gt;
+        &lt;/route&gt;
+    &lt;/routes&gt;
+    &lt;dispatch contributor-name=&quot;custom-client&quot; 
ha-contributor-name=&quot;ha-client&quot;/&gt;
+&lt;/service&gt;
+</code></pre><h5><a id="rewrite.xml">rewrite.xml</a> <a 
href="#rewrite.xml"><img src="markbook-section-link.png"/></a></h5><p>The 
rewrite.xml file that accompanies the service.xml file follows the same rules 
as described in the section <a href="#Rewrite+Provider">Rewrite 
Provider</a>.</p><h4><a id="Service+Definition+Directory+Structure">Service 
Definition Directory Structure</a> <a 
href="#Service+Definition+Directory+Structure"><img 
src="markbook-section-link.png"/></a></h4><p>On installation of the Knox 
gateway, the following directory structure can be found under 
${GATEWAY_HOME}/data. This is a mirror of the directories and files under the 
module gateway-service-definitions.</p>
+<pre><code>services
+    |______ service name
+                    |______ version
+                                |______service.xml
+                                |______rewrite.xml
+</code></pre><p>For example,</p>
+<pre><code>services
+    |______ webhdfs
+               |______ 2.4.0
+                         |______service.xml
+                         |______rewrite.xml
+</code></pre><p>To test out a new service, you can just add the appropriate 
files (service.xml and rewrite.xml) in a directory under 
${GATEWAY_HOME}/data/services. If you want to contribute the service to 
the Knox build, the files need to go in the gateway-service-definitions 
module.</p><h4><a id="Service+Definition+Runtime+Behavior">Service Definition
Runtime Behavior</a> <a href="#Service+Definition+Runtime+Behavior"><img 
src="markbook-section-link.png"/></a></h4><p>The runtime artifacts as well as 
the behavior does not change whether the service is plugged in via the 
deployment descriptors or through a service.xml file.</p><h4><a 
id="Custom+Dispatch+Dependency+Injection">Custom Dispatch Dependency 
Injection</a> <a href="#Custom+Dispatch+Dependency+Injection"><img 
src="markbook-section-link.png"/></a></h4><p>When writing a custom dispatch 
class, one often needs configuration or gateway services. A lightweight 
dependency injection system is used that can inject instances of 
classes or primitives available in the filter configuration&rsquo;s init 
params or as servlet context attributes.</p><p>Details of this can be found in
the module gateway-util-configinjector and also an example use of it is in the 
class org.apache.hadoop.gateway.dispatch.DefaultDispatch. Look at the following 
method for example:</p>
+<pre><code class="java"> @Configure
+   protected void setReplayBufferSize(@Default(&quot;8&quot;) int size) {
+      replayBufferSize = size;
+   }
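+
+   // A hedged sketch, not from the Knox source: a custom dispatch could use the
+   // same mechanism to inject its own settings. The names here are illustrative
+   // only; the injector appears to bind init params by the setter-derived name,
+   // so this would pick up a fooConnectTimeout init param, defaulting to 5000.
+   // (Field declaration omitted, as in the fragment above.)
+   @Configure
+   protected void setFooConnectTimeout(@Default(&quot;5000&quot;) int timeout) {
+      fooConnectTimeout = timeout;
+   }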
+</code></pre><h3><a id="Providers">Providers</a> <a href="#Providers"><img 
src="markbook-section-link.png"/></a></h3>
+<pre><code class="java">public interface ProviderDeploymentContributor {
+  String getRole();
+  String getName();
+
+  void initializeContribution( DeploymentContext context );
+  void contributeProvider( DeploymentContext context, Provider provider );
+  void contributeFilter(
+      DeploymentContext context,
+      Provider provider,
+      Service service,
+      ResourceDescriptor resource,
+      List&lt;FilterParamDescriptor&gt; params );
+
+  void finalizeContribution( DeploymentContext context );
+}
+</code></pre>
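+<p>To give a feel for the shape of an implementation, the following is a minimal, hypothetical sketch. The class name, provider name, and filter class are illustrative and not part of Knox; the fluent ResourceDescriptor calls mirror those used by the built-in contributors, but exact signatures should be checked against gateway-spi. Contributors are discovered at deployment time, typically via Java&rsquo;s ServiceLoader mechanism. A provider module&rsquo;s pom follows the sketch.</p>
+<pre><code class="java">public class SampleAuthnContributor implements ProviderDeploymentContributor {
+
+  @Override
+  public String getRole() {
+    return &quot;authentication&quot;;
+  }
+
+  @Override
+  public String getName() {
+    return &quot;SampleAuthn&quot;;
+  }
+
+  @Override
+  public void initializeContribution( DeploymentContext context ) {
+    // no topology-wide setup needed for this sample
+  }
+
+  @Override
+  public void contributeProvider( DeploymentContext context, Provider provider ) {
+    // provider-level contributions (e.g. shared descriptors) would go here
+  }
+
+  @Override
+  public void contributeFilter( DeploymentContext context, Provider provider,
+      Service service, ResourceDescriptor resource,
+      List&lt;FilterParamDescriptor&gt; params ) {
+    // register the filter for this provider in the chain for the given resource
+    resource.addFilter().name( getName() ).role( getRole() )
+        .impl( &quot;org.example.SampleAuthnFilter&quot; ).params( params );
+  }
+
+  @Override
+  public void finalizeContribution( DeploymentContext context ) {
+    // nothing to clean up
+  }
+}
+</code></pre>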
+<pre><code class="xml">&lt;project&gt;
+    &lt;modelVersion&gt;4.0.0&lt;/modelVersion&gt;
+    &lt;parent&gt;
+        &lt;groupId&gt;org.apache.hadoop&lt;/groupId&gt;
+        &lt;artifactId&gt;gateway&lt;/artifactId&gt;
+        &lt;version&gt;0.10.0-SNAPSHOT&lt;/version&gt;
+    &lt;/parent&gt;
+
+    &lt;artifactId&gt;gateway-provider-security-authn-sample&lt;/artifactId&gt;
+    &lt;name&gt;gateway-provider-security-authn-sample&lt;/name&gt;
+    &lt;description&gt;A simple sample authorization provider.&lt;/description&gt;
+
+    &lt;licenses&gt;
+        &lt;license&gt;
+            &lt;name&gt;The Apache Software License, Version 2.0&lt;/name&gt;
+            &lt;url&gt;http://www.apache.org/licenses/LICENSE-2.0.txt&lt;/url&gt;
+            &lt;distribution&gt;repo&lt;/distribution&gt;
+        &lt;/license&gt;
+    &lt;/licenses&gt;
+
+    &lt;dependencies&gt;
+        &lt;dependency&gt;
+            &lt;groupId&gt;${gateway-group}&lt;/groupId&gt;
+            &lt;artifactId&gt;gateway-spi&lt;/artifactId&gt;
+        &lt;/dependency&gt;
+    &lt;/dependencies&gt;
+
+&lt;/project&gt;
+</code></pre><h3><a id="Deployment+Context">Deployment Context</a> <a 
href="#Deployment+Context"><img src="markbook-section-link.png"/></a></h3>
+<pre><code class="java">package org.apache.hadoop.gateway.deploy;
+
+import ...
+
+public interface DeploymentContext {
+
+  GatewayConfig getGatewayConfig();
+
+  Topology getTopology();
+
+  WebArchive getWebArchive();
+
+  WebAppDescriptor getWebAppDescriptor();
+
+  GatewayDescriptor getGatewayDescriptor();
+
+  void contributeFilter(
+      Service service,
+      ResourceDescriptor resource,
+      String role,
+      String name,
+      List&lt;FilterParamDescriptor&gt; params );
+
+  void addDescriptor( String name, Object descriptor );
+
+  &lt;T&gt; T getDescriptor( String name );
+
+}
+</code></pre>
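+<p>Note the addDescriptor and getDescriptor methods: a contributor can use them to stash intermediate state on the context and retrieve it later in the deployment lifecycle. A small, hypothetical sketch (the descriptor name is illustrative):</p>
+<pre><code class="java">// inside a contributor method that receives a DeploymentContext
+List&lt;String&gt; contributedRoles = context.getDescriptor( &quot;sample.roles&quot; );
+if ( contributedRoles == null ) {
+  contributedRoles = new ArrayList&lt;String&gt;();
+  context.addDescriptor( &quot;sample.roles&quot;, contributedRoles );
+}
+// later contributions can retrieve and augment the same descriptor
+contributedRoles.add( &quot;authentication&quot; );
+</code></pre>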
+<pre><code class="java">public class Topology {
+
+  public URI getUri() {...}
+  public void setUri( URI uri ) {...}
+
+  public String getName() {...}
+  public void setName( String name ) {...}
+
+  public long getTimestamp() {...}
+  public void setTimestamp( long timestamp ) {...}
+
+  public Collection&lt;Service&gt; getServices() {...}
+  public Service getService( String role, String name ) {...}
+  public void addService( Service service ) {...}
+
+  public Collection&lt;Provider&gt; getProviders() {...}
+  public Provider getProvider( String role, String name ) {...}
+  public void addProvider( Provider provider ) {...}
+}
+</code></pre>
+<pre><code class="java">public interface GatewayDescriptor {
+  List&lt;GatewayParamDescriptor&gt; params();
+  GatewayParamDescriptor addParam();
+  GatewayParamDescriptor createParam();
+  void addParam( GatewayParamDescriptor param );
+  void addParams( List&lt;GatewayParamDescriptor&gt; params );
+
+  List&lt;ResourceDescriptor&gt; resources();
+  ResourceDescriptor addResource();
+  ResourceDescriptor createResource();
+  void addResource( ResourceDescriptor resource );
+}
+</code></pre><h3><a id="Gateway+Services">Gateway Services</a> <a 
href="#Gateway+Services"><img src="markbook-section-link.png"/></a></h3><p>TODO 
- Describe the service registry and other global services.</p><h2><a 
id="Standard+Providers">Standard Providers</a> <a 
href="#Standard+Providers"><img 
src="markbook-section-link.png"/></a></h2><h3><a id="Rewrite+Provider">Rewrite 
Provider</a> <a href="#Rewrite+Provider"><img 
src="markbook-section-link.png"/></a></h3><p>gateway-provider-rewrite 
org.apache.hadoop.gateway.filter.rewrite.api.UrlRewriteRulesDescriptor</p>
+<pre><code class="xml">&lt;rules&gt;
+  &lt;rule
+      dir=&quot;IN&quot;
+      name=&quot;WEATHER/openweathermap/inbound/versioned/file&quot;
+      pattern=&quot;*://*:*/**/weather/{version}?{**}&quot;&gt;
+    &lt;rewrite template=&quot;{$serviceUrl[WEATHER]}/{version}/weather?{**}&quot;/&gt;
+  &lt;/rule&gt;
+&lt;/rules&gt;
+</code></pre>
+<pre><code class="xml">&lt;rules&gt;
+  &lt;filter name=&quot;WEBHBASE/webhbase/status/outbound&quot;&gt;
+    &lt;content type=&quot;*/json&quot;&gt;
+      &lt;apply path=&quot;$[LiveNodes][*][name]&quot; rule=&quot;WEBHBASE/webhbase/address/outbound&quot;/&gt;
+    &lt;/content&gt;
+    &lt;content type=&quot;*/xml&quot;&gt;
+      &lt;apply path=&quot;/ClusterStatus/LiveNodes/Node/@name&quot; rule=&quot;WEBHBASE/webhbase/address/outbound&quot;/&gt;
+    &lt;/content&gt;
+  &lt;/filter&gt;
+&lt;/rules&gt;
+</code></pre>
+<pre><code class="java">@Test
+public void testDevGuideSample() throws Exception {
+  URI inputUri, outputUri;
+  Matcher&lt;Void&gt; matcher;
+  Matcher&lt;Void&gt;.Match match;
+  Template input, pattern, template;
+
+  inputUri = new URI( &quot;http://sample-host:8443/gateway/topology/weather/2.5?q=Palo+Alto&quot; );
+
+  input = Parser.parse( inputUri.toString() );
+  pattern = Parser.parse( &quot;*://*:*/**/weather/{version}?{**}&quot; );
+  template = Parser.parse( &quot;http://api.openweathermap.org/data/{version}/weather?{**}&quot; );
+
+  matcher = new Matcher&lt;Void&gt;();
+  matcher.add( pattern, null );
+  match = matcher.match( input );
+
+  outputUri = Expander.expand( template, match.getParams(), null );
+
+  assertThat(
+      outputUri.toString(),
+      is( &quot;http://api.openweathermap.org/data/2.5/weather?q=Palo+Alto&quot; ) );
+}
+</code></pre>
+<pre><code class="java">@Test
+public void testDevGuideSampleWithEvaluator() throws Exception {
+  URI inputUri, outputUri;
+  Matcher&lt;Void&gt; matcher;
+  Matcher&lt;Void&gt;.Match match;
+  Template input, pattern, template;
+  Evaluator evaluator;
+
+  inputUri = new URI( &quot;http://sample-host:8443/gateway/topology/weather/2.5?q=Palo+Alto&quot; );
+  input = Parser.parse( inputUri.toString() );
+
+  pattern = Parser.parse( &quot;*://*:*/**/weather/{version}?{**}&quot; );
+  template = Parser.parse( &quot;{$serviceUrl[WEATHER]}/{version}/weather?{**}&quot; );
+
+  matcher = new Matcher&lt;Void&gt;();
+  matcher.add( pattern, null );
+  match = matcher.match( input );
+
+  evaluator = new Evaluator() {
+    @Override
+    public List&lt;String&gt; evaluate( String function, List&lt;String&gt; parameters ) {
+      return Arrays.asList( &quot;http://api.openweathermap.org/data&quot; );
+    }
+  };
+
+  outputUri = Expander.expand( template, match.getParams(), evaluator );
+
+  assertThat(
+      outputUri.toString(),
+      is( &quot;http://api.openweathermap.org/data/2.5/weather?q=Palo+Alto&quot; ) );
+}
+</code></pre><h4><a id="Rewrite+Filters">Rewrite Filters</a> <a 
href="#Rewrite+Filters"><img src="markbook-section-link.png"/></a></h4><p>TODO 
- Cover the supported content types. TODO - Provide a XML and JSON 
&ldquo;properties&rdquo; example where one NVP is modified based on value of 
another name.</p>
+<pre><code class="xml">&lt;rules&gt;
+  &lt;filter name=&quot;WEBHBASE/webhbase/regions/outbound&quot;&gt;
+    &lt;content type=&quot;*/json&quot;&gt;
+      &lt;apply path=&quot;$[Region][*][location]&quot; rule=&quot;WEBHBASE/webhbase/address/outbound&quot;/&gt;
+    &lt;/content&gt;
+    &lt;content type=&quot;*/xml&quot;&gt;
+      &lt;apply path=&quot;/TableInfo/Region/@location&quot; rule=&quot;WEBHBASE/webhbase/address/outbound&quot;/&gt;
+    &lt;/content&gt;
+  &lt;/filter&gt;
+&lt;/rules&gt;
+</code></pre>
+<pre><code class="xml">&lt;gateway&gt;
+  ...
+  &lt;resource&gt;
+    &lt;role&gt;WEBHBASE&lt;/role&gt;
+    &lt;pattern&gt;/hbase/*/regions?**&lt;/pattern&gt;
+    ...
+    &lt;filter&gt;
+      &lt;role&gt;rewrite&lt;/role&gt;
+      &lt;name&gt;url-rewrite&lt;/name&gt;
+      &lt;class&gt;org.apache.hadoop.gateway.filter.rewrite.api.UrlRewriteServletFilter&lt;/class&gt;
+      &lt;param&gt;
+        &lt;name&gt;response.body&lt;/name&gt;
+        &lt;value&gt;WEBHBASE/webhbase/regions/outbound&lt;/value&gt;
+      &lt;/param&gt;
+    &lt;/filter&gt;
+    ...
+  &lt;/resource&gt;
+  ...
+&lt;/gateway&gt;
+</code></pre><p>For example, from HBaseDeploymentContributor:</p>
+<pre><code class="java">    params = new ArrayList&lt;FilterParamDescriptor&gt;();
+    params.add( regionResource.createFilterParam().name( &quot;response.body&quot; ).value( &quot;WEBHBASE/webhbase/regions/outbound&quot; ) );
+    addRewriteFilter( context, service, regionResource, params );
+</code></pre><h4><a id="Rewrite+Functions">Rewrite Functions</a> <a href="#Rewrite+Functions"><img src="markbook-section-link.png"/></a></h4><p>TODO - Provide a lowercase function as an example.</p>
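+<p>Until that example exists, here is a minimal sketch that expresses a lowercase function via the Evaluator interface used in the earlier matching/expansion example; a real function would instead be packaged as a rewrite function extension of the gateway-provider-rewrite module, like the hostmap function declared below.</p>
+<pre><code class="java">// A sketch of lowercase semantics only; this is not the provider SPI.
+Evaluator evaluator = new Evaluator() {
+  @Override
+  public List&lt;String&gt; evaluate( String function, List&lt;String&gt; parameters ) {
+    if ( &quot;lowercase&quot;.equals( function ) ) {
+      List&lt;String&gt; results = new ArrayList&lt;String&gt;();
+      for ( String value : parameters ) {
+        results.add( value.toLowerCase() );
+      }
+      return results;
+    }
+    return parameters; // unknown functions pass parameters through unchanged
+  }
+};
+</code></pre>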
+<pre><code class="xml">&lt;rules&gt;
+  &lt;functions&gt;
+    &lt;hostmap config=&quot;/WEB-INF/hostmap.txt&quot;/&gt;
+  &lt;/functions&gt;
+  ...
+&lt;/rules&gt;
+</code></pre><h4><a id="Rewrite+Steps">Rewrite Steps</a> <a 
href="#Rewrite+Steps"><img src="markbook-section-link.png"/></a></h4><p>TODO - 
Provide a lowercase step as an example.</p>
+<pre><code class="xml">&lt;rules&gt;
+  &lt;rule dir=&quot;OUT&quot; 
name=&quot;WEBHDFS/webhdfs/outbound/namenode/headers/location&quot;&gt;
+    &lt;match pattern=&quot;{scheme}://{host}:{port}/{path=**}?{**}&quot;/&gt;
+    &lt;rewrite 
template=&quot;{gateway.url}/webhdfs/data/v1/{path=**}?{scheme}?host={$hostmap(host)}?{port}?{**}&quot;/&gt;
+    &lt;encrypt-query/&gt;
+  &lt;/rule&gt;
+&lt;/rules&gt;
+</code></pre><h3><a id="Identity+Assertion+Provider">Identity Assertion 
Provider</a> <a href="#Identity+Assertion+Provider"><img 
src="markbook-section-link.png"/></a></h3><p>Adding a new identity assertion 
provider is as simple as extending the 
AbstractIdentityAsserterDeploymentContributor and the 
CommonIdentityAssertionFilter from the 
gateway-provider-identity-assertion-common module to initialize any specific 
configuration from filter init params and implement two methods:</p>
+<ol>
+  <li>String mapUserPrincipal(String principalName);</li>
+  <li>String[] mapGroupPrincipals(String principalName, Subject subject);</li>
+</ol><p>To implement a simple toUpper or toLower identity assertion 
provider:</p>
+<pre><code class="java">package 
org.apache.hadoop.gateway.identityasserter.caseshifter.filter;
+
+import 
org.apache.hadoop.gateway.identityasserter.common.filter.AbstractIdentityAsserterDeploymentContributor;
+
+public class CaseShifterIdentityAsserterDeploymentContributor extends 
AbstractIdentityAsserterDeploymentContributor {
+
+  @Override
+  public String getName() {
+    return &quot;CaseShifter&quot;;
+  }
+
+  protected String getFilterClassname() {
+    return CaseShifterIdentityAssertionFilter.class.getName();
+  }
+}
+</code></pre><p>We merely need to provide the provider name for use in the 
topology and the filter classname for the contributor to add to the filter 
chain.</p><p>For the identity assertion filter itself it is just a matter of 
extension and the implementation of the two methods described earlier:</p>
+<pre><code class="java">package 
org.apache.hadoop.gateway.identityasserter.caseshifter.filter;
+
+import javax.security.auth.Subject;
+import javax.servlet.FilterConfig;
+import javax.servlet.ServletException;
+import 
org.apache.hadoop.gateway.identityasserter.common.filter.CommonIdentityAssertionFilter;
+
+public class CaseShifterIdentityAssertionFilter extends 
CommonIdentityAssertionFilter {
+  private boolean toUpper = false;
+  
+  @Override
+  public void init(FilterConfig filterConfig) throws ServletException {
+    String upper = filterConfig.getInitParameter(&quot;caseshift.upper&quot;);
+    if (&quot;true&quot;.equals(upper)) {
+      toUpper = true;
+    }
+  }
+
+  @Override
+  public String[] mapGroupPrincipals(String mappedPrincipalName, Subject subject) {
+    return null;
+  }
+
+  @Override
+  public String mapUserPrincipal(String principalName) {
+    if (toUpper) {
+      principalName = principalName.toUpperCase();
+    }
+    else {
+      principalName = principalName.toLowerCase();
+    }
+    return principalName;
+  }
+}
+</code></pre><p>Note that the above:</p>
+<ol>
+  <li>looks for a specific filter init parameter to configure whether to convert to upper or to lower case</li>
+  <li>no-ops mapGroupPrincipals so that it returns null. This indicates that no changes are needed to the groups contained within the Subject; if there are groups, they should continue to flow through the system unchanged. This is actually the same implementation as the base class and therefore does not need to be overridden; we include it here for illustration.</li>
+  <li>based upon the configuration interrogated in the init method, converts the principalName to either upper or lower case.</li>
+</ol><p>That is the extent of what is needed to implement a new identity 
assertion provider module.</p><h3><a id="Jersey+Provider">Jersey Provider</a> 
<a href="#Jersey+Provider"><img 
src="markbook-section-link.png"/></a></h3><p>TODO</p><h3><a 
id="KnoxSSO+Integration">KnoxSSO Integration</a> <a 
href="#KnoxSSO+Integration"><img 
src="markbook-section-link.png"/></a></h3><h1>Knox SSO Integration for 
UIs</h1><h2>Introduction</h2><p>KnoxSSO provides an abstraction for integrating 
any number of authentication systems and SSO solutions and enables 
participating web applications to scale to those solutions more easily. Without 
the token exchange capabilities offered by KnoxSSO each component UI would need 
to integrate with each desired solution on its own.</p><p>This document examines how to integrate with KnoxSSO in the form of a Servlet Filter. This approach should be easily extrapolated to other frameworks, e.g. Spring Security.</p><h3><a id="General+Flow">General Flow</a> <a href
 ="#General+Flow"><img src="markbook-section-link.png"/></a></h3><p>The 
following is a generic sequence diagram for SAML integration through 
KnoxSSO.</p><p><img src='general_saml_flow.png'/> </p><h4><a 
id="KnoxSSO+Setup">KnoxSSO Setup</a> <a href="#KnoxSSO+Setup"><img 
src="markbook-section-link.png"/></a></h4><h5><a 
id="knoxsso.xml+Topology">knoxsso.xml Topology</a> <a 
href="#knoxsso.xml+Topology"><img 
src="markbook-section-link.png"/></a></h5><p>In order to enable KnoxSSO, we 
need to configure the IdP topology. The following is an example of this 
topology that is configured to use HTTP Basic Auth against the Knox Demo LDAP 
server. This is the lowest barrier of entry for a development environment that actually authenticates against a real user store. Conveniently, if your integration works against the IdP with Basic Auth, it will work with SAML or anything else as well.</p>
+<pre><code>&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;
+&lt;topology&gt;
+    &lt;gateway&gt;
+        &lt;provider&gt;
+            &lt;role&gt;authentication&lt;/role&gt;
+            &lt;name&gt;ShiroProvider&lt;/name&gt;
+            &lt;enabled&gt;true&lt;/enabled&gt;
+            &lt;param&gt;
+                &lt;name&gt;sessionTimeout&lt;/name&gt;
+                &lt;value&gt;30&lt;/value&gt;
+            &lt;/param&gt;
+            &lt;param&gt;
+                &lt;name&gt;main.ldapRealm&lt;/name&gt;
+                &lt;value&gt;org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm&lt;/value&gt;
+            &lt;/param&gt;
+            &lt;param&gt;
+                &lt;name&gt;main.ldapContextFactory&lt;/name&gt;
+                &lt;value&gt;org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory&lt;/value&gt;
+            &lt;/param&gt;
+            &lt;param&gt;
+                &lt;name&gt;main.ldapRealm.contextFactory&lt;/name&gt;
+                &lt;value&gt;$ldapContextFactory&lt;/value&gt;
+            &lt;/param&gt;
+            &lt;param&gt;
+                &lt;name&gt;main.ldapRealm.userDnTemplate&lt;/name&gt;
+                &lt;value&gt;uid={0},ou=people,dc=hadoop,dc=apache,dc=org&lt;/value&gt;
+            &lt;/param&gt;
+            &lt;param&gt;
+                &lt;name&gt;main.ldapRealm.contextFactory.url&lt;/name&gt;
+                &lt;value&gt;ldap://localhost:33389&lt;/value&gt;
+            &lt;/param&gt;
+            &lt;param&gt;
+                &lt;name&gt;main.ldapRealm.contextFactory.authenticationMechanism&lt;/name&gt;
+                &lt;value&gt;simple&lt;/value&gt;
+            &lt;/param&gt;
+            &lt;param&gt;
+                &lt;name&gt;urls./**&lt;/name&gt;
+                &lt;value&gt;authcBasic&lt;/value&gt;
+            &lt;/param&gt;
+        &lt;/provider&gt;
+
+        &lt;provider&gt;
+            &lt;role&gt;identity-assertion&lt;/role&gt;
+            &lt;name&gt;Default&lt;/name&gt;
+            &lt;enabled&gt;true&lt;/enabled&gt;
+        &lt;/provider&gt;
+    &lt;/gateway&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;KNOXSSO&lt;/role&gt;
+        &lt;param&gt;
+            &lt;name&gt;knoxsso.cookie.secure.only&lt;/name&gt;
+            &lt;value&gt;true&lt;/value&gt;
+        &lt;/param&gt;
+        &lt;param&gt;
+            &lt;name&gt;knoxsso.token.ttl&lt;/name&gt;
+            &lt;value&gt;100000&lt;/value&gt;
+        &lt;/param&gt;
+    &lt;/service&gt;
+&lt;/topology&gt;
+</code></pre><p>Just as with any Knox service, the KNOXSSO service is 
protected by the gateway providers defined above it. In this case, the 
ShiroProvider is taking care of HTTP Basic Auth against LDAP for us. Once the user authenticates, request processing continues to the KNOXSSO service, which creates the required cookie and does the necessary redirects.</p><p>The authentication/federation provider can be swapped out to fit your deployment environment.</p><h5><a id="sandbox.xml+Topology">sandbox.xml Topology</a> <a 
href="#sandbox.xml+Topology"><img 
src="markbook-section-link.png"/></a></h5><p>In order to see the end to end 
story and use it as an example in your development, you can configure one of 
the cluster topologies to use the SSOCookieProvider instead of the out of the 
box ShiroProvider. The following is an example sandbox.xml topology that is 
configured for using KnoxSSO to protect access to the Hadoop REST APIs.</p>
+<pre><code>&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot;?&gt;
+&lt;topology&gt;
+    &lt;gateway&gt;
+        &lt;provider&gt;
+            &lt;role&gt;federation&lt;/role&gt;
+            &lt;name&gt;SSOCookieProvider&lt;/name&gt;
+            &lt;enabled&gt;true&lt;/enabled&gt;
+            &lt;param&gt;
+                &lt;name&gt;sso.authentication.provider.url&lt;/name&gt;
+                &lt;value&gt;https://localhost:9443/gateway/idp/api/v1/websso&lt;/value&gt;
+            &lt;/param&gt;
+        &lt;/provider&gt;
+
+        &lt;provider&gt;
+            &lt;role&gt;identity-assertion&lt;/role&gt;
+            &lt;name&gt;Default&lt;/name&gt;
+            &lt;enabled&gt;true&lt;/enabled&gt;
+        &lt;/provider&gt;
+    &lt;/gateway&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;NAMENODE&lt;/role&gt;
+        &lt;url&gt;hdfs://localhost:8020&lt;/url&gt;
+    &lt;/service&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;JOBTRACKER&lt;/role&gt;
+        &lt;url&gt;rpc://localhost:8050&lt;/url&gt;
+    &lt;/service&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;WEBHDFS&lt;/role&gt;
+        &lt;url&gt;http://localhost:50070/webhdfs&lt;/url&gt;
+    &lt;/service&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;WEBHCAT&lt;/role&gt;
+        &lt;url&gt;http://localhost:50111/templeton&lt;/url&gt;
+    &lt;/service&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;OOZIE&lt;/role&gt;
+        &lt;url&gt;http://localhost:11000/oozie&lt;/url&gt;
+    &lt;/service&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;WEBHBASE&lt;/role&gt;
+        &lt;url&gt;http://localhost:60080&lt;/url&gt;
+    &lt;/service&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;HIVE&lt;/role&gt;
+        &lt;url&gt;http://localhost:10001/cliservice&lt;/url&gt;
+    &lt;/service&gt;
+
+    &lt;service&gt;
+        &lt;role&gt;RESOURCEMANAGER&lt;/role&gt;
+        &lt;url&gt;http://localhost:8088/ws&lt;/url&gt;
+    &lt;/service&gt;
+&lt;/topology&gt;
+</code></pre>
+<ul>
+  <li>NOTE: Be aware that when using Chrome as your browser, cookies don’t seem to work for “localhost”. Either use a VM or, as was done here, use 127.0.0.1. Safari works with localhost without problems.</li>
+</ul><p>As you can see above, the only thing being configured is the SSO 
provider URL. Since Knox is the issuer of the cookie and token, we don’t need to configure the public key; we have programmatic access to the actual keystore for use at verification time.</p><h4><a id="Curl+the+Flow">Curl the 
Flow</a> <a href="#Curl+the+Flow"><img 
src="markbook-section-link.png"/></a></h4><p>We should now be able to walk 
through the SSO Flow at the command line with curl to see everything that 
happens.</p><p>First, issue a request to WEBHDFS through knox.</p>
+<pre><code>    bash-3.2$ curl -iku guest:guest-password https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+       
+       HTTP/1.1 302 Found
+       Location: https://localhost:8443/gateway/idp/api/v1/websso?originalUrl=https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+       Content-Length: 0
+       Server: Jetty(8.1.14.v20131031)
+</code></pre><p>Note the redirect to the knoxsso endpoint and the loginUrl with the originalUrl request parameter. Your integration will need to issue the same sort of redirect.</p><p>Let’s manually follow that redirect with curl now:</p>
+<pre><code>    bash-3.2$ curl -iku guest:guest-password &quot;https://localhost:8443/gateway/idp/api/v1/websso?originalUrl=https://localhost:9443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS&quot;
+
+       HTTP/1.1 307 Temporary Redirect
+       Set-Cookie: JSESSIONID=mlkda4crv7z01jd0q0668nsxp;Path=/gateway/idp;Secure;HttpOnly
+       Set-Cookie: hadoop-jwt=eyJhbGciOiJSUzI1NiJ9.eyJleHAiOjE0NDM1ODUzNzEsInN1YiI6Imd1ZXN0IiwiYXVkIjoiSFNTTyIsImlzcyI6IkhTU08ifQ.RpA84Qdr6RxEZjg21PyVCk0G1kogvkuJI2bo302bpwbvmc-i01gCwKNeoGYzUW27MBXf6a40vylHVR3aZuuBUxsJW3aa_ltrx0R5ztKKnTWeJedOqvFKSrVlBzJJ90PzmDKCqJxA7JUhyo800_lDHLTcDWOiY-ueWYV2RMlCO0w;Path=/;Domain=localhost;Secure;HttpOnly
+       Expires: Thu, 01 Jan 1970 00:00:00 GMT
+       Location: https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+       Content-Length: 0
+       Server: Jetty(8.1.14.v20131031)
+</code></pre><p>Note the redirect back to the original URL in the Location 
header and the Set-Cookie for the hadoop-jwt cookie. This is what the 
SSOCookieProvider in sandbox (and ultimately in your integration) will be 
looking for.</p><p>Finally, we should be able to take the above cookie and pass 
it to the original url as indicated in the Location header for our originally 
requested resource:</p>
+<pre><code>    bash-3.2$ curl -ikH &quot;Cookie: hadoop-jwt=eyJhbGciOiJSUzI1NiJ9.eyJleHAiOjE0NDM1ODY2OTIsInN1YiI6Imd1ZXN0IiwiYXVkIjoiSFNTTyIsImlzcyI6IkhTU08ifQ.Os5HEfVBYiOIVNLRIvpYyjeLgAIMbBGXHBWMVRAEdiYcNlJRcbJJ5aSUl1aciNs1zd_SHijfB9gOdwnlvQ_0BCeGHlJBzHGyxeypIoGj9aOwEf36h-HVgqzGlBLYUk40gWAQk3aRehpIrHZT2hHm8Pu8W-zJCAwUd8HR3y6LF3M;Path=/;Domain=localhost;Secure;HttpOnly&quot; https://localhost:9443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+
+       TODO: cluster was down and needs to be recreated :/
+</code></pre><h4><a id="Browse+the+Flow">Browse the Flow</a> <a 
href="#Browse+the+Flow"><img src="markbook-section-link.png"/></a></h4><p>At 
this point, we can use a web browser instead of the command line and see how 
the browser will challenge the user for Basic Auth Credentials and then manage 
the cookies such that the SSO and token exchange aspects of the flow are hidden 
from the user.</p><p>Simply, try to invoke the same webhdfs API from the 
browser URL bar.</p>
+<pre><code>    https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS
+</code></pre><p>Based on our understanding of the flow it should behave 
like:</p>
+<ul>
+  <li>SSOCookieProvider checks for hadoop-jwt cookie and in its absence 
redirects to the configured SSO provider URL (knoxsso endpoint)</li>
+  <li>ShiroProvider on the KnoxSSO endpoint returns a 401 and the browser 
challenges the user for username/password</li>
+  <li>The ShiroProvider authenticates the user against the Demo LDAP Server 
using a simple LDAP bind and establishes the security context for the WebSSO 
request</li>
+  <li>The WebSSO service exchanges the normalized Java Subject into a JWT 
token and sets it on the response as a cookie named hadoop-jwt</li>
+  <li>The WebSSO service then redirects the user agent back to the originally requested URL - the webhdfs Knox service; subsequent invocations will find the cookie in the incoming request and will not need to engage the WebSSO service again until it expires.</li>
+</ul><h4><a id="Filter+by+Example">Filter by Example</a> <a 
href="#Filter+by+Example"><img src="markbook-section-link.png"/></a></h4><p>We 
have added a federation provider to Knox for accepting KnoxSSO cookies for REST 
APIs. This provides us with a couple of benefits: KnoxSSO support for REST APIs for XmlHttpRequests from JavaScript (basic CORS functionality is also included), though this is still rather basic and considered beta code; and a model and real-world use case for others to base their integrations on.</p><p>In addition, 
<a 
href="https://issues.apache.org/jira/browse/HADOOP-11717";>https://issues.apache.org/jira/browse/HADOOP-11717</a>
 added support for the Hadoop UIs to the hadoop-auth module and it can be used 
as another example.</p><p>We will examine the new SSOCookieFederationFilter in 
Knox here.</p>
+<pre><code>package org.apache.hadoop.gateway.provider.federation.jwt.filter;
+
+import java.io.IOException;
+import java.security.Principal;
+import java.security.PrivilegedActionException;
+import java.security.PrivilegedExceptionAction;
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+import javax.security.auth.Subject;
+import javax.servlet.Filter;
+import javax.servlet.FilterChain;
+import javax.servlet.FilterConfig;
+import javax.servlet.ServletException;
+import javax.servlet.ServletRequest;
+import javax.servlet.ServletResponse;
+import javax.servlet.http.Cookie;
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpServletResponse;
+
+import org.apache.hadoop.gateway.i18n.messages.MessagesFactory;
+import org.apache.hadoop.gateway.provider.federation.jwt.JWTMessages;
+import org.apache.hadoop.gateway.security.PrimaryPrincipal;
+import org.apache.hadoop.gateway.services.GatewayServices;
+import org.apache.hadoop.gateway.services.security.token.JWTokenAuthority;
+import org.apache.hadoop.gateway.services.security.token.TokenServiceException;
+import org.apache.hadoop.gateway.services.security.token.impl.JWTToken;
+
+public class SSOCookieFederationFilter implements Filter {
+  private static JWTMessages log = MessagesFactory.get( JWTMessages.class );
+  private static final String ORIGINAL_URL_QUERY_PARAM = &quot;originalUrl=&quot;;
+  private static final String SSO_COOKIE_NAME = &quot;sso.cookie.name&quot;;
+  private static final String SSO_EXPECTED_AUDIENCES = &quot;sso.expected.audiences&quot;;
+  private static final String SSO_AUTHENTICATION_PROVIDER_URL = &quot;sso.authentication.provider.url&quot;;
+  private static final String DEFAULT_SSO_COOKIE_NAME = &quot;hadoop-jwt&quot;;
+</code></pre><p>The constants above represent the configurable aspects of the integration.</p>
+<pre><code>    private JWTokenAuthority authority = null;
+    private String cookieName = null;
+    private List&lt;String&gt; audiences = null;
+    private String authenticationProviderUrl = null;
+
+    @Override
+    public void init( FilterConfig filterConfig ) throws ServletException {
+      GatewayServices services = (GatewayServices) filterConfig.getServletContext().getAttribute(GatewayServices.GATEWAY_SERVICES_ATTRIBUTE);
+      authority = (JWTokenAuthority)services.getService(GatewayServices.TOKEN_SERVICE);
+</code></pre><p>The above is a Knox-specific internal service that we use to issue and verify JWT tokens. This will be covered separately; you will need to implement something similar in your filter implementation.</p>
+<pre><code>    // configured cookieName
+    cookieName = filterConfig.getInitParameter(SSO_COOKIE_NAME);
+    if (cookieName == null) {
+      cookieName = DEFAULT_SSO_COOKIE_NAME;
+    }
+</code></pre><p>The configurable cookie name is something that can be used to 
change a cookie name to fit your deployment environment. The default name is 
hadoop-jwt which is also the default in the Hadoop implementation. This name 
must match the name being used by the KnoxSSO endpoint when setting the 
cookie.</p>
+<pre><code>    // expected audiences or null
+    String expectedAudiences = filterConfig.getInitParameter(SSO_EXPECTED_AUDIENCES);
+    if (expectedAudiences != null) {
+      audiences = parseExpectedAudiences(expectedAudiences);
+    }
+</code></pre><p>Audiences are configured as a comma separated list of audience strings - names of intended recipients or intents. The semantics we use for this processing are: if no audiences are configured, then any (or no) audience in the token is accepted; if audiences are configured, then the token is accepted as long as one of the expected audiences is found in the set of claims in the token.</p>
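+<p>The doFilter method shown later calls a validateAudiences method that applies exactly these semantics. A hedged sketch of what it might look like (the JWTToken getAudience accessor is an assumption about the token API):</p>
+<pre><code class="java">private boolean validateAudiences( JWTToken token ) {
+  // if no expected audiences are configured, accept any (or no) audience
+  if ( audiences == null || audiences.isEmpty() ) {
+    return true;
+  }
+  // otherwise the audience claim in the token must match an expected value;
+  // token.getAudience() is assumed here
+  String tokenAudience = token.getAudience();
+  return tokenAudience != null &amp;&amp; audiences.contains( tokenAudience );
+}
+</code></pre>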
+<pre><code>    // url to SSO authentication provider
+    authenticationProviderUrl = filterConfig.getInitParameter(SSO_AUTHENTICATION_PROVIDER_URL);
+    if (authenticationProviderUrl == null) {
+      log.missingAuthenticationProviderUrlConfiguration();
+    }
+  }
+</code></pre><p>This is the URL to the KnoxSSO endpoint. It is required and 
SSO/token exchange will not work without this set correctly.</p>
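+<p>The doFilter method shown later builds the redirect target with a constructLoginURL method. A minimal sketch of how that URL could be assembled from the provider URL and the ORIGINAL_URL_QUERY_PARAM constant (the exact Knox implementation may differ):</p>
+<pre><code class="java">private String constructLoginURL( HttpServletRequest request ) {
+  String delimiter = authenticationProviderUrl.contains( &quot;?&quot; ) ? &quot;&amp;&quot; : &quot;?&quot;;
+  String original = request.getRequestURL().toString();
+  if ( request.getQueryString() != null ) {
+    original += &quot;?&quot; + request.getQueryString();
+  }
+  // e.g. https://host:8443/gateway/idp/api/v1/websso?originalUrl=&lt;original&gt;
+  return authenticationProviderUrl + delimiter + ORIGINAL_URL_QUERY_PARAM + original;
+}
+</code></pre>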
+<pre><code>    /**
+     * @param expectedAudiences comma separated list of expected audience strings
+     * @return the list of audiences considered valid during token validation
+     */
+    private List&lt;String&gt; parseExpectedAudiences(String expectedAudiences) {
+      ArrayList&lt;String&gt; audList = null;
+      // setup the list of valid audiences for token validation
+      if (expectedAudiences != null) {
+        // parse into the list
+        String[] audArray = expectedAudiences.split(&quot;,&quot;);
+        audList = new ArrayList&lt;String&gt;();
+        for (String a : audArray) {
+          audList.add(a);
+        }
+      }
+      return audList;
+    }
+</code></pre><p>The above method parses the comma separated list of expected 
audiences and makes it available for interrogation during token validation.</p>
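+<p>The doFilter method below also relies on a getJWTFromCookie helper, which is not shown in this excerpt. A straightforward sketch using the standard servlet cookie API (the actual Knox implementation may differ):</p>
+<pre><code class="java">protected String getJWTFromCookie( HttpServletRequest req ) {
+  Cookie[] cookies = req.getCookies();
+  if ( cookies != null ) {
+    for ( Cookie cookie : cookies ) {
+      // cookieName was resolved in init(), defaulting to hadoop-jwt
+      if ( cookieName.equals( cookie.getName() ) ) {
+        return cookie.getValue();
+      }
+    }
+  }
+  return null;
+}
+</code></pre>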
+<pre><code>    public void destroy() {
+    }
+
+    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) 
+        throws IOException, ServletException {
+      String wireToken = null;
+      HttpServletRequest req = (HttpServletRequest) request;
+
+      String loginURL = constructLoginURL(req);
+      wireToken = getJWTFromCookie(req);
+      if (wireToken == null) {
+        if (req.getMethod().equals(&quot;OPTIONS&quot;)) {
+          // CORS preflight requests to determine allowed origins and related config
+          // must be able to continue without being redirected
+          Subject sub = new Subject();
+          sub.getPrincipals().add(new PrimaryPrincipal(&quot;anonymous&quot;));
+          continueWithEstablishedSecurityContext(sub, req, (HttpServletResponse) response, chain);
+          return; // do not fall through to the redirect below
+        }
+        log.sendRedirectToLoginURL(loginURL);
+        ((HttpServletResponse) response).sendRedirect(loginURL);
+      }
+      else {
+        JWTToken token = new JWTToken(wireToken);
+        boolean verified = false;
+        try {
+          verified = authority.verifyToken(token);
+          if (verified) {
+            Date expires = token.getExpiresDate();
+            if (expires == null || new Date().before(expires)) {
+              boolean audValid = validateAudiences(token);
+              if (audValid) {
+                Subject subject = createSubjectFromToken(token);
+                continueWithEstablishedSecurityContext(subject, (HttpServletRequest)request, (HttpServletResponse)response, chain);
+              }
+              else {
+                log.failedToValidateAudience();
+                ((HttpServletResponse) response).sendRedirect(loginURL);
+              }
+            }
+            else {
+              log.tokenHasExpired();
+              ((HttpServletResponse) response).sendRedirect(loginURL);
+            }
+          }
+          else {
+            log.failedToVerifyTokenSignature();
+            ((HttpServletResponse) response).sendRedirect(loginURL);
+          }
+        } catch (TokenServiceException e) {

[... 298 lines stripped ...]

