This is an automated email from the ASF dual-hosted git repository.

olli pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/sling-site.git

commit 79dbf0d1f2553710aa4ba7ae78cd9f7d0f32b8a5
Author: Oliver Lietz <[email protected]>
AuthorDate: Mon Dec 18 11:25:59 2017 +0100

    SLING-7167 Adjust READMEs
---
 src/main/jbake/content/documentation/bundles.md    |   1 +
 .../content/documentation/bundles/distribution.md  | 252 +++++++++++++++++++++
 2 files changed, 253 insertions(+)

diff --git a/src/main/jbake/content/documentation/bundles.md b/src/main/jbake/content/documentation/bundles.md
index 374aeef..a0643bd 100644
--- a/src/main/jbake/content/documentation/bundles.md
+++ b/src/main/jbake/content/documentation/bundles.md
@@ -12,6 +12,7 @@ tags=bundles,modules
 * [Rendering Content - Default GET servlets (servlets.get)](/documentation/bundles/rendering-content-default-get-servlets.html)
 * [Validation](/documentation/bundles/validation.html)
 * [Repository Initialization](/documentation/bundles/repository-initialization.html)
+* [Distribution](/documentation/bundles/distribution.html)
 
 ## Resource Providers
 
diff --git a/src/main/jbake/content/documentation/bundles/distribution.md b/src/main/jbake/content/documentation/bundles/distribution.md
new file mode 100644
index 0000000..255e3bd
--- /dev/null
+++ b/src/main/jbake/content/documentation/bundles/distribution.md
@@ -0,0 +1,252 @@
+title=Content Distribution     
+type=page
+status=published
+tags=distribution
+~~~~~~
+
+## Overview
+
+The main goal of the Sling Content Distribution module is to allow distribution of content (Sling resources) among different Sling
+instances. The term "distribution" here means the ability to pick one or more resources on a certain Sling instance in order
+to copy and persist them onto another Sling instance. The Sling Content Distribution module is able to distribute content
+by:
+
+ - "pushing" from Sling instance A to Sling instance B
+ - "pulling" from Sling instance B to Sling instance A
+ - "synchronizing" Sling instances A and B via a (third) coordinating instance 
C
+
+### Bundles
+
+The Sling Content Distribution module consists of the following bundles:
+
+ - `org.apache.sling.distribution.api`: this is where the APIs are defined
+ - `org.apache.sling.distribution.core`: this is where the basic 
infrastructure for distributing content is implemented
+ - `org.apache.sling.distribution.kryo-serializer`: Kryo based distribution 
package serializer
+ - `org.apache.sling.distribution.avro-serializer`: Apache Avro based 
distribution package serializer
+ - `org.apache.sling.distribution.sample`: this is a set of sample configurations and implementations for demo purposes
+ - `org.apache.sling.distribution.it`: this is the integration testing suite
+ 
+## Design
+
+The Sling Content Distribution module aims to be _reliable_, _simple_ and _extensible_.
+
+Reliability means that the system should keep working even in the presence of failures regarding I/O, network, etc.
+An example of such problems is when pushing content from instance A to instance B fails because B is unreachable: in such
+scenarios instance A should be able to keep pushing (pulling, etc.) content to other instances seamlessly. Another example
+is when delivery of a certain content package fails too many times: the distribution module should be able to either drop
+it or move it into a different "bucket" of failed items.
+
+Simplicity means that this module should be able to accomplish its tasks by providing clear, minimal and easy to use APIs together
+with smart but not overly complicated or "hacky" implementations (see ["Simple software is hard"](http://events.linuxfoundation.org/events/apachecon-europe/program/schedule)).
+
+Extensibility means that the Sling Content Distribution module provides a set of APIs for distributing resources where each
+component taking part in the distribution lifecycle can be extended or totally replaced.
+
+A distribution _request_ represents the need to aggregate some resources and copy them from / to another Sling instance.
+Such requests are handled by _agents_, which are the main entry point for working with the distribution module.
+Each agent distributes content from one or more sources to one or more targets. Such distribution can be triggered by:
+
+ - "pushing" the content to the (remote) target instances
+ - "pulling" content from the (remote) source instances
+ - "coordinating" instances, that is, synchronizing multiple instances by having them as both sources and targets
+
+An _agent_ handles a distribution _request_ by creating one or more _packages_ of resources out of it
+from the source(s), dispatching those _packages_ to one or more _queues_, and processing the queued _packages_ by persisting
+them into the target instance(s).
+
+The process of creating one or more packages is called _exporting_. This operation may happen either locally to the agent
+(the "push" scenario) or remotely (the "pull" scenario).
+
+The process of persisting one or more packages is called _importing_. This operation may happen either locally (the "pull"
+scenario) or remotely (the "push" scenario).
+
+In order to properly handle a large number of _requests_ against the same _agent_, each agent is provided with _queues_
+to which the exported _packages_ are sent. The _agent_ then takes care of processing each _queue_ in order to _import_ each
+_package_.
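
The request, package, queue and import steps described above can be pictured with a toy model. This is only a sketch in Python, not the module's Java API: `ToyAgent`, `Package` and all method names are hypothetical, and a real agent persists packages on a (possibly remote) target rather than appending to a list.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical stand-in for a distribution package: a bundle of resource paths."""
    paths: tuple


@dataclass
class ToyAgent:
    """Illustrative agent: exports packages, queues them, then imports them."""
    name: str
    queue: deque = field(default_factory=deque)
    imported: list = field(default_factory=list)

    def export(self, request_paths):
        # "Exporting": build a package out of the requested resources.
        return Package(paths=tuple(request_paths))

    def handle(self, request_paths):
        # A request is turned into a package and dispatched to the queue.
        self.queue.append(self.export(request_paths))

    def process_queue(self):
        # "Importing": take queued packages in order and persist them on the target.
        while self.queue:
            package = self.queue.popleft()
            self.imported.extend(package.paths)


agent = ToyAgent(name="publish")
agent.handle(["/content/sample1"])
agent.handle(["/content/sample2", "/content/sample3"])
agent.process_queue()
print(agent.imported)
```

The queue decouples accepting requests from importing packages, which is what lets an agent keep accepting requests while earlier packages are still waiting to be imported.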
+ 
+
+### Distribution agents configuration
+
+Distribution agents configurations are proper OSGi configurations (backed by 
nodes of type `sling:OsgiConfig` in the repository).
+
+There are specialized factories for each supported scenario:
+
+- "forward" agents, see 
[ForwardDistributionAgentFactory-publish.json](https://gitbox.apache.org/repos/asf?p=sling-org-apache-sling-distribution-sample.git;a=blob_plain;f=src/main/resources/SLING-CONTENT/libs/sling/distribution/install.author/publish/org.apache.sling.distribution.agent.impl.ForwardDistributionAgentFactory-publish.json).
+- "reverse" agents, see 
[ReverseDistributionAgentFactory-publish-reverse.json](https://gitbox.apache.org/repos/asf?p=sling-org-apache-sling-distribution-sample.git;a=blob_plain;f=src/main/resources/SLING-CONTENT/libs/sling/distribution/install.author/publish-reverse/org.apache.sling.distribution.agent.impl.ReverseDistributionAgentFactory-publish-reverse.json).
+- "sync" agents, see 
[SyncDistributionAgentFactory-pubsync.json](https://gitbox.apache.org/repos/asf?p=sling-org-apache-sling-distribution-sample.git;a=blob_plain;f=src/main/resources/SLING-CONTENT/libs/sling/distribution/install.author/pubsync/org.apache.sling.distribution.agent.impl.SyncDistributionAgentFactory-pubsync.json).
+- "queue" agents, see 
[QueueDistributionAgentFactory-reverse.json](https://gitbox.apache.org/repos/asf?p=sling-org-apache-sling-distribution-sample.git;a=blob_plain;f=src/main/resources/SLING-CONTENT/libs/sling/distribution/install.publish/reverse/org.apache.sling.distribution.agent.impl.QueueDistributionAgentFactory-reverse.json).
+
+For example, a "forward" agent can be defined by specifying:
+
+- The name of the agent (`name` property)
+- The sub service name used to access content and build packages (`serviceName` property)
+- The endpoints where the packages are to be imported (`packageImporter.endpoints` property)
+
+The sample package contains endpoints for exposing configuration for 
distribution agents.
+The _DistributionConfigurationResourceProviderFactory_ is used to expose agent 
configurations as resources.
+
+    {
+      "jcr:primaryType": "sling:OsgiConfig",
+      "provider.roots": [ "/libs/sling/distribution/settings/agents" ],
+      "kind" : "agent"
+    }
+
+Distribution agents' configurations can be retrieved via `HTTP GET`:
+
+    $ curl -u admin:admin http://localhost:8080/libs/sling/distribution/settings/agents/{agentName}.json
+
+### Distribution agents services
+
+Each distribution agent is an OSGi service and is resolved using a [Sling Resource Provider](#Resource_Providers) which locates it under `/libs/sling/distribution/services/agents`.
+
+The _DistributionConfigurationResourceProviderFactory_ allows one to configure HTTP endpoints to access distribution OSGi configurations.
+The sample package contains endpoints for exposing distribution agents.
+The _DistributionServiceResourceProviderFactory_ is used to expose agent services as resources.
+
+    {
+      "jcr:primaryType": "sling:OsgiConfig",
+      "provider.roots": [ "/libs/sling/distribution/services/agents" ],
+      "kind" : "agent"
+    }
+
+Distribution agents can be triggered by sending `HTTP POST` requests to
+
+`http://$host:$port/libs/sling/distribution/services/agents/{agentName}`
+
+with HTTP parameters `action` and `path`.
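
As a rough illustration of such a trigger request, the sketch below builds (but does not send) the POST in Python using only the standard library. The helper name `build_distribution_request` is hypothetical, and the `admin:admin` credentials, host and port are simply the defaults used by the curl examples on this page.

```python
import base64
import urllib.parse
import urllib.request


def build_distribution_request(host, port, agent_name, action, path=None,
                               user="admin", password="admin"):
    """Build (but do not send) a POST request that triggers a distribution agent."""
    url = f"http://{host}:{port}/libs/sling/distribution/services/agents/{agent_name}"
    # `action` is always sent; `path` only when the action needs one.
    params = {"action": action}
    if path is not None:
        params["path"] = path
    body = urllib.parse.urlencode(params).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    # HTTP Basic authentication, the equivalent of `curl -u user:password`.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request


request = build_distribution_request("localhost", 8080, "publish", "ADD", "/content/sample1")
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

Passing the returned object to `urllib.request.urlopen` would actually send it to a running instance.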
+
+### Distribution queues
+
+#### In Memory queue
+
+This is a draft implementation using an in-memory blocking queue together with a Sling scheduled processor, which periodically fetches the first item of each queue and triggers the distribution of that item.
+It is not suitable for production because it is currently not persisted: restarting the bundle / platform would lose the queue together with its items.
+
+#### Sling Job Handling based queue
+
+This is a queue implementation based on the queues and jobs provided by the Sling Event bundle.
+Each item added to a queue triggers the creation of a Sling job, which handles the processing of that item in the queue.
+By default, Sling queues for distribution have the following options:
+
+- ordered
+- with max priority
+- with infinite retries
+- keeping job history
+
+### Distribution of packages among queues
+
+Each distribution agent uses a specific queue distribution mechanism, specified via a 'queue distribution strategy', which defines how packages are routed into agent queues.
+The currently available distribution strategies are:
+
+- single: the agent has only one queue and all the items are routed there
+- priority path: the agent can route a configurable set of paths (note that this configuration is currently global for the system, not per agent) to a dedicated priority queue while all the others go to the default queue
+- error aware: the agent has one default queue for all the items; items that fail a configurable number of times are either dropped or moved to an error queue (depending on configuration)
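
The routing decisions behind the "priority path" and "error aware" strategies can be sketched as two small functions. This is an illustrative Python sketch only: the function names, the queue labels (`priority`, `default`, `error-queue`) and the prefix-matching rule are assumptions, not the module's actual implementation.

```python
def route_priority_path(path, priority_paths):
    """Return the queue name for a package path under a priority-path strategy."""
    for prefix in priority_paths:
        # Match the configured path itself or anything below it (assumed rule).
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return "priority"
    return "default"


def handle_failure(attempts, max_attempts, drop_on_error):
    """Decide what an error-aware strategy does with an item after a failed attempt."""
    if attempts < max_attempts:
        return "retry"
    # The retry budget is used up: drop the item or park it in the error queue.
    return "drop" if drop_on_error else "error-queue"


print(route_priority_path("/content/news/today", ["/content/news"]))
print(handle_failure(attempts=3, max_attempts=3, drop_on_error=False))
```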
+
+ 
+## Use cases
+
+### Forward distribution
+
+To configure the "forward" distribution workflow, which transfers content from an author instance to a publish instance:
+
+- configure a remote importer on publish
+- configure a "forward" agent on author pointing to the URL of the importer on publish
+
+Send an `HTTP POST` request to `http://localhost:8080/libs/sling/distribution/services/agents/publish` with parameters `action=ADD` and `path=/content`.
+
+#### Create/update content
+
+    $ curl -v -u admin:admin http://localhost:8080/libs/sling/distribution/services/agents/publish -d 'action=ADD' -d 'path=/content/sample1'
+
+#### Delete content
+
+    $ curl -v -u admin:admin http://localhost:8080/libs/sling/distribution/services/agents/publish -d 'action=DELETE' -d 'path=/content/sample1'
+
+### Reverse distribution
+
+To configure the "reverse" distribution workflow, which transfers content from a publish instance to an author instance:
+
+- configure a queue agent on publish to hold the packages that need to be distributed to author
+- configure a remote exporter on publish that exports packages from the queue agent
+- configure a "reverse" agent on author pointing to the URL of the exporter on publish
+
+Send an `HTTP POST` request to `http://localhost:8080/libs/sling/distribution/services/agents/publish-reverse` with parameter `action=PULL`.
+
+
+#### Create/update content
+
+    $ curl -u admin:admin http://localhost:8081/libs/sling/distribution/services/agents/reverse -d 'action=ADD' -d 'path=/content/sample1'
+    $ curl -u admin:admin http://localhost:8080/libs/sling/distribution/services/agents/publish-reverse -d 'action=PULL'
+
+### Sync distribution
+
+
+To configure the "sync" distribution workflow, which transfers content between two publish instances via an author instance:
+
+- configure a remote exporter on each publish instance
+- configure a remote importer on each publish instance
+- configure a "sync" agent on author pointing to the URLs of the exporters and importers on publish
+
+Send an `HTTP POST` request to `http://localhost:8080/libs/sling/distribution/services/agents/pubsync` with parameter `action=PULL`.
+
+
+#### Create/update content
+
+    $ curl -u admin:admin http://localhost:8081/libs/sling/distribution/services/agents/reverse-pubsync -d 'action=ADD' -d 'path=/content/sample1'
+    $ curl -u admin:admin http://localhost:8080/libs/sling/distribution/services/agents/pubsync -d 'action=PULL'
+
+### Installation
+
+- install the dependency bundles on all Sling instances
+- install Sling Distribution api, core, samples on all Sling instances
+
+## HTTP API
+
+### API Requirements
+We need to expose APIs for configuring, commanding and monitoring distribution agents.
+
+- Configuration API should allow:
+ - CRUD operations for agent configs
+- Command API (possibly issued to multiple agents at once) should allow:
+ - triggering a distribution request on a specific agent
+ - explicitly creating and exporting a package
+ - explicitly importing a previously created package
+- Monitoring API should allow:
+ - inspection of the internal queues of distribution agents
+ - inspection of the command history
+ 
+### API endpoints 
+
+#### Configuration API
+- Create config - POST _/libs/sling/distribution/settings/agents_
+- Read config - GET _/libs/sling/distribution/settings/agents/{agentName}_
+- Update config - PUT _/libs/sling/distribution/settings/agents/{agentName}_
+- Delete config - DELETE _/libs/sling/distribution/settings/agents/{agentName}_
+
+#### Command API
+- Distribute - POST _/libs/sling/distribution/services/agents/{agentName}_
+- Import package - POST _/libs/sling/distribution/services/importers/{importerName}_
+- Export package - POST _/libs/sling/distribution/services/exporters/{exporterName}_
+
+#### Monitoring API
+- Distribution history - GET _/libs/sling/distribution/services/agents/{agentName}/log_
+- Agent queue inspection - GET _/libs/sling/distribution/services/agents/{agentName}/queues_
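
For client code, the endpoints listed above can be collected in a small lookup table. The Python sketch below is one possible way to do that: the operation labels used as dictionary keys and the `resolve` helper are hypothetical, while the HTTP methods and path templates are taken from the lists above.

```python
BASE = "/libs/sling/distribution"

# Keys are hypothetical labels; methods and path templates follow the endpoint lists.
ENDPOINTS = {
    "create_config": ("POST", BASE + "/settings/agents"),
    "read_config": ("GET", BASE + "/settings/agents/{agentName}"),
    "update_config": ("PUT", BASE + "/settings/agents/{agentName}"),
    "delete_config": ("DELETE", BASE + "/settings/agents/{agentName}"),
    "distribute": ("POST", BASE + "/services/agents/{agentName}"),
    "import_package": ("POST", BASE + "/services/importers/{importerName}"),
    "export_package": ("POST", BASE + "/services/exporters/{exporterName}"),
    "history": ("GET", BASE + "/services/agents/{agentName}/log"),
    "queues": ("GET", BASE + "/services/agents/{agentName}/queues"),
}


def resolve(operation, **names):
    """Return (HTTP method, path) for an operation, filling in the placeholders."""
    method, template = ENDPOINTS[operation]
    return method, template.format(**names)


print(resolve("queues", agentName="publish"))
```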
+
+## Java API
+
+There is a single entry point for triggering a distribution workflow: the [Distributor](https://gitbox.apache.org/repos/asf?p=sling-org-apache-sling-distribution-api.git;a=blob_plain;f=src/main/java/org/apache/sling/distribution/Distributor.java) API.
+
+    Distributor.distribute(agentName, resourceResolver, distributionRequest)
+
+## Extensions
+
+The following extensions for Apache Sling Content Distribution exist.
+
+### Apache Avro serializer
+The _org.apache.sling.distribution.avro-serializer_ contains a 
_DistributionContentSerializer_ based on [Apache Avro](http://avro.apache.org).
+
+### Kryo serializer
+The _org.apache.sling.distribution.kryo-serializer_ contains a 
_DistributionContentSerializer_ based on 
[Kryo](http://github.com/EsotericSoftware/kryo).
+
+## Ideas for future developments
+
+- distributed configuration
+- pushing to / pulling from JMS (pros: established pattern for producer/consumer problems; cons: another library / system involved as a possible point of failure)
+- WebSocket support (pros: once established it's bidirectional, so publish can also push directly to author)
+- asynchronous import of packages (pros: parallel transport and import; cons: complex management of multiple queues on different publish instances)
