Author: baedke
Date: Fri Sep 19 10:10:08 2014
New Revision: 1626168

URL: http://svn.apache.org/r1626168
Log:
OAK-1915: TarMK Cold Standby

Improved MBeans, added documentation.

Added:
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/coldstandby/
    jackrabbit/oak/trunk/oak-doc/src/site/markdown/coldstandby/coldstandby.md
    
jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/jmx/ClientFailoverStatusMBean.java
Modified:
    jackrabbit/oak/trunk/oak-doc/src/site/site.xml
    
jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/client/FailoverClient.java
    
jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/BulkTest.java
    
jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/MBeanTest.java

Added: jackrabbit/oak/trunk/oak-doc/src/site/markdown/coldstandby/coldstandby.md
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/markdown/coldstandby/coldstandby.md?rev=1626168&view=auto
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/markdown/coldstandby/coldstandby.md 
(added)
+++ jackrabbit/oak/trunk/oak-doc/src/site/markdown/coldstandby/coldstandby.md 
Fri Sep 19 10:10:08 2014
@@ -0,0 +1,110 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+  -->
+
+# Cold Standby
+
+### What is it?
+
+The *Cold Standby* feature allows one or more clients to connect to a master 
instance and ensure automatic on-the-fly synchronization of the repository 
state from the master to the client(s). The sync process is one-way only. Data 
stored on the master is never changed. The only purpose of the client 
installation(s) is to guarantee an (almost live) copy of the data and to enable 
a quick switch from the master to a client installation without data loss.
+
+### What it isn't
+
+The *Cold Standby* feature does not guarantee file, filesystem or even 
repository **integrity**! If the content of a tar file is corrupted, a file is 
missing or anything similar happens to the locally stored files, the 
installation will break because these situations are not checked, detected or 
treated!
+
+### How it works
+
+The master opens a TCP port and listens for incoming messages. 
Currently two messages are implemented:
+
+* give me the segment id of the current head
+* give me the data of the segment with a specified id
+
+The clients periodically request the segment id of the current head of the 
master. If the segment is locally unknown it will be retrieved. If it's already 
present the segments are compared and referenced segments (if necessary) will 
be requested, too.
+
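As a rough illustration of the polling logic described above, the following Java sketch models one sync round in memory. The names `SyncSketch`, `Segment`, `Master`, `MapMaster` and `syncOnce` are hypothetical and not part of the Oak API. The sketch simply follows every reference instead of comparing segments, so it should be read as a simplified model under these assumptions, not as the actual implementation.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of one sync round; not the oak-tarmk-failover API.
final class SyncSketch {

    /** A segment with an id, a payload and the ids of the segments it references. */
    static final class Segment {
        final String id;
        final byte[] data;
        final List<String> references;

        Segment(String id, byte[] data, List<String> references) {
            this.id = id;
            this.data = data;
            this.references = references;
        }
    }

    /** The two requests the master answers. */
    interface Master {
        String headId();            // "give me the segment id of the current head"
        Segment segment(String id); // "give me the data of the segment with this id"
    }

    /** A master backed by a plain map, used only for this demonstration. */
    static final class MapMaster implements Master {
        private final String head;
        private final Map<String, Segment> store;

        MapMaster(String head, Map<String, Segment> store) {
            this.head = head;
            this.store = store;
        }

        @Override public String headId() { return head; }
        @Override public Segment segment(String id) { return store.get(id); }
    }

    /** Runs one sync round and returns the number of segments fetched from the master. */
    static int syncOnce(Master master, Map<String, Segment> local) {
        int fetched = 0;
        Deque<String> pending = new ArrayDeque<>();
        Set<String> visited = new HashSet<>();
        pending.push(master.headId());
        while (!pending.isEmpty()) {
            String id = pending.pop();
            if (!visited.add(id)) {
                continue; // already handled in this round
            }
            Segment segment = local.get(id);
            if (segment == null) {
                segment = master.segment(id); // locally unknown: retrieve it
                if (segment == null) {
                    throw new IllegalStateException("master does not know segment " + id);
                }
                local.put(id, segment);
                fetched++;
            }
            pending.addAll(segment.references); // referenced segments are checked, too
        }
        return fetched;
    }

    public static void main(String[] args) {
        Map<String, Segment> masterStore = new HashMap<>();
        masterStore.put("head", new Segment("head", new byte[64], Arrays.asList("a")));
        masterStore.put("a", new Segment("a", new byte[64], Collections.<String>emptyList()));
        Master master = new MapMaster("head", masterStore);

        Map<String, Segment> local = new HashMap<>();
        System.out.println(syncOnce(master, local)); // prints 2: head and "a" are fetched
        System.out.println(syncOnce(master, local)); // prints 0: nothing changed on the master
    }
}
```

The second round fetches nothing because every segment reachable from the head is already stored locally, which mirrors the statement that an idle master causes no transfers.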
+
+### Prerequisites
+
+An Oak installation using a SegmentStore using the TarMK.
+
+### Setup
+
+1. Perform a filesystem based copy of the master repository.
+2. On the master, activate the feature by specifying the runmode <!-- TODO: 
this must be changed --> `syncmaster`. If the repository is running within an 
OSGi environment, the feature will be activated by a corresponding 
configuration. <!-- TODO: add some OSGI specific info here -->
+3. On the client(s), activate the feature by specifying the runmode `syncslave` 
(add additional parameters if desired) and specify the path to the repository 
(-tar) files.
+4. Start the master and the client(s).
+
+You can add the additional argument `--secure true` if you want an SSL-secured 
connection between the client and the master. You must ensure that 
**all** clients and the master use either secure or standard connections! A 
mixed configuration will definitely fail.
+
+The clients specify the master host using the `--host` (default is 
`localhost`) and `--port` (default is `8023`) arguments. For monitoring purposes 
(see below) the client(s) must be distinguishable. Therefore a UUID is 
automatically generated for each running client and this UUID is used to 
identify the client on the master. If you want to specify the name of the 
client you can set the system property `failOverID`.
+
+To sum it up, a typical client command line could be:
+
+       java -DfailOverID="Client#1" -jar oak-run.jar syncslave --secure false 
--host 192.168.0.1 crx-quickstart/repository/segmentstore
+
+<!-- TODO: add the master specific arguments (like the accepted incoming IP 
ranges) -->
+The master can define the TCP port the feature listens on (default is 
`8023`) using the `--port` argument. If you want to restrict the communication 
you can specify a list of allowed IPs or IP ranges....
+
+### Robustness
+
+The data flow is designed to detect and handle connection and network related 
problems automatically. All packets are bundled with checksums and as soon as 
problems with the connection or damaged packets occur retry mechanisms are 
triggered.
+
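The "verify the checksum, retry on damage" idea from the paragraph above can be sketched in a few lines of Java. This is only an illustration: the text does not say which checksum algorithm or retry policy is used, so CRC32, the fixed attempt limit and the names `ChecksumRetrySketch`, `Packet`, `checksumOf` and `receive` are all assumptions.

```java
import java.util.function.Supplier;
import java.util.zip.CRC32;

// Hypothetical sketch of checksum verification with retries; CRC32 and all names are assumptions.
final class ChecksumRetrySketch {

    /** A payload together with the checksum the sender computed for it. */
    static final class Packet {
        final byte[] payload;
        final long checksum;

        Packet(byte[] payload, long checksum) {
            this.payload = payload;
            this.checksum = checksum;
        }
    }

    /** Computes the CRC32 checksum of a payload. */
    static long checksumOf(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    /** Requests a packet until one arrives intact; returns null if all attempts fail. */
    static byte[] receive(Supplier<Packet> request, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Packet packet = request.get();
            if (packet != null && checksumOf(packet.payload) == packet.checksum) {
                return packet.payload; // checksum matches: accept the data
            }
            // damaged or missing packet: fall through and request it again
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        int[] calls = {0};
        // Simulated connection that delivers a wrong checksum on the first request only.
        Supplier<Packet> flaky = () -> {
            calls[0]++;
            long sum = checksumOf(data);
            return calls[0] == 1 ? new Packet(data, sum + 1) : new Packet(data, sum);
        };
        byte[] received = receive(flaky, 3);
        System.out.println(received.length + " bytes after " + calls[0] + " attempts");
        // prints "3 bytes after 2 attempts"
    }
}
```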
+### Monitoring
+
+The *Cold Standby* feature exposes information via JMX MBeans, so you 
can inspect the current state of the client(s) and the master using standard 
tools like `jconsole` or `jmc` (if running JDK 1.7 or higher). To find the 
information, look for an `org.apache.jackrabbit.oak:type="FailOver"` 
MBean named `Status`.
+
+##### Client
+When observing a client you will notice exactly one node (the id is either a 
generic UUID or the name specified by the `failOverID` system property). This 
node has three read-only attributes:
+
+* `Running`: boolean indicating whether the sync process is running
+* `Mode`: always `Client: ` followed by the ID described above
+* `Status`: a textual representation of the current state (like `running`, 
`stopped` and others)
+
+There are also two invokable methods:
+
+* `start()`: start the sync process
+* `stop()`: stop the sync process
+
+##### Master
+Observing the master exposes some general (non client-specific) information 
via an MBean whose id value is the port number the `Cold Standby` service is 
using (usually `8023`). It has the same attributes and methods as described 
above, but the values differ:
+
+* `Mode`: always the constant value `master`
+* `Status`: has more values like `got message`
+
+Furthermore, information about each client (up to 10) can be retrieved. The 
MBean id is the name of the client (see above). There are no invokable methods 
for these MBeans but some very useful read-only attributes:
+
+* `Name`: the id of the client
+* `LastSeenTimestamp`: the timestamp of the last request in a textual 
representation
+* `LastRequest`: the last request of the client
+* `RemoteAddress`: the IP address of the client
+* `RemotePort`: the (generic) port the client used for the last request
+* `TransferredSegments`: the total number of segments transferred to this 
client
+* `TransferredSegmentBytes`: the total number of bytes transferred to this 
client
+
+A typical state might look like this:
+![Screenshot showing MBeans](mbeans.png)
+
+### Performance
+
+##### Master
+Enabling the *Cold Standby* feature on the master has almost no 
measurable impact on performance. The additional CPU consumption is very 
low and the extra hard disk and network I/O shouldn't cause any drawbacks.
+
+##### Client
+Things look different on the client! During a sync process you can expect at 
least one CPU core to run close to 100% the whole time. Because 
the procedure is not multithreaded, you can't speed up the process by using 
multiple cores. If no data is changed/transferred there will be no measurable 
activity. The expected throughput is about 700 KB / sec. Obviously this number 
will vary depending on the hardware and network environment, but it does not 
depend on the size of the repository or on whether you use SSL encryption. 
Keep this in mind when estimating the time needed for an initial 
sync or for catching up after much data was changed on the master node.
+
+### One word about security
+
+Assuming that the client(s) and the master run in the same intranet security 
zone there **should** be no security issue enabling the *Cold Standby* feature. 
Nevertheless you can add extra security by enabling SSL connections between the 
client(s) and the master (see above). Doing so reduces the possibility that the 
data is compromised by a man-in-the-middle. Furthermore you can specify the 
allowed client(s) by restricting the IP address of incoming requests. This 
should help to guarantee that no one in the intranet can copy the repository 
(by accident).
+

Modified: jackrabbit/oak/trunk/oak-doc/src/site/site.xml
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-doc/src/site/site.xml?rev=1626168&r1=1626167&r2=1626168&view=diff
==============================================================================
--- jackrabbit/oak/trunk/oak-doc/src/site/site.xml (original)
+++ jackrabbit/oak/trunk/oak-doc/src/site/site.xml Fri Sep 19 10:10:08 2014
@@ -51,6 +51,7 @@ under the License.
       <item href="differences.html" name="Differences to Jackrabbit 2" />
       <item href="known_issues.html" name="Known Issues" />
       <item href="dos_and_donts.html" name="Dos and don'ts" />
+      <item href="coldstandby/coldstandby.html" name="Cold Standby" />
       <item href="FAQ.html" name="FAQ" />
     </menu>
     <menu name="Developing Oak">

Modified: 
jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/client/FailoverClient.java
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/client/FailoverClient.java?rev=1626168&r1=1626167&r2=1626168&view=diff
==============================================================================
--- 
jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/client/FailoverClient.java
 (original)
+++ 
jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/client/FailoverClient.java
 Fri Sep 19 10:10:08 2014
@@ -43,6 +43,7 @@ import java.util.concurrent.TimeUnit;
 
 import org.apache.jackrabbit.oak.plugins.segment.SegmentStore;
 import 
org.apache.jackrabbit.oak.plugins.segment.failover.CommunicationObserver;
+import 
org.apache.jackrabbit.oak.plugins.segment.failover.jmx.ClientFailoverStatusMBean;
 import 
org.apache.jackrabbit.oak.plugins.segment.failover.jmx.FailoverStatusMBean;
 import 
org.apache.jackrabbit.oak.plugins.segment.failover.codec.RecordIdDecoder;
 import org.apache.jackrabbit.oak.plugins.segment.failover.store.FailoverStore;
@@ -54,7 +55,7 @@ import javax.management.ObjectName;
 import javax.management.StandardMBean;
 import javax.net.ssl.SSLException;
 
-public final class FailoverClient implements FailoverStatusMBean, Runnable, 
Closeable {
+public final class FailoverClient implements ClientFailoverStatusMBean, 
Runnable, Closeable {
     public static final String CLIENT_ID_PROPERTY_NAME = "failOverID";
 
     private static final Logger log = LoggerFactory
@@ -72,6 +73,8 @@ public final class FailoverClient implem
     private SslContext sslContext;
     private boolean active = false;
     private boolean running;
+    private int failedRequests;
+    private long lastSuccessfulRequest;
     private volatile String state;
     private final Object sync = new Object();
 
@@ -81,6 +84,8 @@ public final class FailoverClient implem
 
     public FailoverClient(String host, int port, SegmentStore store, boolean 
secure) throws SSLException {
         this.state = STATUS_INITIALIZING;
+        this.lastSuccessfulRequest = -1;
+        this.failedRequests = 0;
         this.host = host;
         this.port = port;
         if (secure) {
@@ -92,7 +97,7 @@ public final class FailoverClient implem
 
         final MBeanServer jmxServer = 
ManagementFactory.getPlatformMBeanServer();
         try {
-            jmxServer.registerMBean(new StandardMBean(this, 
FailoverStatusMBean.class), new ObjectName(this.getMBeanName()));
+            jmxServer.registerMBean(new StandardMBean(this, 
ClientFailoverStatusMBean.class), new ObjectName(this.getMBeanName()));
         }
         catch (Exception e) {
             log.error("can register failover status mbean", e);
@@ -171,7 +176,10 @@ public final class FailoverClient implem
             ChannelFuture f = b.connect(host, port).sync();
             // Wait until the connection is closed.
             f.channel().closeFuture().sync();
+            this.failedRequests = 0;
+            this.lastSuccessfulRequest = System.currentTimeMillis() / 1000;
         } catch (Exception e) {
+            this.failedRequests++;
             log.error("Failed synchronizing state.", e);
             stop();
         } finally {
@@ -207,4 +215,25 @@ public final class FailoverClient implem
     public String getStatus() {
         return this.state;
     }
+
+    @Override
+    public int getFailedRequests() {
+        return this.failedRequests;
+    }
+
+    @Override
+    public int getSecondsSinceLastSuccess() {
+        if (this.lastSuccessfulRequest < 0) return -1;
+        return (int)(System.currentTimeMillis() / 1000 - 
this.lastSuccessfulRequest);
+    }
+
+    @Override
+    public int calcFailedRequests() {
+        return this.getFailedRequests();
+    }
+
+    @Override
+    public int calcSecondsSinceLastSuccess() {
+        return this.getSecondsSinceLastSuccess();
+    }
 }

Added: 
jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/jmx/ClientFailoverStatusMBean.java
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/jmx/ClientFailoverStatusMBean.java?rev=1626168&view=auto
==============================================================================
--- 
jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/jmx/ClientFailoverStatusMBean.java
 (added)
+++ 
jackrabbit/oak/trunk/oak-tarmk-failover/src/main/java/org/apache/jackrabbit/oak/plugins/segment/failover/jmx/ClientFailoverStatusMBean.java
 Fri Sep 19 10:10:08 2014
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.jackrabbit.oak.plugins.segment.failover.jmx;
+
+import org.apache.jackrabbit.oak.commons.jmx.Description;
+
+public interface ClientFailoverStatusMBean extends FailoverStatusMBean {
+
+    @Description("number of consecutive failed requests")
+    int getFailedRequests();
+
+    @Description("number of seconds since last successful request")
+    int getSecondsSinceLastSuccess();
+
+    // expose the information as operations, too
+
+    @Description("number of consecutive failed requests")
+    int calcFailedRequests();
+
+    @Description("number of seconds since last successful request")
+    int calcSecondsSinceLastSuccess();
+
+}
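
The intended semantics of the two values declared in `ClientFailoverStatusMBean` can be illustrated in isolation. The sketch below is a standalone model, not the `FailoverClient` class: the name `RequestStats`, the `onSuccess`/`onFailure` hooks and the injected clock are hypothetical and exist only to make the example deterministic. It follows the behaviour visible in the `FailoverClient` change: the failure counter is reset on success, and `-1` is reported until the first success.

```java
import java.util.function.LongSupplier;

// Hypothetical standalone model of the two counters; not the FailoverClient class itself.
final class RequestStats {
    private final LongSupplier clockSeconds; // injected so the example is deterministic
    private int failedRequests = 0;
    private long lastSuccessfulRequest = -1; // -1 means "no successful request yet"

    RequestStats(LongSupplier clockSeconds) {
        this.clockSeconds = clockSeconds;
    }

    /** A successful request resets the consecutive-failure counter. */
    void onSuccess() {
        failedRequests = 0;
        lastSuccessfulRequest = clockSeconds.getAsLong();
    }

    /** A failed request only increments the counter. */
    void onFailure() {
        failedRequests++;
    }

    int getFailedRequests() {
        return failedRequests;
    }

    int getSecondsSinceLastSuccess() {
        if (lastSuccessfulRequest < 0) {
            return -1;
        }
        return (int) (clockSeconds.getAsLong() - lastSuccessfulRequest);
    }

    public static void main(String[] args) {
        long[] now = {100}; // fake clock, in seconds
        RequestStats stats = new RequestStats(() -> now[0]);

        stats.onFailure();
        System.out.println(stats.getFailedRequests() + " " + stats.getSecondsSinceLastSuccess());
        // prints "1 -1"

        stats.onSuccess();
        now[0] = 101;
        System.out.println(stats.getFailedRequests() + " " + stats.getSecondsSinceLastSuccess());
        // prints "0 1"
    }
}
```

These are the same value pairs that the assertions added to `MBeanTest` check: `1` and `-1` for a client that could not connect, and `0` followed by `1` one second after a successful sync.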

Modified: 
jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/BulkTest.java
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/BulkTest.java?rev=1626168&r1=1626167&r2=1626168&view=diff
==============================================================================
--- 
jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/BulkTest.java
 (original)
+++ 
jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/BulkTest.java
 Fri Sep 19 10:10:08 2014
@@ -18,7 +18,6 @@
  */
 package org.apache.jackrabbit.oak.plugins.segment.failover;
 
-import junit.framework.Assert;
 import org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStore;
 import 
org.apache.jackrabbit.oak.plugins.segment.failover.client.FailoverClient;
 import 
org.apache.jackrabbit.oak.plugins.segment.failover.jmx.FailoverStatusMBean;

Modified: 
jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/MBeanTest.java
URL: 
http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/MBeanTest.java?rev=1626168&r1=1626167&r2=1626168&view=diff
==============================================================================
--- 
jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/MBeanTest.java
 (original)
+++ 
jackrabbit/oak/trunk/oak-tarmk-failover/src/test/java/org/apache/jackrabbit/oak/plugins/segment/failover/MBeanTest.java
 Fri Sep 19 10:10:08 2014
@@ -98,6 +98,9 @@ public class MBeanTest extends TestBase 
             String m = jmxServer.getAttribute(status, "Mode").toString();
             if (!m.startsWith("client: ")) fail("unexpected mode " + m);
 
+            assertEquals("1", jmxServer.getAttribute(status, 
"FailedRequests").toString());
+            assertEquals("-1", jmxServer.getAttribute(status, 
"SecondsSinceLastSuccess").toString());
+
             assertEquals(FailoverStatusMBean.STATUS_STOPPED, 
jmxServer.getAttribute(status, "Status"));
 
             assertEquals(false, jmxServer.getAttribute(status, "Running"));
@@ -125,6 +128,12 @@ public class MBeanTest extends TestBase 
         try {
             assertTrue(jmxServer.isRegistered(status));
             assertEquals("client: Foo", jmxServer.getAttribute(status, 
"Mode"));
+
+            assertEquals("1", jmxServer.getAttribute(status, 
"FailedRequests").toString());
+            assertEquals("-1", jmxServer.getAttribute(status, 
"SecondsSinceLastSuccess").toString());
+
+            assertEquals("1", jmxServer.invoke(status, "calcFailedRequests", 
null, null).toString());
+            assertEquals("-1", jmxServer.invoke(status, 
"calcSecondsSinceLastSuccess", null, null).toString());
         } finally {
             client.close();
         }
@@ -168,6 +177,18 @@ public class MBeanTest extends TestBase 
             assertEquals(true, jmxServer.getAttribute(serverStatus, 
"Running"));
             assertEquals(true, jmxServer.getAttribute(clientStatus, 
"Running"));
 
+            assertEquals("0", jmxServer.getAttribute(clientStatus, 
"FailedRequests").toString());
+            assertEquals("0", jmxServer.getAttribute(clientStatus, 
"SecondsSinceLastSuccess").toString());
+            assertEquals("0", jmxServer.invoke(clientStatus, 
"calcFailedRequests", null, null).toString());
+            assertEquals("0", jmxServer.invoke(clientStatus, 
"calcSecondsSinceLastSuccess", null, null).toString());
+
+            Thread.sleep(1000);
+
+            assertEquals("0", jmxServer.getAttribute(clientStatus, 
"FailedRequests").toString());
+            assertEquals("1", jmxServer.getAttribute(clientStatus, 
"SecondsSinceLastSuccess").toString());
+            assertEquals("0", jmxServer.invoke(clientStatus, 
"calcFailedRequests", null, null).toString());
+            assertEquals("1", jmxServer.invoke(clientStatus, 
"calcSecondsSinceLastSuccess", null, null).toString());
+
             assertEquals(new Long(2), jmxServer.getAttribute(connectionStatus, 
"TransferredSegments"));
             assertEquals(new Long(128), 
jmxServer.getAttribute(connectionStatus, "TransferredSegmentBytes"));
 

