Hi,

Noel invited me to join the discussion, and I hope nobody objects (or 
better yet, someone finds something useful) if I cross-post here (updated 
with more current info) regarding clustered H2.  We are also trying to use 
it, and most of what I have seen looks pretty good.

Our requirement is for a relatively small DB with a low rate of 
transactions.  I'd like to have a db clone on each of 3 nodes.  Clients 
would be directed to one node via a virtual IP; the cluster would apply 
all operations at that node and onto the two copies.  If either *copy* 
(not the 'master' at the virtual IP) fails, the cluster continues to 
operate normally; after the node is recovered, CreateCluster is re-run as 
usual, albeit with 3 nodes.  If the master (the node at the virtual IP) 
fails, our code selects a new master from either of the 2 copies.  That 
node takes over the virtual IP and the cluster continues operating 
normally.  We detect when the failed node returns and issue a 
"CreateCluster" to reincorporate it.
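
To make the failover step concrete, here is a minimal sketch of the 
master-selection logic described above (the class and method names are 
mine, not our actual code): on master failure, promote the first surviving 
node from a fixed priority order.

```java
import java.util.List;

// Hypothetical sketch only: promote the first surviving node from a fixed
// priority order when the current master fails. Names are illustrative.
class MasterSelector {
    static String selectNewMaster(List<String> nodesInPriorityOrder, String failedMaster) {
        for (String node : nodesInPriorityOrder) {
            if (!node.equals(failedMaster)) {
                return node; // first surviving copy takes over the virtual IP
            }
        }
        throw new IllegalStateException("no surviving node to promote");
    }
}
```

A deterministic priority order matters here: all monitors must agree on 
which copy takes over the virtual IP, or two nodes could claim it at once.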

As far as I can see this imposes only a single requirement on H2 that is 
different from usual: support for 3 nodes in a CreateCluster -serverList.  
I've begun exploring this a bit - I have a working 3 node cluster (nodes 
.55, .53, and .51).  I run the server on .53 and .51, e.g.
java -cp /usr/lib/java/*:. org.h2.tools.Server -tcp -tcpPort 9101 \
    -tcpAllowOthers -baseDir <base> &
Then on my 'first' server (.55) I successfully run CreateCluster (twice) to 
create db copies on both nodes .53 and .51:
java -cp /usr/lib/java/*:. org.h2.tools.CreateCluster \
    -urlSource jdbc:h2:tcp://172.16.35.55:9101/msp \
    -urlTarget jdbc:h2:tcp://172.16.35.51:9101/msp \
    -serverList 172.16.35.55:9101,172.16.35.51:9101,172.16.35.53:9101
The H2 logs show an update to the CLUSTER server list that includes all 
three servers.
/*SQL t:1*/SET CLUSTER '172.16.35.55:9101,172.16.35.51:9101,172.16.35.53:9101';
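
For completeness, clients then reach the cluster through a single 
multi-host URL built from that same server list.  The helper below is just 
my illustration of the URL shape (it is not an H2 API), assuming the 
comma-separated host list goes before the database name:

```java
// Illustrative helper (not part of H2): build the multi-host JDBC URL a
// client would use against the cluster, from the same -serverList value.
// Assumed URL shape: jdbc:h2:tcp://host1:port,host2:port,host3:port/dbname
class ClusterUrl {
    static String clusterUrl(String serverList, String dbName) {
        return "jdbc:h2:tcp://" + serverList + "/" + dbName;
    }
}
```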
Code in SessionRemote loops through the server list, applying each SQL 
operation.  I also see this interesting comment at line 397: 
      // TODO cluster: support more than 2 connections
What would need to be done?

I've:
- run a number of SQL operations from multiple connections and all 3 server 
DB copies are updated as expected.
- failed an H2 server and/or node and successfully issued operations on the 
remaining cluster.
- reincorporated a failed node using "CreateCluster"
- successfully issued operations against the recovered cluster.
- obtained expected behaviors when attempting to insert a duplicate key and 
to update or delete a missing record - basic stuff.
- obtained quite unexpected behavior by issuing 'ifconfig eth0 down' on a 
node, which halts the entire database until the adapter returns.  Ouch.
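
On that last point: whatever monitor drives our failover will need to 
probe nodes with a bounded timeout, precisely so a dead adapter can't hang 
the monitor the way it hangs the database.  A hypothetical sketch (names 
are mine) using a plain TCP connect against the H2 server port:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical monitor sketch: probe a node's H2 server port with a short,
// bounded timeout so a downed adapter cannot hang the caller indefinitely.
class NodeProbe {
    static boolean isNodeUp(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false; // connection refused or timed out: treat node as down
        }
    }
}
```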

I also attempted to create a SQL operation failure on a single node, hoping 
to receive an exception and see logs indicating a rollback.  I found that I 
could connect with a URL identifying a single clustered node (more below), 
remove a record on only that node, and then attempt to update that record 
at the cluster.  My takeaway so far is that I am able to break consistency, 
and that the database does not reliably detect and roll back an operation 
on a record that is missing on one node.  I've not explored this very 
thoroughly, and I'm unclear on what behavior to expect with the current 
H2 - especially in an unsupported 3 node configuration.

While looking at logs I noted that under some circumstances - stop and 
restart one H2 server, then connect and disconnect using a normal multihost 
URL - the value of CLUSTER is set to empty:
10-02 16:15:02 jdbc[14]: /*SQL */SET AUTO_RECONNECT TRUE;
10-02 16:15:02 jdbc[14]: /*SQL */SET CLUSTER TRUE;
10-02 16:15:02 jdbc[14]: /*SQL */SET TRACE_LEVEL_SYSTEM_OUT 3;
10-02 16:15:02 jdbc[14]: /*SQL */SET DB_CLOSE_DELAY -1;
10-02 16:15:04 index: SYS_DATA remove ( /* key:119 */ 58, 0, 6, 'SET 
CLUSTER ''172.16.35.55:9101,172.16.35.53:9101,172.16.35.51:9101''')
10-02 16:15:04 pageStore: log - s: 1 table: 0 row: ( /* key:119 */ 58, 0, 
6, 'SET CLUSTER ''172.16.35.55:9101,172.16.35.53:9101,172.16.35.51:9101''')
10-02 16:15:04 index: SYS_DATA add ( /* key:120 */ 58, 0, 6, 'SET CLUSTER 
''''')
10-02 16:15:04 pageStore: log + s: 1 table: 0 row: ( /* key:120 */ 58, 0, 
6, 'SET CLUSTER ''''')
Though this is a cluster, at this point you can connect and issue 
operations against a JDBC URL identifying a single node in the cluster.  
The documented SELECT VALUE FROM INFORMATION_SCHEMA.SETTINGS WHERE 
NAME='CLUSTER'; returns an empty '' from a successful connect until the 
first write.  Not a surprise, but not the expected or documented behavior.  
Following any write operation the expected list of servers is restored.
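
So any health check built on that SETTINGS query has to tolerate the empty 
value.  Something like the sketch below (the helper name is mine), which 
treats '' as "no peers currently claimed" rather than as an error:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative helper: interpret the CLUSTER value read from
// INFORMATION_SCHEMA.SETTINGS. An empty string means the node currently
// claims no cluster peers (the state observed above after reconnect);
// otherwise the value is the comma-separated server list.
class ClusterSetting {
    static List<String> parseServerList(String clusterValue) {
        if (clusterValue == null || clusterValue.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(clusterValue.split(","));
    }
}
```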

If I understand the H2 clustering model, its approach is to detect a failed 
server (or node), explicitly remove it from the cluster, and continue 
operating as a "healthy" cluster with one less node.  Apart from this, 
operations are always expected to complete on all nodes.  That's pretty 
blunt, but if it is "good enough" for a particular application then it is.  
We really like H2 and I'm guessing that there is a pretty substantial 
market for a lightweight solution with good enough behavior.  If we can 
converge on what is good enough between a few of us, maybe we can make this 
work.  Thoughts? 

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to h2-database+unsubscr...@googlegroups.com.
To post to this group, send email to h2-database@googlegroups.com.
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/groups/opt_out.
