Hi Alex,
We already have an issue on { hotAlignment:true } in distributed settings: https://github.com/orientechnologies/orientdb/issues/2270

It's in our queue.

Lvc@

On 13 June 2014 18:18, alexpmorris <[email protected]> wrote:
> Been testing with the 1.7.4 snapshot, and I still can't get an orientdb
> cluster to properly align itself after a node is removed and added back.
> I've tried on windows, on ubuntu, and ubuntu as root just to be sure.
> I've tried adjust parameters in hazelcast.xml and
> default-distributed-db-config.json, still nothing. If i completely erase
> the db and let it recover, it generally will work. However, it will not
> properly sync if a record has been altered while the other node was down.
>
> Here is the log file of what happens (this was 1.7.4 snapshot, on ubuntu
> as root):
>
> 2014-06-13 12:02:07:505 INFO Loading configuration from:
> /home/test/orientdb/orientdb2/config/orientdb-dserver-config.xml...
> [OServerConfigurationLoaderXml]
> 2014-06-13 12:02:07:913 INFO OrientDB Server v1.7-SNAPSHOT (build UNKNOWN@r;
> 2014-06-12 18:25:56+0200) is starting up... [OServer]
> 2014-06-13 12:02:07:926 INFO Databases directory:
> /home/test/orientdb/orientdb2/databases [OServer]
> 2014-06-13 12:02:08:001 INFO Port 0.0.0.0:2424 busy, trying the next
> available... [OServerNetworkListener]
> 2014-06-13 12:02:08:002 INFO Listening binary connections on 0.0.0.0:2425
> (protocol v.21, socket=default) [OServerNetworkListener]
> 2014-06-13 12:02:08:002 INFO Port 0.0.0.0:2480 busy, trying the next
> available... [OServerNetworkListener]
> 2014-06-13 12:02:08:003 INFO Listening http connections on 0.0.0.0:2481
> (protocol v.10, socket=default) [OServerNetworkListener]
> 2014-06-13 12:02:08:015 INFO Installing dynamic plugin 'studio-1.7.zip'...
> [OServerPluginManager]
> 2014-06-13 12:02:08:146 INFO Installing GREMLIN language v.2.5.0 -
> graph.pool.max=50 [OGraphServerHandler]
> 2014-06-13 12:02:08:195 INFO Starting distributed server
> 'node1402673455127'...
> [OHazelcastPlugin]
> 2014-06-13 12:02:08:245 INFO Configuring Hazelcast from
> '/home/test/orientdb/orientdb2/config/hazelcast.xml'. [FileSystemXmlConfig]
> 2014-06-13 12:02:08:591 INFO null [orientdb] [3.2.1] Prefer IPv4 stack is
> true. [DefaultAddressPicker]
> 2014-06-13 12:02:08:623 INFO null [orientdb] [3.2.1] Picked
> Address[192.168.1.10]:2435, using socket
> ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=2435], bind any local is true
> [DefaultAddressPicker]
> 2014-06-13 12:02:08:775 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Hazelcast Community Edition 3.2.1 (20140428) starting at
> Address[192.168.1.10]:2435 [system]
> 2014-06-13 12:02:08:775 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Copyright (C) 2008-2014 Hazelcast.com [system]
> 2014-06-13 12:02:08:784 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Creating MulticastJoiner [Node]
> 2014-06-13 12:02:08:810 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Address[192.168.1.10]:2435 is STARTING [LifecycleService]
> 2014-06-13 12:02:09:010 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Connecting to /192.168.1.10:2434, timeout: 0, bind-any: true
> [SocketConnector]
> 2014-06-13 12:02:09:028 INFO [192.168.1.10]:2435 [orientdb] [3.2.1] 49736
> accepted socket connection from /192.168.1.10:2434
> [TcpIpConnectionManager]
> 2014-06-13 12:02:14:550 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
>
> Members [2] {
>   Member [192.168.1.10]:2434
>   Member [192.168.1.10]:2435 this
> }
> [ClusterService]
> 2014-06-13 12:02:16:100 INFO [192.168.1.10]:2435 [orientdb] [3.2.1]
> Address[192.168.1.10]:2435 is STARTED [LifecycleService]
> 2014-06-13 12:02:16:117 INFO [node1402673455127] found no previous
> messages in queue orientdb.node.node1402673455127.response
> [OHazelcastDistributedMessageService]
> 2014-06-13 12:02:16:296 WARN [node1402673455127] opening database
> 'testdb'... [OHazelcastPlugin]
> 2014-06-13 12:02:16:302 INFO [node1402673455127] loaded database
> configuration from active cluster [OHazelcastPlugin]
> 2014-06-13 12:02:16:354 INFO updated distributed configuration for
> database: testdb:
> ----------
> {
>   "version":2,
>   "autoDeploy":true,
>   "hotAlignment":true,
>   "readQuorum":1,
>   "writeQuorum":2,
>   "failureAvailableNodesLessQuorum":false,
>   "readYourWrites":true,"clusters":{
>     "internal":null,
>     "index":null,
>     "*":{
>       "servers":["<NEW_NODE>","node1402673438702","node1402673455127"]
>     }
>   }
> }
> ---------- [OHazelcastPlugin]
> 2014-06-13 12:02:16:375 WARN [node1402673455127] found 1 previous messages
> in queue orientdb.node.node1402673455127.testdb.request, aligning the
> database... [OHazelcastDistributedMessageService]
> 2014-06-13 12:02:18:854 WARN Storage testdb was not closed properly. Will
> try to restore from write ahead log. [OLocalPaginatedStorage]
> 2014-06-13 12:02:18:854 SEVE Restore is not possible because write ahead
> log is empty. [OLocalPaginatedStorage]
> 2014-06-13 12:02:18:927 INFO Storage data restore was completed
> [OLocalPaginatedStorage]
> 2014-06-13 12:02:22:321 WARN segment file 'database.ocf' was not closed
> correctly last time [OSingleFileSegment]
> 2014-06-13 12:02:22:334 WARN Can not restore 1 WAL master record for
> storage testdb [OWriteAheadLog][node1402673455127]<-[node1402673438702]
> error on reading distributed request: record_update(#9:4 v.6)
> Error on creation of shared resource
> -> com.orientechnologies.common.concur.resource.OSharedContainerImpl.getResource(OSharedContainerImpl.java:55)
> -> com.orientechnologies.orient.server.distributed.ODistributedStorage.getResource(ODistributedStorage.java:516)
> -> com.orientechnologies.orient.core.metadata.OMetadataDefault.init(OMetadataDefault.java:110)
> -> com.orientechnologies.orient.core.metadata.OMetadataDefault.load(OMetadataDefault.java:68)
> -> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:291)
> -> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
> -> com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.initDatabaseInstance(OHazelcastDistributedDatabase.java:281)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:471)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
> -> java.lang.Thread.run(Thread.java:745)
> The record with id '#0:1' not found
> -> com.orientechnologies.common.concur.resource.OSharedContainerImpl.getResource(OSharedContainerImpl.java:55)
> -> com.orientechnologies.orient.server.distributed.ODistributedStorage.getResource(ODistributedStorage.java:516)
> -> com.orientechnologies.orient.core.metadata.OMetadataDefault.init(OMetadataDefault.java:110)
> -> com.orientechnologies.orient.core.metadata.OMetadataDefault.load(OMetadataDefault.java:68)
> -> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:291)
> -> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
> -> com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.initDatabaseInstance(OHazelcastDistributedDatabase.java:281)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:471)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
> -> java.lang.Thread.run(Thread.java:745)
> Storage testdb is not opened.
> -> com.orientechnologies.common.concur.resource.OSharedContainerImpl.getResource(OSharedContainerImpl.java:55)
> -> com.orientechnologies.orient.server.distributed.ODistributedStorage.getResource(ODistributedStorage.java:516)
> -> com.orientechnologies.orient.core.metadata.OMetadataDefault.init(OMetadataDefault.java:110)
> -> com.orientechnologies.orient.core.metadata.OMetadataDefault.load(OMetadataDefault.java:68)
> -> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:291)
> -> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
> -> com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.initDatabaseInstance(OHazelcastDistributedDatabase.java:281)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.onMessage(OHazelcastDistributedDatabase.java:471)
> -> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase$1.run(OHazelcastDistributedDatabase.java:243)
> -> java.lang.Thread.run(Thread.java:745)
> 2014-06-13 12:02:22:850 INFO [node1402673455127] executed all pending
> tasks in queue, set restoringMessages=false and database 'testdb' as
> online... [OHazelcastDistributedDatabase$1]
> 2014-06-13 12:02:43:795 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 1/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:44:096 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 2/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:44:397 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 3/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:44:699 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 4/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:45:001 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 5/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:45:307 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 6/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:45:608 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 7/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:45:909 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 8/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:46:210 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 9/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:46:511 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 10/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:46:811 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 11/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:47:112 INFO Node is not online yet
> (status=STARTING),
> blocking the command until it's online 12/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:47:412 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 13/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:47:713 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 14/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:48:016 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 15/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:48:318 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 16/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:48:619 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 17/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:48:920 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 18/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:49:221 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 19/20 [ONetworkProtocolHttpDb]
> 2014-06-13 12:02:49:522 INFO Node is not online yet (status=STARTING),
> blocking the command until it's online 20/20 [ONetworkProtocolHttpDb]
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
