Could you please try something else for me. This is just that I know, if we are just messing up the startup. We’ll know more from there hopefully:
Go into the directory, where you cloned arangodb: cd arangodb ./scripts/startLocalCluster 3 2 1 The log and databases end up in cluster. Could you please share those logs with me? Kind regards, Kaveh > On 28 Jun 2016, at 14:08, Bart DS <[email protected]> wrote: > > Doh, I should have tried that before reporting back... > > Anyway, I removed the old databases and restarted everything (3 agents, 2 > DBServers and 1 coordinator) > Several minutes after starting the coordinator (a lot of messages appear > regarding synchronization of shards) everything seems to run and I can access > the web interface on http://localhost:8531 > > However, as soon as I started the coordinator, I started to see several > warnings and errors on the DBServers and Coordinators: > > DBServer1: > > 2016-06-28T10:03:54Z [4940] INFO ArangoDB (version 3.0.x-devel [linux]) is > ready for business. Have fun! > 2016-06-28T10:03:54Z [4940] INFO plan change handling successful > 2016-06-28T10:03:55Z [4940] ERROR {heartbeat} Could not read Current/Version > from agency. > 2016-06-28T10:04:01Z [4940] INFO plan change handling successful > 2016-06-28T10:04:06Z [4940] INFO plan change handling successful > 2016-06-28T10:04:12Z [4940] INFO creating local shard '_system/s2' for > central '_system/1' > 2016-06-28T10:04:12Z [4940] INFO scheduleOneShardSynchronization: have > launched job { "database" : "_system", "shard" : "s2", "planId" : "1", > "leader" : "DBServer002" } > 2016-06-28T10:04:12Z [4940] INFO plan change handling successful > 2016-06-28T10:04:12Z [4940] INFO synchronizeOneShard: waiting for leader, > _system/s2, _system/1 > 2016-06-28T10:04:13Z [4940] INFO synchronizeOneShard: waiting for leader, > _system/s2, _system/1 > 2016-06-28T10:04:14Z [4940] INFO synchronizeOneShard: waiting for leader, > _system/s2, _system/1 > 2016-06-28T10:04:15Z [4940] INFO synchronizeOneShard: waiting for leader, > _system/s2, _system/1 > 2016-06-28T10:04:16Z [4940] INFO synchronizeOneShard: waiting for leader, > _system/s2, _system/1 > 2016-06-28T10:04:17Z [4940] INFO synchronizeOneShard: waiting for leader, > _system/s2, _system/1 > 2016-06-28T10:04:18Z [4940] INFO plan change handling successful > 2016-06-28T10:04:18Z [4940] INFO synchronizeOneShard: trying to synchronize > local shard '%s/%s' for central '%s/%s' _system s2 _system 1 > 2016-06-28T10:04:18Z [4940] INFO {replication} connected to master at > tcp://127.0.0.1:8530, id 257501827428435, version 3.0, last log tick 27 > 2016-06-28T10:04:19Z [4940] INFO synchronizeOneShard: synchronization worked > for shard s2 > 2016-06-28T10:04:19Z [4940] INFO synchronizeOneShard: done, _system/s2, > _system/1 > 2016-06-28T10:04:22Z [4940] INFO creating local shard '_system/s4' for > central '_system/3' > 2016-06-28T10:04:23Z [4940] INFO scheduleOneShardSynchronization: have > launched job { "database" : "_system", "shard" : "s2", "planId" : "1", > "leader" : "DBServer002" } > 2016-06-28T10:04:23Z [4940] INFO plan change handling successful > > [...] > > 2016-06-28T10:05:57Z [4940] INFO plan change handling successful > 2016-06-28T10:05:57Z [4940] INFO Applying chunk: > 2016-06-28T10:05:57Z [4940] ERROR syncCollectionFinalize: ensureIndex { > "tick" : "437", "type" : 2100, "database" : "1", "cid" : "398", "cname" : > "s20", "data" : { "fields" : [ "time" ], "id" : "21", "sparse" : false, > "type" : "skiplist", "unique" : false } } > {"errorNum":10,"errorMessage":"usage: ensureIndex(<description>)"} > 2016-06-28T10:05:57Z [4940] ERROR synchronizeOneShard: Could not finalize > shard synchronization s20 { "error" : true, "errorMessage" : "could not > replicate body ops", "response" : { "statusCode" : 200, "status" : 200, > "message" : "OK", "headers" : { "content-type" : "application/x-arango-dump; > charset=utf-8", "x-arango-replication-lasttick" : "437", > "x-arango-replication-active" : "true", "server" : "ArangoDB", "http/1.1" : > "200 OK", "content-length" : "153", "connection" : "Keep-Alive", > "x-arango-replication-frompresent" : "true", "x-arango-replication-checkmore" > : "false", "x-arango-replication-lastincluded" : "437" }, "rawBody" : [Object > { "length" : 153 }], "body" : > "{\"tick\":\"437\",\"type\":2100,\"database\":\"1\",\"cid\":\"398\",\"cname\":\"s20\",\"data\":{\"fields\":[\"time\"],\"id\":\"21\",\"sparse\":false,\"type\":\"skiplist\",\"unique\":false}}\n" > }, "exception" : [Object { "errorNum" : 10, "errorMessage" : "usage: > ensureIndex(<description>)" }] } > 2016-06-28T10:05:57Z [4940] ERROR synchronization of local shard > '_system/s20' for central '_system/19' failed: "Did not work for shard s20." > 2016-06-28T10:05:57Z [4940] INFO scheduleOneShardSynchronization: have > launched job { "database" : "_system", "shard" : "s10", "planId" : "9", > "leader" : "DBServer002" } > 2016-06-28T10:05:57Z [4940] INFO synchronizeOneShard: done, _system/s20, > _system/19 > 2016-06-28T10:05:57Z [4940] INFO scheduleOneShardSynchronization: have > launched job { "database" : "_system", "shard" : "s15", "planId" : "14", > "leader" : "DBServer002" } > 2016-06-28T10:05:57Z [4940] INFO synchronizeOneShard: already done, > _system/s10, _system/9 > > [...] > > 2016-06-28T10:07:08Z [4940] INFO synchronizeOneShard: trying to synchronize > local shard '%s/%s' for central '%s/%s' _system s32 _system 31 > 2016-06-28T10:07:08Z [4940] INFO {replication} connected to master at > tcp://127.0.0.1:8530, id 257501827428435, version 3.0, last log tick 788 > 2016-06-28T10:07:09Z [4940] INFO Applying chunk: > 2016-06-28T10:07:09Z [4940] ERROR syncCollectionFinalize: ensureIndex { > "tick" : "823", "type" : 2100, "database" : "1", "cid" : "768", "cname" : > "s32", "data" : { "fields" : [ "mount" ], "id" : "33", "selectivityEstimate" > : 1, "sparse" : false, "type" : "hash", "unique" : true } } > {"errorNum":10,"errorMessage":"usage: ensureIndex(<description>)"} > 2016-06-28T10:07:09Z [4940] ERROR synchronizeOneShard: Could not finalize > shard synchronization s32 { "error" : true, "errorMessage" : "could not > replicate body ops", "response" : { "statusCode" : 200, "status" : 200, > "message" : "OK", "headers" : { "content-type" : "application/x-arango-dump; > charset=utf-8", "x-arango-replication-lasttick" : "823", > "x-arango-replication-active" : "true", "server" : "ArangoDB", "http/1.1" : > "200 OK", "content-length" : "173", "connection" : "Keep-Alive", > "x-arango-replication-frompresent" : "true", "x-arango-replication-checkmore" > : "false", "x-arango-replication-lastincluded" : "823" }, "rawBody" : [Object > { "length" : 173 }], "body" : > "{\"tick\":\"823\",\"type\":2100,\"database\":\"1\",\"cid\":\"768\",\"cname\":\"s32\",\"data\":{\"fields\":[\"mount\"],\"id\":\"33\",\"selectivityEstimate\":1,\"sparse\":false,\"type\":\"hash\",\"unique\":true}}\n" > }, "exception" : [Object { "errorNum" : 10, "errorMessage" : "usage: > ensureIndex(<description>)" }] } > 2016-06-28T10:07:09Z [4940] ERROR synchronization of local shard > '_system/s32' for central '_system/31' failed: "Did not work for shard s32." > 2016-06-28T10:07:09Z [4940] INFO synchronizeOneShard: done, _system/s32, > _system/31 > 2016-06-28T10:07:09Z [4940] INFO scheduleOneShardSynchronization: have > launched job { "database" : "_system", "shard" : "s2", "planId" : "1", > "leader" : "DBServer002" } > 2016-06-28T10:07:09Z [4940] INFO plan change handling successful > 2016-06-28T10:07:09Z [4940] INFO scheduleOneShardSynchronization: have > launched job { "database" : "_system", "shard" : "s10", "planId" : "9", > "leader" : "DBServer002" } > 2016-06-28T10:07:09Z [4940] INFO synchronizeOneShard: already done, > _system/s2, _system/1 > 2016-06-28T10:07:09Z [4940] INFO scheduleOneShardSynchronization: have > launched job { "database" : "_system", "shard" : "s15", "planId" : "14", > "leader" : "DBServer002" } > 2016-06-28T10:07:09Z [4940] INFO synchronizeOneShard: already done, > _system/s10, _system/9 > > [...] > > DBServer2: > > 2016-06-28T10:03:53Z [5029] INFO using SSL options: > SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG > 2016-06-28T10:03:53Z [5029] INFO file-descriptors (nofiles) hard limit is > 4096, soft limit is 1024 > 2016-06-28T10:03:53Z [5029] INFO created database directory 'primary2'. > 2016-06-28T10:03:53Z [5029] INFO WAL directory 'primary2/journals' does not > exist. creating it... > 2016-06-28T10:03:53Z [5029] INFO ArangoDB 3.0.x-devel [linux] 64bit, using > VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1f 6 Jan 2014 > 2016-06-28T10:03:53Z [5029] INFO loaded database '_system' from > 'primary2/databases/database-1' > 2016-06-28T10:03:53Z [5029] INFO the server has 4 (hyper) cores, using 1 > scheduler thread(s), 4 dispatcher thread(s) > 2016-06-28T10:03:56Z [5029] INFO JavaScript using startup './js', application > './js/apps' > 2016-06-28T10:03:56Z [5029] INFO changing state of PRIMARY server from > UNDEFINED to STARTUP > 2016-06-28T10:03:57Z [5029] INFO Cluster feature is turned on. Agency > version: , Agency endpoints: http+tcp://127.0.0.1:5003, > http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, > http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, > http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, > http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5002, server id: > 'DBServer002', internal address: tcp://127.0.0.1:8530, role: PRIMARY > 2016-06-28T10:03:57Z [5029] INFO using heartbeat interval value '1000 ms' > from agency > 2016-06-28T10:03:57Z [5029] INFO changing state of PRIMARY server from > STARTUP to SERVING > 2016-06-28T10:03:57Z [5029] INFO In database '_system': No version > information file found in database directory. > 2016-06-28T10:03:57Z [5029] INFO In database '_system': Database is > up-to-date (30000/cluster-local/init) > 2016-06-28T10:03:57Z [5029] INFO using endpoint 'http+tcp://0.0.0.0:8530' for > non-encrypted requests > 2016-06-28T10:03:57Z [5029] INFO Authentication is turned off > 2016-06-28T10:03:57Z [5029] INFO bootstraped DB server DBServer002 > 2016-06-28T10:03:57Z [5029] INFO bootstraped DB server DBServer002 > 2016-06-28T10:03:57Z [5029] INFO bootstraped DB server DBServer002 > 2016-06-28T10:03:57Z [5029] INFO In database '_system': Database is > up-to-date (-/db-server-local/init) > 2016-06-28T10:03:57Z [5029] INFO bootstraped DB server DBServer002 > 2016-06-28T10:03:57Z [5029] INFO ArangoDB (version 3.0.x-devel [linux]) is > ready for business. Have fun! > > [...] > > 2016-06-28T10:07:54Z [5029] INFO plan change handling successful > 2016-06-28T10:07:56Z [5029] INFO plan change handling successful > 2016-06-28T10:08:00Z [5029] INFO plan change handling successful > 2016-06-28T10:08:00Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/16/s17 > 2016-06-28T10:08:01Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/16/s17 > 2016-06-28T10:08:01Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/16/s17 > 2016-06-28T10:08:02Z [5029] ERROR {replication} insertLocal: dropping > follower DBServer001 for shard s17 > 2016-06-28T10:08:02Z [5029] ERROR {replication} insertLocal: dropping > follower DBServer001 for shard s20 > 2016-06-28T10:08:03Z [5029] INFO plan change handling successful > 2016-06-28T10:08:08Z [5029] INFO plan change handling successful > 2016-06-28T10:08:15Z [5029] INFO plan change handling successful > 2016-06-28T10:08:16Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:17Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:17Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:18Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:18Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:19Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:19Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:20Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > > [...] > > 2016-06-28T10:08:45Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:45Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:46Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:46Z [5029] ERROR FollowerInfo::add, timeout in agency > operation for key Current/Collections/_system/19/s20 > 2016-06-28T10:08:46Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:47Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:47Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:48Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:48Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:49Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:08:49Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > > [...] > > 2016-06-28T10:09:16Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:09:16Z [5029] ERROR FollowerInfo::remove, timeout in agency > operation for key Current/Collections/_system/19/s20 > 2016-06-28T10:09:16Z [5029] ERROR {replication} insertLocal: dropping > follower DBServer001 for shard s20 > 2016-06-28T10:15:20Z [5029] ERROR {replication} insertLocal: dropping > follower DBServer001 for shard s23 > 2016-06-28T10:15:20Z [5029] INFO plan change handling successful > 2016-06-28T10:15:22Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:15:22Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:15:23Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:15:23Z [5029] WARNING FollowerInfo::add, could not cas key > Current/Collections/_system/19/s20 > > [...] > > 2016-06-28T10:16:53Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:16:54Z [5029] WARNING FollowerInfo::remove, could not cas key > Current/Collections/_system/19/s20 > 2016-06-28T10:16:54Z [5029] ERROR FollowerInfo::remove, timeout in agency > operation for key Current/Collections/_system/19/s20 > 2016-06-28T10:16:54Z [5029] ERROR {replication} insertLocal: dropping > follower DBServer001 for shard s20 > > > Coordinator1: > > 2016-06-28T10:04:04Z [5109] INFO using SSL options: > SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG > 2016-06-28T10:04:04Z [5109] INFO Waiting for a DBserver to show up... > 2016-06-28T10:04:04Z [5109] INFO Found 2 DBservers. > 2016-06-28T10:04:04Z [5109] INFO file-descriptors (nofiles) hard limit is > 4096, soft limit is 1024 > 2016-06-28T10:04:04Z [5109] INFO created database directory 'coordinator'. > 2016-06-28T10:04:04Z [5109] INFO WAL directory 'coordinator/journals' does > not exist. creating it... > 2016-06-28T10:04:04Z [5109] INFO ArangoDB 3.0.x-devel [linux] 64bit, using > VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1f 6 Jan 2014 > 2016-06-28T10:04:04Z [5109] INFO loaded database '_system' from > 'coordinator/databases/database-1' > 2016-06-28T10:04:04Z [5109] INFO the server has 4 (hyper) cores, using 1 > scheduler thread(s), 4 dispatcher thread(s) > 2016-06-28T10:04:05Z [5109] INFO JavaScript using startup './js', application > './js/apps' > 2016-06-28T10:04:05Z [5109] INFO changing state of COORDINATOR server from > UNDEFINED to STARTUP > 2016-06-28T10:04:05Z [5109] INFO Cluster feature is turned on. Agency > version: , Agency endpoints: http+tcp://127.0.0.1:5003, > http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, > http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, > http+tcp://127.0.0.1:5001, server id: 'Coordinator001', internal address: > tcp://127.0.0.1:8531, role: COORDINATOR > 2016-06-28T10:04:05Z [5109] INFO using heartbeat interval value '1000 ms' > from agency > 2016-06-28T10:04:05Z [5109] INFO changing state of COORDINATOR server from > STARTUP to SERVING > 2016-06-28T10:04:05Z [5109] INFO In database '_system': No version > information file found in database directory. > 2016-06-28T10:04:05Z [5109] INFO In database '_system': Database is > up-to-date (30000/cluster-local/init) > 2016-06-28T10:04:06Z [5109] INFO using endpoint 'http+tcp://0.0.0.0:8531' for > non-encrypted requests > 2016-06-28T10:04:06Z [5109] INFO Authentication is turned off > 2016-06-28T10:04:06Z [5109] INFO In database '_system': Found 14 defined > task(s), 12 task(s) to run > 2016-06-28T10:04:06Z [5109] INFO In database '_system': state > coordinator-global/init, tasks setupGraphs, setupUsers, createUsersIndex, > addDefaultUserSystem, createModules, createRouting, insertRedirectionsAll, > setupAqlFunctions, createStatistics, createFrontend, setupQueues, setupJobs > 2016-06-28T10:06:52Z [5109] INFO In database '_system': init successfully > finished > 2016-06-28T10:07:11Z [5109] WARNING catch error in historian: ArangoError: > internal error\n at Error (native)\n at Object.exports.historian > (/home/bart/test2/arangodb/js/server/modules/@arangodb/statistics.js:490:8)\n > at eval (eval at undefined, <anonymous>:3:54)\n at eval (eval at > undefined, <anonymous>:3:70) > 2016-06-28T10:07:11Z [5109] INFO Waiting for synchronous replication of > system collections... > 2016-06-28T10:07:22Z [5109] WARNING catch error in historian: ArangoError: > internal error\n at Error (native)\n at Object.exports.historian > (/home/bart/test2/arangodb/js/server/modules/@arangodb/statistics.js:501:12)\n > at eval (eval at undefined, <anonymous>:3:54)\n at eval (eval at > undefined, <anonymous>:3:70) > 2016-06-28T10:07:47Z [5109] INFO bootstraped coordinator Coordinator001 > 2016-06-28T10:07:47Z [5109] INFO ArangoDB (version 3.0.x-devel [linux]) is > ready for business. Have fun! > 2016-06-28T10:09:16Z [5109] WARNING catch error in historian: ArangoError: > internal error\n at Error (native)\n at Object.exports.historian > (/home/bart/test2/arangodb/js/server/modules/@arangodb/statistics.js:501:12)\n > at eval (eval at undefined, <anonymous>:3:54)\n at eval (eval at > undefined, <anonymous>:3:70) > 2016-06-28T10:09:16Z [5109] WARNING catch error in historian: ArangoError: > internal error\n at Error (native)\n at Object.exports.historian > (/home/bart/test2/arangodb/js/server/modules/@arangodb/statistics.js:501:12)\n > at eval (eval at undefined, <anonymous>:3:54)\n at eval (eval at > undefined, <anonymous>:3:70) > 2016-06-28T10:16:54Z [5109] WARNING catch error in historian: ArangoError: > internal error\n at Error (native)\n at Object.exports.historian > (/home/bart/test2/arangodb/js/server/modules/@arangodb/statistics.js:501:12)\n > at eval (eval at undefined, <anonymous>:3:54)\n at eval (eval at > undefined, <anonymous>:3:70) > 2016-06-28T10:16:54Z [5109] WARNING catch error in historian: ArangoError: > internal error\n at Error (native)\n at Object.exports.historian > (/home/bart/test2/arangodb/js/server/modules/@arangodb/statistics.js:501:12)\n > at eval (eval at undefined, <anonymous>:3:54)\n at eval (eval at > undefined, <anonymous>:3:70) > > > The web interface doesn't show any Coordinators or DBServers in the dashboard > (Coordinators: 0, DBServers: 0, RAM: 3.3%, Connections: 7) > and when I go to the "Nodes" section, the Overview tab is empty. > > > Bart > > > On Tuesday, June 28, 2016 at 11:54:29 AM UTC+2, Kaveh Vahedipour wrote: > ok. i should have mentioned to go back and remove the broken databases before > restarting your cluster. > i.e. rm -rf agency* primary* > > and then restart. > > > > -- > You received this message because you are subscribed to a topic in the Google > Groups "ArangoDB" group. > To unsubscribe from this topic, visit > https://groups.google.com/d/topic/arangodb/p30FN_TXp30/unsubscribe. > To unsubscribe from this group and all its topics, send an email to > [email protected]. > For more options, visit https://groups.google.com/d/optout. -- You received this message because you are subscribed to the Google Groups "ArangoDB" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
signature.asc
Description: Message signed with OpenPGP using GPGMail
