Hi Bart,

Thank for keeping the spirit up. An hour ago I spotted the same problem while 
testing. I should have a fix for that in a little bit. I’ll be in touch.

Kind regards,
Kaveh.

> On 28 Jun 2016, at 09:38, Bart DS <[email protected]> wrote:
> 
> Ok, I did some tests and the agents are running fine now.
> At least, initially...
> 
> When I start my first server via the following command:
> 
> ./build/bin/arangod --server.authentication=false --server.endpoint 
> tcp://0.0.0.0:8529 --cluster.my-address tcp://127.0.0.1:8529 
> --cluster.my-local-info db1 --cluster.my-role PRIMARY 
> --cluster.agency-endpoint tcp://127.0.0.1:5001 primary1
> 
> I get following output:
> 
> bart@laptop ~/test/arangodb $ ./build/bin/arangod 
> --server.authentication=false --server.endpoint tcp://0.0.0.0:8529 
> --cluster.my-address tcp://127.0.0.1:8529 --cluster.my-local-info db1 
> --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://127.0.0.1:5001 
> primary1
> 2016-06-28T07:06:38Z [24113] INFO using SSL options: 
> SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
> 2016-06-28T07:06:38Z [24113] INFO file-descriptors (nofiles) hard limit is 
> 4096, soft limit is 1024
> 2016-06-28T07:06:38Z [24113] INFO created database directory 'primary1'.
> 2016-06-28T07:06:38Z [24113] INFO WAL directory 'primary1/journals' does not 
> exist. creating it...
> 2016-06-28T07:06:38Z [24113] INFO ArangoDB 3.0.x-devel [linux] 64bit, using 
> VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1f 6 Jan 2014
> 2016-06-28T07:06:38Z [24113] INFO loaded database '_system' from 
> 'primary1/databases/database-1'
> 2016-06-28T07:06:38Z [24113] INFO the server has 4 (hyper) cores, using 1 
> scheduler thread(s), 4 dispatcher thread(s)
> 2016-06-28T07:06:39Z [24113] INFO JavaScript using startup './js', 
> application './js/apps'
> 2016-06-28T07:06:39Z [24113] INFO changing state of PRIMARY server from 
> UNDEFINED to STARTUP
> 2016-06-28T07:06:39Z [24113] INFO Cluster feature is turned on. Agency 
> version: , Agency endpoints: http+tcp://127.0.0.1:5002, 
> http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, 
> http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, 
> http+tcp://127.0.0.1:5001, server id: 'DBServer001', internal address: 
> tcp://127.0.0.1:8529, role: PRIMARY
> 2016-06-28T07:06:39Z [24113] INFO using heartbeat interval value '1000 ms' 
> from agency
> 2016-06-28T07:06:39Z [24113] INFO changing state of PRIMARY server from 
> STARTUP to SERVING
> 2016-06-28T07:06:39Z [24113] INFO In database '_system': No version 
> information file found in database directory.
> 2016-06-28T07:06:39Z [24113] INFO In database '_system': Database is 
> up-to-date (30000/cluster-local/init)
> 2016-06-28T07:06:40Z [24113] INFO using endpoint 'http+tcp://0.0.0.0:8529' 
> for non-encrypted requests
> 2016-06-28T07:06:40Z [24113] INFO Authentication is turned off
> 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001
> 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001
> 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001
> 2016-06-28T07:06:40Z [24113] INFO In database '_system': Database is 
> up-to-date (-/db-server-local/init)
> 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001
> 2016-06-28T07:06:40Z [24113] INFO ArangoDB (version 3.0.x-devel [linux]) is 
> ready for business. Have fun!
> 2016-06-28T07:06:41Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:06:51Z [24113] ERROR error details: 
> {"code":307,"errorNum":0,"errorMessage":"Temporary Redirect (Temporary 
> Redirect)","error":true}
> 2016-06-28T07:06:51Z [24113] ERROR error stack: ArangoError: Temporary 
> Redirect (Temporary Redirect)\n    at Error (native)\n    at writeLocked 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1663:41)\n   
>  at createLocalDatabases 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:348:9)\n    
> at handleDatabaseChanges 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:461:3)\n    
> at handleChanges 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1460:3)\n    
> at handlePlanChange 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1675:24)
> 2016-06-28T07:06:51Z [24113] ERROR plan change handling failed
> 2016-06-28T07:06:52Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:06:53Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:06:54Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:06:55Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:06:56Z [24113] WARNING {heartbeat} heartbeat could not be sent 
> to agency endpoints (http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, 
> http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, 
> http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5002, 
> http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, 
> http+tcp://127.0.0.1:5001): http code: 307, body:
> 2016-06-28T07:06:56Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:06:57Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:06:58Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:06:59Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:00Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:01Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:02Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:03Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:04Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:05Z [24113] WARNING {heartbeat} heartbeat could not be sent 
> to agency endpoints (http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5003, 
> http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, 
> http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, 
> http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, 
> http+tcp://127.0.0.1:5001): http code: 307, body:
> 2016-06-28T07:07:05Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:06Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:07Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:09Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:10Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:13Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:13Z [24113] ERROR {heartbeat} Could not read Current/Version 
> from agency.
> 2016-06-28T07:07:28Z [24113] INFO plan change handling successful
> 
> 
> And the agents start using all cpu resources until the system becomes almost 
> unresponsive.
> 
> At that time, the agents are logging following messages:
> 
> 2016-06-28T07:06:52Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:06:52Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:06:54Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:06:57Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:06:58Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:06:58Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:02Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:02Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:10Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:13Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:25Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:30Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:30Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:33Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:07:43Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:08:24Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:08:34Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:08:54Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:08:55Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:09:14Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:09:25Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 
> 
> When I stop the primary server again via ctrl-c, after some time the cpu 
> usage starts to decrease but the memory is increasing rapidly, finally taking 
> up all system memory and swap, in the end freezing the whole system.
> At this point the agents start to log following messages:
> 
> 2016-06-28T07:12:34Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 
> 26.345204
> 2016-06-28T07:12:43Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:13:05Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:13:10Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:13:20Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:13:23Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:13:39Z [23997] WARNING {agency} I have a higher term than RPC 
> caller.
> 2016-06-28T07:15:23Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 
> 22.091859
> 2016-06-28T07:21:06Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 
> 17.283082
> 2016-06-28T07:22:26Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 
> 19.252867
> 2016-06-28T07:22:26Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 
> 40.744055
> [...]
> 
> 
> So the initial issue with the agents seems to be resolved, but it's 
> definitely still not working correctly
> 
> Bart
> 
> 
> 
> --
> You received this message because you are subscribed to a topic in the Google 
> Groups "ArangoDB" group.
> To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/arangodb/p30FN_TXp30/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to 
> [email protected].
> For more options, visit https://groups.google.com/d/optout.

-- 
You received this message because you are subscribed to the Google Groups 
"ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail

Reply via email to