I also applied these configuration options earlier today to all the servers in one of my pools that was experiencing high IO load and repeated SigAlarms:

command_timeout: 600
wserver_timeout: 30
max_recover: 150

And since then, everything has been quiet:

IO on the main node that gossips externally:
https://i.imgur.com/ERgz0Xo.jpg

IO from another node in the same pool that gossips internally with the above node:
https://i.imgur.com/wsaxrJ5.jpg

Hopefully this can help other operators keep things in better shape for the time being.

-T

> On Feb 6, 2019, at 3:22 AM, Rolf Wuerdemann <ro...@digitalis.org> wrote:
>
> With your suggestions:
>
> load average below 1
> Traffic: ~150G/day
>
> Best,
>
> Rolf
>
> On 2019-02-04 12:52, Martin Dobrev wrote:
>> Hi,
>> I've spent the last week trying to optimize my configuration as much as
>> possible. Following advice from a previous mail I've added:
>>> command_timeout: 600
>>> wserver_timeout: 30
>>> max_recover: 150
>> to my sksconf, and it seems this fixed the majority of the EventLoop
>> failures. I've also added DB_CONFIG in the KDB/PTree folders to get rid of
>> the DB archive logs that were causing plenty of IO load too.
>> My clusters are now happily responding to queries and the load average is
>> below one. Traffic-wise things look better too, ~20GB/day.
>> Kind regards,
>> Martin Dobrev
>> P.S. Adding/changing DB_CONFIG might cause an error in the databases
>> that you can easily fix by running
>> db_recover -e -v -h <path to SKS>/{KDB,PTree}
>> On 04/02/2019 09:49, Rolf Wuerdemann wrote:
>>> Hi,
>>> Don't get me wrong, but within three days I've got 450G of traffic,
>>> 99.9% of which can be attributed to sks. Extrapolated to 30 days this
>>> means 4.5T (which is in good agreement with your 2+T/key for these
>>> two poison keys).
>>> With this amount of traffic and the possibility of getting
>>> more of these keys (thus more traffic) at any moment, I think it's
>>> only a question of time until the network with the current
>>> implementation will vanish. Traffic increased by roughly a factor of
>>> 300 (15G->4.5T) within twelve months, while the number of nodes in the
>>> network decreased by at least a factor of two over the same time.
>>> So: where to go and how?
>>> Just my 2ct,
>>> rowue
>>> On 2019-01-30 22:09, Martin Dobrev wrote:
>>> Hi,
>>> My observations so far show that both keys generate 2+ TB/month
>>> traffic on average for all my clustered nodes. I'm running nginx +
>>> Varnish in-memory cache tuned at 5 minutes TTL which gives plenty of
>>> CPU cycles for the never-ending EventLoop alarm loops. The latter
>>> cause load-average spikes of up to 10 with just 4 Docker containers
>>> running on a 12 core system.
>>> Don't get me wrong. The throttling penalty is something I'd
>>> swallow as long as we keep the network running.
>>> Regards,
>>> Martin
>>> keyserver.dobrev.eu | pgp.dobrev.it
>>> -------- Original message --------
>>> From: Kristian Fiskerstrand <kristian.fiskerstr...@sumptuouscapital.com>
>>> Date: 30/01/2019 20:18 (GMT+00:00)
>>> To: Shengjing Zhu <zsj950...@gmail.com>, sks-devel@nongnu.org
>>> Subject: Re: [Sks-devel] Unusual traffic for key 0x69D2EAD9 and 0xB33B4659
>>> On 1/12/19 8:15 PM, Shengjing Zhu wrote:
>>> I think these requests are quite unusual.
>>> Does anyone know what happens to these two keys?
>>> Just to add a comment on this: adding a cache on the load-balancer is
>>> really a nice way to slow down hits on the underlying SKS nodes. I keep
>>> the cache for 10 minutes in nginx, which really makes life more pleasant.
>>> --
>>> ----------------------------
>>> Kristian Fiskerstrand
>>> Blog: https://blog.sumptuouscapital.com
>>> Twitter: @krifisk
>>> ----------------------------
>>> Public OpenPGP keyblock at hkp://pool.sks-keyservers.net
>>> fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3
>>> ----------------------------
>>> "Action is the foundational key to all success"
>>> (Pablo Picasso)
>>> _______________________________________________
>>> Sks-devel mailing list
>>> Sks-devel@nongnu.org
>>> https://lists.nongnu.org/mailman/listinfo/sks-devel
>
> --
> Security is an illusion - Datasecurity twice
> Rolf Würdemann - ro...@digitalis.org - DL9ROW
> GnuPG fingerprint: EEDC BEA9 EFEA 54A9 E1A9 2D54 69CC 9F31 6C64 206A
> xmpp: ro...@digitalis.org E1189573 6B4A150C A0C2BF5A 5553F865 0B9CBF7A
> ro...@jabber.ccc.de 64CBBB68 0A3514A4 026FC1E7 5328CE87 AEE2185F
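
For anyone who wants to redo the traffic extrapolation from the quoted thread with their own numbers (450G in three days, scaled to 30 days, compared against the 15G figure), here is a minimal Python sketch. The function names are made up for illustration, and it assumes decimal units (1T = 1000G) and a simple linear scaling, which is what the quoted estimate appears to use.

```python
def extrapolate_gb(observed_gb: float, observed_days: float,
                   target_days: float = 30.0) -> float:
    """Linearly scale an observed traffic volume (in GB) to target_days."""
    if observed_days <= 0:
        raise ValueError("observed_days must be positive")
    return observed_gb / observed_days * target_days


def growth_factor(old_gb: float, new_gb: float) -> float:
    """Ratio of the new traffic volume to the old one."""
    if old_gb <= 0:
        raise ValueError("old_gb must be positive")
    return new_gb / old_gb


# Figures taken from the quoted thread: 450G within three days,
# compared against the earlier 15G figure.
monthly_gb = extrapolate_gb(450, 3)
factor = growth_factor(15, monthly_gb)

# Assumes 1T = 1000G when converting for display.
print(f"{monthly_gb / 1000:.1f}T per 30 days, factor {factor:.0f}")
# prints "4.5T per 30 days, factor 300"
```

Swapping in your own observed volume and number of days gives a rough monthly figure to compare against what your hosting plan allows.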