To follow up on this: after making the changes below, my main disk IO went down, but my load average went up, memory usage went through the roof, and swapping ensued. I increased the amount of memory assigned to each of my main nodes (those that gossip with the outside world) and it seems to be holding steady so far: https://imgur.com/a/b0S4Ui2
I believe the nodes may also have been having issues gossiping, as I saw outbound network traffic flatline during the same time periods: https://imgur.com/a/IEoLboM

-T

> On Feb 6, 2019, at 4:21 PM, Todd Fleisher <t...@fleetstreetops.com> wrote:
>
> I also applied these configuration options earlier today to all the servers
> in one of my pools that was experiencing high IO load and repeated SigAlarms:
>
> command_timeout: 600
> wserver_timeout: 30
> max_recover: 150
>
> And since then, everything has been quiet:
>
> IO on the main node that gossips externally: https://i.imgur.com/ERgz0Xo.jpg
>
> IO from another node in the same pool that gossips internally with the above
> node: https://i.imgur.com/wsaxrJ5.jpg
>
> Hopefully this can help other operators keep things in better shape for the
> time being.
>
> -T
>
>> On Feb 6, 2019, at 3:22 AM, Rolf Wuerdemann <ro...@digitalis.org> wrote:
>>
>> With your suggestions:
>>
>> load average below 1
>> Traffic: ~150G/day
>>
>> Best,
>>
>> Rolf
>>
>> On 2019-02-04 12:52, Martin Dobrev wrote:
>>> Hi,
>>> I've spent the last week trying to optimize my configuration as much as
>>> possible. Following advice from a previous mail I've added:
>>>> command_timeout: 600
>>>> wserver_timeout: 30
>>>> max_recover: 150
>>> to my sksconf and it seems this fixed the majority of the EventLoop
>>> failures. I've added DB_CONFIG in the KDB/PTree folders to get rid of the
>>> DB archive logs that were causing plenty of IO load too.
>>> My clusters are now happily responding to queries and the load average is
>>> below one. Traffic-wise things look better too, ~20GB/day.
>>> Kind regards,
>>> Martin Dobrev
>>> P.S. Adding/changing DB_CONFIG might cause an error in the databases
>>> that you can easily fix by running
>>> db_recover -e -v -h <path to SKS>/{KDB,PTree}
>>> On 04/02/2019 09:49, Rolf Wuerdemann wrote:
>>>> Hi,
>>>> Don't get me wrong, but within three days I've got 450G of traffic,
>>>> 99.9% of which can be attributed to sks. Extrapolated to 30 days this
>>>> means 4.5T (which is in good agreement with your 2+T/key for these
>>>> two poison keys).
>>>> With this amount of traffic and the possibility of getting
>>>> more of these keys (thus more traffic) at any moment, I think it's
>>>> only a question of time until the network with the current
>>>> implementation will vanish. Traffic increased by roughly a factor of
>>>> 300 (15G->4.5T) within twelve months, while the number of nodes in the
>>>> network decreased by at least a factor of two over the same time.
>>>> So: where to go and how?
>>>> Just my 2ct,
>>>> rowue
>>>> On 2019-01-30 22:09, Martin Dobrev wrote:
>>>> Hi,
>>>> My observations so far show that both keys generate 2+ TB/month of
>>>> traffic on average for all my clustered nodes. I'm running nginx +
>>>> Varnish in-memory cache tuned at 5 minutes TTL which gives plenty of
>>>> CPU cycles for the never-ending EventLoop alarm loops. The latter
>>>> cause load-average spikes of up to 10 with just 4 Docker containers
>>>> running on a 12 core system.
>>>> Don't get me wrong. The throttling penalty is something I'd swallow
>>>> as long as we keep the network running.
>>>> Regards,
>>>> Martin
>>>> keyserver.dobrev.eu | pgp.dobrev.it
>>>> -------- Original message --------
>>>> From: Kristian Fiskerstrand <kristian.fiskerstr...@sumptuouscapital.com>
>>>> Date: 30/01/2019 20:18 (GMT+00:00)
>>>> To: Shengjing Zhu <zsj950...@gmail.com>, sks-devel@nongnu.org
>>>> Subject: Re: [Sks-devel] Unusual traffic for key 0x69D2EAD9 and
>>>> 0xB33B4659
>>>> On 1/12/19 8:15 PM, Shengjing Zhu wrote:
>>>> I think these requests are quite unusual.
>>>> Does anyone know what happens to these two keys?
>>>>
>>>> Just to add a comment on this, adding a cache on the load-balancer is
>>>> really a nice way to slow down hits on the underlying SKS nodes, I keep
>>>> cache for 10 minutes in nginx, which really makes life more pleasant.
>>>> --
>>>> ----------------------------
>>>> Kristian Fiskerstrand
>>>> Blog: https://blog.sumptuouscapital.com
>>>> Twitter: @krifisk
>>>> ----------------------------
>>>> Public OpenPGP keyblock at hkp://pool.sks-keyservers.net
>>>> fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3
>>>> ----------------------------
>>>> "Action is the foundational key to all success"
>>>> (Pablo Picasso)
>>>> _______________________________________________
>>>> Sks-devel mailing list
>>>> Sks-devel@nongnu.org
>>>> https://lists.nongnu.org/mailman/listinfo/sks-devel
>>
>> --
>> Security is an illusion - Datasecurity twice
>> Rolf Würdemann - ro...@digitalis.org - DL9ROW
>> GnuPG fingerprint: EEDC BEA9 EFEA 54A9 E1A9 2D54 69CC 9F31 6C64 206A
>> xmpp: ro...@digitalis.org E1189573 6B4A150C A0C2BF5A 5553F865 0B9CBF7A
>> ro...@jabber.ccc.de 64CBBB68 0A3514A4 026FC1E7 5328CE87 AEE2185F
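
For anyone who wants to re-check the traffic figures quoted in this thread (450G in three days, extrapolated to 30 days and compared with 15G), here is a minimal Python sketch. It assumes simple linear extrapolation, that the 15G figure is a monthly total from about twelve months earlier, and that 1T = 1000G. The variable and function names are made up for illustration and do not come from SKS.

```python
# Figures quoted in the thread. Treating 15G as a monthly total and
# using 1 T = 1000 G are assumptions made for this sketch.
observed_gb = 450            # traffic seen during the observation window
observed_days = 3            # length of that window in days
baseline_gb_per_month = 15   # assumed monthly traffic twelve months earlier


def extrapolate(total_gb, days, target_days=30):
    """Linearly scale traffic seen over `days` to `target_days`."""
    return total_gb / days * target_days


monthly_gb = extrapolate(observed_gb, observed_days)
growth_factor = monthly_gb / baseline_gb_per_month

print(f"per day:      {observed_gb / observed_days:.0f} GB")
print(f"per 30 days:  {monthly_gb / 1000:.1f} TB")
print(f"growth:       {growth_factor:.0f}x")
```

Under those assumptions the numbers in the thread are self-consistent: 150G/day, 4.5T per 30 days, and a factor of 300 over the baseline.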