Yes. I think we are good now and no longer need to reference those backups. Do you want me to clean those up and remount that? I would like to do OS updates and reboot it this evening when there is a good window to do so.

Any ideas if ASF Infra is going to require anything for Spectre/Meltdown patching?

Dave

On 01/16/2018 08:24 AM, Kevin A. McGrail wrote:
Thanks Dave. https://issues.apache.org/jira/browse/INFRA-15846 is open

Also, I think I captured a bunch of backups and stuff.  Is it ok to delete the backups in /usr/local/spamassassin?  I think that's substantially ALL the space.  Could we then move that mount to another drive and ask Infra to decomm that extra space for other usage?

On 1/15/2018 11:22 AM, Dave Jones wrote:
On 01/09/2018 07:13 AM, Kevin A. McGrail wrote:
Offlist reply by accident.  Repeating...

On 1/8/2018 9:52 AM, Dave Jones wrote:
On 01/08/2018 08:06 AM, Kevin A. McGrail wrote:
On 1/8/2018 8:01 AM, Dave Jones wrote:
I know the nightly rules promotion script hits the ruleqa site to make sure there have been 3 days of successful masschecks so if we did add a captcha, that would have to be excluded.

I have the web logs from the ruleqa site going to their own files now, with awstats set up to give a quick overview of what is going on.  I was planning on letting this ride for a bit to see what kind of activity the ruleqa site normally gets.  I don't think it gets hit much now that the bots are taken care of.

Also, I installed NRPE on the box and am monitoring it more closely via Icinga.  I will have graphs of memory usage and get alerts when the memory is being exhausted again. Hopefully if this happens again, I can get into the box before OOM killer starts whacking processes.
Thanks.  Out of interest, is the lack of swap what is completely killing the box?  I'm used to DDoSes, but I don't see why this one is taking the whole box down.

The lack of swap is not the direct problem, but it certainly is making it hard to troubleshoot the actual problem.  I really think it's odd for infra not to set up swap space.  I know they said it was bad for their SAN, and I suppose it would be if it were on SSDs and VMs were constantly into swap.

We normally monitor our VMs so that we get alerts when memory is getting near exhaustion and swap is being used so it doesn't impact our SAN.  At least this gives us some time to get into the box, see what is going on, and restart processes before the whole system is unresponsive.
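
For illustration only, here is a minimal sketch of the kind of memory check described above, written as a Nagios/NRPE-style plugin in Python.  This is not the check actually running on the box.  The function names, the 80%/90% thresholds, and the rule that any swap use raises a warning are all assumptions made for the example.  It also assumes a Linux kernel new enough to report MemAvailable in /proc/meminfo.

```python
import sys

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def check_memory(meminfo, warn_pct=80.0, crit_pct=90.0):
    """Return (exit_code, message); thresholds are hypothetical defaults."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    used_pct = 100.0 * (total - available) / total
    # Hosts without swap report SwapTotal and SwapFree as 0.
    swap_used = meminfo.get("SwapTotal", 0) - meminfo.get("SwapFree", 0)

    if used_pct >= crit_pct:
        status, label = CRITICAL, "CRITICAL"
    elif used_pct >= warn_pct or swap_used > 0:
        # Assumption: any swap use is worth an early warning.
        status, label = WARNING, "WARNING"
    else:
        status, label = OK, "OK"
    message = f"MEMORY {label} - {used_pct:.1f}% used, swap used {swap_used} kB"
    return status, message


def main():
    """Read the live /proc/meminfo, print one status line, return the exit code."""
    with open("/proc/meminfo") as fh:
        status, message = check_memory(parse_meminfo(fh.read()))
    print(message)
    return status
```

To run it as a real plugin, the script would end with sys.exit(main()) so the monitoring server sees the exit code.  The thresholds would need tuning per host.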

We build our VMs with swap space and it runs on our Compellent SAN with SSDs without a problem.  Not sure why infra doesn't.  You just need to make sure to mount everything with the "discard" option to play nice with most SANs' virtual block allocation and freeing.

My Icinga memory graphs are showing the used RAM hovering so far around 6 GB.  It dropped to around 3.0 GB last night when the hourly ruleqa updates weren't happening in cron and the masscheck was running.  We seem to be wasting a lot of RAM in that VM right now after getting Apache HTTPD under control by blocking the bad bots. We will have more informative graphs after more time has passed.
Yeah, I was confused about the swap space as well, but they give us more RAM easily.

Let me know in a week or so and we can ask them to lower the RAM if you want.

Regards,
KAM


See attached graph over the past week.  Now that the bots are blocked from the ruleqa.cgi, we could drop the RAM down to 8 GB and be pretty safe.  We are running around 6 GB max regularly.

This assumes we don't add anything new to the workload for sa-vm1.apache.org.  I personally don't think we should add anything new to this VM since it's pretty stable now.  If we need some new processing setup, then it should be on another VM.

Dave


