Bug#988191: ionos-1 proxy causes lots of temporary build failures by being slow to respond

2021-05-07 Thread Holger Levsen
On Fri, May 07, 2021 at 01:50:04PM +0200, Helmut Grohne wrote:
> I'm unsure what you mean here precisely. Do you mean adding a separate
> squid on ionos9 used by only ionos9 

yes, that.

> If the proxy is only used locally, adding another cpu is unnecessary, as
> we are not cpu bound at the times when the proxy is being used. Adding a
> bit of ram for that case sounds sensible, though 2gb should already
> suffice.
 
ok, cool.

> Possibly adding a bit of disk space for the squid spool (20G?) would
> be necessary to avoid excessive refetching.

ok.


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

In Europe there are people prosecuted by courts because they saved other people
from drowning in the  Mediterranean Sea.  That is almost as absurd  as if there
were people being prosecuted because they save humans from drowning in the sea.





2021-05-07 Thread Helmut Grohne
Hi Holger,

On Fri, May 07, 2021 at 11:32:21AM +0000, Holger Levsen wrote:
> as a simple fix: we could install squid on your node (ionos9?) and add another
> cpu for it? (and 4gb ram or such)

I'm unsure what you mean here precisely. Do you mean adding a separate
squid on ionos9 used by only ionos9 or do you mean moving the squid from
ionos1 to ionos9 and moving all of its users over to ionos9?

Both seem sensible to me, but the resource adjustments are slightly
different in each case.

If the proxy is only used locally, adding another cpu is unnecessary, as
we are not cpu bound at the times when the proxy is being used. Adding a
bit of ram for that case sounds sensible, though 2gb should already
suffice.

If the other nodes from the same datacenter are supposed to use
the moved proxy, your adjustments seem sensible to me. Note that ionos9
intentionally swaps a lot. The swapping is a lot less bursty and an order
of magnitude lower than that observed on ionos1, but we may have to apply
resource limits to keep the proxy responsive.

Possibly adding a bit of disk space for the squid spool (20G?) would
be necessary to avoid excessive refetching.
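To make the sizes concrete, the spool and memory figures could be turned into squid directives roughly as follows. This is only a sketch under assumptions: the helper name, the 256 MB cache_mem value, port 3128 and the /var/spool/squid path are illustrative and not taken from the existing setup, and the three directives are not a complete squid.conf.

```python
def render_squid_cache_config(spool_gib=20, cache_mem_mib=256,
                              spool_dir="/var/spool/squid", port=3128):
    """Render a few squid.conf directives for a small caching proxy.

    All defaults are assumptions for illustration; only the 20G spool
    size comes from the discussion.
    """
    # cache_dir expects the on-disk cache size in megabytes.
    spool_mib = spool_gib * 1024
    lines = [
        f"http_port {port}",
        f"cache_mem {cache_mem_mib} MB",
        # "16 256" are the first- and second-level directory counts.
        f"cache_dir ufs {spool_dir} {spool_mib} 16 256",
    ]
    return "\n".join(lines) + "\n"
```

Calling print(render_squid_cache_config()) gives a fragment that would still need access rules and an object size limit suited to Debian packages before being used.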

Helmut




2021-05-07 Thread Holger Levsen
On Fri, May 07, 2021 at 12:52:52PM +0200, Helmut Grohne wrote:
> The proxy running on ionos-1 is frequently overloaded and slow to
> respond. This frequently results in apt or mmdebstrap failing to
> download packages and makes a lot of jobs fail. For rebootstrap it seems
> to kill roughly 1/3 of all jobs. As such the situation is a significant
> waste of resources.

as a simple fix: we could install squid on your node (ionos9?) and add another
cpu for it? (and 4gb ram or such)


-- 
cheers,
Holger

 ⢀⣴⠾⠻⢶⣦⠀
 ⣾⠁⢠⠒⠀⣿⡁  holger@(debian|reproducible-builds|layer-acht).org
 ⢿⡄⠘⠷⠚⠋⠀  OpenPGP: B8BF54137B09D35CF026FE9D 091AB856069AAA1C
 ⠈⠳⣄

Stop saying that we are all in the same boat.
We’re all in the same storm.
But we’re not all in the same boat.





2021-05-07 Thread Helmut Grohne
Package: jenkins.debian.org
Severity: important

The proxy running on ionos-1 is frequently overloaded and slow to
respond. This frequently results in apt or mmdebstrap failing to
download packages and makes a lot of jobs fail. For rebootstrap it seems
to kill roughly 1/3 of all jobs. As such the situation is a significant
waste of resources.

Looking into munin (thanks for providing that!), there seems to be a
good correlation between significant iowait
https://jenkins.debian.net/munin/debian.net/ionos1-amd64.debian.net/cpu.html
and the temporary failures. The system has 40G swap in use at the moment
and significant swap traffic. It seems unsurprising that a squid running
on such a system would not perform well and time out requests
occasionally.
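
The iowait share could also be sampled directly on the host and logged next to the job failures, rather than read off the munin graph. A minimal sketch, assuming a Linux host with a readable /proc/stat; the function names are made up for illustration and nothing like this exists in the current setup:

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters from the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(value) for value in fields[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def iowait_fraction(before_text, after_text):
    """Share of CPU time spent in iowait between two /proc/stat snapshots."""
    before = parse_cpu_line(before_text)
    after = parse_cpu_line(after_text)
    deltas = [b - a for a, b in zip(before, after)]
    # Fields: user nice system idle iowait irq softirq steal guest guest_nice.
    # guest time is already counted in user/nice, so sum only the first eight.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return deltas[4] / total


def sample_iowait(interval_seconds=5.0):
    """Take two snapshots of /proc/stat and return the iowait share."""
    with open("/proc/stat") as handle:
        before = handle.read()
    time.sleep(interval_seconds)
    with open("/proc/stat") as handle:
        after = handle.read()
    return iowait_fraction(before, after)
```

Running sample_iowait() periodically and recording the result with a timestamp would allow matching high-iowait periods against the times of failed downloads.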

A long term solution would be separating the proxy component from the
component that causes heavy swapping (whatever that is). The simplest
solution would be dedicating a separate small vm at the same datacenter
to the proxying. I suppose that 1 vcpu, 1G ram and like 30G of disk
space would be the minimum to support a proxy-only vm.

Do you agree with the proposed solution?

Helmut