> 
> Dear Maintainer(s),
> 
> the handling of Debian packages could be improved with some configuration 
> options
> taken from the squid-deb-proxy package, see:
> https://sources.debian.org/src/squid-deb-proxy/0.8.14+nmu1/squid-deb-proxy.conf
> 
> This has been found out by Mike Gabriel; for details see
> the discussion here:
> https://bugs.debian.org/913886
> 
> If squid would improve the handling of Debian packages all users already using
> squid could avoid installing squid-deb-proxy in addition.
> 

Hi,

Thank you for your interest in improving Squid in Debian.

This is a topic I look into every so often to figure out how far Squid
alone can go, and to do so without breaking people using
squid-deb-proxy. The last time those checks were done was about July
2018 during the squid-4 packaging preparations.

A brief aside on the topic of removing/replacing squid-deb-proxy:

squid-deb-proxy provides a relatively large amount of extra integration
setup: Avahi daemon(s) that report where to find the Squid proxy, plus
apt configuration to enable auto-proxy detection and use the Avahi
service. There may be more subtle things as well. A key factor is that
much of this integration occurs on the client machines, which have no
proxy installed.

Merging those integration actions would add a conflict between squid and
squid-deb-proxy, and would require the pkg-squid team to produce
additional binary packages for the client-machine integration parts.
That role IMHO is already played well by squid-deb-proxy despite its
maintainer not being in the pkg-squid team. So I see little gain and
much pain in attempting to replace squid-deb-proxy.


Anyhow, back to the proposed config settings:

> So please consider to adjust d/debian.conf like proposed by Mike.
> 
> diff --git a/debian/debian.conf b/debian/debian.conf
> index 7ac16c97..7fa82301 100644
> --- a/debian/debian.conf
> +++ b/debian/debian.conf
> @@ -9,3 +9,29 @@ logfile_rotate 0
>  # localhost to use the proxy on new installs
>  #
>  #http_access allow localnet
> +
> +# Begin: Improve handling of Debian packages (taken from squid-deb-proxy)
> +# we need a big cache, some debs are huge
> +maximum_object_size 512 MB
> +
> +# tweaks to speed things up
> +cache_mem 200 MB

The first surprise is the line above.

Squid packages as far back as Lenny have had a default cache_mem setting
of 256 MB. So this is actually a decrease in the capacity of the fastest
accessible cache type (RAM).

> +maximum_object_size_in_memory 10240 KB
> +

This raises the limit from 512 KB to 10 MB, which conflicts a little
with the earlier setting: larger objects would be admitted to a memory
cache that the previous line has just made smaller.

Also, Squid understands units up to TB. So: s/10240 KB/10 MB/ would be
clearer and still work.


> +# refresh pattern for debs and udebs
> +refresh_pattern deb$   129600 100% 129600
> +refresh_pattern udeb$   129600 100% 129600
> +refresh_pattern tar.gz$  129600 100% 129600
> +refresh_pattern tar.xz$  129600 100% 129600
> +refresh_pattern tar.bz2$  129600 100% 129600
> +

Squid is much more often a gateway to the generic web or internal LAN
environments - or at least dual-purpose with the apt package caching.
These tarball settings specifically would affect content well beyond
Debian repositories.

That is more of a problem than a benefit when one considers what (2)
below means about Debian repos not using these pattern settings.

So IMHO, the above are not appropriate for widespread default inclusion.
It is better to have the admin explicitly opt in to such rules, for
example by installing squid-deb-proxy or a similar package which drops
in a Squid config snippet.
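
To illustrate how far the proposed patterns reach, here is a minimal
sketch that tests them against two URLs. The URLs are invented for the
example, and Python's re module only approximates the regex matching
Squid performs, so treat it as an illustration rather than a test of
Squid itself.

```python
import re

# Patterns copied from the proposed debian.conf lines quoted above.
PATTERNS = ["deb$", "udeb$", "tar.gz$", "tar.xz$", "tar.bz2$"]

# Hypothetical URLs, invented for illustration only.
URLS = [
    "http://deb.debian.org/debian/pool/main/h/hello/hello_2.10-2_amd64.deb",
    "https://example.org/releases/project-1.0.tar.gz",
]


def matching_patterns(url):
    """Return the proposed patterns that match the given URL."""
    return [p for p in PATTERNS if re.search(p, url)]


for url in URLS:
    print(url, "->", matching_patterns(url))
```

The second URL has nothing to do with a Debian repository, yet the
tar.gz$ pattern matches it, because nothing in the pattern ties it to a
repository host or path.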


> +# always refresh Packages and Release files
> +refresh_pattern \/(Packages|Sources)(|\.bz2|\.gz|\.xz)$ 0 0% 0 refresh-ims
> +refresh_pattern \/Release(|\.gpg)$ 0 0% 0 refresh-ims
> +refresh_pattern \/InRelease$ 0 0% 0 refresh-ims
> +refresh_pattern \/(Translation-.*)(|\.bz2|\.gz|\.xz)$ 0 0% 0 refresh-ims
> +


The parameters refresh_pattern supplies are used as defaults for the
caching heuristic algorithm(s). The values on all these lines are *only*
obeyed when the origin server omits a parameter needed by the caching
freshness algorithm from its HTTP response message.
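
As a rough illustration of that fallback behaviour, here is a simplified
sketch of the freshness decision. It is not Squid's code: the function
name and parameters are made up for this example, refresh_pattern
options such as refresh-ims are not modelled, and the order of checks is
an assumption based on the description in the refresh_pattern
documentation.

```python
def is_fresh(age, min_minutes, percent, max_minutes,
             explicit_lifetime=None, lm_interval=None):
    """Simplified freshness check (hypothetical helper, not Squid code).

    age               seconds since the origin generated the response
    min_minutes       refresh_pattern 'min' field (minutes)
    percent           refresh_pattern 'percent' field (e.g. 100 for 100%)
    max_minutes       refresh_pattern 'max' field (minutes)
    explicit_lifetime seconds of freshness declared by the origin
                      (Expires or max-age), or None if it sent neither
    lm_interval       seconds between Last-Modified and Date, or None
    """
    if explicit_lifetime is not None:
        # The origin supplied its own expiry information, so the
        # refresh_pattern values are never consulted.
        return age < explicit_lifetime

    # From here on the refresh_pattern values act as defaults.
    if age > max_minutes * 60:
        return False
    if lm_interval is not None and lm_interval > 0:
        # Last-Modified heuristic: fresh while age stays below the
        # configured percentage of the object's age at fetch time.
        return age < lm_interval * (percent / 100.0)
    return age < min_minutes * 60


# The "0 0% 0" values from the Packages/Release rules, for an object
# that is 60 seconds old.
print(is_fresh(60, 0, 0, 0))                          # origin sent no expiry
print(is_fresh(60, 0, 0, 0, explicit_lifetime=3600))  # origin declared 1 hour
```

With no expiry information from the origin, the "0 0% 0" values make the
object stale immediately. Once the origin declares a lifetime, the same
values have no effect on the result.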

In particular that means that when evaluating the utility of these rules
it is necessary to first analyze the origin service responses for each
of the relevant APT request messages to determine what caching
parameters it is supplying and which it is omitting.
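
One way to do that analysis is to fetch the response headers for each
file type and list which freshness-related fields are present. The
sketch below is only an assumption about how one might script it: the
helper names are hypothetical, and the set of headers checked is a
minimal selection rather than everything Squid looks at.

```python
import urllib.request


def fetch_headers(url):
    """Send a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())


def cache_params(headers):
    """Report which freshness-related fields a response carries."""
    names = {key.lower(): value for key, value in headers.items()}
    cache_control = names.get("cache-control", "").lower()
    directives = [d.strip() for d in cache_control.split(",") if d.strip()]
    return {
        "max-age": any(d.startswith(("max-age=", "s-maxage="))
                       for d in directives),
        "expires": "expires" in names,
        "last-modified": "last-modified" in names,
        "date": "date" in names,
    }


def has_explicit_lifetime(headers):
    """True when the origin declared its own expiry information."""
    params = cache_params(headers)
    return params["max-age"] or params["expires"]
```

Calling cache_params(fetch_headers(url)) for a Packages, Release and
.deb URL on each mirror of interest shows where refresh_pattern values
could come into play at all: only responses for which
has_explicit_lifetime() is false leave room for them.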

When I did that analysis for the official Debian repositories, those
servers *did* consistently provide the necessary cache parameters to
Squid. So for at least those repo servers these refresh_pattern rules
would do nothing beyond the 'refresh-ims' change in behaviour.


On the other hand, my local reprepro installation does not always
provide those details, particularly for the Packages requests. But it
omits them in a way that leaves these config lines insufficient to
prevent an immediate re-fetch of the full content.

Thus I am a bit surprised that Mike saw a useful improvement from this
config. My expectation is that these rules make Squid spend more
bandwidth re-fetching objects it does not strictly have to: a net
negative on bandwidth savings compared with a default proxy config.



Full disclosure:
 I am the up-upstream Squid maintainer. Issues such as this which are
specific to Debian and derivatives I leave to Luigi our DM to make the
call on.

That said, my call would be to leave these settings, or any updated
version of them, under the control of squid-deb-proxy so admins can opt
in by using that package on their proxy machines.


Cheers
Amos
