Ingo Schwarze(schwa...@usta.de) on 2021.11.05 14:37:15 +0100:
> Hi Theo,
> 
> Theo de Raadt wrote on Thu, Nov 04, 2021 at 08:27:47AM -0600:
> > prx <p...@si3t.ch> wrote:
> >> On 2021/11/04 14:21, prx wrote:
> 
> >>> The attached patch adds support for static gzip compression.
> >>> 
> >>> In other words, if a client supports gzip compression, when "file" is
> >>> requested, httpd will check if "file.gz" is available to serve.
> 
> >> This diff doesn't compress "on the fly".
> >> It's up to the webmaster to compress files **before** serving them.
> 
> > Does any other program work this way?
> 
> Yes.  The man(1) program does.  At least on the vast majority of
> Linux systems (at least those using the man-db implementation
> of man(1)), on FreeBSD, and on DragonFly BSD.
> 
> Those systems store most manual pages as gzipped man(7) and mdoc(7)
> files, and man(1) decompresses them every time a user wants to look
> at one of them.  You say "man ls", and what you get is actually
> /usr/share/man/man1/ls.1.gz or something like that.
> 
> For man(1), that is next to useless because du -sh /usr/share/man =
> 42.6M uncompressed.  But it has repeatedly caused bugs in the past.
> I would love to remove the feature from mandoc, but even though it is
> rarely used in OpenBSD (some ports installed gzipped manuals in the
> past, but i think the ports tree has been clean now for quite some
> time; you might still need the feature when installing software
> or unpacking foreign manual page packages without using ports)
> it would be a bad idea to remove it because it is too widely used
> elsewhere.  Note that even the old BSD man(1) supported it.
> 
> > Where you request one filename, and it gives you another?
> 
> You did not ask what web servers do, but we are discussing a patch to
> a webserver.  For this reason, let me note in passing that even some
> otherwise extremely useful sites get it very wrong the other way round:
> 
>  $ ftp https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
> Trying 130.89.148.77...
> Requesting https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz
> 100% |**************************************************|  8050       00:00   
>  
> 8050 bytes received in 0.00 seconds (11.74 MB/s)
>  $ file ls.1.en.gz
> ls.1.en.gz: troff or preprocessor input text
>  $ grep -A 1 '^.SH NAME' ls.1.en.gz  
> .SH NAME
> ls \- list directory contents
>  $ gunzip ls.1.en.gz                                            
> gunzip: ls.1.en.gz: unrecognized file format

But with this patch, you are not asking the webserver for 
https://manpages.debian.org/bullseye/coreutils/ls.1.en.gz

You would be asking for

https://example.com/whatever/ls.1

with "Accept-Encoding: gzip" in the HTTP request headers, and the
webserver would then check whether it has a file

/whatever/ls.1.gz

(instead of the one without .gz) in its document tree and send you that,
with a "Content-Encoding: gzip" HTTP response header.
Because of that header, your client knows that the data is gzipped
and will decompress it before writing the file to the output.

I.e. there is no such problem (unless the patch has a bug).
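
The negotiation described above can be sketched roughly as follows. This
is not the code from the patch (httpd is written in C); it is a small
Python illustration, and the names accepts_gzip, select_file and
client_decode are made up for this example. It assumes a simplified
Accept-Encoding parser (no "*" or "x-gzip" handling) and does no path
sanitizing, which a real server must do.

```python
import gzip
import os
import tempfile


def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding value lists gzip with q > 0.

    Simplified on purpose: "*" and "x-gzip" are ignored.
    """
    for item in accept_encoding.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != "gzip":
            continue
        q = 1.0
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        return q > 0
    return False


def select_file(docroot, url_path, accept_encoding):
    """Server side: pick the file to send plus extra response headers.

    Hypothetical helper; no protection against "../" in url_path.
    """
    plain = os.path.join(docroot, url_path.lstrip("/"))
    gz = plain + ".gz"
    if accepts_gzip(accept_encoding) and os.path.isfile(gz):
        # The client asked for ls.1; it gets the bytes of ls.1.gz,
        # labelled so that it knows to decompress them.
        return gz, {"Content-Encoding": "gzip"}
    return plain, {}


def client_decode(body, headers):
    """Client side: undo the content coding announced by the server."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as docroot:
        os.makedirs(os.path.join(docroot, "whatever"))
        page = b".SH NAME\nls \\- list directory contents\n"
        with open(os.path.join(docroot, "whatever", "ls.1.gz"), "wb") as f:
            f.write(gzip.compress(page))

        path, headers = select_file(docroot, "/whatever/ls.1", "gzip")
        with open(path, "rb") as f:
            body = f.read()
        # The saved result is the plain manual page, not a gzip file.
        print(client_decode(body, headers) == page)
```

The point of the sketch is that the file name in the URL never changes:
the .gz file is only an on-disk detail of the server, and the
Content-Encoding header tells the client how to get the requested
content back.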

/Benno

> 
> > I have a difficult time understanding why gzip has to sneak its way
> > into everything.
> > 
> > I always prefer software that does precisely what I expect it to do.
> 
> Certainly.
> 
> I have no strong opinion whether a webserver qualifies as "everything",
> though, nor did i look at the diff.  While all manpages are small in the
> real world, some web servers may have to store huge amounts of data that
> clients might request, so disk space might occasionally be an issue on
> a web server even in 2021.  Also, some websites deliver huge amounts of
> data to the client even when the user merely asked for some text (not sure
> such sites would consider running OpenBSD httpd(8), but whatever :) - when
> browsing the web, bandwidth is still occasionally an issue even in 2021,
> even though that is a rather absurd fact.
> 
> Yours,
>   Ingo
> 
