[Bug-wget] Gzip Content-Encoding Patches

2017-07-31 Thread Tim Schlueter
Hi,

Please see the attached patches which add automatic gzip decompression
for HTTP files with the Content-Encoding response header set correctly.

They also adjust a downloaded file's extension for the br, compress, and
deflate Content-Encodings.
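
As a rough illustration of the renaming described above (this is not code
taken from the attached patches), a mapping from a Content-Encoding token to
a file suffix might look like the sketch below. The names `encoding_suffix`
and `equals_ignore_case` are hypothetical; the suffixes are the ones listed in
the doc/wget.texi change, and the `x-gzip` and `x-compress` aliases are
assumed to map the same way as in the header parsing in patch 1/3.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: ASCII case-insensitive string equality. */
static int
equals_ignore_case (const char *a, const char *b)
{
  while (*a && *b)
    {
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return 0;
      a++;
      b++;
    }
  /* Equal only if both strings ended at the same position. */
  return *a == *b;
}

/* Hypothetical helper: return the suffix for a Content-Encoding token,
   or NULL when no suffix applies (identity or an unknown encoding). */
const char *
encoding_suffix (const char *encoding)
{
  static const struct { const char *token; const char *suffix; } map[] = {
    { "br", ".br" },
    { "compress", ".Z" },
    { "x-compress", ".Z" },
    { "deflate", ".zlib" },
    { "gzip", ".gz" },
    { "x-gzip", ".gz" },
  };
  size_t i;

  for (i = 0; i < sizeof map / sizeof map[0]; i++)
    if (equals_ignore_case (encoding, map[i].token))
      return map[i].suffix;
  return NULL;
}
```

A caller would append the returned suffix only when it is non-NULL and the
local file name does not already end in it.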

Since the first patch set:
* doc/wget.texi has been updated to reflect the changes in the patches.
* Commit messages have been changed to be in the GNU change log style.
* The patches are attached to this email instead of being in the body.

I have not yet had a chance to look at what would be involved in adding
automated tests for this patch set.

Regards,
Tim
From cbdd976dea6289a1f167c2b50cc1d4b1ff878686 Mon Sep 17 00:00:00 2001
From: Tim Schlueter 
Date: Mon, 24 Jul 2017 20:39:24 -0700
Subject: [PATCH 1/3] Adjust Extension based on Content-Encoding

* doc/wget.texi (--adjust-extension, adjust_extension): Updated documentation.
* src/http.c (encoding_t): New enum.
(struct http_stat): Add local_encoding field.
(gethttp): --adjust-extension based on Content-Encoding.
---
 doc/wget.texi | 10 +--
 src/http.c| 90 +++
 2 files changed, 98 insertions(+), 2 deletions(-)

diff --git a/doc/wget.texi b/doc/wget.texi
index 6453c35..e582d4f 100644
--- a/doc/wget.texi
+++ b/doc/wget.texi
@@ -1346,6 +1346,11 @@ renamed from @samp{--html-extension}, to better reflect its new
 behavior. The old option name is still acceptable, but should now be
 considered deprecated.
 
+As of version 1.20, Wget will also ensure that any downloaded files with
+a @code{Content-Encoding} of @samp{br}, @samp{compress}, @samp{deflate}
+or @samp{gzip} end in the suffix @samp{.br}, @samp{.Z}, @samp{.zlib}
+and @samp{.gz} respectively.
+
 At some point in the future, this option may well be expanded to
 include suffixes for other types of content, including content types
 that are not parsed by Wget.
@@ -3365,8 +3370,9 @@ Define a header for HTTP downloads, like using
 
 @item adjust_extension = on/off
 Add a @samp{.html} extension to @samp{text/html} or
-@samp{application/xhtml+xml} files that lack one, or a @samp{.css}
-extension to @samp{text/css} files that lack one, like
+@samp{application/xhtml+xml} files that lack one, a @samp{.css}
+extension to @samp{text/css} files that lack one, and a @samp{.br},
+@samp{.Z}, @samp{.zlib} or @samp{.gz} to compressed files like
 @samp{-E}. Previously named @samp{html_extension} (still acceptable,
 but deprecated).
 
diff --git a/src/http.c b/src/http.c
index f5d9caf..a8c6e18 100644
--- a/src/http.c
+++ b/src/http.c
@@ -1539,6 +1539,16 @@ persistent_available_p (const char *host, int port, bool ssl,
   fd = -1;  \
 } while (0)
 
+typedef enum
+{
+  ENC_INVALID = -1,             /* invalid encoding */
+  ENC_NONE = 0,                 /* no special encoding */
+  ENC_GZIP,                     /* gzip compression */
+  ENC_DEFLATE,                  /* deflate compression */
+  ENC_COMPRESS,                 /* compress compression */
+  ENC_BROTLI                    /* brotli compression */
+} encoding_t;
+
 struct http_stat
 {
   wgint len;/* received length */
@@ -1569,6 +1579,9 @@ struct http_stat
 #ifdef HAVE_METALINK
   metalink_t *metalink;
 #endif
+
+  encoding_t local_encoding;    /* the encoding of the local file */
+
   bool temporary;   /* downloading a temporary file */
 };
 
@@ -3189,6 +3202,7 @@ gethttp (const struct url *u, struct url *original_url, struct http_stat *hs,
   xfree (hs->remote_time);
   hs->error = NULL;
   hs->message = NULL;
+  hs->local_encoding = ENC_NONE;
 
   conn = u;
 
@@ -3639,6 +3653,49 @@ gethttp (const struct url *u, struct url *original_url, struct http_stat *hs,
 }
 }
 
+  if (resp_header_copy (resp, "Content-Encoding", hdrval, sizeof (hdrval)))
+    {
+      hs->local_encoding = ENC_INVALID;
+
+      switch (hdrval[0])
+        {
+        case 'b': case 'B':
+          if (0 == c_strcasecmp(hdrval, "br"))
+            hs->local_encoding = ENC_BROTLI;
+          break;
+        case 'c': case 'C':
+          if (0 == c_strcasecmp(hdrval, "compress"))
+            hs->local_encoding = ENC_COMPRESS;
+          break;
+        case 'd': case 'D':
+          if (0 == c_strcasecmp(hdrval, "deflate"))
+            hs->local_encoding = ENC_DEFLATE;
+          break;
+        case 'g': case 'G':
+          if (0 == c_strcasecmp(hdrval, "gzip"))
+            hs->local_encoding = ENC_GZIP;
+          break;
+        case 'i': case 'I':
+          if (0 == c_strcasecmp(hdrval, "identity"))
+            hs->local_encoding = ENC_NONE;
+          break;
+        case 'x': case 'X':
+          if (0 == c_strcasecmp(hdrval, "x-compress"))
+            hs->local_encoding = ENC_COMPRESS;
+          else if (0 == c_strcasecmp(hdrval, "x-gzip"))
+            hs->local_encoding = ENC_GZIP;
+          break;
+        case '\0':
+          hs->local_encoding = ENC_NONE;
+        }
+

[Bug-wget] [bug #51626] Configure scripts didn't check for availability of gperf, leading to make failure

2017-07-31 Thread Nutchanon Wetchasit
Follow-up Comment #2, bug #51626 (project wget):

Thanks for the quick response. The bootstrap script of the current mainline
version now correctly aborts when `gperf` is not installed (and still proceeds
normally when `gperf` is installed).

Though, what caught my eye in this fix is that the Gperf name is misspelled in
the README.checkout file, and the indentation of its description paragraph is
inconsistent (it is supposed to be all spaces, rather than tabs); it would be
nice if these were fixed as well.

Wget: v1.19.1-76-g951d3e4 (git)
Gnulib: v0.1-1154-gf497bc1 (git)
Autoconf: 2.69-1 (debian)
Gperf: 3.0.3-1+b1 (debian)
System: Debian GNU/Linux 7.0 Wheezy i386


___

Reply to this item at:

  

___
  Message sent via/by Savannah
  http://savannah.gnu.org/




[Bug-wget] [bug #51155] Page requisite requests should use GET method irrespective of original request method

2017-07-31 Thread Darshit Shah
Update of bug #51155 (project wget):

                  Status: None => Wont Fix
             Open/Closed: Open => Closed






[Bug-wget] [bug #51626] Configure scripts didn't check for availability of gperf, leading to make failure

2017-07-31 Thread Nutchanon Wetchasit
URL:
  

 Summary: Configure scripts didn't check for availability of
gperf, leading to make failure
 Project: GNU Wget
Submitted by: nachanon
Submitted on: Mon 31 Jul 2017 07:41:57 PM ICT
Category: Build/Install
Severity: 3 - Normal
Priority: 5 - Normal
  Status: None
 Privacy: Public
 Assigned to: None
 Originator Name: 
Originator Email: 
 Open/Closed: Open
 Discussion Lock: Any
 Release: 1.19.1
Operating System: GNU/Linux
 Reproducibility: Every Time
   Fixed Release: None
 Planned Release: None
  Regression: None
   Work Required: None
  Patch Included: None

___

Details:

While I was compiling the latest release version of Wget (1.19.1) from a Git
checkout to verify a URL filtering issue, I ran into a build failure.

From a look at the output from Make, it turned out that Make tried to invoke
the `gperf` command, and failed because I didn't have gperf installed on my
system. The Gperf requirement is *not mentioned anywhere in the
documentation*: not in the README, README.checkout, or INSTALL file. Also,
*neither `bootstrap` nor `configure` complained about it* being missing.

Exact steps to reproduce:

* Uninstall `gperf` (if installed)
* Run `git clone https://git.savannah.gnu.org/git/wget.git wget.git`
* Go into `wget.git` directory
* Run `git checkout v1.19.1`
* Run `./bootstrap`
* Run `./configure --prefix=/opt/wget-1.19.1 --enable-assert`
* Run `make V=1`

You would see that Make ends with an error like the following:

gperf -m 10 ./unicase/special-casing-table.gperf >
./unicase/special-casing-table.h-t && \
mv ./unicase/special-casing-table.h-t ./unicase/special-casing-table.h
/bin/bash: gperf: command not found
make[2]: *** [unicase/special-casing-table.h] Error 127
make[2]: Leaving directory `/home/window/prog/wget.git/lib'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/window/prog/wget.git'
make: *** [all] Error 2


Bootstrap output, configure output, config.log, and full Make output are
attached as `wget1.19.1git-gperferror.zip` for reference.

If `gperf` is installed, the build succeeds. This problem affects only the
Git version of Wget; the tarball version builds fine.

Wget: 1.19.1 (git)
Gnulib: v0.1-1130-g916a632 (git)
Autoconf: 2.69-1 (debian)
Gperf: 3.0.3-1+b1 (debian)
System: Debian GNU/Linux 7.0 Wheezy i386



___

File Attachments:


---
Date: Mon 31 Jul 2017 07:41:57 PM ICT  Name: wget1.19.1git-gperferror.zip 
Size: 96KiB   By: nachanon
Bootstrap, configure, and build logs

