Wget 1.8 is released. It should appear on ftp.gnu.org soon; until it
does, you can get it from:
ftp://ftp.gnjilux.hr/pub/unix/util/wget/wget-1.8.tar.gz
MD5 checksum of the archive is:
000caf43722b46df1f58b6fb2deb5b58
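One way to check a downloaded archive against the checksum above is sketched below in Python. The helper names `md5_hex' and `archive_matches' are hypothetical, and the path passed to `archive_matches' is whatever you saved the file as.

```python
import hashlib

# Checksum published in this announcement.
EXPECTED_MD5 = "000caf43722b46df1f58b6fb2deb5b58"


def md5_hex(stream, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def archive_matches(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 checksum."""
    with open(path, "rb") as archive:
        return md5_hex(archive) == expected
```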
Please send bug reports to <[EMAIL PROTECTED]>.
I will announce the release on Freshmeat and gnu.announce when the
distribution shows up on ftp.gnu.org and its mirrors.
* Changes in Wget 1.8.
** A new progress indicator is now available and used by default.
You can choose the progress bar type with `--progress=TYPE'. Two
types are available, "bar" (the new default), and "dot" (the old
dotted indicator). You can permanently revert to the old progress
indicator by putting `progress = dot' in your `.wgetrc'.
** You can limit the download rate of the retrieval using the
`--limit-rate' option. For example, `wget --limit-rate=15k URL' will
tell Wget not to download the body of the URL faster than 15 kilobytes
per second.
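The idea behind a rate limit can be sketched in Python as follows. This is an illustration only, not Wget's implementation: the function `throttle_delay' and its parameters are hypothetical, and "15k" is assumed to mean 15 * 1024 bytes per second. It computes how long a downloader would need to pause so that the average rate stays at or below the limit.

```python
def throttle_delay(bytes_received, elapsed_seconds, limit_bytes_per_sec):
    """Return how many seconds to sleep so the average rate stays under the limit."""
    # Time the transfer should have taken if it ran exactly at the limit.
    expected_seconds = bytes_received / limit_bytes_per_sec
    # Sleep only if the data arrived faster than the limit allows.
    return max(0.0, expected_seconds - elapsed_seconds)


# 30 KB received in 1 second against a 15 KB/s limit: the transfer is
# 1 second ahead of schedule, so the caller should pause for that long.
delay = throttle_delay(30 * 1024, 1.0, 15 * 1024)  # 1.0
```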
** Recursive retrieval and link conversion have been revamped:
*** Wget now traverses links breadth-first. This makes the
calculation of depth much more reliable than before. Also, recursive
downloads are faster and consume *significantly* less memory than
before.
*** Links are converted only when the entire retrieval is complete.
This is the only safe thing to do, as only then is it known what URLs
have been downloaded.
*** BASE tags are handled correctly when converting links. Since Wget
already resolves <base href="..."> when resolving URLs, link
conversion now makes the BASE tags point to an empty string.
*** HTML anchors are now handled correctly. Links to an anchor in the
same document (<a href="#anchorname">), which used to confuse Wget,
are now converted correctly.
*** When in page-requisites (-p) mode, no-parent (-np) is ignored when
retrieving inline images, stylesheets, and other documents needed
to display the page.
*** Page-requisites (-p) mode now works with frames. In other words,
`wget -p URL-THAT-USES-FRAMES' will now download the frame HTML files,
and all the files that they need to be displayed properly.
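To illustrate why the breadth-first traversal described above makes depth calculation reliable, here is a minimal Python sketch over an in-memory link graph. The `links' dictionary and the `crawl_depths' function are hypothetical stand-ins for fetching and parsing pages; this is not Wget's code. Because pages are visited in order of distance from the start, the first time a page is seen is always via a shortest link path.

```python
from collections import deque


def crawl_depths(links, start, max_depth):
    """Visit pages breadth-first and record the depth at which each is found."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depths[page] == max_depth:
            continue  # do not follow links beyond the depth limit
        for target in links.get(page, []):
            if target not in depths:  # first visit is via a shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: "c" is linked from the index and also, deeper, from "b".
links = {"index": ["a", "c"], "a": ["b"], "b": ["c"]}
depths = crawl_depths(links, "index", max_depth=2)
# "c" is recorded at depth 1, not at depth 3 via index -> a -> b -> c.
```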
** `--base' now works in conjunction with `--input-file', providing a
base for each URL and thereby allowing the URLs in the file to be
relative.
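The effect of a base URL on an input file can be sketched with Python's standard `urljoin'. The base and the file contents below are made up, `resolve_input_urls' is a hypothetical helper, and Wget's own resolution may differ in detail.

```python
from urllib.parse import urljoin


def resolve_input_urls(base, lines):
    """Resolve each non-empty line of an input file against the base URL."""
    return [urljoin(base, line.strip()) for line in lines if line.strip()]


# Relative entries are resolved against the base; absolute URLs are kept.
urls = resolve_input_urls(
    "http://example.com/docs/",
    ["intro.html", "../images/logo.png", "http://other.example/abs.html"],
)
```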
** If a host has more than one IP address, Wget uses the other
addresses when accessing the first one fails.
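The fallback behaviour can be pictured with the following Python sketch. The function `connect_first_available' and the `connect' callback are hypothetical; in a real program the address list might come from `socket.getaddrinfo'. It is not Wget's code.

```python
def connect_first_available(addresses, connect):
    """Try each address in order and return the first successful connection."""
    last_error = None
    for address in addresses:
        try:
            return connect(address)
        except OSError as error:
            last_error = error  # remember the failure and try the next address
    if last_error is None:
        raise OSError("no addresses to try")
    raise last_error


# Hypothetical connector that fails for the first address only.
def fake_connect(address):
    if address == "192.0.2.1":
        raise OSError("connection refused")
    return "connected to " + address


result = connect_first_available(["192.0.2.1", "192.0.2.2"], fake_connect)
```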
** Host directories now contain port information if the URL is at a
non-standard port.
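A sketch of how such a directory name could be derived, in Python. The `host:port' form, the default-port table, and the function name `host_directory' are assumptions made for illustration; the exact naming Wget uses is not spelled out here.

```python
from urllib.parse import urlsplit

# Standard ports for the schemes considered in this sketch.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def host_directory(url):
    """Return a directory name for the URL's host, adding a non-standard port."""
    parts = urlsplit(url)
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(parts.scheme):
        return parts.hostname
    return f"{parts.hostname}:{port}"
```

With this sketch, `http://example.com:8080/index.html' maps to `example.com:8080', while `http://example.com/index.html' maps to plain `example.com'.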
** Wget now supports the robots.txt directives specified in
<http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html>.
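For readers unfamiliar with robots.txt, the following Python sketch shows the kind of check a well-behaved client performs. It uses Python's standard `urllib.robotparser' rather than Wget's own parser, and the rules and URLs are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents.
ROBOTS_TXT = [
    "User-agent: *",
    "Disallow: /private/",
]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT)

# A client identifying itself as "Wget" may fetch the index page,
# but not anything under /private/.
allowed_index = parser.can_fetch("Wget", "http://example.com/index.html")
allowed_private = parser.can_fetch("Wget", "http://example.com/private/data.html")
```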
** The URL parser has been fixed, especially the infamous overzealous
quoting. Wget no longer dequotes reserved characters, e.g. `%3F' is
no longer translated to `?', nor `%2B' to `+'. Unsafe characters
which are not reserved are still escaped, of course.
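The rule about reserved characters can be illustrated with a small Python sketch. The function `decode_unreserved' is hypothetical, and the set of characters treated as safe to decode is the "unreserved" set of RFC 3986, which postdates this release; Wget's own tables may differ.

```python
import re

# Characters with no special meaning in a URL (RFC 3986 "unreserved" set;
# an assumption for this sketch).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def decode_unreserved(url):
    """Decode %XX escapes only when they stand for an unreserved character."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else match.group(0)

    return re.sub(r"%([0-9A-Fa-f]{2})", replace, url)


# %2B ("+") and %3F ("?") are reserved and stay encoded;
# %7E ("~") and %41 ("A") are decoded.
decoded = decode_unreserved("http://example.com/a%2Bb%3Fc%7Euser%41")
```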
** No more than 20 successive redirections are allowed.
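A redirection limit of this kind protects against redirect loops. A minimal Python sketch, in which the `redirects' mapping stands in for real HTTP responses and `follow_redirects' is a hypothetical name:

```python
MAX_REDIRECTS = 20


def follow_redirects(url, redirects, max_redirects=MAX_REDIRECTS):
    """Follow a redirect mapping to the final URL, failing past the limit."""
    hops = 0
    while url in redirects:
        if hops == max_redirects:
            raise RuntimeError(f"more than {max_redirects} redirections")
        url = redirects[url]
        hops += 1
    return url


# A chain of exactly 20 redirections is still followed to the end.
chain = {f"/r{i}": f"/r{i + 1}" for i in range(20)}
final = follow_redirects("/r0", chain)  # "/r20"
```

A two-page loop such as `{"/a": "/b", "/b": "/a"}' raises the error after 20 hops instead of running forever.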