Package: wget
Version: 1.10.1-1
Severity: wishlist

When mirroring a site (or using -r in general), wget offers no way
to define regex excludes for particular subpages.

Example: if you want to mirror a site with dynamic content for
offline use and the site contains a browsable calendar, wget
will fetch every calendar page from -inf to +inf (or until the
server-side script stops or breaks).

The solution would be a regex pattern to exclude these pages.

(Note: this would be a refinement of the -X option)
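
To illustrate the kind of filter meant here, a rough sketch in Python
(not wget code). The function name should_follow, the pattern list and
the calendar URL layout are all made up for this example; the idea is
only that every candidate link is checked against a list of exclude
regexes before the recursion follows it.

```python
import re
from urllib.parse import urlsplit

# Hypothetical exclude list: the calendar URL layout is an assumption.
EXCLUDE_PATTERNS = [
    re.compile(r"/calendar\.php\?.*\bmonth="),
]


def should_follow(url, patterns=EXCLUDE_PATTERNS):
    """Return False if any exclude regex matches the URL's path and query."""
    parts = urlsplit(url)
    # Match against path plus query string, ignoring scheme and host.
    target = parts.path + ("?" + parts.query if parts.query else "")
    return not any(p.search(target) for p in patterns)


if __name__ == "__main__":
    for link in (
        "http://example.org/index.html",
        "http://example.org/calendar.php?month=2005-08",
    ):
        print(link, "follow" if should_follow(link) else "skip")
```

With such a check in the recursion, the endless chain of calendar
pages would be skipped while the rest of the site is still mirrored.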

Current workaround: install squid, define an acl url_regex for
each kind of unwanted content and http_access deny it ;)
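
Roughly, the squid side of the workaround looks like this. The ACL
name "unwanted" and the pattern are made up for illustration, and the
deny line has to come before any http_access allow rule that would
otherwise match:

```
# squid.conf fragment (sketch, names and pattern are assumptions)
acl unwanted url_regex -i /calendar\.php\?
http_access deny unwanted
```

wget then has to be pointed at the proxy, e.g. via the http_proxy
environment variable, so that the denied URLs fail instead of being
fetched.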


-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (900, 'testing'), (50, 'unstable')
Architecture: i386 (i586)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.11
Locale: LANG=C, LC_CTYPE=de_DE (charmap=ISO-8859-1)

Versions of packages wget depends on:
ii  libc6                       2.3.2.ds1-22 GNU C Library: Shared libraries an
ii  libssl0.9.7                 0.9.7e-3     SSL shared libraries

wget recommends no packages.

-- no debconf information

