Above the code segment you submitted (line 765 of init.c) is the
comment:

/* Strip the trailing slashes from directories.  */

Here are the manual notes on this option:

(from "Recursive Accept/Reject Options")

`-I list'
`--include-directories=list'
    Specify a comma-separated list of directories you wish to follow when
downloading (See section Directory-Based Limits for more details.)
Elements of list may contain wildcards.

 --- and ---

(from "Directory-Based Limits")

`-I list'
`--include list'
`include_directories = list'
    `-I' option accepts a comma-separated list of directories included in
the retrieval. Any other directories will simply be ignored. The
directories are absolute paths. So, if you wish to download from
`http://host/people/bozo/' following only links to bozo's colleagues in
the `/people' directory and the bogus scripts in `/cgi-bin', you can
specify:

wget -I /people,/cgi-bin http://host/people/bozo/
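
(The following is not from the manual.) To make the list handling concrete, here is a minimal sketch of splitting a comma-separated directory list and stripping one trailing slash from each element, as the init.c comment describes. The function names split_directory_list and strip_trailing_slash are hypothetical and are not wget's own; only the "len > 1" and trailing '/' test is modelled on the quoted snippet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: drop one trailing '/' unless the string is
   just "/" (mirrors the "len > 1" guard in the quoted snippet). */
void strip_trailing_slash(char *dir)
{
    size_t len;

    assert(dir != NULL);
    len = strlen(dir);
    if (len > 1 && dir[len - 1] == '/')
        dir[len - 1] = '\0';
}

/* Hypothetical helper: split LIST in place at commas, store up to MAX
   element pointers in OUT, and return how many were stored.
   Empty elements are skipped. */
size_t split_directory_list(char *list, char **out, size_t max)
{
    size_t count = 0;
    char *p = list;

    assert(list != NULL && out != NULL);
    while (p != NULL && count < max) {
        char *comma = strchr(p, ',');

        if (comma != NULL)
            *comma = '\0';          /* terminate the current element */
        if (*p != '\0') {
            strip_trailing_slash(p);
            out[count++] = p;
        }
        p = (comma != NULL) ? comma + 1 : NULL;
    }
    return count;
}
```

With a writable buffer holding "/people/,/cgi-bin,/", this yields the three elements "/people", "/cgi-bin" and "/", so a trailing slash typed on the command line does not survive into the stored list.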

---

On Wed, 11 Jun 2003, wei ye wrote:

> I'm trying to crawl a URL with the --include-directories='/r/'
> parameter.
>
> I expect to crawl '/r/*', but wget gives me '/r*'.
>
> By reading the code, it turns out that cmd_directory_vector()
> removes the trailing '/' of the include-directories value '/r/'.
>
> It's a minor bug, but I hope it can be fixed in the next version.
>
> Thanks!
>
> static int cmd_directory_vector(...) {
>  ...
>           if (len > 1)
>             {
>               if ((*t)[len - 1] == '/')
>                 (*t)[len - 1] = '\0';
>             }
>  ...
>
> }
>
> =====
> Wei Ye
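
To illustrate why a stripped '/r' could end up matching '/r*', here is a small sketch comparing two ways of testing a directory against a path. It assumes the include list is checked with a plain string-prefix comparison; that is my assumption for illustration, not something confirmed in this thread. Both function names are hypothetical, and the second one is only one possible way to respect directory boundaries, not a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical: plain string-prefix test (assumed matching rule). */
bool plain_prefix_match(const char *dir, const char *path)
{
    assert(dir != NULL && path != NULL);
    return strncmp(dir, path, strlen(dir)) == 0;
}

/* Hypothetical alternative: DIR must also end on a path-component
   boundary in PATH, so "/r" matches "/r/x" but not "/rfoo/x". */
bool component_prefix_match(const char *dir, const char *path)
{
    size_t len;

    assert(dir != NULL && path != NULL);
    len = strlen(dir);
    if (strncmp(dir, path, len) != 0)
        return false;
    /* A DIR ending in '/' (such as "/") is already on a boundary;
       otherwise the next character of PATH must be '/' or the end. */
    return (len > 0 && dir[len - 1] == '/')
        || path[len] == '\0'
        || path[len] == '/';
}
```

Under that assumption, plain_prefix_match("/r", "/rfoo/page.html") is true, which is the '/r*' behaviour reported above, while component_prefix_match("/r", "/rfoo/page.html") is false and "/r/page.html" is still accepted by both.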
