Please take a look at this example:
$ \rm -rf biz.yahoo.com
$ ls biz.yahoo.com
$ wget -r  --domains=biz.yahoo.com -I /r/ 'http://biz.yahoo.com/r/'
$ ls biz.yahoo.com/
r/              reports/        research/
$

I want only '/r/', but wget crawls everything matching '/r*', which includes /reports/ and /research/.
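
To illustrate what I think is going on, here is a minimal sketch. It assumes that, after the trailing slash is stripped, the include list is checked with a plain prefix comparison. I have not confirmed that this is how wget matches directories. The function names strip_trailing_slash, prefix_match and dir_match are hypothetical and are not wget's own.

```c
#include <string.h>

/* Hypothetical helper: drop one trailing '/', mirroring the snippet
   quoted below from cmd_directory_vector(). */
static void strip_trailing_slash(char *dir)
{
    size_t len = strlen(dir);
    if (len > 1 && dir[len - 1] == '/')
        dir[len - 1] = '\0';
}

/* Assumed matching rule: plain prefix comparison.
   With inc == "/r" this also accepts "/reports" and "/research". */
static int prefix_match(const char *inc, const char *path)
{
    return strncmp(inc, path, strlen(inc)) == 0;
}

/* One possible alternative: the prefix must end on a directory
   boundary, i.e. be followed by '/' or by the end of the path. */
static int dir_match(const char *inc, const char *path)
{
    size_t n = strlen(inc);
    if (strncmp(inc, path, n) != 0)
        return 0;
    if (n > 0 && inc[n - 1] == '/')   /* e.g. the root directory "/" */
        return 1;
    return path[n] == '\0' || path[n] == '/';
}
```

Under that assumption, '/r/' becomes '/r', prefix_match accepts /reports and /research, and dir_match accepts only /r and the paths below it.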

Is it an expected result or a bug?

Thanks a lot!


--- "Aaron S. Hawley" <[EMAIL PROTECTED]> wrote:
> Above the code segment you submitted (line 765 of init.c) is the
> comment:
> 
> /* Strip the trailing slashes from directories.  */
> 
> here are the manual notes on this option:
> 
> (from "Recursive Accept/Reject Options")
> 
> `-I list'
> `--include-directories=list'
>     Specify a comma-separated list of directories you wish to follow when
> downloading (See section Directory-Based Limits for more details.)
> Elements of list may contain wildcards.
> 
>  --- and ---
> 
> (from "Directory-Based Limits")
> 
> `-I list'
> `--include list'
> `include_directories = list'
>     `-I' option accepts a comma-separated list of directories included in
> the retrieval. Any other directories will simply be ignored. The
> directories are absolute paths. So, if you wish to download from
> `http://host/people/bozo/' following only links to bozo's colleagues in
> the `/people' directory and the bogus scripts in `/cgi-bin', you can
> specify:
> 
> wget -I /people,/cgi-bin http://host/people/bozo/
> 
> ---
> 
> On Wed, 11 Jun 2003, wei ye wrote:
> 
> > I'm trying to crawl url with  --include-directories='/r/'
> > parameter.
> >
> > I expect to crawl '/r/*', but wget gives me '/r*'.
> >
> > By reading the code, it turns out that cmd_directory_vector()
> > removed the trailing '/' of include-directories '/r/'.
> >
> > It's a minor bug, but I hope it can be fixed in the next version.
> >
> > Thanks!
> >
> > static int cmd_directory_vector(...) {
> >  ...
> >           if (len > 1)
> >             {
> >               if ((*t)[len - 1] == '/')
> >                 (*t)[len - 1] = '\0';
> >             }
> >  ...
> >
> > }
> >
> > =====
> > Wei Ye
> 
> -- 
> "Yahweh commanded Abraham to sacrifice his only son Isaac on the top of a
> mountain. When Abraham asked why, Yahweh replied because 'I am God.' When
> I heard this story the first time, I promised myself to check out
> atheism."  -- Louis Proyect www.marxmail.org/


=====
Wei Ye
