On Fri, 15 Jun 2001, Augeri, Jim (NM75) wrote:
> Why not just use the "htdig-noindex" meta tag on the top
> page of any collections you don't want to index? Or am I
> missing something?
Oh, it's simple: I do not want to modify any pages, and I do not want
to risk forgetting to add htdig-noindex to a password-protected page.
IMO, the information that a page is password-protected is already
there, in .htaccess. I just need to teach htdig to use that info.
Yes, I know that my requirements are tough. I want all the hard work
to be done by [smart] programs. :)
Alex.
> -----Original Message-----
> From: Gilles Detillieux [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, June 14, 2001 3:12 PM
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: [htdig] Excluding protected dirs and local_urls_only
>
>
> According to Alex Rousskov:
> > I need to index a site that has a few .htaccess protected
> > directories. I do not want protected directories to be indexed. I want
> > to index that site via local file system only. I do not want to have a
> > static list of directories that htdig must avoid. Instead, I simply
> > want htdig to exclude any directory that has .htaccess file in it.
> >
> > I tried to find a solution in the archives, but it seems like
> > my problem is unique. Would I have to code the solution?
>
> Yes, this does seem to be a rather unique problem, as far as I recall,
> but not that different from other problems. I'd recommend using a
> find command, similar to the one in http://www.htdig.org/FAQ.html#q5.25,
> to find all directories that have a .htaccess file and make URLs out of
> the directory names. Put these in a file, and use that file for your
> exclude_urls attribute (rather than start_url as in the example). The
> whole process can be automated with a script which runs this before
> running htdig and htmerge.
>
>
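
The approach Gilles describes above could be sketched roughly as
follows. This is only an illustration, not the find command from the
FAQ: it walks the document root, collects every directory that holds a
.htaccess file, and turns each one into a URL. The function names, the
document root, the base URL and the output file name are all
hypothetical. It also assumes that excluding a directory URL is enough
to cover everything below it, so it does not descend further once a
.htaccess file is found.

```python
import os


def protected_dir_urls(doc_root, base_url):
    """Return one URL per directory under doc_root that contains .htaccess.

    doc_root and base_url are placeholders for the local document root
    and the URL prefix it is served under.
    """
    urls = []
    prefix = base_url.rstrip("/") + "/"
    for dirpath, dirnames, filenames in os.walk(doc_root):
        dirnames.sort()  # deterministic traversal order
        if ".htaccess" in filenames:
            rel = os.path.relpath(dirpath, doc_root)
            path = "" if rel == "." else rel.replace(os.sep, "/") + "/"
            urls.append(prefix + path)
            # Assumption: the directory URL already covers its
            # subdirectories, so there is no need to look deeper.
            dirnames[:] = []
    return urls


def write_exclude_file(urls, out_path):
    """Write the URLs one per line, for use as the exclude_urls list."""
    with open(out_path, "w") as out:
        for url in urls:
            out.write(url + "\n")
```

A wrapper script could call protected_dir_urls("/var/www/html",
"http://www.example.com/"), pass the result to write_exclude_file(),
and then run htdig and htmerge, so the list is rebuilt on every
indexing run and never has to be maintained by hand.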