On Thu, 14 Jun 2001, Gilles Detillieux wrote:

> Yes, this does seem to be a rather unique problem, as far as I
> recall, but not that different from other problems.  I'd recommend
> using a find command, similar to the one in
> http://www.htdig.org/FAQ.html#q5.25, to find all directories that
> have a .htaccess file and make URLs out of the directory names.  
> Put these in a file, and use that file for your exclude_urls
> attribute (rather than start_url as in the example).  The whole
> process can be automated with a script which runs this before
> running htdig and htmerge.

Yes, that would work OK, and I have considered it as a possible
workaround.
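
For reference, a minimal sketch of that preparation step, written in
Python rather than with the find command from the FAQ, might look
like the following. The function names, the document root, and the
base URL are all hypothetical; it assumes that directories under the
document root map directly onto URL paths.

```python
import os


def protected_urls(doc_root, base_url):
    """Return one URL per directory under doc_root that holds a .htaccess file.

    Assumes (hypothetically) that doc_root maps directly onto base_url.
    """
    urls = []
    for dirpath, _dirnames, filenames in os.walk(doc_root):
        if ".htaccess" in filenames:
            rel = os.path.relpath(dirpath, doc_root)
            # The document root itself maps to the bare base URL.
            path = "" if rel == "." else rel.replace(os.sep, "/") + "/"
            urls.append(base_url.rstrip("/") + "/" + path)
    return sorted(urls)


def write_exclude_file(urls, out_path):
    """Write the URLs one per line, ready to be pulled into exclude_urls."""
    with open(out_path, "w") as fh:
        for url in urls:
            fh.write(url + "\n")
```

A wrapper script would call protected_urls() and write_exclude_file()
before starting htdig and htmerge, which is exactly the extra step
that can be forgotten.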

Ideally, since this is a "security" issue, I would prefer not to rely
on an external preparation step: I do not want to end up with
protected content displayed in search summaries when I (or somebody
else!) forget to run the script first. Renaming the script to htdig
would help, but that change might get lost when a new htdig version
is installed. And so on.

A middle-ground (and very general) solution would be to teach htdig
to import the output of a program into the configuration file at
run time. Something along these lines:

        exclude_urls: \
                [/mail-archive/] \
                `/usr/local/bin/myscript.sh` \
                ...

... where the command in the back-quoted string is executed by htdig
and then replaced with the output of the executed command. Htdig
should exit if the command fails (which is difficult to check for
reliably, unfortunately). Htdig should at least warn if the command
produces no output.
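
To make the proposed behaviour concrete, here is a rough sketch in
Python (not htdig code; expand_commands and CommandFailed are
made-up names). It replaces each back-quoted command in an attribute
value with that command's output, treats a non-zero exit status as
failure, and warns on empty output. The exit status is only a partial
check: a shell pipeline, for example, reports only the status of its
last command.

```python
import re
import subprocess
import sys

# Matches a back-quoted command such as `/usr/local/bin/myscript.sh`.
BACKQUOTED = re.compile(r"`([^`]+)`")


class CommandFailed(Exception):
    """Raised when a back-quoted command exits with a non-zero status."""


def expand_commands(value):
    """Replace each `command` in value with the command's output."""

    def substitute(match):
        cmd = match.group(1)
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        if result.returncode != 0:
            # The caller is expected to abort the whole run on this error.
            raise CommandFailed(
                "%r exited with status %d" % (cmd, result.returncode)
            )
        # Collapse newlines so the output fits a one-line attribute value.
        output = " ".join(result.stdout.split())
        if not output:
            print("warning: %r produced no output" % cmd, file=sys.stderr)
        return output

    return BACKQUOTED.sub(substitute, value)
```

With this sketch, a value such as "[/mail-archive/] `myscript.sh`"
would come back with the script's output in place of the back-quoted
part, and an uncaught CommandFailed would stop the run before any
indexing starts.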


Thanks a lot for your help,

Alex.



_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
