On 6 Sep, Jeff Breidenbach wrote:
> 
> Hi htdig folks,
> 
> I'm having a bit of a problem getting what I want from the htdig
> configuration options. Lots of people, myself included, use htdig in
> conjunction with MHonArc. In the current release version of MHonArc
> (2.4.3, which I recently upgraded to) attachments may be stored in
> subdirectories as follows:
> 
> The first URL is the message, while the second is the attachment.
> No need to follow the links, just look at their structure.
> 
> http://mail-archive.com/sinister%40majordomo.net/1997-month-08/msg00174.html
> 
> http://mail-archive.com/sinister%40majordomo.net/1997-month-08/msg00174/The_state_i_am_in.txt
> 
> My question is, using the current stable version of htdig, how
> can I configure it to ONLY index messages, and not index attachments?
> If I could say "Ignore everything that does not end in .html" or
> "only index URLs with a certain regexp" that would do the trick. 
> But with the current configuration options, I just don't see how to do
> this. 
> 
> Thanks in advance for enlightenment.
> 
> Jeff

At a quick glance, assuming you _only_ want to dig your MHonArc
archive, or are happy to dig it separately from the rest of your site,
you might be able to use a judicious mixture of start_url and
limit_urls_to (or perhaps exclude_urls, if you know the extensions of
all your attachment files).

Set your start_url to the top level of the MHonArc structure, and set
either

limit_urls_to: .html

or

exclude_urls: .txt

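That should keep most of them out. Of course, these are substring
matches, so an attachment whose URL contains the string in
limit_urls_to (an attached .html file, say) will still get picked up,
and exclude_urls only catches the extensions you list.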
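
To make the caveat concrete, here is a rough Python sketch of that kind of substring filtering, applied to the two URLs from the question. It is not htdig's own code: the would_dig function is a made-up helper that assumes plain substring matching as described above, and the notes.html attachment is a hypothetical example.

```python
def would_dig(url, limit_urls_to, exclude_urls=()):
    """Hypothetical helper: True if url passes both substring filters."""
    # Keep the URL only if it contains at least one limit pattern ...
    if not any(pattern in url for pattern in limit_urls_to):
        return False
    # ... and contains none of the exclude patterns.
    return not any(pattern in url for pattern in exclude_urls)


base = "http://mail-archive.com/sinister%40majordomo.net/1997-month-08/"
message = base + "msg00174.html"
attachment = base + "msg00174/The_state_i_am_in.txt"
# Hypothetical attachment, not from the original post.
html_attachment = base + "msg00174/notes.html"

for url in (message, attachment, html_attachment):
    print(would_dig(url, [".html"]), url)
```

With the limit set to .html, the message passes and the .txt attachment does not, but the hypothetical notes.html attachment passes as well, which is the leak mentioned above.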

And with luck Geoff or Gilles will know a better way :-)

-- 
David Robley

WEBMASTER                           | Phone +61 8 8374 0970
RESEARCH CENTRE FOR INJURY STUDIES  | http://www.nisu.flinders.edu.au/
AusEinet                            | http://auseinet.flinders.edu.au/
            Flinders University, ADELAIDE, SOUTH AUSTRALIA
            Visit the PHP mirror at http://au.php.net:81/