On Mon, 10 Jan 2005, Dan Langille wrote:
Each URL must contain one of the following (actually, there are more values in this list, but they have been eliminated to simplify things):
DO_TOPIC DO_ROOT DO_COMMUNITY
How can I use that on limit_urls_to? I've been trying this:
limit_urls_to: ${start_url}*DO_TOPIC|DO_ROOT|DO_COMMUNITY*
There are additional restrictions, but once I get a starting point, I think it'll all fall into place.
A few examples of what we want to do:
http://example.org/index.html OK
http://example.org/index.html?ID=4 BAD
http://example.org/index.html?ID=4&DO_TOPIC OK
I don't think that you are going to be able to do what you want with limit_urls_to. The attribute contains a list of patterns, one of which must be matched. Once you add a pattern that satisfies the first URL above, the other two are also satisfied since they contain the first.
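To make the problem concrete, here is a rough Python model of the matching described above. It is only an illustration, not ht://Dig code: it assumes each limit_urls_to pattern is treated as a plain substring, and the helper name passes_limit is made up for this sketch.

```python
# Illustrative model only: treats each limit_urls_to pattern as a plain
# substring, as the explanation above implies. Not ht://Dig's actual code.
def passes_limit(url, patterns):
    """Return True if the URL contains at least one of the patterns."""
    return any(p in url for p in patterns)

urls = [
    "http://example.org/index.html",                # wanted
    "http://example.org/index.html?ID=4",           # not wanted
    "http://example.org/index.html?ID=4&DO_TOPIC",  # wanted
]

# A pattern broad enough to admit the first URL...
patterns = ["http://example.org/index.html"]

# ...is also contained in the other two, so all three are accepted.
results = {u: passes_limit(u, patterns) for u in urls}
for u, ok in results.items():
    print(u, "->", "accepted" if ok else "rejected")
```

Under that assumption every URL is accepted, including the one marked BAD, because a list of "must contain" patterns has no way to say "reject unless something else is also present".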
I am not sure how you would completely solve this type of problem short of somehow using the external parser/converter mechanism as a filter. Depending on specifics, you might be able to handle some restrictions through the bad_querystr attribute, but that would not be sufficient for the example above. There are also restrict and exclude attributes, but those are applied at search time. The only other thing I can think of is perhaps using url_rewrite_rules to rewrite URLs that you don't want to something that limit_normalized then drops (I have never tried this and don't even know if it is actually feasible).
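If a filter of some kind were used, the decision it has to make could look roughly like the Python sketch below. The rule is inferred from the three example URLs only (no query string is fine; a query string must contain one of the listed values), so treat it as an assumption. The names url_allowed and REQUIRED_TOKENS are hypothetical, and how such a check would be hooked into the external parser mechanism is not shown here.

```python
from urllib.parse import urlsplit

# Values taken from the question; the real list is said to be longer.
REQUIRED_TOKENS = ("DO_TOPIC", "DO_ROOT", "DO_COMMUNITY")

def url_allowed(url):
    """Assumed rule: accept URLs with no query string, and URLs whose
    query string contains at least one of the required values."""
    query = urlsplit(url).query
    if not query:
        return True
    # Simple substring test, mirroring "must contain" in the question.
    return any(token in query for token in REQUIRED_TOKENS)

for url in (
    "http://example.org/index.html",
    "http://example.org/index.html?ID=4",
    "http://example.org/index.html?ID=4&DO_TOPIC",
):
    print(url, "OK" if url_allowed(url) else "BAD")
```

The additional restrictions mentioned in the question would have to be added to the same function.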
Jim
_______________________________________________
ht://Dig general mailing list: <[email protected]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.) https://lists.sourceforge.net/lists/listinfo/htdig-general

