Problem with Vspider and excluding specific pages
Hi everyone,

I've been working with Vspider for a while now (CFMX 6.1 and 7.0.1) on a search revamp project and ran into a problem. It seems that if you want the spider to follow a page but not index it, the link to that page must be explicit rather than implicit. Take these two links:

    http://www.mysite.com/mydir/
    http://www.mysite.com/mydir/index.cfm

If I want to exclude the index.cfm file from being indexed but still have vspider follow it, I'd do this in my config file:

    -indexclude */mydir/index.cfm

That works, but ONLY if the link vspider picks up to that page explicitly names index.cfm. If vspider follows this link instead:

    http://www.mysite.com/mydir/

then index.cfm is both indexed and followed, which effectively renders the -indexclude option useless.

Does anyone have any ideas on this? It's possible I'm missing something with one of the options, but I've been through them quite a lot. I was thinking the -regexp switch might help get around this problem.

Thanks,
Andy

Message: http://www.houseoffusion.com/lists.cfm/link=i:4:237595
Archives: http://www.houseoffusion.com/cf_lists/threads.cfm/4
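P.S. To illustrate what I think is going on, here is a minimal Python sketch. It is not Vspider code. It assumes (and this is only my guess, not something I found in the docs) that the -indexclude pattern is compared as a shell-style glob against the URL exactly as it appears in the link. The is_excluded function is a hypothetical stand-in for whatever check the spider really performs.

```python
from fnmatch import fnmatchcase

# Pattern taken from the -indexclude line in the config file.
PATTERN = "*/mydir/index.cfm"


def is_excluded(url: str, pattern: str = PATTERN) -> bool:
    """Hypothetical stand-in for the spider's exclude check.

    Assumption: the pattern is glob-matched against the raw link URL,
    before the web server resolves a directory to its default document.
    """
    return fnmatchcase(url, pattern)


explicit = "http://www.mysite.com/mydir/index.cfm"
implicit = "http://www.mysite.com/mydir/"

print(is_excluded(explicit))  # True: the link names index.cfm
print(is_excluded(implicit))  # False: the link never mentions index.cfm
```

If that assumption holds, the implicit link never matches because the string "index.cfm" is simply not in the URL, even though the server returns the same page for both.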