I just got this exact setup working myself after some effort. There are two basic ways to do it: build a separate database for each site, or build one big database for all the sites and then limit each search to a single site. I did it the latter way, which is easier, though it has tradeoffs.
For this, you basically just need a different .conf file for each site you want to search, a different set of HTML template files for each site, and a restriction in each search form that limits results to that one site.
Let's say I have two sites, site1.org and site2.org. Besides my htdig.conf, I have site1.conf and site2.conf. You could actually call the files whatever you want -- 1.conf, 2.conf etc. In the htdig.conf I just put the options related to the search indexing, like the factor definitions:
backlink_factor: 0
description_factor: 200
keywords_factor: 300
meta_description_factor: 500
title_factor: 1000
and other options specific to the indexing, like the start URL (put all URLs here):
start_url: http://www.site1.org/ http://www.site2.org/
the limit:
limit_urls_to: ${start_url}
and the exclude URLs:
exclude_urls: /cgi-bin/ .cgi printerfriendly=1 /other/things/
Note that the exclude URLs will be applied to all sites indexed since we're using this method. So if there are different exclusion rules for each site, whichever rules you use will be applied to all the other sites as well. If you want /cgi-bin/ indexed on one site but not another, then as far as I can tell that's not possible with this method.
...and all the other settings related to indexing.
Ok, that's step one -- setting up the unique .conf files for each site.
Then, in each of the per-site files (site1.conf, site2.conf), you can also insert lines like these, shown here for site1:
search_results_header: /home/myaccount/html/www.site1.org/htdig/html_files/header.html
search_results_footer: /home/myaccount/html/www.site1.org/htdig/html_files/footer.html
#search_results_wrapper: /home/myaccount/html/www.site1.org/htdig/html_files/wrapper.html
nothing_found_file: /home/myaccount/html/www.site1.org/htdig/html_files/nomatch.html
syntax_error_file: /home/myaccount/html/www.site1.org/htdig/html_files/syntax.html
...changing the locations to point to wherever you keep your HTML template files for that site. I put the # before the wrapper line because I don't use it.
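For comparison, here is a rough sketch of what the matching site2.conf might look like. It assumes the same directory layout under www.site2.org, which is only a guess at how you would arrange things, so adjust the paths. The database_dir line is an assumption too, and the path in it is made up: it should only matter if your htdig.conf moves the databases away from the compiled-in default, in which case each site file has to point at the same place so that htsearch reads the one shared index.

```conf
# site2.conf -- sketch only; every path below is hypothetical

# Only needed if htdig.conf sets a non-default database_dir.
# Use the same value as in htdig.conf (this path is an example).
#database_dir: /home/myaccount/htdig/db

# Templates for site2's look
search_results_header: /home/myaccount/html/www.site2.org/htdig/html_files/header.html
search_results_footer: /home/myaccount/html/www.site2.org/htdig/html_files/footer.html
#search_results_wrapper: /home/myaccount/html/www.site2.org/htdig/html_files/wrapper.html
nothing_found_file: /home/myaccount/html/www.site2.org/htdig/html_files/nomatch.html
syntax_error_file: /home/myaccount/html/www.site2.org/htdig/html_files/syntax.html
```

The only thing that differs between the site files is the set of template paths, since the indexing settings all live in htdig.conf.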
Now you can set up the HTML templates for each site. They can live wherever you want, since they are read by the search script and don't need to be in a web-accessible area, though that's where I've put them in my sample lines above.
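Last step is to set up the search itself. Wherever you have a search form on any of the sites, make sure you include the config and restrict parameters as hidden fields. The config value is the name of that site's .conf file without the extension, and restrict is matched against the URLs of the results. For site1 that looks like this: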
<input type="hidden" name="config" value="site1">
<input type="hidden" name="restrict" value="site1.org">
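Put together, a complete form for the second site might look something like the sketch below. The action URL is an assumption (use wherever htsearch is installed on your server), and the field size and button label are arbitrary. The "words" field is the one htsearch reads the query from.

```html
<!-- Sketch of a search form for site2.org; the action path is an assumption -->
<form method="get" action="/cgi-bin/htsearch">
  <!-- config names the .conf file without its extension: site2 means site2.conf -->
  <input type="hidden" name="config" value="site2">
  <!-- restrict keeps only results whose URL matches this string -->
  <input type="hidden" name="restrict" value="site2.org">
  <!-- words carries the visitor's search terms -->
  <input type="text" name="words" size="30">
  <input type="submit" value="Search">
</form>
```

The form on site1.org would be identical except for the two hidden values.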
That should do it. If you have any questions just ask.
-Mike
At 09:34 AM 2/4/2004 -0800, you wrote:
Hello;
Forgive me if this has been covered, but the Search "This mailing list" field doesn't give me any returns.
We have three intranet sites on one Linux/Apache server, all with their root folder under /var/www/. Each site has a unique look and I've got htdig working fine on one of them using its site layout via the wrapper, long, short, etc. files in a folder within the site.
How can I use htdig to work on all three sites? Because of the different looks (and probable differences in htdig.conf tweaks), I'll need to use different template html pages and a different htdig.conf for each site.
Thanks! David Richards Intranet Development FedEx Freight System (408) 323-4036
------------------------------------------------------- The SF.Net email is sponsored by EclipseCon 2004 Premiere Conference on Open Tools Development and Integration See the breadth of Eclipse activity. February 3-5 in Anaheim, CA. http://www.eclipsecon.org/osdn _______________________________________________ ht://Dig general mailing list: <[EMAIL PROTECTED]> ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html List information (subscribe/unsubscribe, etc.) https://lists.sourceforge.net/lists/listinfo/htdig-general
What is THE MEATRIX? www.themeatrix.com
Mike Flynn, Web Developer GRACE - www.gracelinks.org | [EMAIL PROTECTED]

