I'm writing to ask for input on a choice I have to make between htdig and Microsoft Index Server.

Since this is the htdig list, I assume there will be strong support for htdig (which I've used for a long time and have been very happy with).

The site I am indexing and providing search for is served from an IIS server running PHP. Htdig runs on a Linux machine elsewhere on the network. Search requests go to PHP, which queries the Linux box for the results. PHP then formats the results in the context of the page and returns the entire page to the client.
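
To make the request flow concrete, here is a minimal sketch, written in Python rather than the site's actual PHP. The host name `search.example.internal`, the CGI path `/cgi-bin/htsearch`, and the function names are placeholders and not the real setup; `words`, `page`, and `matchesperpage` are the usual htsearch CGI inputs as I understand them.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical location of the htsearch CGI on the Linux box.
HTSEARCH_URL = "http://search.example.internal/cgi-bin/htsearch"


def build_search_url(words, page=1, matches_per_page=10, base=HTSEARCH_URL):
    """Build the URL the front end would request for one page of results."""
    params = {
        "words": words,
        "page": page,
        "matchesperpage": matches_per_page,
    }
    return base + "?" + urlencode(params)


def fetch_results(words, page=1, timeout=10):
    """Fetch the raw htsearch output; the caller wraps it in the page template."""
    with urlopen(build_search_url(words, page), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Each search therefore costs one HTTP round trip from the web server to the Linux box before the page can be rendered.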

There are a few problems with this setup:

1) Htdig cannot get beyond the security of certain sections of the site even though some of the content needs to be indexed. Thus, that content simply cannot be indexed. I have complete control over both servers but am not interested in creating any "back doors". The security uses group-based permissions from a database and sessions to determine access. I may just end up adding the search engine "user" to all groups (kind of like having admin access).

2) There seems to be really high CPU usage every time a search is executed. I suspect this has something to do with how PHP opens an HTTP stream when passing the request to htdig running on the Linux box.

3) I find myself having to completely recreate the indexes every 3 or 4 months on a site consisting of about 4000 pages (including PDF and Office documents). About 1% of the site is updated daily. The index is updated nightly using rundig and db.words.db can grow to 20 MB.
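
For problem 2, one way to check whether the time goes to waiting on the remote box or to actual CPU work on the web server is to compare wall-clock time with process CPU time around the call. A rough sketch in Python (the `measure` helper is hypothetical, and the same idea would have to be ported to PHP on the IIS side):

```python
import time


def measure(func, *args, **kwargs):
    """Run func and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args, **kwargs)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


if __name__ == "__main__":
    # time.sleep stands in for a request that mostly waits on the remote box.
    _, wall, cpu = measure(time.sleep, 0.2)
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

If CPU time stays small while wall time is large, the request is mostly waiting; if the two are close, the work is being done locally.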

My question is: would it make more sense to go with Microsoft's Index Server, or to keep working with htdig? I'm fairly certain I will stay with htdig, but I thought I'd put the question to the list to see what others have to say.

Thanks and I look forward to your replies!

Ted Stresen-Reuter
http://www.tedmasterweb.com/


