On Fri, 3 Apr 2004, Ninti Systems wrote:

> 1. If I add '.php' to exclude_urls or bad_extensions, rundig doesn't
> work (runs momentarily then stops a second later, no useful data in
> database). Removing '.php' from the list solves problem. The same was
> true for some directories I tried to add to these directives.

What does your start_url look like? Does it perhaps resolve or redirect
to a PHP file? If so, then you will immediately exclude indexing of the
first file and therefore never find any additional URLs to retrieve.

Try executing rundig with two or three '-v' options. This should provide
you with additional information that shows what htdig is seeing and how
it is responding.

> 2. I have added the external parsers required for .doc, .pdf, .rtf. I
> can run these successfully on the command line directly and via
> doc2html. HtDig (rundig) doesn't still doesn't process any of these file
> types though.

Again your best bet is probably to start by trying a run with some
'-v's. This will allow you to determine whether htdig is even seeing the
files that you want to index. There are a lot of reasons for htdig not
seeing files that you might expect it to find. If this appears to be the
case the following is a good place to start looking for answers.

http://www.htdig.org/FAQ.html#q5.27

> 3. Editing common/long.html appears to have no effect whatsoever on
> output, whereas common/header.html for example is readily editable.

By default, htsearch uses templates that are compiled into the
executable; this provides a slight performance advantage. In order to
use the template files, you need to make some changes to your
configuration file. Search htdig.conf (or whatever you named it) for
template_map and template_name. Also see the following.

http://www.htdig.org/attrs.html#template_map
http://www.htdig.org/attrs.html#template_name

Jim

