On Fri, 3 Apr 2004, Ninti Systems wrote:

> 1. If I add '.php' to exclude_urls or bad_extensions, rundig doesn't
> work (runs momentarily then stops a second later, no useful data in
> database). Removing '.php' from the list solves the problem. The same was
> true for some directories I tried to add to these directives.

What does your start_url look like? Does it perhaps resolve or redirect to
a PHP file? If so, htdig will reject the very first URL it is given and
will therefore never find any additional URLs to retrieve. Try
executing rundig with two or three '-v' options. This should provide you
with additional information that shows what htdig is seeing and how it is
responding.
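
As a hypothetical illustration of that failure mode (the host name and
values below are made up, not taken from your configuration), a fragment
like this would leave the dig with nothing to follow, assuming the
exclusion is applied to the start URL as described above:

```
# Hypothetical htdig.conf fragment; www.example.com is a placeholder.
# The start URL itself ends in .php ...
start_url:      http://www.example.com/index.php
# ... and this list rejects any URL containing ".php", so there is
# no first page from which to collect further links.
exclude_urls:   /cgi-bin/ .cgi .php
```

The same would apply if start_url pointed at http://www.example.com/ and
the server redirected that request to index.php.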

> 2. I have added the external parsers required for .doc, .pdf, .rtf. I
> can run these successfully on the command line directly and via
> doc2html. HtDig (rundig) still doesn't process any of these file
> types though.

Again your best bet is probably to start by trying a run with some '-v's.
This will allow you to determine whether htdig is even seeing the files
that you want to index. There are a lot of reasons for htdig not seeing
files that you might expect it to find. If that appears to be the case, the
following FAQ entry is a good place to start looking for answers.

  http://www.htdig.org/FAQ.html#q5.27

> 3. Editing common/long.html appears to have no effect whatsoever on
> output, whereas common/header.html for example is readily editable.

By default, htsearch uses templates that are compiled into the executable;
this provides a slight performance advantage. In order to use the template
files, you need to make some changes to your configuration file. Search
htdig.conf (or whatever you named it) for template_map and template_name.
Also see the following.

http://www.htdig.org/attrs.html#template_map
http://www.htdig.org/attrs.html#template_name
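
As a rough sketch, assuming your template files live in the directory
named by common_dir and that you want long.html to be the default, the
relevant lines might look something like this (check the attribute
documentation above against your version before relying on it):

```
# Each template_map entry has three fields:
# display name, internal name, template file (or a builtin-* name).
template_map:   Long long ${common_dir}/long.html \
                Short short ${common_dir}/short.html
template_name:  long
```

With entries that point at files rather than the builtin-* names,
htsearch should read the templates from disk, so edits to long.html
would show up in the results.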


Jim

