I just spent an embarrassingly long time figuring out why my index was skipping every document.
 
My config was:
 
start_url:  http://www.fooco.com/content/archives/archivelist.php
Limit_urls_to: archive (and every other combination I could think of)
 
 
 
And yet htdig was behaving as if limit_urls_to was set to start_url.
 
After too many guesses, I changed Limit_urls_to to limit_normalized and everything worked fine.
 
Only after that did I realize that my example config file had an uppercase L in Limit_urls_to. Removing the comment marker in front of it made me think I was enabling the config option, but the case typo had other ideas.
 
Deleting limit_normalized and lowercasing limit_urls_to confirmed the true source of the problem.
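
For anyone who hits the same thing, this is roughly what the relevant pair of lines looks like once the attribute name is all lowercase. The value archive is just the pattern from my attempt above, so substitute whatever pattern you need. As far as I can tell, the only difference that mattered was the case of the first letter:

```
# attribute names appear to be case-sensitive: all lowercase
start_url:      http://www.fooco.com/content/archives/archivelist.php
limit_urls_to:  archive
```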
 
My question is: is there any verbosity level or other diagnostic that would tell me I had an unrecognized config file entry?
 
Or is there some way to display the settings the program is actually using to build an index?
 
Either would have been useful tonight.
 
Gordon
