Permit me to express how gratifying it is to have someone on your end who is humble, 
genuinely concerned and so receptive.  Most computer pros do not have such sensitivity 
to us "mere mortals" out here, so that is really appreciated.  

Now, you mention FAQ 5.31 on attributes, and you ask if I have exhausted all 
possibilities there.  I am not exactly sure what is meant by that (yes, I have read 
and studied the FAQ, to the best of my ability).  So, I list below what I think you 
mean by "attributes" from the htdig.conf file:

exclude_urls:           /cgi-bin/ .cgi
                                /forum/
                                /applets/
                                /chat/
                                /dig/
                                /htdig/
                                /poll/
                                /random/

Would that indicate that I have "exhausted" the attributes or is there something else 
I need to do in that regard?
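
To make the question concrete, here is a minimal sketch of how I understand 
exclude_urls to behave: a URL is rejected if it contains any of the listed 
patterns as a plain substring.  That reading comes from the htdig 
documentation and is an assumption here, not htdig's actual code.  The sketch 
also assumes all nine patterns from the listing above are in effect, and the 
function names are made up for illustration.

```python
# Patterns copied from the exclude_urls listing above.
EXCLUDE_PATTERNS = [
    "/cgi-bin/", ".cgi", "/forum/", "/applets/",
    "/chat/", "/dig/", "/htdig/", "/poll/", "/random/",
]


def matching_patterns(url, patterns=EXCLUDE_PATTERNS):
    """Return the patterns that occur anywhere in the URL (substring match)."""
    return [pattern for pattern in patterns if pattern in url]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if at least one pattern occurs in the URL."""
    return bool(matching_patterns(url, patterns))


if __name__ == "__main__":
    # The typical forum URL quoted in the earlier message below.
    url = "/forum/viewforum.php?f=3&sid=f4d181d874cbc2cc0f41f2927959f2c5"
    print(is_excluded(url), matching_patterns(url))
```

Under that model the typical forum URL matches "/forum/", so if such URLs are 
still being indexed, the pattern itself is probably not what is at fault.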

OK, next (assuming that "attributes" have been "exhausted") you say that "we need to 
get htdig to issue a warning."  I assume that this is something you would need to do 
or tell me to do?  A patch or string of some kind?  Whatever it takes, I am at your 
service and will be patient and cooperative in every way to help work through this. I 
love your program! I believe you have correctly identified the problem -- with my 
meager help, so far.  Just let me know what to do next.  I have a feeling that this 
can be solved.  If you need further info from my side, please tell me, too.  Thanks.
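
Until htdig can issue such a warning itself, the check suggested in the quoted 
message below (a db.log that is older than the config file) could be done from 
outside htdig.  This is only a sketch: it is not part of htdig, the function 
names are hypothetical, and the paths to db.log and htdig.conf have to be 
supplied for the actual installation.

```python
import os


def is_log_stale(log_path, conf_path):
    """Return True if log_path exists and was last modified before conf_path."""
    if not os.path.exists(log_path):
        return False  # no leftover log, nothing to worry about
    return os.path.getmtime(log_path) < os.path.getmtime(conf_path)


def remove_if_stale(log_path, conf_path):
    """Delete the log if it predates the config; return True if it was deleted."""
    if is_log_stale(log_path, conf_path):
        os.remove(log_path)
        return True
    return False
```

Calling remove_if_stale() with the two paths before restarting htdig would 
amount to the "delete the file and restart" advice, applied only when the log 
is older than the config.  It compares modification times only; the checksum 
idea from the quoted message is not covered.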
 

--

On Fri, 11 Oct 2002 12:55:12  
 Gilles Detillieux wrote:
>According to Pub Litics:
>> OK, been working on this for the past nine hours or so, but seem to
>> be stumped.
>> 
>> I have gotten into htdig.conf file, primarily (since that is the file
>> to which my attention was directed).  I have a main home directory
>> as the starting point.  Under that my directory structure consists of
>> text files, image files, htdig files, forum files [the forum files are
>> the ones I do not want indexed], applet files and cgi-bin.  Under each
>> sub-directory there are many, many sub-directories.  The site, itself,
>> is about 300 MB.
>> 
>> There is a mysql database for the forum, but it is tucked away in
>> the server and not referenced in the directory structure, except for
>> the file which calls it up. Under excluded URLs, I had listed /forum.
>> This listing did not stop htdig from searching and indexing thousands
>> of unwanted listings, however, from the forum.  A typical listing looks
>> like:  /forum/viewforum.php?f=3&sid=f4d181d874cbc2cc0f41f2927959f2c5
>> 
>> I tried /forum/ [with an added forward-slash], but that did not help.
>> Would it be possible to start at the sub-directory level, perhaps,
>> with multiple starting points?
>> 
>> At present, the search engine is totally useless because it searches
>> and indexes repeatedly.  The suggestion asks "where to prune?"  I would
>> reply "anywhere to exclude /forum and all under it."
>> 
>> I tried to understand what you mean by the bad query string process,
>> but I cannot figure out what you mean.  I have read all the material
>> and inspected htdig.conf copiously, but (I apologize) I do not know
>> what I am supposed to do.  Help!  Thanks.
>
>I must apologize for misleading you yesterday.  I was under the mistaken
>impression then that you wanted to index certain parts of the /forum
>subdirectory, but wanted to rein in htdig because it was getting lost in
>all the cross-links that the forums scripts generate.  In rereading your
>earlier messages, as well as this one, I see that you quite clearly stated
>you don't want any of /forum indexed.  So, you're not looking to prune
>this tree down to size at all, you're looking to lop it off at the root.
>
>The question, then, is not what the proper settings of exclude_urls and
>bad_querystr ought to be.  You're already trying the proper setting
>of exclude_urls, as stated above and previously, and bad_querystr
>doesn't apply if you don't want any of these scripts indexed at all.
>The question is why htdig isn't taking your exclude_urls setting.
>
>If you've exhausted all the possibilities of FAQ 5.31, then I'd suggest
>another possibility that's bitten other users lately.  When you Control-C
>out of htdig, it now by default creates a db.log file of all the URLs
>that have been "pushed" but not indexed so far.  This file tends to be
>persistent, because if it's there, htdig reads it and if you interrupt
>htdig again, it recreates it again.  So, if you interrupted htdig before
>changing exclude_urls, and restarted htdig afterward, the db.log file
>(in your database directory) may have had a number of /forum URLs that
>had already been pushed prior to the change to exclude_urls, that don't
>get rechecked afterward.  If this is the case, simply delete the file
>and restart htdig to truly restart from scratch.
>
>I think we need to get htdig to issue a warning if it loads a db.log
>file that's older than the config file it's using.  In fact, it would
>make sense for htdig to record the modtime and a checksum of the config
>file in db.log to make sure you're restarting with the same config.
>
>-- 
>Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
>Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
>Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
>
>
>-------------------------------------------------------
>This sf.net email is sponsored by:ThinkGeek
>Welcome to geek heaven.
>http://thinkgeek.com/sf
>_______________________________________________
>htdig-general mailing list <[EMAIL PROTECTED]>
>To unsubscribe, send a message to <[EMAIL PROTECTED]> with 
>a subject of unsubscribe
>FAQ: http://htdig.sourceforge.net/FAQ.html
>

