The handling of <meta name="robots" content="none">, or of any of the other
ways of telling htdig not to follow links on a page, has two small bugs.
Either one by itself would not cause the problem I saw.  The following patch
seems to have fixed it.

*** HTML.orig   Mon Nov  2 16:21:51 1998
--- HTML.cc     Sat Nov 14 00:40:55 1998
*************** HTML::parse(Retriever &retriever, URL &b
*** 256,262 ****
                if (description.length() > max_description_length)
                {
                    description << " ...";
!                   retriever.got_href(*href, description);
                    in_ref = 0;
                    description = 0;
                }
--- 256,263 ----
                if (description.length() > max_description_length)
                {
                    description << " ...";
!                       if (dofollow)
!                     retriever.got_href(*href, description);
                    in_ref = 0;
                    description = 0;
                }
*************** HTML::do_tag(Retriever &retriever, Strin
*** 512,520 ****
        }
  
        case 3:         // "/a"
!           if (dofollow && in_ref)
            {
!               retriever.got_href(*href, description);
                in_ref = 0;
            }
            break;
--- 513,522 ----
        }
  
        case 3:         // "/a"
!           if (in_ref)
            {
!               if (dofollow)
!                 retriever.got_href(*href, description);
                in_ref = 0;
            }
            break;
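
As I read the patch, the two bugs feed each other. With dofollow off, the old
"/a" case never cleared in_ref, so text after the anchor kept piling into
description; once it passed max_description_length, the overflow branch in
parse() called got_href() without checking dofollow. Below is a minimal
standalone sketch of the patched logic. LinkSink, AnchorState, open_anchor,
add_text, close_anchor and links_from_nofollow_page are hypothetical names
made up for illustration; they are not htdig's real classes, and the length
limit of 8 is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for htdig's Retriever: it only records the links it is given.
struct LinkSink {
    std::vector<std::string> hrefs;
    void got_href(const std::string &href, const std::string & /*description*/) {
        hrefs.push_back(href);
    }
};

// Hypothetical, heavily reduced model of the parser state the patch touches.
struct AnchorState {
    bool dofollow = true;
    bool in_ref = false;
    std::string href;
    std::string description;
    std::size_t max_description_length = 8;   // arbitrary value for the sketch

    void open_anchor(const std::string &target) {
        href = target;
        description.clear();
        in_ref = true;
    }

    // Mirrors the patched overflow branch in parse(): report only if following.
    void add_text(LinkSink &sink, const std::string &text) {
        if (!in_ref) return;
        description += text;
        if (description.length() > max_description_length) {
            description += " ...";
            if (dofollow) sink.got_href(href, description);
            in_ref = false;
            description.clear();
        }
    }

    // Mirrors the patched "/a" case: always leave the anchor, report only if following.
    void close_anchor(LinkSink &sink) {
        if (in_ref) {
            if (dofollow) sink.got_href(href, description);
            in_ref = false;
        }
    }
};

// Walks a page marked "do not follow" and returns how many links were recorded.
std::size_t links_from_nofollow_page() {
    LinkSink sink;
    AnchorState state;
    state.dofollow = false;
    state.open_anchor("a.html");
    state.add_text(sink, "short");
    state.close_anchor(sink);   // must clear in_ref even though the link is dropped
    assert(!state.in_ref);
    state.add_text(sink, "body text well past the limit");
    return sink.hrefs.size();
}
```

Resetting in_ref outside the dofollow test is the part that matters: the
anchor is over whether or not the link is reported.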

