----- Original Message -----
From: Randal L. Schwartz <[EMAIL PROTECTED]>
Sent: Thursday, October 28, 1999 9:50 AM
Subject: "DigExt" in user-agent hammering my site

> In the past week or so, I've been seeing many many portions of my site
> sucked down in rapid fire.  These sucks followed patterns of being a
> spider -- some autogenerated URLs from pod2html are definitely bad,
> and I saw these bad hits, indicating that someone is following every
> link in every one of my WT columns.  robots.txt is *occasionally*
> being fetched, but not always.
> The only thing in common with these rude intrusions is a windows-IE
> user agent, along with a new string "DigExt".
> Has anyone else seen this?  Can we trace this back to some lousy user
> interface somewhere?  I tried blocking them, and I got back some users
> that wondered why I was blocking them.
> Is it a new version of IE that permits heavy rapid download?
> --
> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777
> <[EMAIL PROTECTED]> <URL:http://www.stonehenge.com/merlyn/>
> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
> See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl

Hey Randal,

I just tried it using IE5 on NT4.

What you're seeing is when someone has used "Make available offline"
followed by:

"If this favorite links to other pages, would you like to make those pages
available offline too? [y/n] ... Download pages [x] links deep from this

The useragent is this: Mozilla/4.0 (compatible; MSIE 5.0; Windows NT;

It then proceeds to crawl the site with zero wait time between requests.

I haven't inspected the client headers to see whether anything indicates
it's in "crawl" mode. I doubt there is. So...
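
If no header marks "crawl" mode, one option is to spot the behavior itself
on the server side: requests whose user agent contains "DigExt" and that
arrive faster than a person could click. Below is a minimal sketch of that
idea in Python, not anything IE or Apache provides. The class name
RapidFireDetector, the method should_throttle, and the thresholds (10 hits
in 5 seconds) are all made up for illustration, and the caller is assumed
to supply the client address, user-agent string, and a timestamp in
seconds.

```python
from collections import defaultdict, deque


class RapidFireDetector:
    """Hypothetical helper: flag clients whose user agent contains a marker
    string and whose requests exceed a rate limit within a sliding window."""

    def __init__(self, marker="DigExt", max_hits=10, window_seconds=5.0):
        # All three defaults are illustrative assumptions, not measured values.
        self.marker = marker
        self.max_hits = max_hits
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client address -> recent timestamps

    def should_throttle(self, client_ip, user_agent, now):
        """Return True if this request pushes the client over the limit.

        `now` is a timestamp in seconds, passed in by the caller.
        """
        # Requests without the marker are never counted or throttled.
        if self.marker not in user_agent:
            return False

        hits = self._hits[client_ip]
        hits.append(now)

        # Discard timestamps that have fallen out of the sliding window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()

        return len(hits) > self.max_hits
```

A request handler could call should_throttle() once per hit and delay or
refuse the response when it returns True. Because ordinary browsing by the
same user stays under the limit, this avoids blocking everyone whose user
agent carries the string.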

-Jay J
