An intranet in Tennessee recently downloaded my friend's entire site about 4 times in the same day -- they apparently got their settings screwed up, and it was cycling thru recursively.... That's another thing the conscientious user of these programs should watch for, along with the depth of capture. Yet another is to make sure you don't allow the program to follow links onto other sites endlessly, if for no other reason than that you don't want to clog up your own hard disk with what will be garbage for your original purposes.
And then often the programs themselves are at fault, for example by not respecting the robot settings of the webpages.
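For anyone who rolls their own downloader, here is a rough sketch in Python of the checks I mean: a depth limit, staying on the original site, not fetching the same page twice, and honoring robots.txt. The names MAX_DEPTH, USER_AGENT, make_robots and should_fetch are made up for illustration, and the limit of 3 is an arbitrary assumption; this is not how any particular capture program works. Only the standard-library robots.txt parser is real.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

MAX_DEPTH = 3                   # assumed limit on how many links deep to follow
USER_AGENT = "my-mirror-bot"    # hypothetical name the downloader announces


def make_robots(robots_txt):
    """Build a robots.txt checker from the text of the file (already downloaded)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def should_fetch(url, start_url, depth, robots, seen):
    """Return True only if the page passes every politeness check."""
    if depth > MAX_DEPTH:
        return False            # too deep: stop the recursion here
    if urlparse(url).netloc != urlparse(start_url).netloc:
        return False            # link leads onto another site
    if url in seen:
        return False            # already captured: avoids cycling through again
    return robots.can_fetch(USER_AGENT, url)   # respect the site's robot settings
```

The downloader would call should_fetch before every request, and add each URL to the seen set once it has been captured.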
And not to worry, Terry, you never impugned my veracity -- yet another thing I hadn't thought of -- but I fear you had an agenda in mind that just didn't relate to my post, and that sort of got in the way! I may not be the clearest writer, but Dave and John both understood me quite clearly. My post had nothing whatsoever to do with intellectual property; to dot the I's and cross the T's for you, I pointed out that in fact I had mentioned items that were in the public domain.
As you point out, it is both useful and legitimate to scoop up a website for personal use, or portions of a website. For an intranet, it's a little less clear-cut, but on balance, it surely must fall under what in US law is called "limited distribution": a teacher usually has the right to photocopy copyrighted work in limited quantities and use the photocopies as handouts. You therefore need not feel threatened as to the legality of what you're doing: it looks like someone once gave you a hard time; there are people who specialize in that kind of thing, not a pretty tactic. Anyhoo, relax and don't let that worry you.
On the other hand, you are completely mistaken, both in the UK and in the US, about the nature of copyright. It applies regardless of the substrate: film, sculpture (as of a dial for example), webpages, music performed yet not printed as a score, all are every bit as protected as books. With regard to archive.org, the fact that the webpages are no longer out there doesn't change anything. The fact that exceptions are made for "limited distribution," "fair use," parody and several other specialized cases doesn't invalidate copyright: it confirms copyright, else these would not be exceptions.
-- Bill -