>>>>> "TZ" == Ted Zlatanov <[EMAIL PROTECTED]> writes:

  >>> You misunderstand.  If registration is required, a crawler will fail
  >>> anyway,
  >> 
  >> Unless the crawler is itself registered.  If I wrote a crawler, I'd
  >> keep a database of usernames and passwords for this purpose.

  TZ> That's not a typical web crawler, and obviously not what I meant.
  TZ> Such databases already exist (e.g. bugmenot) but using them to rip a
  TZ> page is definitely abusive.  Think Google, not rip-off.

i wrote a crawler for a client that did just that (it even had a paid
registration for the Wall Street Journal). it was specifically crawling
newspapers and publications only, so it had to register for some. it was
meant for commercial use only, not for archiving or public access. hard
to say whether it violated any policies, but that was their problem, not
mine. and i don't think they ever went live.

uri

-- 
Uri Guttman  ------  [EMAIL PROTECTED]  -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs  ----------------------------  http://jobs.perl.org
_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm
