Maybe you're way ahead of me here, but it just hit me that it would be pretty cool to group the URLs to fetch by host, and then use HTTP/1.1 to reuse the connection and save the initial handshaking overhead. Not a huge deal for a couple of hits, but I think it would make sense for large crawls.
Or maybe keep a pool of open HTTP connections to the last x sites somewhere and check there first. Sound reasonable? Already doing it? I would be willing to help. Just a thought.

Earl
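To make the idea concrete, here is a minimal sketch of such a pool in Python (the crawler itself may well be in another language, and all names here are made up for illustration): an LRU cache of persistent HTTP/1.1 connections keyed by host, evicting the least recently used one when the pool is full.

```python
from collections import OrderedDict
from http.client import HTTPConnection

class ConnectionPool:
    """Keep persistent HTTP/1.1 connections to the last max_hosts hosts."""

    def __init__(self, max_hosts=10):
        self.max_hosts = max_hosts
        self.pool = OrderedDict()  # host -> HTTPConnection, oldest first

    def get(self, host):
        # Reuse an open connection if we have one; mark it most recently used.
        if host in self.pool:
            self.pool.move_to_end(host)
            return self.pool[host]
        # Pool full: evict and close the least recently used connection.
        if len(self.pool) >= self.max_hosts:
            _, old = self.pool.popitem(last=False)
            old.close()
        # The socket isn't opened until the first request is issued.
        conn = HTTPConnection(host)
        self.pool[host] = conn
        return conn
```

A crawler would then do `conn = pool.get(host)`, `conn.request("GET", path)`, read the response, and issue the next request for that host on the same connection; HTTP/1.1 keeps it alive by default, so only the first hit per host pays the TCP handshake.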