haha.... now you're getting into some <cf_history>
This was one of the longest "Me too" threads I have ever seen in the forums,
about 3-4 years ago. When the first <cfx_http> came out (this was before
<cfhttp> was built into CF), I decided I was going to build the first
web spider in CF.
So I posted to the Allaire forums that I was going to build one and give
out the code, and asked if anyone wanted it.... I swear there must have
been 150 responses that just said: "I do" or "me too".
Anyway, building it was pretty easy: I just had a table storing the URLs
that it had been to and the ones it still needed to go to. It would get
the next one in the list, go to that one, grab the content, and extract
the links and whatever else you wanted. That was when I came up with the
name "secretagents.com", though I only recently made use of the domain name.
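
For anyone who wants to see the shape of that loop, here is a minimal sketch in Python rather than CF. It is not the original spider's code: the names `LinkExtractor`, `fetch`, and `crawl` are made up for illustration, an in-memory queue and set stand in for the database table, and the `max_pages` cap is an assumption added so the loop always stops.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch(url):
    """Download a page as text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch_page=fetch, max_pages=50):
    """Visit pages in queue order and return {url: [links found on it]}."""
    pending = deque([start_url])   # URLs the spider still needs to go to
    seen = {start_url}             # URLs already queued or visited
    results = {}
    while pending and len(results) < max_pages:
        url = pending.popleft()            # get the next one in the list
        try:
            html = fetch_page(url)         # grab the content
        except Exception:
            results[url] = []              # record the failure and move on
            continue
        parser = LinkExtractor()
        parser.feed(html)
        links = []
        for href in parser.links:          # extract the links
            absolute = urldefrag(urljoin(url, href)).url
            if absolute.startswith(("http://", "https://")):
                links.append(absolute)
                if absolute not in seen:
                    seen.add(absolute)
                    pending.append(absolute)
        results[url] = links
    return results
```

Passing your own `fetch_page` function lets you try the loop against canned pages without touching the network. A real spider would probably also want to stay on one host and respect robots.txt, which this sketch leaves out.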
I made a pretty wild tag for dealing with the text parsing. I don't even
remember if it works... check it out:
http://devex.allaire.com/developer/gallery/info.cfm?ID=CA34711B-2830-11D4-AA9700508B94F380&method=Full
With this tag you can basically grab anything out of a tag-based
document.
The only downside... is that spiders in CF are pretty slow, so you might
want to look for a multi-threaded application. I think many of them are
free. They'll do 30-40 pages in the time CF will do 1.
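
To show what "multi-threaded" buys you, here is a rough Python sketch that fetches a batch of pages through a thread pool, so several downloads wait on the network at once. The names `fetch` and `fetch_all` and the default of 8 workers are assumptions for illustration, not taken from any particular spider, and the actual speedup depends on network latency and the servers involved.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url):
    """Download a page as text (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_all(urls, fetch_page=fetch, workers=8):
    """Fetch many URLs concurrently; return {url: text, or None on failure}."""
    urls = list(urls)

    def safe_fetch(url):
        try:
            return fetch_page(url)
        except Exception:
            return None  # one bad page should not stop the whole batch

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input URLs
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

A single-threaded spider spends most of its time waiting on each response in turn; the pool overlaps those waits, which is where most of the difference comes from.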
Steve
Shane Witbeck wrote:
>
> Just curious if anyone has developed any CF-based spiders or bots?
>
> shane witbeck
> webmaster
> ------------------------------------------------------------------------------
> To Unsubscribe visit
>http://www.houseoffusion.com/index.cfm?sidebar=lists&body=lists/fusebox or send a
>message to [EMAIL PROTECTED] with 'unsubscribe' in the body.
--
Steve Nelson
http://www.SecretAgents.com
Tools for Fusebox Developers