RE: Search engine spider
Well, not exactly. I just want to create the spider that is going to search the web for sites. I need it to index the sites on the web, etc., so my search engine has some results to return to users...

-----Original Message-----
From: Dave Lyons [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 13, 2003 7:04 PM
To: CF-Talk
Subject: Re: Search engine spider

I think they are talking about wanting to build a search engine, such as Google or something, correct?
Wanting to hit it big, get rich, then hit us with pay-per-clicks, lol

Archives: http://www.houseoffusion.com/cf_lists/index.cfm?forumid=4
Subscription: http://www.houseoffusion.com/cf_lists/index.cfm?method=subscribe&forumid=4
FAQ: http://www.thenetprofits.co.uk/coldfusion/faq
This list and all House of Fusion resources hosted by CFHosting.com. The place for dependable ColdFusion Hosting.
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=89.70.4
Re: Search engine spider
I think they are talking about wanting to build a search engine, such as Google or something, correct?
Wanting to hit it big, get rich, then hit us with pay-per-clicks, lol

----- Original Message -----
From: "Tilbrook, Peter" <[EMAIL PROTECTED]>
To: "CF-Talk" <[EMAIL PROTECTED]>
Sent: Monday, January 13, 2003 6:57 PM
Subject: RE: Search engine spider

> CFMX has one built-in (using Verity) if that suits your purposes.
RE: Search engine spider
CFMX has one built-in (using Verity) if that suits your purposes.

Peter Tilbrook
Internet Applications Developer
Australian Building Codes Board
GPO Box 9839
CANBERRA ACT 2601
AUSTRALIA

WWW: http://www.abcb.gov.au/
E-Mail: [EMAIL PROTECTED]
Telephone: (02) 6213 6731
Mobile: 0439 401 823
Facsimile: (02) 6213 7287

-----Original Message-----
From: Kris Pilles [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, 14 January 2003 6:39 AM
To: CF-Talk
Subject: Search engine spider

Does anyone know where I can find some code for a search engine spider??? Or an existing one that I can check out???

Thanks

Kris Pilles
Website Manager
Western Suffolk BOCES
507 Deer Park Rd., Building C
Phone: 631-549-4900 x 267
E-mail: [EMAIL PROTECTED]
RE: Search engine spider
Thanks

-----Original Message-----
From: Ben Doom [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 13, 2003 3:49 PM
To: CF-Talk
Subject: RE: Search engine spider

If by 'index the links' you mean generate a list of them, it seems to me the following would work pretty well.

Create an empty array.
Populate the first entry with your starting URL.
Set a counter 'count' to 1.
loop while count is less than or equal to the arraylen:
    cfhttp the url in the array at index count
    Grab all the links.
    loop over the links:
        if the link does not exist in the array, add it.
    /loop over links
    increment the counter
/loop over array
Output the links.

If you write good modular code (especially where it grabs the links) you can edit it later to do whatever processing or searching you want.

I know it's not code, but at least it should get you started should you not find what you want ready-made. If you want more help, feel free to contact me off-list (unless others are interested in this thing).

-- Ben Doom
Programmer & General Lackey
Moonbow Software, Inc
RE: Search engine spider
If by 'index the links' you mean generate a list of them, it seems to me the following would work pretty well.

Create an empty array.
Populate the first entry with your starting URL.
Set a counter 'count' to 1.
loop while count is less than or equal to the arraylen:
    cfhttp the url in the array at index count
    Grab all the links.
    loop over the links:
        if the link does not exist in the array, add it.
    /loop over links
    increment the counter
/loop over array
Output the links.

If you write good modular code (especially where it grabs the links) you can edit it later to do whatever processing or searching you want.

I know it's not code, but at least it should get you started should you not find what you want ready-made. If you want more help, feel free to contact me off-list (unless others are interested in this thing).

-- Ben Doom
Programmer & General Lackey
Moonbow Software, Inc

: -----Original Message-----
: From: Kris Pilles [mailto:[EMAIL PROTECTED]]
: Sent: Monday, January 13, 2003 3:29 PM
: To: CF-Talk
: Subject: RE: Search engine spider
:
: Basically, I need to create a search engine, and the spider would be
: used to index links for the search engine.
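For anyone who wants something closer to code, the loop above can be sketched roughly as follows. This is a Python sketch rather than CFML, and everything in it is an assumption for illustration: the names `crawl`, `extract_links`, `fetch_page` and `LinkParser` are made up, `max_pages` is an added safety cap that is not in the outline, and a `seen` set stands in for "if the link does not exist in the array". Python lists are 0-indexed, so the counter starts at 0 instead of 1.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) links found in html, fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    links = []
    for href in parser.hrefs:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if absolute.startswith(("http://", "https://")):
            links.append(absolute)
    return links


def fetch_page(url):
    """Default fetcher (plays the role of cfhttp in the outline)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, max_pages=50):
    """Walk the growing URL list until it is exhausted or max_pages is hit."""
    urls = [start_url]      # the array, seeded with the starting URL
    seen = {start_url}      # fast "is it already in the array?" check
    count = 0
    while count < len(urls) and count < max_pages:
        try:
            html = fetch(urls[count])
        except Exception:
            html = ""       # skip pages that fail to load
        for link in extract_links(urls[count], html):
            if link not in seen:
                seen.add(link)
                urls.append(link)
        count += 1
    return urls
```

Calling `crawl("http://www.example.com/")` would return the list of URLs discovered. Passing your own `fetch` function is one way to try the loop without touching the network, and the spot where `extract_links` is called is where any later indexing or keyword processing could be hooked in.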
RE: Search engine spider
Basically, I need to create a search engine, and the spider would be used to index links for the search engine.

-----Original Message-----
From: Jerry Johnson [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 13, 2003 3:24 PM
To: CF-Talk
Subject: RE: Search engine spider

What do you mean by search engine spider (just to make sure we are all on the same page)?

I am guessing you want to point a "spider" at a particular website, and get back all the links from that page. Then point to each linked page and get back all links from it, continuing until you run out of links?

Jerry Johnson
RE: Search engine spider
What do you mean by search engine spider (just to make sure we are all on the same page)?

I am guessing you want to point a "spider" at a particular website, and get back all the links from that page. Then point to each linked page and get back all links from it, continuing until you run out of links?

Jerry Johnson

>>> [EMAIL PROTECTED] 01/13/03 03:00PM >>>
Well, ASP, .NET or CF would work...

I need to write a search engine spider... I am just looking for a basic one to work with, but in the end I need to be able to start it off somewhere and let it go, indexing pages along the way in our datasource.

Eventually I will work on refinement so that I can send it out to look for certain keywords, etc.
RE: Search engine spider
On another note, using the same techniques, I have a keyword suggestion tool that looks at an Overture page, gets all of the keywords that are derivatives of the keyword you enter, and returns a total for the given month. Just thought I'd throw that out there as well... I think it's most useful for small web dev firms analyzing which keywords they want to optimize their sites for.

http://www.globalpromoter.com/keyword_suggestion_tool.cfm

Happy cfftp-ing!!!
~Jason

-----Original Message-----
From: Dowdell, Jason G [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 13, 2003 2:49 PM
To: CF-Talk
Subject: RE: Search engine spider

On one of my sites I have a script that cfhttp's a URL, strips out all of the meta information, body and such, and does calculations such as keyword density on them, but it doesn't index an entire site. It just does one URL at a time. I don't think it's what you're looking for, but you can try it out and let me know if you need the code.

http://www.GlobalPromoter.com//meta_spider.cfm

If you want more detailed stats you can sign up for a free account and use the "site readiness tool". When you're finished using it I can delete your account so you're not in the database if you're worried about privacy. Plus I own the site, so you have my word your info won't be shared.

~Jason
RE: Search engine spider
Well, ASP, .NET or CF would work...

I need to write a search engine spider... I am just looking for a basic one to work with, but in the end I need to be able to start it off somewhere and let it go, indexing pages along the way in our datasource.

Eventually I will work on refinement so that I can send it out to look for certain keywords, etc.

-----Original Message-----
From: Ben Doom [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 13, 2003 2:54 PM
To: CF-Talk
Subject: RE: Search engine spider

I suppose that depends on what language you want the code in, what platform, what features, etc.

I know I wrote a web link mapper (built a map of all the links in a site) in Perl in a couple of hours. I also wrote a doohickey that would scan a site for given text, mining /n/ layers deep, etc., and it didn't take too much longer, as I recall. Of course, I don't have the code for either anymore. :-)

Can you be more specific about what you're trying to do? Maybe someone will be able to help out a bit more.

-- Ben Doom
Programmer & General Lackey
Moonbow Software, Inc
RE: Search engine spider
I suppose that depends on what language you want the code in, what platform, what features, etc.

I know I wrote a web link mapper (built a map of all the links in a site) in Perl in a couple of hours. I also wrote a doohickey that would scan a site for given text, mining /n/ layers deep, etc., and it didn't take too much longer, as I recall. Of course, I don't have the code for either anymore. :-)

Can you be more specific about what you're trying to do? Maybe someone will be able to help out a bit more.

-- Ben Doom
Programmer & General Lackey
Moonbow Software, Inc

: -----Original Message-----
: From: Kris Pilles [mailto:[EMAIL PROTECTED]]
: Sent: Monday, January 13, 2003 2:39 PM
: To: CF-Talk
: Subject: Search engine spider
:
: Does anyone know where I can find some code for a search engine
: spider??? Or an existing one that I can check out???
:
: Thanks
:
: Kris Pilles
: Website Manager
: Western Suffolk BOCES
: 507 Deer Park Rd., Building C
: Phone: 631-549-4900 x 267
: E-mail: [EMAIL PROTECTED]
RE: Search engine spider
On one of my sites I have a script that cfhttp's a URL, strips out all of the meta information, body and such, and does calculations such as keyword density on them, but it doesn't index an entire site. It just does one URL at a time. I don't think it's what you're looking for, but you can try it out and let me know if you need the code.

http://www.GlobalPromoter.com//meta_spider.cfm

If you want more detailed stats you can sign up for a free account and use the "site readiness tool". When you're finished using it I can delete your account so you're not in the database if you're worried about privacy. Plus I own the site, so you have my word your info won't be shared.

~Jason

-----Original Message-----
From: Kris Pilles [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 13, 2003 2:39 PM
To: CF-Talk
Subject: Search engine spider

Does anyone know where I can find some code for a search engine spider??? Or an existing one that I can check out???

Thanks

Kris Pilles
Website Manager
Western Suffolk BOCES
507 Deer Park Rd., Building C
Phone: 631-549-4900 x 267
E-mail: [EMAIL PROTECTED]
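The code behind the meta spider isn't posted in this thread, so the following is only a guess at what a keyword density calculation might look like, written as a Python sketch rather than CFML. It assumes density means "occurrences of a single-word keyword divided by the total number of words in the page text", which may not be the formula the site uses. The names `keyword_density` and `TextExtractor` are hypothetical.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects page text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def keyword_density(html, keyword):
    """Fraction of words in the page text equal to keyword (case-insensitive)."""
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(extractor.chunks).lower()
    words = re.findall(r"[a-z0-9']+", text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)
```

Note that this counts the `<title>` text along with the body and only handles one-word keywords; a real tool would probably report title, meta tags and body separately and support phrases.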