Re: nofollow regex
Wowsers.. Thanks Peter.. I looked at Adrian's code yesterday to see if I could modify it to handle all the complex examples. I'll test your code today and let you know how it goes. For as gracefully as WordPress and some forum software handle this, it sure is complex to implement.

~|
Adobe® ColdFusion® 8 software 8 is the most important and dramatic release to date
Get the Free Trial
http://ad.doubleclick.net/clk;207172674;29440083;f
Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:317994
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/cf_lists/unsubscribe.cfm?user=11502.10531.4
nofollow regex
Hey folks,

Since I got no love in the RegEx forum, I'm posting here to get a few more eyeballs on a question I'm struggling with. I'm looking for a working rel="nofollow" regex to modify links. For example, taking:

    Goto <a href="http://google.com">Google</a> now!

and turning it into:

    Goto <a href="http://google.com" rel="nofollow">Google</a> now!

The best solution I've found so far is http://www.sitecritic.net/articleDetail.php?id=242 , but that is a PHP solution. Any ideas on converting it to ColdFusion? The PHP solution covers a lot of scenarios (extra attributes, single quotes instead of double quotes, etc.), so that would be ideal.

Thanks!
RE: nofollow regex
Does it have to be a server-side solution? jQuery would make this a snap:

    $(document).ready(function(){
        $('a[href^=http]').attr('rel','nofollow');
    });

    <a href="/somepage.html">This is an internal link</a>
    <br><br>
    <a href="http://google.com">And this is an external link, with no follow</a>

-----Original Message-----
From: Jeff Becker [mailto:jpbec...@yahoo.com]
Sent: Wednesday, January 14, 2009 8:58 AM
To: cf-talk
Subject: nofollow regex
RE: nofollow regex
Ha! That would be great, but search engines won't see that, which is the point really.

Using that PHP regex you pointed to:

    <cfsavecontent variable="html">
    Goto <a href="http://google.com">No</a> now!
    and turning it into:
    Goto <a href="http://google.com" rel="nofollow">~Yes</a> now!
    </cfsavecontent>

    <cfoutput>
    <pre>#HTMLEditFormat(html)#</pre>

    <!--- Regex taken from the PHP article; quotes are doubled for the CF string --->
    <cfset re = "<[\s]*a[\s]*href=[\s]*[\""\']?([\w.-]*)[\""\']?[^>]*>(.*?)<\/a>">
    <cfset matches = REMatch(re, html)>
    <cfdump var="#matches#">

    <cfloop array="#matches#" index="match">
        <p>#HTMLEditFormat(match)#</p>
        <!--- Only touch links that do not mention nofollow yet --->
        <cfif NOT FindNoCase("nofollow", match)>
            <cfset newLink = Replace(match, ">", " rel=""nofollow"">", "ONE")>
            <cfset html = ReplaceNoCase(html, match, newLink)>
        </cfif>
    </cfloop>

    <pre>#HTMLEditFormat(html)#</pre>
    </cfoutput>

Not as nice as a one hit RegEx but seems to get the job done :)

Adrian

-----Original Message-----
From: Andy Matthews [mailto:li...@commadelimited.com]
Sent: 14 January 2009 15:27
To: cf-talk
Subject: RE: nofollow regex
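For comparison with the "one hit RegEx" remark, here is a minimal sketch of a single-pass replace. It is written in JavaScript only because the logic is easy to run there; it is not a drop-in for the CFML above. The function name `addNofollowWhereMissing` is hypothetical, and the sketch assumes there is no ">" inside attribute values. Tags that already have a rel attribute (even a junk one) are left untouched.

```javascript
// Hypothetical helper (not from the thread): add rel="nofollow" to every
// opening <a> tag that has no rel attribute at all.
// Assumption: no ">" appears inside attribute values.
function addNofollowWhereMissing(html) {
  return html.replace(/<a\b[^>]*>/gi, (tag) => {
    // A real rel attribute is preceded by whitespace; leave such tags alone.
    if (/\srel\s*=/i.test(tag)) return tag;
    // Drop the closing ">" and re-add it after the new attribute.
    return tag.slice(0, -1) + ' rel="nofollow">';
  });
}

console.log(addNofollowWhereMissing('Goto <a href="http://google.com">Google</a> now!'));
// prints: Goto <a href="http://google.com" rel="nofollow">Google</a> now!
```

The `\b` after `<a` keeps the pattern from matching other tags that merely start with the letter a, such as `<abbr>`.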
Re: nofollow regex
Yes it does. This is for validation on a blog or forum, etc. For the spam comments/posts that do sneak by, it would be nice to have server-side validation that makes sure rel="nofollow" gets added. I thought about client side, but again, ideally I'm after server-side validation and formatting.
RE: nofollow regex
Ahhh... I gotcha. That does sort of put a damper on it, doesn't it? Oh well. I had fun whipping that out.

-----Original Message-----
From: Jeff Becker [mailto:jpbec...@yahoo.com]
Sent: Wednesday, January 14, 2009 9:43 AM
To: cf-talk
Subject: Re: nofollow regex
Re: nofollow regex
Adrian,

That's very nice. Thanks for that. I made one minor correction, in this line:

    <cfset newLink = Replace(match, ">", " rel=""nofollow"">", "ONE")>

I'm starting to run more complicated examples and have two issues. Running:

    <cfsavecontent variable="html">
    Goto <a href="http://google.com">Google</a> now!<BR>
    Also hit up <a href='http://movies.com' rel='junkrel'>movies</a><BR><BR>
    Don't forget <a href="http://coffee.com" title="Great Coffee"> GRRREAT COFFEE </a>
    </cfsavecontent>

Two items:

First, is there any way to remove that rel='junkrel'? Again, the concern is spammers. I think fakerel="nofollow" might get past the check as well.

Second, are there any issues in the movies example with switching from single quotes to double quotes ONLY on the rel="nofollow"? I'm thinking that might be OK for search engine spiders.

Thanks again!
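The fakerel concern can be checked with a small experiment. The sketch below (JavaScript, used only to illustrate the logic) compares a substring test like FindNoCase("nofollow", match) with a stricter test that looks at the value of a real rel attribute. Both function names and the stricter pattern are assumptions for illustration, not code from the thread.

```javascript
// Mirrors the substring test: true if "nofollow" appears anywhere in the link.
function looksNofollowedNaive(link) {
  return link.toLowerCase().includes('nofollow');
}

// Stricter check (an assumption, not the thread's code): only the opening tag
// is inspected, and a real rel attribute must list "nofollow" as a token.
function hasRelNofollow(link) {
  const openTag = link.slice(0, link.indexOf('>') + 1);
  const m = /\srel\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i.exec(openTag);
  if (!m) return false;
  const value = m[1] || m[2] || m[3] || '';
  return value.toLowerCase().split(/\s+/).includes('nofollow');
}

const spam = '<a href="http://spam.example" fakerel="nofollow">spam</a>';
console.log(looksNofollowedNaive(spam)); // true  (fooled, so no rel gets added)
console.log(hasRelNofollow(spam));       // false (link still needs rel="nofollow")
```

The substring test is also fooled by link text, for example a link whose visible text contains the word "nofollow".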
Re: nofollow regex
It's been a long day, so I may well have missed something, but here we go anyway...

One caveat: this assumes at least semi-valid markup, in so far as there is no > inside the A opening tag/attributes. (Not impossible to fix, but since it's not valid I can't be bothered going that far right now.)

    <cffunction name="setHyperlinkRel" returntype="String" output="false">
        <cfargument name="LinkCode"  type="String"/>
        <cfargument name="RelValue"  type="String"/>
        <cfargument name="Append"    type="Boolean" default="true"/>
        <cfargument name="Delimiter" type="String"  default=","/>

        <!--- Split the link at the first > into opening tag and remainder --->
        <cfset var Head = ListFirst(Arguments.LinkCode,'>')/>
        <cfset var Tail = ListRest(Arguments.LinkCode,'>')/>
        <cfset var RelAttr = rematch( '(?ims)\brel=(?:\S+|(["'']).*?)(?=\1(?:\s|$))' , Head )/>

        <cfif ArrayLen(RelAttr)>
            <cfif NOT find(RelValue,RelAttr[1])>
                <cfset Head = replace
                    ( Head
                    , RelAttr[1]
                    , ListAppend( RelAttr[1] , Arguments.RelValue , Arguments.Delimiter )
                    )/>
            </cfif>
        <cfelse>
            <cfset Head = Head & ' rel="#Arguments.RelValue#"' />
        </cfif>

        <cfreturn Head & '>' & Tail />
    </cffunction>

    <cffunction name="addRelNofollow" returntype="String" output="false">
        <cfargument name="InputText" type="String"/>
        <cfset var Result = Arguments.InputText/>
        <cfset var NewHyperlink = 0/>
        <cfset var i = 0/>
        <cfset var Hyperlinks = rematch( '(?ims)<a[^>]+>.*?</a>' , Arguments.InputText )/>

        <cfloop index="i" from="1" to="#ArrayLen(Hyperlinks)#">
            <cfset NewHyperlink = setHyperlinkRel( Hyperlinks[i] , 'nofollow' ) />
            <cfset Result = replace( Result , Hyperlinks[i] , NewHyperlink )/>
        </cfloop>

        <cfreturn Result />
    </cffunction>

    <cfset NewContent = addRelNofollow( OldContent ) />

That works on all the various examples I've tried so far - let me know if there's anything missed and I'll update it.

Needs ColdFusion 8 or Railo 3 for the rematch calls (I can do one with Java regex if people need it working with other engines).
Re: nofollow regex
I knew I wasn't fully awake - I forgot to implement the append/overwrite functionality.

Below is an updated version of the first function that allows you to overwrite rel, rather than appending onto it. (You need to update the function call in the second function to actually turn this on.)

Not entirely happy with how I've done the overwriting, but it works with the examples I tried.

    <cffunction name="setHyperlinkRel" returntype="String" output="false">
        <cfargument name="LinkCode"  type="String"/>
        <cfargument name="RelValue"  type="String"/>
        <cfargument name="Append"    type="Boolean" default="true"/>
        <cfargument name="Delimiter" type="String"  default=","/>

        <cfset var Head = ListFirst(Arguments.LinkCode,'>')/>
        <cfset var Tail = ListRest(Arguments.LinkCode,'>')/>
        <cfset var RelAttr = rematch( '(?ims)\brel=(?:\S+|(["'']).*?)(?=\1(?:\s|$))' , Head )/>

        <cfif ArrayLen(RelAttr)>
            <cfif Arguments.Append>
                <!--- Append mode: add the value unless it is already present --->
                <cfif NOT find(RelValue,RelAttr[1])>
                    <cfset Head = replace
                        ( Head
                        , RelAttr[1]
                        , ListAppend( RelAttr[1] , Arguments.RelValue , Arguments.Delimiter )
                        )/>
                </cfif>
            <cfelse>
                <!--- Overwrite mode: replace the whole existing rel attribute --->
                <cfset Head = rereplace
                    ( Head
                    , RelAttr[1] & '["'']?'
                    , 'rel="#Arguments.RelValue#"'
                    )/>
            </cfif>
        <cfelse>
            <cfset Head = Head & ' rel="#Arguments.RelValue#"' />
        </cfif>

        <cfreturn Head & '>' & Tail />
    </cffunction>
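To make the append/overwrite distinction concrete, here is a rough JavaScript analogue of the idea, offered as a sketch rather than a port of the function above. The name `setRel` and its pattern are hypothetical. One deliberate difference: it joins values with a space instead of the comma default above, since HTML treats rel as a space-separated list of tokens.

```javascript
// Hypothetical analogue of setHyperlinkRel; operates on an opening <a ...> tag.
// Assumption: the tag ends with ">" and has no ">" inside attribute values.
function setRel(openTag, relValue, append = true) {
  const relPattern = /(\s)rel\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i;
  const m = relPattern.exec(openTag);
  if (!m) {
    // No rel attribute yet: add one before the closing ">".
    return openTag.slice(0, -1) + ` rel="${relValue}">`;
  }
  const existing = (m[2] || m[3] || m[4] || '').split(/\s+/).filter(Boolean);
  const present = existing.some((t) => t.toLowerCase() === relValue.toLowerCase());
  // Append keeps the old tokens; overwrite discards them.
  const tokens = append ? (present ? existing : [...existing, relValue]) : [relValue];
  return openTag.replace(relPattern, (_, ws) => `${ws}rel="${tokens.join(' ')}"`);
}

const tag = "<a href='http://movies.com' rel='junkrel'>";
console.log(setRel(tag, 'nofollow'));        // <a href='http://movies.com' rel="junkrel nofollow">
console.log(setRel(tag, 'nofollow', false)); // <a href='http://movies.com' rel="nofollow">
```

Overwrite mode is the one that addresses the junkrel concern raised earlier in the thread, because nothing the poster supplied in rel survives.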
Re: nofollow regex
Spotted another problem:

Currently it avoids fakerel="nofollow" but doesn't avoid fake-rel="nofollow".

Possibly changing \b to \s in the first regex will work, but that needs testing.
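The difference between the two anchors can be seen with a quick test. The sketch below uses JavaScript and only the leading part of the pattern (`rel=`); the function names are made up for illustration. For these inputs, \b and \s should behave the same way in other Perl-style regex engines, but that is worth confirming in ColdFusion itself.

```javascript
// Leading part of the rel-matching pattern, with the two candidate anchors.
const relWithWordBoundary = /\brel=/i; // current version
const relWithWhitespace = /\srel=/i;   // proposed change

function matchesCurrent(head) { return relWithWordBoundary.test(head); }
function matchesProposed(head) { return relWithWhitespace.test(head); }

const heads = [
  '<a href="http://a.example" rel="junk"',
  '<a href="http://a.example" fakerel="nofollow"',
  '<a href="http://a.example" fake-rel="nofollow"',
];

for (const head of heads) {
  console.log(matchesCurrent(head), matchesProposed(head), head);
}
// true  true   ...rel="junk"
// false false  ...fakerel="nofollow"
// true  false  ...fake-rel="nofollow"   (the hyphen counts as a word boundary)
```

One thing to check when testing the change: with \s the match itself begins with the whitespace character, so the overwrite branch, which replaces the whole match with rel="...", would presumably need to put that space back.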