Google doesn't put quotes around most attribute values. The following works
(it handles single quotes, double quotes, or no quotes at all).
Watch out for your mail client wrapping the regular expressions. It lets you
find the value of one attribute in one or more tags (see the examples).
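
The missing quotes are also why the pattern in the quoted message below comes back with pos and len of 0: it insists on double quotes around src. A quick way to check, using a made-up tag (the sample string and variable names are only for illustration, not Google's actual markup):

```cfml
<cfscript>
// Hypothetical sample tag with an unquoted src value (assumed for illustration)
unquotedTag = '<img src=/logo.gif width=276 height=110>';
// Same pattern as in the quoted message; it requires src="..."
result = REFindNoCase('<img.*?src="(.*?)".*?>', unquotedTag, 1, true);
</cfscript>
<cfdump var="#result#">
```

With no match, REFindNoCase returns pos and len arrays whose first element is 0, which is what the dump in the original question showed.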

<cfscript>
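function GetAttributeValue(str,tag,attr){
        // Group 1 is the tag name; group 2 is the attribute value,
        // which may be single-quoted, double-quoted, or unquoted.
        var regexp = "<(#tag#)\s[^>]*#attr#=('.*?'|"".*?""|[^\s>]+)[^>]*>";
        var aReturn = ArrayNew(1);
        var start = 1;
        var stTmp = StructNew();

        while(true){
                stTmp = REFindNoCase(regexp,str,start,true);
                // pos[1] is 0 when there are no more matches
                if(stTmp.pos[1] IS 0) break;
                // pos[3]/len[3] locate group 2; strip any surrounding quotes
                ArrayAppend(aReturn,
                        REReplace(Mid(str,stTmp.pos[3],stTmp.len[3]),"^['""](.*)['""]$","\1"));
                // continue searching after the end of this match
                start = stTmp.pos[1] + stTmp.len[1];
        }

        return aReturn;
}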
</cfscript>
<cfhttp url="http://www.google.com/" throwonerror="yes"></cfhttp>
<cfoutput>#HTMLCodeFormat(cfhttp.filecontent)#</cfoutput>
<cfdump var="#GetAttributeValue(cfhttp.filecontent,'a','href')#">
<cfdump var="#GetAttributeValue(cfhttp.filecontent,'img','src')#">
<cfdump var="#GetAttributeValue(cfhttp.filecontent,'a|td','class')#">
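
If you'd rather test without hitting Google, here is a small sketch that runs the function above against a hand-made string covering all three quoting styles. The sample markup and the variable names are my own assumptions for illustration:

```cfml
<cfscript>
// Hypothetical sample markup: double-quoted, single-quoted, and unquoted href values
sampleHtml = '<a href="one.html">1</a> <a href=''two.html''>2</a> <a class=nav href=three.html>3</a>';
// Relies on GetAttributeValue() as defined earlier in this message
hrefs = GetAttributeValue(sampleHtml, 'a', 'href');
</cfscript>
<cfdump var="#hrefs#">
```

The dump should show a three-element array (one.html, two.html, three.html) with the surrounding quotes stripped.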

Pascal

> -----Original Message-----
> From: Burns, John D [mailto:[EMAIL PROTECTED]
> Sent: 22 March 2005 22:59
> To: CF-Talk
> Subject: RE: regex help for grabbing values of html tag attributes
> 
> Ben,
> 
> I can see what you've got (I think) and it makes sense, but for some
> reason, it's not working.  I'm grabbing the html from www.google.com and
> running it on it and this is what I've got in my code:
> 
> #refindnocase('<img.*?src="(.*?)".*?>',cfhttp.fileContent,0,true)#
> 
> I'm using <cfdump to display that info and what I see are 2 arrays (len
> and pos) and both have values of 1 and 0.  I thought that if the first
> value was 1, the second value would be the position of the occurrence of
> the search string.  I know google has an image, and I'm displaying the
> cfhttp.filecontent in a textarea above so that I can ensure the results
> are coming back as expected.  Any ideas?  Am I doing something wrong?
> 
> 
> John Burns
> Certified Advanced ColdFusion MX Developer
> Wyle Laboratories, Inc. | Web Developer
> 
> 
> -----Original Message-----
> From: Ben Doom [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 22, 2005 4:54 PM
> To: CF-Talk
> Subject: Re: regex help for grabbing values of html tag attributes
> 
> Well, I see a couple of problems with what you're using.  First, you've
> not got a closing " on the attribute.  Second, you've wrapped a regex
> that contains a " in ""'s, which will error out if you don't escape the
> inner "'s.  You can wrap it with single quotes to fix that.  Also, the
> last * boggles me.  I don't know why it's there.
> 
> Or, try this:
> 
> '<#tag#.*?#att#="(.*?)".*?>'
> 
> where (should be obvious) tag and att are defined as the tag and
> attribute you want.  Please note that if you define them as "span" and
> "class" and you have this:
> <span>stuff in between<span class="bob"> the "whole tag" match will
> return both span tags and the stuff in between.  The attribute match
> will return bob.  So, if this might be the case, lemme know and we'll
> tweak the regex.
> 
> Not tested, your miles may vary, trix are for kids, etc.
> 
> --Ben
> 
> Burns, John D wrote:
> > 6.1.  I was looking at the archives and have come up with this but
> > it's erroring....
> >
> > I'm using the img instance because it's easier to test on pages that
> > have multiple images...
> >
> > #refindnocase("<img[^>]*src="([^"]*)*>",cfhttp.fileContent,0,true)#
