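.* is greedy -- it will match everything from the first table tag to the 
last "Code 2" on the page.  .*? is non-greedy, assuming you are using CFMX or 
later, so it stops at the first "Code 2" instead.  Either way the match still 
starts at the leftmost <table, so to land on the table you want, the pattern 
also has to refuse to cross another <table tag on its way to "Code 2".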
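
To make the difference concrete, here is a rough sketch in Python rather than CFML, chosen only because it is easy to run standalone. The helper names (make_table, start_position) and the "tempered" pattern with a negative lookahead are illustrative assumptions, not something from the original question. The sample HTML is a shortened version of the three tables quoted below. The same pattern text should be usable with REFindNoCase on a CFMX-era engine that supports lookahead, but that is untested here.

```python
import re

def make_table(code):
    # Shortened stand-in for one of the tables in the question.
    return ("<TABLE>\n  <TR>\n    <TD>%s</TD>\n"
            "    <TD>some stuff</TD>\n  </TR>\n</TABLE>\n" % code)

# Three tables, "Code 1" through "Code 3", separated by blank lines.
tmp_html = "\n".join(make_table("Code %d" % n) for n in (1, 2, 3))

# DOTALL lets "." cross newlines; IGNORECASE mimics REFindNoCase.
FLAGS = re.IGNORECASE | re.DOTALL

GREEDY = re.compile(r"<table.*Code 2", FLAGS)
LAZY = re.compile(r"<table.*?Code 2", FLAGS)
# Each consumed character must not begin another "<table",
# so the match cannot start at an earlier table.
TEMPERED = re.compile(r"<table(?:(?!<table).)*?Code 2", FLAGS)

def start_position(pattern, text):
    """Return a 1-based start position, or 0 when nothing matches."""
    match = pattern.search(text)
    return match.start() + 1 if match else 0

if __name__ == "__main__":
    for name, pattern in (("greedy", GREEDY), ("lazy", LAZY),
                          ("tempered", TEMPERED)):
        print(name, start_position(pattern, tmp_html))
```

The greedy and lazy patterns both report the first table, because a regex engine always prefers the leftmost possible start. Only the tempered pattern moves the start to the table that holds "Code 2".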

--Ben Doom

JJ Cool wrote:
> I have a newbie regex question. I can't seem to find a solution through 
> google, koders.com, krugle.com, (google.com/codesearch), table scraping 
> regex searches. I'm having a little trouble coming up with a regex that will 
> allow me to scrape some data from a webpage retrieved with the <cfhttp> tag. 
> For example, say that I want to scrape the middle table with "Code 2" in it, 
> like below. Also, assume that the webpage could change and the table will 
> change positions and that random text will appear between the tables, and 
> that I will always be able to find the start position of the table by a 
> unique column header.
> 
> <TABLE>
>       <TR>
>               <TD>Code 1</TD>
>               <TD>some stuff</TD>
>               <TD>some stuff</TD>
>       </TR>
> </TABLE>
> 
> <TABLE>
>       <TR>
>               <TD>Code 2</TD>
>               <TD>some stuff</TD>
>               <TD>some stuff</TD>
>       </TR>
> </TABLE>
> 
> <TABLE>
>       <TR>
>               <TD>Code 3</TD>
>               <TD>some stuff</TD>
>               <TD>some stuff</TD>
>       </TR>
> </TABLE>
> 
> Assume the above html is stored in a variable called tmpHTML.
> 
> What I'm trying to do is get the start position of the table by using a 
> REFindNoCase like this. 
> 
> <CFSET StartPosition=REFindNoCase("<table.*Code 2", tmpHTML)>
> 
> Problem is, it gets the start position of the first table. I can't seem to 
> come up with a regex that will find the <table> tag right before the "Code 2" 
> column header.
>  
> Any advice would be greatly appreciated, and thanks in advance!
> 
> CoolJJ

Archive: http://www.houseoffusion.com/groups/RegEx/message.cfm/messageid:1023