Will it always be a domain name you want to keep? And will the file size always be at the very end of the line?
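If the answer to both questions is yes, a rough sketch along the following lines may be enough. It is written in Python purely for illustration, and the pattern would still need adapting to ColdFusion's own regex functions. The names `extract_rows`, `ROW` and `sample` are made up for this example. It assumes every row starts with a `<tr>` tag, that the domain is the text of the only `<a>` in the row, and that the size is whatever follows the last bare `<td>`.

```python
import re

# Hypothetical pattern: capture the link text, then let the greedy ".*"
# run to the LAST "<td>" in the row and capture the text that follows it.
ROW = re.compile(r'<a[^>]*>([^<]+)</a>.*<td>\s*([^<]+)', re.S | re.I)


def extract_rows(html):
    """Return 'domain size' strings, one per <tr> row (assumed layout)."""
    results = []
    # Split on each opening <tr>; the unclosed tags mean there is no </tr>.
    for chunk in re.split(r'<tr\b[^>]*>', html, flags=re.I):
        match = ROW.search(chunk)
        if match:
            domain = match.group(1).strip()
            size = match.group(2).strip()
            results.append(f"{domain} {size}")
    return results


sample = (
    '<tr><td class="l"><a href="/">abc.co.nz</a><td>52 363<td>73 815'
    '<td>5 122 265<td>2 166 760<td>471.47 MB\n'
    '<tr><td class="l"><a href="/">xyz.co.nz</a><td>31 622<td>23 443'
    '<td>193 645<td>840 642<td>1.8 GB\n'
    '<tr><td class="l"><a href="/">blah.com</a><td>31 622<td>25 623'
    '<td>193 645<td>840 642<td>1.9 GB\n'
)

print("\n".join(extract_rows(sample)))
# abc.co.nz 471.47 MB
# xyz.co.nz 1.8 GB
# blah.com 1.9 GB
```

Splitting on `<tr>` first keeps the greedy match from running into the next row, so it should not matter whether the rows arrive on separate lines or all on one. If the size is not always the last cell, the second capture group would pick up the wrong value.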
-----Original Message-----
From: Mark Henderson [mailto:m...@cwc.co.nz]
Sent: Sunday, November 15, 2009 8:38 PM
To: cf-talk
Subject: Regex help with invalid HTML

Calling all regex gurus. I've spent a little time on this so now it's time to seek advice from the professionals. Here is an example of the content I'm working with:

<tr><td class="l"><a href="/">abc.co.nz</a><td>52 363<td>73 815<td>5 122 265<td>2 166 760<td>471.47 MB
<tr><td class="l"><a href="/">xyz.co.nz</a><td>31 622<td>23 443<td>193 645<td>840 642<td>1.8 GB
<tr><td class="l"><a href="/">blah.com</a><td>31 622<td>25 623<td>193 645<td>840 642<td>1.9 GB

And what I want to do is remove everything between the first td (after the closing </a>) and the last td BEFORE the next tr. E.G.

This
<tr><td class="l"><a href="/">abc.co.nz</a><td>52 363<td>73 815<td>5 122 265<td>2 166 760<td>471.47 MB
becomes
<tr><td class="l"><a href="/">abc.co.nz</a> 471.47 MB

At that point I will then strip all the remaining HTML tags (which I can do) and I should be good to go.

Unfortunately I have no control over this code as it is generated by a stats program, and if indeed it used the correct closing tags and validated I could probably fumble around and eventually achieve what I want, as I've done in the past.

And just in case anyone out there can do all this in one hit, ultimately I want the output from above to look like this:

abc.co.nz 471.47 MB
xyz.co.nz 1.8 GB
blah.com 1.9 GB
etc.

I hope that makes sense.

TIA
Mark