Will it always be a domain name you want to keep? And will the file size
always be at the very end of the line? 

-----Original Message-----
From: Mark Henderson [mailto:m...@cwc.co.nz] 
Sent: Sunday, November 15, 2009 8:38 PM
To: cf-talk
Subject: Regex help with invalid HTML


Calling all regex gurus. I've spent a little time on this so now it's time
to seek advice from the professionals. Here is an example of the content I'm
working with:

<tr><td class="l"><a href="/">abc.co.nz</a><td>52 363<td>73 815<td>5 122 265<td>2 166 760<td>471.47 MB
<tr><td class="l"><a href="/">xyz.co.nz</a><td>31 622<td>23 443<td>193 645<td>840 642<td>1.8 GB
<tr><td class="l"><a href="/">blah.com</a><td>31 622<td>25 623<td>193 645<td>840 642<td>1.9 GB

What I want to do is remove everything from the first <td> (after the
closing </a>) up to and including the last <td> before the next <tr>, so
that only the final value in each row is left.

E.G. This
<tr><td class="l"><a href="/">abc.co.nz</a><td>52 363<td>73 815<td>5 122 265<td>2 166 760<td>471.47 MB

becomes

<tr><td class="l"><a href="/">abc.co.nz</a> 471.47 MB

At that point I will strip all the remaining HTML tags (which I can do) and
I should be good to go. Unfortunately I have no control over this markup, as
it is generated by a stats program. If it used the correct closing tags and
validated, I could probably fumble around and eventually achieve what I
want, as I've done in the past. And just in case anyone out there can do all
this in one hit, ultimately I want the output from above to look like this:

abc.co.nz 471.47 MB
xyz.co.nz 1.8 GB
blah.com 1.9 GB
etc.

I hope that makes sense.
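
For illustration, the "one hit" version might look something like the Python
sketch below. It is only a sketch under two assumptions: each row has exactly
one link whose text is the name to keep, and the size is always the last
plain-text cell before the next tag. ROW and summarize are hypothetical names.

```python
import re

# Capture the link text, skip the middle cells, capture the last cell.
# Assumption: cells contain no nested tags, so [^<]* covers a whole cell.
ROW = re.compile(r'<a\b[^>]*>([^<]+)</a>(?:<td>[^<]*)*<td>([^<]*)')

def summarize(html):
    """Return one 'name size' line per row found in the markup."""
    lines = []
    for name, size in ROW.findall(html):
        # Collapse wrapped whitespace and trim the trailing space or newline.
        lines.append(f"{name.strip()} {' '.join(size.split())}")
    return "\n".join(lines)

sample = (
    '<tr><td class="l"><a href="/">abc.co.nz</a><td>52 363<td>73 815<td>5 122\n'
    '265<td>2 166 760<td>471.47 MB\n'
    '<tr><td class="l"><a href="/">xyz.co.nz</a><td>31 622<td>23 443<td>193\n'
    '645<td>840 642<td>1.8 GB <tr><td class="l"><a href="/">blah.com</a><td>31\n'
    '622<td>25 623<td>193 645<td>840 642<td>1.9 GB\n'
)
print(summarize(sample))
```

This skips the separate tag-stripping pass entirely, since only the two
captured pieces of text are ever kept.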


TIA
Mark


