Regex help needed
I am having trouble creating a regex to strip out the XML content that Word 2007 adds to our HTML editor. We are using TinyMCE, and since one of our clients upgraded recently this has caused a large number of issues.

What we need to do is pull out the following content. It starts with

    <!--[if gte mso 9]><xml><w:WordDocument>

and ends with

    <![endif]-->

There are about 1000 characters between the two, and sometimes there are multiple sets of nodes with the same if and endif. I was trying to create a regex to strip out this content, everything from the beginning to the end (I want NONE of it).

If anyone has any other suggestions, we are all ears. Thanks. I am just not great at this regex stuff and cannot get the correct statement.

Matt

Archive: http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:342201
Subscription: http://www.houseoffusion.com/groups/cf-talk/subscribe.cfm
Unsubscribe: http://www.houseoffusion.com/groups/cf-talk/unsubscribe.cfm
Re: Regex help needed
FYI, I figured it out. It was simple once I looked at the content: since it is all inside comment tags,

    ReReplaceNoCase(str, "<!--(.*?)-->", "", "ALL");

Just in case anyone else has this issue.
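
The pattern in this message removes every HTML comment, including ordinary ones that might be worth keeping. Here is a narrower sketch, assuming the Word blocks always open with `<!--[if ...]>` and close with `<![endif]-->`. The sample markup and the variable names `html` and `cleaned` are made up for illustration, and the non-greedy `.*?` needs a ColdFusion version with Perl-style regex support.

```cfml
<cfscript>
// Hypothetical sample: one ordinary comment plus two Word 2007 conditional blocks.
html = '<p>One</p><!-- keep me --><!--[if gte mso 9]><xml><w:WordDocument>...</w:WordDocument></xml><![endif]-->'
     & '<p>Two</p><!--[if gte mso 9]><xml><w:LatentStyles>...</w:LatentStyles></xml><![endif]-->';

// The pattern only matches comments that begin with "<!--[if".
// Non-greedy .*? stops at the first "<![endif]-->", so each block is removed on its own.
// Square brackets are escaped because they are regex metacharacters.
cleaned = REReplaceNoCase(html, "<!--\[if[^\]]*\]>.*?<!\[endif\]-->", "", "ALL");

WriteOutput(HTMLEditFormat(cleaned));
</cfscript>
```

With this sample input, the ordinary `<!-- keep me -->` comment should survive while both conditional blocks are dropped. It is worth testing against real pasted Word content, since the blocks there span many lines.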
Regex Help Needed
CF5. In the following I keep getting a bad backreference error. I thought the nested expressions in parens should give me \1 = entire matching string, then \2 = first submatch, \3 = 2nd submatch, etc.

I'm taking a string containing concatenated three-character weekday abbreviations and expanding them. Example: SatSun becomes Sat,Sun

    re = '((sun)(mon)|(mon)(tue)|(tue)(wed)|(wed)(thu)|(thu)(fri)|(fri)(sat)|(sat)(sun))';
    s = REReplaceNoCase(s, re, '\2,\3', 'all'); // adjacent weekdays
RE: Regex Help Needed
You could just do something a bit simpler:

    strTarget = "SatSun";
    strResult = (Left(strTarget, 3) & "," & Right(strTarget, 3));

Ben Nadel
Web Developer, Nylon Technology
www.nylontechnology.com

(In reply to Jim McAtee, Wednesday, December 07, 2005 1:46 PM)
Re: Regex Help Needed
Not exactly. It has to be a replace operation in mid-string. It's doable with ReplaceList(), except that ReplaceList() has no NoCase counterpart.

    s = 'Hours of operation: MonFri 11a8p SatSun 11a5p';

(In reply to Ben Nadel, Wednesday, December 07, 2005 11:56 AM)
Re: [QUARRANTINE] Regex Help Needed
Well, you don't really need the outside parens, so you could just ditch those.

IIRC, CF5's engine numbered backreferences in the order the groups are completed, not the order they are started, but I could be wrong about that. It's been a little while since I had to play in that particular sandbox. :-)

--Ben
Re: [QUARRANTINE] Regex Help Needed
Ben Doom wrote (Wednesday, December 07, 2005 1:05 PM):
> Well, you don't really need the outside parens, so you could just ditch those.

I thought that might be the case, but the following also throws the same backreference error:

    re = '(sun)(mon)|(mon)(tue)|(tue)(wed)|(wed)(thu)|(thu)(fri)|(fri)(sat)|(sat)(sun)';
    s = REReplaceNoCase(s, re, '\1,\2', 'all'); // adjacent weekdays

Ah... I think I see what the problem is. If the match is SatSun then I'd need to use \13 and \14. I'm not sure I know how to deal with that.
Re: [QUARRANTINE] Regex Help Needed
Jim McAtee wrote:
> Ah... I think I see what the problem is. If the match is SatSun then I'd
> need to use \13 and \14. I'm not sure I know how to deal with that.

No, that shouldn't be it. It should only backreference what's actually matched. What happens if you just output \1? What's caught in it?

--Ben
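
One way to sidestep the question of how groups inside an alternation are numbered is to run one small replace per weekday pair, so that each pattern has exactly two groups. This is only a sketch and not something proposed in the thread. The loop and the variable names `days`, `first` and `second` are assumptions for illustration, and the input is the sample string from earlier in the thread.

```cfml
<cfscript>
// Sample input taken from the thread.
s = "Hours of operation: MonFri 11a8p SatSun 11a5p";

// Weekdays in order; each day is paired with the one that follows it (sat wraps to sun).
days = "sun,mon,tue,wed,thu,fri,sat";

for (i = 1; i LTE 7; i = i + 1) {
    first  = ListGetAt(days, i);
    second = ListGetAt(days, (i MOD 7) + 1);
    // Each pattern has exactly two groups, so \1 and \2 are unambiguous.
    // The backreferences reuse the matched text, so the original casing is kept.
    s = REReplaceNoCase(s, "(" & first & ")(" & second & ")", "\1,\2", "ALL");
}

WriteOutput(s);
</cfscript>
```

With the sample string this should turn SatSun into Sat,Sun and leave MonFri alone, because Mon and Fri are not adjacent days. Seven passes over the string cost more than one combined regex, which should not matter for short strings like this.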
Re: Cfhttp (now regex help needed...)
OK, cool, I've got this aspect working. Thanks, Dave.

Now... I am doing a cfhttp on a PHP page and need to return the PHP sid, so basically I need to search through the string and find &sid= followed by a 32-character alphanumeric string.

I've tried

    ReFind('&sid=([[:alnum:]])+', cfhttp.FileContent)

but it doesn't seem to work. Any thoughts?

Sorry for all this annoyance. This is so worth it if I get this working!

Ryan
RE: Cfhttp (now regex help needed...)
Perhaps casing issues? Is sid always lower case? Also, will the string always be 32 characters, and if it's a URL, is it encoded?

Does this help any?

    ReFindNoCase('&sid=([[:alnum:]]){32}', cfhttp.FileContent)

Kola

(In reply to Ryan Mitchell, 14 July 2003 15:50)
RE: Cfhttp (now regex help needed...)
How big is the PHP page that's coming back? ReFind() has some character limitations (about 22k or so), so if the page is any bigger, it just fails. No error, no message, no grace -- it just doesn't find anything.

Assuming the page is small enough, you should be looking for something like

    (&|\?)sid=[[:alnum:]]{32}

I changed just the & to an & or ? so it would find the string even if it's the first parameter in the query string. The {32} means to look for 32 of the previous construct (in this case, an alnum).

HTH. As always, if you need more assistance, Ninjas are standing by:
http://www.houseoffusion.com/cf_lists/index.cfm?method=threads&forumid=21
HoF CF-RegEx -- for all your regex needs.

-- Ben Doom
Programmer & General Lackey
Moonbow Software, Inc

(In reply to Ryan Mitchell, Monday, July 14, 2003 10:50 AM)
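
To get the sid value itself rather than just the position of the match, the subexpression form of REFind can be used. This is a minimal sketch, assuming the sid is preceded by & or ? as described in this message. The variable `pageText` is a made-up stand-in for cfhttp.FileContent, and the extra pair of parentheses around the 32 characters is added here so the value lands in its own group.

```cfml
<cfscript>
// Hypothetical stand-in for cfhttp.FileContent.
pageText = '<a href="page.php?sid=0123456789abcdef0123456789abcdef&x=1">next</a>';

// Passing true as the fourth argument returns a structure with pos and len arrays
// instead of a single position.
m = REFindNoCase("(&|\?)sid=([[:alnum:]]{32})", pageText, 1, true);

sid = "";
if (m.pos[1] GT 0) {
    // Element 1 is the whole match, element 2 is the & or ?,
    // element 3 is the 32-character value.
    sid = Mid(pageText, m.pos[3], m.len[3]);
}

WriteOutput(sid);
</cfscript>
```

When nothing matches, `m.pos[1]` is 0 and `sid` stays empty, which also covers the silent-failure case on large pages described above.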
Re: Cfhttp (now regex help needed...)
Thanks, got it doing what I want now!!!

(In reply to Ben Doom, 14/7/03 16:08)
regex help needed
Can someone assist me in writing the following regex?

I have a string that varies in length, and I need to pull out of it all the characters between:

    <%--- and ---%>

An example of what this string might look like is:

    dllCall1<%---session|errorTrap|DetailNumeric---%>EndDLLCall

I need to pull:

    <%---session|errorTrap|DetailNumeric---%>

The surrounding text will always vary in length depending upon results.

Thanks,

Michael Tangorre
www.realmagnet.com
RE: regex help needed
    result = REReplace(string, "^.*(<%---.*---%>).*$", "\1");

That should do it, although it'll cause problems if there is more than one <%--- ---%> block in the string. If the text you're looking for never contains a greater-than sign, you can use this RE instead, which will work for strings that have multiple target blocks in them:

    ^.*(<%---[^>]*---%>).*$

It'll still only pull out one block (the last one, since the leading .* is greedy), so you'll need some kind of looping mechanism, but it'll return that block correctly, rather than a combined result.

---
Barney Boisvert, Senior Development Engineer
AudienceCentral (formerly PIER System, Inc.)
www.audiencecentral.com

(In reply to Michael Tangorre, Monday, June 02, 2003 11:20 AM)
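
The looping mechanism mentioned in this message could look roughly like the sketch below. It assumes the delimiters are literally `<%---` and `---%>` and that the text between them never contains a greater-than sign. The second block in the sample string and the names `blocks` and `startAt` are invented for illustration.

```cfml
<cfscript>
// Hypothetical input with two delimited blocks.
str = "dllCall1<%---session|errorTrap|DetailNumeric---%>EndDLLCall dllCall2<%---client|warn|Summary---%>End";

blocks  = ArrayNew(1);
startAt = 1;

while (startAt LTE Len(str)) {
    // Unanchored search from startAt; true returns pos/len arrays for the match.
    m = REFind("<%---[^>]*---%>", str, startAt, true);
    if (m.pos[1] EQ 0) {
        break; // no further blocks
    }
    ArrayAppend(blocks, Mid(str, m.pos[1], m.len[1]));
    // Resume searching just past the block that was found.
    startAt = m.pos[1] + m.len[1];
}

// One block per line.
WriteOutput(ArrayToList(blocks, Chr(10)));
</cfscript>
```

Searching without the `^.*` and `.*$` anchors means each call finds the next block from `startAt`, so the blocks come out in document order.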
Re: regex help needed
Thanks! I wasn't even close. :-) You saved me a great deal of time, much appreciated.

Mike

(In reply to Barney Boisvert, Monday, June 02, 2003 2:29 PM)
Re: regex help needed
Untested code, but something like this should work:

    <cfset str = "dllCall1<%---session|errorTrap|DetailNumeric---%>EndDLLCall">
    <cfset extracted = mid(str, findnocase("%---", str) + 4, findnocase("---%", str) - (findnocase("%---", str) + 4))>

Basically: find the index of the first occurrence of the open delimiter and add 4 to it to get the index of the first character after that string. Then find the index of the end delimiter, work out the number of characters between the two positions, and use MID to return the string. Note that this returns the text between the delimiters, without the delimiters themselves.

HTH

Donnie Bachan

(In reply to Michael Tangorre, Mon, 2 Jun 2003 14:19:56 -0400)
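
The Mid()-based approach in this message will throw an error if either delimiter is missing, because the computed count can come out negative. Here is a guarded sketch of the same idea, assuming the same `%---` and `---%` search strings. The names `openAt`, `closeAt` and `contentStart` are made up for illustration.

```cfml
<cfscript>
str = "dllCall1<%---session|errorTrap|DetailNumeric---%>EndDLLCall";

extracted = "";
openAt = Find("%---", str);

if (openAt GT 0) {
    // First character after the 4-character opening marker.
    contentStart = openAt + 4;
    // Look for the closing marker only after the opening one.
    closeAt = Find("---%", str, contentStart);
    if (closeAt GT contentStart) {
        extracted = Mid(str, contentStart, closeAt - contentStart);
    }
}

WriteOutput(extracted);
</cfscript>
```

If either marker is absent, `extracted` stays an empty string instead of the request failing. Find() is used here rather than FindNoCase() because the markers contain no letters.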