[PHP] Re: regexp novice
OOPS FORGOT to mention that I modify the string to add a colon if it is entered without one, so my regexp always expects a : to be in the middle. So in actuality - my regexp is 'passing' a value of 13:00 as legitimate, when it should not be. -- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php
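The regexp itself was not posted, so the following is only a minimal sketch of the kind of check described above. It assumes a 12-hour clock (since 13:00 is supposed to fail), and the helper name `is_valid_12h_time` is hypothetical.

```php
<?php
// Hypothetical helper, not the poster's code: assumes 12-hour times (1:00 to 12:59).
function is_valid_12h_time(string $input): bool
{
    $input = trim($input);
    // Add the colon when only digits were entered, e.g. "930" becomes "9:30".
    if (preg_match('/^\d{3,4}$/', $input)) {
        $input = substr($input, 0, -2) . ':' . substr($input, -2);
    }
    // Hours 1-12 (optional leading zero), minutes 00-59, anchored at both ends.
    return preg_match('/^(0?[1-9]|1[0-2]):[0-5][0-9]$/', $input) === 1;
}
```

Anchoring with `^` and `$` is what keeps a value such as 13:00 from slipping through on a partial match.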
[PHP] Re: regexp questions
Get a copy of http://www.weitz.de/regex-coach/ and contribute. Use the pattern on your string, one section at a time.

On 5/10/2010 7:53 PM, Spud. Ivan. wrote:
> Hi, I've recently changed from PHP 5.1 to 5.3.2 and I'm having problems with preg_match, because the same regular expressions used in PHP 5.1 are not matching anything in 5.3.2. Are there any significant changes that I should know about? I've been searching but I haven't found anything.
> Thanks. I.Lopez.
Re: [PHP] Re: regexp questions
I think the error is related to the changes described here: http://www.pcre.org/changelog.txt

Shiplu Mokadd.im
My talks, http://talk.cmyweb.net
Follow me, http://twitter.com/shiplu
SUST Programmers, http://groups.google.com/group/p2psust
Innovation distinguishes bet ... ... (ask Steve Jobs the rest)
Re: [PHP] Re: regexp questions
On Tue, 2010-05-11 at 23:29 +0700, shiplu wrote:
> I think the error is related to the changes described here: http://www.pcre.org/changelog.txt

That's quite a long document, care to narrow it down slightly? :p

Thanks,
Ash
http://www.ashleysheridan.co.uk
Re: [PHP] Re: regexp questions
oops! Please see the change log for version 8.00 on http://www.pcre.org/changelog.txt

Shiplu Mokadd.im
[PHP] Re: RegExp for preg_split()
Try this (don't pay attention to the name):

    /**
     * @param string $text
     * @return array
     * @since Sat Apr 29 01:35:37 CDT 2006
     * @author rsalazar
     */
    function parse_phrases( $text ) {
        $arr_pzas = array();
        // If the next character is a quote, match up to the same closing quote; otherwise match one word.
        if ( preg_match_all('/(?(?=["\'])(["\']).+?\\1|\w+)/X', $text, $arr_pzas) ) {
            $arr_pzas = $arr_pzas[0];
        }
        return $arr_pzas;
    } // parse_phrases()

Weber Sites LTD wrote:
> Hi, I'm looking for the RegExp that will split a search string into search keywords while taking " " into account. From what I managed to find I can get all of the words into an array, but I would like all of the words inside " " to be in the same array cell.

--
Sincerely,
J. Rafael Salazar Magaña
Innox - Innovación Inteligente
Tel: +52 (33) 3615 5348 ext. 205 / 01 800 2-SOFTWARE
http://www.innox.com.mx
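For comparison, here is a minimal sketch of the same idea that returns the phrases without their surrounding quotes. It assumes only double quotes delimit phrases; the function name `split_keywords` is made up for illustration.

```php
<?php
// Hypothetical helper: split a search string into keywords, keeping "quoted phrases" together.
function split_keywords(string $query): array
{
    $keywords = [];
    // Alternative 1 captures the inside of a double-quoted phrase, alternative 2 a bare word.
    if (preg_match_all('/"([^"]+)"|(\S+)/', $query, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            // Group 1 is an empty string when the bare-word alternative matched.
            $keywords[] = ($m[1] !== '') ? $m[1] : $m[2];
        }
    }
    return $keywords;
}
```

Using preg_match_all() rather than preg_split() avoids having to describe the separators, which is awkward once quoted spaces are involved.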
[PHP] Re: regexp appears to be faulty (DONT actually think so)
Hi All,

I don't actually think regexp is at fault. But if anyone could explain this, or give me some example code that will extract the attributes and data between a fieldset tag pair, I would appreciate it.

Henry

Henry Grech-Cini [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> Hi All,
>
> function extractFieldsets($subject) {
>     $regexp="/<fieldset([^>]*)>[^(<\/fieldset>)]*/i";
>     $replacement;
>     $matches=array();
>     preg_match_all($regexp, $subject, $matches);
>     return ($matches);
> }
>
> $result=extractFieldsets('test<fieldset attribute="hello">content of hello</fieldset><em>blah</em><fieldset attribute="goodbye">goodbye</fieldset>');
> echo "<br/>";
> foreach($result as $key=>$string) {
>     echo "(".$key.")=>"."<br/>";
>     foreach($string as $subkey=>$subres)
>         echo("(".$subkey.")=>[".htmlspecialchars($subres)."]<br/>");
>     echo "<br/>";
> }
>
> And it produced:
>
> (0)=>
>   (0)=>[<fieldset attribute="hello">con]
>   (1)=>[<fieldset attribute="goodbye">goo]
> (1)=>
>   (0)=>[ attribute="hello"]
>   (1)=>[ attribute="goodbye"]
>
> Why did it get three letters after the end of the fieldset tag, "con" and "goo"? Any pointers?
[PHP] Re: regexp appears to be faulty (DONT actually think so)
Henry Grech-Cini wrote:
> ...
> $regexp="/<fieldset([^>]*)>[^(<\/fieldset>)]*/i";
> ...
> And it produced:
> (0)=>
>   (0)=>[<fieldset attribute="hello">con]
>   (1)=>[<fieldset attribute="goodbye">goo]
> (1)=>
>   (0)=>[ attribute="hello"]
>   (1)=>[ attribute="goodbye"]

hi,

as it is defined in the regex spec: a '^' at the start of a char-group '[...]' lists all the single chars that aren't allowed, not a string! so the first 't' of 'content' and the 'd' of 'goodbye' stop your match, because both letters occur in 'fieldset'.

a start for a solution could be:

    <?php
    $rx = '/<fieldset[^>]*>(.*)<\/fieldset>/i';
    ?>

if you want to keep your fieldset attribs in your result, you can set your brackets again: ([^>]*)

some probs i can think of are nested fieldsets inside fieldsets (don't know offhand if this is allowed by w3). another prob is that you don't catch multiple fieldsets one after another, because .* is greedy. i think there is a switch that catches only the shortest result.

hth SVEN
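The difference between excluding characters and excluding a string can be seen in a small sketch. The second pattern uses a negative lookahead, which is one common way to say "anything up to this string"; it is offered as an illustration, not as what either poster used.

```php
<?php
$subject = 'content of hello</fieldset><em>blah</em>';

// Negated character class: stops at the first character that appears anywhere in the
// list ( < / f i e l d s t > ), here the 't' of "content".
preg_match('/^[^(<\/fieldset>)]*/i', $subject, $asClass);

// Negative lookahead: consume one character at a time as long as the *string*
// "</fieldset>" does not start at the current position.
preg_match('/^(?:(?!<\/fieldset>).)*/is', $subject, $asString);

echo $asClass[0], "\n";   // con
echo $asString[0], "\n";  // content of hello
```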
[PHP] Re: regexp appears to be faulty (DONT actually think so)
Thanks Sven,

You are quite right with your "some probs" comment. Do you know what the switch is that catches only the shortest result? I now get:

(0)=>
  (0)=>[<fieldset attribute="hello"><legend>hello legend</legend>content of hello</fieldset><em>blah</em><fieldset attribute="goodbye">goodbye</fieldset>]
(1)=>
  (0)=>[ attribute="hello"]
(2)=>
  (0)=>[<legend>hello legend</legend>content of hello</fieldset><em>blah</em><fieldset attribute="goodbye">goodbye]

As we can see, the second fieldset is included in what is captured between the fieldset tags! :-(

Thanks everyone for your help, including Mike (with the post out of chain).

Henry

Sven [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> a start for a solution could be:
>     $rx = '/<fieldset[^>]*>(.*)<\/fieldset>/i';
> [...] another prob is that you don't catch multiple fieldsets one after another. i think there is a switch that catches only the shortest result.
[PHP] Re: regexp appears to be faulty!?
I came across this link:

http://www.alpha-geek.com/2003/12/31/do_not_do_not_parse_html_with_regexs.html

Do we all agree or should I keep trying?

Henry
[PHP] Re: regexp appears to be faulty (DONT actually think so)
Henry Grech-Cini wrote:
> Thanks Sven, You are quite right with your "some probs" comment.

hi,

think i found it. try this:

    <?php
    $rx = '/<fieldset.*>(.*)<\/fieldset>/iU';
    ?>

the 'U' modifier stands for 'ungreedy': every quantifier then matches as little as possible. also note the change in the attribs regex.

hth SVEN
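A minimal sketch of the ungreedy approach applied to Henry's test string follows. It uses the per-quantifier lazy form `.*?` instead of the global `U` modifier, and the wrapper `extract_fieldsets` is a hypothetical name. Nested fieldsets would still defeat it.

```php
<?php
// Hypothetical helper: returns the attribute string and inner content of each fieldset.
function extract_fieldsets(string $html): array
{
    $result = [];
    // (.*?) is lazy, so each match ends at the nearest closing tag; 's' lets '.' span newlines.
    preg_match_all('/<fieldset([^>]*)>(.*?)<\/fieldset>/is', $html, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $result[] = ['attributes' => trim($m[1]), 'content' => $m[2]];
    }
    return $result;
}

$html = 'test<fieldset attribute="hello"><legend>hello legend</legend>content of hello</fieldset>'
      . '<em>blah</em><fieldset attribute="goodbye">goodbye</fieldset>';
print_r(extract_fieldsets($html));
```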
[PHP] Re: regexp appears to be faulty!?
Henry Grech-Cini wrote:
> I came across this link: http://www.alpha-geek.com/2003/12/31/do_not_do_not_parse_html_with_regexs.html
> Do we all agree or should I keep trying? Henry

hi henry,

this could be an interesting discussion. i think there can be a solution for every problem; it's only a question of the logic. the main problem in this example is white space of every kind (space, tab, newline, carriage return, ...), and there are solutions for that in regex. a little example: '/( |\t|\n|\r)*/' matches optional white space. you can also give '\s*' a try (any whitespace char). maybe it also works with '\r\n'? just some thoughts.

hth SVEN

ps: it surely is possible to ignore everything between '<script>' and '</script>', isn't it?
Re: [PHP] Re: regexp appears to be faulty!?
On Tue, 2004-02-24 at 10:49, Henry Grech-Cini wrote:
> http://www.alpha-geek.com/2003/12/31/do_not_do_not_parse_html_with_regexs.html
> Do we all agree or should I keep trying?

The important thing to keep in mind here is to use the right tool for the job. If you are parsing an HTML document looking for tags, attributes, etc., I do recommend using domxml/simplexml/some XML parsing tool to get your job done quickly and cleanly. However, if you have a very specific need to extract some text from a string, then you can probably get away with regular expressions.

The big catch with regexps is that they have very low reuse value. Regexps are generally difficult to read, and you will rarely just copy and paste a regular expression from one piece of code to another. If your regexp is growing beyond one line and is taking a long time to process, then it is time to move on.

Additionally, regular expressions are not good at providing context. It just so happens that HTML documents are text documents, so if you can parse the text to get what you need, great. However, if you want to move through the elements and attributes, you want something more powerful, like XPath or XQuery (i.e., you want to find the third fieldset child of the body element that has an attribute set to foo).

As a side note, that article has a link to a similar one that lists a regexp-based XML parser as the only PHP solution. :)

--
Adam Bregenzer [EMAIL PROTECTED] http://adam.bregenzer.net/
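As a rough illustration of the parser route, the sketch below uses PHP's current DOM extension (`DOMDocument` and `DOMXPath`) rather than the domxml functions named above, which belong to older PHP versions. The sample markup is Henry's test string wrapped in a minimal document.

```php
<?php
$html = '<html><body>test<fieldset attribute="hello">content of hello</fieldset>'
      . '<em>blah</em><fieldset attribute="goodbye">goodbye</fieldset></body></html>';

libxml_use_internal_errors(true);   // keep parser warnings about sloppy HTML out of the output
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$fieldsets = [];
// XPath selects every fieldset element wherever it sits in the tree.
foreach ($xpath->query('//fieldset') as $node) {
    $fieldsets[] = [$node->getAttribute('attribute'), $node->textContent];
}
print_r($fieldsets);
```

A query such as `//body/fieldset[@attribute="goodbye"]` would add the kind of context that a regexp cannot express easily.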
[PHP] Re: regexp: 'a correctly parenthesized substring advanced'
I gave a wrong example. Here is a better one:

    <?php
    // Single quotes keep $par1 and $par2 literal instead of interpolating them.
    $txt = 'func1($par1, 100 (euro), func2($par2,(c) by nobody))';
    if (preg_match_all(' / ([a-zA-Z]\w*?) \s* ( \( ( (?>.*?)| .*?(?R)* ) \) )+ /x ', $txt, $m)){
        print_r($m);
    } else {
        echo "no match";
    }
    echo "\n";
    ?>

This must result in:

First call: func1($par1, 100 (euro), func2($par2,(c) by nobody))

A second call with the inner part, $par1, 100 (euro), func2($par2,(c) by nobody), must result in: func2($par2,(c) by nobody)

Please visit www.regexp.org; there is an open thread on this question.

thanx
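One way to get the two results described above is PCRE's subpattern recursion. The sketch below is not a repair of the posted pattern; it is a separate, minimal pattern in which `(?2)` re-enters group 2 to match nested parentheses. The function name `first_call` is hypothetical.

```php
<?php
// Hypothetical helper: finds the first "name( ... )" with balanced parentheses.
function first_call(string $text): ?array
{
    // Group 2 matches "(", then any run of non-parenthesis characters or a nested
    // group 2 via (?2), then the matching ")".
    $pattern = '/([a-zA-Z_]\w*)\s*(\((?:[^()]++|(?2))*\))/';
    if (preg_match($pattern, $text, $m)) {
        return ['name' => $m[1], 'args' => substr($m[2], 1, -1)];
    }
    return null;
}

$outer = first_call('func1($par1, 100 (euro), func2($par2,(c) by nobody))');
$inner = first_call($outer['args']);
echo $outer['name'], ' / ', $inner['name'], "\n";   // func1 / func2
```

Parentheses inside quoted string arguments would still confuse this pattern, so it only suits input where that cannot occur.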
Re: [PHP] Re: regexp: 'a correctly parenthesized substring advanced'
Forget regexp and try this function: http://sk.php.net/manual/en/function.token-get-all.php
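A minimal sketch of what the tokenizer gives you: `token_get_all()` only splits source text into tokens, and identifiers such as function names come back as `T_STRING`. The helper name `identifier_tokens` is made up, and it lists every plain identifier, not only function calls.

```php
<?php
// Hypothetical helper: collect the plain identifiers in a fragment of PHP-like code.
function identifier_tokens(string $code): array
{
    $names = [];
    // token_get_all() expects an opening tag, so one is prepended here.
    foreach (token_get_all('<?php ' . $code) as $token) {
        // Tokens are arrays [id, text, line]; single characters such as "(" are plain strings.
        if (is_array($token) && $token[0] === T_STRING) {
            $names[] = $token[1];
        }
    }
    return $names;
}

print_r(identifier_tokens('func1($par1, 100, func2($par2))'));
```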
Re: [PHP] Re: regexp: 'a correctly parenthesized substring advanced'
hi marek,

thanks for your suggestion. i have tried out the tokenizer, but

1. i think it's not really final and can change in the future
2. the tokenizer can only find real php code, but i need to find my simplified php code, for example without a ; at the end of a statement.

so the tokenizer doesn't do my job. i'm not far from the right solution; it's a little step to get the regex working with embedded parentheses in between. i only have 2 pregs in the complete source that can catch any type of registered function call with the simplified syntax ... this is fast enough for me ...

Marek Kilimajer [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
> Forget regexp and try this function: http://sk.php.net/manual/en/function.token-get-all.php
[PHP] Re: [RegExp] extracting anchors
Jome wrote:
> This is one lousy solution, but hey, it's a solution after all: you can simply remove all <a> tags containing href before doing the preg_match_all() call. Something like this should do it, I think:
>
>     $src = preg_replace('/<a[^>]*?href=[^>]*?>/is', '', $src);

This is of course a working solution, but I'm still interested in whether it can be done directly with just one regular expression. Thanks anyway.

Jens
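A single-expression version is possible with a negative lookahead, which states that a string must not appear inside the tag. The following is a minimal sketch under the assumption that attribute values are quoted; the function name `anchor_names` is hypothetical.

```php
<?php
// Hypothetical helper: name attributes of <a> tags that carry no href attribute.
function anchor_names(string $src): array
{
    // (?![^>]*\bhref\s*=) fails the match if "href=" occurs before the tag's closing ">".
    $pattern = '/<a\b(?![^>]*\bhref\s*=)[^>]*\bname\s*=\s*(["\'])(.*?)\1[^>]*>/is';
    preg_match_all($pattern, $src, $m);
    return $m[2];
}

$src = "<a name=\"top\"></a><a href=\"x.html\" name=\"link\">x</a><a name='bottom'></a>";
print_r(anchor_names($src));
```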
[PHP] Re: [RegExp] extracting anchors
Jens Lehmann wrote:
> Hello, I want to extract the name attribute of all anchors in an HTML source that don't have the href attribute. I can use this code to get the name attribute:
>
>     preg_match_all('/<a([^>]*?)name=[ \'\"](.*?)[ \'\"](.*?)>/is',$src,$ar);
>
> The name attributes are now in $ar[2]. How can I exclude all links which have the href attribute? I didn't find an easy way to say that a string must _not_ be part of a pattern match. I hope you can give me some advice.
> Jens

This is one lousy solution, but hey, it's a solution after all: you can simply remove all <a> tags containing href before doing the preg_match_all() call. Something like this should do it, I think:

    $src = preg_replace('/<a[^>]*?href=[^>]*?>/is', '', $src);

Jome
[PHP] Re: regexp for ' replacement
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Thalis A. Kalfigopoulos) wrote:
> If I have as part of a text:
>     ...and then 'the quick brown fox jumped over the lazy dog's piano'...
> how can I substitute the single quote in dog's with, say, \'? I want to apply a substitution only to the single quote that is between two single quotes, and leave the rest of the text in between the same.

Does this work for you?

    $str="...and then 'the quick brown fox jumped over the lazy dog's piano'...";
    echo $str=preg_replace("/'(.*)'(.*)'/U","'$1\'$2'",$str);

-- CC
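A different heuristic, shown here only as a sketch, is to treat a quote that sits between two word characters as an apostrophe. This is an assumption about the text, not the approach in the reply above, and it would miss cases such as a trailing possessive (dogs').

```php
<?php
$str = "...and then 'the quick brown fox jumped over the lazy dog's piano'...";

// Lookbehind and lookahead require a word character on both sides of the quote,
// so the opening and closing quotes of the phrase are left alone.
$escaped = preg_replace_callback(
    "/(?<=\\w)'(?=\\w)/",
    fn(array $m): string => "\\'",   // the callback's return value is inserted literally
    $str
);

echo $escaped, "\n";
```

Using a callback sidesteps the double layer of backslash escaping that a plain replacement string would need.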
[PHP] Re: regexp to substitute incremental strings
Hi mweb, try this:

    <?
    $string = "<IMG SRC=\"C:\dir1\dir2\dir3\img1.gif\"> blah blah blah some text, html markup... <IMG SRC=\"img2.jpg\"> blah blah again";
    // Reuse the digit before the extension ($1) and the extension itself ($2).
    $string = preg_replace("/<IMG SRC=\".*?([0-9])\.(gif|jpg)\">/i", "<IMG SRC=\"UNIQUE_CODE_0$1.$2\">", $string);
    echo nl2br($string);
    ?>

~James

Mweb wrote in message ...
> Hello, What is the right regexp to handle this, either in a while loop (how?) or all by itself? The closest I've come to the solution is:

-- PHP General Mailing List (http://www.php.net/) To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] To contact the list administrators, e-mail: [EMAIL PROTECTED]
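The original question is not fully preserved here, so the sketch below assumes, from the subject line, that each image should receive a running number regardless of its original file name. The `UNIQUE_CODE_` prefix comes from the reply above; the function name `number_images` is hypothetical.

```php
<?php
// Hypothetical helper: replace each image source with an incrementing code.
function number_images(string $html): string
{
    $counter = 0;
    return preg_replace_callback(
        '/<IMG SRC="[^"]*\.(gif|jpg)">/i',
        function (array $m) use (&$counter): string {
            $counter++;   // one step per matched tag, in document order
            return sprintf('<IMG SRC="UNIQUE_CODE_%02d.%s">', $counter, $m[1]);
        },
        $html
    );
}

echo number_images('<IMG SRC="C:\dir1\dir2\dir3\img1.gif"> blah <IMG SRC="img2.jpg"> blah again'), "\n";
```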
[PHP] Re: regexp (?:
Andrew Perevodchik [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]...
> Why doesn't this simple example work?
>
>     ereg ("aaa(?:bbb|ccc)aaa", $string);
>
> It causes an error. However, the ?: command is documented in the manual of my version of PHP4.

It would probably be useful if you would also tell us WHAT you want to accomplish, so we won't have to guess about your intentions...

/ Franklin
[PHP] Re: regexp (?:
In article [EMAIL PROTECTED], [EMAIL PROTECTED] (Andrew Perevodchik) wrote:
> ereg ("aaa(?:bbb|ccc)aaa", $string);
> It causes an error. However, the ?: command is documented in the manual of my version of PHP4.

AFAIK, that is only documented in the preg_* chapter, and only applies to the preg_* functions. (The ereg_* functions use POSIX syntax, not PCRE.)

-- CC
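The same pattern works once it is handed to a PCRE function. A minimal sketch, with a made-up test string (note that the ereg_* functions were removed entirely in PHP 7):

```php
<?php
$pattern = '/aaa(?:bbb|ccc)aaa/';   // PCRE patterns need delimiters, here "/"

$matched = preg_match($pattern, 'xx aaacccaaa xx', $m);

// (?: ... ) groups without capturing, so $m holds only the full match.
var_dump($matched, $m);
```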
RE: [PHP] Re: REGEXP
>> This is wrong. It should be Content-Type: multipart/mixed; boundary="B42DA66C4EC07C9B572A58FC" I don't know why it is not reading the whole string. It seems to treat the *boundary* part as another line.
> It _is_ another line; it just happens to start with whitespace. Check RFC822.

No doubt.

> If you have the whole message in a string (pseudo-code):

Huh.. Then the below should work. But the stupid header breaks :)

>     list ($h,$msg) = split("\n\n", $mailmsg);
>     $h=str_replace("\t", " ", $h);   // tab -> space
>     $h=str_replace("\n ", " ", $h);  // newline + space -> space, i.e. continuation of prior line
>     $hdrs=split("\n", $h);
>     ...

Regards,
RE: [PHP] Re: REGEXP
On 16-Jul-01 Adrian D'Costa wrote:
> Hi James, Thanks for your mail. But I think the problem lies somewhere else. I have the following: echo $buffer; The result: Content-Type: multipart/mixed; This is wrong. It should be Content-Type: multipart/mixed; boundary="B42DA66C4EC07C9B572A58FC" I don't know why it is not reading the whole string. It seems to treat the *boundary* part as another line.

It _is_ another line; it just happens to start with whitespace. Check RFC822.

If you have the whole message in a string (pseudo-code):

    list ($h,$msg) = split("\n\n", $mailmsg);
    $h=str_replace("\t", " ", $h);   // tab -> space
    $h=str_replace("\n ", " ", $h);  // newline + space -> space, i.e. continuation of prior line
    $hdrs=split("\n", $h);
    ...

Regards,
--
Don Read [EMAIL PROTECTED]
-- It's always darkest before the dawn. So if you are going to steal the neighbor's newspaper, that's the time to do it.
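The pseudo-code above relies on split(), which no longer exists in current PHP. The unfolding idea can be sketched with preg_split() and preg_replace() instead; the function name `parse_headers` and the sample message are made up for illustration.

```php
<?php
// Hypothetical helper: returns headers as [lowercased name => value], with folded lines joined.
function parse_headers(string $mailmsg): array
{
    // The header block ends at the first blank line.
    [$head] = preg_split('/\r?\n\r?\n/', $mailmsg, 2);
    // Unfold: a line break followed by a space or tab continues the previous header line.
    $head = preg_replace('/\r?\n[ \t]+/', ' ', $head);

    $headers = [];
    foreach (preg_split('/\r?\n/', $head) as $line) {
        if (strpos($line, ':') === false) {
            continue;   // skip anything that is not "Name: value"
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[strtolower(trim($name))] = trim($value);
    }
    return $headers;
}

$msg = "Subject: test\r\nContent-Type: multipart/mixed;\r\n boundary=\"B42DA66C4EC07C9B572A58FC\"\r\n\r\nbody";
print_r(parse_headers($msg));
```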
[PHP] Re: REGEXP
Hi James,

Thanks for your mail. But I think the problem lies somewhere else. I have the following:

    echo $buffer;

The result:

    Content-Type: multipart/mixed;

This is wrong. It should be:

    Content-Type: multipart/mixed; boundary="B42DA66C4EC07C9B572A58FC"

I don't know why it is not reading the whole string. It seems to treat the *boundary* part as another line. I read somewhere that Perl-compatible regexps can read multiple lines, treating CRLF as part of the line. Any pointers?

Adrian

On Sun, 15 Jul 2001, James Tan wrote:
> hi, you could try using the explode function [...]
[PHP] Re: REGEXP
hi,

you could try using the explode function, e.g.:

    $arrayresult = explode($separator, $string);   // separator comes first

    $cont = explode("\"", $thestring);
    echo $cont[0]; // result: Content-Type: multipart/mixed; boundary=
    echo $cont[1]; // result: B42DA66C4EC07C9B572A58FC
    echo $cont[2]; // array index 2 will be nothing.. null or ""

hope it works..

regards, James

Adrian D'Costa wrote:
> Hi, I am trying to capture the header from a mail for my webmail using PHP and POP3. The header is something like this:
>
>     Content-Type: multipart/mixed; boundary="B42DA66C4EC07C9B572A58FC"
>
> When I use preg_split("/[\d;]*/", $buffer), I get
>
>     Content-Type: multipart/mixed;
>
> What I want is to return the whole line split by the ;. I usually try to avoid regexps (too lazy), but now I want to use one. The final result I need is B42DA66C4EC07C9B572A58FC, so that I can search the body of the message for the rest of the parts. Any pointers would be helpful.
> Adrian
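Once the full (unfolded) header value is in one string, the boundary can also be pulled out with a single preg_match(). A minimal sketch, assuming the value may or may not be quoted; the function name `mime_boundary` is hypothetical.

```php
<?php
// Hypothetical helper: returns the boundary parameter of a Content-Type value, or null.
function mime_boundary(string $contentType): ?string
{
    // Group 1 captures a quoted value, group 2 an unquoted one.
    if (preg_match('/\bboundary\s*=\s*(?:"([^"]+)"|([^\s;"]+))/i', $contentType, $m)) {
        return $m[1] !== '' ? $m[1] : $m[2];
    }
    return null;
}

echo mime_boundary('Content-Type: multipart/mixed; boundary="B42DA66C4EC07C9B572A58FC"'), "\n";
```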