Re: [PHP] Re: regex help needed -- Solved! Thanks!

2004-08-02 Thread Fabrice Lezoray
Kathleen Ballard wrote:
Thanks!  Works like a charm!
I am the very lowest of newbies when it comes to regex
and working through your solutions has been very
educational.  I have one question about something I
couldn't figure out: 

#<h[1-9]>(.*)</h[1-9]>#Uie
`<h([1-6])>.*?</h\1>`sie
What is the purpose of the back-ticks and the '#'? 
PCRE patterns have to be enclosed in delimiters. You can use any 
non-alphanumeric character for that (except a backslash or whitespace). 
Personally, I prefer backticks because I rarely have to escape them 
inside my patterns.
In my example you can also remove the 's' pattern modifier. It makes 
the dot ( . ) match newline characters as well, and I had not seen that 
you had already removed them.
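
To illustrate the point about delimiters, here is a minimal sketch (the sample string and variable names are made up for the example, they are not from the thread). It runs the same pattern with three different delimiters and checks that the captures are identical:

```php
<?php
// Sample input is an assumption, for illustration only.
$html = '<h2>Title<br />here</h2>';

// The same pattern written with three different delimiters.
$withHash     = preg_match('#<h([1-6])>(.*?)</h\1>#', $html, $a);
$withBacktick = preg_match('`<h([1-6])>(.*?)</h\1>`', $html, $b);
// With "/" as the delimiter, each literal "/" in the pattern must be escaped.
$withSlash    = preg_match('/<h([1-6])>(.*?)<\/h\1>/', $html, $c);

// All three calls capture exactly the same groups.
var_dump($a === $b && $b === $c); // bool(true)
```

The choice of delimiter only affects which character needs escaping inside the pattern, which is why '#' or a backtick is convenient for HTML.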

What are 'Uie' and  'sie'?
They are pattern modifiers. You can find a complete list with 
descriptions here:
http://www.php.net/manual/en/pcre.pattern.modifiers.php
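
As a rough sketch of what three of these modifiers do (the input strings are invented for the demonstration), the snippet below compares each pattern with and without the modifier. The 'e' modifier is left out on purpose: it evaluated the replacement as PHP code, was deprecated in PHP 5.5 and removed in PHP 7.0, and preg_replace_callback() is the usual substitute.

```php
<?php
// i: case-insensitive matching.
var_dump(preg_match('#<h1>#',  '<H1>')); // int(0)
var_dump(preg_match('#<h1>#i', '<H1>')); // int(1)

// s: the dot also matches newline characters.
var_dump(preg_match('#a.b#',  "a\nb")); // int(0)
var_dump(preg_match('#a.b#s', "a\nb")); // int(1)

// U: quantifiers become lazy by default, so (.*) stops at the first closing tag.
preg_match('#<b>(.*)</b>#',  '<b>x</b><b>y</b>', $greedy);
preg_match('#<b>(.*)</b>#U', '<b>x</b><b>y</b>', $lazy);
echo $greedy[1], "\n"; // x</b><b>y
echo $lazy[1], "\n";   // x
```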


Thanks again!
Kathleen
-Original Message-
From: Fabrice Lezoray [mailto:[EMAIL PROTECTED] 
Sent: Sunday, August 01, 2004 2:52 PM
To: [EMAIL PROTECTED]
Subject: [PHP] Re: regex help needed 

hi
M. Sokolewicz wrote:
You could try something like:
$return = preg_replace('#<h[1-9]>(.*)</h[1-9]>#Uie', 'str_replace("<br />", "", "$1")');
- Tul
Kathleen Ballard wrote:

Sorry,
Here is the code I am using to match the h* tags:
<h([1-9]){1}>.*</h([1-9]){1}>
I think this pattern is better:
`<h([1-6])>.*?</h\1>`sie

I have removed all the NL and CR chars from the string
I am matching to make things easier.  Also, I have run
tidy on the code so the tags are all uniform.
The above string seems to match the tag well now, but
I still need to remove the br tags from the tag
contents (.*).
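To remove the <br /> tags, you need to call preg_replace_callback():
<?php
$str = '<h1>hi <br /> ..</h1> bla bla <h5>  <br /> ..</h5> ...<br />';
// Rebuild each heading with the <br /> tags stripped from its contents.
function cbk_br($match) {
	return '<h' . $match[1] . '>' . str_replace('<br />', '', $match[2]) . '</h' . $match[1] . '>';
}
// \1 forces the closing tag to have the same level as the opening tag.
$return = preg_replace_callback('`<h([1-6])>(.*?)</h\1>`si', 'cbk_br', $str);
echo $return;
?>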

The strings I will be matching are html formatted
text.  Sample h* tags with content are below:
<h4>Ex-Secretary Mickey Mouse <br />Loses Mass. Primary</h4>
<h4>Ex-Secretary Mickey Mouse <br />Loses Mass. Primary <br /> Wins New Jersey</h4>
<h4>Ex-Secretary Reich Loses Mass. Primary</h4>
Again, any help is appreciated.
Kathleen


Sorry for my bad English.
--
Fabrice Lezoray
http://classes.scriptsphp.fr
-
--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


[PHP] Re: regex help needed

2004-08-01 Thread M. Sokolewicz
You could try something like:
$return = preg_replace('#<h[1-9]>(.*)</h[1-9]>#Uie', 'str_replace("<br />", "", "$1")');

- Tul
Kathleen Ballard wrote:
Sorry,
Here is the code I am using to match the h* tags:
<h([1-9]){1}>.*</h([1-9]){1}>
I have removed all the NL and CR chars from the string
I am matching to make things easier.  Also, I have run
tidy on the code so the tags are all uniform.
The above string seems to match the tag well now, but
I still need to remove the br tags from the tag
contents (.*).
The strings I will be matching are html formatted
text.  Sample h* tags with content are below:
<h4>Ex-Secretary Mickey Mouse <br />Loses Mass. Primary</h4>
<h4>Ex-Secretary Mickey Mouse <br />Loses Mass. Primary <br /> Wins New Jersey</h4>
<h4>Ex-Secretary Reich Loses Mass. Primary</h4>
Again, any help is appreciated.
Kathleen


[PHP] Re: regex help needed

2004-08-01 Thread Fabrice Lezoray
hi
M. Sokolewicz wrote:
You could try something like:
$return = preg_replace('#<h[1-9]>(.*)</h[1-9]>#Uie', 'str_replace("<br />", "", "$1")');

- Tul
Kathleen Ballard wrote:
Sorry,
Here is the code I am using to match the h* tags:
<h([1-9]){1}>.*</h([1-9]){1}>
I think this pattern is better:
`<h([1-6])>.*?</h\1>`sie

I have removed all the NL and CR chars from the string
I am matching to make things easier.  Also, I have run
tidy on the code so the tags are all uniform.
The above string seems to match the tag well now, but
I still need to remove the br tags from the tag
contents (.*).
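To remove the <br /> tags, you need to call preg_replace_callback():
<?php
$str = '<h1>hi <br /> ..</h1> bla bla <h5>  <br /> ..</h5> ...<br />';
// Rebuild each heading with the <br /> tags stripped from its contents.
function cbk_br($match) {
	return '<h' . $match[1] . '>' . str_replace('<br />', '', $match[2]) . '</h' . $match[1] . '>';
}
// \1 forces the closing tag to have the same level as the opening tag.
$return = preg_replace_callback('`<h([1-6])>(.*?)</h\1>`si', 'cbk_br', $str);
echo $return;
?>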

The strings I will be matching are html formatted
text.  Sample h* tags with content are below:
<h4>Ex-Secretary Mickey Mouse <br />Loses Mass. Primary</h4>
<h4>Ex-Secretary Mickey Mouse <br />Loses Mass. Primary <br /> Wins New Jersey</h4>
<h4>Ex-Secretary Reich Loses Mass. Primary</h4>
Again, any help is appreciated.
Kathleen

--
Fabrice Lezoray
http://classes.scriptsphp.fr


RE: [PHP] Re: regex help needed -- Solved! Thanks!

2004-08-01 Thread Kathleen Ballard

Thanks!  Works like a charm!

I am the very lowest of newbies when it comes to regex
and working through your solutions has been very
educational.  I have one question about something I
couldn't figure out: 

#<h[1-9]>(.*)</h[1-9]>#Uie
`<h([1-6])>.*?</h\1>`sie
What is the purpose of the back-ticks and the '#'? 
What are 'Uie' and  'sie'?

Thanks again!
Kathleen

-Original Message-
From: Fabrice Lezoray [mailto:[EMAIL PROTECTED] 
Sent: Sunday, August 01, 2004 2:52 PM
To: [EMAIL PROTECTED]
Subject: [PHP] Re: regex help needed 

hi

M. Sokolewicz wrote:
 You could try something like:
 $return = preg_replace('#<h[1-9]>(.*)</h[1-9]>#Uie', 'str_replace("<br />", "", "$1")');
 
 
 - Tul
 
 Kathleen Ballard wrote:
 
 Sorry,
 Here is the code I am using to match the h* tags:

 <h([1-9]){1}>.*</h([1-9]){1}>
I think this pattern is better:
`<h([1-6])>.*?</h\1>`sie



 I have removed all the NL and CR chars from the string
 I am matching to make things easier.  Also, I have run
 tidy on the code so the tags are all uniform.

 The above string seems to match the tag well now, but
 I still need to remove the br tags from the tag
 contents (.*).
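To remove the <br /> tags, you need to call preg_replace_callback():

<?php
$str = '<h1>hi <br /> ..</h1> bla bla <h5>  <br /> ..</h5> ...<br />';
// Rebuild each heading with the <br /> tags stripped from its contents.
function cbk_br($match) {
	return '<h' . $match[1] . '>' . str_replace('<br />', '', $match[2]) . '</h' . $match[1] . '>';
}
// \1 forces the closing tag to have the same level as the opening tag.
$return = preg_replace_callback('`<h([1-6])>(.*?)</h\1>`si', 'cbk_br', $str);
echo $return;
?>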


 The strings I will be matching are html formatted
 text.  Sample h* tags with content are below:

 <h4>Ex-Secretary Mickey Mouse <br />Loses Mass. Primary</h4>

 <h4>Ex-Secretary Mickey Mouse <br />Loses Mass. Primary <br /> Wins New Jersey</h4>

 <h4>Ex-Secretary Reich Loses Mass. Primary</h4>

 Again, any help is appreciated.
 Kathleen


-- 
Fabrice Lezoray
http://classes.scriptsphp.fr




[PHP] Re: Regex help needed

2003-07-16 Thread Nomadeous
First, the warning you got comes from this part of the pattern:
(\s+face=\Verdana, Arial, Helvetica, sans-serif\|)
After the | (OR) sign you must supply another alternative; an empty one 
is not allowed. For example:
echo eregi_replace (tr bgcolor=\#F8F8F1\(\s*)td\s*font 
size=\2\(\s+face=\Verdana, Arial, Helvetica, 
sans-serif\|\s)\s*purchasing power parity, '%POWER%', 
'tdtrsdsdsstr bgcolor=#f8f8f1 face=Verdana, Arial, Helvetica, 
sans-seriftdfont size=2Purchasing power parity');
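
Another way to say "this part may be absent" is to make the whole group optional with ? instead of using an empty alternative. The following is only a sketch under assumptions: it uses preg_match() as a stand-in because the ereg functions were removed in PHP 7, and the pattern and test strings are simplified, invented versions of the ones in this thread.

```php
<?php
// Hypothetical, simplified pattern: a font tag with an optional face attribute.
$pattern = '#<font size="2"(\s+face="[^"]*")?>\s*purchasing\s+power\s+parity#i';

$withFace    = '<font size="2" face="Verdana, Arial">Purchasing power parity';
$withoutFace = '<font size="2">Purchasing  power parity';

$r1 = preg_match($pattern, $withFace);
$r2 = preg_match($pattern, $withoutFace);

var_dump($r1); // int(1)
var_dump($r2); // int(1)
```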

Secondly, you are right that \s is not recognised in
 purchasing\s+power\s+parity . The ereg functions use POSIX regular 
expressions, which have no \s shorthand, but you can use either of 
these instead of '\s':
 - [[:space:]]
 - [ ]
Brackets let you define a set of characters to match (in the second 
case above, just the space character).
This gives:
echo eregi_replace (tr bgcolor=\#F8F8F1\(\s*)td\s*font 
size=\2\\s*purchasing[[:space:]]+power[[:space:]]+parity, '%POWER%', 
'tdtrsdsdsstr bgcolor=#f8f8f1tdfont size=2Purchasing power 
parity');
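
For comparison, here is a small sketch using the PCRE functions (an assumption on my part, since the ereg functions no longer exist in PHP 7 and later). PCRE accepts both the POSIX class [[:space:]] and the \s shorthand; the input string is made up.

```php
<?php
$text = "Purchasing   power\tparity"; // invented input with mixed whitespace

// POSIX character class inside a bracket expression.
$posix = preg_replace('/purchasing[[:space:]]+power[[:space:]]+parity/i', '%POWER%', $text);
// Equivalent PCRE shorthand.
$short = preg_replace('/purchasing\s+power\s+parity/i', '%POWER%', $text);

var_dump($posix === '%POWER%' && $short === '%POWER%'); // bool(true)
```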

One more bit of help: here is a summary from the page 
http://www.php.net/manual/en/ref.regex.php that could be useful to you:

^ Start of line
$ End of line
n? Zero or only one single occurrence of character 'n'
n* Zero or more occurrences of character 'n'
n+ At least one or more occurrences of character 'n'
n{2} Exactly two occurrences of 'n'
n{2,} At least 2 or more occurrences of 'n'
n{2,4} From 2 to 4 occurrences of 'n'
. Any single character
() Parenthesis to group expressions
(.*) Zero or more occurrences of any single character, ie, anything!
(n|a) Either 'n' or 'a'
[1-6] Any single digit in the range between 1 and 6
[c-h] Any single lower case letter in the range between c and h
[D-M] Any single upper case letter in the range between D and M
[^a-z] Any single character EXCEPT any lower case letter between a and z.
Pitfall: the ^ symbol only acts as an EXCEPT rule if it is the
very first character inside a range, and it denies the
entire range including the ^ symbol itself if it appears again
later in the range. Also remember that if it is the first
character in the entire expression, it means start of line.
In any other place, it is always treated as a regular ^ symbol.
In other words, you cannot deny a word with ^undesired_word
or a group with ^(undesired_phrase).
Read more detailed regex documentation to find out what is
necessary to achieve this.
[_4^a-zA-Z] Any single character which can be the underscore or the
number 4 or the ^ symbol or any letter, lower or upper case
?, +, * and the {} count parameters can be appended not only to a single 
character, but also to a group() or a range[].

therefore,
^.{2}[a-z]{1,2}_?[0-9]*([1-6]|[a-f])[^1-9]{2}a+$
would mean:
^.{2} = A line beginning with any two characters,
[a-z]{1,2} = followed by either 1 or 2 lower case letters,
_? = followed by an optional underscore,
[0-9]* = followed by zero or more digits,
([1-6]|[a-f]) = followed by either a digit between 1 and 6 OR a
lower case letter between a and f,
[^1-9]{2} = followed by any two characters except digits
between 1 and 9 (0 is possible),
a+$ = followed by at least one or more
occurrences of 'a' at the end of a line.
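
To check the breakdown above, the pattern can be tried against a few strings. This is just a sketch: the test strings are invented, and preg_match() is used with / delimiters because the ereg functions are gone from current PHP.

```php
<?php
$pattern = '/^.{2}[a-z]{1,2}_?[0-9]*([1-6]|[a-f])[^1-9]{2}a+$/';

// Test strings are invented for this walkthrough.
$samples = array(
    'XYab_123c00aa', // every part present: should match
    '..a5  a',       // no underscore; the digit is taken by ([1-6]|[a-f]): should match
    'XYab_123c19aa', // "19" violates [^1-9]{2}: should not match
);

$results = array();
foreach ($samples as $s) {
    $results[$s] = preg_match($pattern, $s);
    echo str_pad($s, 15), $results[$s] ? 'match' : 'no match', "\n";
}
```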
Sid wrote:
Hello,
Well, I am doing my first regex operations and I am having problems 
which I just cannot figure out.

For example I tried
echo eregi_replace (tr bgcolor=\#F8F8F1\(\s*)td\s*font 
size=\2\\s*purchasing power parity, '%POWER%', 'tdtrsdsdsstr 
bgcolor=#f8f8f1tdfont size=2Purchasing power parity');
and this worked perfectly,

but when I changed that to
echo eregi_replace (tr bgcolor=\#F8F8F1\(\s*)td\s*font 
size=\2\\s*purchasing\s+power\s+parity, '%POWER%', 
'tdtrsdsdsstr bgcolor=#f8f8f1tdfont size=2Purchasing power 
parity');
It does not detect the string. Strange. According to what I know, \s+ 
will also match a single space. I tried changing the last two \s+ to \s*, 
but this did not work either.
Any ideas on this one?

As I proceed I would like the expression to detect the optional face 
attribute also, so I tried
echo eregi_replace (tr bgcolor=\#F8F8F1\(\s*)td\s*font 
size=\2\(\s+face=\Verdana, Arial, Helvetica, 
sans-serif\|)\s*purchasing power parity, '%POWER%', 
'tdtrsdsdsstr bgcolor=#f8f8f1 face=Verdana, Arial, Helvetica, 
sans-seriftdfont size=2Purchasing power parity');
... and this gave me an error like
Warning: eregi_replace(): REG_EMPTY: empty (sub)expression in 
D:\sid\dg\test.php on line 2

Any ideas? BTW, is there any place where I can get started on regex? I 
got a Perl book that explains regex, but I would have to learn Perl 
first (I don't know any Perl).

Thanks in advance.

- Sid

Sid wrote:
Hello,

Well, I am doing my first regex operations and I am having problems which I just cannot figure out.

For example I tried
echo eregi_replace (tr bgcolor=\#F8F8F1\(\s*)td\s*font size=\2\\s*purchasing power parity, '%POWER%', 
'tdtrsdsdsstr bgcolor=#f8f8f1tdfont size=2Purchasing power parity');
and this worked perfectly,
but when I changed that to
echo eregi_replace (tr bgcolor=\#F8F8F1\(\s*)td\s*font