
Re: regular expression problem

2018-10-29 Thread Karsten Hilbert

On Mon, Oct 29, 2018 at 05:16:11PM +, MRAB wrote:

> > Logically it should not because
> > 
> > >s'::15>>$
> > 
> > does not match
> > 
> > ::\d*>>$
> > 
> > but I am not sure how to tell it that :-)
> > 
> For something like that, I'd use parsing by recursive descent.
> 
> It might be worth looking at pyparsing.

I feared as much. However, by slightly changing the boundary
conditions I was able to solve the problem :-)

Now I am only left with the task to search-replace a bunch of
LaTeX templates during the next database upgrade ...

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-29 Thread MRAB

On 2018-10-29 08:02, Karsten Hilbert wrote:

On Sun, Oct 28, 2018 at 11:14:15PM +, MRAB wrote:

> - lines can contain several placeholders
> 
> - placeholders start and end with '$'
> 
> - placeholders are parsed in three passes
> 
> - the pass in which a placeholder is parsed is denoted by the number of '<' and '>' next to the '$':
> 
> 	$<...>$ / $<<...>>$ / $<<<...>>>$
> 
> - placeholders for different parsing passes must be nestable:
> 
> 	$<<<...$<...>$...>>>$

>
> (lower=earlier parsing passes will be inside)
> 
> - the internal structure is "name::options::range"
> 
> 	$<name::options::range>$
> 
> - name will *not* contain '$' '<' '>' ':'
> 
> - range can be either a length or a "from-until"
> 
> - a length will be a positive integer (no bounds checking)
> 
> - "from-until" is: a positive integer, a '-', and a positive integer (no sanity checking)
> 
> - options needs to be able to contain nearly anything, except '::'
> 
> 
> Is that sufficiently defined and helpful to design the regular expression ?
> 
How can they be nested inside one another?

Is the string scanned, placeholders filled in for that level, and then the
string scanned again for the next level? (That would mean that the fill
value itself will be scanned in the next pass.)

Exactly. But *different* levels can be nested inside each other.

You could try matching the top level, for each match then match the next
level, and for each of those matches then match for the final level.

So I do.

Trying to do it all in one regex is usually a bad idea.

Right, I am not trying to do that. I was, however, worried
that I need to make the expression not "trip over" fragments
of what might seem to constitute part of another placeholder.

$<$::15>>$

Pass 1 might fill in to:

$>$

and I was worried to make sure the second pass does not stop here:

$>$
^

Logically it should not because

>s'::15>>$

does not match

::\d*>>$

but I am not sure how to tell it that :-)

For something like that, I'd use parsing by recursive descent.

It might be worth looking at pyparsing.
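
As a rough illustration of moving away from a single regex, here is a minimal sketch of a stack-based scanner. It is neither pyparsing nor a full recursive-descent parser, just a hand-rolled stand-in, and the function name, the token expression and the sample line are all made up for this example. It pairs each closer with the innermost open placeholder of the same level and treats any other closer as ordinary text.

```python
import re

# Openers look like '$<', '$<<' or '$<<<'; closers like '>$', '>>$' or '>>>$'.
TOKEN = re.compile(r'\$(<{1,3})|(>{1,3})\$')

def find_placeholders(line):
    """Return (level, text) for each properly nested placeholder in line."""
    stack = []   # (level, start offset) of openers still waiting for a closer
    found = []
    for tok in TOKEN.finditer(line):
        if tok.group(1):
            # Opener: remember its level and where it starts.
            stack.append((len(tok.group(1)), tok.start()))
        elif stack and stack[-1][0] == len(tok.group(2)):
            # Closer of the same level as the innermost open placeholder.
            level, start = stack.pop()
            found.append((level, line[start:tok.end()]))
        # Any other closer is treated as ordinary text and skipped.
    return found

# Hypothetical sample line: a pass-1 placeholder nested in a pass-3 one.
sample = 'a $<<<outer::$<inner::::5>$::10>>>$ b'
placeholders = find_placeholders(sample)
```

Inner placeholders are reported before the outer ones that contain them, which happens to match the "earlier passes are inside" rule. Fill-in values that themselves contain something looking like a closer of the right level would still confuse this sketch.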
--
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert

On Sun, Oct 28, 2018 at 11:57:48PM +0100, Brian Oney wrote:

> On Sun, 2018-10-28 at 22:04 +0100, Karsten Hilbert wrote:
> > [^<:]
> 
> Would a simple regex work?

This brought about the solution.

However, not this way:

> >>> import re
> >>> t = '$<name::options::range>$'
> >>> re.findall('[^<>:$]+', t)
> ['name', 'options', 'range']

because I am not trying to parcel out the placeholder *parts*
(but rather the placeholders from a given line).

I eventually figured that denoting the parsing stages
differently made for easier matching. Rather than

$<>$
$<<>>$
$<<<>>>$

do this

$1<>1$
$2<>2$
$3<>3$

which makes it way less ambiguous, and more matchable:

regexen = [
r'\$1{0,1}<[^<].*?>1{0,1}\$',
r'\$2<[^<].*?>2\$',
r'\$3<[^<].*?>3\$'
]

The [^<] part ("the single < is NOT to be followed directly
by another <") is actually superfluous but does protect
against legacy document templates still having
$<<(<)...(>)>>$ in them.

$<>$ is still retained as an alias for $1<>1$ because there are
A LOT of them in existing document templates. It is
normalized explicitly inside Python before fillin values are
generated.
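
A minimal sketch of how the three expressions might drive the three passes, assuming Python 3. The VALUES table, the fill() and expand() helpers and the sample line are hypothetical and only serve the illustration; the real fill-in logic is not shown in this thread.

```python
import re

# The three per-pass expressions quoted above.
regexen = [
    r'\$1{0,1}<[^<].*?>1{0,1}\$',
    r'\$2<[^<].*?>2\$',
    r'\$3<[^<].*?>3\$',
]

# Hypothetical fill-in templates keyed by placeholder name; '{}' receives the options.
VALUES = {'title': 'Dr.', 'salutation': 'Dear {}'}

def fill(match):
    # Drop the leading '$N<' and trailing '>N$' markers (N is optional).
    inner = re.sub(r'^\$\d?<|>\d?\$$', '', match.group(0))
    name, options, length = inner.split('::')
    value = VALUES[name].format(options)
    # Only the plain "length" form of range is handled in this sketch.
    return value[:int(length)] if length else value

def expand(line):
    # One pass per expression, earliest (innermost) pass first.
    for rx in regexen:
        line = re.sub(rx, fill, line)
    return line

result = expand('$2<salutation::$1<title::::5>1$::20>2$ Hilbert')
```

Pass 1 replaces the inner $1<...>1$ with its value, so by the time pass 2 runs the outer placeholder contains only plain text in its options part. The legacy $<...>$ form goes through the first expression as well.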

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert

> Right, I am not trying to do that. I was, however, worried
> that I need to make the expression not "trip over" fragments
> of what might seem to constitute part of another placeholder.
> 
>   $<$::15>>$
> 
> Pass 1 might fill in to:
> 
>   $>$
> 
> and I was worried to make sure the second pass does not stop here:
> 
>   $>$
   ^

Here, of course.

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert

On Mon, Oct 29, 2018 at 12:10:04AM +0100, Thomas Jollans wrote:

> On 28/10/2018 22:04, Karsten Hilbert wrote:
> > - options needs to be able to contain nearly anything, except '::'
> 
> Including > and $ ?

Unfortunately, it might. Even if I assume that earlier passes
are "inside", and thus "filled in" before outer=later
passes happen, the fillin value might still contain some of :>$.

The fillin value will NOT contain a newly generated, matching
placeholder definition, though.

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-29 Thread Karsten Hilbert

On Sun, Oct 28, 2018 at 11:14:15PM +, MRAB wrote:

> > - lines can contain several placeholders
> > 
> > - placeholders start and end with '$'
> > 
> > - placeholders are parsed in three passes
> > 
> > - the pass in which a placeholder is parsed is denoted by the number of '<' 
> > and '>' next to the '$':
> > 
> > $<...>$ / $<<...>>$ / $<<<...>>>$
> > 
> > - placeholders for different parsing passes must be nestable:
> > 
> > $<<<...$<...>$...>>>$
> > 
> > (lower=earlier parsing passes will be inside)
> > 
> > - the internal structure is "name::options::range"
> > 
> > $<name::options::range>$
> > 
> > - name will *not* contain '$' '<' '>' ':'
> > 
> > - range can be either a length or a "from-until"
> > 
> > - a length will be a positive integer (no bounds checking)
> > 
> > - "from-until" is: a positive integer, a '-', and a positive integer (no 
> > sanity checking)
> > 
> > - options needs to be able to contain nearly anything, except '::'
> > 
> > 
> > Is that sufficiently defined and helpful to design the regular expression ?
> > 
> How can they be nested inside one another?
> Is the string scanned, placeholders filled in for that level, and then the
> string scanned again for the next level? (That would mean that the fill
> value itself will be scanned in the next pass.)

Exactly. But *different* levels can be nested inside each other.

> You could try matching the top level, for each match then match the next
> level, and for each of those matches then match for the final level.

So I do.

> Trying to do it all in one regex is usually a bad idea.

Right, I am not trying to do that. I was, however, worried
that I need to make the expression not "trip over" fragments
of what might seem to constitute part of another placeholder.

$<$::15>>$

Pass 1 might fill in to:

$>$

and I was worried to make sure the second pass does not stop here:

$>$
   ^

Logically it should not because

>s'::15>>$

does not match

::\d*>>$

but I am not sure how to tell it that :-)

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread Thomas Jollans

On 28/10/2018 22:04, Karsten Hilbert wrote:
> - options needs to be able to contain nearly anything, except '::'

Including > and $ ?
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread Thomas Jollans

On 28/10/2018 22:04, Karsten Hilbert wrote:
> - options needs to be able to contain nearly anything, except '::'
> 
> Is that sufficiently defined and helpful to design the regular expression ?

so options isn't '.*', but more like '(:?[^:]+)*' (Figuring out what
additional restriction this imposes is left as an exercise for the reader)
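
One way to explore that exercise is to try the expression against a few candidate option strings. This is only a sketch, assuming Python 3 and re.fullmatch; the helper name and the sample strings are made up.

```python
import re

OPTIONS = re.compile(r'(:?[^:]+)*')

def is_valid_options(text):
    # True if the whole string fits the '(:?[^:]+)*' shape.
    return OPTIONS.fullmatch(text) is not None

checks = {s: is_valid_options(s) for s in ['', 'a:b', ':a', 'a::b', 'a:']}
```

With these samples, 'a::b' is rejected as intended, and so is 'a:', which hints at the extra restriction: the options cannot end in a single ':'. The nested quantifiers may also backtrack heavily on long strings that do not match.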
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread MRAB


On 2018-10-28 21:04, Karsten Hilbert wrote:

On Sun, Oct 28, 2018 at 09:43:27PM +0100, Karsten Hilbert wrote:


Let me try to explain the expression I am actually after
(assuming .compile with re.VERBOSE):

rx_works = '
    \$<          # start of match is literal '$<' anywhere inside string
    [^<:]+?::    # followed by at least one "character", except '<' or ':', until the next '::' (this is the placeholder "name")
    .*?::        # followed by any number of any "character", until the next '::' (this is the placeholder "options")
    \d*?         # followed by any number of digits (the max length of placeholder output)
    >\$          # followed by '>$'
    |            # -- OR (in *either* order) --
    \$<          # start of match is literal '$<' anywhere inside string
    [^<:]+?::    # followed by at least one "character", except '<' or ':', until the next '::' (this is the placeholder "name")
    .*?::        # followed by any number of any "character", until the next '::' (this is the placeholder "options")
    # now the difference:
    \d+-\d+      # followed by one-or-many digits, a '-', and one-or-many digits (this is the *range* from within placeholder output)
    >\$'         # followed by '>$'


Another try:

- lines can contain several placeholders

- placeholders start and end with '$'

- placeholders are parsed in three passes

- the pass in which a placeholder is parsed is denoted by the number of '<' and 
'>' next to the '$':

$<...>$ / $<<...>>$ / $<<<...>>>$

- placeholders for different parsing passes must be nestable:

$<<<...$<...>$...>>>$

(lower=earlier parsing passes will be inside)

- the internal structure is "name::options::range"

$<name::options::range>$

- name will *not* contain '$' '<' '>' ':'

- range can be either a length or a "from-until"

- a length will be a positive integer (no bounds checking)

- "from-until" is: a positive integer, a '-', and a positive integer (no sanity 
checking)

- options needs to be able to contain nearly anything, except '::'


Is that sufficiently defined and helpful to design the regular expression ?


How can they be nested inside one another?
Is the string scanned, placeholders filled in for that level, and then 
the string scanned again for the next level? (That would mean that the 
fill value itself will be scanned in the next pass.)


You could try matching the top level, for each match then match the next 
level, and for each of those matches then match for the final level.


Trying to do it all in one regex is usually a bad idea. Keep it simple! 
(Do you even need to use a regex?)

--
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread Brian Oney via Python-list

On Sun, 2018-10-28 at 22:04 +0100, Karsten Hilbert wrote:
> [^<:]

Would a simple regex work?

I mean:

~$ python
Python 2.7.13 (default, Sep 26 2018, 18:42:22) 
[GCC 6.3.0 20170516] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> t = '$<name::options::range>$'
>>> re.findall('[^<>:$]+', t)
['name', 'options', 'range']

You can then interpret what you have extracted afterwards.
Maybe if you want to have the single ones grouped you could consider:

>>> t = t*2
>>> t
'$<name::options::range>$$<name::options::range>$'
>>> re.findall('\$<+([^:]+)::([^:]+)::([^:]+)>+\$', t)
[('name', 'options', 'range'), ('name', 'options', 'range')]
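
The same idea can be sketched with named groups, assuming Python 3 and assuming the sample string is the literal '$<name::options::range>$' repeated twice. The PLACEHOLDER name is made up, and like the expression above this version does not allow ':' inside any part or an empty part.

```python
import re

PLACEHOLDER = re.compile(
    r'\$<+(?P<name>[^:]+)::(?P<options>[^:]+)::(?P<range>[^:>]+)>+\$'
)

t = '$<name::options::range>$' * 2

# One dict per placeholder, keyed by part name.
parts = [m.groupdict() for m in PLACEHOLDER.finditer(t)]
```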

HTH
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread Karsten Hilbert

On Sun, Oct 28, 2018 at 10:04:39PM +0100, Karsten Hilbert wrote:

> - options needs to be able to contain nearly anything, except '::'

This seems to contradict the "nesting" requirement, but the
nesting restriction "earlier parsing passes go inside" makes
it possible.

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread Karsten Hilbert

On Sun, Oct 28, 2018 at 09:43:27PM +0100, Karsten Hilbert wrote:

> Let me try to explain the expression I am actually after
> (assuming .compile with re.VERBOSE):
> 
> rx_works = '
>     \$<          # start of match is literal '$<' anywhere inside string
>     [^<:]+?::    # followed by at least one "character", except '<' or ':', until the next '::' (this is the placeholder "name")
>     .*?::        # followed by any number of any "character", until the next '::' (this is the placeholder "options")
>     \d*?         # followed by any number of digits (the max length of placeholder output)
>     >\$          # followed by '>$'
>     |            # -- OR (in *either* order) --
>     \$<          # start of match is literal '$<' anywhere inside string
>     [^<:]+?::    # followed by at least one "character", except '<' or ':', until the next '::' (this is the placeholder "name")
>     .*?::        # followed by any number of any "character", until the next '::' (this is the placeholder "options")
>     # now the difference:
>     \d+-\d+      # followed by one-or-many digits, a '-', and one-or-many digits (this is the *range* from within placeholder output)
>     >\$'         # followed by '>$'

Another try:

- lines can contain several placeholders

- placeholders start and end with '$'

- placeholders are parsed in three passes

- the pass in which a placeholder is parsed is denoted by the number of '<' and 
'>' next to the '$':

$<...>$ / $<<...>>$ / $<<<...>>>$

- placeholders for different parsing passes must be nestable:

$<<<...$<...>$...>>>$

(lower=earlier parsing passes will be inside)

- the internal structure is "name::options::range"

$<name::options::range>$

- name will *not* contain '$' '<' '>' ':'

- range can be either a length or a "from-until"

- a length will be a positive integer (no bounds checking)

- "from-until" is: a positive integer, a '-', and a positive integer (no sanity 
checking)

- options needs to be able to contain nearly anything, except '::'


Is that sufficiently defined and helpful to design the regular expression ?

Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread Karsten Hilbert

Now that MRAB has shown me the follies of my ways I would
like to learn how to properly write the regular expression I
need.

This part:

> rx_works = '\$<[^<:]+?::.*?::\d*?>\$|\$<[^<:]+?::.*?::\d+-\d+>\$'
> # it fails if switched around:
> rx_fails = '\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$'

suggests that I already have a solution. However, in reality this line:

> line = 'junk  $$  junk  $$  junk'

can be either way round (match_A, then match_B, or vice
versa) which, in turn, will switch the rx_works/rx_fails.

Let me try to explain the expression I am actually after
(assuming .compile with re.VERBOSE):

rx_works = '
    \$<          # start of match is literal '$<' anywhere inside string
    [^<:]+?::    # followed by at least one "character", except '<' or ':', until the next '::' (this is the placeholder "name")
    .*?::        # followed by any number of any "character", until the next '::' (this is the placeholder "options")
    \d*?         # followed by any number of digits (the max length of placeholder output)
    >\$          # followed by '>$'
    |            # -- OR (in *either* order) --
    \$<          # start of match is literal '$<' anywhere inside string
    [^<:]+?::    # followed by at least one "character", except '<' or ':', until the next '::' (this is the placeholder "name")
    .*?::        # followed by any number of any "character", until the next '::' (this is the placeholder "options")
    # now the difference:
    \d+-\d+      # followed by one-or-many digits, a '-', and one-or-many digits (this is the *range* from within placeholder output)
    >\$'         # followed by '>$'

I want this to work for

any number of matches

in any order of max-length or output-range

inside one string.
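
One possible way to get order independence, offered as a sketch and not as the solution this thread eventually settled on: keep a single copy of the common prefix and let the alternation cover only the range part, so both range forms are tried at the same '::' before '.*?' is allowed to grow. The name rx_merged and the placeholder contents below are assumptions; only the names match_A and match_B appear in the original post.

```python
import re

# Both range forms share one prefix, so the alternation only covers the range.
rx_merged = re.compile(r'\$<[^<:]+?::.*?::(?:\d+-\d+|\d*)>\$')

# Hypothetical placeholder contents for illustration.
a = '$<match_A::options A::4>$'
b = '$<match_B::options B::4-5>$'

found_ab = rx_merged.findall('junk  ' + a + '  junk  ' + b + '  junk')
found_ba = rx_merged.findall('junk  ' + b + '  junk  ' + a + '  junk')
```

This still relies on '.*?' and therefore can run across placeholder boundaries when the options contain text that looks like the tail of a placeholder, so it does not address nesting.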

Now, why the [^<:]+? dance ?

Because three levels of placeholders

$<...::...::>$
$<<...::...::>>$
$<<<...::...::>>>$

need to be nestable inside each other ;-)

Anyone able to help ?

This seems beyond my current grasp of regular expressions.

Thanks,
Karsten
-- 
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2018-10-28 Thread MRAB


On 2018-10-28 18:51, Karsten Hilbert wrote:

Dear list members,

I cannot figure out why my regular expression does not work as I expect it to:

#---
#!/usr/bin/python

from __future__ import print_function
import re as regex

rx_works = '\$<[^<:]+?::.*?::\d*?>\$|\$<[^<:]+?::.*?::\d+-\d+>\$'
# it fails if switched around:
rx_fails = '\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$'
line = 'junk  $$  junk  $$  junk'

print ('')
print ('line:', line)
print ('expected: $$')
print ('expected: $$')

print ('')
placeholders_in_line = regex.findall(rx_works, line, regex.IGNORECASE)
print('found (works):')
for ph in placeholders_in_line:
	print (ph)

print ('')
placeholders_in_line = regex.findall(rx_fails, line, regex.IGNORECASE)
print('found (fails):')
for ph in placeholders_in_line:
	print (ph)

#---

I am sure I simply don't see the problem ?

Here are some of the steps while matching the second regex. (View this 
in a monospaced font.)



1:
junk  $$  junk  $$  junk
  ^

\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$
^


2:
junk  $$  junk  $$  junk
 ^

\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$
^


3:
The .*? matches as few characters as possible, initially none.

junk  $$  junk  $$  junk
  ^
^
\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$
   ^


4:
junk  $$  junk  $$  junk
 ^

\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$
^

At this point it can't match, so it backtracks.


5:
The .*? matches more characters, including the ":".

After more matching it's like the following.

junk  $$  junk  $$  junk
^

\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$
   ^


6:
junk  $$  junk  $$  junk
  ^

\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$
 ^

Again it can't match, so it backtracks.


7:
The .*? matches more characters, including the ":".

After more matching it's like the following.

junk  $$  junk  $$  junk
   ^

\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$
   ^

8:
junk  $$  junk  $$  junk
  ^

\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$
   ^

Success!

The first choice has matched this:

$$  junk  $$
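
The outcome of the walkthrough can be reproduced with a short script. This is a sketch assuming Python 3; the placeholder contents in the sample line are made up for illustration (only the names match_A and match_B come from the original post), and the two expressions are written as raw strings.

```python
import re

rx_works = r'\$<[^<:]+?::.*?::\d*?>\$|\$<[^<:]+?::.*?::\d+-\d+>\$'
rx_fails = r'\$<[^<:]+?::.*?::\d+-\d+>\$|\$<[^<:]+?::.*?::\d*?>\$'

# Hypothetical sample: a max-length placeholder followed by a range placeholder.
line = 'junk  $<match_A::options A::4>$  junk  $<match_B::options B::4-5>$  junk'

works = re.findall(rx_works, line)   # two separate placeholders
fails = re.findall(rx_fails, line)   # one match spanning both placeholders
```

In the failing case the first alternative keeps extending '.*?' past the end of the first placeholder until it finds a '::' followed by a digits-dash-digits range, which only exists in the second placeholder.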
--
https://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-11 Thread Terry Reedy


On 3/11/2013 2:30 PM, Serhiy Storchaka wrote:

On 11.03.13 04:06, Terry Reedy wrote:

On 3/10/2013 1:42 PM, mukesh tiwari wrote:

Hello all
I am trying to solve this problem[1]
[1] http://www.spoj.com/problems/MAIN12C/


As I remember, and as it still appears, this site severely penalizes
Python solvers by using the same time limit for all languages. Thus, a
'slow' python program may work correctly but the site will not let you
know.


I'm sure the time limits are enough to solve most (if not all) of
problems. Actually all submitted solutions on Python for this problem
run from 0.47 to 0.61 seconds (http://www.spoj.com/ranks/MAIN12C/).


You do not see the solutions that timed out. I suppose you are pointing 
to the fact that for this problem there are solutions close to but under 
the time limit. However, algorithm running times are not evenly 
distributed. Suppose (for instance) there is a correct O(n**2) solution 
and a correct O(n) solution and that the ones listed are the O(n) 
solutions. Then the Python O(n**2) solutions could easily take 10x 
longer to run and time out, while equivalent C solutions do not.


Mukesh is not the first to post here a reasonable looking solution for 
that site that he could not judge because the test quit and refused to 
answer. I point out again that he was 'happy' to have a faster but 
incorrect program, even though it might have been a regression from his 
original.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-11 Thread Serhiy Storchaka


On 11.03.13 04:06, Terry Reedy wrote:

On 3/10/2013 1:42 PM, mukesh tiwari wrote:

Hello all
I am trying to solve this problem[1]
[1] http://www.spoj.com/problems/MAIN12C/


As I remember, and as it still appears, this site severely penalizes
Python solvers by using the same time limit for all languages. Thus, a
'slow' python program may work correctly but the site will not let you
know.


I'm sure the time limits are enough to solve most (if not all) of 
problems. Actually all submitted solutions on Python for this problem 
run from 0.47 to 0.61 seconds (http://www.spoj.com/ranks/MAIN12C/).


--
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-11 Thread rusi

On Mar 11, 2:28 pm, jmfauth  wrote:
> On 11 mar, 03:06, Terry Reedy  wrote:
>
>
>
> > ...
> > By teaching 'speed before correctness", this site promotes bad
> > programming habits and thinking (and the use of low-level but faster
> > languages).
> > ...
>
> This is exactly what "your" flexible string representation
> does!

This is an old complaint of yours, with no new data to support it.

>
> And away from technical aspects, you even succeeded to
> somehow lose unicode compliance.

This is a new complaint.
Just to make it clear:
1. All your recent complaints about unicode were in the realm of
performance
So your complaint that python has lost unicode compliance can mean one
of:
2a. The unicode standard mandates performance criteria
or
2b. There are problems with python's implementation (of strings?) that
have functional problems apart from your old performance complaints
or
2c. You need to look up what 'compliance' means

My own choice is to have a mid-point between
Very early binding: Narrow vs wide builds
Very late binding: String reps can change with the characters as they
are seen
This mid point would be perhaps a commandline switch to choose string-
engine attributes.

However to make this choice even worth a second look we need to have
hard data about performance that you have been unable to provide.

[See the recent thread of RoR vs Django to see the problems of
excessive spurious choice]
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-11 Thread Mark Lawrence


On 11/03/2013 09:28, jmfauth wrote:

On 11 mar, 03:06, Terry Reedy  wrote:



...
By teaching 'speed before correctness", this site promotes bad
programming habits and thinking (and the use of low-level but faster
languages).
...



This is exactly what "your" flexible string representation
does!

And away from technical aspects, you even succeeded to
somehow lose unicode compliance.

jmf



Please stick to something you know about such as sexual self abuse.

--
Cheers.

Mark Lawrence

--
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-11 Thread jmfauth

On 11 mar, 03:06, Terry Reedy  wrote:

>
> ...
> By teaching 'speed before correctness", this site promotes bad
> programming habits and thinking (and the use of low-level but faster
> languages).
> ...

This is exactly what "your" flexible string representation
does!

And away from technical aspects, you even succeeded to
somehow lose unicode compliance.

jmf

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-10 Thread Terry Reedy


On 3/10/2013 1:42 PM, mukesh tiwari wrote:

Hello all
I am trying to solve this problem[1]
[1] http://www.spoj.com/problems/MAIN12C/


As I remember, and as it still appears, this site severely penalizes 
Python solvers by using the same time limit for all languages. Thus, a 
'slow' python program may work correctly but the site will not let you 
know. A test that refuses to answer is no test at all. In the meanwhile, 
an algorithmically equivalent C program will be run and judged correct, 
so the programmer can try to speed up while not losing correctness.


By teaching 'speed before correctness', this site promotes bad 
programming habits and thinking (and the use of low-level but faster 
languages). I quote your later response: "Now I am getting wrong answer 
so at least program is faster then previous one". If the previous one 
was correct and the revision wrong, you should toss the revision and go 
back to the correct program.


I recommend that you work on problems where you have tests that you can 
actually run even before you code.


--
Terry Jan Reedy

--
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-10 Thread Chris Angelico

On Mon, Mar 11, 2013 at 5:48 AM, mukesh tiwari
 wrote:
> Hi Chris
> Thank you! Now I am getting wrong answer so at least program is faster then 
> previous one and I am looking for wrong answer reason. Thanks again!

Excellent! Have fun.

Incidentally, regular expressions aren't the only way to solve this
sort of problem. If you get stuck with one method, it may be worth
trying another one, to see if you can get around the issue.

As they say, now you have two problems...

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-10 Thread mukesh tiwari

Hi Chris
Thank you! Now I am getting wrong answer so at least program is faster then 
previous one and I am looking for wrong answer reason. Thanks again!

import re

if __name__ == "__main__":
    n = int ( raw_input() )
    c = 1
    while c <= n :
        email = filter ( lambda x : x != None , [ re.search (
            '[a-zA-Z0-9][a-zA-Z0-9._]{4,}@[a-zA-Z0-9]+.(com|edu|org|co.in)' , x )
            for x in raw_input().split(' ') ] )
        t = len ( email )
        print 'Case #' + str ( c ) + ': ' + str ( t )
        for i in xrange ( t ) :
            print email[i].group()
        c += 1

Regards 
Mukesh Tiwari


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-10 Thread mukesh tiwari

Hi Chris
On the problem page, it is 3 seconds. 
 
> What is the time limit? I just tried it (Python 2.6 under Windows) and
> it finished in a humanly-immeasurable amount of time. Are you sure
> that STDIN (eg raw_input()) is where your test data is coming from?

Yes, on SPOJ we read data from STDIN. 

Regards 
Mukesh Tiwari

> 
> 
> 
> ChrisA

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-10 Thread Chris Angelico

On Mon, Mar 11, 2013 at 4:59 AM, Chris Angelico  wrote:
> On Mon, Mar 11, 2013 at 4:42 AM, mukesh tiwari
>  wrote:
>> I am trying to solve this problem[1] using regular expression. I wrote this 
>> code but I am getting time limit exceed. Could some one please tell me how 
>> to make this code run faster.
>
> What is the time limit? I just tried it (Python 2.6 under Windows) and
> it finished in a humanly-immeasurable amount of time. Are you sure
> that STDIN (eg raw_input()) is where your test data is coming from?

Oops, reading comprehension fail. Time limit is 3s on a Pentium III.
I've no idea how long your code will take on that hardware, but I
doubt that it's taking three seconds. So my query regarding source of
test data still stands. Can you put together an uber-simple test
program that just echoes the lines of input, to make sure it really is
coming off stdin?

The problem description certainly does seem to imply stdin, but I
can't see why your code would take three seconds unless it's stalling
for some reason. Though perhaps on a P3 with the maximum 100 tests,
maybe that could take a while...

Something to try: Since you're using re.search(), see if you can drop
the complemented sets at the beginning [^~!@#$%^&*()<>?,.]* and end
[^~!@#$%^&*()<>?,.a-zA-Z0-9]* - they're going to be slow to process.
Also, you can simplify this:

[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*

to this:

[a-zA-Z0-9][a-zA-Z0-9._]{4,}

The brace notation means "at least 4, at most infinity".
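
A quick way to check that the two spellings accept the same strings, as a sketch assuming Python 3; the names LONG and SHORT and the sample strings are made up for this check.

```python
import re

# The spelled-out form: one leading character, four more, then any number more.
LONG = re.compile(
    r'[a-zA-Z0-9][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._][a-zA-Z0-9._]*'
)
# The brace form: one leading character followed by at least four more.
SHORT = re.compile(r'[a-zA-Z0-9][a-zA-Z0-9._]{4,}')

samples = ['abcde', 'abcd', 'a.b_c', 'user.name_42', '.abcde', 'a']
results = [(s, bool(LONG.fullmatch(s)), bool(SHORT.fullmatch(s))) for s in samples]
```

Both expressions need at least five characters in total, the first of which must not be '.' or '_'.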

Try those out and see if you still get the results you want.

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2013-03-10 Thread Chris Angelico

On Mon, Mar 11, 2013 at 4:42 AM, mukesh tiwari
 wrote:
> I am trying to solve this problem[1] using regular expression. I wrote this 
> code but I am getting time limit exceed. Could some one please tell me how to 
> make this code run faster.

What is the time limit? I just tried it (Python 2.6 under Windows) and
it finished in a humanly-immeasurable amount of time. Are you sure
that STDIN (eg raw_input()) is where your test data is coming from?

ChrisA
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression problem

2009-09-08 Thread Steven D'Aprano

On Tue, 08 Sep 2009 09:21:35 +, 找尋自己的一片天 wrote:

> I have the following source code
> 
> 
> import re
> d = 'RTCB\r\nsignature:\xf1\x11
> \xde\x10\xfe\x0f\x9c\x10\xf6\xc9_\x10\xf3\xeb<\x10\xf2Zt\x10\xef\xd2\x91
\x10\xe6\xe7\xfb\x10\xe5p\x99\x10\xe2\x1e\xdf\x10\xdb\x0e\x9f\x10\xd8p\x06
\x10\xce\xb3_\x10\xcc\x8d\xe2\x10\xc8\x00\xa4\x10\xc5\x994\x10\xc2={\x10
\xc0\xdf\xda\x10\xbb\x03\xa3\x1
> 0\xb6E\n\x10\xacM\x12\x10\xa5`\xaa\x10\xa0\xaa\x1b\x10\x9bwy\x10\x9a
\xc4w\x10\x95\xb6\xde\x10\x93o
> \x10\x89N\xd3\x10\x86\xda=\x00\x00\x00\x00\x00\x00\x00\x00\r\ncef-
ip:127.0.0.1\r\nsender-ip:152.100.123.77\r\n\r\n'
> m = re.search('signature:(.*?)\r\n',d)
> 
> ---
> 
> as you can see, there is "signature:..." in front of d
> 
> but re.search can not find the match object, it return None object...


That's because you're trying to match over multiple lines. You need to 
specify the DOTALL flag.


I've re-formatted the string constant to take advantage of Python string 
concatenation, so it is easier to copy and paste into the interactive 
interpreter:


d =('RTCB\r\nsignature:'
'\xf1\x11\xde\x10\xfe\x0f\x9c\x10\xf6\xc9'
'_\x10\xf3\xeb'
'<\x10\xf2'
'Zt\x10\xef\xd2\x91\x10\xe6\xe7\xfb\x10\xe5'
'p\x99\x10\xe2\x1e\xdf\x10\xdb\x0e\x9f\x10\xd8'
'p\x06\x10\xce\xb3'
'_\x10\xcc\x8d\xe2\x10\xc8\x00\xa4\x10\xc5\x99'
'4\x10\xc2'
'={\x10\xc0\xdf\xda\x10\xbb\x03\xa3\x10\xb6'
'E\n\x10\xac'
'M\x12\x10\xa5'
'`\xaa\x10\xa0\xaa\x1b\x10\x9b'
'wy\x10\x9a\xc4'
'w\x10\x95\xb6\xde\x10\x93'
'o\x10\x89'
'N\xd3\x10\x86\xda'
'=\x00\x00\x00\x00\x00\x00\x00\x00'
'\r\ncef-ip:127.0.0.1\r\nsender-ip:152.100.123.77\r\n\r\n'
)
assert len(d) == 182


>>> re.search('signature:(.*?)\r\n', d)
>>> m = re.search('signature:.*?\r\n', d, re.DOTALL)
>>> m
<_sre.SRE_Match object at 0xb7e98138>



-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression problem

2009-09-08 Thread Vlastimil Brom

2009/9/8 找尋自己的一片天 :
> I have the following source code
>
> 
> import re
> d = 'RTCB\r\nsignature:\xf1\x11
> \xde\x10\xfe\x0f\x9c\x10\xf6\xc9_\x10\xf3\xeb<\x10\xf2Zt\x10\xef\xd2\x91\x10\xe6\xe7\xfb\x10\xe5p\x99\x10\xe2\x1e\xdf\x10\xdb\x0e\x9f\x10\xd8p\x06\x10\xce\xb3_\x10\xcc\x8d\xe2\x10\xc8\x00\xa4\x10\xc5\x994\x10\xc2={\x10\xc0\xdf\xda\x10\xbb\x03\xa3\x1
> 0\xb6E\n\x10\xacM\x12\x10\xa5`\xaa\x10\xa0\xaa\x1b\x10\x9bwy\x10\x9a\xc4w\x10\x95\xb6\xde\x10\x93o
> \x10\x89N\xd3\x10\x86\xda=\x00\x00\x00\x00\x00\x00\x00\x00\r\ncef-ip:127.0.0.1\r\nsender-ip:152.100.123.77\r\n\r\n'
> m = re.search('signature:(.*?)\r\n',d)
>
> ---
>
> as you can see, there is "signature:..." in front of d
>
> but re.search can not find the match object, it return None object...
>
> i don't know why this happened??
>
> (i have test other cases, but i met this string which can't be search for)
>
> could anyone have any suggestions?
>
> http://mail.python.org/mailman/listinfo/python-list
>

It seems that the problem is the . [dot] not matching the newline
character by default; there is a "\n" before the first "\r\n".

If this is intentional (i.e. the mix of line endings in one string) and
you want to make the dot match any character, use e.g. the search pattern:
(?s)signature:(.*?)\r\n
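
A minimal sketch of the difference, using a short made-up string (`sample` is an assumption, not the original data) that has a bare "\n" inside the signature before the terminating "\r\n":

```python
import re

# Hypothetical stand-in for the real data: a lone "\n" sits inside the
# signature, before the "\r\n" that ends the line.
sample = 'RTCB\r\nsignature:ab\ncd\r\ncef-ip:127.0.0.1\r\n'

# By default "." does not match "\n", so the lazy group can never reach "\r\n".
plain = re.search('signature:(.*?)\r\n', sample)

# With (?s) (the inline form of re.DOTALL) "." matches any character.
dotall = re.search('(?s)signature:(.*?)\r\n', sample)

print(plain)                   # None
print(repr(dotall.group(1)))   # 'ab\ncd'
```

For binary-looking payloads like this one, the flag is usually what you want, since any byte value may turn up inside the field.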

hth,
  vbr
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular expression problem

2008-06-23 Thread MRAB

On Jun 22, 10:13 pm, abranches <[EMAIL PROTECTED]> wrote:
> Hello everyone.
>
> I'm having a problem when extracting data from HTML with regular
> expressions.
> This is the source code:
>
> You are ready in the next style="display: inline;">12 span>M 48S
>
> And I need to get the remaining time. Until here, isn't a problem
> getting it, but if the remaining time is less than 60 seconds then the
> source becomes something like this:
>
> You are ready in the next style="display: inline;">36 span>S
>
> I'm using this regular expression, but the minutes are always None...
> You are ready in the next.*?(?:>(\d+)M)?.*?(?:>(\d+) span>S)
>
> If I remove the ? from the first group, then it will work, but if
> there are only seconds it won't work.
> I could resolve this problem in a couple of python lines, but I really
> would like to solve it with regular expressions.
>
Your regex is working like this:

1. Match 'You are ready in the next'.
2. Match an increasing number of characters, starting with none
('.*?').
3. Try to match a pattern ('(?:>...)?') from where the previous step
left off. This doesn't match, but it's optional anyway, so continue to
the next step. (No characters consumed.)
4. Match an increasing number of characters, starting from none
('.*?'). It's this step that consumes the minutes.

It then goes on to match the seconds, and the minutes are always None
as you've found.

I've come up with this regex:

You are ready in the next(?:.*?>(\d+)M)?(?:.*?>(\d+)S)
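
The two patterns can be compared side by side. This is only a sketch: the page markup did not survive intact in the archive, so `with_minutes` and `seconds_only` are simplified, made-up inputs in which the digits are directly followed by M or S.

```python
import re

# Hypothetical simplified inputs; the real page markup is not shown intact.
with_minutes = 'You are ready in the next <span>12M</span> <span>48S</span>'
seconds_only = 'You are ready in the next <span>36S</span>'

# The original pattern: the optional group is tried only at the position
# where the first lazy ".*?" stopped, which is before any text was consumed.
original = re.compile(r'You are ready in the next.*?(?:>(\d+)M)?.*?(?:>(\d+)S)')

# The revised pattern: the lazy ".*?" lives inside the optional group, so the
# group as a whole can scan forward to the minutes before giving up.
revised = re.compile(r'You are ready in the next(?:.*?>(\d+)M)?(?:.*?>(\d+)S)')

for text in (with_minutes, seconds_only):
    print(original.search(text).groups(), revised.search(text).groups())
# (None, '48') ('12', '48')
# (None, '36') (None, '36')
```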

Hope that helps.
--
http://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2007-04-30 Thread Gabriel Genellina

En Mon, 30 Apr 2007 19:16:58 -0300, John Davis <[EMAIL PROTECTED]>  
escribió:

> Hi all,
> I have a large logged string "str". I would like to strip down "str" so  
> that
> it contains only the lines that have "ERROR" in them. Could somebody  
> give me
> and indication of how to do this?

Forget about regular expressions! This famous quote [1] is entirely  
applicable here.
Also, str is not a good name, it hides the builtin type str. Using text as  
the variable name instead:

error_lines = [line for line in text.split("\n") if "ERROR" in line]

[1] http://regex.info/blog/2006-09-15/247
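
A short sketch of the same idea end to end. The log text is made up, and splitlines() is swapped in for split("\n") on the assumption that the log may contain mixed line endings:

```python
# Hypothetical log text standing in for the poster's large string.
text = "INFO start\nERROR disk full\r\nDEBUG retry\nERROR timeout\n"

# splitlines() handles "\n", "\r\n" and "\r" endings alike.
error_lines = [line for line in text.splitlines() if "ERROR" in line]

# Join the kept lines back into a single string.
stripped = "\n".join(error_lines)
print(stripped)
```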

-- 
Gabriel Genellina

-- 
http://mail.python.org/mailman/listinfo/python-list

RE: Regular Expression problem

2006-07-16 Thread Paul McGuire

> 
> Less is more:
> 
> pat = re.compile(r'href="([^"]+)')
> pat.search(your_link)
> 
> 

Be sure to also catch:

 
 
 

And it's not certain whether the OP is interested in tags like:



-- Paul

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression problem

2006-07-16 Thread Barry

On 13 Jul 2006 23:12:05 -0700, Paul McGuire <[EMAIL PROTECTED]> wrote:
> Pyparsing is also good for recognizing basic HTML tags and their
> attributes, regardless of the order of the attributes.
>
> -- Paul
>
> testText = """sldkjflsa;faj
> here it would be 'mystylesheet.css'. I used the following regex to get
> this value(I dont know if it
> I thought I was doing fine until I got stuck by this tag >>
>   : same
> tag but with 'href='
> 'href="">pat.search(your_link)
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression problem

2006-07-13 Thread Paul McGuire

Pyparsing is also good for recognizing basic HTML tags and their
attributes, regardless of the order of the attributes.

-- Paul

testText = """sldkjflsa;faj



here it would be 'mystylesheet.css'. I used the following regex to get
this value(I dont know if it

I thought I was doing fine until I got stuck by this tag >>

  : same

tag but with 'href=' part

tags are like these? >>


-OR-

-OR-


"""
from pyparsing import makeHTMLTags,line

linkTag = makeHTMLTags("link")[0]
for toks,s,e in linkTag.scanString(testText):
    print toks.href
    print line(s,testText)
    print

Prints out:

mystylesheet.css


mystylesheet.css
  : same


mystylesheet.css


mystylesheet.css


mystylesheet.css


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression problem

2006-07-13 Thread Ant


> So What should I do to get the exact value(here the value after
> 'href=') in any case even if the
>
> tags are like these? >>
>
> 
> -OR-
> 
> -OR-
> 

The following should do it:

expr = r'http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression problem

2006-07-13 Thread Justin Azoff

Justin  Azoff wrote:
> >>> from BeautifulSoup import BeautifulSoup
> >>> html=''
> >>> page=BeautifulSoup(html)
> >>> page.link.get('href')
> 'mystylesheet.css'

On second thought, you will probably want something like
>>> [link.get('href') for link in page.fetch('link',{'type':'text/css'})]
['mystylesheet.css']

which will properly handle multiple link tags.
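
For readers without BeautifulSoup installed, roughly the same result can be sketched with the standard library's html.parser. This is a swapped-in technique, not the approach above; the class name `LinkHrefCollector` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser

class LinkHrefCollector(HTMLParser):
    """Collect href values of <link> tags whose type is text/css."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; their order does not matter.
        attributes = dict(attrs)
        if tag == "link" and attributes.get("type") == "text/css":
            href = attributes.get("href")
            if href is not None:
                self.hrefs.append(href)

# Hypothetical markup: the same attributes in different orders.
html = '''
<link rel="stylesheet" type="text/css" href="mystylesheet.css" />
<link href="other.css" type="text/css" rel="stylesheet">
<link rel="icon" href="favicon.ico">
'''

collector = LinkHrefCollector()
collector.feed(html)
print(collector.hrefs)  # ['mystylesheet.css', 'other.css']
```

Because the parser hands over attributes as name/value pairs, the position of href inside the tag stops being a concern.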

-- 
- Justin

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression problem

2006-07-13 Thread Justin Azoff

John Blogger wrote:
> That I want a particular tag value of one of my HTML files.
>
> ie: I want only the value after 'href=' in the tag >>
>
> ''
>
> here it would be 'mystylesheet.css'. I used the following regex to get
> this value(I dont know if it is good).

No matter how good it is you should still use something that
understands html:

>>> from BeautifulSoup import BeautifulSoup
>>> html=''
>>> page=BeautifulSoup(html)
>>> page.link.get('href')
'mystylesheet.css'

-- 
- Justin

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression problem

2006-07-13 Thread cdecarlo

Hey,

I'm new to regexes as well, but here is my idea. Since you don't know
which attribute will come first, why don't you structure your regex like
this:

(first off, I'll assume that \s == ' ', actually now that I think of
it, isn't \s any whitespace character? anyways \s == ' ' for now)

''

I think that should just about do it.
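
On the \s question: in Python's re module, \s does match any whitespace character, not only a space. A quick check (the tag text below is made up for illustration):

```python
import re

# Each of these characters is matched by \s.
for ch in " \t\n\r\f\v":
    assert re.fullmatch(r"\s", ch) is not None

# So \s+ between attributes also spans a line break inside a tag.
tag = '<link rel="stylesheet"\n\thref="mystylesheet.css">'
found = re.search(r'rel="stylesheet"\s+href="([^"]+)"', tag).group(1)
print(found)  # mystylesheet.css
```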

Hope this helped,

Colin

John Blogger wrote:
> (I don't know if it is the right place. So if I am wrong, please point
> me the right direction.
> If this post is read by you masters, I'm honoured. If I am getting a
> mere response, I'm blessed!)
>
> Hi,
>
> I'm a newbie regular expression user. I use regex in my Python
> programs.  I have a strange
>
> (sometimes not strange, but please bear in mind;  I'm a newbie  ;)
> problem using regex. That I want
>
> a particular tag value of one of my HTML files.
>
> ie: I want only the value after 'href=' in the tag >>
>
> ''
>
> here it would be 'mystylesheet.css'. I used the following regex to get
> this value(I dont know if it
>
> is good).
>
> _""_
> I thought I was doing fine until I got stuck by this tag >>
>
>   : same
> tag but with 'href=' part
>
> at a different place. I think you got the point!
>
> So What should I do to get the exact value(here the value after
> 'href=') in any case even if the
>
> tags are like these? >>
>
> 
> -OR-
> 
> -OR-
> 

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2005-05-31 Thread [EMAIL PROTECTED]

thank you again:
I used a list and not a set because the order in my list is important.
In fact I'd like to apply this function to strings (or ordered
sequences of data).
For this reason I proposed using regular expressions.
best regards.

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2005-05-31 Thread Kent Johnson

[EMAIL PROTECTED] wrote:
> hi everyone
> there is a way, using re, to test (for es) in
> a=[a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14] if a list b is
> composed by three "sublists" of a separated or not by elements.
> 
> if b=[a2,a3,a4,a7,a8,a12,a13] gives true because in a
> we have [,a2,a3,a3,...,a7,a8,...,a12,a13,...]
> or b=[a1,a2,a5,a14] gives true because in a we have
> [a1,a2,,a5,...,a14] and so on...

difflib.SequenceMatcher can do this for you:

  >>> a = [1,2,3,4,5,6,7,8,9,10,11,12,13,14]
  >>> b = [2,3,4,7,8,12,13]
  >>> import difflib
  >>> sm = difflib.SequenceMatcher(None, a, b)
  >>> sm.get_matching_blocks()
[(1, 0, 3), (6, 3, 2), (11, 5, 2), (14, 7, 0)]
  >>> b = [1, 2, 5, 14]
  >>> sm = difflib.SequenceMatcher(None, a, b)
  >>> sm.get_matching_blocks()
[(0, 0, 2), (4, 2, 1), (13, 3, 1), (14, 4, 0)]

You should test for len(sm.get_matching_blocks()) == 4 (the last element is a 
dummy)
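
The check can be wrapped in a small helper. This is a sketch, not a definitive test: the function name `made_of_three_runs` is made up, and the extra size check is an addition, on the assumption that every element of b has to be accounted for (counting blocks alone would also accept a b that contains stray elements missing from a).

```python
import difflib

def made_of_three_runs(a, b):
    # autojunk=False switches off the popularity heuristic for long inputs.
    sm = difflib.SequenceMatcher(None, a, b, autojunk=False)
    # Drop the zero-length sentinel block that always ends the list.
    blocks = [m for m in sm.get_matching_blocks() if m.size > 0]
    # Three runs, and together they must cover every element of b.
    return len(blocks) == 3 and sum(m.size for m in blocks) == len(b)

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
print(made_of_three_runs(a, [2, 3, 4, 7, 8, 12, 13]))  # True
print(made_of_three_runs(a, [1, 2, 5, 14]))            # True
print(made_of_three_runs(a, [3, 7, 23, 200]))          # False
```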

Kent
-- 
http://mail.python.org/mailman/listinfo/python-list

Re: regular expression problem

2005-05-31 Thread alex23

[EMAIL PROTECTED] wrote:
> hi everyone
> there is a way, using re, to test (for es) in
> a=[a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14] if a list b is
> composed by three "sublists" of a separated or not by elements.

Heya,

Is there any particular reason why you need to use re?

If you're using Python 2.3 or greater, the sets module might be easier
to deal with here:

>>> from sets import Set
>>> a = Set([1,2,3,4,5,6,7,8,9,10,11,12,13,14])
>>> b = Set([2,3,4,7,8,12,13])
>>> b.issubset(a)
True
>>> b = Set([1,2,5,14])
>>> b.issubset(a)
True
>>> b = Set([3,7,23,200])
>>> b.issubset(a)
False

Sets are unsorted, I'm uncertain if that's a requirement for you.
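
In Python 3 the sets module is gone and the built-in set type offers the same issubset() method. If order matters, a subset test is not enough; the sketch below adds an order-aware check (the helper name `is_ordered_subsequence` is made up for illustration):

```python
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

def is_ordered_subsequence(b, a):
    """True if the items of b appear in a in the same relative order."""
    remaining = iter(a)
    # "x in remaining" consumes the iterator up to and including the match,
    # so later items of b can only be found further along in a.
    return all(x in remaining for x in b)

print(set([2, 3, 4, 7, 8, 12, 13]).issubset(a))    # True
print(is_ordered_subsequence([2, 3, 4, 7, 8], a))  # True
print(set([8, 7]).issubset(a))                     # True, order is ignored
print(is_ordered_subsequence([8, 7], a))           # False
```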

Hope this helps.
-alex23

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression Problem...

2004-12-01 Thread Caleb Hattingh

Obviously, Peter and Jorge are hardcore, but below is what a beginner like  
me hacked up:

My point, I guess, is that it is possible to quickly get a solution to a  
specific problem like this without being a total expert.  The code below  
was typed out once, and with only one minor correction before it produced  
the required behaviour.  Sure, it is not the general solution, but *this  
is WHY I use python* - You get a lot of mileage out of the basics.  Plus,  
I can still read and understand the sub-optimal code, and probably will be  
able to several years from now :)

Regular expressions are real powerful, and I am learning through using ViM  
- but it takes time and practice.

(err, I didn't include your example with quotes, but you should get the  
general idea)
***
'>>> stngs = [r'This Is An $EXAMPLE String', r'This Is An  
$EXAMPLE.String', r'This Is An $EXAMPLE',r'This Is An \$EXAMPLE\String']
'>>> stngs
['This Is An $EXAMPLE String', 'This Is An $EXAMPLE.String', 'This Is An  
$EXAMPLE', 'This Is An \\$EXAMPLE\\String']

'>>> for i in stngs:
        wdlist = i.split(' ')
        for j in wdlist:
            if j.find(r'$') > -1:
                dwd = j.split('.')[0]
                if dwd.find('\\') > -1:
                    dwd = dwd.split('\\')[1]
                print dwd

$EXAMPLE
$EXAMPLE
$EXAMPLE
$EXAMPLE
'>>>
***

On Wed, 01 Dec 2004 13:40:17 +0100, Peter Otten <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote:
>
> > identifying/extracting a substring from another string. What I have to do
> > is to extract all the strings that begins with a "$" character, but
> > excluding characters like "." (point) and "'" (single quote) and "\" "/"
> > (slashes). For example I have:
> >
> > 1) This Is An $EXAMPLE String
> > 2) This Is An $EXAMPLE.String
> > 3) 'This Is An $EXAMPLE'
> > 4) This Is An \$EXAMPLE\String;
> >
> > I would like to extract only the "keyword" $EXAMPLE and what I'm using at
>
> Is that what you want?
>
> >>> import re
> >>> r = re.compile("[$]\w+")
> >>> r.findall("""
> ... 1) This Is An $EXAMPLE String
> ... 2) This Is An $EXAMPLE.String
> ... 3) 'This Is An $EXAMPLE'
> ... 4) This Is An \$EXAMPLE\String;
> ... """)
> ['$EXAMPLE', '$EXAMPLE', '$EXAMPLE', '$EXAMPLE']
>
> Peter
--
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression Problem...

2004-12-01 Thread Peter Otten

[EMAIL PROTECTED] wrote:

> identifying/extracting a substring from another string. What I have to do
> is to extract all the strings that begins with a "$" character, but
> excluding characters like "." (point) and "'" (single quote) and "\" "/"
> (slashes). For example I have:
> 
> 1) This Is An $EXAMPLE String
> 2) This Is An $EXAMPLE.String
> 3) 'This Is An $EXAMPLE'
> 4) This Is An \$EXAMPLE\String;
> 
> I would like to extract only the "keyword" $EXAMPLE and what I'm using at

Is that what you want?

>>> import re
>>> r = re.compile("[$]\w+")
>>> r.findall("""
... 1) This Is An $EXAMPLE String
... 2) This Is An $EXAMPLE.String
... 3) 'This Is An $EXAMPLE'
... 4) This Is An \$EXAMPLE\String;
... """)
['$EXAMPLE', '$EXAMPLE', '$EXAMPLE', '$EXAMPLE']
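
To see why \w+ succeeds where \S* over-matches, the two patterns can be run side by side on the four sample lines. A small sketch, written with raw strings as current Python versions prefer; the variable names are made up:

```python
import re

samples = [
    "This Is An $EXAMPLE String",
    "This Is An $EXAMPLE.String",
    "'This Is An $EXAMPLE'",
    "This Is An \\$EXAMPLE\\String;",
]

# \S* keeps going until the next whitespace, so punctuation is swallowed.
non_space = re.compile(r"[\$]+\S*")
# \w+ stops at the first character that is not a letter, digit or underscore.
word_only = re.compile(r"[$]\w+")

for s in samples:
    print(non_space.findall(s), word_only.findall(s))
# ['$EXAMPLE'] ['$EXAMPLE']
# ['$EXAMPLE.String'] ['$EXAMPLE']
# ["$EXAMPLE'"] ['$EXAMPLE']
# ['$EXAMPLE\\String;'] ['$EXAMPLE']
```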

Peter

-- 
http://mail.python.org/mailman/listinfo/python-list

RE: Regular Expression Problem...

2004-12-01 Thread Doran_Dermot

You could try the following:
regex = re.compile("[\$]\w+", re.IGNORECASE)

I've only done a bit of testing.  Maybe somebody has a better solution.

Cheers!!

Dermot. 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
[EMAIL PROTECTED]
Sent: 01 December 2004 12:23
To: [EMAIL PROTECTED]
Subject: Regular Expression Problem...

Hello NG,

 I am quite new to Python... I'm writing an application that also does
some regexp work on strings, but I'm having a problem
identifying/extracting a substring from another string. What I have to do
is to extract all the strings that begins with a "$" character, but
excluding characters like "." (point) and "'" (single quote) and "\" "/"
(slashes). For example I have:

1) This Is An $EXAMPLE String
2) This Is An $EXAMPLE.String
3) 'This Is An $EXAMPLE'
4) This Is An \$EXAMPLE\String;

I would like to extract only the "keyword" $EXAMPLE and what I'm using at
the moment is:

#CODE BEGIN
import re

mystring = "This Is An \$EXAMPLE\String;"
regex = re.compile("[\$]+\S*",re.IGNORECASE)
keys = regex.findall(mystring)

#CODE END

Obviously this code returns things like $EXAMPLE', $EXAMPLE/, $EXAMPLE. and
so on...
Does anyone have a suggestion?

Thank you a lot.

Andrea.



-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Regular Expression Problem...

2004-12-01 Thread Jorge Godoy

[EMAIL PROTECTED] writes:

> #CODE BEGIN
> import re
>
> mystring = "This Is An \$EXAMPLE\String;"
> regex = re.compile("[\$]+\S*",re.IGNORECASE)
> keys = regex.findall(mystring)
>
> #CODE END

regex = re.compile("[\$]+\w*",re.IGNORECASE)

>>> import re
>>>
>>> mystring = "This Is An \$EXAMPLE\String;"
>>> regex = re.compile("[\$]+\w*",re.IGNORECASE)
>>> keys = regex.findall(mystring)
>>> keys
['$EXAMPLE']
>>> 
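
One difference from the "[$]\w+" variant is worth noting: with "+" on the dollar sign and "*" on the word characters, a bare "$" or a run of "$" also counts as a match. re.IGNORECASE makes no difference here, since \w already covers both cases. A sketch on a made-up string:

```python
import re

text = "cost is 5 $ or $$TOTAL"  # made-up sample

loose = re.compile(r"[\$]+\w*", re.IGNORECASE)
strict = re.compile(r"[$]\w+")

print(loose.findall(text))   # ['$', '$$TOTAL']
print(strict.findall(text))  # ['$TOTAL']
```

Which behaviour is wanted depends on whether a lone "$" can appear in the input.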


Be seeing you,
-- 
Godoy. <[EMAIL PROTECTED]>
-- 
http://mail.python.org/mailman/listinfo/python-list
