Re: RegExp help

2016-02-10 Thread MRAB

On 2016-02-11 03:09, Larry Martell wrote:

On Wed, Feb 10, 2016 at 10:00 PM, MRAB  wrote:

On 2016-02-11 02:48, Larry Martell wrote:


Given this string:

>>> s = """|Type=Foo
... |Side=Left"""
>>> print s
|Type=Foo
|Side=Left

I can match with this:

>>> m = re.search(r'^\|Type=(.*)$\n^\|Side=(.*)$',s,re.MULTILINE)
>>> print m.group(0)
|Type=Foo
|Side=Left
>>> print m.group(1)
Foo
>>> print m.group(2)
Left

But when I try and sub it doesn't work:

>>> rn = re.sub(r'^\|Type=(.*)$^\|Side=(.*)$', r'|Side Type=\2 \1',s,re.MULTILINE)
>>> print rn
|Type=Foo
|Side=Left

What very stupid thing am I doing wrong?


The 4th argument of re.sub is the count.



Thanks. Turned out that this site is running 2.6 and that doesn't
support the flags arg to sub. So I had to change it to:

re.sub(r'\|Type=(.*)\n\|Side=(.*)', r'\|Side Type=\2 \1',s)


You could've used the inline flag "(?m)" in the pattern:

  rn = re.sub(r'(?m)^\|Type=(.*)$^\|Side=(.*)$', r'|Side Type=\2 \1',s)
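
One detail worth checking when trying this: the sub pattern as posted has no `\n` between `$` and `^`, unlike the earlier re.search pattern, so the two anchors would have to match at the same position. A minimal sketch of the inline-flag approach with the `\n` restored, written in Python 3 syntax (the thread itself uses Python 2 print statements):

```python
import re

s = "|Type=Foo\n|Side=Left"

# Inline (?m) turns on MULTILINE without using the flags argument.
# The \n between $ and ^ consumes the line break between the two lines.
pattern = r'(?m)^\|Type=(.*)$\n^\|Side=(.*)$'
rn = re.sub(pattern, r'|Side Type=\2 \1', s)
print(rn)

# Without the \n, $ and ^ must both match at the same position, which
# cannot happen between two non-empty lines, so nothing is replaced.
unchanged = re.sub(r'(?m)^\|Type=(.*)$^\|Side=(.*)$', r'|Side Type=\2 \1', s)
print(unchanged == s)
```

The inline flag has to appear at the very start of the pattern on recent Python versions.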

--
https://mail.python.org/mailman/listinfo/python-list


Re: RegExp help

2016-02-10 Thread Larry Martell
On Wed, Feb 10, 2016 at 10:00 PM, MRAB  wrote:
> On 2016-02-11 02:48, Larry Martell wrote:
>>
>> Given this string:
>>
>> >>> s = """|Type=Foo
>> ... |Side=Left"""
>> >>> print s
>> |Type=Foo
>> |Side=Left
>>
>> I can match with this:
>>
>> >>> m = re.search(r'^\|Type=(.*)$\n^\|Side=(.*)$',s,re.MULTILINE)
>> >>> print m.group(0)
>> |Type=Foo
>> |Side=Left
>> >>> print m.group(1)
>> Foo
>> >>> print m.group(2)
>> Left
>>
>> But when I try and sub it doesn't work:
>>
>> >>> rn = re.sub(r'^\|Type=(.*)$^\|Side=(.*)$', r'|Side Type=\2 \1',s,re.MULTILINE)
>> >>> print rn
>> |Type=Foo
>> |Side=Left
>>
>> What very stupid thing am I doing wrong?
>>
> The 4th argument of re.sub is the count.


Thanks. Turned out that this site is running 2.6 and that doesn't
support the flags arg to sub. So I had to change it to:

re.sub(r'\|Type=(.*)\n\|Side=(.*)', r'\|Side Type=\2 \1',s)


Re: RegExp help

2016-02-10 Thread MRAB

On 2016-02-11 02:48, Larry Martell wrote:

Given this string:

>>> s = """|Type=Foo
... |Side=Left"""
>>> print s
|Type=Foo
|Side=Left

I can match with this:

>>> m = re.search(r'^\|Type=(.*)$\n^\|Side=(.*)$',s,re.MULTILINE)
>>> print m.group(0)
|Type=Foo
|Side=Left
>>> print m.group(1)
Foo
>>> print m.group(2)
Left

But when I try and sub it doesn't work:

>>> rn = re.sub(r'^\|Type=(.*)$^\|Side=(.*)$', r'|Side Type=\2 \1',s,re.MULTILINE)
>>> print rn
|Type=Foo
|Side=Left

What very stupid thing am I doing wrong?


The 4th argument of re.sub is the count.
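
To illustrate the point: the signature is re.sub(pattern, repl, string, count=0, flags=0), so a flag passed in fourth position is silently taken as the count. The sketch below is in Python 3 syntax and uses the pattern from the re.search call (the one that includes `\n`); the variable names are only for illustration.

```python
import re

s = "|Type=Foo\n|Side=Left"
pattern = r'^\|Type=(.*)$\n^\|Side=(.*)$'
repl = r'|Side Type=\2 \1'

# What the positional call amounts to: re.MULTILINE (the integer 8)
# becomes count, and flags stays 0, so ^ and $ are not multiline.
as_count = re.sub(pattern, repl, s, count=int(re.MULTILINE))

# Passing the flag by keyword applies it as intended
# (the flags argument exists in Python 2.7+ and 3.1+).
as_flags = re.sub(pattern, repl, s, flags=re.MULTILINE)

print(as_count == s)
print(as_flags)
```

Spelling out `flags=` avoids the mix-up regardless of how many optional arguments the function takes.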



Re: regexp help

2009-11-04 Thread Dave Angel



Simon Brunning wrote:

2009/11/4 Nadav Chernin :
  

Thanks, but my question is how to write the regex.



re.match(r'.*\.(exe|dll|ocx|py)$', the_file_name) works for me.

  

How about:
os.path.splitext(x)[1] in (".exe", ".dll", ".ocx", ".py")
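
A minimal sketch of how that test could be used to keep only the files that do not have one of those extensions, which is what the original question asked for. The file names are made up for illustration.

```python
import os.path

# Extensions come from the thread; the file names are hypothetical.
EXCLUDED = (".exe", ".dll", ".ocx", ".py")
names = ["setup.exe", "notes.txt", "lib.dll", "script.py",
         "README", "archive.tar.gz"]

# splitext returns (root, ext); ext is "" when there is no extension
# and only the last suffix for names such as archive.tar.gz.
kept = [n for n in names if os.path.splitext(n)[1] not in EXCLUDED]
print(kept)
```

The comparison is case-sensitive, so a name like SETUP.EXE would need the extension lowercased first.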

DaveA


Re: regexp help

2009-11-04 Thread Simon Brunning
2009/11/4 Nadav Chernin :
> No, I need all files except exe|dll|ocx|py

not re.match(r'.*\.(exe|dll|ocx|py)$', the_file_name)

Now that wasn't so hard, was it? ;-)
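
For reference, a small sketch of the negated match applied to a list of names, plus an alternative that folds the negation into the regex with a negative lookahead. The file names and variable names are assumptions made for the example.

```python
import re

names = ["setup.exe", "notes.txt", "lib.dll", "script.py", "data.py.bak"]

# Negate the match in Python, as suggested in the thread.
excluded = re.compile(r'.*\.(exe|dll|ocx|py)$')
kept = [n for n in names if not excluded.match(n)]

# Alternative: (?!...) succeeds only when the name does NOT end in
# one of the listed extensions, so a plain match() selects the keepers.
wanted = re.compile(r'(?!.*\.(?:exe|dll|ocx|py)$)')
kept_lookahead = [n for n in names if wanted.match(n)]

print(kept)
print(kept_lookahead)
```

The lookahead form is useful when an API accepts only a pattern and offers no way to invert the result.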

-- 
Cheers,
Simon B.


RE: regexp help

2009-11-04 Thread Nadav Chernin
No, I need all files except exe|dll|ocx|py

-----Original Message-----
From: simon.brunn...@gmail.com [mailto:simon.brunn...@gmail.com] On Behalf Of 
Simon Brunning
Sent: Wednesday, 04 November 2009 19:13
To: Nadav Chernin
Cc: Python List
Subject: Re: regexp help

2009/11/4 Nadav Chernin :
> Thanks, but my question is how to write the regex.

re.match(r'.*\.(exe|dll|ocx|py)$', the_file_name) works for me.

-- 
Cheers,
Simon B.


Re: regexp help

2009-11-04 Thread Carsten Haese
Nadav Chernin wrote:
> Thanks, but my question is how to write the regex.

See http://www.amk.ca/python/howto/regex/ .

--
Carsten Haese
http://informixdb.sourceforge.net



Re: regexp help

2009-11-04 Thread Simon Brunning
2009/11/4 Nadav Chernin :
> Thanks, but my question is how to write the regex.

re.match(r'.*\.(exe|dll|ocx|py)$', the_file_name) works for me.

-- 
Cheers,
Simon B.


RE: regexp help

2009-11-04 Thread Nadav Chernin
Thanks, but my question is how to write the regex.

-----Original Message-----
From: simon.brunn...@gmail.com [mailto:simon.brunn...@gmail.com] On Behalf Of 
Simon Brunning
Sent: Wednesday, 04 November 2009 18:44
To: Nadav Chernin; Python List
Subject: Re: regexp help

2009/11/4 Nadav Chernin :
> I’m trying to write a regexp that finds all files that do not have one of
> these extensions:  exe|dll|ocx|py,  but can’t find any command that does it.

http://code.activestate.com/recipes/499305/ should be a good start.
Use the re module and your regex instead of fnmatch.filter(), and you
should be good to go.

-- 
Cheers,
Simon B.


Re: regexp help

2009-11-04 Thread Simon Brunning
2009/11/4 Nadav Chernin :
> I’m trying to write a regexp that finds all files that do not have one of
> these extensions:  exe|dll|ocx|py,  but can’t find any command that does it.

http://code.activestate.com/recipes/499305/ should be a good start.
Use the re module and your regex instead of fnmatch.filter(), and you
should be good to go.

-- 
Cheers,
Simon B.


Re: regexp help

2009-08-27 Thread Paul McGuire
On Aug 27, 1:15 pm, Bakes  wrote:
> If I were using the code:
>
> (?P[0-9]+)
>
> to get an integer between 0 and 9, how would I allow it to register
> negative integers as well?

With that + sign in there, you will actually get an integer from 0 to
9...

-- Paul


Re: regexp help

2009-08-27 Thread Mart.
On Aug 27, 7:15 pm, Bakes  wrote:
> If I were using the code:
>
> (?P[0-9]+)
>
> to get an integer between 0 and 9, how would I allow it to register
> negative integers as well?

-?


Re: regexp help

2009-08-27 Thread Peter Pearson
On Thu, 27 Aug 2009 11:15:59 -0700 (PDT), Bakes  wrote:
> If I were using the code:
>
> (?P[0-9]+)
>
> to get an integer between 0 and 9, how would I allow it to register
> negative integers as well?

(?P-?[0-9]+)
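
The group name in the posted patterns did not survive in this archive, so the sketch below uses a made-up name, `num`, purely for illustration. It shows the optional leading minus inside a named group, in Python 3 syntax.

```python
import re

# "num" is a hypothetical group name; the original name is unknown.
signed_int = re.compile(r'(?P<num>-?[0-9]+)')

m = signed_int.search("offset is -42 units")
value = int(m.group('num'))

# With a single group, findall returns the text of that group.
all_values = [int(x) for x in signed_int.findall("3 -7 10")]

print(value)
print(all_values)
```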

-- 
To email me, substitute nowhere->spamcop, invalid->net.


Re: regexp help

2009-08-27 Thread Iuri
You can use r"[+-]?\d+" to get positive and negative integers.

It matches these strings: "+123", "-123", "123"
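
A quick way to check that claim is sketched below. It assumes Python 3.4 or later for re.fullmatch, which requires the whole string to match; re.match would also accept a string such as "12a" because it only anchors at the start.

```python
import re

signed = re.compile(r'[+-]?\d+')

# Candidate strings are made up for the example.
candidates = ["+123", "-123", "123", "12a", "--5"]
results = {text: bool(signed.fullmatch(text)) for text in candidates}

for text, ok in results.items():
    print(text, ok)
```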



On Thu, Aug 27, 2009 at 3:15 PM, Bakes  wrote:

> If I were using the code:
>
> (?P[0-9]+)
>
> to get an integer between 0 and 9, how would I allow it to register
> negative integers as well?


Re: regexp help

2008-05-09 Thread Paul McGuire
On May 9, 6:52 pm, John Machin <[EMAIL PROTECTED]> wrote:
> Paul McGuire wrote:
> > from re import *
>
> Perhaps you intended "import re".

Indeed I did.

> 
>
> > Both print "prince".
>
> No they don't. The result is "NameError: name 're' is not defined".

Dang, now how did that work in my script?  I assure you I did test it
before posting.

Ah!  My pyparsing prototype preceded the regex version in the same
script, and importing the pyparsing module imports re using "import
re".  That is why I didn't get NameError.  Sorry for sloppy posting...

Once you clean up the mistakes, you essentially get the same code as
earlier posted by Matimus.

-- Paul


Re: regexp help

2008-05-09 Thread John Machin

globalrev wrote:

The inner pair of () are not necessary.


yes they are?


You are correct. I was having a flashback to a dimly remembered previous 
incarnation during which I used regexp software in which something like 
& or \0 denoted the whole match (like MatchObject.group(0)) :-)
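
Python's re module does have an equivalent for the replacement string: `\g<0>` stands for the entire match. A minimal sketch, assuming that syntax is acceptable, in which the capturing parentheses really are unnecessary:

```python
import re

CONSONANTS = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
# No capturing group: the character class alone is the whole match.
pattern = re.compile("[%s]" % CONSONANTS)

def encrypt(phrase):
    # \g<0> inserts the entire matched text, here one consonant.
    return pattern.sub(r"\g<0>o\g<0>", phrase)

print(encrypt("prince"))
```

With the `\1` form used elsewhere in the thread, the parentheses are required, as globalrev pointed out.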



Re: regexp help

2008-05-09 Thread globalrev
> The inner pair of () are not necessary.

yes they are?


ty anyway, got it now.


Re: regexp help

2008-05-09 Thread John Machin

globalrev wrote:

ty. that was the decrypt function. i am also writing an encrypt
function.

def encrypt(phrase):
    pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")


The inner pair of () are not necessary.


    return pattern.sub(r"1\o\1", phrase)

doesnt work though, h becomes 1\\oh.


To be precise, "h" becomes "1\\oh", which is the same as r"1\oh". There 
is only one backslash in the result.


It's doing exactly what you told it to do: replace each consonant by
(1) the character '1'
(2) a backslash
(3) the character 'o'
(4) the consonant




def encrypt(phrase):
    pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
    return pattern.sub(r"o\1", phrase)

returns oh.


It's doing exactly what you told it to do: replace each consonant by
(1) the character 'o'
(2) the consonant


i want hoh.


So tell it to do that:
return pattern.sub(r"\1o\1", phrase)


i dont quite get it.why cant i delimit pattern with \


Perhaps you could explain what you mean by "delimit pattern with \".
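
Putting the advice above together, a sketch of the working encrypt function in Python 3 syntax. In the replacement template, `\1` is a backreference to the captured consonant, while a bare `1` is just the digit. Note also that Python 3.7 and later reject a template such as r"1\o\1" with an error, because `\o` is an unknown escape.

```python
import re

# Compiled once at module level; the name CONSONANT is illustrative.
CONSONANT = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")

def encrypt(phrase):
    # Each consonant becomes: consonant, the letter 'o', consonant.
    return CONSONANT.sub(r"\1o\1", phrase)

print(encrypt("h"))
print(encrypt("hello"))
```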



Re: regexp help

2008-05-09 Thread globalrev
ty. that was the decrypt function. i am also writing an encrypt
function.

def encrypt(phrase):
    pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
    return pattern.sub(r"1\o\1", phrase)

doesnt work though, h becomes 1\\oh.


def encrypt(phrase):
    pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
    return pattern.sub(r"o\1", phrase)

returns oh.

i want hoh.

i dont quite get it.why cant i delimit pattern with \


Re: regexp help

2008-05-09 Thread John Machin

Paul McGuire wrote:

from re import *


Perhaps you intended "import re".


vowels = "aAeEiIoOuU"
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
encodeRe = re.compile(r"([%s])[%s]\1" % (cons,vowels))
print encodeRe.sub(r"\1",s)

This is actually a little more complex than you asked - it will search
for any consonant-vowel-same_consonant triple, and replace it with the
leading consonant.  To meet your original request, change to:

from re import *


And again.


cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
encodeRe = re.compile(r"([%s])o\1" % cons)
print encodeRe.sub(r"\1",s)

Both print "prince".



No they don't. The result is "NameError: name 're' is not defined".


Re: regexp help

2008-05-09 Thread Matimus
On May 9, 3:19 pm, globalrev <[EMAIL PROTECTED]> wrote:
> i want to do a little string manipulation and im looking into regexps. i
> couldnt find out how to do:
> s = 'poprorinoncoce'
> re.sub('$o$', '$', s)
> should result in 'prince'
>
> $ is obv the wrong character to use but what i mean is the pattern is
> "consonant o consonant" and should be replaced by just "consonant".
> both consonants should be the same too.
> so mole would be mole
> mom would be m etc

>>> import re
>>> s = 'poprorinoncoce'
>>> coc = re.compile(r"(.)o\1")
>>> coc.sub(r'\1', s)
'prince'
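
One caveat with `(.)o\1` is that the dot accepts any character, not only consonants. The sketch below, in Python 3 syntax with made-up test strings, compares it with a version restricted to a consonant class like the one used elsewhere in the thread.

```python
import re

CONS = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
any_char = re.compile(r"(.)o\1")
cons_only = re.compile(r"([%s])o\1" % CONS)

# Both patterns decode the example from the question.
print(any_char.sub(r"\1", "poprorinoncoce"))
print(cons_only.sub(r"\1", "poprorinoncoce"))

# They differ on non-consonants: "1o1" is collapsed only by the dot.
print(any_char.sub(r"\1", "1o1 mole"))
print(cons_only.sub(r"\1", "1o1 mole"))
```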

Matt


Re: regexp help

2008-05-09 Thread Paul McGuire
On May 9, 5:19 pm, globalrev <[EMAIL PROTECTED]> wrote:
> i want to do a little string manipulation and im looking into regexps. i
> couldnt find out how to do:
> s = 'poprorinoncoce'
> re.sub('$o$', '$', s)
> should result in 'prince'
>
> $ is obv the wrong character to use but what i mean is the pattern is
> "consonant o consonant" and should be replaced by just "consonant".
> both consonants should be the same too.
> so mole would be mole
> mom would be m etc

from re import *
vowels = "aAeEiIoOuU"
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
encodeRe = re.compile(r"([%s])[%s]\1" % (cons,vowels))
print encodeRe.sub(r"\1",s)

This is actually a little more complex than you asked - it will search
for any consonant-vowel-same_consonant triple, and replace it with the
leading consonant.  To meet your original request, change to:

from re import *
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
encodeRe = re.compile(r"([%s])o\1" % cons)
print encodeRe.sub(r"\1",s)

Both print "prince".

-- Paul

(I have a pyparsing solution too, but I just used it to prototype up
the solution, then converted it to regex.)


Re: RegExp Help

2007-12-14 Thread Sean DiZazzo
On Dec 14, 3:06 am, "Gabriel Genellina" <[EMAIL PROTECTED]>
wrote:
> On Fri, 14 Dec 2007 06:06:21 -0300, Sean DiZazzo <[EMAIL PROTECTED]>
> wrote:
>
>
>
> > On Dec 14, 12:04 am, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> >> On Thu, 13 Dec 2007 17:49:20 -0800, Sean DiZazzo wrote:
> >> > I'm wrapping up a command line util that returns xml in Python.  The
> >> > util is flaky, and gives me back poorly formed xml with different
> >> > problems in different cases.  Anyway I'm making progress.  I'm not
> >> > very good at regular expressions though and was wondering if someone
> >> > could help with initially splitting the tags from the stdout returned
> >> > from the util.
>
> >> Flaky XML is often produced by programs that treat XML as ordinary text
> >> files. If you are starting to parse XML with regular expressions you are
> >> making the very same mistake.  XML may look somewhat simple but
> >> producing correct XML and parsing it isn't.  Sooner or later you stumble
> >> across something that breaks producing or parsing the "naive" way.
>
> > It's not really complicated xml so far, just tags with attributes.
> > Still, using different queries against the program sometimes offers
> > differing results...a few examples:
>
> > 
> > 
> > 
> > 
>
> Ouch... only the second is valid xml. Most tools require at least a well  
> formed document. You may try using BeautifulStoneSoup, included with  
> BeautifulSoup http://crummy.com/software/BeautifulSoup/
>
> > I found something that works, although I couldn't tell you why it
> > works.  :)
> >  retag = re.compile(r'<.+?>', re.DOTALL)
> > tags = retag.findall(simplified)
> >  Why does that work?
>
> That means: "look for a less-than sign (<), followed by the shortest  
> sequence of (?) one or more (+) arbitrary characters (.), followed by a  
> greater-than sign (>)"
>
> If you never get nested tags, and never have a ">" inside an attribute,  
> that expression *might* work. But please try BeautifulStoneSoup, it uses a  
> lot of heuristics trying to guess the right structure. Doesn't work  
> always, but given your input, there isn't much one can do...
>
> --
> Gabriel Genellina

Thanks!  I'll take a look at BeautifulStoneSoup today and see what I
get.

~Sean


Re: RegExp Help

2007-12-14 Thread Gabriel Genellina
On Fri, 14 Dec 2007 06:06:21 -0300, Sean DiZazzo <[EMAIL PROTECTED]>
wrote:

> On Dec 14, 12:04 am, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
>> On Thu, 13 Dec 2007 17:49:20 -0800, Sean DiZazzo wrote:
>> > I'm wrapping up a command line util that returns xml in Python.  The
>> > util is flaky, and gives me back poorly formed xml with different
>> > problems in different cases.  Anyway I'm making progress.  I'm not
>> > very good at regular expressions though and was wondering if someone
>> > could help with initially splitting the tags from the stdout returned
>> > from the util.
>>
>> Flaky XML is often produced by programs that treat XML as ordinary text
>> files. If you are starting to parse XML with regular expressions you are
>> making the very same mistake.  XML may look somewhat simple but
>> producing correct XML and parsing it isn't.  Sooner or later you stumble
>> across something that breaks producing or parsing the "naive" way.
>>
> It's not really complicated xml so far, just tags with attributes.
> Still, using different queries against the program sometimes offers
> differing results...a few examples:
>
> 
> 
> 
> 

Ouch... only the second is valid xml. Most tools require at least a well  
formed document. You may try using BeautifulStoneSoup, included with  
BeautifulSoup http://crummy.com/software/BeautifulSoup/

> I found something that works, although I couldn't tell you why it
> works.  :)
>  retag = re.compile(r'<.+?>', re.DOTALL)
> tags = retag.findall(simplified)
>  Why does that work?

That means: "look for a less-than sign (<), followed by the shortest  
sequence of (?) one or more (+) arbitrary characters (.), followed by a  
greater-than sign (>)"
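
The difference the `?` makes can be seen in a small sketch. The sample text is made up, since the tags in the original post did not survive in this archive, and the code uses Python 3 syntax.

```python
import re

# Hypothetical input: two tags separated by junk, one tag containing
# a newline inside an attribute value.
text = 'junk <tag1 a="1"> more junk\n<tag2 b="2\n" /> trailing'

# Lazy: .+? stops at the first ">" it can reach.
lazy = re.findall(r'<.+?>', text, re.DOTALL)

# Greedy: .+ runs to the last ">" in the whole string.
greedy = re.findall(r'<.+>', text, re.DOTALL)

print(lazy)
print(greedy)
```

re.DOTALL is what lets the dot cross the newline inside the second tag; without it neither pattern would match that tag.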

If you never get nested tags, and never have a ">" inside an attribute,  
that expression *might* work. But please try BeautifulStoneSoup, it uses a  
lot of heuristics trying to guess the right structure. Doesn't work  
always, but given your input, there isn't much one can do...


-- 
Gabriel Genellina



Re: RegExp Help

2007-12-14 Thread Sean DiZazzo
On Dec 14, 12:04 am, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Thu, 13 Dec 2007 17:49:20 -0800, Sean DiZazzo wrote:
> > I'm wrapping up a command line util that returns xml in Python.  The
> > util is flaky, and gives me back poorly formed xml with different
> > problems in different cases.  Anyway I'm making progress.  I'm not
> > very good at regular expressions though and was wondering if someone
> > could help with initially splitting the tags from the stdout returned
> > from the util.
>
> > [...]
>
> > Can anyone help me?
>
> Flaky XML is often produced by programs that treat XML as ordinary text
> files. If you are starting to parse XML with regular expressions you are
> making the very same mistake.  XML may look somewhat simple but
> producing correct XML and parsing it isn't.  Sooner or later you stumble
> across something that breaks producing or parsing the "naive" way.
>
> Ciao,
> Marc 'BlackJack' Rintsch

It's not really complicated xml so far, just tags with attributes.
Still, using different queries against the program sometimes offers
differing results...a few examples:






It's consistent (at least) in that consistent queries always return
consistent tag styles.  It's returned to stdout with some extra
useless information, so the original question was to help get to just
the tags. After getting the tags, I'm running them through some
functions to fix them, and then using elementtree to parse them and
get all the rest of the info.

There is no api, so this is what I have to work with.  Is there a
better solution?

Thanks for your ideas.

~Sean


Re: RegExp Help

2007-12-14 Thread Marc 'BlackJack' Rintsch
On Thu, 13 Dec 2007 17:49:20 -0800, Sean DiZazzo wrote:

> I'm wrapping up a command line util that returns xml in Python.  The
> util is flaky, and gives me back poorly formed xml with different
> problems in different cases.  Anyway I'm making progress.  I'm not
> very good at regular expressions though and was wondering if someone
> could help with initially splitting the tags from the stdout returned
> from the util.
> 
> […]
> 
> Can anyone help me?

Flaky XML is often produced by programs that treat XML as ordinary text
files. If you are starting to parse XML with regular expressions you are
making the very same mistake.  XML may look somewhat simple but
producing correct XML and parsing it isn't.  Sooner or later you stumble
across something that breaks producing or parsing the "naive" way.

Ciao,
Marc 'BlackJack' Rintsch

Re: RegExp Help

2007-12-13 Thread Sean DiZazzo
On Dec 13, 5:49 pm, Sean DiZazzo <[EMAIL PROTECTED]> wrote:
> Hi group,
>
> I'm wrapping up a command line util that returns xml in Python.  The
> util is flaky, and gives me back poorly formed xml with different
> problems in different cases.  Anyway I'm making progress.  I'm not
> very good at regular expressions though and was wondering if someone
> could help with initially splitting the tags from the stdout returned
> from the util.
>
> I have the following example string, and am simply trying to split it
> into two xml tags...
>
> simplified = """2007-12-13 
> \n2007-12-13 
> \n"""
>
> Basically I want the two tags, and to discard anything in between
> using a reg exp.  Like this:
>
> tags = ["", " attr1="text1" attr2="text2" attr3="text3\n" /tag2>"]
>
> I've tried several approaches, some of which got close, but the
> newline in the middle of one of the tags screwed it up.  The closest
> I've been is something like this:
>
> retag = re.compile(r'<.+>*') # tried here with re.DOTALL as well
> tags = re.findall(retag)
>
> Can anyone help me?
>
> ~Sean

I found something that works, although I couldn't tell you why it
works.  :)

retag = re.compile(r'<.+?>', re.DOTALL)
tags = retag.findall(simplified)

Why does that work?

~Sean