Re: RegExp help
On 2016-02-11 03:09, Larry Martell wrote:
> On Wed, Feb 10, 2016 at 10:00 PM, MRAB wrote:
>> On 2016-02-11 02:48, Larry Martell wrote:
>>> Given this string:
>>>
>>>     >>> s = """|Type=Foo
>>>     ... |Side=Left"""
>>>     >>> print s
>>>     |Type=Foo
>>>     |Side=Left
>>>
>>> I can match with this:
>>>
>>>     >>> m = re.search(r'^\|Type=(.*)$\n^\|Side=(.*)$',s,re.MULTILINE)
>>>     >>> print m.group(0)
>>>     |Type=Foo
>>>     |Side=Left
>>>     >>> print m.group(1)
>>>     Foo
>>>     >>> print m.group(2)
>>>     Left
>>>
>>> But when I try and sub it doesn't work:
>>>
>>>     >>> rn = re.sub(r'^\|Type=(.*)$^\|Side=(.*)$', r'|Side Type=\2 \1',s,re.MULTILINE)
>>>     >>> print rn
>>>     |Type=Foo
>>>     |Side=Left
>>>
>>> What very stupid thing am I doing wrong?
>>
>> The 4th argument of re.sub is the count.
>
> Thanks. Turned out that this site is running 2.6 and that doesn't
> support the flags arg to sub. So I had to change it to:
>
>     re.sub(r'\|Type=(.*)\n\|Side=(.*)', r'\|Side Type=\2 \1',s)

You could've used the inline flag "(?m)" in the pattern. The pattern also needs the \n between $ and ^, as in your re.search call, because neither anchor consumes the newline:

    rn = re.sub(r'(?m)^\|Type=(.*)$\n^\|Side=(.*)$', r'|Side Type=\2 \1',s)

-- https://mail.python.org/mailman/listinfo/python-list
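The fix discussed in this message can be sketched as follows. This is a minimal illustration, assuming Python 2.7 or later (where re.sub accepts a flags keyword); the names pattern, repl, wrong, with_flags and inline are illustrative and do not come from the thread. The sketch keeps the \n between $ and ^ that the working re.search call had, since the anchors themselves do not consume the newline.

```python
import re

s = "|Type=Foo\n|Side=Left"
pattern = r'^\|Type=(.*)$\n^\|Side=(.*)$'
repl = r'|Side Type=\2 \1'

# re.sub(pattern, repl, s, re.MULTILINE) binds re.MULTILINE (the integer 8)
# to the count parameter, so no flags are set. Spelled out explicitly:
wrong = re.sub(pattern, repl, s, count=int(re.MULTILINE))

# Passing the flag by keyword, or inline with (?m), enables multiline anchors.
with_flags = re.sub(pattern, repl, s, flags=re.MULTILINE)
inline = re.sub('(?m)' + pattern, repl, s)

print(wrong == s)    # the string comes back unchanged
print(with_flags)
print(inline)
```

Passing flags by keyword is the less error-prone habit, because count and flags are both integers and a mix-up raises no error.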
Re: RegExp help
On Wed, Feb 10, 2016 at 10:00 PM, MRAB wrote:
> On 2016-02-11 02:48, Larry Martell wrote:
>> Given this string:
>>
>>     >>> s = """|Type=Foo
>>     ... |Side=Left"""
>>     >>> print s
>>     |Type=Foo
>>     |Side=Left
>>
>> I can match with this:
>>
>>     >>> m = re.search(r'^\|Type=(.*)$\n^\|Side=(.*)$',s,re.MULTILINE)
>>     >>> print m.group(0)
>>     |Type=Foo
>>     |Side=Left
>>     >>> print m.group(1)
>>     Foo
>>     >>> print m.group(2)
>>     Left
>>
>> But when I try and sub it doesn't work:
>>
>>     >>> rn = re.sub(r'^\|Type=(.*)$^\|Side=(.*)$', r'|Side Type=\2 \1',s,re.MULTILINE)
>>     >>> print rn
>>     |Type=Foo
>>     |Side=Left
>>
>> What very stupid thing am I doing wrong?
>
> The 4th argument of re.sub is the count.

Thanks. Turned out that this site is running 2.6 and that doesn't support the flags arg to sub. So I had to change it to:

    re.sub(r'\|Type=(.*)\n\|Side=(.*)', r'\|Side Type=\2 \1',s)
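The flag-free workaround described above can be sketched like this. It is an illustration rather than the poster's exact code: the replacement string here drops the backslash before the pipe, on the assumption that a plain "|" is what was wanted in the output (a pipe has no special meaning in a replacement template). The name result is illustrative.

```python
import re

s = "|Type=Foo\n|Side=Left"

# Without ^ and $ there is nothing for MULTILINE to change, and "." stops at
# the newline by default, so each group captures the rest of its own line.
result = re.sub(r'\|Type=(.*)\n\|Side=(.*)', r'|Side Type=\2 \1', s)

print(result)
```

Dropping the anchors means the pattern could also match in the middle of a line, which is harmless for this input but worth keeping in mind for messier data.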
Re: RegExp help
On 2016-02-11 02:48, Larry Martell wrote:
> Given this string:
>
>     >>> s = """|Type=Foo
>     ... |Side=Left"""
>     >>> print s
>     |Type=Foo
>     |Side=Left
>
> I can match with this:
>
>     >>> m = re.search(r'^\|Type=(.*)$\n^\|Side=(.*)$',s,re.MULTILINE)
>     >>> print m.group(0)
>     |Type=Foo
>     |Side=Left
>     >>> print m.group(1)
>     Foo
>     >>> print m.group(2)
>     Left
>
> But when I try and sub it doesn't work:
>
>     >>> rn = re.sub(r'^\|Type=(.*)$^\|Side=(.*)$', r'|Side Type=\2 \1',s,re.MULTILINE)
>     >>> print rn
>     |Type=Foo
>     |Side=Left
>
> What very stupid thing am I doing wrong?

The 4th argument of re.sub is the count.
Re: regexp help
Simon Brunning wrote:
> 2009/11/4 Nadav Chernin :
>> Thanks, but my question is how to write the regex.
>
> re.match(r'.*\.(exe|dll|ocx|py)$', the_file_name) works for me.

How about:

    os.path.splitext(x)[1] in (".exe", ".dll", ".ocx", ".py")

DaveA
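The os.path.splitext suggestion above can be turned into a small filter. This is a sketch under stated assumptions: the function name is_wanted and the constant EXCLUDED are made up for illustration, the test is negated because the original poster wants the files that do not have these extensions, and the lower() call is an added assumption that extensions should be compared case-insensitively.

```python
import os.path

# Extensions to leave out (taken from the thread); the name is illustrative.
EXCLUDED = (".exe", ".dll", ".ocx", ".py")


def is_wanted(file_name):
    """Return True when file_name does not end in one of the excluded extensions."""
    extension = os.path.splitext(file_name)[1]
    # lower() is an assumption: treat ".DLL" the same as ".dll".
    return extension.lower() not in EXCLUDED


names = ["setup.exe", "notes.txt", "helper.py", "README", "LIB.DLL"]
print([name for name in names if is_wanted(name)])
```

This avoids a regular expression altogether, which tends to be easier to read when the only question is what a file name ends with.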
Re: regexp help
2009/11/4 Nadav Chernin :
> No, I need all files except exe|dll|ocx|py

    not re.match(r'.*\.(exe|dll|ocx|py)$', the_file_name)

Now that wasn't so hard, was it? ;-)

--
Cheers,
Simon B.
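Applied to a list of names, the negated match above might look like the following sketch. The helper name keep_other_files and the sample file names are hypothetical; only the pattern itself comes from the thread.

```python
import re

# Pattern from the thread: any name ending in .exe, .dll, .ocx or .py.
EXCLUDED = re.compile(r'.*\.(exe|dll|ocx|py)$')


def keep_other_files(names):
    """Return the names that do NOT end in one of the excluded extensions."""
    return [name for name in names if not EXCLUDED.match(name)]


print(keep_other_files(["a.exe", "b.txt", "c.py", "d.pyc"]))
```

Negating the result of the match in Python is usually simpler than trying to express "does not end with" inside the pattern itself.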
RE: regexp help
No, I need all files except exe|dll|ocx|py

-----Original Message-----
From: simon.brunn...@gmail.com [mailto:simon.brunn...@gmail.com] On Behalf Of Simon Brunning
Sent: Wed 04 November 2009 19:13
To: Nadav Chernin
Cc: Python List
Subject: Re: regexp help

2009/11/4 Nadav Chernin :
> Thanks, but my question is how to write the regex.

re.match(r'.*\.(exe|dll|ocx|py)$', the_file_name) works for me.

--
Cheers,
Simon B.
Re: regexp help
Nadav Chernin wrote:
> Thanks, but my question is how to write the regex.

See http://www.amk.ca/python/howto/regex/ .

--
Carsten Haese
http://informixdb.sourceforge.net
Re: regexp help
2009/11/4 Nadav Chernin :
> Thanks, but my question is how to write the regex.

re.match(r'.*\.(exe|dll|ocx|py)$', the_file_name) works for me.

--
Cheers,
Simon B.
RE: regexp help
Thanks, but my question is how to write the regex.

-----Original Message-----
From: simon.brunn...@gmail.com [mailto:simon.brunn...@gmail.com] On Behalf Of Simon Brunning
Sent: Wed 04 November 2009 18:44
To: Nadav Chernin; Python List
Subject: Re: regexp help

2009/11/4 Nadav Chernin :
> I’m trying to write a regexp that finds all files that do not have one of
> these extensions: exe|dll|ocx|py, but I can’t find any command that does it.

http://code.activestate.com/recipes/499305/ should be a good start. Use the re module and your regex instead of fnmatch.filter(), and you should be good to go.

--
Cheers,
Simon B.
Re: regexp help
2009/11/4 Nadav Chernin :
> I’m trying to write a regexp that finds all files that do not have one of
> these extensions: exe|dll|ocx|py, but I can’t find any command that does it.

http://code.activestate.com/recipes/499305/ should be a good start. Use the re module and your regex instead of fnmatch.filter(), and you should be good to go.

--
Cheers,
Simon B.
Re: regexp help
On Aug 27, 1:15 pm, Bakes wrote:
> If I were using the code:
>
> (?P[0-9]+)
>
> to get an integer between 0 and 9, how would I allow it to register
> negative integers as well?

With that + sign in there, you will actually get an integer from 0 to 9...

-- Paul
Re: regexp help
On Aug 27, 7:15 pm, Bakes wrote:
> If I were using the code:
>
> (?P[0-9]+)
>
> to get an integer between 0 and 9, how would I allow it to register
> negative integers as well?

-?
Re: regexp help
On Thu, 27 Aug 2009 11:15:59 -0700 (PDT), Bakes wrote:
> If I were using the code:
>
> (?P[0-9]+)
>
> to get an integer between 0 and 9, how would I allow it to register
> negative integers as well?

(?P-?[0-9]+)

--
To email me, substitute nowhere->spamcop, invalid->net.
Re: regexp help
You can use r"[+-]?\d+" to match positive and negative integers. It matches each of these strings: "+123", "-123", "123"

On Thu, Aug 27, 2009 at 3:15 PM, Bakes wrote:
> If I were using the code:
>
> (?P[0-9]+)
>
> to get an integer between 0 and 9, how would I allow it to register
> negative integers as well?
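The answers in this thread can be combined into one sketch. The group name in the original question was lost from the archive, so the name "number" below is purely hypothetical, as are SIGNED_INT and parse_signed_int. The sketch assumes Python 3.4 or later for fullmatch.

```python
import re

# "number" is a hypothetical group name; the original name is not recoverable.
# [+-]? makes a leading sign optional, [0-9]+ takes one or more digits.
SIGNED_INT = re.compile(r'(?P<number>[+-]?[0-9]+)')


def parse_signed_int(text):
    """Return the integer value of text, or None if text is not a signed integer."""
    match = SIGNED_INT.fullmatch(text)
    return int(match.group('number')) if match else None


for sample in ("+123", "-123", "123", "12a"):
    print(sample, parse_signed_int(sample))
```

Using fullmatch rather than search means trailing junk such as "12a" is rejected instead of yielding 12.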
Re: regexp help
On May 9, 6:52 pm, John Machin <[EMAIL PROTECTED]> wrote:
> Paul McGuire wrote:
>> from re import *
>
> Perhaps you intended "import re".

Indeed I did.

>> Both print "prince".
>
> No they don't. The result is "NameError: name 're' is not defined".

Dang, now how did that work in my script? I assure you I did test it before posting.

Ah! My pyparsing prototype preceded the regex version in the same script, and importing the pyparsing module imports re using "import re". That is why I didn't get NameError.

Sorry for sloppy posting... Once you clean up the mistakes, you essentially get the same code as earlier posted by Matimus.

-- Paul
Re: regexp help
globalrev wrote:
>> The inner pair of () are not necessary.
>
> yes they are?

You are correct. I was having a flashback to a dimly remembered previous incarnation during which I used regexp software in which something like & or \0 denoted the whole match (like MatchObject.group(0)) :-)
Re: regexp help
> The inner pair of () are not necessary.

yes they are? ty anyway, got it now.
Re: regexp help
globalrev wrote:
> ty. that was the decrypt function. i am also writing an encrypt function.
>
> def encrypt(phrase):
>     pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")

The inner pair of () are not necessary.

>     return pattern.sub(r"1\o\1", phrase)
>
> doesn't work though, h becomes 1\\oh.

To be precise, "h" becomes "1\\oh", which is the same as r"1\oh". There is only one backslash in the result. It's doing exactly what you told it to do: replace each consonant by
(1) the character '1'
(2) a backslash
(3) the character 'o'
(4) the consonant

> def encrypt(phrase):
>     pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
>     return pattern.sub(r"o\1", phrase)
>
> returns oh.

It's doing exactly what you told it to do: replace each consonant by
(1) the character 'o'
(2) the consonant

> i want hoh.

So tell it to do that:

    return pattern.sub(r"\1o\1", phrase)

> i don't quite get it. why can't i delimit pattern with \

Perhaps you could explain what you mean by "delimit pattern with \".
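Putting the corrected replacement into the poster's function gives the sketch below. The function and pattern follow the thread; the module-level name CONSONANT and the second function encrypt_whole_match are illustrative additions. The second function shows one way to drop the capturing parentheses, using the \g<0> template reference to the whole match.

```python
import re

# One capturing group around the consonant class, so \1 can refer to it.
CONSONANT = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")


def encrypt(phrase):
    """Replace each consonant c with c + 'o' + c, e.g. 'h' becomes 'hoh'."""
    return CONSONANT.sub(r"\1o\1", phrase)


def encrypt_whole_match(phrase):
    """Same result without a capturing group: \\g<0> means the whole match."""
    return re.sub(r"[bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ]", r"\g<0>o\g<0>", phrase)


print(encrypt("h"))
print(encrypt("prince"))
```

In a replacement template \1 is a backreference to group 1, whereas a bare 1 followed by \o is just the character 1 and then a stray escape, which is why the earlier attempts produced odd output.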
Re: regexp help
ty. that was the decrypt function. i am also writing an encrypt function.

def encrypt(phrase):
    pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
    return pattern.sub(r"1\o\1", phrase)

doesn't work though, h becomes 1\\oh.

def encrypt(phrase):
    pattern = re.compile(r"([bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ])")
    return pattern.sub(r"o\1", phrase)

returns oh. i want hoh.

i don't quite get it. why can't i delimit pattern with \
Re: regexp help
Paul McGuire wrote:
> from re import *

Perhaps you intended "import re".

> vowels = "aAeEiIoOuU"
> cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
> encodeRe = re.compile(r"([%s])[%s]\1" % (cons,vowels))
> print encodeRe.sub(r"\1",s)
>
> This is actually a little more complex than you asked - it will search
> for any consonant-vowel-same_consonant triple, and replace it with the
> leading consonant. To meet your original request, change to:
>
> from re import *

And again.

> cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
> encodeRe = re.compile(r"([%s])o\1" % cons)
> print encodeRe.sub(r"\1",s)
>
> Both print "prince".

No they don't. The result is "NameError: name 're' is not defined".
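With the import corrected as suggested above, the two quoted variants can be sketched as follows. The code is adapted to Python 3 print syntax, and the names any_vowel and only_o are illustrative replacements for the reused name encodeRe; the patterns themselves are the ones quoted.

```python
import re  # "import re" binds the name re; "from re import *" does not

s = 'poprorinoncoce'
vowels = "aAeEiIoOuU"
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"

# Variant 1: consonant, any vowel, then the same consonant again.
any_vowel = re.compile(r"([%s])[%s]\1" % (cons, vowels))

# Variant 2: consonant, a literal "o", then the same consonant again.
only_o = re.compile(r"([%s])o\1" % cons)

print(any_vowel.sub(r"\1", s))
print(only_o.sub(r"\1", s))
```

The star import copies the module's public functions, such as compile and sub, into the current namespace but never creates a variable called re, which is what the NameError was reporting.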
Re: regexp help
On May 9, 3:19 pm, globalrev <[EMAIL PROTECTED]> wrote:
> i want to do a little string manipulation and im looking into regexps. i
> couldnt find out how to do:
> s = 'poprorinoncoce'
> re.sub('$o$', '$', s)
> should result in 'prince'
>
> $ is obv the wrong character to use but what i mean is the pattern is
> "consonant o consonant" and should be replaced by just "consonant".
> both consonants should be the same too.
> so mole would be mole
> mom would be m etc

>>> import re
>>> s = 'poprorinoncoce'
>>> coc = re.compile(r"(.)o\1")
>>> coc.sub(r'\1', s)
'prince'

Matt
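The backreference pattern in this reply can be checked against the examples from the question with a short sketch. The helper name decode is illustrative. One caveat, noted here as an observation rather than something raised in the thread: "." matches any character, not only consonants, so a run such as "ooo" is also collapsed.

```python
import re

# (.) captures any single character; \1 requires the same character after "o".
coc = re.compile(r"(.)o\1")


def decode(text):
    """Collapse every 'XoX' triple (same X on both sides) to a single X."""
    return coc.sub(r"\1", text)


for word in ("poprorinoncoce", "mole", "mom", "ooo"):
    print(word, "->", decode(word))
```

Replacing the dot with an explicit consonant character class restricts the match to consonants if that stricter behaviour is wanted.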
Re: regexp help
On May 9, 5:19 pm, globalrev <[EMAIL PROTECTED]> wrote:
> i want to do a little string manipulation and im looking into regexps. i
> couldnt find out how to do:
> s = 'poprorinoncoce'
> re.sub('$o$', '$', s)
> should result in 'prince'
>
> $ is obv the wrong character to use but what i mean is the pattern is
> "consonant o consonant" and should be replaced by just "consonant".
> both consonants should be the same too.
> so mole would be mole
> mom would be m etc

from re import *
vowels = "aAeEiIoOuU"
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
encodeRe = re.compile(r"([%s])[%s]\1" % (cons,vowels))
print encodeRe.sub(r"\1",s)

This is actually a little more complex than you asked - it will search for any consonant-vowel-same_consonant triple, and replace it with the leading consonant. To meet your original request, change to:

from re import *
cons = "bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
encodeRe = re.compile(r"([%s])o\1" % cons)
print encodeRe.sub(r"\1",s)

Both print "prince".

-- Paul

(I have a pyparsing solution too, but I just used it to prototype up the solution, then converted it to regex.)
Re: RegExp Help
On Dec 14, 3:06 am, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote:
> En Fri, 14 Dec 2007 06:06:21 -0300, Sean DiZazzo <[EMAIL PROTECTED]>
> escribió:
>
>> On Dec 14, 12:04 am, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
>>> On Thu, 13 Dec 2007 17:49:20 -0800, Sean DiZazzo wrote:
>>>> I'm wrapping up a command line util that returns xml in Python. The
>>>> util is flaky, and gives me back poorly formed xml with different
>>>> problems in different cases. Anyway I'm making progress. I'm not
>>>> very good at regular expressions though and was wondering if someone
>>>> could help with initially splitting the tags from the stdout returned
>>>> from the util.
>>>
>>> Flaky XML is often produced by programs that treat XML as ordinary text
>>> files. If you are starting to parse XML with regular expressions you are
>>> making the very same mistake. XML may look somewhat simple but
>>> producing correct XML and parsing it isn't. Sooner or later you stumble
>>> across something that breaks producing or parsing the "naive" way.
>>
>> It's not really complicated xml so far, just tags with attributes.
>> Still, using different queries against the program sometimes offers
>> differing results...a few examples:
>>
>
> Ouch... only the second is valid xml. Most tools require at least a well
> formed document. You may try using BeautifulStoneSoup, included with
> BeautifulSoup http://crummy.com/software/BeautifulSoup/
>
>> I found something that works, although I couldn't tell you why it
>> works. :)
>> retag = re.compile(r'<.+?>', re.DOTALL)
>> tags = retag.findall(simplified)
>> Why does that work?
>
> That means: "look for a less-than sign (<), followed by the shortest
> sequence of (?) one or more (+) arbitrary characters (.), followed by a
> greater-than sign (>)"
>
> If you never get nested tags, and never have a ">" inside an attribute,
> that expression *might* work. But please try BeautifulStoneSoup, it uses a
> lot of heuristics trying to guess the right structure. Doesn't work
> always, but given your input, there isn't much one can do...
>
> --
> Gabriel Genellina

Thanks! I'll take a look at BeautifulStoneSoup today and see what I get.

~Sean
Re: RegExp Help
En Fri, 14 Dec 2007 06:06:21 -0300, Sean DiZazzo <[EMAIL PROTECTED]> escribió:

> On Dec 14, 12:04 am, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
>> On Thu, 13 Dec 2007 17:49:20 -0800, Sean DiZazzo wrote:
>>> I'm wrapping up a command line util that returns xml in Python. The
>>> util is flaky, and gives me back poorly formed xml with different
>>> problems in different cases. Anyway I'm making progress. I'm not
>>> very good at regular expressions though and was wondering if someone
>>> could help with initially splitting the tags from the stdout returned
>>> from the util.
>>
>> Flaky XML is often produced by programs that treat XML as ordinary text
>> files. If you are starting to parse XML with regular expressions you are
>> making the very same mistake. XML may look somewhat simple but
>> producing correct XML and parsing it isn't. Sooner or later you stumble
>> across something that breaks producing or parsing the "naive" way.
>
> It's not really complicated xml so far, just tags with attributes.
> Still, using different queries against the program sometimes offers
> differing results...a few examples:
>

Ouch... only the second is valid xml. Most tools require at least a well formed document. You may try using BeautifulStoneSoup, included with BeautifulSoup http://crummy.com/software/BeautifulSoup/

> I found something that works, although I couldn't tell you why it
> works. :)
> retag = re.compile(r'<.+?>', re.DOTALL)
> tags = retag.findall(simplified)
> Why does that work?

That means: "look for a less-than sign (<), followed by the shortest sequence of (?) one or more (+) arbitrary characters (.), followed by a greater-than sign (>)"

If you never get nested tags, and never have a ">" inside an attribute, that expression *might* work. But please try BeautifulStoneSoup, it uses a lot of heuristics trying to guess the right structure. Doesn't work always, but given your input, there isn't much one can do...

--
Gabriel Genellina
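The explanation of the non-greedy pattern above can be seen in a small experiment. The sample string below is invented for illustration, since the original tool output did not survive in the archive; the variable names are likewise illustrative. It contains two tags separated by noise, with a newline inside the second tag.

```python
import re

# Hypothetical stand-in for the tool's output (not the poster's real data).
sample = 'noise <tag1 attr1="text1" /> more noise\n<tag2 attr2="text2\n" /> trailing'

# Non-greedy with DOTALL: stop at the first ">" after each "<", and let "."
# cross the newline inside the second tag.
lazy = re.compile(r'<.+?>', re.DOTALL).findall(sample)

# Greedy with DOTALL: ".+" runs to the LAST ">" in the whole string.
greedy = re.compile(r'<.+>', re.DOTALL).findall(sample)

# Non-greedy without DOTALL: "." cannot cross the newline inside tag2.
no_dotall = re.compile(r'<.+?>').findall(sample)

print(lazy)
print(len(greedy))
print(no_dotall)
```

As the message warns, this only holds while no attribute value contains a ">" and tags never nest; for anything less regular a tolerant parser is the safer tool.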
Re: RegExp Help
On Dec 14, 12:04 am, Marc 'BlackJack' Rintsch <[EMAIL PROTECTED]> wrote:
> On Thu, 13 Dec 2007 17:49:20 -0800, Sean DiZazzo wrote:
>> I'm wrapping up a command line util that returns xml in Python. The
>> util is flaky, and gives me back poorly formed xml with different
>> problems in different cases. Anyway I'm making progress. I'm not
>> very good at regular expressions though and was wondering if someone
>> could help with initially splitting the tags from the stdout returned
>> from the util.
>>
>> [...]
>>
>> Can anyone help me?
>
> Flaky XML is often produced by programs that treat XML as ordinary text
> files. If you are starting to parse XML with regular expressions you are
> making the very same mistake. XML may look somewhat simple but
> producing correct XML and parsing it isn't. Sooner or later you stumble
> across something that breaks producing or parsing the "naive" way.
>
> Ciao,
> Marc 'BlackJack' Rintsch

It's not really complicated xml so far, just tags with attributes. Still, using different queries against the program sometimes offers differing results...a few examples:

It's consistent (at least) in that consistent queries always return consistent tag styles. It's returned to stdout with some extra useless information, so the original question was to help get to just the tags. After getting the tags, I'm running them through some functions to fix them, and then using elementtree to parse them and get all the rest of the info.

There is no api, so this is what I have to work with. Is there a better solution?

Thanks for your ideas.

~Sean
Re: RegExp Help
On Thu, 13 Dec 2007 17:49:20 -0800, Sean DiZazzo wrote:
> I'm wrapping up a command line util that returns xml in Python. The
> util is flaky, and gives me back poorly formed xml with different
> problems in different cases. Anyway I'm making progress. I'm not
> very good at regular expressions though and was wondering if someone
> could help with initially splitting the tags from the stdout returned
> from the util.
>
> […]
>
> Can anyone help me?

Flaky XML is often produced by programs that treat XML as ordinary text files. If you are starting to parse XML with regular expressions you are making the very same mistake. XML may look somewhat simple but producing correct XML and parsing it isn't. Sooner or later you stumble across something that breaks producing or parsing the "naive" way.

Ciao,
Marc 'BlackJack' Rintsch
Re: RegExp Help
On Dec 13, 5:49 pm, Sean DiZazzo <[EMAIL PROTECTED]> wrote:
> Hi group,
>
> I'm wrapping up a command line util that returns xml in Python. The
> util is flaky, and gives me back poorly formed xml with different
> problems in different cases. Anyway I'm making progress. I'm not
> very good at regular expressions though and was wondering if someone
> could help with initially splitting the tags from the stdout returned
> from the util.
>
> I have the following example string, and am simply trying to split it
> into two xml tags...
>
> simplified = """2007-12-13
> \n2007-12-13
> \n"""
>
> Basically I want the two tags, and to discard anything in between
> using a reg exp. Like this:
>
> tags = ["", " attr1="text1" attr2="text2" attr3="text3\n" /tag2>"]
>
> I've tried several approaches, some of which got close, but the
> newline in the middle of one of the tags screwed it up. The closest
> I've been is something like this:
>
> retag = re.compile(r'<.+>*') # tried here with re.DOTALL as well
> tags = re.findall(retag)
>
> Can anyone help me?
>
> ~Sean

I found something that works, although I couldn't tell you why it works. :)

    retag = re.compile(r'<.+?>', re.DOTALL)
    tags = retag.findall(simplified)

Why does that work?

~Sean