Re: [Tutor] Multiple regex replacements, lists and for.

Evert Rol Tue, 12 Oct 2010 00:05:59 -0700

> I'm new to python and inexperienced in programming but I'm trying hard.
> I have a shell script that I'm converting over to python.
> Part of the script replaces some lines of text.
> I can do this in python, and get the output I want, but so far only using sed.
> Here's an example script:
> 
> import subprocess, re
> list = ['Apples       the color red', 'Sky    i am the blue color',
>         'Grass    the colour green is here', 'Sky   i am the blue color']
> 
> def oldway():
>       sed_replacements = """
> s/\(^\w*\).*red/\\1:RED/
> s/\(^\w*\).*blue.*/\\1:BLUE/"""
>       sed = subprocess.Popen(['sed', sed_replacements],
> stdin=subprocess.PIPE, stdout=subprocess.PIPE)
>       data = sed.communicate("\n".join(list))[:-1]
>       for x in data:
>               print x
> oldway();
> 
> """ This produces:
> 
>>>> Apples:RED
>>>> Sky:BLUE
>>>> Grass      the colour green is here
>>>> Sky:BLUE
> 
> Which is what I want"""
> 
> print "---------------"
> 
> def withoutsed():
>       replacements = [
> (r'.*red', 'RED'),
> (r'.*blue.*', 'BLUE')]
>       for z in list:
>               for x,y in replacements:
>                       if re.match(x, z):
>                               print re.sub(x,y,z)
>                               break
>                       else:
>                               print z
> withoutsed();
> 
> """ Produces:
> 
>>>> RED
>>>> Sky        i am the blue color
>>>> BLUE
>>>> Grass      the colour green is here
>>>> Grass      the colour green is here
>>>> Sky        i am the blue color
>>>> BLUE
> 
> Duplicate printing + other mess = I went wrong"""
> 
> I understand that it's doing what I tell it to, and that my for and if
> statements are wrong.



You should make your Python regexes more like the sed ones. re.sub() always 
returns a string, either changed or unchanged. So you can "pipe" the two 
necessary re.sub() calls into each other, like you do for sed: 
re.sub(pattern2, replacement2, re.sub(pattern1, replacement1, string)). That 
removes the inner for loop, because you can do all the replacements in one go.
re.sub() will return the original string if nothing matched (again like sed), 
so you can remove the if-statement with the re.match: re.sub() will leave the 
'Grass' sentence untouched, and it will still get printed.
Lastly, in your sed expression, you're capturing the first word of each line 
and substituting it back in the replacement, but you don't do that in 
re.sub(). Again, this is practically the same format, the only difference being 
that in Python regexes, you don't need to escape the grouping parentheses.

I can give you the full solution, but I hope this pointer in the right 
direction is good enough. All in all, your code can be as efficient in Python 
as in sed.
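
If you do get stuck, a minimal sketch of the idea could look like the code 
below. The names lines, RED, BLUE and without_sed are just my own choices for 
illustration (I also avoided calling the list "list", since that shadows the 
built-in), and I've written print with parentheses so it runs on both Python 2 
and 3. Treat it as one possible way to do it, not the only one.

```python
import re

lines = ['Apples       the color red', 'Sky    i am the blue color',
         'Grass    the colour green is here', 'Sky   i am the blue color']

# (pattern, replacement) pairs. (^\w*) captures the first word of the line,
# and \1 puts it back in the replacement. Raw strings keep the backslashes.
RED = (r'(^\w*).*red', r'\1:RED')
BLUE = (r'(^\w*).*blue.*', r'\1:BLUE')


def without_sed(lines):
    """Apply both substitutions to every line, like the two sed commands."""
    result = []
    for line in lines:
        # The inner re.sub() runs first; its return value (changed or not)
        # is the input string for the outer re.sub().
        result.append(re.sub(BLUE[0], BLUE[1], re.sub(RED[0], RED[1], line)))
    return result


for line in without_sed(lines):
    print(line)
```

Note there is no re.match and no if/else: a line that matches neither pattern 
simply comes back unchanged and gets printed like all the others.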

Cheers,

  Evert


Oh, btw: semi-colons at the end of a statement in Python are allowed, but 
redundant, and look kind of, well, wrong.


> What I want to do is replace matching lines and print them, and also
> print the non-matching lines.
> Can somebody please point me in the right direction?
> 
> Any other python pointers or help much appreciated,
> 
> Will.
> _______________________________________________
> Tutor maillist  -  Tutor@python.org
> To unsubscribe or change subscription options:
> http://mail.python.org/mailman/listinfo/tutor
