On Fri, 2011-01-28 at 20:54 -0500, Bill Sconce wrote:
> One of our membership is introducing himself to Python by discovering
> for himself how to use it to clean up a name-and-address database.
An all too common sort of problem. The Python re (regular expression)
module provides the sub method for making string substitutions.
http://docs.python.org/library/re.html#re.sub
The key thing to note is that the repl parameter does not have to be a
string. It can be a function!
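
A minimal sketch of that point, separate from the attached files. re.sub calls the function once per match, passes it the match object, and uses the returned string as the replacement. The function name shout and the sample text are made up for illustration.

```python
import re

def shout(m):
    # m is the match object; m.group() is the text that matched
    return m.group().upper()

# repl is the function itself, not a string
result = re.sub(r'\bpython\b', shout, 'learning python with python')
print(result)
```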
I wrote a Converter class that takes in a dictionary of string
conversions. The "old" strings are assumed to be words. This allows
you to have a dictionary like
    {'i': 'I', 'in': 'Out'}
where some of the strings are contained in others. I needed that for
converting field names. The class combines all of the "old" strings
into a regular expression pattern and constructs the repl function to be
used by the pattern's sub method.
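
As a rough sketch of how the combined pattern behaves (not the attached class itself), the two sample entries 'i' and 'in' can be joined into one alternation with \b word boundaries around each alternative. The re.escape call and the test sentence are assumptions added here; the attached code interpolates the keys directly because they are assumed to be plain words.

```python
import re

conversions = {'i': 'I', 'in': 'Out'}   # sample entries from the text

# One alternative per "old" string, each wrapped in \b word boundaries,
# so 'i' cannot match inside 'in' and 'in' cannot match inside 'inn'
pattern = re.compile('|'.join(r'\b%s\b' % re.escape(k) for k in conversions))

def repl(m):
    # look up the replacement for whichever word matched
    return conversions[m.group()]

result = pattern.sub(repl, 'i went in the inn')
print(result)
```

Because every alternative is anchored at word boundaries on both sides, the order of the alternatives does not matter here.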
Then the call to pattern.sub is packaged using functools.partial and
named _cvt. A separate method, convert, documents the remaining free
parameters for the sub call. I think it makes things less mysterious.
Finally convert is assigned to __call__ so that Converter objects can be
called directly.
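
The packaging step can be sketched on its own with a smaller, hypothetical class. Upper and its word list are invented for illustration; only the functools.partial binding and the __call__ = convert assignment mirror the description above.

```python
import functools
import re

class Upper(object):   # hypothetical stand-in, not the attached Converter
    def __init__(self, words):
        pattern = re.compile('|'.join(r'\b%s\b' % re.escape(w) for w in words))
        # Bind repl now; the string and count arguments of sub stay free
        self._cvt = functools.partial(pattern.sub, lambda m: m.group().upper())

    def convert(self, s, count=0):
        '''Replace up to count matches in s (0 means replace all).'''
        return self._cvt(s, count)

    __call__ = convert   # instances can be called like functions

up = Upper(['a', 'b'])
all_replaced = up('a b a')                  # calls convert via __call__
first_only = up.convert('a b a', count=1)   # stop after the first match
```

The partial object hides the fixed repl argument, while convert spells out what the caller may still pass.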
The last time I used this was to help my daughter with late changes for
the Ethics Bowl held last Fall at Dartmouth.
http://www.dartmouth.edu/~ethics/nereb/index.html
The scheduling program uses placeholders for rooms and team names. She
needed to convert the generated schedule (ethics-bowl.txt) into
something meaningful for the participants.
<ethics-bowl.txt python fix-ethics-matches.py
did the trick.
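
To give a feel for the result, here is a self-contained sketch that applies three entries copied from the attached table to the first schedule row. It rebuilds the pattern inline rather than importing the attached converter module, and the re.escape call is an addition not present in that module.

```python
import re

# A few entries copied from the attached table; integer keys become strings
sample = {
    2: 'Buffalo State.........',
    6: 'Dartmouth.............',
    'A': 'K 006................',
}
sample = dict((str(k), v) for k, v in sample.items())

pattern = re.compile('|'.join(r'\b%s\b' % re.escape(k) for k in sample))

# First row of the schedule: panel A, team 2 vs team 6
converted = pattern.sub(lambda m: sample[m.group()], 'A 2 vs 6\n')
print(converted, end='')
```

The replacement text is not rescanned, so the 'K' and '006' inside a room name are left alone.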
As you can tell from the length of this email, I was pretty tickled when
I realized how nicely this kind of logic could be packaged.
--
Lloyd Kvam
Venix Corp
DLSLUG/GNHLUG library
http://dlslug.org/library.html
http://www.librarything.com/catalog/dlslug
http://www.librarything.com/catalog/dlslug&sort=stamp
http://www.librarything.com/rss/recent/dlslug
#!/usr/bin/env python
# fix-ethics-matches.py
import sys
from converter import Converter

# placeholders used by the scheduling program -> team names and rooms
ethics_conversions = {
    1: 'Simons Rock...........',
    2: 'Buffalo State.........',
    3: 'Colgate...............',
    4: 'College of Notre Dame.',
    5: 'Concordia.............',
    6: 'Dartmouth.............',
    7: 'Elmira................',
    8: 'Franklin Pierce.......',
    9: 'Manhattan I...........',
    10: 'Manhattan II..........',
    11: 'Marist I..............',
    12: 'Marist II.............',
    13: 'Merrimac..............',
    14: 'Moravian..............',
    15: 'Saint Joseph..........',
    16: "St. John's............",
    17: 'Stevens Institute.....',
    18: 'SUNY Fredonia.........',
    19: 'Union I...............',
    20: 'Union II..............',
    'G': 'H 028................',
    'B': 'H 031................',
    'C': 'H 046................',
    'D': 'H 125................',
    'F': 'H 124................',
    'A': 'K 006................',
    'E': 'K 007................',
    'J': 'K 108................',
    'I': 'K 004................',
    'H': 'K 201................',
    }
# the regular expression needs string keys, so convert the integers
ethics_conversions = dict(
    (str(k), v) for k, v in ethics_conversions.items()
    )
converter = Converter(ethics_conversions)

# filter: read the schedule on stdin, write the converted text to stdout
for line in sys.stdin:
    sys.stdout.write(converter(line))

# converter.py
__metaclass__ = type
import re
import functools

class Converter(object):
    '''Take dictionary of word replacements.
    Build object with convert function to apply the word replacements.
    Compile regular expression pattern that matches all search strings in parallel.
    Note that the replacement parameter in the sub method may be a function!
    Use functools.partial to build the sub repl function based on the pattern
    '''
    def __init__(self, conversions):
        self.pattern = re.compile(
            '|'.join(r'\b%s\b' % k for k in conversions.keys()))
        # provide function to lookup the replacement strings
        # input is a match object - parameter m
        # m.group() returns the string that triggered the match
        # we use that to look up the replacement string from the
        # conversions dictionary
        def repl(m):
            return conversions[m.group()]
        self._cvt = functools.partial(
            self.pattern.sub,
            repl,
            )

    def convert(self, s, count=0):
        '''Takes a string
        Applies the pattern's sub method to make our replacements.
        Returns the result.
        (string is unchanged if there are no matches)
        '''
        return self._cvt(s, count)
    __call__ = convert

Panel    Round One    Round Two    Round Three
-------  -------      -------      -------
A        2 vs 6       3 vs 16      9 vs 19
B        3 vs 17      8 vs 11      10 vs 20
C        4 vs 9       10 vs 18     11 vs 14
D        5 vs 10      9 vs 17      2 vs 12
E        7 vs 11      12 vs 13     1 vs 6
F        12 vs 15     2 vs 14      3 vs 13
G        14 vs 16     4 vs 19      5 vs 18
H        18 vs 19     5 vs 20      7 vs 17
I        8 vs 20      1 vs 7       15 vs 16
J        1 vs 13      6 vs 15      4 vs 8

Here are the new instructions for each team:
Team #   Round One        Round Two        Round Three
         Panel Opponent   Panel Opponent   Panel Opponent
------   ----- ------     ----- ------     ----- ------
1        J--13            I--7             E--6
2        A--6             F--14            D--12
3        B--17            A--16            F--13
4        C--9             G--19            J--8
5        D--10            H--20            G--18
6        A--2             J--15            E--1
7        E--11            I--1             H--17
8        I--20            B--11            J--4
9        C--4             D--17            A--19
10       D--5             C--18            B--20
11       E--7             B--8             C--14
12       F--15            E--13            D--2
13       J--1             E--12            F--3
14       G--16            F--2             C--11
15       F--12            J--6             I--16
16       G--14            A--3             I--15
17       B--3             D--9             H--7
18       H--19            C--10            G--5
19       H--18            G--4             A--9
20       I--8             H--5             B--10