On 28 May 2010, at 20:39, Tom Wilcox wrote:

>     out = ''
>     for tok in toks:
>         ## full word replace
>         if tok == 'house' : out += 'hse'+ADDR_FIELD_DELIM
>         elif tok == 'ground' : out += 'grd'+ADDR_FIELD_DELIM
>         elif tok == 'gnd' : out += 'grd'+ADDR_FIELD_DELIM
>         elif tok == 'front' : out += 'fnt'+ADDR_FIELD_DELIM
>         elif tok == 'floor' : out += 'flr'+ADDR_FIELD_DELIM
>         elif tok == 'floors' : out += 'flr'+ADDR_FIELD_DELIM

Not that it would solve your problems, but you can write the above much more 
elegantly using a dictionary:

# normalize the token
try:
        out += {
                'house'         : 'hse',
                'ground'        : 'grd',
                'gnd'           : 'grd',
                'front'         : 'fnt',
                'floor'         : 'flr',
                ...
        }[tok]
except KeyError:
        out += tok

# add a field delimiter unless the token is one of the exceptions
if tok not in ('borough', 'city', 'of', 'the', 'at', 'incl', 'inc'):
        out += ADDR_FIELD_DELIM

You should probably define the dictionary and the exception tuple outside the 
for-loop, though. A dict literal written inside the loop is rebuilt on every 
iteration, so building it once up front avoids that repeated work. The 
concept remains the same either way.
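
For illustration, here is a minimal sketch of how that could look with both 
lookups built once, outside the loop. The names ABBREVIATIONS, NO_DELIM_AFTER 
and normalize_tokens are made up for this example, and the value of 
ADDR_FIELD_DELIM is an assumption, since its definition is not shown in the 
quoted code. The table only contains the entries visible in this thread:

```python
# Assumed value: the quoted code does not show how ADDR_FIELD_DELIM is defined.
ADDR_FIELD_DELIM = ' '

# Built once at import time instead of on every loop iteration.
# Only the replacements visible in the quoted code are listed here.
ABBREVIATIONS = {
    'house': 'hse',
    'ground': 'grd',
    'gnd': 'grd',
    'front': 'fnt',
    'floor': 'flr',
    'floors': 'flr',
}

# Tokens that are not followed by a field delimiter.
NO_DELIM_AFTER = frozenset(
    ['borough', 'city', 'of', 'the', 'at', 'incl', 'inc'])


def normalize_tokens(toks):
    """Abbreviate known tokens and join them with field delimiters."""
    parts = []
    for tok in toks:
        # dict.get returns the token itself when no replacement is defined,
        # which replaces the try/except KeyError construct.
        parts.append(ABBREVIATIONS.get(tok, tok))
        if tok not in NO_DELIM_AFTER:
            parts.append(ADDR_FIELD_DELIM)
    return ''.join(parts)
```

Collecting the pieces in a list and joining them at the end also avoids 
repeated string concatenation, but that is a side detail; out += would work 
just as well here.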

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
