Brian D <brianden...@gmail.com> writes:
> I've tackled this kind of problem before by looping through a patterns
> dictionary, but there must be a smarter approach.
>
> Two addresses. Note that the first has incorrectly transposed the
> direction and street name. ....
If you're really serious about it (e.g. you are the post office trying to program automatic mail sorting machines), there is no simple regex trick that does anything like what you want. A lot of addresses will be ambiguous. You have to use all the info you have about your entire address corpus (e.g. you need a complete street directory of the whole US) and do a bunch of Bayesian inference.

As a very simple example, for an address like "1000 RAMPART S ST" you'd use the zip code to identify the address's geographic neighborhood, and then use your street directory to find candidate correct addresses within that zip code. The USPS does an amazing job of delivering mail to completely mangled addresses based on methods like that.
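To make the "1000 RAMPART S ST" example concrete, here is a rough sketch of the lookup step only. It is not Bayesian inference: it swaps in a crude string-similarity score from difflib as a stand-in for a real probability model. The directory contents, the ZIP codes, and the names STREET_DIRECTORY, split_address, order_free_key and rank_candidates are all made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical toy directory: ZIP code -> canonical street names in that ZIP.
# The ZIP codes and entries are invented; a real system would load a
# complete street directory instead.
STREET_DIRECTORY = {
    "99999": ["N RAMPART ST", "S RAMPART ST", "ROYAL ST"],
    "99998": ["CANAL ST", "MAIN AVE"],
}


def split_address(raw):
    """Split '1000 RAMPART S ST' into ('1000', ['RAMPART', 'S', 'ST'])."""
    tokens = raw.upper().split()
    if tokens and tokens[0].isdigit():
        return tokens[0], tokens[1:]
    return "", tokens


def order_free_key(tokens):
    """Sort the tokens so a transposed direction and name compare equal."""
    return " ".join(sorted(tokens))


def rank_candidates(raw, zip_code, directory=STREET_DIRECTORY):
    """Return (score, candidate address) pairs for one ZIP, best first.

    The score is a plain similarity ratio in [0, 1], not a probability.
    """
    number, tokens = split_address(raw)
    key = order_free_key(tokens)
    scored = []
    for street in directory.get(zip_code, []):
        street_key = order_free_key(street.split())
        score = SequenceMatcher(None, key, street_key).ratio()
        scored.append((score, f"{number} {street}".strip()))
    # Highest similarity first; an unknown ZIP simply yields an empty list.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


if __name__ == "__main__":
    for score, candidate in rank_candidates("1000 RAMPART S ST", "99999"):
        print(f"{score:.2f}  {candidate}")
```

With this toy directory the transposed input matches "1000 S RAMPART ST" exactly, because the comparison ignores token order. When the top candidates score about the same, the address is genuinely ambiguous and should be flagged rather than guessed; that is the point where a real system needs the extra evidence and inference described above.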