Greetings, List!

Say I have an old-fashioned dbf style table, with a single name field of 40 characters:
    names = dbf.Table(':memory:', 'name C(40)')

Then I add a bunch of names from who-knows-where:

    for name in some_iterable():
        names.append((name, ))

Now I want to know how many start with a 'J'...  I have two options:

1) brute force:

    matches = [rec for rec in names if rec.name[0] == 'J']

or, 2) binary search after ordering:

    names.order('name[:1]')
    matches = names.search('J', startswith=True)

So far so good.  Now it gets a little more complicated.

In my use case I am trying to match records from one database to records from another database; therefore, I do _not_ know what text I will be searching for, only the fields I will be using.  If I only had one criterion, I'd still be okay:

    different_table.order('zipcode[:5], last_name')
    for record in original_table:
        matches = different_table.search([record.zipcode[:5], record.last_name])

However, I have three different sets of match criteria:

    'first_name[:1], last_name, city, dlvryaddrs[:4]'
    'first_name[:1], last_name[:5], dlvryaddrs[:8]'
    'first_name, last_name, city, state'

This is not a problem for the ordering, as I can just do:

    for criteria in choices:
        different_table.order(criteria)

The problem comes at the matching stage: the .search method is expecting a list of the pieces it is supposed to find, so what I need is a way to apply, for example, 'first_name[:1], last_name[:5], dlvryaddrs[:8]' to the current record to yield the text to search for.

Current code follows, more comments towards the end.

<code>
import dbf
import shutil
from collections import defaultdict
from cookbook.utils import index

source_tables = [
    '/temp/kaz15514', '/temp/kks15515', '/temp/kmn15585',
    '/temp/knv15516', '/temp/ktx15722', '/temp/kwa15584',
    '/temp/mco15902', '/temp/msq15994',
    ]

counts = defaultdict(int)
# replace each path with an open dbf.Table
for i in index(source_tables):
    source_tables[i] = dbf.Table(source_tables[i])

# local copy of the match-back table, plus one result field per criteria set
shutil.copy('z:/orders/25105/mbk16508_02', '.')
match_back = dbf.Table('mbk16508_02')
match_back.add_fields('f1ltcta4 C(100), f1l5a8 C(100), ftltctst C(100)')

for field, criteria in (
        ('f1ltcta4', 'first_name[:1], last_name, city, dlvryaddrs[:4]'),
        ('f1l5a8', 'first_name[:1], last_name[:5], dlvryaddrs[:8]'),
        ('ftltctst', 'first_name, last_name, city, state'),
        ):
    match_back.order(criteria)
    for table in source_tables:
        counts = defaultdict(int)
        for record in match_back:
            matches = table.search(?????)
            .
            .
            .
</code>

The only idea I have at the moment is to parse the string (much like I do in the order method), and after the string is parsed pluck out the needed pieces (a rough sketch of what I mean is in the P.P.S. below).

If that is the best and/or most practical way to do it, I was thinking of adding __call__ to the record class.  Then, besides being able to do:

    matches = table.search([record.zip4[:5], record.dlvryaddrs])

I could also do:

    matches = table.search(record('zip4[:5], dlvryaddrs'))

or, equivalently:

    criteria = 'this, that[:7], the_other'
    matches = table.search(record(criteria))

Any better ideas?  Am I missing anything already in the stdlib?

Any and all tips appreciated!

~Ethan~

P.S.
<shameless plug>
Python d-Base currently lives at http://groups.google.com/group/python-dbase, and its main purpose in life is to ease the transition between old dbf files and newer sql tables.  It can, however, be used for read/write access to dBase III and VFP 6 tables, including memo fields.

Success stories and bug reports both equally welcome!  :D
</shameless plug>
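
P.P.S.  For what it's worth, here is the kind of thing I have in mind for the parsing route -- a rough, completely untested sketch.  It assumes records allow plain attribute access (as in record.zip4 above, so getattr works) and that the only subscripting in a criteria string is a simple [:n] slice; the helper name record_values is just for illustration:

    import re

    # one piece of a criteria string:  a field name, optionally followed by [:n]
    criteria_piece = re.compile(r'^\s*(\w+)\s*(?:\[\s*:\s*(\d+)\s*\])?\s*$')

    def record_values(record, criteria):
        """Apply a criteria string such as
               'first_name[:1], last_name[:5], dlvryaddrs[:8]'
        to a record, returning the list of (possibly sliced) field values
        that .search() wants."""
        values = []
        for piece in criteria.split(','):
            match = criteria_piece.match(piece)
            if match is None:
                raise ValueError('cannot parse criteria piece %r' % piece)
            field, stop = match.groups()
            value = getattr(record, field)      # plain attribute access
            if stop is not None:
                value = value[:int(stop)]       # apply the [:n] slice
            values.append(value)
        return values

so the matching stage would become:

    matches = table.search(record_values(record, criteria))

and record.__call__, if I add it, could simply delegate to something like this.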