Hello -- Is there a convention, library or Pythonic idiom for performing lightweight relational operations on flatfiles? I frequently find myself writing code to do simple SQL-like operations between flat files, such as appending columns from one file to another, linked through a common id. For example, take a list of addresses and append a 'district' field by looking up a congressional district from a second file that maps zip codes to districts.
Conceptually this is a simple database operation with a join on a common field (zip code in the above example). Other case use other relational operators (projection, cross-product, etc) so I'm really looking for something SQL-like in functionality. However, the data is in flat-files, the file structure changes frequently, the files are dynamically generated from a range of sources, are short-lived in nature, and otherwise not warrant the hassle of a database setup. So I've been looking around for a nice, Pythonic, zero-config (no parsers, no setup/teardown, etc) solution for simple queries that handles a database of csv-files-with-headers automatically. There are number of solutions that are close, but in the end come up short: - KirbyBase 1.9 (latest Python version) is the closest that I could find, as it lets you keep your data in flatfiles and perform operations using the field names from those text-based tables, but it doesn't support joins (the more recent Ruby version seems to). - Buzhug and Sqlite have their data structures w no automatic .tab or .csv parsing (unless sqlite includes a way to map flatfiles to sqlite virtual tables that I don't know about). - http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/159974 is heading in the right direction, as it shows how to perform relational operations on lists and are index based rather than field-name based. - http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/498130 and http://furius.ca/pubcode/pub/conf/common/bin/csv-db-import.html provide ways of automatically populating DBs but not the reverse (persist changes back out to the data files) The closest alternatives I've found are the GNU textutils that support join, cut, merge, etc but I need to add additional logic they don't support, nor do they allow field-level write operations from Python (UPDATE ... WHERE ...). Normally I'd jump right in and start coding but this seems like something so common that I would have expected someone else to have solved, so in the interest of not re-inventing the wheel I thought I'd see if anyone had any other suggestions. Any thoughts? Thanks! Ramon -- http://mail.python.org/mailman/listinfo/python-list