Mag,

If your source data is clean (for example, no quoted fields or embedded delimiters), it may also be faster to parse your input files directly rather than use the csv module, which may add some overhead.
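As a rough sketch of what I mean by parsing directly, here is one way it might look using str.split() for delimited lines and the struct module for fixed-width records. The sample data, the function names, and the 4/10/6-byte field layout are all made up for illustration, and this assumes no quoted fields or embedded delimiters:

```python
import struct

# Hypothetical fixed-width layout: three byte-string fields of 4, 10, and 6 bytes.
RECORD = struct.Struct("4s10s6s")


def parse_delimited(line, sep=","):
    """Split one delimited line into fields.

    Only safe when no field contains a quoted or embedded delimiter.
    """
    return line.rstrip("\r\n").split(sep)


def parse_fixed(record):
    """Slice one fixed-width bytes record into stripped text fields."""
    return [field.decode("ascii").strip() for field in RECORD.unpack(record)]


# Made-up sample data for illustration only.
delimited_line = "1001,widget,4.50\n"
fixed_record = b"1001" + b"widget".ljust(10) + b"4.50".rjust(6)

print(parse_delimited(delimited_line))
print(parse_fixed(fixed_record))
```

If any field can contain a quoted delimiter, the csv module is the safer choice. It is probably worth timing both approaches on a sample of your real data before committing to one.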
Check out the struct module and/or use the split() method of strings. We do a lot of ETL processing with flat files, and on a slow single-core workstation we can typically process 2 GB of data in ~5 minutes. At that rate, 14 GB would take roughly 35 minutes, so I would think a worst-case processing time would be less than an hour.

Malcolm
--
http://mail.python.org/mailman/listinfo/python-list