Hi, Given a large ascii file (delimited or fixed width) with one ID field and dimensions/measures fields, sorted by dimensions, I'd like to "flatten" or "rollup" the file by creating new columns: one for each combination of dimension level, and summing up measures over all records for a given ID.
If the wheel has already been invented, great, please point me in the right direction. If not, please share some pointers on how to think about this problem in order to write efficient code. Is a hash with dimension level combinations a good approach, with values reset at each new ID level? I know mysql, Oracle etc will do this , but they all have a cap on # of columns allowed. SAS will allow unlimited columns, but I don't own SAS. Thanks. ID,color,shape,msr1 ------------------------------ 001, blue, square, 4 001, red , circle, 5 001, red, circle, 6 ID, blue_circle, blue_square, red_circle, red_square -------------------------------------------------------------------------- 001,0,4,11,0 002 ... -- http://mail.python.org/mailman/listinfo/python-list