Hi, I have a case where I should aggregate data from the CSV file, which contains data in this way:
DATE TIME COUNTRY ZIP CITY VALUE1 VALUE2 VALUE3 21.2.2008 00:00 A 1000 CITY1 1 2 3 21.2.2008 00:00 A 1000 CITY2 4 5 6 21.2.2008 00:00 A 1000 CITY3 7 8 9 21.2.2008 00:00 A 1000 CITY4 1 2 3 21.2.2008 00:15 A 1000 CITY1 4 5 6 21.2.2008 00:15 A 1000 CITY2 7 8 9 21.2.2008 00:15 A 1000 CITY3 1 2 3 21.2.2008 00:15 A 1000 CITY4 4 5 6 21.2.2008 00:00 A 2000 CITY10 7 8 9 21.2.2008 00:00 A 2000 CITY20 1 2 3 21.2.2008 00:00 A 2000 CITY30 4 5 6 21.2.2008 00:00 A 2000 CITY40 1 2 3 21.2.2008 00:15 A 2000 CITY10 7 8 9 21.2.2008 00:15 A 2000 CITY20 1 2 3 21.2.2008 00:15 A 2000 CITY30 4 5 6 21.2.2008 00:15 A 2000 CITY40 1 2 3 I need to aggregate data from file1, so the result would be a CSV file (file2) in this format: DATE COUNTRY ZIP CITY SumOfVALUE1 SumOfVALUE2 SumOfVALUE3 formula1 21.2.2008 A 1000 CITY1 5 7 9 12 21.2.2008 A 1000 CITY2 11 13 15 24 21.2.2008 A 1000 CITY3 8 10 12 18 21.2.2008 A 1000 CITY4 5 7 9 12 21.2.2008 A 2000 CITY10 14 16 18 30 21.2.2008 A 2000 CITY20 2 4 6 6 21.2.2008 A 2000 CITY30 8 10 12 18 21.2.2008 A 2000 CITY40 2 4 6 6 So, group by DATE, COUNTRY, ZIP and CITY and sum (or do some calculation) the values and do some calculation from summed fields (e.g.: formula1 = SumOfVALUE1+SumOfVALUE2). I am able to do this by first loading file1 in SQL, perform a query there, which returns the file2 results and then load it back in the SQL in the different table. I would like to avoid the step of taking data out from database in order to process it. I would like to process the file1 in Python and load the result (file2) in SQL. >From some little experience with Perl, I think this is managable with double hash tables (1: basic hash with key/value = CITY/pointer-to- other-hash, 2: hash table with values for CITY1), so I assume that there would be also a way in Python, maybe with dictionaries? Any ideas? Regards, Vedran. -- http://mail.python.org/mailman/listinfo/python-list