On Jan 3, 3:41 pm, [EMAIL PROTECTED] wrote: > Yes in the sense that the top part will have merged cells so that > Horror and Classics don't need to be repeated every time, but the > headers aren't the important part. At this point I'm more interested > in organizing the data itself and i can worry about putting it into a > new excel file later.
What I am able to do is to use SQL, I think it is quite simple, about 50 rows of code including sample data and comments. It works for me and IMHO it is easy tu understand and I think you can use as it is. Otherwise you can control Excel from Python using win32api, win32com ( http://tinyurl.com/2m3x3v ) HTH Petr Jakes #!/usr/bin/env python # -*- coding: cp1250 -*- import sqlite3 con = sqlite3.connect(":memory:") cur = con.cursor() normalizedInputData=[] subCategories=[] rawData = [['Italy', 'Horror', '70s', 'Suspiria','Dario Argento', 4], ['Italy', 'Classics', 'Neo-Realist', 'Otto eMezzo', 'Fellini', 3], ['GB', 'Classics', 'Neo-Humoristic', 'Otto eMezzo', 'Fellini', 3], ['Fr', 'Horror', 'Neo-Realist', 'Otto eMezzo', 'Fellini', 8], ['Fr', 'Classics', 'Neo-Realist', 'Otto eMezzo', 'Fellini', 55], ['GB', 'Horror', 'Neo-Realist', 'Otto eMezzo', 'Fellini', 5], ['Italy', 'Horror', '70s', 'Profondo Rosso','Dario Argento', 4]] def alphanum(s): """only letters, numbers and '_' are acceptable for column names by SQL""" filtered='' for ch in s: if ch.isalnum() or ch in '_': filtered+=ch return filtered for myRow in rawData : cat_SubCat = alphanum("_".join(myRow[1:3])) if cat_SubCat not in subCategories: subCategories.append(cat_SubCat) myRow[1:3] = [cat_SubCat] normalizedInputData.append(myRow) def data_generator(dataSet): for dataSetRow in dataSet: yield dataSetRow subCategories=sorted(subCategories) # create SQL table named "MOVIES" with the apropriate fields (the tabe is store in the memory only) cur.execute("create table MOVIES(COUNTRY, CATEGORY, TITLE, DIRECTOR, QUANTITY)") # fill the table with data cur.executemany("""insert into MOVIES(COUNTRY, CATEGORY, TITLE, DIRECTOR, QUANTITY) values (?,?,?,?,?)""", data_generator(normalizedInputData)) # assemble dynamic SQL SELECT query, which returns PIVOT TABLE prologue = "select COUNTRY, SUM(QUANTITY) AS TOTAL, " template = "SUM (CASE CATEGORY WHEN '%s' THEN QUANTITY ELSE 0 END) %s" epilogue = " FROM MOVIES GROUP BY 1 ORDER BY 1" pivotSelect = prologue + ", ".join([template % (x, x) for x in subCategories]) + epilogue # execute SQL SELECT and return data row by row cur.execute(pivotSelect) for row in cur.fetchall(): print row -- http://mail.python.org/mailman/listinfo/python-list