christian wrote:
> Hello,
>
> I'm not very experienced in Python. Is there a way of doing the below
> more memory-efficiently, and maybe faster?
> I import a 2-column file and then, for every unique value in the
> first column (the key), concatenate the values from the second
> column.
>
> So the output is something like this:
> A,1,2,3
> B,3,4
> C,9,10,11,12,90,34,322,21
>
> Thanks in advance & regards,
> Christian
>
> import csv
> import random
> import sys
> from itertools import groupby
> from operator import itemgetter
>
> f = csv.reader(open(sys.argv[1]), delimiter=';')
> z = [[i[0], i[1]] for i in f]
> z.sort(key=itemgetter(0))
> mydict = dict((k, ','.join(map(itemgetter(1), it)))
>               for k, it in groupby(z, itemgetter(0)))
> del z
>
> f = open(sys.argv[2], 'w')
> for k, v in mydict.iteritems():
>     f.write(v + "\n")
>
> f.close()
I don't expect that it matters much, but you don't need to sort your
data if you use a dictionary anyway:

import csv
import sys

infile, outfile = sys.argv[1:]

d = {}
with open(infile, "rb") as instream:
    for key, value in csv.reader(instream, delimiter=';'):
        d.setdefault(key, [key]).append(value)

with open(outfile, "wb") as outstream:
    csv.writer(outstream).writerows(d.itervalues())

--
http://mail.python.org/mailman/listinfo/python-list
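For anyone on Python 3, here is a sketch of the same setdefault-based
grouping (text-mode files with newline="" replace the 2.x "rb"/"wb"
modes, and dict.values() replaces itervalues(); the helper name
concat_rows and the sample data are made up for the demonstration):

```python
import csv
import io

def concat_rows(instream, outstream):
    """Group values by the first column and write one CSV row per key."""
    d = {}
    for key, value in csv.reader(instream, delimiter=";"):
        # Seed each group with its key so the output row reads "A,1,2,3"
        d.setdefault(key, [key]).append(value)
    csv.writer(outstream).writerows(d.values())

# In-memory demonstration; with real files, open them in text mode
# with newline="", as the csv docs recommend.
src = io.StringIO("A;1\nB;3\nA;2\nC;9\nA;3\nB;4\n")
dst = io.StringIO()
concat_rows(src, dst)
print(dst.getvalue())
```

Since dicts preserve insertion order in Python 3.7+, the output rows come
out in first-seen key order without any sorting.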