cnb <[EMAIL PROTECTED]> writes:
> For each file I construct a list of reviews, and then for each new file
> I merge the reviews so that in the end I have a list of reviewers and,
> for each reviewer, all their reviews.
>
> What is the fastest way to do this?

Scan through all the files sequentially, emitting records like
(movie, reviewer, review). Then use an external sort utility to
sort/merge that output file on each of the 3 columns. Beats writing
code.
-- 
http://mail.python.org/mailman/listinfo/python-list
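
A minimal sketch of the emit-then-sort idea, under some loud assumptions: the thread never shows the file format, so `files` below is a hypothetical mapping of movie name to (reviewer, review) pairs, and the function names are made up for illustration. The in-process `sorted()` call in the usage note only stands in for the external sort utility on data that fits in memory.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def emit_records(files):
    """Yield one (movie, reviewer, review) tuple per review.

    `files` is a hypothetical mapping: movie -> iterable of
    (reviewer, review) pairs. Real per-file parsing would go here.
    """
    for movie, reviews in files.items():
        for reviewer, review in reviews:
            yield (movie, reviewer, review)


def write_records(records, out):
    """Write records as tab-separated lines, one record per line.

    Assumes fields contain no tabs or newlines; a line-oriented
    external sort would otherwise misread them.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(records)


def group_by_reviewer(sorted_records):
    """Collect (movie, review) pairs per reviewer.

    The input must already be sorted by reviewer (column 2),
    because groupby only merges adjacent records.
    """
    return {
        reviewer: [(movie, review) for movie, _, review in group]
        for reviewer, group in groupby(sorted_records, key=itemgetter(1))
    }
```

Usage might look like this: write the records out with `write_records(emit_records(files), f)`, sort that file on the reviewer column with whatever sort tool is at hand, then read it back and pass the rows to `group_by_reviewer`. For a small data set, `sorted(emit_records(files), key=itemgetter(1, 0))` gives the same ordering (by reviewer, then movie) without leaving Python.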