This code here:
    def wordcount(lines):
        for i in range(len(lines)/8):
            words = lines[i].split(" ")
            if not locals().has_key("frequency"):
                frequency = {}
            for word in words:
                if frequency.has_key(word):
                    frequency[word] += 1
                else:
                    frequency[word] = 1
        return frequency

    wordcount(lines)

is taking over six minutes to run on a two megabyte text file. I realize that's a big text file, a really big one (it's actually the full text of Don Quixote). I'm trying to figure out why it's so slow. Is there a better way for me to do a frequency count of all the words in the text? It seems to me like this should scale linearly, but perhaps it doesn't? I don't know much about algorithmic complexity. If someone could give a breakdown of this function's complexity as well, I'd be much obliged.

lines is expected to be a list of lines as provided by file.readlines().
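
For reference, here's a minimal sketch of what I mean by a frequency count, written with collections.Counter from the standard library (just an illustration of one alternative, not something I've benchmarked; the file name in the usage comment is made up):

    from collections import Counter

    def wordcount(lines):
        # Tally every whitespace-separated token across all lines.
        frequency = Counter()
        for line in lines:
            frequency.update(line.split())
        return frequency

    # Usage (hypothetical file name):
    # with open("quixote.txt") as f:
    #     frequency = wordcount(f.readlines())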