I might add that using with open( . . . instead of
foo = open( . . . also shows some maturity in Python.

Abdur-Rahmaan Janhangeer, Mauritius
abdurrahmaanjanhangeer.wordpress.com

On 11 Jun 2017 12:33, "Peter Otten" <__pete...@web.de> wrote:

> Japhy Bartlett wrote:
>
> > I'm not sure that they cared about how you used file.readlines(), I think
> > the memory comment was a hint about instantiating Counter()s
>
> Then they would have been clueless ;)
>
> Both Schtvveer's original script and his subsequent "Verschlimmbesserung"
> -- a beautiful German word for making things worse when trying to improve
> them -- use only two Counters at any given time. The second version is very
> inefficient because it builds the same Counter over and over again -- but
> this does not affect peak memory usage much.
>
> Here's the original version that triggered the comment:
>
> [Schtvveer Schvrveve]
>
> > import sys
> > from collections import Counter
> >
> > def main(args):
> >     filename = args[1]
> >     word = args[2]
> >     print countAnagrams(word, filename)
> >
> > def countAnagrams(word, filename):
> >
> >     fileContent = readFile(filename)
> >
> >     counter = Counter(word)
> >     num_of_anagrams = 0
> >
> >     for i in range(0, len(fileContent)):
> >         if counter == Counter(fileContent[i]):
> >             num_of_anagrams += 1
> >
> >     return num_of_anagrams
> >
> > def readFile(filename):
> >
> >     with open(filename) as f:
> >         content = f.readlines()
> >
> >     content = [x.strip() for x in content]
> >
> >     return content
> >
> > if __name__ == '__main__':
> >     main(sys.argv)
> >
>
> referenced as before.py below, and here's a variant that removes
> readlines(), range(), and the [x.strip() for x in content] list
> comprehension, the goal being minimal changes, not code as I would write it
> from scratch.
>
> # after.py
> import sys
> from collections import Counter
>
> def main(args):
>     filename = args[1]
>     word = args[2]
>     print countAnagrams(word, filename)
>
> def countAnagrams(word, filename):
>
>     fileContent = readFile(filename)
>     counter = Counter(word)
>     num_of_anagrams = 0
>
>     for line in fileContent:
>         if counter == Counter(line):
>             num_of_anagrams += 1
>
>     return num_of_anagrams
>
> def readFile(filename):
>     # this relies on garbage collection to close the file
>     # which should normally be avoided
>     for line in open(filename):
>         yield line.strip()
>
> if __name__ == '__main__':
>     main(sys.argv)
>
> How to measure memory usage? I found
> <https://stackoverflow.com/questions/774556/peak-memory-usage-of-a-linux-unix-process>
> and as test data I use files containing 10**5 and 10**6 integers. With that
> setup (snipping everything but memory usage from the time -v output):
>
> $ /usr/bin/time -v python before.py anagrams5.txt 123
> 6
> Maximum resident set size (kbytes): 17340
> $ /usr/bin/time -v python before.py anagrams6.txt 123
> 6
> Maximum resident set size (kbytes): 117328
>
> $ /usr/bin/time -v python after.py anagrams5.txt 123
> 6
> Maximum resident set size (kbytes): 6432
> $ /usr/bin/time -v python after.py anagrams6.txt 123
> 6
> Maximum resident set size (kbytes): 6432
>
> See the pattern? before.py uses O(N) memory, after.py O(1).
>
> Run your own tests if you need more datapoints or prefer a different method
> to measure memory consumption.
>
> _______________________________________________
> Tutor maillist  -  Tutor@python.org
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
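
The comment in after.py about relying on garbage collection to close the file, and the invitation to measure memory by a different method, can be sketched together. This is only a rough illustration, assuming Python 3 (the scripts in the thread are Python 2). The names read_lines, read_all_lines, count_anagrams, peak_bytes, make_test_file and compare are made up for this example and do not appear in the thread. It uses tracemalloc, which tracks allocations made by the Python interpreter rather than the resident set size reported by time -v, so the absolute numbers will not match the figures quoted above.

```python
# Python 3 sketch; all function names here are illustrative assumptions.
import os
import tempfile
import tracemalloc
from collections import Counter


def read_lines(filename):
    """Yield stripped lines one at a time.

    The 'with' block closes the file once the generator is exhausted
    or closed, instead of leaving that to garbage collection.
    """
    with open(filename) as f:
        for line in f:
            yield line.strip()


def read_all_lines(filename):
    """Return every stripped line in one list, as before.py does."""
    with open(filename) as f:
        return [line.strip() for line in f]


def count_anagrams(word, lines):
    """Count the entries of 'lines' made of exactly the characters of 'word'."""
    target = Counter(word)
    return sum(1 for line in lines if Counter(line) == target)


def peak_bytes(func, *args):
    """Run func(*args); return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def make_test_file(n):
    """Write the integers 0..n-1, one per line, to a temporary file."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write("\n".join(str(i) for i in range(n)))
    return f.name


def compare(n=10**5, word="123"):
    """Return (count, peak) for the list version and for the generator version."""
    path = make_test_file(n)
    try:
        eager = peak_bytes(lambda: count_anagrams(word, read_all_lines(path)))
        lazy = peak_bytes(lambda: count_anagrams(word, read_lines(path)))
    finally:
        os.remove(path)
    return eager, lazy


if __name__ == "__main__":
    (n_eager, peak_eager), (n_lazy, peak_lazy) = compare()
    print("list:     ", n_eager, "anagrams, peak", peak_eager, "bytes")
    print("generator:", n_lazy, "anagrams, peak", peak_lazy, "bytes")
```

Both variants should report 6 anagrams of 123 for this generated data, the same count as in the quoted runs. The peak for the list version should grow with the number of lines, while the generator version's peak should stay roughly flat. The exact byte counts depend on the interpreter and platform.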