Re: Generator slower than iterator?

2008-12-20 Thread Federico Moreira
Wow, thanks again =) -- http://mail.python.org/mailman/listinfo/python-list

Re: Generator slower than iterator?

2008-12-19 Thread Raymond Hettinger
Federico Moreira wrote: Hi all, Im parsing a 4.1GB apache log to have stats about how many times an ip request something from the server. The first design of the algorithm was for line in fileinput.input(sys.argv[1:]): ip = line.split()[0] if match_counter.has_key(ip):

Re: Generator slower than iterator?

2008-12-19 Thread Federico Moreira
Great, 2 min 34 secs with the open method =) But why use ip, sep, rest = line.partition(' ') match_counter[ip] += 1 instead of match_counter[line.strip()[0]] += 1? Does strip really take more time than partition? I'm getting the same results with both of them right now.
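
A rough way to check the question above on your own machine. This is only a sketch: the sample line and the function names are made up for illustration, and the timings will vary with the Python version and the shape of the log lines.

```python
import timeit

# Hypothetical sample line in Apache common log format (not taken from the thread).
SAMPLE = '192.0.2.1 - - [16/Dec/2008:12:07:14 -0300] "GET / HTTP/1.1" 200 512\n'


def ip_by_partition(line):
    # Cut at the first space only; the rest of the line is left untouched.
    ip, sep, rest = line.partition(' ')
    return ip


def ip_by_split(line):
    # Split the whole line on whitespace, then keep only the first field.
    return line.split()[0]


def ip_by_limited_split(line):
    # Stop splitting after the first field.
    return line.split(None, 1)[0]


def time_extractors(line, number=100000):
    # Return a dict mapping each function name to its total time in seconds.
    funcs = (ip_by_partition, ip_by_split, ip_by_limited_split)
    return {f.__name__: timeit.timeit(lambda: f(line), number=number)
            for f in funcs}
```

Calling `time_extractors(SAMPLE)` and printing the result gives a quick comparison. Note that `partition(' ')` and `split()` are not equivalent for lines with leading whitespace or tab separators, so the three functions only agree on lines that start with the IP followed by a single space.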

Re: Generator slower than iterator?

2008-12-19 Thread MRAB
Federico Moreira wrote: Great, 2min 34 secs with the open method =) but why? ip, sep, rest = line.partition(' ') match_counter[ip] += 1 instead of match_counter[line.strip()[0]] += 1 strip really takes more time than partition? I'm having the same results with both of them right

Re: Generator slower than iterator?

2008-12-19 Thread Federico Moreira
Yep, I meant split, sorry. Thanks for the answer!

Re: Generator slower than iterator?

2008-12-19 Thread Arnaud Delobelle
MRAB goo...@mrabarnett.plus.com writes: Federico Moreira wrote: Great, 2min 34 secs with the open method =) but why? ip, sep, rest = line.partition(' ') match_counter[ip] += 1 instead of match_counter[line.strip()[0]] += 1 strip really takes more time than partition? I'm

Re: Generator slower than iterator?

2008-12-16 Thread MRAB
Federico Moreira wrote: Hi all, Im parsing a 4.1GB apache log to have stats about how many times an ip request something from the server. The first design of the algorithm was for line in fileinput.input(sys.argv[1:]): ip = line.split()[0] if match_counter.has_key(ip):

Re: Generator slower than iterator?

2008-12-16 Thread Lie Ryan
On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote: Hi all, Im parsing a 4.1GB apache log to have stats about how many times an ip request something from the server. The first design of the algorithm was for line in fileinput.input(sys.argv[1:]): ip = line.split()[0]

Re: Generator slower than iterator?

2008-12-16 Thread Lie Ryan
On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote: Hi all, Im parsing a 4.1GB apache log to have stats about how many times an ip request something from the server. The first design of the algorithm was for line in fileinput.input(sys.argv[1:]): ip = line.split()[0]

Re: Generator slower than iterator?

2008-12-16 Thread Gary Herron
Lie Ryan wrote: On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote: Hi all, Im parsing a 4.1GB apache log to have stats about how many times an ip request something from the server. The first design of the algorithm was for line in fileinput.input(sys.argv[1:]): ip =

Re: Generator slower than iterator?

2008-12-16 Thread bearophileHUGS
MRAB: from collections import defaultdict match_counter = defaultdict(int) for line in fileinput.input(sys.argv[1:]): ip = line.split()[0] match_counter[ip] += 1 This can be a little faster still: match_counter = defaultdict(int) for line in fileinput.input(sys.argv[1:]): ip =

Re: Generator slower than iterator?

2008-12-16 Thread Federico Moreira
The defaultdict option looks faster than the standard dict (20 secs approx.). Now I have: # import fileinput import sys from collections import defaultdict match_counter = defaultdict(int) for line in fileinput.input(sys.argv[1:]): match_counter[line.split()[0]]
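
For readers following along, a minimal sketch of the defaultdict approach, written for Python 3. It takes any iterable of lines instead of reading `fileinput.input(sys.argv[1:])` directly, and the function name and the blank-line check are additions for illustration, not part of the script quoted above.

```python
from collections import defaultdict


def count_ips(lines):
    # Missing keys start at int() == 0, so no membership test is needed.
    match_counter = defaultdict(int)
    for line in lines:
        fields = line.split()
        if fields:  # assumption: silently skip blank lines
            match_counter[fields[0]] += 1
    return match_counter
```

To run it over log files named on the command line, something like `count_ips(fileinput.input(sys.argv[1:]))` should work, as in the thread.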

Re: Generator slower than iterator?

2008-12-16 Thread rdmurray
Quoth Lie Ryan lie.1...@gmail.com: On Tue, 16 Dec 2008 12:07:14 -0300, Federico Moreira wrote: Hi all, Im parsing a 4.1GB apache log to have stats about how many times an ip request something from the server. The first design of the algorithm was for line in

Re: Generator slower than iterator?

2008-12-16 Thread Federico Moreira
2008/12/16 rdmur...@bitdance.com: "Python 3.0 does not support has_key, it's time to get used to not using it :)" Good to know. line.split(None, 1)[0] really speeds up the process. Thanks again.
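
A small sketch of both points with a plain dict: the `in` operator replaces `has_key`, and `split(None, 1)` stops after the first field. The function name and the blank-line check are illustrative assumptions, not code from the thread.

```python
def count_ips_plain_dict(lines):
    match_counter = {}
    for line in lines:
        # maxsplit=1: only the first field is separated from the rest.
        fields = line.split(None, 1)
        if not fields:  # assumption: silently skip blank lines
            continue
        ip = fields[0]
        if ip in match_counter:  # replaces match_counter.has_key(ip)
            match_counter[ip] += 1
        else:
            match_counter[ip] = 1
    return match_counter
```

The `in` test works on both Python 2 and Python 3, so it is safe to switch before migrating.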

Generator slower than iterator?

2008-12-16 Thread Federico Moreira
Hi all, I'm parsing a 4.1GB Apache log to get stats on how many times each IP requests something from the server. The first design of the algorithm was: for line in fileinput.input(sys.argv[1:]): ip = line.split()[0] if match_counter.has_key(ip): match_counter[ip] += 1 else:

Re: Generator slower than iterator?

2008-12-16 Thread Arnaud Delobelle
bearophileh...@lycos.com writes: This can be a little faster still: match_counter = defaultdict(int) for line in fileinput.input(sys.argv[1:]): ip = line.split(None, 1)[0] match_counter[ip] += 1 Bye, bearophile Or maybe (untested): match_counter = defaultdict(int) for line in

Re: Generator slower than iterator?

2008-12-16 Thread Arnaud Delobelle
Arnaud Delobelle arno...@googlemail.com writes: match_total = dict((key, val()) for key, val in match_counter.iteritems()) Sorry, I meant match_total = dict((key, val.next()) for key, val in match_counter.iteritems()) -- Arnaud
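
The preview cuts off the loop that builds match_counter, so the following is only one possible reading, in Python 3: it assumes each value is an `itertools.count` iterator that is advanced once per hit, which would explain the `val.next()` correction. The function name and the blank-line check are assumptions for illustration.

```python
from collections import defaultdict
from itertools import count


def count_ips_with_itertools(lines):
    # Assumption: each new key gets its own count() iterator starting at 0.
    match_counter = defaultdict(count)
    for line in lines:
        fields = line.split(None, 1)
        if fields:  # assumption: silently skip blank lines
            next(match_counter[fields[0]])  # advance the counter by one
    # After n advances, one more next() returns n, the number of hits.
    # Python 3 spelling of val.next() and iteritems().
    return {key: next(val) for key, val in match_counter.items()}
```

Whether this beats `defaultdict(int)` is something to measure rather than assume; the original suggestion was marked untested.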