Re: [Tutor] Iterating Lines in File and Export Results
Alan, Peter, et al:

Thank you all very much! Staring at this problem for hours was driving me crazy, and I really appreciate the time you took to look into my silly error -- I have thoroughly reviewed both responses and they make perfect sense (*sigh of relief*).

On Thu, Oct 2, 2014 at 6:08 PM, Peter Otten __pete...@web.de wrote:

> John Doe wrote:
>
>> Hello List,
>>
>> I am in need of your assistance. I have a text file with random words in it. I want to write all the lines to a new file. Additionally, I am using Python 2.7 on Ubuntu 12.04:
>>
>> Here is my code:
>>
>> def loop_extract():
>>     with open('words.txt', 'r') as f:
>>         for lines in f:
>
> The name `lines` is misleading, you are reading one line at a time.
>
>>             #print lines (I confirmed that each line is successfully
>>             #printed)
>>             with open('export.txt', 'w') as outf:
>>                 outf.write(lines)
>>                 #outf.write(lines)
>>                 #outf.write('{}\n'.format(lines))
>>                 #outf.write('{}\n'.format(line for line in lines))
>>
>> For some reason, the second file only contains the last line from the original file -- I have tried multiple variations (.read, .readlines, .writelines, other examples preceded by comment from above and many more) and tried to use the module, fileinput, but I still get the same results.
>
> Every time the line
>
>     with open('export.txt', 'w') as outf:
>
> is executed the file export.txt is truncated:
>
> https://docs.python.org/dev/library/functions.html#open
>
> To avoid the loss of data open the file once, outside the loop:
>
>     with open("words.txt") as infile, open("export.txt", "w") as outfile:
>         for line in infile:
>             outfile.write(line)
>
>> I do understand there is another way to copy the file over, but to provide additional background information on my purpose -- I want to read a file and save successful regex matches to a file; exporting specific data. There doesn't appear to be anything wrong with my expression as it prints the expected results without failure. I then decided to just write the export function by itself in its basic form, per the code above, which the same behavior occurred;
>
> That is a good approach! Reduce the code until only the source of the problem is left.
>
>> only copying the last line. I've googled for hours and, unfortunately, at loss.
>
> I do that too, but not for hours ;)
>
>> I want to read a file and save successful regex matches to a file; exporting specific data.
>
> An experienced user of Python might approach this scenario with a generator:
>
>     def process_lines(infile):
>         for line in infile:
>             line = process(line)  # your line processing
>             if meets_condition(line):  # your filter condition
>                 yield line
>
>     with open("words.txt") as infile:
>         with open("export.txt", "w") as outfile:
>             outfile.writelines(
>                 process_lines(infile))

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
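For anyone following along, here is a minimal sketch of the generator approach applied to the regex use case described in this thread. It is written for Python 3 and is only an illustration: the pattern `PATTERN` and the helper names `matching_lines` and `export_matches` are hypothetical, not taken from the original posts, and the sample words are made up.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical pattern: keep lines containing a word that starts with "py".
PATTERN = re.compile(r"\bpy\w*", re.IGNORECASE)


def matching_lines(infile, pattern=PATTERN):
    """Yield only the lines in which the pattern matches somewhere."""
    for line in infile:
        if pattern.search(line):
            yield line


def export_matches(src, dst, pattern=PATTERN):
    """Copy the matching lines of src into dst."""
    # Both files are opened exactly once, outside the loop, so the
    # output file is truncated a single time instead of once per line.
    with open(src) as infile, open(dst, "w") as outfile:
        outfile.writelines(matching_lines(infile, pattern))


if __name__ == "__main__":
    # Build a small throwaway input file so the sketch runs on its own.
    workdir = Path(tempfile.mkdtemp())
    src = workdir / "words.txt"
    dst = workdir / "export.txt"
    src.write_text("python\nspam\nPygame\neggs\n")
    export_matches(src, dst)
    print(dst.read_text())
```

Because `matching_lines` is a generator, only one line is held in memory at a time, and the filter condition can be changed without touching the file-handling code.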
Re: [Tutor] could somebody please explain...
On Wed, Oct 01, 2014 at 09:43:29AM -0700, Clayton Kirkwood wrote:

> In an effort to learn and teach, I present a simple program which measures the time it takes to the various add functions with the appending results:

Well done for making the effort! Now I'm going to tell you all the things you've done wrong! Sorry.

But seriously, I am very pleased to see you making the effort to develop this on your own, but *accurately* timing fast-running code on modern computers is very tricky.

The problem is, when you run some code, it isn't the only program running! The operating system is running, and these days all computers are multi-tasking, which means that anything up to hundreds of other programs could be running at the same time. At any one instant, most of them will be idle, doing nothing, but there's no way to be sure.

Furthermore, there are now complexities with CPU caches. Running a bit of code will be much slower the first time, since it is not in the CPU cache. If the code is too big, it won't fit in the cache.

The end result is that when you time how long a piece of code takes to run, there will always be two components:

- the actual time taken for your code to run;

- random noise caused by CPU cache effects, other processes running, the operating system, your anti-virus suddenly starting a scan in the middle of the run, etc.

The noise can be quite considerable, possibly a few seconds. Now obviously if your code took ten minutes to run, then a few seconds either way is no big deal. But imagine that your timing test says that it took 2 seconds. That could mean:

- 0.001 seconds for your code, and 1.999 seconds worth of noise;

- 1.999 seconds for your code, and 0.001 seconds worth of noise;

- or anything in between.

That measurement is clearly quite useless.

Does this mean that timing Python code is impossible? No, not really, but you have to do it carefully. The best way is to use Python's timeit module, which is carefully crafted to be as accurate as possible.
First I'll show some results with timeit, then come back for a second post where I explain what you can do to be (nearly) as accurate.

I'm going to compare four different ways of adding two numbers:

(1) Using the + operator

(2) Using operator.add

(3) Using operator.__add__

(4) Using a hand-written function, made with lambda

Here's the plus operator: from the command shell, I tell Python to use the timeit module to time some code. I give it some setup code to initialise two variables, then I time adding them together:

    [steve@ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" "x + y"
    10000000 loops, best of 3: 0.0971 usec per loop
    [steve@ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" "x + y"
    10000000 loops, best of 3: 0.0963 usec per loop

So timeit measures how long it takes to run "x + y" ten million times. It does that three times, and picks the fastest of the three. The fastest will have the least amount of noise. I ran it twice, and the two results are fairly close: 0.0971 microseconds, and 0.0963 microseconds.

    [steve@ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "import operator" "operator.add(x, y)"
    1000000 loops, best of 3: 0.369 usec per loop
    [steve@ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "import operator" "operator.add(x, y)"
    1000000 loops, best of 3: 0.317 usec per loop

This time I use operator.add, and get a speed of about 0.3 microseconds. So operator.add is about three times slower than the + operator.

    [steve@ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "import operator" "operator.__add__(x, y)"
    1000000 loops, best of 3: 0.296 usec per loop
    [steve@ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "import operator" "operator.__add__(x, y)"
    1000000 loops, best of 3: 0.383 usec per loop

This time I use operator.__add__, and get about the same result as operator.add. You can see the variability in the results: 0.296 to 0.383 microseconds, that's a variation of about 30%.
    [steve@ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "add = lambda a,b: a+b" "add(x, y)"
    1000000 loops, best of 3: 0.296 usec per loop
    [steve@ando ~]$ python3.3 -m timeit -s "x = 1; y = 2" -s "add = lambda a,b: a+b" "add(x, y)"
    1000000 loops, best of 3: 0.325 usec per loop

Finally, I try it with a hand-made function using lambda, and I get about the same 0.3 microseconds again, with considerable variability.

Of course, the results you get on your computer may be completely different.

More to follow...

--
Steven
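The same comparison can also be run from inside a script rather than from the shell. The following is only a rough sketch using `timeit.repeat` from the standard library; the names `SETUP`, `STATEMENTS` and `best_usec_per_loop` are made up for this example, and the loop count is fixed by hand instead of being chosen automatically as the command-line tool does.

```python
import timeit

# Setup code shared by all four statements (run once per repeat, not timed).
SETUP = "import operator; x = 1; y = 2; add = lambda a, b: a + b"

# The four ways of adding two numbers compared in this thread.
STATEMENTS = [
    ("+ operator", "x + y"),
    ("operator.add", "operator.add(x, y)"),
    ("operator.__add__", "operator.__add__(x, y)"),
    ("lambda", "add(x, y)"),
]


def best_usec_per_loop(stmt, number=1000000, repeat=3):
    """Run stmt `number` times, `repeat` times over; return the best
    (smallest) time per loop in microseconds."""
    timings = timeit.repeat(stmt, setup=SETUP, repeat=repeat, number=number)
    # The minimum is the run with the least noise from other processes.
    return min(timings) / number * 1e6


if __name__ == "__main__":
    for label, stmt in STATEMENTS:
        print("{:<18} {:.4f} usec per loop".format(label, best_usec_per_loop(stmt)))
```

Taking the minimum of the repeats follows the "best of 3" idea above; the absolute numbers will differ from machine to machine.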
Re: [Tutor] could somebody please explain...
On Wed, Oct 01, 2014 at 09:43:29AM -0700, Clayton Kirkwood wrote:

> # program to test time and count options
>
> import datetime,operator, sys
> from datetime import time, date, datetime
> date = datetime.now()
> dayofweek = date.strftime("%a, %b")
> print("Today is", dayofweek, date.day, "at ", date.time())
>
> start = 0
> count_max=int(input("give me a number"))
> start_time = datetime.now()
>
> print( start_time )
> while start < count_max:
>     start=start + 1
>
> end_time = datetime.now()
> print( "s=s+1 time difference is:", (end_time - start_time) )

The first problem you have here is that you are not actually timing how long it takes to add "start + 1". You're actually timing eight things:

- lookup the value of start;
- lookup the value of count_max;
- check whether the first is less than the second;
- decide whether to loop, or exit the loop;
- if we're still inside the loop, lookup start again;
- add 1 to it;
- store the result in start;
- jump back to the top of the loop.

So the results you get don't tell you much about the speed of start+1.

Analogy: you want to know how long it takes you to drive to work in the morning. So you wake up, eat breakfast, brush your teeth, start the stopwatch, have a shower, get dressed, get in the car, drive to the gas station, fill up, buy a newspaper, and drive the rest of the way to work, and finally stop the stopwatch. The time you get is neither accurate as "driving time", nor as "total time it takes to get to work" time.

Ideally, we want to do as little extra work as possible inside the timing loop, so we can get a figure as close as possible to the time actually taken by + as we can.

The second problem is that you are using datetime.now() as your clock. That's not a high-precision clock. It might only be accurate to a second, or a millisecond.
It certainly isn't accurate enough to measure a single addition:

    py> from datetime import datetime
    py> x = 1
    py> t = datetime.now(); x + 1; datetime.now() - t
    2
    datetime.timedelta(0, 0, 85)

This tells me that it supposedly took 85 microseconds to add two numbers, but as I showed before with timeit, the real figure is closer to 0.09 microseconds. That's a lot of noise! About 85000% noise!

Unfortunately, it is tricky to know which clock to use. On Windows, time.clock() used to be the best one; on Linux, time.time() was the best. Starting in Python 3.3, there are a bunch more accurate clocks in the time module. But if you use the timeit module, it already picks the best clock for the job. But if in doubt, time.time() will normally be acceptable.

https://docs.python.org/3/library/time.html

https://docs.python.org/3/library/timeit.html

--
Steven
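As a rough illustration of the two points above, here is one way the original counting loop might be timed with one of the newer clocks. This is a sketch only: it assumes Python 3.3 or later for `time.perf_counter()`, and the function names `time_counting_loop` and `best_of` are invented for this example. Note that it still measures the whole loop (lookups, comparison, addition, store, jump), not the addition alone.

```python
import time


def time_counting_loop(count_max):
    """Return the elapsed seconds for counting from 0 up to count_max.

    The figure includes all of the loop overhead, not just the cost
    of the + operator.
    """
    start = 0
    t0 = time.perf_counter()  # high-resolution clock, Python 3.3+
    while start < count_max:
        start = start + 1
    return time.perf_counter() - t0


def best_of(runs, count_max):
    """Repeat the measurement and keep the fastest (least noisy) run."""
    return min(time_counting_loop(count_max) for _ in range(runs))


if __name__ == "__main__":
    count_max = 1000000
    elapsed = best_of(3, count_max)
    print("best of 3:", elapsed, "seconds for", count_max, "iterations")
    print("per iteration:", elapsed / count_max * 1e6, "usec (loop overhead included)")
```

Nothing is printed or looked up between the two clock readings, so as little extra work as possible ends up inside the timed region.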
Re: [Tutor] could somebody please explain...
Steven,

I don't disagree with most of your analysis. I didn't know of other timing routines, and all of the superfluous stuff adds up. However, for a simple test, the route that I took was adequate, I think. Yes, I timed the whole wakeup to get to work, but the important element is that whatever I timed was accurate between runs. And that is all that was important: to see the relative times. I also ran the complete program multiple times and found the test to be relatively consistent. I appreciate your mention of timeit(); I'll have to look into that, thanks.

Thanks for taking the time to review and comment.

Clayton
[Tutor] pygame module
I downloaded the 3.4 version of Python, but there is no matching binary file for Pygame. I've tried every 1.9.1 file and still can't import pygame. Would an older version of Python work?

Rob
Re: [Tutor] pygame module
On Fri, Oct 3, 2014 at 2:27 PM, Rob Ward azzai...@gmail.com wrote:

> i downloaded the 3.4 version of python but there is no matching binary file for pygame ive tried every 1.9.1 file and still cant import pygame would an older version of python work

You might have better results contacting the Pygame community for this question, as you're asking an installation question on a third-party library.

http://pygame.org/wiki/info

According to their FAQ, Pygame 1.9.2 should support Python 3:

http://www.pygame.org/wiki/FrequentlyAskedQuestions#Does Pygame work with Python 3?