Re: [Tutor] lists of lists: more Chutes & Ladders!
Never mind, I figured out that the slice assignment is emptying the
previous lists in place, before the .reset() statements create new lists
that I then populate and pass on. It makes sense.

On Tue, Dec 31, 2013 at 12:59 AM, Keith Winston wrote:
> I resolved a problem I was having with lists, but I don't understand how!
> I caught my code inadvertently resetting/zeroing two lists TWICE at the
> invocation of the game method, and it was leading to all the (gamechutes &
> gameladders) lists returned by that method being zeroed out except the
> final time the method is called. That is: the game method below is iterated
> iter times (this happens outside the method), and every time gamechutes and
> gameladders (which should be lists of all the chutes and ladders landed on
> during the game) were returned empty, except for the last time, in which
> case they were correct. I can see that doing the multiple zeroing is
> pointless, but I can't understand why it would have any effect on the
> returned values. Note that self.reset() is called in __init__, so the lists
> exist before this method is ever called, if I understand properly.
>
>     def game(self, iter):
>         """Single game"""
>
>         self.gamechutes[:] = []   # when I take out these two slice assignments,
>         self.gameladders[:] = []  # then gamechutes & gameladders work properly
>
>         self.gamechutes = []      # these were actually in a call to self.reset()
>         self.gameladders = []
>
>         # other stuff in reset()
>
>         while self.position < 100:
>             gamecandl = self.move()
>             if gamecandl[0] != 0:
>                 self.gamechutes.append(gamecandl[0])
>             if gamecandl[1] != 0:
>                 self.gameladders.append(gamecandl[1])
>         return [iter, self.movecount, self.numchutes, self.numladders,
>                 self.gamechutes, self.gameladders]
>
> I'm happy to share the rest of the code if you want it, though I'm pretty
> sure the problem lies here. If it's not obvious, I'm setting myself up to
> analyse chute & ladder frequency: how often, in a sequence of games, one
> hits specific chutes & ladders, and related stats.
>
> As always, any comments on style or substance are appreciated.

--
Keith
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
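The effect described above can be reproduced in a few lines. This is only a sketch: the `Game` class, its `play` method and the square numbers are hypothetical stand-ins, not the original code. It assumes, as in the post, that each game returns a reference to the list held on the instance rather than a copy.

```python
class Game:
    """Hypothetical stand-in for the Chutes & Ladders class (not the original code)."""

    def __init__(self):
        self.chutes = []

    def play(self, landed_on, clear_in_place):
        if clear_in_place:
            # Slice assignment empties the EXISTING list object in place,
            # so any earlier result that still refers to it is emptied too.
            self.chutes[:] = []
        # Rebinding creates a brand-new list; older references are untouched.
        self.chutes = []
        self.chutes.extend(landed_on)
        return self.chutes  # a reference to the list, not a copy


def run(clear_in_place):
    game = Game()
    return [game.play(squares, clear_in_place)
            for squares in ([16, 47], [49], [87, 98])]


buggy = run(clear_in_place=True)
fixed = run(clear_in_place=False)
print(buggy)  # every result but the last was emptied after being returned
print(fixed)
```

Dropping the slice assignment is one fix. Returning a copy, for example `list(self.chutes)`, would probably be another way to decouple the results from later resets.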
[Tutor] lists of lists: more Chutes & Ladders!
I resolved a problem I was having with lists, but I don't understand how!
I caught my code inadvertently resetting/zeroing two lists TWICE at the
invocation of the game method, and it was leading to all the (gamechutes &
gameladders) lists returned by that method being zeroed out except the
final time the method is called. That is: the game method below is iterated
iter times (this happens outside the method), and every time gamechutes and
gameladders (which should be lists of all the chutes and ladders landed on
during the game) were returned empty, except for the last time, in which
case they were correct. I can see that doing the multiple zeroing is
pointless, but I can't understand why it would have any effect on the
returned values. Note that self.reset() is called in __init__, so the lists
exist before this method is ever called, if I understand properly.

    def game(self, iter):
        """Single game"""

        self.gamechutes[:] = []   # when I take out these two slice assignments,
        self.gameladders[:] = []  # then gamechutes & gameladders work properly

        self.gamechutes = []      # these were actually in a call to self.reset()
        self.gameladders = []

        # other stuff in reset()

        while self.position < 100:
            gamecandl = self.move()
            if gamecandl[0] != 0:
                self.gamechutes.append(gamecandl[0])
            if gamecandl[1] != 0:
                self.gameladders.append(gamecandl[1])
        return [iter, self.movecount, self.numchutes, self.numladders,
                self.gamechutes, self.gameladders]

I'm happy to share the rest of the code if you want it, though I'm pretty
sure the problem lies here. If it's not obvious, I'm setting myself up to
analyse chute & ladder frequency: how often, in a sequence of games, one
hits specific chutes & ladders, and related stats.

As always, any comments on style or substance are appreciated.
Re: [Tutor] same python script now running much slower
On Mon, Dec 30, 2013 at 5:27 PM, William Ray Wing wrote:
> On Dec 30, 2013, at 7:54 PM, "Protas, Meredith" wrote:
>
>> Thanks for all of your comments! I am working with human genome information
>> which is in the form of many very short DNA sequence reads. I am using a
>> script that sorts through all of these sequences and picks out ones that
>> contain a particular sequence I'm interested in. Because my data set is so
>> big, I have the data on an external hard drive (but that's where I had it
>> before when it was faster too).

A strong suggestion: please show the content of the program to a
professional programmer and get their informed analysis of the program.
If it's possible, providing a clear description of what problem the
program is trying to solve would be very helpful.

It's very possible that the current program you're working with is not
written with efficiency in mind. In many domains, efficiency isn't such
a concern because the input is relatively small. But in bioinformatics,
the inputs are huge (on the order of gigabytes or terabytes), and the
proper use of memory and CPU matters a lot.

In a previous thread on python-tutor, a bioinformatician was asking how
to load their whole data set into memory. After a few questions, we
realized their data set was about 100 gigabytes or so. Most of us here
then tried to convince the original questioner to reconsider, that
whatever performance gains they thought they were getting by reading the
whole file into memory were probably delusional dreams.

I guess I'm trying to say: if you can, show us the source. Maybe there's
something there that needs to be fixed. And maybe Python isn't even the
right tool for the job. From the limited description you've provided of
the problem---searching for a pattern among a database of short
sequences---I'm wondering if you're using BLAST or not.
(http://blast.ncbi.nlm.nih.gov/Blast.cgi)
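To illustrate the memory point made above, here is a hedged sketch of scanning a file one line at a time instead of loading it whole. It is not Meredith's script, which was never posted. The function name `matching_reads`, the one-read-per-line text layout and the sample reads are all assumptions made for the example.

```python
import os
import tempfile


def matching_reads(path, pattern):
    """Yield each line of `path` that contains `pattern`, one line at a time."""
    with open(path) as handle:
        for line in handle:      # the file object hands out lines lazily
            read = line.strip()
            if pattern in read:
                yield read


# Tiny self-contained demo with made-up reads (not real data).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("ACGTACGT\nTTTTGGGG\nGGACGTCC\n")
    demo_path = tmp.name

hits = list(matching_reads(demo_path, "ACGT"))
os.remove(demo_path)
print(hits)
```

Because only one line is held at a time, memory use stays roughly constant however large the input file is. Real sequence formats and approximate matching need more than a substring test, which is where a dedicated tool such as BLAST comes in.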
Re: [Tutor] same python script now running much slower
Hi Meredith, and welcome.

On Mon, Dec 30, 2013 at 06:37:50PM +, Protas, Meredith wrote:
> I am using a python script generated by another person. I have used
> this script multiple times before and it takes around 24 hours to run.
> Recently, I have tried to run the script again (the same exact command
> lines) and it is much much slower. I have tried on two different
> computers with the same result.

Unfortunately, without knowing exactly what your script does, and the
environment in which it is running, it's hard to give you any concrete
advice or anything but questions. If the script is performing
differently, something must have changed; it's just a matter of
identifying what it is.

- What version of Python were you running before, and after?

- Does the script rely on any third-party libraries which have been
  upgraded?

- What operating system are you using? Is it the same version as before?
  You mention top, which suggests Linux or Unix.

- What does your script do? If it is downloading large amounts of data
  from the internet, has your internet connection changed? Perhaps
  something is throttling the rate at which data is downloaded.

- Is the amount of data being processed any different? If you've doubled
  the amount of data, you should expect the time taken to double.

- If your script is greatly dependent on I/O from disk, a much slower
  disk, or heavily-fragmented disk, may cause a significant slowdown.

- Is there a difference in the "nice" level or priority at which the
  script is running?

Hopefully one of these may point you in the right direction. Would you
like us to have a look at the script and see if there is something
obvious that stands out?

--
Steven
Re: [Tutor] same python script now running much slower
On Dec 30, 2013, at 7:54 PM, "Protas, Meredith" wrote:

> Thanks for all of your comments! I am working with human genome information
> which is in the form of many very short DNA sequence reads. I am using a
> script that sorts through all of these sequences and picks out ones that
> contain a particular sequence I'm interested in. Because my data set is so
> big, I have the data on an external hard drive (but that's where I had it
> before when it was faster too).
>
> As for how much slower it is running, I don't know because I keep having to
> move my computer before it is finished. The size of the data is the same,
> the script has not been modified, and the data is still in the same place.
> Essentially, I'm doing exactly what I did before (as a test) but it is now
> slower.
>
> How would I test your suggestion, Bill, that the script is paging itself to
> death? The data has not grown and I don't think the number of processes
> occupying memory has changed.
>
> By the way, I am using a Mac and I've tried two different computers.
>
> Thanks so much for all of your help!
>
> Meredith

Meredith, look in your Utilities folder for an application called
Activity Monitor. Launch it and then in the little tab bar close to the
bottom of the window, select System Memory. This will get you several
statistics, but the quick and dirty check is the pie chart to the right
of the stats. If the green wedge is tiny or nonexistent, then you've
essentially run out of "free" physical memory and the system is almost
certainly paging.

As a double check, reboot your Mac (which I assume is a laptop).
Relaunch the Activity Monitor and again, select System Memory. Right
now, immediately after a reboot, the green wedge should be quite large,
possibly occupying 3/4 of the circle. Now launch your script again and
watch what happens. If the wedge shrinks down to zero, you've found the
problem and we need to figure out why and what has changed.

If memory use is NOT the problem, then we need to know more about the
context. What version of Mac OS are you running? Are you running the
system version of Python or did you install your own? Did you recently
upgrade to 10.9 (the version of Python Apple ships with 10.9 is
different from the one that came on 10.8)?

Finally, I assume the Mac has both USB and Firewire ports. Most
Mac-compatible external drives also have both USB and Firewire. Is there
any chance that you were using Firewire to hook up the drive previously
and are using USB now?

Hope this helps (or at least helps us get closer to an answer).

-Bill
Re: [Tutor] can't install
On Mon, Dec 30, 2013 at 7:42 PM, Tobias M. wrote:
> Yes, '~' is your home directory. You can actually use this in your
> shell instead of typing the whole path.

The tilde shortcut is a C shell legacy inspired by the 1970s ADM-3A
terminal, which had "Home" and tilde on the same key. Python has
os.path.expanduser('~'), which also works on Windows even though using
tilde like this isn't a Windows convention.
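A quick sketch of the os.path.expanduser call mentioned above. The `~/my_venv` path is just the example name used elsewhere in this thread, and the printed values depend on the account running the code.

```python
import os.path

home = os.path.expanduser("~")
venv_dir = os.path.expanduser("~/my_venv")  # example name from this thread

print(home)      # the current user's home directory
print(venv_dir)  # the same directory with /my_venv appended
```

Only a leading tilde is expanded. If no home directory can be determined, the path is returned unchanged.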
Re: [Tutor] same python script now running much slower
It seems likely that mentioning what version of Python you're running it
on might help in troubleshooting. If you can run it on a subset of your
data, get the run time down to something workable (as in, minutes) and
then put timers on the various sections to see what's taking so long. My
suspicion, almost wholly free of distortion by facts, is that you are
running it on a different version of Python: either an upgrade or a
downgrade was done and something about the script doesn't like that. Is
it a terribly long script?

On Mon, Dec 30, 2013 at 5:37 PM, William Ray Wing wrote:
> On Dec 30, 2013, at 1:37 PM, "Protas, Meredith" wrote:
>
> Hi,
>
> I'm very new to python so I'm sorry about such a basic question.
>
> I am using a python script generated by another person. I have used this
> script multiple times before and it takes around 24 hours to run.
> Recently, I have tried to run the script again (the same exact command
> lines) and it is much much slower. I have tried on two different computers
> with the same result. I used top to see if there were any suspicious
> functions that were happening but there seems to not be. I also ran
> another python script I used before and that went at the same speed as
> before so the problem seems unique to the first python script.
>
> Does anyone have any idea why it is so much slower now than it used to be
> (just around a month ago).
>
> Thanks for your help!
>
> Meredith
>
>
> Meredith, This is just a slight expansion on the note you received from
> Alan. Is there any chance that the script now is paging itself to death?
> That is, if you are reading a huge amount of data into a structure in
> memory, and if it no longer fits in available physical memory (either
> because the amount of data to be read has grown or the number of other
> processes that are occupying memory have grown), then that data structure
> may have gone virtual and the OS may be swapping it out to disk. That
> would dramatically increase the amount of elapsed wall time the program
> takes to run.
>
> If you can tell us more about what the program actually is doing or
> calculating, we might be able to offer more help.
>
> -Bill

--
Keith
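The "timers on the various sections" idea could look something like the sketch below. The `timed` helper, the `timings` dictionary and the two stages are hypothetical; the stages stand in for whatever the real script does, which was not posted.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Add the wall-clock seconds spent inside the `with` block to timings[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[label] = timings.get(label, 0.0) + elapsed


# Hypothetical stand-ins for the script's real stages.
with timed("load"):
    reads = ["ACGTACGT", "TTTTGGGG", "GGACGTCC"] * 1000
with timed("search"):
    hits = [r for r in reads if "ACGT" in r]

for label, seconds in timings.items():
    print(f"{label}: {seconds:.6f} s")
```

Run on a small subset of the data, the per-stage totals should show whether the time goes into reading the input or into the search itself.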
Re: [Tutor] can't install
Quoting Lolo Lolo:
> Hi, can i ask why the name ~/my_venv/ .. is that just to indicate ~
> as the home directory?

The name was just an example. Yes, '~' is your home directory. You can
actually use this in your shell instead of typing the whole path. You
probably want to put virtual environments in your home directory.

> so pyvenv already has access to virtualenv? i thought i would have
> needed pypi https://pypi.python.org/pypi/virtualenv or easy_install
> before i got install it, but here you create the virtualenv before
> getting distribute_setup.py.

Since Python 3.3 a 'venv' module is included in the standard library and
shipped with the pyvenv tool. So you no longer need the 'virtualenv' 3rd
party module.

> Everything seems fine now, at last! I really appreciate your help;)

Great :)
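For reference, the standard-library 'venv' module can also be driven from Python itself. This is a sketch under stated assumptions: it builds the environment in a throwaway temporary directory rather than in `~/my_venv`, and it passes `with_pip=False` (a parameter added in Python 3.4) so that nothing needs to be downloaded or bootstrapped.

```python
import os
import tempfile
import venv

with tempfile.TemporaryDirectory() as base:
    env_dir = os.path.join(base, "my_venv")  # example name from this thread
    # with_pip=False keeps the sketch fast and offline.
    venv.create(env_dir, with_pip=False)
    # A freshly created environment contains a pyvenv.cfg file at its root.
    created = os.path.isfile(os.path.join(env_dir, "pyvenv.cfg"))

print(created)
```

From a shell, `python3 -m venv ~/my_venv` does the equivalent job on Python versions that include the module.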
Re: [Tutor] same python script now running much slower
On Dec 30, 2013, at 1:37 PM, "Protas, Meredith" wrote:

> Hi,
>
> I'm very new to python so I'm sorry about such a basic question.
>
> I am using a python script generated by another person. I have used this
> script multiple times before and it takes around 24 hours to run. Recently,
> I have tried to run the script again (the same exact command lines) and it is
> much much slower. I have tried on two different computers with the same
> result. I used top to see if there were any suspicious functions that were
> happening but there seems to not be. I also ran another python script I used
> before and that went at the same speed as before so the problem seems unique
> to the first python script.
>
> Does anyone have any idea why it is so much slower now than it used to be
> (just around a month ago).
>
> Thanks for your help!
>
> Meredith

Meredith, This is just a slight expansion on the note you received from
Alan. Is there any chance that the script now is paging itself to death?
That is, if you are reading a huge amount of data into a structure in
memory, and if it no longer fits in available physical memory (either
because the amount of data to be read has grown or the number of other
processes that are occupying memory have grown), then that data
structure may have gone virtual and the OS may be swapping it out to
disk. That would dramatically increase the amount of elapsed wall time
the program takes to run.

If you can tell us more about what the program actually is doing or
calculating, we might be able to offer more help.

-Bill
Re: [Tutor] same python script now running much slower
On 30/12/13 18:37, Protas, Meredith wrote:
> I am using a python script generated by another person. I have used
> this script multiple times before and it takes around 24 hours to run.

That's a long time by any standards. I assume it's processing an
enormous amount of data?

> Recently, I have tried to run the script again (the same exact command
> lines) and it is much much slower.

How much slower? Precision is everything in programming...

> Does anyone have any idea why it is so much slower now than it used to
> be (just around a month ago).

There are all sorts of possibilities but without any idea of what it
does or how it does it we would only be making wild guesses. Here are
some...

Maybe you have increased your data size since last time?
Maybe the computer configuration has changed?
Maybe somebody modified the script?
Maybe the data got moved to a network drive, or an optical disk?

Who knows? Without more detail we really can't say.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos
[Tutor] same python script now running much slower
Hi,

I'm very new to Python so I'm sorry about such a basic question.

I am using a Python script generated by another person. I have used this
script multiple times before and it takes around 24 hours to run.
Recently, I have tried to run the script again (the same exact command
lines) and it is much, much slower. I have tried on two different
computers with the same result. I used top to see if there were any
suspicious functions that were happening but there don't seem to be any.
I also ran another Python script I used before and that went at the same
speed as before, so the problem seems unique to the first Python script.

Does anyone have any idea why it is so much slower now than it used to
be (just around a month ago)?

Thanks for your help!

Meredith
Re: [Tutor] Generator next()
On 12/29/2013 12:33 PM, Steven D'Aprano wrote:
> def adder_factory(n):
>     def plus(arg):
>         return arg + n
>     return plus  # returns the function itself
>
> If you call adder_factory(), it returns a function:
>
> py> adder_factory(10)
> <function adder_factory.<locals>.plus at 0xb7af6f5c>
>
> What good is this? Watch carefully:
>
> py> add_two = adder_factory(2)
> py> add_three = adder_factory(3)
> py> add_two(100)
> 102
> py> add_three(100)
> 103
>
> The factory lets you programmatically create functions on the fly.
> add_two() is a function which adds two to whatever argument it gets.
> add_three() is a function which adds three to whatever argument it gets.
> We can create an "add_whatever" function without knowing in advance what
> "whatever" is going to be:
>
> py> from random import randint
> py> add_whatever = adder_factory(randint(1, 10))
> py> add_whatever(200)
> 205
> py> add_whatever(300)
> 305
>
> So now you see the main reason for nested functions in Python: they let
> you create a function where you don't quite know exactly what it will do
> until runtime. You know the general form of what it will do, but the
> precise details aren't specified until runtime.

A little complement: used that way, a function factory is the
programming equivalent of parametric functions, or function families, in
maths:

    add : v --> v + p

Where does p come from? It's a parameter (in the sense of maths), or an
"unbound variable", while 'v' is the actual (bound) variable of the
function. Add actually defines a family of parametric functions, each
with its own 'p' parameter, adding p to their input variable (they each
have only one). The programming equivalent of this is what Steven
demonstrated above. However, there is much confusion due to (funny)
programming terminology:

* a math variable is usually called a parameter
* a math parameter is usually called an "upvalue" (not kidding!)
* a parametric function, or function family (with indefinite param), is
  called a function factory (this bit is actually semantically correct)
* a member of a function family (with definite param) is called a
  (function) closure (every parametric expression is actually a closure,
  e.g. "index = index + offset", if offset is not locally defined)

Denis
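The terminology above can be checked against what CPython actually stores on each function. This sketch reuses Steven's adder_factory. The names `free_names`, `captured` and `results` are introduced only for the illustration, and the `__code__.co_freevars` and `__closure__` attributes are CPython's way of exposing the captured "math parameter".

```python
def adder_factory(n):
    def plus(arg):
        return arg + n  # n is the "math parameter": a free variable of plus
    return plus


add_two = adder_factory(2)
add_three = adder_factory(3)

# The name of the free variable is recorded on the code object...
free_names = add_two.__code__.co_freevars
# ...and each family member carries its own captured value in a closure cell.
captured = [f.__closure__[0].cell_contents for f in (add_two, add_three)]
results = [add_two(100), add_three(100)]

print(free_names, captured, results)
```

Both functions share the same code and differ only in the captured value of n, which is the "family with its own p" idea in concrete form.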
Re: [Tutor] Generator next()
On 12/29/2013 01:38 PM, Steven D'Aprano wrote:
>> In the previous timer function that I was using, it defined a timer class,
>> and then I had to instantiate it before I could use it, and then it saved a
>> list of timing results. I think in yours, it adds attributes to each
>> instance of a function/method, in which the relevant timings are stored.
>
> Correct. Rather than dump those attributes on the original function, it
> creates a new function and adds the attributes to that. The new function
> wraps the older one, that is, it calls the older one:
>
> def f(x):
>     return g(x)
>
> We say that "f wraps g". Now, obviously wrapping a function just to
> return its result unchanged is pretty pointless, normally the wrapper
> function will do something to the arguments, or the result, or both.

Or add metadata, as in your case the 'count'. This pattern is often used
in parsing to add metadata (semantic, or stateful, usually) to bare
match results.

Denis
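A minimal sketch of a wrapper that adds metadata in the way discussed above. This is not Steven's timer code, which is not shown in this message. The decorator name `counted` and the `square` function are hypothetical, and only a call count is tracked.

```python
import functools


def counted(func):
    """Return a new function that wraps `func` and counts its calls."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.count += 1            # metadata lives on the wrapper, not on func
        return func(*args, **kwargs)  # "wrapper wraps func"
    wrapper.count = 0
    return wrapper


@counted
def square(x):
    return x * x


values = [square(n) for n in range(4)]
print(values, square.count)
```

The original function is left untouched and remains reachable as `square.__wrapped__`, so the attribute is attached to the new function only.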