Re: [Tutor] lists of lists: more Chutes & Ladders!

2013-12-30 Thread Keith Winston
Never mind, I figured out that the slice assignment is emptying the
previous lists, before the .reset() statements are creating new lists that
I then populate and pass on. It makes sense.
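
For anyone following along, here is a minimal sketch of that aliasing effect. The Game class and its arguments are hypothetical stand-ins, not my actual code. The point is that game() returns self.gamechutes itself, so a slice assignment in the next call empties the list the previous call already handed out, while plain rebinding leaves it alone.

```python
class Game:
    """Hypothetical stand-in for the real class; only the list handling matters."""

    def __init__(self):
        self.gamechutes = []

    def game(self, landed, clear_in_place):
        if clear_in_place:
            # Empties the SAME list object that the previous call returned.
            self.gamechutes[:] = []
        # Rebinds the attribute to a brand-new list; old lists are untouched.
        self.gamechutes = []
        self.gamechutes.extend(landed)
        # The caller now shares this list object with the instance.
        return self.gamechutes


buggy = Game()
buggy_results = [buggy.game([n, n + 1], clear_in_place=True) for n in (10, 20, 30)]

fixed = Game()
fixed_results = [fixed.game([n, n + 1], clear_in_place=False) for n in (10, 20, 30)]

print(buggy_results)  # every list but the last has been emptied
print(fixed_results)  # every list survives
```

Returning a copy (for example list(self.gamechutes)) would also have protected the earlier results, at the cost of one extra list per game.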



On Tue, Dec 31, 2013 at 12:59 AM, Keith Winston  wrote:

> I resolved a problem I was having with lists, but I don't understand how!
> I caught my code inadvertently resetting/zeroing two lists TWICE at the
> invocation of the game method, and it was leading to all the (gamechutes &
> gameladders) lists returned by that method being zeroed out except the
> final time the method is called. That is: the game method below is iterated
> iter times (this happens outside the method), and every time gamechutes and
> gameladders (which should be lists of all the chutes and ladders landed on
> during the game) were returned empty, except for the last time, in which
> case they were correct. I can see that doing the multiple zeroing is
> pointless, but I can't understand why it would have any effect on the
> returned values. Note that self.reset() is called in __init__, so the lists
> exist before this method is ever called, if I understand properly.
>
> def game(self, iter):
>     """Single game"""
>
>     self.gamechutes[:] = []   # when I take out these two slice assignments,
>     self.gameladders[:] = []  # then gamechutes & gameladders work properly
>
>     self.gamechutes = []  # these were actually in a call to self.reset()
>     self.gameladders = []
>
>     # other stuff in reset()
>
>     while self.position < 100:
>         gamecandl = self.move()
>         if gamecandl[0] != 0:
>             self.gamechutes.append(gamecandl[0])
>         if gamecandl[1] != 0:
>             self.gameladders.append(gamecandl[1])
>     return [iter, self.movecount, self.numchutes, self.numladders,
>             self.gamechutes, self.gameladders]
>
> I'm happy to share the rest of the code if you want it, though I'm pretty
> sure the problem lies here. If it's not obvious, I'm setting myself up to
> analyse chute & ladder frequency: how often, in a sequence of games, one
> hits specific chutes & ladders, and related stats.
>
> As always, any comments on style or substance are appreciated.
>



-- 
Keith
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


[Tutor] lists of lists: more Chutes & Ladders!

2013-12-30 Thread Keith Winston
I resolved a problem I was having with lists, but I don't understand how! I
caught my code inadvertently resetting/zeroing two lists TWICE at the
invocation of the game method, and it was leading to all the (gamechutes &
gameladders) lists returned by that method being zeroed out except the
final time the method is called. That is: the game method below is iterated
iter times (this happens outside the method), and every time gamechutes and
gameladders (which should be lists of all the chutes and ladders landed on
during the game) were returned empty, except for the last time, in which
case they were correct. I can see that doing the multiple zeroing is
pointless, but I can't understand why it would have any effect on the
returned values. Note that self.reset() is called in __init__, so the lists
exist before this method is ever called, if I understand properly.

def game(self, iter):
    """Single game"""

    self.gamechutes[:] = []   # when I take out these two slice assignments,
    self.gameladders[:] = []  # then gamechutes & gameladders work properly

    self.gamechutes = []  # these were actually in a call to self.reset()
    self.gameladders = []

    # other stuff in reset()

    while self.position < 100:
        gamecandl = self.move()
        if gamecandl[0] != 0:
            self.gamechutes.append(gamecandl[0])
        if gamecandl[1] != 0:
            self.gameladders.append(gamecandl[1])
    return [iter, self.movecount, self.numchutes, self.numladders,
            self.gamechutes, self.gameladders]

I'm happy to share the rest of the code if you want it, though I'm pretty
sure the problem lies here. If it's not obvious, I'm setting myself up to
analyse chute & ladder frequency: how often, in a sequence of games, one
hits specific chutes & ladders, and related stats.

As always, any comments on style or substance are appreciated.
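
To make the frequency goal concrete, here is a rough sketch of the kind of tally I have in mind, using collections.Counter. The tally() helper and the square numbers are made up for illustration; the real input would be the gamechutes (or gameladders) list from each game.

```python
from collections import Counter


def tally(games):
    """Count how often each square was landed on across a sequence of games."""
    counts = Counter()
    for squares in games:
        counts.update(squares)
    return counts


# Made-up data: one list of chute squares per game.
chutes_per_game = [[16, 47], [47], [16, 62, 47]]
chute_counts = tally(chutes_per_game)

for square, hits in chute_counts.most_common():
    print(square, hits)
```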


Re: [Tutor] same python script now running much slower

2013-12-30 Thread Danny Yoo
On Mon, Dec 30, 2013 at 5:27 PM, William Ray Wing  wrote:
> On Dec 30, 2013, at 7:54 PM, "Protas, Meredith"  
> wrote:
>
>> Thanks for all of your comments!  I am working with human genome information 
>> which is in the form of many very short DNA sequence reads.  I am using a 
>> script that sorts through all of these sequences and picks out ones that 
>> contain a particular sequence I'm interested in.  Because my data set is so 
>> big, I have the data on an external hard drive (but that's where I had it 
>> before when it was faster too).


A strong suggestion: please show the content of the program to a
professional programmer and get their informed analysis on the
program.  If it's possible, providing a clear description on what
problem the program is trying to solve would be very helpful.  It's
very possible that the current program you're working with is not
written with efficiency in mind.  In many domains, efficiency isn't
such a concern because the input is relatively small.  But in
bioinformatics, the inputs are huge (on the order of gigabytes or
terabytes), and the proper use of memory and cpu matter a lot.

In a previous thread on python-tutor, a bioinformatician was asking
how to load their whole data set into memory.  After a few questions,
we realized their data set was about 100 gigabytes or so.  Most of us
here then tried to convince the original questioner to reconsider,
that whatever performance gains they thought they were getting by reading
the whole file into memory were probably delusional dreams.
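
As a rough illustration of the alternative, here is a sketch that streams reads one line at a time instead of loading everything at once. It assumes one read per line and a plain substring match, which may not fit the real file format; matching_reads() and scan_file() are names invented for this example.

```python
def matching_reads(lines, pattern):
    """Yield each read that contains pattern, holding one line in memory at a time."""
    for line in lines:
        read = line.strip()
        if pattern in read:
            yield read


def scan_file(path, pattern):
    """Stream a (hypothetical) one-read-per-line file without reading it all in."""
    with open(path) as handle:
        for read in matching_reads(handle, pattern):
            yield read


# Tiny in-memory stand-in for a file, just to show the behaviour.
sample = ["ACGTTGCA\n", "TTTTGGGG\n", "GGACGTAA\n"]
hits = list(matching_reads(sample, "ACGT"))
print(hits)
```

Because both functions are generators, memory use stays roughly constant no matter how large the input file is.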

I guess I'm trying to say: if you can, show us the source.  Maybe
there's something there that needs to be fixed.  And maybe Python
isn't even the right tool for the job. From the limited description
you've provided of the problem---searching for a pattern among a
database of short sequences---I'm wondering if you're using BLAST or
not.  (http://blast.ncbi.nlm.nih.gov/Blast.cgi)


Re: [Tutor] same python script now running much slower

2013-12-30 Thread Steven D'Aprano
Hi Meredith, and welcome.

On Mon, Dec 30, 2013 at 06:37:50PM +, Protas, Meredith wrote:

> I am using a python script generated by another person.  I have used 
> this script multiple times before and it takes around 24 hours to run.  
> Recently, I have tried to run the script again (the same exact command 
> lines) and it is much much slower.  I have tried on two different 
> computers with the same result.  

Unfortunately, without knowing exactly what your script does, and the 
environment in which it is running, it's hard to give you any concrete 
advice or anything but questions. If the script is performing 
differently, something must have changed, it's just a matter of 
identifying what it is.

- What version of Python were you running before, and after?

- Does the script rely on any third-party libraries which have been 
upgraded?

- What operating system are you using? Is it the same version as before? 
You mention top, which suggests Linux or Unix.

- What does your script do? If it is downloading large amounts of data 
from the internet, has your internet connection changed? Perhaps 
something is throttling the rate at which data is downloaded.

- Is the amount of data being processed any different? If you've doubled 
the amount of data, you should expect the time taken to double.

- If your script is greatly dependent on I/O from disk, a much slower 
disk, or heavily-fragmented disk, may cause a significant slowdown.

- Is there a difference in the "nice" level or priority at which the 
script is running?

Hopefully one of these may point you in the right direction.
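
To answer the first few questions quickly, something like the following sketch can be run on each machine and the output compared. The environment_report() name is just for illustration; the calls themselves are from the standard sys and platform modules.

```python
import platform
import sys


def environment_report():
    """Collect the basics worth comparing between a fast run and a slow run."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "executable": sys.executable,
        "os": platform.platform(),
    }


report = environment_report()
for key in sorted(report):
    print("%-15s %s" % (key, report[key]))
```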

Would you like us to have a look at the script and see if there is 
something obvious that stands out?



-- 
Steven


Re: [Tutor] same python script now running much slower

2013-12-30 Thread William Ray Wing
On Dec 30, 2013, at 7:54 PM, "Protas, Meredith"  wrote:

> Thanks for all of your comments!  I am working with human genome information 
> which is in the form of many very short DNA sequence reads.  I am using a 
> script that sorts through all of these sequences and picks out ones that 
> contain a particular sequence I'm interested in.  Because my data set is so 
> big, I have the data on an external hard drive (but that's where I had it 
> before when it was faster too).
> 
> As for how much slower it is running, I don't know because I keep having to 
> move my computer before it is finished.  The size of the data is the same, 
> the script has not been modified, and the data is still in the same place.  
> Essentially, I'm doing exactly what I did before (as a test) but it is now 
> slower.
> 
> How would I test your suggestion, Bill, that the script is paging itself to 
> death?  The data has not grown and I don't think the number of processes 
> occupying memory has changed.
> 
> By the way, I am using a Mac and I've tried two different computers.  
> 
> Thanks so much for all of your help!
> 
> Meredith
> 

Meredith,  look in your Utilities folder for an application called Activity 
Monitor.  Launch it and then in the little tab bar close to the bottom of the 
window, select System Memory.  This will get you several statistics, but the 
quick and dirty check is the pie chart to the right of the stats.  If the green 
wedge is tiny or nonexistent, then you've essentially run out of "free" 
physical memory and the system is almost certainly paging.

As a double check, reboot your Mac (which I assume is a laptop).  Relaunch the 
Activity Monitor and again, select System Memory.  Right now, immediately after 
a reboot, the green wedge should be quite large, possibly occupying 3/4 of the 
circle.  Now launch your script again and watch what happens.  If the wedge 
shrinks down to zero, you've found the problem and we need to figure out why 
and what has changed.
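
If you would rather have the script report on itself, here is a small sketch using the standard resource module (Unix and Mac only). The peak_memory_mib() helper is a name I made up; the unit handling reflects that ru_maxrss is in bytes on Mac OS X but in kilobytes on Linux.

```python
import resource
import sys


def peak_memory_mib():
    """Return this process's peak resident memory in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on Mac OS X, in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / float(divisor)


print("peak memory so far: %.1f MiB" % peak_memory_mib())
```

Printing this every so often from inside the main loop would show whether the script's footprint is creeping toward the machine's physical memory.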
-

If memory use is NOT the problem, then we need to know more about the context.  
What version of Mac OS are you running?  Are you running the system version of 
python or did you install your own?  Did you recently upgrade to 10.9 (the 
version of python Apple ships with 10.9 is different from the one that came on 
10.8)?

-

Finally, I assume the Mac has both USB and Firewire ports.  Most Mac-compatible 
external drives also have both USB and Firewire.  Is there any chance that you 
were using Firewire to hook up the drive previously and are using USB now?

Hope this helps (or at least helps us get closer to an answer).

-Bill



Re: [Tutor] can't install

2013-12-30 Thread eryksun
On Mon, Dec 30, 2013 at 7:42 PM, Tobias M.  wrote:
> Yes, '~' is your home directory. You can actually use this in your
> shell instead of typing the whole path.

The tilde shortcut is a C shell legacy inspired by the 1970s ADM-3A
terminal, which had "Home" and tilde on the same key. Python has
os.path.expanduser('~'), which also works on Windows even though using
tilde like this isn't a Windows convention.
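
A quick sketch of that call in use; the venv_path() helper and the "my_venv" name are only for illustration.

```python
import os


def venv_path(name):
    """Build a path under the user's home directory from the '~' shortcut."""
    return os.path.join(os.path.expanduser("~"), name)


print(venv_path("my_venv"))
```

On POSIX systems expanduser() consults the HOME environment variable first, so the result follows whatever HOME is set to.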


Re: [Tutor] same python script now running much slower

2013-12-30 Thread Keith Winston
It seems likely that mentioning what version of Python you're running it on
might help in troubleshooting. If you can, run it on a subsection of your
data to get it down to a workable amount of time (as in, minutes), and then
put timers on the various sections to try to see what's taking so long. My
suspicion, almost wholly free of distortion by facts, is that you are
running it on a different version of Python, either an upgrade or downgrade
was done and something about the script doesn't like that. Is it a terribly
long script?
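
By "timers" I mean something along these lines. This is only a sketch: the timed() helper, the section labels and the toy data are all invented, and the real script's sections would go inside the with blocks.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Accumulate the wall-clock time spent inside the with block under label."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start


with timed("load"):
    reads = ["ACGT" * 5 for _ in range(1000)]   # stand-in for reading a data subset

with timed("search"):
    hits = [read for read in reads if "GTAC" in read]

for label in sorted(timings):
    print("%-8s %.6f s" % (label, timings[label]))
```

time.perf_counter() needs Python 3.3 or later; on an older interpreter time.time() is a coarser substitute.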




On Mon, Dec 30, 2013 at 5:37 PM, William Ray Wing  wrote:

> On Dec 30, 2013, at 1:37 PM, "Protas, Meredith" 
> wrote:
>
> Hi,
>
> I'm very new to python so I'm sorry about such a basic question.
>
> I am using a python script generated by another person.  I have used this
> script multiple times before and it takes around 24 hours to run.
>  Recently, I have tried to run the script again (the same exact command
> lines) and it is much much slower.  I have tried on two different computers
> with the same result.  I used top to see if there were any suspicious
> functions that were happening but there seems to not be.  I also ran
> another python script I used before and that went at the same speed as
> before so the problem seems unique to the first python script.
>
> Does anyone have any idea why it is so much slower now than it used to be
> (just around a month ago).
>
> Thanks for your help!
>
> Meredith
>
>
> Meredith,  This is just a slight expansion on the note you received from
> Alan.  Is there any chance that the script now is paging itself to death?
>  That is, if you are reading a huge amount of data into a structure in
> memory, and if it no longer fits in available physical memory (either
> because the amount of data to be read has grown or the number of other
> processes that are occupying memory have grown), then that data structure
> may have gone virtual and the OS may be swapping it out to disk.  That
> would dramatically increase the amount of elapsed wall time the program
> takes to run.
>
> If you can tell us more about what the program actually is doing or
> calculating, we might be able to offer more help.
>
> -Bill
>
>
>


-- 
Keith


Re: [Tutor] can't install

2013-12-30 Thread Tobias M.


Quoting Lolo Lolo :


> Hi, can i ask why the name ~/my_venv/  .. is that just to indicate ~
> as the home directory?

The name was just an example. Yes, '~' is your home directory. You can
actually use this in your shell instead of typing the whole path.
You probably want to put virtual environments in your home directory.

> so pyvenv already has access to virtualenv? i thought i would have
> needed pypi  https://pypi.python.org/pypi/virtualenv  or
> easy_install before i got install it, but here you create the
> virtualenv before getting distribute_setup.py.


Since Python 3.3 a 'venv' module is included in the standard library  
and shipped with the pyvenv tool. So you no longer need the  
'virtualenv' 3rd party module.
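
The same module can also be driven from Python directly. Here is a minimal sketch, assuming Python 3.4 or later (the with_pip argument does not exist in 3.3); make_env() and the scratch directory are only for illustration.

```python
import os
import tempfile
import venv


def make_env(parent_dir, name="my_venv"):
    """Create a bare virtual environment under parent_dir and return its path."""
    env_dir = os.path.join(parent_dir, name)
    # with_pip=False keeps this quick and avoids depending on ensurepip.
    venv.create(env_dir, with_pip=False)
    return env_dir


# A throwaway directory stands in for the home directory here.
with tempfile.TemporaryDirectory() as scratch:
    env_dir = make_env(scratch)
    has_config = os.path.exists(os.path.join(env_dir, "pyvenv.cfg"))
    print("created:", os.path.basename(env_dir), "config present:", has_config)
```

From the shell, "python3 -m venv ~/my_venv" does the same job as the pyvenv script.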



> Everything seems fine now, at last! I really appreciate your help;)


Great :)






Re: [Tutor] same python script now running much slower

2013-12-30 Thread William Ray Wing
On Dec 30, 2013, at 1:37 PM, "Protas, Meredith"  wrote:

> Hi,
> 
> I'm very new to python so I'm sorry about such a basic question.
> 
> I am using a python script generated by another person.  I have used this 
> script multiple times before and it takes around 24 hours to run.  Recently, 
> I have tried to run the script again (the same exact command lines) and it is 
> much much slower.  I have tried on two different computers with the same 
> result.  I used top to see if there were any suspicious functions that were 
> happening but there seems to not be.  I also ran another python script I used 
> before and that went at the same speed as before so the problem seems unique 
> to the first python script.
> 
> Does anyone have any idea why it is so much slower now than it used to be 
> (just around a month ago).
> 
> Thanks for your help!
> 
> Meredith

Meredith,  This is just a slight expansion on the note you received from Alan.  
Is there any chance that the script now is paging itself to death?  That is, if 
you are reading a huge amount of data into a structure in memory, and if it no 
longer fits in available physical memory (either because the amount of data to 
be read has grown or the number of other processes that are occupying memory 
have grown), then that data structure may have gone virtual and the OS may be 
swapping it out to disk.  That would dramatically increase the amount of 
elapsed wall time the program takes to run.

If you can tell us more about what the program actually is doing or 
calculating, we might be able to offer more help.

-Bill


Re: [Tutor] same python script now running much slower

2013-12-30 Thread Alan Gauld

On 30/12/13 18:37, Protas, Meredith wrote:


> I am using a python script generated by another person.  I have used
> this script multiple times before and it takes around 24 hours to run.

That's a long time by any standards.
I assume it's processing an enormous amount of data?

> Recently, I have tried to run the script again (the same exact command
> lines) and it is much much slower.

How much slower?
Precision is everything in programming...

> Does anyone have any idea why it is so much slower now than it used to
> be (just around a month ago).

There are all sorts of possibilities but without any idea of what it
does or how it does it we would only be making wild guesses.

Here are some...
Maybe you have increased your data size since last time? Maybe the
computer configuration has changed? Maybe somebody modified the
script? Maybe the data got moved to a network drive, or an
optical disk? Who knows?

Without more detail we really can't say.


--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos



[Tutor] same python script now running much slower

2013-12-30 Thread Protas, Meredith
Hi,

I'm very new to python so I'm sorry about such a basic question.

I am using a python script generated by another person.  I have used this 
script multiple times before and it takes around 24 hours to run.  Recently, I 
have tried to run the script again (the same exact command lines) and it is 
much much slower.  I have tried on two different computers with the same 
result.  I used top to see if there were any suspicious functions that were 
happening but there seems to not be.  I also ran another python script I used 
before and that went at the same speed as before so the problem seems unique to 
the first python script.

Does anyone have any idea why it is so much slower now than it used to be (just 
around a month ago).

Thanks for your help!

Meredith


Re: [Tutor] Generator next()

2013-12-30 Thread spir

On 12/29/2013 12:33 PM, Steven D'Aprano wrote:

def adder_factory(n):
    def plus(arg):
        return arg + n
    return plus  # returns the function itself


If you call adder_factory(), it returns a function:

py> adder_factory(10)
<function adder_factory.<locals>.plus at 0xb7af6f5c>

What good is this? Watch carefully:


py> add_two = adder_factory(2)
py> add_three = adder_factory(3)
py> add_two(100)
102
py> add_three(100)
103


The factory lets you programmatically create functions on the fly.
add_two() is a function which adds two to whatever argument it gets.
add_three() is a function which adds three to whatever argument it gets.
We can create an "add_whatever" function without knowing in advance what
"whatever" is going to be:

py> from random import randint
py> add_whatever = adder_factory(randint(1, 10))
py> add_whatever(200)
205
py> add_whatever(300)
305


So now you see the main reason for nested functions in Python: they let
you create a function where you don't quite know exactly what it will do
until runtime. You know the general form of what it will do, but the
precise details aren't specified until runtime.


little complement:
Used that way, a function factory is the programming equivalent of parametric 
functions, or function families, in maths:


add : v --> v + p

Where does p come from? it's a parameter (in the sense of maths), or an "unbound 
variable"; while 'v' is the actual (bound) variable of the function. Add 
actually defines a family of parametric functions, each with its own 'p' 
parameter, adding p to their input variable (they each have only one). The 
programming equivalent of this is what Steven demonstrated above. However, there 
is much confusion due to (funny) programming terminology:

* a math variable is usually called parameter
* a math parameter is usually called "upvalue" (not kidding!)
* a parametric function, or function family (with indefinite param) is called 
function factory (this bit is actually semantically correct)

* a member of function family (with definite param) is called (function) closure
(every parametric expression is actually a closure, eg "index = index + offset", 
if offset is not locally defined)
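
To make the terminology concrete, here is a small sketch that peeks inside two members of the family. The names follow the maths notation above (v for the variable, p for the parameter) rather than Steven's original code; __code__.co_freevars and __closure__ are standard attributes of Python functions.

```python
def adder_factory(p):
    """The parametric function: p is the parameter, still open at this point."""
    def add(v):
        # v is the function's own (bound) variable; p is captured from outside.
        return v + p
    return add


add_two = adder_factory(2)      # one member of the family, with p fixed to 2
add_three = adder_factory(3)    # another member, with p fixed to 3

free_names = add_two.__code__.co_freevars            # names captured from outside
bound_two = add_two.__closure__[0].cell_contents     # the value captured for p
bound_three = add_three.__closure__[0].cell_contents

print(free_names, bound_two, bound_three)
print(add_two(100), add_three(100))
```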


Denis


Re: [Tutor] Generator next()

2013-12-30 Thread spir

On 12/29/2013 01:38 PM, Steven D'Aprano wrote:

>In the previous timer function that I was using, it defined a timer class,
>and then I had to instantiate it before I could use it, and then it saved a
>list of timing results. I think in yours, it adds attributes to each
>instance of a function/method, in which the relevant timings are stored.

Correct. Rather than dump those attributes on the original function, it
creates a new function and adds the attributes to that. The new function
wraps the older one, that is, it calls the older one:

def f(x):
    return g(x)

We say that "f wraps g". Now, obviously wrapping a function just to
return its result unchanged is pretty pointless, normally the wrapper
function will do something to the arguments, or the result, or both.


Or add metadata, as in your case the 'count'.
This pattern is often used in parsing to add metadata (semantic, or stateful, 
usually) to bare match results.
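
A minimal sketch of such a metadata-adding wrapper, written as a decorator. This is not Steven's timer code; the counted() name and the toy double() function are invented for the example.

```python
import functools


def counted(func):
    """Wrap func so that the wrapper records how many times it has been called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.count += 1          # metadata lives on the wrapper, not on func
        return func(*args, **kwargs)
    wrapper.count = 0
    return wrapper


@counted
def double(x):
    return 2 * x


results = [double(n) for n in range(3)]
print(results, double.count)
```

functools.wraps copies the wrapped function's name and docstring onto the wrapper, so double still introspects as "double".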


Denis