[Tutor] confusing installation

2013-02-28 Thread Lolo Lolo
Hi all. I'm working through a database tutorial in a book called Core Python 
Applications. On page 290 it asks me to install something, though it's not 
clear what it is. I think it has to do with SQLAlchemy. It says if you use 
Python 3 you'll need to get distribute first. You'll need a web browser (or 
curl, if you have it).
 
Mind you, I haven't a clue what it is on about. Then it goes on:
 
And to download the installation file (available at 
http://python-distribute.org/distribute_setup.py), and then get SQLAlchemy with 
easy install. Here is what this entire process might look like on a 
Windows-based PC... then it proceeds to write a whole bunch of God knows what. 
It looks similar to when you install something on Linux. I'm a Windows-only 
user, btw.
 
I went to that link to download the file, but looking at that site I don't 
know what on earth I'm supposed to do to install or retrieve that file. I do 
not know what easy install is. What do they mean, I need a web browser? Doesn't 
every computer have a web browser? What is curl? I cannot proceed with this 
tutorial if I receive no help. Please can anyone assist me with what I have to 
do to use the tools they are presenting?
 
I use Python 3 and Windows.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] timeit: 10million x 1 Vs 1million x 10

2013-02-28 Thread Steven D'Aprano

On 28/02/13 13:27, DoanVietTrungAtGmail wrote:

Dear tutors

My function below simply populates a large dict. When measured by timeit
populating 10 million items once, versus populating 1 million items ten
times, the times are noticeably different:


I cannot replicate your results. When I try it, I get more or less the same
result each time:

py> for count, N in ((1, 1000), (10, 100)):
...     t = timeit.Timer('f(N)', 'from __main__ import N, writeDict as f')
...     print min(t.repeat(number=count))
...
7.2105910778
7.17914915085

The difference is insignificant.


However, I did notice that when I ran your code, memory consumption went to
80% on my computer, and the load average exceeded 4. I suggest that perhaps
the results you are seeing have something to do with your operating system's
response to memory usage, or some other external factor.


[...]

My guess is that this discrepancy is a result of either some sort of
overhead in timeit, or of Python having to allocate memory space for a dict
10 times. What do you think, and how can I find out for sure?


Whatever the answer is, it is neither of the above.

Firstly, while timeit does have some overhead, it is very small. After all,
timeit is designed for timing tiny sub-microsecond code snippets.

You can get an idea of timeit's overhead like this:

py> from timeit import Timer
py> t1 = Timer("x = 2")
py> t2 = Timer("x = 1;x = 2")
py> min(t1.repeat())
0.048729896545410156
py> min(t2.repeat())
0.06900882720947266


If timeit had no overhead at all, t2 should take twice as long as t1 since
it has two instructions rather than one. But it doesn't, so we can calculate
the (approximate) overhead with a bit of maths:

overhead + t  = 0.0487
overhead + 2t = 0.0690

Solving this gives me an overhead of about 0.028s, which is the total for the
one million loops that timeit does by default. So as you can see, it's quite
small: about 30 nanoseconds on my computer per loop. Even if it were a million
times greater, it wouldn't be enough to explain the results you see.
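
The same arithmetic can be sketched in Python 3. The helper name
estimate_overhead is made up for illustration, and the two inputs are simply
the timings quoted above; numbers on another machine will differ.

```python
def estimate_overhead(t_one, t_two):
    """Solve  overhead + t = t_one  and  overhead + 2t = t_two."""
    per_statement = t_two - t_one       # t: cost of the one extra statement
    overhead = t_one - per_statement    # whatever is left is timeit's overhead
    return overhead, per_statement


# Timings quoted above, each covering timeit's default of one million loops.
overhead, per_statement = estimate_overhead(0.048729896545410156,
                                            0.06900882720947266)
print("overhead for 1,000,000 loops: %.4f s" % overhead)
print("overhead per loop: %.1f ns" % (overhead / 1000000 * 1e9))
```

With the unrounded timings this comes to roughly 0.028 s in total, a little
under 30 ns per loop. It is only a rough estimate, since it assumes both
assignments cost exactly the same.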

As for your other suggestion, about the memory space allocation, it is also
unlikely to be correct. You are using the same dict on every test! On the
first run, Python has to reallocate memory to make the dict big enough for
10 million entries. After that, the dict is already resized and never gets
any bigger.
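
A minimal Python 3 sketch of that point, assuming a much smaller N than the
original test so that it runs quickly. writeDict and testDict mirror the names
in the posted code; range stands in for Python 2's xrange.

```python
testDict = {}


def writeDict(N):
    # Same body as the posted function, with range instead of xrange.
    for i in range(N):
        testDict[i] = [i, [i + 1, i + 2], i + 3]


N = 1000
writeDict(N)
size_after_first = len(testDict)
dict_id = id(testDict)

writeDict(N)  # the second run overwrites the same keys in the same dict
size_after_second = len(testDict)

print(size_after_first, size_after_second)
print(id(testDict) == dict_id)
```

Both lengths are the same and the dict object is the same one, so the second
and later runs only replace values under keys that already exist.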




Second (for me, this question is more important), how to improve
performance? (I tried a tuple rather than a list for the dict values, it
was slightly faster, but I need dict items to be mutable)



Firstly, are you sure you need to improve performance?

Secondly, performance of what? Have you profiled your application to see
which parts are slow, or are you just guessing?

As it stands, I cannot advise you how to speed your application up, because
I don't know what it does or what bits are slow. But I doubt very much that
the slow part is storing a small list inside a dict.
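
One way to stop guessing is the standard library profiler. The following is
only a Python 3 sketch: build is a hypothetical stand-in for the real
application code, which was not posted.

```python
import cProfile
import io
import pstats


def build(n):
    # Hypothetical stand-in for the real workload.
    d = {}
    for i in range(n):
        d[i] = [i, [i + 1, i + 2], i + 3]
    return d


profiler = cProfile.Profile()
profiler.enable()
result = build(100000)
profiler.disable()

# Collect the report in a string and show the five most expensive entries,
# sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists each function with its call count and time, which shows
where the program actually spends its time before anything is optimised.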




--
Steven


Re: [Tutor] timeit: 10million x 1 Vs 1million x 10

2013-02-28 Thread Alan Gauld

On 28/02/13 02:27, DoanVietTrungAtGmail wrote:


---
import timeit

N = 1000 # This constant's value is either 10 million or 1 million
testDict = {}
def writeDict(N):
    for i in xrange(N):
        testDict[i] = [i, [i + 1, i + 2], i + 3]
# the 'number' parameter is either 1 or 10
print timeit.Timer('f(N)', 'from __main__ import N, writeDict as f').timeit(1)

---

My guess is that this discrepancy is a result of either some sort of
overhead in timeit, or of Python having to allocate memory space for a
dict 10 times. What do you think, and how can I find out for sure?


There are several extra overheads including calling the function 
multiple times and deleting the structures you created each time.



Second (for me, this question is more important), how to improve
performance?


In this specific case the best improvement is not to create the dict at 
all. Since the values are all derived from the key all you need is to 
store the keys and calculate the values when needed. But I suspect the 
real world use case is not that simple...
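
A minimal Python 3 sketch of that idea. value_for is a hypothetical helper
name, and the formula is the one from the posted writeDict.

```python
def value_for(key):
    # Hypothetical helper: rebuild the list the posted code stored under key.
    return [key, [key + 1, key + 2], key + 3]


# A range object describes the keys without storing ten million entries.
keys = range(10 ** 7)

print(value_for(5))
print(9999999 in keys)
```

Nothing is stored per key, so the memory cost is close to zero. The price is
recomputing the value on every access, and the computed lists cannot be
modified in place the way stored ones can.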


You could try moving N and the dict inside the function - local 
variables are usually slightly faster than globals.
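
A Python 3 sketch of that change, assuming the dict can be built inside the
function and returned. write_dict_local is a made-up name, and the sizes are
kept small so it finishes quickly.

```python
import timeit


def write_dict_local(n):
    # The dict is a local name here, so each store avoids a global lookup.
    test_dict = {}
    for i in range(n):
        test_dict[i] = [i, [i + 1, i + 2], i + 3]
    return test_dict


# timeit accepts a callable as well as a string of code.
elapsed = timeit.timeit(lambda: write_dict_local(10000), number=10)
print("10 runs of 10,000 items: %.4f s" % elapsed)
```

Note that this version builds a fresh dict on every call, unlike the posted
code, so the two timings are not measuring exactly the same work.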


You could also try using a generator for the dict.

I've no idea how much faster or slower that would be; with all things 
performance-related, testing is the only sure way.
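
One possible reading of the generator suggestion, sketched in Python 3: feed
dict() a generator expression of (key, value) pairs, or use the equivalent
dict comprehension. Both function names are made up for illustration.

```python
def build_dict(n):
    # Generator expression yielding (key, value) pairs, consumed by dict().
    pairs = ((i, [i, [i + 1, i + 2], i + 3]) for i in range(n))
    return dict(pairs)


def build_dict_comprehension(n):
    # The same result written as a dict comprehension.
    return {i: [i, [i + 1, i + 2], i + 3] for i in range(n)}


print(build_dict(3) == build_dict_comprehension(3))
```

Which of these is faster than the explicit loop, if either, has to be
measured with timeit on the machine in question.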




--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
