Re: question on string object handling in Python 2.7.8

2014-12-25 Thread Denis McMahon
On Tue, 23 Dec 2014 20:28:30 -0500, Dave Tian wrote:

> Hi,
> 
> There are 2 statements:
> A: a = ‘h’
> B: b = ‘hh’
> 
> According to my understanding, A should be faster, as the characters
> cache would shortcut this 1-byte string ‘h’ without a malloc; B should
> be slower than A, as the characters cache does not work for the 2-byte
> string ‘hh’, which triggers a malloc. However, when I put A/B into a big
> loop and try to measure the performance using cProfile, B always seems
> faster than A.
> Testing code:
> for i in range(0, 100000000):
>   a = ‘h’ #or b = ‘hh’
> Testing cmd: python -m cProfile test.py
> 
> So what is wrong here? B has one more malloc than A but is faster than
> A?

Your understanding.

The first time through the loop, python creates a string object "h" or 
"hh", creates the name (a or b), and binds it to that string object.

The remaining 99,999,999 times through the loop, python simply re-binds 
the existing name to the existing string object.

Maybe a 2 character string is faster to locate in the object table than a 
1 character string, which would make the lookup quicker in the 2 character 
case.
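
One way to see that the loop really is re-binding the name to the same 
pre-built object every pass is a sketch like this (mine, not from the 
original post; CPython-specific, since it relies on id()):

seen = set()
for i in range(1000):
    a = 'h'            # the literal is a compile-time constant
    seen.add(id(a))
print len(seen)        # 1 -- every iteration binds a to the same object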

-- 
Denis McMahon, denismfmcma...@gmail.com
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: question on string object handling in Python 2.7.8

2014-12-24 Thread Gregory Ewing

Dave Tian wrote:

> A: a = ‘h’
> B: b = ‘hh’
> 
> According to my understanding, A should be faster, as the characters
> cache would shortcut this 1-byte string ‘h’ without a malloc;


It sounds like you're expecting characters to be stored
"unboxed" like in Java.

That's not the way Python works. Objects are used for
everything, including numbers and characters (there is
no separate character type in Python, they're just
length-1 strings).
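
A quick illustration of that point (a sketch of my own, not from the 
original exchange):

>>> s = 'hh'
>>> type(s[0])        # indexing a str gives back another str, not a "char"
<type 'str'>
>>> s[0]
'h'
>>> len(s[0])
1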

> for i in range(0, 100000000):
>    a = ‘h’ #or b = ‘hh’
> Testing cmd: python -m cProfile test.py

Since you're assigning a string literal, there's just
one string object being allocated (at the time the code
is read in and compiled). All the loop is doing is
repeatedly assigning a reference to that object to a
or b, which doesn't require any further mallocs;
all it does is adjust reference counts. This will
be swamped by the overhead of the for-loop itself,
which is allocating and deallocating 100 million
integer objects.
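
You can see that the string object already exists in the compiled code 
object's constants before the loop ever runs (a sketch, with a smaller 
loop bound for convenience):

>>> code = compile("for i in range(1000): a = 'h'", '<test>', 'exec')
>>> code.co_consts    # 'h' is built once, at compile time
(1000, 'h', None)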

I would expect both of these to be exactly the same
speed, within measurement error. Any difference you're
seeing is probably just noise, or the result of some
kind of warming-up effect.

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: question on string object handling in Python 2.7.8

2014-12-24 Thread Ian Kelly
On Wed, Dec 24, 2014 at 4:22 AM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
> What happens here is that you time a piece of code to:
>
> - Build a large list containing 100 million individual int objects. Each
> int object has to be allocated at run time, as does the list. Each int
> object is about 12 bytes in size.

Note to the OP: since you're using Python 2 you would do better to loop
over an xrange object instead of a range. xrange produces an iterator over
the desired range without needing to construct a single list containing all
of them. They would all still need to be allocated, but not all at once,
and memory could be reused.
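
For example (a minimal sketch of that suggestion, with the 100-million 
loop bound taken from the earlier posts):

# Python 2: xrange yields the loop indices lazily instead of building a
# 100-million-element list up front before the loop starts.
for i in xrange(100000000):
    a = 'h'  # or b = 'hh'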
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: question on string object handling in Python 2.7.8

2014-12-24 Thread Ned Batchelder

On 12/23/14 8:28 PM, Dave Tian wrote:

> Hi,
> 
> There are 2 statements:
> A: a = ‘h’
> B: b = ‘hh’
> 
> According to my understanding, A should be faster, as the characters cache 
> would shortcut this 1-byte string ‘h’ without a malloc; B should be slower 
> than A, as the characters cache does not work for the 2-byte string ‘hh’, 
> which triggers a malloc.


I'm not sure why you thought a two-character string would require an 
extra malloc?  In Python 2.7, strings have a fixed-size head portion, 
then variable-length character storage.  These two parts are contiguous 
in one chunk of memory, for any length string.  Making a string requires 
one malloc, regardless of the size of the string.
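
sys.getsizeof makes that layout visible: a fixed per-object overhead plus 
one byte per character, all in a single allocation (a sketch under CPython 
2.7; the exact header size depends on the build, roughly 21 bytes on 
32-bit and 37 on 64-bit):

import sys
# The reported size grows by exactly one byte per extra character;
# everything else is the fixed string header.
for s in ('', 'h', 'hh', 'hhh'):
    print repr(s), sys.getsizeof(s)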


What reference were you reading that implied otherwise?


> However, when I put A/B into a big loop and try to measure the performance 
> using cProfile, B always seems faster than A.
> Testing code:
> for i in range(0, 100000000):
>     a = ‘h’ #or b = ‘hh’
> Testing cmd: python -m cProfile test.py
> 
> So what is wrong here? B has one more malloc than A but is faster than A?
> 
> Thanks,
> Dave





--
Ned Batchelder, http://nedbatchelder.com

--
https://mail.python.org/mailman/listinfo/python-list


Re: question on string object handling in Python 2.7.8

2014-12-24 Thread Dave Angel

On 12/23/2014 08:28 PM, Dave Tian wrote:

Hi,



Hi, please do some things when you post new questions:

1) identify your Python version.  In this case it makes a big 
difference, as in Python 2.x, the range function is the only thing that 
takes any noticeable time in this code.


2) when posting code, use cut 'n paste.  You retyped the code, which 
could have caused typos, and in fact did, since your email editor (or 
newsgroup editor, or whatever) decided to use 'smart quotes' instead of 
single quotes.  The Unicode characters shown in "Testing code" below 
include


   ‘   LEFT SINGLE QUOTATION MARK
and
   ’   RIGHT SINGLE QUOTATION MARK

which are not valid Python syntax.
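
A quick way to see exactly which characters ended up in the retyped line 
(my sketch, not part of the original post):

# -*- coding: utf-8 -*-
# Identify the characters in the retyped statement.  Under Python 2.7
# this prints LEFT SINGLE QUOTATION MARK / RIGHT SINGLE QUOTATION MARK
# for the two curly quotes, which the tokenizer rejects.
import unicodedata

line = u'a = \u2018h\u2019'   # the retyped statement: a = ‘h’
for ch in line:
    print repr(ch), unicodedata.name(ch)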


> There are 2 statements:
> A: a = ‘h’
> B: b = ‘hh’
> 
> According to my understanding, A should be faster, as the characters cache 
> would shortcut this 1-byte string ‘h’ without a malloc;


Nope, there's no such promise in Python.  If there were such an 
optimization, it might vary between one implementation of Python and 
another, and between one version and the next.


But it'd be very hard to implement such an optimization, since the C 
interface would then see it, and third party native libraries would have 
to have special coding for this one kind of object.


You're probably thinking of Java and C#, which have native data and 
boxed data (I don't recall just what each one calls it).  Python, at 
least for the last 15 years or so, makes everything an object, which 
means there are no special cases for us to deal with.


> B should be slower than A, as the characters cache does not work for the 
> 2-byte string ‘hh’, which triggers a malloc. However, when I put A/B into 
> a big loop and try to measure the performance using cProfile, B always 
> seems faster than A.

> Testing code:
> for i in range(0, 100000000):
>     a = ‘h’ #or b = ‘hh’
> Testing cmd: python -m cProfile test.py
> 
> So what is wrong here? B has one more malloc than A but is faster than A?



In my testing, sometimes A is quicker, and sometimes B is quicker.  But 
of course there are many ways of testing it, and many versions to test 
it on.  I put those statements (after fixing the quotes) into two 
functions, and called the two functions, letting profile tell me which 
was faster.


Incidentally, just putting them in functions cut the time by 
approximately 50%, probably because local variable lookup in a function 
is much faster in CPython than access to variables in globals().
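
Roughly what that test looks like (my reconstruction of the setup 
described above, not the actual file; the loop count is only 
illustrative):

# test_hh.py -- time the two assignments inside functions, where both
# the loop variable and the target name are fast local lookups.
def loop_h():
    for i in xrange(10000000):
        a = 'h'

def loop_hh():
    for i in xrange(10000000):
        b = 'hh'

loop_h()
loop_hh()

# Run with:  python -m cProfile test_hh.py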


There are other things going on. In any recent CPython implementation, 
certain strings will be interned, which can both save memory and avoid 
the constant thrashing of malloc and free. So we might get different 
results by choosing a string which won't happen to get interned.
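
A quick way to watch that caching/interning in action under CPython 2.7 
(a sketch; this is an implementation detail, not a language guarantee):

>>> a = 'h'
>>> b = 'h'
>>> a is b                  # 1-character strings are cached: same object
True
>>> c = 'hh'
>>> d = 'hh'
>>> c is d                  # identifier-like literals are interned too
True
>>> intern('x' * 100) is intern('x' * 100)   # explicit interning
True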


It's hard to get excited over any of these differences, but it is fun to 
think about it.


--
DaveA
--
https://mail.python.org/mailman/listinfo/python-list


Re: question on string object handling in Python 2.7.8

2014-12-24 Thread Steven D'Aprano
Dave Tian wrote:

> Hi,
> 
> There are 2 statements:
> A: a = ‘h’
> B: b = ‘hh’
> 
> According to my understanding, A should be faster, as the characters
> cache would shortcut this 1-byte string ‘h’ without a malloc; B should be
> slower than A, as the characters cache does not work for the 2-byte string
> ‘hh’, which triggers a malloc. However, when I put A/B into a big loop and
> try to measure the performance using cProfile, B always seems faster than
> A. 
>
> Testing code: 
> for i in range(0, 100000000): a = ‘h’ #or b = ‘hh’ 
> Testing cmd: python -m cProfile test.py

Any performance difference is entirely an artifact of your testing method.
You have completely misinterpreted what this piece of code will do.

What happens here is that you time a piece of code to:

- Build a large list containing 100 million individual int objects. Each int
object has to be allocated at run time, as does the list. Each int object
is about 12 bytes in size.

- Then, the name i is bound to one of those int objects. This is a fast
pointer assignment.

- Then, a string object containing either 'h' or 'hh' is allocated. In
either case, that requires 21 bytes, plus one byte per character. So either
22 or 23 bytes.

- The name a is bound to that string object. This is also a fast pointer
assignment.

- The loop returns to the top, and the name i is bound to the next int
object.

- Then, the name a is bound *to the same string object*, since it will have
been cached. No further malloc will be needed.


So as you can see, the time you measure is dominated by allocating a massive
list containing 100 million int objects. Only a single string object is
allocated, and the time difference between creating 'h' versus 'hh' is
insignificant.

The byte code can be inspected like this:

py> code = compile("for i in range(100000000): a = 'h'", '<string>', 'exec')
py> from dis import dis
py> dis(code)
  1           0 SETUP_LOOP              26 (to 29)
              3 LOAD_NAME                0 (range)
              6 LOAD_CONST               0 (100000000)
              9 CALL_FUNCTION            1
             12 GET_ITER
        >>   13 FOR_ITER                12 (to 28)
             16 STORE_NAME               1 (i)
             19 LOAD_CONST               1 ('h')
             22 STORE_NAME               2 (a)
             25 JUMP_ABSOLUTE           13
        >>   28 POP_BLOCK
        >>   29 LOAD_CONST               2 (None)
             32 RETURN_VALUE


Notice instructions 19 and 22:

             19 LOAD_CONST               1 ('h')
             22 STORE_NAME               2 (a)

The string object is built at compile time, not run time, and Python simply
binds the name a to the pre-existing string object.

If you looked at the output of your timing code, you would see something
like this (only the times would be much larger; I cut the loop down from
100 million to only a few tens of thousands):


[steve@ando ~]$ python -m cProfile /tmp/x.py
         3 function calls in 0.062 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.051    0.051    0.062    0.062 x.py:1(<module>)
        1    0.000    0.000    0.000    0.000 {method 'disable'
of '_lsprof.Profiler' objects}
        1    0.012    0.012    0.012    0.012 {range}


The profiler doesn't even show the time required to bind the name to the
string object.

Here is a better way of demonstrating the same thing:


py> from timeit import Timer
py> t = Timer("a = 'h'")
py> min(t.repeat())
0.0508120059967041
py> t = Timer("a = ''")
py> min(t.repeat())
0.050585031509399414

No meaningful difference in time. What difference you do see is a fluke of
timing.




-- 
Steven

-- 
https://mail.python.org/mailman/listinfo/python-list


question on string object handling in Python 2.7.8

2014-12-24 Thread Dave Tian
Hi,

There are 2 statements:
A: a = ‘h’
B: b = ‘hh’

According to my understanding, A should be faster, as the characters cache 
would shortcut this 1-byte string ‘h’ without a malloc; B should be slower 
than A, as the characters cache does not work for the 2-byte string ‘hh’, 
which triggers a malloc. However, when I put A/B into a big loop and try to 
measure the performance using cProfile, B always seems faster than A.
Testing code:
for i in range(0, 100000000):
    a = ‘h’ #or b = ‘hh’
Testing cmd: python -m cProfile test.py

So what is wrong here? B has one more malloc than A but is faster than A?

Thanks,
Dave


-- 
https://mail.python.org/mailman/listinfo/python-list