Re: [Tutor] a little loop

2013-05-30 Thread Oscar Benjamin
Sending again to the list (sorry boB)...

On 29 May 2013 17:51, boB Stepp robertvst...@gmail.com wrote:
 I don't know exactly how str.join is implemented but it does not use
 this quadratic algorithm. For example if str.join would first compute
 the length of the resulting string first then it can allocate memory
 for exactly one string of that length and copy each substring to the
 appropriate place (actually I imagine it uses an exponentially
 resizing buffer but this isn't important).


 ...str.join gets around these issues?

As I said I don't know how this is implemented in CPython (I hoped
Eryksun might chime in there :) ).

 In the linked article it was
 discussing increasing memory allocation by powers of two instead of
 trying to determine the exact length of the strings involved,
 mentioning that the maximum wasted memory would be 50% of what was
 actually needed. Is Python more clever in its implementation?

Actually the maximum memory wastage is 100% of what is needed, or 50%
of what is actually allocated. This happens if the amount needed is one
greater than a power of two and you end up doubling to the next power of two.
I don't see how CPython could be much cleverer in its implementation.
There aren't that many reasonable strategies here (when implementing
strings as linear arrays like CPython does).
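
To put numbers on that worst case, here is a small sketch of a buffer that
grows by doubling. It is a hypothetical strategy for illustration only, not
CPython's actual code, and the names doubled_capacity and waste_report are
made up:

```python
def doubled_capacity(needed):
    """Smallest power of two >= needed (hypothetical doubling buffer)."""
    capacity = 1
    while capacity < needed:
        capacity *= 2
    return capacity


def waste_report(needed):
    """Return (capacity, waste relative to needed, waste relative to allocated)."""
    capacity = doubled_capacity(needed)
    wasted = capacity - needed
    return capacity, wasted / needed, wasted / capacity


# One more than a power of two is the worst case described above.
capacity, vs_needed, vs_allocated = waste_report(1025)
print(capacity, vs_needed, vs_allocated)
```

The two ratios approach 100% and 50% respectively, but never quite reach them.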

 * CPython actually has an optimisation that can append strings in
 precisely this situation. However it is an implementation detail of
 CPython that may change and it does not work in other interpreters
 e.g. Jython. Using this kind of code can damage portability since your
 program may run fine in CPython but fail in other interpreters.

 You are speaking of appending and not concatenation here?

In this case I was just talking about single characters so you could
think of it as either. However, yes the optimisation is for
concatenation and in particular the '+' and '+=' operators.

 I had not even considered other Python interpreters than CPython. More
 complexity to consider for the future...

It's only a little bit of complexity. Just bear in mind the
distinction between a language feature that is true in any
conforming implementation and an implementation detail that happens
to be true in some or other interpreter but is not a specified part of
the language. In practice this means not really thinking too hard
about how CPython implements things and just using the recommended
idioms e.g. str.join.

I don't know if it is documented anywhere that str.join is linear
rather than quadratic but I consider that to be a language feature.
Exactly how it achieves linear behaviour (precomputing, resizing,
etc.) is an implementation detail. If your code relies only on
language features then it should not have problems when changing
interpreters.
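
As a minimal sketch of that recommended idiom (Python 3 syntax; the function
name build_string is only an illustration), collect the pieces in a list and
join them once at the end:

```python
def build_string(pieces):
    """Assemble a string from an iterable of substrings with a single join."""
    parts = []
    for piece in pieces:
        # Appending to a list does not rebuild the string so far.
        parts.append(piece)
    # One join at the end, whatever the interpreter does internally.
    return ''.join(parts)


print(build_string('spam'))
print(build_string(['NOBODY', ' ', 'expects']))
```

This relies only on list.append and str.join, so it should behave the same
way in any interpreter.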


Oscar
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] a little loop

2013-05-30 Thread Steven D'Aprano

On 30/05/13 02:51, boB Stepp wrote:

On Wed, May 29, 2013 at 11:07 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:



I don't know exactly how str.join is implemented but it does not use
this quadratic algorithm. For example if str.join would first compute
the length of the resulting string first then it can allocate memory
for exactly one string of that length and copy each substring to the
appropriate place (actually I imagine it uses an exponentially
resizing buffer but this isn't important).


I have not actually read the str.join source code, but what I understand is 
that it has two cases:

1) If you pass a sequence of sub-strings to join, say, a list, it can look at 
each sub-string, calculate the total space required, and allocate a string 
buffer of exactly that amount of memory, and only then copy the characters into 
the buffer.

2) If you pass an iterator, join cannot go over the sub-strings twice, it has 
to do so in one pass. It probably over-allocates the buffer, then when 
finished, shrinks it back down again.


Sure enough, ''.join(list-of-substrings) is measurably faster than 
''.join(iterator-of-substrings).





...str.join gets around these issues? In the linked article it was
discussing increasing memory allocation by powers of two instead of
trying to determine the exact length of the strings involved,
mentioning that the maximum wasted memory would be 50% of what was
actually needed. Is Python more clever in its implementation?



In the case of lists, CPython will over-allocate. I believe that up to some 
relatively small size, lists are initially quadrupled in size, after which time 
they are doubled. The exact cut-off size is subject to change, but as an 
illustration, we can pretend that it looks like this:


- An empty list is created with, say, 20 slots, all blank.

- When all 20 slots are filled, the next append or insert will increase the 
size of the list to 80 slots, 21 of which are used and 59 are blank.

- When those 80 slots are filled, the next append or insert will increase to 
320 slots.

- When those are filled, the number of slots is doubled to 640.

- Then 1280, and so forth.


So small lists waste more memory, up to 75% of the total size, but who cares, 
because they're small. Having more slots available, they require even fewer resizes, so 
they're fast.

However, I emphasise that the exact memory allocation scheme is not guaranteed, 
and is subject to change without notice. The only promise made, and this is 
*implicit* and not documented anywhere, is that appending to a list will be 
amortised to constant time, on average.

(Guido van Rossum, Python's creator, has said that he would not look kindly on 
anything that changed the basic performance characteristics of lists.)
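
To see why over-allocation gives amortised constant-time appends, here is a
small simulation. It assumes a plain doubling strategy, which is a
simplification and not CPython's real scheme, and simulate_appends is a
made-up name:

```python
def simulate_appends(n, growth=2):
    """Count how many item copies n appends cost with a growing array."""
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:
            # Array is full: move every existing item to a bigger block.
            copies += size
            capacity *= growth
        size += 1
    return copies


for n in (10, 1000, 100000):
    copies = simulate_appends(n)
    print(n, copies, copies / n)
```

The total number of copies stays below 2 * n, so the average cost per append
is bounded by a constant even though an individual append may be expensive.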


When creating a string, Python may be able to determine the exact size 
required, in which case no over-allocation is needed. But when it can't, it may 
use a similar over-allocation strategy as for lists, except that the very last 
thing done before returning the string is to shrink it down so there's no 
wasted space.




* CPython actually has an optimisation that can append strings in
precisely this situation. However it is an implementation detail of
CPython that may change and it does not work in other interpreters
e.g. Jython. Using this kind of code can damage portability since your
program may run fine in CPython but fail in other interpreters.


You are speaking of appending and not concatenation here?



Yes. Because strings are immutable, under normal circumstances, concatenating 
two strings requires creating a third. Suppose you say:

A = "Hello "
B = "World!"
C = A + B

So Python can see that string A has 6 characters, and B has 6 characters, so C 
requires space for 12 characters:

C = "------------"

which can then be filled in:

C = "Hello World!"

and now string C is ready to be used.

But, suppose we have this instead:

A = A + B  # or A += B

The *old* A is used, then immediately discarded, and replaced with the new 
string. This leads to a potential optimization: instead of having to create a 
new string, Python can resize A in place:

A = "Hello ------"
B = "World!"

then copy B into A:

A = "Hello World!"


But note that Python can only do this if A is the one and only reference to the 
string. If any other name, list, or other object is pointing to the string, 
this cannot be done. Also, you can't do it for the reverse:

B = A + B

since memory blocks can generally only grow from one side, not the other.

Finally, it also depends on whether the operating system allows you to grow 
memory blocks in this fashion. It may not. So the end result is that you cannot 
really rely on this optimization. It's nice when it is there, but it may not 
always be there.

And just a reminder, none of this is important for one or two string 
concatenations. It's only when you build up a string from repeated 
concatenations that this becomes an issue.



--
Steven

Re: [Tutor] a little loop

2013-05-30 Thread eryksun
On Wed, May 29, 2013 at 12:51 PM, boB Stepp robertvst...@gmail.com wrote:
 On Wed, May 29, 2013 at 11:07 AM, Oscar Benjamin
 oscar.j.benja...@gmail.com wrote:

 I don't know exactly how str.join is implemented but it does not use
 this quadratic algorithm. For example if str.join would first compute
 the length of the resulting string first then it can allocate memory
 for exactly one string of that length and copy each substring to the
 appropriate place (actually I imagine it uses an exponentially
 resizing buffer but this isn't important).

 ...str.join gets around these issues? In the linked article it was
 discussing increasing memory allocation by powers of two instead of
 trying to determine the exact length of the strings involved,
 mentioning that the maximum wasted memory would be 50% of what was
 actually needed. Is Python more clever in its implementation?

CPython computes the exact length required. Nothing clever. It first
expands the iterable into a list. It joins the strings in two passes.
In the first pass it computes the total size (3.3 also has to
determine the 'kind' of unicode string in this loop, i.e. ASCII,
2-byte, etc). Then it allocates a new string and copies in the data in
a second pass.
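
In pure Python the two passes look roughly like this. It is only a sketch of
the idea, not the C implementation: it works on bytes so that a bytearray
can stand in for the pre-allocated buffer, and two_pass_join is a made-up
name:

```python
def two_pass_join(separator, iterable):
    """Join bytes objects by sizing the result first, then copying."""
    pieces = list(iterable)          # expand the iterable into a list
    if not pieces:
        return b''
    # Pass 1: compute the exact size of the result.
    total = sum(len(piece) for piece in pieces)
    total += len(separator) * (len(pieces) - 1)
    buffer = bytearray(total)        # allocated once, exactly the right size
    # Pass 2: copy each piece into its place.
    position = 0
    for index, piece in enumerate(pieces):
        if index:
            buffer[position:position + len(separator)] = separator
            position += len(separator)
        buffer[position:position + len(piece)] = piece
        position += len(piece)
    return bytes(buffer)


print(two_pass_join(b' ', [b'NOBODY', b'expects']))
```

Every byte is copied once into the final buffer, so the work is linear in the
size of the result.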


 * CPython actually has an optimisation that can append strings in
 precisely this situation. However it is an implementation detail of
 CPython that may change and it does not work in other interpreters
 e.g. Jython. Using this kind of code can damage portability since your
 program may run fine in CPython but fail in other interpreters.

 You are speaking of appending and not concatenation here?

In terms of sequence methods, it's inplace concatenation. On their
own, immutable string types only support regular concatenation, but
the interpreter can evaluate the concatenation inplace for special
cases. Specifically, it can resize the target string in an INPLACE_ADD
if it's not interned and has only *one* reference. Also, the reference
has to be a local variable; it can't be a global (unless at module
level), an attribute, or a subscript.

Here's an example (tested in 2.7 and 3.3).

Interned strings:

>>> s = 'abcdefgh' * 128

CPython code objects intern their string constants that are all name
characters (ASCII alphanumeric and underscore). But that's not an
issue if you multiply the string to make it longer than 20 characters.
A sequence length of 20 is the cutoff point for compile-time constant
folding. This keeps the code object size under wraps. The number 20
was apparently chosen for obvious reasons (at least to someone).
Anyway, if the string is determined at runtime, it won't be interned.

But really I'm concatenating the base string with itself so many times
to avoid using the Pymalloc object allocator (see the note below). Its
block sizes are fine grained at just 8 bytes apart. Depending on your
system I don't know if adding even one more byte will push you up to
the next block size, which would defeat an example based on object
id(). I'll take my chances that the stdlib realloc() will be able to
grow the block, but that's not guaranteed either. Strings should be
treated as immutable at all times. This is just a performance
optimization.

The reference count must be 1:

>>> import sys
>>> sys.getrefcount(s)
2

Hmmm. The reference count of the string is incremented when it's
loaded on the stack, meaning it will always be at least 2. As such,
the original variable reference is deleted before in-place
concatenation. By that I mean that if you have s += 'spam', then mid
operation s is deleted from the current namespace. The next
instruction stores the result back.

Voilà:

>>> id_s = id(s)
>>> s += 'spam'
>>> id(s) == id_s
True


Note on object reallocation:
The following assumes CPython is built with the Pymalloc small-object
allocator, which is the default configuration in 2.3+. Pymalloc
requests memory from the system in 256 KiB chunks called arenas. Each
arena is partitioned into 64 pools. Each pool has a fixed block size,
and block sizes increase in steps of 8, from 8 bytes up to 256 bytes
(up to 512 bytes in 3.3).
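
As a rough illustration of those size classes, here is a sketch using only
the figures quoted in this note (8-byte steps, 256-byte limit). It is not
the allocator's real code, the function name block_size_for is made up, and
it assumes a request of at least one byte:

```python
ALIGNMENT = 8            # step between block sizes, per the note above
SMALL_THRESHOLD = 256    # largest small-object block (512 in 3.3)


def block_size_for(nbytes, threshold=SMALL_THRESHOLD):
    """Block size a request would round up to, or None if it is too large."""
    if nbytes > threshold:
        return None      # too big for the small-object allocator
    # Round up to the next multiple of ALIGNMENT.
    return (nbytes + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


for request in (1, 8, 9, 200, 256, 257):
    print(request, block_size_for(request))
```

This shows why adding a single byte can move an object to the next block
size, which is the situation the long example string is meant to avoid.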

Resizing the string eventually calls PyObject_Realloc. If the object
isn't managed by Pymalloc, the call to PyObject_Realloc punts to the C
stdlib realloc. Otherwise if the new size maps to a larger block size,
or if it's shrinking by more than 25% to a smaller block size, the
allocation punts to PyObject_Malloc. This allocates a block from the
first available pool. If the requested size is larger than the maximum
block size, it punts to the C stdlib malloc.


Re: [Tutor] a little loop

2013-05-30 Thread Oscar Benjamin
On 30 May 2013 21:35, eryksun eryk...@gmail.com wrote:
 In terms of sequence methods, it's inplace concatenation. On their
 own, immutable string types only support regular concatenation, but
 the interpreter can evaluate the concatenation inplace for special
 cases. Specifically, it can resize the target string in an INPLACE_ADD
 if it's not interned and has only *one* reference.

It's also for BINARY_ADD in the form a = a + b:

$ python
Python 2.7.3 (default, Sep 26 2012, 21:51:14)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = 'abcdefgh' * 128
>>> id_s = id(s)
>>> s = s + 'spam'
>>> print(id(s) == id_s)
True

A rare case of me actually using the dis module:

>>> def f():
...     s = s + 'spam'
...
>>> import dis
>>> dis.dis(f)
  2           0 LOAD_FAST                0 (s)
              3 LOAD_CONST               1 ('spam')
              6 BINARY_ADD
              7 STORE_FAST               0 (s)
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE


Oscar


Re: [Tutor] a little loop

2013-05-30 Thread eryksun
On Thu, May 30, 2013 at 6:35 PM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:

 It's also for BINARY_ADD in the form a = a + b:

Right you are. It sees that the next operation is a store back to a.
It wouldn't work the other way around, i.e. a = b + a.


Re: [Tutor] a little loop

2013-05-30 Thread eryksun
On Thu, May 30, 2013 at 3:16 PM, Steven D'Aprano st...@pearwood.info wrote:

 Sure enough, ''.join(list-of-substrings) is measurably faster than
 ''.join(iterator-of-substrings).

A tuple or list is used directly. Otherwise join() has to create an
iterator and build a new list.
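
If you want to measure the difference yourself, a small timeit harness along
these lines should do (Python 3 syntax; the function names and sizes are
arbitrary choices, and the timings will vary by machine and version, so none
are quoted here):

```python
import timeit

words = ['spam'] * 10000


def join_list():
    # A list is handed to join() as-is.
    return ''.join(words)


def join_generator():
    # A generator has to be consumed into a sequence before joining.
    return ''.join(word for word in words)


list_time = timeit.timeit(join_list, number=100)
generator_time = timeit.timeit(join_generator, number=100)
print('list:      %.4f s' % list_time)
print('generator: %.4f s' % generator_time)
```

Both functions return the same string; only the time taken differs.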

This isn't directly related to the discussion on string concatenation.
But in relation to the offshoot discussion on lists, I've put together
some examples that demonstrate how different ways of creating the same
list lead to different allocated sizes.

First, here's a function to get the allocated length of a list's item array:

from ctypes import sizeof, c_void_p, c_ssize_t

alloc_offset = sizeof(c_void_p * 2) + sizeof(c_ssize_t * 2)

def allocated(alist):
    addr = id(alist)
    alloc = c_ssize_t.from_address(addr + alloc_offset)
    return alloc.value

It uses ctypes to peek into the object, but you can also use a list
object's __sizeof__() method to calculate the result. First get the
array's size in bytes by subtracting the size of an empty list from
the size of the list. Then divide by the number of bytes in a pointer:

import struct

pointer_size = struct.calcsize('P')
empty_size = [].__sizeof__()

def allocated(alist):
    size_bytes = alist.__sizeof__() - empty_size
    return size_bytes // pointer_size


Example 0:

>>> allocated([0,1,2,3,4,5,6,7,8,9,10,11])
12

In this case the constants are pushed on the stack, and the
interpreter evaluates BUILD_LIST(12), which in CPython calls
PyList_New(12). The same applies to using built-in range() in 2.x (not
3.x):

>>> allocated(range(12))
12

Example 1:

>>> allocated([i for i in xrange(12)])
16

This starts at 0 and grows as follows:

1 + 1//8 + 3 = 4
5 + 5//8 + 3 = 8
9 + 9//8 + 6 = 16


Example 2:

>>> allocated(list(xrange(12)))
19

This also applies to range() in 3.x. Some iterators have a
__length_hint__ method for guessing the initial size:

>>> iter(xrange(12)).__length_hint__()
12

The guess is immediately resized as follows:

12 + 12//8 + 6 = 19


Example 3:

>>> allocated(list(i for i in xrange(12)))
12

The initializer here is a generator, compiled from the generator
expression. A generator doesn't have a length hint. Instead, the list
uses a default guess of 8, which is over-allocated as follows:

8 + 8//8 + 3 = 12

If the generator continues to a 13th item, the list resizes to 13 +
13//8 + 6 = 20:

>>> allocated(list(i for i in xrange(13)))
20
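
The arithmetic in these examples can be reproduced with a short sketch. It
uses the growth formula exactly as written above, which describes the CPython
versions tested here (other versions may differ). The names grown and
allocated_after_appends are made up, and the real logic lives in C:

```python
def grown(newsize):
    """Over-allocated size for a list resized to hold newsize items."""
    return newsize + newsize // 8 + (3 if newsize < 9 else 6)


def allocated_after_appends(count):
    """Simulate appending count items one at a time to an empty list."""
    allocated = 0
    for size in range(1, count + 1):
        if size > allocated:
            # Out of room: resize using the growth formula.
            allocated = grown(size)
    return allocated


print(allocated_after_appends(12))   # Example 1: grows 4, 8, 16
print(grown(12))                     # Example 2: length hint of 12
print(grown(8))                      # Example 3: default guess of 8
print(grown(13))                     # Example 3 continued to a 13th item
```

The four results correspond to the allocated sizes shown in Examples 1 to 3.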


Re: [Tutor] a little loop

2013-05-29 Thread Jim Mooney
On 28 May 2013 22:33, Andreas Perstinger andiper...@gmail.com wrote:

 Wow, that means I can do this:   print  ''.join('But this parrot is dead!')


 But why do you want to do that?


Actually, I meant to do this:

print ''.join(' '.join('But this parrot is dead'.split()))

Which has the same effect.

There is an autodidactic purpose. I kept confusing the join statement by
putting the joined string first, then the joiner, but fooling around like
this solidifies things in my mind ;')
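
For anyone else who mixes up the order, a tiny sketch (Python 3 syntax; the
sample text is made up): the joiner comes first, and the thing being joined
is the argument.

```python
text = 'But   this parrot\tis dead'

words = text.split()             # split on any run of whitespace
normalised = ' '.join(words)     # joiner.join(pieces), not the other way round
print(normalised)

# Joining the characters of a string with '' gives the same string back.
print(''.join(normalised) == normalised)
```

The inner join rebuilds the sentence with single spaces, and the outer join
with the empty string leaves it unchanged.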

Jim
Ornhgvshy vf orggre guna htyl


Re: [Tutor] a little loop

2013-05-29 Thread boB Stepp
On Tue, May 28, 2013 at 9:34 PM, Steven D'Aprano st...@pearwood.info wrote:

 On 28/05/13 13:54, Tim Hanson wrote:



 However, a word of warning: although you *can* assemble a new string 
 character by character like that, you should not, because it risks being very 
 slow. *Painfully* slow. If you want to hear the details, please ask, but it 
 risks being slow for much the same reason as the infamous Shlemiel the 
 Painter Algorithm:

 http://www.joelonsoftware.com/articles/fog000319.html


Okay, I will come out of lurking and ask: What are the details?

BTW, interesting article and interesting site! I will have to bookmark
this one and come back for more exploration.

Thanks!
boB


Re: [Tutor] a little loop

2013-05-29 Thread Oscar Benjamin
On 29 May 2013 16:38, boB Stepp robertvst...@gmail.com wrote:
 On Tue, May 28, 2013 at 9:34 PM, Steven D'Aprano st...@pearwood.info wrote:

 However, a word of warning: although you *can* assemble a new string 
 character by character like that, you should not, because it risks being 
 very slow. *Painfully* slow. If you want to hear the details, please ask, 
 but it risks being slow for much the same reason as the infamous Shlemiel 
 the Painter Algorithm:

 http://www.joelonsoftware.com/articles/fog000319.html

 Okay, I will come out of lurking and ask: What are the details?

Here is the code that Steven was referring to:

ham = ''
for char in 'spam':
    ham = ham + char
    print(ham)

What this code does is to create an empty string ham. Then for each
character in 'spam' it creates a new string by appending the character
to ham. However, Python cannot append to strings; it can only create new
strings, since strings are immutable*.

On the first iteration of the loop the zero length string is combined
with the character and a string of length 1 is created.
On the second a new string of length 2 is created.
On the third a new string of length 3 is created.
On the fourth a new string of length 4 is created.

So five strings are created in total, with lengths 0, 1, 2, 3 and 4.

If we did this with N characters then N + 1 strings would be created with
lengths 0, 1, 2, 3, ... , N. If the cost of creating each string is
proportional to its length then the cost of the operation as a whole
is proportional to 0 + 1 + 2 + 3 + ... + N. The general formula to
compute this sum is (N**2 + N)/2. In other words the whole operation
is quadratic O(N**2) in the number of characters that are combined.

The Shlemiel the painter story describes a similar operation in which
the time taken to paint each section increases linearly as more
sections get painted. The result is that the time taken for Shlemiel
to paint a stretch of road is proportional to the square of its length
when it should just be proportional to its length.
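
The sum can be checked with a few lines of Python. This only counts
characters copied under the assumption stated above (each new string costs
its own length); it does not measure real time, and characters_copied is a
made-up name:

```python
def characters_copied(n):
    """Total characters copied when building an n-character string
    one character at a time, creating a new string on every step."""
    total = 0
    length = 0
    for _ in range(n):
        length += 1
        total += length      # the new string of this length is written in full
    return total


for n in (4, 100, 10000):
    print(n, characters_copied(n), (n**2 + n) // 2)
```

The two columns agree, and multiplying N by 100 multiplies the work by
roughly 10000.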

I don't know exactly how str.join is implemented but it does not use
this quadratic algorithm. For example, if str.join first computed the
length of the resulting string then it could allocate memory for exactly
one string of that length and copy each substring to the appropriate
place (actually I imagine it uses an exponentially resizing buffer but
this isn't important).

* CPython actually has an optimisation that can append strings in
precisely this situation. However it is an implementation detail of
CPython that may change and it does not work in other interpreters
e.g. Jython. Using this kind of code can damage portability since your
program may run fine in CPython but fail in other interpreters.


Oscar


Re: [Tutor] a little loop

2013-05-29 Thread boB Stepp
On Wed, May 29, 2013 at 11:07 AM, Oscar Benjamin
oscar.j.benja...@gmail.com wrote:
 On 29 May 2013 16:38, boB Stepp robertvst...@gmail.com wrote:
 On Tue, May 28, 2013 at 9:34 PM, Steven D'Aprano st...@pearwood.info wrote:

 However, a word of warning: although you *can* assemble a new string 
 character by character like that, you should not, because it risks being 
 very slow. *Painfully* slow. If you want to hear the details, please ask, 
 but it risks being slow for much the same reason as the infamous Shlemiel 
 the Painter Algorithm:

 http://www.joelonsoftware.com/articles/fog000319.html

 Okay, I will come out of lurking and ask: What are the details?

 Here is the code that Steven was referring to:

 ham = ''
 for char in 'spam':
     ham = ham + char
     print(ham)

 What this code does is to create an empty string ham. Then for each
 character in 'spam' is creates a new string by appending the character
 to ham. However Python cannot append strings it can only create new
 strings since strings are immutable*.

 On the first iteration of the loop the zero length string is combined
 with the character and a string of length 1 is created.
 On the second a new string of length 2 is created.
 On the third a new string of length 3 is created.
 On the fourth a new string of length 4 is created.

 So four strings are create with lengths 0, 1, 2, 3, 4.

 If we did this with N characters then N strings would be created with
 lengths 0, 1, 2, 3, ... , N. If cost of creating each string is
 proportional to its length then the cost of the operation as a whole
 is proportional to 0 + 1 + 2 + 3 + ... + N. The general formula to
 compute this sum is (N**2 + N)/2. In other words the whole operation
 is quadratic O(N**2) in the number of characters that are combined.

 The Shlemiel the painter story describes a similar operation in which
 the time taken to paint each section increases linearly as more
 sections get painted. The result is that the time taken for Shlemiel
 to paint a stretch of road is proportional to the square of its length
 when it should just be proportional to its length.


After reading the link that Steven gave, I understood this, but you
have filled in some details. I was wondering how...

 I don't know exactly how str.join is implemented but it does not use
 this quadratic algorithm. For example if str.join would first compute
 the length of the resulting string first then it can allocate memory
 for exactly one string of that length and copy each substring to the
 appropriate place (actually I imagine it uses an exponentially
 resizing buffer but this isn't important).


...str.join gets around these issues? In the linked article it was
discussing increasing memory allocation by powers of two instead of
trying to determine the exact length of the strings involved,
mentioning that the maximum wasted memory would be 50% of what was
actually needed. Is Python more clever in its implementation?

 * CPython actually has an optimisation that can append strings in
 precisely this situation. However it is an implementation detail of
 CPython that may change and it does not work in other interpreters
 e.g. Jython. Using this kind of code can damage portability since your
 program may run fine in CPython but fail in other interpreters.

You are speaking of appending and not concatenation here?

I had not even considered other Python interpreters than CPython. More
complexity to consider for the future...

boB


Re: [Tutor] a little loop

2013-05-28 Thread John Steedman
Some other tools, if you haven't come across them yet.

You already know about str.join ()

Slicing

>>> b=['s','p','a','m']
>>> b [ : 1 ]
['s']
>>> b [ : 2 ]
['s', 'p']

Also, consider

>>> len ( b)
4

>>> range ( 4 )
[0, 1, 2, 3]
# which I can iterate over.
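
Putting slicing, len and range together, one possible way to print the
growing prefixes looks like this (Python 3 syntax; just a sketch with made-up
variable names):

```python
word = 'spam'

# word[:end] is the first `end` characters; end runs from 1 to len(word).
prefixes = [word[:end] for end in range(1, len(word) + 1)]

for prefix in prefixes:
    print(prefix)
```

Each slice is built directly from the original string, so no running total
has to be kept.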

On Tue, May 28, 2013 at 4:54 AM, Tim Hanson tjhan...@yahoo.com wrote:

 Okay, so I made it to FOR loops in the Lutz book.  A couple of days ago I was
 helped here with the .join method for creating strings from lists or tuples of
 strings.  I got to wondering if I could just, for the sake of learning, do the
 same thing in a FOR loop, since that's today's chapter:

 x=0; ham=''; b=['s','p','a','m'] #or, b=('s','p','a','m')
 for t in b:
     ham=ham+b[x]
     print(ham);x+=1


 s
 sp
 spa
 spam

 Alright, it works, eventually.  Can someone help me find a little more elegant
 way of doing this?  I'm sure there are several.

 Incidentally, I put the print statement within the FOR loop so I could watch
 progress.


Re: [Tutor] a little loop

2013-05-28 Thread Alan Gauld

On 28/05/13 04:54, Tim Hanson wrote:


x=0; ham=''; b=['s','p','a','m'] #or, b=('s','p','a','m')
for t in b:
    ham=ham+b[x]
    print(ham);x+=1




Alright, it works, eventually.  Can someone help me find a little more elegant
way of doing this?  I'm sure there are several.


Python 'for' loops iterate over the target sequence so you usually don't 
need indexes. (And if you do you should use enumerate()). If it helps 
read the word 'for' as 'foreach'. So rewriting your for loop in a Pythonic 
style yields:

for letter in b:
    ham += letter
    print(ham)

or if you must for some reason use indexing try

for index, letter in enumerate(b):
    ham += b[index]
    print(ham)

Note however that string addition is very inefficient compared to 
join(), so in the real world you are much better off using join.


There are other ways of doing this but join() is best so
there is little point in reciting them.

Notice also that I used 'letter' as the loop variable. Meaningful names 
make code much easier to understand. The extra effort of typing is 
easily offset by the time saved reading the code later.


--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/



Re: [Tutor] a little loop

2013-05-28 Thread Steven D'Aprano

On 28/05/13 13:54, Tim Hanson wrote:

Okay, so I made it to FOR loops in the Lutz book.  A couple of days ago I was
helped here with the .join method for creating strings from lists or tuples of
strings.  I got to wondering if I could just, for the sake of learning, do the
same thing in a FOR loop, since that's today's chapter:

x=0; ham=''; b=['s','p','a','m'] #or, b=('s','p','a','m')
for t in b:
    ham=ham+b[x]
    print(ham);x+=1


There's no need to manually count the index that you are looking at, and no 
need to manually look up the character using b[x]. That's what the for-loop is 
already doing. (Also, a good habit to get into is using *meaningful* variable 
names, not one-letter cryptic names.)


ham = ''
for char in ['s', 'p', 'a', 'm']:
    ham = ham + char
    print(ham)


But wait, there's more! There's no need to split the string 'spam' up into 
characters yourself, since strings can be iterated over too:

ham = ''
for char in 'spam':
    ham = ham + char
    print(ham)


However, a word of warning: although you *can* assemble a new string character 
by character like that, you should not, because it risks being very slow. 
*Painfully* slow. If you want to hear the details, please ask, but it risks 
being slow for much the same reason as the infamous Shlemiel the Painter 
Algorithm:

http://www.joelonsoftware.com/articles/fog000319.html


So while it is okay to assemble strings using + if there are only a few pieces, 
you should avoid doing it whenever there is a chance of there being many 
pieces. The standard method for assembling a string from a collection of 
substrings is to do it in one go, using the join method, instead of piece by 
piece:

pieces = ['NOBODY', 'expects', 'the', 'Spanish', 'Inquisition!']
mystring = ' '.join(pieces)  # join with a single space between each piece
print(mystring)


The joiner can be any string you like, including the empty string '', it 
doesn't have to be a space.
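
A few variations, as a quick sketch (Python 3 syntax; the variable names are
arbitrary):

```python
pieces = ['NOBODY', 'expects', 'the', 'Spanish', 'Inquisition!']

spaced = ' '.join(pieces)        # single space between pieces
listed = ', '.join(pieces)       # comma and space
squashed = ''.join(pieces)       # empty joiner: pieces run together

print(spaced)
print(listed)
print(squashed)

# join only accepts strings, so convert other items first.
numbers = '-'.join(str(n) for n in [1, 2, 3])
print(numbers)
```

The joiner only ever appears between pieces, never at the start or the end.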



--
Steven


Re: [Tutor] a little loop

2013-05-28 Thread Jim Mooney
On 28 May 2013 19:34, Steven D'Aprano st...@pearwood.info wrote:

The standard method for assembling a string from a collection
 of substrings is to do it in one go, using the join method,

Wow, that means I can do this:   print  ''.join('But this parrot is dead!')

-- 
Jim
Ornhgvshy vf orggre guna htyl


Re: [Tutor] a little loop

2013-05-28 Thread Andreas Perstinger

On 29.05.2013 05:20, Jim Mooney wrote:

On 28 May 2013 19:34, Steven D'Aprano st...@pearwood.info wrote:

The standard method for assembling a string from a collection

of substrings is to do it in one go, using the join method,


Wow, that means I can do this:   print  ''.join('But this parrot is dead!')


But why do you want to do that?

join iterates over the string you gave as an argument and puts the 
empty string in between each character:

'B' + '' + 'u' + '' + 't' + '' + ...

Thus you end up with the same string as you started.

Or did you mean something like:

>>> print ''.join('But this parrot is dead!')
But this parrot is dead!

Bye, Andreas


[Tutor] a little loop

2013-05-27 Thread Tim Hanson
Okay, so I made it to FOR loops in the Lutz book.  A couple of days ago I was 
helped here with the .join method for creating strings from lists or tuples of 
strings.  I got to wondering if I could just, for the sake of learning, do the 
same thing in a FOR loop, since that's today's chapter:

x=0; ham=''; b=['s','p','a','m'] #or, b=('s','p','a','m')
for t in b:
    ham=ham+b[x]
    print(ham);x+=1


s
sp
spa
spam

Alright, it works, eventually.  Can someone help me find a little more elegant 
way of doing this?  I'm sure there are several.

Incidentally, I put the print statement within the FOR loop so I could watch 
progress.


Re: [Tutor] a little loop

2013-05-27 Thread kartik sundarajan
One way I can suggest is

x=0; ham=''; b=['s','p','a','m'] #or, b=('s','p','a','m')
for t in b:
    ham=ham+b[x]
    print(ham);x+=1


't' is actually equal to b[x], and it's faster than index-based look-up,
so you can rewrite

ham = ham + b[x]

as

ham += t

and remove the x increment.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor