Re: [Tutor] Question about O(N**2)

2014-05-04 Thread Devin Jeanpierre
On Sat, May 3, 2014 at 9:13 PM, Steven D'Aprano st...@pearwood.info wrote:

 On Sat, May 03, 2014 at 03:59:40PM -0700, Danny Yoo wrote:
  Following up on this.  Let's make sure that we're talking about the same 
  thing.
 
 
  The assertion is that the following:
 
  fullPath += [...]
 
  where fullPath is a list of strings, exhibits O(n^2) time.  I don't
  think this is true.  Semantically, the statement above should be
  equivalent to:
 
 fullPath.append(...)
 
  and so I agree with David: this is not O(n^2), but rather O(n).

 Danny is correct. For built-in lists, += augmented assignment is
 implemented using the extend method, not the + operator. Consequently,
 repeatedly calling += modifies the list in place, growing it as needed,
 which is amortized O(N) rather than O(N**2).

Nit: starting from an empty list, it's worst-case O(N), no amortized
qualifier necessary.

-- Devin
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Question about O(N**2)

2014-05-03 Thread Danny Yoo
Following up on this.  Let's make sure that we're talking about the same thing.


The assertion is that the following:

fullPath += [...]

where fullPath is a list of strings, exhibits O(n^2) time.  I don't
think this is true.  Semantically, the statement above should be
equivalent to:

   fullPath.append(...)

and so I agree with David: this is not O(n^2), but rather O(n).
Python's standard list implementation in C uses a strategy of
geometric array resizing whenever the capacity has been exceeded.  It
expands the capacity by a multiplicative factor of its current
capacity.  Amortized analysis, a subject in computer science, lets us
discover that the cost spread across many operations will be O(n).

References:

http://en.wikipedia.org/wiki/Dynamic_array


---

What can be O(n^2) is the related, but not identical scenario where:

   fullPath += ...

where fullPath is an immutable string.  But this scenario is different
than the original one.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Question about O(N**2)

2014-05-03 Thread Steven D'Aprano
On Sat, May 03, 2014 at 03:59:40PM -0700, Danny Yoo wrote:
 Following up on this.  Let's make sure that we're talking about the same 
 thing.
 
 
 The assertion is that the following:
 
 fullPath += [...]
 
 where fullPath is a list of strings, exhibits O(n^2) time.  I don't
 think this is true.  Semantically, the statement above should be
 equivalent to:
 
fullPath.append(...)
 
 and so I agree with David: this is not O(n^2), but rather O(n).

Danny is correct. For built-in lists, += augmented assignment is 
implemented using the extend method, not the + operator. Consequently, 
repeatedly calling += modifies the list in place, growing it as needed, 
which is amortized O(N) rather than O(N**2).

Had += been implemented as + instead, as I initially mis-remembered, the 
behaviour would have been O(N**2). The process would have gone:

fullPath = []
fullPath = [] + [a]  # add a new path component
fullPath = [a] + [b]
fullPath = [a, b] + [c]
fullPath = [a, b, c] + [d]
fullPath = [a, b, c, d] + [e]
...

Each individual + is an O(N+M) operation (if there are ten items in the 
first list, and five items in the second list, adding them needs to copy 
fifteen items). With M a constant 1, we can take each addition as O(N). 
And there are N additions, so we get O(N*N) or O(N**2).

We can test this by seeing how long it takes to perform a bunch of 
additions. For timing, I'm going to use the recipe shown here:

http://code.activestate.com/recipes/577896-benchmark-code-with-the-with-statement/

First, using augmented assignment for lists:

py size = 100
py with Timer():
... data = []
... for i in range(size):
... data += [i]
...
time taken: 0.548546 seconds
py with Timer():
... data = []
... for i in range(3*size):
... data += [i]
...
time taken: 1.672449 seconds


By increasing the amount of data by a factor of three, the time taken 
also increases by about a factor of three. That is O(N).

Now let's try list addition. It's so much slower than += that I had to 
reduce the size by a lot just to have it complete in any reasonable 
amount of time:

py size = 1
py with Timer():
... data = []
... for i in range(size):
... data = data + [i]
...
time taken: 0.331893 seconds
py with Timer():
... data = []
... for i in range(3*size):
... data = data + [i]
...
time taken: 3.203508 seconds


Here, increasing N by a factor of 3 takes about 10 times longer, which 
is consistent with O(N**2).



-- 
Steven
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor