Re: queue versus list
On 20/03/2020 21:57, Barry wrote: > > >> On 20 Mar 2020, at 00:42, duncan smith wrote: >> >> Bingo. Performance is indistinguishable from that of the list. Thread >> safety is unimportant (for my purposes), but the docs at >> https://docs.python.org/2/library/collections.html#collections.deque >> state "Deques support thread-safe, memory efficient appends and pops >> from either side of the deque with approximately the same O(1) > > I did a performance test of list vs sequel as a fifo that had 10k elements in > them. > Then I timed insert elements at the tail and popping from the head. > > Duque is 10x faster then list, rest run macOS python 3.8 > > Barry > > Yes, I avoided the cost of popping elements from the head by just iterating over the list. So the timings came out very similar. But the deque allows me to remove those elements efficiently. It's also neat because it lets me switch from what is basically a breadth first search of a graph to a depth first search by simply changing popleft() to popright(). Cheers. Duncan -- https://mail.python.org/mailman/listinfo/python-list
Re: queue versus list
On 20Mar2020 06:50, Dan Stromberg wrote: Actually, I think a deque is a doubly linked list of python lists. On 20Mar2020 09:45, Antoon Pardon wrote: This doesn't seem correct. A deque is used to simulate a stack or a queue. It doesn't use heappush or heappop. You are both correct; brain fade on my part. I do not know why I conflated a deque with a heap. Apologies for the confusion. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: queue versus list
> On 20 Mar 2020, at 00:42, duncan smith wrote: > > Bingo. Performance is indistinguishable from that of the list. Thread > safety is unimportant (for my purposes), but the docs at > https://docs.python.org/2/library/collections.html#collections.deque > state "Deques support thread-safe, memory efficient appends and pops > from either side of the deque with approximately the same O(1) I did a performance test of list vs sequel as a fifo that had 10k elements in them. Then I timed insert elements at the tail and popping from the head. Duque is 10x faster then list, rest run macOS python 3.8 Barry -- https://mail.python.org/mailman/listinfo/python-list
Re: queue versus list
On Fri, 20 Mar 2020 06:50:29 -0700 Dan Stromberg wrote: > On Thu, Mar 19, 2020 at 6:11 PM Cameron Simpson wrote: > > > >> 4. If it doesn't need to be thread-safe, why not try deques instead? > > > > > >Bingo. Performance is indistinguishable from that of the list. > > > > A deque is implement using a list. > > > > Actually, I think a deque is a doubly linked list of python lists. According to the source¹: Data for deque objects is stored in a doubly-linked list of fixed length blocks. [...] Textbook implementations of doubly-linked lists store one datum per link, but that gives them a 200% memory overhead (a prev and next link for each datum) and it costs one malloc() call per data element. By using fixed-length blocks, the link to data ratio is significantly improved and there are proportionally fewer calls to malloc() and free(). The data blocks of consecutive pointers also improve cache locality. ¹ https://github.com/python/cpython/blob/3.8/Modules/_collectionsmodule.c#L19-L78 Dan -- “Atoms are not things.” – Werner Heisenberg Dan Sommers, http://www.tombstonezero.net/dan -- https://mail.python.org/mailman/listinfo/python-list
Re: queue versus list
On Thu, Mar 19, 2020 at 6:11 PM Cameron Simpson wrote: > >> 4. If it doesn't need to be thread-safe, why not try deques instead? > > > >Bingo. Performance is indistinguishable from that of the list. > > A deque is implement using a list. > Actually, I think a deque is a doubly linked list of python lists. -- https://mail.python.org/mailman/listinfo/python-list
Re: queue versus list
Op 20/03/20 om 02:10 schreef Cameron Simpson: On 20Mar2020 00:37, duncan smith wrote: Thread safety is unimportant (for my purposes), but the docs at https://docs.python.org/2/library/collections.html#collections.deque state "Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction". Looking at the output from the profiler for my implementation with the queue it looks like a deque is being used but there's various other stuff happening relating to thread safety. Is the lesson from this that if all you're doing is appending and popping, then use a deque (rather than a queue) even if you want thread safety? (It makes quite a difference in performance for my use case.) Cheers. A deque will change the ordering of the elements you process. Your list loop behaves in FIFO order, as does the queue. A dequeue (you you're doing a heappush and a heappop) will issue you the lowest current element on each iteration. If you merely need to process all the elements you're good. If the order matters you need to consider that. A deque, being thread safe, will be using a lock. There will be a (quite small) overhead for that. This doesn't seem correct. A deque is used to simulate a stack or a queue. It doesn't use heappush or heappop. Antoon. -- https://mail.python.org/mailman/listinfo/python-list
Re: queue versus list
On 20Mar2020 00:37, duncan smith wrote: On 19/03/2020 20:40, MRAB wrote: On 2020-03-19 20:08, duncan smith wrote: I have generator code along the following lines, Q = queue.Queue() Q.put(x) while not Q.empty(): x = Q.get() if : yield x else: Q.put() If I change it to, Q = [] Q.append(x) for x in Q: if : yield x else: Q.append() then it runs approximately 3 times more quickly. A list is a very simple data structure, and therefore fast (for things that are inherently fast, like append; insert is expensive). A queue takes a lock (cheap but an overhead) and might be expensive for get or put (this dependings on the data structure storing the queue items). A circular list (essentially a list using a base offset) can be fast until it needs resizing, a doubly linked list of nodes is always O(1) but is more expensive to add/remove (allocate node, adjust pointers); makye 3 times as slow :-) I haven't looked to see how Queue stores its items. I seem to remember (from somewhere) that the language specification doesn't guarantee that the latter will continue to work, but that it's unlikely that future implementations will break it. [...snip...] [MRAB] 1. I'm not sure it's a good idea to mutate a list while iterating over it. AFAICR that's the bit that depends on the implementation of lists. As long as iteration involves incrementing an index and indexing into the list it will work. If you're the only person using the list, as you are only _appending_ to it then the for loop is probably safe. If you were inserting elements into the list there would be trouble (an iterator basicly needs keep an index as you suggest below, and an insert at or before the current index invalidates that, causing the iterator to issue the same element again (for example). Apart from that, is there any good reason why I shouldn't use the list? Is there another approach that might avoid the overhead of using a queue? Results from profiling the two implementations below in case it makes anything clearer. TIA. Using a list seems fine for me if your code is as simple in reality as listed above. 2. The example with the list retains all of the results, whereas the one with the queue consumes them. 3. Queues are thread-safe because they're used for communication between threads, but lists aren't thread-safe; that might be something to do with the difference. 4. If it doesn't need to be thread-safe, why not try deques instead? Bingo. Performance is indistinguishable from that of the list. A deque is implement using a list. Thread safety is unimportant (for my purposes), but the docs at https://docs.python.org/2/library/collections.html#collections.deque state "Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction". Looking at the output from the profiler for my implementation with the queue it looks like a deque is being used but there's various other stuff happening relating to thread safety. Is the lesson from this that if all you're doing is appending and popping, then use a deque (rather than a queue) even if you want thread safety? (It makes quite a difference in performance for my use case.) Cheers. A deque will change the ordering of the elements you process. Your list loop behaves in FIFO order, as does the queue. A dequeue (you you're doing a heappush and a heappop) will issue you the lowest current element on each iteration. If you merely need to process all the elements you're good. If the order matters you need to consider that. A deque, being thread safe, will be using a lock. There will be a (quite small) overhead for that. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: queue versus list
On 19/03/2020 20:40, MRAB wrote: > On 2020-03-19 20:08, duncan smith wrote: >> Hello, >> I have generator code along the following lines, >> >> >> Q = queue.Queue() >> Q.put(x) >> while not Q.empty(): >> x = Q.get() >> if : >> yield x >> else: >> Q.put() >> >> >> If I change it to, >> >> >> Q = [] >> Q.append(x) >> for x in Q: >> if : >> yield x >> else: >> Q.append() >> >> >> then it runs approximately 3 times more quickly. I seem to remember >> (from somewhere) that the language specification doesn't guarantee that >> the latter will continue to work, but that it's unlikely that future >> implementations will break it. Apart from that, is there any good reason >> why I shouldn't use the list? Is there another approach that might avoid >> the overhead of using a queue? Results from profiling the two >> implementations below in case it makes anything clearer. TIA. >> > [snip] Thanks MRAB and Paul, > A number of points: > > 1. I'm not sure it's a good idea to mutate a list while iterating over it. > AFAICR that's the bit that depends on the implementation of lists. As long as iteration involves incrementing an index and indexing into the list it will work. > 2. The example with the list retains all of the results, whereas the one > with the queue consumes them. > > 3. Queues are thread-safe because they're used for communication between > threads, but lists aren't thread-safe; that might be something to do > with the difference. > > 4. If it doesn't need to be thread-safe, why not try deques instead? Bingo. Performance is indistinguishable from that of the list. Thread safety is unimportant (for my purposes), but the docs at https://docs.python.org/2/library/collections.html#collections.deque state "Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction". Looking at the output from the profiler for my implementation with the queue it looks like a deque is being used but there's various other stuff happening relating to thread safety. Is the lesson from this that if all you're doing is appending and popping, then use a deque (rather than a queue) even if you want thread safety? (It makes quite a difference in performance for my use case.) Cheers. Duncan -- https://mail.python.org/mailman/listinfo/python-list
Re: queue versus list
On 2020-03-19 20:08, duncan smith wrote: Hello, I have generator code along the following lines, Q = queue.Queue() Q.put(x) while not Q.empty(): x = Q.get() if : yield x else: Q.put() If I change it to, Q = [] Q.append(x) for x in Q: if : yield x else: Q.append() then it runs approximately 3 times more quickly. I seem to remember (from somewhere) that the language specification doesn't guarantee that the latter will continue to work, but that it's unlikely that future implementations will break it. Apart from that, is there any good reason why I shouldn't use the list? Is there another approach that might avoid the overhead of using a queue? Results from profiling the two implementations below in case it makes anything clearer. TIA. [snip] A number of points: 1. I'm not sure it's a good idea to mutate a list while iterating over it. 2. The example with the list retains all of the results, whereas the one with the queue consumes them. 3. Queues are thread-safe because they're used for communication between threads, but lists aren't thread-safe; that might be something to do with the difference. 4. If it doesn't need to be thread-safe, why not try deques instead? -- https://mail.python.org/mailman/listinfo/python-list