Re: High-performance Priority Queue / Heap for glib
On Tue, 2009-03-10 at 20:53 +0100, Maik Zumstrull wrote: Maik Zumstrull wrote: Behdad Esfahbod wrote: I agree that having a priority queue in glib would be useful. However, the API should use an opaque structure and be agnostic of the actual implementation. I'm going to rethink the API and package the thing as a patch against 2.19.10 sources. Maybe the discussion will pick up a little when there's actual code on the table. Thanks for the feedback so far. Okay, here it is. I haven't extensively tested this for correctness or performance yet, but it basically works. The API is documented and should allow different implementations, I think. It's similar to what I suggested before, but I dropped merge(). I'm not sure every kind of underlying implementation could provide that efficiently. Also, priority queues and single entries of priority queues are syntactically different types now, but in reality the same thing with my implementation. Hi, Forgive my ignorance if this is a fundamentally stupid question.. but would it still be a priority queue if rather than assigning an integer priority when adding an element, the queue was set up with a callback function - such that the user's implementation can compare the priority of two items in the queue? I'm looking at implementing a computational geometry algorithm (based on Bentley-Ottmann) (and I've not got much experience in that field). The priority function required is the x-coordinate of some points, but with a tie-break on the y-coordinate in the case where the x-coordinates match. Without constraining the bounds of my coordinates, or performing some separate coordinate - priority mapping step (hard, since new coords are generated as the algorithm progresses), this doesn't seem to be possible with your implementation. Any thoughts? Best regards, Peter Clifton ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: High-performance Priority Queue / Heap for glib
On Thu, Mar 12, 2009 at 01:56:05PM +0100, Maik Zumstrull wrote: Worked on fifth-or-so attempt, must have been a temporary server glitch. You likely have a (transparent) proxy with multiple IP addresses. Don't limit the login to your IP address when logging in. -- Regards, Olav ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: High-performance Priority Queue / Heap for glib
Behdad Esfahbod wrote: I suggest you open a bug on http://bugzilla.gnome.org/ and attach your latest patch there. I created the bug (http://bugzilla.gnome.org/show_bug.cgi?id=575074), but I am unable to attach the patch: | Internal Error | | Bugzilla has suffered an internal error. Please save this page and | send it to bugmas...@gnome.org with details of what you were doing at | the time this message appeared. | | URL: http://bugzilla.gnome.org/attachment.cgi | | undef error - Undefined subroutine Fh::slice at | data/template/template/en/default/global/hidden-fields.html.tmpl | line 58 ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: High-performance Priority Queue / Heap for glib
Maik Zumstrull wrote: Behdad Esfahbod wrote: I suggest you open a bug on http://bugzilla.gnome.org/ and attach your latest patch there. I created the bug (http://bugzilla.gnome.org/show_bug.cgi?id=575074), but I am unable to attach the patch: Worked on fifth-or-so attempt, must have been a temporary server glitch. ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: High-performance Priority Queue / Heap for glib
Hi Maik, I suggest you open a bug on http://bugzilla.gnome.org/ and attach your latest patch there. I like to comment on the patch, but don't want to take more of your time until the glib maintainers show interest in merging this. Cheers, behdad On 03/10/2009 03:53 PM, Maik Zumstrull wrote: Maik Zumstrull wrote: Behdad Esfahbod wrote: I agree that having a priority queue in glib would be useful. However, the API should use an opaque structure and be agnostic of the actual implementation. I'm going to rethink the API and package the thing as a patch against 2.19.10 sources. Maybe the discussion will pick up a little when there's actual code on the table. Thanks for the feedback so far. Okay, here it is. I haven't extensively tested this for correctness or performance yet, but it basically works. The API is documented and should allow different implementations, I think. It's similar to what I suggested before, but I dropped merge(). I'm not sure every kind of underlying implementation could provide that efficiently. Also, priority queues and single entries of priority queues are syntactically different types now, but in reality the same thing with my implementation. % diffstat glib-2.19.10-priorityqueue.diff docs/reference/glib/glib-docs.sgml|2 docs/reference/glib/glib-sections.txt | 17 docs/reference/glib/tmpl/priority_queues.sgml | 156 + glib/Makefile.am |1 glib/glib.h |1 glib/glib.symbols | 14 glib/gpqueue.c| 445 ++ glib/gpqueue.h| 63 +++ tests/Makefile.am |2 tests/priorityqueue-test.c| 170 + 10 files changed, 871 insertions(+) It applies to glib-2.19.10, and building with ./autogen.sh --enable-gtk-doc make make check seems to work fine. I hope I've integrated it into the build system correctly in all places. Please review. Thanks, Maik Zumstrull ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: High-performance Priority Queue / Heap for glib
Maik Zumstrull wrote: Behdad Esfahbod wrote: I agree that having a priority queue in glib would be useful. However, the API should use an opaque structure and be agnostic of the actual implementation. I'm going to rethink the API and package the thing as a patch against 2.19.10 sources. Maybe the discussion will pick up a little when there's actual code on the table. Thanks for the feedback so far. Okay, here it is. I haven't extensively tested this for correctness or performance yet, but it basically works. The API is documented and should allow different implementations, I think. It's similar to what I suggested before, but I dropped merge(). I'm not sure every kind of underlying implementation could provide that efficiently. Also, priority queues and single entries of priority queues are syntactically different types now, but in reality the same thing with my implementation. % diffstat glib-2.19.10-priorityqueue.diff docs/reference/glib/glib-docs.sgml|2 docs/reference/glib/glib-sections.txt | 17 docs/reference/glib/tmpl/priority_queues.sgml | 156 + glib/Makefile.am |1 glib/glib.h |1 glib/glib.symbols | 14 glib/gpqueue.c| 445 ++ glib/gpqueue.h| 63 +++ tests/Makefile.am |2 tests/priorityqueue-test.c| 170 + 10 files changed, 871 insertions(+) It applies to glib-2.19.10, and building with ./autogen.sh --enable-gtk-doc make make check seems to work fine. I hope I've integrated it into the build system correctly in all places. Please review. Thanks, Maik Zumstrulldiff -X diffignore -ruN glib-2.19.10/docs/reference/glib/glib-docs.sgml glib-2.19.10.local/docs/reference/glib/glib-docs.sgml --- glib-2.19.10/docs/reference/glib/glib-docs.sgml 2009-03-02 07:02:41.0 +0100 +++ glib-2.19.10.local/docs/reference/glib/glib-docs.sgml 2009-03-09 18:16:35.804036418 +0100 @@ -36,6 +36,7 @@ !ENTITY glib-Doubly-Linked-Lists SYSTEM xml/linked_lists_double.xml !ENTITY glib-Singly-Linked-Lists SYSTEM xml/linked_lists_single.xml !ENTITY glib-Double-ended-Queues SYSTEM xml/queue.xml +!ENTITY glib-Priority-Queues SYSTEM xml/priority_queues.xml !ENTITY glib-Sequences SYSTEM xml/sequence.xml !ENTITY glib-Trash-Stacks SYSTEM xml/trash_stack.xml !ENTITY glib-Hash-Tables SYSTEM xml/hash_tables.xml @@ -180,6 +181,7 @@ glib-Doubly-Linked-Lists; glib-Singly-Linked-Lists; glib-Double-ended-Queues; +glib-Priority-Queues; glib-Sequences; glib-Trash-Stacks; glib-Hash-Tables; diff -X diffignore -ruN glib-2.19.10/docs/reference/glib/glib-sections.txt glib-2.19.10.local/docs/reference/glib/glib-sections.txt --- glib-2.19.10/docs/reference/glib/glib-sections.txt 2009-03-02 07:19:08.0 +0100 +++ glib-2.19.10.local/docs/reference/glib/glib-sections.txt 2009-03-09 18:11:23.308036369 +0100 @@ -1939,6 +1939,23 @@ /SECTION SECTION +TITLEPriority Queues/TITLE +FILEpriority_queues/FILE + +GPQueue +GPQueueHandle +g_pqueue_insert +g_pqueue_top +g_pqueue_top_extended +g_pqueue_delete_top +g_pqueue_pop +g_pqueue_pop_extended +g_pqueue_delete +g_pqueue_change_priority +g_pqueue_destroy +/SECTION + +SECTION TITLESequences/TITLE FILEsequence/FILE diff -X diffignore -ruN glib-2.19.10/docs/reference/glib/tmpl/priority_queues.sgml glib-2.19.10.local/docs/reference/glib/tmpl/priority_queues.sgml --- glib-2.19.10/docs/reference/glib/tmpl/priority_queues.sgml 1970-01-01 01:00:00.0 +0100 +++ glib-2.19.10.local/docs/reference/glib/tmpl/priority_queues.sgml 2009-03-10 15:53:51.431915573 +0100 @@ -0,0 +1,156 @@ +!-- # SECTION Title # -- +Priority Queues + +!-- # SECTION Short_Description # -- +a collection of data entries with associated priority values that returns +entries one by one in order of priority + +!-- # SECTION Long_Description # -- +para +The #GPQueue structure and its associated functions provide a sorted collection +of data/priority pairs. Entries can be inserted in any order and at any time, +and an entry's priority can be changed after it has been inserted into the +queue. Any entry can be deleted at any time if you still have its +#GPQueueHandle, but entries are supposed to be removed one at a time in order +of priority with g_pqueue_pop(). +/para +para +The entries emphasiscannot/emphasis be iterated over in any way other than +removing them one by one in order of priority, but when doing that, this +structure is far more efficient than sorted lists or balanced trees, which +on the other hand do not suffer from this restriction. +/para +para +Most of the functions for #GPQueue work like #GList functions, that is, +you assign the result to your queue variable, and the +exception to the rule are g_pqueue_pop() and g_pqueue_pop_extended(),
Re: High-performance Priority Queue / Heap for glib
Maik Zumstrull wrote: Needing a fast priority queue for an internal project recently and noticing that glib doesn't have one, I wrote it myself. I'd be interested in submitting the code, and I'd be willing to do the necessary cleanup and documentation work on it. I agree that having a priority queue in glib would be useful. However, the API should use an opaque structure and be agnostic of the actual implementation. behdad It's a pretty straight-forward implementation of a Fibonacci Heap http://en.wikipedia.org/wiki/Fibonacci_heap, with a forest of trees of circular double-linked lists as the underlying data structure (which is not as nasty as that sounds). According to my quick mailing list search, a Heap-based priority queue was submitted in 2003, but not accepted. Let me address the comments in http://mail.gnome.org/archives/gtk-devel-list/2003-March/msg6.html. | - Does it offer significant (big-o) performance | benefits for a common operation, or allow some | operation that would be really hard to code without | the data structure? All operations can be done with existing glib data structures, but there is a significant performance benefit. Please refer to http://en.wikipedia.org/wiki/Fibonacci_heap#Summary_of_running_times. Glib includes linked lists and balanced trees, of course; a simple heap was submitted in 2003. As you can see, all three structures are easily beaten by this approach, dropping performance to O(1) for most operations and retaining O(log n) performance in the other cases. | - Would it be useful to a significant fraction of | applications? I don't think I can estimate the need. Personally, I was working with graph algorithms related to partitioning and path finding, where a good priority queue has a significant impact. In other areas where a priority queue is needed, presumably people are simply using glib's trees or lists now, accepting the performance penalty, which wouldn't matter in most applications. My implementation, ~300 lines of code, is fairly self-contained. It introduces one new data structure (to be renamed for inclusion): struct s_FQueue { gpointer data; gint priority; gint degree; gboolean marked; struct s_FQueue *parent; struct s_FQueue *prev; struct s_FQueue *next; struct s_FQueue *firstchild; }; typedef struct s_FQueue FQueue; The glib slice allocator is used to create and drop these elements. There are no significant dependencies on glib functionality otherwise. Memory management follows the semantics used by glib's linked lists: a simple NULL pointer represents an empty queue, and FQueue elements are created and deleted implicitly as items are added to and removed from the queue. The implementation doesn't at all care about the data pointer, so filling it with GINT_TO_POINTER wrapped gints instead of actual pointers is okay. The element's priorities have to be expressed as gints, the use of a comparison function is not supported as an alternative. Elements are returned by the queue in ascending order of the priority value. No guarantees are made regarding the iteration order for elements with the same priority. The priority of an entry can be changed after insertion by a function call, but the priority field must not be directly written. The user is expected to keep one FQueue *q variable for the queue, then pass its address to most of the API calls. The pointer is rewritten by each call as necessary to always point to the FQueue element currently at the front of the queue (or NULL for an empty queue). The GList style syntax q = some_api_call(q, ...); for rewriting the pointer is not used because of the special needs of the insert call (see below). This could be inverted to use GList-style syntax everywhere, and an FQueue** in the insert call only. This is the proposed / currently implemented API: #define fqueue_isempty(fq) (gboolean)(fq == NULL) Obvious. FQueue* fqueue_insert (FQueue **fq, gpointer data, gint priority); Inserts data into the list with a priority value of priority. An FQueue element is implicitly allocated. The fq pointer is rewritten to point to the front of the queue. The *returned* pointer points to the newly allocated element. Do NOT put that into your queue variable, but keep the pointer around somewhere if you intend to make change_priority or delete calls on this element. The returned pointer can be ignored if you don't plan to change any priorities and don't plan to remove elements other than by pop()-ing them. gpointer fqueue_peek (FQueue *fq); Returns the value of the data pointer of the element with the lowest priority value, without removing it from the queue. NULL is returned for an empty queue, so if you need to differentiate between an empty queue and an element with a NULL data pointer, check for that first. gpointer
Re: High-performance Priority Queue / Heap for glib
Behdad Esfahbod wrote: Maik Zumstrull wrote: Needing a fast priority queue for an internal project recently and noticing that glib doesn't have one, I wrote it myself. I'd be interested in submitting the code, and I'd be willing to do the necessary cleanup and documentation work on it. I agree that having a priority queue in glib would be useful. However, the API should use an opaque structure and be agnostic of the actual implementation. I'm going to rethink the API and package the thing as a patch against 2.19.10 sources. Maybe the discussion will pick up a little when there's actual code on the table. Thanks for the feedback so far. ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
Re: High-performance Priority Queue / Heap for glib
Maik Zumstrull maik.zumstr...@rz.uni-karlsruhe.de writes: [about a Fibonacci heap implementation] | - Does it offer significant (big-o) performance | benefits for a common operation, or allow some | operation that would be really hard to code without | the data structure? All operations can be done with existing glib data structures, but there is a significant performance benefit. Please refer to http://en.wikipedia.org/wiki/Fibonacci_heap#Summary_of_running_times. Glib includes linked lists and balanced trees, of course; a simple heap was submitted in 2003. As you can see, all three structures are easily beaten by this approach, dropping performance to O(1) for most operations and retaining O(log n) performance in the other cases. The big-O of a Fibonacci heap is better than that of a balanced tree or a binary heap, but how is the actual performance? When you benchmark your Fibonacci heap against balanced trees or binary heaps for realistic test cases, is there an actual improvement in runtime? The reason that I ask is this paragraph in the chapter on Fibonacci heaps in Cormen, Leiserson, and Rivest, _Introduction to Algorithms_: From a practical point of view, however, the constant factors and programming complexity of Fibonacci heaps make them less desirable than ordinary binary (or k-ary) heaps for most applications. Thus, Fibonacci heaps are predominantly of theoretical interest. I observed this effect in practice myself a few years ago, when I compared the performance of otherwise identical code using a binary heap and a Fibonacci heap (see page 51 of http://benpfaff.org/uniformity/uniformity-2001.04.27.pdf). But I didn't make much of an attempt to optimize my heaps, and maybe you did a better implementation. -- Ben Pfaff http://benpfaff.org ___ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list
High-performance Priority Queue / Heap for glib
Needing a fast priority queue for an internal project recently and noticing that glib doesn't have one, I wrote it myself. I'd be interested in submitting the code, and I'd be willing to do the necessary cleanup and documentation work on it. It's a pretty straight-forward implementation of a Fibonacci Heap http://en.wikipedia.org/wiki/Fibonacci_heap, with a forest of trees of circular double-linked lists as the underlying data structure (which is not as nasty as that sounds). According to my quick mailing list search, a Heap-based priority queue was submitted in 2003, but not accepted. Let me address the comments in http://mail.gnome.org/archives/gtk-devel-list/2003-March/msg6.html. | - Does it offer significant (big-o) performance | benefits for a common operation, or allow some | operation that would be really hard to code without | the data structure? All operations can be done with existing glib data structures, but there is a significant performance benefit. Please refer to http://en.wikipedia.org/wiki/Fibonacci_heap#Summary_of_running_times. Glib includes linked lists and balanced trees, of course; a simple heap was submitted in 2003. As you can see, all three structures are easily beaten by this approach, dropping performance to O(1) for most operations and retaining O(log n) performance in the other cases. | - Would it be useful to a significant fraction of | applications? I don't think I can estimate the need. Personally, I was working with graph algorithms related to partitioning and path finding, where a good priority queue has a significant impact. In other areas where a priority queue is needed, presumably people are simply using glib's trees or lists now, accepting the performance penalty, which wouldn't matter in most applications. My implementation, ~300 lines of code, is fairly self-contained. It introduces one new data structure (to be renamed for inclusion): struct s_FQueue { gpointer data; gint priority; gint degree; gboolean marked; struct s_FQueue *parent; struct s_FQueue *prev; struct s_FQueue *next; struct s_FQueue *firstchild; }; typedef struct s_FQueue FQueue; The glib slice allocator is used to create and drop these elements. There are no significant dependencies on glib functionality otherwise. Memory management follows the semantics used by glib's linked lists: a simple NULL pointer represents an empty queue, and FQueue elements are created and deleted implicitly as items are added to and removed from the queue. The implementation doesn't at all care about the data pointer, so filling it with GINT_TO_POINTER wrapped gints instead of actual pointers is okay. The element's priorities have to be expressed as gints, the use of a comparison function is not supported as an alternative. Elements are returned by the queue in ascending order of the priority value. No guarantees are made regarding the iteration order for elements with the same priority. The priority of an entry can be changed after insertion by a function call, but the priority field must not be directly written. The user is expected to keep one FQueue *q variable for the queue, then pass its address to most of the API calls. The pointer is rewritten by each call as necessary to always point to the FQueue element currently at the front of the queue (or NULL for an empty queue). The GList style syntax q = some_api_call(q, ...); for rewriting the pointer is not used because of the special needs of the insert call (see below). This could be inverted to use GList-style syntax everywhere, and an FQueue** in the insert call only. This is the proposed / currently implemented API: #define fqueue_isempty(fq) (gboolean)(fq == NULL) Obvious. FQueue* fqueue_insert (FQueue **fq, gpointer data, gint priority); Inserts data into the list with a priority value of priority. An FQueue element is implicitly allocated. The fq pointer is rewritten to point to the front of the queue. The *returned* pointer points to the newly allocated element. Do NOT put that into your queue variable, but keep the pointer around somewhere if you intend to make change_priority or delete calls on this element. The returned pointer can be ignored if you don't plan to change any priorities and don't plan to remove elements other than by pop()-ing them. gpointer fqueue_peek (FQueue *fq); Returns the value of the data pointer of the element with the lowest priority value, without removing it from the queue. NULL is returned for an empty queue, so if you need to differentiate between an empty queue and an element with a NULL data pointer, check for that first. gpointer fqueue_pop (FQueue **fq); Returns the value of the data pointer of the element with the lowest priority value, removing it from the queue. The FQueue element is implicitly deallocated. Do not attempt to make change_priority or delete calls on that element after this, it's gone! NULL is returned