Re: High-performance Priority Queue / Heap for glib

2009-10-23 Thread Peter Clifton
On Tue, 2009-03-10 at 20:53 +0100, Maik Zumstrull wrote:
 Maik Zumstrull wrote:
 
  Behdad Esfahbod wrote:
 
   I agree that having a priority queue in glib would be useful.
   However, the API should use an opaque structure and be agnostic of
   the actual implementation.
  
  I'm going to rethink the API and package the thing as a patch against
  2.19.10 sources. Maybe the discussion will pick up a little when
  there's actual code on the table. Thanks for the feedback so far.
 
 Okay, here it is. I haven't extensively tested this for correctness or
 performance yet, but it basically works. The API is documented and
 should allow different implementations, I think. It's similar to what I
 suggested before, but I dropped merge(). I'm not sure every kind of
 underlying implementation could provide that efficiently. Also,
 priority queues and single entries of priority queues are syntactically
 different types now, but in reality the same thing with my
 implementation.

Hi,

Forgive my ignorance if this is a fundamentally stupid question.. but
would it still be a priority queue if rather than assigning an integer
priority when adding an element, the queue was set up with a callback
function - such that the user's implementation can compare the priority
of two items in the queue?

I'm looking at implementing a computational geometry algorithm (based on
Bentley-Ottmann) (and I've not got much experience in that field). The
priority function required is the x-coordinate of some points, but with
a tie-break on the y-coordinate in the case where the x-coordinates
match.

Without constraining the bounds of my coordinates, or performing some
separate coordinate - priority mapping step (hard, since new coords are
generated as the algorithm progresses), this doesn't seem to be possible
with your implementation.

Any thoughts?

Best regards,

Peter Clifton


___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: High-performance Priority Queue / Heap for glib

2009-03-16 Thread Olav Vitters
On Thu, Mar 12, 2009 at 01:56:05PM +0100, Maik Zumstrull wrote:
 Worked on fifth-or-so attempt, must have been a temporary server glitch.

You likely have a (transparent) proxy with multiple IP addresses. Don't
limit the login to your IP address when logging in.

-- 
Regards,
Olav
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: High-performance Priority Queue / Heap for glib

2009-03-12 Thread Maik Zumstrull
Behdad Esfahbod wrote:

 I suggest you open a bug on http://bugzilla.gnome.org/ and attach
 your latest patch there.

I created the bug (http://bugzilla.gnome.org/show_bug.cgi?id=575074),
but I am unable to attach the patch:

| Internal Error
| 
| Bugzilla has suffered an internal error. Please save this page and
| send it to bugmas...@gnome.org with details of what you were doing at
| the time this message appeared.
| 
| URL: http://bugzilla.gnome.org/attachment.cgi
| 
| undef error - Undefined subroutine Fh::slice at
| data/template/template/en/default/global/hidden-fields.html.tmpl
| line 58
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: High-performance Priority Queue / Heap for glib

2009-03-12 Thread Maik Zumstrull
Maik Zumstrull wrote:

 Behdad Esfahbod wrote:
 
  I suggest you open a bug on http://bugzilla.gnome.org/ and attach
  your latest patch there.
 
 I created the bug
 (http://bugzilla.gnome.org/show_bug.cgi?id=575074), but I am unable
 to attach the patch:

Worked on fifth-or-so attempt, must have been a temporary server glitch.
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: High-performance Priority Queue / Heap for glib

2009-03-11 Thread Behdad Esfahbod

Hi Maik,

I suggest you open a bug on http://bugzilla.gnome.org/ and attach your latest 
patch there.  I like to comment on the patch, but don't want to take more of 
your time until the glib maintainers show interest in merging this.


Cheers,
behdad

On 03/10/2009 03:53 PM, Maik Zumstrull wrote:

Maik Zumstrull wrote:


Behdad Esfahbod wrote:



I agree that having a priority queue in glib would be useful.
However, the API should use an opaque structure and be agnostic of
the actual implementation.

I'm going to rethink the API and package the thing as a patch against
2.19.10 sources. Maybe the discussion will pick up a little when
there's actual code on the table. Thanks for the feedback so far.


Okay, here it is. I haven't extensively tested this for correctness or
performance yet, but it basically works. The API is documented and
should allow different implementations, I think. It's similar to what I
suggested before, but I dropped merge(). I'm not sure every kind of
underlying implementation could provide that efficiently. Also,
priority queues and single entries of priority queues are syntactically
different types now, but in reality the same thing with my
implementation.

% diffstat glib-2.19.10-priorityqueue.diff
  docs/reference/glib/glib-docs.sgml|2
  docs/reference/glib/glib-sections.txt |   17
  docs/reference/glib/tmpl/priority_queues.sgml |  156 +
  glib/Makefile.am  |1
  glib/glib.h   |1
  glib/glib.symbols |   14
  glib/gpqueue.c|  445 
++
  glib/gpqueue.h|   63 +++
  tests/Makefile.am |2
  tests/priorityqueue-test.c|  170 +
  10 files changed, 871 insertions(+)

It applies to glib-2.19.10, and building with
./autogen.sh --enable-gtk-doc  make  make check
seems to work fine. I hope I've integrated it into the build system
correctly in all places.

Please review.


Thanks,

Maik Zumstrull




___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list

___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: High-performance Priority Queue / Heap for glib

2009-03-10 Thread Maik Zumstrull
Maik Zumstrull wrote:

 Behdad Esfahbod wrote:

  I agree that having a priority queue in glib would be useful.
  However, the API should use an opaque structure and be agnostic of
  the actual implementation.
 
 I'm going to rethink the API and package the thing as a patch against
 2.19.10 sources. Maybe the discussion will pick up a little when
 there's actual code on the table. Thanks for the feedback so far.

Okay, here it is. I haven't extensively tested this for correctness or
performance yet, but it basically works. The API is documented and
should allow different implementations, I think. It's similar to what I
suggested before, but I dropped merge(). I'm not sure every kind of
underlying implementation could provide that efficiently. Also,
priority queues and single entries of priority queues are syntactically
different types now, but in reality the same thing with my
implementation.

% diffstat glib-2.19.10-priorityqueue.diff 
 docs/reference/glib/glib-docs.sgml|2 
 docs/reference/glib/glib-sections.txt |   17 
 docs/reference/glib/tmpl/priority_queues.sgml |  156 +
 glib/Makefile.am  |1 
 glib/glib.h   |1 
 glib/glib.symbols |   14 
 glib/gpqueue.c|  445 ++
 glib/gpqueue.h|   63 +++
 tests/Makefile.am |2 
 tests/priorityqueue-test.c|  170 +
 10 files changed, 871 insertions(+)

It applies to glib-2.19.10, and building with
./autogen.sh --enable-gtk-doc  make  make check
seems to work fine. I hope I've integrated it into the build system
correctly in all places.

Please review.


Thanks,

Maik Zumstrulldiff -X diffignore -ruN glib-2.19.10/docs/reference/glib/glib-docs.sgml glib-2.19.10.local/docs/reference/glib/glib-docs.sgml
--- glib-2.19.10/docs/reference/glib/glib-docs.sgml	2009-03-02 07:02:41.0 +0100
+++ glib-2.19.10.local/docs/reference/glib/glib-docs.sgml	2009-03-09 18:16:35.804036418 +0100
@@ -36,6 +36,7 @@
 !ENTITY glib-Doubly-Linked-Lists SYSTEM xml/linked_lists_double.xml
 !ENTITY glib-Singly-Linked-Lists SYSTEM xml/linked_lists_single.xml
 !ENTITY glib-Double-ended-Queues SYSTEM xml/queue.xml
+!ENTITY glib-Priority-Queues SYSTEM xml/priority_queues.xml
 !ENTITY glib-Sequences SYSTEM xml/sequence.xml
 !ENTITY glib-Trash-Stacks SYSTEM xml/trash_stack.xml
 !ENTITY glib-Hash-Tables SYSTEM xml/hash_tables.xml
@@ -180,6 +181,7 @@
 glib-Doubly-Linked-Lists;
 glib-Singly-Linked-Lists;
 glib-Double-ended-Queues;
+glib-Priority-Queues;
 glib-Sequences;
 glib-Trash-Stacks;
 glib-Hash-Tables;
diff -X diffignore -ruN glib-2.19.10/docs/reference/glib/glib-sections.txt glib-2.19.10.local/docs/reference/glib/glib-sections.txt
--- glib-2.19.10/docs/reference/glib/glib-sections.txt	2009-03-02 07:19:08.0 +0100
+++ glib-2.19.10.local/docs/reference/glib/glib-sections.txt	2009-03-09 18:11:23.308036369 +0100
@@ -1939,6 +1939,23 @@
 /SECTION
 
 SECTION
+TITLEPriority Queues/TITLE
+FILEpriority_queues/FILE
+
+GPQueue
+GPQueueHandle
+g_pqueue_insert
+g_pqueue_top
+g_pqueue_top_extended
+g_pqueue_delete_top
+g_pqueue_pop
+g_pqueue_pop_extended
+g_pqueue_delete
+g_pqueue_change_priority
+g_pqueue_destroy
+/SECTION
+
+SECTION
 TITLESequences/TITLE
 FILEsequence/FILE
 
diff -X diffignore -ruN glib-2.19.10/docs/reference/glib/tmpl/priority_queues.sgml glib-2.19.10.local/docs/reference/glib/tmpl/priority_queues.sgml
--- glib-2.19.10/docs/reference/glib/tmpl/priority_queues.sgml	1970-01-01 01:00:00.0 +0100
+++ glib-2.19.10.local/docs/reference/glib/tmpl/priority_queues.sgml	2009-03-10 15:53:51.431915573 +0100
@@ -0,0 +1,156 @@
+!-- # SECTION Title # --
+Priority Queues
+
+!-- # SECTION Short_Description # --
+a collection of data entries with associated priority values that returns
+entries one by one in order of priority
+
+!-- # SECTION Long_Description # --
+para
+The #GPQueue structure and its associated functions provide a sorted collection
+of data/priority pairs. Entries can be inserted in any order and at any time,
+and an entry's priority can be changed after it has been inserted into the
+queue. Any entry can be deleted at any time if you still have its
+#GPQueueHandle, but entries are supposed to be removed one at a time in order
+of priority with g_pqueue_pop().
+/para
+para
+The entries emphasiscannot/emphasis be iterated over in any way other than
+removing them one by one in order of priority, but when doing that, this
+structure is far more efficient than sorted lists or balanced trees, which
+on the other hand do not suffer from this restriction.
+/para
+para
+Most of the functions for #GPQueue work like #GList functions, that is,
+you assign the result to your queue variable, and the
+exception to the rule are g_pqueue_pop() and g_pqueue_pop_extended(),

Re: High-performance Priority Queue / Heap for glib

2009-03-06 Thread Behdad Esfahbod
Maik Zumstrull wrote:
 Needing a fast priority queue for an internal project recently and
 noticing that glib doesn't have one, I wrote it myself. I'd be
 interested in submitting the code, and I'd be willing to do the
 necessary cleanup and documentation work on it.

I agree that having a priority queue in glib would be useful.  However, the
API should use an opaque structure and be agnostic of the actual implementation.

behdad

 It's a pretty straight-forward implementation of a Fibonacci Heap
 http://en.wikipedia.org/wiki/Fibonacci_heap, with a forest of trees
 of circular double-linked lists as the underlying data structure (which
 is not as nasty as that sounds).
 
 According to my quick mailing list search, a Heap-based priority queue
 was submitted in 2003, but not accepted. Let me address the comments in
 http://mail.gnome.org/archives/gtk-devel-list/2003-March/msg6.html.
 
 | - Does it offer significant (big-o) performance 
 |   benefits for a common operation, or allow some 
 |   operation that would be really hard to code without 
 |   the data structure?
 
 All operations can be done with existing glib data structures, but
 there is a significant performance benefit. Please refer to
 http://en.wikipedia.org/wiki/Fibonacci_heap#Summary_of_running_times.
 
 Glib includes linked lists and balanced trees, of course; a simple heap
 was submitted in 2003. As you can see, all three structures are easily
 beaten by this approach, dropping performance to O(1) for most
 operations and retaining O(log n) performance in the other cases.
 
 | - Would it be useful to a significant fraction of
 |   applications?
 
 I don't think I can estimate the need. Personally, I was working with
 graph algorithms related to partitioning and path finding, where a good
 priority queue has a significant impact. In other areas where a
 priority queue is needed, presumably people are simply using glib's
 trees or lists now, accepting the performance penalty, which wouldn't
 matter in most applications.
 
 
 My implementation, ~300 lines of code, is fairly self-contained. It
 introduces one new data structure (to be renamed for inclusion):
 
 struct s_FQueue {
 gpointer data;
 gint priority;
 
 gint degree;
 gboolean marked;
 
 struct s_FQueue *parent;
 struct s_FQueue *prev;
 struct s_FQueue *next;
 struct s_FQueue *firstchild;
 };
 
 typedef struct s_FQueue FQueue;
 
 The glib slice allocator is used to create and drop these elements.
 There are no significant dependencies on glib functionality otherwise.
 
 Memory management follows the semantics used by glib's linked lists: a
 simple NULL pointer represents an empty queue, and FQueue elements are
 created and deleted implicitly as items are added to and removed from
 the queue. The implementation doesn't at all care about the data
 pointer, so filling it with GINT_TO_POINTER wrapped gints instead of
 actual pointers is okay.
 
 The element's priorities have to be expressed as gints, the use of a
 comparison function is not supported as an alternative. Elements are
 returned by the queue in ascending order of the priority value. No
 guarantees are made regarding the iteration order for elements with the
 same priority. The priority of an entry can be changed after insertion
 by a function call, but the priority field must not be directly written.
 
 The user is expected to keep one FQueue *q variable for the queue, then
 pass its address to most of the API calls. The pointer is rewritten by
 each call as necessary to always point to the FQueue element currently
 at the front of the queue (or NULL for an empty queue). The GList style
 syntax q = some_api_call(q, ...); for rewriting the pointer is not used
 because of the special needs of the insert call (see below). This could
 be inverted to use GList-style syntax everywhere, and an FQueue** in
 the insert call only.
 
 This is the proposed / currently implemented API:
 
 #define fqueue_isempty(fq) (gboolean)(fq == NULL)
 
 Obvious.
 
 FQueue* fqueue_insert (FQueue **fq, gpointer data, gint priority);
 
 Inserts data into the list with a priority value of priority. An
 FQueue element is implicitly allocated. The fq pointer is rewritten to
 point to the front of the queue. The *returned* pointer points to the
 newly allocated element. Do NOT put that into your queue variable, but
 keep the pointer around somewhere if you intend to make
 change_priority or delete calls on this element. The returned pointer
 can be ignored if you don't plan to change any priorities and don't
 plan to remove elements other than by pop()-ing them.
 
 gpointer fqueue_peek (FQueue *fq);
 
 Returns the value of the data pointer of the element with the lowest
 priority value, without removing it from the queue. NULL is returned
 for an empty queue, so if you need to differentiate between an empty
 queue and an element with a NULL data pointer, check for that first.
 
 gpointer 

Re: High-performance Priority Queue / Heap for glib

2009-03-06 Thread Maik Zumstrull
Behdad Esfahbod wrote:

 Maik Zumstrull wrote:
  Needing a fast priority queue for an internal project recently and
  noticing that glib doesn't have one, I wrote it myself. I'd be
  interested in submitting the code, and I'd be willing to do the
  necessary cleanup and documentation work on it.
 
 I agree that having a priority queue in glib would be useful.
 However, the API should use an opaque structure and be agnostic of
 the actual implementation.

I'm going to rethink the API and package the thing as a patch against
2.19.10 sources. Maybe the discussion will pick up a little when
there's actual code on the table. Thanks for the feedback so far.
___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


Re: High-performance Priority Queue / Heap for glib

2009-03-04 Thread Ben Pfaff
Maik Zumstrull maik.zumstr...@rz.uni-karlsruhe.de writes:

[about a Fibonacci heap implementation]
 | - Does it offer significant (big-o) performance 
 |   benefits for a common operation, or allow some 
 |   operation that would be really hard to code without 
 |   the data structure?

 All operations can be done with existing glib data structures, but
 there is a significant performance benefit. Please refer to
 http://en.wikipedia.org/wiki/Fibonacci_heap#Summary_of_running_times.

 Glib includes linked lists and balanced trees, of course; a simple heap
 was submitted in 2003. As you can see, all three structures are easily
 beaten by this approach, dropping performance to O(1) for most
 operations and retaining O(log n) performance in the other cases.

The big-O of a Fibonacci heap is better than that of a balanced
tree or a binary heap, but how is the actual performance?  When
you benchmark your Fibonacci heap against balanced trees or
binary heaps for realistic test cases, is there an actual
improvement in runtime?

The reason that I ask is this paragraph in the chapter on
Fibonacci heaps in Cormen, Leiserson, and Rivest, _Introduction
to Algorithms_:

From a practical point of view, however, the constant
factors and programming complexity of Fibonacci heaps
make them less desirable than ordinary binary (or k-ary)
heaps for most applications.  Thus, Fibonacci heaps are
predominantly of theoretical interest.

I observed this effect in practice myself a few years ago, when I
compared the performance of otherwise identical code using a
binary heap and a Fibonacci heap (see page 51 of
http://benpfaff.org/uniformity/uniformity-2001.04.27.pdf).  But I
didn't make much of an attempt to optimize my heaps, and maybe
you did a better implementation.
-- 
Ben Pfaff 
http://benpfaff.org

___
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list


High-performance Priority Queue / Heap for glib

2009-03-03 Thread Maik Zumstrull
Needing a fast priority queue for an internal project recently and
noticing that glib doesn't have one, I wrote it myself. I'd be
interested in submitting the code, and I'd be willing to do the
necessary cleanup and documentation work on it.

It's a pretty straight-forward implementation of a Fibonacci Heap
http://en.wikipedia.org/wiki/Fibonacci_heap, with a forest of trees
of circular double-linked lists as the underlying data structure (which
is not as nasty as that sounds).

According to my quick mailing list search, a Heap-based priority queue
was submitted in 2003, but not accepted. Let me address the comments in
http://mail.gnome.org/archives/gtk-devel-list/2003-March/msg6.html.

| - Does it offer significant (big-o) performance 
|   benefits for a common operation, or allow some 
|   operation that would be really hard to code without 
|   the data structure?

All operations can be done with existing glib data structures, but
there is a significant performance benefit. Please refer to
http://en.wikipedia.org/wiki/Fibonacci_heap#Summary_of_running_times.

Glib includes linked lists and balanced trees, of course; a simple heap
was submitted in 2003. As you can see, all three structures are easily
beaten by this approach, dropping performance to O(1) for most
operations and retaining O(log n) performance in the other cases.

| - Would it be useful to a significant fraction of
|   applications?

I don't think I can estimate the need. Personally, I was working with
graph algorithms related to partitioning and path finding, where a good
priority queue has a significant impact. In other areas where a
priority queue is needed, presumably people are simply using glib's
trees or lists now, accepting the performance penalty, which wouldn't
matter in most applications.


My implementation, ~300 lines of code, is fairly self-contained. It
introduces one new data structure (to be renamed for inclusion):

struct s_FQueue {
gpointer data;
gint priority;

gint degree;
gboolean marked;

struct s_FQueue *parent;
struct s_FQueue *prev;
struct s_FQueue *next;
struct s_FQueue *firstchild;
};

typedef struct s_FQueue FQueue;

The glib slice allocator is used to create and drop these elements.
There are no significant dependencies on glib functionality otherwise.

Memory management follows the semantics used by glib's linked lists: a
simple NULL pointer represents an empty queue, and FQueue elements are
created and deleted implicitly as items are added to and removed from
the queue. The implementation doesn't at all care about the data
pointer, so filling it with GINT_TO_POINTER wrapped gints instead of
actual pointers is okay.

The element's priorities have to be expressed as gints, the use of a
comparison function is not supported as an alternative. Elements are
returned by the queue in ascending order of the priority value. No
guarantees are made regarding the iteration order for elements with the
same priority. The priority of an entry can be changed after insertion
by a function call, but the priority field must not be directly written.

The user is expected to keep one FQueue *q variable for the queue, then
pass its address to most of the API calls. The pointer is rewritten by
each call as necessary to always point to the FQueue element currently
at the front of the queue (or NULL for an empty queue). The GList style
syntax q = some_api_call(q, ...); for rewriting the pointer is not used
because of the special needs of the insert call (see below). This could
be inverted to use GList-style syntax everywhere, and an FQueue** in
the insert call only.

This is the proposed / currently implemented API:

#define fqueue_isempty(fq) (gboolean)(fq == NULL)

Obvious.

FQueue* fqueue_insert (FQueue **fq, gpointer data, gint priority);

Inserts data into the list with a priority value of priority. An
FQueue element is implicitly allocated. The fq pointer is rewritten to
point to the front of the queue. The *returned* pointer points to the
newly allocated element. Do NOT put that into your queue variable, but
keep the pointer around somewhere if you intend to make
change_priority or delete calls on this element. The returned pointer
can be ignored if you don't plan to change any priorities and don't
plan to remove elements other than by pop()-ing them.

gpointer fqueue_peek (FQueue *fq);

Returns the value of the data pointer of the element with the lowest
priority value, without removing it from the queue. NULL is returned
for an empty queue, so if you need to differentiate between an empty
queue and an element with a NULL data pointer, check for that first.

gpointer fqueue_pop (FQueue **fq);

Returns the value of the data pointer of the element with the lowest
priority value, removing it from the queue. The FQueue element is
implicitly deallocated. Do not attempt to make change_priority or delete
calls on that element after this, it's gone! NULL is returned