Great analysis. GList is the most absolutely rudimentary list structure you could imagine. g_list_append walks the list to the end, then adds the new element. Totally O(n^2). A bit of googling turned up some very defensive arguments against changing GList to be a more opaque type with efficient head and tail pointers and length information.
We have several calls to g_list_length in the codebase, too. These walk all elements to find the length. Yuck. I think we should use g_list* as the internals to our own opaque list type. Given that most messages have only 15-20 headers, a hash table is probably not going to be more efficient due to its overhead -- it would have better worst-case performance, though. Aaron On Mon, 2006-03-06 at 22:00 -0600, Matthew Sayler wrote: > > I compiled dbmail and glib with profiling and gprof shows almost all > > time being spent in IA__g_list_last. (Note the CPU time reported by > > time was 277.17s -- the rest is probably in glibc and mysql libs..) > > > > time seconds seconds calls s/call s/call name > > 99.35 272.99 272.99 344704 0.00 0.00 IA__g_list_last > > 0.20 273.54 0.55 18702183 0.00 0.00 ucmp > > 0.10 273.82 0.28 344681 0.00 0.00 g_tree_node_insert > > 0.06 273.99 0.17 860545 0.00 0.00 g_tree_node_lookup > > 0.05 274.12 0.13 344771 0.00 0.00 _g_list_alloc > > 0.03 274.20 0.08 14 0.01 0.01 db_getmailbox > > (full profile log attached) > > OK -- O(n^2) behavior is bad. The traverse_tree_{keys, values, merge} > functions all ended up appending message pointers to the end of a linked > list. When you do this for 80,000 messages, you end up following about > 3.2 billion pointers.. Yuck. > > Attached patch created copies of these functions (traverse_tree_keys_rev > etc) which prepends list items and then uses a g_list_reverse to sort > out the mess. It seems to work for me but I've given it minimal > testing. > > There are better ways of doing this, but for me this reduces the above > mailbox query from 272cpusec to 5.1cpusec, for a tidy 50x speedup. > > Matt > > _______________________________________________ > Dbmail-dev mailing list > Dbmail-dev@dbmail.org > http://twister.fastxs.net/mailman/listinfo/dbmail-dev >