Hi Paul, * Paul Smith wrote on Sat, Jan 22, 2011 at 10:27:01PM CET: > Hi Ralf, sorry for the delay in reply.
No worries. > On Sun, 2011-01-09 at 08:50 +0100, Ralf Wildenhues wrote: > > Just breaking out of the for loop in the if-true case avoids the problem > > for me. (The bulk of the patch is cleanup to remove the unneeded 'best' > > variable.) Of course that only delays quadratic behavior as with filled > > caches, all of them are still walked. > > Thanks for your investigations here; that's very helpful. I had also > had some thoughts about improving the strcache in various ways. For > example, I was thinking maybe about splitting it into two: one for > filenames and one for variable names, since these strings very rarely > overlap. But a hash doesn't degrade in quality just because you put two different kinds of things in it. I wouldn't go for more complexity here. On the contrary, if you want an example where one big hash table is used for several different kinds of objects very successfully, look at git. > One other thing that would probably be useful is that someone posted > some patches that changed the hashing in GNU make to use a PATRICIA tree > instead of a standard hash (the hashing in GNU make is taken from > idutils); I think this could help a lot in the fairly common case where > there are lots of strings which share prefixes. I asked for copyright > assignment for that code but I'm not sure where it stands. Yes, this sounds like it might help. FWIW, I would only go that way if there are use cases where this demonstrably helps, and the log(N) behavior doesn't slow others down a lot. > But as always, it's better to test and fix real issues rather than guess > as to where your issues might be. I'll definitely look into making > changes along the lines you suggest. Thank you! > > strcache_iscached is more worrisome with 14-fold increase. Looking at > > its implementation, I wonder why it doesn't do a hash lookup rather > > than looping over all the cache entries. > > I was thinking that doing simple pointer compares across the buffers > would be not much slower, and even faster in simple cases, than the hash > lookup, but obviously when the # of buffers gets large that's not true. > > On the other hand, the strcache_iscached() function is really only > intended for debug/verification. It's called in an assert() (in file.c) > I added during development of the string cache to be sure that strings > expected to be in the cache, really were. For normal use that > could/should be disabled. Perhaps I need a special flag controlling > stuff like this. I could add a compiler flag to enable these checks in > maintenance mode, maybe. Sounds like a good idea. Cheers, Ralf _______________________________________________ Bug-make mailing list Bug-make@gnu.org http://lists.gnu.org/mailman/listinfo/bug-make