Hey, With the recent talk on GetHashCode(), I was just taking a look at the code. I decided to try my hand at seeing what would happen performancewise if i made the function work with an int* as opposed to char*. Turns out i ended up with something which was 50% slower, but also had 50% less collisions on a rather large english dictionary dataset (17mb of words, probably a few dupes).
Just in case it's of any use, i've attached it here, though i'd say there are much better ones out there. This was purely accidental ;) public unsafe static int GetHashCode(string s) { int h = 0; int rem = s.Length & 3; fixed (char* c = s) { int* start = (int*)c; int* end = start + s.Length - rem; for (int* i = end; i >= start; i--) h = (h << 5) - h + *i; char* cc = c + s.Length - rem; for (char* i = cc + rem; i >= cc; i--) h = (h << 5) - h + *i; return h; } } On Mon, Jul 21, 2008 at 12:33 PM, Marek Safar <[EMAIL PROTECTED]> wrote: > Hello Andreas, > > Imho we should revert the change, especially as MS seems to also have an > > implementation with low collisions (and in fact this will be what people > are > > expecting from a hashcode without any further explanation) we should do > the > > same. Otherwise this might drive some implementations into huge perf > > problems. > > > I reverted my changes for now to produce same bad/good results. However, > I tried to port another implementation which seems to produce much more > better results for both scenarios but it still needs more testing. I'll > send a patch for review when I am done. > > Marek > _______________________________________________ > Mono-devel-list mailing list > Mono-devel-list@lists.ximian.com > http://lists.ximian.com/mailman/listinfo/mono-devel-list >
_______________________________________________ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list