Hi, Where is this hashcode implementation taken from ? I don't think we should invent new ones.
Zoltan 2008/7/17 Marek Safar <[EMAIL PROTECTED]>: > Hello, > > Here are some data which hopefully bring some light to this topic. I didn't > measure uniqueness of hashcodes as I consider it less important as range > distribution. Enclosed you can find the result for you data with old and new > string.GetHashCode implementation. > > Some numbers to look at: > > ** Old implementation > > Highest contentions: 2985, 1155, 1137 > > ** Current implementation > > Highest contentions: 236, 217, 210 > > Marek > > >> I have been asked to move this discussion to the e-mail list from IRC. >> Basically we (my company and I) have new unit tests errors in 2.0 >> that did not occur at 1.9. The errors were traced to the >> String.GetHashCode change. I had one of our interns (Mike) research >> the change and I wanted to share his findings with you. His final >> document s attached. I can also send the test case Mike used to >> generate his data. I believe it is too large to send to all of your >> inboxes. >> >> http://docs.google.com/Doc?id=dhnnms3p_3fpmfq8dm >> >> (1:28:15 PM) holmes: Have there been any discussions about r106887 >> which was a change to String.GetHashCode? >> (1:31:54 PM) holmes: This change is causing some of our tests to fail. >> I will admit that our tests are relying on the sting hash code a bit >> too much. However after some investigation I am wondering if the new >> algorithm is generating a unique enough result. (I say that as >> respectfully as I can.) >> (1:34:33 PM) holmes: We conducted a short test where 58000 English >> string were tested and the new code only results in 60% unique results >> compared to the previous code's performance of 100% >> (1:38:36 PM) holmes: We have found the current algorithm only >> considers the last ~6 characters of a string, which may be the >> problem. That is the last thing I will say here I promise. >> (1:42:21 PM) kumpera: marek: how have you tested the distribution of >> your new string hashing algorithm? >> (1:42:54 PM) marek: kumpera: I used english dictionary >> (1:43:57 PM) marek: holmes: how big is your dictionary? >> (1:45:05 PM) holmes: marek: 58110 >> (1:46:12 PM) holmes: I think the real problem is not that a word >> hashes unique enough as a sentence producing a unique result. >> (1:46:56 PM) holmes: the last 6 characters define the whole algorithm. >> (or at least I think so) >> (1:56:50 PM) lupus: it's likely better to revert the change, string >> hashing is also supposed to be synced with the implementation in C in >> the runtime anyway >> >> -bill >> _______________________________________________ >> Mono-devel-list mailing list >> Mono-devel-list@lists.ximian.com >> http://lists.ximian.com/mailman/listinfo/mono-devel-list >> >> > > > _______________________________________________ > Mono-devel-list mailing list > Mono-devel-list@lists.ximian.com > http://lists.ximian.com/mailman/listinfo/mono-devel-list > > _______________________________________________ Mono-devel-list mailing list Mono-devel-list@lists.ximian.com http://lists.ximian.com/mailman/listinfo/mono-devel-list