Refactoring hardcoded SHA-1 constants
SHA-1 is clearly on its way out. I know that there has been discussion in the past about moving to different algorithms. I'm not interested in having that discussion now. I'd like to introduce a set of preprocessor constants that we'd use instead of hard-coded 20s and 40s everywhere. That way, if we decide to support a different algorithm, doing so becomes significantly easier. These would be used in new code. After that, I'd like to slowly start moving existing code to use these constants. I would also like to consider, as a third step, turning all of the unsigned char[20] uses into a struct containing unsigned char[20] as its only member, like libgit2 does. This disambiguates arbitrary byte data from byte sequences being used to represent an SHA-1 ID. I'm hoping the first suggestion will be mostly unobjectionable, as having hard-coded constants is widely considered bad programming practice. I suspect the latter two will be more controversial. These changes, if implemented, would be done by hand, not by machine, and they would be bisectable. Comments? -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187 signature.asc Description: Digital signature
Re: Refactoring hardcoded SHA-1 constants
Hi, brian m. carlson wrote: I'd like to introduce a set of preprocessor constants that we'd use instead of hard-coded 20s and 40s everywhere. Lukewarm on that. It's hard to do consistently and unless they're named well it can be harder to know what something like BINARY_OBJECT_NAME_LENGTH means than plain '20' when first reading. [...] I would also like to consider, as a third step, turning all of the unsigned char[20] uses into a struct containing unsigned char[20] as its only member, like libgit2 does. That would be very welcome! It's a nice way to steer people toward hashcmp using the type system, and it makes it possible to use a union to enforce alignment later if measurements show benefit. Thanks, Jonathan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Refactoring hardcoded SHA-1 constants
brian m. carlson wrote: I'd like to introduce a set of preprocessor constants that we'd use instead of hard-coded 20s and 40s everywhere. I agree that it would help code clarity to have symbolic constants for these numbers. On 04/19/2014 12:40 AM, Jonathan Nieder wrote: Lukewarm on that. It's hard to do consistently and unless they're named well it can be harder to know what something like BINARY_OBJECT_NAME_LENGTH means than plain '20' when first reading. OK, so let's see if we can name them well. (Though I think if the names come into wide use then we'll get familiar with them quickly.) libgit2 seems to use the name oid (for object ID) where we tend to use sha1 or name. That might be a good convention for us to move towards. Their constants are called GIT_OID_RAWSZ (== 20) and GIT_OID_HEXSZ (== 40). They don't exactly roll off the tongue, I'll admit. We wouldn't need a GIT_ prefix, I think, since our code is not meant to be used as a library. Let the brainstorming (and bikeshedding) begin! 1. GIT_OID_RAWSZ / GIT_OID_HEXSZ 2. OID_RAWSZ / OID_HEXSZ 3. OID_BINARY_LEN / OID_ASCII_LEN 4. BINARY_OID_LEN / ASCII_OID_LEN Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Refactoring hardcoded SHA-1 constants
On Sat, Apr 19, 2014 at 7:06 AM, Michael Haggerty mhag...@alum.mit.edu wrote: Let the brainstorming (and bikeshedding) begin! 1. GIT_OID_RAWSZ / GIT_OID_HEXSZ 2. OID_RAWSZ / OID_HEXSZ 3. OID_BINARY_LEN / OID_ASCII_LEN 4. BINARY_OID_LEN / ASCII_OID_LEN 5. sizeof(oid) / ASCII_OID_LEN -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Refactoring hardcoded SHA-1 constants
Duy Nguyen pclo...@gmail.com writes: On Sat, Apr 19, 2014 at 7:06 AM, Michael Haggerty mhag...@alum.mit.edu wrote: Let the brainstorming (and bikeshedding) begin! 1. GIT_OID_RAWSZ / GIT_OID_HEXSZ 2. OID_RAWSZ / OID_HEXSZ 3. OID_BINARY_LEN / OID_ASCII_LEN 4. BINARY_OID_LEN / ASCII_OID_LEN 5. sizeof(oid) / ASCII_OID_LEN Can we safely assume sizeof(struct { uchar oid[20]; }) is 20, or on some 64-bit platforms do we have to worry about 8-byte alignment? In any case, if the pair of names in 1. are what is already used, I do not see a need for us to waste time bikeshedding at all. In an ideal world, an implementation of cmd_foo() that defines a variable to hold an object name and then calls into libgit.a services may link in the future with a different implementation of the services that are currently offered by libgit.a but are reimplemented on top of libgit2, right? Having the same name would help us, not hurt us, with that direction in some future, no? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Refactoring hardcoded SHA-1 constants
On Sat, Apr 19, 2014 at 8:06 AM, Junio C Hamano gits...@pobox.com wrote: Duy Nguyen pclo...@gmail.com writes: On Sat, Apr 19, 2014 at 7:06 AM, Michael Haggerty mhag...@alum.mit.edu wrote: Let the brainstorming (and bikeshedding) begin! 1. GIT_OID_RAWSZ / GIT_OID_HEXSZ 2. OID_RAWSZ / OID_HEXSZ 3. OID_BINARY_LEN / OID_ASCII_LEN 4. BINARY_OID_LEN / ASCII_OID_LEN 5. sizeof(oid) / ASCII_OID_LEN Can we safely assume sizeof(struct { uchar oid[20]; }) is 20, or on some 64-bit platforms do we have to worry about 8-byte alignment? That's an interesting point. sizeof(that struct) on x86-64 returns 20. I haven't checked about other platforms. We have some ondisk structures around that contain unsigned char [20] if I remember correctly. Those would need to be handled with care during the conversion if this becomes a real issue. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html