On Tue, Aug 15, 2017 at 4:49 AM, Patryk Obara <patryk.ob...@gmail.com> wrote:
Welcome (back?) to the git mailing list! > I experimented with using a different hash algorithm (I am aware of > existing "Git hash function transition plan", I just want to push > things forward a bit) - and immediately hit a small issue - changing > the size of object_id hash buffer leads to compilation issues and > breaks graft-related tests. Thanks for advancing this frontier. :) > > I am sending patch 1 only to show a modification, that I did to > increase buffer size - it's not intended to be merged. > > Patch 2 fixes trivial compilation issue. > > Patches 3, 4, and 5 touch graft implementation to remove calculations > using GIT_SHA1_*, that lead to broken tests. I replaced FLEX_ARRAY of > object_id's representing parents with oid_array. New implementation > should be more future-proof, I think. I would think so, too. parse_oid_hex currently only reads sha1, but once it can read a new hash (or both old and new hash), it would solve the graft problems. > New implementation has tiny behaviour change: previously parents in > graft line needed to be separated with single space - now any number > of whitespace characters will do. Yeah that is because of parse_next_oid_hex in patch 5 is pretty smart (and if we'd want to preserve behavior we'd need to just skip one SP and in case of more SP "goto bad_graft_data" that is in the caller function. > > Alternative implementation approaches > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Strbuf could be replaced with string_list with > string_list_split_in_place instead of while loop in read_graft_line. > I didn't implement it this way because I learned > about string_list_split_in_place after finishing this implementation > draft. Right now I'm not sure which approach is better. > > Another possibility is dropping graft feature altogether - that would > mean removing code for parsing grafts and 'parent' field in the struct, > but preserving the struct itself as a shallow clone marker. Grafts are > a little-known feature with modern replacement, but this seems like > bigger task and rather out of the scope of transition to the new > hashing algorithm. > > I considered making function read_graft_line a static one and > read_graft_file non-static, but read_graft_line is used in > 'builtin/blame.c' in function read_ancestry, which is almost a copy of > read_graft_file (difference of single boolean flag passed to > register_commit_graft). Removal of this duplication may be worthwhile, > but I think it's out of scope. I think the grafts may be still in use in Linux, to fault in the history before git was used, which cannot be replaced by the shallow mechanism. Thanks for the patches 2-5! Stefan