After taking a 1.5-year "vacation" from pack v4, I plan to do something
about it again. I will post more when I have some patches to discuss.
Just one question for now (forgive me if I have asked it already, it's
been quite some time).

I think pack v4 does not deliver on its best promise: that walking a
tree is simply following pointers and jumping from place to place. When
we want to copy from the middle of another tree, we need to scan that
tree from its beginning. The tree offset cache helps, but the problem
remains. What do you think about an alternative format in which each
"copy" instruction includes both the index of the tree entry to copy
from (i.e. what we store now) _and_ its byte offset from the beginning
of the tree? With this byte offset, we know exactly where to start
copying without scanning from the beginning. It will be a bit(?)
bigger, but it's also faster.
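
To make the trade-off concrete, here is a minimal sketch in C. It is not
the real pack v4 layout: the entry encoding (one length byte followed by
the payload), struct copy_insn, scan_to_entry() and copy_start() are all
made up for illustration. The only point is that a variable-length
encoding forces a scan to reach entry N, while a stored byte offset lets
us jump straight there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified tree encoding: each entry is one length
 * byte followed by that many payload bytes. Real v4 entries differ.
 */

/*
 * Slow path: walk entries from the start of the tree to find where
 * entry `index` begins. Returns its byte offset, or (size_t)-1 if the
 * tree does not contain that entry.
 */
static size_t scan_to_entry(const uint8_t *tree, size_t len, uint32_t index)
{
	size_t pos = 0;

	assert(tree != NULL);
	while (index--) {
		if (pos >= len)
			return (size_t)-1;
		pos += 1 + (size_t)tree[pos]; /* skip length byte + payload */
	}
	return pos < len ? pos : (size_t)-1;
}

/* Hypothetical copy instruction carrying both pieces of information. */
struct copy_insn {
	uint32_t start_index; /* first entry to copy (what is stored now) */
	uint32_t count;       /* number of entries to copy */
	uint16_t byte_offset; /* proposed hint; 0 means "not filled in" */
};

/*
 * Fast path when the hint is present and in range, otherwise fall back
 * to scanning. A zero hint is harmless for entry 0, because scanning
 * to entry 0 costs nothing.
 */
static size_t copy_start(const uint8_t *tree, size_t len,
			 const struct copy_insn *insn)
{
	if (insn->byte_offset && insn->byte_offset < len)
		return insn->byte_offset;
	return scan_to_entry(tree, len, insn->start_index);
}
```

Using zero as the "no hint" value keeps an instruction without a hint
valid, at the cost of always scanning for it.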

I imagine this is an optimization that can be done locally. The pack
transferred over the network does not have these byte offsets. After
the pack is stored and verified by index-pack, we can rewrite it and
add this info. The simplest way is to use a fixed size for this offset
(e.g. uint16_t or even uint8_t) and add the placeholder to the copy
instructions of all v4 trees. After that, object offsets will not
change again and we can start filling real offsets into the
placeholders.
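
The two-pass rewrite could look roughly like the sketch below. Everything
here is an assumption made for illustration: the 2-byte "wire" and 4-byte
"local" instruction encodings, the one-length-byte entry encoding of the
base tree, and the helper names add_placeholders() and fill_offsets().
None of it is the actual pack v4 format or index-pack code. Pass 1 fixes
the final size of every instruction; pass 2 only patches bytes in place,
so nothing moves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical encodings, for illustration only:
 *  - base tree entry:        one length byte + payload
 *  - wire copy instruction:  [start_index][count]
 *  - local copy instruction: [start_index][count][off_lo][off_hi]
 */
enum { WIRE_INSN = 2, LOCAL_INSN = 4 };

/* Byte offset of entry `index` in the base tree, or (size_t)-1. */
static size_t scan_to_entry(const uint8_t *tree, size_t len, uint32_t index)
{
	size_t pos = 0;

	while (index--) {
		if (pos >= len)
			return (size_t)-1;
		pos += 1 + (size_t)tree[pos];
	}
	return pos < len ? pos : (size_t)-1;
}

/*
 * Pass 1: widen every instruction with a zeroed fixed-size
 * placeholder. After this pass the output size is final.
 * `local` must have room for n_insn * LOCAL_INSN bytes.
 */
static size_t add_placeholders(const uint8_t *wire, size_t n_insn,
			       uint8_t *local)
{
	for (size_t i = 0; i < n_insn; i++) {
		local[i * LOCAL_INSN + 0] = wire[i * WIRE_INSN + 0];
		local[i * LOCAL_INSN + 1] = wire[i * WIRE_INSN + 1];
		local[i * LOCAL_INSN + 2] = 0; /* placeholder, low byte */
		local[i * LOCAL_INSN + 3] = 0; /* placeholder, high byte */
	}
	return n_insn * LOCAL_INSN;
}

/*
 * Pass 2: patch real offsets into the placeholders, in place. An
 * offset that cannot be found or does not fit in 16 bits is left as
 * zero, so the reader falls back to scanning for that instruction.
 * Returns the number of instructions whose offset was resolved.
 */
static size_t fill_offsets(uint8_t *local, size_t n_insn,
			   const uint8_t *base, size_t base_len)
{
	size_t filled = 0;

	for (size_t i = 0; i < n_insn; i++) {
		uint8_t *insn = local + i * LOCAL_INSN;
		size_t off = scan_to_entry(base, base_len, insn[0]);

		if (off == (size_t)-1 || off > UINT16_MAX)
			continue;
		insn[2] = (uint8_t)(off & 0xff);
		insn[3] = (uint8_t)(off >> 8);
		filled++;
	}
	return filled;
}
```

With a fixed-size field, the fallback for offsets that do not fit is what
makes a small type such as uint16_t (or uint8_t) safe to use.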

PS. The version rebased on recent master is here, if anyone is interested:

https://github.com/pclouds/git/commits/pack-v4
-- 
Duy