On Sun, Mar 17, 2013 at 08:21:13PM +0700, Nguyen Thai Ngoc Duy wrote:
> On Thu, Jan 31, 2013 at 6:06 PM, Duy Nguyen wrote:
> > On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote:
> >> Perhaps we could store abbrev sha-1 instead of full sha-1. Nice
> >> space/time trade-off.
> >
> > Follow
On Thu, Jan 31, 2013 at 6:06 PM, Duy Nguyen wrote:
> On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote:
>> Perhaps we could store abbrev sha-1 instead of full sha-1. Nice
>> space/time trade-off.
>
> Following the on-disk format experiment yesterday, I changed the
> format to:
>
> - a li
Jeff King writes:
> On Thu, Jan 31, 2013 at 09:03:26AM -0800, Shawn O. Pearce wrote:
> ...
>> If we are going to change the index to support extension sections and
>> I have to modify JGit to grok this new format, it needs to be index v3
>> not index v2. If we are making index v3 we should just p
On Fri, Feb 1, 2013 at 5:15 PM, Jeff King wrote:
> The short-sha1 is a clever idea. Looks like it saves us on the order of
> 4MB for linux-2.6 (versus the full 20-byte sha1). Not as big as the
> savings we get from dropping the other 3 sha1's to uint32_t, but still
> not bad.
We could save anothe
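
To put the sizes above in perspective, here is a rough sketch of the per-commit record sizes being compared. The constant and function names are hypothetical, and the field list (commit name, timestamp, tree, two parents) is inferred from the commit_metapack() signature quoted elsewhere in the thread, not taken from the actual on-disk format. The abbreviation length is left as a parameter because the thread does not state it.

```c
#include <stddef.h>

/* Sizes in bytes. SHA1_LEN is the full SHA-1; the other two are assumptions
 * matching the uint32_t fields mentioned in the thread. */
enum { SHA1_LEN = 20, INDEX_LEN = 4, TIMESTAMP_LEN = 4 };

/* Commit, tree, parent1 and parent2 all stored as full SHA-1s. */
size_t full_entry_size(void)
{
	return 4 * SHA1_LEN + TIMESTAMP_LEN;
}

/* Tree and both parents replaced by 32-bit indexes; commit name still full. */
size_t indexed_entry_size(void)
{
	return SHA1_LEN + 3 * INDEX_LEN + TIMESTAMP_LEN;
}

/* As above, but the commit name is cut to abbrev_len bytes (hypothetical). */
size_t short_entry_size(size_t abbrev_len)
{
	return abbrev_len + 3 * INDEX_LEN + TIMESTAMP_LEN;
}
```

Under these assumptions, most of the saving comes from the three indexed fields (84 bytes down to 36 per commit); shortening the commit name only trims the remaining 20-byte field, which fits the "not as big, but still not bad" remark.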
On Thu, Jan 31, 2013 at 06:06:56PM +0700, Nguyen Thai Ngoc Duy wrote:
> On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote:
> > Perhaps we could store abbrev sha-1 instead of full sha-1. Nice
> > space/time trade-off.
>
> Following the on-disk format experiment yesterday, I changed the
>
On Wed, Jan 30, 2013 at 08:56:07PM +0700, Nguyen Thai Ngoc Duy wrote:
> Another point, but not really important at this stage, I think we have
> a memory leak somewhere (lookup_commit??). It used up to 800 MB RES on
> linux-2.6.git while generating the cache.
We generate (and then leak!) the linked
On Thu, Jan 31, 2013 at 09:03:26AM -0800, Shawn O. Pearce wrote:
> > Of course, it is more convenient to store this kind of things in a
> > separate file while experimenting and improving the mechanism, but I
> > do not think we want to see each packfile in a repository comes with
> > 47 auxiliary
On Tue, Jan 29, 2013 at 11:17:41PM -0800, Junio C Hamano wrote:
> > True, but it is even less headache if the file is totally separate and
> > optional.
>
> Once you start thinking about using an offset to some list of SHA-1,
> perhaps? A section inside the same file can never go out of sync.
Y
On Wed, Jan 30, 2013 at 7:56 AM, Junio C Hamano wrote:
> Jeff King writes:
>
>> From this:
>>
>>> Then it will be very natural for the extension data that store the
>>> commit metainfo to name objects in the pack the .idx file describes
>>> by the offset in the SHA-1 table.
>>
>> I guess your arg
On Wed, Jan 30, 2013 at 09:16:29PM +0700, Duy Nguyen wrote:
> Perhaps we could store abbrev sha-1 instead of full sha-1. Nice
> space/time trade-off.
Following the on-disk format experiment yesterday, I changed the
format to:
- a list of _short_ SHA-1s of cached commits
- a list of cache entries,
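
A lookup against such a list of short SHA-1s might look roughly like the sketch below. It assumes the list is sorted and that every entry is a fixed-length prefix; ABBREV_LEN and find_short_sha1() are hypothetical names, not identifiers from the patch, and the prefix length is picked arbitrarily.

```c
#include <stdint.h>
#include <string.h>

#define ABBREV_LEN 6 /* hypothetical prefix length, in bytes */

/*
 * Binary-search a sorted table of nr fixed-length SHA-1 prefixes.
 * Returns the position of the entry matching the first ABBREV_LEN
 * bytes of sha1, or -1 if no entry matches.
 */
int find_short_sha1(const unsigned char *table, uint32_t nr,
		    const unsigned char *sha1)
{
	uint32_t lo = 0, hi = nr;

	while (lo < hi) {
		uint32_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(sha1, table + (size_t)mid * ABBREV_LEN,
				 ABBREV_LEN);
		if (!cmp)
			return (int)mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1;
}
```

A prefix match only identifies the commit if the prefixes are unique among the cached commits, so whoever writes the list has to choose a length that guarantees this or fall back to the full SHA-1.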
Jeff King writes:
> From this:
>
>> Then it will be very natural for the extension data that store the
>> commit metainfo to name objects in the pack the .idx file describes
>> by the offset in the SHA-1 table.
>
> I guess your argument is that putting it all in the same file makes it
> more natu
On Wed, Jan 30, 2013 at 8:56 PM, Duy Nguyen wrote:
> However, performance seems to suffer too. Maybe I do more lookups than
> necessary, I don't know.
Yes, I should have stored the position in the sha-1 <-> offset map
instead of the position of the object in .pack file. Even so,
performance does
On Tue, Jan 29, 2013 at 04:16:11AM -0500, Jeff King wrote:
> When we are doing a commit traversal that does not need to
> look at the commit messages themselves (e.g., rev-list,
> merge-base, etc), we spend a lot of time accessing,
> decompressing, and parsing the commit objects just to find
> the
> True, but it is even less headache if the file is totally separate and
> optional.
Once you start thinking about using an offset to some list of SHA-1,
perhaps? A section inside the same file can never go out of sync.
Also a longer-term advantage is that you can teach index-pack to do
this.
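
As a rough illustration of naming objects by their position in a sorted SHA-1 table instead of by full SHA-1, consider the sketch below. The struct, the NO_PARENT sentinel and sha1_at() are hypothetical; the only thing assumed about the table is that it is a contiguous, sorted array of 20-byte SHA-1s, as a pack .idx provides.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_PARENT 0xffffffffu /* hypothetical marker for "no such parent" */

/* Hypothetical cache record: object names stored as table positions. */
struct commit_pos_entry {
	uint32_t timestamp;
	uint32_t tree;
	uint32_t parent1;
	uint32_t parent2;
};

/*
 * Resolve a table position back to the 20-byte SHA-1 it names.
 * Returns NULL if pos is out of range (including NO_PARENT).
 */
const unsigned char *sha1_at(const unsigned char *table, uint32_t nr,
			     uint32_t pos)
{
	if (pos >= nr)
		return NULL;
	return table + (size_t)pos * 20;
}
```

The positions are only meaningful against the exact table they were computed from, which is the argument for keeping the cache in the same file as that table: the two cannot drift apart.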
On Wed, Jan 30, 2013 at 10:36:10AM +0700, Nguyen Thai Ngoc Duy wrote:
> On Tue, Jan 29, 2013 at 4:16 PM, Jeff King wrote:
> > +int commit_metapack(unsigned char *sha1,
> > + uint32_t *timestamp,
> > + unsigned char **tree,
> > + unsigned char
On Tue, Jan 29, 2013 at 10:08:08AM -0800, Junio C Hamano wrote:
> > In order to reduce the disk footprint and I/O cost, the future
> > direction for this mechanism may want to point into an existing
> > store of SHA-1 hashes with a shorter file offset, and the .idx file
> > could be such a store,
On Tue, Jan 29, 2013 at 09:38:10AM -0800, Junio C Hamano wrote:
> Jeff King writes:
>
> > +int commit_metapack(unsigned char *sha1,
> > + uint32_t *timestamp,
> > + unsigned char **tree,
> > + unsigned char **parent1,
> > + unsigned char **
On Tue, Jan 29, 2013 at 4:16 PM, Jeff King wrote:
> +int commit_metapack(unsigned char *sha1,
> + uint32_t *timestamp,
> + unsigned char **tree,
> + unsigned char **parent1,
> + unsigned char **parent2)
> +{
Nit picking. tree
Junio C Hamano writes:
> I am torn on this one.
>
> These cached properties of a single commit will not change no matter
> which pack it appears in, and it feels logically wrong, especially
> when you record these object names in the full SHA-1 form, to tie a
> "commit metapack" to a pack. Logic
Jeff King writes:
> +int commit_metapack(unsigned char *sha1,
> + uint32_t *timestamp,
> + unsigned char **tree,
> + unsigned char **parent1,
> + unsigned char **parent2)
> +{
> + struct commit_metapack *p;
> +
> + prepare_co
On Tue, Jan 29, 2013 at 11:24:45AM +0100, Michael Haggerty wrote:
> On 01/29/2013 10:16 AM, Jeff King wrote:
> > When we are doing a commit traversal that does not need to
> > look at the commit messages themselves (e.g., rev-list,
> > merge-base, etc), we spend a lot of time accessing,
> > decomp
On 01/29/2013 10:16 AM, Jeff King wrote:
> When we are doing a commit traversal that does not need to
> look at the commit messages themselves (e.g., rev-list,
> merge-base, etc), we spend a lot of time accessing,
> decompressing, and parsing the commit objects just to find
> the parent and timesta
When we are doing a commit traversal that does not need to
look at the commit messages themselves (e.g., rev-list,
merge-base, etc), we spend a lot of time accessing,
decompressing, and parsing the commit objects just to find
the parent and timestamp information. We can make a
space-time tradeoff b