On Wed, Oct 10 2018, SZEDER Gábor wrote:
> On Thu, Oct 04, 2018 at 11:09:58PM -0700, Junio C Hamano wrote:
>> SZEDER Gábor writes:
>>
>> >> git-gc - Cleanup unnecessary files and optimize the local repository
>> >>
>> >> Creating these indexes like the commit-graph falls under "optimize the
On Thu, Oct 04, 2018 at 11:09:58PM -0700, Junio C Hamano wrote:
> SZEDER Gábor writes:
>
> >> git-gc - Cleanup unnecessary files and optimize the local repository
> >>
> >> Creating these indexes like the commit-graph falls under "optimize the
> >> local repository",
> >
> > But it doesn't f
On Mon, Oct 08, 2018 at 11:08:03PM -0400, Jeff King wrote:
> I'd have done it as one fixed-size filter per commit. Then you should be
> able to hash the path keys once, and apply the result as a bitwise query
> to each individual commit (I'm assuming that it's constant-time to
> access the filter f
On Tue, Oct 09, 2018 at 03:03:08PM -0400, Derrick Stolee wrote:
> > I wonder if Roaring does better here.
>
> In these sparse cases, usually Roaring will organize the data as "array
> chunks" which are simply lists of the values. The thing that makes this
> still compressible is that we store two
On 10/9/2018 2:46 PM, Jeff King wrote:
On Tue, Oct 09, 2018 at 09:48:20AM -0400, Derrick Stolee wrote:
[I snipped all of the parts about bloom filters that seemed entirely
reasonable to me ;) ]
Imagine we have that list. Is a bloom filter still the best data
structure for each commit? At the
On Tue, Oct 09, 2018 at 09:48:20AM -0400, Derrick Stolee wrote:
> [I snipped all of the parts about bloom filters that seemed entirely
> reasonable to me ;) ]
> > Imagine we have that list. Is a bloom filter still the best data
> > structure for each commit? At the point that we have the complet
On Tue, Oct 09 2018, Derrick Stolee wrote:
> The filter needs to store every path that would be considered "not
> TREESAME". It can't store wildcards, so you would need to evaluate the
> wildcard and test all of those paths individually (not a good idea).
If full paths are stored, yes. But have
(Changing title to reflect the new topic.)
On 10/8/2018 11:08 PM, Jeff King wrote:
On Mon, Oct 08, 2018 at 02:29:47PM -0400, Derrick Stolee wrote:
There are two questions that I was hoping to answer by looking at
your code:
1. How do you store your Bloom filter? Is it connected to the commit-
On Mon, Oct 08, 2018 at 02:29:47PM -0400, Derrick Stolee wrote:
> > > > But I'm afraid it will take a while until I get around to turn it into
> > > > something presentable...
> > > Do you have the code pushed somewhere public where one could take a look?
> > > I
> > > Do you have the code pushed
SZEDER Gábor writes:
> There is certainly potential there. With a (very) rough PoC
> experiment, an 8MB bloom filter, and a carefully chosen path I can
> achieve a nice, almost 25x speedup:
>
> $ time git rev-list --count HEAD -- t/valgrind/valgrind.sh
> 6
>
> real    0m1.563s
> user
On 10/8/2018 2:10 PM, SZEDER Gábor wrote:
On Mon, Oct 08, 2018 at 12:57:34PM -0400, Derrick Stolee wrote:
Nice! These numbers make sense to me, in terms of how many TREESAME queries
we actually need to perform for such a query.
Yeah... because you didn't notice that I deliberately cheated :)
On Mon, Oct 08, 2018 at 12:57:34PM -0400, Derrick Stolee wrote:
> On 10/8/2018 12:41 PM, SZEDER Gábor wrote:
> >On Wed, Oct 03, 2018 at 03:18:05PM -0400, Jeff King wrote:
> >>I'm still excited about the prospect of a bloom filter for paths which
> >>each commit touches. I think that's the next big
On 10/8/2018 12:41 PM, SZEDER Gábor wrote:
On Wed, Oct 03, 2018 at 03:18:05PM -0400, Jeff King wrote:
I'm still excited about the prospect of a bloom filter for paths which
each commit touches. I think that's the next big frontier in getting
things like "git log -- path" to a reasonable run-time
On Wed, Oct 03, 2018 at 03:18:05PM -0400, Jeff King wrote:
> I'm still excited about the prospect of a bloom filter for paths which
> each commit touches. I think that's the next big frontier in getting
> things like "git log -- path" to a reasonable run-time.
There is certainly potential there.
On Fri, Oct 05, 2018 at 10:01:31PM +0200, Ævar Arnfjörð Bjarmason wrote:
> > There's unfortunately not a fast way of doing that. One option would be
> > to keep a counter of "ungraphed commit objects", and have callers update
> > it. Anybody admitting a pack via index-pack or unpack-objects can ea
On Fri, Oct 05, 2018 at 04:00:12PM -0400, Derrick Stolee wrote:
> On 10/5/2018 3:47 PM, Jeff King wrote:
> > On Fri, Oct 05, 2018 at 03:41:40PM -0400, Derrick Stolee wrote:
> >
> > > > So can we really just take (total_objects - commit_graph_objects) and
> > > > compare it to some threshold?
> >
On Fri, Oct 05 2018, Jeff King wrote:
> On Fri, Oct 05, 2018 at 03:41:40PM -0400, Derrick Stolee wrote:
>
>> > So can we really just take (total_objects - commit_graph_objects) and
>> > compare it to some threshold?
>>
>> The commit-graph only stores the number of _commits_, not total objects.
>
On 10/5/2018 3:47 PM, Jeff King wrote:
On Fri, Oct 05, 2018 at 03:41:40PM -0400, Derrick Stolee wrote:
So can we really just take (total_objects - commit_graph_objects) and
compare it to some threshold?
The commit-graph only stores the number of _commits_, not total objects.
Oh, right, of cou
On Fri, Oct 05, 2018 at 03:41:40PM -0400, Derrick Stolee wrote:
> > So can we really just take (total_objects - commit_graph_objects) and
> > compare it to some threshold?
>
> The commit-graph only stores the number of _commits_, not total objects.
Oh, right, of course. That does throw a monkey
On 10/5/2018 3:21 PM, Jeff King wrote:
On Fri, Oct 05, 2018 at 09:45:47AM -0400, Derrick Stolee wrote:
My misunderstanding was that your proposed change to gc computes the
commit-graph in either of these two cases:
(1) The auto-GC threshold is met.
(2) There is no commit-graph file.
And what
On Fri, Oct 05, 2018 at 09:45:47AM -0400, Derrick Stolee wrote:
> My misunderstanding was that your proposed change to gc computes the
> commit-graph in either of these two cases:
>
> (1) The auto-GC threshold is met.
>
> (2) There is no commit-graph file.
>
> And what I hope to have instead of
On Fri, Oct 05 2018, Derrick Stolee wrote:
> On 10/5/2018 9:05 AM, Ævar Arnfjörð Bjarmason wrote:
>> On Fri, Oct 05 2018, Derrick Stolee wrote:
>>
>>> On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote:
I don't have time to polish this up for submission now, but here's a WIP
patch tha
On 10/5/2018 9:05 AM, Ævar Arnfjörð Bjarmason wrote:
On Fri, Oct 05 2018, Derrick Stolee wrote:
On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote:
I don't have time to polish this up for submission now, but here's a WIP
patch that implements this, highlights:
* There's a gc.clone.autoDet
On Fri, Oct 05 2018, Derrick Stolee wrote:
> On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote:
>> I don't have time to polish this up for submission now, but here's a WIP
>> patch that implements this, highlights:
>>
>> * There's a gc.clone.autoDetach=false default setting which overrides
>
On 10/4/2018 5:42 PM, Ævar Arnfjörð Bjarmason wrote:
I don't have time to polish this up for submission now, but here's a WIP
patch that implements this, highlights:
* There's a gc.clone.autoDetach=false default setting which overrides
gc.autoDetach if 'git gc --auto' is run via git-clone
SZEDER Gábor writes:
>> git-gc - Cleanup unnecessary files and optimize the local repository
>>
>> Creating these indexes like the commit-graph falls under "optimize the
>> local repository",
>
> But it doesn't fall under "cleanup unnecessary files", which the
> commit-graph file is, since,
On Wed, Oct 03 2018, Ævar Arnfjörð Bjarmason wrote:
> Don't have time to patch this now, but thought I'd send a note / RFC
> about this.
>
> Now that we have the commit graph it's nice to be able to set
> e.g. core.commitGraph=true & gc.writeCommitGraph=true in ~/.gitconfig or
> /etc/gitconfig t
On Wed, Oct 03 2018, Jeff King wrote:
> On Wed, Oct 03, 2018 at 12:08:15PM -0700, Stefan Beller wrote:
>
>> I share these concerns in a slightly more abstract way, as
>> I would bucket the actions into two separate bins:
>>
>> One bin that throws away information.
>> this would include removing
On Wed, Oct 03, 2018 at 12:08:15PM -0700, Stefan Beller wrote:
> I share these concerns in a slightly more abstract way, as
> I would bucket the actions into two separate bins:
>
> One bin that throws away information.
> this would include removing expired reflog entries (which
> I do not think a
On Wed, Oct 03, 2018 at 02:59:34PM -0400, Derrick Stolee wrote:
> > They don't help yet, and there's no good reason to enable bitmaps for
> > clients. I have a few patches that use bitmaps for things like
> > ahead/behind and --contains checks, but the utility of those may be
> > lessened quite a
>
> But you thought right, I do have an objection against that. 'git gc'
> should, well, collect garbage. Any non-gc stuff is already violating
> separation of concerns.
I share these concerns in a slightly more abstract way, as
I would bucket the actions into two separate bins:
One bin that th
On 10/3/2018 2:51 PM, Jeff King wrote:
On Wed, Oct 03, 2018 at 08:47:11PM +0200, Ævar Arnfjörð Bjarmason wrote:
On Wed, Oct 03 2018, Stefan Beller wrote:
So we wouldn't be spending 5 minutes repacking linux.git right after
cloning it, just ~10s generating the commit graph, and the same would
On Wed, Oct 03, 2018 at 08:47:11PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Wed, Oct 03 2018, Stefan Beller wrote:
>
> >> So we wouldn't be spending 5 minutes repacking linux.git right after
> >> cloning it, just ~10s generating the commit graph, and the same would
> >> happen if you rm'd .g
On Wed, Oct 03 2018, Stefan Beller wrote:
>> So we wouldn't be spending 5 minutes repacking linux.git right after
>> cloning it, just ~10s generating the commit graph, and the same would
>> happen if you rm'd .git/objects/info/commit-graph and ran "git commit",
>> which would kick off "gc --auto"
> So we wouldn't be spending 5 minutes repacking linux.git right after
> cloning it, just ~10s generating the commit graph, and the same would
> happen if you rm'd .git/objects/info/commit-graph and ran "git commit",
> which would kick off "gc --auto" in the background and do the same thing.
Or gen
On Wed, Oct 03, 2018 at 05:19:41PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >> >> >> So we should make "git gc --auto" be run on clone,
> >> >> >
> >> >> > There is no garbage after 'git clone'...
> >> >>
> >> >> "git gc" is really "git gc-or-create-indexes" these days.
> >> >
> >> > Because it happ
On Wed, Oct 3, 2018 at 3:23 PM Ævar Arnfjörð Bjarmason wrote:
>
> Don't have time to patch this now, but thought I'd send a note / RFC
> about this.
>
> Now that we have the commit graph it's nice to be able to set
> e.g. core.commitGraph=true & gc.writeCommitGraph=true in ~/.gitconfig or
> /etc/g
On Wed, Oct 03 2018, SZEDER Gábor wrote:
> On Wed, Oct 03, 2018 at 04:22:12PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Wed, Oct 03 2018, SZEDER Gábor wrote:
>>
>> > On Wed, Oct 03, 2018 at 04:01:40PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> >>
>> >> On Wed, Oct 03 2018, SZEDER Gábor wrote:
On Wed, Oct 03, 2018 at 04:22:12PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Wed, Oct 03 2018, SZEDER Gábor wrote:
>
> > On Wed, Oct 03, 2018 at 04:01:40PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >>
> >> On Wed, Oct 03 2018, SZEDER Gábor wrote:
> >>
> >> > On Wed, Oct 03, 2018 at 03:23:57PM +0
On Wed, Oct 3, 2018 at 4:01 PM Ævar Arnfjörð Bjarmason wrote:
> >> and change the
> >> need_to_gc() / cmd_gc() behavior so that we detect that the
> >> gc.writeCommitGraph=true setting is on, but we have no commit graph, and
> >> then just generate that without doing a full repack.
> >
> > Or just
On Wed, Oct 03 2018, SZEDER Gábor wrote:
> On Wed, Oct 03, 2018 at 04:01:40PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>
>> On Wed, Oct 03 2018, SZEDER Gábor wrote:
>>
>> > On Wed, Oct 03, 2018 at 03:23:57PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> >> Don't have time to patch this now, but thought
On Wed, Oct 03 2018, Derrick Stolee wrote:
> On 10/3/2018 9:36 AM, SZEDER Gábor wrote:
>> On Wed, Oct 03, 2018 at 03:23:57PM +0200, Ævar Arnfjörð Bjarmason wrote:
>>> Don't have time to patch this now, but thought I'd send a note / RFC
>>> about this.
>>>
>>> Now that we have the commit graph it
On Wed, Oct 03, 2018 at 04:01:40PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Wed, Oct 03 2018, SZEDER Gábor wrote:
>
> > On Wed, Oct 03, 2018 at 03:23:57PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >> Don't have time to patch this now, but thought I'd send a note / RFC
> >> about this.
> >>
> >>
On Wed, Oct 03 2018, SZEDER Gábor wrote:
> On Wed, Oct 03, 2018 at 03:23:57PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> Don't have time to patch this now, but thought I'd send a note / RFC
>> about this.
>>
>> Now that we have the commit graph it's nice to be able to set
>> e.g. core.commitGraph=
On 10/3/2018 9:36 AM, SZEDER Gábor wrote:
On Wed, Oct 03, 2018 at 03:23:57PM +0200, Ævar Arnfjörð Bjarmason wrote:
Don't have time to patch this now, but thought I'd send a note / RFC
about this.
Now that we have the commit graph it's nice to be able to set
e.g. core.commitGraph=true & gc.write
On Wed, Oct 03, 2018 at 03:23:57PM +0200, Ævar Arnfjörð Bjarmason wrote:
> Don't have time to patch this now, but thought I'd send a note / RFC
> about this.
>
> Now that we have the commit graph it's nice to be able to set
> e.g. core.commitGraph=true & gc.writeCommitGraph=true in ~/.gitconfig or
Don't have time to patch this now, but thought I'd send a note / RFC
about this.
Now that we have the commit graph it's nice to be able to set
e.g. core.commitGraph=true & gc.writeCommitGraph=true in ~/.gitconfig or
/etc/gitconfig to apply them to all repos.
But when I clone e.g. linux.git stuff