Re: Repacking a repository uses up all available disk space

2016-06-12 Thread Nasser Grainawi
On Jun 12, 2016, at 4:13 PM, Jeff King wrote:
> 
>At GitHub we actually have a patch to `repack` that keeps all
>objects, reachable or not, in the pack, and use it for all of our
>automated maintenance. Since we don't drop objects at all, we can't
>ever have such a race. Aside from some pathological cases, it wastes
>much less space than you'd expect. We turn the flag off for special
>cases (e.g., somebody has rewound history and wants to expunge a
>sensitive object).
> 
>I'm happy to share the "keep everything" patch if you're interested.

We have the same kind of patch actually (for the same reason), but back on the 
shell implementation of repack. It'd be great if you could share your modern 
version.

Nasser

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
a Linux Foundation Collaborative Project

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Why does send-pack call pack-objects for all remote refs?

2015-12-11 Thread Nasser Grainawi

> On Dec 9, 2015, at 9:19 PM, Jeff King wrote:
> 
> On Tue, Dec 08, 2015 at 05:34:43PM +, Daniel Koverman wrote:
> 
>> It is also good to know that 2000 remote refs is insane. The lower
>> hanging fruit here sounds like trimming that to a reasonable
>> number, so I'll try that approach first.
> 
> It's definitely a lot, but it's not unheard of. The git project has over
> 500 tags. That's not 2000, but you're within an order of magnitude.
> 
> I have seen repositories with 20,000+ tags. I consider that a bit more
> ridiculous, but it does work in practice.
> 

We have one at $DAY_JOB with 400,000+ refs. It presents some issues, but
Martin has raised those with the community and it works pretty well now.

Ref advertisement is still a pain...

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, 
a Linux Foundation Collaborative Project



Re: Git push race condition?

2014-03-24 Thread Nasser Grainawi
On Mar 24, 2014, at 4:54 PM, Jeff King <p...@peff.net> wrote:

> On Mon, Mar 24, 2014 at 03:18:14PM -0400, Scott Sandler wrote:
>
>> I've noticed that a few times in the past several weeks, we've had
>> events where pushes have been lost when two people pushed at just
>> about the same time. The scenario is that two users both have commits
>> based on commit A, call them B and B'. The user with commit B pushes
>> at about the same time as the user who pushes B'. Both pushes are
>> determined to be fast-forwards and both succeed, but B' overwrites B
>> and B is no longer on origin/master. The server does have B in its
>> .git directory but the commit isn't on any branch.
>
> What version of git are you running on the server? Is it possible that
> there is a simultaneous process running `git pack-refs` (e.g., a `git
> gc` run by a cron job or similar)?

`git gc --auto` could be getting triggered as well, so if you suspect
that, you could set gc.auto=0 on the server side.

 
> There were some race conditions fixed last year wherein git could see
> stale values of refs, but I do not think they could impact writing to a
> ref like this.  When we take the lock on the ref, we always go straight
> to the filesystem, so the value we see is up-to-date.
>
> -Peff
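
To make the lock-then-read point concrete, here is a minimal Python sketch of a compare-and-swap on a loose ref file. It is only an illustration of the idea, not git's actual code: the names `update_ref` and `demo` are made up for this example, and a real ref store also has to deal with packed refs, reflogs and stale locks.

```python
import os
import tempfile


def update_ref(ref_path, expected_old, new_value):
    """Update a ref file only if it still holds expected_old.

    Hypothetical helper: takes <ref>.lock first, then re-reads the ref
    from disk, so the comparison never uses a stale cached value.
    """
    lock_path = ref_path + ".lock"
    try:
        # O_EXCL makes the open fail if another writer already holds the lock.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        # Read the current value only after the lock is held.
        with open(ref_path) as ref_file:
            current = ref_file.read().strip()
        matches = current == expected_old
        if matches:
            lock_file.write(new_value + "\n")
    if not matches:
        # Someone else moved the ref; drop the lock and reject the update.
        os.unlink(lock_path)
        return False
    # Renaming the lock file over the ref publishes the new value atomically.
    os.replace(lock_path, ref_path)
    return True


def demo():
    """Two pushes based on A: the first wins, the second is rejected."""
    with tempfile.TemporaryDirectory() as tmp:
        ref = os.path.join(tmp, "master")
        with open(ref, "w") as f:
            f.write("A\n")
        first = update_ref(ref, "A", "B")
        second = update_ref(ref, "A", "B'")
        with open(ref) as f:
            final = f.read().strip()
        return first, second, final
```

With this scheme the second writer sees B rather than A once it holds the lock, so it cannot silently overwrite B, which is why a lost push like the one described points at something outside the normal update path.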

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora 
Forum, hosted by The Linux Foundation



Re: [PATCH] repack: add `repack.honorpackkeep` config var

2014-02-28 Thread Nasser Grainawi
On Feb 28, 2014, at 1:55 AM, Jeff King <p...@peff.net> wrote:

> On Thu, Feb 27, 2014 at 10:04:44AM -0800, Junio C Hamano wrote:
>
>> I wonder if it makes sense to link it with pack.writebitmaps more
>> tightly, without even exposing it as a seemingly orthogonal knob
>> that can be tweaked, though.
>>
>> I think that is because I do not fully understand the ", because ..."
>> part of the below:
>>
>>> This patch introduces an option to disable the
>>> `--honor-pack-keep` option.  It is not triggered by default,
>>> even when pack.writeBitmaps is turned on, because its use
>>> depends on your overall packing strategy and use of .keep
>>> files.
>>
>> If you ask --write-bitmap-index (or have pack.writeBitmaps on), you
>> do want the bitmap-index to be written, and unless you tell
>> pack-objects to ignore the .keep marker, it cannot do so, no?
>>
>> Does the ", because ..." part above mean you may have an overall
>> packing strategy to use .keep file to not ever repack some subset of
>> the objects, so we will not silently explode the kept objects into a
>> new pack?
>
> Exactly. The two features (bitmaps and .keep) are not compatible with
> each other, so you have to prioritize one. If you are using static .keep
> files, you might want them to continue being respected at the expense of
> using bitmaps for that repo. So I think you want a separate option from
> --write-bitmap-index to allow the appropriate flexibility.

Has anyone thought about how to make them compatible? We're using Martin Fick's 
git-exproll script which makes heavy use of keeps to reduce pack file churn. In 
addition to the on-disk benefits we get there, the driving factor behind 
creating exproll was to prevent Gerrit from having two large (30GB+) mostly 
duplicated pack files open in memory at the same time. Repacking in JGit would 
help in a single-master environment, but we'd be back to having this problem 
once we go to a multi-master setup.

Perhaps the solution here is actually something in JGit where it could 
aggressively try to close references to pack files, but that still doesn't help 
the disk churn problem. As Peff says below, we would want to repack often to 
get up-to-date bitmaps, but ideally we could do that without writing hundreds 
of GBs to disk (which is obviously worse when the disk is an NFS mount).

 
> The default is another matter.  I think most people using .bitmaps on a
> server would probably want to set repack.packKeptObjects.  They would
> want to repack often to take advantage of the .bitmaps anyway, so they
> probably don't care about .keep files (any they see are due to races
> with incoming pushes).
>
> So we could do something like falling back to turning the option on if
> --write-bitmap-index is on _and_ the user didn't specify
> --pack-kept-objects. The existing default is mostly there because it is
> the conservative choice (.keep files continue to do their thing as
> normal unless you say otherwise). But the fallback thing would be one
> less knob that bitmap users would need to turn in the common case.
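
As a rough illustration (not git's code), the fallback described here is a three-state option: an explicit setting always wins, and only an unset value follows the bitmap setting. The function name and the use of `None` for "unset" are assumptions made for this sketch.

```python
def resolve_pack_kept_objects(configured, write_bitmap):
    """Return the effective pack-kept-objects setting.

    configured:   True or False if the user set the option explicitly,
                  None if it was left unset (hypothetical representation).
    write_bitmap: whether a bitmap index is being written.
    """
    if configured is None:
        # Unset: fall back to whatever the bitmap setting is.
        return write_bitmap
    # Explicit choice: respected even when bitmaps are being written.
    return configured
```

So a user who relies on static .keep files can still pass the negative form of the option explicitly and keep them honored while bitmaps are requested.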
 
> Here's the interdiff for doing the fallback:
>
> ---
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 3a3d84f..a8ddc7f 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2139,7 +2139,9 @@ repack.usedeltabaseoffset::
>  repack.packKeptObjects::
>  	If set to true, makes `git repack` act as if
>  	`--pack-kept-objects` was passed. See linkgit:git-repack[1] for
> -	details. Defaults to false.
> +	details. Defaults to `false` normally, but `true` if a bitmap
> +	index is being written (either via `--write-bitmap-index` or
> +	`pack.writeBitmaps`).
>
>  rerere.autoupdate::
>  	When set to true, `git-rerere` updates the index with the
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 49947b2..6b0b62d 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -9,7 +9,7 @@
>  #include "argv-array.h"
>
>  static int delta_base_offset = 1;
> -static int pack_kept_objects;
> +static int pack_kept_objects = -1;
>  static char *packdir, *packtmp;
>
>  static const char *const git_repack_usage[] = {
> @@ -190,6 +190,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>  	argc = parse_options(argc, argv, prefix, builtin_repack_options,
>  				git_repack_usage, 0);
>
> +	if (pack_kept_objects < 0)
> +		pack_kept_objects = write_bitmap;
> +
>  	packdir = mkpathdup("%s/pack", get_object_directory());
>  	packtmp = mkpathdup("%s/.tmp-%d-pack", packdir, (int)getpid());
>
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index f8431a8..b1eed5c 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -21,7 +21,7 @@ test_expect_success 'objects in packs marked .keep are not repacked' '
>  	objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 |
>  		sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") &&
>  	mv pack-* .git/objects/pack/ &&
> -	git repack -A -d -l &&
> +	git repack --no-pack-kept-objects -A -d -l &&
>  	git prune-packed &&
>  	for p in

Re: RFE: support change-id generation natively

2013-10-23 Thread Nasser Grainawi

On Oct 23, 2013, at 8:07 PM, Duy Nguyen wrote:

> On Wed, Oct 23, 2013 at 11:00 PM, Junio C Hamano <gits...@pobox.com> wrote:
>> Duy Nguyen <pclo...@gmail.com> writes:
>>
>>> On Wed, Oct 23, 2013 at 2:50 AM, Junio C Hamano <gits...@pobox.com> wrote:
>>>> It would be just the matter of updating commit_tree_extended() in
>>>> commit.c to:
>>>>
>>>>  - detect the need to add a new Change-Id: trailer;
>>>>
>>>>  - call hash_sha1_file() on the commit object buffer (assuming that
>>>>    a commit object that you can actually "git cat-file commit" using
>>>>    the change Id does not have to exist anywhere for Gerrit to
>>>>    work---otherwise you would need to call write_sha1_file()
>>>>    instead) before adding Change-Id: trailer;
>>>>
>>>>  - add Change-Id: trailer to the buffer; and then finally
>>>>
>>>>  - let the existing write_sha1_file() to write it out.
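
For readers following along, the steps listed above can be sketched in a few lines of Python. This is only an approximation under stated assumptions: `git_object_id` mimics what hashing a commit buffer amounts to (SHA-1 over a "<type> <size>\0" header plus the body), `add_change_id` is a made-up name, the "I" prefix follows the Gerrit Change-Id convention, and the trailer is simply appended as its own paragraph rather than merged into an existing trailer block.

```python
import hashlib


def git_object_id(obj_type, body):
    """SHA-1 object name for an object of the given type and raw body."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def add_change_id(commit_buffer):
    """Append a Change-Id trailer derived from the trailer-less buffer.

    commit_buffer is the raw commit object body: header lines, a blank
    line, then the message. Hypothetical helper for illustration only.
    """
    # Step 1: detect whether a trailer is needed at all.
    if b"\nChange-Id: " in commit_buffer:
        return commit_buffer
    # Step 2: hash the buffer before the trailer is added.
    change_id = "I" + git_object_id("commit", commit_buffer)
    # Step 3: add the trailer; writing the object out is left to the caller.
    trimmed = commit_buffer.rstrip(b"\n")
    return trimmed + b"\n\nChange-Id: " + change_id.encode() + b"\n"
```

The point of doing it at this level is that the full buffer, including the tree, parent, author and committer lines, is already in hand, so nothing has to be guessed.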
 
>>> I'm not objecting special support for Gerrit, but if the change is
>>> just commit_tree_extended() why don't we just ship the commit hook in
>>> a new Gerrit template?
>>
>> It is not clear to me how you envision to make it work.
>
> I don't have the source code.

Now you do: 
https://gerrit.googlesource.com/gerrit/+/master/gerrit-server/src/main/resources/com/google/gerrit/server/tools/root/hooks/commit-msg

> But the commit-msg hook document [1]
> describes roughly what you wrote below, except the tree part. And I
> suppose the hook has been working fine so far. Reading back the
> original post, James ruled out always-active hooks in general and
> wanted the control per command line. Perhaps we should add
> --no-hooks[=name,name] to git commit? Or maybe it's still
> inconvenient and --change-id is best.
>
> [1] http://gerrit-documentation.googlecode.com/svn/Documentation/2.0/cmd-hook-commit-msg.html
>
>> Naïvely thinking, an obvious place to do this kind of thing may be
>> the commit-msg hook, where the hook reads what the user prepared,
>> finds that there is no existing Change-Id: trailer, and decides to
>> add one.
>>
>> But what value would it add on that line as the Id?
>>
>> It wants to use the name of the commit object that would result if
>> it were to return without further editing the given message, but we
>> do not give such a commit object name to the hook, so the hook needs
>> to duplicate the logic to come up with one.  It may be doable (after
>> all, builtin/commit.c is open source), but we do not give the hook
>> the commit object header (i.e. it does not know what the tree,
>> parent(s), author, committer lines would say, nor it does not know
>> if we are going to add an encoding line), so the hook needs to guess
>> what we will put there, too.
> -- 
> Duy

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation
