Any puppet users? Drafting PPG
Puppet is often used with git as the mechanism to publish/distribute the configuration. This sidesteps the not-very-scalable central Puppet server. But the use of git isn't sophisticated in the least. Git can help in a few ways, IMO, and this is my initial approach to the topic:

https://groups.google.com/forum/?fromgroups#!topic/puppet-users/OilxMytnD_k

No fun in building this bike shed all alone :-)

m
--
martin.langh...@gmail.com
- ask interesting questions
- don't get distracted with shiny stuff
- working code first ~ http://docs.moodle.org/en/User:Martin_Langhoff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Pitfalls in auto-fast-forwarding heads that are not checked out?
I am building a small git wrapper around puppet, and one of the actions it performs is auto-fast-forwarding of branches without checking them out. In simplified code... we ensure that we are on a head called master, and in some cases "ppg commit" will commit to master and...

    ## early on
    # sanity-check we are on master
    headname=$(git rev-parse --symbolic-full-name --revs-only HEAD)
    if [ $headname -ne refs/heads/headname ]; then
        echo >&2 "ERROR: can only issue --immediate commit from the master branch!"
        exit 1
    fi

    ## then
    git commit -bla blarg baz

    ## and then...
    # ensure we can ff
    head_sha1=$(git rev-parse --revs-only master)
    mb=$(git merge-base $production_sha1 refs/heads/master)
    if [[ $mb -ne $production_sha1 ]]; then
        echo >&2 "ERROR: cannot fast-forward master to production"
        exit 1
    fi
    $GIT_EXEC_PATH/git-update-ref -m "ppg immediate commit" \
        refs/heads/production $head_sha1 $production_sha1 || exit 1

Are there major pitfalls in this approach? I cannot think of any, but git has stayed away from updating my local tracking branches; so maybe there's a reason for that...

cheers,

m
Re: Pitfalls in auto-fast-forwarding heads that are not checked out?
On Sat, May 4, 2013 at 3:34 AM, Johannes Sixt <j...@kdbg.org> wrote:
> You mean refs/heads/master and != here, because -ne is numeric
> comparison in a shell script.

thanks! Yeah, I fixed those up late last night :-)

> Since git 1.8.0 you can express this check as
>
>     if git merge-base --is-ancestor $production_sha1 refs/heads/master

Ah, that's great! Unfortunately it's not there in earlier / more widely used releases of git.

>> Are there major pitfalls in this approach?
>
> I don't think there are.

Thanks...

>> I cannot think of any, but git has stayed away from updating my local
>> tracking branches; so maybe there's a reason for that...
>
> I don't understand what you are saying here. What is that?

When I do git pull, git is careful to only update the branch I have checked out (if appropriate). It leaves any other branches that track branches on the remote that has just been fetched untouched.

I always thought that at some point git pull would learn to evaluate those branches and auto-merge them if the merge is a ff. I would find that a natural bit of automation in git pull. Of course it would mean a change of semantics; existing scripts could be affected.

cheers,

m
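[Editor's note: a minimal runnable sketch of the guard using the merge-base --is-ancestor form suggested in this thread (git >= 1.8.0), shown in a throwaway repo; the branch names and commit message are illustrative.]

```shell
#!/bin/sh
# Fast-forward refs/heads/production to master's tip, refusing
# anything that is not a pure fast-forward. Demo in a scratch repo.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
ci() { git -c user.name=d -c user.email=d@example.com commit -q --allow-empty -m "$1"; }
ci first
git branch production        # production lags behind...
ci second                    # ...while master moves ahead

production_sha1=$(git rev-parse --verify refs/heads/production)
head_sha1=$(git rev-parse --verify refs/heads/master)

# exits 0 iff $production_sha1 is an ancestor of master, i.e. the
# update below is a pure fast-forward
if ! git merge-base --is-ancestor "$production_sha1" refs/heads/master; then
    echo >&2 "ERROR: cannot fast-forward production to master"
    exit 1
fi

# logged ref update; passing the old value guards against races
git update-ref -m 'ppg immediate commit' \
    refs/heads/production "$head_sha1" "$production_sha1"
```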
offtopic: ppg design decisions - encapsulation
[ Unashamedly offtopic... asking here because I like git design and coding style, and ppg is drawing plenty of inspiration from the old git shell scripts. Please kindly flame me privately... ]

ppg is a wrapper around git to maintain and distribute Puppet configs, adding a few niceties. Now, ppg will actually manage two git repositories -- one for the puppet configs, the second one for ferrying the puppet run results back to the originating repo (where they get loaded into a puppet dashboard server for cute web-based reporting).

The puppet config repo is a normally-behaved git repo. The reports repo is a bit of a hack -- never used directly by the user, it will follow a store-and-forward scheme, where I should trim old history or just use something other than git.

So I see two possible repo layouts:

- Transparent, nested:

    .git/              # holding puppet configs, allows direct use of git commands
    .git/reports.git   # nested inside puppet configs repo

- Mediated, parallel:

    .ppg/puppet.git    # all git commands must be wrapped
    .ppg/reports.git

My laziness and laissez-faire take on things drives me to use the transparent/nested approach. Let the user do anything in there directly with git.

OTOH, the mediated approach allows for more complete support, including sanity checks on commands that don't have hooks available. I already have a /usr/bin/ppg wrapper, which I could use to wrap all git commands, setting GIT_DIR=.ppg/puppet.git for all ops. It would force ops to be run from the top of the tree (unless I write explicit support for running elsewhere). And it would break related tools that are not mediated via /usr/bin/git (gitk!).

Written this way, it seems to be a minimal lazy approach vs DTRT. Am I missing any important aspect or option? Thoughts?
cheers,

m
Re: Pitfalls in auto-fast-forwarding heads that are not checked out?
On Sat, May 4, 2013 at 2:51 PM, Jonathan Nieder <jrnie...@gmail.com> wrote:
> Another trick is to use git push:
>
>     git push . $production_sha1:refs/heads/master

Great trick -- thanks! In use in ppg now :-)

m
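[Editor's note: the nice property of the push-to-self trick is that git push refuses non-fast-forward ref updates by default, so the ancestry check comes for free. A quick sketch in a throwaway repo (branch names are illustrative):]

```shell
#!/bin/sh
# Update a branch that is not checked out by pushing to '.',
# i.e. to this very repository.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
ci() { git -c user.name=d -c user.email=d@example.com commit -q --allow-empty -m "$1"; }
ci base
git branch production        # production starts here...
ci next                      # ...and master moves ahead

# fast-forward production without checking it out; push would
# reject this refspec if it were not a fast-forward
git push -q . refs/heads/master:refs/heads/production
```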
Re: offtopic: ppg design decisions - encapsulation
On Mon, May 6, 2013 at 11:53 AM, John Keeping <j...@keeping.me.uk> wrote:
> I'm not sure I fully understand what the reports are, but it sounds
> like they are closely related to original configuration commits. If
> that is the case, have you considered using Git notes instead of a
> separate repository?

Interesting suggestion! I read up on git-notes. Yes, reports are closely related to a commit -- each is a log of the execution of puppet with that config on a client node.

At the same time, we have one report per change deployment, per client -- with thousands of clients. So it will be a large dataset, and a transient one -- I intend to use git as a store-and-forward mechanism for the reports, and it is safe/sane to forget old reports.

I don't see much ease-of-expiry in the notes, so I guess I would have to write that myself, which complicates things a bit :-)

cheers,

m
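[Editor's note: for anyone weighing the same option, the suggestion above maps to something like this; the repo, notes ref name, and report message are all illustrative.]

```shell
#!/bin/sh
# Attach a per-client puppet report to the deployed commit with
# git notes, in a scratch repo.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git -c user.name=d -c user.email=d@example.com \
    commit -q --allow-empty -m 'deploy config v1'

# one notes ref per client keeps reports separable, and a whole
# notes ref can later be deleted wholesale to expire its data
git -c user.name=d -c user.email=d@example.com \
    notes --ref=refs/notes/reports/client42 \
    add -m 'puppet run: 0 failures' HEAD

git notes --ref=refs/notes/reports/client42 show HEAD
```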
Re: [PATCH] gitk: add support for -G'regex' pickaxe variant
I just did git rebase origin/master for the umpteenth time, which reminded me this nice patch is still pending. ping?

m

On Thu, Jun 14, 2012 at 2:34 PM, Zbigniew Jędrzejewski-Szmek <zbys...@in.waw.pl> wrote:
> From: Martin Langhoff <mar...@laptop.org>
>
> git log -G'regex' is a very usable alternative to the classic
> pickaxe. Minimal patch to make it usable from gitk.
>
> [zj: reword message]
>
> Signed-off-by: Zbigniew Jędrzejewski-Szmek <zbys...@in.waw.pl>
> ---
> Martin's off on holidays, so I'm sending v2 after rewording.
>
>  gitk-git/gitk | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/gitk-git/gitk b/gitk-git/gitk
> index 22270ce..24eaead 100755
> --- a/gitk-git/gitk
> +++ b/gitk-git/gitk
> @@ -2232,7 +2232,8 @@ proc makewindow {} {
>      set gm [makedroplist .tf.lbar.gdttype gdttype \
>                 [mc "containing:"] \
>                 [mc "touching paths:"] \
> -               [mc "adding/removing string:"]]
> +               [mc "adding/removing string:"] \
> +               [mc "with changes matching regex:"]]
>      trace add variable gdttype write gdttype_change
>      pack .tf.lbar.gdttype -side left -fill y
> @@ -4595,6 +4596,8 @@ proc do_file_hl {serial} {
>          set gdtargs [concat -- $relative_paths]
>      } elseif {$gdttype eq [mc "adding/removing string:"]} {
>          set gdtargs [list -S$highlight_files]
> +    } elseif {$gdttype eq [mc "with changes matching regex:"]} {
> +        set gdtargs [list -G$highlight_files]
>      } else {
>          # must be containing:, i.e. we're searching commit info
>          return
> --
> 1.7.11.rc3.129.ga90bc7a.dirty
Misusing git: trimming old commits
I am misusing git as a store-and-forward tool to transfer reports to a server in a resilient manner. The context is puppet (and ppg, I've spammed the list about it...). The reports are small, with small deltas, but created frequently.

With the exception of the final destination, I want to expire reports that are old and successfully transferred. My current approach is (bash-y pseudocode):

    git push bla bla || exit $?
    prunehash=$(git log -n1 --until=one.month.ago --pretty=format:%H)
    test -z "$prunehash" && exit 0

    # prep empty commit
    # skip git read-tree --empty, we get the same with
    export GIT_INDEX_FILE=/does/not/exist
    empty_tree=$(git write-tree)
    unset GIT_INDEX_FILE
    empty_commit=$(git commit-tree -m empty $empty_tree)
    echo "${prunehash} ${empty_commit}" >> .git/info/grafts
    git gc
    # TODO: cleanup stale grafts :-)

is this a reasonable approach?

m
Re: Misusing git: trimming old commits
On Thu, May 9, 2013 at 11:40 AM, Martin Langhoff <martin.langh...@gmail.com> wrote:
> With the exception of the final destination, I want to expire reports
> that are old and successfully transferred.

OK, that took some effort to make work. Make sure you are not using reflogs (or that reflogs are promptly expired).

    # right after a successful push of all heads to the receiving server...
    for head in $(git branch | sed 's/^..//'); do
        # FIXME period
        graft_sha1=$(git log --until=one.month.ago -n1 --pretty=format:%H ${head})
        if [[ -z $graft_sha1 ]]; then
            # nothing to prune
            continue
        fi
        # is it already grafted?
        if grep -q "^${graft_sha1}" ${GIT_DIR}/info/grafts 2>/dev/null ; then
            # don't duplicate the graft
            continue
        fi
        some_grafted='true'
        # prep empty commit
        # skip git read-tree --empty, we get the same with
        export GIT_INDEX_FILE=/tmp/ppg-emptytree.$$
        empty_tree=$(git write-tree)
        rm -f ${GIT_INDEX_FILE}
        unset GIT_INDEX_FILE
        empty_commit=$(git commit-tree -m empty ${empty_tree})
        echo "${graft_sha1} ${empty_commit}" >> ${GIT_DIR}/info/grafts
    done

    if [[ -z $some_grafted ]]; then
        # nothing to do
        exit 0
    fi

    # ppg-repack makes the unreachable objects loose
    # (it is git-repack hacked to remove --keep-true-parents),
    # git prune --expire actually deletes them
    $PPG_EXEC_PATH/ppg-repack -AFfd
    git prune --expire=now

    ### Cleanup stale grafts
    # current graft points are reachable,
    # pruned graft points (made obsolete by a newer graft)
    # cannot be retrieved and git cat-file exit code is 128
    touch ${GIT_DIR}/info/grafts.tmp.$$
    while read line; do
        graftpoint=$(echo ${line} | cut -d' ' -f1)
        if git cat-file commit ${graftpoint} >/dev/null 2>&1 ; then
            echo "${line}" >> ${GIT_DIR}/info/grafts.tmp.$$
        fi
    done < ${GIT_DIR}/info/grafts
    if [ -s ${GIT_DIR}/info/grafts.tmp.$$ ]; then
        mv ${GIT_DIR}/info/grafts.tmp.$$ ${GIT_DIR}/info/grafts
    fi

Perhaps it helps someone else trying to run git as a spooler :-)

cheers,

m
Re: [PATCH] gitk: add support for -G'regex' pickaxe variant
On Mon, May 13, 2013 at 2:55 PM, Jonathan Nieder <jrnie...@gmail.com> wrote:
> My experience is the opposite. I wonder "What did the author of this
> nonsense comment mean?" or "What is the purpose of this strange
> condition in this if () statement?". Then git log -S finds the culprit.

Only if that if () statement looks that way from a single commit. That's my point. If the bit of code you are looking at is the result of several changes, your log -S will grind a while and find you nothing.

cheers,

m
Re: [PATCH] gitk: add support for -G'regex' pickaxe variant
On Mon, May 13, 2013 at 3:33 PM, Jonathan Nieder <jrnie...@gmail.com> wrote:
> Well, no, it should find the final change that brought it into the
> current form. Just like git blame. Has it been finding zero results
> in some cases where the current code matches the pattern? That
> sounds like a bug.

Ummm, maybe. You are right, with current git it does work as I would expect (usefully ;-) ).

I know I struggled quite a bit with log -S not finding stuff I thought it should and that log -G did find, back a year ago. Damn, I don't have a precise record of what git it was on, nor a good repro example.

Too long ago,

m
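[Editor's note: a small repro of the -S vs. -G distinction discussed in this thread. -S looks for commits that change the *count* of matches; -G greps the diff text itself, so a commit that merely moves a matching line is found by -G but skipped by -S. File and commit names are illustrative.]

```shell
#!/bin/sh
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
ci() { git -c user.name=d -c user.email=d@example.com commit -q -am "$1"; }

printf 'needle\nhay\n' > f; git add f; ci 'add needle'
printf 'hay\nneedle\n' > f; ci 'move needle'   # occurrence count unchanged

git log --oneline -S needle -- f | wc -l   # -> 1 (only 'add needle')
git log --oneline -G needle -- f | wc -l   # -> 2 (both commits)
```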
Re: Fwd: git cvsimport implications
On Fri, May 17, 2013 at 5:10 AM, Michael Haggerty <mhag...@alum.mit.edu> wrote:
> For one-time imports, the fix is to use a tool that is not broken,
> like cvs2git.

As one of the earlier maintainers of cvsimport, I do believe that cvs2git is less broken, for one-shot imports, than cvsimport. Users looking for a one-shot import should not use cvsimport as there are better options there. Myself, I have used parsecvs (long ago, so perhaps it isn't the best of the crop nowadays).

TBH, I am puzzled and amused at all the chest-thumping about cvs importers. Yeah, yours is a bit better or saner, but we all wade in the muddle of essentially broken data. So "is not broken" is rather misleading when talking to end users. It carries so many caveats about whether it'll work on the user's particular repo that it is not a generally truthful statement.

I am very glad to hear it is better than cvsimport, and even more glad to hear its limitations are better understood and documented. It has had a testsuite for the longest of times! And very likely has the best chance of success across the available importers :-)

Oh, and why is cvsimport so vague? Because it is just driven by cvsps. It creates a repo based on what cvsps understands from the CVS data.

At the time, I looked into trying to use cvs2svn (precursor to cvs2git) as the CVS read side of cvsimport, but it did not support incremental imports at all, and it took forever to run. It was a time when git was new and people were dipping their toes in the pool, and some developers were pining to use git on projects that used CVS (like we use git-svn now). Incremental imports were a must.

One of the nice features of cvsimport is that it can do incrementals on a repo imported with another tool. That earns it a place under the sun. If it didn't have that, I'd be voting for removal (after a review that the replacement *is* actually better ;-) across a number of test repos).

cheers,

m
Re: Fwd: git cvsimport implications
On Fri, May 17, 2013 at 9:34 AM, Andreas Krey <a.k...@gmx.de> wrote:
> On Fri, 17 May 2013 15:14:58 +0000, Michael Haggerty wrote:
>> ... We both know that the CVS history omits important data, and that
>> the history is mutable, etc. So there are lots of hypothetical
>> histories that do not contradict CVS. But some things are recorded
>> unambiguously in the CVS history, like
>>
>> * The contents at any tag or the tip of any branch (i.e., what is in
>>   the working tree when you check it out).
>
> Except that the tags/branches may be made in a way that can't be
> mapped onto any commit/point of history otherwise exported, with
> branches that are done on parts of the trees first, or likewise tags.

Yeah, that's what I remember too. It is perfectly fine in CVS to add a tag to a file at a much later date than the rest of the tree. And it happened too ("oh, I didn't have directory support/some-os checked out when I tagged the release yesterday! let me check it out and add the tag, nevermind that the branch has moved forward in the interim...").

I would add the long history of cvs repository manipulation. Bad, ugly stuff, but it happened in every major project I've seen. Mozilla, X.org, etc.

TBH I am very glad that Michael cares deeply about correctness, and it leads to a much better tool. No doubt. When discussing it with end users, I do think we have to be honest and say that there's a fair chance that the output will not be perfect... because what is in CVS is rather imperfect when you look at it closely (which users aren't usually doing).

cheers,

m
Can a git push over ssh trigger a gc/repack? Diagnosing pack explosion
Hi git list,

I am trying to diagnose a strange problem in a VM running as a 'git over ssh server', with one repo which periodically grows very quickly.

The complete dataset packs to a single pack+index of ~650MB. Growth is slow; these are ASCII text reports that use a template -- highly compressible. Reports come from a few dozen machines that log in every hour. However, something is happening that explodes the efficient pack into an ungodly mess.

Do client pushes over git+ssh ever trigger a repack on the server? If so, these repacking processes are racing with each other and taking 650MB to 7GB, at which point we hit ENOSPC, sometimes the OOM killer joins the party, etc.

pack dir looks like this, ordered by timestamp: http://fpaste.org/55730/04636313/

cheers,

m
Re: Can a git push over ssh trigger a gc/repack? Diagnosing pack explosion
On Thu, Nov 21, 2013 at 2:52 PM, Junio C Hamano <gits...@pobox.com> wrote:
>> - if it's receiving from many pushers, it races with itself; needs
>>   some lock or back-off mechanism
>
> Surely. I think these should help:
>
>   64a99eb4 (gc: reject if another gc is running, unless --force is given, 2013-08-08)
>   4c5baf02 (gc: remove gc.pid file at end of execution, 2013-10-16)
>
> They should be in the upcoming v1.8.5.

Ah, great to hear. For the record, this hit me on git 1.7.1, current on RHEL6.

thanks!

m
Re: Can a git push over ssh trigger a gc/repack? Diagnosing pack explosion
On Thu, Nov 21, 2013 at 10:21 AM, Martin Langhoff <martin.langh...@gmail.com> wrote:
> Do client pushes over git+ssh ever trigger a repack on the server?

man git-config says:

    receive.autogc
        By default, git-receive-pack will run "git gc --auto" after
        receiving data from git-push and updating refs. You can stop
        it by setting this variable to false.

Oops! OK, a couple of problems here:

- if it's receiving from many pushers, it races with itself; it needs some lock or back-off mechanism

- alternatively, a splay mechanism. We have a hard threshold... given many pushers acting in parallel, they'll all hit the threshold at the same time. There is no need for this; we could randomize the threshold by 20%, and that would radically reduce the racy-ness

- auto repack in this scenario has a reasonable likelihood of being visited by the OOM killer -- therefore it needs to fail more gracefully, for example with tmpfile cleanup. Perhaps having the tmpfiles placed in a tmpdir named with the pid of the child would make this easier...

Naturally, I'll move quickly to disable this evil-spawn-automagic setting and set up a cronjob. But I think it is possible to have defaults that work more reliably and with lower risk of explosion.

thoughts?

m
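[Editor's note: for anyone hitting the same failure mode, the mitigation described above boils down to the following; the crontab schedule and repository path in the comment are illustrative.]

```shell
#!/bin/sh
# On the receiving repository: keep git-receive-pack from spawning
# "git gc --auto" after every push (shown on a scratch bare repo).
set -e
repo=$(mktemp -d); cd "$repo"
git init -q --bare .
git config receive.autogc false

# ...then repack on your own schedule instead, e.g. a single
# crontab entry so gc runs never overlap:
#   17 3 * * *  cd /srv/reports.git && git gc --quiet
git config --get receive.autogc    # -> false
```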
git filter-branch --directory-filter oddity
When using

    git filter-branch --prune-empty --subdirectory-filter foo/bar

to extract the history of the foo/bar directory, I am getting a very strange result.

Directory foo/bar is slow moving. Say, 22 commits out of several thousand. I would like to extract just those 22 commits. Instead, I get ~1500 commits, which seem to have not been skipped because they are merge commits. Merges completely immaterial to this directory.

As they have not been skipped, they are fully fleshed out. By this, I mean that we have the whole tree in place. So these 22 commits appear with foo/bar pulled out to the root of the project, in the midst of 1500 commits with a full tree. The commit diffs make no sense whatsoever.

Am I doing it wrong?

m
Re: git filter-branch --directory-filter oddity
On Tue, Dec 3, 2013 at 5:44 PM, Martin Langhoff <martin.langh...@gmail.com> wrote:
> As they have not been skipped, they are fully fleshed out. By this, I
> mean that we have the whole tree in place. So these 22 commits appear
> with foo/bar pulled out to the root of the project, in the midst of
> 1500 commits with a full tree.

IOWs, I am experimenting with something like:

    git filter-branch -f -d /tmp/moodle-$RANDOM --prune-empty \
        --index-filter "git ls-files -z | grep -zZ -v '${dirpath}' | \
            xargs -0 --no-run-if-empty -n100 git rm --quiet --cached --ignore-unmatch" \
        ^v2.1.0 $branches

    git filter-branch -f --prune-empty --subdirectory-filter ${dirpath} ^v2.1.0 $branches

    git filter-branch -f --commit-filter \
        "~/src/git-filter-branch-tools/remove-pointless-commit.rb \"\$@\"" ^v2.1.0 $branches

Perhaps the docs for filter-branch imply, to me at least, that it's a DWIM tool. I am surprised at having to roll my own on something that is fairly popular.

cheers,

m
Re: git filter-branch --directory-filter oddity
On Tue, Dec 3, 2013 at 5:44 PM, Martin Langhoff <martin.langh...@gmail.com> wrote:
> Am I doing it wrong?

Looks like I was doing something wrong. Apologies for the noise.

cheers,

m
Publishing filtered branch repositories - workflow / recommendations?
Hi folks,

currently working on a project based on Moodle (the LMS that got me into git in the first place). This is a highly modular software, and I would like to maintain a bunch of out-of-tree modules in a single repository, and be able to publish them in per-module repositories.

So I would like to maintain a tree looking like

    auth/foomatic/{code}
    mod/foomatic/{code}

where I can develop, branch and tag all the foomatic code together. Yet, at release time I want to _also_ publish two repos

    auth-foomatic.git
    mod-foomatic.git

each of them with matching tags and code at the root of the git tree, and ideally with a truthful history (i.e.: similar to having run git filter-branch --subdirectory-filter, but able to update that filtered history incrementally).

Is there a reasonable approach to scripting this?

Alternatively, has git submodule been improved so that it's usable by mere mortals (i.e.: my team), or are there strong alternatives to git submodule?

cheers,

m
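[Editor's note: a one-off version of the extraction is sketched below with filter-branch; all paths and branch names are illustrative. filter-branch redoes the whole history on each run, whereas "git subtree split --prefix=mod/foomatic -b mod-foomatic" (in git's contrib/ at the time, widely shipped since) keeps a stable commit mapping, so re-running it later extends the published history -- closer to the "update incrementally" requirement.]

```shell
#!/bin/sh
# Extract mod/foomatic's history onto its own branch, ready to
# push to a mod-foomatic.git repository.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
mkdir -p mod/foomatic auth/foomatic
echo 'mod code'  > mod/foomatic/lib.php
echo 'auth code' > auth/foomatic/auth.php
git add .
git -c user.name=d -c user.email=d@example.com commit -q -m 'v1'

# rewrite a copy of master so mod/foomatic becomes the tree root
# (the env var only silences newer git's filter-branch warning)
git branch mod-foomatic master
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --prune-empty \
    --subdirectory-filter mod/foomatic mod-foomatic >/dev/null 2>&1
git ls-tree --name-only mod-foomatic    # -> lib.php
```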
Re: Publishing filtered branch repositories - workflow / recommendations?
On Wed, Dec 4, 2013 at 6:01 PM, Martin Langhoff <martin.langh...@gmail.com> wrote:
> Is there a reasonable approach to scripting this?

Found my answers. The 'subtree' merge strategy is smart enough to mostly help here. However, it does not handle new files created in the subdirectory.

My workflow is this one. It is similar to the recipes for the subtree merge strategy, but invokes git mv to pull files out of the subdirectory:

    git merge -s ours --no-commit upstream/branch
    git read-tree --prefix= -u upstream/branch
    git mv mysubdir/* ./    ### read-tree can't do this
    git commit

    ... time passes ...

    git merge -s subtree --no-commit upstream/branch
    if [ -d mysubdir ]; then
        # handle new files
        git mv mysubdir/* ./
    fi
    git commit

glad that I ended up reading a lot about subtree.

cheers,

m
Re: Publishing filtered branch repositories - workflow / recommendations?
On Thu, Dec 5, 2013 at 2:18 PM, Jens Lehmann <jens.lehm...@web.de> wrote:
> Without knowing more I can't think of a reason why submodules should
> not suit your use case (but you'd have to script branching and tagging
> yourself until these commands learn to recurse into submodules too).

The submodules feature is way too fiddly and has abundant gotchas. I am diving into subtrees, and finding it a lot more workable.

cheers,

m
Re: Publishing filtered branch repositories - workflow / recommendations?
On Thu, Dec 5, 2013 at 2:54 PM, Jens Lehmann <jens.lehm...@web.de> wrote:
> Am 05.12.2013 20:27, schrieb Martin Langhoff:
>> On Thu, Dec 5, 2013 at 2:18 PM, Jens Lehmann <jens.lehm...@web.de> wrote:
>>> Without knowing more I can't think of a reason why submodules should
>>> not suit your use case (but you'd have to script branching and
>>> tagging yourself until these commands learn to recurse into
>>> submodules too).
>>
>> The submodules feature is way too fiddly and has abundant gotchas.
>
> Care to explain what bothers you the most? Being one of the people
> improving submodules I'm really interested in hearing more about that.

Very glad to hear submodules are getting TLC! I have other scenarios at $dayjob where I may need submodules, so happy happy.

I may be unaware of recent improvements; here's my (perhaps outdated) list:

- git clone does not clone existing submodules by default. An ideal workflow assumes that the user wants a fully usable checkout.

- git pull does not fetch/update all submodules (assuming a trivial tracking-repos scenario)

- git push does not warn if you forgot to push commits in the submodule

there are possibly a few others that I've forgotten. The main issue is that things that are conceptually simple (clone, git pull with no local changes) are very fiddly. Our new developers, testers and support folks hurt themselves with it plenty.

I don't mind complex scenarios being complex to handle. If you hit a nasty merge conflict in your submodule, and that's gnarly to resolve, that's not a showstopper.

While writing this email, I reviewed Documentation/git-submodule.txt in git master, and it does seem to have grown some new options. I wonder if there is a tutorial with an example workflow anywhere showing the current level of usability. My hope is actually for some bits of automagic default behaviors to help things along (rather than new options)... Early git was very pedantic, and over time it learned some DWIMery. You're giving me hope that similar smarts might have grown in around submodule support...

cheers,

m
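[Editor's note: the gotchas listed in this thread have since grown flag-based answers; a sketch in a throwaway superproject follows. All paths are illustrative, and the protocol.file.allow setting is only needed on newer git to permit file:// submodule clones (it is ignored harmlessly elsewhere).]

```shell
#!/bin/sh
# Build a submodule and a superproject that references it, then
# clone the superproject with its submodules in one go.
set -e
work=$(mktemp -d); cd "$work"

git init -q sub
(cd sub && git symbolic-ref HEAD refs/heads/master && \
    git -c user.name=d -c user.email=d@example.com commit -q --allow-empty -m s1)

git init -q super
(cd super && git symbolic-ref HEAD refs/heads/master && \
    git -c protocol.file.allow=always submodule --quiet add "$work/sub" sub && \
    git -c user.name=d -c user.email=d@example.com commit -q -m 'add sub')

# 1) clone including submodules (instead of a bare 'git clone'):
git -c protocol.file.allow=always clone -q --recursive super clone1

# 2) update submodules on pull:   git pull --recurse-submodules
# 3) refuse to push with unpushed submodule commits:
#    git push --recurse-submodules=check
ls clone1/sub
```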
gitignore excludes not working?
Tested with git 1.7.12.4 (Apple Git-37) and git 1.8.3.1 on F20.

$ mkdir foo
$ cd foo
$ git init
Initialized empty Git repository in /tmp/foo/.git/
$ mkdir -p modules/boring
$ mkdir -p modules/interesting
$ touch modules/boring/lib.c
$ touch modules/interesting/other.c
$ touch modules/interesting/lib.c
$ git add modules/interesting/lib.c
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#	new file:   modules/interesting/lib.c
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	modules/boring/
#	modules/interesting/other.c
$ echo '/modules/' >> .gitignore
$ echo '!/modules/interesting/' >> .gitignore

On the next git status, I would expect to see modules/interesting/other.c listed as untracked, however:

$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
#	new file:   modules/interesting/lib.c
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	.gitignore

thoughts?

m
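This is documented (if surprising) behavior: git does not descend into a directory that is itself excluded, so a negation inside it never gets a chance to match. The usual workaround is to exclude the directory's *contents* rather than the directory:

```shell
# '/modules/' excludes the directory itself, so git never looks inside it
# and '!/modules/interesting/' can never fire. Excluding the contents
# ('/modules/*') leaves git free to re-include a subdirectory.
cd "$(mktemp -d)" && git init -q .
mkdir -p modules/boring modules/interesting
touch modules/boring/lib.c modules/interesting/other.c
printf '%s\n' '/modules/*' '!/modules/interesting/' > .gitignore
git status --short   # modules/interesting/other.c now shows as untracked
```

With that pattern pair, `git check-ignore modules/boring/lib.c` reports it as ignored, while modules/interesting/ is visible again.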
Re: Publishing filtered branch repositories - workflow / recommendations?
On Fri, Dec 6, 2013 at 3:48 AM, Jens Lehmann jens.lehm...@web.de wrote:
> Right you are, we need tutorials for the most prominent use cases.

In the meantime, are there any hints? Emails on this list showing a current smart workflow? Blog posts? Notes on a wiki?

>> Early git was very pedantic, and over time it learned some DWIMery.
>> You're giving me hope that similar smarts might have grown up around
>> submodule support ...
>
> That's what we are aiming at :-)

That is fantastic! Thank you.

m
Re: I have end-of-lifed cvsps
On Wed, Dec 11, 2013 at 7:17 PM, Eric S. Raymond e...@thyrsus.com wrote:
> I tried very hard to salvage this program - the ability to
> remote-fetch CVS repos without rsync access was appealing

Is that the only thing we lose, if we abandon cvsps? More to the point, is there today an incremental import option, outside of git-cvsimport+cvsps?

[ I am a bit out of touch with the current codebase, but I coded and maintained a good part of it back in the day. However naive/limited the cvsps parser was, it did help a lot of projects make the leap to git... ]

regards,

m
Re: I have end-of-lifed cvsps
On Wed, Dec 11, 2013 at 11:26 PM, Eric S. Raymond e...@thyrsus.com wrote:
> You'll have to remind me what you mean by incremental here.
> Possibly it's something cvs-fast-export could support.

The user can:

- run a cvs-to-git import at time T, resulting in repo G
- make commits to the cvs repo
- run a cvs-to-git import at time T1, pointed at G, and the import tool will only add the new commits found in cvs between T and T1.

> But what I'm trying to tell you is that, even after I've done a dozen
> releases and fixed the worst problems I could find, cvsps is far too
> likely to mangle anything that passes through it. The idea that you
> are preserving *anything* valuable by sticking with it is a mirage.
> The bugs that lead to a mangled history are real.

I acknowledge and respect that. However, even with those limitations, the incremental feature has value in many scenarios. The two main ones are as follows:

- A developer is tracking his/her own patches on top of a CVS-based project with git. This is often done with git-svn, for example. If old/convoluted branches in the far past are mangled, this user won't care; as long as HEAD-master and/or the current/recent branch are consistent with reality, the tool fits a need.

- A project plans to transition to git gradually. Experienced developers who'd normally work on CVS HEAD start working on git (and landing their work on CVS afterwards). Old/mangled branches and tags are of little interest; the big value is CVS HEAD (which is linear) and possibly recent release/stable branches. The history captured is good enough for git blame/log/pickaxe along the master line. At transition time the original CVS repo can be kept around in readonly mode, so people can still check out the exact contents of an old branch or tag (assuming no destructive surgery was done in the CVS repo).

The above examples assume that the CVS repos have used the flying fish approach in the interesting (i.e.: recent) parts of their history. 
[ Simplifying a bit for non-CVS-geeks -- flying fish is using CVS HEAD for your development, plus 'feature branches' that get landed, plus long-lived 'stable release' branches. Most CVS projects in modern times use flying fish, which is a lot like what the git project uses in its own repo, but tuned to CVS's strengths (interesting commits linearized in CVS HEAD). Other approaches ('dovetail') tend to end up as unworkable messes given CVS's weaknesses. ]

The cvsimport+cvsps combo does a reasonable (though imperfect) job on 'flying fish' CVS histories _and that is what most projects evolved to use_.

If other cvs import tools can handle crazy histories, hats off to them. But careful with knifing cvsps!

cheers,

m
Re: I have end-of-lifed cvsps
On Thu, Dec 12, 2013 at 12:17 PM, Andreas Krey a.k...@gmx.de wrote:
> But anyway, the replacement question is a) how fast the
> cvs-fast-export is and b) whether its output is stable

In my prior work, the better CVS importers did not have stable output, so they were not appropriate for incremental imports. And even the fastest ones were very slow on large repos. That is why I am asking the question.

> It won't magically disappear from your machine, and you have been warned. :-)

However, esr is making the case that git-cvsimport should stop using cvsps. My questions are aimed at understanding whether this actually amounts to proposing that an important feature be dropped. Perhaps a better alternative is now available.

m
Re: I have end-of-lifed cvsps
On Thu, Dec 12, 2013 at 1:15 PM, Eric S. Raymond e...@thyrsus.com wrote:
> That terminology -- flying fish and dovetail -- is interesting, and I
> have not heard it before. It might be worth putting in the Jargon File.
> Can you point me at examples of live usage?

The canonical reference would be

http://cvsbook.red-bean.com/cvsbook.html#Going%20Out%20On%20A%20Limb%20(How%20To%20Work%20With%20Branches%20And%20Survive)

Just by being on the internet and widely referenced, it has probably eclipsed in google-juice any examples of earlier usage. Karl Fogel may remember where he got the names from.

cheers,

m
Re: I have end-of-lifed cvsps
On Thu, Dec 12, 2013 at 1:29 PM, Eric S. Raymond e...@thyrsus.com wrote:
> I am almost certain the output of cvs-fast-export is stable.
> I believe the output of cvsps-3.x was, too. Not sure about 2.x.

IIRC, making the output stable is nontrivial, especially on branches. Two cases are still in my mind, from when I was wrestling with cvsps.

1 - For a history with CVS HEAD and a long-running stable release branch (STABLE), which branched at P1...

  a - adding a file only at the tip of STABLE retroactively changes history for P1 and perhaps CVS HEAD

  b - forgetting to properly tag a subset of files with the branch tag, and doing it later, retroactively changes history

2 - You can create a new branch or tag with files that do not belong together in any commit. Doing so changes history retroactively.

... when I say "changes history", I mean that the importers I know revise their guesses of what files were seen together in a 'commit'. This is especially true for history recorded with early cvs versions that did not record a 'commit id'.

cvsps has the strange feature that it will cache its assumptions/guesses, and continue incrementally from there. So if a change in the CVS repo means that the old guess is now invalidated, it continues the charade instead of forcing a complete rewrite of the git history.

Maybe the current crop of tools has developed stronger magic than what was available a few years ago... the task did seem impossible to me.

cheers,

m
Re: I have end-of-lifed cvsps
On Thu, Dec 12, 2013 at 3:58 PM, Eric S. Raymond e...@thyrsus.com wrote:
>> - regardless of commit ids, do you synthesize an artificial commit?
>>   How do you define parenthood for that artificial commit?
>
> Because tagging is never used to deduce changesets, the case does not arise.

So if a branch has a nonsensical branching point, or a tag is nonsensical, is it ignored and not imported?

curious,

m
Re: I have end-of-lifed cvsps
On Thu, Dec 12, 2013 at 6:04 PM, Eric S. Raymond e...@thyrsus.com wrote:
> I'm not sure what counts as a nonsensical branching point. I do know
> that Keith left this rather cryptic note in a README:

Keith names exactly what we are talking about. At that time, Keith was struggling with the old xorg cvs repo, which had these and quite a few other nasties. I was also struggling with the mozilla cvs repo, with its own gremlins. Between my earlier explanation and Keith's notes it should be clear to you.

It is absolutely trivial in CVS to have an inconsistent checkout (for example, if you switch branch with the -l parameter disabling recursion, or if you accidentally switch branch in a subdirectory). On that inconsistent checkout, nothing prevents you from tagging it, nor from creating a new branch.

An importer with a 'consistent tree mentality' will look at the files/revs involved in that tag (or branching point) and find no tree to match.

CVS repos with that crap exist. x11/xorg did (Jim Gettys challenged me to try importing it at an LCA, after the Bazaar NG folks passed on it). Mozilla did as well.

IMHO it is a valid path to skip importing the tag/branch. As long as main dev work was in HEAD, things end up ok (which goes back to my flying fish notes).

cheers,

m
Re: git "thin" submodule clone to feed "describe"
On Tue, Feb 23, 2016 at 4:33 PM, Junio C Hamano wrote:
> No, I do not think so.

Thanks. I will probably set up a pre-commit hook at the top level project to update a submodule metadata file. Not the prettiest but... :-)

m
git "thin" submodule clone to feed "describe"
Hi git list! long time no see! :-) Been missing you lots.

Do we currently have any means to clone _history_ but not _blobs_ of a repo, or some approximation thereof?

With a bit more context: if I have a top-level project using a couple dozen submodules, where the submodules are huge, do I have a git-native means of running git-describe on each submodule without pulling the whole thing down?

In this context, most developers want a full checkout of some submodules, but not of all; and 'git describe' of the submodules is needed to 'shim' the missing submodules appropriately.

If the answer is no, there's a bunch of ways I can carry that as extra data in the top level project. It's possible, yet inelegant & duplicative.

thanks,

martin
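For later readers of the archive: git eventually grew "partial clone", which comes close to what's being asked for -- full commit/tree history with blobs fetched lazily. A hedged sketch against a throwaway local repo (the version threshold, roughly 2.19+, is approximate, and the serving side must allow filtering):

```shell
# Demo: partial clone -- full history up front, no blobs.
cd "$(mktemp -d)"
git init -q src
git -C src config user.email demo@example.com   # throwaway identity
git -C src config user.name demo
echo hello > src/f.txt
git -C src add f.txt
git -C src commit -qm c1
git -C src config uploadpack.allowFilter true   # let clients request filters

git clone -q --filter=blob:none --no-checkout "file://$PWD/src" dst
git -C dst log --oneline   # history is all there; blobs come down on demand
```

Against a real submodule this would let `git describe` run from the commit and tag history alone, without pulling the big blobs down.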
Automagic `git checkout branchname` mysteriously fails
In a (private) repo project I have, I recently tried (and failed) to do:

  git checkout v4.1-support

getting a "pathspec did not match any files known to git" error.

There's an origin/v4.1-support; there is no v4.1-support "local" branch. Creating the tracking branch explicitly worked. Other similar branches in existence upstream did work. Autocomplete matched git's own behaviour for this: where git checkout foo wouldn't work, autocomplete would not offer a completion.

Why is this? One theory I have not explored is that I have other remotes, and some have a v4.1-support branch. If that's the case, the error message is not very helpful, and could be improved.

git --version
2.7.4

DWIM in git is remarkably good, even addictive... when it works :-)

cheers,

m
-- 
martin.langh...@gmail.com
- ask interesting questions ~ http://linkedin.com/in/martinlanghoff
- don't be distracted by shiny stuff ~ http://github.com/martin-langhoff
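The multiple-remotes theory fits how the checkout DWIM works: it only creates the tracking branch when exactly one remote has a branch by that name. The explicit form always works, and newer git (2.19+, if memory serves) grew a tie-breaker config. A sketch:

```shell
# Demo: explicit tracking checkout; checkout.defaultRemote breaks DWIM ties.
cd "$(mktemp -d)"
git init -q upstream
git -C upstream config user.email demo@example.com   # throwaway identity
git -C upstream config user.name demo
git -C upstream commit -q --allow-empty -m init
git -C upstream branch v4.1-support

git clone -q upstream work
cd work
git checkout -q --track origin/v4.1-support   # always unambiguous
git config checkout.defaultRemote origin      # let 'git checkout <name>' pick origin
```

With checkout.defaultRemote set, a bare `git checkout v4.1-support` DWIMs to origin even when several remotes carry the branch.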
Re: Automagic `git checkout branchname` mysteriously fails
On Fri, Oct 14, 2016 at 4:58 PM, Kevin Daudt <m...@ikke.info> wrote:
> Correct, this only works when it's unambiguous what branch you actually
> mean.

That's not surprising, but there isn't a warning. IMHO, finding several branch matches is a strong indication that it's worth reporting to the user that the DWIM machinery got hits, but couldn't work it out.

I get that the process is not geared towards making a friendly msg easy, but we could print to stderr something like:

  "branch" matches more than one candidate ref, cannot choose
  automatically. If you mean to check out a branch, try the git branch
  command. If you mean to check out a file, use -- before the pathname
  to disambiguate.

and then continue with execution. With a bit of wordsmithing, the msg can be made to be helpful in the various failure modes.

cheers,

m
Re: Delta compression not so effective
On Wed, Mar 1, 2017 at 1:30 PM, Linus Torvalds <torva...@linux-foundation.org> wrote:
> For example, the sorting code thinks that objects with the same name
> across the history are good sources of deltas.

Marius has indicated he is working with jar files. IME jar and war files, which are zipfiles containing Java bytecode, range from not delta-ing in any useful fashion to delta-ing pretty well. Depending on the build process (hi Maven!) there can be enough variance in the build metadata to throw all the compression machinery off.

On a simple Maven-driven project I have at hand, two .war files compiled from the same codebase compressed really well in git. I've also seen projects where storage space is ~101% of the "uncompressed" size.

my 2c,

m
Re: Delta compression not so effective
On Wed, Mar 1, 2017 at 8:51 AM, Marius Storm-Olsen <msto...@gmail.com> wrote:
> BUT, even still, I would expect Git's delta compression to be quite
> effective, compared to the compression present in SVN.

jar files are zipfiles. They don't delta in any useful form, and in fact they differ even if they contain identical binary files inside.

> Commits: 32988
> DB (server) size: 139GB

Are you certain of the on-disk storage at the SVN server? Ideally, you've taken the size with a low-level tool like `du -sh /path/to/SVNRoot`.

Even with no delta compression (as per Junio and Linus' discussion), based on past experience importing jars/wars/binaries from SVN into git... I'd expect git's worst case to be on par with SVN, perhaps ~5% larger due to compression headers on uncompressible data.

cheers,

m
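When the archives are known to be incompressible, one can at least tell git not to waste cycles trying -- the `delta` attribute is a standard gitattributes knob. A sketch (this saves repack time and delta-search memory; it won't shrink the store):

```shell
# Mark already-compressed archives as not worth delta-ing.
cd "$(mktemp -d)" && git init -q .
printf '%s\n' '*.jar -delta' '*.war -delta' > .gitattributes
git check-attr delta -- foo.jar   # confirms the attribute is unset for jars
```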
Dropping a merge from history -- rebase or filter-branch or ...?
Hi git-folk! long time no see!

I'm trying to do one of those "actually, please don't" things that turn out to be needed in the field. I need to open our next "for release" development branch from our master, but without a couple of disruptive feature branches, which have been merged into master already. We develop in github, so I'll call them Pull Requests (PRs) as gh does.

So I'd like to run a filter-branch or git rebase --interactive --preserve-merges that drops some PRs. Problem is, they don't work!

filter-branch --commit-filter is fantastic, and gives me all the control I want... except that it will "skip the commit", but still use the trees in the later commits, so the code changes brought in by those commits I wanted to avoid will be there. I think the docs/help that discuss "skip commit" should have a big warning there!

rebase --interactive --preserve-merges --keep-empty made a complete hash of things. Nonsense conflicts all over on the merge commits; I think it re-ran the merges without picking up the conflict resolutions we had applied.

The changes we want to avoid are fairly localized -- a specific module got refactored in 3 stages. The rest of the history should replay cleanly. I don't want to delete the module.

My fallback is a manually constructed revert. While still an option, I think it's better to have a clean diffstat without sizable feature-branch reverts.

cheers,

m
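For the record, the fallback revert need not be very manual: reverting the merge commit against its first parent undoes everything the PR brought in, in a single commit. A self-contained sketch with a throwaway repo (branch and file names are made up for the demo):

```shell
# Demo: merge a "PR" branch, then undo it with a first-parent revert.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email demo@example.com   # throwaway identity
git config user.name demo
echo base > base.txt;   git add base.txt;   git commit -qm base
git checkout -q -b pr
echo feat > feature.txt; git add feature.txt; git commit -qm "the PR"
git checkout -q -                        # back to the mainline branch
echo more > more.txt;   git add more.txt;  git commit -qm mainline
git merge -q --no-edit pr                # the merge we now regret
git revert --no-edit -m 1 HEAD           # revert against first (mainline) parent
ls                                       # feature.txt gone; base/more remain
```

It still shows up as a revert in the history, so it doesn't give the clean diffstat asked for above -- but it is reliable where replaying history is not.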
Should rerere auto-update a merge resolution?
Hi List!

Let's say...

- git v2.9.4
- rerere is enabled
- I merge maint into master, resolve erroneously, commit
- I publish my merge in a temp branch, a reviewer points out my mistake
- I reset hard, retry the merge using --no-commit, rerere applies what it knows
- I fix things up, then commit

So far so good. Oops! One of the branches has moved forward in the meantime, so:

- git fetch
- git reset --hard master
- git merge maint

... rerere applies the first (incorrect) resolution...

Am I doing it wrong? {C,Sh}ould rerere have done better?

cheers,

m
Re: Should rerere auto-update a merge resolution?
On Wed, Aug 23, 2017 at 4:34 PM, Junio C Hamano <gits...@pobox.com> wrote:
> Between these two steps:
>
>> - I reset hard, retry the merge, using --no-commit, rerere applies what it
>>   knows
>> - I fix things up, then commit
>
> You'd tell rerere to forget what it knows because it is wrong.

Hi Junio! thanks for the quick response. Questions:

- When I tell it to forget, won't it forget the pre-resolution state? My reading of the rerere docs implies that it gets called during the merge to record the conflicted state.

- Would it be a feature if it updated its resolution db automagically? rerere is plenty automagic already...

cheers,

m
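The command being discussed is `git rerere forget <pathspec>`; my understanding (hedged -- verify against the rerere docs) is that it recreates the conflicted preimage for the path and drops only the recorded resolution, so the next resolution you commit becomes the remembered one. A self-contained sketch of recording and then forgetting a bad resolution:

```shell
# Demo: record a wrong resolution, then make rerere forget it.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.email demo@example.com   # throwaway identity
git config user.name demo
git config rerere.enabled true
echo base > f;  git add f; git commit -qm base
git checkout -q -b side
echo side > f;  git commit -qam side
git checkout -q -
echo other > f; git commit -qam other
git merge side || true                   # conflict; rerere records the preimage
echo "wrong resolution" > f; git add f
git commit -qm merged                    # rerere remembers this resolution
git rerere forget f                      # drop it; the next fix is re-learned
```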
Re: I'm trying to break "git pull --rebase"
On Tue, Feb 20, 2018 at 5:00 PM, Julius Musseau wrote:
> I was hoping to concoct a situation where "git pull --rebase" makes a
> mess of things.

It breaks quite easily with some workflows. They are all in "don't do that" territory.

Open a long-lived feature-dev branch, work on it. Other folks are working on master. Merge master into feature-dev. Make sure some merges might need conflict resolution.

Reorg some code on master, move files around. Code some more on the feature-dev branch. Merge master into feature-dev; the merge machinery will probably cope with the code moves and file renames. If it doesn't, resolve it by hand.

Let all that simmer for a little bit. Then try to rebase.

"Doctor, it hurts when I rebase after merging with conflict resolution..."

cheers,

m
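A hedged mitigation sketch for that workflow: newer git (2.18+, if memory serves) can make pull rebase while keeping the merge topology instead of flattening it, and rerere can resupply the conflict resolutions that a replay would otherwise re-ask for:

```shell
# In the repository where you pull (scratch repo shown for the demo):
cd "$(mktemp -d)" && git init -q .
git config pull.rebase merges      # pull --rebase uses --rebase-merges (git >= 2.18)
git config rerere.enabled true     # remember conflict resolutions
git config rerere.autoUpdate true  # stage replayed resolutions automatically
```

It doesn't make the merge-then-rebase workflow a good idea, but it removes the worst of the "resolved conflicts come back from the dead" pain.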
git svn clone/fetch hits issues with gc --auto
Hi folks,

Long time no see! Importing a 3GB (~25K revs, tons of files) SVN repo I hit the gc error:

  warning: There are too many unreachable loose objects; run 'git prune'
  to remove them.
  gc --auto: command returned error: 255

I don't seem to be the only one -- https://stackoverflow.com/questions/35738680/avoiding-warning-there-are-too-many-unreachable-loose-objects-during-git-svn

Looking at the code history, git-svn dropped the ability to pass options to git repack when it was converted to using git gc. Experimentally I find that tweaking it to run git gc --auto --prune=5.minutes.ago works well, while --prune=now breaks it: attempts to commit fail with missing objects.

- Why does --prune=now break it? Perhaps "gc" runs in the background, and races with the commit being prepared?

- Would it be safe and sane to apply --prune=some.value on _clone_?

- During _fetch_, --prune=some.value seems risky. In a checkout being actively used for development or merging, it'd risk pruning objects users expect to be there for recovery. Would there be a safe, sane way?

- Is there a safer, saner value than 5 minutes?

cheers,

m
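The one-off --prune flag can also be made sticky via config, which is roughly what the tweak above amounts to without patching git-svn (hedged: gc.pruneExpire defaults to 2.weeks.ago, and the race theory above is a reason to keep a nonzero grace period):

```shell
# Make gc keep unreachable loose objects for a short grace period
# instead of pruning immediately.
cd "$(mktemp -d)" && git init -q .
git config gc.pruneExpire 5.minutes.ago
git config gc.pruneExpire   # read it back
```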
Re: git svn clone/fetch hits issues with gc --auto
On Wed, Oct 10, 2018 at 8:21 AM Junio C Hamano wrote:
> We probably can keep the "let's not run for a day" safety while
> pretending that "git gc --auto" succeeded for callers like "git svn"
> so that these callers do not have to do "eval { ... }" to hide our
> exit code, no?
>
> I think that is what Jonathan's patch (jn/gc-auto) does.

+1

`--auto` means "DTRT, but remember you're running as part of a larger process; don't error out unless it's critical".

cheers,

m
Re: git svn clone/fetch hits issues with gc --auto
Looking around, Jonathan Tan's "[PATCH] gc: do not warn about too many loose objects" makes sense to me:

- removes an unactionable warning
- as the warning is gone, no gc.log is produced
- subsequent gc runs don't exit due to gc.log

My very humble +1 on that.

As for downsides... if we have truly tons of _recent_ loose objects, it'll ... take disk space? I'm fine with that.

For more aggressive gc options, thoughts:

- Do we always consider git gc --prune=now "safe", in a "won't delete stuff the user is likely to want" sense? For example -- are the references from reflogs enough safety?

- Even if we don't, for some commands it should be safe to run git gc --prune=now at the end of the process, for example an import that generates a new git repo (git svn clone).

cheers,

m

On Tue, Oct 9, 2018 at 10:49 PM Junio C Hamano wrote:
>
> Forwarding to Jonathan, as I think this is an interesting supporting
> vote for the topic that we were stuck on.
>
> Eric Wong writes:
>
> > Martin Langhoff wrote:
> >> Hi folks,
> >>
> >> Long time no see! Importing a 3GB (~25K revs, tons of files) SVN repo
> >> I hit the gc error:
> >>
> >> warning: There are too many unreachable loose objects; run 'git prune'
> >> to remove them.
> >> gc --auto: command returned error: 255
> >
> > GC can be annoying when that happens... For git-svn, perhaps
> > this can be appropriate to at least allow the import to continue:
> >
> > diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
> > index 76b2965905..9b0caa3d47 100644
> > --- a/perl/Git/SVN.pm
> > +++ b/perl/Git/SVN.pm
> > @@ -999,7 +999,7 @@ sub restore_commit_header_env {
> >  }
> >
> >  sub gc {
> > -	command_noisy('gc', '--auto');
> > +	eval { command_noisy('gc', '--auto') };
> >  };
> >
> >  sub do_git_commit {
> >
> >
> > But yeah, somebody else who works on git regularly could
> > probably stop repack from writing thousands of loose
> > objects (and instead write a self-contained pack with
> > those objects, instead). 
> > I haven't followed git closely
> > lately, myself.
Re: git svn clone/fetch hits issues with gc --auto
On Wed, Oct 10, 2018 at 7:27 AM Ævar Arnfjörð Bjarmason wrote:
> As Jeff's
> https://public-inbox.org/git/20180716175103.gb18...@sigill.intra.peff.net/
> and my https://public-inbox.org/git/878t69dgvx@evledraar.gmail.com/
> note it's a bit more complex than that.

Ok, my bad for not reading the whole thread :-) thanks for the kind explanation.

> - The warning is actionable, you can decide to up your expiration
>   policy.

A newbie-ish user shouldn't need to know git's internal store model _and the nuances of its special cases_ to get through.

> - We use this warning as a proxy for "let's not run for a day"

Oh, so _that's_ the trick with creating gc.log? I then understand the idea of changing to exit 0. But it's far from clear, and a clear _flag_, rather than spitting the same warning (or a differently-worded warning) again, would be better: "We won't try running gc; a recent run was deemed pointless until some time passes. Nothing to worry about."

> - This conflation of the user-visible warning and the policy is an
>   emergent effect of how the different gc pieces interact, which as I
>   note in the linked thread(s) sucks.

It sure does, and that aspect should be easy to fix...(?)

> So it's creating a lot of garbage during its cloning process that can
> just be immediately thrown away? What is it doing? Using the object
> store as a scratch pad for its own temporary state?

Yeah, that's suspicious and I don't know why. I've worked on other importers, and while those needed 'gc' to generate packs, they didn't generate garbage objects. After gc, the repo was "clean".

cheers,

m