Re: empty patch-2.6.13-git? patches on ftp.kernel.org
On Sun, 2005-09-04 at 17:31 +0200, Jan Dittmer wrote:
> David Woodhouse wrote:
> > On Fri, 2005-09-02 at 02:00 -0700, Linus Torvalds wrote:
> >
> >>Ahh. Please change that to
> >>
> >>rm -rf tmp-empty-tree
> >>mkdir tmp-empty-tree
> >>cd tmp-empty-tree
> >>git-init-db
> >>
> >>because otherwise you'll almost certainly hit something else later
> >>on..
> >
> >
> > OK, done.
>
> -git4 is again empty

Hm, yes.

+ rm -rf tmp-empty-tree
+ mkdir tmp-empty-tree
+ cd tmp-empty-tree
+ git-init-db
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/branches/: Permission denied
+ unset GIT_DIR
+ git-read-tree f505380ba7b98ec97bf25300c2a58aeae903530b
fatal: unable to create new cachefile

Fixed now; thanks.

--
dwmw2

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: empty patch-2.6.13-git? patches on ftp.kernel.org
On Wed, 2005-08-31 at 15:34 +0200, Tomasz Kłoczko wrote:
> Seems patches stored on ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots
> are empty (only logs are correct):
> -rw-r--r-- 1 536 536 20 Aug 30 09:01 patch-2.6.13-git1.gz
> -rw-r--r-- 1 536 536 20 Aug 31 09:01 patch-2.6.13-git2.gz

Hm. git-diff-cache now refuses to operate unless there's a local
'.git/refs' directory, even when working with a separate object
directory. So this doesn't work any more...

rm -rf tmp-empty-tree
mkdir -p tmp-empty-tree/.git
cd tmp-empty-tree
git-read-tree $CURCOMM
git-checkout-cache Makefile
perl -pi -e "s/EXTRAVERSION =.*/EXTRAVERSION = $EXTRAVERSION/" Makefile
git-diff-cache -m -p $RELTREE | gzip -9 > $STAGE/patch-$CURNAME.gz

I've changed the script to create 'tmp-empty-tree/.git/refs' and
replaced 2.6.13-git[12] with real patches.

> Also it will be good move all patch-2.6.12* and patch-2.6.13-rc* files
> from this directory to old subdirectory.

Done.

--
dwmw2
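The changed setup step might look roughly like the function below. This is only a sketch, not the real snapshot script: the name `prepare_empty_tree` is made up for illustration, and it covers nothing beyond the directory layout described above (a scratch tree that already contains `.git/refs`).

```sh
# Hypothetical helper: recreate the scratch tree so that it contains the
# .git/refs directory which git-diff-cache is described as requiring.
prepare_empty_tree() {
    dir=${1:-tmp-empty-tree}
    rm -rf "$dir" || return 1
    mkdir -p "$dir/.git/refs"
}
```

The caller would then `cd` into the directory and carry on with the read-tree and diff steps as before.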
Re: Linux BKCVS kernel history git import..
On Wed, 2005-07-27 at 08:29 -0700, Linus Torvalds wrote:
> I used to think I wanted to, but these days I really don't. One of the
> reasons is that I expect to try to pretty up the old bkcvs conversion some
> time: use the name translation from the old "shortlog" scripts etc, and
> see if I can do some other improvements on the conversion (I think I'll
> remove the BK files - "ChangeSet" etc).

Thomas has done all that; it's on kernel.org already.

> And it's really much easier and more general to have a "graft" facility.
> It's something that git can do trivially (literally a hook in
> "parse_commit" to add a special parent), and it's actually a generic
> mechanism exactly for issues like this ("project had old history in some
> other format").

Hm, OK. That works and can also be used for the "fake _absence_ of
parent" thing -- if I'm space-constrained and want only the history back
to some relatively recent point like 2.6.0, I can do that by turning the
2.6.0 commit into an orphan instead of also using all the rest of the
history back to 2.4.0.

--
dwmw2
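To make the two uses of a graft concrete (adding a fake parent, and faking the absence of one), here is a small lookup sketch. It is not git's code: the function name `effective_parents` and the one-line-per-commit graft file format are assumptions chosen for illustration.

```sh
# Hypothetical graft lookup. Each line of the graft file is assumed to be
#   <commit> [<parent>...]
# and overrides whatever parents the commit object itself records.
# A line with no parents turns that commit into an orphan.
effective_parents() {
    commit=$1 grafts=$2
    shift 2    # the remaining arguments are the commit's real parents
    if [ -f "$grafts" ] &&
       awk -v c="$commit" '$1 == c { found = 1 } END { exit !found }' "$grafts"
    then
        # Grafted: print the parents listed in the graft file (maybe none).
        awk -v c="$commit" '$1 == c { for (i = 2; i <= NF; i++) print $i; exit }' "$grafts"
    else
        # Not grafted: print the real parents unchanged.
        for p in "$@"; do
            echo "$p"
        done
    fi
}
```

With a graft file containing the line `bbb`, calling `effective_parents bbb grafts 888` prints nothing, which is the "orphan at 2.6.0" case; a line `aaa 111` makes `aaa` appear to have parent `111` regardless of what the commit says.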
Re: Linux BKCVS kernel history git import..
On Tue, 2005-07-26 at 11:57 -0700, Linus Torvalds wrote:
> If somebody adds some logic to "parse_commit()" to do the "fake parent"
> thing, you can stitch the histories together and see the end result as one
> big tree. Even without that, you can already do things like
>
> 	git diff v2.6.10..v2.6.12

That's a bit of a hack which really doesn't belong in the git tools.
It's not particularly hard to reparent the tree for real -- I'd much
rather see a tool added to git which can _actually_ change the
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 commit to have a parent of
0bcc493c633d78373d3fcf9efc29d6a710637519, and ripple the corresponding
SHA1 changes up to the current HEAD.

Note that the latter commit ID I gave there was actually the 2.6.12-rc2
commit in Thomas' history import, not your own. Thomas has done a lot of
work on it, and it has the full names extracted from the shortlog
script, full timestamps, branch/merge history and consistent character
sets in the commit logs. I'd definitely suggest that you use that
instead of the import from bkcvs.

http://www.kernel.org/git/?p=linux/kernel/git/tglx/history.git;a=summary

--
dwmw2
Re: What broke snapshots now?
On Sun, 2005-07-10 at 10:31 -0700, Linus Torvalds wrote:
> No it's not, as far as I can tell:
>
> 	[EMAIL PROTECTED]:/home/dwmw2/git/mail-2.6(0)$ cat .git/branches/origin
> 	rsync://www.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
>
> so your scripts will go out to rsync with www.kernel.org to get the data,
> when you use "cg-update origin".

Hm, OK. So I have absolutely no recollection of what my own scripts are
actually doing. I could have sworn I made sure it was local. If it was
using that URL for the master I might as well have run it elsewhere...

It does seem to be working again now. I'll probably rewrite it next time
it misbehaves.

--
dwmw2
Re: What broke snapshots now?
On Sun, 2005-07-10 at 10:08 -0700, Linus Torvalds wrote:
> Which script is this? I'm looking at your scripts, but
> "cg-feedmaillist.sh" is unreadable for me, so I can't see all of it.

Hm. Dunno why that happened -- it's readable now, and also at
http://david.woodhou.se/cg-feedmaillist.sh

> Anyway, it's possible that this is a temporary problem: one of the issues
> is that since you seem to be using the "rsync:" protocol for updating
> things, what happens is that if the mirroring is off a bit, you may have
> gotten a new head, but not all the objects. Then you'd get exactly this.

It's done locally on hera though -- so the mirroring shouldn't be a
problem. IIRC the reason it uses rsync is because I wasn't getting tags
when it was using whatever other method was the default for a local
'parent repository'. That was actually more relevant for the snapshots
than the mailing list feed, though -- so even if it isn't fixed now, I
could live without tags.

More usefully though, if ordering really isn't a problem on your
repository then I should probably rewrite the script to work directly
from that instead of from a copy.

--
dwmw2
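One way to tell "new head, but not all the objects" apart from other breakage is to check whether the object a tool complains about exists as a file. The sketch below assumes the `objects/xx/yyyy...` loose-object layout that shows up in the snapshot script's error output elsewhere in this thread; the function names are made up, and a packed object would not be found this way.

```sh
# Map an object id to its loose-object path: the first two hex digits
# name the directory, the remaining digits name the file.
loose_object_path() {
    objdir=$1 id=$2
    rest=${id#??}
    echo "$objdir/${id%"$rest"}/$rest"
}

# Succeeds only if the object exists as a loose file. Packed objects are
# deliberately not handled, so a failure is a hint, not proof.
have_loose_object() {
    [ -f "$(loose_object_path "$1" "$2")" ]
}
```

For example, `have_loose_object "$SHA1_FILE_DIRECTORY" "$id" || echo "missing: $id"` could be run on a tree id before trying to diff against it.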
Re: What broke snapshots now?
On Sun, 2005-07-10 at 00:38 +0100, David Woodhouse wrote:
> Doh. I thought I'd already done that, but in fact that was for the
> scripts which feed the mailing list, while the snapshot script kept
> using my copy.

Ok, the snapshot script starts working again if I change a few
environment variables to match what the tools now expect.

Now the mailing list feed isn't happy though -- it stopped being able to
pull from your tree at around 0600 UTC (which I think is when the last
DRM fix was added). I got this when trying to update...

Tree change: 0109fd37046de64e8459f8c4f4706df9ac7cc82c:f179bc77d09b9087bfc559d0368bba350342ac76
error: cannot read sha1_file for ce68a60e5c503aaef0a98f8d754effb6c7d9ee99
fatal: unable to read destination tree (ce68a60e5c503aaef0a98f8d754effb6c7d9ee99)
Applying changes...
Fast-forwarding 0109fd37046de64e8459f8c4f4706df9ac7cc82c -> f179bc77d09b9087bfc559d0368bba350342ac76 on top of 0109fd37046de64e8459f8c4f4706df9ac7cc82c...
error: cannot read sha1_file for ce68a60e5c503aaef0a98f8d754effb6c7d9ee99
fatal: failed to unpack tree object f179bc77d09b9087bfc559d0368bba350342ac76

Since it's just a fast-forward, I just copied the 'origin' tag into the
'master' to move it forward. But it's still not happy:

hera /home/dwmw2/git/mail-2.6 $ cg-diff -r 0109fd37046de64e8459f8c4f4706df9ac7cc82c:f179bc77d09b9087bfc559d0368bba350342ac76
error: cannot read sha1_file for ce68a60e5c503aaef0a98f8d754effb6c7d9ee99
fatal: unable to read destination tree (ce68a60e5c503aaef0a98f8d754effb6c7d9ee99)

--
dwmw2
Re: What broke snapshots now?
On Sat, 2005-07-09 at 09:15 -0700, Linus Torvalds wrote:
> Yes, looks that way. Except it's not "git on master.kernel.org", it's "git
> in your home directory", I suspect. I expressly held off packing the
> kernel repo until git had been updated on kernel.org.

Doh. I thought I'd already done that, but in fact that was for the
scripts which feed the mailing list, while the snapshot script kept
using my copy. I've moved it out of the way now; sorry for the noise.

--
dwmw2
What broke snapshots now?
Does git on master.kernel.org need to be updated to handle packed objects? See attached. Linus, please could you add the snapshot script to your regression testing? http://david.woodhou.se/git-snapshot.sh It'd be good to keep that working without too much manual intervention. -- dwmw2 --- Begin Message --- + case `hostname` in ++ hostname + export PATH=/home/dwmw2/cogito:/usr/bin:/bin + PATH=/home/dwmw2/cogito:/usr/bin:/bin + export BASE_DIRECTORY=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + BASE_DIRECTORY=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git + STAGINGLOCK=/staging/upload.lock + FINAL=/pub/linux/kernel/v2.6/snapshots + '[' '!' -d /pub/scm/linux/kernel/git/torvalds/linux-2.6.git ']' + export WORK_DIRECTORY=/home/dwmw2/snapshots/2.6 + WORK_DIRECTORY=/home/dwmw2/snapshots/2.6 + export SNAP_TAG_DIRECTORY=/home/dwmw2/snapshots/2.6/tags + SNAP_TAG_DIRECTORY=/home/dwmw2/snapshots/2.6/tags + export STAGE=/home/dwmw2/snapshots/2.6/stage + STAGE=/home/dwmw2/snapshots/2.6/stage + export SHA1_FILE_DIRECTORY=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects + SHA1_FILE_DIRECTORY=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects ++ ls -rt /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.11 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.11-tree /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc2 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc3 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc4 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc5 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc6 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.13-rc1 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.13-rc2 ++ tail -n1 ++ sed s:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v:: + RELNAME=2.6.13-rc2 ++ cat 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.13-rc2 + RELOBJ=c521cb0f10ef2bf28a18e1cc8adf378ccbbe5a19 ++ tail -n1 ++ sed s:/home/dwmw2/snapshots/2.6/tags/v:: ++ ls -rt /home/dwmw2/snapshots/2.6/tags/v2.6.13-rc2-git1 + SNAPNAME=2.6.13-rc2-git1 + '[' 2.6.13-rc2-git1 == '' ']' ++ cat /home/dwmw2/snapshots/2.6/tags/v2.6.13-rc2-git1 + LASTOBJ=c101f3136cc98a003d0d16be6fab7d0d950581a6 ++ echo 2.6.13-rc2-git1 ++ sed 's/^.*-git//' + OLDGITNUM=1 ++ expr 1 + 1 + NEWGITNUM=2 + CURNAME=2.6.13-rc2-git2 ++ tree-id c101f3136cc98a003d0d16be6fab7d0d950581a6 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/c1/01f3136cc98a003d0d16be6fab7d0d950581a6: No such file or directory fatal: git-cat-file c101f3136cc98a003d0d16be6fab7d0d950581a6: bad file Invalid id: c101f3136cc98a003d0d16be6fab7d0d950581a6 usage: git-cat-file [-t | tagname] usage: git-cat-file [-t | tagname] Invalid id: + LASTTREE= ++ cat /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/HEAD + CURCOMM=a92b7b80579fe68fe229892815c750f6652eb6a9 ++ tree-id a92b7b80579fe68fe229892815c750f6652eb6a9 + CURTREE=7fd73e9f39bf6003cc3188a10426b62d8c47ab40 ++ tree-id c521cb0f10ef2bf28a18e1cc8adf378ccbbe5a19 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/c5/21cb0f10ef2bf28a18e1cc8adf378ccbbe5a19: No such file or directory fatal: git-cat-file c521cb0f10ef2bf28a18e1cc8adf378ccbbe5a19: bad file Invalid id: c521cb0f10ef2bf28a18e1cc8adf378ccbbe5a19 usage: git-cat-file [-t | tagname] usage: git-cat-file [-t | tagname] Invalid id: + RELTREE= + echo release 2.6.13-rc2 commit tree release 2.6.13-rc2 commit tree ++ git-cat-file -t c101f3136cc98a003d0d16be6fab7d0d950581a6 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/c1/01f3136cc98a003d0d16be6fab7d0d950581a6: No such file or directory fatal: git-cat-file c101f3136cc98a003d0d16be6fab7d0d950581a6: bad file + echo last c101f3136cc98a003d0d16be6fab7d0d950581a6 tree last c101f3136cc98a003d0d16be6fab7d0d950581a6 tree + echo head 
a92b7b80579fe68fe229892815c750f6652eb6a9 tree 7fd73e9f39bf6003cc3188a10426b62d8c47ab40 head a92b7b80579fe68fe229892815c750f6652eb6a9 tree 7fd73e9f39bf6003cc3188a10426b62d8c47ab40 + '[' '' == 7fd73e9f39bf6003cc3188a10426b62d8c47ab40 ']' ++ echo 2.6.13-rc2-git2 ++ cut -f2- -d- + EXTRAVERSION=-rc2-git2 + cd /home/dwmw2/snapshots/2.6/stage + rm -rf tmp-empty-tree + mkdir -p tmp-empty-tree/.git + cd tmp-empty-tree + git-read-tree a92b7b80579fe68fe229892815c750f6652eb6a9 /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/f8/640c306db2d583b9a30f2e52f8fb0a4cf624e0: No such file or directory fatal: failed to unpack tree object a92b7b80579fe68fe229892815c750f6652eb6a9 + git-checkout-cache Makefile checkout-cache: Makefile is not in the cache. + perl -pi -e 's/EXTRAVERSION =.*/EXTRAVERSION = -rc2-git2/' Makefile Can't open Makefile: No such file or directory. + git-diff-cache -m -p + gzip -9 usage: diff-cache [-r] [-z] [-p] [-i] [--cached] + echo a92b7b80579fe68fe229892815c750f6652eb6a9 + cg-log c521cb
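The naming steps in the trace above (OLDGITNUM, NEWGITNUM, CURNAME, EXTRAVERSION) can be written as two small functions. This is a sketch that mirrors what the trace shows, not the script itself: the function names are invented, and the case where no earlier `-gitN` snapshot exists is not handled.

```sh
# Given the newest snapshot name, print the next one, e.g.
# 2.6.13-rc2-git1 -> 2.6.13-rc2-git2.
next_snapshot_name() {
    old=$1
    num=${old##*-git}
    echo "${old%-git*}-git$((num + 1))"
}

# EXTRAVERSION is everything after the base version, with the leading
# '-' kept, e.g. 2.6.13-rc2-git2 -> -rc2-git2.
extraversion_of() {
    echo "-$(echo "$1" | cut -f2- -d-)"
}
```

So `CURNAME=$(next_snapshot_name "$SNAPNAME")` followed by `EXTRAVERSION=$(extraversion_of "$CURNAME")` would reproduce the values seen in the trace.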
Re: Git-commits mailing list feed.
On Thu, 2005-04-21 at 12:29 +0200, Arjan van de Ven wrote:
> with BK this was not possible, but could we please have -p added to the
> diff parameters with git ? It makes diffs a LOT more readable!

With BK this was not possible, but could you please provide your
criticism in 'diff -up' form?

I've done 'perl -pi -e s/-u/-up/ gitdiff-do' as a quick hack to provide
what you want, but a saner fix to make gitdiff-do obey the same
GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as show-diff.c
would be a more useful answer.

--
dwmw2
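In a shell script, obeying those two environment variables could be as small as the sketch below. The fallback values (`diff` and `-up`) are assumptions picked for this example and are not copied from show-diff.c, which may also treat GIT_DIFF_CMD as more than a bare command name; `diff_command` is an invented name.

```sh
# Build the diff command line from the environment, falling back to
# defaults when the variables are unset or empty. The defaults here are
# assumptions for this sketch.
diff_command() {
    echo "${GIT_DIFF_CMD:-diff} ${GIT_DIFF_OPTS:--up}"
}
```

Anyone who dislikes `-p` could then run with `GIT_DIFF_OPTS=-u` instead of editing the script.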
[PATCH] Set AUTHOR_DATE in git-tools
Entirely untested.

Makefile: eca3a5d5256cca06d86ebb85ec9d3218752ffcd2
applypatch: 397e4a0e506f1c5765767057dfe506154b743b83
--- a/applypatch
+++ b/applypatch
@@ -26,6 +26,7 @@ EDIT=${EDIT:-vi}
 
 export AUTHOR_NAME="$(sed -n '/^Author/ s/Author: //p' .dotest/info)"
 export AUTHOR_EMAIL="$(sed -n '/^Email/ s/Email: //p' .dotest/info)"
+export AUTHOR_DATE="$(sed -n '/^Date/ s/Date: //p' .dotest/info)"
 export SUBJECT="$(sed -n '/^Subject/ s/Subject: //p' .dotest/info)"
 
 if [ -n "$signoff" -a -f "$signoff" ]; then
dotest: a3e3d35ae0afa358f01b49eecb358d64c616c3e4
mailinfo.c: c1dcac130530174ec5335d2c752d76403ad1d3ad
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -13,6 +13,7 @@ static char line[1000];
 static char name[1000];
 static char email[1000];
 static char subject[1000];
+static char date[1000];
 
 static char *sanity_check(char *name, char *email)
 {
@@ -83,6 +84,11 @@ static void handle_subject(char *line)
 	strcpy(subject, line);
 }
 
+static void handle_date(char *line)
+{
+	strcpy(date, line);
+}
+
 static void add_subject_line(char *line)
 {
 	while (isspace(*line))
@@ -99,6 +105,11 @@ static void check_line(char *line, int l
 		cont = 0;
 		return;
 	}
+	if (!memcmp(line, "Date:", 5) && isspace(line[5])) {
+		handle_date(line+6);
+		cont = 0;
+		return;
+	}
 	if (!memcmp(line, "Subject:", 8) && isspace(line[8])) {
 		handle_subject(line+9);
 		cont = 1;
@@ -107,7 +118,7 @@ static void check_line(char *line, int l
 	if (isspace(*line)) {
 		switch (cont) {
 		case 0:
-			fprintf(stderr, "I don't do 'From:' line continuations\n");
+			fprintf(stderr, "I don't do 'From:' or 'Date:' header continuations\n");
 			break;
 		case 1:
 			add_subject_line(line);
@@ -215,7 +226,8 @@ static void handle_rest(void)
 	cleanup_space(name);
 	cleanup_space(email);
 	cleanup_space(sub);
-	printf("Author: %s\nEmail: %s\nSubject: %s\n\n", name, email, sub);
+	cleanup_space(date);
+	printf("Author: %s\nEmail: %s\nSubject: %s\nDate: %s\n", name, email, sub, date);
 	FILE *out = cmitmsg;
 
 	do {
mailsplit.c: 9379fbc5e84983e5ea0754a6587cc3490c696c69

--
dwmw2
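The sed idiom that applypatch repeats for each header can be tried on its own with a throwaway info file. The helper name `info_field` below is invented for this sketch; the sed expression is the same shape as the ones in the patch.

```sh
# Print the value of one "Field: value" header from a mailinfo-style
# info file, using the same sed idiom as applypatch.
info_field() {
    sed -n "/^$1/ s/$1: //p" "$2"
}
```

With that in place the new line in applypatch would read `export AUTHOR_DATE="$(info_field Date .dotest/info)"`, which makes it easy to check by hand that the Date header survives the round trip through mailinfo.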
Mailing list feed.
If we just strip out the setting of $FROM and $MLIST, the script I use to feed bk-commits-head@vger.kernel.org is perfectly generic. Petr, can you include it in the tree so it gets updated as things change please? -- dwmw2 gitfeedmaillist.sh Description: application/shellscript
Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)
On Wed, 2005-04-20 at 07:59 -0700, Linus Torvalds wrote:
> 	external-parent
> 	comment for this parent
>
> and the nice thing about that is that now that information allows you to
> add external parents at any point.
>
> Why do it like this? First off, I think that the "initial import" ends up
> being just one special case of the much more _generic_ issue of having
> patches come in from other source control systems

This isn't about patches coming in from other systems -- it's about
_history_, and the fact that it's imported from another system is just
an implementation detail. It's git history now, and what we have here is
just a special case of wanting to prune ancient git history to keep the
size of our working trees down. You refer to this yourself...

> Secondly, we do need something like this for pruning off history anyway,
> so that the tools have a better way of saying "history has been pruned
> off" than just hitting a missing commit.

Having a more explicit way of saying "history is pruned" than just a
reference to a missing commit is a reasonable request -- but I really
don't see how we can do that by changing the now-oldest commit object to
contain an 'external-parent' field. Doing that would change the sha1 of
the commit object in question, and then ripple through all the
subsequent commits.

Come this time next year, if I decide I want to prune anything older
than 2.6.40 from all the trees on my laptop, it has to happen _without_
changing the commit objects which occur after my arbitrarily-chosen
cutoff point.

If we want to have an explicit record of pruning rather than just coping
with a missing object, then I think we'd need to do it with an external
note to say "It's OK that commit XXX is missing".

> Thirdly, I don't actually want my new tree to depend on a conversion of
> the old BK tree.
>
> Two reasons: if it's a really full conversion, there are definitely going
> to be issues with BitMover. They do not want people to try to reverse
> engineer how they do namespace merges

Don't think of it as "a conversion of the old BK tree". It's just an
import of Linux's development history. This isn't going to help
reverse-engineer how BK does merges; it's just our own revision history.
I'm not sure exactly how Thomas is extracting it, but AIUI it's all
obtainable from the SCCS files anyway without actually resorting to
using BK itself.

There's nothing here for Larry to worry about. It's not as if we're
actually using BK to develop git by observing BK's behaviour w.r.t
merges and trying to emulate it. Besides -- if we wanted to do that,
we'd need to use the _BK_ version of the tree; the git version wouldn't
help us much anyway. And given that BK's merges are based on individual
files and we're not going that route with git, it's not clear how much
we could lift directly from BK even if we _were_ going to try that.

> The other reason is just the really obvious one: in the last week, I've
> already changed the format _twice_ in ways that change the hash. As long
> as it's 119MB of data, it's not going to be too nasty to do again.

That's fine. But by the time we settle on a format and actually start
using it in anger, it'd be good to be sure that it _is_ possible to
track development from current trees all the way back -- be that with
explicit reference to pruned history as you suggest, or with absent
parents as I still prefer.

> it's not that it's necessarily the wrong thing to do, but I think it
> is the wrong thing to do _now_.

OK, time for us to keep arguing over the implementation details of how
we prune history then :)

--
dwmw2
Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)
On Wed, 2005-04-20 at 02:08 -0700, Linus Torvalds wrote:
> I converted my git archives (kernel and git itself) to do the SHA1
> hash _before_ the compression phase.

I'm happy to see that -- because I'm going to be asking you to make
another change which will also require a simple repository conversion.

We are working on getting the complete history since 2.4.0 into git
form. When it's done and checked (which should be RSN) I'd like you to
edit the first commit object in your tree -- the import of 2.6.12-rc2,
and give it a parent. That parent will be the sha1 hash of the
2.6.12-rc2 commit in the newly-provided history, and of course will
change the sha1 hash of your first commit, and all subsequent commits.
We'll provide a tool to do that, of course.

The history itself will be absent from your tree. Obviously we'll need
to make sure that the tools can cope with an absentee parent, probably
by just treating that case as if no parent exists. That won't be hard,
it'll be useful for people to prune their trees of unwanted older
history in the general case too. That history won't be lost or undone --
it'll just be archived elsewhere.

The reason for doing this is that without it, we can't ever have a full
history actually connected to the current trees. There'd always be a
break at 2.6.12-rc2, at which point you'd have to switch to an entirely
different git repository.

--
dwmw2
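Why giving the first commit a parent changes every later id can be shown with a toy chain. This is only an illustration under loud assumptions: `toy_commit_id` hashes just a parent id and a message, whereas a real commit object also carries tree, author and committer fields, and the sketch relies on `sha1sum` from GNU coreutils being available.

```sh
# Toy stand-in for a commit id: SHA-1 over the parent id plus a message.
# Real commit objects contain more than this.
toy_commit_id() {
    printf 'parent %s\n\n%s\n' "$1" "$2" | sha1sum | cut -d' ' -f1
}

# Build a three-commit chain on top of the given root parent and print
# the id of the tip.
toy_tip_id() {
    c1=$(toy_commit_id "$1" "import of 2.6.12-rc2")
    c2=$(toy_commit_id "$c1" "second commit")
    toy_commit_id "$c2" "third commit"
}
```

`toy_tip_id ""` and `toy_tip_id <some-parent-id>` print different tip ids even though the three messages are identical: the parent id is part of what gets hashed, so the change ripples forward through every descendant.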
Re: naive question
On Tue, 2005-04-19 at 23:00 +1000, Paul Mackerras wrote:
> Is there a way to check out a tree without changing the mtime of any
> files that you have already checked out and which are the same as the
> version you are checking out? It seems that checkout-cache -a doesn't
> overwrite any existing files, and checkout-cache -f -a overwrites all
> files and gives them the current mtime. This is a pain if you are
> using make and your tree is large (like, for instance, the linux
> kernel :), because it means that after a checkout-cache -f -a you get
> to recompile everything.

Corollary: why aren't we storing mtime in the tree objects?

--
dwmw2
Re: [PATCH] provide better committer information to commit-tree.c
On Mon, 2005-04-18 at 18:12 -0700, Greg KH wrote:
> Ok, then why display it as one?

Nobody ever displays it as one as far as I'm aware. That would be
something like "mailto:$COMMITTER"

> But I'll wait for Russell to wake up and start quoting the proper EU
> privacy laws that he feels causes him to be forced to obfuscate his
> email addresses in the changelog commits (as he did for the bk ones.)

He's talking about his own interpretation of the UK's Data Protection
Act, which requires you to be registered and fulfil certain other
requirements if you keep personal information about people in a
database. Email addresses have been ruled to be 'personal information'
in this context, but this _isn't_ an email address -- and there are
other get-out clauses for noncommercial situations such as this anyway,
I believe.

Besides, he can still obscure the author information as he unfortunately
insists on doing; it's the _committer_ information which we're
discussing here -- and that's always going to be himself in this case.

--
dwmw2
Re: [PATCH] provide better committer information to commit-tree.c
On Mon, 2005-04-18 at 17:45 -0700, Greg KH wrote:
> Well Russell has stated that he has to for EU Privacy reasons. And I'd
> like to do it as I don't have a local suse.de hostname for my laptop and
> my employer probably doesn't really want my [EMAIL PROTECTED] address
> showing up :)

Why not? Do they complain that we see '[EMAIL PROTECTED]' when you
connect to an IRC server? This _isn't_ an email address, and doesn't
really need to be treated as such.

--
dwmw2
Re: SCSI trees, merges and git status
On Mon, 2005-04-18 at 19:16 -0500, James Bottomley wrote:
> Yes, that's what I did to get back to the commit just before the
> merge:
>
> fsck-cache --unreachable 54ff646c589dcc35182d01c5b557806759301aa3|awk
> '/^unreachable /{print $2}'|sed 's:^\(..\):.git/objects/\1/:'|xargs rm

I was actually digressing and talking about pruning ancient history
which _is_ theoretically reachable. It's not being 'undone'; it's just
being omitted from the current _working_ tree. The whole point is that
in a fully-populated tree the history _should_ be accessible all the way
back.

We're trying to get the older history available on kernel.org ASAP. The
blobs are rsyncing to ~dwmw2/git/kernel-tglx1; the trees and commit
objects will be coming soon. Theoretically all Linus actually needs in
order to rebuild his current tree is the sha1 hash of the final commit
in that historical tree, which corresponds to 2.6.12-rc2.

--
dwmw2
Re: SCSI trees, merges and git status
On Mon, 2005-04-18 at 17:03 -0700, Linus Torvalds wrote:
> Git does work like BK in the way that you cannot remove history when you
> have distributed it. Once it's there, it's there.

But older history can be pruned, and there's really no reason why an
http-based 'git pull' couldn't simply refrain from fetching commits
older than a certain threshold.

However, we can't _add_ the history if the current commits don't refer
to it. I really think we should take the imported git history and make
our 'current' tree refer to it -- even if just by having an appropriate
'parent' record in what is currently the oldest changeset in our tree;
the 2.6.12-rc2 import. It doesn't matter that our oldest commit object
refers to a nonexistent parent, but that does allow us to import
historical data if we _want_ to, and have it all work properly.

We should have the full historical git repo available within a day or
so, I believe. It would be really useful if we could make the current
trees refer back to that, instead of starting at 2.6.12-rc2.

--
dwmw2
Re: [patch] fixup GECOS handling
On Mon, 2005-04-18 at 12:36 +0200, Martin Schlemmer wrote:
> 	realgecos[strchr(realgecos, ',') - realgecos] = '\0';

Er,	*strchr(realgecos, ',') = 0;
surely? Even if the compiler is clever enough to optimise out the
gratuitous addition and subtraction, that's no real excuse for it.

--
dwmw2
Re: [PATCH] Pretty-print date in 'git log'
On Mon, 2005-04-18 at 12:27 +0200, Petr Baudis wrote:
> Yes. As far as I'm concerned, I'd put such stuff to git log, and extend
> it usage so that it is possible to print individual log entries with it
> - just make it accept a _range_ of commits, and then do
>
> 	git log $commit $commit

That's fairly trivial. In the current (and misguided) version with
chronological output, rev-tree will do it all for you, in fact:

	rev-tree $1 ^$2

In the older and more useful version, it was only slightly more complex:

 base=$(gitXnormid.sh -c $1) || exit 1

+if [ -n "$2" ]; then
+	endpoint=$(gitXnormid.sh -c $2) || exit 1
+	if rev-tree $base $endpoint | grep -q $base:3; then
+		base=
+	else
+		rev-tree --edges $base $endpoint | sed 's/[a-z0-9]*:1//g' > $TMPCL
+	fi
+fi
 changelog $base
 rm $TMPCL $TMPCM

--
dwmw2
[PATCH] Pretty-print date in 'git log'
Add tool to render git's " " into an RFC2822-compliant string, because I
don't think date(1) can do it. Use same for 'git log' output.

Signed-off-by: David Woodhouse <[EMAIL PROTECTED]>

--- Makefile
+++ Makefile	2005-04-18 15:40:43.0 +1000
@@ -14,7 +14,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-base
+	check-files ls-tree merge-base show-date
 
 SCRIPT=	parent-id tree-id git gitXnormid.sh gitadd.sh gitaddremote.sh \
 	gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
--- gitlog.sh
+++ gitlog.sh	2005-04-18 15:39:38.0 +1000
@@ -13,6 +13,23 @@
 
 rev-tree $base | sort -rn | while read time commit parents; do
 	echo commit ${commit%:*};
-	cat-file commit $commit
+	cat-file commit $commit | while read type rest ; do
+		case "$type" in
+		"author"|"committer")
+			DATESTAMP="`echo $rest | cut -f2 -d\>`"
+			RFC2822DATE="`show-date $DATESTAMP 2>/dev/null || echo $DATESTAMP`"
+			echo $type $rest | sed "s/$DATESTAMP\$/ $RFC2822DATE/"
+			;;
+
+		"")
+			echo ""
+			cat
+			;;
+		*)
+			echo $type $rest
+			;;
+		esac
+	done
+
 	echo -e "\n--"
 done
--- show-date.c.orig	2005-04-18 15:43:06.0 +1000
+++ show-date.c	2005-04-18 15:42:15.0 +1000
@@ -0,0 +1,48 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include "cache.h"
+
+static const char *month_names[] = {
+	"Jan", "Feb", "Mar", "Apr", "May", "Jun",
+	"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
+};
+
+static const char *weekday_names[] = {
+	"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
+};
+
+int main(int argc, char **argv)
+{
+	time_t t;
+	int offset;
+	char *p;
+	struct tm tm;
+
+	if (argc != 3)
+		usage("usage: show-date ");
+
+	t = strtol(argv[1], &p, 0);
+	if (*p || !t)
+		usage("usage: show-date ");
+
+	if (argv[2][0] != '-' && argv[2][0] != '+')
+		usage("usage: show-date ");
+
+	offset = strtol(argv[2]+1, &p, 10);
+	if (*p || p != argv[2]+5)
+		usage("usage: show-date ");
+
+	if (argv[2][0] == '-')
+		offset = -offset;
+
+	offset = 60 * (offset % 100 + (offset / 100 * 60));
+
+	t += offset;
+	gmtime_r(&t, &tm);
+
+	printf("%s, %d %s %04d %02d:%02d:%02d %s\n",
+	       weekday_names[tm.tm_wday], tm.tm_mday, month_names[tm.tm_mon],
+	       tm.tm_year+1900, tm.tm_hour, tm.tm_min, tm.tm_sec, argv[2]);
+	return 0;
+}

--
dwmw2
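The only fiddly part of show-date.c is turning a "+HHMM" string into seconds, so here is the same arithmetic as a standalone shell function for experimenting. It is a sketch: `tz_offset_seconds` is an invented name, and unlike the C version it splits hours and minutes as strings so that a leading zero is never read as octal.

```sh
# Convert a "+HHMM" or "-HHMM" timezone string into seconds east of UTC.
# Returns non-zero for anything that does not match that shape.
tz_offset_seconds() {
    case $1 in
        [+-][0-9][0-9][0-9][0-9]) ;;
        *) return 1 ;;
    esac
    digits=${1#?}
    hh=${digits%??}
    mm=${digits#??}
    # Strip one leading zero so that e.g. "08" is not parsed as octal.
    hh=${hh#0}
    mm=${mm#0}
    secs=$(( (hh * 60 + mm) * 60 ))
    case $1 in
        -*) secs=$(( -secs )) ;;
    esac
    echo "$secs"
}
```

For "+1000" this gives (10 * 60 + 0) * 60 = 36000, which is the amount show-date adds to the timestamp before calling gmtime_r().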
Re: [RFC] General object parsing
On Sun, 2005-04-17 at 18:15 -0700, Linus Torvalds wrote:
> In particular, is there some easy way to walk backwards by time? "git log"
> definitely needs that, and merge-base clearly wants something similar.

Actually the ideal output of 'git log' isn't strictly chronological.
IIRC my bkexport scripts used to make a chronologically sorted list, and
I ended up changing it.

Simple example: if there are changesets which have been lurking in some
tree for months waiting for you to pull, and the only thing you did
since I ran 'git log' on your tree yesterday is pull from that tree,
then those changesets are what I want to see at the top of 'git log'
output.

In fact this means that the depth-first tree walking of the original
gitlog.sh is probably the right thing to do, but when we hit a merge we
want to try to make sure we process the _remote_ parent first.

Are we sorting the 'parent' links in merges so that two merges of the
same branches are guaranteed to be identical (assuming identical
contents otherwise)? Or is it just that we didn't think about it, and so
merges are putting the local and remote parents in the 'wrong' order by
coincidence?

-- dwmw2
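The ordering described above (depth-first, but descending into the remote parent of a merge before the local one) can be sketched as follows. This is only an illustration under stated assumptions: the struct layout, the name walk, and the convention that parents[0] is the local parent and parents[1] the remote one are all invented here, and the message itself asks whether git actually guarantees any such parent order.

```c
#include <stddef.h>

#define MAX_PARENTS 2

/* Hypothetical in-memory commit; not git's real data structure. */
struct commit {
	const char *id;
	int nparents;
	struct commit *parents[MAX_PARENTS]; /* assumed: [0] local, [1] remote */
	int seen;
};

/*
 * Depth-first walk that emits a commit, then follows its parents in
 * reverse order so that the remote side of a merge is listed first.
 * Writes commit ids into out[] and returns the new element count.
 */
static size_t walk(struct commit *c, const char **out, size_t n, size_t max)
{
	int i;

	if (!c || c->seen || n >= max)
		return n;

	c->seen = 1;
	out[n++] = c->id;

	for (i = c->nparents - 1; i >= 0; i--)
		n = walk(c->parents[i], out, n, max);

	return n;
}
```

For a merge M with local parent L1 and remote parent R2 (where R2 follows R1, and both L1 and R1 descend from B), this lists M, R2, R1, B, L1, so the freshly pulled changesets appear directly under the merge.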
Re: full kernel history, in patchset format
On Sun, 2005-04-17 at 18:16 -0700, Linus Torvalds wrote:
> Alternatively, you can have just the rev-tree cache of them. That's what
> it was designed for (along with avoiding to have to read 60,000 commits).

Purely from a conceptual POV I'd be a little happier with the history
just ending with a parent pointer to a commit object which is absent,
rather than having commit objects which point to _trees_ which are
absent. But I suppose I can't really justify that, and I'm not overly
bothered about it either.

The important thing to get right at this point is that the tree we all
work with should refer to the history, regardless of how we choose to
prune it. The current linux-2.6.git tree has a parentless commit for the
2.6.12-rc2 import, which is bad.

We should start with Thomas' git tree representing the real history, and
work from that. You don't even need to see his tree; you only need the
final sha1 hash of the commit in his tree which matches 2.6.12-rc2, so
you can use that as the 'parent' of the first change you import
yourself.

-- dwmw2
Re: full kernel history, in patchset format
On Mon, 2005-04-18 at 02:50 +0200, Petr Baudis wrote:
> I think I will make git-pasky's default behaviour (when we get
> http-pull, that is) to keep the complete commit history but only trees
> you need/want; togglable to both sides.

I think the default behaviour should probably be to fetch everything.

-- dwmw2
Re: full kernel history, in patchset format
On Mon, 2005-04-18 at 02:35 +0200, Petr Baudis wrote:
> > For the special case of removing history before 2.6.12-rc2 from the
> > trees, I certainly think we can do it by leaving out all the commits,
> > not just the trees. We can do that easily, but there's no way we can
> > _add_ that history retrospectively if we omit it in the first place.
>
> I'm confused by this paragraph, but that might be my English skills
> failing somehow.

"For the general case of people pruning their own trees, _maybe_ you're
right that it would be good to keep the commits even if we delete the
actual trees. But for history older than 2.6.12-rc2, that's a special
case -- I think we can happily delete the commits too.

"We can delete old trees/commits easily, but we can't _add_ them to the
existing linux-2.6.git tree, because the oldest commit in that tree
(b4ceb6e27e4cc3f37d26e04c4535c79b98a9f889) doesn't have a parent."

-- dwmw2
Re: full kernel history, in patchset format
On Mon, 2005-04-18 at 01:39 +0200, Petr Baudis wrote:
> I think this is bad, bad, bad. If you don't keep around all the
> _commits_, you get into all sorts of troubles - when merging, when doing
> git log, etc. And the commits themselves are probably actually pretty
> small portion of the thing. I didn't do any actual measurement but I
> would be pretty surprised if it would be much more than few megabytes of
> data for the kernel history.

I'm not sure it's that bad -- and everyone already seems perfectly happy
not to have history going back before 2.6.12-rc2.

We're not talking about doing this by _default_ -- we're talking about
allowing people to keep trees pruned if they _want_ to. So I might want
to drop history before 2.6.0 on my laptop, for example.

> Of course an entirely different thing are _trees_ associated with those
> commits. As long as you stay with a simple three-way merge, you
> basically never want to look at trees which aren't heads and which you
> don't specifically request to look at. And the trees and what they carry
> inside is the main bulk of data.

If the trees are absent and you're trying to merge, what do you gain
from having the commit objects?

And for the case of 'git log', I certainly think it's acceptable that
you lose out on those parts of prehistory which you've explicitly
removed from your local tree -- that's a feature, not a bug.

For the special case of removing history before 2.6.12-rc2 from the
trees, I certainly think we can do it by leaving out all the commits,
not just the trees. We can do that easily, but there's no way we can
_add_ that history retrospectively if we omit it in the first place.

For history older than 2.6.12-rc2 I'd suggest that it would be available
in a different place, and absent from the 'main' working tree that
everyone uses by default. The only difference we'd see in the working
tree is that the 2.6.12-rc2 commit -- the oldest commit in that tree --
would actually have an absentee parent instead of appearing to be an
import. And all the sha1 hashes of all subsequent commits would be
different, of course.

To allow pruning of older objects in the general case would be a little
bit harder than that, because as things stand you'd be re-fetching them
every time you rsync from elsewhere -- but that wouldn't really be hard
to fix if we care.

Either way, I think it can probably be done by omitting the commit
objects as well as the trees -- but the important point is that we
_should_ include a 'parent' pointer in the oldest commit of the tree
we're working with, pointing back to the imported history.

-- dwmw2
Re: Building git on Fedora
On Sun, 2005-04-17 at 19:25 -0400, jeff millar wrote:
> ln -sf /lib/modules/`uname -r`/build/include/linux
> /usr/local/include/linux
>
> This fix creates a symlink, on each boot up, in the local include
> directory that points to the kernel header files. If there's a better
> way to do this, I'm all ears.

What's wrong with the contents of the glibc-kernheaders package? Can you
file specific bugs if you're having problems?

In the long run, the answer is to convince Linus that we _really_ need
the kernel to have a set of header files defining the ABI which are fit
for public consumption, rather than having a horrid mix of private and
exportable bits throughout the contents of the include/ directory.

In the meantime, some poor mug has to clean the crap up and try to make
something suitable to live in /usr/include/linux -- and unfortunately at
the moment for Fedora that someone is me :)

Unless git is doing something with kernel-private headers that it
shouldn't, this probably wants to be discussed elsewhere -- most likely
in Bugzilla.

-- dwmw2
Re: full kernel history, in patchset format
On Sat, 2005-04-16 at 10:04 -0700, Linus Torvalds wrote:
> So I'd _almost_ suggest just starting from a clean slate after all.
> Keeping the old history around, of course, but not necessarily putting it
> into git now. It would just force everybody who is getting used to git in
> the first place to work with a 3GB archive from day one, rather than
> getting into it a bit more gradually.
>
> What do people think? I'm not so much worried about the data itself: the
> git architecture is _so_ damn simple that now that the size estimate has
> been confirmed, that I don't think it would be a problem per se to put
> 3.2GB into the archive. But it will bog down "rsync" horribly, so it will
> actually hurt synchronization until somebody writes the rev-tree-like
> stuff to communicate changes more efficiently..

Note that any given copy of a tree doesn't _need_ to keep all the
history back to the beginning of time. It's OK if the oldest commit
object in your tree actually refers back to a parent which doesn't exist
locally. I can well imagine that some people will want to keep their
trees pruned to keep only a few weeks of history, while other copies of
the tree will keep everything.

However, if we _don't_ base our current work on an existing import of
the kernel, then we don't retain that option. We can't just change the
'parent' field of your 2.6.12-rc2 import, without changing the sha1 hash
of _everything_ that happens thereafter.

So I'd say we should take Thomas' import, and base new work on that --
but then possibly leave out the older objects from the 'working'
repository which everyone is rsyncing from; just make them available in
a 'linux-history.git' object database elsewhere.

-- dwmw2
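The point that changing the 'parent' field of the first import changes the hash of everything after it follows from each commit's id covering its parent's id. A toy sketch of that chaining is below. It deliberately uses 32-bit FNV-1a as a stand-in for sha1 so that it stays self-contained, and toy_commit_id is a made-up function, not anything git provides; real commit objects also hash the tree, author and committer lines.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* 32-bit FNV-1a, used here ONLY as a small stand-in for sha1. */
static uint32_t fnv1a(uint32_t h, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--) {
		h ^= *p++;
		h *= 16777619u;
	}
	return h;
}

/*
 * Toy commit id: a hash over the parent's id followed by the commit
 * message. A parent_id of 0 stands for "no parent". Hypothetical helper.
 */
static uint32_t toy_commit_id(uint32_t parent_id, const char *message)
{
	unsigned char parent_bytes[4];
	uint32_t h = 2166136261u; /* FNV offset basis */

	parent_bytes[0] = (unsigned char)((parent_id >> 24) & 0xff);
	parent_bytes[1] = (unsigned char)((parent_id >> 16) & 0xff);
	parent_bytes[2] = (unsigned char)((parent_id >> 8) & 0xff);
	parent_bytes[3] = (unsigned char)(parent_id & 0xff);

	h = fnv1a(h, parent_bytes, sizeof(parent_bytes));
	h = fnv1a(h, message, strlen(message));
	return h;
}
```

Computing the id of the import once with no parent and once with a parent pointing at older history gives two different ids, and any commit built on top of the import inherits that difference. That is why the parent has to be chosen before new work is based on the import.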
Re: Re-done kernel archive - real one?
On Sun, 2005-04-17 at 15:22 -0700, randy_dunlap wrote:
> David did the commits-mailing-list script and I'm working on a
> commits web-page like what was formerly seen at:
> http://www.kernel.org/pub/linux/kernel/v2.6/testing/cset/
> (with daily tarball)
>
> based on some older scripts from David, however I'm wondering if
> a variant of the gitlog.sh script wouldn't be a better starting
> point for it.

My commits-list script is in fact based on gitlog.sh. You'll probably
find useful things to crib from in both that and the original
bkexport.sh script.

The commits script also wants updating to print the date properly now
that we've changed how it's stored -- I'll try to find some time this
week to update it and set it running on master.kernel.org again, but it
may end up waiting till after LCA.

-- dwmw2
Re: Re-done kernel archive - real one?
On Sat, 2005-04-16 at 16:01 -0700, Linus Torvalds wrote:
> So I re-created the dang thing (hey, it takes just a few minutes), and
> pushed it out, and there's now an archive on kernel.org in my public
> "personal" directory called "linux-2.6.git". I'll continue the tradition
> of naming git-archive directories as "*.git", since that really ends up
> being the ".git" directory for the checked-out thing.

Do you want the commits list running for it yet? Do you want the
changesets which are already in it re-mailed without a 'TESTING' tag?

-- dwmw2
Re: Merge with git-pasky II.
On Sat, 2005-04-16 at 17:33 +0200, Johannes Schindelin wrote:
> > But if it can be done cheaply enough at a later date even though we end
> > up repeating ourselves, and if it can be done _well_ enough that we
> > shouldn't have just asked the user in the first place, then yes, OK I
> > agree.
>
> The repetition could be helped by using a cache.

Perhaps. Since neither such a cache nor even the commit comments are
strictly part of the git data, they probably shouldn't be included in
the sha1 hash of the commit object. However, I don't see a fundamental
reason why we couldn't store them in the same file but omit them from
the hash calculations.

That also allows us to retrospectively edit commit comments without
completely changing the entire subsequent history.

Or is that a little too heretical a suggestion?

-- dwmw2
Re: Merge with git-pasky II.
On Fri, 2005-04-15 at 08:32 -0700, Linus Torvalds wrote:
>  - you're doing the work at the wrong point. Doing it _well_ is quite
>    expensive. So if you do it at commit time, you cannot _afford_ to do it
>    well, and you'll always fall back to doing an ass-backwards job that
>    doesn't really get you to the good state, and only gets you to a
>    not-very-interesting easy 1% of the solution (ie full file renames).
>
>  - you're doing the work at the wrong point for _another_ reason. You're
>    freezing your (crappy) algorithm at tree creation time, and basically
>    making it pointless to ever create something better later, because even
>    if hardware and software improves, you've codified that "we have to
>    have crappy information".

OK, I'm inclined to agree. The only thing that prevents me from
capitulating entirely and resubscribing to the "Torvalds is always
right" school is the concern that it _is_ expensive, and that's why I
originally wanted to do it at commit time because then it's a one-off
cost rather than recurring every time we want to track the history of a
given piece of content. Also because we actually have the developer's
attention at commit time, and we can get _real_ answers from the user
about what she was doing, instead of having to guess.

But if it can be done cheaply enough at a later date even though we end
up repeating ourselves, and if it can be done _well_ enough that we
shouldn't have just asked the user in the first place, then yes, OK I
agree.

-- dwmw2
Re: Merge with git-pasky II.
On Fri, 2005-04-15 at 07:53 -0700, Linus Torvalds wrote:
> Files DO NOT matter. Never have. It's an implementation limitation to
> think they do. You'll screw yourself up, and when somebody comes up with a
> half-way efficient way to generate inter-file diffs, your architecture is
> totally and utterly unable to handle it.
>
> I don't care what you do at an SCM level, and if the crud you put on top
> of git wants to perpetuate mistakes of yesteryear, that's _your_ issue.
> But dammit, git is designed to do the right thing, and I will fight tooth
> and nail against anybody who thinks individual files matter.

No, really: individual files _DO_ matter. There's a reason we split
stuff up into separate files, and if you look closely you'll find that
we don't just randomly put different functions into different files with
neither rhyme nor reason -- there's a pattern to it; usually some kind
of functional grouping.

And when I'm looking for the change that broke something, I can almost
always tell which file it's in and go looking in _that_ file. It's a
_whole_ lot easier to use the equivalent of 'bk revtool' than it is to
sift through all the unrelated commits in the whole tree.

If that's an implementation limitation, then it's an implementation
limitation in my _brain_ not just in my tools.

OK, in fact it shouldn't be 'show me the history of this file'; it's
often really 'show me the history of this function' which I want. But
that's fine. All I'm suggesting is that we should include the metadata
which says "content moved from file XXX to file YYY" along with the
commit objects.

I'm certainly not suggesting that we should implement jejb's idea of
explicit 'file revision history' objects -- the tree-based philosophy is
perfectly sane and sufficient. But we do _also_ need a little
information which allows us to track content as it moves around within
the tree, and the SCM has to have a sane way to filter out the noise
when we're looking for what broke.

Yes, that's part of the SCM functionality, and can live in an xattr-type
field in the commit object -- but it does need to be stored, and in
practice I suspect it _will_ be useful for merging too.

It's not about ditching the per-tree tracking and doing per-file
tracking instead. I agree that would be wrong. It's about storing enough
information to track what happened to given content as it moved around
within the tree.

-- dwmw2
Re: Merge with git-pasky II.
On Fri, 2005-04-15 at 16:53 +0200, Ingo Molnar wrote:
> but the specific scenario you described would require _Linus'_ tree to
> be in limbo for a long time, and have uncommitted half-done edits.
> I.e.:
>
>          (A1B2)--(A2B2)--(A2'B3)
>         /     \   /             \
>        /       \ /               \
> (A1B1)          X                 (...)
>        \       / \               /
>         \     /   \             /
>          (A2B1)--(A2B2)--(A3B2')
>
> in the above scenario Linus' tree needs to 'cross' with a maintainer's
> tree. (maintainer's tree won't cross with another maintainer's tree,
> as maintainer-to-maintainer merges are rare.)

Is that true? Consider (A2B1) to be a bugfixes-only tree which I make
available for Linus to pull from. I keep doing more experimental stuff
in my own private copy of the tree along the bottom branch, while Linus
_eventually_ responds to my pull request and moves on, stopping only to
add a 'static' to one of my new functions. I move on too but don't pull
from Linus again for a little while; the final merge happens when I _do_
pull again.

-- dwmw2
Re: Handling renames.
On Fri, 2005-04-15 at 13:37 +, [EMAIL PROTECTED] wrote:
> > One option for optimising this, if we really need to, might be to track
> > the file back to its _first_ ancestor and use that as an identification.
> > The SCM could store that identifier in the blob itself, or we could
> > consider it an 'inode number' and store it in git's tree objects.
>
> This suggestion (and this whole discussion about renames) has issues
> with file copies, which form a branch in the revision history. If I
> copy foo.c to foo2.c (or fs/ext2/ to fs/ext3/), then the oldest ancestor
> isn't a "unique inode number".

That's why I prefer the option of simply annotating the moves. They
don't need to be just renames -- it can cover the cases where files are
split up or merged into one, to indicate where the history of the given
_data_ is coming from.

-- dwmw2
Re: Merge with git-pasky II.
On Fri, 2005-04-15 at 11:36 +0200, Ingo Molnar wrote:
> do such cases occur frequently? In the kernel at least it's not too
> typical.

Isn't it? I thought it was a fairly accurate representation of the
process "I make a whole bunch of changes to files I maintain, pulling
from Linus while occasionally asking him to pull from my tree. Sometimes
my files are changed by someone else in Linus' tree, and sometimes I
change files that I don't actually own.".

-- dwmw2
Re: Merge with git-pasky II.
On Thu, 2005-04-14 at 17:42 -0700, Linus Torvalds wrote:
> I've not even been convinced that renames are worth it. Nobody has
> really given a good reason why.
>
> There are two reasons for renames I can think of:
>
>  - space efficiency in delta-based trees.
>  - "annotate".

Neither of those were my motivation for looking at renames. The reasons
I wanted to track renames were:

 - Per-file revision history which doesn't stop dead at a rename.
 - Merging where files have been renamed in one branch and modified in
   another. Which is basically a special case of the above; we need to
   see the per-file revision history.

>    So I'd seriously suggest that instead of worrying about renames, people
>    think about global diffs that aren't per-file. Git is good at limiting
>    the changes to a set of objects, and it should be entirely possible to
>    think of diffs as ways of moving lines _between_ objects and not just
>    within objects. It's quite common to move a function from one file to
>    another - certainly more so than renaming the whole file.
>
>    In other words, I really believe renames are just a meaningless special
>    case of a much more interesting problem. Which is just one reason why
>    I'm not at all interested in bothering with them other than as a "data
>    moved" thing, which git already handles very well indeed.

Git doesn't handle 'data moved' except at a whole-tree level. For each
commit, it says "these are the old trees; this is the new tree". Git
doesn't actually look hard into the contents of the tree; certainly it
has no business looking at the contents of individual files; that is
something that the SCM or possibly only the user should do.

The storage of 'rename' information in the commit object is another kind
of 'xattr' storage which git would provide but not directly interpret.

And you're right; it shouldn't have to be for renames only. There's no
need for us to limit it to one "source" and one "destination"; the SCM
can use it to track content as it sees fit.

As I said, the main aim of this is to track revision history of given
content, for displaying to the user and for performing merges. So when a
file is split up, or a function is moved from it to another file, a
'rename' xattr can be included to mark that files 'foo' and 'bar' in the
new tree are both associated with file 'wibble' in the parent.

That's as much as we need to provide for content tracking, and it _does_
handle the general case as well as we should be attempting to. We don't
want to get into dealing with file contents ourselves; we just want to
store the hint for the SCM or the user that "your data went thataway".

-- dwmw2
Re: Merge with git-pasky II.
On Thu, 2005-04-14 at 11:36 -0700, Linus Torvalds wrote:
> And "merge these two trees" (which works on a _tree_ level)
> or "find the common commit" (which works on a _commit_ level)

I suspect that finding the common commit is actually a per-file thing;
it's not just something you do for the _commit_ graph, then use for
merging each file in the two branches you're trying to merge.

Consider a simple repository which contains two files A and B. We start
off with the first version of each ('A1B1'), and the owner of each file
takes a branch and modifies their own file. There is cross-pulling
between the two, and then each modifies the _other's_ file as well as
their own...

         (A1B2)--(A2B2)--(A2'B3)
        /     \   /             \
       /       \ /               \
(A1B1)          X                 (...)
       \       / \               /
        \     /   \             /
         (A2B1)--(A2B2)--(A3B2')

Now, we're trying to merge the two branches. It appears that the most
useful common ancestor to use for a three-way merge of file A is the
version from tree 'A2B1', while the most useful common ancestor for
merging file B is that in 'A1B2'.

(I think it's a coincidence that in my example the useful files 'A2' and
'B2' actually do end up in a single tree together at some point.)

-- dwmw2
Re: Handling renames.
On Thu, 2005-04-14 at 18:23 -0400, Daniel Barkalow wrote:
> I personally think renames are a minor thing that doesn't happen
> much. What actually happens, in my opinion, is that some chunk of a
> file is moved to a different, possibly new, file. If this is supported
> (as something that the SCM notices), then a rename is just a special
> case where the moved chunk is a whole file.

Certainly we'd discussed the possibility that the 'rename' field may
contain more than one destination, or more than one source filename.
This could happen when a file is split into two, or when two files are
merged into one, for example.

-- dwmw2
Re: Date handling.
On Thu, 2005-04-14 at 14:01 -0700, H. Peter Anvin wrote:
> Both of these are metadata; they may not be directly relevant to the
> filesystem, but are attributes relevant to the client thereof;
> effectively an xattr.

Right. That's perfectly acceptable -- and that's the reason why I think
it's also fine to keep the timezone and the rename information in there
too.

If we were being _really_ anal about auxiliary information being
separate, we'd stick it in a separate blob object and merely refer to it
from the commit object. I don't think there's really any call to take it
that far, though.

-- dwmw2
RE: Date handling.
On Thu, 2005-04-14 at 12:42 -0700, Luck, Tony wrote:
> This is a very good point ... but this still has problems with the
> "git is a filesystem, not a SCM" mantra. Timezone comments don't
> belong in the git inode.

Yeah, but really I'd want to see other serious users of it before I'd
accept that the timezone information _really_ needs to be stored
separately. After all, the committer and author information really
wouldn't be considered part of the _filesystem_ either.

-- dwmw2
Re: Date handling.
On Thu, 2005-04-14 at 12:19 -0700, [EMAIL PROTECTED] wrote:
> With a UTC date, why would anyone care in which timezone the commit was
> made? Any pretty printing would most likely be prettiest if it is done
> relative to the timezone of the person looking at the commit record, not
> the person who created the record.

I'd prefer not to lose the information. If someone has committed a
change at 2am, I like to know that it was 2am for _them_. It helps me
decide where to look first for the cause of problems. :)

It also helps disambiguate certain comments, especially those involving
words or phrases such as "yesterday" or "this afternoon".

-- dwmw2
Re: Handling renames.
On Thu, 2005-04-14 at 20:58 +0200, Ingo Molnar wrote:
> The thing i tried to avoid was to list long filenames in the commit
> (because of the tree hierarchy we'd need to do tree-absolute pathnames
> or something like that, and escape things, and do lookups - duplicating
> a VFS which is quite bad) - it would be better to identify the rename
> source and target via its tree object hash and its offset within that
> tree. Such information could be embedded in the commit object just fine.
> Something like:

Actually I'm not sure that's true. Let's consider the two main users of
this information.

Firstly, because it's what I've been playing with: to list a given
file's revision history, I currently work with its filename -- walk the
commit objects, inspecting the tree and selecting those commits where
the file has changed. If my filename is 'fs/jffs2/inode.c' then I can
immediately skip over a commit where the 'fs' entry in the top-level
tree is identical to that in the parent, or I can skip a commit where
the 'jffs2' entry in the 'fs' subtree is identical to the parent... it's
all done on filename, and the {parent, entry} tuple wouldn't help much
here; I'd probably have to convert back to a filename anyway.

Secondly, there's merges. I've paid less attention to these (see mail 5
minutes ago) but I think they'd end up operating on the rename
information in a very similar way. To find a common ancestor for a given
file, we want to track its name as it changed during history; at that
point it's all string compares.

-- dwmw2
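The filename-driven skipping described above (stop descending as soon as an entry on the path has the same object id in both trees) might look roughly like the sketch below. The structures are a simplified in-memory model invented for this illustration, with short strings standing in for sha1 ids; they are not git's tree format, and path_changed is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

struct tree;

/* Simplified tree entry: a name, an object id, and an optional subtree. */
struct entry {
	const char *name;
	const char *hash;           /* stand-in for the sha1 of blob/subtree */
	const struct tree *subtree; /* NULL for a blob */
};

struct tree {
	size_t nentries;
	const struct entry *entries;
};

/* Find the entry called name[0..len) in tree t, or NULL. */
static const struct entry *lookup(const struct tree *t,
				  const char *name, size_t len)
{
	size_t i;

	for (i = 0; t && i < t->nentries; i++)
		if (strlen(t->entries[i].name) == len &&
		    !memcmp(t->entries[i].name, name, len))
			return &t->entries[i];
	return NULL;
}

/*
 * Return 1 if 'path' differs between trees a and b, 0 otherwise.
 * Descends one path component at a time and returns early when an
 * entry has the same hash on both sides, since nothing beneath an
 * identical subtree can differ.
 */
static int path_changed(const struct tree *a, const struct tree *b,
			const char *path)
{
	for (;;) {
		const char *slash = strchr(path, '/');
		size_t len = slash ? (size_t)(slash - path) : strlen(path);
		const struct entry *ea = lookup(a, path, len);
		const struct entry *eb = lookup(b, path, len);

		if (!ea || !eb)
			return ea != eb;  /* present on one side only */
		if (!strcmp(ea->hash, eb->hash))
			return 0;         /* identical entry: skip commit */
		if (!slash)
			return 1;         /* last component differs */

		a = ea->subtree;
		b = eb->subtree;
		path = slash + 1;
	}
}
```

With 'fs/jffs2/inode.c' as the path, a commit that only touched something outside 'fs' is rejected after a single comparison at the top level, which is the shortcut the message relies on.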
Re: Handling renames.
On Thu, 2005-04-14 at 11:11 -0700, Linus Torvalds wrote:
> So, you really need to think of git as a filesystem. You can then
> implement an SCM _on_top_of_it_, which means that your second suggestion
> is not only acceptable, it really is the _only_ way to handle this in git:
>
> > So a commit involving a rename would look something like this...
> >
> > tree 82ba574c85e9a2e4652419c88244e9dd1bfa8baa
> > parent bb95843a5a0f397270819462812735ee29796fb4
> > rename foo.c bar.c
> > author David Woodhouse <[EMAIL PROTECTED]> 1113499881 +0100
> > committer David Woodhouse <[EMAIL PROTECTED]> 1113499881 +0100
> > Rename foo.c to bar.c and s/foo_/bar_/g
>
> Except I want that empty line in there, and I want it in the "free-form"
> section. The "rename" part really isn't part of the git header. It's not
> what git tracks, it was tracked by an SCM system on top of git.

Note that not only may you have a _set_ of renames, but you'll also have
a _different_ set of renames for each parent. Consider the
representation of a merge where a file was called 'foo' in one parent,
'bar' in the other, and we called it 'foobar' in the resulting tree.
That's the main reason I wanted the renames in with the parent
information -- so it's <...>...

I see your point though and I can't be bothered to argue for the sake of
the slight efficiency benefit we might gain from doing it that way. The
implementation details really aren't that interesting right now.

Let us assume, however, that we have this information somehow stored in
each commit object. It's perfectly sufficient from the POV of the 'git
revtool' which I've been poking at; is it good enough for merges?

Consider a simple case: A branch is taken, file foo.c is renamed to
bar.c, and now we're trying to merge that branch back into the head,
which has moved on. We can't just take 'bar.c' as a new file -- we have
to track it all the way back to its inception, and notice that it
actually shares a common ancestor with 'foo.c' in the other parent of
the merge.

How feasible, and how computationally expensive, is that task going to
be? Especially given that there may be _many_ new files that we need to
attempt to tie up with their partners, across many potential renames.

One option for optimising this, if we really need to, might be to track
the file back to its _first_ ancestor and use that as an identification.
The SCM could store that identifier in the blob itself, or we could
consider it an 'inode number' and store it in git's tree objects.

If we can avoid that, however, it would be nice. How feasible is the
merge going to be without it?

-- dwmw2
Handling renames.
I've been looking at tracking file revisions. One proposed solution was
to have a separate revision history for individual files, with a new
kind of 'filecommit' object which parallels the existing 'commit',
referencing a blob instead of a tree. Then trees would reference such
objects instead of referencing blobs directly.

I think that introduces a lot of redundancy though, because 99% of the
time, the revision history of the individual file is entirely
reproducible from the revision history of the tree. It's only when files
are renamed that we fall over -- and I think we can handle renames
fairly well if we just log them in the commit object.

My 'gitfilelog.sh' script is already capable of tracking a given file
back through multiple tree commits, listing those commits where the file
in question was actually changed. It uses my patched version of
diff-tree which supports 'diff-tree ' in order to do this.

By storing rename information in the commit object, the script (or a
reimplementation of a similar algorithm) could know when to change the
filename it's looking for, as it goes back through the tree. That ought
to be perfectly sufficient.

So a commit involving a rename would look something like this...

tree 82ba574c85e9a2e4652419c88244e9dd1bfa8baa
parent bb95843a5a0f397270819462812735ee29796fb4
rename foo.c bar.c
author David Woodhouse <[EMAIL PROTECTED]> 1113499881 +0100
committer David Woodhouse <[EMAIL PROTECTED]> 1113499881 +0100
Rename foo.c to bar.c and s/foo_/bar_/g

Opinions? Dissent? We'd probably need to escape the filenames in some
way -- handwave over that for now.

-- 
dwmw2
Re: Date handling.
On Thu, 2005-04-14 at 02:12 -0700, Linus Torvalds wrote:
> I take that back. I'd be much happier with you doing and testing it,
> because now I'm crashing.

OK. commit-tree now eats RFC2822 dates as AUTHOR_DATE because that's
what you're going to want to feed it. We store seconds since UTC epoch,
we add the author's or committer's timezone as auxiliary data so that
dates can be pretty-printed in the original timezone later if anyone
cares.

I left the date parsing in rev-tree.c for backward compatibility but it
can be dropped when we change to base64 :)

Yes, glibc sucks and strptime is a pile of crap. We have to parse it
ourselves.

Index: commit-tree.c
--- 1756b578489f93999ded68ae347bef7d6063101c/commit-tree.c (mode:100664 sha1:12196c79f31d004dff0df1f50dda67d8204f5568)
+++ 82ba574c85e9a2e4652419c88244e9dd1bfa8baa/commit-tree.c (mode:100644 sha1:35cb09402c9868499bcaf6de42afbad9fdfebe05)
@@ -7,6 +7,9 @@
 #include
 #include
+#include
+#include
+#include
 
 #define BLOCKING (1ul << 14)
 #define ORIG_OFFSET (40)
@@ -95,6 +98,148 @@
 	}
 }
 
+static const char *month_names[] = {
+	"Jan", "Feb", "Mar", "Apr", "May", "Jun",
+	"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
+};
+
+static const char *weekday_names[] = {
+	"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
+};
+
+
+static char *skipfws(char *str)
+{
+	while (isspace(*str))
+		str++;
+	return str;
+}
+
+
+/* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
+   (i.e. English) day/month names, and it doesn't work correctly with %z. */
+static void parse_rfc2822_date(char *date, char *result, int maxlen)
+{
+	struct tm tm;
+	char *p;
+	int i, offset;
+	time_t then;
+
+	memset(&tm, 0, sizeof(tm));
+
+	/* Skip day-name */
+	p = skipfws(date);
+	if (!isdigit(*p)) {
+		for (i=0; i<7; i++) {
+			if (!strncmp(p,weekday_names[i],3) && p[3] == ',') {
+				p = skipfws(p+4);
+				goto day;
+			}
+		}
+		return;
+	}
+
+	/* day */
+ day:
+	tm.tm_mday = strtoul(p, &p, 10);
+
+	if (tm.tm_mday < 1 || tm.tm_mday > 31)
+		return;
+
+	if (!isspace(*p))
+		return;
+
+	p = skipfws(p);
+
+	/* month */
+
+	for (i=0; i<12; i++) {
+		if (!strncmp(p, month_names[i], 3) && isspace(p[3])) {
+			tm.tm_mon = i;
+			p = skipfws(p+strlen(month_names[i]));
+			goto year;
+		}
+	}
+	return; /* Error -- bad month */
+
+	/* year */
+ year:
+	tm.tm_year = strtoul(p, &p, 10);
+
+	if (!tm.tm_year && !isspace(*p))
+		return;
+
+	if (tm.tm_year > 1900)
+		tm.tm_year -= 1900;
+
+	p=skipfws(p);
+
+	/* hour */
+	if (!isdigit(*p))
+		return;
+	tm.tm_hour = strtoul(p, &p, 10);
+
+	if (!tm.tm_hour > 23)
+		return;
+
+	if (*p != ':')
+		return; /* Error -- bad time */
+	p++;
+
+	/* minute */
+	if (!isdigit(*p))
+		return;
+	tm.tm_min = strtoul(p, &p, 10);
+
+	if (!tm.tm_min > 59)
+		return;
+
+	if (isspace(*p))
+		goto zone;
+
+	if (*p != ':')
+		return; /* Error -- bad time */
+	p++;
+
+	/* second */
+	if (!isdigit(*p))
+		return;
+	tm.tm_sec = strtoul(p, &p, 10);
+
+	if (!tm.tm_sec > 59)
+		return;
+
+	if (!isspace(*p))
+		return;
+
+ zone:
+	p = skipfws(p);
+
+	if (*p == '-')
+		offset = -60;
+	else if (*p == '+')
+		offset = 60;
+	else
+		return;
+
+	if (!isdigit(p[1]) || !isdigit(p[2]) || !isdigit(p[3]) || !isdigit(p[4]))
+		return;
+
+	i = strtoul(p+1, NULL, 10);
+	offset *= ((i % 100) + ((i / 100) * 60));
+
+	if (*(skipfws(p + 5)))
+		return;
+
+	then = mktime(&tm); /* mktime appears to ignore the GMT offset, stupidly */
+	if (then == -1)
+		return;
+
+	then -= offset;
+
+	snprintf(result, maxlen, "%lu %5.5s", then, p);
+}
+
 /*
  * Having more than two parents may be strange, but hey, there's
  * no conceptual reason why the file format couldn't accept multi-way
@@ -114,10 +259,12 @@
 	unsigned char commit_sha1[20];
 	char *gecos, *realgecos;
 	char *email, realemail[1000];
-	char *date, *realdate;
+	char date[20], realdate[20];
+	char *audate;
 	char comment[1000];
 	struct passwd *pw;
 	time_t now;
+	struct tm *tm;
 	char *buffer;
 	unsigned int size;
 
@@ -142,15 +289,19 @@
Re: Date handling.
On Thu, 2005-04-14 at 02:00 -0700, Linus Torvalds wrote:
> I do like text output, but if it is painful, the "unix seconds" format is
> certainly a hell of a lot simpler. And quite frankly, if we change it, we
> might as well just change it all the way. So I'd almost prefer (1).

Text _output_ is easy to generate; we don't need to store text in the
database for that. So I've changed my mind -- I prefer (1) too.

-- 
dwmw2
Date handling.
The date handling is somewhat unreliable. We render dates into textual
representation using the committer's locale (day names, etc), then later
attempt to interpret that in some other locale. And we were just using
localtime without even specifying the timezone so the timestamp was
fairly randomised anyway.

In fact, an $AUTHOR_DATE environment variable was making its way into
the database entirely unchecked.

I see two possible solutions:

1. Just store seconds-since-GMT-epoch and if we really want, the
   timezone as auxiliary information.

2. Store dates in RFC2822 form.

Unless someone convincingly expresses a preference before I get to work
and start playing with it, I'll implement the latter.

-- 
dwmw2