Re: empty patch-2.6.13-git? patches on ftp.kernel.org

2005-09-04 Thread David Woodhouse
On Sun, 2005-09-04 at 17:31 +0200, Jan Dittmer wrote:
> David Woodhouse wrote:
> > On Fri, 2005-09-02 at 02:00 -0700, Linus Torvalds wrote:
> > 
> >>Ahh. Please change that to
> >>
> >>rm -rf tmp-empty-tree
> >>mkdir tmp-empty-tree
> >>cd tmp-empty-tree
> >>git-init-db
> >>
> >>because otherwise you'll almost certainly hit something else later
> >>on..
> > 
> > 
> > OK, done. 
> > 
> 
> -git4 is again empty

Hm, yes.

+ rm -rf tmp-empty-tree
+ mkdir tmp-empty-tree
+ cd tmp-empty-tree
+ git-init-db
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/branches/: Permission denied
+ unset GIT_DIR
+ git-read-tree f505380ba7b98ec97bf25300c2a58aeae903530b
fatal: unable to create new cachefile

Fixed now; thanks.
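
The trace suggests that git-init-db ran while GIT_DIR still pointed at the shared linux-2.6.git repository, and that the 'unset GIT_DIR' came one step too late. A minimal sketch of one way to keep a scratch-tree command away from an inherited GIT_DIR; the helper name without_git_dir is made up for illustration and is not part of the snapshot script:

```shell
# Run a command with GIT_DIR removed from its environment. The subshell
# keeps the caller's own GIT_DIR setting untouched.
# (without_git_dir is a hypothetical helper, not part of git or cogito.)
without_git_dir() {
    ( unset GIT_DIR; "$@" )
}

# Intended use inside the scratch directory, e.g.:
#   without_git_dir git-init-db
```

Because the unset happens in a subshell, later steps that still need GIT_DIR see it unchanged.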

-- 
dwmw2


-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: empty patch-2.6.13-git? patches on ftp.kernel.org

2005-09-02 Thread David Woodhouse
On Wed, 2005-08-31 at 15:34 +0200, Tomasz Kłoczko wrote:
> Seems patches stored on ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots
> are empty (only logs are correct):

> -rw-r--r--    1 536      536            20 Aug 30 09:01 patch-2.6.13-git1.gz
> -rw-r--r--    1 536      536            20 Aug 31 09:01 patch-2.6.13-git2.gz

Hm. git-diff-cache now refuses to operate unless there's a local
'.git/refs' directory, even when working with a separate object
directory. So this doesn't work any more...

rm -rf tmp-empty-tree
mkdir -p tmp-empty-tree/.git
cd tmp-empty-tree

git-read-tree $CURCOMM
git-checkout-cache Makefile
perl -pi -e "s/EXTRAVERSION =.*/EXTRAVERSION = $EXTRAVERSION/" Makefile
git-diff-cache -m -p $RELTREE | gzip -9 > $STAGE/patch-$CURNAME.gz

I've changed the script to create 'tmp-empty-tree/.git/refs' and
replaced 2.6.13-git[12] with real patches.
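
The directory layout the fixed script needs, plus a cheap check that would have caught the empty patches, can be sketched as follows. The function names and the caller-supplied parent directory are assumptions for illustration; only the 'tmp-empty-tree/.git/refs' path and the gzip step come from the message above.

```shell
# Recreate an empty scratch tree that has the '.git/refs' directory the
# tools insist on. (make_scratch_tree is a hypothetical name.)
make_scratch_tree() {
    parent=$1
    rm -rf "$parent/tmp-empty-tree" &&
    mkdir -p "$parent/tmp-empty-tree/.git/refs"
}

# True if the gzipped patch decompresses to at least one byte.
# (patch_has_content is a hypothetical name; assumes gzip is installed.)
patch_has_content() {
    size=$(( $(gzip -dc "$1" 2>/dev/null | wc -c) ))
    [ "$size" -gt 0 ]
}
```

A snapshot run could refuse to publish when patch_has_content fails on the freshly written patch.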

> Also it will be good move all patch-2.6.12* and patch-2.6.13-rc* files 
> from this directory to old subdirectory.

Done.

-- 
dwmw2




Re: Linux BKCVS kernel history git import..

2005-07-27 Thread David Woodhouse
On Wed, 2005-07-27 at 08:29 -0700, Linus Torvalds wrote:
> I used to think I wanted to, but these days I really don't. One of the
> reasons is that I expect to try to pretty up the old bkcvs conversion some
> time: use the name translation from the old "shortlog" scripts etc, and
> see if I can do some other improvements on the conversion (I think I'll
> remove the BK files - "ChangeSet" etc).

Thomas has done all that; it's on kernel.org already.

> And it's really much easier and more general to have a "graft" facility.  
> It's something that git can do trivially (literally a hook in
> "parse_commit" to add a special parent), and it's actually a generic
> mechanism exactly for issues like this ("project had old history in some
> other format").

Hm, OK. That works and can also be used for the "fake _absence_ of
parent" thing -- if I'm space-constrained and want only the history back
to some relatively recent point like 2.6.0, I can do that by turning the
2.6.0 commit into an orphan instead of also using all the rest of the
history back to 2.4.0. 
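
The graft idea reduces to a table that overrides the parents of each listed commit. As a sketch, assuming the one-line-per-commit layout that git later used for its info/grafts file (a commit id followed by the parent ids it should appear to have), both cases look like this. The function name parents_of is hypothetical.

```shell
# Look up the parents a commit should appear to have. If the grafts
# file lists the commit, the rest of that line wins (it may be empty,
# which turns the commit into an orphan); otherwise the real parents,
# passed as the remaining arguments, are used.
# (parents_of is a hypothetical helper, for illustration only.)
parents_of() {
    grafts=$1 commit=$2
    shift 2
    if [ -f "$grafts" ] && grafted=$(awk -v c="$commit" '
            $1 == c { $1 = ""; sub(/^ +/, ""); print; hit = 1; exit }
            END { exit !hit }' "$grafts"); then
        echo "$grafted"
    else
        echo "$*"
    fi
}

# Example grafts file contents:
#   <first commit of current tree> <last commit of old history>   (stitch)
#   <2.6.0 commit>                                                 (cut off)
```

A line with extra ids adds a fake parent; a line with only the commit id fakes the absence of one.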

-- 
dwmw2



Re: Linux BKCVS kernel history git import..

2005-07-27 Thread David Woodhouse
On Tue, 2005-07-26 at 11:57 -0700, Linus Torvalds wrote:
> If somebody adds some logic to "parse_commit()" to do the "fake parent"
> thing, you can stitch the histories together and see the end result as one
> big tree. Even without that, you can already do things like
> 
> git diff v2.6.10..v2.6.12

That's a bit of a hack which really doesn't belong in the git tools.
It's not particularly hard to reparent the tree for real -- I'd much
rather see a tool added to git which can _actually_ change the
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 commit to have a parent of
0bcc493c633d78373d3fcf9efc29d6a710637519, and ripple the corresponding
SHA1 changes up to the current HEAD.
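
Why the change has to ripple: a commit's id is the SHA1 of its contents, and the contents name the parent, so giving a commit a different parent gives it, and every descendant, a different id. A rough sketch of the id computation, assuming the "type, size, NUL, body" object header and the sha1sum utility. git_object_id is a made-up name, and this is not the reparenting tool asked for above.

```shell
# Compute the id of an object whose body is stored in a file:
# SHA1 over "<type> <size>\0" followed by the body.
# (git_object_id is a hypothetical helper; assumes sha1sum is installed.)
git_object_id() {
    type=$1 file=$2
    size=$(( $(wc -c < "$file") ))
    { printf '%s %d\0' "$type" "$size"; cat "$file"; } | sha1sum | cut -d' ' -f1
}
```

Two commit bodies that differ only in their "parent" line hash to different ids, and each child records its parent's id in turn.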

Note that the latter commit ID I gave there was actually the 2.6.12-rc2
commit in Thomas' history import, not your own. Thomas has done a lot of
work on it, and it has the full names extracted from the shortlog
script, full timestamps, branch/merge history and consistent character
sets in the commit logs. I'd definitely suggest that you use that
instead of the import from bkcvs.

http://www.kernel.org/git/?p=linux/kernel/git/tglx/history.git;a=summary

-- 
dwmw2




Re: What broke snapshots now?

2005-07-11 Thread David Woodhouse
On Sun, 2005-07-10 at 10:31 -0700, Linus Torvalds wrote:
> No it's not, as far as I can tell:
> 
> [EMAIL PROTECTED]:/home/dwmw2/git/mail-2.6(0)$ cat .git/branches/origin
> rsync://www.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
> 
> so your scripts will go out to rsync with www.kernel.org to get the data, 
> when you use "cg-update origin".

Hm, OK. So I have absolutely no recollection of what my own scripts are
actually doing. I could have sworn I made sure it was local. If it was
using that URL for the master I might as well have run it elsewhere...

It does seem to be working again now. I'll probably rewrite it next time
it misbehaves.
-- 
dwmw2




Re: What broke snapshots now?

2005-07-10 Thread David Woodhouse
On Sun, 2005-07-10 at 10:08 -0700, Linus Torvalds wrote:
> Which script is this? I'm looking at your scripts, but
> "cg-feedmaillist.sh" is unreadable for me, so I can't see all of it.

Hm. Dunno why that happened -- it's readable now, and also at
http://david.woodhou.se/cg-feedmaillist.sh

> Anyway, it's possible that this is a temporary problem: one of the issues 
> is that since you seem to be using the "rsync:" protocol for updating 
> things, what happens is that if the mirroring is off a bit, you may have 
> gotten a new head, but not all the objects. Then you'd get exactly this.

It's done locally on hera though -- so the mirroring shouldn't be a
problem. IIRC the reason it uses rsync is because I wasn't getting tags
when it was using whatever other method was the default for a local
'parent repository'.

That was actually more relevant for the snapshots than the mailing list
feed, though -- so even if it isn't fixed now, I could live without
tags.

More usefully though, if ordering really isn't a problem on your
repository then I should probably rewrite the script to work directly
from that instead of from a copy.

-- 
dwmw2




Re: What broke snapshots now?

2005-07-10 Thread David Woodhouse
On Sun, 2005-07-10 at 00:38 +0100, David Woodhouse wrote:
> Doh. I thought I'd already done that, but in fact that was for the
> scripts which feed the mailing list, while the snapshot script kept
> using my copy. 

Ok, the snapshot script starts working again if I change a few
environment variables to match what the tools now expect.

Now the mailing list feed isn't happy though -- it stopped being able to
pull from your tree at around 0600 UTC (which I think is when the last
DRM fix was added). I got this when trying to update...

Tree change: 0109fd37046de64e8459f8c4f4706df9ac7cc82c:f179bc77d09b9087bfc559d0368bba350342ac76
error: cannot read sha1_file for ce68a60e5c503aaef0a98f8d754effb6c7d9ee99
fatal: unable to read destination tree (ce68a60e5c503aaef0a98f8d754effb6c7d9ee99)

Applying changes...
Fast-forwarding 0109fd37046de64e8459f8c4f4706df9ac7cc82c -> f179bc77d09b9087bfc559d0368bba350342ac76
on top of 0109fd37046de64e8459f8c4f4706df9ac7cc82c...
error: cannot read sha1_file for ce68a60e5c503aaef0a98f8d754effb6c7d9ee99
fatal: failed to unpack tree object f179bc77d09b9087bfc559d0368bba350342ac76

Since it's just a fast-forward, I just copied the 'origin' tag into the
'master' to move it forward. But it's still not happy:

hera /home/dwmw2/git/mail-2.6 $ cg-diff -r 0109fd37046de64e8459f8c4f4706df9ac7cc82c:f179bc77d09b9087bfc559d0368bba350342ac76
error: cannot read sha1_file for ce68a60e5c503aaef0a98f8d754effb6c7d9ee99
fatal: unable to read destination tree (ce68a60e5c503aaef0a98f8d754effb6c7d9ee99)
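
"cannot read sha1_file" means the named object could not be found in the object store. For loose objects the file name follows directly from the id, as the paths in the snapshot trace in this thread show: the first two hex digits name a subdirectory and the remaining 38 name the file. A small sketch for checking by hand whether an object has arrived; the function names are hypothetical, and a missing loose file proves nothing once a repository has been packed.

```shell
# Map an object id to the path of its loose object file.
# (loose_object_path is a hypothetical helper.)
loose_object_path() {
    objdir=$1 id=$2
    rest=${id#??}
    echo "$objdir/${id%"$rest"}/$rest"
}

# True if the object exists as a loose file under the given directory.
have_loose_object() {
    [ -f "$(loose_object_path "$1" "$2")" ]
}
```

For example, have_loose_object .git/objects ce68a60e5c503aaef0a98f8d754effb6c7d9ee99 would report whether the tree named in the error is present locally as a loose file.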

-- 
dwmw2




Re: What broke snapshots now?

2005-07-09 Thread David Woodhouse
On Sat, 2005-07-09 at 09:15 -0700, Linus Torvalds wrote:
> Yes, looks that way. Except it's not "git on master.kernel.org", it's "git 
> in your home directory", I suspect. I expressly held off packing the 
> kernel repo until git had been updated on kernel.org.

Doh. I thought I'd already done that, but in fact that was for the
scripts which feed the mailing list, while the snapshot script kept
using my copy. I've moved it out of the way now; sorry for the noise.

-- 
dwmw2




What broke snapshots now?

2005-07-09 Thread David Woodhouse
Does git on master.kernel.org need to be updated to handle packed
objects? See attached. 

Linus, please could you add the snapshot script to your regression
testing? http://david.woodhou.se/git-snapshot.sh

It'd be good to keep that working without too much manual intervention. 
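
The attached trace shows the script carrying on after tree-id failed, with LASTTREE and RELTREE left empty, until git-diff-cache finally printed only a usage message. One possible guard, sketched under the assumption that an empty result always means failure; require_value is a hypothetical helper, not part of the real git-snapshot.sh.

```shell
# Print a value unchanged, or complain and fail if it is empty.
# Meant to wrap lookups such as tree-id whose failure shows up as
# empty output. (require_value is a hypothetical helper.)
require_value() {
    what=$1 value=$2
    if [ -z "$value" ]; then
        echo "snapshot: could not determine $what" >&2
        return 1
    fi
    echo "$value"
}

# Intended use in the script:
#   RELTREE=$(require_value "release tree" "$(tree-id "$RELOBJ")") || exit 1
```

Stopping at the first empty lookup keeps a broken run from producing an empty patch.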

-- 
dwmw2

--- Begin Message ---
+ case `hostname` in
++ hostname
+ export PATH=/home/dwmw2/cogito:/usr/bin:/bin
+ PATH=/home/dwmw2/cogito:/usr/bin:/bin
+ export BASE_DIRECTORY=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ BASE_DIRECTORY=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
+ STAGINGLOCK=/staging/upload.lock
+ FINAL=/pub/linux/kernel/v2.6/snapshots
+ '[' '!' -d /pub/scm/linux/kernel/git/torvalds/linux-2.6.git ']'
+ export WORK_DIRECTORY=/home/dwmw2/snapshots/2.6
+ WORK_DIRECTORY=/home/dwmw2/snapshots/2.6
+ export SNAP_TAG_DIRECTORY=/home/dwmw2/snapshots/2.6/tags
+ SNAP_TAG_DIRECTORY=/home/dwmw2/snapshots/2.6/tags
+ export STAGE=/home/dwmw2/snapshots/2.6/stage
+ STAGE=/home/dwmw2/snapshots/2.6/stage
+ export SHA1_FILE_DIRECTORY=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects
+ SHA1_FILE_DIRECTORY=/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects
++ ls -rt /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.11 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.11-tree 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc2 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc3 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc4 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc5 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.12-rc6 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.13-rc1 
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.13-rc2
++ tail -n1
++ sed s:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v::
+ RELNAME=2.6.13-rc2
++ cat /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/refs/tags/v2.6.13-rc2
+ RELOBJ=c521cb0f10ef2bf28a18e1cc8adf378ccbbe5a19
++ tail -n1
++ sed s:/home/dwmw2/snapshots/2.6/tags/v::
++ ls -rt /home/dwmw2/snapshots/2.6/tags/v2.6.13-rc2-git1
+ SNAPNAME=2.6.13-rc2-git1
+ '[' 2.6.13-rc2-git1 == '' ']'
++ cat /home/dwmw2/snapshots/2.6/tags/v2.6.13-rc2-git1
+ LASTOBJ=c101f3136cc98a003d0d16be6fab7d0d950581a6
++ echo 2.6.13-rc2-git1
++ sed 's/^.*-git//'
+ OLDGITNUM=1
++ expr 1 + 1
+ NEWGITNUM=2
+ CURNAME=2.6.13-rc2-git2
++ tree-id c101f3136cc98a003d0d16be6fab7d0d950581a6
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/c1/01f3136cc98a003d0d16be6fab7d0d950581a6: No such file or directory
fatal: git-cat-file c101f3136cc98a003d0d16be6fab7d0d950581a6: bad file
Invalid id: c101f3136cc98a003d0d16be6fab7d0d950581a6
usage: git-cat-file [-t | tagname] 
usage: git-cat-file [-t | tagname] 
Invalid id: 
+ LASTTREE=
++ cat /pub/scm/linux/kernel/git/torvalds/linux-2.6.git/HEAD
+ CURCOMM=a92b7b80579fe68fe229892815c750f6652eb6a9
++ tree-id a92b7b80579fe68fe229892815c750f6652eb6a9
+ CURTREE=7fd73e9f39bf6003cc3188a10426b62d8c47ab40
++ tree-id c521cb0f10ef2bf28a18e1cc8adf378ccbbe5a19
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/c5/21cb0f10ef2bf28a18e1cc8adf378ccbbe5a19: No such file or directory
fatal: git-cat-file c521cb0f10ef2bf28a18e1cc8adf378ccbbe5a19: bad file
Invalid id: c521cb0f10ef2bf28a18e1cc8adf378ccbbe5a19
usage: git-cat-file [-t | tagname] 
usage: git-cat-file [-t | tagname] 
Invalid id: 
+ RELTREE=
+ echo release 2.6.13-rc2 commit tree
release 2.6.13-rc2 commit tree
++ git-cat-file -t c101f3136cc98a003d0d16be6fab7d0d950581a6
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/c1/01f3136cc98a003d0d16be6fab7d0d950581a6: No such file or directory
fatal: git-cat-file c101f3136cc98a003d0d16be6fab7d0d950581a6: bad file
+ echo last c101f3136cc98a003d0d16be6fab7d0d950581a6 tree
last c101f3136cc98a003d0d16be6fab7d0d950581a6 tree
+ echo head a92b7b80579fe68fe229892815c750f6652eb6a9 tree 7fd73e9f39bf6003cc3188a10426b62d8c47ab40
head a92b7b80579fe68fe229892815c750f6652eb6a9 tree 7fd73e9f39bf6003cc3188a10426b62d8c47ab40
+ '[' '' == 7fd73e9f39bf6003cc3188a10426b62d8c47ab40 ']'
++ echo 2.6.13-rc2-git2
++ cut -f2- -d-
+ EXTRAVERSION=-rc2-git2
+ cd /home/dwmw2/snapshots/2.6/stage
+ rm -rf tmp-empty-tree
+ mkdir -p tmp-empty-tree/.git
+ cd tmp-empty-tree
+ git-read-tree a92b7b80579fe68fe229892815c750f6652eb6a9
/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/f8/640c306db2d583b9a30f2e52f8fb0a4cf624e0: No such file or directory
fatal: failed to unpack tree object a92b7b80579fe68fe229892815c750f6652eb6a9
+ git-checkout-cache Makefile
checkout-cache: Makefile is not in the cache.
+ perl -pi -e 's/EXTRAVERSION =.*/EXTRAVERSION = -rc2-git2/' Makefile
Can't open Makefile: No such file or directory.
+ git-diff-cache -m -p
+ gzip -9
usage: diff-cache [-r] [-z] [-p] [-i] [--cached] 
+ echo a92b7b80579fe68fe229892815c750f6652eb6a9
+ cg-log 
c521cb
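
The naming steps visible in the trace above (strip everything up to "-git", add one, then cut the version at the first dash for EXTRAVERSION) can be sketched as two small functions. The function names are hypothetical, and starting at -git1 when there is no earlier snapshot of the same release is an assumption; the trace does not show that case.

```shell
# Derive the next snapshot name from the release name and the previous
# snapshot name (empty, or belonging to an older release, if there has
# been none since this release).
next_snapshot_name() {
    relname=$1 snapname=$2
    case "$snapname" in
        "$relname"-git*) num=$(( ${snapname##*-git} + 1 )) ;;
        *)               num=1 ;;
    esac
    echo "$relname-git$num"
}

# Derive the Makefile's EXTRAVERSION: everything after the first '-',
# with the dash kept, as in the trace.
extraversion_of() {
    echo "-$(echo "$1" | cut -f2- -d-)"
}
```

With the values from the trace, next_snapshot_name 2.6.13-rc2 2.6.13-rc2-git1 gives 2.6.13-rc2-git2, and extraversion_of 2.6.13-rc2-git2 gives -rc2-git2.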

Re: Git-commits mailing list feed.

2005-04-21 Thread David Woodhouse
On Thu, 2005-04-21 at 12:29 +0200, Arjan van de Ven wrote:
> with BK this was not possible, but could we please have -p added to the
> diff parameters with git ? It makes diffs a LOT more readable!

With BK this was not possible, but could you please provide your
criticism in 'diff -up' form?

I've done 'perl -pi -e s/-u/-up/ gitdiff-do' as a quick hack to provide
what you want, but a saner fix to make gitdiff-do obey the same
GIT_DIFF_CMD and GIT_DIFF_OPTS environment variables as show-diff.c
would be a more useful answer.
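
A sketch of what "obeying the same variables" could look like in a shell helper. The variable names are the ones mentioned above; the fallback values 'diff' and '-up' and the function name run_diff are assumptions, not what gitdiff-do or show-diff.c actually use.

```shell
# Run the configured diff program on two files. GIT_DIFF_CMD and
# GIT_DIFF_OPTS are left unquoted on purpose so that each can carry
# several words. (run_diff is a hypothetical helper.)
run_diff() {
    ${GIT_DIFF_CMD:-diff} ${GIT_DIFF_OPTS:--up} "$1" "$2"
}

# Example: GIT_DIFF_OPTS="-u" run_diff old.c new.c
```

With that in place, a wish like Arjan's becomes a matter of setting one environment variable rather than editing the script.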

-- 
dwmw2



[PATCH] Set AUTHOR_DATE in git-tools

2005-04-21 Thread David Woodhouse
Entirely untested.

Makefile: eca3a5d5256cca06d86ebb85ec9d3218752ffcd2
applypatch: 397e4a0e506f1c5765767057dfe506154b743b83
--- a/applypatch
+++ b/applypatch
@@ -26,6 +26,7 @@ EDIT=${EDIT:-vi}
 
 export AUTHOR_NAME="$(sed -n '/^Author/ s/Author: //p' .dotest/info)"
 export AUTHOR_EMAIL="$(sed -n '/^Email/ s/Email: //p' .dotest/info)"
+export AUTHOR_DATE="$(sed -n '/^Date/ s/Date: //p' .dotest/info)"
 export SUBJECT="$(sed -n '/^Subject/ s/Subject: //p' .dotest/info)"
 
 if [ -n "$signoff" -a -f "$signoff" ]; then
dotest: a3e3d35ae0afa358f01b49eecb358d64c616c3e4
mailinfo.c: c1dcac130530174ec5335d2c752d76403ad1d3ad
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -13,6 +13,7 @@ static char line[1000];
 static char name[1000];
 static char email[1000];
 static char subject[1000];
+static char date[1000];
 
 static char *sanity_check(char *name, char *email)
 {
@@ -83,6 +84,11 @@ static void handle_subject(char *line)
 	strcpy(subject, line);
 }
 
+static void handle_date(char *line)
+{
+	strcpy(date, line);
+}
+
 static void add_subject_line(char *line)
 {
 	while (isspace(*line))
@@ -99,6 +105,11 @@ static void check_line(char *line, int l
 		cont = 0;
 		return;
 	}
+	if (!memcmp(line, "Date:", 5) && isspace(line[5])) {
+		handle_date(line+6);
+		cont = 0;
+		return;
+	}
 	if (!memcmp(line, "Subject:", 8) && isspace(line[8])) {
 		handle_subject(line+9);
 		cont = 1;
@@ -107,7 +118,7 @@ static void check_line(char *line, int l
 	if (isspace(*line)) {
 		switch (cont) {
 		case 0:
-			fprintf(stderr, "I don't do 'From:' line continuations\n");
+			fprintf(stderr, "I don't do 'From:' or 'Date:' header continuations\n");
 			break;
 		case 1:
 			add_subject_line(line);
@@ -215,7 +226,8 @@ static void handle_rest(void)
 	cleanup_space(name);
 	cleanup_space(email);
 	cleanup_space(sub);
-	printf("Author: %s\nEmail: %s\nSubject: %s\n\n", name, email, sub);
+	cleanup_space(date);
+	printf("Author: %s\nEmail: %s\nSubject: %s\nDate: %s\n", name, email, sub, date);
 	FILE *out = cmitmsg;
 
 	do {
mailsplit.c: 9379fbc5e84983e5ea0754a6587cc3490c696c69
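
How the pieces of the patch fit together: mailinfo writes a "Date:" line into .dotest/info, and applypatch pulls it out with the same sed idiom as the other fields. A self-contained sketch with made-up sample data (the name, address, subject and date below are invented, and info_field is a hypothetical helper):

```shell
# Extract one header-style field from an info file, using the same sed
# expression shape as applypatch.
info_field() {
    sed -n "/^$1/ s/$1: //p" "$2"
}

# Invented sample input in the format mailinfo prints after the patch.
info=$(mktemp)
cat > "$info" <<'EOF'
Author: A. Hacker
Email: hacker@example.org
Subject: fix something
Date: Thu, 21 Apr 2005 12:00:00 +1000
EOF

AUTHOR_DATE="$(info_field Date "$info")"
echo "$AUTHOR_DATE"
rm -f "$info"
```

Exporting AUTHOR_DATE then lets the commit step record the date from the mail rather than the time the patch was applied.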

-- 
dwmw2



Mailing list feed.

2005-04-20 Thread David Woodhouse
If we just strip out the setting of $FROM and $MLIST, the script I use
to feed bk-commits-head@vger.kernel.org is perfectly generic. Petr, can
you include it in the tree so it gets updated as things change please?

-- 
dwmw2


gitfeedmaillist.sh
Description: application/shellscript


Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread David Woodhouse
On Wed, 2005-04-20 at 07:59 -0700, Linus Torvalds wrote:
> external-parent  
> comment for this parent
> 
> and the nice thing about that is that now that information allows you to 
> add external parents at any point. 
> 
> Why do it like this? First off, I think that the "initial import" ends up
> being just one special case of the much more _generic_ issue of having
> patches come in from other source control systems 

This isn't about patches coming in from other systems -- it's about
_history_, and the fact that it's imported from another system is just
an implementation detail. It's git history now, and what we have here is
just a special case of wanting to prune ancient git history to keep the
size of our working trees down. You refer to this yourself...

> Secondly, we do need something like this for pruning off history anyway, 
> so that the tools have a better way of saying "history has been pruned 
> off" than just hitting a missing commit. 

Having a more explicit way of saying "history is pruned" than just a
reference to a missing commit is a reasonable request -- but I really
don't see how we can do that by changing the now-oldest commit object to
contain an 'external-parent' field. Doing that would change the sha1 of
the commit object in question, and then ripple through all the
subsequent commits.

Come this time next year, if I decide I want to prune anything older
than 2.6.40 from all the trees on my laptop, it has to happen _without_
changing the commit objects which occur after my arbitrarily-chosen
cutoff point.

If we want to have an explicit record of pruning rather than just
coping with a missing object, then I think we'd need to do it with an
external note to say "It's OK that commit XXX is missing".
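
Such an external note could be as simple as a list of commit ids that are known to be cut off on purpose, kept outside the object store so that no commit has to change. A sketch only: the function name and the file name in the usage comment are invented for illustration.

```shell
# True if the commit id is listed in the file of deliberately pruned
# parents (one full id per line). Both names are hypothetical.
is_pruned() {
    grep -qxF "$1" "$2" 2>/dev/null
}

# A history walker could then treat a missing parent as the end of
# history when it is listed, and as corruption when it is not:
#   if is_pruned "$parent" .git/pruned-parents; then stop quietly; fi
```

Since the list lives beside the objects rather than inside them, moving the cutoff point later never changes a sha1.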

> Thirdly, I don't actually want my new tree to depend on a conversion of
> the old BK tree.
> 
> Two reasons: if it's a really full conversion, there are definitely going
> to be issues with BitMover. They do not want people to try to reverse
> engineer how they do namespace merges

Don't think of it as "a conversion of the old BK tree". It's just an
import of Linux's development history. This isn't going to help
reverse-engineer how BK does merges; it's just our own revision history.
I'm not sure exactly how Thomas is extracting it, but AIUI it's all
obtainable from the SCCS files anyway without actually resorting to
using BK itself. 

There's nothing here for Larry to worry about. It's not as if we're
actually using BK to develop git by observing BK's behaviour w.r.t
merges and trying to emulate it. Besides -- if we wanted to do that,
we'd need to use the _BK_ version of the tree; the git version wouldn't
help us much anyway.

And given that BK's merges are based on individual files and we're not
going that route with git, it's not clear how much we could lift
directly from BK even if we _were_ going to try that.

> The other reason is just the really obvious one: in the last week, I've
> already changed the format _twice_ in ways that change the hash. As long
> as it's 119MB of data, it's not going to be too nasty to do again.

That's fine. But by the time we settle on a format and actually start
using it in anger, it'd be good to be sure that it _is_ possible to
track development from current trees all the way back -- be that with
explicit reference to pruned history as you suggest, or with absent
parents as I still prefer.

> it's not that it's necessarily the wrong thing to do, but I think it
> is the wrong thing to do _now_.

OK, time for us to keep arguing over the implementation details of how
we prune history then :)

-- 
dwmw2



Re: WARNING! Object DB conversion (was Re: [PATCH] write-tree performance problems)

2005-04-20 Thread David Woodhouse
On Wed, 2005-04-20 at 02:08 -0700, Linus Torvalds wrote:
> I converted my git archives (kernel and git itself) to do the SHA1
> hash _before_ the compression phase.

I'm happy to see that -- because I'm going to be asking you to make
another change which will also require a simple repository conversion. 

We are working on getting the complete history since 2.4.0 into git
form. When it's done and checked (which should be RSN) I'd like you to
edit the first commit object in your tree -- the import of 2.6.12-rc2,
and give it a parent. That parent will be the sha1 hash of the
2.6.12-rc2 commit in the newly-provided history, and of course will
change the sha1 hash of your first commit, and all subsequent commits. 
We'll provide a tool to do that, of course.

The history itself will be absent from your tree. Obviously we'll need
to make sure that the tools can cope with an absentee parent, probably
by just treating that case as if no parent exists. That won't be hard, and
it'll be useful for people to prune their trees of unwanted older
history in the general case too. That history won't be lost or undone --
it'll just be archived elsewhere.

The reason for doing this is that without it, we can't ever have a full
history actually connected to the current trees. There'd always be a
break at 2.6.12-rc2, at which point you'd have to switch to an entirely
different git repository.

-- 
dwmw2



Re: naive question

2005-04-19 Thread David Woodhouse
On Tue, 2005-04-19 at 23:00 +1000, Paul Mackerras wrote:
> Is there a way to check out a tree without changing the mtime of any
> files that you have already checked out and which are the same as the
> version you are checking out?  It seems that checkout-cache -a doesn't
> overwrite any existing files, and checkout-cache -f -a overwrites all
> files and gives them the current mtime.  This is a pain if you are
> using make and your tree is large (like, for instance, the linux
> kernel :), because it means that after a checkout-cache -f -a you get
> to recompile everything.

Corollary: why aren't we storing mtime in the tree objects?
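
Whatever the tree objects store, the behaviour Paul asks for can be approximated outside git by never rewriting a file whose contents would not change. A minimal sketch; install_if_changed is a hypothetical helper, not a checkout-cache option.

```shell
# Copy src over dst only when the contents differ, so that an unchanged
# file keeps its old mtime and make does not rebuild its dependents.
install_if_changed() {
    src=$1 dst=$2
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        return 0
    fi
    cp "$src" "$dst"
}
```

Checking out into a scratch directory and running this per file would leave untouched files with their original timestamps.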

-- 
dwmw2



Re: [PATCH] provide better committer information to commit-tree.c

2005-04-18 Thread David Woodhouse
On Mon, 2005-04-18 at 18:12 -0700, Greg KH wrote:
> Ok, then why display it as one? 

Nobody ever displays it as one as far as I'm aware. That would be
something like "mailto:$COMMITTER";

> But I'll wait for Russell to wake up and start quoting the proper EU
> privacy laws that he feels causes him to be forced to obfuscate his
> email addresses in the changelog commits (as he did for the bk ones.)

He's talking about his own interpretation of the UK's Data Protection
Act, which requires you to be registered and fulfil certain other
requirements if you keep personal information about people in a
database. Email addresses have been ruled to be 'personal information'
in this context, but this _isn't_ an email address -- and there are
other get-out clauses for noncommercial situations such as this anyway,
I believe. 

Besides, he can still obscure the author information as he unfortunately
insists on doing; it's the _committer_ information which we're
discussing here -- and that's always going to be himself in this case.

-- 
dwmw2



Re: [PATCH] provide better committer information to commit-tree.c

2005-04-18 Thread David Woodhouse
On Mon, 2005-04-18 at 17:45 -0700, Greg KH wrote:
> Well Russell has stated that he has to for EU Privacy reasons.  And I'd
> like to do it as I don't have a local suse.de hostname for my laptop and
> my employer probably doesn't really want my [EMAIL PROTECTED] address
> showing up :)

Why not? Do they complain that we see '[EMAIL PROTECTED]' when you
connect to an IRC server? This _isn't_ an email address, and doesn't
really need to be treated as such. 

-- 
dwmw2



Re: SCSI trees, merges and git status

2005-04-18 Thread David Woodhouse
On Mon, 2005-04-18 at 19:16 -0500, James Bottomley wrote:
> Yes, that's what I did to get back to the commit just before the
> merge:
> 
> fsck-cache --unreachable 54ff646c589dcc35182d01c5b557806759301aa3|awk
> '/^unreachable /{print $2}'|sed 's:^\(..\):.git/objects/\1/:'|xargs rm

I was actually digressing and talking about pruning ancient history
which _is_ theoretically reachable. It's not being 'undone'; it's just
being omitted from the current _working_ tree. The whole point is that
in a fully-populated tree the history _should_ be accessible all the way
back.

We're trying to get the older history available on kernel.org ASAP. The
blobs are rsyncing to ~dwmw2/git/kernel-tglx1; the trees and commit
objects will be coming soon. 

Theoretically all Linus actually needs in order to rebuild his current
tree is the sha1 hash of the final commit in that historical tree, which
corresponds to 2.6.12-rc2.

-- 
dwmw2



Re: SCSI trees, merges and git status

2005-04-18 Thread David Woodhouse
On Mon, 2005-04-18 at 17:03 -0700, Linus Torvalds wrote:
> Git does work like BK in the way that you cannot remove history when you
> have distributed it. Once it's there, it's there.

But older history can be pruned, and there's really no reason why an
http-based 'git pull' couldn't simply refrain from fetching commits
older than a certain threshold.

However, we can't _add_ the history if the current commits don't refer
to it. I really think we should take the imported git history and make
our 'current' tree refer to it -- even if just by having an appropriate
'parent' record in what is currently the oldest changeset in our tree;
the 2.6.12-rc2 import.

It doesn't matter that our oldest commit object refers to a nonexistent
parent, but that does allow us to import historical data if we _want_
to, and have it all work properly.

We should have the full historical git repo available within a day or
so, I believe. It would be really useful if we could make the current
trees refer back to that, instead of starting at 2.6.12-rc2.

-- 
dwmw2



Re: [patch] fixup GECOS handling

2005-04-18 Thread David Woodhouse
On Mon, 2005-04-18 at 12:36 +0200, Martin Schlemmer wrote:
> realgecos[strchr(realgecos, ',') - realgecos] = '\0';

Er, *strchr(realgecos, ',') = 0; surely? Even if the compiler is clever
enough to optimise out the gratuitous addition and subtraction, that's
no real excuse for it.

-- 
dwmw2



Re: [PATCH] Pretty-print date in 'git log'

2005-04-18 Thread David Woodhouse
On Mon, 2005-04-18 at 12:27 +0200, Petr Baudis wrote:
> Yes. As far as I'm concerned, I'd put such stuff to git log, and extend
> it usage so that it is possible to print individual log entries with it
> - just make it accept a _range_ of commits, and then do
> 
> git log $commit $commit

That's fairly trivial. In the current (and misguided) version with
chronological output, rev-tree will do it all for you, in fact:

rev-tree $1 ^$2

In the older and more useful version, it was only slightly more complex:

 base=$(gitXnormid.sh -c $1) || exit 1
 
+if [ -n "$2" ]; then
+endpoint=$(gitXnormid.sh -c $2) || exit 1
+if rev-tree $base $endpoint | grep -q $base:3; then
+base=
+else
+rev-tree --edges $base $endpoint | sed 's/[a-z0-9]*:1//g' > $TMPCL
+fi
+fi
 changelog $base
 rm $TMPCL $TMPCM


-- 
dwmw2



[PATCH] Pretty-print date in 'git log'

2005-04-17 Thread David Woodhouse
Add tool to render git's " " into an RFC2822-compliant
string, because I don't think date(1) can do it. Use same for 'git log'
output.

Signed-off-by: David Woodhouse <[EMAIL PROTECTED]>

--- Makefile
+++ Makefile	2005-04-18 15:40:43.0 +1000
@@ -14,7 +14,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-   check-files ls-tree merge-base
+   check-files ls-tree merge-base show-date
 
 SCRIPT=parent-id tree-id git gitXnormid.sh gitadd.sh gitaddremote.sh \
gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
--- gitlog.sh
+++ gitlog.sh   2005-04-18 15:39:38.0 +1000
@@ -13,6 +13,23 @@
 
 rev-tree $base | sort -rn | while read time commit parents; do
echo commit ${commit%:*};
-   cat-file commit $commit
+   cat-file commit $commit | while read type rest ; do
+   case "$type" in
+   "author"|"committer")
+   DATESTAMP="`echo $rest | cut -f2 -d\>`"
+   RFC2822DATE="`show-date $DATESTAMP 2>/dev/null || echo $DATESTAMP`"
+   echo $type $rest | sed "s/$DATESTAMP\$/ $RFC2822DATE/"
+   ;;
+
+   "")
+   echo ""
+   cat
+   ;;
+   *)
+   echo $type $rest
+   ;;
+   esac
+   done
+   
echo -e "\n--"
 done
--- show-date.c.orig   2005-04-18 15:43:06.0 +1000
+++ show-date.c 2005-04-18 15:42:15.0 +1000
@@ -0,0 +1,48 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include "cache.h"
+
+static const char *month_names[] = {
+"Jan", "Feb", "Mar", "Apr", "May", "Jun",
+"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
+};
+
+static const char *weekday_names[] = {
+"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
+};
+
+int main(int argc, char **argv)
+{
+   time_t t;
+   int offset;
+   char *p;
+   struct tm tm;
+
+   if (argc != 3)
+   usage("usage: show-date <seconds> <zone>");
+
+   t = strtol(argv[1], &p, 0);
+   if (*p || !t)
+   usage("usage: show-date <seconds> <zone>");
+
+   if (argv[2][0] != '-' && argv[2][0] != '+')
+   usage("usage: show-date <seconds> <zone>");
+
+   offset = strtol(argv[2]+1, &p, 10);
+   if (*p || p != argv[2]+5)
+   usage("usage: show-date <seconds> <zone>");
+
+   if (argv[2][0] == '-')
+   offset = -offset;
+
+   offset = 60 * (offset % 100 + (offset / 100 * 60));
+
+   t += offset;
+   gmtime_r(&t, &tm);
+
+   printf("%s, %d %s %04d %02d:%02d:%02d %s\n",
+  weekday_names[tm.tm_wday], tm.tm_mday, month_names[tm.tm_mon],
+  tm.tm_year+1900, tm.tm_hour, tm.tm_min, tm.tm_sec, argv[2]);
+   return 0;
+}
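
As a rough cross-check of the offset arithmetic in show-date.c, here is a minimal shell sketch of the same conversion: a "+HHMM" or "-HHMM" zone becomes a signed number of seconds, which is added to the UTC timestamp before the time of day is extracted. The helper names `zone_to_seconds` and `wall_clock` are made up for this example, and only the time of day is printed, not the full RFC2822 date.

```shell
#!/bin/sh
# Sketch only: mirrors the zone arithmetic in show-date.c, nothing more.

# Convert a "+HHMM" or "-HHMM" zone string to a signed offset in seconds.
zone_to_seconds() {
    case "$1" in
        [+-][0-9][0-9][0-9][0-9]) ;;
        *) echo "bad zone: $1" >&2; return 1 ;;
    esac
    sign=${1%????}      # leading '+' or '-'
    digits=${1#?}       # "HHMM"
    hh=${digits%??}
    mm=${digits#??}
    hh=${hh#0}          # drop one leading zero so "08" is not read as octal
    mm=${mm#0}
    secs=$(( (hh * 60 + mm) * 60 ))
    if [ "$sign" = "-" ]; then
        secs=$(( 0 - secs ))
    fi
    echo "$secs"
}

# Print the committer's local time of day for UTC timestamp $1 and zone $2.
# Assumes the shifted timestamp is non-negative.
wall_clock() {
    offset=$(zone_to_seconds "$2") || return 1
    local_secs=$(( ($1 + offset) % 86400 ))
    printf '%02d:%02d:%02d %s\n' \
        $(( local_secs / 3600 )) $(( local_secs % 3600 / 60 )) \
        $(( local_secs % 60 )) "$2"
}

wall_clock 1113499881 +0100
```

For the timestamp used here the sketch prints `18:31:21 +0100`. The leading-zero stripping matters in shell because `$(( 08 * 60 ))` would otherwise be rejected as a malformed octal number; the C code avoids the same trap by passing base 10 to strtol().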

-- 
dwmw2



Re: [RFC] General object parsing

2005-04-17 Thread David Woodhouse
On Sun, 2005-04-17 at 18:15 -0700, Linus Torvalds wrote:
> In particular, is there some easy way to walk backwards by time? "git log"  
> definitely needs that, and merge-base clearly wants something similar. 

Actually the ideal output of 'git log' isn't strictly chronological.
IIRC my bkexport scripts used to make a chronologically sorted list, and
I ended up changing it.

Simple example: if there are changesets which have been lurking in some
tree for months waiting for you to pull, and the only thing you did
since I ran 'git log' on your tree yesterday is pull from that tree,
then those changesets are what I want to see at the top of 'git log'
output.

In fact this probably means that the depth-first tree walking of the
original gitlog.sh is probably the right thing to do, but when we hit a
merge we want to try to make sure we process the _remote_ parent first.

Are we sorting the 'parent' links in merges so that two merges of the
same branches are guaranteed to be identical (assuming identical
contents otherwise)? Or is it just that we didn't think about it, and so
merges are putting the local and remote parents in the 'wrong' order by
coincidence?
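
To make the proposed ordering concrete, here is a small sketch of a depth-first walk that visits the remote parent of a merge before the local one. It runs on a toy graph rather than on real commit objects; the graph format, the `log_order` helper, and above all the convention that the remote parent is listed last are assumptions made for the example.

```shell
#!/bin/sh
# Toy model, not git code. Graph format (hypothetical): "id parent [parent...]".
# Assumption: in a merge, the remote parent is the last one listed.

# Print commits in depth-first order starting at $1; graph is read from stdin.
log_order() {
    awk -v head="$1" '
        function visit(id,    k, p, j) {
            if (id in seen) return
            seen[id] = 1
            print id
            k = split(parents[id], p, " ")
            for (j = k; j >= 1; j--)    # last-listed (remote) parent first
                visit(p[j])
        }
        { for (i = 2; i <= NF; i++) parents[$1] = parents[$1] " " $i }
        END { visit(head) }'
}

# "merge" joins local commit l1 with a remote branch r1 <- r2.
toy_graph='base
l1 base
r1 base
r2 r1
merge l1 r2'

printf '%s\n' "$toy_graph" | log_order merge
```

The output order is merge, r2, r1, base, l1: the freshly pulled remote changesets come first. Note that a naive depth-first walk also emits `base` before `l1`, so a real implementation would need more care than this.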

-- 
dwmw2



Re: full kernel history, in patchset format

2005-04-17 Thread David Woodhouse
On Sun, 2005-04-17 at 18:16 -0700, Linus Torvalds wrote:
> Alternatively, you can have just the rev-tree cache of them. That's what
> it was designed for (along with avoiding having to read 60,000 commits).

Purely from a conceptual POV I'd be a little happier with the history
just ending with a parent pointer to a commit object which is absent,
rather than having commit objects which point to _trees_ which are
absent. But I suppose I can't really justify that, and I'm not overly
bothered about it either.

The important thing to get right at this point is that the tree we all
work with should refer to the history, regardless of how we choose to
prune it. The current linux-2.6.git tree has a parentless commit for the
2.6.12-rc2 import, which is bad. We should start with Thomas' git tree
representing the real history, and work from that. You don't even need
to see his tree; you only need the final sha1 hash of the commit in his
tree which matches 2.6.12-rc2, so you can use that as the 'parent' of
the first change you import yourself.

-- 
dwmw2



Re: full kernel history, in patchset format

2005-04-17 Thread David Woodhouse
On Mon, 2005-04-18 at 02:50 +0200, Petr Baudis wrote:
> I think I will make git-pasky's default behaviour (when we get
> http-pull, that is) to keep the complete commit history but only trees
> you need/want; togglable to both sides.

I think the default behaviour should probably be to fetch everything.

-- 
dwmw2



Re: full kernel history, in patchset format

2005-04-17 Thread David Woodhouse
On Mon, 2005-04-18 at 02:35 +0200, Petr Baudis wrote:
> > For the special case of removing history before 2.6.12-rc2 from the
> > trees, I certainly think we can do it by leaving out all the commits,
> > not just the trees. We can do that easily, but there's no way we can
> > _add_ that history retrospectively if we omit it in the first place.
> 
> I'm confused by this paragraph, but that might be my English skills
> failing somehow.

"For the general case of people pruning their own trees, _maybe_ you're
right that it would be good to keep the commits even if we delete the
actual trees. But for history older than 2.6.12-rc2, that's a special
case -- I think we can happily delete the commits too.

"We can delete old trees/commits easily, but we can't _add_ them to the
existing linux-2.6.git tree, because the oldest commit in that tree
(b4ceb6e27e4cc3f37d26e04c4535c79b98a9f889) doesn't have a parent."

-- 
dwmw2



Re: full kernel history, in patchset format

2005-04-17 Thread David Woodhouse
On Mon, 2005-04-18 at 01:39 +0200, Petr Baudis wrote:
> I think this is bad, bad, bad. If you don't keep around all the
> _commits_, you get into all sorts of trouble - when merging, when doing
> git log, etc. And the commits themselves are probably actually a pretty
> small portion of the thing. I didn't do any actual measurement but I
> would be pretty surprised if it would be much more than a few megabytes of
> data for the kernel history.

I'm not sure it's that bad -- and everyone already seems perfectly happy
not to have history going back before 2.6.12-rc2. We're not talking
about doing this by _default_ -- we're talking about allowing people to
keep trees pruned if they _want_ to. So I might want to drop history
before 2.6.0 on my laptop, for example.

> Of course an entirely different thing are _trees_ associated with those
> commits. As long as you stay with a simple three-way merge, you
> basically never want to look at trees which aren't heads and which you
> don't specifically request to look at. And the trees and what they carry
> inside is the main bulk of data.

If the trees are absent and you're trying to merge, what do you gain
from having the commit objects? And for the case of 'git log', I
certainly think it's acceptable that you lose out on those parts of
prehistory which you've explicitly removed from your local tree --
that's a feature, not a bug. 

For the special case of removing history before 2.6.12-rc2 from the
trees, I certainly think we can do it by leaving out all the commits,
not just the trees. We can do that easily, but there's no way we can
_add_ that history retrospectively if we omit it in the first place.

For history older than 2.6.12-rc2 I'd suggest that it would be available
in a different place, and absent from the 'main' working tree that
everyone uses by default. The only difference we'd see in the working
tree is that the 2.6.12-rc2 commit -- the oldest commit in that tree --
would actually have an absentee parent instead of appearing to be an
import. And all the sha1 hashes of all subsequent commits would be
different, of course.

To allow pruning of older objects in the general case would be a little
bit harder than that, because as things stand you'd be re-fetching them
every time you rsync from elsewhere -- but that wouldn't really be hard
to fix if we care.

Either way, I think it can probably be done by omitting the commit
objects as well as the trees -- but the important point is that we
_should_ include a 'parent' pointer in the oldest commit of the tree
we're working with, pointing back to the imported history.

-- 
dwmw2



Re: Building git on Fedora

2005-04-17 Thread David Woodhouse
On Sun, 2005-04-17 at 19:25 -0400, jeff millar wrote:
> ln -sf /lib/modules/`uname -r`/build/include/linux /usr/local/include/linux
> 
> This fix creates a symlink, on each boot up, in the local include 
> directory that points to the kernel header files. If there's a better 
> way to do this, I'm all ears.

What's wrong with the contents of the glibc-kernheaders package? Can you
file specific bugs if you're having problems?

In the long run, the answer is to convince Linus that we _really_ need
the kernel to have a set of header files defining the ABI which are fit
for public consumption, rather than having a horrid mix of private and
exportable bits throughout the contents of the include/ directory. 

In the meantime, some poor mug has to clean the crap up and try to make
something suitable to live in /usr/include/linux -- and unfortunately at
the moment for Fedora that someone is me :)

Unless git is doing something with kernel-private headers that it
shouldn't, this probably wants to be discussed elsewhere -- most likely
in Bugzilla.

-- 
dwmw2



Re: full kernel history, in patchset format

2005-04-17 Thread David Woodhouse
On Sat, 2005-04-16 at 10:04 -0700, Linus Torvalds wrote:
> So I'd _almost_ suggest just starting from a clean slate after all.  
> Keeping the old history around, of course, but not necessarily putting it
> into git now. It would just force everybody who is getting used to git in 
> the first place to work with a 3GB archive from day one, rather than 
> getting into it a bit more gradually.
> 
> What do people think? I'm not so much worried about the data itself: the
> git architecture is _so_ damn simple that now that the size estimate has
> been confirmed, that I don't think it would be a problem per se to put
> 3.2GB into the archive. But it will bog down "rsync" horribly, so it will
> actually hurt synchronization until somebody writes the rev-tree-like
> stuff to communicate changes more efficiently..

Note that any given copy of a tree doesn't _need_ to keep all the
history back to the beginning of time. It's OK if the oldest commit object
in your tree actually refers back to a parent which doesn't exist
locally. I can well imagine that some people will want to keep their
trees pruned to keep only a few weeks of history, while other copies of
the tree will keep everything.

However, if we _don't_ base our current work on an existing import of
the kernel, then we don't retain that option. We can't just change the
'parent' field of your 2.6.12-rc2 import, without changing the sha1 hash
of _everything_ that happens thereafter. 
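
The knock-on effect can be seen in a toy hash chain, sketched below. It is not git's real object format: each toy "commit" is just the SHA-1 of its parent's id plus some text, `toy_commit` and `toy_tip` are names invented here, the all-digits parent id is a placeholder, and `sha1sum` from GNU coreutils is assumed to be installed.

```shell
#!/bin/sh
# Toy model only: a "commit" id is the SHA-1 of its parent id plus its content.
# Assumes sha1sum (GNU coreutils) is available.

# $1 = parent id (empty for a parentless root), $2 = content
toy_commit() {
    printf 'parent %s\n%s\n' "$1" "$2" | sha1sum | cut -d' ' -f1
}

# Build root -> c1 -> c2 and print the id of c2. $1 is the parent of the root.
toy_tip() {
    root=$(toy_commit "$1" "2.6.12-rc2 import")
    c1=$(toy_commit "$root" "first change")
    c2=$(toy_commit "$c1" "second change")
    echo "$c2"
}

tip_without_history=$(toy_tip "")
tip_with_history=$(toy_tip "0123456789012345678901234567890123456789")

if [ "$tip_without_history" != "$tip_with_history" ]; then
    echo "changing the root's parent changed the tip id"
fi
```

The two later changes are byte-for-byte identical in both chains, yet the tip ids differ because every id depends on the one before it. That is why the parent pointer has to be right from the first commit onwards.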

So I'd say we should take Thomas' import, and base new work on that --
but then possibly leave out the older objects from the 'working'
repository which everyone is rsyncing from; just make them available in
a 'linux-history.git' object database elsewhere.

-- 
dwmw2



Re: Re-done kernel archive - real one?

2005-04-17 Thread David Woodhouse
On Sun, 2005-04-17 at 15:22 -0700, randy_dunlap wrote:
> David did the commits-mailing-list script and I'm working on a
> commits web-page like what was formerly seen at:
> http://www.kernel.org/pub/linux/kernel/v2.6/testing/cset/
> (with daily tarball)
> 
> based on some older scripts from David, however I'm wondering if
> a variant of the gitlog.sh script wouldn't be a better starting
> point for it.

My commits-list script is in fact based on gitlog.sh. You'll probably
find useful things to crib from in both that and the original
bkexport.sh script.

The commits script also wants updating to print the date properly now
that we've changed how it's stored -- I'll try to find some time this
week to update it and set it running on master.kernel.org again, but it
may end up waiting till after LCA.

-- 
dwmw2



Re: Re-done kernel archive - real one?

2005-04-17 Thread David Woodhouse
On Sat, 2005-04-16 at 16:01 -0700, Linus Torvalds wrote:
> So I re-created the dang thing (hey, it takes just a few minutes), and
> pushed it out, and there's now an archive on kernel.org in my public
> "personal" directory called "linux-2.6.git". I'll continue the tradition
> of naming git-archive directories as "*.git", since that really ends up
> being the ".git" directory for the checked-out thing.

Do you want the commits list running for it yet? Do you want the
changesets which are already in it re-mailed without a 'TESTING' tag?

-- 
dwmw2



Re: Merge with git-pasky II.

2005-04-17 Thread David Woodhouse
On Sat, 2005-04-16 at 17:33 +0200, Johannes Schindelin wrote:
> > But if it can be done cheaply enough at a later date even though we end
> > up repeating ourselves, and if it can be done _well_ enough that we
> > shouldn't have just asked the user in the first place, then yes, OK I
> > agree.
> 
> The repetition could be helped by using a cache.

Perhaps. Since neither such a cache nor even the commit comments are
strictly part of the git data, they probably shouldn't be included in
the sha1 hash of the commit object. However, I don't see a fundamental
reason why we couldn't store them in the same file but omit them from
the hash calculations. That also allows us to retrospectively edit
commit comments without completely changing the entire subsequent
history.

Or is that a little too heretical a suggestion?

-- 
dwmw2



Re: Merge with git-pasky II.

2005-04-15 Thread David Woodhouse
On Fri, 2005-04-15 at 08:32 -0700, Linus Torvalds wrote:
>  - you're doing the work at the wrong point. Doing it _well_ is quite
>    expensive. So if you do it at commit time, you cannot _afford_ to do it
>    well, and you'll always fall back to doing an ass-backwards job that
>    doesn't really get you to the good state, and only gets you to a
>    not-very-interesting easy 1% of the solution (ie full file renames).
> 
>  - you're doing the work at the wrong point for _another_ reason. You're
>    freezing your (crappy) algorithm at tree creation time, and basically
>    making it pointless to ever create something better later, because even
>    if hardware and software improves, you've codified that "we have to
>    have crappy information".

OK, I'm inclined to agree. The only thing that prevents me from
capitulating entirely and resubscribing to the "Torvalds is always
right" school is the concern that it _is_ expensive, and that's why I
originally wanted to do it at commit time because then it's a one-off
cost rather than recurring every time we want to track the history of a
given piece of content. Also because we actually have the developer's
attention at commit time, and we can get _real_ answers from the user
about what she was doing, instead of having to guess.

But if it can be done cheaply enough at a later date even though we end
up repeating ourselves, and if it can be done _well_ enough that we
shouldn't have just asked the user in the first place, then yes, OK I
agree.

-- 
dwmw2



Re: Merge with git-pasky II.

2005-04-15 Thread David Woodhouse
On Fri, 2005-04-15 at 07:53 -0700, Linus Torvalds wrote:
> Files DO NOT matter. Never have. It's an implementation limitation to 
> think they do. You'll screw yourself up, and when somebody comes up with a 
> half-way efficient way to generate inter-file diffs, your architecture is 
> totally and utterly unable to handle it.
> 
> I don't care what you do at an SCM level, and if the crud you put on top
> of git wants to perpetuate mistakes of yesteryear, that's _your_ issue.  
> But dammit, git is designed to do the right thing, and I will fight tooth
> and nail against anybody who thinks individual files matter.

No, really: individual files _DO_ matter. There's a reason we split
stuff up into separate files, and if you look closely you'll find that
we don't just randomly put different functions into different files with
neither rhyme nor reason -- there's a pattern to it; usually some kind
of functional grouping.

And when I'm looking for the change that broke something, I can almost
always tell which file it's in and go looking in _that_ file. It's a
_whole_ lot easier to use the equivalent of 'bk revtool' than it is to
sift through all the unrelated commits in the whole tree. If that's an
implementation limitation, then it's an implementation limitation in my
_brain_ not just in my tools.

OK, in fact it shouldn't be 'show me the history of this file'; it's
often really 'show me the history of this function' which I want. But
that's fine. All I'm suggesting is that we should include the metadata
which says "content moved from file XXX to file YYY" along with the
commit objects.

I'm certainly not suggesting that we should implement jejb's idea of
explicit 'file revision history' objects -- the tree-based philosophy is
perfectly sane and sufficient. But we do _also_ need a little
information which allows us to track content as it moves around within
the tree, and the SCM has to have a sane way to filter out the noise
when we're looking for what broke. Yes, that's part of the SCM
functionality, and can live in an xattr-type field in the commit object
-- but it does need to be stored, and in practice I suspect it _will_ be
useful for merging too.

It's not about ditching the per-tree tracking and doing per-file
tracking instead. I agree that would be wrong. It's about storing enough
information to track what happened to given content as it moved around
within the tree.

-- 
dwmw2



Re: Merge with git-pasky II.

2005-04-15 Thread David Woodhouse
On Fri, 2005-04-15 at 16:53 +0200, Ingo Molnar wrote:
> but the specific scenario you described would require _Linus'_ tree to
> be in limbo for a long time, and have uncommitted half-done edits.
> I.e.:
> 
>          (A1B2)--(A2B2)--(A2'B3)
>         /      \   /            \
>        /        \ /              \
>  (A1B1)          X               (...)
>        \        / \              /
>         \      /   \            /
>          (A2B1)--(A2B2)--(A3B2')
> 
> in the above scenario Linus' tree needs to 'cross' with a maintainer's
> tree.  (maintainer's tree won't cross with another maintainer's tree,
> as maintainer-to-maintainer merges are rare.)

Is that true? Consider (A2B1) to be a bugfixes-only tree which I make
available for Linus to pull from. I keep doing more experimental stuff
in my own private copy of the tree along the bottom branch, while Linus
_eventually_ responds to my pull request and moves on, stopping only to
add a 'static' to one of my new functions. I move on too but don't pull
from Linus again for a little while; the final merge happens when I _do_
pull again.

-- 
dwmw2



Re: Handling renames.

2005-04-15 Thread David Woodhouse
On Fri, 2005-04-15 at 13:37 +0000, [EMAIL PROTECTED] wrote:
> > One option for optimising this, if we really need to, might be to track
> > the file back to its _first_ ancestor and use that as an identification.
> > The SCM could store that identifier in the blob itself, or we could
> > consider it an 'inode number' and store it in git's tree objects.
> 
> This suggestion (and this whole discussion about renames) has issues
> with file copies, which form a branch in the revision history.  If I
> copy foo.c to foo2.c (or fs/ext2/ to fs/ext3/), then the oldest ancestor
> isn't a "unique inode number".

That's why I prefer the option of simply annotating the moves. They
don't need to be just renames -- it can cover the cases where files are
split up or merged into one, to indicate where the history of the given
_data_ is coming from.

-- 
dwmw2



Re: Merge with git-pasky II.

2005-04-15 Thread David Woodhouse
On Fri, 2005-04-15 at 11:36 +0200, Ingo Molnar wrote:
> do such cases occur frequently? In the kernel at least it's not too 
> typical. 

Isn't it? I thought it was a fairly accurate representation of the
process "I make a whole bunch of changes to files I maintain, pulling
from Linus while occasionally asking him to pull from my tree. Sometimes
my files are changed by someone else in Linus' tree, and sometimes I
change files that I don't actually own.".

-- 
dwmw2




Re: Merge with git-pasky II.

2005-04-15 Thread David Woodhouse
On Thu, 2005-04-14 at 17:42 -0700, Linus Torvalds wrote:
> I've not even been convinced that renames are worth it. Nobody has
> really given a good reason why.
> 
> There are two reasons for renames I can think of:
> 
>  - space efficiency in delta-based trees.
>  - "annotate".

Neither of those were my motivation for looking at renames. The reasons
I wanted to track renames were:
 - Per-file revision history which doesn't stop dead at a rename.
 - Merging where files have been renamed in one branch and modified in
   another. Which is basically a special case of the above; we need to
   see the per-file revision history.

> So I'd seriously suggest that instead of worrying about renames, people
> think about global diffs that aren't per-file. Git is good at limiting
> the changes to a set of objects, and it should be entirely possible to
> think of diffs as ways of moving lines _between_ objects and not just
> within objects. It's quite common to move a function from one file to
> another - certainly more so than renaming the whole file.
> 
> In other words, I really believe renames are just a meaningless special
> case of a much more interesting problem. Which is just one reason why
> I'm not at all interested in bothering with them other than as a "data
> moved" thing, which git already handles very well indeed.

Git doesn't handle 'data moved' except at a whole-tree level. For each
commit, it says "these are the old trees; this is the new tree".

Git doesn't actually look hard into the contents of the tree; certainly it
has no business looking at the contents of individual files; that is
something that the SCM or possibly only the user should do. The storage
of 'rename' information in the commit object is another kind of 'xattr'
storage which git would provide but not directly interpret.

And you're right; it shouldn't have to be for renames only. There's no
need for us to limit it to one "source" and one "destination"; the SCM
can use it to track content as it sees fit.

As I said, the main aim of this is to track revision history of given
content, for displaying to the user and for performing merges. So when a
file is split up, or a function is moved from it to another file, a
'rename' xattr can be included to mark that files 'foo' and 'bar' in the
new tree are both associated with file 'wibble' in the parent.

That's as much as we need to provide for content tracking, and it _does_
handle the general case as well as we should be attempting to. We don't
want to get into dealing with file contents ourselves; we just want to
store the hint for the SCM or the user that "your data went thataway".
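
As a sketch of how an SCM layer might consume such a hint, the snippet below looks up what a file was called in the parent. It assumes the annotation is a plain "rename <old> <new>" line somewhere in the commit text, as floated elsewhere in this thread; that format is not settled, the `names_in_parent` helper is invented for the example, and filenames containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical annotation format: "rename <old> <new>" lines in the commit text.

# Print the name(s) that file $1 had in the parent, according to the commit
# text on stdin. If no annotation mentions $1, the name is taken as unchanged.
names_in_parent() {
    awk -v name="$1" '
        $1 == "rename" && $3 == name { print $2; found = 1 }
        END { if (!found) print name }'
}

# A commit that splits wibble into foo and bar: two hints, one source.
commit_text='rename wibble foo
rename wibble bar

Split wibble into foo and bar'

printf '%s\n' "$commit_text" | names_in_parent foo
printf '%s\n' "$commit_text" | names_in_parent bar
```

Both lookups print `wibble`, so the history of either new file can be followed back into the old one. Because every matching line is printed, the same lookup also covers two files being merged into one.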

-- 
dwmw2




Re: Merge with git-pasky II.

2005-04-15 Thread David Woodhouse
On Thu, 2005-04-14 at 11:36 -0700, Linus Torvalds wrote:
> And "merge these two trees" (which works on a _tree_ level)
> or "find the common commit" (which works on a _commit_ level)

I suspect that finding the common commit is actually a per-file thing;
it's not just something you do for the _commit_ graph, then use for
merging each file in the two branches you're trying to merge.

Consider a simple repository which contains two files A and B. We start
off with the first version of each ('A1B1'), and the owner of each file
takes a branch and modifies their own file. There is cross-pulling
between the two, and then each modifies the _other's_ file as well as
their own...

         (A1B2)--(A2B2)--(A2'B3)
        /      \   /            \
       /        \ /              \
 (A1B1)          X               (...)
       \        / \              /
        \      /   \            /
         (A2B1)--(A2B2)--(A3B2')

Now, we're trying to merge the two branches. It appears that the most
useful common ancestor to use for a three-way merge of file A is the
version from tree 'A2B1', while the most useful common ancestor for
merging file B is that in 'A1B2'.

(I think it's a coincidence that in my example the useful files 'A2' and
'B2' actually do end up in a single tree together at some point.)

-- 
dwmw2




Re: Handling renames.

2005-04-14 Thread David Woodhouse
On Thu, 2005-04-14 at 18:23 -0400, Daniel Barkalow wrote:
> I personally think renames are a minor thing that doesn't happen
> much. What actually happens, in my opinion, is that some chunk of a
> file is moved to a different, possibly new, file. If this is supported
> (as something that the SCM notices), then a rename is just a special
> case where the moved chunk is a whole file.

Certainly we'd discussed the possibility that the 'rename' field may
contain more than one destination, or more than one source filename.
This could happen when a file is split into two, or when two files are
merged into one, for example.

-- 
dwmw2




Re: Date handling.

2005-04-14 Thread David Woodhouse
On Thu, 2005-04-14 at 14:01 -0700, H. Peter Anvin wrote:
> Both of these are metadata; they may not be directly relevant to the 
> filesystem, but are attributes relevant to the client thereof; 
> effectively an xattr.

Right. That's perfectly acceptable -- and that's the reason why I think
it's also fine to keep the timezone and the rename information in there
too. If we were being _really_ anal about auxiliary information being
separate, we'd stick it in a separate blob object and merely refer to it
from the commit object. I don't think there's really any call to take it
that far, though.

-- 
dwmw2




RE: Date handling.

2005-04-14 Thread David Woodhouse
On Thu, 2005-04-14 at 12:42 -0700, Luck, Tony wrote:
> This is a very good point ... but this still has problems with the
> "git is a filesystem, not a SCM" mantra.  Timezone comments don't
> belong in the git inode.

Yeah, but really I'd want to see other serious users of it before I'd
accept that the timezone information _really_ needs to be stored
separately. After all, the committer and author information really
wouldn't be considered part of the _filesystem_ either.

-- 
dwmw2




Re: Date handling.

2005-04-14 Thread David Woodhouse
On Thu, 2005-04-14 at 12:19 -0700, [EMAIL PROTECTED] wrote:
> With a UTC date, why would anyone care in which timezone the commit was
> made?  Any pretty printing would most likely be prettiest if it is done
> relative to the timezone of the person looking at the commit record, not
> the person who created the record.

I'd prefer not to lose the information. If someone has committed a
change at 2am, I like to know that it was 2am for _them_. It helps me
decide where to look first for the cause of problems. :)

It also helps disambiguate certain comments, especially those involving
words or phrases such as "yesterday" or "this afternoon".

-- 
dwmw2




Re: Handling renames.

2005-04-14 Thread David Woodhouse
On Thu, 2005-04-14 at 20:58 +0200, Ingo Molnar wrote:
> The thing i tried to avoid was to list long filenames in the commit 
> (because of the tree hierarchy we'd need to do tree-absolute pathnames 
> or something like that, and escape things, and do lookups - duplicating 
> a VFS which is quite bad) - it would be better to identify the rename 
> source and target via its tree object hash and its offset within that 
> tree. Such information could be embedded in the commit object just fine.  
> Something like:

Actually I'm not sure that's true. Let's consider the two main users of
this information.

Firstly, because it's what I've been playing with: to list a given
file's revision history, I currently work with its filename -- walk the
commit objects, inspecting the tree and selecting those commits where
the file has changed. If my filename is 'fs/jffs2/inode.c' then I can
immediately skip over a commit where the 'fs' entry in the top-level
tree is identical to that in the parent, or I can skip a commit where
the 'jffs2' entry in the 'fs' subtree is identical to the parent... it's
all done on filename, and the {parent, entry} tuple wouldn't help much
here; I'd probably have to convert back to a filename anyway.
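
The pruning described above can be sketched on toy data. Each "tree" is flattened into a listing of "path hash" lines, directories included, and the walk stops at the first path component whose entry is identical in the commit and its parent. The listing format, the made-up hashes and the helpers `entry_hash` and `path_changed` are all assumptions of this example, not the real tree-object layout.

```shell
#!/bin/sh
# Toy model, not git code. A listing holds one "path hash" line per tree entry,
# directories included (format and hashes are made up for illustration).

# Print the hash recorded for path $2 in listing $1 (empty if absent).
entry_hash() {
    printf '%s\n' "$1" | awk -v p="$2" '$1 == p { print $2; exit }'
}

# Succeed if path $3 differs between parent listing $1 and commit listing $2.
# Stops at the first path component whose entry is identical in both.
path_changed() {
    prefix=""
    rest=$3
    while [ -n "$rest" ]; do
        component=${rest%%/*}
        case "$rest" in
            */*) rest=${rest#*/} ;;
            *)   rest="" ;;
        esac
        prefix=${prefix:+$prefix/}$component
        if [ "$(entry_hash "$1" "$prefix")" = "$(entry_hash "$2" "$prefix")" ]; then
            return 1    # same entry on both sides: nothing below it changed
        fi
    done
    return 0
}

parent='fs aaa
fs/ext2 bbb
fs/jffs2 ccc
fs/jffs2/inode.c ddd'

# Only fs/ext2 changed: "fs" differs, but "fs/jffs2" is identical.
ext2_commit='fs eee
fs/ext2 fff
fs/jffs2 ccc
fs/jffs2/inode.c ddd'

# fs/jffs2/inode.c changed: every entry on the path differs.
jffs2_commit='fs ggg
fs/ext2 bbb
fs/jffs2 hhh
fs/jffs2/inode.c iii'

for commit in "$ext2_commit" "$jffs2_commit"; do
    if path_changed "$parent" "$commit" fs/jffs2/inode.c; then
        echo "touches fs/jffs2/inode.c"
    else
        echo "skipped"
    fi
done
```

The first commit is skipped at the 'jffs2' level without ever looking at inode.c, and the second is reported as touching the file. Everything is keyed on the filename, which is the point being made about the {parent, entry} tuple.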

Secondly, there's merges. I've paid less attention to these (see mail 5
minutes ago) but I think they'd end up operating on the rename
information in a very similar way. To find a common ancestor for a given
file, we want to track its name as it changed during history; at that
point it's all string compares.

-- 
dwmw2




Re: Handling renames.

2005-04-14 Thread David Woodhouse
On Thu, 2005-04-14 at 11:11 -0700, Linus Torvalds wrote:
> So, you really need to think of git as a filesystem. You can then 
> implement an SCM _on_top_of_it_, which means that your second suggestion 
> is not only acceptable, it really is the _only_ way to handle this in git:
> 
> > So a commit involving a rename would look something like this...
> > 
> > tree 82ba574c85e9a2e4652419c88244e9dd1bfa8baa
> > parent bb95843a5a0f397270819462812735ee29796fb4
> >     rename foo.c bar.c
> > author David Woodhouse <[EMAIL PROTECTED]> 1113499881 +0100
> > committer David Woodhouse <[EMAIL PROTECTED]> 1113499881 +0100
> > Rename foo.c to bar.c and s/foo_/bar_/g
> 
> Except I want that empty line in there, and I want it in the "free-form"  
> section. The "rename" part really isn't part of the git header. It's not 
> what git tracks, it was tracked by an SCM system on top of git.

Note that not only may you have a _set_ of renames, but you'll also have
a _different_ set of renames for each parent. Consider the
representation of a merge where a file was called 'foo' in one parent,
'bar' in the other, and we called it 'foobar' in the resulting tree.

That's the main reason I wanted the renames in with the parent
information -- so it's <...>...

I see your point though and I can't be bothered to argue for the sake of
the slight efficiency benefit we might gain from doing it that way. The
implementation details really aren't that interesting right now.

Let us assume, however, that we have this information somehow stored in
each commit object. It's perfectly sufficient from the POV of the 
'git revtool' which I've been poking at; is it good enough for merges?

Consider a simple case: A branch is taken, file foo.c is renamed to
bar.c, and now we're trying to merge that branch back into the head,
which has moved on. 

We can't just take 'bar.c' as a new file -- we have to track it all the
way back to its inception, and notice that it actually shares a common
ancestor with 'foo.c' in the other parent of the merge.

How feasible, and how computationally expensive, is that task going to
be? Especially given that there may be _many_ new files that we need to
attempt to tie up with their partners, across many potential renames. 

One option for optimising this, if we really need to, might be to track
the file back to its _first_ ancestor and use that as an identification.
The SCM could store that identifier in the blob itself, or we could
consider it an 'inode number' and store it in git's tree objects.
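
As a rough illustration of the 'inode number' idea, assuming a
hypothetical tree entry that carries a stable identifier next to the
name (struct id_entry and find_by_id are made-up names, and git's tree
objects have no such field):

```c
#include <stddef.h>

/* Hypothetical tree entry extended with a stable file identifier. */
struct id_entry {
	unsigned long file_id;	/* assigned once, when the file is first created */
	const char *name;	/* current name in this tree */
};

/* Find the entry carrying the given identifier, or NULL if there is none. */
const struct id_entry *find_by_id(const struct id_entry *entries, size_t n,
				  unsigned long file_id)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (entries[i].file_id == file_id)
			return &entries[i];
	return NULL;
}
```

With this, pairing 'bar.c' in the branch with 'foo.c' in the head is one
lookup per file and needs no walk through history, at the cost of
storing and maintaining the extra field.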

If we can avoid that, however, it would be nice. How feasible is the
merge going to be without it?

-- 
dwmw2




Handling renames.

2005-04-14 Thread David Woodhouse
I've been looking at tracking file revisions. One proposed solution was
to have a separate revision history for individual files, with a new
kind of 'filecommit' object which parallels the existing 'commit',
referencing a blob instead of a tree. Then trees would reference such
objects instead of referencing blobs directly.

I think that introduces a lot of redundancy though, because 99% of the
time, the revision history of the individual file is entirely
reproducible from the revision history of the tree. It's only when files
are renamed that we fall over -- and I think we can handle renames
fairly well if we just log them in the commit object. 

My 'gitfilelog.sh' script is already capable of tracking a given file
back through multiple tree commits, listing those commits where the file
in question was actually changed. It uses my patched version of diff-
tree which supports 'diff-tree   ' in order to
do this.

By storing rename information in the commit object, the script (or a
reimplementation of a similar algorithm) could know when to change the
filename it's looking for, as it goes back through the tree. That ought
to be perfectly sufficient.
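
A minimal sketch of that name-following step, assuming a linear history
with one parent per commit and rename records already parsed out of each
commit. struct file_rename, struct commit and name_before are
hypothetical names for illustration, not part of gitfilelog.sh.

```c
#include <stddef.h>
#include <string.h>

struct file_rename {
	const char *from;	/* name in the parent */
	const char *to;		/* name in this commit */
};

/* Hypothetical parsed commit: only the fields this sketch needs. */
struct commit {
	const char *id;
	const struct file_rename *renames;
	size_t nr_renames;
};

/*
 * history[0] is the newest commit and each later element is its parent.
 * Returns the name the file had before the oldest of the n commits.
 */
const char *name_before(const struct commit *history, size_t n,
			const char *name)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		for (j = 0; j < history[i].nr_renames; j++) {
			if (!strcmp(history[i].renames[j].to, name)) {
				/* keep looking under the older name from here on */
				name = history[i].renames[j].from;
				break;
			}
		}
	}
	return name;
}
```

A file-log walk would apply the same substitution one commit at a time,
switching the filename it passes to the tree comparison as soon as it
steps past the commit that recorded the rename.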

So a commit involving a rename would look something like this...

tree 82ba574c85e9a2e4652419c88244e9dd1bfa8baa
parent bb95843a5a0f397270819462812735ee29796fb4
    rename foo.c bar.c
author David Woodhouse <[EMAIL PROTECTED]> 1113499881 +0100
committer David Woodhouse <[EMAIL PROTECTED]> 1113499881 +0100
Rename foo.c to bar.c and s/foo_/bar_/g

Opinions? Dissent? We'd probably need to escape the filenames in some
way -- handwave over that for now.

-- 
dwmw2



Re: Date handling.

2005-04-14 Thread David Woodhouse
On Thu, 2005-04-14 at 02:12 -0700, Linus Torvalds wrote:
> I take that back. I'd be much happier with you doing and testing it, 
> because now I'm crashing.

OK. commit-tree now eats RFC2822 dates as AUTHOR_DATE because that's
what you're going to want to feed it. We store seconds since the UTC
epoch and add the author's or committer's timezone as auxiliary data, so
that dates can be pretty-printed in the original timezone later if anyone
cares. I left the date parsing in rev-tree.c for backward compatibility,
but it can be dropped when we change to base64 :)
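
For illustration, a stored stamp of the form "seconds zone" can be
turned back into wall-clock time in the original zone roughly like this.
format_stamp is a hypothetical helper, not part of the patch below, and
the output layout is an arbitrary choice.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/*
 * Render "<seconds> <+hhmm|-hhmm>" as wall-clock time in the recorded
 * zone. Hypothetical helper; returns 0 on success, -1 on error.
 */
int format_stamp(const char *stamp, char *out, size_t outlen)
{
	char *end, sign, clock[32];
	int hhmm, offset, n;
	long long secs;
	time_t shifted;
	struct tm *tm;

	secs = strtoll(stamp, &end, 10);
	if (end == stamp || sscanf(end, " %c%4d", &sign, &hhmm) != 2)
		return -1;
	if ((sign != '+' && sign != '-') || hhmm < 0)
		return -1;

	/* hhmm is read as a decimal number: 0130 means 1 hour 30 minutes */
	offset = ((hhmm / 100) * 60 + hhmm % 100) * 60;
	if (sign == '-')
		offset = -offset;

	/* Shift into the recorded zone, then break it down as if it were UTC */
	shifted = (time_t)(secs + offset);
	tm = gmtime(&shifted);
	if (!tm || !strftime(clock, sizeof(clock), "%Y-%m-%d %H:%M:%S", tm))
		return -1;

	n = snprintf(out, outlen, "%s %c%04d", clock, sign, hhmm);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Because the stored value is absolute, the reader's own timezone never
enters into it; the zone is only used for display.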

Yes, glibc sucks and strptime is a pile of crap. We have to parse it
ourselves.

Index: commit-tree.c
--- 1756b578489f93999ded68ae347bef7d6063101c/commit-tree.c  (mode:100664 sha1:12196c79f31d004dff0df1f50dda67d8204f5568)
+++ 82ba574c85e9a2e4652419c88244e9dd1bfa8baa/commit-tree.c  (mode:100644 sha1:35cb09402c9868499bcaf6de42afbad9fdfebe05)
@@ -7,6 +7,9 @@
 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #define BLOCKING (1ul << 14)
 #define ORIG_OFFSET (40)
@@ -95,6 +98,148 @@
}
 }
 
+static const char *month_names[] = {
+	"Jan", "Feb", "Mar", "Apr", "May", "Jun",
+	"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
+};
+
+static const char *weekday_names[] = {
+	"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
+};
+
+
+static char *skipfws(char *str)
+{
+	while (isspace(*str))
+		str++;
+	return str;
+}
+
+
+/* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
+   (i.e. English) day/month names, and it doesn't work correctly with %z. */
+static void parse_rfc2822_date(char *date, char *result, int maxlen)
+{
+	struct tm tm;
+	char *p;
+	int i, offset;
+	time_t then;
+
+	memset(&tm, 0, sizeof(tm));
+
+	/* Skip day-name */
+	p = skipfws(date);
+	if (!isdigit(*p)) {
+		for (i=0; i<7; i++) {
+			if (!strncmp(p,weekday_names[i],3) && p[3] == ',') {
+				p = skipfws(p+4);
+				goto day;
+			}
+		}
+		return;
+	}
+
+	/* day */
+ day:
+	tm.tm_mday = strtoul(p, &p, 10);
+
+	if (tm.tm_mday < 1 || tm.tm_mday > 31)
+		return;
+
+	if (!isspace(*p))
+		return;
+
+	p = skipfws(p);
+
+	/* month */
+
+	for (i=0; i<12; i++) {
+		if (!strncmp(p, month_names[i], 3) && isspace(p[3])) {
+			tm.tm_mon = i;
+			p = skipfws(p+strlen(month_names[i]));
+			goto year;
+		}
+	}
+	return; /* Error -- bad month */
+
+	/* year */
+ year:
+	tm.tm_year = strtoul(p, &p, 10);
+
+	if (!tm.tm_year && !isspace(*p))
+		return;
+
+	if (tm.tm_year > 1900)
+		tm.tm_year -= 1900;
+
+	p=skipfws(p);
+
+	/* hour */
+	if (!isdigit(*p))
+		return;
+	tm.tm_hour = strtoul(p, &p, 10);
+
+	if (tm.tm_hour > 23)
+		return;
+
+	if (*p != ':')
+		return; /* Error -- bad time */
+	p++;
+
+	/* minute */
+	if (!isdigit(*p))
+		return;
+	tm.tm_min = strtoul(p, &p, 10);
+
+	if (tm.tm_min > 59)
+		return;
+
+	if (isspace(*p))
+		goto zone;
+
+	if (*p != ':')
+		return; /* Error -- bad time */
+	p++;
+
+	/* second */
+	if (!isdigit(*p))
+		return;
+	tm.tm_sec = strtoul(p, &p, 10);
+
+	if (tm.tm_sec > 59)
+		return;
+
+	if (!isspace(*p))
+		return;
+
+ zone:
+	p = skipfws(p);
+
+	if (*p == '-')
+		offset = -60;
+	else if (*p == '+')
+		offset = 60;
+	else
+		return;
+
+	if (!isdigit(p[1]) || !isdigit(p[2]) || !isdigit(p[3]) || !isdigit(p[4]))
+		return;
+
+	i = strtoul(p+1, NULL, 10);
+	offset *= ((i % 100) + ((i / 100) * 60));
+
+	if (*(skipfws(p + 5)))
+		return;
+
+	then = mktime(&tm); /* mktime appears to ignore the GMT offset, stupidly */
+	if (then == -1)
+		return;
+
+	then -= offset;
+
+	snprintf(result, maxlen, "%lu %5.5s", then, p);
+}
+
 /*
  * Having more than two parents may be strange, but hey, there's
  * no conceptual reason why the file format couldn't accept multi-way
@@ -114,10 +259,12 @@
 	unsigned char commit_sha1[20];
 	char *gecos, *realgecos;
 	char *email, realemail[1000];
-	char *date, *realdate;
+	char date[20], realdate[20];
+	char *audate;
 	char comment[1000];
 	struct passwd *pw;
 	time_t now;
+	struct tm *tm;
 	char *buffer;
 	unsigned int size;
 
@@ -142,15 +289,19 @@

Re: Date handling.

2005-04-14 Thread David Woodhouse
On Thu, 2005-04-14 at 02:00 -0700, Linus Torvalds wrote:
> I do like text output, but if it is painful, the "unix seconds" format is 
> certainly a hell of a lot simpler. And quite frankly, if we change it, we 
> might as well just change it all the way. So I'd almost prefer (1).

Text _output_ is easy to generate; we don't need to store text in the
database for that. So I've changed my mind -- I prefer (1) too.

-- 
dwmw2



Date handling.

2005-04-14 Thread David Woodhouse
The date handling is somewhat unreliable. We render dates into a textual
representation using the committer's locale (day names, etc.), then later
attempt to interpret that in some other locale. And we were just using
localtime without even specifying the timezone so the timestamp was
fairly randomised anyway. In fact, an $AUTHOR_DATE environment variable
was making its way into the database entirely unchecked. 

I see two possible solutions:
1. Just store seconds-since-GMT-epoch and if we really want, the
   timezone as auxiliary information.
2. Store dates in RFC2822 form.

Unless someone convincingly expresses a preference before I get to work
and start playing with it, I'll implement the latter.
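
To make the second option concrete, here is a sketch of producing an
RFC2822 date with day and month names taken from fixed English tables
rather than from the committer's locale. format_rfc2822 and its
parameters are hypothetical and only illustrate the idea.

```c
#include <stdio.h>
#include <time.h>

static const char *const rfc_days[] = {
	"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
};
static const char *const rfc_months[] = {
	"Jan", "Feb", "Mar", "Apr", "May", "Jun",
	"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

/*
 * Format 'when' (seconds since the epoch) for a zone that is tz_minutes
 * east of UTC. Hypothetical helper; returns 0 on success, -1 on error.
 */
int format_rfc2822(time_t when, int tz_minutes, char *out, size_t outlen)
{
	time_t shifted = when + (time_t)tz_minutes * 60;
	int abs_min = tz_minutes < 0 ? -tz_minutes : tz_minutes;
	struct tm *tm = gmtime(&shifted);
	int n;

	if (!tm)
		return -1;

	/* Names come from the tables above, so the locale cannot change them */
	n = snprintf(out, outlen, "%s, %d %s %d %02d:%02d:%02d %c%02d%02d",
		     rfc_days[tm->tm_wday], tm->tm_mday,
		     rfc_months[tm->tm_mon], tm->tm_year + 1900,
		     tm->tm_hour, tm->tm_min, tm->tm_sec,
		     tz_minutes < 0 ? '-' : '+', abs_min / 60, abs_min % 60);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

The timezone is passed in explicitly instead of being taken from
localtime, which avoids the unspecified-zone problem described above.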

-- 
dwmw2

