Re: refspecs with '*' as part of pattern
On Mon, 6 Jul 2015, Junio C Hamano wrote:

> Jacob Keller jacob.kel...@gmail.com writes:
>
> > I've been looking at the refspecs for git fetch, and noticed that globs are partially supported. I wanted to use something like:
> >
> >     refs/tags/some-prefix-*:refs/tags/some-prefix-*
> >
> > as a refspec, so that I can fetch only tags which have a specific prefix. I know that I could use namespaces to separate tags, but unfortunately, I am unable to fix the tag format. The specific repository in question is also generating several tags which are not relevant to me, in formats that are not really useful for human consumption. I am also not able to fix this less than useful practice.
> >
> > However, I noticed that refspecs only support * as a single component. The match algorithm works perfectly fine, as documented in abd2bde78bd9 (Support '*' in the middle of a refspec)
> >
> > What is the reason for not allowing slightly more arbitrary expressions? Obviously no more than one *...
>
> I cannot seem to be able to find related discussions around that patch, so this is only my guess, but I suspect that this is to discourage people from doing something like:
>
>     refs/tags/*:refs/tags/foo-*
>
> which would open a can of worms (e.g. imagine you fetch with that refspec and then push with refs/tags/*:refs/tags/* back there; would you now get foo-v1.0.0 and foo-foo-v1.0.0 for their v1.0.0 tag?) we'd prefer not having to worry about.

That wouldn't be it, since refs/tags/*:refs/tags/foo/* would have the same problem, assuming you didn't set up the push refspec carefully. I think it was mostly that it would be too easy to accidentally do something you don't want by having some other character instead of a slash, like refs/heads/*:refs/heads-*.

Aside from the increased risk of hard-to-spot typos leading to very weird behavior, nothing actually goes wrong; in fact, I've been using git with that check removed for ages because I wanted a refspec like refs/heads/something-*:refs/heads/*.
And it works fine as a local patch, since you don't need your refspec handling to interoperate with other repositories. -Daniel *This .sig left intentionally blank* -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
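The single-`*` mapping discussed in this message can be sketched as follows. This is a minimal illustration, assuming the glob may sit anywhere in the pattern (that is, with the one-component check removed, as in the local patch mentioned above). `map_ref` is a hypothetical helper, not git's implementation, and it ignores the leading `+` and other refspec syntax.

```python
def map_ref(refspec: str, ref: str):
    """Map `ref` through a `src:dst` refspec with at most one '*' per side.

    Returns the destination ref name, or None if `ref` does not match.
    Hypothetical helper for illustration only.
    """
    src, dst = refspec.split(":", 1)
    if "*" not in src:
        # No glob: only an exact match maps, to the literal destination.
        return dst if ref == src else None
    prefix, suffix = src.split("*", 1)
    # The ref must be long enough that prefix and suffix do not overlap.
    if len(ref) < len(prefix) + len(suffix):
        return None
    if not (ref.startswith(prefix) and ref.endswith(suffix)):
        return None
    matched = ref[len(prefix):len(ref) - len(suffix)]
    # Substitute the matched part for the single '*' on the destination side.
    return dst.replace("*", matched, 1)
```

Feeding `refs/tags/foo-v1.0.0` back through `refs/tags/*:refs/tags/foo-*` yields `refs/tags/foo-foo-v1.0.0`, which is the round-trip problem raised in the quoted text; the same would happen with a `foo/` directory prefix.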
Re: [PATCH v4 00/13] New remote-hg helper
On Wed, 31 Oct 2012, Felipe Contreras wrote:

> Hi,
>
> On Wed, Oct 31, 2012 at 7:59 PM, Jonathan Nieder jrnie...@gmail.com wrote:
> > Felipe Contreras wrote:
> > > On Wed, Oct 31, 2012 at 7:20 PM, Johannes Schindelin johannes.schinde...@gmx.de wrote:
> > > > I just tested this with junio/next and it seems this issue is still unfixed: instead of
> > > >
> > > >     reset refs/heads/blub
> > > >     from e7510461b7db54b181d07acced0ed3b1ada072c8
> > > >
> > > > I get
> > > >
> > > >     reset refs/heads/blub
> > > >     from :0
> > > >
> > > > when running git fast-export ^master blub.
> > >
> > > That is not a problem. It has been discussed extensively, and the consensus seems to be that such command should throw nothing: http://article.gmane.org/gmane.comp.version-control.git/208729
> >
> > Um. Are you claiming I have said that git fast-export ^master blub should silently emit nothing? Or has this been discussed extensively with someone else?
>
> Maybe I misunderstood when you said:
>
> > A patch meeting the above description would make perfect sense to me.
>
> Anyway, when you have:
>
>     % git fast-export ^next next^{commit}
>     # nothing
>     % git fast-export ^next next~0
>     # nothing
>     % git fast-export ^next next~1
>     # nothing
>     % git fast-export ^next next~2
>     # nothing
>
> It only makes sense that:
>
>     % git fast-export ^next next
>     # nothing
>
> It doesn't get any more obvious than that. But to each his own.

I think that may be true where you have next in both places, but I think:

    $ git checkout -b new-branch master
    $ git fast-export ^master new-branch

ought to emit no commit lines, but needs to emit a reset line. After all, you haven't told fast-export that the ref new-branch is up to date, and you have told it that you want it to be exported.
If you create a new branch off of an existing commit, don't change it, and push it to hg, it shouldn't be up to remote-hg to figure out what should happen with no input; it should get a:

    reset refs/heads/new-branch
    from [something]

I don't know why Johannes seems to want [something] not to be a mark reference (unless he's complaining about getting an invalid mark reference when there aren't any marks defined), but surely something of the above form is necessary to tell remote-hg to create the new branch.

I think it would be worth testing that:

    $ git checkout -b new-branch master
    $ git push hg new-branch

creates the new branch successfully (which I think it does, but wouldn't if git fast-export ^master new-branch actually returned nothing; parsed_refs gets it from the reset line).

AFAICT, your code relies on getting the behavior that fast-export actually gives, not the behavior you seem to want or the behavior Johannes seems to want. And the reason that you don't need any changes to fast-export is that your process maps marks instead of sha1s.

-Daniel
*This .sig left intentionally blank*
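To illustrate why the reset line matters to a consumer of the stream, here is a rough sketch of collecting refs from `reset` commands. It assumes the layout quoted in this thread (a `reset <ref>` line optionally followed by a `from <commit-ish>` line). `parse_resets` is a hypothetical name and is not taken from remote-hg.

```python
def parse_resets(stream: str) -> dict:
    """Collect `reset <ref>` commands and their optional `from` lines.

    Hypothetical helper for illustration; a real consumer would also have
    to handle commit, blob and tag commands and skip their data sections.
    """
    refs = {}
    current = None
    for line in stream.splitlines():
        if line.startswith("reset "):
            current = line[len("reset "):]
            refs[current] = None  # ref seen, no 'from' line yet
        elif line.startswith("from ") and current is not None:
            refs[current] = line[len("from "):]
            current = None
        else:
            current = None  # any other line ends the reset command
    return refs
```

With an empty stream the result is an empty dict, so a helper written this way would have no way to learn that `new-branch` should be created; with the reset line present it sees the ref even when no commits are exported.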
Re: git-clone ignores umask for working tree
On Fri, 6 Jul 2012, Alex Riesen wrote:

> Hi list,
>
> when git-clone was built in, its treatment of umask has changed: the shell version respected umask for newly created directories by using plain mkdir(1), and the builtin version just uses mkdir(work_tree, 0755).
>
> Is it intentional?

I have the vague feeling that it was intentional, but it's entirely plausible that I just overlooked that mkdir(2) applies umask and went for the mode that you normally want. I don't think there's any particular need for this operation to be more restrictive than umask.

-Daniel
*This .sig left intentionally blank*
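The interaction between the requested mode and umask can be sketched as below. The arithmetic follows the documented mkdir(2) behaviour (the umask bits are cleared from the requested mode); both function names are made up for this illustration, and the second one assumes a POSIX system.

```python
import os
import stat
import tempfile

def effective_mode(requested: int, umask: int) -> int:
    """Permission bits a new directory ends up with: requested & ~umask."""
    return requested & ~umask & 0o777

def mkdir_under_umask(umask: int, requested: int = 0o755) -> int:
    """Create a directory under a temporary umask and return its mode bits."""
    old = os.umask(umask)
    try:
        with tempfile.TemporaryDirectory() as parent:
            path = os.path.join(parent, "work_tree")
            os.mkdir(path, requested)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)  # always restore the caller's umask
```

So mkdir(work_tree, 0755) can never be more permissive than 0755, which means a user with a looser umask such as 002 gets a stricter directory than plain mkdir(1) would have given; passing 0777 and letting umask decide would match the shell version.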
Re: [RFH] Merge driver
On Fri, 9 Sep 2005, Junio C Hamano wrote:

> I have several requests to people who are interested in merges and read-tree changes.
>
> I am pretty much set to use the recent read-tree updates Daniel has been working on. The only reason it has not hit the master branch yet, except that it still has known leaks that have not been plugged, is because read-tree is so fundamental to everything we do, and I am trying to be extremely conservative here. I've beaten it myself reasonably well and have not found any regressions (except removal of --emu23 which I believe nobody uses anyway), but I'd appreciate people to try it out and see if it performs well for your dataset.
>
> If you are planning further surgery on read-tree code, please base your changes on Daniel's rewrite to avoid your effort being wasted. This request goes both to Chuck (active_cache abstraction) and Fredrik (addition of 'ignore index and working tree matching rules' [*1*]).
>
> A proposed merge driver 'git-merge' is in the proposed updates branch. This is intended to be the top-level user interface to the merge machinery which drives multiple merge strategy scripts, and I am hoping that I can eventually (1) retire 'git-resolve' and 'git-octopus' (they simply become merge strategy scripts driven by 'git-merge') and (2) call 'git-merge' from 'git-pull'.
>
> What I have in the proposed updates branch has been fixed since my earlier message to the list and has a new merge strategy script, in addition to 'resolve' and 'octopus', called 'git-merge-multibase'. This uses Daniel's read-tree that can use more than one merge base. I request Daniel to give OK to its name or suggest a better name for this script -- I would even accept 'git-merge-barkalow' if you want ;-).

I'd actually been thinking it would just go into the resolve driver, with that going back to before it chose among merge-base outputs and just sending the whole list to read-tree.
> If you are planning to implement a new merge strategy, please use the ones in the proposed updates branch as examples, and complain and suggest improvements if you find the interface between the strategy scripts and the driver lacking. This request goes primarily to Fredrik. I'm interested in doing the renaming merge that would have helped HPA's klibc-kbuild vs klibc case myself but if somebody else is so inclined please go wild.
>
> And finally, a request to everybody; please try out 'git-merge' and see how you like it.
>
>     `git-merge` [-n] [-s strategy]... msg head remote remote...
>
>     -n::
>         Do not show diffstat at the end of the merge.
>
>     -s strategy::
>         use that merge strategy; can be given more than once to specify them in the order they should be tried. If there is no `-s` option, built-in list of strategies is used instead.
>
>     head::
>         our branch head commit.
>
>     remote::
>         other branch head merged into our branch. You need at least one remote. Specifying more than one remote obviously means you are trying an Octopus.
>
> Here is a sample transcript from a test resolving one of the 'more-than-one-merge-base' commits Fredrik found in the kernel repository (: siamese; is my $PS1;is my $PS2).
>
>     : siamese; git reset --hard b8112df71cae7d6a86158caeb19d215f56c4f9ab
>     : siamese; git merge -n \
>     'reproduce 0e396ee43e445cb7c215a98da4e76d0ce354d9d7' \
>     HEAD 2089a0d38bc9c2cdd084207ebf7082b18cf4bf58
>     Trying merge strategy resolve...
>     Trying to find the optimum merge base.
>     Trying simple merge.
>     Simple merge failed, trying Automatic merge.
>     Removing drivers/net/fmv18x.c
>     Auto-merging drivers/net/r8169.c.
>     merge: warning: conflicts during merge
>     ERROR: Merge conflict in drivers/net/r8169.c.
>     Removing drivers/net/sk_g16.c
>     Removing drivers/net/sk_g16.h
>     fatal: merge program failed
>     Rewinding the tree to pristine...
>     Trying merge strategy multibase...
>     Trying simple merge.
>     Simple merge failed, trying Automatic merge.
>     Removing drivers/net/fmv18x.c
>     Auto-merging drivers/net/r8169.c.
>     merge: warning: conflicts during merge
>     ERROR: Merge conflict in drivers/net/r8169.c.
>     Removing drivers/net/sk_g16.c
>     Removing drivers/net/sk_g16.h
>     fatal: merge program failed
>     Rewinding the tree to pristine...
>     Trying merge strategy octopus...
>     Rewinding the tree to pristine...
>     Using the multibase to prepare resolving by hand.
>     Trying simple merge.
>     Simple merge failed, trying Automatic merge.
>     Removing drivers/net/fmv18x.c
>     Auto-merging drivers/net/r8169.c.
>     merge: warning: conflicts during merge
>     ERROR: Merge conflict in drivers/net/r8169.c.
>     Removing drivers/net/sk_g16.c
>     Removing drivers/net/sk_g16.h
>     fatal: merge program failed
>     Automatic merge failed; fix up by hand
>     : siamese; git-update-cache --refresh
>     drivers/net/r8169.c: needs update
>     : siamese; echo
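The driver behaviour visible in the transcript above (strategies tried in the order given, the tree rewound after each failure) might be sketched like this. `run` and `rewind` are hypothetical callbacks standing in for the strategy scripts and for restoring the pristine tree; how the real driver picks a strategy for hand resolution once everything has failed is not described in the message, so the sketch leaves that to the caller.

```python
from typing import Callable, Iterable, Optional

def try_strategies(
    strategies: Iterable[str],
    run: Callable[[str], bool],
    rewind: Callable[[], None],
) -> Optional[str]:
    """Try each strategy in order; return the first that merges cleanly.

    `run(name)` returns True on a clean merge. Both callbacks are
    assumptions made for this illustration.
    """
    for name in strategies:
        print(f"Trying merge strategy {name}...")
        if run(name):
            return name
        # A failed attempt may have touched the index and working tree.
        print("Rewinding the tree to pristine...")
        rewind()
    return None  # no strategy merged cleanly; the caller must fall back
```

Passing the strategy list explicitly mirrors repeated `-s` options; a built-in default list would simply be another value for `strategies`.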
Re: [RFH] Merge driver
On Fri, 9 Sep 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > It tries to make sure that there is room to put stuff for resolving a conflict without messing with modified files in the directory.
>
> I agree it can be used that way, but nobody seems to use it for that purpose as far as I can tell hence my earlier comment. But let's leave the door open by having them as independent options.

Ah, okay. I hadn't realized that resolve used -u for that call to read-tree. You're entirely right.

-Daniel
*This .sig left intentionally blank*
Re: Multi-ancestor read-tree notes
On Fri, 9 Sep 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > In case #16, I'm not sure what I should produce. I think the best thing might be to not leave anything in stage 1. The desired end effect is that the user is given a file with a section like:
> >
> >     {
> >       *t = NULL;
> >       *m = 0;
> >       return Z_DATA_ERROR;
> >       return Z_OK;
> >     }
>
> I was thinking a bit more about this. Let's rephrase case #16. I'll call merge bases O1, O2,... and merge heads A and B, and we are interested in one path. In O1 and O2, the path has quite different contents. A has the same contents as O1 and B has the same contents as O2.

There's a bit more subtlety here: since these are common ancestors, A must have somehow changed O2's version to O1's version, and B must have changed O1's version to O2's version. It isn't just that each side left the file the same, but from different ancestral versions; both of the other versions must have gotten rejected somehow. I think the real key is to identify what was going on in between.

> We should not just pick one or the other and do two-file merge between the version in A and B (we could prototype by massaging 'diff A B' output to produce what is common between A and B and run (RCS) merge of A and B pretending that the common contents is the original to produce something like the above).
>
> If A has slight changes since O1 but B did not change since O2, ideally I think we would want the same thing to happen. Let's call it case #16+. What does the current implementation do? It is not case #16 because A and O1 do not exactly match. I suspect the result will be skewed because B has an exact match with O2.

Yes, in this case we miss whatever caused A to reject O2, and we use the modified O2, because we don't realize that A's rejection of O2 should also apply to the version in B. Unfortunately, this looks just like the situation where both sides took O1, and B did a further modification to that.
> The situation becomes more interesting if both A and B have slight changes since O1 and O2 respectively. They do not exactly match with their bases, but I think ideally we would like something very similar to case #16 resolution to happen.

I think the right thing, ideally, is to have the content merge also take multiple ancestors and have a #16 case itself when it's deciding which version of a block to use. The #16+ case is actually trickier, because we have fewer cues.

> One way to solve this would be to try doing things entirely in read-tree by doing not just exact matches but also checking the amount of changes -- if each head has a similar but different base, call it case #16 and try two-file merge between the heads disregarding the bases. But I am a bit reluctant to suggest this. My gut feeling tells me that these 'interesting' cases are easier if scripted outside read-tree machinery to later enhance and improve the heuristics.
>
> Of course, the current case #16 detected by the exact match rule should be something we can automatically handle, but to make things safer to use I think we should have a way to detect case #16+ situation and avoid mistakenly favoring A over B (or vice versa) only because one has slight modification while the other does not.

I think #16+ is extra uncommon, because it involves someone making an irrelevant modification to a patched version of a file while someone else reverts the patch.

I'm actually interested in doing a big spiffy program to do merges with information drawn as needed from the history, stuff happening on a per-hunk level, and support for block moves. It'll take a while before it gets anywhere, but I still think it's likely that people won't hit #16+ and get unexpected behavior before it's ready.
The main thing I'm unsure of is whether Fredrik's algorithm is actually not a better solution: it is possible to understand what happened leading up to a merge either by looking at the time after the common ancestors or by looking at the time before them. I think that the more recent history is a better guide, but the older history is easier to use; the case his version isn't good for, I think, is when the common ancestors of the sides are even more complicated to merge.

-Daniel
*This .sig left intentionally blank*
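The exact-match rule for case #16 discussed in this message can be written down as a small predicate. This is a sketch under the assumption that each argument is a content identifier for one path (for instance a blob hash); `is_case_16` is a hypothetical name, not a function in read-tree.

```python
def is_case_16(ancestors, head, remote) -> bool:
    """True when head and remote differ and each exactly equals some merge base.

    `ancestors` holds the path's content in each merge base (O1, O2, ...);
    `head` and `remote` hold its content in A and B.
    """
    return head != remote and head in ancestors and remote in ancestors
```

A case #16+ input, where A carries a slight change on top of O1, fails this test because `head` no longer equals any base, so the exact-match rule alone cannot tell it apart from the situation where both sides took O1 and B modified it further.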
Re: [PATCH 0/2] A new merge algorithm, take 3
On Thu, 8 Sep 2005, Fredrik Kuivinen wrote:

> On Wed, Sep 07, 2005 at 02:33:42PM -0400, Daniel Barkalow wrote:
> > On Wed, 7 Sep 2005, Fredrik Kuivinen wrote:
> > > Of the 500 merge commits that currently exist in the kernel repository 19 produce non-clean merges with git-merge-script. The four merge cases listed in [EMAIL PROTECTED] are cleanly merged by git-merge-script. Every merge commit which is cleanly merged by git-resolve-script is also cleanly merged by git-merge-script, furthermore the results are identical. There are currently two merges in the kernel repository which are not cleanly merged by git-resolve-script but are cleanly merged by git-merge-script.
> >
> > If you use my read-tree and change git-resolve-script to pass all of the ancestors to it, how does it do? I expect you'll still be slightly ahead, because we don't (yet) have content merge with multiple ancestors. You should also check the merge that Tony Luck reported, which undid a revert, as well as the one that Len Brown reported around the same time which had similar problems. I think maintainer trees are a much better test for a merge algorithm, because the kernel repository is relatively linear, while maintainers tend more to merge things back and forth.
>
> Junio tested some of the multiple common ancestor cases with your version of read-tree and reported his results in [EMAIL PROTECTED].

Oh, right. I'm clearly not paying enough attention here.

> The two cases my algorithm merges cleanly and git-resolve-script does not merge cleanly are 0e396ee43e445cb7c215a98da4e76d0ce354d9d7 and 0c168775709faa74c1b87f1e61046e0c51ade7f3. Both of them have two common ancestors. The second one has, as far as I know, not been tested with your read-tree.

Okay, I'll have to check whether the result I get seems right. I take it your result agrees with what the users actually produced by hand?

> The merge cases reported by Tony Luck and Len Brown are both cleanly merged by my code.

Do they come out correctly?
Both of those have cases which cannot be decided correctly with only the ancestor trees, due to one branch reverting a patch that was only in one ancestor. The correct result is to revert that patch, but figuring out that requires looking at more trees. I think your algorithm should work for this case, but it would be good to have verification. (IIRC, Len got the correct result while Tony got the wrong result and then corrected it later.)

> You are probably right about the maintainer trees. I should have a look at some of them. Do you know any specific repositories with interesting merge cases?

Not especially, except that I would guess that people who have reported hitting bad cases would be more likely to have other interesting merges in their trees. You might also try merging maintainer trees with each other, since it's relatively likely that there would be complicating overlap that only doesn't cause confusion because things get rearranged in -mm. For that matter, I bet you'd get plenty of test cases out of trying to replicate -mm as a git tree.

-Daniel
*This .sig left intentionally blank*
Re: [PATCH 0/2] A new merge algorithm, take 3
On Thu, 8 Sep 2005, Fredrik Kuivinen wrote:

> The first one agrees with what was actually committed. For the second one the difference between the tree produced by the algorithm and what was committed is:
>
>     diff --git a/include/net/ieee80211.h b/include/net/ieee80211.h
>     --- a/include/net/ieee80211.h
>     +++ b/include/net/ieee80211.h
>     @@ -425,9 +425,7 @@ struct ieee80211_stats {
>
>      struct ieee80211_device;
>
>     -#if 0 /* for later */
>      #include ieee80211_crypt.h
>     -#endif
>
>      #define SEC_KEY_1 (10)
>      #define SEC_KEY_2 (11)
>
> I have looked at the files and common ancestors involved and I think that this change has been introduced manually. I may have missed something when I analysed it though...

Certainly possible that it was done manually.

> > > The merge cases reported by Tony Luck and Len Brown are both cleanly merged by my code.
> >
> > Do they come out correctly? Both of those have cases which cannot be decided correctly with only the ancestor trees, due to one branch reverting a patch that was only in one ancestor. The correct result is to revert that patch, but figuring out that requires looking at more trees. I think your algorithm should work for this case, but it would be good to have verification. (IIRC, Len got the correct result while Tony got the wrong result and then corrected it later.)
>
> Len's merge case comes out identically to the tree he committed. I have described what I got for Tony's case in [EMAIL PROTECTED] (my merge algorithm produces the result Tony expected to get, but he didn't get that from git-resolve-script).

Good. It looks to me like this is a good algorithm in practice, then.

-Daniel
*This .sig left intentionally blank*
Re: Multi-ancestor read-tree notes
On Thu, 8 Sep 2005, Darrin Thompson wrote:

> On Mon, 2005-09-05 at 01:41 -0400, Daniel Barkalow wrote:
> > I've got a version of read-tree which accepts multiple ancestors and does a merge using information from all of them.
>
> Do the multiple ancestors have to share a common parent? More to the point, is this read-tree any more friendly to baseless merges?

read-tree doesn't care about the relationships between its inputs; it's only interested in the trees. But using ancestors which aren't common is unlikely to give you desired results. I think, if you do read-tree a^ b^ a b, you will get everything into the index, but it's all going to be conflicts.

I assume that what you want is something to include everything from two commits, which would give conflicts if a name is reused?

-Daniel
*This .sig left intentionally blank*
Re: Multi-ancestor read-tree notes
On Thu, 8 Sep 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > I assume that what you want is something to include everything from two commits, which would give conflicts if a name is reused?
>
> My understanding is that Darrin wants to do what Linus did when he merged gitk into git.git. Personally I think that is a specialized application and something like the git-merge-projects script I posted as a follow-up would be more appropriate than adding it to the current merge discussion.

Well, it's an easy addition to read-tree; just need a merge function which takes two entries and adds the non-NULL one in stage 0, or adds both if they both exist. git-merge-script probably shouldn't be the entry point to it, of course, but that part isn't my area anyway.

-Daniel
*This .sig left intentionally blank*
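The merge function described in the reply above can be sketched per path as follows. Entries are modelled as plain strings with `None` for a missing path; the message does not say which stages the two entries go into when both exist, so stages 2 and 3 here are an assumption, and `merge_union` is a hypothetical name.

```python
from typing import List, Optional, Tuple

def merge_union(a: Optional[str], b: Optional[str]) -> List[Tuple[int, str]]:
    """Return (stage, entry) pairs for one path when combining two trees."""
    if a is not None and b is not None:
        # Both sides have the path: record both so the reused name shows
        # up as a conflict. Using stages 2 and 3 is an assumption.
        return [(2, a), (3, b)]
    if a is not None:
        return [(0, a)]  # only the first tree has it: resolved entry
    if b is not None:
        return [(0, b)]  # only the second tree has it: resolved entry
    return []            # neither tree has the path
```

For two unrelated projects, as in the gitk case mentioned in the quoted text, nearly every path would take one of the single-sided branches, leaving conflicts only where both trees use the same name.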
Re: [PATCH 0/2] A new merge algorithm, take 3
On Wed, 7 Sep 2005, Fredrik Kuivinen wrote:

> Of the 500 merge commits that currently exist in the kernel repository 19 produce non-clean merges with git-merge-script. The four merge cases listed in [EMAIL PROTECTED] are cleanly merged by git-merge-script. Every merge commit which is cleanly merged by git-resolve-script is also cleanly merged by git-merge-script, furthermore the results are identical. There are currently two merges in the kernel repository which are not cleanly merged by git-resolve-script but are cleanly merged by git-merge-script.

If you use my read-tree and change git-resolve-script to pass all of the ancestors to it, how does it do? I expect you'll still be slightly ahead, because we don't (yet) have content merge with multiple ancestors.

You should also check the merge that Tony Luck reported, which undid a revert, as well as the one that Len Brown reported around the same time which had similar problems. I think maintainer trees are a much better test for a merge algorithm, because the kernel repository is relatively linear, while maintainers tend more to merge things back and forth.

-Daniel
*This .sig left intentionally blank*
Re: Multi-ancestor read-tree notes
On Mon, 5 Sep 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > I've got a version of read-tree which accepts multiple ancestors and does a merge using information from all of them.
>
> After disabling the debugging printf(), I used this read-tree to try resolving the parents of four commits Fredrik Kuivinen gave us in [EMAIL PROTECTED] using their two merge bases, and compared the resulting tree with the tree recorded in the commit. The results are really promising.
>
> For the following two commits, multi-base merge resolved their parents trivially and produced the same result as the tree in the commit. The current best-base merge in the master branch performed far worse and left many conflicts.
>
>   - 467ca22d3371f132ee225a5591a1ed0cd518cb3d
>   - da28c12089dfcfb8695b6b555cdb8e03dda2b690
>
> Another one, 0e396ee43e445cb7c215a98da4e76d0ce354d9d7, multi-base merge left only one conflicting path to be hand resolved. The best-base merge again performed far worse.
>
> The other one, 3190186362466658f01b2e354e639378ce07e1a9, is resolved trivially with both algorithms.

Do you know if there's anything like case #16 in there? I'd be interested to know if there's anything that gets handled automatically in different ways depending on which single base is used, and doesn't require manual intervention with multiple bases, because that's probably wrong.

> > In case #16, I'm not sure what I should produce. I think the best thing might be to not leave anything in stage 1.
>
> Because? I know it would affect the readers of index files if you did so, but it would seem the most natural in git architecture to have merge-cache look at the resulting cache with such multiple stage 1 entries (and other stages) and let the script make a decision.

I didn't want to break the assumption of only one entry per stage in the initial version. I'm also not sure that listing the ancestors is particularly useful in this case.
They have to be exactly the contents of stages 2 and 3, plus possibly more stuff that's not been kept by either side. What you actually want is a two-way merge (i.e., a diff between the two sides, presented in merge format), so you don't really need any ancestors, unless it would fit some more general case that way.

> > The desired end effect is that the user is given a file with a section like:
> >
> >     {
> >       *t = NULL;
> >       *m = 0;
> >       return Z_DATA_ERROR;
> >       return Z_OK;
> >     }
>
> Sounds fine.
>
> Anyway, I really am happy to see this multi-base merge perform well on real-world data, and you are certainly the git hero of the week ;-).

Great. Want me to send the patches with better organization, or are you set with what I've sent?

-Daniel
*This .sig left intentionally blank*
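The "diff between the two sides, presented in merge format" idea could look roughly like the sketch below. It uses Python's difflib rather than anything from git, treats every differing region as a conflict because there is no ancestor to decide with, and the marker style and labels are assumptions made for this illustration.

```python
from difflib import SequenceMatcher

def two_way_merge(ours, theirs, ours_label="ours", theirs_label="theirs"):
    """Merge two line lists with no ancestor, marking each difference as a conflict."""
    out = []
    matcher = SequenceMatcher(None, ours, theirs, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(ours[i1:i2])
        else:
            # 'replace', 'delete' and 'insert' all become a conflict hunk;
            # one side of the hunk may be empty.
            out.append(f"<<<<<<< {ours_label}")
            out.extend(ours[i1:i2])
            out.append("=======")
            out.extend(theirs[j1:j2])
            out.append(f">>>>>>> {theirs_label}")
    return out
```

Applied to two versions of the block quoted above that differ only in the return statement, this keeps the shared lines once and wraps the two return statements in a single conflict hunk.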
Re: [PATCH] Make sure the diff machinery outputs \ No newline ... in english
On Mon, 5 Sep 2005, Linus Torvalds wrote:

> On Mon, 5 Sep 2005, Fredrik Kuivinen wrote:
> > After a quick look through the diff source I didn't find anything else. It's quite possible that I have missed something though. Most of the translated messages are related to error reporting, which I guess might be nice to have in the user specified language.
>
> Is it possible that we could integrate the diff algorithm into git, and get rid of the dependency on an external GNU diff? It would also make the portability problems go away (ie old diff's being broken).
>
> It would also potentially speed up the normal built-in diff a lot, since we wouldn't have to execute a whole other program to generate a diff, just call a helper function the way we do for xdiff..
>
> Unreasonable?

The algorithm actually used by GNU diff is pretty complicated, and I don't really understand the actual implementation, which evidently has a few important refinements over the original paper.

I've written my own diff, mainly to try a different algorithm, and it seems to work, but the code isn't yet appropriate to submit. This algorithm also has the advantage that it can identify moved sections and is less interested in interleaving a removed function with a new function to provide the shortest possible diff. I expect that I could get it to work if I put in a day on it; it's mostly writing a hashtable implementation for non-NULL-terminated string-keyed hash tables.

-Daniel
*This .sig left intentionally blank*
Re: Multi-ancestor read-tree notes
On Tue, 6 Sep 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > Do you know if there's anything like case #16 in there? I'd be interested to know if there's anything that gets handled automatically in different ways depending on which single base is used, and doesn't require manual intervention with multiple bases, because that's probably wrong.
>
> Re-running the tests with the attached patch shows there weren't any.

Good. (Although that patch doesn't seem to be directly on top of my version; I can tell what it's doing anyway)

> > Great. Want me to send the patches with better organization, or are you set with what I've sent?
>
> That's up to you. If you are content with what I have in the pu branch, there is no need to bother resending. OTOH if you have further clean-ups in mind, i.e. better organization above, I do not mind dropping the current ones from pu and replace them with another set from you.

I'm happy with the content in pu; the issue is just whether you want the history cleaned up more. In the series I sent, I kept forgetting parts that belonged in earlier patches.

Could you look over the documentation in Documentation/technical/trivial-merge.txt, and see if it's a suitable replacement for the table in t1000-read-tree-m-3way.sh? It should be the same, except for ALT or non-ALT versions that we're not using, combining a few matching cases, describing the rules behind index requirements rather than listing outcomes, and the addition of info on how multiple ancestors are handled.

-Daniel
*This .sig left intentionally blank*
Re: Multi-ancestor read-tree notes
On Tue, 6 Sep 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > Good. (Although that patch doesn't seem to be directly on top of my version; I can tell what it's doing anyway)
>
> That one was against the proposed updates head. I've updated it again to include the patch.
>
> > I'm happy with the content in pu; the issue is just whether you want the history cleaned up more. In the series I sent, I kept forgetting parts that belonged in earlier patches.
>
> Again, that is up to you. I am not _that_ perfectionist but I do not mind reapplying updated ones if you are ;-).

What's there is fine with me. (I'll work on improving the documentation as a further patch)

> > Could you look over the documentation in Documentation/technical/trivial-merge.txt, and see if it's a suitable replacement for the table in t1000-read-tree-m-3way.sh?
>
> I do not understand what you meant by '*' and 'index+' in one-way merge table. I take the first row ('*') to mean "If the tree is missing a path, that path is removed from the index."

'*' means that that case applies regardless of what's there. 'index+' means that it's the index, with the stat information. I forgot to actually explain the table before going on to the interesting section.

> I like the second sentence in three-way merge description. That is a very easy-to-understand description of what the index requirements are.
>
> You have 2 2ALTs. Also 14 and 14ALT look like they are the same rule now.

Ah, right. I had originally listed index in the table, with separate cases for having it match the head and having it match the result, but then ditched that when I figured out how that actually works.

> What's (empty)^ in ancest? All of them must be empty for this rule to apply?

The '^' means that all must be like that. I have to check, but I think that 8ALT and 10ALT should be '+'.
I am not quite sure it is 'a suitable replacement' yet; the existing table you can see it covers all the cases, but with things like 'ancestor+' means one of them matches, I cannot really tell the table covers all the cases or some cases fall of the end of the chain. All of the any ancestor spots are good for covering things. Case #11 (which actually needs to be at the bottom) is basically everything else. Also when we have more than one ancestors or one remotes and we say no merge, it is still unspecified (and I have to admit I cannot readily say what the result should be for all of them, except that I agree #16 will be fine with an empty stage1) what are left in which stages. Presently, except for case #16, only the first ancestor is used in no merge output. The right thing should be worked out and documented, of course. I'm not at all convinced at this point that we can do much with multiple remotes in a single application of the rules; you won't necessarily have the same merge base for all pairs, and all sorts of things go wrong if you start including ancestors that aren't related to something, or not including common ancestors of some pair. What might work is to have the error for an unmerged index only happen when you get to a no merge result, so that you can get as many conflicts as possible (in different files) resolved by the user at the same time. I personally think the exotic cases (i.e. no rule applies, or no merge result with more than one ancestors/remotes) needs to be handled outside read-tree anyway, by the script that drives read-tree to attempt trivial merges. I think case #16 would benefit from doing more stuff, but there aren't any holes in the rules, and I think that, for the multiple ancestors in no merge, we just want to use the one with the least conflict. (Or, if we write our own merge, do a #16/#13,#14/#11 decision per-hunk in our merge, which is the really right thing). 
I think the common case for multiple ancestors will really be that you've got a side branch that split before the split you're resolving, and was merged into both sides before now; in this case, there's no big problem, and it's not the exotic cross-merge case. Of course, we won't see this in projects like the kernel and git, which aren't that amorphous. -Daniel *This .sig left intentionally blank* - To unsubscribe from this list: send the line unsubscribe git in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/4] Add a function for getting a struct tree for an ent.
Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 tree.c |   21 +++++++++++++++++++++
 tree.h |    3 +++
 2 files changed, 24 insertions(+), 0 deletions(-)

3bfcc20b6aeff3e1fbcce97a426383c9770a2105
diff --git a/tree.c b/tree.c
--- a/tree.c
+++ b/tree.c
@@ -1,5 +1,7 @@
 #include "tree.h"
 #include "blob.h"
+#include "commit.h"
+#include "tag.h"
 #include "cache.h"
 #include <stdlib.h>
 
@@ -212,3 +214,22 @@ int parse_tree(struct tree *item)
 	free(buffer);
 	return ret;
 }
+
+struct tree *parse_tree_indirect(const unsigned char *sha1)
+{
+	struct object *obj = parse_object(sha1);
+	do {
+		if (!obj)
+			return NULL;
+		if (obj->type == tree_type)
+			return (struct tree *) obj;
+		else if (obj->type == commit_type)
+			obj = &(((struct commit *) obj)->tree->object);
+		else if (obj->type == tag_type)
+			obj = ((struct tag *) obj)->tagged;
+		else
+			return NULL;
+		if (!obj->parsed)
+			parse_object(obj->sha1);
+	} while (1);
+}
diff --git a/tree.h b/tree.h
--- a/tree.h
+++ b/tree.h
@@ -32,4 +32,7 @@ int parse_tree_buffer(struct tree *item,
 
 int parse_tree(struct tree *tree);
 
+/* Parses and returns the tree in the given ent, chasing tags and commits. */
+struct tree *parse_tree_indirect(const unsigned char *sha1);
+
 #endif /* TREE_H */
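For anyone who wants to poke at the "chase tags and commits until you hit a tree" loop without building git, here is a minimal sketch on a toy object model. Everything in it (toy_type, toy_object, toy_tree_indirect) is a hypothetical stand-in and not git's API; it leaves out the parse_object() calls and only shows the shape of the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for git's object types; not the real structs. */
enum toy_type { TOY_BLOB, TOY_TREE, TOY_COMMIT, TOY_TAG };

struct toy_object {
	enum toy_type type;
	struct toy_object *target; /* commit: its tree; tag: the tagged object */
};

/*
 * Follow tags and commits until a tree is reached.
 * Returns NULL for a missing object or one that cannot lead to a tree.
 */
static struct toy_object *toy_tree_indirect(struct toy_object *obj)
{
	for (;;) {
		if (!obj)
			return NULL;
		switch (obj->type) {
		case TOY_TREE:
			return obj;
		case TOY_COMMIT:
		case TOY_TAG:
			/* a self-referencing toy object would loop forever */
			assert(obj->target != obj);
			obj = obj->target;
			break;
		default:
			return NULL;
		}
	}
}
```

A tag pointing at a commit pointing at a tree resolves to that tree; a tag pointing at a blob gives NULL, as does a NULL input.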
[PATCH 2/4] Add function to append to an object_list.
Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 object.c |   11 +++++++++++
 object.h |    3 +++
 2 files changed, 14 insertions(+), 0 deletions(-)

88cf2db55848e7a2cf655171c7e9fd74c70a0281
diff --git a/object.c b/object.c
--- a/object.c
+++ b/object.c
@@ -184,6 +184,17 @@ struct object_list *object_list_insert(s
 	return new_list;
 }
 
+void object_list_append(struct object *item,
+			struct object_list **list_p)
+{
+	while (*list_p) {
+		list_p = &((*list_p)->next);
+	}
+	*list_p = xmalloc(sizeof(struct object_list));
+	(*list_p)->next = NULL;
+	(*list_p)->item = item;
+}
+
 unsigned object_list_length(struct object_list *list)
 {
 	unsigned ret = 0;
diff --git a/object.h b/object.h
--- a/object.h
+++ b/object.h
@@ -41,6 +41,9 @@ void mark_reachable(struct object *obj, 
 struct object_list *object_list_insert(struct object *item,
 				       struct object_list **list_p);
 
+void object_list_append(struct object *item,
+			struct object_list **list_p);
+
 unsigned object_list_length(struct object_list *list);
 
 int object_list_contains(struct object_list *list, struct object *obj);
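The append walks a pointer-to-pointer down to the terminating NULL link, so the empty list needs no special case. A standalone sketch of the same idiom follows; toy_list and toy_list_append are hypothetical names, the payload is an int rather than a struct object pointer, and plain malloc() with an error return stands in for xmalloc() (which dies on failure).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy node mirroring the shape of struct object_list in the patch. */
struct toy_list {
	int item;
	struct toy_list *next;
};

/*
 * Append by advancing list_p until it points at the NULL link at the
 * end of the list, then hang the new node there.
 * Returns 0 on success, -1 if allocation fails.
 */
static int toy_list_append(int item, struct toy_list **list_p)
{
	struct toy_list *node;

	assert(list_p != NULL);
	while (*list_p)
		list_p = &(*list_p)->next;
	node = malloc(sizeof(*node));
	if (!node)
		return -1;
	node->item = item;
	node->next = NULL;
	*list_p = node;
	return 0;
}
```

Each call walks the whole list, so building a list of n items this way is quadratic; that is presumably fine for the handful of trees read-tree takes on its command line.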
[PATCH 3/4] Rewrite read-tree
Adds support for multiple ancestors, removes --emu23, much
simplification.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 read-tree.c                       |  811 +++--
 t/t1005-read-tree-m-2way-emu23.sh |  422 ---
 2 files changed, 425 insertions(+), 808 deletions(-)
 delete mode 100755 t/t1005-read-tree-m-2way-emu23.sh

f196469bec156947038f1d3d00c899c9044334ca
diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -5,73 +5,291 @@
  */
 #include "cache.h"
 
-static int stage = 0;
+#include "object.h"
+#include "tree.h"
+
+static int merge = 0;
 static int update = 0;
 
-static int unpack_tree(unsigned char *sha1)
-{
-	void *buffer;
-	unsigned long size;
-	int ret;
+static int head_idx = -1;
+static int merge_size = 0;
 
-	buffer = read_object_with_reference(sha1, "tree", &size, NULL);
-	if (!buffer)
-		return -1;
-	ret = read_tree(buffer, size, stage, NULL);
-	free(buffer);
+static struct object_list *trees = NULL;
+
+static struct cache_entry df_conflict_entry = {
+};
+
+static struct tree_entry_list df_conflict_list = {
+	.name = NULL,
+	.next = &df_conflict_list
+};
+
+typedef int (*merge_fn_t)(struct cache_entry **src);
+
+static int entcmp(char *name1, int dir1, char *name2, int dir2)
+{
+	int len1 = strlen(name1);
+	int len2 = strlen(name2);
+	int len = len1 < len2 ? len1 : len2;
+	int ret = memcmp(name1, name2, len);
+	unsigned char c1, c2;
+	if (ret)
+		return ret;
+	c1 = name1[len];
+	c2 = name2[len];
+	if (!c1 && dir1)
+		c1 = '/';
+	if (!c2 && dir2)
+		c2 = '/';
+	ret = (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;
+	if (c1 && c2 && !ret)
+		ret = len1 - len2;
 	return ret;
 }
 
-static int path_matches(struct cache_entry *a, struct cache_entry *b)
+static int unpack_trees_rec(struct tree_entry_list **posns, int len,
+			    const char *base, merge_fn_t fn, int *indpos)
 {
-	int len = ce_namelen(a);
-	return ce_namelen(b) == len &&
-		!memcmp(a->name, b->name, len);
+	int baselen = strlen(base);
+	int src_size = len + 1;
+	do {
+		int i;
+		char *first;
+		int firstdir = 0;
+		int pathlen;
+		unsigned ce_size;
+		struct tree_entry_list **subposns;
+		struct cache_entry **src;
+		int any_files = 0;
+		int any_dirs = 0;
+		char *cache_name;
+		int ce_stage;
+
+		/* Find the first name in the input. */
+
+		first = NULL;
+		cache_name = NULL;
+
+		/* Check the cache */
+		if (merge && *indpos < active_nr) {
+			/* This is a bit tricky: */
+			/* If the index has a subdirectory (with
+			 * contents) as the first name, it'll get a
+			 * filename like "foo/bar". But that's after
+			 * "foo", so the entry in trees will get
+			 * handled first, at which point we'll go into
+			 * "foo", and deal with "bar" from the index,
+			 * because the base will be "foo/". The only
+			 * way we can actually have "foo/bar" first of
+			 * all the things is if the trees don't
+			 * contain "foo" at all, in which case we'll
+			 * handle "foo/bar" without going into the
+			 * directory, but that's fine (and will return
+			 * an error anyway, with the added unknown
+			 * file case.
+			 */
+
+			cache_name = active_cache[*indpos]->name;
+			if (strlen(cache_name) > baselen &&
+			    !memcmp(cache_name, base, baselen)) {
+				cache_name += baselen;
+				first = cache_name;
+			} else {
+				cache_name = NULL;
+			}
+		}
+
+		if (first)
+			printf("index %s\n", first);
+
+		for (i = 0; i < len; i++) {
+			if (!posns[i] || posns[i] == &df_conflict_list)
+				continue;
+			printf("%d %s\n", i + 1, posns[i]->name);
+			if (!first || entcmp(first, firstdir,
+					     posns[i]->name,
+					     posns[i]->directory) > 0) {
+				first = posns[i]->name;
+				firstdir = posns[i]->directory;
+			}
+		}
+		/* No name means we're done
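The ordering entcmp() implements is easier to see in isolation: a directory is compared as if its name ended in '/', which is how its contents would sort in the index. Below is a standalone sketch of that comparison. The name entcmp_sketch and the const parameters are mine, and the '<', '>' and '&&' operators are what I assume the patch contains, since the archived text has dropped those characters.

```c
#include <assert.h>
#include <string.h>

/*
 * Compare two entry names, treating a directory as if its name ended
 * in '/'. Returns <0, 0 or >0 like strcmp(). The operators below are
 * assumed; the archived patch text has lost them.
 */
static int entcmp_sketch(const char *name1, int dir1,
			 const char *name2, int dir2)
{
	int len1, len2, len, ret;
	unsigned char c1, c2;

	assert(name1 && name2);
	len1 = (int)strlen(name1);
	len2 = (int)strlen(name2);
	len = len1 < len2 ? len1 : len2;
	ret = memcmp(name1, name2, len);
	if (ret)
		return ret;
	/* One name is a prefix of the other: look at the next byte. */
	c1 = name1[len];
	c2 = name2[len];
	if (!c1 && dir1)
		c1 = '/';
	if (!c2 && dir2)
		c2 = '/';
	ret = (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;
	if (c1 && c2 && !ret)
		ret = len1 - len2;
	return ret;
}
```

With this ordering a directory "foo" sorts after a file "foo.c" but before a file "foo0", which is where "foo/bar" falls in the index ('.' < '/' < '0' in ASCII). A plain strcmp() would put "foo" before both.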
[PATCH 4/4] Document the trivial merge rules for 3(+more ancestors)-way merges.
Signed-off-by: Daniel Barkalow
---

 Documentation/technical/trivial-merge.txt |   92 +
 1 files changed, 92 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/technical/trivial-merge.txt

7544be0a8eda7b796150729a7795c2639278da62
diff --git a/Documentation/technical/trivial-merge.txt b/Documentation/technical/trivial-merge.txt
new file mode 100644
--- /dev/null
+++ b/Documentation/technical/trivial-merge.txt
@@ -0,0 +1,92 @@
+Trivial merge rules
+===================
+
+This document describes the outcomes of the trivial merge logic in read-tree.
+
+One-way merge
+-------------
+
+This replaces the index with a different tree, keeping the stat info
+for entries that don't change, and allowing -u to make the minimum
+required changes to the working tree to have it match.
+
+   index   tree    result
+   -----------------------
+   *       (empty) (empty)
+   (empty) tree    tree
+   index+  tree    tree
+   index+  index   index+
+
+Two-way merge
+-------------
+
+
+
+Three-way merge
+---------------
+
+It is permitted for the index to lack an entry; this does not prevent
+any case from applying.
+
+If the index exists, it is an error for it not to match either the
+head or (if the merge is trivial) the result.
+
+If multiple cases apply, the one used is listed first.
+
+A result of "no merge" means that index is left in stage 0, ancest in
+stage 1, head in stage 2, and remote in stage 3 (if any of these are
+empty, no entry is left for that stage). Otherwise, the given entry is
+left in stage 0, and there are no other entries.
+
+A result of "no merge" is an error if the index is not empty and not
+up-to-date.
+
+*empty* means that the tree must not have a directory-file conflict
+ with the entry.
+
+For multiple ancestors or remotes, a '+' means that this case applies
+even if only one ancestor or remote fits; normally, all of the
+ancestors or remotes must be the same.
+
+case  ancest    head    remote  result
+
+1     (empty)+  (empty) (empty) (empty)
+2ALT  (empty)+  *empty* remote  remote
+2ALT  (empty)+  *empty* remote  remote
+2     (empty)^  (empty) remote  no merge
+3ALT  (empty)+  head    *empty* head
+3     (empty)^  head    (empty) no merge
+4     (empty)^  head    remote  no merge
+5ALT  *         head    head    head
+6     ancest^   (empty) (empty) no merge
+8ALT  ancest    (empty) ancest  (empty)
+7     ancest+   (empty) remote  no merge
+9     ancest+   head    (empty) no merge
+10ALT ancest    ancest  (empty) (empty)
+11    ancest+   head    remote  no merge
+16    anc1/anc2 anc1    anc2    no merge
+13    ancest+   head    ancest  head
+14    ancest+   ancest  remote  remote
+14ALT ancest+   ancest  remote  remote
+
+Only #2ALT and #3ALT use *empty*, because these are the only cases
+where there can be conflicts that didn't exist before. Note that we
+allow directory-file conflicts between things in different stages
+after the trivial merge.
+
+A possible alternative for #6 is (empty), which would make it like
+#1. This is not used, due to the likelihood that it arises due to
+moving the file to multiple different locations or moving and deleting
+it in different branches.
+
+Case #1 is included for completeness, and also in case we decide to
+put on '+' markings; any path that is never mentioned at all isn't
+handled.
+
+Note that #16 is when both #13 and #14 apply; in this case, we refuse
+the trivial merge, because we can't tell from this data which is
+right. This is a case of a reverted patch (in some direction, maybe
+multiple times), and the right answer depends on looking at crossings
+of history or common ancestors of the ancestors.
+
+The status as of Sep 5 is that multiple remotes are not supported
\ No newline at end of file
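To make the three-way table a bit more concrete, here is a rough sketch of the decision for the simplest situation only: one ancestor, one remote, no index entry and no directory-file conflicts. It is not the read-tree code. The names (merge_result, trivial_merge_sketch, same) are hypothetical, entries are plain strings standing in for blob ids with NULL meaning (empty), and every deletion case (#6 through #10) is reported as "no merge", although the table's 8ALT and 10ALT rows would resolve some of them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum merge_result { NO_MERGE, TAKE_EMPTY, TAKE_HEAD, TAKE_REMOTE };

/* Both entries must be present; compares the stand-in blob ids. */
static int same(const char *a, const char *b)
{
	assert(a && b);
	return strcmp(a, b) == 0;
}

/*
 * Single-ancestor subset of the trivial merge table.
 * NULL stands for an (empty) entry.
 */
static enum merge_result trivial_merge_sketch(const char *ancest,
					      const char *head,
					      const char *remote)
{
	if (!ancest) {
		if (!head && !remote)
			return TAKE_EMPTY;	/* #1 */
		if (!head)
			return TAKE_REMOTE;	/* #2ALT */
		if (!remote)
			return TAKE_HEAD;	/* #3ALT */
		return same(head, remote) ? TAKE_HEAD	/* #5ALT */
					  : NO_MERGE;	/* #4 */
	}
	if (!head || !remote)
		return NO_MERGE;	/* #6..#10, simplified in this sketch */
	if (same(head, remote))
		return TAKE_HEAD;	/* #5ALT */
	if (same(remote, ancest))
		return TAKE_HEAD;	/* #13: only head changed */
	if (same(head, ancest))
		return TAKE_REMOTE;	/* #14: only remote changed */
	return NO_MERGE;		/* #11: both changed differently */
}
```

Case #16 cannot show up here, since it needs two ancestors that disagree about whether #13 or #14 applies.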
Re: Moved files and merges
On Sun, 4 Sep 2005, Junio C Hamano wrote:

> Sam Ravnborg [EMAIL PROTECTED] writes:
>
> > If the problem is not fully understood it can be difficult to come
> > up with the proper solution.
> > And with the example above the problem should be really easy to
> > understand.
> > Then we have the tree as used by hpa with a few more mergers in it.
> > But the above is what was initially tried to do with the added
> > complexity of a few more renames etc.
>
> All true. Let's redraw that simplified scenario, and see if
> what I said still holds. It may be interesting to store my
> previous message and this one and run diff between them. I
> suspect that the main difference to come out would be the
> problem description part and the merge machinery part would not
> be all that different.

I'm not quite so convinced, because I think that the actual situation
is a bit more natural, and therefore our expectations at the end
should be closer to right with less attention to detail. But I think
the actual situation is more interesting, anyway, because it's more
likely to happen and we're more likely to be able to help.

> This is a simplified scenario of klibc vs klibc-kbuild HPA had
> trouble with, to help us think of a way to solve this
> interesting merge problem.
>
>         #1 --- #3 --- #5 --- #7
>        /      /             /
>     #0 --- #2 --- #4 --- #6
>
> There are two lines of developments. #0->#1 renames F to G and
> introduces K. #0->#2 keeps F as F and does not introduce K.
>
> At commit #3, #2 is merged into #1. The changes made to the
> file contents of F between #0 and #2 are appreciated, but we
> would also want to keep our decision to rename F to G and our
> new file K. So commit #3 has the resulting merge contents in G
> and has K, inherited from #1. This _might_ be different from
> what we traditionally consider a 'merge', but from the use case
> point of view it is a valid thing one would want to do.

I think this is actually quite a regular merge, and I think we should
be able to offer some assistance. The situation with K is normal: case
#3ALT. If someone introduces a file and there's no file or directory
with that name in other trees, we assume that the merge should include
it.

F/G is trickier, and I don't think we can actually do much about it
with the current structure of read-tree/merge-cache/etc, but,
theoretically, we should recognize that #0->#1 is a rename plus
content changes, and #0->#2 is content changes, so the total should be
the rename plus contents changes; I think we want to additionally
signal a conflict, because there's a reasonable chance that the rename
will interfere with the #0->#2 changes, and need intervention. Most
likely, this just means that we should not commit automatically, but
have the user test the result first.

For now, of course, we don't get renames at any point in the merging
procedure, so our code can't tell, and sees it as a big conflict that
the user has to deal with. But we can agree on what the result is if
the user includes all the changes from the other branch (and see the
situation you reported first as cherry-picking the content and leaving
the structural changes).

> Commit #4 is a continued development from #2; changes are made
> to F, and there is no K. Commit #5 similarly is a continued
> development from #3; its changes are made to G and K also has
> further changes.
>
> We are about to merge #6 into #5 to create #7. We should be
> able to take advantage of what the user did when the merge #3
> was made; namely, we should be able to infer that the line of
> development that flows #0 .. #3 .. #7 prefers to rename F to G,
> and also wants the newly introduced K. We should be able to
> tell it by looking at what the merge #3 did.

Again, K should be unexceptional, because we're keeping a file that
was added to one side but not the other. (In the other situation, it
still works; relative to the common ancestor, we're in #8ALT, since #5
doesn't have K, which was in #2 and #6; we see the rejection in a
merge as a removal, which is effectively the same.)

> Now, how can we use git to figure that out?

First off, it should handle K automatically, because we're still
including a file added by one side without interference from the other
side.

> First, given our current head (#5) and the other head we are
> about to merge (#6), we need a way to tell if we merged from
> them before (i.e. the existence of #3) and if so the latest of
> such merge (i.e. #3).
>
> The merge base between #5 and #6 is #2. We can look at commits
> between us (#5) and the merge base (#2), find a merge (#3),
> which has two parents. One of the parents is #2 which is
> reachable from #6, and the other is #1 which is not reachable
> from #6 but is reachable from #5. Can we say that this reliably
> tells us that #2 is on their side and #1 is on our side? Does
> the fact that #3 is the commit topologically closest to #5 tell
> us that #3 is the one we want to look deeper?

This is still handwaving, but
[PATCH 0/4] Support multiple ancestors in read-tree
Various messages have already described this series. There's still a
memory leak that should get resolved, but otherwise it should work. I'm
not entirely sure that all directory-file conflict cases are handled
properly, and some undefined cases behave differently. Also, I was a
bit careless with preparing the patches.

	-Daniel
*This .sig left intentionally blank*
[PATCH 3/2] Remove emu23, fix entry order
A few things to improve testing. I'll clean up the series as a whole once it's tested. This removes the emu23 tests; I think that the only DF conflict tests were in that set, however, so these should be fished out and added to something else. Signed-off-by: Daniel Barkalow [EMAIL PROTECTED] --- read-tree.c | 89 +++- t/t1005-read-tree-m-2way-emu23.sh | 422 - 2 files changed, 37 insertions(+), 474 deletions(-) delete mode 100755 t/t1005-read-tree-m-2way-emu23.sh 63092a4dfb2042e8fc21260b2f315b01e9163940 diff --git a/read-tree.c b/read-tree.c --- a/read-tree.c +++ b/read-tree.c @@ -9,7 +9,6 @@ #include tree.h static int merge = 0; -static int emu23 = 0; static int update = 0; static struct object_list *trees = NULL; @@ -19,19 +18,39 @@ typedef int (*merge_fn_t)(struct cache_e int df_conflicts_2, int df_conflicts_3); +static int entcmp(char *name1, int dir1, char *name2, int dir2) +{ + int len1 = strlen(name1); + int len2 = strlen(name2); + int len = len1 len2 ? len1 : len2; + int ret = memcmp(name1, name2, len); + unsigned char c1, c2; + if (ret) + return ret; + c1 = name1[len]; + c2 = name2[len]; + if (!c1 dir1) + c1 = '/'; + if (!c2 dir2) + c2 = '/'; + ret = (c1 c2) ? -1 : (c1 c2) ? 
1 : 0; + if (c1 c2 !ret) + ret = len1 - len2; + return ret; +} + static int unpack_trees_rec(struct tree_entry_list **posns, int len, const char *base, merge_fn_t fn, int file2, int file3, int *indpos) { int baselen = strlen(base); int src_size = len + 1; - if (emu23) - src_size++; if (src_size 4) src_size = 4; do { int i; char *first = NULL; + int firstdir = 0; int pathlen; unsigned ce_size; int dir2 = 0; @@ -73,11 +92,23 @@ static int unpack_trees_rec(struct tree_ } } + /* + if (first) + printf(%s\n, first); + */ + for (i = 0; i len; i++) { if (!posns[i]) continue; - if (!first || strcmp(first, posns[i]-name) 0) + /* + printf(%d %s\n, i + 1, posns[i]-name); + */ + if (!first || entcmp(first, firstdir, +posns[i]-name, +posns[i]-directory) 0) { first = posns[i]-name; + firstdir = posns[i]-directory; + } } /* No name means we're done */ if (!first) @@ -94,19 +125,6 @@ static int unpack_trees_rec(struct tree_ src_size); src[0] = active_cache[*indpos]; remove_cache_entry_at(*indpos); - if (emu23) { - // we need this in stage 2 as well as stage 0 - struct cache_entry *copy = - xmalloc(ce_size); - memcpy(copy, src[0], ce_size); - copy-ce_flags = - create_ce_flags(baselen + pathlen, 2); - if (dir2 || file2) { - die(cannot merge index and our head tree); - } - src[2] = copy; - subfile2 = 1; - } } for (i = 0; i len; i++) { @@ -125,8 +143,6 @@ static int unpack_trees_rec(struct tree_ } else { ce_stage = i + merge; } - if (emu23 ce_stage == 2) - ce_stage = 3; if (posns[i]-directory) { if (!subposns) { @@ -137,8 +153,6 @@ static int unpack_trees_rec(struct tree_ parse_tree(posns[i]-item.tree); subposns[i] = posns[i]-item.tree-entries; posns[i] = posns[i]-next; - if (emu23 ce_stage == 1) - dir2 = 1; if (ce_stage == 2) dir2 = 1; if (ce_stage == 3) @@ -168,19 +182,6 @@ static int
Re: Tool renames? was Re: First stab at glossary
On Thu, 1 Sep 2005, Junio C Hamano wrote:

> Tim Ottinger [EMAIL PROTECTED] writes:
>
> > git-update-cache for instance?
> > I am not sure which 'cache' commands need to be 'index' now.
>
> Logically you are right, but I suspect that may not fly well in
> practice. Too many of us have already got our fingers wired to
> type cache, and the glossary is there to describe both cache and
> index.

My vote's for changing the official names, but keeping symlinks for
the old names. As far as I know, there aren't any actual conflicts,
and we might as well have new users pick up the logical names.

I particularly think git merge would be really good to have.

	-Daniel
*This .sig left intentionally blank*
Re: Couple of read-tree questions
On Wed, 31 Aug 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > Is there any current use for read-tree with multiple trees without
> > -m or equivalent?
>
> I did not know it even allowed multiple trees without -m, but
> you are right. It does not seem to complain.
>
> I have never thought about using multiple trees without -m, and
> I do not remember hearing any plan nor purpose of using it to do
> something interesting from Linus. I think its allowing multiple
> trees without -m is simply a bug.

I guess it was probably that its behavior was obvious and didn't
require any extra code. It still follows entirely from one tree
without -m, but it might be worth prohibiting unless someone has a
reason to do it intentionally.

> > Why does --emu23 use I+H for stage 2, rather than just I? Wouldn't
> > this just reintroduce removed files?
>
> They are not removed files, at least in the original context.
>
> The original intention was that git was supposed to work without
> having _any_ files in the working tree. The reason why
> multi-tree read-tree has so many special cases that says must
> match *if* work file exists, is that not having a corresponding
> working file was supposed to be equivalent to having the file
> checked out *and* unmodified.

But they'd not only be missing from the working tree but also from the
(pre-read-tree) index, which should only happen, assuming the index
came from read-tree H, if they were subsequently removed from the
index. I'd understand treating index entries for files missing from
the working tree as up to date. (The thread you mention seems to say
that we accept entries being missing from the index as if they were
unchanged, but I don't see a good reason for this; you'd be dealing
with the full set in the index for the merge, even if you don't have a
populated working tree)

> I do not think anybody currently uses --emu23. I did it because
> it has a potential of making the two-tree fast forward (which is
> used in git checkout to switch between branches) easier to
> manage when the working tree is dirty than doing straight
> two-tree merge, but that is just a theoretical potential never
> tested in the field. Frankly, I do not mind, and I do not think
> anybody else minds, too much if you need to break or remove
> emu23 if that would make your code clean-up and redoing
> read-tree easier.

I should have asked sooner, then. :) There's a bunch of clutter to get
it to work that I can remove if it's not actually necessary.

	-Daniel
*This .sig left intentionally blank*
Re: Reworked read-tree.
On Thu, 1 Sep 2005, Junio C Hamano wrote:

> Daniel, I do not know what your current status is, but I think
> you need something like this.

Yup, I forgot to actually test that functionality.

> ---
> diff --git a/tree.c b/tree.c
> --- a/tree.c
> +++ b/tree.c
> @@ -224,10 +224,12 @@ struct tree *parse_tree_indirect(const u
>  		if (obj->type == tree_type)
>  			return (struct tree *) obj;
>  		else if (obj->type == commit_type)
> -			return ((struct commit *) obj)->tree;
> +			obj = (struct object *)(((struct commit *) obj)->tree);

			obj = &((struct commit *) obj)->tree->object;

Multiple sequential casts always bother me, and we do actually have a
field for this.

>  		else if (obj->type == tag_type)
> -			obj = ((struct tag *) obj)->tagged;
> +			obj = deref_tag(obj);

Shouldn't be necessary (once you've got the parse_object below); we're
already in a loop dereferencing things.

>  		else
>  			return NULL;
> +		if (!obj->parsed)
> +			parse_object(obj->sha1);
>  	} while (1);
>  }
Re: [PATCH 0/2] Reorganize read-tree
On Tue, 30 Aug 2005, Junio C Hamano wrote:

> Dan, I really really *REALLY* wanted to try this out in pu
> branch and even was about to rig some torture chamber for
> testing before applying the patch, but you got the shiny blue
> bat X-.

I'll send a replacement with the settings correct.

> A patch to SubmittingPatches, MUA specific help section for
> users of Pine 4.63 would be very much appreciated.

Ah, it looks like a recent version changed the default behavior to do
the right thing, and inverted the sense of the configuration option.
(Either that or Gentoo did it.) So you need to set the
no-strip-whitespace-before-send option, unless the option you have is
strip-whitespace-before-send, in which case you should avoid checking
it.

I don't actually have things set up for preparing patches from work,
although I can resend the patches I prepared earlier.

	-Daniel
*This .sig left intentionally blank*
[PATCH 1/2 (resend)] Object model additions for read-tree
Adds object_list_append() and a function to get the struct tree from
an ent.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 object.c |   11 +++++++++++
 object.h |    3 +++
 tree.c   |   19 +++++++++++++++++++
 tree.h   |    3 +++
 4 files changed, 36 insertions(+), 0 deletions(-)

49d33c385aa69d17c991300f73e77c6718a2b4a6
diff --git a/object.c b/object.c
--- a/object.c
+++ b/object.c
@@ -184,6 +184,17 @@ struct object_list *object_list_insert(s
 	return new_list;
 }
 
+void object_list_append(struct object *item,
+			struct object_list **list_p)
+{
+	while (*list_p) {
+		list_p = &((*list_p)->next);
+	}
+	*list_p = xmalloc(sizeof(struct object_list));
+	(*list_p)->next = NULL;
+	(*list_p)->item = item;
+}
+
 unsigned object_list_length(struct object_list *list)
 {
 	unsigned ret = 0;
diff --git a/object.h b/object.h
--- a/object.h
+++ b/object.h
@@ -41,6 +41,9 @@ void mark_reachable(struct object *obj, 
 struct object_list *object_list_insert(struct object *item,
 				       struct object_list **list_p);
 
+void object_list_append(struct object *item,
+			struct object_list **list_p);
+
 unsigned object_list_length(struct object_list *list);
 
 int object_list_contains(struct object_list *list, struct object *obj);
diff --git a/tree.c b/tree.c
--- a/tree.c
+++ b/tree.c
@@ -1,5 +1,7 @@
 #include "tree.h"
 #include "blob.h"
+#include "commit.h"
+#include "tag.h"
 #include "cache.h"
 #include <stdlib.h>
 
@@ -212,3 +214,20 @@ int parse_tree(struct tree *item)
 	free(buffer);
 	return ret;
 }
+
+struct tree *parse_tree_indirect(const unsigned char *sha1)
+{
+	struct object *obj = parse_object(sha1);
+	do {
+		if (!obj)
+			return NULL;
+		if (obj->type == tree_type)
+			return (struct tree *) obj;
+		else if (obj->type == commit_type)
+			return ((struct commit *) obj)->tree;
+		else if (obj->type == tag_type)
+			obj = ((struct tag *) obj)->tagged;
+		else
+			return NULL;
+	} while (1);
+}
diff --git a/tree.h b/tree.h
--- a/tree.h
+++ b/tree.h
@@ -32,4 +32,7 @@ int parse_tree_buffer(struct tree *item,
 
 int parse_tree(struct tree *tree);
 
+/* Parses and returns the tree in the given ent, chasing tags and commits. */
+struct tree *parse_tree_indirect(const unsigned char *sha1);
+
 #endif /* TREE_H */
Re: [PATCH 0/2] Reorganize read-tree
On Wed, 31 Aug 2005, Catalin Marinas wrote:

> Daniel Barkalow [EMAIL PROTECTED] wrote:
> > I got mostly done with this before Linus mentioned the possibility
> > of having multiple index entries in the same stage for a single
> > path. I finished it anyway, but I'm not sure that we won't want to
> > know which of the common ancestors contributed which, and, if some
> > of them don't have a path, we wouldn't be able to tell.
>
> I don't have time to look at the patch and I don't have a good
> knowledge of the GIT internals, so I will just ask. Does this patch
> change the call convention for git-merge-one-file-script? I have my
> own script for StGIT and I would need to know whether it is affected
> or not.

Nope, it only changes the trivial merge calling convention within
read-tree.c; I think it's plausible that we might like to add
information at some point, but the short-term goal is just to prevent
a few bad cases in trivial merges.
Re: [RFC] Stgit - patch history / add extra parents
On Tue, 30 Aug 2005, Catalin Marinas wrote:

> Back from holiday. Thanks to all who replied to this thread.
>
> On Tue, 2005-08-23 at 14:05 -0400, Daniel Barkalow wrote:
> > Having a useful diff isn't really a requirement for a parent; the diff in the case of a merge is going to be the total of everything that happened elsewhere. The point is to be able to reach some commits between which there are interesting diffs.
> >
> > This also depends on how exactly freeze is used; if you use it before committing a modification to the patch without rebasing, you get:
> >
> > old-top -> new-top
> >    ^         ^
> >     \       /
> >      bottom
> >
> > bottom to old-top is the old patch
> > bottom to new-top is the new patch
> > old-top to new-top is the change to the patch
> >
> > Then you want to keep new-top as a parent for rebasings until one of these is frozen. These links are not interesting to look at, but preserve the path to the old-top:new-top change, which is interesting.
>
> This was my initial StGIT implementation (up to version 0.3), only that there was no freeze command. Since I want an StGIT tree to be clean to the outside world, I wouldn't keep multiple parents for the visible top of a patch. As I understand from Junio's and Linus' e-mails (on the 23rd of August), there might be problems with merging the HEAD of an StGIT-managed tree if the above method is accessible via HEAD.

Right, you'd want a separate head which is what you ask people to merge; the rest is only visible to people who are working on preparing the patch. But you could keep both sets of stuff (sharing tree objects but not commits).

> > Ignoring the links to the corresponding bottoms, the development therefore looks like:
> >
> > local1 -> local2 -> merge -> local3 -> merge
> >   ^                   ^                  ^
> > mainline
> >
> > And this is how development is normally supposed to look. The trick is to only include a minimal number of merges.
>
> A merge occurs every time a patch is rebased. Anyway, having the bottoms in the graph (which is the main idea of StGIT) together with the old-top (or frozen state) parents makes the graph pretty complicated.

It should be possible to drop merges such that there's only one between any pair of local changes. That is, if you rebase at the end of the line above, it would get as parents local3 and the new bottom, not the last merge and the new bottom. The mainline changes only come in through the bottoms, so higher levels should look the same, but with the lower levels in the place of mainline.

-Daniel
*This .sig left intentionally blank*
[PATCH 0/2] Reorganize read-tree
I got mostly done with this before Linus mentioned the possibility of having multiple index entries in the same stage for a single path. I finished it anyway, but I'm not sure that we won't want to know which of the common ancestors contributed which, and, if some of them don't have a path, we wouldn't be able to tell.

The other advantages I see to this approach are:

 - it uses the more common parser of tree objects, moving toward having only one (diff-cache still uses read_tree(), however).

 - it doesn't need to do very complicated things with the index; the original read-tree does a bunch of stuff with an index with a gap in the middle containing obsolete entries.

 - it uses a much simpler method of finding directory/file conflicts, which is possible because the struct trees represent directories as well as files.

 - it deals with each path completely before going on to the next one, instead of first dealing with each input tree and then dealing with each path.

 - it removes a lot of intimate knowledge of the index structure from the program.

The general idea is that it figures out what trees you want, and then iterates through the entry lists together, recursing into directories, and calls the merge function with an array of the index entries (not yet added) for the path in each tree; the merge function adds the appropriate things to the index.

Note that this set doesn't include calling merge functions with multiple ancestors or remotes; that can be done when we've decided on whether my version of read-tree is worth using.

There are various potential refinements, plus removing a bunch of memory leaks, still to do, but I think this is sufficiently close to review.

(Refinements: it ought to have two indices in memory, the old and the new, and never modify the old and only append to the new, to simplify things further; it ought to use a sentinel value for the index entry to indicate that there is something in the tree to conflict with there being a file at the given path; the --emu23 logic could be clearer)

The first patch adds a few functions to the object library.

The second patch changes read-tree around; it is essentially a rewrite, except for the merge functions and main().

-Daniel
*This .sig left intentionally blank*
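For anyone trying to picture the "iterates through the entry lists together" step, here is a rough sketch of the idea, not the code from the patch. The names (struct name_list, first_name, next_path) are made up for illustration, the lists are plain sorted string arrays rather than struct tree_entry_list, and directories and recursion are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one tree's entry list, sorted with strcmp(). */
struct name_list {
	const char **names;
	size_t len;
	size_t pos;	/* index of the next unread entry */
};

/*
 * Return the smallest unread name across all lists, or NULL when every
 * list is exhausted.
 */
const char *first_name(const struct name_list *lists, size_t n)
{
	const char *first = NULL;
	size_t i;

	assert(lists != NULL || n == 0);
	for (i = 0; i < n; i++) {
		const char *name;

		if (lists[i].pos >= lists[i].len)
			continue;
		name = lists[i].names[lists[i].pos];
		if (!first || strcmp(first, name) > 0)
			first = name;
	}
	return first;
}

/*
 * Pick the next path to handle. present[i] is set to 1 if list i has an
 * entry for that path (and that list is advanced), 0 otherwise.
 * Returns NULL once all lists are done.
 */
const char *next_path(struct name_list *lists, size_t n, int *present)
{
	const char *first = first_name(lists, n);
	size_t i;

	if (!first)
		return NULL;
	for (i = 0; i < n; i++) {
		present[i] = lists[i].pos < lists[i].len &&
			     !strcmp(lists[i].names[lists[i].pos], first);
		if (present[i])
			lists[i].pos++;
	}
	return first;
}
```

A caller would loop on next_path() and hand the per-tree entries for each path to the merge function, so each path is dealt with completely before the next one is looked at.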
[PATCH 1/2] Object model additions for read-tree
Adds object_list_append() and a function to get the struct tree from an ent.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 object.c |   11 +++++++++++
 object.h |    3 +++
 tree.c   |   19 +++++++++++++++++++
 tree.h   |    3 +++
 4 files changed, 36 insertions(+), 0 deletions(-)

49d33c385aa69d17c991300f73e77c6718a2b4a6
diff --git a/object.c b/object.c
--- a/object.c
+++ b/object.c
@@ -184,6 +184,17 @@ struct object_list *object_list_insert(s
 	return new_list;
 }
 
+void object_list_append(struct object *item,
+			struct object_list **list_p)
+{
+	while (*list_p) {
+		list_p = &((*list_p)->next);
+	}
+	*list_p = xmalloc(sizeof(struct object_list));
+	(*list_p)->next = NULL;
+	(*list_p)->item = item;
+}
+
 unsigned object_list_length(struct object_list *list)
 {
 	unsigned ret = 0;
diff --git a/object.h b/object.h
--- a/object.h
+++ b/object.h
@@ -41,6 +41,9 @@ void mark_reachable(struct object *obj,
 struct object_list *object_list_insert(struct object *item,
 				       struct object_list **list_p);
 
+void object_list_append(struct object *item,
+			struct object_list **list_p);
+
 unsigned object_list_length(struct object_list *list);
 
 int object_list_contains(struct object_list *list, struct object *obj);
diff --git a/tree.c b/tree.c
--- a/tree.c
+++ b/tree.c
@@ -1,5 +1,7 @@
 #include "tree.h"
 #include "blob.h"
+#include "commit.h"
+#include "tag.h"
 #include "cache.h"
 #include <stdlib.h>
 
@@ -212,3 +214,20 @@ int parse_tree(struct tree *item)
 	free(buffer);
 	return ret;
 }
+
+struct tree *parse_tree_indirect(const unsigned char *sha1)
+{
+	struct object *obj = parse_object(sha1);
+	do {
+		if (!obj)
+			return NULL;
+		if (obj->type == tree_type)
+			return (struct tree *) obj;
+		else if (obj->type == commit_type)
+			return ((struct commit *) obj)->tree;
+		else if (obj->type == tag_type)
+			obj = ((struct tag *) obj)->tagged;
+		else
+			return NULL;
+	} while (1);
+}
diff --git a/tree.h b/tree.h
--- a/tree.h
+++ b/tree.h
@@ -32,4 +32,7 @@ int parse_tree_buffer(struct tree *item,
 
 int parse_tree(struct tree *tree);
 
+/* Parses and returns the tree in the given ent, chasing tags and commits. */
+struct tree *parse_tree_indirect(const unsigned char *sha1);
+
 #endif /* TREE_H */
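The append helper walks the list through a pointer to the "next" link, which is what lets it handle the empty list and the tail with the same code. Below is a standalone sketch of that technique. It is only an illustration: struct item_list and the function names are invented here, the item is an opaque pointer rather than a struct object, and plain malloc() with an error return stands in for xmalloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified, hypothetical stand-in for struct object_list. */
struct item_list {
	void *item;
	struct item_list *next;
};

/* Append item at the tail. Returns 0 on success, -1 if allocation fails. */
int item_list_append(void *item, struct item_list **list_p)
{
	struct item_list *node;

	assert(list_p != NULL);
	/* Follow the address of each "next" link until it holds NULL. */
	while (*list_p)
		list_p = &(*list_p)->next;

	node = malloc(sizeof(*node));
	if (!node)
		return -1;
	node->item = item;
	node->next = NULL;
	*list_p = node;	/* works for both an empty list and the tail */
	return 0;
}

/* Count the nodes in the list. */
unsigned item_list_length(const struct item_list *list)
{
	unsigned n = 0;

	for (; list; list = list->next)
		n++;
	return n;
}
```

Note that appending this way is linear in the list length, so it only suits short lists such as the handful of trees given on a command line.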
[PATCH] Change read-tree to merge before using the index.
Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 read-tree.c |  522 ++-
 1 files changed, 297 insertions(+), 225 deletions(-)

d0f45ad81db2e133c49c23bd09c5615da344bb5c
diff --git a/read-tree.c b/read-tree.c
--- a/read-tree.c
+++ b/read-tree.c
@@ -5,28 +5,280 @@
  */
 #include "cache.h"
 
-static int stage = 0;
+#include "object.h"
+#include "tree.h"
+
+static int merge = 0;
+static int emu23 = 0;
 static int update = 0;
 
-static int unpack_tree(unsigned char *sha1)
+static struct object_list *trees = NULL;
+
+typedef int (*merge_fn_t)(struct cache_entry **src,
+			  struct cache_entry **dest,
+			  int df_conflicts_2,
+			  int df_conflicts_3);
+
+static int unpack_trees_rec(struct tree_entry_list **posns, int len,
+			    const char *base, merge_fn_t fn,
+			    int file2, int file3, int *indpos)
+{
+	int baselen = strlen(base);
+	int src_size = len + 1;
+	if (emu23)
+		src_size++;
+	if (src_size > 4)
+		src_size = 4;
+	do {
+		int i;
+		char *first = NULL;
+		int pathlen;
+		unsigned ce_size;
+		int dir2 = 0;
+		int dir3 = 0;
+		int subfile2 = file2;
+		int subfile3 = file3;
+		struct tree_entry_list **subposns = NULL;
+		struct cache_entry **src = NULL;
+		char *cache_name = NULL;
+
+		/* Find the first name in the input. */
+
+		/* Check the cache */
+		if (merge && *indpos < active_nr) {
+			/* This is a bit tricky: */
+			/* If the index has a subdirectory (with
+			 * contents) as the first name, it'll get a
+			 * filename like "foo/bar". But that's after
+			 * "foo", so the entry in trees will get
+			 * handled first, at which point we'll go into
+			 * "foo", and deal with "bar" from the index,
+			 * because the base will be "foo/". The only
+			 * way we can actually have "foo/bar" first of
+			 * all the things is if the trees don't
+			 * contain "foo" at all, in which case we'll
+			 * handle "foo/bar" without going into the
+			 * directory, but that's fine (and will return
+			 * an error anyway, with the added unknown
+			 * file case.
+			 */
+
+			cache_name = active_cache[*indpos]->name;
+			if (strlen(cache_name) > baselen &&
+			    !memcmp(cache_name, base, baselen)) {
+				cache_name += baselen;
+				first = cache_name;
+			} else {
+				cache_name = NULL;
+			}
+		}
+
+		for (i = 0; i < len; i++) {
+			if (!posns[i])
+				continue;
+			if (!first || strcmp(first, posns[i]->name) > 0)
+				first = posns[i]->name;
+		}
+		/* No name means we're done */
+		if (!first)
+			return 0;
+
+		pathlen = strlen(first);
+		ce_size = cache_entry_size(baselen + pathlen);
+
+		if (cache_name && !strcmp(cache_name, first)) {
+			src = xmalloc(sizeof(struct cache_entry *) *
+				      src_size);
+			memset(src, 0,
+			       sizeof(struct cache_entry *) *
+			       src_size);
+			src[0] = active_cache[*indpos];
+			remove_cache_entry_at(*indpos);
+			if (emu23) {
+				// we need this in stage 2 as well as stage 0
+				struct cache_entry *copy =
+					xmalloc(ce_size);
+				memcpy(copy, src[0], ce_size);
+				copy->ce_flags =
+					create_ce_flags(baselen + pathlen, 2);
+				if (dir2 || file2) {
+					die("cannot merge index and our head tree");
+				}
+				src[2] = copy;
+				subfile2 = 1;
+			}
+		}
+
+		for (i = 0; i < len; i++) {
+			struct cache_entry *ce;
+			int ce_stage;
+			if (!posns[i] ||
+			    strcmp(first, posns[i]->name
Comments in read-tree about #nALT
I've gotten to the point of having all of the entries for a given path ready to put into the cache at the same time, and now I want to convert the merge functions to take their data directly, rather than in the cache, so that they can take extra entries for extra ancestors.

Part of threeway_merge, however, wants to search the rest of the cache for interfering entries in some cases, which would have to happen differently, because I won't have the cache completely filled out beforehand. I'm trying to figure out what the comments are talking about, and they seem to refer to a list of the possible cases. Is that list somewhere convenient?

-Daniel
*This .sig left intentionally blank*
Re: Merges without bases
On Sat, 27 Aug 2005, Martin Langhoff wrote:

> On 8/27/05, Daniel Barkalow [EMAIL PROTECTED] wrote:
> > The problem with both of these (and doing it in the build system) is that, when a project includes another project, you generally don't want whatever revision of the included project happens to be the latest; you want the revision of the included project that the revision of the including project you're looking at matches. That is, if App includes Lib, and
>
> Exactly - so you do it on a tag, or a commit date with cvs. With Arch, GIT and others that have a stable id for each commit, you can use that or the more user-friendly tags.

I'm thinking of cases like openssl, openssh, and libcrypto. Openssl and openssh both use libcrypto but not each other (looking at the ldd output, rather than packaging). However, it would be too much of a pain to work directly on libcrypto without working through some other package, because the library doesn't have its own applications. Furthermore, if you're doing much to libcrypto, you're likely doing it in the context of a particular application (say, for example, ssh needs a new cipher that isn't supported for SSL at the time).

You'd want to make simultaneous changes to libcrypto to implement the new feature and to openssh to use it; neither can be validated until the other is written, which means that you'll have both projects checked out and dirty (in the cache sense) at the same time, and be building the using project.

It would also be good to be able to check in this whole thing through the version control system, rather than partially through a change to the build system. That is, if I change the included libcrypto, commit it, and commit the including openssh, the system as a whole should understand that I want to change which commit of libcrypto gets used. Similarly, it would be good to merge changes into the libcrypto used by openssh with the same procedure used to merge changes to openssh itself, including supporting non-fast-forward when there's a local version in use.

(Of course, currently, libcrypto is strictly part of openssl, because it would be too much of a pain with the present version control to make it independent, and openssh depends on openssl, despite not even linking against -lssl, because openssl got libcrypto first.)

> The good thing here is that a makefile will know how to handle the situation if the external lib is hosted in Arch, in SVN, or Visual SourceSafe. If your external lib is only available as a tarball in a url, you can fetch that and uncompress it too. Arch configurations are _cute_ but useless in any but the most narrow cases.

Certainly, if it's sufficiently external to be in a different SCM it should be handled by the build system. Actually, if it's even nearly that external, it's probably going to be handled best by requiring people to go get it themselves. I find it odd that you say that the standard approach is to have the build system fetch a version of the included package; my experience is that projects either just report (or fail to report) a dependency on having the other package or they copy the project into their project. The former means they can't change it (which is generally good, unless it becomes necessary), while the latter causes update problems (cf. zlib).

I think that Arch configurations and the CVS equivalent are, in fact, useless, but that this is only due to the implementation being insufficiently clever, not due to the concept being inherently bad; I feel the same way about distributed development under Arch, which is really nice under git, so I have hope that something better could be done.

-Daniel
*This .sig left intentionally blank*
Re: Comments in read-tree about #nALT
On Sat, 27 Aug 2005, Linus Torvalds wrote:

> On Sat, 27 Aug 2005, Daniel Barkalow wrote:
> > What I missed was that the effect of causes_df_conflict is to give no merge for the entry, rather than giving an error overall. So I do need an equivalent.
>
> Daniel, I'm not 100% sure what you're trying to do, but one thing that might work out is to just have multiple stage 3 entries with the same pathname.
>
> We currently use 4 stages:
>  - stage 0 is resolved
>  - stage 1 is original
>  - stage 2 is one branch
>  - stage 3 is another branch
>
> But if we allowed duplicate entries per stage, I think we could easily just fold stage 2/3 into one stage, and just have n entries in stage 2. That would immediately mean that a three-way merge could be n way.
>
> The only rule would be that when you add an entry to stage 2, you must always add it after any previous entry that is already in stage 2. That should be easy.

It looks like stage 2 is currently special as the stage that's similar to the index/HEAD/working tree. However, I don't see any problem with n entries in stage 3, except that, if you have a non-maximal number of them for some reason, it'll be impossible to determine which came from which tree.

> In fact, this extension might even allow us to solve the multiple merge base problem: we could allow multiple entries in stage 1 too, ie one entry per merge base (and just collapse identical entries - there's no ordering involved in stage 1 entries).

That's actually the problem I was working on.

> So you could merge n trees with m bases, and all without really changing the current logic much at all. Maybe I'm missing something (like what you're trying to do in the first place), but this _seems_ doable.

I'd be afraid of confusing everything by removing the uniqueness invariant, although I guess not too much does anything with entries in stages other than 0. I probably just don't find the index as intuitive as you do, or as intuitive as I find the struct tree representation.

I'm working on arranging the code to look at each path in sequence, with the input trees as the inner loop, rather than with the loops in the other order; using parse_tree to parse the objects instead of read_tree; and doing trivial merges before putting things in the cache, rather than after. I'd been thinking that this would avoid a limit on the number of stages, since I hadn't considered whether multiple entries for the same path and stage could be allowed. I still think that my order is likely to be easier to understand and to leave read-tree relying less on tricky properties of the data structures, but I'll have to get it done before I can say that for sure.

-Daniel
*This .sig left intentionally blank*
Re: Merges without bases
On Thu, 25 Aug 2005, Junio C Hamano wrote:

> One thing that makes me reluctant to recommend this merging unrelated projects business is that I suspect that it makes things _much_ harder for the upstream project that is being merged, and should not be done without prior arrangement; Linus merged gitk after talking with paulus, so that was OK.

I'd still like to revive my idea of having projects overlaid on each other, where the commits in the project that absorbed the other project say, essentially, "also include this other commit, but any changes to those files belong to that branch, not this one". That way, Linus could have included gitk in git, but changes to it, even when done in a git working tree, would show up in commits that only include gitk. (git actually can handle this with the alternative index file mechanism that Linus mentioned in a different thread.)

Definitely post-1.0, of course.

> Suppose the above My Project is published, people send patches for core GIT part to it, and you as the maintainer of that My Project accept those patches. The users of My Project would be happy with the new features and wouldn't care less where their core GIT tools come from. But how would _I_ pull from that My Project, if I did not want to pull unrelated stuff in?

With the right info, the tools could be made to automatically generate suitable commits, because those files would be tracked by a separate index file and committed into a separate branch, which would then be reincluded (by reference) in the containing project.

-Daniel
*This .sig left intentionally blank*
Re: [RFC] Looking at multiple ancestors in merge
On Wed, 24 Aug 2005, Daniel Barkalow wrote:

> Of course, this is going to take a bit of work, because read-tree currently puts all of its arguments into the cache and then works on merging, and taking multiple ancestors requires putting them somewhere else, because they won't fit in the cache.

I've started this, and have gotten as far as having read-tree accept more than 3 trees and ignore everything but the last 3. Am I correct in assuming that if I break read-tree in any way, some test will fail?

-Daniel
*This .sig left intentionally blank*
Re: baffled again
On Wed, 24 Aug 2005, Junio C Hamano wrote:

> [EMAIL PROTECTED] writes:
> > So I have another anomaly in my GIT tree. A patch to back out a bogus change to arch/ia64/hp/sim/boot/bootloader.c in my release branch at commit 62d75f3753647656323b0365faa43fc1a8f7be97 appears to have been lost when I merged the release branch to the test branch at commit 0c3e091838f02c537ccab3b6e8180091080f7df2
>
> : siamese; git cat-file commit 0c3e091838f02c537ccab3b6e8180091080f7df2
> tree 61a407356d1e897e0badea552ce69e657cab6108
> parent 7ffacc1a2527c219b834fe226a7a55dc67ca3637
> parent a4cce10492358b33d33bb43f98284c80482037e8
> author Tony Luck [EMAIL PROTECTED] 1124808655 -0700
> committer Tony Luck [EMAIL PROTECTED] 1124808655 -0700
>
> Pull release into test branch
>
> So I pulled 7ffacc and a4cce1 from your repository and started digging from there. 7ffacc was the head of test branch back then, and a4cce1 was the head of release branch.
>
> I checked out 7ffacc in the repository and pulled a4cce1 into it, using the GIT with the optimum merge-base patch.
>
> : siamese; git pull . aegl-release
> Packing 0 objects
> Unpacking 0 objects
> * committish: a4cce10492358b33d33bb43f98284c80482037e8 refs/heads/aegl-release from .
> Trying to find the optimum merge base.
> Trying to merge a4cce10492358b33d33bb43f98284c80482037e8 into 7ffacc1a2527c219b834fe226a7a55dc67ca3637 using c1ffb910f7a4e1e79d462bb359067d97ad1a8a25.
> Simple merge failed, trying Automatic merge
> Auto-merging arch/ia64/sn/kernel/io_init.c.
> Committed merge db376974c0aebb9e99e5cd0bce21088c6a9d927c
>  arch/ia64/hp/sim/boot/boot_head.S |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> It is using c1ffb9 as the merge base. The problematic path in the three trees involved is:
>
> : siamese; git ls-tree -r aegl-test-7ffacc1a | grep arch/ia64/hp/sim/boot/bootloader.c
> 100644 blob a7bed60b69f9e8de9a49944e22d03fb388ae93c7 arch/ia64/hp/sim/boot/bootloader.c
> : siamese; git ls-tree -r aegl-release-a4cce1 | grep arch/ia64/hp/sim/boot/bootloader.c
> 100644 blob 51a7b7b4dd0e7c5720683a40637cdb79a31ec4c4 arch/ia64/hp/sim/boot/bootloader.c
> : siamese; git ls-tree -r aegl-c1ffb9 | grep arch/ia64/hp/sim/boot/bootloader.c
> 100644 blob 51a7b7b4dd0e7c5720683a40637cdb79a31ec4c4 arch/ia64/hp/sim/boot/bootloader.c
>
> So the file did not change between the merge base and release, and test had the change. merge-cache picked the one in the test release. Your guess in the other message hits the mark.
>
> I wonder what _other_ candidates these two commits have in common and what would have happened if they were used as the base instead?
>
> : siamese; git merge-base -a aegl-test-7ffacc1a aegl-release-a4cce1
> f6fdd7d9c273bb2a20ab467cb57067494f932fa3
> 3a931d4cca1b6dabe1085cc04e909575df9219ae
> c1ffb910f7a4e1e79d462bb359067d97ad1a8a25
>
> You can check what variant of the file each of these commits contain. What is happening is:
>
>  * the problematic patch 4aec0f is one before 3a931d. Among the three merge-base candidates, only 3a931d contains the wrongly patched version.
>
>  * the problematic change 4aec0f patch introduces is part of test branch, because it was pulled via release.
>
>  * the tip of release being merged into test has this patch reverted, and the file is exactly the same as before 4aec0f patch.
>
> So three-way trivial merge algorithm says, hey, the file did not change between common ancestor and release but it is different in test, so the one in the test branch must be the merge result.
>
> This does not have much to do with which common ancestor merge-base chooses. Sorry, I am not sure what is the right way to resolve this offhand.

If it picks 3a931d4cca1b6dabe1085cc04e909575df9219ae, it will determine that the file didn't change between that and test, and is different in release, so the one in release must be right.

I believe that the hint that something is going on is that different common ancestors give different trivial merges (as opposed to some giving failure and some giving the same result), and resolving it probably involves identifying that the paths from f6f... and c1f... to release don't keep the same blob through the middle, despite having the same ends.

-Daniel
*This .sig left intentionally blank*
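To make the base-dependence concrete, here is a small sketch of the trivial three-way rule applied to blob names. It is an illustration only, not read-tree's code: trivial_merge is a made-up name, blobs are compared as hex strings, and all the missing-file and directory/file cases are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Trivial three-way rule on blob names (hex strings). Returns the blob
 * to keep, or NULL when both sides changed in different ways and a real
 * content merge would be needed.
 */
const char *trivial_merge(const char *base, const char *ours,
			  const char *theirs)
{
	assert(base && ours && theirs);
	if (!strcmp(ours, theirs))
		return ours;	/* both sides agree */
	if (!strcmp(base, ours))
		return theirs;	/* only their side changed */
	if (!strcmp(base, theirs))
		return ours;	/* only our side changed */
	return NULL;		/* no trivial answer */
}
```

With test as "ours" (blob a7bed6...) and release as "theirs" (blob 51a7b7...), the base c1ffb9 (blob 51a7b7...) yields a7bed6..., the lost revert. Taking 3a931d as the base, and assuming from the description above that it holds the same blob as test, the same rule yields 51a7b7... instead.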
Re: Query about status of http-pull
On Wed, 24 Aug 2005, Martin Schlemmer wrote:

> Hi,
>
> Recently cogito again says that the rsync method will be deprecated in future (due to http-pull now supporting pack objects I suppose), but it seems to me that it still has other issues:
>
> -
> lycan linux-2.6 # git pull origin
> Fetching HEAD using http
> Getting pack list
> error: Couldn't get 0572e3da3ff5c3744b2f606ecf296d5f89a4bbdf: not separate or in any pack
> error: Tried http://www.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git/objects/05/72e3da3ff5c3744b2f606ecf296d5f89a4bbdf
> Cannot obtain needed object 0572e3da3ff5c3744b2f606ecf296d5f89a4bbdf
> while processing commit .

It looks like pack-c24bb5025e835a3d8733931ce7cc440f7bfbaaed isn't in the pack list. I suspect that updating this file should really be done by anything that creates pack files, because people forget to run the program that does it otherwise and then http has problems.

-Daniel
*This .sig left intentionally blank*
Re: baffled again
On Wed, 24 Aug 2005, Linus Torvalds wrote:

> Now, if the shared patch hadn't been a patch, but a shared _commit_, then the thing would have been unambiguous - the shared commit would have been the merge point, and the revert would have clearly undone that shared commit.

Actually, it was a shared commit (4aec0fb12267718c750475f3404337ad13caa8f5), which was (an ancestor of) a candidate merge point, but wasn't the one selected. Since a different one was chosen, it looked to the 3-way merge like a shared patch (since it ignores the untaken parent in the merges in the history). This should be fixable, but it'll require more cleverness in read-tree.

-Daniel
*This .sig left intentionally blank*
Re: [RFC] undo and redo
On Wed, 24 Aug 2005, Carl Baldwin wrote:

> This brings up a good point (indirectly). git prune would destroy the undo objects. I had thought of this but decided to ignore it for the time being.

If you made undo store the tree under refs somewhere, git prune would preserve it.

-Daniel
*This .sig left intentionally blank*
Re: [RFC] Looking at multiple ancestors in merge
On Wed, 24 Aug 2005, A Large Angry SCM wrote:

> Daniel Barkalow wrote:
> > I'm starting to work on letting the merging process see multiple ancestors, and I think it's messy enough that I should actually discuss it. Review of the issue:
> >
> > It is possible to lose reverts in cases when merging two commits with multiple ancestors, in the following pattern:
> >
> > (letters representing blobs at some filename, children to the right)
> >
> > a-b-b-a-?
> >  \ X /
> >   a-b-b
>
> [Lots of stuff deleted]
>
> There seems to be a lot of effort being put into auto-magically choosing the right merge in the presence of multiple possible merge bases. Unfortunately, most (all?) of the proposals are attempting to divine intent, and so, are guaranteed to be 100% wrong at least some of the time. Wouldn't it be better, instead, to detect that the current merge being attempted is ambiguous and require the user to specify the correct merge base? The alternative is a tool that appears to work all of the time but does the wrong thing some of the time.

My proposal is actually to detect when a merge is ambiguous. In order to determine that, however, you have to evaluate multiple potential outcomes and see if they are actually different. I'm working on an efficient way to do that. Then further work could look into eliminating possibilities when information about the history excludes them.

There were two issues in the case that Tony hit: it ignored a potential correct outcome for the merge, and it didn't ignore an outcome which could be demonstrated to be incorrect. The priority is to resolve the first, but things which improve the second or help with solutions to the second are worth understanding.

-Daniel
*This .sig left intentionally blank*
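As a rough picture of "evaluate multiple potential outcomes and see if they are actually different", the sketch below runs the trivial three-way rule once per candidate base and reports an ambiguity when two bases give different answers. This is a guess at the shape of the check, not the implementation being worked on: every name is invented, blobs are plain strings, and treating a base that gives no trivial answer as "no opinion" is an assumption made here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum merge_status { MERGE_CLEAN, MERGE_CONFLICT, MERGE_AMBIGUOUS };

/* Trivial three-way rule on blob names; NULL means no trivial answer. */
const char *trivial_merge(const char *base, const char *ours,
			  const char *theirs)
{
	if (!strcmp(ours, theirs))
		return ours;
	if (!strcmp(base, ours))
		return theirs;
	if (!strcmp(base, theirs))
		return ours;
	return NULL;
}

/*
 * Try every candidate base for one path.
 *  - MERGE_AMBIGUOUS: two bases gave different trivial results.
 *  - MERGE_CONFLICT:  no base gave a trivial result.
 *  - MERGE_CLEAN:     all bases that had an opinion agreed; *result is set.
 */
enum merge_status merge_with_bases(const char **bases, size_t nbases,
				   const char *ours, const char *theirs,
				   const char **result)
{
	const char *agreed = NULL;
	size_t i;

	assert(ours && theirs && result);
	for (i = 0; i < nbases; i++) {
		const char *r = trivial_merge(bases[i], ours, theirs);

		if (!r)
			continue;	/* assumption: this base has no opinion */
		if (agreed && strcmp(agreed, r))
			return MERGE_AMBIGUOUS;
		agreed = r;
	}
	if (!agreed)
		return MERGE_CONFLICT;
	*result = agreed;
	return MERGE_CLEAN;
}
```

In the pattern above, bases holding "a" and "b" with sides "a" and "b" would come back as ambiguous, which is the point where a user (or a smarter history-aware pass) would have to decide.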
Re: [RFC] Stgit - patch history / add extra parents
On Tue, 23 Aug 2005, Catalin Marinas wrote:

> > So the point is that there are things which are, in fact, parents, but we don't want to list them, because it's not desired information.
>
> What's the definition of a parent in GIT terms? What are the restrictions for a commit object to be a parent? Can a parent be an arbitrarily chosen commit?

Something is legitimate as a parent if someone took that commit and did something to it to get the new commit. The operation which caused the change is not specified. But you only want to include it if anyone cares about the parent.

(For example, I often start with a chunk of work that does multiple things and is committed; I take mainline and generate a series of commits from there. It would be legitimate to list my development commit as a parent of each of these, since I did actually take it and strip out the unrelated changes. This would be a bit confusing in the log, but would make merges between something based on the messy version and something based on the refined version work well. On the other hand, I don't want to report the existence of the messy version, so I don't include it.)

> An StGIT patch is represented by top and bottom commit objects. The bottom one is the same as the parent of the top commit. The patch is the diff between the top's tree id and the bottom's tree id.
>
> Jan's proposal is to allow a freeze command to save the current top hash and later be used as a second parent for the newly generated top. The problem I see with this approach is that (even for the internal view you described) the newly generated top will have two parents, new-bottom and old-top, but only the diff between new-top and new-bottom is meaningful. The diff between new-top and old-top (as a parent-child relation) wouldn't contain anything relevant to the patch but all the new changes to the base of the stack.

Having a useful diff isn't really a requirement for a parent; the diff in the case of a merge is going to be the total of everything that happened elsewhere. The point is to be able to reach some commits between which there are interesting diffs.

This also depends on how exactly freeze is used; if you use it before committing a modification to the patch without rebasing, you get:

old-top -> new-top
   ^         ^
    \       /
     bottom

bottom to old-top is the old patch
bottom to new-top is the new patch
old-top to new-top is the change to the patch

Then you want to keep new-top as a parent for rebasings until one of these is frozen. These links are not interesting to look at, but preserve the path to the old-top:new-top change, which is interesting.

Ignoring the links to the corresponding bottoms, the development therefore looks like:

local1 -> local2 -> merge -> local3 -> merge
  ^                   ^                  ^
mainline

And this is how development is normally supposed to look. The trick is to only include a minimal number of merges.

-Daniel
*This .sig left intentionally blank*
Re: [RFC] Removing deleted files after checkout
On Tue, 23 Aug 2005, Carl Baldwin wrote:

> Hello,
>
> I recently started using git to revision control the source for my web-page. I wrote a post-update hook to checkout the files when I push to the 'live' repository. In this particular context I decided that it was important to me to remove deleted files after checking out the new HEAD. I accomplished this by running git-ls-files before and after the checkout.
>
> Is there a better way? Could there be some way built into git to easily find out what files disappear when replacing the current index with one from a new tree? Is there already?
>
> The behavior of git should NOT change to delete these files but I would argue that some way should exist to query what files disappeared if removing them is desired.

If you don't use -f, git-checkout-script removes deleted files. Using -f tells it to ignore the old index, which means that it can't tell the difference between removed files and files that weren't tracked at all.

-Daniel
*This .sig left intentionally blank*
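The before/after comparison comes down to a difference of two sorted path lists: anything in the old list that is missing from the new one was removed. A minimal sketch of that comparison follows, assuming both lists are sorted with strcmp() and contain no duplicates; removed_paths is a hypothetical helper, not something git provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Collect the names present in old_names but absent from new_names.
 * Both inputs must be sorted with strcmp(). At most max pointers are
 * stored in removed; the return value is the total number of removed
 * names, which may be larger than max.
 */
size_t removed_paths(const char **old_names, size_t nold,
		     const char **new_names, size_t nnew,
		     const char **removed, size_t max)
{
	size_t i = 0, j = 0, n = 0;

	assert(removed != NULL || max == 0);
	while (i < nold) {
		/* Once the new list runs out, every old name is a removal. */
		int cmp = j < nnew ? strcmp(old_names[i], new_names[j]) : -1;

		if (cmp < 0) {
			if (n < max)
				removed[n] = old_names[i];
			n++;
			i++;
		} else if (cmp > 0) {
			j++;	/* name only in the new list: an addition */
		} else {
			i++;	/* name in both lists: kept */
			j++;
		}
	}
	return n;
}
```

This only works when the old list is still available, which is the same reason the old index is needed to tell removed files apart from files that were never tracked.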
Re: [RFC] Removing deleted files after checkout
On Tue, 23 Aug 2005, Carl Baldwin wrote:

> On Tue, Aug 23, 2005 at 03:43:56PM -0400, Daniel Barkalow wrote:
> > On Tue, 23 Aug 2005, Carl Baldwin wrote:
> >
> > > Hello,
> > >
> > > I recently started using git to revision control the source for my web-page. I wrote a post-update hook to checkout the files when I push to the 'live' repository. In this particular context I decided that it was important to me to remove deleted files after checking out the new HEAD.
> > >
> > > I accomplished this by running git-ls-files before and after the checkout.
> > >
> > > Is there a better way? Could there be some way built into git to easily find out what files disappear when replacing the current index with one from a new tree? Is there already?
> > >
> > > The behavior of git should NOT change to delete these files but I would argue that some way should exist to query what files disappeared if removing them is desired.
> >
> > If you don't use -f, git-checkout-script removes deleted files. Using -f tells it to ignore the old index, which means that it can't tell the difference between removed files and files that weren't tracked at all.
>
> Maybe I'm doing something wrong. This does not happen for me. I tried a simple test with git v0.99.4...
>
> cd
> mkdir test-git
> cd test-git/
> echo testing | cg-init
> echo contents > file
> git-add-script file
> git-commit-script -m 'testing'

[point 1]

> cd ..
> cg-clone test-git/.git/ test-git2
> cd test-git2
> cg-rm file
> git-commit-script -m 'testing'
> ls
> cg-push
> cd ../test-git
> git-checkout-script

Ah, okay. I think push and checkout don't play that well together; push changes the ref, which checkout uses to determine what it expects for the old contents, and then it's confused.

What you probably actually want is:

cd ../test-git
git pull ../test-git2

which will correctly identify before and after, and remove any files that were removed.

Alternatively, you could do, at point 1:

cp .git/refs/master .git/refs/deployed
git checkout deployed

Then, after the push and cd:

git checkout master
cp .git/refs/master .git/refs/deployed
git checkout deployed

because checkout does remove files if you switch from a branch with them (e.g., deployed) to one without them (master, after the push).

	-Daniel
*This .sig left intentionally blank*
Re: [RFC] Removing deleted files after checkout
On Tue, 23 Aug 2005, Carl Baldwin wrote:

> The point is to push and use a post-update hook to do the checkout. So, this won't be possible.

You could have the remote repository be something like ~/git/website.git, and have a hook which does:

cd ~/www; git pull ~/git/website.git/

That is, have three things: the directory where you work on stuff, the central storage location, and the area that the web server serves, and have the storage location automatically update the web server area. That's what I do with my website section that's still in CVS, and the general concept is good (and means that the real repository isn't somewhere the web server is poking around).

> > which will correctly identify before and after, and remove any files that were removed.
> >
> > Alternatively, you could do, at point 1:
> >
> > cp .git/refs/master .git/refs/deployed
> > git checkout deployed
>
> How to get a post-update hook to do this? I suppose an update script could set this up for the post-update to later use.

If you have deployed checked out, and you push to master in the same repository, having the hook do

git resolve deployed master auto-update

should work.

	-Daniel
*This .sig left intentionally blank*
Re: [RFC] Stgit - patch history / add extra parents
On Tue, 23 Aug 2005, Jan Veldeman wrote:

> Daniel Barkalow wrote:
>
> > On Tue, 23 Aug 2005, Catalin Marinas wrote:
> >
> > Something is legitimate as a parent if someone took that commit and did something to it to get the new commit. The operation which caused the change is not specified. But you only want to include it if anyone cares about the parent.
>
> This is indeed what I thought a parent should be used for. As an addition, I'll try to explain why I would sometimes want to care about some parents:
>
> I want to track a mainline tree, but have quite a few changes, which shouldn't be committed to the mainline immediately (let's call it my development tree). This is why I would use stgit. But I would also want to collaborate with other developers on this development tree, so I sometimes want to make updates of this development tree available to the others. This is where current stgit falls short. To easily share this development tree, I want some history (not all, only the ones I choose) of this development tree included, so that the other developers can easily follow my development.
>
> The parents which should be visible to the outside will always be versions of my development tree which I have previously pushed out. My way of working would become:
> * make changes, all over the place, using stgit
> * still make changes (none of these gets tracked, intermittent versions are lost)
> * having a good day: changes look good, I want to push this out:
>   * push my tree out
>   * stgit-free (which makes the pushed out commits the new parents of my stgit patches)
> * restart from top

I'm not sure how applicable to this situation stgit really is; I see stgit as optimized for the case of a patch set which is basically done, where you want to keep it applicable to the mainline as the mainline advances.

For your application, I'd just have a git branch full of various stuff, and then generate clean commits by branching mainline, diffing development against it, cutting the diff down to just what I want to push, and applying that. Then the clean patch goes into stgit.

> [...]
>
> > This also depends on how exactly freeze is used; if you use it before committing a modification to the patch without rebasing, you get:
> >
> > old-top -> new-top
> >    ^          ^
> >     \        /
> >       bottom
> >
> > bottom to old-top is the old patch
> > bottom to new-top is the new patch
> > old-top to new-top is the change to the patch
> >
> > Then you want to keep new-top as a parent for rebasings until one of these is frozen. These links are not interesting to look at, but preserve the path to the old-top:new-top change, which is interesting.
>
> my proposal does something like this, but a little more: not only does it keep track of the link between old-top and new-top, it also keeps track of the links between old-patch-in-between and new-patch-in-between. (This makes sense when the top is being removed or reordered)

I was thinking of this as being the top and bottom commits for a single tracked patch, not as a whole series. I think patches lower wouldn't be affected, and patches higher would see this as a rebase.

	-Daniel
*This .sig left intentionally blank*
Re: [RFC] Removing deleted files after checkout
On Tue, 23 Aug 2005, Carl Baldwin wrote:

> The thing that this doesn't do is remove empty directories when the last file is deleted. I once expressed the opinion in a previous thread that directories should be added and removed explicitly in git. (Thus allowing an empty directory to be added). If this were to happen then this case would get handled correctly. However, if git stays with the status quo then I think that git-read-tree -u should be changed to remove the empty directory. This would make it consistent.

I think that git-read-tree -u ought to remove a directory if it removes the last file (or directory) in it.

	-Daniel
*This .sig left intentionally blank*
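The rule suggested above (remove a directory once the last tracked file or directory in it is gone) can be sketched as a pure function over path lists. Everything here is a hypothetical illustration, not git-read-tree's code: `prunable_prefix` and `dir_in_use` are invented names, paths are assumed to be relative with '/' separators, and untracked files are ignored (a real implementation would also have to cope with rmdir failing on a directory that still holds untracked files).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if any path in `paths` lives underneath the directory
 * named by the first `dir_len` bytes of `dir`. */
static int dir_in_use(const char *dir, size_t dir_len,
                      const char **paths, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strncmp(paths[i], dir, dir_len) == 0 &&
            paths[i][dir_len] == '/')
            return 1;
    }
    return 0;
}

/*
 * `removed` is the path that was just deleted; `remaining` holds the
 * tracked paths that are still present. Returns the length of the
 * outermost leading directory of `removed` that no remaining path
 * uses, or 0 if no directory became empty.
 */
size_t prunable_prefix(const char *removed,
                       const char **remaining, size_t n)
{
    size_t len, result = 0;

    assert(removed != NULL);
    len = strlen(removed);
    while (len > 0) {
        /* Step back to the previous '/' to find the parent directory. */
        while (len > 0 && removed[len - 1] != '/')
            len--;
        if (len == 0)
            break;              /* reached the top level */
        len--;                  /* removed[0..len) names a directory */
        if (dir_in_use(removed, len, remaining, n))
            break;              /* still needed: stop walking up */
        result = len;
    }
    return result;
}
```

For example, removing `img/icons/a.png` while `img/logo.png` is still tracked yields 9 (the length of `img/icons`), so only the inner directory would be pruned.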
Re: Automatic merge failed, fix up by hand
On Tue, 23 Aug 2005, Junio C Hamano wrote:

> Only lightly tested, in the sense that I did only this one case and nothing else. For a large repository and with complex merges, merge-base -a _might_ end up reporting many candidates, in which case the pre-merge step to figure out the best merge base may turn out to be disastrously slow. I dunno.

I think it's the right thing to do for now (and what I was going to suggest), and if people find it too slow, we can consider teaching read-tree to take multiple common ancestors and use any of them that gives clear result on a per-file basis.

On the other hand, Tony might have hit a bad case with an ill-chosen common ancestor for a patch/revert sequence, and we probably want to look into that if we've got some history that demonstrates the problem.

I think that, if there are two common ancestors, one of which has applied a patch and one of which hasn't, and on one side of the merge it gets reverted, we should get the revert, but we'll only get it if we choose the ancestor where it was applied. (Letters are versions of the file, with 'b' being the bad patch; the second column is the two choices for common ancestor)

  a-b-a-?
 / X   /
a-b-b-b

Of course, you could have the two lines exactly flipped for a different file in the same commits, or for a different hunk in the same file, and there would be no single choice that doesn't lose the revert. The really right thing to do is identify that there is a b->a transition that is not a trivial merge and that is not beyond a common ancestor, but that's hard to determine easily and with sufficient granularity to catch everything.

I still someday want to do a version of diff/merge for git that could select common ancestors on a per-hunk basis and identify block moves and avoid giving confusing (but marginally shorter) diffs, but that's a major undertaking that I don't have time for right now.

	-Daniel
*This .sig left intentionally blank*
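The common-ancestor example in the message above can be checked with a tiny model of a trivial three-way merge, where each version of the file is a single letter as in the diagram. This is a sketch for illustration only: `merge_version` is a hypothetical name, and whole-file comparison stands in for the real content merge.

```c
#include <assert.h>

/*
 * Trivial three-way merge of one file. Returns the merged version, or
 * '!' when both sides changed the file in different ways (a conflict
 * that would need a real content merge).
 */
char merge_version(char base, char ours, char theirs)
{
    if (ours == theirs)
        return ours;            /* both sides agree */
    if (ours == base)
        return theirs;          /* only their side changed */
    if (theirs == base)
        return ours;            /* only our side changed */
    return '!';
}

/* The two ancestor choices from the diagram: our side ends at 'a'
 * (patch reverted), their side ends at 'b' (patch still applied). */
void check_diagram_cases(void)
{
    /* Ancestor with the patch applied: the revert is kept. */
    assert(merge_version('b', 'a', 'b') == 'a');
    /* Ancestor without the patch: the revert is silently lost. */
    assert(merge_version('a', 'a', 'b') == 'b');
}
```

Both ancestor choices produce a clean result, yet the results differ, which is why picking "any ancestor that gives a clear result" is not enough on its own in this case.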
Re: [RFC] Stgit - patch history / add extra parents
On Sun, 21 Aug 2005, Jan Veldeman wrote:

> Catalin Marinas wrote:
>
> > > So for example, you only tag (freeze) the history when exporting the patches. When an error is being reported on that version, it's easy to view it and also view the progress that has already been made on those patches.
> >
> > I agree that it is a useful feature to be able to individually tag the patches. The problem is how to do this best. Your approach looks to me like it's not following the GIT DAG structure recommendation. Maybe the GIT designers could further comment on this but a commit object with multiple parents should be a result of a merge operation. A commit with a single parent should represent a transition of the tree from one state to another. With the freeze command you proposed, a commit with multiple parents is no longer a result of a merge operation, but just a convenience for tracking the patch history with gitk.
>
> My interpretation of parents is broader than only merges, and reading the README file, I believe it is also the intention to do so (snippet from README file):
>
>   A commit object ties such directory hierarchies together into a DAG of revisions - each commit is associated with exactly one tree (the directory hierarchy at the time of the commit). In addition, a commit refers to one or more parent commit objects that describe the history of how we arrived at that directory hierarchy.

One factor not mentioned there is that, as things move upstream, we often want to discard a lot of history; if someone commits constantly to deal with editor malfunction or something, we don't really want to take all of this junk into the project history when it is cleaned up and accepted. So the point is that there are things which are, in fact, parents, but we don't want to list them, because it's not desired information.

Probably the right thing is to have two views of the stack: the internal view, showing what actually happened, and the external view, showing what would have happened if the developers had done everything right the first time. When you make changes to the series, this adds to the internal view and entirely replaces the external view.

I think that users will also want to discard the commits from the stack before rebasing in favor of the commits after, because (a) rebasing isn't all that interesting, especially if there's minimal merging, and (b) otherwise you'd get a ton of boring commits that obscure the interesting ones.

I think that the best rule would be that, when you modify a patch, the previous version is the new version's parent, and when you rebase a series, you include as a parent any parent of the input that isn't also in the input (but never include the input itself as a parent of the output; the point of rebasing is to pretend that it was the newer mainline that you modified).

This should mean that the internal history of a patch consists of the present version, based on each version that was replaced due to changing the patch rather than rebasing it.

Of course, there's an interesting situation with the commits earlier in a series from a patch that was changed not being ancestors of the newer versions of those patches (because they weren't interesting in the development of those patches) but accessible as the commits that an interesting patch was based on. A possible solution is just to consider the revision of any patch a significant event in the history of the whole stack, causing all of the patches to get a new retained version.

	-Daniel
*This .sig left intentionally blank*
Re: Subject: [PATCH] Updates to glossary
On Thu, 18 Aug 2005, Johannes Schindelin wrote:

>  tree object::
> -	An object containing a list of blob and/or tree objects.
> -	(A tree usually corresponds to a directory without
> -	subdirectories).
> +	An object containing a list of file names and modes along with refs
> +	to the associated blob and/or tree objects. A tree object is
> +	equivalent to a directory.

Actually, it contains object names, not refs, to be completely precise. (refs would imply an additional indirection.)

	-Daniel
*This .sig left intentionally blank*
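To make the distinction concrete, here is a minimal C sketch of what one tree entry carries: a mode, a file name, and the 20-byte object name itself, with no ref in between. The struct and function names are hypothetical and chosen only for illustration; they are not git's own declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of one tree entry (illustration only). */
struct tree_entry {
    unsigned int mode;          /* e.g. 0100644 for a regular file */
    const char *name;           /* file name within this directory */
    unsigned char sha1[20];     /* object name of the blob or tree */
};

/* Render a 20-byte object name as 40 hex digits plus a NUL. */
void object_name_to_hex(const unsigned char sha1[20], char out[41])
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    assert(sha1 != NULL && out != NULL);
    for (i = 0; i < 20; i++) {
        out[2 * i] = digits[sha1[i] >> 4];        /* high nibble */
        out[2 * i + 1] = digits[sha1[i] & 0x0f];  /* low nibble */
    }
    out[40] = '\0';
}
```

A ref, by contrast, would be a name such as a file under $GIT_DIR/refs/ whose contents are the hex form of an object name, hence the extra indirection mentioned above.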
Re: First stab at glossary
On Wed, 17 Aug 2005, Johannes Schindelin wrote:

> Hi,
>
> long, long time. Here's my first stab at the glossary, attached the alphabetically sorted, asciidoc marked up txt file (Comments? Suggestions? Pizzas?):
>
> object::
> 	The unit of storage in GIT. It is uniquely identified by the SHA1 of its contents. Consequently, an object can not be changed.
>
> SHA1::
> 	A 20-byte sequence (or 41-byte file containing the hex representation and a newline). It is calculated from the contents of an object by the Secure Hash Algorithm 1.

It's also often a 40-character string (with whatever termination) in places like commit objects, tag objects, command-line arguments, listings, and so forth.

> object database::
> 	Stores a set of objects, and an individual object is identified by its SHA1 (its ref). The objects are either stored as single files, or live inside of packs.
>
> object name::
> 	Synonym for SHA1.

Have we killed the use of the third term "hash" for this? I'd say that "object name" is the standard term, and "SHA1" is a nickname, if only because "object name" is more descriptive of the particular use of the term.

> blob object::
> 	Untyped object, i.e. the contents of a file.

This "i.e." should be "e.g.", since symlink targets are also stored as blobs, and any other bulk data stored by itself would be. (IIRC, Junio has a tagged blob to hold his public key, for example)

> tree object::
> 	An object containing a list of blob and/or tree objects. (A tree usually corresponds to a directory without subdirectories).
>
> tree::
> 	Either a working tree, or a tree object together with the dependent blob and tree objects (i.e. a stored representation of a working tree).
>
> cache::
> 	A collection of files whose contents are stored as objects. The cache is a stored version of your working tree. Well, can also contain a second, and even a third version of a working tree, which are used when merging.
>
> cache entry::
> 	The information regarding a particular file, stored in the index. A cache entry can be unmerged, if a merge was started, but not yet finished (i.e. if the cache contains multiple versions of that file).
>
> index::
> 	Contains information about the cache contents, in particular timestamps and mode flags (stat information) for the files stored in the cache. An unmerged index is an index which contains unmerged cache entries.

I think we might want to entirely kill the "cache" term, and talk only about the "index" and "index entries". Of course, a bunch of the code will have to be renamed to make this completely successful, but we could change the glossary and documentation, and mention "cache" and "cache entry" as old names for "index" and "index entry" respectively.

> working tree::
> 	The set of files and directories currently being worked on. Think ls -laR

This is where the data is actually in the filesystem, and you can edit and compile it (as opposed to a tree object or the index, which semantically have the same contents, but aren't presented in the filesystem that way).

> directory::
> 	The list you get with ls :-)
>
> checkout::
> 	The action of updating the working tree to a revision which was stored in the object database.

Move after revision?

> revision::
> 	A particular state of files and directories which was stored in the object database. It is referenced by a commit object.
>
> commit::
> 	The action of storing the current state of the cache in the object database. The result is a revision.
>
> commit object::
> 	An object which contains the information about a particular revision, such as parents, committer, author, date and the tree object which corresponds to the top directory of the stored revision.

Move parent around here.

> changeset::
> 	BitKeeper/cvsps speak for commit. Since git does not store changes, but states, it really does not make sense to use the term changesets with git.
>
> ent::
> 	Favorite synonym to tree-ish by some total geeks.

Move after tree-ish.

> head::
> 	The top of a branch. It contains a ref to the corresponding commit object.
>
> branch::
> 	A non-cyclical graph of revisions, i.e. the complete history of a particular revision, which does not (yet) have children, which is called the branch head. The branch heads are stored in $GIT_DIR/refs/heads/.

A branch head might have children, if they're in another branch. (E.g., I pull mainline, make a new branch based on it, and commit a change; the head of mainline is still a branch head, even though it's the parent of my new commit, because my new commit isn't in mainline.)

> ref::
> 	A 40-byte hex representation of a SHA1 pointing to a particular object. These are stored in $GIT_DIR/refs/.
>
> head ref::
> 	A ref pointing to a head. Often, this
Re: Git 1.0 Synopis (Draft v4)
On Tue, 16 Aug 2005, Johannes Schindelin wrote:

> Hi,
>
> On Tue, 16 Aug 2005, Junio C Hamano wrote:
>
> > - Are all the files in Documentation/ reachable from git(7) or otherwise made into a standalone document using asciidoc by the Makefile? I haven't looked into documentation generation myself (I use only the text files as they are); help to update the Makefile by somebody handy with asciidoc suite is greatly appreciated here. Volunteers?
>
> The attached script reveals:
>
> git-unpack-objects.txt is not reachable from git.txt
> git-cvsimport-script.txt is not reachable from git.txt
> git-send-email-script.txt is not reachable from git.txt
> git-rename-script.txt is not reachable from git.txt
> tutorial.txt is not reachable from git.txt
> git-show-index.txt is not reachable from git.txt
> cvs-migration.txt is not reachable from git.txt
> diffcore.txt is not reachable from git.txt
> git-ls-remote-script.txt is not reachable from git.txt
> git-apply.txt is not reachable from git.txt
> git-diff-stages.txt is not reachable from git.txt
> pack-protocol.txt is not reachable from git.txt

The ones that don't start with "git" probably don't belong in the same set; perhaps there should be a "technical" (or something similar but shorter) subdirectory for developer documentation instead of user documentation? (And tutorial and cvs-migration can move to howto)

	-Daniel
*This .sig left intentionally blank*
Re: [PATCH] Alternate object pool mechanism updates.
On Tue, 16 Aug 2005, Linus Torvalds wrote:

> Finally, I have to say that that info directory is confusing. Namely, there's two of them - the git info and the object info directories are totally different directories - maybe logical, but to me it smells like info is here a code-name for misc files that don't make sense anywhere else.
>
> What this all is leading up to is that I think we'd be better off with a totally new git config file, in .git/config, and we'd have all the startup configuration there. Including things like alternate object directories, perhaps standard preferences for that particular repo, and things like the grafts thing.
>
> Wouldn't that be nice?

I'd originally proposed the .git/info directory because I keep multiple working trees for the same repository, by having symlinks for .git/objects and .git/refs, and I could also get other per-repository things to be shared properly without knowing exactly what they are if they're in a subdirectory of .git that could be a symlink. This would mean that a .git/config would be per-working-tree, like .git/index or .git/HEAD, not per-repository like .git/info/config. Of course, the core didn't have anything to go in .git/info at the time, so it didn't really get tacked down.

(I find it convenient to have mainline and my latest work both checked out for reference while I'm generating a series of commits for a patch set, and I don't want three different repositories which could be out of sync; this also keeps the repository safely out of pwd, since I have the actual repositories as ~/git/{project}.git/)

	-Daniel
*This .sig left intentionally blank*
Re: [RFC PATCH] Add support for figuring out where in the git archive we are
On Tue, 16 Aug 2005, Linus Torvalds wrote:

> If you use the GIT_DIR environment variable approach, it assumes that all filenames you give it are absolute and acts the way it always did before.
>
> Comments? Like? Dislike?

I'm all in favor, at least in the general case. I suspect there'll be some things where we have to discuss the behavior, but we can argue that when it comes up.

I think, slightly before 1.0, we should sort the library functions into a new set of object files with matching header files, because setup is not really distinctive, and there's at least one duplicate implementation (the ssh subprocess code in your connect.c is the same as my rsh.c in what it does, although yours uses two pipes and mine uses a socket).

	-Daniel
*This .sig left intentionally blank*
Re: Cloning speed comparison
On Sat, 13 Aug 2005, Petr Baudis wrote:

> Hello,
>
> I've wondered how slow the protocols other than rsync are, and the (well, a bit dubious; especially wrt. caching on the remote side) results are:
>
> 	git	clone-pack:ssh	25s
> 	git	rsync		27s
> 	git	http-pull	47s
> 	git	dumb-http	54s
> 	git	ssh-pull	660s
>
> 	cogito	clone-pack:ssh	35s (!)
> 	cogito	rsync		140s
> 	cogito	ssh-pull	480s
> 	cogito	http-pull	extrapolated to about an hour!

I should be able to get http-pull down to the neighborhood of (current) ssh-pull; http-pull is that slow (when the source repository isn't packed) because it's entirely sequential, rather than overlapping requests like ssh-pull now does. I should also be able to get ssh-pull down to the area of clone-pack, but that's lower-priority, since there's clone-pack.

(I've written an untested patch for local-pull, which I'll be testing, cleaning, and submitting tonight, assuming my newly-arrived monitor actually works)

> PS: With the latest git version as of time of writing this:
>
> $ time cg-clone git+ssh://[EMAIL PROTECTED]/home/pasky/WWW/dev/git/.g cogito
> ...
> progress: 5759 objects, 10292457 bytes
>
> $ time cg-clone http://localhost/~pasky/dev/git/.g cogito
> ...
> progress: 8681 objects, 14881571 bytes

I've noticed that ssh connections don't actually disconnect at the end with recent versions of ssh sometimes. In my experience, this occasionally happens with git, but always happens with scp, suggesting that it's an ssh bug of some sort; I've also only noticed this with openssh 3.9_p1 with some of Gentoo's -r2 patches.

	-Daniel
*This .sig left intentionally blank*
Re: Git 1.0 Synopis (Draft v4)
On Mon, 15 Aug 2005, Junio C Hamano wrote:

> Ryan Anderson [EMAIL PROTECTED] writes:
>
> > I was waiting until you said, "Ok, 1.00 tomorrow morning"
>
> Makes sense. There would be some weeks until that happens I am afraid.

It might be worth putting the list of things left to do before 1.0 in the tree (since they clearly covary), and it would be useful to know what you're thinking of as preventing the release at any particular stage.

	-Daniel
*This .sig left intentionally blank*
Re: Cloning speed comparison
On Mon, 15 Aug 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > I should be able to get http-pull down to the neighborhood of (current) ssh-pull; http-pull is that slow (when the source repository isn't packed) because it's entirely sequential, rather than overlapping requests like ssh-pull now does.
>
> I like those prefetch() and process() code in pull.c very much. I have been wondering about increasing parallelism more by prefetching beyond the immediate parents of the current commit, in the if (get_history) part of process_commit(). Maybe it is not worth it, because doing a commit, its associated tree(s) and its parents would already give us enough parallelism.

It is actually already maxing out the parallelism; it has a FIFO of objects which it needs, and calls prefetch() when it enqueues an object and fetch() when it dequeues it. It only cares about the dependencies for this purpose, not the types.

	-Daniel
*This .sig left intentionally blank*
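The FIFO described in the message above can be sketched in a few lines of C. This is only a model of the ordering, not the code in pull.c: the queue, `need()`, `drain()`, and the two stubs are hypothetical names, objects are plain strings, and the stubs merely append to a trace buffer where a real transport would start or complete a transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define QUEUE_CAP 64

static const char *queue[QUEUE_CAP];
static size_t q_head, q_tail;   /* dequeue at head, enqueue at tail */
char trace[512];                /* records the order of calls */

/* Stand-in for prefetch(): would start a request without waiting. */
static void stub_prefetch(const char *name)
{
    size_t used = strlen(trace);
    snprintf(trace + used, sizeof(trace) - used, "prefetch(%s) ", name);
}

/* Stand-in for fetch(): would block until the object has arrived. */
static int stub_fetch(const char *name)
{
    size_t used = strlen(trace);
    snprintf(trace + used, sizeof(trace) - used, "fetch(%s) ", name);
    return 0;
}

/* Enqueue an object we depend on; the request goes out immediately. */
int need(const char *name)
{
    assert(name != NULL);
    if (q_tail - q_head == QUEUE_CAP)
        return -1;              /* queue full */
    queue[q_tail++ % QUEUE_CAP] = name;
    stub_prefetch(name);
    return 0;
}

/* Dequeue in FIFO order, waiting for each object in turn. */
int drain(void)
{
    while (q_head != q_tail) {
        if (stub_fetch(queue[q_head++ % QUEUE_CAP]))
            return -1;
    }
    return 0;
}
```

Calling `need("commit")`, `need("tree")` and then `drain()` records both prefetch calls before the first fetch, which is the overlap being described: every queued request is already in flight by the time the first blocking fetch happens. In the real code, processing a fetched object would enqueue its own dependencies; that step is left out here.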
[PATCH] Add function to read an index file from an arbitrary filename.
Note that the pack file has to be in the usual location if it gets installed later.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 cache.h     |    2 ++
 sha1_file.c |   10 ++++++++--
 2 files changed, 10 insertions(+), 2 deletions(-)

59e5c6d163edae5da6136560d48a4750cceacdc6
diff --git a/cache.h b/cache.h
--- a/cache.h
+++ b/cache.h
@@ -319,6 +319,8 @@ extern int get_ack(int fd, unsigned char
 extern struct ref **get_remote_heads(int in, struct ref **list, int nr_match, char **match);
 
 extern struct packed_git *parse_pack_index(unsigned char *sha1);
+extern struct packed_git *parse_pack_index_file(unsigned char *sha1,
+						char *idx_path);
 
 extern void prepare_packed_git(void);
 extern void install_packed_git(struct packed_git *pack);
diff --git a/sha1_file.c b/sha1_file.c
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -476,12 +476,18 @@ struct packed_git *add_packed_git(char *
 
 struct packed_git *parse_pack_index(unsigned char *sha1)
 {
+	char *path = sha1_pack_index_name(sha1);
+	return parse_pack_index_file(sha1, path);
+}
+
+struct packed_git *parse_pack_index_file(unsigned char *sha1, char *idx_path)
+{
 	struct packed_git *p;
 	unsigned long idx_size;
 	void *idx_map;
-	char *path = sha1_pack_index_name(sha1);
+	char *path;
 
-	if (check_packed_git_idx(path, &idx_size, &idx_map))
+	if (check_packed_git_idx(idx_path, &idx_size, &idx_map))
 		return NULL;
 
 	path = sha1_pack_name(sha1);
[PATCH] Support packs in local-pull
If it doesn't find an object, it looks for an index that contains it and uses the same methods on that instead.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 local-pull.c |  112 +++--
 1 files changed, 91 insertions(+), 21 deletions(-)

aafbc7fb9ae059b9c9afa42e8d2c0548ea960f9f
diff --git a/local-pull.c b/local-pull.c
--- a/local-pull.c
+++ b/local-pull.c
@@ -15,34 +15,54 @@ void prefetch(unsigned char *sha1)
 {
 }
 
-int fetch(unsigned char *sha1)
+static struct packed_git *packs = NULL;
+
+void setup_index(unsigned char *sha1)
 {
-	static int object_name_start = -1;
-	static char filename[PATH_MAX];
-	char *hex = sha1_to_hex(sha1);
-	const char *dest_filename = sha1_file_name(sha1);
+	struct packed_git *new_pack;
+	char filename[PATH_MAX];
+	strcpy(filename, path);
+	strcat(filename, "/objects/pack/pack-");
+	strcat(filename, sha1_to_hex(sha1));
+	strcat(filename, ".idx");
+	new_pack = parse_pack_index_file(sha1, filename);
+	new_pack->next = packs;
+	packs = new_pack;
+}
 
-	if (object_name_start < 0) {
-		strcpy(filename, path); /* e.g. git.git */
-		strcat(filename, "/objects/");
-		object_name_start = strlen(filename);
+int setup_indices()
+{
+	DIR *dir;
+	struct dirent *de;
+	char filename[PATH_MAX];
+	unsigned char sha1[20];
+	sprintf(filename, "%s/objects/pack/", path);
+	dir = opendir(filename);
+	while ((de = readdir(dir)) != NULL) {
+		int namelen = strlen(de->d_name);
+		if (namelen != 50 ||
+		    strcmp(de->d_name + namelen - 5, ".pack"))
+			continue;
+		get_sha1_hex(sha1, de->d_name + 5);
+		setup_index(sha1);
 	}
-	filename[object_name_start+0] = hex[0];
-	filename[object_name_start+1] = hex[1];
-	filename[object_name_start+2] = '/';
-	strcpy(filename + object_name_start + 3, hex + 2);
+	return 0;
+}
+
+int copy_file(const char *source, const char *dest, const char *hex)
+{
 	if (use_link) {
-		if (!link(filename, dest_filename)) {
+		if (!link(source, dest)) {
 			pull_say("link %s\n", hex);
 			return 0;
 		}
 		/* If we got ENOENT there is no point continuing. */
 		if (errno == ENOENT) {
-			fprintf(stderr, "does not exist %s\n", filename);
+			fprintf(stderr, "does not exist %s\n", source);
 			return -1;
 		}
 	}
-	if (use_symlink && !symlink(filename, dest_filename)) {
+	if (use_symlink && !symlink(source, dest)) {
 		pull_say("symlink %s\n", hex);
 		return 0;
 	}
@@ -50,25 +70,25 @@ int fetch(unsigned char *sha1)
 		int ifd, ofd, status;
 		struct stat st;
 		void *map;
-		ifd = open(filename, O_RDONLY);
+		ifd = open(source, O_RDONLY);
 		if (ifd < 0 || fstat(ifd, &st) < 0) {
 			close(ifd);
-			fprintf(stderr, "cannot open %s\n", filename);
+			fprintf(stderr, "cannot open %s\n", source);
 			return -1;
 		}
 		map = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, ifd, 0);
 		close(ifd);
 		if (map == MAP_FAILED) {
-			fprintf(stderr, "cannot mmap %s\n", filename);
+			fprintf(stderr, "cannot mmap %s\n", source);
 			return -1;
 		}
-		ofd = open(dest_filename, O_WRONLY | O_CREAT | O_EXCL, 0666);
+		ofd = open(dest, O_WRONLY | O_CREAT | O_EXCL, 0666);
 		status = ((ofd < 0) ||
 			  (write(ofd, map, st.st_size) != st.st_size));
 		munmap(map, st.st_size);
 		close(ofd);
 		if (status)
-			fprintf(stderr, "cannot write %s\n", dest_filename);
+			fprintf(stderr, "cannot write %s\n", dest);
 		else
 			pull_say("copy %s\n", hex);
 		return status;
@@ -77,6 +97,56 @@ int fetch(unsigned char *sha1)
 	return -1;
 }
 
+int fetch_pack(unsigned char *sha1)
+{
+	struct packed_git *target;
+	char filename[PATH_MAX];
+	if (setup_indices())
+		return -1;
+	target = find_sha1_pack(sha1, packs);
+	if (!target)
+		return error("Couldn't find %s: not separate or in any pack",
+			     sha1_to_hex(sha1));
+	if (get_verbosely) {
+		fprintf(stderr, "Getting pack %s\n",
+			sha1_to_hex(target->sha1));
+		fprintf(stderr, " which contains %s\n",
+			sha1_to_hex(sha1
Re: [OT?] git tools at SourceForge ?
On Fri, 12 Aug 2005, Wolfgang Denk wrote:

> This is somewhat off topic here, so I apologize, but I didn't know any better place to ask:
>
> Has anybody any information if SourceForge is going to provide git / cogito / ... for the projects they host?
>
> I asked SF, and they opened a new Feature Request (item #1252867); the message I received sounded as if I was the first person on the planet to ask... Am I really alone with this?

The git architecture makes the central server less important, and it's easy to run your own. Also, kernel.org is providing space to a set of people with a large overlap with git users, since git hasn't been particularly publicized and kernel.org is hosting git.

	-Daniel
*This .sig left intentionally blank*
Re: [OT?] git tools at SourceForge ?
On Fri, 12 Aug 2005, Linus Torvalds wrote:

> And it's possible that git usage won't expand all that much either. But quite frankly, I think git is a lot better than CVS (or even SVN) by now, and I wouldn't be surprised if it started getting some use outside of the git-only and kernel projects once people start getting more used to it.
>
> And so I'd be thrilled to have some site like SF support it.

I certainly think it's going to happen; it's just not surprising that it hasn't happened yet. Once there's a stable release and some publicity, I'd expect SF to see it as worthwhile. But a hosting site with git-only shell access needs to know what the necessary programs are going to be, which we haven't committed to quite yet.

	-Daniel
*This .sig left intentionally blank*
[PATCH] Re: git-http-pull broken in latest git
On Thu, 11 Aug 2005, Junio C Hamano wrote:

> Petr Baudis [EMAIL PROTECTED] writes:
>
> > $ git-cat-file commit bf570303153902ec3d85570ed24515bcf8948848 | grep tree
> > tree 41f10531f1799bbb31a1e0f7652363154ce96f45
> > $ git-read-tree 41f10531f1799bbb31a1e0f7652363154ce96f45
> > fatal: failed to unpack tree object 41f10531f1799bbb31a1e0f7652363154ce96f45
> >
> > Kaboom. I think the issue might be that the reference dependency tree building is broken and it should've pulled the other pack as well.
>
> Last time I checked, git-http-pull did not utilize the pack dependency information, which indeed is wrong.

Is there documentation on the format?

> When it decides to fetch a pack instead of an asked-for object, it should check which commits the pack expects to have in your local repository and add them to its list of things to slurp.

It should work anyway, except that I messed up some logic in the parallel pull stuff; when it finds it has something already, it ignores it entirely, rather than processing it. The following patch fixes this.

---
[PATCH] Fix parallel pull dependency tracking.

It didn't refetch an object it already had (good), but didn't process it, either (bad). Synchronously process anything you already have.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---

 pull.c | 57 -
 1 files changed, 32 insertions(+), 25 deletions(-)

9b6b4b259c6b00d5b2502c158bc800d7623352bc
diff --git a/pull.c b/pull.c
--- a/pull.c
+++ b/pull.c
@@ -98,12 +98,38 @@ static int process_tag(struct tag *tag)
 static struct object_list *process_queue = NULL;
 static struct object_list **process_queue_end = &process_queue;
 
-static int process(unsigned char *sha1, const char *type)
+static int process_object(struct object *obj)
 {
-	struct object *obj;
-	if (has_sha1_file(sha1))
+	if (obj->type == commit_type) {
+		if (process_commit((struct commit *)obj))
+			return -1;
+		return 0;
+	}
+	if (obj->type == tree_type) {
+		if (process_tree((struct tree *)obj))
+			return -1;
 		return 0;
-	obj = lookup_object_type(sha1, type);
+	}
+	if (obj->type == blob_type) {
+		return 0;
+	}
+	if (obj->type == tag_type) {
+		if (process_tag((struct tag *)obj))
+			return -1;
+		return 0;
+	}
+	return error("Unable to determine requirements "
+		     "of type %s for %s",
+		     obj->type, sha1_to_hex(obj->sha1));
+}
+
+static int process(unsigned char *sha1, const char *type)
+{
+	struct object *obj = lookup_object_type(sha1, type);
+	if (has_sha1_file(sha1)) {
+		/* We already have it, so we should scan it now. */
+		return process_object(obj);
+	}
 	if (object_list_contains(process_queue, obj))
 		return 0;
 	object_list_insert(obj, process_queue_end);
@@ -134,27 +160,8 @@ static int loop(void)
 			return -1;
 		if (!obj->type)
 			parse_object(obj->sha1);
-		if (obj->type == commit_type) {
-			if (process_commit((struct commit *)obj))
-				return -1;
-			continue;
-		}
-		if (obj->type == tree_type) {
-			if (process_tree((struct tree *)obj))
-				return -1;
-			continue;
-		}
-		if (obj->type == blob_type) {
-			continue;
-		}
-		if (obj->type == tag_type) {
-			if (process_tag((struct tag *)obj))
-				return -1;
-			continue;
-		}
-		return error("Unable to determine requirements "
-			     "of type %s for %s",
-			     obj->type, sha1_to_hex(obj->sha1));
+		if (process_object(obj))
+			return -1;
 	}
 	return 0;
 }
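The bug described in this message (an object that is already present gets skipped, so nothing it refers to is ever checked) can be illustrated with a toy object graph. This is only a sketch, not the pull.c logic: `struct toy_object`, `toy_process` and `fetched_count` are hypothetical names invented for the illustration, and the walk is a plain recursion rather than the queue the patch uses.

```c
#include <stdio.h>

/* Hypothetical toy object: a name, two flags, and up to two references. */
struct toy_object {
	const char *name;
	int have;                  /* already in the local store? */
	int scanned;               /* have its references been followed? */
	struct toy_object *refs[2];
};

/* Number of objects that had to be "downloaded" during the walk. */
static int fetched_count;

/* Walk everything reachable from obj, fetching only what is missing. */
static void toy_process(struct toy_object *obj)
{
	int i;

	if (!obj || obj->scanned)
		return;
	obj->scanned = 1;
	if (!obj->have) {
		obj->have = 1;     /* stand-in for an actual download */
		fetched_count++;
	}
	/* Present or not, its references still have to be followed. */
	for (i = 0; i < 2; i++)
		toy_process(obj->refs[i]);
}
```

With a commit that is present locally but whose tree and blob are missing, returning early on `obj->have` would fetch nothing; following the references regardless picks up both missing objects.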
Re: [PATCH] Re: git-http-pull broken in latest git
On Thu, 11 Aug 2005, Junio C Hamano wrote:

> Daniel Barkalow [EMAIL PROTECTED] writes:
>
> > It should work anyway,...
>
> That is true. Please forget about the recommendation to slurp packs and not falling back on commit walker.
>
> Thanks for the patch.

No problem; I had been wondering what the rest of those lines were about anyway.

	-Daniel
*This .sig left intentionally blank*
Re: Bootstrapping into git, commit gripes at me
On Mon, 11 Jul 2005, Junio C Hamano wrote:

> Linus Torvalds [EMAIL PROTECTED] writes:
>
> > But what about the branch name? Should we just ask the user? Together with a flag, like
> >
> >	git checkout -b new-branch v2.6.12
> >
> > for somebody who wants to specify the branch name? Or should we pick a random name and add a helper function to rename a branch later?
> >
> > Opinions?
>
> How about treating master a temporary thing --- whatever I happen to be working on right now?

That conflicts with my usage, where I have a single repository for all of my working directories, with .git/refs and .git/objects being symlinks to it, but .git/HEAD being different for each branch. The stuff in objects/ and refs/ really shouldn't depend on what you're currently doing for this reason.

My way of thinking of master is that it's a real branch, which is for all of the situations where you aren't using a specially-designated branch. For many people, they only do stuff that's not designated specially; Jeff only does stuff that is designated specially. But if you do both, you'll want master to be left alone while you work on the side branch.

	-Daniel
*This .sig left intentionally blank*
Re: [PATCH 0/2] Support for packs in HTTP
On Mon, 11 Jul 2005, Linus Torvalds wrote:

> On Mon, 11 Jul 2005, Daniel Barkalow wrote:
> >
> > On Sun, 10 Jul 2005, Linus Torvalds wrote:
> > >
> > > You really _mustn't_ try to create the pack directly to the $GIT_DIR/objects/pack subdirectory - that would make git itself start possibly using that pack before the index is all done, and that would be just wrong and nasty. So you really should _always_ generate the pack somewhere else, and then move it (pack file first, index file second).
> >
> > It's currently fine ignoring index files without corresponding pack files (sha1_file.c, line 470).
>
> That doesn't help.

Well, it means that the order you move them doesn't matter, because it will ignore the pair if either hasn't been moved.

> Regardless of which order you write them (and you _will_ write the pack-file first), you'll find that at some point you have both files, but one or the other isn't fully written, ie they are unusable.

(Off topic: note that git-http-pull writes the _index_ first, because it fetches it to determine if it should fetch the pack)

> And yes, you can handle that by always checking the SHA1 of the files when you open them, but the fact is, you shouldn't need to, just to use it. Checking the SHA1 of the pack-file in particular is very expensive (since it's potentially a huge file, and you don't even want to read all of it).

IIRC, we check the size of the pack file and there are hashes around the ends of the two files which have to match; but this is a die() check, not an ignore check, so we just crash with a clear error message rather than doing crazy stuff (like reading from beyond the end of the mmap).

> So that's what I decided the rule is: never ever have a partial file, and thus you can by definition use them immediately when you see both files. But that requires that you write them under another name than the final one. And since you want that _anyway_ for other uses, you don't hide that inside git-pack-objects, but you make it an exported interface.

We should never write anything under the final name, anyway, for just this reason; we already use open/write/close/rename for objects, refs, and cache (maybe not working directory files, though). I think we're actually agreeing on this.

My position is that the temporary location should be something like {final-name}.part, such that it doesn't match *.idx or *.pack beforehand (so it doesn't look like a complete file that you might want to send to someone) and it doesn't have to worry about EXDEV on the rename. Also, I would ideally like to be able to resume an interrupted download, which means that it would have to find the partial file in a predictable location, given what it's supposed to contain.

	-Daniel
*This .sig left intentionally blank*
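The open/write/close/rename pattern with a {final-name}.part temporary can be sketched as follows. This is a minimal illustration under stated assumptions, not code from git: `install_file` is a hypothetical helper name, the 4096-byte path buffer is an arbitrary choice, and resuming a partial download is not handled.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to "<final_name>.part", then
 * rename it into place. Returns 0 on success, -1 on failure. Readers
 * never see a partially written file under the final name.
 */
static int install_file(const char *final_name, const void *data, size_t len)
{
	char tmp_name[4096];
	const char *p = data;
	int n, fd;

	n = snprintf(tmp_name, sizeof(tmp_name), "%s.part", final_name);
	if (n < 0 || (size_t)n >= sizeof(tmp_name))
		return -1;

	fd = open(tmp_name, O_WRONLY | O_CREAT | O_TRUNC, 0666);
	if (fd < 0)
		return -1;

	/* write() may accept fewer bytes than asked for, so loop. */
	while (len > 0) {
		ssize_t written = write(fd, p, len);
		if (written <= 0) {
			close(fd);
			unlink(tmp_name);
			return -1;
		}
		p += written;
		len -= (size_t)written;
	}
	if (close(fd) < 0) {
		unlink(tmp_name);
		return -1;
	}

	/* The temporary sits next to the target, so EXDEV cannot occur. */
	if (rename(tmp_name, final_name) < 0) {
		unlink(tmp_name);
		return -1;
	}
	return 0;
}
```

Because the temporary name ends in .part, it matches neither *.idx nor *.pack while it is incomplete, which is the property argued for above.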
Re: [PATCH 2/2] Demo support for packs via HTTP
On Mon, 11 Jul 2005, Darrin Thompson wrote:

> On Sun, 2005-07-10 at 15:56 -0400, Daniel Barkalow wrote:
> > +	curl_easy_setopt(curl, CURLOPT_FILE, indexfile);
> > +	curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite);
> > +	curl_easy_setopt(curl, CURLOPT_URL, url);
>
> I was hoping to send in a patch which would turn on user auth and turn off ssl peer verification.
>
> Your (preliminary obviously) patch puts curl handling in two places. Is there a place where I can safely start working on adding the needed setopts?

If I understand the curl documentation, you should be able to set options on the curl object when it has just been created, if those options aren't going to change between requests. Note that I make requests from multiple places, but I use the same curl object for all of them.

	-Daniel
*This .sig left intentionally blank*
[PATCH 1/2] Management of packs not yet installed
Support for parsing index files without pack files, installing pack files while running, and checking what pack files are available. Signed-off-by: Daniel Barkalow [EMAIL PROTECTED] --- commit b686d7a0377c24e05dbed0dafe909dda6c3dfb48 tree ce285b1a0adb4f8d415f72668a77bc1f1f92e1e1 parent 167a4a3308f4a1606e268c2204c98d6999046ae0 author Daniel Barkalow [EMAIL PROTECTED] 1121024924 -0400 committer Daniel Barkalow [EMAIL PROTECTED](none) 1121024924 -0400 Index: cache.h === --- dbae854c7c91182c8a124d0b85d802945d1c6223/cache.h (mode:100644 sha1:84d43d366c6145a30865aa65d92ada88ab95bb9f) +++ ce285b1a0adb4f8d415f72668a77bc1f1f92e1e1/cache.h (mode:100644 sha1:719a77dfabb24e58abd21b7f3a4b846a114e000a) @@ -161,6 +161,8 @@ extern char *mkpath(const char *fmt, ...); extern char *git_path(const char *fmt, ...); extern char *sha1_file_name(const unsigned char *sha1); +extern char *sha1_pack_name(const unsigned char *sha1); +extern char *sha1_pack_index_name(const unsigned char *sha1); int safe_create_leading_directories(char *path); @@ -189,6 +191,9 @@ extern int has_sha1_pack(const unsigned char *sha1); extern int has_sha1_file(const unsigned char *sha1); +extern int has_pack_file(const unsigned char *sha1); +extern int has_pack_index(const unsigned char *sha1); + /* Convert to/from hex/sha1 representation */ extern int get_sha1(const char *str, unsigned char *sha1); extern int get_sha1_hex(const char *hex, unsigned char *sha1); @@ -260,6 +265,7 @@ void *pack_base; unsigned int pack_last_used; unsigned int pack_use_cnt; + unsigned char sha1[20]; char pack_name[0]; /* something like .git/objects/pack/x.pack */ } *packed_git; @@ -274,7 +280,14 @@ extern int path_match(const char *path, int nr, char **match); extern int get_ack(int fd, unsigned char *result_sha1); +extern struct packed_git *parse_pack_index(unsigned char *sha1); + extern void prepare_packed_git(void); +extern void install_packed_git(struct packed_git *pack); + +struct packed_git *find_sha1_pack(const unsigned char 
*sha1, + struct packed_git *packs); + extern int use_packed_git(struct packed_git *); extern void unuse_packed_git(struct packed_git *); extern struct packed_git *add_packed_git(char *, int); Index: sha1_file.c === --- dbae854c7c91182c8a124d0b85d802945d1c6223/sha1_file.c (mode:100644 sha1:b2914dd2ea629ae974fd4b4c272e77cb04e5c0e0) +++ ce285b1a0adb4f8d415f72668a77bc1f1f92e1e1/sha1_file.c (mode:100644 sha1:27136fdba0fbf2dd943f2634cb49660cdbf95ec4) @@ -200,6 +200,56 @@ return base; } +char *sha1_pack_name(const unsigned char *sha1) +{ + static const char hex[] = 0123456789abcdef; + static char *name, *base, *buf; + int i; + + if (!base) { + const char *sha1_file_directory = get_object_directory(); + int len = strlen(sha1_file_directory); + base = xmalloc(len + 60); + sprintf(base, %s/pack/pack-1234567890123456789012345678901234567890.pack, sha1_file_directory); + name = base + len + 11; + } + + buf = name; + + for (i = 0; i 20; i++) { + unsigned int val = *sha1++; + *buf++ = hex[val 4]; + *buf++ = hex[val 0xf]; + } + + return base; +} + +char *sha1_pack_index_name(const unsigned char *sha1) +{ + static const char hex[] = 0123456789abcdef; + static char *name, *base, *buf; + int i; + + if (!base) { + const char *sha1_file_directory = get_object_directory(); + int len = strlen(sha1_file_directory); + base = xmalloc(len + 60); + sprintf(base, %s/pack/pack-1234567890123456789012345678901234567890.idx, sha1_file_directory); + name = base + len + 11; + } + + buf = name; + + for (i = 0; i 20; i++) { + unsigned int val = *sha1++; + *buf++ = hex[val 4]; + *buf++ = hex[val 0xf]; + } + + return base; +} + struct alternate_object_database *alt_odb; /* @@ -360,6 +410,14 @@ int use_packed_git(struct packed_git *p) { + if (!p-pack_size) { + struct stat st; + // We created the struct before we had the pack + stat(p-pack_name, st); + if (!S_ISREG(st.st_mode)) + die(packfile %s not a regular file, p-pack_name); + p-pack_size = st.st_size; + } if (!p-pack_base) { int fd; struct stat st; 
@@ -387,8 +445,10 @@ * this is cheap. */ if (memcmp((char*)(p-index_base) + p-index_size - 40, - p-pack_base + p-pack_size - 20, 20)) + p
[PATCH 0/2] Support for packs in HTTP
This series has one patch which is ready to go in and one that's not (although it's a reasonable phony for the current state of the git world).

1: Several additional functions are needed in the library to support progressively getting pack data from some remote location and using it to determine what else to get.

2: git-http-pull can get packs as appropriate by getting all the index files first, and then using them to figure out whether the object it's looking for is in some pack it could get.

Currently, there's no sane way to figure out what pack/index files are available from an HTTP server. But there only seems to be one pack file available on an HTTP server at the moment, so this tries to get that one.

	-Daniel
*This .sig left intentionally blank*
[PATCH 2/2] Demo support for packs via HTTP
Support for downloading the pack file e3117bbaf6a59cb53c3f6f0d9b17b9433f0e4135 when appropriate. (Will support other pack files when the repository has a list of them.) Signed-off-by: Daniel Barkalow [EMAIL PROTECTED] --- commit 74132562a2f6cfce9690a5091de7e85bd51d88af tree c0ae9cb936abac4412aa4a89928f4609d111fd2c parent b686d7a0377c24e05dbed0dafe909dda6c3dfb48 author Daniel Barkalow [EMAIL PROTECTED] 1121024943 -0400 committer Daniel Barkalow [EMAIL PROTECTED](none) 1121024943 -0400 Index: http-pull.c === --- ce285b1a0adb4f8d415f72668a77bc1f1f92e1e1/http-pull.c (mode:100644 sha1:1f9d60b9b1d5eed85b24d96c240666bbfc5a22ed) +++ c0ae9cb936abac4412aa4a89928f4609d111fd2c/http-pull.c (mode:100644 sha1:2a8d7e71d9447483668cb4a1eb01a096e736f8e3) @@ -56,6 +56,126 @@ return size; } +static int got_indices = 0; + +static struct packed_git *packs = NULL; + +static int fetch_index(unsigned char *sha1) +{ + char *filename; + char *url; + + FILE *indexfile; + + if (has_pack_index(sha1)) + return 0; + + if (get_verbosely) + fprintf(stderr, Getting index for pack %s\n, + sha1_to_hex(sha1)); + + url = xmalloc(strlen(base) + 64); + sprintf(url, %s/objects/pack/pack-%s.idx, + base, sha1_to_hex(sha1)); + + filename = sha1_pack_index_name(sha1); + indexfile = fopen(filename, w); + if (!indexfile) + return error(Unable to open local file %s for pack index, +filename); + + curl_easy_setopt(curl, CURLOPT_FILE, indexfile); + curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite); + curl_easy_setopt(curl, CURLOPT_URL, url); + + if (curl_easy_perform(curl)) { + fclose(indexfile); + return error(Unable to get pack index %s, url); + } + + fclose(indexfile); + return 0; +} + +static int setup_index(unsigned char *sha1) +{ + struct packed_git *new_pack; + if (has_pack_file(sha1)) + return 0; // don't list this as something we can get + + if (fetch_index(sha1)) + return -1; + + new_pack = parse_pack_index(sha1); + new_pack-next = packs; + packs = new_pack; + return 0; +} + +static int 
fetch_indices(void) +{ + unsigned char sha1[20]; + if (got_indices) + return 0; + get_sha1_hex(e3117bbaf6a59cb53c3f6f0d9b17b9433f0e4135, sha1); + setup_index(sha1); + got_indices = 1; + return 0; +} + +static int fetch_pack(unsigned char *sha1) +{ + char *url; + struct packed_git *target; + struct packed_git **lst; + FILE *packfile; + char *filename; + + if (fetch_indices()) + return -1; + target = find_sha1_pack(sha1, packs); + if (!target) + return error(Couldn't get %s: not separate or in any pack, +sha1_to_hex(sha1)); + + if (get_verbosely) { + fprintf(stderr, Getting pack %s\n, + sha1_to_hex(target-sha1)); + fprintf(stderr, which contains %s\n, + sha1_to_hex(sha1)); + } + + url = xmalloc(strlen(base) + 65); + sprintf(url, %s/objects/pack/pack-%s.pack, + base, sha1_to_hex(target-sha1)); + + filename = sha1_pack_name(target-sha1); + packfile = fopen(filename, w); + if (!packfile) + return error(Unable to open local file %s for pack, +filename); + + curl_easy_setopt(curl, CURLOPT_FILE, packfile); + curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite); + curl_easy_setopt(curl, CURLOPT_URL, url); + + if (curl_easy_perform(curl)) { + fclose(packfile); + return error(Unable to get pack file %s, url); + } + + fclose(packfile); + + install_packed_git(target); + + lst = packs; + while (*lst != target) + lst = ((*lst)-next); + *lst = (*lst)-next; + + return 0; +} + int fetch(unsigned char *sha1) { char *hex = sha1_to_hex(sha1); @@ -67,7 +187,7 @@ local = open(filename, O_WRONLY | O_CREAT | O_EXCL, 0666); if (local 0) - return error(Couldn't open %s\n, filename); + return error(Couldn't open local object %s\n, filename); memset(stream, 0, sizeof(stream)); @@ -75,6 +195,7 @@ SHA1_Init(c); + curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1); curl_easy_setopt(curl, CURLOPT_FILE, NULL); curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file); @@ -90,8 +211,12 @@ curl_easy_setopt(curl, CURLOPT_URL, url); - if (curl_easy_perform(curl)) - return error(Couldn't get %s for 
%s\n, url, hex
Re: [RFC] Design for http-pull on repo with packs
On Sun, 10 Jul 2005, Dan Holmsand wrote:

> Daniel Barkalow wrote:
> > I have a design for using http-pull on a packed repository, and it only requires one extra file in the repository: an append-only list of the pack files (because getting the directory listing is very painful and failure-prone).
>
> A few comments (as I've been tinkering with a way to solve the problem myself).
>
> As long as the pack files are named sensibly (i.e. if they are created by git-repack-script), it's not very error-prone to just get the directory listing, and look for matches for pack-<sha1>.idx. It seems to work quite well (see below). It isn't beautiful in any way, but it works...

I may grab your code for that; the version I just sent seems to be working except for that.

> > If an individual file is not available, figure out what packs are available:
> >
> >   Get the list of pack files the repository has (currently, I just use e3117bbaf6a59cb53c3f6f0d9b17b9433f0e4135)
> >   For any packs we don't have, get the index files.
>
> This part might be slightly expensive, for large repositories. If one assumes that packs are named as by git-repack-script, however, one might cache indexes we've already seen (again, see below). Or, if you go for the mandatory pack-index-file, require that it has a reliable order, so that you can get the last added index first.

Nothing bad happens if you have index files for pack files you don't have, as it turns out; the library ignores them. So we can keep the index files around so we can quickly check if they have the objects we want. That way, we don't have to worry about skipping something now (because it's not needed) and then ignoring it when the branch gets merged in.

So what I actually do is make a list of the pack files that aren't already downloaded that are available from the server, and download the index files for any where the index file isn't downloaded, either.

> >   Keep a list of the struct packed_gits for the packs the server has (these are not used as places to look for objects)
> >
> > Each time we need an object, check the list for it. If it is in there, download the corresponding pack and report success.
>
> Here you will need some strategy to deal with packs that overlap with what we've already got. Basically, small and overlapping packs should be unpacked, big and non-overlapping ones saved as is (since git-unpack-objects is painfully slow and memory-hungry...).

I don't think there's an issue with having overlapping packs, either with each other or with separate objects. If the user wants, stuff can be repacked outside of the pull operation (note, though, that the index files should be truncated rather than removed, so that the program doesn't fetch them again next time some object can't be found easily).

> One could also optimize the pack-download bit, by figuring out the last object in the pack that we need (easy enough to do from the index file), and just get the part of the pack file leading up to that object. That could be a huge win for independently packed repositories (I don't do that in my code below, though).

That's only possible if you can figure out what you want to have before you get it. My code is walking the reachability graph on the client; it can only figure out what other objects it needs after it's mapped the pack file.

> Anyway, here's my attempt at the same thing. It introduces git-dumb-fetch, with usage like git-fetch-pack (except that it works with http and rsync). And it adds some ugliness to git-cat-file, for figuring out which objects we already have.

I might use that method for listing the available packs, although I'd sort of like to encourage a clean solution first.

	-Daniel
*This .sig left intentionally blank*
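The "check the list of server packs for the object" step can be sketched roughly like this. It is an illustration only: `struct toy_pack`, `toy_pack_contains` and `toy_find_pack` are hypothetical stand-ins rather than the real `struct packed_git` or `find_sha1_pack()`, and each index is modelled as a plain sorted array of 20-byte object names instead of the on-disk index layout.

```c
#include <stddef.h>
#include <string.h>

#define TOY_HASH_LEN 20

/* Hypothetical stand-in for a downloaded pack index. */
struct toy_pack {
	const char *label;
	const unsigned char (*names)[TOY_HASH_LEN]; /* sorted ascending */
	size_t count;
	struct toy_pack *next;
};

/* Binary search one pack's sorted object names for sha1. */
static int toy_pack_contains(const struct toy_pack *p,
			     const unsigned char *sha1)
{
	size_t lo = 0, hi = p->count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(p->names[mid], sha1, TOY_HASH_LEN);

		if (!cmp)
			return 1;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}

/* Return the first listed pack whose index names sha1, or NULL. */
static struct toy_pack *toy_find_pack(struct toy_pack *packs,
				      const unsigned char *sha1)
{
	struct toy_pack *p;

	for (p = packs; p; p = p->next)
		if (toy_pack_contains(p, sha1))
			return p;
	return NULL;
}
```

A caller would download the pack that `toy_find_pack` returns and then drop it from the list, since from that point on it is an ordinary local pack.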
[PATCH] Make --recover cause pull to trace everything
Make the --recover flag check the parents of commits which are already available. This is needed currently to deal with cases where a parent is pulled along with a commit (in a pack, e.g.) and references above that parent aren't also pulled together.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---
commit 75e8c1be7a778e0a0fa119fe1bc408341932e7e5
tree ffbe708117543c356eb2981f1e0540b89b7a95e2
parent a7336ae514738f159dad314d6674961427f043a6
author Daniel Barkalow [EMAIL PROTECTED] 1121024019 -0400
committer Daniel Barkalow [EMAIL PROTECTED](none) 1121024019 -0400

Index: http-pull.c
===================================================================
--- 248f72f3e4dcb40693488b0c06f93d0b38122b8e/http-pull.c (mode:100644 sha1:1f9d60b9b1d5eed85b24d96c240666bbfc5a22ed)
+++ ffbe708117543c356eb2981f1e0540b89b7a95e2/http-pull.c (mode:100644 sha1:3fa56f08b0b8e7316afcaab3a7bfa3f2d26b550f)
@@ -146,7 +146,10 @@
 	int arg = 1;
 
 	while (arg < argc && argv[arg][0] == '-') {
-		if (argv[arg][1] == 't') {
+		if (argv[arg][1] == '-') {
+			if (!strcmp(argv[arg] + 2, "recover"))
+				careful = 1;
+		} else if (argv[arg][1] == 't') {
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
Index: local-pull.c
===================================================================
--- 248f72f3e4dcb40693488b0c06f93d0b38122b8e/local-pull.c (mode:100644 sha1:2f06fbee8b840a7ae642f5a22e2cb993687f3470)
+++ ffbe708117543c356eb2981f1e0540b89b7a95e2/local-pull.c (mode:100644 sha1:0d10c07844030bc7cb615cf916dce89592151be7)
@@ -116,7 +116,10 @@
 	int arg = 1;
 
 	while (arg < argc && argv[arg][0] == '-') {
-		if (argv[arg][1] == 't')
+		if (argv[arg][1] == '-') {
+			if (!strcmp(argv[arg] + 2, "recover"))
+				careful = 1;
+		} else if (argv[arg][1] == 't')
 			get_tree = 1;
 		else if (argv[arg][1] == 'c')
 			get_history = 1;
Index: pull.c
===================================================================
--- 248f72f3e4dcb40693488b0c06f93d0b38122b8e/pull.c (mode:100644 sha1:ed3078e3b27c62c07558fd94f339801cbd685593)
+++ ffbe708117543c356eb2981f1e0540b89b7a95e2/pull.c (mode:100644 sha1:d9763840c7ebcb1e5838c3b960695cafcca3ac73)
@@ -11,6 +11,7 @@
 
 const unsigned char *current_ref = NULL;
 
+int careful = 0;
 int get_tree = 0;
 int get_history = 0;
 int get_all = 0;
@@ -91,7 +92,8 @@
 	if (get_history) {
 		struct commit_list *parents = obj->parents;
 		for (; parents; parents = parents->next) {
-			if (has_sha1_file(parents->item->object.sha1))
+			if (!careful &&
+			    has_sha1_file(parents->item->object.sha1))
 				continue;
 			if (make_sure_we_have_it(NULL,
 						 parents->item->object.sha1)) {
Index: pull.h
===================================================================
--- 248f72f3e4dcb40693488b0c06f93d0b38122b8e/pull.h (mode:100644 sha1:e173ae3337c4465da87d849f4e5c9da203fdf01d)
+++ ffbe708117543c356eb2981f1e0540b89b7a95e2/pull.h (mode:100644 sha1:d1076468b71b31dd5e59ec55d98de830cf9df60e)
@@ -21,6 +21,12 @@
 /* If set, the hash that the current value of write_ref must be. */
 extern const unsigned char *current_ref;
 
+/*
+ * Set to check on everything, instead of stopping at points where we think
+ * we must have everything.
+ */
+extern int careful;
+
 /* Set to fetch the target tree. */
 extern int get_tree;
 
Index: ssh-pull.c
===================================================================
--- 248f72f3e4dcb40693488b0c06f93d0b38122b8e/ssh-pull.c (mode:100644 sha1:26356dd7d84ea1bc9f7320b18562ed4117d4fac0)
+++ ffbe708117543c356eb2981f1e0540b89b7a95e2/ssh-pull.c (mode:100644 sha1:7ca4243f3bd84590e7bb94467fd5acccd7d4d6f9)
@@ -61,7 +61,10 @@
 	const char *prog = getenv("GIT_SSH_PUSH") ? : "git-ssh-push";
 
 	while (arg < argc && argv[arg][0] == '-') {
-		if (argv[arg][1] == 't') {
+		if (argv[arg][1] == '-') {
+			if (!strcmp(argv[arg] + 2, "recover"))
+				careful = 1;
+		} else if (argv[arg][1] == 't') {
 			get_tree = 1;
 		} else if (argv[arg][1] == 'c') {
 			get_history = 1;
[PATCH 2/2] Remove map_sha1_file
Remove map_sha1_file(), now unused.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---
commit c21a02262f770a25b005378e06354e582aa1bfd8
tree 7ac9fabe666f00f37572e7b349fdb859bf8a6491
parent 264ff9f3dcde5553728b34fa08e04643b2b55946
author Daniel Barkalow [EMAIL PROTECTED] 1121033599 -0400
committer Daniel Barkalow [EMAIL PROTECTED](none) 1121033599 -0400

Index: cache.h
===================================================================
--- 353fe33ae9c7265d7b685bca864d657e3efe2849/cache.h (mode:100644 sha1:38dac6d6a413f1c788e5331ef4741fc15d72d9bd)
+++ 7ac9fabe666f00f37572e7b349fdb859bf8a6491/cache.h (mode:100644 sha1:11ba95c8aa9202fa3b1a3cbc07bc976641cd1908)
@@ -167,7 +167,6 @@
 int safe_create_leading_directories(char *path);
 
 /* Read and unpack a sha1 file into memory, write memory to a sha1 file */
-extern void * map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size);
 extern int parse_sha1_header(char *hdr, char *type, unsigned long *sizep);
 extern int sha1_object_info(const unsigned char *, char *, unsigned long *);
Index: sha1_file.c
===================================================================
--- 353fe33ae9c7265d7b685bca864d657e3efe2849/sha1_file.c (mode:100644 sha1:08560b2c7a6dff400a46160501c247081f9bb4c7)
+++ 7ac9fabe666f00f37572e7b349fdb859bf8a6491/sha1_file.c (mode:100644 sha1:e082f2e6cb985caca11979311c291aa51d6c37fd)
@@ -578,8 +578,7 @@
 }
 
 static void *map_sha1_file_internal(const unsigned char *sha1,
-				    unsigned long *size,
-				    int say_error)
+				    unsigned long *size)
 {
 	struct stat st;
 	void *map;
@@ -587,8 +586,6 @@
 	char *filename = find_sha1_file(sha1, &st);
 
 	if (!filename) {
-		if (say_error)
-			error("cannot map sha1 file %s", sha1_to_hex(sha1));
 		return NULL;
 	}
 
@@ -602,8 +599,6 @@
 				break;
 		/* Fallthrough */
 		case 0:
-			if (say_error)
-				perror(filename);
 			return NULL;
 		}
 
@@ -620,11 +615,6 @@
 	return map;
 }
 
-void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
-{
-	return map_sha1_file_internal(sha1, size, 1);
-}
-
 int unpack_sha1_header(z_stream *stream, void *map, unsigned long mapsize, void *buffer, unsigned long size)
 {
 	/* Get the data stream */
@@ -1112,7 +1102,7 @@
 	z_stream stream;
 	char hdr[128];
 
-	map = map_sha1_file_internal(sha1, &mapsize, 0);
+	map = map_sha1_file_internal(sha1, &mapsize);
 	if (!map) {
 		struct pack_entry e;
 
@@ -1151,7 +1141,7 @@
 	unsigned long mapsize;
 	void *map, *buf;
 
-	map = map_sha1_file_internal(sha1, &mapsize, 0);
+	map = map_sha1_file_internal(sha1, &mapsize);
 	if (map) {
 		buf = unpack_sha1_file(map, mapsize, type, size);
 		munmap(map, mapsize);
@@ -1331,7 +1321,7 @@
 	ssize_t size;
 	unsigned long objsize;
 	int posn = 0;
-	char *buf = map_sha1_file_internal(sha1, &objsize, 0);
+	char *buf = map_sha1_file_internal(sha1, &objsize);
 	z_stream stream;
 	if (!buf) {
 		unsigned char *unpacked;
[PATCH 1/2] write_sha1_to_fd()
Add write_sha1_to_fd(), which writes an object to a file descriptor. This
includes support for unpacking it and recompressing it.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---
commit 264ff9f3dcde5553728b34fa08e04643b2b55946
tree 353fe33ae9c7265d7b685bca864d657e3efe2849
parent c3eb461762b1d65e424fc4ede6a1d4f3e0a679f7
author Daniel Barkalow [EMAIL PROTECTED] 1121033477 -0400
committer Daniel Barkalow [EMAIL PROTECTED](none) 1121033477 -0400

Index: cache.h
===================================================================
--- 545ef8191b517b7f9e4ea558edaf526038ed1895/cache.h  (mode:100644 sha1:719a77dfabb24e58abd21b7f3a4b846a114e000a)
+++ 353fe33ae9c7265d7b685bca864d657e3efe2849/cache.h  (mode:100644 sha1:38dac6d6a413f1c788e5331ef4741fc15d72d9bd)
@@ -187,6 +187,7 @@
 extern int read_tree(void *buffer, unsigned long size, int stage);
 
 extern int write_sha1_from_fd(const unsigned char *sha1, int fd);
+extern int write_sha1_to_fd(int fd, const unsigned char *sha1);
 
 extern int has_sha1_pack(const unsigned char *sha1);
 extern int has_sha1_file(const unsigned char *sha1);
Index: sha1_file.c
===================================================================
--- 545ef8191b517b7f9e4ea558edaf526038ed1895/sha1_file.c  (mode:100644 sha1:27136fdba0fbf2dd943f2634cb49660cdbf95ec4)
+++ 353fe33ae9c7265d7b685bca864d657e3efe2849/sha1_file.c  (mode:100644 sha1:08560b2c7a6dff400a46160501c247081f9bb4c7)
@@ -1326,6 +1326,65 @@
 	return 0;
 }
 
+int write_sha1_to_fd(int fd, const unsigned char *sha1)
+{
+	ssize_t size;
+	unsigned long objsize;
+	int posn = 0;
+	char *buf = map_sha1_file_internal(sha1, &objsize, 0);
+	z_stream stream;
+	if (!buf) {
+		unsigned char *unpacked;
+		unsigned long len;
+		char type[20];
+		char hdr[50];
+		int hdrlen;
+		// need to unpack and recompress it by itself
+		unpacked = read_packed_sha1(sha1, type, &len);
+
+		hdrlen = sprintf(hdr, "%s %lu", type, len) + 1;
+
+		/* Set it up */
+		memset(&stream, 0, sizeof(stream));
+		deflateInit(&stream, Z_BEST_COMPRESSION);
+		size = deflateBound(&stream, len + hdrlen);
+		buf = xmalloc(size);
+
+		/* Compress it */
+		stream.next_out = buf;
+		stream.avail_out = size;
+
+		/* First header.. */
+		stream.next_in = hdr;
+		stream.avail_in = hdrlen;
+		while (deflate(&stream, 0) == Z_OK)
+			/* nothing */;
+
+		/* Then the data itself.. */
+		stream.next_in = unpacked;
+		stream.avail_in = len;
+		while (deflate(&stream, Z_FINISH) == Z_OK)
+			/* nothing */;
+		deflateEnd(&stream);
+
+		objsize = stream.total_out;
+	}
+
+	do {
+		size = write(fd, buf + posn, objsize - posn);
+		if (size <= 0) {
+			if (!size) {
+				fprintf(stderr, "write closed");
+			} else {
+				perror("write ");
+			}
+			return -1;
+		}
+		posn += size;
+	} while (posn < objsize);
+	return 0;
+}
+
 int write_sha1_from_fd(const unsigned char *sha1, int fd)
 {
 	char *filename = sha1_file_name(sha1);
Index: ssh-push.c
===================================================================
--- 545ef8191b517b7f9e4ea558edaf526038ed1895/ssh-push.c  (mode:100644 sha1:090d6f9f8fbde2d736ac5bf563415b0fa402b5aa)
+++ 353fe33ae9c7265d7b685bca864d657e3efe2849/ssh-push.c  (mode:100644 sha1:aac70af514e0dc5507fa4997ebad54352c973215)
@@ -7,13 +7,13 @@
 static unsigned char local_version = 1;
 static unsigned char remote_version = 0;
 
+static int verbose = 0;
+
 static int serve_object(int fd_in, int fd_out) {
 	ssize_t size;
-	int posn = 0;
 	unsigned char sha1[20];
-	unsigned long objsize;
-	void *buf;
 	signed char remote;
+	int posn = 0;
 	do {
 		size = read(fd_in, sha1 + posn, 20 - posn);
 		if (size < 0) {
@@ -25,12 +25,12 @@
 		posn += size;
 	} while (posn < 20);
 
-	/* fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1)); */
+	if (verbose)
+		fprintf(stderr, "Serving %s\n", sha1_to_hex(sha1));
+
 	remote = 0;
 
-	buf = map_sha1_file(sha1, &objsize);
-
-	if (!buf) {
+	if (!has_sha1_file(sha1)) {
 		fprintf(stderr, "git-ssh-push: could not find %s\n",
 			sha1_to_hex(sha1));
 		remote = -1;
@@ -41,20 +41,7 @@
 	if (remote < 0)
 		return 0;
 
-	posn = 0;
-	do {
-		size = write(fd_out, buf + posn
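The do/while loop at the end of write_sha1_to_fd() is the usual way of coping with short writes. As a minimal standalone sketch of the same pattern (write_all is a hypothetical helper name, not something in the patch, and retrying on EINTR is an addition of this sketch rather than what the patch does):

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): write all len bytes of
 * buf to fd, continuing after short writes.  Returns 0 on success and
 * -1 on error or if write() reports that nothing could be written.
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	size_t posn = 0;

	while (posn < len) {
		ssize_t n = write(fd, p + posn, len - posn);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: just retry */
			return -1;
		}
		if (n == 0)
			return -1;	/* no progress; treat as closed */
		posn += (size_t)n;
	}
	return 0;
}
```

Using a size_t position and a plain while loop means a zero-length buffer is handled without calling write() at all, and the position is never compared against a differently signed length.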
[PATCH] Better error message from git-ssh-push
If git-ssh-push can't interpret the commit-id, there are various possible
issues. Just giving the usage message makes it hard to identify what could
be wrong.

Signed-off-by: Daniel Barkalow [EMAIL PROTECTED]
---
commit 7a274ce1f93e6092dcf226d546a58d2d6df9d13c
tree 1f045fa8aa017cabbac613cf8c1ea2bd63ccc46c
parent 8934c88118c900fe38abbf60f893ee9ef4e83b3c
author Daniel Barkalow [EMAIL PROTECTED] 1120507167 -0400
committer Daniel Barkalow [EMAIL PROTECTED](none) 1120507167 -0400

Index: ssh-push.c
===================================================================
--- 62a74516551505e5fd2b5c2fd14486f3ac8a400e/ssh-push.c  (mode:100644 sha1:10390948efacfa06f4f6fc6b2f3631cec6fcb876)
+++ 1f045fa8aa017cabbac613cf8c1ea2bd63ccc46c/ssh-push.c  (mode:100644 sha1:6b1406b527ba6ede8602a04ab031003edb7da2b0)
@@ -257,8 +257,12 @@
 		usage(ssh_push_usage);
 	commit_id = argv[arg];
 	url = argv[arg + 1];
-	if (get_sha1(commit_id, sha1))
-		usage(ssh_push_usage);
+	if (get_sha1(commit_id, sha1)) {
+		fprintf(stderr,
+			"Unable to interpret %s as something to push.\n",
+			commit_id);
+		return 1;
+	}
 	memcpy(hex, sha1_to_hex(sha1), sizeof(hex));
 	argv[arg] = hex;
 
Re: Last mile for 1.0 again
On Mon, 4 Jul 2005, Linus Torvalds wrote:

> On Mon, 4 Jul 2005, Daniel Barkalow wrote:
> >
> > How about an option to git-rev-list to take a path, and (1) exclude any
> > branch where the version at that path ends up ignored in a merge and
> > (2) not list any revision where the version at that path is identical
> > to a parent?
>
> Hmm. How is that different from git-whatchanged path, really?

It would short-circuit going up areas of the history which don't
contribute (i.e., lead up to a merge which took its version from a
different parent). It could also stop when it ran out of branches that
have the file at all. Neither of these is all that significant, I guess.

Junio: what's missing from annotate/blame?

	-Daniel
Re: Last mile for 1.0 again
On Mon, 4 Jul 2005, Junio C Hamano wrote:

> DB == Daniel Barkalow [EMAIL PROTECTED] writes:
>
> DB> Junio: what's missing from annotate/blame?
>
> Which one are you talking about?
>
> What I use to generate http://members.cox.net/junkio/Summary.txt is an
> implementation of an algorithm I consider complete in that it does
> rename/copy and complete rewrite correctly. What is missing from the
> implementation is efficiency.
>
> [perl script]
> How does this work, and what do we do about merges?

I've got that part, but I'm not clear on how the rename/copy and complete
rewrite stuff works.

	-Daniel
Re: [3/5] Add http-pull
On Sat, 23 Apr 2005, Petr Baudis wrote:

> Dear diary, on Sat, Apr 23, 2005 at 01:00:33AM CEST, I got a letter
> where Daniel Barkalow [EMAIL PROTECTED] told me that...
> > On Sat, 23 Apr 2005, Petr Baudis wrote:
> >
> > > Dear diary, on Fri, Apr 22, 2005 at 09:46:35PM CEST, I got a letter
> > > where Daniel Barkalow [EMAIL PROTECTED] told me that...
> > >
> > > Huh. Why? You just go back to history until you find a commit you
> > > already have. If you did it the way as Tony described, if you have
> > > that commit, you can be sure that you have everything it depends on
> > > too.
> >
> > But if you download 1000 files of the 1010 you need, and then your
> > network goes down, you will need to download those 1000 again when it
> > comes back, because you can't save them unless you have the full
> > history.
>
> Why can't I? I think I can do that perfectly fine. The worst thing that
> can happen is that fsck-cache will complain a bit.

Not if you're using the fact that you don't have them to tell you that you
still need the other 10, which is what tony's scheme would do.

	-Daniel
Re: Change pull to _only_ download, and git update=pull+merge?
On Tue, 19 Apr 2005, Petr Baudis wrote:

> I disagree. This already forces you to have two branches (one to pull
> from to get the data, mirroring the remote branch, one for your real
> work) uselessly and needlessly.

If you pull in a non-tracked tree, it certainly won't apply the changes,
so you can just have your local tree and pull other people's trees as
desired.

> I think there is just no good name for what pull is doing now, and
> update seems like a great name for what pull-and-merge really is. Pull
> really is pull - it _pulls_ the data, while update also updates the
> given tree. No surprises.

I'm actually getting suspicious that the right thing is to hide "pull" in
the id scheme. That is, instead of saying "linus" to refer to the linus
head that you currently have, you say "+linus" to refer to the head Linus
has on his server currently, and this will cause you to download anything
necessary to perform the operation with the resulting value.

See, I don't think you ever want to just pull. You want to
pull-and-do-something, but the something could be any operation that uses
a commit, not necessarily update. So you could do "git diff -r +linus" to
compare your head against current linus. You'd want "git update" to take a
working directory from "linus" to "+linus" (just because you know Linus's
more recent head doesn't mean you're automatically using it). You could
just "git merge +linus" in your working directory to sync with Linus. Even
"git log +linus" to see his recent changes.

I think the only reason not to just make any reference to a head pull it
is performance on looking up the head; you don't really want to hammer the
server getting these 40-byte files constantly or wait for a connection
every time (not to mention the possibility of not being able to connect).
But there's no reason to want to not have the latest data, since the older
data doesn't go away.

	-Daniel
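The +linus idea above boils down to a small parsing rule on head names. A purely hypothetical sketch of that rule (parse_head_ref is an invented name; nothing in this thread defines such a function, and the actual download step is left out):

```c
#include <stddef.h>

/*
 * Hypothetical sketch of the "+name" convention: a leading '+' means
 * "refresh this head from its server before using it".
 * Returns the head name with any '+' removed, or NULL for an empty
 * name; *refresh is set to 1 if a download was requested, else 0.
 */
static const char *parse_head_ref(const char *spec, int *refresh)
{
	*refresh = 0;
	if (!spec)
		return NULL;
	if (spec[0] == '+') {
		*refresh = 1;
		spec++;
	}
	return spec[0] ? spec : NULL;
}
```

Any command taking a commit could run its argument through something like this and only contact the server when the flag comes back set, which keeps plain names cheap to look up.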
Re: [0/5] Parsers for git objects, porting some programs
On Mon, 18 Apr 2005, Junio C Hamano wrote:

> I was looking at the tree part and am thinking that it would make it
> much nicer if your tree object records path for each entry.

You're entirely right, and I've actually now written the code that does
it. I'm planning to send out a patch for that shortly.

> Currently it just borrows from object.refs to represent its children

Note that object.refs needs to get filled out for those applications, even
if the information is also included in the parse; object.refs is for
finding what you can reach without worrying about how you do it.

	-Daniel
More patches
Here are the things I was saving for after the previous set:

 1: Report the actual contents of trees
 2: Add functions for scanning history by date
 3: Add http-pull, a program to fetch the objects you need by HTTP
 4: Change merge-base to find the most recent common ancestor

1 and 2 are core extensions. 3 might be best for the pasky tree. 4 is
mostly a demo of 2 and because Linus thought it was a better algorithm.

	-Daniel
[1/4] Report info from trees
This patch adds actual information to struct tree, making it possible to
tell what sorts of things the referenced objects are. This is needed for
http-pull, and Junio wanted something of the sort.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]
Index: tree.c
===================================================================
--- 1172a9b8f45b2fd640985595cc5258db3b027828/tree.c  (mode:100644 sha1:7c5e5e46f4967b0812b06c0114946c3a6432c8d8)
+++ 7e5a0d93117ecadfb15de3a6bebdb1aa94234fde/tree.c  (mode:100644 sha1:39f9cbd1908e9046c148339f816025c9313ec142)
@@ -27,6 +27,7 @@
 	char type[20];
 	void *buffer, *bufptr;
 	unsigned long size;
+	struct tree_entry_list **list_p;
 	if (item->object.parsed)
 		return 0;
 	item->object.parsed = 1;
@@ -38,8 +39,10 @@
 	if (strcmp(type, tree_type))
 		return error("Object %s not a tree",
 			     sha1_to_hex(item->object.sha1));
+	list_p = &item->entries;
 	while (size) {
 		struct object *obj;
+		struct tree_entry_list *entry;
 		int len = 1+strlen(bufptr);
 		unsigned char *file_sha1 = bufptr + len;
 		char *path = strchr(bufptr, ' ');
@@ -48,6 +51,11 @@
 		    sscanf(bufptr, "%o", &mode) != 1)
 			return -1;
 
+		entry = malloc(sizeof(struct tree_entry_list));
+		entry->directory = S_ISDIR(mode);
+		entry->executable = mode & S_IXUSR;
+		entry->next = NULL;
+
 		/* Warn about trees that don't do the recursive thing.. */
 		if (strchr(path, '/')) {
 			item->has_full_path = 1;
@@ -56,12 +64,17 @@
 		bufptr += len + 20;
 		size -= len + 20;
 
-		if (S_ISDIR(mode)) {
-			obj = &lookup_tree(file_sha1)->object;
+		if (entry->directory) {
+			entry->item.tree = lookup_tree(file_sha1);
+			obj = &entry->item.tree->object;
 		} else {
-			obj = &lookup_blob(file_sha1)->object;
+			entry->item.blob = lookup_blob(file_sha1);
+			obj = &entry->item.blob->object;
 		}
 		add_ref(&item->object, obj);
+
+		*list_p = entry;
+		list_p = &entry->next;
 	}
 	return 0;
 }
Index: tree.h
===================================================================
--- 1172a9b8f45b2fd640985595cc5258db3b027828/tree.h  (mode:100644 sha1:14ebbacded09d5e058c7f94652dcb9e12bc31cae)
+++ 7e5a0d93117ecadfb15de3a6bebdb1aa94234fde/tree.h  (mode:100644 sha1:985500e2a9130fe8c33134ca121838af9320c465)
@@ -5,9 +5,20 @@
 
 extern const char *tree_type;
 
+struct tree_entry_list {
+	struct tree_entry_list *next;
+	unsigned directory : 1;
+	unsigned executable : 1;
+	union {
+		struct tree *tree;
+		struct blob *blob;
+	} item;
+};
+
 struct tree {
 	struct object object;
 	unsigned has_full_path : 1;
+	struct tree_entry_list *entries;
 };
 
 struct tree *lookup_tree(unsigned char *sha1);
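The list_p variable in the tree.c change keeps the entries in the order they appear in the tree by always pointing at the slot where the next entry belongs. A minimal standalone sketch of that tail-pointer idiom, with struct entry and append as illustrative names that are not in the patch:

```c
#include <stdlib.h>

/* Illustrative list node; the patch uses struct tree_entry_list. */
struct entry {
	struct entry *next;
	int value;
};

/*
 * Store a new entry in the slot *tail_p and return the address of its
 * next field, which is the slot for the following entry.
 * Returns NULL if malloc fails.
 */
static struct entry **append(struct entry **tail_p, int value)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->value = value;
	e->next = NULL;
	*tail_p = e;
	return &e->next;
}
```

Starting with tail = &head and doing tail = append(tail, v) for each item builds the list front to back in constant time per item, with no special case for the empty list.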
[2/4] Sorting commits by date
Functions for a date-ordered queue of commits, pulled out of the history
incrementally. Linus wanted this for finding the most recent common
ancestor, and it might be relevant to logging.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]
Index: commit.c
===================================================================
--- b3cf8daf9b619ae9f06a28f42a4ae01b69729206/commit.c  (mode:100644 sha1:0099baa63971d86ee30ef2a7da25057f0f45a964)
+++ 7e5a0d93117ecadfb15de3a6bebdb1aa94234fde/commit.c  (mode:100644 sha1:ef9af397471817837e1799d72f6707e0ccc949b9)
@@ -83,3 +83,47 @@
 		free(temp);
 	}
 }
+
+static void insert_by_date(struct commit_list **list, struct commit *item)
+{
+	struct commit_list **pp = list;
+	struct commit_list *p;
+	while ((p = *pp) != NULL) {
+		if (p->item->date < item->date) {
+			break;
+		}
+		pp = &p->next;
+	}
+	struct commit_list *insert = malloc(sizeof(struct commit_list));
+	insert->next = *pp;
+	*pp = insert;
+	insert->item = item;
+}
+
+
+void sort_by_date(struct commit_list **list)
+{
+	struct commit_list *ret = NULL;
+	while (*list) {
+		insert_by_date(&ret, (*list)->item);
+		*list = (*list)->next;
+	}
+	*list = ret;
+}
+
+struct commit *pop_most_recent_commit(struct commit_list **list)
+{
+	struct commit *ret = (*list)->item;
+	struct commit_list *parents = ret->parents;
+	struct commit_list *old = *list;
+
+	*list = (*list)->next;
+	free(old);
+
+	while (parents) {
+		parse_commit(parents->item);
+		insert_by_date(list, parents->item);
+		parents = parents->next;
+	}
+	return ret;
+}
Index: commit.h
===================================================================
--- b3cf8daf9b619ae9f06a28f42a4ae01b69729206/commit.h  (mode:100644 sha1:8cd20b046875f5f7e534b0607fdd97f330f53272)
+++ 7e5a0d93117ecadfb15de3a6bebdb1aa94234fde/commit.h  (mode:100644 sha1:35679482132ae5a6b7d72bbb684f21472470717c)
@@ -24,4 +24,8 @@
 
 void free_commit_list(struct commit_list *list);
 
+void sort_by_date(struct commit_list **list);
+
+struct commit *pop_most_recent_commit(struct commit_list **list);
+
 #endif /* COMMIT_H */
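To see how the date-ordered queue walks a history, here is a small standalone sketch of the same two operations. struct node, struct queue and pop_most_recent are stand-ins invented for this example (the patch works on struct commit and struct commit_list), and parents are kept in a fixed two-slot array only to keep the sketch short:

```c
#include <stdlib.h>

/* Stand-in for a commit: a date and up to two parents. */
struct node {
	unsigned long date;
	struct node *parents[2];	/* unused slots are NULL */
};

struct queue {
	struct node *item;
	struct queue *next;
};

/* Insert item so that the queue stays sorted, newest date first. */
static void insert_by_date(struct queue **list, struct node *item)
{
	struct queue **pp = list;
	struct queue *q;

	while (*pp && (*pp)->item->date >= item->date)
		pp = &(*pp)->next;
	q = malloc(sizeof(*q));
	if (!q)
		abort();
	q->item = item;
	q->next = *pp;
	*pp = q;
}

/* Remove and return the newest node, queueing its parents in its place. */
static struct node *pop_most_recent(struct queue **list)
{
	struct queue *old = *list;
	struct node *ret = old->item;
	int i;

	*list = old->next;
	free(old);
	for (i = 0; i < 2; i++)
		if (ret->parents[i])
			insert_by_date(list, ret->parents[i]);
	return ret;
}
```

Seeding the queue with one or more heads and popping repeatedly yields commits newest first. This sketch does not check whether a node has been queued before, so a node reachable along two paths comes out twice; a caller that cares would need to mark nodes as seen.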
[3/4] Add http-pull
This adds a command to pull a commit and dependent objects from an HTTP
server.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]
Index: Makefile
===================================================================
--- 50afb5dd4184842d8da1da8dcb9ca6a591dfc5b0/Makefile  (mode:100644 sha1:803f1d49c436efa570d779db6d350efbceb29ddd)
+++ f7f62e0d2a822ad0937fd98a826f65ac7f938217/Makefile  (mode:100644 sha1:a3d26213c085e8b6bbc1ec352df0996e558e7c38)
@@ -15,7 +15,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-base merge-cache unpack-file
+	check-files ls-tree merge-base merge-cache unpack-file http-pull
 
 all: $(PROG)
 
@@ -81,6 +81,11 @@
 unpack-file: unpack-file.o $(LIB_FILE)
 	$(CC) $(CFLAGS) -o unpack-file unpack-file.o $(LIBS)
 
+http-pull: LIBS += -lcurl
+
+http-pull: http-pull.o $(LIB_FILE)
+	$(CC) $(CFLAGS) -o http-pull http-pull.o $(LIBS)
+
 blob.o: $(LIB_H)
 cat-file.o: $(LIB_H)
 check-files.o: $(LIB_H)
@@ -105,6 +110,7 @@
 usage.o: $(LIB_H)
 unpack-file.o: $(LIB_H)
 write-tree.o: $(LIB_H)
+http-pull.o: $(LIB_H)
 
 clean:
 	rm -f *.o $(PROG) $(LIB_FILE)
Index: http-pull.c
===================================================================
--- /dev/null  (tree:50afb5dd4184842d8da1da8dcb9ca6a591dfc5b0)
+++ f7f62e0d2a822ad0937fd98a826f65ac7f938217/http-pull.c  (mode:100644 sha1:bd251f9e0748784bbd2cd5cf720f126d852fe888)
@@ -0,0 +1,170 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "commit.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int tree = 0;
+static int commits = 0;
+static int all = 0;
+
+static int has(unsigned char *sha1)
+{
+	char *filename = sha1_file_name(sha1);
+	struct stat st;
+
+	if (!stat(filename, &st))
+		return 1;
+	return 0;
+}
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+
+	char *url;
+	char *posn;
+	FILE *local;
+	struct stat st;
+
+	if (!stat(filename, &st)) {
+		return 0;
+	}
+
+	local = fopen(filename, "w");
+
+	if (!local)
+		return error("Couldn't open %s\n", filename);
+
+	curl_easy_setopt(curl, CURLOPT_FILE, local);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	printf("Getting %s\n", hex);
+
+	if (curl_easy_perform(curl))
+		return error("Couldn't get %s for %s\n", url, hex);
+
+	fclose(local);
+
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	struct tree *tree = lookup_tree(sha1);
+	struct tree_entry_list *entries;
+
+	if (parse_tree(tree))
+		return -1;
+
+	for (entries = tree->entries; entries; entries = entries->next) {
+		if (fetch(entries->item.tree->object.sha1))
+			return -1;
+		if (entries->directory) {
+			if (process_tree(entries->item.tree->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct commit *obj = lookup_commit(sha1);
+
+	if (fetch(sha1))
+		return -1;
+
+	if (parse_commit(obj))
+		return -1;
+
+	if (tree) {
+		if (fetch(obj->tree->object.sha1))
+			return -1;
+		if (process_tree(obj->tree->object.sha1))
+			return -1;
+		if (!all)
+			tree = 0;
+	}
+	if (commits) {
+		struct commit_list *parents = obj->parents;
+		for (; parents; parents = parents->next) {
+			if (has(parents->item->object.sha1))
+				continue;
+			if (fetch(parents->item->object.sha1)) {
+				/* The server might not have it, and
+				 * we don't mind.
+				 */
+				continue;
+			}
+			if (process_commit(parents->item->object.sha1))
+				return -1;
+		}
+	}
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id;
+	char *url;
+	int arg = 1;
+	unsigned char sha1[20];
+
+	while (arg < argc && argv[arg][0] == '-') {
+		if (argv[arg][1] == 't
[2/5] Add merge-base
merge-base finds one of the best common ancestors of a pair of commits. In
particular, it finds one of the ones which is fewest commits away from the
further of the heads.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]
Index: Makefile
===================================================================
--- 37a0b01b85c2999243674d48bfc71cdba0e5518e/Makefile  (mode:100644 sha1:346e3850de026485802e41e16a1180be2df85e4a)
+++ d662b707e11391f6cfe597fd4d0bf9c41d34d01a/Makefile  (mode:100644 sha1:b2ce7c5b63fffca59653b980d98379909f893d44)
@@ -14,7 +14,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree
+	check-files ls-tree merge-base
 
 SCRIPT=	parent-id tree-id git gitXnormid.sh gitadd.sh gitaddremote.sh \
 	gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
Index: merge-base.c
===================================================================
--- /dev/null  (tree:37a0b01b85c2999243674d48bfc71cdba0e5518e)
+++ d662b707e11391f6cfe597fd4d0bf9c41d34d01a/merge-base.c  (mode:100644 sha1:0f85e7d9e9a896d1142a54170ddf1159f11f9cdd)
@@ -0,0 +1,108 @@
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+
+struct revision *common_ancestor(struct revision *rev1, struct revision *rev2)
+{
+	struct parent *parent;
+
+	struct parent *rev1list = malloc(sizeof(struct parent));
+	struct parent *rev2list = malloc(sizeof(struct parent));
+
+	struct parent *posn, *temp;
+
+	rev1list->parent = rev1;
+	rev1list->next = NULL;
+
+	rev2list->parent = rev2;
+	rev2list->next = NULL;
+
+	while (rev1list || rev2list) {
+		posn = rev1list;
+		rev1list = NULL;
+		while (posn) {
+			parse_commit_object(posn->parent);
+			if (posn->parent->flags & 0x0001) {
+				/*
+				  printf("1 already seen %s %x\n",
+				  sha1_to_hex(posn->parent->sha1),
+				  posn->parent->flags);
+				*/
+				// do nothing
+			} else if (posn->parent->flags & 0x0002) {
+				// free lists
+				return posn->parent;
+			} else {
+				/*
+				  printf("1 based on %s\n",
+				  sha1_to_hex(posn->parent->sha1));
+				*/
+				posn->parent->flags |= 0x0001;
+
+				parent = posn->parent->parent;
+				while (parent) {
+					temp = malloc(sizeof(struct parent));
+					temp->next = rev1list;
+					temp->parent = parent->parent;
+					rev1list = temp;
+					parent = parent->next;
+				}
+			}
+			posn = posn->next;
+		}
+		posn = rev2list;
+		rev2list = NULL;
+		while (posn) {
+			parse_commit_object(posn->parent);
+			if (posn->parent->flags & 0x0002) {
+				/*
+				  printf("2 already seen %s\n",
+				  sha1_to_hex(posn->parent->sha1));
+				*/
+				// do nothing
+			} else if (posn->parent->flags & 0x0001) {
+				// free lists
+				return posn->parent;
+			} else {
+				/*
+				  printf("2 based on %s\n",
+				  sha1_to_hex(posn->parent->sha1));
+				*/
+				posn->parent->flags |= 0x0002;
+
+				parent = posn->parent->parent;
+				while (parent) {
+					temp = malloc(sizeof(struct parent));
+					temp->next = rev2list;
+					temp->parent = parent->parent;
+					rev2list = temp;
+					parent = parent->next;
+				}
+			}
+			posn = posn->next;
+		}
+	}
+	return NULL;
+}
+
+int main(int argc, char **argv)
+{
+	struct revision *rev1, *rev2, *ret;
+	unsigned char rev1key[20], rev2key[20];
+	if (argc != 3 ||
+	    get_sha1_hex(argv[1], rev1key
[3/5] Add http-pull
http-pull is a program that downloads from a (normal) HTTP server a commit
and all of the tree and blob objects it refers to (but not other commits,
etc.). Options could be used to make it download a larger or different
selection of objects. It depends on libcurl, which I forgot to mention in
the README again.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]
Index: Makefile
===================================================================
--- d662b707e11391f6cfe597fd4d0bf9c41d34d01a/Makefile  (mode:100644 sha1:b2ce7c5b63fffca59653b980d98379909f893d44)
+++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/Makefile  (mode:100644 sha1:940ef8578cf469354002cd8feaec25d907015267)
@@ -14,7 +14,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-base
+	check-files ls-tree http-pull merge-base
 
 SCRIPT=	parent-id tree-id git gitXnormid.sh gitadd.sh gitaddremote.sh \
 	gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
@@ -35,6 +35,7 @@
 
 LIBS= -lssl -lz
 
+http-pull: LIBS += -lcurl
 
 $(PROG):%: %.o $(COMMON)
 	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
Index: http-pull.c
===================================================================
--- /dev/null  (tree:d662b707e11391f6cfe597fd4d0bf9c41d34d01a)
+++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/http-pull.c  (mode:100644 sha1:106ca31239e6afe6784e7c592234406f5c149e44)
@@ -0,0 +1,126 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+
+	char *url;
+	char *posn;
+	FILE *local;
+	struct stat st;
+
+	if (!stat(filename, &st)) {
+		return 0;
+	}
+
+	local = fopen(filename, "w");
+
+	if (!local) {
+		fprintf(stderr, "Couldn't open %s\n", filename);
+		return -1;
+	}
+
+	curl_easy_setopt(curl, CURLOPT_FILE, local);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	curl_easy_perform(curl);
+
+	fclose(local);
+
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	void *buffer;
+	unsigned long size;
+	char type[20];
+
+	buffer = read_sha1_file(sha1, type, &size);
+	if (!buffer)
+		return -1;
+	if (strcmp(type, "tree"))
+		return -1;
+	while (size) {
+		int len = strlen(buffer) + 1;
+		unsigned char *sha1 = buffer + len;
+		unsigned int mode;
+		int retval;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			return -1;
+
+		buffer = sha1 + 20;
+		size -= len + 20;
+
+		retval = fetch(sha1);
+		if (retval)
+			return -1;
+
+		if (S_ISDIR(mode)) {
+			retval = process_tree(sha1);
+			if (retval)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct revision *rev = lookup_rev(sha1);
+	if (parse_commit_object(rev))
+		return -1;
+
+	fetch(rev->tree);
+	process_tree(rev->tree);
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id = argv[1];
+	char *url = argv[2];
+
+	unsigned char sha1[20];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL);
+
+	curl = curl_easy_init();
+
+	base = url;
+
+	fetch(sha1);
+	process_commit(sha1);
+
+	curl_global_cleanup();
+	return 0;
+}
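process_tree() above relies on the layout of a tree buffer as that code reads it: an octal mode, a space, the path, a NUL, then 20 raw SHA-1 bytes, repeated until the buffer runs out. A standalone sketch of decoding a single entry under that assumption (parse_tree_entry is a name made up for this example, and unlike the patch it uses memchr so that it never reads past the end of a truncated buffer):

```c
#include <stdio.h>
#include <string.h>

/*
 * Decode one entry from a raw tree buffer, where each entry is
 *     "<octal mode> <path>\0" followed by 20 raw SHA-1 bytes.
 * On success, fills in mode, path and sha1 (pointing into buf) and
 * returns the number of bytes consumed; returns 0 if the entry is
 * malformed or truncated.
 */
static unsigned long parse_tree_entry(const unsigned char *buf,
				      unsigned long size,
				      unsigned int *mode,
				      const char **path,
				      const unsigned char **sha1)
{
	const unsigned char *nul = memchr(buf, '\0', size);
	const char *space;
	unsigned long len;

	if (!nul)
		return 0;
	len = (unsigned long)(nul - buf) + 1;	/* header plus its NUL */
	if (size < len + 20)
		return 0;
	space = strchr((const char *)buf, ' ');
	if (!space || sscanf((const char *)buf, "%o", mode) != 1)
		return 0;
	*path = space + 1;
	*sha1 = buf + len;
	return len + 20;
}
```

A caller would loop, advancing the buffer by the returned count and stopping on 0, fetching each sha1 and recursing when S_ISDIR(mode) is true, which is what process_tree() does inline.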
[2.1/5] Add merge-base
merge-base finds one of the best common ancestors of a pair of commits. In
particular, it finds one of the ones which is fewest commits away from the
further of the heads.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]
Index: Makefile
===================================================================
--- 45f926575d2c44072bfcf2317dbf3f0fbb513a4e/Makefile  (mode:100644 sha1:346e3850de026485802e41e16a1180be2df85e4a)
+++ 7d806c2d3be8f87d3d4d87e5254500d7fc24476b/Makefile  (mode:100644 sha1:0e84e3cd12f836602b420c197e08fabefe975493)
@@ -14,7 +17,7 @@
 
 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree
+	check-files ls-tree merge-base
 
 SCRIPT=	parent-id tree-id git gitXnormid.sh gitadd.sh gitaddremote.sh \
 	gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
Index: merge-base.c
===================================================================
--- /dev/null  (tree:45f926575d2c44072bfcf2317dbf3f0fbb513a4e)
+++ 7d806c2d3be8f87d3d4d87e5254500d7fc24476b/merge-base.c  (mode:100644 sha1:ee979c7532cbdf823e9930993b0dd8f97aadb21f)
@@ -0,0 +1,95 @@
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+
+static struct revision *process_list(struct parent **list_p, int this_mark,
+				     int other_mark)
+{
+	struct parent *parent, *temp;
+	struct parent *posn = *list_p;
+	*list_p = NULL;
+	while (posn) {
+		parse_commit_object(posn->parent);
+		if (posn->parent->flags & this_mark) {
+			/*
+			  printf("%d already seen %s %x\n",
+			  this_mark
+			  sha1_to_hex(posn->parent->sha1),
+			  posn->parent->flags);
+			*/
+			/* do nothing; this indicates that this side
+			 * split and reformed, and we only need to
+			 * mark it once.
+			 */
+		} else if (posn->parent->flags & other_mark) {
+			return posn->parent;
+		} else {
+			/*
+			  printf("%d based on %s\n",
+			  this_mark,
+			  sha1_to_hex(posn->parent->sha1));
+			*/
+			posn->parent->flags |= this_mark;
+
+			parent = posn->parent->parent;
+			while (parent) {
+				temp = malloc(sizeof(struct parent));
+				temp->next = *list_p;
+				temp->parent = parent->parent;
+				*list_p = temp;
+				parent = parent->next;
+			}
+		}
+		posn = posn->next;
+	}
+	return NULL;
+}
+
+struct revision *common_ancestor(struct revision *rev1, struct revision *rev2)
+{
+	struct parent *rev1list = malloc(sizeof(struct parent));
+	struct parent *rev2list = malloc(sizeof(struct parent));
+
+	rev1list->parent = rev1;
+	rev1list->next = NULL;
+
+	rev2list->parent = rev2;
+	rev2list->next = NULL;
+
+	while (rev1list || rev2list) {
+		struct revision *ret;
+		ret = process_list(&rev1list, 0x1, 0x2);
+		if (ret) {
+			/* free lists */
+			return ret;
+		}
+		ret = process_list(&rev2list, 0x2, 0x1);
+		if (ret) {
+			/* free lists */
+			return ret;
+		}
+	}
+	return NULL;
+}
+
+int main(int argc, char **argv)
+{
+	struct revision *rev1, *rev2, *ret;
+	unsigned char rev1key[20], rev2key[20];
+
+	if (argc != 3 ||
+	    get_sha1_hex(argv[1], rev1key) ||
+	    get_sha1_hex(argv[2], rev2key)) {
+		usage("merge-base <commit-id> <commit-id>");
+	}
+	rev1 = lookup_rev(rev1key);
+	rev2 = lookup_rev(rev2key);
+	ret = common_ancestor(rev1, rev2);
+	if (ret) {
+		printf("%s\n", sha1_to_hex(ret->sha1));
+		return 0;
+	} else {
+		return 1;
+	}
+
+}
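The search in this patch advances from both heads one generation at a time, marking each revision with the side that reached it and stopping at the first revision already marked by the other side. A self-contained sketch of that idea on an in-memory graph, where struct rev, struct work, push, drop and step are stand-ins invented here (the patch uses struct revision and struct parent and leaves freeing the lists as a comment):

```c
#include <stdlib.h>

#define MAX_PARENTS 2

/* Hypothetical stand-in for a revision: mark bits plus parent links. */
struct rev {
	unsigned int flags;			/* must start out as 0 */
	struct rev *parents[MAX_PARENTS];	/* unused slots are NULL */
};

struct work {
	struct rev *rev;
	struct work *next;
};

static void push(struct work **list, struct rev *rev)
{
	struct work *w = malloc(sizeof(*w));

	if (!w)
		abort();
	w->rev = rev;
	w->next = *list;
	*list = w;
}

static void drop(struct work *list)
{
	while (list) {
		struct work *next = list->next;
		free(list);
		list = next;
	}
}

/*
 * Advance one side by a generation: mark each queued revision with
 * this_mark and queue its parents.  Returns the first revision found
 * that already carries other_mark, or NULL if there is none.
 */
static struct rev *step(struct work **list, unsigned int this_mark,
			unsigned int other_mark)
{
	struct work *posn = *list;
	struct rev *found = NULL;

	*list = NULL;
	while (posn && !found) {
		struct work *next = posn->next;
		struct rev *r = posn->rev;
		int i;

		if (r->flags & this_mark) {
			/* already visited from this side */
		} else if (r->flags & other_mark) {
			found = r;
		} else {
			r->flags |= this_mark;
			for (i = 0; i < MAX_PARENTS && r->parents[i]; i++)
				push(list, r->parents[i]);
		}
		free(posn);
		posn = next;
	}
	drop(posn);	/* anything left unprocessed after a hit */
	return found;
}

static struct rev *common_ancestor(struct rev *a, struct rev *b)
{
	struct work *list1 = NULL, *list2 = NULL;
	struct rev *ret = NULL;

	push(&list1, a);
	push(&list2, b);
	while (!ret && (list1 || list2)) {
		ret = step(&list1, 0x1, 0x2);
		if (!ret)
			ret = step(&list2, 0x2, 0x1);
	}
	drop(list1);
	drop(list2);
	return ret;
}
```

Because the marks are left on the revisions, a second query over the same graph needs the flags cleared first.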
Re: [patch] fork optional branch point normazilation
On Sun, 17 Apr 2005, Linus Torvalds wrote:

> On Sun, 17 Apr 2005, Brad Roberts wrote:
> >
> > (ok, author looks better, but committer doesn't obey the AUTHOR_ vars
> > yet)
>
> They shouldn't, but maybe I should add COMMITTER_xxx overrides. I just do
> _not_ want people to think that they should claim to be somebody else:
> it's not a security issue (you could compile your own commit-tree.c
> after all), it's more of a social rule thing. I prefer seeing bad email
> addresses that at least match the system setup to seeing good email
> addresses that people made up just to make them look clean.

It seems to me like there should be a set of variables for the user in
general, and the various git scripts should arrange them appropriately
(e.g., git apply could look for a first Signed-Off-By, and make the
AUTHOR_ variables match that (for the next commit), while making the
COMMITTER match the user, etc). It seems to me like the current situation
is likely to lead to people claiming to be other people when applying
their patches, just due to having set up their correct info for handling
their own patches.

Actually, if the scripts are reorganizing them, they might as well send
them on the command line.

	-Daniel
Re: [3/5] Add http-pull
On Sun, 17 Apr 2005, Petr Baudis wrote:

> > Index: Makefile
> > ===================================================================
> > --- d662b707e11391f6cfe597fd4d0bf9c41d34d01a/Makefile  (mode:100644 sha1:b2ce7c5b63fffca59653b980d98379909f893d44)
> > +++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/Makefile  (mode:100644 sha1:940ef8578cf469354002cd8feaec25d907015267)
> > @@ -35,6 +35,7 @@
> >
> >  LIBS= -lssl -lz
> >
> > +http-pull: LIBS += -lcurl
> >
> >  $(PROG):%: %.o $(COMMON)
> >  	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
>
> Whew. Looks like an awful trick, you say this works?! :-)
>
> At times, I wouldn't want to be a GNU make parser.

Yup. GNU make is big on the features which do the obvious thing, even when
you can't believe they work. This is probably why nobody's managed to
replace it.

> > Index: http-pull.c
> > ===================================================================
> > --- /dev/null  (tree:d662b707e11391f6cfe597fd4d0bf9c41d34d01a)
> > +++ 157b46ce1d82b3579e2e1258927b0d9bdbc033ab/http-pull.c  (mode:100644 sha1:106ca31239e6afe6784e7c592234406f5c149e44)
> > +	url = malloc(strlen(base) + 50);
>
> Off-by-one. What about the trailing NUL?

I get length(base) + "objects/"=8 + 40 SHA1 + 1 for '/' and 1 for NUL = 50.

> I think you should have at least two disjunct modes - either you are
> downloading everything related to the given commit, or you are
> downloading all commit records for commit predecessors.
>
> Even if you might not want all the intermediate trees, you definitively
> want the intermediate commits, to keep the history graph contiguous.
>
> So in git pull, I'd imagine to do
>
> 	http-pull -c $new_head
> 	http-pull -t $(tree-id $new_head)
>
> So, -c would fetch a given commit and all its predecessors until it hits
> what you already have on your side. -t would fetch a given tree with all
> files and subtrees and everything. http-pull shouldn't default on
> either, since they are mutually exclusive.
>
> What do you think?

I think I'd rather keep the current behavior and add a -c for getting the
history of commits, and maybe a -a for getting the history of commits and
their trees.

There's some trickiness for the history of commits thing for stopping at
the point where you have everything, but also behaving appropriately if
you try once, fail partway through, and then try again. It's on my queue
of things to think about.

	-Daniel
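The buffer-size arithmetic from the exchange above can be checked mechanically. A minimal sketch, assuming base already ends in a slash and hex is a 40-character object name; object_url is a name invented for this example, and it uses memcpy with explicit lengths instead of the patch's strcpy so that each term of the sum is visible:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Build "<base>objects/<2 hex>/<38 hex>" for a 40-character hex object
 * name.  Everything after base takes 8 + 2 + 1 + 38 = 49 characters,
 * plus 1 for the terminating NUL: 50 bytes beyond strlen(base).
 * Returns a malloc'd string, or NULL on allocation failure.
 */
static char *object_url(const char *base, const char *hex)
{
	size_t baselen = strlen(base);
	char *url = malloc(baselen + 50);
	char *posn;

	if (!url)
		return NULL;
	memcpy(url, base, baselen);
	posn = url + baselen;
	memcpy(posn, "objects/", 8);
	posn += 8;
	memcpy(posn, hex, 2);		/* fan-out directory */
	posn += 2;
	*posn++ = '/';
	memcpy(posn, hex + 2, 38);	/* rest of the object name */
	posn += 38;
	*posn = '\0';
	return url;
}
```

The result always occupies exactly strlen(base) + 50 bytes including the NUL, so the allocation is tight but not short.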
[3.1/5] Add http-pull
http-pull is a program that downloads from a (normal) HTTP server a commit
and all of the tree and blob objects it refers to (but not other commits,
etc.). Options could be used to make it download a larger or different
selection of objects.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]

Index: Makefile
===
--- 45f926575d2c44072bfcf2317dbf3f0fbb513a4e/Makefile  (mode:100644 sha1:346e3850de026485802e41e16a1180be2df85e4a)
+++ 3eae85f66143160a26f5545d197862c89e2a8fb8/Makefile  (mode:100644 sha1:0e84e3cd12f836602b420c197e08fabefe975493)
@@ -14,7 +17,7 @@

 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-base
+	check-files ls-tree http-pull merge-base

 SCRIPT=	parent-id tree-id git gitXnormid.sh gitadd.sh gitaddremote.sh \
 	gitcommit.sh gitdiff-do gitdiff.sh gitlog.sh gitls.sh gitlsobj.sh \
@@ -35,6 +38,7 @@

 LIBS= -lssl -lz

+http-pull: LIBS += -lcurl

 $(PROG):%: %.o $(COMMON)
 	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
Index: README
===
--- 45f926575d2c44072bfcf2317dbf3f0fbb513a4e/README  (mode:100664 sha1:0170eafb60ad9009ca41c6536cecd6d1fdee5b86)
+++ 3eae85f66143160a26f5545d197862c89e2a8fb8/README  (mode:100664 sha1:921d552d810394e665323ec82b4826914918689c)
@@ -120,7 +120,7 @@
 	diff, patch
 	libssl
 	rsync
-
+	curl (later than 7.7, according to the docs)

 	The core GIT
Index: http-pull.c
===
--- /dev/null  (tree:45f926575d2c44072bfcf2317dbf3f0fbb513a4e)
+++ 3eae85f66143160a26f5545d197862c89e2a8fb8/http-pull.c  (mode:100644 sha1:7ba4ad67f6dac34addb537ee147ae3de0550a484)
@@ -0,0 +1,139 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+
+	char *url;
+	char *posn;
+	FILE *local;
+
+	if (!access(filename, R_OK)) {
+		return 0;
+	}
+
+	local = fopen(filename, "w");
+
+	if (!local) {
+		return error("Couldn't open %s", filename);
+	}
+
+	curl_easy_setopt(curl, CURLOPT_FILE, local);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	if (curl_easy_perform(curl)) {
+		fclose(local);
+		unlink(filename);
+		return error("Error downloading %s from %s",
+			     sha1_to_hex(sha1), url);
+	}
+
+	fclose(local);
+
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	void *buffer;
+	unsigned long size;
+	char type[20];
+
+	buffer = read_sha1_file(sha1, type, &size);
+	if (!buffer)
+		return error("Couldn't read %s.",
+			     sha1_to_hex(sha1));
+	if (strcmp(type, "tree"))
+		return error("Expected %s to be a tree, but was a %s.",
+			     sha1_to_hex(sha1), type);
+	while (size) {
+		int len = strlen(buffer) + 1;
+		unsigned char *sha1 = buffer + len;
+		unsigned int mode;
+		int retval;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			return error("Invalid tree object");
+
+		buffer = sha1 + 20;
+		size -= len + 20;
+
+		retval = fetch(sha1);
+		if (retval)
+			return retval;
+
+		if (S_ISDIR(mode)) {
+			retval = process_tree(sha1);
+			if (retval)
+				return retval;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	int retval;
+	struct revision *rev = lookup_rev(sha1);
+	if (parse_commit_object(rev))
+		return error("Couldn't parse commit %s\n", sha1_to_hex(sha1));
+
+	retval = fetch(rev->tree);
+	if (retval)
+		return retval;
+	retval = process_tree(rev->tree);
+	return retval;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id = argv[1];
+	char *url = argv[2];
+	int retval;
+
+	unsigned char sha1[20];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL
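Editorial sketch, not part of the patch: fetch() above builds the object URL by hand, appending "objects/", the first two hex digits, a slash, and the remaining 38 digits to the base URL (which therefore has to end in a slash). The 50 spare bytes in the malloc() are exactly 8 + 2 + 1 + 38 + 1. Assuming a hypothetical helper name object_url, the same layout can be produced with a length-checked snprintf():

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<base>objects/xx/yyyy..." for a 40-character hex object name.
 * Returns 0 on success, -1 if the name has the wrong length or the
 * output buffer is too small. object_url is a hypothetical name used
 * only for this illustration; it does not appear in the patch.
 */
static int object_url(char *out, size_t outlen, const char *base, const char *hex)
{
	int n;

	if (strlen(hex) != 40)
		return -1;
	/* The first two hex digits name the directory, the other 38 the file. */
	n = snprintf(out, outlen, "%sobjects/%.2s/%s", base, hex, hex + 2);
	if (n < 0 || (size_t)n >= outlen)
		return -1;
	return 0;
}
```

A caller would pass the result to curl_easy_setopt(curl, CURLOPT_URL, url) as the patch does; the only difference is that an overlong base URL is rejected instead of overflowing the buffer.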
Re: [1/5] Parsing code in revision.h
On Sun, 17 Apr 2005, Linus Torvalds wrote:

> On Sun, 17 Apr 2005, Daniel Barkalow wrote:
> >
> > --- 45f926575d2c44072bfcf2317dbf3f0fbb513a4e/revision.h  (mode:100644 sha1:28d0de3261a61f68e4e0948a25a416a515cd2e83)
> > +++ 37a0b01b85c2999243674d48bfc71cdba0e5518e/revision.h  (mode:100644 sha1:523bde6e14e18bb0ecbded8f83ad4df93fc467ab)
> > @@ -24,6 +24,7 @@
> >  	unsigned int flags;
> >  	unsigned char sha1[20];
> >  	unsigned long date;
> > +	unsigned char tree[20];
> >  	struct parent *parent;
> >  };
>
> I think this is really wrong. The whole point of revision.h is that
> it's a generic framework for keeping track of relationships between
> different objects. And those objects are in no way just commit objects.
>
> For example, fsck uses this struct revision to create a full tree of
> _all_ the object dependencies, which means that a struct revision can
> be any object at all - it's not in any way limited to commit objects,
> and there is no tree object that is associated with these things at
> all.

I entirely missed this. No wonder my fsck-cache conversion wasn't going so
well...

> Besides, why do you want the tree? There's really nothing you can do
> with the tree to a first approximation - you need to _first_ do the
> reachability analysis entirely on the commit dependencies, and then
> when you've selected a set of commits, you can just output those.

I actually want the tree for http-pull, not merging stuff. I was trying to
get a commit parser, not reachability at that point. I think the right
thing is to make a separate struct commit that has the stuff I want in it,
and probably do a struct tree at the same time.

	-Daniel
*This .sig left intentionally blank*
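Editorial sketch of the direction proposed in the last paragraph: a generic object header that any object type can carry, with commit-only fields (date, tree) moved into a separate struct commit. The field names loosely follow the [2/5] patch later in this archive, but the exact layout and the helper object_as_commit are assumptions made for illustration.

```c
#include <stddef.h>

/* Generic part shared by every object type (layout is an assumption). */
struct object {
	unsigned char sha1[20];
	const char *type;
	unsigned parsed : 1;
};

struct tree {
	struct object object;
};

struct commit {
	struct object object;	/* must stay first so the cast below is valid */
	unsigned long date;
	struct tree *tree;	/* only commits have an associated tree */
};

static const char *commit_type = "commit";
static const char *tree_type = "tree";

/*
 * Return the commit view of obj, or NULL if obj is some other kind of
 * object. Types are compared by pointer, so each object's type field
 * must point at one of the shared type strings above.
 * object_as_commit is a hypothetical helper name.
 */
static struct commit *object_as_commit(struct object *obj)
{
	if (!obj || obj->type != commit_type)
		return NULL;
	return (struct commit *)obj;
}
```

With this split, a tool like fsck can keep tracking plain struct object relationships, while http-pull reads the tree only from objects it knows to be commits.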
Re: [3/5] Add http-pull
On Sun, 17 Apr 2005, Petr Baudis wrote:

> Dear diary, on Sun, Apr 17, 2005 at 08:49:11PM CEST, I got a letter
> where Daniel Barkalow [EMAIL PROTECTED] told me that...
> > > I'm not too kind at this. Either make it totally separate
> > > commands, or make a required switch specifying what to do.
> > > Otherwise it implies the switches would just modify what it does,
> > > but they make it do something completely different.
> >
> > That's a good point. I'll require a -t for now, and add more later.
>
> -a would be fine too - basically a combination of -c and -t. I'd
> imagine that is what Linus would want to use, e.g.

Well, -c -t would give you the current tree and the whole commit log, but
not old trees. -a would additionally give you old trees.

> > There's some trickiness for the history of commits thing for
> > stopping at the point where you have everything, but also behaving
> > appropriately if you try once, fail partway through, and then try
> > again. It's on my queue of things to think about.
>
> Can't you just stop the recursion when you hit a commit you already
> have?

The problem is that, if you've fetched the final commit already, and then
the server dies, and you try again later, you already have the last one,
and so you think you've got everything.

At this point, I also want to put off doing much further with recursion
and commits until revision.h and such are sorted out.

	-Daniel
*This .sig left intentionally blank*
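Editorial sketch of the failure mode described above, using a toy linear history in which commit i has the single parent i - 1. The array have[] stands in for "the object file exists locally"; both function names are hypothetical. A pull that stops at the first commit it already has misses everything older than an interrupted download, while a pull that visits every ancestor does not.

```c
#include <stddef.h>

/* Toy history: commit i has the single parent i - 1; commit 0 is the root. */

/* Stop at the first commit already present. Returns how many were fetched. */
static int pull_stop_early(int *have, int tip)
{
	int fetched = 0;

	for (int i = tip; i >= 0 && !have[i]; i--) {
		have[i] = 1;	/* stands in for downloading commit i */
		fetched++;
	}
	return fetched;
}

/* Visit every ancestor, fetching only what is missing. */
static int pull_check_all(int *have, int tip)
{
	int fetched = 0;

	for (int i = tip; i >= 0; i--) {
		if (!have[i]) {
			have[i] = 1;
			fetched++;
		}
	}
	return fetched;
}
```

If a first attempt fetched commits 3 and 2 and then died, pull_stop_early() sees commit 3 and fetches nothing, whereas pull_check_all() still picks up commits 1 and 0. The cost is walking the whole history on every run, which is the trade-off the thread leaves open.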
Re: [2.1/5] Add merge-base
On Sun, 17 Apr 2005, Petr Baudis wrote:

> Dear diary, on Sun, Apr 17, 2005 at 06:51:59PM CEST, I got a letter
> where Daniel Barkalow [EMAIL PROTECTED] told me that...
> > merge-base finds one of the best common ancestors of a pair of
> > commits. In particular, it finds one of the ones which is fewest
> > commits away from the further of the heads.
> >
> > Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]
>
> Note that during merge with Linus (probably the most complicated I've
> got so far, but still thankfully not too painful thanks to the rej
> tool) I've decided to revert your merge-base in favour of Linus'
> version. I did this mainly to make me merging Linus less awful; we
> should probably clean it up first and decide which solution to go for
> in the first place before possibly replacing it again, I think.

Sure. I'm working on the rearrangement now.

	-Daniel
*This .sig left intentionally blank*
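Editorial sketch of one way to read the merge-base description quoted above ("fewest commits away from the further of the heads"): compute the distance from each head to every ancestor, then pick the common ancestor whose larger distance is smallest. This is an interpretation for illustration, not the code from the [2.1/5] patch; struct history, distances and merge_base are hypothetical names, and the graph is a small fixed-size array.

```c
#include <limits.h>

#define MAX_COMMITS 16

/* parents[i][k] is the index of a parent of commit i, or -1 for none. */
struct history {
	int count;			/* must not exceed MAX_COMMITS */
	int parents[MAX_COMMITS][2];
};

/* Breadth-first search: dist[i] = fewest parent links from head to i. */
static void distances(const struct history *h, int head, int *dist)
{
	int queue[MAX_COMMITS], qhead = 0, qtail = 0;

	for (int i = 0; i < h->count; i++)
		dist[i] = INT_MAX;	/* INT_MAX marks "not reachable" */
	dist[head] = 0;
	queue[qtail++] = head;
	while (qhead < qtail) {
		int cur = queue[qhead++];

		for (int k = 0; k < 2; k++) {
			int p = h->parents[cur][k];

			if (p >= 0 && dist[p] == INT_MAX) {
				dist[p] = dist[cur] + 1;
				queue[qtail++] = p;
			}
		}
	}
}

/* Return a common ancestor minimizing max(distance from a, distance from b),
 * or -1 if the two heads share no history. */
static int merge_base(const struct history *h, int a, int b)
{
	int da[MAX_COMMITS], db[MAX_COMMITS];
	int best = -1, best_cost = INT_MAX;

	distances(h, a, da);
	distances(h, b, db);
	for (int i = 0; i < h->count; i++) {
		int cost;

		if (da[i] == INT_MAX || db[i] == INT_MAX)
			continue;	/* not reachable from both heads */
		cost = da[i] > db[i] ? da[i] : db[i];
		if (cost < best_cost) {
			best_cost = cost;
			best = i;
		}
	}
	return best;
}
```

When several ancestors tie, this returns the lowest-numbered one, which matches the "one of the best" wording: the choice among equally good candidates is arbitrary.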
[2/5] Implementations of parsing functions
This implements the parsing functions.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]

Index: blob.c
===
--- /dev/null  (tree:5ca133e1b74aee39b2124c0ec9fd51539babb5e0)
+++ 1172a9b8f45b2fd640985595cc5258db3b027828/blob.c  (mode:100644 sha1:04e0c1da9b1f4cdb1d1c5881b785babd3b0ceb09)
@@ -0,0 +1,24 @@
+#include "blob.h"
+#include "cache.h"
+#include <stdlib.h>
+
+const char *blob_type = "blob";
+
+struct blob *lookup_blob(unsigned char *sha1)
+{
+	struct object *obj = lookup_object(sha1);
+	if (!obj) {
+		struct blob *ret = malloc(sizeof(struct blob));
+		bzero(ret, sizeof(struct blob));
+		created_object(sha1, &ret->object);
+		ret->object.type = blob_type;
+		ret->object.parsed = 1;
+		return ret;
+	}
+	if (obj->parsed && obj->type != blob_type) {
+		error("Object %s is a %s, not a blob",
+		      sha1_to_hex(sha1), obj->type);
+		return NULL;
+	}
+	return (struct blob *) obj;
+}
Index: commit.c
===
--- /dev/null  (tree:5ca133e1b74aee39b2124c0ec9fd51539babb5e0)
+++ 1172a9b8f45b2fd640985595cc5258db3b027828/commit.c  (mode:100644 sha1:0099baa63971d86ee30ef2a7da25057f0f45a964)
@@ -0,0 +1,85 @@
+#include "commit.h"
+#include "cache.h"
+#include <string.h>
+
+const char *commit_type = "commit";
+
+struct commit *lookup_commit(unsigned char *sha1)
+{
+	struct object *obj = lookup_object(sha1);
+	if (!obj) {
+		struct commit *ret = malloc(sizeof(struct commit));
+		bzero(ret, sizeof(struct commit));
+		created_object(sha1, &ret->object);
+		return ret;
+	}
+	if (obj->parsed && obj->type != commit_type) {
+		error("Object %s is a %s, not a commit",
+		      sha1_to_hex(sha1), obj->type);
+		return NULL;
+	}
+	return (struct commit *) obj;
+}
+
+static unsigned long parse_commit_date(const char *buf)
+{
+	unsigned long date;
+
+	if (memcmp(buf, "author", 6))
+		return 0;
+	while (*buf++ != '\n')
+		/* nada */;
+	if (memcmp(buf, "committer", 9))
+		return 0;
+	while (*buf++ != '>')
+		/* nada */;
+	date = strtoul(buf, NULL, 10);
+	if (date == ULONG_MAX)
+		date = 0;
+	return date;
+}
+
+int parse_commit(struct commit *item)
+{
+	char type[20];
+	void *buffer, *bufptr;
+	unsigned long size;
+	unsigned char parent[20];
+	if (item->object.parsed)
+		return 0;
+	item->object.parsed = 1;
+	buffer = bufptr = read_sha1_file(item->object.sha1, type, &size);
+	if (!buffer)
+		return error("Could not read %s",
+			     sha1_to_hex(item->object.sha1));
+	if (strcmp(type, commit_type))
+		return error("Object %s not a commit",
+			     sha1_to_hex(item->object.sha1));
+	item->object.type = commit_type;
+	get_sha1_hex(bufptr + 5, parent);
+	item->tree = lookup_tree(parent);
+	add_ref(&item->object, &item->tree->object);
+	bufptr += 46; /* "tree " + "hex sha1" + "\n" */
+	while (!memcmp(bufptr, "parent ", 7) &&
+	       !get_sha1_hex(bufptr + 7, parent)) {
+		struct commit_list *new_parent =
+			malloc(sizeof(struct commit_list));
+		new_parent->next = item->parents;
+		new_parent->item = lookup_commit(parent);
+		add_ref(&item->object, &new_parent->item->object);
+		item->parents = new_parent;
+		bufptr += 48;
+	}
+	item->date = parse_commit_date(bufptr);
+	free(buffer);
+	return 0;
+}
+
+void free_commit_list(struct commit_list *list)
+{
+	while (list) {
+		struct commit_list *temp = list;
+		list = temp->next;
+		free(temp);
+	}
+}
Index: object.c
===
--- /dev/null  (tree:5ca133e1b74aee39b2124c0ec9fd51539babb5e0)
+++ 1172a9b8f45b2fd640985595cc5258db3b027828/object.c  (mode:100644 sha1:986624ac7a7fd9229e05e1f181fd500640298d9e)
@@ -0,0 +1,96 @@
+#include "object.h"
+#include "cache.h"
+#include <stdlib.h>
+#include <string.h>
+
+struct object **objs;
+int nr_objs;
+static int obj_allocs;
+
+static int find_object(unsigned char *sha1)
+{
+	int first = 0, last = nr_objs;
+
+	while (first < last) {
+		int next = (first + last) / 2;
+		struct object *obj = objs[next];
+		int cmp;
+
+		cmp = memcmp(sha1, obj->sha1, 20);
+		if (!cmp)
+			return next;
+		if (cmp < 0) {
+			last = next;
+			continue;
+		}
+		first
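Editorial sketch: the archived copy of object.c is cut off partway through find_object(), which starts a binary search over an array of objects sorted by SHA-1. The rest of the original function is not available here, so the following is a generic, self-contained version of that technique. The name find_object_sketch and the "negative insertion point" return convention are assumptions, not a reconstruction of the missing lines.

```c
#include <string.h>

/* Minimal stand-in for the object header; only the name is needed here. */
struct object {
	unsigned char sha1[20];
};

/*
 * Binary search over an array sorted by sha1 (memcmp order).
 * Returns the index if found, otherwise -(insertion point) - 1, so the
 * caller can tell "missing" apart from "found at index 0" and knows
 * where a new entry would have to be inserted to keep the array sorted.
 */
static int find_object_sketch(struct object **objs, int nr_objs,
			      const unsigned char *sha1)
{
	int first = 0, last = nr_objs;

	while (first < last) {
		int next = first + (last - first) / 2;
		int cmp = memcmp(sha1, objs[next]->sha1, 20);

		if (!cmp)
			return next;
		if (cmp < 0)
			last = next;		/* target sorts before objs[next] */
		else
			first = next + 1;	/* target sorts after objs[next] */
	}
	return -first - 1;
}
```

A lookup function in the style of lookup_blob() or lookup_commit() above would treat a negative result as "not seen yet" and allocate a new object at the returned insertion point.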
Re: [PATCH] Get commits from remote repositories by HTTP
On Sat, 16 Apr 2005, Tony Luck wrote:

> On 4/16/05, Daniel Barkalow [EMAIL PROTECTED] wrote:
> > +buffer = read_sha1_file(sha1, type, &size);
>
> You never free this buffer.

Ideally, this should all be rearranged to share the code with read-tree,
and it should be fixed in common.

> It would also be nice if you saved tree objects in some temporary file
> and did not install them until after you had fetched all the blobs and
> trees that this tree references. Then if your connection is
> interrupted you can just restart it.

It looks over everything relevant, even if it doesn't need to download
anything, so it should work to continue if it stops in between.

	-Daniel
*This .sig left intentionally blank*
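Editorial sketch related to the temporary-file suggestion above. It shows only the simpler per-file half of the idea: write the download to a temporary name and rename() it into place once the write has completed, so an interrupted transfer never leaves a truncated object under its final name. Holding a tree back until everything it references has arrived, as suggested in the quoted text, would need bookkeeping on top of this. The function name and the ".tmp" suffix are assumptions.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Write data to "<path>.tmp" and rename it to path only after the write
 * and close have succeeded. Returns 0 on success, -1 on any failure, in
 * which case nothing is left at the temporary name.
 * install_atomically is a hypothetical helper name.
 */
static int install_atomically(const char *path, const void *data, size_t len)
{
	char tmp[4096];
	FILE *f;

	if (snprintf(tmp, sizeof(tmp), "%s.tmp", path) >= (int)sizeof(tmp))
		return -1;	/* path too long for the temporary name */
	f = fopen(tmp, "wb");
	if (!f)
		return -1;
	if (fwrite(data, 1, len, f) != len) {
		fclose(f);
		unlink(tmp);
		return -1;
	}
	if (fclose(f) != 0 || rename(tmp, path) != 0) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

With this in place, the "do I already have it?" check in fetch() can trust that any file found under an object's final name is complete.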
Re: [PATCH] Get commits from remote repositories by HTTP
On Sun, 17 Apr 2005, Martin Mares wrote:

> Hello!
>
> > This adds a program to download a commit, the trees, and the blobs
> > in them from a remote repository using HTTP. It skips anything you
> > already have.
>
> Is it really necessary to write your own HTTP downloader? If so, is it
> necessary to forget basic stuff like the Host: header? ;-)

I wanted to get something hacked quickly; can you suggest a good one to
use?

> If you feel that it should be optimized for speed, then at least use
> persistent connections.

That's the next step.

	-Daniel
*This .sig left intentionally blank*
Re: [PATCH] Get commits from remote repositories by HTTP
On Sat, 16 Apr 2005, Adam Kropelin wrote:

> Tony Luck wrote:
> > Otherwise this looks really nice. I was going to script something
> > similar using wget ... but that would have made zillions of separate
> > connections. Not so kind to the server.
>
> How about building a file list and doing a batch download via
> 'wget -i /tmp/foo'? A quick test (on my ancient wget-1.7) indicates
> that it reuses connections when successive URLs point to the same
> server.

You need to look at some of the files before you know what other files to
get. You could do it in waves, but that would be excessively complicated
to code and not the most efficient anyway.

	-Daniel
*This .sig left intentionally blank*
Re: Re: Add clone support to lntree
On Sun, 17 Apr 2005, Petr Baudis wrote:

> Dear diary, on Sat, Apr 16, 2005 at 05:06:54AM CEST, I got a letter
> where Daniel Barkalow [EMAIL PROTECTED] told me that...
> > I think fork is as good as anything for describing the operation. I
> > had thought about clone because it seemed to fill the role that bk
> > clone had (although I never used BK, so I'm not sure). It doesn't
> > seem useful to me to try cloning multiple remote repositories, since
> > you'd get a copy of anything common from each; you just want to suck
> > everything into the same .git/objects and split off working
> > directories.
>
> Actually, what about if git pull outside of repository did what git
> clone does now? I'd kinda like clone instead of fork too.

This seems like the best solution to me, too. Although that would make
pull take a URL when making a new repository and not otherwise, which
might be confusing. init-remote perhaps, or maybe just have init do it if
given a URL?

	-Daniel
*This .sig left intentionally blank*
Re: Re: Re: Add clone support to lntree
On Sun, 17 Apr 2005, Petr Baudis wrote:

> Dear diary, on Sat, Apr 16, 2005 at 05:17:00AM CEST, I got a letter
> where Daniel Barkalow [EMAIL PROTECTED] told me that...
> > On Sat, 16 Apr 2005, Petr Baudis wrote:
> > > Dear diary, on Sat, Apr 16, 2005 at 04:47:55AM CEST, I got a
> > > letter where Petr Baudis [EMAIL PROTECTED] told me that...
> > > > git branch --- creates a branch from a given commit (when
> > > > passed empty commit, creates a branch from the current commit
> > > > and sets the working tree to that branch)
> > >
> > > Note that there is a bug in current git update - it will allow you
> > > to bring several of your trees to follow the same branch, or even
> > > a remote branch. This is not even supposed to work, and will be
> > > fixed when I get some sleep. You will be able to do git pull even
> > > on local branches, and the proper solution for this will be just
> > > tracking the branch you want to follow.
> > >
> > > I must admit that I'm not entirely decided yet, so I'd love to
> > > hear your opinion. I'm wondering, whether each tree should be
> > > fixed to a certain branch. That is, you decide a name when you do
> > > git fork, and then the tree always follows that branch. (It always
> > > has to follow [be bound to] *some* branch, and each branch can be
> > > followed by only a single tree at a time.)
> >
> > I don't think I'm following the use of branches. Currently, what I
> > do is have a git-pasky and a git-linus, and fork off a working
> > directory from one of these for each thing I want to work on. I do
> > some work, commit as I make progress, and then do a diff against the
> > remote head to get a patch to send off. If I want to do a series of
> > patches which depend on each other, I fork my next directory off of
> > my previous one rather than off of a remote base. I haven't done
> > much rebasing, so I haven't worked out how I would do that most
> > effectively.
>
> Yes. And that's exactly what the branches allow you to do. You just do
>
> 	git fork myhttpclient ~/myhttpclientdir
>
> then you do some hacking, and when you have something usable, you can
> go back to your main working directory and do
>
> 	git merge -b when_you_started myhttpclient
>
> Since you consider the code perfect, you can now just
> rm -rf ~/myhttpclient. Suddenly, you get a mail from mj pointing out
> some bugs, and it looks like there are more to come. What to do?
>
> 	git fork myhttpclient ~/myhttpclientdir
>
> (Ok, this does not work, but that's a bug, will fix tomorrow.) This
> will let you take off when you left in your work on the branch.

Ah, I think that's what made me think I wasn't understanding branches; the
first thing I tried hit this bug.

> git update for seeking between commits is probably extremely important
> for any kind of binary search when you are wondering when did this bug
> appeared first, or when you are exploring how certain branch evolved
> over time. Doing git fork for each successive iteration sounds
> horrible.

Even if there isn't a performance hit, it's semantically wrong, because
you're looking at different versions that were in the same place at
different times.

> Now, what about git branch and git update for switching between
> branches? I think this is the most controversial part; these are
> basically just shortcuts for not having to do git fork, and I wouldn't
> mind so much removing them, if you people really consider them too
> ugly a wart for the soft clean git skin. I admit that they both come
> from a hidden prejudice that git fork is going to be slow and eat a
> lot of disk.

I think that this just confuses matters.

> The idea for git update for switching between branches is that
> especially when you have two rather similar branches and mostly do
> stuff on one of them, but sometimes you want to do something on the
> other one, you can do just quick git update, do stuff, and git update
> back, without any forking.

I still think that fork should be quick enough, or you could leave the
extra tree around. I'm not against having such a command, but I think it
should be a separate command rather than a different use of update, since
it would be used by people working in different ways.

> > I think I can make this space efficient by hardlinking unmodified
> > blobs to a directory of cached expanded blobs.
>
> I don't know but I really feel *very* unsafe when doing that. What if
> something screws up and corrupts my base... way too easy. And it gets
> pretty inconvenient and even more dangerous when you get the idea to
> do some modifications on your tree by something else than your
> favorite editor (which you've already checked does the right thing).

It should only be an option, not required and maybe not even default. I
think it should be possible to prevent stuff from screwing up, since we
really don't want anything to ever modify those inodes (as opposed to some
cases, where you want to modify inodes only in certain ways). For that
matter, relatively
[PATCH] Use libcurl to use HTTP to get repositories
This enables the use of HTTP to download commits and associated objects
from remote repositories. It now uses libcurl instead of local hack code.

Still causes warnings for fsck-cache and rev-tree, due to unshared code.

Still leaks a bit of memory due to bug copied from read-tree.

Needs libcurl post 7.7 or so.

Signed-Off-By: Daniel Barkalow [EMAIL PROTECTED]

Index: Makefile
===
--- ed4f6e454b40650b904ab72048b2f93a068dccc3/Makefile  (mode:100644 sha1:b39b4ea37586693dd707d1d0750a9b580350ec50)
+++ d332a8ddffb50c1247491181af458970bf639942/Makefile  (mode:100644 sha1:ca5dfd41b750cb1339128e4431afbbbc21bf57bb)
@@ -14,7 +14,7 @@

 PROG=   update-cache show-diff init-db write-tree read-tree commit-tree \
 	cat-file fsck-cache checkout-cache diff-tree rev-tree show-files \
-	check-files ls-tree merge-tree
+	check-files ls-tree merge-tree http-get

 all: $(PROG)

@@ -23,6 +23,11 @@

 LIBS= -lssl -lz

+http-get: LIBS += -lcurl
+
+http-get:%:%.o read-cache.o
+	$(CC) $(CFLAGS) -o $@ $^ $(LIBS)
+
 init-db: init-db.o

 update-cache: update-cache.o read-cache.o
Index: http-get.c
===
--- /dev/null  (tree:ed4f6e454b40650b904ab72048b2f93a068dccc3)
+++ d332a8ddffb50c1247491181af458970bf639942/http-get.c  (mode:100644 sha1:106ca31239e6afe6784e7c592234406f5c149e44)
@@ -0,0 +1,126 @@
+#include <fcntl.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdlib.h>
+#include "cache.h"
+#include "revision.h"
+#include <errno.h>
+#include <stdio.h>
+
+#include <curl/curl.h>
+#include <curl/easy.h>
+
+static CURL *curl;
+
+static char *base;
+
+static int fetch(unsigned char *sha1)
+{
+	char *hex = sha1_to_hex(sha1);
+	char *filename = sha1_file_name(sha1);
+
+	char *url;
+	char *posn;
+	FILE *local;
+	struct stat st;
+
+	if (!stat(filename, &st)) {
+		return 0;
+	}
+
+	local = fopen(filename, "w");
+
+	if (!local) {
+		fprintf(stderr, "Couldn't open %s\n", filename);
+		return -1;
+	}
+
+	curl_easy_setopt(curl, CURLOPT_FILE, local);
+
+	url = malloc(strlen(base) + 50);
+	strcpy(url, base);
+	posn = url + strlen(base);
+	strcpy(posn, "objects/");
+	posn += 8;
+	memcpy(posn, hex, 2);
+	posn += 2;
+	*(posn++) = '/';
+	strcpy(posn, hex + 2);
+
+	curl_easy_setopt(curl, CURLOPT_URL, url);
+
+	curl_easy_perform(curl);
+
+	fclose(local);
+
+	return 0;
+}
+
+static int process_tree(unsigned char *sha1)
+{
+	void *buffer;
+	unsigned long size;
+	char type[20];
+
+	buffer = read_sha1_file(sha1, type, &size);
+	if (!buffer)
+		return -1;
+	if (strcmp(type, "tree"))
+		return -1;
+	while (size) {
+		int len = strlen(buffer) + 1;
+		unsigned char *sha1 = buffer + len;
+		unsigned int mode;
+		int retval;
+
+		if (size < len + 20 || sscanf(buffer, "%o", &mode) != 1)
+			return -1;
+
+		buffer = sha1 + 20;
+		size -= len + 20;
+
+		retval = fetch(sha1);
+		if (retval)
+			return -1;
+
+		if (S_ISDIR(mode)) {
+			retval = process_tree(sha1);
+			if (retval)
+				return -1;
+		}
+	}
+	return 0;
+}
+
+static int process_commit(unsigned char *sha1)
+{
+	struct revision *rev = lookup_rev(sha1);
+	if (parse_commit_object(rev))
+		return -1;
+
+	fetch(rev->tree);
+	process_tree(rev->tree);
+	return 0;
+}
+
+int main(int argc, char **argv)
+{
+	char *commit_id = argv[1];
+	char *url = argv[2];
+
+	unsigned char sha1[20];
+
+	get_sha1_hex(commit_id, sha1);
+
+	curl_global_init(CURL_GLOBAL_ALL);
+
+	curl = curl_easy_init();
+
+	base = url;
+
+	fetch(sha1);
+	process_commit(sha1);
+
+	curl_global_cleanup();
+	return 0;
+}
Index: revision.h
===
--- ed4f6e454b40650b904ab72048b2f93a068dccc3/revision.h  (mode:100664 sha1:28d0de3261a61f68e4e0948a25a416a515cd2e83)
+++ d332a8ddffb50c1247491181af458970bf639942/revision.h  (mode:100664 sha1:523bde6e14e18bb0ecbded8f83ad4df93fc467ab)
@@ -24,6 +24,7 @@
 	unsigned int flags;
 	unsigned char sha1[20];
 	unsigned long date;
+	unsigned char tree[20];
 	struct parent *parent;
 };

@@ -111,4 +112,29 @@
 	}
 }

+static int parse_commit_object(struct revision *rev)
+{
+	if (!(rev->flags & SEEN)) {
+		void *buffer, *bufptr;
+		unsigned long size;
+		char type[20];
+		unsigned char parent[20