Re: [PATCH v2 00/21] Support multiple worktrees
On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano gits...@pobox.com wrote:
> Duy Nguyen pclo...@gmail.com writes:
>
>>> I am not happy with the choice of main/HEAD that would squat on a
>>> good name for a remote-tracking branch (i.e. s/origin/main/), though.
>>> $GIT_DIR/COMMON_HEAD perhaps?
>>
>> It's not just about HEAD. Anything worktree-specific of the main
>> checkout can be accessed this way, e.g. main/index, main/FETCH_HEAD,
>> and it's not exactly "common" because it's worktree info. Maybe
>> "1ST_" as the prefix (e.g. 1ST_HEAD, 1ST_index, ...)?
>
> Do we even need to expose them as ref-like things as a part of the
> external API/UI in the first place? For end-user scripts that want to
> operate in a real or borrowing worktree, there should be no reason to
> touch this other repository directly. Things like the "if one of the
> worktrees tries to check out a branch that is already checked out
> elsewhere, error out" policy may need to consult the other worktrees
> via the $GIT_COMMON_DIR mechanism, but at that level we have all the
> control without contaminating the end-user-facing ref namespace in
> the way main/FETCH_HEAD etc. do.

No, the external API/UI is just an extra bonus. We need to (or should)
do so in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do
normal refs. Given any ref, git_path(ref) gives the path to that ref,
and git_path("logs/%s", ref) gives the path of its reflog. By mapping
special names to real paths behind git_path(), we can feed
$GIT_COMMON_DIR/HEAD (under special names) to refs.c and it'll be
handled correctly without any changes for special cases.

> You said "This makes it possible for a linked checkout to detach HEAD
> of the main one." but I think that is misguided---it just makes it
> easier to confuse users, if done automatically and without any policy
> knob. It instead should error out, while saying which worktree has the
> branch in question checked out. After all, checking out a branch that
> is checked out in another worktree (not the common one) needs the same
> caution, so main/HEAD is not the only special one.
>
> And if your updated "git checkout frotz" gives a clear report of which
> worktree has the branch frotz checked out, the user can go there to
> detach the HEAD in that worktree with "git -C $the_other_one checkout
> HEAD^0" if he chooses to.

Jonathan mentioned the "checkout on a portable device" case, which
would make the above a bit unnatural, as you just can't cd there (git
update-ref still works, though).

I don't see any problem with checking out a branch multiple times. I
may want to try modifying something in the branch that will be thrown
away in the end. It's when the user updates a branch that we should
either error+reject or detach the other checkouts. I guess it's up to
the user to decide which way they want. The error+reject way may make
the user hunt through dead checkouts waiting to be pruned. But we can
start with error+reject, then add an option to auto-detach.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 01/23] sha1write: make buffer const-correct
On Sat, Dec 21, 2013 at 2:59 PM, Jeff King p...@peff.net wrote:
> We are passed a void * and write it out without ever

s/are passed/pass/

Cheers,
Christian.
Re: [PATCH] do not pretend sha1write returns errors
Jeff King p...@peff.net writes:

> The sha1write function returns an int, but it will always be 0. The
> failure-prone parts of the function happen in the flush callback,
> which cannot pass an error back to us. So we just end up calling
> die() during the flush.
>
> Let's just drop the return value altogether, as it only confuses
> callers into thinking that it might be useful. Only one call site
> actually checked the return value. We can drop that check, since it
> just led to a die() anyway.
>
> Signed-off-by: Jeff King p...@peff.net

Thanks. This is kind of a step backwards if we ever wanted to actually
make sha1write's return code mean anything. But I just don't foresee
that happening. Meh. It hasn't returned a useful value since its
introduction in 2005.

-- 
Thomas Rast
t...@thomasrast.ch
Re: questions / suggestions about history simplification
On Sat, Dec 21, 2013 at 10:44:43PM -0800, Junio C Hamano wrote:
> Adam Spiers g...@adamspiers.org writes:
>
> I doubt it. 75% of the work for such a person to understand the
> behaviour from such an example is to understand what kind of history
> the example is building.

Agreed. And that's precisely why I wanted a real repository manifesting
the given example: being able to view it in gitk makes it a lot easier
to understand, for obvious reasons.

>> Well I didn't roll my own; I just copied the example from the man
>> page. So it only tells you that I was spending a fair amount of
>> effort trying to understand the man page ;-)
>
> Oh, that part I would agree, but then ...
>
>> Not if the man page says "if you wish to experiment with these
>> options yourself, you can easily recreate the repository in this
>> example by running the script contrib/foo bundled in the source
>> distribution." ... The goal of sharing the series of commands is not
>> to educate users through reading them, but simply to save them the
>> time they would have to spend manually recreating the example
>> scenario given in the man page. ...
>
> ... this part could be even easier if the distro ships a sample
> repository, not a recipe to create such a sample repository, no?

It could ship either or both, but shipping a repository only saves the
user from having to run a single command to create the repository from
the script, and even that advantage is negated if the user wants the
repository to be read/write (since then they would need "cp -a ...").

However, the question of how the distro should ship it is separate from
the question of how the git source repository and release tarballs
should provide it. Including a script (e.g. in contrib/) means that any
changes to the man page can trivially be mirrored in that script within
a single commit, giving all the normal advantages git offers for
tracking source. And then the distro can easily create the sample
repository from the script at package build time, if they want to.

OTOH, including a sample repository embedded within the git repository
is either impossible or very ugly (e.g. requiring a non-default value
of GIT_DIR for the embedded repository). But I doubt you were
suggesting that ;-)
Rationale behind 'extern' on prototypes in .h files
Hi,

I am a C data structures teacher at an institute. Up until now I have
only written classroom-type C code. I always had an interest in
operating systems, especially the Linux kernel, but never got into
anything; I just passively followed lwn.net, Dr. Dobb's magazine, etc.
But the series "30 Linux Kernel Developers in 30 Weeks" on linux.com
has really pushed me over the edge, and I now have a new-found interest
in Linux and C.

These articles suggest that I should have good experience in user space
before I can even touch the kernel. So, to get a feel for how actual
real-world C projects look, I have taken two excellent projects as my
reference: Git and libvirt, plus "Linux From Scratch" to get started
with Linux internals. Although I have a good understanding of
basic/advanced concepts like pointers, memory management, etc., many
concepts like good error-handling mechanisms, unit testing, build
systems, etc. were always blurry. But I know that this is just the
start of a long journey towards the Linux kernel.

Now, my real questions:

1) I cannot understand the reason behind marking function prototypes as
extern. What purpose does this serve? AFAIK we put the definition in a
.c file and the prototype in a .h file, and that's it.

2) Why are some prototypes in some of the .h files extern and others
not?

Thank you for reading through. Any suggestions are humbly welcome.

Ravi S. Jethani
Re: Rationale behind 'extern' on protypes in .h files
On 22.12.2013 16:51, Ravi Shekhar Jethani wrote:
> Now, my real questions:
>
> 1) I cannot understand the reason behind marking function prototypes
> as extern. What purpose does this serve? AFAIK we put the definition
> in a .c file and the prototype in a .h file, and that's it.
>
> 2) Why are some prototypes in some of the .h files extern and others
> not?

That's an interesting question. From my understanding there is no
difference between function declarations marked extern or not, because
extern is the default for functions. It is, however, important to mark
variables extern if they are declared in a header file.

After a quick search on the web, it may be there for historical
reasons. Back when there were one-pass compilers, the compiler needed
to know where the function was to be found (extern was not the default
in those days?).

Another reason I could make up would be to indicate whether the
function is implemented in the .c file having the same name as the
header file. For example, in builtin.h all the functions are declared
extern, as the implementations are found in builtin/*.c. In another
header, cache-tree.h, there are function declarations without the
explicit extern keyword, and the implementation is found in
cache-tree.c (${SAME-NAME-AS-HEADER}.c). This is however not always
the case, as seen in archive.{c,h}.

I'd also be interested in knowing the real reason for this rationale.

Thanks,
Stefan
Re: Rationale behind 'extern' on protypes in .h files
Stefan Beller stefanbel...@googlemail.com writes:

> From my understanding there is no difference between function
> declarations marked extern or not, because extern is the default for
> functions.

There is a difference for shared libraries if you would like to control
which symbols are exported. With gcc, for example, you might compile
using -fvisibility=hidden. Any functions explicitly bearing
__attribute__((visibility("default"))) or covered by visibility pragmas
will be exported (similar to __declspec(dllexport) on Windows). Other
functions will be internal to the shared library, so you don't have to
worry about callers depending on those symbols, and performance can be
a bit better by skipping the PLT and avoiding symbol relocations at
load time. See Drepper's guide for more:

  http://www.akkadia.org/drepper/dsohowto.pdf
[PATCH] [RFC] Making use of bitmaps for thin objects
I've been hacking on the performance of git on a large, quickly
changing git repo used inside Facebook. Pulling a week of changes from
this repo can be quite painful:

$ { echo HEAD; echo ^$have; } | time git pack-objects --no-use-bitmap-index --revs --stdout > /dev/null
Counting objects: 221082, done.
Compressing objects: 100% (20174/20174), done.
Total 221082 (delta 197889), reused 215144 (delta 194084)
86.53user 7.89system 1:12.76elapsed 129%CPU (0avgtext+0avgdata 5746608maxresident)k
0inputs+0outputs (0major+3177262minor)pagefaults 0swaps

Jeff King's bitmap branch appears to give a very substantial speedup.
After applying this branch, the counting-objects phase is basically
free. However, I found that the compression phase still takes a
non-trivial amount of time:

$ { echo HEAD; echo ^$have; } | time ../git-bitmap/install/bin/git pack-objects --use-bitmap-index --revs --stdout > /dev/null
Counting objects: 220909, done.
Compressing objects: 100% (20038/20038), done.
Total 220909 (delta 197794), reused 215074 (delta 194050)
47.96user 2.45system 0:28.39elapsed 177%CPU (0avgtext+0avgdata 5890768maxresident)k
0inputs+0outputs (0major+984875minor)pagefaults 0swaps

It looks like most of the time spent compressing objects was in cases
where the object was already compressed in the packfile, but the delta
was based on an object that the client already had. For some reason,
--thin wasn't enabling reuse of these deltas. This is a hacky, poorly
styled attempt to figure out how much better performance could be. With
the bitmap branch, git should know which objects the client already has
and can easily test whether an existing delta can be reused.
I don't know the branch well enough to code this, so as a hack I just
assumed the client has any delta base that is in the pack file (for our
repo this is always true, because we have a linear history). This
greatly reduces the time:

$ { echo HEAD; echo ^$have; } | time ../git-bitmap/install/bin/git pack-objects --use-bitmap-index --revs --stdout --thin > /dev/null
Counting objects: 220909, done.
Compressing objects: 100% (14203/14203), done.
Total 220909 (delta 194050), reused 220909 (delta 199885)
3.57user 1.28system 0:04.59elapsed 105%CPU (0avgtext+0avgdata 2007296maxresident)k
0inputs+0outputs (0major+416243minor)pagefaults 0swaps

I didn't do much testing here; I did run git-index-pack --fix-thin and
then git fsck as a sanity check.

I'd appreciate feedback here. If this is a valid approach, maybe
somebody more involved in the bitmap branch would be interested in
implementing the actual logic to figure out whether the thin revision
is valid.

* I was rather confused as to how --thin works today. I could not
  figure out how the choice is made to reuse a thin delta in the
  existing code base.

* I had a very hacky way of communicating to the pack-writing code that
  there is a delta, but that it need not be taken into account for the
  purposes of sorting the file. Does anybody have a better suggestion
  here?
Signed-off-by: Ben Maurer bmau...@fb.com
---
 builtin/pack-objects.c | 26 +++---
 pack-objects.h         |  1 +
 2 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index c2f2847..3dc4411 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -53,6 +53,7 @@ static unsigned long pack_size_limit;
 static int depth = 50;
 static int delta_search_threads;
 static int pack_to_stdout;
+static int thin = 0;
 static int num_preferred_base;
 static struct progress *progress_state;
 static int pack_compression_level = Z_DEFAULT_COMPRESSION;
@@ -347,7 +348,8 @@ static unsigned long write_no_reuse_object(struct sha1file *f, struct object_ent
 /* Return 0 if we will bust the pack-size limit */
 static unsigned long write_reuse_object(struct sha1file *f, struct object_entry *entry,
-					unsigned long limit, int usable_delta)
+					unsigned long limit, int usable_delta,
+					const unsigned char *thin_delta_sha1)
 {
 	struct packed_git *p = entry->in_pack;
 	struct pack_window *w_curs = NULL;
@@ -361,7 +363,11 @@ static unsigned long write_reuse_object(struct sha1file *f, struct object_entry
 	if (entry->delta)
 		type = (allow_ofs_delta && entry->delta->idx.offset) ?
 			OBJ_OFS_DELTA : OBJ_REF_DELTA;
-	hdrlen = encode_in_pack_object_header(type, entry->size, header);
+
+	if (thin_delta_sha1)
+		type = OBJ_REF_DELTA;
+
+	hdrlen = encode_in_pack_object_header(type, thin_delta_sha1 ? entry->delta_size : entry->size, header);
 
 	offset = entry->in_pack_offset;
 	revidx = find_pack_revindex(p, offset);
@@ -403,7 +409,7 @@ static unsigned long write_reuse_object(struct sha1file *f, struct object_entry
 		return 0;
 	}
 	sha1write(f, header,
RE: [PATCH] [RFC] Making use of bitmaps for thin objects
One issue with this approach is that it seems git-index-pack doesn't
perform as well with thin packs. git-index-pack uses a multi-threaded
approach to resolving the deltas. However, the multithreading only
works on deltas whose bases are inside the pack itself. After the
multi-threaded phase, it incrementally brings in objects from external
packs, in a single-threaded manner. Many objects in the pack have some
dependency on an external object, therefore defeating the
multithreading.

What's the use case for a pack file containing a SHA-1 reference to a
base living inside that same pack file (why not just use an offset)?
Would it make sense to assume that all SHA-1-referenced bases are
external and bring them in in the first phase?
Re: [PATCH/POC 3/7] setup.c: add split-repo support to .git files
On Sat, Dec 14, 2013 at 3:43 AM, Jonathan Nieder jrnie...@gmail.com wrote:
> Problems:
>
>  * What if I move my worktree with mv? Then I still need the
>    corresponding $GIT_SUPER_DIR/repos/id directory, and nobody told
>    the GIT_SUPER_DIR about it.
>
>  * What if my worktree is on removable media (think network
>    filesystem) and has just been temporarily unmounted instead of
>    deleted?
>
> So maybe it would be nicer to:
>
>  i. When the worktree is on the same filesystem, keep a *hard link*
>     to some file in the worktree (e.g., the .git file). If the link
>     count goes down, it is safe to remove the $GIT_SUPER_DIR/repos/id
>     directory.

The link count goes down to 1 if I move the worktree to a different
filesystem, and it's not safe to remove $GIT_SUPER_DIR/repos/id in
this case, I think.

>  ii. When the worktree is on another filesystem, always keep
>      $GIT_SUPER_DIR/repos/id unless the user decides to manually
>      remove it. Provide documentation or a command to list basic
>      information about $GIT_SUPER_DIR/repos directories (path to
>      worktree, what branch they're on, etc).
>
> (i) doesn't require any futureproofing. As soon as someone wants it,
> they can implement the check and fall back to (ii) for worktrees
> without the magic hard link. (ii) would benefit from recording the
> working tree directory as a possibly unreliable, human-friendly
> reminder.
-- 
Duy