Re: RFC/Pull Request: Refs db backend
On 06/23/2015 02:50 AM, David Turner wrote: I've revived and modified Ronnie Sahlberg's work on the refs db backend. The work is on top of be3c13e5564, Junio's First batch for 2.5 cycle. I recognize that there have been changes to the refs code since then, and that there are some further changes in-flight from e.g. Michael Haggerty. If there is interest in this, I can rebase once Michael's changes land. It's awesome that you are working on this! I'm reading through your commits and will add comments as they pop into my head... * I initially read refs-be-files to be a short version of references, they be files. I might never be able to get that pronunciation out of my head :-) * It would be more modest to call the files implementing the LMDB backend refs-be-lmdb.[c,h] rather than refs-be-db.[c,h]. * I wonder whether `refname_is_safe()` might eventually have to become backend-specific. For example, maybe one backend will have to impose a limit of 128 characters or something? No matter, though...it can be moved later. * You have put `format_reflog_msg()` in the public interface. It probably makes sense, because more than one backend might want to use it. But another backend might want to store (refname, old_sha1, new_sha1, ...) as separate columns in a database. As long as `format_reflog_msg()` is seen as a helper function and is not called by any of the shared code, it shouldn't be a problem. * I wonder whether `init_backend()` will be general enough. I'm thinking by analogy with object constructors, which usually need class-specific arguments during their initialization even though afterwards objects of different classes can be used interchangeably. So I guess the idea is that a typical `init_backend()` function will have to dig around itself to find whatever additional information that it needs (e.g., from the git configuration or the filesystem or whatever). So I think this is OK. * Your methods for bulk updates are I think analogous to the `initial_ref_transaction_commit()` function that I recently submitted [1]. Either way, the goal is to abstract away the fact that the file-based backend uses packed and loose references with tradeoffs that callers currently have to know about. * I don't like the fact that you have replaced `struct ref_transaction *` with `void *` in the public interface. On a practical level, I like the bit of type-safety that comes with the more specific declaration. But on a more abstract level, I think that the concept of a transaction could be useful across backends, for example in utility functions that verify that a proposed set of updates are internally consistent. I would rather see either * backends extend a basic `struct ref_transaction` to suit their needs, and upcast/downcast pointers at the module boundary, or * `struct ref_transaction` itself gets a `void *` member that backends can use for whatever purposes they want. * Regarding MERGE_HEAD: you take the point of view that it must continue to be stored as a file. And yet it must also behave somewhat like a reference; for example, `git rev-parse MERGE_HEAD` works today. MERGE_HEAD is also used for reachability, right? Another point of view is that MERGE_HEAD is a plain old boring reference, but there is some other metadata related to it that the refs backend has to store. The file-based backend would have special-case code to read the additional data from the tail of the loose refs file (and be sure to write the metadata when writing the reference), but other backends could store the reference with the rest but do their own thing with the metadata. So I guess I'm wondering whether the refs API needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with its metadata. * Don't the same considerations that apply to MERGE_HEAD also apply to FETCH_HEAD? * I'm showing my ignorance of LMDB, but I don't see where the LMDB backend initializes its database during `git init-db`. Is that what `init_env()` does? But it looks like `init_env()` is called on every git invocation (via `git_config_early()`). Puzzled. * Meanwhile, `create_default_files()` (in `builtin/init-db`) still creates directories `refs`, `refs/heads`, and `refs/tags`. * Rehash of the last two points: I expected one backend function that is used to initialize the refs backend when a new repository is created (e.g., in `git init`). The file-based backend would use this function to create the `refs`, `refs/heads`, and `refs/tags` directories. I expected a second function that is called once every time git runs in an existing repository (this one might, for example, open a database connection). And maybe even a third one that closes down the database connection before git exits. Would you please explain how this actually works? * `lmdb_init_backend()` leaks `path` if `env` is already set (in which case it needn't compute `path` in the first place). * You have the constraint that submodules need to use the same reference
Untracked files when git status executed on a new folder
Hi all. Today I've had an unexpected behaviour that I'm not sure if is a bug or I'm not doing git best practices... (surely the latest...) The sequence of actions is : 1. create a new subfolder of my local repository branch 2. cd to this new folder, and create a new file 3. execute git status from the new folder Doing that, the new folder doesn't appear as untracked. 4. cd .. 5. git status In this case, the new folder appears. If I create a new folder on the same level that the new one created in step 1, cd into it, and execute git status, the folder created in step 1 appears as untracked. After step 3 I have executed git commit and git push to the remote, and later another user has realized that the new file was not present. Please let me know if you understand my explanation, and if it's a bug or just that I'm not as experienced as I'd like with git ;) THANKS!! Víctor -- --- Víctor Martín Hernández RD Software Engineer Instituto de Ciencias del Espacio (ICE/CSIC), and Institut d'Estudis Espacials de Catalunya (IEEC) Campus UAB, Carrer de Can Magrans, s/n 08193 Bellaterra (Cerdanyola del Vallès) - Barcelona Tel. : +34 93 586 8782 Web: http://gwart.ice.cat/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] apply: fix adding new files on i-t-a entries
Since d95d728 (diff-lib.c: adjust position of i-t-a entries in diff - 2015-03-16), a normal git diff on i-t-a entries would produce a diff that _adds_ those files, not just adding content to existing and empty files like before. This is correct. Unfortunately, applying such a patch on the same repository that has the same i-t-a entries will fail with message already exists in index because git-apply checks, sees those i-t-a entries and aborts. git-apply does not realize those are for bookkeeping only, they do not really exist in the index. This patch tightens the exists in index check, ignoring i-t-a entries. For fixing the above problem, only the change in check_to_create() is needed. For other changes, - load_current(), reporting not exists in index is better than does not match index - check_preimage(), similar to load_current(), but it may also use ce_mode from i-t-a entry which is always zero - get_current_sha1(), or actually build_fake_ancestor(), we should not add i-t-a entries to the temporary index, at least not without also adding i-t-a flag back Since git add -p and git am use git apply underneath, they are broken too before this patch and fixed now. Reported-by: Patrick Higgins phigg...@google.com Reported-by: Bjørnar Snoksrud snoks...@gmail.com Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com --- I think I'm opening a can of worms with d95d728. There's nothing wrong with that patch per se, but with this issue popping up, I need to go over all {cache,index}_name_pos call sites and see what would be the sensible behavior when i-t-a entries are involved. So far blame, rm and checkout-entry and checkout paths are on my to-think-or-fix list. But this patch can get in early to fix a real regression instead of waiting for one big series. A lot more discussions will be had before that series gets in good shape. builtin/apply.c | 8 cache.h | 2 ++ read-cache.c | 12 t/t2203-add-intent.sh | 10 ++ 4 files changed, 28 insertions(+), 4 deletions(-) diff --git a/builtin/apply.c b/builtin/apply.c index 146be97..4f813ac 100644 --- a/builtin/apply.c +++ b/builtin/apply.c @@ -3344,7 +3344,7 @@ static int load_current(struct image *image, struct patch *patch) if (!patch-is_new) die(BUG: patch to %s is not a creation, patch-old_name); - pos = cache_name_pos(name, strlen(name)); + pos = exists_in_cache(name, strlen(name)); if (pos 0) return error(_(%s: does not exist in index), name); ce = active_cache[pos]; @@ -3497,7 +3497,7 @@ static int check_preimage(struct patch *patch, struct cache_entry **ce, struct s } if (check_index !previous) { - int pos = cache_name_pos(old_name, strlen(old_name)); + int pos = exists_in_cache(old_name, strlen(old_name)); if (pos 0) { if (patch-is_new 0) goto is_new; @@ -3551,7 +3551,7 @@ static int check_to_create(const char *new_name, int ok_if_exists) struct stat nst; if (check_index - cache_name_pos(new_name, strlen(new_name)) = 0 + exists_in_cache(new_name, strlen(new_name)) = 0 !ok_if_exists) return EXISTS_IN_INDEX; if (cached) @@ -3824,7 +3824,7 @@ static int get_current_sha1(const char *path, unsigned char *sha1) if (read_cache() 0) return -1; - pos = cache_name_pos(path, strlen(path)); + pos = exists_in_cache(path, strlen(path)); if (pos 0) return -1; hashcpy(sha1, active_cache[pos]-sha1); diff --git a/cache.h b/cache.h index 571c98f..6a44cb6 100644 --- a/cache.h +++ b/cache.h @@ -341,6 +341,7 @@ extern void free_name_hash(struct index_state *istate); #define discard_cache() discard_index(the_index) #define unmerged_cache() unmerged_index(the_index) #define cache_name_pos(name, namelen) index_name_pos(the_index,(name),(namelen)) +#define exists_in_cache(name, namelen) exists_in_index(the_index,(name),(namelen)) #define add_cache_entry(ce, option) add_index_entry(the_index, (ce), (option)) #define rename_cache_entry_at(pos, new_name) rename_index_entry_at(the_index, (pos), (new_name)) #define remove_cache_entry_at(pos) remove_index_entry_at(the_index, (pos)) @@ -512,6 +513,7 @@ extern int verify_path(const char *path); extern struct cache_entry *index_dir_exists(struct index_state *istate, const char *name, int namelen); extern struct cache_entry *index_file_exists(struct index_state *istate, const char *name, int namelen, int igncase); extern int index_name_pos(const struct index_state *, const char *name, int namelen); +extern int exists_in_index(const struct index_state *, const char *name, int namelen); #define ADD_CACHE_OK_TO_ADD 1 /* Ok to add */ #define ADD_CACHE_OK_TO_REPLACE 2 /* Ok to replace
Re: RFC/Pull Request: Refs db backend
On Tue, Jun 23, 2015 at 6:47 PM, Jeff King p...@peff.net wrote: I was thinking of actually moving to a log-structured ref storage. Something like: - any ref write puts a line at the end of a single logfile that contains the ref name, along with the normal reflog data - the logfile is the source of truth for the ref state. If you want to know the value of any ref, you can read it backwards to find the last entry for the ref. Everything else is an optimization. Let's call the number of refs N, and the number of ref updates in the log U. - we keep a key/value index mapping the name of any branch that exists to the byte offset of its entry in the logfile. This would probably One key/value mapping per branch, pointing to the latest reflog entry, or one key/valye mapping for each reflog entry? be in some binary key/value store (like LMDB). Without this, resolving a ref is O(U), which is horrible. With it, it should be O(1) or O(lg N), depending on the index data structure. I'm thinking of the user with small or medium repos, in terms of refs, who does not want an extra dependency. If we store one mapping per branch, then the size of this mapping is small enough that the index in a text file is ok. If we also store the offset to the previous reflog entry of the same branch in the current reflog entry, like a back pointer, then we could jump back faster. Or do you have something else in mind? Current reflog structure won't work because I think you bring back the reflog graveyard with this, and I don't want to lose that - the index can also contain other optimizations. E.g., rather than point to the entry in the logfile, it can include the sha1 directly (to avoid an extra level of indirection). It may want to include the peeled value, as the current packed-refs file does. - Reading all of the reflogs (e.g., for repacking) is O(U), just like it is today. Except the storage for the logfile is a lot more compact than what we store today, with one reflog per file. - Reading a single reflog is _also_ O(U), which is not as good as today. But if each log entry contains a byte offset of the previous entry, you can follow the chain (it is still slightly worse, because you are jumping all over the file, rather than reading a compact set of lines). - Pruning the reflog entries from the logfile requires rewriting the whole thing. That's similar to today, where we rewrite each of the reflog files. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Jun 2015, #05; Mon, 22)
Hi Junio, On 2015-06-23 00:49, Junio C Hamano wrote: * js/rebase-i-clean-up-upon-continue-to-skip (2015-06-18) 3 commits - rebase -i: do not leave a CHERRY_PICK_HEAD file behind - SQUASH: test_must_fail is a shell function - t3404: demonstrate CHERRY_PICK_HEAD bug Abandoning an already applied change in git rebase -i with --continue left CHERRY_PICK_HEAD and confused later steps. Expecting a reroll. ($gmane/271856) Actually, there had been two re-rolls, and v3 seemed to be okay: $gmane/272037. It also looks as if 726a35ebd^..726a35ebd^2 is identical with v3... Anything you want me to change in addition? Also: * js/fsck-opt (2015-06-22) 19 commits - fsck: support ignoring objects in `git fsck` via fsck.skiplist - fsck: git receive-pack: support excluding objects from fsck'ing - fsck: introduce `git fsck --connectivity-only` - fsck: support demoting errors to warnings - fsck: document the new receive.fsck.msg-id options - fsck: allow upgrading fsck warnings to errors - fsck: optionally ignore specific fsck issues completely - fsck: disallow demoting grave fsck errors to warnings - fsck: add a simple test for receive.fsck.msg-id - fsck: make fsck_tag() warn-friendly - fsck: handle multiple authors in commits specially - fsck: make fsck_commit() warn-friendly - fsck: make fsck_ident() warn-friendly - fsck: report the ID of the error/warning - fsck (receive-pack): allow demoting errors to warnings - fsck: offer a function to demote fsck errors to warnings - fsck: provide a function to parse fsck message IDs - fsck: introduce identifiers for fsck messages - fsck: introduce fsck options Rerolled (at v7) and seems more or less ready for 'next'. I see that you used `downcased` instead of my `lowercased`, which makes more sense, but the style of the multi-line `for` loop as per `pu` is still as *I* wrote it... I also saw that you downcased the first letter after `fsck:` in the commit messages, and touched up the message of the report the ID of the error/warning commit. Do you want to touch up the `for` loop style in offer a function to demote fsck errors to warnings or shall I send a v8 (it is ready to go: https://github.com/dscho/git/compare/next...fsck-api)? Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 12/19] initial_ref_transaction_commit(): check for duplicate refs
Michael Haggerty mhag...@alum.mit.edu writes: On 06/22/2015 11:06 PM, Junio C Hamano wrote: ... What I am wondering is if we could turn the safety logic that appear here (i.e. no existing refs must be assumed by the set of updates, etc.) into an optimization cue and implement this as a special case helper to ref_transaction_commit(), i.e. ref_transaction_commit(...) { if (updates are all initial creation no existing refs in repository) return initial_ref_transaction_commit(...); /* otherwise we do the usual thing */ ... } and have clone call ref_transaction_commit() as usual. The safety logic in this function is (approximately) necessary, but not sufficient, to guarantee safety. Oh, no question about it, and you do not even have to bring up an insane user runs random commands while Git is hard working on it non use-case ;-) One of the shortcuts that it takes is not locking the references while they are being created. Therefore, it would be unsafe for one process to call ref_transaction_commit() while another is calling initial_ref_transaction_commit(). So the caller has to know somehow that no other processes are working in the repository for this optimization to be safe. It conveys that knowledge by calling initial_ref_transaction_commit() rather than ref_transaction_commit(). OK. So the answer to my first question is the initial creation logic too fragile is a resounding yes; the caller should know that it is too crazy for the user to be competing with what it is doing before deciding to call initial_ref_transaction_commit(), hence we cannot automatically detect if it is safe from within ref_transaction_commit() to use this logic as an optimization. But I think if anything it would make more sense to go the other direction: * Teach ref_transaction_commit() an option that asks it to write references updates to packed-refs instead of loose refs (but locking the references as usual). * Change clone to use ref_transaction_commit() like everybody else, passing it the new REFS_WRITE_TO_PACKED_REFS option. Then clone would participate in the normal locking protocol, and it wouldn't *matter* if another process runs before the clone is finished. Yeah, I thought that was actually I was driving at, and doing so without that write-to-packed-refs option, which I'd prefer to leave it as an optimization inside ref_transaction_commit(). Except that I missed that the initial_* variant is even more aggressive (i.e. not locking), so no such optimization is safe. There would also be some consistency benefits. For example, if core.logallrefupdates is set globally or on the command line, the initial reference creations would be reflogged. And other operations that write references in bulk could use the new REFS_WRITE_TO_PACKED_REFS option to prevent loose reference proliferation. But I don't think any of this is a problem in practice, and I think we can live with using the optimized-but-not-100%-safe initial_ref_transaction_commit() for cloning. OK. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] apply: fix adding new files on i-t-a entries
Nguyễn Thái Ngọc Duy pclo...@gmail.com writes: Since d95d728 (diff-lib.c: adjust position of i-t-a entries in diff - 2015-03-16), a normal git diff on i-t-a entries would produce a diff that _adds_ those files, not just adding content to existing and empty files like before. This is correct. Unfortunately, applying such a patch on the same repository that has the same i-t-a entries will fail with message already exists in index because git-apply checks, sees those i-t-a entries and aborts. git-apply does not realize those are for bookkeeping only, they do not really exist in the index. This patch tightens the exists in index check, ignoring i-t-a entries. For fixing the above problem, only the change in check_to_create() is needed. And the first thing I noticed and found a bit disturbing was that this change (which I think is correct, and happens to match what I sent out earlier) was the only thing necessary to make your new test pass. IOW, the other changes in this patch have no test coverage. For other changes, - load_current(), reporting not exists in index is better than does not match index Is that error reporting the only side effect from this change? This is only used when falling back to three-way merge while applying a creation patch. - check_preimage(), similar to load_current(), but it may also use ce_mode from i-t-a entry which is always zero This is for the normal (non three-way) application and the idea is the same as load_current() as you said above. - get_current_sha1(), or actually build_fake_ancestor(), we should not add i-t-a entries to the temporary index, at least not without also adding i-t-a flag back This is part of am three-way fallback codepath. I do not think the merge-recursive three-way merge code knows and cares about, is capable of handling, or would even want to deal with i-t-a entries in the first place, so adding an entry as i-t-a bit would not help. What the ultimate caller wants from us in this codepath is a tree object, and that is written out from the temporary index---and that codepath ignores i-t-a entries, so it is correct to omit them from the temporary index in the first place. Unlike the previous two changes, I think this change deserves a new test. I think I'm opening a can of worms with d95d728. There's nothing wrong with that patch per se, but with this issue popping up, I need to go over all {cache,index}_name_pos call sites and see what would be the sensible behavior when i-t-a entries are involved. Yeah, I agree that d95d728 should have been a part of a larger series that changes the world order, instead of a single change that brings inconsistency to the system. I cannot offhand convince myself that apply is the only casualty; assuming it is, I think a reasonable way forward is to keep d95d728 and adjust apply to the new world order. Otherwise, i.e. if there are wider fallouts from d95d728, we may instead want to temporarily revert it off from 'master', deal with fallouts to apply and other things, before resurrecting it. Anything that internally uses diff-index is suspect, I think. What do others think? You seem to ... So far blame, rm and checkout-entry and checkout paths are on my to-think-or-fix list. But this patch can get in early to fix a real regression instead of waiting for one big series. A lot more discussions will be had before that series gets in good shape. ... think that the damage could be quite extensive, so I am inclined to say that we first revert d95d728 before going forward. builtin/apply.c | 8 cache.h | 2 ++ read-cache.c | 12 t/t2203-add-intent.sh | 10 ++ 4 files changed, 28 insertions(+), 4 deletions(-) diff --git a/builtin/apply.c b/builtin/apply.c index 146be97..4f813ac 100644 --- a/builtin/apply.c +++ b/builtin/apply.c @@ -3344,7 +3344,7 @@ static int load_current(struct image *image, struct patch *patch) if (!patch-is_new) die(BUG: patch to %s is not a creation, patch-old_name); - pos = cache_name_pos(name, strlen(name)); + pos = exists_in_cache(name, strlen(name)); Something that is named as if it would return yes/no that returns a real value is not a very welcome idea. +/* This is the same as index_name_pos, except that i-t-a entries are invisible */ +int exists_in_index(const struct index_state *istate, const char *name, int namelen) +{ + int pos = index_name_stage_pos(istate, name, namelen, 0); + + if (pos 0) + return pos; + if (istate-cache[pos]-ce_flags CE_INTENT_TO_ADD) + return -pos-1; This is a useless complexity. Your callers cannot use the returned value like this: pos = exists_in_cache(...); if (pos 0) { if (active_cache[-pos-1]-ce_flags CE_INTENT_TO_ADD) ; /* ah it actually exists but it is i-t-a */
Re: Dependency Management
On Tue, Jun 23, 2015 at 1:52 AM, Jean Audibert jaudib...@euronext.com wrote: Hi, Sorry to bother you with this question but I can't find any official answer or strong opinion from Git community. In my company we recently started to use Git and we wonder how to share code and manage dependencies with Git? Use case: in project P we need to include lib-a and lib-b (libraries shared by several projects) In your opinion, what is the future proof solution? * Use submodule * Use subtree We know there is lot of PRO/CONS but I feel that subtree is behind in the race and the latest version of submodule work fine Use whatever works fine for your use case. My personal opinion/expectation is to see submodules improving/advancing more than subtrees advancing in the near future. Though this is neither the official nor a strong opinion. Stefan Suggestions are very welcome. Thanks in advance, Jean Audibert _ This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of Euronext N.V. or any of its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On Mon, 2015-06-22 at 22:36 -0700, Junio C Hamano wrote: David Turner dtur...@twopensource.com writes: I've revived and modified Ronnie Sahlberg's work on the refs db backend. The work is on top of be3c13e5564, Junio's First batch for 2.5 cycle. I recognize that there have been changes to the refs code since then, and that there are some further changes in-flight from e.g. Michael Haggerty. If there is interest in this, I can rebase once Michael's changes land. ... The db backend runs git for-each-ref about 30% faster than the files backend with fully-packed refs on a repo with ~120k refs. It's also about 4x faster than using fully-unpacked refs. In addition, and perhaps more importantly, it avoids case-conflict issues on OS X. I chose to use LMDB for the database... ... Ronnie Sahlberg's original version of this patchset used tdb. The advantage of tdb is that it's smaller (~125k). The disadvantages are that tdb is hard to build on OS X. It's also not in homebrew. So lmdb seemed simpler. If there is interest? Shut up and take my money ;-) More seriously, that's great that you stepped up to resurrect this topic. In a sense, the choice of sample database backend does not matter. I do not care if it is tdb, lmdb, or even Berkeley DB as long as it functions. ;-) As long as the interface between ref-transaction system on the Git side and the database backend is designed right, your lmdb thing can serve as a reference implementation for other people to plug other database backends to the same interface, right? Yes. As one step to validate the interface to the database backends, it would be nice to eventually have at least two backends that talk to meaningfully different systems, but we have to start somewhere, and for now we have lmdb is as good a place to start as any other db backend. I wonder if we can do a filesystem backend on top of the same backend interface---is that too much impedance mismatch to make it unpractical? The patch series does include a filesystem backend, which is simply the current ref infrastructure with extremely minor changes. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Untracked files when git status executed on a new folder
On 23/06, Víctor Martín Hernández wrote: Hi all. Today I've had an unexpected behaviour that I'm not sure if is a bug or I'm not doing git best practices... (surely the latest...) The sequence of actions is : 1. create a new subfolder of my local repository branch 2. cd to this new folder, and create a new file 3. execute git status from the new folder Doing that, the new folder doesn't appear as untracked. 4. cd .. 5. git status In this case, the new folder appears. If I create a new folder on the same level that the new one created in step 1, cd into it, and execute git status, the folder created in step 1 appears as untracked. Can't reproduce on Git 2.4.4/Linux, which Git version and platform are you using? -- Sincerely, Johannes Löthberg PGP Key ID: 0x50FB9B273A9D0BB5 https://theos.kyriasis.com/~kyrias/ signature.asc Description: PGP signature
[PATCH v7 2/5] bisect: replace hardcoded bad|good by variables
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr To add new tags like old/new and have keywords less confusing, the first step is to avoid hardcoding the keywords. The default mode is still bad/good. Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr Signed-off-by: Louis Stuber stub...@ensimag.grenoble-inp.fr Signed-off-by: Valentin Duperray valentin.duper...@ensimag.imag.fr Signed-off-by: Franck Jonas franck.jo...@ensimag.imag.fr Signed-off-by: Lucien Kong lucien.k...@ensimag.imag.fr Signed-off-by: Thomas Nguy thomas.n...@ensimag.imag.fr Signed-off-by: Huynh Khoi Nguyen Nguyen huynh-khoi-nguyen.ngu...@ensimag.imag.fr Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr Signed-off-by: Matthieu Moy matthieu@imag.fr --- bisect.c | 51 ++- git-bisect.sh | 57 +++-- 2 files changed, 65 insertions(+), 43 deletions(-) mode change 100755 = 100644 git-bisect.sh diff --git a/bisect.c b/bisect.c index 5b8357d..2d3dbdc 100644 --- a/bisect.c +++ b/bisect.c @@ -21,6 +21,9 @@ static const char *argv_checkout[] = {checkout, -q, NULL, --, NULL}; static const char *argv_show_branch[] = {show-branch, NULL, NULL}; static const char *argv_update_ref[] = {update-ref, --no-deref, BISECT_HEAD, NULL, NULL}; +static const char *name_bad; +static const char *name_good; + /* Remember to update object flag allocation in object.h */ #define COUNTED(1u16) @@ -403,15 +406,21 @@ struct commit_list *find_bisection(struct commit_list *list, static int register_ref(const char *refname, const struct object_id *oid, int flags, void *cb_data) { - if (!strcmp(refname, bad)) { + struct strbuf good_prefix = STRBUF_INIT; + strbuf_addstr(good_prefix, name_good); + strbuf_addstr(good_prefix, -); + + if (!strcmp(refname, name_bad)) { current_bad_oid = xmalloc(sizeof(*current_bad_oid)); oidcpy(current_bad_oid, oid); - } else if (starts_with(refname, good-)) { + } else if (starts_with(refname, good_prefix.buf)) { sha1_array_append(good_revs, oid-hash); } else if (starts_with(refname, skip-)) { sha1_array_append(skipped_revs, oid-hash); } + strbuf_release(good_prefix); + return 0; } @@ -634,7 +643,7 @@ static void exit_if_skipped_commits(struct commit_list *tried, return; printf(There are only 'skip'ped commits left to test.\n - The first bad commit could be any of:\n); + The first %s commit could be any of:\n, name_bad); print_commit_list(tried, %s\n, %s\n); if (bad) printf(%s\n, oid_to_hex(bad)); @@ -732,18 +741,21 @@ static void handle_bad_merge_base(void) if (is_expected_rev(current_bad_oid)) { char *bad_hex = oid_to_hex(current_bad_oid); char *good_hex = join_sha1_array_hex(good_revs, ' '); - - fprintf(stderr, The merge base %s is bad.\n - This means the bug has been fixed - between %s and [%s].\n, - bad_hex, bad_hex, good_hex); - + if (!strcmp(name_bad, bad)) { + fprintf(stderr, The merge base %s is bad.\n + This means the bug has been fixed + between %s and [%s].\n, + bad_hex, bad_hex, good_hex); + } else { + die(BUG: terms %s/%s not managed, name_bad, name_good); + } exit(3); } - fprintf(stderr, Some good revs are not ancestor of the bad rev.\n + fprintf(stderr, Some %s revs are not ancestor of the %s rev.\n git bisect cannot work properly in this case.\n - Maybe you mistook good and bad revs?\n); + Maybe you mistook %s and %s revs?\n, + name_good, name_bad, name_good, name_bad); exit(1); } @@ -755,10 +767,10 @@ static void handle_skipped_merge_base(const unsigned char *mb) warning(the merge base between %s and [%s] must be skipped.\n - So we cannot be sure the first bad commit is + So we cannot be sure the first %s commit is between %s and %s.\n We continue anyway., - bad_hex, good_hex, mb_hex, bad_hex); + bad_hex, good_hex, name_bad, mb_hex, bad_hex); free(good_hex); } @@ -839,7 +851,7 @@ static void check_good_are_ancestors_of_bad(const char *prefix, int no_checkout) int fd; if (!current_bad_oid) - die(a bad revision is needed); + die(a %s revision is needed, name_bad); /* Check if file BISECT_ANCESTORS_OK exists. */ if
[PATCH v7 5/5] bisect: allows any terms set by user
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr Introduction of the git bisect terms function. The user can set its own terms. It will work exactly like before. The terms must be set before the start. Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr Signed-off-by: Louis Stuber stub...@ensimag.grenoble-inp.fr Signed-off-by: Matthieu Moy matthieu@imag.fr --- Documentation/git-bisect.txt | 19 + git-bisect.sh| 68 t/t6030-bisect-porcelain.sh | 43 3 files changed, 125 insertions(+), 5 deletions(-) diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt index 3c3021a..ef0c03c 100644 --- a/Documentation/git-bisect.txt +++ b/Documentation/git-bisect.txt @@ -133,6 +133,25 @@ You must run `git bisect start` without commits as argument and run `git bisect new rev`/`git bisect old rev...` after to add the commits. +Alternative terms: use your own terms + + +If the builtins terms bad/good and new/old do not satisfy you, you can +set your own terms. + + +git bisect terms term1 term2 + + +This command has to be used before a bisection has started. +The term1 must be associated with the latest revisions and term2 with the +ancestors of term1. + +Only the first bisection following the 'git bisect terms' will use the terms. +If you mistyped one of the terms you can do again 'git bisect terms term1 +term2'. + + Bisect visualize diff --git a/git-bisect.sh b/git-bisect.sh index 73763a2..8ef2b94 100644 --- a/git-bisect.sh +++ b/git-bisect.sh @@ -1,6 +1,6 @@ #!/bin/sh -USAGE='[help|start|bad|good|new|old|skip|next|reset|visualize|replay|log|run]' +USAGE='[help|start|bad|good|new|old|terms|skip|next|reset|visualize|replay|log|run]' LONG_USAGE='git bisect help print this long help message. git bisect start [--no-checkout] [bad [good...]] [--] [pathspec...] @@ -11,6 +11,8 @@ git bisect (bad|new) [rev] git bisect (good|old) [rev...] mark rev... known-good revisions/ revisions before change in a given property. +git bisect terms term1 term2 + set up term1 and term2 as bisection terms. git bisect skip [(rev|range)...] mark rev... untestable revisions. git bisect next @@ -82,6 +84,14 @@ bisect_start() { # revision_seen is true if a git bisect start # has revision as arguments revision_seen=0 + # terms_defined is used to detect if the user + # defined his own terms with git bisect terms + terms_defined=0 + if test -s $GIT_DIR/TERMS_DEFINED + then + terms_defined=1 + get_terms + fi if test z$(git rev-parse --is-bare-repository) != zfalse then mode=--no-checkout @@ -180,10 +190,14 @@ bisect_start() { } git rev-parse --sq-quote $@ $GIT_DIR/BISECT_NAMES eval $eval true - if test $revision_seen -eq 1 test ! -s $GIT_DIR/BISECT_TERMS + if test $revision_seen -eq 1 test ! -s $GIT_DIR/BISECT_TERMS || test $terms_defined -eq 1 then echo $NAME_BAD $GIT_DIR/BISECT_TERMS - echo $NAME_GOOD $GIT_DIR/BISECT_TERMS + echo $NAME_GOOD $GIT_DIR/BISECT_TERMS + if test $terms_defined -eq 1 + then + echo git bisect terms $NAME_BAD $NAME_GOOD $GIT_DIR/BISECT_LOG || exit + fi fi echo git bisect start$orig_args $GIT_DIR/BISECT_LOG || exit # @@ -419,6 +433,7 @@ bisect_clean_state() { rm -f $GIT_DIR/BISECT_NAMES rm -f $GIT_DIR/BISECT_RUN rm -f $GIT_DIR/BISECT_TERMS + rm -f $GIT_DIR/TERMS_DEFINED # Cleanup head-name if it got left by an old version of git-bisect rm -f $GIT_DIR/head-name git update-ref -d --no-deref BISECT_HEAD @@ -447,6 +462,8 @@ bisect_replay () { eval $cmd ;; $NAME_GOOD|$NAME_BAD|skip) bisect_write $command $rev ;; + terms) + bisect_terms $rev ;; *) die $(gettext ?? what are you talking about?) ;; esac @@ -531,7 +548,8 @@ get_terms () { check_and_set_terms () { cmd=$1 case $cmd in - bad|good|new|old) + skip|start|terms) ;; + *) if test -s $GIT_DIR/BISECT_TERMS test $cmd != $NAME_BAD test $cmd != $NAME_GOOD then die $(eval_gettext Invalid command: you're currently in a \$NAME_BAD/\$NAME_GOOD bisect.) @@ -564,6 +582,44 @@ bisect_voc () { esac } +bisect_terms () { + case $# in + 0) + if test -s $GIT_DIR/BISECT_TERMS +
[PATCH v7 1/5] bisect: correction of typo
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr Signed-off-by: Matthieu Moy matthieu@imag.fr --- bisect.c| 2 +- t/t6030-bisect-porcelain.sh | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/bisect.c b/bisect.c index 03d5cd9..5b8357d 100644 --- a/bisect.c +++ b/bisect.c @@ -743,7 +743,7 @@ static void handle_bad_merge_base(void) fprintf(stderr, Some good revs are not ancestor of the bad rev.\n git bisect cannot work properly in this case.\n - Maybe you mistake good and bad revs?\n); + Maybe you mistook good and bad revs?\n); exit(1); } diff --git a/t/t6030-bisect-porcelain.sh b/t/t6030-bisect-porcelain.sh index 06b4868..9e2c203 100755 --- a/t/t6030-bisect-porcelain.sh +++ b/t/t6030-bisect-porcelain.sh @@ -362,7 +362,7 @@ test_expect_success 'bisect starting with a detached HEAD' ' test_expect_success 'bisect errors out if bad and good are mistaken' ' git bisect reset test_must_fail git bisect start $HASH2 $HASH4 2 rev_list_error - grep mistake good and bad rev_list_error + grep mistook good and bad rev_list_error git bisect reset ' -- 2.4.4.414.ge37915c -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v7 4/5] bisect: add the terms old/new
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr When not looking for a regression during a bisect but for a fix or a change in another given property, it can be confusing to use 'good' and 'bad'. This patch introduce `git bisect new` and `git bisect old` as an alternative to 'bad' and good': the commits which have a certain property must be marked as `new` and the ones which do not as `old`. The output will be the first commit after the change in the property. During a new/old bisect session you cannot use bad/good commands and vice-versa. Some commands are still not available for old/new: * git rev-list --bisect does not treat the revs/bisect/new and revs/bisect/old-SHA1 files. Old discussions: - http://thread.gmane.org/gmane.comp.version-control.git/86063 introduced bisect fix unfixed to find fix. - http://thread.gmane.org/gmane.comp.version-control.git/182398 discussion around bisect yes/no or old/new. - http://thread.gmane.org/gmane.comp.version-control.git/199758 last discussion and reviews New discussions: - http://thread.gmane.org/gmane.comp.version-control.git/271320 ( v2 1/7-4/7 ) - http://comments.gmane.org/gmane.comp.version-control.git/271343 ( v2 5/7-7/7 ) Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr Signed-off-by: Louis Stuber stub...@ensimag.grenoble-inp.fr Signed-off-by: Valentin Duperray valentin.duper...@ensimag.imag.fr Signed-off-by: Franck Jonas franck.jo...@ensimag.imag.fr Signed-off-by: Lucien Kong lucien.k...@ensimag.imag.fr Signed-off-by: Thomas Nguy thomas.n...@ensimag.imag.fr Signed-off-by: Huynh Khoi Nguyen Nguyen huynh-khoi-nguyen.ngu...@ensimag.imag.fr Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr Signed-off-by: Matthieu Moy matthieu@imag.fr --- Documentation/git-bisect.txt | 48 ++-- bisect.c | 11 +++--- git-bisect.sh| 30 ++- t/t6030-bisect-porcelain.sh | 38 +++ 4 files changed, 112 insertions(+), 15 deletions(-) diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt index 4cb52a7..3c3021a 100644 --- a/Documentation/git-bisect.txt +++ b/Documentation/git-bisect.txt @@ -18,8 +18,8 @@ on the subcommand: git bisect help git bisect start [--no-checkout] [bad [good...]] [--] [paths...] - git bisect bad [rev] - git bisect good [rev...] + git bisect (bad|new) [rev] + git bisect (good|old) [rev...] git bisect skip [(rev|range)...] git bisect reset [commit] git bisect visualize @@ -104,6 +104,35 @@ For example, `git bisect reset HEAD` will leave you on the current bisection commit and avoid switching commits at all, while `git bisect reset bisect/bad` will check out the first bad revision. + +Alternative terms: bisect new and bisect old + + +If you are not at ease with the terms bad and good, perhaps +because you are looking for the commit that introduced a fix, you can +alternatively use new and old instead. +But note that you cannot mix bad and good with new and old. + + +git bisect new [rev] + + +Marks the commit as new, e.g. the bug is no longer there, if you are looking +for a commit that fixed a bug, or the feature that used to work is now broken +at this point, if you are looking for a commit that introduced a bug. +It is the equivalent of git bisect bad [rev]. + + +git bisect old [rev...] + + +Marks the commit as old, as the opposite of 'git bisect new'. +It is the equivalent of git bisect good [rev...]. + +You must run `git bisect start` without commits as argument and run +`git bisect new rev`/`git bisect old rev...` after to add the +commits. + Bisect visualize @@ -379,6 +408,21 @@ In this case, when 'git bisect run' finishes, bisect/bad will refer to a commit has at least one parent whose reachable graph is fully traversable in the sense required by 'git pack objects'. +* Look for a fix instead of a regression in the code ++ + +$ git bisect start +$ git bisect new HEAD# current commit is marked as new +$ git bisect old HEAD~10 # the tenth commit from now is marked as old + ++ +Let's consider the last commit has a given property, and that we are looking +for the commit which introduced this property. For each commit the bisection +guide us to, we will test if the property is present. If it is we will mark +the commit as new with 'git bisect new', otherwise we will mark it as old. +At the end of the bisect session, the result will be the first new commit (e.g +the first one with the property). + SEE ALSO diff
[PATCH v7 3/5] bisect: simplify the addition of new bisect terms
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr We create a file BISECT_TERMS in the repository .git to be read during a bisection. The fonctions to be changed if we add new terms are quite few. In git-bisect.sh : check_and_set_terms bisect_voc Co-authored-by: Louis Stuber stub...@ensimag.grenoble-inp.fr Tweaked-by: Matthieu Moy matthieu@grenoble-inp.fr Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr Signed-off-by: Louis Stuber stub...@ensimag.grenoble-inp.fr Signed-off-by: Valentin Duperray valentin.duper...@ensimag.imag.fr Signed-off-by: Franck Jonas franck.jo...@ensimag.imag.fr Signed-off-by: Lucien Kong lucien.k...@ensimag.imag.fr Signed-off-by: Thomas Nguy thomas.n...@ensimag.imag.fr Signed-off-by: Huynh Khoi Nguyen Nguyen huynh-khoi-nguyen.ngu...@ensimag.imag.fr Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr Signed-off-by: Matthieu Moy matthieu@imag.fr --- bisect.c | 38 +--- git-bisect.sh | 70 +-- revision.c| 26 -- 3 files changed, 122 insertions(+), 12 deletions(-) diff --git a/bisect.c b/bisect.c index 2d3dbdc..08be634 100644 --- a/bisect.c +++ b/bisect.c @@ -747,7 +747,10 @@ static void handle_bad_merge_base(void) between %s and [%s].\n, bad_hex, bad_hex, good_hex); } else { - die(BUG: terms %s/%s not managed, name_bad, name_good); + fprintf(stderr, The merge base %s is %s.\n + This means the first commit marked %s is + between %s and [%s].\n, + bad_hex, name_bad, name_bad, bad_hex, good_hex); } exit(3); } @@ -902,6 +905,36 @@ static void show_diff_tree(const char *prefix, struct commit *commit) } /* + * The terms used for this bisect session are stored in BISECT_TERMS. + * We read them and store them to adapt the messages accordingly. + * Default is bad/good. + */ +void read_bisect_terms(const char **read_bad, const char **read_good) +{ + struct strbuf str = STRBUF_INIT; + const char *filename = git_path(BISECT_TERMS); + FILE *fp = fopen(filename, r); + + if (!fp) { + if (errno == ENOENT) { + *read_bad = bad; + *read_good = good; + return; + } else { + die(could not read file '%s': %s, filename, + strerror(errno)); + } + } else { + strbuf_getline(str, fp, '\n'); + *read_bad = strbuf_detach(str, NULL); + strbuf_getline(str, fp, '\n'); + *read_good = strbuf_detach(str, NULL); + } + strbuf_release(str); + fclose(fp); +} + +/* * We use the convention that exiting with an exit code 10 means that * the bisection process finished successfully. * In this case the calling shell script should exit 0. @@ -917,8 +950,7 @@ int bisect_next_all(const char *prefix, int no_checkout) const unsigned char *bisect_rev; char bisect_rev_hex[GIT_SHA1_HEXSZ + 1]; - name_bad = bad; - name_good = good; + read_bisect_terms(name_bad, name_good); if (read_bisect_refs()) die(reading bisect refs failed); diff --git a/git-bisect.sh b/git-bisect.sh index ce6412f..7bb18db 100644 --- a/git-bisect.sh +++ b/git-bisect.sh @@ -77,6 +77,9 @@ bisect_start() { orig_args=$(git rev-parse --sq-quote $@) bad_seen=0 eval='' + # revision_seen is true if a git bisect start + # has revision as arguments + revision_seen=0 if test z$(git rev-parse --is-bare-repository) != zfalse then mode=--no-checkout @@ -101,6 +104,9 @@ bisect_start() { die $(eval_gettext '\$arg' does not appear to be a valid revision) break } + + revision_seen=1 + case $bad_seen in 0) state=$NAME_BAD ; bad_seen=1 ;; *) state=$NAME_GOOD ;; @@ -172,6 +178,11 @@ bisect_start() { } git rev-parse --sq-quote $@ $GIT_DIR/BISECT_NAMES eval $eval true + if test $revision_seen -eq 1 test ! -s $GIT_DIR/BISECT_TERMS + then + echo $NAME_BAD $GIT_DIR/BISECT_TERMS + echo $NAME_GOOD $GIT_DIR/BISECT_TERMS + fi echo git bisect start$orig_args $GIT_DIR/BISECT_LOG || exit # # Check if we can proceed to the next bisect state. @@ -232,6 +243,7 @@ bisect_skip() { bisect_state() { bisect_autostart state=$1 + check_and_set_terms $state case $#,$state
[PATCH v7 0/5] git bisect old/new
Hi, I fixed a few minor issues in v6. Patch between my version and v6 is below. I also pushed my branch here: https://github.com/moy/git/tree/bisect-terms Not visible in the patch below: I squashed PATCH 5 into PATCH 3 to avoid the pattern break stuff and then repair it. The first two patches seem ready. PATCH 4 (add old/new) is still buggy. When starting a bisection with git bisect start $old $new the command git bisect visualize does not work (it shows no commit). I consider PATCH 5 as WIP, I think it would need a lot of polishing and testing to be mergeable. I think a reasonable objective for now it to get old/new working in the user-interface, and drop PATCH 5 from the series when it gets merged. The existance of PATCH 5 is a very good thing even if it doesn't get merged: * The fact that it's possible to do it on top of the series shows that we make the code more generic. I think it's important that regardless of features, the code moves in the right direction. * The patch can be taken over later by someone else. diff --git a/bisect.c b/bisect.c index 7492fdc..ab09650 100644 --- a/bisect.c +++ b/bisect.c @@ -921,7 +921,7 @@ void read_bisect_terms(const char **read_bad, const char **read_good) FILE *fp = fopen(filename, r); if (!fp) { - if (errno==2) { + if (errno == ENOENT) { *read_bad = bad; *read_good = good; return; diff --git a/git-bisect.sh b/git-bisect.sh index 7da22b1..8ef2b94 100644 --- a/git-bisect.sh +++ b/git-bisect.sh @@ -541,7 +541,7 @@ get_terms () { { read NAME_BAD read NAME_GOOD - }$GIT_DIR/BISECT_TERMS + } $GIT_DIR/BISECT_TERMS fi } @@ -605,8 +605,8 @@ bisect_terms () { echo 1 $GIT_DIR/TERMS_DEFINED echo git bisect terms $NAME_BAD $NAME_GOOD $GIT_DIR/BISECT_LOG || exit else - die $(gettext A bisection has already started, and you can't change terms in the middle of it. -Use 'git bisect terms' to see the current terms. + die $(gettext A bisection has already started, and you can't change terms in the middle of it. +Use 'git bisect terms' to see the current terms. Otherwise, to start a new bisection with new terms, please use 'git bisect reset' and set the terms before the start) fi ;; *) diff --git a/revision.c b/revision.c index 27750ac..f22923f 100644 --- a/revision.c +++ b/revision.c @@ -2083,18 +2083,28 @@ extern void read_bisect_terms(const char **bad, const char **good); static int for_each_bad_bisect_ref(const char *submodule, each_ref_fn fn, void *cb_data) { - char bisect_refs_path[256]; - strcpy(bisect_refs_path, refs/bisect/); - strcat(bisect_refs_path, name_bad); - return for_each_ref_in_submodule(submodule, bisect_refs_path, fn, cb_data); + struct strbuf bisect_refs_buf = STRBUF_INIT; + const char *bisect_refs_str; + int status; + strbuf_addstr(bisect_refs_buf, refs/bisect/); + strbuf_addstr(bisect_refs_buf, name_bad); + bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL); + status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, cb_data); + free((char *)bisect_refs_str); + return status; } static int for_each_good_bisect_ref(const char *submodule, each_ref_fn fn, void *cb_data) { - char bisect_refs_path[256]; - strcpy(bisect_refs_path, refs/bisect/); - strcat(bisect_refs_path, name_good); - return for_each_ref_in_submodule(submodule, bisect_refs_path, fn, cb_data); + struct strbuf bisect_refs_buf = STRBUF_INIT; + const char *bisect_refs_str; + int status; + strbuf_addstr(bisect_refs_buf, refs/bisect/); + strbuf_addstr(bisect_refs_buf, name_bad); + bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL); + status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, cb_data); + free((char *)bisect_refs_str); + return status; } static int handle_revision_pseudo_opt(const char *submodule, diff --git a/t/t6030-bisect-porcelain.sh b/t/t6030-bisect-porcelain.sh index d91116e..289dbb0 100755 --- a/t/t6030-bisect-porcelain.sh +++ b/t/t6030-bisect-porcelain.sh @@ -781,12 +781,12 @@ test_expect_success 'bisect start with one new and old' ' git bisect new git bisect new bisect_result grep $HASH2 is the first new commit bisect_result - git bisect log log_to_replay.txt + git bisect log log_to_replay.txt git bisect reset ' test_expect_success 'bisect replay with old and new' ' - git bisect replay log_to_replay.txt bisect_result + git bisect replay log_to_replay.txt bisect_result grep $HASH2 is the first new commit bisect_result git bisect reset '
Re: [PATCH] apply: fix adding new files on i-t-a entries
Junio C Hamano gits...@pobox.com writes: I think I'm opening a can of worms with d95d728 I cannot offhand convince myself that apply is the only casualty; assuming it is, I think a reasonable way forward is to keep d95d728 and adjust apply to the new world order. Otherwise, i.e. if there are wider fallouts from d95d728, we may instead want to temporarily revert it off from 'master', deal with fallouts to apply and other things, before resurrecting it. Anything that internally uses diff-index is suspect, I think. What do others think? You seem to ... So far blame, rm and checkout-entry and checkout paths are on my to-think-or-fix list. But this patch can get in early to fix a real regression instead of waiting for one big series. A lot more discussions will be had before that series gets in good shape. ... think that the damage could be quite extensive, so I am inclined to say that we first revert d95d728 before going forward. Let's do this on 'nd/diff-i-t-a' topic and merge the result immediately to 'master' for now. -- 8 -- From: Junio C Hamano gits...@pobox.com Date: Tue, 23 Jun 2015 10:27:47 -0700 Subject: [PATCH] Revert diff-lib.c: adjust position of i-t-a entries in diff This reverts commit d95d728aba06a34394d15466045cbdabdada58a2. It turns out that many other commands that need to interact with the result of running diff-files and diff-index, e.g. git apply, git rm, etc., need to be adjusted to the new world order it brings in. For example, it would break this sequence to correct a whitespace breakage in the parts you changed: git add -N file git diff --cached file | git apply --cached --whitespace=fix git checkout file In the old world order, diff showed a patch to modify an existing empty file by adding its full contents, and apply updated the index by modifying the existing empty blob (which is what an Intent-to-Add entry records in the index) with that patch. In the new world order, diff shows a patch to create a new file with its full contents, but because apply thinks that the i-t-a entry already exists in the index, it refused to accept a creation. Adjusting apply to this new world order is easy, but we need to assess the extent of the damage to the rest of the system the new world order brought in before going forward and adjust them all, after which we can resurrect the commit being reverted here. --- builtin/add.c | 1 - diff-lib.c | 12 t/t2203-add-intent.sh | 23 --- t/t4011-diff-symlink.sh | 10 -- 4 files changed, 8 insertions(+), 38 deletions(-) diff --git a/builtin/add.c b/builtin/add.c index ee370b0..3390933 100644 --- a/builtin/add.c +++ b/builtin/add.c @@ -63,7 +63,6 @@ static void update_callback(struct diff_queue_struct *q, switch (fix_unmerged_status(p, data)) { default: die(_(unexpected diff status %c), p-status); - case DIFF_STATUS_ADDED: case DIFF_STATUS_MODIFIED: case DIFF_STATUS_TYPE_CHANGED: if (add_file_to_index(the_index, path, data-flags)) { diff --git a/diff-lib.c b/diff-lib.c index 714501a..a85c497 100644 --- a/diff-lib.c +++ b/diff-lib.c @@ -212,11 +212,6 @@ int run_diff_files(struct rev_info *revs, unsigned int option) ce-sha1, !is_null_sha1(ce-sha1), ce-name, 0); continue; - } else if (ce-ce_flags CE_INTENT_TO_ADD) { - diff_addremove(revs-diffopt, '+', ce-ce_mode, - EMPTY_BLOB_SHA1_BIN, 0, - ce-name, 0); - continue; } changed = match_stat_with_submodule(revs-diffopt, ce, st, @@ -381,13 +376,6 @@ static void do_oneway_diff(struct unpack_trees_options *o, struct rev_info *revs = o-unpack_data; int match_missing, cached; - /* i-t-a entries do not actually exist in the index */ - if (idx (idx-ce_flags CE_INTENT_TO_ADD)) { - idx = NULL; - if (!tree) - return; /* nothing to diff.. */ - } - /* if the entry is not checked out, don't examine work tree */ cached = o-index_only || (idx ((idx-ce_flags CE_VALID) || ce_skip_worktree(idx))); diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh index 7c641bf..2a4a749 100755 --- a/t/t2203-add-intent.sh +++ b/t/t2203-add-intent.sh @@ -5,24 +5,10 @@ test_description='Intent to add' . ./test-lib.sh test_expect_success 'intent to add' ' - test_commit 1 - git rm 1.t - echo hello 1.t echo hello file echo hello elif git add -N file - git add elif -
Re: [PATCH v7 3/5] bisect: simplify the addition of new bisect terms
On Tue, Jun 23, 2015 at 8:54 AM, Matthieu Moy matthieu@imag.fr wrote: diff --git a/revision.c b/revision.c index 3ff8723..f22923f 100644 --- a/revision.c +++ b/revision.c @@ -2076,14 +2079,32 @@ void parse_revision_opt(struct rev_info *revs, struct parse_opt_ctx_t *ctx, ctx-argc -= n; } +extern void read_bisect_terms(const char **bad, const char **good); + static int for_each_bad_bisect_ref(const char *submodule, each_ref_fn fn, void *cb_data) { - return for_each_ref_in_submodule(submodule, refs/bisect/bad, fn, cb_data); + struct strbuf bisect_refs_buf = STRBUF_INIT; + const char *bisect_refs_str; + int status; + strbuf_addstr(bisect_refs_buf, refs/bisect/); + strbuf_addstr(bisect_refs_buf, name_bad); A single strbuf_addf() rather than two strbuf_addstr()s? + bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL); + status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, cb_data); + free((char *)bisect_refs_str); Why the above rather than the simpler? strbuf_addstr(bisect_refs, ...); status = for_each_ref_in_submodule(submodule, bisect_refs.buf, fn, cb_data); strbuf_release(bisect_refs); What am I missing? + return status; } static int for_each_good_bisect_ref(const char *submodule, each_ref_fn fn, void *cb_data) { - return for_each_ref_in_submodule(submodule, refs/bisect/good, fn, cb_data); + struct strbuf bisect_refs_buf = STRBUF_INIT; + const char *bisect_refs_str; + int status; + strbuf_addstr(bisect_refs_buf, refs/bisect/); + strbuf_addstr(bisect_refs_buf, name_bad); + bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL); + status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, cb_data); + free((char *)bisect_refs_str); Ditto. + return status; } -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
[+ronniesahlb...@gmail.com, FYI] On Mon, Jun 22, 2015 at 5:50 PM, David Turner dtur...@twopensource.com wrote: I've revived and modified Ronnie Sahlberg's work on the refs db backend. Awesome! The work is on top of be3c13e5564, Junio's First batch for 2.5 cycle. I recognize that there have been changes to the refs code since then, and that there are some further changes in-flight from e.g. Michael Haggerty. If there is interest in this, I can rebase once Michael's changes land. Originally I wanted to continue on Ronnies work, but because of the churn in refs I stopped it for a while and took care of other projects (and wanted to come back eventually). Thanks for reviving this topic! The changes can be found here: https://github.com/dturner-tw/git.git on the dturner/pluggable-backends branch The db backend code was added in the penultimate commit; the rest is just code rearrangement and minor changes to make alternate backends possible. There ended up being a fair amount of this rearrangement, but the end result is that almost the entire git test suite runs under the db backend without error (see below for details). Looking at the end result in refs-be-db.c it feels like there are more functions in the refs_be_db struct, did this originate from other design choices? IIRC Ronnie wanted to have as least functions in there as possible, and share as much of the code between the databases, such that the glue between the db and the refs code is minimal. Some random comments from looking over the branch briefly: In the latest commit, (refs: tests for db backend), I am unsure about the copyright annotations. At least a sole Copyright (c) 2007 Junio C Hamano doesn't make sense to me. ;) Typo in commit message bisect: use refs insfrastructure for BISECT_START Some commits contain a ChangeId, which is a Gerrit leftover. :( Thanks, Stefan The db backend runs git for-each-ref about 30% faster than the files backend with fully-packed refs on a repo with ~120k refs. It's also about 4x faster than using fully-unpacked refs. In addition, and perhaps more importantly, it avoids case-conflict issues on OS X. I chose to use LMDB for the database. LMDB has a few features that make it suitable for usage in git: 1. It is relatively lightweight; it requires only one header file, and the library itself is under 300k (as opposed to 700k for e.g. sqlite). 2. It is well-tested: it's been used in OpenLDAP for years. 3. It's very fast. LMDB's benchmarks show that it is among the fastest key-value stores. 4. It has a relatively simple concurrency story; readers don't block writers and writers don't block readers. Ronnie Sahlberg's original version of this patchset used tdb. The advantage of tdb is that it's smaller (~125k). The disadvantages are that tdb is hard to build on OS X. It's also not in homebrew. So lmdb seemed simpler. To test this backend's correctness, I hacked test-lib.sh and test-lib-functions.sh to run all tests under the refs backend. Dozens of tests use manual ref/reflog reading/writing, or create submodules without passing --refs-backend-type to git init. If those tests are changed to use the update-ref machinery or test-refs-be-db (or, in the case of packed-refs, corrupt refs, and dumb fetch tests, are skipped), the only remaining failing tests are the git-new-workdir tests and the gitweb tests. Please let me know how it would be best to proceed. -- To unsubscribe from this list: send the line unsubscribe git in -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On Tue, 2015-06-23 at 07:47 -0400, Jeff King wrote: On Mon, Jun 22, 2015 at 08:50:56PM -0400, David Turner wrote: The db backend runs git for-each-ref about 30% faster than the files backend with fully-packed refs on a repo with ~120k refs. It's also about 4x faster than using fully-unpacked refs. In addition, and perhaps more importantly, it avoids case-conflict issues on OS X. Neat. Can you describe a bit more about the reflog handling? One of the problems we've had with large-ref repos is that the reflog storage is quite inefficient. You can pack all the refs, but you may still be stuck with a bunch of reflog files with one entry, wasting a whole inode. Doing a git repack when you have a million of those has horrible cold-cache performance. Basically anything that isn't one-file-per-reflog would be a welcome change. :) Reflogs are stored in the database as well. There is one header entry per ref to indicate that a reflog is present, and then one database entry per reflog entry; the entries are stored consecutively and immediately following the header so that it's fast to iterate over them. It has also been a dream of mine to stop tying the reflogs specifically to the refs. I.e., have a spot for reflogs of branches that no longer exist, which allows us to retain them for deleted branches. Then you can possibly recover from a branch deletion, whereas now you have to dig through git fsck's dangling output. And the reflog, if you don't expire it, becomes a suitable audit log to find out what happened to each branch when (whereas now it is full of holes when things get deleted). That would be cool, and I don't think it would be hard to add to my current code; we could simply replace the header with a tombstone. But I would prefer to wait until the series is merged; then we can build on top of it. I dunno. Maybe I am overthinking it. But it really feels like the _refs_ are a key/value thing, but the _reflogs_ are not. You can cram them into a key/value store, but you're probably operating on them as a big blob, then. Reflogs are, conceptually, queues. I agree that a raw key-value store is not a good way to store queues, but a B-Tree is not so terrible, since it offers relatively fast iteration (amortized constant time IIRC). I chose to use LMDB for the database. LMDB has a few features that make it suitable for usage in git: One of the complaints that Shawn had about sqlite is that there is no native Java implementation, which makes it hard for JGit to ship a compatible backend. I suspect the same is true for LMDB, but it is probably a lot simpler than sqlite (so reimplementation might be possible). But it may also be worth going with a slightly slower database if we can get wider compatibility for free. There's a JNI interface to LMDB, which is, of course, not native. I don't think it would be too hard to entirely rewrite LMDB in Java, but I'm not going to have time to do it for the forseeable future. I've asked Howard Chu if he knows of any efforts in progress. To test this backend's correctness, I hacked test-lib.sh and test-lib-functions.sh to run all tests under the refs backend. Dozens of tests use manual ref/reflog reading/writing, or create submodules without passing --refs-backend-type to git init. If those tests are changed to use the update-ref machinery or test-refs-be-db (or, in the case of packed-refs, corrupt refs, and dumb fetch tests, are skipped), the only remaining failing tests are the git-new-workdir tests and the gitweb tests. I think we'll need to bump core.repositoryformatversion, too. See the patches I just posted here: http://thread.gmane.org/gmane.comp.version-control.git/272447 Thanks, that's valuable. For the refs backend, opening the LMDB database for writing is sufficient to block other writers. Do you think it would be valuable to provide a git hold-ref-lock command that simply reads refs from stdin and keeps them locked until it reads EOF from stdin? That would allow cross-backend ref locking. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 3/5] bisect: simplify the addition of new bisect terms
Eric Sunshine sunsh...@sunshineco.com writes: On Tue, Jun 23, 2015 at 8:54 AM, Matthieu Moy matthieu@imag.fr wrote: + strbuf_addstr(bisect_refs_buf, refs/bisect/); + strbuf_addstr(bisect_refs_buf, name_bad); A single strbuf_addf() rather than two strbuf_addstr()s? + bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL); + status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, cb_data); + free((char *)bisect_refs_str); Why the above rather than the simpler? strbuf_addstr(bisect_refs, ...); status = for_each_ref_in_submodule(submodule, bisect_refs.buf, fn, cb_data); strbuf_release(bisect_refs); What am I missing? Indeed, your version is much better. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
On Mon, Jun 22, 2015 at 5:42 PM, Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr wrote: Instead of removing a line to remove the commit, you can use the command drop (just like pick or edit). It has the same effect as deleting the line (removing the commit) except that you keep a visual trace of your actions, allowing a better control and reducing the possibility of removing a commit by mistake. Signed-off-by: Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr --- diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh index ac429a0..ecd277c 100755 --- a/t/t3404-rebase-interactive.sh +++ b/t/t3404-rebase-interactive.sh @@ -1102,4 +1102,20 @@ test_expect_success 'rebase -i commits that overwrite untracked files (no ff)' ' test $(git cat-file commit HEAD | sed -ne \$p) = I ' +test_rebase_end () { + test_when_finished git checkout master + git branch -D $1 + test_might_fail git rebase --abort + git checkout -b $1 master +} The way this is indented makes it difficult to see that lines 2 and 3 are continuations of 1. Perhaps format it like this instead? test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } + +test_expect_success 'drop' ' + test_rebase_end dropTest + set_fake_editor + FAKE_LINES=1 drop 2 3 drop 4 5 git rebase -i --root + test E = $(git cat-file commit HEAD | sed -ne \$p) + test C = $(git cat-file commit HEAD^ | sed -ne \$p) + test A = $(git cat-file commit HEAD^^ | sed -ne \$p) +' + test_done -- 2.4.3.371.g8992f2a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On Tue, 2015-06-23 at 17:23 +0700, Duy Nguyen wrote: On Tue, Jun 23, 2015 at 7:50 AM, David Turner dtur...@twopensource.com wrote: To test this backend's correctness, I hacked test-lib.sh and test-lib-functions.sh to run all tests under the refs backend. Now we have two. split-index also benefits from running through full test suite like this. I propose we make make test run the test suite twice. The first run is with default configuration, no split index, no fancy ref backend. The second run enables split-index and switches to new backend, running through all test cases. In future we can also enable packv4 in this second run. There won't be a third run. When the second ref backend comes, we can switch between the two backends using a random number generator where we control both algorithm and seed, so that when a test fails, the user can give us their seed and we can re-run with the same configuration. I'm not in love with this idea, because it makes it hard to do exhaustive testing efficiently. I would rather have make test run through all tests under all combinations -- or at least all relevant tests. We could perhaps mark tests with a list of features that they exercise, so that we don't have to run e.g. t8xxx with alternate refs backends. Dozens of tests use manual ref/reflog reading/writing, or create submodules without passing --refs-backend-type to git init. If those tests are changed to use the update-ref machinery or test-refs-be-db (or, in the case of packed-refs, corrupt refs, and dumb fetch tests, are skipped), the only remaining failing tests are the git-new-workdir tests and the gitweb tests. I haven't read the series, but I guess you should also add a few tests to run on the first run, so new code is exercised a bit even if people skip the second run. I did this already, yes. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Eric Sunshine sunsh...@sunshineco.com writes: +test_rebase_end () { + test_when_finished git checkout master + git branch -D $1 + test_might_fail git rebase --abort + git checkout -b $1 master +} The way this is indented makes it difficult to see that lines 2 and 3 are continuations of 1. Perhaps format it like this instead? test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } I completely agree with you, moreover it was indented like this before. I'll change it in my local version for now. Ironically, it was modified after the following: Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes: Eric Sunshine sunsh...@sunshineco.com writes: +test_expect_success 'rebase -i respects rebase.missingCommitsCheck=ignore' ' + test_config rebase.missingCommitsCheck ignore + test_when_finished git checkout master + git branch -D tmp2 Strange indentation. Considering that 'git branch -D tmp2' is a part of test_when_finished, I wasn't sure of how it was supposed to be indented, so I did it this way to show that it was still within test_when_finished and not a separate command. test_when_finished git checkout master git branch -D tmp2 Doesn't seem as clear, especially if you quickly read the lines. For now, I have removed the tab. :p Thanks, Rémi -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes: +test_rebase_end () { + test_when_finished git checkout master + git branch -D $1 Is this one guaranteed to succeed? Do we want to consider it a failure to remove $1 (e.g. dropTest)? $ git branch -D no-such-branch ; echo $? error: branch 'no-such-branch' not found. 1 If dropTest branch did not exist before the test that begins with a call to this function, what happens? Besides, a function that must be called at the beginning of a test piece has a name that ends with _end? That sounds funny, no? + test_might_fail git rebase --abort + git checkout -b $1 master +} I'm wondering if this is not sufficient. test_rebase_i_drop_prepare () { git reset --hard git checkout -B $1 master } I am guessing that you named _end because it has when_finished, but as far as I can tell, even after these three patches, the tests do not really rely on the fact that it is on 'master' branch. More importantly, just being on 'master' branch is not a sufficient cleanliness for the next test (and that is why you added these branch -D and might-fail rebase --abort to this function in the first place), it seems. So... +test_expect_success 'drop' ' + test_rebase_end dropTest + set_fake_editor + FAKE_LINES=1 drop 2 3 drop 4 5 git rebase -i --root + test E = $(git cat-file commit HEAD | sed -ne \$p) + test C = $(git cat-file commit HEAD^ | sed -ne \$p) + test A = $(git cat-file commit HEAD^^ | sed -ne \$p) +' + test_done -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Matthieu Moy matthieu@grenoble-inp.fr writes: Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes: Eric Sunshine sunsh...@sunshineco.com writes: +test_rebase_end () { + test_when_finished git checkout master + git branch -D $1 + test_might_fail git rebase --abort + git checkout -b $1 master +} The way this is indented makes it difficult to see that lines 2 and 3 are continuations of 1. Perhaps format it like this instead? test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } I completely agree with you, moreover it was indented like this before. I'll change it in my local version for now. Perhaps to avoid confusion, stg like: test_when_finished ... ... git checkout (the closing alone on its line) I think that the indentation on its own is enough to avoid confusion test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } but your idea is fine as well, so I'm ok with either way. Thanks, Rémi -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 3/3] git rebase -i: add static check for commands and SHA-1
Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes: I used: read -r command sha1 rest EOF $line EOF because printf '%s' $line | read -r command sha1 rest doesn't work (the 3 variables have no value as a result). There might be a better way to do this, but I don't have it right now. while read line do ( IFS=' ' set x $line shift # now $1 is your command, $2 is sha1, $3 is remainder ... ) done perhaps? But more importantly, why do you even need to keep the bad ones in a separate .badcmd and .badsha files? Isn't that bloating your changes unnecessarily, iow, if you issued your warning as you encounter them, wouldn't the change become cleaner and easier to understand (and as a side effect it may even become smaller)? The _only_ thing that you would get by keeping them in temporary files is that you can do one header and bunch of errors, but is it so common to make a bad edit to the insn sheet that a sequence of errors, one per line becomes more burdensome to the end user? I would think stripspace | while read -r command sha1 rest do ... and showing the warning as you detect inside that loop would be sufficient. Perhaps I am missing subtle details of what you are doing. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes: Eric Sunshine sunsh...@sunshineco.com writes: +test_rebase_end () { + test_when_finished git checkout master + git branch -D $1 + test_might_fail git rebase --abort + git checkout -b $1 master +} The way this is indented makes it difficult to see that lines 2 and 3 are continuations of 1. Perhaps format it like this instead? test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } I completely agree with you, moreover it was indented like this before. I'll change it in my local version for now. Perhaps to avoid confusion, stg like: test_when_finished ... ... git checkout (the closing alone on its line) -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 4/5] bisect: add the terms old/new
Matthieu Moy matthieu@imag.fr writes: Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr Signed-off-by: Matthieu Moy matthieu@imag.fr Sounds like you went all out with this patch. Rémi -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Dependency Management
If neither git-submodule nor git-subtree is palatable to you, here are a couple of alternatives you might try: * https://github.com/ingydotnet/git-subrepo * https://github.com/tdd/git-stree On Tue, Jun 23, 2015 at 1:36 PM Stefan Beller sbel...@google.com wrote: On Tue, Jun 23, 2015 at 1:52 AM, Jean Audibert jaudib...@euronext.com wrote: Hi, Sorry to bother you with this question but I can't find any official answer or strong opinion from Git community. In my company we recently started to use Git and we wonder how to share code and manage dependencies with Git? Use case: in project P we need to include lib-a and lib-b (libraries shared by several projects) In your opinion, what is the future proof solution? * Use submodule * Use subtree We know there is lot of PRO/CONS but I feel that subtree is behind in the race and the latest version of submodule work fine Use whatever works fine for your use case. My personal opinion/expectation is to see submodules improving/advancing more than subtrees advancing in the near future. Though this is neither the official nor a strong opinion. Stefan Suggestions are very welcome. Thanks in advance, Jean Audibert _ This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of Euronext N.V. or any of its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
On Tue, Jun 23, 2015 at 3:01 PM, Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr wrote: Eric Sunshine sunsh...@sunshineco.com writes: +test_rebase_end () { + test_when_finished git checkout master + git branch -D $1 + test_might_fail git rebase --abort + git checkout -b $1 master +} The way this is indented makes it difficult to see that lines 2 and 3 are continuations of 1. Perhaps format it like this instead? test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } I completely agree with you, moreover it was indented like this before. I'll change it in my local version for now. Ironically, it was modified after the following: Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes: Eric Sunshine sunsh...@sunshineco.com writes: +test_expect_success 'rebase -i respects rebase.missingCommitsCheck=ignore' ' + test_config rebase.missingCommitsCheck ignore + test_when_finished git checkout master + git branch -D tmp2 Strange indentation. Clearly, that guy who made the Strange indentation review comment didn't know what he was talking about. ;-) In that earlier review, I must have missed the fact that the quoted string spanned multiple lines, and, unfortunately, got too busy with other things to say ah, the indentation makes perfect sense when your response pointed out its true nature. Matthieu's suggestion of formatting it like: test_when_finished ... ... should help to avoid such misconceptions. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On Tue, 2015-06-23 at 17:51 +0200, Michael Haggerty wrote: On 06/23/2015 02:50 AM, David Turner wrote: I've revived and modified Ronnie Sahlberg's work on the refs db backend. The work is on top of be3c13e5564, Junio's First batch for 2.5 cycle. I recognize that there have been changes to the refs code since then, and that there are some further changes in-flight from e.g. Michael Haggerty. If there is interest in this, I can rebase once Michael's changes land. It's awesome that you are working on this! I'm reading through your commits and will add comments as they pop into my head... * I initially read refs-be-files to be a short version of references, they be files. I might never be able to get that pronunciation out of my head :-) That's OK so long as I can keep pronouncing reflog as re-flog. ;) * It would be more modest to call the files implementing the LMDB backend refs-be-lmdb.[c,h] rather than refs-be-db.[c,h]. Agreed. Will fix. * I wonder whether `refname_is_safe()` might eventually have to become backend-specific. For example, maybe one backend will have to impose a limit of 128 characters or something? No matter, though...it can be moved later. I think it would be an error to allow backends to impose additional limits on ref names. The limits imposed by the current backend have been the cause of much sadness here at Twitter (primarily, case-conflicts combined with d/f conflicts). * You have put `format_reflog_msg()` in the public interface. It probably makes sense, because more than one backend might want to use it. But another backend might want to store (refname, old_sha1, new_sha1, ...) as separate columns in a database. As long as `format_reflog_msg()` is seen as a helper function and is not called by any of the shared code, it shouldn't be a problem. Agreed. * I wonder whether `init_backend()` will be general enough. We can always upgrade it later. * Your methods for bulk updates are I think analogous to the `initial_ref_transaction_commit()` function that I recently submitted [1]. Either way, the goal is to abstract away the fact that the file-based backend uses packed and loose references with tradeoffs that callers currently have to know about. Yes, I saw your work after I had already started mine. * I don't like the fact that you have replaced `struct ref_transaction *` with `void *` in the public interface. On a practical level, I like the bit of type-safety that comes with the more specific declaration. But on a more abstract level, I think that the concept of a transaction could be useful across backends, for example in utility functions that verify that a proposed set of updates are internally consistent. I would rather see either * backends extend a basic `struct ref_transaction` to suit their needs, and upcast/downcast pointers at the module boundary, or * `struct ref_transaction` itself gets a `void *` member that backends can use for whatever purposes they want. There are no common fields between refs-be-file transactions and refs-be-lmdb transactions. I don't see much gain from adding an empty ref_transaction that backends could extend, since we would have to explicitly upcast/downcast all over the place. * Regarding MERGE_HEAD: you take the point of view that it must continue to be stored as a file. And yet it must also behave somewhat like a reference; for example, `git rev-parse MERGE_HEAD` works today. MERGE_HEAD is also used for reachability, right? Another point of view is that MERGE_HEAD is a plain old boring reference, but there is some other metadata related to it that the refs backend has to store. The file-based backend would have special-case code to read the additional data from the tail of the loose refs file (and be sure to write the metadata when writing the reference), but other backends could store the reference with the rest but do their own thing with the metadata. So I guess I'm wondering whether the refs API needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with its metadata. You are probably right that this is a good idea. * Don't the same considerations that apply to MERGE_HEAD also apply to FETCH_HEAD? All of the tests pass without any special handling of FETCH_HEAD. * I'm showing my ignorance of LMDB, but I don't see where the LMDB backend initializes its database during `git init-db`. Is that what `init_env()` does? But it looks like `init_env()` is called on every git invocation (via `git_config_early()`). Puzzled. There is no need to explicitly create the database (other than with mkdir); init_env does everything for you. * Meanwhile, `create_default_files()` (in `builtin/init-db`) still creates directories `refs`, `refs/heads`, and `refs/tags`. Yeah, that's legit. I'll create a backend initdb function, and rename init to setup. * Rehash of the last two points: I expected one backend function that is used to
Re: [PATCH v7 5/5] bisect: allows any terms set by user
Matthieu Moy matthieu@imag.fr writes: Subject: Re: [PATCH v7 5/5] bisect: allows any terms set by user s/allows/allow/; +Alternative terms: use your own terms + Lengths of underline and the text must match. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git status --diffstat?
Hi, I recently realized that I could use a git status syntax like this: On branch master Your branch is up-to-date with 'origin/master'. Changes not staged for commit: (use git add file... to update what will be committed) (use git checkout -- file... to discard changes in working directory) modified: macd.py 2 +- modified: macd.wsgi 2 +- modified: macd/admin.py 4 modified: macd/index.html 2 +- modified: macd/models.py 7 +-- modified: macd/settings.py27 +-- modified: macd/urls.py 6 ++ modified: macd/views.py27 +++ (...) The idea is to add diffstats to git-status. I cooked up a Python script to do that [1], but I'd like that to be default behavior on my box. Someone suggested me to just implement it in C, but I'm not familiar with the codebase, so it'd take me a while. What do you think about this? Would anybody else find it useful and perhaps consider implementing it? Cheers, d33tah signature.asc Description: OpenPGP digital signature
Re: [PATCH v7 0/5] git bisect old/new
Matthieu Moy matthieu@imag.fr writes: I fixed a few minor issues in v6. Patch between my version and v6 is below. I also pushed my branch here: https://github.com/moy/git/tree/bisect-terms It is somewhat confusing to see v3 yesterday and then this v7 next day. How did I miss v4 thru v6? Not visible in the patch below: I squashed PATCH 5 into PATCH 3 to avoid the pattern break stuff and then repair it. Good. The first two patches seem ready. Yeah, the first one is obviously fine ;-), and I agree the second one looks more or less OK. Regarding the second and third one, the messages they give when the user marked one tip of a side branch as old and the other new gave me a hiccup while reading them, though. if (!strcmp(name_bad, bad)) { fprintf(stderr, The merge base %s is bad.\n This means the bug has been fixed between %s and [%s].\n, bad_hex, bad_hex, good_hex); } else { fprintf(stderr, The merge base %s is %s.\n This means the first commit marked %s is between %s and [%s].\n, bad_hex, name_bad, name_bad, bad_hex, good_hex); The bad side is inherited from the original and not your fault, but it was already very hard to understand. The other side is not just unreadable, but I think is incorrect and confusing to say first commit marked %(name_bad)s; you know there are history segments whose oldest ends (i.e. merge base that is bad) are marked as 'bad', and the other ends are marked as 'good', and you haven't marked any of the commits in between yet. So there is no first commit marked either as bad or good there between these endpoints (yet). Also I was somewhat puzzled and disappointed to still see name_{bad,good} not name_{new,old} used as variable names even in the endgame patch, though. Is that intended? PATCH 4 (add old/new) is still buggy. When starting a bisection with git bisect start $old $new the command git bisect visualize does not work (it shows no commit). I consider PATCH 5 as WIP, I think it would need a lot of polishing and testing to be mergeable. I think a reasonable objective for now it to get old/new working in the user-interface, and drop PATCH 5 from the series when it gets merged. The existance of PATCH 5 is a very good thing even if it doesn't get merged: * The fact that it's possible to do it on top of the series shows that we make the code more generic. I think it's important that regardless of features, the code moves in the right direction. * The patch can be taken over later by someone else. Yeah, if I may rephrase to make sure we are on the same page, in order for 5/5 to be done sanely, 1-4/5 must be giving a good foundation to build on. I agree with that, I agree that including a polished 5/5 would be a good thing, and then I further agree that 1-4/5 could be delivered before 5/5 is ready. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: RFC/Pull Request: Refs db backend
-Original Message- From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On Behalf Of David Turner Sent: June 23, 2015 4:22 PM To: Randall S. Becker Cc: 'Stefan Beller'; 'git mailing list'; 'ronnie sahlberg' Subject: Re: RFC/Pull Request: Refs db backend Just to beg a request: LMDB is not available on some MPP architectures to which git has been ported. If it comes up, I beg you not to add this as a dependency to base git components. My changes make `configure` check for the presence of liblmdb. The LMDB code is only built if liblmdb is present. So, I think we're good. Thanks :) You have no idea how much, in a burnt by that in other projects POV. Cheers, Randall -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Matthieu Moy matthieu@grenoble-inp.fr writes: Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes: I think that the indentation on its own is enough to avoid confusion test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } but your idea is fine as well, so I'm ok with either way. Read too quickly, it looks like a mis-indentation (I could laugh at Eric here, but I made the same confusion when reading the code at first). By avoid the confusion I mean make it clear it's not a mis-indentation. Yes, that stray fooled me as well. If it were following your suggestion in the earlier message on this thread, i.e. test_when_finished ... ... git checkout I wouldn't have to waste time commenting on it ;-) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Junio C Hamano gits...@pobox.com writes: Matthieu Moy matthieu@grenoble-inp.fr writes: Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes: I think that the indentation on its own is enough to avoid confusion test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } but your idea is fine as well, so I'm ok with either way. Read too quickly, it looks like a mis-indentation (I could laugh at Eric here, but I made the same confusion when reading the code at first). By avoid the confusion I mean make it clear it's not a mis-indentation. Yes, that stray fooled me as well. If it were following your suggestion in the earlier message on this thread, i.e. test_when_finished ... ... git checkout I wouldn't have to waste time commenting on it ;-) I will do it this way then. ;) Thanks, Rémi -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
David Turner dtur...@twopensource.com writes: On Tue, 2015-06-23 at 15:53 -0400, David Turner wrote: * Regarding MERGE_HEAD: you take the point of view that it must continue to be stored as a file. And yet it must also behave somewhat like a reference; for example, `git rev-parse MERGE_HEAD` works today. MERGE_HEAD is also used for reachability, right? Another point of view is that MERGE_HEAD is a plain old boring reference, but there is some other metadata related to it that the refs backend has to store. The file-based backend would have special-case code to read the additional data from the tail of the loose refs file (and be sure to write the metadata when writing the reference), but other backends could store the reference with the rest but do their own thing with the metadata. So I guess I'm wondering whether the refs API needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with its metadata. You are probably right that this is a good idea. On reflection, I think it might make sense to keep MERGE_HEAD as a file. The problem is that not only would refs backends have to add new MERGE_HEAD-handling functions, but we would also need new plumbing commands to allow scripts to access the complete contents of MERGE_HEAD. That seems more complicated to me. I think you are talking about FETCH_HEAD, but I tend to agree. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On Tue, 2015-06-23 at 10:16 -0700, Stefan Beller wrote: The db backend code was added in the penultimate commit; the rest is just code rearrangement and minor changes to make alternate backends possible. There ended up being a fair amount of this rearrangement, but the end result is that almost the entire git test suite runs under the db backend without error (see below for details). Looking at the end result in refs-be-db.c it feels like there are more functions in the refs_be_db struct, did this originate from other design choices? IIRC Ronnie wanted to have as least functions in there as possible, and share as much of the code between the databases, such that the glue between the db and the refs code is minimal. I didn't actually spend that much time reading Ronnie's backend code. My code aims to be extremely thoroughly compatible. I spent a ton of time making sure that the git test suite passed. I don't know if an alternate approach would have been as compatible. The requirement for reflog storage did complicate things a bit. I also didn't see a strong need to abstract the database, since LMDB is common, widely compatible, and tiny. Some random comments from looking over the branch briefly: In the latest commit, (refs: tests for db backend), I am unsure about the copyright annotations. At least a sole Copyright (c) 2007 Junio C Hamano doesn't make sense to me. ;) Will fix, thanks. Typo in commit message bisect: use refs insfrastructure for BISECT_START Will fix, thanks. Some commits contain a ChangeId, which is a Gerrit leftover. :( Those were leftover from Ronnie's patches; since you are a Googler and you think we don't need them, I'll remove them. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: RFC/Pull Request: Refs db backend
-Original Message- From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On Behalf Of David Turner Sent: June 23, 2015 4:05 PM To: Stefan Beller Cc: git mailing list; ronnie sahlberg Subject: Re: RFC/Pull Request: Refs db backend On Tue, 2015-06-23 at 10:16 -0700, Stefan Beller wrote: The db backend code was added in the penultimate commit; the rest is just code rearrangement and minor changes to make alternate backends possible. There ended up being a fair amount of this rearrangement, but the end result is that almost the entire git test suite runs under the db backend without error (see below for details). Looking at the end result in refs-be-db.c it feels like there are more functions in the refs_be_db struct, did this originate from other design choices? IIRC Ronnie wanted to have as least functions in there as possible, and share as much of the code between the databases, such that the glue between the db and the refs code is minimal. I didn't actually spend that much time reading Ronnie's backend code. My code aims to be extremely thoroughly compatible. I spent a ton of time making sure that the git test suite passed. I don't know if an alternate approach would have been as compatible. The requirement for reflog storage did complicate things a bit. I also didn't see a strong need to abstract the database, since LMDB is common, widely compatible, and tiny. Just to beg a request: LMDB is not available on some MPP architectures to which git has been ported. If it comes up, I beg you not to add this as a dependency to base git components. Thanks, Randall -- Brief whoami: NonStopUNIX developer since approximately UNIX(421664400)/NonStop(2112884442) -- In my real life, I talk too much. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 01/10] t9001-send-email: move script creation in a setup test
Move the creation of the scripts used in to-cmd and cc-cmd tests in a setup test to make them available for later tests. Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- t/t9001-send-email.sh | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index a3663da..eef12e6 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -312,13 +312,19 @@ test_expect_success $PREREQ,!AUTOIDENT 'broken implicit ident aborts send-email' ) ' +test_expect_success $PREREQ 'setup tocmd and cccmd scripts' ' + write_script tocmd-sed -\EOF + sed -n -e s/^tocmd--//p $1 + EOF + write_script cccmd-sed -\EOF + sed -n -e s/^cccmd--//p $1 + EOF +' + test_expect_success $PREREQ 'tocmd works' ' clean_fake_sendmail cp $patches tocmd.patch echo tocmd--to...@example.com tocmd.patch - write_script tocmd-sed -\EOF - sed -n -e s/^tocmd--//p $1 - EOF git send-email \ --from=Example nob...@example.com \ --to-cmd=./tocmd-sed \ @@ -332,9 +338,6 @@ test_expect_success $PREREQ 'cccmd works' ' clean_fake_sendmail cp $patches cccmd.patch echo cccmd-- cc...@example.com cccmd.patch - write_script cccmd-sed -\EOF - sed -n -e s/^cccmd--//p $1 - EOF git send-email \ --from=Example nob...@example.com \ --to=nob...@example.com \ -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 06/10] send-email: minor code refactoring
Group expressions in a single if statement. This avoid checking multiple time if the variable $sender is defined. Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- git-send-email.perl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index f61449d..a0cd7ff 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -799,9 +799,9 @@ if (!$force) { } } -($sender) = expand_aliases($sender) if defined $sender; - -if (!defined $sender) { +if (defined $sender) { + ($sender) = expand_aliases($sender); +} else { $sender = $repoauthor || $repocommitter || ''; } -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 04/10] send-email: refactor address list process
Simplify code by creating a function which transform a list of strings containing email addresses (separated by commas, comporting aliases) into a clean list of valid email addresses. Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- git-send-email.perl | 22 -- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 8bf38ee..2d5c530 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -833,12 +833,9 @@ sub expand_one_alias { return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias; } -@initial_to = expand_aliases(@initial_to); -@initial_to = validate_address_list(sanitize_address_list(@initial_to)); -@initial_cc = expand_aliases(@initial_cc); -@initial_cc = validate_address_list(sanitize_address_list(@initial_cc)); -@bcclist = expand_aliases(@bcclist); -@bcclist = validate_address_list(sanitize_address_list(@bcclist)); +@initial_to = process_address_list(@initial_to); +@initial_cc = process_address_list(@initial_cc); +@bcclist = process_address_list(@bcclist); if ($thread !defined $initial_reply_to $prompting) { $initial_reply_to = ask( @@ -1051,6 +1048,13 @@ sub sanitize_address_list { return (map { sanitize_address($_) } @_); } +sub process_address_list { + my @addr_list = expand_aliases(@_); + @addr_list = sanitize_address_list(@addr_list); + @addr_list = validate_address_list(@addr_list); + return @addr_list; +} + # Returns the local Fully Qualified Domain Name (FQDN) if available. # # Tightly configured MTAa require that a caller sends a real DNS @@ -1560,10 +1564,8 @@ foreach my $t (@files) { ($confirm =~ /^(?:auto|compose)$/ $compose $message_num == 1)); $needs_confirm = inform if ($needs_confirm $confirm_unconfigured @cc); - @to = expand_aliases(@to); - @to = validate_address_list(sanitize_address_list(@to)); - @cc = expand_aliases(@cc); - @cc = validate_address_list(sanitize_address_list(@cc)); + @to = process_address_list(@to); + @cc = process_address_list(@cc); @to = (@initial_to, @to); @cc = (@initial_cc, @cc); -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 07/10] send-email: reduce dependencies impact on parse_address_line
parse_address_line had not the same behavior whether the user had Mail::Address or not. Teach parse_address_line to behave like Mail::Address. When the user input is correct, this implementation behaves exactly like Mail::Address except when there are quotes inside the name: Jane Doe j...@example.com In this case the result of parse_address_line is: With M::A : Jane Do e j...@example.com Without : Jane Do e j...@example.com When the user input is not correct, the behavior is also mostly the same. Unlike Mail::Address, this doesn't parse groups and recursive commentaries. Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- git-send-email.perl | 2 +- perl/Git.pm | 67 t/t9000-addresses.sh | 30 +++ t/t9000/test.pl | 67 4 files changed, 165 insertions(+), 1 deletion(-) create mode 100755 t/t9000-addresses.sh create mode 100755 t/t9000/test.pl diff --git a/git-send-email.perl b/git-send-email.perl index a0cd7ff..bced78e 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -478,7 +478,7 @@ sub parse_address_line { if ($have_mail_address) { return map { $_-format } Mail::Address-parse($_[0]); } else { - return split_addrs($_[0]); + return Git::parse_mailboxes($_[0]); } } diff --git a/perl/Git.pm b/perl/Git.pm index 9026a7b..19ef081 100644 --- a/perl/Git.pm +++ b/perl/Git.pm @@ -864,6 +864,73 @@ sub ident_person { return $ident[0] $ident[1]; } +=item parse_mailboxes + +Return an array of mailboxes extracted from a string. + +=cut + +sub parse_mailboxes { + my $re_comment = qr/\((?:[^)]*)\)/; + my $re_quote = qr/(?:[^\\\]|\\.)*/; + my $re_word = qr/(?:[^][\s():;@\\,.]|\\.)+/; + + # divide the string in tokens of the above form + my $re_token = qr/(?:$re_quote|$re_word|$re_comment|\S)/; + my @tokens = map { $_ =~ /\s*($re_token)\s*/g } @_; + + # add a delimiter to simplify treatment for the last mailbox + push @tokens, ,; + + my (@addr_list, @phrase, @address, @comment, @buffer) = (); + foreach my $token (@tokens) { + if ($token =~ /^[,;]$/) { + # if buffer still contains undeterminated strings + # append it at the end of @address or @phrase + if (@address) { + push @address, @buffer; + } else { + push @phrase, @buffer; + } + + my $str_phrase = join ' ', @phrase; + my $str_address = join '', @address; + my $str_comment = join ' ', @comment; + + # quote are necessary if phrase contains + # special characters + if ($str_phrase =~ /[][():;@\\,.\000-\037\177]/) { + $str_phrase =~ s/(^|[^\\])/$1/g; + $str_phrase = qq[$str_phrase]; + } + + # add around the address if necessary + if ($str_address ne $str_phrase ne ) { + $str_address = qq[$str_address]; + } + + my $str_mailbox = $str_phrase $str_address $str_comment; + $str_mailbox =~ s/^\s*|\s*$//g; + push @addr_list, $str_mailbox if ($str_mailbox); + + @phrase = @address = @comment = @buffer = (); + } elsif ($token =~ /^\(/) { + push @comment, $token; + } elsif ($token eq ) { + push @phrase, (splice @address), (splice @buffer); + } elsif ($token eq ) { + push @address, (splice @buffer); + } elsif ($token eq @) { + push @address, (splice @buffer), @; + } elsif ($token eq .) { + push @address, (splice @buffer), .; + } else { + push @buffer, $token; + } + } + + return @addr_list; +} =item hash_object ( TYPE, FILENAME ) diff --git a/t/t9000-addresses.sh b/t/t9000-addresses.sh new file mode 100755 index 000..7223d03 --- /dev/null +++ b/t/t9000-addresses.sh @@ -0,0 +1,30 @@ +#!/bin/sh +# +# Copyright (c) 2015 +# + +test_description='compare address parsing with and without Mail::Address' +. ./test-lib.sh + +if ! test_have_prereq PERL; then + skip_all='skipping perl interface tests, perl not available' + test_done +fi + +perl -MTest::More -e 0 2/dev/null || { + skip_all=Perl Test::More unavailable, skipping test + test_done +} + +perl -MMail::Address -e 0 2/dev/null || { + skip_all=Perl Mail::Address
[PATCH v6 02/10] send-email: allow aliases in patch header and command script outputs
Interpret aliases in: - Header fields of patches generated by git format-patch (using --to, --cc, --add-header for example) or manually modified. Example of fields in header: To: alias1 Cc: alias2 Cc: alias3 - Outputs of command scripts specified by --cc-cmd and --to-cmd. Example of script: #!/bin/sh echo alias1 echo alias2 Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- git-send-email.perl | 2 ++ t/t9001-send-email.sh | 60 +++ 2 files changed, 62 insertions(+) diff --git a/git-send-email.perl b/git-send-email.perl index 6bedf74..8bf38ee 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -1560,7 +1560,9 @@ foreach my $t (@files) { ($confirm =~ /^(?:auto|compose)$/ $compose $message_num == 1)); $needs_confirm = inform if ($needs_confirm $confirm_unconfigured @cc); + @to = expand_aliases(@to); @to = validate_address_list(sanitize_address_list(@to)); + @cc = expand_aliases(@cc); @cc = validate_address_list(sanitize_address_list(@cc)); @to = (@initial_to, @to); diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index eef12e6..f7d4132 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -1579,6 +1579,66 @@ test_expect_success $PREREQ 'sendemail.aliasfiletype=sendmail' ' grep ^!o@example\.com!$ commandline1 ' +test_expect_success $PREREQ 'alias support in To header' ' + clean_fake_sendmail + echo alias sbd some...@example.org .mailrc + test_config sendemail.aliasesfile .mailrc + test_config sendemail.aliasfiletype mailrc + git format-patch --stdout -1 --to=sbd aliased.patch + git send-email \ + --from=Example nob...@example.com \ + --smtp-server=$(pwd)/fake.sendmail \ + aliased.patch \ + 2errors out + grep ^!someone@example\.org!$ commandline1 +' + +test_expect_success $PREREQ 'alias support in Cc header' ' + clean_fake_sendmail + echo alias sbd some...@example.org .mailrc + test_config sendemail.aliasesfile .mailrc + test_config sendemail.aliasfiletype mailrc + git format-patch --stdout -1 --cc=sbd aliased.patch + git send-email \ + --from=Example nob...@example.com \ + --smtp-server=$(pwd)/fake.sendmail \ + aliased.patch \ + 2errors out + grep ^!someone@example\.org!$ commandline1 +' + +test_expect_success $PREREQ 'tocmd works with aliases' ' + clean_fake_sendmail + echo alias sbd some...@example.org .mailrc + test_config sendemail.aliasesfile .mailrc + test_config sendemail.aliasfiletype mailrc + git format-patch --stdout -1 tocmd.patch + echo tocmd--sbd tocmd.patch + git send-email \ + --from=Example nob...@example.com \ + --to-cmd=./tocmd-sed \ + --smtp-server=$(pwd)/fake.sendmail \ + tocmd.patch \ + 2errors out + grep ^!someone@example\.org!$ commandline1 +' + +test_expect_success $PREREQ 'cccmd works with aliases' ' + clean_fake_sendmail + echo alias sbd some...@example.org .mailrc + test_config sendemail.aliasesfile .mailrc + test_config sendemail.aliasfiletype mailrc + git format-patch --stdout -1 cccmd.patch + echo cccmd--sbd cccmd.patch + git send-email \ + --from=Example nob...@example.com \ + --cc-cmd=./cccmd-sed \ + --smtp-server=$(pwd)/fake.sendmail \ + cccmd.patch \ + 2errors out + grep ^!someone@example\.org!$ commandline1 +' + do_xmailer_test () { expected=$1 params=$2 git format-patch -1 -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 03/10] t9001-send-email: refactor header variable fields replacement
Create a function which replaces Date, Message-Id and X-Mailer lines generated by git-send-email by a specific string: Date:.*$ - Date: DATE-STRING Message-Id:.*$ - Message-Id: MESSAGE-ID-STRING X-Mailer:.*$ - X-Mailer: X-MAILER-STRING Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- t/t9001-send-email.sh | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index f7d4132..714fcae 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -522,6 +522,12 @@ Result: OK EOF +replace_variable_fields () { + sed -e s/^\(Date:\).*/\1 DATE-STRING/ \ + -e s/^\(Message-Id:\).*/\1 MESSAGE-ID-STRING/ \ + -e s/^\(X-Mailer:\).*/\1 X-MAILER-STRING/ +} + test_suppression () { git send-email \ --dry-run \ @@ -529,10 +535,7 @@ test_suppression () { --from=Example f...@example.com \ --to=t...@example.com \ --smtp-server relay.example.com \ - $patches | - sed -e s/^\(Date:\).*/\1 DATE-STRING/ \ - -e s/^\(Message-Id:\).*/\1 MESSAGE-ID-STRING/ \ - -e s/^\(X-Mailer:\).*/\1 X-MAILER-STRING/ \ + $patches | replace_variable_fields \ actual-suppress-$1${2+-$2} test_cmp expected-suppress-$1${2+-$2} actual-suppress-$1${2+-$2} } -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 05/10] send-email: Allow use of aliases in the From field of --compose mode
Aliases were expanded before considering the From field of the --compose option. This is inconsistent with other fields (To, Cc, ...) which already support aliases. Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- git-send-email.perl | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index 2d5c530..f61449d 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -555,8 +555,6 @@ if (@alias_files and $aliasfiletype and defined $parse_alias{$aliasfiletype}) { } } -($sender) = expand_aliases($sender) if defined $sender; - # is_format_patch_arg($f) returns 0 if $f names a patch, or 1 if # $f is a revision list specification to be passed to format-patch. sub is_format_patch_arg { @@ -801,6 +799,8 @@ if (!$force) { } } +($sender) = expand_aliases($sender) if defined $sender; + if (!defined $sender) { $sender = $repoauthor || $repocommitter || ''; } -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 07/10] send-email: reduce dependencies impact on parse_address_line
Your git send-email does not seem to like PATCHes 08-10/10 ;-). Up to PATCH 07, the series looks good. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 08/10] send-email: consider quote as delimiter instead of character
Do not consider quote inside a recipient name as character when they are not escaped. This interprets: Jane Doe j...@example.com as: Jane Doe j...@example.com instead of: Jane\ \Doe j...@example.com Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- git-send-email.perl | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/git-send-email.perl b/git-send-email.perl index bced78e..a03392c 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -1028,15 +1028,17 @@ sub sanitize_address { return $recipient; } + # remove non-escaped quotes + $recipient_name =~ s/(^|[^\\])/$1/g; + # rfc2047 is needed if a non-ascii char is included if ($recipient_name =~ /[^[:ascii:]]/) { - $recipient_name =~ s/^(.*)$/$1/; $recipient_name = quote_rfc2047($recipient_name); } # double quotes are needed if specials or CTLs are included elsif ($recipient_name =~ /[][()@,;:\\.\000-\037\177]/) { - $recipient_name =~ s/([\\\r])/\\$1/g; + $recipient_name =~ s/([\\\r])/\\$1/g; $recipient_name = qq[$recipient_name]; } -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: What's cooking in git.git (Jun 2015, #05; Mon, 22)
Johannes Schindelin johannes.schinde...@gmx.de writes: Hi Junio, On 2015-06-23 00:49, Junio C Hamano wrote: * js/rebase-i-clean-up-upon-continue-to-skip (2015-06-18) 3 commits - rebase -i: do not leave a CHERRY_PICK_HEAD file behind - SQUASH: test_must_fail is a shell function - t3404: demonstrate CHERRY_PICK_HEAD bug Abandoning an already applied change in git rebase -i with --continue left CHERRY_PICK_HEAD and confused later steps. Expecting a reroll. ($gmane/271856) Actually, there had been two re-rolls, and v3 seemed to be okay: $gmane/272037. It also looks as if 726a35ebd^..726a35ebd^2 is identical with v3... Anything you want me to change in addition? Thanks for a pointer; I think I updated the topic and then forgot to update the reference in whats cooking. I'll take a look at 272037 and if I have anything further will comment there. Also: * js/fsck-opt (2015-06-22) 19 commits - fsck: support ignoring objects in `git fsck` via fsck.skiplist - fsck: git receive-pack: support excluding objects from fsck'ing - fsck: introduce `git fsck --connectivity-only` - fsck: support demoting errors to warnings - fsck: document the new receive.fsck.msg-id options - fsck: allow upgrading fsck warnings to errors - fsck: optionally ignore specific fsck issues completely - fsck: disallow demoting grave fsck errors to warnings - fsck: add a simple test for receive.fsck.msg-id - fsck: make fsck_tag() warn-friendly - fsck: handle multiple authors in commits specially - fsck: make fsck_commit() warn-friendly - fsck: make fsck_ident() warn-friendly - fsck: report the ID of the error/warning - fsck (receive-pack): allow demoting errors to warnings - fsck: offer a function to demote fsck errors to warnings - fsck: provide a function to parse fsck message IDs - fsck: introduce identifiers for fsck messages - fsck: introduce fsck options Rerolled (at v7) and seems more or less ready for 'next'. I see that you used `downcased` instead of my `lowercased`, which makes more sense, but the style of the multi-line `for` loop as per pu` is still as *I* wrote it... I also saw that you downcased the first letter after `fsck:` in the commit messages, and touched up the message of the report the ID of the error/warning commit. Do you want to touch up the `for` loop style in offer a function to demote fsck errors to warnings or shall I send a v8 (it is ready to go: https://github.com/dscho/git/compare/next...fsck-api)? I often micro-tweak obvious things as I go over the series and applying them one by one, but the for-layout was a kind of change that I usually do not tweak during application (as there are larger chances of causing unneeded conflicts with later patches, and at the point of applying an earlier patch, I may not remember what I learned by skimming later patches in the series) and left there. If the for-layout is the only thing that is questionable thing to fix in what I queued, I think I can locally fix-up without an extra roundtrip. Thanks. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 0/5] git bisect old/new
Junio C Hamano gits...@pobox.com writes: Matthieu Moy matthieu@imag.fr writes: I fixed a few minor issues in v6. Patch between my version and v6 is below. I also pushed my branch here: https://github.com/moy/git/tree/bisect-terms It is somewhat confusing to see v3 yesterday and then this v7 next day. How did I miss v4 thru v6? Oops, I pattern-matched the wrong part when reading [PATCH v3 6/6]. Indeed, this should have been numberred v4, I should have s/v6/v3/ in my sentence above. Regarding the second and third one, the messages they give when the user marked one tip of a side branch as old and the other new gave me a hiccup while reading them, though. if (!strcmp(name_bad, bad)) { fprintf(stderr, The merge base %s is bad.\n This means the bug has been fixed between %s and [%s].\n, bad_hex, bad_hex, good_hex); } else { fprintf(stderr, The merge base %s is %s.\n This means the first commit marked %s is between %s and [%s].\n, bad_hex, name_bad, name_bad, bad_hex, good_hex); The bad side is inherited from the original and not your fault, but it was already very hard to understand. The other side is not just unreadable, but I think is incorrect and confusing to say first commit marked %(name_bad)s; you know there are history segments whose oldest ends (i.e. merge base that is bad) are marked as 'bad', and the other ends are marked as 'good', and you haven't marked any of the commits in between yet. So there is no first commit marked either as bad or good there between these endpoints (yet). The situation is (the bisection started with bad=branch1 and good=branch2): base (name_bad) -- branch1 (name_bad) \ `- branch2 (name_good) So, the first commit having property name_good is between base and branch2. So, the message seems wrong in two ways: * As you say, the wording marked as seem to imply that we did mark the commit, while we actually did not explore base..branch2 I'd write the first '%s' commit (the important is to remove marked). * The message uses 'name_bad' where it should use 'name_good'. Indeed, the original message talks about bug fixed, i.e. the first _good_ commit is in base..branch2. Is this what you meant? (Sorry, I need drawings and bullet lists to think properly ;-) ). Actually, I think it would make sense to include my drawing above in a comment in the code. Also I was somewhat puzzled and disappointed to still see name_{bad,good} not name_{new,old} used as variable names even in the endgame patch, though. Is that intended? I still think that name_old/name_new would have been much better names if we were to write bisect from scratch, but I got convinced by Christian that name_bad/name_good made sense too. Even if we adopted old/new in these variables, we would still have e.g. a variable bad_seen in the code. So, moving the codebase to adopt the old/new convention internally is more than just chosing the name of variables to avoid hardcoded good/bad string litterals. Moving the brain of developpers to adopt the old/new convention is even another debate. I don't know if this is the reason why Antoine did not change it, but that's why I did not insist further and did not do the change myself. Yeah, if I may rephrase to make sure we are on the same page, in order for 5/5 to be done sanely, 1-4/5 must be giving a good foundation to build on. I agree with that, I agree that including a polished 5/5 would be a good thing, and then I further agree that 1-4/5 could be delivered before 5/5 is ready. Yes. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 3/3] git rebase -i: add static check for commands and SHA-1
Junio C Hamano gits...@pobox.com writes: Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes: I used: read -r command sha1 rest EOF $line EOF because printf '%s' $line | read -r command sha1 rest doesn't work (the 3 variables have no value as a result). There might be a better way to do this, but I don't have it right now. while read line do ( IFS=' ' set x $line shift # now $1 is your command, $2 is sha1, $3 is remainder ... ) done perhaps? Will try, thanks! But more importantly, why do you even need to keep the bad ones in a separate .badcmd and .badsha files? Isn't that bloating your changes unnecessarily, iow, if you issued your warning as you encounter them, wouldn't the change become cleaner and easier to understand (and as a side effect it may even become smaller)? The _only_ thing that you would get by keeping them in temporary files is that you can do one header and bunch of errors, but is it so common to make a bad edit to the insn sheet that a sequence of errors, one per line becomes more burdensome to the end user? I would think stripspace | while read -r command sha1 rest do ... and showing the warning as you detect inside that loop would be sufficient. Perhaps I am missing subtle details of what you are doing. You're not missing subtle details, it is as you said, I tough it would be clearer for the user to have one header and a bunch of errors. Moreover while it would make the patch smaller and easier to understand, I am not sure about making it cleaner; I guess I will have to try and see how it ends up. What I'm not completely happy with your proposition is the fact that if there are multiple errors of the same kind, the output would look something like: Warning: the command isn't recognized in the following line: badcmd1 some_sha some_commit_message Warning: the command isn't recognized in the following line: badcmd2 some_sha some_commit_message (I don't think it would be good to squash some understandable warning message and the faulty line in one line, it would probably end up being too long) However as you say, such mistakes are uncommon so I guess it's fine. Thanks, Rémi -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Junio C Hamano gits...@pobox.com writes: Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes: +test_rebase_end () { +test_when_finished git checkout master +git branch -D $1 Is this one guaranteed to succeed? Do we want to consider it a failure to remove $1 (e.g. dropTest)? $ git branch -D no-such-branch ; echo $? error: branch 'no-such-branch' not found. 1 If dropTest branch did not exist before the test that begins with a call to this function, what happens? Besides, a function that must be called at the beginning of a test piece has a name that ends with _end? That sounds funny, no? Ah, scratch this last paragraph. I didn't see this is a multi-command when_finished. But other parts of what I said still stands. For example, even in a multi-command when_finished, git branch -D $1 if the main body of the test failed to create the branch $1, that command would fail and skip the remainder of the clean-up, so the first point above is still suspect. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 10/10] send-email: suppress meaningless whitespaces in from field
Remove leading and trailing whitespaces in from field before interepreting it to improve consistency with other options. The split_addrs function already take care of trailing and leading whitespaces for to, cc and bcc fields. The from option now: - has the same behavior when passing arguments like j...@example.com , \t j...@example.com or j...@example.com. - interprets aliases in string containing leading and trailing whitespaces such as alias or alias\t like other options. Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- git-send-email.perl | 1 + t/t9001-send-email.sh | 24 2 files changed, 25 insertions(+) diff --git a/git-send-email.perl b/git-send-email.perl index 8bf6656..749d809 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -786,6 +786,7 @@ if (!$force) { } if (defined $sender) { + $sender =~ s/^\s+|\s+$//g; ($sender) = expand_aliases($sender); } else { $sender = $repoauthor || $repocommitter || ''; diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index 3c5b853..8e21fb0 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -1719,4 +1719,28 @@ test_expect_success $PREREQ 'aliases work with email list' ' test_cmp expected-list actual-list ' +test_expect_success $PREREQ 'leading and trailing whitespaces are removed' ' + echo alias to2 t...@example.com .mutt + echo alias cc1 Cc 1 c...@example.com .mutt + test_config sendemail.aliasesfile .mutt + test_config sendemail.aliasfiletype mutt + TO1=$(echo QTo 1 t...@example.com | q_to_tab) + TO2=$(echo QZto2 | qz_to_tab_space) + CC1=$(echo cc1 | append_cr) + BCC1=$(echo Q b...@example.com Q | q_to_nul) + git send-email \ + --dry-run \ + --from=Example f...@example.com \ + --to=$TO1 \ + --to=$TO2 \ + --to= t...@example.com\ + --cc=$CC1 \ + --cc=Cc2 c...@example.com \ + --bcc=$BCC1 \ + --bcc=b...@example.com \ + 0001-add-master.patch | replace_variable_fields \ + actual-list + test_cmp expected-list actual-list +' + test_done -- 1.9.1 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v6 09/10] send-email: allow multiple emails using --cc, --to and --bcc
Accept a list of emails separated by commas in flags --cc, --to and --bcc. Multiple addresses can already be given by using these options multiple times, but it is more convenient to allow cutting-and-pasting a list of addresses from the header of an existing e-mail message, which already lists them as comma-separated list, as a value to a single parameter. The following format can now be used: $ git send-email --to='Jane j...@example.com, m...@example.com' Remove the limitation imposed by 79ee555b (Check and document the options to prevent mistakes, 2006-06-21) which rejected every argument with comma in --cc, --to and --bcc. Helped-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr Signed-off-by: Mathieu Lienard--Mayor mathieu.lienard--ma...@ensimag.imag.fr Signed-off-by: Jorge Juan Garcia Garcia jorge-juan.garcia-gar...@ensimag.imag.fr Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr --- Documentation/git-send-email.txt | 12 +-- git-send-email.perl | 17 ++-- t/t9001-send-email.sh| 44 3 files changed, 52 insertions(+), 21 deletions(-) diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt index b48a764..afd9569 100644 --- a/Documentation/git-send-email.txt +++ b/Documentation/git-send-email.txt @@ -49,17 +49,17 @@ Composing of 'sendemail.annotate'. See the CONFIGURATION section for 'sendemail.multiEdit'. ---bcc=address:: +--bcc=address,...:: Specify a Bcc: value for each email. Default is the value of 'sendemail.bcc'. + -The --bcc option must be repeated for each user you want on the bcc list. +This option may be specified multiple times. ---cc=address:: +--cc=address,...:: Specify a starting Cc: value for each email. Default is the value of 'sendemail.cc'. + -The --cc option must be repeated for each user you want on the cc list. +This option may be specified multiple times. --compose:: Invoke a text editor (see GIT_EDITOR in linkgit:git-var[1]) @@ -110,13 +110,13 @@ is not set, this will be prompted for. Only necessary if --compose is also set. If --compose is not set, this will be prompted for. ---to=address:: +--to=address,...:: Specify the primary recipient of the emails generated. Generally, this will be the upstream maintainer of the project involved. Default is the value of the 'sendemail.to' configuration value; if that is unspecified, and --to-cmd is not specified, this will be prompted for. + -The --to option must be repeated for each user you want on the to list. +This option may be specified multiple times. --8bit-encoding=encoding:: When encountering a non-ASCII message or subject that does not diff --git a/git-send-email.perl b/git-send-email.perl index a03392c..8bf6656 100755 --- a/git-send-email.perl +++ b/git-send-email.perl @@ -460,20 +460,6 @@ my ($repoauthor, $repocommitter); ($repoauthor) = Git::ident_person(@repo, 'author'); ($repocommitter) = Git::ident_person(@repo, 'committer'); -# Verify the user input - -foreach my $entry (@initial_to) { - die Comma in --to entry: $entry'\n unless $entry !~ m/,/; -} - -foreach my $entry (@initial_cc) { - die Comma in --cc entry: $entry'\n unless $entry !~ m/,/; -} - -foreach my $entry (@bcclist) { - die Comma in --bcclist entry: $entry'\n unless $entry !~ m/,/; -} - sub parse_address_line { if ($have_mail_address) { return map { $_-format } Mail::Address-parse($_[0]); @@ -1051,7 +1037,8 @@ sub sanitize_address_list { } sub process_address_list { - my @addr_list = expand_aliases(@_); + my @addr_list = map { parse_address_line($_) } @_; + @addr_list = expand_aliases(@addr_list); @addr_list = sanitize_address_list(@addr_list); @addr_list = validate_address_list(@addr_list); return @addr_list; diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh index 714fcae..3c5b853 100755 --- a/t/t9001-send-email.sh +++ b/t/t9001-send-email.sh @@ -1675,4 +1675,48 @@ test_expect_success $PREREQ '--[no-]xmailer with sendemail.xmailer=false' ' do_xmailer_test 1 --xmailer ' +test_expect_success $PREREQ 'setup expected-list' ' + git send-email \ + --dry-run \ + --from=Example f...@example.com \ + --to=To 1 t...@example.com \ + --to=t...@example.com \ + --to=t...@example.com \ + --cc=Cc 1 c...@example.com \ + --cc=Cc2 c...@example.com \ + --bcc=b...@example.com \ + --bcc=b...@example.com \ + 0001-add-master.patch | replace_variable_fields \ + expected-list +' + +test_expect_success $PREREQ 'use email list in --cc --to and --bcc' ' + git send-email \ + --dry-run \ + --from=Example f...@example.com \ + --to=To 1 t...@example.com,
[PATCH v6 07/10] send-email: reduce dependencies impact on parse_address_line
Matthieu Moy matthieu@grenoble-inp.fr writes: Your git send-email does not seem to like PATCHes 08-10/10 ;-). Up to PATCH 07, the series looks good. Yes, I get Too many recipients error... If I specify --no-signoff-by-cc then this is also aborted but I get no error (at least I've not seen it last time...). If I rerun git send-email with only one patch, it works (even if there is no difference with the number of recipient a priori). I'll investigate asap, not sure it's a bug, maybe It's just me ! -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On 06/23/2015 09:53 PM, David Turner wrote: On Tue, 2015-06-23 at 17:51 +0200, Michael Haggerty wrote: [...] * I don't like the fact that you have replaced `struct ref_transaction *` with `void *` in the public interface. On a practical level, I like the bit of type-safety that comes with the more specific declaration. But on a more abstract level, I think that the concept of a transaction could be useful across backends, for example in utility functions that verify that a proposed set of updates are internally consistent. I would rather see either * backends extend a basic `struct ref_transaction` to suit their needs, and upcast/downcast pointers at the module boundary, or * `struct ref_transaction` itself gets a `void *` member that backends can use for whatever purposes they want. There are no common fields between refs-be-file transactions and refs-be-lmdb transactions. I don't see much gain from adding an empty ref_transaction that backends could extend, since we would have to explicitly upcast/downcast all over the place. If you ask me, it would be better to do a bunch of up/downcasts within the single module (via two helper functions that could even do consistency checks) than have no help from the compiler in preventing people from passing unrelated pointer types into the `void *transaction` argument. Plus the `struct ref_transaction *` variables scattered throughout the code are a lot more self-explanatory than `void *`. * Regarding MERGE_HEAD: you take the point of view that it must continue to be stored as a file. And yet it must also behave somewhat like a reference; for example, `git rev-parse MERGE_HEAD` works today. MERGE_HEAD is also used for reachability, right? Another point of view is that MERGE_HEAD is a plain old boring reference, but there is some other metadata related to it that the refs backend has to store. The file-based backend would have special-case code to read the additional data from the tail of the loose refs file (and be sure to write the metadata when writing the reference), but other backends could store the reference with the rest but do their own thing with the metadata. So I guess I'm wondering whether the refs API needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with its metadata. You are probably right that this is a good idea. * Don't the same considerations that apply to MERGE_HEAD also apply to FETCH_HEAD? All of the tests pass without any special handling of FETCH_HEAD. That's odd. From git-fetch.txt: The names of refs that are fetched, together with the object names they point at, are written to `.git/FETCH_HEAD`. This information may be used by scripts or other git commands, such as linkgit:git-pull[1]. It seems like the test suite is reading FETCH_HEAD via the refs API in a couple of places. I don't understand why these don't fail when LMDB is being used... * I'm showing my ignorance of LMDB, but I don't see where the LMDB backend initializes its database during `git init-db`. Is that what `init_env()` does? But it looks like `init_env()` is called on every git invocation (via `git_config_early()`). Puzzled. There is no need to explicitly create the database (other than with mkdir); init_env does everything for you. OK. * Rehash of the last two points: I expected one backend function that is used to initialize the refs backend when a new repository is created (e.g., in `git init`). The file-based backend would use this function to create the `refs`, `refs/heads`, and `refs/tags` directories. I expected a second function that is called once every time git runs in an existing repository (this one might, for example, open a database connection). And maybe even a third one that closes down the database connection before git exits. Would you please explain how this actually works? LMDB doesn't really have the concept of a connection. It's basically just a couple of files that communicate using shared memory (and maybe some other locking that I haven't paid attention to). There is the concept of a transaction, which is the unit of concurrency (each thread may only have one open transaction). Transactions are either read-only or read-write, and there can only be one read-write transaction open at a time (across the entire system). Read-only transactions take a snapshot of the DB state at transaction start time. This combination of features means that we need to be a bit clever about read-only transactions; if a read-write transaction occurs in a separate process, we need to restart any read-only transactions to pick up its changes. If you are thinking about an *unrelated* separate process, then Git's philosophy is that if our process is reading *some* valid state of the references, it's all good even if that state is not quite the newest. After all, who's to say whether our process ran before or after the other process? As long as each
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, 2015-06-23 at 06:54 -0400, Jeff King wrote: + mkconfig 1 preciousObjects .git/config nit: I think it's better to use git config rather than manually writing config files. git config is more future-proof if we up switching config backends; it's also more composable with other configs (making this test easier to copy and manipulate), and more explicit. Other than that it looks good to me. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Enable core.fsyncObjectFiles by default
Linus Torvalds started a discussion[1] if we want to play rather safe than use defaults which make sense only for the most power users of Git: So git is safe in the sense that you won't really lose any data, but you may well be inconvenienced. The fsync each object config option is there in case you don't want that inconvenience, but it should be noted that it can make for a hell of a performance impact. Of course, it might well be the case that the actual default might be worth turning around. Most git users probably don't care about that kind of apply two hundred patches from Andrew Morton kind of workload, although rebase a big patch-series does end up doing basically the same thing, and might be more common. This patch enables fsync_object_files by default. [1] https://plus.google.com/u/1/+JonathanCorbet/posts/JBxiKPe3VXa Signed-off-by: Stefan Beller sbel...@google.com --- Documentation/config.txt | 8 environment.c| 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 43bb53c..dce2640 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -693,10 +693,10 @@ core.whitespace:: core.fsyncObjectFiles:: This boolean will enable 'fsync()' when writing object files. + -This is a total waste of time and effort on a filesystem that orders -data writes properly, but can be useful for filesystems that do not use -journalling (traditional UNIX filesystems) or that only journal metadata -and not file contents (OS X's HFS+, or Linux ext3 with data=writeback). +This ensures objects are written to disk instead of relying on the +operating systems cache and eventual write. Disabling this option will +yield performance with a trade off in safety for repository corruption +during power loss. core.preloadIndex:: Enable parallel index preload for operations like 'git diff' diff --git a/environment.c b/environment.c index 61c685b..b406f5e 100644 --- a/environment.c +++ b/environment.c @@ -35,7 +35,7 @@ const char *git_attributes_file; int zlib_compression_level = Z_BEST_SPEED; int core_compression_level; int core_compression_seen; -int fsync_object_files; +int fsync_object_files = 1; size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE; size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT; size_t delta_base_cache_limit = 96 * 1024 * 1024; -- 2.4.1.345.gab207b6.dirty -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Junio C Hamano gits...@pobox.com writes: Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes: +test_rebase_end () { +test_when_finished git checkout master +git branch -D $1 Is this one guaranteed to succeed? Do we want to consider it a failure to remove $1 (e.g. dropTest)? $ git branch -D no-such-branch ; echo $? error: branch 'no-such-branch' not found. 1 If dropTest branch did not exist before the test that begins with a call to this function, what happens? Since the function is test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master If $1 doesn't exist, then it means that 'git checkout -b $1 master' failed, thus the test would fail before reaching the part in 'test_when_finished'. However I guess there are more reasons that could cause 'git branch -D $1' to fail so I'll add a 'test_might_fail' in front of it. Besides, a function that must be called at the beginning of a test piece has a name that ends with _end? That sounds funny, no? I see your point but I'm not really sure how to call it, considering that what it does is creating a branch to test on it, and taking care of the cleaning at the end of the test. Maybe something more generic like setup_and_clean ? +test_might_fail git rebase --abort +git checkout -b $1 master +} I'm wondering if this is not sufficient. test_rebase_i_drop_prepare () { git reset --hard git checkout -B $1 master } I am guessing that you named _end because it has when_finished, but as far as I can tell, even after these three patches, the tests do not really rely on the fact that it is on 'master' branch. More importantly, just being on 'master' branch is not a sufficient cleanliness for the next test (and that is why you added these branch -D and might-fail rebase --abort to this function in the first place), it seems. So... I removed the branch in case someone used the same name when creating a branch in a future test. However he would notice it when writing the test and it's not something that would suddenly break when modifying the code, so it might not be necessary. The might-fail rebase --abort is there in case this test fails in the middle (because of a future modification of rebase for example) to avoid failling all the following tests that use rebase. The name test_rebase_i_drop_prepare wouldn't be accurate since 2/3 and 3/3 uses the function as well and don't really have much to do with 'drop'. Thanks, Rémi -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 09/10] send-email: allow multiple emails using --cc, --to and --bcc
Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr writes: Helped-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr This is funny in a patch sent by the same Remi Lespinet ;-). Anyway, the whole series looks good to me now (I finally got all up to 10/10). Cheers, -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
Jeff King p...@peff.net writes: This extension does not change git's behavior at all. It is useful only for testing format-1 compatibility. + +`preciousObjects` +~ + +When the config key `extensions.preciousObjects` is set to `true`, +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or +`git repack -d`). OK. In essense, the 'extension' on the disk is like 'capability' on the wire, in that you are not supposed to ask for capability they do not understand, and you are not supposed to touch a repository you do not understand. And the above looks like a reasonable sample feature to protect by the 'extension' system. + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); This message initially threw me off during my cursory reading, but the code tells me that this is only about repack -d. Unfortunately the users do not get the chance to read the code; perhaps s/cannot repack/ -d/; or something? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v5 07/10] send-email: reduce dependancies impact on parse_address_line
Matthieu Moy matthieu@grenoble-inp.fr writes: We can redirect todo_output to a variable but I've not found better... (Maybe someone has the solution here ?). Also there's no summary at the end of the test (as with other perl tests). You can get the 1..44 at the end with printf 1..%d\n, Test::More-builder-current_test; This is what t9700/test.pl does. I can also get it by removing the line Test::More-builder-no_ending(1); and replacing use Test::More; by use Test::More no_plan; I think I'm going to do that, because the no_ending thing makes the test suite success even if every test fails: at the end we have # test_external test Perl address parsing function was ok # test_external_without_stderr test no stderr: Perl address parsing function was ok in case everything is ok. With the no_ending line, only the second line reports failures, the first is always the same. I think both must be marked red. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit
Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes: I think that the indentation on its own is enough to avoid confusion test_rebase_end () { test_when_finished git checkout master git branch -D $1 test_might_fail git rebase --abort git checkout -b $1 master } but your idea is fine as well, so I'm ok with either way. Read too quickly, it looks like a mis-indentation (I could laugh at Eric here, but I made the same confusion when reading the code at first). By avoid the confusion I mean make it clear it's not a mis-indentation. Anyway, we're just bikeshedding here. Any version is fine with me. -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
Just to beg a request: LMDB is not available on some MPP architectures to which git has been ported. If it comes up, I beg you not to add this as a dependency to base git components. My changes make `configure` check for the presence of liblmdb. The LMDB code is only built if liblmdb is present. So, I think we're good. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 4/5] bisect: add the terms old/new
Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes: Matthieu Moy matthieu@imag.fr writes: Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr Signed-off-by: Matthieu Moy matthieu@imag.fr Sounds like you went all out with this patch. Ah, the first line was in the original patch, and the second is added by send-email -s ... Both me and myself agree that one of them can be removed indeed ;-). -- Matthieu Moy http://www-verimag.imag.fr/~moy/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v7 0/5] git bisect old/new
Matthieu Moy matthieu@grenoble-inp.fr writes: It is somewhat confusing to see v3 yesterday and then this v7 next day. How did I miss v4 thru v6? Oops, I pattern-matched the wrong part when reading [PATCH v3 6/6]. Indeed, this should have been numberred v4, I should have s/v6/v3/ in my sentence above. OK. So, the first commit having property name_good is between base and branch2. So, the message seems wrong in two ways: Yeah, I agree with you on both points. Actually, I think it would make sense to include my drawing above in a comment in the code. Sounds like a good idea. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On Tue, 2015-06-23 at 15:53 -0400, David Turner wrote: * Regarding MERGE_HEAD: you take the point of view that it must continue to be stored as a file. And yet it must also behave somewhat like a reference; for example, `git rev-parse MERGE_HEAD` works today. MERGE_HEAD is also used for reachability, right? Another point of view is that MERGE_HEAD is a plain old boring reference, but there is some other metadata related to it that the refs backend has to store. The file-based backend would have special-case code to read the additional data from the tail of the loose refs file (and be sure to write the metadata when writing the reference), but other backends could store the reference with the rest but do their own thing with the metadata. So I guess I'm wondering whether the refs API needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with its metadata. You are probably right that this is a good idea. On reflection, I think it might make sense to keep MERGE_HEAD as a file. The problem is that not only would refs backends have to add new MERGE_HEAD-handling functions, but we would also need new plumbing commands to allow scripts to access the complete contents of MERGE_HEAD. That seems more complicated to me. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On Mon, Jun 22, 2015 at 08:50:56PM -0400, David Turner wrote: The db backend runs git for-each-ref about 30% faster than the files backend with fully-packed refs on a repo with ~120k refs. It's also about 4x faster than using fully-unpacked refs. In addition, and perhaps more importantly, it avoids case-conflict issues on OS X. Neat. Can you describe a bit more about the reflog handling? One of the problems we've had with large-ref repos is that the reflog storage is quite inefficient. You can pack all the refs, but you may still be stuck with a bunch of reflog files with one entry, wasting a whole inode. Doing a git repack when you have a million of those has horrible cold-cache performance. Basically anything that isn't one-file-per-reflog would be a welcome change. :) It has also been a dream of mine to stop tying the reflogs specifically to the refs. I.e., have a spot for reflogs of branches that no longer exist, which allows us to retain them for deleted branches. Then you can possibly recover from a branch deletion, whereas now you have to dig through git fsck's dangling output. And the reflog, if you don't expire it, becomes a suitable audit log to find out what happened to each branch when (whereas now it is full of holes when things get deleted). Does your solution handle something like that? I was thinking of actually moving to a log-structured ref storage. Something like: - any ref write puts a line at the end of a single logfile that contains the ref name, along with the normal reflog data - the logfile is the source of truth for the ref state. If you want to know the value of any ref, you can read it backwards to find the last entry for the ref. Everything else is an optimization. Let's call the number of refs N, and the number of ref updates in the log U. - we keep a key/value index mapping the name of any branch that exists to the byte offset of its entry in the logfile. This would probably be in some binary key/value store (like LMDB). Without this, resolving a ref is O(U), which is horrible. With it, it should be O(1) or O(lg N), depending on the index data structure. - the index can also contain other optimizations. E.g., rather than point to the entry in the logfile, it can include the sha1 directly (to avoid an extra level of indirection). It may want to include the peeled value, as the current packed-refs file does. - Reading all of the reflogs (e.g., for repacking) is O(U), just like it is today. Except the storage for the logfile is a lot more compact than what we store today, with one reflog per file. - Reading a single reflog is _also_ O(U), which is not as good as today. But if each log entry contains a byte offset of the previous entry, you can follow the chain (it is still slightly worse, because you are jumping all over the file, rather than reading a compact set of lines). - Pruning the reflog entries from the logfile requires rewriting the whole thing. That's similar to today, where we rewrite each of the reflog files. One of the nice properties of this system is that it should be very resilient to corruption and races. Most of the operations are either appending to a file, or writing to a tempfile and renaming in place. The exception is the key/value index, but if we run into any problems there, it can be rebuilt by walking over the logfile (for a cost of O(U)). I dunno. Maybe I am overthinking it. But it really feels like the _refs_ are a key/value thing, but the _reflogs_ are not. You can cram them into a key/value store, but you're probably operating on them as a big blob, then. I chose to use LMDB for the database. LMDB has a few features that make it suitable for usage in git: One of the complaints that Shawn had about sqlite is that there is no native Java implementation, which makes it hard for JGit to ship a compatible backend. I suspect the same is true for LMDB, but it is probably a lot simpler than sqlite (so reimplementation might be possible). But it may also be worth going with a slightly slower database if we can get wider compatibility for free. To test this backend's correctness, I hacked test-lib.sh and test-lib-functions.sh to run all tests under the refs backend. Dozens of tests use manual ref/reflog reading/writing, or create submodules without passing --refs-backend-type to git init. If those tests are changed to use the update-ref machinery or test-refs-be-db (or, in the case of packed-refs, corrupt refs, and dumb fetch tests, are skipped), the only remaining failing tests are the git-new-workdir tests and the gitweb tests. I think we'll need to bump core.repositoryformatversion, too. See the patches I just posted here: http://thread.gmane.org/gmane.comp.version-control.git/272447 -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, Jun 23, 2015 at 06:14:22PM +0700, Duy Nguyen wrote: + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); + if (pack_kept_objects 0) pack_kept_objects = write_bitmaps; I know both commands have translatable strings that are not marked with _(). But if you reroll, maybe you could add _() for these new strings. It's even better if you mark all others in these commands too, if you have time. Yeah, I agree these should be marked for translation. Thanks. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git name-rev not accepting abbreviated SHA with --stdin
Hi all, git name-rev does not accept abbreviated SHAs if --stdin is used, though it works when the SHA is given directly on the command line: $ git version git version 2.4.3 $ git name-rev --tags d73f544 d73f544 tags/v3.6.3~29 $ git name-rev --tags --stdin d73f544 d73f544 This *is* documented, but I'm curious why this distinction is made. Is it merely a matter of parsing or were there some other complications I am unaware of, which forced this distinction to be made? thanks sitaram -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Enable core.fsyncObjectFiles by default
On Tue, Jun 23, 2015 at 02:57:23PM -0700, Stefan Beller wrote: Linus Torvalds started a discussion[1] if we want to play rather safe than use defaults which make sense only for the most power users of Git: So git is safe in the sense that you won't really lose any data, but you may well be inconvenienced. The fsync each object config option is there in case you don't want that inconvenience, but it should be noted that it can make for a hell of a performance impact. Of course, it might well be the case that the actual default might be worth turning around. Most git users probably don't care about that kind of apply two hundred patches from Andrew Morton kind of workload, although rebase a big patch-series does end up doing basically the same thing, and might be more common. This patch enables fsync_object_files by default. If you are looking for safety out of the box, I think this falls far short, as we do not fsync all of the other files. For instance, we do not fsync refs before they are written (nor anything else that uses the commit_lock_file() interface to rename, such as the index). We do always fsync packfiles and their indices. I had always assumed this was fine on ext4 with data=ordered (i.e., either the rename and its pointed-to content will go through, or not; so you either get your update or the old state, but not a garbage or empty file). But it sounds from what Ted wrote in: http://article.gmane.org/gmane.linux.file-systems/97255 that this may not be the case. If it's not, I think we should consider fsyncing ref writes. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Enable core.fsyncObjectFiles by default
Jeff King p...@peff.net writes: I had always assumed this was fine on ext4 with data=ordered (i.e., either the rename and its pointed-to content will go through, or not; so you either get your update or the old state, but not a garbage or empty file). But it sounds from what Ted wrote in: http://article.gmane.org/gmane.linux.file-systems/97255 that this may not be the case. If it's not, I think we should consider fsyncing ref writes. That matches my understanding. IIRC, we do fsync (without a knob to turn it off) packs, we do not fsync refs (as you described above), the index, or working tree files (via write_entry()). Among these: * If we are so paranoid that loss of new loose objects matter (as opposed to Heh, it is just the matter of 'git add' again) to cause us to set core.fsyncObjectFiles, we should definitely fsync ref writes. They are even more precious. * The index is much less precious. In the worst case, you can 'git reset' (no flags) and re-add from the working tree and nothing unrecoverable is lost. I do not mind a knob to force us fsync, but we may want it to be separate from core.fsyncObjectFiles. * The working tree files? I am not sure. Again, recovery is just to run git diff to notice what was eaten by the power failure and then git checkout $path to restore from a known state, so... -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Enable core.fsyncObjectFiles by default
Theodore Ts'o ty...@mit.edu writes: The main issue is that non-expert users might not realize that they really need to run git fsck after a crash; otherwise, what might happen is that although git is only appending, that you might have some zero-length (or partially written) git object or pack files, and while you might not notice at that moment, it might come and bite you later. Regarding loose object files, given that we write to a temporary, optionally fsync, close and then move to the final name, would we still see partially written file if we omit the fsync, or would the corruption be limited to either empty or missing? The reason I am wondering is because the codepath to create an object (i.e. update-index --add, hash-object -w, or add) first checks if a packed or a loose object file _exists_ and if so bypasses writing the same thing anew, but the existence check for a loose object is to merely making sure that access(F_OK) (and optionally utime()) succeeds. If the potential breakage is limited to truncation to empty, then we could replace it with stat(2) and st.st_size check, as no loose object file can be empty. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
incomplete footers added by list server?
Since last friday between 10:39 and 10:50 UTC, mails to git@vger suddenly get an incomplete footer added. Instead of the normal footer: To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Only the first line is now added, actually making it fairly useless :) -- Dennis Kaarsemaker http://www.kaarsemaker.net -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: incomplete footers added by list server?
Hi Dennis, On 2015-06-23 11:43, Dennis Kaarsemaker wrote: Since last friday between 10:39 and 10:50 UTC, mails to git@vger suddenly get an incomplete footer added. Instead of the normal footer: To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Only the first line is now added, actually making it fairly useless :) Actually, I received this mail with the entire footer intact. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 12/19] initial_ref_transaction_commit(): check for duplicate refs
On 06/22/2015 11:06 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: Error out if the ref_transaction includes more than one update for any refname. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- refs.c | 11 +++ 1 file changed, 11 insertions(+) This somehow feels like ehh, I now know better and this function should have been like this from the beginning to me. But that is OK. Is the initial creation logic too fragile to deserve its own function to force callers to think about it, by the way? What I am wondering is if we could turn the safety logic that appear here (i.e. no existing refs must be assumed by the set of updates, etc.) into an optimization cue and implement this as a special case helper to ref_transaction_commit(), i.e. ref_transaction_commit(...) { if (updates are all initial creation no existing refs in repository) return initial_ref_transaction_commit(...); /* otherwise we do the usual thing */ ... } and have clone call ref_transaction_commit() as usual. The safety logic in this function is (approximately) necessary, but not sufficient, to guarantee safety. One of the shortcuts that it takes is not locking the references while they are being created. Therefore, it would be unsafe for one process to call ref_transaction_commit() while another is calling initial_ref_transaction_commit(). So the caller has to know somehow that no other processes are working in the repository for this optimization to be safe. It conveys that knowledge by calling initial_ref_transaction_commit() rather than ref_transaction_commit(). Of course the next question is, How does `git clone` know that no other process is working in the new repository? Actually, it doesn't. For example, I just verified that I can run git clone $URL mygit sleep 0.1 cd mygit git commit --allow-empty -m New root commit and thereby overwrite the upstream `master` without the usual non-fast-forward protection. I guess we are just relying on the user's common sense not to run Git commands in a new repository before its creation is complete. I suppose we *could* special-case `git clone` to not finish the initialization of the repository (for example, not write the `config` file) until *after* the packed-refs file is written. This would prevent other git processes from recognizing the directory as a Git repository and so prevent them from running before the clone is finished. But I think if anything it would make more sense to go the other direction: * Teach ref_transaction_commit() an option that asks it to write references updates to packed-refs instead of loose refs (but locking the references as usual). * Change clone to use ref_transaction_commit() like everybody else, passing it the new REFS_WRITE_TO_PACKED_REFS option. Then clone would participate in the normal locking protocol, and it wouldn't *matter* if another process runs before the clone is finished. There would also be some consistency benefits. For example, if core.logallrefupdates is set globally or on the command line, the initial reference creations would be reflogged. And other operations that write references in bulk could use the new REFS_WRITE_TO_PACKED_REFS option to prevent loose reference proliferation. But I don't think any of this is a problem in practice, and I think we can live with using the optimized-but-not-100%-safe initial_ref_transaction_commit() for cloning. Michael -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line unsubscribe git in
Dependency Management
Hi, Sorry to bother you with this question but I can't find any official answer or strong opinion from Git community. In my company we recently started to use Git and we wonder how to share code and manage dependencies with Git? Use case: in project P we need to include lib-a and lib-b (libraries shared by several projects) In your opinion, what is the future proof solution? * Use submodule * Use subtree We know there is lot of PRO/CONS but I feel that subtree is behind in the race and the latest version of submodule work fine Suggestions are very welcome. Thanks in advance, Jean Audibert _ This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of Euronext N.V. or any of its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] introduce preciousObjects repository extension
On Tue, Jun 23, 2015 at 5:54 PM, Jeff King p...@peff.net wrote: diff --git a/builtin/prune.c b/builtin/prune.c index 0c73246..fc0c8e8 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix) return 0; } + if (repository_format_precious_objects) + die(cannot prune in a precious-objects repo); + while (argc--) { unsigned char sha1[20]; const char *name = *argv++; diff --git a/builtin/repack.c b/builtin/repack.c index af7340c..8ae7fe5 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, builtin_repack_options, git_repack_usage, 0); + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); + if (pack_kept_objects 0) pack_kept_objects = write_bitmaps; I know both commands have translatable strings that are not marked with _(). But if you reroll, maybe you could add _() for these new strings. It's even better if you mark all others in these commands too, if you have time. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: RFC/Pull Request: Refs db backend
On Tue, Jun 23, 2015 at 12:36 PM, Junio C Hamano gits...@pobox.com wrote: If there is interest? Shut up and take my money ;-) Yeah. This may be the next big thing since pack bitmap. It's even better if it enters 'master' hand in hand with pack protocol v2, but I think v2 needs more time. On Tue, Jun 23, 2015 at 7:50 AM, David Turner dtur...@twopensource.com wrote: To test this backend's correctness, I hacked test-lib.sh and test-lib-functions.sh to run all tests under the refs backend. Now we have two. split-index also benefits from running through full test suite like this. I propose we make make test run the test suite twice. The first run is with default configuration, no split index, no fancy ref backend. The second run enables split-index and switches to new backend, running through all test cases. In future we can also enable packv4 in this second run. There won't be a third run. When the second ref backend comes, we can switch between the two backends using a random number generator where we control both algorithm and seed, so that when a test fails, the user can give us their seed and we can re-run with the same configuration. Dozens of tests use manual ref/reflog reading/writing, or create submodules without passing --refs-backend-type to git init. If those tests are changed to use the update-ref machinery or test-refs-be-db (or, in the case of packed-refs, corrupt refs, and dumb fetch tests, are skipped), the only remaining failing tests are the git-new-workdir tests and the gitweb tests. I haven't read the series, but I guess you should also add a few tests to run on the first run, so new code is exercised a bit even if people skip the second run. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[RFC/PATCH 0/2] bumping repository format version
We've managed to avoid bumping core.repositoryformatversion for the past 10 years, which is great. But I think there are some looming features that are going to need it. The most obvious one is changing the ref storage, where we need some way to tell older gits no, please don't think that taking 'refs/heads/foo.lock' is sufficient to actually lock. The first patch in this series is an attempt to pave the way for version bumps like this in as painless a way as possible, by letting us mark incompatible extensions by name. That way we can version things like how do you lock a ref independent of the main repositoryformatversion setting (just like we do for index version, for example). See the explanation in the first patch for more details. The second patch shows another use of the extension feature to provide safety in shared-object repos against older versions of git. [1/2]: introduce extensions form of core.repositoryformatversion [2/2]: introduce preciousObjects repository extension -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 04/19] for-each-ref: add '--points-at' option
On Tue, Jun 23, 2015 at 4:08 AM, Eric Sunshine sunsh...@sunshineco.com wrote: On Sun, Jun 21, 2015 at 4:48 PM, Karthik Nayak karthik@gmail.com wrote: Add the '--points-at' option provided by 'ref-filter'. The option lets the user to pick only refs which point to a particular commit. Add documentation and tests for the same. Based-on-patch-by: Jeff King p...@peff.net Mentored-by: Christian Couder christian.cou...@gmail.com Mentored-by: Matthieu Moy matthieu@grenoble-inp.fr Signed-off-by: Karthik Nayak karthik@gmail.com --- diff --git a/t/t6301-for-each-ref-filter.sh b/t/t6301-for-each-ref-filter.sh index b1fa8d4..67de3a7 100755 --- a/t/t6301-for-each-ref-filter.sh +++ b/t/t6301-for-each-ref-filter.sh @@ -16,4 +16,24 @@ test_expect_success 'setup some history and refs' ' git update-ref refs/odd/spot master ' +test_expect_success 'filtering with --points-at' ' + cat expect -\EOF + refs/heads/master + refs/odd/spot + refs/tags/three + EOF + git for-each-ref --format=%(refname) --points-at=master actual + test_cmp expect actual +' + +test_expect_success 'check signed tags with --points-at' ' + cat expect -\EOF + refs/heads/side + refs/tags/four + refs/tags/signed-tag four + EOF + git for-each-ref --format=%(refname) %(*subject) --points-at=side actual s/for-each-ref\s+/for-each-ref / Will change! thanks :) -- Regards, Karthik Nayak -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] introduce preciousObjects repository extension
If this extension is used in a repository, then no operations should run which may drop objects from the object storage. This can be useful if you are sharing that storage with other repositories whose refs you cannot see. For instance, if you do: $ git clone -s parent child $ git -C parent config extensions.preciousObjects true $ git -C parent config core.repositoryformatversion 1 you now have additional safety when running git in the parent repository. Prunes and repacks will bail with an error, and `git gc` will skip those operations (it will continue to pack refs and do other non-object operations). Older versions of git, when run in the repository, will fail on every operation. Note that we do not set the preciousObjects extension by default when doing a clone -s, as doing so breaks backwards compatibility. It is a decision the user should make explicitly. Signed-off-by: Jeff King p...@peff.net --- Documentation/technical/repository-version.txt | 7 +++ builtin/gc.c | 20 +++- builtin/prune.c| 3 +++ builtin/repack.c | 3 +++ cache.h| 1 + environment.c | 1 + setup.c| 2 ++ t/t1302-repo-version.sh| 22 ++ 8 files changed, 50 insertions(+), 9 deletions(-) diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt index 3d7106d..00ad379 100644 --- a/Documentation/technical/repository-version.txt +++ b/Documentation/technical/repository-version.txt @@ -79,3 +79,10 @@ The defined extensions are: This extension does not change git's behavior at all. It is useful only for testing format-1 compatibility. + +`preciousObjects` +~ + +When the config key `extensions.preciousObjects` is set to `true`, +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or +`git repack -d`). diff --git a/builtin/gc.c b/builtin/gc.c index 36fe333..8b8dc6b 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -352,15 +352,17 @@ int cmd_gc(int argc, const char **argv, const char *prefix) if (gc_before_repack()) return -1; - if (run_command_v_opt(repack.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, repack.argv[0]); - - if (prune_expire) { - argv_array_push(prune, prune_expire); - if (quiet) - argv_array_push(prune, --no-progress); - if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) - return error(FAILED_RUN, prune.argv[0]); + if (!repository_format_precious_objects) { + if (run_command_v_opt(repack.argv, RUN_GIT_CMD)) + return error(FAILED_RUN, repack.argv[0]); + + if (prune_expire) { + argv_array_push(prune, prune_expire); + if (quiet) + argv_array_push(prune, --no-progress); + if (run_command_v_opt(prune.argv, RUN_GIT_CMD)) + return error(FAILED_RUN, prune.argv[0]); + } } if (prune_worktrees_expire) { diff --git a/builtin/prune.c b/builtin/prune.c index 0c73246..fc0c8e8 100644 --- a/builtin/prune.c +++ b/builtin/prune.c @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char *prefix) return 0; } + if (repository_format_precious_objects) + die(cannot prune in a precious-objects repo); + while (argc--) { unsigned char sha1[20]; const char *name = *argv++; diff --git a/builtin/repack.c b/builtin/repack.c index af7340c..8ae7fe5 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char *prefix) argc = parse_options(argc, argv, prefix, builtin_repack_options, git_repack_usage, 0); + if (delete_redundant repository_format_precious_objects) + die(cannot repack in a precious-objects repo); + if (pack_kept_objects 0) pack_kept_objects = write_bitmaps; diff --git a/cache.h b/cache.h index bee425b..9b59a63 100644 --- a/cache.h +++ b/cache.h @@ -694,6 +694,7 @@ extern int grafts_replace_parents; #define GIT_REPO_VERSION 0 #define GIT_REPO_VERSION_READ 1 extern int repository_format_version; +extern int repository_format_precious_objects; extern int check_repository_format(void); #define MTIME_CHANGED 0x0001 diff --git a/environment.c b/environment.c index 61c685b..da66e82 100644 --- a/environment.c +++ b/environment.c @@ -26,6 +26,7 @@ int warn_ambiguous_refs = 1; int warn_on_object_refname_ambiguity = 1; int ref_paranoia = -1; int repository_format_version;
[PATCH 1/2] introduce extensions form of core.repositoryformatversion
Normally we try to avoid bumps of the whole-repository core.repositoryformatversion field. However, it is unavoidable if we want to safely change certain aspects of git in a backwards-incompatible way (e.g., modifying the set of ref tips that we must traverse to generate a list of unreachable, safe-to-prune objects). If we were to bump the repository version for every such change, then any implementation understanding version `X` would also have to understand `X-1`, `X-2`, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature). This patch documents the existing repositoryformatversion strategy and introduces a new format, 1, which lets a repository specify that it must run with an arbitrary set of extensions. This can be used, for example: - to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g, because it has clone --shared children) - that the refs are stored in a format besides the usual refs and packed-refs directories Because we bump to format 1, and because format 1 requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats. For example, if the user chooses to use database storage for refs, they may set the extensions.refbackend config to db. Older versions of git will not understand format 1 and bail. Versions of git which understand 1 but do not know about refbackend, or which know about refbackend but not about the db backend, will refuse to run. This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read. Note that we are only defining the rules for format 1 here. We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations. Signed-off-by: Jeff King p...@peff.net --- Documentation/technical/repository-version.txt | 81 ++ cache.h| 6 ++ setup.c| 37 +++- t/t1302-repo-version.sh| 38 4 files changed, 159 insertions(+), 3 deletions(-) create mode 100644 Documentation/technical/repository-version.txt diff --git a/Documentation/technical/repository-version.txt b/Documentation/technical/repository-version.txt new file mode 100644 index 000..3d7106d --- /dev/null +++ b/Documentation/technical/repository-version.txt @@ -0,0 +1,81 @@ +Git Repository Format Versions +== + +Every git repository is marked with a numeric version in the +`core.repositoryformatversion` key of its `config` file. This version +specifies the rules for operating on the on-disk repository data. An +implementation of git which does not understand a particular version +advertised by an on-disk repository MUST NOT operate on that repository; +doing so risks not only producing wrong results, but actually losing +data. + +Because of this rule, version bumps should be kept to an absolute +minimum. Instead, we generally prefer these strategies: + + - bumping format version numbers of individual data files (e.g., +index, packfiles, etc). This restricts the incompatibilities only to +those files. + + - introducing new data that gracefully degrades when used by older +clients (e.g., pack bitmap files are ignored by older clients, which +simply do not take advantage of the optimization they provide). + +A whole-repository format version bump should only be part of a change +that cannot be independently versioned. For instance, if one were to +change the reachability rules for objects, or the rules for locking +refs, that would require a bump of the repository format version. + +Note that this applies only to accessing the repository's disk contents +directly. An older client which understands only format `0` may still +connect via `git://` to a repository using format `1`, as long as the +server process understands format `1`. + +The preferred strategy for rolling out a version bump (whether whole +repository or for a single file) is to teach git to read the new format, +and allow writing the new format with a config switch or command line +option (for experimentation or for those who do not care about backwards +compatibility with older gits). Then after a long period to allow the +reading capability to become common, we may switch to writing the new +format by default. + +The currently defined format versions are: + +Version `0` +--- + +This is the format defined by
Re: [PATCH v4 03/19] ref-filter: implement '--points-at' option
On Tue, Jun 23, 2015 at 4:06 AM, Eric Sunshine sunsh...@sunshineco.com wrote: s/a one/one/ Noted! Thanks -- Regards, Karthik Nayak -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4 07/19] for-each-ref: add '--merged' and '--no-merged' options
On Tue, Jun 23, 2015 at 4:11 AM, Eric Sunshine sunsh...@sunshineco.com wrote: According to the documentation you added to the OPTIONS section, the object following --merged and --no-merged is optional. Therefore, shouldn't this be s/object/[object]/ ? Also, in the OPTIONS section, you spelled it commit rather than object. Ditto: s/object/[object]/ Noted and changed, this would apply to --contains also :) -- Regards, Karthik Nayak -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] apply: fix adding new files on i-t-a entries
On Tue, Jun 23, 2015 at 9:50 AM, Junio C Hamano gits...@pobox.com wrote: Nguyễn Thái Ngọc Duy pclo...@gmail.com writes: - pos = cache_name_pos(name, strlen(name)); + pos = exists_in_cache(name, strlen(name)); Something that is named as if it would return yes/no that returns a real value is not a very welcome idea. +/* This is the same as index_name_pos, except that i-t-a entries are invisible */ +int exists_in_index(const struct index_state *istate, const char *name, int namelen) +{ + int pos = index_name_stage_pos(istate, name, namelen, 0); + + if (pos 0) + return pos; + if (istate-cache[pos]-ce_flags CE_INTENT_TO_ADD) + return -pos-1; This is a useless complexity. Your callers cannot use the returned value like this: pos = exists_in_cache(...); if (pos 0) { if (active_cache[-pos-1]-ce_flags CE_INTENT_TO_ADD) ; /* ah it actually exists but it is i-t-a */ else ; /* no it does not really exist */ } else { ; /* yes it is really there at pos */ } because they cannot tell two cases apart: (1) you do have i-t-a with the given name, (2) you do not have the entry but the location you would insert an entry with such a name is occupied by an unrelated entry (i.e. with a name that sorts adjacent) that happens to be i-t-a. Also, the callers cannot even use that return value in the usual way they would use the return value from index_name_pos(), either. pos = exists_in_cache(...); if (pos 0) { /* ah, it does not exist, so... */ pos = -1 - pos; /* * ... it is OK to shift active_cache[pos..] by one and add our * entry at active_cache[pos] */ } else { /* it exists, so update in place */ ; } So, returning pos that smells like a return value from index_name_pos() only has an effect of confusing callers into buggy code, I am afraid. The callers that care need to be updated to check for ce_flags after finding the entry with index_name_pos() the usual way if you want to avoid search in the index_state-cache[] twice, and the callers that are only interested in knowing if an entry exists are better off with an exists_in_cache() that returns Yes/No and not a confusing and useless pos, I think. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Enable core.fsyncObjectFiles by default
On Wed, Jun 24, 2015 at 4:57 AM, Stefan Beller sbel...@google.com wrote: Linus Torvalds started a discussion[1] if we want to play rather safe than use defaults which make sense only for the most power users of Git: So git is safe in the sense that you won't really lose any data, but you may well be inconvenienced. The fsync each object config option is there in case you don't want that inconvenience, but it should be noted that it can make for a hell of a performance impact. Of course, it might well be the case that the actual default might be worth turning around. Most git users probably don't care about that kind of apply two hundred patches from Andrew Morton kind of workload, although rebase a big patch-series does end up doing basically the same thing, and might be more common. This patch enables fsync_object_files by default. Will this make nfs performance a lot worse or still within acceptable range? -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Enable core.fsyncObjectFiles by default
Stefan Beller sbel...@google.com writes: Linus Torvalds started a discussion[1] if we want to play rather safe than use defaults which make sense only for the most power users of Git: So git is safe in the sense that you won't really lose any data, but you may well be inconvenienced. The fsync each object config option is there in case you don't want that inconvenience, but it should be noted that it can make for a hell of a performance impact. Of course, it might well be the case that the actual default might be worth turning around. Most git users probably don't care about that kind of apply two hundred patches from Andrew Morton kind of workload, although rebase a big patch-series does end up doing basically the same thing, and might be more common. This patch enables fsync_object_files by default. Sorry, but I fail to see which part of what Linus said (which is the only thing you quoted from the discussion) argues for this patch. If anything, I read that as cautioning people against making a tradeoff based on an incorrect perception of risks and blindly flipping this bit ON (the original discussion a few days ago, where Ted says he has this bit ON while clarifying that he does so on SSD, is also a sensible description on how he made his trade-off). It's a different matter whom you would want to align with when assessing your own risk tolerance. If you quoted Corbet's original message, then that would have been more consistent. [1] https://plus.google.com/u/1/+JonathanCorbet/posts/JBxiKPe3VXa Signed-off-by: Stefan Beller sbel...@google.com --- Documentation/config.txt | 8 environment.c| 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index 43bb53c..dce2640 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -693,10 +693,10 @@ core.whitespace:: core.fsyncObjectFiles:: This boolean will enable 'fsync()' when writing object files. + -This is a total waste of time and effort on a filesystem that orders -data writes properly, but can be useful for filesystems that do not use -journalling (traditional UNIX filesystems) or that only journal metadata -and not file contents (OS X's HFS+, or Linux ext3 with data=writeback). +This ensures objects are written to disk instead of relying on the +operating systems cache and eventual write. Disabling this option will +yield performance with a trade off in safety for repository corruption +during power loss. core.preloadIndex:: Enable parallel index preload for operations like 'git diff' diff --git a/environment.c b/environment.c index 61c685b..b406f5e 100644 --- a/environment.c +++ b/environment.c @@ -35,7 +35,7 @@ const char *git_attributes_file; int zlib_compression_level = Z_BEST_SPEED; int core_compression_level; int core_compression_seen; -int fsync_object_files; +int fsync_object_files = 1; size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE; size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT; size_t delta_base_cache_limit = 96 * 1024 * 1024; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Enable core.fsyncObjectFiles by default
The main issue is that non-expert users might not realize that they really need to run git fsck after a crash; otherwise, what might happen is that although git is only appending, that you might have some zero-length (or partially written) git object or pack files, and while you might not notice at that moment, it might come and bite you later. If you do try to recover immediately after a crash, in the worst case you might have to do that git am -s /tmp/mbox-filled-with patches command, but otherwise you won't lose much data. So perhaps one alternative strategy to calling fsync(2) after every single git object file write might be to have git create a zero-length .git/in-progress-pid file, which gets fsync'ed, and then it can do the git am -s /tmp/akpm-big-bag-o-patches processing nice and fast, and once git is done, then we call call sync(2) and then delete the in-progress file. If there is an in-progress file in the .git directory, git would then automatically run git fsck to make sure the repository is consistent. For people who care, maybe that's a good compromise. (Me, the way things are right now is just fine since I have a nice fast SSD, and so setting fsyncObjectfiles is a perfectly fine thing as far as I am concerned. :-) - Ted -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html