Re: [PATCH 10/15] fetch --tags: fetch tags *in addition to* other stuff
On 10/28/2013 08:10 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: True but when fetching other references, tags relevant to the history being fetched by default should automatically follow, so the above explains why fetch --tags is not a useful thing to do daily. Maybe not necessary in many scenarios, but is it harmful for the common case of cloning from and then periodically fetching from an upstream? There is no mention of harmful; I only said not useful. And it comes primarily from knowing why --tags was introduced in the first place; I should have said less useful than before, ever since we started to reliably auto-follow. The only harmful part is its interaction with --prune, which your series nicely addresses. OK, then we are in agreement. ISTM that the result of the declarative --tags option we have all upstream tags is easier to reason about than the history-dependent tag-following behavior I'd say we have all the relevant tags and we have all the tags the other side has are equally valid things to wish for, depending who you are and how your project is organized, and one is not necessarily easy/useful than the other. In case it was unclear, I was not seriously advocating to deprecate/remove --tags. Yes, I agree that both are valid things to wish for. I guess I was mostly thinking about which would be a better default. clone and remote add, by default, configure the repo to fetch all branches on the remote. For this default setup, what are the pros and cons of the two tag-fetching options (assuming that this patch series has fixed the problem with tag pruning)? Pros of auto-following: * Doesn't require a change to the status quo * The the user can change the refspec to be more restrictive without having to change the tag-following option and it continues to do the right thing. * If the project has branches outside of refs/heads (which would not, by default, be fetched--think continuous integration artifacts) and also has tags pointing at those branches, the user might get unwanted contents with --tags, but not with auto-following. Pros of --tags: * Easier to understand (?) because result is not history-dependent. * Avoids missing tags that are not directly on a branch: o---o---o---o- branch \ o - tag So it's not obvious that one is better than the other. I think if I were choosing the behavior for the first time, I would favor --tags. But I don't think the advantage is strong enough to make it worth changing the existing default. Regarding your first point: if it matters which of the duplicates is chosen, then it can only matter because they differ in some other way than their reference names, for example in their flags. So a robust way of de-duping them (if it is possible) would be to compare the two records and decide which one should take precedence based on this other information rather than based on which one happened to come earlier in the list. Then the list order would be immaterial and (for example) it wouldn't be a problem to reorder the items. But that changes the behaviour to those who has cared to rely on the ordering. With the change to first collect and then sort unstably before deduping, the series already tells them not to rely on the order, and two of us tentatively agreed in this discussion that it is not likely to matter. If later this agreement turns out to be a mistake and there are people who *do* rely on the ordering, the only acceptable fix for them is by making sure we restore the first one trumps semantics, not by defining an alternative, arguably better, behaviour---that is not a regression fix. Please note that the current patch series does *not* change the algorithm to use an unstable sort; that was only a suggestion for the future should somebody discover that the O(N^2) algorithm in this function is a performance bottleneck. What the old patch series *did* do was change the ordering that get_ref_map() adds references to the list in the case of (refspec_count tags == TAGS_SET) by moving the (tags == TAGS_SET) handling outside of the true branch of the (refspec_count) conditional. This had the result that the references added by get_fetch_map(remote_refs, tag_refspec, tail, 0); came after, rather than before, the references opportunistically being fetched with FETCH_HEAD_IGNORE status. But I dug even deeper and found that there was a (rather obscure) situation where the ordering change could lead to incorrect behavior, namely if all of the following are true: * there is a configured refspec for tags, like refs/tags/*:refs/tags/* * the user runs fetch with the --tags option and also with an explicit refspec on the command line to fetch a remote tag (e.g., refs/tags/foo:refs/heads/foo). In that case (after this version of this patch), an entry for refs/tags/foo:refs/heads/foo would be added to the list, then the
Re: [PATCH 10/15] fetch --tags: fetch tags *in addition to* other stuff
Michael Haggerty mhag...@alum.mit.edu writes: True but when fetching other references, tags relevant to the history being fetched by default should automatically follow, so the above explains why fetch --tags is not a useful thing to do daily. Maybe not necessary in many scenarios, but is it harmful for the common case of cloning from and then periodically fetching from an upstream? There is no mention of harmful; I only said not useful. And it comes primarily from knowing why --tags was introduced in the first place; I should have said less useful than before, ever since we started to reliably auto-follow. The only harmful part is its interaction with --prune, which your series nicely addresses. ISTM that the result of the declarative --tags option we have all upstream tags is easier to reason about than the history-dependent tag-following behavior I'd say we have all the relevant tags and we have all the tags the other side has are equally valid things to wish for, depending who you are and how your project is organized, and one is not necessarily easy/useful than the other. In case it was unclear, I was not seriously advocating to deprecate/remove --tags. Regarding your first point: if it matters which of the duplicates is chosen, then it can only matter because they differ in some other way than their reference names, for example in their flags. So a robust way of de-duping them (if it is possible) would be to compare the two records and decide which one should take precedence based on this other information rather than based on which one happened to come earlier in the list. Then the list order would be immaterial and (for example) it wouldn't be a problem to reorder the items. But that changes the behaviour to those who has cared to rely on the ordering. With the change to first collect and then sort unstably before deduping, the series already tells them not to rely on the order, and two of us tentatively agreed in this discussion that it is not likely to matter. If later this agreement turns out to be a mistake and there are people who *do* rely on the ordering, the only acceptable fix for them is by making sure we restore the first one trumps semantics, not by defining an alternative, arguably better, behaviour---that is not a regression fix. In any case, thanks for working on this topic. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/15] fetch --tags: fetch tags *in addition to* other stuff
On 10/24/2013 10:51 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: Previously, fetch's --tags option was considered equivalent to specifying the refspec refs/tags/*:refs/tags/* on the command line; in particular, it caused the remote.name.refspec configuration to be ignored. But it is not very useful to fetch tags without also fetching other references, whereas it *is* quite useful to be able to fetch tags *in addition to* other references. True but when fetching other references, tags relevant to the history being fetched by default should automatically follow, so the above explains why fetch --tags is not a useful thing to do daily. Maybe not necessary in many scenarios, but is it harmful for the common case of cloning from and then periodically fetching from an upstream? ISTM that the result of the declarative --tags option we have all upstream tags is easier to reason about than the history-dependent tag-following behavior we have all upstream tags that point to commits that were reachable from an upstream branch at the time of this or one of the previous fetches I just noticed that this is not explained very clearly in the fetch man page. I will try to improve it. If a user wants to fetch *only* tags, then it is still possible to specifying an explicit refspec: git fetch remote 'refs/tags/*:refs/tags/*' Please note that the documentation prior to 1.8.0.3 was ambiguous about this aspect of fetch --tags behavior. Commit f0cb2f137c 2012-12-14 fetch --tags: clarify documentation made the documentation match the old behavior. This commit changes the documentation to match the new behavior. The old behaviour as designed. The documentation change was not about refusing to fix a bug but the above makes it sound as if it were a such commit. I didn't mean to imply this. My point in mentioning the documentation change was that even though the old behavior has been in effect for a while, it was not clearly documented until quite recently. But I guess it is a moot point. The change to builtin/fetch.c:get_ref_map() has the side effect of changing the order that the (struct ref) objects are listed in ref_map. It seems to me that this could probably only matter in the case of duplicates. But later ref_remove_duplicates() is called, so duplicates are eliminated. However, ref_remove_duplicates() always retains the first instance of duplicates and discards the rest. It is conceivable that the boolean flags (which are not inspected by ref_remove_duplicates()) could differ among the duplicates, and that therefore changing the order of the items in the original list has the effect of changing the flags on the items that are retained. I don't know this code well enough to judge whether this might be a problem. If it is, then the correct approach is probably either to teach ref_remove_duplicates() to ensure that the flags are also consistent across duplicates, or to somehow combine the flags from all duplicates into the one that is retained. Please let me know if this is needed. I do not think either of these two is sufficient if you want to fix it; ref_remove_duplicates() needs to be taught to retain the first one it encounters among the duplicates, not ensure the flags are consistent by erroring out when they are not among duplicates, nor somehow combine flags from later one to the first one. But I suspect that, if this behaviour change were a problem, such a configuration was a problem before this change to most people; the order of duplicated [remote foo] section would not be under control of those who only use git remote without using an editor to tweak .git/config file anyway. So I do not think this regression is too big a deal; it is something you can fix later on top. I'll address your second point first: I agree that the order among explicit refspecs couldn't really be a problem because it is already arbitrary. However, get_ref_map() adds references from other sources to the result list. For example, if there are command-line arguments, it adds 1. command-line arguments (with fetch_head_status=FETCH_HEAD_MERGE) 2. configured refspecs (with fetch_head_status=FETCH_HEAD_IGNORE) 3. if (--tags) then tags_refspec (with fetch_head_status=FETCH_HEAD_NOT_FOR_MERGE) else the result of find_non_local_tags() (with fetch_head_status=FETCH_HEAD_NOT_FOR_MERGE) It could very well be that the order matters across different classes of reference. Regarding your first point: if it matters which of the duplicates is chosen, then it can only matter because they differ in some other way than their reference names, for example in their flags. So a robust way of de-duping them (if it is possible) would be to compare the two records and decide which one should take precedence based on this other information rather than based on which one happened to come earlier in the list. Then
Re: [PATCH 10/15] fetch --tags: fetch tags *in addition to* other stuff
On 10/24/2013 10:51 PM, Junio C Hamano wrote: Michael Haggerty mhag...@alum.mit.edu writes: diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt index ba1fe49..0e6d2ac 100644 --- a/Documentation/fetch-options.txt +++ b/Documentation/fetch-options.txt @@ -61,11 +61,9 @@ endif::git-pull[] ifndef::git-pull[] -t:: --tags:: -This is a short-hand for giving `refs/tags/*:refs/tags/*` -refspec from the command line, to ask all tags to be fetched -and stored locally. Because this acts as an explicit -refspec, the default refspecs (configured with the -remote.$name.fetch variable) are overridden and not used. +This is a short-hand requesting that all tags be fetched from +the remote in addition to whatever else is being fetched. It +is similar to using the refspec `refs/tags/*:refs/tags/*`. This is no longer a short-hand, is it? There is no other way to ask fetch the usual stuff, and then refs/tags/*:refs/tags/* as well. It should be sufficient for me to locally do: s/This is a short-hand requesting/Request/; I think. Yes, that's better. diff --git a/git-pull.sh b/git-pull.sh index b946fd9..dac7e1c 100755 --- a/git-pull.sh +++ b/git-pull.sh @@ -172,7 +172,7 @@ error_on_no_merge_candidates () { do case $opt in -t|--t|--ta|--tag|--tags) -echo Fetching tags only, you probably meant: +echo It doesn't make sense to pull tags; you probably meant: s/pull tags/pull all tags/; perhaps? Yes, that's also an improvement. Thanks, Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 10/15] fetch --tags: fetch tags *in addition to* other stuff
Michael Haggerty mhag...@alum.mit.edu writes: Previously, fetch's --tags option was considered equivalent to specifying the refspec refs/tags/*:refs/tags/* on the command line; in particular, it caused the remote.name.refspec configuration to be ignored. But it is not very useful to fetch tags without also fetching other references, whereas it *is* quite useful to be able to fetch tags *in addition to* other references. True but when fetching other references, tags relevant to the history being fetched by default should automatically follow, so the above explains why fetch --tags is not a useful thing to do daily. So change the semantics of this option to do the latter. And I am not opposed to that change in the semantics; it makes an operation that is not so useful less annoying. If a user wants to fetch *only* tags, then it is still possible to specifying an explicit refspec: git fetch remote 'refs/tags/*:refs/tags/*' Please note that the documentation prior to 1.8.0.3 was ambiguous about this aspect of fetch --tags behavior. Commit f0cb2f137c 2012-12-14 fetch --tags: clarify documentation made the documentation match the old behavior. This commit changes the documentation to match the new behavior. The old behaviour as designed. The documentation change was not about refusing to fix a bug but the above makes it sound as if it were a such commit. The change to builtin/fetch.c:get_ref_map() has the side effect of changing the order that the (struct ref) objects are listed in ref_map. It seems to me that this could probably only matter in the case of duplicates. But later ref_remove_duplicates() is called, so duplicates are eliminated. However, ref_remove_duplicates() always retains the first instance of duplicates and discards the rest. It is conceivable that the boolean flags (which are not inspected by ref_remove_duplicates()) could differ among the duplicates, and that therefore changing the order of the items in the original list has the effect of changing the flags on the items that are retained. I don't know this code well enough to judge whether this might be a problem. If it is, then the correct approach is probably either to teach ref_remove_duplicates() to ensure that the flags are also consistent across duplicates, or to somehow combine the flags from all duplicates into the one that is retained. Please let me know if this is needed. I do not think either of these two is sufficient if you want to fix it; ref_remove_duplicates() needs to be taught to retain the first one it encounters among the duplicates, not ensure the flags are consistent by erroring out when they are not among duplicates, nor somehow combine flags from later one to the first one. But I suspect that, if this behaviour change were a problem, such a configuration was a problem before this change to most people; the order of duplicated [remote foo] section would not be under control of those who only use git remote without using an editor to tweak .git/config file anyway. So I do not think this regression is too big a deal; it is something you can fix later on top. diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt index ba1fe49..0e6d2ac 100644 --- a/Documentation/fetch-options.txt +++ b/Documentation/fetch-options.txt @@ -61,11 +61,9 @@ endif::git-pull[] ifndef::git-pull[] -t:: --tags:: - This is a short-hand for giving `refs/tags/*:refs/tags/*` - refspec from the command line, to ask all tags to be fetched - and stored locally. Because this acts as an explicit - refspec, the default refspecs (configured with the - remote.$name.fetch variable) are overridden and not used. + This is a short-hand requesting that all tags be fetched from + the remote in addition to whatever else is being fetched. It + is similar to using the refspec `refs/tags/*:refs/tags/*`. This is no longer a short-hand, is it? There is no other way to ask fetch the usual stuff, and then refs/tags/*:refs/tags/* as well. It should be sufficient for me to locally do: s/This is a short-hand requesting/Request/; I think. diff --git a/git-pull.sh b/git-pull.sh index b946fd9..dac7e1c 100755 --- a/git-pull.sh +++ b/git-pull.sh @@ -172,7 +172,7 @@ error_on_no_merge_candidates () { do case $opt in -t|--t|--ta|--tag|--tags) - echo Fetching tags only, you probably meant: + echo It doesn't make sense to pull tags; you probably meant: s/pull tags/pull all tags/; perhaps? echo git fetch --tags exit 1 esac diff --git a/t/t5510-fetch.sh b/t/t5510-fetch.sh index 8328be1..02e5901 100755 --- a/t/t5510-fetch.sh +++ b/t/t5510-fetch.sh @@ -113,7 +113,7 @@ test_expect_success 'fetch --prune with a namespace keeps other namespaces' ' git rev-parse origin/master '
[PATCH 10/15] fetch --tags: fetch tags *in addition to* other stuff
Previously, fetch's --tags option was considered equivalent to specifying the refspec refs/tags/*:refs/tags/* on the command line; in particular, it caused the remote.name.refspec configuration to be ignored. But it is not very useful to fetch tags without also fetching other references, whereas it *is* quite useful to be able to fetch tags *in addition to* other references. So change the semantics of this option to do the latter. If a user wants to fetch *only* tags, then it is still possible to specifying an explicit refspec: git fetch remote 'refs/tags/*:refs/tags/*' Please note that the documentation prior to 1.8.0.3 was ambiguous about this aspect of fetch --tags behavior. Commit f0cb2f137c 2012-12-14 fetch --tags: clarify documentation made the documentation match the old behavior. This commit changes the documentation to match the new behavior. Signed-off-by: Michael Haggerty mhag...@alum.mit.edu --- The change to builtin/fetch.c:get_ref_map() has the side effect of changing the order that the (struct ref) objects are listed in ref_map. It seems to me that this could probably only matter in the case of duplicates. But later ref_remove_duplicates() is called, so duplicates are eliminated. However, ref_remove_duplicates() always retains the first instance of duplicates and discards the rest. It is conceivable that the boolean flags (which are not inspected by ref_remove_duplicates()) could differ among the duplicates, and that therefore changing the order of the items in the original list has the effect of changing the flags on the items that are retained. I don't know this code well enough to judge whether this might be a problem. If it is, then the correct approach is probably either to teach ref_remove_duplicates() to ensure that the flags are also consistent across duplicates, or to somehow combine the flags from all duplicates into the one that is retained. Please let me know if this is needed. Documentation/fetch-options.txt | 8 +++--- builtin/fetch.c | 46 +++- git-pull.sh | 2 +- t/t5510-fetch.sh | 22 --- t/t5515/fetch.br-unconfig_--tags_.._.git | 1 + t/t5515/fetch.master_--tags_.._.git | 1 + t/t5525-fetch-tagopt.sh | 23 7 files changed, 71 insertions(+), 32 deletions(-) diff --git a/Documentation/fetch-options.txt b/Documentation/fetch-options.txt index ba1fe49..0e6d2ac 100644 --- a/Documentation/fetch-options.txt +++ b/Documentation/fetch-options.txt @@ -61,11 +61,9 @@ endif::git-pull[] ifndef::git-pull[] -t:: --tags:: - This is a short-hand for giving `refs/tags/*:refs/tags/*` - refspec from the command line, to ask all tags to be fetched - and stored locally. Because this acts as an explicit - refspec, the default refspecs (configured with the - remote.$name.fetch variable) are overridden and not used. + This is a short-hand requesting that all tags be fetched from + the remote in addition to whatever else is being fetched. It + is similar to using the refspec `refs/tags/*:refs/tags/*`. --recurse-submodules[=yes|on-demand|no]:: This option controls if and under what conditions new commits of diff --git a/builtin/fetch.c b/builtin/fetch.c index 61e8117..7edb1ea 100644 --- a/builtin/fetch.c +++ b/builtin/fetch.c @@ -271,7 +271,7 @@ static struct ref *get_ref_map(struct transport *transport, const struct ref *remote_refs = transport_get_remote_refs(transport); - if (refspec_count || tags == TAGS_SET) { + if (refspec_count) { struct ref **old_tail; for (i = 0; i refspec_count; i++) { @@ -279,11 +279,9 @@ static struct ref *get_ref_map(struct transport *transport, if (refspecs[i].dst refspecs[i].dst[0]) *autotags = 1; } - /* Merge everything on the command line, but not --tags */ + /* Merge everything on the command line (but not --tags) */ for (rm = ref_map; rm; rm = rm-next) rm-fetch_head_status = FETCH_HEAD_MERGE; - if (tags == TAGS_SET) - get_fetch_map(remote_refs, tag_refspec, tail, 0); /* * For any refs that we happen to be fetching via command-line @@ -334,8 +332,13 @@ static struct ref *get_ref_map(struct transport *transport, tail = ref_map-next; } } - if (tags == TAGS_DEFAULT *autotags) + + if (tags == TAGS_SET) + /* also fetch all tags */ + get_fetch_map(remote_refs, tag_refspec, tail, 0); + else if (tags == TAGS_DEFAULT *autotags) find_non_local_tags(transport, ref_map, tail); + ref_remove_duplicates(ref_map); return ref_map;