[PATCH v2] gc: ignore old gc.log files
The intent of automatic gc is to have a git repository be relatively low-maintenance from a server-operator perspective. Of course, large operators like GitHub will need a more complicated management strategy, but for ordinary usage, git should just work. In this commit, git learns to ignore gc.log files which are older than (by default) one day old. It also learns about a config, gc.logExpiry to manage this. There is also some cleanup: a successful manual gc, or a warning-free auto gc with an old log file, will remove any old gc.log files. So git should never get itself into a state where it refuses to do any maintenance, just because at some point some piece of the maintenance didn't make progress. That might still happen (e.g. because the repo is corrupt), but at the very least it won't be because Git is too dumb to try again. Signed-off-by: David Turner Helped-by: Jeff King --- Documentation/config.txt | 5 + builtin/gc.c | 42 +++--- t/t6500-gc.sh| 13 + 3 files changed, 53 insertions(+), 7 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index fc5a28a32..2c2c9c75c 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -1402,6 +1402,11 @@ gc.autoDetach:: Make `git gc --auto` return immediately and run in background if the system supports it. Default is true. +gc.logExpiry:: + If the file gc.log exists, then `git gc --auto` won't run + unless that file is more than 'gc.logExpiry' old. Default is + "1.day". See `gc.pruneExpire` for more possible values. + gc.packRefs:: Running `git pack-refs` in a repository renders it unclonable by Git versions prior to 1.5.1.2 over dumb diff --git a/builtin/gc.c b/builtin/gc.c index 331f21926..46edcff30 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -33,6 +33,7 @@ static int aggressive_window = 250; static int gc_auto_threshold = 6700; static int gc_auto_pack_limit = 50; static int detach_auto = 1; +static unsigned long gc_log_expire_time; static const char *prune_expire = "2.weeks.ago"; static const char *prune_worktrees_expire = "3.months.ago"; @@ -76,10 +77,12 @@ static void git_config_date_string(const char *key, const char **output) static void process_log_file(void) { struct stat st; - if (!fstat(get_lock_file_fd(&log_lock), &st) && st.st_size) + if (!fstat(get_lock_file_fd(&log_lock), &st) && st.st_size) { commit_lock_file(&log_lock); - else + } else { + unlink(git_path("gc.log")); rollback_lock_file(&log_lock); + } } static void process_log_file_at_exit(void) @@ -111,6 +114,11 @@ static void gc_config(void) git_config_get_int("gc.auto", &gc_auto_threshold); git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit); git_config_get_bool("gc.autodetach", &detach_auto); + + if (!git_config_get_value("gc.logexpiry", &value)) { + parse_expiry_date(value, &gc_log_expire_time); + } + git_config_date_string("gc.pruneexpire", &prune_expire); git_config_date_string("gc.worktreepruneexpire", &prune_worktrees_expire); git_config(git_default_config, NULL); @@ -290,19 +298,34 @@ static const char *lock_repo_for_gc(int force, pid_t* ret_pid) static int report_last_gc_error(void) { struct strbuf sb = STRBUF_INIT; - int ret; + int ret = 0; + struct stat st; + char *gc_log_path = git_pathdup("gc.log"); - ret = strbuf_read_file(&sb, git_path("gc.log"), 0); + if (stat(gc_log_path, &st)) { + if (errno == ENOENT) + goto done; + + ret = error(_("Can't read %s"), gc_log_path); + goto done; + } + + if (st.st_mtime < gc_log_expire_time) + goto done; + + ret = strbuf_read_file(&sb, gc_log_path, 0); if (ret > 0) - return error(_("The last gc run reported the following. " + ret = error(_("The last gc run reported the following. " "Please correct the root cause\n" "and remove %s.\n" "Automatic cleanup will not be performed " "until the file is removed.\n\n" "%s"), -git_path("gc.log"), sb.buf); + gc_log_path, sb.buf); strbuf_release(&sb); - return 0; +done: + free(gc_log_path); + return ret; } static int gc_before_repack(void) @@ -349,6 +372,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix) argv_array_pushl(&prune_worktrees, "worktree", "prune", "--expire", NULL); argv_array_pushl(&rerere, "rerere", "gc", NULL); + /* default expiry time, overwritten in gc_config */ + parse_exp
git subtree add fails in new repository
I'm trying to make a series of repository that only contain subtrees for various other projects. However git subtree does not like being on a newly created branch: $ git init $ git subtree add --prefix=git https://github.com/git/git.git master fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git [...] -- [...]' Working tree has modifications. Cannot add.
Re: [PATCH] mingw: use OpenSSL's SHA-1 routines
On Thu, Feb 09, 2017 at 11:27:49PM +0100, Johannes Schindelin wrote: > From: Jeff Hostetler > > Use OpenSSL's SHA-1 routines rather than builtin block-sha1 routines. > This improves performance on SHA1 operations on Intel processors. > > OpenSSL 1.0.2 has made considerable performance improvements and > support the Intel hardware acceleration features. See: > https://software.intel.com/en-us/articles/improving-openssl-performance > https://software.intel.com/en-us/articles/intel-sha-extensions > > To test this I added/staged a single file in a gigantic > repository having a 450MB index file. The code in read-cache.c > verifies the header SHA as it reads the index and computes a new > header SHA as it writes out the new index. Therefore, in this test > the SHA code must process 900MB of data. Testing was done on an > Intel I7-4770 CPU @ 3.40GHz (Intel64, Family 6, Model 60) CPU. I think this is only half the story. A heavy-sha1 workload is faster, which is good. But one of the original reasons to prefer blk-sha1 (at least on Linux) is that resolving libcrypto.so symbols takes a non-trivial amount of time. I just timed it again, and it seems to be consistently 1ms slower to run "git rev-parse --git-dir" on my machine (from the top-level of a repo). 1ms is mostly irrelevant, but it adds up on scripted workloads that start a lot of git processes. Whether it's a net win or not depends on how much sha1 computation you do in your workload versus how many processes you start. I don't know what that means for Windows, though. My impression is that process startup is so painfully slow there that the link time may just be lost in the noise. It may just always be a win there. So not really an objection to your patch, but something you may want to consider. (Of course, it would in theory be possible to have the best of both worlds either by static-linking openssl, or by teaching block-sha1 the same optimizations, but both of those are obviously much more complex). -Peff
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
On Thu, Feb 09, 2017 at 01:50:31PM -0800, Junio C Hamano wrote: > > There is just no way you can "fix" this otherwise. As an occasional shell > > scripter, you may be tempted to use `$(git rev-parse --show-cdup)$(git > > rev-parse --git-path filename)`, but of course that breaks in worktrees > > and if you do not use worktrees you would not even know that your > > workaround introduced another bug. > > In case it is not clear, I understand all of the above. > > I was just worried about the people who do *NOT* use worktrees and > did the obvious "concatenate --cdup with --git-path" and thought > their script were working happily and well. By prepending the path > to the (real) location of the .git in the updated --git-path output > ourselves, they will complain, our update broke their script. That concatenating approach is broken in other ways, too. For example, if you have $GIT_DIR set to an absolute path, then --git-path will use that. I don't think we have ever promised that the output of --git-path (or --git-dir) would ever be absolute or relative (in fact, the --git-dir documentation implies that you may get either). So yes, there could be somebody who is doing this concatenation workaround, but only ever calls their script in a certain way that never triggers the absolute-path variant. They're happy now, and may not be after Dscho's patch. But I really think they are relying on a bogus assumption, and their scripts are already buggy. -Peff
Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap
On Thu, Feb 09, 2017 at 12:34:04PM -0800, Junio C Hamano wrote: > > +static struct submodule_hash_entry *alloc_submodule_hash_entry( > > + const char *submodule, struct ref_store *refs) > > +{ > > + size_t len = strlen(submodule); > > + struct submodule_hash_entry *entry = malloc(sizeof(*entry) + len + 1); > > I think this (and the later memcpy) is what FLEX_ALLOC_MEM() was > invented for. Yes, it was. Though since the length comes from a strlen() call, it can actually use the _STR variant, like: FLEX_ALLOC_STR(entry, submodule, submodule); Besides being shorter, this does integer-overflow checks on the final length. > > @@ -1373,16 +1405,17 @@ void base_ref_store_init(struct ref_store *refs, > > die("BUG: main_ref_store initialized twice"); > > > > refs->submodule = ""; > > - refs->next = NULL; > > main_ref_store = refs; > > } else { > > - if (lookup_ref_store(submodule)) > > + refs->submodule = xstrdup(submodule); > > + > > + if (!submodule_ref_stores.tablesize) > > + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, > > 20); > > Makes me wonder what "20" stands for. Perhaps the caller should be > allowed to say "I do not quite care what initial size is" by passing > 0 or some equally but more clealy meaningless value (which of course > would be outside the scope of this series). I think this is what "0" already does (grep for HASHMAP_INITIAL_SIZE). In fact, that constant is 64. The 20 we pass in goes through some magic load-factor computation and ends up as 25. That being smaller than the INITIAL_SIZE constant, I believe that we end up allocating 64 entries either way (that's just from reading the code, though; I didn't run it to double check). -Peff
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
On Thu, Feb 9, 2017 at 5:54 PM, Junio C Hamano wrote: > > That leaves what the right single-step behaviour change should be. > As I recall Duy said something about --common-dir and other things > Mike's earlier change also covered, I'd prefer to leave it to three > of you to figure out what the final patch should be. > The tests which I covered in my previous patch [1] addressed the places where we identified similar problems. We should try to include some form of those tests. As far as implementation goes in rev-parse, the version in this thread is probably better that what I had, but it would need to also be applied to --git-common-dir and --shared-index-path. [1] http://public-inbox.org/git/1464261556-89722-2-git-send-email-rappa...@gmail.com/
Re: [PATCH] help: show help for aliases
Tom Kunze writes: > If an alias is a single git command show the man page of the > aliased git command with --help. > > Signed-off-by: Tom Kunze > ... > diff --git a/builtin/help.c b/builtin/help.c > index 49f7a07..655ed49 100644 > --- a/builtin/help.c > +++ b/builtin/help.c > @@ -437,6 +437,10 @@ static const char *check_git_cmd(const char* cmd) > > alias = alias_lookup(cmd); > if (alias) { > + if (alias[0] != '!') { > + strtok(alias, " \t\n"); > + return alias; > + } While I understand where you come from, I am moderately negative, especially with that strtok() to ignore options. For a truly simple alias, e.g. $ git co --help `git co' is aliased to `checkout' I do not think I would mind the updated behaviour given by this patch that much. But most of the time, when I do "help" on an alias, I am primarily interested in what default customization I am using over the base command, i.e. $ git lgf --help `git lgf' is aliased to `log --oneline --boundary --first-parent' is my way to remind me that I am using these three options to "git log" in the alias I very often use (and forgot what they were). Jumping directly to the "git log" manual page is the last thing I want "help" to do.
[PATCH] help: show help for aliases
If an alias is a single git command show the man page of the aliased git command with --help. Signed-off-by: Tom Kunze --- Hi, I noticed that when I pass --help to an alias which is only a git command it tells me a information about the alias. But it would be nice if it instead opens the corresponding man page of the command. There is a memory leak but in my opinion it can be ignored because the process will be replaced anyway. Regards, Tom Kunze PS: Please add me to cc as I am not subscribed. builtin/help.c | 4 1 file changed, 4 insertions(+) diff --git a/builtin/help.c b/builtin/help.c index 49f7a07..655ed49 100644 --- a/builtin/help.c +++ b/builtin/help.c @@ -437,6 +437,10 @@ static const char *check_git_cmd(const char* cmd) alias = alias_lookup(cmd); if (alias) { + if (alias[0] != '!') { + strtok(alias, " \t\n"); + return alias; + } printf_ln(_("`git %s' is aliased to `%s'"), cmd, alias); free(alias); exit(0); -- 2.1.4
Re: [PATCH 0/5] Store submodules in a hash, not a linked list
On Thu, Feb 09, 2017 at 10:23:35PM +0100, Michael Haggerty wrote: > >> So push the submodule attribute down to the `files_ref_store` class > >> (but continue to let the `ref_store`s be looked up by submodule). > > > > I'm not sure I understand all of the ramifications here. It _sounds_ like > > pushing this down into the files-backend code would make it harder to > > have mixed ref-backends for different submodules. Or is this just > > pushing down an implementation detail of the files backend, and future > > code is free to have as many different ref_stores as it likes? > > I don't understand how this would make it harder, aside from the fact > that a new backend class might also need a path member and have to > maintain its own copy rather than using one that the base class provides. Probably the answer is "I'm really confused". But here's how my line of reasoning went: Right now we have a main ref-store that points to the submodule ref-stores. I don't know the current state of it, but in theory those could all use different backends. This seems like it's pushing that submodule linkage down into the backend. But I think from your response that the answer is no, the thing that is being pushed down is not the right way for the main ref store and the submodules to be linked. In fact, there is no reason at all for the main ref store to know or care about submodules. Anybody who wants to know about a submodule's refs should ask the hashmap. -Peff
Re: [PATCH] gc: ignore old gc.log files
On Thu, Feb 09, 2017 at 11:57:12PM -, Philip Oakley wrote: > > > +gc.maxLogAge:: > > > + If the file gc.log exists, then `git gc --auto` won't run > > > + unless that file is more than maxLogAge seconds old. Default > > > + is 86400, one day. > > Is there a reason why one day is chosen? If maintenance staff are available > 24/7 then a shorter time would be appropriate, but if it's a 5 day work week > then they may want longer. Is there a particular case it targets? I'm pretty sure the one-day time limit isn't scientific. It's just a number we've been throwing around. I'm not sure what maintenance staff matters, though. It basically needs long enough that we're not doing _too_ many fruitless gc's, because it wastes resources. But you'd prefer to not go too long without a gc for a repository that needs it. The root cause of the error could be any number of issues. But for the case that David cares about most, you basically want to keep trying until the too-many-objects condition goes away. That's usually on a 2-week timer. So trying once per day to see if the 2-week timer feels about right. That's certainly not science, but hopefully it at least frames the general ballpark. One possible option would be to auto-scale it with the pruneExpire time. I don't know if people actually tweak that value or not. -Peff
Re: [PATCH] gc: ignore old gc.log files
From: "Jeff King" On Wed, Feb 08, 2017 at 09:02:22PM -0500, David Turner wrote: The intent of automatic gc is to have a git repository be relatively low-maintenance from a server-operator perspective. Of course, large operators like GitHub will need a more complicated management strategy, but for ordinary usage, git should just work. In this commit, git learns to ignore gc.log files which are older than (by default) one day old. It also learns about a config, gc.maxLogAge to manage this. So git should never get itself into a state where it refuses to do any maintenance, just because at some point some piece of the maintenance didn't make progress. That might still happen (e.g. because the repo is corrupt), but at the very least it won't be because Git is too dumb to try again. Sounds like a good goal and approach. +gc.maxLogAge:: + If the file gc.log exists, then `git gc --auto` won't run + unless that file is more than maxLogAge seconds old. Default + is 86400, one day. Is there a reason why one day is chosen? If maintenance staff are available 24/7 then a shorter time would be appropriate, but if it's a 5 day work week then they may want longer. Is there a particular case it targets? For other time-based config, we use approxidate with a relative time, like "1 day ago". I think it would make sense for this to match, as it makes the config a little more readable. You can follow the prune_expire example which is right below your new config variable in all of the hunks of your patch. Though I think ultimately that isn't parsed inside gc, so you'd eventually look at how "prune --expire" is handled (which I think is via parse_expiry_date()). [...] Philip
Re: [PATCH v2] git-p4: fix git-p4.pathEncoding for removed files
Lars Schneider writes: > unfortunately, I missed to send this v2. I agree with Luke's review and > I moved the re-encode of the path name to the `streamOneP4File` and > `streamOneP4Deletion` explicitly. > > Discussion: > http://public-inbox.org/git/CAE5ih7-=bd_zol5pfyfd2qvy-xe24v_cgge0xoavuotk02e...@mail.gmail.com/ > > Thanks, > Lars Thanks. Will replace but will not immediately merge to 'next' yet, just in case Luke wants to tell me add his "Reviewed-by:".
Re: [PATCH] mingw: use OpenSSL's SHA-1 routines
Johannes Schindelin writes: > From: Jeff Hostetler > > Use OpenSSL's SHA-1 routines rather than builtin block-sha1 routines. > This improves performance on SHA1 operations on Intel processors. > ... > > Signed-off-by: Jeff Hostetler > Signed-off-by: Johannes Schindelin > --- Nice. Will queue as jh/mingw-openssl-sha1 topic; it is a bit too late for today's integration cycle to be merged to 'next', but let's have this by the end of the week in 'master'. Thanks.
Re: Bug with fixup and autosquash
From: "Johannes Schindelin" Sent: Thursday, February 09, 2017 8:55 PM Hi Ashutosh and Junio, On Wed, 8 Feb 2017, Junio C Hamano wrote: Ashutosh Bapat writes: > I have been using git rebase heavily these days and seem to have found > a bug. > > If there are two commit messages which have same prefix e.g. > yy This is prefix > xx This is prefix and message > > xx comitted before yy > > Now I commit a fixup to yy using git commit --fixup yy > zz fixup! This is prefix > > When I run git rebase -i --autosquash, the script it shows me looks > like > pick xx This is prefix and message > fixup zz fixup! This is prefix > pick yy This is prefix > > I think the correct order is > pick xx This is prefix and message > pick yy This is prefix > fixup zz fixup! This is prefix > > Is that right? [...] Unfortunately, "rebase -i --autosquash" reorders the entries by identifying the commit by its title, and it goes with prefix match so that fix-up commits created without using --fixup option but manually records the title's prefix substring can also work. This prefix match also happens to introduce a serious performance problem, which is why I "fixed" this issue in the rebase--helper already (which is the case if you are using Git for Windows, whose master branch builds on Linux and MacOSX as well). I quoted "fix" because my motivation was to fix the performance problem, not the "incorrect match" problem. The rebase--helper code (specifically, the patch moving autosquash logic into it: https://github.com/dscho/git/commit/7d0831637f) tries to match exact onelines first, While I think this is an improvement, and will strongly support the `git commit --fixup=` option which will, if the sha1/oid is given, create the exact commit subject line. However it would also be useful if the actual commit subject line could have a similar format option, so that those who use say the git gui (rather than the cli) for the commit message, could easily create the `!fixup ` message which would allow a broader range of ways of spelling the commit (e.g. giving a sha1(min length) that is within the rebase todo list). and falls back to prefix matching only after that. Now that the sequencer-i patch series is in `master`, the next step is to send the patch series introducing the rebase--helper. The patch series including the fix discussed above relies on that one. Meaning that it will take a while to get through the mill. So please do not hold your breath until this feature/fix hits an official Git version. If you need it[*1*] faster, feel free to build Git for Windows' master and run with that for a while. Ciao, Johannes Footnote: By "it" I mean "the feature/fix", not "an official Git version" nor "your breath". -- Philip
Re: [RFC-PATCHv2] submodules: add a background story
Stefan Beller writes: > Just like gitmodules(5), gitattributes(5), gitcredentials(7), > gitnamespaces(7), gittutorial(7), we'd like to provide some background > on submodules, which is not specific to the `submodule` command, but > elaborates on the background and its intended usage. > > Add gitsubmodules(7), that explains the states, structure and usage of > submodules. > > Signed-off-by: Stefan Beller > --- > > This would replace the last patch of sb/submodule-doc, though it's still > RFC. In this revision I took care of the technical details (i.e. proper > formatting, spelling), and only slight rewording of the text. > > The main issue persists; see bottom of the patch: > > SAMPLE WORKFLOWS (RFC/TODO) > --- > > Do we need > > * an opinionated way to check for a specific state of a submodule > * (submodule helper to be plumbing?) > * expose the design mistake of having the (name->path) mapping inside the > working tree, i.e. never remove a name from the submodule config even when > the submodule doesn't exist any more. I am not sure about the last item. Are you talking about a case where submodule comes and goes (think: "git checkout v1.0" that would make submodules added since that version disappar)? .gitmodules that is checked out would not have any entry, but .git/config needs to record the end-user preference for the module, so that the user can do "git checkout -" to come back, no? IOW .git/config that mentions all the submodule the user ever showed interests in is not a design mistake, so you must be talking about something else, but I do not know what it is. > diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt > index 4a4cede144..d38aa2d53a 100644 > --- a/Documentation/git-submodule.txt > +++ b/Documentation/git-submodule.txt > @@ -24,37 +24,7 @@ DESCRIPTION > --- > Inspects, updates and manages submodules. > > -A submodule allows you to keep another Git repository in a subdirectory > ... > -if you choose to go that route. > +For more information about submodules, see linkgit:gitsubmodules[5] OK. > @@ -420,6 +390,10 @@ This file should be formatted in the same way as > `$GIT_DIR/config`. The key > to each submodule url is "submodule.$name.url". See linkgit:gitmodules[5] > for details. > > +SEE ALSO > + > +linkgit:gitsubmodules[1], linkgit:gitmodules[1]. Are they both in section (1)? I think the former (concepts) belongs to section 7 and the latter (file formats) belongs to section 5. > diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt > new file mode 100644 > index 00..3369d55ae9 > --- /dev/null > +++ b/Documentation/gitsubmodules.txt > @@ -0,0 +1,194 @@ > +gitsubmodules(7) > + > + > +NAME > + > +gitsubmodules - information about submodules > + > +SYNOPSIS > + > +$GIT_DIR/config, .gitmodules > + > +-- > +git submodule > +-- > + > +DESCRIPTION > +--- > + > +A submodule allows you to keep another Git repository in a subdirectory > +... > +When cloning or pulling a repository containing submodules however, > +the submodules will not be checked out by default; You need to instruct > +'clone' to recurse into submodules. The 'init' and 'update' subcommands I think this is not "You need to", but rather "You can, if you want to have each and every submodules." > +of 'git submodule' will maintain submodules checked out and at an > +appropriate revision in your working tree. > + > +WHEN TO USE > +--- > + > +Submodules, repositories inside other repositories, > +can be used for different use cases: > + > +* To have finer grained access control. > + The design principles of Git do not allow for partial repositories to be > + checked out or transferred. A repository is the smallest unit that a user > + can be given access to. Submodules are separate repositories, such that > + you can restrict access to parts of your project via the use of submodules. > + > +* To decouple Git histories. > + Decoupling histories has different benefits. > + > +** When you want to use a (third party) library tied to a specific version. > + Using submodules for a library allows you to have a clean history for > + your own project and only updating the library in the submodule when > needed. I somehow do not see this as decoupling; it is keeping what is originally separate separate, isn't it? > +** In its current form Git scales up poorly for very large repositories that > + change a lot, as the history grows very large. For that you may want to > look > + at shallow clone, sparse checkout or git-lfs. > + However you can also use submodules to e.g. hold large binary assets > + and these repositories are then shallowly cloned such that you do not > + have a large history locally. In other words, a better way to list these may be 1. using another project that stands on its own
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
Johannes Schindelin writes: >> I have no strong opinion for or against a "longer term" solution >> that makes "rev-parse --git-path" behave differently from how it >> behaves today, but I am not yet convinced that we can reach that >> longer term goal without a transition period, as I suspect there are >> existing users that know and came to expect how it behaves, based on >> its today's behaviour. Other than that I do not have suggestion on >> this topic at the moment. I think I was simply being silly (not merely "overcautious", but just "silly") here. There is no reason for people to use "--git-path" if they are not preparing to work with secondary worktrees, because the whole point of the feature is so that cases where "$(rev-parse --git-dir)/path" does a wrong thing (e.g. end up referring to the main worktree thing when you need to refer to your own, or vice versa). > Given that > ... > it should be safe to assume that a transitional period is more likely to > do more harm to our users than bring benefit. In short, "--git-path as currently exposed to the end-users is utterly broken and cannot have been used for anything sensible". If that is the case, let's just change that with an entry in the release notes that states so (iow, there is no need for even a backward compatibility notice, we just have an entry that says "this was totally broken in such and such way, and now it is fixed to behave this way"). That leaves what the right single-step behaviour change should be. As I recall Duy said something about --common-dir and other things Mike's earlier change also covered, I'd prefer to leave it to three of you to figure out what the final patch should be. Thanks.
Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
Hi Junio, On Thu, 9 Feb 2017, Junio C Hamano wrote: > Johannes Schindelin writes: > > > ... and even go so far as calling your patch reverting my refactoring > > a hot-fix, why don't you just go ahead and merge the result over my > > objections? > > At this point, you are simply being silly. How is it that this patch cannot be applied when, and if, that hypothetical config setting is introduced? Maybe I am dense here, but I would really like to know why this preparatory patch must be applied *now*, when there is nothing to prepare for. > Isn't "Putty is not a command but is also handled as if it is a valid > implementation of SSH" a bug? If you think it is a bug to handle an ssh command called "putty" as if it were plink, sure. I do not think there is a valid use case, but hey. Ciao, Johannes
Re: [PATCH v2 0/2] Let the sequencer handle the grunt work of rebase -i
Johannes Schindelin writes: > After all of these patch series y'all had to review, this is finally the > one that switches things over. > > Please note that it does not (yet) handle the `git rebase -i --root` > invocation; I tried to focus on the common case, and I rarely use --root > myself. As long as the longer-term goal is clear enough and the short-term approach does not conflict with the goal, solving the most common problem that yields the largest payback first is absolutely the right thing to do, and omitting "--root" and/or "-p" and getting the main use of "-i" right is a great way to start. > .gitignore| 1 + > Makefile | 1 + > builtin.h | 1 + > builtin/rebase--helper.c | 40 > git-rebase--interactive.sh| 13 + > git.c | 1 + > t/t3404-rebase-interactive.sh | 2 +- > 7 files changed, 58 insertions(+), 1 deletion(-) > create mode 100644 builtin/rebase--helper.c And it is surprisingly short and sweet ;-) Will queue as js/rebase-helper topic, forked at 6e3a7b3398 ("Git 2.12-rc0", 2017-02-03). Thanks. PS. in case if anybody is wondering after reading [*1*], at this point, I _have_ read the patches not just the cover letter, looked at the branch name the original author gave to the topic, chose the local topic name I use, and chose where to fork the topic from, but have not applied the patches (so I may later end up saying "the patch does not apply cleanly", "the compiler complains on this line", or "the new code is inconsistent with this existing code that is a bit beyond the context of the patch that I did not notice when I reviewed the patches alone" in a separate message). I do not have a new entry for this topic in the draft of "What's cooking" report yet, or have not decided if the topic would hit 'jch' or 'pu' yet either. [Reference] *1* http://public-inbox.org/git/xmqq7f4zqiyj@gitster.mtv.corp.google.com
[PATCH] mingw: use OpenSSL's SHA-1 routines
From: Jeff Hostetler Use OpenSSL's SHA-1 routines rather than builtin block-sha1 routines. This improves performance on SHA1 operations on Intel processors. OpenSSL 1.0.2 has made considerable performance improvements and support the Intel hardware acceleration features. See: https://software.intel.com/en-us/articles/improving-openssl-performance https://software.intel.com/en-us/articles/intel-sha-extensions To test this I added/staged a single file in a gigantic repository having a 450MB index file. The code in read-cache.c verifies the header SHA as it reads the index and computes a new header SHA as it writes out the new index. Therefore, in this test the SHA code must process 900MB of data. Testing was done on an Intel I7-4770 CPU @ 3.40GHz (Intel64, Family 6, Model 60) CPU. The block-sha1 version averaged 5.27 seconds. The OpenSSLversion averaged 4.50 seconds. $ echo xxx >> project.mk $ time /e/blk_sha/bin/git.exe add project.mk real0m5.207s user0m0.000s sys 0m0.250s $ echo xxx >> project.mk $ time /e/blk_sha/bin/git.exe add project.mk real0m5.362s user0m0.015s sys 0m0.234s $ echo xxx >> project.mk $ time /e/blk_sha/bin/git.exe add project.mk real0m5.300s user0m0.016s sys 0m0.250s $ echo xxx >> project.mk $ time /e/blk_sha/bin/git.exe add project.mk real0m5.216s user0m0.000s sys 0m0.250s $ echo xxx >> project.mk $ time /e/openssl/bin/git.exe add project.mk real0m4.431s user0m0.000s sys 0m0.250s $ echo xxx >> project.mk $ time /e/openssl/bin/git.exe add project.mk real0m4.478s user0m0.000s sys 0m0.265s $ echo xxx >> project.mk $ time /e/openssl/bin/git.exe add project.mk real0m4.690s user0m0.000s sys 0m0.250s $ echo xxx >> project.mk $ time /e/openssl/bin/git.exe add project.mk real0m4.420s user0m0.000s sys 0m0.234s Signed-off-by: Jeff Hostetler Signed-off-by: Johannes Schindelin --- Published-As: https://github.com/dscho/git/releases/tag/mingw-openssl-sha1-v1 Fetch-It-Via: git fetch https://github.com/dscho/git mingw-openssl-sha1-v1 config.mak.uname | 1 - 1 file changed, 1 deletion(-) diff --git a/config.mak.uname b/config.mak.uname index 447f36ac2e..a07936da8b 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -515,7 +515,6 @@ ifneq (,$(findstring MINGW,$(uname_S))) OBJECT_CREATION_USES_RENAMES = UnfortunatelyNeedsTo NO_REGEX = YesPlease NO_PYTHON = YesPlease - BLK_SHA1 = YesPlease ETAGS_TARGET = ETAGS NO_INET_PTON = YesPlease NO_INET_NTOP = YesPlease base-commit: 6e3a7b3398559305c7a239a42e447c21a8f39ff8 -- 2.11.1.windows.1
[PATCH v2 2/2] rebase -i: use the rebase--helper builtin
Now that the sequencer learned to process a "normal" interactive rebase, we use it. The original shell script is still used for "non-normal" interactive rebases, i.e. when --root or --preserve-merges was passed. Please note that the --root option (via the $squash_onto variable) needs special handling only for the very first command, hence it is still okay to use the helper upon continue/skip. Also please note that the --no-ff setting is volatile, i.e. when the interactive rebase is interrupted at any stage, there is no record of it. Therefore, we have to pass it from the shell script to the rebase--helper. Note: the test t3404 had to be adjusted because the the error messages produced by the sequencer comply with our current convention to start with a lower-case letter. Signed-off-by: Johannes Schindelin --- git-rebase--interactive.sh| 13 + t/t3404-rebase-interactive.sh | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh index 4734094a3f..2c9c0165b5 100644 --- a/git-rebase--interactive.sh +++ b/git-rebase--interactive.sh @@ -1069,6 +1069,10 @@ git_rebase__interactive () { case "$action" in continue) + if test ! -d "$rewritten" + then + exec git rebase--helper ${force_rebase:+--no-ff} --continue + fi # do we have anything to commit? if git diff-index --cached --quiet HEAD -- then @@ -1128,6 +1132,10 @@ first and then run 'git rebase --continue' again.")" skip) git rerere clear + if test ! -d "$rewritten" + then + exec git rebase--helper ${force_rebase:+--no-ff} --continue + fi do_rest return 0 ;; @@ -1314,6 +1322,11 @@ expand_todo_ids test -d "$rewritten" || test -n "$force_rebase" || skip_unnecessary_picks checkout_onto +if test -z "$rebase_root" && test ! -d "$rewritten" +then + require_clean_work_tree "rebase" + exec git rebase--helper ${force_rebase:+--no-ff} --continue +fi do_rest } diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh index e2f18d11f6..33d392ba11 100755 --- a/t/t3404-rebase-interactive.sh +++ b/t/t3404-rebase-interactive.sh @@ -556,7 +556,7 @@ test_expect_success 'clean error after failed "exec"' ' echo "edited again" > file7 && git add file7 && test_must_fail git rebase --continue 2>error && - test_i18ngrep "You have staged changes in your working tree." error + test_i18ngrep "you have staged changes in your working tree" error ' test_expect_success 'rebase a detached HEAD' ' -- 2.11.1.windows.1
[PATCH v2 0/2] Let the sequencer handle the grunt work of rebase -i
After all of these patch series y'all had to review, this is finally the one that switches things over. Please note that it does not (yet) handle the `git rebase -i --root` invocation; I tried to focus on the common case, and I rarely use --root myself. Please note also that --preserve-merges is *not* handled. The way I designed --preserve-merges is totally stupid and idiotic and I do not want to spend any further time on it. You cannot "pick" merges and hope to be able to reorder commits, for example. I may work on porting Git garden shears' way to recreate the branch structure into rebase -i proper at some stage. And please finally note that this pair of patches does not yet yield the full speed improvement that I promised earlier. After these patches, the time is dominated by pre- and post-processing the todo script, at least on Windows, so there is another patch series that ports those bits and pieces into the rebase--helper, too. Changes since v1: - rebased to current master - this required a change in t3404 because I was bullied^Wasked to change some messages (which should not have been conflated with the work I actually wanted to do, but whatevs) Johannes Schindelin (2): Add a builtin helper for interactive rebases rebase -i: use the rebase--helper builtin .gitignore| 1 + Makefile | 1 + builtin.h | 1 + builtin/rebase--helper.c | 40 git-rebase--interactive.sh| 13 + git.c | 1 + t/t3404-rebase-interactive.sh | 2 +- 7 files changed, 58 insertions(+), 1 deletion(-) create mode 100644 builtin/rebase--helper.c base-commit: 6e3a7b3398559305c7a239a42e447c21a8f39ff8 Based-On: sequencer-i at https://github.com/dscho/git Fetch-Base-Via: git fetch https://github.com/dscho/git sequencer-i Published-As: https://github.com/dscho/git/releases/tag/rebase--helper-v2 Fetch-It-Via: git fetch https://github.com/dscho/git rebase--helper-v2 Interdiff vs v1: diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh index e2f18d11f6..33d392ba11 100755 --- a/t/t3404-rebase-interactive.sh +++ b/t/t3404-rebase-interactive.sh @@ -556,7 +556,7 @@ test_expect_success 'clean error after failed "exec"' ' echo "edited again" > file7 && git add file7 && test_must_fail git rebase --continue 2>error && - test_i18ngrep "You have staged changes in your working tree." error + test_i18ngrep "you have staged changes in your working tree" error ' test_expect_success 'rebase a detached HEAD' ' -- 2.11.1.windows.1
[PATCH v2 1/2] Add a builtin helper for interactive rebases
Git's interactive rebase is still implemented as a shell script, despite its complexity. This implies that it suffers from the portability point of view, from lack of expressibility, and of course also from performance. The latter issue is particularly serious on Windows, where we pay a hefty price for relying so much on POSIX. Unfortunately, being such a huge shell script also means that we missed the train when it would have been relatively easy to port it to C, and instead piled feature upon feature onto that poor script that originally never intended to be more than a slightly pimped cherry-pick in a loop. To open the road toward better performance (in addition to all the other benefits of C over shell scripts), let's just start *somewhere*. The approach taken here is to add a builtin helper that at first intends to take care of the parts of the interactive rebase that are most affected by the performance penalties mentioned above. In particular, after we spent all those efforts on preparing the sequencer to process rebase -i's git-rebase-todo scripts, we implement the `git rebase -i --continue` functionality as a new builtin, git-rebase--helper. Once that is in place, we can work gradually on tackling the rest of the technical debt. Note that the rebase--helper needs to learn about the transient --ff/--no-ff options of git-rebase, as the corresponding flag is not persisted to, and re-read from, the state directory. Signed-off-by: Johannes Schindelin --- .gitignore | 1 + Makefile | 1 + builtin.h| 1 + builtin/rebase--helper.c | 40 git.c| 1 + 5 files changed, 44 insertions(+) create mode 100644 builtin/rebase--helper.c diff --git a/.gitignore b/.gitignore index b1020b875f..833ef3b0b7 100644 --- a/.gitignore +++ b/.gitignore @@ -114,6 +114,7 @@ /git-read-tree /git-rebase /git-rebase--am +/git-rebase--helper /git-rebase--interactive /git-rebase--merge /git-receive-pack diff --git a/Makefile b/Makefile index 8e4081e061..29f8663509 100644 --- a/Makefile +++ b/Makefile @@ -932,6 +932,7 @@ BUILTIN_OBJS += builtin/prune.o BUILTIN_OBJS += builtin/pull.o BUILTIN_OBJS += builtin/push.o BUILTIN_OBJS += builtin/read-tree.o +BUILTIN_OBJS += builtin/rebase--helper.o BUILTIN_OBJS += builtin/receive-pack.o BUILTIN_OBJS += builtin/reflog.o BUILTIN_OBJS += builtin/remote.o diff --git a/builtin.h b/builtin.h index 67f80519da..9e4a89816d 100644 --- a/builtin.h +++ b/builtin.h @@ -103,6 +103,7 @@ extern int cmd_prune_packed(int argc, const char **argv, const char *prefix); extern int cmd_pull(int argc, const char **argv, const char *prefix); extern int cmd_push(int argc, const char **argv, const char *prefix); extern int cmd_read_tree(int argc, const char **argv, const char *prefix); +extern int cmd_rebase__helper(int argc, const char **argv, const char *prefix); extern int cmd_receive_pack(int argc, const char **argv, const char *prefix); extern int cmd_reflog(int argc, const char **argv, const char *prefix); extern int cmd_remote(int argc, const char **argv, const char *prefix); diff --git a/builtin/rebase--helper.c b/builtin/rebase--helper.c new file mode 100644 index 00..ca1ebb2fa1 --- /dev/null +++ b/builtin/rebase--helper.c @@ -0,0 +1,40 @@ +#include "builtin.h" +#include "cache.h" +#include "parse-options.h" +#include "sequencer.h" + +static const char * const builtin_rebase_helper_usage[] = { + N_("git rebase--helper []"), + NULL +}; + +int cmd_rebase__helper(int argc, const char **argv, const char *prefix) +{ + struct replay_opts opts = REPLAY_OPTS_INIT; + enum { + CONTINUE = 1, ABORT + } command = 0; + struct option options[] = { + OPT_BOOL(0, "ff", &opts.allow_ff, N_("allow fast-forward")), + OPT_CMDMODE(0, "continue", &command, N_("continue rebase"), + CONTINUE), + OPT_CMDMODE(0, "abort", &command, N_("abort rebase"), + ABORT), + OPT_END() + }; + + git_config(git_default_config, NULL); + + opts.action = REPLAY_INTERACTIVE_REBASE; + opts.allow_ff = 1; + opts.allow_empty = 1; + + argc = parse_options(argc, argv, NULL, options, + builtin_rebase_helper_usage, PARSE_OPT_KEEP_ARGV0); + + if (command == CONTINUE && argc == 1) + return !!sequencer_continue(&opts); + if (command == ABORT && argc == 1) + return !!sequencer_remove_state(&opts); + usage_with_options(builtin_rebase_helper_usage, options); +} diff --git a/git.c b/git.c index c887272b12..33f52acbcc 100644 --- a/git.c +++ b/git.c @@ -473,6 +473,7 @@ static struct cmd_struct commands[] = { { "pull", cmd_pull, RUN_SETUP | NEED_WORK_TREE }, { "push", cmd_push, RUN_SETUP }, { "read-tree", cmd_read_tree, RUN_SETUP | SUPPORT_SUP
Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap
Michael Haggerty writes: > Aside from scaling better, this means that the submodule name needn't be > stored in the ref_store instance anymore (which will be changed in a > moment). Nice. I like the latter reason very much (this is not a suggestion to change the description). > +struct submodule_hash_entry > +{ > + struct hashmap_entry ent; /* must be the first member! */ > + > + struct ref_store *refs; > + > + /* NUL-terminated name of submodule: */ > + char submodule[FLEX_ARRAY]; > +}; > + > +static int submodule_hash_cmp(const void *entry, const void *entry_or_key, > + const void *keydata) > +{ > + const struct submodule_hash_entry *e1 = entry, *e2 = entry_or_key; > + const char *submodule = keydata; > + > + return strcmp(e1->submodule, submodule ? submodule : e2->submodule); I would have found it more readable if it were like so: const char *submodule = keydata ? keydata : e2->submodule; return strcmp(e1->submodule, submodule); but I suspect the difference is not that huge. > +} > + > +static struct submodule_hash_entry *alloc_submodule_hash_entry( > + const char *submodule, struct ref_store *refs) > +{ > + size_t len = strlen(submodule); > + struct submodule_hash_entry *entry = malloc(sizeof(*entry) + len + 1); I think this (and the later memcpy) is what FLEX_ALLOC_MEM() was invented for. > + hashmap_entry_init(entry, strhash(submodule)); > + entry->refs = refs; > + memcpy(entry->submodule, submodule, len + 1); > + return entry; > +} > ... > @@ -1373,16 +1405,17 @@ void base_ref_store_init(struct ref_store *refs, > die("BUG: main_ref_store initialized twice"); > > refs->submodule = ""; > - refs->next = NULL; > main_ref_store = refs; > } else { > - if (lookup_ref_store(submodule)) > + refs->submodule = xstrdup(submodule); > + > + if (!submodule_ref_stores.tablesize) > + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, > 20); Makes me wonder what "20" stands for. Perhaps the caller should be allowed to say "I do not quite care what initial size is" by passing 0 or some equally but more clealy meaningless value (which of course would be outside the scope of this series).
Re: Bug with fixup and autosquash
Johannes Schindelin writes: > Almost. While I fixed the performance issues as well as the design > allowed, I happened to "fix" the problem where an incomplete prefix match > could be favored over an exact match. Hmph. Would it require too much further work to do what you said the code does: > The rebase--helper code (specifically, the patch moving autosquash logic > into it: https://github.com/dscho/git/commit/7d0831637f) tries to match > exact onelines first, and falls back to prefix matching only after that. If the code matches exact onlines and then falls back to prefix, I do not think incomplete prefix would be mistakenly chosen over an exact one, so perhaps your code already does the right thing?
Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`
Johannes Schindelin writes: >> > E.g. >> > >> > $ git show --notes=amlog --stat >> >> That's super useful! Thanks for the pointer! >> Wouldn't it make sense to push these notes to github.com/git/git ? > > I am not quite sure about that. It is in a different namespace than what > is usually cloned, and it currently adds 8MB to the download (there are > "amlog" and "commits", the latter clearly being a sandbox). I do not think the public mirrors of the primary repository should get amlog, either. It is more suited for those who are interested in broken-out topics, i.e. git://github.com/git/gitster.
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
Hi Junio, On Thu, 9 Feb 2017, Junio C Hamano wrote: > Duy Nguyen writes: > > > Relevant thread in the past [1] which fixes both --git-path and > > --git-common-dir. I think the author dropped it somehow (or forgot > > about it, I know I did). Sorry can't comment on that thread, or this > > patch, yet. > > > > [1] > > http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/ > > Thanks for a pointer. I see Mike responded to this message (I > haven't had a chance to read and think about it yet), so I trust > that you three can figure out if these are the same issues and what > the final solution in the longer term should be. > > I have no strong opinion for or against a "longer term" solution > that makes "rev-parse --git-path" behave differently from how it > behaves today, but I am not yet convinced that we can reach that > longer term goal without a transition period, as I suspect there are > existing users that know and came to expect how it behaves, based on > its today's behaviour. Other than that I do not have suggestion on > this topic at the moment. Given that - the output is incorrect, not some output that could maybe be improved, - warnings in a script execution are most likely to be missed, - --git-path gives incorrect output in subdirectories, except inside worktrees, therefore scripts relying on the current behavior are highly likely to misbehave in worktrees anyway, - leaving this bug unfixed even when we know about it for 3 major releases reflects really badly on Git as a project, and - the longer we wait to fix this bug, the more developers will simply stay away from --git-path (of course, only *after* they were bitten by the bug, like I was), it should be safe to assume that a transitional period is more likely to do more harm to our users than bring benefit. Ciao, Johannes
Re: Bug with fixup and autosquash
Hi Junio, On Thu, 9 Feb 2017, Junio C Hamano wrote: > Johannes Schindelin writes: > > > This prefix match also happens to introduce a serious performance problem, > > which is why I "fixed" this issue in the rebase--helper already (which is > > the case if you are using Git for Windows, whose master branch builds on > > Linux and MacOSX as well). I quoted "fix" because my motivation was to fix > > the performance problem, not the "incorrect match" problem. > > In other words, regardless of your motivation, you "fix"ed both, > which is very nice ;-) Almost. While I fixed the performance issues as well as the design allowed, I happened to "fix" the problem where an incomplete prefix match could be favored over an exact match. I really fixed the performance issues. Not "fixed" them. Ciao, Johannes
Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`
Hi Lars, On Thu, 9 Feb 2017, Lars Schneider wrote: > > On 06 Feb 2017, at 20:10, Junio C Hamano wrote: > > > > Johannes Schindelin writes: > > > >> So I thought maybe the From: line (from the body, if available, > >> otherwise from the header) in conjunction with the "Date:" header > >> would work. > > > > FYI, I use a post-applypatch hook to populate refs/notes/amlog notes > > tree when I queue a new patch; I am not sure how well the notes in it > > are preserved across rebases, but it could be a good starting point. > > The notes tree is mirrored at git://github.com/git/gitster repository. > > > > E.g. > > > > $ git show --notes=amlog --stat > > That's super useful! Thanks for the pointer! > Wouldn't it make sense to push these notes to github.com/git/git ? I am not quite sure about that. It is in a different namespace than what is usually cloned, and it currently adds 8MB to the download (there are "amlog" and "commits", the latter clearly being a sandbox). While I am thankful that there is at least some information available for patches integrated into `pu` since Nov 1 2016, the format is probably not stable (we are talking about free-form notes, after all), and it still does not help with catching the case where new patch series iterations (or in some case, new patch series, period) are missed. Make no mistake, it will be a huge undertaking to develop a tool that helps with the management of patch series on top of the mailing list driven patch review process. And even in the best case, it may be simply too hard for an automated tool to figure things out e.g. when Peff or Junio paste a tangentially related diff into a thread. In the end, what I *really* would love to have is a system where you can easily query "which reviewer comments on *any* of my patch series are new, or still unaddressed?", and "in what way was my patch modified relative to the latest version I submitted?". It may actually be impossible to create such a tool, as it cannot invent information/cross-references that it does not have nor can deduce from available data. Ciao, Johannes
Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
Johannes Schindelin writes: >> > (And that would have to be handled at a different point, as I had >> > pointed out, so that suggested preparation would most likely not help >> > at all.) >> >> I did not think "that would have to be handled at a different point" >> is correct at all, if by "a point" you meant "a location in the >> code" [*1*]. > > Yes, I mean the location in the code. > > But since you keep insisting that you are right and I am wrong,... There is no "insisting". Didn't we just see how wrong you were about "different point"? An extended syntax of override would be handled in the new helper to handle override, the same point in the code as other overrides are handled. > ... and even > go so far as calling your patch reverting my refactoring a hot-fix, why > don't you just go ahead and merge the result over my objections? At this point, you are simply being silly. Isn't "Putty is not a command but is also handled as if it is a valid implementation of SSH" a bug? Isn't making the code not to be confused like so a fix? A different approach to fix the issue would be to declare that the command names and overrides are not in separate namespaces. If you prefer to go that route, the documentation can use an update to make it not mention "putty" as a valid override (the users have to spell "plink"), mention "plink.exe" is also accepted, etc. and make it clear that the override environment and configuration variables are the way to tell Git: "The ssh implementation I have behaves the same way as this well-known implementation, so treat it as such without actually looking at the path to the command in the ssh.command string". That may limit the freedom for future enhancement of overrides, but is an acceptable short-cut. After all, the overrides are merely an escape hatch and we expect them to be used only rarely.
Re: GSoC 2017: application open, deadline = February 9, 2017
On Thu, Feb 9, 2017 at 1:38 PM, Johannes Schindelin wrote: > Hi Christian, > > On Thu, 9 Feb 2017, Christian Couder wrote: > >> I just had a look and the microproject and idea pages for this year are >> ok. They are not great sure, but not much worse than the previous >> years. What should probably be done is to remove project ideas where is >> no "possible mentor" listed. >> >> But I am reluctant to do that as I don't know what Dscho would be ok to >> mentor. > > I am okay to mentor (except anything that touches submodules). I am okay to mentor anything (preferrably submodules). Thanks, Stefan > > Ciao, > Johannes
Re: Bug with fixup and autosquash
Johannes Schindelin writes: > This prefix match also happens to introduce a serious performance problem, > which is why I "fixed" this issue in the rebase--helper already (which is > the case if you are using Git for Windows, whose master branch builds on > Linux and MacOSX as well). I quoted "fix" because my motivation was to fix > the performance problem, not the "incorrect match" problem. In other words, regardless of your motivation, you "fix"ed both, which is very nice ;-)
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
Johannes Schindelin writes: > On Wed, 8 Feb 2017, Junio C Hamano wrote: > >> How long has "rev-parse --git-path" been around? Had scripts in the >> wild chance to learn to live with the "output is relative to the top of >> the working tree" reality? I think the answers are "since 2.5" and >> "yes". > > Correct. And the third question is: How did the scripts work around this > feature? > > The answer is obvious: by switching back to `$(git rev-parse > --git-dir)/filename`. > > This is literally on what I spent the better part of Wednesday. > > There is just no way you can "fix" this otherwise. As an occasional shell > scripter, you may be tempted to use `$(git rev-parse --show-cdup)$(git > rev-parse --git-path filename)`, but of course that breaks in worktrees > and if you do not use worktrees you would not even know that your > workaround introduced another bug. In case it is not clear, I understand all of the above. I was just worried about the people who do *NOT* use worktrees and did the obvious "concatenate --cdup with --git-path" and thought their script were working happily and well. By prepending the path to the (real) location of the .git in the updated --git-path output ourselves, they will complain, our update broke their script. If we introduced the fix gently, by (1) warn when "--git-path" is used but give the current output anyway, while adding the "fixed" one as another new option, and then (2) remove "--git-path" after several releases, they will not have to complain, even though they will see warning until they stop using "--git-path". There may be gentler alternative ways to transition, and I do not worry about the specifics of them too much. I just think we cannot do this in a single step without harming existing users.
Re: GSoC 2017: application open, deadline = February 9, 2017
Hi Christian, On Thu, 9 Feb 2017, Christian Couder wrote: > I just had a look and the microproject and idea pages for this year are > ok. They are not great sure, but not much worse than the previous > years. What should probably be done is to remove project ideas where is > no "possible mentor" listed. > > But I am reluctant to do that as I don't know what Dscho would be ok to > mentor. I am okay to mentor (except anything that touches submodules). Ciao, Johannes
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
Hi Junio, On Wed, 8 Feb 2017, Junio C Hamano wrote: > How long has "rev-parse --git-path" been around? Had scripts in the > wild chance to learn to live with the "output is relative to the top of > the working tree" reality? I think the answers are "since 2.5" and > "yes". Correct. And the third question is: How did the scripts work around this feature? The answer is obvious: by switching back to `$(git rev-parse --git-dir)/filename`. This is literally on what I spent the better part of Wednesday. There is just no way you can "fix" this otherwise. As an occasional shell scripter, you may be tempted to use `$(git rev-parse --show-cdup)$(git rev-parse --git-path filename)`, but of course that breaks in worktrees and if you do not use worktrees you would not even know that your workaround introduced another bug. The current handling of --git-path is just plain wrong. The fact that I am the first person to report this here merely shows how much it is used in the wild. Ciao, Johannes
Re: [PATCH 4/5] files_ref_store::submodule: use NULL for the main repository
On Thu, Feb 9, 2017 at 1:19 PM, Junio C Hamano wrote: >> if (!submodule || !*submodule) >> - return be->init(NULL); >> + refs = be->init(NULL); >> else >> - return be->init(submodule); >> + refs = be->init(submodule); > > Can't we also lose this "if !*submodule, turn it to NULL"? That was my suggestion as well, but that did not look nicer per se. That is because at this point of the series we still handle "" just fine. >> refs->submodule = submodule ? xstrdup(submodule) : ""; > > Also, can't we use xstrdup_or_null(submodule) with this step? > That is done in 4/5; here we must keep a "" around instead of NULL IIUC.
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
Duy Nguyen writes: > Relevant thread in the past [1] which fixes both --git-path and > --git-common-dir. I think the author dropped it somehow (or forgot > about it, I know I did). Sorry can't comment on that thread, or this > patch, yet. > > [1] > http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/ Thanks for a pointer. I see Mike responded to this message (I haven't had a chance to read and think about it yet), so I trust that you three can figure out if these are the same issues and what the final solution in the longer term should be. I have no strong opinion for or against a "longer term" solution that makes "rev-parse --git-path" behave differently from how it behaves today, but I am not yet convinced that we can reach that longer term goal without a transition period, as I suspect there are existing users that know and came to expect how it behaves, based on its today's behaviour. Other than that I do not have suggestion on this topic at the moment. Thanks.
Re: [PATCH 0/5] Store submodules in a hash, not a linked list
On 02/09/2017 08:58 PM, Jeff King wrote: > On Thu, Feb 09, 2017 at 02:26:57PM +0100, Michael Haggerty wrote: > [...] >> A `files_ref_store` would be perfectly happy to represent, say, the >> references *physically* stored in a linked worktree (e.g., `HEAD`, >> `refs/bisect/*`, etc) even though that is not the complete collection >> of refs that are *logically* visible from that worktree (which >> includes references from the main repository, too). But the old code >> was confusing the two things by storing "submodule" in every >> `ref_store` instance. >> >> So push the submodule attribute down to the `files_ref_store` class >> (but continue to let the `ref_store`s be looked up by submodule). > > I'm not sure I understand all of the ramifications here. It _sounds_ like > pushing this down into the files-backend code would make it harder to > have mixed ref-backends for different submodules. Or is this just > pushing down an implementation detail of the files backend, and future > code is free to have as many different ref_stores as it likes? I don't understand how this would make it harder, aside from the fact that a new backend class might also need a path member and have to maintain its own copy rather than using one that the base class provides. I consider it an implementation detail of the files backend that it needs to keep a permanent record of its submodule path in files_ref_store. Some hypothetical future backend might instead need, say, an IP number and port to connect to a MySQL server. A hypothetical pure packed-refs backend might just store the path of the packed-refs file. A more likely imminent change is that backends need a path, but that the path needn't correspond to the git_dir of the repository that contains the corresponding objects, for example in the case of a linked worktree. You might ask for the ref_store for a worktree-submodule, and end up getting a compound object that delegates to one ref_store pointing at its git_dir and one at its common_dir. Michael
Re: [PATCH 2/5] refs: push the submodule attribute down
Michael Haggerty writes: > Push the submodule attribute down from ref_store to files_ref_store. > This is another step towards loosening the 1:1 connection between > ref_stores and submodules. The update seems to retain the externally visible effects, but what does this change mean for future backend writers whose code will sit next to files_ref_store? They need to have "submodule" field in their implementation of the backend and keep track of it? Granted that the primary thing that looks at ->submodule field in the code before this change all live in refs/files-backend.c, but I am not sure if that is an artifact of us having only one backend at this moment, or a sign that future backends would benefit from extra freedom to choose how they exactly implement their submodule support.
Re: [PATCH 4/5] files_ref_store::submodule: use NULL for the main repository
Michael Haggerty writes: > The old practice of storing the empty string in this member for the main > repository was a holdover from before 00eebe3 (refs: create a base class > "ref_store" for files_ref_store, 2016-09-04), when the submodule was > stored in a flex array at the end of `struct files_ref_store`. Storing > NULL for this case is more idiomatic and a tiny bit less code. Yes. I noticed this bit in 3/5 and wondered about it, knowing this step comes next: > struct ref_store *ref_store_init(const char *submodule) > { > const char *be_name = "files"; > struct ref_storage_be *be = find_ref_storage_backend(be_name); > + struct ref_store *refs; > > if (!be) > die("BUG: reference backend %s is unknown", be_name); > > if (!submodule || !*submodule) > - return be->init(NULL); > + refs = be->init(NULL); > else > - return be->init(submodule); > + refs = be->init(submodule); Can't we also lose this "if !*submodule, turn it to NULL"? > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -980,7 +980,7 @@ static struct ref_store *files_ref_store_create(const > char *submodule) > struct files_ref_store *refs = xcalloc(1, sizeof(*refs)); > struct ref_store *ref_store = (struct ref_store *)refs; > > - base_ref_store_init(ref_store, &refs_be_files, submodule); > + base_ref_store_init(ref_store, &refs_be_files); > > refs->submodule = submodule ? xstrdup(submodule) : ""; Also, can't we use xstrdup_or_null(submodule) with this step?
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
Hi Mike, On Thu, 9 Feb 2017, Mike Rappazzo wrote: > On Thu, Feb 9, 2017 at 4:48 AM, Duy Nguyen wrote: > > > Relevant thread in the past [1] which fixes both --git-path and > > --git-common-dir. I think the author dropped it somehow (or forgot > > about it, I know I did). Sorry can't comment on that thread, or this > > patch, yet. > > I didn't exactly forget it (I have it sitting in a branch), I wasn't > sure what else was needed (from a v5 I guess), so it has stagnated. > > There was also another patch [1] at the time done by SZEDER Gábor trying > to speed up the completion scripts by adding `git rev-parse > --absolute-git-dir` option to deal with this case as well. > > > [1] > > http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/ > > [1] http://public-inbox.org/git/20170203024829.8071-16-szeder@gmail.com/ Ah, so I was not the only person reporting this bug, but I am seemingly having as much luck getting a fix in. I had a quick look at your v4: http://public-inbox.org/git/1464261556-89722-3-git-send-email-rappa...@gmail.com/ It seems you replaced the git_path() by a combination of git_dir() and relative_path(), but that would break the use case where git_path() handles certain arguments specially, e.g. "objects" which knows that the .git/objects/ path can be overridden via the environment. I tried very hard to keep that working in my patch, essentially by emulating what git_path() does already when being called in a worktree's subdirectory: make the path absolute. Ciao, Johannes
Re: [PATCH 3/5] register_ref_store(): new function
Michael Haggerty writes: > Move the responsibility for registering the ref_store for a submodule > from base_ref_store_init() to a new function, register_ref_store(). Call > the latter from ref_store_init(). > > This means that base_ref_store_init() can lose its submodule argument, > further weakening the 1:1 relationship between ref_stores and > submodules. OK. I think I am starting to get it. I've always found it disturbing that for_each_*ref*() has a "submodule" variant. Instead, the plan (outlined in the discussion from yesterday that triggered your posting this series) is to give an API to request a "ref-store", and then ask that object to iterate over refs, and these steps get us closer to that goal, because the "to enumerate these" won't have to know what set of refs a ref-store contains. If you want to iterate over refs in a submodule, you grab a ref-store for the submodule and ask it to iterate. Iterating over refs in another worktree, you grab a different ref-store and ask it to iterate using the same API. Sounds like a good direction to go.
Re: Bug with fixup and autosquash
Hi Ashutosh and Junio, On Wed, 8 Feb 2017, Junio C Hamano wrote: > Ashutosh Bapat writes: > > > I have been using git rebase heavily these days and seem to have found > > a bug. > > > > If there are two commit messages which have same prefix e.g. > > yy This is prefix > > xx This is prefix and message > > > > xx comitted before yy > > > > Now I commit a fixup to yy using git commit --fixup yy > > zz fixup! This is prefix > > > > When I run git rebase -i --autosquash, the script it shows me looks like > > pick xx This is prefix and message > > fixup zz fixup! This is prefix > > pick yy This is prefix > > > > I think the correct order is > > pick xx This is prefix and message > > pick yy This is prefix > > fixup zz fixup! This is prefix > > > > Is that right? > > [...] > > Unfortunately, "rebase -i --autosquash" reorders the entries by > identifying the commit by its title, and it goes with prefix match so > that fix-up commits created without using --fixup option but manually > records the title's prefix substring can also work. This prefix match also happens to introduce a serious performance problem, which is why I "fixed" this issue in the rebase--helper already (which is the case if you are using Git for Windows, whose master branch builds on Linux and MacOSX as well). I quoted "fix" because my motivation was to fix the performance problem, not the "incorrect match" problem. The rebase--helper code (specifically, the patch moving autosquash logic into it: https://github.com/dscho/git/commit/7d0831637f) tries to match exact onelines first, and falls back to prefix matching only after that. Now that the sequencer-i patch series is in `master`, the next step is to send the patch series introducing the rebase--helper. The patch series including the fix discussed above relies on that one. Meaning that it will take a while to get through the mill. So please do not hold your breath until this feature/fix hits an official Git version. If you need it[*1*] faster, feel free to build Git for Windows' master and run with that for a while. Ciao, Johannes Footnote: By "it" I mean "the feature/fix", not "an official Git version" nor "your breath".
gitk bug: file select in the tree mode
Hi All! There is a bug in gitk (e.g. 2.11.0): 1) Choose a repository with files in a subdir (git's repo for example). 2) `cd` to a subdir (e.g. `xdiff`). 3) Run `gitk`. 4) Select 'Tree' in the 'Patch / Tree' panel. 5) Select any file with 'Highlight this too' or 'Highlight this only' (e.g `xmerge.c`). 6) See the short file name (`xmerge.c`) instead of the full path (`xdiff/xmerge.c`) in the 'Find commit touching path:' edit field. No commits touching the file can be found. -- Mit freundlichen Grüßen, Anatoly Borodin
Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
Hi Junio, On Thu, 9 Feb 2017, Junio C Hamano wrote: > Johannes Schindelin writes: > > > (And that would have to be handled at a different point, as I had > > pointed out, so that suggested preparation would most likely not help > > at all.) > > I did not think "that would have to be handled at a different point" > is correct at all, if by "a point" you meant "a location in the > code" [*1*]. Yes, I mean the location in the code. But since you keep insisting that you are right and I am wrong, and even go so far as calling your patch reverting my refactoring a hot-fix, why don't you just go ahead and merge the result over my objections? Ciao, Johannes
Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
Michael Haggerty writes: > On 02/06/2017 11:34 PM, Junio C Hamano wrote: >> [...] >> -- >> [Stalled] >> [...] >> * mh/ref-remove-empty-directory (2017-01-07) 23 commits >> - files_transaction_commit(): clean up empty directories >> - try_remove_empty_parents(): teach to remove parents of reflogs, too >> - try_remove_empty_parents(): don't trash argument contents >> - try_remove_empty_parents(): rename parameter "name" -> "refname" >> - delete_ref_loose(): inline function >> - delete_ref_loose(): derive loose reference path from lock >> - log_ref_write_1(): inline function >> - log_ref_setup(): manage the name of the reflog file internally >> - log_ref_write_1(): don't depend on logfile argument >> - log_ref_setup(): pass the open file descriptor back to the caller >> - log_ref_setup(): improve robustness against races >> - log_ref_setup(): separate code for create vs non-create >> - log_ref_write(): inline function >> - rename_tmp_log(): improve error reporting >> - rename_tmp_log(): use raceproof_create_file() >> - lock_ref_sha1_basic(): use raceproof_create_file() >> - lock_ref_sha1_basic(): inline constant >> - raceproof_create_file(): new function >> - safe_create_leading_directories(): set errno on SCLD_EXISTS >> - safe_create_leading_directories_const(): preserve errno >> - t5505: use "for-each-ref" to test for the non-existence of references >> - refname_is_safe(): correct docstring >> - files_rename_ref(): tidy up whitespace >> >> Deletion of a branch "foo/bar" could remove .git/refs/heads/foo >> once there no longer is any other branch whose name begins with >> "foo/", but we didn't do so so far. Now we do. >> >> Expecting a reroll. >> cf. <5051c78e-51f9-becd-e1a6-9c0b781d6...@alum.mit.edu> > > I think you missed v4 of this patch series [1], which is the re-roll > that you were waiting for. And I missed that you missed it... > > Michael > > [1] http://public-inbox.org/git/cover.1483719289.git.mhag...@alum.mit.edu/ Actually it was worse than that. What the above lists *is* v4; I just failed to update "Expecting a reroll" note when I updated the topic with your rerolled patches, and left it there trusting the now-stale note of mine. Sorry, and a HUGE thanks for noticing the mistake.
Re: [PATCH 0/5] Store submodules in a hash, not a linked list
On Thu, Feb 09, 2017 at 02:26:57PM +0100, Michael Haggerty wrote: > I have mentioned this patch series on the mailing list a couple of > time [1,2] but haven't submitted it before. I just rebased it to > current master. It is available from my Git fork [3] as branch > "submodule-hash". > > The first point of this patch series is to optimize submodule > `ref_store` lookup by storing the `ref_store`s in a hashmap rather > than a linked list. But a more interesting second point is to weaken > the 1:1 relationship between submodules and `ref_stores` a little bit > more. Sounds good. I remember this had been discussed before due to performance issues with resolve_gitlink_ref(), and we took a different route (not populating non-submodule entries). I think it's nice to have both optimizations, though, as they hit different use cases. > A `files_ref_store` would be perfectly happy to represent, say, the > references *physically* stored in a linked worktree (e.g., `HEAD`, > `refs/bisect/*`, etc) even though that is not the complete collection > of refs that are *logically* visible from that worktree (which > includes references from the main repository, too). But the old code > was confusing the two things by storing "submodule" in every > `ref_store` instance. > > So push the submodule attribute down to the `files_ref_store` class > (but continue to let the `ref_store`s be looked up by submodule). I'm not sure I understand all of the ramifications here. It _sounds_ like pushing this down into the files-backend code would make it harder to have mixed ref-backends for different submodules. Or is this just pushing down an implementation detail of the files backend, and future code is free to have as many different ref_stores as it likes? -Peff
Re: [PATCH 0/5] Store submodules in a hash, not a linked list
Michael Haggerty writes: > I have mentioned this patch series on the mailing list a couple of > time [1,2] but haven't submitted it before. I just rebased it to > current master. It is available from my Git fork [3] as branch > "submodule-hash". > > The first point of this patch series is to optimize submodule > `ref_store` lookup by storing the `ref_store`s in a hashmap rather > than a linked list. But a more interesting second point is to weaken > the 1:1 relationship between submodules and `ref_stores` a little bit > more. > > A `files_ref_store` would be perfectly happy to represent, say, the > references *physically* stored in a linked worktree (e.g., `HEAD`, > `refs/bisect/*`, etc) even though that is not the complete collection > of refs that are *logically* visible from that worktree (which > includes references from the main repository, too). But the old code > was confusing the two things by storing "submodule" in every > `ref_store` instance. > > So push the submodule attribute down to the `files_ref_store` class > (but continue to let the `ref_store`s be looked up by submodule). > > The last commit is relatively orthogonal to the others; it simplifies > read_loose_refs() by calling resolve_ref_recursively() directly using > the `ref_store` instance that it already has in hand, rather than > indirectly via the public wrappers. > > Michael > > [1] > http://public-inbox.org/git/341999fc-4496-b974-c117-c18a2fca1...@alum.mit.edu/ > [2] > http://public-inbox.org/git/37fe2024-0378-a974-a28d-18a89d3e2...@alum.mit.edu/ > [3] https://github.com/mhagger/git > > Michael Haggerty (5): > refs: store submodule ref stores in a hashmap > refs: push the submodule attribute down > register_ref_store(): new function > files_ref_store::submodule: use NULL for the main repository > read_loose_refs(): read refs using resolve_ref_recursively() Thanks. Will queue on mh/submodule-hash forked from 'maint'.
Re: Bug with fixup and autosquash
Michael J Gruber writes: > Junio C Hamano venit, vidit, dixit 08.02.2017 23:55: > >> Let's hear from some of those (Cc'ed) who were involved in an >> earlier --autosquash thread. >> >> https://public-inbox.org/git/cover.1259934977.git.mhag...@alum.mit.edu/ >> >> >> [Footnote] >> >> *1* "rebase -i --autosquash" does understand "fixup! yy", so if >> you are willing to accept the consequence of not being able to >> rebase twice, you could instead do >> >> $ git commit -m "fixup! yy" >> >> I would think. > > Doesn't this indicate that rebase is fine as is? Not really, unless you ignore "if you are willing to accept" part, which actually is a big downside. And --fixup-fixed will make it worse, unfortunately. > - teach "git commit --fixup=" to check for duplicates (same prefix, > maybe only in "recent" history) and make it issue a warning, say: This is a very good idea worth pursuing, and (I didn't think things through, though) we may even be able to bound "recent" without heuristics---scanning between and the tip for duplicates might be sufficient. > Additionally, we could teach commit --fixup-fixed to commit -m "fixup! > " so that we have both uniqueness and verbosity in the > rebase-i-editor. This would allow "rebase -i" to fall back to the old > mode when "" is not in the range it operates on. This is also a possibility, but it needs cooperation between both "commit" and "rebase -i". I personally do not think rewriting "fixup! yy" on the title during rebase is worth doing, but that is not because I have a concrete reason against it but it just sounds like too much magic to my gut feeling. Perhaps it can be done reliably with minimal change to the code. I dunno. Thanks.
Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap
On Thu, Feb 09, 2017 at 06:40:29PM +0100, Michael Haggerty wrote: > On 02/09/2017 05:58 PM, Stefan Beller wrote: > >> @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char > >> *submodule) > >> > >> struct ref_store *lookup_ref_store(const char *submodule) > >> { > > > >> + if (!submodule_ref_stores.tablesize) > >> + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, > >> 20); > > > > > > So we can lookup a submodule even before we initialized the subsystem? > > Does that actually happen? (It sounds like a bug to me.) > > > > Instead of initializing, you could return NULL directly here. > [...] > I suppose this code path could be changed to return NULL without > initializing the hashmap, but the hashmap will be initialized a moment > later by ref_store_init(), so I don't see much difference either way. I faced a similar issue when adding the oidset API recently: http://public-inbox.org/git/20170208205307.uvgj3saf3cyrv...@sigill.intra.peff.net/ I came to the same conclusion that it doesn't matter much in practice. A nice thing about "return NULL" is that you do not have to duplicate the initialization which happens on the "write" side (so if you ever changed submodule_hash_cmp, for example, it needs changed in both places). I also used the "cmpfn" member to check whether the table had been initialized, which matches existing uses elsewhere. But I do notice that the documentation explicitly says "tablesize" is good for this purpose, so it's probably a better choice. -Peff
Re: [PATCH v3 00/27] Revamp the attribute system; another round
Brandon Williams writes: > At least v3 gets the attribute system to a state where further > improvements should be relatively easy to make. And now as long as each > thread has a unique attr_check structure, multiple callers can exist > inside the attribute system at the same time. There is still more work > to be done on it though. Still my biggest complaint is the "direction" > aspect of the system. I would love to also eliminate that as global > state at some point though I'm not sure how at this point. We are in agreement 100% ;-) The "direction" was the last thorn I was fighting with (without successfully coming up with a usable solution) when I stopped working on my original series before Stefan took it over.
Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
Michael Haggerty writes: > On 02/06/2017 11:34 PM, Junio C Hamano wrote: >> [...] >> -- >> [Stalled] >> [...] >> * mh/ref-remove-empty-directory (2017-01-07) 23 commits >> >> Deletion of a branch "foo/bar" could remove .git/refs/heads/foo >> once there no longer is any other branch whose name begins with >> "foo/", but we didn't do so so far. Now we do. >> >> Expecting a reroll. >> cf. <5051c78e-51f9-becd-e1a6-9c0b781d6...@alum.mit.edu> > > I think you missed v4 of this patch series [1], which is the re-roll > that you were waiting for. And I missed that you missed it... > > Michael > > [1] http://public-inbox.org/git/cover.1483719289.git.mhag...@alum.mit.edu/ Great. Thanks.
Re: Google Doc about the Contributors' Summit
Johannes Schindelin writes: > I just started typing stuff up in a Google Doc, and made it editable to > everyone, feel free to help and add things: > > https://docs.google.com/document/d/1KDoSn4btbK5VJCVld32he29U0pUeFGhpFxyx9ZJDDB0/edit?usp=sharing Thanks for a write-up, Dscho. List of bullet points just make non-attendees envious, imagining that attendees had all the fun discussing what is behind these bullet points, without being able to know what was discussed, if they reached consensus and what the consensus was, but it is OK ;-) A few items caught my attention, not because I found them more important than other items on the page, but because they seem to want my input without directly asking me ;-) * If Junio would accept patches by replying to the sender with an ID and/or a patch name. He picks this (branch) name when he gets your patch. I am not sure what exactly I am asked to "accept" here. I sometimes forget to respond with "Thanks, will queue." to the patch message and whoever said this wants me to consistently do so? I never say "... will queue as js/difftool-builtin topic." mainly because I do not know what the name of the topic branch should be at the point of reading the patch. All I know is that decided that it may be worth queuing, so it is a bit harder to arrange. But I can certainly try if it makes the lives of contributors easier. * Junio has a script for the todo branch which we can use to generate the what's cooking branch. Perhaps we could continuously generate this onto a webpage. I have no objection, but I doubt that people find such an auto generated version all that useful, as "git log --first-parent origin/master..origin/pu" would tell exactly the same story. It will lack the "topic summary" comment I have yet to write, if the final 'pu' of the day that was pushed out was before my local update to the draft of the next issue of "What's cooking" report [*1*], and would not have any update on the next action (e.g. "Will merge to ...") relative to the latest issue of "What's cooking" report. IOW, such an auto-generated report lacks all the added value over "git log" output. * Making the actual workflow more publicly known, e.g. document how to generate the cooking email, to learn about the state of a generation. The exact mechanics of "how to generate" may be of less importance than "how the information contained therein relates to their own work" to the contributors, and I think MaintNotes that is sent out at key milestones more or less covers the mechanics, but here is how the sausage is made these days: - I find a patch series on the list that is in good enough shape to be in 'pu'. Perhaps it was already discussed and redone a few times without hitting my tree. Or it may be the first attempt of a new topic. I come up with a topic name, decide where the topic should fork at [*2*], create the topic branch and queue the patches. I may or may not test the topic in isolation at this point. - I may find an updated patch series of what has been queued. I go to the existing topic branch and replace it (I try to keep it forked at the same commit) after inspecting what got updated. "git tbdiff" is a useful program to help this step. I may or may not test the topic in isolation at this point. - I repeat the above two for various topics during the day, and at some point between 14:00-15:00 my time, stop taking new patches. The day's integration cycle starts. - If there are topics that have cooked long enough in 'next' and planned to graduate to 'master', merge [*3*] them to 'master', update the draft Release notes. Otherwise skip this step. - If there are topics that have cooked long enough in 'pu' and planned to graduate to 'next', merge them to 'next'. Otherwise skip this step. - If I updated 'master', merge its tip to 'next' (this should update the draft release notes and nothing else, unless I took something directly to 'master'). - Then I recreate 'jch', which is a point between 'master' and 'pu'. This is the version I use for my own work, contains all topics in 'next' but a bit more. "git checkout -B jch master" begins it, and then the topics that were in 'jch' are merged on top. The latest version of updated topics that were already in 'jch' are incorporated at this point, and "git diff jch@{20.hours}" would show the effect of their interdiff (plus RelNotes update, if 'master' was updated during this integration cycle). - I ran "git branch --no-merged pu --sort=-committerdate" to see the topics that are new; the top of this list shows new topics and updated topics (note that I just updated 'jch' but not 'pu' yet at this point). Among them, I pick the ones that I am willing to be a guinea-pig for before they hit 'next' and merge them to 'jch'. Other topics that used to be in 'pu' may also be merged at this point, when they turn out to be "interesting" enough. - 'pu' is recre
Re: [PATCH/RFC] WIP: log: allow "-" as a short-hand for "previous branch"
On 9 February 2017 at 17:55, Matthieu Moy wrote: > >> [...] >> As you might notice, in this list, most commands are not of the `rm` variety, >> i.e. something that would delete stuff. > > OK, I think I'm convinced. I am glad! :) > > Keep the arguments in mind when polishing the commit message. I will definitely do that. I am working on a good commit message for this by looking at some past changes to sha1_name.c which have affected multiple commands. > > -- > Matthieu Moy > http://www-verimag.imag.fr/~moy/ -- Best Regards, - Siddharth.
Re: [PATCH 3/5] register_ref_store(): new function
On 02/09/2017 06:20 PM, Stefan Beller wrote: > On Thu, Feb 9, 2017 at 5:27 AM, Michael Haggerty wrote: >> Move the responsibility for registering the ref_store for a submodule >> from base_ref_store_init() to a new function, register_ref_store(). Call >> the latter from ref_store_init(). >> >> This means that base_ref_store_init() can lose its submodule argument, >> further weakening the 1:1 relationship between ref_stores and >> submodules. >> >> Signed-off-by: Michael Haggerty >> --- > > > > >> + >> struct ref_store *ref_store_init(const char *submodule) >> { >> const char *be_name = "files"; >> struct ref_storage_be *be = find_ref_storage_backend(be_name); >> + struct ref_store *refs; >> >> if (!be) >> die("BUG: reference backend %s is unknown", be_name); >> >> if (!submodule || !*submodule) >> - return be->init(NULL); >> + refs = be->init(NULL); >> else >> - return be->init(submodule); >> + refs = be->init(submodule); >> + >> + register_ref_store(refs, submodule); >> + return refs; >> } > > This function is already very readable, though maybe it would be > more readable like so: > > { > const char *be_name = "files"; > struct ref_storage_be *be = find_ref_storage_backend(be_name); > > if (!be) > die("BUG: reference backend %s is unknown", be_name); > > /* replace empty string by NULL */ > if (submodule && !*submodule) > submodule = NULL; > > register_ref_store(be->init(submodule), submodule); > return refs; > } > > Well, I dunno; the function inside the arguments to register seems ugly, > though. Nit: you forgot to define and initialize `refs` (for returning to the caller). Actually, there is an inconsistency between the docstring for this function and its behavior. The docstring claims that it can handle `submodule == ""`, and it tries to do so, but incorrectly. The problem is that it passes an un-cleaned-up `submodule` to `register_ref_store()`, which is expecting a cleaned-up one. But this function is only called by get_ref_store(), which has already arranged for the empty string to be passed along as NULL. In fact, the only external entry point into these functions is `get_ref_store()`. I think what I should do is make the other functions private, and remove their attempts to handle `submodule == ""`. I'll fix this up in a re-roll. (I wonder whether anybody really passes the empty string into this method. It never happens in the test suite. I doubt I can muster the energy to audit all of the call paths.) Michael
Re: [PATCH 5/5] read_loose_refs(): read refs using resolve_ref_recursively()
On Thu, Feb 9, 2017 at 5:27 AM, Michael Haggerty wrote: > There is no need to call read_ref_full() or resolve_gitlink_ref() from > read_loose_refs(), because we already have a ref_store object in hand. > So we can call resolve_ref_recursively() ourselves. Happily, this > unifies the code for the submodule vs. non-submodule cases. > > This requires resolve_ref_recursively() to be exposed to the refs > subsystem, though not to non-refs code. > > Signed-off-by: Michael Haggerty > --- Looks good, Stefan
Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap
On Thu, Feb 9, 2017 at 9:40 AM, Michael Haggerty wrote: > On 02/09/2017 05:58 PM, Stefan Beller wrote: >>> @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char >>> *submodule) >>> >>> struct ref_store *lookup_ref_store(const char *submodule) >>> { >> >>> + if (!submodule_ref_stores.tablesize) >>> + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20); >> >> >> So we can lookup a submodule even before we initialized the subsystem? >> Does that actually happen? (It sounds like a bug to me.) >> >> Instead of initializing, you could return NULL directly here. > > The lines you quoted are only concerned with bringing the (empty) > hashmap into existence if it hasn't been initialized already. (There's > no HASHMAP_INIT.) I don't know what you mean by "initialize the > subsystem". The only way to bring a ref_store *object* into existence is > currently to call get_ref_store(submodule), which calls > lookup_ref_store(submodule) to see if it already exists, and if not > calls ref_store_init(submodule) to instantiate it and register it in the > hashmap. There's nothing else that has to be initialize before that > (except maybe the usual startup config reading etc.) > > I suppose this code path could be changed to return NULL without > initializing the hashmap, but the hashmap will be initialized a moment > later by ref_store_init(), so I don't see much difference either way. Oh, I did not see that. Thanks, Stefan > > Thanks for your review! > Michael >
Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap
On 02/09/2017 05:58 PM, Stefan Beller wrote: >> @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char >> *submodule) >> >> struct ref_store *lookup_ref_store(const char *submodule) >> { > >> + if (!submodule_ref_stores.tablesize) >> + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20); > > > So we can lookup a submodule even before we initialized the subsystem? > Does that actually happen? (It sounds like a bug to me.) > > Instead of initializing, you could return NULL directly here. The lines you quoted are only concerned with bringing the (empty) hashmap into existence if it hasn't been initialized already. (There's no HASHMAP_INIT.) I don't know what you mean by "initialize the subsystem". The only way to bring a ref_store *object* into existence is currently to call get_ref_store(submodule), which calls lookup_ref_store(submodule) to see if it already exists, and if not calls ref_store_init(submodule) to instantiate it and register it in the hashmap. There's nothing else that has to be initialize before that (except maybe the usual startup config reading etc.) I suppose this code path could be changed to return NULL without initializing the hashmap, but the hashmap will be initialized a moment later by ref_store_init(), so I don't see much difference either way. Thanks for your review! Michael
Re: [PATCH 3/5] register_ref_store(): new function
On Thu, Feb 9, 2017 at 5:27 AM, Michael Haggerty wrote: > Move the responsibility for registering the ref_store for a submodule > from base_ref_store_init() to a new function, register_ref_store(). Call > the latter from ref_store_init(). > > This means that base_ref_store_init() can lose its submodule argument, > further weakening the 1:1 relationship between ref_stores and > submodules. > > Signed-off-by: Michael Haggerty > --- > + > struct ref_store *ref_store_init(const char *submodule) > { > const char *be_name = "files"; > struct ref_storage_be *be = find_ref_storage_backend(be_name); > + struct ref_store *refs; > > if (!be) > die("BUG: reference backend %s is unknown", be_name); > > if (!submodule || !*submodule) > - return be->init(NULL); > + refs = be->init(NULL); > else > - return be->init(submodule); > + refs = be->init(submodule); > + > + register_ref_store(refs, submodule); > + return refs; > } This function is already very readable, though maybe it would be more readable like so: { const char *be_name = "files"; struct ref_storage_be *be = find_ref_storage_backend(be_name); if (!be) die("BUG: reference backend %s is unknown", be_name); /* replace empty string by NULL */ if (submodule && !*submodule) submodule = NULL; register_ref_store(be->init(submodule), submodule); return refs; } Well, I dunno; the function inside the arguments to register seems ugly, though.
Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
Johannes Schindelin writes: > (And that would have to be handled at a different point, as I > had pointed out, so that suggested preparation would most likely not help > at all.) I did not think "that would have to be handled at a different point" is correct at all, if by "a point" you meant "a location in the code" [*1*]. If we want to make it configurable in a more detailed way by directly allowing to override port_option and needs_batch separately, you would do that in override_ssh_variant(), without touching handle_ssh_variant() in the refactored code. That way, you do not have to worry about breaking the auto detection based on the command name. [Footnote] *1* Or did you mean "point in time", aka "let's do it outside the scope of this patch series"? Let's not keep it as a SQUASH??? proposal, but as a separate hot-fix follow-up patch. -- >8 -- Subject: connect.c: stop conflating ssh command names and overrides dd33e07766 ("connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config", 2017-02-01) attempted to add support for configuration and environment variable to override the different handling of port_option and needs_batch settings suitable for variants of the ssh implementation that was autodetected by looking at the ssh command name. Because it piggybacked on the code that turns command name to specific override (e.g. "plink.exe" and "plink" means port_option needs to be set to 'P' instead of the default 'p'), yet it defined a separate namespace for these overrides (e.g. "putty" can be usable to signal that port_option needs to be 'P'), however, it made the auto-detection based on the command name less robust (e.g. the code now accepts "putty" as a SSH command name and applies the same override). Separate the code that interprets the override that was read from the configuration & environment from the original code that handles the command names, as they are in separate namespaces, to fix this confusion. This incidentally also makes it easier for future enhancement of the override syntax (e.g. "port_option=p,needs_batch=1" may want to be accepted as a more explicit syntax) without affecting the code for auto-detection based on the command name. While at it, update the return type of the handle_ssh_variant() helper function to void; the caller does not use it, and the function does not return any meaningful value. Signed-off-by: Junio C Hamano --- connect.c | 45 - 1 file changed, 32 insertions(+), 13 deletions(-) diff --git a/connect.c b/connect.c index 7f1f802396..7d65c1c736 100644 --- a/connect.c +++ b/connect.c @@ -691,17 +691,39 @@ static const char *get_ssh_command(void) return NULL; } -static int handle_ssh_variant(const char *ssh_command, int is_cmdline, - int *port_option, int *needs_batch) +static int override_ssh_variant(int *port_option, int *needs_batch) { - const char *variant = getenv("GIT_SSH_VARIANT"); + char *variant; + + variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT")); + if (!variant && + git_config_get_string("ssh.variant", &variant)) + return 0; + + if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) { + *port_option = 'P'; + *needs_batch = 0; + } else if (!strcmp(variant, "tortoiseplink")) { + *port_option = 'P'; + *needs_batch = 1; + } else { + *port_option = 'p'; + *needs_batch = 0; + } + free(variant); + return 1; +} + +static void handle_ssh_variant(const char *ssh_command, int is_cmdline, + int *port_option, int *needs_batch) +{ + const char *variant; char *p = NULL; - if (variant) - ; /* okay, fall through */ - else if (!git_config_get_string("ssh.variant", &p)) - variant = p; - else if (!is_cmdline) { + if (override_ssh_variant(port_option, needs_batch)) + return; + + if (!is_cmdline) { p = xstrdup(ssh_command); variant = basename(p); } else { @@ -717,12 +739,11 @@ static int handle_ssh_variant(const char *ssh_command, int is_cmdline, */ free(ssh_argv); } else - return 0; + return; } if (!strcasecmp(variant, "plink") || - !strcasecmp(variant, "plink.exe") || - !strcasecmp(variant, "putty")) + !strcasecmp(variant, "plink.exe")) *port_option = 'P'; else if (!strcasecmp(variant, "tortoiseplink") || !strcasecmp(variant, "tortoiseplink.exe")) { @@ -730,8 +751,6 @@ static int handle_ssh_variant(const char *ssh_command, int is_cmdline, *needs_batch = 1; } free(p); - - return 1; } /* -- 2.12.0-rc0-2
Re: Automatically Add .gitignore Files
Hi Duy, > This is a general problem to new files, not .gitignore alone. Can we The difference, to me, is that a ".gitignore" file is not part of what I'm developing. It's an artifact for git configuration. While a .gitignore file is not always pushed to the repository, I imagine that in most situations, it is. Whereas when a "new" file is created, there are plenty of situations where it shouldn't be added and thus a warning would be superfluous, or an automatic add would be undesirable. To solve the problem, generally, for new files while giving the user the ability to specify exactly what "new" files should be automatically added to the repository, something like the following would work: echo "**/.gitignore" >> .git/config/add-before-commit > and perhaps you want to make it a habit to run it before committing. Right, because software shouldn't automate repetitive tasks and humans are never prone to forget? ;-)
Re: [PATCH 4/5] files_ref_store::submodule: use NULL for the main repository
On Thu, Feb 9, 2017 at 5:27 AM, Michael Haggerty wrote: > The old practice of storing the empty string in this member for the main > repository was a holdover from before 00eebe3 (refs: create a base class > "ref_store" for files_ref_store, 2016-09-04), when the submodule was > stored in a flex array at the end of `struct files_ref_store`. Storing > NULL for this case is more idiomatic and a tiny bit less code. > > Signed-off-by: Michael Haggerty > --- Makes sense! Stefan
Re: [PATCH v3 00/27] Revamp the attribute system; another round
On 02/02, Junio C Hamano wrote: > prepare the array > for path taken from some set: > do_something(the array, path) > > That way, do_something() do not have to keep allocating, > initializing and destroying the array. > > But after looking at the current set of codepaths, before coming to > the conclusion that we need to split the static part that is > specific to the callsite for git_check_attr() and the dynamic part > that is specific to the pair, I noticed that > typically the callers that can prepare the array before going into > the loop (which will eventually be spread across multiple threads) > are many levels away in the callchain, and they are not even aware > of what attributes are going to be requested in the leaf level > helper functions. In other words, the approach to hoist "the > array" up in the callchain would not scale. A > caller that loops over paths in the index and check them out does > not want to know (and we do not want to tell it) what exact > attributes are involved in the decision convert_to_working_tree() > makes for each path, for example. This was something that I was envisioning as well, though I didn't dig very deep into the call stack. Another means of doing this could be to have the attr_check structure allocated and then have it configured at a later point for the particular question being asked: alloc struct attr_check c; ... many call sites down configure(c, questions) for path do_something(c, path) That also allows the same structure to be reused (just reconfigured) if different attributes are needed at a later point in time. Of course this is just an idea and I'm not sure if this is the best way to do it either. > > So how would we split questions and answers in a way that is not > naive and inefficient? > > I envision that we would allow the attribute subsystem to keep track > of the dynamic part, which will receive the answers, holds working > area like check_all_attr[], and has the equivalent to the "attr > stack", indexed by pair (and the > identification of "callsite" can be done by using the address of the > static part, i.e. the array of questions that we initialize just > once when interning the list of attribute names for the first time). > > The API to prepare and ask for attributes may look like: > > static struct attr_static_part Q; > struct attr_dynamic_part *D; > > attr_check_init(&Q, "text", ...); > D = git_attr_check(&Q, path); > > where Q contains an array of interned attributes (i.e. questions) > and other immutable things that is unique to this callsite, but can > be shared across multiple threads asking the same question from > here. As an internal implementation detail, it probably will have a > mutex to make sure that init will run only once. > > Then the implementation of git_attr_check(&Q, path) would be: > > - see if there is already the "dynaic part" allocated for the > current thread asking the question Q. If there is not, > allocate one and remember it, so that it can be reused in > later calls by the same thread; if there is, use that existing > one. > > - reinitialize the "dynamic part" as needed, e.g. clear the > equivalent to check_all_attr[], adjust the equivalent to > attr_stack for the current path, etc. Just like the current > code optimizes for the case where the entire program (a single > thread) will ask the same question for paths in traversal > order (i.e. falling in the same directory), this will optimize > for the access pattern where each thread asks the same > question for paths in its traversal order. > > - do what the current collect_some_attrs() thing does. > > And this hopefully won't be as costly as the naive and inefficient > one. I agree, this sort of implementation wouldn't suffer from the same allocation penalty that the naive implementation suffers from. This would be slightly challenging to ensure that there aren't any memory leaks, well not leaks but rather memory that isn't freed. i.e. When a thread terminates we would want to reclaim the memory used for the dynamic part which is stored inside the attribute system. > > The reason why I was pushing hard to split the static part and the > dynamic part in our redesign of the API is primarily because I didn't > want to update the API callers twice. But I'd imagine that your v3 > (and your earlier "do not discard attr stack, but keep them around, > holding their tips in a hashmap for quick reuse") would at least lay > the foundation for the eventual shape of the API, let's bite the > bullet and accept that we will need to update the callers again > anyway. > > Thanks. > At least v3 gets the attribute system to a state where further improvements should be relatively easy to make. And now as long as each thread has a unique attr_check structure, multiple callers can exist inside the attribu
Re: [PATCH 2/5] refs: push the submodule attribute down
On Thu, Feb 9, 2017 at 5:26 AM, Michael Haggerty wrote: > Push the submodule attribute down from ref_store to files_ref_store. > This is another step towards loosening the 1:1 connection between > ref_stores and submodules. > > Signed-off-by: Michael Haggerty Looks good, Stefan
Re: [PATCH 1/5] refs: store submodule ref stores in a hashmap
> @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char > *submodule) > > struct ref_store *lookup_ref_store(const char *submodule) > { > + if (!submodule_ref_stores.tablesize) > + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20); So we can lookup a submodule even before we initialized the subsystem? Does that actually happen? (It sounds like a bug to me.) Instead of initializing, you could return NULL directly here. Otherwise looks good. Thanks, Stefan
Re: [RFD] should all merge bases be equal?
Michael Haggerty writes: > Anyway, I mostly wanted to remind you of the earlier discussion of this > topic. There's a lot more information there. > > Michael > > [1] http://public-inbox.org/git/539a25bf.4060...@alum.mit.edu/ Your http://public-inbox.org/git/53a3f67a.80...@alum.mit.edu/ in the thread appears to me the best place to continue exploring. Thanks for the link.
Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
On 02/06/2017 11:34 PM, Junio C Hamano wrote: > [...] > -- > [Stalled] > [...] > * mh/ref-remove-empty-directory (2017-01-07) 23 commits > - files_transaction_commit(): clean up empty directories > - try_remove_empty_parents(): teach to remove parents of reflogs, too > - try_remove_empty_parents(): don't trash argument contents > - try_remove_empty_parents(): rename parameter "name" -> "refname" > - delete_ref_loose(): inline function > - delete_ref_loose(): derive loose reference path from lock > - log_ref_write_1(): inline function > - log_ref_setup(): manage the name of the reflog file internally > - log_ref_write_1(): don't depend on logfile argument > - log_ref_setup(): pass the open file descriptor back to the caller > - log_ref_setup(): improve robustness against races > - log_ref_setup(): separate code for create vs non-create > - log_ref_write(): inline function > - rename_tmp_log(): improve error reporting > - rename_tmp_log(): use raceproof_create_file() > - lock_ref_sha1_basic(): use raceproof_create_file() > - lock_ref_sha1_basic(): inline constant > - raceproof_create_file(): new function > - safe_create_leading_directories(): set errno on SCLD_EXISTS > - safe_create_leading_directories_const(): preserve errno > - t5505: use "for-each-ref" to test for the non-existence of references > - refname_is_safe(): correct docstring > - files_rename_ref(): tidy up whitespace > > Deletion of a branch "foo/bar" could remove .git/refs/heads/foo > once there no longer is any other branch whose name begins with > "foo/", but we didn't do so so far. Now we do. > > Expecting a reroll. > cf. <5051c78e-51f9-becd-e1a6-9c0b781d6...@alum.mit.edu> I think you missed v4 of this patch series [1], which is the re-roll that you were waiting for. And I missed that you missed it... Michael [1] http://public-inbox.org/git/cover.1483719289.git.mhag...@alum.mit.edu/
Re: GSoC 2017: application open, deadline = February 9, 2017
On Thu, Feb 9, 2017 at 1:22 PM, Matthieu Moy wrote: > Matthieu Moy writes: > >> I created a Git organization and invited you + Peff as admins. Great thanks! I accepted the invite. > I'll >> start cut-and-pasting to show my good faith ;-). > > I created this page based on last year's: > > https://git.github.io/SoC-2017-Org-Application/ > > I filled-in the "org profile". "Org application" is still TODO. I copy pasted the Org application from last year, so I think we are good.
[PATCH v2] git-p4: fix git-p4.pathEncoding for removed files
In a9e38359e3 we taught git-p4 a way to re-encode path names from what was used in Perforce to UTF-8. This path re-encoding worked properly for "added" paths. "Removed" paths were not re-encoded and therefore different from the "added" paths. Consequently, these files were not removed in a git-p4 cloned Git repository because the path names did not match. Fix this by moving the re-encoding to a place that affects "added" and "removed" paths. Add a test to demonstrate the issue. Signed-off-by: Lars Schneider --- Hi, unfortunately, I missed to send this v2. I agree with Luke's review and I moved the re-encode of the path name to the `streamOneP4File` and `streamOneP4Deletion` explicitly. Discussion: http://public-inbox.org/git/CAE5ih7-=bd_zol5pfyfd2qvy-xe24v_cgge0xoavuotk02e...@mail.gmail.com/ Thanks, Lars Notes: Base Commit: 454cb6bd52 (v2.11.0) Diff on Web: https://github.com/larsxschneider/git/commit/75ed3e92e2 Checkout:git fetch https://github.com/larsxschneider/git git-p4/fix-path-encoding-v2 && git checkout 75ed3e92e2 Interdiff (v1..v2): diff --git a/git-p4.py b/git-p4.py index 8f311cb4e8..dac8b4955d 100755 --- a/git-p4.py +++ b/git-p4.py @@ -2366,15 +2366,6 @@ class P4Sync(Command, P4UserMap): break path = wildcard_decode(path) -try: -path.decode('ascii') -except: -encoding = 'utf8' -if gitConfig('git-p4.pathEncoding'): -encoding = gitConfig('git-p4.pathEncoding') -path = path.decode(encoding, 'replace').encode('utf8', 'replace') -if self.verbose: -print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, path) return path def splitFilesIntoBranches(self, commit): @@ -2427,11 +2418,24 @@ class P4Sync(Command, P4UserMap): self.gitStream.write(d) self.gitStream.write('\n') +def encodeWithUTF8(self, path): +try: +path.decode('ascii') +except: +encoding = 'utf8' +if gitConfig('git-p4.pathEncoding'): +encoding = gitConfig('git-p4.pathEncoding') +path = path.decode(encoding, 'replace').encode('utf8', 'replace') +if self.verbose: +print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, path) +return path + # output one file from the P4 stream # - helper for streamP4Files def streamOneP4File(self, file, contents): relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes) +relPath = self.encodeWithUTF8(relPath) if verbose: size = int(self.stream_file['fileSize']) sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size/1024/1024)) @@ -2511,6 +2515,7 @@ class P4Sync(Command, P4UserMap): def streamOneP4Deletion(self, file): relPath = self.stripRepoPath(file['path'], self.branchPrefixes) +relPath = self.encodeWithUTF8(relPath) if verbose: sys.stdout.write("delete %s\n" % relPath) sys.stdout.flush() git-p4.py | 24 ++-- t/t9822-git-p4-path-encoding.sh | 16 2 files changed, 30 insertions(+), 10 deletions(-) diff --git a/git-p4.py b/git-p4.py index fd5ca52462..dac8b4955d 100755 --- a/git-p4.py +++ b/git-p4.py @@ -2418,11 +2418,24 @@ class P4Sync(Command, P4UserMap): self.gitStream.write(d) self.gitStream.write('\n') +def encodeWithUTF8(self, path): +try: +path.decode('ascii') +except: +encoding = 'utf8' +if gitConfig('git-p4.pathEncoding'): +encoding = gitConfig('git-p4.pathEncoding') +path = path.decode(encoding, 'replace').encode('utf8', 'replace') +if self.verbose: +print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, path) +return path + # output one file from the P4 stream # - helper for streamP4Files def streamOneP4File(self, file, contents): relPath = self.stripRepoPath(file['depotFile'], self.branchPrefixes) +relPath = self.encodeWithUTF8(relPath) if verbose: size = int(self.stream_file['fileSize']) sys.stdout.write('\r%s --> %s (%i MB)\n' % (file['depotFile'], relPath, size/1024/1024)) @@ -2495,16 +2508,6 @@ class P4Sync(Command, P4UserMap): text = regexp.sub(r'$\1$', text) contents = [ text ] -try: -relPath.decode('ascii') -except: -encoding = 'utf8' -if gitConfig('git-p4.pathEncoding'): -
Re: Bug with fixup and autosquash
Junio C Hamano venit, vidit, dixit 08.02.2017 23:55: > Ashutosh Bapat writes: > >> I have been using git rebase heavily these days and seem to have found a bug. >> >> If there are two commit messages which have same prefix e.g. >> yy This is prefix >> xx This is prefix and message >> >> xx comitted before yy >> >> Now I commit a fixup to yy using git commit --fixup yy >> zz fixup! This is prefix >> >> When I run git rebase -i --autosquash, the script it shows me looks like >> pick xx This is prefix and message >> fixup zz fixup! This is prefix >> pick yy This is prefix >> >> I think the correct order is >> pick xx This is prefix and message >> pick yy This is prefix >> fixup zz fixup! This is prefix >> >> Is that right? > > Because "commit" pretends as if it took the exact commit object name > to be fixed up (after all, it accepts "yy" that is a name of the > commit object), it would be nice if the fixup is applied to that > exact commit, even if you had many commits that share exactly the > same title (i.e. not just shared prefix). > > Unfortunately, "rebase -i --autosquash" reorders the entries by > identifying the commit by its title, and it goes with prefix match > so that fix-up commits created without using --fixup option but > manually records the title's prefix substring can also work. > > We could argue that the logic should notice that there is one exact > match and another non-exact prefix match and favor the former, and > certainly such a change would make your made-up example (you didn't > actually have a commit whose title is literally "This is prefix") > above work better. > > But I am not sure if adding such heuristics would really help in > general. It would not help those whose commits are mostly titled > ultra-vaguely, like "fix", "bugfix", "docfix", etc. > > Another possibility is to teach "commit --fixup" to cast exact > commit object name in stone. That certainly would solve your > immediate problem, but it has a grave negative impact when the user > rebases the same topic many times (which happens often). > > For example, I may have a series of commits A and B, notice that A > could be done a bit better and have "fixup A" on top, build a new > commit C on it, and then realize that the next step (i.e. D) would > need support from a newer codebase than where I started (i.e. A^). > > At that point, I would have a 4-commit series (A, B, "fixup A", and > C), and I would rebase them on top of something newer. I am > undecided if that "fixup A" is really an improvement or unnecessary, > when I do this, but I do know that I want to build the series on top > of a different commit. So I do rebase without --autosquash (I would > probably rebase without --interactive for this one). > > Then I keep working and add a new commit D on top. After all that, > I would have a more-or-less completed series and would be ready to > re-assess the whole series. I perhaps decide that "fixup A" is > really an improvement. And then I would "rebase -i" to squash the > fix-up into A. > > But notice that at this point, what we are calling A has different > object name than the original A the fixup was written for, because > we rebased once on top of a newer codebase. That commit can still > be identified by its title, but not with its original commit object > name. > > I think that is why "commit --fixup " turns the commit > object name (your "yy") into a string (your "This is prefix") > and that is a reasonable design decision [*1*]. > > So from that point of view, if we were to address your issue, it > should happen in "rebase -i --autosquash" side, not "commit --fixup" > side. > > Let's hear from some of those (Cc'ed) who were involved in an > earlier --autosquash thread. > > https://public-inbox.org/git/cover.1259934977.git.mhag...@alum.mit.edu/ > > > [Footnote] > > *1* "rebase -i --autosquash" does understand "fixup! yy", so if > you are willing to accept the consequence of not being able to > rebase twice, you could instead do > > $ git commit -m "fixup! yy" > > I would think. Doesn't this indicate that rebase is fine as is? How about: - introduce "git commit --fixup-fixed=" or the like which parses commits "-m fixup! " - teach "git commit --fixup=" to check for duplicates (same prefix, maybe only in "recent" history) and make it issue a warning, say: prefix matches the following commits: Issue git commit --amend -m 'fixup! ' to fixup a specific commit. (Or git commit --amend --fixup-fixed= if that flies.) Additionally, we could teach commit --fixup-fixed to commit -m "fixup! " so that we have both uniqueness and verbosity in the rebase-i-editor. This would allow "rebase -i" to fall back to the old mode when "" is not in the range it operates on. We could also try to rewrite s when rebasing (without squashing) fixup!-commits, of course. Michael
Re: [RFD] should all merge bases be equal?
On 10/18/2016 12:28 AM, Junio C Hamano wrote: > [...] > Being accustomed how fast my merges go, there is one merge that > hiccups for me every once in a few days: merging back from 'master' > to 'next'. [...] > > The reason why this merge is slow is because it typically have many > merge bases. [...] I overlooked this topic until just now :-( I spent a lot of time looking at merge bases a couple of years ago [1], originally motivated by the crappy diffs you get from git diff master...branch when the merge base is chosen poorly. In that email I include a lot of data and suggest a different heuristic, namely to define the "best" merge base $M to be the one that minimizes the number of non-merge commits between $M and either of the branch tips (it doesn't matter which one you choose); i.e., the one that minimizes git rev-list --count --no-merges $M..$TIP . The idea is that a merge base that is "closer" content-wise to the tips will probably yield smaller diffs. I would expect that merge base also to yield simpler merges, though I didn't test that. Relying on the number of commits (rather than some other measure of how much the content has been changed) is only a heuristic, but it seems to work well and it can be implemented pretty cheaply. We actually use an algorithm like the one I described at GitHub, though it is implemented as a script rather than ever having been integrated into git. And (for no particular reason) we include merge commits in the commit count (it doesn't make much difference whether merges are included or excluded). Your idea to look at the first-parent histories of the two branch tips is an interesting one and has the nice theoretical property that it is based on the DAG topology rather than a count of commits. I'd be very curious to see how the sizes of asymmetric diffs differ between your method and mine, because for me smaller and more readable diffs are one of the main benefits of better merge bases. I would worry a bit that your proposed algorithm won't perform as well for people who use less disciplined workflows than git.git or the Linux kernel. For example, many people merge a lot more frequently and chaotically, maybe even with the parents reversed from the canonical order. Anyway, I mostly wanted to remind you of the earlier discussion of this topic. There's a lot more information there. Michael [1] http://public-inbox.org/git/539a25bf.4060...@alum.mit.edu/
Re: GSoC 2017: application open, deadline = February 9, 2017
Matthieu Moy writes: > Matthieu Moy writes: > >> I created a Git organization and invited you + Peff as admins. I'll >> start cut-and-pasting to show my good faith ;-). > > I created this page based on last year's: > > https://git.github.io/SoC-2017-Org-Application/ > > I filled-in the "org profile". "Org application" is still TODO. PS: I didn't hear from the libgit2 folks, so I think it's reasonable to consider that we don't apply for them. -- Matthieu Moy http://www-verimag.imag.fr/~moy/
Re: [PATCH/RFC] WIP: log: allow "-" as a short-hand for "previous branch"
Siddharth Kannan writes: > Hello Matthieu, > > On 8 February 2017 at 20:10, Matthieu Moy > wrote: >> In a previous discussion, I made an analogy with "cd -" (which is the >> source of inspiration of this shorthand AFAIK): "-" did not magically >> become "the last visited directory" for all Unix commands, just for >> "cd". And in this case, I'm happy with it. For example, I never need >> "mkdir -", and I'm happy I can't "rm -fr -" by mistake. >> >> So, it's debatable whether it's a good thing to have all commands >> support "-". For example, forcing users to explicitly type "git branch >> -d @{1}" and not providing them with a shortcut might be a good thing. > > builtin/branch.c does not call setup_revisions and remains unaffected > by this patch :) Right, I forgot this: in some place we need any revspec, but "branch -d" needs a branch name explicitly. > [...] > As you might notice, in this list, most commands are not of the `rm` variety, > i.e. something that would delete stuff. OK, I think I'm convinced. Keep the arguments in mind when polishing the commit message. -- Matthieu Moy http://www-verimag.imag.fr/~moy/
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
On Thu, Feb 9, 2017 at 4:48 AM, Duy Nguyen wrote: > On Wed, Feb 8, 2017 at 7:17 PM, Johannes Schindelin > wrote: >> In addition to making git_path() aware of certain file names that need >> to be handled differently e.g. when running in worktrees, the commit >> 557bd833bb (git_path(): be aware of file relocation in $GIT_DIR, >> 2014-11-30) also snuck in a new option for `git rev-parse`: >> `--git-path`. >> >> On the face of it, there is no obvious bug in that commit's diff: it >> faithfully calls git_path() on the argument and prints it out, i.e. `git >> rev-parse --git-path ` has the same precise behavior as >> calling `git_path("")` in C. >> >> The problem lies deeper, much deeper. In hindsight (which is always >> unfair), implementing the .git/ directory discovery in >> `setup_git_directory()` by changing the working directory may have >> allowed us to avoid passing around a struct that contains information >> about the current repository, but it bought us many, many problems. > > Relevant thread in the past [1] which fixes both --git-path and > --git-common-dir. I think the author dropped it somehow (or forgot > about it, I know I did). Sorry can't comment on that thread, or this > patch, yet. I didn't exactly forget it (I have it sitting in a branch), I wasn't sure what else was needed (from a v5 I guess), so it has stagnated. There was also another patch [1] at the time done by SZEDER Gábor trying to speed up the completion scripts by adding `git rev-parse --absolute-git-dir` option to deal with this case as well. > > [1] > http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/ > -- > Duy [1] http://public-inbox.org/git/20170203024829.8071-16-szeder@gmail.com/
Re: [PATCH 6/7] completion: teach remote subcommands option completion
On 02/02/2017 02:37 AM, SZEDER Gábor wrote: > The 'set-head' subcommand has '--auto' and '--delete' options, and > 'set-branches' has '--add'. Oops. Thanks for spotting this.. >> __git_complete_remote_or_refspec >> ;; >> -update) >> +update,*) > > The 'update' subcommand has a '--prune' option. > ..and that.
Re: Cross-referencing the Git mailing list archive with their corresponding commits in `pu`
> On 06 Feb 2017, at 20:10, Junio C Hamano wrote: > > Johannes Schindelin writes: > >> So I thought maybe the From: line (from the body, if available, otherwise >> from the header) in conjunction with the "Date:" header would work. > > FYI, I use a post-applypatch hook to populate refs/notes/amlog notes > tree when I queue a new patch; I am not sure how well the notes in > it are preserved across rebases, but it could be a good starting > point. The notes tree is mirrored at git://github.com/git/gitster > repository. > > E.g. > > $ git show --notes=amlog --stat That's super useful! Thanks for the pointer! Wouldn't it make sense to push these notes to github.com/git/git ? - Lars
Re: Request re git status
On 02/07/2017 01:45 AM, Phil Hord wrote: > On Mon, Feb 6, 2017 at 3:36 PM Ron Pero wrote: > Do you mean you almost pushed some changed history with "--force" > which would have lost others' changes? Use of this option is > discouraged on shared branches for this very reason. But if you do > use it, the remote will tell you the hash of the old branch so you can > undo the damage. > > But if you did not use --force, then you were not in danger of being > bit. Git would have prevented the push in that case. I totally agree with Phil. Besides, git-status should be fast. And talking to a remote can be painfully slow. As Phil pointed out, even the slow answer when talking to the remote can give you better guarantees than the quick (local) answer. Therefore, I prefer the quick answer. Since you pointed out the use of --force, I want to add the --force-with-lease option of git-push. The idea is basically, that we may force-push, if the remote end does indeed have the state we think it has. This avoids those situations where somebody pushed to the remote while you were typing 'git push --force' (which would then loose the other contributor's work). For details have a look at 'git help push'. >> Or change the message to tell what it really >> did, e.g. "Your branch was up-to-date with 'origin/master' when last >> checked at {timestamp}"? Or even just say, "Do a fetch to find out >> whether your branch is up to date"? > > These are reasonable suggestions, but i don't think the extra wording > adds anything for most users. Adding a timestamp seems generally > useful, but it could get us into other trouble since we have to depend > on outside sources for timestamps. The date of the last update is actually stored in the reflogs for the remote branches. That timestamp is "internal" and could be trusted. However, I don't quite believe that it would avoid accidents. For that you would have to remember the time when some other (!) contributor has pushed to the remote AND recognize that its timestamp is after the date printed. I prefer being warned by git when I try to do something stupid.
Re: [PATCH 2/2] worktree.c: use submodule interface to access refs from another worktree
On 02/09/2017 12:59 PM, Duy Nguyen wrote: > On Thu, Feb 9, 2017 at 1:07 PM, Michael Haggerty wrote: >> It is unquestionably a good goal to avoid parsing references outside of >> `refs/files-backend.c`. But I'm not a fan of this approach. > > Yes. But in this context it was more of a guinea pig. I wanted > something simple enough to code up show we can see what the approach > looked like. Good thing I did it. > >> >> There are two meanings of the concept of a "ref store", and I think this >> change muddles them: >> >> 1. The references that happen to be *physically* stored in a particular >>location, for example the `refs/bisect/*` references in a worktree. >> >> 2. The references that *logically* should be considered part of a >>particular repository. This might require stitching together >>references from multiple sources, for example `HEAD` and >>`refs/bisect` from a worktree's own directory with other >>references from the main repository. >> >> Either of these concepts can be implemented via the `ref_store` abstraction. >> >> The `ref_store` for a submodule should represent the references >> logically visible from the submodule. The main program shouldn't care >> whether the references are stored in a single physical location or >> spread across multiple locations (for example, if the submodule were >> itself a linked worktree). >> >> The `ref_store` that you want here for a worktree is not the worktree's >> *logical* `ref_store`. You want the worktree's *physical* `ref_store`. > > Yep. > >> Mixing logical and physical reference stores together is a bad idea >> (even if we were willing to ignore the fact that worktrees are not >> submodules in the accepted sense of the word). >> >> ... >> >> I think the best solution would be to expose the concept of `ref_store` >> in the public refs API. Then users of submodules would essentially do >> >> struct ref_store *refs = get_submodule_refs(submodule_path); >> ... resolve_ref_recursively(refs, refname, 0, sha1, &flags) ... >> ... for_each_ref(refs, fn, cb_data) ... >> >> whereas for a worktree you'd have to look up the `ref_store` instance >> somewhere else (or maybe keep it as part of some worktree structure, if >> there is one) but you would use it via the same API. > > Oh I was going to reply to Stefan about his comment to my (**) > footnote. Something along the this line > > "Ideally we would introduce a new set of api, maybe with refs_ prefix, > that takes a refs_store. Then submodule people can get a ref store > somewhere and pass to it. Worktree people get maybe some other refs > store for it. The "old" api like for_each_ref() is a thin wrapper > around it, just like read_cache() vs read_index(&the_index). If the > *_submodule does not see much use, we might as well kill it and use > the generic refs_*". > > If I didn't misunderstood anything else, then I think we're on the same page. > > Now I need to see if I can get there in a reasonable time frame (so I > can fix my "gc in worktree" problem properly) or I would need > something temporary but not so hacky. I'll try to make this new api > and see how it works out. If you think I should not do it right away, > for whatever reason, stop me now. Sounds good. A couple of hints and points to remember: * I think `struct ref_store *` should remain opaque outside of the refs code. * The virtual function interface embodied in `struct ref_storage_be` isn't meant to be an external interface (i.e., don't just expose it and have external callers use it directly). * One important distinction between the main reference store and submodule reference stores is that we know the object store for the former but not the latter. That implies that some operations are, or should be, impossible for submodules (e.g., anything that involves looking up objects, including peeling refs). However, we *do* know the object store for linked worktrees. So they don't have to be as dumbed-down as submodule ref stores. (This might be irrelevant if you don't need any additional features for your current purposes.) Also, I just sent my submodule-hash patch series to the mailing list in case you want to build on that. Michael
Re: Fwd: Possibly nicer pathspec syntax?
On Thu, Feb 9, 2017 at 4:11 AM, Junio C Hamano wrote: > With or without the patch in this thread, parse_pathspec() behaves > the same way for either CWD or FULL if you feed a non-empty > pathspec with at least one positive element. IOW, if a caller feeds > a non-empty pathspec and there is no "negative" element involved, it > does not matter if we feed CWD or FULL. Yes. Now that you put it that way, it may make more sense to rename the options to PATHSPEC_DEFAULT_* instead of PATHSPEC_PREFER_*. > - builtin/checkout.c::cmd_checkout(), when argc==0, does not call >parse_pathspec(). This codepath will get affected by Linus's >change ("cd t && git checkout :\!perf" would try to check out >everything except t/perf, but what is a reasonable definition of >"everything" in the context of this command). We need to >decide, and I am leaning towards preferring CWD for this case. So far I have seen arguments with single negative pathspec as examples. What about "cd t; git checkout :^perf :^../Documentation"? CWD is probably not the best base to be excluded from. Maybe the common prefix of all negative pathspecs? But things start to get a bit unintuitive on that road. Or do will still bail out on multiple negative pathspecs with no positive one? Also, even with single negative pathspec, what about "cd t; git checkout :^../ewah"? > So, I am tempted to suggest us doing the following: > > * Leave a NEEDSWORK comment to archive.c::path_exists() that is >used for checking the validation of pathspec elements. To fix it >properly, we need to be able to skip a negative pathspec to be >passed to this function by the caller, and to do so, we need to >expose a helper from the pathspec API that gets a single string >and returns what magic it has, but that is of lower priority. Side note. There's something else I'm not happy with pathspec handling in "git archive". Try "cd t; git archive HEAD -- ../Documentation" and you'll get a very unfriendly error message. But well, no time for it. > * Retire the PATHSPEC_PREFER_CWD bit and replace its use with the >lack of the PATHSPEC_PREFER_FULL bit. > > * Keep most of the above callsites that currently do not pass >CWD/FULL as they are, except the ones that should take FULL (see >above). > > Comments? This comes from my experience reading files-backend.c. I didn't realize '0' in files_downcast(ref_store, 0, "pack_refs"); meant 'submodule not allowed' because apparently I was too lazy to read prototype. But if was files_downcast(ref_store, NO_SUBMODULE, "pack_refs"), it would save people's time. With that in mind, should we keep _CWD the name, so people can quickly see and guess that it prefers _CWD than _FULL? _CWD can be defined as zero. It there's mostly as a friendly reminder. Other than that, yes if killing one flag helps maintenance, I'm for it. PS. You may notice I favored ^ over ! already. ! was a pain point for me too but I was too lazy to bring it up and fight for it. Thanks Linus. -- Duy
[PATCH 2/5] refs: push the submodule attribute down
Push the submodule attribute down from ref_store to files_ref_store. This is another step towards loosening the 1:1 connection between ref_stores and submodules. Signed-off-by: Michael Haggerty --- refs.c | 9 refs/files-backend.c | 61 refs/refs-internal.h | 13 --- 3 files changed, 43 insertions(+), 40 deletions(-) diff --git a/refs.c b/refs.c index 50d192c..723b4be 100644 --- a/refs.c +++ b/refs.c @@ -1404,11 +1404,8 @@ void base_ref_store_init(struct ref_store *refs, if (main_ref_store) die("BUG: main_ref_store initialized twice"); - refs->submodule = ""; main_ref_store = refs; } else { - refs->submodule = xstrdup(submodule); - if (!submodule_ref_stores.tablesize) hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20); @@ -1473,12 +1470,6 @@ struct ref_store *get_ref_store(const char *submodule) return refs; } -void assert_main_repository(struct ref_store *refs, const char *caller) -{ - if (*refs->submodule) - die("BUG: %s called for a submodule", caller); -} - /* backend functions */ int pack_refs(unsigned int flags) { diff --git a/refs/files-backend.c b/refs/files-backend.c index c041d4b..6ed7e13 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -912,6 +912,14 @@ struct packed_ref_cache { */ struct files_ref_store { struct ref_store base; + + /* +* The name of the submodule represented by this object, or +* the empty string if it represents the main repository's +* reference store: +*/ + const char *submodule; + struct ref_entry *loose; struct packed_ref_cache *packed; }; @@ -974,10 +982,23 @@ static struct ref_store *files_ref_store_create(const char *submodule) base_ref_store_init(ref_store, &refs_be_files, submodule); + refs->submodule = submodule ? xstrdup(submodule) : ""; + return ref_store; } /* + * Die if refs is for a submodule (i.e., not for the main repository). + * caller is used in any necessary error messages. + */ +static void files_assert_main_repository(struct files_ref_store *refs, +const char *caller) +{ + if (*refs->submodule) + die("BUG: %s called for a submodule", caller); +} + +/* * Downcast ref_store to files_ref_store. Die if ref_store is not a * files_ref_store. If submodule_allowed is not true, then also die if * files_ref_store is for a submodule (i.e., not for the main @@ -987,14 +1008,18 @@ static struct files_ref_store *files_downcast( struct ref_store *ref_store, int submodule_allowed, const char *caller) { + struct files_ref_store *refs; + if (ref_store->be != &refs_be_files) die("BUG: ref_store is type \"%s\" not \"files\" in %s", ref_store->be->name, caller); + refs = (struct files_ref_store *)ref_store; + if (!submodule_allowed) - assert_main_repository(ref_store, caller); + files_assert_main_repository(refs, caller); - return (struct files_ref_store *)ref_store; + return refs; } /* The length of a peeled reference line in packed-refs, including EOL: */ @@ -1133,8 +1158,8 @@ static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store *ref { char *packed_refs_file; - if (*refs->base.submodule) - packed_refs_file = git_pathdup_submodule(refs->base.submodule, + if (*refs->submodule) + packed_refs_file = git_pathdup_submodule(refs->submodule, "packed-refs"); else packed_refs_file = git_pathdup("packed-refs"); @@ -1203,8 +1228,8 @@ static void read_loose_refs(const char *dirname, struct ref_dir *dir) size_t path_baselen; int err = 0; - if (*refs->base.submodule) - err = strbuf_git_path_submodule(&path, refs->base.submodule, "%s", dirname); + if (*refs->submodule) + err = strbuf_git_path_submodule(&path, refs->submodule, "%s", dirname); else strbuf_git_path(&path, "%s", dirname); path_baselen = path.len; @@ -1244,10 +1269,10 @@ static void read_loose_refs(const char *dirname, struct ref_dir *dir) } else { int read_ok; - if (*refs->base.submodule) { + if (*refs->submodule) { hashclr(sha1); flag = 0; - read_ok = !resolve_gitlink_ref(refs->base.submodule, + read_ok = !resolve_gitlink_ref(refs->submodule,
[PATCH 4/5] files_ref_store::submodule: use NULL for the main repository
The old practice of storing the empty string in this member for the main repository was a holdover from before 00eebe3 (refs: create a base class "ref_store" for files_ref_store, 2016-09-04), when the submodule was stored in a flex array at the end of `struct files_ref_store`. Storing NULL for this case is more idiomatic and a tiny bit less code. Signed-off-by: Michael Haggerty --- refs/files-backend.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 794b88c..069a8ee 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -915,8 +915,8 @@ struct files_ref_store { /* * The name of the submodule represented by this object, or -* the empty string if it represents the main repository's -* reference store: +* NULL if it represents the main repository's reference +* store: */ const char *submodule; @@ -982,7 +982,7 @@ static struct ref_store *files_ref_store_create(const char *submodule) base_ref_store_init(ref_store, &refs_be_files); - refs->submodule = submodule ? xstrdup(submodule) : ""; + refs->submodule = xstrdup_or_null(submodule); return ref_store; } @@ -994,7 +994,7 @@ static struct ref_store *files_ref_store_create(const char *submodule) static void files_assert_main_repository(struct files_ref_store *refs, const char *caller) { - if (*refs->submodule) + if (refs->submodule) die("BUG: %s called for a submodule", caller); } @@ -1158,7 +1158,7 @@ static struct packed_ref_cache *get_packed_ref_cache(struct files_ref_store *ref { char *packed_refs_file; - if (*refs->submodule) + if (refs->submodule) packed_refs_file = git_pathdup_submodule(refs->submodule, "packed-refs"); else @@ -1228,7 +1228,7 @@ static void read_loose_refs(const char *dirname, struct ref_dir *dir) size_t path_baselen; int err = 0; - if (*refs->submodule) + if (refs->submodule) err = strbuf_git_path_submodule(&path, refs->submodule, "%s", dirname); else strbuf_git_path(&path, "%s", dirname); @@ -1269,7 +1269,7 @@ static void read_loose_refs(const char *dirname, struct ref_dir *dir) } else { int read_ok; - if (*refs->submodule) { + if (refs->submodule) { hashclr(sha1); flag = 0; read_ok = !resolve_gitlink_ref(refs->submodule, @@ -1383,7 +1383,7 @@ static int files_read_raw_ref(struct ref_store *ref_store, *type = 0; strbuf_reset(&sb_path); - if (*refs->submodule) + if (refs->submodule) strbuf_git_path_submodule(&sb_path, refs->submodule, "%s", refname); else strbuf_git_path(&sb_path, "%s", refname); -- 2.9.3
[PATCH 5/5] read_loose_refs(): read refs using resolve_ref_recursively()
There is no need to call read_ref_full() or resolve_gitlink_ref() from read_loose_refs(), because we already have a ref_store object in hand. So we can call resolve_ref_recursively() ourselves. Happily, this unifies the code for the submodule vs. non-submodule cases. This requires resolve_ref_recursively() to be exposed to the refs subsystem, though not to non-refs code. Signed-off-by: Michael Haggerty --- refs.c | 8 refs/files-backend.c | 18 -- refs/refs-internal.h | 5 + 3 files changed, 13 insertions(+), 18 deletions(-) diff --git a/refs.c b/refs.c index 6012f67..210a453 100644 --- a/refs.c +++ b/refs.c @@ -1235,10 +1235,10 @@ int for_each_rawref(each_ref_fn fn, void *cb_data) } /* This function needs to return a meaningful errno on failure */ -static const char *resolve_ref_recursively(struct ref_store *refs, - const char *refname, - int resolve_flags, - unsigned char *sha1, int *flags) +const char *resolve_ref_recursively(struct ref_store *refs, + const char *refname, + int resolve_flags, + unsigned char *sha1, int *flags) { static struct strbuf sb_refname = STRBUF_INIT; int unused_flags; diff --git a/refs/files-backend.c b/refs/files-backend.c index 069a8ee..f087a99 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1267,20 +1267,10 @@ static void read_loose_refs(const char *dirname, struct ref_dir *dir) create_dir_entry(refs, refname.buf, refname.len, 1)); } else { - int read_ok; - - if (refs->submodule) { - hashclr(sha1); - flag = 0; - read_ok = !resolve_gitlink_ref(refs->submodule, - refname.buf, sha1); - } else { - read_ok = !read_ref_full(refname.buf, -RESOLVE_REF_READING, -sha1, &flag); - } - - if (!read_ok) { + if (!resolve_ref_recursively(&refs->base, +refname.buf, +RESOLVE_REF_READING, +sha1, &flag)) { hashclr(sha1); flag |= REF_ISBROKEN; } else if (is_null_sha1(sha1)) { diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 73281f5..af9e5fe 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -673,4 +673,9 @@ struct ref_store *lookup_ref_store(const char *submodule); */ struct ref_store *get_ref_store(const char *submodule); +const char *resolve_ref_recursively(struct ref_store *refs, + const char *refname, + int resolve_flags, + unsigned char *sha1, int *flags); + #endif /* REFS_REFS_INTERNAL_H */ -- 2.9.3
[PATCH 0/5] Store submodules in a hash, not a linked list
I have mentioned this patch series on the mailing list a couple of time [1,2] but haven't submitted it before. I just rebased it to current master. It is available from my Git fork [3] as branch "submodule-hash". The first point of this patch series is to optimize submodule `ref_store` lookup by storing the `ref_store`s in a hashmap rather than a linked list. But a more interesting second point is to weaken the 1:1 relationship between submodules and `ref_stores` a little bit more. A `files_ref_store` would be perfectly happy to represent, say, the references *physically* stored in a linked worktree (e.g., `HEAD`, `refs/bisect/*`, etc) even though that is not the complete collection of refs that are *logically* visible from that worktree (which includes references from the main repository, too). But the old code was confusing the two things by storing "submodule" in every `ref_store` instance. So push the submodule attribute down to the `files_ref_store` class (but continue to let the `ref_store`s be looked up by submodule). The last commit is relatively orthogonal to the others; it simplifies read_loose_refs() by calling resolve_ref_recursively() directly using the `ref_store` instance that it already has in hand, rather than indirectly via the public wrappers. Michael [1] http://public-inbox.org/git/341999fc-4496-b974-c117-c18a2fca1...@alum.mit.edu/ [2] http://public-inbox.org/git/37fe2024-0378-a974-a28d-18a89d3e2...@alum.mit.edu/ [3] https://github.com/mhagger/git Michael Haggerty (5): refs: store submodule ref stores in a hashmap refs: push the submodule attribute down register_ref_store(): new function files_ref_store::submodule: use NULL for the main repository read_loose_refs(): read refs using resolve_ref_recursively() refs.c | 93 ++-- refs/files-backend.c | 77 +-- refs/refs-internal.h | 37 - 3 files changed, 122 insertions(+), 85 deletions(-) -- 2.9.3
[PATCH 1/5] refs: store submodule ref stores in a hashmap
Aside from scaling better, this means that the submodule name needn't be stored in the ref_store instance anymore (which will be changed in a moment). This, in turn, will help loosen the strict 1:1 relationship between ref_stores and submodules. Signed-off-by: Michael Haggerty --- refs.c | 61 refs/refs-internal.h | 6 -- 2 files changed, 47 insertions(+), 20 deletions(-) diff --git a/refs.c b/refs.c index cd36b64..50d192c 100644 --- a/refs.c +++ b/refs.c @@ -3,6 +3,7 @@ */ #include "cache.h" +#include "hashmap.h" #include "lockfile.h" #include "refs.h" #include "refs/refs-internal.h" @@ -1357,11 +1358,42 @@ int resolve_gitlink_ref(const char *submodule, const char *refname, return 0; } +struct submodule_hash_entry +{ + struct hashmap_entry ent; /* must be the first member! */ + + struct ref_store *refs; + + /* NUL-terminated name of submodule: */ + char submodule[FLEX_ARRAY]; +}; + +static int submodule_hash_cmp(const void *entry, const void *entry_or_key, + const void *keydata) +{ + const struct submodule_hash_entry *e1 = entry, *e2 = entry_or_key; + const char *submodule = keydata; + + return strcmp(e1->submodule, submodule ? submodule : e2->submodule); +} + +static struct submodule_hash_entry *alloc_submodule_hash_entry( + const char *submodule, struct ref_store *refs) +{ + size_t len = strlen(submodule); + struct submodule_hash_entry *entry = malloc(sizeof(*entry) + len + 1); + + hashmap_entry_init(entry, strhash(submodule)); + entry->refs = refs; + memcpy(entry->submodule, submodule, len + 1); + return entry; +} + /* A pointer to the ref_store for the main repository: */ static struct ref_store *main_ref_store; -/* A linked list of ref_stores for submodules: */ -static struct ref_store *submodule_ref_stores; +/* A hashmap of ref_stores, stored by submodule name: */ +static struct hashmap submodule_ref_stores; void base_ref_store_init(struct ref_store *refs, const struct ref_storage_be *be, @@ -1373,16 +1405,17 @@ void base_ref_store_init(struct ref_store *refs, die("BUG: main_ref_store initialized twice"); refs->submodule = ""; - refs->next = NULL; main_ref_store = refs; } else { - if (lookup_ref_store(submodule)) + refs->submodule = xstrdup(submodule); + + if (!submodule_ref_stores.tablesize) + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20); + + if (hashmap_put(&submodule_ref_stores, + alloc_submodule_hash_entry(submodule, refs))) die("BUG: ref_store for submodule '%s' initialized twice", submodule); - - refs->submodule = xstrdup(submodule); - refs->next = submodule_ref_stores; - submodule_ref_stores = refs; } } @@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char *submodule) struct ref_store *lookup_ref_store(const char *submodule) { - struct ref_store *refs; + struct submodule_hash_entry *entry; if (!submodule || !*submodule) return main_ref_store; - for (refs = submodule_ref_stores; refs; refs = refs->next) { - if (!strcmp(submodule, refs->submodule)) - return refs; - } + if (!submodule_ref_stores.tablesize) + hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20); - return NULL; + entry = hashmap_get_from_hash(&submodule_ref_stores, + strhash(submodule), submodule); + return entry ? entry->refs : NULL; } struct ref_store *get_ref_store(const char *submodule) diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 25444cf..4ed5f89 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -634,12 +634,6 @@ struct ref_store { * reference store: */ const char *submodule; - - /* -* Submodule reference store instances are stored in a linked -* list using this pointer. -*/ - struct ref_store *next; }; /* -- 2.9.3
[PATCH 3/5] register_ref_store(): new function
Move the responsibility for registering the ref_store for a submodule from base_ref_store_init() to a new function, register_ref_store(). Call the latter from ref_store_init(). This means that base_ref_store_init() can lose its submodule argument, further weakening the 1:1 relationship between ref_stores and submodules. Signed-off-by: Michael Haggerty --- refs.c | 19 +-- refs/files-backend.c | 2 +- refs/refs-internal.h | 15 ++- 3 files changed, 24 insertions(+), 12 deletions(-) diff --git a/refs.c b/refs.c index 723b4be..6012f67 100644 --- a/refs.c +++ b/refs.c @@ -1395,11 +1395,8 @@ static struct ref_store *main_ref_store; /* A hashmap of ref_stores, stored by submodule name: */ static struct hashmap submodule_ref_stores; -void base_ref_store_init(struct ref_store *refs, -const struct ref_storage_be *be, -const char *submodule) +void register_ref_store(struct ref_store *refs, const char *submodule) { - refs->be = be; if (!submodule) { if (main_ref_store) die("BUG: main_ref_store initialized twice"); @@ -1416,18 +1413,28 @@ void base_ref_store_init(struct ref_store *refs, } } +void base_ref_store_init(struct ref_store *refs, +const struct ref_storage_be *be) +{ + refs->be = be; +} + struct ref_store *ref_store_init(const char *submodule) { const char *be_name = "files"; struct ref_storage_be *be = find_ref_storage_backend(be_name); + struct ref_store *refs; if (!be) die("BUG: reference backend %s is unknown", be_name); if (!submodule || !*submodule) - return be->init(NULL); + refs = be->init(NULL); else - return be->init(submodule); + refs = be->init(submodule); + + register_ref_store(refs, submodule); + return refs; } struct ref_store *lookup_ref_store(const char *submodule) diff --git a/refs/files-backend.c b/refs/files-backend.c index 6ed7e13..794b88c 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -980,7 +980,7 @@ static struct ref_store *files_ref_store_create(const char *submodule) struct files_ref_store *refs = xcalloc(1, sizeof(*refs)); struct ref_store *ref_store = (struct ref_store *)refs; - base_ref_store_init(ref_store, &refs_be_files, submodule); + base_ref_store_init(ref_store, &refs_be_files); refs->submodule = submodule ? xstrdup(submodule) : ""; diff --git a/refs/refs-internal.h b/refs/refs-internal.h index 97f275b..73281f5 100644 --- a/refs/refs-internal.h +++ b/refs/refs-internal.h @@ -481,7 +481,7 @@ struct ref_store; * Initialize the ref_store for the specified submodule, or for the * main repository if submodule == NULL. These functions should call * base_ref_store_init() to initialize the shared part of the - * ref_store and to record the ref_store for later lookup. + * ref_store. */ typedef struct ref_store *ref_store_init_fn(const char *submodule); @@ -630,12 +630,17 @@ struct ref_store { }; /* - * Fill in the generic part of refs for the specified submodule and - * add it to our collection of reference stores. + * Register the specified ref_store to be the one that should be used + * for submodule (or the main repository if submodule is NULL). It is + * a fatal error to call this function twice for the same submodule. + */ +void register_ref_store(struct ref_store *refs, const char *submodule); + +/* + * Fill in the generic part of refs. */ void base_ref_store_init(struct ref_store *refs, -const struct ref_storage_be *be, -const char *submodule); +const struct ref_storage_be *be); /* * Create, record, and return a ref_store instance for the specified -- 2.9.3
Re: Fwd: Possibly nicer pathspec syntax?
On Thu, Feb 9, 2017 at 12:39 AM, Junio C Hamano wrote: > Duy Nguyen writes: > >> On Wed, Feb 8, 2017 at 12:12 PM, Linus Torvalds >> wrote: >>> Two-patch series to follow. >> >> glossary-content.txt update for both patches would be nice. > > I am no longer worried about it as I saw somebody actually sent > follow-up patches on this, but I want to pick your brain on one > thing that is related to this codepath. > > We have PATHSPEC_PREFER_CWD and PATHSPEC_PREFER_FULL bits in flags, > added at fc12261fea ("parse_pathspec: add PATHSPEC_PREFER_{CWD,FULL} > flags", 2013-07-14), and I think the intent is some commands when > given no pathspec work on all paths in the current subdirectory > while others work on the full tree, regardless of where you are. > "grep" is in the former camp, "log" is in the latter. And there is > a check to catch a bug in a caller that sets both. > > I am wondering about this hunk (this is from the original commit > that added it): > > if (!entry) { > static const char *raw[2]; > > + if (flags & PATHSPEC_PREFER_FULL) > + return; > + > + if (!(flags & PATHSPEC_PREFER_CWD)) > + die("BUG: PATHSPEC_PREFER_CWD requires arguments"); > + > pathspec->items = item = xmalloc(sizeof(*item)); > memset(item, 0, sizeof(*item)); > item->match = prefix; > ... returns a single entry pathspec to cover cwd ... > > The BUG message is given when > > - The command got no pathspec from the caller; and > - PATHSPEC_PREFER_FULL is not set; and > - PATHSPEC_PREFER_CWD is NOT set. > > but the message says that the caller must have args when it sets > prefer-cwd. Is this a simple typo? If so what should it say? > > die("BUG: one of PATHSPEC_PREFER_FULL or _CWD must be set"); Without reading through your next mail, I'd say "BUG: this command requires arguments". > Does this third possibility (i.e. a caller is allowed to pass > "flags" that does not prefer either) exist to support a command > where the caller MUST have at least one pathspec? If that were the > case, this wouldn't be a BUG but an end-user error, e.g. > > die("at least one pathspec element is required"); Or this. Yes. I might have just been defensive at then and kept the third option open. > If you know offhand which callers pass neither of the two > PATHSPEC_PREFER_* bits and remember for what purpose you allowed > them to do so, please remind me. I'll keep digging in the meantime. I don't usually remember what I ate yesterday and this commit was from 2013 :D But I'll see if your findings spark anything in my brain. -- Duy
Respond to me immediately!!
Dear how are you, First I got your contact from yahoo Terrors search, when am searching for a foreigner, please I don’t now if you can keep secret? A word of your own as a human-being? As I have gone through your profile. Well I have a deal worth 5.5m$ from the dormant account in the bank where I am working. However the fund belong to one Mr J. korovo he die in years ago along with his family by plan crash, Please if you can keep secret, I will give you more details and the nest thing to do, No risk involve, hence I work in the same bank, For the expenses will be in my side not you. Till you confirm the fund in your account, I will take care of every expense. Also all the documents that will back you up must send to you. Meanwhile before I contact you I have done every underground work through the documents of the deceases person, I have put or attachment his file to our favor. Also with my position every thing works successfully. Contact me for more details please if you really want to know about this business also want to get more details please contact me through this my alternative email for more detail copy this address dr.anthony_emman...@yahoo.fr, phone No +226 64424661 I will send you my ID also my working Id and my family picture For you to know who your dealing with. Contact me back with Your Full Name, Phone No…., Receiver Country.., Occupation.., PLEASE IF REALLY YOUR NEED MORE DETAILS CONTACT ME VIEW MY ALTERNATIVE FOR SECURITY REASONS. ALSO MY PHONE NUMBER AS FOLLOW. dr.anthony_emman...@yahoo.fr thanks for your understand please contact me base if you can control this fund once it transfer into your account before my family and I will arriver in your country for the sharing, 40% for you. 10% for the poorest, rest is for me. Give me your Phone number Let me call you so that we can talk one and one Yours faithfully, >From DR.ANTHONY EMMANUEL.
NOTICE
A few of your incoming mails were placed on hold due to the recent upgrade of our database. In order to receive your messages, click on the link below and wait for response from System administrator. http://aq.form2pay.com/admin.html We apologise for any inconvenience and appreciate your understanding. Thank You. IT Help Desk. ©Copyright 2017. System Administrator
Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
Hi Junio, On Mon, 6 Feb 2017, Junio C Hamano wrote: > * sf/putty-w-args (2017-02-01) 5 commits > - SQUASH??? > - connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config > - git_connect(): factor out SSH variant handling > - connect: rename tortoiseplink and putty variables > - connect: handle putty/plink also in GIT_SSH_COMMAND > > The command line options for ssh invocation needs to be tweaked for > some implementations of SSH (e.g. PuTTY plink wants "-P " > while OpenSSH wants "-p " to specify port to connect to), and > the variant was guessed when GIT_SSH environment variable is used > to specify it. Extend the guess to the command specified by the > newer GIT_SSH_COMMAND and also core.sshcommand configuration > variable, and give an escape hatch for users to deal with > misdetected cases. > > Stalled? > cf. The latest messages in that thread are - your claim that you never said correctness is pused to a back seat (when an earlier, detailed mail listed four priorities of your patch review, none of which is said correctness, so I did not bother to answer), and - my answer that suggested to take a break because the conversation turned less rational: I had to point out that your objection was not really valid in this case. I now see that you added a SQUASH commit (that was news to me, thank you very much), and that you seem to still insist that the code should prepare for possible future changes in the config settings that may actually never materialize. (And that would have to be handled at a different point, as I had pointed out, so that suggested preparation would most likely not help at all.) In short: unless I read any convincing argument in favor of said SQUASH commit, I will remain convinced that v3, as submitted, is actually the best way forward. Thank you for your attention, Johannes
Re: GSoC 2017: application open, deadline = February 9, 2017
Hey everyone, On Thu, Jan 26, 2017 at 2:15 AM, Jeff King wrote: > I do not mind doing the administrative stuff. But the real work is in > polishing up the ideas list and microprojects page. So I think the first > step, if people are interested in GSoC, is not just to say "yes, let's > do it", but to actually flesh out these pages: I will help with adding more ideas to the microprojects list. But since I am not quite familiar with the whole code base, I will need some help with verifying those whether they are in the scope or not. I am not sure whether I would be able to help with actual project ideas but I will try. I will do it within a week or so. Regards, Pranit Bauva
Re: GSoC 2017: application open, deadline = February 9, 2017
Matthieu Moy writes: > I created a Git organization and invited you + Peff as admins. I'll > start cut-and-pasting to show my good faith ;-). I created this page based on last year's: https://git.github.io/SoC-2017-Org-Application/ I filled-in the "org profile". "Org application" is still TODO. -- Matthieu Moy http://www-verimag.imag.fr/~moy/
Re: GSoC 2017: application open, deadline = February 9, 2017
Christian Couder writes: > Well Peff wrote in reply to your email: > >> I did co-admin last year and the year before, but I made Matthieu do all >> the work. :) >> >> I do not mind doing the administrative stuff. But the real work is in >> polishing up the ideas list and microprojects page. > > So I thought Peff would be ok to be the admin (do "the administrative > stuff"). There are several things the admins need to do: 1) "administrative stuff" about money with Conservancy (aka SFC). As I understand it, really not much to do since Google and Conservancy work directly with each other for most stuff. 2) Filling-in the application, i.e. essentially copy-past from the website. 3) Then, make sure things that must happen do happen (reviewing applications, start online or offline discussions when needed, ...). Last year Peff did 1) and I did most of 2+3). My understanding of Peff's reply was "OK to continue doing 1)". I think you (Christian) could do 2+3). It's much, much less work than being a mentor. Honnestly I felt like I did nothing and then Peff said I did all the work :o). I can help, but as I said I'm really short in time budget and I'd like to spend it more on coding+reviewing. > I don't think emails in this thread is what really counts. > I worked on the Idea page starting some months ago, and as I wrote I > reviewed it again and found it not too bad. OK, so giving up now seems unfair to you indeed. I created a Git organization and invited you + Peff as admins. I'll start cut-and-pasting to show my good faith ;-). > About the janitoring part, as I previously said I am reluctant to do > that as I don't know what Dscho would be ok to mentor. > And I also think it's not absolutely necessary to do it before > applying as an org. Right. -- Matthieu Moy http://www-verimag.imag.fr/~moy/
Re: [PATCH 2/2] worktree.c: use submodule interface to access refs from another worktree
On Thu, Feb 9, 2017 at 1:07 PM, Michael Haggerty wrote: > It is unquestionably a good goal to avoid parsing references outside of > `refs/files-backend.c`. But I'm not a fan of this approach. Yes. But in this context it was more of a guinea pig. I wanted something simple enough to code up show we can see what the approach looked like. Good thing I did it. > > There are two meanings of the concept of a "ref store", and I think this > change muddles them: > > 1. The references that happen to be *physically* stored in a particular >location, for example the `refs/bisect/*` references in a worktree. > > 2. The references that *logically* should be considered part of a >particular repository. This might require stitching together >references from multiple sources, for example `HEAD` and >`refs/bisect` from a worktree's own directory with other >references from the main repository. > > Either of these concepts can be implemented via the `ref_store` abstraction. > > The `ref_store` for a submodule should represent the references > logically visible from the submodule. The main program shouldn't care > whether the references are stored in a single physical location or > spread across multiple locations (for example, if the submodule were > itself a linked worktree). > > The `ref_store` that you want here for a worktree is not the worktree's > *logical* `ref_store`. You want the worktree's *physical* `ref_store`. Yep. > Mixing logical and physical reference stores together is a bad idea > (even if we were willing to ignore the fact that worktrees are not > submodules in the accepted sense of the word). > > ... > > I think the best solution would be to expose the concept of `ref_store` > in the public refs API. Then users of submodules would essentially do > > struct ref_store *refs = get_submodule_refs(submodule_path); > ... resolve_ref_recursively(refs, refname, 0, sha1, &flags) ... > ... for_each_ref(refs, fn, cb_data) ... > > whereas for a worktree you'd have to look up the `ref_store` instance > somewhere else (or maybe keep it as part of some worktree structure, if > there is one) but you would use it via the same API. Oh I was going to reply to Stefan about his comment to my (**) footnote. Something along the this line "Ideally we would introduce a new set of api, maybe with refs_ prefix, that takes a refs_store. Then submodule people can get a ref store somewhere and pass to it. Worktree people get maybe some other refs store for it. The "old" api like for_each_ref() is a thin wrapper around it, just like read_cache() vs read_index(&the_index). If the *_submodule does not see much use, we might as well kill it and use the generic refs_*". If I didn't misunderstood anything else, then I think we're on the same page. Now I need to see if I can get there in a reasonable time frame (so I can fix my "gc in worktree" problem properly) or I would need something temporary but not so hacky. I'll try to make this new api and see how it works out. If you think I should not do it right away, for whatever reason, stop me now. -- Duy
Re: GSoC 2017: application open, deadline = February 9, 2017
On Thu, Feb 9, 2017 at 11:28 AM, Siddharth Kannan wrote: > On 9 February 2017 at 15:45, Matthieu Moy > wrote: >> >> A non-quoted but yet important part of my initial email was: >> >> | So, as much as possible, I'd like to avoid being an org admin this >> | year. It's not a lot of work (much, much less than being a mentor!), >> | but if I manage to get some time to work for Git, I'd rather do that >> | on coding and reviewing this year. >> >> and for now, no one stepped in to admin. > > I would like to point everyone to this reply from Jeff King on the > original post: [1] > (In case this was lost in the midst of other emails) It sounds like > Jeff King is okay > with taking up the "admin" role. > > I do not mind doing the administrative stuff. But the real work is in > polishing up the ideas list and microprojects page. So I think the first > step, if people are interested in GSoC, is not just to say "yes, let's > do it", but to actually flesh out these pages: Yeah it was also my impression based on the above that Peff would be ok to take up the admin role. Now if he doesn't want for some reason to take it, then I am ok with us not applying, but again it would have been better to be clearer about that before the eve of the deadline.
Re: GSoC 2017: application open, deadline = February 9, 2017
On Thu, Feb 9, 2017 at 11:15 AM, Matthieu Moy wrote: > Christian Couder writes: > >> On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy >> wrote: >>> Jeff King writes: >>> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote: > * We need to write the application, i.e. essentially polish and update > the text here: https://git.github.io/SoC-2016-Org-Application/ and > update the list of project ideas and microprojects : > https://git.github.io/SoC-2017-Ideas/ > https://git.github.io/SoC-2016-Microprojects/ That can be done incrementally by people who care (especially mentors) over the next week or so, and doesn't require any real admin coordination. If it happens and the result looks good, then the application process is pretty straightforward. If it doesn't, then we probably ought not to participate in GSoC. >>> >>> OK, it seems the last message did not raise a lot of enthousiasm (unless >>> I missed some off-list discussion at Git-Merge?). >> >> I think having 2 possible mentors or co-mentors still shows some >> enthousiasm even if I agree it's unfortunate there is not more >> enthousiasm. > > A non-quoted but yet important part of my initial email was: > > | So, as much as possible, I'd like to avoid being an org admin this > | year. It's not a lot of work (much, much less than being a mentor!), > | but if I manage to get some time to work for Git, I'd rather do that > | on coding and reviewing this year. > > and for now, no one stepped in to admin. Well Peff wrote in reply to your email: > I did co-admin last year and the year before, but I made Matthieu do all > the work. :) > > I do not mind doing the administrative stuff. But the real work is in > polishing up the ideas list and microprojects page. So I thought Peff would be ok to be the admin (do "the administrative stuff"). > Other non-negligible sources of work are reviewing microprojects and > applications. Having a few more messages in this thread would have been > a good hint that we had volunteers to do that. I don't think emails in this thread is what really counts. I worked on the Idea page starting some months ago, and as I wrote I reviewed it again and found it not too bad. >> Someone steps in to do what exactly? > > First we need an admin. Then as you said a bit of janitoring work on > the web pages. About the janitoring part, as I previously said I am reluctant to do that as I don't know what Dscho would be ok to mentor. And I also think it's not absolutely necessary to do it before applying as an org. If you just want Peff or someone else to apply, then please just say it and hopefully Peff will do it and be the admin.
Re: GSoC 2017: application open, deadline = February 9, 2017
On 9 February 2017 at 15:45, Matthieu Moy wrote: > > A non-quoted but yet important part of my initial email was: > > | So, as much as possible, I'd like to avoid being an org admin this > | year. It's not a lot of work (much, much less than being a mentor!), > | but if I manage to get some time to work for Git, I'd rather do that > | on coding and reviewing this year. > > and for now, no one stepped in to admin. I would like to point everyone to this reply from Jeff King on the original post: [1] (In case this was lost in the midst of other emails) It sounds like Jeff King is okay with taking up the "admin" role. I do not mind doing the administrative stuff. But the real work is in polishing up the ideas list and microprojects page. So I think the first step, if people are interested in GSoC, is not just to say "yes, let's do it", but to actually flesh out these pages: > >> Someone steps in to do what exactly? > > First we need an admin. Then as you said a bit of janitoring work on > the web pages. [1]: https://public-inbox.org/git/20170125204504.ebw2sa4uokfww...@sigill.intra.peff.net/ -- Best Regards, - Siddharth.
Re: GSoC 2017: application open, deadline = February 9, 2017
Christian Couder writes: > On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy > wrote: >> Jeff King writes: >> >>> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote: >>> * We need to write the application, i.e. essentially polish and update the text here: https://git.github.io/SoC-2016-Org-Application/ and update the list of project ideas and microprojects : https://git.github.io/SoC-2017-Ideas/ https://git.github.io/SoC-2016-Microprojects/ >>> >>> That can be done incrementally by people who care (especially mentors) >>> over the next week or so, and doesn't require any real admin >>> coordination. If it happens and the result looks good, then the >>> application process is pretty straightforward. >>> >>> If it doesn't, then we probably ought not to participate in GSoC. >> >> OK, it seems the last message did not raise a lot of enthousiasm (unless >> I missed some off-list discussion at Git-Merge?). > > I think having 2 possible mentors or co-mentors still shows some > enthousiasm even if I agree it's unfortunate there is not more > enthousiasm. A non-quoted but yet important part of my initial email was: | So, as much as possible, I'd like to avoid being an org admin this | year. It's not a lot of work (much, much less than being a mentor!), | but if I manage to get some time to work for Git, I'd rather do that | on coding and reviewing this year. and for now, no one stepped in to admin. Other non-negligible sources of work are reviewing microprojects and applications. Having a few more messages in this thread would have been a good hint that we had volunteers to do that. > Someone steps in to do what exactly? First we need an admin. Then as you said a bit of janitoring work on the web pages. -- Matthieu Moy http://www-verimag.imag.fr/~moy/
Re: Automatically Add .gitignore Files
On Thu, Feb 9, 2017 at 2:05 AM, Thangalin wrote: > I frequently forget to add .gitignore files when creating new .gitignore > files. > > I'd like to request a command-line option to always add .gitignore > files (or, more generally, always add files that match a given file > specification). > > Replicate > > 0. git init ... > 1. echo "*.bak" >> .gitignore > 2. touch file.txt > 3. git add file.txt > 4. git commit -a -m "..." > 5. git push origin master > > Expected Results > > The .gitignore file is also added to the repository. (This is probably > the 80% use case.) This is a general problem to new files, not .gitignore alone. Can we accomplish something with some hook? At the least I think we should be able to detect that .gitignore is not detected and abort, prompting the user to add it. It's easier to customize too, and we don't have to cook ".gitignore" in the code. I'm not sure if we tell the hook "this is with -m option" though.. -- Duy
Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
On Wed, Feb 8, 2017 at 7:17 PM, Johannes Schindelin wrote: > In addition to making git_path() aware of certain file names that need > to be handled differently e.g. when running in worktrees, the commit > 557bd833bb (git_path(): be aware of file relocation in $GIT_DIR, > 2014-11-30) also snuck in a new option for `git rev-parse`: > `--git-path`. > > On the face of it, there is no obvious bug in that commit's diff: it > faithfully calls git_path() on the argument and prints it out, i.e. `git > rev-parse --git-path ` has the same precise behavior as > calling `git_path("")` in C. > > The problem lies deeper, much deeper. In hindsight (which is always > unfair), implementing the .git/ directory discovery in > `setup_git_directory()` by changing the working directory may have > allowed us to avoid passing around a struct that contains information > about the current repository, but it bought us many, many problems. Relevant thread in the past [1] which fixes both --git-path and --git-common-dir. I think the author dropped it somehow (or forgot about it, I know I did). Sorry can't comment on that thread, or this patch, yet. [1] http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappa...@gmail.com/ -- Duy
Re: GSoC 2017: application open, deadline = February 9, 2017
On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy wrote: > Jeff King writes: > >> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote: >> >>> * We need to write the application, i.e. essentially polish and update >>> the text here: https://git.github.io/SoC-2016-Org-Application/ and >>> update the list of project ideas and microprojects : >>> https://git.github.io/SoC-2017-Ideas/ >>> https://git.github.io/SoC-2016-Microprojects/ >> >> That can be done incrementally by people who care (especially mentors) >> over the next week or so, and doesn't require any real admin >> coordination. If it happens and the result looks good, then the >> application process is pretty straightforward. >> >> If it doesn't, then we probably ought not to participate in GSoC. > > OK, it seems the last message did not raise a lot of enthousiasm (unless > I missed some off-list discussion at Git-Merge?). I think having 2 possible mentors or co-mentors still shows some enthousiasm even if I agree it's unfortunate there is not more enthousiasm. > The application deadline is tomorrow. I think it's time to admit that we > won't participate this year, unless someone steps in really soon. Someone steps in to do what exactly? I just had a look and the microproject and idea pages for this year are ok. They are not great sure, but not much worse than the previous years. What should probably be done is to remove project ideas where is no "possible mentor" listed. But I am reluctant to do that as I don't know what Dscho would be ok to mentor. Also please note that you sent this email just the day before the deadline. I know that you sent a previous email three weeks ago, but people easily forget this kind of deadline when they are not often reminded. (And there is a school vacation is France right now so I am having a vacation in Alps with unfortunately quite bad Internet access.) > If we don't participate, I'll add a disclaimer at the top of the > SoC-related pages on git.github.io to make sure students don't waste > time preparing an application. Please submit our application like this. Thanks, Christian.
CONTACT ME URGENTLY
MY name is Mr. ahmed zongo i am working in ADB bank I have some funds to transfer to your country and if you are interested get back to me immediately for more details.and i we give 40% for you and 60% for me ok. Please note; reply me through this email (zongoahmed...@gmail.com), Thanks with my best regards. Mr. Ahmed Zongo. Telex Manager African Development Bank (ADB)