Re: [RFC PATCHv2 09/17] update submodules: add scheduling to update submodules
On Mon, Dec 5, 2016 at 3:37 PM, David Turner <david.tur...@twosigma.com> wrote: > This patch confuses me -- see below. > >> -Original Message- >> From: Stefan Beller [mailto:sbel...@google.com] >> Sent: Friday, December 02, 2016 7:30 PM >> To: bmw...@google.com; David Turner >> Cc: git@vger.kernel.org; sand...@crustytoothpaste.net; hvo...@hvoigt.net; >> gits...@pobox.com; Stefan Beller >> Subject: [RFC PATCHv2 09/17] update submodules: add scheduling to update >> submodules > [snip] >> +static int update_submodule(const char *path, const struct object_id >> *oid, >> + int force, int is_new) >> +{ >> + const char *git_dir; >> + struct child_process cp = CHILD_PROCESS_INIT; >> + const struct submodule *sub = submodule_from_path(null_sha1, path); >> + >> + if (!sub || !sub->name) >> + return -1; >> + >> + git_dir = resolve_gitdir(git_common_path("modules/%s", sub->name)); >> + >> + if (!git_dir) >> + return -1; >> + >> + if (is_new) >> + connect_work_tree_and_git_dir(path, git_dir); >> + >> + /* update index via `read-tree --reset sha1` */ >> + argv_array_pushl(, "read-tree", >> +force ? "--reset" : "-m", >> +"-u", sha1_to_hex(oid->hash), NULL); >> + prepare_submodule_repo_env(_array); >> + cp.git_cmd = 1; >> + cp.no_stdin = 1; >> + cp.dir = path; >> + if (run_command()) { >> + warning(_("reading the index in submodule '%s' failed"), >> path); > > The error is not (usually) in "reading the index" -- it's "updating the > index" (or the working tree) > >> + child_process_clear(); >> + return -1; >> + } >> + >> + /* write index to working dir */ >> + child_process_clear(); >> + child_process_init(); >> + argv_array_pushl(, "checkout-index", "-a", NULL); > > I'm confused -- doesn't read-tree -u already do this? And if not, shouldn't > we back out the result of the read-tree, to leave the submodule as it was? > >> + cp.git_cmd = 1; >> + cp.no_stdin = 1; >> + cp.dir = path; >> + if (force) >> + argv_array_push(, "-f"); >> + >> + if (run_command()) { >> + warning(_("populating the working directory in submodule '%s' >> failed"), path); >> + child_process_clear(); >> + return -1; >> + } >> + >> + /* get the HEAD right */ >> + child_process_clear(); >> + child_process_init(); >> + argv_array_pushl(, "checkout", "--recurse-submodules", >> NULL); > > > Why are we running checkout on the submodule when we've already done most of > the checkout? The only thing left is to set HEAD and recurse, right? I must > be missing something. > Yes this is only used to set the HEAD correctly and then recurse down. I tried to remove the first 2 calls to ch8ild processes at one point in time, which did not work out. I should have written in the commit message why that was a problem. So I'll redo that just to see the problem and improve the commit message.
RE: [RFC PATCHv2 09/17] update submodules: add scheduling to update submodules
This patch confuses me -- see below. > -Original Message- > From: Stefan Beller [mailto:sbel...@google.com] > Sent: Friday, December 02, 2016 7:30 PM > To: bmw...@google.com; David Turner > Cc: git@vger.kernel.org; sand...@crustytoothpaste.net; hvo...@hvoigt.net; > gits...@pobox.com; Stefan Beller > Subject: [RFC PATCHv2 09/17] update submodules: add scheduling to update > submodules [snip] > +static int update_submodule(const char *path, const struct object_id > *oid, > + int force, int is_new) > +{ > + const char *git_dir; > + struct child_process cp = CHILD_PROCESS_INIT; > + const struct submodule *sub = submodule_from_path(null_sha1, path); > + > + if (!sub || !sub->name) > + return -1; > + > + git_dir = resolve_gitdir(git_common_path("modules/%s", sub->name)); > + > + if (!git_dir) > + return -1; > + > + if (is_new) > + connect_work_tree_and_git_dir(path, git_dir); > + > + /* update index via `read-tree --reset sha1` */ > + argv_array_pushl(, "read-tree", > +force ? "--reset" : "-m", > +"-u", sha1_to_hex(oid->hash), NULL); > + prepare_submodule_repo_env(_array); > + cp.git_cmd = 1; > + cp.no_stdin = 1; > + cp.dir = path; > + if (run_command()) { > + warning(_("reading the index in submodule '%s' failed"), > path); The error is not (usually) in "reading the index" -- it's "updating the index" (or the working tree) > + child_process_clear(); > + return -1; > + } > + > + /* write index to working dir */ > + child_process_clear(); > + child_process_init(); > + argv_array_pushl(, "checkout-index", "-a", NULL); I'm confused -- doesn't read-tree -u already do this? And if not, shouldn't we back out the result of the read-tree, to leave the submodule as it was? > + cp.git_cmd = 1; > + cp.no_stdin = 1; > + cp.dir = path; > + if (force) > + argv_array_push(, "-f"); > + > + if (run_command()) { > + warning(_("populating the working directory in submodule '%s' > failed"), path); > + child_process_clear(); > + return -1; > + } > + > + /* get the HEAD right */ > + child_process_clear(); > + child_process_init(); > + argv_array_pushl(, "checkout", "--recurse-submodules", > NULL); Why are we running checkout on the submodule when we've already done most of the checkout? The only thing left is to set HEAD and recurse, right? I must be missing something.
[RFC PATCHv2 09/17] update submodules: add scheduling to update submodules
The walker of a tree is only expected to call `schedule_submodule_for_update` and once done, to run `update_submodules`. This avoids directory/file conflicts and later we can parallelize all submodule actions if needed. Signed-off-by: Stefan Beller--- submodule.c | 132 submodule.h | 5 +++ 2 files changed, 137 insertions(+) diff --git a/submodule.c b/submodule.c index 7bb64d6c69..02c28ef56b 100644 --- a/submodule.c +++ b/submodule.c @@ -1348,3 +1348,135 @@ int parallel_submodules(void) { return parallel_jobs; } + +static struct scheduled_submodules_update_type { + const char *path; + const struct object_id *oid; + /* +* Do we need to perform a complete checkout or just incremental +* update? +*/ + unsigned is_new:1; +} *scheduled_submodules; +#define SCHEDULED_SUBMODULES_INIT {NULL, NULL, 0} + +static int scheduled_submodules_nr, scheduled_submodules_alloc; + +void schedule_submodule_for_update(const struct cache_entry *ce, int is_new) +{ + struct scheduled_submodules_update_type *ssu; + ALLOC_GROW(scheduled_submodules, + scheduled_submodules_nr + 1, + scheduled_submodules_alloc); + ssu = _submodules[scheduled_submodules_nr++]; + ssu->path = ce->name; + ssu->oid = >oid; + ssu->is_new = !!is_new; +} + +static int update_submodule(const char *path, const struct object_id *oid, + int force, int is_new) +{ + const char *git_dir; + struct child_process cp = CHILD_PROCESS_INIT; + const struct submodule *sub = submodule_from_path(null_sha1, path); + + if (!sub || !sub->name) + return -1; + + git_dir = resolve_gitdir(git_common_path("modules/%s", sub->name)); + + if (!git_dir) + return -1; + + if (is_new) + connect_work_tree_and_git_dir(path, git_dir); + + /* update index via `read-tree --reset sha1` */ + argv_array_pushl(, "read-tree", + force ? "--reset" : "-m", + "-u", sha1_to_hex(oid->hash), NULL); + prepare_submodule_repo_env(_array); + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.dir = path; + if (run_command()) { + warning(_("reading the index in submodule '%s' failed"), path); + child_process_clear(); + return -1; + } + + /* write index to working dir */ + child_process_clear(); + child_process_init(); + argv_array_pushl(, "checkout-index", "-a", NULL); + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.dir = path; + if (force) + argv_array_push(, "-f"); + + if (run_command()) { + warning(_("populating the working directory in submodule '%s' failed"), path); + child_process_clear(); + return -1; + } + + /* get the HEAD right */ + child_process_clear(); + child_process_init(); + argv_array_pushl(, "checkout", "--recurse-submodules", NULL); + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.dir = path; + if (force) + argv_array_push(, "-f"); + argv_array_push(, sha1_to_hex(oid->hash)); + + if (run_command()) { + warning(_("setting the HEAD in submodule '%s' failed"), path); + child_process_clear(); + return -1; + } + + child_process_clear(); + return 0; +} + +int update_submodules(int force) +{ + int i; + gitmodules_config(); + + /* +* NEEDSWORK: As submodule updates can potentially take some +* time each and they do not overlap (i.e. no d/f conflicts; +* this can be parallelized using the run_commands.h API. +*/ + for (i = 0; i < scheduled_submodules_nr; i++) { + struct submodule *sub; + struct scheduled_submodules_update_type *ssu = + _submodules[i]; + + if (!submodule_is_interesting(ssu->path)) + continue; + + sub = submodule_from_path(null_sha1, ssu->path); + + switch (sub->update_strategy) { + case SM_UPDATE_UNSPECIFIED: /* fall thru */ + case SM_UPDATE_CHECKOUT: + update_submodule(ssu->path, ssu->oid, +force, ssu->is_new); + break; + case SM_UPDATE_REBASE: + case SM_UPDATE_MERGE: + case SM_UPDATE_NONE: + case SM_UPDATE_COMMAND: + warning("update strategy for submodule '%s' not supported", ssu->path); + } + } + + scheduled_submodules_nr = 0; + return 0; +} diff --git a/submodule.h b/submodule.h index