Re: [PATCH v5 10/11] setup_git_directory_gently_1(): avoid die()ing
Johannes Schindelin writes: > On Mon, 13 Mar 2017, Junio C Hamano wrote: > >> When a patch series is refactoring an existing function to be used >> in different codepaths, an existing issue inherited from the >> original code (e.g. perhaps an existing error checking that is >> looser than ideal) may have been OK in the original context (e.g. >> perhaps it died and refused to run until the user corrected the >> repository), and it still is OK in the codepath that uses the >> refactored building blocks to replace the original code, but it may >> not be acceptable to leave the issue in the original code when a new >> caller calls the refactored building block (e.g. perhaps the >> refactoring made it possible for a caller to ask the function not to >> die and instead act on different kinds of errors in different ways). > > In this case, ... It becomes somewhat irritating when you get combative and defensive unnecessarily. We already established this case is OK two messages ago, I think. The above is only to make sure that people cannot take the "issues in the original is OK to leave outside the scope of a new series" in my message out of context and treat it as a general rule to justify their sloppy patches in the future. We need to see if issues inherited from the original is necessary to fix before refactoring even begins on case-by-case basis, seeing what the requirement of the new code that uses the refactored code. And the case-by-case thing we already did for _this_ case.
Re: [PATCH v5 10/11] setup_git_directory_gently_1(): avoid die()ing
Hi Junio, On Mon, 13 Mar 2017, Junio C Hamano wrote: > Junio C Hamano writes: > > > The former case may split into two. "Yes I agree that is bad and it > > is the same badness as the version without this change", in which case > > we may want to leave a "NEEDSWORK" comment in-code so that people can > > notice when browsing the code (but fixing the badness would be outside > > the scope of the series), and "yes that is bad and we shouldn't > > introduce that badness", in which case we do want to fix it in the > > patch. > > We however need to be a bit careful here, though. > > When a patch series is refactoring an existing function to be used > in different codepaths, an existing issue inherited from the > original code (e.g. perhaps an existing error checking that is > looser than ideal) may have been OK in the original context (e.g. > perhaps it died and refused to run until the user corrected the > repository), and it still is OK in the codepath that uses the > refactored building blocks to replace the original code, but it may > not be acceptable to leave the issue in the original code when a new > caller calls the refactored building block (e.g. perhaps the > refactoring made it possible for a caller to ask the function not to > die and instead act on different kinds of errors in different ways). In this case, discover_git_directory() is intended to use the exact same logic as setup_git_directory() (including bugs and all) so that they do discover the same directory. It would not do for discover_git_directory() to detect a *different* directory, no matter how much "more correct" one would deem it. Ciao, Johannes
Re: [PATCH v5 10/11] setup_git_directory_gently_1(): avoid die()ing
Junio C Hamano writes: > The former case may split into two. "Yes I agree that is bad and it > is the same badness as the version without this change", in which > case we may want to leave a "NEEDSWORK" comment in-code so that > people can notice when browsing the code (but fixing the badness > would be outside the scope of the series), and "yes that is bad and > we shouldn't introduce that badness", in which case we do want to > fix it in the patch. We however need to be a bit careful here, though. When a patch series is refactoring an existing function to be used in different codepaths, an existing issue inherited from the original code (e.g. perhaps an existing error checking that is looser than ideal) may have been OK in the original context (e.g. perhaps it died and refused to run until the user corrected the repository), and it still is OK in the codepath that uses the refactored building blocks to replace the original code, but it may not be acceptable to leave the issue in the original code when a new caller calls the refactored building block (e.g. perhaps the refactoring made it possible for a caller to ask the function not to die and instead act on different kinds of errors in different ways). So "being bug-to-bug compatible with existing code" is not always a good thing---we need to see case by case if a problem identified in the review as inherited from the original needs to be addressed in the series. It would be better to address issues we identify even if it is an old one. After all, any change has potential to introduce new bugs, and striving to leave known bug inherited from the old code around would guarantee that the resulting code will be buggier than the original ;-) But in order to keep bisectability, such "further fixups" should be done as a follow-up to the series, not as part of it.
Re: [PATCH v5 10/11] setup_git_directory_gently_1(): avoid die()ing
Johannes Schindelin writes: > PLEASE NOTE: the purpose of this patch series is to allow the same > function (setup_git_directory_gently_1()) to be the work horse for > setup_git_directory() as before, but also for the new > discover_git_directory() function that is introduced to fix > read_early_config() to avoid hardconfig .git/config. > > The purpose is *not* to change any current behavior. Please do not ask me > to do that in this patch series. Please note that pointing out potential problems is only asking for confirmation ("yes that is bad") or for enlightenment ("no, you are reading the code wrong and that bad thing does not happen because..."). Even in the former case, unless the badness is introduced by the change, pointing out potential problems is *NOT* asking to change the patch to fix existing issues. The former case may split into two. "Yes I agree that is bad and it is the same badness as the version without this change", in which case we may want to leave a "NEEDSWORK" comment in-code so that people can notice when browsing the code (but fixing the badness would be outside the scope of the series), and "yes that is bad and we shouldn't introduce that badness", in which case we do want to fix it in the patch. So do not read my (or anybody else's) reviews as _asking_ changes and making further fixes, and you'll do better.
Re: [PATCH v5 10/11] setup_git_directory_gently_1(): avoid die()ing
Hi Junio, PLEASE NOTE: the purpose of this patch series is to allow the same function (setup_git_directory_gently_1()) to be the work horse for setup_git_directory() as before, but also for the new discover_git_directory() function that is introduced to fix read_early_config() to avoid hardconfig .git/config. The purpose is *not* to change any current behavior. Please do not ask me to do that in this patch series. Now to your comments: On Fri, 10 Mar 2017, Junio C Hamano wrote: > Johannes Schindelin writes: > > > @@ -890,14 +892,22 @@ static enum discovery_result > > setup_git_directory_gently_1(struct strbuf *dir, > > if (one_filesystem) > > current_device = get_device_or_die(dir->buf, NULL, 0); > > for (;;) { > > - int offset = dir->len; > > + int offset = dir->len, error_code = 0; > > > > if (offset > min_offset) > > strbuf_addch(dir, '/'); > > strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT); > > - gitdirenv = read_gitfile(dir->buf); > > - if (!gitdirenv && is_git_directory(dir->buf)) > > - gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > > + gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? > > + NULL : &error_code); > > + if (!gitdirenv) { > > We tried to read ".git" as a regular file, but couldn't. There are > some cases but I am having trouble to convince myself all cases are > covered correctly here, so let me follow the code aloud. It is subtle and finicky, I agree. > > + if (die_on_error || > > + error_code == READ_GITFILE_ERR_NOT_A_FILE) { > > + if (is_git_directory(dir->buf)) > > + gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > > If we are told to die on error, but if it is a Git directory, then > we do not have to (and want to) die and instead report that > directory as discovered. > > If we are to report the failure status instead of dying, we also do > want to recover when the read_gitfile_gentrly() failed only because > it was a real Git directory. READ_GITFILE_ERR_NOT_A_FILE could be a > signal for that, and we recover after making sure it is a Git > directory. > > Among the READ_GITFILE_ERR_* codes, ERR_NOT_A_FILE is the only one > we attempt this recovery. All others just take the "else if" thing > below. > > What happens when is_git_directory() test here does not pass, > though? Let's say stat() in read_gitfile_gently() found a FIFO > there, it gives us ERR_NOT_A_FILE, is_git_directory() would say > "Nah", and then ...? Don't we want the other side for this if() > statement that returns failure? The current code as per `master` detects the NOT_A_FILE condition, and as the parameter return_error_code was passed as NULL (because read_gitfile is actually #define'd in cache.h to call read_gitfile_gently with NULL as second parameter), it calls read_gitfile_error_die(). That function *ignores* the NOT_A_FILE error condition, and as a consequence returns to read_gitfile_gently() which then returns NULL because there *was* an error_code. That means that setup_git_directory_gently_1() will retrieve a NULL from the read_gitfile() call, which means it then goes on to test whether it is looking at a .git/ directory via is_git_directory() and if that returns false, the loop will continue to look at the *parent* directory. That, in turn, means that what you are asking for is a change in behavior, and as I am unwilling to introduce a change in behavior with this patch series, my patch does the exact right thing. > > + } else if (error_code && > > + error_code != READ_GITFILE_ERR_STAT_FAILED) > > + return GIT_DIR_INVALID_GITFILE; > > + } > > On the other hand, if read_gitfile_gently() didn't say "we found > something that is not a regular file" we come here. If the error > came because there wasn't ".git" in the directory we are looking at, > i.e. stat(2) in read_gitfile_gently() gave ENOENT, it is perfectly > normal and we just want to go one level up. > > But shouldn't read_gitfile_gently() be inspecting errno when it sees > a stat() failure? If there _is_ ".git" but we failed to stat it for > whatever reason (EACCES, ELOOP,...), we do not want to go up but > give the INVALID_GITFILE from here, just like other cases, no? > For that I imagine that ERR_STAT_FAILED needs to be split into two, > one for true ERR_STAT_FAILED (from which we cannot recover) and the > other for ERR_ENOENT to signal us that there is nothing there (which > allows us to go up). True. But again, you are asking for a *change in behavior*, which is not the purpose of this patch series. > By the way, is the "error_code &&" needed? It is not! I had the order of the if/else blocks completely differently originally [*1*] and just overlooked tha
Re: [PATCH v5 10/11] setup_git_directory_gently_1(): avoid die()ing
Johannes Schindelin writes: > @@ -890,14 +892,22 @@ static enum discovery_result > setup_git_directory_gently_1(struct strbuf *dir, > if (one_filesystem) > current_device = get_device_or_die(dir->buf, NULL, 0); > for (;;) { > - int offset = dir->len; > + int offset = dir->len, error_code = 0; > > if (offset > min_offset) > strbuf_addch(dir, '/'); > strbuf_addstr(dir, DEFAULT_GIT_DIR_ENVIRONMENT); > - gitdirenv = read_gitfile(dir->buf); > - if (!gitdirenv && is_git_directory(dir->buf)) > - gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; > + gitdirenv = read_gitfile_gently(dir->buf, die_on_error ? > + NULL : &error_code); > + if (!gitdirenv) { We tried to read ".git" as a regular file, but couldn't. There are some cases but I am having trouble to convince myself all cases are covered correctly here, so let me follow the code aloud. > + if (die_on_error || > + error_code == READ_GITFILE_ERR_NOT_A_FILE) { > + if (is_git_directory(dir->buf)) > + gitdirenv = DEFAULT_GIT_DIR_ENVIRONMENT; If we are told to die on error, but if it is a Git directory, then we do not have to (and want to) die and instead report that directory as discovered. If we are to report the failure status instead of dying, we also do want to recover when the read_gitfile_gentrly() failed only because it was a real Git directory. READ_GITFILE_ERR_NOT_A_FILE could be a signal for that, and we recover after making sure it is a Git directory. Among the READ_GITFILE_ERR_* codes, ERR_NOT_A_FILE is the only one we attempt this recovery. All others just take the "else if" thing below. What happens when is_git_directory() test here does not pass, though? Let's say stat() in read_gitfile_gently() found a FIFO there, it gives us ERR_NOT_A_FILE, is_git_directory() would say "Nah", and then ...? Don't we want the other side for this if() statement that returns failure? > + } else if (error_code && > +error_code != READ_GITFILE_ERR_STAT_FAILED) > + return GIT_DIR_INVALID_GITFILE; > + } On the other hand, if read_gitfile_gently() didn't say "we found something that is not a regular file" we come here. If the error came because there wasn't ".git" in the directory we are looking at, i.e. stat(2) in read_gitfile_gently() gave ENOENT, it is perfectly normal and we just want to go one level up. But shouldn't read_gitfile_gently() be inspecting errno when it sees a stat() failure? If there _is_ ".git" but we failed to stat it for whatever reason (EACCES, ELOOP,...), we do not want to go up but give the INVALID_GITFILE from here, just like other cases, no? For that I imagine that ERR_STAT_FAILED needs to be split into two, one for true ERR_STAT_FAILED (from which we cannot recover) and the other for ERR_ENOENT to signal us that there is nothing there (which allows us to go up). By the way, is the "error_code &&" needed? Unless the original path given to read_gitfile_gently() is a NULL pointer, the only time its return value is NULL is when it has non-zero error_code. > strbuf_setlen(dir, offset); > if (gitdirenv) { > strbuf_addstr(gitdir, gitdirenv); > @@ -934,7 +944,7 @@ const char *discover_git_directory(struct strbuf *gitdir) > return NULL; > > cwd_len = dir.len; > - if (setup_git_directory_gently_1(&dir, gitdir) <= 0) { > + if (setup_git_directory_gently_1(&dir, gitdir, 0) <= 0) { > strbuf_release(&dir); > return NULL; > } > @@ -994,7 +1004,7 @@ const char *setup_git_directory_gently(int *nongit_ok) > die_errno(_("Unable to read current working directory")); > strbuf_addbuf(&dir, &cwd); > > - switch (setup_git_directory_gently_1(&dir, &gitdir)) { > + switch (setup_git_directory_gently_1(&dir, &gitdir, 1)) { > case GIT_DIR_NONE: > prefix = NULL; > break;