Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
On 09/16/2015 06:00 PM, Junio C Hamano wrote: > *1* But one "gc" could be foreground gc that does not write to a log > file and the other "gc" could be background that does write to a > log file, so this cannot be the primary way to avoid double > execution. ^^^ This was the point that was confusing me. If this is not one of the roles of the log file, then things are easier. If you decide to go the route of tempfile/lockfile modifications, feel free to CC me for feedback. Michael -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
Duy Nguyenwrites: > We do keep another lock before coming to opening this log file. So > once we get here we already know nobody else will be opening the log > file. We can simply open it the normal way, then make sure we clean it > up at atexit(). > >>> This doesn't seem like a common thing to want (as in, this might be the >>> only caller), but it probably makes sense to build it into the >>> tempfile/lockfile API nevertheless, because implementing it externally >>> would require a lot of other code to be duplicated. >>> >>> Another possibility that might work (maybe without requiring changes to >>> tempfile/lockfile): don't worry about deleting the log file if it is >>> empty, but make observers treat an empty log file the same as an absent one. >> >> Probably your "don't remove and check for emptiness" approach would >> be the simpler of the two, but I think we can go either way. > > People have complained to me about stray files in $GIT_DIR, so it's > probably best that we delete empty/useless files. Although we could > delete empty files at the beginning of the next gc instead of at > atexit(). Let me try it out and see which is simplest. It would be nicer if we did not have to do an extra and custom atexit() handler. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
On Wed, Sep 16, 2015 at 11:00 PM, Junio C Hamanowrote: > Michael Haggerty writes: > >> I'm not sure what behavior you want. At one point you say "puts the file >> to a final place if it is non-empty" but later you say "leave it if >> non-empty". Should the file be written directly, or should it be written >> to a lockfile and renamed into place only when complete? > > I do not think we care that deeply either way, as we do not want to > run multiple auto-gc's at the same time in the first place. So either > one of the following is perfectly fine: > > * We open the final destination directly, but with O_CREAT|O_EXCL, >which as a side effect also detects the simultanous execution >[*1*]. We do not do any "finalizing rename" if we go this route. >When we are done, close it and remove it if we did not write >anything, or leave it there if we did write something. > > * We open a lockfile the usual way. When we are done, close it and >and remove it if we did not write anything, or finalize it by >renaming it if we did write something. > > I think Duy's code wants to do the latter. We do keep another lock before coming to opening this log file. So once we get here we already know nobody else will be opening the log file. We can simply open it the normal way, then make sure we clean it up at atexit(). >> This doesn't seem like a common thing to want (as in, this might be the >> only caller), but it probably makes sense to build it into the >> tempfile/lockfile API nevertheless, because implementing it externally >> would require a lot of other code to be duplicated. >> >> Another possibility that might work (maybe without requiring changes to >> tempfile/lockfile): don't worry about deleting the log file if it is >> empty, but make observers treat an empty log file the same as an absent one. > > Probably your "don't remove and check for emptiness" approach would > be the simpler of the two, but I think we can go either way. People have complained to me about stray files in $GIT_DIR, so it's probably best that we delete empty/useless files. Although we could delete empty files at the beginning of the next gc instead of at atexit(). Let me try it out and see which is simplest. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
On 09/14/2015 07:37 PM, Junio C Hamano wrote: > Junio C Hamanowrites: > >> Thanks, will queue. > > Ehh, I spoke a bit too early. > >>> diff --git a/builtin/gc.c b/builtin/gc.c >>> index bcc75d9..2c3aaeb 100644 >>> --- a/builtin/gc.c >>> +++ b/builtin/gc.c >>> @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = >>> ARGV_ARRAY_INIT; >>> static struct argv_array rerere = ARGV_ARRAY_INIT; >>> >>> static char *pidfile; >>> +static struct strbuf log_filename = STRBUF_INIT; >>> +static int daemonized; >>> >>> static void remove_pidfile(void) >>> { >>> + if (daemonized && log_filename.len) { >>> + struct stat st; >>> + >>> + close(2); >>> + if (stat(log_filename.buf, ) || >>> + !st.st_size || >>> + rename(log_filename.buf, git_path("gc.log"))) >>> + unlink(log_filename.buf); >>> + } > > Unfortuantely we cannot queue this as-is, as we let the tempfile API > to automatically manage the pidfile since ebebeaea (gc: use tempfile > module to handle gc.pid file, 2015-08-10), and you cannot piggy-back > the log file finalization to this function that no longer exists. > > Besides, it is obviously wrong to remove this log file in a function > whose name is remove_pidfile() ;-) > > Adding a new function to tempfile API that puts the file to a final > place if it is non-empty and otherwise remove it, and using that to > create this "gc.log" file, would be the cleanest from the point of > view of _this_ codepath. I however do not know if that is too > specific for the need of this codepath or "leave it if non-empty, > but otherwise remove as it is uninteresting" is fairly common thing > we would want and it is a good addition to the API set. > > Michael, what do you think? I'm not sure what behavior you want. At one point you say "puts the file to a final place if it is non-empty" but later you say "leave it if non-empty". Should the file be written directly, or should it be written to a lockfile and renamed into place only when complete? Technically I don't see a problem implementing either behavior. POSIX allows [1] calls to stat() and rename() from a signal handler. There is a minor technical difficulty that commit_lock_file() allocates memory via get_locked_file_path(), but this would be surmountable by pre-allocating the memory for the locked file path and storing it in the lock_file object. This doesn't seem like a common thing to want (as in, this might be the only caller), but it probably makes sense to build it into the tempfile/lockfile API nevertheless, because implementing it externally would require a lot of other code to be duplicated. Another possibility that might work (maybe without requiring changes to tempfile/lockfile): don't worry about deleting the log file if it is empty, but make observers treat an empty log file the same as an absent one. Michael [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04_03 -- Michael Haggerty mhag...@alum.mit.edu -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
Michael Haggertywrites: > I'm not sure what behavior you want. At one point you say "puts the file > to a final place if it is non-empty" but later you say "leave it if > non-empty". Should the file be written directly, or should it be written > to a lockfile and renamed into place only when complete? I do not think we care that deeply either way, as we do not want to run multiple auto-gc's at the same time in the first place. So either one of the following is perfectly fine: * We open the final destination directly, but with O_CREAT|O_EXCL, which as a side effect also detects the simultanous execution [*1*]. We do not do any "finalizing rename" if we go this route. When we are done, close it and remove it if we did not write anything, or leave it there if we did write something. * We open a lockfile the usual way. When we are done, close it and and remove it if we did not write anything, or finalize it by renaming it if we did write something. I think Duy's code wants to do the latter. > This doesn't seem like a common thing to want (as in, this might be the > only caller), but it probably makes sense to build it into the > tempfile/lockfile API nevertheless, because implementing it externally > would require a lot of other code to be duplicated. > > Another possibility that might work (maybe without requiring changes to > tempfile/lockfile): don't worry about deleting the log file if it is > empty, but make observers treat an empty log file the same as an absent one. Probably your "don't remove and check for emptiness" approach would be the simpler of the two, but I think we can go either way. Thanks. [Footnote] *1* But one "gc" could be foreground gc that does not write to a log file and the other "gc" could be background that does write to a log file, so this cannot be the primary way to avoid double execution. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
Nguyễn Thái Ngọc Duywrites: > While commit 9f673f9 (gc: config option for running --auto in > background - 2014-02-08) helps reduce some complaints about 'gc > --auto' hogging the terminal, it creates another set of problems. > > The latest in this set is, as the result of daemonizing, stderr is > closed and all warnings are lost. This warning at the end of cmd_gc() > is particularly important because it tells the user how to avoid "gc > --auto" running repeatedly. Because stderr is closed, the user does > not know, naturally they complain about 'gc --auto' wasting CPU. > > Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc --auto > will not run and gc.log printed out until the user removes gc.log. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- > When gc.log exists, gc --auto now simply exits > > builtin/gc.c | 33 - > 1 file changed, 32 insertions(+), 1 deletion(-) Thanks, will queue. This is sufficient for non-novice users, but I wonder if the error message is strong enough. A knee-jerk reaction to "Last run reported an error so we won't run until the log is removed" may be "Then why don't you remove the log for me automatically?", which is a total brain-less raction [*1*], and "then I'll remove the log" is not all that better. The message we want to convey is "correct the root cause so that gc can run without having to give an error message." I guess it all depends on what is in gc.log; after all, this new codepath is not in a good position to read the diagnosis left in there and offer good pieces of advice to resolve them. [Footnote] *1* ... which is what a knee-jerk reaction is by definition. > > diff --git a/builtin/gc.c b/builtin/gc.c > index bcc75d9..2c3aaeb 100644 > --- a/builtin/gc.c > +++ b/builtin/gc.c > @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT; > static struct argv_array rerere = ARGV_ARRAY_INIT; > > static char *pidfile; > +static struct strbuf log_filename = STRBUF_INIT; > +static int daemonized; > > static void remove_pidfile(void) > { > + if (daemonized && log_filename.len) { > + struct stat st; > + > + close(2); > + if (stat(log_filename.buf, ) || > + !st.st_size || > + rename(log_filename.buf, git_path("gc.log"))) > + unlink(log_filename.buf); > + } > if (pidfile) > unlink(pidfile); > } > @@ -330,13 +341,21 @@ int cmd_gc(int argc, const char **argv, const char > *prefix) > fprintf(stderr, _("See \"git help gc\" for manual > housekeeping.\n")); > } > if (detach_auto) { > + struct strbuf sb = STRBUF_INIT; > + if (strbuf_read_file(, git_path("gc.log"), 0) > 0) > + return error(_("last gc run reported:\n" > +"%s\n" > +"not running until %s is > removed"), > + sb.buf, git_path("gc.log")); > + strbuf_release(); > + > if (gc_before_repack()) > return -1; > /* >* failure to daemonize is ok, we'll continue >* in foreground >*/ > - daemonize(); > + daemonized = !daemonize(); > } > } else > add_repack_all_option(); > @@ -349,6 +368,18 @@ int cmd_gc(int argc, const char **argv, const char > *prefix) > name, (uintmax_t)pid); > } > > + if (daemonized) { > + int fd; > + > + strbuf_addstr(_filename, git_path("gc.log_XX")); > + fd = xmkstemp(log_filename.buf); > + if (fd >= 0) { > + dup2(fd, 2); > + close(fd); > + } else > + strbuf_release(_filename); > + } > + > if (gc_before_repack()) > return -1; -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v4] gc: save log from daemonized gc --auto and print it next time
Junio C Hamanowrites: > Thanks, will queue. Ehh, I spoke a bit too early. >> diff --git a/builtin/gc.c b/builtin/gc.c >> index bcc75d9..2c3aaeb 100644 >> --- a/builtin/gc.c >> +++ b/builtin/gc.c >> @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = >> ARGV_ARRAY_INIT; >> static struct argv_array rerere = ARGV_ARRAY_INIT; >> >> static char *pidfile; >> +static struct strbuf log_filename = STRBUF_INIT; >> +static int daemonized; >> >> static void remove_pidfile(void) >> { >> +if (daemonized && log_filename.len) { >> +struct stat st; >> + >> +close(2); >> +if (stat(log_filename.buf, ) || >> +!st.st_size || >> +rename(log_filename.buf, git_path("gc.log"))) >> +unlink(log_filename.buf); >> +} Unfortuantely we cannot queue this as-is, as we let the tempfile API to automatically manage the pidfile since ebebeaea (gc: use tempfile module to handle gc.pid file, 2015-08-10), and you cannot piggy-back the log file finalization to this function that no longer exists. Besides, it is obviously wrong to remove this log file in a function whose name is remove_pidfile() ;-) Adding a new function to tempfile API that puts the file to a final place if it is non-empty and otherwise remove it, and using that to create this "gc.log" file, would be the cleanest from the point of view of _this_ codepath. I however do not know if that is too specific for the need of this codepath or "leave it if non-empty, but otherwise remove as it is uninteresting" is fairly common thing we would want and it is a good addition to the API set. Michael, what do you think? >> @@ -330,13 +341,21 @@ int cmd_gc(int argc, const char **argv, const char >> *prefix) >> fprintf(stderr, _("See \"git help gc\" for manual >> housekeeping.\n")); >> } >> if (detach_auto) { >> +struct strbuf sb = STRBUF_INIT; >> +if (strbuf_read_file(, git_path("gc.log"), 0) > 0) >> +return error(_("last gc run reported:\n" >> + "%s\n" >> + "not running until %s is >> removed"), >> + sb.buf, git_path("gc.log")); >> +strbuf_release(); >> + >> if (gc_before_repack()) >> return -1; >> /* >> * failure to daemonize is ok, we'll continue >> * in foreground >> */ >> -daemonize(); >> +daemonized = !daemonize(); >> } >> } else >> add_repack_all_option(); >> @@ -349,6 +368,18 @@ int cmd_gc(int argc, const char **argv, const char >> *prefix) >> name, (uintmax_t)pid); >> } >> >> +if (daemonized) { >> +int fd; >> + >> +strbuf_addstr(_filename, git_path("gc.log_XX")); >> +fd = xmkstemp(log_filename.buf); >> +if (fd >= 0) { >> +dup2(fd, 2); >> +close(fd); >> +} else >> +strbuf_release(_filename); >> +} >> + >> if (gc_before_repack()) >> return -1; -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v4] gc: save log from daemonized gc --auto and print it next time
While commit 9f673f9 (gc: config option for running --auto in background - 2014-02-08) helps reduce some complaints about 'gc --auto' hogging the terminal, it creates another set of problems. The latest in this set is, as the result of daemonizing, stderr is closed and all warnings are lost. This warning at the end of cmd_gc() is particularly important because it tells the user how to avoid "gc --auto" running repeatedly. Because stderr is closed, the user does not know, naturally they complain about 'gc --auto' wasting CPU. Daemonized gc now saves stderr to $GIT_DIR/gc.log. Following gc --auto will not run and gc.log printed out until the user removes gc.log. Signed-off-by: Nguyễn Thái Ngọc Duy--- When gc.log exists, gc --auto now simply exits builtin/gc.c | 33 - 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/builtin/gc.c b/builtin/gc.c index bcc75d9..2c3aaeb 100644 --- a/builtin/gc.c +++ b/builtin/gc.c @@ -43,9 +43,20 @@ static struct argv_array prune_worktrees = ARGV_ARRAY_INIT; static struct argv_array rerere = ARGV_ARRAY_INIT; static char *pidfile; +static struct strbuf log_filename = STRBUF_INIT; +static int daemonized; static void remove_pidfile(void) { + if (daemonized && log_filename.len) { + struct stat st; + + close(2); + if (stat(log_filename.buf, ) || + !st.st_size || + rename(log_filename.buf, git_path("gc.log"))) + unlink(log_filename.buf); + } if (pidfile) unlink(pidfile); } @@ -330,13 +341,21 @@ int cmd_gc(int argc, const char **argv, const char *prefix) fprintf(stderr, _("See \"git help gc\" for manual housekeeping.\n")); } if (detach_auto) { + struct strbuf sb = STRBUF_INIT; + if (strbuf_read_file(, git_path("gc.log"), 0) > 0) + return error(_("last gc run reported:\n" + "%s\n" + "not running until %s is removed"), +sb.buf, git_path("gc.log")); + strbuf_release(); + if (gc_before_repack()) return -1; /* * failure to daemonize is ok, we'll continue * in foreground */ - daemonize(); + daemonized = !daemonize(); } } else add_repack_all_option(); @@ -349,6 +368,18 @@ int cmd_gc(int argc, const char **argv, const char *prefix) name, (uintmax_t)pid); } + if (daemonized) { + int fd; + + strbuf_addstr(_filename, git_path("gc.log_XX")); + fd = xmkstemp(log_filename.buf); + if (fd >= 0) { + dup2(fd, 2); + close(fd); + } else + strbuf_release(_filename); + } + if (gc_before_repack()) return -1; -- 2.3.0.rc1.137.g477eb31 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html