Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available
On Fri, May 8, 2015 at 9:09 PM, Jeff King p...@peff.net wrote: On Fri, May 08, 2015 at 07:56:28PM -0400, Eric Sunshine wrote: I spent some time downloading old Xcode releases and poking through the packages. Xcode 3.2.x seems to be the last in the Xcode 3 series, and none of the Xcode 3.2.x versions I examined carried getdelim(). The first package in which I found getdelim() was Xcode 4.1. (Unfortunately, Apple doesn't seem to make Xcode 4.0 available for download anymore or it's only available to paying developers, so I couldn't check it.) According to Wikipedia[1], Xcode 4.1 was released the same day as Lion (OS X 10.7 [2]), but was also available to paying developers for Snow Leopard (OS X 10.6). Consequently, I think it's safe to say that getdelim() is available for Lion (10.7) and later. If we don't mind being a bit less conservative, then we might assume that it also is available for Snow Leopard (10.6), which it definitely supported, but perhaps that's too risky, since not everyone would have been a paid subscriber. Thanks for digging. I'd argue for the conservative choice, simply because this is a pure optimization. The old code should work just fine, and people have been living with it for years. I doubt it will affect many people either way, though. Lion is 4 years old, and most OS X people seem to upgrade fairly regularly. It is not like long-term server systems where we are supporting Solaris 7. :) Want to roll a patch? After a long, long delay, here it is...[1] Alternately, we could make the test more dynamic and accurate by grepping stdio.h for 'getdelim' or just by trying a test compile, though that's probably too expensive. The natural place would be in configure.ac, and that is orthogonal to the default Darwin setting, I think. I added that too[1]. [1]: http://thread.gmane.org/gmane.comp.version-control.git/270576 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available
Hi, On 2015-04-17 12:16, Eric Sunshine wrote: On Thu, Apr 16, 2015 at 5:01 AM, Jeff King p...@peff.net wrote: We spend a lot of time in strbuf_getwholeline in a tight loop reading characters from a stdio handle into a buffer. The libc getdelim() function can do this for us with less overhead. Just for the record: Git for Windows cannot lean on `getdelim()`, as it is not available on Windows. Do not let that stop you; if it turns out to impact performance, we will just have to come up with our own implementation of that function. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available
On Wed, Apr 22, 2015 at 08:00:55PM +0200, Johannes Schindelin wrote: On 2015-04-17 12:16, Eric Sunshine wrote: On Thu, Apr 16, 2015 at 5:01 AM, Jeff King p...@peff.net wrote: We spend a lot of time in strbuf_getwholeline in a tight loop reading characters from a stdio handle into a buffer. The libc getdelim() function can do this for us with less overhead. Just for the record: Git for Windows cannot lean on `getdelim()`, as it is not available on Windows. Do not let that stop you; if it turns out to impact performance, we will just have to come up with our own implementation of that function. Hopefully the earlier patch in the series to avoid locking will help on Windows. After the end of the series, it isn't used anymore on Linux, but I kept it in exactly for those less-fortunate systems. If you can find a Windows equivalent that does the same thing as getdelim, it should be pretty easy to drop it into an alternate strbuf_getwholeline implementation (or just provide a compat getdelim if it is close enough to have the same interface). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available
On Fri, Apr 17, 2015 at 06:16:48AM -0400, Eric Sunshine wrote: If somebody has a FreeBSD or OS X system to test on, I'd love to see what is needed to compile with HAVE_GETDELIM there. Modern Mac OS X, 10.10.x Yosemite, has getdelim() and git builds fine with HAVE_GETDELIM. I also tested on old Snow Leopard 10.5.8 from 2009. It does not have getdelim(). Unfortunately, I haven't been able to determine when getdelim() was introduced on the Mac OS X, thus have been unable to craft a simple rule for config.mak.uname. Thanks for looking into it. Since there haven't been any other takers in the meantime, do you want to prepare a patch that checks $(uname_R) for 10.10.x? That's likely more conservative than is necessary, but we can loosen it later if somebody on 10.9.x shows up with test results. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available
On Thu, Apr 16, 2015 at 5:01 AM, Jeff King p...@peff.net wrote: We spend a lot of time in strbuf_getwholeline in a tight loop reading characters from a stdio handle into a buffer. The libc getdelim() function can do this for us with less overhead. It's in POSIX.1-2008, and was a GNU extension before that. Therefore we can't rely on it, but can fall back to the existing getc loop when it is not available. The HAVE_GETDELIM knob is turned on automatically for Linux, where we have glibc. We don't need to set any new feature-test macros, because we already define _GNU_SOURCE. Other systems that implement getdelim may need to other macros (probably _POSIX_C_SOURCE = 200809L), but we can address that along with setting the Makefile knob after testing the feature on those systems. [...] Based on a patch from Rasmus Villemoes r...@rasmusvillemoes.dk. Signed-off-by: Jeff King p...@peff.net --- If somebody has a FreeBSD or OS X system to test on, I'd love to see what is needed to compile with HAVE_GETDELIM there. Modern Mac OS X, 10.10.x Yosemite, has getdelim() and git builds fine with HAVE_GETDELIM. I also tested on old Snow Leopard 10.5.8 from 2009. It does not have getdelim(). Unfortunately, I haven't been able to determine when getdelim() was introduced on the Mac OS X, thus have been unable to craft a simple rule for config.mak.uname. And to confirm that the performance is much better. Sharing my 1.6GB packed-refs file would be hard, but you should be able to generate something large and ridiculous. I'll leave that as an exercise to the reader. diff --git a/Makefile b/Makefile index 5f3987f..36655d5 100644 --- a/Makefile +++ b/Makefile @@ -359,6 +359,8 @@ all:: # compiler is detected to support it. # # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl function. +# +# Define HAVE_GETDELIM if your system has the getdelim() function. GIT-VERSION-FILE: FORCE @$(SHELL_PATH) ./GIT-VERSION-GEN @@ -1437,6 +1439,10 @@ ifdef HAVE_BSD_SYSCTL BASIC_CFLAGS += -DHAVE_BSD_SYSCTL endif +ifdef HAVE_GETDELIM + BASIC_CFLAGS += -DHAVE_GETDELIM +endif + ifeq ($(TCLTK_PATH),) NO_TCLTK = NoThanks endif diff --git a/config.mak.uname b/config.mak.uname index f4e77cb..d26665f 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -36,6 +36,7 @@ ifeq ($(uname_S),Linux) HAVE_DEV_TTY = YesPlease HAVE_CLOCK_GETTIME = YesPlease HAVE_CLOCK_MONOTONIC = YesPlease + HAVE_GETDELIM = YesPlease endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 7/9] strbuf_getwholeline: use getdelim if it is available
We spend a lot of time in strbuf_getwholeline in a tight loop reading characters from a stdio handle into a buffer. The libc getdelim() function can do this for us with less overhead. It's in POSIX.1-2008, and was a GNU extension before that. Therefore we can't rely on it, but can fall back to the existing getc loop when it is not available. The HAVE_GETDELIM knob is turned on automatically for Linux, where we have glibc. We don't need to set any new feature-test macros, because we already define _GNU_SOURCE. Other systems that implement getdelim may need to other macros (probably _POSIX_C_SOURCE = 200809L), but we can address that along with setting the Makefile knob after testing the feature on those systems. Running git rev-parse refs/heads/does-not-exist on a repo with an extremely large (1.6GB) packed-refs file went from (best-of-5): real0m8.601s user0m8.084s sys 0m0.524s to: real0m6.768s user0m6.340s sys 0m0.432s for a wall-clock speedup of 21%. Based on a patch from Rasmus Villemoes r...@rasmusvillemoes.dk. Signed-off-by: Jeff King p...@peff.net --- If somebody has a FreeBSD or OS X system to test on, I'd love to see what is needed to compile with HAVE_GETDELIM there. And to confirm that the performance is much better. Sharing my 1.6GB packed-refs file would be hard, but you should be able to generate something large and ridiculous. I'll leave that as an exercise to the reader. Makefile | 6 ++ config.mak.uname | 1 + strbuf.c | 42 ++ 3 files changed, 49 insertions(+) diff --git a/Makefile b/Makefile index 5f3987f..36655d5 100644 --- a/Makefile +++ b/Makefile @@ -359,6 +359,8 @@ all:: # compiler is detected to support it. # # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl function. +# +# Define HAVE_GETDELIM if your system has the getdelim() function. GIT-VERSION-FILE: FORCE @$(SHELL_PATH) ./GIT-VERSION-GEN @@ -1437,6 +1439,10 @@ ifdef HAVE_BSD_SYSCTL BASIC_CFLAGS += -DHAVE_BSD_SYSCTL endif +ifdef HAVE_GETDELIM + BASIC_CFLAGS += -DHAVE_GETDELIM +endif + ifeq ($(TCLTK_PATH),) NO_TCLTK = NoThanks endif diff --git a/config.mak.uname b/config.mak.uname index f4e77cb..d26665f 100644 --- a/config.mak.uname +++ b/config.mak.uname @@ -36,6 +36,7 @@ ifeq ($(uname_S),Linux) HAVE_DEV_TTY = YesPlease HAVE_CLOCK_GETTIME = YesPlease HAVE_CLOCK_MONOTONIC = YesPlease + HAVE_GETDELIM = YesPlease endif ifeq ($(uname_S),GNU/kFreeBSD) HAVE_ALLOCA_H = YesPlease diff --git a/strbuf.c b/strbuf.c index 921619e..0d4f4e5 100644 --- a/strbuf.c +++ b/strbuf.c @@ -435,6 +435,47 @@ int strbuf_getcwd(struct strbuf *sb) return -1; } +#ifdef HAVE_GETDELIM +int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) +{ + ssize_t r; + + if (feof(fp)) + return EOF; + + strbuf_reset(sb); + + /* Translate slopbuf to NULL, as we cannot call realloc on it */ + if (!sb-alloc) + sb-buf = NULL; + r = getdelim(sb-buf, sb-alloc, term, fp); + + if (r 0) { + sb-len = r; + return 0; + } + assert(r == -1); + + /* +* Normally we would have called xrealloc, which will try to free +* memory and recover. But we have no way to tell getdelim() to do so. +* Worse, we cannot try to recover ENOMEM ourselves, because we have +* no idea how many bytes were read by getdelim. +* +* Dying here is reasonable. It mirrors what xrealloc would do on +* catastrophic memory failure. We skip the opportunity to free pack +* memory and retry, but that's unlikely to help for a malloc small +* enough to hold a single line of input, anyway. +*/ + if (errno == ENOMEM) + die(Out of memory, getdelim failed); + + /* Restore slopbuf that we moved out of the way before */ + if (!sb-buf) + strbuf_init(sb, 0); + return EOF; +} +#else int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) { int ch; @@ -458,6 +499,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term) sb-buf[sb-len] = '\0'; return 0; } +#endif int strbuf_getline(struct strbuf *sb, FILE *fp, int term) { -- 2.4.0.rc2.384.g7297a4a -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html