Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available

2015-06-02 Thread Eric Sunshine
On Fri, May 8, 2015 at 9:09 PM, Jeff King p...@peff.net wrote:
 On Fri, May 08, 2015 at 07:56:28PM -0400, Eric Sunshine wrote:
 I spent some time downloading old Xcode releases and poking through
 the packages. Xcode 3.2.x seems to be the last in the Xcode 3 series,
 and none of the Xcode 3.2.x versions I examined carried getdelim().
 The first package in which I found getdelim() was Xcode 4.1.
 (Unfortunately, Apple doesn't seem to make Xcode 4.0 available for
 download anymore or it's only available to paying developers, so I
 couldn't check it.) According to Wikipedia[1], Xcode 4.1 was released
 the same day as Lion (OS X 10.7 [2]), but was also available to paying
 developers for Snow Leopard (OS X 10.6).

 Consequently, I think it's safe to say that getdelim() is available
 for Lion (10.7) and later. If we don't mind being a bit less
 conservative, then we might assume that it also is available for Snow
 Leopard (10.6), which it definitely supported, but perhaps that's too
 risky, since not everyone would have been a paid subscriber.

 Thanks for digging. I'd argue for the conservative choice, simply
 because this is a pure optimization. The old code should work just fine,
 and people have been living with it for years.

 I doubt it will affect many people either way, though. Lion is 4 years
 old, and most OS X people seem to upgrade fairly regularly. It is not
 like long-term server systems where we are supporting Solaris 7. :)

 Want to roll a patch?

After a long, long delay, here it is...[1]

 Alternately, we could make the test more dynamic and accurate by
 grepping stdio.h for 'getdelim' or just by trying a test compile,
 though that's probably too expensive.

 The natural place would be in configure.ac, and that is orthogonal to
 the default Darwin setting, I think.

I added that too[1].

[1]: http://thread.gmane.org/gmane.comp.version-control.git/270576
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available

2015-04-22 Thread Johannes Schindelin
Hi,

On 2015-04-17 12:16, Eric Sunshine wrote:
 On Thu, Apr 16, 2015 at 5:01 AM, Jeff King p...@peff.net wrote:
 We spend a lot of time in strbuf_getwholeline in a tight
 loop reading characters from a stdio handle into a buffer.
 The libc getdelim() function can do this for us with less
 overhead.

Just for the record: Git for Windows cannot lean on `getdelim()`, as it is not 
available on Windows. Do not let that stop you; if it turns out to impact 
performance, we will just have to come up with our own implementation of that 
function.

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available

2015-04-22 Thread Jeff King
On Wed, Apr 22, 2015 at 08:00:55PM +0200, Johannes Schindelin wrote:

 On 2015-04-17 12:16, Eric Sunshine wrote:
  On Thu, Apr 16, 2015 at 5:01 AM, Jeff King p...@peff.net wrote:
  We spend a lot of time in strbuf_getwholeline in a tight
  loop reading characters from a stdio handle into a buffer.
  The libc getdelim() function can do this for us with less
  overhead.
 
 Just for the record: Git for Windows cannot lean on `getdelim()`, as
 it is not available on Windows. Do not let that stop you; if it turns
 out to impact performance, we will just have to come up with our own
 implementation of that function.

Hopefully the earlier patch in the series to avoid locking will help
on Windows. After the end of the series, it isn't used anymore on Linux,
but I kept it in exactly for those less-fortunate systems.

If you can find a Windows equivalent that does the same thing as
getdelim, it should be pretty easy to drop it into an alternate
strbuf_getwholeline implementation (or just provide a compat getdelim
if it is close enough to have the same interface).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available

2015-04-21 Thread Jeff King
On Fri, Apr 17, 2015 at 06:16:48AM -0400, Eric Sunshine wrote:

  If somebody has a FreeBSD or OS X system to test on, I'd
  love to see what is needed to compile with HAVE_GETDELIM
  there.
 
 Modern Mac OS X, 10.10.x Yosemite, has getdelim() and git builds fine
 with HAVE_GETDELIM. I also tested on old Snow Leopard 10.5.8 from
 2009. It does not have getdelim(). Unfortunately, I haven't been able
 to determine when getdelim() was introduced on the Mac OS X, thus have
 been unable to craft a simple rule for config.mak.uname.

Thanks for looking into it. Since there haven't been any other takers in
the meantime, do you want to prepare a patch that checks $(uname_R) for
10.10.x?  That's likely more conservative than is necessary, but we can
loosen it later if somebody on 10.9.x shows up with test results.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/9] strbuf_getwholeline: use getdelim if it is available

2015-04-17 Thread Eric Sunshine
On Thu, Apr 16, 2015 at 5:01 AM, Jeff King p...@peff.net wrote:
 We spend a lot of time in strbuf_getwholeline in a tight
 loop reading characters from a stdio handle into a buffer.
 The libc getdelim() function can do this for us with less
 overhead. It's in POSIX.1-2008, and was a GNU extension
 before that. Therefore we can't rely on it, but can fall
 back to the existing getc loop when it is not available.

 The HAVE_GETDELIM knob is turned on automatically for Linux,
 where we have glibc. We don't need to set any new
 feature-test macros, because we already define _GNU_SOURCE.
 Other systems that implement getdelim may need to other
 macros (probably _POSIX_C_SOURCE = 200809L), but we can
 address that along with setting the Makefile knob after
 testing the feature on those systems.
 [...]

 Based on a patch from Rasmus Villemoes r...@rasmusvillemoes.dk.

 Signed-off-by: Jeff King p...@peff.net
 ---
 If somebody has a FreeBSD or OS X system to test on, I'd
 love to see what is needed to compile with HAVE_GETDELIM
 there.

Modern Mac OS X, 10.10.x Yosemite, has getdelim() and git builds fine
with HAVE_GETDELIM. I also tested on old Snow Leopard 10.5.8 from
2009. It does not have getdelim(). Unfortunately, I haven't been able
to determine when getdelim() was introduced on the Mac OS X, thus have
been unable to craft a simple rule for config.mak.uname.

 And to confirm that the performance is much better.
 Sharing my 1.6GB packed-refs file would be hard, but you
 should be able to generate something large and ridiculous.
 I'll leave that as an exercise to the reader.

 diff --git a/Makefile b/Makefile
 index 5f3987f..36655d5 100644
 --- a/Makefile
 +++ b/Makefile
 @@ -359,6 +359,8 @@ all::
  # compiler is detected to support it.
  #
  # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl 
 function.
 +#
 +# Define HAVE_GETDELIM if your system has the getdelim() function.

  GIT-VERSION-FILE: FORCE
 @$(SHELL_PATH) ./GIT-VERSION-GEN
 @@ -1437,6 +1439,10 @@ ifdef HAVE_BSD_SYSCTL
 BASIC_CFLAGS += -DHAVE_BSD_SYSCTL
  endif

 +ifdef HAVE_GETDELIM
 +   BASIC_CFLAGS += -DHAVE_GETDELIM
 +endif
 +
  ifeq ($(TCLTK_PATH),)
  NO_TCLTK = NoThanks
  endif
 diff --git a/config.mak.uname b/config.mak.uname
 index f4e77cb..d26665f 100644
 --- a/config.mak.uname
 +++ b/config.mak.uname
 @@ -36,6 +36,7 @@ ifeq ($(uname_S),Linux)
 HAVE_DEV_TTY = YesPlease
 HAVE_CLOCK_GETTIME = YesPlease
 HAVE_CLOCK_MONOTONIC = YesPlease
 +   HAVE_GETDELIM = YesPlease
  endif
  ifeq ($(uname_S),GNU/kFreeBSD)
 HAVE_ALLOCA_H = YesPlease
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/9] strbuf_getwholeline: use getdelim if it is available

2015-04-16 Thread Jeff King
We spend a lot of time in strbuf_getwholeline in a tight
loop reading characters from a stdio handle into a buffer.
The libc getdelim() function can do this for us with less
overhead. It's in POSIX.1-2008, and was a GNU extension
before that. Therefore we can't rely on it, but can fall
back to the existing getc loop when it is not available.

The HAVE_GETDELIM knob is turned on automatically for Linux,
where we have glibc. We don't need to set any new
feature-test macros, because we already define _GNU_SOURCE.
Other systems that implement getdelim may need to other
macros (probably _POSIX_C_SOURCE = 200809L), but we can
address that along with setting the Makefile knob after
testing the feature on those systems.

Running git rev-parse refs/heads/does-not-exist on a repo
with an extremely large (1.6GB) packed-refs file went from
(best-of-5):

  real0m8.601s
  user0m8.084s
  sys 0m0.524s

to:

  real0m6.768s
  user0m6.340s
  sys 0m0.432s

for a wall-clock speedup of 21%.

Based on a patch from Rasmus Villemoes r...@rasmusvillemoes.dk.

Signed-off-by: Jeff King p...@peff.net
---
If somebody has a FreeBSD or OS X system to test on, I'd
love to see what is needed to compile with HAVE_GETDELIM
there. And to confirm that the performance is much better.
Sharing my 1.6GB packed-refs file would be hard, but you
should be able to generate something large and ridiculous.
I'll leave that as an exercise to the reader.

 Makefile |  6 ++
 config.mak.uname |  1 +
 strbuf.c | 42 ++
 3 files changed, 49 insertions(+)

diff --git a/Makefile b/Makefile
index 5f3987f..36655d5 100644
--- a/Makefile
+++ b/Makefile
@@ -359,6 +359,8 @@ all::
 # compiler is detected to support it.
 #
 # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl function.
+#
+# Define HAVE_GETDELIM if your system has the getdelim() function.
 
 GIT-VERSION-FILE: FORCE
@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -1437,6 +1439,10 @@ ifdef HAVE_BSD_SYSCTL
BASIC_CFLAGS += -DHAVE_BSD_SYSCTL
 endif
 
+ifdef HAVE_GETDELIM
+   BASIC_CFLAGS += -DHAVE_GETDELIM
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
diff --git a/config.mak.uname b/config.mak.uname
index f4e77cb..d26665f 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -36,6 +36,7 @@ ifeq ($(uname_S),Linux)
HAVE_DEV_TTY = YesPlease
HAVE_CLOCK_GETTIME = YesPlease
HAVE_CLOCK_MONOTONIC = YesPlease
+   HAVE_GETDELIM = YesPlease
 endif
 ifeq ($(uname_S),GNU/kFreeBSD)
HAVE_ALLOCA_H = YesPlease
diff --git a/strbuf.c b/strbuf.c
index 921619e..0d4f4e5 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -435,6 +435,47 @@ int strbuf_getcwd(struct strbuf *sb)
return -1;
 }
 
+#ifdef HAVE_GETDELIM
+int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
+{
+   ssize_t r;
+
+   if (feof(fp))
+   return EOF;
+
+   strbuf_reset(sb);
+
+   /* Translate slopbuf to NULL, as we cannot call realloc on it */
+   if (!sb-alloc)
+   sb-buf = NULL;
+   r = getdelim(sb-buf, sb-alloc, term, fp);
+
+   if (r  0) {
+   sb-len = r;
+   return 0;
+   }
+   assert(r == -1);
+
+   /*
+* Normally we would have called xrealloc, which will try to free
+* memory and recover. But we have no way to tell getdelim() to do so.
+* Worse, we cannot try to recover ENOMEM ourselves, because we have
+* no idea how many bytes were read by getdelim.
+*
+* Dying here is reasonable. It mirrors what xrealloc would do on
+* catastrophic memory failure. We skip the opportunity to free pack
+* memory and retry, but that's unlikely to help for a malloc small
+* enough to hold a single line of input, anyway.
+*/
+   if (errno == ENOMEM)
+   die(Out of memory, getdelim failed);
+
+   /* Restore slopbuf that we moved out of the way before */
+   if (!sb-buf)
+   strbuf_init(sb, 0);
+   return EOF;
+}
+#else
 int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 {
int ch;
@@ -458,6 +499,7 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int 
term)
sb-buf[sb-len] = '\0';
return 0;
 }
+#endif
 
 int strbuf_getline(struct strbuf *sb, FILE *fp, int term)
 {
-- 
2.4.0.rc2.384.g7297a4a

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html