Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-07 Thread Jeff King
On Wed, Apr 08, 2015 at 12:43:09AM +0200, Rasmus Villemoes wrote: > Hm, I'm afraid it's not that simple. It seems that data may be lost from > the stream if getdelim encounters ENOMEM: Looking at the glibc > implementation (libio/iogetdelim.c), if reallocating the user buffer > fails, -1 is return

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-07 Thread Rasmus Villemoes
On Tue, Apr 07 2015, Jeff King wrote: > On Tue, Apr 07, 2015 at 03:48:33PM +0200, Rasmus Villemoes wrote: > >> Implementation-wise, I think strbuf_getwholeline could be implemented >> mostly as a simple wrapper for getdelim. If I'm reading the current code >> and the posix spec for getdelim corre

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-07 Thread Jeff King
On Tue, Apr 07, 2015 at 03:48:33PM +0200, Rasmus Villemoes wrote: > > 3. Find some alternative that is more robust than fgets, and faster > > than getc. I don't think there is anything in stdio, but I am not > > above dropping in a faster non-portable call if it is available, > >

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-07 Thread Rasmus Villemoes
On Sun, Apr 05 2015, Jeff King wrote: > which is back to the v2.0.0 number. Even with the extra strlen, it seems > that what fgets does internally beats repeated getc calls. Which I guess > is not too surprising, as each getc() will have to check for underflow > in the buffer. Perhaps there is mo

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-05 Thread Junio C Hamano
Jeff King writes: > On Sun, Apr 05, 2015 at 01:27:32AM -0400, Jeff King wrote: > >> On Sun, Apr 05, 2015 at 12:56:14AM -0400, Jeff King wrote: >> >> > The big downside is that our input strings are no longer NUL-clean >> > (reading "foo\0bar\n" would yield just "foo". I doubt that matters in >>

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-05 Thread Junio C Hamano
Jeff King writes: > So we'd have to either: > > 1. Decide that doesn't matter. > > 2. Have callers specify a "damn the NULs, I want it fast" flag. The callers that used to call fgets and then later rewritten to strbuf_getwholeline(), either of the above obviously should be OK, and because th

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-05 Thread Jeff King
On Sun, Apr 05, 2015 at 09:36:04PM +0700, Duy Nguyen wrote: > On Sun, Apr 5, 2015 at 11:56 AM, Jeff King wrote: > > So we'd have to either: > > > > 1. Decide that doesn't matter. > > > > 2. Have callers specify a "damn the NULs, I want it fast" flag. > > 2+. Avoid FILE* interface and go with

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-05 Thread Duy Nguyen
On Sun, Apr 5, 2015 at 11:56 AM, Jeff King wrote: > So we'd have to either: > > 1. Decide that doesn't matter. > > 2. Have callers specify a "damn the NULs, I want it fast" flag. 2+. Avoid FILE* interface and go with syscalls for reading packed-refs? If mmaping the entire file could be a prob

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-04 Thread Jeff King
On Sun, Apr 05, 2015 at 01:27:32AM -0400, Jeff King wrote: > On Sun, Apr 05, 2015 at 12:56:14AM -0400, Jeff King wrote: > > > The big downside is that our input strings are no longer NUL-clean > > (reading "foo\0bar\n" would yield just "foo". I doubt that matters in > > the real world, but it doe

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-04 Thread Jeff King
On Sun, Apr 05, 2015 at 12:56:14AM -0400, Jeff King wrote: > The big downside is that our input strings are no longer NUL-clean > (reading "foo\0bar\n" would yield just "foo". I doubt that matters in > the real world, but it does fail a few of the tests (e.g., t7008 tries > to read a list of patte

Re: [PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-04 Thread Jeff King
On Sat, Apr 04, 2015 at 09:11:10PM -0400, Jeff King wrote: > I also considered optimizing the "term == '\n'" case by using fgets, but > it gets rather complex (you have to pick a size, fgets into it, and then > keep going if you didn't get a newline). Also, fgets sucks, because you > have to call

[PATCH 3/6] strbuf_getwholeline: use getc_unlocked

2015-04-04 Thread Jeff King
strbuf_getwholeline calls getc in a tight loop. On modern libc implementations, the stdio code locks the handle for every operation, which means we are paying a significant overhead. We can get around this by locking the handle for the whole loop and using the unlocked variant. Running "git rev-p