Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 10:54:39AM -0700, Junio C Hamano wrote: > OK. Trying to repurpose strbuf_read() for non-blocking FD was a > silly idea to begin with, as it wants to read thru to the EOF, and > setting FD explicitly to O_NONBLOCK is a sign that the caller wants > to grab as much as possible and does not want to wait for the EOF. > > So assuming we want strbuf_read_nonblock(), what interface do we > want from it? We could: > > * Have it grab as much as possible into sb as long as it does not >block? > > * Have it grab reasonably large amount into sb, and not blocking is >an absolute requirement, but the function is not required to read >everything that is available on the FD (i.e. the caller is >expected to loop)? I think we are crossing emails, but I would definitely argue for the latter. > If we choose the latter, then your "EAGAIN? EOF?" becomes easier, > no? We only have to do a single call to xread(), and then we: > > - get EAGAIN or EWOULDBLOCK; leave sb as-is, set errno==EAGAIN and >return -1. > > - get something (in which case that is not an EOF yet); append to >sb, return the number of bytes. > > - get EOF; leave sb as-is, return 0. Yes, exactly. > ssize_t strbuf_read_nonblock(struct strbuf *sb, int fd, size_t hint) > { > strbuf_grow(sb, hint ? hint : 8192); > ssize_t want = sb->alloc - sb->len - 1; > ssize_t got = xread_nonblock(fd, sb->buf + sb->len, want); > if (got < 0) { > if (errno == EWOULDBLOCK) > errno = EAGAIN; /* make life easier for the caller */ > return got; > } I like the normalizing of errno, but that should probably be part of xread_nonblock(), I would think. > sb->len += got; > sb->buf[sb->len] = '\0'; Use strbuf_setlen() here? > return got; If "got == 0", we naturally do not change sb->len at all, and the strbuf is left unchanged. But do we want to de-allocate what we allocated in strbuf_grow() above? That is what strbuf_read() does, but I think it is even less likely to help anybody here. With the original strbuf_read(), you might do: if (strbuf_read(&buf, fd, hint) <= 0) return; /* got nothing */ but because the nature of strbuf_read_nonblock() is to call it from a loop, you'd want to strbuf_release() when you leave the loop anyway. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
Jeff King writes: > I actually wonder if callers who are _expecting_ non-blocking want to > loop in strbuf_read() at all. > > strbuf_read() is really about reading to EOF, and growing the buffer to > fit all of the input. But that's not at all what you want to do with > non-blocking. There you believe for some reason that data may be > available (probably due to poll), and you want to read one chunk of it, > maybe act, and then go back to polling. I think I could have gone to grab a cup of coffee instead of typing another message that essentially said the same thing ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
Jeff King writes: > I think they have to loop for correctness, but they may do this: > > if (xread(fd, buf, len) < 0) > die_errno("OMG, an error!"); > > which is not correct if "fd" is unknowingly non-blocking. > ... > The spinning behavior is not great, but does mean that we spin and > continue rather than bailing with an error. OK, that is a problem (I just read through "git grep xread -- \*.c". There aren't that many codepaths that read from a fd that we didn't open ourselves, but there indeed are some. So it seems that we would want a xread() that spins to help such codepaths, and xread_nonblock() that gives a short-read upon EWOULDBLOCK. They can share a helper function, of course. > If we reset errno to "0" at the top of the function, we could get around > one problem, but it still makes an annoying interface: the caller has to > check errno for any 0-return to figure out if it was really EOF, or just > EAGAIN. > > If we return -1, though, we have a similar annoyance. If the caller > notices a -1 return value and finds EAGAIN, they still may need to check > sb->len to see if they made forward progress and have data they should > be dealing with. OK. Trying to repurpose strbuf_read() for non-blocking FD was a silly idea to begin with, as it wants to read thru to the EOF, and setting FD explicitly to O_NONBLOCK is a sign that the caller wants to grab as much as possible and does not want to wait for the EOF. So assuming we want strbuf_read_nonblock(), what interface do we want from it? We could: * Have it grab as much as possible into sb as long as it does not block? * Have it grab reasonably large amount into sb, and not blocking is an absolute requirement, but the function is not required to read everything that is available on the FD (i.e. the caller is expected to loop)? If we choose the latter, then your "EAGAIN? EOF?" becomes easier, no? We only have to do a single call to xread(), and then we: - get EAGAIN or EWOULDBLOCK; leave sb as-is, set errno==EAGAIN and return -1. - get something (in which case that is not an EOF yet); append to sb, return the number of bytes. - get EOF; leave sb as-is, return 0. ssize_t strbuf_read_nonblock(struct strbuf *sb, int fd, size_t hint) { strbuf_grow(sb, hint ? hint : 8192); ssize_t want = sb->alloc - sb->len - 1; ssize_t got = xread_nonblock(fd, sb->buf + sb->len, want); if (got < 0) { if (errno == EWOULDBLOCK) errno = EAGAIN; /* make life easier for the caller */ return got; } sb->len += got; sb->buf[sb->len] = '\0'; return got; } -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 10:50 AM, Jeff King wrote: > On Thu, Sep 17, 2015 at 10:45:40AM -0700, Stefan Beller wrote: > >> > You _can_ loop on read until you hit EAGAIN, but I think in general you >> > shouldn't; if you get a lot of input on this fd, you'll starve all of >> > the other descriptors you're polling. You're better off to read a >> > finite amount from each descriptor, and then check again who is ready to >> > read. >> >> That's what I do with the current implementation. Except it's not as clear >> and >> concise as I patched it into the strbuf_read. > > Is it? I thought the implementation you posted bumped the existing > strbuf_read to strbuf_buf_internal, including the loop. So as long as we > are not getting EAGAIN, it will keep reading forever. You'll get EAGAIN pretty fast though, as all you do is reading as fast as you can. > Actually not quite > true, as any read shorter than 8192 bytes will cause us to jump out of > the loop, too, but if we assume that the caller is feeding us data > faster than we can read it, we'll never exit strbuf_read_nonblock() and > serve any of the other descriptors. I see the difference now. That makes sense. > > -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 10:45:40AM -0700, Stefan Beller wrote: > > You _can_ loop on read until you hit EAGAIN, but I think in general you > > shouldn't; if you get a lot of input on this fd, you'll starve all of > > the other descriptors you're polling. You're better off to read a > > finite amount from each descriptor, and then check again who is ready to > > read. > > That's what I do with the current implementation. Except it's not as clear and > concise as I patched it into the strbuf_read. Is it? I thought the implementation you posted bumped the existing strbuf_read to strbuf_buf_internal, including the loop. So as long as we are not getting EAGAIN, it will keep reading forever. Actually not quite true, as any read shorter than 8192 bytes will cause us to jump out of the loop, too, but if we assume that the caller is feeding us data faster than we can read it, we'll never exit strbuf_read_nonblock() and serve any of the other descriptors. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 10:35 AM, Jeff King wrote: > > > You _can_ loop on read until you hit EAGAIN, but I think in general you > shouldn't; if you get a lot of input on this fd, you'll starve all of > the other descriptors you're polling. You're better off to read a > finite amount from each descriptor, and then check again who is ready to > read. That's what I do with the current implementation. Except it's not as clear and concise as I patched it into the strbuf_read. > > And then the return value becomes a no-brainer, because it's just the > return value of read(). Either you got some data, you got EOF, or you > get an error, which might be EAGAIN. You never have the case where you > got some data, but then you also got EOF and EAGAIN, and the caller has > to figure out which. > > So I think you really want something like: > > ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint) > { > ssize_t cnt; > > strbuf_grow(hint ? hint : 8192); > cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); > if (cnt > 0) > strbuf_setlen(sb->len + cnt); > return cnt; > } > > (where I'm assuming xread passes us back EAGAIN; we could also replace > it with read and loop on EINTR ourselves). Yeah that's exactly what I am looking for (the hint may even be over engineered here, as I have no clue how much data comes back). So I guess I could just use that new method now. > I actually wonder if callers who are _expecting_ non-blocking want to > loop in strbuf_read() at all. > > strbuf_read() is really about reading to EOF, and growing the buffer to > fit all of the input. But that's not at all what you want to do with > non-blocking. There you believe for some reason that data may be > available (probably due to poll), and you want to read one chunk of it, > maybe act, and then go back to polling. As a micro project (leftover bit maybe?): When strbuf_read is reading data out from a non blocking pipe, we're currently spinning (with the EAGAIN/EWOULDBLOCK). Introduce a call to poll to reduce the spinning. > > -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 10:26:19AM -0700, Stefan Beller wrote: > > If we return -1, though, we have a similar annoyance. If the caller > > notices a -1 return value and finds EAGAIN, they still may need to check > > sb->len to see if they made forward progress and have data they should > > be dealing with. > > If errno == EAGAIN, we know it is a non blocking fd, so we could encode > the length read as (- 2 - length), such that > > ...-2 the length read if EAGAIN was ignored > -1 for error, check errno! > 0 for EOF > +1... length if we just read it or restarted it due to EINTR. > > The call site should know if it is non blocking (i.e. if the <-2 case can > happen) and handle it appropriately. I actually wonder if callers who are _expecting_ non-blocking want to loop in strbuf_read() at all. strbuf_read() is really about reading to EOF, and growing the buffer to fit all of the input. But that's not at all what you want to do with non-blocking. There you believe for some reason that data may be available (probably due to poll), and you want to read one chunk of it, maybe act, and then go back to polling. You _can_ loop on read until you hit EAGAIN, but I think in general you shouldn't; if you get a lot of input on this fd, you'll starve all of the other descriptors you're polling. You're better off to read a finite amount from each descriptor, and then check again who is ready to read. And then the return value becomes a no-brainer, because it's just the return value of read(). Either you got some data, you got EOF, or you get an error, which might be EAGAIN. You never have the case where you got some data, but then you also got EOF and EAGAIN, and the caller has to figure out which. So I think you really want something like: ssize_t strbuf_read_once(struct strbuf *sb, int fd, size_t hint) { ssize_t cnt; strbuf_grow(hint ? hint : 8192); cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); if (cnt > 0) strbuf_setlen(sb->len + cnt); return cnt; } (where I'm assuming xread passes us back EAGAIN; we could also replace it with read and loop on EINTR ourselves). -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 10:13 AM, Jeff King wrote: > On Thu, Sep 17, 2015 at 09:58:00AM -0700, Junio C Hamano wrote: > >> > Certainly anybody who does not realize their descriptor is O_NONBLOCK >> > and is using the spinning for correctness. I tend to think that such >> > sites are wrong, though, and would benefit from us realizing they are >> > spinning. >> >> With or without O_NONBLOCK, not looping around xread() _and_ relying >> on the spinning for "correctness" means the caller is not getting >> correctness anyway, I think, because xread() does return a short >> read, and we deliberately and explicitly do so since 0b6806b9 >> (xread, xwrite: limit size of IO to 8MB, 2013-08-20). > > I think they have to loop for correctness, but they may do this: > > if (xread(fd, buf, len) < 0) > die_errno("OMG, an error!"); > > which is not correct if "fd" is unknowingly non-blocking. As Stefan > mentioned, we do not set O_NONBLOCK ourselves very much, but I wonder if > we could inherit it from the environment in some cases. > > The spinning behavior is not great, but does mean that we spin and > continue rather than bailing with an error. > >> > But I think you can't quite get away with leaving strbuf_read untouched >> > in this case. On error, it wants to restore the original value of the >> > strbuf before the strbuf_read call. Which means that we throw away >> > anything read into the strbuf before we get EAGAIN, and the caller never >> > gets to see it. >> >> I agree we need to teach strbuf_read() that xread() is now nicer on >> O_NONBLOCK; perhaps like this? >> >> strbuf.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/strbuf.c b/strbuf.c >> index cce5eed..49104d7 100644 >> --- a/strbuf.c >> +++ b/strbuf.c >> @@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t >> hint) >> >> cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); >> if (cnt < 0) { >> + if (errno == EAGAIN || errno == EWOULDBLOCK) >> + break; >> if (oldalloc == 0) >> strbuf_release(sb); >> else > > If we get EAGAIN on the first read, this will return "0", and I think we > end up in the "was it EOF, or EAGAIN?" situation I mentioned earlier. > If we reset errno to "0" at the top of the function, we could get around > one problem, but it still makes an annoying interface: the caller has to > check errno for any 0-return to figure out if it was really EOF, or just > EAGAIN. > > If we return -1, though, we have a similar annoyance. If the caller > notices a -1 return value and finds EAGAIN, they still may need to check > sb->len to see if they made forward progress and have data they should > be dealing with. If errno == EAGAIN, we know it is a non blocking fd, so we could encode the length read as (- 2 - length), such that ...-2 the length read if EAGAIN was ignored -1 for error, check errno! 0 for EOF +1... length if we just read it or restarted it due to EINTR. The call site should know if it is non blocking (i.e. if the <-2 case can happen) and handle it appropriately. Any callsite of today which is unaware of the fd being non blocking would break by this (as we want to fix the spinning anyway eventually) > > -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 9:58 AM, Junio C Hamano wrote: > Jeff King writes: > >> On Thu, Sep 17, 2015 at 09:13:40AM -0700, Junio C Hamano wrote: >> >>> And your new caller that does O_NONBLOCK wants to do more than >>> looping upon EWOULDBLOCK. It certainly would not want us to loop >>> here. >>> >>> So I wonder if you can just O_NONBLOCK the fd and use the usual >>> strbuf_read(), i.e. without any change in this patch, and update >>> xread() to _unconditionally_ return when read(2) says EAGAIN or >>> EWOULDBLOCK. >>> >>> What would that break? >> >> Certainly anybody who does not realize their descriptor is O_NONBLOCK >> and is using the spinning for correctness. I tend to think that such >> sites are wrong, though, and would benefit from us realizing they are >> spinning. > > With or without O_NONBLOCK, not looping around xread() _and_ relying > on the spinning for "correctness" means the caller is not getting > correctness anyway, I think, because xread() does return a short > read, and we deliberately and explicitly do so since 0b6806b9 > (xread, xwrite: limit size of IO to 8MB, 2013-08-20). > >> But I think you can't quite get away with leaving strbuf_read untouched >> in this case. On error, it wants to restore the original value of the >> strbuf before the strbuf_read call. Which means that we throw away >> anything read into the strbuf before we get EAGAIN, and the caller never >> gets to see it. > > I agree we need to teach strbuf_read() that xread() is now nicer on > O_NONBLOCK; perhaps like this? > > strbuf.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/strbuf.c b/strbuf.c > index cce5eed..49104d7 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t > hint) > > cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); > if (cnt < 0) { > + if (errno == EAGAIN || errno == EWOULDBLOCK) > + break; This would need xread to behave differently too. (if EAGAIN, return) Looking at xread to answer: "Why did we have a spinning loop in case of EAGAIN in the first place?" I ended up with 1c15afb9343b (2005-12-19, xread/xwrite: do not worry about EINTR at calling sites.) There we had lots of EAGAIN checks sprinkled through the code base, so we had non blocking IO back then? I think I chose to not use xread, as it contradicts everything we want in the non blocking case. We want to ignore any operations with a recoverable error (EAGAIN and EINTR) and keep going with the rest of the program. In the blocking case (as it is used currently) we can just have the checks from xread moved to strbuf_read. > if (oldalloc == 0) > strbuf_release(sb); > else -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 09:58:00AM -0700, Junio C Hamano wrote: > > Certainly anybody who does not realize their descriptor is O_NONBLOCK > > and is using the spinning for correctness. I tend to think that such > > sites are wrong, though, and would benefit from us realizing they are > > spinning. > > With or without O_NONBLOCK, not looping around xread() _and_ relying > on the spinning for "correctness" means the caller is not getting > correctness anyway, I think, because xread() does return a short > read, and we deliberately and explicitly do so since 0b6806b9 > (xread, xwrite: limit size of IO to 8MB, 2013-08-20). I think they have to loop for correctness, but they may do this: if (xread(fd, buf, len) < 0) die_errno("OMG, an error!"); which is not correct if "fd" is unknowingly non-blocking. As Stefan mentioned, we do not set O_NONBLOCK ourselves very much, but I wonder if we could inherit it from the environment in some cases. The spinning behavior is not great, but does mean that we spin and continue rather than bailing with an error. > > But I think you can't quite get away with leaving strbuf_read untouched > > in this case. On error, it wants to restore the original value of the > > strbuf before the strbuf_read call. Which means that we throw away > > anything read into the strbuf before we get EAGAIN, and the caller never > > gets to see it. > > I agree we need to teach strbuf_read() that xread() is now nicer on > O_NONBLOCK; perhaps like this? > > strbuf.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/strbuf.c b/strbuf.c > index cce5eed..49104d7 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t > hint) > > cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); > if (cnt < 0) { > + if (errno == EAGAIN || errno == EWOULDBLOCK) > + break; > if (oldalloc == 0) > strbuf_release(sb); > else If we get EAGAIN on the first read, this will return "0", and I think we end up in the "was it EOF, or EAGAIN?" situation I mentioned earlier. If we reset errno to "0" at the top of the function, we could get around one problem, but it still makes an annoying interface: the caller has to check errno for any 0-return to figure out if it was really EOF, or just EAGAIN. If we return -1, though, we have a similar annoyance. If the caller notices a -1 return value and finds EAGAIN, they still may need to check sb->len to see if they made forward progress and have data they should be dealing with. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 09:44:19AM -0700, Junio C Hamano wrote: > > Arguably we should actually return the number of bytes we _did_ read, > > but then caller cannot easily tell the difference between EOF and > > EAGAIN. > > Why can't it check errno==EAGAIN/EWOULDBLOCK? Is it trustworthy to check errno if read() has not actually ever returned -1? E.g., consider this sequence: 1. somebody (maybe even us calling strbuf_read) calls read() and it returns -1, setting errno to EAGAIN 2. we call strbuf_read() 2a. it calls read(), which returns N bytes 2b. it calls read() again, which returns 0 for EOF 2c. it returns N, because that's how many bytes it read 3. we wonder if we hit EOF, or if we simply need to read again. We check errno == EAGAIN I don't think the read calls in step 2 are guaranteed to clear errno, and we might read the cruft from step 1. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
Jeff King writes: > On Thu, Sep 17, 2015 at 09:13:40AM -0700, Junio C Hamano wrote: > >> And your new caller that does O_NONBLOCK wants to do more than >> looping upon EWOULDBLOCK. It certainly would not want us to loop >> here. >> >> So I wonder if you can just O_NONBLOCK the fd and use the usual >> strbuf_read(), i.e. without any change in this patch, and update >> xread() to _unconditionally_ return when read(2) says EAGAIN or >> EWOULDBLOCK. >> >> What would that break? > > Certainly anybody who does not realize their descriptor is O_NONBLOCK > and is using the spinning for correctness. I tend to think that such > sites are wrong, though, and would benefit from us realizing they are > spinning. With or without O_NONBLOCK, not looping around xread() _and_ relying on the spinning for "correctness" means the caller is not getting correctness anyway, I think, because xread() does return a short read, and we deliberately and explicitly do so since 0b6806b9 (xread, xwrite: limit size of IO to 8MB, 2013-08-20). > But I think you can't quite get away with leaving strbuf_read untouched > in this case. On error, it wants to restore the original value of the > strbuf before the strbuf_read call. Which means that we throw away > anything read into the strbuf before we get EAGAIN, and the caller never > gets to see it. I agree we need to teach strbuf_read() that xread() is now nicer on O_NONBLOCK; perhaps like this? strbuf.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/strbuf.c b/strbuf.c index cce5eed..49104d7 100644 --- a/strbuf.c +++ b/strbuf.c @@ -368,6 +368,8 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint) cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); if (cnt < 0) { + if (errno == EAGAIN || errno == EWOULDBLOCK) + break; if (oldalloc == 0) strbuf_release(sb); else -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 9:44 AM, Junio C Hamano wrote: > On Thu, Sep 17, 2015 at 9:30 AM, Jeff King wrote: >> >> So I think we would probably want to treat EAGAIN specially: return -1 >> to signal to the caller but _don't_ truncate the strbuf. > > Yeah, "don't truncate" is needed. > >> Arguably we should actually return the number of bytes we _did_ read, >> but then caller cannot easily tell the difference between EOF and >> EAGAIN. > > Why can't it check errno==EAGAIN/EWOULDBLOCK? Grepping through Gits sources, there are no occurrences of explicitly setting O_NONBLOCK except for the newly introduced spot in the followup patch in run-command (yes we do poll there). So how would I find out if a fd is blocking or not in the 82 cases of strbuf_read? (Now I naively assume they would all block.) We could also expose the flags in the API. I think IGNORE_EAGAIN might not be the best now, but rather a NO_TRUNCATE flag would do. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 9:30 AM, Jeff King wrote: > > So I think we would probably want to treat EAGAIN specially: return -1 > to signal to the caller but _don't_ truncate the strbuf. Yeah, "don't truncate" is needed. > Arguably we should actually return the number of bytes we _did_ read, > but then caller cannot easily tell the difference between EOF and > EAGAIN. Why can't it check errno==EAGAIN/EWOULDBLOCK? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
On Thu, Sep 17, 2015 at 09:13:40AM -0700, Junio C Hamano wrote: > And your new caller that does O_NONBLOCK wants to do more than > looping upon EWOULDBLOCK. It certainly would not want us to loop > here. > > So I wonder if you can just O_NONBLOCK the fd and use the usual > strbuf_read(), i.e. without any change in this patch, and update > xread() to _unconditionally_ return when read(2) says EAGAIN or > EWOULDBLOCK. > > What would that break? Certainly anybody who does not realize their descriptor is O_NONBLOCK and is using the spinning for correctness. I tend to think that such sites are wrong, though, and would benefit from us realizing they are spinning. But I think you can't quite get away with leaving strbuf_read untouched in this case. On error, it wants to restore the original value of the strbuf before the strbuf_read call. Which means that we throw away anything read into the strbuf before we get EAGAIN, and the caller never gets to see it. So I think we would probably want to treat EAGAIN specially: return -1 to signal to the caller but _don't_ truncate the strbuf. Arguably we should actually return the number of bytes we _did_ read, but then caller cannot easily tell the difference between EOF and EAGAIN. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock
Stefan Beller writes: > Subject: Re: [PATCH 01/10] strbuf: Add strbuf_read_noblock s/Add/add/; > We need to read from pipes without blocking in a later patch. I am hoping that you are at least not spinning---i.e. do a poll first to make sure there is at least some progress to be made before calling this. > Signed-off-by: Stefan Beller > --- > strbuf.c | 25 +++-- > strbuf.h | 6 ++ > 2 files changed, 29 insertions(+), 2 deletions(-) > > diff --git a/strbuf.c b/strbuf.c > index cce5eed..4130ee2 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -357,7 +357,10 @@ size_t strbuf_fread(struct strbuf *sb, size_t size, FILE > *f) > return res; > } > > -ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint) > +#define IGNORE_EAGAIN (1) > + > +static ssize_t strbuf_read_internal(struct strbuf *sb, int fd, > + size_t hint, int flags) > { > size_t oldlen = sb->len; > size_t oldalloc = sb->alloc; > @@ -366,8 +369,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t > hint) > for (;;) { > ssize_t cnt; > > - cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); > + cnt = read(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); > if (cnt < 0) { > + if (errno == EINTR) > + continue; > + if (errno == EAGAIN) { > + if (flags & IGNORE_EAGAIN) > + break; > + else > + continue; > + } In order to ensure that this is not breaking the normal case, I had to look at the implementation of xread() to see it behaves identically when the flags is not passed. That one-time review burden implies that this is adding a maintenance burden to keep this copied function in sync. We should also handle EWOULDBLOCK not just EAGAIN. More importantly, I am not sure if this helper is even necessary. Looking at xread (reproduced in its full glory): /* * xread() is the same a read(), but it automatically restarts read() * operations with a recoverable error (EAGAIN and EINTR). xread() * DOES NOT GUARANTEE that "len" bytes is read even if the data is available. */ ssize_t xread(int fd, void *buf, size_t len) { ssize_t nr; if (len > MAX_IO_SIZE) len = MAX_IO_SIZE; while (1) { nr = read(fd, buf, len); if ((nr < 0) && (errno == EAGAIN || errno == EINTR)) continue; return nr; } } Are we doing the right thing for EAGAIN here? Now, man read(2) says this: EAGAIN The file descriptor fd refers to a file other than a socket and has been marked nonblocking (O_NONBLOCK), and the read would block. EAGAIN or EWOULDBLOCK The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and the read would block. POSIX.1-2001 allows either error to be returned for this case, and does not require these con‐ stants to have the same value, so a portable application should check for both possibilities. If the fd is not marked nonblocking, then we will not get EAGAIN (or EWOULDBLOCK). When fd is set to nonblocking, the current xread() spins if read() says that the operation would block. What would we achieve by spinning ourselves here, though? The caller must have set the fd to be nonblocking for a reason, and that reason cannot be "if the data is not ready, we do not have anything better than just spinning for new data to arrive"---if so, the caller wouldn't have set the fd to be nonblocking in the first place. Even if there is such a stupid caller that only wants us to loop, because we explicitly allow xread() to return a short read, all of our callers must call it in a loop if they know how much they want to read. We can just return from here and let them loop around us. And your new caller that does O_NONBLOCK wants to do more than looping upon EWOULDBLOCK. It certainly would not want us to loop here. So I wonder if you can just O_NONBLOCK the fd and use the usual strbuf_read(), i.e. without any change in this patch, and update xread() to _unconditionally_ return when read(2) says EAGAIN or EWOULDBLOCK. What would that break? > if (oldalloc == 0) > strbuf_release(sb); > else > @@ -384,6 +395,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t > hint) > return sb->len
[PATCH 01/10] strbuf: Add strbuf_read_noblock
We need to read from pipes without blocking in a later patch. Signed-off-by: Stefan Beller --- strbuf.c | 25 +++-- strbuf.h | 6 ++ 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/strbuf.c b/strbuf.c index cce5eed..4130ee2 100644 --- a/strbuf.c +++ b/strbuf.c @@ -357,7 +357,10 @@ size_t strbuf_fread(struct strbuf *sb, size_t size, FILE *f) return res; } -ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint) +#define IGNORE_EAGAIN (1) + +static ssize_t strbuf_read_internal(struct strbuf *sb, int fd, + size_t hint, int flags) { size_t oldlen = sb->len; size_t oldalloc = sb->alloc; @@ -366,8 +369,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint) for (;;) { ssize_t cnt; - cnt = xread(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); + cnt = read(fd, sb->buf + sb->len, sb->alloc - sb->len - 1); if (cnt < 0) { + if (errno == EINTR) + continue; + if (errno == EAGAIN) { + if (flags & IGNORE_EAGAIN) + break; + else + continue; + } if (oldalloc == 0) strbuf_release(sb); else @@ -384,6 +395,16 @@ ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint) return sb->len - oldlen; } +ssize_t strbuf_read(struct strbuf *sb, int fd, size_t hint) +{ + return strbuf_read_internal(sb, fd, hint, 0); +} + +ssize_t strbuf_read_noblock(struct strbuf *sb, int fd, size_t hint) +{ + return strbuf_read_internal(sb, fd, hint, IGNORE_EAGAIN); +} + #define STRBUF_MAXLINK (2*PATH_MAX) int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint) diff --git a/strbuf.h b/strbuf.h index aef2794..23ca7aa 100644 --- a/strbuf.h +++ b/strbuf.h @@ -367,6 +367,12 @@ extern size_t strbuf_fread(struct strbuf *, size_t, FILE *); extern ssize_t strbuf_read(struct strbuf *, int fd, size_t hint); /** + * Same as strbuf_read, just returns non-blockingly by ignoring EAGAIN. + * The fd must have set O_NONBLOCK. + */ +extern ssize_t strbuf_read_noblock(struct strbuf *, int fd, size_t hint); + +/** * Read the contents of a file, specified by its path. The third argument * can be used to give a hint about the file size, to avoid reallocs. */ -- 2.6.0.rc0.131.gf624c3d -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html