Enabling flowctl debug tracing might surface useful logs when, say, the client is not consuming at all while the server keeps generating. https://github.com/grpc/grpc/blob/master/doc/environment_variables.md
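For example, a minimal sketch (the target address is a placeholder; the C-core reads these variables at library initialization, so they must be set before grpc is imported):

    import os

    # The C-core reads these at initialization, so set them before importing
    # grpc. "flowctl" traces HTTP/2 flow control; see the
    # environment_variables.md doc linked above for the full list of tracers.
    os.environ["GRPC_TRACE"] = "flowctl"
    os.environ["GRPC_VERBOSITY"] = "DEBUG"

    import grpc  # imported after the env vars on purpose

    # Placeholder target; flow-control traces are written to stderr while
    # the streaming call runs.
    channel = grpc.insecure_channel("localhost:50051")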
On Fri, Jul 19, 2019 at 1:03 PM Yonatan Zunger <zun...@humu.com> wrote:

> I have no idea what would be involved in attaching ASAN to Python, and
> suspect it may be "exciting," so I'm trying to see first if gRPC has any
> monitoring capability around its buffers.
>
> One thing I did notice while reading through the codebase was unit tests
> like this one
> <https://github.com/grpc/grpc/blob/master/test/core/end2end/tests/retry_exceeds_buffer_size_in_subsequent_batch.cc>
> about exceeding buffer sizes -- that does seem to trigger an ABORTED
> response, but the test was fairly hard to understand (not much commenting
> there...). Am I right in thinking that if this 4MB buffer is overflowed,
> that's somehow going to happen?
>
> On Fri, Jul 19, 2019 at 12:59 PM Lidi Zheng <li...@google.com> wrote:
>
>> Hi Yonatan,
>>
>> On the gRPC Python side, messages are consumed sequentially and aren't
>> kept in memory. If you recall the batch operations: only after a message
>> has been delivered to the application will gRPC Python start another
>> RECV_MESSAGE operation. It's unlikely that the problem resides in Python
>> space.
>>
>> In C-Core space, AFAIK the size of each TCP read is 4MiB
>> <https://github.com/grpc/grpc/blob/master/src/core/lib/iomgr/tcp_posix.cc#L1177>
>> per channel. I think we have flow control at both the TCP level and the
>> HTTP/2 level.
>>
>> For debugging, did you try ASAN? For channel args, I can only find
>> "GRPC_ARG_TCP_READ_CHUNK_SIZE" and "GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH"
>> that might be related to your case.
>>
>> Lidi Zheng
>>
>> On Fri, Jul 19, 2019 at 12:48 PM Yonatan Zunger <zun...@humu.com> wrote:
>>
>>> Maybe a more concrete way of asking this question: Let's say we have a
>>> Python gRPC client making a response-streaming request to some gRPC
>>> server. The server starts to stream back responses. If the client fails
>>> to consume data as fast as the server generates it, I'm trying to
>>> figure out where the data would accumulate, and which memory allocator
>>> it would be using. (Because Python heap profiling won't see calls to
>>> malloc().)
>>>
>>> If I'm understanding correctly:
>>>
>>> * The responses are written by the server to the network socket at the
>>> server's own speed (no pushback controlling it);
>>> * These get picked up by the kernel network device on the client, and
>>> get pulled into userspace ASAP by the event loop, which is in the C
>>> layer of the gRPC client. The data is stored in a grpc_byte_buffer and
>>> builds up there.
>>> * The Python client library exposes a response iterator, which is
>>> ultimately a _Rendezvous object; its iteration is implemented in
>>> _Rendezvous._next(), which calls cygrpc.ReceiveMessageOperation, which
>>> is what drains data from the grpc_byte_buffer and passes it to the
>>> protobuf parser, which creates objects in the Python memory address
>>> space and returns them to the caller.
>>>
>>> This means that if the client were to drain the iterator more slowly,
>>> data would accumulate in the grpc_byte_buffer, which is in the C layer
>>> and not visible to (e.g.) Python heap profiling using the PEP 445
>>> malloc hooks.
>>>
>>> If I am understanding this correctly, is there any way (without doing a
>>> massive amount of plumbing) to monitor the state of the byte buffer,
>>> e.g. with some gRPC debug parameter? And is there any mechanism in the
>>> C layer which limits the size of this buffer, doing something like
>>> failing the RPC if the buffer size exceeds some threshold?
>>>
>>> Yonatan
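A minimal sketch of the consumption pattern under discussion (the generated modules, stub, and message types are hypothetical stand-ins for the real "read rows" service; the option keys are the string values defined for the channel-arg constants named above in grpc_types.h):

    import time

    import grpc

    import rows_pb2        # hypothetical generated module
    import rows_pb2_grpc   # hypothetical generated module

    # String keys behind the channel args Lidi mentions. -1 would mean
    # "unlimited" for the receive-length cap.
    options = [
        ("grpc.experimental.tcp_read_chunk_size", 64 * 1024),  # GRPC_ARG_TCP_READ_CHUNK_SIZE
        ("grpc.max_receive_message_length", 4 * 1024 * 1024),  # GRPC_ARG_MAX_RECEIVE_MESSAGE_LENGTH
    ]
    channel = grpc.insecure_channel("localhost:50051", options=options)

    # Each next() on the response iterator is what lets gRPC Python issue
    # the next RECV_MESSAGE batch operation, so sleeping here makes the
    # backlog (if any) build up below the Python layer.
    stub = rows_pb2_grpc.RowReaderStub(channel)                  # hypothetical
    for response in stub.ReadRows(rows_pb2.ReadRowsRequest()):   # hypothetical
        time.sleep(1.0)  # deliberately slow consumer for the experiment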
>>> On Thu, Jul 18, 2019 at 5:27 PM Yonatan Zunger <zun...@humu.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I'm trying to debug a mysterious memory blowout in a Python batch job,
>>>> and one of the angles I'm exploring is that it may have to do with the
>>>> way the job reads data. The job reads from Bigtable, which ultimately
>>>> fetches the actual data with a unidirectional streaming "read rows"
>>>> RPC. This takes a single request and returns a sequence of data
>>>> chunks; the higher-level client reshapes this into an iterator over
>>>> the individual data cells, and those are consumed by the higher-level
>>>> program, so that the next response proto is consumed once the program
>>>> is ready to parse it.
>>>>
>>>> Something I can't remember about gRPC internals: What, if anything, is
>>>> the pushback mechanism in unidirectional streaming? In the
>>>> zero-pushback case, it would seem that a server could yield results at
>>>> any speed, which would be accepted by the client and stored in gRPC's
>>>> internal buffers until they got read by the client code -- potentially
>>>> causing a large memory blowout if the server wrote faster than the
>>>> client read. Is this in fact the case? If so, is there any good way to
>>>> instrument and detect whether it's happening? (Some combination of
>>>> gRPC debug flags, perhaps.) If not, is there some pushback mechanism
>>>> I'm not thinking of?
>>>>
>>>> (Alas, I can't change the protocol in this situation; the server is
>>>> run by someone else.)
>>>>
>>>> Yonatan
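One way to reproduce the scenario this question describes (all service and message names are hypothetical, matching the client sketch above): a response-streaming handler that yields as fast as it can, paired with a slow consumer. With the flowctl tracing from the top of the thread enabled on the client, one can watch whether HTTP/2 window updates eventually stall the server's writes once the client stops consuming.

    from concurrent import futures

    import grpc

    import rows_pb2        # hypothetical generated module
    import rows_pb2_grpc   # hypothetical generated module

    class FastProducer(rows_pb2_grpc.RowReaderServicer):  # hypothetical base
        def ReadRows(self, request, context):
            # Yield as fast as possible. If there were truly zero pushback,
            # a slow client would accumulate all of this in its buffers; if
            # HTTP/2 flow control applies, these yields should eventually
            # block once the client's window fills.
            while context.is_active():
                yield rows_pb2.ReadRowsResponse(chunk=b"x" * 65536)

    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    rows_pb2_grpc.add_RowReaderServicer_to_server(FastProducer(), server)
    server.add_insecure_port("[::]:50051")
    server.start()
    server.wait_for_termination()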