Maybe a more concrete way of asking this question: Let's say we have a
Python gRPC client making a response-streaming request to some gRPC server.
The server starts to stream back responses. If the client fails to consume
data as fast as the server generates it, I'm trying to figure out where the
data would accumulate and which memory allocator it would be using. (I ask
because Python heap profiling won't see allocations that go straight to
malloc().)
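
For concreteness, here's a minimal sketch of the consumer side I have in
mind. The service, stub, and message names are made up, not any real API;
only the shape of the call matters:

    import time

    import grpc

    # Hypothetical generated stub; the service and message names here are
    # placeholders, not any real API.
    from my_service_pb2 import StreamRequest
    from my_service_pb2_grpc import MyServiceStub

    channel = grpc.insecure_channel("server.example.com:50051")
    stub = MyServiceStub(channel)

    # The call returns a response iterator immediately; each message is only
    # deserialized into a Python object when the loop pulls it.
    responses = stub.StreamData(StreamRequest())

    for response in responses:
        time.sleep(1.0)  # deliberately consuming slower than the server sends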

If I'm understanding correctly:

* The responses are written by the server to the network socket at the
server's own speed, with no pushback controlling it.
* On the client, these are received by the kernel's network stack and pulled
into userspace as fast as possible by the event loop, which lives in the C
layer of the gRPC client. The data is stored in a grpc_byte_buffer and builds
up there.
* The Python client library exposes a response iterator, which is ultimately
a _Rendezvous object. Its iteration is implemented in _Rendezvous._next(),
which issues a cygrpc.ReceiveMessageOperation; that operation drains data
from the grpc_byte_buffer and hands it to the protobuf parser, which creates
objects in the Python address space and returns them to the caller.

This means that if the client drains the iterator more slowly than the
server produces responses, data would accumulate in the grpc_byte_buffer,
which lives in the C layer and is not visible to (for example) Python heap
profiling based on the PEP 445 malloc hooks.
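
One crude cross-check I can think of: compare what tracemalloc (which sits
on top of the PEP 445 hooks) reports against the process RSS while draining
the stream. A large gap would at least be consistent with memory living
outside the Python allocator, though it wouldn't pinpoint the
grpc_byte_buffer. Something like:

    import resource
    import tracemalloc

    tracemalloc.start()

    # ... make the streaming call and drain it slowly, as in the sketch above ...

    _current, python_peak = tracemalloc.get_traced_memory()
    rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss  # kilobytes on Linux

    print(f"Python-level allocations (tracemalloc peak): {python_peak / 1e6:.1f} MB")
    print(f"Process peak RSS:                            {rss_kb / 1e3:.1f} MB")
    # If RSS is far above what tracemalloc sees, the difference is being allocated
    # outside the Python allocator (possibly gRPC's C core, though this alone
    # doesn't prove where).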

If I am understanding this correctly, is there any way (without doing a
massive amount of plumbing) to monitor the state of the byte buffer, e.g.
with some gRPC debug parameter? And is there any mechanism in the C layer
which limits the size of this buffer, doing something like failing the RPC
if the buffer size exceeds some threshold?
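
The closest thing to a debug parameter I can think of is the C core's
tracing environment variables. If I'm reading grpc/doc/environment_variables.md
correctly, something like the following should at least log flow-control and
HTTP/2 transport activity, though I don't know whether it surfaces buffer
sizes directly:

    import os

    # The gRPC C core reads these at startup, so set them before the grpc module
    # is imported / the channel is created. The tracer names ("flowctl", "http")
    # are taken from grpc/doc/environment_variables.md and may vary by version.
    os.environ["GRPC_VERBOSITY"] = "DEBUG"
    os.environ["GRPC_TRACE"] = "flowctl,http"

    import grpc  # imported after setting the env vars, intentionally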

Yonatan

On Thu, Jul 18, 2019 at 5:27 PM Yonatan Zunger <zun...@humu.com> wrote:

> Hi everyone,
>
> I'm trying to debug a mysterious memory blowout in a Python batch job, and
> one of the angles I'm exploring is that this may have to do with the way
> it's reading data. This job is reading from Bigtable, which is ultimately
> fetching the actual data with a unidirectional streaming "read rows" RPC.
> This takes a single request and returns a sequence of data chunks; the
> higher-level client reshapes this into an iterator over the individual data
> cells, which are consumed by the higher-level program, so that the next
> response proto is consumed only once the program is ready to parse it.
>
> Something I can't remember about gRPC internals: What, if anything, is the
> pushback mechanism in unidirectional streaming? In the zero-pushback case,
> it would seem that a server could yield results at any speed; those results
> would be accepted by the client and stored in gRPC's internal buffers until
> the client code read them, which could cause a large memory blowout if the
> server wrote faster than the client read. Is this in fact
> the case? If so, is there any good way to instrument and detect if it's
> happening? (Some combination of gRPC debug flags, perhaps.) If not, is there
> some pushback mechanism I'm not thinking of?
>
> (Alas, I can't change the protocol in this situation; the server is run by
> someone else.)
>
> Yonatan
>
