Deserialization happens at the surface layer, not the transport layer, so 
that is where I would look unless we suspect that the HTTP/2 frames 
themselves were malformed. If we suspect the serialization/deserialization 
code, we can check whether simply serializing the proto to bytes and back 
causes issues. Protobuf has utility functions to do this. Alternatively, 
gRPC has utility functions here: 
https://github.com/grpc/grpc/blob/master/include/grpcpp/impl/codegen/proto_utils.h

I am worried about memory corruption, though, so that is certainly something 
to check.
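
As a rough illustration of the serialize-to-bytes-and-back check, here is a 
minimal sketch. It assumes only that the message type has the usual protobuf 
`SerializeToString` and `ParseFromString` members. `RoundTripsCleanly` and 
`FakeStatus` are hypothetical names made up for this example; `FakeStatus` is 
a toy stand-in so the snippet compiles without protoc, and its encoding is 
not the protobuf wire format. With a real generated message you would pass 
that type instead.

```cpp
#include <cstdint>
#include <string>

// Returns true if `msg` survives serialize -> parse -> serialize unchanged.
// Msg is any type with protobuf-style SerializeToString/ParseFromString.
// Comparing bytes assumes the same binary serializes the same message the
// same way, which is reasonable for a message of scalars and one string.
template <typename Msg>
bool RoundTripsCleanly(const Msg& msg) {
  std::string wire;
  if (!msg.SerializeToString(&wire)) return false;  // could not serialize

  Msg reparsed;
  if (!reparsed.ParseFromString(wire)) return false;  // our own bytes rejected

  std::string wire_again;
  if (!reparsed.SerializeToString(&wire_again)) return false;
  return wire == wire_again;  // any difference means data was lost or altered
}

// Hypothetical stand-in for a generated message (NOT real protobuf encoding):
// 4 bytes of counter (little-endian), 1 byte of bool, then the string.
struct FakeStatus {
  std::uint32_t counter = 0;
  bool healthy = false;
  std::string name;

  bool SerializeToString(std::string* out) const {
    out->clear();
    for (int i = 0; i < 4; ++i) {
      out->push_back(static_cast<char>((counter >> (8 * i)) & 0xFF));
    }
    out->push_back(static_cast<char>(healthy ? 1 : 0));
    out->append(name);
    return true;
  }

  bool ParseFromString(const std::string& in) {
    if (in.size() < 5) return false;  // too short to hold the fixed fields
    counter = 0;
    for (int i = 0; i < 4; ++i) {
      counter |= static_cast<std::uint32_t>(static_cast<unsigned char>(in[i]))
                 << (8 * i);
    }
    healthy = in[4] != 0;
    name = in.substr(5);
    return true;
  }
};
```

On the server, something like `if (!RoundTripsCleanly(reply)) { /* log it */ }` 
just before the `Write()` call should show whether the message is already bad 
before gRPC sees it. If comparing bytes is too strict for your message, 
protobuf's `MessageDifferencer::Equals` is another way to compare the original 
and the reparsed copy.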


On Wednesday, March 24, 2021 at 11:02:30 AM UTC-7 Bryan Schwerer wrote:

> Thanks for replying.
>
> I was able to get a tcpdump capture and run it through the Wireshark 
> dissector.  It indicated that there were malformed protobuf fields in the 
> message.  I'm guessing the client threw the messages away.   I didn't see a 
> trace message indicating that.  Is there some sort of stat I can check?  
> Would it be possible that older versions didn't discard malformed messages?  
> I haven't loaded up an old version of our code, but I suspect it has always 
> been there.  The end of the message has counters and such that if they were 
> a bit off, no one would notice.
>
> I think we are corrupting the messages on the server side.  I turned on 
> -fstack-protector-all and the problem went away.  If there's a possible way 
> to check the message before sending to Writer, that may give us more 
> information.  We don't use arenas.  The message itself is uint32's, bool's 
> and one string.  I assume protobufs makes a copy of the string and not the 
> pointer to the buffer.
>
> On Wednesday, March 24, 2021 at 1:35:29 PM UTC-4 yas...@google.com wrote:
>
>> This is pretty strange. It is possible that we are being blocked on flow 
>> control. I would check that we are making sure that the application layer 
>> is reading. If I am not mistaken, `perform_stream_op[s=0x7f0e16937290]:  
>> RECV_MESSAGE` is a log that is seen at the start of an operation, meaning 
>> that the HTTP/2 layer hasn't yet been instructed to read a message (or 
>> there is a previous read on the stream already that hasn't finished). Given 
>> that you are just updating the gRPC version from 1.20 to 1.36.1, I do not 
>> have an answer as to why you would see this without any application 
>> changes. 
>>
>> A few questions - 
>> Do the two streams use the same underlying channel/transport?
>> Are the clients and the server in the same process?
>> Is there anything special about the environment this is being run in?
>>
>> (One way to make sure that the read op is being propagated to the 
>> transport layer, is to check the logs with the "channel" tracer.)
>> On Friday, March 19, 2021 at 12:59:30 PM UTC-7 Bryan Schwerer wrote:
>>
>>> Hello,
>>>
>>> I'm in the long overdue process of updating gRPC from 1.20 to 1.36.1.  I 
>>> am running into an issue where the streaming replies from the server are 
>>> not reaching the client in about 50% of the instances.  This is binary: 
>>> either the streaming call works perfectly or it doesn't work at all.  After 
>>> debugging a bit, I turned on the http tracing and from what I can tell, the 
>>> http messages are received in the client thread; in the correct case, 
>>> perform_stream_op[s=0x7f0e16937290]:  RECV_MESSAGE is logged, but in 
>>> the broken case it isn't.  No error messages occur.
>>>
>>> I've tried various tracers, but haven't hit anything.  The code is 
>>> pretty much the same pattern as the example and there's no indication any 
>>> disconnect has occurred which would cause the call to terminate.  Using gdb 
>>> to look at the thread, it is still in epoll_wait.
>>>
>>> The process in which this runs makes 2 different synchronous server 
>>> streaming calls to the same server in separate threads.  It also is a gRPC 
>>> server.  Everything is run over the internal 'lo' interface.  Any ideas on 
>>> where to look to debug this?
>>>
>>> Thanks,
>>>
>>> Bryan
>>>
>>

