Could one of the developers please elaborate on the case where:
* the sender has been blocked (briefly) on a closed TCP window, leaving a 
large set of gRPC chunks waiting on the HTTP/2 stream sender side, and
* the receiver has not sent an outbound WINDOW_UPDATE to signal that it is 
OK to send more data?

My current impression is that the Python sender will not continue sending 
even though the TCP congestion window is open again.
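
For context, here is a minimal sketch of the kind of setup in which I see this. 
It avoids protobuf by using a generic handler with raw bytes, and the client 
deliberately reads slowly so the receive/TCP window fills up for a while. The 
method name "/test.Echo/Stream" and the payload are made up; this is only an 
approximation of my real program, not the actual code.

# Sketch: server-streaming ~1 MiB in 16 KiB messages to a slow reader.
import time
from concurrent import futures

import grpc


def stream_chunks(request, context):
    chunk = b"Hello World!!!!." * 1024      # 16 KiB per message
    for _ in range(64):                     # 64 * 16 KiB = 1 MiB total
        yield chunk


class EchoHandler(grpc.GenericRpcHandler):
    def service(self, handler_call_details):
        if handler_call_details.method == "/test.Echo/Stream":
            # no (de)serializers, so raw bytes pass straight through
            return grpc.unary_stream_rpc_method_handler(stream_chunks)
        return None


def main():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
    server.add_generic_rpc_handlers((EchoHandler(),))
    port = server.add_insecure_port("127.0.0.1:0")
    server.start()

    channel = grpc.insecure_channel(f"127.0.0.1:{port}")
    stream = channel.unary_stream("/test.Echo/Stream")
    for i, chunk in enumerate(stream(b"go")):
        time.sleep(0.1)                     # deliberately slow reader
        print(f"received chunk {i}, {len(chunk)} bytes")

    server.stop(None)


if __name__ == "__main__":
    main()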


On Wednesday, 5 January 2022 at 14:52:09 UTC+1, John Forensics wrote:

> So the stream is stalled while the congestion window is closed. I just 
> verified this with tcpdump and tcptrace, and can confirm that's the problem.
>
>
> https://github.com/grpc/grpc/blob/2e3e3ccc3d5c40010a6a5555d6f824a5024176a7/src/core/ext/transport/chttp2/transport/writing.cc#L449
>
> This explains why excessive logging (and anything else that makes my 
> program slower) helps, e.g. enabling debug logging of gRPC itself.
>
> Which changes the question: why isn't the stream unstalled when the 
> congestion window opens again?
>
> This also seems like something that could differ in terms of low-level 
> APIs between Linux, Windows and WINE.
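>
> In case it helps anyone experiment: one thing I plan to try is starting the 
> Python server with explicit HTTP/2 flow-control related channel args, to see 
> whether the stall behaviour changes. The option names below come from 
> grpc_types.h in the C core; whether they are the right knobs for this 
> problem is only my assumption.
>
> # Experiment only: tune flow-control-related channel args on the server.
> from concurrent import futures
> import grpc
>
> options = [
>     ("grpc.http2.bdp_probe", 1),              # keep BDP-based window growth on
>     ("grpc.http2.lookahead_bytes", 1 << 20),  # larger per-stream lookahead
>     ("grpc.http2.write_buffer_size", 1 << 20),
> ]
> server = grpc.server(futures.ThreadPoolExecutor(max_workers=4),
>                      options=options)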
>
> On Wednesday, 5 January 2022 at 14:32:20 UTC+1, John Forensics wrote:
>
>> After sending some data (about 4 chunks of 16 KB of a single 1 MB gRPC 
>> message), the HTTP/2 stream seems to be put on the "stalled" list 
>> (whatever that may be). 
>>
>>
>>  World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.
>> Hello World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.
>> Hello World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.
>> Hello World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.
>> Hello World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.
>> Hello World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.
>> Hello World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.
>> Hello World!!!!.Hello World!!!!.Hello World!!!!.Hello World!!!!.'
>> I0105 12:30:25.218000000   216 src/core/lib/iomgr/timer_generic.cc:558]   
>> .. shard[13] popped 0
>> I0105 12:30:25.258000000   208 
>> src/core/lib/surface/completion_queue.cc:1074] RETURN_EVENT[00484AA0]: 
>> QUEUE_TIMEOUT
>> I0105 12:30:25.267000000   212 src/core/lib/iomgr/executor.cc:294] 
>> EXECUTOR (default-executor) try to schedule 00499258 (short) to thread 0
>> I0105 12:30:25.267000000   216 src/core/lib/iomgr/timer_generic.cc:613]   
>> .. result --> 1, shard[13]->min_deadline 10496 --> 12695, now=11765
>> I0105 12:30:25.267000000   208 
>> src/core/lib/surface/completion_queue.cc:970] 
>> grpc_completion_queue_next(cq=00484AA0, deadline=gpr_timespec { tv_sec: 
>> 1641385825, tv_nsec: 341971712, clock_type: 1 }, reserved=00000000)
>> I0105 12:30:25.268000000   196 src/core/lib/iomgr/executor.cc:242] 
>> EXECUTOR (default-executor) [0]: execute
>> I0105 12:30:25.268000000   216 src/core/lib/iomgr/timer_generic.cc:517]   
>> .. shard[14]: heap_empty=true
>> I0105 12:30:25.268000000   196 src/core/lib/iomgr/executor.cc:123] 
>> EXECUTOR (default-executor) run 00499258
>> I0105 12:30:25.268000000   216 src/core/lib/iomgr/timer_generic.cc:492]   
>> .. shard[14]->queue_deadline_cap --> 12694
>> I0105 12:30:25.268000000   196 
>> src/core/ext/transport/chttp2/transport/chttp2_transport.cc:800] W:00497270 
>> SERVER [ipv4:10.1.0.1:41200] state WRITING+MORE -> WRITING [continue 
>> writing]
>> I0105 12:30:25.269000000   216 src/core/lib/iomgr/timer_generic.cc:558]   
>> .. shard[14] popped 0
>> I0105 12:30:25.269000000   196 
>> src/core/ext/transport/chttp2/transport/stream_lists.cc:71] 
>> 00497270[5][svr]: pop from writing
>> I0105 12:30:25.269000000   216 src/core/lib/iomgr/timer_generic.cc:613]   
>> .. result --> 1, shard[14]->min_deadline 10496 --> 12695, now=11765
>> I0105 12:30:25.269000000   196 
>> src/core/ext/transport/chttp2/transport/stream_lists.cc:71] 
>> 00497270[5][svr]: pop from writable
>> I0105 12:30:25.269000000   216 src/core/lib/iomgr/timer_generic.cc:517]   
>> .. shard[7]: heap_empty=true
>> I0105 12:30:25.270000000   196 
>> src/core/ext/transport/chttp2/transport/writing.cc:443] W:00497270 
>> SERVER[5] im-(sent,send)=(1,0) announce=0
>> I0105 12:30:25.270000000   216 src/core/lib/iomgr/timer_generic.cc:492]   
>> .. shard[7]->queue_deadline_cap --> 12764
>> D0105 12:30:25.270000000   196 
>> src/core/ext/transport/chttp2/transport/flow_control.cc:117] 
>> 00498C78[5][svr] | s updt sent | trw: 0, tlw: 4183485, taw: 4187469, 
>> srw: 288, slw: 4187521, saw: 4187521
>> I0105 12:30:25.270000000   216 src/core/lib/iomgr/timer_generic.cc:558]   
>> .. shard[7] popped 0
>> D0105 12:30:25.270000000   196 
>> src/core/ext/transport/chttp2/transport/writing.cc:163] 
>> ipv4:10.1.0.1:41200:00497270 stream 5 moved to stalled list by transport. 
>> This is FULLY expected to happen in a healthy program that is not seeing 
>> flow control stalls. However, if you know that there are unwanted stalls, 
>> here is some helpful data: 
>> [fc:pending=656:pending-compressed=0:flowed=1048288:peer_initwin=4187516:t_win=0:s_win=288:s_delta=-1048288]
>> I0105 12:30:25.271000000   216 src/core/lib/iomgr/timer_generic.cc:613]   
>> .. result --> 1, shard[7]->min_deadline 10626 --> 12765, now=11765
>> I0105 12:30:25.271000000   196 
>> src/core/ext/transport/chttp2/transport/stream_lists.cc:125] 
>> 00497270[5][svr]: add to stalled_by_transport
>> I0105 12:30:25.271000000   216 src/core/lib/iomgr/timer_generic.cc:517]   
>> .. shard[9]: heap_empty=true
>> D0105 12:30:25.271000000   196 
>> src/core/ext/transport/chttp2/transport/flow_control.cc:117] 
>> 00498C78[0][svr] | t updt sent | trw: 0, tlw: 4183485, taw: 4187469, 
>> srw: , slw: , saw: 
>> I0105 12:30:25.271000000   216 src/core/lib/iomgr/timer_generic.cc:492]   
>> .. shard[9]->queue_deadline_cap --> 12764
>> I0105 12:30:25.272000000   196 
>> src/core/ext/transport/chttp2/transport/chttp2_transport.cc:800] W:00497270 
>> SERVER [ipv4:10.1.0.1:41200] state WRITING -> IDLE [begin writing 
>> nothing]
>> I0105 12:30:25.272000000   216 src/core/lib/iomgr/timer_generic.cc:558]   
>> .. shard[9] popped 0
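>>
>> To make the "[fc:...]" part of the flowctl trace above easier to read, here 
>> is a small helper I use that just splits its key=value pairs; the field 
>> names come straight from that line. I read t_win=0 as the transport-level 
>> send window being exhausted, which would match the "moved to stalled list 
>> by transport" message, but that interpretation is mine.
>>
>> # Tiny helper to pretty-print the [fc:...] data from the flowctl trace.
>> def parse_fc(line: str) -> dict:
>>     start = line.index("[fc:") + len("[fc:")
>>     end = line.index("]", start)
>>     fields = line[start:end].split(":")
>>     return {k: int(v) for k, _, v in (f.partition("=") for f in fields)}
>>
>> fc = parse_fc("[fc:pending=656:pending-compressed=0:flowed=1048288:"
>>               "peer_initwin=4187516:t_win=0:s_win=288:s_delta=-1048288]")
>> print(fc)   # e.g. {'pending': 656, ..., 't_win': 0, 's_win': 288, ...}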
>>
>>
>> On Wednesday, 5 January 2022 at 11:54:13 UTC+1, John Forensics wrote:
>>
>>> I managed to extract some logging; the Python side seems to send a frame 
>>> that is too large.
>>>
>>> [id: 0x0e0792ca, L:/127.0.0.1:43718 - R:localhost/127.0.0.1:60630] Sent 
>>> GOAWAY: lastStreamId '2147483647', errorCode '6', debugData 'Frame 
>>> length: 2165320 exceeds maximum: 16384'. Forcing shutdown of the 
>>> connection.
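>>>
>>> For what it's worth, 16384 is the default HTTP/2 SETTINGS_MAX_FRAME_SIZE, 
>>> so a compliant sender should never emit a 2165320-byte frame. My guess 
>>> (and it is only a guess) is that the Java side is decoding a frame header 
>>> at the wrong byte offset, i.e. the connection got out of sync. A tiny 
>>> decoder for a raw 9-byte frame header (RFC 7540, section 4.1), with 
>>> made-up example bytes, shows how a bogus length like that can appear:
>>>
>>> # Decode a raw HTTP/2 frame header: 3-byte length, 1-byte type,
>>> # 1-byte flags, 4-byte stream id (high bit reserved).
>>> def decode_frame_header(nine_bytes: bytes):
>>>     length = int.from_bytes(nine_bytes[0:3], "big")
>>>     frame_type = nine_bytes[3]
>>>     flags = nine_bytes[4]
>>>     stream_id = int.from_bytes(nine_bytes[5:9], "big") & 0x7FFFFFFF
>>>     return length, frame_type, flags, stream_id
>>>
>>> # 0x21 0x0A 0x48 is exactly 2165320: a few payload bytes misread as a
>>> # header yield a "length" far above 16384 and trigger FRAME_SIZE_ERROR
>>> # (errorCode 6, as in the GOAWAY above).
>>> print(decode_frame_header(b"\x21\x0a\x48\x00\x00\x00\x00\x00\x05"))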
>>>
>>>
>>> On Tuesday, 4 January 2022 at 16:42:12 UTC+1, John Forensics wrote:
>>>
>>>> I am running identical Python code under either
>>>>
>>>> WINE python.exe or native Linux Python.
>>>>
>>>> The behaviour is slightly different, eventually leading to *losing 
>>>> communication with the WINE Python variant (messages/calls not received 
>>>> at the Java client side)*.
>>>>
>>>> The most visible difference I see is the number of HTTP/2 streams that 
>>>> are created. The native Linux runs stay at a low number, and most 
>>>> communication is done within one stream, while the Windows variant seems 
>>>> to distribute it evenly over more streams. 
>>>>
>>>> I do not understand why the gRPC layer would behave differently. I am 
>>>> currently trying to understand the problem using Wireshark, but that's 
>>>> not the right way to do it.
>>>>
>>>> Any tips on how to debug this would be appreciated. I do not see any 
>>>> exceptions explaining why a message isn't received on the Java side.
>>>>
>>>> If people have some hypotheses for me to test, that would be nice as 
>>>> well.
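>>>>
>>>> In case it is useful to anyone else: a sketch of the gRPC-internal 
>>>> logging I am trying to get, using the C core's own tracers (which tracer 
>>>> names are the most useful ones here is my guess):
>>>>
>>>> # Turn on gRPC's own debug logging; the env vars must be set before
>>>> # the gRPC C core is initialised, so do it before importing grpc.
>>>> import os
>>>>
>>>> os.environ["GRPC_VERBOSITY"] = "DEBUG"
>>>> os.environ["GRPC_TRACE"] = "http,flowctl,bdp_estimator"
>>>>
>>>> import grpc  # imported after setting the env vars on purpose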
>>>>
>>>> Best regards,
>>>>
>>>> Ruud
>>>>
>>>>
>>>>
