[ https://issues.apache.org/jira/browse/THRIFT-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17647623#comment-17647623 ]

Benjamin Mahler commented on THRIFT-5673:
-----------------------------------------

It gets called because the upper layer asks to read some data (in this case it wants to read the start of the json encoded thrift message which is a '[' character).

In the case that the zlib layer finds EOF from the underlying transport, it should not be looping infinitely like this burning a cpu core.

Let's compare with the C++ and Java implementations:

C++: the same layer appears to surface up the EOF (0 bytes read) instead of looping:
https://github.com/apache/thrift/blob/v0.17.0/lib/cpp/src/thrift/transport/TZlibTransport.cpp#L178-L182

Java: it appears to throw an IOException that wraps a TTransportException that I think will have an END_OF_FILE type:
https://github.com/apache/thrift/blob/v0.17.0/lib/java/src/main/java/org/apache/thrift/transport/TZlibTransport.java#L99-L118
https://github.com/apache/thrift/blob/v0.17.0/lib/java/src/main/java/org/apache/thrift/transport/TTransportException.java#L33

So it appears that the C++/Java equivalents here surface the unexpected EOF up to the caller rather than loop infinitely, at least based on a read through the code.

> TZlibTransport.py gets into infinite loop if underlying transport ends unexpectedly.
> ------------------------------------------------------------------------------------
>
>                 Key: THRIFT-5673
>                 URL: https://issues.apache.org/jira/browse/THRIFT-5673
>             Project: Thrift
>          Issue Type: Bug
>          Components: Python - Library
>            Reporter: Benjamin Mahler
>            Priority: Major
>
> In python, when using a thrift client composed as follows, the
> TZlibTransport.py logic can enter an infinite loop when the response closes
> prematurely (e.g. disconnection):
> {code:java}
> class GzipTransport(TZlibTransport.TZlibTransport):
>     def _init_zlib(self):
>         """Override of internal method for setting up the zlib compression and
>         decompression objects, to support 'gzip' content encoding.
>         """
>         self._zcomp_read = zlib.decompressobj(wbits=zlib.MAX_WBITS + 16)
>         self._zcomp_write = zlib.compressobj(self.compresslevel, wbits=zlib.MAX_WBITS + 16)
>
> http_client.setCustomHeaders({"Content-Encoding": "gzip", "Accept-Encoding": "gzip"})
> transport = GzipTransport(http_client)
> transport = TTransport.TBufferedTransport(transport, rbuf_size=1024 * 1024)
> thriftClient = ReadOnlyScheduler.Client(protocol)
> httpClient.open()
> try:
>     transport.open()
>     return thriftClient.getFoo(FooQuery(...))
> finally:
>     transport.close()
> {code}
> We observed that when the server shuts down at the same time as the response
> is being read, the program spins on the following stack trace:
> {code:java}
> File: "client.py", line 47, in query_scheduler_api
>   return thrift_client_method(thrift_client, *args)
> File: "gen/apache/aurora/api/ReadOnlyScheduler.py", line 198, in getJobSummary
>   return self.recv_getJobSummary()
> File: "gen/apache/aurora/api/ReadOnlyScheduler.py", line 210, in recv_getJobSummary
>   (fname, mtype, rseqid) = iprot.readMessageBegin()
> File: "thrift/protocol/TJSONProtocol.py", line 417, in readMessageBegin
>   self.readJSONArrayStart()
> File: "thrift/protocol/TJSONProtocol.py", line 405, in readJSONArrayStart
>   self.readJSONSyntaxChar(LBRACKET)
> File: "thrift/protocol/TJSONProtocol.py", line 253, in readJSONSyntaxChar
>   current = self.reader.read()
> File: "thrift-0.13.0-cp37-cp37m-linux_x86_64.whl/thrift/protocol/TJSONProtocol.py", line 164, in read
>   self.data = self.protocol.trans.read(1)
> File: "thrift/transport/TTransport.py", line 164, in read
>   self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
> File: "thrift/transport/TZlibTransport.py", line 191, in read
>   if self.readComp(sz):
> File: "thrift/transport/TZlibTransport.py", line 208, in readComp
>   return False
> {code}
> Essentially, it appears that when the json protocol asks to read 1 byte, and
> there's always 0 bytes returned from the http response (EOF), then the zlib
> transport will spin in a loop here:
> [https://github.com/apache/thrift/blob/v0.17.0/lib/py/src/transport/TZlibTransport.py#L190-L192]
> It seems like this is a bug, and the zlib transport should treat 0 bytes
> coming back from the underlying transport as EOF, at which point it closes
> its compression streams and tells the caller about the remaining data and EOF
> (possibly throwing an exception if the EOF does not cleanly finish the
> decompression stream).
> In our case, the underlying transport is ultimately the HttpClient's
> response, and that doesn't appear to throw an exception when the response is
> cut short, it just appears to return 0 bytes repeatedly.
> We use thrift 0.13.0 but I see the same logic in the latest release and the
> bug appears unfixed there as well.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
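
For illustration only, here is a minimal standalone sketch of the behaviour the issue asks for: a read loop that treats 0 bytes from the underlying source as EOF instead of retrying. It is not the Thrift code and not a proposed patch. `InflateReader` and `UnexpectedEOF` are hypothetical names, and a plain file-like object stands in for the underlying transport.

```python
import io
import zlib


class UnexpectedEOF(Exception):
    """Hypothetical error type used only in this sketch."""


class InflateReader:
    """Hypothetical gzip-decoding reader over a file-like `raw` object."""

    def __init__(self, raw, wbits=zlib.MAX_WBITS + 16):
        self._raw = raw
        self._decomp = zlib.decompressobj(wbits=wbits)
        self._buf = b""

    def read(self, size, chunk=4096):
        # Pull compressed input until enough output is buffered or the
        # compressed stream has reached its proper end.
        while len(self._buf) < size and not self._decomp.eof:
            data = self._raw.read(chunk)
            if not data:
                # 0 bytes from the source means EOF: never loop on it.
                if self._buf:
                    break  # hand back what was already decoded first
                raise UnexpectedEOF("compressed stream ended early")
            self._buf += self._decomp.decompress(data)
        out, self._buf = self._buf[:size], self._buf[size:]
        return out
```

With a complete gzip stream, `read(1)` returns the leading `[` of a JSON-encoded message. With a truncated stream, a read either returns whatever was decoded so far or raises, and the following read raises, so the caller sees the failure rather than spinning.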