[
https://issues.apache.org/jira/browse/THRIFT-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17647623#comment-17647623
]
Benjamin Mahler commented on THRIFT-5673:
-----------------------------------------
It gets called because the upper layer asks to read some data (in this case,
the start of the JSON-encoded Thrift message, which is a '[' character). When
the zlib layer sees EOF from the underlying transport, it should not loop
forever like this, burning a CPU core.
Let's compare with the C++ and Java implementations:
C++: the same layer appears to surface the EOF (0 bytes read) instead of
looping:
https://github.com/apache/thrift/blob/v0.17.0/lib/cpp/src/thrift/transport/TZlibTransport.cpp#L178-L182
Java: it appears to throw an IOException that wraps a TTransportException that
I think will have an END_OF_FILE type:
https://github.com/apache/thrift/blob/v0.17.0/lib/java/src/main/java/org/apache/thrift/transport/TZlibTransport.java#L99-L118
https://github.com/apache/thrift/blob/v0.17.0/lib/java/src/main/java/org/apache/thrift/transport/TTransportException.java#L33
So it appears that the C++/Java equivalents here surface the unexpected EOF up
to the caller rather than loop infinitely, at least based on a read through the
code.
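To make the contrast concrete, here is a minimal sketch of a read loop that
surfaces EOF instead of retrying. It is not the Thrift code: TruncatedSource
and read_decompressed are hypothetical names, the source object merely stands
in for an underlying transport that returns 0 bytes once the peer is gone, and
raising EOFError is an assumption about how the condition could be reported.

```python
import io
import zlib


class TruncatedSource:
    """Hypothetical stand-in for an underlying transport whose peer went away."""

    def __init__(self, payload: bytes):
        self._buf = io.BytesIO(payload)

    def read(self, sz: int) -> bytes:
        # Returns b"" on every call once the payload is exhausted,
        # much like a closed HTTP response.
        return self._buf.read(sz)


def read_decompressed(source, decomp, sz: int) -> bytes:
    """Return at least one decompressed byte, or raise EOFError at EOF."""
    while True:
        chunk = source.read(sz)
        if not chunk:
            # 0 bytes from the layer below means EOF: report it to the
            # caller rather than asking again forever.
            raise EOFError("underlying transport ended before more data arrived")
        out = decomp.decompress(chunk)
        if out:
            return out
        # Otherwise the chunk held only header bytes; read the next one.
```

With a guard like this, a caller that asks for the leading '[' of a message
gets an exception on a dropped connection instead of a spinning thread.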
> TZlibTransport.py gets into infinite loop if underlying transport ends
> unexpectedly.
> ------------------------------------------------------------------------------------
>
> Key: THRIFT-5673
> URL: https://issues.apache.org/jira/browse/THRIFT-5673
> Project: Thrift
> Issue Type: Bug
> Components: Python - Library
> Reporter: Benjamin Mahler
> Priority: Major
>
> In Python, when using a Thrift client composed as follows, the
> TZlibTransport.py logic can enter an infinite loop when the response ends
> prematurely (e.g. on disconnection):
> {code:python}
> import zlib
>
> from thrift.transport import TTransport, TZlibTransport
>
> class GzipTransport(TZlibTransport.TZlibTransport):
>     def _init_zlib(self):
>         """Override of internal method for setting up the zlib compression and
>         decompression objects, to support 'gzip' content encoding.
>         """
>         self._zcomp_read = zlib.decompressobj(wbits=zlib.MAX_WBITS + 16)
>         self._zcomp_write = zlib.compressobj(self.compresslevel,
>                                              wbits=zlib.MAX_WBITS + 16)
>
> # http_client (the HTTP transport) and protocol (TJSONProtocol, per the
> # stack trace) are created elsewhere; their setup is not shown here.
> http_client.setCustomHeaders({"Content-Encoding": "gzip",
>                               "Accept-Encoding": "gzip"})
> transport = GzipTransport(http_client)
> transport = TTransport.TBufferedTransport(transport, rbuf_size=1024 * 1024)
> thriftClient = ReadOnlyScheduler.Client(protocol)
> http_client.open()
> try:
>     transport.open()
>     return thriftClient.getFoo(FooQuery(...))
> finally:
>     transport.close()
> {code}
> We observed that when the server shuts down at the same time as the response
> is being read, the program spins on the following stack trace:
> {code:java}
> File: "client.py", line 47, in query_scheduler_api
>   return thrift_client_method(thrift_client, *args)
> File: "gen/apache/aurora/api/ReadOnlyScheduler.py", line 198, in getJobSummary
>   return self.recv_getJobSummary()
> File: "gen/apache/aurora/api/ReadOnlyScheduler.py", line 210, in recv_getJobSummary
>   (fname, mtype, rseqid) = iprot.readMessageBegin()
> File: "thrift/protocol/TJSONProtocol.py", line 417, in readMessageBegin
>   self.readJSONArrayStart()
> File: "thrift/protocol/TJSONProtocol.py", line 405, in readJSONArrayStart
>   self.readJSONSyntaxChar(LBRACKET)
> File: "thrift/protocol/TJSONProtocol.py", line 253, in readJSONSyntaxChar
>   current = self.reader.read()
> File: "thrift-0.13.0-cp37-cp37m-linux_x86_64.whl/thrift/protocol/TJSONProtocol.py", line 164, in read
>   self.data = self.protocol.trans.read(1)
> File: "thrift/transport/TTransport.py", line 164, in read
>   self.__rbuf = BufferIO(self.__trans.read(max(sz, self.__rbuf_size)))
> File: "thrift/transport/TZlibTransport.py", line 191, in read
>   if self.readComp(sz):
> File: "thrift/transport/TZlibTransport.py", line 208, in readComp
>   return False
> {code}
> Essentially, it appears that when the JSON protocol asks to read 1 byte and
> the HTTP response always returns 0 bytes (EOF), the zlib transport spins in
> a loop here:
> [https://github.com/apache/thrift/blob/v0.17.0/lib/py/src/transport/TZlibTransport.py#L190-L192]
> It seems like this is a bug, and the zlib transport should treat 0 bytes
> coming back from the underlying transport as EOF, at which point it closes
> its compression streams and tells the caller about the remaining data and EOF
> (possibly throwing an exception if the EOF does not cleanly finish the
> decompression stream).
> In our case, the underlying transport is ultimately the HttpClient's
> response, which doesn't appear to throw an exception when the response is
> cut short; it just appears to return 0 bytes repeatedly.
> We use thrift 0.13.0 but I see the same logic in the latest release and the
> bug appears unfixed there as well.
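The behaviour proposed in the description (on EOF, hand back the remaining
data and raise if the compressed stream did not finish cleanly) could look
roughly like the sketch below. It uses only the standard zlib module;
finish_stream and GZIP_WBITS are hypothetical names, and EOFError is an
assumed choice of exception rather than what Thrift would necessarily raise.

```python
import zlib

# Same wbits value as the GzipTransport in the report (gzip container).
GZIP_WBITS = zlib.MAX_WBITS + 16


def finish_stream(decomp) -> bytes:
    """Call once the underlying transport has reported EOF (a 0-byte read).

    Returns any decompressed bytes still held by the decompressor, or raises
    EOFError if the end-of-stream marker of the compressed data was never seen.
    """
    remaining = decomp.flush()
    if not decomp.eof:
        # The peer closed the connection part-way through the gzip stream.
        raise EOFError("compressed stream was truncated")
    return remaining
```

The eof attribute of the decompression object only becomes true after the
stream's trailer has been consumed, so it distinguishes a clean end of the
response from one that was cut short.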