[ https://issues.apache.org/jira/browse/KNOX-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Larry McCay updated KNOX-755:
-----------------------------
    Fix Version/s:     (was: 0.14.0)
                   0.15.0

> retry logic for replayBuffer limit errors is incorrect.
> -------------------------------------------------------
>
>                 Key: KNOX-755
>                 URL: https://issues.apache.org/jira/browse/KNOX-755
>             Project: Apache Knox
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>             Fix For: 0.15.0
>
>
> Hive receives corrupted thrift requests when using Knox with Hive with a
> large query and insufficient replayBuffer:
> {noformat}
> org.apache.thrift.transport.TTransportException
>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:354)
>         at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:347)
>         at org.apache.hive.service.cli.thrift.TExecuteStatementReq$TExecuteStatementReqStandardScheme.read(TExecuteStatementReq.java:618)
> ...
> {noformat}
> It seems that the retry logic for this error is incorrect, as follows (names
> changed to generic):
> {noformat}
> 2016-10-05 15:25:51,104 DEBUG http.wire (Wire.java:wire(63)) - >> "[0x80][0x1][0x0][0x1][0x0][0x0][0x0][0x10]ExecuteStatement[0x0][0x0][0x0]...![0x88]SELECT 1 AS `number_of_records`,[\n]"
> ...
> 2016-10-05 15:25:51,117 DEBUG http.wire (Wire.java:wire(77)) - >> " `tablename`.`columnn"
> 2016-10-05 15:25:51,118 DEBUG http.wire (Wire.java:wire(63)) - >> "[\r][\n]"
> ...
> 2016-10-05 15:25:51,119 INFO  client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(726)) - I/O exception (java.io.IOException) caught when processing request: Hit replay buffer max limit
> 2016-10-05 15:25:51,120 DEBUG client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(731)) - Hit replay buffer max limit
> java.io.IOException: Hit replay buffer max limit
>         at org.apache.hadoop.gateway.dispatch.CappedBufferHttpEntity$ReplayStream.read(CappedBufferHttpEntity.java:143)
>         at java.io.InputStream.read(InputStream.java:101)
>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1792)
>         at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)
>         at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744)
>         at org.apache.hadoop.gateway.dispatch.CappedBufferHttpEntity.writeTo(CappedBufferHttpEntity.java:93)
>         at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
> {noformat}
> However, then it retries:
> {noformat}
> 2016-10-05 15:25:51,121 INFO  client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(733)) - Retrying request
> 2016-10-05 15:25:51,121 DEBUG client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(703)) - Reopening the direct connection.
> {noformat}
> After auth (for which the same incorrect request as below is sent, but not
> parsed due to 401), it sends the thing again with correct auth header, as
> follows:
> {noformat}
> 2016-10-05 15:25:51,166 DEBUG client.DefaultHttpClient (DefaultRequestDirector.java:tryExecute(713)) - Attempt 3 to execute request
> 2016-10-05 15:25:51,166 DEBUG conn.DefaultClientConnection (DefaultClientConnection.java:sendRequestHeader(269)) - Sending request: POST /cliservice?doAs=... HTTP/1.1
> 2016-10-05 15:25:51,167 DEBUG http.wire (Wire.java:wire(63)) - >> "POST /cliservice?doAs=... HTTP/1.1[\r][\n]"
> ...
> 2016-10-05 15:25:51,169 DEBUG http.wire (Wire.java:wire(63)) - >> "Authorization: Negotiate ...
> 2016-10-05 15:25:51,170 DEBUG http.wire (Wire.java:wire(63)) - >> "[\r][\n]"
> ...
> 2016-10-05 15:25:51,172 DEBUG http.wire (Wire.java:wire(63)) - >> "1000[\r][\n]"
> 2016-10-05 15:25:51,173 DEBUG http.wire (Wire.java:wire(63)) - >> "[0x80][0x1][0x0][0x1][0x0][0x0][0x0][0x10]ExecuteStatement[0x0] ... ![0x88]SELECT 1 AS `number_of_records`,[\n]"
> ...
> 2016-10-05 15:25:51,186 DEBUG http.wire (Wire.java:wire(77)) - >> " `tablename`.`columnn"
> 2016-10-05 15:25:51,187 DEBUG http.wire (Wire.java:wire(63)) - >> "[\r][\n]"
> 2016-10-05 15:25:51,187 DEBUG http.wire (Wire.java:wire(63)) - >> "1f3[\r][\n]"
> 2016-10-05 15:25:51,187 DEBUG http.wire (Wire.java:wire(63)) - >> "ther` AS `anothercolumnnameother`,[\n]"
> ... rest of the query
> {noformat}
> Note that there's a gap at "columnn", where "columnname" should be.
> This results in the above error when reading the request, and error 500 on
> gateway side.
> I think the retry logic should be fixed to send the correct buffer, or
> removed for this type of error.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
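
For illustration only, here is a minimal sketch of the second option mentioned in the report (no retry once the replay buffer has been exceeded). It is not Knox's code and does not use the HttpClient API: `CappedReplaySketch`, `CappedReplayBody`, `isRepeatable`, and `sendWithRetry` are hypothetical names, and the policy of continuing to stream past the cap on the first attempt is an assumption made to keep the example small. The idea is that a body is only safe to resend while every byte already consumed from the source is still held in the buffer; otherwise a retry would send the buffered prefix followed by whatever is left in the stream, which would produce a gap like the one at "columnn" above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class CappedReplaySketch {

    /** Hypothetical request body that keeps at most `cap` bytes for replay. */
    static final class CappedReplayBody {
        private final InputStream source;
        private final byte[] buffer;
        private int buffered = 0;
        private boolean overflowed = false;

        CappedReplayBody(InputStream source, int cap) {
            this.source = source;
            this.buffer = new byte[cap];
        }

        /** True only while every byte read from the source is still buffered. */
        boolean isRepeatable() {
            return !overflowed;
        }

        void writeTo(OutputStream out) throws IOException {
            if (overflowed) {
                // Bytes past the cap were consumed and not kept; resending would leave a gap.
                throw new IOException("Body exceeded replay buffer; cannot be resent");
            }
            out.write(buffer, 0, buffered);      // replay what earlier attempts consumed
            int b;
            while ((b = source.read()) != -1) {  // then continue from the live stream
                if (buffered < buffer.length) {
                    buffer[buffered++] = (byte) b;
                } else {
                    overflowed = true;           // from here on, no retry is possible
                }
                out.write(b);
            }
        }
    }

    /** Retries a send, but refuses to retry once the body is no longer repeatable. */
    static byte[] sendWithRetry(CappedReplayBody body, int maxAttempts,
                                boolean failFirstAttempt) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (attempt > 1 && !body.isRepeatable()) {
                throw new IOException("Not retrying: request body is not repeatable", last);
            }
            ByteArrayOutputStream wire = new ByteArrayOutputStream();
            try {
                body.writeTo(wire);
                if (attempt == 1 && failFirstAttempt) {
                    throw new IOException("simulated I/O failure after sending");
                }
                return wire.toByteArray();
            } catch (IOException e) {
                last = e;
            }
        }
        throw new IOException("All attempts failed", last);
    }

    public static void main(String[] args) throws IOException {
        byte[] query = "SELECT 1 AS `number_of_records`".getBytes(StandardCharsets.UTF_8);

        // Buffer large enough: the retry replays the complete body.
        CappedReplayBody fits = new CappedReplayBody(new ByteArrayInputStream(query), 1024);
        System.out.println(new String(sendWithRetry(fits, 3, true), StandardCharsets.UTF_8));

        // Buffer too small: the retry is refused instead of sending a corrupted body.
        CappedReplayBody tooBig = new CappedReplayBody(new ByteArrayInputStream(query), 8);
        try {
            sendWithRetry(tooBig, 3, true);
        } catch (IOException e) {
            System.out.println("Refused: " + e.getMessage());
        }
    }
}
```

The other option from the report, resending the correct buffer, would require keeping the whole body, which is exactly what the cap is meant to avoid; that trade-off is why this sketch gives up rather than replaying.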