Github user sijie commented on the issue: https://github.com/apache/bookkeeper/issues/198 @eolivelli @kishorekasi @merlimat and me were looking together at this issue. @merlimat found the problem on how do we handling the invalid op code in v2 protocol. in TestBackwardCompat#testCompact400, the test is to verify the current client can't talk to a 4.0.0 server. The 4.5.0 client is sending a protobuf encoded request to 4.0.0 server. The 4.0.0 server will interpret the 4.5.0 protobuf encoded request, but it will realize this is bad request and sending v2 protocol encoded response. because the request is a bad request, 4.0.0 server sent a response back with unknown op code. In current v2 ResponseEnDecoder (listed as below), when it doesn't know the op code, it will return the buffer to the channel. this might cause the misbehavior in the channel pipeline to decrement reference count. ` @Override public Object decode(ByteBuf buffer) throws Exception { int rc; long ledgerId, entryId; int packetHeader = buffer.readInt(); byte version = PacketHeader.getVersion(packetHeader); byte opCode = PacketHeader.getOpCode(packetHeader); switch (opCode) { case BookieProtocol.ADDENTRY: rc = buffer.readInt(); ledgerId = buffer.readLong(); entryId = buffer.readLong(); return new BookieProtocol.AddResponse(version, rc, ledgerId, entryId); case BookieProtocol.READENTRY: rc = buffer.readInt(); ledgerId = buffer.readLong(); entryId = buffer.readLong(); if (rc == BookieProtocol.EOK) { ByteBuf content = buffer.slice(); return new BookieProtocol.ReadResponse(version, rc, ledgerId, entryId, content.retain()); } else { return new BookieProtocol.ReadResponse(version, rc, ledgerId, entryId); } case BookieProtocol.AUTH: ByteBufInputStream bufStream = new ByteBufInputStream(buffer); BookkeeperProtocol.AuthMessage.Builder builder = BookkeeperProtocol.AuthMessage.newBuilder(); builder.mergeFrom(bufStream, extensionRegistry); BookkeeperProtocol.AuthMessage am = builder.build(); return new BookieProtocol.AuthResponse(version, am); default: return buffer; } } ` One suggested fix from @merlimat is to throw exception in the EnDeCoder when receiving unknown op code. so the netty can close the connection, error out the pending requests and cleaning up the resources.
--- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---