bschofield opened a new issue #6806:
URL: https://github.com/apache/pulsar/issues/6806


   I am using the Go client, which is a wrapper around the C++ client. I 
encountered a bug where all my consumers on a namespace suddenly died and could 
not be restarted. They appear to be segfaulting whilst trying to deserialize a 
batch of messages from the broker.
   
   The _gdb_ backtrace looks like this:
   
   ```
   Thread 13 "redacted" received signal SIGSEGV, Segmentation fault.
   [Switching to Thread 0x7fffc2f64700 (LWP 6053)]
   pulsar::SharedBuffer::readUnsignedInt (this=0x7fff8c006c08) at /home/ben/src/github.com/bschofield/pulsar/pulsar-client-cpp/lib/SharedBuffer.h:95
   95           uint32_t value = ntohl(*(uint32_t*)data());
   (gdb) bt
   #0  pulsar::SharedBuffer::readUnsignedInt (this=0x7fff8c006c08) at /home/ben/src/github.com/bschofield/pulsar/pulsar-client-cpp/lib/SharedBuffer.h:95
   #1  pulsar::Commands::deSerializeSingleMessageInBatch (batchedMessage=..., batchIndex=batchIndex@entry=1)
       at /home/ben/src/github.com/bschofield/pulsar/pulsar-client-cpp/lib/Commands.cc:651
   #2  0x00007ffff7e3b704 in pulsar::ConsumerImpl::receiveIndividualMessagesFromBatch (this=0x7fff8c033bc0,
       cnx=std::shared_ptr<pulsar::ClientConnection> (use count 4, weak count 5) = {...}, batchedMessage=..., redeliveryCount=0)
       at /home/ben/src/github.com/bschofield/pulsar/pulsar-client-cpp/lib/ConsumerImpl.cc:372
   #3  0x00007ffff7e3bde1 in pulsar::ConsumerImpl::messageReceived (this=this@entry=0x7fff8c033bc0,
       cnx=std::shared_ptr<pulsar::ClientConnection> (use count 4, weak count 5) = {...}, msg=..., isChecksumValid=@0x7fffc2f62cac: true, metadata=..., payload=...)
       at /home/ben/src/github.com/bschofield/pulsar/pulsar-client-cpp/generated/lib/PulsarApi.pb.h:17227
   #4  0x00007ffff7dcf936 in pulsar::ClientConnection::handleIncomingMessage (this=0x7fff8c0c11d0, msg=..., isChecksumValid=<optimised out>, msgMetadata=..., payload=...)
       at /usr/include/c++/9/bits/shared_ptr_base.h:1192
   #5  0x00007ffff7ddd258 in pulsar::ClientConnection::processIncomingBuffer (this=0x7fff8c0c11d0)
       at /home/ben/src/github.com/bschofield/pulsar/pulsar-client-cpp/generated/lib/PulsarApi.pb.h:22533
   #6  0x00007ffff7dddccc in pulsar::ClientConnection::handleRead (this=0x7fff8c0c11d0, err=..., bytesTransferred=<optimised out>, minReadSize=4)
       at /home/ben/src/github.com/bschofield/pulsar/pulsar-client-cpp/lib/ClientConnection.cc:489

   [...other stack frames deleted...]
   ```
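
For reference, the faulting line in frame #0 reads four bytes through a cast pointer. Whether the caller checks the remaining buffer length first is not visible in the backtrace. The sketch below shows what a bounds-checked equivalent of that read could look like, so that a short or corrupt batch yields an error instead of a SIGSEGV. It is not Pulsar code: `readUnsignedIntChecked` and its parameters are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of the Pulsar client.
// Reads a big-endian (network order) uint32 starting at `offset`, and
// refuses to read if fewer than 4 bytes remain in the buffer.
bool readUnsignedIntChecked(const uint8_t* data, std::size_t size,
                            std::size_t offset, uint32_t* out) {
    if (data == nullptr || out == nullptr) {
        return false;
    }
    // Written as a subtraction so the check cannot overflow.
    if (offset > size || size - offset < sizeof(uint32_t)) {
        return false;
    }
    const uint8_t* p = data + offset;
    // Assemble the value byte by byte: same result as ntohl() on the raw
    // bytes, but without an unaligned pointer dereference.
    *out = (static_cast<uint32_t>(p[0]) << 24) |
           (static_cast<uint32_t>(p[1]) << 16) |
           (static_cast<uint32_t>(p[2]) << 8) |
           static_cast<uint32_t>(p[3]);
    return true;
}
```

With a guard like this, a reader positioned too close to the end of the buffer would get `false` back and could drop or report the batch rather than crash.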
   
   The messages are produced by v2.5.0 of the Go client (i.e. wrapped C++), on 
Alpine Linux (musl). The messages are batched into groups of at most 1000, with 
LZ4 compression.
   
   I see this bug in the consumer using both v2.5.0 and the latest master. The 
backtrace is taken from a machine running Ubuntu.
   
   Any ideas what might be causing this?
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

