Re: Cassandra Inbound Error Message

Bowen Song via user Thu, 29 Aug 2024 03:58:31 -0700

Hi Edi,

The description "most errors" sounds suspicious. If there are some outliers, it may indicate that you are looking at the wrong thing or that there are multiple contributing factors to the result.

In Cassandra, "small message" is an inter-node communication class for messages that are smaller than a configurable threshold (64 KiB by default) and are neither highest priority nor streaming. Many (most?) inter-node communications fall into this class, including reads and writes, authentication, repairs and many more.
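
To make the description above concrete, here is a rough sketch of that classification rule in Python. It is not Cassandra's actual code: the names `ConnectionClass`, `classify` and `SMALL_MESSAGE_THRESHOLD_BYTES` are hypothetical, the existence of a separate "large" class for everything else is an assumption, and so is the exact handling of a message whose size equals the threshold.

```python
from enum import Enum


class ConnectionClass(Enum):
    """Hypothetical labels for the inter-node communication classes."""
    URGENT = "urgent"        # highest-priority messages
    SMALL = "small"          # the class described in this thread
    LARGE = "large"          # assumption: everything at or above the threshold
    STREAMING = "streaming"


# Default threshold mentioned in the thread (64 KiB); configurable in practice.
SMALL_MESSAGE_THRESHOLD_BYTES = 64 * 1024


def classify(size_bytes, highest_priority=False, streaming=False,
             threshold=SMALL_MESSAGE_THRESHOLD_BYTES):
    """Return the class a message would fall into under the stated rule."""
    if streaming:
        return ConnectionClass.STREAMING
    if highest_priority:
        return ConnectionClass.URGENT
    # Boundary handling (strictly smaller than the threshold) is an assumption.
    if size_bytes < threshold:
        return ConnectionClass.SMALL
    return ConnectionClass.LARGE
```

Under these assumptions, a 2 KiB read request that is neither highest priority nor streaming is classified as `ConnectionClass.SMALL`, which is why so much ordinary traffic ends up on this connection.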

If the network for the inter-node connection is congested and unstable, there isn't much you can do about it other than improving the quality and bandwidth of the network and/or reducing congestion. This may mean vertical scaling (e.g. more bandwidth per node) or horizontal scaling (more nodes). But before you do any of that, you should first confirm that the cause of the issue is indeed network congestion, not something else.
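
One possible way to start confirming this is to count the errors per hour on each node and compare the result with the network utilisation graphs for the same period. The Python sketch below does the counting. It assumes each error is a single line in the Cassandra log in the same format as the messages quoted in this thread (the line breaks in the quoted messages come from email wrapping); the function name `errors_per_hour` is made up for illustration.

```python
import re
from collections import Counter

# Matches the start of an error line in the format quoted in this thread, e.g.
# "ERROR [Messaging-EventLoop-3-6] 2024-08-27 01:31:33,741
#  InboundMessageHandler.java:300 - ...-SMALL_MESSAGES-fbe3b1a9 ..."
# (assumed to be a single line in the actual log file).
LINE_RE = re.compile(
    r"^ERROR \[[^\]]+\] (\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2},\d{3} "
    r"InboundMessageHandler\.java:\d+ - \S*-SMALL_MESSAGES-"
)


def errors_per_hour(lines):
    """Count matching error lines, keyed by 'YYYY-MM-DD HH:00'."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            day, hour = match.group(1), match.group(2)
            counts[f"{day} {hour}:00"] += 1
    return counts
```

Passing an open log file to `errors_per_hour` (for example `errors_per_hour(open(path))`, where `path` points at the node's log) gives one counter per node. If the busy hours line up with the periods of heavy network load on most nodes but not on the outliers, that points to more than one contributing factor.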


Cheers,
Bowen


On 29/08/2024 10:47, edi mari wrote:


Thank you for your insights, Bowen.

The occurrence is inconsistent: some nodes report seven errors in ten minutes, while others show just one error every 24 hours at random times. We noticed that most errors tend to occur during periods of heavy network load.

I agree that addressing the root cause is essential, and we are actively working to reduce the network's pressure.
Is there any tuning or configuration in Cassandra that could help prevent these errors?
Where can I find more information about these errors, and under what circumstances do these messages appear?
Additionally, what does the term "SMALL_MESSAGES" mean in the error message?

Edi

On Tue, Aug 27, 2024 at 8:04 PM Bowen Song via user <user@cassandra.apache.org> wrote:


    Hello Edi,

    Before attempting to optimise prematurely, let's try to understand
    the situation a bit better.

    * What's the bandwidth available? (think: total bandwidth and the
    typical usage)
    * What's causing the heavy network load?
    * How much bandwidth is consumed by the heavy network load?
    * How long does it typically last?
    * How frequently does that happen?
    * Is the thing causing the load flexible to run at a slower rate
    or at a different time of the day/week?

    It's usually better to address the problem at source, instead of
    tweaking the victims and hoping that they will better survive it.

    Cheers,
    Bowen


    On 27/08/2024 12:57, edi mari wrote:
    > Hello,
    > Recently, we've noticed errors appearing in the Cassandra logs,
    > which coincide with periods of heavy network load. We investigated
    > and confirmed that the network was under significant stress during
    > these times.
    > Is there any configuration or tuning in Cassandra that could help
    > eliminate these errors?
    > Perhaps increasing the inbound connection timeout might help?
    >
    > Cassandra V4.0.4
    >
    > ERROR [Messaging-EventLoop-3-6] 2024-08-27 01:31:33,741
    > InboundMessageHandler.java:300 -
    > /xx.xx.xx.xx:7000->/xx.xx.xx.xx:7000-SMALL_MESSAGES-fbe3b1a9
    > unexpected exception caught while processing inbound messages;
    > terminating connection
    > ERROR [Messaging-EventLoop-3-4] 2024-08-27 11:10:16,390
    > InboundMessageHandler.java:300 -
    > /xx.xx.xx.xx:7000->/xx.xx.xx.xx:7000-SMALL_MESSAGES-5d216061
    > unexpected exception caught while processing inbound messages;
    > terminating connection
    >
