It sounds like you have a hot spot on those two datanode hosts, either
because all (or most) of the blocks being written are located there, or
because there is some kind of issue with the hosts themselves. Stopping the
DN processes on those two hosts should confirm this, unless the hot spot
simply moves to other hosts. Do you have the HDFS rack awareness script set
up appropriately to distribute the blocks for files across the hosts?
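
To check whether the slow syncs really concentrate on two datanodes, you
could tally the addresses that appear in the "Slow sync cost" lines of the
tserver logs on each cluster. Below is a rough Python sketch. It assumes
those log lines list the write pipeline as IP:port pairs, which you should
verify against your own logs; the function name and the log paths are
placeholders I made up.

```python
import re
import sys
from collections import Counter

# Assumption: slow-sync lines mention datanodes as IPv4:port pairs.
ADDR = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):\d+")


def count_slow_sync_hosts(lines):
    """Tally datanode addresses seen on lines containing 'Slow sync cost'."""
    counts = Counter()
    for line in lines:
        if "Slow sync cost" not in line:
            continue
        # Count each address at most once per log line.
        for host in set(ADDR.findall(line)):
            counts[host] += 1
    return counts


if __name__ == "__main__":
    totals = Counter()
    # Each argument is a tserver log file (hypothetical paths).
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            totals.update(count_slow_sync_hosts(f))
    for host, n in totals.most_common():
        print(f"{host}\t{n}")
```

Run it against the tserver logs from both clusters and compare the
distributions. If two addresses dominate on the slow cluster,
`hdfs dfsadmin -printTopology` will show which rack each datanode is mapped
to.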

On Wed, Mar 15, 2023 at 5:52 PM Vincent Russell <vincent.russ...@gmail.com>
wrote:

> Hello,
>
> I am using accumulo 2.0.1 with hadoop 3.3.1.
>
> I have two identical clusters with 28 tservers.
>
> I have writers on both clusters which are set with 10 batch writers with a
> max memory of 50m.
>
> However, one server is ingesting 10x faster than the other.
>
> Is there anything I should check for?
>
> I don't see any errors, but one thing that I noticed is that the slow site
> has a lot of "Slow sync cost" info log messages from the tservers.
>
> I see these messages on the fast cluster as well, but they are far less.
> It also appears that on the slow cluster these messages are occurring on
> only two of the nodes in the cluster, where these messages appear to be
> more spread out on the fast cluster.
>
> Thank you in advance for your help,
> Vincent
>
