Yes. We have HDFS rack awareness set up to divide the blocks equally. And according to the NameNode web page, it doesn't look like those nodes have a much higher number of blocks than the other nodes.
I can try temporarily shutting down one of the data nodes to see what that does. We did already lose a node on the cluster a few days ago. I'm currently waiting for the system administrators to replace a disk.

Thanks,

On Wed, Mar 15, 2023 at 5:59 PM Dave Marion <dmario...@gmail.com> wrote:

> sounds like you have a hot-spot on those two datanode hosts. Either because
> the blocks that it's writing to are all (or a majority) located there, or
> there is some type of issue with the host. Stopping the DN processes on
> those two hosts should confirm this, unless the hot spot moves. Do you have
> the HDFS rack script set up appropriately to distribute the blocks for
> files across the hosts?
>
> On Wed, Mar 15, 2023 at 5:52 PM Vincent Russell <vincent.russ...@gmail.com>
> wrote:
>
> > Hello,
> >
> > I am using accumulo 2.0.1 with hadoop 3.3.1.
> >
> > I have two identical clusters with 28 tservers.
> >
> > I have writers on both clusters which are set with 10 batch writers with a
> > max memory of 50m.
> >
> > However, one server is ingesting 10x faster than the other.
> >
> > Is there anything I should check for?
> >
> > I don't see any errors, but one thing that I noticed is that the slow site
> > has a lot of "Slow sync cost" info log messages from the tservers.
> >
> > I see these messages on the fast cluster as well, but they are far less.
> > It also appears that on the slow cluster these messages are occurring on
> > only two of the nodes in the cluster, where these messages appear to be
> > more spread out on the fast cluster.
> >
> > Thank you in advance for your help,
> > Vincent
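
One way to put numbers on the "only two of the nodes" observation is to count the "Slow sync cost" lines per tserver host and see what share of them the busiest hosts account for. The following is only a rough sketch: the host names and sample lines are hypothetical placeholders, the function names are made up for illustration, and it assumes nothing about the tserver log format beyond the line containing the text "Slow sync cost".

```python
from collections import Counter


def count_slow_syncs(lines_by_host, needle="Slow sync cost"):
    """Count, for each host, the log lines that contain `needle`.

    `lines_by_host` maps a host name to an iterable of log lines,
    for example an open file handle on that host's tserver log.
    """
    counts = Counter()
    for host, lines in lines_by_host.items():
        counts[host] = sum(1 for line in lines if needle in line)
    return counts


def share_of_top(counts, n=2):
    """Fraction of all matching lines that come from the n busiest hosts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total


if __name__ == "__main__":
    # Hypothetical hosts and placeholder lines, not real tserver output.
    sample = {
        "tserver-01": ["... Slow sync cost ...", "... Slow sync cost ..."],
        "tserver-02": ["... Slow sync cost ..."],
        "tserver-03": ["... some other message ..."],
    }
    counts = count_slow_syncs(sample)
    for host, count in counts.most_common():
        print(host, count)
    print("share from top 2 hosts:", share_of_top(counts, n=2))
```

A share close to 1.0 for the top two hosts on the slow cluster, against a much lower share on the fast cluster, would be consistent with the hot-spot explanation; it would not by itself say whether the cause is block placement or a problem with those hosts.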