I stopped the two data nodes and it had no effect.

Thanks,

On Wed, Mar 15, 2023 at 6:53 PM Vincent Russell <vincent.russ...@gmail.com> wrote:

> Yes. We have the hdfs rack-aware set up to divide the blocks equally.
> And according to the name node http page it doesn't look like those nodes
> have a much higher number of blocks than other nodes.
>
> I can try temporarily shutting down one of the data nodes to see what that
> does.
>
> We did already lose a node on the cluster a few days ago. I'm currently
> waiting for the system administrators to replace a disk.
>
> Thanks,
>
> On Wed, Mar 15, 2023 at 5:59 PM Dave Marion <dmario...@gmail.com> wrote:
>
>> sounds like you have a hot-spot on those two datanode hosts. Either because
>> the blocks that it's writing to are all (or a majority) located there, or
>> there is some type of issue with the host. Stopping the DN processes on
>> those two hosts should confirm this, unless the hot spot moves. Do you have
>> the HDFS rack script set up appropriately to distribute the blocks for
>> files across the hosts?
>>
>> On Wed, Mar 15, 2023 at 5:52 PM Vincent Russell <vincent.russ...@gmail.com>
>> wrote:
>>
>> > Hello,
>> >
>> > I am using accumulo 2.0.1 with hadoop 3.3.1.
>> >
>> > I have two identical clusters with 28 tservers.
>> >
>> > I have writers on both clusters which are set with 10 batch writers with a
>> > max memory of 50m.
>> >
>> > However, one server is ingesting 10x faster than the other.
>> >
>> > Is there anything I should check for?
>> >
>> > I don't see any errors, but one thing that I noticed is that the slow site
>> > has a lot of "Slow sync cost" info log messages from the tservers.
>> >
>> > I see these messages on the fast cluster as well, but they are far less.
>> > It also appears that on the slow cluster these messages are occurring on
>> > only two of the nodes in the cluster, where these messages appear to be
>> > more spread out on the fast cluster.
>> >
>> > Thank you in advance for your help,
>> > Vincent
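
The per-node comparison of "Slow sync cost" messages described in the original message could be tallied with something like the sketch below. It is only an illustration, not something taken from the thread: the function name tally_slow_syncs, the host labels, and the assumed message shape ("Slow sync cost: <N> ms") are all assumptions, so the pattern may need adjusting to match the actual tserver log lines.

```python
import re

# Assumption: each message contains "Slow sync cost: <N> ms".
# Adjust the pattern if the real tserver log lines differ.
SLOW_SYNC = re.compile(r"Slow sync cost: (\d+) ms")


def tally_slow_syncs(lines_by_host):
    """Return {host: (message_count, total_ms)} for each host's log lines.

    lines_by_host maps a host label (hypothetical, chosen by the caller)
    to an iterable of log lines from that host's tserver log.
    """
    stats = {}
    for host, lines in lines_by_host.items():
        count = 0
        total_ms = 0
        for line in lines:
            match = SLOW_SYNC.search(line)
            if match:
                count += 1
                total_ms += int(match.group(1))
        # Hosts with no matching lines are kept, with zeros, so that
        # quiet nodes still show up in the comparison.
        stats[host] = (count, total_ms)
    return stats
```

Each host's log can be passed in as an open file object (iterating a file yields its lines), and sorting the result by count should show whether the messages cluster on a couple of nodes or are spread across the cluster.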