Re: Regionserver reports RegionTooBusyException on import

Josh Elser Thu, 06 May 2021 16:39:18 -0700

You were able to work around the durability concerns by skipping the WAL (never 
forget that this means your data in HBase is *not* guaranteed to be there).

We’re already doing this. This is actually not a problem for us, because we 
verify the data after the import (using our own restore-test mapreduce report).

Yes, I was summarizing what you had said to then make sure you understood the implications of what you had done. Good to hear you are verifying this.

Of course, you can also change your application (the Import m/r job) such that 
you can inject sleeps, but I assume you don't want to do that. We don't expose 
an option in that job (to my knowledge) that would inject slowdowns.


That’s funny - I was just talking about this with my colleague more in jest. 
But would it be possible that the MemStore realizes that the incoming write 
rate is higher than the flushing rate and slow down the write requests a little 
bit?
That means putting the „sleep“ into MemStore as a kind of an adaptive 
congestion control: MemStore could measure the incoming rate and the flushing 
rate and add some sleeps on demand...

HBase is essentially doing what you're asking. By throwing the RegionTooBusyException, HBase pushes the client into a retry loop. The client will pause before it retries, increase the amount of time it waits the next time (by some function, I forget exactly what), and then retry the same operation.
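
To make the pause-and-retry behavior described above concrete, here is a minimal sketch of a capped, growing pause with a little jitter. It is an illustration only: the class name, the constants, and the doubling function are assumptions for this example, not the function the HBase client actually uses.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative only: a generic capped backoff, NOT the HBase client's real implementation.
final class RetryBackoff {
    // Hypothetical values chosen for the example; they are not HBase defaults.
    static final long BASE_PAUSE_MS = 100;
    static final long MAX_PAUSE_MS = 10_000;

    /** Pause before retry number `attempt` (0-based): base * 2^attempt, capped at MAX_PAUSE_MS. */
    static long pauseMillis(int attempt) {
        int shift = Math.min(attempt, 20); // bound the shift so the multiplication cannot overflow
        return Math.min(MAX_PAUSE_MS, BASE_PAUSE_MS << shift);
    }

    /** Adds up to 10% random jitter so that many clients do not retry in lockstep. */
    static long withJitter(long pauseMs) {
        return pauseMs + ThreadLocalRandom.current().nextLong(pauseMs / 10 + 1);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 8; attempt++) {
            System.out.println("retry " + attempt + " -> wait about " + pauseMillis(attempt) + " ms");
        }
    }
}
```

The effect is the adaptive slowdown asked about: the longer the region stays too busy, the longer each writer backs off before trying again.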

The problem you're facing is that the default configuration is insufficient for the load and/or hardware that you're throwing at HBase.

The other thing you should be asking yourself is if you have a hotspot in your table design which is causing the load to not be evenly spread across all RegionServers.
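
As an illustration of the hotspot point, one commonly used way to spread monotonically increasing row keys is to prefix ("salt") each key with a bucket derived from a hash of the key. Nothing in this thread says the table in question needs it; the sketch below is an assumption-laden example, and the class name, bucket count, and key format are all hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: a simple salted-key scheme, not part of HBase or of the Import job.
final class SaltedKey {
    // Hypothetical bucket count; in practice it would be chosen relative to the number of regions.
    static final int BUCKETS = 16;

    /** Returns the row key prefixed with a two-digit bucket computed from a hash of the key. */
    static String salt(String rowKey) {
        int h = 0;
        for (byte b : rowKey.getBytes(StandardCharsets.UTF_8)) {
            h = 31 * h + (b & 0xff); // simple deterministic hash over the key bytes
        }
        int bucket = Math.floorMod(h, BUCKETS); // floorMod keeps the bucket non-negative
        return String.format("%02d-%s", bucket, rowKey);
    }

    public static void main(String[] args) {
        for (String key : new String[] {"20210506-0001", "20210506-0002", "20210506-0003"}) {
            System.out.println(key + " -> " + salt(key));
        }
    }
}
```

The trade-off is on the read side: a point lookup has to recompute the prefix, and a range scan has to be issued once per bucket.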
