> HBase is essentially doing what you're asking. By throwing the
> RegionTooBusyException, the client is pushed into a retry loop. The client
> will pause before it retries, increase the amount of time it waits the next
> time (by some function, I forget exactly what), and then retry the same
> operation.
Just for your information - I think I found it:
org.apache.hadoop.hbase.client.AsyncRequestFutureImpl:
  /**
   * Log as much info as possible, and, if there is something to replay,
   * submit it again after a back off sleep.
   */
  private void resubmit(ServerName oldServer, List<Action> toReplay,
      int numAttempt, int failureCount, Throwable throwable) {
…
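
To make the behaviour concrete for myself, here is a minimal sketch of such a retry loop with a growing back-off sleep. It is not the HBase implementation: the class name, the method names and the multiplier table are made up for illustration, and the simulated failure only stands in for a RegionTooBusyException. As far as I know, the real client takes its base pause and retry limit from hbase.client.pause and hbase.client.retries.number.

```java
import java.util.concurrent.Callable;

public class BackoffSketch {
    // Illustrative multipliers, chosen for this sketch (an assumption,
    // not copied from the HBase source).
    static final int[] MULTIPLIERS = {1, 2, 3, 5, 10, 20, 40, 100};

    // Pause before the next retry: base pause times a multiplier that
    // grows with the attempt number and is capped at the last entry.
    static long pauseMillis(long basePauseMillis, int attempt) {
        int idx = Math.min(Math.max(attempt, 0), MULTIPLIERS.length - 1);
        return basePauseMillis * MULTIPLIERS[idx];
    }

    // Runs op until it succeeds or maxAttempts is reached, sleeping
    // between failed attempts. Rethrows the last failure.
    static <T> T callWithRetries(Callable<T> op, int maxAttempts,
                                 long basePauseMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt + 1 < maxAttempts) {
                    Thread.sleep(pauseMillis(basePauseMillis, attempt));
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Hypothetical operation: fails twice, then succeeds.
        String result = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("region too busy (simulated)");
            }
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " calls");
        // prints "ok after 3 calls"
    }
}
```

With a base pause of 100 ms this table gives waits of 100, 200, 300, 500 ms and so on, so a client that keeps hitting a busy region slows itself down more and more, which is roughly the congestion control I was asking about, just on the client side.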
Udo Offermann
ZFabrik Software GmbH & Co. KG
Lammstrasse 2, 69190 Walldorf
tel: +49 6227 3984 255
fax: +49 6227 3984 254
email: [email protected]
www: z2-environment.net
> On 07.05.2021 at 01:39, Josh Elser <[email protected]> wrote:
>
>>> You were able to work around the durability concerns by skipping the WAL
>>> (never forget that this means your data in HBase is *not* guaranteed to be
>>> there).
>> We’re already doing this. This is actually not a problem for us, because we
>> verify the data after the import (using our own restore-test mapreduce
>> report).
>
> Yes, I was summarizing what you had said to then make sure you understood the
> implications of what you had done. Good to hear you are verifying this.
>
>>> Of course, you can also change your application (the Import m/r job) such
>>> that you can inject sleeps, but I assume you don't want to do that. We
>>> don't expose an option in that job (to my knowledge) that would inject
>>> slowdowns.
>> That’s funny - I was just talking about this with my colleague more in jest.
>> But would it be possible for the MemStore to notice that the incoming write
>> rate is higher than the flushing rate and slow down the write requests a
>> little bit?
>> That means putting the „sleep“ into the MemStore as a kind of adaptive
>> congestion control: the MemStore could measure the incoming rate and the
>> flushing rate and add some sleeps on demand...
>
> HBase is essentially doing what you're asking. By throwing the
> RegionTooBusyException, the client is pushed into a retry loop. The client
> will pause before it retries, increase the amount of time it waits the next
> time (by some function, I forget exactly what), and then retry the same
> operation.
>
> The problem you're facing is that the default configuration is insufficient
> for the load and/or hardware that you're throwing at HBase.
>
> The other thing you should be asking yourself is if you have a hotspot in
> your table design which is causing the load to not be evenly spread across
> all RegionServers.