Hi,

We are running a medium-sized HBase cluster (12 data nodes) holding around 200 TB of data (excluding replication). When a node fails, full recovery takes on the order of 30 minutes, and we are looking for ways to reduce this.

Almost two years ago we came across the hbase.wal.split.to.hfile setting, but we didn't dare turn it on because of data-loss concerns based on some JIRA tickets in this area. Can anyone comment on its current status? Is it safe to use?

Best regards,
Frens Jan
Frens Jan Rumph
Data platform engineering lead
phone: +31 50 21 11 622
site: web-iq.com <https://web-iq.com/>
pgp: CEE2 A4F1 972E 78C0 F816 86BB D096 18E2 3AC0 16E0
