[DISCUSS] Disable HDFS readahead for PREAD reads
Hello team,

I recently discovered "hbase.store.reader.no-readahead", which defaults to false (so readahead is enabled). This only applies to PREAD reads, not to STREAM reads, which always use readahead.

When readahead is enabled, the default readahead amount in the DFSClient is 4 MB. In my opinion this is far too large for HBase's use case. Reads in HBase are always one block at a time, and blocks typically contain more than one row, so we are already reading ahead a bit via block reads. Lastly, readahead is mainly useful for sequential reads. It's unlikely that someone would do sequential IO via PREAD; they would use Scans instead (thus STREAM). Even where someone is doing sequential IO via PREAD, they'd get some natural readahead because we read a block at a time.

I disabled readahead on about 50 servers across various clusters in our production environment, and saw a massive (10x or more) drop in disk IO for random-read and mixed-read workloads. Scan workloads were mostly unaffected because they do not use this setting. I also ran a targeted load test of a cluster, with and without readahead, and got double the random read throughput with it disabled.

I'd like to update the default for this config to "true", thus disabling readahead for PREAD by default. I also think it's worth investigating making readahead configurable for STREAM reads, perhaps based on the scan's max result size or the blockBytesScanned of the last next() call.

Any objections to changing the default?

https://issues.apache.org/jira/browse/HBASE-27896
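For anyone who wants to try the proposed behavior before any default changes, here is a minimal sketch of the override. The property name and the value "true" come from the message above; placing it in hbase-site.xml on the region servers, and needing a region server restart for it to take effect, are assumptions on my part rather than something stated in this thread.

```xml
<!-- hbase-site.xml (assumed location: region server configuration) -->
<!-- Sketch only: disables HDFS readahead for PREAD reads, as proposed above. -->
<property>
  <name>hbase.store.reader.no-readahead</name>
  <!-- "true" means no readahead; the current default is "false" -->
  <value>true</value>
</property>
```

Per the message above, STREAM reads (Scans) always use readahead, so this setting should not change scan behavior.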
[jira] [Created] (HBASE-27898) org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 300000 ms for txid=895, WAL system stuck?
Jepson created HBASE-27898:
--

Summary: org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 300000 ms for txid=895, WAL system stuck?
Key: HBASE-27898
URL: https://issues.apache.org/jira/browse/HBASE-27898
Project: HBase
Issue Type: Bug
Components: wal
Affects Versions: 2.2.2
Reporter: Jepson

2023-04-15 06:41:23,278 ERROR [regionserver/bdpprd03:16020-longCompactions-0] regionserver.CompactSplit: Compaction failed region=OLAP:XS_USER_BEHAVIOR_RISK,325001600010,1681250348046.3823c66e8cb8bca3bd9eb1757d3c2a1b., storeName=info, priority=9, startTime=1681511184093
org.apache.hadoop.hbase.exceptions.TimeoutIOException: Failed to get sync result after 300000 ms for txid=895, WAL system stuck?
    at org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:145)
    at org.apache.hadoop.hbase.regionserver.wal.AbstractFSWAL.blockOnSync(AbstractFSWAL.java:731)
    at org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.sync(AsyncFSWAL.java:611)
    at org.apache.hadoop.hbase.regionserver.wal.WALUtil.doFullAppendTransaction(WALUtil.java:158)
    at org.apache.hadoop.hbase.regionserver.wal.WALUtil.writeMarker(WALUtil.java:136)
    at org.apache.hadoop.hbase.regionserver.wal.WALUtil.writeCompactionMarker(WALUtil.java:70)
    at org.apache.hadoop.hbase.regionserver.HStore.writeCompactionWalRecord(HStore.java:1512)
    at org.apache.hadoop.hbase.regionserver.HStore.doCompaction(HStore.java:1441)
    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1428)
    at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:2235)
    at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.doCompaction(CompactSplit.java:616)
    at org.apache.hadoop.hbase.regionserver.CompactSplit$CompactionRunner.run(CompactSplit.java:658)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

--
This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [VOTE] First release candidate for hbase 3.0.0-alpha-4 is available for download
I am sorry, I touched the send button accidentally before finishing the email.

+1

何良均
Email: 2005hit...@163.com

---- Replied Message ----
From: Duo Zhang
Date: 05/28/2023 23:13
To: HBase Dev List
Subject: [VOTE] First release candidate for hbase 3.0.0-alpha-4 is available for download

Please vote on this Apache hbase release candidate, hbase-3.0.0-alpha-4RC0

The VOTE will remain open for at least 72 hours.

[ ] +1 Release this package as Apache hbase 3.0.0-alpha-4
[ ] -1 Do not release this package because ...

The tag to be voted on is 3.0.0-alpha-4RC0:

https://github.com/apache/hbase/tree/3.0.0-alpha-4RC0

This tag currently points to git reference

e44cc02c75ecae7ece845f04722eb16b7528393f

The release files, including signatures, digests, as well as CHANGES.md and RELEASENOTES.md included in this RC can be found at:

https://dist.apache.org/repos/dist/dev/hbase/3.0.0-alpha-4RC0/

Maven artifacts are available in a staging repository at:

https://repository.apache.org/content/repositories/orgapachehbase-1520/

Maven artifacts for hadoop3 are available in a staging repository at:

https://repository.apache.org/content/repositories/not-applicable/

Artifacts were signed with the 0x9AD2AE49 key which can be found in:

https://downloads.apache.org/hbase/KEYS

3.0.0-alpha-4 is the fourth alpha release for our 3.0.0 major release line. HBase 3.0.0 includes the following big features/changes:

  Synchronous Replication
  OpenTelemetry Tracing
  Distributed MOB Compaction
  Backup and Restore
  Move RSGroup balancer to core
  Reimplement sync client on async client
  CPEPs on shaded protobuf
  Move the logging framework from log4j to log4j2
  Decouple region replication and general replication framework, and also allow region replication to work when SKIP_WAL is used
  A new file system based replication peer storage
  Use an hbase table instead of zookeeper for tracking the hbase replication queue

Note that this is not a production-ready release. It is meant to let our users try and test the new major release, so we can get feedback before the final GA release is out. So please do NOT use it in production. Just try it and report back everything you find unusual.

This time we will not include CHANGES.md and RELEASENOTES.md in our source code; you can find them on the download site. To get these two files for old releases, please go to

https://archive.apache.org/dist/hbase/

To learn more about Apache hbase, please see

http://hbase.apache.org/

Thanks,
Your HBase Release Manager