Holy shit, the two of you were fast. Thank you!

On Tue, Dec 5, 2023 at 10:15 PM Bryan Beaudreault <bbeaudrea...@apache.org> wrote:
>
> Where exactly in StoreScanner are you able to pinpoint the slowness to?

I'm still at the very beginning. I started at RpcServer and have made my way to RegionScannerImpl.
It is always about a day between me adding more debug code and getting results back so it's slow going.
That's why I reached out now to see if others have good ideas on speeding this up a little.
Turns out that was a good idea. Thank you!

> You
> might try going deeper into StoreFileScanner -> HFileReaderImpl ->
> HFileBlock. In my experience, the code path follows there as you get closer
> to the FS, but the complexity of the code increases as well. The ultimate
> method just before HDFS is typically around FSReaderImpl.readAtOffset, in
> HFileBlock.java.
>
> I'm unsure if it's available in your version, but histograms exist for fs
> access. Search JMX for FsReadTime.

Will do. Sorry, should have added. This is on 2.4 (I know, we had issues upgrading to 2.5 and haven't had a chance to investigate).

>
> You could try seeing how many blocks are required to read the given rows.
> For large rows or rows that span multiple storefiles, this can affect
> performance a lot. An example of how to do that is here:
> https://github.com/apache/hbase/pull/4967

Thanks! Will check but I'm pretty sure all rows are tiny. I'll double check though.

> Also, are you sure you couldn't use async-profiler? We use this all the
> time in our very latency-sensitive production. It has no noticeable
> overhead in my experience and doesn't need any special dependencies.

I have to admit, I have never used async-profiler. Shame on me.
That is a fabulous hint and I'll read up on it immediately.

Thank you again!

>
> On Tue, Dec 5, 2023 at 3:46 PM Lars Francke <lars.fran...@gmail.com> wrote:
>
> > Hi,
> >
> > I am debugging an issue where we see some Get requests taking 2-5s.
> > We do see "responseTooSlow" etc. and this is in an environment where I
> > cannot run a Profiler but I _can_ run modified code.
> >
> > So what I did was I added a stupid "MethodTimer"[1] which records how
> > long certain operations take at various points in the code (e.g. [2]).
> > I've been doing this a few rounds and have now arrived at the StoreScanner.
> >
> > I'm wondering if anyone has better ideas on how to diagnose this?
> > I am a HBase committer but I haven't been able to keep up with the
> > changes in the last 5-6 years so I'm not too familiar with the inner
> > workings anymore and would appreciate a hint.
> >
> > I suspect it is slowness related to storage access.
> > I was not able to find any logs or tweaks to log "slow storage"
> > access, does such a thing exist?
> > And something else that'd help me: Can anyone point me (if it exists)
> > at the (vicinity of the) code that actually reads from HDFS at the
> > end? There are so many layers.
> >
> > Thank you!
> >
> > Cheers,
> > Lars
> >
> >
> > [1] <https://github.com/stackabletech/docker-images/blob/8349f29f8aded8a01a8d1dbf7a90776ede1764ca/hbase/stackable/patches/2.4.12/005-STACKABLE-profiling-2.4.12.patch#L150C5-L150C5>
> > [2] <https://github.com/stackabletech/docker-images/blob/8349f29f8aded8a01a8d1dbf7a90776ede1764ca/hbase/stackable/patches/2.4.12/005-STACKABLE-profiling-2.4.12.patch#L289-L297>
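
For readers who want to try the same approach, a timing helper in the spirit of the "MethodTimer" described in the original message might look roughly like the sketch below. This is not the code from the linked patch: the class shape, the threshold parameter, the message format, and the use of a Consumer as the output sink are all assumptions made for illustration. It only relies on the JDK (System.nanoTime and try-with-resources).

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical timing helper (not the MethodTimer from the linked patch).
 * Wrap a code section in try-with-resources; when the section ends, the
 * elapsed time is reported to the sink if it reached the threshold.
 */
final class MethodTimer implements AutoCloseable {
  private final String label;
  private final long thresholdNanos;
  private final Consumer<String> sink;
  private final long startNanos;

  MethodTimer(String label, long thresholdMillis, Consumer<String> sink) {
    this.label = label;
    this.thresholdNanos = TimeUnit.MILLISECONDS.toNanos(thresholdMillis);
    this.sink = sink;
    // nanoTime is monotonic, so it is safe for measuring durations.
    this.startNanos = System.nanoTime();
  }

  /** Elapsed time since construction, in nanoseconds. */
  long elapsedNanos() {
    return System.nanoTime() - startNanos;
  }

  @Override
  public void close() {
    long elapsed = elapsedNanos();
    // Only report sections that were at least as slow as the threshold,
    // to keep the log volume manageable on a busy server.
    if (elapsed >= thresholdNanos) {
      sink.accept(label + " took " + TimeUnit.NANOSECONDS.toMillis(elapsed) + " ms");
    }
  }
}

// Example use around a suspect section (the label is just a free-form string):
//
//   try (MethodTimer t = new MethodTimer("StoreScanner.next", 100, System.err::println)) {
//     // ... code under investigation ...
//   }
```

Reporting only above a threshold is a design choice for this sketch; with a threshold of 0 every timed section is reported. A sampling profiler such as async-profiler, as suggested in the reply, avoids the redeploy-per-measurement cycle entirely where it can be used.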