[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369296#comment-17369296 ]

Viraj Jasani commented on HBASE-23349:
--------------------------------------

{quote}
Is the issue already fixed in 1.7.0 released on 2021/06/12? It is still marked as UNRESOLVED.
{quote}
This Jira is not yet resolved. Any Jira that is resolved is always marked "Resolved", with fix versions that indicate which HBase releases the fix or improvement has landed in.

[~larry1285], how long have you been facing this issue? Are you getting this log for the same HFile over a long time, or for different HFiles? Are you using any custom coprocessors that might be leaking refCounts? Which HBase version are you using?

> Low refCount preventing archival of compacted away files
> ---------------------------------------------------------
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0-alpha-1, 2.3.0, 1.6.0
> Reporter: Viraj Jasani
> Priority: Major
>
> We have observed that a refCount as low as 1 on compacted away store files
> prevents archival.
> {code:java}
> regionserver.HStore - Can't archive compacted file
> hdfs://{{root-dir}}/hbase/data/default/t1/12a9e1112e0371955b3db8d3ebb2d298/cf1/73b72f5ddfce4a34a9e01afe7b83c1f9
> because of either isCompactedAway=true or file has reference,
> isReferencedInReads=true, refCount=1, skipping for now.
> {code}
> We should come up with core code (run as part of the discharger thread) that
> gracefully resolves the reader lock issue by resetting ongoing scanners to
> start pointing to new store files instead of compacted away store files.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369205#comment-17369205 ]

Chengliang commented on HBASE-23349:
------------------------------------

Is the issue already fixed in 1.7.0, released on 2021/06/12? It is still marked as UNRESOLVED.

I faced exactly the same error.
{code:java}
regionserver.HStore - Can't archive compacted file
hdfs://{{root-dir}}/hbase/data/default/t1/12a9e1112e0371955b3db8d3ebb2d298/cf1/73b72f5ddfce4a34a9e01afe7b83c1f9
because of either isCompactedAway=true or file has reference,
isReferencedInReads=true, refCount=1, skipping for now.
{code}
Thank you so much for the clarification.

Best regards,
Larry
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036467#comment-17036467 ]

Andrew Kyle Purtell commented on HBASE-23349:
---------------------------------------------

We are releasing 1.6.0 now because of HBASE-23825, so this is moving to 1.7.0.

> Low refCount preventing archival of compacted away files
> ---------------------------------------------------------
>
> Key: HBASE-23349
> URL: https://issues.apache.org/jira/browse/HBASE-23349
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 3.0.0, 2.3.0, 1.6.0
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Fix For: 3.0.0, 2.3.0, 1.7.0
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17026433#comment-17026433 ]

ramkrishna.s.vasudevan commented on HBASE-23349:
------------------------------------------------

I will list out my points here.

-> Generally, the scanners created at the Region level for incoming user requests have a lease mechanism, which exists to release resources. Any long-running scan eventually reaches its end, and even if the user is unresponsive or never closes the scan, we have the lease mechanism. In either case the scanner gets closed and the resources get released.
-> The case where this cannot happen is when CPs create their own scanners. If a CP scanner fails or does not release its resources, we may hold up the underlying resources, and even lease recovery will not work.
-> One way to solve this is to have a lease mechanism for CP scans as well, so that we don't end up with scans staying alive for a long time.
-> Coming to the benefit of the ref count based mechanism: it solves the sync block issue that was happening on every next call. Beyond that, we have two other benefits.
For the first benefit, please refer to these mailing list threads:
http://apache-hbase.679495.n3.nabble.com/Extremely-long-flush-times-td4104190.html
http://mail-archives.apache.org/mod_mbox/hbase-dev/201208.mbox/%3c6548f17059905b48b2a6f28ce3692baa0ce29...@oaexch4server.oa.oclc.org%3E (see the later part of it).
We are helping readers carry on without being impacted by flushes, and conversely flushes are not blocked by readers. This matters where the scans are heavier, with filters applied or with many deletes/versions to skip through. In such cases a non-sync read path always helps.
The other benefit is that current readers need not reset themselves, load the new files (which after compaction may not be in cache), reseek to the last fetched row, and then proceed with the scan. Obviously this means that the next scan that comes in will have to read from the filesystem and load into the cache anyway, but at least the ongoing scans are not impacted.
-> Finally, I would like to mention Phoenix-like cases, where a query reads a large amount of data with filtering applied alongside heavy writes. There it may be obvious that we will face the issues that the users faced on the mailing list.

I am fine if everybody agrees to revert the patch and put back the sync way of readers (or any other better solution). Just saying this because it should be giving a view that I am in favour of the existing behaviour and against any changes to it.
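As an illustration of the lease idea for CP-created scanners suggested above, here is a minimal sketch. The class `CpScannerLeases` and all of its methods are hypothetical names invented for this example and are not an HBase API; time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical lease table for coprocessor-created scanners; not an HBase API.
final class CpScannerLeases {
    private final Map<String, Long> expiryByScanner = new ConcurrentHashMap<>();
    private final long leaseMillis;

    CpScannerLeases(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Called when a scanner is opened and again on each next() to extend its lease.
    void renew(String scannerId, long nowMillis) {
        expiryByScanner.put(scannerId, nowMillis + leaseMillis);
    }

    // Called when a scanner is closed normally.
    void release(String scannerId) {
        expiryByScanner.remove(scannerId);
    }

    // Called periodically; force-closes scanners whose lease has expired
    // and returns how many were closed.
    int expire(long nowMillis, Consumer<String> closer) {
        int closed = 0;
        for (Map.Entry<String, Long> e : expiryByScanner.entrySet()) {
            // remove(key, value) only succeeds if the lease was not renewed meanwhile.
            if (e.getValue() <= nowMillis && expiryByScanner.remove(e.getKey(), e.getValue())) {
                closer.accept(e.getKey());
                closed++;
            }
        }
        return closed;
    }
}
```

With such a table, a CP scanner that stops calling next() and is never closed would still be closed once its lease runs out, so the references it holds on store files could be released.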
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024863#comment-17024863 ]

Andrew Kyle Purtell commented on HBASE-23349:
---------------------------------------------

Some regions are hot, with readers mostly always active, and also taking writes, enough to generate flush and compaction activity.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024855#comment-17024855 ]

ramkrishna.s.vasudevan commented on HBASE-23349:
------------------------------------------------

bq. Now the issue is that if there are readers active on a region always it will never be allowed to discharge compacted files

Yes, if readers are active the discharger will never come into play. Is there any specific reason why readers are always active? Are you suggesting we go back to the locking way only? If not, then at least I prefer the approach here where CPs can make the store scanner reset itself on compaction.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024520#comment-17024520 ]

Andrew Kyle Purtell commented on HBASE-23349:
---------------------------------------------

And just to be clear we are seeing this issue in production so it is not only a theoretical concern. There really are regions in production where refcount is nonzero for so long that failure to discharge compacted files is a performance and operational issue, leading to incidents.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024517#comment-17024517 ]

Andrew Kyle Purtell commented on HBASE-23349:
---------------------------------------------

Phoenix is a red herring now. There was some past issue with leaks which is why we added the metric to make ref counts visible. Now the issue is that if there are readers active on a region always it will never be allowed to discharge compacted files. That is an HBase level problem for certain. We looked at solutions and without bringing some locking back none of the solutions are safe. So here we are.
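The starvation described here can be sketched in a few lines. This is a simplified model under stated assumptions, not the HBase implementation: `CompactedFileSketch` and `OverlappingReadersDemo` are hypothetical names, and the real check also involves flags such as isCompactedAway. It only shows that when each new reader starts before the previous one finishes, the count never reaches zero at the moment the discharger looks.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a compacted-away store file; not the HBase class.
final class CompactedFileSketch {
    private final AtomicInteger refCount = new AtomicInteger();

    void openReader() {
        refCount.incrementAndGet();
    }

    void closeReader() {
        refCount.decrementAndGet();
    }

    // Models the discharger's check: true only when no reader holds a reference.
    boolean tryArchive() {
        return refCount.get() == 0;
    }
}

final class OverlappingReadersDemo {
    // Simulates discharger runs on a hot region and counts successful archive checks.
    static int archiveSuccesses(CompactedFileSketch file, int rounds) {
        int successes = 0;
        file.openReader();                 // first scanner starts
        for (int i = 0; i < rounds; i++) {
            file.openReader();             // next scanner starts before the previous one ends
            file.closeReader();            // previous scanner finishes
            if (file.tryArchive()) {
                successes++;
            }
        }
        file.closeReader();                // readers finally drain
        return successes;
    }

    public static void main(String[] args) {
        CompactedFileSketch file = new CompactedFileSketch();
        System.out.println(archiveSuccesses(file, 1000)); // prints 0
        System.out.println(file.tryArchive());            // prints true
    }
}
```

No individual reader is leaking in this model; the file simply never has a moment with zero references until read traffic stops.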
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17024085#comment-17024085 ]

ramkrishna.s.vasudevan commented on HBASE-23349:
------------------------------------------------

[~stack], [~larsh], [~vjasani]
Just my thoughts here. It seems this ref counting issue currently happens because Phoenix is not able to close the scanners properly in some cases. It is not coming out of HBase. Correct me if I am wrong here, [~vjasani].

The next thing is that even recently a 1.3.0 user faced an issue with sync blocks, exactly the issue HBASE-13082 was trying to solve, because the user had rows to read with lots of deletes while frequent flushes/compactions were going on. For all such cases this non-sync way will help.

Also, I believe some features like external compaction may depend on or make use of this async way of removing the compacted files rather than doing it as part of the scan flow.

The code fix may become uglier if we keep adding another boolean and have two code paths: either do the scanner reset as part of scans, or just allow the scanner to continue as is and do the async removals.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023559#comment-17023559 ]

Anoop Sam John commented on HBASE-23349:
----------------------------------------

The ref counting way of lazy deletion of compacted files was done primarily to avoid the synchronized blocks in the read path. On that Jira, the perf test results were with all such blocks removed, I believe. Later we had to fix one issue with memstore flush during reads, and as part of that a volatile boolean lookup came into the read path (during seek, next, etc.). I am not sure whether any perf reports have been taken after that, and now we were trying to add another volatile boolean.

It might be good to do a perf test comparing the current way (no sync blocks but with a volatile boolean read) vs. a change back to the old way with sync blocks and without ref counting. Not sure how easy or difficult that will be. [~vjasani], do you have some bandwidth for this? Let's have a detailed offline discussion if so. I can explain the old way from before the ref counting came in, if you wish.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022411#comment-17022411 ]

Michael Stack commented on HBASE-23349:
---------------------------------------

[~anoop.hbase] / [~ram_krish] Any comments lads? What are we going to do w/ this one? Seems nasty. The refcounting is nice. Would be a pity to undo it. Wonder if there are other repercussions than this issue's when refcount doesn't go to zero.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015365#comment-17015365 ]

Lars Hofhansl commented on HBASE-23349:
---------------------------------------

Minor nit: This is not lock coarsening. That was the failed attempt I made to reduce the frequency of taking memory barriers (the locks were almost never contended) by pushing the locking up the stack into the region scanner.

[~ram_krish] and [~anoop.hbase] then came up with an actual solution :), but that then required the reference counting. Note that the numbers on HBASE-13082 were with the lock coarsening, not with reference counting.

At this point my concern is just about correctness and the issues we have seen with reference counting. It is generally very hard to retrofit reference counting into a large, complex system. Ram and Anoop did an awesome job! Perhaps HBase is just too complex to add this reliably.
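For readers unfamiliar with the term, the following sketch contrasts per-call locking with lock coarsening as described in this comment: taking one lock higher up the stack instead of one per store-level call. All class names are hypothetical and the scanners return dummy positions; this is not HBase code and says nothing about measured performance.

```java
import java.util.concurrent.locks.ReentrantLock;

// Fine-grained (hypothetical): every store-level next() takes and releases a lock.
final class PerCallLockedStoreScanner {
    private final ReentrantLock lock = new ReentrantLock();
    private int position = 0;
    int lockAcquisitions = 0;

    int next() {
        lock.lock();
        try {
            lockAcquisitions++;
            return position++;
        } finally {
            lock.unlock();
        }
    }
}

// Coarsened (hypothetical): the region-level scanner locks once per row
// and reads every store under that single lock.
final class CoarsenedRegionScanner {
    private final ReentrantLock lock = new ReentrantLock();
    private final int[] storePositions;
    int lockAcquisitions = 0;

    CoarsenedRegionScanner(int stores) {
        this.storePositions = new int[stores];
    }

    int[] nextRow() {
        lock.lock();
        try {
            lockAcquisitions++;
            int[] row = new int[storePositions.length];
            for (int i = 0; i < row.length; i++) {
                row[i] = storePositions[i]++; // per-store read without its own lock
            }
            return row;
        } finally {
            lock.unlock();
        }
    }
}
```

Reading 10 rows across 3 stores takes 30 lock acquisitions in the first variant and 10 in the second, which is the reduction in memory barriers the comment refers to.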
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014683#comment-17014683 ]

Viraj Jasani commented on HBASE-23349:
--------------------------------------

Sure [~apurtell] I will get back on this in some days and yes agree that "the scope of change will be less and reviews will be smoother and risk will be lower".
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014488#comment-17014488 ]

Andrew Kyle Purtell commented on HBASE-23349:
---------------------------------------------

If you can find an acceptable solution short of reintroducing locking it will go better for everyone because the scope of change will be less and reviews will be smoother and risk will be lower. So please have at it. That said, if we have reached the end of the road with the "storefile lock coarsening" then we need to recognize this and avoid the sunk cost fallacy.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013344#comment-17013344 ]

Viraj Jasani commented on HBASE-23349:
--------------------------------------

I agree on considering removal of the refCount system based on your suggestions, [~apurtell] [~larsh].

However, I am just trying to give one chance to an approach that considers both points:
# Perf improvement as part of HBASE-13082
# Scanner reset during compaction if required (config based)

I tried using a volatile enum (NONE, FLUSH, COMPACTION) instead of 2 volatile booleans in the Scanner.next() and seek() calls so that perf does not degrade for normal scans. Hence, if archival is not happening correctly, we can notify open scanners and reset the KV heap in the following next() or seek() call. Whether next() has to reset the KV heap is determined from the volatile enum value, which is set while notifying scanners.

[https://github.com/apache/hbase/pull/939] includes some tests for scanner reset during compaction and successful archival thereafter.

Considering that refCount has been present in HBase for some time, someone might have started building some system (alerts, recovery, etc.) based on the refCount use case. In fact, we have also done auto region reopen etc., but other users might have built other use cases too.
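The volatile-enum idea in this comment can be sketched roughly as follows. This is not the code from the linked PR: `ScannerEvent`, `ResettableScannerSketch`, `notifyEvent`, and the list of file names standing in for the KV heap are all hypothetical, chosen only to show one volatile read on the hot path deciding whether the scanner rebuilds its view of the store files.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical event type; the comment names the values NONE, FLUSH, COMPACTION.
enum ScannerEvent { NONE, FLUSH, COMPACTION }

// Hypothetical scanner; a list of file names stands in for the KV heap.
final class ResettableScannerSketch {
    // One volatile field instead of two volatile booleans.
    private volatile ScannerEvent pending = ScannerEvent.NONE;
    private final Supplier<List<String>> liveFiles;
    private List<String> currentFiles;
    int resets = 0;

    ResettableScannerSketch(Supplier<List<String>> liveFiles) {
        this.liveFiles = liveFiles;
        this.currentFiles = new ArrayList<>(liveFiles.get());
    }

    // Called by another thread (for example a flusher or discharger) to request a reset.
    void notifyEvent(ScannerEvent event) {
        pending = event;
    }

    // Returns the files this scanner currently reads from.
    String next() {
        ScannerEvent event = pending;          // single volatile read per call
        if (event != ScannerEvent.NONE) {
            // Sketch only: a real implementation would have to handle a
            // notification that races with this reset.
            pending = ScannerEvent.NONE;
            currentFiles = new ArrayList<>(liveFiles.get());
            resets++;
        }
        return currentFiles.toString();
    }
}
```

Until it is notified, the scanner keeps reading its original files even if the live set has changed; after a COMPACTION notification, the following next() call switches to the new files, which is what would let the old ones be released.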
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17013255#comment-17013255 ]

Andrew Kyle Purtell commented on HBASE-23349:
---------------------------------------------

{quote}
I think we should step back and remember why we have the ref counting in the first place. This came from a discussion started in HBASE-13082 and HBASE-10060, namely too much synchronization. If any change we make now needs new synchronization in the scanner.next(...) path, we're back to where we started, and in that case we should remove the ref counting and bring back the old notification and scanner switching we had before.
{quote}
I made a similar comment in an internal discussion yesterday. If we have to walk back the StoreScanner "lock coarsening" work, then let's not be afraid to do it. There is a nuanced decision we would have to make, but let's not be concerned about sunk costs.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008612#comment-17008612 ]

Viraj Jasani commented on HBASE-23349:
--------------------------------------

I have incorporated some review comments in the linked PR (#939). Please review.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006485#comment-17006485 ] Lars Hofhansl commented on HBASE-23349: --- Sure. [~ram_krish], [~anoop.hbase], FYI. I know you guys invested a lot of time in this. In light of the issues, I'm in favor of removing the refcounting code and restoring the old behavior. Let's have a discussion.
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17004841#comment-17004841 ] Viraj Jasani commented on HBASE-23349: -- Thanks [~larsh]. I will try to see where else refCounts are being used. Would it be better to do this in two phases? For now, we can bring back scanner notification for compaction, and then, if refCount usage is not that widespread, we can remove it as part of a different Jira.
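To make the "scanner notification" idea in the comment above concrete, here is a minimal sketch of a store that tells its open scanners when compaction has replaced their files. Every name in it (ScannerNotificationSketch, SketchStore, SketchScanner, ReadersChangedListener) is hypothetical and chosen for illustration; this is not HBase's actual code. File names stand in for real store file readers, and a real scanner would also have to re-seek to its last returned row after switching, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical illustration only; these are not HBase classes.
class ScannerNotificationSketch {

  /** Callback a store invokes when its set of readable files changes. */
  interface ReadersChangedListener {
    void readersChanged(List<String> newFiles);
  }

  /** A store that tracks its current files and the scanners reading them. */
  static final class SketchStore {
    private List<String> files;
    private final List<ReadersChangedListener> listeners = new CopyOnWriteArrayList<>();

    SketchStore(List<String> files) {
      this.files = new ArrayList<>(files);
    }

    /** Opens a scanner over the current files and registers it for updates. */
    synchronized SketchScanner openScanner() {
      SketchScanner scanner = new SketchScanner(files);
      listeners.add(scanner);
      return scanner;
    }

    synchronized void closeScanner(SketchScanner scanner) {
      listeners.remove(scanner);
    }

    /** Swaps the compaction inputs for the output file and notifies every open scanner. */
    synchronized void completeCompaction(List<String> inputs, String output) {
      List<String> updated = new ArrayList<>(files);
      updated.removeAll(inputs);
      updated.add(output);
      files = updated;
      for (ReadersChangedListener listener : listeners) {
        listener.readersChanged(updated);
      }
    }
  }

  /** A scanner that switches to the new file set when notified. */
  static final class SketchScanner implements ReadersChangedListener {
    private List<String> currentFiles;

    SketchScanner(List<String> files) {
      this.currentFiles = new ArrayList<>(files);
    }

    // The lock here stands in for the read-path synchronization the thread is weighing.
    @Override
    public synchronized void readersChanged(List<String> newFiles) {
      currentFiles = new ArrayList<>(newFiles);
    }

    synchronized List<String> currentFiles() {
      return new ArrayList<>(currentFiles);
    }
  }
}
```

With this shape, no scanner keeps pointing at a compacted-away file once compaction completes, so nothing needs to count references before archiving. The price is the lock shared between the notification and the scanner's reads.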
[jira] [Commented] (HBASE-23349) Low refCount preventing archival of compacted away files
[ https://issues.apache.org/jira/browse/HBASE-23349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17001958#comment-17001958 ] Lars Hofhansl commented on HBASE-23349: --- I think we should step back and remember why we have the ref counting in the first place. This came from a discussion started in HBASE-13082 and HBASE-10060, namely too much synchronization. If any change we make now needs new synchronization in the scanner.next(...) path, we're back to where we started, and in that case we should remove the ref counting and bring back the old notification and scanner switching we had before. My apologies that I triggered the original discussion and then completely dropped off (worked on other stuff) when we attempted to fix it. Reference counting is bad (I've never seen it successfully implemented); if we can avoid it we should, and a bit of performance drop is acceptable. Long story short: if we bring back scanner notification, then let's get rid of ref counting completely.
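As an illustration of the ref counting under discussion, the sketch below shows how a single outstanding reference can keep a compacted-away file from being archived, which is the situation the quoted HStore log line reports (isReferencedInReads=true, refCount=1). All class and method names are hypothetical and are not HBase's actual API. It assumes only that each open scanner increments a per-file counter and that a periodic cleaner skips any file whose counter is above zero.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; these are not HBase classes.
class RefCountArchivalSketch {

  /** A store file that counts the scanners still reading from it. */
  static final class SketchStoreFile {
    final String name;
    private final AtomicInteger refCount = new AtomicInteger();
    private volatile boolean compactedAway;

    SketchStoreFile(String name) {
      this.name = name;
    }

    void scannerOpened() { refCount.incrementAndGet(); }

    void scannerClosed() { refCount.decrementAndGet(); }

    void markCompactedAway() { compactedAway = true; }

    boolean isCompactedAway() { return compactedAway; }

    boolean isReferencedInReads() { return refCount.get() > 0; }
  }

  /** One pass of a periodic cleaner: returns the files that are safe to archive now. */
  static List<SketchStoreFile> selectArchivable(List<SketchStoreFile> files) {
    List<SketchStoreFile> archivable = new ArrayList<>();
    for (SketchStoreFile file : files) {
      if (file.isCompactedAway() && !file.isReferencedInReads()) {
        archivable.add(file);
      }
      // Otherwise the file is skipped for now and reconsidered on the next pass.
    }
    return archivable;
  }

  public static void main(String[] args) {
    SketchStoreFile file = new SketchStoreFile("hfile-1");
    file.scannerOpened();      // a scanner takes a reference
    file.markCompactedAway();  // compaction replaces the file
    System.out.println(selectArchivable(Arrays.asList(file)).size()); // prints 0
    file.scannerClosed();      // the reference is released
    System.out.println(selectArchivable(Arrays.asList(file)).size()); // prints 1
  }
}
```

If a scanner never calls scannerClosed(), for example because a reference leaks, the file stays unarchivable on every later pass. That failure mode is why the thread weighs dropping the counter altogether.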