[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021943#comment-15021943 ]
ramkrishna.s.vasudevan commented on HBASE-13082:
------------------------------------------------

bq. Do it up here where compactedfiles is introduced.
Yes, okay. I will add it in StoreFileManager, where we introduce the compactedfiles. I also removed getCompactedfiles from Store.java.

bq. ImmutableList is a guava-ism?
Removed the usage of the guava API.

bq. Yeah, below has a side-effect. Instead return what was sorted and have the caller assign:
Done.

bq. Why every '2' minutes? Any reason? 5 minutes?
The TTL cleaner for the secondary region replicas runs every 5 minutes, and the compacted files need to be moved to the archive before that happens. Hence 2 minutes.

bq. You are looking at an enum? Why not just look at the refcount? Have a method isReferenced? Or isAliveStill? Or isInScan?
I have changed this now, and the logic is quite simple: increment the refCount every time the files are fetched for a scan (we do that only on the active store files), and decrement it when the scan completes. Once compaction is done, mark the file with a compactedAway boolean. The cleaner chore then only checks that the refCount is zero and the boolean is set; if so, it moves the file to the archive.

bq. So, we need Store Interface to have public Collection<StoreFile> getCompactedfiles()
Not needed any more; we will use the StoreFileManager API. I think configuring the cleaner per region makes sense, because the region can collectively issue the cleaner across all its stores.

bq. Is this added in this patch? If so, closeAfterCompaction? or isCloseAfterCompaction?
closeAfterCompaction() will be the name.

bq. We make filesToRemove even if we may not use it? i.e. compacted is false. We create this array to hold one file only? Then clear it?
Changed this. We will directly collect the compacted files and just remove them.

bq. Who calls this closeAndArchiveCompactedFiles? The chore thread?
Yes.

bq. We do a copy here and operate on the copy?
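The refCount/compactedAway scheme described above can be sketched roughly as follows. This is a minimal illustration only; the class and method names (TrackedStoreFile, isEligibleForArchival, etc.) are hypothetical and not the actual HBase API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the ref-count lifecycle for a compacted store file.
class TrackedStoreFile {
    private final AtomicInteger refCount = new AtomicInteger(0);
    private volatile boolean compactedAway = false;

    // Called when the file is handed to a scanner (active files only).
    void incrementRefCount() { refCount.incrementAndGet(); }

    // Called when a scan over this file completes.
    void decrementRefCount() { refCount.decrementAndGet(); }

    // Called once compaction has replaced this file with its compacted output.
    void markCompactedAway() { compactedAway = true; }

    boolean isReferenced() { return refCount.get() > 0; }

    // The cleaner chore archives a file only when it has been compacted away
    // AND no in-flight scan still references it.
    boolean isEligibleForArchival() { return compactedAway && !isReferenced(); }
}

public class RefCountDemo {
    public static void main(String[] args) {
        TrackedStoreFile f = new TrackedStoreFile();
        f.incrementRefCount();                         // a scan opens on the file
        f.markCompactedAway();                         // compaction finishes meanwhile
        System.out.println(f.isEligibleForArchival()); // false: scan still in flight
        f.decrementRefCount();                         // scan completes
        System.out.println(f.isEligibleForArchival()); // true: safe to archive
    }
}
```

The key point is that an in-flight scan keeps the file alive even after compaction, so the chore never archives a file out from under a scanner.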
Yes. The original can change if, just after we get the compacted files, another set of compactions completes. As we do in other areas, obtaining a read lock before getting the files for scanners, I have added a read lock here just for copying the compacted files.

bq. ..down in archiveAndRemoveCompactedFiles? There are no references to the file, right?
Currently the store files are updated under the write lock, and the compacted files are updated under the same write lock, so the removal of the compacted files also happens under the write lock. This is basically to make updates to the compactedfiles list atomic.

bq. { 156 ACTIVE, DISCARDED; 157 }
No more enums now.

bq. doing it in StoreFile is not right... this is meta info on the StoreFile instances. StoreFileManager?
Agreed that StoreFile is not right and StoreFileInfo is better; I think we can do this in a separate JIRA. But StoreFileManager is currently an interface, so if we do it there, every impl of the SFManager (like the default one and StripeSFM) would have to take care of this state and the ref counting.

bq. Should return Collection<StoreFile> and internally you do the Immutable stuff (good practice)
Okay. I just followed what clearStoreFiles() does.

bq. In StoreFileScanner, when would scanner be null?
Changed this now. Considering that we have two separate lists for the current store files and the compacted files, this may no longer be needed.

bq. Maybe a timer on scans? If it goes on longer than a minute, have it return and then clean up compacted files...
True; new issue.

bq. Is checkFlushed the right name for the method?
Changed to checkResetHeap.

bq. Why is it a CompactedHFilesCleaner and not a HFileCleaner?
Changed to CompactedHFilesDischarger. Does that sound good?

bq. Yeah, why call checkFlushed in shipped?
I rechecked this flow fully.
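The copy-under-read-lock / remove-under-write-lock scheme described above can be illustrated with a small sketch. The class and method names here are made up for illustration (this is not the actual Store/StoreFileManager code), but the locking discipline matches what is described: snapshots of the compacted-file list are taken under the read lock, while all mutations happen under the write lock:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the locking around the compacted-files list.
public class CompactedFilesHolder {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> compactedFiles = new ArrayList<>();

    // Compaction completion adds to the list under the write lock,
    // keeping updates atomic with respect to readers.
    void addCompactedFile(String file) {
        lock.writeLock().lock();
        try { compactedFiles.add(file); } finally { lock.writeLock().unlock(); }
    }

    // The discharger chore takes a snapshot under the read lock; the
    // underlying list can keep growing as further compactions complete.
    List<String> snapshotCompactedFiles() {
        lock.readLock().lock();
        try { return new ArrayList<>(compactedFiles); } finally { lock.readLock().unlock(); }
    }

    // Removal after archiving happens under the same write lock that
    // guards updates, so the list never reflects a partial removal.
    void removeCompactedFiles(List<String> archived) {
        lock.writeLock().lock();
        try { compactedFiles.removeAll(archived); } finally { lock.writeLock().unlock(); }
    }

    public static void main(String[] args) {
        CompactedFilesHolder h = new CompactedFilesHolder();
        h.addCompactedFile("hfile-1");
        List<String> copy = h.snapshotCompactedFiles();
        h.addCompactedFile("hfile-2");       // a concurrent compaction completes
        System.out.println(copy.size());     // 1: the snapshot is unaffected
        h.removeCompactedFiles(copy);        // archive and remove the snapshot
        System.out.println(h.snapshotCompactedFiles().size()); // 1: hfile-2 remains
    }
}
```

Operating on the snapshot means the chore can archive files without holding any lock during the (potentially slow) filesystem moves.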
Calling it in shipped() or close() may not be needed, because after the shipped() call we will anyway be calling one of the scan APIs such as next(), reseek() or peek(). As for calling checkResetHeap in reseek(), peek(), seek(), etc., I think it is okay: every next() may call reseek() or seek() internally, and if we reset the heap on flush at that point, we ensure we don't hold on to the snapshot for too long. One downside is that if there is no flush during a scan and there are a lot of reseeks(), we end up checking the volatile every time, but I think that is acceptable.

> Coarsen StoreScanner locks to RegionScanner
> -------------------------------------------
>
>                 Key: HBASE-13082
>                 URL: https://issues.apache.org/jira/browse/HBASE-13082
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Lars Hofhansl
>            Assignee: ramkrishna.s.vasudevan
>     Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 13082-v4.txt, 13082.txt, 13082.txt, HBASE-13082.pdf, HBASE-13082_1.pdf, HBASE-13082_12.patch, HBASE-13082_13.patch, HBASE-13082_14.patch, HBASE-13082_15.patch, HBASE-13082_1_WIP.patch, HBASE-13082_2.pdf, HBASE-13082_2_WIP.patch, HBASE-13082_3.patch, HBASE-13082_4.patch, HBASE-13082_9.patch, HBASE-13082_9.patch, HBASE-13082_withoutpatch.jpg, HBASE-13082_withpatch.jpg, LockVsSynchronized.java, gc.png, gc.png, gc.png, hits.png, next.png, next.png
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized.
> * Implementors of coprocessors need to be diligent in following the locking contract.
> For example Phoenix does not lock RegionScanner.nextRaw(), as required in the documentation (not picking on Phoenix; this one is my fault, as I told them it's OK).
> * Possible starving of flushes and compactions under heavy read load: RegionScanner operations would keep getting the locks, and the flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)