[ 
https://issues.apache.org/jira/browse/HBASE-20628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eshcar Hillel reassigned HBASE-20628:
-------------------------------------

    Assignee: Eshcar Hillel  (was: stack)

> SegmentScanner does over-comparing when one flushing
> ----------------------------------------------------
>
>                 Key: HBASE-20628
>                 URL: https://issues.apache.org/jira/browse/HBASE-20628
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: stack
>            Assignee: Eshcar Hillel
>            Priority: Critical
>             Fix For: 2.0.1
>
>         Attachments: HBASE-20628.branch-2.0.001.patch, 
> HBASE-20628.branch-2.0.001.patch, hits-20628.png
>
>
> Flushing memstore is taking too long. It looks like we are doing a bunch of 
> comparing out of a new facility in hbase2, the Segment scanner at flush time.
> Below is a patch from [~anoop.hbase]. I had a similar more hacky version. 
> Both undo the extra comparing we were seeing in perf tests.
> [~anastas] and [~eshcar]. Need your help please.
> As I read it, we are trying to flush the memstore snapshot (default, no IMC 
> case). There is only ever going to be one Segment involved (even if IMC is 
> enabled); the snapshot Segment. But the getScanners is returning a list (of 
> one)  Scanners and the scan is via the generic SegmentScanner which is all 
> about a bunch of stuff we don't need when doing a flush so it seems to do 
> more work than is necessary. It also supports scanning backwards which is not 
> needed when trying to flush memstore.
> Do you see a problem doing a version of Anoops patch (whether IMC or not)? It 
> makes a big difference in general throughput when the below patch is in 
> place. Thanks.
> {code}
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
>  
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> index cbd60e5da3..c3dd972254 100644
> --- 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> +++ 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreSnapshot.java
> @@ -40,7 +40,8 @@ public class MemStoreSnapshot implements Closeable {
>      this.cellsCount = snapshot.getCellsCount();
>      this.memStoreSize = snapshot.getMemStoreSize();
>      this.timeRangeTracker = snapshot.getTimeRangeTracker();
> -    this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +    //this.scanners = snapshot.getScanners(Long.MAX_VALUE, Long.MAX_VALUE);
> +    this.scanners = snapshot.getScannersForSnapshot();
>      this.tagsPresent = snapshot.isTagsPresent();
>    }
> diff --git 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
>  
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> index 70074bf3b4..279c4e50c8 100644
> --- 
> a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> +++ 
> b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/Segment.java
> @@ -33,6 +33,7 @@ import org.apache.hadoop.hbase.KeyValueUtil;
>  import org.apache.hadoop.hbase.io.TimeRange;
>  import org.apache.hadoop.hbase.util.Bytes;
>  import org.apache.hadoop.hbase.util.ClassSize;
> +import org.apache.hadoop.hbase.util.CollectionBackedScanner;
>  import org.apache.yetus.audience.InterfaceAudience;
>  import org.slf4j.Logger;
>  import 
> org.apache.hbase.thirdparty.com.google.common.annotations.VisibleForTesting;
> @@ -130,6 +131,10 @@ public abstract class Segment {
>      return Collections.singletonList(new SegmentScanner(this, readPoint, 
> order));
>    }
> +  public List<KeyValueScanner> getScannersForSnapshot() {
> +    return Collections.singletonList(new 
> CollectionBackedScanner(this.cellSet.get(), comparator));
> +  }
> +
>    /**
>     * @return whether the segment has any cells
>     */
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to