[jira] [Commented] (HBASE-14921) Memory optimizations
[ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395713#comment-15395713 ] Eshcar Hillel commented on HBASE-14921:
---
We are trying to reproduce the PE results you reported [~anoop.hbase] and [~ram_krish], and we need your help with the following questions:
1. What is the exact cluster configuration you are running on? Is it a single machine hosting both the master and 1 RS? What is the setting of the underlying HDFS? Again, NN and 1 DN on the same machine?
2. The --inmemoryCompaction flag you are using in PE is not in master, is it? Can you open a Jira to share the patch with this supplement so we don't need to re-implement it, or, even better, commit this code to master?
3. How much time is the experiment expected to run (approximately)?
One major drawback of the PE tool is that it does not allow benchmarking concurrent reads and writes. Once we are able to test 100% writes and have results for all settings, we should also compare the 50%-50% workload. But we can discuss this when we get there.

> Memory optimizations
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.0.0
> Reporter: Eshcar Hillel
> Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, HBASE-14921-V06-CAO.patch, InitialCellArrayMapEvaluation.pdf, IntroductiontoNewFlatandCompactMemStore.pdf, NewCompactingMemStoreFlow.pptx
>
> Memory optimizations including compressed format representation and offheap allocations

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HBASE-14921) Memory optimizations
[ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401076#comment-15401076 ] Eshcar Hillel commented on HBASE-14921:
---
Questions on the PE tests:
1. You set hbase.regionserver.global.memstore.size to 0.42. Did you also set hfile.block.cache.size to 0.38, or even lower?
2. Which setting did you use for the WAL: writeToWAL=true or writeToWAL=false?
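For reference, the two settings discussed above would go into hbase-site.xml roughly as below. The values are the ones from this thread; 0.42 + 0.38 keeps the combined memstore and block-cache heap share at 0.8, leaving the remaining heap for everything else.

```xml
<!-- hbase-site.xml fragment; values taken from the discussion above. -->
<property>
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.42</value>
</property>
<property>
  <name>hfile.block.cache.size</name>
  <value>0.38</value>
</property>
```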
[jira] [Commented] (HBASE-14921) Memory optimizations
[ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402087#comment-15402087 ] Eshcar Hillel commented on HBASE-14921:
---
Hi [~anoop.hbase], [~ram_krish],
Did you use SSD machines for the PE? So far, when running the suggested settings on an HDD machine, I get an insufficient-memory error even when running the default memstore, so I suspect this is due to an I/O bottleneck (the flushes are simply too slow). Can you share the configuration of the machine you run on?
[jira] [Updated] (HBASE-14921) Memory optimizations
[ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14921:
--
Status: Open (was: Patch Available)

> Memory optimizations
>
> Key: HBASE-14921
> URL: https://issues.apache.org/jira/browse/HBASE-14921
> Project: HBase
> Issue Type: Sub-task
> Affects Versions: 2.0.0
> Reporter: Eshcar Hillel
> Assignee: Anastasia Braginsky
> Attachments: CellBlocksSegmentInMemStore.pdf, CellBlocksSegmentinthecontextofMemStore(1).pdf, HBASE-14921-V01.patch, HBASE-14921-V02.patch, HBASE-14921-V03.patch, HBASE-14921-V04-CA-V02.patch, HBASE-14921-V04-CA.patch, HBASE-14921-V05-CAO.patch, HBASE-14921-V06-CAO.patch, HBASE-14921-V08-CAO.patch, HBASE-14921-V09-CAO.patch, HBASE-14921-V10-CAO.patch, HBASE-14921-V11-CAO.patch, InitialCellArrayMapEvaluation.pdf, IntroductiontoNewFlatandCompactMemStore.pdf, MemStoreSizes.pdf, MemstoreItrCountissue.patch, NewCompactingMemStoreFlow.pptx
>
> Memory optimizations including compressed format representation and offheap allocations
[jira] [Updated] (HBASE-14921) Memory optimizations
[ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14921:
--
Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15016:
--
Attachment: HBASE-15016-V05.patch

> StoreServices facility in Region
>
> Key: HBASE-15016
> URL: https://issues.apache.org/jira/browse/HBASE-15016
> Project: HBase
> Issue Type: Sub-task
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-15016-V01.patch, HBASE-15016-V02.patch, HBASE-15016-V03.patch, HBASE-15016-V04.patch, HBASE-15016-V05.patch, Regioncounters.pdf, suggestion.patch
>
> The default implementation of a memstore ensures that between two flushes the memstore size increases monotonically. Supporting new memstores that store data in different formats (specifically, compressed), or that allow eliminating data redundancies in memory (e.g., via compaction), means that the size of the data stored in memory can decrease even between two flushes. This requires memstores to have access to facilities that manipulate region counters and synchronization.
> This subtask introduces a new region interface, StoreServices, through which store components can access these facilities.
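As an illustration of the idea, such a region-level facade might look roughly like the sketch below. The interface and method names here are assumptions for illustration, not the actual HBASE-15016 patch: the point is that store components see a narrow region-level interface through which they can adjust the region's memstore-size accounting (e.g., a negative delta after an in-memory compaction shrinks a segment).

```java
// Illustrative sketch only: names are assumptions, not the HBASE-15016 patch.
public class StoreServicesSketch {

  // The narrow region-level facade a store component would see.
  interface StoreServices {
    void addMemstoreSize(long delta); // delta may be negative after compaction
    long getMemstoreSize();
  }

  // Trivial stand-in for the region-side implementation.
  static class RegionServices implements StoreServices {
    private long memstoreSize;
    @Override public void addMemstoreSize(long delta) { memstoreSize += delta; }
    @Override public long getMemstoreSize() { return memstoreSize; }
  }
}
```

With this shape, a compacting memstore can report a size decrease between flushes without the region assuming monotonic growth.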
[jira] [Updated] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15016:
--
Attachment: HBASE-15016-V06.patch

Fixed style issues and addressed Anoop's comment.
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166831#comment-15166831 ] Eshcar Hillel commented on HBASE-15016:
---
Excellent! Moving on to work on sub-task #3.

> StoreServices facility in Region
>
> Key: HBASE-15016
> URL: https://issues.apache.org/jira/browse/HBASE-15016
> Project: HBase
> Issue Type: Sub-task
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Fix For: 2.0.0
>
> Attachments: HBASE-15016-V01.patch, HBASE-15016-V02.patch, HBASE-15016-V03 (1).patch, HBASE-15016-V03.patch, HBASE-15016-V04.patch, HBASE-15016-V05.patch, HBASE-15016-V05.patch, HBASE-15016-V06.patch, Regioncounters.pdf, suggestion.patch
[jira] [Created] (HBASE-15359) Simplifying Segment hierarchy
Eshcar Hillel created HBASE-15359:
--
Summary: Simplifying Segment hierarchy
Key: HBASE-15359
URL: https://issues.apache.org/jira/browse/HBASE-15359
Project: HBase
Issue Type: Sub-task
Reporter: Eshcar Hillel
Assignee: Eshcar Hillel

Now that it is clear that no memstore segment will be implemented as an HFile, and that all segments store their data in some representation of CellSet (skip-list or flat), the segment hierarchy can be much simplified. The attached patch includes only 3 classes in the hierarchy:
Segment - comprises most of the state and implementation
MutableSegment - extends the API with add and rollback functionality
ImmutableSegment - extends the API with a key-value scanner for snapshots
SegmentScanner is the scanner for all types of segments.
In addition, the option to roll back an immutable segment in the memstore is disabled.
This code would allow us to make progress independently in the compaction subtask (HBASE-14920) and the flat index representation subtask (HBASE-14921). It also means that the new immutable segment can reuse the existing SegmentScanner, instead of implementing a new scanner.
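The three-class hierarchy described above could be sketched roughly as follows. This is not the actual patch: state and methods are reduced to a minimum, and a plain sorted map stands in for the CellSet (skip-list or flat) representation.

```java
// Rough sketch of the simplified Segment hierarchy; not the actual HBase code.
import java.util.concurrent.ConcurrentSkipListMap;

public class SegmentSketch {

  // Segment - comprises most of the shared state and implementation.
  static abstract class Segment {
    // Stand-in for the CellSet (skip-list or flat) representation.
    protected final ConcurrentSkipListMap<String, String> cells =
        new ConcurrentSkipListMap<>();
    int getCellsCount() { return cells.size(); }
  }

  // MutableSegment - extends the API with add functionality.
  static class MutableSegment extends Segment {
    void add(String key, String value) { cells.put(key, value); }
  }

  // ImmutableSegment - a frozen snapshot exposing a scanner over its cells.
  static class ImmutableSegment extends Segment {
    ImmutableSegment(MutableSegment snapshot) { cells.putAll(snapshot.cells); }
    Iterable<String> scan() { return cells.keySet(); }
  }
}
```

A single SegmentScanner over the shared `cells` state is what lets both segment kinds reuse one scanner implementation instead of each defining its own.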
[jira] [Updated] (HBASE-15359) Simplifying Segment hierarchy
[ https://issues.apache.org/jira/browse/HBASE-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15359:
--
Attachment: HBASE-14918-FIX-SEGMENT.patch

> Simplifying Segment hierarchy
> -----------------------------
>
> Key: HBASE-15359
> URL: https://issues.apache.org/jira/browse/HBASE-15359
> Project: HBase
> Issue Type: Sub-task
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Fix For: 0.98.18
>
> Attachments: HBASE-14918-FIX-SEGMENT.patch
[jira] [Updated] (HBASE-15359) Simplifying Segment hierarchy
[ https://issues.apache.org/jira/browse/HBASE-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15359:
--
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-15359) Simplifying Segment hierarchy
[ https://issues.apache.org/jira/browse/HBASE-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173428#comment-15173428 ] Eshcar Hillel commented on HBASE-15359:
---
In the umbrella jira [~stack] says
bq. We want CellBlocks right? Not HFile blocks?
and
bq. The HFile instance (with all its great possibilities) seems like something too complex for this simple task of storing the data flat.
Agree. From this I conjecture that segments will not be implemented as an HFile instance.
rollback is declared in Store.java and MemStore.java and implemented in HStore and AbstractMemStore. In the meanwhile it is also supported by MutableSegment, but it can be removed once it is cleaned up everywhere else.
[jira] [Updated] (HBASE-15359) Simplifying Segment hierarchy
[ https://issues.apache.org/jira/browse/HBASE-15359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15359:
--
Attachment: HBASE-15359-V01.patch
[jira] [Updated] (HBASE-14920) Compacting Memstore
[ https://issues.apache.org/jira/browse/HBASE-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14920:
--
Attachment: HBASE-14920-V01.patch

> Compacting Memstore
> -------------------
>
> Key: HBASE-14920
> URL: https://issues.apache.org/jira/browse/HBASE-14920
> Project: HBase
> Issue Type: Sub-task
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-14920-V01.patch
>
> Implementation of a new compacting memstore with non-optimized immutable segment representation
[jira] [Updated] (HBASE-14920) Compacting Memstore
[ https://issues.apache.org/jira/browse/HBASE-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14920:
--
Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203266#comment-15203266 ] Eshcar Hillel commented on HBASE-14918:
---
A new patch is attached to task HBASE-14920 - the new compacting memstore implementation. The patch is not small ;) please review.

> In-Memory MemStore Flush and Compaction
> ---------------------------------------
>
> Key: HBASE-14918
> URL: https://issues.apache.org/jira/browse/HBASE-14918
> Project: HBase
> Issue Type: Umbrella
> Affects Versions: 2.0.0
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: CellBlocksSegmentDesign.pdf, MSLABMove.patch
>
> A memstore serves as the in-memory component of a store unit, absorbing all updates to the store. From time to time these updates are flushed to a file on disk, where they are compacted (by eliminating redundancies) and compressed (i.e., written in a compressed format to reduce their storage size).
> We aim to speed up data access, and therefore suggest applying an in-memory memstore flush: that is, flushing the active in-memory segment into an intermediate buffer where it can still be accessed by the application. Data in the buffer is subject to compaction and can be stored in any format that allows it to take up less space in RAM. The less space the buffer consumes, the longer it can reside in memory before data is flushed to disk, resulting in better performance.
> Specifically, the optimization is beneficial for workloads with medium-to-high key churn, which incur many redundant cells - for example, persistent messaging.
> We suggest structuring the solution as 4 subtasks (respectively, patches):
> (1) Infrastructure - refactoring of the MemStore hierarchy, introducing the segment (StoreSegment) as a first-class citizen, and decoupling the memstore scanner from the memstore implementation;
> (2) Adding a StoreServices facility at the region level to allow memstores to update region counters and access the region-level synchronization mechanism;
> (3) Implementation of a new memstore (CompactingMemstore) with a non-optimized immutable segment representation; and
> (4) Memory optimization, including compressed format representation and off-heap allocations.
> This Jira continues the discussion in HBASE-13408. Design documents, evaluation results and previous patches can be found in HBASE-13408.
[jira] [Updated] (HBASE-14920) Compacting Memstore
[ https://issues.apache.org/jira/browse/HBASE-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14920:
--
Attachment: HBASE-14920-V02.patch
[jira] [Commented] (HBASE-14920) Compacting Memstore
[ https://issues.apache.org/jira/browse/HBASE-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217973#comment-15217973 ] Eshcar Hillel commented on HBASE-14920:
---
Hi [~stack], I will get back with answers to your questions/comments. First, however, I encountered an issue I would like to consult about.
During in-memory compaction we record the minimum sequence id of the segment that results from the compaction. We then use this sequence id to update the WAL sequence-id accounting so that old WAL files can be discarded. What I discovered is that there are cells that are not regular key-value cells, which reside in a segment and have a very small sequence id (e.g., 8). I guess these cells are added automatically by the store, and not as a result of a client operation. These cells fall through the compaction filter, so they always land in the resulting segment, which means the minimum sequence id of the segment is always small. As a consequence, WAL files are not discarded (and at some point may exceed the limit on the number of WAL files - currently 33).
A possible solution would be to ignore these cells when doing the bookkeeping of the minimal sequence id; namely, only consider cells of type Put or Delete when setting the minimum seq id of the segment. Does this seem a reasonable solution?
Here is the full list of possible types: Minimum, Put, Delete, DeleteFamilyVersion, DeleteColumn, DeleteFamily, Maximum.

> Compacting Memstore
>
> Key: HBASE-14920
> URL: https://issues.apache.org/jira/browse/HBASE-14920
> Project: HBase
> Issue Type: Sub-task
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-14920-V01.patch, HBASE-14920-V02.patch, move.to.junit4.patch
>
> Implementation of a new compacting memstore with non-optimized immutable segment representation
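The proposed bookkeeping could look roughly like the sketch below. This is illustrative, not the actual HBase code: `Cell` here is a stand-in, and the rule implemented is the one suggested above, i.e. skip the internally-added Minimum/Maximum marker cells (which may carry tiny sequence ids like 8) and take the minimum only over the Put/Delete-type cells.

```java
// Sketch of the proposed min-seq-id bookkeeping; names are illustrative.
import java.util.List;

public class MinSeqIdSketch {

  enum Type { Minimum, Put, Delete, DeleteFamilyVersion, DeleteColumn, DeleteFamily, Maximum }

  static class Cell {
    final Type type;
    final long seqId;
    Cell(Type type, long seqId) { this.type = type; this.seqId = seqId; }
  }

  // Minimum sequence id over client-written (Put/Delete-type) cells only;
  // the internal marker cells no longer pin old WAL files.
  static long minClientSeqId(List<Cell> cells) {
    long min = Long.MAX_VALUE;
    for (Cell c : cells) {
      if (c.type != Type.Minimum && c.type != Type.Maximum) {
        min = Math.min(min, c.seqId);
      }
    }
    return min;
  }
}
```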
[jira] [Commented] (HBASE-14920) Compacting Memstore
[ https://issues.apache.org/jira/browse/HBASE-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219382#comment-15219382 ] Eshcar Hillel commented on HBASE-14920:
---
I recently updated YCSB to also support delete operations. The delete operations in the benchmark I ran ended up as cells of type *DeleteFamily*. I would expect tombstones to be of type Delete. This could be an issue with my YCSB client, so we can ignore it for the moment. Has anyone else had this problem before? What exactly is the effect of a cell of type DeleteFamily in a normal disk compaction? Does it remove all the entries in the column family?
[~anoop.hbase] The patch is in RB (there is a link to it in the Jira). An in-memory compaction removes entries from memory, much like a flush to disk would. The only reason to keep records in the WAL is that data is *not yet* persistent on disk. If we remove data from memory during in-memory compaction, it will never reach disk (since a more recent version already exists), so there is no point in keeping the corresponding records in the WAL, and they can be removed.
To summarize this point: in the case of a compacting memstore, tombstones are not removed during in-memory compaction (which is the equivalent of a minor compaction); they need to wait until they hit disk to be removed in a major compaction.
[jira] [Commented] (HBASE-14920) Compacting Memstore
[ https://issues.apache.org/jira/browse/HBASE-14920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15219648#comment-15219648 ] Eshcar Hillel commented on HBASE-14920:
---
OK, thanks [~anoop.hbase] for clarifying this. I will work on a new patch to incorporate the suggestions above. I will apply what is possible and explain the changes that are not applicable.
[~stack] Just to be clear, you suggested changing the class name (and some methods) to CompactionMemStore (instead of CompactingMemStore)?
[jira] [Updated] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15016:
--
Attachment: Regioncounters.pdf

The attached file (Regioncounters.pdf) shows a view of the region counters before and after the patch, and can help in understanding how they affect the flush decisions.
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085376#comment-15085376 ] Eshcar Hillel commented on HBASE-15016:
---
There are 4 decisions to make: 1) when to do an in-memory flush; 2) when to do an in-memory compaction; 3) when to flush to disk; 4) which stores to flush to disk.
One piece of feedback we got when working on HBASE-13408 was that decisions 1 and 2 should be encapsulated and managed within the memstore. This is reasonable, since the memstore holds all the information about sizes, duplications, etc. What you are suggesting now is to add a 'warning' message, sent by the region to the stores, that would trigger an in-memory flush and/or a compaction.
Here is a scenario we need to avoid: having a compaction pipeline of size 80MB, so that whenever the active segment reaches only a few MBs a warning message is sent, triggering an in-memory flush and compaction. Then a big segment (80MB) is merged with a small segment (3MB), creating a big segment again (say, around 80MB after removing duplication). If this happens over and over again, it is a waste of CPU time and also generates a lot of work for the GC. [This is somewhat similar to the small-files problem FlushLargeStoresPolicy tries to resolve.]
Another issue: say you have several stores in a region, at least one default memstore (A) and one compacting memstore (B). Assume they both exceed 16MB, and the other memstores are below 16MB. When the region triggers a flush to disk, the current policy chooses to flush A and B. It is reasonable to flush A, since there is no other way to reduce its size; however, is it reasonable to flush B? If it stays in memory longer, it has a chance to reduce its size without flushing to disk.
I just mention these issues so you can consider them when preparing your patch.
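One way the repeated big/small merge scenario above could be avoided is a simple threshold guard: only trigger the in-memory flush-and-compact when the active segment is a meaningful fraction of what already sits in the pipeline. The sketch below is a suggestion under assumed names and thresholds, not anything from the actual patches.

```java
// Sketch of a guard against repeatedly merging a large compacted pipeline
// with a tiny active segment; the class name, method, and 0.25 threshold
// are all assumptions for illustration.
public class InMemoryFlushGuard {

  static final double MIN_ACTIVE_FRACTION = 0.25; // assumed tuning knob

  static boolean shouldCompact(long activeSizeBytes, long pipelineSizeBytes) {
    if (pipelineSizeBytes == 0) {
      return activeSizeBytes > 0; // nothing in the pipeline yet
    }
    // Merging an 80MB pipeline with a 3MB active segment over and over
    // wastes CPU and churns the GC; require the active segment to carry
    // its weight first.
    return activeSizeBytes >= MIN_ACTIVE_FRACTION * pipelineSizeBytes;
  }
}
```

Under this guard, the 80MB-pipeline/3MB-segment case from the scenario would not trigger a merge.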
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15089127#comment-15089127 ] Eshcar Hillel commented on HBASE-15016: --- Ok, so I will work on a patch that includes a new flush policy class which also considers the data in the table/column descriptors when making decisions; the flush size will also be determined based on this data. Nice :)
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090776#comment-15090776 ] Eshcar Hillel commented on HBASE-15016: --- Sure, it would help if you have a clean version. As a matter of fact, perhaps a clean version of your patch is all that is needed for this sub-task, while the flush policy (which is tightly coupled with compacting memstores) and additional changes to the column descriptor, like adding an isCompacting() method, should be part of sub-task #3, where we actually declare the new class. What do you say?
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090994#comment-15090994 ] Eshcar Hillel commented on HBASE-14919: --- It seems that this patch is ready to be committed. [~stack], do you prefer to wait until after patch #2 is committed? > Infrastructure refactoring > -- > > Key: HBASE-14919 > URL: https://issues.apache.org/jira/browse/HBASE-14919 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Attachments: HBASE-14919-V01.patch, HBASE-14919-V01.patch, > HBASE-14919-V02.patch, HBASE-14919-V03.patch, HBASE-14919-V04.patch, > HBASE-14919-V04.patch > > > Refactoring the MemStore hierarchy, introducing segment (StoreSegment) as > first-class citizen and decoupling memstore scanner from the memstore > implementation.
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092051#comment-15092051 ] Eshcar Hillel commented on HBASE-14919: --- I am not sure how updateLowestUnflushedSequenceIdInWal is related to HBASE-15082; I might be missing some context. Can you elaborate, [~anoop.hbase]? The motivation behind updateLowestUnflushedSequenceIdInWal is to have the WAL cover all entries still in memory (this is required for correctness) while truncating the WAL so it does not get too long (to bound MTTR). It is required for the compacting memstore, which cherry-picks entries to be removed from memory without flushing them to disk. Since this is done in the background and not in a batched mode, the new method is invoked whenever the WAL needs to be updated.
[jira] [Updated] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V05.patch New patch after rebase and changes following Anoop's comments.
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15104887#comment-15104887 ] Eshcar Hillel commented on HBASE-14919: --- Please advise: I think I know how to handle the check-style, white-space and find-bugs issues in hbase-server, but I'm not sure how to handle the errors with Hadoop v2.x.x and the 82 find-bugs warnings. Can someone help here?
[jira] [Updated] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V06.patch
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110274#comment-15110274 ] Eshcar Hillel commented on HBASE-14918: --- I submitted a patch in task 1 two days ago but haven't received any QA report since. Any problems with the QA system? > In-Memory MemStore Flush and Compaction > --- > > Key: HBASE-14918 > URL: https://issues.apache.org/jira/browse/HBASE-14918 > Project: HBase > Issue Type: Umbrella >Affects Versions: 2.0.0 >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Fix For: 0.98.18 > > > A memstore serves as the in-memory component of a store unit, absorbing all > updates to the store. From time to time these updates are flushed to a file > on disk, where they are compacted (by eliminating redundancies) and > compressed (i.e., written in a compressed format to reduce their storage > size). > We aim to speed up data access, and therefore suggest applying in-memory > memstore flush. That is, to flush the active in-memory segment into an > intermediate buffer where it can be accessed by the application. Data in the > buffer is subject to compaction and can be stored in any format that allows > it to take up less space in RAM. The less space the buffer consumes, the > longer it can reside in memory before data is flushed to disk, resulting in > better performance. > Specifically, the optimization is beneficial for workloads with > medium-to-high key churn which incur many redundant cells, like persistent > messaging. > We suggest structuring the solution as 4 subtasks (respectively, patches): > (1) Infrastructure - refactoring of the MemStore hierarchy, introducing > segment (StoreSegment) as first-class citizen, and decoupling memstore > scanner from the memstore implementation; > (2) Adding a StoreServices facility at the region level to allow memstores to > update region counters and access the region-level synchronization mechanism; > (3) Implementation of a new memstore (CompactingMemstore) with non-optimized > immutable segment representation, and > (4) Memory optimization including compressed format representation and > off-heap allocations. > This Jira continues the discussion in HBASE-13408. > Design documents, evaluation results and previous patches can be found in > HBASE-13408.
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110790#comment-15110790 ] Eshcar Hillel commented on HBASE-14918: --- I went through the code in HBASE-10713, and it seems we can come up with a design for task #4: a compacted memstore which stores the data in a flat format (in that issue they are called CellBlocks) instead of in a java skip-list. [~anoop.hbase], would you be interested in collaborating on this? If you are, we can schedule an off-list chat to discuss the details of the design.
[jira] [Updated] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V07.patch
[jira] [Commented] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15114306#comment-15114306 ] Eshcar Hillel commented on HBASE-14919: --- The find-bugs warning is as follows {code} org.apache.hadoop.hbase.ipc.RpcServer$Connection.close() might ignore java.lang.Exception {code} but this patch did not change the RpcServer file, so I guess this one is also unrelated to the patch.
[jira] [Updated] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14918: -- Attachment: CellBlocksSegmentDesign.pdf Attached an initial suggestion for the design of a CellBlocks segment, where Cells are stored in a flat format. Any comments are welcome. Note that at this point the segment does not support compression, but I assume the format of cell blocks is "compression-friendly".
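As a toy illustration of the flat-format idea (invented names; this is not the attached CellBlocks design), an immutable segment can keep its keys in a single sorted array and answer point lookups with binary search, avoiding the per-entry node objects of a skip-list:

```java
import java.util.Arrays;

// Toy illustration of the flat-format idea (invented names, not the attached
// design): an immutable segment keeps its keys in one sorted array and serves
// point reads with binary search, avoiding per-entry skip-list node objects.
class FlatSegmentSketch {
  private final String[] sortedKeys;

  FlatSegmentSketch(String[] keys) {
    this.sortedKeys = keys.clone();
    Arrays.sort(this.sortedKeys); // sorted once; the segment never mutates
  }

  boolean contains(String key) {
    return Arrays.binarySearch(sortedKeys, key) >= 0;
  }
}
```

Because the segment is immutable, the array can be built once at construction; a flat layout like this is also friendlier to later compression than a linked structure.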
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121094#comment-15121094 ] Eshcar Hillel commented on HBASE-14918: --- Thanks [~anoop.hbase]. I don't see how you can move MSLAB to the HStore level. In the first patch, MSLAB is used in the segment to allocate the byte range (in maybeCloneWithAllocator()), and it also does bookkeeping of the scanners which access the MSLAB (with inc/decScannersCount()) so it can manage the deallocation of buffers when no scanners can access them. This is also the case in master, but there the methods are in the scope of DefaultMemStore and the MemStoreScanner. How would you suggest moving it to HStore? Why do you think it is better there and not inside the segment?
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121203#comment-15121203 ] Eshcar Hillel commented on HBASE-14918: --- Ok. But even in the HBASE-15180 patch, DefaultMemStore still has the attributes {code} volatile MemStoreLAB allocator; volatile MemStoreLAB snapshotAllocator; {code} and MemStoreScanner still has the attributes {code} volatile MemStoreLAB allocatorAtCreation; volatile MemStoreLAB snapshotAllocatorAtCreation; {code} So either I'm missing something or we are talking about two different things.
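The scanner bookkeeping described in these comments (inc/decScannersCount() guarding MSLAB deallocation) can be sketched roughly as follows. This is an invented simplification, not the actual MSLAB code; the class and the freeBuffers() stand-in are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Invented simplification of the bookkeeping described above (not the actual
// MSLAB code): the segment counts active scanners and releases its buffers
// only once it is closed and no scanner can still reference them.
class ScannerRefCountSketch {
  private final AtomicInteger scanners = new AtomicInteger();
  private volatile boolean closed = false;
  private volatile boolean freed = false;

  void incScannersCount() { scanners.incrementAndGet(); }

  void decScannersCount() {
    // Last scanner out frees the buffers, but only after the segment closed.
    if (scanners.decrementAndGet() == 0 && closed) {
      freeBuffers();
    }
  }

  void close() {
    closed = true;
    if (scanners.get() == 0) {
      freeBuffers();
    }
  }

  boolean isFreed() { return freed; }

  private void freeBuffers() { freed = true; } // stand-in for MSLAB deallocation
}
```

Keeping this counter inside the segment (rather than at the HStore level) ties buffer lifetime to the data that was allocated from those buffers, which is the point argued in the comment above.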
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121385#comment-15121385 ] Eshcar Hillel commented on HBASE-14918: --- I reviewed the mslab-move patch. Software-engineering-wise, I am not at all convinced that the right place for MSLAB is at the HStore level. The compacting memstore is an example in which cells are allocated at the memstore level and not at the store level. But more important is what you say about off-heap memory. I have no experience with off-heaping. Can you please elaborate on why the suggested design cannot be off-heap, and what is needed to allow it to be off-heap? In addition, you refer to the write-path, but actually the write-path goes through the mutable segment, which stores the data in a CSLM format. Only reads and scans access the cell block. It is good we have this discussion at this point since it relates to the design of task #4, and can also affect task #3. However, [~stack], is there anything that prevents committing the patch of task #1? Is it not committed due to the MSLAB issue? IMO, the MSLAB is orthogonal to task #1. If it is decided that it needs to move, then it is possible to do so even after the patch.
[jira] [Commented] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15121515#comment-15121515 ] Eshcar Hillel commented on HBASE-14919: --- No problem, I can explain again: When an external flush is invoked (a call to snapshot()), the data is flushed to disk, but some data may still be left in memory. The sequenceId is passed to the memstore so it can do bookkeeping of sequence ids and compute an approximation of the lowest sequence id (oldest entry) still in memory. updateLowestUnflushedSequenceIdInWal() is called when completing a flush to disk and when completing an in-memory compaction. These are the two events in which the content of memory changes. This method updates the WAL with the lowest sequence id that is still in memory, as explained above.
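A rough sketch of the bookkeeping just described, with invented names: the memstore tracks the sequence ids of segments still in memory, and the smallest of them is the bound below which the WAL can safely be truncated.

```java
import java.util.TreeSet;

// Rough sketch with invented names (not HBase code): track the sequence ids
// of segments still in memory; the smallest one is the bound below which the
// WAL can be truncated. Long.MAX_VALUE means nothing is left in memory.
class LowestUnflushedSeqIdSketch {
  private final TreeSet<Long> inMemorySeqIds = new TreeSet<>();

  synchronized void segmentAdded(long seqId) { inMemorySeqIds.add(seqId); }

  // Called when a flush to disk or an in-memory compaction removes a segment.
  synchronized void segmentRemoved(long seqId) { inMemorySeqIds.remove(seqId); }

  synchronized long lowestUnflushedSeqId() {
    return inMemorySeqIds.isEmpty() ? Long.MAX_VALUE : inMemorySeqIds.first();
  }
}
```

Recomputing this bound after a disk flush or an in-memory compaction is exactly the two-event update pattern described in the comment above: those are the only moments the set of in-memory entries shrinks.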
[jira] [Updated] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V08.patch
[jira] [Updated] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V09.patch
[jira] [Updated] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V10.patch
[jira] [Commented] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132267#comment-15132267 ] Eshcar Hillel commented on HBASE-14919:

bq. Should we set -1 when we create an instance of this class? Define -1 as NO_SNAPSHOT_ID?
Definitely, yes.

bq. This javadoc seems wrong.
I will fix this.

bq. So, an ImmutableSegment takes a MutableSegment on construction? Can you skip the pupa ImmutableSegmentAdapter? (It is the only implementation of ImmutableSegment.)
Task #4 in the umbrella issue implements another ImmutableSegment (CellBlocksSegment), which does not take a MutableSegment on construction. Would the name ImmutableSegmentWrapper be better?

bq. Reading MutableSegment and looking at the methods it has, why can't they all be in ImmutableSegment?
We could simplify the design by removing the distinction between mutable and immutable segments, but that would come at the cost of implementing an unnecessary API, possibly inefficiently. Consider tailSet(firstCell); the same holds for first() and getComparator(). tailSet is used in AbstractMemStore::upsert() and AbstractMemStore::updateColumnValue(), where it is applied only to the active (mutable) segment, and in methods of MutableCellSetSegment and MutableCellSetSegmentScanner. No use case needs it on an immutable segment. Moreover, its implementation in CellBlocksSegment would be inefficient, since it would incur a costly traversal over all cell blocks plus allocation of many new objects to store in the resulting sorted set. It is possible to do it that way, but we believe exposing an unnecessary API with a costly implementation is not advisable.

bq. And why this subset of Set methods in MutableSegment?
We believe it is good practice to expose only the minimal necessary functionality.

bq. In ImmutableSegment, there is this method getScannerForMemStoreSnapshot(). As I read it, I am getting a Scanner on a MemStoreSnapshot...
No, this is incorrect. This method returns a KeyValueScanner that is used *by* MemStoreSnapshot, but _it scans the cells in the segment that generated it_. Would the names getKeyValueScanner() and, for the other method, getSegmentScanner() be better?

bq. What is special about MutableCellSetSegmentScanner? When would the Scanner on a MutableSegment differ from a Scanner on an ImmutableSegment? In other words, could we just have one SegmentScanner implementation that works for both Segment types?
Each scanner depends on the segment it is scanning. For example, MutableCellSetSegmentScanner::backwardSeek() seeks forward and then goes backward:
{code}
public boolean backwardSeek(Cell key) throws IOException {
  seek(key);
  if (peek() == null || segment.compareRows(peek(), key) > 0) {
    return seekToPreviousRow(key);
  }
  return true;
}
{code}
There is no reason for CellBlocksSegment to implement it this way, since its cells are contiguous and therefore allow much simpler traversal.
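To illustrate the point about contiguous layouts, here is a hypothetical sketch (not HBase code; the class name and the use of a plain sorted long[] standing in for a cell array are illustrative assumptions): a flat, sorted segment can serve a backward seek with a single binary search, instead of the seek-forward-then-rewind dance above.

```java
import java.util.Arrays;

public class FlatSegmentSeek {
    /**
     * Backward seek on a contiguous, sorted "segment": returns the index of
     * the last element <= key, or -1 if every element is greater than key.
     */
    public static int backwardSeek(long[] sortedCells, long key) {
        int pos = Arrays.binarySearch(sortedCells, key);
        if (pos >= 0) {
            return pos;        // exact hit
        }
        // binarySearch returns -(insertionPoint) - 1 on a miss;
        // the predecessor sits at insertionPoint - 1.
        return -pos - 2;
    }

    public static void main(String[] args) {
        long[] cells = {10, 20, 30, 40};
        System.out.println(backwardSeek(cells, 25)); // predecessor of 25 is 20 -> index 1
        System.out.println(backwardSeek(cells, 5));  // nothing <= 5 -> -1
    }
}
```

No per-cell object allocation and no forward scan is needed, which is the efficiency argument made above for the flat representation.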
[jira] [Updated] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919:
Attachment: HBASE-14919-V11.patch
[jira] [Updated] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919:
Attachment: HBASE-14919-V12.patch
[jira] [Commented] (HBASE-14919) Infrastructure refactoring for In-Memory Flush
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15146472#comment-15146472 ] Eshcar Hillel commented on HBASE-14919:
[~stack] I see we have some problems with the commit. Anything I can do to help?
(Fix For: 2.0.0)
[jira] [Updated] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15016:
Attachment: HBASE-15016-V04.patch

Attaching a new patch. [~stack], sorry if I took over your task, but this is a blocker for task #3. The patch is very thin and follows what we discussed and agreed upon. All issues related to compaction, including a new flush policy, are deferred to task #3.

> StoreServices facility in Region
>
> Key: HBASE-15016
> URL: https://issues.apache.org/jira/browse/HBASE-15016
> Project: HBase
> Issue Type: Sub-task
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: HBASE-15016-V01.patch, HBASE-15016-V02.patch, HBASE-15016-V03.patch, HBASE-15016-V04.patch, Regioncounters.pdf, suggestion.patch
>
> The default implementation of a memstore ensures that between two flushes the memstore size increases monotonically. Supporting new memstores that store data in different formats (specifically, compressed), or that allow eliminating data redundancies in memory (e.g., via compaction), means that the size of the data stored in memory can decrease even between two flushes. This requires memstores to have access to facilities that manipulate region counters and synchronization.
> This subtask introduces a new region interface, StoreServices, through which store components can access these facilities.
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15156994#comment-15156994 ] Eshcar Hillel commented on HBASE-15016:

Looking at the code, I'm trying to understand whether using the version numbers in the cells is good enough for updating the WAL. From what I see now it seems that it is; the solution we suggested before was complicated and required getting the WAL sequence number from the region, but actually the WAL sequence ids are stamped into the cells in the HRegion method processRowsWithLocks(), when performing an atomic mutation in the region. I hope I am not missing anything. Is this the right way to describe it?
So, to your question [~stack]: I will remove the getWalSequenceId() method from RegionServicesForStores. HRegion already implements 3(!) interfaces. I prefer not to add a 4th interface and instead use composition for RegionServicesForStores. But if you insist, I can make it an interface.
[jira] [Commented] (HBASE-15560) TinyLFU-based BlockCache
[ https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15500618#comment-15500618 ] Eshcar Hillel commented on HBASE-15560:

Hi [~benmanns], I left some comments in RB. It seems TinyLFU ignores the inMemory flag. Is this correct? It might be OK given that the caching policy is more sophisticated than LRU, but can you be explicit about this issue? Also, given that in the patch LRU is the default policy, it seems no new tests were added to test TinyLFU.

> TinyLFU-based BlockCache
>
> Key: HBASE-15560
> URL: https://issues.apache.org/jira/browse/HBASE-15560
> Project: HBase
> Issue Type: Improvement
> Components: BlockCache
> Affects Versions: 2.0.0
> Reporter: Ben Manes
> Assignee: Ben Manes
> Attachments: HBASE-15560.patch, tinylfu.patch
>
> LruBlockCache uses the Segmented LRU (SLRU) policy to capture frequency and recency of the working set. It achieves concurrency by using an O(n) background thread to prioritize the entries and evict. Accessing an entry is O(1): a hash table lookup, recording its logical access time, and setting a frequency flag. A write is performed in O(1) time by updating the hash table and triggering an async eviction thread. This provides ideal concurrency and minimizes latencies by penalizing the thread instead of the caller. However, the policy does not age the frequencies and may not be resilient to various workload patterns.
> W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the frequency in a counting sketch, ages periodically by halving the counters, and orders entries by SLRU. An entry is discarded by comparing the frequency of the new arrival (candidate) to the SLRU's victim, and keeping the one with the higher frequency. This allows the operations to be performed in O(1) time and, through the use of a compact sketch, a much larger history is retained beyond the current working set. In a variety of real-world traces the policy had [near optimal hit rates|https://github.com/ben-manes/caffeine/wiki/Efficiency].
> Concurrency is achieved by buffering and replaying the operations, similar to a write-ahead log. A read is recorded into a striped ring buffer and writes go to a queue. The operations are applied in batches under a try-lock by an asynchronous thread, thereby tracking the usage pattern without incurring high latencies ([benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks#server-class]).
> In YCSB benchmarks the results were inconclusive. For a large cache (99% hit rates) the two caches have near identical throughput and latencies, with LruBlockCache narrowly winning. At medium and small cache sizes, TinyLFU had a 1-4% hit rate improvement and therefore lower latencies. The lackluster result is because a synthetic Zipfian distribution is used, on which SLRU performs optimally. In a more varied, real-world workload we'd expect to see improvements from being able to make smarter predictions.
> The provided patch implements BlockCache using the [Caffeine|https://github.com/ben-manes/caffeine] caching library (see the HighScalability [article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]).
> Edward Bortnikov and Eshcar Hillel have graciously provided guidance for evaluating this patch ([github branch|https://github.com/ben-manes/hbase/tree/tinylfu]).
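As a toy model of the admission policy described above (assumptions: a single hash function and plain int counters standing in for the real CountMin sketch with several hashes and 4-bit counters; this is a sketch of the idea, not the Caffeine implementation):

```java
public class TinyLfuSketch {
    final int[] counters;    // frequency histogram, indexed by key hash
    final int sampleSize;    // after this many recordings, age the history
    int ops;

    public TinyLfuSketch(int width, int sampleSize) {
        this.counters = new int[width];
        this.sampleSize = sampleSize;
    }

    /** Record one access; saturate like a 4-bit counter would. */
    public void record(Object key) {
        int i = Math.floorMod(key.hashCode(), counters.length);
        if (counters[i] < 15) counters[i]++;
        if (++ops == sampleSize) reset();
    }

    public int frequency(Object key) {
        return counters[Math.floorMod(key.hashCode(), counters.length)];
    }

    /** Admission: keep whichever of candidate/victim is more frequent. */
    public boolean admit(Object candidate, Object victim) {
        return frequency(candidate) > frequency(victim);
    }

    /** Periodic aging: halve every counter so stale history decays. */
    void reset() {
        for (int i = 0; i < counters.length; i++) counters[i] /= 2;
        ops = 0;
    }

    public static void main(String[] args) {
        TinyLfuSketch sketch = new TinyLfuSketch(1024, 1000);
        for (int i = 0; i < 5; i++) sketch.record("hot");
        sketch.record("cold");
        System.out.println(sketch.admit("hot", "cold")); // true: the hot key wins admission
    }
}
```

The halving in reset() is what keeps the policy adaptive: a key that was hot long ago steadily loses its advantage over the current working set.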
[jira] [Commented] (HBASE-16643) Reverse scanner heap creation may not allow MSLAB closure due to inproper ref counting of segments
[ https://issues.apache.org/jira/browse/HBASE-16643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509182#comment-15509182 ] Eshcar Hillel commented on HBASE-16643:

Hi [~ram_krish], could you please add a link to Review Board so it is easier to review the patch? Thanks.

> Reverse scanner heap creation may not allow MSLAB closure due to inproper ref counting of segments
>
> Key: HBASE-16643
> URL: https://issues.apache.org/jira/browse/HBASE-16643
> Project: HBase
> Issue Type: Bug
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Critical
> Fix For: 2.0.0
> Attachments: HBASE-16643.patch
>
> In the reverse scanner case, while doing 'initBackwardHeapIfNeeded' in MemstoreScanner to set up the backward heap, we do:
{code}
if ((backwardHeap == null) && (forwardHeap != null)) {
  forwardHeap.close();
  forwardHeap = null;
  // before building the heap seek for the relevant key on the scanners,
  // for the heap to be built from the scanners correctly
  for (KeyValueScanner scan : scanners) {
    if (toLast) {
      res |= scan.seekToLastRow();
    } else {
      res |= scan.backwardSeek(cell);
    }
  }
{code}
> forwardHeap.close() internally decrements the MSLAB ref counter for the current active segment and the snapshot segment. When the scan is actually closed, we call close() again, and that decrements the count a second time. The count can therefore go negative, so the actual MSLAB closure, which checks for refCount == 0, will fail. Apart from this, if the refCount becomes 0 after the first close and another thread then requests to close the segment, we end up with a corrupted segment, because the segment could be put back into the MSLAB pool.
[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions
[ https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509291#comment-15509291 ] Eshcar Hillel commented on HBASE-16417:

We've started working on the policy patch. We are trying to set a performance baseline by running PE over the default memstore with the configuration settings [~ram_krish] suggested in HBASE-14921. *However*, we are only able to complete the run with 2-4 threads and 10 regions (pre-split) writing 10GB. Any attempt to either increase the number of threads or the amount of data results in an Out of Memory Error caused by the *GC*. (I'm running with the same settings for PE, HBase and GC as you posted.) We are running on an SSD (2.9PB disk; an HDD wasn't even able to complete a run with a single thread), with 48GB RAM. Some questions [~ram_krish]:
1. Did you run PE on an SSD or HDD?
2. How much memory do you have in your machine?
3. Can you think of any parameter that you forgot to mention that may explain the difference between the executions?
I'll also try running PE with more space for the GC and will update on the result.

> In-Memory MemStore Policy for Flattening and Compactions
>
> Key: HBASE-16417
> URL: https://issues.apache.org/jira/browse/HBASE-16417
> Project: HBase
> Issue Type: Sub-task
> Reporter: Anastasia Braginsky
> Assignee: Anastasia Braginsky
> Fix For: 2.0.0
[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions
[ https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509464#comment-15509464 ] Eshcar Hillel commented on HBASE-16417:

I used the same GC settings as you mentioned:
{code}
export HBASE_OPTS="-XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:InitiatingHeapOccupancyPercent=60 -XX:G1HeapWastePercent=20 -XX:G1MixedGCCountTarget=8"
{code}
The last 2 parameters you mentioned seem critical. Did you mean these two were changed?
{code}
hbase.hstore.flusher.count = 2
  The number of flush threads. With fewer threads, the MemStore flushes will be queued. With more threads, the flushes will be executed in parallel, increasing the load on HDFS, and potentially causing more compactions.

hbase.hstore.blockingStoreFiles = 10
  If more than this number of StoreFiles exist in any one Store (one StoreFile is written per flush of MemStore), updates are blocked for this region until a compaction is completed, or until hbase.hstore.blockingWaitTime has been exceeded.
{code}
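For reference, these two properties normally live in hbase-site.xml; a sketch of that fragment with the values quoted above (these are the stock values under discussion, not a tuning recommendation):

```xml
<!-- hbase-site.xml fragment; values as quoted in the thread above -->
<property>
  <name>hbase.hstore.flusher.count</name>
  <value>2</value>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>10</value>
</property>
```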
[jira] [Commented] (HBASE-16417) In-Memory MemStore Policy for Flattening and Compactions
[ https://issues.apache.org/jira/browse/HBASE-16417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15509576#comment-15509576 ] Eshcar Hillel commented on HBASE-16417:

The error message is from the GC:
{code}
Stack: [0x7f56a85d1000,0x7f56a86d2000], sp=0x7f56a86cfe10, free space=1019k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0xab97ea] VMError::report_and_die()+0x2ba
V [libjvm.so+0x4f9dab] report_vm_out_of_memory(char const*, int, unsigned long, VMErrorType, char const*)+0x8b
V [libjvm.so+0x91a7c3] os::Linux::commit_memory_impl(char*, unsigned long, bool)+0x103
V [libjvm.so+0x91ac65] os::pd_commit_memory_or_exit(char*, unsigned long, unsigned long, bool, char const*)+0x35
V [libjvm.so+0x914d46] os::commit_memory_or_exit(char*, unsigned long, unsigned long, bool, char const*)+0x26
V [libjvm.so+0x5c073f] G1PageBasedVirtualSpace::commit_internal(unsigned long, unsigned long)+0xbf
V [libjvm.so+0x5c09cc] G1PageBasedVirtualSpace::commit(unsigned long, unsigned long)+0x11c
V [libjvm.so+0x5c3610] G1RegionsLargerThanCommitSizeMapper::commit_regions(unsigned int, unsigned long)+0x40
V [libjvm.so+0x625157] HeapRegionManager::commit_regions(unsigned int, unsigned long)+0x77
V [libjvm.so+0x6263f1] HeapRegionManager::make_regions_available(unsigned int, unsigned int)+0x31
V [libjvm.so+0x626970] HeapRegionManager::expand_by(unsigned int)+0xb0
V [libjvm.so+0x597c29] G1CollectedHeap::expand(unsigned long)+0x199
V [libjvm.so+0x5a3b0d] G1CollectedHeap::do_collection_pause_at_safepoint(double)+0xc6d
V [libjvm.so+0xac3c2a] VM_G1IncCollectionPause::doit()+0x9a
V [libjvm.so+0xac2c35] VM_Operation::evaluate()+0x55
V [libjvm.so+0xac100a] VMThread::evaluate_operation(VM_Operation*)+0xba
V [libjvm.so+0xac138e] VMThread::loop()+0x1ce
V [libjvm.so+0xac1800] VMThread::run()+0x70
V [libjvm.so+0x91cb88] java_start(Thread*)+0x108
{code}
So the RS crashes; I don't think it is killed by the master. We are not writing to the WAL.
[jira] [Commented] (HBASE-15560) TinyLFU-based BlockCache
[ https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15522580#comment-15522580 ] Eshcar Hillel commented on HBASE-15560:

[~ben.manes], it seems you addressed all comments in RB. Can you upload a new patch that passes QA? Currently there are some compilation errors; you can take a look at the report table under the unit lines.
[jira] [Commented] (HBASE-15560) TinyLFU-based BlockCache
[ https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15532200#comment-15532200 ] Eshcar Hillel commented on HBASE-15560:

Hi, this has often happened to me when tests failed in QA but passed locally; usually it is resolved after a rebase. I am not a committer, and I'm not sure what the policy is for committing patches that didn't pass QA; however, I can give my +1. [~saint@gmail.com], [~busbey], any input here?
[jira] [Commented] (HBASE-15560) TinyLFU-based BlockCache
[ https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15549086#comment-15549086 ] Eshcar Hillel commented on HBASE-15560: --- Hi, can I add my view of this issue ? I think the gap between what is required by the community and what can be provided is not that big. 1) [~ben.manes] you already have the results of the YCSB benchmark you ran with the initial patch. Can you rerun these tests with the latest patch and publish the results in some form. I suggest you publish the exact settings you used plus raw results (rather than lift). You can either present a comparison table of the mean latency + high (95th/99th) percentiles, over different cache sizes, or depict the dynamic of the latency throughout the run in a graph (by using the '-s' flag -- I can explain offline), or best do both. If you dig in the region server log you can find records of the hit ratio, which you can also depict alongside the latency; could be nice to see. This results would show that when combining HBase and Caffeine there is no overhead and in some cases a measurable benefit, even in synthetic workloads. 2) [~stack] if the results of these experiments would satisfy the community then the default can be switched to TinyLFU, with LRU being optional and pushed to master. This would allow the community to further experiment with this feature more easily, and to modify it if necessary. 3) Ben briefly described the results of the benchmarks when using a static distribution. Here is my explanation of the results (Ben feel free to correct me if I'm wrong): The distribution of the items is skewed but *static* with a small (high frequency) head and a long (low frequency) tail. With a given cache size -- after the cache is warm -- the items at the head feel the second segment (which is 80% of the cache in TinyLFU) and the following items feel the first segment. 
With LRU from time to time items from the tail of the distribution cause eviction from the first segment which is later translated to cache misses and increased latency, while TinyLFU tends to keep items with higher frequency in the cache, which results in less misses. As the size of the cache grows less and less items are evicted from the cache and the difference diminishes. With *dynamic* distribution items are continuously evicted from the cache and here the benefit of TinyLFU should be clearly pronounced. We have traces of production workloads that would potentially have skewed dynamic probability. However, we can neither share them and currently don't have the resources to turn them into a running benchmark. We could try to make an effort at this direction if this becomes a make-it-or-break-it point. Would this be acceptable: 1) [~ben.manes] running static YCSB benchmark; 2) [~stack] commit TinyLFU as a default; 3) benchmark with dynamic workloads, either by us or others in the community. > TinyLFU-based BlockCache > > > Key: HBASE-15560 > URL: https://issues.apache.org/jira/browse/HBASE-15560 > Project: HBase > Issue Type: Improvement > Components: BlockCache >Affects Versions: 2.0.0 >Reporter: Ben Manes >Assignee: Ben Manes > Attachments: HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, > HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, HBASE-15560.patch, > tinylfu.patch > > > LruBlockCache uses the Segmented LRU (SLRU) policy to capture frequency and > recency of the working set. It achieves concurrency by using an O( n ) > background thread to prioritize the entries and evict. Accessing an entry is > O(1) by a hash table lookup, recording its logical access time, and setting a > frequency flag. A write is performed in O(1) time by updating the hash table > and triggering an async eviction thread. This provides ideal concurrency and > minimizes the latencies by penalizing the thread instead of the caller. 
> However the policy does not age the frequencies and may not be resilient to > various workload patterns. > W-TinyLFU ([research paper|http://arxiv.org/pdf/1512.00727.pdf]) records the > frequency in a counting sketch, ages periodically by halving the counters, > and orders entries by SLRU. An entry is discarded by comparing the frequency > of the new arrival (candidate) to the SLRU's victim, and keeping the one with > the higher frequency. This allows the operations to be performed in O(1) > time and, through the use of a compact sketch, a much larger history is > retained beyond the current working set. In a variety of real world traces > the policy had [near optimal hit > rates|https://github.com/ben-manes/caffeine/wiki/Efficiency]. > Concurrency is achieved by buffering and replaying the operations, similar to > a write-ahead log. A read is recorded into a striped ring buffer and writes > into a queue. The operations are applied in batches under a try-lock by an > asynchronous thread, thereby tracking the usage pattern without incurring high > latencies > ([benchmarks|https://github.com/ben-manes/caffeine/wiki/Benchmarks#server-class]). > In YCSB benchmarks the results were inconclusive. For a large cache (99% hit > rates) the two caches have near identical throughput and latencies, with > LruBlockCache narrowly winning. At medium and small cache sizes, TinyLFU had a > 1-4% hit rate improvement and therefore lower latencies. The lackluster > result is because a synthetic Zipfian distribution is used, for which SLRU > performs optimally. In a more varied, real-world workload we'd expect to see > improvements from being able to make smarter predictions. > The provided patch implements BlockCache using the > [Caffeine|https://github.com/ben-manes/caffeine] caching library (see the HighScalability > [article|http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html]). > Edward Bortnikov and Eshcar Hillel have graciously provided guidance for > evaluating this patch ([github > branch|https://github.com/ben-manes/hbase/tree/tinylfu]). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
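The aging mechanism described in this thread -- a counting sketch whose counters are periodically halved so that stale frequencies decay -- can be illustrated with a toy model. This is a hedged Python sketch, not Caffeine's actual implementation; the class name, parameters, and hashing scheme are invented for the example.

```python
import hashlib

class FrequencySketch:
    """Count-min-style sketch with periodic aging by halving, in the spirit of W-TinyLFU."""

    def __init__(self, width=1024, depth=4, sample_size=10_000):
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]
        self.additions = 0
        self.sample_size = sample_size  # once reached, halve every counter (aging)

    def _indexes(self, key):
        # One deterministic hash per row; collisions can only over-estimate.
        for i in range(self.depth):
            h = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
            yield i, int.from_bytes(h, "big") % self.width

    def increment(self, key):
        for i, j in self._indexes(key):
            self.tables[i][j] += 1
        self.additions += 1
        if self.additions >= self.sample_size:
            self._age()

    def frequency(self, key):
        # Minimum over the rows bounds the over-estimate.
        return min(self.tables[i][j] for i, j in self._indexes(key))

    def _age(self):
        # Halving keeps recent activity dominant over ancient history.
        for row in self.tables:
            for j in range(len(row)):
                row[j] //= 2
        self.additions //= 2

def admit(sketch, candidate, victim):
    """Admission: keep the entry with the higher estimated frequency."""
    return sketch.frequency(candidate) > sketch.frequency(victim)
```

On eviction, the cache would call `admit(sketch, candidate, victim)` and discard the less frequent of the two, which is what lets a hot working set survive a scan of cold keys.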
[jira] [Commented] (HBASE-15560) TinyLFU-based BlockCache
[ https://issues.apache.org/jira/browse/HBASE-15560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15553051#comment-15553051 ] Eshcar Hillel commented on HBASE-15560: --- No, unfortunately we currently don't have anonymized traces to share. But let's start with step (1) and continue from there. I think when the cache is of small/medium size we can get interesting results even with the YCSB synthetic workloads.
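The static-distribution behavior discussed in this thread -- LRU occasionally admitting tail items that displace the head, versus TinyLFU's frequency-based admission -- can be modeled on a synthetic Zipfian trace. This is a hedged simplification: it uses exact frequency counters instead of a sketch, and the function names and parameters are invented for the example, not HBase or Caffeine code.

```python
import random
from collections import Counter, OrderedDict

def zipf_trace(n_keys=500, n_requests=20000, s=1.0, seed=42):
    """Static skewed workload: small high-frequency head, long low-frequency tail."""
    rnd = random.Random(seed)
    weights = [1.0 / (k + 1) ** s for k in range(n_keys)]
    return rnd.choices(range(n_keys), weights=weights, k=n_requests)

def lru_hit_rate(trace, capacity):
    cache, hits = OrderedDict(), 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # tail items can displace head items
            cache[key] = True
    return hits / len(trace)

def tinylfu_hit_rate(trace, capacity):
    """LRU victim selection plus frequency-based admission (exact counts, no sketch)."""
    cache, hits, freq = OrderedDict(), 0, Counter()
    for key in trace:
        freq[key] += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)
            continue
        if len(cache) < capacity:
            cache[key] = True
        else:
            victim = next(iter(cache))
            if freq[key] > freq[victim]:  # admit only if more popular than the victim
                del cache[victim]
                cache[key] = True
    return hits / len(trace)
```

With a warm cache, the admission filter tends to keep the head of the distribution resident, which is the "fewer misses" effect described above; as capacity approaches the key-space size both policies converge.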
[jira] [Updated] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-13408: -- Attachment: HBASE-13408-trunk-v09.patch We attach a new patch which includes the changes required by the recent discussion. Specifically, we reverted some of the changes to the HRegion and FlushPolicy classes and moved the code for triggering the in-memory flush into the compacting memstore implementation. We excluded two changes: (1) we did not remove the StoreSegmentScanner tier from the KeyValueScanner hierarchy, as this would result in empty implementations of the two methods we define here in the other 5 concrete classes implementing the KeyValueScanner interface, which seems unnecessary; (2) we did not remove the snapshot - this needs to be discussed in a different Jira; there are pros and cons, and it shouldn’t be decided without thorough discussion. > HBase In-Memory Memstore Compaction > --- > > Key: HBASE-13408 > URL: https://issues.apache.org/jira/browse/HBASE-13408 > Project: HBase > Issue Type: New Feature >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Fix For: 2.0.0 > > Attachments: HBASE-13408-trunk-v01.patch, > HBASE-13408-trunk-v02.patch, HBASE-13408-trunk-v03.patch, > HBASE-13408-trunk-v04.patch, HBASE-13408-trunk-v05.patch, > HBASE-13408-trunk-v06.patch, HBASE-13408-trunk-v07.patch, > HBASE-13408-trunk-v08.patch, HBASE-13408-trunk-v09.patch, > HBaseIn-MemoryMemstoreCompactionDesignDocument-ver02.pdf, > HBaseIn-MemoryMemstoreCompactionDesignDocument-ver03.pdf, > HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf, > InMemoryMemstoreCompactionEvaluationResults.pdf, > InMemoryMemstoreCompactionMasterEvaluationResults.pdf, > InMemoryMemstoreCompactionScansEvaluationResults.pdf, > StoreSegmentandStoreSegmentScannerClassHierarchies.pdf > > > A store unit holds a column family in a region, where the memstore is its > in-memory component. The memstore absorbs all updates to the store; from time > to time these updates are flushed to a file on disk, where they are > compacted. Unlike disk components, the memstore is not compacted until it is > written to the filesystem and optionally to the block cache. This may result in > underutilization of the memory due to duplicate entries per row, for example > when hot data is continuously updated. > Generally, the faster data accumulates in memory, the more flushes are > triggered and the more frequently data sinks to disk, slowing down the retrieval of > data, even very recent data. > In high-churn workloads, compacting the memstore can help keep the data > in memory, and thereby speed up data retrieval. > We suggest a new compacted memstore with the following principles: > 1. The data is kept in memory for as long as possible. > 2. Memstore data is either compacted or in the process of being compacted. > 3. Allow a panic mode, which may interrupt an in-progress compaction and > force a flush of part of the memstore. > We suggest applying this optimization only to in-memory column families. > A design document is attached. > This feature was previously discussed in HBASE-5311. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
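The duplicate-entries problem described in the issue -- hot rows updated continuously leave many stale cell versions in memory until a disk flush -- is exactly what in-memory compaction eliminates. A minimal illustrative sketch, where the function name and the (row, qualifier, timestamp, value) tuple layout are invented for the example rather than taken from HBase's Cell API:

```python
def compact(cells):
    """Keep only the newest version per (row, qualifier).

    `cells` is a list of (row, qualifier, timestamp, value) tuples in
    arbitrary order, modeling the contents of a memstore segment.
    """
    newest = {}
    for row, qual, ts, value in cells:
        key = (row, qual)
        if key not in newest or ts > newest[key][0]:
            newest[key] = (ts, value)
    # Emit in row/qualifier order, as a memstore scan would.
    return [(r, q, ts, v) for (r, q), (ts, v) in sorted(newest.items())]
```

In a high-churn workload most cells are redundant versions, so the compacted segment is much smaller and can stay in RAM longer before a disk flush is forced.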
[jira] [Updated] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-13408: -- Attachment: HBASE-13408-trunk-v10.patch
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1464#comment-1464 ] Eshcar Hillel commented on HBASE-13408: --- The patch is now available also on review board. I wasn't able to extract the audit warning. Can someone point out what the problem is? Thanks.
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15006791#comment-15006791 ] Eshcar Hillel commented on HBASE-13408: --- Hi all :) :) :) Nothing else on our table. Any feedback? The patch is available on RB. Considering that the release audit warning is unrelated to our patch, the last QA run is {code}+1 overall{code}. Thanks :)
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025324#comment-15025324 ] Eshcar Hillel commented on HBASE-13408: --- [~stack] we'll get back with answers to your comments/suggestions/review later on; just wanted to say that the evaluation results for the patch are attached (posted October 26), please take a look.
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025386#comment-15025386 ] Eshcar Hillel commented on HBASE-13408: --- The latency spikes are from the evaluation we did on the 0.98 branch (posted July 14). In the more recent evaluation we were able to identify the cause of the spikes and avoid it (thanks for offering to help :) ). Without the spikes it is easier to see the benefit of the new feature.
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031696#comment-15031696 ] Eshcar Hillel commented on HBASE-13408: --- OK, then we'll go with several steps. I also assume it is OK to open a new umbrella Jira for this purpose, with a link back to this one.
[jira] [Created] (HBASE-14918) In-Memory MemStore Flush and Compaction
Eshcar Hillel created HBASE-14918: - Summary: In-Memory MemStore Flush and Compaction Key: HBASE-14918 URL: https://issues.apache.org/jira/browse/HBASE-14918 Project: HBase Issue Type: Umbrella Affects Versions: 2.0.0 Reporter: Eshcar Hillel A memstore serves as the in-memory component of a store unit, absorbing all updates to the store. From time to time these updates are flushed to a file on disk, where they are compacted (by eliminating redundancies) and compressed (i.e., written in a compressed format to reduce their storage size). We aim to speed up data access, and therefore suggest applying an in-memory memstore flush: that is, flushing the active in-memory segment into an intermediate buffer where it can still be accessed by the application. Data in the buffer is subject to compaction and can be stored in any format that allows it to take up less space in RAM. The less space the buffer consumes, the longer it can reside in memory before data is flushed to disk, resulting in better performance. Specifically, the optimization is beneficial for workloads with medium-to-high key churn which incur many redundant cells, like persistent messaging. We suggest structuring the solution as 3 subtasks (respectively, patches): (1) Infrastructure - refactoring of the MemStore hierarchy, introducing the segment (StoreSegment) as a first-class citizen, and decoupling the memstore scanner from the memstore implementation; (2) Implementation of a new memstore (CompactingMemstore) with a non-optimized immutable segment representation; and (3) Memory optimizations, including a compressed format representation and offheap allocations. This Jira continues the discussion in HBASE-13408. Design documents, evaluation results and previous patches can be found in HBASE-13408. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
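The flow proposed above -- a mutable active segment absorbing writes, an in-memory flush moving it into a pipeline of immutable segments, and a compaction step merging the pipeline before any disk flush -- can be sketched as a toy model. This is an illustrative Python sketch under invented names and a dict-based segment representation; it is not the CompactingMemstore implementation.

```python
class CompactingMemStoreModel:
    """Toy model of the proposed design: active segment -> in-memory flush ->
    pipeline of immutable segments -> in-memory compaction."""

    def __init__(self, flush_threshold=3):
        self.active = {}          # mutable segment: key -> value
        self.pipeline = []        # immutable segments, newest first
        self.flush_threshold = flush_threshold

    def put(self, key, value):
        self.active[key] = value
        if len(self.active) >= self.flush_threshold:
            self.in_memory_flush()

    def in_memory_flush(self):
        # Move the active segment into the pipeline; it becomes immutable
        # but remains readable by the application.
        if self.active:
            self.pipeline.insert(0, dict(self.active))
            self.active = {}

    def compact_pipeline(self):
        # Merge all immutable segments, eliminating redundant versions;
        # newer segments win on duplicate keys.
        merged = {}
        for segment in reversed(self.pipeline):  # oldest first, newer overwrite
            merged.update(segment)
        self.pipeline = [merged] if merged else []

    def get(self, key):
        # Reads consult the active segment first, then the pipeline newest-first.
        if key in self.active:
            return self.active[key]
        for segment in self.pipeline:
            if key in segment:
                return segment[key]
        return None
```

The smaller the compacted pipeline, the longer data can stay in RAM before a disk flush is triggered, which is the performance argument made in the description.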
[jira] [Created] (HBASE-14919) Infrastructure refactoring
Eshcar Hillel created HBASE-14919: - Summary: Infrastructure refactoring Key: HBASE-14919 URL: https://issues.apache.org/jira/browse/HBASE-14919 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.0 Reporter: Eshcar Hillel Assignee: Eshcar Hillel Refactoring the MemStore hierarchy, introducing segment (StoreSegment) as first-class citizen and decoupling memstore scanner from the memstore implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14920) Compacting Memstore
Eshcar Hillel created HBASE-14920: - Summary: Compacting Memstore Key: HBASE-14920 URL: https://issues.apache.org/jira/browse/HBASE-14920 Project: HBase Issue Type: Sub-task Reporter: Eshcar Hillel Assignee: Eshcar Hillel Implementation of a new compacting memstore with non-optimized immutable segment representation -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14921) Memory optimizations
Eshcar Hillel created HBASE-14921: - Summary: Memory optimizations Key: HBASE-14921 URL: https://issues.apache.org/jira/browse/HBASE-14921 Project: HBase Issue Type: Sub-task Affects Versions: 2.0.0 Reporter: Eshcar Hillel Memory optimizations including compressed format representation and offheap allocations -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V01.patch patch attached
[jira] [Updated] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Status: Patch Available (was: Open)
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037817#comment-15037817 ] Eshcar Hillel commented on HBASE-14918: --- Submitted patch for first sub-task
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037825#comment-15037825 ] Eshcar Hillel commented on HBASE-13408: --- Created a new Jira, HBASE-14918, with three sub-tasks, and submitted a patch for the first (refactoring) task. This Jira is EOL; if you wish to continue following this issue please start watching HBASE-14918 (and/or HBASE-14919/HBASE-14920/HBASE-14921). > HBase In-Memory Memstore Compaction > --- > > Key: HBASE-13408 > URL: https://issues.apache.org/jira/browse/HBASE-13408 > Project: HBase > Issue Type: New Feature >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Fix For: 2.0.0 > > Attachments: HBASE-13408-trunk-v01.patch, > HBASE-13408-trunk-v02.patch, HBASE-13408-trunk-v03.patch, > HBASE-13408-trunk-v04.patch, HBASE-13408-trunk-v05.patch, > HBASE-13408-trunk-v06.patch, HBASE-13408-trunk-v07.patch, > HBASE-13408-trunk-v08.patch, HBASE-13408-trunk-v09.patch, > HBASE-13408-trunk-v10.patch, > HBaseIn-MemoryMemstoreCompactionDesignDocument-ver02.pdf, > HBaseIn-MemoryMemstoreCompactionDesignDocument-ver03.pdf, > HBaseIn-MemoryMemstoreCompactionDesignDocument-ver04.pdf, > HBaseIn-MemoryMemstoreCompactionDesignDocument.pdf, > InMemoryMemstoreCompactionEvaluationResults.pdf, > InMemoryMemstoreCompactionMasterEvaluationResults.pdf, > InMemoryMemstoreCompactionScansEvaluationResults.pdf, > StoreSegmentandStoreSegmentScannerClassHierarchies.pdf > > > A store unit holds a column family in a region, where the memstore is its > in-memory component. The memstore absorbs all updates to the store; from time > to time these updates are flushed to a file on disk, where they are > compacted. Unlike disk components, the memstore is not compacted until it is > written to the filesystem and optionally to block-cache. This may result in > underutilization of the memory due to duplicate entries per row, for example, > when hot data is continuously updated.
> Generally, the faster data accumulates in memory, the more flushes are > triggered and the more frequently data sinks to disk, slowing down retrieval > even of very recent data. > In high-churn workloads, compacting the memstore can help maintain the data > in memory, and thereby speed up data retrieval. > We suggest a new compacted memstore with the following principles: > 1. The data is kept in memory for as long as possible > 2. Memstore data is either compacted or in the process of being compacted > 3. Allow a panic mode, which may interrupt an in-progress compaction and > force a flush of part of the memstore. > We suggest applying this optimization only to in-memory column families. > A design document is attached. > This feature was previously discussed in HBASE-5311. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
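The flush-and-compact flow behind the three principles above can be illustrated with a toy sketch. This is not HBase's implementation; the class and method names here are invented for illustration, and real segments hold cells rather than plain string maps:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Toy model of an in-memory compacting memstore: the active segment is
// retired into an in-memory pipeline instead of being flushed straight to
// disk, and pipeline segments are merged to eliminate duplicate rows.
class CompactingMemstoreSketch {
    private Map<String, String> active = new TreeMap<>();               // mutable segment
    private final Deque<Map<String, String>> pipeline = new ArrayDeque<>(); // immutable segments, newest first

    void put(String row, String value) {
        active.put(row, value);                // newest value wins within the segment
    }

    // In-memory flush: retire the active segment into the pipeline.
    void inMemoryFlush() {
        pipeline.addFirst(active);
        active = new TreeMap<>();
    }

    // In-memory compaction: merge pipeline segments oldest-to-newest so newer
    // values overwrite older ones; hot rows keep only their latest version.
    void compactPipeline() {
        Map<String, String> merged = new TreeMap<>();
        Iterator<Map<String, String>> oldestFirst = pipeline.descendingIterator();
        while (oldestFirst.hasNext()) {
            merged.putAll(oldestFirst.next());
        }
        pipeline.clear();
        pipeline.addFirst(merged);
    }

    int pipelineCellCount() {
        int n = 0;
        for (Map<String, String> seg : pipeline) n += seg.size();
        return n;
    }
}
```

For a continuously updated row, each in-memory flush adds another copy of the row to the pipeline; compaction collapses them back to one, which is exactly the RAM saving the proposal targets.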
[jira] [Commented] (HBASE-13408) HBase In-Memory Memstore Compaction
[ https://issues.apache.org/jira/browse/HBASE-13408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037831#comment-15037831 ] Eshcar Hillel commented on HBASE-13408: --- This is an umbrella Jira continuing the current issue.
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15043498#comment-15043498 ] Eshcar Hillel commented on HBASE-14919: --- Added link to review board. [~yuzhihong] can you tell what's the basis for the -1 overall for the patch? > Infrastructure refactoring > -- > > Key: HBASE-14919 > URL: https://issues.apache.org/jira/browse/HBASE-14919 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Attachments: HBASE-14919-V01.patch > > > Refactoring the MemStore hierarchy, introducing segment (StoreSegment) as > first-class citizen and decoupling memstore scanner from the memstore > implementation.
[jira] [Updated] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V02.patch
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048897#comment-15048897 ] Eshcar Hillel commented on HBASE-14919: --- The patch is now available in rb. I will work to fix the style warnings. This would be a good time to give feedback :).
[jira] [Updated] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V03.patch I suspect master is not stable (see HBASE-14947), still posting new patch with code review modifications. Also posting in rb.
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15061674#comment-15061674 ] Eshcar Hillel commented on HBASE-14919: --- Thanks [~te...@apache.org] for providing the list of failed tests. Is it normal for the QA to report success while some tests have failed? Can you verify the master is stable and no tests fail there?
[jira] [Updated] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V04.patch
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15062582#comment-15062582 ] Eshcar Hillel commented on HBASE-14919: --- ok. Just uploaded a new patch - hope to have removed all style errors. Any chance of committing a patch while master is not stable?
[jira] [Updated] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14918: -- Description: A memstore serves as the in-memory component of a store unit, absorbing all updates to the store. From time to time these updates are flushed to a file on disk, where they are compacted (by eliminating redundancies) and compressed (i.e., written in a compressed format to reduce their storage size). We aim to speed up data access, and therefore suggest to apply in-memory memstore flush. That is to flush the active in-memory segment into an intermediate buffer where it can be accessed by the application. Data in the buffer is subject to compaction and can be stored in any format that allows it to take up smaller space in RAM. The less space the buffer consumes the longer it can reside in memory before data is flushed to disk, resulting in better performance. Specifically, the optimization is beneficial for workloads with medium-to-high key churn which incur many redundant cells, like persistent messaging. We suggest to structure the solution as 4 subtasks (respectively, patches). (1) Infrastructure - refactoring of the MemStore hierarchy, introducing segment (StoreSegment) as first-class citizen, and decoupling memstore scanner from the memstore implementation; (2) Adding StoreServices facility at the region level to allow memstores update region counters and access region level synchronization mechanism; (3) Implementation of a new memstore (CompactingMemstore) with non-optimized immutable segment representation, and (4) Memory optimization including compressed format representation and off heap allocations. This Jira continues the discussion in HBASE-13408. Design documents, evaluation results and previous patches can be found in HBASE-13408. was: A memstore serves as the in-memory component of a store unit, absorbing all updates to the store. 
From time to time these updates are flushed to a file on disk, where they are compacted (by eliminating redundancies) and compressed (i.e., written in a compressed format to reduce their storage size). We aim to speed up data access, and therefore suggest to apply in-memory memstore flush. That is to flush the active in-memory segment into an intermediate buffer where it can be accessed by the application. Data in the buffer is subject to compaction and can be stored in any format that allows it to take up smaller space in RAM. The less space the buffer consumes the longer it can reside in memory before data is flushed to disk, resulting in better performance. Specifically, the optimization is beneficial for workloads with medium-to-high key churn which incur many redundant cells, like persistent messaging. We suggest to structure the solution as 3 subtasks (respectively, patches). (1) Infrastructure - refactoring of the MemStore hierarchy, introducing segment (StoreSegment) as first-class citizen, and decoupling memstore scanner from the memstore implementation; (2) Implementation of a new memstore (CompactingMemstore) with non-optimized immutable segment representation, and (3) Memory optimization including compressed format representation and offheap allocations. This Jira continues the discussion in HBASE-13408. Design documents, evaluation results and previous patches can be found in HBASE-13408.
[jira] [Created] (HBASE-15016) StoreServices facility in Region
Eshcar Hillel created HBASE-15016: - Summary: StoreServices facility in Region Key: HBASE-15016 URL: https://issues.apache.org/jira/browse/HBASE-15016 Project: HBase Issue Type: Sub-task Reporter: Eshcar Hillel The default implementation of a memstore ensures that between two flushes the memstore size increases monotonically. Supporting new memstores that store data in different formats (specifically, compressed), or that allow eliminating data redundancies in memory (e.g., via compaction), means that the size of the data stored in memory can decrease even between two flushes. This requires memstores to have access to facilities that manipulate region counters and synchronization. This subtask introduces a new region interface -- StoreServices, through which store components can access these facilities.
[jira] [Assigned] (HBASE-14921) Memory optimizations
[ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel reassigned HBASE-14921: - Assignee: Eshcar Hillel > Memory optimizations > > > Key: HBASE-14921 > URL: https://issues.apache.org/jira/browse/HBASE-14921 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > > Memory optimizations including compressed format representation and offheap > allocations -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-14921) Memory optimizations
[ https://issues.apache.org/jira/browse/HBASE-14921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel reassigned HBASE-14921: - Assignee: (was: Eshcar Hillel)
[jira] [Assigned] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel reassigned HBASE-15016: - Assignee: Eshcar Hillel
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065538#comment-15065538 ] Eshcar Hillel commented on HBASE-14919: --- Sure [~stack] I understand. Meanwhile I will start working on the second sub-task we defined (already updated HBASE-14918).
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067050#comment-15067050 ] Eshcar Hillel commented on HBASE-15016: --- As stated in the description, in the future memstore size can fluctuate: when using alternative memory formats, or due to compaction, we need to occasionally take a lock and update size counters at the region level. The goal of this issue is to come up with a StoreServices entity that acts as the bridge between any future low-level memstore needs and the services at the region level. We aim to make the minimal possible changes to HRegion itself. A memstore gets a pointer to the store services object of its region at construction time. The store services object encapsulates a (backward) pointer to the region, manages any additional size counters beyond the ones existing in the region, and allows acquiring and releasing the region lock in exclusive mode. The region itself can also query the store services object, e.g., to base its flush decision on the data this object is managing. In addition, we extend HStore with two methods: (1) finalizeFlush(), which is invoked after the flush is completed (vs. running prepare beforehand); each memstore type can choose what to do at this phase. (2) getMemorySizeForFlushPolicy() - based on this size the memstore is selected (or not) to be flushed to disk; each memstore type could potentially compute the reported size differently. We are working on a patch, but if you have any comments, early is better than later :)
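The StoreServices idea described above can be sketched as follows. This is an illustrative toy model, not the committed HBase API: the method names and lock type here are invented to show the shape of the facility (extra size counters that may decrease, plus exclusive access to a region-level lock):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical per-region facade handed to each memstore at construction
// time, bridging low-level memstore needs and region-level services.
class StoreServicesSketch {
    // Extra size accounting beyond the region's own counters; in-memory
    // compaction can make this decrease between flushes, which the default
    // memstore never does.
    private final AtomicLong inMemoryDataSize = new AtomicLong();
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();

    void addMemstoreSize(long delta) {   // delta may be negative after compaction
        inMemoryDataSize.addAndGet(delta);
    }

    long getMemstoreSize() {             // the region can query this for flush decisions
        return inMemoryDataSize.get();
    }

    // Exclusive mode of the region lock, e.g. while swapping segments.
    void acquireExclusiveLock() { regionLock.writeLock().lock(); }
    void releaseExclusiveLock() { regionLock.writeLock().unlock(); }
}
```

The point of routing this through one object is that HRegion itself stays almost untouched: memstores talk to the facade, and the region queries the same facade when deciding whether to flush.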
[jira] [Updated] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15016: -- Attachment: HBASE-15016-V01.patch
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070733#comment-15070733 ] Eshcar Hillel commented on HBASE-15016: --- The patch is attached. Recap: future memstore optimizations such as memstore compaction, compression, and off-heaping require some interface with services at the region level. For this purpose we introduce the StoreServices class. In addition to being the interface through which memstores access services, it also maintains additional data that is updated by memstores and can be queried by the region. This patch also refines and extends region-to-store communication. Since this is the normal flow of data, there is no need to create a new interface, and the API of Store is extended (with the 2 methods described in the previous comment). Finally, this patch refines the region method which decides whether to invoke a flush. The decision is captured in the new invokeFlushIfNeeded() method, and is also based on data stored in the StoreServices objects.
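The flush decision described in this comment can be sketched like so. finalizeFlush(), getMemorySizeForFlushPolicy(), and invokeFlushIfNeeded() are the names from the comment above, but everything else (the interface shape, the size-limit constructor, the return value) is an illustrative assumption, not the patch's actual code:

```java
import java.util.List;

// Toy model of the region-level flush decision: the region asks each store
// for the size it wants judged by the flush policy and flushes when the
// total crosses a configured limit.
class FlushDecisionSketch {
    interface Store {
        long getMemorySizeForFlushPolicy();   // each memstore type may report a different size
        void finalizeFlush();                 // hook invoked after a flush completes
    }

    private final long flushSizeLimit;

    FlushDecisionSketch(long flushSizeLimit) {
        this.flushSizeLimit = flushSizeLimit;
    }

    // Returns true when a flush was triggered.
    boolean invokeFlushIfNeeded(List<Store> stores) {
        long total = 0;
        for (Store s : stores) total += s.getMemorySizeForFlushPolicy();
        if (total <= flushSizeLimit) return false;
        // ...the actual flush would run here...
        for (Store s : stores) s.finalizeFlush();   // post-flush callback per store
        return true;
    }
}
```

Letting each store report its own policy size is what makes room for compacting memstores: a store whose in-memory data just shrank via compaction can report a smaller size and avoid a premature flush to disk.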
[jira] [Updated] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15016: -- Status: Patch Available (was: Open)
[jira] [Updated] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-14919: -- Attachment: HBASE-14919-V04.patch re-submitting patch; master seems to be stable again.
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070913#comment-15070913 ] Eshcar Hillel commented on HBASE-14918: --- Patches are available for task 1 and task 2.
[jira] [Updated] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15016: -- Attachment: HBASE-15016-V02.patch new patch
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15072127#comment-15072127 ] Eshcar Hillel commented on HBASE-14918: --- Both patches got +1 overall in QA. Happy Holidays to those who celebrate - waiting to make progress when you return.
[jira] [Commented] (HBASE-14919) Infrastructure refactoring
[ https://issues.apache.org/jira/browse/HBASE-14919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074978#comment-15074978 ] Eshcar Hillel commented on HBASE-14919: --- Went over your comments: 1. volatile - I see what you mean, but since not all the effects of volatile are clear (e.g., it is not cached) I preferred to keep the same behaviour as in master, where cellset and snapshot are volatile in DefaultMemStore. 2. snapshotId - this is how snapshotId is used; the pattern is the same before and after the patch: {code} public void clearSnapshot(long id) throws UnexpectedStateException { ... if (this.snapshotId != id) { throw new UnexpectedStateException("Current snapshot id is " + this.snapshotId + ",passed " + id); } // OK. Passed in snapshot is same as current snapshot. If not-empty, // create a new snapshot and let the old one go. ... if (!this.snapshot.isEmpty()) { ... } ... this.snapshotId = -1; ... } {code} It seems there is nothing else to be done. Are we good to commit? > Infrastructure refactoring > -- > > Key: HBASE-14919 > URL: https://issues.apache.org/jira/browse/HBASE-14919 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Attachments: HBASE-14919-V01.patch, HBASE-14919-V01.patch, > HBASE-14919-V02.patch, HBASE-14919-V03.patch, HBASE-14919-V04.patch, > HBASE-14919-V04.patch > > > Refactoring the MemStore hierarchy, introducing segment (StoreSegment) as > first-class citizen and decoupling memstore scanner from the memstore > implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075008#comment-15075008 ] Eshcar Hillel commented on HBASE-15016: --- bq. Who is providing the services? The Region or the Store? When I read the StoreServices Interface, it looks like Services we want of the Region? If so, should it be RegionServices (for use by the Store) rather than StoreServices? changed the name to RegionServerProxy bq. Why is it a fluctuating memstore size rather than just memstore size? This method in StoreServices, addAndGetFluctuatingMemstoreSize, is it for all Stores in the region or for the current Store only? Is getMemstoreActiveSize for the current Store or all Stores under the Region? There are two counters: memstoreSize and memstoreFluctuatingSize, and the active size is their difference. Changed the names to {code} long addAndGetGlobalMemstoreSize(long size) long addAndGetGlobalMemstoreFluctuatingSize(long size) long getGlobalMemstoreActiveSize() {code} bq. This call is actually complicated in implementation... getWalSequenceId... and not used elsewhere by this patch (maybe it is used over in the pipeline patch). Does it have to be exposed? One of the services that the new memstore needs from region is getting the wal sequence id, as they are associated with the segments in the memstore, so these are exposed here. bq. Drop 'ForFlushPolicy' from name of this method... getMemStoreSizeForFlushPolicy ... it is redundant (yeah, size is determined by a flush policy but don't have to say so in method name). This is to be use by the region figuring when to flush? changed to getMemStoreActiveSize() bq. Does the soft limit in the HRegion belong in this patch? Yes, because it is used in the implementation of the requestFlushIfNeeded(). Changed names to memstoreFlushSizeLB memstoreFlushSizeUB bq. In the hbase-common, you refer to a class type that is downstream in hbase-server. 
We are trying to have modules not have inter-dependencies if possible... at least not ones that go in this direction... from upstream to downstream dependency (downstream to upstream is ok): I'm not sure where exactly you mean; can you point to this in RB? I would be happy to learn the right way to update the method to compute the region class size. > StoreServices facility in Region > > > Key: HBASE-15016 > URL: https://issues.apache.org/jira/browse/HBASE-15016 > Project: HBase > Issue Type: Sub-task >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Attachments: HBASE-15016-V01.patch, HBASE-15016-V02.patch > > > The default implementation of a memstore ensures that between two flushes the > memstore size increases monotonically. Supporting new memstores that store > data in different formats (specifically, compressed), or that allows to > eliminate data redundancies in memory (e.g., via compaction), means that the > size of the data stored in memory can decrease even between two flushes. This > requires memstores to have access to facilities that manipulate region > counters and synchronization. > This subtasks introduces a new region interface -- StoreServices, through > which store components can access these facilities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
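The two-counter arithmetic described in the comment above (memstoreSize and memstoreFluctuatingSize, with the active size as their difference) can be sketched as follows. This is a hypothetical illustration, not the actual patch: the method names follow the renamed API quoted in the comment, but the class name and AtomicLong-based implementation are assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the region-level accounting discussed above.
// The real interface lives in the HBASE-15016 patch.
public class RegionMemstoreAccounting {
    private final AtomicLong globalMemstoreSize = new AtomicLong();
    private final AtomicLong globalMemstoreFluctuatingSize = new AtomicLong();

    // Total bytes accounted to all memstores in the region.
    public long addAndGetGlobalMemstoreSize(long delta) {
        return globalMemstoreSize.addAndGet(delta);
    }

    // Bytes that may be reclaimed between flushes (e.g., by in-memory
    // compaction), so the total is no longer monotonic.
    public long addAndGetGlobalMemstoreFluctuatingSize(long delta) {
        return globalMemstoreFluctuatingSize.addAndGet(delta);
    }

    // Active size is the difference between the two counters.
    public long getGlobalMemstoreActiveSize() {
        return globalMemstoreSize.get() - globalMemstoreFluctuatingSize.get();
    }
}
```

For example, after accounting 100 bytes of writes and 30 bytes of fluctuating (compactible) data, the active size reported to the flush policy would be 70.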
[jira] [Updated] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eshcar Hillel updated HBASE-15016: -- Attachment: HBASE-15016-V03.patch > StoreServices facility in Region > > > Key: HBASE-15016 > URL: https://issues.apache.org/jira/browse/HBASE-15016 > Project: HBase > Issue Type: Sub-task >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Attachments: HBASE-15016-V01.patch, HBASE-15016-V02.patch, > HBASE-15016-V03.patch > > > The default implementation of a memstore ensures that between two flushes the > memstore size increases monotonically. Supporting new memstores that store > data in different formats (specifically, compressed), or that allows to > eliminate data redundancies in memory (e.g., via compaction), means that the > size of the data stored in memory can decrease even between two flushes. This > requires memstores to have access to facilities that manipulate region > counters and synchronization. > This subtasks introduces a new region interface -- StoreServices, through > which store components can access these facilities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15016) StoreServices facility in Region
[ https://issues.apache.org/jira/browse/HBASE-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075414#comment-15075414 ] Eshcar Hillel commented on HBASE-15016: --- I see what you're getting at. By all means [~stack], be my guest, go ahead and make a version :) > StoreServices facility in Region > > > Key: HBASE-15016 > URL: https://issues.apache.org/jira/browse/HBASE-15016 > Project: HBase > Issue Type: Sub-task >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Attachments: HBASE-15016-V01.patch, HBASE-15016-V02.patch, > HBASE-15016-V03.patch > > > The default implementation of a memstore ensures that between two flushes the > memstore size increases monotonically. Supporting new memstores that store > data in different formats (specifically, compressed), or that allows to > eliminate data redundancies in memory (e.g., via compaction), means that the > size of the data stored in memory can decrease even between two flushes. This > requires memstores to have access to facilities that manipulate region > counters and synchronization. > This subtasks introduces a new region interface -- StoreServices, through > which store components can access these facilities. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15991) CompactingMemstore#InMemoryFlushRunnable should implement Comparable/Comparator
[ https://issues.apache.org/jira/browse/HBASE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15323275#comment-15323275 ] Eshcar Hillel commented on HBASE-15991: --- [~anoop.hbase], [~ram_krish], can you give some background on this issue: Does changing the queue solve the comparable problem? Do you see this issue also when running other operations, or just batch mutate? We ran quite a lot of experiments with the code in HBASE-14920 (but not batch operations) and never hit this issue. > CompactingMemstore#InMemoryFlushRunnable should implement > Comparable/Comparator > --- > > Key: HBASE-15991 > URL: https://issues.apache.org/jira/browse/HBASE-15991 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-15991.patch > > > Configuring CompactingMemstore for a table fails due to the following error > {code} > 2016-06-08 23:27:03,761 ERROR [B.defaultRpcServer.handler... 
> 2016-06-08 23:27:03,761 ERROR > [B.defaultRpcServer.handler=38,queue=8,port=16041] ipc.RpcServer: Unexpected > throwable object > java.lang.ClassCastException: > org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable > cannot be cast to java.lang.Comparable > at > java.util.concurrent.PriorityBlockingQueue.siftUpComparable(PriorityBlockingQueue.java:357) > at > java.util.concurrent.PriorityBlockingQueue.offer(PriorityBlockingQueue.java:489) > at > org.apache.hadoop.hbase.util.StealJobQueue$1.offer(StealJobQueue.java:56) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1361) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.checkActiveSize(CompactingMemStore.java:258) > at > org.apache.hadoop.hbase.regionserver.AbstractMemStore.internalAdd(AbstractMemStore.java:403) > at > org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:113) > at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:630) > at > org.apache.hadoop.hbase.regionserver.HRegion.applyToMemstore(HRegion.java:3769) > at > org.apache.hadoop.hbase.regionserver.HRegion.applyFamilyMapToMemstore(HRegion.java:3740) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3222) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2954) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2896) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:868) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:830) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2307) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34826) > > {code} > It is a straight forward fix. But If we implement the Comparable the > compareTo() should be based on what attribute? 
Should be based on the time? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15991) CompactingMemstore#InMemoryFlushRunnable should implement Comparable/Comparator
[ https://issues.apache.org/jira/browse/HBASE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15326918#comment-15326918 ] Eshcar Hillel commented on HBASE-15991: --- Some comments: 1. The exception is due to StealJobQueue extending PriorityBlockingQueue, which throws the exception when it cannot determine the position of the item in the queue. I agree there is no need for a priority queue for the in-memory flush, at least for now. If it is required in the future, the attribute for comparison will become obvious as well. 2. We should consider the pros/cons of LinkedBlockingQueue vs other queues such as ConcurrentLinkedQueue. See http://stackoverflow.com/questions/1301691/java-queue-implementations-which-one "The most important difference between LinkedBlockingQueue and ConcurrentLinkedQueue is that if you request an element from a LinkedBlockingQueue and the queue is empty, your thread will wait until there is something there. A ConcurrentLinkedQueue will return right away with the behavior of an empty queue." On the other hand, for ConcurrentLinkedQueue: "Beware that, unlike in most collections, the size method is NOT a constant-time operation. Because of the asynchronous nature of these queues, determining the current number of elements requires a traversal of the elements." So we need to consider which is more important: having a constant-time size method, or making sure that retrieving an element never blocks. 
> CompactingMemstore#InMemoryFlushRunnable should implement > Comparable/Comparator > --- > > Key: HBASE-15991 > URL: https://issues.apache.org/jira/browse/HBASE-15991 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-15991.patch, HBASE-15991_test.patch > > > Configuring CompactingMemstore for a table fails due to the following error > {code} > 2016-06-08 23:27:03,761 ERROR [B.defaultRpcServer.handler... > 2016-06-08 23:27:03,761 ERROR > [B.defaultRpcServer.handler=38,queue=8,port=16041] ipc.RpcServer: Unexpected > throwable object > java.lang.ClassCastException: > org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable > cannot be cast to java.lang.Comparable > at > java.util.concurrent.PriorityBlockingQueue.siftUpComparable(PriorityBlockingQueue.java:357) > at > java.util.concurrent.PriorityBlockingQueue.offer(PriorityBlockingQueue.java:489) > at > org.apache.hadoop.hbase.util.StealJobQueue$1.offer(StealJobQueue.java:56) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1361) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.checkActiveSize(CompactingMemStore.java:258) > at > org.apache.hadoop.hbase.regionserver.AbstractMemStore.internalAdd(AbstractMemStore.java:403) > at > org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:113) > at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:630) > at > org.apache.hadoop.hbase.regionserver.HRegion.applyToMemstore(HRegion.java:3769) > at > org.apache.hadoop.hbase.regionserver.HRegion.applyFamilyMapToMemstore(HRegion.java:3740) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3222) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2954) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2896) > at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:868) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:830) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2307) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34826) > > {code} > It is a straight forward fix. But If we implement the Comparable the > compareTo() should be based on what attribute? Should be based on the time? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
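The LinkedBlockingQueue vs ConcurrentLinkedQueue trade-off quoted in the comment above can be seen directly in the JDK. This is a standalone demo, not HBase code; the class name is made up for illustration.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueueBehavior {
    // Polls both queue types while empty and reports what each returned.
    public static String pollEmpty() throws InterruptedException {
        LinkedBlockingQueue<Runnable> blocking = new LinkedBlockingQueue<>();
        ConcurrentLinkedQueue<Runnable> nonBlocking = new ConcurrentLinkedQueue<>();

        // ConcurrentLinkedQueue returns immediately: null on empty.
        Runnable fromNonBlocking = nonBlocking.poll();

        // LinkedBlockingQueue's take() would block forever on an empty queue;
        // a timed poll shows the waiting behavior without hanging the demo.
        Runnable fromBlocking = blocking.poll(10, TimeUnit.MILLISECONDS);

        return (fromNonBlocking == null ? "null" : "task") + "/"
             + (fromBlocking == null ? "null" : "task");
    }
}
```

Both calls come back empty here, but the blocking variant waited out its timeout first; with take() instead of a timed poll, the consumer thread would park until a producer offered an element.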
[jira] [Commented] (HBASE-15991) CompactingMemstore#InMemoryFlushRunnable should implement Comparable/Comparator
[ https://issues.apache.org/jira/browse/HBASE-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15327184#comment-15327184 ] Eshcar Hillel commented on HBASE-15991: --- In that case, among all blocking options ArrayBlockingQueue (bounded size), SynchronousQueue (no capacity - each insert operation must wait for a corresponding remove operation), PriorityBlockingQueue (requires comparator), and LinkedTransferQueue (supports synchronous transfer in addition to put), LinkedBlockingQueue seems to be the best choice. It is claimed to have higher throughput than array-based queues but less predictable performance. In the context of a thread pool, using a blocking queue is probably OK. I guess +1 then. > CompactingMemstore#InMemoryFlushRunnable should implement > Comparable/Comparator > --- > > Key: HBASE-15991 > URL: https://issues.apache.org/jira/browse/HBASE-15991 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan >Priority: Critical > Fix For: 2.0.0 > > Attachments: HBASE-15991.patch, HBASE-15991_test.patch > > > Configuring CompactingMemstore for a table fails due to the following error > {code} > 2016-06-08 23:27:03,761 ERROR [B.defaultRpcServer.handler... 
> 2016-06-08 23:27:03,761 ERROR > [B.defaultRpcServer.handler=38,queue=8,port=16041] ipc.RpcServer: Unexpected > throwable object > java.lang.ClassCastException: > org.apache.hadoop.hbase.regionserver.CompactingMemStore$InMemoryFlushRunnable > cannot be cast to java.lang.Comparable > at > java.util.concurrent.PriorityBlockingQueue.siftUpComparable(PriorityBlockingQueue.java:357) > at > java.util.concurrent.PriorityBlockingQueue.offer(PriorityBlockingQueue.java:489) > at > org.apache.hadoop.hbase.util.StealJobQueue$1.offer(StealJobQueue.java:56) > at > java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1361) > at > org.apache.hadoop.hbase.regionserver.CompactingMemStore.checkActiveSize(CompactingMemStore.java:258) > at > org.apache.hadoop.hbase.regionserver.AbstractMemStore.internalAdd(AbstractMemStore.java:403) > at > org.apache.hadoop.hbase.regionserver.AbstractMemStore.add(AbstractMemStore.java:113) > at org.apache.hadoop.hbase.regionserver.HStore.add(HStore.java:630) > at > org.apache.hadoop.hbase.regionserver.HRegion.applyToMemstore(HRegion.java:3769) > at > org.apache.hadoop.hbase.regionserver.HRegion.applyFamilyMapToMemstore(HRegion.java:3740) > at > org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:3222) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2954) > at > org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2896) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doBatchOp(RSRpcServices.java:868) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.doNonAtomicRegionMutation(RSRpcServices.java:830) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.multi(RSRpcServices.java:2307) > at > org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:34826) > > {code} > It is a straight forward fix. But If we implement the Comparable the > compareTo() should be based on what attribute? 
Should be based on the time? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
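The ClassCastException in the stack trace above, and why switching the pool to a LinkedBlockingQueue avoids it, can be reproduced in isolation. This is a hypothetical standalone demo: the plain Runnable stands in for CompactingMemStore$InMemoryFlushRunnable, which does not implement Comparable.

```java
import java.util.concurrent.*;

public class FlushQueueChoice {
    // Stand-in for InMemoryFlushRunnable: a Runnable with no Comparable.
    static final Runnable FLUSH_TASK = () -> { /* in-memory flush work */ };

    // PriorityBlockingQueue casts elements to Comparable when sifting them
    // into the heap, so offering a plain Runnable fails immediately.
    public static boolean priorityQueueRejectsPlainRunnable() {
        try {
            new PriorityBlockingQueue<Runnable>().offer(FLUSH_TASK);
            return false;
        } catch (ClassCastException expected) {
            return true;  // same failure mode as the stack trace above
        }
    }

    // LinkedBlockingQueue imposes no ordering, so a thread pool backed by it
    // accepts arbitrary Runnables with no Comparable requirement.
    public static boolean linkedQueueAcceptsPlainRunnable() throws Exception {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(1, 1, 0L,
            TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        Future<?> f = pool.submit(FLUSH_TASK);
        f.get();  // completes normally
        pool.shutdown();
        return true;
    }
}
```

This matches the conclusion in the comments: FIFO submission order is sufficient for in-memory flush tasks, so dropping the priority queue sidesteps the question of what compareTo() should be based on.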