[ https://issues.apache.org/jira/browse/HBASE-26567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
chenglei updated HBASE-26567: ----------------------------- Description: As HBASE-18375 said, data chunks in {{CellChunkMap}} is indexed by index chunks by ChunkID ,not by strong reference, so in order to prevent the data chunks in {{CellChunkMap}} from gc by JVM we put these data chunks in {{ChunkCreator.chunkIdMap}} which is a the strong map. Whether or not the {{ChunkCreator}} put the data chunk in {{ChunkCreator.chunkIdMap}} depends on the data chunk is pooled or the index type of {{CompactingMemStore}} is {{IndexType.CHUNK_MAP}}. So when we call {{ChunkCreator.getChunk}}, it is very important to specify the IndexType. {{ChunkCreator}} has following package-visible method, which does not require the the {{IndexType}} and use {{IndexType.ARRAY_MAP}} {code:java} Chunk getChunk(ChunkType chunkType) { return getChunk(CompactingMemStore.IndexType.ARRAY_MAP, chunkType); } {code} This method is used by following line 363 in {{MemStoreLABImpl.getNewExternalChunk}} now, which is used by {{CellChunkImmutableSegment.copyCellIntoMSLAB}} to force copy a big cell into {{MemStoreLAB}},so the index type obviously should be {{IndexType.CHUNK_MAP}}. {code:java} 359 public Chunk getNewExternalChunk(ChunkCreator.ChunkType chunkType) { 360 switch (chunkType) { 361 case INDEX_CHUNK: 362 case DATA_CHUNK: 363 Chunk c = this.chunkCreator.getChunk(chunkType); 364 chunks.add(c.getId()); 365 return c; 366 case JUMBO_CHUNK: // a jumbo chunk doesn't have a fixed size 367 default: 368 return null; 369 } 370 } {code} Even though fortunately this IndexType mismatch does not cause bug now because {{MemStoreLABImpl.getNewExternalChunk}} in fact is not called by {{CellChunkImmutableSegment.copyCellIntoMSLAB}} when the cell is bigger than chunkSize,but I think it is dangerous to use this was: As HBASE-18375 said, data chunks in {{CellChunkMap}} is indexed by index chunks by ChunkID ,not by strong reference, so in order to prevent the data chunks in {{CellChunkMap}} from gc by JVM we put these data chunks in {{ChunkCreator.chunkIdMap}} which is a the strong map. Whether or not the {{ChunkCreator}} put the data chunk in {{ChunkCreator.chunkIdMap}} depends on the data chunk is pooled or the index type of {{CompactingMemStore}} is {{IndexType.CHUNK_MAP}}. So when we call {{ChunkCreator.getChunk}}, it is very important to specify the IndexType. {{ChunkCreator}} has following package-visible method, which does not require the the {{IndexType}} and use {{IndexType.ARRAY_MAP}} {code:java} Chunk getChunk(ChunkType chunkType) { return getChunk(CompactingMemStore.IndexType.ARRAY_MAP, chunkType); } {code} This method is used by following line 363 in {{MemStoreLABImpl.getNewExternalChunk}} now, which is used by {{CellChunkImmutableSegment.copyCellIntoMSLAB}} to force copy a big cell into {{MemStoreLAB}},so the index type obviously should be {{IndexType.CHUNK_MAP}}. {code:java} 359 public Chunk getNewExternalChunk(ChunkCreator.ChunkType chunkType) { 360 switch (chunkType) { 361 case INDEX_CHUNK: 362 case DATA_CHUNK: 363 Chunk c = this.chunkCreator.getChunk(chunkType); 364 chunks.add(c.getId()); 365 return c; 366 case JUMBO_CHUNK: // a jumbo chunk doesn't have a fixed size 367 default: 368 return null; 369 } 370 } {code} Even though fortunately this IndexType mismatch does not cause bug now because {{MemStoreLABImpl.getNewExternalChunk}} is in fact not called by {{CellChunkImmutableSegment.copyCellIntoMSLAB}} when the cell is bigger than chunkSize. > Remove high-risk ChunkCreator.getChunk(ChunkType chunkType) method > ------------------------------------------------------------------ > > Key: HBASE-26567 > URL: https://issues.apache.org/jira/browse/HBASE-26567 > Project: HBase > Issue Type: Improvement > Components: in-memory-compaction > Affects Versions: 3.0.0-alpha-1, 2.4.8 > Reporter: chenglei > Priority: Major > > As HBASE-18375 said, data chunks in {{CellChunkMap}} is indexed by index > chunks by ChunkID ,not by strong reference, so in order to prevent the data > chunks in {{CellChunkMap}} from gc by JVM we put these data chunks in > {{ChunkCreator.chunkIdMap}} which is a the strong map. Whether or not the > {{ChunkCreator}} put the data chunk in {{ChunkCreator.chunkIdMap}} depends on > the data chunk is pooled or the index type of {{CompactingMemStore}} is > {{IndexType.CHUNK_MAP}}. So when we > call {{ChunkCreator.getChunk}}, it is very important to specify the > IndexType. {{ChunkCreator}} has following package-visible method, which does > not require the the {{IndexType}} and use {{IndexType.ARRAY_MAP}} > {code:java} > Chunk getChunk(ChunkType chunkType) { > return getChunk(CompactingMemStore.IndexType.ARRAY_MAP, chunkType); > } > {code} > This method is used by following line 363 in > {{MemStoreLABImpl.getNewExternalChunk}} now, which is used by > {{CellChunkImmutableSegment.copyCellIntoMSLAB}} to force copy a big cell into > {{MemStoreLAB}},so the index type obviously should be {{IndexType.CHUNK_MAP}}. > {code:java} > 359 public Chunk getNewExternalChunk(ChunkCreator.ChunkType chunkType) { > 360 switch (chunkType) { > 361 case INDEX_CHUNK: > 362 case DATA_CHUNK: > 363 Chunk c = this.chunkCreator.getChunk(chunkType); > 364 chunks.add(c.getId()); > 365 return c; > 366 case JUMBO_CHUNK: // a jumbo chunk doesn't have a fixed size > 367 default: > 368 return null; > 369 } > 370 } > {code} > Even though fortunately this IndexType mismatch does not cause bug now > because {{MemStoreLABImpl.getNewExternalChunk}} in fact is not called by > {{CellChunkImmutableSegment.copyCellIntoMSLAB}} when the cell is bigger than > chunkSize,but I think it is dangerous to use this -- This message was sent by Atlassian Jira (v8.20.1#820001)