[ 
https://issues.apache.org/jira/browse/HBASE-18201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527252#comment-16527252
 ] 

Kuan-Po Tseng commented on HBASE-18201:
---------------------------------------

Thanks for reviewing, Reid Chan.
{quote}I think adjusting the call order as follows should work. There is no 
need to add another if branch, which is kind of confusing.
{code:java}
this.dataBlockEncoder.endBlockEncoding(encodingCtx, out, baosBytes);
baos.flush();
baosBytes = baos.toByteArray();
{code}
{quote}
The problem is that ROW_INDEX_V1 extends a different class: its 
#endBlockEncoding writes (int) onDiskDataSize to the OutputStream, while the 
other encoders write it into the byte array underlying the OutputStream. If we 
call #endBlockEncoding first and then #flush, ROW_INDEX_V1 works fine, but with 
the other encoders the byte array becomes 
{(int) onDiskDataSize, byte, byte, ..., byte}, because they write 
(int) onDiskDataSize into the byte array first and then flush all the data. The 
correct order is {byte, byte, ..., (int) onDiskDataSize}; 
(int) onDiskDataSize should come last.
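
To illustrate the ordering issue in isolation, here is a toy sketch that uses 
only java.io streams. It is not the HBase code: the class FlushOrderSketch and 
both method names are hypothetical, and modelling the pending data with a 
BufferedOutputStream is an assumption made purely for illustration. It shows 
that writing the size to the underlying stream before the pending data is 
flushed puts the size first, while flushing before writing puts it last.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Toy model only; none of these names exist in HBase.
public class FlushOrderSketch {

  // Size is written to the underlying stream while the data is still buffered.
  static byte[] sizeBeforeFlush(byte[] data, int size) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    BufferedOutputStream buffered = new BufferedOutputStream(baos);
    buffered.write(data);                       // small write stays in the buffer
    new DataOutputStream(baos).writeInt(size);  // size reaches baos first
    buffered.flush();                           // data lands after the size
    return baos.toByteArray();
  }

  // Data is flushed first, then the size is appended.
  static byte[] flushBeforeSize(byte[] data, int size) throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    BufferedOutputStream buffered = new BufferedOutputStream(baos);
    buffered.write(data);
    buffered.flush();                           // data reaches baos first
    new DataOutputStream(baos).writeInt(size);  // size comes last
    return baos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] data = {1, 2, 3};
    // prints [0, 0, 0, 3, 1, 2, 3]
    System.out.println(Arrays.toString(sizeBeforeFlush(data, data.length)));
    // prints [1, 2, 3, 0, 0, 0, 3]
    System.out.println(Arrays.toString(flushBeforeSize(data, data.length)));
  }
}
```

The second layout, with the size as the trailing int, corresponds to the 
{byte, byte, ..., (int) onDiskDataSize} order described above.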

Could we add useTag = currentKV.getTagsLength() > 0 in the while loop above? 
Once it is set to true, the rest no longer need to be checked.
{quote}
{code:java}
HStoreFile hsf = new HStoreFile(fs, path, conf, cacheConf, BloomType.NONE, 
true);
StoreFileReader reader = hsf.getReader();
boolean useTag = reader.getHFileReader().getFileContext().isIncludesTags();
{code}
It is kind of heavy to create an HStoreFile instance just to use its 
isIncludesTags method.
{quote}
Sorry, I didn't explain this clearly. An HStoreFile instance is already created 
in #testCodecs, which runs before #checkStatistics, so we could check whether 
useTag is true in #testCodecs instead of creating a new instance.
{quote}
{code:java}
// DataBlockEncodingTool#checkStatistics
rawKVs = uncompressedOutputStream.toByteArray();
{code}
I doubt these are the real rawKVs, since I see nothing about writing tags (if a KV has any).
{quote}
Pardon me, do you mean that rawKVs may not be the real raw KVs because 
#checkStatistics doesn't write tags to rawKVs?

> add UT and docs for DataBlockEncodingTool
> -----------------------------------------
>
>                 Key: HBASE-18201
>                 URL: https://issues.apache.org/jira/browse/HBASE-18201
>             Project: HBase
>          Issue Type: Sub-task
>          Components: tooling
>            Reporter: Chia-Ping Tsai
>            Assignee: Kuan-Po Tseng
>            Priority: Minor
>              Labels: beginner
>         Attachments: HBASE-18201.master.001.patch, 
> HBASE-18201.master.002.patch, HBASE-18201.master.002.patch, 
> HBASE-18201.master.003.patch
>
>
> There are no examples, documentation, or tests for DataBlockEncodingTool. We 
> should make it user-friendly if any use case exists. Otherwise, we should just 
> get rid of it, because DataBlockEncodingTool presumes that the cell 
> implementation returned from DataBlockEncoder is KeyValue. This presumption 
> may obstruct the cleanup of KeyValue references in the read/write path of the 
> code base.



