[ 
https://issues.apache.org/jira/browse/HBASE-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034172#comment-14034172
 ] 

Jonathan Hsieh commented on HBASE-11339:
----------------------------------------

>>We could avoid extra writes by just writing to a separate LOB log/file. Was 
>>this considered?
>It was considered. But we didn't find a good solution for this.
...
>  You mean to write the WAL by stores? If we use the HLog as the Lob files 
> directly, is it efficient to seek a KV in it? I don't think so.

I'm not convinced.  The idea I'm suggesting is a special lob log file that is 
written once, at write time, and that essentially serves as the lob store 
files described in the doc.  The normal wal would then hold only a reference 
to it (file name and offset).  This allows the lob to be written only once.  
I don't see how this would be less efficient than an approach that must write 
the values out at least twice.
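
A minimal sketch of the idea above, assuming an append-only local file stands 
in for the lob log and a plain list stands in for the normal wal.  The names 
LobRef, LobLog and put are hypothetical and are not part of HBase or of the 
design doc; the point is only that the value is written once and the wal 
carries a (file name, offset, length) reference.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class LobRef:
    """Hypothetical pointer stored in the normal wal instead of the value."""
    file_name: str
    offset: int
    length: int


class LobLog:
    """Hypothetical append-only lob log; each value is written exactly once."""

    def __init__(self, path: str):
        self.path = path
        self._fh = open(path, "ab")

    def append(self, value: bytes) -> LobRef:
        # Record where the value starts, then append it to the end of the file.
        self._fh.seek(0, os.SEEK_END)
        offset = self._fh.tell()
        self._fh.write(value)
        self._fh.flush()
        return LobRef(self.path, offset, len(value))

    def read(self, ref: LobRef) -> bytes:
        # A reference gives a direct seek position, so no scan is needed.
        with open(ref.file_name, "rb") as fh:
            fh.seek(ref.offset)
            return fh.read(ref.length)

    def close(self) -> None:
        self._fh.close()


def put(wal: list, lob_log: LobLog, row: bytes, value: bytes) -> LobRef:
    """Write the value once to the lob log and only a reference to the wal."""
    ref = lob_log.append(value)
    wal.append((row, ref))
    return ref


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    log = LobLog(os.path.join(tmp_dir, "lob.log"))
    wal = []
    ref = put(wal, log, b"row1", b"x" * 1024)
    print(ref.offset, ref.length, len(log.read(ref)))
    log.close()
```

Because the reference carries the offset, reading a value back is a single 
seek rather than a scan of the log, which is the efficiency concern raised in 
the quoted reply.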

>>5MB cells are large but aren't really that big. Maybe this should just be 
>>"blobs" (binary large objects) or "mobs" (medium objects)? the objects being 
>>immutable is important too
>Actually the Lobs could be mutable. The Lobs that are not used anymore will be 
>handled by the Sweep Tool.

When I say mutable I mean that I can modify a particular byte in the lob 
without having to "overwrite" the previous lob with a whole new lob.  I don't 
think the proposed design can modify a few bytes in a large blob without 
rewriting the entire lob.

> Good idea. But In this way, all the actions occurs in the client, each client 
> writes a new file in HDFS. It's hard to control the file size which 
> consequently leads to too many small files in HDFS probably.

I agree about the HDFS small files problem, but I think we need to properly 
define what a LOB is and the scope of this effort (hence my suggestion of 
Medium Objects -- MOBs).  

Consider storing and shipping really large objects (say 100s of MBs or GBs).  
Here HBase's API is insufficient.  We'd want a streaming API for that, or to 
allow the client to go to the file system directly (which may be a security 
concern for some users).  

Consider storing and shipping moderately sized objects (say 100s of KBs to 
10 MB).  HBase's API is still sufficient, but we'd want to avoid the write 
amplification problem.  The proposed design does this, but I think it could go 
further and avoid a 2x write amplification if we handled it in the logging 
portion of the write path as opposed to the flushing part of the write path.
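
As a rough illustration of the 2x figure above, the following sketch compares 
bytes written under the two approaches.  It assumes every value is written in 
full to the wal and again at flush time in the flush-path case, and that a wal 
reference costs a fixed 64 bytes in the log-path case.  The 64-byte reference 
size and the function names are illustrative assumptions, and later 
compactions are ignored.

```python
def bytes_written_flush_path(value_size: int, n_values: int) -> int:
    """Each value goes to the wal, then again to a lob file at flush time."""
    return 2 * value_size * n_values


def bytes_written_log_path(value_size: int, n_values: int,
                           ref_size: int = 64) -> int:
    """Each value goes to the lob log once; the wal holds a small reference.

    ref_size is an assumed, illustrative size for (file name, offset).
    """
    return (value_size + ref_size) * n_values


def amplification(total_bytes: int, value_size: int, n_values: int) -> float:
    """Bytes physically written per byte of user data."""
    return total_bytes / (value_size * n_values)


if __name__ == "__main__":
    size, count = 1_000_000, 10  # ten values of about 1 MB each
    for name, fn in (("flush path", bytes_written_flush_path),
                     ("log path", bytes_written_log_path)):
        total = fn(size, count)
        print(name, total, amplification(total, size, count))
```

Under these assumptions the reference overhead shrinks as values grow, so for 
moderately sized objects the log-path amplification stays close to 1x.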

I'm under the impression we are solving the latter case here.  Is that correct?

> HBase LOB
> ---------
>
>                 Key: HBASE-11339
>                 URL: https://issues.apache.org/jira/browse/HBASE-11339
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver, Scanners
>            Reporter: Jingcheng Du
>            Assignee: Jingcheng Du
>         Attachments: HBase LOB Design.pdf
>
>
>   It's quite useful to save massive binary data such as images and documents 
> in Apache HBase. Unfortunately, directly saving binary LOBs (large objects) 
> to HBase leads to worse performance because of frequent splits and 
> compactions.
>   In this design, the LOB data are stored in a more efficient way, which 
> keeps high write/read performance and guarantees data consistency in 
> Apache HBase.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
