[jira] [Commented] (HBASE-6980) Parallel Flushing Of Memstores

Karthik Ranganathan (JIRA) Tue, 16 Oct 2012 10:35:06 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13477194#comment-13477194
 ]


Karthik Ranganathan commented on HBASE-6980:
--------------------------------------------

@ramakrishna - this should not be necessary to prevent data loss, right?
Once we have a snapshot memstore, we should automatically know the max seq id
up to which it has data, and that would never change.

1. From what I remember of the code (when I was looking into something
unrelated), we track the *min* seq id from the current memstore instead of the
max seq id from the snapshot memstore to put into the HLog when it's rolled
after a flush. That is why this synchronization becomes necessary. If we store
the max seq id along with the memstore that is flushed, we should be able to
eliminate the locks.

2. Also, it's arguable whether we need the exactly correct max-seq-id flushed.
In a very small % of cases, we would end up rolling logs a bit slower. As long
as we are conservative with updating the max seq id in the HLog we should be
good, right?
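
To make point 1 concrete, here is a minimal sketch of a snapshot that carries
its own max seq id. All class and method names below are hypothetical and are
not HBase's actual MemStore or HLog API; the only point is that the value is
fixed at snapshot time, so a flusher can read it without any lock on the
active memstore.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical sketch, not HBase code: a memstore whose snapshot records the
// highest sequence id it contains at the moment the snapshot is taken.
class SnapshotSeqIdSketch {
    static final long NO_SEQ_ID = -1L;

    /** Immutable snapshot: its contents and max seq id never change. */
    static final class Snapshot {
        final SortedMap<String, String> cells;
        final long maxSeqId;

        Snapshot(SortedMap<String, String> cells, long maxSeqId) {
            this.cells = Collections.unmodifiableSortedMap(cells);
            this.maxSeqId = maxSeqId;
        }
    }

    static final class MemStore {
        private TreeMap<String, String> active = new TreeMap<>();
        private long activeMaxSeqId = NO_SEQ_ID;

        /** Adds an edit and remembers the highest seq id seen so far. */
        synchronized void put(String key, String value, long seqId) {
            active.put(key, value);
            activeMaxSeqId = Math.max(activeMaxSeqId, seqId);
        }

        /** Swaps in a fresh active map and hands back the old one with its max seq id. */
        synchronized Snapshot snapshot() {
            Snapshot snap = new Snapshot(active, activeMaxSeqId);
            active = new TreeMap<>();
            activeMaxSeqId = NO_SEQ_ID;
            return snap;
        }
    }

    public static void main(String[] args) {
        MemStore memstore = new MemStore();
        memstore.put("row1", "a", 5L);
        memstore.put("row2", "b", 7L);
        Snapshot snap = memstore.snapshot();
        // Later edits go to the new active map and do not affect the snapshot.
        memstore.put("row3", "c", 9L);
        System.out.println("snapshot max seq id = " + snap.maxSeqId);
    }
}
```

A flusher thread would write out `snap.cells` and then report `snap.maxSeqId`
to whatever tracks flushed edits for log rolling. How that tracker stays
conservative when several flushes finish out of order is the question in
point 2, and is not addressed by this sketch.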
                
> Parallel Flushing Of Memstores
> ------------------------------
>
>                 Key: HBASE-6980
>                 URL: https://issues.apache.org/jira/browse/HBASE-6980
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Kannan Muthukkaruppan
>            Assignee: Kannan Muthukkaruppan
>
> For write-dominated workloads, single-threaded memstore flushing is an
> unnecessary bottleneck. With a single flusher thread, we are basically not
> set up to take advantage of the aggregate throughput that multi-disk nodes
> provide.
> * For puts with WAL enabled, the bottleneck is more likely the "single" WAL 
> per region server. So this particular fix may not buy as much unless we 
> unlock that bottleneck with multiple commit logs per region server. (Topic 
> for a separate JIRA-- HBASE-6981).
> * But for puts with WAL disabled (e.g., when using HBASE-5783 style fast bulk 
> imports), we should be able to support much better ingest rates with parallel 
> flushing of memstores.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
