[ https://issues.apache.org/jira/browse/HBASE-6752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540076#comment-13540076 ]

Gregory Chanan commented on HBASE-6752:
---------------------------------------

@nkeywal: I haven't studied anything in too much depth.

For the read part, my thought was to implement a config (in HTableDescriptor?) 
that would reject user-set timestamps on writes, so we know for sure there 
can't be any writes in the timestamp range that need to be replayed from the 
WAL.  I suspect there are other optimizations we could do with that 
information, but I haven't thought it through.
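
A minimal sketch of that check, written as a RegionObserver coprocessor (the 
class name and error message are mine, and coprocessor signatures vary across 
HBase versions, so treat this as an illustration rather than the proposed 
API):

    import java.io.IOException;
    import java.util.List;

    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.DoNotRetryIOException;
    import org.apache.hadoop.hbase.HConstants;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

    /** Rejects any Put whose cells carry a client-chosen timestamp. */
    public class RejectUserTimestampsObserver extends BaseRegionObserver {
      @Override
      public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                         Put put, WALEdit edit, Durability durability)
          throws IOException {
        // A cell stamped with anything other than LATEST_TIMESTAMP was
        // timestamped by the client; otherwise the server assigns the
        // timestamp at write time.
        for (List<Cell> cells : put.getFamilyCellMap().values()) {
          for (Cell cell : cells) {
            if (cell.getTimestamp() != HConstants.LATEST_TIMESTAMP) {
              throw new DoNotRetryIOException(
                  "user-set timestamps are disabled on this table");
            }
          }
        }
      }
    }

With that in place, timestamps follow write order, so a read whose time range 
predates the un-replayed edits can be served safely, which is the guarantee 
the read path needs.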

For writes, do you create a new WAL for the new writes that happen while the 
log is still replaying?  If so, management could be complicated, and it might 
make sense to have support for multiple WALs in place before tackling that.  
If not (you write to the same WAL), would that even work?  I guess you would 
want to avoid replaying the new writes (might be okay if all WAL updates are 
idempotent, but could be an issue if a lot of writes go in during the replay 
time).
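
On idempotence: the usual trick is to tag WAL edits with sequence ids and have 
replay skip anything at or below the region's last flushed sequence id. A 
self-contained toy sketch of that guard (the types here are made up, not HBase 
classes):

    import java.util.Arrays;
    import java.util.List;

    /** Toy model: replay is idempotent if already-flushed edits are skipped. */
    public class ReplaySketch {
      static final class WalEntry {
        final long seqId;
        final String edit;
        WalEntry(long seqId, String edit) { this.seqId = seqId; this.edit = edit; }
      }

      static void replay(List<WalEntry> recovered, long maxFlushedSeqId) {
        for (WalEntry e : recovered) {
          if (e.seqId <= maxFlushedSeqId) {
            continue; // already durable in HFiles; applying again is redundant
          }
          System.out.println("applying " + e.edit); // stand-in for the real apply
        }
      }

      public static void main(String[] args) {
        // Edit 5 was flushed before the crash, edit 9 was not: only 9 replays.
        replay(Arrays.asList(new WalEntry(5, "put A"),
                             new WalEntry(9, "put B")), 7);
      }
    }

That covers crash recovery, but as noted above it doesn't by itself solve new 
writes landing in the same WAL while it is being replayed.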
                
> On region server failure, serve writes and timeranged reads during the log 
> split
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-6752
>                 URL: https://issues.apache.org/jira/browse/HBASE-6752
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Priority: Minor
>
> Opening for writes on failure would mean:
> - Assign the region to a new regionserver. It marks the region as recovering
>   -- a specific exception is returned to the client when we cannot serve
>   -- allows them to know where they stand. The exception can include some 
> time information (failure started on: ...)
>   -- allows them to go immediately to the right regionserver, instead of 
> retrying or calling the server holding meta to get the new address
>      => saves network calls, lowers the load on meta.
> - Do the split as today. Priority is given to the region server holding the 
> new regions
>   -- helps share the load-balancing code: the split is done by a region 
> server considered available for new regions
>   -- helps locality (the recovered edits are available on the region server) 
> => lowers network usage
> - When the split is finished, we're done as of today
> - While the split is progressing, the region server can
>  -- serve writes
>    --- that's useful for all applications that need to write but not read 
> immediately:
>    --- anything that logs events to analyze them later
>    --- OpenTSDB is a perfect example.
>  -- serve reads if they have a compatible time range. For heavily used 
> tables, this could help, because:
>    --- we can expect only a few minutes of data to be unavailable (as it's 
> loaded)
>    --- the heaviest queries often accept a few (or more) minutes of delay.
> Some "what ifs":
> 1) The split fails
> => Retry until it works, as today. Just that we serve writes. We need to 
> know (as today) that the region has not recovered if we fail again.
> 2) The regionserver fails during the split
> => As 1, and as of today.
> 3) The regionserver fails after the split but before the state changes to 
> fully available
> => New assignment. More logs to split (the ones already done and the new 
> ones).
> 4) The assignment fails
> => Retry until it works, as today.
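
The "specific exception" the quoted description calls for could carry both the 
failure time and the new location; a hypothetical sketch (this class does not 
exist in HBase, the names are mine):

    import java.io.IOException;

    /** Hypothetical: tells the client when the failure started and where the
     *  region now lives, so it can go straight to the right regionserver. */
    public class RegionRecoveringException extends IOException {
      private final long failureStartedAtMillis; // "failure started on: ..."
      private final String newServerName;        // host:port of the new holder

      public RegionRecoveringException(long failureStartedAtMillis,
                                       String newServerName) {
        super("region recovering since " + failureStartedAtMillis
            + "; writes are accepted on " + newServerName);
        this.failureStartedAtMillis = failureStartedAtMillis;
        this.newServerName = newServerName;
      }

      public long getFailureStartedAtMillis() { return failureStartedAtMillis; }
      public String getNewServerName() { return newServerName; }
    }

A client catching this could compare the failure time against its query's 
time range to decide whether a read can be retried immediately or must wait 
for the split to finish.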

