[jira] [Commented] (HBASE-20952) Re-visit the WAL API

Duo Zhang (JIRA) Sat, 15 Sep 2018 02:08:42 -0700


    [ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16616239#comment-16616239
 ]


Duo Zhang commented on HBASE-20952:
-----------------------------------

First, WALSplitter is not a separate topic; it is at the core of HBase. You can 
disable replication, but you cannot disable WAL splitting...

And I think your approach sounds good. There is a register method (or 
initialize? Or just do it in the constructor, not critical): when the RS starts 
we call it to get the permit to write to the log system. In the FileSystem based 
log system, it is just the creation of a directory, and for other log systems it is  
And when the master thinks the RS is dead, we call a disable method, which 
prevents further appending. For FileSystem this is done by renaming and 
recoverLease, and for other log systems I think there are ways to do this.
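
To make the register/disable idea concrete, here is a minimal in-memory sketch. 
All names in it (LogSystem, register, disable, append, InMemoryLogSystem) are 
hypothetical and are not existing HBase APIs; a real FileSystem based 
implementation would create a directory on register and use rename plus 
recoverLease on disable, as described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of permit-style fencing; not an existing HBase API. */
final class WalFencingSketch {

  interface LogSystem {
    /** Called when a region server starts; grants the permit to append. */
    void register(String serverName);

    /** Called once the master considers the server dead; fences later appends. */
    void disable(String serverName);

    /** Appends an edit; fails if the server is unregistered or fenced. */
    void append(String serverName, String edit);
  }

  /** In-memory stand-in for a real log system, for illustration only. */
  static final class InMemoryLogSystem implements LogSystem {
    private final Map<String, List<String>> logs = new HashMap<>();
    private final Set<String> fenced = new HashSet<>();

    @Override
    public void register(String serverName) {
      if (fenced.contains(serverName)) {
        throw new IllegalStateException(serverName + " has been fenced");
      }
      logs.putIfAbsent(serverName, new ArrayList<>());
    }

    @Override
    public void disable(String serverName) {
      fenced.add(serverName);
    }

    @Override
    public void append(String serverName, String edit) {
      List<String> log = logs.get(serverName);
      if (log == null || fenced.contains(serverName)) {
        throw new IllegalStateException(serverName + " may not append");
      }
      log.add(edit);
    }

    /** Returns a copy of the edits accepted so far for the given server. */
    List<String> edits(String serverName) {
      List<String> copy = new ArrayList<>();
      List<String> log = logs.get(serverName);
      if (log != null) {
        copy.addAll(log);
      }
      return copy;
    }
  }

  public static void main(String[] args) {
    InMemoryLogSystem log = new InMemoryLogSystem();
    log.register("rs1");
    log.append("rs1", "edit-1");
    log.disable("rs1");
    try {
      log.append("rs1", "edit-2");
    } catch (IllegalStateException e) {
      System.out.println("rejected after disable: " + e.getMessage());
    }
  }
}
```

The point of the sketch is only that fencing lives behind the log system 
interface, so the caller never needs to know whether it is a rename or 
something else.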

And I agree that we should have different WAL splitters for different WAL 
systems. For FileSystem, this may be done by splitting WAL files into several 
recovered edits files in the region directories, and for other log systems we 
could use different ways. But the key point here is that, when opening a region, 
we need to know whether there are recovered edits and scan them to reconstruct 
the memstore. So I think we should add another method to the WAL system, which 
is used to get the recovered edits for a region when opening it. IIRC 
[~zyork] is working on deploying HBase on S3 and was wrestling with whether the 
recovered edits directory should be on S3 or HDFS. I do not know what the final 
solution was, but after the discussion here, I think it should be on HDFS, not S3?

So I think we will add two methods to the WAL system here: one for splitting 
the WAL of a region server, and the other for getting the recovered edits for a 
region. If the implementation is WAL per region, then the split method is just 
a dummy one that does nothing; otherwise you still need to do something to 
produce separate WALs for different regions. And if split is too heavy, you 
could do the filtering when getting recovered edits? Not sure, maybe.
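
A rough sketch of what these two methods could look like, with one shared-log 
implementation and one WAL-per-region implementation where split is a no-op. 
The names (RecoverableLog, split, getRecoveredEdits, SharedLog, PerRegionLog, 
Edit) are made up for illustration and are not existing HBase APIs; storage is 
in memory only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of split + getRecoveredEdits; not an existing HBase API. */
final class WalSplitSketch {

  static final class Edit {
    final String region;
    final long seqId;
    final String payload;

    Edit(String region, long seqId, String payload) {
      this.region = region;
      this.seqId = seqId;
      this.payload = payload;
    }
  }

  interface RecoverableLog {
    void append(String serverName, Edit edit);

    /** Splits the log of a dead region server into per-region recovered edits. */
    void split(String serverName);

    /** Returns the recovered edits to replay when opening the given region. */
    List<Edit> getRecoveredEdits(String region);
  }

  private static List<Edit> copyOf(Map<String, List<Edit>> map, String key) {
    List<Edit> copy = new ArrayList<>();
    List<Edit> edits = map.get(key);
    if (edits != null) {
      copy.addAll(edits);
    }
    return copy;
  }

  /** One log per region server: split must regroup the edits by region. */
  static final class SharedLog implements RecoverableLog {
    private final Map<String, List<Edit>> byServer = new HashMap<>();
    private final Map<String, List<Edit>> recovered = new HashMap<>();

    @Override
    public void append(String serverName, Edit edit) {
      byServer.computeIfAbsent(serverName, k -> new ArrayList<>()).add(edit);
    }

    @Override
    public void split(String serverName) {
      List<Edit> log = byServer.remove(serverName);
      if (log == null) {
        return;
      }
      for (Edit e : log) {
        recovered.computeIfAbsent(e.region, k -> new ArrayList<>()).add(e);
      }
    }

    @Override
    public List<Edit> getRecoveredEdits(String region) {
      return copyOf(recovered, region);
    }
  }

  /** One log per region: the edits are already separated, so split does nothing. */
  static final class PerRegionLog implements RecoverableLog {
    private final Map<String, List<Edit>> byRegion = new HashMap<>();

    @Override
    public void append(String serverName, Edit edit) {
      byRegion.computeIfAbsent(edit.region, k -> new ArrayList<>()).add(edit);
    }

    @Override
    public void split(String serverName) {
      // Intentionally empty: nothing to regroup in a WAL-per-region layout.
    }

    @Override
    public List<Edit> getRecoveredEdits(String region) {
      return copyOf(byRegion, region);
    }
  }
}
```

The region open path would then only call getRecoveredEdits and would not care 
which of the two layouts is underneath. A real per-region log would also need 
to skip edits that were already flushed, which this sketch ignores.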

And for replication, the phrase 'subscribe/replay' above inspires me. 
Replication is just another subscriber of the WALs, right? It receives the WAL 
entries for specific tables (regions), and then sends them to the remote 
clusters. So I think we could introduce subscribe/consume style APIs for the WAL 
system; then the implementation of replication will be straightforward. I do not 
care whether they are WAL files or some topics on Kafka, just give me the stream 
to read! And the FileSystem related code in the replication framework will also 
be moved into the WAL system. You can see in the code that we just use ZooKeeper 
to record the unconsumed WAL files, and try to locate them on the FileSystem as 
they may have been moved to oldWALs. It is just a basic subscribe/consume 
framework, I think.
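
As an illustration only, a subscribe/consume style API could be as small as the 
following. The names (InMemoryWal, Subscription, poll, commit, Entry) are 
hypothetical and not existing HBase APIs; the committed position here plays the 
role that the ZooKeeper record of unconsumed WAL files plays today, and the 
list of entries stands in for WAL files or a Kafka topic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a subscribe/consume WAL API; not an existing HBase API. */
final class WalSubscribeSketch {

  static final class Entry {
    final long offset;
    final String table;
    final String payload;

    Entry(long offset, String table, String payload) {
      this.offset = offset;
      this.table = table;
      this.payload = payload;
    }
  }

  interface Subscription {
    /** Returns up to max matching entries after the last committed offset. */
    List<Entry> poll(int max);

    /** Marks everything up to and including offset as consumed. */
    void commit(long offset);
  }

  /** In-memory stand-in for the log; single-threaded, for illustration only. */
  static final class InMemoryWal {
    private final List<Entry> entries = new ArrayList<>();

    long append(String table, String payload) {
      long offset = entries.size();
      entries.add(new Entry(offset, table, payload));
      return offset;
    }

    /** Subscribes to entries of the given tables, starting from the beginning. */
    Subscription subscribe(Set<String> tables) {
      final Set<String> wanted = new HashSet<>(tables);
      return new Subscription() {
        private int next = 0;

        @Override
        public List<Entry> poll(int max) {
          List<Entry> batch = new ArrayList<>();
          for (int i = next; i < entries.size() && batch.size() < max; i++) {
            Entry e = entries.get(i);
            if (wanted.contains(e.table)) {
              batch.add(e);
            }
          }
          return batch;
        }

        @Override
        public void commit(long offset) {
          next = Math.max(next, (int) offset + 1);
        }
      };
    }
  }

  public static void main(String[] args) {
    InMemoryWal wal = new InMemoryWal();
    wal.append("t1", "put-a");
    wal.append("t2", "put-b");
    Subscription replication = wal.subscribe(new HashSet<>(Arrays.asList("t1")));
    for (Entry e : replication.poll(10)) {
      System.out.println("ship to peer: " + e.payload);
      replication.commit(e.offset);
    }
  }
}
```

Because poll does not advance the position by itself, entries that were read 
but not committed are handed out again on the next poll, which gives the 
subscriber at-least-once delivery.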

And for sync replication, I think we should make it work with different WAL 
implementations. This is another story and I will keep tracking it. To be 
honest I do not know the solution yet, but I'm optimistic.

So in general, I think the problem with the current WAL abstraction is that the 
line is drawn too low. We should cut it at a higher place, so that fencing, log 
splitting, and reading recovered edits are all included in it; right now lots 
of that code is outside the WAL system. Thanks [~sergey.soldatov], your 
post really helps.

> Re-visit the WAL API
> --------------------
>
>                 Key: HBASE-20952
>                 URL: https://issues.apache.org/jira/browse/HBASE-20952
>             Project: HBase
>          Issue Type: Sub-task
>          Components: wal
>            Reporter: Josh Elser
>            Priority: Major
>         Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup&restore. Replication has the use-case for "tail"'ing the WAL which we 
> should provide via our new API. B&R doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}} should also be looked at to use WAL-APIs only).
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
