[ 
https://issues.apache.org/jira/browse/HBASE-23326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997737#comment-16997737
 ] 

Duo Zhang commented on HBASE-23326:
-----------------------------------

{quote}
Thanks for putting up the doc. Can we have 'comment' access? Here are a few 
notes in meantime (some carried over from github comments):
{quote}

Done.

{quote}
On layout, could have a 'master' namespace so master Region is in same place in 
filesystem (Might be too awkward excluding this namespace from consideration in 
general processing).

Or, here you make a MasterProcs directory. Old system had a MasterProcWALs dir. 
Make instead a generic 'master' dir at top-level into which we put all stuff 
master wants to persist to filesystem of which these new procedures WALs would 
be first.
{quote}

The design here is to make the procedure store self-managed. IMO, most data 
should be stored in hbase:meta or other system tables. We need this special 
store because initializing and assigning meta depend on the procedure 
framework, so we cannot use hbase:meta to store these things. This should not 
be a common case, so let's isolate it from the normal region store.

{quote}
You cannot pass a RegionServerServices that has the special implementations of 
flush/compaction/rolling? Just to minimize how this Region implementation 
deviates from the norm.
{quote}

It is a 'local' HRegion, so in general it should not have a 
RegionServerServices along with it. In fact, if we pass a RegionServerServices 
in, lots of other features will be activated, such as quota, metrics, etc. This 
will cause problems because, if we do not enable tables on master, some of the 
components are not initialized. Of course, metrics are useful here, but they 
can be a follow-on.
In general, I think we can do some refactoring on HRegion to decouple it from 
RegionServerServices, and for the optional features we can add some interfaces 
to make them pluggable; then the code here will be cleaner.
But anyway, as said above, this should not be a common case, so there is no 
need to hurry on the refactoring.
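
To make the 'pluggable' idea a bit more concrete, here is a rough sketch. None 
of these names exist in HBase: RegionFeature, WriteCountMetrics and 
PluggableRegion are all hypothetical, and the real HRegion is much more 
involved. It only shows the shape: optional features are handed in through an 
interface, so a 'local' region can simply pass none.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the HBase API.

/** An optional feature that a region notifies on writes. */
interface RegionFeature {
  void onWrite(String row);
}

/** Example optional feature: counts writes, standing in for metrics. */
class WriteCountMetrics implements RegionFeature {
  private long writes;

  @Override
  public void onWrite(String row) {
    writes++;
  }

  long writes() {
    return writes;
  }
}

/** A region that works with zero or more optional features plugged in. */
class PluggableRegion {
  private final List<RegionFeature> features;
  private final List<String> rows = new ArrayList<>();

  PluggableRegion(List<RegionFeature> features) {
    this.features = new ArrayList<>(features);
  }

  void put(String row) {
    rows.add(row);
    // Only the features that were plugged in get activated.
    for (RegionFeature feature : features) {
      feature.onWrite(row);
    }
  }

  int rowCount() {
    return rows.size();
  }
}
```

With this shape, the procedure store would construct the region with an empty 
feature list now, and metrics could be added later as one more feature without 
pulling in quota or the rest of RegionServerServices.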

{quote}
WAL dirs will be deleted/cleaned up after WALs are moved to recovered.edits? 
There'll be no accumulation of WALs? What about archiving? Peter figured how to 
get the MasterProcWALs into the general WAL archive. Maybe no archiving of 
these WALs?
{quote}

I do not think we can archive the WALs to the general place because, if we 
enable region on master, it will mess things up. I haven't taken care of this 
part yet; the intention is to just delete the WALs in the first version. Later 
we could implement our own archiving logic, which I think is easy. Anyway, the 
design here is to be self-managed. As for tracing problems, if we assume that 
the HRegion and WAL framework are fine (if they are not, we should find out on 
the normal read/write path), then the problem should be in our code that 
reads/writes to the HRegion. So maybe we could enable multiple versions and 
keep deleted cells on this region, to make it more debug-friendly.

{quote}
For recovered.edits, their content is supposed to be 'sorted'. When we move WALs 
to recovered.edits, they will be 'sorted' because we write in procedure order? 
Is there anything we need to do to ensure edits go into the WAL 'ordered'?
{quote}

Technically, they do not need to be 'sorted'. As there are sequence ids in the 
WALEntry and we do not do compaction when replaying, order is not important. 
We make them sorted for performance. When splitting, we know all the sequence 
ids of the WALEntries contained in a split WAL file, so we can just name it 
with the sequence ids. Then when replaying, we can quickly filter out the 
unnecessary WAL files. But here, since we do not need to split, it is not 
necessary to read the files again and rename them...
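
To illustrate why order does not matter when every entry carries a sequence 
id, here is a toy model. It is not HBase code: Edit and SeqIdReplay are 
hypothetical, and it assumes the simplification that, per row, the entry with 
the highest sequence id wins. Under that assumption, replaying the same edits 
in any order gives the same final state.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model, not HBase code: every name here is hypothetical.
class SeqIdReplay {
  /** A simplified WAL entry: one row, one value, one sequence id. */
  static final class Edit {
    final String row;
    final long seqId;
    final String value;

    Edit(String row, long seqId, String value) {
      this.row = row;
      this.seqId = seqId;
      this.value = value;
    }
  }

  /** Replays edits in the given order; per row, the highest sequence id wins. */
  static Map<String, String> replay(List<Edit> edits) {
    Map<String, Edit> latest = new HashMap<>();
    for (Edit edit : edits) {
      Edit current = latest.get(edit.row);
      if (current == null || edit.seqId > current.seqId) {
        latest.put(edit.row, edit);
      }
    }
    // Flatten to row -> value so two replays can be compared directly.
    Map<String, String> state = new HashMap<>();
    for (Edit edit : latest.values()) {
      state.put(edit.row, edit.value);
    }
    return state;
  }
}
```

Calling replay on a list and on any reordering of that list returns equal 
maps, which is the property relied on above.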

And that's why I use a different directory name for these WAL files. You can 
see the modification in HRegion: I added a special config to specify the 
special directory in which to place recovered.edits. If this option is set, 
then the logic is a bit different: we will not filter out any WAL files, and we 
do not check their name pattern.
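
Roughly, the two modes look like the sketch below. This is not the actual 
HRegion code: RecoveredEditsFilter and its methods are hypothetical, and it 
assumes a split file is named with the highest sequence id it contains. In 
the normal mode, a file is skipped when its name is not a sequence id or when 
everything in it has already been flushed; in the special directory mode, 
every file is replayed and the name is not looked at.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the HRegion implementation.
class RecoveredEditsFilter {
  /** Returns the sequence id encoded in the file name, or -1 if there is none. */
  static long maxSeqIdOf(String fileName) {
    try {
      return Long.parseLong(fileName);
    } catch (NumberFormatException e) {
      return -1L;
    }
  }

  /**
   * Picks the files to replay.
   *
   * @param fileNames    names of the files found in the edits directory
   * @param flushedSeqId highest sequence id already persisted by a flush
   * @param filterByName true for the normal path, false for the special directory
   */
  static List<String> filesToReplay(List<String> fileNames, long flushedSeqId,
      boolean filterByName) {
    List<String> result = new ArrayList<>();
    for (String name : fileNames) {
      if (!filterByName) {
        // Special directory: no name check and no filtering.
        result.add(name);
        continue;
      }
      // Normal path: replay only files that may hold unflushed edits.
      if (maxSeqIdOf(name) > flushedSeqId) {
        result.add(name);
      }
    }
    return result;
  }
}
```

The cost of the second mode is that every file has to be opened, which is 
acceptable here because we skip the read-and-rename step after moving the WALs.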

Thanks.

> Implement a ProcedureStore which stores procedures in a HRegion
> ---------------------------------------------------------------
>
>                 Key: HBASE-23326
>                 URL: https://issues.apache.org/jira/browse/HBASE-23326
>             Project: HBase
>          Issue Type: Improvement
>          Components: proc-v2
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>
> So we can reuse the code in HRegion for persisting the procedures, and also 
> the optimized WAL implementation for better performance.
> This requires we merge the hbase-procedure module to hbase-server, which is 
> an anti-pattern as we make the hbase-server module more overloaded. But I 
> think later we can first try to move the WAL stuff out.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
