So I think the first step is to introduce different implementations for
different scenarios.

And for replication, we could use your above approach to solve the problem
first, and then we could try to use Andrew's approach to optimize.

WDYT?

唐天航 <tangtianhang...@gmail.com> 于2022年3月18日周五 11:50写道:

> If we can accept the performance degradation in the Replication process,
> then I have a solution that can completely solve this problem based on the
> existing implementation.
> See ReaderBase#seek:
>
> @Override
> public void seek(long pos) throws IOException {
>   if (compressionContext != null && emptyCompressionContext) {
>     while (next() != null) {
>       if (getPosition() == pos) {
>         emptyCompressionContext = false;
>         break;
>       }
>     }
>   }
>   seekOnFs(pos);
> }
>
> The logic of this code is actually to build the LRUCache, but
> emptyCompressionContext is only true at initialization.
> We can modify it to ensure that the hfile is re-read and the LRUCache is
> rebuilded every time when we seek.
> Then replace all external calls to seekOnFs with this method.
>
> The consequence of doing this is a drop in Replication performance, but I
> think at least we can guarantee the correctness of LRUCache.
>
> Do you think this plan makes sense?
>
> (The code above is based on branch-1, but the logic on master is similar)
>
> 张铎(Duo Zhang) <palomino...@gmail.com> 于2022年3月17日周四 23:07写道:
>
> > I agree with Andrew that without a 'versioned' LRUCache, it is not easy
> to
> > implement things correctly. And yes it will impact performance if we
> > implement the 'versioned' logic, for example, using a buffer.
> >
> > But considering the real scenario, we do not always need to support
> > rollback LRUCache.
> > When writing WAL, which is on the critical write path, we do not need to
> > rollback the LRUCache.
> > And on reading WAL, basically we have two scenarios.
> > The first is splitting WAL, which will impact MTTR. But we can make sure
> > that we will only read closed WAL files when recovery, and even if we
> fail
> > in the middle, we can just fail the task, the master will schedule a new
> > splitting task and try again. So we do not need to implement rollback for
> > this scenario too.
> > The second is replication. I think this is the only place where we need
> to
> > implement rollback, as we will keep tailing the WAL file which is being
> > written currently. And for replication, the performance for reading WAL
> > is not very critical then, so I think it is OK to implement rollback for
> > this scenario.
> >
> > So basically, I think we could have different LRUCache implementations,
> and
> > also different reader implementations, to suit different scenarios, then
> we
> > could gain both correctness and performance.
> >
> > Thanks.
> >
> > 唐天航 <tangtianhang...@gmail.com> 于2022年3月17日周四 17:35写道:
> >
> > > Hi duo,
> > >
> > > I have submit a PR <https://github.com/apache/hbase/pull/4237> for the
> > > doc.
> > > Please kindly help me review it if it is convenient for you. Maybe need
> > > some polish.
> > >
> > > Thank you, Regards
> > >
> > >
> > > 张铎(Duo Zhang) <palomino...@gmail.com> 于2022年3月16日周三 18:36写道:
> > >
> > > > +1 on updating doc first. You can file an issue for the documentation
> > > > change, and let's also send an NOTICE email to both dev and user list
> > to
> > > > warn our users about this.
> > > >
> > > > 唐天航 <tangtianhang...@gmail.com> 于2022年3月16日周三 18:08写道:
> > > >
> > > > > If we only reset the position to the head, yes we can fix it.
> > > > > In fact, 26849 is to fix the problem in this scenario.
> > > > > But unfortunately, we have some other scenarios where we roll back
> > the
> > > > > position to some intermediate position, such as
> > > > ProtobufLogReader.java#L381
> > > > > <
> > > >
> > https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java
> > > > >
> /org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java#L381
> > > > > <
> > > >
> > >
> >
> https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java#L381
> > > > >
> > > > > >
> > > > > I think we cannot rollback the LRUCache too...
> > > > > While my cluster works fine after 26849, the fix is still
> > theoretically
> > > > > incomplete.
> > > > >
> > > > > 张铎(Duo Zhang) <palomino...@gmail.com> 于2022年3月16日周三 17:59写道:
> > > > >
> > > > > > The old WAL compression implementation is buggy when used
> together
> > > with
> > > > > > replication, that's true...
> > > > > >
> > > > > > But in general I think it is fixable, the dict is per file IIRC,
> > so I
> > > > > think
> > > > > > clearing the LRUCache when resetting to the head of the file can
> > fix
> > > > the
> > > > > > problem?
> > > > > >
> > > > > > Maybe we need to do some refactoring...
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > 唐天航 <tangtianhang...@gmail.com> 于2022年3月16日周三 16:20写道:
> > > > > >
> > > > > > > Hi masters,
> > > > > > >
> > > > > > > I have created an issue HBASE-26849
> > > > > > > <https://issues.apache.org/jira/browse/HBASE-26849> about NPE
> > > caused
> > > > > by
> > > > > > > WAL
> > > > > > > Compression and Replication.
> > > > > > >
> > > > > > > For this problem, I try to reopen a WAL reader when we reset
> the
> > > > > position
> > > > > > > to 0 and it looks like it's working well. But it didn't
> > > fundamentally
> > > > > > solve
> > > > > > > the problem.
> > > > > > >
> > > > > > > Since we have the WAL Compression feature, Replication has
> > > > introduced a
> > > > > > lot
> > > > > > > of new code, and there are many places that reset the HLog
> > > position,
> > > > > such
> > > > > > > as seekOnFs to originalPosition. I guess none of these codes
> > > consider
> > > > > > > compatibility with WAL Compression. Because theoretically we
> can
> > > roll
> > > > > > back
> > > > > > > the position to any position at any time, but the LRUCache in
> the
> > > > > > > corresponding LRUDictionary should also be rolled back,
> otherwise
> > > the
> > > > > > read
> > > > > > > and write link behavior may be inconsistent. But LRUCache can't
> > > roll
> > > > > back
> > > > > > > at all...
> > > > > > >
> > > > > > > So my thought is, open another issue and add some description
> in
> > > the
> > > > > doc,
> > > > > > > WAL Compression and Replication are not compatible.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > Thank you. Regards
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to