>
> Maybe we need to do some refactoring...
>
I couldn't agree more...
But before that, I think we'd better point out this compatibility issue
explicitly in our documentation.
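
To make the issue concrete for the doc, here is a minimal sketch in Java.
It is a toy model under stated assumptions, not HBase code: ToyDict, Token,
compress and read are hypothetical names, and the toy dictionary never
evicts, unlike the real LRUDictionary. It only illustrates why clearing the
dictionary works when we reset to the head of the file, and why a seek back
to an intermediate position leaves the reader's dictionary out of step with
the writer's.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WalDictRollbackSketch {
    /** Toy per-file dictionary (hypothetical; no eviction, unlike LRUDictionary). */
    static final class ToyDict {
        private final List<String> entries = new ArrayList<>();
        int find(String s) { return entries.indexOf(s); }
        void add(String s) { entries.add(s); }
        String get(int idx) { return idx < entries.size() ? entries.get(idx) : null; }
        void clear() { entries.clear(); }
    }

    /** Either a literal (index == -1) or a reference into the dictionary. */
    static final class Token {
        final String literal;
        final int index;
        Token(String literal, int index) { this.literal = literal; this.index = index; }
    }

    /** Writer side: the first occurrence is a literal, repeats become an index. */
    static List<Token> compress(List<String> values) {
        ToyDict dict = new ToyDict();
        List<Token> out = new ArrayList<>();
        for (String v : values) {
            int idx = dict.find(v);
            if (idx < 0) {
                dict.add(v);
                out.add(new Token(v, -1));
            } else {
                out.add(new Token(null, idx));
            }
        }
        return out;
    }

    /** Reader side: replays tokens [from, to) against the given dictionary state. */
    static List<String> read(List<Token> wal, int from, int to, ToyDict dict) {
        List<String> out = new ArrayList<>();
        for (int i = from; i < to; i++) {
            Token t = wal.get(i);
            if (t.index < 0) {
                dict.add(t.literal);          // literals always grow the dictionary
                out.add(t.literal);
            } else {
                out.add(dict.get(t.index));   // may be stale or null after a seek back
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Token> wal = compress(Arrays.asList("a", "b", "a", "c", "c"));

        // Reset to the head: clearing the dictionary restores a consistent state.
        ToyDict d1 = new ToyDict();
        read(wal, 0, 3, d1);
        d1.clear();
        System.out.println(read(wal, 0, 5, d1)); // [a, b, a, c, c]

        // Seek back to position 1 with the dictionary left as it was:
        // "b" is added a second time, so index 2 no longer means "c".
        ToyDict d2 = new ToyDict();
        read(wal, 0, 3, d2);
        System.out.println(read(wal, 1, 5, d2)); // [b, a, c, b], expected [b, a, c, c]
    }
}
```

Clearing the dictionary before the intermediate seek would not help either:
the reference to index 0 would then find nothing, which in this toy model
comes back as null.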

唐天航 <tangtianhang...@gmail.com> wrote on Wed, Mar 16, 2022 at 18:08:

> If we only ever reset the position to the head, then yes, we can fix it.
> In fact, HBASE-26849 fixes the problem in exactly this scenario.
> Unfortunately, we have other scenarios where we roll back the position
> to some intermediate position, such as
> ProtobufLogReader.java#L381
> <https://github.com/apache/hbase/blob/branch-1/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/ProtobufLogReader.java#L381>
> I think we cannot roll back the LRUCache in those cases either...
> While my cluster works fine after HBASE-26849, the fix is still
> theoretically incomplete.
>
> 张铎(Duo Zhang) <palomino...@gmail.com> wrote on Wed, Mar 16, 2022 at 17:59:
>
>> The old WAL compression implementation is buggy when used together with
>> replication, that's true...
>>
>> But in general I think it is fixable. The dict is per file, IIRC, so I
>> think clearing the LRUCache when resetting to the head of the file can
>> fix the problem?
>>
>> Maybe we need to do some refactoring...
>>
>> Thanks.
>>
>> 唐天航 <tangtianhang...@gmail.com> wrote on Wed, Mar 16, 2022 at 16:20:
>>
>> > Hi masters,
>> >
>> > I have created an issue, HBASE-26849
>> > <https://issues.apache.org/jira/browse/HBASE-26849>, about an NPE
>> > caused by WAL Compression and Replication.
>> >
>> > For this problem, I tried reopening the WAL reader when we reset the
>> > position to 0, and it looks like it's working well. But it doesn't
>> > fundamentally solve the problem.
>> >
>> > Since the WAL Compression feature was added, Replication has
>> > introduced a lot of new code, and there are many places that reset the
>> > HLog position, such as seekOnFs to originalPosition. I guess none of
>> > this code considers compatibility with WAL Compression. In theory we
>> > can roll back the position to any position at any time, but then the
>> > LRUCache in the corresponding LRUDictionary should also be rolled
>> > back; otherwise the read path and the write path may behave
>> > inconsistently. But the LRUCache can't roll back at all...
>> >
>> > So my thought is to open another issue and add a note to the doc that
>> > WAL Compression and Replication are not compatible.
>> >
>> > What do you think?
>> >
>> > Thank you. Regards
>> >
>>
>