On Tue, Apr 14, 2020 at 1:16 PM Andrey Elenskiy
<[email protected]> wrote:

> Hello,
>
> I'm trying to understand the extent of the following issue mentioned in
> "WAL Compression" doc: https://hbase.apache.org/book.html#wal.compression
>
> > A possible downside to WAL compression is that we lose more data from the
> > last block in the WAL if it ill-terminated mid-write. If entries in this
> > last block were added with new dictionary entries but we failed persist
> > the amended dictionary because of an abrupt termination, a read of this
> > last block may not be able to resolve last-written entries.
>
>
> Does it mean there's a potential data loss even if the clients of
> regionserver received an ack?



Yes.

The ack to the client says the data made it out to the WAL successfully, not
that the last compressed block was properly terminated.



> First mention of this issue I noticed here:
>
> https://issues.apache.org/jira/browse/HBASE-18504?focusedCommentId=16127767&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16127767
> However, I couldn't find anything like that mentioned in the issue that
> introduced the WAL compression (
> https://issues.apache.org/jira/browse/HBASE-4608).
>
> I've also poked around the code of how compression is done (
>
> https://github.com/apache/hbase/blob/7877e09b6023c80e8bacd25fb8e0b9273ed7d258/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java#L171
> )
> and I am not able to see how the "failed to persist the amended dictionary" case can
> happen. It seems like there's no explicit dictionary stored at all and
> instead writing the data entries continuously records the dictionary on the
> fly. If data is not in a dictionary it's written out explicitly so it
> shouldn't be lost.
>
>
I think the doc has probably drifted from the implementation: it describes
general block-based compression, where decompressing without the dictionary
is a non-starter.

Did you look at the read side? It doesn't give up on a block that has no
dictionary, or whose dictionary is incomplete; it reads the compressed
block without the aid of the dictionary anyway?
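
To make the "dictionary recorded on the fly" idea concrete, here is a minimal
sketch. It is NOT the WALCellCodec format: the class name, the LITERAL and
REFERENCE markers and the byte layout are all made up for illustration. The
writer emits a value in full the first time it sees it and a short index
afterwards; the reader rebuilds the same dictionary as it goes, so a stream
cut off mid-write only costs the entry that was being written.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not HBase code and not its on-disk layout.
public class InlineDictionarySketch {
    private static final byte LITERAL = 0;   // full value follows
    private static final byte REFERENCE = 1; // 2-byte dictionary index follows

    // Writer: no separate dictionary is persisted. A value seen for the
    // first time is written out in full and remembered under the next index.
    public static byte[] encode(List<String> values) throws IOException {
        Map<String, Integer> dict = new HashMap<>();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (String value : values) {
            Integer index = dict.get(value);
            if (index == null) {
                out.writeByte(LITERAL);
                out.writeUTF(value);
                dict.put(value, dict.size());
            } else {
                out.writeByte(REFERENCE);
                out.writeShort(index);
            }
        }
        out.flush();
        return bytes.toByteArray();
    }

    // Reader: rebuilds the dictionary from the literals it encounters and
    // stops at the first entry it cannot read completely.
    public static List<String> decode(byte[] data) throws IOException {
        List<String> dict = new ArrayList<>();
        List<String> result = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            while (true) {
                byte marker = in.readByte();
                if (marker == LITERAL) {
                    String value = in.readUTF();
                    dict.add(value);
                    result.add(value);
                } else {
                    result.add(dict.get(in.readShort()));
                }
            }
        } catch (EOFException e) {
            // Clean end of stream or a truncated tail; keep what was read.
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        List<String> values = Arrays.asList("row1", "cf", "row1", "cf", "row2");
        byte[] encoded = encode(values);
        System.out.println(decode(encoded));
        // Simulate an abrupt termination partway through the last entry.
        byte[] truncated = Arrays.copyOf(encoded, encoded.length - 3);
        System.out.println(decode(truncated));
    }
}
```

In this toy format every entry before the cut is still readable, because
each reference only points at a literal that appears earlier in the same
stream. Whether the real reader behaves the same way is the thing to check
on the read side.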



> Could you please clarify the situation where data loss after receiving an
> ack can happen when using wal compression?
>
>
Hopefully the above gives a sense of how the deviance you are wallowing in
came about.

Good on you Andrey,

S




> Thanks,
> Andrey
>
