On Tue, Apr 14, 2020 at 1:16 PM Andrey Elenskiy <[email protected]> wrote:
> Hello,
>
> I'm trying to understand the extent of the following issue mentioned in
> "WAL Compression" doc: https://hbase.apache.org/book.html#wal.compression
>
> > A possible downside to WAL compression is that we lose more data from the
> > last block in the WAL if it ill-terminated mid-write. If entries in this
> > last block were added with new dictionary entries but we failed persist the
> > amended dictionary because of an abrupt termination, a read of this last
> > block may not be able to resolve last-written entries.
>
> Does it mean there's a potential data loss even if the clients of
> regionserver received an ack?

Yes. The ack to the client says the data made it out to the WAL successfully, not that the last compressed block is properly terminated.

> First mention of this issue I noticed here:
> https://issues.apache.org/jira/browse/HBASE-18504?focusedCommentId=16127767&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16127767
> However, I couldn't find anything like that mentioned in the issue that
> introduced the WAL compression (
> https://issues.apache.org/jira/browse/HBASE-4608).
>
> I've also poked around the code of how compression is done (
> https://github.com/apache/hbase/blob/7877e09b6023c80e8bacd25fb8e0b9273ed7d258/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCellCodec.java#L171
> )
> and not able to see how "failed to persist the amended dictionary" case can
> happen. It seems like there's no explicit dictionary stored at all and
> instead writing the data entries continuously records the dictionary on the
> fly. If data is not in a dictionary it's written out explicitly so it
> shouldn't be lost.

I think the doc probably got detached from the implementation and is describing general block-based compression, where decompressing w/o the dictionary is a non-starter. Did you look at the read side? It doesn't give up on a block that is without a dictionary or whose dictionary is incomplete. It reads the compressed block w/o aid of the dictionary anyways?

> Could you please clarify the situation where data loss after receiving an
> ack can happen when using wal compression?

Hopefully the above gives a sense of how the deviance you are wallowing in came about.

Good on you Andrey,
S

> Thanks,
> Andrey
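
For anyone following along, here is a toy model of the scheme Andrey describes: no dictionary is stored separately, a value is written out in full the first time it is seen, and later occurrences are written as an index. It is a sketch only, not HBase's WALCellCodec. The function names and the ("lit", ...) / ("ref", ...) record shapes are made up for illustration. Under those assumptions, cutting the stream at any record boundary still lets the reader resolve everything before the cut, because the reader rebuilds the dictionary from the literals as it goes.

```python
# Toy on-the-fly dictionary codec. Hypothetical names; NOT the HBase implementation.

def encode(values):
    """Emit ("lit", value) on first sight of a value, ("ref", index) afterwards."""
    index = {}  # value -> position in the implicit dictionary
    records = []
    for value in values:
        if value in index:
            records.append(("ref", index[value]))
        else:
            # The literal itself is the dictionary entry; nothing else is persisted.
            index[value] = len(index)
            records.append(("lit", value))
    return records


def decode(records):
    """Rebuild the dictionary while reading, mirroring the writer's order."""
    entries = []  # position -> value, grown as literals are read
    values = []
    for kind, payload in records:
        if kind == "lit":
            entries.append(payload)
            values.append(payload)
        else:
            values.append(entries[payload])
    return values


if __name__ == "__main__":
    written = [b"cf", b"row1", b"cf", b"row2", b"row1"]
    stream = encode(written)
    # Simulate an abrupt termination after every possible record boundary.
    for cut in range(len(stream) + 1):
        assert decode(stream[:cut]) == written[:cut]
    print("every truncated prefix decodes cleanly")
```

The sketch only covers a cut at a record boundary. A write torn in the middle of a record is a separate case that this model does not try to represent.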
