The PR for HBASE-27632 is ready for review now.

In this PR we just removed 'hbase.regionserver.hlog.reader.impl'.

Will file other issues to remove 'hbase.regionserver.hlog.writer.impl' and
also update the ref guide.

The most important changes for the PR are:

   1. Introduced two types of WAL readers, WALStreamReader and
   WALTailingReader. The implementation classes are ProtobufWALStreamReader
   and ProtobufWALTailingReader, and they share the same base class
   AbstractProtobufWALReader.
   2. Now, you do not need to specify SecureProtobufLogReader for reading
   an encrypted WAL file, the ProtobufWALStreamReader and
   ProtobufWALTailingReader will automatically choose to use
   SecureWALCellCodec if the file is written by SecureProtobufLogWriter.
   3. For WALStreamReader, we still need to the ability to read from a
   specific position(in WALInputFormat, for recovering from an error), instead
   of introducing a seek method, we add a startPosition parameter when
   creating a WALStreamReader.
   4. The logging splitting related logics are almost the same, as the API
   does not change too much.
   5. For WALTailingReader, it does not throw any exceptions now, we
   introduce a State to indicate the reader's state and what you need to do
   after issuing a next call. So the WALEntryStream is also changed a lot, as
   well as ReplicationSourceWALReader.
   6. For verifying that we could correctly deal with partial WAL file, we
   introduced a TestParsePartialWALFile, please see the comments of this file
   for more details.
   7. The hbase.regionserver.hlog.reader.impl is removed, as now we have
   two types of reader implementations. And in general, after removing the
   necessity of specify SecureProtobufLogReader when reading encrypted WAL
   file, users do not need to specify WAL reader any more. For writing UTs, we
   introduced a hbase.regionserver.wal.stream.reader.impl config, to specify
   the implementation class for WALStreamReader in tests.

PTAL if you have interest.

Thanks.

唐天航 <tangtianhang...@gmail.com> 于2023年2月15日周三 18:53写道:

> Make sense. Thank you.
>
> 张铎(Duo Zhang) <palomino...@gmail.com> 于2023年2月15日周三 18:24写道:
>
> > We could discuss how to implement BookKeeper based WAL in the future.
> >
> > For me, I would like to abstract at a higher level, as we do not need to
> > have a 'file' when using BookKeeper. But this requires changing the
> > replication abstraction too.
> >
> > Thanks.
> >
> > 唐天航 <tangtianhang...@gmail.com> 于2023年2月15日周三 11:42写道:
> >
> > > Can we keep these configs and add a new one for the replication reader?
> > >
> > > As @Andrew said, One of the things we are doing is using BookKeeper for
> > WAL
> > > storage. This depends on the configuration above. Although we are
> > > developing based on branch-1, and it is too early to talk about joining
> > the
> > > community, I'm not sure what the attitude of the community is, whether
> it
> > > is willing to accept implementations based on other storage in the
> > future.
> > >
> > > Thanks.
> > >
> > >
> > > 张铎(Duo Zhang) <palomino...@gmail.com> 于2023年2月14日周二 21:20写道:
> > >
> > > > Thanks Andrew for the feedback.
> > > >
> > > > If no other concerns, I will start the refactoring work and
> deprecated
> > > > these two configs.
> > > >
> > > > Thanks.
> > > >
> > > > Andrew Purtell <andrew.purt...@gmail.com> 于2023年2月10日周五 23:01写道:
> > > >
> > > > > As you point out these configuration settings were introduced when
> we
> > > > > migrated from SequenceFile based WALs to the protobuf format. We
> > needed
> > > > to
> > > > > give users a way to manually migrate, although, arguably, an auto
> > > > migration
> > > > > would have been better.
> > > > >
> > > > > In theory these settings allow users to implement their own WAL
> > readers
> > > > > and writers. However I do not believe users will do this. The WAL
> is
> > > > > critical for performance and correctness. If anyone is
> contemplating
> > > such
> > > > > wizard level changes they can patch the code themselves. It’s fine
> to
> > > > > document these settings as deprecated for sure, and I think ok also
> > to
> > > > > claim them unsupported and ignored.
> > > > >
> > > > > >
> > > > > > On Feb 10, 2023, at 3:41 AM, 张铎 <palomino...@gmail.com> wrote:
> > > > > >
> > > > > > While discussing how to deal with the problem in HBASE-27621, we
> > > > > proposed
> > > > > > to introduce two types of WAL readers, one for WAL splitting, and
> > the
> > > > > other
> > > > > > for WAL replication, as replication needs to tail the WAL file
> > which
> > > is
> > > > > > currently being written, so the logic is much more complicated.
> We
> > do
> > > > not
> > > > > > want to affect WAL splitting logic and performance while tweaking
> > the
> > > > > > replication related things, as all HBase users need WAL splitting
> > but
> > > > not
> > > > > > everyone needs replication.
> > > > > >
> > > > > > But when reviewing the related code, I found that we have two
> > > > > > configurations for specifying the WAL reader class and WAL write
> > > class,
> > > > > > which indicates that we could only have one implementation for
> the
> > > WAL
> > > > > > reader. They are 'hbase.regionserver.hlog.reader.impl' and
> > > > > > 'hbase.regionserver.hlog.writer.impl'.
> > > > > >
> > > > > > We mentioned these two configurations several times in our ref
> > guide.
> > > > > >
> > > > > > HBase 2.0+ can no longer read Sequence File based WAL file.
> > > > > >
> > > > > > HBase can no longer read the deprecated WAL files written in the
> > > Apache
> > > > > >> Hadoop Sequence File format. The
> > hbase.regionserver.hlog.reader.impl
> > > > and
> > > > > >> hbase.regionserver.hlog.writer.impl configuration entries should
> > be
> > > > set
> > > > > to
> > > > > >> use the Protobuf based WAL reader / writer classes. This
> > > > implementation
> > > > > has
> > > > > >> been the default since HBase 0.96, so legacy WAL files should
> not
> > > be a
> > > > > >> concern for most downstream users.
> > > > > >
> > > > > >
> > > > > > Configure WAL encryption.
> > > > > >
> > > > > > Configure WAL encryption in every RegionServer’s hbase-site.xml,
> by
> > > > > setting
> > > > > >> the following properties. You can include these in the HMaster’s
> > > > > >> hbase-site.xml as well, but the HMaster does not have a WAL and
> > will
> > > > not
> > > > > >> use them.
> > > > > >> <property>
> > > > > >>  <name>hbase.regionserver.hlog.reader.impl</name>
> > > > > >>
> > > > > >>
> > > > >
> > > >
> > >
> >
> <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
> > > > > >> </property>
> > > > > >> <property>
> > > > > >>  <name>hbase.regionserver.hlog.writer.impl</name>
> > > > > >>
> > > > > >>
> > > > >
> > > >
> > >
> >
> <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
> > > > > >> </property>
> > > > > >> <property>
> > > > > >>  <name>hbase.regionserver.wal.encryption</name>
> > > > > >>  <value>true</value>
> > > > > >> </property>
> > > > > >
> > > > > >
> > > > > > So in fact, do not consider encryption, the configurations are
> > > useless
> > > > as
> > > > > > we do not support reading sequence file format WAL any more, the
> > only
> > > > > valid
> > > > > > options are protobuf based reader and write. And for security, I
> > > think
> > > > > the
> > > > > > configuration is redundant as if encryption is enabled, we should
> > use
> > > > > > SecureProtobufLogWriter for writing, no matter what the
> > configuration
> > > > > value
> > > > > > is. And for readers, I do not think we should use a configuration
> > to
> > > > > > specify the implementation, we should detect whether the file is
> > > > > encrypted
> > > > > > and choose a secure or normal reader to read the file.
> > > > > >
> > > > > > So here, I propose we just deprecated these two configurations
> > > because
> > > > > they
> > > > > > are useless now.
> > > > > >
> > > > > > Thanks.
> > > > >
> > > >
> > >
> >
>

Reply via email to