Optimizing Sentry to HDFS protocol

Alexander Kolbasov Tue, 27 Jun 2017 15:46:33 -0700

Some food for thought.

Currently Sentry uses serialized Thrift structures to send a lot of
information from the Sentry Server to the HDFS namenode plugin for the HDFS
sync.


We should look for ways to optimize this protocol:


   - Rather than sending a huge snapshot in a single message, we should
   provide a streaming protocol that sends smaller messages and reassembles
   them on the HDFS side.
   - Most of the information passed consists of long strings with common
   prefixes. We should be able to apply simple compression techniques (e.g.
   prefix compression) or even run full compression on the data before
   sending.
   - We should consider using non-Thrift data structures for passing the
   info and use Thrift only as a transport mechanism.

- Sasha
