Re: Optimizing Sentry to HDFS protocol

2017-06-28 Thread Brian Towles
On Tue, Jun 27, 2017 at 5:45 PM Alexander Kolbasov wrote: > - Rather than streaming huge snapshots in a single message we should provide a streaming protocol with smaller messages and later reassembly on the HDFS side. [bt] If we are going to keep with the

Re: Optimizing Sentry to HDFS protocol

2017-06-28 Thread Alexander Kolbasov
Clarification - I am talking about cases where messages can be several GB in size. On Jun 27, 2017, at 9:33 PM, Na Li wrote: Sasha, 1) "- Rather than streaming huge snapshots in a single message we should provide a streaming protocol with smaller messages and later

Re: Optimizing Sentry to HDFS protocol

2017-06-27 Thread Alexander Kolbasov
Lina, thanks for your comments! > On Jun 27, 2017, at 9:33 PM, Na Li wrote: > > Sasha, > > 1) "- Rather than streaming huge snapshots in a single message we should provide a streaming protocol with smaller messages and later reassembly on the HDFS side." > Based on

Re: Optimizing Sentry to HDFS protocol

2017-06-27 Thread Na Li
Sasha, 1) "- Rather than streaming huge snapshots in a single message we should provide a streaming protocol with smaller messages and later reassembly on the HDFS side." Based on https://thrift.apache.org/docs/concepts, the Thrift transport can be raw TCP or HTTP. HTTP runs on top of TCP. TCP will

Optimizing Sentry to HDFS protocol

2017-06-27 Thread Alexander Kolbasov
Some food for thought. Currently Sentry uses serialized Thrift structures to send a lot of information from the Sentry Server to the HDFS NameNode plugin for the HDFS sync. We should think about optimizing this protocol in several ways: - Rather than streaming huge snapshots in a single
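
For illustration only, the first idea in this thread (smaller messages with later reassembly on the HDFS side) could be sketched roughly as below. This is not Sentry or Thrift code: the `Chunk` record, its fields, and the `split_snapshot` and `reassemble` helpers are hypothetical names invented for this sketch, and the snapshot is assumed to be an already-serialized byte string.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class Chunk:
    """One small message in a hypothetical chunked snapshot stream."""
    snapshot_id: int  # which snapshot this chunk belongs to
    seq: int          # zero-based position of this chunk
    total: int        # total number of chunks in the snapshot
    payload: bytes    # slice of the serialized snapshot


def split_snapshot(snapshot_id: int, data: bytes, chunk_size: int) -> Iterator[Chunk]:
    """Sender side: cut a serialized snapshot into fixed-size chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Ceiling division; an empty snapshot is still sent as one empty chunk.
    total = max(1, -(-len(data) // chunk_size))
    for seq in range(total):
        start = seq * chunk_size
        yield Chunk(snapshot_id, seq, total, data[start:start + chunk_size])


def reassemble(chunks: List[Chunk]) -> bytes:
    """Receiver side: rebuild the snapshot, tolerating out-of-order arrival."""
    if not chunks:
        raise ValueError("no chunks received")
    snapshot_id, total = chunks[0].snapshot_id, chunks[0].total
    by_seq: Dict[int, bytes] = {}
    for chunk in chunks:
        if chunk.snapshot_id != snapshot_id or chunk.total != total:
            raise ValueError("chunk belongs to a different snapshot")
        by_seq[chunk.seq] = chunk.payload
    missing = [i for i in range(total) if i not in by_seq]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(by_seq[i] for i in range(total))
```

The sequence number and total count let the receiver detect gaps before applying a snapshot, which is one possible way to keep each message small without losing the all-or-nothing behaviour of a single large message.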