+1

It's good approach but I have a question/concern.
Is the proposal to shade is for some specific jars OR to shade all the
third party jars?

If proposal is shade all the third-party jars then if would impact the run
time memory usage as all the classes from the third-party jars would be
loaded regardless..

-Kalyan

On Wed, Nov 15, 2017 at 12:04 PM, Sergio Pena <[email protected]>
wrote:

> +1
>
> I like the idea. It's hard to upgrade our libraries to newer ones when
> other components break due to Sentry being a plugin. I was once thought of
> using different jars versions per plugin (i.e. guava11 on hdfs, guava14 on
> hive and the server, etc), but that is too much to do and not good. I like
> the hbase idea better so we can use the latest libraries that we want
> instead of depending on other components.
>
> - Sergio
>
>
>
> On Wed, Nov 15, 2017 at 11:53 AM, Alexander Kolbasov <[email protected]>
> wrote:
>
> > Agreed, this would be a very useful thing to do. I remember spending a
> lot
> > of time trying to make Sentry work with DataNucleus 4 - the problem was
> > that e2e tests combine Sentry with Hive in the same JVM and this created
> a
> > conflict on the DataNucleus libraries and test failures.
> >
> > Looking at the HBase proposal it seems to require some (small) code
> changes
> > for package names - or I am misreading it? Do you plan to shade all 3-rd
> > party packages or just some? I would consider at least
> > Guava/ZooKeepeer/DataNucleus/Jetty.
> >
> > - Alex
> >
> > On Wed, Nov 15, 2017 at 8:24 AM, Brian Towles <[email protected]>
> > wrote:
> >
> > > Howdy all,
> > >
> > > An issue that keeps coming up seems to be the conflict of dependency
> > > versions between Sentry and the components it is plugging into. A
> current
> > > example of this impact is Google Guava with hive2 using v14 and Impala
> > > using v11 while Sentry needs to have at least v14 in order to fix some
> > bugs
> > > in the BoneCP library.  These sort of conflicts come in whenever we are
> > > embedding a Sentry plugin or using a Sentry library into another
> project.
> > >
> > > I would like to propose a mechanism for offsetting some of these issues
> > > using something similar to the third party shading that HBase uses for
> > > common problems (such as Guava)
> > > (https://github.com/apache/hbase-thirdparty with
> > > documentation about it http://hbase.apache.org/book.html#thirdparty)
> and
> > > is
> > > used by HBase and Hadoop for packaging of their downstream artifacts (
> > > https://github.com/apache/hbase/blob/master/hbase-shaded/ and
> > > https://github.com/apache/hadoop/tree/trunk/hadoop-
> > > client-modules/hadoop-client-api/
> > > )
> > >
> > > The main benefit of this is that it would allow Sentry to be used as
> > > libraries and plugins with all of the dependencies needed for Sentry to
> > be
> > > abstracted away from components implementing it. Sentry could rev
> > versions
> > > of libraries easier and not have collisions of library versions needed
> by
> > > the implementing component.  As well this would potentially make Sentry
> > > downstream usage more stable since it would be used and tested against
> a
> > > static set of dependencies and not using libraries based on what the
> > > implementing component has.
> > >
> > > On the downside, it would make the on disk size of the Sentry plugins
> and
> > > libraries for downstream larger. As well, the number of classes loaded
> > into
> > > memory would be larger since there would be potential duplication of
> > actual
> > > class implementations or multiple versions of a class with different
> > > package names in memory.  But this seems to be a common practice and
> the
> > > lack of library version collisions and stability make up for these
> > > downsides.
> > >
> > > The third party shading works by using the Maven shade plugin to do
> > package
> > > name shifting of the third party library (Guava, BoneCP) to a sentry
> > > specific package using the version on the third party library needed
> for
> > > Sentry.
> > >
> > > E.G. *com.google.common* packages could be shifted to
> > > *org.apache.sentry.shaded.com.google.common*.
> > >
> > > Since the Maven Shade plugin can actually change the byte-codes of the
> > > libraries being shaded, we can do this even for dependencies that have
> > > shared sub-dependencies .  BoneCP being the main example since it uses
> > > Guava internal to itself.  We could shade the BoneCP into the same
> shared
> > > sentry third party dependency jar and since the bytecode level
> > > manipulation.
> > >
> > > What this means on the development side is that we would need to
> > reference
> > > imports from the shaded third party
> > >
> > > E.G. *import com.google.common.collect.Maps* becomes *import
> > > org.apache.sentry.shaded.com.google.common.collect.Maps*.
> > >
> > > Ive been looking at this in the context of
> > > https://issues.apache.org/jira/browse/SENTRY-2044, but I feel this
> > should
> > > potentially be something that is more of an overall Sentry standard
> > > practice and larger scale implementation.
> > >
> > > -=Brian
> > >
> > >
> > > --
> > > *Brian Towles* | Software Engineer
> > > t. (512) 415- <0000000000>8105 e. [email protected] <
> > [email protected]>
> > > cloudera.com <http://www.cloudera.com/>
> > >
> > > [image: Cloudera] <http://www.cloudera.com/>
> > >
> > > [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
> > > Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
> > Cloudera
> > > on LinkedIn] <https://www.linkedin.com/company/cloudera>
> > > ------------------------------
> > >
> >
>

Reply via email to