Agreed, this would be a very useful thing to do. I remember spending a lot of time trying to make Sentry work with DataNucleus 4 - the problem was that the e2e tests combine Sentry with Hive in the same JVM, which created a conflict between the DataNucleus libraries and caused test failures.
Looking at the HBase proposal, it seems to require some (small) code changes for package names, or am I misreading it? Do you plan to shade all third-party packages or just some? I would consider at least Guava/ZooKeeper/DataNucleus/Jetty.

- Alex

On Wed, Nov 15, 2017 at 8:24 AM, Brian Towles <btow...@cloudera.com> wrote:

> Howdy all,
>
> An issue that keeps coming up is the conflict of dependency versions between Sentry and the components it plugs into. A current example is Google Guava: hive2 uses v14 and Impala uses v11, while Sentry needs at least v14 in order to fix some bugs in the BoneCP library. These sorts of conflicts come up whenever we embed a Sentry plugin or use a Sentry library in another project.
>
> I would like to propose a mechanism for offsetting some of these issues, using something similar to the third-party shading that HBase uses for common problems such as Guava (https://github.com/apache/hbase-thirdparty, documented at http://hbase.apache.org/book.html#thirdparty), which HBase and Hadoop also use for packaging their downstream artifacts (https://github.com/apache/hbase/blob/master/hbase-shaded/ and https://github.com/apache/hadoop/tree/trunk/hadoop-client-modules/hadoop-client-api/).
>
> The main benefit is that it would allow Sentry to be used as libraries and plugins with all of the dependencies Sentry needs abstracted away from the components implementing it. Sentry could rev library versions more easily without colliding with the library versions needed by the implementing component. This would also potentially make downstream usage of Sentry more stable, since Sentry would be used and tested against a static set of dependencies rather than whatever libraries the implementing component has.
>
> On the downside, it would make the on-disk size of the Sentry plugins and libraries larger for downstream users.
> Also, the number of classes loaded into memory would be larger, since there could be duplicate class implementations, or multiple versions of a class under different package names, in memory. But this seems to be a common practice, and the lack of library version collisions and the added stability make up for these downsides.
>
> The third-party shading works by using the Maven Shade plugin to shift the package names of a third-party library (Guava, BoneCP) to a Sentry-specific package, using the version of the third-party library that Sentry needs.
>
> E.g. *com.google.common* packages could be shifted to *org.apache.sentry.shaded.com.google.common*.
>
> Since the Maven Shade plugin actually rewrites the bytecode of the libraries being shaded, we can do this even for dependencies that have shared sub-dependencies. BoneCP is the main example, since it uses Guava internally. We could shade BoneCP into the same shared Sentry third-party dependency jar and, since the manipulation happens at the bytecode level, BoneCP's references to Guava would be rewritten to the shaded package as well.
>
> What this means on the development side is that we would need to reference imports from the shaded third-party packages.
>
> E.g. *import com.google.common.collect.Maps* becomes *import org.apache.sentry.shaded.com.google.common.collect.Maps*.
>
> I've been looking at this in the context of https://issues.apache.org/jira/browse/SENTRY-2044, but I feel this should potentially be more of an overall Sentry standard practice and a larger-scale implementation.
>
> -=Brian
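
For illustration only, the package-name shift described in the quoted proposal can be sketched in plain Java. This is not how the Maven Shade plugin is implemented (it rewrites class files at build time); it only models the name mapping a relocation rule would produce. The ShadedNames class and its relocate method are hypothetical, and the shaded BoneCP prefix is an assumption by analogy with the Guava example in the proposal.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of Sentry or the Maven Shade plugin. */
final class ShadedNames {
    // Illustrative relocation rules: original package prefix -> shaded prefix.
    // The Guava mapping is the one from the proposal; the BoneCP mapping is assumed.
    private static final Map<String, String> RELOCATIONS = new LinkedHashMap<>();
    static {
        RELOCATIONS.put("com.google.common", "org.apache.sentry.shaded.com.google.common");
        RELOCATIONS.put("com.jolbox.bonecp", "org.apache.sentry.shaded.com.jolbox.bonecp");
    }

    /** Returns the relocated class name, or the input unchanged if no rule matches. */
    static String relocate(String className) {
        for (Map.Entry<String, String> rule : RELOCATIONS.entrySet()) {
            String prefix = rule.getKey();
            // Match whole package segments only, so "com.google.commonx" is left alone.
            if (className.equals(prefix) || className.startsWith(prefix + ".")) {
                return rule.getValue() + className.substring(prefix.length());
            }
        }
        return className;
    }

    public static void main(String[] args) {
        // prints org.apache.sentry.shaded.com.google.common.collect.Maps
        System.out.println(relocate("com.google.common.collect.Maps"));
        // prints org.apache.sentry.shaded.com.jolbox.bonecp.BoneCP
        System.out.println(relocate("com.jolbox.bonecp.BoneCP"));
        // prints java.util.Map (no rule applies)
        System.out.println(relocate("java.util.Map"));
    }
}
```

Because the same rule would apply to every reference inside the shaded jar, BoneCP's internal uses of Guava would end up pointing at the relocated Guava classes, which is the shared sub-dependency case raised in the proposal.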