There are some tools in the tests jar, such as PerformanceEvaluation. But anyway, maybe they should be moved to main...
On Tue, Mar 5, 2024 at 16:14, Istvan Toth <st...@apache.org> wrote:
>
> DISCLAIMER: I don't have a patch ready, or even an elegant way mapped out
> to achieve this, this is about discussing whether we even want to make
> these changes.
> These are also substantial changes, but they could be targeted for HBase
> 3.0.
>
> One issue I have noticed is that we ship test jars and test dependencies in
> the assembly.
> I can't see anyone using those, but it bloats the assembly and classpath,
> and adds unnecessary JARs with possible CVE issues. (for example Kerby
> which is a Hadoop minicluster dependency)
>
> My proposal is to exclude the test jars and the test scope dependencies
> from the assembly.
>
> The advantages would be:
> * Smaller distro size
> * Faster startup (this is marginal)
> * Less CVE-prone JARs in the binary assemblies
>
> The other issue is that the assembly includes much of the Hadoop
> distribution.
> The basic assumption in all scripts and instructions is that the node has a
> fully configured Hadoop installation, and we include it in the classpath of
> HBase.
>
> If that is true, then there is no reason to include Hadoop in the assembly,
> HBase and its direct dependencies should be enough.
>
> One could argue that it would simplify the client side, which is true to
> some extent (though 95% of the client distro use cases are served better by
> simply using hbase-shaded-client).
>
> We could either remove the Hadoop libraries from either or both of the
> assemblies unconditionally, or provide two variants for either or both
> assemblies, one with Hadoop included, and one without it.
> Spark already does this, it has binary distributions both with and without
> Hadoop.
>
> The advantages would be:
> * Smaller distro size
> * Faster startup (this is marginal)
> * Less chance of conflicts with the Hadoop jars
> * Less CVE-prone JARs in the binary assemblies
>
>
> Thirdly, we could consider excluding the
> full-fat org.apache.hbase:hbase-shaded-client JAR from the Hadoop-less
> binary assemblies. It is not used by the assembly, and AFAIK it is not
> included in any of the 'hbase classpath' command variants.
>
> This would make sure that no Hadoop libraries are included (even in shaded
> form) and would make the HBase distribution fully insulated from Hadoop's
> CVE issues.
>
> (The full-fat hbase-shaded-client works best as direct build-time
> dependency anyway)
>
> best regards
> Istvan