I will work on it next week. On Sat, Mar 15, 2025 at 3:01 PM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
> Do we have an issue for it?
>
> At least for releasing 3.0.0, we need to run ITBLL...
>
> Thanks.
>
> Istvan Toth <st...@apache.org> wrote on Sat, Mar 15, 2025 at 12:32:
> >
> > Yes, we've discussed this issue, but deferred solving it.
> >
> > IMO the easiest way is to add a third assembly which is functionally equivalent to the 2.x assembly.
> > That could work for running hbase-it, ITBLL, chaos monkey, etc.
> > We'd also have to decide whether to build it by default, and whether we want to publish it as part of official releases.
> >
> > In theory we could also make a delta assembly that only includes the additional test-related stuff, but I'm afraid that would require a lot of maintenance.
> > We could also add a script/maven target that downloads test-related JARs from maven, but keeping that one up to date would also be problematic.
> >
> > Istvan
> >
> > On Sat, Mar 15, 2025 at 4:51 AM 张铎(Duo Zhang) <palomino...@gmail.com> wrote:
> >
> > > After this change, we cannot run ITBLL on 3.0.0 because hbase-it is also excluded...
> > >
> > > I tried manually copying all the test jars and the hbase-it jar to the lib directory but it did not work; I guess we still missed several hadoop jars...
> > >
> > > So what is the suggested way to run ITBLL after this change?
> > >
> > > Thanks.
> > >
> > > Istvan Toth <st...@cloudera.com.invalid> wrote on Mon, Jan 20, 2025 at 14:20:
> > > >
> > > > This is almost done.
> > > >
> > > > The final outstanding patch is https://github.com/apache/hbase/pull/5766 for the new Hadoop-less assembly.
> > > >
> > > > Could you please review it?
> > > >
> > > > On Sat, Mar 9, 2024 at 8:48 AM Nihal Jain <nihaljain...@gmail.com> wrote:
> > > >
> > > > > I have created sub tasks with necessary details in the umbrella jira. Will take them up in coming days. Also will add more sub tasks later if needed.
> > > > >
> > > > > Regards
> > > > > Nihal
> > > > >
> > > > > On Sat, 9 Mar 2024, 11:53 Istvan Toth, <st...@cloudera.com.invalid> wrote:
> > > > >
> > > > > > Thank you Nihal.
> > > > > > I'm not very familiar with the tools in the test code, so you can probably plan that work better.
> > > > > > I just have some generic steps in mind:
> > > > > > * Identify all the tools / scripts in the test jars
> > > > > > * Identify and analyze their dependencies (compared to the current runtime deps)
> > > > > > * Decide which ones to move to the runtime JARs.
> > > > > > * Move them to the runtime code (or perhaps a separate module)
> > > > > >
> > > > > > I have created https://issues.apache.org/jira/browse/HBASE-28431 as an umbrella ticket to organize the sub-tasks.
> > > > > >
> > > > > > Istvan
> > > > > >
> > > > > > On Fri, Mar 8, 2024 at 7:06 PM Nihal Jain <nihaljain...@gmail.com> wrote:
> > > > > >
> > > > > > > Sure I will be able to take up. Please create tasks with necessary details or let me know if you want me to create.
> > > > > > >
> > > > > > > On Fri, 8 Mar 2024, 12:45 Istvan Toth, <st...@cloudera.com.invalid> wrote:
> > > > > > >
> > > > > > > > Thanks for volunteering, Nihal.
> > > > > > > >
> > > > > > > > I could work on the Hadoop-less, and assemblies, and you could work on cleaning up the test jars.
> > > > > > > > Would that work for you ?
> > > > > > > > I know that I'm picking the smaller part, but it turns out that I won't have as much time to work on this as I hoped.
> > > > > > > >
> > > > > > > > (Unless there are other volunteers, of course)
> > > > > > > >
> > > > > > > > Istvan
> > > > > > > >
> > > > > > > > On Wed, Mar 6, 2024 at 7:03 PM Istvan Toth <st...@cloudera.com> wrote:
> > > > > > > >
> > > > > > > > > We seem to be in agreement in principle, however the devil is in the details.
> > > > > > > > >
> > > > > > > > > The first step should be moving the diagnostic tools out of the test jars.
> > > > > > > > > Are there any tools we don't want to move out ?
> > > > > > > > > Do the diagnostic tools pull in extra dependencies compared to the current runtime JARs, and if they do, what are those ?
> > > > > > > > > I haven't thought of the chaosmonkey tests yet, do those have specific additional dependencies / scripts ?
> > > > > > > > >
> > > > > > > > > Should we move the tools simply to the normal jars, or should we move them to a new module (could be called hbase-diagnostics) ?
> > > > > > > > >
> > > > > > > > > Istvan
> > > > > > > > >
> > > > > > > > > On Tue, Mar 5, 2024 at 7:10 PM Bryan Beaudreault <bbeaudrea...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > >> I'm +0 on hbase-examples, but +1000000 on any improvements we can make to ltt/pe/chaos/minicluster/etc. It's extremely frustrating how much reliance we have on test jars both generally but also specifically around these core test executables. Unfortunately I haven't had time to dedicate to these frustrations myself, but happy to help with review, etc.
> > > > > > > > >>
> > > > > > > > >> On Tue, Mar 5, 2024 at 1:03 PM Nihal Jain <nihaljain...@gmail.com> wrote:
> > > > > > > > >>
> > > > > > > > >> > Thank you for bringing this up.
> > > > > > > > >> >
> > > > > > > > >> > +1 for this change.
> > > > > > > > >> >
> > > > > > > > >> > In fact, some time back, we faced a similar problem. Security scans found that we were bundling a vulnerable hadoop test jar. To deal with that we had to make a change in our internal HBase fork to exclude all HBase and Hadoop test jars from the assembly. This helped us get rid of the vulnerable jar. (Although I hadn't dealt with test scope dependencies there.)
> > > > > > > > >> >
> > > > > > > > >> > But, I have been thinking of pushing this change in Apache HBase, just wasn't sure if this was even acceptable. It's great to see the same has been brought up here today.
> > > > > > > > >> >
> > > > > > > > >> > We hadn't dealt with the ltt, pe etc. tools and wrote a script to download them on demand to avoid a massive code change in the internal fork. But I have a +1 on the idea of identifying and moving all such tools to a new module. This would be great and make things easier for us as well.
> > > > > > > > >> >
> > > > > > > > >> > Also, a way we could help new users easily get started, in case we completely stop bundling hadoop jars, is by providing a script which starts a hbase cluster in a single node setup. In fact I had written a simple script sometime back that automates this process given a release link for both. It first downloads Hadoop and HBase binaries and then starts both with the hbase root directory set to be on hdfs. We could provide something similar to help new users to get started easily.
> > > > > > > > >> >
> > > > > > > > >> > Although I am also +1 on the idea to provide both variants as mentioned by Nick, which might not even need any such script.
> > > > > > > > >> >
> > > > > > > > >> > Also, I am willing to volunteer for help towards this effort. Please let me know if anything is needed.
> > > > > > > > >> >
> > > > > > > > >> > Thanks,
> > > > > > > > >> > Nihal
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > On Tue, 5 Mar 2024, 15:35 Nick Dimiduk, <ndimi...@apache.org> wrote:
> > > > > > > > >> >
> > > > > > > > >> > > This would be great cleanup, big +1 from me for all three of these adjustments, including the promotion of pe, ltt, and friends out of the test scope.
> > > > > > > > >> > >
> > > > > > > > >> > > I believe that we included hbase test jars because we used to freely mix classes needed for minicluster between runtime and test jars, which in turn relied on Hadoop minicluster capabilities. The big cleanup around HBaseTestingUtil/it addressed much (or all) of these issues on branch-3.
> > > > > > > > >> > >
> > > > > > > > >> > > I believe that we include a Hadoop distribution in our assembly because that makes it easy for a new user to download our release bin.tgz and get started immediately with learning. I guess it’s high time that we work out the with- and without-Hadoop variants.
> > > > > > > > >> > >
> > > > > > > > >> > > Thanks,
> > > > > > > > >> > > Nick
> > > > > > > > >> > >
> > > > > > > > >> > > On Tue, 5 Mar 2024 at 09:14, Istvan Toth <st...@apache.org> wrote:
> > > > > > > > >> > >
> > > > > > > > >> > > > DISCLAIMER: I don't have a patch ready, or even an elegant way mapped out to achieve this, this is about discussing whether we even want to make these changes.
> > > > > > > > >> > > > These are also substantial changes, but they could be targeted for HBase 3.0.
> > > > > > > > >> > > >
> > > > > > > > >> > > > One issue I have noticed is that we ship test jars and test dependencies in the assembly.
> > > > > > > > >> > > > I can't see anyone using those, but it bloats the assembly and classpath, and adds unnecessary JARs with possible CVE issues (for example Kerby, which is a Hadoop minicluster dependency).
> > > > > > > > >> > > >
> > > > > > > > >> > > > My proposal is to exclude the test jars and the test scope dependencies from the assembly.
> > > > > > > > >> > > >
> > > > > > > > >> > > > The advantages would be:
> > > > > > > > >> > > > * Smaller distro size
> > > > > > > > >> > > > * Faster startup (this is marginal)
> > > > > > > > >> > > > * Fewer CVE-prone JARs in the binary assemblies
> > > > > > > > >> > > >
> > > > > > > > >> > > > The other issue is that the assembly includes much of the Hadoop distribution.
> > > > > > > > >> > > > The basic assumption in all scripts and instructions is that the node has a fully configured Hadoop installation, and we include it in the classpath of HBase.
> > > > > > > > >> > > >
> > > > > > > > >> > > > If that is true, then there is no reason to include Hadoop in the assembly; HBase and its direct dependencies should be enough.
> > > > > > > > >> > > >
> > > > > > > > >> > > > One could argue that it would simplify the client side, which is true to some extent (though 95% of the client distro use cases are served better by simply using hbase-shaded-client).
> > > > > > > > >> > > >
> > > > > > > > >> > > > We could either remove the Hadoop libraries from either or both of the assemblies unconditionally, or provide two variants for either or both assemblies, one with Hadoop included, and one without it.
> > > > > > > > >> > > > Spark already does this: it has binary distributions both with and without Hadoop.
> > > > > > > > >> > > >
> > > > > > > > >> > > > The advantages would be:
> > > > > > > > >> > > > * Smaller distro size
> > > > > > > > >> > > > * Faster startup (this is marginal)
> > > > > > > > >> > > > * Less chance of conflicts with the Hadoop jars
> > > > > > > > >> > > > * Fewer CVE-prone JARs in the binary assemblies
> > > > > > > > >> > > >
> > > > > > > > >> > > >
> > > > > > > > >> > > > Thirdly, we could consider excluding the full-fat org.apache.hbase:hbase-shaded-client JAR from the Hadoop-less binary assemblies. It is not used by the assembly, and AFAIK it is not included in any of the 'hbase classpath' command variants.
> > > > > > > > >> > > >
> > > > > > > > >> > > > This would make sure that no Hadoop libraries are included (even in shaded form) and would make the HBase distribution fully insulated from Hadoop's CVE issues.
> > > > > > > > >> > > >
> > > > > > > > >> > > > (The full-fat hbase-shaded-client works best as direct build-time dependency anyway)
> > > > > > > > >> > > >
> > > > > > > > >> > > > best regards
> > > > > > > > >> > > > Istvan
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > *István Tóth* | Sr. Staff Software Engineer
> > > > > > > > > *Email*: st...@cloudera.com
> > > > > > > > > cloudera.com <https://www.cloudera.com>

--
*István Tóth* | Sr. Staff Software Engineer
*Email*: st...@cloudera.com
cloudera.com <https://www.cloudera.com>
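
For reference, the first of the generic steps quoted above ("Identify all the tools / scripts in the test jars") could start from a listing like the one below. This is only a rough sketch and not something posted in the thread: the function name `list_test_jar_classes`, the directory argument, and the rule that a test jar is recognised by a `-tests.jar` file name suffix are all assumptions made for illustration. It only lists top-level class names; deciding which of them are actual tools (pe, ltt, ITBLL and so on) would still need manual review.

```python
import zipfile
from pathlib import Path


def list_test_jar_classes(lib_dir):
    """Map each test jar in lib_dir to the sorted top-level classes it contains.

    Assumption (illustrative, not from the thread): a jar counts as a test
    jar when its file name ends with "-tests.jar".
    """
    result = {}
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        if not jar.name.endswith("-tests.jar"):
            continue  # treated as a runtime jar, not of interest here
        with zipfile.ZipFile(jar) as zf:
            classes = sorted(
                name[: -len(".class")].replace("/", ".")
                for name in zf.namelist()
                # skip nested and anonymous classes such as Outer$Inner.class
                if name.endswith(".class") and "$" not in name
            )
        result[jar.name] = classes
    return result
```

Calling `list_test_jar_classes` on the `lib` directory of an unpacked binary assembly would return a dictionary keyed by jar file name, which could then be compared against the classes the scripts actually invoke.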