The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great. Can we have this for branch-3 as well, since we have been backporting a lot of PRs to branch-3 lately?

Thanks,
Vihang

On Wed, May 24, 2023 at 6:56 AM Zoltan Haindrich <k...@rxd.hu> wrote:

> Hey,
>
> > We already have nightly builds for Hive [1].
> > [1] http://ci.hive.apache.org/job/hive-nightly/
>
> ...and hive-dev-box can launch such archives; either by using it like this:
> https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
>
> or with a somewhat longer command you could launch hdb in bazaar mode; and
> have an HS2 running with a nightly version:
>
> docker run --rm -d -p 10000:10000 -v hive-dev-box_work:/work \
>   -e HIVE_VERSION=http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz \
>   --name hive \
>   kgyrtkirk/hive-dev-box:bazaar
>
> cheers,
> Zoltan
>
> On 5/24/23 09:15, Stamatis Zampetakis wrote:
> > Hey all,
> >
> > We already have nightly builds for Hive [1].
> >
> > Do we need something more than that?
> >
> > Best,
> > Stamatis
> >
> > [1] http://ci.hive.apache.org/job/hive-nightly/
> >
> >
> > On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar <vihan...@apache.org> wrote:
> >>
> >> I think there are many benefits like others in this thread suggested which
> >> can be built on top of nightly builds. Having docker images is great but
> >> for now I think we can start simple and publish the jars. Many users still
> >> just deploy using jars and it would be useful to them. Once we have a
> >> docker environment we can add a docker image too to the nightly builds so
> >> that users can choose their preferred way.
> >>
> >> On Mon, May 22, 2023 at 11:07 PM Sungwoo Park <glap...@gmail.com> wrote:
> >>
> >>> I think such nightly builds will be useful for testing and debugging in the
> >>> future.
> >>>
> >>> I also wonder if we can somehow create builds even from previous commits
> >>> (e.g., for the past few years). Such builds from previous commits don't
> >>> have to be daily builds, and I think weekly builds (or even monthly builds)
> >>> would also be very useful.
> >>>
> >>> The reason I wish such builds were available is to facilitate debugging and
> >>> testing. When tested against the TPC-DS benchmark, the current master
> >>> branch has several correctness problems that were introduced after the
> >>> release of Hive 3.1.2. We have reported all problems known to us in [1] and
> >>> also submitted several patches. If such nightly builds had been available,
> >>> we would have saved quite a bit of time for implementing the patches by
> >>> quickly finding offending commits that introduced new correctness bugs.
> >>>
> >>> In addition, you can find quite a few commits in the master branch that
> >>> report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990,
> >>> HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114,
> >>> HIVE-22227, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777,
> >>> HIVE-25170, HIVE-25864, HIVE-26671.
> >>> (There may be some errors in this list because we compared against Hive
> >>> 3.1.2 with many patches backported.) Such nightly builds can be useful for
> >>> finding root causes of such bugs.
> >>>
> >>> Ideally I wish there was an automated procedure to create nightly builds,
> >>> run TPC-DS benchmark, and report correctness/performance results, although
> >>> this would be quite hard to implement. (I remember Spark implemented this
> >>> procedure in the era of Spark 2, but my memory could be wrong.)
> >>>
> >>> [1] https://issues.apache.org/jira/browse/HIVE-26654
> >>>
> >>>
> >>> On Tue, May 23, 2023 at 10:44 AM Ayush Saxena <ayush...@gmail.com> wrote:
> >>>
> >>>> Hi Vihang,
> >>>> +1, We were even exploring publishing the docker images of the snapshot
> >>>> version as well per commit or maybe weekly, so just shoot 2 docker commands
> >>>> and you get a Hive cluster running with master code.
> >>>>
> >>>> Sai, I think to spin up an env via Docker with all these things should be
> >>>> doable for sure, but would require someone with real good expertise with
> >>>> docker as well as setting up these services with Hive. Obviously, I am not
> >>>> that guy :-)
> >>>>
> >>>> @Simhadri has a PR which publishes docker images once a release tag is
> >>>> pushed, you can explore to have similar stuff for the Snapshot version,
> >>>> maybe if that sounds cool
> >>>>
> >>>> -Ayush
> >>>>
> >>>> On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala
> >>>> <saihema...@cloudera.com.invalid> wrote:
> >>>>
> >>>>> Hi Vihang,
> >>>>>
> >>>>> +1 on the idea.
> >>>>>
> >>>>> This is a great idea to quickly test if a certain feature is working as
> >>>>> expected on a certain branch.
> >>>>> This way we test data loss, correctness, or any other unexpected scenarios
> >>>>> that are Hive specific only. However, I'm wondering if it is possible to
> >>>>> deploy/test in a kerberized environment or issues involving authorization
> >>>>> services like sentry/ranger.
> >>>>>
> >>>>> Thanks,
> >>>>> Sai.
> >>>>>
> >>>>> On Mon, May 22, 2023 at 11:15 AM vihang karajgaonkar <vihan...@apache.org> wrote:
> >>>>>
> >>>>>> Hello Team,
> >>>>>>
> >>>>>> I have observed that it is a common use-case where users would like to test
> >>>>>> out unreleased features/bug fixes either to unblock them or test out if the
> >>>>>> bug fixes really work as intended in their environments. Today in the case
> >>>>>> of Apache Hive, this is not very user friendly because it requires the end
> >>>>>> user to build the binaries directly from the hive source code.
> >>>>>>
> >>>>>> I found that Apache Spark has a very useful infrastructure [1] which
> >>>>>> deploys nightly snapshots [2] [3] from the branch using github actions.
> >>>>>> This is super useful for any user who wants to try out the latest and
> >>>>>> greatest using the nightly builds.
> >>>>>>
> >>>>>> I was wondering if we should also adopt this. We can use github actions to
> >>>>>> upload the snapshot jars to the public repository (e.g github packages) and
> >>>>>> schedule it as a nightly job.
> >>>>>>
> >>>>>> [1] https://issues.apache.org/jira/browse/INFRA-21167
> >>>>>> [2] https://github.com/apache/spark/pkgs/container/apache-spark-ci-image
> >>>>>> [3] https://github.com/apache/spark/pull/30623
> >>>>>>
> >>>>>> I can take a stab at this if the community thinks that this is a nice thing
> >>>>>> to have.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Vihang
> >>>>>>
> >>>>>
> >>>>
> >>>
>