Re: [DISCUSS] Nightly snaphot builds
Thanks Zoltan. Makes sense. Also, we should definitely strive to release
within 180 days especially when there are lots of commits to a branch.

-Vihang

On Fri, May 26, 2023 at 12:04 AM Zoltan Haindrich wrote:

> On 5/25/23 19:58, vihang karajgaonkar wrote:
> > I just tried the job and it worked as expected. Thanks! If I understand
> > correctly, the job retains builds for 180 days. Does it mean if there were
> > no commits to a branch for more than 180 days, we will lose the build
> > artifacts eventually?
>
> not entirely - the removal of old builds is a post-build action; which
> means - if there are no builds; the removal logic will never run
> https://plugins.jenkins.io/discard-old-build/
>
> on the other hand I wonder how much value a nightly build can still
> provide after 180 days :)
> preferably - a real release should be done after some time :)
>
> cheers,
> Zoltan
Re: [DISCUSS] Nightly snaphot builds
On 5/25/23 19:58, vihang karajgaonkar wrote:
> I just tried the job and it worked as expected. Thanks! If I understand
> correctly, the job retains builds for 180 days. Does it mean if there were
> no commits to a branch for more than 180 days, we will lose the build
> artifacts eventually?

not entirely - the removal of old builds is a post-build action; which
means - if there are no builds; the removal logic will never run
https://plugins.jenkins.io/discard-old-build/

on the other hand I wonder how much value a nightly build can still
provide after 180 days :)
preferably - a real release should be done after some time :)

cheers,
Zoltan
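The retention behaviour described in this message can be pictured with a small shell analogy. This is only a sketch, not the discard-old-build plugin's code: `KEEP_DAYS`, `prune_old_builds`, `run_build` and the one-directory-per-build layout are all assumptions made for illustration. Pruning happens only as the last step of a build, so a branch with no new builds never triggers it.

```shell
#!/bin/sh
# Analogy only: retention implemented as a step that runs at the end of a build.
# KEEP_DAYS and the directory layout are assumptions, not the Jenkins job's settings.
KEEP_DAYS=180

# prune_old_builds ROOT: delete build directories directly under ROOT
# whose modification time is more than KEEP_DAYS days in the past.
prune_old_builds() {
  find "$1" -mindepth 1 -maxdepth 1 -type d -mtime +"$KEEP_DAYS" -exec rm -rf {} +
}

# run_build ROOT NAME: "build" by creating ROOT/build-NAME,
# then prune as a post-build step.
run_build() {
  mkdir -p "$1/build-$2"
  prune_old_builds "$1"
}
```

With this shape, a build directory older than 180 days survives for as long as `run_build` is never called; the first later build removes it.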
Re: [DISCUSS] Nightly snaphot builds
I just tried the job and it worked as expected. Thanks! If I understand
correctly, the job retains builds for 180 days. Does it mean if there were
no commits to a branch for more than 180 days, we will lose the build
artifacts eventually?

On Thu, May 25, 2023 at 1:50 AM Zoltan Haindrich wrote:

> Hey Vihang,
>
> I've added you as an admin; and I've copied the job as
> http://ci.hive.apache.org/job/hive-nightly-branch-3/
> other option could be to trigger the original job or use
> parameterized-scheduler but that would configure a real unconditional
> nightly build - which will just build the
> same version over-and-over again if there are no changes...
> ...the current nightly is SCM triggered; but only once-a-day it makes a
> check which creates the desired results.
>
> the least painful was to copy the job; I guess no-one touched the
> pipeline script ever since it was introduced :D
>
> cheers,
> Zoltan
Re: [DISCUSS] Nightly snaphot builds
Hey Vihang,

I've added you as an admin; and I've copied the job as
http://ci.hive.apache.org/job/hive-nightly-branch-3/
other option could be to trigger the original job or use
parameterized-scheduler but that would configure a real unconditional
nightly build - which will just build the
same version over-and-over again if there are no changes...
...the current nightly is SCM triggered; but only once-a-day it makes a
check which creates the desired results.

the least painful was to copy the job; I guess no-one touched the
pipeline script ever since it was introduced :D

cheers,
Zoltan

On 5/25/23 01:26, vihang karajgaonkar wrote:
> I created https://issues.apache.org/jira/browse/HIVE-27371 to have nightly
> builds for branch-3. Once that is merged, I think we can have scheduled
> builds for branch-3 as well. Although, I don't have permissions to create a
> new job for branch-3. Does anyone know how to do it?
>
> Thanks,
> Vihang
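The difference between an unconditional nightly build and an SCM-triggered one that checks once a day can be sketched in shell. This is not the Jenkins job's configuration: `should_build`, the state file and the way the commit id is obtained are hypothetical, chosen only to show why an unchanged branch is not rebuilt.

```shell
#!/bin/sh
# should_build SHA STATE_FILE
# Returns 0 and records SHA if it differs from the commit built last time;
# returns 1 when nothing changed, so the nightly run can be skipped.
should_build() {
  if [ -f "$2" ] && [ "$(cat "$2")" = "$1" ]; then
    return 1
  fi
  printf '%s\n' "$1" > "$2"
}
```

A scheduler would call this once a day with the current branch head (for example the first field printed by `git ls-remote <repo-url> refs/heads/branch-3`) and start the build only when the function succeeds.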
Re: [DISCUSS] Nightly snaphot builds
I created https://issues.apache.org/jira/browse/HIVE-27371 to have nightly
builds for branch-3. Once that is merged, I think we can have scheduled
builds for branch-3 as well. Although, I don't have permissions to create a
new job for branch-3. Does anyone know how to do it?

Thanks,
Vihang

On Wed, May 24, 2023 at 10:07 AM vihang karajgaonkar wrote:

> The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great. Can
> we have this for branch-3 as well since we have been backporting a lot of
> PRs to branch-3 lately?
>
> Thanks,
> Vihang
Re: [DISCUSS] Nightly snaphot builds
The nightly job http://ci.hive.apache.org/job/hive-nightly/ is great. Can
we have this for branch-3 as well since we have been backporting a lot of
PRs to branch-3 lately?

Thanks,
Vihang

On Wed, May 24, 2023 at 6:56 AM Zoltan Haindrich wrote:

> Hey,
>
> > We already have nightly builds for Hive [1].
> > [1] http://ci.hive.apache.org/job/hive-nightly/
>
> ...and hive-dev-box can launch such archives; either by using it like this:
> https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
>
> or with a somewhat longer command you could launch hdb in bazaar mode; and
> have an HS2 running with a nightly version:
>
> docker run --rm -d -p 1:1 -v hive-dev-box_work:/work -e
> HIVE_VERSION=http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz
> --name hive
> kgyrtkirk/hive-dev-box:bazaar
>
> cheers,
> Zoltan
Re: [DISCUSS] Nightly snaphot builds
Hey,

> We already have nightly builds for Hive [1].
> [1] http://ci.hive.apache.org/job/hive-nightly/

...and hive-dev-box can launch such archives; either by using it like this:
https://www.mail-archive.com/dev@hive.apache.org/msg142420.html
or with a somewhat longer command you could launch hdb in bazaar mode; and have an HS2 running with a nightly version:

    docker run --rm -d -p 1:1 -v hive-dev-box_work:/work -e HIVE_VERSION=http://ci.hive.apache.org/job/hive-nightly/lastSuccessfulBuild/artifact/archive/apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz --name hive kgyrtkirk/hive-dev-box:bazaar

cheers,
Zoltan

On 5/24/23 09:15, Stamatis Zampetakis wrote:
> [...]
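A minimal sketch of the longer command from Zoltan's message above, wrapped in a shell function so that the nightly archive URL and the port mapping become parameters. `hdb_nightly_cmd` is a hypothetical helper name and not part of hive-dev-box; the function only prints the command so it can be reviewed before it is run. The port mapping appears as `1:1` in the archived message and is simply passed through here; substitute whatever mapping your setup needs.

```shell
# Sketch only: print (do not run) a hive-dev-box "bazaar" command for a nightly archive.
# hdb_nightly_cmd is a hypothetical helper name, not something shipped with hive-dev-box.
hdb_nightly_cmd() {
  local archive_url="$1"   # URL of an apache-hive-*-bin.tar.gz nightly artifact
  local port_map="$2"      # value for docker's -p option (host:container)
  if [ -z "$archive_url" ] || [ -z "$port_map" ]; then
    echo "usage: hdb_nightly_cmd <archive-url> <host-port:container-port>" >&2
    return 1
  fi
  # Same options as in the message: detached, removed on exit, shared work volume.
  printf 'docker run --rm -d -p %s -v hive-dev-box_work:/work -e HIVE_VERSION=%s --name hive kgyrtkirk/hive-dev-box:bazaar\n' \
    "$port_map" "$archive_url"
}
```

Calling `hdb_nightly_cmd "$NIGHTLY_URL" 1:1` prints the full `docker run` line; once it looks right, it can be copied into a terminal or piped to `sh`.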
Re: [DISCUSS] Nightly snaphot builds
Hey all,

We already have nightly builds for Hive [1]. Do we need something more than that?

Best,
Stamatis

[1] http://ci.hive.apache.org/job/hive-nightly/

On Tue, May 23, 2023 at 9:03 AM vihang karajgaonkar wrote:
> [...]
Re: [DISCUSS] Nightly snaphot builds
I think there are many benefits, as others in this thread suggested, which can be built on top of nightly builds. Having docker images is great, but for now I think we can start simple and publish the jars. Many users still just deploy using jars and it would be useful to them. Once we have a docker environment, we can add a docker image to the nightly builds too, so that users can choose their preferred way.

On Mon, May 22, 2023 at 11:07 PM Sungwoo Park wrote:
> [...]
Re: [DISCUSS] Nightly snaphot builds
I think such nightly builds will be useful for testing and debugging in the future.

I also wonder if we can somehow create builds even from previous commits (e.g., for the past few years). Such builds from previous commits don't have to be daily builds, and I think weekly builds (or even monthly builds) would also be very useful.

The reason I wish such builds were available is to facilitate debugging and testing. When tested against the TPC-DS benchmark, the current master branch has several correctness problems that were introduced after the release of Hive 3.1.2. We have reported all problems known to us in [1] and also submitted several patches. If such nightly builds had been available, we would have saved quite a bit of time for implementing the patches by quickly finding offending commits that introduced new correctness bugs.

In addition, you can find quite a few commits in the master branch that report bugs which are not reproduced in Hive 3.1.2. Examples: HIVE-19990, HIVE-14557, HIVE-21132, HIVE-21188, HIVE-21544, HIVE-22114, HIVE-7, HIVE-22236, HIVE-23911, HIVE-24198, HIVE-22777, HIVE-25170, HIVE-25864, HIVE-26671. (There may be some errors in this list because we compared against Hive 3.1.2 with many patches backported.) Such nightly builds can be useful for finding root causes of such bugs.

Ideally I wish there was an automated procedure to create nightly builds, run TPC-DS benchmark, and report correctness/performance results, although this would be quite hard to implement. (I remember Spark implemented this procedure in the era of Spark 2, but my memory could be wrong.)

[1] https://issues.apache.org/jira/browse/HIVE-26654

On Tue, May 23, 2023 at 10:44 AM Ayush Saxena wrote:
> [...]
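The idea in Sungwoo's message of using archived builds to find the offending commit can be sketched as a binary search over an oldest-to-newest list of builds. This is only an illustration under stated assumptions: `first_bad_build` and the check command are hypothetical names, the check is assumed to succeed for a build that gives correct results, and the list is assumed to switch from good to bad exactly once, with the newest build being bad. `git bisect` applies the same idea directly to commits.

```shell
# Sketch only: find the first failing build in an oldest-to-newest list.
# Usage: first_bad_build CHECK BUILD...
#   CHECK is a command that exits 0 when the given build behaves correctly.
# Assumes the newest build is bad and that results flip from good to bad once.
first_bad_build() {
  local check="$1"
  shift
  local lo=1      # every build before position lo is known to be good
  local hi=$#     # the build at position hi is known (or assumed) to be bad
  local mid build
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi) / 2 ))
    eval "build=\${$mid}"          # fetch the positional parameter at index mid
    if "$check" "$build"; then
      lo=$(( mid + 1 ))            # build is good: first bad one is later
    else
      hi=$mid                      # build is bad: first bad one is here or earlier
    fi
  done
  eval "printf '%s\n' \"\${$lo}\""
}
```

With weekly or monthly builds the same search narrows the problem to one interval between two builds, after which the commits inside that interval still have to be checked individually.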
Re: [DISCUSS] Nightly snaphot builds
Hi Vihang,
+1, We were even exploring publishing the docker images of the snapshot version as well per commit or maybe weekly, so just shoot 2 docker commands and you get a Hive cluster running with master code.

Sai, I think to spin up an env via Docker with all these things should be doable for sure, but would require someone with real good expertise with docker as well as setting up these services with Hive. Obviously, I am not that guy :-)

@Simhadri has a PR which publishes docker images once a release tag is pushed, you can explore to have similar stuff for the Snapshot version, maybe if that sounds cool

-Ayush

On Tue, 23 May 2023 at 04:26, Sai Hemanth Gantasala wrote:
> [...]
Re: [DISCUSS] Nightly snaphot builds
Hi Vihang,

+1 on the idea.

This is a great way to quickly test whether a certain feature is working as expected on a certain branch. This way we can test data loss, correctness, or any other unexpected scenarios that are specific to Hive. However, I'm wondering if it is possible to deploy/test in a kerberized environment, or to test issues involving authorization services like sentry/ranger.

Thanks,
Sai.

On Mon, May 22, 2023 at 11:15 AM vihang karajgaonkar wrote:
> [...]
[DISCUSS] Nightly snaphot builds
Hello Team,

I have observed that it is a common use case for users to want to test out unreleased features/bug fixes, either to unblock themselves or to check whether the bug fixes really work as intended in their environments. Today, in the case of Apache Hive, this is not very user friendly because it requires the end user to build the binaries directly from the Hive source code.

I found that Apache Spark has a very useful infrastructure [1] which deploys nightly snapshots [2] [3] from the branch using GitHub Actions. This is super useful for any user who wants to try out the latest and greatest using the nightly builds.

I was wondering if we should also adopt this. We can use GitHub Actions to upload the snapshot jars to a public repository (e.g., GitHub Packages) and schedule it as a nightly job.

[1] https://issues.apache.org/jira/browse/INFRA-21167
[2] https://github.com/apache/spark/pkgs/container/apache-spark-ci-image
[3] https://github.com/apache/spark/pull/30623

I can take a stab at this if the community thinks that this is a nice thing to have.

Thanks,
Vihang
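A scheduled job of this kind needs some way to label each snapshot it publishes. A minimal sketch, assuming a label built from the version, a 10-character commit prefix, and a UTC timestamp, which mirrors the shape of the hive-nightly archive name quoted earlier in the thread (apache-hive-4.0.0-nightly-b0b3fde70c-20230524_014711-bin.tar.gz). `nightly_label` is a hypothetical helper, not something taken from the Hive or Spark build scripts, and the naming scheme itself is an assumption inferred from that one file name.

```shell
# Sketch only: compose a nightly archive name from version, commit and timestamp.
# Usage: nightly_label VERSION COMMIT [TIMESTAMP]
#   TIMESTAMP defaults to the current UTC time as YYYYMMDD_HHMMSS.
nightly_label() {
  local version="$1"
  local commit="$2"
  local stamp="${3:-$(date -u +%Y%m%d_%H%M%S)}"
  local short
  # Keep only the first 10 characters of the commit hash.
  short="$(printf '%s' "$commit" | cut -c1-10)"
  printf 'apache-hive-%s-nightly-%s-%s-bin.tar.gz\n' "$version" "$short" "$stamp"
}
```

In a nightly job the commit would typically come from `git rev-parse HEAD`; including it in the name makes it easy to see whether two nightly archives were built from the same source.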