Hi all,

Appreciate the context, Uwe! In poking around a bit, I see there's a cleanup option in the job config called "Clean up this job's workspaces from other slave nodes". Seems like that might help a bit, though I'd have to do a little more digging to be sure. It doesn't appear to be enabled on either the Lucene or Solr jobs, fwiw.
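For what it's worth, the first step I'd want is just a dry-run listing of workspace directories on each agent that no longer match any current job. Here is a rough sketch of that, untested: the Jenkins folder URL and workspace root are placeholders, it assumes the default <agent-root>/workspace/<job-name> layout with anonymous read access to the JSON API, and it doesn't handle jobs nested in sub-folders.

#!/usr/bin/env python3
# Rough sketch, untested: run on a build agent to flag workspace directories
# that no longer correspond to a current job. The URL and workspace root below
# are placeholders; adjust for the real agents. Prints only - deletes nothing.
import json
import urllib.request
from pathlib import Path

JENKINS_FOLDER = "https://ci-builds.apache.org/job/Lucene"  # folder holding the jobs
WORKSPACE_ROOT = Path("/home/jenkins/workspace")            # hypothetical agent path

# Current job names, pulled from the Jenkins JSON API (assumes anonymous read).
with urllib.request.urlopen(f"{JENKINS_FOLDER}/api/json?tree=jobs[name]") as resp:
    jobs = {job["name"] for job in json.load(resp)["jobs"]}

# Any directory under the workspace root that no current job claims is a
# candidate orphan. Concurrent-build suffixes like "Foo@2" are stripped first;
# jobs nested in sub-folders would need extra handling.
for entry in sorted(WORKSPACE_ROOT.iterdir()):
    if entry.is_dir() and entry.name.split("@")[0] not in jobs:
        print(f"possible orphan: {entry}")

If a listing like that lines up with what was cleaned up by hand, enabling the cleanup option above (or running something similar periodically) might keep the disks from filling again.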
Does anyone have a pointer to the discussion around getting these project-specific build nodes? I searched around in JIRA and on lists.apache.org, but couldn't find a request for the nodes or a discussion about creating them. I'd love to understand the rationale a bit better: in talking to INFRA folks last week, they suggested the main (only?) reason folks use project-specific VMs is to avoid waiting in the general pool... but our average wait time these days, at least, looks much longer than anything in the general pool.

From the "outside" looking in, it feels like we're taking on more maintenance/infra burden for worse results (at least as defined by 'uptime' and build-wait times).

Best,

Jason

On Tue, Oct 15, 2024 at 6:12 AM Uwe Schindler <u...@thetaphi.de> wrote:
>
> Hi,
>
> I have root access on both machines. I was not aware of problems. The workspace name problem is a known one: if a node is down while the job is renamed, or multiple nodes have a workspace for the job, Jenkins can't delete it. In the case of multiple nodes, it only deletes the workspace on the node which ran the job most recently.
>
> As a general rule that I always use: before renaming a job, go to the job and prune the workspace from the web interface. But this has the same problem as described before: it only shows the workspace on the last node that executed the job.
>
> Uwe
>
> On 14.10.2024 at 22:01, Jason Gerlowski wrote:
> > Of course, happy to help - glad you got some 'green' builds.
> >
> > Both agents should be back online now.
> >
> > The root of the problem appears to be that Jenkins jobs use a static workspace whose path is based on the name of the job. This would work great if job names never changed, I guess. But our job names *do* drift - both Lucene and Solr tend to include version strings (e.g. Solr-check-9.6, Lucene-check-9.12), which introduces some "drift" and orphans a few workspaces a year. That doesn't sound like much, but each workspace contains a full Solr or Lucene checkout+build, so they add up pretty quickly. Anyway, that root problem remains and will need to be addressed if our projects want to continue using the specially tagged agents. But things are healthy for now!
> >
> > Best,
> >
> > Jason
> >
> > On Tue, Oct 8, 2024 at 3:10 AM Luca Cavanna <java...@apache.org> wrote:
> >> Thanks a lot, Jason,
> >> this helps a lot. I see that the newly added jobs for 10x and 10.0 have been built and it all looks pretty green now.
> >>
> >> Thanks
> >> Luca
> >>
> >> On Mon, Oct 7, 2024 at 11:27 PM Jason Gerlowski <gerlowsk...@gmail.com> wrote:
> >>> Hi Luca,
> >>>
> >>> I suspect I'm chiming in here a little late to help with your release-related question, but...
> >>>
> >>> I stopped into the "#askinfra Office Hours" this afternoon at ApacheCon and asked for some help on this. Both workers had disk-space issues, seemingly due to orphaned workspaces. I've gotten one agent/worker back online (lucene-solr-2, I believe). The other one I'm hoping to get back online shortly, after a bit more cleanup.
> >>>
> >>> (Getting the right permissions to clean things up was a bit of a process; I'm hoping to document this and will share here when that's ready.)
> >>>
> >>> There are still nightly jobs that run on the ASF Jenkins (for both Lucene and Solr); on the Solr side at least these are quite useful.
> >>>
> >>> Best,
> >>>
> >>> Jason
> >>>
> >>> On Wed, Oct 2, 2024 at 2:40 PM Luca Cavanna <java...@apache.org> wrote:
> >>>> Hi all,
> >>>> I created new CI jobs at https://ci-builds.apache.org/job/Lucene/ yesterday to cover branch_10x and branch_10_0. Not a single build for them has started so far.
> >>>>
> >>>> Poking around, I noticed a message in the build history, "Pending - all nodes of label Lucene are offline", which looked suspicious. Are we still using this Jenkins? I used it successfully for a release I did in the past, but that was already some months ago. Creating the jobs is still part of the release wizard process anyway, so it felt right to do this step. I am not sure how to proceed from here; does anyone know? I also noticed a low-disk-space warning on one of the two agents.
> >>>>
> >>>> Thanks
> >>>> Luca
>
> --
> Uwe Schindler
> Achterdiek 19, D-28357 Bremen
> https://www.thetaphi.de
> eMail: u...@thetaphi.de

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org