Thanks Houston!
~ David

On Fri, Oct 27, 2023 at 2:03 PM Houston Putman <hous...@apache.org> wrote:

> After fixing the docker tests, I believe all of the other Solr-Check and
> Solr-Smoketest errors, that were the result of running Solr processes, have
> gone away.
> The mTLS issue still exists, and there are other issues with the
> smoketest, but at least there is progress.
>
> We should definitely move the docker tests to use BATS so that we can have
> better control over test cleanup. But that's not going to be a very easy
> migration.
>
> - Houston
>
> On Thu, Oct 26, 2023 at 1:00 PM Houston Putman <hous...@apache.org> wrote:
>
> > Ok, I think I fixed the docker tests. The other issues all still apply
> > though.
> >
> > - Houston
> >
> > On Thu, Oct 26, 2023 at 12:16 PM Houston Putman <hous...@apache.org>
> > wrote:
> >
> >> The Jenkins builds aren't in a great state right now.
> >>
> >> Currently the Solr-Check-main
> >> <https://ci-builds.apache.org/job/Solr/job/Solr-Check-main> build is
> >> failing consistently because of random Solr processes being found on the
> >> box (when the integration tests expect nothing else to be running). Now
> >> that we have port randomization for the integration tests, it's a very good
> >> sign that the found Solr processes all use port 8983, meaning that we
> >> aren't leaking Solrs in the integration tests.
> >>
> >> Because of this, the culprit seems to be that the smoke tests (which
> >> still start a Solr on port 8983) are leaking processes, and looking at the
> >> logs, that seems to be the case (Solr-Smoketest-9.4
> >> <https://ci-builds.apache.org/job/Solr/job/Solr-Smoketest-9.4>,
> >> Solr-Smoketest-9.x
> >> <https://ci-builds.apache.org/job/Solr/job/Solr-Smoketest-9.x>). So
> >> fixing the Smoketests leaking Solr processes will in turn fix both the
> >> smoke test builds and the main check.
> >>
> >> As for the Solr-Check-9.x
> >> <https://ci-builds.apache.org/job/Solr/job/Solr-Check-9.x> build, it is
> >> running on Crave, so it doesn't have the same issue with leaked Solr
> >> processes. However, on Crave, there seems to be an issue with the mTLS
> >> tests. (Solr-Check-main also has this issue, but only on the lucene-solr-1
> >> machine, not lucene-solr-2, strangely.) We need to investigate why the TLS
> >> tests pass locally for everyone (and on half of the Jenkins boxes), but not
> >> on Crave.
> >>
> >> Lastly, the Docker tests are broken in a very strange way. A while ago, I
> >> added tests to make sure that the prometheus exporter can communicate
> >> correctly in docker. This test seems to fail on both
> >> Solr-Docker-Nightly-main
> >> <https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-main> and
> >> Solr-Docker-Nightly-9.x
> >> <https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-9.x>. At
> >> first I thought the issue was that the Jenkins servers had different Docker
> >> networking that didn't support these tests, and I let it be for a bit. Now
> >> we are running Solr-Docker-Nightly-9.4
> >> <https://ci-builds.apache.org/job/Solr/job/Solr-Docker-Nightly-9.4>,
> >> which has the same tests included, and it passes. So it does seem like the
> >> Jenkins servers allow us to use Docker networking in the ways we want, but
> >> for some reason 9.x and 9.4 (which should be nearly identical) don't
> >> behave the same way. Looking at the err logs, the problem is
> >>
> >>> /opt/solr/docker/scripts/docker-entrypoint.sh: line 48: exec:
> >>> solr-exporter: not found
> >>>
> >> Off the top of my head, I think this might be using the slim Docker image?
> >> Because otherwise there's no reason why the Solr exporter wouldn't be
> >> there... (Also no idea why it wouldn't work the same on the 9.4 build...)
> >>
> >> Anyways, this is just a list of what's going on. I'll try to fix the
> >> docker stuff, but would love help with the other builds!
> >>
> >> - Houston
> >>
> >
>
