aduchate opened a new issue, #5342: URL: https://github.com/apache/couchdb/issues/5342
## Description

On a 4-node cluster (3.4.2) with a fairly large number of databases (~1500, between 1GB and 150GB each), we recently added and modified about 20 design docs per database (30000 design docs in total). We have set up Ken with a concurrency of 5 to let the indexing happen.

About every 10 minutes, we see one indexer process that stops being updated. It basically stays stuck forever (we let a few linger for 24 hours). Killing all couchjs_mainjs processes has no impact on the stuck indexer. The only way to get rid of those stuck indexers is to issue, in remsh, an `exit(<pid>, kill)`. The pid here is the `pid` field of `/_active_tasks`, not `indexer_pid`.

## Steps to Reproduce

Create a lot of databases with a lot of data, create a few design documents per database, start Ken.

## Expected Behaviour

The indexers shouldn't get stuck.

## Your Environment

* CouchDB version used: 3.4.2
* Browser name and version: irrelevant
* Operating system and version: ubuntu 22.04 (couchdb compiled with dockerfile below)

```
FROM ubuntu:22.04

# Create app directory
WORKDIR /root

# Install dependencies
RUN apt-get update
RUN ln -fs /usr/share/zoneinfo/Europe/Brussels /etc/localtime
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends tzdata
RUN dpkg-reconfigure --frontend noninteractive tzdata
RUN apt-get install -y gnupg2 wget curl vim git build-essential pkg-config libicu-dev libmozjs-78-dev libcurl4-openssl-dev libncurses-dev node-gyp npm libssl-dev help2man openjdk-21-jdk-headless

RUN git clone https://github.com/erlang/otp otp_src_27.1.2
WORKDIR /root/otp_src_27.1.2
RUN git checkout -b 27.1.2 44ffe8811dfcf3d2fe04d530c6e8fac5ca384e02
RUN bash -c 'export ERL_TOP=`pwd`; export LANG=C; ./configure; make; make release_tests; cd release/tests/test_server; /root/otp_src_27.1.2/bin/erl -s ts install -s ts smoke_test batch -s init stop; cd ..; tar zcvf otp-tests.tgz test_server; cd /root/otp_src_27.1.2; make install'

WORKDIR /root
RUN git clone https://github.com/apache/couchdb.git #9
WORKDIR /root/couchdb
RUN git checkout -b 3.4.2 6e5ad2a5c5479cb09722b4a7d13b3d59b7bb2a23
RUN bash -c 'curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash ; . /root/.nvm/nvm.sh; nvm install 18'
RUN ./configure --disable-docs --spidermonkey-version 78 chdir=/media/data/src/couchdb
RUN bash -c '. /root/.nvm/nvm.sh; nvm use 18; make release'

WORKDIR /root/couchdb/rel
RUN tar zcvf couchdb.jammy-jellyfish.3.4.2.tgz couchdb
```

## Additional Context

We can give you access to the infrastructure that causes the problem to happen if needed.
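
For anyone trying to spot the stuck indexers described above, here is a minimal Python sketch that filters a list of tasks, as returned by `GET /_active_tasks`, down to indexer entries whose `updated_on` timestamp has not advanced recently. The function name `stalled_indexers`, the 600-second threshold, and the sample task values are assumptions made for illustration; they are not taken from our cluster or from CouchDB itself.

```python
import time


def stalled_indexers(tasks, now=None, max_idle_seconds=600):
    """Return (pid, database, design_document) for indexer tasks that
    have not been updated within max_idle_seconds.

    `tasks` is the decoded JSON list from GET /_active_tasks, where
    `updated_on` is a Unix timestamp in seconds.
    """
    if now is None:
        now = int(time.time())
    stalled = []
    for task in tasks:
        if task.get("type") != "indexer":
            continue  # skip replication, compaction, etc.
        updated_on = task.get("updated_on")
        if updated_on is None:
            continue  # cannot judge a task without a timestamp
        if now - updated_on > max_idle_seconds:
            stalled.append(
                (task.get("pid"), task.get("database"), task.get("design_document"))
            )
    return stalled


# Hypothetical sample data, shaped like /_active_tasks entries.
sample_tasks = [
    {"type": "indexer", "pid": "<0.100.0>", "database": "shards/a/db1",
     "design_document": "_design/foo", "updated_on": 1000},
    {"type": "indexer", "pid": "<0.200.0>", "database": "shards/b/db2",
     "design_document": "_design/bar", "updated_on": 1990},
    {"type": "replication", "pid": "<0.300.0>", "updated_on": 0},
]

for pid, db, ddoc in stalled_indexers(sample_tasks, now=2000):
    print(pid, db, ddoc)
```

The `pid` values it reports are the ones to pass to `exit(<pid>, kill)` in remsh; the sketch only identifies candidates and does not kill anything.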
