Performance problems

Artem Abeleshev Sun, 05 Feb 2023 21:05:33 -0800

Hi everyone!

I've been struggling with a performance problem for a couple of weeks
now. We have two environments:


- `dev` with 2 nodes of ManifoldCF agent + 1 node of Zookeeper
- `prod` with 4 nodes of ManifoldCF agent + 3 nodes of Zookeeper

ManifoldCF agent settings are identical at the moment; we have explicitly
set the following:

- 200 db handles (`org.apache.manifoldcf.database.maxhandles`)
- 100 worker threads (`org.apache.manifoldcf.crawler.threads`)
- 10 expire threads (`org.apache.manifoldcf.crawler.expirethreads`)
- 10 cleanup threads (`org.apache.manifoldcf.crawler.cleanupthreads`)
- 10 document delete threads (`org.apache.manifoldcf.crawler.deletethreads`)

(but I have tried prod with various configs, the result is the same)

In the Postgres config, at the moment we have `max_connections` of `840`
and `shared_buffers` of `244559`.
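
As a rough sanity check on these numbers, the sketch below compares the worst-case number of database handles against `max_connections`. It is only an illustration: `ConnectionBudget` and its methods are hypothetical names, and it assumes that `maxhandles` applies per agent node, that both environments use the same `max_connections` of `840`, and that no other processes connect to the same database.

```java
// Hypothetical helper, not part of ManifoldCF: checks whether the database
// handles configured across all agent nodes fit under PostgreSQL's limit.
final class ConnectionBudget {

    // Worst case: every agent node opens all of its configured handles.
    static int required(int agentNodes, int maxHandlesPerNode) {
        return agentNodes * maxHandlesPerNode;
    }

    // Connections left over for other clients; negative means over-subscribed.
    static int headroom(int agentNodes, int maxHandlesPerNode, int maxConnections) {
        return maxConnections - required(agentNodes, maxHandlesPerNode);
    }

    public static void main(String[] args) {
        // Assumed: max_connections = 840 in both environments.
        System.out.println("dev headroom:  " + headroom(2, 200, 840)); // 440
        System.out.println("prod headroom: " + headroom(4, 200, 840)); // 40
    }
}
```

Under those assumptions, prod has only 40 spare connections if every node uses its full pool.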

I have a job that runs really slowly on the production environment
compared to the development environment. I have monitored the JVM using
VisualVM and noticed that the worker threads spend almost all of their
time in the `WAITING` or `TIMED_WAITING` state. I grabbed a lot of thread
dumps, and almost every time I found worker threads waiting on `LockGate`.
What could be causing this, and what can I do about it?
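
To compare many dumps quickly, something like the following sketch can tally thread states and count the threads whose stack mentions `LockGate`. It assumes HotSpot-style text dumps (for example from `jstack`), where each thread is a block separated by a blank line and contains a `java.lang.Thread.State:` line. `ThreadDumpSummary` is a hypothetical name, not a ManifoldCF or JDK tool.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: summarizes a HotSpot-style text thread dump.
final class ThreadDumpSummary {
    static final String STATE_PREFIX = "java.lang.Thread.State:";

    // Counts how many threads are in each state (WAITING, RUNNABLE, ...).
    static Map<String, Integer> countStates(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : dump.split("\\R")) {
            String s = line.trim();
            if (s.startsWith(STATE_PREFIX)) {
                // Keep only the state name, dropping details such as "(parking)".
                String rest = s.substring(STATE_PREFIX.length()).trim();
                counts.merge(rest.split("\\s+")[0], 1, Integer::sum);
            }
        }
        return counts;
    }

    // Counts thread blocks (separated by blank lines) that contain the marker.
    static int countThreadsMentioning(String dump, String marker) {
        int n = 0;
        for (String block : dump.split("\\R\\s*\\R")) {
            if (block.contains(marker)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ThreadDumpSummary <path-to-dump-file>
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("states: " + countStates(dump));
        System.out.println("threads mentioning LockGate: "
                + countThreadsMentioning(dump, "LockGate"));
    }
}
```

Running it over dumps from both environments would show whether the share of threads parked in `LockGate` really differs between `dev` and `prod`.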

Another thing that makes threads sleep for a long time is the concurrent
modification failures raised by PostgreSQL because of the `SERIALIZABLE`
isolation level. After a failure, the thread is put to sleep for a
random time of up to `60000` millis. This is by design, but is there a way
to reduce the number of these failures?
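
For reference, the retry pattern described above looks roughly like the sketch below. This is not ManifoldCF's actual code: `SerializationRetry`, `SqlWork`, and `runWithRetry` are hypothetical names, and the only fact relied on is that PostgreSQL reports serialization failures with SQLSTATE `40001`.

```java
import java.sql.SQLException;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of "retry after a random sleep" on serialization failure.
final class SerializationRetry {
    // SQLSTATE that PostgreSQL uses for serialization_failure.
    static final String SERIALIZATION_FAILURE = "40001";

    // A unit of transactional work that may fail with an SQLException.
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    // Runs the work, retrying only on serialization failures, and sleeping
    // a random time in [0, maxSleepMillis] between attempts.
    static <T> T runWithRetry(SqlWork<T> work, int maxAttempts, long maxSleepMillis)
            throws SQLException, InterruptedException {
        for (int attempt = 1; ; attempt++) {
            try {
                return work.run();
            } catch (SQLException e) {
                boolean retryable = SERIALIZATION_FAILURE.equals(e.getSQLState());
                if (!retryable || attempt >= maxAttempts) {
                    throw e; // give up: different error, or out of attempts
                }
                Thread.sleep(ThreadLocalRandom.current().nextLong(maxSleepMillis + 1));
            }
        }
    }
}
```

With `maxSleepMillis` at `60000`, each conflict costs a thread about 30 seconds on average, so even a modest conflict rate can keep most workers in `TIMED_WAITING`.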

I will be grateful for any hints or ideas.

Thank you!

With respect,
Abeleshev Artem
