Hi Daniel,

Yep, it's clearly a memory size issue. We have more than enough memory available, so you can allocate, say, 3GB to the OFBiz container to be safe (or more for the whole stack, Postgres and all).
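For instance, only the two limit keys in docker-compose.yml would need to change (a sketch, reusing the mem_limit/memswap_limit keys already present there):

   ofbiz:
     image: ghcr.io/apache/ofbiz:trunk-plugins-snapshot
     mem_limit: 3G
     memswap_limit: 3G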

Sorry, I have no idea why we get a proxy_ajp:error.

I noticed something. The OFBiz demos are supposed to run under the ofbizDemo 
user.
This was a constraint specified long ago by Infra:

jleroux@ofbiz-vm1:/home/ofbizDemo$ ls
3669011.txt       RUN-DEMOS-AS-ofbizDemo-USER-ONLY-NOT-ROOT  all-manual.sh  branch22.01          cronlog-svn-update.log  nohup.out  stable-manual.sh  trunk.sh
3669011.txt.save  all-manual-nicely.sh                       branch18.12    check-svn-update.sh  next-manual.sh          port.txt   trunk

It's security related. Can you do the same with Docker, or is that constraint unnecessary for Docker because Infra has agreed that the ofbizdocker user has the same security profile?
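If Docker allows it, maybe something like the compose user key would satisfy that constraint (a sketch only; the UID:GID pair below is a placeholder for the ofbizdocker account, and I haven't checked that the image tolerates it):

   ofbiz:
     user: "1000:1000"   # placeholder UID:GID for the ofbizdocker user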

About your question: <<Perhaps other parts of the Java VM are using over 600MB to manage themselves?>>
Are you sure it's only Java related? E.g., does Postgres run as a separate process, or is it included in the OFBiz Docker component?
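If in doubt, something like this should show each container's own memory use while the demo runs (a sketch):

docker stats --no-stream demo-trunk-db-1 demo-trunk-ofbiz-1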

HTH

Jacques

On 29/03/2023 at 21:43, Daniel Watford wrote:
Hi Jacques,

The TL;DR is that the ofbiz process ran out of memory and was killed by the
kernel.

The original limit of 1600MB was being hit, causing an immediate rather than a graceful kill. The limit has since been raised to 2400MB.

The following shows the steps taken to investigate why OFBiz had stopped, once I had ssh'd to ofbiz-vm1, become root, and changed to the /home/ofbizdocker/demo-trunk directory:

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# date
Wed Mar 29 18:44:16 UTC 2023
root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# docker compose ps -a
NAME                 IMAGE                                         COMMAND                  SERVICE   CREATED        STATUS                     PORTS
demo-trunk-db-1      postgres:13                                   "docker-entrypoint.s…"   db        16 hours ago   Up 16 hours                5432/tcp
demo-trunk-ofbiz-1   ghcr.io/apache/ofbiz:trunk-plugins-snapshot   "/ofbiz/docker-entry…"   ofbiz     16 hours ago   Exited (137) 7 hours ago

The demo-trunk-ofbiz-1 container exited 7 hours ago with code 137 (128 + signal 9, i.e. SIGKILL), but there is not much information to go on here.

Reviewing the logs with `docker compose logs` does not reveal anything
about why the demo-trunk-ofbiz-1 container shut down.
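For next time, the search window can at least be narrowed (a sketch):

docker compose logs --timestamps --tail 200 ofbiz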


Next we'll try inspecting the stopped container:

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# docker inspect demo-trunk-ofbiz-1
[{
     "Id": "8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092",
     "Created": "2023-03-29T02:35:03.522931049Z",
     "Path": "/ofbiz/docker-entrypoint.sh",
     "Args": [
       "bin/ofbiz"
     ],
     "State": {
       "Status": "exited",
       "Running": false,
       "Paused": false,
       "Restarting": false,
       "OOMKilled": true,
       "Dead": false,
       "Pid": 0,
       "ExitCode": 137,
       "Error": "",
       "StartedAt": "2023-03-29T02:35:04.063103982Z",
   "FinishedAt": "2023-03-29T11:18:12.827856157Z"

   [... snip ...]

From the above we see the container was OOMKilled (out of memory).

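A quicker way to pull just those state fields, instead of scanning the full JSON, would be (a sketch):

docker inspect --format '{{.State.OOMKilled}} {{.State.ExitCode}} {{.State.FinishedAt}}' demo-trunk-ofbiz-1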
It is likely that the OFBiz deployment attempted to exceed its 1600MB limit
defined in the docker-compose.yml file.

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# cat docker-compose.yml
version: "2.4"

services:
   db:
     image: postgres:13
     mem_limit: 300M
     memswap_limit: 300M
     cpu_shares: 200
     restart: "no"
     volumes:
       - ./postgres-initdb.d:/docker-entrypoint-initdb.d
     env_file:
       - postgres.env
       - ofbiz-postgres.env

   ofbiz:
     image: ghcr.io/apache/ofbiz:trunk-plugins-snapshot
     mem_limit: 1600M
     memswap_limit: 1600M
     cpu_shares: 200
     ports:
       - 127.0.0.1:8009:8009
     volumes:
       - ./after-config-applied.d:/docker-entrypoint-hooks/after-config-applied.d
     env_file:
       - ofbiz-postgres.env
     environment:
       OFBIZ_HOST: demo-trunk.ofbiz.apache.org
       OFBIZ_ENABLE_AJP_PORT: 1
       OFBIZ_DATA_LOAD: demo
       OFBIZ_CONTENT_URL_PREFIX: https://demo-trunk.ofbiz.apache.org

Now that we know ofbiz was killed due to running out of memory at 11:18:12 UTC, we can check the logs again to see what requests were being made around this time.
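One way to filter them is to grep on the timestamp embedded in OFBiz's own log lines (a sketch):

docker compose logs ofbiz | grep '2023-03-29 11:18'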

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk#
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,237 |-0.0.0.0-8009-exec-1
|ControlServlet                |T| [[[ordermgr::ajaxSetUserPreference
(Domain:https://demo-trunk.ofbiz.apache.org)] Request Begun,
encoding=[UTF-8]- total:0.0,since last(Begin):0.0]]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,241 |-0.0.0.0-8009-exec-1
|ServiceDispatcher             |T| Sync service
[order/preferenceGetSetPermission] finished in [1] milliseconds
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,244 |-0.0.0.0-8009-exec-1
|ServiceDispatcher             |T| Sync service [order/setUserPreference]
finished in [3] milliseconds
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,244 |-0.0.0.0-8009-exec-1
|RequestHandler                |I| Ran Event [service:#setUserPreference]
from [request], result is [success]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,244 |-0.0.0.0-8009-exec-1
|ServerHitBin                  |I| Visit delegatorName=default,
ServerHitBin delegatorName=default
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,246 |-0.0.0.0-8009-exec-1
|ControlServlet                |T| [[[ordermgr::ajaxSetUserPreference
(Domain:https://demo-trunk.ofbiz.apache.org)] Request Done-
total:0.009,since last([ordermgr::ajaxSe...):0.009]]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,294 |-0.0.0.0-8009-exec-8
|ControlServlet                |T| [[[ordermgr::createOrderConversation
(Domain:https://demo-trunk.ofbiz.apache.org)] Request Begun,
encoding=[UTF-8]- total:0.0,since last(Begin):0.0]]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,314 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.0s, 0 requests, 0
views in
file:/ofbiz/framework/common/webcommon/WEB-INF/handlers-controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,315 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.006s, 52 requests,
21 views in
file:/ofbiz/framework/common/webcommon/WEB-INF/common-controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,320 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.0s, 4 requests, 0
views in file:/ofbiz/applications/commonext/webapp/WEB-INF/controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,376 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.004s, 347 requests,
134 views in
file:/ofbiz/applications/content/webapp/content/WEB-INF/controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,380 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.077s, 380 requests,
153 views in
file:/ofbiz/applications/order/webapp/ordermgr/WEB-INF/controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,384 |-0.0.0.0-8009-exec-8
|ServiceDispatcher             |E| Incoming context (in runSync :
createOrderConversation) does not match expected requirements
demo-trunk-ofbiz-1  | org.apache.ofbiz.service.ServiceValidationException:
The following required parameter is missing: communicationEventPrpTypId
demo-trunk-ofbiz-1  |   at
org.apache.ofbiz.service.ModelService.resolveRequiredValues(ModelService.java:1270)
~[ofbiz.jar:?]
demo-trunk-ofbiz-1  |   at
org.apache.ofbiz.service.ModelService.validate(ModelService.java:1150)
~[ofbiz.jar:?]
.
.
.
demo-trunk-ofbiz-1  |   at
org.apache.coyote.ajp.AjpProcessor.service(AjpProcessor.java:433)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:926)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1791)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
~[tomcat-util-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
~[tomcat-util-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
~[tomcat-util-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at java.lang.Thread.run(Thread.java:833) ~[?:?]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|ServiceDispatcher             |T| [[Sync service failed...-
total:0.0,since last(Begin):0.0]] - 'order / createOrderConversation'
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|TransactionUtil               |I| Transaction rolled back
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|TransactionUtil               |W| Not committing transaction, status is No
Transaction (6)
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|RequestHandler                |I| Ran Event
[service:#createOrderConversation] from [request], result is [error]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|RequestHandler                |E| Request createOrderConversation caused
an error with the following message: [The following required parameter is
missing: communicationEventPrpTypId]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,386 |-0.0.0.0-8009-exec-8
|RequestHandler                |I| Sending redirect to: [
https://demo-trunk.ofbiz.apache.org:443/ordermgr/control/orderview?orderId=WSCO10012].
Hidden sessionId by default.
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,395 |-0.0.0.0-8009-exec-8
|ServerHitBin                  |I| Visit delegatorName=default,
ServerHitBin delegatorName=default
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,397 |-0.0.0.0-8009-exec-8
|ControlServlet                |T| [[[ordermgr::createOrderConversation
(Domain:https://demo-trunk.ofbiz.apache.org)] Request Done-
total:0.103,since last([ordermgr::create...):0.103]]

So it looks like a request was made to createOrderConversation, which failed and issued a redirect to https://demo-trunk.ofbiz.apache.org:443/ordermgr/control/orderview?orderId=WSCO10012. After that, the logs show nothing else.



Next we can take a look at /var/log/kern.log around 11:18:12 UTC to see if
we can find out why the ofbiz service was killed.
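Besides paging through with less as below, the OOM-killer lines can be pulled out directly (a sketch):

grep -E -i 'oom|killed process' /var/log/kern.log | tail -n 20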

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# cd /var/log
root@ofbiz-vm1:/var/log# less kern.log
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039472] C2 CompilerThre invoked
oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039481] CPU: 0 PID: 3444767
Comm: C2 CompilerThre Not tainted 5.15.0-1028-aws #32~20.04.1-Ubuntu
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039484] Hardware name: Xen HVM
domU, BIOS 4.11.amazon 08/24/2006
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039486] Call Trace:
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039489]  <TASK>
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039492]  dump_stack_lvl+0x4a/0x63
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039498]  dump_stack+0x10/0x16
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039501]  dump_header+0x53/0x225
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039505]
oom_kill_process.cold+0xb/0x10
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039509]
out_of_memory+0x1dc/0x530
.
.
.
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039576] memory: usage 1638400kB,
limit 1638400kB, failcnt 0
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039578] memory+swap: usage
1638400kB, limit 1638400kB, failcnt 1441
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039579] kmem: usage 8120kB,
limit 9007199254740988kB, failcnt 0
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039581] Memory cgroup stats for
/docker/8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092:
.
.
.
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039598] Tasks state (memory
values in pages):
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039599] [  pid  ]   uid  tgid
total_vm      rss pgtables_bytes swapents oom_score_adj name
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039604] [3442255]  1000 3442255
1744968   417268  4272128        0             0 java
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039607]
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092,mems_allowed=0,oom_memcg=/docker/8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092,task_memcg=/docker/8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092,task=java,pid=3442255,uid=1000
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039805] Memory cgroup out of
memory: Killed process 3442255 (java) total-vm:6979872kB,
anon-rss:1624552kB, file-rss:44520kB, shmem-rss:0kB, UID:1000
pgtables:4172kB oom_score_adj:0


From the above logs we have the line:
"Memory cgroup out of memory: Killed process 3442255 (java) total-vm:6979872kB, anon-rss:1624552kB, file-rss:44520kB, shmem-rss:0kB, UID:1000 pgtables:4172kB oom_score_adj:0"

This tells us that process 3442255, a java process, was killed.

Further up in the log we have the line:
"memory: usage 1638400kB, limit 1638400kB, failcnt 0"
Basically, our ofbiz java process was using 1600MB and hitting its memory limit. This is surprising, though, as our ofbiz processes launch with the parameters -Xms128M -Xmx1024M.
These parameters should limit heap usage to 1024MB. Perhaps other parts of
the Java VM are using over 600MB to manage themselves?
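One way to find out would be the JVM's Native Memory Tracking, assuming the image ships jcmd and its JVM picks up JAVA_TOOL_OPTIONS (a sketch; <java-pid> is a placeholder for the container's java process id):

# enable NMT at startup (adds a small overhead):
JAVA_TOOL_OPTIONS='-XX:NativeMemoryTracking=summary'
# then, inside the running container, summarise heap and non-heap usage:
jcmd <java-pid> VM.native_memory summary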

As a workaround I will increase the memory limit assigned to the docker ofbiz container, but it would be good to be able to predict OFBiz's expected memory usage.
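An alternative might be to let the heap scale with the container limit instead of a fixed -Xmx, via MaxRAMPercentage on a container-aware JDK (a sketch; the 60.0 figure is an arbitrary placeholder, and this assumes the JVM reads JAVA_TOOL_OPTIONS):

     environment:
       JAVA_TOOL_OPTIONS: "-XX:MaxRAMPercentage=60.0"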



Once OFBiz was up and running again, there were still problems connecting
to https://demo-trunk.ofbiz.apache.org/partymgr.

[Wed Mar 29 19:27:01.264143 2023] [proxy_ajp:error] [pid 3849696:tid
139767516272384] (120006)APR does not understand this error code: [client
185.191.171.12:2680] AH00878: read response failed from 127.0.0.1:8009
(localhost)

It seems that Apache HTTPd did not like the responses it was getting from
OFBiz Tomcat on port 8009.
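In hindsight, one step might have been to turn up httpd's logging for just the proxy modules, using Apache 2.4's per-module LogLevel syntax (a sketch):

LogLevel warn proxy:debug proxy_ajp:debug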

I didn't know how to investigate this issue any further at the time, so I ended up removing and recreating both the database and ofbiz containers. Once this was done, demo-trunk appeared to be working normally.
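For reference, the recreation amounted to roughly the following; note that with no named volumes in this compose file, down also discards the database contents (a sketch):

docker compose down    # stop and remove the db and ofbiz containers
docker compose up -d   # recreate both from docker-compose.yml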


Any suggestions on how to investigate the issue with the AJP port if
similar problems occur will be gratefully received.



Thanks,

Dan.
