Hi Jacques,

The TL;DR is that the ofbiz process ran out of memory and was killed by the
kernel.

The original limit of 1600MB was being hit, causing an immediate rather
than graceful kill. This limit has been lifted to 2400MB.

The following shows the steps taken to look into why OFBiz had stopped once
I had ssh'd to ofbiz-vm1, became root and changed to the
/home/ofbizdocker/demo-trunk directory

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# date
Wed Mar 29 18:44:16 UTC 2023
root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# docker compose ps -a
NAME                 IMAGE                                         COMMAND
                SERVICE             CREATED             STATUS
       PORTS
demo-trunk-db-1      postgres:13
 "docker-entrypoint.s…"   db                  16 hours ago        Up 16
hours                5432/tcp
demo-trunk-ofbiz-1   ghcr.io/apache/ofbiz:trunk-plugins-snapshot
 "/ofbiz/docker-entry…"   ofbiz               16 hours ago        Exited
(137) 7 hours ago

The demo-trunk-ofbiz-1 container stopped 7 hours ago, but not much
information to go on here.

Reviewing the logs with `docker compose logs` does not reveal anything
about why the demo-trunk-ofbiz-1 container shut down.


Next we'll try inspecting the stopped container:

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# docker inspect
demo-trunk-ofbiz-1
[{
    "Id":
"8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092",
    "Created": "2023-03-29T02:35:03.522931049Z",
    "Path": "/ofbiz/docker-entrypoint.sh",
    "Args": [
      "bin/ofbiz"
    ],
    "State": {
      "Status": "exited",
      "Running": false,
      "Paused": false,
      "Restarting": false,
      "OOMKilled": true,
      "Dead": false,
      "Pid": 0,
      "ExitCode": 137,
      "Error": "",
      "StartedAt": "2023-03-29T02:35:04.063103982Z",
  "FinishedAt": "2023-03-29T11:18:12.827856157Z"

  [... snip ...]

>From the above we see the container was OOMKilled (out of memory)

It is likely that the OFBiz deployment attempted to exceed its 1600MB limit
defined in the docker-compose.yml file.

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# cat docker-compose.yml
version: "2.4"

services:
  db:
    image: postgres:13
    mem_limit: 300M
    memswap_limit: 300M
    cpu_shares: 200
    restart: "no"
    volumes:
      - ./postgres-initdb.d:/docker-entrypoint-initdb.d
    env_file:
      - postgres.env
      - ofbiz-postgres.env

  ofbiz:
    image: ghcr.io/apache/ofbiz:trunk-plugins-snapshot
    mem_limit: 1600M
    memswap_limit: 1600M
    cpu_shares: 200
    ports:
      - 127.0.0.1:8009:8009
    volumes:
      -
./after-config-applied.d:/docker-entrypoint-hooks/after-config-applied.d
    env_file:
      - ofbiz-postgres.env
    environment:
      OFBIZ_HOST: demo-trunk.ofbiz.apache.org
      OFBIZ_ENABLE_AJP_PORT: 1
      OFBIZ_DATA_LOAD: demo
      OFBIZ_CONTENT_URL_PREFIX: https://demo-trunk.ofbiz.apache.org

Now that we know ofbiz will killed due to running out of memory at 11:18:12
UTC, we can check the logs again to see what requests were being made
around this time.

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# docker compose logs | tail -70
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,237 |-0.0.0.0-8009-exec-1
|ControlServlet                |T| [[[ordermgr::ajaxSetUserPreference
(Domain:https://demo-trunk.ofbiz.apache.org)] Request Begun,
encoding=[UTF-8]- total:0.0,since last(Begin):0.0]]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,241 |-0.0.0.0-8009-exec-1
|ServiceDispatcher             |T| Sync service
[order/preferenceGetSetPermission] finished in [1] milliseconds
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,244 |-0.0.0.0-8009-exec-1
|ServiceDispatcher             |T| Sync service [order/setUserPreference]
finished in [3] milliseconds
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,244 |-0.0.0.0-8009-exec-1
|RequestHandler                |I| Ran Event [service:#setUserPreference]
from [request], result is [success]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,244 |-0.0.0.0-8009-exec-1
|ServerHitBin                  |I| Visit delegatorName=default,
ServerHitBin delegatorName=default
demo-trunk-ofbiz-1  | 2023-03-29 11:18:05,246 |-0.0.0.0-8009-exec-1
|ControlServlet                |T| [[[ordermgr::ajaxSetUserPreference
(Domain:https://demo-trunk.ofbiz.apache.org)] Request Done-
total:0.009,since last([ordermgr::ajaxSe...):0.009]]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,294 |-0.0.0.0-8009-exec-8
|ControlServlet                |T| [[[ordermgr::createOrderConversation
(Domain:https://demo-trunk.ofbiz.apache.org)] Request Begun,
encoding=[UTF-8]- total:0.0,since last(Begin):0.0]]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,314 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.0s, 0 requests, 0
views in
file:/ofbiz/framework/common/webcommon/WEB-INF/handlers-controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,315 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.006s, 52 requests,
21 views in
file:/ofbiz/framework/common/webcommon/WEB-INF/common-controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,320 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.0s, 4 requests, 0
views in file:/ofbiz/applications/commonext/webapp/WEB-INF/controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,376 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.004s, 347 requests,
134 views in
file:/ofbiz/applications/content/webapp/content/WEB-INF/controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,380 |-0.0.0.0-8009-exec-8
|ConfigXMLReader               |I| controller loaded: 0.077s, 380 requests,
153 views in
file:/ofbiz/applications/order/webapp/ordermgr/WEB-INF/controller.xml
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,384 |-0.0.0.0-8009-exec-8
|ServiceDispatcher             |E| Incoming context (in runSync :
createOrderConversation) does not match expected requirements
demo-trunk-ofbiz-1  | org.apache.ofbiz.service.ServiceValidationException:
The following required parameter is missing: communicationEventPrpTypId
demo-trunk-ofbiz-1  |   at
org.apache.ofbiz.service.ModelService.resolveRequiredValues(ModelService.java:1270)
~[ofbiz.jar:?]
demo-trunk-ofbiz-1  |   at
org.apache.ofbiz.service.ModelService.validate(ModelService.java:1150)
~[ofbiz.jar:?]
.
.
.
demo-trunk-ofbiz-1  |   at
org.apache.coyote.ajp.AjpProcessor.service(AjpProcessor.java:433)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:63)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:926)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1791)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
~[tomcat-coyote-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1191)
~[tomcat-util-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
~[tomcat-util-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at
org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
~[tomcat-util-9.0.72.jar:9.0.72]
demo-trunk-ofbiz-1  |   at java.lang.Thread.run(Thread.java:833) ~[?:?]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|ServiceDispatcher             |T| [[Sync service failed...-
total:0.0,since last(Begin):0.0]] - 'order / createOrderConversation'
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|TransactionUtil               |I| Transaction rolled back
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|TransactionUtil               |W| Not committing transaction, status is No
Transaction (6)
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|RequestHandler                |I| Ran Event
[service:#createOrderConversation] from [request], result is [error]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,385 |-0.0.0.0-8009-exec-8
|RequestHandler                |E| Request createOrderConversation caused
an error with the following message: [The following required parameter is
missing: communicationEventPrpTypId]
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,386 |-0.0.0.0-8009-exec-8
|RequestHandler                |I| Sending redirect to: [
https://demo-trunk.ofbiz.apache.org:443/ordermgr/control/orderview?orderId=WSCO10012].
Hidden sessionId by default.
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,395 |-0.0.0.0-8009-exec-8
|ServerHitBin                  |I| Visit delegatorName=default,
ServerHitBin delegatorName=default
demo-trunk-ofbiz-1  | 2023-03-29 11:18:12,397 |-0.0.0.0-8009-exec-8
|ControlServlet                |T| [[[ordermgr::createOrderConversation
(Domain:https://demo-trunk.ofbiz.apache.org)] Request Done-
total:0.103,since last([ordermgr::create...):0.103]]

So it looks like a request was made to createOrderConversation which failed
and issued a redirect to
https://demo-trunk.ofbiz.apache.org:443/ordermgr/control/orderview?orderId=WSCO10012.
And then the logs show nothing else.



Next we can take a look at /var/log/kern.log around 11:18:12 UTC to see if
we can find out why the ofbiz service was killed.

root@ofbiz-vm1:/home/ofbizdocker/demo-trunk# cd /var/log
root@ofbiz-vm1:/var/log# less kern.log
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039472] C2 CompilerThre invoked
oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039481] CPU: 0 PID: 3444767
Comm: C2 CompilerThre Not tainted 5.15.0-1028-aws #32~20.04.1-Ubuntu
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039484] Hardware name: Xen HVM
domU, BIOS 4.11.amazon 08/24/2006
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039486] Call Trace:
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039489]  <TASK>
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039492]  dump_stack_lvl+0x4a/0x63
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039498]  dump_stack+0x10/0x16
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039501]  dump_header+0x53/0x225
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039505]
oom_kill_process.cold+0xb/0x10
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039509]
out_of_memory+0x1dc/0x530
.
.
.
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039576] memory: usage 1638400kB,
limit 1638400kB, failcnt 0
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039578] memory+swap: usage
1638400kB, limit 1638400kB, failcnt 1441
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039579] kmem: usage 8120kB,
limit 9007199254740988kB, failcnt 0
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039581] Memory cgroup stats for
/docker/8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092:
.
.
.
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039598] Tasks state (memory
values in pages):
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039599] [  pid  ]   uid  tgid
total_vm      rss pgtables_bytes swapents oom_score_adj name
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039604] [3442255]  1000 3442255
1744968   417268  4272128        0             0 java
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039607]
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092,mems_allowed=0,oom_memcg=/docker/8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092,task_memcg=/docker/8afc7a9b0a1f34c7a72f4b24f4bb9f2679233821f1a9697302719ac0a144b092,task=java,pid=3442255,uid=1000
Mar 29 11:18:12 ofbiz-vm1 kernel: [3983121.039805] Memory cgroup out of
memory: Killed process 3442255 (java) total-vm:6979872kB,
anon-rss:1624552kB, file-rss:44520kB, shmem-rss:0kB, UID:1000
pgtables:4172kB oom_score_adj:0


>From the above logs we have line,
"Memory cgroup out of memory: Killed process 3442255 (java)
total-vm:6979872kB, anon-rss:1624552kB, file-rss:44520kB, shmem-rss:0kB,
UID:1000 pgtables:4172kB oom_score_adj:0"

This tells us that process 3442255, a java process was killed.

Further up in the log we have line:
"memory: usage 1638400kB, limit 1638400kB, failcnt 0"
Basically our ofbiz java process was using 1600MB, hitting its memory
limit. This is surprising though as our ofbiz processes launch with
parameters: -Xms128M -Xmx1024M
These parameters should limit heap usage to 1024MB. Perhaps other parts of
the Java VM are using over 600MB to manage themselves?

As a work around I will increase the limit assigned to the docker ofbiz
container, but it would be good to be able to predict expected memory usage
by OFBiz.



Once OFBiz was up and running again, there were still problems connecting
to https://demo-trunk.ofbiz.apache.org/partymgr.

[Wed Mar 29 19:27:01.264143 2023] [proxy_ajp:error] [pid 3849696:tid
139767516272384] (120006)APR does not understand this error code: [client
185.191.171.12:2680] AH00878: read response failed from 127.0.0.1:8009
(localhost)

It seems that Apache HTTPd did not like the responses it was getting from
OFBiz Tomcat on port 8009.

I didn't know how to investigate this issue any further, so ended up
removing and recreating both the database and ofbiz containers. Once this
was done demo-trunk appears to be working normally.


Any suggestions on how to investigate the issue with the AJP port if
similar problems occur will be gratefully received.



Thanks,

Dan.







On Wed, 29 Mar 2023 at 19:21, Jacques Le Roux <jacques.le.r...@les7arts.com>
wrote:

> Hi Daniel,
>
> The trunk demo is down since early afternoon, normal ?
>
> Received connection from - /127.0.0.1 : 43768
> 2023-03-28 15:41:39,866 |OFBiz-AdminServer |ContainerLoader
> |I| Shutting down containers
> 2023-03-28 15:41:39,867 |OFBiz-AdminServer |ContainerLoader
> |I| Stopping container catalina-container
> [...]
> Mar 28, 2023 3:41:40 PM org.apache.coyote.AbstractProtocol stop
> INFO: Stopping ProtocolHandler ["ajp-nio-127.0.0.1-8009"]
> Mar 28, 2023 3:41:40 PM org.apache.coyote.AbstractProtocol stop
> INFO: Stopping ProtocolHandler ["http-nio-8080"]
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |ContainerLoader
> |I| Stopped container catalina-container
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |ContainerLoader
> |I| Stopping container service-container
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |JobPoller
> |I| Shutting down JobPoller.
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |JobPoller
> |I| JobPoller shutdown completed.
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |ServiceContainer
> |I| Removing from cache dispatcher: default
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |ServiceDispatcher
> |I| De-Registering dispatcher: default
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |ServiceContainer
> |I| Removing from cache dispatcher: rest-api
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |ServiceDispatcher
> |I| De-Registering dispatcher: rest-api
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |ServiceContainer
> |I| Removing from cache dispatcher: entity-default
> 2023-03-28 15:41:40,138 |OFBiz-AdminServer |ServiceDispatcher
> |I| De-Registering dispatcher: entity-default
> 2023-03-28 15:41:40,139 |OFBiz-AdminServer |ServiceDispatcher
> |I| Shutting down the service engine...
> 2023-03-28 15:41:40,139 |OFBiz-JobPoller |JobPoller
> |I| JobPoller thread stopped.
> 2023-03-28 15:41:40,139 |OFBiz-AdminServer |ContainerLoader
> |I| Stopped container service-container
> 2023-03-28 15:41:40,139 |OFBiz-AdminServer |ContainerLoader
> |I| Stopping container delegator-container
> 2023-03-28 15:41:40,139 |OFBiz-AdminServer |ContainerLoader
> |I| Stopped container delegator-container
> 2023-03-28 15:41:40,139 |OFBiz-AdminServer |ContainerLoader
> |I| Stopping container admin-container
> 2023-03-28 15:41:40,139 |OFBiz-AdminServer |ContainerLoader
> |I| Stopped container admin-container
> 2023-03-28 15:41:40,139 |OFBiz-AdminServer |ContainerLoader
> |I| Stopping container component-container
> 2023-03-28 15:41:40,139 |OFBiz-AdminServer |ContainerLoader
> |I| Stopped container component-container
>
> Sorry, I have no time yet to check further.
>
> Jacques
>
> Le 28/03/2023 à 18:09, Daniel Watford a écrit :
> > Hello,
> >
> > Per https://issues.apache.org/jira/browse/OFBIZ-12786, the demo-trunk
> site
> > is now hosted by a docker container running on VM ofbiz-vm1.apache.org.
> >
> > The changes to the host are captured in PR,
> > https://github.com/apache/ofbiz-tools/pull/5.
> >
> > The previous approach to demo-trunk deployment patched the solr plugin to
> > set the solr URL to https://demo-trunk.ofbiz.apache.org/solr. You can
> view
> > the patch here -
> >
> https://github.com/apache/ofbiz-tools/blob/master/demo-backup/patch/trunk/solr.config.patch
> > .
> >
> > I believe I have configured the new demo-trunk container to produce the
> > same solr url, but by using configuration rather than source code
> changes.
> >
> > I don't know much about the use of Solr in OFBiz so don't know how to
> check
> > that the original demo-trunk behaviour has been maintained.
> >
> > Please could someone familiar with Solr in OFBiz take a look
> > https://demo-trunk.ofbiz.apache.org/partymgr/control/main and see if
> Solr
> > is operating as expected.
> >
> > Thanks,
> >
> > Dan.
> >
>


-- 
Daniel Watford

Reply via email to