Hello, I noticed today that there have been no writes to log files on the
demo-stable instance since the 11th August.

The following is some notes on what went wrong and how log file permissions
are implemented in our docker containers. Hopefully these will be useful if
someone is investigating log issues with the demo sites in future.

Notes:

Looking at the git log of the release18.12 branch, I couldn't see any
commits related to logging on or prior to the 11th August.

I opened a terminal to the ofbiz VM, changed to the
/home/ofbizdocker/demo-stable directory and then ran the following command
to output the console log, piping to the less command:
```
docker compose logs | less
```

Logging is configured in the framework/base/config/log4j.xml file. In this
file I can see that all the log appenders refer to files with the path
runtime/logs. These paths are relative to the current working directory,
which is /ofbiz inside the demo containers.

I searched for 'runtime/logs' in the console log and found errors similar
to the following:

demo-stable-ofbiz-1  | 2024-10-04 02:35:08,031 main ERROR
RollingFileManager (runtime/logs/ofbiz.log) java.io.FileNotFoundException:
runtime/logs/ofbiz.log (Permission denied) java.io.FileNotFoundException:
runtime/logs/ofbiz.log (Permission denied)

I knew that the /ofbiz/runtime/logs/ofbiz.log file existed. From the host
system I could check the permissions of the file with

```
ls -l /home/ofbizdocker/demo-stable/logs/ofbiz.log
```

-rw-r--r-- 1 166536 166536 8970237 Aug 11 02:35
/home/ofbizdocker/demo-stable/logs/ofbiz.log

The owning user is uid 166536 and the owning group is gid 166536. These
values might seem a bit strange, but they relate to a feature we use in
docker called User Namespaces[1] where a uid or gid of a process running in
a container is mapped to a different and non-privileged uid or gid on the
host. That way, if a process somehow 'breaks out' of its container, it will
not have any useful privileges on the host.

We configure user namespaces such that uids and gids within running
containers are mapped to values 165536 higher on the host system.

Therefore if the ofbiz.log file is owned by uid 166536, this maps to user
1000 inside the demo-stable docker container.

To quickly see which uid the various java processes on the host are running
as, we can use:

```
root@ofbiz-vm1:/home/ofbizdocker/demo-stable# ps auxf | grep java | cut
-c1-80
166537   1829545  3.3  3.5 6349340 1176440 ?     Ssl  02:35  25:59  \_
/opt/java
166536   1830554  4.2  5.9 7056352 1964308 ?     Ssl  02:35  33:09  \_
/opt/java
166536   1831670  1.9  6.0 7052264 1974948 ?     Ssl  02:36  15:02  \_
/opt/java
166536   1832608  0.4  1.9 6868992 654896 ?      Ssl  02:36   3:37  \_
/opt/java
166536   1833105  0.4  1.5 6878208 524976 ?      Ssl  02:36   3:15  \_
/opt/java
166536   1833609  0.4  2.2 6872852 739020 ?      Ssl  02:36   3:33  \_
/opt/java
```
The first column is the uid for each process. Notice that one of the
processes has a different uid 166537 to the rest.

Next, we run a bash process inside the demo-stable ofbiz container and
check the uid of the current user:

```
root@ofbiz-vm1:/home/ofbizdocker/demo-stable# docker compose exec ofbiz
/bin/bash
ofbiz@47c06d728af8:/ofbiz$ id
uid=1001(ofbiz) gid=1001(ofbiz) groups=1001(ofbiz)
```

Notice that the uid and gid is 1001, rather than the expected 1000. It
seems that the uid of the ofbiz user changed from 1000 to 1001 in our
demo-stable container images around August 11th.

To fix the permissions issue, change the owning uid and gid of the
demo-stable's log directory on the host to match the user running in the
container, and then restart the container:

```
root@ofbiz-vm1:/home/ofbizdocker/demo-stable# chown -R 166537:166537 logs/
root@ofbiz-vm1:/home/ofbizdocker/demo-stable# docker compose restart
```

The demo-trunk and demo-next containers do not appear to have the uid/gid
mismatch issue since they are built from a different base container image
that the demo-stable container.

Although the above changes to permissions in the demo-stable log directory
solved our issue, it perhaps shows a weakness in how the ofbiz user is
added to our container images. Perhaps the container image build process
needed to hard-code a uid/gid for the ofbiz user during container build to
try and ensure consistency.


[1] - https://docs.docker.com/engine/security/userns-remap


-- 
Daniel Watford

Reply via email to