Hi,

I am running the Storm Supervisor on Kubernetes, in an image that I've created, 
with the following securityContext:

      securityContext:
        runAsUser: 1000620005
        fsGroup: 1000620005
        supplementalGroups: [ 64000 ]

The UID 1000620005 does not correspond to any user listed in the /etc/passwd 
file in the Docker image.

When I kill a topology, this generates the following exception:

2024-05-10 06:26:10.661 [SLOT_6700] itemId= { jobName="" ,jobTemplateId="" ,userOrAppId="" ,tenantId="",  jobStep="", scaleCopyJobId=""} ERROR apache.storm.daemon.supervisor.Slot - Error when processing event
java.lang.NullPointerException: null
      at org.apache.storm.utils.ServerUtils.getUserId(ServerUtils.java:1095) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.utils.ServerUtils.isAnyPosixProcessPidDirAlive(ServerUtils.java:1284) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.utils.ServerUtils.isAnyPosixProcessPidDirAlive(ServerUtils.java:1216) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.utils.ServerUtils.areAllProcessesDead(ServerUtils.java:1178) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.container.DefaultResourceIsolationManager.areAllProcessesDead(DefaultResourceIsolationManager.java:146) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.daemon.supervisor.Container.areAllProcessesDead(Container.java:248) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.daemon.supervisor.Slot.killContainerFor(Slot.java:237) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.daemon.supervisor.Slot.handleRunning(Slot.java:792) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.daemon.supervisor.Slot.stateMachineStep(Slot.java:184) ~[storm-server-2.6.1.jar:2.6.1]
      at org.apache.storm.daemon.supervisor.Slot.run(Slot.java:1051) [storm-server-2.6.1.jar:2.6.1]

which in turn means that the supervisor process dies, and the pod is restarted.

In looking at the Storm source code I think that the issue is in 
storm-server/src/main/java/org/apache/storm/utils/ServerUtils.java where it has 
the following code:

        if (user != null && !user.isEmpty()) {
            cmdArgs.add(user);
        }

which results in the following command being executed:

id -u ?

because, with the securityContext specified above, no named user is associated 
with UID 1000620005, so no username is available.

I can see the following in worker.yaml for the topology:

bash-4.2$ cat worker.yaml
worker-id: 145eac49-838f-4796-bd77-c3c99e202e32
logs.users: []
logs.groups: []
topology.submitter.user: '?'

The id -u ? command outputs:

bash-4.2$ id -u ?
id: ?: no such user

This then causes the NullPointerException, since the output can't be parsed as 
a user ID.
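
To illustrate the failure mode in isolation, here is a minimal sketch of 
parsing `id -u` output defensively. This is not the Storm code: the class 
IdOutputSketch and the method parseUid are hypothetical names of my own, and I 
am only assuming that the real method reads the command output in a similar 
way. Note that `id` writes its error message to stderr, so the captured stdout 
may simply be empty.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of storm-server.
public class IdOutputSketch {

    // Parses the first line of `id -u` output as a numeric UID.
    // Returns an empty OptionalInt for null, blank, or non-numeric output
    // (for example "id: ?: no such user") instead of throwing.
    static OptionalInt parseUid(String output) {
        if (output == null) {
            return OptionalInt.empty();
        }
        String firstLine = output.trim().split("\\R", 2)[0].trim();
        try {
            return OptionalInt.of(Integer.parseInt(firstLine));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseUid("1000620005\n").isPresent());          // true
        System.out.println(parseUid("id: ?: no such user").isPresent());   // false
        System.out.println(parseUid(null).isPresent());                    // false
    }
}
```

With something like this, a caller could treat an empty result as "UID unknown" 
rather than letting an exception propagate up to the Slot state machine.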

I am running with a local patch that detects when the username is '?' and 
does not add it to the command line. This appears to work:

        if (user != null && !user.isEmpty() && !user.equals("?")) {
            cmdArgs.add(user);
        }
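
For anyone who wants to try the guard outside of Storm, here is a standalone 
sketch of the same check. The class name IdCommandSketch, the method 
buildIdCommand, and the constant UNKNOWN_USER are hypothetical; only the 
condition itself comes from my patch, and I am assuming the command is built 
as `id -u [user]`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone version of the guard, not part of storm-server.
public class IdCommandSketch {

    // The value seen for topology.submitter.user in worker.yaml above.
    static final String UNKNOWN_USER = "?";

    // Builds the argument list for `id -u`, omitting the user when it is
    // null, empty, or the "?" placeholder, so that `id` reports the UID of
    // the calling process instead of failing with "no such user".
    static List<String> buildIdCommand(String user) {
        List<String> cmdArgs = new ArrayList<>();
        cmdArgs.add("id");
        cmdArgs.add("-u");
        if (user != null && !user.isEmpty() && !UNKNOWN_USER.equals(user)) {
            cmdArgs.add(user);
        }
        return cmdArgs;
    }

    public static void main(String[] args) {
        System.out.println(buildIdCommand("?"));      // [id, -u]
        System.out.println(buildIdCommand("storm"));  // [id, -u, storm]
    }
}
```

The trade-off is that a literal user named '?' can no longer be looked up, 
which seems acceptable to me.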

Is there a different technique that would work in this scenario, or does it 
require a code change in the storm-server to resolve the issue?

Thanks,

Steve
