Hi,

I was browsing the code of HAProxy today (I don't use it regularly, but I still like to read the code since it might be useful for my own applications), and I saw three things that concerned me:


1. The network namespace support seems to be a bit broken. The function "my_socketat" (lines 114-129 of src/namespace.c in the latest dev branch) first switches to the desired network namespace, creates the socket, and then switches back to the default namespace. This works in the common case, but it fails in two cases, both involving user namespaces:

* First, HAProxy could be running in a non-initial user namespace with a full set of capabilities (including CAP_SYS_ADMIN) while its network namespace is still owned by the initial user namespace. Such an environment can be simulated with "unshare -r" (omitting -n), or with a container runtime that supports user namespaces but keeps the host's network namespace (e.g. docker run --net=host, if it were supported in userns-remap mode). In this case, the first setns() would succeed, but the setns() back to the original namespace would fail, because HAProxy does not have CAP_SYS_ADMIN in the original network namespace (it is owned by the initial user namespace). To work around this, HAProxy would need to fork a new process, call setns() and create the socket in that process, and then transfer the socket back to the original process using SCM_RIGHTS (you can probably reuse the code in proto_sockpair.c, or in some other file mentioning SCM_RIGHTS, to do that).

Example pseudocode (error checking omitted):

int sockpair[2];
socketpair(AF_UNIX, SOCK_STREAM, 0, sockpair);

pid_t pid = fork();
if (pid == 0) {
    close(sockpair[0]);
    setns(ns_fd, CLONE_NEWNET);
    int sock_fd = socket(domain, type, protocol); /* original values */
    send_fd_uxst(sock_fd, sockpair[1]); /* as in src/proto_sockpair.c */
    _exit(0);
} else {
    close(sockpair[1]);
    int recv_fd = recv_fd_uxst(sockpair[0]);
    close(sockpair[0]);
    return recv_fd;
}

* Second, HAProxy could be running as a non-root user, with at least one "rootless" container that has a separate network namespace existing for that user. It would be nice if HAProxy could create a socket in such a network namespace without root privileges. Judging by the code, that does not seem to be possible at present. The solution is the same as in the first case; the only difference is that the forked process must also enter the associated user namespace first (hint: the NS_GET_USERNS ioctl on the target network namespace returns a file descriptor for that user namespace, which you can pass to setns()), and should set PR_SET_DUMPABLE to 0 before entering the user namespace, for security.
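
To make the extra step for the rootless case concrete, here is a rough sketch of what the forked child could do before creating the socket. The function name enter_userns_then_netns is made up for illustration and is not anything in HAProxy. It assumes Linux 4.9 or later (for NS_GET_USERNS) and a single-threaded caller, since setns() into a user namespace is refused for multithreaded processes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/nsfs.h>   /* NS_GET_USERNS */
#include <sched.h>        /* setns(), CLONE_NEWUSER, CLONE_NEWNET */
#include <sys/ioctl.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Hypothetical helper, meant to run in the forked, single-threaded child.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enter_userns_then_netns(int net_ns_fd)
{
    int user_ns_fd;

    /* Become non-dumpable before joining the other user namespace. */
    if (prctl(PR_SET_DUMPABLE, 0, 0, 0, 0) < 0)
        return -1;

    /* Ask the kernel for the user namespace that owns the network namespace. */
    user_ns_fd = ioctl(net_ns_fd, NS_GET_USERNS);
    if (user_ns_fd < 0)
        return -1;

    /* Join the owning user namespace first; this is what grants the
     * capabilities needed for the second setns(). */
    if (setns(user_ns_fd, CLONE_NEWUSER) < 0) {
        close(user_ns_fd);
        return -1;
    }
    close(user_ns_fd);

    /* Now the network namespace itself can be entered. */
    return setns(net_ns_fd, CLONE_NEWNET);
}
```

In the first case (network namespace owned by an ancestor user namespace) this helper is not applicable; the child would call setns(ns_fd, CLONE_NEWNET) directly, as in the pseudocode.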

These techniques have already been employed in software like "slirp4netns", which creates a TUN/TAP device in a given network namespace, and handles both of the above cases correctly. The only difference is that for HAProxy, we should be creating a socket instead, but the overall technique is still the same.
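
For reference, the fork-and-pass technique can also be written without HAProxy's own helpers. The following is only a sketch under my own assumptions: send_fd, recv_fd and socket_in_netns are hypothetical names, and a real implementation inside HAProxy would have to consider its threading model before calling fork().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Control buffer large enough (and aligned) for one file descriptor. */
union fd_ctrl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send fd_to_send over the UNIX socket chan. Returns 0 or -1. */
int send_fd(int chan, int fd_to_send)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    assert(cmsg != NULL); /* the buffer always has room for one header */
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(chan, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from chan. Returns the new fd, or -1. */
int recv_fd(int chan)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(chan, &msg, 0) != 1)
        return -1; /* error, or the child exited without sending */
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS || cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Create a socket inside the namespace referred to by ns_fd, in a child
 * process, so the caller never has to switch back. Returns the fd or -1. */
int socket_in_netns(int ns_fd, int domain, int type, int protocol)
{
    int sp[2], fd, status;
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sp) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sp[0]);
        close(sp[1]);
        return -1;
    }
    if (pid == 0) {
        int s;
        close(sp[0]);
        if (setns(ns_fd, CLONE_NEWNET) < 0)
            _exit(1);
        s = socket(domain, type, protocol);
        if (s < 0 || send_fd(sp[1], s) < 0)
            _exit(1);
        _exit(0);
    }
    close(sp[1]);
    fd = recv_fd(sp[0]);
    close(sp[0]);
    while (waitpid(pid, &status, 0) < 0 && errno == EINTR)
        ; /* reap the child so it does not linger as a zombie */
    return fd;
}
```

Because the namespace switch happens only in the short-lived child, the parent never needs CAP_SYS_ADMIN in its original network namespace.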

Another complaint about the network namespace support is that it only supports namespaces in /var/run/netns. My own tool (search for "ctrtool ns_open_file" on Google), on the other hand, supports network namespaces created in arbitrary locations (and even allows creating sockets in arbitrary namespaces, accounting for the two user namespace scenarios above). It would be nice if HAProxy supported arbitrary network namespace locations too, to support the rootless container use case.
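
As a rough idea of what accepting an arbitrary location could involve, the sketch below opens any path and checks that it really refers to a network namespace before handing back the descriptor. The name open_netns_path is hypothetical (it is neither HAProxy's nor ctrtool's API), and the check relies on the NS_GET_NSTYPE ioctl, which I believe needs Linux 4.11 or later.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/nsfs.h>   /* NS_GET_NSTYPE */
#include <sched.h>        /* CLONE_NEWNET */
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: open a namespace file at any path and verify that
 * it is a network namespace. Returns an fd usable with setns(), or -1. */
int open_netns_path(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    /* NS_GET_NSTYPE returns the CLONE_NEW* constant of the namespace;
     * it fails on files that are not namespace files at all. */
    if (ioctl(fd, NS_GET_NSTYPE) != CLONE_NEWNET) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The path could then be something like /proc/<pid>/ns/net or a bind mount kept under the user's own runtime directory, rather than only an entry in /var/run/netns.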

2. I found a stack buffer overflow in one of the files. I am not disclosing it here because this email will end up on the public mailing list. Is there a "security" email address I could disclose it to?

3. There is another feature that I feel is really broken, but since it is related to #2 (it is in the same file as the stack buffer overflow), I'm not disclosing it publicly either. (The issue itself has nothing to do with security, but I will only disclose it after #2 has been resolved.)

Peter Jin


