David Schwartz wrote:
6) Epoll removes the file from the set, when the *kernel* object gets
   closed (internal use-count goes to zero)

With that in mind, how can the code snippet above trigger a removal from
the epoll set?

        I don't see how that can be. Suppose I add fd 8 to an epoll set.
Suppose fd 5 is a dup of fd 8. Now, I close fd 8. How can fd 8 remain in my
epoll set, since there no longer is an fd 8? Events on files registered for
epoll notification are reported by descriptor, so the set membership has to
be associated (as reflected into userspace) with the descriptor, not the
file.

Events are not necessarily reported "by descriptor". epoll reports an opaque field provided by the user.

It's up to the user to choose a tag that still makes sense if the application is playing dup()/close() games, for example.

typedef union epoll_data
{
  void *ptr;
  int fd;
  uint32_t u32;
  uint64_t u64;
} epoll_data_t;


It's true that some applications use the 'fd' field of epoll_data_t, but in that case they should not play dup()/close() games that could change the meaning of their 'epoll tags'. They would do better to use 'ptr' or 'u64', for example, to map the event to an application object. In that object they can find the correct handle (fd) for talking to the kernel about a given 'file'. That handle can then be remapped to another one using dup()/fcntl()/close()...
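
To make the 'ptr' suggestion concrete, here is a minimal sketch, not a definitive implementation. The names struct conn, conn_remap() and ptr_tag_survives_dup() are made up for illustration, and a pipe stands in for the socket. It registers the read end with a pointer tag, moves the handle with dup()/close(), and checks that the event still leads back to the right object and a usable descriptor.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical application object; name and layout are illustrative only. */
struct conn {
    int fd;   /* current handle for this connection */
};

/* Move the connection to a fresh descriptor number, as an application
 * playing dup()/close() games might. Returns 0 on success, -1 on error. */
static int conn_remap(struct conn *c)
{
    assert(c != NULL);
    int newfd = dup(c->fd);
    if (newfd < 0)
        return -1;
    close(c->fd);      /* old number is gone; the kernel file object lives on */
    c->fd = newfd;
    return 0;
}

/* Returns 1 if the event came back tagged with our object and the data
 * could be read through the remapped descriptor, 0 otherwise. */
int ptr_tag_survives_dup(void)
{
    int pipefd[2];
    int ok = 0;

    if (pipe(pipefd) < 0)
        return 0;

    int epfd = epoll_create1(0);
    if (epfd < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return 0;
    }

    struct conn c = { .fd = pipefd[0] };
    struct epoll_event ev = { .events = EPOLLIN };
    ev.data.ptr = &c;  /* tag is the object, not the descriptor number */

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, c.fd, &ev) == 0 &&
        conn_remap(&c) == 0 &&
        write(pipefd[1], "x", 1) == 1) {
        struct epoll_event out;
        int n = epoll_wait(epfd, &out, 1, 1000);   /* 1 second timeout */
        if (n == 1 && out.data.ptr == &c) {
            struct conn *hit = out.data.ptr;
            char buf = 0;
            ok = (read(hit->fd, &buf, 1) == 1 && buf == 'x');
        }
    }

    close(c.fd);
    close(pipefd[1]);
    close(epfd);
    return ok;
}
```

Calling ptr_tag_survives_dup() from a test program should return 1 on Linux: the registration stays in the set because the dup keeps the kernel object open, and the tag never depended on the old descriptor number.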



        For example, consider:

        1) Process creates an epoll set, the set gets fd 4.

        2) Process creates a socket, it gets fd 5.

        3) The process adds fd 5 to set 4.

        4) The process forks.

        5) The child inherits the epoll set but not the socket.

        Here the kernel cannot quite do the right thing. Ideally, the parent
would still have fd 5 in its version of the epoll set. After all, it has not
closed fd 5. However, the child *cannot* see fd 5 in its version of the
epoll set since it has no fd 5. An event reported for fd 5 would be
nonsense.

Yes, it would be nonsense for the child to keep trying to get events from the epoll set when it cannot possibly use the socket. If you use the 'ptr' field to retrieve an object, that object would probably have no meaning in the child anyway, especially after an exec() syscall.

That kind of user error can also happen with select()/poll(), for example if you do:

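FD_ZERO(&fdset);
FD_SET(fd, &fdset);
select(fd + 1, &fdset, NULL, NULL, NULL);
newfd = dup(fd);
close(fd);              /* fd is gone, but its bit is still set in fdset */
for (i = 0; i < maxfd; i++)
    if (FD_ISSET(i, &fdset))
        read(i, ...);   /* user error: i is the closed fd, not newfd */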
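
The same mistake can be reproduced in a small self-contained sketch. The function name stale_fdset_bit_after_dup() is made up for illustration, and a pipe stands in for whatever fd the application was watching. It shows that the bit in the fd_set stays on the old descriptor number, that reading from it fails with EBADF, and that the data is still reachable through the dup.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if the stale-tag behaviour described above was observed,
 * 0 otherwise. */
int stale_fdset_bit_after_dup(void)
{
    int pipefd[2];
    int ok = 0;

    if (pipe(pipefd) < 0)
        return 0;

    int fd = pipefd[0];
    assert(fd >= 0 && fd < FD_SETSIZE);   /* required by FD_SET() */

    fd_set fdset;
    struct timeval tv = { 1, 0 };         /* 1 second, to avoid hanging */
    FD_ZERO(&fdset);
    FD_SET(fd, &fdset);

    /* Make the read end readable, then let select() report it. */
    if (write(pipefd[1], "x", 1) == 1 &&
        select(fd + 1, &fdset, NULL, NULL, &tv) == 1) {
        int newfd = dup(fd);
        if (newfd >= 0) {
            close(fd);
            fd = -1;                      /* remember that it is closed */

            /* The set still names the old number, not the new one. */
            int stale = FD_ISSET(pipefd[0], &fdset) &&
                        !FD_ISSET(newfd, &fdset);

            char buf = 0;
            errno = 0;
            int old_failed = (read(pipefd[0], &buf, 1) == -1 &&
                              errno == EBADF);
            int new_works = (read(newfd, &buf, 1) == 1 && buf == 'x');

            ok = stale && old_failed && new_works;
            close(newfd);
        }
    }

    if (fd >= 0)
        close(fd);
    close(pipefd[1]);
    return ok;
}
```

In a single-threaded test program this should return 1. Nothing in select() protects the application here either: the fd_set is just a tag chosen by the user, exactly like the 'fd' field of epoll_data_t.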

