[take23 1/5] kevent: Description.

Evgeniy Polyakov Tue, 07 Nov 2006 08:54:45 -0800

Description.

int kevent_ctl(int fd, unsigned int cmd, unsigned int num, struct ukevent *arg);


fd - the file descriptor referring to the kevent queue to manipulate.
It is created by opening the "/dev/kevent" char device, which is registered
with a dynamic minor number and the major number assigned to misc devices.

cmd - is the requested operation. It can be one of the following:
    KEVENT_CTL_ADD - add event notification 
    KEVENT_CTL_REMOVE - remove event notification 
    KEVENT_CTL_MODIFY - modify existing notification 

num - number of struct ukevent in the array pointed to by arg 
arg - array of struct ukevent

When called, kevent_ctl will carry out the operation specified in the cmd 
parameter.
-------------------------------------------------------------------------------------

 int kevent_get_events(int ctl_fd, unsigned int min_nr, unsigned int max_nr, 
__u64 timeout, struct ukevent *buf, unsigned flags)

ctl_fd - file descriptor referring to the kevent queue 
min_nr - minimum number of completed events that kevent_get_events will
         block waiting for
max_nr - number of struct ukevent in buf
timeout - number of nanoseconds to wait before returning fewer than min_nr
          events. If this is -1, then wait forever.
buf - pointer to an array of struct ukevent. 
flags - unused 

kevent_get_events will wait up to timeout nanoseconds for at least min_nr
completed events, copying completed struct ukevents to buf and deleting any
KEVENT_REQ_ONESHOT event requests.
In nonblocking mode it returns as many events as are ready, but no more than
max_nr.
In blocking mode it waits until either the timeout expires or at least min_nr
events are ready.
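
Since the timeout is a plain __u64 count of nanoseconds with -1 meaning
"wait forever", callers that think in milliseconds need a small conversion.
A minimal sketch, assuming nothing beyond the description above; the names
kevent_timeout_from_ms and KEVENT_TIMEOUT_FOREVER are hypothetical and not
part of the kevent interface:

```c
#include <stdint.h>

/* Hypothetical name: the description only says that -1 means "wait forever". */
#define KEVENT_TIMEOUT_FOREVER UINT64_MAX

/*
 * Convert a millisecond timeout into the nanosecond value expected by
 * kevent_get_events() and kevent_wait(). A negative input maps to the
 * "wait forever" value; very large inputs saturate just below it so that
 * they are never mistaken for "forever".
 */
static uint64_t kevent_timeout_from_ms(int64_t ms)
{
    const uint64_t ns_per_ms = 1000000ULL;

    if (ms < 0)
        return KEVENT_TIMEOUT_FOREVER;
    if ((uint64_t)ms > (UINT64_MAX - 1) / ns_per_ms)
        return UINT64_MAX - 1;
    return (uint64_t)ms * ns_per_ms;
}
```

For example, kevent_timeout_from_ms(1500) yields 1500000000, which could
then be passed as the timeout argument.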
-------------------------------------------------------------------------------------

 int kevent_wait(int ctl_fd, unsigned int num, __u64 timeout)

ctl_fd - file descriptor referring to the kevent queue 
num - number of processed kevents 
timeout - number of nanoseconds to wait until there is free space in the
          kevent queue

This syscall waits until either the timeout expires or at least one event
becomes ready.
It also copies those num events into the special ring buffer and requeues
them (or removes them, depending on flags).
-------------------------------------------------------------------------------------

 int kevent_ring_init(int ctl_fd, struct kevent_ring *ring, unsigned int num)

ctl_fd - file descriptor referring to the kevent queue 
num - size of the ring buffer in events 

 struct kevent_ring
 {
   unsigned int ring_kidx;
   struct ukevent event[0];
 };

ring_kidx - the index in the ring buffer where the kernel will put new
  events when kevent_wait() or kevent_get_events() is called

Example userspace code (ring_buffer.c) can be found on the project's homepage.

Each kevent syscall can be a so-called cancellation point in glibc: when a
thread is cancelled inside a kevent syscall, it can be safely removed and no
events will be lost, since each syscall (kevent_wait() or
kevent_get_events()) copies events into the special ring buffer, which is
accessible from other threads or even processes (if shared memory is used).

When a kevent is removed (not dequeued when it is ready, but just removed),
it is not copied into the ring buffer even if it was ready. If it is being
removed, no one cares about it (otherwise the user would wait until it became
ready and get it in the usual way through kevent_get_events() or
kevent_wait()), so there is no need to copy it to the ring buffer.

With a userspace ring buffer, events in the ring can be replaced without the
knowledge of the thread currently reading them (when another thread calls
kevent_get_events() or kevent_wait()), so appropriate locking is required
between threads or processes that can access the same ring buffer
simultaneously.
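
The locking requirement can be illustrated with a small userspace reader.
This is only a sketch under several assumptions that the description does
not confirm: the ring index wraps modulo the num passed to
kevent_ring_init(), the reader keeps its own index (uidx), and struct
kevent_id is a pair of __u32 values. struct ring_reader, ring_reader_init()
and ring_read() are hypothetical helpers, not part of kevent. The lock only
serialises userspace readers; it does not stop the kernel from advancing
ring_kidx.

```c
#include <stdint.h>
#include <stdatomic.h>

/* Layout follows the member list in this description (assumed, not verified). */
struct kevent_id { uint32_t raw[2]; };

struct ukevent {
    struct kevent_id id;
    uint32_t type, event, req_flags, ret_flags, ret_data;
    union { uint32_t user[2]; void *ptr; };
};

struct kevent_ring {
    unsigned int ring_kidx;      /* where the kernel puts the next event */
    struct ukevent event[];      /* written as event[0] in the description */
};

/* Hypothetical per-ring reader state shared by all userspace readers. */
struct ring_reader {
    struct kevent_ring *ring;
    unsigned int size;           /* num given to kevent_ring_init() */
    unsigned int uidx;           /* first slot not yet consumed */
    atomic_flag lock;            /* simple spinlock between readers */
};

static void ring_reader_init(struct ring_reader *r, struct kevent_ring *ring,
                             unsigned int size)
{
    r->ring = ring;
    r->size = size;
    r->uidx = 0;
    atomic_flag_clear(&r->lock);
}

/* Copy up to max unread events into out; returns how many were copied. */
static unsigned int ring_read(struct ring_reader *r, struct ukevent *out,
                              unsigned int max)
{
    unsigned int n = 0;

    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ;                        /* spin until the other reader is done */

    while (n < max && r->uidx != r->ring->ring_kidx) {
        out[n++] = r->ring->event[r->uidx];
        r->uidx = (r->uidx + 1) % r->size;   /* assumed wrap-around rule */
    }

    atomic_flag_clear_explicit(&r->lock, memory_order_release);
    return n;
}
```

Events are copied out while the lock is held, so another reader cannot hand
out the same slot twice; whether a copy or in-place processing is preferable
depends on how quickly the kernel may overwrite a slot.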
-------------------------------------------------------------------------------------

The bulk of the interface is entirely done through the ukevent struct. 
It is used to add event requests, modify existing event requests, 
specify which event requests to remove, and return completed events.

struct ukevent contains the following members:

struct kevent_id id
    Id of this request, e.g. socket number, file descriptor and so on 
__u32 type
    Event type, e.g. KEVENT_SOCK, KEVENT_INODE, KEVENT_TIMER and so on 
__u32 event
    Event itself, e.g. SOCK_ACCEPT, INODE_CREATED, TIMER_FIRED 
__u32 req_flags
    Per-event request flags,

    KEVENT_REQ_ONESHOT
        event will be removed when it is ready 

    KEVENT_REQ_WAKEUP_ONE
        When several threads wait on the same kevent queue and have requested
        the same event, for example 'wake me up when a new client has
        connected, so I can call accept()', all of them are awakened when a
        new client connects, but only one of them can process the data. This
        is known as the thundering herd problem. Events which have this flag
        set will not be marked as ready (and the corresponding threads will
        not be awakened) if at least one event has already been marked.

    KEVENT_REQ_ET
        Edge Triggered behaviour. It is an optimisation which allows a ready
        and dequeued (i.e. copied to userspace) event to be moved back into
        the set of interest for the given storage (socket, inode and so on).
        It is very useful when the same event should be used many times
        (like reading from a pipe). It is similar to epoll()'s EPOLLET flag.

__u32 ret_flags
    Per-event return flags

    KEVENT_RET_BROKEN
        Kevent is broken 

    KEVENT_RET_DONE
        Kevent processing was finished successfully 

    KEVENT_RET_COPY_FAILED
        Kevent was not copied into ring buffer due to some error conditions. 

__u32 ret_data
    Event return data. Event originator fills it with anything it likes (for
    example, timer notifications put the number of milliseconds when the
    timer has fired).
union { __u32 user[2]; void *ptr; }
    User's data. It is not used, just copied to/from user. The whole structure
    is aligned to 8 bytes already, so the last union is aligned properly.

---------------------------------------------------------------------------------

Usage

For KEVENT_CTL_ADD, all fields relevant to the event type must be filled
(id, type, possibly event, req_flags). After
kevent_ctl(..., KEVENT_CTL_ADD, ...) returns, each struct's ret_flags should
be checked to see if the event is already broken or done.

For KEVENT_CTL_MODIFY, the id, req_flags, user and event fields must be set,
and an existing kevent request must have matching id and user fields. If a
match is found, req_flags and event are replaced with the newly supplied
values and requeueing is started, so the modified kevent can be checked and
possibly marked as ready immediately. If a match can't be found, the passed
in ukevent's ret_flags has KEVENT_RET_BROKEN set.
KEVENT_RET_DONE is always set.

For KEVENT_CTL_REMOVE, the id and user fields must be set and an existing
kevent request must have matching id and user fields. If a match is found,
the kevent request is removed. If a match can't be found, the passed in
ukevent's ret_flags has KEVENT_RET_BROKEN set.
KEVENT_RET_DONE is always set.
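
The ret_flags rules above can be checked with a few lines of userspace code
once kevent_ctl() has returned. A minimal sketch: the numeric flag values
below are placeholders (this description names the flags but does not give
their values), struct kevent_id is assumed to be a pair of __u32 values, and
ukevent_classify() and ukevent_count_broken() are hypothetical helpers.
BROKEN is tested first because MODIFY and REMOVE always set DONE.

```c
#include <stdint.h>

/* Placeholder values, not taken from this description. */
#ifndef KEVENT_RET_BROKEN
#define KEVENT_RET_BROKEN 0x1
#define KEVENT_RET_DONE   0x2
#endif

/* Layout follows the member list in this description (assumed, not verified). */
struct kevent_id { uint32_t raw[2]; };

struct ukevent {
    struct kevent_id id;
    uint32_t type, event, req_flags, ret_flags, ret_data;
    union { uint32_t user[2]; void *ptr; };
};

enum ukevent_status {
    UKEVENT_PENDING,   /* neither flag set: request is queued */
    UKEVENT_DONE,      /* processing finished successfully */
    UKEVENT_BROKEN     /* request failed, e.g. no match for MODIFY/REMOVE */
};

/* Interpret ret_flags of one request after kevent_ctl() has returned. */
static enum ukevent_status ukevent_classify(const struct ukevent *ev)
{
    if (ev->ret_flags & KEVENT_RET_BROKEN)
        return UKEVENT_BROKEN;
    if (ev->ret_flags & KEVENT_RET_DONE)
        return UKEVENT_DONE;
    return UKEVENT_PENDING;
}

/* Count how many of the num requests in ev were marked as broken. */
static unsigned int ukevent_count_broken(const struct ukevent *ev,
                                         unsigned int num)
{
    unsigned int i, broken = 0;

    for (i = 0; i < num; i++)
        if (ukevent_classify(&ev[i]) == UKEVENT_BROKEN)
            broken++;
    return broken;
}
```

A caller would run ukevent_count_broken() over the same array it passed to
kevent_ctl() and handle the broken entries individually.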

For kevent_get_events, the entire structure is returned.

---------------------------------------------------------------------------------

Usage cases

kevent_timer
struct ukevent should contain the following fields:
    type - KEVENT_TIMER
    event - KEVENT_TIMER_FIRED
    req_flags - KEVENT_REQ_ONESHOT if you want to fire that timer only once
    id.raw[0] - number of seconds after commit when this timer should expire
    id.raw[1] - number of nanoseconds in addition to the number of seconds
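
Filling a timer request could look like the sketch below. It assumes that
the nanoseconds part lives in id.raw[1] and that struct kevent_id is a pair
of __u32 values; the numeric constants are placeholders, since this
description does not give their values, and ukevent_fill_timer() is a
hypothetical helper. The resulting struct would then be passed to
kevent_ctl(fd, KEVENT_CTL_ADD, 1, &ev).

```c
#include <stdint.h>
#include <string.h>

/* Placeholder values, not taken from this description. */
#ifndef KEVENT_TIMER
#define KEVENT_TIMER        1
#define KEVENT_TIMER_FIRED  0x1
#define KEVENT_REQ_ONESHOT  0x1
#endif

/* Layout follows the member list in this description (assumed, not verified). */
struct kevent_id { uint32_t raw[2]; };

struct ukevent {
    struct kevent_id id;
    uint32_t type, event, req_flags, ret_flags, ret_data;
    union { uint32_t user[2]; void *ptr; };
};

/*
 * Prepare a timer request that expires sec seconds plus nsec nanoseconds
 * after it has been added. If oneshot is non-zero the request is removed
 * after it fires once.
 */
static void ukevent_fill_timer(struct ukevent *ev, uint32_t sec,
                               uint32_t nsec, int oneshot)
{
    memset(ev, 0, sizeof(*ev));      /* clears ret_flags and user data */
    ev->type = KEVENT_TIMER;
    ev->event = KEVENT_TIMER_FIRED;
    ev->req_flags = oneshot ? KEVENT_REQ_ONESHOT : 0;
    ev->id.raw[0] = sec;             /* whole seconds */
    ev->id.raw[1] = nsec;            /* assumed: additional nanoseconds */
}
```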

