Re: [Dovecot] Patch: event port-based ioloop and notify

2009-12-09 Thread Timo Sirainen
On Thu, 2009-12-10 at 00:28 +0100, Emanuele Pucciarelli wrote:
> > Why do you silently handle EBADFs and ENOENTs? Are those actually
> > happening in your system? Dovecot should never close fds before removing
> > them from ioloop, other ioloops also log an error if that happens.
> 
> I'll check that again and report back. I'm not sure if I saw an
> ENOENT. Does this apply to ioloop-notify as well?

Yes.

> > For port_getn() error handling I'd probably do the same as all other
> > ioloops: Ignore EINTR/ETIME and treat everything else as fatal. What's
> > the idea behind BAD_PORTEV_USER?
> 
> Unfortunately, it's possible (at least in some versions, see [2]) that
> port_getn() returns EINTR before updating nget to anything useful. In
> this case, the code would see it set to 1 and try to process the
> event: it would probably crash immediately, as soon as it tries to
> dereference events[0].portev_user to update the refcount. It seems to
> be a rare race condition and I haven't witnessed it personally, but
> (claimed to be) entirely possible.

Oh, so kind of an API design bug. But could BAD_PORTEV_USER be simply
NULL? Seems safer than some random memory address that might even be
valid.

Oh and in notify code your loops use ret, not read_events.


signature.asc
Description: This is a digitally signed message part


Re: [Dovecot] Patch: event port-based ioloop and notify

2009-12-09 Thread Emanuele Pucciarelli
Timo,

thanks for the detailed review! I'm going to work on the patch and
make it follow your comments.

> Does the configure check actually have to test that all the port_*
> functions work? If they exist in libc, aren't they guaranteed to work?
> If just their existence is enough, I'd just change the check to test to
> AC_CHECK_FUNC(port_associate, ..).

I don't know of any configurations where they exist and don't work.
I'll cut the tests, then.

> Why do you call it fen / file event notification? It seems to be called
> just "port" in the man pages and everywhere else, so I'd call it that.

Yes, I mixed that up. The general port_* framework is the "event ports
framework", while the part that specifically watches for changes in
files and directories is the newer "file event notification API" [1].

The presence of port_* is enough for the ioloop to work, but the FEN
API, which introduces PORT_SOURCE_FILE in the headers, is needed for
the ioloop-notify. The file event notification is only available since
version 72 of Nevada, so all recent versions (since before 2008) of
OpenSolaris have it, but I don't know which, if any, versions of
Solaris 10 have it.

I would be going to separate the two checks and take a less tortuous
path to check for PORT_SOURCE_FILE (AC_CHECK_DECL, I'd say).

> Why do you silently handle EBADFs and ENOENTs? Are those actually
> happening in your system? Dovecot should never close fds before removing
> them from ioloop, other ioloops also log an error if that happens.

I'll check that again and report back. I'm not sure if I saw an
ENOENT. Does this apply to ioloop-notify as well?

> For port_getn() error handling I'd probably do the same as all other
> ioloops: Ignore EINTR/ETIME and treat everything else as fatal. What's
> the idea behind BAD_PORTEV_USER?

Unfortunately, it's possible (at least in some versions, see [2]) that
port_getn() returns EINTR before updating nget to anything useful. In
this case, the code would see it set to 1 and try to process the
event: it would probably crash immediately, as soon as it tries to
dereference events[0].portev_user to update the refcount. It seems to
be a rare race condition and I haven't witnessed it personally, but
(claimed to be) entirely possible.

> In general when an error that really shouldn't happen does happen I
> prefer to fatal/panic than trying to limp along.

Then I'll let EBADFD be fatal too, instead of trying to recreate the
port and proceed, and I'll clean up the two switch statements.

>> + io->path = i_new(char, strlen(path)+1);
>> + strcpy(io->path, path);
>
> io->path = i_strdup(path);

Thanks. It kind of shows that I haven't taken the time to read the
library – sorry about that.

I'll re-post as soon as I manage to do the clean-up.

[1] http://arc.opensolaris.org/caselog/PSARC/2007/027/mail
[2] http://www.mail-archive.com/networking-disc...@opensolaris.org/msg12330.html

-- 
Emanuele


Re: [Dovecot] Patch: event port-based ioloop and notify

2009-12-09 Thread Timo Sirainen
On Sat, 2009-11-14 at 21:09 +0100, Emanuele Pucciarelli wrote:
> I have prepared a small patch to support Solaris 10 and Opensolaris'
> event port mechanism for both the ioloop and the notify subsystems. It
> seems to work fine for me, but I haven't conducted any extensive
> testing.

Does the configure check actually have to test that all the port_*
functions work? If they exist in libc, aren't they guaranteed to work?
If just their existence is enough, I'd just change the check to test to
AC_CHECK_FUNC(port_associate, ..).

Why do you call it fen / file event notification? It seems to be called
just "port" in the man pages and everywhere else, so I'd call it that.

Why do you silently handle EBADFs and ENOENTs? Are those actually
happening in your system? Dovecot should never close fds before removing
them from ioloop, other ioloops also log an error if that happens.

For port_getn() error handling I'd probably do the same as all other
ioloops: Ignore EINTR/ETIME and treat everything else as fatal. What's
the idea behind BAD_PORTEV_USER?

In general when an error that really shouldn't happen does happen I
prefer to fatal/panic than trying to limp along.

> + io->path = i_new(char, strlen(path)+1);
> + strcpy(io->path, path);

io->path = i_strdup(path);



signature.asc
Description: This is a digitally signed message part


[Dovecot] Patch: event port-based ioloop and notify

2009-11-14 Thread Emanuele Pucciarelli
Greetings,

thanks to all of you who work on Dovecot!

I have prepared a small patch to support Solaris 10 and Opensolaris'
event port mechanism for both the ioloop and the notify subsystems. It
seems to work fine for me, but I haven't conducted any extensive
testing.

It would be great if someone could review and/or test it (and if it
could eventually enter the code base).

I have uploaded a copy at
http://cr.opensolaris.org/~e.p/dovecot/dovecot-01-fen.diff .

Thanks!

--
Emanuele