Re: select() efficiency / epoll

2005-08-24 Thread Davy Durham

Davide Libenzi wrote:

> There is no known problem in using epoll_ctl() in one thread while
> another does epoll_wait().
> I suggest you ask Valgrind to take a look at your binary. Since I
> have no clue of what your software does, please create the *minimal*
> code snippet that exposes the potential problem, and post it.


Yes, I have pretty much confirmed this. And unfortunately I tried to 
make a minimal code snippet which demonstrates the problem, but wasn't 
able to do that before I figured out a work-around.  I may still try to 
create something for you to test against so you can fix it.  But I'm 
going to have to continue to work with the existing implementation since 
I'm going to be running this code on some production servers where 
updating the kernel might not be an option.


The work-around is as follows:

1) I create a queue that can hold operations to perform on the epoll 
structure and I protect it with a mutex.


2) Other threads (when needing to modify the epoll) lock the mutex and
enqueue the operation into the operation queue instead of calling
epoll_ctl() themselves (e.g. add this socket for reading, add this socket
for writing, remove this socket, etc.) *and* then cancel the epoll_wait().
  I implemented the cancel by having a pipe() always being watched for
read, and writing a byte to it when I want to cancel (is there a better way?)
  There are several operations that could be supported
(add/remove/modify/change userdata/etc), but I only need two myself.


3) There's only one thread that actually does the epoll_wait().  When
epoll_wait() returns, I first drain the cancel pipe so it never fills
up, then handle whatever events need handling, and then lock the
operations queue mutex, perform all the operations in the queue, and
clear the queue.
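
The three steps above can be sketched roughly as follows. This is only an illustration of the described scheme, not the code from the application in question: the names `struct poller`, `struct pending_op`, `poller_init`, `poller_request`, `poller_wait` and `poller_apply_pending` are all made up here, error handling is minimal, and the cancel pipe is tagged with a sentinel `data.ptr` so its wakeups can be told apart from real events.

```c
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

/* One deferred epoll_ctl() call (hypothetical type, not from the thread). */
struct pending_op {
    int op, fd;                 /* op is EPOLL_CTL_ADD, _MOD or _DEL */
    struct epoll_event ev;
    struct pending_op *next;
};

struct poller {
    int epfd, wake_rd, wake_wr; /* wake_* is the "cancel" pipe */
    pthread_mutex_t lock;       /* protects head and tail */
    struct pending_op *head, *tail;
};

int poller_init(struct poller *p)
{
    struct epoll_event ev = { .events = EPOLLIN };
    int pfd[2];

    p->head = p->tail = NULL;
    if (pthread_mutex_init(&p->lock, NULL) != 0) return -1;
    if ((p->epfd = epoll_create(64)) < 0 || pipe(pfd) < 0) return -1;
    p->wake_rd = pfd[0];
    p->wake_wr = pfd[1];
    /* Non-blocking: draining stops at EAGAIN, a full pipe never blocks a writer. */
    fcntl(p->wake_rd, F_SETFL, O_NONBLOCK);
    fcntl(p->wake_wr, F_SETFL, O_NONBLOCK);
    ev.data.ptr = p;            /* sentinel marking events from the cancel pipe */
    return epoll_ctl(p->epfd, EPOLL_CTL_ADD, p->wake_rd, &ev);
}

/* Step 2: any thread calls this instead of epoll_ctl(). */
int poller_request(struct poller *p, int op, int fd, uint32_t events, void *ptr)
{
    struct pending_op *n = calloc(1, sizeof *n);

    if (n == NULL) return -1;
    n->op = op;
    n->fd = fd;
    n->ev.events = events;
    n->ev.data.ptr = ptr;
    pthread_mutex_lock(&p->lock);
    if (p->tail) p->tail->next = n; else p->head = n;
    p->tail = n;
    pthread_mutex_unlock(&p->lock);
    if (write(p->wake_wr, "x", 1) < 0) {
        /* EAGAIN: the pipe is full, so a wakeup is already pending. */
    }
    return 0;
}

/* Step 3a: polling thread only. Waits, drains the cancel pipe, and returns
 * the number of user events left in out[] (or -1 on error). */
int poller_wait(struct poller *p, struct epoll_event *out, int max, int timeout_ms)
{
    char buf[64];
    int i, nout = 0;
    int n = epoll_wait(p->epfd, out, max, timeout_ms);

    if (n < 0) return -1;
    for (i = 0; i < n; i++) {
        if (out[i].data.ptr == p) {
            while (read(p->wake_rd, buf, sizeof buf) > 0)
                ;               /* drain so the pipe never fills up */
        } else {
            out[nout++] = out[i];
        }
    }
    return nout;
}

/* Step 3b: polling thread only, after the events have been handled.
 * Returns how many queued epoll_ctl() calls succeeded. */
int poller_apply_pending(struct poller *p)
{
    struct pending_op *q, *next;
    int ok = 0;

    pthread_mutex_lock(&p->lock);
    for (q = p->head; q != NULL; q = next) {
        next = q->next;
        if (epoll_ctl(p->epfd, q->op, q->fd, &q->ev) == 0) ok++;
        free(q);
    }
    p->head = p->tail = NULL;
    pthread_mutex_unlock(&p->lock);
    return ok;
}
```

In this sketch the polling thread would loop over `poller_wait()`, handle the returned events, and then call `poller_apply_pending()`. A request that arrives between the drain and the apply step is still applied, at the cost of one extra wakeup.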




So, this works for me now.

Thanks for all your guys' info.

-- Davy

P.S.   Davide, I still might get you that snippet, but it's not a
trivial snippet as you can imagine... and timing is everything to the
problem :( .. and there's also the question of WHERE it corrupts memory..
it has seemed to be unpredictable so far.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Davide Libenzi

On Tue, 23 Aug 2005, Davy Durham wrote:

> Davide Libenzi wrote:
>
>> I should mention that the 2.4 patch is old WRT mainline epoll in 2.6 (I
>> stopped maintaining it when 2.6 went "stable"). I'd definitely suggest
>> using 2.6 if you are looking at epoll.
>
> I am using linux-2.6.11 and glibc-2.3.4  .. and using select() in its place
> seems to work fine.  Are there any known issues with, say, one thread doing
> epoll_wait()s while other threads may be doing epoll_ctl()s?

There is no known problem in using epoll_ctl() in one thread while another
does epoll_wait().
I suggest you ask Valgrind to take a look at your binary. Since I have
no clue of what your software does, please create the *minimal* code
snippet that exposes the potential problem, and post it.
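
For reference, the usage pattern being asked about looks roughly like the sketch below: one thread blocks in epoll_wait() while another registers a descriptor with epoll_ctl(). It only illustrates the pattern and is not a reproducer for the reported corruption; `struct waiter`, `wait_thread` and `concurrent_ctl_demo` are hypothetical names, and the 100 ms sleep is merely a crude way to make it likely that the waiter is already blocked.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <unistd.h>

struct waiter {
    int epfd;
    int seen_fd;     /* fd reported by epoll_wait(), or -1 */
};

static void *wait_thread(void *arg)
{
    struct waiter *w = arg;
    struct epoll_event ev;

    /* Blocks until a registered fd becomes ready (5 s safety timeout). */
    if (epoll_wait(w->epfd, &ev, 1, 5000) == 1)
        w->seen_fd = ev.data.fd;
    return NULL;
}

/* Returns 0 if the waiting thread observed the fd that the calling thread
 * registered while the wait was (most likely) already in progress. */
int concurrent_ctl_demo(void)
{
    struct waiter w = { .epfd = epoll_create(8), .seen_fd = -1 };
    struct epoll_event ev = { .events = EPOLLIN };
    pthread_t tid;
    int pfd[2];
    int rc = -1;

    if (w.epfd < 0 || pipe(pfd) < 0) return -1;
    if (pthread_create(&tid, NULL, wait_thread, &w) != 0) return -1;

    usleep(100 * 1000);                 /* give the waiter time to block */
    ev.data.fd = pfd[0];
    if (epoll_ctl(w.epfd, EPOLL_CTL_ADD, pfd[0], &ev) == 0 &&
        write(pfd[1], "x", 1) == 1)
        rc = 0;

    /* The waiter returns on the event or, failing that, on its timeout. */
    pthread_join(tid, NULL);
    close(pfd[0]);
    close(pfd[1]);
    close(w.epfd);
    return (rc == 0 && w.seen_fd == pfd[0]) ? 0 : -1;
}
```

A real test case for the reported problem would presumably need many descriptors, several modifying threads, and a long run, since the failure is described as timing-dependent.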



- Davide



Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Jari Sundell wrote:

> On 8/23/05, Davy Durham <[EMAIL PROTECTED]> wrote:
>
> I was hoping you would mention in your reply that you knew
> epoll_data_t was a union and you didn't touch epoll_data::fd, so I
> wouldn't have to say it explicitly. ;)

Oh!.. unless epoll_data_t is a union just for convenience, in that it
already has an 'int fd' if you want to use that, but you don't have to..
that at least makes the void *ptr useful..  The example in 'man epoll'
sort of made it look necessary to set the 'fd' of the union.
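
A small sketch of what this means in practice: because epoll_data_t is a union, only one of `fd` and `ptr` should be stored, and a descriptor that is needed later can live inside the structure that `ptr` points to. The names `struct conn`, `conn_register`, `conn_wait_one` and `fd_store_clobbers_ptr` are invented for this illustration.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection record; the fd lives inside it. */
struct conn {
    int fd;
    const char *name;
};

/* Register c->fd and hand the kernel a pointer back to the record.
 * Only data.ptr is written; data.fd shares the same storage. */
int conn_register(int epfd, struct conn *c)
{
    struct epoll_event ev;

    ev.events = EPOLLIN;
    ev.data.ptr = c;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Wait for one event and return the record it belongs to, or NULL. */
struct conn *conn_wait_one(int epfd, int timeout_ms)
{
    struct epoll_event ev;

    if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
        return NULL;
    return ev.data.ptr;
}

/* The pitfall: storing fd after ptr overwrites part (or all) of the
 * pointer. Returns 1 if the pointer no longer compares equal. */
int fd_store_clobbers_ptr(void *p, int fd)
{
    epoll_data_t d;

    d.ptr = p;
    d.fd = fd;          /* same bytes as d.ptr */
    return d.ptr != p;
}
```

On a platform where pointers and int have the same size, the last function leaves `d.ptr` holding exactly the descriptor number, which is one way a pointer can end up with a low integer value.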


But that still doesn't fix the issue of course.. but good to know.


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Jari Sundell wrote:

> On 8/23/05, Davy Durham <[EMAIL PROTECTED]> wrote:
>
> I was hoping you would mention in your reply that you knew
> epoll_data_t was a union and you didn't touch epoll_data::fd, so I
> wouldn't have to say it explicitly. ;)

No, I saw that epoll_data_t was a union (although it kind of makes the
ptr useless as a user data pointer.. but I'm not using it for that).


When I say that pointers are getting corrupted, I just mean in other
parts of the code (actually it's some C++ STL container's data and is
completely unrelated to the epoll-specific code).  Something, somewhere
seems to be writing to memory that it's not supposed to be writing to.
And as far as I can tell, it happens when I use epoll and doesn't when I
use select  :-/


-- Davy






Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Davide Libenzi wrote:

> I should mention that the 2.4 patch is old WRT mainline epoll in 2.6
> (I stopped maintaining it when 2.6 went "stable"). I'd definitely
> suggest using 2.6 if you are looking at epoll.

I am using linux-2.6.11 and glibc-2.3.4  .. and using select() in its
place seems to work fine.  Are there any known issues with, say, one
thread doing epoll_wait()s while other threads may be doing epoll_ctl()s?


Is there someone else I should be asking this question?

Thanks,
 Davy


Re: select() efficiency / epoll

2005-08-23 Thread Davide Libenzi

On Tue, 23 Aug 2005, Willy Tarreau wrote:

> On Tue, Aug 23, 2005 at 06:55:26AM -0500, Davy Durham wrote:
>> Thanks for the info.. I did find this thread and was wondering if this
>> patch ever got put in
>>
>> http://www.ussg.iu.edu/hypermail/linux/kernel/0303.3/1139.html
>
> Interesting ! At least it does not seem to be present in the
> epoll-2.4.24-0.20 I have right here, and although the code changed
> significantly in 2.6, it does not seem to contain it either. But I
> don't even see how to merge this into 2.6. You should ask Davide,
> he knows this code better than anyone else, and could tell us if
> this patch was simply lost or is unneeded.

I should mention that the 2.4 patch is old WRT mainline epoll in 2.6 (I
stopped maintaining it when 2.6 went "stable"). I'd definitely suggest
using 2.6 if you are looking at epoll.




- Davide



Re: select() efficiency / epoll

2005-08-23 Thread Jari Sundell
On 8/23/05, Davy Durham <[EMAIL PROTECTED]> wrote:

> Yes, that is what I was thinking and is why I mentioned that.  But I'm
> apparently not overwriting the pointers with FDs.. it seems that epoll
> is the cause at this point (unless I'm misusing the epoll API).  I've
> made some changes to now use select() instead of epoll and things work
> flawlessly (although it obviously won't work as efficiently when I
> really connect a lot of clients to this server)

I was hoping you would mention in your reply that you knew
epoll_data_t was a union and you didn't touch epoll_data::fd, so I
wouldn't have to say it explicitly. ;)

-- 
Rakshasa

Nyaa?


Re: select() efficiency / epoll

2005-08-23 Thread Willy Tarreau
On Tue, Aug 23, 2005 at 06:55:26AM -0500, Davy Durham wrote:
> Thanks for the info.. I did find this thread and was wondering if this 
> patch ever got put in
> 
> http://www.ussg.iu.edu/hypermail/linux/kernel/0303.3/1139.html
> 

Interesting ! At least it does not seem to be present in the
epoll-2.4.24-0.20 I have right here, and although the code changed
significantly in 2.6, it does not seem to contain it either. But I
don't even see how to merge this into 2.6. You should ask Davide,
he knows this code better than anyone else, and could tell us if
this patch was simply lost or is unneeded.

Regards,
Willy



Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham
Thanks for the info.. I did find this thread and was wondering if this 
patch ever got put in


http://www.ussg.iu.edu/hypermail/linux/kernel/0303.3/1139.html



Willy Tarreau wrote:

> On Tue, Aug 23, 2005 at 06:24:42AM -0500, Davy Durham wrote:
>> That's probably a good idea.  Where would I find out what other projects
>> use it?
>
> I use it in my load-balancer (haproxy), and it could somewhat match your
> needs, because I ported the select()-based earlier version to epoll() with
> the smallest possible changes. Indeed, the new epoll() loop still uses
> FD_ISSET() to determine what to do with epoll_ctl(). If you have changed
> your code to use select(), you may find similarities. But I want to tell
> you right now that my code is NOT multi-threaded. It could be a bug in the
> epoll implementation, because I don't think that there are that many
> applications using epoll on MT models. Bert says that the epoll implementation
> is heavily benchmarked, which is true, but which does not guarantee that it
> is tested under every condition.
>
> You can download it from here:
>
>  http://w.ods.org/tools/haproxy/src/devel/
>
> Use version 1.2.6. I added epoll in 1.2.5, so the diff between 1.2.4 and
> 1.2.5 could help you too.
>
> Good luck !
> Willy



Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Jari Sundell wrote:

> On 8/23/05, Davy Durham <[EMAIL PROTECTED]> wrote:
>> However, I'm getting segfaults because some pointers in places are
>> getting set to low integer values (which didn't use to have those values).
>
> Is it possible that you are overwriting the pointers with file
> descriptors, as those would have low integer values?

 

Yes, that is what I was thinking and is why I mentioned that.  But I'm 
apparently not overwriting the pointers with FDs.. it seems that epoll 
is the cause at this point (unless I'm misusing the epoll API).  I've 
made some changes to now use select() instead of epoll and things work 
flawlessly (although it obviously won't work as efficiently when I 
really connect a lot of clients to this server)






Re: select() efficiency / epoll

2005-08-23 Thread Willy Tarreau
On Tue, Aug 23, 2005 at 06:24:42AM -0500, Davy Durham wrote:
> That's probably a good idea.  Where would I find out what other projects 
> use it?

I use it in my load-balancer (haproxy), and it could somewhat match your
needs, because I ported the select()-based earlier version to epoll() with
the smallest possible changes. Indeed, the new epoll() loop still uses
FD_ISSET() to determine what to do with epoll_ctl(). If you have changed
your code to use select(), you may find similarities. But I want to tell
you right now that my code is NOT multi-threaded. It could be a bug in the
epoll implementation, because I don't think that there are that many
applications using epoll on MT models. Bert says that the epoll implementation
is heavily benchmarked, which is true, but which does not guarantee that it
is tested under every condition.
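
One way such a port could work is sketched below. This is not haproxy's actual code, only a guess at the general idea: the program keeps maintaining select()-style fd_sets, and before each wait the epoll set is brought in line with them by comparing against the sets from the previous pass. The function names `wanted_events` and `sync_epoll_with_fdsets` and the `old_rd`/`old_wr` bookkeeping are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/select.h>
#include <unistd.h>

/* Event mask that the select()-style sets currently ask for on fd. */
static uint32_t wanted_events(int fd, fd_set *rd, fd_set *wr)
{
    uint32_t ev = 0;

    if (FD_ISSET(fd, rd)) ev |= EPOLLIN;
    if (FD_ISSET(fd, wr)) ev |= EPOLLOUT;
    return ev;
}

/* Bring the epoll set in line with rd/wr. old_rd/old_wr hold the state from
 * the previous call and are updated on success. Returns the number of
 * epoll_ctl() calls made, or -1 on error. */
int sync_epoll_with_fdsets(int epfd, int maxfd, fd_set *rd, fd_set *wr,
                           fd_set *old_rd, fd_set *old_wr)
{
    int fd, calls = 0;

    for (fd = 0; fd <= maxfd; fd++) {
        uint32_t now = wanted_events(fd, rd, wr);
        uint32_t before = wanted_events(fd, old_rd, old_wr);
        struct epoll_event ev = { .events = now };
        int op;

        if (now == before)
            continue;                   /* nothing changed for this fd */
        if (before == 0)
            op = EPOLL_CTL_ADD;         /* newly interesting */
        else if (now == 0)
            op = EPOLL_CTL_DEL;         /* no longer interesting */
        else
            op = EPOLL_CTL_MOD;         /* read/write interest changed */
        ev.data.fd = fd;
        if (epoll_ctl(epfd, op, fd, &ev) < 0)
            return -1;
        calls++;
    }
    *old_rd = *rd;
    *old_wr = *wr;
    return calls;
}
```

The appeal of this approach is that the rest of a select()-based program can keep using FD_SET() and FD_CLR() unchanged; the cost is a linear scan up to `maxfd` on every pass.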

You can download it from here:

  http://w.ods.org/tools/haproxy/src/devel/

Use version 1.2.6. I added epoll in 1.2.5, so the diff between 1.2.4 and
1.2.5 could help you too.

Good luck !
Willy



Re: select() efficiency / epoll

2005-08-23 Thread Jari Sundell
On 8/23/05, Davy Durham <[EMAIL PROTECTED]> wrote:
> 
> However, I'm getting segfaults because some pointers in places are
> getting set to low integer values (which didn't use to have those values).

Is it possible that you are overwriting the pointers with file
descriptors, as those would have low integer values?

-- 
Rakshasa

Nyaa?


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham
That's probably a good idea.  Where would I find out what other projects 
use it?


Willy Tarreau wrote:

> Hi,
>
> On Tue, Aug 23, 2005 at 06:01:15AM -0500, Davy Durham wrote:
>> I just mean that when I debug and catch the segv, it dies because
>> some pointers now have corrupted values (usually because something is
>> overwriting some memory somewhere).
>>
>> I'm currently re-writing some code to make it use select() instead of
>> epoll_wait() and see if everything is suddenly fixed.  If so, then I
>> will suspect that epoll has a problem.  But it's still not ruled out
>> that it's my fault, since it could be a timing issue that makes the
>> crash show up.
>
> Just out of curiosity, have you had the opportunity to read some other
> code which uses epoll ? Maybe reading others' code could enlighten you
> on potential bugs in your code, potential races, etc...
>
> Regards,
> Willy
 





Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Davy Durham wrote:

> I'm currently re-writing some code to make it use select() instead of
> epoll_wait() and see if everything is suddenly fixed.  If so, then I
> will suspect that epoll has a problem.  But it's still not ruled out
> that it's my fault, since it could be a timing issue that makes the
> crash show up.



Well, the select() replacement works fine... so hrmm..




Re: select() efficiency / epoll

2005-08-23 Thread Willy Tarreau
Hi,

On Tue, Aug 23, 2005 at 06:01:15AM -0500, Davy Durham wrote:
> I just mean that when I debug and catch the segv, it dies because
> some pointers now have corrupted values (usually because something is
> overwriting some memory somewhere).
>
> I'm currently re-writing some code to make it use select() instead of
> epoll_wait() and see if everything is suddenly fixed.  If so, then I
> will suspect that epoll has a problem.  But it's still not ruled out
> that it's my fault, since it could be a timing issue that makes the
> crash show up.

Just out of curiosity, have you had the opportunity to read some other
code which uses epoll ? Maybe reading others' code could enlighten you
on potential bugs in your code, potential races, etc...

Regards,
Willy



Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

bert hubert wrote:

> On Tue, Aug 23, 2005 at 04:49:14AM -0500, Davy Durham wrote:
>> However, I'm getting segfaults because some pointers in places are
>> getting set to low integer values (which didn't use to have those values).
>
> epoll is pretty heavily benchmarked and hence tested. I don't entirely
> understand the remark above and suggest looking at the generated core dumps.

I just mean that when I debug and catch the segv, it dies because
some pointers now have corrupted values (usually because something is
overwriting some memory somewhere).


I'm currently re-writing some code to make it use select() instead of
epoll_wait() and see if everything is suddenly fixed.  If so, then I
will suspect that epoll has a problem.  But it's still not ruled out
that it's my fault, since it could be a timing issue that makes the
crash show up.





Re: select() efficiency / epoll

2005-08-23 Thread bert hubert
On Tue, Aug 23, 2005 at 04:49:14AM -0500, Davy Durham wrote:

> However, I'm getting segfaults because some pointers in places are
> getting set to low integer values (which didn't use to have those values).

epoll is pretty heavily benchmarked and hence tested. I don't entirely
understand the remark above and suggest looking at the generated core dumps.

-- 
http://www.PowerDNS.com  Open source, database driven DNS Software 
http://netherlabs.nl  Open and Closed source services


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

So, I've been trying to use epoll.. on linux-2.6.11-6mdk


However, I'm getting segfaults because some pointers in places are
getting set to low integer values (which didn't use to have those values).


The deal is that my application is multi-threaded, and I was wondering 
if epoll had issues if you use epoll_ctl while an epoll_wait is waiting 
or something like that.  I'm also compiling with -D_MULTI_THREADED.  I'm 
not new to threading, but am stumped at this point.


I'm not ruling out it being my code, but wanted to ask about epoll since 
it's so new.


Any ideas?

Thanks,
 Davy


bert hubert wrote:

> On Fri, Jul 22, 2005 at 04:18:46PM -0500, Davy Durham wrote:
>> Please forgive and redirect me if this is not the right place to ask
>> this question:
>>
>> I'm looking to write a sort of messaging system that would take input
>> from any number of entities that "register" with it.. it would then
>> route the messages to outputs and so forth..
>
> Look at epoll, or libevent, which uses epoll to be quick in this scenario.


 





Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

So, I've been trying to use epoll.. on linux-2.6.11-6mdk


However, I'm getting segfaults because some pointers in places are 
getting set to low integer values (which didn't used to have those values).


The deal is that my application is multi-threaded, and I was wondering 
if epoll had issues if you use epoll_ctl while an epoll_wait is waiting 
or something like that.  I'm also compiling with -D_MULTI_THREADED.  I'm 
not new to threading, but am stumped at this point.


I'm not ruling out it being my code, but wanted to ask about epoll since 
it's so new.


Any ideas?

Thanks,
 Davy


bert hubert wrote:


On Fri, Jul 22, 2005 at 04:18:46PM -0500, Davy Durham wrote:
 

Please forgive and redirect me if this is not the right place to ask 
this question:


I'm looking to write a sort of messaging system that would take input 
from any number of entities that register with it.. it would then 
route the messages to outputs and so forth..
   



Look at epoll, or libevent, which uses epoll to be quick in this scenario.


 



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread bert hubert
On Tue, Aug 23, 2005 at 04:49:14AM -0500, Davy Durham wrote:

 However, I'm getting segfaults because some pointers in places are 
 getting set to low integer values (which didn't used to have those values).

epoll is pretty heavily benchmarked and hence tested. I don't entirely
understand the remark above and suggest looking at the generated core dumps.

-- 
http://www.PowerDNS.com  Open source, database driven DNS Software 
http://netherlabs.nl  Open and Closed source services
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

bert hubert wrote:


On Tue, Aug 23, 2005 at 04:49:14AM -0500, Davy Durham wrote:

 

However, I'm getting segfaults because some pointers in places are 
getting set to low integer values (which didn't used to have those values).
   



epoll is pretty heavily benchmarked and hence tested. I don't entirely
understand the remark above and suggest looking at the generated core dumps.

 

I just mean that when  I debug and catch the segv, it's dies because 
some pointers now have corrupted values.  (usually because something is 
overwriting some memory some where)


I'm currently re-writing some code to make it use select() instead of 
epoll_wait() and see if everything is suddently fixed.  If so, then I 
will suspect that epoll has a problem.  But it's still not ruled out 
being my fault since it could be a timing issue that makes the crash 
show up.



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Willy Tarreau
Hi,

On Tue, Aug 23, 2005 at 06:01:15AM -0500, Davy Durham wrote:
 I just mean that when  I debug and catch the segv, it's dies because 
 some pointers now have corrupted values.  (usually because something is 
 overwriting some memory some where)
 
 I'm currently re-writing some code to make it use select() instead of 
 epoll_wait() and see if everything is suddently fixed.  If so, then I 
 will suspect that epoll has a problem.  But it's still not ruled out 
 being my fault since it could be a timing issue that makes the crash 
 show up.

Just out of curiosity, have you had the opportunity to read some other
code which uses epoll ? Maybe reading others code could enlighten you
on potential bugs in your code, potential races, etc...

Regards,
Willy

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Davy Durham wrote:



I'm currently re-writing some code to make it use select() instead of 
epoll_wait() and see if everything is suddently fixed.  If so, then I 
will suspect that epoll has a problem.  But it's still not ruled out 
being my fault since it could be a timing issue that makes the crash 
show up.



Well, the select() replacement works fine... so hrmm..


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham
That's probably a good idea.  Where would I find out what other projects 
use it?


Willy Tarreau wrote:


Hi,

On Tue, Aug 23, 2005 at 06:01:15AM -0500, Davy Durham wrote:
 

I just mean that when  I debug and catch the segv, it's dies because 
some pointers now have corrupted values.  (usually because something is 
overwriting some memory some where)


I'm currently re-writing some code to make it use select() instead of 
epoll_wait() and see if everything is suddently fixed.  If so, then I 
will suspect that epoll has a problem.  But it's still not ruled out 
being my fault since it could be a timing issue that makes the crash 
show up.
   



Just out of curiosity, have you had the opportunity to read some other
code which uses epoll ? Maybe reading others code could enlighten you
on potential bugs in your code, potential races, etc...

Regards,
Willy
 



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Jari Sundell
On 8/23/05, Davy Durham [EMAIL PROTECTED] wrote:
 
 However, I'm getting segfaults because some pointers in places are
 getting set to low integer values (which didn't used to have those values).

Is it possible that you are overwritting the pointers with file
descriptors, as those would have low integer values?

-- 
Rakshasa

Nyaa?
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Willy Tarreau
On Tue, Aug 23, 2005 at 06:24:42AM -0500, Davy Durham wrote:
 That's probably a good idea.  Where would I find out what other projects 
 use it?

I use it in my load-balancer (haproxy), and it could somewhat match your
needs, because I ported the select()-based earlier version to epoll() with
the smallest possible changes. Indeed, the new epoll() loop still uses the
FD_ISSET() to determine what to do with epoll_ctl(). If you have changed
your code to use select(), you may find similarities. But I want to tell
you from now that my code is NOT multi-threaded. It could be a bug in the
epoll implementation, because I don't think that there are so many
applications using epoll on MT models. Bert says that the epoll implementation
is heavily benchmarked, which is true, but which does not guarantee that it
is tested under every condition.

You can download it from there :

  http://w.ods.org/tools/haproxy/src/devel/

Use version 1.2.6. I added epoll in 1.2.5, so the diff between 1.2.4 and
1.2.5 could help you too.

Good luck !
Willy

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Jari Sundell wrote:


On 8/23/05, Davy Durham [EMAIL PROTECTED] wrote:
 


However, I'm getting segfaults because some pointers in places are
getting set to low integer values (which didn't used to have those values).
   



Is it possible that you are overwritting the pointers with file
descriptors, as those would have low integer values?

 

Yes, that is what I was thinking and is why I mentioned that.  But I'm 
apparently not overwriting the pointers with FDs.. it seems that epoll 
is the cause at this point (unless I'm misusing the epoll API).  I've 
made some changes to now use select() instead of epoll and things work 
flawlessly (although it obviously won't work as efficiently when I 
really connect a lot of clients to this server)




-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham
Thanks for the info.. I did find this thread and was wondering if this 
patch ever got put in


http://www.ussg.iu.edu/hypermail/linux/kernel/0303.3/1139.html



Willy Tarreau wrote:


On Tue, Aug 23, 2005 at 06:24:42AM -0500, Davy Durham wrote:
 

That's probably a good idea.  Where would I find out what other projects 
use it?
   



I use it in my load-balancer (haproxy), and it could somewhat match your
needs, because I ported the select()-based earlier version to epoll() with
the smallest possible changes. Indeed, the new epoll() loop still uses the
FD_ISSET() to determine what to do with epoll_ctl(). If you have changed
your code to use select(), you may find similarities. But I want to tell
you up front that my code is NOT multi-threaded. It could be a bug in the
epoll implementation, because I don't think there are many applications
using epoll in multi-threaded models. Bert says that the epoll implementation
is heavily benchmarked, which is true, but that does not guarantee that it
is tested under every condition.

You can download it from here:

 http://w.ods.org/tools/haproxy/src/devel/

Use version 1.2.6. I added epoll in 1.2.5, so the diff between 1.2.4 and
1.2.5 could help you too.

Good luck !
Willy



Re: select() efficiency / epoll

2005-08-23 Thread Willy Tarreau
On Tue, Aug 23, 2005 at 06:55:26AM -0500, Davy Durham wrote:
 Thanks for the info.. I did find this thread and was wondering if this 
 patch ever got put in
 
 http://www.ussg.iu.edu/hypermail/linux/kernel/0303.3/1139.html
 

Interesting ! At least it does not seem to be present in the
epoll-2.4.24-0.20 I have right here, and although the code changed
significantly in 2.6, it does not seem to contain it either. But I
don't even see how to merge this into 2.6. You should ask Davide,
he knows this code better than anyone else, and could tell us if
this patch was simply lost or is unneeded.

Regards,
Willy



Re: select() efficiency / epoll

2005-08-23 Thread Jari Sundell
On 8/23/05, Davy Durham [EMAIL PROTECTED] wrote:

 Yes, that is what I was thinking and is why I mentioned that.  But I'm
 apparently not overwriting the pointers with FDs.. it seems that epoll
 is the cause at this point (unless I'm misusing the epoll API).  I've
 made some changes to now use select() instead of epoll and things work
 flawlessly (although it obviously won't work as efficiently when I
 really connect a lot of clients to this server)

I was hoping you would mention in your reply that you knew
epoll_data_t was a union and you didn't touch epoll_data::fd, so I
wouldn't have to say it explicitly. ;)

-- 
Rakshasa

Nyaa?


Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Jari Sundell wrote:


On 8/23/05, Davy Durham [EMAIL PROTECTED] wrote:
 


I was hoping you would mention in your reply that you knew
epoll_data_t was a union and you didn't touch epoll_data::fd, so I
wouldn't have to say it explicitly. ;)

 

No, I saw that epoll_data_t was a union (although, it kind of makes the 
ptr useless as a user data pointer.. but I'm not using it for that)


When I say that pointers are getting corrupted, I just mean in other 
parts of the code (actually it's some C++ STL container's data and is 
completely unrelated to the epoll specific code)  Something, somewhere 
seems to be writing to memory that it's not supposed to be writing to.  
And as far as I can tell, it happens when I use epoll and doesn't when I 
use select  :-/


-- Davy






Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Jari Sundell wrote:


On 8/23/05, Davy Durham [EMAIL PROTECTED] wrote:

 


I was hoping you would mention in your reply that you knew
epoll_data_t was a union and you didn't touch epoll_data::fd, so I
wouldn't have to say it explicitly. ;)

 

Oh!.. unless the epoll_data_t is a union just for convenience in that it 
already has an 'int fd' if you want to use that, but don't have to.. 
that at least makes the void *ptr useful..  The example in 'man epoll' 
sorta made it look necessary to set the 'fd' of the union.


But that still doesn't fix the issue of course.. but good to know.
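
For reference, the union point can be sketched as follows. This is an
illustration only, not code from anyone's application: struct conn,
watch_conn() and fd_clobbers_ptr() are hypothetical names. The assumption
is that the application stores a pointer in data.ptr and keeps the
descriptor inside its own record, so it never needs to set data.fd.

```c
#include <assert.h>
#include <string.h>
#include <sys/epoll.h>

/* Hypothetical per-connection record kept by the application. */
struct conn {
    int fd;        /* keep the descriptor here instead of in ev.data.fd */
    void *user;    /* whatever else the application needs */
};

/* Register c->fd for reading and store the record pointer in the union. */
static int watch_conn(int epfd, struct conn *c)
{
    struct epoll_event ev;

    assert(c != NULL);
    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.ptr = c;            /* set exactly one member of the union */
    /* ev.data.fd = c->fd;         would overwrite bytes of data.ptr */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev);
}

/* Returns 1 if storing fd after p changes the pointer read back. */
static int fd_clobbers_ptr(void *p, int fd)
{
    epoll_data_t d;

    d.ptr = p;
    d.fd = fd;                  /* same storage as d.ptr */
    return d.ptr != p;
}
```

When epoll_wait() later reports the event, events[i].data.ptr gives back the
struct conn pointer and the descriptor is read from the record. Setting both
members is the mistake to look for: on a 32-bit build the pointer read back
would be the descriptor number itself, which looks exactly like a pointer
set to a low integer value.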


Re: select() efficiency / epoll

2005-08-23 Thread Davide Libenzi

On Tue, 23 Aug 2005, Willy Tarreau wrote:


On Tue, Aug 23, 2005 at 06:55:26AM -0500, Davy Durham wrote:

Thanks for the info.. I did find this thread and was wondering if this
patch ever got put in

http://www.ussg.iu.edu/hypermail/linux/kernel/0303.3/1139.html



Interesting ! At least it does not seem to be present in the
epoll-2.4.24-0.20 I have right here, and although the code changed
significantly in 2.6, it does not seem to contain it either. But I
don't even see how to merge this into 2.6. You should ask Davide,
he knows this code better than anyone else, and could tell us if
this patch was simply lost or is unneeded.


I should mention that the 2.4 patch is old WRT mainline epoll in 2.6 (I 
stopped maintaining it when 2.6 went stable). I'd definitely suggest 
using 2.6 if you are looking at epoll.




- Davide



Re: select() efficiency / epoll

2005-08-23 Thread Davy Durham

Davide Libenzi wrote:



I should mention that the 2.4 patch is old WRT mainline epoll in 2.6 
(I stopped maintaining it when 2.6 went stable). I'd definitely 
suggest using 2.6 if you are looking at epoll.


I am using linux-2.6.11 and glibc-2.3.4, and using select() in its 
place seems to work fine.  Are there any known issues with, say, one 
thread doing epoll_wait()s while other threads may be doing epoll_ctl()s?


Is there someone else I should be asking this question?

Thanks,
 Davy


Re: select() efficiency / epoll

2005-08-23 Thread Davide Libenzi

On Tue, 23 Aug 2005, Davy Durham wrote:


Davide Libenzi wrote:



I should mention that the 2.4 patch is old WRT mainline epoll in 2.6 (I 
stopped maintaining it when 2.6 went stable). I'd definitely suggest 
using 2.6 if you are looking at epoll.


I am using linux-2.6.11 and glibc-2.3.4, and using select() in its place 
seems to work fine.  Are there any known issues with, say, one thread doing 
epoll_wait()s while other threads may be doing epoll_ctl()s?


There is no known problem in using epoll_ctl() in one thread while another 
does epoll_wait().
I suggest you ask Valgrind to take a look at your binary. Since I have 
no clue what your software does, please create a *minimal* code 
snippet that reproduces the possible problem, and post it.
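
As a starting point for such a snippet, the basic pattern (one thread
blocked in epoll_wait() while another calls epoll_ctl()) might look like the
sketch below. It is only an illustration and does not reproduce the reported
corruption; struct adder_args, adder() and cross_thread_add_demo() are
hypothetical names, and a pipe stands in for the real sockets.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

struct adder_args {
    int epfd;   /* epoll instance the main thread waits on */
    int rfd;    /* read end of a pipe, registered by the second thread */
    int wfd;    /* write end, used to make rfd readable */
};

/* Runs in the second thread: register rfd, then make it readable. */
static void *adder(void *p)
{
    struct adder_args *a = p;
    struct epoll_event ev;

    assert(a != NULL);
    poll(NULL, 0, 100);         /* give the waiter time to block (best effort) */
    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.fd = a->rfd;
    if (epoll_ctl(a->epfd, EPOLL_CTL_ADD, a->rfd, &ev) != 0)
        return (void *)1;
    if (write(a->wfd, "x", 1) != 1)
        return (void *)1;
    return NULL;
}

/* Returns 0 if epoll_wait() reports the fd added by the other thread. */
static int cross_thread_add_demo(void)
{
    struct adder_args a;
    struct epoll_event out;
    pthread_t t;
    void *res = NULL;
    int p[2], n, rc = -1;

    if (pipe(p) != 0)
        return -1;
    a.rfd = p[0];
    a.wfd = p[1];
    a.epfd = epoll_create(8);
    if (a.epfd >= 0 && pthread_create(&t, NULL, adder, &a) == 0) {
        n = epoll_wait(a.epfd, &out, 1, 5000);   /* set is empty at first */
        pthread_join(t, &res);
        if (n == 1 && out.data.fd == a.rfd && res == NULL)
            rc = 0;
    }
    if (a.epfd >= 0)
        close(a.epfd);
    close(p[0]);
    close(p[1]);
    return rc;
}
```

Calling cross_thread_add_demo() from main() and building with -pthread is
enough to try it; running the result under Valgrind would then show whether
the stray writes come from the application side.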



- Davide
