Re: Piggy-backing lock-free queue synchronization on ev_async membar
Also, the stuff about event compression aside, you'll need some kind of
barrier operations on the queue itself at the very least if you want this
to work reliably on all SMP machines under all circumstances (e.g. Marc's
libecb stuff).

On Tue, Mar 18, 2014 at 7:01 AM, Evgeny Zajcev wrote:
> 2014-03-18 13:48 GMT+03:00 Konstantin Osipov:
>> Hi,
>>
>> In our program we have 2 threads, both running ev_loops, and we would
>> like to organize a simple producer-consumer message queue from one
>> thread to the other. For the purpose of this discussion let's assume
>> a message is a simple integer.
>>
>> It seems that we could implement such a data structure in a completely
>> lock-less manner. Please consider the following implementation:
>>
>> enum { QUEUE_SIZE = 100 };
>>
>> struct queue {
>>     int q[QUEUE_SIZE];
>>     int wpos; /* todo: cacheline aligned */
>>     int rpos; /* todo: cacheline aligned */
>>     ev_async async;
>> };
>>
>> /* Done in the producer. */
>> void
>> queue_init(struct queue *q, ev_loop *consumer)
>> {
>>     q->rpos = q->wpos = 0;
>>     ev_async_init(&q->async, consumer);
>> }
>>
>> /* For use in the producer thread only. */
>> void
>> queue_put(struct queue *q, int i)
>> {
>>     if (q->wpos == QUEUE_SIZE)
>>         q->wpos = 0;
>>     q->q[q->wpos++] = i;
>>     ev_async_send(&q->async);
>> }
>>
>> /*
>>  * For use in the consumer thread only, in the event handler
>>  * for q->async.
>>  */
>> int
>> queue_get(struct queue *q)
>> {
>>     if (q->rpos == QUEUE_SIZE)
>>         q->rpos = 0;
>>     return q->q[q->rpos++];
>> }
>>
>> Let's put aside the problem of rpos and wpos running over each other,
>> for simplicity. The question is only: provided that QUEUE_SIZE is
>> sufficient for our production loads, would the memory barrier built
>> into ev_async_send be sufficient to ensure the correct read ordering
>> of this queue?
>
> The reading order would be ok of course.
> However you should take into account that "multiple events might get
> compressed into a single callback invocation", so the consumer thread
> may have to consume multiple items from the queue upon callback. You
> might need to create some logic to prevent under/over-consuming of
> items.
>
> --
> lg

___
libev mailing list
libev@lists.schmorp.de
http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
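A sketch of what "barrier operations on the queue itself" could look like, using C11 atomics (my construction, not from the thread; the names ring_put/ring_get are mine). The ring carries its own release/acquire ordering instead of relying on the barrier inside ev_async_send(), and the consumer is meant to drain in a loop, which also copes with libev compressing several ev_async_send() calls into one callback:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 128  /* power of two so we can mask instead of wrap-compare */

struct ring {
    int slots[RING_SIZE];
    _Atomic unsigned wpos;  /* written by producer only */
    _Atomic unsigned rpos;  /* written by consumer only */
};

/* Producer side: returns false when the ring is full.
 * A real producer would follow a successful put with ev_async_send(). */
static bool ring_put(struct ring *r, int v)
{
    unsigned w  = atomic_load_explicit(&r->wpos, memory_order_relaxed);
    unsigned rd = atomic_load_explicit(&r->rpos, memory_order_acquire);
    if (w - rd == RING_SIZE)
        return false;                        /* full */
    r->slots[w & (RING_SIZE - 1)] = v;
    /* release: the slot write becomes visible before the new wpos */
    atomic_store_explicit(&r->wpos, w + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_get(struct ring *r, int *out)
{
    unsigned rd = atomic_load_explicit(&r->rpos, memory_order_relaxed);
    unsigned w  = atomic_load_explicit(&r->wpos, memory_order_acquire);
    if (rd == w)
        return false;                        /* empty */
    *out = r->slots[rd & (RING_SIZE - 1)];
    atomic_store_explicit(&r->rpos, rd + 1, memory_order_release);
    return true;
}
```

In the ev_async callback you would loop `while (ring_get(&q, &v)) handle(v);` so that one compressed wakeup still drains every pending item.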
Re: Mixing async and sync after a fork
On Thu, Jan 10, 2013 at 10:05 AM, Bo Lorentsen wrote:
> The thing is ... this works amazingly, until I try to start up this app
> as a daemon (fork). And yes, when I start it as a daemon I make sure to
> call ev::post_fork(), and my default loop starts up as it is supposed
> to (but later, or lazily really), without any problems (in its own
> libev thread), but it seems like any socket connections made after the
> fork, between ev::post_fork() and ev::default_loop().run(), are invalid
> after calling run.

You shouldn't even need post_fork() just to do daemonization. Just
daemonize first and then initialize libev stuff afterwards.

My first random guess at your problem would be that you're using some
daemonization library/code written by someone else, and that it closes
all open file descriptors during the process (a common programming error
in daemonization code). I've used libev in a ton of daemonized C code
without issue, as I'm sure many others have, so it's unlikely to be a bug
in libev.
Re: memory leak or something wrong
On Mon, Jun 18, 2012 at 4:50 PM, Brandon Black wrote:
> However, whether it's a simple bug or somehow intentional behavior is
> unclear to me given all that arguing about realloc(p,0) between
> glibc/C/POSIX people.

Shortly after the last email, I noticed the realloc() docs in Fedora 16
contain this sentence about realloc(): "If size was equal to 0, either
NULL or a pointer suitable to be passed to free() is returned." Sure
enough, realloc(p, 0) returns a non-NULL pointer, and free() of that
return value also stops the leak.

Another summary of some of the differences of interpretation going on
here: http://lists.gnu.org/archive/html/bug-gnulib/2011-03/msg00243.html

IMHO, in a world with that kind of mess going on, the new "safe" behavior
for portable code may just be to never use realloc(p,0), because you
can't trust what it means :(

-- Brandon
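A minimal standalone probe (my code, not from the thread) of the behavior described above: per C99/POSIX, realloc(p, 0) may return either NULL or a pointer suitable for passing to free(); glibc historically returns the latter, and freeing that return value is what stops the "leak". The result is platform-dependent, so no particular output is guaranteed:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Probe what the local libc does for realloc(p, 0), freeing whatever
 * comes back so neither interpretation leaks. */
const char *probe_realloc_zero(void)
{
    void *p = malloc(64);
    void *q = realloc(p, 0);
    if (q) {
        free(q);           /* non-NULL: we still own it, so free it */
        return "non-NULL";
    }
    return "NULL";         /* NULL: realloc already released p */
}
```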
Re: memory leak or something wrong
On Mon, Jun 18, 2012 at 3:06 PM, Marc Lehmann wrote:
> It could also be some esoteric form of internal memory fragmentation,
> which I would also consider a bug in the allocator.

This is what I suspect as well, as the "leak" is a very slow growth in
the Data size, but valgrind won't report it. However, whether it's a
simple bug or somehow intentional behavior is unclear to me given all
that arguing about realloc(p,0) between glibc/C/POSIX people.

Modifying the original example loop to compile on my box with the
distro's libev-4.04, as:

#include <time.h>
#include "libev/ev.h"

int main(int argc, char* argv[]) {
    while (1) {
        struct timespec slp;
        slp.tv_sec = 0;
        slp.tv_nsec = 10*1e6;
        struct ev_loop *evp = ev_loop_new( 0 );
        ev_loop_destroy( evp );
        nanosleep( &slp, NULL );
    }
}

Then running it and monitoring the VmData size via procfs, I see results
like:

[blblack@mysteron ~]$ while [ 1 ]; do grep VmData /proc/30936/status; sleep 5; done
VmData:     2352 kB
VmData:     2736 kB
VmData:     2992 kB
VmData:     3376 kB
VmData:     3632 kB
VmData:     4016 kB

Using ev_set_allocator() to transform realloc(p,0) into free(p) calls
stops the growth.

-- Brandon
Re: memory leak or something wrong
On Mon, Jun 18, 2012 at 12:43 AM, yao liufeng wrote:
> can someone help to clarify whether it's a memory leak or something
> wrong with the code? Thanks!

Googling around, it seems like in the past year or two there's been some
flareup between C99/C1X, POSIX, and glibc about interpretations of what
the right/acceptable/conforming behavior of realloc(p,0) is:

http://austingroupbugs.net/view.php?id=400
http://sourceware.org/bugzilla/show_bug.cgi?id=12547

In practice, on Fedora 16 (glibc 2.14), I've observed a very small leak
on code like yours (you can even leave libev out of this; it's all about
alloc/dealloc cycles using only the realloc() interface), which is
fixable by replacing realloc(p,0) calls with free(p). libev already works
around realloc(p,0) on non-glibc targets; it may just have to not use
realloc(p,0) anywhere now as the simplest solution.

You can test for yourself whether that's the issue by doing something
like:

void* my_realloc(void* ptr, long size) {
    if (size)
        return realloc(ptr, size);
    free(ptr);
    return 0;
}

ev_set_allocator(my_realloc);

-- Brandon
Re: ev_io_set
On Mon, Jun 11, 2012 at 4:39 AM, 钱晓明 wrote:
> If I have started an io_watcher with the EV_READ event, but later I
> want to set the EV_WRITE event in the callback function of EV_READ,
> should I have to call ev_io_stop first, then ev_io_set and ev_io_start
> again?
> If all messages have been sent, should I call ev_io_stop again, clear
> the EV_WRITE event and ev_io_start again?

It would probably be more efficient to simply register separate callbacks
for EV_READ and EV_WRITE so you can start/stop them independently.

-- Brandon
Re: multi-ev_loop on UDP socket
On Mon, Jun 4, 2012 at 12:36 PM, Brandon Black wrote:
> But you can scale up to more packets/sec handled on a given machine by
> doing one thread-per-socket (-per-core) and distributing the packets

I forgot to mention in the previous email: another area to look at for
high-performance UDP on Linux is the relatively new syscalls sendmmsg()
and recvmmsg(). These can send or receive multiple UDP packets in a
single syscall. This reduces the syscall overhead (you're bouncing out to
kernel space less often and picking up/dropping off packets in batches),
which further improves the performance of looping on a single socket.

In theory it could reduce internal kernel overhead as well (e.g. lock the
socket once for all packets), but last I checked a lot of those
kernel-internal optimizations hadn't yet been implemented; it more or
less just loops over sendmsg() internally in the kernel. By using this
interface you'll get those optimizations if they ever arrive, though, and
just killing some syscall overhead for now is still a nice help.

-- Brandon
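A Linux-only sketch of the batched syscalls mentioned above (my code; error handling is minimal and this is a shape demo, not a production receive loop): send two UDP datagrams over loopback with one sendmmsg() call and pull both back with one recvmmsg() call:

```c
#define _GNU_SOURCE            /* for struct mmsghdr, sendmmsg, recvmmsg */
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the number of datagrams received by a single recvmmsg() call. */
int demo_mmsg(void)
{
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bind(rx, (struct sockaddr *)&addr, sizeof addr);  /* kernel picks a port */
    socklen_t alen = sizeof addr;
    getsockname(rx, (struct sockaddr *)&addr, &alen);
    connect(tx, (struct sockaddr *)&addr, sizeof addr);

    /* Two datagrams, one syscall. */
    char a[] = "one", b[] = "two";
    struct iovec siov[2] = { { a, sizeof a }, { b, sizeof b } };
    struct mmsghdr out[2];
    memset(out, 0, sizeof out);
    out[0].msg_hdr.msg_iov = &siov[0]; out[0].msg_hdr.msg_iovlen = 1;
    out[1].msg_hdr.msg_iov = &siov[1]; out[1].msg_hdr.msg_iovlen = 1;
    sendmmsg(tx, out, 2, 0);

    /* Both are already queued, so one recvmmsg() should drain both. */
    char b0[16], b1[16];
    struct iovec riov[2] = { { b0, sizeof b0 }, { b1, sizeof b1 } };
    struct mmsghdr in[2];
    memset(in, 0, sizeof in);
    in[0].msg_hdr.msg_iov = &riov[0]; in[0].msg_hdr.msg_iovlen = 1;
    in[1].msg_hdr.msg_iov = &riov[1]; in[1].msg_hdr.msg_iovlen = 1;
    int n = recvmmsg(rx, in, 2, 0, NULL);

    close(tx);
    close(rx);
    return n;
}
```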
Re: multi-ev_loop on UDP socket
On Mon, Jun 4, 2012 at 8:40 AM, Marc Lehmann wrote:
> On Mon, Jun 04, 2012 at 02:44:19PM +0200, Joachim Nilsson wrote:
>> On Mon, Jun 4, 2012 at 2:29 PM, Marc Lehmann wrote:
>> > On Mon, Jun 04, 2012 at 08:41:38AM +0800, 钱晓明 wrote:
>> >> If there is more than one event loop on the same udp socket, and
>> >> each of them in a different thread, will datagrams be processed by
>> >> all threads concurrently? For example, one thread is processing a
>> >> udp message while
> If the processing of the udp packet takes comparatively long, then you
> will have few threads selecting on the fd (in the best case only one on
> average), because the others are busy processing something else.

In the specific case of connectionless UDP packets, if the processing
overhead to handle each incoming packet is very optimized and fast, on
Linux hosts I've found that the Linux socket code itself becomes the
scaling limit. One of my projects is an open source authoritative DNS
server where I did a lot of perf testing on this scenario. Spawning
multiple threads looping on a single socket doesn't help, as they're all
waiting on some lower-level socket serialization in the kernel, so you
get basically the same throughput with 8 threads as you do with 1, even
if you have a bunch of CPU cores to use.

But you can scale up to more packets/sec handled on a given machine by
doing one thread-per-socket (-per-core) and distributing the packets to
these multiple sockets by some other mechanism. E.g. in the DNS case with
an 8-core server, you might use some front-end hardware loadbalancer (or
some Linux iptables/ipvsadm type hacks) to re-distribute the incoming UDP
packets on port 53 to local ports 1060-1067, run a thread+loop per socket
on each of your 8 cores listening on those ports, and of course have the
LB stuff remap the responses to source port 53 afterwards as well. Then
you can see real scaling over anything you might try to do against a
single UDP socket.
This all depends, as Marc was saying, on how much per-packet processing
overhead you have in your daemon. If the overhead were higher then yes,
you might be better off with several threads/loops on a single socket.
I'd say as a very, very rough rule of thumb: if your processing code is
light and optimal enough that it could handle somewhere in the ballpark
of, say, 20-50K (or higher) pps on a single CPU core on the target
machine, you're probably in the territory where the socket is the limit
and it's time to do thread-per-socket as described above.

-- Brandon
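For what it's worth, since Linux 3.9 the SO_REUSEPORT socket option offers a kernel-native version of the port-fanout trick described above: N sockets bind the same UDP port and the kernel hashes incoming packets across them, so each thread+loop can own one socket with no external load balancer. This sketch (my addition, not from the thread) just shows two such sockets binding the same port:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns a UDP socket bound on loopback with SO_REUSEPORT, or -1.
 * If *port is 0, the kernel picks a free port and writes it back, so a
 * second call with the same *port lands on the same port. */
int reuseport_udp_socket(int *port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    int one = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof one);

    struct sockaddr_in a;
    memset(&a, 0, sizeof a);
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a.sin_port = htons((unsigned short)*port);
    if (bind(fd, (struct sockaddr *)&a, sizeof a) < 0) {
        close(fd);
        return -1;
    }
    socklen_t alen = sizeof a;
    getsockname(fd, (struct sockaddr *)&a, &alen);
    *port = ntohs(a.sin_port);   /* report the port actually bound */
    return fd;
}
```

Each worker thread would call this once and then run its own ev_loop over its own fd, exactly the one-socket-per-thread-per-core layout described above.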
Re: libev SIGCHLD handler
On Thu, May 10, 2012 at 6:28 AM, Vetoshkin Nikita wrote:
> So a correct SIGCHLD handler should loop calling waitpid() until no
> zombies exist.
> Please correct me if I'm wrong.

The code does appear to "loop" on waitpid as you describe; it just does
so by feeding itself new events until waitpid() indicates there are no
more children. See childcb() in ev.c.
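The drain behavior being discussed amounts to the classic waitpid(WNOHANG) loop. A standalone sketch (my code and naming, not libev's actual childcb() implementation): keep calling waitpid() until it reports no more exited children, so one coalesced SIGCHLD delivery can never strand extra zombies:

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every currently-exited child; returns how many were reaped.
 * WNOHANG keeps this safe to call even when no child has exited. */
int reap_children(void)
{
    int reaped = 0, status;
    pid_t pid;
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
        ++reaped;   /* a real handler would dispatch on pid/status here */
    return reaped;  /* loop ends at 0 (children still running) or -1 (none left) */
}
```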
Re: libev release candidate
On Tue, May 8, 2012 at 10:46 AM, Marc Lehmann wrote:
> On Tue, May 08, 2012 at 10:33:43AM -0500, Brandon Black wrote:
>> Then it works. If I change j++ to ++j in the above, it still works.
>
> another bug... can you try with this loop:

That one works (or at least, doesn't throw an assert when running my app
testsuite w/ EV_VERIFY=3).

-- Brandon
Re: libev release candidate
On Tue, May 8, 2012 at 9:47 AM, Marc Lehmann wrote:
> if (j++ & 1)
>   {
>     assert (("libev: io watcher list contains a loop", w != w2));
>     w2 = w2->next;
>   }
>
> And see if that makes it go away? (I never implemented this particular
> algorithm before :)

Still the same assert failure with that chunk of code. I blindly tried a
few other variants on the theme as well. If I move the assert after the
assignment, e.g.:

if (j++ & 1)
  {
    w2 = w2->next;
    assert (("libev: io watcher list contains a loop", w != w2));
  }

Then it works. If I change j++ to ++j in the above, it still works. Other
obvious variations all produce assert failures (moving the assert back
above the assignment, or outside of the if-branch completely).

-- Brandon
Re: libev release candidate
On Sun, May 6, 2012 at 2:52 PM, Marc Lehmann wrote:
> It would be nice to get some compile feedback on this one. If all goes
> well, the real release will follow shortly.

I've tried the update in an app that embeds libev. Using the previous two
releases of libev, the app passed its own testsuite both in debug mode
(which also enables all EV_VERIFY stuff) and in "normal" mode (no
EV_VERIFY). If I swap in the code from this tarball, it still passes its
own testsuite in "normal" mode, but w/ EV_VERIFY on I get an assert
failure:

Assertion '("libev: io watcher list contains a loop", w != w2)' failed in ev_verify() at ./libev/ev.c:2585

I haven't yet had time to look into this in detail. It could well be that
this is just newly exposing an existing API usage bug in my application
code, but I thought I'd report early rather than wait till I sort it out,
in case it makes more sense to someone else.

-- Brandon
Re: empty loop, adding watchers, one minute idle
2012/3/25 Zoltán Lajos Kis:
> Even if you lock the mutex before the call to ev_ref and ev_run, the
> outcome is still the same: the one minute delay remains.

From the docs for ev_set_loop_release_cb:

    While event loop modifications are allowed between invocations of
    release and acquire (that's their only purpose after all), no
    modifications done will affect the event loop, i.e. adding watchers
    will have no effect on the set of file descriptors being watched, or
    the time waited. Use an ev_async watcher to wake up ev_run when you
    want it to take note of any changes you made.
Re: Re: Timeout event resolution
On Thu, Jan 26, 2012 at 2:33 PM, wrote:
> [some code]

The code initially sets "after" to 0.1 and repeat to 0.0, which will
initially fire the callback after 0.1 seconds. Inside the callback,
you're setting a repeat of 1.0 seconds via "w->repeat = 1." and
ev_timer_again(), which causes it to continue firing every 1.0 seconds
afterwards. Check the docs for ev_timer_init(), etc., if that doesn't
make sense...

-- Brandon
Re: Timeout event resolution
On Thu, Jan 26, 2012 at 12:17 PM, wrote:
> I develop using Linux 2.6.9 and libev 4.04. I can't upgrade the kernel
> version, as it is part of a larger system.
> My problem is that I could not get libev to generate 100ms timeouts.
> It generates consistent timeouts around 1s, nothing less than that.
> I tried on another machine with Ubuntu and a 2.6.3x kernel... but I got
> the same results.
> I'm passing the interval as 0.1

The fact that your code isn't getting subsecond timeouts on the Ubuntu
kernel either should rule out a 2.6.9-specific problem. Perhaps paste
your code? More likely than not, there's some basic bug in it.
Re: libev thread unsafe api
On Sat, Jan 14, 2012 at 1:23 AM, Zaheer Ahmad wrote:
> "Whenever you want to start/stop a watcher or do other modifications to
> an event loop," so it's unclear to me which other APIs are not thread
> safe apart from "ev_<>_start/stop". Are ev_<>_init, ev_feed_event,
> ev_clear_pending, ev_async_send safe?

ev_async_send() is explicitly documented as thread-safe. For other
functions, as you've quoted above: if they modify the event loop, they're
not thread-safe. So from your list, ev_feed_event() and
ev_clear_pending() are not thread-safe because they modify an event loop.

ev_X_init doesn't touch the loop, but it's only as safe as any other
variable initialization. It doesn't do any internal locking, so two
concurrent ev_X_init calls on the same memory from different threads of
execution would be unsafe.

-- Brandon
Re: Feature request: ability to use libeio with multiple event loops
On Tue, Jan 3, 2012 at 5:02 AM, Yaroslav wrote:
> Interesting observation: removing the __thread storage class makes the
> thread data shared by all threads. Even without any locks, concurrent
> modifications of the same memory area result in a 5-10 fold test time
> increase. I.e., a shared variable write is about 5-10 times slower than
> a non-shared one, even without any locks.

I assume you mean concurrent writes, of course. Implicitly shared memory
between threads should have regular access time when there's no false
sharing with other threads writing to nearby memory.
Re: Feature request: ability to use libeio with multiple event loops
On Mon, Jan 2, 2012 at 4:49 PM, Colin McCabe wrote:
> The problem is that there's no way for the programmer to distinguish
> "the data that really needs to be shared" from the data that shouldn't
> be shared between threads. Even in C/C++, all you can do is insert
> padding and hope for the best. And you'll never know how much to
> insert, because it's architecture specific.

For large chunks of data anyway, you can just go directly to mmap() for
memory and know it's on a completely different page than other
allocations. That doesn't solve all cases, of course.

> malloc doesn't allow you to specify which threads will be accessing
> the data. It's quite possible that the memory you get back from
> malloc will be on the same cache line as another allocation that a
> different thread got back. malloc implementations that use per-thread
> storage, like tcmalloc, can help a little bit here. But even then,
> some allocations will be accessed by multiple threads, and they may
> experience false sharing with allocations that are intended to be used
> by only one thread.

This one is a really sore point for sure. I really wish there were a
standard malloc interface where threads could allocate from pools that
nominally belong to other threads. Sometimes you find yourself in a
situation where one thread has to do the allocating, but you know in
advance the memory will mostly "belong" to a different specific thread
for most of its life. Being able to hint these (and other related)
conditions to something like tcmalloc would be nice.

> Threads do have an advantage in that you won't be loading the ELF
> binary twice, which will save some memory. However, even when using
> multiple processes, shared libraries will only be loaded once.

Usually even with procs, the read-only text segments of the ELF binaries
should be shared as well, right?

-- Brandon
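The "insert padding and hope" point above can be made concrete with C11's alignas, which is roughly the only portable-ish lever available (you still have to guess the cache line size). A small illustration of the layout trick (my construction): two per-thread counters forced onto separate 64-byte lines so their writers never fight over one cache line:

```c
#include <assert.h>
#include <pthread.h>
#include <stdalign.h>

struct padded_counter {
    alignas(64) long value;   /* each counter owns its own (assumed) cache line */
};

static struct padded_counter counters[2];

static void *bump(void *arg)
{
    struct padded_counter *c = arg;
    for (long i = 0; i < 1000000; ++i)
        c->value++;           /* exclusive line: no cross-core ping-pong */
    return NULL;
}

/* Run both writers concurrently; returns the combined count. */
long run_two_threads(void)
{
    pthread_t t1, t2;
    counters[0].value = counters[1].value = 0;
    pthread_create(&t1, NULL, bump, &counters[0]);
    pthread_create(&t2, NULL, bump, &counters[1]);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counters[0].value + counters[1].value;
}
```

Dropping the alignas (so both longs share one line) keeps the result identical but typically makes the run measurably slower on multi-core hardware, which is exactly the false-sharing effect discussed above.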
Re: Feature request: ability to use libeio with multiple event loops
On Sat, Dec 31, 2011 at 2:36 PM, Jorge wrote:
> tThreads(seconds): 0.535, tProcs(seconds): 0.573, ratio:0.933
>
> Perhaps I'm doing it wrong?
> Could you run it on other unixes and post the results?

I used "-O3 -pthread" for CFLAGS and got the following results on two
vastly different Linux Xen vhosts, both of which reported 2 CPU cores:

tThreads(seconds): 0.271, tProcs(seconds): 0.294, ratio:0.920
tThreads(seconds): 0.700, tProcs(seconds): 0.716, ratio:0.978

And this on an older local MacOS laptop w/ 4 cores:

tThreads(seconds): 0.939, tProcs(seconds): 0.961, ratio:0.977

So yes, I think it's fair to say that for the workload measured by the
benchmark, threads are probably slightly faster on the tested hosts. But
given the facts that (a) the timings include pthread_create()/
pthread_join() in the threads case and fork()/waitpid() in the procs
case, and (b) the calculation in each thread completes very quickly, most
likely it's the thread/proc spawn/cleanup overhead that's making the
difference, and I imagine threads do create/join faster than
fork/waitpid in the general case.

On the other hand, if I modify kPI to be 5e7 to give them more CPU work
to do per thread/proc invocation, I get the following on my MacOS laptop:

tThreads(seconds): 9.482, tProcs(seconds): 9.341, ratio:1.015

On the two Linux Xen hosts, even at 5e8 the threads still came out barely
ahead, but then again they're Xen hosts, so I'm not sure how relevant
they are given the extra layer there...

-- Brandon
Re: Feature request: ability to use libeio with multiple event loops
On Sat, Dec 31, 2011 at 6:34 AM, Hongli Lai wrote:
> The benchmark tool in the Samba presentation also surprised me. It
> showed that processes are indeed a bit faster at various system calls,
> but only slightly so. It still doesn't make much sense to me though,
> because the kernel already has to protect its data structures against
> concurrent accesses by multiple processes, no matter whether those
> processes are multithreaded as well.

Well, there's the added complication that under threads, the kernel now
has to be careful about user-space memory access as well in some cases.

Related: I've observed some similar automatic slowdowns because of
pthreads cancellation checks. As soon as you're running pthreads, all
sorts of syscalls become cancellation points, and every time they're
invoked a quick check has to be performed to see whether another thread
has sent you a pthread_cancel().
Re: Feature request: ability to use libeio with multiple event loops
On Thu, Dec 22, 2011 at 7:53 AM, Hongli Lai wrote:
> I know that, but as you can read from my very first email I was
> planning on running I threads, with I=number of cores, where each
> thread has 1 event loop. My question now has got nothing to do with the
> threads vs events debate. Marc is claiming that running I *processes*
> instead of I threads is faster thanks to MMU stuff and I'm asking for
> clarification.

Right, so either way an argument based on 2 threads per core is
irrelevant, which is the argument you made in point (2) earlier. It
doesn't make sense to argue about the benefits of threads under a layout
that's known to be suboptimal in larger ways. The threads-vs-procs debate
is about 1-thread-per-core vs 1-proc-per-core, not N-threads-per-core vs
N-procs-per-core.
Re: Feature request: ability to use libeio with multiple event loops
On Thu, Dec 22, 2011 at 1:05 AM, Hongli Lai wrote:
> 2. Suppose the system has two cores and N = 4, so two processes or two
> threads will be scheduled on a single core. A context switch to another
> thread on the same core should be cheaper because 1) the MMU register
> does not have to be swapped and 2) no existing TLB entries have to be
> invalidated.

In the general case for real-world software, running N heavily loaded
threads on one CPU core is going to be less efficient than using a single
thread/process with a non-blocking event loop on that one CPU core.
That's the point of the (fairly well settled, IMHO) debate between
many-blocking-threads-per-core and one-eventloop-thread-per-core. So your
point (2) doesn't really fall in favor of threads, because you're talking
about a scenario they're not optimal for to begin with.

The debate here is really about one process per core versus one thread
per core (and the fine details of the differences between processes and
threads, and how scheduling them is more or less efficient on a given CPU
and OS, given roughly nprocs == nthreads == ncores).

-- Brandon
Re: [timers] - triggering at the same time
On Sat, Dec 17, 2011 at 8:25 PM, Neeraj Rai wrote:
> Let me refine the questions from the last email.

This may be a bit orthogonal to your primary questions or possibly
irrelevant, but if all of these timers have the same absolute duration,
it would be simplest to adopt strategy #4 from the ev_timer documentation
and manage them yourself in a doubly-linked list with a single ev_timer,
maintaining the firing order via the list order. Or, if you have a small
set of fixed durations, a doubly-linked list + ev_timer for each
duration.

-- Brandon
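The bookkeeping behind that single-timer strategy can be sketched standalone (names are mine; the libev glue, i.e. one ev_timer re-armed from next_deadline(), is omitted so the list logic stays self-contained). With one fixed duration, expiries happen in insertion order, so a FIFO doubly-linked list plus a single timer armed for the head entry is enough, and touching or cancelling any entry is O(1):

```c
#include <assert.h>
#include <stddef.h>

struct tnode {
    double deadline;              /* absolute time this entry expires */
    struct tnode *prev, *next;
};

struct tlist { struct tnode *head, *tail; };

/* Arm a timeout: append, because equal durations expire in insertion order. */
void tlist_push(struct tlist *l, struct tnode *n, double now, double dur)
{
    n->deadline = now + dur;
    n->prev = l->tail;
    n->next = NULL;
    if (l->tail) l->tail->next = n; else l->head = n;
    l->tail = n;
}

/* Touch or cancel a timeout: unlink from anywhere in O(1).
 * (A "touch" is remove + push, which moves the entry to the tail.) */
void tlist_remove(struct tlist *l, struct tnode *n)
{
    if (n->prev) n->prev->next = n->next; else l->head = n->next;
    if (n->next) n->next->prev = n->prev; else l->tail = n->prev;
}

/* What the single ev_timer should be armed for; negative means list empty. */
double next_deadline(const struct tlist *l)
{
    return l->head ? l->head->deadline : -1.0;
}
```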
Re: EV and the order of callback invocation
On Wed, Dec 14, 2011 at 2:10 AM, Marios Titas wrote:
> I am trying to use priorities in order to control the order of
> invocation of the callbacks but it doesn't seem to work. For example,
> consider the following perl program:
>
>    use EV;
>    my $pid = fork or exec qw(sleep 1);
>    my $w1; $w1 = EV::child $pid, 0, sub{undef $w1; print "1\n"};
>    my $w2; $w2 = EV::child $pid, 0, sub{undef $w2; print "1\n"};
>    $w2->priority(1);

The ev.pod documentation notes that: "Due to some design glitches inside
libev, child watchers will always be handled at maximum priority (their
priority is set to EV_MAXPRI by libev)".

-- Brandon
Re: libecb patch: fix compile warnings on gcc-llvm 4.2.1 (OS X 10.6 with Xcode 4)
On Thu, Dec 8, 2011 at 4:18 PM, Hongli Lai wrote:
> I don't know whether it's fully implemented, but everything that ecb.h
> uses *is* fully implemented in llvm-gcc 4.2.1. This is pretty huge
> because Xcode 4 now uses llvm-gcc as the default gcc, so all OS X users
> will get these warnings.

I doubt those features are actually fully implemented in OS X's llvm-gcc
4.2.1. Do you have any supporting documentation or research on that?

Also, shipping llvm-gcc 4.2.1 as the default was a pretty tragic mistake
on Apple's part, IMHO. It's a branch of code that upstream considers
deprecated and unsupported, and it has known bugs. See this bug link
upstream: http://llvm.org/bugs/show_bug.cgi?id=9891
Re: Should I use EV_WRITE event?
2011/11/6 钱晓明:
> Hi, I have been working with libev for a few days, and there is a
> question about EV_WRITE. When the server has processed a request from a
> client, it has some data to write back. At this time, it can:
> *A.* write back directly in a loop, until all data has been written
> *B.* install an EV_WRITE event when the client connects, check the
> buffer in the callback function, and write back if there is data. The
> EV_WRITE event is only installed once.
> *C.* install/start an EV_WRITE event when adding reply data to the
> buffer, and in the callback function write all data to the client,
> stopping the EV_WRITE event last, before returning from the function.
> Which one is best? How frequently will the callback function be called
> in situation B?

C is the best general-purpose solution. You can create the EV_WRITE
watcher when you create the socket, but simply not start it immediately.
When you have reply data that needs to be sent, you buffer it in
application code and start your EV_WRITE watcher to drain that buffer.
When the reply buffer becomes empty, you stop the EV_WRITE watcher. This
gets you correct, nonblocking behavior.

The only problem with it is that it's slightly sub-optimal in the common
case where the entire reply easily fits in the socket's send buffer, so
sometimes you adopt other strategies to handle the common case
efficiently (e.g. attempt an immediate send first, then fall back to C on
EAGAIN). But you almost always need a backup plan that does the full work
of (C) in the case that the socket cannot accept all of your data
immediately. If you don't have code that does scenario C in your
application, your application likely breaks down (fails to function
correctly, or blocks) in corner cases that weren't obvious to you
initially.
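The buffering half of strategy C, without the libev glue (a sketch of mine; wbuf_send/wbuf_flush are hypothetical names, and ev_io_start/ev_io_stop calls would bracket the pending state): try to push the reply straight into the socket, stash whatever EAGAIN leaves behind, and report whether an EV_WRITE watcher still needs to run:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

struct wbuf {
    char data[4096];
    size_t len;               /* bytes still waiting to be written */
};

/* Returns 1 if bytes remain buffered (keep/start EV_WRITE), 0 if drained
 * (stop EV_WRITE).  On a hard write error it also returns 1 and leaves
 * errno set for the caller to inspect. */
int wbuf_flush(struct wbuf *b, int fd)
{
    while (b->len) {
        ssize_t n = write(fd, b->data, b->len);
        if (n < 0)
            return 1;         /* EAGAIN/EWOULDBLOCK: wait for writability */
        memmove(b->data, b->data + n, b->len - (size_t)n);
        b->len -= (size_t)n;
    }
    return 0;                 /* buffer empty: no EV_WRITE needed */
}

/* Queue a reply, then attempt an immediate flush (the common fast path). */
int wbuf_send(struct wbuf *b, int fd, const char *msg, size_t n)
{
    if (b->len + n > sizeof b->data)
        return -1;            /* overflow policy is up to the application */
    memcpy(b->data + b->len, msg, n);
    b->len += n;
    return wbuf_flush(b, fd);
}
```

In the EV_WRITE callback you would call wbuf_flush() and stop the watcher when it returns 0, which is exactly the start-on-data/stop-on-empty cycle described above.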
Re: [PATCH] ev: fix epoll_init fd leak
On Wed, Nov 2, 2011 at 6:58 AM, Christian Parpart wrote:
> I think a little bit differently here. Think of a server app. The app:
> 1.) first reads its config file,
> 2.) populates internal data structures out of it,
> 3.) cleans up its environment (including closing inherited fds,
> including 0, 1, 2),
> 4.) initializes its own subsystems, including epoll_create (libev) and
> its own log targets and listener sockets,
> 5.) runs the loop until exit.
>
> This is not bad behaviour.

It is arguably bad behavior. You shouldn't be just closing 0/1/2 as part
of your daemon init; you should be re-opening them on /dev/null if you
have nothing more interesting to do with them (arguably, connecting fd #2
to some mechanism for capturing error output from other libraries might
be more robust, but if you want to ignore any stderr generated by other
code, that's fine too).

Here's the first random hit I got on Google showing an example of using
freopen() to do this during daemon init. Countless other examples exist,
and most real-world daemons do something similar with regard to 0/1/2:
http://www.itp.uzh.ch/~dpotter/howto/daemonize

> p.s.: you wanted a software example. Well, I once worked at a company
> where resources (like fds) and high performance were treated very
> precisely; keeping fds around for nothing would have been a joke,
> though, sure, you might work around libev's behaviour.

Keeping 3 whole fds around pointed at /dev/null is pretty common
practice, and isn't going to affect your performance; that's just silly
premature micro-optimization.

-- Brandon
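The freopen()-on-/dev/null idiom the email points at, as a sketch (function names are mine): keep descriptors 0/1/2 allocated but harmless, so later open()/socket() calls can never land on the "standard" slots and get scribbled on by stray printf()s from library code. The fork-based checker is just a test harness convenience so the caller's own stdio is untouched:

```c
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Point stdin/stdout/stderr at /dev/null during daemon init.
 * Returns 0 on success, -1 if any stream could not be re-opened. */
int reopen_std_streams(void)
{
    if (!freopen("/dev/null", "r", stdin))  return -1;
    if (!freopen("/dev/null", "w", stdout)) return -1;
    if (!freopen("/dev/null", "w", stderr)) return -1;
    return 0;
}

/* Exercise it in a child process so the caller's stdio stays usable. */
int check_reopen_in_child(void)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(reopen_std_streams() == 0 ? 0 : 1);
    int status;
    waitpid(pid, &status, 0);
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```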
Re: alternate approach to timer inaccuracy due to cached times
On Sat, Oct 15, 2011 at 1:40 AM, Denis Bilenko wrote:
> Giving the IO watcher higher priority than the ev_timer's would not
> work here, would it?
> IIRC, priorities only matter within the same loop iteration; here the
> issue occurs when the timer has got an event but the IO hasn't yet.

Priorities should still fix things, within the limitation that they can't
make the timer fire "on time" if the loop iteration is taking longer than
the timeout value. The loop runs in two distinct phases: event gathering
followed by callback execution. Even if your callback execution time is
1000ms, and at the end of the last callback you set an i/o timeout for
500ms: if another i/o arrived during the 1000ms you were blocking the
loop, and the i/o watcher has higher priority than the timeout watcher,
the i/o callback will be invoked before the timeout callback on the next
iteration (at which point it can reset the timeout for another 500ms,
rather than letting it invoke immediately). If you set the timeout
priority lower than the i/o priority and it still invokes immediately on
the next loop iteration, in practice that means you really did have no
incoming i/o for a period longer than 500ms.

-- Brandon
Re: alternate approach to timer inaccuracy due to cached times
On Fri, Oct 14, 2011 at 6:34 AM, Denis Bilenko wrote: > Shaun, your patch does not seem to work for me with the above program > - the callback is not called at all if the patch is applied. The primary issue with the idea behind his patch (I don't know about the implementation) is that it guarantees some timeouts will not fire until much later than expected. By basing all timer starts on the end stamp of loop event processing rather than the beginning, the new minimum delay for a timeout is now the event processing delay + the specified timeout. So for example, in his original scenario (500ms timeouts, and 500ms to process a batch of events), with his patched approach some of his timeouts would effectively become 1000ms timeouts. It's basically automatically applying the workaround of "expand the timeout to be large enough that the event processing delay doesn't matter so much", at the expense of making timeouts much more inexact and unpredictable in the senses that they're exact and predictable today. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: alternate approach to timer inaccuracy due to cached times
On Fri, Oct 14, 2011 at 12:48 AM, Shaun Lindsay wrote: > The problem isn't the volume of timer events. Condensing all the timeouts > to one actual event won't help since, as shown in the example, a single > timeout is vulnerable to early expiration. Basing all your timeouts off of > that would just cause everything to expire early (Although using one event > and a backing list seems like a really cool way of squeezing a huge number > of events in to the system). Setting the timeout to a lower priority than the i/o (I believe Marc already mentioned this) would help too. However, when the error timeout window is even roughly in the ballpark of the time it takes to process one set of pending i/o callbacks, I'm not sure you can avoid taking at least a single ev_time() hit when setting each timeout watcher (or in the Strategy 4 optimization, an ev_time() hit as you add/update each linked list entry). Without it there's no way to account for the lost time, and with the loop iteration duration and the timeout being of the same rough magnitude, it matters. There's no magic bullet for that problem, something has to give, and ev_time() hits are cheaper than ev_update_now() hits. "Normal" libev software doesn't see these problems in practice because a typical event loop iteration completes in a very small fraction of the timeout window (e.g. 50ms to process pending events vs a 60 second inactivity timeout). -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: alternate approach to timer inaccuracy due to cached times
On Thu, Oct 13, 2011 at 9:08 PM, Shaun Lindsay wrote: > I think my description of the issue was lacking. This is not specific to > gevent. To illustrate the problem, I've included a C test program that > reproduces the timeout problems [...] libev's way of doing things (using loop start time) is just more efficient for most normal cases, because normally you don't have large delays/blocks in your callbacks in an event-driven program. In the big picture, it sounds like the primary issue in your original code is that you're doing blocking database calls in the midst of an event callback. That's going to screw up a lot of assumptions right there. Usually the way to handle this (assuming the database driver can't be hooked into the loop the way things should be) is to spawn a thread/process to handle SQL stuff asynchronously and talk to it over a local socketpair. If you're really stuck with this though, you could also switch to ev.pod's 4th strategy for timers, where all of your socket error timeouts are collapsed into one timeout watcher and go from there on the loop time issues. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
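Here's a stripped-down sketch of that worker-thread-plus-socketpair pattern. All of the names are invented for illustration, and the blocking database call is replaced by a toy transformation (upper-casing); in the real program, the loop-side fd returned in fds[0] would get an ev_io read watcher so the reply wakes the event loop:

```c
#include <ctype.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* Worker thread: reads "queries" from its end of the socketpair,
 * does the blocking work (here: just upper-casing the bytes), and
 * writes the "result" back on the same socket. */
static void *db_worker(void *arg)
{
    int fd = *(int *)arg;
    char buf[128];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; i++)   /* stand-in for a blocking SQL call */
            buf[i] = (char)toupper((unsigned char)buf[i]);
        (void)write(fd, buf, (size_t)n);  /* send the result back */
    }
    return NULL;
}

/* Creates the socketpair and spawns the worker on fds[1]; the caller
 * keeps fds[] alive and hooks fds[0] into its event loop. */
static int spawn_db_worker(pthread_t *tid, int fds[2])
{
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;
    return pthread_create(tid, NULL, db_worker, &fds[1]) == 0 ? 0 : -1;
}
```

The nice property is that the event loop never blocks: queries go out and results come back over an fd it can watch like any other socket.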
Re: Trouble installing EV-4.03 on Mac OSX 10.7 (Lion)
On Thu, Jul 21, 2011 at 9:35 PM, Simon Cocking wrote: > Hi all, > > I was hoping someone could point me in the right direction with this one. > I've had EV-4.03 working fine with OSX 10.6, but following the upgrade to > 10.7 I can't get it to install either using the default configuration or any > variation. I haven't installed Lion yet to check how things played out at release time, but I know some of the final release candidates were shipping a broken version of llvm-gcc that had a known bug affecting EV.xs compilation. The bug is documented by the LLVM guys at: http://llvm.org/bugs/show_bug.cgi?id=9891 , and it was an optimizer bug (goes away with -O0). The final bug comment shows an EV.xs patch I made that serves as a crude workaround. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Best libev way to write in a socket
On Tue, Jun 21, 2011 at 1:31 PM, Zabrane wrote: > Hi Brandon, > Could your proxy code be shared for new comers (as me)? > It'll be very good if build up a repository of libev examples. Unfortunately it's not very good example code in its current state. It uses unportable things (e.g. Linux's splice(2)) that have a significant impact on how everything works, it implements a quirky proprietary protocol, and it was written for raw speed and getting the code out the door quickly, at the expense of the logic often being very hard to follow. I think in the time it took any sane person to comprehend my proxy code, they could simply learn to write something cleaner on their own the hard way :). Someday I hope to refactor it to be releasable as a cleaner, generic proxy that can support multiple protocols, but that could be a few months away yet, and even then it would probably still be highly Linux-specific. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Best libev way to write in a socket
2011/6/20 Jaime Fernández : > b) What's the most convenient way to write data in a TCP socket: > 1.- "send" data directly in the read callback > 2.- Create a write watcher > 3.- Create a watcher of both types: READ&WRITE. Which approach is most efficient really depends on the nature of the protocol and the data traffic's patterns. The most generic answer is this: create both reader and writer watchers for the socket separately, but only activate the read watcher when you need to receive traffic, and only activate the write watcher when you have buffered data ready to write to the socket. You can enable and disable them independently with ev_io_start() and ev_io_stop(). e.g. for a simple HTTP server, you might start with only the read watcher enabled. Once you've processed a request and have a response ready to send, you turn on the write watcher to drain the buffered data, then turn it off again when the write buffer is empty. You can also write response data directly from the read watcher that processed the request, and it will probably work 99% of the time for most typical protocols with a serial conversation of reads and writes. However, the write could block because local buffers and the tcp in-flight window are filled, at which point you'll need to buffer the data somewhere per-connection and start up a write watcher to drain it anyways. One way or another, the implementation won't work 100% unless you implement a write watcher and a buffer somehow. For an example of trying to be efficient and quick: I recently wrote a proxy server that mostly passes clear data between two connections (other than some initial protocol-specific setup traffic). For that I implemented both read and write watchers for both sides, so 4 watchers per proxied connection. You can think of them as read+write for each socket, but it's more natural to pair them the opposite way and think of them as e.g.
read_socket1+write_socket2 and read_socket2+write_socket1, for each direction of traffic flow. By default only the read watchers are turned on, and they attempt a non-blocking optimistic write() to the other socket immediately at the end of the read() callback. If the write() fails due to blockage, the data is buffered and the watchers are swapped: the read watcher is disabled, the corresponding write watcher is enabled. The write-watcher, after each invocation where it drains part of the buffer, attempts a non-blocking optimistic read() in order to try to keep the buffer filled. If the buffer drains completely, the traffic flow switches back to watching for read() availability (stop the write watcher, start the read watcher). -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
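In libev API terms, the core of that watcher swap looks roughly like the following untested sketch; struct dirflow and the buf_store()/buf_drain() helpers are hypothetical stand-ins for the real buffering code:

```
/* one direction of flow: reads from src_fd, writes to dst_fd */
static void read_cb(EV_P_ ev_io *w, int revents)
{
    struct dirflow *d = w->data;
    ssize_t n = read(d->src_fd, d->buf, sizeof d->buf);
    /* ... n <= 0: handle EOF / EAGAIN / error ... */
    ssize_t wn = write(d->dst_fd, d->buf, n);  /* optimistic write */
    /* ... wn < 0 && errno != EAGAIN: hard error ... */
    if (wn < n) {                              /* blocked or partial */
        buf_store(d, wn < 0 ? 0 : wn, n);      /* stash the remainder */
        ev_io_stop(EV_A_ &d->read_w);          /* swap: stop reading,  */
        ev_io_start(EV_A_ &d->write_w);        /* wait for writability */
    }
}

static void write_cb(EV_P_ ev_io *w, int revents)
{
    struct dirflow *d = w->data;
    /* ... write() out buffered data, attempt an optimistic read() ... */
    if (buf_drain(d) == 0) {                   /* buffer fully drained */
        ev_io_stop(EV_A_ &d->write_w);         /* swap back to reading */
        ev_io_start(EV_A_ &d->read_w);
    }
}
```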
Re: timer behaviour
On Mon, Jun 13, 2011 at 4:23 PM, Juan Pablo L wrote: > attached is source for database pool module, The locking is insufficient. See http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#THREAD_LOCKING_EXAMPLE . At the very least, you need to add hooks via ev_set_loop_release_cb() so that ev_run() uses your loop lock and releases it when not waiting, and an empty ev_async watcher to notify the loop of changes (add/remove watchers). There could be other problems lurking as well, I didn't look too deep. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
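Roughly, the pieces that need wiring in look like this untested sketch, following the pattern from the ev.pod THREAD_LOCKING_EXAMPLE (loop_lock, async_w and some_timer are invented names):

```
static pthread_mutex_t loop_lock = PTHREAD_MUTEX_INITIALIZER;
static ev_async async_w;

static void l_release(EV_P) { pthread_mutex_unlock(&loop_lock); }
static void l_acquire(EV_P) { pthread_mutex_lock(&loop_lock); }
static void async_cb(EV_P_ ev_async *w, int revents) { /* wake-up only */ }

/* loop thread setup: hold the lock except while the loop is polling */
ev_async_init(&async_w, async_cb);
ev_async_start(loop, &async_w);
ev_set_loop_release_cb(loop, l_release, l_acquire);
pthread_mutex_lock(&loop_lock);
ev_run(loop, 0);

/* any other thread, to add/remove/modify watchers: */
pthread_mutex_lock(&loop_lock);
ev_timer_again(loop, &some_timer);
ev_async_send(loop, &async_w);    /* kick the loop to notice the change */
pthread_mutex_unlock(&loop_lock);
```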
Re: timer behaviour
On Mon, Jun 13, 2011 at 9:38 AM, Juan Pablo L wrote: > yes i m calling ev_timer_again but i m calling from outside the loop > (not from any callback), i m calling from another thread. > i m calling ev_break from another thread. If you're making calls on a single eventloop from multiple threads, are you locking for it? This is mentioned in the libev docs. libev doesn't do locking for you, you have to serialize the calls on a single loop yourself, e.g. with a pthread mutex. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: updating timers safely
On Thu, Jun 2, 2011 at 10:25 PM, Brandon Black wrote: > On Thu, Jun 2, 2011 at 10:15 PM, Juan Pablo L > wrote: >> i m trying to avoid using start/stop or even ev_timer_again because the >> socket is considered busy when taken by a thread and thread will hold the >> socket for as long as clients have something to send which could be a lot of >> info >> or just a few bytes, i m trying to say that a thread may hold the socket >> longer than the >> idle time out so how can i manage >> to prevent the event loop from disconnecting a socket that is being used >> by another thread ? > > Well, not knowing the rest of this design, my guess would be that when > a thread takes control of a socket, it should remove it from the idle > timer linked list, and when it's done it should re-add it at the end. > I should add: you might try a single-thread design first. Seems like every time I design for multi-threaded workers to take advantage of CPU cores, I end up hitting some other limit before I need them and going back to just running one thread. My last project, I maxed out the packets/sec rate on the network interface before I had to spin up a second worker thread. But then again, most of these projects I'm talking about had very light real processing requirements, they were mostly just shuffling data around (e.g. proxy servers for custom protocols). ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: updating timers safely
On Thu, Jun 2, 2011 at 10:15 PM, Juan Pablo L wrote: > i m trying to avoid using start/stop or even ev_timer_again because the > socket is considered busy when taken by a thread and thread will hold the > socket for as long as clients have something to send which could be a lot of > info > or just a few bytes, i m trying to say that a thread may hold the socket > longer than the > idle time out so how can i manage > to prevent the event loop from disconnecting a socket that is being used > by another thread ? Well, not knowing the rest of this design, my guess would be that when a thread takes control of a socket, it should remove it from the idle timer linked list, and when it's done it should re-add it at the end. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: updating timers safely
On Thu, Jun 2, 2011 at 9:08 PM, Juan Pablo L wrote: > thanks for your reply, actually yes i have read the part about the Be smart > about timer, > but this is the first time i have to manage so many timers, and something is > not clear to me. > > #4 refers to making a linked list and putting all timers in there but how > can i control individually > each timer, [] It works, as long as you maintain the list in timer order, which is O(1) if all the timeout intervals are the same. When a connection enters its "connected" state where you want to start the idle timer, you simply add it to the tail of the list. When a connection ends, you remove it from the list. When a connection has an I/O event to reset the idle timer, you move it to the bottom of the list. The actual ev_timer is set for the time the head of the list needs to fire. When it fires, you handle the connection at the head (and any more in list order within some small epsilon past the current ev_now()). If "handling" the timeout at the top of the list means killing the connection, then it simply gets removed from the list. If not, it goes to the bottom (e.g. a recurring maintenance timer per-connection, as opposed to an idle-kill timer). At the end of the callback, you set the next timeout for the timer to the time of the new list head. The details can be tricky to implement and debug. IMHO as the docs state, it's only worth it if you have a *ton* of identical timeouts. It would be neat if someone wrote a generic implementation in C that worked with arbitrary user-defined connection datatypes while keeping list management pointers inside them. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
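Here's a standalone sketch of the list-management half of that scheme: one doubly-linked list of connections ordered by deadline. Because all connections share the same idle interval, pushing at the tail keeps the list sorted in O(1). The libev half (re-arming the single ev_timer to head->deadline after each callback) is left out, and "now" and the deadlines are plain doubles supplied by the caller; all names here are invented for illustration:

```c
#include <stddef.h>

struct conn {
    double deadline;             /* now + idle timeout at last activity */
    struct conn *prev, *next;
};

struct timeout_list { struct conn *head, *tail; };

static void tl_unlink(struct timeout_list *l, struct conn *c)
{
    if (c->prev) c->prev->next = c->next; else l->head = c->next;
    if (c->next) c->next->prev = c->prev; else l->tail = c->prev;
    c->prev = c->next = NULL;
}

static void tl_push_tail(struct timeout_list *l, struct conn *c, double deadline)
{
    c->deadline = deadline;
    c->prev = l->tail;
    c->next = NULL;
    if (l->tail) l->tail->next = c; else l->head = c;
    l->tail = c;
}

/* i/o activity on a connection: refresh its deadline, move to tail */
static void tl_touch(struct timeout_list *l, struct conn *c, double deadline)
{
    tl_unlink(l, c);
    tl_push_tail(l, c, deadline);
}

/* called from the single timer callback: returns one expired conn at a
 * time; when it returns NULL, re-arm the real timer for head->deadline */
static struct conn *tl_pop_expired(struct timeout_list *l, double now)
{
    struct conn *c = l->head;
    if (c && c->deadline <= now) {
        tl_unlink(l, c);
        return c;
    }
    return NULL;
}
```

The one real ev_timer then always tracks l->head->deadline, so you pay for a single timer no matter how many connections are on the list.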
Re: c++ compilation problem
On Mon, Apr 18, 2011 at 3:16 PM, Richard Kojedzinszky wrote: > > So for now, do you say that the whole c++ support is undocumented, and > unsupported? I think he means only the interfaces documented in the pod are documented and supported at this time: http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#C_SUPPORT -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Stopping a watcher in one callback where another may be called
On Tue, Mar 29, 2011 at 11:30 PM, Arlen Cuss wrote: > My question now: say both the ev_timer and ev_io get triggered at the > same time. If the ev_timer callback gets called first, then the ev_io is > stopped and deallocated (as is all the auxiliary data that the ev_io > callback will be using, as it happens). If the timer callback happens to run first, and does ev_io_stop on the related io watcher, libev won't try to run the io callback (even if it had pending events for the same eventloop iteration). You can set the order of execution for related events in a single eventloop iteration by setting watcher priorities (e.g. for standard network stuff, you might set the io watcher as higher priority than the idle timeout watcher, so that if they trigger on the same loop iteration the connection isn't unnecessarily killed. Or if proxying buffered network traffic, you might set write callbacks higher than read to minimize peak buffer allocation). ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: active index mismatch in heap?
On Wed, Mar 9, 2011 at 3:23 AM, Marc Lehmann wrote: > I wonder how much the slowdown of very frequent ev_verify's is compared > to running under valgrind (which presumably would have pinpointed this > problem nicely). I tried both with this application. Valgrind is considerably slower, but works well enough for testing with just a handful of connections. Running the daemon under gdb for a long time with lots of test clients was what really nailed down the problem for me in the end, because it was still fast enough to be usable, and when it aborted from the assert I could look at a stack trace. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: active index mismatch in heap?
On Sun, Feb 27, 2011 at 1:38 AM, Marc Lehmann wrote: > On Sat, Feb 26, 2011 at 11:27:10PM -0600, Brandon Black > wrote: >> Assertion '("libev: active index mismatch in heap", ((W)((heap >> [i]).w))->active == i)' failed in verify_heap() at ./libev/ev.c:1978 > > [...] > > Yes, but there are lots of possibilities. If you use threads, then a > possibility would be that you stop the watcher in another thread. It could > also be that you free an active watcher. Just for the record, I was eventually able to track down my error, and it was a corner case where I was failing to stop a timer watcher before free()-ing it. Thanks for the help -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
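For anyone hitting the same assert: the shape of the fix is simply to stop any possibly-active watcher before freeing the memory it lives in. An untested fragment (the connection struct and member names are hypothetical):

```
/* teardown for a connection object that embeds its watchers */
ev_timer_stop(loop, &conn->idle_timer);  /* harmless if already stopped */
ev_io_stop(loop, &conn->io_watcher);
free(conn);
```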
active index mismatch in heap?
Hi, I'm getting the following assert() failure rarely when running with libev 4.04 w/ EV_VERIFY=3 (this on Linux w/ epoll, a multi-threaded app that uses one loop per thread). I haven't yet managed to catch the assert while running under a debugger, it only seems to happen when I'm running a "live" test daemon (I made it so that assert() output would go to syslog), and even then it's only happened twice in the course of a week or so: Assertion '("libev: active index mismatch in heap", ((W)((heap [i]).w))->active == i)' failed in verify_heap() at ./libev/ev.c:1978 At this point I'm just looking for some pointers to help me track down what's going on: is this generally likely to be an application bug on my part (I am using the public API only), or should I be looking more at libev itself? I'd post code but the application is rather complex and I haven't been able to reliably reproduce this even with the full app, much less pare it down to a minimal case yet. In general I'm only using normal ev_timer, not ev_periodic. Some of my timers are short one-shot timers, and then there's one repeating timer which is reset to a new interval with w->repeat = X; ev_timer_again() within its callback (as part of a double-linked list setup like strategy #4 in ev.pod). Is there some likely stupid error I could be making in my use of the ev_timer API that would lead to this assert, that I could double-check the code for specifically? -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: are new benchmarks needed?
On Tue, Dec 28, 2010 at 10:40 PM, Charles Kerr wrote: > (2) If there are no practical performance difference between libev and > libevent, as libev's author says, then when libev compares itself to > libevent in the README and in other places, it should *say that* > instead of saying "faster" or "much faster" as it does now. It's > simply self-evident that "faster" is not equivalent to "no practical > difference." Perhaps this is just a matter of the statements coming from different perspectives. In an artificial benchmark of eventloops, under a microscope, perhaps one is significantly faster than the other, but in any real-world practical application, your application code should dominate performance issues, and a choice between two reasonably efficient eventloop implementations just doesn't have much practical impact. At that point your choice is really more about API preference, maintenance issues, portability issues, etc. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Example of custom backend with thread?
On Tue, Dec 21, 2010 at 2:17 PM, AJ ONeal wrote: > I have a function to process data on a DSP which blocks > and a network socket which gives data to process. Not knowing the rest of the details, my first stab would probably be to continue with your model of having a separate thread for handling the blocking DSP interactions (1 thread per DSP if there's more than one DSP). You could implement a work queue that's locked with a pthread mutex and has a pthread condition variable for signaling readiness. Might be interesting to track average workqueue length to know if the DSP stuff is bottlenecking too. Very rough pseudocode for the lock/cond interaction (ignoring many complexities of any real implementation, threads are tricky, you have to be careful about who owns what data at any given time):

pthread_mutex_t workqueue_mutex;
pthread_cond_t workqueue_cond;

// main thread, libev callback for client socket data
client_data_recv() {
    // ... put received network data in some buffer ...
    pthread_mutex_lock(&workqueue_mutex);
    add_buffer_ptr_to_workqueue(mybufptr, workqueue);
    pthread_cond_signal(&workqueue_cond);
    pthread_mutex_unlock(&workqueue_mutex);
}

// DSP thread
mainloop() {
    while(1) {
        pthread_mutex_lock(&workqueue_mutex);
        // re-check in a loop: condition waits can wake spuriously
        while(queue_is_empty(workqueue))
            pthread_cond_wait(&workqueue_cond, &workqueue_mutex);
        remove_buffer_ptr_from_workqueue(mybufptr, workqueue);
        pthread_mutex_unlock(&workqueue_mutex);
        // ... process work item from mybufptr ...
    }
}

___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: write()s and pwrite()s from multiple threads in OSX
On Tue, Nov 23, 2010 at 4:55 AM, Jorge Chamorro wrote: > We've found, in nodejs, that on Mac OSX, when write()ing to a disk file in a > tight loop from >= 2 threads, sometimes some write()s seem to fail to write > the data but update the fd's file pointer properly, resulting in strings of > zeroes ('holes') of data.length length in the file. Another way to fix this without locking (at least on my mac running snow leopard) is to set O_APPEND when you open the fd. I'd agree that concurrent writes from pthreads in the same process should be atomic according to POSIX, but in practice I think this is exactly the sort of area where you're going to find some platforms with bugs. Most issues come down to file position update issues. O_APPEND fixes this in some cases because it uses different logic for updating the file position (I think concurrency w/ O_APPEND even works inter-process for many systems? not sure). The best way to do concurrent writes is of course using pwrite() with specific offsets if you're dealing with fixed-size records, but O_APPEND is a reasonable solution for logfile-like semantics. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
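A minimal sketch of the O_APPEND approach for logfile-like writes (names invented for illustration): with O_APPEND, each write() is an atomic seek-to-end-of-file plus write, so concurrent writers sharing the file can't race on the offset and leave holes of zeroes.

```c
#include <fcntl.h>
#include <unistd.h>

/* every writer opens the log with O_APPEND; 0644 perms assumed */
static int open_log(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
}

static int log_write(int fd, const char *line, size_t len)
{
    /* no lseek(), no locking: the kernel appends atomically */
    return write(fd, line, len) == (ssize_t)len ? 0 : -1;
}
```

For fixed-size records at known offsets, pwrite() with explicit offsets is the other lock-free option mentioned above.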
Re: libev TCP echo server example?
On Sat, Mar 27, 2010 at 3:38 PM, Marc Lehmann wrote: > On Sat, Mar 27, 2010 at 07:21:52PM +, Chris Dew > wrote: >> // Lots of globals, what's the best way to get rid of these? > > put them into a struct, maybe together with the watcher(s), and put it's > address intot eh data pointer (or or some othe rtechnique described in the > libev manual). > If you want a more complex/complete example, you can look at some of my code over here: http://code.google.com/p/gdnsd/source/browse/trunk/gdnsd/dnsio_tcp.c There be many dragons in this code, but you can see the basic per-thread/per-conn data structures and how they're passed around and how the callbacks are set up with timeouts and so-on. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Optimal multithread model
On Tue, Mar 16, 2010 at 9:51 AM, Christophe Meessen wrote: > Brandon Black a écrit : >> >> The thing we all seem to agree on is that eventloops >> beat threads within one process, but the thing we disagree on is >> whether it's better to have the top-level processes be processes or >> threads. >> > > I'm really not so sure about the former. The ICE system developers > [http://www.zeroc.com] which is a CORBA like application made some > benchmarks and concluded that the one thread per socket is the most > efficient. It is also used in one the most efficient CORBA implementation > omniORB [http://omniorb.sourceforge.net/]. > > This is probably also because the application can't be easily turned into an > event loop program because the "callbacks" may have a long execution time. > These would have to be turned into state machines using the timer to go from > one state to the other. This is weird and doesn't seem at all more efficient > than a plain basic thread. Users would dislike it. > My impression is that the discussion is biased by a particular use case > pattern in mind and a focus nearly exclusive on performance. Well yes if their code isn't well-structured for event loops, then threads will work better :) The "thread per socket" thing is something I've run into as well though, at least on Linux, regardless of whether the threads are threads or processes. What that boils down to is that Linux has a serializing lock on each socket. Normally most people are dealing with TCP sessions, and one "socket" is a serial TCP session anyways, and so this isn't a practical concern. However, with small UDP transactions (think designs like DNS servers) involving a single request packet and a single reply packet, this lock on the server's listening socket becomes a limiting factor. You'd like to scale up by putting several threads of execution behind a single socket, but it just doesn't work because of the socket locking. 
So what you end up doing is spawning a thread/process per UDP socket, and having something else loadbalance all your requests from the official public port number to the several ports your application listens on (Linux's ipvsadm can do this in software right on the same host). There are NUMA considerations in this problem too, concerning where you place the processes and where you place your NICs and how the IRQs get routed, etc. To some degree the kernel autobalances this stuff, but libnuma and/or numactl (or other similar stuff) are handy too. I ran into this writing an actual DNS server, but while researching the socket scaling thing I came across a reference from the facebook guys facing the same problem with UDP-based memcached not scaling up, detailed here: http://www.facebook.com/note.php?note_id=39391378919 . Sounds like they found the kernel-level issue and patched around it locally, but never merged upstream? ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Optimal multithread model
On Tue, Mar 16, 2010 at 8:40 AM, Christophe Meessen wrote: > Regarding the threads vs. processes discussion I see a use case which hasn't > been discussed yet. > > Consider the C10K application with many cold links and a context (i.e. > authentication, data structures, ...) associated to each connection. > > With threads we can easily set up a pool of worker threads that easily and > efficiently pick up the context associated to the connection becoming > active. I don't see how an equivalent model can be efficiently implemented > with processes. > > I would prefer it was possible to do it with processes because they have the > benefit of a separate memory space which is much better for security and > robustness. But I couldn't find a way to do it as easily and efficiently as > with threads. > You're again mixing up "how do I scale on many cores" with "how do I scale for many slow network events on one core", which can be combined into the "how do I scale for many slow network events on many cores". You would never solve the combined problem with blocking processes alone. The process model for this would be one process per core, and either many threads per process (one for each connection that process is handling), or an event loop per process handling the many connections. The thing we all seem to agree on is that eventloops beat threads within one process, but the thing we disagree on is whether it's better to have the top-level processes be processes or threads. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Optimal multithread model
On Tue, Mar 16, 2010 at 12:15 AM, James Mansion wrote: > Brandon Black wrote: >> >> However, the thread model as typically used scales poorly across >> multiple CPUs as compared to distinct processes, especially as one >> scales up from simple SMP to the ccNUMA style we're seeing with >> > > This is just not true. I think it is, "as typically used". Again, you can write threaded software that doesn't use mutexes and doesn't write to the same memory from two threads, but then you're not really using much of threads. One of my major projects right now is structured that way actually, so I certainly see your argument and use it :). However, using threads this way isn't much different from using processes. The advantage to using threads over processes in that case is that you don't have to go around making SysV shm or mmap(MAP_SHARED) areas for the data you intend to share, and certain other aspects of the OS interface (like how you cleanly start and stop the groups of processes, how you handle signals, etc) are simplified. With either model you can still take advantage of pthread mutexes and condition variables and so-on (see pthread_mutexattr_setpshared(), etc) once you've established your explicitly-shared memory regions, if those mechanisms make the most sense for you. That's really the crux: explicit sharing where necessary, versus implicitly sharing everything and relying on programmer discipline to avoid Bad Things. >> >> large-core-count Opteron and Xeon -based machines these days. This is >> mostly because of memory access and data caching issues, not because >> of context switching. The threads thrash on caching memory that >> they're both writing to (and/or contend on locks, it's related), and >> some of the threads are running on a different NUMA node than where >> the data is (in some cases this is very pathological, especially if >> you haven't had each thread allocate its own memory with a smart >> malloc). >> > > This affects processes too.
The key aspect is how much data is actively > shared between > the different threads *and is changing*. The heap is a key point of > contention but thread > caching heap managers are a big help - and the process model only works when > you don't > need to share the data. We agree on this point up until your last statement. The process model shares data just as well as the thread model, you just have to explicitly state what is shared. > Large core count Opteron and Xeon chips are *reducing* the NUMA affects with > modest numbers of threads because the number of physical sockets is going > down > and the integration on a socket is higher, and when you communicate between > your > coprocesses you'll have the same data copying - its not free. > > Sure - scaling with threads on a 48 core system is a challenge. Scaling on a > glueless > 8 core system (or on a share of a host with more cores) is more relevant to > what most > of us do though. > > Clock speed isn't going anywhere at the moment, but core count is - and so > is the > available LAN performance. I think I see this very differently. We agree that clock speeds are going nowhere, leading to higher core counts per die. Where we're at now is that a cheap-ish 1U server may well have 2 NUMA nodes with 2-4 cores each, and both the core count per NUMA node and the total number of NUMA nodes is likely to increase going forward. NUMA is going to become a single-server scaling problem, even for small servers. It's the only way a small server that costs a fixed $X is going to continue to scale up in performance over the years now. Either they're going to both increase core count slightly and increase the count of (smaller) dies in a small system, having each die be a NUMA node, or stuff 16+ cores in a die and try to stick with just 2 dies in the system, in which case they put some NUMA hierarchy inside each die. But either way, there's going to be more NUMA involved as we add more cores to a single system. 
We've been down the 16 (or more) -way uniform-memory-access SMP road years ago, and it simply doesn't scale. [... skipped some of the rest, relatively minor points and this debate is getting long ...] > are static. That's a long way different from sharing only a small flat > memory resource and > some pipe IPC. It's more convenient and can be a lot faster. Processes *can* do everything threads do. Even the exact same forms of memory sharing and IPC, with the same efficiencies. The pthread API itself is usable across process boundaries, or you can do other equivalent things. [...] > Let me ask you - how do you think memcached should have scaled past their > original single-thread performance?
Re: Optimal multithread model
On Mon, Mar 15, 2010 at 5:07 PM, James Mansion wrote: > Marc Lehmann wrote: >> >> Keep in mind that the primary use for threads is to improve context switching >> times in single processor situations - event loops are usually far faster >> at >> context switches. >> >> > > No, I don't think so. (re: improve context switching times in single > processor > situations). Yes, context switch in a state machine is fastest, followed by > coroutines, > and then threads. But you're limited to one core. >> >> Now that multi cores become increasingly more common and scalability to >> multiple cpus and even hosts is becoming more important, threads should be >> avoided as they are not efficiently using those configs (again, they are a >> single-cpu thing). >> > > You keep saying this, but that doesn't make it true. His basic point here, which I agree with, is sound. If you're trying to scale up a meta-task (network server) that does many interleaved tasks (talking to many clients) on one processor, event loops are going to beat threads, assuming you can make everything the event loop does nonblocking (or fast enough for blocking to not matter much). That's all talking about a single CPU core though. However, the thread model as typically used scales poorly across multiple CPUs as compared to distinct processes, especially as one scales up from simple SMP to the ccNUMA style we're seeing with large-core-count Opteron- and Xeon-based machines these days. This is mostly because of memory access and data caching issues, not because of context switching. The threads thrash on caching memory that they're both writing to (and/or contend on locks, it's related), and some of the threads are running on a different NUMA node than where the data is (in some cases this is very pathological, especially if you haven't had each thread allocate its own memory with a smart malloc). 
>> >> While there are exceptions (as always), in the majority of cases you will >> not be able to beat event loops, especially when using multiple processes, >> as they use the given resources most efficiently. >> > > Only if you don't block, which is frequently hard to ensure if you are > using third-party libraries for database access (or heavy crypto, or > calc-intensive code that is painful to step explicitly). In fact, all > those nasty business-related functions that cause us to build systems > in the first place. In that case you can either (a) just use threads instead of event loops, but still run one process per core with several threads within, or (b) use an event loop, but also spawn separate threads for slow-running tasks (crypto, database, whatever) and queue the I/O to and from those threads through the event loop for non-blocking access to them. I think some of the issue in this argument is a matter of semantics. You can make threads scale up well anyways by simply designing your multi-threaded software to not contend on pthread mutexes and not having multiple threads writing to the same shared blocks of memory, but then you're effectively describing the behavior of processes, and you've implemented a multi-process model by using threads but not using most of the defining features of threads. You may as well save yourself some sanity and use processes at that point, and have any shared read-only data either in memory pre-fork (copy-on-write, and the write never happens to these blocks), or via mmap(MAP_SHARED), or some other data-sharing mechanism. So if you've got software that's scaling well by adding threads as you add CPU cores, you've probably got software that could have just as efficiently been written as processes instead of threads, and been less error-prone to boot. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: [PATCH 2] fix broken strict aliasing
On Wed, Feb 24, 2010 at 8:15 AM, Marc Lehmann wrote: > On Tue, Feb 23, 2010 at 12:03:52PM -0600, Brandon Black > wrote: >> something I wanted to verify. I'm still not entirely clear, in the >> general case, when aliasing through pointers to technically-unrelated >> types (two different structs which just happen to have a common set of >> initial members, in this case) is allowed by the C standard, and when > > While that is nice of you, it has zero relevance to libev, as libev > doesn't alias those members at all (that's the whole point of the casts, > obviously...). > > It would be far more interesting to check whether any casts are missing, > causing gcc to not warn, but to actually alias, resulting in code > misoptimisations... > >> it does or doesn't cause gcc optimization bugs, which may or may not > > assuming aliasing where gcc doesn't does cause code not to behave as > intended. these are not optimisation bugs, but bugs in the code. > > afaics, if you remove the casts, gcc will actually "misoptimise" ev.c, > as this results in aliasing and gcc will happily ignore writes when > inlining... What I am reading in your two responses above (feel free to berate me if I'm wrong) is that the way the ev.c code is now (the specific use of types and typecasts) both causes a gcc warning related to aliasing, and prevents an aliasing-related bug in the code (which would only manifest itself under optimization). Whereas if the casts were changed, the opposite would be true: no warning, but yes bug. You don't see why this is a confusing situation for users of gcc? My choices in this situation are seriously either get a warning or get a bug? >> be related to gcc's authors either not adhering to the standard, or >> being "more strict" about aliasing than the standard strictly >> requires. > > gcc is documented to be less strict than C, btw. Ok, so you're saying anything the C standard accepts in terms of legal aliasing, gcc will not mis-optimize? 
>> I'm taking you at your word that libev's pointer-aliasing doesn't > > Yeah, and you beating your wife is probably not a problem for the economy > either. > > As the very first I would like to see evidence for any aliasing in > libev. I have no clue what you are talking about, and so far, your statement > above is just more fud spreading... > > Really, if you don't know, why can't you for god's sake ASK instead of > making up bullshit claims such as libev doing any pointer aliasing (or > relying on any (obviously legal) aliasing in fact, e.g. by using memcpy). > > These continued assaults, intended or not, are really annoying. > > You are making up bullshit faster than I can set it right! > > I seriously suggest changing your strategy - don't make stupid statements you > have no clue about, ASK and I will answer. > > So far, all my time in this discussion is sucked up by just clarifying > all the weird statements people make, without having the slightest clue > apparently. > > If sth. is unclear, I can reply both by wearing my libev maintainer hat as > well as wearing my gcc maintainer hat. > > It is hard for me to help you if you don't have any questions and just > make full-mouthed but wrong claims. This is the kind of assault that makes our conversations almost unbearable. Unless I misunderstand you, the problem here is that I'm misusing the word aliasing. You could have simply pointed out that I'm misusing this technical term in this context, instead of ranting about false accusations. If I change the phrase "libev's pointer-aliasing" to "libev's specific method of using pointers to typeA to point to objects of typeB", then the second statement is true, and is what I meant. You do use ev_watcher pointers to refer to ev_io objects. If that isn't an instance of aliasing, then I'm misusing the term. The mistake is that simple. I'm done with the rest of this. 
I think you know the answers to the questions I'm asking, but it's just not worth the abuse to figure out how to correctly ask you. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: [PATCH 2] fix broken strict aliasing
On Tue, Feb 23, 2010 at 8:25 AM, Marc Lehmann wrote: > On Mon, Feb 22, 2010 at 02:05:58PM -0600, Brandon Black > wrote: >> > Can we settle this please, or _finally_ bring some evidence? >> >> You're arguing with the wrong person. I didn't submit the patch, and >> I have never once asserted that there was a problem with the libev >> code with regards to aliasing. > > Except that you explicitly wrote that and reinforced it later. You have > weird ways of writing what you apparently didn't assert. Maybe use quote > characters around assertions that you didn't mean to come from you? > > I know you didn't submit the patch, but I still know what you wrote. Maybe > it was a simple mistake you made, but fact is that you did assert that > there is an issue with libev and strict aliasing, and so far, you haven't > taken back that statement. > > So please be bold enough to stand by your words or clarify them. Saying A > and then claiming you didn't assert A does not do it in my eyes. Fine. This is the statement I think we are at odds about, in my first email in this thread: "I've seen this strict-aliasing issues with libev + gcc 4.4 as well. I haven't bothered to report it yet simply because I haven't had the time to sort out exactly what's going on, and whether it's a real issue that needs to be fixed, or as you've said, just an annoying bogus warning." I did not say, "I have seen aliasing errors in the libev code". I said I had seen strict-aliasing issues with libev + gcc 4.4 ("issues" here meaning I'm seeing warnings with this code compiled with this compiler, although perhaps that was not explicitly clear), and then noted that I have not yet determined for myself whether the warnings are bogus. >> I just find these warnings confusing, > > Then don't enable them - libev doesn't enable them, and with every > released version of gcc you explicitly have to ask for them. 
Because of all of the FUD and misinformation surrounding gcc aliasing warnings in general (not just here on this mailing list), they are something I wanted to verify. I'm still not entirely clear, in the general case, when aliasing through pointers to technically-unrelated types (two different structs which just happen to have a common set of initial members, in this case) is allowed by the C standard, and when it does or doesn't cause gcc optimization bugs, which may or may not be related to gcc's authors either not adhering to the standard, or being "more strict" about aliasing than the standard strictly requires. I'm taking you at your word that libev's pointer-aliasing doesn't cause bugs when compiled with gcc (probably the most common compiler for libev), partly because you're a smart guy, and partly because the technique in question is in wide use elsewhere. I even use similar techniques in my own code, although subtle differences make mine not emit the warnings. From what I gather around the internet, though, whether or not any given version of gcc issues aliasing warnings for any given chunk of code may be completely orthogonal to whether there is a gcc aliasing-related optimization bug anyways (much less to whether the code complies with the C standards). But I still would like to know, for my own sake, what the rules are on this issue. My own google searching on this topic has turned up a lot of confusing and contradictory information. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: [PATCH 2] fix broken strict aliasing
On Mon, Feb 22, 2010 at 1:45 PM, Marc Lehmann wrote: > On Mon, Feb 22, 2010 at 12:59:08PM -0600, Brandon Black > wrote: >> On Sun, Feb 21, 2010 at 3:39 AM, Marc Lehmann wrote: >> > On Sat, Feb 20, 2010 at 09:41:35AM -0600, Brandon Black >> > wrote: >> >> I've seen this strict-aliasing issues with libev + gcc 4.4 as well. I >> > Facts please... >> >> I wasn't asserting that there was an aliasing issue in libev. I was > > Well, you asserted a "strict-aliasing issue", which to me sounds a lot > like an aliasing issue... > >> asserting that there is a strict aliasing warning with gcc + libev, > > That's not what you asserted, but maybe what you meant. You even quoted > it... [...] > Can we settle this please, or _finally_ bring some evidence? You're arguing with the wrong person. I didn't submit the patch, and I have never once asserted that there was a problem with the libev code with regards to aliasing. I just find these warnings confusing, as do many people, and I'm trying to find clarification on them. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: [PATCH 2] fix broken strict aliasing
On Sun, Feb 21, 2010 at 3:39 AM, Marc Lehmann wrote: > On Sat, Feb 20, 2010 at 09:41:35AM -0600, Brandon Black > wrote: >> I've seen this strict-aliasing issues with libev + gcc 4.4 as well. I > Facts please... I wasn't asserting that there was an aliasing issue in libev. I was asserting that there is a strict aliasing warning with gcc + libev, and I said I haven't fully investigated it to figure out what the cause is. So I agree, I still don't have facts. The real issue here is that I apparently don't understand, precisely and in-depth, exactly what the C standard says about aliasing, and exactly what gcc says about the same, and whether they agree or not. Apparently I'm not the only one, as evidenced not only by this thread, but by Google searches turning up many similar threads on many other mailing lists, as well as web pages that attempt to explain the issue, and then other pages/threads calling those pages horse-shit. I'd like to take a simple case here and break it down, if you don't mind:

==
blbl...@xpc:~$ cat test.c
struct foo { int a; };
struct bar { int a; };

struct bar sbar;

int get_a(void) {
    return ((struct foo*)(&sbar))->a;
}
blbl...@xpc:~$ gcc -std=c99 -O3 -Wall -c test.c
test.c: In function ‘get_a’:
test.c:13: warning: dereferencing type-punned pointer will break strict-aliasing rules
blbl...@xpc:~$ gcc -v
Using built-in specs.
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.4.1-4ubuntu9' --with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --enable-multiarch --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.4 --program-suffix=-4.4 --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --disable-werror --with-arch-32=i486 --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 4.4.1 (Ubuntu 4.4.1-4ubuntu9)
===

The code above is a simplified version of a line at the beginning of evpipe_init() which generates the same warning. The original line is this:

if (!ev_is_active (&pipe_w))

which after pre-processing is this:

if (!(0 + ((ev_watcher *)(void *)(&((loop)->pipe_w)))->active))

In the definition of "struct ev_loop", pipe_w is a member of type "struct ev_io" (not a pointer, a direct sub-struct). ev_io and ev_watcher have a common first member, "int active". Can someone who understands the relevant standards and whatever modern gcc is doing with aliasing explain what the standards say about the test case snippet above, and why gcc is (possibly erroneously?) warning on it? -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: [PATCH 2] fix broken strict aliasing
On Sat, Feb 20, 2010 at 9:41 AM, Brandon Black wrote: > > I've seen this strict-aliasing issues with libev + gcc 4.4 as well. I > haven't bothered to report it yet simply because I haven't had the > time to sort out exactly what's going on, and whether it's a real > issue that needs to be fixed, or as you've said, just an annoying > bogus warning. The newer versions of gcc are very aggressive about > aliasing assumptions for optimization, and my hunch is that this is a > real issue with newer gcc's that really care about strict aliasing, > but it's just a hunch at this point. I plan at some point in the next > few days to dig into this in detail and sort it out for sure one way > or the other. I meant to add (but forgot): if anyone is really worried this could cause a bug in your code in the short term, you can simply use -fno-strict-aliasing when compiling libev, which will prevent the compiler from making any bad aliasing assumptions about libev code. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: [PATCH 2] fix broken strict aliasing
On Sat, Feb 20, 2010 at 9:06 AM, Marc Lehmann wrote: > On Fri, Feb 19, 2010 at 07:28:03PM +0100, Luca Barbato > wrote: >> > so you need to identify where the warnings originate (e..g in your >> > compiler) and then ask this question to those people who actually control >> > the code that generates such bogus warnings. >> >> Are you _sure_ those are bogus? If you are using x86 with an ancient gcc > > Is there any evidence to the contrary? If yes, I would be happy to hear of > it. > I've seen this strict-aliasing issues with libev + gcc 4.4 as well. I haven't bothered to report it yet simply because I haven't had the time to sort out exactly what's going on, and whether it's a real issue that needs to be fixed, or as you've said, just an annoying bogus warning. The newer versions of gcc are very aggressive about aliasing assumptions for optimization, and my hunch is that this is a real issue with newer gcc's that really care about strict aliasing, but it's just a hunch at this point. I plan at some point in the next few days to dig into this in detail and sort it out for sure one way or the other. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: ev_async_send() not trigerring corresponding async handler in target loop ?
On Wed, Dec 30, 2009 at 11:25 AM, Pierre-Yves Kerembellec wrote: >> Now, as a word of advice: multithreading is (imho) extremely complicated, >> expect that you will have to learn a lot. [...] >> Keep also in mind that threads are not very scalable (they are meant to >> improve performance on a single cpu only and decrease performance on >> multiple cores in general), and since the number of cores will increase >> more and more in the near future, they might not be such a good choice. > > "Decrease performance on multiple cores in general"? But what about a > single-threaded > single process program ? It wouldn't benefit from multiple cores (since the > kernel > wouldn't schedule this program on more than one core at a time anyway), right > ? A well-designed multi-threaded program can scale better over multiple cores than a single-process, single-threaded program, yes :) What he's referring to is that in the general case, a multi-process program will scale better over increasingly large core counts than a multi-threaded program, because the shared address space of the threads tends to lead to memory/cache performance issues even on non-NUMA machines, and obviously on NUMA machines the problems get even worse when the threads that share active chunks of memory might want to be scheduled on several different NUMA nodes. However, a *carefully designed* multi-threaded program can avoid these sorts of memory issues in many cases, but then that gets back into another component of why "multithreading is (imho) extremely complicated". -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: [patch] event_loopbreak()
On Mon, Jul 6, 2009 at 3:22 PM, Antony Dovgal wrote: > Hello all. > > I'd like to propose a patch adding support for event_loopbreak(), > event_base_loopbreak() and ev_break(): > http://dev.daylessday.org/diff/libev_event_loopbreak.diff > > I didn't actually use or see libev until today, so bear with me if the patch > is wrong/incomplete. Does this accomplish something that ev_unloop() does not? ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: ev signal watchers
Todd Fisher wrote: The trouble I'm having is that the child processes stop receiving signals from the parent process when there is a lot of activity. I am able to resume signal processing by putting the whole process to sleep and waking it back up (cntrl+z fg)... This behavior seems odd and I'm wondering if by off chance anyone has 1. any suggestions about this queuing system design, e.g. the use of signals to communicate to a child process vs using a pipe? It's a little early in the morning for me to comment on the rest yet, but I will say that POSIX signals are only guaranteed to be delivered eventually; there is no guarantee on *when* they will be delivered. They're not a reliable means of realtime communication. ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Couple of bugs
On Wed, Feb 25, 2009 at 3:19 PM, Marc Lehmann wrote: > On Wed, Feb 25, 2009 at 10:37:36AM -0500, Steve Grubb > wrote: >> My bad...these do get executed. But I am tempted to define NDEBUG, though, to >> remove the extraneous text. So, I guess the comment about function calls in >> the assert expressions would be an issue. > > You are free to define NDEBUG if you want (this is documented, btw.). > > There are no known issues with function calls in asserts so far... > FWIW, I've been using libev embedded in my production application, compiled with NDEBUG, for months now without running into any issues. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Couple of bugs
On Sun, Feb 22, 2009 at 12:21 PM, Steve Grubb wrote: > Hi, > > I am using libev as part of the Linux audit daemon. I have done some testing > with 3.53 recently and run across a couple of "issues". > I'm not sure about your __STDC_VERSION__ issue, but as for the other two being pointed out by valgrind: The invalid free is because you're calling ev_loop_destroy on the default loop. Check the docs for ev_default_loop(), ev_default_destroy(), ev_loop_new(), and ev_loop_destroy(). The "leak" isn't really a leak. It's a data structure global to libev that's allocated at startup and never freed, but it doesn't grow. This is also covered in the docs near the end. The docs are linked from the main libev site. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: How do I cancel a timer?
James Mansion wrote: ryah dahl wrote: The mistake checking you want could not be added without very much overhead That's not necessarily true. Just have a signal byte at a known offset and require it to be 0 to do the initialisation, and the initialisation will change the signal. This is low overhead on users and runtime - but it IS an API change. To say it's not possible in general is wrong, though. And, to be honest, having an *optional* check that looks in memory and says 'if it looks initialised, then it IS initialised' and accepts a probabilistic false positive, is also cheap and unlikely to be a problem in reality while catching a class of bug. (Though, you might want to increase the size of the structure slightly to have a magic flag - 64 bits or so perhaps). The problem with an initialization flag byte is that this then requires that the memory being passed to the init function be zeroed (or at least, that the flag byte within it be zeroed), whereas with the current simpler scheme, an application may be managing its own local pools of memory and reusing old chunks of memory without zeroing them before calling the initialization function. So again, there's a performance loss there of the application needing to zero the memory before each re-use for that scheme to work (not to mention that failing to zero the memory before reuse, or zeroing the wrong thing before reusing the right one, is basically going to fall into the same class of application programming bug/mistake as the problem we're trying to avoid in the first place). ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: multiple signals per watcher
Alejandro Mery wrote: Hello, I have an ev_signal watcher upon SIGTERM|SIGINT, but I just noticed ^C is not being caught by this watcher, only SIGTERM... are "multiple" signals on the same watcher supported by libev? You need to add one watcher per signal, but they can have the same callback function. If you look at "kill -l", you will see that signal numbers are not a bitmask anyways. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: Libev leaks memory
hsanson wrote: I am using libev to implement a media server and so far everything is smooth and easy. The only problem that I found is that libev seems to leak memory. [...]

==30474== 256 bytes in 1 blocks are still reachable in loss record 1 of 1
==30474==    at 0x4022AB8: malloc (vg_replace_malloc.c:207)
==30474==    by 0x4022BFC: realloc (vg_replace_malloc.c:429)
==30474==    by 0x4041357: ev_realloc_emul (ev.c:377)
==30474==    by 0x40429B1: array_realloc (ev.c:394)
==30474==    by 0x404408B: ev_signal_start (ev.c:2118)
==30474==    by 0x4044170: ev_default_loop_init (ev.c:1485)
==30474==    by 0x8049660: ev_default_loop (ev.h:433)
==30474==    by 0x8049630: main (video_server.c:9)
==30474==

[...] I wouldn't worry about it too much. Those 256 bytes are "leaked" just once, so it's not a real leak, it's more like "necessary memory that can't be easily free()'d up at exit". -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: new 4-heap code - please give it a try
Brandon Black wrote: Brandon Black wrote: Marc Lehmann wrote: I just committed code to use a 4-heap instead of a 2-heap to libev. [...] I've just tried it against my code, looks pretty good from here. My test suite passes with the new code, so no breakage. I spoke too soon (I guess my test suite still needs some work), I'm having a timer-related issue with cvs that didn't happen with 3.31. Specifically, if (in a single event loop) I start a repeating timer of 20 seconds, and then a one-shot timer of 3 seconds, they both fire at the 20-second mark. I'm still in the process of making sure I didn't do something dumb to cause this, and making a simpler piece of test code for it, but just a heads up in case you have an obvious answer. Here's some simple demo code that seems to exhibit the issue I'm seeing with cvs's timers, using two single-shot timers of 3 and 10 seconds. Switching the order of the ev_timer_start()s makes it behave better.

--
#include <stdio.h>
#include <time.h>
#include "ev.h"

static void ten_cb(struct ev_loop* loop, struct ev_timer* t, int revents) {
    fprintf(stderr, "The 10-sec timer fired @ %li\n", time(NULL));
    ev_unloop(loop, EVUNLOOP_ALL);
}

static void three_cb(struct ev_loop* loop, struct ev_timer* t, int revents) {
    fprintf(stderr, "The 3-sec timer fired @ %li\n", time(NULL));
    ev_unloop(loop, EVUNLOOP_ALL);
}

int main(int argc, char* argv[]) {
    struct ev_timer ten;
    struct ev_timer three;
    struct ev_loop* def_loop = ev_default_loop(EVFLAG_AUTO);
    ev_timer_init(&ten, &ten_cb, 10., 0.);
    ev_timer_init(&three, &three_cb, 3., 0.);
    ev_timer_start(def_loop, &ten);
    ev_timer_start(def_loop, &three);
    fprintf(stderr, "Starting loop @ %li\n", time(NULL));
    ev_loop(def_loop, 0);
    return 0;
}

___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: new 4-heap code - please give it a try
Brandon Black wrote: Marc Lehmann wrote: I just committed code to use a 4-heap instead of a 2-heap to libev. [...] I've just tried it against my code, looks pretty good from here. My test suite passes with the new code, so no breakage. I spoke too soon (I guess my test suite still needs some work), I'm having a timer-related issue with cvs that didn't happen with 3.31. Specifically, if (in a single event loop) I start a repeating timer of 20 seconds, and then a one-shot timer of 3 seconds, they both fire at the 20-second mark. I'm still in the process of making sure I didn't do something dumb to cause this, and making a simpler piece of test code for it, but just a heads up in case you have an obvious answer. -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: new 4-heap code - please give it a try
Marc Lehmann wrote: I just committed code to use a 4-heap instead of a 2-heap to libev. [...] I would be mainly interested in feedback of the type "it doesn't crash" or "it seems to work correctly", but wouldn't mind performance numbers, either :) I'll do some further benchmarks mainly with lower number of watchers, but so far, it seems it has no detrimental effect on the performance in that case. I've just tried it against my code, looks pretty good from here. My test suite passes with the new code, so no breakage. The only performance-sensitive parts of my code that really exercise libev only use it for a small handful of watchers for UDP sockets (in the benchmark case, 3 UDP sockets). Even then, the runtime is dominated by things other than libev (mostly memory compare/copy operations). On my dev box (Linux, 1.6Ghz Sempron), I was able to discern a small performance difference, but it's near the limit of what I can reliably see given the normal random variances in the benchmark runs, so don't put too much faith in them. It definitely didn't seem to hurt though, even in my "low number of watchers" case. These are average seconds for the benchmark run (which will generate 1,000,000 socket events on one of the three sockets) for the two different versions, using EV_MINIMAL or not, sorted by runtime: 3.33 NORMAL: 17.54 3.31 EV_MINIMAL: 17.88 3.33 EV_MINIMAL: 18.49 3.31 NORMAL: 18.49 -- Brandon ___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Trivial warnings nit
On targets without EV_USE_INOTIFY, lack of this #if wrapper results in: libev/ev.c:1288: warning: ‘infy_fork’ declared ‘static’ but never defined At least for me, using it embedded with my compiler/cflags/etc. (which include -Werror, which is why it's a pain when I compile on targets with pre-inotify glibc). For whatever reason the gcc pragma "system_header" I'm using to suppress the rest of the libev warnings doesn't suppress this one.

=== libev/ev.c ==
--- libev/ev.c  (revision 595)
+++ libev/ev.c  (local)
@@ -1285,7 +1285,9 @@
   backend = 0;
 }

+#if EV_USE_INOTIFY
 void inline_size infy_fork (EV_P);
+#endif

 void inline_size loop_fork (EV_P)

___ libev mailing list libev@lists.schmorp.de http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: valgrind stuff
Marc Lehmann wrote: On Wed, Apr 09, 2008 at 06:15:27PM -0500, Brandon Black <[EMAIL PROTECTED]> wrote: Regardless, if poll() returns zero there's no point in examining revents anyways right? The extra test might slow you down when you need it most, when you are in a tight loop handling I/O events, especially when it is hard to predict. That makes the whole issue moot. Well, I'd rather have bugs fixed, especially in tools like valgrind that I use myself. You might not care about the quality of the software you use, but I do. And when I use valgrind, I expect useful and good results, not false positives. Think about how much time you and I have already potentially wasted on this valgrind bug. This wouldn't have been necessary if valgrind were fixed. And yes, you might not care, but I am happy when _other_ people who find out about these issues get the real bug fixed, so _I_ do not have to waste time investigating them. With your attitude, everybody would just slap workarounds around kernel bugs and everybody would have to reinvent those workarounds again and again. I have different standards. No need to get personal with attacks on my "standards". If it's a valgrind bug then fine, it's a valgrind bug. If anything you gain some performance by not scanning the poll_fd's in the case of a timeout with no events occurring. Micro-optimisations like these often don't have the intended effect, though, you have to look at the larger picture. Besides, the point is not avoiding code changes to libev - if you had looked, you would have seen that I already added a workaround, despite having no indication of this being a real problem. The point is a) fixing a real bug in a tool people rely on (valgrind) and b) helping others by exploring and fixing the issue, instead of blindly applying some kludge that might not even be necessary and letting others stumble upon the same problems. 
Look, either the change you applied to skip scanning revents on a retval of zero is a "blind kludge" or an optimization that happens to work around a valgrind bug. If you think it's a blind kludge and its optimization value is dubious (or even negative), then by all means don't apply the workaround; nobody's forcing you.

And yes, you might not care, but that doesn't mean it's good if other people care for the quality of their and others' software. Free software lives from people who *do* care about bugs; otherwise free software wouldn't have the high quality that it (usually) has.

All I've done here is seen valgrind output flagging a potential problem and tried my best to follow up on what it means. I don't see how that is any indication that I lack standards.

-- Brandon
Re: valgrind stuff
Marc Lehmann wrote:

http://lists.freedesktop.org/archives/dbus/2006-September/005724.html

In case anyone wonders, quoting from gnu/linux manpages is a very bad idea; they are often very wrong and in contradiction to both reality and the unix specification. The unix specification says:

In each pollfd structure, poll() clears the revents member except that where the application requested a report on a condition by setting one of the bits of events listed above, poll() sets the corresponding bit in revents if the requested condition is true. If none of the defined events have occurred on any selected file descriptor, poll() waits at least timeout milliseconds for an event to occur on any of the selected file descriptors.

Without contradiction, revents *is* cleared by any successful call to poll.

Upon successful completion, poll() returns a non-negative value.

A return value of 0 is non-negative, and therefore the call was successful. Again, one should report this as a bug in valgrind; flagging correct programs is a bug in valgrind, and nothing else. And so far, I have seen zero evidence of this being a problem in practice, although I admit I haven't tested this very widely. However, even if real systems do not implement poll like above, this is likely a bug in those systems as well (as usually they follow sus), and should be reported. I am strictly against implementing kludges on top of bugs instead of fixing them, whenever possible.

Regardless, if poll() returns zero there's no point in examining revents anyways, right? That makes the whole issue moot. If anything you gain some performance by not scanning the poll_fd's in the case of a timeout with no events occurring.

-- Brandon
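As an editorial aside, the SUS wording quoted above is easy to probe empirically. The sketch below is not libev code and `poll_timeout_experiment` is a made-up name: it poisons revents, polls the read end of an empty pipe with a short timeout, and hands back both poll()'s return value and the resulting revents. Per the specification the call should return 0 (a successful, non-negative return) with revents cleared; whether every kernel really clears it on timeout is precisely the contested point in this thread.

```c
#include <poll.h>
#include <unistd.h>

/* Poison revents, then poll() a pipe with nothing to read and a 10ms
 * timeout.  Returns poll()'s result (0 = timeout, per SUS a *successful*
 * call) and stores the post-call revents in *revents_out.  SUS says it
 * should come back cleared; Linux does clear it, other kernels may not. */
int
poll_timeout_experiment (short *revents_out)
{
  int fds[2];
  if (pipe (fds) < 0)
    return -1;

  struct pollfd p;
  p.fd      = fds[0];
  p.events  = POLLIN;
  p.revents = (short) 0x7f7f;   /* deliberately bogus */

  int res = poll (&p, 1, 10);   /* 10ms timeout, pipe is empty */

  *revents_out = p.revents;
  close (fds[0]);
  close (fds[1]);
  return res;
}
```

On a system that follows the specification, the experiment returns 0 and the poisoned revents comes back as 0.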
Re: valgrind stuff
Marc Lehmann wrote:

On Wed, Apr 09, 2008 at 04:40:23PM -0500, Brandon Black <[EMAIL PROTECTED]> wrote:

Here's an interesting thread from another list regarding this exact same problem: http://lists.freedesktop.org/archives/dbus/2006-September/005724.html I'm still a little hazy on what the "right" answer is here, as I'm not a poll() expert, but something like this may be the answer:

poll() is required to clear the revents field. however, if any real-world systems don't do that on timeout, I'd be happy to include a workaround, the cost would be negligible. (in fact, I might add it in any case, but please, could somebody find out a bit more evidence? thanks :)

My reading of that other thread I quoted is basically (a) the standard is self-contradictory, (b) some systems definitely don't clear revents when their poll() returns zero (which is what I'm seeing), and (c) there's really no point looking at revents when poll returns zero, as this means there are zero events to look at anyways.

-- Brandon
Re: valgrind stuff
Brandon Black wrote:

Sorry, maybe I wasn't clear. Those error messages I pasted are a completely separate issue from the realloc thing. I tracked it down now to the point that I can interpret the messages (and exactly when they occur) as meaning: when poll_poll() is called, the 0th member of the poll set, which is the read side of the pipe that's set up for signals and such, has an "revents" member which is uninitialized memory, and poll_poll() is making decisions based on reading that uninitialized memory. The reason I don't see any worse behavior is that even though it's "uninitialized", being an early allocation it tends to be zero'd out anyways by chance.

Here's an interesting thread from another list regarding this exact same problem:

http://lists.freedesktop.org/archives/dbus/2006-September/005724.html

I'm still a little hazy on what the "right" answer is here, as I'm not a poll() expert, but something like this may be the answer:

diff -u -r1.21 ev_poll.c
--- ev_poll.c	25 Dec 2007 07:05:45 -	1.21
+++ ev_poll.c	9 Apr 2008 21:38:51 -
@@ -101,6 +101,8 @@
       return;
     }

+  if (res == 0) return;
+
   for (i = 0; i < pollcnt; ++i)
     if (expect_false (polls [i].revents & POLLNVAL))
       fd_kill (EV_A_ polls [i].fd);
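The intent of the hunk above can be shown as a standalone sketch; `count_ready` and its arguments are hypothetical names, not libev's internals. When poll() reports zero ready descriptors there is nothing to dispatch, so the revents scan can be skipped entirely, which also avoids reading revents on systems that might leave it stale on timeout:

```c
#include <poll.h>

/* Count fds with a pending event, with the proposed early-out: if poll()
 * returned 0 (timeout), skip the scan -- nothing is ready, and any stale
 * garbage left in revents by a non-conforming poll() is never read. */
int
count_ready (struct pollfd *polls, int pollcnt, int res)
{
  int ready = 0;

  if (res == 0)                 /* timeout: no events, nothing to scan */
    return 0;

  for (int i = 0; i < pollcnt; ++i)
    if (polls[i].revents)
      ++ready;

  return ready;
}
```

This doubles as the micro-optimisation discussed later in the thread: on a pure timeout the loop body is never entered at all.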
Re: valgrind stuff
Brandon Black wrote:

Sorry, maybe I wasn't clear. Those error messages I pasted are a completely separate issue from the realloc thing. I tracked it down now to the point that I can interpret the messages (and exactly when they occur) as meaning: when poll_poll() is called, the 0th member of the poll set, which is the read side of the pipe that's set up for signals and such, has an "revents" member which is uninitialized memory, and poll_poll() is making decisions based on reading that uninitialized memory. The reason I don't see any worse behavior is that even though it's "uninitialized", being an early allocation it tends to be zero'd out anyways by chance.

Well, now that I re-read what I just wrote, it finally sunk in that this is "revents", not "events", and it's probably(?) up to the kernel to initialize that. Valgrind must be at fault here somehow, I suspect.

-- Brandon
Re: valgrind stuff
Marc Lehmann wrote:

On Wed, Apr 09, 2008 at 10:49:51AM -0500, Brandon Black <[EMAIL PROTECTED]> wrote:

meaningful runtime leakage in practice). The first is that the signals array isn't ever deallocated. This fixed it for me locally:

that's not a problem, it does not leak (the signals array cannot be deallocated for various reasons, but it's not per-loop, so no leak).

It doesn't leak in the long term, because it's a one-shot thing, but it is an annoyance when you're trying to get clean valgrind output.

I'll add an exception for it I guess.

Also, the default ev_realloc() does a realloc to zero for allocs of size zero, instead of an actual free(), which results in valgrind reporting a bunch of leaks of size zero at exit. Glibc's docs say realloc() to zero

actually, realloc is defined to be free in that case in the C programming language. this is a known bug in valgrind. you might want to look if somebody has reported it yet and if not, do so.

Fair enough.

It happens just once, when I start up my default loop. Usually this indicates a real code bug that should be fixed, although given the deep magic of libev, it wouldn't surprise me terribly if the code was ok and valgrind just can't understand what's going on :)

Hmm, realloc (*, 0) to free memory is usually a code bug? That would be surprising; realloc was designed to be a full-blown memory manager function, and lots of programs use it. If memory is resized and later the pointer is accessed beyond the reallocated size, that would be a bug, of course, but that is a bug regardless of how large the size parameter was. realloc (*, 0) is a bug in the same cases where free(x) would be a bug, and free does not usually indicate a real code bug.

I haven't further tracked this down yet, but I will get around to it, just a heads up in case something obvious occurs to someone else.
It is never wrong to report what you see (well, if the reports have a certain minimum quality, like yours :), even if it's not a real issue; sometimes it should be documented to spare people some work.

Sorry, maybe I wasn't clear. Those error messages I pasted are a completely separate issue from the realloc thing. I tracked it down now to the point that I can interpret the messages (and exactly when they occur) as meaning: when poll_poll() is called, the 0th member of the poll set, which is the read side of the pipe that's set up for signals and such, has an "revents" member which is uninitialized memory, and poll_poll() is making decisions based on reading that uninitialized memory. The reason I don't see any worse behavior is that even though it's "uninitialized", being an early allocation it tends to be zero'd out anyways by chance.

-- Brandon
valgrind stuff
I've been running valgrind on some code with an embedded copy of libev. There are two things in libev that cause pointless valgrind memory leak warnings (pointless as in they don't constitute any kind of meaningful runtime leakage in practice). The first is that the signals array isn't ever deallocated. This fixed it for me locally:

Index: ev.c
===
RCS file: /schmorpforge/libev/ev.c,v
retrieving revision 1.223
diff -u -r1.223 ev.c
--- ev.c	6 Apr 2008 14:34:50 -	1.223
+++ ev.c	9 Apr 2008 15:27:23 -
@@ -1266,6 +1266,7 @@
 #if EV_ASYNC_ENABLE
   array_free (async, EMPTY);
 #endif
+  ev_free (signals);

   backend = 0;
 }

But I suspect that only works because I'm not using multiplicity; it probably needs some check for whether it's the default loop being destroyed, or something like that.

Also, the default ev_realloc() does a realloc to zero for allocs of size zero, instead of an actual free(), which results in valgrind reporting a bunch of leaks of size zero at exit. Glibc's docs say realloc() to zero is equivalent to free(), so I don't know why valgrind is reporting that to begin with. It's not too hard to silence these in various ways (like a custom reallocator that does an explicit free() and returns NULL on realloc to zero). Just FYI.
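The custom-reallocator idea mentioned in passing can be sketched like this (`my_realloc` is a hypothetical name; libev accepts such a callback via `ev_set_allocator`): mapping a size of zero to an explicit free() and returning NULL means valgrind never sees a realloc(ptr, 0) it might misreport.

```c
#include <stdlib.h>

/* Hypothetical allocator in the shape libev's ev_set_allocator() expects:
 * grows/shrinks via realloc(), but turns size 0 into an explicit free()
 * and returns NULL, sidestepping valgrind's zero-size "leak" reports. */
static void *
my_realloc (void *ptr, long size)
{
  if (size)
    return realloc (ptr, size);

  free (ptr);
  return NULL;
}
```

Installing it with `ev_set_allocator (my_realloc);` before creating the first loop routes all of libev's allocations through it.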
Memory "leak" stuff aside, valgrind is also giving me this:

==8341== Conditional jump or move depends on uninitialised value(s)
==8341==    at 0x804E52B: poll_poll (ev_poll.c:105)
==8341==    by 0x804F285: ev_loop (ev.c:1632)
==8341==    by 0x8049E17: main (main.c:135)
==8341==
==8341== Conditional jump or move depends on uninitialised value(s)
==8341==    at 0x804E563: poll_poll (ev_poll.c:108)
==8341==    by 0x804F285: ev_loop (ev.c:1632)
==8341==    by 0x8049E17: main (main.c:135)
==8341==
==8341== Conditional jump or move depends on uninitialised value(s)
==8341==    at 0x804E58F: poll_poll (ev_poll.c:108)
==8341==    by 0x804F285: ev_loop (ev.c:1632)
==8341==    by 0x8049E17: main (main.c:135)

[Those line numbers are in the 3.2 sources]

It happens just once, when I start up my default loop. Usually this indicates a real code bug that should be fixed, although given the deep magic of libev, it wouldn't surprise me terribly if the code was ok and valgrind just can't understand what's going on :)

I haven't further tracked this down yet, but I will get around to it, just a heads up in case something obvious occurs to someone else.

-- Brandon
little typo
On line 369 of ev.h in the latest CVS, EV_ASYNC_ENABLE is typo'd as EV_ASYND_ENABLE.

-- Brandon