I haven't used gdb in a bunch of years, and looking at the manual I don't see a way to continue a single thread. That makes my second suggestion silly unless there is something I missed (which is decidedly possible).
On Mon, Jul 18, 2011 at 9:05 AM, Ted Dunning <[email protected]> wrote:

> I have two suggestions that might or might not work.
>
> First, you can increase the timeouts to high values and also write a bit of
> code that can expire the session instantly. The ZK unit tests have examples
> of how to do this by opening a second connection with the same session id
> and then closing it. This has the effect of instantly expiring the original
> connection. You still have a bit of an education process here. This is
> high risk since the configuration file with the long timeouts will probably
> get checked in by mistake at some point. There might be a way to avoid this
> with a special startup option that over-rides the session length for just
> the one invocation.
>
> A second idea is that you might be able to define a gdb macro that is
> invoked when you hit a breakpoint and another that is invoked at continue
> time (or manually). The first macro would invoke a function to start or
> continue a background thread that can keep the heartbeats going. The second
> macro would kill that thread and restore normal operation. The ideal case
> would be to continue just the normal ZK heartbeat thread except that might
> cause notifications to be called in the background which could confuse the
> person doing the debugging.
>
> If you can make it work, the second approach would give you something
> approaching a normal debugging experience.
>
>
> On Mon, Jul 18, 2011 at 7:41 AM, Fournier, Camille F. <[email protected]> wrote:
>
>> ZooKeeper can't possibly know that you are in GDB unless you have a
>> special message that you send to the server that says "I'm in a debugger
>> now, please don't expire me". You might be able to hack something in to do
>> this, but do you really want to? I think the second idea is best. If you are
>> a developer working in any kind of multi-threaded distributed system, you
>> need to be aware that suspending all threads can lead to the remote parts of
>> your process failing. That's just professional distributed systems
>> development 101. This isn't unique to C, Java developers also have to choose
>> between suspending all threads during debugging and suspending only the
>> thread affected by the breakpoint.
>>
>> You can also split the difference between points one and two, namely, get
>> the message out to the developers that if they're working against ZK and
>> suspend all threads, they might end up losing their session, but when
>> working in an env that you expect to do a lot of debugging in (development,
>> QA), jack up the timeout so it happens less frequently.
>>
>> If you truly want to separate the process from its zookeeper heartbeating,
>> you could take a tip from the HBASE devs in
>> https://issues.apache.org/jira/browse/HBASE-1316. Because dealing with
>> timeouts is much more of an issue in large Java processes due to full GC,
>> they have experimented with various solutions that you might be able to
>> apply here in C.
>>
>> C
>>
>>
>> -----Original Message-----
>> From: Stephen Tyree [mailto:[email protected]]
>> Sent: Monday, July 18, 2011 10:07 AM
>> To: [email protected]
>> Subject: libzookeeper_mt and GDB
>>
>> Hello All,
>>
>> I've been using Zookeeper at my place of work for a few months now
>> successfully, but there has been a lingering issue I haven't been able
>> to solve without issue. Namely, when using GDB with libzookeeper_mt,
>> once you hit a breakpoint, the program you're running essentially has
>> until the session timeout to continue onward or its session will be
>> expired. This is a pain in the butt when using ephemeral znodes, but in
>> my case those ephemeral znodes are tied to locks which means losing them
>> is bad news. I've tried a number of different ideas to solve this issue,
>> and all of them have varying degrees of success.
>>
>> The first idea I had was jacking up the session timeouts, which
>> obviously works. This extends the time you have at any given breakpoint
>> to figure out the issue and move onward, but comes at the expense of
>> ephemeral znodes living for much longer than they reasonably should when
>> the program crashes (something that is likely to be an issue if you're
>> using GDB). In the case of locking, those znodes which hang around for a
>> while have negative consequences on the performance of the system. This
>> is how we currently deal with the issue.
>>
>> The second idea was to instruct all developers at my job to use GDB
>> non-stop mode for debugging. This works, since GDB would only stop the
>> thread which hit a breakpoint in this mode, but runs into the issue that
>> I need to change the development habits of hundreds of engineers just to
>> save myself the trouble. Ideally Zookeeper would function with GDB in
>> whatever mode you felt like using.
>>
>> The third idea was decidedly more intricate. Essentially I spawn a
>> subprocess which uses the exact same session I do, but only holds onto
>> that session while the parent process is unresponsive (at a breakpoint
>> probably). This essentially locks your session while at breakpoints, but
>> has no impact while not at breakpoints. The only caveat to this approach
>> is the transition between breakpoints and non-breakpoints. Since the
>> server last saw the session in the subprocess, it doesn't send heartbeat
>> messages to the parent process. This means it's up to the parent process
>> to send PING messages to the server in order to reestablish the session,
>> but this only happens at 1/3 of the session timeout (which is too long).
>>
>> Whatever the case, a simple, generic solution would be ideal for this
>> situation. It might be as simple as allowing configurable PING messages
>> (for the third solution) or it might be as frustrating as creating a
>> Zookeeper service which runs outside of the process (thus bypassing
>> GDB's breakpoints). Any ideas?
>>
>> Thanks,
>> Stephen Tyree
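
For what it's worth, the "special startup option that over-rides the session length for just the one invocation" from the first suggestion could look roughly like the sketch below. This is only an illustration under assumptions: the environment variable name ZK_DEBUG_SESSION_TIMEOUT_MS and both helper functions are made up for this example and are not part of the ZooKeeper C client. The idea is that the checked-in configuration keeps its normal timeout, and only a developer who sets the variable for one debugging run gets the long one. The second helper just restates the 1/3 figure from Stephen's message.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical variable name, chosen for this sketch only.
 * The ZooKeeper C client does not read it on its own. */
#define DEBUG_TIMEOUT_ENV "ZK_DEBUG_SESSION_TIMEOUT_MS"

/*
 * Return the session timeout (in milliseconds) to request for this run.
 * The configured value is used unless DEBUG_TIMEOUT_ENV holds a positive
 * integer that fits in an int; malformed or out-of-range overrides are
 * ignored so a typo cannot silently change production behavior.
 */
static int effective_session_timeout_ms(int configured_ms)
{
    const char *raw = getenv(DEBUG_TIMEOUT_ENV);
    char *end = NULL;
    long value;

    if (raw == NULL || *raw == '\0')
        return configured_ms;

    errno = 0;
    value = strtol(raw, &end, 10);
    if (errno != 0 || *end != '\0' || value <= 0 || value > INT_MAX)
        return configured_ms;

    return (int)value;
}

/*
 * Approximate interval at which the client pings the server, using the
 * "1/3 of the session timeout" figure quoted in the thread.
 */
static int ping_interval_ms(int session_timeout_ms)
{
    return session_timeout_ms / 3;
}

/*
 * Intended use (assumption): pass the result as the recv_timeout
 * argument of zookeeper_init(), e.g.
 *
 *   int timeout = effective_session_timeout_ms(10000);
 *   zh = zookeeper_init(hosts, watcher, timeout, NULL, NULL, 0);
 */
```

A developer would then start one debugging run with something like ZK_DEBUG_SESSION_TIMEOUT_MS=600000 in the environment, and nothing in the checked-in configuration changes. Keep in mind that the server negotiates the timeout and may cap the requested value, so the server-side maximum may need raising in development and QA environments as well.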
