My best guess is that you are not using a consistent version of libevent when compiling, linking, and running.
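To illustrate why a mismatch could look like this, here is a minimal sketch. It assumes the libevent headers used at compile time describe a smaller struct event than the library loaded at run time expects. That is only a guess at the mechanism and has not been confirmed for this build. All struct layouts and function names below are hypothetical stand-ins, not libevent's or Thrift's real definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layout the application was compiled against (older header).
struct EventAsCompiled {
  int fd;
  short events;
  void* base;
};

// Hypothetical layout the loaded library was built with: the same leading
// fields plus extra ones appended at the end.
struct EventAsLinked {
  int fd;
  short events;
  void* base;
  int pri;
  int flags;
};

// Stand-in for TConnection: the event is followed directly by the state.
struct Connection {
  EventAsCompiled event_;
  int appState_;
};

// The demonstration relies on the library's extra field landing exactly on
// appState_, which holds on common 32-bit and 64-bit ABIs.
static_assert(offsetof(EventAsLinked, pri) == offsetof(Connection, appState_),
              "layout assumption for this sketch does not hold here");

// Stand-in for a library call such as event_base_set(): it stores into the
// field that it believes lives at offsetof(EventAsLinked, pri).
inline void libraryStoresPriority(Connection* conn, int value) {
  const std::size_t offset = offsetof(EventAsLinked, pri);
  assert(offset + sizeof(value) <= sizeof(Connection));
  unsigned char* bytes = reinterpret_cast<unsigned char*>(conn);
  std::memcpy(bytes + offset, &value, sizeof(value));
}

// Returns the connection state as seen by the application after the
// simulated library call has written through its larger view of the event.
inline int stateAfterLibraryCall(int initialState, int storedValue) {
  Connection conn{};
  conn.appState_ = initialState;
  libraryStoresPriority(&conn, storedValue);
  return conn.appState_;
}
```

In this sketch the application never assigns appState_ after initialization, yet the value changes as soon as the simulated library call runs. If something similar is happening here, comparing the libevent headers on the include path at build time with the shared library the binary resolves at run time (for example with ldd) should show the mismatch.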
--David (mobile)

On Jan 20, 2010, at 5:28 PM, "Nathan Marz" <[email protected]> wrote:

> OK... jumped into gdb and here's what I found:
>
> (gdb) s
> 483 event_set(&event_, socket_, eventFlags_, TConnection::eventHandler, this);
> (gdb) p appState_
> $8 = apache::thrift::server::APP_INIT
> (gdb) s
> 484 event_base_set(server_->getEventBase(), &event_);
> (gdb) p appState_
> $9 = 128
> (gdb) s
> 487 if (event_add(&event_, 0) == -1) {
> (gdb) p appState_
> $10 = 128
> (gdb) s
> 490 }
> (gdb) p appState_
> $11 = 130
>
> It appears to be getting corrupted twice, once during "event_base_set" and
> once during "event_add". Any ideas?
>
> On Wed, Jan 20, 2010 at 4:03 PM, David Reiss <[email protected]> wrote:
>
>> So you're saying that this happens on the first received message?
>> Should be relatively easy to debug.
>>
>> 1/ Make a debug build of Thrift and Scribe.
>> 2/ Put a breakpoint in the constructor of TConnection.
>> 3/ When the breakpoint hits, get the address of the appState_.
>> 4/ Put a watchpoint on the contents of that address. If possible,
>>    make it conditional on the new value not being one of the valid
>>    enum values.
>> 5/ Continue.
>> 6/ When the watchpoint triggers (and is not a valid enum), do a backtrace
>>    to find out how it was corrupted. Usually it is a memory error.
>>
>> If it is a memory error, it might be more efficient to just run it under
>> valgrind.
>>
>> --David
>>
>> Nathan Marz wrote:
>>> Could use some help on this one. I'm running into this error when using
>>> scribe, and I traced the error back to TNonblockingServer. Here's the tail
>>> of the log:
>>>
>>> Thrift: Wed Jan 20 23:11:06 2010 libevent 1.3e method epoll
>>> Thrift: Wed Jan 20 23:14:08 2010 Totally Fucked. Application State 130
>>> scribed: src/server/TNonblockingServer.cpp:430: void
>>> apache::thrift::server::TConnection::transition(): Assertion `0' failed.
>>>
>>> In the code, this message is printed whenever a switch statement doesn't
>>> match any of the cases.
>>>
>>> I have scribe set up to have a "master" log server which aggregates all
>>> logs, and the "client" servers simply forward messages to the master.
>>> The clients work fine, it's the master that is crashing whenever it receives
>>> a message. In case it's helpful, here are my scribe confs for master/client:
>>>
>>> master:
>>>
>>> port=1464
>>>
>>> <store>
>>> category=default
>>> type=file
>>> rotate_period=hourly
>>> add_newlines=1
>>> create_symlink=yes
>>> file_path=/vol/scribe
>>> base_filename=thisisoverwritten
>>> fs_type=std
>>> </store>
>>>
>>> client:
>>>
>>> port=1464
>>>
>>> <store>
>>> category=default
>>> type=buffer
>>>
>>> target_write_size=20480
>>> max_write_interval=1
>>> buffer_send_rate=1
>>> retry_interval=120
>>> retry_interval_range=60
>>>
>>> <primary>
>>> type=network
>>> remote_host=XXX
>>> remote_port=1464
>>> </primary>
>>>
>>> <secondary>
>>> type=file
>>> fs_type=std
>>> file_path=/mnt/scribe
>>> base_filename=thisisoverwritten
>>> max_size=300000000
>>> </secondary>
>>> </store>
>
> --
> Nathan Marz
> Twitter: @nathanmarz
> http://nathanmarz.com
