Re: more amd hangs: problem really in syslog?
Mike Smith wrote:

> > On Tue, 13 Jul 1999, Mike Smith wrote:
> >
> > > 'siobi' is someone trying to open the serial console, for whatever reason. Without knowing who it was that was stuck there, it's hard to guess what is going on.
> >
> > D'uh, sorry. Long day. It was amd that was hung in the siobi state. No way to clear it without rebooting the box.
>
> Dang. Now I need that stack dump from amd that you posted and I deleted.

OK, sent under different cover.

> Specifically, it'd be handy to know why amd felt it was necessary to open the console.

Yeah, I'm kind of curious myself.

BTW, I was going to work on this some more today, but the boss thought that putting the box into production was more important. The good news is that under real world load my freebsd box had 20-40% free cpu and a load average of 1.5. With load as equal as the switch could make it, the linux box had no free cpu and a load average of 8. :) I also (finally) got the approval to install freebsd on the fourth box (there are already two linux machines up) so A) I'm making progress in the office, and B) I should have a chance to pound on the syslog stuff tomorrow.

Happy,

Doug

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-hackers" in the body of the message
Re: more amd hangs: problem really in syslog?
> On Tue, 13 Jul 1999, Mike Smith wrote:
>
> > 'siobi' is someone trying to open the serial console, for whatever reason. Without knowing who it was that was stuck there, it's hard to guess what is going on.
>
> D'uh, sorry. Long day. It was amd that was hung in the siobi state. No way to clear it without rebooting the box.

Dang. Now I need that stack dump from amd that you posted and I deleted.

Specifically, it'd be handy to know why amd felt it was necessary to open the console.

--
\\  The mind's the standard       \\  Mike Smith
\\  of the man.                   \\  msm...@freebsd.org
\\    -- Joseph Merrick           \\  msm...@cdrom.com
Re: more amd hangs: problem really in syslog?
On Wed, Jul 14, 1999 at 10:56:05PM -0700, Mike Smith wrote:

> > On Tue, 13 Jul 1999, Mike Smith wrote:
> >
> > > 'siobi' is someone trying to open the serial console, for whatever reason. Without knowing who it was that was stuck there, it's hard to guess what is going on.
> >
> > D'uh, sorry. Long day. It was amd that was hung in the siobi state. No way to clear it without rebooting the box.
>
> Dang. Now I need that stack dump from amd that you posted and I deleted.
>
> Specifically, it'd be handy to know why amd felt it was necessary to open the console.

http://www.egroups.com/group/freebsd-hackers/40590.html

Greg
--
Gregory S. Sutter                 My reality check just bounced.
mailto:gsut...@pobox.com
http://www.pobox.com/~gsutter/
PGP DSS public key 0x40AE3052
Re: more amd hangs: problem really in syslog?
:
:	So I started thinking that maybe the problem was actually in
:syslog (or amd's interface to it). So I disabled the following two options
:in my amd.conf file:
:
:log_file =              syslog:local7
:log_options =           all
:
:	And lo and behold, it worked like a charm. I was able to run my
:conf-building script for my web server 20 times in a row with no ill
:effects. Previously the best I could do was 3 times before it hung.
:
:	After confirming that it worked with no logging, I tried enabling
:logging to a regular file, and that also worked like a charm. After
:turning syslog style logging back on, it locked up cold, with a very
:similar traceback.
:
:	If anyone wants to work on this, let me know.
:
:Doug

    Are you syslogging to the console by any chance?  Try messing around
    with /etc/syslog.conf and see if just plain file logging prevents the
    lockup -- you could even try turning off all logging (but leaving syslog
    running, i.e. turning it into a sink-null) to see if that has an effect.

					-Matt
					Matthew Dillon
					[EMAIL PROTECTED]
Re: more amd hangs: problem really in syslog?
> After pounding on this some more with today's -current (prior to the MNT_ASYNC flag change) I got a lot more lockups that looked like this:
>
> On Mon, 12 Jul 1999, Doug wrote:
>
> > Ok, got another hang in "siobi" state (this time after it successfully completed the script). Here is the trace:

'siobi' is in sioopen() in the sio driver. The callout device is already open, but the caller is trying to open it in blocking mode.

It'd be useful to know what is hanging in 'siobi' here, since trying to re-open the console is a bit of a suspicious action.

--
\\  The mind's the standard       \\  Mike Smith
\\  of the man.                   \\  [EMAIL PROTECTED]
\\    -- Joseph Merrick           \\  [EMAIL PROTECTED]
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999, Matthew Dillon wrote:

> :	So I started thinking that maybe the problem was actually in
> :syslog (or amd's interface to it). So I disabled the following two options
> :in my amd.conf file:
> :
> :log_file =              syslog:local7
> :log_options =           all
> :
> :	And lo and behold, it worked like a charm. I was able to run my
> :conf-building script for my web server 20 times in a row with no ill
> :effects. Previously the best I could do was 3 times before it hung.
> :
> :	After confirming that it worked with no logging, I tried enabling
> :logging to a regular file, and that also worked like a charm. After
> :turning syslog style logging back on, it locked up cold, with a very
> :similar traceback.
> :
> :	If anyone wants to work on this, let me know.
> :
> :Doug
>
> Are you syslogging to the console by any chance?

Here is syslog.conf:

*.err;kern.debug;auth.notice;mail.crit            /dev/console
*.notice;kern.debug;lpr.info;mail.crit;news.err   /var/log/messages
mail.info                                         /var/log/maillog
lpr.info                                          /var/log/lpd-errs
cron.*                                            /var/cron/log
*.err                                             root
*.notice;news.err                                 root
*.alert                                           root
*.emerg                                           *

local7.*                                          /var/log/amd.log

Basically, it's what comes with the system plus that line for local7. I am using a serial console setup for this box, but as far as I could see from the logs amd did generate there were no events at *.err priority, or to the kern facility, so nothing should have been printed to the serial console. Also, just in case it matters I start syslogd with -svv flags in rc.conf.

> Try messing around with /etc/syslog.conf and see if just plain file logging prevents the lockup -- you could even try turning off all logging (but leaving syslog running, i.e. turning it into a sink-null) to see if that has an effect.

I have to admit that you lost me here. Normal syslog stuff is working just fine (where "normal" is freebsd system stuff), it's amd that locks up. It's been kind of a hectic day here, in addition to this problem so I might just be a little dense. Can you explain in more detail what you'd like me to try?

Thanks,

Doug
--
On account of being a democracy and run by the people, we are the only nation in the world that has to keep a government four years, no matter what it does.
	-- Will Rogers
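Editor's note: Doug's reasoning above (that amd's local7 messages below err should match only the local7.* line) can be checked with a rough sketch like the following. It is a deliberately simplified model of syslog.conf selector matching, not syslogd's actual parser: it treats a level as "this severity or worse", and it ignores the "none" level, the comparison flags, and the rule that a later selector on the same line can override an earlier one. The names `RULES`, `selector_matches` and `actions_for` are made up for illustration.

```python
# Severity names from most to least severe, as used in syslog.conf.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# The (selector, action) pairs from the syslog.conf quoted in the message.
RULES = [
    ("*.err;kern.debug;auth.notice;mail.crit", "/dev/console"),
    ("*.notice;kern.debug;lpr.info;mail.crit;news.err", "/var/log/messages"),
    ("mail.info", "/var/log/maillog"),
    ("lpr.info", "/var/log/lpd-errs"),
    ("cron.*", "/var/cron/log"),
    ("*.err", "root"),
    ("*.notice;news.err", "root"),
    ("*.alert", "root"),
    ("*.emerg", "*"),
    ("local7.*", "/var/log/amd.log"),
]


def selector_matches(selector, facility, level):
    """Simplified: true if any facility.level part covers the message."""
    for part in selector.split(";"):
        fac, _, lvl = part.partition(".")
        if fac not in ("*", facility):
            continue
        # A level in a selector means "this severity or more severe".
        if lvl == "*" or LEVELS.index(level) <= LEVELS.index(lvl):
            return True
    return False


def actions_for(facility, level, rules=RULES):
    """Return the actions whose selector matches the given message."""
    return [action for selector, action in rules
            if selector_matches(selector, facility, level)]


print(actions_for("local7", "info"))
print(actions_for("local7", "err"))
```

Under this simplified model a local7.info message is routed only to /var/log/amd.log, while a local7.err message would also reach /dev/console, which is consistent with Doug's expectation that syslogd itself should not have written amd's messages to the serial console.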
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999, Mike Smith wrote:

> > After pounding on this some more with today's -current (prior to the MNT_ASYNC flag change) I got a lot more lockups that looked like this:
> >
> > On Mon, 12 Jul 1999, Doug wrote:
> >
> > > Ok, got another hang in "siobi" state (this time after it successfully completed the script). Here is the trace:
>
> 'siobi' is in sioopen() in the sio driver. The callout device is already open, but the caller is trying to open it in blocking mode.
>
> It'd be useful to know what is hanging in 'siobi' here, since trying to re-open the console is a bit of a suspicious action.

I'm using a serial console, but I directed local7 to a file in syslog.conf. But from what you're saying it sounds like the serial console is a suspect?

Doug
Re: more amd hangs: problem really in syslog?
After pounding on this some more with today's -current (prior to the MNT_ASYNC flag change) I got a lot more lockups that looked like this:

On Mon, 12 Jul 1999, Doug wrote:

> Ok, got another hang in "siobi" state (this time after it successfully completed the script). Here is the trace:

(gdb) file /usr/sbin/amd
Reading symbols from /usr/sbin/amd...done.
(gdb) attach 155
Attaching to program: /usr/sbin/amd, process 155
0x8063dc4 in open ()
(gdb) where
#0  0x8063dc4 in open ()
#1  0x806b5c3 in vsyslog (pri=6, fmt=0x809279a "%s", ap=0xbfbfb240 "X")
    at /usr/src/lib/libc/../libc/gen/syslog.c:262
#2  0x806b2c2 in syslog (pri=6, fmt=0x809279a "%s")
    at /usr/src/lib/libc/../libc/gen/syslog.c:130
#3  0x805a3d8 in real_plog (lvl=6, fmt=0x8091ea0 "recompute_portmap: NFS version %d", vargs=0xbfbfba7c "\002")
    at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:443
#4  0x805a2be in plog (lvl=16, fmt=0x8091ea0 "recompute_portmap: NFS version %d")
    at /usr/src/usr.sbin/amd/libamu/../../../contrib/amd/libamu/xutil.c:383

So I started thinking that maybe the problem was actually in syslog (or amd's interface to it). So I disabled the following two options in my amd.conf file:

log_file =              syslog:local7
log_options =           all

And lo and behold, it worked like a charm. I was able to run my conf-building script for my web server 20 times in a row with no ill effects. Previously the best I could do was 3 times before it hung.

After confirming that it worked with no logging, I tried enabling logging to a regular file, and that also worked like a charm. After turning syslog style logging back on, it locked up cold, with a very similar traceback.

If anyone wants to work on this, let me know.

Doug
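Editor's note: the `pri=6` in frames #1 and #2 of the trace can be decoded with a small sketch. The constants mirror the usual `<sys/syslog.h>` layout (low three bits are the severity, the remaining bits the facility). That amd had selected local7 as its default facility via openlog() is an assumption inferred from the `log_file = syslog:local7` setting, and `decode` is a hypothetical helper name.

```python
# Values as laid out in <sys/syslog.h>.
LOG_PRIMASK = 0x07       # low 3 bits: severity
LOG_FACMASK = 0x03F8     # remaining bits: facility
LOG_ERR = 3
LOG_INFO = 6
LOG_LOCAL7 = 23 << 3


def decode(pri, default_facility=LOG_LOCAL7):
    """Return (facility_number, severity) for a syslog priority value.

    A priority with no facility bits set falls back to the default
    facility, which is assumed here to be local7.
    """
    if pri & LOG_FACMASK == 0:
        pri |= default_facility
    return (pri & LOG_FACMASK) >> 3, pri & LOG_PRIMASK


facility, severity = decode(6)
print(facility, severity)
```

With these assumptions the hung call was logging at severity 6 (info) on facility 23 (local7). Larger numbers are less severe, so this message is below the err threshold of the /dev/console line in the stock syslog.conf.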
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999 13:20:55 MST, Doug wrote:

> After confirming that it worked with no logging, I tried enabling logging to a regular file, and that also worked like a charm. After turning syslog style logging back on, it locked up cold, with a very similar traceback.

Sheesh, Mark Murray wasn't kidding when he told me that AMD tickles bugs. Of course, I thought he meant that it tickles bugs in our NFS code. :-)

This discovery sounds like a real step forward.

Ciao,
Sheldon.
Re: more amd hangs: problem really in syslog?
> On Tue, 13 Jul 1999, Mike Smith wrote:
>
> > > After pounding on this some more with today's -current (prior to the MNT_ASYNC flag change) I got a lot more lockups that looked like this:
> > >
> > > On Mon, 12 Jul 1999, Doug wrote:
> > >
> > > > Ok, got another hang in "siobi" state (this time after it successfully completed the script). Here is the trace:
> >
> > 'siobi' is in sioopen() in the sio driver. The callout device is already open, but the caller is trying to open it in blocking mode.
> >
> > It'd be useful to know what is hanging in 'siobi' here, since trying to re-open the console is a bit of a suspicious action.
>
> I'm using a serial console, but I directed local7 to a file in syslog.conf. But from what you're saying it sounds like the serial console is a suspect?

'siobi' is someone trying to open the serial console, for whatever reason. Without knowing who it was that was stuck there, it's hard to guess what is going on.

--
\\  The mind's the standard       \\  Mike Smith
\\  of the man.                   \\  msm...@freebsd.org
\\    -- Joseph Merrick           \\  msm...@cdrom.com
Re: more amd hangs: problem really in syslog?
:*.err;kern.debug;auth.notice;mail.crit            /dev/console
:*.notice;kern.debug;lpr.info;mail.crit;news.err   /var/log/messages
:mail.info                                         /var/log/maillog
:lpr.info                                          /var/log/lpd-errs
:cron.*                                            /var/cron/log
:*.err                                             root
:*.notice;news.err                                 root
:*.alert                                           root
:*.emerg                                           *
:
:local7.*                                          /var/log/amd.log
:
:	Basically, it's what comes with the system plus that line for
:local7. I am using a serial console setup for this box, but as far as I
:could see from the logs amd did generate there were no events at *.err
:priority, or to the kern facility, so nothing should have been printed to
:the serial console. Also, just in case it matters I start syslogd with
:-svv flags in rc.conf.
:
:> Try messing around
:> with /etc/syslog.conf and see if just plain file logging prevents the
:> lockup -- you could even try turning off all logging (but leaving syslog
:> running, i.e. turning it into a sink-null) to see if that has an effect.
:
:	I have to admit that you lost me here. Normal syslog stuff is
:working just fine (where "normal" is freebsd system stuff), it's amd that
:locks up. It's been kind of a hectic day here, in addition to this problem
:so I might just be a little dense. Can you explain in more detail what
:you'd like me to try?
:
:Thanks,
:
:Doug

    Comment the whole thing out, kill -HUP the syslogd (or kill and restart
    it), and see if amd still locks up.

    If it does not, add the file lines (/var/*) back in, restart, and see
    if amd locks up.

    If it does not, add the /dev/console line back in, restart, and see if
    amd locks up.

    If it does not, add the root and * entries back in and see if amd locks
    up.

    And so forth. We may find that there is something inherently broken
    with syslogd that is causing the lockup even with all entries commented
    out, or we may well find that it is a certain line, such as the
    /dev/console, root, or * line for emergency messages, that is causing
    amd to lock up.

					-Matt
					Matthew Dillon
					dil...@backplane.com
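Editor's note: the step-by-step procedure Matt describes can be sketched as a small generator that produces one syslog.conf variant per stage. This is only an illustration under stated assumptions: the action is taken to be the last whitespace-separated field of each line, the grouping into "file lines", "/dev/console" and "root and *" follows the wording of the message, and `stage_configs` is a hypothetical name. The sketch does not touch /etc/syslog.conf or signal syslogd.

```python
def stage_configs(lines):
    """Yield (label, config_text) for each stage of the bisection.

    Stage 0 has every line commented out; each later stage re-enables
    one more group of actions, leaving earlier groups enabled.
    """
    groups = [
        ("everything commented out", lambda action: False),
        ("file lines (/var/*)", lambda action: action.startswith("/var/")),
        ("/dev/console line", lambda action: action == "/dev/console"),
        ("root and * entries", lambda action: action in ("root", "*")),
    ]
    enabled = []
    for label, predicate in groups:
        enabled.append(predicate)
        out = []
        for line in lines:
            fields = line.split()
            action = fields[-1] if fields else ""
            keep = any(p(action) for p in enabled)
            out.append(line if keep else "#" + line)
        yield label, "\n".join(out) + "\n"


conf = [
    "*.err;kern.debug;auth.notice;mail.crit /dev/console",
    "*.notice;kern.debug;lpr.info;mail.crit;news.err /var/log/messages",
    "*.err root",
    "*.emerg *",
    "local7.* /var/log/amd.log",
]
for label, text in stage_configs(conf):
    print("### stage:", label)
    print(text)
```

Each stage's text would be installed as /etc/syslog.conf by hand, followed by a kill -HUP of syslogd and a rerun of the amd test, stopping at the first stage where amd hangs.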
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999, Mike Smith wrote:

> 'siobi' is someone trying to open the serial console, for whatever reason. Without knowing who it was that was stuck there, it's hard to guess what is going on.

D'uh, sorry. Long day. It was amd that was hung in the siobi state. No way to clear it without rebooting the box.

Doug
Re: more amd hangs: problem really in syslog?
On Tue, 13 Jul 1999, Matthew Dillon wrote:

> Comment the whole thing out, kill -HUP the syslogd (or kill and restart it), and see if amd still locks up.

Ok, now I think I get it. You want me to enable syslog'ing in amd, then do what you're talking about here. I will try this first thing tomorrow morning. We're about to put the box into production to make sure that the bug is really licked, then I'm about to go home. :) We have multiple machines in this configuration, so taking this one down tomorrow should be no problem.

> If it does not, add the file lines (/var/*) back in, restart, and see if amd locks up.
>
> If it does not, add the /dev/console line back in, restart, and see if amd locks up.
>
> If it does not, add the root and * entries back in and see if amd locks up.
>
> And so forth.

Gotcha.

> We may find that there is something inherently broken with syslogd that is causing the lockup even with all entries commented out, or we may well find that it is a certain line, such as the /dev/console, root, or * line for emergency messages, that is causing amd to lock up.

I think that Mike Smith is on the right track in suspecting that the serial console is involved (due to the siobi state that amd was hanging in). However, which line of the syslog.conf was causing that is a darn good question, given that none of them *should* have been involved.

Thanks for all the great suggestions,

Doug