Re: Post-KSE disaster with libc_r

2002-06-30 Thread Julian Elischer

Can someone please check out a libc_r tree as of 3 days ago
and try that...

There was a commit in libc_r/uthread 2 days ago that might be relevant.
Failing that, can someone try newly compiled utilities on an older pre-KSE
kernel?

We need to eliminate one of these two changes...

I think it's likely that it's breakage in signals from KSE
but I'd like to know that before I tear even more hair out chasing this..

So, I'm suffering from brain fade now..
but please note that signal handling is known to be in dire need of cleanup
after the KSE edit (signals are delivered to processes but can affect
individual threads.  yuck)

Anyone who can help identify the problem please do.. I'm off to bed before
my head explodes..
I'll be back tomorrow AM.
I'm going to spend as much of msuspension sleeping as possible :-)

On Mon, 1 Jul 2002, Wesley Morgan wrote:

> I see this problem too. Luckily I have my entire KDE and QT system built
> with debugging symbols... However, the problem is definitely in
> libc_r... I get virtually the same dump as Michael.
> 
> #0  0x28e8d280 in _thread_kern_sched_state_unlock () from /usr/lib/libc_r.so.5
> #1  0x28e8c9a7 in _thread_kern_scheduler () from /usr/lib/libc_r.so.5
> #2  0xd0d0d0d0 in ?? ()
> #3  0x0001 in ?? ()
> #4  0x5f28 in ?? ()
> 
> 
> 
> On Sun, 30 Jun 2002, Bill Huey wrote:
> 
> > On Mon, Jul 01, 2002 at 07:11:31AM +0200, Michael Nottebrock wrote:
> > > Program received signal SIGSEGV, Segmentation fault.
> > > 0x281cc918 in _thread_kern_sched_state_unlock () from /usr/lib/libc_r.so.5
> > > (gdb) bt
> > > #0  0x281cc918 in _thread_kern_sched_state_unlock () from
> > > /usr/lib/libc_r.so.5
> > > #1  0x281cc2e2 in _thread_kern_scheduler () from /usr/lib/libc_r.so.5
> > > #2  0xd0d0d0d0 in ?? ()
> > > #3  0x080570b0 in ?? ()
> >
> > This is unlikely to be a KSE problem.
> >
> > What do the rest of the threads look like ?
> >
> > Try "info threads" in gdb and then progressively walk through the thread
> > list with "thread N", N being the thread number. I ran into a funny
> > crash at thread creation/start-up time and I'm wondering if it could
> > be the same thing.
> >
> > bill
> >
> >
> > To Unsubscribe: send mail to [EMAIL PROTECTED]
> > with "unsubscribe freebsd-current" in the body of the message
> >
> 
> -- 
>_ __ ___   ___ ___ ___
>   Wesley N Morgan   _ __ ___ | _ ) __|   \
>   [EMAIL PROTECTED] _ __ | _ \._ \ |) |
>   FreeBSD: The Power To Serve  _ |___/___/___/
> Hi! I'm a .signature virus! Copy me into your ~/.signature to help me spread!
> 
> 
> 





Re: Post-KSE disaster with libc_r

2002-06-30 Thread Wesley Morgan

Reverting to:
uthread_sigpending.c 1.8
uthread_sigsuspend.c 1.11
Makefile.inc 1.32

Has no effect. As far as I can tell there are no other changes...

Looking at some ktrace / gdb output shows the funny business starting
right after kdeinit tries to fork into something else:


  2723 kdeinit  CALL  gettimeofday(0x28e94ab8,0)
  2723 kdeinit  RET   gettimeofday 0
  2723 kdeinit  CALL  wait4(0x,0,0x1,0)
  2723 kdeinit  RET   wait4 2724/0xaa4
  2723 kdeinit  CALL  poll(0x8059000,0x1,0)
  2723 kdeinit  RET   poll 1
  2723 kdeinit  PSIG  SIGSEGV SIG_DFL
  2723 kdeinit  NAMI  "kdeinit.core"

Both kdeinit and a child it forks are dying... Setting a breakpoint on
fork() in the binary shows:


Breakpoint 1, 0x28eda7d4 in fork () from /usr/lib/libc.so.5
(gdb) bt
#0  0x28eda7d4 in fork () from /usr/lib/libc.so.5
#1  0x28e83a5c in fork () from /usr/lib/libc_r.so.5
#2  0x0804e8d5 in QGListIterator::~QGListIterator() ()
#3  0x0804add1 in QGListIterator::~QGListIterator() ()
(gdb) s
Single stepping until exit from function fork,
which has no line number information.
0x28e83a5c in fork () from /usr/lib/libc_r.so.5
(gdb)
Single stepping until exit from function fork,
which has no line number information.
warning: Cannot insert breakpoint 0:
Error accessing memory address 0xd0d0d0d0: Bad address.
(gdb)

the 0xd0d0d0d0 is the same as in the coredump earlier.
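
A hedged aside on that value, which the thread itself does not confirm: FreeBSD's malloc of this era fills memory with the byte 0xd0 when its junk option (J) is enabled, so a frame address of 0xd0d0d0d0 may mean a saved context or stack was read from uninitialized or freed heap memory. A minimal C sketch for spotting such words in a dump follows; the helper name and the macro are illustrative only and do not come from libc_r.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 0xd0 is the allocator's junk-fill byte on the system at hand. */
#define ASSUMED_JUNK_BYTE 0xd0u

/* Return true when every byte of `word` equals `junk`. */
static bool
is_junk_filled(uint32_t word, uint8_t junk)
{
	/* Replicate the byte into all four positions, e.g. 0xd0 -> 0xd0d0d0d0. */
	uint32_t pattern = (uint32_t)junk * 0x01010101u;

	return word == pattern;
}
```

Run over the frame addresses in the backtraces above, a check like this would flag frame #2 and nothing else.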

Rebuilt libc_r with debugging symbols and...


(gdb) bt
#0  thread_kern_poll (wait_reqd=0)
at /usr/src/lib/libc_r/uthread/uthread_kern.c:862
#1  0x28e8c8d7 in _thread_kern_scheduler ()
at /usr/src/lib/libc_r/uthread/uthread_kern.c:372
#2  0xd0d0d0d0 in ?? ()
#3  0x0001 in ?? ()
#4  0x5f28 in ?? ()
Error accessing memory address 0xbecf2000: Bad address.

Hope some of this is useful to anyone out there!


-- 
   _ __ ___   ___ ___ ___
  Wesley N Morgan   _ __ ___ | _ ) __|   \
  [EMAIL PROTECTED] _ __ | _ \._ \ |) |
  FreeBSD: The Power To Serve 

Re: Post-KSE disaster with libc_r

2002-06-30 Thread Bill Huey

On Mon, Jul 01, 2002 at 02:59:09AM -0400, Wesley Morgan wrote:
>   2723 kdeinit  CALL  gettimeofday(0x28e94ab8,0)
>   2723 kdeinit  RET   gettimeofday 0
>   2723 kdeinit  CALL  wait4(0x,0,0x1,0)
>   2723 kdeinit  RET   wait4 2724/0xaa4
>   2723 kdeinit  CALL  poll(0x8059000,0x1,0)
>   2723 kdeinit  RET   poll 1
>   2723 kdeinit  PSIG  SIGSEGV SIG_DFL
>   2723 kdeinit  NAMI  "kdeinit.core"
> 
> Both the kdeinit and a child it forks are dying... Setting a breakpoint of
> fork() in the binary shows:

That's almost definitely the same problem I'm running into with
the JVM; however, there it showed up as an infinite hang, since
SEGV delivery is turned off inside a crashing/SEGVing
libc_r thread kernel.

bill





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer



On Mon, 1 Jul 2002, Wesley Morgan wrote:
> 
> Both the kdeinit and a child it forks are dying... Setting a breakpoint of
> fork() in the binary shows:
> 
> 
> Breakpoint 1, 0x28eda7d4 in fork () from /usr/lib/libc.so.5
> (gdb) bt
> #0  0x28eda7d4 in fork () from /usr/lib/libc.so.5
> #1  0x28e83a5c in fork () from /usr/lib/libc_r.so.5
> #2  0x0804e8d5 in QGListIterator::~QGListIterator() ()
> #3  0x0804add1 in QGListIterator::~QGListIterator() ()
> (gdb) s
> Single stepping until exit from function fork,
> which has no line number information.
> 0x28e83a5c in fork () from /usr/lib/libc_r.so.5
> (gdb)
> Single stepping until exit from function fork,
> which has no line number information.
> warning: Cannot insert breakpoint 0:
> Error accessing memory address 0xd0d0d0d0: Bad address.
> (gdb)
> 
> the 0xd0d0d0d0 is the same as in the coredump earlier.
> 
> Rebuilt libc_r with debugging symbols and...
> 


In the ktrace, can you show context switches?
(add -w to both ktrace and kdump)

Is this where it broke? It doesn't look much like the above..


> 
> (gdb) bt
> #0  thread_kern_poll (wait_reqd=0)
> at /usr/src/lib/libc_r/uthread/uthread_kern.c:862
> #1  0x28e8c8d7 in _thread_kern_scheduler ()
> at /usr/src/lib/libc_r/uthread/uthread_kern.c:372
> #2  0xd0d0d0d0 in ?? ()
> #3  0x0001 in ?? ()
> #4  0x5f28 in ?? ()
> Error accessing memory address 0xbecf2000: Bad address.
> 


Can you do what you did before and try single-stepping
a bit?

Also.. instead of checking out an older libc_r,
can you see if there is actually an old copy
(say from the DP1 image) somewhere and try that...
It's possible we have a symbol pollution problem..
a lot of the names in libc_r look awfully familiar
from the KSE code.. (this shouldn't be possible but
 
> Hope some of this is useful to anyone out there!

not on its own, but as a part of a developing picture.


Re: Post-KSE disaster with libc_r

2002-07-01 Thread Marc Recht

> Can someone please check out a libc_r tree as of 3 days ago
> and try that...
> 
> There was a commit in libc_r/uthreads 2 days ago that might be relevant.
> failing that, can someone try newly compiled utilities on an older pre-KSE
> kernel?
> 
> We need to eliminate one of these two changes...
I don't know if this helps, but I've a pre-KSE userland (28.06.), a
post-KSE kernel (30.06.) and I've none of the described problems.
Evolution, KDE3, Mozilla, ogg123, jdk13 all run without a problem.







Re: Post-KSE disaster with libc_r

2002-07-01 Thread NAKAJI Hiroyuki

> In <[EMAIL PROTECTED]> 
>   Marc Recht <[EMAIL PROTECTED]> wrote:

> Can someone please check out a libc_r tree as of 3 days ago
> and try that...
> 
> There was a commit in libc_r/uthreads 2 days ago that might be relevant.
> failing that, can someone try newly compiled utilities on an older pre-KSE
> kernel?

MR> I don't know if this helps, but I've a pre-KSE userland (28.06.), a
MR> post-KSE kernel (30.06.) and I've none of the described problems.
MR> Evolution, KDE3, Mozilla, ogg123, jdk13 all run without a problem.

I updated my current box about an hour ago, and got into trouble too.

In my case, amavis-milter dumps core with signal 11 and I cannot scan
emails for viruses. :(

$ sudo gdb ./amavis-milter /etc/mail/amavis-milter.core 
GNU gdb 5.2.0 (FreeBSD) 20020627
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
Core was generated by `amavis-milter'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libmilter.so.2...done.
Loaded symbols for /usr/lib/libmilter.so.2
Reading symbols from /usr/lib/libc_r.so.5...done.
Loaded symbols for /usr/lib/libc_r.so.5
Reading symbols from /usr/lib/libc.so.5...done.
Loaded symbols for /usr/lib/libc.so.5
Reading symbols from /usr/libexec/ld-elf.so.1...done.
Loaded symbols for /usr/libexec/ld-elf.so.1
#0  0x2808e918 in _thread_kern_sched_state_unlock () from /usr/lib/libc_r.so.5
(gdb) 
-- 
NAKAJI Hiroyuki




Re: Post-KSE disaster with libc_r

2002-07-01 Thread Marc Recht

> MR> I don't know if this helps, but I've a pre-KSE userland (28.06.), a
> MR> post-KSE kernel (30.06.) and I've none of the described problems.
> MR> Evolution, KDE3, Mozilla, ogg123, jdk13 all run without a problem.
> 
> I updated my current box about an hour ago, and got into trouble too.
But you've updated the userland _and_ the kernel. I've only updated the
kernel and left the userland in the pre-KSE state.

Marc







Re: Post-KSE disaster with libc_r

2002-07-01 Thread Wesley Morgan

Ktracing with context switches looks the same as before.

Stepping into libc_r leads me on a merry chase through what appears to be
normal execution, until somewhere in uthread_sig.c about line 552...
(gdb) r
Starting program: /usr/local/bin/kdeinit
Breakpoint 1 at 0x28e839f6: file
/usr/src/lib/libc_r/uthread/uthread_fork.c, line 49.
Breakpoint 1, 0x28eda7d4 in fork () from /usr/lib/libc.so.5
(gdb) b /usr/src/lib/libc_r/uthread/uthread_sig.c:546
Breakpoint 2 at 0x28e8723d: file
/usr/src/lib/libc_r/uthread/uthread_sig.c, line 546.
(gdb) c
Continuing.

Breakpoint 2, thread_sig_handle_special (sig=20)
at /usr/src/lib/libc_r/uthread/uthread_sig.c:546
546 for (pthread = TAILQ_FIRST(&_waitingq);
(gdb) print _waitingq
$1 = {tqh_first = 0x8054000, tqh_last = 0x8054210}
(gdb) s
0x28e8723e in thread_sig_handle_special (sig=20)
at /usr/src/lib/libc_r/uthread/uthread_sig.c:546
546 for (pthread = TAILQ_FIRST(&_waitingq);


Program received signal SIGSEGV, Segmentation fault.
0x28e8723e in thread_sig_handle_special (sig=20)
at /usr/src/lib/libc_r/uthread/uthread_sig.c:546
546 for (pthread = TAILQ_FIRST(&_waitingq);


Odd... Now if I set a breakpoint inside of the for() loop at line 552, it
will actually get past that:
Breakpoint 2, thread_sig_handle_special (sig=20)
at /usr/src/lib/libc_r/uthread/uthread_sig.c:552
552 pthread_next = TAILQ_NEXT(pthread, pqe);
(gdb) s
558 if (pthread->state == PS_WAIT_WAIT) {
(gdb) s

Program received signal SIGSEGV, Segmentation fault.
thread_sig_handle_special (sig=20)
at /usr/src/lib/libc_r/uthread/uthread_sig.c:558
558 if (pthread->state == PS_WAIT_WAIT) {
(gdb) print pthread
$1 = (struct pthread *) 0x210

That definitely is not right!
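
For readers following along, the loop at lines 546-552 walks a sys/queue.h tail queue and saves the next pointer before acting on the current element. The sketch below shows that traversal pattern in isolation. It is a minimal illustration under stated assumptions: `struct waiter`, `struct waitq`, `wake_waiters` and the meaning of `state` are hypothetical stand-ins, not libc_r's actual structures or code; only the `pqe` field name and the TAILQ_FIRST/TAILQ_NEXT calls are taken from the gdb output above.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a thread entry; not libc_r's struct pthread. */
struct waiter {
	int state;                 /* 1 = should be woken (made-up encoding) */
	TAILQ_ENTRY(waiter) pqe;   /* queue linkage, named after the pqe field */
};

TAILQ_HEAD(waitq, waiter);

/*
 * Remove every waiter whose state is 1 and return how many were removed.
 * The next pointer is fetched before the current element is modified, so
 * unlinking the element does not derail the traversal.
 */
static int
wake_waiters(struct waitq *q)
{
	struct waiter *w, *w_next;
	int woken = 0;

	for (w = TAILQ_FIRST(q); w != NULL; w = w_next) {
		w_next = TAILQ_NEXT(w, pqe);
		if (w->state == 1) {
			TAILQ_REMOVE(q, w, pqe);
			woken++;
		}
	}
	return woken;
}
```

The traversal is only as sound as the linkage it reads: if the code and the data disagree about where `pqe` lives inside the element, TAILQ_NEXT returns whatever happens to be at that offset, which is one way a pointer like 0x210 could appear.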

Backing up, this is the content of the pthread struct before it gets
munched into 0x210 (re-ran the process of course):

$1 = {magic = 3499860245,
name = 0x8056030 "_thread_initial", uniqueid = 0,  lock = {access_lock = 686322256, 
lock_owner = 0, fname = 0x0, lineno = 0},
  tle = {tqe_next = 0x0, tqe_prev = 0x28e94a88}, dle = {tqe_next = 0x0,
tqe_prev = 0x0}, start_routine = 0, arg = 0x0, stack = 0xbfb0,
attr = {sched_policy = 3, sched_inherit = 0, sched_interval = 2, prio = 15,
suspend = 0, flags = 0, arg_attr = 0x0, cleanup_attr = 0,
stackaddr_attr = 0xbfb0, stacksize_attr = 1048576,
guardsize_attr = 4096}, ctx = {jb = {{_jb = {686343554, 686378132,
  -1077939364, -1077939336, -1, 134561792, 4735, 0, 0, 0, 0, 0}}},
uc = {uc_sigmask = {__bits = {686343554, 686378132, 3217027932,
  3217027960}}, uc_mcontext = {mc_onstack = -1, mc_gs = 134561792,
mc_fs = 4735, mc_es = 0, mc_ds = 0, mc_edi = 0, mc_esi = 0,
mc_ebp = 0, mc_isp = 0, mc_ebx = 0, mc_edx = 0, mc_ecx = 0,
mc_eax = 0, mc_trapno = 0, mc_err = 0, mc_eip = 0, mc_cs = 0,
mc_eflags = 0, mc_esp = 0, mc_ss = 0, mc_fpregs = {
  0 }, mc_flags = 0, __spare__ = {
  0 }}, uc_link = 0x0, uc_stack = {ss_sp = 0x0,
ss_size = 0, ss_flags = 0}, __spare__ = {0, 0, 0, 0, 0, 0, 0, 0}}},
  curframe = 0x0, cancelflags = 4, continuation = 0, sigmask = {__bits = {0,
  0, 0, 0}}, sigpend = {__bits = {0, 0, 0, 0}}, sigmask_seqno = 0,
  check_pending = 0, state = PS_FDR_WAIT, last_active = 0, last_inactive = 0,
  slice_usec = -1, wakeup_time = {tv_sec = -1, tv_nsec = -1}, timeout = 0,
  error = 0, joiner = 0x0, join_status = {thread = 0x0, ret = 0x0, error =
  0},   pqe = {tqe_next = 0x0, tqe_prev = 0x28e9a8d0}, sqe = {tqe_next =
  0x0,tqe_prev = 0x0}, qe = {tqe_next = 0x0, tqe_prev = 0x28e97080}, data = {
mutex = 0x7, cond = 0x7, sigwait = 0x7, fd = {fd = 7, branch = 0,
  fname = 0x0}, fp = 0x7, poll_data = 0x7, spinlock = 0x7, thread = 0x7},
  poll_data = {nfds = 0, fds = 0x0}, interrupted = 0, signo = 0,
  sig_defer_count = 0, yield_on_sig_undefer = 0, flags = 20,
  base_priority = 15 '\017', inherited_priority = 0 '\0',
  active_priority = 15 '\017', priority_mutex_count = 0, mutexq = {
tqh_first = 0x0, tqh_last = 0x8054254}, ret = 0x0, specific = 0x0,
  specific_data_count = 0, cleanup = 0x0,
  fname = 0x28e925a0 "/usr/src/lib/libc_r/uthread/uthread_read.c", lineno
  = 81}

Of course all this means absolutely nothing to me :) ... Setting the
breakpoint just past the for() loop gives me the same old crash as before:
#0  thread_kern_poll (wait_reqd=0)
at /usr/src/lib/libc_r/uthread/uthread_kern.c:862
#1  0x28e8c8d7 in _thread_kern_scheduler ()
at /usr/src/lib/libc_r/uthread/uthread_kern.c:372
#2  0xd0d0d0d0 in ?? ()
(gdb) print pthread
$2 = (struct pthread *) 0x

That's all I've got for now. Someone please tell me if posting this much
junk to -current is frowned upon. I'm looking for an old libc_r now, but
there could be some problems with the GCC changeout since DP1 that won't
work too well with KDE...

Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

the question is:
did you update both kernel and userland?


On Mon, 1 Jul 2002, NAKAJI Hiroyuki wrote:

> > In <[EMAIL PROTECTED]> 
> > Marc Recht <[EMAIL PROTECTED]> wrote:
> 
> > Can someone please check out a libc_r tree as of 3 days ago
> > and try that...
> > 
> > There was a commit in libc_r/uthreads 2 days ago that might be relevant.
> > failing that, can someone try newly compiled utilities on an older pre-KSE
> > kernel?
> 
> MR> I don't know if this helps, but I've a pre-KSE userland (28.06.), a
> MR> post-KSE kernel (30.06.) and I've none of the described problems.
> MR> Evolution, KDE3, Mozilla, ogg123, jdk13 all run without a problem.
> 
> I updated my current box about an hour ago, and got into trouble too.





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Andrey A. Chernov

On Mon, Jul 01, 2002 at 10:26:22 -0700, Julian Elischer wrote:
> the question is:
> did you update both kernel and userland?

This bug is not related to in-kernel KSE code (but may be related to
header files compiled in). I get it even with an updated userland and an old
pre-KSE kernel (with both updated I get it too). Only switching to a libc_r
from about a month ago helps.


-- 
Andrey A. Chernov
http://ache.pp.ru/




Re: Post-KSE disaster with libc_r

2002-07-01 Thread NAKAJI Hiroyuki

> In <[EMAIL PROTECTED]> 
>   Julian Elischer <[EMAIL PROTECTED]> wrote:

JE> the question is:
JE> did you update both kernel and userland?

Yes. I always do update the whole world.

> I updated my current box about an hour ago, and got into trouble too.

I then backed out only the kernel to date=2002.06.29.17.00.00, and this old
kernel has a problem: it panics while running /etc/rc, but I've lost the
message.

I'm now rebuilding the latest world, i.e. userland and kernel, with the
kernel and userland of yesterday.
-- 
NAKAJI Hiroyuki




Re: Post-KSE disaster with libc_r

2002-07-01 Thread Michael Nottebrock

Andrey A. Chernov wrote:
> On Mon, Jul 01, 2002 at 10:26:22 -0700, Julian Elischer wrote:
> 
>>the question is:
>>did you update both kernel and userland?
> 
> 
> This bug is not related to in-kernel KSE code (but, maybe related to
> header files compiled in). I got it even with updated userland and old
> pre-KSE kernel (with both updated I have it too). Only switching to libc_r 
> old about month ago helps.

I applied "thediff" to a -CURRENT box as of June 25th and it promptly 
shows the symptoms, so I still think it's somewhere in the KSE-code.


Regards,
-- 
Michael Nottebrock
"The circumstance ends uglily in the cruel result." - Babelfish





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer



On Mon, 1 Jul 2002, Andrey A. Chernov wrote:

> On Mon, Jul 01, 2002 at 10:26:22 -0700, Julian Elischer wrote:
> > the question is:
> > did you update both kernel and userland?
> 
> This bug is not related to in-kernel KSE code (but, maybe related to
> header files compiled in). I got it even with updated userland and old
> pre-KSE kernel (with both updated I have it too). Only switching to libc_r 
> old about month ago helps.

OK, this is good news because it helps narrow the problem.
If an old libc_r runs well it at least suggests that the kernel is
working as it should. That is a great relief to me!

My current guess is that something in an include file is poisoning the
compile of a new libc_r.
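
To illustrate the kind of breakage being guessed at here, the following is a generic C sketch and not the actual libc_r, proc.h or queue.h definitions. If a changed header lays out a structure differently from what other code was built to expect, the same member name resolves to a different offset, and pointers get read from the wrong place. All names below (`entry_old`, `entry_new`, `next_offset_matches`) are hypothetical.

```c
#include <stddef.h>

/* Hypothetical "old header" view of a structure. */
struct entry_old {
	int   state;
	void *next;
};

/* Hypothetical "new header" view: fields were inserted ahead of `next`. */
struct entry_new {
	int   state;
	int   flags;
	long  extra;
	void *next;
};

/* Return nonzero when both views agree on where `next` lives. */
static int
next_offset_matches(void)
{
	return offsetof(struct entry_old, next) ==
	    offsetof(struct entry_new, next);
}
```

Comparing offsetof() results like this, or pinning them with a C11 _Static_assert, is one way to check whether two builds agree on a structure's layout.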

I think we may need Dan to help us find it..










Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

Is there any chance you can try compiling a new libc_r but with old versions
of proc.h and queue.h in the system?


On Mon, 1 Jul 2002, Andrey A. Chernov wrote:

> On Mon, Jul 01, 2002 at 10:26:22 -0700, Julian Elischer wrote:
> > the question is:
> > did you update both kernel and userland?
> 
> This bug is not related to in-kernel KSE code (but, maybe related to
> header files compiled in). I got it even with updated userland and old
> pre-KSE kernel (with both updated I have it too). Only switching to libc_r 
> old about month ago helps.
> 
> > 
> > 
> > On Mon, 1 Jul 2002, NAKAJI Hiroyuki wrote:
> > 
> > > > In <[EMAIL PROTECTED]> 
> > > > Marc Recht <[EMAIL PROTECTED]> wrote:
> > > 
> > > > Can someone please check out a libc_r tree as of 3 days ago
> > > > and try that...
> > > > 
> > > > There was a commit in libc_r/uthreads 2 days ago that might be relevant.
> > > > failing that, can someone try newly compiled utilities on an older pre-KSE
> > > > kernel?
> > > 
> > > MR> I don't know if this helps, but I've a pre-KSE userland (28.06.), a
> > > MR> post-KSE kernel (30.06.) and I've none of the described problems.
> > > MR> Evolution, KDE3, Mozilla, ogg123, jdk13 all run without a problem.
> > > 
> > > I updated my current box about an hour ago, and got into trouble too.
> > > 
> > > My case is that amavis-milter dumps core with signal 11 and I cannot check
> > > virus in emails. :(
> > > 
> > > $ sudo gdb ./amavis-milter /etc/mail/amavis-milter.core 
> > > GNU gdb 5.2.0 (FreeBSD) 20020627
> > > Copyright 2002 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and you are
> > > welcome to change it and/or distribute copies of it under certain conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > > This GDB was configured as "i386-undermydesk-freebsd"...
> > > Core was generated by `amavis-milter'.
> > > Program terminated with signal 11, Segmentation fault.
> > > Reading symbols from /usr/lib/libmilter.so.2...done.
> > > Loaded symbols for /usr/lib/libmilter.so.2
> > > Reading symbols from /usr/lib/libc_r.so.5...done.
> > > Loaded symbols for /usr/lib/libc_r.so.5
> > > Reading symbols from /usr/lib/libc.so.5...done.
> > > Loaded symbols for /usr/lib/libc.so.5
> > > Reading symbols from /usr/libexec/ld-elf.so.1...done.
> > > Loaded symbols for /usr/libexec/ld-elf.so.1
> > > #0  0x2808e918 in _thread_kern_sched_state_unlock () from /usr/lib/libc_r.so.5
> > > (gdb) 
> > > -- 
> > > NAKAJI Hiroyuki
> > > 
> 
> -- 
> Andrey A. Chernov
> http://ache.pp.ru/
> 
> 





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Daniel Eischen

On Mon, 1 Jul 2002, Julian Elischer wrote:

> is there any chance you can try compile a new libc_r but with old versions
> of proc.h and queue.h in the system?

What changed in queue.h?  We do have some macros defined for presetting
queue.h *_HEAD's.
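
To illustrate what presetting a queue.h head involves, here is a minimal sketch. It is not libc_r's actual macro set, which is not shown in this thread; `struct item`, `preset`, and both function names are invented for the example. It relies only on the standard `TAILQ_HEAD_INITIALIZER` macro from `<sys/queue.h>`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type; libc_r's real structures are not shown here. */
struct item {
	int value;
	TAILQ_ENTRY(item) link;
};

TAILQ_HEAD(item_list, item);

/* Head preset at compile time: no TAILQ_INIT() call is needed at run time. */
static struct item_list preset = TAILQ_HEAD_INITIALIZER(preset);

/* Returns 1 if the preset head looks exactly like a freshly initialised one. */
int preset_head_is_empty(void)
{
	return preset.tqh_first == NULL && preset.tqh_last == &preset.tqh_first;
}

/* Appends n items to a local preset head and returns the sum of their values. */
int sum_after_insert(struct item *items, int n)
{
	struct item_list head = TAILQ_HEAD_INITIALIZER(head);
	struct item *it;
	int i, sum = 0;

	assert(items != NULL || n == 0);
	for (i = 0; i < n; i++)
		TAILQ_INSERT_TAIL(&head, &items[i], link);
	for (it = head.tqh_first; it != NULL; it = it->link.tqe_next)
		sum += it->value;
	return sum;
}
```

A hand-written preset that spells out the head's fields itself, instead of using the header's own initializer, would silently go stale if queue.h added a field to the head. That is presumably the concern behind the question.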

-- 
Dan Eischen





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Daniel Eischen

On Mon, 1 Jul 2002, Julian Elischer wrote:
> 
> On Mon, 1 Jul 2002, Andrey A. Chernov wrote:
> 
> > On Mon, Jul 01, 2002 at 10:26:22 -0700, Julian Elischer wrote:
> > > the question is:
> > > did you update both kernel and userland?
> > 
> > > This bug is not related to the in-kernel KSE code (but may be related to
> > > the header files compiled in). I got it even with an updated userland and an
> > > old pre-KSE kernel (and with both updated I get it too). Only switching to a
> > > libc_r from about a month ago helps.
> 
> OK, this is good news because it helps narrow the problem.
> If an old libc_r runs well, it at least suggests that the kernel is
> working as it should. That is a great relief to me!
> 
> My current guess is that something in an include file is poisoning the
> compile of a new libc_r.
> 
> I think we may need Dan to help us find it..

I'm a few days away from being able to upgrade to the latest
-current.

I'd suspect that it is something to do with the layout of
the fpregs, mcontext or something like that.  Libc_r mucks
about in jmp_buf (userland) and ucontext/mcontext, so anything
that changed those would cause problems.
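
One way to test this suspicion is to print the sizes and offsets involved and compare them between a pre-KSE and a post-KSE build environment. The sketch below is only an illustration: `struct ctx_layout` and the three functions are hypothetical names, and it inspects just the public types, not libc_r's internals.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdio.h>
#include <ucontext.h>

/* Sizes and offsets that code poking inside these types implicitly relies on. */
struct ctx_layout {
	size_t jmp_buf_size;
	size_t ucontext_size;
	size_t mcontext_offset;
	size_t mcontext_size;
};

/* Captures the layout as seen by the headers this file was compiled against. */
struct ctx_layout current_layout(void)
{
	struct ctx_layout l;

	l.jmp_buf_size = sizeof(jmp_buf);
	l.ucontext_size = sizeof(ucontext_t);
	l.mcontext_offset = offsetof(ucontext_t, uc_mcontext);
	l.mcontext_size = sizeof(mcontext_t);
	return l;
}

/* Returns 1 when two layouts agree field by field, 0 otherwise. */
int layout_equal(const struct ctx_layout *a, const struct ctx_layout *b)
{
	return a->jmp_buf_size == b->jmp_buf_size &&
	    a->ucontext_size == b->ucontext_size &&
	    a->mcontext_offset == b->mcontext_offset &&
	    a->mcontext_size == b->mcontext_size;
}

/* Prints one line that can be diffed between an old and a new build. */
void print_layout(const struct ctx_layout *l)
{
	assert(l != NULL);
	printf("jmp_buf=%lu ucontext_t=%lu uc_mcontext@%lu mcontext_t=%lu\n",
	    (unsigned long)l->jmp_buf_size, (unsigned long)l->ucontext_size,
	    (unsigned long)l->mcontext_offset, (unsigned long)l->mcontext_size);
}
```

Building this once against the old headers and once against the new ones, then comparing the printed lines, would show whether any of these layouts moved.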

You can enable error checking when compiling libc_r and see
if anything comes up too.  It should be relatively error free
(other than a few _ not being defined).

-- 
Dan Eischen





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

I have some debug stuff added for TAILQs.
Theoretically it should be defined out, but
I don't trust theory :-)
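
Why debug fields that are not defined out would matter: if a header adds trace fields to a queue head under a debug option, every structure embedding that head changes shape. The sketch below uses made-up structures (`head_plain`, `head_debug` and the two `owner_*` types are hypothetical, not the real queue.h definitions) to show the effect.

```c
#include <assert.h>
#include <stddef.h>

/* Queue head as a library built WITHOUT the debug option would see it. */
struct head_plain {
	void *first;
	void **last;
};

/* The same head as seen WITH a hypothetical debug option adding trace fields. */
struct head_debug {
	void *first;
	void **last;
	const char *lastfile;	/* hypothetical trace field */
	int lastline;		/* hypothetical trace field */
};

/* A structure embedding a head, followed by a field both sides access. */
struct owner_plain { struct head_plain q; int state; };
struct owner_debug { struct head_debug q; int state; };

/* Number of bytes by which 'state' moves when the debug fields are present. */
size_t state_offset_shift(void)
{
	size_t plain = offsetof(struct owner_plain, state);
	size_t debug = offsetof(struct owner_debug, state);

	assert(debug >= plain);
	return debug - plain;
}
```

If one object file were compiled with the debug fields and another without, the two would disagree on where `state` lives, which is the kind of silent breakage being worried about here.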


On Mon, 1 Jul 2002, Daniel Eischen wrote:

> On Mon, 1 Jul 2002, Julian Elischer wrote:
> 
> > is there any chance you can try compile a new libc_r but with old versions
> > of proc.h and queue.h in the system?
> 
> What changed in queue.h?  We do have some macros defined for presetting
> queue.h *_HEAD's.
> 
> -- 
> Dan Eischen
> 
> 





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

I don't change any of those.

On Mon, 1 Jul 2002, Daniel Eischen wrote:
> 
> I'd suspect that it is something to do with the layout of
> the fpregs, mcontext or something like that.  Libc_r mucks
> about in jmp_buf (userland) and ucontext/mcontext, so anything
> that changed those would cause problems.
> 


It's still unclear whether a KSE kernel works with an old libc_r, or vice versa.

I'd like to see whether a new libc_r works with an old kernel (someone who can
boot kernel.back and test...)

To check whether you have a non-KSE kernel:
sysctl kern.threads will only succeed on a new kernel.
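
The check can be scripted. The sketch below assumes, as stated above, that `sysctl kern.threads` fails on a pre-KSE kernel, and additionally assumes that sysctl(8) reports that failure through a non-zero exit status. `command_succeeds` and `kernel_looks_post_kse` are names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Returns 1 if the shell command exits with status 0, otherwise 0. */
int command_succeeds(const char *cmd)
{
	int status;

	assert(cmd != NULL);
	status = system(cmd);
	return status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}

/*
 * Assumption taken from the message above: "sysctl kern.threads" only
 * succeeds on a post-KSE kernel.  Output is discarded; only the exit
 * status is used.
 */
int kernel_looks_post_kse(void)
{
	return command_succeeds("sysctl kern.threads >/dev/null 2>&1");
}
```

On a machine that can boot both kernel and kernel.back, calling `kernel_looks_post_kse()` before each test run would record which kernel the result belongs to.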






Re: Post-KSE disaster with libc_r

2002-07-01 Thread Daniel Eischen

On Mon, 1 Jul 2002, Julian Elischer wrote:
> I don't change any of those.
> 
> On Mon, 1 Jul 2002, Daniel Eischen wrote:
> > 
> > I'd suspect that it is something to do with the layout of
> > the fpregs, mcontext or something like that.  Libc_r mucks
> > about in jmp_buf (userland) and ucontext/mcontext, so anything
> > that changed those would cause problems.
> > 
> 
> 
> It's still unclear if a KSE kernel works with an old libc_r or visa versa.
> 
> I'd like to see if a new libc_r works with an old kernel (someone who can
> boot kernel.back and test...)
> 
> to check if you have a non KSE kernel,
> sysctl kern.threads will only succeed in a new kernel.

I also made changes to uthread_sigpending.c and uthread_sigsuspend.c
3 days ago (lib/libc_r/uthread/...).  You can try reverting those
changes and go back to revisions 1.18 and 1.11 respectively.

-- 
Dan Eischen





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Wesley Morgan

Reverting proc.h and queue.h does nothing. Booting a kernel from 20020624
still crashes all threaded systems. Same behavior on a 20020620 kernel.
> I don't change any of those.
>
> On Mon, 1 Jul 2002, Daniel Eischen wrote:
>>
>> I'd suspect that it is something to do with the layout of
>> the fpregs, mcontext or something like that.  Libc_r mucks
>> about in jmp_buf (userland) and ucontext/mcontext, so anything
>> that changed those would cause problems.
>>
>
>
> It's still unclear if a KSE kernel works with an old libc_r or visa
> versa.
>
> I'd like to see if a new libc_r works with an old kernel (someone who
> can boot kernel.back and test...)
>
> to check if you have a non KSE kernel,
> sysctl kern.threads will only succeed in a new kernel.



Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

Can you try compiling a new libc_r with the following change suggested by
Dan Eischen:

--begin quote:

I also made changes to uthread_sigpending.c and uthread_sigsuspend.c
3 days ago (lib/libc_r/uthread/...).  You can try reverting those
changes and go back to revisions 1.18 and 1.11 respectively.

--end quote..

so that is uthread_sigpending.c version 1.18
and
uthread_sigsuspend.c version 1.11

Thanks

Julian




On Mon, 1 Jul 2002, Wesley Morgan wrote:

> Reverting proc.h and queue.h do nothing. Booting a kernel from 20020624,
> still crashes all threaded systems. Same behavior on a 20020620 kernel.
> > I don't change any of those.
> >
> > On Mon, 1 Jul 2002, Daniel Eischen wrote:
> >>
> >> I'd suspect that it is something to do with the layout of
> >> the fpregs, mcontext or something like that.  Libc_r mucks
> >> about in jmp_buf (userland) and ucontext/mcontext, so anything
> >> that changed those would cause problems.
> >>
> >
> >
> > It's still unclear if a KSE kernel works with an old libc_r or visa
> > versa.
> >
> > I'd like to see if a new libc_r works with an old kernel (someone who
> > can boot kernel.back and test...)
> >
> > to check if you have a non KSE kernel,
> > sysctl kern.threads will only succeed in a new kernel.
> >
> >
> >
> > To Unsubscribe: send mail to [EMAIL PROTECTED]
> > with "unsubscribe freebsd-current" in the body of the message
> 
> 
> 
> 


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: Post-KSE disaster with libc_r

2002-07-01 Thread Wesley Morgan

I already tried that this morning; it had no effect... unless you would
like me to try an old kernel with it.

On Mon, 1 Jul 2002, Julian Elischer wrote:

> can you try compiling a new libc_r with th efollowing change suggested by
> Dan Eischen:
>
> --begin quote:
>
> I also made changes to uthread_sigpending.c and uthread_sigsuspend.c
> 3 days ago (lib/libc_r/uthread/...).  You can try reverting those
> changes and go back to revisions 1.18 and 1.11 respectively.
>
> --end quote..
>
> so that is uthread_sigpending.c version 1.18
> and
> uthread_sigsuspend.c version 1.11
>
> Thanks
>
> Julian
>
>
>
>
> On Mon, 1 Jul 2002, Wesley Morgan wrote:
>
> > Reverting proc.h and queue.h do nothing. Booting a kernel from 20020624,
> > still crashes all threaded systems. Same behavior on a 20020620 kernel.
> > > I don't change any of those.
> > >
> > > On Mon, 1 Jul 2002, Daniel Eischen wrote:
> > >>
> > >> I'd suspect that it is something to do with the layout of
> > >> the fpregs, mcontext or something like that.  Libc_r mucks
> > >> about in jmp_buf (userland) and ucontext/mcontext, so anything
> > >> that changed those would cause problems.
> > >>
> > >
> > >
> > > It's still unclear if a KSE kernel works with an old libc_r or visa
> > > versa.
> > >
> > > I'd like to see if a new libc_r works with an old kernel (someone who
> > > can boot kernel.back and test...)
> > >
> > > to check if you have a non KSE kernel,
> > > sysctl kern.threads will only succeed in a new kernel.
> > >
> > >
> > >
> > > To Unsubscribe: send mail to [EMAIL PROTECTED]
> > > with "unsubscribe freebsd-current" in the body of the message
> >
> >
> >
> >
>

-- 
   _ __ ___   ___ ___ ___
  Wesley N Morgan   _ __ ___ | _ ) __|   \
  [EMAIL PROTECTED] _ __ | _ \._ \ |) |
  FreeBSD: The Power To Serve  _ |___/___/___/
Hi! I'm a .signature virus! Copy me into your ~/.signature to help me spread!





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

Oops, sorry, you are right.

Still, it has been so confusing, with people saying this and that, that it's
only now that I'm starting to believe that it's not the kernel
doing something nasty.



On Mon, 1 Jul 2002, Wesley Morgan wrote:

> I already tried that this morning, it had no effect ... Unless you would
> like me to try an  old kernel with it
> 
> On Mon, 1 Jul 2002, Julian Elischer wrote:
> 
> > can you try compiling a new libc_r with th efollowing change suggested by
> > Dan Eischen:
> >
> > --begin quote:
> >
> > I also made changes to uthread_sigpending.c and uthread_sigsuspend.c
> > 3 days ago (lib/libc_r/uthread/...).  You can try reverting those
> > changes and go back to revisions 1.18 and 1.11 respectively.
> >
> > --end quote..
> >
> > so that is uthread_sigpending.c version 1.18
> > and
> > uthread_sigsuspend.c version 1.11
> >
> > Thanks
> >
> > Julian
> >
> >
> >
> >
> > On Mon, 1 Jul 2002, Wesley Morgan wrote:
> >
> > > Reverting proc.h and queue.h do nothing. Booting a kernel from 20020624,
> > > still crashes all threaded systems. Same behavior on a 20020620 kernel.
> > > > I don't change any of those.
> > > >
> > > > On Mon, 1 Jul 2002, Daniel Eischen wrote:
> > > >>
> > > >> I'd suspect that it is something to do with the layout of
> > > >> the fpregs, mcontext or something like that.  Libc_r mucks
> > > >> about in jmp_buf (userland) and ucontext/mcontext, so anything
> > > >> that changed those would cause problems.
> > > >>
> > > >
> > > >
> > > > It's still unclear if a KSE kernel works with an old libc_r or visa
> > > > versa.
> > > >
> > > > I'd like to see if a new libc_r works with an old kernel (someone who
> > > > can boot kernel.back and test...)
> > > >
> > > > to check if you have a non KSE kernel,
> > > > sysctl kern.threads will only succeed in a new kernel.
> > > >
> > > >
> > > >
> > > > To Unsubscribe: send mail to [EMAIL PROTECTED]
> > > > with "unsubscribe freebsd-current" in the body of the message
> > >
> > >
> > >
> > >
> >
> 
> -- 
>_ __ ___   ___ ___ ___
>   Wesley N Morgan   _ __ ___ | _ ) __|   \
>   [EMAIL PROTECTED] _ __ | _ \._ \ |) |
>   FreeBSD: The Power To Serve  _ |___/___/___/
> Hi! I'm a .signature virus! Copy me into your ~/.signature to help me spread!
> 
> 





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer



On Mon, 1 Jul 2002, Daniel Eischen wrote:

> 
> I also made changes to uthread_sigpending.c and uthread_sigsuspend.c
> 3 days ago (lib/libc_r/uthread/...).  You can try reverting those
> changes and go back to revisions 1.18 and 1.11 respectively.

It seems that you have been exonerated.
I guess this means that the KSE changes are in some way contaminating the
build of libc_r... but what? And how?







Re: Post-KSE disaster with libc_r

2002-07-01 Thread Daniel Eischen

On Mon, 1 Jul 2002, Julian Elischer wrote:
> 
> On Mon, 1 Jul 2002, Daniel Eischen wrote:
> 
> > 
> > I also made changes to uthread_sigpending.c and uthread_sigsuspend.c
> > 3 days ago (lib/libc_r/uthread/...).  You can try reverting those
> > changes and go back to revisions 1.18 and 1.11 respectively.
> 
> It seems that you have been exhonorated..
> I guess this means that teh KSE changes are in some way contaminating the 
> build of libc_r.. but what? and how?

I'm not sure.  I would be interested in seeing any warnings from building
new libc_r.  The only places I can think of are the queues (with the
QMD debug defined, that would definitely cause problems), but that
seems to have been ruled out also when queue.h was reverted.  Did
USRSTACK or SIGSTKSZ get changed somehow?

Someone can also try going into lib/libc_r/test and running the
tests in there, to see if even simple threaded programs are borken
or not.

-- 
Dan Eischen





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer



On Mon, 1 Jul 2002, Daniel Eischen wrote:

> I'm not sure.  I would be interested in seeing any warnings from building
> new libc_r.  The only places I can think of are the queues (with the
> QMD debug defined, that would definitely cause problems), but that
> seems to have been ruled out also when queue.h was reverted.  Did
> USRSTACK or SIGSTKSZ get changed somehow?
> 
> Someone can also try going into lib/libc_r/test and running the
> tests in there, to see if even simple threaded programs are borken
> or not.

I'd try but...
cc -Wall -pipe -g3 -D_LIBC_R_ -D_REENTRANT -c mutex_d.c -o mutex_d_a.o
mutex_d.c:168: initializer element is not constant
mutex_d.c: In function `waiter':
mutex_d.c:358: warning: too few arguments for format
*** Error code 1

Stop in /usr/src/lib/libc_r/test.






Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

On Mon, 1 Jul 2002, Daniel Eischen wrote:

> 
> Someone can also try going into lib/libc_r/test and running the
> tests in there, to see if even simple threaded programs are borken
> or not.
Ah, cool.
I didn't know they were there!


Here's what happens when it is run (after fixing typos):


make: don't know how to make test. Stop
ref4# make
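Test static library:
--------------------------------------------------------------------
Test                            c_user  c_system   c_total      chng
 passed/FAILED                  h_user  h_system   h_total    % chng
--------------------------------------------------------------------
hello_d                           0.00      0.00      0.00
 passed
--------------------------------------------------------------------
hello_s                           0.00      0.01      0.01
 passed
--------------------------------------------------------------------
join_leak_d                       0.19      0.15      0.34
 passed
--------------------------------------------------------------------
mutex_d (**hangs here**)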




> 





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

Turned out to be minor... see other mail.



On Mon, 1 Jul 2002, Julian Elischer wrote:

> 
> 
> On Mon, 1 Jul 2002, Daniel Eischen wrote:
> 
> > I'm not sure.  I would be interested in seeing any warnings from building
> > new libc_r.  The only places I can think of are the queues (with the
> > QMD debug defined, that would definitely cause problems), but that
> > seems to have been ruled out also when queue.h was reverted.  Did
> > USRSTACK or SIGSTKSZ get changed somehow?
> > 
> > Someone can also try going into lib/libc_r/test and running the
> > tests in there, to see if even simple threaded programs are borken
> > or not.
> 
> I'd try but...
> cc -Wall -pipe -g3 -D_LIBC_R_ -D_REENTRANT -c mutex_d.c -o mutex_d_a.o
> mutex_d.c:168: initializer element is not constant
> mutex_d.c: In function `waiter':
> mutex_d.c:358: warning: too few arguments for format
> *** Error code 1
> 
> Stop in /usr/src/lib/libc_r/test.



Re: Post-KSE disaster with libc_r

2002-07-01 Thread Daniel Eischen


On Mon, 1 Jul 2002, Julian Elischer wrote:

> On Mon, 1 Jul 2002, Daniel Eischen wrote:
> 
> > 
> > Someone can also try going into lib/libc_r/test and running the
> > tests in there, to see if even simple threaded programs are borken
> > or not.
> A
> cool 
> I didn't know they are there!
> 
> 
> heres what happens when it is run (After fixing typos)
> 
> 
> make: don't know how to make test. Stop
> ref4# make
> Test static library:
> --
> Test  c_user c_system c_total chng
>  passed/FAILEDh_user h_system h_total   % chng
> --
> hello_d 0.00 0.000.00
>  passed 
> --
> hello_s 0.00 0.010.01
>  passed 
> --
> join_leak_d 0.19 0.150.34
>  passed 
> --
> mutex_d (**hangs here**)

This one takes quite a long time to run.  Run it by hand and you'll
see if it's really hanging or not.

-- 
Dan Eischen





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer



On Mon, 1 Jul 2002, Daniel Eischen wrote:

> 
> > mutex_d (**hangs here**)
> 
> This one takes quite a long time to run.  Run it by hand and you'll
> see if it's really hanging or not.

you're not wrong!

it takes ages.. (still running in another window)

false alarm so far!






Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer

I think that gets us a LOT closer!


Total tests 212, passed 212, failed 0
ref4# Jul  2 01:52:52 ref4 kernel: pid 330 (guard_b), uid 0: exited on
signal 11 (core dumped)
Jul  2 01:52:52 ref4 kernel: pid 334 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:52 ref4 kernel: pid 338 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:52 ref4 kernel: pid 342 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:52 ref4 kernel: pid 346 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:52 ref4 kernel: pid 350 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:53 ref4 kernel: pid 354 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:53 ref4 kernel: pid 358 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:53 ref4 kernel: pid 362 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:53 ref4 kernel: pid 366 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:53 ref4 kernel: pid 370 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:53 ref4 kernel: pid 374 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:53 ref4 kernel: pid 378 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:53 ref4 kernel: pid 382 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:54 ref4 kernel: pid 386 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:54 ref4 kernel: pid 390 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:54 ref4 kernel: pid 394 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:54 ref4 kernel: pid 398 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:54 ref4 kernel: pid 402 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:55 ref4 kernel: pid 406 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:55 ref4 kernel: pid 410 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:55 ref4 kernel: pid 414 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:55 ref4 kernel: pid 418 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:55 ref4 kernel: pid 422 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:55 ref4 kernel: pid 426 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:55 ref4 kernel: pid 430 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:55 ref4 kernel: pid 434 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:56 ref4 kernel: pid 438 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:56 ref4 kernel: pid 442 (guard_b), uid 0: exited on signal 11
(core dumped)
Jul  2 01:52:56 ref4 kernel: pid 446 (guard_b), uid 0: exited on signal 11
(core dumped)

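--------------------------------------------------------------------
sigsuspend_d                      0.00      0.01      0.01
 passed
--------------------------------------------------------------------
sigwait_d                         0.00      0.01      0.01
 *** FAILED ***
--------------------------------------------------------------------
guard_s.pl                        0.06      0.88      0.95
 *** FAILED *** (30/30 failed)
--------------------------------------------------------------------
propagate_s.pl                    0.14      0.07      0.21
 *** FAILED *** (1/1 failed)
--------------------------------------------------------------------
Totals                            3.34    151.65    154.99      0.00
 6 / 9 passed (66.67%)            0.00      0.00      0.00     0.00%
--------------------------------------------------------------------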
*** Error code 1

Stop in /usr/src/lib/libc_r/test.
in gdb:
Breakpoint 1, main (argc=3, argv=0xbfbffc10) at guard_b.c:103
103 assert(pthread_attr_init(&attr) == 0);
(gdb) s
98  fprintf(stderr, "Test begin\n");
(gdb) 
Test begin
100 stacksize = strtoul(argv[1], NULL, 10);
(gdb) 
101 guardsize = strtoul(argv[2], NULL, 10);
(gdb) 
103 assert(pthread_attr_init(&attr) == 0);
(gdb) 
108 assert(pthread_attr_getstacksize(&attr, &def_stacksize) == 0);
(gdb) 
109 assert(pthread_attr_getguardsize(&attr, &def_guardsize) == 0);
(gdb) 
110 if (def_stacksize != stacksize) {
(gdb) 
111 assert(pthread_attr_setstacksize(&attr, stacksize) == 0);
(gdb) 
112 assert(pthread_attr_getstacksize(&attr, &def_stacksize) == 0);
(gdb) 
113 assert(def_stacksize == stacksize);
(gdb) 
115 if (def_guardsize != guardsize) {
(gdb) 
116 assert(pthread_attr_setguardsize(&attr, guardsize) == 0);
(gdb) 
117 assert(pthread_attr_getguardsize(&attr, &def_guardsize) == 0);
(gdb) 
118 ass

Re: Post-KSE disaster with libc_r

2002-07-01 Thread Daniel Eischen

On Mon, 1 Jul 2002, Julian Elischer wrote:

> I think that gets us a LOT closer!
> 
> 
> Total tests 212, passed 212, failed 0
> ref4# Jul  2 01:52:52 ref4 kernel: pid 330 (guard_b), uid 0: exited on
> signal 11 (core dumped)
> Jul  2 01:52:52 ref4 kernel: pid 338 (guard_b), uid 0: exited on signal 11
> (core dumped)

I think this is supposed to SEGV.  It's testing guard pages placed
above thread stacks.

-- 
Dan Eischen





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Julian Elischer



On Mon, 1 Jul 2002, Daniel Eischen wrote:

> On Mon, 1 Jul 2002, Julian Elischer wrote:
> 
> > I think that gets us a LOT closer!
> > 
> > 
> > Total tests 212, passed 212, failed 0
> > ref4# Jul  2 01:52:52 ref4 kernel: pid 330 (guard_b), uid 0: exited on
> > signal 11 (core dumped)
> > Jul  2 01:52:52 ref4 kernel: pid 338 (guard_b), uid 0: exited on signal 11
> > (core dumped)
> 
> I think this is supposed to SEGV.  It's testing guard pages placed
> above thread stacks.

I would imagine that it is supposed to catch the SIGSEGV
instead of actually dying...

> 
> -- 
> Dan Eischen
> 
> 





Re: Post-KSE disaster with libc_r

2002-07-01 Thread Daniel Eischen

On Mon, 1 Jul 2002, Julian Elischer wrote:
> 
> On Mon, 1 Jul 2002, Daniel Eischen wrote:
> 
> > On Mon, 1 Jul 2002, Julian Elischer wrote:
> > 
> > > I think that gets us a LOT closer!
> > > 
> > > 
> > > Total tests 212, passed 212, failed 0
> > > ref4# Jul  2 01:52:52 ref4 kernel: pid 330 (guard_b), uid 0: exited on
> > > signal 11 (core dumped)
> > > Jul  2 01:52:52 ref4 kernel: pid 338 (guard_b), uid 0: exited on signal 11
> > > (core dumped)
> > 
> > I think this is supposed to SEGV.  It's testing guard pages placed
> > above thread stacks.
> 
> I would imagine that it is supposed to capture the sigsegv
> instead of actually dying...

There's a Perl script that is supposed to do that, I think:
guard_s.pl.
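
A harness that treats a crash as the expected outcome might look like the sketch below. This is an assumption about the general approach, not a description of what guard_s.pl does; all function names are invented for the example.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Runs fn() in a child process.  Returns the number of the signal that
 * killed the child, 0 if it exited on its own, or -1 if fork/wait failed.
 */
int fatal_signal_of(void (*fn)(void))
{
	struct rlimit no_core = { 0, 0 };
	int status;
	pid_t pid;

	assert(fn != NULL);
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		setrlimit(RLIMIT_CORE, &no_core);	/* keep the demo from dumping core */
		fn();
		_exit(0);
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Child body standing in for a test that overruns its stack. */
void die_with_segv(void)
{
	signal(SIGSEGV, SIG_DFL);
	raise(SIGSEGV);
}

/* Child body that returns normally. */
void exit_cleanly(void)
{
}

/* An "expected crash" test passes only if the child died of SIGSEGV. */
int expected_crash_passed(void (*fn)(void))
{
	return fatal_signal_of(fn) == SIGSEGV;
}
```

With a wrapper like this, the kernel would still log "exited on signal 11" for every child, so those console messages alone do not say whether a test passed or failed.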

-- 
Dan Eischen






Re: Post-KSE disaster with libc_r

2002-07-02 Thread Sven Petai


Hi,

I tried those libc_r test programs under a roughly month-old CURRENT, and the
output seems to be identical to yours (I didn't try gdb on it, but it gives the
same guard_b segfaults and the same programs fail).
Here's the output:

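Test static library:
--------------------------------------------------------------------
Test                            c_user  c_system   c_total      chng
 passed/FAILED                  h_user  h_system   h_total    % chng
--------------------------------------------------------------------
hello_d                           0.00      0.10      0.10
 passed
--------------------------------------------------------------------
hello_s                           0.00      0.12      0.12
 passed
--------------------------------------------------------------------
join_leak_d                       2.50      1.45      3.95
 passed
--------------------------------------------------------------------
mutex_d                           2.21    116.36    118.57
 passed
--------------------------------------------------------------------
sem_d                             0.02      0.12      0.14
 passed
--------------------------------------------------------------------
sigsuspend_d                      0.01      0.12      0.12
 passed
--------------------------------------------------------------------
sigwait_d                   User defined signal 1
                                  0.01      0.12      0.13
 *** FAILED ***
--------------------------------------------------------------------
guard_s.pl                        1.23      6.91      8.14
 *** FAILED *** (30/30 failed)
--------------------------------------------------------------------
propagate_s.pl                    3.68      0.77      4.45
 *** FAILED *** (1/1 failed)
--------------------------------------------------------------------
Totals                            4.74    118.26    123.00      0.00
 6 / 9 passed (66.67%)            0.00      0.00      0.00     0.00%
--------------------------------------------------------------------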
*** Error code 1
and lots of guard_b segfault messages to console

Stop in /usr/src/lib/libc_r/test.


On Tuesday 02 July 2002 05:15, Julian Elischer wrote:
> I think that gets us a LOT closer!
>
>
> Total tests 212, passed 212, failed 0
> ref4# Jul  2 01:52:52 ref4 kernel: pid 330 (guard_b), uid 0: exited on
> signal 11 (core dumped)

[Fwd: Re: Post-KSE disaster with libc_r]

2002-07-01 Thread Michael Nottebrock

Somehow this thread slipped into privmail.

-------- Original Message --------
Subject: Re: Post-KSE disaster with libc_r
Date: Mon, 1 Jul 2002 14:23:23 -0700 (PDT)
From: Julian Elischer <[EMAIL PROTECTED]>
To: Michael Nottebrock <[EMAIL PROTECTED]>

 > [Applied 'thediff' to pre-KSE CURRENT and rebuilt world, things
 >  break, things remain broken when booting the old kernel with the
 >  new world.]


THANKS!!!

OK, so it's libc_r for sure.

Now we have two possibilities:

1/ It's inherently broken because of a recent change.
If so, checking out one-month-old sources for libc_r
and compiling them should yield a working libc_r.
2/ Something I've committed is polluting the compile,
e.g. a namespace clash or similar in a new include file.
In this case, the new compile of old sources should yield a bad libc_r.


can someone test this?


