Re: ffs_copyonwrite panics

2010-06-11 Thread Jeremie Le Hen
Hi Roman,

On Mon, May 24, 2010 at 03:21:41PM +0400, Roman Bogorodskiy wrote:
 I am not sure how to save coredump as when the system boots after the
 crash and starts saving coredump from swap partition to disk the system
 crashes again.
 
 Generally, the system is almost unusable and in order to try a new
 kernel I cross-compile it on my i386 laptop and copy in using livefs
 cdrom.
 
 Do you have an idea how to save a trace?

Sorry for the late reply.  If you're still experiencing this issue, once
your kernel has crashed and dumped its memory, you can reboot using your
previously working kernel.  You will then be able to save the core to
disk.

Regards,
-- 
Jeremie Le Hen

Humans are born free and equal.  But some are more equal than others.
Coluche
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: ffs_copyonwrite panics

2010-06-11 Thread Roman Bogorodskiy
  Jeremie Le Hen wrote:

 Hi Roman,
 
 On Mon, May 24, 2010 at 03:21:41PM +0400, Roman Bogorodskiy wrote:
  I am not sure how to save coredump as when the system boots after the
  crash and starts saving coredump from swap partition to disk the system
  crashes again.
  
  Generally, the system is almost unusable and in order to try a new
  kernel I cross-compile it on my i386 laptop and copy in using livefs
  cdrom.
  
  Do you have an idea how to save a trace?
 
 Sorry for the late reply.  If you're still undergoing this issue, once
 your kernel has crashed and dumped his memory, you can reboot using your
 previously working kernel.  You will be able to save the core to the
 disk.

I tried the old kernel when I first spotted this issue, but with the old
kernel and new world I wasn't able to use ppp, so I gave up on that
quickly. Back then I didn't have dumpdev configured, so I wasn't able to
save a dump.

Anyway, in the end I tried various kernels, including an 8.0 kernel and
world, but was still having problems writing to disk. Finally, I did a
reinstall of 8.0 with newfs for all partitions and it now works fine, so
the fs was probably damaged somehow.

Roman Bogorodskiy




Re: ffs_copyonwrite panics

2010-05-24 Thread Roman Bogorodskiy
  Jeff Roberson wrote:

 Tried today's -CURRENT and unfortunately the behaviour is still same.
 
 Can you give me a full stack trace?  Do you have coredumps enabled?
 I would like to have you look at a few things in a core or send it
 to me with your kernel.

I am not sure how to save a coredump: when the system boots after the
crash and starts saving the coredump from the swap partition to disk, it
crashes again.

Generally, the system is almost unusable, and in order to try a new
kernel I cross-compile it on my i386 laptop and copy it in using a livefs
cdrom.

Do you have an idea how to save a trace?

Thanks,

Roman Bogorodskiy




Re: ffs_copyonwrite panics

2010-05-23 Thread Roman Bogorodskiy
  Jeff Roberson wrote:

 On Tue, 18 May 2010, Roman Bogorodskiy wrote:
 
  Hi,
 
  I've been using -CURRENT last update in February for quite a long time
  and few weeks ago decided to finally update it. The update was quite
  unfortunate as system became very unstable: it just hangs few times a
  day and panics sometimes.
 
  Some things can be reproduced, some cannot. Reproducible ones:
 
  1. background fsck always makes system hang
  2. system crashes on operations with nullfs mounts (disabled that for
  now)
 
  The most annoying one is ffs_copyonwrite panic which I cannot reproduce.
  The thing is that if I will run 'startx' on it with some X apps it will
  panic just in few minutes. When I leave the box with nearly no stress
  (just use it as internet gateway for my laptop) it behaves a little
  better but will eventually crash in few hours anyway.
 
 This may have been my fault.  Can you please update and let me know if it 
 is resolved?  There was both a deadlock and a copyonwrite panic as a 
 result of the softupdates journaling import.  I just fixed the deadlock 
 today.

Tried today's -CURRENT and unfortunately the behaviour is still the same.

Roman Bogorodskiy


Re: ffs_copyonwrite panics

2010-05-23 Thread Jeff Roberson

On Sun, 23 May 2010, Roman Bogorodskiy wrote:


 Jeff Roberson wrote:

  On Tue, 18 May 2010, Roman Bogorodskiy wrote:

   Hi,

   I've been using -CURRENT last update in February for quite a long time
   and few weeks ago decided to finally update it. The update was quite
   unfortunate as system became very unstable: it just hangs few times a
   day and panics sometimes.

   Some things can be reproduced, some cannot. Reproducible ones:

   1. background fsck always makes system hang
   2. system crashes on operations with nullfs mounts (disabled that for
   now)

   The most annoying one is ffs_copyonwrite panic which I cannot reproduce.
   The thing is that if I will run 'startx' on it with some X apps it will
   panic just in few minutes. When I leave the box with nearly no stress
   (just use it as internet gateway for my laptop) it behaves a little
   better but will eventually crash in few hours anyway.

  This may have been my fault.  Can you please update and let me know if it
  is resolved?  There was both a deadlock and a copyonwrite panic as a
  result of the softupdates journaling import.  I just fixed the deadlock
  today.

 Tried today's -CURRENT and unfortunately the behaviour is still same.

Can you give me a full stack trace?  Do you have coredumps enabled?  I
would like to have you look at a few things in a core or send it to me
with your kernel.

Thanks,
Jeff

 Roman Bogorodskiy




Re: ffs_copyonwrite panics

2010-05-19 Thread Jeff Roberson

On Tue, 18 May 2010, Roman Bogorodskiy wrote:


 Hi,

 I've been using -CURRENT last update in February for quite a long time
 and few weeks ago decided to finally update it. The update was quite
 unfortunate as system became very unstable: it just hangs few times a
 day and panics sometimes.

 Some things can be reproduced, some cannot. Reproducible ones:

 1. background fsck always makes system hang
 2. system crashes on operations with nullfs mounts (disabled that for
 now)

 The most annoying one is ffs_copyonwrite panic which I cannot reproduce.
 The thing is that if I will run 'startx' on it with some X apps it will
 panic just in few minutes. When I leave the box with nearly no stress
 (just use it as internet gateway for my laptop) it behaves a little
 better but will eventually crash in few hours anyway.

This may have been my fault.  Can you please update and let me know if it
is resolved?  There was both a deadlock and a copyonwrite panic as a
result of the softupdates journaling import.  I just fixed the deadlock
today.

Thanks,
Jeff

 The even more annoying thing is that I cannot save the dump,
 because when the system boots and runs 'savecore' it leads to
 ffs_copyonwrite panic as well. The panic happens when about 90% complete
 (as seen via ctrl-t).

 Any ideas how to debug and get rid of this issue?

 System arch is amd64. I don't know what other details could be useful.

 Roman Bogorodskiy




ffs_copyonwrite panics

2010-05-18 Thread Roman Bogorodskiy
Hi,

I had been running a -CURRENT last updated in February for quite a long
time, and a few weeks ago I finally decided to update it. The update was
quite unfortunate, as the system became very unstable: it hangs a few
times a day and sometimes panics.

Some things can be reproduced, some cannot. Reproducible ones:

1. background fsck always makes the system hang
2. the system crashes on operations with nullfs mounts (disabled that for
now)

The most annoying one is the ffs_copyonwrite panic, which I cannot
reproduce. The thing is that if I run 'startx' with some X apps, it
panics within a few minutes. When I leave the box with nearly no stress
(just using it as an internet gateway for my laptop) it behaves a little
better, but it still crashes within a few hours.

Even more annoying, I cannot save the dump: when the system boots and
runs 'savecore', that leads to an ffs_copyonwrite panic as well. The
panic happens when savecore is about 90% complete (as seen via ctrl-t).

Any ideas how to debug and get rid of this issue?

System arch is amd64. I don't know what other details could be useful.

Roman Bogorodskiy




Re: ffs_copyonwrite panics

2010-05-18 Thread Fabian Keil
Roman Bogorodskiy bogorods...@gmail.com wrote:

 I've been using -CURRENT last update in February for quite a long time
 and few weeks ago decided to finally update it. The update was quite
 unfortunate as system became very unstable: it just hangs few times a
 day and panics sometimes.
 
 Some things can be reproduced, some cannot. Reproducible ones:
 
 1. background fsck always makes system hang
 2. system crashes on operations with nullfs mounts (disabled that for
 now)
 
 The most annoying one is ffs_copyonwrite panic which I cannot reproduce.
 The thing is that if I will run 'startx' on it with some X apps it will
 panic just in few minutes. When I leave the box with nearly no stress
 (just use it as internet gateway for my laptop) it behaves a little
 better but will eventually crash in few hours anyway.
 
 The even more annoying thing is that I cannot save the dump,
 because when the system boots and runs 'savecore' it leads to
 ffs_copyonwrite panic as well. The panic happens when about 90% complete
 (as seen via ctrl-t).
 
 Any ideas how to debug and get rid of this issue?
 
 System arch is amd64. I don't know what other details could be useful.

I'm not familiar with the background fsck issue, but if the nullfs
panic looks like this one, there's a fair chance it's already fixed:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0x82412f14
stack pointer           = 0x28:0xff803e564620
frame pointer           = 0x28:0xff803e564770
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1825 (jail)
panic: from debugger
cpuid = 0
Uptime: 38s
Dumping 1992 MB (5 chunks)
  chunk 0: 1MB (155 pages) ... ok
  chunk 1: 1990MB (509345 pages) 1974 [...] 6 ... ok
  chunk 2: 2MB (273 pages) ... ok
  chunk 3: 1MB (184 pages)

#0  doadump () at pcpu.h:223
223 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) #0  doadump () at pcpu.h:223
#1  0x803c506f in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x803c546c in panic (fmt=Variable fmt is not available.
)
at /usr/src/sys/kern/kern_shutdown.c:590
#3  0x801f6e77 in db_panic (addr=Variable addr is not available.
)
at /usr/src/sys/ddb/db_command.c:478
#4  0x801f7281 in db_command (last_cmdp=0x808bfd80, 
cmd_table=Variable cmd_table is not available.

) at /usr/src/sys/ddb/db_command.c:445
#5  0x801f74d0 in db_command_loop ()
at /usr/src/sys/ddb/db_command.c:498
#6  0x801f9429 in db_trap (type=Variable type is not available.
) at /usr/src/sys/ddb/db_main.c:229
#7  0x803f3c25 in kdb_trap (type=12, code=0, tf=0xff803e564570)
at /usr/src/sys/kern/subr_kdb.c:535
#8  0x8062ad9d in trap_fatal (frame=0xff803e564570, eva=Variable 
eva is not available.
)
at /usr/src/sys/amd64/amd64/trap.c:773
#9  0x8062b0fc in trap_pfault (frame=0xff803e564570, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:694
#10 0x8062b8ff in trap (frame=0xff803e564570)
at /usr/src/sys/amd64/amd64/trap.c:451
#11 0x80611f33 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:223
#12 0x82412f14 in null_bypass (ap=0xff803e564780)
at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:269
#13 0x80448104 in vgonel (vp=0xff0005e05780) at vnode_if.h:1099
#14 0x8044835e in vrecycle (vp=0xff0005e05780, td=Variable td is 
not available.
)
at /usr/src/sys/kern/vfs_subr.c:2505
#15 0x82412e6f in null_inactive (ap=Variable ap is not available.
)
at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:665
#16 0x80444ff8 in vinactive (vp=0xff0005e05780, 
td=0xff00054743e0) at vnode_if.h:807
#17 0x804495dd in vputx (vp=0xff0005e05780, func=2)
at /usr/src/sys/kern/vfs_subr.c:2226
#18 0x8043e1ae in lookup (ndp=0xff803e564a50)
at /usr/src/sys/kern/vfs_lookup.c:905
#19 0x8043eef7 in namei (ndp=0xff803e564a50)
at /usr/src/sys/kern/vfs_lookup.c:269
#20 0x8044ec86 in kern_accessat (td=0xff00054743e0, fd=-100, 
path=0x800537000 Address 0x800537000 out of bounds, pathseg=Variable 
pathseg is not available.
)
at /usr/src/sys/kern/vfs_syscalls.c:2140
#21 0x8062b21d in syscall (frame=0xff803e564c80)
at /usr/src/sys/amd64/amd64/trap.c:946
#22 0x80612211 in Xfast_syscall ()
at /usr/src/sys/amd64/amd64/exception.S:374
#23 0x00080050e5ec in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) 
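
A minimal sketch (not from the original message) for checking whether
another panic "looks like this one": it pulls the function names out of
a kgdb backtrace so two traces can be compared frame by frame. The
helper names frame_functions and shared_frames are hypothetical, and
the pattern only assumes frame lines shaped like the ones in the trace
above.

```python
import re

# Matches kgdb frame headers such as
#   "#12 0x82412f14 in null_bypass (ap=...)"  or  "#0  doadump () at pcpu.h:223"
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([\w?]+)\s*\(")


def frame_functions(trace: str) -> list[str]:
    """Return function names in frame order, skipping repeated frame numbers."""
    seen = set()
    names = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m is None:
            continue  # continuation line ("at /usr/src/...") or other output
        number = int(m.group(1))
        if number in seen:  # e.g. frame #0 printed twice, as in the trace above
            continue
        seen.add(number)
        names.append(m.group(2))
    return names


def shared_frames(a: str, b: str) -> list[str]:
    """Function names present in both traces, in the order of trace `a`."""
    in_b = set(frame_functions(b))
    return [name for name in frame_functions(a) if name in in_b]


# A few lines copied from the trace above, used as sample input.
sample = """\
#0  doadump () at pcpu.h:223
(kgdb) #0  doadump () at pcpu.h:223
#11 0x80611f33 in calltrap ()
#12 0x82412f14 in null_bypass (ap=0xff803e564780)
at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:269
#13 0x80448104 in vgonel (vp=0xff0005e05780) at vnode_if.h:1099
"""
print(frame_functions(sample))
```

If your own trace shares the null_inactive, vrecycle, vgonel and
null_bypass frames with this one, it is probably the same nullfs
problem; matching names are only a hint, not proof.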

I could reproduce it with:

FreeBSD 9.0-CURRENT #66 r+3fe665b: Fri May 14 17:45:10 CEST 2010
f...@r500.local:/usr/obj/usr/src/sys/ZOEY amd64

but