[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-30 Thread Evan Broder
Whoops - here's a patch that includes the LP closer.

** Attachment added: "openafs_1.4.7.dfsg1-6+ubuntu0.1.debdiff"
   http://launchpadlibrarian.net/24511582/openafs_1.4.7.dfsg1-6%2Bubuntu0.1.debdiff

** Summary changed:

- openafs-modules 1.4.8 segfault after stop
+ [SRU] openafs-modules segfault after stop

** Description changed:

+ Impact: This bug causes kernel oopses and hangs at shutdown.
+ 
+ Development: The two deltas being incorporated have been committed to
+ the upstream AFS tree, and have also been included in openafs
+ 1.4.8.dfsg1-3, which was just synced into Jaunty.
+ 
+ Patch: Attached at
+ http://launchpadlibrarian.net/24511582/openafs_1.4.7.dfsg1-6%2Bubuntu0.1.debdiff
+ - please see the comments for an explanation of the version number.
+ 
+ Regression potential: For both of these deltas, the changes are limited
+ to the shutdown code, i.e. the functionality that's affected by the
+ bugs, so I find it unlikely that they'll make anything worse, and
+ empirically they seem to fix the oopses and hangs.
+ 
  This happened after running openafs-client stop on a server
  (Ubuntu Hardy).
  
  
  [   99.016655] Starting AFS cache scan...found 45 non-empty cache files (2%).
  [   99.346761] NET: Registered protocol family 17
  [  101.010753] ip_tables: (C) 2000-2006 Netfilter Core Team
  [  101.071133] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
  [  101.403948] eth1: no IPv6 routers present
  [  102.199452] vlan11: no IPv6 routers present
  [  106.468767] tap0: no IPv6 routers present
  [73434.848255] EXT3-fs: cannot change data mode on remount
  [75198.320131] WARM shutting down of: CB... afs... BkG... CTrunc... AFSDB... RxEvent... UnmaskRxkSignals... RxListener...
  [75198.833031] WARNING: not all blocks freed: large 1 small 4
  [75198.833041]  ALL allocated tables
  [75219.895067] kjournald starting.  Commit interval 120 seconds
  [75219.915815] EXT3 FS on dm-3, internal journal
  [75219.915823] EXT3-fs: mounted filesystem with writeback data mode.
  [75253.358769] Found system call table at 0xc033a680 (pattern scan)
  [75253.358773] Address 0xc033a680 is not writable.
  [75253.358774] System call hooks will not be installed; proceeding anyway
  [75253.398880] Starting AFS cache scan...found 347 non-empty cache files (22%).
  [76028.437373] AFS isn't unmounted yet! Call aborted
  [76034.981943] AFS isn't unmounted yet! Call aborted
  [76056.414781] AFS isn't unmounted yet! Call aborted
  [76100.504630] COLD shutting down of: CB... afs... BkG... CTrunc... AFSDB... RxEvent... UnmaskRxkSignals... RxListener...
  [76101.000366] osi_linux_free: failed to remove chunk from hashtable
  (repeated about 300 times)
  [76101.000952] BUG: unable to handle kernel paging request at virtual address f8f5a020
  [76101.001039] printing eip: f8e09364 *pdpt = 4001 *pde = 35836067 *pte =
  [76101.001140] Oops:  [#1] SMP 
  [76101.001186] Modules linked in: openafs(P) ipt_REDIRECT ipt_REJECT xt_limit xt_state xt_tcpudp iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack iptable_mangle iptable_filter ip_tables x_tables af_packet nfsd auth_rpcgss exportfs tun container battery ac video output sbs sbshc dock nfs lockd nfs_acl sunrpc 8021q tcp_bic parport_pc lp parport loop ipv6 usbhid hid iTCO_wdt iTCO_vendor_support button shpchp pci_hotplug evdev pcspkr ext3 jbd mbcache ata_generic sg sd_mod ata_piix pata_acpi libata ehci_hcd uhci_hcd usbcore tg3 mptsas mptscsih mptbase scsi_transport_sas scsi_mod dm_mirror dm_snapshot dm_mod thermal processor fan fbcon tileblit font bitblit softcursor fuse
  [76101.001836] 
  [76101.001872] Pid: 17524, comm: umount Tainted: P (2.6.24-23-server #1)
  [76101.001926] EIP: 0060:[] EFLAGS: 00010282 CPU: 0
  [76101.001997] EIP is at shutdown_vcache+0xe4/0x140 [openafs]
  [76101.002045] EAX: f8f5a01c EBX: f8f5a01c ECX: f8a0a0b0 EDX: f8a0a218
  [76101.002096] ESI: 0400 EDI: f8e6f080 EBP: df9fce00 ESP: d686bee8
  [76101.002147]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
  [76101.002195] Process umount (pid: 17524, ti=d686a000 task=c249f140 task.ti=d686a000)
  [76101.002250] Stack: df9fce00 f8e67380 f8dff329 d69db400 f8e43cc4 f8e3e706 df9fce00 f8e67380
  [76101.002355]        d69db400 df9fce00 c019c3e5 c01b088b  0017 f8e67360 c019c4a9
  [76101.002461]        df9fce00 c019c55d  d686bf40 c01b0d36  ecc11908 d69db400
  [76101.002567] Call Trace:
  [76101.002637]  [] shutdown_cache+0x39/0xd0 [openafs]
  [76101.002702]  [] afs_shutdown+0x204/0x2a0 [openafs]
  [76101.002769]  [] afs_put_super+0x66/0xe0 [openafs]
  [76101.002836]  [] generic_shutdown_super+0x55/0xf0
  [76101.002888]  [] mntput_no_expire+0x3b/0x70
  [76101.002938]  [] kill_anon_super+0x9/0x40
  [76101.002987]  [] deactivate_super+0x5d/0x80
  [76101.003036]  [] sys_umount+0x46/0x250
  [76101.003086]  [] sys_stat64+0xf/0x30
  [76101.003133]  [] remove_vma+0x39/0x50
  [76101.003181]  [] do_munmap+0x180/0x1f0
  [76101.003232]  [] sys_oldumo

[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-30 Thread Evan Broder
Here's a patch for an SRU that includes the upstream deltas
STABLE14-cbr-free-what-you-alloc-20090325 and
STABLE14-shutdown-vcache-avoid-null-deref-20090324. The bugs they fix
cause similar symptoms at shutdown, and it seems that both fixes are
sometimes needed.

The version number in the SRU (1.4.7.dfsg1-6+ubuntu0.1) intentionally
deviates from the standard SRU version numbering scheme for the sake of
the OpenAFS kernel modules. Without the plus, kernel modules built from
the SRU would not get a higher version number than the current modules:

priscus:~ evan$ dpkg --compare-versions '1.4.7.dfsg1-6+2.6.27-11.27' lt '1.4.7.dfsg1-6ubuntu0.1+2.6.27-11.27' && echo "Yes" || echo "No"
No
priscus:~ evan$ dpkg --compare-versions '1.4.7.dfsg1-6+2.6.27-11.27' lt '1.4.7.dfsg1-6+ubuntu0.1+2.6.27-11.27' && echo "Yes" || echo "No"
Yes
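
For anyone wondering why the plus makes a difference, here is a rough
Python sketch of the Debian version ordering rules as I understand them
(letters sort before other symbols such as '+', and '~' sorts before
everything). It is an illustration only, not dpkg's actual code; the
function names (compare_versions, split_version and the helpers) are made
up for this example, and dpkg --compare-versions remains the authority.

```python
"""Sketch of Debian version ordering; assumed to follow dpkg's rules."""


def _is_digit(ch):
    # ASCII digits only; the empty string (end of input) is not a digit.
    return len(ch) == 1 and "0" <= ch <= "9"


def _order(ch):
    # Sort weight of one character inside a non-digit run.
    if ch == "" or _is_digit(ch):
        return 0              # end of string, or the start of a number
    if ch == "~":
        return -1             # tilde sorts before everything, even the end
    if ch.isascii() and ch.isalpha():
        return ord(ch)        # letters sort before all other symbols
    return ord(ch) + 256      # '+', '.', '-' and so on sort after letters


def _compare_part(a, b):
    # Compare two upstream-version or revision strings; returns -1, 0 or 1.
    i = j = 0
    while i < len(a) or j < len(b):
        # Step 1: compare the non-digit runs character by character.
        while (i < len(a) and not _is_digit(a[i])) or \
              (j < len(b) and not _is_digit(b[j])):
            ac = _order(a[i] if i < len(a) else "")
            bc = _order(b[j] if j < len(b) else "")
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # Step 2: compare the following digit runs as integers.
        si, sj = i, j
        while i < len(a) and _is_digit(a[i]):
            i += 1
        while j < len(b) and _is_digit(b[j]):
            j += 1
        na = int(a[si:i] or "0")
        nb = int(b[sj:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def split_version(version):
    # Split "[epoch:]upstream[-revision]"; the revision follows the LAST hyphen.
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    # Returns -1 if a sorts before b, 0 if they are equal, 1 if a sorts after b.
    ea, ua, ra = split_version(a)
    eb, ub, rb = split_version(b)
    if ea != eb:
        return -1 if ea < eb else 1
    return _compare_part(ua, ub) or _compare_part(ra, rb)


current = "1.4.7.dfsg1-6+2.6.27-11.27"
print(compare_versions(current, "1.4.7.dfsg1-6ubuntu0.1+2.6.27-11.27"))
print(compare_versions(current, "1.4.7.dfsg1-6+ubuntu0.1+2.6.27-11.27"))
```

In the first comparison the strings diverge at '+' versus 'u', and since
letters sort first, the SRU module version would sort lower than the
current one. In the second, both continue with '+', and the current
version then reaches a digit while the SRU version still has letters, so
the SRU version sorts higher.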

I'll update the bug description in a bit for the SRU request.


** Attachment added: "openafs_1.4.7.dfsg1-6+ubuntu0.1.debdiff"
   http://launchpadlibrarian.net/24510973/openafs_1.4.7.dfsg1-6%2Bubuntu0.1.debdiff

-- 
openafs-modules 1.4.8 segfault after stop
https://bugs.launchpad.net/bugs/333197
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-30 Thread Launchpad Bug Tracker
This bug was fixed in the package openafs - 1.4.8.dfsg1-3

---
openafs (1.4.8.dfsg1-3) unstable; urgency=low

  * Apply upstream CVS deltas:
    - STABLE14-cbr-free-what-you-alloc-20090325: dequeue items in the same
      way they were allocated.
    - STABLE14-shutdown-vcache-avoid-null-deref-20090324: avoid oops on
      shutdown.  (LP: #333197)
    - STABLE14-uphys-invalidate-returns-void-20081130: fix apparent Ubik
      synchronization errors due to incorrect use of a void return value.
  * Update package sections for the new archive organization.

 -- Evan Broder  Mon, 30 Mar 2009 11:14:46 +0100

** Changed in: openafs (Ubuntu)
   Status: New => Fix Released



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-25 Thread Evan Broder
** Description changed:

  This happened after running openafs-client stop on a server
  (Ubuntu Hardy).
  
  
  [   99.016655] Starting AFS cache scan...found 45 non-empty cache files (2%).
  [   99.346761] NET: Registered protocol family 17
  [  101.010753] ip_tables: (C) 2000-2006 Netfilter Core Team
  [  101.071133] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
  [  101.403948] eth1: no IPv6 routers present
  [  102.199452] vlan11: no IPv6 routers present
  [  106.468767] tap0: no IPv6 routers present
  [73434.848255] EXT3-fs: cannot change data mode on remount
  [75198.320131] WARM shutting down of: CB... afs... BkG... CTrunc... AFSDB... RxEvent... UnmaskRxkSignals... RxListener...
  [75198.833031] WARNING: not all blocks freed: large 1 small 4
  [75198.833041]  ALL allocated tables
  [75219.895067] kjournald starting.  Commit interval 120 seconds
  [75219.915815] EXT3 FS on dm-3, internal journal
  [75219.915823] EXT3-fs: mounted filesystem with writeback data mode.
  [75253.358769] Found system call table at 0xc033a680 (pattern scan)
  [75253.358773] Address 0xc033a680 is not writable.
  [75253.358774] System call hooks will not be installed; proceeding anyway
  [75253.398880] Starting AFS cache scan...found 347 non-empty cache files (22%).
  [76028.437373] AFS isn't unmounted yet! Call aborted
  [76034.981943] AFS isn't unmounted yet! Call aborted
  [76056.414781] AFS isn't unmounted yet! Call aborted
  [76100.504630] COLD shutting down of: CB... afs... BkG... CTrunc... AFSDB... RxEvent... UnmaskRxkSignals... RxListener...
  [76101.000366] osi_linux_free: failed to remove chunk from hashtable
- [76101.000369] osi_linux_free: failed to remove chunk from hashtable
- [76101.000370] osi_linux_free: failed to remove chunk from hashtable
- [76101.000375] osi_linux_free: failed to remove chunk from hashtable
- [76101.000376] osi_linux_free: failed to remove chunk from hashtable
- [76101.000379] osi_linux_free: failed to remove chunk from hashtable
- [76101.000380] osi_linux_free: failed to remove chunk from hashtable
- [76101.000381] osi_linux_free: failed to remove chunk from hashtable
- [76101.000384] osi_linux_free: failed to remove chunk from hashtable
- [76101.000385] osi_linux_free: failed to remove chunk from hashtable
- [76101.000388] osi_linux_free: failed to remove chunk from hashtable
- [76101.000389] osi_linux_free: failed to remove chunk from hashtable
- [76101.000392] osi_linux_free: failed to remove chunk from hashtable
- [76101.000393] osi_linux_free: failed to remove chunk from hashtable
- [76101.000396] osi_linux_free: failed to remove chunk from hashtable
- [76101.000397] osi_linux_free: failed to remove chunk from hashtable
- [76101.000400] osi_linux_free: failed to remove chunk from hashtable
- [76101.000401] osi_linux_free: failed to remove chunk from hashtable
- [76101.000402] osi_linux_free: failed to remove chunk from hashtable
- [76101.000403] osi_linux_free: failed to remove chunk from hashtable
- [76101.000405] osi_linux_free: failed to remove chunk from hashtable
- [76101.000406] osi_linux_free: failed to remove chunk from hashtable
- [76101.000408] osi_linux_free: failed to remove chunk from hashtable
- [76101.000409] osi_linux_free: failed to remove chunk from hashtable
- [76101.000410] osi_linux_free: failed to remove chunk from hashtable
- [76101.000413] osi_linux_free: failed to remove chunk from hashtable
- [76101.000414] osi_linux_free: failed to remove chunk from hashtable
- [76101.000417] osi_linux_free: failed to remove chunk from hashtable
- [76101.000418] osi_linux_free: failed to remove chunk from hashtable
- [76101.000421] osi_linux_free: failed to remove chunk from hashtable
- [76101.000422] osi_linux_free: failed to remove chunk from hashtable
- [76101.000423] osi_linux_free: failed to remove chunk from hashtable
- [76101.000424] osi_linux_free: failed to remove chunk from hashtable
- [76101.000427] osi_linux_free: failed to remove chunk from hashtable
- [76101.000428] osi_linux_free: failed to remove chunk from hashtable
- [76101.000431] osi_linux_free: failed to remove chunk from hashtable
- [76101.000432] osi_linux_free: failed to remove chunk from hashtable
- [76101.000434] osi_linux_free: failed to remove chunk from hashtable
- [76101.000435] osi_linux_free: failed to remove chunk from hashtable
- [76101.000436] osi_linux_free: failed to remove chunk from hashtable
- [76101.000438] osi_linux_free: failed to remove chunk from hashtable
- [76101.000440] osi_linux_free: failed to remove chunk from hashtable
- [76101.000441] osi_linux_free: failed to remove chunk from hashtable
- [76101.000442] osi_linux_free: failed to remove chunk from hashtable
- [76101.000443] osi_linux_free: failed to remove chunk from hashtable
- [76101.000445] osi_linux_free: failed to remove chunk from hashtable
- [76101.000446] osi_linux_free: failed to remove chunk from hashtable
- [76101.000449] osi_linux_free: failed to remove chunk from hashtable
- [76101.000450] osi_linux_free: failed to 

[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-10 Thread Evan Broder
One of the OpenAFS developers (Chaskiel) gave us this patch to try.
We'll be testing it over the next few days on some of our machines to
see if it fixes the problem.

I should warn anyone interested in trying the patch that it has
absolutely not been tested yet, or even built.

We'll report back once we've had some time to test it.

** Attachment added: "UNTESTED patch"
   http://launchpadlibrarian.net/23736017/cbr-only-free-what-you-alloc.diff



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey
Phew. Sorry for all the noise.


** Attachment added: "#1 last screenful."
   http://launchpadlibrarian.net/23566812/afs-oops-1.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey

** Attachment added: "#2 eighth screenful"
   http://launchpadlibrarian.net/23566804/afs-oops-2.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey

** Attachment added: "#3, seventh screenful"
   http://launchpadlibrarian.net/23566785/afs-oops-3.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey

** Attachment added: "#4 sixth screenful"
   http://launchpadlibrarian.net/23566768/afs-oops-4.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey

** Attachment added: "#5 fifth screenful"
   http://launchpadlibrarian.net/23566756/afs-oops-5.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey

** Attachment added: "#6 fourth screenful"
   http://launchpadlibrarian.net/23566738/afs-oops-6.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey

** Attachment added: "#7, third screenful"
   http://launchpadlibrarian.net/23566733/afs-oops-7.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey

** Attachment added: "second screenful"
   http://launchpadlibrarian.net/23566705/afs-oops-8.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread William Cattey
This just happened to me in a VM. Alas, cut/paste wasn't available, but I
WAS able to scroll back and get the ENTIRE backtrace as a series of images.

Start with afs-oops-9.png (the first screenful) and work your way down to
afs-oops-1.png (the last).


** Attachment added: "First screenful of dump."
   http://launchpadlibrarian.net/23566698/afs-oops-9.png



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread Nelson Elhage
We've seen what looks to be the same oops on 1.4.7 in Intrepid, using
the stock package and kernels built with no additional packages.

I don't have a full stack trace because the oops tends to happen at
reboot and doesn't make it to disk or anywhere else we can access it.
The last bit of our stack trace, copied down manually from the console,
is:

=
BUG: unable to handle kernel paging request at f89e003c
IP: [] :openafs:shutdown_vcache+0xf8/0x160
...
... EFLAGS: 00010282
EAX: f89e0038 EBX: f89e0038 ECX: f8f400b0 EDX: 0246
ESI: 0400 EDI: f90da280 EBP: f57ebf14 ESP: f57ebf0c
...
Stack: f4d84800 f9d09c0 f57ebf20 f906815c f90d09a0 f57ebf28
...
Call Trace:
  [] ? shutdown_cache+0x3c/0xd0 [openafs]
  [] ? afs_shutdown+0x204/0x310 [openafs]
  [] ? afs_put_super+0x5b/0xf0 [openafs]
[followed by some nonmodule unmounting functions]
...
Code: ... c0 74 17 8d 74 26 00 <8b> 58 04 ba d0 20 00 00 ...
=

We also see a WARNING at kernel/exit.c:1001 do_exit+0x353/0x360() from
something called by shutdown_vcache -> error_code -> do_page_fault ->
afs_lhash_address -> -> oops_end -> -> etc.



[Bug 333197] Re: openafs-modules 1.4.8 segfault after stop

2009-03-06 Thread Marco Rodrigues
** Changed in: openafs (Ubuntu)
   Sourcepackagename: None => openafs
