Bug#735139: [Nbd] nbd recovery after suspend/resume

2014-02-22 Thread Wouter Verhelst
Hi Paul,

On Friday 21 February 2014 22:54:43, Paul Clements wrote:
 Is it a kernel related thing, or could it be the fix to nbd-client signal
 handling as detailed in this thread?

I doubt that it is userspace-related. When reading the bug log, no mention is 
made of changes in userspace, only of using a more recent kernel.

However, that doesn't mean there are no such changes. Ernesto, did you change 
anything in the way you ran the NBD device?

If not, can you clarify how exactly you're establishing the client side of the 
NBD connection?

Thanks,

 https://www.mail-archive.com/nbd-general@lists.sourceforge.net/msg01568.html
 On Wed, Feb 19, 2014 at 6:10 PM, Ben Hutchings b...@decadent.org.uk wrote:
  Ernesto reported that nbd mounts break after suspend/resume when running
  Linux 3.2.51:
   [48080.515468] block nbd1: Attempted send on closed socket
   [48080.515473] end_request: I/O error, dev nbd1, sector 91896
   [48080.515718] block nbd1: Attempted send on closed socket
   [48080.515721] end_request: I/O error, dev nbd1, sector 91896
   [48080.515752] [ cut here ]
   [48080.515863] kernel BUG at
  /build/linux-rrsxby/linux-3.2.51/fs/buffer.c:2917!
   [48080.516010] invalid opcode:  [#1] SMP
   [48080.516176] CPU 0
   [48080.516188] Modules linked in: snd_usb_audio snd_usbmidi_lib
  snd_seq_midi snd_seq_midi_event snd_rawmidi nls_utf8 nls_cp437 vfat fat nbd
  cbc ecb vmnet(O) vsock(O) vmci(O) vmmon(O) parport_pc ppdev lp parport
  cpufreq_conservative bnep cpufreq_userspace cpufreq_stats cpufreq_powersave
  rfcomm 8021q garp stp binfmt_misc uinput nfsd nfs nfs_acl auth_rpcgss
  fscache lockd sunrpc loop fuse ecryptfs dm_crypt dm_mod snd_hda_codec_hdmi
  snd_hda_codec_conexant pl2303 usbserial arc4 iwlwifi joydev btusb mac80211
  bluetooth snd_hda_intel snd_hda_codec snd_hwdep snd_pcm i915 drm_kms_helper
  snd_page_alloc drm iTCO_wdt iTCO_vendor_support snd_seq cfg80211
  snd_seq_device snd_timer snd evdev soundcore i2c_i801 dell_laptop
  i2c_algo_bit i2c_core rfkill coretemp acpi_cpufreq mperf video pcspkr
  dcdbas psmouse dell_wmi ac serio_raw sparse_keymap processor button battery
  power_supply wmi ext4 crc16 jbd2 mbcache usbhid hid ums_realtek usb_storage
  sg sr_mod sd_mod cdrom crc_t10dif xhci_hcd crc32c_intel ghash_clmulni_intel
  aesni_intel ahci libahci aes_x86_64 thermal thermal_sys libata atl1c
  scsi_mod ehci_hcd aes_generic cryptd usbcore usb_common [last unloaded:
  scsi_wait_scan]
   [48080.520191]
   [48080.520931] Pid: 7672, comm: make Tainted: G   O
  3.2.0-4-amd64 #1 Debian 3.2.51-1 Dell Inc.  Dell System Inspiron
  N411Z/
   [48080.521803] RIP: 0010:[8111ccc3]  [8111ccc3]
  submit_bh+0x19/0xff
   [48080.522674] RSP: 0018:88017a5e5a68  EFLAGS: 00010246
   [48080.523557] RAX: 00040005 RBX: 8800c947af68 RCX:
  0004
   [48080.524480] RDX:  RSI: 8800c947af68 RDI:
  0211
   [48080.525417] RBP: 0211 R08: 0200 R09:
  8168f0a0
   [48080.526246] R10: 880107a798c0 R11: 880107a798c0 R12:
  8800c919e400
   [48080.527186] R13: 0001 R14: 0001f381 R15:
  03c94245
   [48080.528204] FS:  7fea81a02700() GS:88019fa0()
  knlGS:
   [48080.529252] CS:  0010 DS:  ES:  CR0: 80050033
   [48080.530326] CR2: 019c8000 CR3: 0001613a1000 CR4:
  000406f0
   [48080.531435] DR0:  DR1:  DR2:
   [48080.532557] DR3:  DR6: 0ff0 DR7:
  0400
   [48080.533691] Process make (pid: 7672, threadinfo 88017a5e4000,
  task 8800d066f650)
   [48080.534863] Stack:
   [48080.536040]  8800c947af68 0211 8800c919e400
  8111f577
   [48080.537291]  8800d066f650 811be4ff 8800c947af68
  8800c947af68
   [48080.538558]  88015c1cac00 a01c4fcd a01e5e1d
  88000dd8e840
   [48080.539848] Call Trace:
   [48080.541140]  [8111f577] ? __sync_dirty_buffer+0x52/0x87
   [48080.542474]  [811be4ff] ? __percpu_counter_sum+0x44/0x57
   [48080.543861]  [a01c4fcd] ? ext4_commit_super+0x191/0x1d3
  [ext4]
   [48080.545251]  [a01c636e] ? ext4_error_inode+0x4c/0xef [ext4]
   [48080.546654]  [a01b4275] ? ext4_find_entry+0x1eb/0x298 [ext4]
   [48080.548096]  [a01b4350] ? ext4_lookup+0x2e/0x11c [ext4]
   [48080.549522]  [8110b1d3] ? __d_alloc+0x12c/0x13c
   [48080.550964]  [81102709] ? d_alloc_and_lookup+0x3a/0x60
   [48080.552429]  [811031ad] ? walk_component+0x219/0x406
   [48080.553934]  [810bdce1] ? add_page_to_lru_list+0x64/0x64
   [48080.555443]  [81104041] ? path_lookupat+0x7c/0x2bd
   [48080.556949]  [81036628] ? 

Bug#735139: [Nbd] nbd recovery after suspend/resume

2014-02-21 Thread Paul Clements
Is it a kernel related thing, or could it be the fix to nbd-client signal
handling as detailed in this thread?

https://www.mail-archive.com/nbd-general@lists.sourceforge.net/msg01568.html


On Wed, Feb 19, 2014 at 6:10 PM, Ben Hutchings b...@decadent.org.uk wrote:

 Ernesto reported that nbd mounts break after suspend/resume when running
 Linux 3.2.51:

  [48080.515468] block nbd1: Attempted send on closed socket
  [48080.515473] end_request: I/O error, dev nbd1, sector 91896
  [48080.515718] block nbd1: Attempted send on closed socket
  [48080.515721] end_request: I/O error, dev nbd1, sector 91896
  [48080.515752] [ cut here ]
  [48080.515863] kernel BUG at
 /build/linux-rrsxby/linux-3.2.51/fs/buffer.c:2917!
  [48080.516010] invalid opcode:  [#1] SMP
  [48080.516176] CPU 0
  [48080.516188] Modules linked in: snd_usb_audio snd_usbmidi_lib
 snd_seq_midi snd_seq_midi_event snd_rawmidi nls_utf8 nls_cp437 vfat fat nbd
 cbc ecb vmnet(O) vsock(O) vmci(O) vmmon(O) parport_pc ppdev lp parport
 cpufreq_conservative bnep cpufreq_userspace cpufreq_stats cpufreq_powersave
 rfcomm 8021q garp stp binfmt_misc uinput nfsd nfs nfs_acl auth_rpcgss
 fscache lockd sunrpc loop fuse ecryptfs dm_crypt dm_mod snd_hda_codec_hdmi
 snd_hda_codec_conexant pl2303 usbserial arc4 iwlwifi joydev btusb mac80211
 bluetooth snd_hda_intel snd_hda_codec snd_hwdep snd_pcm i915 drm_kms_helper
 snd_page_alloc drm iTCO_wdt iTCO_vendor_support snd_seq cfg80211
 snd_seq_device snd_timer snd evdev soundcore i2c_i801 dell_laptop
 i2c_algo_bit i2c_core rfkill coretemp acpi_cpufreq mperf video pcspkr
 dcdbas psmouse dell_wmi ac serio_raw sparse_keymap processor button battery
 power_supply wmi ext4 crc16 jbd2 mbcache usbhid hid ums_realtek usb_storage
 sg sr_mod sd_mod cdrom crc_t10dif xhci_hcd crc32c_intel ghash_clmulni_intel
 aesni_intel ahci libahci aes_x86_64 thermal thermal_sys libata atl1c
 scsi_mod ehci_hcd aes_generic cryptd usbcore usb_common [last unloaded:
 scsi_wait_scan]
  [48080.520191]
  [48080.520931] Pid: 7672, comm: make Tainted: G   O
 3.2.0-4-amd64 #1 Debian 3.2.51-1 Dell Inc.  Dell System Inspiron
 N411Z/
  [48080.521803] RIP: 0010:[8111ccc3]  [8111ccc3]
 submit_bh+0x19/0xff
  [48080.522674] RSP: 0018:88017a5e5a68  EFLAGS: 00010246
  [48080.523557] RAX: 00040005 RBX: 8800c947af68 RCX:
 0004
  [48080.524480] RDX:  RSI: 8800c947af68 RDI:
 0211
  [48080.525417] RBP: 0211 R08: 0200 R09:
 8168f0a0
  [48080.526246] R10: 880107a798c0 R11: 880107a798c0 R12:
 8800c919e400
  [48080.527186] R13: 0001 R14: 0001f381 R15:
 03c94245
  [48080.528204] FS:  7fea81a02700() GS:88019fa0()
 knlGS:
  [48080.529252] CS:  0010 DS:  ES:  CR0: 80050033
  [48080.530326] CR2: 019c8000 CR3: 0001613a1000 CR4:
 000406f0
  [48080.531435] DR0:  DR1:  DR2:
 
  [48080.532557] DR3:  DR6: 0ff0 DR7:
 0400
  [48080.533691] Process make (pid: 7672, threadinfo 88017a5e4000,
 task 8800d066f650)
  [48080.534863] Stack:
  [48080.536040]  8800c947af68 0211 8800c919e400
 8111f577
  [48080.537291]  8800d066f650 811be4ff 8800c947af68
 8800c947af68
  [48080.538558]  88015c1cac00 a01c4fcd a01e5e1d
 88000dd8e840
  [48080.539848] Call Trace:
  [48080.541140]  [8111f577] ? __sync_dirty_buffer+0x52/0x87
  [48080.542474]  [811be4ff] ? __percpu_counter_sum+0x44/0x57
  [48080.543861]  [a01c4fcd] ? ext4_commit_super+0x191/0x1d3
 [ext4]
  [48080.545251]  [a01c636e] ? ext4_error_inode+0x4c/0xef [ext4]
  [48080.546654]  [a01b4275] ? ext4_find_entry+0x1eb/0x298 [ext4]
  [48080.548096]  [a01b4350] ? ext4_lookup+0x2e/0x11c [ext4]
  [48080.549522]  [8110b1d3] ? __d_alloc+0x12c/0x13c
  [48080.550964]  [81102709] ? d_alloc_and_lookup+0x3a/0x60
  [48080.552429]  [811031ad] ? walk_component+0x219/0x406
  [48080.553934]  [810bdce1] ? add_page_to_lru_list+0x64/0x64
  [48080.555443]  [81104041] ? path_lookupat+0x7c/0x2bd
  [48080.556949]  [81036628] ? should_resched+0x5/0x23
  [48080.558485]  [8134deec] ? _cond_resched+0x7/0x1c
  [48080.560030]  [8110429e] ? do_path_lookup+0x1c/0x87
  [48080.561541]  [81105d27] ? user_path_at_empty+0x47/0x7b
  [48080.563129]  [81352198] ? do_page_fault+0x30a/0x345
  [48080.564737]  [810fdd7a] ? vfs_fstatat+0x32/0x60
  [48080.566340]  [810fdeb0] ? sys_newstat+0x12/0x2b
  [48080.567920]  [810fa75e] ? vfs_write+0xbb/0xe9
  [48080.569477]  [8134f7b5] ? page_fault+0x25/0x30
  [48080.571036]  [81354212] ? system_call_fastpath+0x16/0x1b
  [48080.572564] Code: ff b8 01 00 00 00 eb 02 31
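
For anyone triaging similar reports, the two nbd messages at the top of the
log above ("Attempted send on closed socket" and the matching end_request
I/O error) can be picked out of a saved kernel log mechanically. What follows
is only a rough sketch in Python and is not taken from this thread: the
function name summarize_nbd_errors and both regular expressions are
assumptions for illustration, based solely on the two message formats quoted
above. Other kernel versions may word these messages differently.

```python
import re

# Assumed format, copied from the log quoted in this thread:
#   [48080.515468] block nbd1: Attempted send on closed socket
CLOSED_SOCKET = re.compile(
    r"\[\s*\d+\.\d+\]\s+block (?P<dev>nbd\d+): Attempted send on closed socket"
)

# Assumed format, copied from the log quoted in this thread:
#   [48080.515473] end_request: I/O error, dev nbd1, sector 91896
IO_ERROR = re.compile(
    r"\[\s*\d+\.\d+\]\s+end_request: I/O error, dev (?P<dev>nbd\d+), "
    r"sector (?P<sector>\d+)"
)


def summarize_nbd_errors(log_text):
    """Count closed-socket messages and collect failed sectors per nbd device.

    Hypothetical helper: returns a dict such as
    {"nbd1": {"closed_socket": 2, "sectors": [91896]}}.
    """
    summary = {}
    for line in log_text.splitlines():
        match = CLOSED_SOCKET.search(line)
        if match:
            entry = summary.setdefault(
                match.group("dev"), {"closed_socket": 0, "sectors": set()}
            )
            entry["closed_socket"] += 1
            continue
        match = IO_ERROR.search(line)
        if match:
            entry = summary.setdefault(
                match.group("dev"), {"closed_socket": 0, "sectors": set()}
            )
            entry["sectors"].add(int(match.group("sector")))
    # Convert the sector sets to sorted lists so the result is easy to print.
    return {
        dev: {"closed_socket": e["closed_socket"], "sectors": sorted(e["sectors"])}
        for dev, e in summary.items()
    }


# The first four lines of the log quoted above.
SAMPLE = """\
  [48080.515468] block nbd1: Attempted send on closed socket
  [48080.515473] end_request: I/O error, dev nbd1, sector 91896
  [48080.515718] block nbd1: Attempted send on closed socket
  [48080.515721] end_request: I/O error, dev nbd1, sector 91896
"""

print(summarize_nbd_errors(SAMPLE))
```

This only reports which nbd devices logged sends on a closed socket and which
sectors failed. It says nothing about whether the cause lies in the kernel or
in nbd-client, which is the open question in this thread.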