Dear Lustre wizards,

We are experiencing problems on our MDS while our Lustre expert is abroad
(he has just attended the LUG meeting).

One of the symptoms we observe is a reproducible kernel oops when
reading some of the stats files beneath /proc/fs/lustre/mgs/MGS/exports:

    mds:~# cat /proc/fs/lustre/mgs/MGS/exports/10.12...@tcp/stats
    Killed
    mds:~#  mds kernel: Oops: 0000 [38] SMP
    Apr 23 13:23:19 mds kernel: Unable to handle kernel paging request
    at ffffffff00040024 RIP:
    Apr 23 13:23:19 mds kernel: [<ffffffff883d6680>]
    :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
    Apr 23 13:23:19 mds kernel: PGD 203067 PUD 0
    Apr 23 13:23:19 mds kernel: Oops: 0000 [38] SMP
    Apr 23 13:23:20 mds kernel: CPU 7
    Apr 23 13:23:20 mds kernel: Modules linked in: mds fsfilt_ldiskfs(F)
    mgs mgc ldiskfs crc16 lustre lov mdc lquota osc ksocklnd ptlrpc
    obdclass lnet lvfs libcfs xt_tcpudp iptable_filter ip_tables
    x_tables drbd cn button ac battery bonding xfs ipmi_si ipmi_devintf
    ipmi_msghandler serio_raw psmouse joydev pcspkr i2c_i801 i2c_core
    shpchp pci_hotplug evdev parport_pc parport ext3 jbd mbcache
    dm_mirror dm_snapshot dm_mod raid10 raid456 xor raid1 raid0
    multipath linear md_mod sd_mod ide_cd cdrom ata_generic libata
    generic usbhid hid piix 3w_9xxx floppy ide_core ehci_hcd uhci_hcd
    e1000 scsi_mod thermal processor fan
    Apr 23 13:23:20 mds kernel: Pid: 7293, comm: cat Tainted: GF     
    2.6.22+lustre1.6.7.2+0.credativ.etch.1 #2
    Apr 23 13:23:20 mds kernel: RIP: 0010:[<ffffffff883d6680>] 
    [<ffffffff883d6680>] :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
    Apr 23 13:23:20 mds kernel: RSP: 0018:ffff8103ba5f9e48  EFLAGS: 00010282
    Apr 23 13:23:20 mds kernel: RAX: ffffffff00040004 RBX:
    7fffffffffffffff RCX: 0000000000000006
    Apr 23 13:23:20 mds kernel: RDX: 0101010101010101 RSI:
    0000000000000000 RDI: 0000000000000000
    Apr 23 13:23:20 mds kernel: RBP: 0000000000000000 R08:
    0000000000000008 R09: 0000000000000000
    Apr 23 13:23:20 mds kernel: R10: 0000000000000000 R11:
    0000000000000000 R12: 0000000000000000
    Apr 23 13:23:20 mds kernel: R13: 0000000000000000 R14:
    0000000000000000 R15: ffff8108000a1760
    Apr 23 13:23:20 mds kernel: FS:  00002b4a366786d0(0000)
    GS:ffff81081004b840(0000) knlGS:0000000000000000
    Apr 23 13:23:20 mds kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
    000000008005003b
    Apr 23 13:23:20 mds kernel: CR2: ffffffff00040024 CR3:
    000000078f018000 CR4: 00000000000006e0
    Apr 23 13:23:20 mds kernel: Process cat (pid: 7293, threadinfo
    ffff8103ba5f8000, task ffff8107dc299530)
    Apr 23 13:23:20 mds kernel: Stack:  0000000000000202
    ffffffff00000000 ffffffff00040004 ffff81067dae2640
    Apr 23 13:23:20 mds kernel: 000000004bd18327 00000000000ca54d
    0000000000000000 ffff81067dae2640
    Apr 23 13:23:20 mds kernel: ffffffff00040004 0000000000040004
    0000000000000400 0000000000000000
    Apr 23 13:23:20 mds kernel: Call Trace:
    Apr 23 13:23:20 mds kernel: [<ffffffff8029c0ac>] seq_read+0x105/0x28d
    Apr 23 13:23:20 mds kernel: [<ffffffff80283f23>] vfs_read+0xcb/0x153
    Apr 23 13:23:20 mds kernel: [<ffffffff802842bf>] sys_read+0x45/0x6e
    Apr 23 13:23:20 mds kernel: [<ffffffff80209d8e>] system_call+0x7e/0x83
    Apr 23 13:23:20 mds kernel:
    Apr 23 13:23:20 mds kernel:
    Apr 23 13:23:20 mds kernel: Code: 48 8b 50 20 48 8b 48 28 4c 03 60
    10 4c 03 68 18 48 39 d3 48
    Apr 23 13:23:20 mds kernel: RIP  [<ffffffff883d6680>]
    :obdclass:lprocfs_stats_seq_show+0x80/0x1e0
     mds kernel: CR2: ffffffff00040024
    Apr 23 13:23:20 mds kernel: RSP <ffff8103ba5f9e48>
    Apr 23 13:23:20 mds kernel: CR2: ffffffff00040024
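
As a sanity check on the trace above (a sketch, not a diagnosis), the
faulting address can be reproduced from the register dump. This assumes
that the "Code:" bytes start at the faulting RIP; the helper name
decode_mov_disp8 is made up for this illustration. The first four bytes,
48 8b 50 20, decode as a 64-bit load "mov 0x20(%rax),%rdx", and
RAX + 0x20 equals the address reported in CR2, which suggests that the
pointer held in RAX was already invalid when lprocfs_stats_seq_show
dereferenced it.

```python
# Values copied from the oops above.
RAX = 0xFFFFFFFF00040004
CR2 = 0xFFFFFFFF00040024
# Assumption: the "Code:" line starts at the faulting instruction.
CODE = bytes.fromhex("48 8b 50 20")

REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_disp8(code):
    """Decode `mov disp8(%base),%reg` (64-bit load); return (reg, base, disp).

    Hypothetical helper: it handles only this one encoding
    (REX.W, opcode 0x8b, ModRM mod=01, no SIB byte).
    """
    rex, opcode, modrm, disp = code[0], code[1], code[2], code[3]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W MOV r64, r/m64")
    if modrm >> 6 != 0b01 or (modrm & 7) == 4:
        raise ValueError("only base + 8-bit displacement is handled")
    if disp >= 0x80:  # the 8-bit displacement is signed
        disp -= 0x100
    return REGS[(modrm >> 3) & 7], REGS[modrm & 7], disp


reg, base, disp = decode_mov_disp8(CODE)
fault_address = (RAX + disp) & 0xFFFFFFFFFFFFFFFF

print(f"mov {disp:#x}(%{base}),%{reg}")
print(f"computed fault address: {fault_address:#018x}")
print(f"matches CR2: {fault_address == CR2}")
```

The check only confirms which access faulted; it says nothing about why
the pointer was bad.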


Server and affected client both run Lustre 1.6.7.2 on Debian Etch/x86_64
in this case. The behavior does not change after a client reboot.

Any hints on how to solve this would be greatly appreciated.

Kind regards,
    Christopher

-- 
Christopher Huhn
Linux therapist

GSI Helmholtzzentrum fuer Schwerionenforschung GmbH
Planckstr. 1
64291 Darmstadt
http://www.gsi.de/

Gesellschaft mit beschraenkter Haftung

Sitz der Gesellschaft / Registered Office:                    Darmstadt
Handelsregister       / Commercial Register: 
                                        Amtsgericht Darmstadt, HRB 1528

Geschaeftsfuehrung    / Managing Directors:  
                                 Professor Dr. Dr. h.c. Horst Stoecker,
                                                    Christiane Neumann,
                                                   Dr. Hartmut Eickhoff
Vorsitzende des Aufsichtsrates / Supervisory Board Chair:  
                                           Dr. Beatrix Vierkorn-Rudolph
Stellvertreter        / Deputy Chair:                 Dr. Rolf Bernhard


_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
