Mark, posting the rest of the backtrace from the crash would help developers looking at it.
It's a lot easier to tell what code is involved with the whole trace.

Thanks,
- Patrick
________________________________________
From: lustre-discuss [lustre-discuss-boun...@lists.lustre.org] on behalf of Mark Hahn [h...@mcmaster.ca]
Sent: Tuesday, April 12, 2016 3:49 PM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)

One of our MDSs is crashing with the following:

BUG: unable to handle kernel paging request at 00000000deadbeef
IP: [<ffffffffa0ce0328>] iam_container_init+0x18/0x70 [osd_ldiskfs]
PGD 0
Oops: 0002 [#1] SMP

The MDS is running 2.5.3-RC1--PRISTINE-2.6.32-431.23.3.el6_lustre.x86_64 with about 2k clients ranging from 1.8.8 to 2.6.0.

I'd appreciate any comments on where to point fingers: Google doesn't turn up anything suggestive about iam_container_init.

Our problem seems to correlate with the unintentional creation of a tree of >500M files. Some of the crashes we've had since then appeared to be related to vm.zone_reclaim_mode=1. We also enabled quotas right after the 500M-file incident, and were thinking that inconsistent quota records might cause this sort of crash. But 0xdeadbeef is usually added as a canary for allocation issues; is it used this way in Lustre?

thanks, Mark Hahn
Mark Hahn | SHARCnet Sysadmin | h...@sharcnet.ca | http://www.sharcnet.ca
          | McMaster RHPCS    | h...@mcmaster.ca | 905 525 9140 x24687
          | Compute/Calcul Canada | http://www.computecanada.ca
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org