Re: [Lustre-discuss] Kernel Panic error after lustre 2.0 installation
Wow! Thanks for your suggestion. The only thing needed was to build 'arcmsr.c' and 'arcmsr.h' against the Lustre kernel and then rebuild the ram disk. Now everything is working smoothly... Thanks again... ;)

On Fri, Feb 18, 2011 at 1:16 AM, Kevin Van Maren <kevin.van.ma...@oracle.com> wrote:

> Yep. All you have to do is rebuild the driver for the Lustre kernel.
> First, bring the system back up with the non-Lustre kernel. See the
> bottom of the readme:
>
>   # cd /usr/src/linux/drivers/scsi/arcmsr
>     (assuming /usr/src/linux is the soft link for
>     /usr/src/kernel/2.6.23.1-42.fc8-i386)
>   # make -C /lib/modules/`uname -r`/build CONFIG_SCSI_ARCMSR=m SUBDIRS=$PWD modules
>   # insmod arcmsr.ko
>
> Except instead of `uname -r`, substitute the Lustre kernel's 'uname -r',
> as you want to build for the Lustre kernel. Be sure you have the Lustre
> kernel-devel RPM installed.
>
> Note that the insmod will not work (you already have the module for the
> running kernel, and the one you built for the Lustre kernel will not
> load into it). You will need to rebuild the initrd for the Lustre
> kernel (see the other instructions in the readme, using the Lustre
> kernel).
>
> Kevin
>
> Arya Mazaheri wrote:
>> The driver name is arcmsr.ko and I extracted it from driver.img
>> included on the RAID controller's CD. The following text file may
>> clarify things:
>> ftp://areca.starline.de/RaidCards/AP_Drivers/Linux/DRIVER/RedHat/FedoraCore/Redhat-Fedora-core8/1.20.0X.15/Intel/readme.txt
>> Please tell me if you need more information about this issue...
>>
>> On Thu, Feb 17, 2011 at 11:33 PM, Brian J. Murrell <br...@whamcloud.com> wrote:
>>> On Thu, 2011-02-17 at 23:26 +0330, Arya Mazaheri wrote:
>>>> Hi there,
>>> Hi,
>>>> Unable to access resume device (LABEL=SWAP-sda3)
>>>> mount: could not find filesystem 'dev/root'
>>>> setuproot: moving /dev failed: No such file or directory
>>>> setuproot: error mounting /proc: No such file or directory
>>>> setuproot: error mounting /sys: No such file or directory
>>>> switchroot: mount failed: No such file or directory
>>>> Kernel panic - not syncing: Attempted to kill init!
>>>> I have no problem with the original kernel installed by CentOS. I
>>>> guessed this may be related to the RAID controller card driver,
>>>> which may not be loaded by the patched Lustre kernel.
>>> That seems like a reasonable conclusion given the information available.
>>>> so I have added the driver into the initrd.img file.
>>> Where did you get the driver from? What is the name of the driver?
>>>> But it didn't solve the problem.
>>> Depending on where it came from, yes, it might not.
>>>> Should I install Lustre by building the source?
>>> That may be required, but not necessarily. We need more information.
>>> b.
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
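Kevin's steps can be sketched as a short script. This is a hedged outline, not a definitive procedure: the Lustre kernel version string and the initrd path are illustrative assumptions, and exact paths depend on the distribution and the Areca readme.

```shell
# Build the Areca arcmsr driver against the (installed but not booted)
# Lustre kernel, then rebuild that kernel's initrd so the RAID driver
# is available at boot. LUSTRE_KVER below is an assumed example --
# substitute the Lustre kernel's actual 'uname -r' string.
LUSTRE_KVER=2.6.18-194.17.1.el5_lustre.1.8.5

cd /usr/src/linux/drivers/scsi/arcmsr

# Build against the Lustre kernel's build tree instead of `uname -r`
make -C /lib/modules/${LUSTRE_KVER}/build CONFIG_SCSI_ARCMSR=m SUBDIRS=$PWD modules

# Install the module where depmod/mkinitrd can find it
install -m 644 arcmsr.ko /lib/modules/${LUSTRE_KVER}/kernel/drivers/scsi/
depmod -a ${LUSTRE_KVER}

# Rebuild the initrd for the Lustre kernel, forcing the driver in
mkinitrd -f --with=arcmsr /boot/initrd-${LUSTRE_KVER}.img ${LUSTRE_KVER}
```

After regenerating the initrd, reboot into the Lustre kernel; do not insmod the freshly built module into the currently running (non-Lustre) kernel.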
[Lustre-discuss] Running MGS and OSS on the same machine
Hi again,

I plan to use one server as MGS and OSS simultaneously. But how can I format the OSTs as a Lustre FS? For example, the line below tells the OST that its mgsnode is at 192.168.0.10@tcp0:

mkfs.lustre --fsname lustre --ost --mgsnode=192.168.0.10@tcp0 /dev/vg00/ost1

But now the mgsnode is the same machine. I tried to put localhost instead of the IP address, but it didn't work. What should I do?

Arya
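For a combined MGS/OSS node, the usual approach is still to pass the machine's own LNET NID ('localhost' is not a valid NID). A minimal sketch, where the device paths and the NID are assumptions for illustration:

```shell
# Format the MGS target first (a separate small device), then the OST.
# Device paths and the NID below are examples only.
mkfs.lustre --mgs /dev/vg00/mgs

# --mgsnode can simply name the server's own NID, even though the MGS
# runs on the same machine:
mkfs.lustre --fsname lustre --ost --mgsnode=192.168.0.10@tcp0 /dev/vg00/ost1
```

Alternatively, the MDT and MGS can be combined on one target with `--mgs --mdt`; either way, the OST's `--mgsnode` should be a real NID, not a hostname alias like localhost.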
Re: [Lustre-discuss] Running MGS and OSS on the same machine
Hi Arya,

if I remember well, Lustre uses 0@lo for the localhost address. Does using the other NID 192.168.0.10@tcp0 give any error message?

Michael

On 18.02.2011 16:10, Arya Mazaheri wrote:
> Hi again,
> I plan to use one server as MGS and OSS simultaneously. But how can I
> format the OSTs as a Lustre FS? For example, the line below tells the
> OST that its mgsnode is at 192.168.0.10@tcp0:
> mkfs.lustre --fsname lustre --ost --mgsnode=192.168.0.10@tcp0 /dev/vg00/ost1
> But now the mgsnode is the same machine. I tried to put localhost
> instead of the IP address, but it didn't work. What should I do?
> Arya

--
Michael Kluge, M.Sc.
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
D-01062 Dresden, Germany
Contact: Willersbau, Room WIL A 208
Phone: (+49) 351 463-34217
Fax: (+49) 351 463-37773
e-mail: michael.kl...@tu-dresden.de
WWW: http://www.tu-dresden.de/zih
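To see which NIDs the node actually advertises (and therefore what `--mgsnode` should be), `lctl` can be queried directly. A small sketch; the output shown is illustrative, not guaranteed:

```shell
# List the LNET NIDs configured on this node; the MGS must be reachable
# via one of these. LNET addresses use the <address>@<network> form,
# so 'localhost' is not accepted.
lctl list_nids

# Verify that a given NID answers (example NID from the thread):
lctl ping 192.168.0.10@tcp0
```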
[Lustre-discuss] trying to port 1.8.5 to RH6, I'm facing kernel panics due to an error in the handling of page->private
If I understand everything correctly, page->private should contain either 0 or a pointer to kernel space. For reasons I cannot currently comprehend, sometimes the value is set to 2. llap_cast_private() then tries to access page->private->llap_magic, and naturally this leads to a NULL pointer dereference. This problem seems to creep up only during parallel reads/writes. Any ideas? Is anyone doing a RH6 port of the 1.8 branch? (And no, 2.1 won't help me; my backends are 1.7 and cannot be changed at this time.)

thanks
Michael

Michael Hebenstreit
Senior Cluster Architect
Intel Corporation, Software and Services Group/HTE
2800 N Center Dr, DP3-307, DuPont, WA 98327, UNITED STATES
Tel.: +1 253 371 3144
E-mail: michael.hebenstr...@intel.com
[Lustre-discuss] lock callback timer expired
I've gotten hit with four "lock callback timer expired" events in the past week. The logs from one event are attached. I recently upgraded from 1.8.3 to 1.8.5. On 1.8.3 the error had occurred only once; now, four times in a week. Looking through bugzilla, I thought it might be 23190 or 23963. But then I took note of the following line. Does this mean it's the backend storage?

INFO: task jbd2/sdj-8:10577 blocked for more than 120 seconds.

Any ideas? Thanks!
-Joe

client.log
Description: Binary data

oss1.log
Description: Binary data
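That jbd2 message is the kernel's hung-task watchdog firing on the ext4/ldiskfs journal thread for /dev/sdj, which does point toward slow or stalled backend storage rather than Lustre itself. A few hedged checks one might run on the OSS (the device name comes from the log line; tool availability depends on the installation):

```shell
# Per-device latency and queue depth for the suspect disk (watch for
# large await/svctm values or a saturated queue):
iostat -x 5 sdj

# Look for SCSI/RAID errors around the time of the stall:
dmesg | grep -i -E 'sdj|scsi|error'

# The watchdog threshold itself (120 s by default) is tunable:
cat /proc/sys/kernel/hung_task_timeout_secs
```

If the disk shows multi-second service times or SCSI resets during the events, the lock callback timeouts are likely a symptom of the storage stalling, not a Lustre locking bug.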
[Lustre-discuss] Running MGS and OSS on the same machine
Arya,

I have the MGS, the MDT and the OST all on the same machine and everything works fine. It should not be a problem to have the MGS and the OST on the same machine. Are your MGS and MDT mounted when you execute mkfs.lustre for the OST?

Denis

Denis Charland, ing. | P. Eng.
Administrateur de Systèmes UNIX | UNIX Systems Administrator
Tél. | Tel.: (450) 641-5078  Fax: (450) 641-5106
Courriel | E-mail: denis.charl...@cnrc-nrc.gc.ca
Institut des matériaux industriels | Industrial Materials Institute
Conseil national de recherches Canada | National Research Council Canada
75, de Mortagne, Boucherville, Québec, Canada, J4B 6Y4
Gouvernement du Canada | Government of Canada
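The mount order Denis is asking about matters because targets register with the MGS when first mounted. A sketch of the bring-up order on a combined machine; the device paths and mount points are assumptions for illustration:

```shell
# On a combined MGS/MDT/OSS machine, mount targets in dependency order:
mount -t lustre /dev/vg00/mgs  /mnt/mgs    # MGS first
mount -t lustre /dev/vg00/mdt  /mnt/mdt    # then the MDT
mount -t lustre /dev/vg00/ost1 /mnt/ost1   # OSTs last; they register with the MGS
```

If the MGS is not mounted when an OST is first mounted (or formatted with a wrong `--mgsnode`), the OST cannot register and the mount fails.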