Re: [Lustre-discuss] Kernel Panic error after lustre 2.0 installation

2011-02-18 Thread Arya Mazaheri
Wow! Thanks for your suggestion. All I had to do was build the driver from
'arcmsr.c' and 'arcmsr.h' and then rebuild the RAM disk.
Now everything is working smoothly...

Thanks again... ;)

On Fri, Feb 18, 2011 at 1:16 AM, Kevin Van Maren kevin.van.ma...@oracle.com
 wrote:

 Yep.  All you have to do is rebuild the driver for the Lustre kernel.

 First, bring the system back up with the non-Lustre kernel.



 See the bottom of the readme:

   # cd /usr/src/linux/drivers/scsi/arcmsr
   (suppose /usr/src/linux is the soft-link for /usr/src/kernel/2.6.23.1-42.fc8-i386)
   # make -C /lib/modules/`uname -r`/build CONFIG_SCSI_ARCMSR=m SUBDIRS=$PWD modules
   # insmod arcmsr.ko

 Except, instead of `uname -r`, substitute the version string of the Lustre
 kernel (what 'uname -r' would print when running it), since you want to
 build for the Lustre kernel.  Be sure you have the Lustre kernel-devel RPM
 installed.

 Note that the insmod step will not work here: the running kernel already
 has its own arcmsr module loaded, and the module you built for the Lustre
 kernel cannot be loaded into the running (non-Lustre) kernel.  You will
 need to rebuild the initrd for the Lustre kernel (see the other
 instructions in the readme, using the Lustre kernel).
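
The steps above can be sketched as a small script that only prints the commands, so they can be reviewed before being run as root. This is a minimal sketch under stated assumptions, not the readme's exact procedure: the function name print_rebuild_cmds, the variable LUSTRE_KVER, the module install directory, and the CentOS 5 style mkinitrd invocation are all assumptions, and the readme's own initrd instructions take precedence.

```shell
#!/bin/sh
# Sketch only: print the commands for rebuilding arcmsr against the Lustre
# kernel. Nothing is built or installed by this script.
# Assumptions: paths follow the readme; mkinitrd is the RHEL/CentOS 5 tool.

print_rebuild_cmds() {
    kver="$1"                                       # Lustre kernel version string
    src="${2:-/usr/src/linux/drivers/scsi/arcmsr}"  # driver source directory
    moddir="/lib/modules/$kver/kernel/drivers/scsi/arcmsr"

    echo "cd $src"
    # Build against the Lustre kernel's build tree, not the running kernel's.
    echo "make -C /lib/modules/$kver/build CONFIG_SCSI_ARCMSR=m SUBDIRS=\$PWD modules"
    # Install the module where the Lustre kernel will look for it.
    echo "mkdir -p $moddir"
    echo "cp arcmsr.ko $moddir/"
    echo "depmod -a $kver"
    # Rebuild the initrd for the Lustre kernel with the driver included.
    echo "mkinitrd -f --with=arcmsr /boot/initrd-$kver.img $kver"
}

# LUSTRE_KVER is a placeholder; `ls /lib/modules` lists installed kernels.
print_rebuild_cmds "${LUSTRE_KVER:-LUSTRE_KERNEL_VERSION}"
```

Printing instead of executing is deliberate: the build has to happen while the non-Lustre kernel is running, and a wrong version string would silently build for the wrong kernel.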

 Kevin


 Arya Mazaheri wrote:

 The driver name is arcmsr.ko, and I extracted it from the driver.img included
 on the RAID controller's CD. The following text file may explain it better:


 ftp://areca.starline.de/RaidCards/AP_Drivers/Linux/DRIVER/RedHat/FedoraCore/Redhat-Fedora-core8/1.20.0X.15/Intel/readme.txt

 Please tell me if you need more information about this issue...

 On Thu, Feb 17, 2011 at 11:33 PM, Brian J. Murrell br...@whamcloud.com wrote:

On Thu, 2011-02-17 at 23:26 +0330, Arya Mazaheri wrote:
 Hi there,

Hi,

 Unable to access resume device (LABEL=SWAP-sda3)
 mount: could not find filesystem '/dev/root'
 setuproot: moving /dev failed: No such file or directory
 setuproot: error mounting /proc: No such file or directory
 setuproot: error mounting /sys: No such file or directory
 switchroot: mount failed: No such file or directory
 Kernel Panic - not syncing: Attempted to kill init!

 I have no problem with the original kernel installed by CentOS. I
 guess this may be related to the RAID controller card driver, which may
 not be loaded by the patched Lustre kernel.

That seems like a reasonable conclusion given the information
available.

 So I added the driver to the initrd.img file.

Where did you get the driver from?  What is the name of the driver?

 But it didn't solve the problem.

Depending on where it came from, yes, it might not.
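
One way to see whether a prebuilt module can load under a given kernel is to compare its vermagic string with that kernel's version. The sketch below is only an illustration: the helper name vermagic_matches is made up here, and a matching kernel release is necessary but not by itself a guarantee that the module loads, since the kernel also compares the remaining vermagic fields.

```shell
#!/bin/sh
# Sketch: compare a module's vermagic string with a target kernel version.
# The first whitespace-separated field of vermagic is the kernel release
# the module was built for.

vermagic_matches() {
    vermagic="$1"                 # e.g. output of: modinfo -F vermagic arcmsr.ko
    kver="$2"                     # target kernel version string
    built_for="${vermagic%% *}"   # drop everything after the first space
    [ "$built_for" = "$kver" ]
}

# Usage on a real system (module path and KVER are placeholders):
#   vermagic_matches "$(modinfo -F vermagic ./arcmsr.ko)" "$KVER" \
#       && echo "module built for $KVER" \
#       || echo "module built for a different kernel"
```

A driver taken from a vendor image built for another distribution kernel would be expected to fail this check against the Lustre kernel.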

 Should I install Lustre by building it from source?

That may be required, but not necessarily.  We need more
information.

b.



___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
mailto:Lustre-discuss@lists.lustre.org

http://lists.lustre.org/mailman/listinfo/lustre-discuss


 




[Lustre-discuss] Running MGS and OSS on the same machine

2011-02-18 Thread Arya Mazaheri
Hi again,
I plan to use one server as both MGS and OSS. But how can
I format the OSTs as Lustre filesystems?
For example, the line below tells the OST that its mgsnode is at
192.168.0.10@tcp0:
mkfs.lustre --fsname lustre --ost --mgsnode=192.168.0.10@tcp0 /dev/vg00/ost1

But now the mgsnode is the same machine. I tried putting localhost in place of
the IP address, but it didn't work.

What should I do?

Arya


Re: [Lustre-discuss] Running MGS and OSS on the same machine

2011-02-18 Thread Michael Kluge
Hi Arya,

If I remember correctly, Lustre uses 0@lo as the NID for the local host. Does 
using the other NID, 192.168.0.10@tcp0, give any error message?
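
As a rough sketch of the suggestion above, the OST can be formatted with the server's own NID as the mgsnode, exactly as in the original command. The helper ost_format_cmd is a made-up name for illustration, and it only prints the command; the file system name, NID, and device are the ones from the question.

```shell
#!/bin/sh
# Sketch: build the mkfs.lustre command for an OST whose MGS runs on the
# same host. Nothing is formatted; the command is only printed.

ost_format_cmd() {
    fsname="$1"    # Lustre file system name
    mgsnid="$2"    # NID of the MGS, here the server's own NID
    device="$3"    # block device to format as an OST
    echo "mkfs.lustre --fsname $fsname --ost --mgsnode=$mgsnid $device"
}

# On the server, `lctl list_nids` shows the NIDs LNET has configured
# (the lnet module must be loaded and the network up).
ost_format_cmd lustre 192.168.0.10@tcp0 /dev/vg00/ost1
```

Checking `lctl list_nids` first confirms that 192.168.0.10@tcp0 really is a NID of this machine before it is written into the OST's configuration.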


Michael

On 18.02.2011 16:10, Arya Mazaheri wrote:
 Hi again,
 I plan to use one server as both MGS and OSS. But how
 can I format the OSTs as Lustre filesystems?
 For example, the line below tells the OST that its mgsnode is at
 192.168.0.10@tcp0:
 mkfs.lustre --fsname lustre --ost --mgsnode=192.168.0.10@tcp0 /dev/vg00/ost1

 But now the mgsnode is the same machine. I tried putting localhost in place
 of the IP address, but it didn't work.

 What should I do?

 Arya





-- 
Michael Kluge, M.Sc.

Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany

Contact:
Willersbau, Room WIL A 208
Phone:  (+49) 351 463-34217
Fax:    (+49) 351 463-37773
e-mail: michael.kl...@tu-dresden.de
WWW:    http://www.tu-dresden.de/zih


[Lustre-discuss] trying to port 1.8.5 to RH6 I'm facing Kernel panics due to an error in the handling of page->private

2011-02-18 Thread Hebenstreit, Michael
If I understand everything correctly, page->private should contain either 0 or a 
pointer to kernel space. For reasons I cannot currently comprehend, the value is 
sometimes set to 2. llap_cast_private() then tries to access 
page->private->llap_magic, and naturally this leads to a NULL pointer dereference.
 
This problem seems to crop up only during parallel reads/writes.
 
Any ideas?
Is anyone doing a RH6 port of the 1.8 branch (and no, 2.1 won't help me; my 
backends are 1.7 and cannot be changed at this time)?
 
thanks
Michael
 

Michael Hebenstreit Senior Cluster Architect
Intel Corporation   Software and Services Group/HTE
2800 N Center Dr, DP3-307   Tel.:   +1 253 371 3144
WA 98327, DuPont   
UNITED STATES   E-mail: michael.hebenstr...@intel.com


[Lustre-discuss] lock callback timer expired

2011-02-18 Thread Joe Digilio
I've gotten hit with four lock callback timer expired events in the
past week.  The logs from one event are attached.

I recently upgraded from 1.8.3 to 1.8.5.  On 1.8.3 the error had only
occurred once.  Now, four times in a week.  Looking through bugzilla,
I thought it might be 23190 and 23963.  But then I took note of the
following line.  Does this mean it's the backend storage?
INFO: task jbd2/sdj-8:10577 blocked for more than 120 seconds.
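
A task named jbd2/sdj-8 is the journal thread for a file system on block device sdj, so a line like the one above suggests that I/O to that device stalled, which would be consistent with a backend storage problem but does not prove it. A minimal sketch for counting such reports in a log, so they can be lined up against the lock callback timeouts, might look like this (the function name hung_tasks and the log path are assumptions):

```shell
#!/bin/sh
# Sketch: list the tasks the kernel reported as hung, with counts.
# Input is a syslog/dmesg-style file supplied by the caller.

hung_tasks() {
    # Extract the task name from lines such as:
    #   INFO: task jbd2/sdj-8:10577 blocked for more than 120 seconds.
    sed -n 's/.*INFO: task \([^ ]*\) blocked for more than [0-9]* seconds.*/\1/p' "$1" \
        | sort | uniq -c | sort -rn
}

# Usage (hypothetical path): hung_tasks /var/log/messages
```

If the same device shows up repeatedly just before each eviction, the storage behind it is a reasonable first place to look.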

Any ideas?

Thanks!
-Joe


client.log
Description: Binary data


oss1.log
Description: Binary data


[Lustre-discuss] Running MGS and OSS on the same machine

2011-02-18 Thread Charland, Denis
Arya,

I have the MGS, the MDT and the OST all on the same machine and everything 
works fine. It should not be a problem to have the MGS and the OST on the same 
machine.

Are your MGS and MDT mounted when you execute mkfs.lustre for the OST?
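
To answer that question quickly, the mounted Lustre targets can be listed from /proc/mounts. This is only a sketch: the function name lustre_mounts is made up, and it simply filters on the file system type, so Lustre client mounts on the same machine would be listed too.

```shell
#!/bin/sh
# Sketch: list mounted Lustre targets from a /proc/mounts-style file.
# Fields are: device, mount point, file system type, options, ...

lustre_mounts() {
    # Print device and mount point for every entry of type "lustre".
    awk '$3 == "lustre" { print $1, $2 }' "${1:-/proc/mounts}"
}

# Usage: lustre_mounts            # reads /proc/mounts by default
```

An empty result means no MGS or MDT is currently mounted on this machine.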

Denis

Denis Charland, ing. | P. Eng.
Administrateur de Systèmes UNIX | UNIX Systems Administrator
Tél. | tel. (450) 641-5078 Fax (450) 641-5106
Courriel | E-mail : denis.charl...@cnrc-nrc.gc.ca

Institut des matériaux industriels | Industrial Materials Institute
Conseil national de recherches Canada | National Research Council Canada
75, de Mortagne, Boucherville, Québec, Canada, J4B 6Y4
Gouvernement du Canada | Government of Canada

