Folks,

It has been unsuccessful so far.
I made a fresh CentOS 5.2 minimal install (2.6.18-92.el5). Later, I updated the kernel to version 2.6.18-92.1.17. Here is the output from uname and the rpm query:

[r...@localhost ~]# rpm -qa | grep lustre
lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
[r...@localhost ~]# uname -a
Linux localhost.localdomain 2.6.18-92.1.17.el5 #1 SMP Tue Nov 4 13:45:01 EST 2008 i686 i686 i386 GNU/Linux

Other details:
--- --- ---
[r...@localhost ~]# ls -l /lib/modules | grep 2.6
drwxr-xr-x 6 root root 4096 Jun 17 18:47 2.6.18-92.1.17.el5
drwxr-xr-x 6 root root 4096 Jun 17 17:38 2.6.18-92.el5
[r...@localhost modules]# find . | grep lustre
./2.6.18-92.1.17.el5/kernel/net/lustre
./2.6.18-92.1.17.el5/kernel/net/lustre/libcfs.ko
./2.6.18-92.1.17.el5/kernel/net/lustre/lnet.ko
./2.6.18-92.1.17.el5/kernel/net/lustre/ksocklnd.ko
./2.6.18-92.1.17.el5/kernel/net/lustre/ko2iblnd.ko
./2.6.18-92.1.17.el5/kernel/net/lustre/lnet_selftest.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre
./2.6.18-92.1.17.el5/kernel/fs/lustre/osc.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/ptlrpc.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/obdecho.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/lvfs.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/mgc.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/llite_lloop.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/lov.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/mdc.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/lquota.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/lustre.ko
./2.6.18-92.1.17.el5/kernel/fs/lustre/obdclass.ko
--- --- ---

I am still having the same problem. I seriously wonder whether I am missing something. I also tried a source install of the 'patchless client', but it gave the same results. Are there any configuration steps needed after the rpm (or source) installation? The one I know of is restricting interfaces in modprobe.conf; I have tried with and without it, with no success.

Could anyone please suggest any debugging steps or tests for this?
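One check along these lines, as a minimal sketch: it assumes the module layout shown in the find output above (lustre.ko under /lib/modules/<release>/kernel/fs/lustre) and tests whether Lustre modules exist for a given kernel release. The helper name has_lustre_modules and its arguments are hypothetical, made up for illustration.

```shell
#!/bin/sh
# Sketch only: has_lustre_modules is a hypothetical helper, not part of Lustre.
# Usage: has_lustre_modules <kernel-release> [modules-root]
# Succeeds if lustre.ko exists under <modules-root>/<kernel-release>.
has_lustre_modules() {
    release=$1
    root=${2:-/lib/modules}
    [ -f "$root/$release/kernel/fs/lustre/lustre.ko" ]
}

# Compare the running kernel against the installed module trees.
running=$(uname -r)
if has_lustre_modules "$running"; then
    echo "lustre modules found for running kernel $running"
else
    echo "no lustre modules under /lib/modules/$running"
fi
```

If this reports no modules for the running kernel, modprobe has nothing to load for that kernel, which would match the 'Module lustre not found' error seen earlier in the thread.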
What further output can I provide that would help you diagnose this? Any insights?

Also, I have a suggestion. It might be a good idea for the RPM installation to check 'uname -r' against the kernel version the package was built for and, if they do not match, to suggest a source install.

Thanks for the help. I really appreciate your patience.

-
Thanks,
CS.

On Wed, Jun 17, 2009 at 10:40 AM, Jerome, Ron<ron.jer...@nrc-cnrc.gc.ca> wrote:
> I think the problem you have, as Cliff alluded to, is a mismatch between
> your kernel version and the Lustre kernel version modules.
>
> You have kernel “2.6.18-92.el5” and are installing Lustre
> “2.6.18_92.1.17.el5”. Note the “.1.17” is significant as the modules will
> end up in the wrong directory. There is an update to CentOS to bring the
> kernel to the matching 2.6.18_92.1.17.el5 version; you can pull it off the
> CentOS mirror site in the updates directory.
>
> Ron.
>
> From: lustre-discuss-boun...@lists.lustre.org
> [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf Of Carlos Santana
> Sent: June 17, 2009 11:21 AM
> To: lustre-discuss@lists.lustre.org
> Subject: Re: [Lustre-discuss] Lustre installation and configuration problems
>
> And is there any specific installation order for patchless client? Could
> someone please share it with me?
>
> -
> CS.
>
> On Wed, Jun 17, 2009 at 10:18 AM, Carlos Santana <neu...@gmail.com> wrote:
>
> Huh... :( Sorry to bug you guys again...
>
> I am planning to make a fresh start now as nothing seems to have worked for
> me. If you have any comments/feedback please share them.
>
> I would like to confirm installation order before I make a fresh start. From
> Arden's experience:
> http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html , the
> lustre-module is installed last. As I was installing Lustre 1.8, I was
> referring to the 1.8 operations manual
> http://manual.lustre.org/index.php?title=Main_Page . The installation order
> in the manual is different than what Arden has suggested.
>
> Will it make a difference in configuration at later stage? Which one should
> I follow now?
> Any comments?
>
> Thanks,
> CS.
>
> On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana <neu...@gmail.com> wrote:
>
> Thanks Cliff.
>
> The depmod -a was successful before as well. I am using CentOS 5.2
> box. Following are the packages installed:
> [r...@localhost tmp]# rpm -qa | grep -i lustre
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>
> [r...@localhost tmp]# uname -a
> Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47
> EDT 2008 i686 i686 i386 GNU/Linux
>
> And here is the output from strace for mount:
> http://www.heypasteit.com/clip/8WT
>
> Any further debugging hints?
>
> Thanks,
> CS.
>
> On 6/16/09, Cliff White <cliff.wh...@sun.com> wrote:
>> Carlos Santana wrote:
>>> The '$ modprobe -l lustre*' did not show any module on a patchless
>>> client. modprobe -v returns 'FATAL: Module lustre not found'.
>>>
>>> How do I install a patchless client?
>>> I have tried lustre-client-modules and lustre-client-ver rpm packages in
>>> both sequences. Am I missing anything?
>>>
>>
>> Make sure the lustre-client-modules package matches your running kernel.
>> Run depmod -a to be sure
>> cliffw
>>
>>> Thanks,
>>> CS.
>>>
>>> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White
>>> <cliff.wh...@sun.com> wrote:
>>>
>>> Carlos Santana wrote:
>>>
>>> The lctl ping and 'net up' failed with the following messages:
>>> --- ---
>>> [r...@localhost ~]# lctl ping 10.0.0.42
>>> opening /dev/lnet failed: No such device
>>> hint: the kernel modules may not be loaded
>>> failed to ping 10.0.0...@tcp: No such device
>>>
>>> [r...@localhost ~]# lctl network up
>>> opening /dev/lnet failed: No such device
>>> hint: the kernel modules may not be loaded
>>> LNET configure error 19: No such device
>>>
>>>
>>> Make sure modules are unloaded, then try modprobe -v.
>>> Looks like you have lnet mis-configured, if your module options are
>>> wrong, you will see an error during the modprobe.
>>> cliffw
>>>
>>> --- ---
>>>
>>> I tried lustre_rmmod and depmod commands and it did not return
>>> any error messages. Any further clues? Reinstall patchless
>>> client again?
>>>
>>> -
>>> CS.
>>>
>>> On Tue, Jun 16, 2009 at 1:32 PM, Cliff White
>>> <cliff.wh...@sun.com> wrote:
>>>
>>> Carlos Santana wrote:
>>>
>>> I was able to run lustre_rmmod and depmod successfully. The
>>> '$lctl list_nids' returned the server ip address and interface
>>> (tcp0).
>>>
>>> I tried to mount the file system on a remote client, but it
>>> failed with the following message.
>>> --- ---
>>> [r...@localhost ~]# mount -t lustre 10.0.0...@tcp0:/lustre /mnt/lustre
>>> mount.lustre: mount 10.0.0...@tcp0:/lustre at /mnt/lustre
>>> failed: No such device
>>> Are the lustre modules loaded?
>>> Check /etc/modprobe.conf and /proc/filesystems
>>> Note 'alias lustre llite' should be removed from
>>> modprobe.conf
>>> --- ---
>>>
>>> However, the mounting is successful on a single node
>>> configuration - with client on the same machine as MDS
>>> and OST.
>>> Any clues? Where to look for logs and debug messages?
>>>
>>>
>>> Syslog || /var/log/messages is the normal place.
>>>
>>> You can use 'lctl ping' to verify that the client can reach
>>> the server.
>>> Usually in these cases, it's a network/name misconfiguration.
>>>
>>> Run 'tunefs.lustre --print' on your servers, and verify that
>>> mgsnode=
>>> is correct.
>>>
>>> cliffw
>>>
>>>
>>> Thanks,
>>> CS.
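The hints printed by mount.lustre in the message quoted above can be turned into two quick checks. This is a rough sketch, not an official tool: both function names are made up for illustration, and the file paths are parameters (defaulting to the paths named in the error message) so the checks can be pointed at copies of the files.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, written for this thread.

# Succeeds if the given filesystems list (normally /proc/filesystems)
# has an entry named exactly "lustre".
lustre_fs_registered() {
    awk '$NF == "lustre" { found = 1 } END { exit !found }' "${1:-/proc/filesystems}"
}

# Succeeds if the given modprobe config still contains the
# 'alias lustre llite' line that mount.lustre says should be removed.
has_llite_alias() {
    grep -Eq '^[[:space:]]*alias[[:space:]]+lustre[[:space:]]+llite([[:space:]]|$)' "${1:-/etc/modprobe.conf}"
}

if lustre_fs_registered 2>/dev/null; then
    echo "lustre is registered in /proc/filesystems"
else
    echo "lustre is not registered; the modules are probably not loaded"
fi

if has_llite_alias 2>/dev/null; then
    echo "found 'alias lustre llite' in /etc/modprobe.conf; remove it"
fi
```

Both checks only read files, so they are safe to run on the client before and after a modprobe attempt to see whether anything changed.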
>>>
>>> On Tue, Jun 16, 2009 at 12:16 PM, Cliff White
>>> <cliff.wh...@sun.com> wrote:
>>>
>>> Carlos Santana wrote:
>>>
>>> Thanks Kevin..
>>>
>>> Please read:
>>>
>>> http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
>>>
>>> Those instructions are identical for 1.6 and 1.8.
>>>
>>> For current lustre, only two commands are used for
>>> configuration.
>>> mkfs.lustre and mount.
>>>
>>>
>>> Usually when lustre_rmmod returns that error, you run
>>> it a second time, and it will clear things. Unless you have live
>>> mounts or network connections.
>>>
>>> cliffw
>>>
>>>
>>> I am referring to 1.8 manual, but I was also referring to
>>> HowTo page on wiki which seems to be for 1.6. The HowTo page
>>>
>>> http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
>>> mentions about lmc, lconf, and lctl.
>>>
>>> The modules are installed in the right place.
>>> The '$ lustre_rmmod' resulted in the following output:
>>> [r...@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
>>> ERROR: Module obdfilter is in use
>>> ERROR: Module ost is in use
>>> ERROR: Module mds is in use
>>> ERROR: Module fsfilt_ldiskfs is in use
>>> ERROR: Module mgs is in use
>>> ERROR: Module mgc is in use by mgs
>>> ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
>>> ERROR: Module lov is in use
>>> ERROR: Module lquota is in use by obdfilter,mds
>>> ERROR: Module osc is in use
>>> ERROR: Module ksocklnd is in use
>>> ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
>>> ERROR: Module obdclass is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
>>> ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
>>> ERROR: Module lvfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
>>> ERROR: Module libcfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
>>>
>>> Do I need to shut down these services? How can I do that?
>>>
>>> Thanks,
>>> CS.
>>>
_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss