Re: [Lustre-discuss] lustre knowledge base

2009-03-31 Thread Mag Gam
Yes, who do we speak to in order to get lots of that stuff removed? Most of the questions on the newsgroups are redundant; it would be very nice to put a FAQ/KB on the wiki. On Tue, Mar 31, 2009 at 7:28 PM, Aaron Porter wrote: > On Tue, Mar 31, 2009 at 5:09 AM, Mag Gam wrote: >> >> Any possibility we revi

Re: [Lustre-discuss] lustre knowledge base

2009-03-31 Thread Aaron Porter
On Tue, Mar 31, 2009 at 5:09 AM, Mag Gam wrote: > > Any possibility we revive this on the wiki? The KB is very useful. It really does feel like the wiki has been abandoned. Lots of very outdated information prominently displayed.

Re: [Lustre-discuss] lustre & ypbind

2009-03-31 Thread Brian J. Murrell
On Tue, 2009-03-31 at 17:00 -0500, Robert Olson wrote: > here's an odd one. Had a client system reboot, and when it came back > ypbind was listening on port 988 and a lustre start complained bitterly: ... > Has anyone run into this? Yes. > I didn't see mention of it in the mailing > list ar

[Lustre-discuss] lustre & ypbind

2009-03-31 Thread Robert Olson
here's an odd one. Had a client system reboot, and when it came back ypbind was listening on port 988 and a lustre start complained bitterly: Lustre: OBD class driver, http://www.lustre.org/ Lustre: Lustre Version: 1.6.7 Lustre: Build Version: 1.6.7-1969123117- PRISTINE-.cac
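For anyone who wants to check for this kind of port clash before starting Lustre, here is a minimal sketch. It assumes the conflict is on TCP port 988 (the default LNet acceptor port) and that the client is a Linux system exposing /proc/net/tcp. The function names `listening_ports` and `acceptor_port_taken` are made up for illustration and are not part of any Lustre tool.

```python
# Hypothetical helper, not part of Lustre: look for a listener on TCP port 988
# by parsing the text of /proc/net/tcp (IPv4 only; /proc/net/tcp6 is not covered).

LNET_ACCEPTOR_PORT = 988  # default LNet acceptor port (0x03DC)


def listening_ports(proc_net_tcp):
    """Return the set of local TCP ports in LISTEN state."""
    ports = set()
    for line in proc_net_tcp.splitlines()[1:]:  # first row is the column header
        fields = line.split()
        if len(fields) < 4:
            continue
        local_address, state = fields[1], fields[3]
        if state == "0A":  # 0A is the kernel's code for TCP_LISTEN
            # local_address looks like "00000000:03DC" (hex IP : hex port)
            ports.add(int(local_address.rsplit(":", 1)[1], 16))
    return ports


def acceptor_port_taken(proc_net_tcp):
    """True if something is already listening on the LNet acceptor port."""
    return LNET_ACCEPTOR_PORT in listening_ports(proc_net_tcp)
```

On a client you would pass in the contents of /proc/net/tcp. The sketch only says that the port is taken, not which process (ypbind or otherwise) holds it.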

Re: [Lustre-discuss] Beginners Question: Mapping Lustre Node <-> DNS Hostname?

2009-03-31 Thread Lundgren, Andrew
Should this work with 1.6.4.3? My lctl doesn't even have a get_param, the closest thing I have is getattr, and it doesn't like the syntax below... > -Original Message- > From: lustre-discuss-boun...@lists.lustre.org [mailto:lustre-discuss- > boun...@lists.lustre.org] On Behalf Of Brian J

Re: [Lustre-discuss] Beginners Question: Mapping Lustre Node <-> DNS Hostname?

2009-03-31 Thread Brian J. Murrell
On Tue, 2009-03-31 at 14:41 -0600, Andreas Dilger wrote: > > You can use: > > $ lctl get_param osc.${fsname}-${OSTname}*.ost_conn_uuid > > e.g. > > $ lctl get_param osc.*-OST*.ost_conn_uuid > osc.myth-ost-osc-f1579000.ost_conn_uuid=192.168.2...@tcp > > or > $ lctl get_param osc.*.ost_c
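To turn the `lctl get_param` output quoted above into a target-to-hostname table, something along these lines might do. It is only a sketch: it assumes each output line has the `osc.<device>.ost_conn_uuid=<address>@<network>` shape shown in the example, and the helper names `parse_ost_conn_uuid` and `hostname_for` are invented here. The address in the quoted example is truncated by the archive, so no real address is reproduced.

```python
import re
import socket

# Matches lines such as "osc.<device>.ost_conn_uuid=<address>@<network>".
_LINE = re.compile(
    r"^(?P<param>osc\.\S+?)\.ost_conn_uuid=(?P<addr>[^@\s]+)@(?P<net>\S+)$"
)


def parse_ost_conn_uuid(output):
    """Map each OSC device name to an (address, LNet network) pair."""
    mapping = {}
    for line in output.splitlines():
        match = _LINE.match(line.strip())
        if match:
            device = match.group("param")[len("osc."):]
            mapping[device] = (match.group("addr"), match.group("net"))
    return mapping


def hostname_for(addr):
    """Reverse-resolve an IP address; return None if there is no PTR record."""
    try:
        return socket.gethostbyaddr(addr)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return None
```

Feeding the command output to `parse_ost_conn_uuid` and then calling `hostname_for` on each address gives the DNS name per OST, provided the network is TCP and reverse DNS is set up for the servers.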

Re: [Lustre-discuss] Beginners Question: Mapping Lustre Node <-> DNS Hostname?

2009-03-31 Thread Andreas Dilger
On Mar 31, 2009 10:20 -0400, Brian J. Murrell wrote: > On Tue, 2009-03-31 at 13:38 +0200, Wolfgang Stief wrote: > > with the corresponding IP hostname/domain > > (something like lussrv1.some.domain)? > > I guess what you really want to know is "what is the IP address (and by > extension, DNS name

Re: [Lustre-discuss] Client evictions and RMDA failures

2009-03-31 Thread Brian J. Murrell
On Tue, 2009-03-31 at 16:02 -0400, syed haider wrote: > > What would cause this? Could this be because of the fabric also? Sure. When the fabric is flaky all sorts of unexpected things (can) happen. Really, your primary task should be making your network stable rather than continuing to muck wi

Re: [Lustre-discuss] Client evictions and RMDA failures

2009-03-31 Thread syed haider
Thanks Brian. On one of the hung nodes I umounted lustre, rmmod lustre and reloaded the module and I mounted lustre again. The mount hangs again but I see 16 OSTs in "ST" state. These are also listed as in "UP" state: 0 UP mgc mgc192.255.255@o2ib bf0dec15-659a-5817-6c78-0d43ca25e7c9 5 1 UP

Re: [Lustre-discuss] Client evictions and RMDA failures

2009-03-31 Thread Brian J. Murrell
On Tue, 2009-03-31 at 11:38 -0400, syed haider wrote: > Hi Brian, Hi. > Thanks for the response. I've run a few ib tests and here is an > interesting response on the port for a failed node: > > [r...@tiger-node-0-1 ~]# ibqueryerrors.pl -c -a -r > Suppressing: RcvSwRelayErrors > Errors for 0x0008

Re: [Lustre-discuss] File Content change without Error log

2009-03-31 Thread Brian J. Murrell
On Wed, 2009-04-01 at 01:24 +0800, Lu Wang wrote: > I think data in the "good" OST may also be damaged, so I decided to delete all > files on these two OSTs. Probably the safest thing to do. > By the way, when I unlink a file, there is an "Input/Output error"; however, > the file disappears. >

Re: [Lustre-discuss] Additional RPMs for older kernels

2009-03-31 Thread Brian J. Murrell
On Mon, 2009-03-30 at 16:57 -0700, Jordan Mendler wrote: > > Are there any additional repositories that provide RPMs for older > kernels? Not from Sun. > In particular I am looking for lustre client modules for 1.6.7 > for a 2.6.9-55.0.2 (centos/rhel 4.5) kernel. You will, in all likelihood

Re: [Lustre-discuss] File Content change without Error log

2009-03-31 Thread Lu Wang
yes, I am copying some files from our backup storage. # pwd /lustre/ost1/O/0/d0 [r...@boss10 d0]# ll total 58931924 -rwSrwSrw- 1 root root 0 Mar 4 15:33 10016 -rwSrwSrw- 1 root root 0 Mar 4 15:33 10048 -rwSrwSrw- 1 root root 0 Mar 4 15:33 10080 -rwSrwSrw- 1 roo

[Lustre-discuss] acl

2009-03-31 Thread Papp Tamas
Hi all, This is not OK, am I right? Is this a bug, or am I doing something wrong? Lustre 1.6.4.3 on CentOS 5.2, client is 2.6.26+b1_8 on FC8. $ getfacl . # file: . # owner: root # group: root user::rwx group::r-x mask::rwx other::r-x default:user::rwx default:user:user:rwx default:group::r-x defa

Re: [Lustre-discuss] File Content change without Error log

2009-03-31 Thread Brian J. Murrell
On Wed, 2009-04-01 at 00:40 +0800, Lu Wang wrote: > Yes, you are right. > The problem is caused by misconfiguration of one disk array. Two partitions > of this array are mapped to the same LUN. Hrm. That sounds rather bad. > That is to say: When I created OST1 on /dev/sda and OST2 on /dev/sdb, th

Re: [Lustre-discuss] File Content change without Error log

2009-03-31 Thread Lu Wang
Yes, you are right. The problem is caused by misconfiguration of one disk array. Two partitions of this array are mapped to the same LUN. That is to say: When I created OST1 on /dev/sda and OST2 on /dev/sdb, the two OSTs are actually written to the same disk partition on the disk array. (It is qui

Re: [Lustre-discuss] Client evictions and RMDA failures

2009-03-31 Thread syed haider
Hi Brian, Thanks for the response. I've run a few ib tests and here is an interesting response on the port for a failed node: [r...@tiger-node-0-1 ~]# ibqueryerrors.pl -c -a -r Suppressing: RcvSwRelayErrors Errors for 0x0008f104003f0e21 "ISR9288/ISR9096 Voltaire sLB-24" GUID 0x0008f104003f0e21

Re: [Lustre-discuss] File Content change without Error log

2009-03-31 Thread Brian J. Murrell
On Tue, 2009-03-31 at 12:15 +0800, Lu Wang wrote: > Dear all, > There are more than 100 files damaged recently without any error logs on > OSS. The damaged files have the same size as their original copies in our backup > system. However, the checksum changed. For example, > #ll run_0008126_All_f
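A quick way to find files in the state described here (same size as the backup copy, different checksum) could look like the sketch below. It assumes both the Lustre copy and the backup copy are readable as ordinary paths. SHA-256 is an arbitrary choice, since the thread does not say which checksum tool was used, and the function names are illustrative only.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def silently_changed(original, copy):
    """True when the two files have the same size but different contents."""
    if os.path.getsize(original) != os.path.getsize(copy):
        return False  # a size mismatch is a different, more visible failure
    return file_digest(original) != file_digest(copy)
```

Running `silently_changed` over each backup/Lustre pair would list the files whose contents changed without a size change, which is the symptom reported above.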

[Lustre-discuss] Additional RPMs for older kernels

2009-03-31 Thread Jordan Mendler
Hi all, Are there any additional repositories that provide RPMs for older kernels? In particular I am looking for lustre client modules for 1.6.7 for a 2.6.9-55.0.2 (centos/rhel 4.5) kernel. If not, is there a way to build RPMs for just the kernel modules? It is my impression that 'make RPMs

Re: [Lustre-discuss] Client evictions and RMDA failures

2009-03-31 Thread Brian J. Murrell
On Tue, 2009-03-31 at 10:29 -0400, syed haider wrote: > > when a node > > > hangs, it is unable to do an lctl ping to a OSS. For example, node-0-6 > > > is hanging. From this node I can do an lctl ping to > > > oss-0-0, oss-0-2 and oss-0-3. Lctl ping to oss-0-1 just hangs. And if do > > > the

Re: [Lustre-discuss] Beginners Question: Mapping Lustre Node <-> DNS Hostname?

2009-03-31 Thread Brian J. Murrell
On Tue, 2009-03-31 at 16:26 +0200, Arne Wiebalck wrote: > > lustre_createcsv maybe? Yeah, the information that OP is looking for could surely be gleaned out of a generated CSV file, but that createcsv script just boils down to the brute-force search I suggested. That's not to say that everyone s

[Lustre-discuss] Client evictions and RMDA failures

2009-03-31 Thread syed haider
Dear lustre group, I'm hoping you can help with this problem. My configuration is as follows: 4 OSS's | 1 MDS/MGS | n # nodes RPM's installed on CentOS 5.2 systems: lustre-1.6.6-2.6.18_92.1.10.el5_lustre.1.6.6smp kernel-ib-1.3.1-2.6.18_92.1.10.el5_lustre.1.6.6smp lustre-modules-1.6.6-2.6.18_92.

Re: [Lustre-discuss] Beginners Question: Mapping Lustre Node <-> DNS Hostname?

2009-03-31 Thread Arne Wiebalck
On Tue, 2009-03-31 at 10:20 -0400, Brian J. Murrell wrote: > On Tue, 2009-03-31 at 13:38 +0200, Wolfgang Stief wrote: > > Hello out there! > > Hi. > > > My setup is TCP-based only. Is there an easy way to match the Lustre > > node name (testfs-OST0001) > > testfs-OST0001 is not a node. It's a

Re: [Lustre-discuss] Beginners Question: Mapping Lustre Node <-> DNS Hostname?

2009-03-31 Thread Brian J. Murrell
On Tue, 2009-03-31 at 13:38 +0200, Wolfgang Stief wrote: > Hello out there! Hi. > My setup is TCP-based only. Is there an easy way to match the Lustre > node name (testfs-OST0001) testfs-OST0001 is not a node. It's a lustre target. An OST to be specific. Think disk in a node, rather than a n