[Lustre-discuss] Lustre SLES 11 Clients - How to deal with updates

2010-01-30 Thread Jagga Soorma
Hi Guys,

I am building our SLES 11 Lustre clients and was wondering how people deal
with updates.  Here are some options I was thinking about:

1) Build the systems and patch everything except the kernel.  Then install
the lustre-client and kernel-ib RPMs.  In the future we would upgrade
everything except the kernel.  I am not sure about future dependency issues
with upgrading everything except the kernel.

2) Build the systems and patch everything.  Then compile the lustre-client,
lustre-modules and kernel-ib packages instead of using the Sun-provided
RPMs.  We will have to compile every time we upgrade our servers.

How do people handle updates on the client side?  Is it easy to compile and
maintain the Lustre client?  Where can I find detailed instructions on
compiling the client?
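
In case it helps to be concrete, the build I have in mind would look roughly
like this (just a sketch, not official instructions; the tarball name, kernel
source paths and configure flags are assumptions for my SLES 11 setup):

--
# Sketch: build client-only RPMs against the running kernel
tar xzf lustre-1.8.x.tar.gz
cd lustre-1.8.x
./configure --disable-server \
    --with-linux=/usr/src/linux-2.6.27.29-0.1 \
    --with-linux-obj=/usr/src/linux-2.6.27.29-0.1-obj/x86_64/default
make rpms    # should produce lustre-client and lustre-client-modules RPMs
--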

Thanks in advance,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Infiniband VS 1GiG Transfer rates. Confused

2010-02-10 Thread Jagga Soorma
Hi Guys,

I have set up a new cluster with an InfiniBand interconnect and I am a bit
confused by the transfer rates I am getting.  Is there something that I am
missing here?  Shouldn't the transfer rate over the IB interface be much
faster, rather than roughly the same as over the 1G (bonded mode=6)
interface?

(I am new to InfiniBand but this does not seem right.  My routing is also
set up correctly.  Not sure what I might be missing here):

--
hpc101:/var/tmp # time scp SLES-11-DVD-x86_64-GM-DVD1.iso r...@hpc103
:/var/tmp/
SLES-11-DVD-x86_64-GM-DVD1.iso
100% 2749MB  62.5MB/s   00:44

real    0m43.882s
user    0m29.482s
sys     0m6.736s

hpc101:/var/tmp # time scp SLES-11-DVD-x86_64-GM-DVD1.iso r...@hpc103-ib
:/var/tmp/
SLES-11-DVD-x86_64-GM-DVD1.iso
100% 2749MB  80.9MB/s   00:34

real    0m35.757s
user    0m24.498s
sys     0m6.292s

hpc101:/var/tmp # netstat -rn
Kernel IP routing table
Destination     Gateway          Genmask          Flags  MSS Window  irtt  Iface
10.0.250.0      0.0.0.0          255.255.255.0    U        0 0          0  ib0
128.137.126.0   0.0.0.0          255.255.255.0    U        0 0          0  bond0
127.0.0.0       0.0.0.0          255.0.0.0        U        0 0          0  lo
0.0.0.0         128.137.126.253  0.0.0.0          UG       0 0          0  bond0

hpc101:/var/tmp # ifconfig ib0
ib0   Link encap:UNSPEC  HWaddr
80-00-00-48-FE-80-00-00-00-00-00-00-00-00-00-00
  inet addr:10.0.250.5  Bcast:10.0.250.255  Mask:255.255.255.0
  inet6 addr: fe80::223:7dff:ff93:bf4d/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:2044  Metric:1
  RX packets:493685 errors:0 dropped:0 overruns:0 frame:0
  TX packets:890402 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:256
  RX bytes:22848079 (21.7 Mb)  TX bytes:7349365631 (7008.9 Mb)
--
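
As a sanity check, I was also thinking of measuring the raw network path
without scp in the way, with something like the following (a sketch only;
it assumes iperf is installed on both nodes, and the remote ib0/bond0
addresses below are placeholders, not my real ones):

--
hpc103:~ # iperf -s                        # start a listener on the other node
hpc101:~ # iperf -c 10.0.250.7 -t 30       # hpc103's ib0 address (placeholder)
hpc101:~ # iperf -c 128.137.126.7 -t 30    # hpc103's bond0 address (placeholder)
--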

Thanks in advance for your assistance.

Regards,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Another Infiniband Question

2010-02-11 Thread Jagga Soorma
I have a QDR IB switch that should support up to 40Gbps.  After installing
the kernel-ib and lustre client RPMs on my SuSE nodes I see the following:

hpc102:~ # ibstatus mlx4_0:1
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80::::0002:c903:0006:de19
base lid: 0x7
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 20 Gb/sec (4X DDR)

Why is this only picking up 4X DDR at 20Gb/sec?  Do the Lustre RPMs not
support QDR?  Is there something that I need to do on my side to force
40Gb/sec on these ports?
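
For anyone who wants to look at the same thing, the enabled/active link
speeds on the port can be queried with something like this (a sketch; the
LID 0x7 and port number 1 are taken from the ibstatus output above):

--
hpc102:~ # ibportstate 7 1 query | grep -i speed
--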

Thanks in advance,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre 1.8.1 QDR Support

2010-02-11 Thread Jagga Soorma
Hi Guys,

Wanted to give a bit more information.  For some reason the transfer rates
on my IB interfaces are auto-negotiating at 20Gb/s (4X DDR).  However, these
are QDR HCAs.

Here is the hardware that I have:

HP IB 4X QDR PCI-e G2 Dual Port HCA
HP 3M 4X DDR/QDR QSFP IB Cu Cables
Qlogic 12200 QDR switch

I am using all the Lustre-provided RPMs on my servers (RHEL 5.3) and clients
(SLES 11).  All my servers in this cluster are auto-negotiating to 20Gb/s
(4X DDR) when they should be at 40Gb/s (4X QDR).

Are any others out there using QDR?  If so, did you run into anything
similar to this?  Is there any specific configuration that is needed for the
servers to detect the higher rates?

Thanks in advance for your assistance.

-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre 1.8.1 QDR Support

2010-02-11 Thread Jagga Soorma
More information:

hpc116:/mnt/SLES11x86_64 # lspci | grep -i mellanox
10:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX IB QDR, PCIe 2.0
5GT/s] (rev a0)

hpc116:/mnt/SLES11x86_64 # ibstatus
Infiniband device 'mlx4_0' port 1 status:
default gid: fe80::::0002:c903:0006:9109
base lid: 0x14
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 20 Gb/sec (4X DDR)

Infiniband device 'mlx4_0' port 2 status:
default gid: fe80::::0002:c903:0006:910a
base lid: 0x0
sm lid: 0x0
state: 1: DOWN
phys state: 2: Polling
rate: 10 Gb/sec (4X)

hpc116:/mnt/SLES11x86_64 # ibstat
CA 'mlx4_0'
CA type: MT26428
Number of ports: 2
Firmware version: 2.6.100
Hardware version: a0
Node GUID: 0x0002c90300069108
System image GUID: 0x0002c9030006910b
Port 1:
State: Active
Physical state: LinkUp
Rate: 20
Base lid: 20
LMC: 0
SM lid: 1
Capability mask: 0x02510868
Port GUID: 0x0002c90300069109
Port 2:
State: Down
Physical state: Polling
Rate: 10
Base lid: 0
LMC: 0
SM lid: 0
Capability mask: 0x02510868
Port GUID: 0x0002c9030006910a


On Thu, Feb 11, 2010 at 1:21 PM, Jagga Soorma  wrote:

> Hi Guys,
>
> Wanted to give a bit more information.  So for some reason the transfer
> rates on my ib interfaces are autonegotiating at 20Gb/s (4X DDR).  However,
> these are QDR HCA's.
>
> Here is the hardware that I have:
>
> HP IB 4X QDR PCI-e G2 Dual Port HCA
> HP 3M 4X DDR/QDR QSFP IB Cu Cables
> Qlogic 12200 QDR switch
>
> I am using all the lustre provided rpms on my servers (RHEL 5.3) and
> clients (SLES 11).  All my servers in this cluster are auto negotiating to
> 20Gb/s (4X DDR) which should be 40Gb/s (4X QDR).
>
> Are any others out there using QDR?  If so, did you run into anything
> similar to this?  Is there any specific configuration that is needed for the
> servers to detect the higher rates.
>
> Thanks in advance for your assistance.
>
> -J
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre 1.8.1 QDR Support

2010-02-11 Thread Jagga Soorma
Yet more information.  Looks like the switch thinks that this could be set
to 10Gbps (QDR):

hpc116:/mnt/SLES11x86_64 # iblinkinfo.pl -R | grep -i reshpc116
  1   34[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>  201[  ]
"hpc116 HCA-1" (  Could be 10.0 Gbps)

-J

On Thu, Feb 11, 2010 at 1:26 PM, Jagga Soorma  wrote:

> More information:
>
> hpc116:/mnt/SLES11x86_64 # lspci | grep -i mellanox
> 10:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX IB QDR, PCIe
> 2.0 5GT/s] (rev a0)
>
> hpc116:/mnt/SLES11x86_64 # ibstatus
> Infiniband device 'mlx4_0' port 1 status:
> default gid: fe80::::0002:c903:0006:9109
> base lid: 0x14
> sm lid: 0x1
> state: 4: ACTIVE
> phys state: 5: LinkUp
> rate: 20 Gb/sec (4X DDR)
>
> Infiniband device 'mlx4_0' port 2 status:
> default gid: fe80::::0002:c903:0006:910a
> base lid: 0x0
> sm lid: 0x0
> state: 1: DOWN
> phys state: 2: Polling
> rate: 10 Gb/sec (4X)
>
> hpc116:/mnt/SLES11x86_64 # ibstat
> CA 'mlx4_0'
> CA type: MT26428
> Number of ports: 2
> Firmware version: 2.6.100
> Hardware version: a0
> Node GUID: 0x0002c90300069108
> System image GUID: 0x0002c9030006910b
> Port 1:
> State: Active
> Physical state: LinkUp
> Rate: 20
> Base lid: 20
> LMC: 0
> SM lid: 1
> Capability mask: 0x02510868
> Port GUID: 0x0002c90300069109
> Port 2:
> State: Down
> Physical state: Polling
> Rate: 10
> Base lid: 0
> LMC: 0
> SM lid: 0
> Capability mask: 0x02510868
> Port GUID: 0x0002c9030006910a
>
>
>
> On Thu, Feb 11, 2010 at 1:21 PM, Jagga Soorma  wrote:
>
>> Hi Guys,
>>
>> Wanted to give a bit more information.  So for some reason the transfer
>> rates on my ib interfaces are autonegotiating at 20Gb/s (4X DDR).  However,
>> these are QDR HCA's.
>>
>> Here is the hardware that I have:
>>
>> HP IB 4X QDR PCI-e G2 Dual Port HCA
>> HP 3M 4X DDR/QDR QSFP IB Cu Cables
>> Qlogic 12200 QDR switch
>>
>> I am using all the lustre provided rpms on my servers (RHEL 5.3) and
>> clients (SLES 11).  All my servers in this cluster are auto negotiating to
>> 20Gb/s (4X DDR) which should be 40Gb/s (4X QDR).
>>
>> Are any others out there using QDR?  If so, did you run into anything
>> similar to this?  Is there any specific configuration that is needed for the
>> servers to detect the higher rates.
>>
>> Thanks in advance for your assistance.
>>
>> -J
>>
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre 1.8.1 QDR Support {RESOLVED}

2010-02-12 Thread Jagga Soorma
This was an issue with the HCA firmware.  The Mellanox card was not working
well with the QLogic switch.  The support guys at Mellanox were pretty
helpful and came up with new firmware for us to try, which solved my
auto-negotiation issues.


On Thu, Feb 11, 2010 at 1:44 PM, Jagga Soorma  wrote:

> Yet more information.  Looks like the switch thinks that this could be set
> to 10Gbps (QDR):
>
> hpc116:/mnt/SLES11x86_64 # iblinkinfo.pl -R | grep -i reshpc116
>   1   34[  ]  ==( 4X 5.0 Gbps Active /   LinkUp)==>  201[  ]
> "hpc116 HCA-1" (  Could be 10.0 Gbps)
>
> -J
>
>
> On Thu, Feb 11, 2010 at 1:26 PM, Jagga Soorma  wrote:
>
>> More information:
>>
>> hpc116:/mnt/SLES11x86_64 # lspci | grep -i mellanox
>> 10:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX IB QDR, PCIe
>> 2.0 5GT/s] (rev a0)
>>
>> hpc116:/mnt/SLES11x86_64 # ibstatus
>> Infiniband device 'mlx4_0' port 1 status:
>> default gid: fe80::::0002:c903:0006:9109
>> base lid: 0x14
>> sm lid: 0x1
>> state: 4: ACTIVE
>> phys state: 5: LinkUp
>> rate: 20 Gb/sec (4X DDR)
>>
>> Infiniband device 'mlx4_0' port 2 status:
>> default gid: fe80::::0002:c903:0006:910a
>> base lid: 0x0
>> sm lid: 0x0
>> state: 1: DOWN
>> phys state: 2: Polling
>> rate: 10 Gb/sec (4X)
>>
>> hpc116:/mnt/SLES11x86_64 # ibstat
>> CA 'mlx4_0'
>> CA type: MT26428
>> Number of ports: 2
>> Firmware version: 2.6.100
>> Hardware version: a0
>> Node GUID: 0x0002c90300069108
>> System image GUID: 0x0002c9030006910b
>> Port 1:
>> State: Active
>> Physical state: LinkUp
>> Rate: 20
>> Base lid: 20
>> LMC: 0
>> SM lid: 1
>> Capability mask: 0x02510868
>>     Port GUID: 0x0002c90300069109
>> Port 2:
>> State: Down
>> Physical state: Polling
>> Rate: 10
>> Base lid: 0
>> LMC: 0
>> SM lid: 0
>> Capability mask: 0x02510868
>> Port GUID: 0x0002c9030006910a
>>
>>
>>
>> On Thu, Feb 11, 2010 at 1:21 PM, Jagga Soorma  wrote:
>>
>>> Hi Guys,
>>>
>>> Wanted to give a bit more information.  So for some reason the transfer
>>> rates on my ib interfaces are autonegotiating at 20Gb/s (4X DDR).  However,
>>> these are QDR HCA's.
>>>
>>> Here is the hardware that I have:
>>>
>>> HP IB 4X QDR PCI-e G2 Dual Port HCA
>>> HP 3M 4X DDR/QDR QSFP IB Cu Cables
>>> Qlogic 12200 QDR switch
>>>
>>> I am using all the lustre provided rpms on my servers (RHEL 5.3) and
>>> clients (SLES 11).  All my servers in this cluster are auto negotiating to
>>> 20Gb/s (4X DDR) which should be 40Gb/s (4X QDR).
>>>
>>> Are any others out there using QDR?  If so, did you run into anything
>>> similar to this?  Is there any specific configuration that is needed for the
>>> servers to detect the higher rates.
>>>
>>> Thanks in advance for your assistance.
>>>
>>> -J
>>>
>>
>>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] lfs setquota Question

2010-02-16 Thread Jagga Soorma
Hi Guys,

I am running Lustre 1.8.1 and need to set up quotas for users.  I have done
the following:

On MDS:
tunefs.lustre --param mdt.quota_type=ug /dev/mdtvg/mdtlv

On OSS's:
(For all ost's)
tunefs.lustre --param ost.quota_type=ug /dev/mapper/mpath[X]

** I believe the above commands are persistent and I don't have to do
anything besides this? **

On Client:
Mounted the lustre filesystem and ran:
lfs quotacheck /mnt/lustre

Now, I need to set a 12TB limit for all users.  I believe this size is
supported for quotas in 1.8.  Would this be the right thing to do:

lfs setquota -u username -b 12884901888 -B 13199474688 /mnt/lustre

I believe I will need to do the above for all users.  I am not using group
quotas even though they are enabled.  Also, this is a 20TB Lustre volume and
the quota is just to safeguard against one user filling up the filesystem.
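
For all users, I was planning to just loop over them with something like
this (a sketch; the getent source and the UID cutoff are assumptions about
my environment, and the limits are the same values as above, in KB):

--
# Apply the same 12TB soft / ~12.3TB hard block limit to every regular user
for u in $(getent passwd | awk -F: '$3 >= 1000 {print $1}'); do
    lfs setquota -u "$u" -b 12884901888 -B 13199474688 /mnt/lustre
done
--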

Also, would it be okay to monitor the usage on a daily basis by running the
following command with a script:

lfs quota -u username -v /mnt/lustre

Or would it not be recommended?  I do not know how much overhead this would
add and don't want to cause other issues by running this.  But this
information would be nice to have.

Regards,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Curious about iozone findings of new Lustre FS

2010-03-03 Thread Jagga Soorma
Hi Guys,

(P.S. My apologies if this is a duplicate - I sent this email earlier today
with an attachment and did not see it show up in Google Groups, so I was not
sure if attachments were allowed.)

I have just deployed a new Lustre FS with 2 MDS servers, 2 active OSS
servers (5x2TB OSTs per OSS) and 16 compute nodes.  It looks like the
iozone throughput tests have demonstrated almost linear scalability of
Lustre except when WRITING files that exceed 128MB in size.  When
multiple clients create/write files larger than 128MB, Lustre throughput
levels off at approximately 1GB/s.  This behavior has been observed with
almost all tested block size ranges except for 4KB.  I don't have any
explanation as to why Lustre performs poorly when writing large files.

Here is the iozone report:
http://docs.google.com/fileview?id=0Bz8GxDEZOhnwYjQyMDlhMWMtODVlYi00MTgwLTllN2QtYzU2MWJlNTEwMjA1&hl=en

The only changes I have made to the defaults are:
stripe_count: 2 stripe_size: 1048576 stripe_offset: -1

I am using Lustre 1.8.1.1 and my MDS/OSS servers are running RHEL 5.3.  All
the clients are SLES 11.

Has anyone experienced this behaviour?  Any comments on our findings?

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Curious about iozone findings of new Lustre FS

2010-03-03 Thread Jagga Soorma
On Wed, Mar 3, 2010 at 2:30 PM, Andreas Dilger  wrote:

> On 2010-03-03, at 12:50, Jagga Soorma wrote:
>
>> I have just deployed a new Lustre FS with 2 MDS servers, 2 active OSS
>> servers (5x2TB OST's per OSS) and 16 compute nodes.
>>
>
> Does this mean you are using 5 2TB disks in a single RAID-5 OST per OSS
> (i.e. total OST size is 8TB), or are you using 5 separate 2TB OSTs?


No, I am using 5 independent 2TB OSTs per OSS.


>
>
>  Attached are our findings from the iozone tests and it looks like the
>> iozone throughput tests have demonstrated almost linear scalability of
>> Lustre except for when WRITING files that exceed 128MB in size.  When
>> multiple clients create/write files larger than 128MB, Lustre throughput
>> levels up to approximately ~1GB/s. This behavior has been observed with
>> almost all tested block size ranges except for 4KB.  I don't have any
>> explanation as to why Lustre performs poorly when writing large files.
>>
>> Has anyoned experienced this behaviour?  Any comments on our findings?
>>
>
>
> The default client tunable max_dirty_mb=32MB per OSC (i.e. the maximum
> amount of unwritten dirty data per OST before blocking the process
> submitting IO).  If you have 2 OST/OSCs and you have a stripe count of 2
> then you can cache up to 64MB on the client without having to wait for any
> RPCs to complete.  That is why you see a performance cliff for writes beyond
> 32MB.
>

So the true write performance should be measured using files larger than
128MB?  If we do see a large number of large files being created on the
Lustre FS, is this something that can be tuned on the client side?
If so, where/how can I do this and what would be the recommended
settings?
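
Something like this is what I had in mind for checking and tuning it on a
client (a sketch only; the value 64 is an arbitrary example, not a
recommendation):

--
client# lctl get_param osc.*.max_dirty_mb       # current per-OSC dirty limit (default 32)
client# lctl set_param osc.*.max_dirty_mb=64    # example value only
--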


> It should be clear that the read graphs are meaningless, due to local cache
> of the file.  I'd hazard a guess that you are not getting 100GB/s from 2 OSS
> nodes.
>

Agreed.  Is there a way to find out the size of the local cache on the
clients?


>
> Also, what is the interconnect on the client?  If you are using a single
> 10GigE then 1GB/s is as fast as you can possibly write large files to the
> OSTs, regardless of the striping.
>

I am using Infiniband (QDR) interconnects for all nodes.


>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Curious about iozone findings of new Lustre FS

2010-03-03 Thread Jagga Soorma
Or would it be better to increase the stripe count for my Lustre filesystem
to the maximum number of OSTs?
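
For clarity, the change I am considering would be something like the
following (a sketch; -1 means stripe over all available OSTs, and the path
is my client mount point):

--
lfs setstripe -c -1 /mnt/lustre
lfs getstripe /mnt/lustre    # verify the new default striping
--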

On Wed, Mar 3, 2010 at 3:27 PM, Jagga Soorma  wrote:

> On Wed, Mar 3, 2010 at 2:30 PM, Andreas Dilger  wrote:
>
>> On 2010-03-03, at 12:50, Jagga Soorma wrote:
>>
>>> I have just deployed a new Lustre FS with 2 MDS servers, 2 active OSS
>>> servers (5x2TB OST's per OSS) and 16 compute nodes.
>>>
>>
>> Does this mean you are using 5 2TB disks in a single RAID-5 OST per OSS
>> (i.e. total OST size is 8TB), or are you using 5 separate 2TB OSTs?
>
>
> No I am using 5 independent 2TB OST's per OSS.
>
>
>>
>>
>>  Attached are our findings from the iozone tests and it looks like the
>>> iozone throughput tests have demonstrated almost linear scalability of
>>> Lustre except for when WRITING files that exceed 128MB in size.  When
>>> multiple clients create/write files larger than 128MB, Lustre throughput
>>> levels up to approximately ~1GB/s. This behavior has been observed with
>>> almost all tested block size ranges except for 4KB.  I don't have any
>>> explanation as to why Lustre performs poorly when writing large files.
>>>
>>> Has anyoned experienced this behaviour?  Any comments on our findings?
>>>
>>
>>
>> The default client tunable max_dirty_mb=32MB per OSC (i.e. the maximum
>> amount of unwritten dirty data per OST before blocking the process
>> submitting IO).  If you have 2 OST/OSCs and you have a stripe count of 2
>> then you can cache up to 64MB on the client without having to wait for any
>> RPCs to complete.  That is why you see a performance cliff for writes beyond
>> 32MB.
>>
>
> So the true write performance should be measured for data captured for
> files larger than 128MB?  If we do see a large number of large files being
> created on the lustre fs, is this something that can be tuned on the client
> side?  If so, where/how can I get this done and what would be the
> recommended settings?
>
>
>> It should be clear that the read graphs are meaningless, due to local
>> cache of the file.  I'd hazard a guess that you are not getting 100GB/s from
>> 2 OSS nodes.
>>
>
> Agreed.  Is there a way to find out the size of the local cache on the
> clients?
>
>
>>
>> Also, what is the interconnect on the client?  If you are using a single
>> 10GigE then 1GB/s is as fast as you can possibly write large files to the
>> OSTs, regardless of the striping.
>>
>
> I am using Infiniband (QDR) interconnects for all nodes.
>
>
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Sr. Staff Engineer, Lustre Group
>> Sun Microsystems of Canada, Inc.
>>
>>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Problem with flock and perl on Lustre FS

2010-03-05 Thread Jagga Soorma
Hi Guys,

How does Lustre handle locking?  One of our users is complaining that a Perl
module (Storable) has trouble with its lock_nstore method when it tries to
use flock.  The following is how they are reproducing this issue:

--
> perl -d -e ''

Loading DB routines from perl5db.pl version 1.3
Editor support available.

Enter h or `h h' for help, or `man perldebug' for more help.

Debugged program terminated.  Use q to quit or R to restart,
 use o inhibit_exit to avoid stopping after program termination,
 h q, h R or h o to get additional info.
 DB<1> use Fcntl ':flock'

 DB<2> open(FOO, ">>/tmp/gh") or die "darn"

 DB<3> flock(FOO, LOCK_EX) || die "SHIE: $!"

 DB<4> close FOO

 DB<5> open(FOO, ">>gh") or die "darn"

 DB<6> flock(FOO, LOCK_EX) || die "SHIE: $!"
SHIE: Function not implemented at (eval
10)[/usr/lib/perl5/5.10.0/perl5db.pl:638] line 2.

 DB<7>
--

Thanks in advance for your assistance.

Regards,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Question regarding caution statement in 1.8 manual for the consistent mode flock option

2010-03-05 Thread Jagga Soorma
Hi Guys,

Thanks Andreas for pointing me to the flock options.  However, I see the
following caution statement for the consistent mode:

--
CAUTION: This mode has a noticeable performance impact and may affect
stability, depending on the Lustre version used. Consider using a newer
Lustre version which is more stable.
--

Is there an impact if the option is turned on, or only if it is turned on
and used?  Is the impact local to the file being locked, the machine on
which that file is locked, or the entire set of machines mounting that
lustre file system?
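
For reference, what I would be enabling is the client mount option; a sketch
of what that looks like (the MGS NID, filesystem name and mount point are
placeholders, not my real configuration):

--
# Cluster-coherent flock (the "consistent" mode the caution refers to):
mount -t lustre -o flock mgsnode@o2ib3:/lustrefs /mnt/lustre
# Client-local-only locking, an alternative with no cross-node coherency:
mount -t lustre -o localflock mgsnode@o2ib3:/lustrefs /mnt/lustre
--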

Thanks in advance,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre Client - Memory Issue

2010-04-19 Thread Jagga Soorma
Hi Guys,

My users are reporting some issues with memory on our Lustre 1.8.1 clients.
It looks like when they submit a single job at a time, the run time is about
4.5 minutes.  However, when they run multiple jobs (10 or fewer) on a single
node with 192GB of memory, the run time for each job exceeds 3-4X the run
time of the single process.  They also noticed that the swap usage kept
climbing even though there was plenty of free memory on the system.  Could
this possibly be related to the Lustre client?  Does it reserve any memory
that is not accessible by any other process even though it might not be in
use?

Thanks much,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-04-19 Thread Jagga Soorma
Cached:  23672 kB
Active:1667172 kB
Inactive:   114552 kB
SwapTotal:75505460 kB
SwapFree: 75461372 kB
Dirty: 116 kB
Writeback:   0 kB
AnonPages:   53284 kB
Mapped:   8884 kB
Slab: 95664132 kB
SReclaimable:   256656 kB
SUnreclaim:   95407476 kB
PageTables:   2368 kB
NFS_Unstable:0 kB
Bounce:  0 kB
WritebackTmp:0 kB
CommitLimit:  174551180 kB
Committed_AS:   137540 kB
VmallocTotal: 34359738367 kB
VmallocUsed:588416 kB
VmallocChunk: 34359149923 kB
HugePages_Total: 0
HugePages_Free:  0
HugePages_Rsvd:  0
HugePages_Surp:  0
Hugepagesize: 2048 kB
DirectMap4k:  8432 kB
DirectMap2M:  201308160 kB
..
--

On Mon, Apr 19, 2010 at 10:07 AM, Andreas Dilger
wrote:

> There is a known problem with the DLM LRU size that may be affecting you.
> It may be something else too. Please check /proc/{slabinfo,meminfo} to see
> what is using the memory on the client.
>
> Cheers, Andreas
>
>
> On 2010-04-19, at 10:43, Jagga Soorma  wrote:
>
>  Hi Guys,
>>
>> My users are reporting some issues with memory on our lustre 1.8.1
>> clients.  It looks like when they submit a single job at a time the run time
>> was about 4.5 minutes.  However, when they ran multiple jobs (10 or less) on
>> a client with 192GB of memory on a single node the run time for each job was
>> exceeding 3-4X the run time for the single process.  They also noticed that
>> the swap space kept climbing even though there was plenty of free memory on
>> the system.  Could this possibly be related to the lustre client?  Does it
>> reserve any memory that is not accessible by any other process even though
>> it might not be in use?
>>
>> Thanks much,
>> -J
>> ___
>> Lustre-discuss mailing list
>> Lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-04-19 Thread Jagga Soorma
Actually this does not seem correct:

SUnreclaim:   95407476 kB

Shouldn't this be a lot smaller?

-Simran

On Mon, Apr 19, 2010 at 10:16 AM, Jagga Soorma  wrote:

> Thanks for the response Andreas.
>
> What is the known problem with the DLM LRU size?  Here is what my
> slabinfo/meminfo look like on one of the clients.  I don't see anything out
> of the ordinary:
>
> (then again there are no jobs currently running on this system)
>
> Thanks
> -J
>
> --
> slabinfo:
> ..
> slabinfo - version: 2.1
> # name   
>  : tunables: slabdata
>   
> nfs_direct_cache   0  0128   301 : tunables  120   608
> : slabdata  0  0  0
> nfs_write_data36 44704   112 : tunables   54   278
> : slabdata  4  4  0
> nfs_read_data 32 33704   112 : tunables   54   278
> : slabdata  3  3  0
> nfs_inode_cache0  098441 : tunables   54   278
> : slabdata  0  0  0
> nfs_page   0  0128   301 : tunables  120   608
> : slabdata  0  0  0
> rpc_buffers8  8   204821 : tunables   24   128
> : slabdata  4  4  0
> rpc_tasks  8 12320   121 : tunables   54   278
> : slabdata  1  1  0
> rpc_inode_cache0  083241 : tunables   54   278
> : slabdata  0  0  0
> ll_async_page 326589 328572320   121 : tunables   54   278
> : slabdata  27381  27381  0
> ll_file_data   0  0192   201 : tunables  120   608
> : slabdata  0  0  0
> lustre_inode_cache76977289641 : tunables   54   278
> : slabdata193193  0
> lov_oinfo   1322   1392320   121 : tunables   54   278
> : slabdata116116  0
> osc_quota_info 0  0 32  1121 : tunables  120   608
> : slabdata  0  0  0
> ll_qunit_cache 0  0112   341 : tunables  120   608
> : slabdata  0  0  0
> llcd_cache 0  0   395211 : tunables   24   128
> : slabdata  0  0  0
> ptlrpc_cbdatas 0  0 32  1121 : tunables  120   608
> : slabdata  0  0  0
> interval_node   1166   3240128   301 : tunables  120   608
> : slabdata108108  0
> ldlm_locks  2624   368851281 : tunables   54   278
> : slabdata461461  0
> ldlm_resources  2002   3340384   101 : tunables   54   278
> : slabdata334334  0
> ll_import_cache0  0   124831 : tunables   24   128
> : slabdata  0  0  0
> ll_obdo_cache  0 452282156208   191 : tunables  120   60
> 8 : slabdata  0 23804324  0
> ll_obd_dev_cache  13 13   567212 : tunables840
> : slabdata 13 13  0
> obd_lvfs_ctxt_cache  0  0 96   401 : tunables  120   60
> 8 : slabdata  0  0  0
> SDP0  0   172842 : tunables   24   128
> : slabdata  0  0  0
> fib6_nodes 7118 64   591 : tunables  120   608
> : slabdata  2  2  0
> ip6_dst_cache 14 36320   121 : tunables   54   278
> : slabdata  3  3  0
> ndisc_cache4 30256   151 : tunables  120   608
> : slabdata  2  2  0
> RAWv6 35 3696041 : tunables   54   278
> : slabdata  9  9  0
> UDPLITEv6  0  096041 : tunables   54   278
> : slabdata  0  0  0
> UDPv6  7 1296041 : tunables   54   278
> : slabdata  3  3  0
> tw_sock_TCPv6  0  0192   201 : tunables  120   608
> : slabdata  0  0  0
> request_sock_TCPv6  0  0192   201 : tunables  120   608
> : slabdata  0  0  0
> TCPv6  2  4   179221 : tunables   24   128
> : slabdata  2  2  0
> ib_mad  2069   216044881 : tunables   54   278
> : slabdata270270  6
> fuse_request   0  060861 : tunables   54   278
> : slabdata  0  0  0
> fuse_inode 0  0704   112 : tunables   54   278
> : slabdata  0  0  0
> kcopyd_job 0  0360   111 : tunables   54   278
> : slabdata  0  0  0
> dm_uevent  0  0   

Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-04-20 Thread Jagga Soorma
Hi Andreas,

Thanks for your response.  I will try to run the leak-finder script and
hopefully it will point us in the right direction.  This only seems to be
happening on some of my clients:

--
client112: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client108: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client110: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client107: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client111: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client109: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client102: ll_obdo_cache  5 38208   191 : tunables
120   608 : slabdata  2  2  1
client114: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client105: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client103: ll_obdo_cache  0  0208   191 : tunables
120   608 : slabdata  0  0  0
client104: ll_obdo_cache  0 433506280208   191 : tunables
120   608 : slabdata  0 22816120  0
client116: ll_obdo_cache  0 457366746208   191 : tunables
120   608 : slabdata  0 24071934  0
client113: ll_obdo_cache  0 456778867208   191 : tunables
120   608 : slabdata  0 24040993  0
client106: ll_obdo_cache  0 456372267208   191 : tunables
120   608 : slabdata  0 24019593  0
client115: ll_obdo_cache  0 449929310208   191 : tunables
120   608 : slabdata  0 23680490  0
client101: ll_obdo_cache  0 454318101208   191 : tunables
120   608 : slabdata  0 23911479  0
--

Hopefully this should help.  Not sure which application might be causing the
leaks.  Currently R is the only app that users seem to be using heavily on
these clients.  Will let you know what I find.

Thanks again,
-J

On Mon, Apr 19, 2010 at 9:04 PM, Andreas Dilger
wrote:

> On 2010-04-19, at 11:16, Jagga Soorma wrote:
>
>> What is the known problem with the DLM LRU size?
>>
>
> It is mostly a problem on the server, actually.
>
>   Here is what my slabinfo/meminfo look like on one of the clients.  I
>> don't see anything out of the ordinary:
>>
>> (then again there are no jobs currently running on this system)
>>
>> slabinfo - version: 2.1
>> # name   
>>  : tunables: slabdata
>>   
>>
>
>  ll_async_page 326589 328572320   121 : tunables   54   278
>> : slabdata  27381  27381  0
>>
>
> This shows you have 326589 pages in the lustre filesystem cache, or about
> 1275MB of data.  That shouldn't be too much for a system with 192GB of
> RAM...
>
>  lustre_inode_cache76977289641 : tunables   54   27
>>  8 : slabdata193193  0
>> ldlm_locks  2624   368851281 : tunables   54   278
>> : slabdata461461  0
>> ldlm_resources  2002   3340384   101 : tunables   54   278
>> : slabdata334334  0
>>
>
> Only about 2600 locks on 770 files is fine (this is what the DLM LRU size
> would affect, if it were out of control, which it isn't).
>
>  ll_obdo_cache  0 452282156208   191 : tunables  120   60
>>  8 : slabdata  0 23804324  0
>>
>
> This is really out of whack.  The "obdo" struct should normally only be
> allocated for a short time and then freed again, but here you have 452M of
> them using over 90GB of RAM.  It looks like a leak of some kind, which is a
> bit surprising since we have fairly tight checking for memory leaks in the
> Lustre code.
>
> Are you running some unusual workload that is maybe walking an unusual code
> path?  What you can do to track down memory leaks is enable Lustre memory
> tracing, increase the size of the debug buffer to catch enough tracing to be
> useful, and then run your job to see what is causing the leak, dump the
> kernel debug log, and then run leak-finder.pl (attached, and also in
> Lustre sources):
>
> client# lctl set_param debug=+malloc
> client# lctl set_param debug_mb=256
> client$ {run job}
> client# sync
> client# lctl dk /tmp/debug
> client# perl leak-finder.pl < /tmp/debug 2>&1 | grep "Leak.*oa"
> client# lctl set_param debug=-malloc
> client# lctl set_param debug_mb=32
>

Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-04-28 Thread Jagga Soorma
Hi Johann,

I am actually using 1.8.1 and not 1.8.2:

# rpm -qa | grep -i lustre
lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default

My kernel version on the SLES 11 clients is:
# uname -r
2.6.27.29-0.1-default

My kernel version on the RHEL 5.3 mds/oss servers is:
# uname -r
2.6.18-128.7.1.el5_lustre.1.8.1.1

Please let me know if you need any further information.  I am still trying
to get the user to help me run his app so that I can run the leak finder
script to capture more information.

Regards,
-Simran

On Tue, Apr 27, 2010 at 7:20 AM, Johann Lombardi  wrote:

> Hi,
>
> On Tue, Apr 20, 2010 at 09:08:25AM -0700, Jagga Soorma wrote:
> > Thanks for your response.* I will try to run the leak-finder script and
> > hopefully it will point us in the right direction.* This only seems to be
> > happening on some of my clients:
>
> Could you please tell us what kernel you use on the client side?
>
> >client104: ll_obdo_cache* 0 433506280*** 208** 19*** 1 :
> tunables*
> >120** 60*** 8 : slabdata* 0 22816120* 0
> >client116: ll_obdo_cache* 0 457366746*** 208** 19*** 1 :
> tunables*
> >120** 60*** 8 : slabdata* 0 24071934* 0
> >client113: ll_obdo_cache* 0 456778867*** 208** 19*** 1 :
> tunables*
> >120** 60*** 8 : slabdata* 0 24040993* 0
> >client106: ll_obdo_cache* 0 456372267*** 208** 19*** 1 :
> tunables*
> >120** 60*** 8 : slabdata* 0 24019593* 0
> >client115: ll_obdo_cache* 0 449929310*** 208** 19*** 1 :
> tunables*
> >120** 60*** 8 : slabdata* 0 23680490* 0
> >client101: ll_obdo_cache* 0 454318101*** 208** 19*** 1 :
> tunables*
> >120** 60*** 8 : slabdata* 0 23911479* 0
> >--
> >
> >Hopefully this should help.* Not sure which application might be
> causing
> >the leaks.* Currently R is the only app that users seem to be using
> >heavily on these clients.* Will let you know what I find.
>
> Tommi Tervo has filed a bugzilla ticket for this issue, see
> https://bugzilla.lustre.org/show_bug.cgi?id=22701
>
> Could you please add a comment to this ticket to describe the
> behavior of the application "R" (fork many threads, write to
> many files, use direct i/o, ...)?
>
> Cheers,
> Johann
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Residual sessions on the luster cluster

2010-06-18 Thread Jagga Soorma
Hey Guys,

Some of my users are reporting that sessions that they start on the lustre
cluster seem to live on if their session times out, or if they otherwise get
disconnected. This even happens when the session was in the foreground.
 Could the shared lustre fs be causing this?  Any ideas?

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Residual sessions on the luster cluster

2010-06-18 Thread Jagga Soorma
My apologies, I should have been a bit more clear.  The users' home
directories reside on the Lustre filesystem.  They log in and start some
computational job in the foreground.  When their session times out or they
get disconnected, their jobs are still running and need to be killed by
sending a kill signal.  The job should die with the session if it is
running in the foreground.

-J

On Fri, Jun 18, 2010 at 10:55 AM, Jagga Soorma  wrote:

> Hey Guys,
>
> Some of my users are reporting that sessions that they start on the lustre
> cluster seem to live on if their session times out, or if they otherwise get
> disconnected. This even happens when the session was in the foreground.
>  Could the shared lustre fs be causing this?  Any ideas?
>
> Thanks,
> -J
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] New 10Gbe adapter on lustre client brought down switch due to broadcast traffic

2010-08-02 Thread Jagga Soorma
Hi Guys,

I had a situation yesterday where a 10GbE adapter on my Lustre client was
configured but not active (the cable was plugged in and had link but the
port was down), and this actually brought down our Cisco switch.  This is a
new port that I am setting up.  The reason the switch went down was the
amount of broadcast traffic coming from this port, even though it was shut
down:
..snip..
TenGigabitEthernet7/4 is down, line protocol is down (notconnect)
  Hardware is C6k 1Mb 802.3, address is 001d.4577.c693 (bia
001d.4577.c693)
  MTU 1500 bytes, BW 1000 Kbit, DLY 10 usec,
 reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full-duplex, 10Gb/s
  input flow-control is off, output flow-control is off
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input never, output 09:26:18, output hang never
  Last clearing of "show interface" counters 14:46:54
  Input queue: 0/2000/4149536364/0 (size/max/drops/flushes); Total output
drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
 212613091560 packets input, 13607237844800 bytes, 0 no buffer
 Received 212613090908 broadcasts (2159693404 multicasts)
 0 runts, 0 giants, 0 throttles
 0 input errors, 0 CRC, 0 frame, 4149536364 overrun, 0 ignored
 0 watchdog, 0 multicast, 0 pause input
 0 input packets with dribble condition detected
 154150900 packets output, 232610834033 bytes, 0 underruns
 0 output errors, 0 collisions, 0 interface resets
 0 babbles, 0 late collision, 0 deferred
 0 lost carrier, 0 no carrier, 0 PAUSE output
 0 output buffer failures, 0 output buffers swapped out
..snip..

All my lustre traffic should be going through my ib interface based on my
modprobe.conf.local:

options lnet networks="o2ib3(ib0)"
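
To confirm that LNET is really bound only to ib0, the configured NIDs can be
listed with something like this (a sketch):

--
client# lctl list_nids    # should report only an @o2ib3 NID if lnet is on ib0
--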

Just wanted to put this out there and see if anyone can point out something
that I might be missing.  These broadcasts should not be lustre related,
right?  Also, it does not make sense that this port is sending out broadcast
traffic when the port is not even active.

Any comments/assistance would be greatly appreciated.

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Weird behavior on lustre clients

2010-08-08 Thread Jagga Soorma
Hello,

I am experiencing some weird behavior on my Lustre clients.  I have worked
with Novell support and they keep pointing to Lustre as the culprit for
these issues.  I am getting intermittent I/O errors when running df/ls on
any NFS mounts, without anything being logged in syslog.  After putting nfs
and rpc in debug mode by running:

rpcdebug -m nfs -s all
rpcdebug -m rpc -s all

I now see the following errors in my logs:

..snip..
Aug  8 02:32:56 reshpc115 kernel: RPC:  2440 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:32:56 reshpc115 kernel: nfs_statfs: statfs error = 5
Aug  8 02:32:59 reshpc115 kernel: RPC:  2441 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:32:59 reshpc115 kernel: nfs_statfs: statfs error = 5
Aug  8 02:47:59 reshpc115 kernel: RPC:  2447 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:47:59 reshpc115 kernel: nfs_statfs: statfs error = 5
Aug  8 02:57:59 reshpc115 kernel: RPC:  2451 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:57:59 reshpc115 kernel: nfs_statfs: statfs error = 5
Aug  8 02:58:00 reshpc115 kernel: RPC:  2452 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:58:00 reshpc115 kernel: nfs_statfs: statfs error = 5
Aug  8 02:58:13 reshpc115 kernel: RPC:  2453 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:58:13 reshpc115 kernel: nfs_statfs: statfs error = 5
Aug  8 02:58:26 reshpc115 kernel: RPC:  2454 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:58:26 reshpc115 kernel: nfs_statfs: statfs error = 5
Aug  8 02:58:30 reshpc115 kernel: RPC:  2455 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:58:30 reshpc115 kernel: nfs_statfs: statfs error = 5
Aug  8 02:58:32 reshpc115 kernel: RPC:  2456 xprt_connect_status: error 99
connecting to server nas-rwc-is2
Aug  8 02:58:32 reshpc115 kernel: nfs_statfs: statfs error = 5
..snip..

I am using all supported packages/kernels for Lustre, and on servers without
the Lustre client installed I have no issues with NFS.  Does the interval
between these errors mean anything?

Any help would be greatly appreciated.

Thanks,
-J

--
reshpc115:~ # uname -a
Linux reshpc115 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200
x86_64 x86_64 x86_64 GNU/Linux
reshpc115:~ # rpm -qa | grep -i lustre
lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
reshpc115:~ # rpm -qa | grep -i kernel-ib
kernel-ib-1.4.2-2.6.27.29_0.1_default
--
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Weird behavior on lustre clients

2010-08-08 Thread Jagga Soorma
One other piece of information: it seems like I have found a workaround by
adding a cron job that runs a df command every 2 minutes.  Is there some
caching issue that might be caused by Lustre?
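
For reference, the kind of cron entry I added looks roughly like this (a
sketch; the file name and redirection are placeholders):

--
# /etc/cron.d/nfs-df-workaround
*/2 * * * * root /bin/df -k > /dev/null 2>&1
--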

Thanks,
-J

On Sun, Aug 8, 2010 at 3:15 AM, Jagga Soorma  wrote:

> Hello,
>
> I am experiencing some weird behavior on my lustre clients.  I have worked
> with Novell support and they keeping pointing to lustre as the culprit for
> these issues.  I am getting intermittent I/O errors when running df/ls on
> any nfs mounts without anything being logged in syslog.  After putting nfs
> and rpc in debug mode by running:
>
> rpcdebug -m nfs -s all
> rpcdebug -m rpc -s all
>
> I now see the following errors in my logs:
>
> ..snip..
> Aug  8 02:32:56 reshpc115 kernel: RPC:  2440 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:32:56 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:32:59 reshpc115 kernel: RPC:  2441 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:32:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:47:59 reshpc115 kernel: RPC:  2447 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:47:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:57:59 reshpc115 kernel: RPC:  2451 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:57:59 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:00 reshpc115 kernel: RPC:  2452 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:58:00 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:13 reshpc115 kernel: RPC:  2453 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:58:13 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:26 reshpc115 kernel: RPC:  2454 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:58:26 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:30 reshpc115 kernel: RPC:  2455 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:58:30 reshpc115 kernel: nfs_statfs: statfs error = 5
> Aug  8 02:58:32 reshpc115 kernel: RPC:  2456 xprt_connect_status: error 99
> connecting to server nas-rwc-is2
> Aug  8 02:58:32 reshpc115 kernel: nfs_statfs: statfs error = 5
> ..snip..
>
> I am using all supported packages/kernels for lustre and on servers without
> the lustre clients installed I have no issues with nfs.  Does the interval
> between these errors mean anything?
>
> Any help would be greatly appreciated.
>
> Thanks,
> -J
>
> --
> reshpc115:~ # uname -a
> Linux reshpc115 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200
> x86_64 x86_64 x86_64 GNU/Linux
> reshpc115:~ # rpm -qa | grep -i lustre
> lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> reshpc115:~ # rpm -qa | grep -i kernel-ib
> kernel-ib-1.4.2-2.6.27.29_0.1_default
> --
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Weird behavior on lustre clients

2010-08-08 Thread Jagga Soorma
Andreas,

Yes, these I/O errors are for NFS filesystems mounted on all of the Lustre
clients.  Even though the NFS mount has nothing to do with Lustre, something
specific to the Lustre clients with the kernel-ib and lustre-client modules
installed seems to be causing this problem.

I believe Lustre caches data locally and then flushes it out on a regular
basis, but I don't know enough to rule Lustre out.  It looks like this issue
is happening every 8-10 minutes.  Is there something that Lustre is doing on
the system that might be flushing some type of cache or might be causing
this problem?  If I do a df every 5 minutes or so then I never see this
problem.

I have just run out of things to try and wanted to check the lustre route as
a last resort in hopes of getting more information that might help me find a
permanent solution for this issue.

Any assistance/comments would be appreciated.

Thanks,
-J

On Sun, Aug 8, 2010 at 6:53 PM, Andreas Dilger wrote:

> On 2010-08-08, at 16:44, Jagga Soorma wrote:
> > One other piece of information.  It seems like I have found a workaround
> by adding a cronjob that runs every 2mins and runs a df command.  Is there
> some caching issue that might be caused by lustre?
>
> Are the IO errors on NFS filesystems that have nothing to do with Lustre,
> or is this from NFS re-exporting of a Lustre filesystem?
>
> >> I am experiencing some weird behavior on my lustre clients.  I have
> worked with Novell support and they keeping pointing to lustre as the
> culprit for these issues.  I am getting intermittent I/O errors when running
> df/ls on any nfs mounts without anything being logged in syslog.  After
> putting nfs and rpc in debug mode by running:
> >
> > I am using all supported packages/kernels for lustre and on servers
> without the lustre clients installed I have no issues with nfs.  Does the
> interval between these errors mean anything?
> >
> > Any help would be greatly appreciated.
> >
> > reshpc115:~ # uname -a
> > Linux reshpc115 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200
> x86_64 x86_64 x86_64 GNU/Linux
> > reshpc115:~ # rpm -qa | grep -i lustre
> > lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> > lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> > reshpc115:~ # rpm -qa | grep -i kernel-ib
> > kernel-ib-1.4.2-2.6.27.29_0.1_default
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Version mismatch of Lustre client and server

2010-08-11 Thread Jagga Soorma
Hello,

I am planning on deploying a few more clients in my Lustre environment and
was wondering which client version to install.  I know it is okay to run a
newer client version than your Lustre server for upgrade purposes.  However,
would it be okay to stay in this state for a longer period of time (for the
life of this filesystem)?  My Lustre server is currently running 1.8.1.1 on
RHEL 5.3 and I am planning on deploying 1.8.2 on SLES 11.  I am trying to
stay as close as possible to 1.8.1.1, which is why I am planning on
installing the 1.8.2 version of the Lustre client.  Are there many changes
between 1.8.2 and 1.8.4?

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Version mismatch of Lustre client and server

2010-08-12 Thread Jagga Soorma
>
> > My lustre server is currently running 1.8.1.1 on RHEL 5.3 and I am
> planning on deploying 1.8.2 on SLES 11.  Also, I am trying to stay as close
> as possible to 1.8.1.1
>
> Why, if I might ask?
>

To introduce the least amount of change in this environment.  The only
reason I am updating the kernel for these new nodes is that we have been
unable to solve the non-Lustre-specific NFS timeouts on the compute nodes,
which seem to be related to some RPC bugs that might be fixed in the new
release.

You don't see any issues with me jumping directly to 1.8.4?  I just thought
it would make sense to go to 1.8.2 so that there are not too many changes
introduced in my environment, which has been stable so far with the
exception of this NFS issue.

Thanks,
-Simran


>
> > which is why I am planning on installing the 1.8.2 version of lustre
> client.
> >  Are they many changes between 1.8.2 and 1.8.4.
>
> 1.8.x is only getting bug fixes at this point.  As with any release, you
> can see the list of changes in 1.8.4 in the {lustre,lnet,ldiskfs}/ChangeLog
> files.  That said, 1.8.4 does have a fair number of bug fixes brought in
> from Cray and LLNL, so I would recommend everyone to use it (when it finally
> appears on the download site).
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Technical Lead
> Oracle Corporation Canada Inc.
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-08-26 Thread Jagga Soorma
Hi Dmitry,

I am still running into this issue on some nodes:

client109: ll_obdo_cache  0 152914489208   191 : tunables
 120   608 : slabdata  0 8048131  0
client102: ll_obdo_cache  0 308526883208   191 : tunables
 120   608 : slabdata  0 16238257  0

How can I calculate how much memory this is holding on to?  My system shows
a lot of memory being used, but none of the jobs are using that much memory.
Also, these clients are running an SMP SLES 11 kernel but I can't find any
/sys/kernel/slab directory.
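
In case it is useful, the arithmetic I am trying would be roughly this (a
sketch based on the /proc/slabinfo 2.1 column layout, assuming 4KB pages;
field positions are taken from the slabinfo header):

--
# num_slabs ($15) * pagesperslab ($6) * 4KB = memory held by the empty slab,
# e.g. for client102: 16238257 * 1 * 4KB is roughly 62GB
awk '/^ll_obdo_cache /{printf "%.1f GiB\n", $15 * $6 * 4 / 1048576}' /proc/slabinfo
--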

Linux client102 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200
x86_64 x86_64 x86_64 GNU/Linux

What makes you say that this does not look like a Lustre memory leak?  I
thought all the ll_* objects in slabinfo are Lustre-related?  To me it looks
like Lustre is holding on to this memory, but I don't know much about Lustre
internals.

Also, memused on these systems are:

client102: 2353666940
client109: 2421645924

Any help would be greatly appreciated.

Thanks,
-J

On Wed, May 19, 2010 at 8:08 AM, Dmitry Zogin wrote:

>  Hello Jagga,
>
> I checked the data, and indeed this does not look like a lustre memory
> leak, rather than a slab fragmentation, which assumes there might be a
> kernel issue here. From the slabinfo (I only keep three first columns here):
>
>
> name 
> ll_obdo_cache  0 452282156208
>
> means that there are no active objects, but the memory pages are not
> released back from slab allocator to the free pool (the num value is huge).
> That looks like a slab fragmentation - you can get more description at
> http://kerneltrap.org/Linux/Slab_Defragmentation
>
> Checking your mails, I wonder if this only happens on clients which have
> SLES11 installed? As the RAM size is around 192Gb, I assume they are NUMA
> systems?
> If so, SLES11 has defrag_ratio tunables in /sys/kernel/slab/xxx
> From the source of get_any_partial()
>
> #ifdef CONFIG_NUMA
>
> /*
>  * The defrag ratio allows a configuration of the tradeoffs between
>  * inter node defragmentation and node local allocations. A lower
>  * defrag_ratio increases the tendency to do local allocations
>  * instead of attempting to obtain partial slabs from other nodes.
>  *
>  * If the defrag_ratio is set to 0 then kmalloc() always
>  * returns node local objects. If the ratio is higher then
> kmalloc()
>  * may return off node objects because partial slabs are obtained
>  * from other nodes and filled up.
>  *
>  * If /sys/kernel/slab/xx/defrag_ratio is set to 100 (which makes
>  * defrag_ratio = 1000) then every (well almost) allocation will
>  * first attempt to defrag slab caches on other nodes. This means
>  * scanning over all nodes to look for partial slabs which may be
>  * expensive if we do it every time we are trying to find a slab
>  * with available objects.
>  */
>
> Could you please verify that your clients have defrag_ratio tunable and try
> to use various values?
> It looks like the value of 100 should be the best, unless there is a bug,
> then may be even 0 gets the desired result?
>
> Best regards,
> Dmitry
>
>
> Jagga Soorma wrote:
>
> Hi Johann,
>
> I am actually using 1.8.1 and not 1.8.2:
>
> # rpm -qa | grep -i lustre
> lustre-client-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
> lustre-client-modules-1.8.1.1-2.6.27.29_0.1_lustre.1.8.1.1_default
>
> My kernel version on the SLES 11 clients is:
> # uname -r
> 2.6.27.29-0.1-default
>
> My kernel version on the RHEL 5.3 mds/oss servers is:
> # uname -r
> 2.6.18-128.7.1.el5_lustre.1.8.1.1
>
> Please let me know if you need any further information.  I am still trying
> to get the user to help me run his app so that I can run the leak finder
> script to capture more information.
>
> Regards,
> -Simran
>
> On Tue, Apr 27, 2010 at 7:20 AM, Johann Lombardi  wrote:
>
>> Hi,
>>
>> On Tue, Apr 20, 2010 at 09:08:25AM -0700, Jagga Soorma wrote:
>>  > Thanks for your response.* I will try to run the leak-finder script
>> and
>> > hopefully it will point us in the right direction.* This only seems to
>> be
>> > happening on some of my clients:
>>
>>  Could you please tell us what kernel you use on the client side?
>>
>>  >client104: ll_obdo_cache* 0 433506280*** 208** 19*** 1 :
>> tunables*
>> >120** 60*** 8 : slabdata* 0 22816120* 0
>> >client116: ll_obdo_cache* 0 457366746*** 208** 19*** 1 :
>> tunables*
>> >120** 60*** 8 : slabdata* 0 

Re: [Lustre-discuss] Lustre Client - Memory Issue

2010-08-31 Thread Jagga Soorma
It looks like 1.8.4 is the most recent stable release in the 1.8.x series,
so I will plan on upgrading to it and see if this resolves my memory
leak.  Is there a reason why SLES 11 SP1 is not being tested for these new
Lustre clients?  Why is the kernel for SLES 11 staying at 2.6.27.39-0.3.1?

Thanks,
-Simran

On Mon, Aug 30, 2010 at 6:50 PM, Dmitry Zogin wrote:

>  Actually there was a bug fixed in 1.8.4 when obdo structures can be
> allocated and freed outside of OBDO_ALLOC/OBDO_FREE macros. That could lead
> to the slab fragmentation and pseudo-leak.
> The patch is in the attachment 30664 for  bz 21980
>
> Dmitry
>
>
> Andreas Dilger wrote:
>
> On 2010-08-26, at 18:42, Jagga Soorma wrote:
>
>
>  I am still running into this issue on some nodes:
>
> client109: ll_obdo_cache  0 152914489208   191 : tunables  
> 120   608 : slabdata  0 8048131  0
> client102: ll_obdo_cache  0 308526883208   191 : tunables  
> 120   608 : slabdata  0 16238257  0
>
> How can I calculate how much memory this is holding on to.
>
>
>  If you do "head -1 /proc/slabinfo" it reports the column descriptions.
>
> The "slabdata" will section reports numslabs=16238257, and pagesperslab=1, so 
> tis is 16238257 pages of memory, or about 64GB of RAM on client102.  Ouch.
>
>
>
>   My system shows a lot of memory that is being used up but none of the jobs 
> are using that much memory.  Also, these clients are running a smp sles 11 
> kernel but I can't find any /sys/kernel/slab directory.
>
> Linux client102 2.6.27.29-0.1-default #1 SMP 2009-08-15 17:53:59 +0200 x86_64 
> x86_64 x86_64 GNU/Linux
>
> What makes you say that this does not look like a lustre memory leak?  I 
> thought all the ll_* objects in slabinfo are lustre related?
>
>
>  It's true that the ll_obdo_cache objects are allocated by Lustre, but the 
> above data shows 0 of those objects in use, so the kernel _should_ be freeing 
> the unused slab objects.  This particular data type (obdo) is only ever in 
> use temporarily during system calls on the client, and should never be 
> allocated for a long time.
>
> For some reason the kernel is not freeing the empty slab pages.  That is the 
> responsibility of the kernel, and not Lustre.
>
>
>
>   To me it looks like lustre is holding on to this memory but I don't know 
> much about lustre internals.
>
> Also, memused on these systems are:
>
> client102: 2353666940
> client109: 2421645924
>
>
>  This shows that Lustre is actively using about 2.4GB of memory allocations.  
> It is not tracking the 64GB of memory in the obdo_cache slab, because it has 
> freed that memory (even though the kernel has not freed those pages).
>
>
>
>  Any help would be greatly appreciated.
>
>
>  The only suggestion I have is that if you unmount Lustre and unload the 
> modules (lustre_rmmod) it will free up this memory.  Otherwise, searching for 
> problems with the slab cache on this kernel may turn up something.
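If a node can be drained of jobs, a minimal way to confirm that is to unmount and
unload, then re-check the slab; a sketch, where /reshpcfs stands in for whatever
mount point the client actually uses:

--
umount /reshpcfs                      # unmount the Lustre client
lustre_rmmod                          # unload the Lustre/LNET modules
grep ll_obdo_cache /proc/slabinfo     # the slab, and its pinned pages, should now be gone
--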
>
>
>
>  On Wed, May 19, 2010 at 8:08 AM, Dmitry Zogin  
>  wrote:
> Hello Jagga,
>
> I checked the data, and indeed this does not look like a lustre memory leak, 
> rather than a slab fragmentation, which assumes there might be a kernel issue 
> here. From the slabinfo (I only keep three first columns here):
>
>
> name           <active_objs> <num_objs> <objsize>
> ll_obdo_cache  0             452282156  208
>
> means that there are no active objects, but the memory pages are not released 
> back from slab allocator to the free pool (the num value is huge). That looks 
> like a slab fragmentation - you can get more description at 
> http://kerneltrap.org/Linux/Slab_Defragmentation
>
> Checking your mails, I wonder if this only happens on clients which have  
> SLES11 installed? As the RAM size is around 192Gb, I assume they are NUMA 
> systems?
> If so, SLES11 has defrag_ratio tunables in /sys/kernel/slab/xxx
> From the source of get_any_partial()
>
> #ifdef CONFIG_NUMA
>
> /*
>  * The defrag ratio allows a configuration of the tradeoffs between
>  * inter node defragmentation and node local allocations. A lower
>  * defrag_ratio increases the tendency to do local allocations
>  * instead of attempting to obtain partial slabs from other nodes.
>  *
>  * If the defrag_ratio is set to 0 then kmalloc() always
>  * returns node local objects. If the ratio is higher then kmalloc()
>  * may return off node objects because partial slabs are obtained
>  * from other nodes and filled up.
>  *
>  * If /sys/kernel/slab/xx/defrag_ratio is set to 100 (which makes

[Lustre-discuss] Issues with Lustre Client 1.8.4 and Server 1.8.1.1

2010-10-13 Thread Jagga Soorma
Hey Guys,

I have 16 clients running lustre 1.8.1.1 and 8 new clients running 1.8.4.
My server is still running lustre 1.8.1.1 (RHEL 5.3).  I just deployed these
new nodes a few weeks ago and have started seeing some user processes just
go into an uninteruptable state.  When this same workload is performed on
the 1.8.1.1 clients it runs fine, but when we run it in our 1.8.4 clients we
start seeing this issue.  All my clients are setup with SLES11 and the same
packages with the exception of a newer kernel in the 1.8.4 environment due
to the lustre dependency:

reshpc208:~ # uname -a
Linux reshpc208 2.6.27.39-0.3-default #1 SMP 2009-11-23 12:57:38 +0100
x86_64 x86_64 x86_64 GNU/Linux
reshpc208:~ # rpm -qa | grep -i lustre
lustre-client-modules-1.8.4-2.6.27_39_0.3_lustre.1.8.4_default
lustre-client-1.8.4-2.6.27_39_0.3_lustre.1.8.4_default
reshpc208:~ # rpm -qa | grep -i kernel-ib
kernel-ib-1.5.1-2.6.27.39_0.3_default

Doing a ps just hangs on the system and I need to just close and reopen a
session to the affected system.  The application (gsnap) is running from the
lustre filesystem and doing all IO to the lustre fs.  Here is a strace of
where ps hangs:

--
output from "strace ps -ef"

doing an ls in /proc/9598 just hangs the session as well

..snip..
open("/proc/9597/cmdline", O_RDONLY)= 6
read(6, "sh\0-c\0/gne/home/coryba/bin/gsnap"..., 2047) = 359
close(6)= 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
write(1, "degenhj2  9597  9595  0 11:35 ? "..., 130degenhj2  9597  9595  0
11:35 ?00:00:00 sh -c /gne/home/coryba/bin/gsnap -M 3 -t 16 -m 3 -n
1 -d mm9 -e 1000 -E 1000 --pa
) = 130
stat("/proc/9598", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
open("/proc/9598/stat", O_RDONLY)   = 6
read(6, "9598 (gsnap) S 9596 9589 9589 0 "..., 1023) = 254
close(6)= 0
open("/proc/9598/status", O_RDONLY) = 6
read(6, "Name:\tgsnap\nState:\tS (sleeping)\n"..., 1023) = 1023
close(6)= 0
open("/proc/9598/cmdline", O_RDONLY)= 6
read(6,
--

The "t" before "gsnap" is part of "\t", or a "tab" character.  It looks like
GSNAP was trying to open a file or read from it.
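When ps itself wedges like this, the stuck task's state can still be read directly
from /proc, and the kernel can be asked to log backtraces of all blocked tasks; a
sketch, assuming the kernel was built with CONFIG_MAGIC_SYSRQ:

--
cat /proc/9598/stat            # 3rd field: S = sleeping, D = uninterruptible
cat /proc/9598/wchan; echo     # kernel function the task is waiting in
echo w > /proc/sysrq-trigger   # dump blocked-task backtraces to the kernel log
dmesg | tail -100
--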

I don't see any recent lustre specific errors in my logs (The ones from Oct
10th are expected):

--
..snip..
Oct 10 12:52:01 reshpc208 kernel: Lustre:
12933:0:(import.c:517:import_select_connection())
reshpcfs-MDT-mdc-88200d5d2400: tried all connections, increasing
latency to 2s
Oct 10 12:52:01 reshpc208 kernel: Lustre:
12933:0:(import.c:517:import_select_connection()) Skipped 2 previous similar
messages
Oct 10 12:52:29 reshpc208 kernel: LustreError: 166-1: mgc10.0.250...@o2ib3:
Connection to service MGS via nid 10.0.250...@o2ib3 was lost; in progress
operations using this service will fail.
Oct 10 12:52:43 reshpc208 kernel: Lustre:
12932:0:(import.c:855:ptlrpc_connect_interpret()) m...@10.0.250.44@o2ib3
changed server handle from 0x816b8508159f149f to 0x816b850815ae68ab
Oct 10 12:52:43 reshpc208 kernel: Lustre: mgc10.0.250...@o2ib3: Reactivating
import
Oct 10 12:52:43 reshpc208 kernel: Lustre: mgc10.0.250...@o2ib3: Connection
restored to service MGS using nid 10.0.250...@o2ib3.
Oct 10 12:52:43 reshpc208 kernel: Lustre: Skipped 1 previous similar message
Oct 10 12:52:45 reshpc208 kernel: LustreError: 11-0: an error occurred while
communicating with 10.0.250...@o2ib3. The obd_ping operation failed with
-107
--

Again we don't have any issues on our 1.8.1.1 clients and this seems to be
only happening on our 1.8.4 clients.  Any assistance would be greatly
appreciated.

Has anyone seen anything similar to this?  Should I just revert back to
1.8.1.1 on these new nodes?  When is 1.8.5 supposed to come out?  I would
prefer to jump to SLES 11 SP1.

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Issues with Lustre Client 1.8.4 and Server 1.8.1.1

2010-10-13 Thread Jagga Soorma
Okay so one thing I noticed on both instances is that there was a Metadata
server outage a few days before the users complained about these issues.
The clients reestablished the connection once the metadata services were
brought back online.  But my understanding was that the processes would just
hang while the storage is unavailable.  But it should be fine once the
lustre filesystem was made available again.  Am I incorrect in this
assumption?  Could this have led to these processes being in this hung
state?  Again, it does not seem like all processes across all nodes were
affected.

Thanks,
-Simran

On Wed, Oct 13, 2010 at 2:33 PM, Jagga Soorma  wrote:

> Hey Guys,
>
> I have 16 clients running lustre 1.8.1.1 and 8 new clients running 1.8.4.
> My server is still running lustre 1.8.1.1 (RHEL 5.3).  I just deployed these
> new nodes a few weeks ago and have started seeing some user processes just
> go into an uninteruptable state.  When this same workload is performed on
> the 1.8.1.1 clients it runs fine, but when we run it in our 1.8.4 clients we
> start seeing this issue.  All my clients are setup with SLES11 and the same
> packages with the exception of a newer kernel in the 1.8.4 environment due
> to the lustre dependency:
>
> reshpc208:~ # uname -a
> Linux reshpc208 2.6.27.39-0.3-default #1 SMP 2009-11-23 12:57:38 +0100
> x86_64 x86_64 x86_64 GNU/Linux
> reshpc208:~ # rpm -qa | grep -i lustre
> lustre-client-modules-1.8.4-2.6.27_39_0.3_lustre.1.8.4_default
> lustre-client-1.8.4-2.6.27_39_0.3_lustre.1.8.4_default
> reshpc208:~ # rpm -qa | grep -i kernel-ib
> kernel-ib-1.5.1-2.6.27.39_0.3_default
>
> Doing a ps just hangs on the system and I need to just close and reopen a
> session to the effected system.  The application (gsnap) is running from the
> lustre filesystem and doing all IO to the lustre fs.  Here is a strace of
> where ps hangs:
>
> --
> output from "strace ps -ef"
>
> doing an ls in /proc/9598 just hangs the session as well
>
> ..snip..
> open("/proc/9597/cmdline", O_RDONLY)= 6
> read(6, "sh\0-c\0/gne/home/coryba/bin/gsnap"..., 2047) = 359
> close(6)= 0
> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
> write(1, "degenhj2  9597  9595  0 11:35 ? "..., 130degenhj2  9597  9595  0
> 11:35 ?00:00:00 sh -c /gne/home/coryba/bin/gsnap -M 3 -t 16 -m 3 -n
> 1 -d mm9 -e 1000 -E 1000 --pa
> ) = 130
> stat("/proc/9598", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
> open("/proc/9598/stat", O_RDONLY)   = 6
> read(6, "9598 (gsnap) S 9596 9589 9589 0 "..., 1023) = 254
> close(6)= 0
> open("/proc/9598/status", O_RDONLY) = 6
> read(6, "Name:\tgsnap\nState:\tS (sleeping)\n"..., 1023) = 1023
> close(6)= 0
> open("/proc/9598/cmdline", O_RDONLY)= 6
> read(6,
> --
>
> The "t" before "gsnap" is part of "\t", or a "tab" character.  It looks
> like GSNAP was trying to open a file or read from it.
>
> I don't see any recent lustre specific errors in my logs (The ones from Oct
> 10th are expected):
>
> --
> ..snip..
> Oct 10 12:52:01 reshpc208 kernel: Lustre:
> 12933:0:(import.c:517:import_select_connection())
> reshpcfs-MDT-mdc-88200d5d2400: tried all connections, increasing
> latency to 2s
> Oct 10 12:52:01 reshpc208 kernel: Lustre:
> 12933:0:(import.c:517:import_select_connection()) Skipped 2 previous similar
> messages
> Oct 10 12:52:29 reshpc208 kernel: LustreError: 166-1: mgc10.0.250...@o2ib3:
> Connection to service MGS via nid 10.0.250...@o2ib3 was lost; in progress
> operations using this service will fail.
> Oct 10 12:52:43 reshpc208 kernel: Lustre:
> 12932:0:(import.c:855:ptlrpc_connect_interpret()) m...@10.0.250.44@o2ib3
> changed server handle from 0x816b8508159f149f to 0x816b850815ae68ab
> Oct 10 12:52:43 reshpc208 kernel: Lustre: mgc10.0.250...@o2ib3:
> Reactivating import
> Oct 10 12:52:43 reshpc208 kernel: Lustre: mgc10.0.250...@o2ib3: Connection
> restored to service MGS using nid 10.0.250...@o2ib3.
> Oct 10 12:52:43 reshpc208 kernel: Lustre: Skipped 1 previous similar
> message
> Oct 10 12:52:45 reshpc208 kernel: LustreError: 11-0: an error occurred
> while communicating with 10.0.250...@o2ib3. The obd_ping operation failed
> with -107
> --
>
> Again we don't have any issues on our 1.8.1.1 client's and this seems to be
> only happening on our 1.8.4 clients.  Any assistance would be greatly
> appreciated.
>
> Has anyone seen anything similar to this?  Should I just revert back to
> 1.8.1.1 on these new nodes?  When is 1.8.5 supposed to come out?  I would
> prefer to jump to SLES 11 SP1.
>
> Thanks,
> -J
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Issues with Lustre Client 1.8.4 and Server 1.8.1.1

2010-10-13 Thread Jagga Soorma
Robin,

I don't believe that is the cause, otherwise I would be seeing this on my
older 1.8.1.1 clients as well.  Yes the /proc/sys/vm/zone_reclaim_mode is
set to 0 on all of my old/new clients.  I did say that I only changed the
lustre client rpms as well as the kernel due to the lustre clients
dependency.  Other than that all packages are the same.

Another thing to note is that my user home directories are sitting in the
lustre filesystem and we did have a failover occur over the past weekend.
Not sure if this would cause any issues but I did see the lustre client
re-establish a connection with the lustre servers and did not see any other
lustre specific errors after this.

-J

>On Wed, Oct 13, 2010 at 02:33:35PM -0700, Jagga Soorma wrote:
>>Doing a ps just hangs on the system and I need to just close and reopen a
>>session to the effected system.  The application (gsnap) is running from
the
>>lustre filesystem and doing all IO to the lustre fs.  Here is a strace of
>>where ps hangs:

>one possible cause of hung processes (that's not Lustre related) is the
>VM tying itself in knots. are your clients NUMA machines?
>is /proc/sys/vm/zone_reclaim_mode = 0?

>I guess this explanation is a bit unlikely if your only change is the
>client kernel version, but you don't say what you changed it from and
>I'm not familiar with SLES, so the possibility is there, and it's an
>easy fix (or actually a dodgy workaround) if that's the problem.
>--
>Dr Robin Humble, HPC Systems Analyst, NCI National Facility

On Wed, Oct 13, 2010 at 3:46 PM, Jagga Soorma  wrote:

> Okay so one thing I noticed on both instances is that there was a Metadata
> server outage a few days before the users complained about these issues.
> The clients reestablished the connection once the metadata services were
> brought back online.  But my understanding was that the processes would just
> hang while the storage is unavailable.  But it should be fine once the
> lustre filesystem was made available again.  Am I incorrect in this
> assumption?  Could this have led to these processes being in this hung
> state?  Again, it does not seem like all processes across all nodes were
> effected.
>
> Thanks,
> -Simran
>
>
> On Wed, Oct 13, 2010 at 2:33 PM, Jagga Soorma  wrote:
>
>> Hey Guys,
>>
>> I have 16 clients running lustre 1.8.1.1 and 8 new clients running 1.8.4.
>> My server is still running lustre 1.8.1.1 (RHEL 5.3).  I just deployed these
>> new nodes a few weeks ago and have started seeing some user processes just
>> go into an uninteruptable state.  When this same workload is performed on
>> the 1.8.1.1 clients it runs fine, but when we run it in our 1.8.4 clients we
>> start seeing this issue.  All my clients are setup with SLES11 and the same
>> packages with the exception of a newer kernel in the 1.8.4 environment due
>> to the lustre dependency:
>>
>> reshpc208:~ # uname -a
>> Linux reshpc208 2.6.27.39-0.3-default #1 SMP 2009-11-23 12:57:38 +0100
>> x86_64 x86_64 x86_64 GNU/Linux
>> reshpc208:~ # rpm -qa | grep -i lustre
>> lustre-client-modules-1.8.4-2.6.27_39_0.3_lustre.1.8.4_default
>> lustre-client-1.8.4-2.6.27_39_0.3_lustre.1.8.4_default
>> reshpc208:~ # rpm -qa | grep -i kernel-ib
>> kernel-ib-1.5.1-2.6.27.39_0.3_default
>>
>> Doing a ps just hangs on the system and I need to just close and reopen a
>> session to the effected system.  The application (gsnap) is running from the
>> lustre filesystem and doing all IO to the lustre fs.  Here is a strace of
>> where ps hangs:
>>
>> --
>> output from "strace ps -ef"
>>
>> doing an ls in /proc/9598 just hangs the session as well
>>
>> ..snip..
>> open("/proc/9597/cmdline", O_RDONLY)= 6
>> read(6, "sh\0-c\0/gne/home/coryba/bin/gsnap"..., 2047) = 359
>> close(6)= 0
>> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
>> stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=2819, ...}) = 0
>> write(1, "degenhj2  9597  9595  0 11:35 ? "..., 130degenhj2  9597  9595  0
>> 11:35 ?00:00:00 sh -c /gne/home/coryba/bin/gsnap -M 3 -t 16 -m 3 -n
>> 1 -d mm9 -e 1000 -E 1000 --pa
>> ) = 130
>> stat("/proc/9598", {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
>> open("/proc/9598/stat", O_RDONLY)   = 6
>> read(6, "9598 (gsnap) S 9596 9589 9589 0 "..., 1023) = 254
>> close(6)= 0
>> open("/proc/9598/status", O_RDONLY) = 6
>> read(6, "Name:\tgsnap\nState:\tS (sleeping)\n"..., 1023) = 1023
>

Re: [Lustre-discuss] Issues with Lustre Client 1.8.4 and Server 1.8.1.1

2010-10-19 Thread Jagga Soorma
Hey Robin,

We are still looking into this issue and trying to figure out what is causing
this problem.  We are using a application called gsnap that does use openmpi
and RMPI.  The next time this happens I will definitely look at lsof and see
if there are any /dev/shm related entries.  Will report back with more
information.

What os are you guys using on your clients?  What did you guys end up doing
for the long term fix of this issue?  I am thinking of downgrading to
2.6.27.29-0.1 kernel and 1.8.1.1 lustre client.

Regards,
-J
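For the record, the two checks being discussed are both standard tools, though lsof
can take a while on a loaded node:

--
lsof +D /dev/shm    # anything holding POSIX shared memory open under /dev/shm
ipcs -ma            # SysV segments; the command that unstuck Robin's D-state processes
--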

On Tue, Oct 19, 2010 at 9:48 PM, Robin Humble <
robin.humble+lus...@anu.edu.au > wrote:

> Hi Jagga,
>
> On Wed, Oct 13, 2010 at 02:33:35PM -0700, Jagga Soorma wrote:
> ..
> >start seeing this issue.  All my clients are setup with SLES11 and the
> same
> >packages with the exception of a newer kernel in the 1.8.4 environment due
> >to the lustre dependency:
> >
> >reshpc208:~ # uname -a
> >Linux reshpc208 2.6.27.39-0.3-default #1 SMP 2009-11-23 12:57:38 +0100
> x86_64 x86_64 x86_64 GNU/Linux
> ...
> >open("/proc/9598/stat", O_RDONLY)   = 6
> >read(6, "9598 (gsnap) S 9596 9589 9589 0 "..., 1023) = 254
> >close(6)= 0
> >open("/proc/9598/status", O_RDONLY) = 6
> >read(6, "Name:\tgsnap\nState:\tS (sleeping)\n"..., 1023) = 1023
> >close(6)= 0
> >open("/proc/9598/cmdline", O_RDONLY)= 6
> >read(6,
>
> did you get any further with this?
>
> we've just seen something similar in that we had D state hung processes
> and a strace of ps hung at the same place.
>
> in the end our hang appeared to be /dev/shm related, and an 'ipcs -ma'
> magically caused all the D state processes to continue... we don't have
> a good idea why this might be. looks kinda like a generic kernel shm
> deadlock, possibly unrelated to Lustre.
>
> sys_shmdt features in the hung process tracebacks that the kernel
> prints out.
>
> if you do 'lsof' do you see lots of /dev/shm entries for your app?
> the app we saw run into trouble was using HPMPI which is common in
> commercial packages. does gsnap use HPMPI?
>
> we are running vanilla 2.6.32.* kernels with Lustre 1.8.4 clients on
> this cluster.
>
> cheers,
> robin
> --
> Dr Robin Humble, HPC Systems Analyst, NCI National Facility
> ___
> Lustre-discuss mailing list
> Lustre-discuss@lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
> --
> You received this message because you are subscribed to the Google Groups
> "lustre-discuss-list" group.
> To post to this group, send email to lustre-discuss-l...@googlegroups.com.
> To unsubscribe from this group, send email to
> lustre-discuss-list+unsubscr...@googlegroups.com
> .
> For more options, visit this group at
> http://groups.google.com/group/lustre-discuss-list?hl=en.
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Issues with Lustre Client 1.8.4 and Server 1.8.1.1

2010-10-20 Thread Jagga Soorma
Robin,

This does not seem to help us at all.  I was not able to find any /dev/shm
related messages in the lsof output and after running 'ipcs -ma' my gsnap
went from a D state to a S state.  However, now my nscd daemon has entered
the D state.

--
# ipcs -ma

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch
status

-- Semaphore Arrays 
keysemid  owner  perms  nsems

-- Message Queues 
keymsqid  owner  perms  used-bytes   messages
--

-J

On Tue, Oct 19, 2010 at 10:05 PM, Jagga Soorma  wrote:

> Hey Robin,
>
> We are still looking into this issue and try to figure out what is causing
> this problem.  We are using a application called gsnap that does use openmpi
> and RMPI.  The next time this happens I will definitely look at lsof and see
> if there are any /dev/shm related entries.  Will report back with more
> information.
>
> What os are you guys using on your clients?  What did you guys end up doing
> for the long term fix of this issue?  I am thinking of downgrading to
> 2.6.27.29-0.1 kernel and 1.8.1.1 lustre client.
>
> Regards,
> -J
>
>
> On Tue, Oct 19, 2010 at 9:48 PM, Robin Humble <
> robin.humble+lus...@anu.edu.au > wrote:
>
>> Hi Jagga,
>>
>> On Wed, Oct 13, 2010 at 02:33:35PM -0700, Jagga Soorma wrote:
>> ..
>> >start seeing this issue.  All my clients are setup with SLES11 and the
>> same
>> >packages with the exception of a newer kernel in the 1.8.4 environment
>> due
>> >to the lustre dependency:
>> >
>> >reshpc208:~ # uname -a
>> >Linux reshpc208 2.6.27.39-0.3-default #1 SMP 2009-11-23 12:57:38 +0100
>> x86_64 x86_64 x86_64 GNU/Linux
>> ...
>> >open("/proc/9598/stat", O_RDONLY)   = 6
>> >read(6, "9598 (gsnap) S 9596 9589 9589 0 "..., 1023) = 254
>> >close(6)= 0
>> >open("/proc/9598/status", O_RDONLY) = 6
>> >read(6, "Name:\tgsnap\nState:\tS (sleeping)\n"..., 1023) = 1023
>> >close(6)= 0
>> >open("/proc/9598/cmdline", O_RDONLY)= 6
>> >read(6,
>>
>> did you get any further with this?
>>
>> we've just seen something similar in that we had D state hung processes
>> and a strace of ps hung at the same place.
>>
>> in the end our hang appeared to be /dev/shm related, and an 'ipcs -ma'
>> magically caused all the D state processes to continue... we don't have
>> a good idea why this might be. looks kinda like a generic kernel shm
>> deadlock, possibly unrelated to Lustre.
>>
>> sys_shmdt features in the hung process tracebacks that the kernel
>> prints out.
>>
>> if you do 'lsof' do you see lots of /dev/shm entries for your app?
>> the app we saw run into trouble was using HPMPI which is common in
>> commercial packages. does gsnap use HPMPI?
>>
>> we are running vanilla 2.6.32.* kernels with Lustre 1.8.4 clients on
>> this cluster.
>>
>> cheers,
>> robin
>> --
>> Dr Robin Humble, HPC Systems Analyst, NCI National Facility
>> ___
>> Lustre-discuss mailing list
>> Lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "lustre-discuss-list" group.
>> To post to this group, send email to lustre-discuss-l...@googlegroups.com
>> .
>> To unsubscribe from this group, send email to
>> lustre-discuss-list+unsubscr...@googlegroups.com
>> .
>> For more options, visit this group at
>> http://groups.google.com/group/lustre-discuss-list?hl=en.
>>
>>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre 1.8.4 client error message

2010-10-20 Thread Jagga Soorma
Hi Guys,

I received the following error message on one of my lustre 1.8.4 clients and
don't see any network related issues on this node.  Just to point out that
my server is still running 1.8.1.1.  Any ideas what this error message is:

--
Oct 19 09:57:08 node20 kernel: LustreError:
29081:0:(file.c:3280:ll_inode_revalidate_fini()) failure -2 inode 427560038
Oct 19 10:06:38 node20 kernel: Lustre: Listener bound to ib0:10.0.250.67:987
:mlx4_0
Oct 19 10:06:38 node20 kernel: LustreError:
8217:0:(o2iblnd_cb.c:2146:kiblnd_passive_connect()) Can't accept
10.0.250...@o2ib3 on NA (ib0:0:10.0.250.67): bad dst nid 10.0.250...@o2ib3
Oct 19 10:06:38 node20 kernel: LustreError:
8250:0:(o2iblnd_cb.c:2146:kiblnd_passive_connect()) Can't accept
10.0.250...@o2ib3 on NA (ib0:0:10.0.250.67): bad dst nid 10.0.250...@o2ib3
Oct 19 10:06:38 node20 kernel: Lustre: Register global MR array, MR size:
0x, array size: 1
Oct 19 10:06:38 node20 kernel: Lustre: Added LNI 10.0.250...@o2ib3[8/64/0/180]
Oct 19 10:07:19 node20 kernel: Lustre: OBD class driver,
http://www.lustre.org/
Oct 19 10:07:19 node20 kernel: Lustre: Lustre Version: 1.8.4
Oct 19 10:07:19 node20 kernel: Lustre: Build Version:
1.8.4-20100726222109-PRISTINE-2.6.27.39-0.3-default
Oct 19 10:07:20 node20 kernel: Lustre: Lustre Client File System;
http://www.lustre.org/
Oct 19 10:07:20 node20 kernel: Lustre: mgc10.0.250...@o2ib3: Reactivating
import
Oct 19 10:07:20 node20 kernel: Lustre: reshpcfs-clilov-888030066400.lov:
set parameter stripesize=1048576
Oct 19 10:07:20 node20 kernel: Lustre: Client reshpcfs-client has started
--

Thanks,
-J
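The "bad dst nid" lines usually indicate that a peer addressed this node by a NID
that does not match what the local LNET layer is configured with; a quick check
sketch (the exact file holding the lnet options line varies by distro):

--
lctl list_nids                                        # NIDs this client actually has
grep -r "options lnet" /etc/modprobe.* 2>/dev/null    # where networks=o2ib3(ib0) is defined
--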
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre install

2010-10-27 Thread Jagga Soorma
Hey Guys,

I have been having a lot of trouble with the sles 11 kernel that lustre
1.8.4 supports.  I tried downgrading the kernel and the lustre client
(1.8.1.1) but the kernel-ib provided modules had some problems with my hca.
The newer kernel-ib package that comes with lustre client 1.8.4 did not have
any issues.  Would it be possible for me to build/install my own ofed
package and install just the lustre client rpms?  Do you see any issues with
this setup?  Or should lustre be built from scratch all together?

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre install

2010-10-27 Thread Jagga Soorma
Thanks Michael for your response.  So if I understand correctly, you have
not had any issues running the stock kernel with the sun/oracle provided
lustre client rpms, and instead of using the kernel-ib package you install
your own ofed packages.

Also, I have the new intel 8 core cpu's and would prefer to go to sles 11 sp
1 instead of sles 11.  However, this is not supported by the lustre client
yet.  What has your experience been with building your own lustre rpm's from
source using a different kernel?  Do you still have to patch the kernel?  I
am also thinking about installing sles 11 sp1 and just building the lustre
client rpm's from source.  Not sure if it is required to patch the kernel if
I use the most updated version provided my sles 11 sp1.

Thanks again,
-J

On Wed, Oct 27, 2010 at 10:49 AM, Michael Barnes wrote:

>
> On Oct 27, 2010, at 1:37 PM, Jagga Soorma wrote:
>
> > I have been having a lot of trouble with the sles 11 kernel that lustre
> 1.8.4 supports.  I tried downgrading the kernel and the lustre client
> (1.8.1.1) but the kernel-ib provided modules had some problems with my hca.
>  The newer kernel-ib package that comes with lustre client 1.8.4 did not
> have any issues.  Would it be possible for me to build/install my own ofed
> package and install just the lustre client rpms?  Do you see any issues with
> this setup?  Or should lustre be built from scratch all together?
>
> Jagga,
>
> We use our own OFED RPMs with Lustre clients using the Lustre client RPMs.
>  We also have some clients compiled from source due to the fact that we
> can't run stock kernels on some of our hardware.
>
> Both work fine.
>
> -mb
>
> --
> +---
> | Michael Barnes
> |
> | Thomas Jefferson National Accelerator Facility
> | Scientific Computing Group
> | 12000 Jefferson Ave.
> | Newport News, VA 23606
> | (757) 269-7634
> +---
>
>
>
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre install

2010-10-27 Thread Jagga Soorma
Michael,

Which source should I be downloading from oracle's site?  There seem to be
different client source RPM's based on the distribution.  I would have
expected just a single source tarball or src.rpm but that does not seem to
be the case.

My apologies for the n00b question but I have not built the lustre client
from src before.

Thanks,
-J

On Wed, Oct 27, 2010 at 11:15 AM, Michael Barnes wrote:

>
> On Oct 27, 2010, at 1:56 PM, Jagga Soorma wrote:
>
> > Thanks Michael for your response.  So if I understand correctly, you have
> not had any issues running the stock kernel with the sun/oracle provided
> lustre client rpms and instead of using the kernel-ib package you install
> your own ofed packages.
>
> Thats correct.
>
> > Also, I have the new intel 8 core cpu's and would prefer to go to sles 11
> sp 1 instead of sles 11.  However, this is not supported by the lustre
> client yet.  What has your experience been with building your own lustre
> rpm's from source using a different kernel?  Do you still have to patch the
> kernel?  I am also thinking about installing sles 11 sp1 and just building
> the lustre client rpm's from source.  Not sure if it is required to patch
> the kernel if I use the most updated version provided my sles 11 sp1.
>
> No. Lustre client kernel modules are self-contained aka "patchless"
> clients.  Its been a while since I made the RPMs, but I found this laying
> around:
>
> ./configure --disable-server --with-linux=/usr/src/linux-2.6.22-pfm-xeon
> --with-o2ib --enable-quota --disable-readline
>
> Then I believe 'make rpms' does the right thing.
>
> Now that I said how easy it was, there is a caveat.  Now, there may be
> issues with specific kernels, but this worked for us.  The linux-2.6.22
> kernel is a kernel.org kernel with pfm patches (performance monitoring)
> and this kernel also has a NDAed patch from AMD because there are bugs in
> the CPUs and the patches are workarounds for the bugs in the CPU.
>
> It works for us, YMMV.
>
> -mb
>
> --
> +---
> | Michael Barnes
> |
> | Thomas Jefferson National Accelerator Facility
> | Scientific Computing Group
> | 12000 Jefferson Ave.
> | Newport News, VA 23606
> | (757) 269-7634
> +---
>
>
>
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre install

2010-10-27 Thread Jagga Soorma
Okay, I think I need lustre-1.8.4.tar.gz.  Will try building the client with
it and install my own ofed package.  Hope this works.

Thanks,
-J

On Wed, Oct 27, 2010 at 11:36 AM, Jagga Soorma  wrote:

> Michael,
>
> Which source should I be downloading from oracle's site?  There seem to be
> different client source RPM's based on the distribution.  I would have
> expected just a single source tarball or src.rpm but that does not seem to
> be the case.
>
> My apologies for the n00b question but I have not built the lustre client
> from src before.
>
> Thanks,
> -J
>
>
> On Wed, Oct 27, 2010 at 11:15 AM, Michael Barnes 
> wrote:
>
>>
>> On Oct 27, 2010, at 1:56 PM, Jagga Soorma wrote:
>>
>> > Thanks Michael for your response.  So if I understand correctly, you
>> have not had any issues running the stock kernel with the sun/oracle
>> provided lustre client rpms and instead of using the kernel-ib package you
>> install your own ofed packages.
>>
>> Thats correct.
>>
>> > Also, I have the new intel 8 core cpu's and would prefer to go to sles
>> 11 sp 1 instead of sles 11.  However, this is not supported by the lustre
>> client yet.  What has your experience been with building your own lustre
>> rpm's from source using a different kernel?  Do you still have to patch the
>> kernel?  I am also thinking about installing sles 11 sp1 and just building
>> the lustre client rpm's from source.  Not sure if it is required to patch
>> the kernel if I use the most updated version provided my sles 11 sp1.
>>
>> No. Lustre client kernel modules are self-contained aka "patchless"
>> clients.  Its been a while since I made the RPMs, but I found this laying
>> around:
>>
>> ./configure --disable-server --with-linux=/usr/src/linux-2.6.22-pfm-xeon
>> --with-o2ib --enable-quota --disable-readline
>>
>> Then I believe 'make rpms' does the right thing.
>>
>> Now that I said how easy it was, there is a caveat.  Now, there may be
>> issues with specific kernels, but this worked for us.  The linux-2.6.22
>> kernel is a kernel.org kernel with pfm patches (performance monitoring)
>> and this kernel also has a NDAed patch from AMD because there are bugs in
>> the CPUs and the patches are workarounds for the bugs in the CPU.
>>
>> It works for us, YMMV.
>>
>> -mb
>>
>> --
>> +---
>> | Michael Barnes
>> |
>> | Thomas Jefferson National Accelerator Facility
>> | Scientific Computing Group
>> | 12000 Jefferson Ave.
>> | Newport News, VA 23606
>> | (757) 269-7634
>> +---
>>
>>
>>
>>
>>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre install

2010-10-27 Thread Jagga Soorma
Just out of curiosity, how come you are using the --disable-readline option?

Thanks,
-J

On Wed, Oct 27, 2010 at 11:41 AM, Jagga Soorma  wrote:

> Okay, I think I need lustre-1.8.4.tar.gz.  Will try building the client
> with it and install my own ofed package.  Hope this works.
>
> Thanks,
> -J
>
>
> On Wed, Oct 27, 2010 at 11:36 AM, Jagga Soorma  wrote:
>
>> Michael,
>>
>> Which source should I be downloading from oracle's site?  There seem to be
>> different client source RPM's based on the distribution.  I would have
>> expected just a single source tarball or src.rpm but that does not seem to
>> be the case.
>>
>> My apologies for the n00b question but I have not built the lustre client
>> from src before.
>>
>> Thanks,
>> -J
>>
>>
>> On Wed, Oct 27, 2010 at 11:15 AM, Michael Barnes > > wrote:
>>
>>>
>>> On Oct 27, 2010, at 1:56 PM, Jagga Soorma wrote:
>>>
>>> > Thanks Michael for your response.  So if I understand correctly, you
>>> have not had any issues running the stock kernel with the sun/oracle
>>> provided lustre client rpms and instead of using the kernel-ib package you
>>> install your own ofed packages.
>>>
>>> Thats correct.
>>>
>>> > Also, I have the new intel 8 core cpu's and would prefer to go to sles
>>> 11 sp 1 instead of sles 11.  However, this is not supported by the lustre
>>> client yet.  What has your experience been with building your own lustre
>>> rpm's from source using a different kernel?  Do you still have to patch the
>>> kernel?  I am also thinking about installing sles 11 sp1 and just building
>>> the lustre client rpm's from source.  Not sure if it is required to patch
>>> the kernel if I use the most updated version provided my sles 11 sp1.
>>>
>>> No. Lustre client kernel modules are self-contained aka "patchless"
>>> clients.  Its been a while since I made the RPMs, but I found this laying
>>> around:
>>>
>>> ./configure --disable-server --with-linux=/usr/src/linux-2.6.22-pfm-xeon
>>> --with-o2ib --enable-quota --disable-readline
>>>
>>> Then I believe 'make rpms' does the right thing.
>>>
>>> Now that I said how easy it was, there is a caveat.  Now, there may be
>>> issues with specific kernels, but this worked for us.  The linux-2.6.22
>>> kernel is a kernel.org kernel with pfm patches (performance monitoring)
>>> and this kernel also has a NDAed patch from AMD because there are bugs in
>>> the CPUs and the patches are workarounds for the bugs in the CPU.
>>>
>>> It works for us, YMMV.
>>>
>>> -mb
>>>
>>> --
>>> +---
>>> | Michael Barnes
>>> |
>>> | Thomas Jefferson National Accelerator Facility
>>> | Scientific Computing Group
>>> | 12000 Jefferson Ave.
>>> | Newport News, VA 23606
>>> | (757) 269-7634
>>> +---
>>>
>>>
>>>
>>>
>>>
>>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre install

2010-10-27 Thread Jagga Soorma
Hey Michael,

The configure process went fine.  However, after doing a make install and
rebooting the server I am not able to load the lustre module even though it
does exist:

--
node205:~ # modprobe lustre
FATAL: Module lustre not found.
node205:~ # cd /lib
node205:/lib # find . -name "lustre.ko"
./modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko
node205:/lib # uname -a
Linux node205 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64
x86_64 x86_64 GNU/Linux
--

I am probably missing something here.  Any help would be appreciated.

Thanks,
-J
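As the follow-up below shows, the step missing after "make install" is regenerating
the module dependency map that modprobe consults; a minimal sketch:

--
depmod -a          # rebuild /lib/modules/$(uname -r)/modules.dep for the new modules
modprobe lustre
--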


On Wed, Oct 27, 2010 at 11:59 AM, Michael Barnes wrote:

>
> On Oct 27, 2010, at 2:46 PM, Jagga Soorma wrote:
>
> > Just out of curiosity, how come you are using the --disable-readline
> option?
>
> Too lazy to install the readline source package, and I don't think I needed
> it.  No other reason.
>
> -mb
>
> --
> +---
> | Michael Barnes
> |
> | Thomas Jefferson National Accelerator Facility
> | Scientific Computing Group
> | 12000 Jefferson Ave.
> | Newport News, VA 23606
> | (757) 269-7634
> +---
>
>
>
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre install

2010-10-27 Thread Jagga Soorma
Okay, so I ran a depmod and then tried again.  I am now running into these
error messages:

--
node205:/etc/modprobe.d # modprobe lustre
WARNING: Error inserting osc
(/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/osc.ko):
Input/output error
WARNING: Error inserting mdc
(/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/mdc.ko):
Input/output error
WARNING: Error inserting lov
(/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lov.ko):
Input/output error
FATAL: Error inserting lustre
(/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko):
Input/output error
--

What am I missing here?

Thanks,
-J

On Wed, Oct 27, 2010 at 3:00 PM, Jagga Soorma  wrote:

> Hey Michael,
>
> The configure process went find.  However, after doing a make install and
> rebooting the server I am not able to load the lustre module even though it
> does exisit:
>
> --
> node205:~ # modprobe lustre
> FATAL: Module lustre not found.
> node205:~ # cd /lib
> node205:/lib # find . -name "lustre.ko"
> ./modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko
> node205:/lib # uname -a
> Linux node205 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64
> x86_64 x86_64 GNU/Linux
> --
>
> I am probably missing something here.  Any help would be appreciated.
>
> Thanks,
> -J
>
>
>
> On Wed, Oct 27, 2010 at 11:59 AM, Michael Barnes 
> wrote:
>
>>
>> On Oct 27, 2010, at 2:46 PM, Jagga Soorma wrote:
>>
>> > Just out of curiosity, how come you are using the --disable-readline
>> option?
>>
>> Too lazy to install the readline source package, and I don't think I
>> needed
>> it.  No other reason.
>>
>> -mb
>>
>> --
>> +---
>> | Michael Barnes
>> |
>> | Thomas Jefferson National Accelerator Facility
>> | Scientific Computing Group
>> | 12000 Jefferson Ave.
>> | Newport News, VA 23606
>> | (757) 269-7634
>> +---
>>
>>
>>
>>
>>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre Patchless Client build issues.

2010-10-27 Thread Jagga Soorma
Hi Guys,

I have compiled lustre client 1.8.4 with the following options:

./configure --disable-server --with-linux=/usr/src/linux-2.6.32.12-0.7
--with-linux-obj=/usr/src/linux-2.6.32.12-0.7-obj/x86_64/default
--with-linux-config=/boot/config-2.6.32.12-0.7-default --with-o2ib
--enable-quota

uname (SLES 11 SP 1):
Linux node205 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64
x86_64 x86_64 GNU/Linux

The only problem that I saw during the configure/make was:

..snip..
*** You have not yet configured your kernel!
*** (missing kernel config file ".config")
***
*** Please run some configurator (e.g. "make oldconfig" or
*** "make menuconfig" or "make xconfig").
***
..snip..

However, the build did not fail, so I thought this was okay.  Now that
everything has been built I am trying to load the lustre module but keep
getting this error:

--
node205:~ # modprobe lustre
WARNING: Error inserting osc
(/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/osc.ko):
Input/output error
WARNING: Error inserting mdc
(/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/mdc.ko):
Input/output error
WARNING: Error inserting lov
(/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lov.ko):
Input/output error
FATAL: Error inserting lustre
(/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko):
Input/output error
--

I do have lustre installed under the correct path I believe:

node205:/lib # find . -name "lustre.ko"
./modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko

Am I missing some step here?  Any help would be greatly appreciated.

Thanks,
-J
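The "Input/output error" is just the errno that came back from loading the module;
the reason behind it is normally printed to the kernel log.  A few first checks, as
a sketch using the paths from the message above:

--
dmesg | tail -40                     # the module's actual complaint (unresolved symbols, etc.)
modinfo /lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko | grep vermagic
uname -r                             # vermagic should match the running kernel exactly
--

The "missing kernel config file" warning during configure may also matter here: it
suggests the build ran against a source tree that was not prepared to match the
running kernel and its -obj directory.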
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre Patchless Client build issues.

2010-10-27 Thread Jagga Soorma
Also just to point out I am using OFED 1.5.1.

-J

On Wed, Oct 27, 2010 at 3:16 PM, Jagga Soorma  wrote:

> Hi Guys,
>
> I have compiled lustre client 1.8.4 with the following options:
>
> ./configure --disable-server --with-linux=/usr/src/linux-2.6.32.12-0.7
> --with-linux-obj=/usr/src/linux-2.6.32.12-0.7-obj/x86_64/default
> --with-linux-config=/boot/config-2.6.32.12-0.7-default --with-o2ib
> --enable-quota
>
> uname (SLES 11 SP 1):
> Linux node205 2.6.32.12-0.7-default #1 SMP 2010-05-20 11:14:20 +0200 x86_64
> x86_64 x86_64 GNU/Linux
>
> The only problem that I saw during the configure/make was:
>
> ..snip..
> *** You have not yet configured your kernel!
> *** (missing kernel config file ".config")
> ***
> *** Please run some configurator (e.g. "make oldconfig" or
> *** "make menuconfig" or "make xconfig").
> ***
> ..snip..
>
> However the build did not fail so I thought this was okay.  Now that
> everything has been built I am try to load the lustre module but keep
> getting this error:
>
> --
> node205:~ # modprobe lustre
> WARNING: Error inserting osc
> (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/osc.ko):
> Input/output error
> WARNING: Error inserting mdc
> (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/mdc.ko):
> Input/output error
> WARNING: Error inserting lov
> (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lov.ko):
> Input/output error
> FATAL: Error inserting lustre
> (/lib/modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko):
> Input/output error
> --
>
> I do have lustre installed under the correct path I believe:
>
> node205:/lib # find . -name "lustre.ko"
> ./modules/2.6.32.12-0.7-default/updates/kernel/fs/lustre/lustre.ko
>
> Am I missing some step here?  Any help would be greatly appreciated.
>
> Thanks,
> -J
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre Error on client

2010-10-29 Thread Jagga Soorma
Hey Guys,

I just built a new 1.8.4 lustre client from source with OFED 1.5.1.  There
has not been much I/O on this client to the lustre file system and I
just noticed the following error message on the client:

--
Oct 29 17:01:01 node205 kernel: LustreError:
31163:0:(dir.c:384:ll_readdir_18()) error reading dir 423788546/4213
page 0: rc -43
--

I don't see any lustre errors during this time on the MDS or the OSS's.
 What does this error mean?  Should I be concerned about this?  The
filesystem seems to be fine at this moment.

Any help would be greatly appreciated.

Thanks,
-J
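The rc values in these console messages are generally negative errno codes, so they
can be decoded without digging through the source; a one-liner sketch:

--
perl -e '$! = 43; print "$!\n"'    # prints the strerror() text for errno 43
--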
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre client error

2011-02-15 Thread Jagga Soorma
Hi Guys,

One of my clients got a hung lustre mount this morning and I saw the
following errors in my logs:

--
..snip..
Feb 15 09:38:07 reshpc116 kernel: LustreError: 11-0: an error occurred while
communicating with 10.0.250.47@o2ib3. The ost_write operation failed with
-28
Feb 15 09:38:07 reshpc116 kernel: LustreError: Skipped 4755836 previous
similar messages
Feb 15 09:48:07 reshpc116 kernel: LustreError: 11-0: an error occurred while
communicating with 10.0.250.47@o2ib3. The ost_write operation failed with
-28
Feb 15 09:48:07 reshpc116 kernel: LustreError: Skipped 4649141 previous
similar messages
Feb 15 10:16:54 reshpc116 kernel: Lustre:
6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1360125198261945 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
Feb 15 10:16:54 reshpc116 kernel: Lustre:
reshpcfs-OST0005-osc-8830175c8400: Connection to service
reshpcfs-OST0005 via nid 10.0.250.47@o2ib3 was lost; in progress operations
using this service will wait for recovery to complete.
Feb 15 10:16:54 reshpc116 kernel: LustreError: 11-0: an error occurred while
communicating with 10.0.250.47@o2ib3. The ost_connect operation failed with
-16
Feb 15 10:16:54 reshpc116 kernel: LustreError: Skipped 2888779 previous
similar messages
Feb 15 10:16:55 reshpc116 kernel: Lustre:
6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
x1360125198261947 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
Feb 15 10:18:11 reshpc116 kernel: LustreError: 11-0: an error occurred while
communicating with 10.0.250.47@o2ib3. The ost_connect operation failed with
-16
Feb 15 10:18:11 reshpc116 kernel: LustreError: Skipped 10 previous similar
messages
Feb 15 10:20:45 reshpc116 kernel: LustreError: 11-0: an error occurred while
communicating with 10.0.250.47@o2ib3. The ost_connect operation failed with
-16
Feb 15 10:20:45 reshpc116 kernel: LustreError: Skipped 21 previous similar
messages
Feb 15 10:25:46 reshpc116 kernel: LustreError: 11-0: an error occurred while
communicating with 10.0.250.47@o2ib3. The ost_connect operation failed with
-16
Feb 15 10:25:46 reshpc116 kernel: LustreError: Skipped 42 previous similar
messages
Feb 15 10:31:43 reshpc116 kernel: Lustre:
reshpcfs-OST0005-osc-8830175c8400: Connection restored to service
reshpcfs-OST0005 using nid 10.0.250.47@o2ib3.
--

Due to disk space issues on my lustre filesystem one of the OSTs was full
and I deactivated that OST this morning.  I thought that operation just puts
it in a read-only state and that clients can still access the data from that
OST.  After activating this OST again the client reconnected and was
okay after this.  How else would you deal with an OST that is close to 100%
full?  Is it okay to leave the OST active, and will the clients know not to
write data to that OST?

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre client error

2011-02-15 Thread Jagga Soorma
I did deactivate this OST on the MDS server.  So how would I deal with an OST
filling up?  The OSTs don't seem to be filling up evenly either.  How does
lustre handle an OST that is at 100%?  Would it not use this specific OST for
writes if there are other OSTs available with capacity?

Thanks,
-J
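For reference, the usual 1.8 way to stop new objects landing on a full OST while
leaving its existing data readable, run on the MDS (the device number is a
placeholder to be read from the lctl dl output):

--
lctl dl | grep OST0007             # find the osc device for the full OST
lctl --device <devno> deactivate   # no new objects are allocated on it
lctl --device <devno> activate     # re-enable once space has been freed
--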

On Tue, Feb 15, 2011 at 11:45 AM, Andreas Dilger wrote:

> On 2011-02-15, at 12:20, Cliff White wrote:
> > Client situation depends on where you deactivated the OST - if you
> deactivate on the MDS only, clients should be able to read.
> >
> > What is best to do when an OST fills up really depends on what else you
> are doing at the time, and how much control you have over what the clients
> are doing and other things.  If you can solve the space issue with a quick
> rm -rf, best to leave it online, likewise if all your clients are trying to
> bang on it and failing, best to turn things off. YMMV
>
> In theory, with 1.8 the full OST should be skipped for new object
> allocations, but this is not robust in the face of e.g. a single very large
> file being written to the OST that takes it from "average" usage to being
> full.
>
> > On Tue, Feb 15, 2011 at 10:57 AM, Jagga Soorma 
> wrote:
> > Hi Guys,
> >
> > One of my clients got a hung lustre mount this morning and I saw the
> following errors in my logs:
> >
> > --
> > ..snip..
> > Feb 15 09:38:07 reshpc116 kernel: LustreError: 11-0: an error occurred
> while communicating with 10.0.250.47@o2ib3. The ost_write operation failed
> with -28
> > Feb 15 09:38:07 reshpc116 kernel: LustreError: Skipped 4755836 previous
> similar messages
> > Feb 15 09:48:07 reshpc116 kernel: LustreError: 11-0: an error occurred
> while communicating with 10.0.250.47@o2ib3. The ost_write operation failed
> with -28
> > Feb 15 09:48:07 reshpc116 kernel: LustreError: Skipped 4649141 previous
> similar messages
> > Feb 15 10:16:54 reshpc116 kernel: Lustre:
> 6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
> x1360125198261945 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
> 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
> > Feb 15 10:16:54 reshpc116 kernel: Lustre:
> reshpcfs-OST0005-osc-8830175c8400: Connection to service
> reshpcfs-OST0005 via nid 10.0.250.47@o2ib3 was lost; in progress
> operations using this service will wait for recovery to complete.
> > Feb 15 10:16:54 reshpc116 kernel: LustreError: 11-0: an error occurred
> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
> failed with -16
> > Feb 15 10:16:54 reshpc116 kernel: LustreError: Skipped 2888779 previous
> similar messages
> > Feb 15 10:16:55 reshpc116 kernel: Lustre:
> 6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
> x1360125198261947 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
> 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
> > Feb 15 10:18:11 reshpc116 kernel: LustreError: 11-0: an error occurred
> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
> failed with -16
> > Feb 15 10:18:11 reshpc116 kernel: LustreError: Skipped 10 previous
> similar messages
> > Feb 15 10:20:45 reshpc116 kernel: LustreError: 11-0: an error occurred
> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
> failed with -16
> > Feb 15 10:20:45 reshpc116 kernel: LustreError: Skipped 21 previous
> similar messages
> > Feb 15 10:25:46 reshpc116 kernel: LustreError: 11-0: an error occurred
> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
> failed with -16
> > Feb 15 10:25:46 reshpc116 kernel: LustreError: Skipped 42 previous
> similar messages
> > Feb 15 10:31:43 reshpc116 kernel: Lustre:
> reshpcfs-OST0005-osc-8830175c8400: Connection restored to service
> reshpcfs-OST0005 using nid 10.0.250.47@o2ib3.
> > --
> >
> > Due to disk space issues on my lustre filesystem one of the OST's were
> full and I deactivated that OST this morning.  I thought that operation just
> puts it in a read only state and that clients can still access the data from
> that OST.  After activating this OST again the client connected again and
> was okay after this.  How else would you deal with a OST that is close to
> 100% full?  Is it okay to leave the OST active and the clients will know not
> to write data to that OST?
> >
> > Thanks,
> > -J
> >
> > ___
> > Lustre-discuss mailing list
> > Lustre-discuss@lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-discuss
> >
> >
> > ___
> > Lustre-discuss mailing list
> > Lustre-discuss@lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Engineer
> Whamcloud, Inc.
>
>
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Lustre client error

2011-02-15 Thread Jagga Soorma
Also, it looks like the client is reporting a different %used compared to
the oss server itself:

client:
reshpc101:~ # lfs df -h | grep -i 0007
reshpcfs-OST0007_UUID  2.0T  1.7T202.7G   84% /reshpcfs[OST:7]

oss:
/dev/mapper/mpath72.0T  1.9T   40G  98% /gnet/lustre/oss02/mpath7

Here is how the data seems to be distributed on one of the OSS's:
--
/dev/mapper/mpath52.0T  1.2T  688G  65% /gnet/lustre/oss02/mpath5
/dev/mapper/mpath62.0T  1.7T  224G  89% /gnet/lustre/oss02/mpath6
/dev/mapper/mpath72.0T  1.9T   41G  98% /gnet/lustre/oss02/mpath7
/dev/mapper/mpath82.0T  1.3T  671G  65% /gnet/lustre/oss02/mpath8
/dev/mapper/mpath92.0T  1.3T  634G  67% /gnet/lustre/oss02/mpath9
--

-J

On Tue, Feb 15, 2011 at 2:37 PM, Jagga Soorma  wrote:

> I did deactivate this OST on the MDS server.  So how would I deal with a
> OST filling up?  The OST's don't seem to be filling up evenly either.  How
> does lustre handle a OST that is at 100%?  Would it not use this specific
> OST for writes if there are other OST available with capacity?
>
> Thanks,
> -J
>
>
> On Tue, Feb 15, 2011 at 11:45 AM, Andreas Dilger wrote:
>
>> On 2011-02-15, at 12:20, Cliff White wrote:
>> > Client situation depends on where you deactivated the OST - if you
>> deactivate on the MDS only, clients should be able to read.
>> >
>> > What is best to do when an OST fills up really depends on what else you
>> are doing at the time, and how much control you have over what the clients
>> are doing and other things.  If you can solve the space issue with a quick
>> rm -rf, best to leave it online, likewise if all your clients are trying to
>> bang on it and failing, best to turn things off. YMMV
>>
>> In theory, with 1.8 the full OST should be skipped for new object
>> allocations, but this is not robust in the face of e.g. a single very large
>> file being written to the OST that takes it from "average" usage to being
>> full.
>>
>> > On Tue, Feb 15, 2011 at 10:57 AM, Jagga Soorma 
>> wrote:
>> > Hi Guys,
>> >
>> > One of my clients got a hung lustre mount this morning and I saw the
>> following errors in my logs:
>> >
>> > --
>> > ..snip..
>> > Feb 15 09:38:07 reshpc116 kernel: LustreError: 11-0: an error occurred
>> while communicating with 10.0.250.47@o2ib3. The ost_write operation
>> failed with -28
>> > Feb 15 09:38:07 reshpc116 kernel: LustreError: Skipped 4755836 previous
>> similar messages
>> > Feb 15 09:48:07 reshpc116 kernel: LustreError: 11-0: an error occurred
>> while communicating with 10.0.250.47@o2ib3. The ost_write operation
>> failed with -28
>> > Feb 15 09:48:07 reshpc116 kernel: LustreError: Skipped 4649141 previous
>> similar messages
>> > Feb 15 10:16:54 reshpc116 kernel: Lustre:
>> 6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
>> x1360125198261945 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
>> 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
>> > Feb 15 10:16:54 reshpc116 kernel: Lustre:
>> reshpcfs-OST0005-osc-8830175c8400: Connection to service
>> reshpcfs-OST0005 via nid 10.0.250.47@o2ib3 was lost; in progress
>> operations using this service will wait for recovery to complete.
>> > Feb 15 10:16:54 reshpc116 kernel: LustreError: 11-0: an error occurred
>> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
>> failed with -16
>> > Feb 15 10:16:54 reshpc116 kernel: LustreError: Skipped 2888779 previous
>> similar messages
>> > Feb 15 10:16:55 reshpc116 kernel: Lustre:
>> 6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
>> x1360125198261947 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
>> 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
>> > Feb 15 10:18:11 reshpc116 kernel: LustreError: 11-0: an error occurred
>> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
>> failed with -16
>> > Feb 15 10:18:11 reshpc116 kernel: LustreError: Skipped 10 previous
>> similar messages
>> > Feb 15 10:20:45 reshpc116 kernel: LustreError: 11-0: an error occurred
>> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
>> failed with -16
>> > Feb 15 10:20:45 reshpc116 kernel: LustreError: Skipped 21 previous
>> similar messages
>> > Feb 15 10:25:46 reshpc116 kernel: LustreError: 11-0: an error occurred
>> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
>> failed with -16
>> > Feb 15 10:25:46 reshpc116 kernel: LustreError: Skipped

Re: [Lustre-discuss] Lustre client error

2011-02-15 Thread Jagga Soorma
I might be looking at the wrong OST.  What is the best way to map the actual
/dev/mapper/mpath[X] device to the OST ID used for that volume?

Thanks,
-J
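Two ways that should answer that, both run on the OSS hosting the volume (a sketch):

--
tunefs.lustre --dryrun /dev/mapper/mpath7 | grep Target   # prints the target name, e.g. ...-OST0007
lctl get_param obdfilter.*.mntdev                         # target name -> backing device for mounted OSTs
--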

On Tue, Feb 15, 2011 at 3:01 PM, Jagga Soorma  wrote:

> Also, it looks like the client is reporting a different %used compared to
> the oss server itself:
>
> client:
> reshpc101:~ # lfs df -h | grep -i 0007
> reshpcfs-OST0007_UUID  2.0T  1.7T202.7G   84% /reshpcfs[OST:7]
>
> oss:
> /dev/mapper/mpath72.0T  1.9T   40G  98% /gnet/lustre/oss02/mpath7
>
> Here is how the data seems to be distributed on one of the OSS's:
> --
> /dev/mapper/mpath52.0T  1.2T  688G  65% /gnet/lustre/oss02/mpath5
> /dev/mapper/mpath62.0T  1.7T  224G  89% /gnet/lustre/oss02/mpath6
> /dev/mapper/mpath72.0T  1.9T   41G  98% /gnet/lustre/oss02/mpath7
> /dev/mapper/mpath82.0T  1.3T  671G  65% /gnet/lustre/oss02/mpath8
> /dev/mapper/mpath92.0T  1.3T  634G  67% /gnet/lustre/oss02/mpath9
> --
>
> -J
>
>
> On Tue, Feb 15, 2011 at 2:37 PM, Jagga Soorma  wrote:
>
>> I did deactivate this OST on the MDS server.  So how would I deal with a
>> OST filling up?  The OST's don't seem to be filling up evenly either.  How
>> does lustre handle a OST that is at 100%?  Would it not use this specific
>> OST for writes if there are other OST available with capacity?
>>
>> Thanks,
>> -J
>>
>>
>> On Tue, Feb 15, 2011 at 11:45 AM, Andreas Dilger 
>> wrote:
>>
>>> On 2011-02-15, at 12:20, Cliff White wrote:
>>> > Client situation depends on where you deactivated the OST - if you
>>> deactivate on the MDS only, clients should be able to read.
>>> >
>>> > What is best to do when an OST fills up really depends on what else you
>>> are doing at the time, and how much control you have over what the clients
>>> are doing and other things.  If you can solve the space issue with a quick
>>> rm -rf, best to leave it online, likewise if all your clients are trying to
>>> bang on it and failing, best to turn things off. YMMV
>>>
>>> In theory, with 1.8 the full OST should be skipped for new object
>>> allocations, but this is not robust in the face of e.g. a single very large
>>> file being written to the OST that takes it from "average" usage to being
>>> full.
>>>
>>> > On Tue, Feb 15, 2011 at 10:57 AM, Jagga Soorma 
>>> wrote:
>>> > Hi Guys,
>>> >
>>> > One of my clients got a hung lustre mount this morning and I saw the
>>> following errors in my logs:
>>> >
>>> > --
>>> > ..snip..
>>> > Feb 15 09:38:07 reshpc116 kernel: LustreError: 11-0: an error occurred
>>> while communicating with 10.0.250.47@o2ib3. The ost_write operation
>>> failed with -28
>>> > Feb 15 09:38:07 reshpc116 kernel: LustreError: Skipped 4755836 previous
>>> similar messages
>>> > Feb 15 09:48:07 reshpc116 kernel: LustreError: 11-0: an error occurred
>>> while communicating with 10.0.250.47@o2ib3. The ost_write operation
>>> failed with -28
>>> > Feb 15 09:48:07 reshpc116 kernel: LustreError: Skipped 4649141 previous
>>> similar messages
>>> > Feb 15 10:16:54 reshpc116 kernel: Lustre:
>>> 6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
>>> x1360125198261945 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
>>> 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
>>> > Feb 15 10:16:54 reshpc116 kernel: Lustre:
>>> reshpcfs-OST0005-osc-8830175c8400: Connection to service
>>> reshpcfs-OST0005 via nid 10.0.250.47@o2ib3 was lost; in progress
>>> operations using this service will wait for recovery to complete.
>>> > Feb 15 10:16:54 reshpc116 kernel: LustreError: 11-0: an error occurred
>>> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
>>> failed with -16
>>> > Feb 15 10:16:54 reshpc116 kernel: LustreError: Skipped 2888779 previous
>>> similar messages
>>> > Feb 15 10:16:55 reshpc116 kernel: Lustre:
>>> 6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
>>> x1360125198261947 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
>>> 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
>>> > Feb 15 10:18:11 reshpc116 kernel: LustreError: 11-0: an error occurred
>>> while communicating with 10.0.250.47@o2ib3. The ost_connect operation
>>> failed with -16
>>> > Feb 15 10:18:11 reshpc116 kernel: LustreError: Skipped 

Re: [Lustre-discuss] Lustre client error

2011-02-15 Thread Jagga Soorma
This OST is now 100% full with only 12GB remaining, and something is actively
writing to this volume.  What would be the appropriate thing to do in this
scenario?  If I set this to read only on the MDS then some of my clients
start hanging up.

Should I be running "lfs find -O OST_UID /lustre" and then moving the files
out of this filesystem and adding them back?  But then there is no guarantee
that they will not be written to this specific OST.

Any help would be greatly appreciated.

Thanks,
-J
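
(For what it's worth, a rough sketch of how a full OST is usually drained,
assuming OST index 7 and 1.8-era lfs syntax; this is not an official
procedure, and the device number below is a placeholder taken from lctl dl:)

--
# 1) On the MDS: find the OSC device for the full OST and deactivate it, so no
#    new objects are allocated there (existing objects can still be read)
lctl dl | grep OST0007
lctl --device <devno> deactivate     # <devno> = number shown by the line above

# 2) On a client: list files that have objects on that OST (1.8 syntax)
lfs find /reshpcfs --obd reshpcfs-OST0007_UUID > /tmp/ost7_files

# 3) Copy each file and rename the copy over the original, so the data lands
#    on the remaining OSTs
while read f; do
    cp -a "$f" "$f.migrate" && mv "$f.migrate" "$f"
done < /tmp/ost7_files

# 4) Re-activate the OST on the MDS once the migration is done
lctl --device <devno> activate
--

Newer releases ship an lfs_migrate helper that automates step 3, if it is
available on the clients.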

On Tue, Feb 15, 2011 at 3:05 PM, Jagga Soorma  wrote:

> I might be looking at the wrong OST.  What is the best way to map the
> actual /dev/mapper/mpath[X] to what OST ID is used for that volume?
>
> Thanks,
> -J
>
>
> On Tue, Feb 15, 2011 at 3:01 PM, Jagga Soorma  wrote:
>
>> Also, it looks like the client is reporting a different %used compared to
>> the oss server itself:
>>
>> client:
>> reshpc101:~ # lfs df -h | grep -i 0007
>> reshpcfs-OST0007_UUID  2.0T  1.7T202.7G   84% /reshpcfs[OST:7]
>>
>> oss:
>> /dev/mapper/mpath72.0T  1.9T   40G  98% /gnet/lustre/oss02/mpath7
>>
>> Here is how the data seems to be distributed on one of the OSS's:
>> --
>> /dev/mapper/mpath52.0T  1.2T  688G  65% /gnet/lustre/oss02/mpath5
>> /dev/mapper/mpath62.0T  1.7T  224G  89% /gnet/lustre/oss02/mpath6
>> /dev/mapper/mpath72.0T  1.9T   41G  98% /gnet/lustre/oss02/mpath7
>> /dev/mapper/mpath82.0T  1.3T  671G  65% /gnet/lustre/oss02/mpath8
>> /dev/mapper/mpath92.0T  1.3T  634G  67% /gnet/lustre/oss02/mpath9
>> --
>>
>> -J
>>
>>
>> On Tue, Feb 15, 2011 at 2:37 PM, Jagga Soorma  wrote:
>>
>>> I did deactivate this OST on the MDS server.  So how would I deal with a
>>> OST filling up?  The OST's don't seem to be filling up evenly either.  How
>>> does lustre handle a OST that is at 100%?  Would it not use this specific
>>> OST for writes if there are other OST available with capacity?
>>>
>>> Thanks,
>>> -J
>>>
>>>
>>> On Tue, Feb 15, 2011 at 11:45 AM, Andreas Dilger 
>>> wrote:
>>>
>>>> On 2011-02-15, at 12:20, Cliff White wrote:
>>>> > Client situation depends on where you deactivated the OST - if you
>>>> deactivate on the MDS only, clients should be able to read.
>>>> >
>>>> > What is best to do when an OST fills up really depends on what else
>>>> you are doing at the time, and how much control you have over what the
>>>> clients are doing and other things.  If you can solve the space issue with 
>>>> a
>>>> quick rm -rf, best to leave it online, likewise if all your clients are
>>>> trying to bang on it and failing, best to turn things off. YMMV
>>>>
>>>> In theory, with 1.8 the full OST should be skipped for new object
>>>> allocations, but this is not robust in the face of e.g. a single very large
>>>> file being written to the OST that takes it from "average" usage to being
>>>> full.
>>>>
>>>> > On Tue, Feb 15, 2011 at 10:57 AM, Jagga Soorma 
>>>> wrote:
>>>> > Hi Guys,
>>>> >
>>>> > One of my clients got a hung lustre mount this morning and I saw the
>>>> following errors in my logs:
>>>> >
>>>> > --
>>>> > ..snip..
>>>> > Feb 15 09:38:07 reshpc116 kernel: LustreError: 11-0: an error occurred
>>>> while communicating with 10.0.250.47@o2ib3. The ost_write operation
>>>> failed with -28
>>>> > Feb 15 09:38:07 reshpc116 kernel: LustreError: Skipped 4755836
>>>> previous similar messages
>>>> > Feb 15 09:48:07 reshpc116 kernel: LustreError: 11-0: an error occurred
>>>> while communicating with 10.0.250.47@o2ib3. The ost_write operation
>>>> failed with -28
>>>> > Feb 15 09:48:07 reshpc116 kernel: LustreError: Skipped 4649141
>>>> previous similar messages
>>>> > Feb 15 10:16:54 reshpc116 kernel: Lustre:
>>>> 6254:0:(client.c:1476:ptlrpc_expire_one_request()) @@@ Request
>>>> x1360125198261945 sent from reshpcfs-OST0005-osc-8830175c8400 to NID
>>>> 10.0.250.47@o2ib3 1344s ago has timed out (1344s prior to deadline).
>>>> > Feb 15 10:16:54 reshpc116 kernel: Lustre:
>>>> reshpcfs-OST0005-osc-8830175c8400: Connection to service
>>>> reshpcfs-OST0005 via nid 10.0.250.47@o2ib3 was lost; in progress
>>>> operations using this se

Re: [Lustre-discuss] Lustre client error

2011-02-16 Thread Jagga Soorma
Another thing that I just noticed is that after deactivating an OST on the
MDS, I am no longer able to check the quotas for users.  Here is the
message I receive:

--
Disk quotas for user testuser (uid 17229):
 Filesystem  kbytes   quota   limit   grace   files   quota   limit
grace
  /lustre [0] [0] [0] [0] [0] [0]

Some errors happened when getting quota info. Some devices may be not
working or deactivated. The data in "[]" is inaccurate.
--

Is this normal and expected?  Or am I missing something here?
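
(A quick way to see which target is being skipped, sketched on the assumption
that the OST was deactivated via its osc device on the MDS; the bracketed
zeros in lfs quota come from whichever devices cannot be queried:)

--
# On the MDS: the deactivated osc device shows IN (inactive) instead of UP
lctl dl | grep osc

# From a client: -v lists the quota usage per MDT/OST, which makes it obvious
# which target is returning no data
lfs quota -v -u testuser /lustre
--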

Thanks for all your support.  It is much appreciated.

Regards,
-J

On Tue, Feb 15, 2011 at 4:25 PM, Cliff White  wrote:

> you can use lfs find or lfs getstripe to identify where files are.
> If you move the files out and move them back, the QOS policy should
> re-distribute them evenly, but it very much depends. If you have clients
> using a stripe count of 1,
> a single large file can fill up one OST.
> df on the client reports space for the entire filesystem, df on the OSS
> reports space for the targets
> attached to that server, so yes the results will be different.
> cliffw
>
>
> On Tue, Feb 15, 2011 at 4:09 PM, Jagga Soorma  wrote:
>
>> This OST is 100% now with only 12GB remaining and something is actively
>> writing to this volume.  What would be the appropriate thing to do in this
>> scenario?  If I set this to read only on the mds then some of my clients
>> start hanging up.
>>
>> Should I be running "lfs find -O OST_UID /lustre" and then move the files
>> out of this filesystem and re-add them back?  But then there is no gurantee
>> that they will not be written to this specific OST.
>>
>> Any help would be greately appreciated.
>>
>> Thanks,
>> -J
>>
>>
>> On Tue, Feb 15, 2011 at 3:05 PM, Jagga Soorma  wrote:
>>
>>> I might be looking at the wrong OST.  What is the best way to map the
>>> actual /dev/mapper/mpath[X] to what OST ID is used for that volume?
>>>
>>> Thanks,
>>> -J
>>>
>>>
>>> On Tue, Feb 15, 2011 at 3:01 PM, Jagga Soorma  wrote:
>>>
>>>> Also, it looks like the client is reporting a different %used compared
>>>> to the oss server itself:
>>>>
>>>> client:
>>>> reshpc101:~ # lfs df -h | grep -i 0007
>>>> reshpcfs-OST0007_UUID  2.0T  1.7T202.7G   84%
>>>> /reshpcfs[OST:7]
>>>>
>>>> oss:
>>>> /dev/mapper/mpath72.0T  1.9T   40G  98% /gnet/lustre/oss02/mpath7
>>>>
>>>> Here is how the data seems to be distributed on one of the OSS's:
>>>> --
>>>> /dev/mapper/mpath5    2.0T  1.2T  688G  65% /gnet/lustre/oss02/mpath5
>>>> /dev/mapper/mpath62.0T  1.7T  224G  89% /gnet/lustre/oss02/mpath6
>>>> /dev/mapper/mpath72.0T  1.9T   41G  98% /gnet/lustre/oss02/mpath7
>>>> /dev/mapper/mpath82.0T  1.3T  671G  65% /gnet/lustre/oss02/mpath8
>>>> /dev/mapper/mpath92.0T  1.3T  634G  67% /gnet/lustre/oss02/mpath9
>>>> --
>>>>
>>>> -J
>>>>
>>>>
>>>> On Tue, Feb 15, 2011 at 2:37 PM, Jagga Soorma wrote:
>>>>
>>>>> I did deactivate this OST on the MDS server.  So how would I deal with
>>>>> a OST filling up?  The OST's don't seem to be filling up evenly either.  
>>>>> How
>>>>> does lustre handle a OST that is at 100%?  Would it not use this specific
>>>>> OST for writes if there are other OST available with capacity?
>>>>>
>>>>> Thanks,
>>>>> -J
>>>>>
>>>>>
>>>>> On Tue, Feb 15, 2011 at 11:45 AM, Andreas Dilger <
>>>>> adil...@whamcloud.com> wrote:
>>>>>
>>>>>> On 2011-02-15, at 12:20, Cliff White wrote:
>>>>>> > Client situation depends on where you deactivated the OST - if you
>>>>>> deactivate on the MDS only, clients should be able to read.
>>>>>> >
>>>>>> > What is best to do when an OST fills up really depends on what else
>>>>>> you are doing at the time, and how much control you have over what the
>>>>>> clients are doing and other things.  If you can solve the space issue 
>>>>>> with a
>>>>>> quick rm -rf, best to leave it online, likewise if all your clients are
>>>>>> trying to bang on it and failing, best to turn things off. YMMV
>>>>>>
>>>>>> In the

[Lustre-discuss] New to lustre - Help setting up Lustre

2009-12-23 Thread Jagga Soorma
Hi Guys,

I am working on building a small cluster sharing the lustre file system.  I
will be using an infiniband interconnect for all the compute/head nodes.
Can someone please assist me in figuring out which RPMs I should install on
my MDS & OSS:

kernel-ib-1.4.2-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm
kernel-ib-1.4.2-2.6.18_128.7.1.el5.x86_64.rpm
kernel-lustre-2.6.18-128.7.1.el5_lustre.1.8.1.1.x86_64.rpm
lustre-1.8.1.1-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm
lustre-client-1.8.1.1-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm
lustre-client-modules-1.8.1.1-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm
lustre-ldiskfs-3.0.9-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm
lustre-modules-1.8.1.1-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm

Should I be installing the kernel-ib* rpm?  I am planning on installing the
drivers that the vendor provides for infiniband.  I am pretty new to this
whole setup, so if any of this sounds too basic, my apologies in advance.
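
(For reference, a hedged sketch of how that RPM set is commonly split between
the server and client roles, assuming each node ends up running the kernel its
modules were built against; adjust to whichever kernel/OFED combination is
actually deployed:)

--
# MDS/OSS (servers): patched kernel, ldiskfs, server modules and utilities
rpm -ivh kernel-lustre-2.6.18-128.7.1.el5_lustre.1.8.1.1.x86_64.rpm \
         kernel-ib-1.4.2-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm \
         lustre-ldiskfs-3.0.9-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm \
         lustre-modules-1.8.1.1-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm \
         lustre-1.8.1.1-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm

# Clients: client modules and utilities only, plus the kernel-ib built for
# the client's kernel if using Lustre's OFED rather than the vendor's
rpm -ivh kernel-ib-1.4.2-2.6.18_128.7.1.el5.x86_64.rpm \
         lustre-client-modules-1.8.1.1-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm \
         lustre-client-1.8.1.1-2.6.18_128.7.1.el5_lustre.1.8.1.1.x86_64.rpm
--

The kernel-ib packages are Lustre's own OFED build; if the vendor's OFED stack
is used instead, kernel-ib should be skipped and the lustre modules generally
need to be rebuilt against that stack.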

Also, what is the recommendation on using LVM for MDT's and OST's?  Should
it be avoided, or is there not much of a performance hit if we use LVM?

And finally, are there any reasons why I should not use large luns
(approx. 2TB) for my OST's?

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] OFED - How to configure infiniband interface on Linux

2009-12-25 Thread Jagga Soorma
Hi Guys,

I have another quick question.  I will need to configure my infiniband
interfaces and I guess the best method is to use the kernel-ib modules that
lustre.org provides instead of using the ones from the vendor.  Is there any
documentation that you can point me to that would help me configure these
new interfaces?  I am brand new to infiniband and don't have too much
experience with this.  I will be adding this to both RHEL 5.3 and SLES 11
systems.
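
(A minimal sketch of the IPoIB interface config on each distro, assuming ib0
and purely illustrative addressing; the file names are the standard ones:)

--
# RHEL 5.3: /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
BOOTPROTO=static
IPADDR=10.0.250.5
NETMASK=255.255.255.0
ONBOOT=yes

# SLES 11: /etc/sysconfig/network/ifcfg-ib0
BOOTPROTO=static
IPADDR=10.0.250.5/24
STARTMODE=auto

# Bring the IB stack and the interface up:
chkconfig openibd on
/etc/init.d/openibd start
ifup ib0
--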

Thanks in advance.
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] What stripe size and extent size to choose for Lustre

2009-12-29 Thread Jagga Soorma
Hi Guys,

I am new to Lustre and will be deploying a small 20 node cluster using the
Lustre FS.  I do not have many requirements from my users :( and am not sure
what type of workload they will be putting on this filesystem.  All nodes
will be interconnected using Infiniband.  My question is what stripe size
and extent size to use for my environment:

Stripe Size: I was thinking of just setting this to -1 since I will be
adding more OSS's at a later time and will probably have 4-5 OST's per OSS.
Any downfalls of using -1?
Extent Size: I was thinking of setting this to 1MB.
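
(For reference, a hedged sketch of how those defaults would be set with
1.8-era lfs; note that -1 is a stripe *count* meaning "use all OSTs", while
the 1MB figure is the stripe *size*, and the mount point below is a
placeholder:)

--
# Set the filesystem-wide default on the root of the mount:
#   -s  stripe size, -c  stripe count (-1 = stripe over all available OSTs),
#   -i -1  lets the MDS choose the starting OST
lfs setstripe -s 1M -c -1 -i -1 /mnt/lustre

# Check what a given file or directory actually received:
lfs getstripe /mnt/lustre/somefile
--

Striping every file across all OSTs generally helps only large shared files;
for small files it adds overhead and makes every OST a dependency of every
file, which is worth weighing before making -1 the default.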

Regards,
-Simran
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre Monitoring Tools

2010-01-06 Thread Jagga Soorma
Hi Guys,

I would like to monitor the performance and usage of my Lustre filesystem
and was wondering what are the commonly used monitoring tools for this?
Cacti? Nagios?  Any input would be greatly appreciated.

Regards,
-Simran
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] What HA Software to use with Lustre

2010-01-14 Thread Jagga Soorma
Hi Guys,

I am setting up our new Lustre environment and was wondering what is the
recommended (stable) HA clustering software to use for MDS and OSS
failover.  Any input would be greatly appreciated.

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Lustre Error Messages

2010-01-15 Thread Jagga Soorma
Hi Guys,

I just deployed a new lustre filesystem and was unable to mount the
filesystem on a client for the first time.  I was able to reach everything
using the following:

lctl ping i...@o2ib3

All networks were up but the client hung while doing a mount.  So, I
decided to reboot my client and was then able to mount the filesystem
without any issues.  Here is something that I saw on the MDS server.

Any ideas what might be the problem?

Thanks in advance for your input.


--
..snip..
Jan 14 17:06:36 resmds01 kernel: Lustre: 7790:0:(client.c:1383:ptlrpc_expire_one_request()) @@@ Request x1324894173265938 sent from reshpcfs-OST0001-osc to NID 10.0.250...@o2ib3 0s ago has failed due to network error (limit 15s).
Jan 14 17:06:36 resmds01 kernel: Lustre: 7722:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7712:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7719:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7712:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7722:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7709:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7719:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7709:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7719:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7709:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7717:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7717:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7714:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 7712:0:(o2iblnd_cb.c:459:kiblnd_rx_complete()) Rx from 10.0.250...@o2ib3 failed: 5
Jan 14 17:06:51 resmds01 kernel: Lustre: 5616:0:(o2iblnd_cb.c:1953:kiblnd_peer_connect_failed()) Deleting messages for 10.0.250...@o2ib3: connection failed
Jan 14 17:07:36 resmds01 kernel: LustreError: 11-0: an error occurred while communicating with 10.0.250...@o2ib3. The ost_connect operation failed with -19
Jan 14 17:13:55 resmds01 kernel: LustreError: 11-0: an error occurred while communicating with 0...@lo. The mds_connect operation failed with -16
Jan 14 17:14:20 resmds01 kernel: LustreError: 11-0: an error occurred while communicating with 0...@lo. The mds_connect operation failed with -16
..snip..
--
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Clustered MDS & OSS Servers

2010-01-18 Thread Jagga Soorma
Hi Guys,

I am working on clustering our MDS & OSS servers and wanted to make sure I
understand this correctly.  Can you please let me know if this sounds right:

a) Planning on having a floating virtual IP setup on the active MDS server
(ib1:1).  This is what the OSS's will use when doing their mkfs.  In an
outage this virtual IP address will migrate to the standby node.
b) On the oss's there is no need for a virtual IP that would need to fail
over in an outage.  I would simply have heartbeat mount the filesystems on
the other OSS node.

Please let me know if I missed anything.

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Clustered MDS & OSS Servers

2010-01-19 Thread Jagga Soorma
How would the OSS's and clients communicate with the MDS server in a
failover situation?

This is how I am doing things:

mds01: mkfs.lustre --fsname=fsname --mdt --mgs /dev/vgname/lvname

oss01: mkfs.lustre --ost --fsname=fsname --failnode=os...@o2ib3 --mgsnode=mds01@o2ib3 /dev/mapper/mpath0
oss02: mkfs.lustre --ost --fsname=fsname --failnode=os...@o2ib3 --mgsnode=mds01@o2ib3 /dev/mapper/mpath0

client01: mount -t lustre mds01...@o2ib3:/fsname /mnt

Now, if mds01 fails over to mds02, how would the client communicate with the
new MDS server if the IP changes?

What would the mkfs.lustre commands look like for an HA setup for MDS & OSS?


Also, is there a downfall for using a virtual IP for the MDS's?
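
(In case a concrete example helps, a hedged sketch of what the commands might
look like without any floating IP, on the assumption that mds01/mds02 share
the MDT LUN and oss01/oss02 share the OST LUNs; hostnames, NIDs and devices
are placeholders:)

--
# MDT, formatted on mds01, with mds02 declared as the failover partner:
mkfs.lustre --fsname=fsname --mdt --mgs --failnode=mds02@o2ib3 /dev/vgname/lvname

# OST, listing both MGS NIDs and its own failover partner:
mkfs.lustre --ost --fsname=fsname --failnode=oss02@o2ib3 \
    --mgsnode=mds01@o2ib3 --mgsnode=mds02@o2ib3 /dev/mapper/mpath0

# Clients list both MGS NIDs, colon-separated; if mds01 is unreachable the
# mount retries mds02:
mount -t lustre mds01@o2ib3:mds02@o2ib3:/fsname /mnt
--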

Thanks in advance for your assistance.
-J



On Tue, Jan 19, 2010 at 2:43 AM, Andreas Dilger  wrote:

> On 2010-01-19, at 13:01, Jagga Soorma wrote:
>
>> I am working on clustering our MDS & OSS servers and wanted to make sure I
>> understand this correctly.  Can you please let me know if this sounds right:
>>
>> a) Planning on having a floating virtual IP setup on the active MDS server
>> (ib1:1).  This is what the OSS's will use when doing their mkfs.  In an
>> outage this virtual IP address will migrate to the standby node.
>>
>
> This is not how Lustre failover works.  You need to assign a separate IP
> address for each MDS server.  Lustre handles multiple MDS failover nodes
> itself.
>
>
>  b) On the oss's there is no need for a virtual IP that would need to fail
>> over in an outage.  I would simply have heartbeat mount the filesystems on
>> the other OSS node.
>>
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Cluster Status Confusion - lctl dl

2010-01-19 Thread Jagga Soorma
Hi Guys,

I have unmounted my lustre filesystem from my clients and then unmounted all
of my OST's.  However, the lctl dl output from my active MDS server still
shows the OSTs as being UP:

lctl dl
  0 UP mgs MGS MGS 5
  1 UP mgc mgc10.0.250...@o2ib3 64453d97-c46b-6d8c-f4cf-7fbbac1f3aee 5
  2 UP mdt MDS MDS_uuid 3
  3 UP lov fsname-mdtlov fsname-mdtlov_UUID 4
  4 UP mds fsname-MDT fsname-MDT_UUID 3
  5 UP osc fsname-OST-osc fsname-mdtlov_UUID 5
  6 UP osc fsname-OST0001-osc fsname-mdtlov_UUID 5

Shouldn't the status for devices 5 & 6 no longer be UP?

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Heartbeat Failover Issues on MDS

2010-01-19 Thread Jagga Soorma
Hi Guys,

I am setting up a heartbeat cluster for my 2 MDS servers.  However, I am
running into the following issue.  If I power off the passive node and
heartbeat uncleanly shuts down, then after the server is brought back online
and the heartbeat services are started, all my resources are shut down even
though they are running on the active node, and then brought back online
automatically.  Am I missing some settings here?  Stickiness?  I have been
unable to get this to work.

Also do I need to disable the lvm2-monitor service on my MDS's?  Any
assistance would be greatly appreciated.
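
(A sketch of the two knobs that usually control this behaviour; which one
applies depends on whether heartbeat is running in v1 haresources mode or v2
CRM mode, so treat these as starting points rather than a recipe:)

--
# v1 (haresources) mode, in ha.cf: do not move resources back automatically
# when a failed node rejoins the cluster
auto_failback off

# v2 (CRM) mode: give running resources "infinite" stickiness so they stay put
crm_attribute -t crm_config -n default-resource-stickiness -v INFINITY
--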

Thanks in advance,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Ping resource in cib.xml

2010-01-19 Thread Jagga Soorma
Hi Guys,

None of the configurations that I am seeing for a clustered MDS have a
ping resource set up in their ha.cf.  How do people monitor the link that is
being used for lustre traffic and fail over if that link is no longer
available?

I am currently trying to make heartbeat fail over services if it is unable to
ping a specific IP address, but the ping resource is not working at all.

Any ideas would be appreciated.
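
(A rough sketch of the classic heartbeat v2 pingd setup; the address, score
and resource names are placeholders, and on x86_64 the daemon may live under
/usr/lib64/heartbeat instead:)

--
# ha.cf: declare the address(es) to ping and run the pingd daemon
ping 10.0.250.1
respawn hacluster /usr/lib/heartbeat/pingd -m 100 -d 5s

# pingd publishes a "pingd" node attribute; a location constraint in cib.xml
# then forbids the Lustre resource group from running where it is missing/zero:
#   <rsc_location id="lustre_needs_ping" rsc="lustre_group">
#     <rule id="pingd_gone" score="-INFINITY" boolean_op="or">
#       <expression id="pingd_not_set" attribute="pingd" operation="not_defined"/>
#       <expression id="pingd_zero" attribute="pingd" operation="lte" value="0"/>
#     </rule>
#   </rsc_location>
--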

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] Filesystem monitoring in Heartbeat

2010-01-21 Thread Jagga Soorma
Hi Guys,

My MDT is setup with LVM and I was able to test failover based on the Volume
Group failing on my MDS (by unplugging both fibre cables).  However, for my
OST's, I have created filesystems directly on the SAN luns and when I unplug
the fibre cables on my OSS, heartbeat does not detect failure for the
filesystem since it still shows as mounted.  Is there some way we can trigger
a failover based on multipath failing on the OSS?

Any assistance would be greatly appreciated.
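
(One hedged way to catch this: have heartbeat run a small monitor script that
fails when multipath reports no usable paths; the device list and the way the
script is hooked in are illustrative, the point is only the check itself:)

--
#!/bin/bash
# check_mpath.sh: exit non-zero if any listed multipath device has no active path
DEVICES="mpath5 mpath6 mpath7 mpath8 mpath9"
for d in $DEVICES; do
    # count path lines that multipath reports as active for this device
    active=$(multipath -ll "$d" 2>/dev/null | grep -c active)
    if [ "$active" -eq 0 ]; then
        echo "CRITICAL: $d has no active paths"
        exit 1
    fi
done
exit 0
--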

Thanks in advance,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] IO Error for NFS mounts on my SLES 11 Clients

2010-01-21 Thread Jagga Soorma
Hi Guys,

I just noticed that I was getting I/O errors for all my NFS mounted volumes
on my sles 11 clients.  This I/O error only happened for a few minutes and
was on the sles 11 servers with the lustre client modules installed.  There
were no other network related issues on that server or the filer.  Has
anyone experienced this at all?

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


[Lustre-discuss] /etc/init.d/openibd hanging system shutdown

2010-01-26 Thread Jagga Soorma
Hi Guys,

My SLES clients are hanging during a system shutdown, right after printing the
"Shutting down D-Bus Daemon ... done" message:

/etc/init.d/rc3.d: ls K08*
K08dbus  K08openibd

Has anyone noticed this on any of their clients if you are running sles?  I
have to kill the power on my client every time I need to shutdown.

Any help would be greatly appreciated.

Thanks,
-J
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] /etc/init.d/openibd hanging system shutdown

2010-01-26 Thread Jagga Soorma
Did some more digging around and the mlx4_ib module is not unloading:

..snip..
+ '[' '!' -z 'ERROR: Removing '\''mlx4_ib'\'': Device or resource busy' ']'
..snip..

# rmmod mlx4_ib
ERROR: Removing 'mlx4_ib': Device or resource busy

#lctl net down
LNET busy

How can I shutdown lctl network?  I do not have any lustre fs's mounted on
my client anymore.

Thanks,
-J

On Tue, Jan 26, 2010 at 3:10 PM, Jagga Soorma  wrote:

> Hi Guys,
>
> My SLES clients are hanging during a system shutdown right after giving me
> a message of Shutting down D-Bus Daemon done:
>
> /etc/init.d/rc3.d: ls K08*
> K08dbus  K08openibd
>
> Has anyone noticed this on any of their clients if you are running sles?  I
> have to kill the power on my client every time I need to shutdown.
>
> Any help would be greatly appreciated.
>
> Thanks,
> -J
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] /etc/init.d/openibd hanging system shutdown

2010-01-27 Thread Jagga Soorma
Thanks Erik,

lustre_rmmod took care of stopping and unloading the lnet module, which is
apparently what needed to happen for openibd not to hang during a reboot.
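
(For anyone hitting the same hang, a short sketch of the order that worked
here, with the usual SLES paths; it just has to run before openibd's own
shutdown script:)

--
# unmount any Lustre filesystems first
umount -a -t lustre

# unload the Lustre and LNet modules in the right order; this is what finally
# released mlx4_ib in this thread
lustre_rmmod

# now the IB stack can unload cleanly
/etc/init.d/openibd stop
--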

Thanks again for your assistance.

-J

On Tue, Jan 26, 2010 at 8:19 PM, Erik Froese  wrote:

> try lctl net down or lctl net unconfigure
>
> If those fail run lustre_rmmod
>
> Erik
>
> On Tue, Jan 26, 2010 at 6:56 PM, Jagga Soorma  wrote:
>
>> Did some more digging around and the mlx4_ib module is not unloading:
>>
>> ..snip..
>> + '[' '!' -z 'ERROR: Removing '\''mlx4_ib'\'': Device or resource busy'
>> ']'
>> ..snip..
>>
>> # rmmod mlx4_ib
>> ERROR: Removing 'mlx4_ib': Device or resource busy
>>
>> #lctl net down
>> LNET busy
>>
>> How can I shutdown lctl network?  I do not have any lustre fs's mounted on
>> my client anymore.
>>
>> Thanks,
>> -J
>>
>>
>> On Tue, Jan 26, 2010 at 3:10 PM, Jagga Soorma  wrote:
>>
>>> Hi Guys,
>>>
>>> My SLES clients are hanging during a system shutdown right after giving
>>> me a message of Shutting down D-Bus Daemon done:
>>>
>>> /etc/init.d/rc3.d: ls K08*
>>> K08dbus  K08openibd
>>>
>>> Has anyone noticed this on any of their clients if you are running sles?
>>> I have to kill the power on my client every time I need to shutdown.
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Thanks,
>>> -J
>>>
>>
>>
>> ___
>> Lustre-discuss mailing list
>> Lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>>
>
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss