Re: [lustre-discuss] Understanding quotas

2016-07-18 Thread Nate Pearlstein
I'll hazard a guess here and say that, because Lustre quotas are 
"quantized," space is granted to the OSTs in chunks, so it is possible to slightly 
exceed one's quota depending on how many OST objects were granted in order to 
satisfy an I/O request.  See the quotas chapter of the Lustre operations manual.  It 
looks like the amount granted is now exposed in the lfs quota output.
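
For example, from a client the per-target breakdown can be pulled with something 
like this (the mount point is just an example here, and older lfs versions may 
not show a granted column):

$ lfs quota -v -u 20977 /mnt/scratch

The -v output lists usage and limits per target (MDT/OST), which is where the 
over-grant tends to show up.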

> On Jul 18, 2016, at 7:00 AM, Gibbins, Faye  wrote:
> 
> Hi,
>  
> Thanks for replying. It's good to know that my emails are getting through. :-)
>  
> Hmm. 512-byte blocks, you say? That's confusing.
>  
> I know that my used quota is "88120" K, but the setting from 
> /proc/fs/lustre/qmt/scratch-QMT/dt-0x0/glb-usr is 117440512, so surely that 
> would make the units of this value 1332 bytes.
>  
> Sorry, I might be a little blonde here. How does 117440512 x 512 bytes = 86M?
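
For what it's worth, the raw arithmetic on those numbers comes out as:

  88120 KiB                       ~ 86 MiB   (the used quota)
  117440512 taken as bytes        = 112 MiB
  117440512 taken as KiB          = 112 GiB
  117440512 taken as 512B blocks  = 56 GiB

so the 512-byte-block reading does not actually land on 86M; reading the granted 
value in KiB is what lines up with the "nearer 112G" figure in the original 
question further down.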
>  
> Faye
>  
>  
>  
>  
>  
> From: Riccardo Veraldi [mailto:riccardo.vera...@cnaf.infn.it] 
> Sent: 14 July 2016 01:39
> To: Gibbins, Faye 
> Subject: Re: [lustre-discuss] Understanding quotas
>  
> 
> To answer you: yes, the mail did pass through the list.
> I am not sure of the meaning of "granted" or I would have answered you; I am 
> not using quotas.
> Anyway, it looks like it really is your used quota if we suppose that 117440512 
> is a count of 512-byte blocks; this matches 86M.
> 
> Hope I have been helpful.
> 
> 
> 
> On 11/07/16 07:57, Gibbins, Faye wrote:
> Hi,
>  
> Could someone please help me understand lustre quotas?
>  
> I’ve created a lustre filesystem called “scratch” using RHEL 7.2 and Lustre 
> 2.8. When I run “lctl get_param qmt.scratch-QMT.dt-0x0.*” on the MDT I 
> see for my ID the following:
>  
> - id:  20977
>   limits:  { hard:6, soft:5, granted: 117440512, time:0 }
>  
> Can someone tell me what the "granted" field represents? I know it's not my 
> used quota; that's only 86M, and this value is nearer 112G.
>  
> Yours
> Faye Gibbins
>  
> Snr Systems Administrator, Unix Lead Architect (Software)
> Cirrus Logic | cirrus.com | +44 131 272 7398
> 
>  
>  
>  
> 
> 
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Error Lustre/multipath/storage

2016-03-28 Thread Nate Pearlstein
I thought I had responded to the entire list, but my reply only went to Angelo:

Very likely, Lustre on the OSS nodes is setting max_sectors_kb all the way 
up to max_hw_sectors_kb, and that value ends up being too large for the SAS HCA. 
 You should set max_sectors for your mpt2sas driver to something smaller, like 4096, 
and rebuild the initrd; this will put a saner limit on max_hw_sectors_kb for 
the IS5600 LUNs…
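
Concretely, on the OSS nodes that would look roughly like this (the conf file 
name and the sdX device are only examples; max_sectors is the mpt2sas module 
parameter, which as far as I recall is counted in 512-byte sectors):

# echo "options mpt2sas max_sectors=4096" > /etc/modprobe.d/mpt2sas.conf
# dracut -f     # rebuild the initrd so the limit is applied at boot
# reboot

and afterwards sanity-check what the block layer ended up with:

# cat /sys/block/sdi/queue/max_hw_sectors_kb
# cat /sys/block/sdi/queue/max_sectors_kb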


> On Mar 28, 2016, at 6:51 PM, Dilger, Andreas  wrote:
> 
> On 2016/03/28, 08:01, "lustre-discuss on behalf of Angelo Cavalcanti" 
> <lustre-discuss-boun...@lists.lustre.org on behalf of acrribe...@gmail.com> 
> wrote:
> 
> 
> Dear all,
> 
> We're having trouble with a Lustre 2.5.3 deployment. This is our setup:
> 
> 
>  *   One server for MGS/MDS/MDT. The MDT is served from a RAID-6 backed partition 
> of 2TB (what type of disk?)
> 
> Note that using RAID-6 for the MDT storage will significantly hurt your 
> metadata
> performance, since this will incur a lot of read-modify-write overhead when 
> doing
> 4KB metadata block updates.
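
(For a write smaller than a full stripe, RAID-6 generally has to read the old 
data block and both old parity blocks, then write back the new data plus the 
recomputed P and Q, so a single 4KB metadata update turns into roughly six disk 
I/Os. That is why RAID-1/RAID-10 is the usual recommendation for MDT storage.)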
> 
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel High Performance Data Division
> 
> 
>  *   Two OSS/OST nodes in an active/active HA configuration with Pacemaker. Both are 
> connected to the storage via SAS.
> 
> 
>  *   One SGI Infinite Storage IS5600 with two RAID-6 backed volume groups. 
> Each group has two volumes, and each volume has 15TB capacity.
> 
> 
> The volumes are recognized by the OSSs as multipath devices; each volume has 4 paths. 
> The volumes were created with a GPT partition table and a single partition.
> 
> 
> Volume partitions were then formatted as OSTs with the following command:
> 
> 
> # mkfs.lustre --replace --reformat --ost --mkfsoptions=" -E 
> stride=128,stripe_width=1024" 
> --mountfsoptions="errors=remount-ro,extents,mballoc" --fsname=lustre1 
> --mgsnode=10.149.0.153@o2ib1 --index=0 --servicenode=10.149.0.151@o2ib1 
> --servicenode=10.149.0.152@o2ib1 
> /dev/mapper/360080e500029eaec012656951fcap1
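
As an aside, with the default 4KB ldiskfs block size those mkfs options work 
out to:

  stride=128        -> 128 x 4KB = 512KB segment per data disk
  stripe_width=1024 -> 1024 x 4KB = 4MB full stripe, i.e. 1024/128 = 8 data disks

so they only line up with the IS5600 volumes if the RAID-6 groups really are 
8+2 with a 512KB segment size.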
> 
> 
> Testing with bonnie++ in a client with the below command:
> 
> $ ./bonnie++-1.03e/bonnie++ -m lustre1 -d /mnt/lustre -s 128G:1024k -n 0 -f 
> -b -u vhpc
> 
> 
> No problem creating files inside the lustre mount point, but *rewriting* the 
> same files results in the errors below:
> 
> 
> Mar 18 17:46:13 oss01 multipathd: 8:128: mark as failed
> Mar 18 17:46:13 oss01 multipathd: 360080e500029eaec012656951fca: remaining active paths: 3
> Mar 18 17:46:13 oss01 kernel: sd 1:0:0:0: [sdi] Unhandled error code
> Mar 18 17:46:13 oss01 kernel: sd 1:0:0:0: [sdi] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
> Mar 18 17:46:13 oss01 kernel: sd 1:0:0:0: [sdi] CDB: Read(10): 28 00 00 06 d8 22 00 20 00 00
> Mar 18 17:46:13 oss01 kernel: __ratelimit: 109 callbacks suppressed
> Mar 18 17:46:13 oss01 kernel: device-mapper: multipath: Failing path 8:128.
> Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] Unhandled error code
> Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
> Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] CDB: Read(10): 28 00 00 07 18 22 00 18 00 00
> Mar 18 17:46:13 oss01 kernel: device-mapper: multipath: Failing path 8:192.
> Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] Unhandled error code
> Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
> Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] CDB: Read(10): 28 00 00 06 d8 22 00 20 00 00
> Mar 18 17:46:13 oss01 kernel: sd 0:0:1:0: [sde] Unhandled error code
> Mar 18 17:46:13 oss01 kernel: sd 0:0:1:0: [sde] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
> Mar 18 17:46:13 oss01 kernel: sd 0:0:1:0: [sde] CDB: Read(10): 28 00 00 07 18 22 00 18 00 00
> Mar 18 17:46:13 oss01 kernel: device-mapper: multipath: Failing path 8:64.
> Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] Unhandled error code
> Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
> Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 07 18 22 00 18 00 00
> Mar 18 17:46:13 oss01 kernel: device-mapper: multipath: Failing path 8:0.
> Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] Unhandled error code
> Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
> Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 06 d8 22 00 20 00 00
> Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec012656951fca: sdi - rdac checker reports path is up
> Mar 18 17:46:14 oss01 multipathd: 8:128: reinstated
> Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec012656951fca: remaining active paths: 4
> Mar 18 17:46:14 oss01 kernel: sd 1:0:0:0: [sdi] Unhandled error code
> Mar 18 17:46:14 oss01 kernel: sd 1:0:0:0: [sdi] Result: hostbyte=DID_S

[Lustre-discuss] tunefs.lustre --print fails on mounted mdt/ost with mmp

2010-07-14 Thread Nate Pearlstein
Just checking to be sure this isn't a known bug or problem.  I couldn't
find a bugzilla entry for this, but it would appear that tunefs.lustre --print fails
on a Lustre MDT or OST device if it is mounted with MMP enabled.

Is this expected behavior?

TIA

mds1-gps:~ # tunefs.lustre --print /dev/mapper/mdt1
checking for existing Lustre data: not found

tunefs.lustre FATAL: Device /dev/mapper/mdt1 has not been formatted with
mkfs.lustre
tunefs.lustre: exiting with 19 (No such device)
mds1-gps:~ # 


The command works fine when the device is unmounted.
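
One way to double-check that MMP is the variable here (device path as in the 
example above):

mds1-gps:~ # dumpe2fs -h /dev/mapper/mdt1 | grep -i mmp

If mmp shows up in the feature list, then --print only failing while the target 
is mounted at least lines up with the MMP protection being active on the device.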

-- 
Sent from my wired giant hulking workstation

Nate Pearlstein - npe...@sgi.com - Product Support Engineer



___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Modifying Lustre network (good practices)

2010-05-20 Thread Nate Pearlstein
Which bonding method are you using?  Has the performance always been
this way?  Depending on which bonding type you are using and the network
hardware involved, you might see the behavior you are describing.
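
A quick way to check:

# grep -i "bonding mode" /proc/net/bonding/bond0
# lctl list_nids

The first shows whether bond0 is balance-rr, 802.3ad, active-backup, etc. (and 
with 802.3ad the hash policy and the switch configuration matter as much as the 
bond itself); the second confirms which NID LNET is actually using.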


On Thu, 2010-05-20 at 16:27 +0200, Olivier Hargoaa wrote:
> Dear All,
> 
> We have a cluster holding critical data on Lustre. On this cluster there are 
> three networks on each Lustre server and client: one Ethernet network 
> for administration (eth0), and two other Ethernet networks configured in 
> bonding (bond0: eth1 & eth2). On Lustre we get poor read performance but 
> good write performance, so we decided to modify the Lustre network in 
> order to see whether the problem comes from the network layer.
> 
> Currently the Lustre network is bond0. We want to set it to eth0, then eth1, 
> then eth2, and finally back to bond0, in order to compare performance.
> 
> Therefore, we'll perform the following steps: we will unmount the 
> filesystem, reformat the MGS, change the lnet options in the modprobe file, 
> start the new MGS server, and finally modify our OSTs and MDT with 
> tunefs.lustre, re-specifying the failover and new MGS NIDs using the 
> "--erase-params" and "--writeconf" options.
> 
> We tested this successfully on a test filesystem, but we read in the manual 
> that it can be really dangerous. Do you agree with this procedure? Do 
> you have any advice or best practices for this kind of change? What is 
> the danger?
> 
> Regards.
> 

-- 
Sent from my wired giant hulking workstation

Nate Pearlstein - npe...@sgi.com - Product Support Engineer


___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss