Re: [lustre-discuss] changing the lnet IP addresses

2018-07-16 Thread Cowe, Malcolm J
Also refer to this discussion from earlier in the year:

http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2018-January/015258.html

Malcolm.
 

On 14/7/18, 12:59 am, "lustre-discuss on behalf of Andreas Dilger" 
 
wrote:

There is an "lctl replace_nids" command that does this. I believe
it is documented in the lctl(8) man page as well as the user manual. 
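
A minimal usage sketch (the target and NID values are placeholders; see lctl(8) for
the exact syntax in your version):

# run on the node holding the MGS, with the MGS/MDT target started with "-o nosvc"
lctl replace_nids <devicename> <nid1>[,<nid2>,...]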

Cheers, Andreas

> On Jul 13, 2018, at 09:40, Lydia Heck  wrote:
> 
> 
> Dear all,
> 
> we are in the unfortunate position that we will have to change the lustre 
IP addresses on an existing lustre filesystem.
> 
> Can anybody point me to documentation describing how to do that 
non-destructively?
> 
> Lydia
> 
> 
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org




Re: [lustre-discuss] ZFS based OSTs need advice

2018-06-26 Thread Cowe, Malcolm J
You can create pools and format the storage on a single node, provided that the 
correct `--servicenode` parameters are applied to the format command (i.e. the 
NIDs for each OSS in the HA pair). Then export half of the ZFS pools from the 
first node and import them to the other node.
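
A rough sketch of that workflow (the pool name, fsname, index and NIDs below are
made-up placeholders, not taken from your configuration):

# on oss1: format an OST in an existing ZFS pool, listing the NIDs of both OSS nodes
mkfs.lustre --ost --backfstype=zfs --fsname=demo --index=0 \
    --mgsnode=10.0.0.10@o2ib \
    --servicenode=10.0.0.21@o2ib --servicenode=10.0.0.22@o2ib \
    ostpool0/ost0

# then hand half of the pools over to the partner node
zpool export ostpool0     # on oss1
zpool import ostpool0     # on oss2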

There is some documentation that describes the process here:

http://wiki.lustre.org/Category:Lustre_Systems_Administration

This includes sections on HA with Pacemaker:

http://wiki.lustre.org/Managing_Lustre_as_a_High_Availability_Service
http://wiki.lustre.org/Creating_a_Framework_for_High_Availability_with_Pacemaker
http://wiki.lustre.org/Lustre_Server_Fault_Isolation_with_Pacemaker_Node_Fencing
http://wiki.lustre.org/Creating_Pacemaker_Resources_for_Lustre_Storage_Services


For OSD and OSS stuff:

http://wiki.lustre.org/ZFS_OSD_Storage_Basics
http://wiki.lustre.org/Introduction_to_Lustre_Object_Storage_Devices_(OSDs)
http://wiki.lustre.org/Creating_Lustre_Object_Storage_Services_(OSS)

There are also sections that cover the MGT and MDTs.

Malcolm.


From: lustre-discuss  on behalf of 
Zeeshan Ali Shah 
Date: Wednesday, 27 June 2018 at 1:53 am
To: Lustre discussion 
Subject: Re: [lustre-discuss] ZFS based OSTs need advice

Our OSTs are based on the Supermicro SSG-J4000-LUSTRE-OST, which is a kind of JBOD.


All 360 disks (90 disks x 4 OSTs) appear in /dev/disk on both OSS1 and OSS2.

My idea is to create raidz2 ZFS pools of 9+2 (plus spares), which means around 36 
zpools will be created.

Q1) Of the 36 ZFS pools, shall I create all 36 on OSS1? In that case those pools 
can only be imported on OSS1, not on OSS2, so how do I get active/active HA?
Q2) The second option is to create 18 zpools on OSS1 and 18 on OSS2, then in 
mkfs.lustre specify OSS1 as primary and OSS2 as secondary (run on OSS1), and the 
second time run the same command on OSS2 with OSS2 as primary and OSS1 as 
secondary.

Does that make sense? Am I missing something?

Thanks a lot


/Zee


On Tue, Jun 26, 2018 at 5:38 PM, Dzmitryj Jakavuk 
mailto:dzmit...@gmail.com>> wrote:
Hello

You can share the 4 OSTs between the pair of OSS nodes, importing 2 OSTs on one 
OSS and 2 OSTs on the other. At the same time the HDDs need to be shared between 
all OSS nodes. So under normal conditions one OSS will import 2 OSTs and the 
second OSS will import the other 2 OSTs; in case of HA, a single OSS can import 
all 4 OSTs.

Kind Regards
Dzmitryj Jakavuk

> On Jun 26, 2018, at 16:02, Zeeshan Ali Shah 
> mailto:javacli...@gmail.com>> wrote:
>
> We have 2 OSS with 4 shared OSTs. Each OST has 90 disks, so 360 disks in total.
>
> I am in the phase of installing the 2 OSS as active/active, but since ZFS pools 
> can only be imported on a single OSS host, how do I achieve active/active HA in 
> this case? From what I read, for active/active both HA hosts should have access 
> to the same sets of disks/volumes.
>
> any advice ?
>
>
> /Zeeshan
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] dealing with maybe dead OST

2018-06-19 Thread Cowe, Malcolm J
Would using hard links work, instead of mv?

Malcolm.
 

On 20/6/18, 1:34 am, "lustre-discuss on behalf of Robin Humble" 
 wrote:

Hi,

so we've maybe lost 1 OST out of a filesystem with 115 OSTs. we may
still be able to get the OST back, but it's been a month now so
there's pressure to get the cluster back and working and leave the
files missing for now...

the complication is that because the OST might come back to life we
would like to avoid the users rm'ing their broken files and potentially
deleting them forever.

lustre is 2.5.41 ldiskfs centos6.x x86_64.

ideally I think we'd move all the ~2M files on the OST to a root access
only "shadow" directory tree in lustre that's populated purely with
files from the dead OST.
if we manage to revive the OST then these can magically come back to
life and we can mv them back into their original locations.

but currently
  mv: cannot stat 'some_file': Cannot send after transport endpoint shutdown
the OST is deactivated on the client. the client hangs if the OST isn't
deactivated. the OST is still UP & activated on the MDS.

is there a way to mv files when their OST is unreachable?

seems like mv is an MDT operation so it should be possible somehow?


the only thing I've thought of seems pretty out there...
mount the MDT as ldiskfs and mv the affected files into the shadow
tree at the ldiskfs level.
ie. with lustre running and mounted, create an empty shadow tree of
all dirs under eg. /lustre/shadow/, and then at the ldiskfs level on
the MDT:
  for f in <list of affected files>; do
 mv /mnt/mdt0/ROOT/$f /mnt/mdt0/ROOT/shadow/$f
  done

would that work?
maybe we'd also have to rebuild OI's and lfsck - something along the
lines of the MDT restore procedure in the manual. hopefully that would
all work with an OST deactivated.


alternatively, should we just unlink all the currently dead files from
lustre now, and then if the OST comes back can we reconstruct the paths
and filenames from the FID in xattrs's on the revived OST?
I suspect unlink is final though and this wouldn't work... ?

we can also take an lvm snapshot of the MDT and refer to that later I
suppose, but I'm not sure how that might help us.

as you can probably tell I haven't had to deal with this particular
situation before :)

thanks for any help.

cheers,
robin
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org




Re: [lustre-discuss] Adding a new NID

2018-01-07 Thread Cowe, Malcolm J
There are, to my knowledge, a couple of open bugs related to the “lctl 
replace_nids” command that you should review prior to committing to a change:

https://jira.hpdd.intel.com/browse/LU-8948
https://jira.hpdd.intel.com/browse/LU-10384

Some time ago, I wrote a d[r]aft guide on how to manage relatively complex LNet 
server configs, including the long-hand method for changing server NIDs. I 
thought this had made it onto the community wiki but I appear to be mistaken. I 
don’t have time to make a mediawiki version, but I’ve uploaded a PDF version 
here:

http://wiki.lustre.org/File:Defining_Multiple_LNet_Interfaces_for_Multi-homed_Servers,_v1.pdf

YMMV, there’s no warranty, whether express or implied, and I assume no 
liability, etc. ☺

Nevertheless, I hope this helps, at least as a cross-reference.

Malcolm.

From: lustre-discuss  on behalf of 
"Vicker, Darby (JSC-EG311)" 
Date: Saturday, 6 January 2018 at 11:11 am
To: Lustre discussion 
Cc: "Kirk, Benjamin (JSC-EG311)" 
Subject: Re: [lustre-discuss] Adding a new NID

Sorry – one other question.  We are configured for failover too. Will the "lctl 
replace_nids" do the right thing or should I do the tunefs to make sure all the 
failover pairs get updated properly?  This is what our tunefs command would 
look like for an OST:

   tunefs.lustre \
   --dry-run \
   --verbose \
   --writeconf \
   --erase-param \
   --mgsnode=192.52.98.30@tcp0,10.148.0.30@o2ib0,10.150.100.30@o2ib1 \
   --mgsnode=192.52.98.31@tcp0,10.148.0.31@o2ib0,10.150.100.31@o2ib1 \
   
--servicenode=${LUSTRE_LOCAL_TCP_IP}@tcp0,${LUSTRE_LOCAL_IB_L1_IP}@o2ib0,${LUSTRE_LOCAL_IB_EUROPA_IP}@o2ib1
 \
   
--servicenode=${LUSTRE_PEER_TCP_IP}@tcp0,${LUSTRE_PEER_IB_L1_IP}@o2ib0,${LUSTRE_PEER_IB_EUROPA_IP}@o2ib1
 \
   $pool/ost-fsl

Our original mkfs.lustre options looked about like that, sans the o2ib1 NIDs.  
I'm worried that the "lctl replace_nids" command won't know how to update the 
mgsnode and servicenode properly.  Is replace_nids smart enough for this?

From: lustre-discuss  on behalf of 
Darby Vicker 
Date: Friday, January 5, 2018 at 5:16 PM
To: Lustre discussion 
Subject: [non-nasa source] [lustre-discuss] Adding a new NID

Hello everyone,

We have an existing LFS that is dual-homed on ethernet (mainly for our 
workstations) and IB (for the computational cluster), ZFS backend for the MDT 
and OST's.  We just got a new computational cluster and need to add another IB 
NID.  The procedure for doing this is straight forward (14.5 in the admin 
manual) and amounts to:

Unmount the clients
Unmount the MDT
Unmount all OSTs
mount -t lustre <MDT partition> -o nosvc <mount_point>
lctl replace_nids devicename nid1[,nid2,nid3 ...]
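
A rough command-level sketch of those steps (device names, mount points and the 
NID values below are illustrative placeholders):

umount /mnt/lustre                                  # on every client
umount /mnt/mdt                                     # on the MDS
umount /mnt/ost*                                    # on every OSS
mount -t lustre -o nosvc /dev/mdt_device /mnt/mdt   # starts the MGS only
# list every NID for the target, comma separated; repeat per server target that changes
lctl replace_nids demo-MDT0000 192.52.98.30@tcp0,10.148.0.30@o2ib0,10.150.100.30@o2ib1
umount /mnt/mdt                                     # then restart servers and remount clients as usual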

We haven't had to update a NID in a while so I was happy to see you can do this 
with "lctl replace_nids" instead of "tunsfs.lustre --writeconf".

I know this is dangerous, but we will sometime make minor changes to the 
servers by unmounting lustre on the servers (but leaving the clients up), make 
the changes, then remount the servers.  If we are confident we can do this 
quickly, the clients recover just fine.

While this isn't such a minor change, I'm a little tempted to do that in this 
case since nothing will really change for the existing clients – they don't 
need the new NID.  Am I asking for trouble here or do you think I can get away 
with this?  I'm not too concerned about the possibility of it taking too long 
and getting the existing clients evicted.   I'm (obviously) more concerned 
about doing something that would lead to corrupting the FS.  I should probably 
schedule an outage and do this right but... :)

Darby
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Announce: Lustre Systems Administration Guide

2017-11-16 Thread Cowe, Malcolm J
I am pleased to announce the availability of a new systems administration guide 
for the Lustre file system, which has been published to wiki.lustre.org. The 
content can be accessed directly from the front page of the wiki, or from the 
following URL:

http://wiki.lustre.org/Category:Lustre_Systems_Administration

The guide is intended to provide comprehensive instructions for the 
installation and configuration of production-ready Lustre storage clusters. 
Topics covered:


  1.  Introduction to Lustre
  2.  Lustre File System Components
  3.  Lustre Software Installation
  4.  Lustre Networking (LNet)
  5.  LNet Router Configuration
  6.  Lustre Object Storage Devices (OSDs)
  7.  Creating Lustre File System Services
  8.  Mounting a Lustre File System on Client Nodes
  9.  Starting and Stopping Lustre Services
  10. Lustre High Availability

Refer to the front 
page of the 
guide for the complete table of contents.

In addition, for people who are new to Lustre, there is a high-level 
introduction to Lustre concepts, available as a PDF download:

http://wiki.lustre.org/images/6/64/LustreArchitecture-v4.pdf


Malcolm Cowe
High Performance Data Division

Intel Corporation | www.intel.com

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] mdt mounting error

2017-11-01 Thread Cowe, Malcolm J
Is the MDT being mounted on the same node as the MGT? The ordering of the 
servicenode flags is (or was) significant for the first time the target is 
mounted, and if the services will run on different nodes, the servicenode 
parameters on the MDT should be swapped. And confirm that the /mdt directory 
exists on the server being used to mount the target.
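
A few quick things to check on the MDS (just suggestions):

mkdir -p /mdt          # make sure the mount point exists on this node
lctl list_nids         # confirm the local o2ib NID matches the first --servicenode given at format time
dmesg | tail -n 50     # the kernel log usually carries a more specific error than mount.lustre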

Malcolm.

From: lustre-discuss  on behalf of 
Parag Khuraswar 
Date: Wednesday, 1 November 2017 at 10:21 pm
To: 'Raj' , Lustre discussion 

Subject: Re: [lustre-discuss] mdt mounting error

Hi,

For mgt –
mkfs.lustre --servicenode=10.2.1.204@o2ib --servicenode=10.2.1.205@o2ib --mgs 
/dev/mapper/mpathc

For mdt
mkfs.lustre --fsname=home --mgsnode=10.2.1.204@o2ib --mgsnode=10.2.1.205@o2ib 
--servicenode=10.2.1.204@o2ib --servicenode=10.2.1.205@o2ib --mdt --index=0 
/dev/mapper/mpatha

Regards,
Parag


From: Raj [mailto:rajgau...@gmail.com]
Sent: Wednesday, November 1, 2017 4:46 PM
To: Parag Khuraswar; Lustre discussion
Subject: Re: [lustre-discuss] mdt mounting error

What options in mkfs.lustre did you use to format with lustre?
On Wed, Nov 1, 2017 at 6:14 AM Parag Khuraswar 
> wrote:
Hi Raj,

Yes, /dev/mapper/mpatha available.
I could format and mount using ext4.

Regards,
Parag


From: Raj [mailto:rajgau...@gmail.com]
Sent: Wednesday, November 1, 2017 4:39 PM
To: Parag Khuraswar; Lustre discussion
Subject: Re: [lustre-discuss] mdt mounting error

Parag,
Is the device /dev/mapper/mpatha available?
If not, the multipathd may not have started or the multipath configuration may 
not be correct.
On Wed, Nov 1, 2017 at 5:18 AM Parag Khuraswar 
> wrote:
Hi,

I am getting the error below while mounting the MDT. The MGT is mounted.

Please suggest

[root@mds2 ~]# mount -t lustre /dev/mapper/mpatha /mdt
mount.lustre: mount /dev/mapper/mpatha at /mdt failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)

Regards,
Parag


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] mounting failover MGS and MDT

2017-10-12 Thread Cowe, Malcolm J
There’s also documentation on wiki.lustre.org, which provides more of a 
walkthrough of the process:

http://wiki.lustre.org/Category:Lustre_Systems_Administration

See section 6 in the table of contents for how to format MGT, MDT, OSTs:


  *   http://wiki.lustre.org/Creating_the_Lustre_Management_Service_(MGS)
  *   http://wiki.lustre.org/Creating_the_Lustre_Metadata_Service_(MDS)
  *   http://wiki.lustre.org/Creating_Lustre_Object_Storage_Services_(OSS)

There are also follow-on sections for mounting Lustre on clients, the startup 
and shutdown sequence for services, and how to configure Pacemaker and Corosync 
for HA (on CentOS / RHEL).


Malcolm.


From: lustre-discuss  on behalf of 
Cory Spitz 
Date: Friday, 13 October 2017 at 12:12 am
To: Ravi Konila , Lustre discussion 

Subject: Re: [lustre-discuss] mounting failover MGS and MDT

Ravi,

The section of the Lustre Operations Manual on configuration should be what you 
need: http://doc.lustre.org/lustre_manual.xhtml#configuringlustre and 
http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.50438194_88063.  Simply skip 
the --mdt option when making the MGT like so: mkfs.lustre --fsname=fsname --mgs 
/dev/block_device.
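
For the --servicenode part of the question, one possible sketch (the NIDs are 
placeholders for your two MDS nodes, and <fsname> is whatever name you choose; 
mpathb/mpathc are the partitions from your message):

mkfs.lustre --mgs \
    --servicenode=<mds1_nid> --servicenode=<mds2_nid> /dev/mapper/mpathb

mkfs.lustre --fsname=<fsname> --mdt --index=0 \
    --mgsnode=<mds1_nid> --mgsnode=<mds2_nid> \
    --servicenode=<mds1_nid> --servicenode=<mds2_nid> /dev/mapper/mpathc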

Please refer to the documentation for more details.

-Cory

--


From: lustre-discuss  on behalf of 
Ravi Konila 
Reply-To: Ravi Konila 
Date: Wednesday, October 11, 2017 at 11:43 PM
To: Lustre Discuss 
Subject: [lustre-discuss] mounting failover MGS and MDT

Hi

I am trying to configure Lustre 2.8 with two nodes for MGS/MDT with failover.
I have a single shared storage array attached to the MDS servers. My shared 
storage partitions are:

mpathb – 100G for MGT
mpathc – 800G for MDT

Now what is the command to create MGS and MDS from mkfs.lustre?
Should I create MGS first and later MDS? How do I specify servicenode option?

Any help is highly appreciated.

Regards

Ravi Konila
Sr. Technical Consultant


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] rpm installation

2017-10-10 Thread Cowe, Malcolm J
It is possible that you are trying to install more than you need for your 
machine. There is an explanation of the installation process on wiki.lustre.org:

http://wiki.lustre.org/Installing_the_Lustre_Software

Malcolm.


From: lustre-discuss  on behalf of 
Parag Khuraswar 
Date: Tuesday, 10 October 2017 at 11:54 pm
To: Lustre discussion 
Subject: [lustre-discuss] rpm installation

Hi,

While installing the Lustre RPMs I am getting the errors below

=

Error: Package: lustre-dkms-2.10.0-1.el7.noarch 
(/lustre-dkms-2.10.0-1.el7.noarch)
   Requires: dkms >= 2.2.0.3-28.git.7c3e7c5
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(nvlist_unpack) = 0x1cd81596
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(sa_setup) = 0xf4af989e
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_object_next) = 0x4a72152f
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_request_arcbuf) = 0xd877830c
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_buf_hold_array_by_bonus) = 0x330ef227
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_tx_assign) = 0x4cad2510
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(sa_bulk_lookup) = 0xcbb17a27
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_tx_hold_spill) = 0x5f1e8c34
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dsl_pool_config_exit) = 0xfe90cd42
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(sa_handle_destroy) = 0xd3e92078
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_object_alloc) = 0xc6fd1135
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(sa_lookup) = 0x6c565287
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_free_long_range) = 0x3321cb2f
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_tx_hold_zap) = 0x0921e256
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(zap_remove_uint64) = 0xcb5497ca
Error: Package: lustre-dkms-2.10.0-1.el7.noarch 
(/lustre-dkms-2.10.0-1.el7.noarch)
   Requires: spl-dkms >= 0.6.1
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(__cv_broadcast) = 0x97fb9a11
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(nvlist_pack) = 0x424ac2e1
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(spa_writeable) = 0xbc5c21ea
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(sa_replace_all_by_template) = 0xf981e3d2
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dsl_prop_register) = 0x2869f407
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(spa_freeze) = 0x404a0201
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(zap_cursor_retrieve) = 0x148b243f
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(spl_panic) = 0xbc32eee7
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(sa_update) = 0x2fe1928c
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(nvlist_lookup_byte_array) = 0xcb59902f
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 
(/kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64)
   Requires: ksym(dmu_objset_own) = 0x861d7fa7
Error: Package: kmod-lustre-osd-zfs-2.10.0-1.el7.x86_64 

Re: [lustre-discuss] Lustre 2.10 and RHEL74

2017-09-05 Thread Cowe, Malcolm J
One possibility is that the kernel-abi-whitelists.noarch package is not 
installed – although I’ve certainly compiled Lustre without this package in the 
past on RHEL 7.3.

I believe that the project quota patches for LDISKFS break KABI compatibility, 
so it is possible this is what is causing the build to fail. If so, then you 
can either remove the “vfs-project-quotas-rhel7.patch” from the patch series 
for the server kernel (which will remove project quota support), or disable the 
kabi check when compiling the kernel. For example:

_TOPDIR=`rpm --eval %{_topdir}`
rpmbuild -ba --with firmware --with baseonly \
--without kabichk \
--define "buildid _lustre" \
--target x86_64 \
$_TOPDIR/SPECS/kernel.spec

Malcolm.

On 6/9/17, 10:30 am, "lustre-discuss on behalf of Riccardo Veraldi" 
 wrote:

Hello,
Is it foreseen that Lustre 2.10.* will be compatible with RHEL74?
I tried Lustre 2.10.52 but it complains about kABI.

thank you

Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org




Re: [lustre-discuss] nodes crash during ior test

2017-08-07 Thread Cowe, Malcolm J
I’ve created a Benchmarking process outline and tools overview here:

http://wiki.lustre.org/Category:Benchmarking

This has been recently updated and is based on notes I’ve maintained at Intel 
over the years.

Malcolm Cowe
High Performance Data Division

Intel Corporation | www.intel.com


From: lustre-discuss  on behalf of 
Alexander I Kulyavtsev 
Date: Tuesday, 8 August 2017 at 3:30 am
To: "E.S. Rosenberg" 
Cc: Lustre discussion 
Subject: Re: [lustre-discuss] nodes crash during ior test

Lustre wiki has sidebars on Testing and Monitoring, you may start Benchmarking.

there was Benchmarking Group in OpenSFS.
wiki:   http://wiki.opensfs.org/Benchmarking_Working_Group
mail list:  http://lists.opensfs.org/listinfo.cgi/openbenchmark-opensfs.org

It is actually question to the list what is the preferred location for KB on 
lustre benchmarking: on lustre.org or 
opensfs.org.
IMHO KB on lustre.org and BWG minutes (if it reengage)  on 
opensfs.org.

Alex.


On Aug 7, 2017, at 7:56 AM, E.S. Rosenberg 
> wrote:

OT:
Can we create a wiki page or some other form of knowledge pooling on 
benchmarking lustre?
Right now I'm using slides from 2009 as my source which may not be ideal...

http://wiki.lustre.org/images/4/40/Wednesday_shpc-2009-benchmarking.pdf


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre client 2.9 cannot mount 2.10.0 OSTs

2017-08-07 Thread Cowe, Malcolm J
Lustre file system names cannot exceed 8 characters in length, but “scratch12” 
is 9 characters. Try changing the fsname to a shorter name. You can do this 
with tunefs.lustre on all the storage targets, but I can’t remember if you need 
to use --erase-params and recreate all the options. Alternatively, reformat.
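
If you go the tunefs route, a rough sketch (the device name is a placeholder, and 
the rename implies a writeconf, so do it with all targets unmounted):

tunefs.lustre --fsname=scratch --writeconf /dev/<target_device>   # repeat on the MGT, MDT and every OST
# then restart the servers in the usual order and remount clients with the new name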

Malcolm.

On 8/8/17, 11:30 am, "lustre-discuss on behalf of Riccardo Veraldi" 
 wrote:

Trying to debug this problem further, it looks like TCP port 9888 is closed on
the MDS.
This is weird. The lnet module is running. There is no firewall, and the OSSs and
MDS are on the same subnet,
but I cannot connect to port 9888.
Is there anything that changed in Lustre 2.10.0 related to LNet and TCP
ports that I need to take care of in the configuration?

On 8/7/17 6:13 PM, Riccardo Veraldi wrote:
> Hello,
>
> I have a new Lustre cluster based on Lustre 2.10.0/ZFS 0.7.0 on Centos 7.3
> Lustre FS creation went smooth.
> When I tryed then to mount from the clients, Lustre is not able to mount
> any of the OSTs.
> It stops at MGS/MDT level.
>
> this is from the client side:
>
> mount.lustre: mount 192.168.48.254@tcp2:/scratch12 at
> /reg/data/scratch12 failed: Invalid argument
> This may have multiple causes.
> Is 'scratch12' the correct filesystem name?
> Are the mount options correct?
> Check the syslog for more info.
>
> Aug  7 17:58:53 psana1510 kernel: [285130.463377] LustreError:
> 29240:0:(mgc_request.c:335:config_log_add()) logname scratch12-client is
> too long
> Aug  7 17:58:53 psana1510 kernel: [285130.463772] Lustre:
> :0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has
> failed due to network error: [sent 1502153933/real 1502153933] 
> req@88203d75ec00 x1574823717093632/t0(0)
> o250->MGC192.168.48.254@tcp2@192.168.48.254@tcp2:26/25 lens 520/544 e 0
> to 1 dl 1502153938 ref 1 fl Rpc:eXN/0/ rc 0/-1
> Aug  7 17:58:53 psana1510 kernel: [285130.469156] LustreError: 15b-f:
> MGC192.168.48.254@tcp2: The configuration from log
> 'scratch12-client'failed from the MGS (-22).  Make sure this client and
> the MGS are running compatible versions of Lustre.
> Aug  7 17:58:53 psana1510 kernel: [285130.472072] Lustre: Unmounted
> scratch12-client
> Aug  7 17:58:53 psana1510 kernel: [285130.473827] LustreError:
> 29240:0:(obd_mount.c:1505:lustre_fill_super()) Unable to mount  (-22)
>
> from the MDS side there is nothing in syslog. So I tried to engage 
tcpdump:
>
> 17:58:53.745610 IP psana1510.pcdsn.1023 >
> psanamds12.pcdsn.cyborg-systems: Flags [S], seq 1356843681, win 29200,
> options [mss 1460,sackOK,TS val 284847388 ecr 0,nop,wscale 7], length 0
> 17:58:53.745644 IP psanamds12.pcdsn.cyborg-systems >
> psana1510.pcdsn.1023: Flags [R.], seq 0, ack 1356843682, win 0, length 0
> 17:58:58.757421 ARP, Request who-has psanamds12.pcdsn tell
> psana1510.pcdsn, length 46
> 17:58:58.757441 ARP, Reply psanamds12.pcdsn is-at 00:1a:4a:16:01:56 (oui
> Unknown), length 28
>
> OSS, nothing in the log file or in tcpdump
>
> lustre client is 2.9 and the server 2.10.0
>
> I have no firewall running and no SElinux
>
> this never happened to me before. I am usually running older lustre
> versions on clients but I never had this problem before.
> Any hint ?
>
> thank you very much
>
> Rick
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org




Re: [lustre-discuss] Lustre Performance Test

2017-08-02 Thread Cowe, Malcolm J
And there are docs here as well:

http://wiki.lustre.org/Category:Benchmarking

Malcolm.

On 2/8/17, 10:53 pm, "lustre-discuss on behalf of Gmitter, Joseph" 
 
wrote:

A more up to date link to the benchmarking section of the Lustre manual can 
be found at:  http://doc.lustre.org/lustre_manual.xhtml#benchmarkingtests


On 8/2/17, 6:37 AM, "lustre-discuss on behalf of Dhiraj Kalita" 
 wrote:


Dear Gabriele,


> If you need some help to get it running, please let me know.

Please help me out with the setup process.

I also find something in the internet, is it useful ?


http://wiki.old.lustre.org/manual/LustreManual20_HTML/BenchmarkingTests.html




> Hello Dhiraj,
>
> usually you could just use a program like dd to get those tests on the
> OSTs.
>
> If you are looking for something that collects the write/read 
throughput
> and duration over continuous time for the OSTs and saves that in a
> database for further evaluation, you could check our project on 
github at:
>
> https://github.com/gabrieleiannetti/lustre_ost_performance_testing
>
> The work is still in progress, but it can be used for the above
> description.
>
> If you need some help to get it running, please let me know.
>
>
> Regards,
> Gabriele
>
>
>
>
> On 08/02/2017 11:32 AM, Dhiraj Kalita wrote:
>>
>> Dear Gabriele,
>>
>> Throughout and I/O.
>>
>>
>>> Hi Dhiraj,
>>>
>>> which kind of performance tests are you interested in?
>>>
>>> Regards,
>>> Gabriele
>>>
>>>
>>> On 08/02/2017 10:16 AM, Dhiraj Kalita wrote:

 Hi,


 We do have a 3 node luster system, can anyone help me how to do
 performance test ?


 
 Regards
 Dhiraj Kalita
 IIT Guwahati
 ___
 lustre-discuss mailing list
 lustre-discuss@lists.lustre.org
 http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

>>>
>>> --
>>> Gabriele Iannetti
>>> Wissenschaftlicher Angestellter
>>> High Performance Computing (HPC)
>>>
>>> Phone / Telefon: +49 6159 71 3147
>>> g.ianne...@gsi.de
>>>
>>> GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
>>>
>>> Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 
1528
>>> Managing Directors / Geschäftsführung:
>>> Professor Dr. Paolo Giubellino, Ursula Weyrich, Jörg Blaurock
>>> Chairman of the Supervisory Board / Vorsitzender des 
GSI-Aufsichtsrats:
>>> State Secretary / Staatssekretär Dr. Georg Schütte
>>> ___
>>> lustre-discuss mailing list
>>> lustre-discuss@lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>
>>
>>
>> 
>> Regards
>> Dhiraj Kalita
>> IIT Guwahati
>>
>
> --
> Gabriele Iannetti
> Wissenschaftlicher Angestellter
> High Performance Computing (HPC)
>
> Phone / Telefon: +49 6159 71 3147
> g.ianne...@gsi.de
>
> GSI Helmholtzzentrum für Schwerionenforschung GmbH
> Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de
>
> Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
> Managing Directors / Geschäftsführung:
> Professor Dr. Paolo Giubellino, Ursula Weyrich, Jörg Blaurock
> Chairman of the Supervisory Board / Vorsitzender des 
GSI-Aufsichtsrats:
> State Secretary / Staatssekretär Dr. Georg Schütte
>



Regards
Dhiraj Kalita
IIT Guwahati
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



Re: [lustre-discuss] Install issues on 2.10.0

2017-07-25 Thread Cowe, Malcolm J
Also, for the record, for distros that do not have the genhostid command, there 
is a fairly simple workaround:

h=`hostid`; a=${h:6:2}; b=${h:4:2}; c=${h:2:2}; d=${h:0:2}
sudo sh -c "echo -ne \"\x$a\x$b\x$c\x$d\" > /etc/hostid"

I’m sure there’s a more elegant way to express the solution, but as a quick 
bash hack, it serves. genhostid is mostly just a wrapper around the sethostid() 
glibc function, if you prefer C.
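
To sanity-check the result afterwards (a suggestion):

od -An -tx1 /etc/hostid    # should show the four hostid bytes just written (little-endian)
# the "spl_hostid not set" warning should disappear the next time the spl/zfs modules are loaded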

Malcolm.

On 26/7/17, 3:53 am, "lustre-discuss on behalf of John Casu" 
 
wrote:

Ok, so I assume this is actually a ZFS/SPL bug & not a lustre bug.
Also, thanks Ben, for the ptr.

many thanks,
-john

On 7/25/17 10:19 AM, Mannthey, Keith wrote:
> Host_id is for zpool double import protection.  If a host id is set on a 
zpool (zfs does this automatically) then a HA server can't just import the pool 
(users have to use --force). This makes the system a lot safer from double 
zpool imports.  Call 'genhostid' on your Lustre servers and the warning will go 
away.
> 
> Thanks,
>   Keith
> 
> 
> 
> -Original Message-
> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On 
Behalf Of Ben Evans
> Sent: Tuesday, July 25, 2017 10:13 AM
> To: John Casu ; lustre-discuss@lists.lustre.org
> Subject: Re: [lustre-discuss] Install issues on 2.10.0
> 
> health_check moved to /sys/fs/lustre/ along with a bunch of other things.
> 
> -Ben
> 
> On 7/25/17, 12:21 PM, "lustre-discuss on behalf of John Casu"
>  wrote:
> 
>> Just installed latest 2.10.0 Lustre over ZFS on a vanilla Centos
>> 7.3.1611 system, using dkms.
>> ZFS is 0.6.5.11 from zfsonlinux.org, installed w. yum
>>
>> Not a single problem during installation, but I am having issues
>> building a lustre filesystem:
>> 1. Building a separate mgt doesn't seem to work properly, although the
>> mgt/mdt combo
>> seems to work just fine.
>> 2. I get spl_hostid not set warnings, which I've never seen before
>> 3. /proc/fs/lustre/health_check seems to be missing.
>>
>> thanks,
>> -john c
>>
>>
>>
>> -
>> Building an mgt by itself doesn't seem to work properly:
>>
>>> [root@fb-lts-mds0 x86_64]# mkfs.lustre --reformat --mgs
>>> --force-nohostid --servicenode=192.168.98.113@tcp \
>>> --backfstype=zfs mgs/mgt
>>>
>>> Permanent disk data:
>>> Target: MGS
>>> Index:  unassigned
>>> Lustre FS:
>>> Mount type: zfs
>>> Flags:  0x1064
>>>(MGS first_time update no_primnode ) Persistent mount
>>> opts:
>>> Parameters: failover.node=192.168.98.113@tcp
>>> WARNING: spl_hostid not set. ZFS has no zpool import protection
>>> mkfs_cmd = zfs create -o canmount=off -o xattr=sa mgs/mgt
>>> WARNING: spl_hostid not set. ZFS has no zpool import protection
>>> Writing mgs/mgt properties
>>>lustre:failover.node=192.168.98.113@tcp
>>>lustre:version=1
>>>lustre:flags=4196
>>>lustre:index=65535
>>>lustre:svname=MGS
>>> [root@fb-lts-mds0 x86_64]# mount.lustre mgs/mgt /mnt/mgs
>>> WARNING: spl_hostid not set. ZFS has no zpool import protection
>>>
>>> mount.lustre FATAL: unhandled/unloaded fs type 0 'ext3'
>>
>> If I build the combo mgt/mdt, things go a lot better:
>>
>>>
>>> [root@fb-lts-mds0 x86_64]# mkfs.lustre --reformat --mgs --mdt
>>> --force-nohostid --servicenode=192.168.98.113@tcp --backfstype=zfs
>>> --index=0 --fsname=test meta/meta
>>>
>>> Permanent disk data:
>>> Target: test:MDT
>>> Index:  0
>>> Lustre FS:  test
>>> Mount type: zfs
>>> Flags:  0x1065
>>>(MDT MGS first_time update no_primnode )  Persistent
>>> mount opts:
>>> Parameters: failover.node=192.168.98.113@tcp
>>> WARNING: spl_hostid not set. ZFS has no zpool import protection
>>> mkfs_cmd = zfs create -o canmount=off -o xattr=sa meta/meta
>>> WARNING: spl_hostid not set. ZFS has no zpool import protection
>>> Writing meta/meta properties
>>>lustre:failover.node=192.168.98.113@tcp
>>>lustre:version=1
>>>lustre:flags=4197
>>>lustre:index=0
>>>lustre:fsname=test
>>>lustre:svname=test:MDT
>>> [root@fb-lts-mds0 x86_64]# mount.lustre meta/meta  /mnt/meta
>>> WARNING: spl_hostid not set. ZFS has no zpool import protection
>>> [root@fb-lts-mds0 x86_64]# df
>>> Filesystem  1K-blocksUsed Available Use% Mounted on
>>> /dev/mapper/cl-root  52403200 3107560  49295640   6% /
>>> devtmpfs 28709656   0  28709656   0% /dev

Re: [lustre-discuss] Lustre 2.10.0 ZFS version

2017-07-17 Thread Cowe, Malcolm J
The further complication is that the Lustre kmod packages, including 
kmod-zfs-osd, are compiled against the “lustre-patched” kernel 
(3.10.0-514.21.1.el7_lustre.x86_64), rather than the unpatched OS distribution 
kernel that the ZoL packages are no doubt compiled against. The move to 
patchless kernels for LDISKFS (which more or less works today, provided you 
don’t need project quotas) will further simplify binary distribution of the 
Lustre modules.

Malcolm.

On 18/7/17, 9:01 am, "lustre-discuss on behalf of Dilger, Andreas" 
 
wrote:

To be clear - we do not _currently_ build the Lustre RPMs against a binary 
RPM from ZoL, but rather build our own ZFS RPM packages, then build the Lustre 
RPMs against those packages.  This was done because ZoL didn't provide binary 
RPM packages when we started using ZFS, and we are currently not able to ship 
the binary RPM packages ourselves.

We are planning to change the Lustre build process to use the ZoL 
pre-packaged binary RPMs for Lustre 2.11, so that the binary RPM packages we 
build can be used together with the ZFS RPMs installed by end users.  If that 
change is not too intrusive, we will also try to backport this to b2_10 for a 
2.10.x maintenance release.

Cheers, Andreas

On Jul 17, 2017, at 10:42, Götz Waschk  wrote:
> 
> Hi Peter,
> 
> I wasn't able to install the official binary build of
> kmod-lustre-osd-zfs, even with kmod-zfs-0.6.5.9-1.el7_3.centos from
> from zfsonlinux.org, the ksym deps do not match. For me, it is always
> rebuilding the lustre source rpm against the zfs kmod packages.
> 
> Regards, Götz Waschk
> 
> On Mon, Jul 17, 2017 at 2:39 PM, Jones, Peter A  
wrote:
>> 0.6.5.9 according to lustre/Changelog. We have tested with pre-release 
versions of 0.7 during the release cycle too if that’s what you’re wondering.
>> 
>> 
>> 
>> 
>> On 7/17/17, 1:55 AM, "lustre-discuss on behalf of Götz Waschk" 
 
wrote:
>> 
>>> Hi everyone,
>>> 
>>> which version of kmod-zfs was the official Lustre 2.10.0 binary
>>> release for CentOS 7.3 built against?
>>> 
>>> Regards, Götz Waschk
>>> ___
>>> lustre-discuss mailing list
>>> lustre-discuss@lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation







___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org




Re: [lustre-discuss] lustre file system errors with large user UIDs

2017-06-26 Thread Cowe, Malcolm J
One of the most common causes of permissions problems on Lustre is not 
propagating UIDs and GIDs onto the MDS. The servers that run the MDTs for the 
file system need to be able to look up the UIDs and GIDs of the users that can 
access the file system.

The symptoms you describe do match this scenario. I can reproduce this 
behaviour with both “high” (60) and “low” (2000) UID numbers, and adding 
the UIDs/GIDs to the MDS fixes the issue on my testbed.
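
A quick way to confirm is to resolve the affected IDs on the MDS itself, e.g. (the 
UID below is just an illustrative value above 2**19):

# run on the MDS node(s)
getent passwd 524289      # should return the same entry the clients see
getent group <gid>        # likewise for the user's groups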

Malcolm Cowe
High Performance Data Division

Intel Corporation | www.intel.com


From: lustre-discuss  on behalf of 
"John S. Urban" 
Date: Monday, 26 June 2017 at 1:45 am
To: Lustre discussion 
Subject: [lustre-discuss] lustre file system errors with large user UIDs

We recently added a Lustre file server. All appears to be going well except 
when users with a UID > 2**19 (i.e. 524,288) try to access the file 
system, although no similar problems appear mounting other types of file 
systems (we have many others). So the problem appears to be 
specific to the Lustre mounts. Users with the large UIDs get “permission 
denied” if they try to access the file system, and commands like ls(1) 
show question marks for most fields (permission, owner, group, …). Is there a 
non-default mount option required to use Lustre with large UID 
values?

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] permanent configuration: "set_param -P" or "conf_param"

2017-04-06 Thread Cowe, Malcolm J
Ah, ok. I only had an EE 3.1 system to hand, and to set the max_pages_per_rpc > 
256 I also needed to adjust the brw_size. Just assumed that was the same across 
the board, since I couldn’t find reference to brw_size prior to 2.9 / EE 3.1.

I can see now that on Lustre 2.5.x, you can set the max_pages_per_rpc to 1024 
without any other server-side changes (just checked it out on a different 
cluster running EE 2.4).

Malcolm.

From: "Dilger, Andreas" <andreas.dil...@intel.com>
Date: Friday, 7 April 2017 at 11:07 am
To: Malcolm Cowe <malcolm.j.c...@intel.com>
Cc: Lustre discussion <lustre-discuss@lists.lustre.org>
Subject: Re: [lustre-discuss] permanent configuration: "set_param -P" or 
"conf_param"

Actually, it was 16MB RPCs that landed in 2.9, along with improvements for 
handling larger RPC sizes (memory usage and such), and server-side support for 
setting the maximum RPC size per OST.

The 4MB RPC support was included since 2.5 or so, but didn't have the other 
optimizations.

Cheers, Andreas

On Apr 6, 2017, at 18:03, Cowe, Malcolm J 
<malcolm.j.c...@intel.com<mailto:malcolm.j.c...@intel.com>> wrote:
I am not sure about the checksums value: I see the same behaviour on my system. 
It may be a failsafe against permanently disabling checksums, since there is a 
risk of data corruption.

For max_pages_per_rpc, setting the RPC size larger than 1MB (256 pages) is only 
available in Lustre versions 2.9.0 and newer, or in Intel’s EE Lustre version 
3.1. Also, to make this work, one must also adjust the brw_size parameter on 
the OSTs to match the RPC size. The Lustre manual provides documentation on the 
feature:

https://build.hpdd.intel.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#idm139670075738896

Here are some rough notes I have to hand:

To check the current settings of the brw_size attribute, log into each OSS and 
run the following command:
lctl get_param obdfilter.*.brw_size

For example:
[root@ct66-oss1 ~]# lctl get_param obdfilter.*.brw_size
obdfilter.demo-OST.brw_size=1
obdfilter.demo-OST0004.brw_size=1

The value returned is measured in MB.

To change the setting temporarily on an OSS server:

lctl set_param obdfilter.*.brw_size=<value>

where <value> is an integer value between 1 and 16. Again, the value is a 
measurement in MB. To set brw_size persistently, login to the MGS and as root 
use the following syntax:

lctl set_param -P obdfilter.*.brw_size=<value>

This will set the value for all OSTs across all file systems registered with 
the MGS. To scope the settings to an individual file system, change the filter 
expression to include the file system name:

lctl set_param -P obdfilter.<fsname>-*.brw_size=<value>

To temporarily change the value of max_pages_per_rpc, use the following command 
on each client:

lctl set_param osc.*.max_pages_per_rpc=<value>

for example, to set max_pages_per_rpc to 1024 (4M):

lctl set_param osc.*.max_pages_per_rpc=1024

To make the setting persistent, log into the MGS server and run the lctl 
set_param command using the -P flag:

lctl set_param -P osc.*.max_pages_per_rpc=<value>

Again, the scope can be refined by changing the pattern to match the file system 
name:

lctl set_param -P osc.<fsname>-*.max_pages_per_rpc=<value>

For example:
lctl set_param -P osc.demo-*.max_pages_per_rpc=1024

Note that I have found that if the brw_size is changed you may have to re-mount 
the clients before you’ll be able to set max_pages_per_rpc > 256.


Malcolm Cowe
High Performance Data Division

Intel Corporation | www.intel.com<http://www.intel.com>


From: lustre-discuss 
<lustre-discuss-boun...@lists.lustre.org<mailto:lustre-discuss-boun...@lists.lustre.org>>
 on behalf of Reinoud Bokhorst <rbokho...@astron.nl<mailto:rbokho...@astron.nl>>
Date: Friday, 7 April 2017 at 1:31 am
To: Lustre discussion 
<lustre-discuss@lists.lustre.org<mailto:lustre-discuss@lists.lustre.org>>
Subject: [lustre-discuss] permanent configuration: "set_param -P" or 
"conf_param"


Hi all,
Two days ago I made the following Lustre configuration changes:

lctl set_param -P osc.*.checksums=0
lctl set_param -P osc.*.max_pages_per_rpc=512
lctl set_param -P osc.*.max_rpcs_in_flight=32
lctl set_param -P osc.*.max_dirty_mb=128

I ran these commands on the MGS. The -P flag promised to make a permanent 
change and doing this on the MGS would make it system-wide. Indeed directly 
after running the commands, I noticed that the settings were nicely propagated 
to other nodes.

When I look now, only "max_rpcs_in_flight" and "max_dirty_mb" still have those 
values, the others are back to their defaults, namely checksums=1 and 
max_pages_per_rpc=256. The compute nodes have been rebooted in the mean time.

Two questions:
- Why were the settings of checksums and max_pages_per_rpc lost? (I suspect 
during the reboot)
- What is the proper way to make these changes permanent? Should I use "lctl 
conf_param"?

Re: [lustre-discuss] permanent configuration: "set_param -P" or "conf_param"

2017-04-06 Thread Cowe, Malcolm J
I am not sure about the checksums value: I see the same behaviour on my system. 
It may be a failsafe against permanently disabling checksums, since there is a 
risk of data corruption.

For max_pages_per_rpc, setting the RPC size larger than 1MB (256 pages) is only 
available in Lustre versions 2.9.0 and newer, or in Intel’s EE Lustre version 
3.1. Also, to make this work, one must also adjust the brw_size parameter on 
the OSTs to match the RPC size. The Lustre manual provides documentation on the 
feature:

https://build.hpdd.intel.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#idm139670075738896

Here are some rough notes I have to hand:

To check the current settings of the brw_size attribute, log into each OSS and 
run the following command:
lctl get_param obdfilter.*.brw_size

For example:
[root@ct66-oss1 ~]# lctl get_param obdfilter.*.brw_size
obdfilter.demo-OST.brw_size=1
obdfilter.demo-OST0004.brw_size=1

The value returned is measured in MB.

To change the setting temporarily on an OSS server:

lctl set_param obdfilter.*.brw_size=<value>

where <value> is an integer value between 1 and 16. Again, the value is a 
measurement in MB. To set brw_size persistently, login to the MGS and as root 
use the following syntax:

lctl set_param -P obdfilter.*.brw_size=<value>

This will set the value for all OSTs across all file systems registered with 
the MGS. To scope the settings to an individual file system, change the filter 
expression to include the file system name:

lctl set_param -P obdfilter.<fsname>-*.brw_size=<value>

To temporarily change the value of max_pages_per_rpc, use the following command 
on each client:

lctl set_param osc.*.max_pages_per_rpc=<value>

for example, to set max_pages_per_rpc to 1024 (4M):

lctl set_param osc.*.max_pages_per_rpc=1024

To make the setting persistent, log into the MGS server and run the lctl 
set_param command using the -P flag:

lctl set_param -P osc.*.max_pages_per_rpc=<value>

Again, the scope can be refined by changing the pattern to match the file system 
name:

lctl set_param -P osc.<fsname>-*.max_pages_per_rpc=<value>

For example:
lctl set_param -P osc.demo-*.max_pages_per_rpc=1024

Note that I have found that if the brw_size is changed you may have to re-mount 
the clients before you’ll be able to set max_pages_per_rpc > 256.


Malcolm Cowe
High Performance Data Division

Intel Corporation | www.intel.com


From: lustre-discuss  on behalf of 
Reinoud Bokhorst 
Date: Friday, 7 April 2017 at 1:31 am
To: Lustre discussion 
Subject: [lustre-discuss] permanent configuration: "set_param -P" or 
"conf_param"


Hi all,
Two days ago I made the following Lustre configuration changes:

lctl set_param -P osc.*.checksums=0
lctl set_param -P osc.*.max_pages_per_rpc=512
lctl set_param -P osc.*.max_rpcs_in_flight=32
lctl set_param -P osc.*.max_dirty_mb=128

I ran these commands on the MGS. The -P flag promised to make a permanent 
change and doing this on the MGS would make it system-wide. Indeed directly 
after running the commands, I noticed that the settings were nicely propagated 
to other nodes.

When I look now, only "max_rpcs_in_flight" and "max_dirty_mb" still have those 
values, the others are back to their defaults, namely checksums=1 and 
max_pages_per_rpc=256. The compute nodes have been rebooted in the mean time.

Two questions:
- Why were the settings of checksums and max_pages_per_rpc lost? (I suspect 
during the reboot)
- What is the proper way to make these changes permanent? Should I use "lctl 
conf_param"?

Our lustre version:

# lctl get_param version
version=
lustre: 2.7.0
kernel: patchless_client
build:  2.7.0-RC4--PRISTINE-3.10.0-327.36.3.el7.x86_64

Thanks,
Reinoud Bokhorst
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] "getting" ldlm_enqueue_min

2017-03-29 Thread Cowe, Malcolm J
The Lustre manual suggests that the syntax should be:

lctl set_param -P .sys.ldlm_enqueue_min=100

… but this doesn’t work on my own environment, and a quick scan of JIRA tickets 
and the web indicates that these settings are no longer applied through lctl 
parameters. Instead, they are options for the ptlrpc kernel module. The 
relevant JIRA ticket is:

https://jira.hpdd.intel.com/browse/LUDOC-333

There’s an article on the lustre.org wiki that has an explanation:


http://wiki.lustre.org/Lustre_Resiliency:_Understanding_Lustre_Message_Loss_and_Tuning_for_Resiliency

One of the examples mentioned there is as follows:

options ptlrpc at_max=400
options ptlrpc at_min=40
options ptlrpc ldlm_enqueue_min=260
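
One way to apply these persistently (a sketch; the file name is arbitrary) is via a 
modprobe configuration file, which is read the next time the ptlrpc module loads:

# e.g. /etc/modprobe.d/ptlrpc.conf
options ptlrpc at_max=400 at_min=40 ldlm_enqueue_min=260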


Malcolm.

On 30/3/17, 1:39 am, "lustre-discuss on behalf of Thomas Roth" 
 wrote:

Hi all,

I found that I can set 'ldlm_enqueue_min', but not read it.
At least
 > lctl set_param -P ldlm_enqueue_min=100
results in no errors but just 'Lustre: Setting parameter 
general.ldlm_enqueue_min in log params'

But
 > lctl get_param  ldlm_enqueue_min
 > error: get_param: param_path 'ldlm_enqueue_min': No such file or 
directory

Correct: all the timeouts are found in /proc/sys/lustre/, but there is no 
'ldlm_enqueue_min' (nor anywhere else in /proc)

This happens both on 2.5.3 and 2.9

Did I misunderstand something here, or are we missing a parameter?


Regards,
Thomas

-- 

Thomas Roth
GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gsi.de

Gesellschaft mit beschränkter Haftung
Sitz der Gesellschaft: Darmstadt
Handelsregister: Amtsgericht Darmstadt, HRB 1528

Geschäftsführung: Professor Dr. Paolo Giubellino
Ursula Weyrich
Jörg Blaurock

Vorsitzender des Aufsichtsrates: Staatssekretär Dr. Georg Schütte
Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org




Re: [lustre-discuss] Backup software for Lustre

2017-03-19 Thread Cowe, Malcolm J
The version of tar included in RHEL 7 doesn’t restore the lustre xattrs by 
default – you can use the following to extract files with the requisite xattrs:

tar --xattrs-include=lustre.* -xf .tar

This assumes the files were backed up with the --xattrs flag:

tar --xattrs -cf .tar 

Note, that you don’t appear to need to whitelist the Lustre xattrs when backing 
up, only when restoring.
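
After a restore you can confirm the layout survived with something like the 
following (the path is taken from the example quoted below):

lfs getstripe /scratch/4/4.dd    # stripe_count / stripe_size should match the original directory settings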

Malcolm.


From: lustre-discuss  on behalf of 
"Dilger, Andreas" 
Date: Monday, 20 March 2017 at 8:11 am
To: Brett Lee 
Cc: Lustre discussion 
Subject: Re: [lustre-discuss] Backup software for Lustre

The use of openat() could be problematic since this precludes storing the xattr 
before the file is opened. That said, I don't see anywhere in your strace log 
that (f)setxattr() is called to restore the xattrs, for either the regular 
files or directories, even after the file is opened or written?

Does the RHEL tar have a whitelist of xattrs to be restored?  The fact that 
there are Lustre xattrs after the restore appears to just be normal behavior 
for creating a file, not anything related to tar restoring xattrs.

Cheers, Andreas

On Mar 19, 2017, at 10:45, Brett Lee 
> wrote:
Sure, happy to help.  I did not see mknod+setxattr in the strace output.  
Included is a trimmed version of the strace output, along with a few more bits 
of information.  Thanks!

# cat /proc/fs/lustre/version
lustre: 2.7.19.8
# cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
# uname -r
3.10.0-514.2.2.el7_lustre.x86_64
# rpm -qa|grep tar
tar-1.26-31.el7.x86_64
# sha1sum `which tar` `which gtar`
ea17ec98894212b2e2285eb2dd99aad76185ea7d  /usr/bin/tar
ea17ec98894212b2e2285eb2dd99aad76185ea7d  /usr/bin/gtar

Striping was set on the four directories before creating the files.
mkdir -p /scratch/1; lfs setstripe -c 1 --stripe-size 128K /scratch/1; lfs 
getstripe /scratch/1
mkdir -p /scratch/2; lfs setstripe -c 2 --stripe-size 256K /scratch/2; lfs 
getstripe /scratch/2
mkdir -p /scratch/3; lfs setstripe -c 3 --stripe-size 768K /scratch/3; lfs 
getstripe /scratch/3
mkdir -p /scratch/4; lfs setstripe -c 4 --stripe-size 1M /scratch/4; lfs 
getstripe /scratch/4
After tar, all files and directories all had the default Lustre striping.

# tar ztvf /scratch.tgz
drwxr-xr-x root/root 0 2017-03-19 10:54 scratch/
drwxr-xr-x root/root 0 2017-03-19 10:57 scratch/4/
-rw-r--r-- root/root   4194304 2017-03-19 10:57 scratch/4/4.dd
drwxr-xr-x root/root 0 2017-03-19 10:57 scratch/3/
-rw-r--r-- root/root   4194304 2017-03-19 10:57 scratch/3/3.dd
drwxr-xr-x root/root 0 2017-03-19 10:57 scratch/1/
-rw-r--r-- root/root   4194304 2017-03-19 10:57 scratch/1/1.dd
drwxr-xr-x root/root 0 2017-03-19 10:57 scratch/2/
-rw-r--r-- root/root   4194304 2017-03-19 10:57 scratch/2/2.dd

# strace tar zxvf /scratch.tgz > strace.out 2>&1
execve("/usr/bin/tar", ["tar", "zxvf", "/scratch.tgz"], [/* 22 vars */]) = 0
...
(-cut - loading libraries)
...
fstat(1, {st_mode=S_IFREG|0644, st_size=10187, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f4a63d9f000
write(1, "scratch/\n", 9scratch/
)   = 9
mkdirat(AT_FDCWD, "scratch", 0700)  = -1 EEXIST (File exists)
newfstatat(AT_FDCWD, "scratch", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 
AT_SYMLINK_NOFOLLOW) = 0
write(1, "scratch/4/\n", 11scratch/4/
)= 11
mkdirat(AT_FDCWD, "scratch/4", 0700)= 0
write(1, "scratch/4/4.dd\n", 15scratch/4/4.dd
)= 15
openat(AT_FDCWD, "scratch/4/4.dd",
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
10240) = 10240
O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_NONBLOCK|O_CLOEXEC, 0600) = 4
write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
5632) = 5632
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
10240) = 10240
write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
5632) = 5632
...
(-cut)
...
write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 
512) = 512
dup2(4, 4)  = 4
fstat(4, {st_mode=S_IFREG|0600, st_size=4194304, ...}) = 0
utimensat(4, NULL, {{1489935825, 0}, {1489935444, 0}}, 0) = 0
fchown(4, 0, 0) = 0
fchmod(4, 0644) = 0
close(4)= 0
write(1, "scratch/3/\n", 11scratch/3/
)= 11
newfstatat(AT_FDCWD, "scratch/4", {st_mode=S_IFDIR|0700, st_size=4096, ...}, 
AT_SYMLINK_NOFOLLOW) = 0
utimensat(AT_FDCWD, "scratch/4", {{1489935825, 0}, {1489935444, 0}}, 
AT_SYMLINK_NOFOLLOW) = 0
fchownat(AT_FDCWD, "scratch/4", 0, 0, AT_SYMLINK_NOFOLLOW) = 0
fchmodat(AT_FDCWD, "scratch/4", 0755)   = 0
mkdirat(AT_FDCWD, "scratch/3", 0700)= 0
write(1, "scratch/3/3.dd\n", 

Re: [lustre-discuss] Filesystem hanging....

2016-08-14 Thread Cowe, Malcolm J
The following page links to all the downloads. Source is available as SRPMs in 
the respective OS distribution directories:

https://wiki.hpdd.intel.com/display/PUB/Lustre+Releases

There are no Ubuntu packages here though.

To access via Git, clone from: 

git://git.hpdd.intel.com/fs/lustre-release.git

The “git tag” command lists all the release tags; you can use git checkout 
<tag> to switch to the release of interest. You can also use “git branch -a” to
list the branches and then checkout a branch. I use the tags, since I’m 
sometimes interested in specific builds.
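
For example (the tag name is illustrative; pick one from the "git tag" output):

git clone git://git.hpdd.intel.com/fs/lustre-release.git
cd lustre-release
git tag                  # list the release tags
git checkout v2_8_0      # or check out a branch listed by "git branch -a"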

Malcolm Cowe
High Performance Data Division
 
Intel Corporation | www.intel.com
 

On 15/08/2016, 3:13 AM, "lustre-discuss on behalf of Phill Harvey-Smith" 
 wrote:

On 14/08/2016 03:09, Stephane Thiell wrote:
> Hi Phil,

Phill :)

> I understand that you’re running master on your clients (tag v2_8_56
> was created 4 days ago) and 2.1 on the servers? Running master in
> production is already a challenge. Also Lustre has never be good for
> cross-version compatibility. For example, it is possible to make 2.1
> servers work with 2.5 clients and 2.5 servers work with 2.7 clients,
> even though additional patches may be needed.

Right, I believe that the version of the client we were using before 
the upgrade was certainly an older 2.x version; it might be worth seeing 
if I can get that to compile on the current kernel.

Humm, upgrading the server is going to be a major pain in the ass; it was 
set up before I joined the department and is basically a Redhat kernel 
running on an Ubuntu 10.04 base. Ideally I'd want to migrate the servers 
(we have separate OSS and MDS) to a supported platform, even if it isn't 
Ubuntu; it's just more sane :)

> I would say try to reduce the gap, upgrade your servers and/or try an
> official lustre release on your clients…

Is there a bare tarball** available of the official version's source? As 
I'm working on Ubuntu, which is an unsupported platform IIRC, I'd have 
to compile from source.

**or a way to check it out of the git repository.

Cheers.

Phill.
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] New Luster 2.8 Install Question

2016-05-23 Thread Cowe, Malcolm J
I'd probably look at removing the e2fsprogs-devel package from the base 
install, since this is the source of the conflict, or adding the 
e2fsprogs-devel package into the update list, e.g.:

sudo rpm -Uvh e2fsprogs-* e2fsprogs-devel-?.* e2fsprogs-libs-?.* lib*

e2fsprogs-devel should already be in the directory downloaded by wget.
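As a sketch of the two options (package names are taken from the error output 
quoted below in this thread; depending on what rpm reports, libcom_err-devel may 
need the same treatment):

# Option 1: remove the distro -devel packages that cause the conflict
sudo yum remove e2fsprogs-devel libcom_err-devel

# Option 2: include the matching -devel packages in the same upgrade transaction
sudo rpm -Uvh e2fsprogs-?.* e2fsprogs-devel-?.* e2fsprogs-libs-?.* lib*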

Malcolm Cowe
High Performance Data Division
+61 408 573 001

Intel Corporation | www.intel.com

From: Michael Skiba [mailto:michael.sk...@oracle.com]
Sent: Tuesday, May 24, 2016 9:04 AM
To: Mannthey, Keith; Cowe, Malcolm J; lustre-discuss@lists.lustre.org; 
michael.sk...@oracle.com
Subject: RE: [lustre-discuss] New Luster 2.8 Install Question

Keith, here is the output; same thing.

[root@isr-x4150-03 lcsw]# ls e2fsprogs-?.* e2fsprogs-libs-?.* lib*
e2fsprogs-1.42.13.wc4-7.el7.x86_64.rpm
e2fsprogs-libs-1.42.13.wc4-7.el7.x86_64.rpm
libcom_err-1.42.13.wc4-7.el7.x86_64.rpm
libcom_err-devel-1.42.13.wc4-7.el7.x86_64.rpm
libss-1.42.13.wc4-7.el7.x86_64.rpm
libss-devel-1.42.13.wc4-7.el7.x86_64.rpm
[root@isr-x4150-03 lcsw]# sudo rpm -Uvh e2fsprogs-?.* e2fsprogs-libs-?.* lib*
error: Failed dependencies:
e2fsprogs-libs(x86-64) = 1.42.9-7.el7 is needed by (installed) 
e2fsprogs-devel-1.42.9-7.el7.x86_64
libcom_err-devel(x86-64) = 1.42.9-7.el7 is needed by (installed) 
e2fsprogs-devel-1.42.9-7.el7.x86_64
[root@isr-x4150-03 lcsw]# rpm -qa | grep libcom_err-devel
libcom_err-devel-1.42.9-7.el7.x86_64
[root@isr-x4150-03 lcsw]#

From: Mannthey, Keith [mailto:keith.mannt...@intel.com]
Sent: Monday, May 23, 2016 4:47 PM
To: Michael Skiba; Cowe, Malcolm J; 
lustre-discuss@lists.lustre.org
Subject: RE: [lustre-discuss] New Luster 2.8 Install Question

It seems to me your system has the *-devel packages installed.

You will need to download the new devel packages and install them as well.

https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/RPMS/x86_64/

I would recommend you grab new versions of the devel packages mentioned below.

Thanks,
Keith


From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Michael Skiba
Sent: Monday, May 23, 2016 3:40 PM
To: Cowe, Malcolm J <malcolm.j.c...@intel.com>; lustre-discuss@lists.lustre.org; 
michael.sk...@oracle.com
Subject: Re: [lustre-discuss] New Luster 2.8 Install Question

Malcolm, here is my output; same problem. Remember, I am running OUL 7.2 with the 
Red Hat Compatible kernel. Will this work?


[root@isr-x4150-03 lcsw]# ls
e2fsprogs-1.42.13.wc4-7.el7.x86_64.rpm
e2fsprogs-libs-1.42.13.wc4-7.el7.x86_64.rpm
kernel-3.10.0-327.3.1.el7_lustre.x86_64.rpm
libcom_err-1.42.13.wc4-7.el7.x86_64.rpm
libss-1.42.13.wc4-7.el7.x86_64.rpm
lustre-2.8.0-3.10.0_327.3.1.el7_lustre.x86_64.x86_64.rpm
lustre-iokit-2.8.0-3.10.0_327.3.1.el7_lustre.x86_64.x86_64.rpm
lustre-modules-2.8.0-3.10.0_327.3.1.el7_lustre.x86_64.x86_64.rpm
lustre-osd-ldiskfs-2.8.0-3.10.0_327.3.1.el7_lustre.x86_64.x86_64.rpm
lustre-osd-ldiskfs-mount-2.8.0-3.10.0_327.3.1.el7_lustre.x86_64.x86_64.rpm
lustre-tests-2.8.0-3.10.0_327.3.1.el7_lustre.x86_64.x86_64.rpm
[root@isr-x4150-03 lcsw]# ls e2fsprogs-?.* e2fsprogs-libs-?.* lib*
e2fsprogs-1.42.13.wc4-7.el7.x86_64.rpm
e2fsprogs-libs-1.42.13.wc4-7.el7.x86_64.rpm
libcom_err-1.42.13.wc4-7.el7.x86_64.rpm
libss-1.42.13.wc4-7.el7.x86_64.rpm
[root@isr-x4150-03 lcsw]# sudo rpm -Uvh e2fsprogs-?.* e2fsprogs-libs-?.* lib*
error: Failed dependencies:
e2fsprogs-libs(x86-64) = 1.42.9-7.el7 is needed by (installed) 
e2fsprogs-devel-1.42.9-7.el7.x86_64
libcom_err(x86-64) = 1.42.9-7.el7 is needed by (installed) 
libcom_err-devel-1.42.9-7.el7.x86_64
[root@isr-x4150-03 lcsw]#

From: Cowe, Malcolm J [mailto:malcolm.j.c...@intel.com]
Sent: Monday, May 23, 2016 3:58 PM
To: Michael Skiba; 
lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] New Luster 2.8 Install Question

There are several RPMs that need to be installed together in order to update 
the e2fsprogs packages to the lustre version, 

Re: [lustre-discuss] New Luster 2.8 Install Question

2016-05-23 Thread Cowe, Malcolm J
There are several RPMs that need to be installed together in order to update 
the e2fsprogs packages to the lustre version, which can be confusing when 
working with YUM and RPM.  This is what I do to install the e2fsprogs RPMS:

To download:
cd $HOME
wget -r -np 
https://downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/RPMS/x86_64/
cd $HOME/downloads.hpdd.intel.com/public/e2fsprogs/latest/el7/RPMS/x86_64/

The packages of interest are:
[root@rh7z-c3 x86_64]# ls e2fsprogs-?.* e2fsprogs-libs-?.* lib*
e2fsprogs-1.42.13.wc5-7.el7.x86_64.rpm   
libcom_err-devel-1.42.13.wc5-7.el7.x86_64.rpm
e2fsprogs-libs-1.42.13.wc5-7.el7.x86_64.rpm  libss-1.42.13.wc5-7.el7.x86_64.rpm
libcom_err-1.42.13.wc5-7.el7.x86_64.rpm  
libss-devel-1.42.13.wc5-7.el7.x86_64.rpm

To install:
sudo rpm -Uvh e2fsprogs-?.* e2fsprogs-libs-?.* lib*

(the -devel packages are optional, but I wanted to keep the command line simple 
:) )

Malcolm Cowe
High Performance Data Division

Intel Corporation | www.intel.com

From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Michael Skiba
Sent: Tuesday, May 24, 2016 7:04 AM
To: Mannthey, Keith; lustre-discuss@lists.lustre.org; michael.sk...@oracle.com
Subject: Re: [lustre-discuss] New Luster 2.8 Install Question

Keith, thanks for the reply. I am trying to set up a very basic 
configuration for a POC, taking my information from the latest Lustre manual 
and a document written by Oak Ridge.

The server is what I am working on; it is going to be my MGS and MDS. Then I will 
make one of the internal drives my MDT, then add the OSS and OST, then add a 
client. When I install the rpm ( rpm -Uvh 
e2fsprogs-libs-1.42.13.wc4-7.el7.x86_64.rpm ), I get the rpm dependency errors 
shown below. The dependency e2fsprogs-libs-1.42.9-7.el7.x86_64 is already 
installed. The only way I can get around installing this rpm is to run the yum 
command "yum install --setopt=protected_multilib=false" (not a good way). If I do 
this I can get the necessary packages to install. Once this is done I can 
configure my Lustre filesystem; however, when I mount my filesystem I get the 
mount error shown below. So it looks like not all packages loaded properly from 
running "yum install --setopt=protected_multilib=false". It also looks like bug 
LU-6062, which should be fixed in 2.8 ( Lustre-Initialization-1: mount.lustre: 
mount luster-mdt1/mdt1 at /mnt/mds 1 failed: no such device ).


[root@isr-x4150-03 lcsw]# rpm -Uvh e2fsprogs-libs-1.42.13.wc4-7.el7.x86_64.rpm
error: Failed dependencies:
libcom_err(x86-64) = 1.42.13.wc4-7.el7 is needed by 
e2fsprogs-libs-1.42.13.wc4-7.el7.x86_64
e2fsprogs-libs(x86-64) = 1.42.9-7.el7 is needed by (installed) 
e2fsprogs-1.42.9-7.el7.x86_64
e2fsprogs-libs(x86-64) = 1.42.9-7.el7 is needed by (installed) 
e2fsprogs-devel-1.42.9-7.el7.x86_64

[root@isr-x4150-03 lcsw]# rpm -qa | grep e2
vte291-0.38.3-2.el7.x86_64
e2fsprogs-libs-1.42.9-7.el7.x86_64
graphite2-1.2.2-5.el7.x86_64
e2fsprogs-1.42.9-7.el7.x86_64
geoclue2-2.1.10-2.el7.x86_64
uname26-1.0-1.el7.x86_64
e2fsprogs-devel-1.42.9-7.el7.x86_64
libglade2-2.6.4-11.el7.x86_64
isorelax-0-0.15.release20050331.el7.noarch
[root@isr-x4150-03 lcsw]#


[root@isr-x4150-01 ~]#  mount -t lustre /dev/sdb /lustre
mount.lustre: mount /dev/sdb at /lustre failed: No such device
Are the lustre modules loaded?
Check /etc/modprobe.conf and /proc/filesystems
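For reference, the checks that this error message points at look roughly like the 
following (a sketch only; the exact module list depends on which server packages 
are installed):

modprobe -v lustre                 # load the Lustre modules
lsmod | grep -E 'lustre|ldiskfs'   # confirm the modules are actually loaded
grep lustre /proc/filesystems      # the lustre filesystem type should be registered
dmesg | tail                       # look for module load errors, e.g. a missing ldiskfs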


Thanks Again..

From: Mannthey, Keith [mailto:keith.mannt...@intel.com]
Sent: Monday, May 23, 2016 12:26 PM
To: Michael Skiba; 
lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] New Luster 2.8 Install Question

Are you trying to deploy Lustre servers or clients or both?

I would suspect that, with a little work, OUL 7.2 should work just fine with 
Lustre 2.8 and the Linux 3.10 based builds.

Thanks,
Keith

From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Michael Skiba
Sent: Monday, May 23, 2016 11:05 AM
To: Michael Skiba; 
lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] New Luster 2.8 Install Question

Sorry for the second e-mail; please use my work email.


Re: [lustre-discuss] Rebuild server

2016-03-10 Thread Cowe, Malcolm J
If one assumes that the rebuild will incorporate the same identity as the 
original host (same hostname, IP address, etc.), then it should just be a 
matter of restoring the OS, re-installing the Lustre packages, configuring LNet 
(e.g. /etc/modprobe.d/lustre.conf) and remounting. If you've got an HA setup 
(e.g. Pacemaker + Corosync), then you'll need to restore that as well. Or 
rather, keep a backup copy of the config so that you can restore it :). There 
is no need to perform any "rebuild" of Lustre itself; just repair/restore the 
OS.

Other than LNet, all the Lustre configuration information is held on the 
storage targets (MGT, MDT), so you can rebuild the root disks without affecting 
the Lustre config on the MGT and MDT.

So, in summary: rebuild the root disks (maybe use a provisioning system like 
kickstart for repeatability), restore the network config, restore LNet config, 
maybe restore the HA software, restore the identity management (e.g. LDAP, 
passwd, group) then mount the storage as before.
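A rough sketch of the Lustre-specific pieces (the network interfaces, device 
paths and mount points below are placeholders, not taken from any particular 
system):

# /etc/modprobe.d/lustre.conf -- restore the LNet configuration
options lnet networks="o2ib0(ib0),tcp0(eth0)"

# then remount the storage targets exactly as they were mounted before, e.g.:
mount -t lustre /dev/mapper/mgt /mnt/mgt
mount -t lustre /dev/mapper/mdt0 /mnt/mdt0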


Malcolm Cowe
High Performance Data Division
Intel Corporation | www.intel.com

-Original Message-
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Jon Tegner
Sent: Friday, March 11, 2016 4:48 PM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] Rebuild server

Hi,

yesterday I had an incident where the system disk of one of my servers 
(MDT/MGS) went down, but the raid could be rebuilt and the system went 
up again.

However, in the event of a complete failure of the system disk (assuming 
all relevant "lustre disks" are still intact) is there a clear procedure 
to follow in order to rebuild the file system once the OS has been 
reinstalled on new disk?

Thanks,

/jon
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Why certain commands should not be used on Lustre file system

2016-02-10 Thread Cowe, Malcolm J
Recursive delete with rm -r is generally the slowest way to clear out a 
directory tree (irrespective of file system). I've run tests where even "find 
<dir> -depth -delete" will complete more quickly than "rm -rf <dir>". There's 
also an rsync hack that some people like, and there's a funky perl option:

perl -e 'for(<*>){((stat)[9]<(unlink))}'

which I dunno, seems like it is trying too hard. Found it on stackoverflow, I 
think, so I'm not sure I quite trust it.

Stu's "find ... | xargs ... rm -f" looks like a winner though.
 
Malcolm.

-Original Message-
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Stu Midgley
Sent: Wednesday, February 10, 2016 6:50 PM
To: prakrati.agra...@shell.com
Cc: lustrefs
Subject: Re: [lustre-discuss] Why certain commands should not be used on Lustre 
file system

We actually use

find <dir> -type f -print0 | xargs -n 100 -P 32 -0 -- rm -f

which will parallelise the rm... which runs a fair bit faster.


On Wed, Feb 10, 2016 at 3:33 PM,   wrote:
> Hi,
>
> Then rm -rf * should not be used in any kind of file system. Why only Lustre 
> file system' best practices have this as a pointer.
>
> Thanks and Regards,
> Prakrati
>
> -Original Message-
> From: Dilger, Andreas [mailto:andreas.dil...@intel.com]
> Sent: Wednesday, February 10, 2016 11:22 AM
> To: Agrawal, Prakrati PTIN-PTT/ICOE; lustre-discuss@lists.lustre.org
> Subject: Re: [lustre-discuss] Why certain commands should not be used on 
> Lustre file system
>
> On 2016/02/09, 21:16, "lustre-discuss on behalf of 
> prakrati.agra...@shell.com" 
> 
>  on behalf of prakrati.agra...@shell.com> 
> wrote:
>
> I read on Lustre best practices that ls -U should be used instead of ls -l . 
> I understand that ls -l makes MDS contact all OSS to get all information 
> about all files and hence loads it. But, what does ls -U do to avoid it?
>
>-U do not sort; list entries in directory order
>
> This is more important for very large directories, since "ls" will read all 
> of the entries and stat them before printing anything.  That said, GNU ls 
> will still read all of the entries before printing them, so for very large 
> directories "find  -ls" is a lot faster to start printing entries.
>
> Also, it is said that rm-rf * should not be used. Please can someone explain 
> the reason for that.
>
> It is also said that instead lfs find <dir> --type f -print0 | 
> xargs -0 rm -f should be used. Please explain the reason for this also.
>
> "rm -rf *" will expand "*" onto the command line (done by bash) and if there 
> are too many files in the directory (more than about 8MB IIRC) then bash will 
> fail to execute the command.  Running "lfs find" (or just plain "find") will 
> only print the filenames onto the output and xargs will process them in 
> chunks that fit onto a command-line.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel High Performance Data Division
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



-- 
Dr Stuart Midgley
sdm...@sdm900.com
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [HPDD-discuss] Lustre Server Sizing

2015-07-21 Thread Cowe, Malcolm J
I’ve seen CTDB + Samba deployed on several sites running Lustre. It’s stable in 
my experience, and straightforward to get installed and set up, although the 
process is time-consuming. The most significant hurdle is integrating with AD 
and maybe load balancing for the CTDB servers (RR DNS is the easiest and most 
common solution).

Performance is not nearly as good as for the native Lustre client (apart from 
anything else, IIRC, SMB is a “chatty” protocol, especially with xattrs?). One 
downside of CTDB is that the Lustre client must be mounted with -oflock in order 
for the recovery lock manager to work. Each individual connection to Samba from 
a Windows client is limited to the bandwidth and single thread performance of 
the CTDB node. Clients remain connected to a single CTDB node for the duration 
of their session, so there is a possibility of an imbalance in connections over 
time. Load balancing is strictly round-robin through DNS lookups, unless a more 
sophisticated load balancer is placed in front of the CTDB cluster.
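For illustration, the client mount and the round-robin DNS look roughly like this 
(hostnames, NIDs and the filesystem name are placeholders):

# each CTDB/Samba gateway mounts the Lustre client with flock enabled
mount -t lustre -o flock mgs01@o2ib:/lustre /mnt/lustre

# round-robin DNS is simply multiple A records behind one name, e.g. in the zone file:
#   smb.example.com.  IN A 10.0.0.11
#   smb.example.com.  IN A 10.0.0.12
#   smb.example.com.  IN A 10.0.0.13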

There are references to CTDB + NFS / Ganesha as well but I haven’t had an 
opportunity to try it out. Most of the demand for non-native client access to 
Lustre involves Windows machines.

Malcolm.


From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Jeff Johnson
Sent: Wednesday, July 22, 2015 5:54 AM
To: Indivar Nair
Cc: lustre-discuss
Subject: Re: [lustre-discuss] [HPDD-discuss] Lustre Server Sizing

Indivar,

Since your CIFS or NFS gateways operate as Lustre clients there can be issues 
with running multiple NFS or CIFS gateway machines frontending the same Lustre 
filesystem. As Lustre clients there are no issues in terms of file locking but 
the NFS and CIFS caching and multi-client file access mechanics don't interface 
with Lustre's file locking mechanics. Perhaps that may have changed recently 
and a developer on the list may comment on developments there. So while you 
could provide client access through multiple NFS or CIFS gateway machines there 
would not be much in the way of file locking protection. There is a way to 
configure pCIFS with CTDB and get close to what you envision with Samba. I did 
that configuration once as a proof of concept (no valuable data). It is a 
*very* complex configuration and based on the state of software when I did it I 
wouldn't say it was a production grade environment.

As I said before, my understanding may be a year out of date and someone else 
could speak to the current state of things. Hopefully that would be a better 
story.

--Jeff



On Tue, Jul 21, 2015 at 10:26 AM, Indivar Nair 
indivar.n...@techterra.in wrote:
Hi Scott,

The 3 SAN storages, with 240 disks each, have their own 3 NAS headers (NAS 
appliances).
However, even with 240 10K RPM disks and RAID50, each NAS header only provides 
around 1.2 - 1.4GB/s.
There is no clustered file system, and each NAS Header has its own file-system.
It uses some custom mechanism to present the 3 file systems as single name 
space.
But the directories have to be manually spread across for load-balancing.
As you can guess, this doesn't work most of the time.
Many a time, most of the compute nodes access a single NAS Header, overloading 
it.

The customer wants *at least* 9GB/s throughput from a single file-system.

But I think, if we architect the Lustre storage correctly, with this many 
disks, we should get at least 18GB/s throughput, if not more.

Regards,

Indivar Nair


On Tue, Jul 21, 2015 at 10:15 PM, Scott Nolin 
scott.no...@ssec.wisc.edu wrote:
An important question is what performance do they have now, and what do they 
expect if converting it to Lustre. Or, more basically, what are they looking 
for in general in changing?

The performance requirements may help drive your OSS numbers for example, or 
interconnect, and all kinds of stuff.

Also I don't have a lot of experience with NFS/CIFS gateways, but that is 
perhaps its own topic and may need some close attention.

Scott

On 7/21/2015 10:57 AM, Indivar Nair wrote:
Hi ...,

One of our customers has a 3 x 240 Disk SAN Storage Array and would like
to convert it to Lustre.

They have around 150 Workstations and around 200 Compute (Render) nodes.
The File Sizes they generally work with are -
1 to 1.5 million files (images) of 10-20MB in size.
And a few thousand files of 500-1000MB in size.

Almost 50% of the infra is on MS Windows or Apple MACs

I was thinking of the following configuration -
1 MDS
1 Failover MDS
3 OSS (failover to each other)
3 NFS+CIFS Gateway Servers
FDR Infiniband backend network (to connect the Gateways to Lustre)
Each Gateway Server will have 8 x 10GbE Frontend Network (connecting the
clients)

*Option A*
 10+10 Disk RAID60 Array with 64KB Chunk Size i.e. 1MB Stripe Width
 720 Disks / (10+10) = 36 Arrays.
 12 OSTs per OSS
 18 OSTs per OSS in case of Failover

*Option B*
 10+10+10+10 Disk RAID60 

Re: [lustre-discuss] Removing large directory tree

2015-07-10 Thread Cowe, Malcolm J
There's something in the rm command that makes recursive deletes really 
expensive, although I don't know why. I've found in the past that even running 
a regular find ... -exec rm {} \; has been quicker. Running lfs find to build 
the file list would presumably be quicker still.
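A sketch of that approach (the directory path is a placeholder, and the xargs 
batch size and parallelism are just examples):

lfs find /lustre/scratch/olddir --type f -print0 | xargs -0 -n 100 -P 8 rm -f
# then remove the (now empty) directories, deepest first
find /lustre/scratch/olddir -depth -type d -print0 | xargs -0 rmdir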

Malcolm.


From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Andrus, Brian Contractor
Sent: Saturday, July 11, 2015 8:05 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] Removing large directory tree

All,

I understand that doing recursive file operations can be taxing on lustre.
So, I wonder if there is a preferred performance-minded way to remove an entire 
directory tree that is several TB in size.
The standard rm -rf ./dir seems to spike the cpu usage on my OSSes where it 
sits and sometimes causes clients to be evicted.

Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] trouble mounting after a tunefs

2015-06-14 Thread Cowe, Malcolm J
I believe that this message is benign, and is presented when first starting the 
MDS. It has something to do with the OSTs not being online, IIRC. I get a 
similar warning on any system I run, for example:

May 31 20:53:56 ie2-mds1.lfs.intl kernel: LustreError: 11-0: 
demo-MDT-lwp-MDT: Communicating with 0@lo, operation mds_connect failed 
with -11.

This is from one of our lab systems. If the MDT shows up as mounted, there may 
not be a case to answer, although you will still need to verify that your 
connectivity works as expected :).

Check that the storage target is mounted, that the service is started (kernel 
threads are running), and that the content of /proc/fs/lustre/health_check says 
"healthy", etc. "lctl dl" on the MDS should list the services that are up, 
including the MDT, and "lfs check servers" on the client should return with a 
positive outlook (all targets active).
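Roughly, those checks look like this (a sketch; target and device names will 
differ per filesystem):

cat /proc/fs/lustre/health_check   # should report "healthy" on the servers
lctl dl                            # on the MDS: lists configured devices, the MDT should show UP
lfs check servers                  # on a client: each MDC/OSC connection should report active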


Malcolm Cowe
Intel High Performance Data Division


-Original Message-
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of John White
Sent: Saturday, June 13, 2015 1:07 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] trouble mounting after a tunefs

Good Morning Folks,
We recently had to add TCP NIDs to an existing o2ib FS.  We added the 
nid to the modprobe.d stuff and tossed the definition of the NID in the 
failnode and mgsnode params on all OSTs and the MGS + MDT.  When either an o2ib 
or tcp client try to mount, the mount command hangs and dmesg repeats:
LustreError: 11-0: brc-MDT-mdc-881036879c00: Communicating with 
10.4.250.10@o2ib, operation mds_connect failed with -11.

I fear we may have over-done the parameters, could anyone take a look here and 
let me know if we need to fix things up (remove params, etc)?

MGS:
Read previous values:
Target: MGS
Index:  unassigned
Lustre FS:  
Mount type: ldiskfs
Flags:  0x4
  (MGS )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:

MDT:
 Read previous values:
Target: brc-MDT
Index:  0
Lustre FS:  brc
Mount type: ldiskfs
Flags:  0x1001
  (MDT no_primnode )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:  
mgsnode=10.4.250.11@o2ib,10.0.250.11@tcp:10.4.250.10@o2ib,10.0.250.10@tcp  
failover.node=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp 
mdt.quota_type=ug

OST(sample):
Read previous values:
Target: brc-OST0002
Index:  2
Lustre FS:  brc
Mount type: ldiskfs
Flags:  0x1002
  (OST no_primnode )
Persistent mount opts: errors=remount-ro
Parameters:  
mgsnode=10.4.250.10@o2ib,10.0.250.10@tcp:10.4.250.11@o2ib,10.0.250.11@tcp  
failover.node=10.4.250.12@o2ib,10.0.250.12@tcp:10.4.250.13@o2ib,10.0.250.13@tcp 
ost.quota_type=ug
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] HSM -- requirements -- usage scenario and expectations

2015-05-10 Thread Cowe, Malcolm J
The POSIX copytool can be used to transact with any archive that presents a 
POSIX interface. NFS is a common interface onto archives, for example. 

The POSIX CT is supplied in Lustre as a reference implementation of a CT -- 
many archives have their own interfaces, and these require their own 
copytools, so the POSIX CT acts as a reference, a working example of the API. 
It is therefore not particularly optimised.

A ZFS storage system may be a suitable archive; generally it is recommended 
that an archive can be presented/mounted on multiple HSM agent nodes 
simultaneously in order to provide multiple paths to the archive storage from 
Lustre. This allows for increased parallelism and availability. The ZFS storage 
could be presented to the HSM agents via NFS, for example.
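A rough sketch of what that might look like on an HSM agent node (the paths, 
server name and archive number are placeholders, and the copytool option 
spelling should be checked against lhsmtool_posix --help for the installed 
release):

# present the ZFS-backed archive to the agent over NFS
mount -t nfs zfs-server:/export/archive /archive

# enable the HSM coordinator on the MDT (run on the MDS)
lctl set_param mdt.*.hsm_control=enabled

# start the POSIX copytool on the agent, pointing at the archive
lhsmtool_posix --daemon --hsm_root /archive --archive=1 /mnt/lustre

# files can then be archived, released and restored from any client
lfs hsm_archive --archive 1 /mnt/lustre/bigfile
lfs hsm_release /mnt/lustre/bigfile
lfs hsm_restore /mnt/lustre/bigfile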

Also be aware that the POSIX CT in Lustre will create a directory structure 
based on the FID for storing the Lustre files in the archive, with a separate 
directory tree for recording the name space. The name space tree uses soft 
links to refer to the actual files in the FID tree.

The following example may help to illustrate the structure of the archive when 
managed by the POSIX CT:

[root@c64-3a /]# find /archive
/archive
/archive/demo
/archive/demo/shadow
/archive/demo/shadow/f001
/archive/demo/0001
/archive/demo/0001/
/archive/demo/0001//0400
/archive/demo/0001//0400/
/archive/demo/0001//0400//0002
/archive/demo/0001//0400//0002/
/archive/demo/0001//0400//0002//0x20400:0x1:0x0
/archive/demo/0001//0400//0002//0x20400:0x1:0x0.lov

[root@c64-3a /]# ls -l /archive/demo/shadow/f001
lrwxrwxrwx 1 root root 52 Aug 1 19:40 /archive/demo/shadow/f001 -> 
../0001//0400//0002//0x20400:0x1:0x0
[root@c64-3a /]# ls -lL /archive/demo/shadow/f001
-rw-r--r-- 1 root root 1048576 Jul 31 23:02 /archive/demo/shadow/f001

Different copytool implementations will have different structures, depending on 
the requirements of the archive. Not everything uses POSIX, after all.

Malcolm.

--
Malcolm Cowe
Intel High Performance Data Division

 -Original Message-
 From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On
 Behalf Of Kevin Abbey
 Sent: Friday, May 08, 2015 11:50 PM
 To: lustre-discuss
 Subject: [lustre-discuss] HSM -- requirements -- usage scenario and
 expectations
 
 Hi,
 
 I was reading the HSM documentation and I don't understand the HSM
 requirements exactly.
 
 Do we need to purchase an HSM solution to manage the migrated data?
 Can
 the POSIX CopyTool copy data to a zfs file system with compression and
 deduplication, as a slow tier for migrating data off of the primary lustre?
 
 Consider:
   a single or dual OSS, 500TB total capacity, as it fills to be over 85%
 utilized, performance decreases, then migrate 30%, purge 20% and
 expect
 to observe performance increases.  Is this a viable scenario and
 expectation?
 
 Can anyone share a link to a general use case, reference implementation
 without using proprietary 3rd party tools?
 
 Below are references I have read partially thus far.  If there is a
 recorded presentation, video or slides, please share the link; a title
 to search for in google would also be helpful.  I'm guessing that these concepts
 have already been debated during the development of HSM.  I apologize
 if
 the questions here are repetitive on the lustre-discuss list.
 
 Thanks,
 Kevin
 
 
 
 
 
 An introduction to the newly HSM-enabled Lustre 2.5.x parallel file
 system
 http://www.seagate.com/files/www-content/solutions-content/cloud-
 systems-and-solutions/high-performance-
 computing/_shared/docs/clusterstor-inside-lustre-hsm-ti.pdf
 
 
 https://wiki.hpdd.intel.com/display/PUB/Lustre+2.5
 http://insidehpc.com/2015/02/inside-lustre-hierarchical-storage-
 management-hsm/
 
 Managing Data from High Performance Lustre to Deep Tape Archives
 http://web.stanford.edu/group/dlss/pasig/PASIG_September2014/2014
 0917_Presentations/20140917_08_Managing_Data_from_High_Perform
 ance_Lustre_to_Deep_Tape_Archives_Thomas_Schoenemeyer.pdf
 
 http://opensfs.org/wp-content/uploads/2012/12/530-
 600_Aurelien_Degremont_lustre_hsm_lug11.pdf
 
 
 --
 Kevin Abbey
 Systems Administrator
 Center for Computational and Integrative Biology (CCIB)
 http://ccib.camden.rutgers.edu/
 
 Rutgers University - Science Building
 315 Penn St.
 Camden, NJ 08102
 Telephone: (856) 225-6770
 Fax:(856) 225-6312
 Email: kevin.ab...@rutgers.edu
 
 ___
 lustre-discuss mailing list
 lustre-discuss@lists.lustre.org
 http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org