Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread yu sun
Yes, DRBD mirrors the content of block devices between hosts, synchronously or asynchronously, which gives us data redundancy between hosts. Perhaps we should use ZFS + DRBD for the MDT and OSTs? Thanks, Yu. On Wed, Jun 27, 2018 at 9:28 PM, Patrick Farrell wrote: > > I’m a little puzzled - it can switch,

[lustre-discuss] Cannot mount mdt or osts after upgrade

2018-06-27 Thread Shane Nehring
Hello all, I've been unable to mount the mdt or osts for a volume since upgrading to 2.10.4 yesterday (previously 2.10.2). on the mdt I'm getting: mount.lustre: mount store/metadata-store at /mnt/metadata-store failed: File exists and in the kernel: Lustre: MGS: Logs for fs newwork were removed

[lustre-discuss] what is fsname used for ? and how to get role based security ?

2018-06-27 Thread Zeeshan Ali Shah
Dear All, When setting up the MDT, it asks for the --fsname flag; the docs say this is the name of the filesystem the MDT is part of. That is fine, but on the client, when I mount /lustre/fsname, it mounts the complete Lustre filesystem. 1) Can an MDS/MDT serve more than one fsname? 2) What is the best practice for

Re: [lustre-discuss] ZFS based OSTs need advice

2018-06-27 Thread Zeeshan Ali Shah
Thanks a lot for the guidance, I will kick off the installation in 1-2 days. /Zee On Wed, Jun 27, 2018 at 2:16 AM Cowe, Malcolm J wrote: > You can create pools and format the storage on a single node, provided that the correct `--servicenode` parameters are applied to the format command (i.e. the

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Jun 27, 2018, at 3:12 AM, yu sun wrote: > > client: > root@ml-gpu-ser200.nmg01:~$ mount -t lustre > node28@o2ib1:node29@o2ib1:/project /mnt/lustre_data > mount.lustre: mount node28@o2ib1:node29@o2ib1:/project at /mnt/lustre_data > failed: Input/output error > Is the MGS running? >

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread Cory Spitz
Patrick didn’t say it, but in case it wasn’t obvious, you could use MDRAID underneath ldiskfs to achieve redundancy within a single host. Moreover, if you do, then you can have larger OSTs, which is helpful for file system usability. You’ll have fewer OSTs to manage and they will be larger, which

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread Patrick Farrell
I’m a little puzzled - it can switch, but isn’t the data on the failed disk lost...? That’s why Andreas is suggesting RAID. Or is drbd doing syncing of the disk? That seems like a really expensive way to get redundancy, since it would have to be full online mirroring with all the costs in

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread yu sun
Yes, you are right, thanks for your great suggestions. We currently use GlusterFS to store training data for ML, and we have begun investigating Lustre as a replacement for GlusterFS for performance reasons. Firstly, yes, we do want to get maximum performance. You mean we should use ZFS, for example, not each

Re: [lustre-discuss] SSK configuration

2018-06-27 Thread Mark Roper
Hi Jeremy & All, I got a request from Mark Hahn to share the results of my SSK performance investigation with this group, which I'm happy to do! If you're not interested in the impact on throughput for encryption of client-to-mds and client-to-oss communication using the SSK feature, you can stop

[lustre-discuss] avoid_asym_router_failure

2018-06-27 Thread Matt Rásó-Barnett
Hi all, I just experienced our first asymmetric router failure today, where only one interface on a subset of our LNET routers was down - however, only our clients connected directly via that interface detected this. Unfortunately our Lustre servers, connected to the routers via a

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread Andreas Dilger
On Jun 27, 2018, at 09:12, yu sun wrote: > > client: > root@ml-gpu-ser200.nmg01:~$ mount -t lustre > node28@o2ib1:node29@o2ib1:/project /mnt/lustre_data > mount.lustre: mount node28@o2ib1:node29@o2ib1:/project at /mnt/lustre_data > failed: Input/output error > Is the MGS running? >

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread yu sun
client: root@ml-gpu-ser200.nmg01:~$ mount -t lustre node28@o2ib1:node29@o2ib1:/project /mnt/lustre_data mount.lustre: mount node28@o2ib1:node29@o2ib1:/project at /mnt/lustre_data failed: Input/output error Is the MGS running? root@ml-gpu-ser200.nmg01:~$ lctl ping node28@o2ib1 failed to ping