Re: [lustre-discuss] Node Failure in Lustre

2023-03-15 Thread Laura Hild via lustre-discuss
Hi Nick-

If there is no MDS/MGS/OSS currently hosting a particular MDT/MGT/OST, then 
what is stored there will not be accessible.  I suggest looking at 

  https://doc.lustre.org/lustre_manual.xhtml#lustrerecovery
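
As a rough sketch of what that recovery relies on (the NIDs, pool name, and
fsname below are made-up placeholders, and ZFS failover assumes both OSS
nodes can reach the same pool), a target formatted with more than one
service node can be mounted on the surviving server, and clients reconnect
once recovery completes:

  # format the OST so either oss1 or oss2 may serve it (hypothetical NIDs)
  mkfs.lustre --ost --backfstype=zfs --fsname=testfs --index=0 \
      --mgsnode=mgs@tcp \
      --servicenode=oss1@tcp --servicenode=oss2@tcp \
      ostpool/ost0

  # if oss1 fails, mount the same target on oss2
  mount -t lustre ostpool/ost0 /mnt/lustre-ost0

  # watch client recovery progress on the server that took over
  lctl get_param obdfilter.testfs-OST0000.recovery_status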

-Laura

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Node Failure in Lustre

2023-03-15 Thread Nick dan via lustre-discuss
Hi

Okay. Thank you for the information.
Can you tell me how the failure will be handled at the Lustre level if the
MDS/MGS or the OSS server goes down?

On Wed, 15 Mar 2023 at 13:45, Andreas Dilger  wrote:

> No, because the remote-attached SSDs are part of the ZFS pool, and any
> drive failures at that level are the responsibility of ZFS to manage (e.g.
> with RAID); it is up to you to have system monitoring in place to detect
> and alert you to the drive failures. This is no different than if the
> drives inside a RAID enclosure fail.
>
> Lustre cannot magically know when drives below the filesystem layer have
> problems. It only cares about being able to access the whole filesystem,
> and that the filesystem is intact even in the case of drive failures.
>
> Cheers, Andreas
>
> > On Mar 15, 2023, at 01:26, Nick dan via lustre-discuss <
> lustre-discuss@lists.lustre.org> wrote:
> >
> > 
> > Hi
> >
> > There is a situation where disks from multiple servers are exported to a
> > main server (the Lustre storage). A zpool is created from the SSDs, and
> > mkfs.lustre is run using ZFS as the backend filesystem. A Lustre client is
> > also connected. If one of the nodes exporting the SSDs goes down, will the
> > node failure be handled?
> >
> > Thanks and regards,
> > Nick Dan
> > ___
> > lustre-discuss mailing list
> > lustre-discuss@lists.lustre.org
> > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Node Failure in Lustre

2023-03-15 Thread Andreas Dilger via lustre-discuss
No, because the remote-attached SSDs are part of the ZFS pool, and any drive
failures at that level are the responsibility of ZFS to manage (e.g. with
RAID); it is up to you to have system monitoring in place to detect and alert
you to the drive failures. This is no different than if the drives inside a
RAID enclosure fail.
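
As a rough illustration (the pool name and device paths are placeholders),
the redundancy and monitoring would live at the ZFS layer, something like:

  # build the pool with redundancy so a failed SSD only degrades it
  zpool create ostpool \
      mirror /dev/disk/by-id/remote-ssd0 /dev/disk/by-id/remote-ssd1 \
      mirror /dev/disk/by-id/remote-ssd2 /dev/disk/by-id/remote-ssd3

  # health checks that your monitoring should run and alert on
  zpool status -x ostpool
  zpool list -H -o health ostpool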

Lustre cannot magically know when drives below the filesystem layer have
problems. It only cares about being able to access the whole filesystem, and
that the filesystem is intact even in the case of drive failures.

Cheers, Andreas

> On Mar 15, 2023, at 01:26, Nick dan via lustre-discuss 
>  wrote:
> 
> 
> Hi
> 
> There is a situation where disks from multiple servers are exported to a
> main server (the Lustre storage). A zpool is created from the SSDs, and
> mkfs.lustre is run using ZFS as the backend filesystem. A Lustre client is
> also connected. If one of the nodes exporting the SSDs goes down, will the
> node failure be handled?
> 
> Thanks and regards,
> Nick Dan
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Node Failure in Lustre

2023-03-15 Thread Nick dan via lustre-discuss
Hi

There is a situation where disks from multiple servers are exported to a main
server (the Lustre storage). A zpool is created from the SSDs, and mkfs.lustre
is run using ZFS as the backend filesystem. A Lustre client is also connected.
If one of the nodes exporting the SSDs goes down, will the node failure be
handled?
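
For reference, a rough sketch of the setup I mean (the pool name, fsname,
NIDs, and device paths are placeholders, and the pool here has no redundancy):

  # on the Lustre storage server, using the remote-attached SSDs
  zpool create ostpool /dev/mapper/remote-ssd0 /dev/mapper/remote-ssd1
  mkfs.lustre --ost --backfstype=zfs --fsname=testfs --index=0 \
      --mgsnode=mgs@tcp ostpool/ost0
  mount -t lustre ostpool/ost0 /mnt/lustre-ost0

  # on the client
  mount -t lustre mgs@tcp:/testfs /mnt/testfs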

Thanks and regards,
Nick Dan
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org