Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-20 Thread Colin Faber
Can you provide more details on your upgrade strategy? In some cases
expanding your storage shouldn't impact client / job activity at all.

On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam wrote:

> Hello,
>
> We are planning on expanding our storage by adding more OSTs to our lustre
> file system. It looks like it would be easier to expand if we bring the
> filesystem down and perform the necessary operations. We are planning to
> suspend all the jobs running on the cluster. We originally planned to add
> new OSTs to the live filesystem.
>
> We are trying to determine the potential impact to the suspended jobs if
> we bring down the filesystem for the upgrade.
> One of the questions we have is what would happen to the suspended
> processes that hold an open file handle in the lustre file system when the
> filesystem is brought down for the upgrade?
> Will they recover from the client eviction?
>
> We do have vendor support and have engaged them. I wanted to ask the
> community and get some feedback.
>
> Thanks,
> -Raj
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>


Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-20 Thread Gin Tan
Hi Raj,

You can add the OSTs online; we have been doing so. But if you are expanding
the storage array itself, think about what is physically involved, such as
cabling, which depends on the recommendations of your storage vendor. We
added an expansion to every Dell storage array last year, and because of the
physical location of those arrays we needed a full shutdown: we created a
maintenance reservation and shut down the whole filesystem.

On many occasions when performing Lustre maintenance we have suspended
jobs, but only when we knew the filesystem would stay online. Some
clients might get evicted during the failover, but they reconnected when
the jobs were resumed.

In your case, if you want to do a full filesystem shutdown, you will have
to unmount all the Lustre clients, which means the jobs will need to be killed
before the filesystem can be unmounted. We always use cat /proc/sys/lnet/peers
or lshowmount to make sure that no clients are still connected before
doing the full FS shutdown.
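
As a rough illustration of the peer check described above, here is a minimal
sketch. It assumes the peers file starts with one header line and carries the
NID in the first column, with 0@lo as the local loopback entry. The function
names are made up for this example, and the peer table may also list other
servers, which this sketch does not filter out.

```shell
#!/bin/sh
# Print every LNet peer NID except the local loopback entry.
# Assumption: line 1 is a column header and column 1 is the NID.
# The file defaults to /proc/sys/lnet/peers, the path mentioned above.
list_remote_peers() {
    peers_file="${1:-/proc/sys/lnet/peers}"
    awk 'NR > 1 && $1 != "0@lo" { print $1 }' "$peers_file"
}

# Succeed only when no remote peers are listed.
no_remote_peers() {
    [ -z "$(list_remote_peers "$1")" ]
}

# Hypothetical use on a server just before shutdown:
#   no_remote_peers || list_remote_peers
```

Anything it prints would still need to be checked by hand (or with
lshowmount) before deciding that it is safe to stop the servers.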

Hope it helps.

-- 

*Gin Tan*
MASSIVE support and consulting services

*Monash eResearch Centre*
Monash University

15 Innovation Walk
Ground Floor, Room G22
Clayton Campus
Wellington Road
Clayton VIC 3800
Australia

T: +61 3 9902 0245
E: gin@monash.edu
Z: https://monash.zoom.us/my/gintan
www.monash.edu.au/eresearch




[lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-20 Thread Raj Ayyampalayam
Hello,

We are planning on expanding our storage by adding more OSTs to our lustre
file system. It looks like it would be easier to expand if we bring the
filesystem down and perform the necessary operations. We are planning to
suspend all the jobs running on the cluster. We originally planned to add
new OSTs to the live filesystem.

We are trying to determine the potential impact to the suspended jobs if we
bring down the filesystem for the upgrade.
One of the questions we have is what would happen to the suspended
processes that hold an open file handle in the lustre file system when the
filesystem is brought down for the upgrade?
Will they recover from the client eviction?

We do have vendor support and have engaged them. I wanted to ask the
community and get some feedback.

Thanks,
-Raj


Re: [lustre-discuss] Migrate MGS to ZFS

2019-02-20 Thread Fernando Perez

Thank you Andreas.

I will try to migrate the MGS according to my previous idea, based on the
Lustre Operations Manual section on separating a combined MDT/MGS.


I agree that a dd backup of the current combined MDT/MGS is mandatory
before trying to perform the migration.


Regards.

=
Fernando Pérez
Institut de Ciències del Mar (CSIC)
Departament Oceanografía Física i Tecnològica
Passeig Marítim de la Barceloneta,37-49
08003 Barcelona
Phone:  (+34) 93 230 96 35
=

On 2/20/19 4:33 AM, Andreas Dilger wrote:

PS: it is always a good idea to make a backup of your MDT, since it is relatively small 
compared to the rest of the filesystem. A full-device "dd" copy doesn't take 
too long and is the most accurate backup for ldiskfs.
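
A full-device copy of this kind could be sketched as below. This is only an
illustration: the function name and both paths are hypothetical, and it
assumes the target is unmounted while the copy runs so that the image is
consistent.

```shell
#!/bin/sh
# Copy a whole block device (or file) to an image, then verify the copy
# byte for byte. Returns non-zero if the copy or the comparison fails.
backup_device() {
    src="$1"
    dst="$2"
    # dd reports its transfer statistics on stderr; silence them here.
    dd if="$src" of="$dst" bs=1024k 2>/dev/null || return 1
    cmp -s "$src" "$dst"
}

# Hypothetical usage, with the MDT unmounted first:
#   backup_device /dev/mdt_device /backup/mdt.img
```

The final cmp rereads both sides, so it roughly doubles the run time; it can
be dropped if the backup window is tight.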

Cheers, Andreas


On Feb 19, 2019, at 19:31, Andreas Dilger wrote:

Yes, it is possible to migrate the MGS files to another device as you propose. 
I don't think there is any particular difference if you move it to a separate 
ldiskfs or ZFS target.
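
The file copy proposed in the question (the CONFIGS/filesystem_name-* files)
might look roughly like the sketch below. It covers only the copy itself and
assumes both the old and the new MGS devices are already mounted as ordinary
local filesystems. The mount points, the function name and the fsname are
placeholders, and the full procedure in the Lustre Operations Manual should
take precedence over this.

```shell
#!/bin/sh
# Copy the per-filesystem configuration files from the CONFIGS directory
# of one mounted target to another, preserving mode and timestamps.
# Arguments: old mount point, new mount point, filesystem name.
copy_mgs_configs() {
    src_root="$1"
    dst_root="$2"
    fsname="$3"
    mkdir -p "$dst_root/CONFIGS" || return 1
    # If nothing matches, the pattern stays literal and cp fails.
    cp -p "$src_root"/CONFIGS/"$fsname"-* "$dst_root/CONFIGS/"
}

# Hypothetical usage:
#   copy_mgs_configs /mnt/old_mgs /mnt/new_mgs myfs
```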

One caveat is that we don't test combined ZFS and ldiskfs targets on the same 
node, though in theory it would work.

Migrating the MDT from ldiskfs to ZFS is also possible with newer versions of 
Lustre (2.12 for sure, I don't recall if it is in 2.10 or not).  You need to 
follow a special process to do this, please see the Lustre Operations Manual 
for details.

Cheers, Andreas


On Feb 19, 2019, at 17:48, Fernando Pérez wrote:

Dear lustre experts.

What is the best way to migrate an MGS device to ZFS? Should we copy the
CONFIGS/filesystem_name-* files from the old ldiskfs device to the new ZFS MGS
device?

Currently we have a combined MDT/MGT under ldiskfs with lustre 2.10.4.

We want to upgrade to lustre 2.12.0 and then separate the combined MDT/MGT and 
migrate MDT and MGT to separate ZFS devices.

Regards.
=
Fernando Pérez
Institut de Ciències del Mar (CMIMA-CSIC)
Departament Oceanografía Física i Tecnològica
Passeig Marítim de la Barceloneta,37-49
08003 Barcelona
Phone:  (+34) 93 230 96 35
=
