Re: [lustre-discuss] lustre filesystem in hung state
Anil,

Your error message shows o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib, which means it is trying to connect (opcode o8 is OST connect) to OST0003 of the scratch file system, hosted on the OSS node with NID 192.168.1.5@o2ib, but the connection to that OSS node has been lost. This looks to me like a network issue. You can ping the server NID from the client and troubleshoot the network from there:

client# lctl ping 192.168.1.5@o2ib

On Tue, Feb 19, 2019 at 11:44 PM Anilkumar Naik wrote:
> Dear All,
>
> The Lustre file system goes into a hung state and we are unable to determine the exact cause. Kindly find the information below and help us identify a fix for the file system kernel hang.
>
> Cluster details:
>
> The OSS node/server is mounted with the targets below. We are able to mount the home file system on the clients and it works for some time; after 10-15 minutes all the clients hang and the OSS node reboots. Kindly help.
>
> /dev/mapper/mdt-mgt      19G  446M   17G   3% /mdt-mgt
> /dev/mapper/mdt-home    140G  2.8G  128G   3% /mdt-home
> /dev/mapper/mdt-scratch 140G  759M  130G   1% /mdt-scratch
> /dev/mapper/ost-home    3.7T  2.4T  1.1T  69% /ost-home
>
> The Lustre packages below are installed on the OSS node.
> ==
> kernel-devel-2.6.32-431.23.3.el6_lustre.x86_64
> lustre-debuginfo-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> kernel-firmware-2.6.32-431.23.3.el6_lustre.x86_64
> lustre-iokit-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> kernel-2.6.32-431.23.3.el6_lustre.x86_64
> lustre-modules-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> lustre-tests-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> kernel-debuginfo-common-x86_64-2.6.32-431.23.3.el6_lustre.x86_64
> lustre-osd-ldiskfs-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
> kernel-debuginfo-2.6.32-431.23.3.el6_lustre.x86_64
> =
>
> Lustre errors:
> =
> Feb 20 06:22:06 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous similar messages
> Feb 20 06:29:11 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
> Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar messages
> Feb 20 06:29:11 oss1 kernel: LustreError: 11-0: scratch-OST0001-osc-MDT: Communicating with 0@lo, operation ost_connect failed with -19.
> Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar messages
> Feb 20 06:32:42 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1550624551/real 0] req@880800be1000 x1625913123994836/t0(0) o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib:28/4 lens 400/544 e 0 to 1 dl 1550624562 ref 2 fl Rpc:XN/0/ rc 0/-1
> Feb 20 06:32:42 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous similar messages
> Feb 20 06:39:36 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
> Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar messages
> Feb 20 06:39:36 oss1 kernel: LustreError: 11-0: scratch-OST0003-osc-MDT: Communicating with 0@lo, operation ost_connect failed with -19.
> Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar messages
> Feb 20 06:43:12 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1550625151/real 0] req@880800dcd000 x1625913123996040/t0(0) o8->scratch-OST0001-osc-MDT@192.168.1.5@o2ib:28/4 lens 400/544 e 0 to 1 dl 1550625192 ref 2 fl Rpc:XN/0/ rc 0/-1
> Feb 20 06:43:12 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous similar messages
> Feb 20 06:50:01 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
> Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar messages
> Feb 20 06:50:01 oss1 kernel: LustreError: 11-0: scratch-OST0003-osc-MDT: Communicating with 0@lo, operation ost_connect failed with -19.
> Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar messages
> Feb 20 06:53:57 oss1 kernel: Lustre: 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for sent delay: [sent 1550625826/real 0] req@881005e88800 x1625913123997352/t0(0) o8->scratch-OST0003-osc-MDT@192.168.1.5@o2ib:28/4 lens 400/544 e 0 to 1 dl 1550625837 ref 2 fl Rpc:XN/0/ rc 0/-1
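A sketch of the network checks suggested in the reply above (the NID is taken from the log; the InfiniBand diagnostic commands assume the standard infiniband-diags tools are installed):

```shell
# Confirm LNet is up on the client and list its local NIDs
lctl list_nids

# Ping the OSS NID over the o2ib LNet network (the NID from the log above)
lctl ping 192.168.1.5@o2ib

# If the LNet ping fails, check the InfiniBand layer underneath
ibstat                 # port state should be "Active"
ping -c 3 192.168.1.5  # IPoIB reachability, if IPoIB is configured
```

Note that error -19 in the log is ENODEV; the accompanying 137-5 "no target" messages also suggest verifying that the scratch OSTs are actually mounted on the expected server, as the log text itself advises.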
Re: [lustre-discuss] Suspended jobs and rebooting lustre servers
Got it. I'd rather be safe than sorry. This is my first time doing a Lustre configuration change.

Raj

On Thu, Feb 21, 2019, 11:55 PM Raj wrote:
> I also agree with Colin's comment.
> If the current OSTs are not touched, and you are only adding new OSTs to existing OSS nodes and adding new ost-mount resources to your existing (already running) Pacemaker configuration, you can achieve the upgrade with no downtime. If your Corosync-Pacemaker configuration is working correctly, you can fail over and fail back, taking turns rebooting each OSS node. But the chance of human error in doing this is too high.
Re: [lustre-discuss] Suspended jobs and rebooting lustre servers
I also agree with Colin's comment.
If the current OSTs are not touched, and you are only adding new OSTs to existing OSS nodes and adding new ost-mount resources to your existing (already running) Pacemaker configuration, you can achieve the upgrade with no downtime. If your Corosync-Pacemaker configuration is working correctly, you can fail over and fail back, taking turns rebooting each OSS node. But the chance of human error in doing this is too high.

On Thu, Feb 21, 2019 at 10:30 PM Raj Ayyampalayam wrote:
> Hi Raj,
>
> Thanks for the explanation. We will have to rethink our upgrade process.
>
> Thanks again.
> Raj
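The failover/failback rotation described above can be sketched with Pacemaker's pcs tool. Resource and node names here are hypothetical, and the exact pcs syntax varies between versions:

```shell
# Put one OSS into standby; Pacemaker migrates its ost-mount
# resources to the failover partner
pcs cluster standby oss1

# Confirm the OST resources are running on the partner before continuing
pcs status resources

# Reboot the node, then bring it out of standby so the OSTs fail back
reboot
pcs cluster unstandby oss1

# Repeat the same sequence for the other OSS node of the HA pair
```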
Re: [lustre-discuss] Suspended jobs and rebooting lustre servers
Hi Raj,

Thanks for the explanation. We will have to rethink our upgrade process.

Thanks again.
Raj

On Thu, Feb 21, 2019, 10:23 PM Raj wrote:
> Hello Raj,
> It's best and safest to unmount the file system from all the clients and then do the upgrade. Your FS is getting more OSTs and the configuration of the existing ones is changing, so your clients need to pick up the new layout by remounting. You also mentioned client eviction: during eviction the client has to drop its dirty pages, and all open file descriptors in the FS will be lost.

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
Re: [lustre-discuss] Suspended jobs and rebooting lustre servers
Hello Raj,
It's best and safest to unmount the file system from all the clients and then do the upgrade. Your FS is getting more OSTs and the configuration of the existing ones is changing, so your clients need to pick up the new layout by remounting. You also mentioned client eviction: during eviction the client has to drop its dirty pages, and all open file descriptors in the FS will be lost.

On Thu, Feb 21, 2019 at 12:25 PM Raj Ayyampalayam wrote:
> What can I expect to happen to the jobs that are suspended during the file system restart?
> Will the processes holding an open file handle die when I unsuspend them after the filesystem restart?
>
> Thanks!
> -Raj
[lustre-discuss] Due Tomorrow: Lustre User Group 2019: Call for Participation
Hello,

As a reminder, tomorrow, February 22nd, is the deadline for LUG submissions.

The Lustre User Group (LUG) meeting is the primary venue for discussion and seminars on the Lustre parallel file system. The 2019 Lustre User Group (LUG) conference will be held May 14-17, 2019 in Houston, Texas. Please see: http://opensfs.org/events/lug-2019/

The LUG Program Committee is particularly seeking presentations on:

* Experiences running the newer community releases (2.10, 2.11 and 2.12)
* Experiences using the new Lustre features (DNE2, SSK, UID/GID mapping, Project Quotas, PFL, DoM, Multirail LNet, LNet Network Health, etc.)
* Best practices and practical experiences in deploying, monitoring, and operating Lustre
* Pushing the boundaries with non-traditional deployments

Submission guidelines

You only need an abstract for the submission process; we will request presentation materials once abstracts are reviewed and selected. Abstracts should be a minimum of 250 words and should provide a clear description of the planned presentation and its goals. All LUG presentations will be 30 minutes (including questions).

The abstract submission deadline is February 22, 2019, 23:59 AoE ("Anywhere on Earth").

The submission web page for LUG 2019 is available at https://easychair.org/conferences/?conf=lug2019 and you will need to create a user account on EasyChair if you don't already have one. Please see https://easychair.org/cfp/LUG2019 for more information.

We look forward to seeing you in Houston at LUG 2019!

The LUG 2019 Program Committee
Re: [lustre-discuss] Suspended jobs and rebooting lustre servers
What can I expect to happen to the jobs that are suspended during the file system restart?
Will the processes holding an open file handle die when I unsuspend them after the filesystem restart?

Thanks!
-Raj

On Thu, Feb 21, 2019 at 12:52 PM Colin Faber wrote:
> Ah yes,
>
> If you're adding to an existing OSS, then you will need to reconfigure the file system, which requires a writeconf event.
Re: [lustre-discuss] Suspended jobs and rebooting lustre servers
Ah yes,

If you're adding to an existing OSS, then you will need to reconfigure the file system, which requires a writeconf event.

On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam wrote:
> The new OSTs will be added to the existing file system (the OSS nodes are already part of the filesystem); I will have to re-configure the current HA resource configuration to tell it about the 4 new OSTs.
> Our ExaScaler's HA monitors the individual OSTs, and I need to re-configure the HA on the existing filesystem.
>
> Our vendor support has confirmed that we would have to restart the filesystem if we want to regenerate the HA configs to include the new OSTs.
>
> Thanks,
> -Raj
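For reference, the writeconf procedure mentioned above follows the standard Lustre manual sequence, sketched here with hypothetical device and mount-point names; the whole file system must be stopped first:

```shell
# 1. Unmount all clients, then all OSTs, then the MDT/MGT
umount /mnt/lustre          # on every client
umount /lustre/ost0         # on every OSS, for every OST
umount /lustre/mdt0         # on the MDS (MGT last if it is separate)

# 2. Regenerate the configuration logs on every target
tunefs.lustre --writeconf /dev/mapper/mdt0   # MGT/MDT first
tunefs.lustre --writeconf /dev/mapper/ost0   # then every OST

# 3. Remount in order: MGT/MDT first, then OSTs, then clients
mount -t lustre /dev/mapper/mdt0 /lustre/mdt0
mount -t lustre /dev/mapper/ost0 /lustre/ost0
```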
Re: [lustre-discuss] Suspended jobs and rebooting lustre servers
The new OSTs will be added to the existing file system (the OSS nodes are already part of the filesystem), so I will have to reconfigure the current HA resource configuration to tell it about the 4 new OSTs. Our ExaScaler's HA monitors the individual OSTs, and I need to reconfigure the HA on the existing filesystem.

Our vendor support has confirmed that we would have to restart the filesystem if we want to regenerate the HA configs to include the new OSTs.

Thanks,
-Raj
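For what it's worth, in a plain corosync/pacemaker setup a new OST is usually just one more Filesystem resource plus location constraints. A sketch, with the caveat that every name, device, and score below is made up, and that ExaScaler generates its HA configuration through its own tooling rather than raw pcs commands:

```shell
# Hypothetical resource name, device, mount point, and node names.
pcs resource create scratch-OST0004 ocf:heartbeat:Filesystem \
    device="/dev/mapper/ost-scratch4" directory="/lustre/ost4" \
    fstype="lustre" op monitor interval=30s timeout=60s

# Prefer oss1, allow failover to its HA partner oss2.
pcs constraint location scratch-OST0004 prefers oss1=100
pcs constraint location scratch-OST0004 prefers oss2=50
```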
It seems to me that steps may still be missing?

You're going to rack/stack and provision the OSS nodes with new OSTs. Then you're going to introduce failover options somewhere? On the new OSTs? On the existing system? Etc.?

If you're introducing failover with the new OSTs and leaving the existing system in place, you should be able to accomplish this without bringing the system offline.

If you're going to be introducing failover to your existing system, then you will need to reconfigure the file system to accommodate the new failover settings (failover nodes, etc.)

-cf
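For the record, the failover settings Colin refers to are normally stamped onto an existing target with tunefs.lustre while it is unmounted, and propagating the change cluster-wide is what pulls in the writeconf. A sketch only; the NIDs and device are placeholders for this cluster:

```shell
# Run on the OSS with the target unmounted; NIDs and device are placeholders.
# --servicenode lists every NID (primary and failover partner) that may
# serve this target; --writeconf rewrites the config log to match.
tunefs.lustre \
    --servicenode=192.168.1.5@o2ib --servicenode=192.168.1.6@o2ib \
    --writeconf /dev/mapper/ost-scratch0
```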
Our upgrade strategy is as follows:

1) Load all disks into the storage array.
2) Create RAID pools and virtual disks.
3) Create the lustre file system using the mkfs.lustre command. (I still have to figure out all the parameters used on the existing OSTs.)
4) Create mount points on all OSSs.
5) Mount the lustre OSTs.
6) Maybe rebalance the filesystem.

My understanding is that the above can be done without bringing the filesystem down. I want to create the HA configuration (corosync and pacemaker) for the new OSTs. This step requires the filesystem to be down. I want to know what would happen to the suspended processes across the cluster when I bring the filesystem down to re-generate the HA configs.

Thanks,
-Raj

On Thu, Feb 21, 2019 at 12:59 AM Colin Faber wrote:

> Can you provide more details on your upgrade strategy? In some cases
> expanding your storage shouldn't impact client / job activity at all.
>
> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam wrote:
>
>> Hello,
>>
>> We are planning on expanding our storage by adding more OSTs to our
>> lustre file system. It looks like it would be easier to expand if we
>> bring the filesystem down and perform the necessary operations. We are
>> planning to suspend all the jobs running on the cluster. We originally
>> planned to add new OSTs to the live filesystem.
>>
>> We are trying to determine the potential impact to the suspended jobs
>> if we bring down the filesystem for the upgrade.
>> One of the questions we have is what would happen to the suspended
>> processes that hold an open file handle in the lustre file system when
>> the filesystem is brought down for the upgrade?
>> Will they recover from the client eviction?
>>
>> We do have vendor support and have engaged them. I wanted to ask the
>> community and get some feedback.
>>
>> Thanks,
>> -Raj
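On step 3 above, the parameters used on the existing OSTs can usually be read back without touching the disk, and the new OSTs formatted to match. A sketch with placeholder fsname, NID, index, and device:

```shell
# Print an existing OST's configuration without modifying anything.
tunefs.lustre --dryrun /dev/mapper/ost-home

# Format a new OST to match (all values are placeholders; use the next
# unused --index for the filesystem, and the real MGS NID).
mkfs.lustre --ost --fsname=scratch --index=4 \
    --mgsnode=192.168.1.5@o2ib \
    /dev/mapper/ost-scratch4

# Steps 4-5: mount point, then mount; the OST registers with the MGS live.
mkdir -p /ost-scratch4
mount -t lustre /dev/mapper/ost-scratch4 /ost-scratch4
```

This is also why adding OSTs by itself needs no downtime: a newly mounted OST registers with the running MGS, and only the HA-config regeneration forces the restart discussed above.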