[Lustre-discuss] delete a undeletable file
Hello,

I have a corrupt file that I can't delete. This is the file:

    ls -la .viminfo
    -? ? ? ? ? ? .viminfo

    lfs getstripe .viminfo
    .viminfo
    lmm_stripe_count:   6
    lmm_stripe_size:    1048576
    lmm_layout_gen:     0
    lmm_stripe_offset:  18
        obdidx     objid      objid   group
            18   1442898   0x160452       0
            22        48       0x30       0
            19   1442770   0x1603d2       0
            21        49       0x31       0
            23        48       0x30       0
            20        50       0x32       0

And these are my OSTs:

    lctl dl
    0 UP mgc MGC192.168.11.9@tcp f6d5b76f-a7e0-61ca-b389-cb3896b86186 5
    1 UP lov cetafs-clilov-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 4
    2 UP lmv cetafs-clilmv-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 4
    3 UP mdc cetafs-MDT-mdc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    4 UP osc cetafs-OST-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    5 UP osc cetafs-OST0001-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    6 UP osc cetafs-OST0002-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    7 UP osc cetafs-OST0003-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    8 UP osc cetafs-OST0004-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    9 UP osc cetafs-OST0005-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    10 UP osc cetafs-OST0006-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    11 UP osc cetafs-OST0007-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    12 UP osc cetafs-OST0012-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    13 UP osc cetafs-OST0013-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    14 UP osc cetafs-OST0008-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    15 UP osc cetafs-OST000a-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    16 UP osc cetafs-OST0009-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    17 UP osc cetafs-OST000b-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    18 UP osc cetafs-OST000c-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    19 UP osc cetafs-OST000d-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    20 UP osc cetafs-OST000e-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    21 UP osc cetafs-OST000f-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    22 UP osc cetafs-OST0010-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    23 UP osc cetafs-OST0011-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    24 UP osc cetafs-OST0018-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    25 UP osc cetafs-OST0015-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    26 UP osc cetafs-OST0016-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
    27 UP osc cetafs-OST0017-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5

When I try to delete the file I get the following message:

    rm .viminfo
    rm: cannot remove `.viminfo': Invalid argument

How can I delete the file?

THANKS!

Alfonso Pardo Díaz
Researcher / System Administrator at CETA-Ciemat
c/ Sola nº 1; 10200 Trujillo, ESPAÑA
Tel: +34 927 65 93 17 Fax: +34 927 32 32 37

Disclaimer: This message and its attached files are intended exclusively for its recipients and may contain confidential information. If you received this e-mail in error you are hereby notified that any dissemination, copy or disclosure of this communication is strictly prohibited and may be unlawful. In this case, please notify us by a reply and delete this email and its contents immediately.

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
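A small aside on reading the lfs getstripe output in the message above: the two objid columns appear to be the same object ID, printed once in decimal and once in hexadecimal. A minimal shell sketch to check that follows; objid_hex is a hypothetical helper name, not a Lustre command.

```shell
#!/bin/sh
# objid_hex is a hypothetical helper (not part of Lustre): it prints a
# decimal object ID in the 0x... form used by the hexadecimal objid column.
objid_hex() {
    printf '0x%x\n' "$1"
}

# Decimal object IDs copied from the getstripe output in the message.
for id in 1442898 48 1442770 49 50; do
    printf '%s -> %s\n' "$id" "$(objid_hex "$id")"
done
```

If the printed values match the hexadecimal column, the trailing digit on each row is the separate group column.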
Re: [Lustre-discuss] OSTs not activating following MGS/MDS move
On 26/02/13 17:30, Colin Faber wrote:
> Hi,
>
> As a follow-up (for archival reasons), the issue Patrick experienced was
> CATALOG file corruption. Truncation of the CATALOG file on the MDS via
> an ldiskfs mount corrected his problem.

Thanks for the follow-up. As I'm about to undertake a similar move (though on 1.8.8-wc1) and would like to avoid similar problems, it would be useful to know whether the CATALOG file corruption was caused by the procedure or was a coincidence.

Chris

> -cf
>
> On 02/26/2013 08:43 AM, Patrick Shopbell wrote:
>> Hello everyone,
>>
>> I am having an odd problem here, on our small Lustre installation. We
>> have a single MGS/MDS and 3 OSSs with 7 OSTs total. I just tried
>> moving the MDS/MGS to a faster machine, following the instructions in
>> sections 17.3 and 17.4 of the Lustre manual: with the system offline,
>> I mounted the file systems as ldiskfs and then used the Lustre tar
>> command to make a copy of everything. I checked a bunch of the xattrs;
>> all looked to match fine.
>>
>> Finally, I reset the system configs on the MDS/MGS with:
>>
>>     tunefs.lustre --writeconf /dev/md126
>>
>> and on the OSSs with something like:
>>
>>     tunefs.lustre --writeconf /dev/sdb
>>     tunefs.lustre --erase-param --mgsnode=192.168.30.113@tcp --index=0 --writeconf /dev/sdb
>>
>> where I kept the indices the same as in my original setup.
>>
>> I can now mount the MGS/MDS, and then mount the OSTs. However, I get
>> these three errors on the MGS when an OST mounts:
>>
>>     Feb 25 22:38:38 yupana kernel: LustreError: 3636:0:(lov_log.c:155:lov_llog_origin_connect()) error osc_llog_connect tgt 6 (-107)
>>     Feb 25 22:38:38 yupana kernel: LustreError: 3636:0:(mds_lov.c:873:__mds_lov_synchronize()) lustre-OST0006_UUID failed at llog_origin_connect: -107
>>     Feb 25 22:38:38 yupana kernel: LustreError: 3636:0:(mds_lov.c:903:__mds_lov_synchronize()) lustre-OST0006_UUID sync failed -107, deactivating
>>
>> And when I run 'lctl dl', the OSTs are apparently all inactive:
>>
>>     5 IN osc lustre-OST-osc-MDT lustre-MDT-mdtlov_UUID 5
>>
>> Any ideas what I need to do to activate these? I am running Lustre 2.3
>> on all nodes. I can see the file system on a client and, it seems,
>> read files, but I cannot create any new files, presumably because the
>> OSTs are not active.
>>
>> Thanks for your suggestions,
>> Patrick
>>
>> Patrick Shopbell          Department of Astronomy
>> p...@astro.caltech.edu    Mail Code 249-17
>> (626) 395-4097            California Institute of Technology
>> (626) 568-9352 (FAX)      Pasadena, CA 91125
>> WWW: http://www.astro.caltech.edu/~pls/
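To spot deactivated devices quickly, an 'lctl dl' listing can be filtered on its second column. This is only a sketch: it assumes the column layout seen in the listings in this thread (index, state, type, name, UUID, refcount), and not_up_devices is a made-up helper name.

```shell
#!/bin/sh
# not_up_devices is a hypothetical helper: it reads `lctl dl`-style lines on
# stdin and prints the name (4th column) of every device whose state
# (2nd column) is anything other than UP.
not_up_devices() {
    awk 'NF >= 4 && $2 != "UP" { print $4 }'
}

# On a real server this would be:  lctl dl | not_up_devices
# Here the input is sample text. The first line is invented for
# illustration; the second is the line quoted in the message above.
printf '%s\n' \
    '0 UP mgc MGC192.168.30.113@tcp example-uuid 5' \
    '5 IN osc lustre-OST-osc-MDT lustre-MDT-mdtlov_UUID 5' |
    not_up_devices
```

An empty result would mean every listed device reports UP.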
Re: [Lustre-discuss] OSTs not activating following MGS/MDS move
Hi Christopher,

In general this can happen when the initial remount of the various services is done in the wrong order, such as MGS - OST - MDT - Client, or MGS - MDT - Clients - OST, etc.

During initial mount and registration it is critical that you mount in the correct order:

    MGS - MDT - OST(s) - Client(s)

CATALOG corruption, or an out-of-order sequence, is rarer on an active file system, but is possible. The simple fix, as described earlier in this thread, is to truncate the CATALOG file, and all should be well again.

-cf

On 03/07/2013 08:31 AM, Christopher J. Walker wrote:
> [...]
Re: [Lustre-discuss] delete a undeletable file
The snarky reply would be to use Emacs.

More seriously: when I see something like ?? in the attributes for a file, my first thought is that the group_upcall on the filesystem is not correct, so permissions are broken. If you can log on as root, you may be able to see it clearly.

If that doesn't work, you may have to run an fsck on the MDT (which may take minutes to hours, depending on the size of your MDT).

If that doesn't work, follow the procedure for running an lfsck (which will take a long time and require quite a bit of storage to execute).

-Ben Evans

From: lustre-discuss-boun...@lists.lustre.org [lustre-discuss-boun...@lists.lustre.org] on behalf of Alfonso Pardo [alfonso.pa...@ciemat.es]
Sent: Thursday, March 07, 2013 10:09 AM
To: lustre-discuss@lists.lustre.org; wc-disc...@whamcloud.com
Subject: [Lustre-discuss] delete a undeletable file

> [...]
Re: [Lustre-discuss] delete a undeletable file
Hi,

If the file is disassociated from an OST which is offline, bring the OST back online. If the OST object itself is missing, you can remove the file using 'unlink' rather than 'rm' to unlink the object metadata.

If you want to try to recover the missing OST object, an lfs getstripe against the file should show the OST on which it resides. Once that is determined, you can take that OST offline, and e2fsck may successfully restore the object.

Another option, as Ben correctly points out, is lfsck, which will correct or prune this metadata as well as the now-orphaned OST object (if any).

-cf

On 03/07/2013 08:30 AM, Ben Evans wrote:
> [...]
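The rm-then-unlink suggestion can be sketched as a small shell fallback. This is an illustration only: remove_entry is a hypothetical helper, and whether unlink succeeds on a damaged Lustre file depends on what is actually wrong with it.

```shell
#!/bin/sh
# remove_entry is a hypothetical helper: try rm first and, if that fails,
# fall back to unlink(1), which calls unlink(2) on the name directly.
# It prints which command removed the entry.
# (Names starting with "-" are not handled in the unlink branch.)
remove_entry() {
    if rm -- "$1" 2>/dev/null; then
        echo "removed with rm"
    elif unlink "$1"; then
        echo "removed with unlink"
    else
        echo "could not remove $1" >&2
        return 1
    fi
}

# Demonstration on a scratch file. On the broken file from this thread the
# call would be:  remove_entry .viminfo
scratch=$(mktemp)
remove_entry "$scratch"
```

On an ordinary file the rm branch is taken; the unlink branch only matters when rm reports an error, as it did here with "Invalid argument".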
[Lustre-discuss] lustre startup sequence Re: OSTs not activating following MGS/MDS move
Hi Colin,

This is not what the manual says. Should it be corrected, then? Or should a description of the startup sequence in different situations (first start, restart) be added?

The manual (or the online information) does not describe a graceful shutdown sequence for a separate MGS/MDT configuration; it would be nice to add that too.

Alex.

E.g. http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_24122
and similarly http://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#dbdoclet.50438194_24122

13.2 Starting Lustre

The startup order of Lustre components depends on whether you have a combined MGS/MDT or these components are separate.

* If you have a combined MGS/MDT, the recommended startup order is OSTs, then the MGS/MDT, and then clients.
* If the MGS and MDT are separate, the recommended startup order is: MGS, then OSTs, then the MDT, and then clients.

On Mar 7, 2013, at 9:51 AM, Colin Faber wrote:
> [...]
Re: [Lustre-discuss] lustre startup sequence Re: OSTs not activating following MGS/MDS move
Hi,

Yes, thanks for finding this, Alex. The manual should be updated with the correct order.

-cf

On 03/07/2013 09:39 AM, Alex Kulyavtsev wrote:
> [...]
Re: [Lustre-discuss] lustre startup sequence Re: OSTs not activating following MGS/MDS move
Hello,

AFAIK there are two orders:

- If you are starting your filesystem for the first time (or after using --writeconf), the order is: MGS, MDS, OST, Clients
- On a normal start: MGS, OST, MDS, Clients

There is a patch in some recent Lustre releases that allows the first order to be used at any time, but I would advise using the second one anyway, as it avoids starting the MDS first, having it lack connections to the OSTs, and then having it reconnect to them once they are really started.

Aurélien

On 07/03/2013 17:48, Colin Faber wrote:
> [...]
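The two orders described above can be written down as a tiny lookup. This is a sketch based only on what is said in this thread (not on the manual), and start_order is a hypothetical helper name.

```shell
#!/bin/sh
# start_order is a hypothetical helper: it prints the mount order suggested
# in this thread for a given situation.
#   first | writeconf : first start, or first start after --writeconf
#   normal            : any later restart
start_order() {
    case "$1" in
        first|writeconf) echo "MGS MDS OST Clients" ;;
        normal)          echo "MGS OST MDS Clients" ;;
        *)
            echo "usage: start_order first|writeconf|normal" >&2
            return 2
            ;;
    esac
}

start_order writeconf
start_order normal
```

Keeping the two cases separate mirrors the distinction made in the message: registration order matters on first start, while the normal order avoids the MDS starting without its OSTs.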
Re: [Lustre-discuss] lustre startup sequence Re: OSTs not activating following MGS/MDS move
I should make this clear: this is only critical for the initial startup. The order of successive startups doesn't matter so much, as the services have already been registered.

-cf

On 03/07/2013 09:52 AM, DEGREMONT Aurelien wrote:
> [...]
Re: [Lustre-discuss] delete a undeletable file
You could just unlink it instead. That will work when rm fails.

bob

On 3/7/2013 11:10 AM, Colin Faber wrote:

Hi,

If the file is disassociated from an OST which is offline, bring the OST back online. If the OST object itself is missing, then you can remove the file using 'unlink' rather than 'rm' to unlink the object metadata.

If you want to try to recover the missing OST object, an lfs getstripe against the file should yield the OST on which it resides. Once that's determined, you can take that OST offline, and e2fsck may successfully restore it.

Another option, as Ben correctly points out: lfsck will correct / prune this metadata as well as the now orphaned (if any) OST object.

-cf

On 03/07/2013 08:30 AM, Ben Evans wrote:

The snarky reply would be to use Emacs.

More seriously: when I see something like ?? in the attributes for a file, my first thought is that the group_upcall on the filesystem is not correct, so permissions are broken. If you can log on as root, you may be able to see it clearly.

If that doesn't work, you may have to run an fsck on the MDT (which may take minutes to hours depending on the size of your MDT).

If that doesn't work, follow the procedure for running an lfsck (which will take a long time, and require quite a bit of storage to execute).

-Ben Evans

From: lustre-discuss-boun...@lists.lustre.org [lustre-discuss-boun...@lists.lustre.org] on behalf of Alfonso Pardo [alfonso.pa...@ciemat.es]
Sent: Thursday, March 07, 2013 10:09 AM
To: lustre-discuss@lists.lustre.org; wc-disc...@whamcloud.com
Subject: [Lustre-discuss] delete a undeletable file

Hello,

I have a corrupt file that I can't delete. This is my file:

ls -la .viminfo
-? ? ? ? ? ? .viminfo

lfs getstripe .viminfo
.viminfo
lmm_stripe_count:   6
lmm_stripe_size:    1048576
lmm_layout_gen:     0
lmm_stripe_offset:  18
    obdidx      objid       objid   group
        18    1442898    0x160452       0
        22         48        0x30       0
        19    1442770    0x1603d2       0
        21         49        0x31       0
        23         48        0x30       0
        20         50        0x32       0

And these are my OSTs:

lctl dl
  0 UP mgc MGC192.168.11.9@tcp f6d5b76f-a7e0-61ca-b389-cb3896b86186 5
  1 UP lov cetafs-clilov-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 4
  2 UP lmv cetafs-clilmv-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 4
  3 UP mdc cetafs-MDT-mdc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
  4 UP osc cetafs-OST-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
  5 UP osc cetafs-OST0001-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
  6 UP osc cetafs-OST0002-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
  7 UP osc cetafs-OST0003-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
  8 UP osc cetafs-OST0004-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
  9 UP osc cetafs-OST0005-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 10 UP osc cetafs-OST0006-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 11 UP osc cetafs-OST0007-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 12 UP osc cetafs-OST0012-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 13 UP osc cetafs-OST0013-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 14 UP osc cetafs-OST0008-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 15 UP osc cetafs-OST000a-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 16 UP osc cetafs-OST0009-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 17 UP osc cetafs-OST000b-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 18 UP osc cetafs-OST000c-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 19 UP osc cetafs-OST000d-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 20 UP osc cetafs-OST000e-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 21 UP osc cetafs-OST000f-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 22 UP osc cetafs-OST0010-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 23 UP osc cetafs-OST0011-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 24 UP osc cetafs-OST0018-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 25 UP osc cetafs-OST0015-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 26 UP osc cetafs-OST0016-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5
 27 UP osc cetafs-OST0017-osc-88009816e400 a7ba6783-6ed8-2197-4ffc-fecbff9860a5 5

When I try to delete the file I get the following message:

rm .viminfo
rm: cannot remove `.viminfo': Invalid argument

What can I do to delete the file?

THANKS!

Alfonso Pardo Díaz
Researcher / System Administrator at CETA-Ciemat
c/ Sola nº 1; 10200 Trujillo, ESPAÑA
Tel: +34 927 65 93 17 Fax: +34 927 32 32 37
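Colin's suggestion to use lfs getstripe to find the OST holding each object can be checked against the lctl dl listing with a few lines of Python. This is only a sketch under two assumptions: that the obdidx column is the decimal OST index, and that OST target names follow the usual fsname-OSTxxxx pattern with a four-digit hexadecimal index. The helper names ost_name and missing_osts are made up for this illustration and are not part of any Lustre tool. The OST with the garbled name in the listing (cetafs-OST-osc-...) is left out of the sample data.

```python
import re


def ost_name(fsname: str, obdidx: int) -> str:
    """Build the OST target name for a decimal obdidx.

    Assumes the fsname-OSTxxxx convention, where xxxx is the index in hex.
    """
    return f"{fsname}-OST{obdidx:04x}"


def missing_osts(fsname, stripe_indexes, lctl_dl_text):
    """Return stripe OSTs that have no matching device name in `lctl dl` output."""
    pattern = rf"{re.escape(fsname)}-OST[0-9a-f]{{4}}"
    present = set(re.findall(pattern, lctl_dl_text))
    names = [ost_name(fsname, i) for i in stripe_indexes]
    return [name for name in names if name not in present]


# obdidx values from the lfs getstripe output quoted in the thread.
stripes = [18, 22, 19, 21, 23, 20]

# OSC device names as they appear in the quoted lctl dl output:
# OST0001 through OST0013 (1..19) and OST0015 through OST0018 (21..24).
listing = "\n".join(
    f"cetafs-OST{n:04x}-osc-88009816e400"
    for n in list(range(1, 20)) + [21, 22, 23, 24]
)

missing = missing_osts("cetafs", stripes, listing)
print(missing)
```

With the data quoted in this thread, the only stripe without a listed device is obdidx 20, which under the naming assumption above is cetafs-OST0014. Nobody in the thread confirms that this is the cause, so treat it as a hint about where to look first rather than a diagnosis.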
Re: [Lustre-discuss] lustre startup sequence Re: OSTs not activating following MGS/MDS move
Colin,

Could you please open a LUDOC JIRA ticket to track this correction?

Thanks,
Peter

On 3/7/13 8:48 AM, Colin Faber colin_fa...@xyratex.com wrote:

Hi,

Yes, thanks for finding this, Alex. The manual should be updated with the correct order.

-cf

On 03/07/2013 09:39 AM, Alex Kulyavtsev wrote:

Hi Colin.

This is not what the manual says. Should it be corrected then? Or should a description be added of the startup sequence in different situations (first start, restart)? The manual (or online information) does not describe a graceful shutdown sequence for a separate MGS/MDT configuration; it would be nice to add that too.

Alex.

E.g. http://wiki.lustre.org/manual/LustreManual20_HTML/LustreOperations.html#50438194_24122 and similar http://build.whamcloud.com/job/lustre-manual/lastSuccessfulBuild/artifact/lustre_manual.xhtml#dbdoclet.50438194_24122

13.2 Starting Lustre

The startup order of Lustre components depends on whether you have a combined MGS/MDT or these components are separate.

* If you have a combined MGS/MDT, the recommended startup order is OSTs, then the MGS/MDT, and then clients.
* If the MGS and MDT are separate, the recommended startup order is: *MGS, then OSTs, then the MDT, and then clients.*

On Mar 7, 2013, at 9:51 AM, Colin Faber wrote:

Hi Christopher,

In general this can happen when your initial remount of the various services is in the wrong order, such as MGS - OST - MDT - Client, or MGS - MDT - Clients - OST, etc.

During initial mount and registration it's critical that your mount be in the correct order:

MGS - MDT - OST(s) - Client(s)

CATALOG corruption or an out-of-order sequence is rarer on an active file system, but is possible. The simple fix here, as described below, is to just truncate it and all should be well again.
-cf

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Re: [Lustre-discuss] lustre startup sequence Re: OSTs not activating following MGS/MDS move
Hi all -

As the original poster of this thread, I should probably just weigh in that it is indeed possible that something was out of order when I brought up our setup with the new MGS+MDS. I *thought* I did it right, since I was following the instructions in section 14.5 of the manual (Changing a Server NID), and that section does indeed advise the proper initial order:

MGS, MDS, OST, Clients

But maybe I got a client or something in there too early. I also had some issues with the NIDs of the OSTs pointing to an old ethernet interface first, so maybe that confused things.

The solution was perfect, though. Thanks to Colin and this list.

--
Patrick

On 3/7/13 8:53 AM, Colin Faber wrote:

I should make this clear: this is only critical for initial startup. Successive startups don't matter so much, as services have already been registered.

-cf

On 03/07/2013 09:52 AM, DEGREMONT Aurelien wrote:

Hello,

AFAIK there are two orders:

- If you are starting your filesystem for the first time (or using --writeconf), the order is: MGS, MDS, OST, Clients
- On a normal start: MGS, OST, MDS, Clients

There is a patch in some recent Lustre releases that makes it possible to use the first order at any time, but I would advise using the second one anyway, as it avoids starting the MDS first, lacking a connection to the OSTs, and then reconnecting to them once they are really started.

Aurélien

--
Patrick Shopbell            Department of Astronomy
p...@astro.caltech.edu      Mail Code 249-17
(626) 395-4097              California Institute of Technology
(626) 568-9352 (FAX)        Pasadena, CA 91125
WWW: http://www.astro.caltech.edu/~pls/
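The two mount orders discussed in this thread can be written down as a small sorting helper, which may be handy as a sanity check in a startup script. This is a sketch only: the function name startup_sequence, the (kind, name) tuples, and the target names are hypothetical, and the orders are simply the ones quoted in the thread (MGS, MDT, OST, clients for a first start or after --writeconf; MGS, OST, MDT, clients for a routine restart). "MDT" here stands for what some posters call the MDS.

```python
# Orders as stated in the thread; not taken from the Lustre manual.
FIRST_START = ["MGS", "MDT", "OST", "client"]   # initial mount or after --writeconf
NORMAL_START = ["MGS", "OST", "MDT", "client"]  # routine restart


def startup_sequence(targets, first_start=False):
    """Sort (kind, name) pairs into the mount order for the given situation.

    sorted() is stable, so targets of the same kind keep their input order.
    """
    order = FIRST_START if first_start else NORMAL_START
    rank = {kind: position for position, kind in enumerate(order)}
    return sorted(targets, key=lambda target: rank[target[0]])


# Hypothetical targets, listed in no particular order.
targets = [
    ("OST", "ost0"),
    ("client", "node1"),
    ("MDT", "mdt0"),
    ("OST", "ost1"),
    ("MGS", "mgs"),
]

for kind, name in startup_sequence(targets, first_start=True):
    print(kind, name)
```

Passing first_start=True puts the MDT ahead of the OSTs, matching the registration order Colin describes; the default follows the restart order Aurélien recommends.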