[lustre-discuss] Re: Cannot add new OST after upgrade from 2.5.3 to 2.10.6
Hi,

These new OSTs were formatted with e2fsprogs-1.44.3.wc1-0.el7.x86_64, while the MGS and the other, older OSTs were formatted with e2fsprogs-1.42.12.wc1 last year; all of them are now mounted on servers with e2fsprogs-1.44.3.wc1-0.el7.x86_64 installed. Do we need to run writeconf on all of the devices, following this process?
https://lustre-discuss.lustre.narkive.com/Z5s6LU8B/lustre-2-5-2-unable-to-mount-ost

Thanks,
Lu

Wang Lu
Computing Center, Institute of High Energy Physics, CAS, China
Email: lu.w...@ihep.ac.cn

===
From: wanglu
Date: 2018-12-28 10:45
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] Cannot add new OST after upgrade from 2.5.3 to 2.10.6

Hi,

For hardware compatibility reasons, we just upgraded a 2.5.3 instance to 2.10.6. After that, when we tried to mount a newly formatted OST on 2.10.6, we got failures on the OSS. Here are the symptoms:

1. The OST mount operation hangs for about 10 minutes, and then we get "Is the MGS running?..." on the terminal.
2. In syslog, we find:
   LustreError: 166-1: MGC192.168.50.63@tcp: Connection to MGS (at 192.168.50.63@tcp) was lost; in progress operations using this service will fail
   LustreError: 105461:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1545962328, 300s ago), entering recovery for MGS@MGC192.168.50.63@tcp_0 ns: MGC192.168.50.63@tcp lock: 9ae9283b8200/0xa4c148c2f2e256b9 lrc: 4/1,0 mode: -/CR res: [0x73666361:0x0:0x0].0x0 rrc: 3 type: PLN flags: 0x1 nid: local remote: 0x38d3cf901311c189 expref: -99 pid: 105461 timeout: 0 lvb_type: 0
3. While the mount is stuck, we can see ll_OST_XX and lazyldiskfsinit threads running on the new OSS, but no obdfilter directory appears under /proc/fs/lustre.
4. On the MDS+MGS node, we get "166-1: MGC192.168.50.63@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail" on the MGS.
5. After that, other new clients cannot mount the file system.
6. Since the OST mount operation seemed to have caused problems on the MGS, we unmounted the MDT, ran e2fsck, and remounted it.
7. After that, clients can mount again, but "lfs df" shows the OST as deactivated.
8. When we try to mount the new OST again, the same symptoms repeat...

Does anyone have a hint on this problem?

Cheers,
Lu

Wang Lu
Computing Center, Institute of High Energy Physics, CAS, China
Email: lu.w...@ihep.ac.cn
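For reference, the full writeconf procedure from the "Regenerating Lustre Configuration Logs" section of the Lustre Operations Manual is roughly the sketch below. The device paths, mount points and fsname are placeholders, and the whole file system has to be offline while it runs:

    # 1. shut down in this order: clients, then the MDT (and MGS), then the OSTs
    umount /mnt/lustre            # on every client
    umount /mnt/mdt               # on the MDS (and the MGS device, if it is separate)
    umount /mnt/ost00             # on every OSS, for every OST

    # 2. regenerate the configuration logs on every server target
    tunefs.lustre --writeconf /dev/mdtdev
    tunefs.lustre --writeconf /dev/ost00dev   # repeat for each OST

    # 3. restart in the reverse order: MGS/MDT first, then OSTs, then clients
    mount -t lustre /dev/mdtdev /mnt/mdt
    mount -t lustre /dev/ost00dev /mnt/ost00
    mount -t lustre 192.168.50.63@tcp:/fsname /mnt/lustre

Since writeconf rewrites every target's configuration log, it is worth double-checking the exact steps against the manual for the installed release before doing this on a production file system.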
[lustre-discuss] Cannot add new OST after upgrade from 2.5.3 to 2.10.6
Hi,

For hardware compatibility reasons, we just upgraded a 2.5.3 instance to 2.10.6. After that, when we tried to mount a newly formatted OST on 2.10.6, we got failures on the OSS. Here are the symptoms:

1. The OST mount operation hangs for about 10 minutes, and then we get "Is the MGS running?..." on the terminal.
2. In syslog, we find:
   LustreError: 166-1: MGC192.168.50.63@tcp: Connection to MGS (at 192.168.50.63@tcp) was lost; in progress operations using this service will fail
   LustreError: 105461:0:(ldlm_request.c:148:ldlm_expired_completion_wait()) ### lock timed out (enqueued at 1545962328, 300s ago), entering recovery for MGS@MGC192.168.50.63@tcp_0 ns: MGC192.168.50.63@tcp lock: 9ae9283b8200/0xa4c148c2f2e256b9 lrc: 4/1,0 mode: -/CR res: [0x73666361:0x0:0x0].0x0 rrc: 3 type: PLN flags: 0x1 nid: local remote: 0x38d3cf901311c189 expref: -99 pid: 105461 timeout: 0 lvb_type: 0
3. While the mount is stuck, we can see ll_OST_XX and lazyldiskfsinit threads running on the new OSS, but no obdfilter directory appears under /proc/fs/lustre.
4. On the MDS+MGS node, we get "166-1: MGC192.168.50.63@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail" on the MGS.
5. After that, other new clients cannot mount the file system.
6. Since the OST mount operation seemed to have caused problems on the MGS, we unmounted the MDT, ran e2fsck, and remounted it.
7. After that, clients can mount again, but "lfs df" shows the OST as deactivated.
8. When we try to mount the new OST again, the same symptoms repeat...

Does anyone have a hint on this problem?

Cheers,
Lu

Wang Lu
Computing Center, Institute of High Energy Physics, CAS, China
Email: lu.w...@ihep.ac.cn
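While a mount like this is hanging, a few quick checks can show whether the problem is LNet connectivity or the MGS-side registration of the new OST. A sketch, where "fsname" stands in for the real file system name:

    # from the OSS: is the MGS reachable over LNet at all?
    lctl ping 192.168.50.63@tcp

    # from the OSS: which devices the mount has managed to set up so far
    lctl dl

    # on the MGS: list the configuration llogs it holds; the new OST index
    # should eventually appear in the fsname-client / fsname-MDT0000 logs
    lctl --device MGS llog_catlist

    # on the MGS: watch for registration errors while the OST mount is retried
    tail -f /var/log/messages

If the new OST never shows up in the MGS configuration logs, the registration RPC itself is failing rather than the local OST setup, which narrows down where to look next.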
[lustre-discuss] FID used by two objects
Hello,

One OST of our system cannot be mounted in Lustre mode after a severe disk error and a 5-day e2fsck. Here are the errors we get during the mount operation:

#grep FID /var/log/messages
Jul 17 20:15:21 oss04 kernel: LustreError: 13089:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x20005:0x1:0x0] is used by two objects: 86/3303188178 48085/1708371613
Jul 17 20:38:41 oss04 kernel: LustreError: 13988:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x20005:0x1:0x0] is used by two objects: 86/3303188178 48086/3830163079
Jul 17 20:49:55 oss04 kernel: LustreError: 14221:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x20005:0x1:0x0] is used by two objects: 86/3303188178 48087/538285899
Jul 18 11:39:25 oss04 kernel: LustreError: 31071:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x20005:0x1:0x0] is used by two objects: 86/3303188178 48088/2468309129
Jul 18 11:39:56 oss04 kernel: LustreError: 31170:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x20005:0x1:0x0] is used by two objects: 86/3303188178 48089/2021195118
Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x20005:0x1:0x0] is used by two objects: 86/3303188178 48090/956682248

and the mount operation fails with error -17:

Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(osd_oi.c:653:osd_oi_insert()) lustre-OST0036: the FID [0x20005:0x1:0x0] is used by two objects: 86/3303188178 48090/956682248
Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(qsd_lib.c:418:qsd_qtype_init()) lustre-OST0036: can't open slave index copy [0x20006:0x2:0x0] -17
Jul 18 12:04:31 oss04 kernel: LustreError: 32127:0:(obd_mount_server.c:1723:server_fill_super()) Unable to start targets: -17
Jul 18 12:04:31 oss04 kernel: Lustre: Failing over lustre-OST0036
Jul 18 12:04:32 oss04 kernel: Lustre: server umount lustre-OST0036 complete

If we run e2fsck again, it reports that inode 480xx has two references and moves 480xx to lost+found:

# e2fsck -f /dev/sdn
e2fsck 1.42.12.wc1 (15-Sep-2014)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 48090
Connect to /lost+found? yes
Inode 48090 ref count is 2, should be 1. Fix? yes
Pass 5: Checking group summary information
lustre-OST0036: * FILE SYSTEM WAS MODIFIED *
lustre-OST0036: 238443/549322752 files (4.4% non-contiguous), 1737885841/2197287936 blocks

Is it possible to find the file corresponding to 86/3303188178 and delete it?

P.S.
1. In ldiskfs mode, most of the files on the disk are OK to read, while some of them are not.
2. There are about 240,000 objects on the OST:
[root@oss04 d0]# df -i /lustre/ostc
Filesystem    Inodes      IUsed    IFree       IUse%  Mounted on
/dev/sdn      549322752   238443   549084309   1%     /lustre/ostc
3. Lustre version 2.5.3; e2fsprogs version 1.42.12.wc1 (as shown in the e2fsck output above).

Thank you!

Wang Lu
Computing Center, Institute of High Energy Physics, CAS, China
Email: lu.w...@ihep.ac.cn
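For context, 86/3303188178 in the messages above looks like an ldiskfs inode number and generation pair, so the path of inode 86 can usually be recovered with debugfs on the unmounted device. A sketch only, where /dev/sdn and the scratch mount point are placeholders, and whether removing the object is actually safe is exactly the question being asked here:

    # resolve inode 86 to a pathname (read-only, safe)
    debugfs -R "ncheck 86" /dev/sdn

    # confirm the generation really is 3303188178 before touching anything
    debugfs -R "stat <86>" /dev/sdn

    # if it turns out to be a stale duplicate, mount as ldiskfs and move
    # it aside rather than deleting it outright
    mount -t ldiskfs /dev/sdn /mnt/ost_ldiskfs
    mv "/mnt/ost_ldiskfs/<path reported by ncheck>" /mnt/ost_ldiskfs/lost+found/
    umount /mnt/ost_ldiskfs

Whether the OI scrub / LFSCK machinery of the installed release copes with the object disappearing is a separate question, so treat this as a diagnostic sketch rather than a fix.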
Re: [lustre-discuss] RE No free catalog slots for log ( Lustre 2.5.3 & Robinhood 2.5.3 )
Hi Alexander,

Before I received this reply, I had deregistered the cl1 user. It took a very long time, and I am not sure whether it finished successfully, since the server crashed once the next morning. Then I moved the old changelog_catalog file away and created a zero-length changelog_user file instead.

This is what I got from the old changelog_catalog file:

# ls -l /tmp/changelog.dmp
-rw-r--r-- 1 root root 4153280 Dec 6 06:54 /tmp/changelog.dmp
# llog_reader changelog.dmp | grep "type=1064553b" | wc -l
63432

This number is smaller than 64768; I am not sure whether that is related to the unfinished deregistration. The first record number is 1 and the last record number is 64767, so I think there may be some skipped record numbers:

# llog_reader changelog.dmp | grep "type=1064553b" | head -n 1
rec #1 type=1064553b len=64
# llog_reader changelog.dmp | grep "type=1064553b" | tail -n 1
rec #64767 type=1064553b len=64
# llog_reader changelog.dmp | grep "^rec" | grep -v "type=1064553b"

The last command returns 0 lines.

By the way, are the plain llog files you mentioned virtual or real? If they are real, where are they located? Do I need to clean them up manually?

Thanks,
Lu, Wang

From: Alexander Boyko
Date: 2015-12-04 21:36
To: wanglu; lustre-discuss
Subject: RE [lustre-discuss] No free catalog slots for log ( Lustre 2.5.3 & Robinhood 2.5.3 )

> Here are 4 questions to which we cannot find answers in LU-1586:
> 1. According to Andreas's reply, there should be some unconsumed changelog files on our MDT, and these files have taken all the space (file quotas?) that Lustre gives to the changelog. With Lustre 2.1, these files are under the OBJECTS directory and can be listed in ldiskfs mode. In our case, with Lustre 2.5.3, no OBJECTS directory can be found. How can we monitor the situation before the unconsumed changelogs take up all the disk space?

The changelog is based on one catalog file plus a set of plain llog files. The catalog stores a limited number of records, about 64768, and a catalog record is 64 bytes; each record describes one plain llog file. A plain llog file stores records about IO operations, and the number of records in a plain llog file is also about 64768, with varying record sizes. So the changelog can store about 64768^2 IO operations, and it occupies file system space. The "no free catalog slots" error happens when the changelog catalog does not have a slot left to store a record for a new plain llog: either all slots are filled, or the internal changelog markers have become inconsistent and the internal logic no longer works. To get closer to the root cause, you need to dump the changelog catalog and check the bitmap: are there free slots? Something like:

debugfs -R "dump changelog_catalog changelog_catalog.dmp" /dev/md55 && used=`llog_reader changelog_catalog.dmp | grep "type=1064553b" | wc -l`

> 2. Why are there so many unconsumed changelogs? Could it be related to our frequent remounts of the MDT (abort_recovery mode)?

An umount operation creates a half-empty plain llog file, and changelog_clear cannot remove it even when all of its slots have been freed; only a new mount can remove that file. So it could be related, or not.

> 3. When we remount the MDT, robinhood is still running. Why can robinhood not consume those old changelogs after the MDT service is recovered?
> 4. Why is there a huge difference between the current index (4199610352) and the cl1 index (49035933)?
> Thank you for your time and help!
> Wang, Lu

--
Alexander Boyko
Seagate
www.seagate.com
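For reference, the check Alexander describes boils down to the sketch below; /dev/md55 is his example device, so substitute the real MDT device, and run it on the MDS:

    # dump the changelog catalog out of the MDT device without mounting it
    debugfs -R "dump changelog_catalog /tmp/changelog_catalog.dmp" /dev/md55

    # count the catalog slots that still point at plain llog files
    used=$(llog_reader /tmp/changelog_catalog.dmp | grep -c "type=1064553b")
    echo "used catalog slots: $used of ~64768"

    # the header printed by llog_reader also contains the bitmap, so the
    # first few lines show whether any free slots remain
    llog_reader /tmp/changelog_catalog.dmp | head -n 20

If the used count is at or near 64768, the catalog really is full; if it is well below that (as in the 63432 result above), the problem is more likely the internal markers Alexander mentions.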
[lustre-discuss] No free catalog slots for log ( Lustre 2.5.3 & Robinhood 2.5.3 )
Hi all,

We hit a "no free catalog slots for log" problem yesterday. Users got "Bad address" errors when trying to delete or create files. Here are some console logs from the MDS:

Dec 1 23:14:41 kernel: LustreError: 23658:0:(llog_cat.c:82:llog_cat_new_log()) no free catalog slots for log...
Dec 1 23:14:42 kernel: LustreError: 23635:0:(llog_cat.c:82:llog_cat_new_log()) no free catalog slots for log...
Dec 1 23:14:42 kernel: LustreError: 23635:0:(llog_cat.c:82:llog_cat_new_log()) Skipped 3029 previous similar messages
Dec 1 23:14:42 kernel: LustreError: 23316:0:(mdd_dir.c:783:mdd_changelog_ns_store()) changelog failed: rc=-28, op6 jobOptions_sim_digam_10.txt.bosslog c[0x200010768:0x2118:0x0] p[0x200012a20:0x186c8:0x0]

We solved the problem by deregistering the cl1 user, as suggested in this thread: https://jira.hpdd.intel.com/browse/LU-1586

# lctl --device besfs-MDT changelog_deregister cl1

The command has been running for 230:41.21 minutes and has not finished yet. The good news is that the MDS service became normal right after we executed it. To avoid a recurrence of this problem before we understand why it happens, we have unmasked all the changelog operations and stopped robinhood.

We are running Lustre 2.5.3 and Robinhood 2.5.3. Currently there are 80 million files; MDT usage is 65% of capacity and 19% of inodes. The size of changelog_catalog is only 4 MB:

-rw-r--r-- 1 root root 4153280 Jul 21 15:18 changelog_catalog

And the index of the cl1 log is:

lctl get_param mdd.besfs-MDT.changelog_users
mdd.besfs-MDT.changelog_users=
current index: 4199610352
ID      index
cl1     49035933

Here are 4 questions to which we cannot find answers in LU-1586:
1. According to Andreas's reply, there should be some unconsumed changelog files on our MDT, and these files have taken all the space (file quotas?) that Lustre gives to the changelog. With Lustre 2.1, these files are under the OBJECTS directory and can be listed in ldiskfs mode. In our case, with Lustre 2.5.3, no OBJECTS directory can be found. How can we monitor the situation before the unconsumed changelogs take up all the disk space?
2. Why are there so many unconsumed changelogs? Could it be related to our frequent remounts of the MDT (abort_recovery mode)?
3. When we remount the MDT, robinhood is still running. Why can robinhood not consume those old changelogs after the MDT service is recovered?
4. Why is there a huge difference between the current index (4199610352) and the cl1 index (49035933)?

Thank you for your time and help!

Wang, Lu
Computing Center, Institute of High Energy Physics, CAS, China
Email: lu.w...@ihep.ac.cn
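Regarding question 1 (monitoring before the unconsumed changelogs fill the disk), a simple cron-able check is to compare the current index with the slowest registered reader. A rough sketch, assuming a single MDT; the one-million-record threshold is an arbitrary example:

    #!/bin/bash
    # Warn when the changelog backlog (current index minus the slowest
    # registered reader's index) grows beyond a threshold.
    threshold=1000000

    users=$(lctl get_param -n 'mdd.*.changelog_users')
    current=$(echo "$users" | awk '/current index/ {print $3}')
    oldest=$(echo "$users" | awk '$1 ~ /^cl[0-9]+$/ {print $2}' | sort -n | head -n 1)

    if [ -z "$oldest" ]; then
        echo "no registered changelog readers"
        exit 0
    fi

    backlog=$((current - oldest))
    echo "changelog backlog: $backlog records"
    if [ "$backlog" -gt "$threshold" ]; then
        echo "WARNING: slowest changelog reader is $backlog records behind" >&2
    fi

A backlog that keeps growing usually means the consumer (robinhood, in this case) has stopped calling changelog_clear, long before the disk itself fills up.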
Re: [Lustre-discuss] MDS read-only
By the way, we have also tried to dd the MDT device and mount the replica; the problem still exists. Besides, we have not seen any errors reported by the hardware monitor. It looks much more like an ldiskfs error than a hardware error.

Lu

On 2012-10-09, at 12:04 PM, wanglu wrote:

> Dear all,
> Two of our MDSes have repeatedly gone read-only recently, after one e2fsck on Lustre 1.8.5. After the MDT has been mounted for a while, the kernel reports errors like:
> Oct 8 20:16:44 mainmds kernel: LDISKFS-fs error (device cciss!c0d1): ldiskfs_ext_check_inode: bad header/extent in inode #50736178: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
> Oct 8 20:16:44 mainmds kernel: Aborting journal on device cciss!c0d1-8.
> and the MDS goes read-only.
> This problem has made about 1 PB of data, 0.1 billion files, unavailable. We believe there is something structurally wrong in the MDT's local file system, so we have tried to fix it with e2fsck following the process in the Lustre manual. However, the loop always goes like this:
> 1. run e2fsck, which fixes (or fails to fix) some errors
> 2. mount the MDT, which goes read-only after some client operations, making the whole system unusable
> 3. run e2fsck again
>
> We have tried three different Lustre versions (1.8.5, 1.8.6, and 1.8.8-wc) with their corresponding e2fsprogs; the problem still exists. Currently we can only use Lustre with all clients mounted read-only, and we are trying to copy the whole file system. However, it takes a long time to generate the directory structure and the file list for 0.1 billion files.
>
> Can anyone give us some suggestions? Thank you very much!
>
> Lu Wang
> CC-IHEP
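For reference, the way such a replica is usually taken is roughly the sketch below; the device and paths are placeholders, and the dump has to be made while the MDT is unmounted:

    # copy the MDT block device to an image file on scratch storage
    dd if=/dev/cciss/c0d1 of=/backup/mdt.img bs=1M conv=noerror,sync

    # run a read-only e2fsck against the image first, to see what would
    # be changed without touching the real device
    e2fsck -fn /backup/mdt.img

    # the image can also be attached to a loop device and mounted as
    # ldiskfs read-only to inspect it or copy data out
    losetup /dev/loop0 /backup/mdt.img
    mount -t ldiskfs -o ro /dev/loop0 /mnt/mdt_copy

Working against the image keeps every e2fsck experiment repeatable, since the original device is never modified until a fix is proven on the copy.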
[Lustre-discuss] MDS read-only
Dear all,

Two of our MDSes have repeatedly gone read-only recently, after one e2fsck on Lustre 1.8.5. After the MDT has been mounted for a while, the kernel reports errors like:

Oct 8 20:16:44 mainmds kernel: LDISKFS-fs error (device cciss!c0d1): ldiskfs_ext_check_inode: bad header/extent in inode #50736178: invalid magic - magic 0, entries 0, max 0(0), depth 0(0)
Oct 8 20:16:44 mainmds kernel: Aborting journal on device cciss!c0d1-8.

and the MDS goes read-only.

This problem has made about 1 PB of data, 0.1 billion files, unavailable. We believe there is something structurally wrong in the MDT's local file system, so we have tried to fix it with e2fsck following the process in the Lustre manual. However, the loop always goes like this:
1. run e2fsck, which fixes (or fails to fix) some errors
2. mount the MDT, which goes read-only after some client operations, making the whole system unusable
3. run e2fsck again

We have tried three different Lustre versions (1.8.5, 1.8.6, and 1.8.8-wc) with their corresponding e2fsprogs; the problem still exists. Currently we can only use Lustre with all clients mounted read-only, and we are trying to copy the whole file system. However, it takes a long time to generate the directory structure and the file list for 0.1 billion files.

Can anyone give us some suggestions? Thank you very much!

Lu Wang
CC-IHEP
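For completeness, the read-only client mounts and the file-list generation mentioned above look roughly like the sketch below; the MGS NID "mainmds@tcp" and the fsname "besfs" are assumptions for illustration, not confirmed in this message:

    # mount a client read-only while the MDT is being investigated
    mount -t lustre -o ro mainmds@tcp:/besfs /mnt/besfs

    # build the file list for the copy from a client
    # (expect this to be very slow with 0.1 billion files)
    find /mnt/besfs -type f > /tmp/besfs_filelist.txt

Splitting the scan across several clients, each walking a different top-level directory, is one way to shorten the file-list generation, at the cost of more metadata load on the already fragile MDS.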
[Lustre-discuss] OSS crashed with PANIC
Hi all,

One of our OSSes, running Lustre version 2.6.18-194.17.1.el5_lustre.1.8.5, crashed today.

crash> sys
      KERNEL: /usr/lib/debug/lib/modules/2.6.18-194.17.1.el5_lustre.1.8.5//vmlinux
    DUMPFILE: /var/crash/2011-10-10-15:30/vmcore
        CPUS: 8
        DATE: Mon Oct 10 15:29:15 2011
      UPTIME: 05:30:52
LOAD AVERAGE: 3.74, 2.14, 1.57
       TASKS: 983
    NODENAME:
     RELEASE: 2.6.18-194.17.1.el5_lustre.1.8.5
     VERSION: #1 SMP Mon Nov 15 15:48:43 MST 2010
     MACHINE: x86_64 (2399 Mhz)
      MEMORY: 23.6 GB
       PANIC: "Oops: [1] SMP " (check log for details)

Here is the end of the crash dump log:

Unable to handle kernel NULL pointer dereference at
RIP: [] :obdfilter:filter_preprw+0x1746/0x1e00
PGD 2f8e86067 PUD 31c2c4067 PMD 0
Oops: [1] SMP
last sysfs file: /devices/pci:00/:00:00.0/irq
CPU 4
Modules linked in: autofs4(U) hidp(U) obdfilter(U) fsfilt_ldiskfs(U) ost(U) mgc(U) ldiskfs(U) jbd2(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) rfcomm(U) l2cap(U) bluetooth(U) sunrpc(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) power_meter(U) hwmon(U) i2c_ec(U) i2c_core(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) parport_pc(U) lp(U) parport(U) joydev(U) ixgbe(U) 8021q(U) hpilo(U) sg(U) shpchp(U) dca(U) serio_raw(U) pcspkr(U) bnx2(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_mem_cache(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_log(U) dm_mod(U) usb_storage(U) lpfc(U) scsi_transport_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Pid: 4252, comm: ll_ost_io_12 Tainted: G 2.6.18-194.17.1.el5_lustre.1.8.5 #1
RIP: 0010:[] [] :obdfilter:filter_preprw+0x1746/0x1e00
RSP: 0018:81030bdcd8c0 EFLAGS: 00010206
RAX: 0021 RBX: RCX: 810011017300
RDX: 8101067b4c90 RSI: 000e RDI: 3533313130323331
RBP: 81030bdd1388 R08: 81061ff40b03 R09: 1000
R10: R11: 000200d2 R12: 007e
R13: 0007e000 R14: 0100 R15: 0100
FS: 2ba793935220() GS:81010af99240() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: CR3: 0002f92d1000 CR4: 06e0
Process ll_ost_io_12 (pid: 4252, threadinfo 81030bdcc000, task 81030bd17080)
Stack: 81031fc503c0 8102c44c7200 81031fc503c0 0002c0a83281 0002ca7a214e 88539543 885a0d80 8102c44c7200 8853ba03
Call Trace:
 [] :lnet:lnet_ni_send+0x93/0xd0
 [] :obdclass:class_handle2object+0xe0/0x170
 [] :lnet:lnet_send+0x9a3/0x9d0
 [] truncate_inode_pages_range+0x222/0x2ba
 [] :ost:ost_brw_write+0xf9c/0x2480
 [] :ptlrpc:ptlrpc_send_reply+0x5c8/0x5e0
 [] :ptlrpc:target_committed_to_req+0x40/0x120
 [] :ptlrpc:lustre_msg_get_version+0x35/0xf0
 [] :ptlrpc:lustre_msg_get_opc+0x35/0xf0
 [] default_wake_function+0x0/0xe
 [] :ptlrpc:lustre_msg_check_version_v2+0x8/0x20
 [] :ost:ost_handle+0x2bae/0x55b0
 [] __next_cpu+0x19/0x28
 [] smp_send_reschedule+0x4e/0x53
 [] :ptlrpc:ptlrpc_server_handle_request+0x97a/0xdf0
 [] :ptlrpc:ptlrpc_wait_event+0x2d8/0x310
 [] __wake_up_common+0x3e/0x68
 [] :ptlrpc:ptlrpc_main+0xf37/0x10f0
 [] child_rip+0xa/0x11
 [] :ptlrpc:ptlrpc_main+0x0/0x10f0
 [] child_rip+0x0/0x11
Code: 44 39 23 7e 0c 48 83 c5 28 e9 56 fd ff ff 45 31 ed 48 8d bc
RIP [] :obdfilter:filter_preprw+0x1746/0x1e00
RSP

And here is the bt result:

crash> bt
PID: 4252 TASK: 81030bd17080 CPU: 4 COMMAND: "ll_ost_io_12"
 #0 [81030bdcd620] crash_kexec at 800ad9c4
 #1 [81030bdcd6e0] __die at 80065157
 #2 [81030bdcd720] do_page_fault at 80066dd7
 #3 [81030bdcd810] error_exit at 8005dde9
    [exception RIP: filter_preprw+5958]
    RIP: 8895e7c6 RSP: 81030bdcd8c0 RFLAGS: 00010206
    RAX: 0021 RBX: RCX: 810011017300
    RDX: 8101067b4c90 RSI: 000e RDI: 3533313130323331
    RBP: 81030bdd1388 R8: 81061ff40b03 R9: 1000
    R10: R11: 000200d2 R12: 007e
    R13: 0007e000 R14: 0100 R15: 0100
    ORIG_RAX: CS: 0010 SS: 0018
 #4 [81030bdcd8f8] lnet_ni_send at 88539543
 #5 [81030bdcd918] lnet_send at 8853ba03
 #6 [81030bdcd9d8] truncate_inode_pages_range at 8002b84a
 #7 [81030bdcdaf8] ptlrpc_send_reply at 8864a658
 #8 [81030bdcdc18] lustre_msg_get_version at 8864eb05
 #9 [81030bdcdc48] lustre_msg_check_version_v2 at 8864ebc8
 #10 [81030bdcdca8] ost_handle at 8890d08e
 #11 [81030bdcde38] ptlrpc_wait_event at 8865e8a8
 #12 [81030bdcdf48]
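In case it helps anyone looking at a similar vmcore, the usual next steps inside crash are sketched below; whether dis can show source lines depends on having the matching Lustre debuginfo package installed, so treat the commands as a sketch:

    crash> mod -S                        # load debug data for all modules, if debuginfo RPMs are installed
    crash> dis -l filter_preprw+0x1746   # disassemble around the faulting RIP (0x1746 = 5958, matching the exception frame)
    crash> log                           # full kernel ring buffer captured in the vmcore
    crash> ps | grep ll_ost_io           # state of the other OST IO threads at crash time

Mapping filter_preprw+0x1746 back to a source line is usually what turns a raw oops like this into something that can be matched against a known bug report.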
[Lustre-discuss] Default stripe count of an OST pool (1.8.5)
Hi all,

We just upgraded our system from 1.8.1.1 to 1.8.5. On 1.8.1.1 we used the default stripe configuration for each directory within an OST pool, so each file has stripe_count=0. On 1.8.1.1, stripe_count=0 gave the same result as stripe_count=1 (one stripe per file). However, it seems that on 1.8.5, stripe_count=0 behaves like stripe_count=-1 (use all OSTs in the pool for the file). Has anyone met the same problem? Is it possible to run "lfs setstripe -c" recursively?

Thank you very much!

Lu Wang
Computing Center, IHEP
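As far as I know there is no recursive flag on "lfs setstripe" itself, but a directory default can be pushed down an existing tree with find. A sketch, where /lustre/pooldir, the pool name and the stripe count of 1 are placeholders; note that a directory default only affects files created afterwards:

    # set an explicit one-stripe default on every subdirectory,
    # re-specifying the pool so it is not lost
    find /lustre/pooldir -type d -exec lfs setstripe -c 1 -p mypool {} \;

    # an existing file only changes layout when it is rewritten, e.g.:
    lfs setstripe -c 1 -p mypool /lustre/pooldir/file.restriped
    cp /lustre/pooldir/file /lustre/pooldir/file.restriped
    mv /lustre/pooldir/file.restriped /lustre/pooldir/file

The copy-and-rename step is only needed for files that were already created with the unwanted wide striping; new files pick up the corrected directory default automatically.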
Re: [Lustre-discuss] Frequent OSS Crashes with heavy load
.c:1139:ost_brw_write()) @@@ network error on bulk GET 0(1048576) [EMAIL PROTECTED] x12883457/t0 o4->[EMAIL PROTECTED]:0/0 lens 384/352 e 0 to 0 dl 1226573467 ref 1 fl Interpret:/0/0 rc 0/0
Nov 13 18:35:15 boss02 kernel: Lustre: 27237:0:(ost_handler.c:1270:ost_brw_write()) besfs-OST0003: ignoring bulk IO comm error with [EMAIL PROTECTED] id [EMAIL PROTECTED] - client will retry
Nov 13 18:35:17 boss02 kernel: LustreError: 26928:0:(events.c:361:server_bulk_callback()) event type 2, status -5, desc da5d7000
Nov 13 18:35:18 boss02 kernel: LustreError: 26928:0:(socklnd.c:1613:ksocknal_destroy_conn()) Completing partial receive from [EMAIL PROTECTED], ip 192.168.52.108:1021, with error
Nov 13 18:35:18 boss02 kernel: LustreError: 26928:0:(socklnd.c:1613:ksocknal_destroy_conn()) Skipped 1 previous similar message
Nov 13 18:35:18 boss02 kernel: LustreError: 26928:0:(events.c:361:server_bulk_callback()) event type 2, status -5, desc c71c
Nov 13 18:35:18 boss02 kernel: LustreError: 27215:0:(ost_handler.c:1139:ost_brw_write()) @@@ network error on bulk GET 0(1048576) [EMAIL PROTECTED] x7236800/t0 o4->[EMAIL PROTECTED]:0/0 lens 384/352 e 0 to 0 dl 1226573468 ref 1 fl Interpret:/0/0 rc 0/0
Nov 13 18:35:18 boss02 kernel: LustreError: 27215:0:(ost_handler.c:1139:ost_brw_write()) Skipped 1 previous similar message
Nov 13 18:35:18 boss02 kernel: Lustre: 27215:0:(ost_handler.c:1270:ost_brw_write()) besfs-OST: ignoring bulk IO comm error with [EMAIL PROTECTED] id [EMAIL PROTECTED] - client will retry
Nov 13 18:35:18 boss02 kernel: Lustre: 27215:0:(ost_handler.c:1270:ost_brw_write()) Skipped 1 previous similar message

<--- At that time, the network was down; we could not ping the gateway. --->
<--- I tried restarting the network service, but after the restart the gateway was still unreachable. --->

------
wanglu
2008-11-13
------

From: Andreas Dilger
Date: 2008-11-13 01:36:57
To: Wang lu
Cc: Brian J. Murrell; lustre-discuss@lists.lustre.org
Subject: Re: [Lustre-discuss] Frequent OSS Crashes with heavy load

On Nov 12, 2008 13:48, Wang lu wrote:
> May I ask where I can run the PIOS command? I think that to determine the max thread number of the OSS it should be run on the OSS; however, the OST directories are unwritable. Can I write to /dev/sdaX? I am confused.

Running PIOS directly on /dev/sdX will overwrite all data there. It should only be run on the disk devices before the filesystem is formatted. You can run PIOS against the filesystem itself (e.g. /mnt/lustre) to just create regular files in the filesystem.

> Brian J. Murrell wrote:
> > On Mon, 2008-11-10 at 16:42, Wang lu wrote:
> >> I already have 512 (the maximum number of) IO threads running. Some of them are in "Dead" status. Is it safe to conclude that the OSS is oversubscribed?
> >
> > Until you do some analysis of your storage with the iokit, one cannot really draw any conclusions, however if you are already at the maximum value of OST threads, it would not be difficult to believe that perhaps this is a possibility.
> >
> > Try a simple experiment and halve the number to 256 and see if you have any drop-off in throughput to the storage devices. If not, then you can easily assume that 512 was either too much or not necessary. You can try doing this again if you wish. If you get to a value of OST threads where your throughput is lower than it should be, you've gone too low.
> >
> > But really, the iokit is the more efficient and accurate way to determine this.
> >
> > b.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
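A quick way to see how close the OSS is to its thread limit at any moment is sketched below; the 1.8 tunable paths are assumptions and may differ slightly between minor releases:

    # started vs. maximum OST IO service threads
    lctl get_param ost.OSS.ost_io.threads_started ost.OSS.ost_io.threads_max

    # OST IO threads currently stuck in uninterruptible sleep
    ps -eo stat,comm | awk '$2 ~ /^ll_ost_io/ && $1 ~ /^D/' | wc -l

A large and persistent count of D-state ll_ost_io threads alongside the network errors above would support the oversubscription theory rather than a pure network fault.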
Re: [Lustre-discuss] Frequent OSS Crashes with heavy load
Hi all,

Since there are jobs running on the cluster, I cannot do the PIOS test now, but I am afraid this situation may happen again later. Does Lustre have any way of dealing with oversubscription other than a kernel crash? Users can accept that their jobs slow down, but they cannot accept that their jobs die because an OSS crashed. Or is there any other reason that might cause the OSSes to crash?

Thank you very much!

------
wanglu
2008-11-11
------

From: Wang lu
Date: 2008-11-11 01:01:12
To: Brian J. Murrell; [EMAIL PROTECTED]
Subject: Re: [Lustre-discuss] Frequent OSS Crashes with heavy load

Thanks a lot. I will continue tomorrow.

Brian J. Murrell wrote:
> On Mon, 2008-11-10 at 16:42, Wang lu wrote:
>> I already have 512 (the maximum number of) IO threads running. Some of them are in "Dead" status. Is it safe to conclude that the OSS is oversubscribed?
>
> Until you do some analysis of your storage with the iokit, one cannot really draw any conclusions, however if you are already at the maximum value of OST threads, it would not be difficult to believe that perhaps this is a possibility.
>
> Try a simple experiment and halve the number to 256 and see if you have any drop-off in throughput to the storage devices. If not, then you can easily assume that 512 was either too much or not necessary. You can try doing this again if you wish. If you get to a value of OST threads where your throughput is lower than it should be, you've gone too low.
>
> But really, the iokit is the more efficient and accurate way to determine this.
>
> b.
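For the record, the experiment Brian suggested (capping the OSS at 256 threads) can be applied either at module load time or on a running server. A sketch, assuming the 1.8 tunable names, which should be double-checked against the manual for the exact release in use:

    # persistent: set in /etc/modprobe.conf (or /etc/modprobe.d/lustre.conf) on the OSS
    options ost oss_num_threads=256

    # runtime: lower the ceiling for the IO service threads
    lctl set_param ost.OSS.ost_io.threads_max=256
    # note: this caps new thread creation; it does not forcibly stop
    # threads that are already running

The module option takes effect at the next reboot or module reload, so the runtime tunable is the less disruptive way to try the experiment while jobs are still running.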