Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
Thanks a lot Sanjeev!
If you look at my first message you will see that discrepancy in zdb...

Leal.
[http://www.eall.com.br/blog]
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
Marcelo,

On Wed, Dec 31, 2008 at 02:17:37AM -0800, Marcelo Leal wrote:
> Thanks a lot Sanjeev!
> If you look at my first message you will see that discrepancy in zdb...

Apologies. Now, in hindsight, I understand why you gave the zdb details :-(
I should have read the mail more carefully.

Thanks and regards,
Sanjeev.
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
Hello all,

# zpool status
  pool: mypool
 state: ONLINE
 scrub: scrub completed after 0h2m with 0 errors on Fri Dec 19 09:32:42 2008
config:

        NAME         STATE     READ WRITE CKSUM
        storage      ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t2d0   ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t4d0   ONLINE       0     0     0
            c0t5d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t6d0   ONLINE       0     0     0
            c0t7d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t8d0   ONLINE       0     0     0
            c0t9d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t10d0  ONLINE       0     0     0
            c0t11d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t12d0  ONLINE       0     0     0
            c0t13d0  ONLINE       0     0     0
        logs         ONLINE       0     0     0
          c0t1d0     ONLINE       0     0     0

errors: No known data errors

- zfs list -r shows eight filesystems, and nine snapshots per filesystem.
...
mypool/colorado                                  1.83G  4.00T  1.13G  /mypool/colorado
mypool/color...@centenario-2008-12-28-01:00:00   40.3M      -  1.46G  -
mypool/color...@centenario-2008-12-29-01:00:00   30.0M      -  1.54G  -
mypool/color...@campeao-2008-12-29-09:00:00      10.4M      -  1.24G  -
mypool/color...@campeao-2008-12-29-13:00:00      31.5M      -  1.29G  -
mypool/color...@campeao-2008-12-29-17:00:00      5.46M      -  1.10G  -
mypool/color...@campeao-2008-12-29-21:00:00      4.23M      -  1.13G  -
mypool/color...@centenario-2008-12-30-01:00:00       0      -  1.16G  -
mypool/color...@campeao-2008-12-30-01:00:00          0      -  1.16G  -
mypool/color...@campeao-2008-12-30-05:00:00      6.24M      -  1.16G  -
...

- How many entries does it have ?
Now there is just one file, the problematic one... but before the whole problem, four or five small files (the whole pool is pretty empty).

- Which filesystem (of the zpool) does it belong to ?
See above...

Thanks a lot!
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
Marcelo,

Thanks for the details! This rules out a bug that I was suspecting:
http://bugs.opensolaris.org/view_bug.do?bug_id=6664765

This needs more analysis. What does the rm command fail with?
We could run truss on the rm command like this:

    truss -o /tmp/rm.truss rm filename

You can then send us the file /tmp/rm.truss. It would show us which system call is failing and why, which should give us a good idea of what is going wrong.

Thanks and regards,
Sanjeev.
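A minimal sketch of how such a truss log could be scanned for failing system calls, assuming the plain one-call-per-line layout that truss writes and the "Err#<number> <NAME>" suffix seen later in this thread. The helper name failed_calls and the regular expression are illustrative assumptions, not part of truss, and lines carrying a process-id prefix (as with truss -f) are not handled.

```python
import re

# Assumed line shape, taken from the traces in this thread, e.g.:
#   open(/var/ld/ld.config, O_RDONLY) Err#2 ENOENT
FAILED_CALL = re.compile(
    r"^\s*(?P<call>\w+)\((?P<args>.*)\)\s+Err#(?P<errno>\d+)\s+(?P<name>\w+)"
)


def failed_calls(truss_text):
    """Return (syscall, args, errno, errname) for every failing call in the log."""
    results = []
    for line in truss_text.splitlines():
        match = FAILED_CALL.match(line)
        if match:
            results.append(
                (match["call"], match["args"], int(match["errno"]), match["name"])
            )
    return results
```

Calling failed_calls(open("/tmp/rm.truss").read()) would list every call that returned an error. Failures during program start-up, such as the dynamic linker probing for an optional config file, are normally harmless, so the last failing call before the error message is written is usually the one to look at.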
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
execve(/usr/bin/rm, 0x08047DBC, 0x08047DC8) argc = 2
mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFF
resolvepath(/usr/lib/ld.so.1, /lib/ld.so.1, 1023) = 12
resolvepath(/usr/bin/rm, /usr/bin/rm, 1023) = 11
sysconfig(_CONFIG_PAGESIZE) = 4096
xstat(2, /usr/bin/rm, 0x08047A68) = 0
open(/var/ld/ld.config, O_RDONLY) Err#2 ENOENT
xstat(2, /lib/libc.so.1, 0x080471C8) = 0
resolvepath(/lib/libc.so.1, /lib/libc.so.1, 1023) = 14
open(/lib/libc.so.1, O_RDONLY) = 3
mmap(0x0001, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEFB
mmap(0x0001, 1380352, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE5
mmap(0xFEE5, 1272553, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE5
mmap(0xFEF97000, 32482, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 1273856) = 0xFEF97000
mmap(0xFEF9F000, 6400, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEF9F000
munmap(0xFEF87000, 65536) = 0
memcntl(0xFEE5, 208132, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3) = 0
mmap(0x0001, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEF9
munmap(0xFEFB, 32768) = 0
getcontext(0x08047820)
getrlimit(RLIMIT_STACK, 0x08047818) = 0
getpid() = 3269 [3268]
lwp_private(0, 1, 0xFEF92A00) = 0x01C3
setustack(0xFEF92A60)
sysi86(SI86FPSTART, 0xFEFA0014, 0x133F, 0x1F80) = 0x0001
brk(0x08063770) = 0
brk(0x08065770) = 0
sysconfig(_CONFIG_PAGESIZE) = 4096
ioctl(0, TCGETA, 0x08047D3C) = 0
brk(0x08065770) = 0
brk(0x08067770) = 0
fstatat64(AT_FDCWD, Arquivos.file, 0x08047C80, 0x1000) Err#2 ENOENT
fstat64(2, 0x08046CE0) = 0
write(2, r m : , 4) = 4
write(2, Arquivos . fil.., 13) = 13
write(2, : , 2) = 2
write(2, N o s u c h f i l e.., 25) = 25
write(2, \n, 1) = 1
_exit(2)
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
Marcelo,

Thanks for the details. Comments inline...

Marcelo Leal wrote:
> execve(/usr/bin/rm, 0x08047DBC, 0x08047DC8) argc = 2
> mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFF
> resolvepath(/usr/lib/ld.so.1, /lib/ld.so.1, 1023) = 12
> resolvepath(/usr/bin/rm, /usr/bin/rm, 1023) = 11
> sysconfig(_CONFIG_PAGESIZE) = 4096
> xstat(2, /usr/bin/rm, 0x08047A68) = 0
> open(/var/ld/ld.config, O_RDONLY) Err#2 ENOENT
> xstat(2, /lib/libc.so.1, 0x080471C8) = 0
> resolvepath(/lib/libc.so.1, /lib/libc.so.1, 1023) = 14
> open(/lib/libc.so.1, O_RDONLY) = 3
> fstatat64(AT_FDCWD, Arquivos.file, 0x08047C80, 0x1000) Err#2 ENOENT

This is interesting! Note that the fstatat64() call is failing with ENOENT, so there is something we are missing. I assume you are able to list the directory contents and ascertain that the file exists.

Can you please provide the directory listing (ls -l) of the directory in question? Note that ls -l calls lstat64() on each entry to get the file's attributes, so a truss of ls -l would also help.

Thanks and regards,
Sanjeev.

> fstat64(2, 0x08046CE0) = 0
> write(2, r m : , 4) = 4
> write(2, Arquivos . fil.., 13) = 13
> write(2, : , 2) = 2
> write(2, N o s u c h f i l e.., 25) = 25
> write(2, \n, 1) = 1
> _exit(2)
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
execve(/usr/bin/ls, 0x08047DA8, 0x08047DB4) argc = 2
mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFF
resolvepath(/usr/lib/ld.so.1, /lib/ld.so.1, 1023) = 12
resolvepath(/usr/bin/ls, /usr/bin/ls, 1023) = 11
xstat(2, /usr/bin/ls, 0x08047A58) = 0
open(/var/ld/ld.config, O_RDONLY) Err#2 ENOENT
sysconfig(_CONFIG_PAGESIZE) = 4096
xstat(2, /lib/libc.so.1, 0x080471B8) = 0
resolvepath(/lib/libc.so.1, /lib/libc.so.1, 1023) = 14
open(/lib/libc.so.1, O_RDONLY) = 3
mmap(0x0001, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEFB
mmap(0x0001, 1380352, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE5
mmap(0xFEE5, 1272553, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE5
mmap(0xFEF97000, 32482, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 1273856) = 0xFEF97000
mmap(0xFEF9F000, 6400, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEF9F000
munmap(0xFEF87000, 65536) = 0
memcntl(0xFEE5, 208132, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3) = 0
mmap(0x0001, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEF9
munmap(0xFEFB, 32768) = 0
getcontext(0x08047810)
getrlimit(RLIMIT_STACK, 0x08047808) = 0
getpid() = 5410 [5409]
lwp_private(0, 1, 0xFEF92A00) = 0x01C3
setustack(0xFEF92A60)
sysi86(SI86FPSTART, 0xFEFA0014, 0x133F, 0x1F80) = 0x0001
brk(0x08067320) = 0
brk(0x08069320) = 0
time() = 1230662014
ioctl(1, TCGETA, 0x08047ABC) = 0
sysconfig(_CONFIG_PAGESIZE) = 4096
brk(0x08069320) = 0
brk(0x08073320) = 0
lstat64(., 0x080469A0) = 0
xstat(2, /lib/libsec.so.1, 0x08045F98) = 0
resolvepath(/lib/libsec.so.1, /lib/libsec.so.1, 1023) = 16
open(/lib/libsec.so.1, O_RDONLY) = 3
mmap(0x0001, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEFB
mmap(0x0001, 151552, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE2
mmap(0xFEE2, 58047, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE2
mmap(0xFEE3F000, 13477, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 61440) = 0xFEE3F000
mmap(0xFEE43000, 5760, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEE43000
munmap(0xFEE2F000, 65536) = 0
memcntl(0xFEE2, 13752, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3) = 0
munmap(0xFEFB, 32768) = 0
pathconf(., 20) = 2
acl(., ACE_GETACLCNT, 0, 0x) = 6
stat64(., 0x08046890) = 0
acl(., ACE_GETACL, 6, 0x08071C48) = 6
openat(AT_FDCWD, ., O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
fcntl(3, F_SETFD, 0x0001) = 0
fstat64(3, 0x080479A0) = 0
getdents64(3, 0xFEF94000, 8192) = 80
lstat64(./Arquivos.file, 0x08046930) Err#2 ENOENT
getdents64(3, 0xFEF94000, 8192) = 0
close(3) = 0
ioctl(1, TCGETA, 0x08046BBC) = 0
fstat64(1, 0x08046B20) = 0
write(1, t o t a l 0\n, 8) = 8
_exit(0)
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
Marcelo,

Comments inline...

On Tue, Dec 30, 2008 at 10:35:37AM -0800, Marcelo Leal wrote:
> pathconf(., 20) = 2
> acl(., ACE_GETACLCNT, 0, 0x) = 6
> stat64(., 0x08046890) = 0
> acl(., ACE_GETACL, 6, 0x08071C48) = 6
> openat(AT_FDCWD, ., O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
> fcntl(3, F_SETFD, 0x0001) = 0
> fstat64(3, 0x080479A0) = 0
> getdents64(3, 0xFEF94000, 8192) = 80
> lstat64(./Arquivos.file, 0x08046930) Err#2 ENOENT
> getdents64(3, 0xFEF94000, 8192) = 0

This is quite strange... getdents64() seems to be returning the name of the file in question, but the lstat64() on that same name fails with ENOENT. I am wondering if there is a discrepancy between the directory contents and the actual file.

Unfortunately I am on vacation for the whole of next week and hence may not be able to follow up. I hope someone else will be able to pick it up from here.

Thanks and regards,
Sanjeev.

> close(3) = 0
> ioctl(1, TCGETA, 0x08046BBC) = 0
> fstat64(1, 0x08046B20) = 0
> write(1, t o t a l 0\n, 8) = 8
> _exit(0)
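The symptom described above (a name returned by the directory read that then cannot be stat'ed) can be checked for a whole directory with a short script. This is only a sketch: the function name unresolvable_entries is made up for illustration, and it assumes Python's os.listdir() and os.lstat() map onto the same directory-read and lstat system calls seen in the trace, which is typical but not verified here for Solaris.

```python
import os


def unresolvable_entries(directory):
    """Return names that the directory lists but lstat() reports as missing."""
    missing = []
    for name in os.listdir(directory):  # names as returned by the directory read
        try:
            os.lstat(os.path.join(directory, name))  # does not follow symlinks
        except FileNotFoundError:  # ENOENT, as in the trace above
            missing.append(name)
    return sorted(missing)
```

On a healthy filesystem this should return an empty list, because lstat() succeeds even on dangling symlinks. A non-empty result would point at the same kind of mismatch between the directory entries and the underlying file objects that the two traces show for Arquivos.file.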
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
Hello all...

Can that be caused by some cache on the LSI controller? Some flush that the controller or disk did not honour?
Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem
Marcelo,

Marcelo Leal wrote:
> Hello all...
> Can that be caused by some cache on the LSI controller? Some flush that the controller or disk did not honour?

More details on the problem would help. Can you please give the following details:
- zpool status
- zfs list -r
- The details of the directory:
  - How many entries does it have?
  - Which filesystem (of the zpool) does it belong to?

Thanks and regards,
Sanjeev.