Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-31 Thread Sanjeev
Marcelo,

On Wed, Dec 31, 2008 at 02:17:37AM -0800, Marcelo Leal wrote:
> Thanks a lot Sanjeev!
>  If you look at my first message, you will see that discrepancy in zdb...

Apologies. In hindsight, I understand why you gave the zdb details :-(
I should have read the mail more carefully.

Thanks and regards,
Sanjeev.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-31 Thread Marcelo Leal
Thanks a lot Sanjeev!
 If you look at my first message, you will see that discrepancy in zdb...

 Leal.
[http://www.eall.com.br/blog]
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-30 Thread Sanjeev
Marcelo,

Comments inline...

On Tue, Dec 30, 2008 at 10:35:37AM -0800, Marcelo Leal wrote:
> pathconf(".", 20)   = 2
> acl(".", ACE_GETACLCNT, 0, 0x)  = 6
> stat64(".", 0x08046890) = 0
> acl(".", ACE_GETACL, 6, 0x08071C48) = 6
> openat(AT_FDCWD, ".", O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
> fcntl(3, F_SETFD, 0x0001)   = 0
> fstat64(3, 0x080479A0)  = 0
> getdents64(3, 0xFEF94000, 8192) = 80
> lstat64("./Arquivos.file", 0x08046930)  Err#2 ENOENT
> getdents64(3, 0xFEF94000, 8192) = 0

This is quite strange... getdents() seems to be returning the
name of the file in question. But, the lstat64() fails with ENOENT.

I am wondering if there is a discrepancy between the directory contents
and the actual file.
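
One way to check for this kind of mismatch is to walk the directory and stat every name it returns. The sketch below is only an illustration: the function name find_dangling_entries is made up for this example, and it assumes a Python interpreter is available on the affected host. It reports names that the directory lists but that lstat() rejects with ENOENT, which is the pattern visible in the truss output above.

```python
import errno
import os


def find_dangling_entries(directory):
    """Return names the directory lists but lstat() reports as ENOENT.

    Hypothetical helper for illustration; not part of any ZFS tooling.
    """
    dangling = []
    for name in os.listdir(directory):        # names come from the directory itself
        path = os.path.join(directory, name)
        try:
            os.lstat(path)                    # per-entry stat, as ls -l does
        except OSError as exc:
            if exc.errno == errno.ENOENT:     # listed, yet "No such file"
                dangling.append(name)
            else:
                raise                         # any other error is unexpected here
    return dangling
```

On a healthy directory this returns an empty list. A non-empty result would support the idea that the directory entry and the underlying object have gone out of sync.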

Unfortunately I am on vacation for the whole of next week and hence
may not be able to follow up.

I hope someone else will be able to follow it up from here.

Thanks and regards,
Sanjeev.

> close(3)= 0
> ioctl(1, TCGETA, 0x08046BBC)= 0
> fstat64(1, 0x08046B20)  = 0
> write(1, " t o t a l   0\n", 8) = 8
> _exit(0)


Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-30 Thread Marcelo Leal
execve("/usr/bin/ls", 0x08047DA8, 0x08047DB4)  argc = 2
mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFF
resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
resolvepath("/usr/bin/ls", "/usr/bin/ls", 1023) = 11
xstat(2, "/usr/bin/ls", 0x08047A58) = 0
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
sysconfig(_CONFIG_PAGESIZE) = 4096
xstat(2, "/lib/libc.so.1", 0x080471B8)  = 0
resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
open("/lib/libc.so.1", O_RDONLY)= 3
mmap(0x0001, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEFB
mmap(0x0001, 1380352, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE5
mmap(0xFEE5, 1272553, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE5
mmap(0xFEF97000, 32482, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 1273856) = 0xFEF97000
mmap(0xFEF9F000, 6400, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEF9F000
munmap(0xFEF87000, 65536)   = 0
memcntl(0xFEE5, 208132, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3)= 0
mmap(0x0001, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEF9
munmap(0xFEFB, 32768)   = 0
getcontext(0x08047810)
getrlimit(RLIMIT_STACK, 0x08047808) = 0
getpid()= 5410 [5409]
lwp_private(0, 1, 0xFEF92A00)   = 0x01C3
setustack(0xFEF92A60)
sysi86(SI86FPSTART, 0xFEFA0014, 0x133F, 0x1F80) = 0x0001
brk(0x08067320) = 0
brk(0x08069320) = 0
time()  = 1230662014
ioctl(1, TCGETA, 0x08047ABC)= 0
sysconfig(_CONFIG_PAGESIZE) = 4096
brk(0x08069320) = 0
brk(0x08073320) = 0
lstat64(".", 0x080469A0)= 0
xstat(2, "/lib/libsec.so.1", 0x08045F98)= 0
resolvepath("/lib/libsec.so.1", "/lib/libsec.so.1", 1023) = 16
open("/lib/libsec.so.1", O_RDONLY)  = 3
mmap(0x0001, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEFB
mmap(0x0001, 151552, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE2
mmap(0xFEE2, 58047, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE2
mmap(0xFEE3F000, 13477, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 61440) = 0xFEE3F000
mmap(0xFEE43000, 5760, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEE43000
munmap(0xFEE2F000, 65536)   = 0
memcntl(0xFEE2, 13752, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3)= 0
munmap(0xFEFB, 32768)   = 0
pathconf(".", 20)   = 2
acl(".", ACE_GETACLCNT, 0, 0x)  = 6
stat64(".", 0x08046890) = 0
acl(".", ACE_GETACL, 6, 0x08071C48) = 6
openat(AT_FDCWD, ".", O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
fcntl(3, F_SETFD, 0x0001)   = 0
fstat64(3, 0x080479A0)  = 0
getdents64(3, 0xFEF94000, 8192) = 80
lstat64("./Arquivos.file", 0x08046930)  Err#2 ENOENT
getdents64(3, 0xFEF94000, 8192) = 0
close(3)= 0
ioctl(1, TCGETA, 0x08046BBC)= 0
fstat64(1, 0x08046B20)  = 0
write(1, " t o t a l   0\n", 8) = 8
_exit(0)


Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-30 Thread Sanjeev Bagewadi
Marcelo,

Thanks for the details.
Comments inline...

Marcelo Leal wrote:
> execve("/usr/bin/rm", 0x08047DBC, 0x08047DC8)  argc = 2
> mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFF
> resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
> resolvepath("/usr/bin/rm", "/usr/bin/rm", 1023) = 11
> sysconfig(_CONFIG_PAGESIZE) = 4096
> xstat(2, "/usr/bin/rm", 0x08047A68) = 0
> open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
> xstat(2, "/lib/libc.so.1", 0x080471C8)  = 0
> resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
> open("/lib/libc.so.1", O_RDONLY)= 3
>   
> fstatat64(AT_FDCWD, "Arquivos.file", 0x08047C80, 0x1000) Err#2 ENOENT
>   
This is interesting!
Note that the fstatat64() call is failing with ENOENT, so there is
something we are missing.
I assume you are able to list the directory contents and confirm that
the file exists.
Can you please provide the directory listing ("ls -l") of the directory
in question?
Note that "ls -l" has to stat every file it lists, so a truss of "ls -l"
would also help.

Thanks and regards,
Sanjeev.
> fstat64(2, 0x08046CE0)  = 0
> write(2, " r m :  ", 4) = 4
> write(2, " Arquivos . fil".., 13)  = 13
> write(2, " :  ", 2) = 2
> write(2, " N o   s u c h   f i l e".., 25)  = 25
> write(2, "\n", 1)   = 1
> _exit(2)
>   



Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-30 Thread Marcelo Leal
execve("/usr/bin/rm", 0x08047DBC, 0x08047DC8)  argc = 2
mmap(0x, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xFEFF
resolvepath("/usr/lib/ld.so.1", "/lib/ld.so.1", 1023) = 12
resolvepath("/usr/bin/rm", "/usr/bin/rm", 1023) = 11
sysconfig(_CONFIG_PAGESIZE) = 4096
xstat(2, "/usr/bin/rm", 0x08047A68) = 0
open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
xstat(2, "/lib/libc.so.1", 0x080471C8)  = 0
resolvepath("/lib/libc.so.1", "/lib/libc.so.1", 1023) = 14
open("/lib/libc.so.1", O_RDONLY)= 3
mmap(0x0001, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 3, 0) = 0xFEFB
mmap(0x0001, 1380352, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEE5
mmap(0xFEE5, 1272553, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 3, 0) = 0xFEE5
mmap(0xFEF97000, 32482, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 3, 1273856) = 0xFEF97000
mmap(0xFEF9F000, 6400, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANON, -1, 0) = 0xFEF9F000
munmap(0xFEF87000, 65536)   = 0
memcntl(0xFEE5, 208132, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(3)= 0
mmap(0x0001, 24576, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANON|MAP_ALIGN, -1, 0) = 0xFEF9
munmap(0xFEFB, 32768)   = 0
getcontext(0x08047820)
getrlimit(RLIMIT_STACK, 0x08047818) = 0
getpid()= 3269 [3268]
lwp_private(0, 1, 0xFEF92A00)   = 0x01C3
setustack(0xFEF92A60)
sysi86(SI86FPSTART, 0xFEFA0014, 0x133F, 0x1F80) = 0x0001
brk(0x08063770) = 0
brk(0x08065770) = 0
sysconfig(_CONFIG_PAGESIZE) = 4096
ioctl(0, TCGETA, 0x08047D3C)= 0
brk(0x08065770) = 0
brk(0x08067770) = 0
fstatat64(AT_FDCWD, "Arquivos.file", 0x08047C80, 0x1000) Err#2 ENOENT
fstat64(2, 0x08046CE0)  = 0
write(2, " r m :  ", 4) = 4
write(2, " Arquivos . fil".., 13)  = 13
write(2, " :  ", 2) = 2
write(2, " N o   s u c h   f i l e".., 25)  = 25
write(2, "\n", 1)   = 1
_exit(2)


Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-30 Thread Sanjeev Bagewadi
Marcelo,

Thanks for the details! This rules out a bug that I was suspecting:
http://bugs.opensolaris.org/view_bug.do?bug_id=6664765

This needs more analysis.
What error does the "rm" command fail with?
We could run truss on the rm command, like: "truss -o /tmp/rm.truss rm "
You can then send us the file /tmp/rm.truss.

This would show us which system call is failing and why. That would give
us a good idea of what is going wrong.
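
Once the truss file exists, the failing calls can be pulled out mechanically. The sketch below is an assumption-laden illustration: the function name failed_syscalls is invented for this example, and the pattern only covers lines shaped like the ones in this thread (call, arguments, then "Err#<number> <NAME>"), with no pid or lwp prefix.

```python
import re

# Matches truss lines such as:
#   open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT
# Assumes no pid/lwp prefix in front of the call name.
FAILED_CALL = re.compile(
    r"^(?P<call>\w+)\((?P<args>.*)\)\s+Err#(?P<num>\d+)\s+(?P<name>\w+)"
)


def failed_syscalls(truss_lines):
    """Return (syscall, errno name, argument text) for every failed call."""
    failures = []
    for line in truss_lines:
        match = FAILED_CALL.match(line.strip())
        if match:
            failures.append((match["call"], match["name"], match["args"]))
    return failures
```

Usage would be along the lines of failed_syscalls(open("/tmp/rm.truss")). Expect some noise: the failed open of /var/ld/ld.config also shows up and is harmless, so the stat call on the problem file is the entry to look at.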

Thanks and regards,
Sanjeev.

Marcelo Leal wrote:
> Hello all,
>
> # zpool status
>   pool: mypool
>  state: ONLINE
>  scrub: scrub completed after 0h2m with 0 errors on Fri Dec 19 09:32:42 2008
> config:
>
>         NAME         STATE     READ WRITE CKSUM
>         storage      ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c0t2d0   ONLINE       0     0     0
>             c0t3d0   ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c0t4d0   ONLINE       0     0     0
>             c0t5d0   ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c0t6d0   ONLINE       0     0     0
>             c0t7d0   ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c0t8d0   ONLINE       0     0     0
>             c0t9d0   ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c0t10d0  ONLINE       0     0     0
>             c0t11d0  ONLINE       0     0     0
>           mirror     ONLINE       0     0     0
>             c0t12d0  ONLINE       0     0     0
>             c0t13d0  ONLINE       0     0     0
>         logs         ONLINE       0     0     0
>           c0t1d0     ONLINE       0     0     0
>
> errors: No known data errors
>
> -  "zfs list -r " shows eight filesystems, and nine snapshots per filesystem.
> ...
> mypool/colorado  1.83G  4.00T  1.13G  /mypool/colorado
> mypool/color...@centenario-2008-12-28-01:00:00   40.3M  -  1.46G  -
> mypool/color...@centenario-2008-12-29-01:00:00   30.0M  -  1.54G  -
> mypool/color...@campeao-2008-12-29-09:00:00  10.4M  -  1.24G  -
> mypool/color...@campeao-2008-12-29-13:00:00  31.5M  -  1.29G  -
> mypool/color...@campeao-2008-12-29-17:00:00  5.46M  -  1.10G  -
> mypool/color...@campeao-2008-12-29-21:00:00  4.23M  -  1.13G  -
> mypool/color...@centenario-2008-12-30-01:00:00   0  -  1.16G  -
> mypool/color...@campeao-2008-12-30-01:00:00  0  -  1.16G  -
> mypool/color...@campeao-2008-12-30-05:00:00  6.24M  -  1.16G  -
> ...
>  
>  - How many entries does it have ?
>  Now there is just one file, the problematic one, but before the problem
> started there were four or five small files (the whole pool is pretty empty).
> - Which filesystem (of the zpool) does it belong to ?
>  See above...
>
>  Thanks a lot!
>   



Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-30 Thread Marcelo Leal
Hello all,

# zpool status
  pool: mypool
 state: ONLINE
 scrub: scrub completed after 0h2m with 0 errors on Fri Dec 19 09:32:42 2008
config:

        NAME         STATE     READ WRITE CKSUM
        storage      ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t2d0   ONLINE       0     0     0
            c0t3d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t4d0   ONLINE       0     0     0
            c0t5d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t6d0   ONLINE       0     0     0
            c0t7d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t8d0   ONLINE       0     0     0
            c0t9d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t10d0  ONLINE       0     0     0
            c0t11d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c0t12d0  ONLINE       0     0     0
            c0t13d0  ONLINE       0     0     0
        logs         ONLINE       0     0     0
          c0t1d0     ONLINE       0     0     0

errors: No known data errors

-  "zfs list -r " shows eight filesystems, and nine snapshots per filesystem.
...
mypool/colorado  1.83G  4.00T  1.13G  /mypool/colorado
mypool/color...@centenario-2008-12-28-01:00:00   40.3M  -  1.46G  -
mypool/color...@centenario-2008-12-29-01:00:00   30.0M  -  1.54G  -
mypool/color...@campeao-2008-12-29-09:00:00  10.4M  -  1.24G  -
mypool/color...@campeao-2008-12-29-13:00:00  31.5M  -  1.29G  -
mypool/color...@campeao-2008-12-29-17:00:00  5.46M  -  1.10G  -
mypool/color...@campeao-2008-12-29-21:00:00  4.23M  -  1.13G  -
mypool/color...@centenario-2008-12-30-01:00:00   0  -  1.16G  -
mypool/color...@campeao-2008-12-30-01:00:00  0  -  1.16G  -
mypool/color...@campeao-2008-12-30-05:00:00  6.24M  -  1.16G  -
...
 
 - How many entries does it have ?
 Now there is just one file, the problematic one, but before the problem
started there were four or five small files (the whole pool is pretty empty).
- Which filesystem (of the zpool) does it belong to ?
 See above...

 Thanks a lot!


Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-29 Thread Sanjeev Bagewadi
Marcelo,

Marcelo Leal wrote:
> Hello all...
>  Can that be caused by some cache on the LSI controller? 
>  Some flush that the controller or disk did not honour?
>   
More details on the problem would help. Can you please give the 
following details :
- zpool status
- zfs list -r
- The details of the directory :
- How many entries does it have ?
- Which filesystem (of the zpool) does it belong to ?
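
The details requested above could be gathered in one pass with a small script. This is only a sketch: gather_details is a hypothetical name, the pool and directory are whatever applies on the affected system, and it assumes the zpool and zfs commands are on the PATH. The run parameter exists so the command runner can be swapped out for a dry run.

```python
import os
import subprocess


def gather_details(pool, directory, run=subprocess.run):
    """Collect the requested diagnostics as a dict of label -> text.

    Hypothetical helper; `run` defaults to subprocess.run and may be
    replaced with a stand-in that accepts the same arguments.
    """
    report = {}
    for label, cmd in (
        ("zpool status", ["zpool", "status", pool]),
        ("zfs list -r", ["zfs", "list", "-r", pool]),
    ):
        result = run(cmd, capture_output=True, text=True)
        # Keep the error text when a command fails, so nothing is lost.
        report[label] = result.stdout if result.returncode == 0 else result.stderr
    # Number of entries the directory reports.
    report["entry count"] = str(len(os.listdir(directory)))
    return report
```

Printing each label followed by its text gives output that can be pasted straight into a reply.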

Thanks and regards,
Sanjeev.



Re: [zfs-discuss] Cannot remove a file on a GOOD ZFS filesystem

2008-12-29 Thread Marcelo Leal
Hello all...
 Can that be caused by some cache on the LSI controller? 
 Some flush that the controller or disk did not honour?