Re: FIBMAP unsupported

2014-10-04 Thread Marc Dietrich
On Fri, 03 Oct 2014 14:15:11 +0100
Filipe Manana fdman...@suse.com wrote:
 Just tried it and I confirm filefrag's call to ioctl FS_IOC_FIEMAP fails
 with -EEXIST.
 
 It's actually a known issue affecting any of the 3.17 RCs (except RC1).
 The extent map manipulation/merging is broken for some cases. Try with
 these 2 patches on top of 3.17-rcX:
 
 https://patchwork.kernel.org/patch/4929981/
 https://patchwork.kernel.org/patch/4945191/

ok, 2nd patch did not apply cleanly, so I just replaced the  with =.

Otherwise, I can confirm the patches are fixing the issue here.

 Or, alternatively, reverting this patch:
 https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=51f395ad4058883e4273b02fdebe98072dbdc0d2
 
 Someone else reported on this list a write/pwrite/writev failure with
 errno EEXIST too (and apparently caused by the same reason).
 
 This broken extent map handling is serious IMHO, it can make fsync log
 bogus extent items for example, amongst other possible bad and weird things.

Yes, it clearly corrupted my files in a strange way. Please try to get
it into final, or at least -stable if too late.

Marc
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: FIBMAP unsupported

2014-10-03 Thread Filipe Manana


On 10/02/2014 11:11 PM, Marc Dietrich wrote:
 On Thursday, 02 October 2014, 21:55:55, Marc Dietrich wrote:
 Will try to restore the file using btrfs restore
 
 ok, restore worked. I did some more tests. This is unrelated to CoW. It seems 
 that the fallocate -n in combination with dd conv=notrunc using large 
 files (10G) triggers it. Maybe this rings some bells.

Just tried it and I confirm filefrag's call to ioctl FS_IOC_FIEMAP fails
with -EEXIST.

It's actually a known issue affecting any of the 3.17 RCs (except RC1).
The extent map manipulation/merging is broken for some cases. Try with
these 2 patches on top of 3.17-rcX:

https://patchwork.kernel.org/patch/4929981/
https://patchwork.kernel.org/patch/4945191/

Or, alternatively, reverting this patch:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=51f395ad4058883e4273b02fdebe98072dbdc0d2

Someone else reported on this list a write/pwrite/writev failure with
errno EEXIST too (and apparently caused by the same reason).

This broken extent map handling is serious IMHO, it can make fsync log
bogus extent items for example, amongst other possible bad and weird things.

 
 Marc
 


FIBMAP unsupported

2014-10-02 Thread Marc Dietrich
Hi,

I have a large (25G) virtual disk on a btrfs fs. Yes, I know this is not 
optimal. So I try to defrag it from time to time. However, using btrfs fi 
defrag -c vm.vdi results in even more fragments than before (reported by 
filefrag). So I wrote my own pseudo defragger,

-
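#!/bin/sh
# Pseudo-defragger: copy the file into a preallocated nocow file, then swap it in.

test -f "$1" || exit 2

echo "defrag $1"
/usr/sbin/filefrag "$1" || exit

# Preallocate without changing the apparent size (-n). `filesize` is not a
# standard utility; presumably a local helper that prints the size in bytes
# (for example via stat -c %s).
fallocate -n -l "$(filesize "$1")" "$1.new" || exit
chattr +C "$1.new"
dd if="$1" of="$1.new" conv=notrunc oflag=append status=none
chmod --reference "$1" "$1.new"
chown --reference "$1" "$1.new"
mv "$1.new" "$1"
/usr/sbin/filefrag "$1"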

-

which produces much better results (ok, the file must not be in use). 
Somewhere in the 3.17 cycle the resulting image got corrupted using the script 
above. 

Running filefrag on it returns FIBMAP unsupported.

Virtualbox returns  AHCI#0P0: Read at offset 606236672 (49152 bytes left) 
returned rc=VERR_DEV_IO_ERROR. No errors in the kernel log.

Trying cp vm.vdi /dev/null returns: cp: Error reading „vm.vdi“: IO-Error

kernel 3.17-rc7
btrfs 3.17.x
mount options: rw,nodiratime,relatime,compress=lzo,space_cache,autodefrag

Marc



Re: FIBMAP unsupported

2014-10-02 Thread David Sterba
On Thu, Oct 02, 2014 at 05:13:22PM +0200, Marc Dietrich wrote:
 I have a large (25G) virtual disk on a btrfs fs. Yes, I know this is not 
 optimal. So I try to defrag it from time to time. However, using btrfs fi 
 defrag -c vm.vdi results in even more fragments than before (reported by 
 filefrag). So I wrote my own pseudo defragger,

Unfortunately, the default target fragment size is 256k. Try
'btrfs filesystem defrag -t 32m ...' or higher numbers and see if it
helps.
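
A rough way to compare extent counts before and after such a run is
sketched below. It assumes filefrag's usual summary line
("NAME: N extents found"); extent_count is a helper name made up for this
illustration and is not part of e2fsprogs or btrfs-progs.

```sh
#!/bin/sh
# Print the extent count from a filefrag summary line read on stdin,
# e.g. "vm.vdi: 1234 extents found" -> 1234.
extent_count() {
    sed -n 's/^.*: \([0-9][0-9]*\) extents\{0,1\} found.*$/\1/p'
}

# Hypothetical usage (vm.vdi is the image from this thread; needs btrfs):
#   before=$(filefrag vm.vdi | extent_count)
#   btrfs filesystem defrag -t 32m vm.vdi
#   after=$(filefrag vm.vdi | extent_count)
#   echo "extents: $before -> $after"
```

If the count does not drop, a larger -t value may be worth trying, as
suggested above.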

 which produces much better results (ok, the file must not be in use). 
 Somewhere in the 3.17 cycle the resulting image got corrupted using the
 script above.
 
 Running filefrag on it returns FIBMAP unsupported.

This message does not mean it is a corruption; filefrag is trying to use
the FIBMAP ioctl, which is not implemented on btrfs. FIEMAP is used
instead.

filefrag on a nocow file works for me here (3.16.x kernel), I can see
that filefrag on a directory prints the FIBMAP message.

 Virtualbox returns  AHCI#0P0: Read at offset 606236672 (49152 bytes left) 
 returned rc=VERR_DEV_IO_ERROR. No errors in the kernel log.
 
 Trying cp vm.vdi /dev/null returns: cp: Error reading „vm.vdi“: IO-Error

This could be caused by the virtualization layer. Try to run scrub and
fsck in the non-destru^Wchecking mode if it finds problems.

As you're using compression and autodefrag, a quick skim of the 3.17
patches points to e9512d72e8e61c750c90efacd720abe3c4569822 fix
autodefrag with compression, but that's just keyword match.

There's another report about nocow corruption and VirtualBox in a 3.16 +
for-linus version (which is almost 3.17-rc)
http://article.gmane.org/gmane.comp.file-systems.btrfs/38701/

But according to the attached messages, the underlying device is
unreliable and logs a lot of IO errors.

For now it looks like VirtualBox is not writing the data or there is a
bug introduced post 3.16 killing nocow files.


Re: FIBMAP unsupported

2014-10-02 Thread Hugo Mills
On Thu, Oct 02, 2014 at 07:25:49PM +0200, David Sterba wrote:
 On Thu, Oct 02, 2014 at 05:13:22PM +0200, Marc Dietrich wrote:
  I have a large (25G) virtual disk on a btrfs fs. Yes, I know this is not 
  optimal. So I try to defrag it from time to time. However, using btrfs fi 
  defrag -c vm.vdi results in even more fragments than before (reported by 
  filefrag). So I wrote my own pseudo defragger,
 
 Unfortunately, the default target fragment size is 256k. Try
 'btrfs filesystem defrag -t 32m ...' or higher numbers and see if it
 helps.

   Note also that a compressed file will have fragments on the scale
of about 128k reported by filefrag, because of the way that the
compression works. The file may actually be contiguous, but filefrag
won't know about it. (At least, that's historically been the case. I
don't know if filefrag has recently grown some extra knowledge of
compressed extents.)

   Hugo.
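
As a back-of-the-envelope illustration of that scale (a sketch only: the
128k figure is the approximate one mentioned above, and estimate_extents
is a hypothetical helper), a fully compressed file of the size that strace
reports for vm.vdi elsewhere in this thread would show up in filefrag with
roughly this many fragments, even if it were laid out contiguously:

```sh
#!/bin/sh
# Estimate how many ~128 KiB compressed extents filefrag would report
# for a file of SIZE bytes (ceiling division). Assumes the whole file
# is compressed and that the shell has 64-bit arithmetic.
estimate_extents() {
    echo $(( ($1 + 131072 - 1) / 131072 ))
}

estimate_extents 26681016320   # st_size of vm.vdi in this thread; prints 203560
```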

  which produces much better results (ok, the file must not be in use). 
  Somewhere in the 3.17 cycle the resulting image got corrupted using the
  script above.
  
  Running filefrag on it returns FIBMAP unsupported.
 
 This message does not mean it is a corruption; filefrag is trying to use
 the FIBMAP ioctl, which is not implemented on btrfs. FIEMAP is used
 instead.
 
 filefrag on a nocow file works for me here (3.16.x kernel), I can see
 that filefrag on a directory prints the FIBMAP message.
 
  Virtualbox returns  AHCI#0P0: Read at offset 606236672 (49152 bytes left) 
  returned rc=VERR_DEV_IO_ERROR. No errors in the kernel log.
  
  Trying cp vm.vdi /dev/null returns: cp: Error reading „vm.vdi“: IO-Error
 
 This could be caused by the virtualization layer. Try to run scrub and
 fsck in the non-destru^Wchecking mode if it finds problems.
 
 As you're using compression and autodefrag, a quick skim of the 3.17
 patches points to e9512d72e8e61c750c90efacd720abe3c4569822 fix
 autodefrag with compression, but that's just keyword match.
 
 There's another report about nocow corruption and VirtualBox in a 3.16 +
 for-linus version (which is almost 3.17-rc)
 http://article.gmane.org/gmane.comp.file-systems.btrfs/38701/
 
 But according to the attached messages, the underlying device is
 unreliable and logs a lot of IO errors.
 
 For now it looks like VirtualBox is not writing the data or there is a
 bug introduced post 3.16 killing nocow files.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
  --- ...  one ping(1) to rule them all, and in the ---  
 darkness bind(2) them.  


Re: FIBMAP unsupported

2014-10-02 Thread Marc Dietrich
On Thursday, 02 October 2014, 19:25:49, David Sterba wrote:
 On Thu, Oct 02, 2014 at 05:13:22PM +0200, Marc Dietrich wrote:
  I have a large (25G) virtual disk on a btrfs fs. Yes, I know this is not
  optimal. So I try to defrag it from time to time. However, using btrfs fi
  defrag -c vm.vdi results in even more fragments than before (reported by
  filefrag). So I wrote my own pseudo defragger,
 
 Unfortunately, the default target fragment size is 256k. Try
 'btrfs filesystem defrag -t 32m ...' or higher numbers and see if it
 helps.

ok, need to try if I can ever recover from this error ...

  which produces much better results (ok, the file must not be in use).
  Somewhere in the 3.17 cycle the resulting image got corrupted using the
  script above.
  
  Running filefrag on it returns FIBMAP unsupported.
 
 This message does not mean it is a corruption; filefrag is trying to use
 the FIBMAP ioctl, which is not implemented on btrfs. FIEMAP is used
 instead.
 
 filefrag on a nocow file works for me here (3.16.x kernel), I can see
 that filefrag on a directory prints the FIBMAP message.

ah, for some reason FIBMAP is used on a file:

# strace filefrag vm.vdi
open("vm.vdi", O_RDONLY)                = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=26681016320, ...}) = 0
fstatfs(3, {f_type=0x9123683e, f_bsize=4096, f_blocks=106856459, f_bfree=35124686, f_bavail=33785307, f_files=0, f_ffree=0, f_fsid={741604101, -890814488}, f_namelen=255, f_frsize=4096}) = 0
ioctl(3, FIGETBSZ, 0x603150)            = 0
ioctl(3, FS_IOC_FIEMAP, 0x7fff8e0b65a0) = -1 EEXIST (File exists)
ioctl(3, FIBMAP, 0x7fff8e0ba65c)        = -1 EINVAL (Invalid argument)
write(2, "vm.vdi: FIBMAP unsupported\n", 32vm.vdi: FIBMAP unsupported
) = 32
close(3)                                = 0
exit_group(22)                          = ?

So it tries first FIEMAP and fails and then it tries FIBMAP which also fails.
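
The same check can be condensed into a one-line summary. This is only a
sketch: classify_trace is a made-up helper name, and it assumes strace
prints the ioctl names as in the trace above and marks failures with
"= -1".

```sh
#!/bin/sh
# Read strace output on stdin and summarise how the two block-mapping
# ioctls (FS_IOC_FIEMAP and FIBMAP) fared.
classify_trace() {
    awk '
        /ioctl\([0-9]+, FS_IOC_FIEMAP/ { fiemap = ($0 ~ /= -1 /) ? "failed" : "ok" }
        /ioctl\([0-9]+, FIBMAP/        { fibmap = ($0 ~ /= -1 /) ? "failed" : "ok" }
        END {
            # A variable that was never set means the ioctl was never issued.
            if (fiemap == "") fiemap = "not called"
            if (fibmap == "") fibmap = "not called"
            print "FIEMAP: " fiemap ", FIBMAP: " fibmap
        }
    '
}

# Hypothetical usage (strace writes its trace to stderr):
#   strace -e trace=ioctl filefrag vm.vdi 2>&1 | classify_trace
```

On a healthy file one would expect FIEMAP to succeed and FIBMAP not to be
called at all.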

  Virtualbox returns  AHCI#0P0: Read at offset 606236672 (49152 bytes left)
  returned rc=VERR_DEV_IO_ERROR. No errors in the kernel log.
  
  Trying cp vm.vdi /dev/null returns: cp: Error reading „vm.vdi“: IO-Error
 
 This could be caused by the virtualization layer. Try to run scrub and
 fsck in the non-destru^Wchecking mode if it finds problems.

the first error comes from the virtual machine log, maybe we can ignore it.

the second error is on the bare metal (no virtual machine). The disk is ok, so 
the I/O error comes from btrfs itself.

 As you're using compression and autodefrag, a quick skim of the 3.17
 patches points to e9512d72e8e61c750c90efacd720abe3c4569822 fix
 autodefrag with compression, but that's just keyword match.

I have this in my kernel already...

 There's another report about nocow corruption and VirtualBox in a 3.16 +
 for-linus version (which is almost 3.17-rc)
 http://article.gmane.org/gmane.comp.file-systems.btrfs/38701/
 
 But according to the attached messages, the underlying device is
 unreliable and logs a lot of IO errors.
 
 For now it looks like VirtualBox is not writing the data or there is a
 bug introduced post 3.16 killing nocow files.

as said above, this is not related to virtualbox (problem exists on bare 
metal).

Will try to restore the file using btrfs restore

Marc



Re: FIBMAP unsupported

2014-10-02 Thread Marc Dietrich
On Thursday, 02 October 2014, 21:55:55, Marc Dietrich wrote:
 Will try to restore the file using btrfs restore

ok, restore worked. I did some more tests. This is unrelated to CoW. It seems 
that the fallocate -n in combination with dd conv=notrunc using large 
files (10G) triggers it. Maybe this rings some bells.

Marc
