Re: Bad fs performance, IO freezes

2015-10-29 Thread cheater00 .
Hi Liu,
after talking with Holger I believe turning off COW on this FS will
work to alleviate this issue. However, even with COW on, btrfs
shouldn't be making my computer freeze every 5 seconds... especially
while the disk is written to at mere tens of kilobytes per second.
It's not even the disk holding the system. I consider this a pretty
bad bug... should we go on with trying to reproduce a minimum case?
How would I go about this?

Thanks

On Tue, Oct 27, 2015 at 4:26 PM, Austin S Hemmelgarn
 wrote:
> On 2015-10-27 10:43, cheater00 . wrote:
>>
>> I have remounted without autodefrag and the issue keeps on happening.
>
> OK, that at least narrows things down further.  My guess is the spikes are
> utorrent getting a bunch of blocks at once from one place, and then trying
> to write all of them at the same time, which could theoretically cause a
> latency spike on any filesystem, and BTRFS may just be making it worse.
>>
>>
>> On Tue, Oct 27, 2015 at 3:30 PM, cheater00 .  wrote:
>>>
>>> Feel free to suggest a good 1.5m USB3 cable, too. Let's get rid of all
>>> the unknowns.
>
> When it comes to external cables, I've had really good success with Amazon's
> 'Amazon Basics' branded stuff.  It's usually some of the best quality you
> can find for the price.  The 'Cable Matters' and 'Pluggable' brands also
> tend to be really good quality for the price.
>>>
>>> On Tue, Oct 27, 2015 at 3:26 PM, cheater00 .  wrote:

>>>> If you can suggest a dual (or better yet quad) USB3 bay that can be
>>>> bought on Amazon, I'll buy it now, and once that arrives, we can be
>>>> sure it's not the JMicron chipset.
>
> I don't really have any suggestions here.  Usually when I hook up an
> external drive, it's to recover data from a friend's computer, so I don't
> typically use an enclosure, but just use a simple adapter cable.  I would
> suggest looking for one advertising 'UAS' or 'UASP' support, as that's a
> relatively new standard for USB storage devices, and newer hardware should
> be more reliable.  It's also notoriously hard to determine what chipset a
> given model of external drive bay has (there are people I know who bought
> multiples of the same model and each one had a different chipset
> internally), and to complicate matters, quite often the exact same hardware
> gets marketed under half a dozen different names.  JMicron is popular
> because their chips are comparatively inexpensive, and while I've not had
> good results with them, that doesn't mean that they are all bad (especially
> considering that they are highly configurable based on how they are wired
> into the device, and not everyone who designs hardware around them properly
> understands the implications of some of the features).


 On Tue, Oct 27, 2015 at 3:22 PM, cheater00 . 
 wrote:
>
> The (dual) HDD bay and the chipset are, according to lsusb:
> Bus 002 Device 005: ID 152d:0551 JMicron Technology Corp. / JMicron
> USA Technology Corp.
> Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
>
> Not sure how to find out specific model numbers? I could open up the
> bay. OK I'll open up the bay.
> Good thing I have just the right screwdriver. It's a JMS551, and just
> for the record, here's the manufacture info:
>
> JMS551
> 1120 LGAA2 A
> 572QV0024
>
> The laptop manual says it's either "Intel HM65 Express chipset with
> NEC USB 3.0 (select models only)" or "Intel HM65 Express chipset".
> Here are technical documents for my model:
> Manual: http://docdro.id/hG627JM
> "Intel chipset datasheet": http://docdro.id/yKRupYO
> Service guide: http://docdro.id/AuDgUdE
> Service guide, alt. ver.: http://docdro.id/WwQRpsH
>
> From what I can tell, you've got the one with the NEC USB 3.0 chip; I'm
> pretty sure that the HM65 doesn't have USB 3.0 itself.  FWIW, I've never
> personally had issues with NEC's USB 3.0 chips, but I've not had much
> experience using systems with them either.
>
>
> FWIW I'm using one of the USB3 ports on the left. The ones on the
> right are USB2.
>
> I've never used docdro.id so if it's not good let me know where to
> upload the PDFs to.
>
> autodefrag is on, yes. But I have been having issues before turning it
> on - I turned it on as a measure towards fixing the issues. I will
> turn it off and remount, then report. But I don't think that should be
> it. As you see the transfer speeds are minimal. They're *all* that's
> happening on the disk. Right now that's under 100 KB/sec and I'm still
> getting freezes albeit less. Also why would I be getting freezes when
> the transfer speeds jump up - just for them to drop again? Hmm, maybe
> utorrent has some sort of scheduler that gets preempted while the
> spike is happening, and some algorithm in it gets the wrong idea and

Re: [PATCH 6/6] btrfs-progs: Avoid use pointer in handle_options

2015-10-29 Thread David Sterba
On Thu, Oct 29, 2015 at 05:31:48PM +0800, Zhao Lei wrote:
> +static void check_options(int argc, char **argv)
>  {
> + if (argc == 0)
> + return;
> +
> + const char *arg = argv[0];

Declaration after statements, fixed at commit time.

> +
> + if (arg[0] != '-' ||
> + !strcmp(arg, "--help") ||
> + !strcmp(arg, "--version"))
> + return;
> +
> + fprintf(stderr, "Unknown option: %s\n", arg);
> + fprintf(stderr, "usage: %s\n",
> + btrfs_cmd_group.usagestr[0]);
> + exit(129);
>  }
>  
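
The "declaration after statements" remark can be illustrated with a standalone sketch. This is not the committed code: `classify_option` and the `opt_status` values are hypothetical names, and the function returns a status instead of printing the usage string and calling exit(129), so that it can be exercised in isolation. The only point is that the declaration of `arg` moves above the first statement.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes, used by this sketch only. */
enum opt_status { OPT_OK = 0, OPT_UNKNOWN = 1 };

/*
 * Same checks as the quoted check_options(), but with the declaration
 * placed before the first statement.
 */
static enum opt_status classify_option(int argc, char **argv)
{
	const char *arg;	/* declared up front, assigned after the argc check */

	if (argc == 0)
		return OPT_OK;

	arg = argv[0];
	if (arg[0] != '-' ||
	    !strcmp(arg, "--help") ||
	    !strcmp(arg, "--version"))
		return OPT_OK;

	return OPT_UNKNOWN;	/* the quoted code prints usage and exits with 129 here */
}
```

A caller would pass the remaining argument vector; anything that starts with '-' and is neither --help nor --version is reported as unknown.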
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bad fs performance, IO freezes

2015-10-29 Thread Austin S Hemmelgarn

On 2015-10-29 09:03, cheater00 . wrote:

> Hi Liu,
> after talking with Holger I believe turning off COW on this FS will
> work to alleviate this issue. However, even with COW on, btrfs
> shouldn't be making my computer freeze every 5 seconds... especially
> while the disk is written to at mere tens of kilobytes per second.
> It's not even the disk holding the system. I consider this a pretty
> bad bug... should we go on with trying to reproduce a minimum case?
> How would I go about this?


Well, COW can cause some pretty unexpected behavior for some use cases. 
 If you have a big disk (I think I remember you saying it was larger 
than 1TB), then COW can cause some pretty significant seek times because 
of how it works.  With the current state of BTRFS, I wouldn't personally 
consider running BTRFS on anything bigger than 256G with a non-zero seek 
time with COW turned on, because large rewrites would have the potential 
to cause horrifically long seek times just for a RMW cycle on a single 
block, and this is in turn part of why database files and 
virtual-machine images tend to be pathological use cases for BTRFS.


I do agree that this kind of thing is a bug, but it's not something that 
causes data corruption, which means that it is slightly lower priority 
as far as most people are concerned.  Reproducing it might be tricky 
also, because I'd be willing to bet that things get better to the point 
of it being almost unnoticeable with an internal disk (USB is horrible 
when it comes to block storage performance, and has all kinds of 
potential reliability issues).


Normally, when I try to go about reproducing something like this, I use 
a virtual machine running the most recent stable version of the Linux 
kernel, usually with a minimalistic Gentoo installation (although a 
clean install of pretty much any distro works fine).  There are a couple 
of reasons I use such a setup:
1. Using a clean install provides a well defined initial state, making 
it easier for other people to reproduce any results.
2. Using the most recent stable kernel available (usually) eliminates 
the chances of old bugs causing issues.
3. Using a VM means that your disk access will be slower, which will 
visibly accentuate any kind of performance issues.
4. Using a VM also means that it is very easy to safely generate crash 
dumps and simulate data corruption for testing purposes, and makes it 
easier to experiment with different parameters (for example, UP versus 
SMP, or different amounts of RAM).


If you do decide to go this route, my suggestion would be to use 
VirtualBox unless you have significant experience with some other 
hypervisor, as it's one of the easiest to learn to use (I usually use 
Xen or QEMU, but both require significant effort to set up initially, 
and are decidedly non-trivial to learn), and learning to debug stuff 
like this is itself not an easy task.
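
Whichever environment is used, a small probe that times each write makes the stalls measurable rather than anecdotal. The following is a minimal sketch under stated assumptions, not a known reproducer for this bug: the function name `worst_write_latency_ns`, the 4096-byte buffer limit, and the use of fsync() after every chunk are all choices made for illustration. It only reports the slowest write+fsync round trip it observed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC samples. */
static int64_t elapsed_ns(const struct timespec *a, const struct timespec *b)
{
	return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL +
	       (b->tv_nsec - a->tv_nsec);
}

/*
 * Write `count` chunks of `chunk` bytes (capped at 4096) to `path`,
 * calling fsync() after each one, and return the slowest write+fsync
 * round trip in nanoseconds, or -1 on error.
 */
static int64_t worst_write_latency_ns(const char *path, int count, size_t chunk)
{
	char buf[4096];
	struct timespec t0, t1;
	int64_t worst = 0;
	int fd, i;

	if (chunk > sizeof(buf))
		chunk = sizeof(buf);
	memset(buf, 0xA5, sizeof(buf));

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	for (i = 0; i < count; i++) {
		int64_t ns;

		clock_gettime(CLOCK_MONOTONIC, &t0);
		if (write(fd, buf, chunk) != (ssize_t)chunk || fsync(fd) != 0) {
			close(fd);
			return -1;
		}
		clock_gettime(CLOCK_MONOTONIC, &t1);

		ns = elapsed_ns(&t0, &t1);
		if (ns > worst)
			worst = ns;
	}

	close(fd);
	return worst;
}
```

Pointing `path` at a file on the affected filesystem and comparing the result with the same run on another disk would show whether small synchronous writes alone are enough to trigger long stalls.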






Re: [PATCHv2] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread David Sterba
On Thu, Oct 29, 2015 at 08:22:21AM +, Luke Dashjr wrote:
> 32-bit ioctl uses these rather than the regular FS_IOC_* versions. They can
> be handled in btrfs using the same code. Without this, 32-bit {ch,ls}attr
> fail.
> 
> Signed-off-by: Luke Dashjr 
> Cc: sta...@vger.kernel.org

Reviewed-by: David Sterba 


Re: [PATCH 3/6] btrfs-progs: free fslabel for btrfs-convert

2015-10-29 Thread David Sterba
On Thu, Oct 29, 2015 at 05:31:45PM +0800, Zhao Lei wrote:
> fslabel need to be freed before exit.
> 
> Signed-off-by: Zhao Lei 
> ---
>  btrfs-convert.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/btrfs-convert.c b/btrfs-convert.c
> index 5b9171e..1693d03 100644
> --- a/btrfs-convert.c
> +++ b/btrfs-convert.c
> @@ -3027,6 +3027,9 @@ int main(int argc, char *argv[])
>   ret = do_convert(file, datacsum, packing, noxattr, nodesize,
>   copylabel, fslabel, progress, features);
>   }
> +
> + free(fslabel);

fslabel is on stack:

btrfs-convert.c: In function 'main':
btrfs-convert.c:3031:6: warning: attempt to free a non-heap object 'fslabel' [-Wfree-nonheap-object]
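
The warning comes down to ownership: free() may only be given a pointer that came from malloc() and friends, so a buffer that lives on the stack needs no cleanup at all. A minimal sketch of the two cases follows. The names `copy_label` and `dup_label` and the value of `LABEL_SIZE` are assumptions for illustration and are not taken from btrfs-convert.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define LABEL_SIZE 256	/* assumed buffer size for this sketch */

/* Copy a label into a caller-provided (e.g. stack) buffer; nothing to free. */
static size_t copy_label(char dst[LABEL_SIZE], const char *src)
{
	size_t len = strlen(src);

	if (len >= LABEL_SIZE)
		len = LABEL_SIZE - 1;	/* truncate, keep room for the NUL */
	memcpy(dst, src, len);
	dst[len] = '\0';
	return len;
}

/* Heap variant: the caller owns the result and must free() it. */
static char *dup_label(const char *src)
{
	size_t len = strlen(src) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, src, len);
	return copy;
}
```

Only the pointer returned by `dup_label` may be passed to free(); the array filled by `copy_label` goes away with its enclosing scope.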


Question about Object IDs

2015-10-29 Thread Quad Cores
Hello,

On the Btrfs wiki page:
https://btrfs.wiki.kernel.org/index.php/Trees, under the section 'FS
Tree' it says that the trees are numbered 5,256,257 and so on, and the
top level subvolume has the ID 5; and the first created subvolume has
an ID of 256.

However, after installing btrfs on my system as a module, when I
created a new subvolume (my first subvolume), the ID given to it was
257. I checked this through the command:
btrfs subvolume list /mnt/point

A member on the btrfs IRC also confirmed it and suggested that I
ask the mailing list first and update the wiki if required.

Thanks,
QuadCores


Re: [PATCH] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread Thomas Rohwer

Hello,


> I've investigated this now, and it seems to be the pointer-type clone_sources
> member of struct btrfs_ioctl_send_args. I can't think of a perfect way to fix
> this, but it might not be *too* ugly to:
> - replace the current clone_sources with a u64 that must always be (u64)-1;
>   this causes older kernels to error cleanly if called with a new ioctl data
> - use the top 1 or 2 bits of flags to indicate sizeof(void*) as it appears to
>   userspace OR just use up reserved[0] for pointer size:
>   io_send.ptr_size = sizeof(void*);
> - replace one of the reserved fields with the new clone_sources
>
> The way it was done for receive seems like it might not work for non-x86
> compat interfaces (eg, MIPS n32) - but I could be wrong.
>
> Thoughts?


I also encountered that problem with send. I posted the following a while ago to
comp.file-systems.btrfs, but never got any replies. The patch works for me (I used
it on my system in some cases), but is not extensively tested.

Sincerely,

Thomas Rohwer





Hello,

I am using kernel Linux 4.1.3 (64-bit) and btrfs-progs version 4.0 (32-bit user space).
I wanted to use send/receive with btrfs for the first time today and I got the
following error:

humbur:~# btrfs send /snap > /dev/null
At subvol /snap
ERROR: send ioctl failed with -25: Inappropriate ioctl for device
ERROR: failed to read stream from kernel. Bad file descriptor


Investigating the source, I noticed that the problem is probably the member
clone_sources in the structure

struct btrfs_ioctl_send_args {
  __s64 send_fd;  /* in */
  __u64 clone_sources_count;  /* in */
  __u64 __user *clone_sources;  /* in */
  __u64 parent_root;/* in */
  __u64 flags;  /* in */
  __u64 reserved[4];/* in */
};

in include/uapi/linux/btrfs.h and the missing adaptation code in
fs/btrfs/ioctl.c. The member clone_sources is only 32 bits wide in the case of
32-bit user space.
For the ioctl RECEIVED_SUBVOL somebody already added the translation code that
is also necessary in this case. I took this as a template and
wrote a patch (see below). The patch compiles, and with the new kernel I seem to
get valid data with send (I have yet to read it back,
but I get about the expected amount and structure).
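
The size mismatch described above can be checked with a standalone sketch. The struct names `send_args_64` and `send_args_32` are stand-ins chosen here, fixed-width integers replace the pointer so that the result does not depend on the machine compiling the sketch, and the packed attribute assumes GCC or Clang. Since the _IOW() macro encodes the size of the argument structure into the command number, two different sizes yield two different command numbers, which is consistent with the "Inappropriate ioctl for device" (-25, ENOTTY) error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as a 64-bit kernel sees it: the pointer occupies 8 bytes. */
struct send_args_64 {
	int64_t  send_fd;
	uint64_t clone_sources_count;
	uint64_t clone_sources;		/* stands in for an 8-byte pointer */
	uint64_t parent_root;
	uint64_t flags;
	uint64_t reserved[4];
};

/* Layout with a 4-byte pointer and no padding after it (hence "packed"). */
struct send_args_32 {
	int64_t  send_fd;
	uint64_t clone_sources_count;
	uint32_t clone_sources;		/* stands in for a 4-byte pointer */
	uint64_t parent_root;
	uint64_t flags;
	uint64_t reserved[4];
} __attribute__((__packed__));

/* Number of bytes by which the two views of the structure disagree. */
static size_t layout_size_gap(void)
{
	return sizeof(struct send_args_64) - sizeof(struct send_args_32);
}
```

Every member after clone_sources sits 4 bytes earlier in the 32-bit view, so even if the command number matched, the kernel would read parent_root and flags from the wrong offsets.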

This is a proof of concept patch; for example the compiler currently warns for
  args64->clone_sources = (__u64*)args32->clone_sources;
and I would have to investigate how to properly convert the pointer. Further,
there are probably some issues with the formatting of the source code.
I also have not tested the 32-bit/32-bit and 64-bit/64-bit userspace/kernel
combinations.
If there is interest, I can resubmit an improved patch. Please CC me in
replies, since I have not subscribed to the list.


Sincerely,

Thomas Rohwer





Signed-off-by: Thomas Rohwer 

From e4156c38105200fa83913b6a94f07a41631c5f75 Mon Sep 17 00:00:00 2001
From: Thomas 
Date: Wed, 22 Jul 2015 22:03:15 +0200
Subject: [PATCH] add 32 bit adaption code for send ioctl

---
 fs/btrfs/ioctl.c | 73 
 fs/btrfs/send.c  | 11 +
 fs/btrfs/send.h  |  2 +-
 3 files changed, 75 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 1c22c65..8b969c0 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -83,6 +83,19 @@ struct btrfs_ioctl_received_subvol_args_32 {

 #define BTRFS_IOC_SET_RECEIVED_SUBVOL_32 _IOWR(BTRFS_IOCTL_MAGIC, 37, \
 struct btrfs_ioctl_received_subvol_args_32)
+
+struct btrfs_ioctl_send_args_32 {
+__s64 send_fd;  /* in */
+__u64 clone_sources_count;  /* in */
+__u32 clone_sources;/* in */
+__u64 parent_root;  /* in */
+__u64 flags;/* in */
+__u64 reserved[4];  /* in */
+} __attribute__ ((__packed__));
+
+#define BTRFS_IOC_SEND_32 _IOW(BTRFS_IOCTL_MAGIC, 38, \
+struct btrfs_ioctl_send_args_32)
+
 #endif


@@ -4965,6 +4978,43 @@ out:
 kfree(args64);
 return ret;
 }
+
+static long btrfs_ioctl_send_32(struct file *file,
+void __user *arg)
+{
+struct btrfs_ioctl_send_args_32 *args32 = NULL;
+struct btrfs_ioctl_send_args *args64 = NULL;
+int ret = 0;
+
+args32 = memdup_user(arg, sizeof(*args32));
+if (IS_ERR(args32)) {
+ret = PTR_ERR(args32);
+args32 = NULL;
+goto out;
+}
+
+args64 = kmalloc(sizeof(*args64), GFP_NOFS);
+if (!args64) {
+ret = -ENOMEM;
+goto out;
+}
+
+args64->send_fd = args32->send_fd;
+args64->clone_sources_count = args32->clone_sources_count;
+args64->clone_sources = (__u64*)args32->clone_sources;
+args64->parent_root = args32->parent_root;
+args64->flags = args32->flags;
+
+ret = _btrfs_ioctl_send(file, args64);
+
+// only in arguments, so no copy back to args32
+
+out:
+kfree(args32);
+kfree(args64);
+return ret;
+}
+
 #endif

 static 

Re: [PATCH 1/6] btrfs-progs: fix floating point exception for btrfs-calc-size

2015-10-29 Thread David Sterba
Hi,

patches 1, 2, 4, 5 and 6 applied, thanks.


Re: Bad fs performance, IO freezes

2015-10-29 Thread cheater00 .
Hi Austin,
seek times are fine, but this literally freezes my computer for a
split second. I've had to re-type this email twice because the freezes
meant letters I typed would not arrive on the screen.
USB disks are so common they should not be having issues.
I have 4.3.0-040300rc7-generic #201510260712 which is just three days old.

Please advise. Isn't it better to *not* use a vm to debug this?
BTW, if we are talking about slow speed making things worse, I could
try downgrading the cable to usb2.
Is there a standard virtualbox VM that I could use?
I'll download Gentoo in the meantime. I have never used it. I'm
getting the "minimal installation CD" from 29 September:
http://distfiles.gentoo.org/releases/x86/autobuilds/20150929/install-x86-minimal-20150929.iso

On Thu, Oct 29, 2015 at 3:00 PM, Austin S Hemmelgarn
 wrote:
> On 2015-10-29 09:03, cheater00 . wrote:
>>
>> Hi Liu,
>> after talking with Holger I believe turning off COW on this FS will
>> work to alleviate this issue. However, even with COW on, btrfs
>> shouldn't be making my computer freeze every 5 seconds... especially
>> while the disk is written to at mere tens of kilobytes per second.
>> It's not even the disk holding the system. I consider this a pretty
>> bad bug... should we go on with trying to reproduce a minimum case?
>> How would I go about this?
>
>
> Well, COW can cause some pretty unexpected behavior for some use cases.  If
> you have a big disk (I think I remember you saying it was larger than 1TB),
> then COW can cause some pretty significant seek times because of how it
> works.  With the current state of BTRFS, I wouldn't personally consider
> running BTRFS on anything bigger than 256G with a non-zero seek time with
> COW turned on, because large rewrites would have the potential to cause
> horrifically long seek times just for a RMW cycle on a single block, and
> this is in turn part of why database files and virtual-machine images tend
> to be pathological use cases for BTRFS.
>
> I do agree that this kind of thing is a bug, but it's not something that
> causes data corruption, which means that it is slightly lower priority as
> far as most people are concerned.  Reproducing it might be tricky also,
> because I'd be willing to bet that things get better to the point of it
> being almost unnoticeable with an internal disk (USB is horrible when it
> comes to block storage performance, and has all kinds of potential
> reliability issues).
>
> Normally, when I try to go about reproducing something like this, I use a
> virtual machine running the most recent stable version of the Linux kernel,
> usually with a minimalistic Gentoo installation (although a clean install of
> pretty much any distro works fine).  There are a couple of reasons I use
> such a setup:
> 1. Using a clean install provides a well defined initial state, making it
> easier for other people to reproduce any results.
> 2. Using the most recent stable kernel available (usually) eliminates the
> chances of old bugs causing issues.
> 3. Using a VM means that your disk access will be slower, which will visibly
> accentuate any kind of performance issues.
> 4. Using a VM also means that it is very easy to safely generate crash dumps
> and simulate data corruption for testing purposes, and makes it easier to
> experiment with different parameters (for example, UP versus SMP, or
> different amounts of RAM).
>
> If you do decide to go this route, my suggestion would be to use VirtualBox
> unless you have significant experience with some other hypervisor, as it's
> one of the easiest to learn to use (I usually use Xen or QEMU, but both
> require significant effort to set up initially, and are decidedly
> non-trivial to learn), and learning to debug stuff like this is itself not
> an easy task.
>


Re: [PATCH] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread David Sterba
On Thu, Oct 29, 2015 at 01:05:13PM +0100, Thomas Rohwer wrote:
> Investigating the source, I noticed that probably the problem is the member 
> clone_sources in the structure

That's right. Yet another thing to keep in mind when designing ioctls.

> struct btrfs_ioctl_send_args {
>__s64 send_fd;  /* in */
>__u64 clone_sources_count;  /* in */
>__u64 __user *clone_sources;  /* in */
>__u64 parent_root;/* in */
>__u64 flags;  /* in */
>__u64 reserved[4];/* in */
> };
> 
> in include/uapi/linux/btrfs.h and the missing adaption code in
> fs/btrfs/ioctl.c. The member clone_sources is only 32 bit wide in case
> of 32 bit user space.  For the ioctl RECEIVED_SUBVOL somebody already
> added code for the in this case also necessary translation. I took
> this as a template and wrote a patch (see below). The patch compiles
> and with the new kernel I seem to get valid data with send (I have to
> read it back yet, but I get about the expected amount and structure).

I'm not yet sure if this is the right approach, but I'm still
considering it. My suggestion is to add the union and force the width.
In case of the receive subvol this was not possible because it was
around an external structure.

The basic difference is whether we use the same structure definition, let the
compiler pick the pointer type width, and do the conversion in the kernel
transparently (i.e. the RECEIVED_SUBVOL workaround), or patch the
structure so the pointer magic happens in userspace.

> This is a proof of concept patch; for example the compiler currently warns for
>   (args64->clone_sources = (__u64*)args32->clone_sources;
> and I would have to investigate how to properly convert the pointer.

args64->clone_sources = (__u64*)(long)args32->clone_sources;

should work.
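
The double cast works because the 32-bit value is first widened to an integer of pointer width and only then turned into a pointer. A userspace sketch of the same idea follows, using uintptr_t instead of long; `widen_user_ptr` is a hypothetical helper name. In kernel code, the compat_ptr() helper mentioned elsewhere in this thread is, as far as I know, the usual way to do this conversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Widen a 32-bit pointer value to a native pointer. Going through an
 * integer of pointer width avoids the "cast to pointer from integer of
 * different size" warning that the direct cast triggers.
 */
static uint64_t *widen_user_ptr(uint32_t raw)
{
	return (uint64_t *)(uintptr_t)raw;
}
```

The resulting pointer is only meaningful in the address space the value came from; the sketch shows the conversion, not a dereference.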

> Further there are probably some issues with the formating of the source code.

Yeah there are some, but at this point we're not sure what's the right
approach so don't worry.


Re: [PATCH] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread David Sterba
On Thu, Oct 29, 2015 at 08:22:34AM +, Luke Dashjr wrote:
> > > I don't see what is different with that implementation. All
> > > f2fs_compat_ioctl does is change cmd to the plain-IOC equivalent and
> > > call f2fs_ioctl with the same arg (compat_ptr merely causes a cast to
> > > void* and back, which AFAIK is a noop on 64-bit?). Am I missing
> > > something?
> > 
> > No, that's the idea. Add new callback for compat_ioctl, put it under
> > #ifdef CONFIG_COMPAT and do the same number switch.
> 
> Ok, someone else explained this to me. Please let me know if PATCHv2 (sent 
> separately) does not address the needed changes.

Patch is ok, thanks.
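
The "number switch" discussed above can be sketched outside the kernel as follows. The command values here are stand-ins invented for the sketch; the real ones come from the FS_IOC_* and FS_IOC32_* macros. `compat_to_native` is likewise a hypothetical name, and returning -1 for unmapped commands stands in for whatever error the real compat handler returns (f2fs, as far as I know, uses -ENOIOCTLCMD).

```c
#include <assert.h>

/* Stand-in command numbers for this sketch only. */
enum {
	CMD_GETFLAGS = 1,
	CMD_SETFLAGS,
	CMD_GETVERSION,
	CMD32_GETFLAGS = 101,
	CMD32_SETFLAGS,
	CMD32_GETVERSION,
	CMD_OTHER = 200
};

/*
 * Translate a 32-bit command number to its native equivalent so that the
 * regular ioctl handler can be reused with the same argument.
 * Returns -1 for commands that have no compat mapping.
 */
static int compat_to_native(int cmd)
{
	switch (cmd) {
	case CMD32_GETFLAGS:
		return CMD_GETFLAGS;
	case CMD32_SETFLAGS:
		return CMD_SETFLAGS;
	case CMD32_GETVERSION:
		return CMD_GETVERSION;
	default:
		return -1;
	}
}
```

After the translation, the compat entry point would simply call the regular handler with the rewritten command.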

> > > I could try to just imitate it, but
> > > I'd rather know what is significant/going on to ensure I don't waste your
> > > time with code I don't even properly understand myself.
> > > 
> > > Perhaps by coincidence, the patch does at least in practice work
> > > (although at least `btrfs send` appears to be broken still, and I'm at a
> > > loss for how to approach fixing that).
> > 
> > The 'receive' 32bit/64bit was broken due to size difference in the ioctl
> > structure that led to different ioctl. This is transparently fixed, see
> > BTRFS_IOC_SET_RECEIVED_SUBVOL_32 at the top of ioctl.c.
> > 
> > In what way is SEND broken? There are only u64/s64 members in
> > btrfs_ioctl_send_args, I don't see how this could break on 32/64
> > userspace/kernel.
> 
> I've investigated this now, and it seems to be the pointer-type clone_sources 
> member of struct btrfs_ioctl_send_args. I can't think of a perfect way to fix 
> this, but it might not be *too* ugly to:
> - replace the current clone_sources with a u64 that must always be (u64)-1;
>   this causes older kernels to error cleanly if called with a new ioctl data
> - use the top 1 or 2 bits of flags to indicate sizeof(void*) as it appears to
>   userspace OR just use up reserved[0] for pointer size:
>   io_send.ptr_size = sizeof(void*);
> - replace one of the reserved fields with the new clone_sources

All the changes seem too intrusive or not so easy to use.

I suggest adding an anonymous union with a u64 member that would
force the type width:

struct btrfs_ioctl_send_args {
	__s64 send_fd;				/* in */
	__u64 clone_sources_count;		/* in */
	union {
		__u64 __user *clone_sources;	/* in */
		u64 __pointer_alignment;
	};
	__u64 parent_root;			/* in */
	__u64 flags;				/* in */
	__u64 reserved[4];			/* in */
};
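
The effect of the union on the layout can be checked with a userspace sketch. `send_args_union` and `pointer_alignment` are stand-in names (the leading double underscore is dropped because such identifiers are reserved in userspace), and anonymous unions assume a C11 compiler or the GNU extension. The 8-byte member fixes the width of the slot, so the total size and the offsets of the following members no longer depend on the pointer width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct send_args_union {
	int64_t  send_fd;
	uint64_t clone_sources_count;
	union {
		uint64_t *clone_sources;	/* 4 or 8 bytes, depending on the ABI */
		uint64_t  pointer_alignment;	/* always 8 bytes: fixes the slot width */
	};
	uint64_t parent_root;
	uint64_t flags;
	uint64_t reserved[4];
};

/* Size of the slot shared by the pointer and its 8-byte companion. */
static size_t pointer_slot_size(void)
{
	return offsetof(struct send_args_union, parent_root) -
	       offsetof(struct send_args_union, clone_sources);
}
```

This sketch says nothing about which half of the slot a 4-byte pointer lands in; that depends on byte order.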

> The way it was done for receive seems like it might not work for non-x86 
> compat interfaces (eg, MIPS n32) - but I could be wrong.

Possible, but I don't see right now how it would not work on eg. mips32,
unless sizeof(long) is 8 bytes there and CONFIG_64BIT is not defined.


Re: [PATCHv2] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread Josef Bacik

On 10/29/2015 04:22 AM, Luke Dashjr wrote:

> 32-bit ioctl uses these rather than the regular FS_IOC_* versions. They can
> be handled in btrfs using the same code. Without this, 32-bit {ch,ls}attr
> fail.



Looks good, thanks,

Reviewed-by: Josef Bacik 



Re: Bad fs performance, IO freezes

2015-10-29 Thread Henk Slager
The graph uploaded shows 'Up speed' and 'Down speed'; I just assumed
that this is network speed and not disk I/O. Is this a correct
assumption? If so, what does the traffic to/from the /dev/sdX look like
(e.g. in iostat or ksysguard)?

W.r.t. USB: I had quite some trouble with a NEC/Renesas USB3 host
controller 2 years back. I got it working once under Windows 7 after
trying quite a few driver versions. Under Linux it is listed, but I
don't use it anymore; on the same PC under Windows 7 it doesn't work
anymore. For btrfs on WD Elements and SanDisk Extreme on an ASUS H87M-Pro
with 3.x kernels (64-bit) I got similar freezes/performance
issues as mentioned here. The same fs configurations and tests on a
SATA-cabled 2TB HDD did not show these hiccups. I now have mostly ext4 on
those USB-connected disks, so with 4.3-rcX kernels I can't tell more.

There might be many configuration issues if a drive is connected via
removable USB, like writeback-caching etc. I have not looked into all
possible issues any further.
A month ago I got quite a few random I/O and timeout errors on a 4TB
raw dd_rescue copy action (1 HDD in a USB3 bay), kernel 4.1.6. Then I
hooked up all 3 disks, including rootfs, to another motherboard via
SATA and got not a single error. On the other hand, I also have a 2TB disk
connected via USB2, formatted btrfs and online 24/7 for over a year, and
it works fine.

So I would (temporarily) connect this WB 6TB via SATA and use a
recent 64-bit live CD/DVD Linux distro and see what the disk traffic looks
like for some file copy or defrag action. And then, step by step, go back to
your original configuration. Also make sure the disk is not filled
much more than 95%, as that could easily lead to your most recently
active files being very scattered, which is not
beneficial for fs performance.


Re: [PATCH] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread Thomas Rohwer

> I suggest to add an anonymous union and add a u64 member that would
> force the type width:
>
> struct btrfs_ioctl_send_args {
>         __s64 send_fd;                          /* in */
>         __u64 clone_sources_count;              /* in */
>         union {
>                 __u64 __user *clone_sources;    /* in */
>                 u64 __pointer_alignment;
>         };
>         __u64 parent_root;                      /* in */
>         __u64 flags;                            /* in */
>         __u64 reserved[4];                      /* in */
> };


I am no expert, but would this change alone modify the user space ABI of a
32-bit Linux kernel?
I.e. people in the (presumably currently working) btrfs-send situation of
32-bit user space on a 32-bit kernel
would have to upgrade user space tools and kernel at the same time. Otherwise,
they will encounter a non-working setup.
I think my suggested patch does not change any working ABI, and no changes to
the user space tools are necessary.


Sincerely,

Thomas Rohwer



Re: [PATCH] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread Luke Dashjr
On Thursday, October 29, 2015 7:36:35 PM Thomas Rohwer wrote:
> > I suggest to add an anonymous union and add a u64 member that would
> > force the type width:
> > 
> > struct btrfs_ioctl_send_args {
> > 
> >  __s64 send_fd;  /* in */
> >  __u64 clone_sources_count;  /* in */
> > 
> > union {
> > 
> > __u64 __user *clone_sources;/* in */
> > u64 __pointer_alignment;
> > 
> > };
> > 
> >  __u64 parent_root;  /* in */
> >  __u64 flags;/* in */
> >  __u64 reserved[4];  /* in */
> > 
> > };
> 
> I am no expert, but would this change alone modify the user space ABI of a
> 32-bit Linux kernel? I.e. people in the (presumably currently working)
> btrfs-send situation (32-bit) user space/32-bit kernel would have to
> upgrade user space tools and kernel at the same time. Otherwise, they will
> encounter a non-working setup.

Yes, it would, but this appears to already be the case for btrfs-progs in 
general.

> I think, my suggested patch does not change any working ABI, and no change
> to the user space tools are necessary.

Don't the user space tools need to call a different ioctl?

Luke


Re: [PATCH] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread Luke Dashjr
On Thursday, October 29, 2015 2:39:32 PM David Sterba wrote:
> On Thu, Oct 29, 2015 at 08:22:34AM +, Luke Dashjr wrote:
> > > In what way is SEND broken? There are only u64/s64 members in
> > > btrfs_ioctl_send_args, I don't see how this could break on 32/64
> > > userspace/kernel.
> > 
> > I've investigated this now, and it seems to be the pointer-type
> > clone_sources member of struct btrfs_ioctl_send_args. I can't think of a
> > perfect way to fix this, but it might not be *too* ugly to:
> > - replace the current clone_sources with a u64 that must always be
> > (u64)-1;
> > 
> >   this causes older kernels to error cleanly if called with a new ioctl
> >   data
> > 
> > - use the top 1 or 2 bits of flags to indicate sizeof(void*) as it
> > appears to
> > 
> >   userspace OR just use up reserved[0] for pointer size:
> >   io_send.ptr_size = sizeof(void*);
> > 
> > - replace one of the reserved fields with the new clone_sources
> 
> All the change seem too intrusive or not so easy to use.
> 
> I suggest to add an anonymous union and add a u64 member that would
> force the type width:
> 
> struct btrfs_ioctl_send_args {
>         __s64 send_fd;                  /* in */
>         __u64 clone_sources_count;      /* in */
>         union {
>                 __u64 __user *clone_sources;    /* in */
>                 u64 __pointer_alignment;
>         };
>         __u64 parent_root;              /* in */
>         __u64 flags;                    /* in */
>         __u64 reserved[4];              /* in */
> };

What guarantees the union to position clone_sources in the LSB of 
__pointer_alignment (rather than the MSB side)?
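
To make the byte-order question concrete, here is a small user-space sketch. It is not kernel code: `union send_slot` and `fake_ptr32` are hypothetical names, with a `uint32_t` standing in for a 32-bit user pointer. It only shows that a narrow store into a zeroed union reads back unchanged through the wide member on a little-endian host, and not on a big-endian one.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the proposed union: a 32-bit "pointer"
 * overlaid on a 64-bit slot. */
union send_slot {
	uint32_t fake_ptr32;          /* what a 32-bit userspace would store */
	uint64_t pointer_alignment;   /* what a 64-bit kernel would read */
};

/* Returns 1 if the host stores the least significant byte first. */
int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	unsigned char first_byte;

	memcpy(&first_byte, &probe, 1);
	return first_byte == 1;
}

/* Store a 32-bit value through the narrow member of a zeroed union and
 * report whether reading the wide member gives the same number back. */
int narrow_store_reads_back(uint32_t value)
{
	union send_slot slot;

	memset(&slot, 0, sizeof(slot));
	slot.fake_ptr32 = value;
	return slot.pointer_alignment == (uint64_t)value;
}
```

On a big-endian host the narrow member overlays the most significant half, so `narrow_store_reads_back()` returns 0 there; the union alone gives no guarantee either way.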

> > The way it was done for receive seems like it might not work for non-x86
> > compat interfaces (eg, MIPS n32) - but I could be wrong.
> 
> Possible, but I don't see right now how it would not work on eg. mips32.
> unless sizeof(long) is 8 bytes there and CONFIG_64BIT is not defined.

n32 is a MIPS64 ABI, like the new x32 ABI for x86_64 machines, so I would
expect sizeof(long) to be 8 bytes, and am uncertain whether this implies any
particular alignment. (But I don't have any MIPS systems, so this isn't
something I'm too concerned with myself.)

Luke


Re: Bad fs performance, IO freezes

2015-10-29 Thread Austin S Hemmelgarn

On 2015-10-29 11:49, cheater00 . wrote:

> Hi Austin,
> seek times are fine, but this literally freezes my computer for a
> split second. I've had to re-type this email twice because the freezes
> meant letters I typed would not arrive on the screen.
> USB disks are so common they should not be having issues.

That's debatable.  USB is commonly used because it's almost impossible
to find a system that doesn't have it, not because it's reliable.  The
original intent was for it to be used for stuff like mice and keyboards,
so it was designed with low-latency and fair scheduling in mind, both of
which really hurt performance of bulk data storage devices.

> I have 4.3.0-040300rc7-generic #201510260712 which is just three days old.

That should be perfectly recent enough, although FWIW, the official
version of 4.3 should be out this Sunday.


> Please advise. Isn't it better to *not* use a vm to debug this?

That depends.  For something like this, it could go either way.  I just
use a VM because that's what I always use, because it's nice not
crashing your system when trying to debug a kernel panic.

> BTW, if we are talking about slow speed making things worse, I could
> try downgrading the cable to usb2.
> Is there a standard virtualbox VM that I could use?

In general, it's pretty easy to set something like Ubuntu up in
VirtualBox, the install is essentially identical to regular hardware
aside from the initial setup of the VM itself.  The documentation for
VirtualBox is really good, if you've never used virtualization before,
it's definitely worth reading.

> I'll download Gentoo in the meantime. I have never used it. I'm
> getting the "minimal installation cd" from 29th september.
> http://distfiles.gentoo.org/releases/x86/autobuilds/20150929/install-x86-minimal-20150929.iso

I meant by no means that you needed to use Gentoo, I only mentioned it
because it's what I use (which in turn is because that's what I use on
just about everything except stuff like the Raspberry Pi or the
BeagleBoard).  If you just want to debug this and then be done with it,
I would actually advise against using Gentoo, it takes a lot of effort
to get a system up and running with it, and it's very involved to
maintain compared to Ubuntu.  On the other hand though, if you are
willing to learn to use it, it's one of the most highly customizable
Linux distros out there, and can have noticeably better performance than
more generic distros (FWIW, it's also one of the last big distros that
doesn't force systemd on its users by default).






Re: random i/o error without error in dmesg

2015-10-29 Thread Marc Joliet
On Wednesday 28 October 2015 05:21:13 Duncan wrote:
>Marc Joliet posted on Tue, 27 Oct 2015 21:54:40 +0100 as excerpted:
>>>IOW, does it take a full reboot to clear the problem, or is a simple
>>>ro/rw mount cycle enough, or an unmount/remount?
>>>
>> Seems that a full reboot is needed, but I would expect that it would
>> have the same effect if I were to pivot back into the initramfs, unmount
>> / from there,
>> then boot back into the system.  Because quite frankly, I can't think of
>> any reason why a power cycle to the SSD should make a difference here.
>> I vaguely remember that systemd can do that, so I'll see if I can find
>> out how.
>
>Agree with both the systemd returning to the initr* point (which I
>actually had in mind while writing the above but don't remember the
>details either, so chose to omit in the interest of limiting the size of
>the reply and research necessary to generate it), and the ssd power-cycle
>point.

I haven't found any single command that lets you do that, but I can try one of 
the special targets as detailed in bootup(7) (e.g., initrd.target) when I have 
a chance.

>>>Finally, assuming root itself isn't btrfs, if you have btrfs configured
>>>as a module, you could try unmounting all btrfs and then unloading the
>>>module, then reloading and remounting.  That should entirely clear all
>>>in-memory btrfs state, so if that doesn't solve the problem, while
>>>rebooting does, then the problem's very possibly outside of btrfs scope.
>>>
>>> Of course if root is btrfs, you can't really check that.
>> 
>> Nope, btrfs is built-in (though it doesn't have to be, what with me
>> using an initramfs).
>
>Same here, also gentoo as I guess you know from previous exchanges.  But
>unfortunately, if your initr* is anything like mine, and your kernel
>monolithic as mine, making btrfs a module with a btrfs root isn't the
>easy thing it might seem to those who run ordinary distro supplied binary
>kernels with pretty much everything modularized, as doing so involves a
>whole new set of research on how to get that module properly included in
>the initr* and loaded there, as well as installing and building the whole
>module-handling infrastructure (modprobe and friends) again, as it's not
>actually installed on the system at all at this point, because with the
>kernel entirely monolithic, module-handling tools are unnecessary and
>thus just another unnecessary package to have to keep building updates
>for, if they remain installed.

My kernel is fairly modular, and I use dracut to make my initramfs, so I 
wouldn't be surprised if it works.  For me, personally, I just don't see any 
point in making btrfs a module.

(And yes, of course I know you run Gentoo ;-) .)

>So I definitely sympathize with the feeling that such a stone is better
>left unturned, if overturning it is at all a possibility that can be
>avoided, as it is here, this whole exercise being simply one of better
>pinning the bug down, not yet actually trying to solve it.  And given
>that unturned stone, there are certainly easier ways.
>
>And one of those easier ways is investigating that whole systemd return
>to initr* idea, since we both remember reading something about it, but
>aren't familiar with the details.  In addition to addressing the problem
>headon if anyone offers a way to do so, that's the path I'd be looking at
>right now.

Like I said above, I'll try it out when I have a moment where I have a more 
"steady hand" so-to-speak.

[snip deleted files stuff]
>app-admin/lib_users
[snip the rest of the deleted files stuff]

I use that to find processes that need restarting after upgrades, though I'll 
sometimes check to see if it's really a library that's causing it to show up, 
since often a process is listed because of stuff like the font cache, or, in 
the case of the FISH shell, its own history file.

But yeah, didn't think of running that, but in rescue mode there were at most 
a dozen processes running, so there's not much to choose from, anyway.  I did 
have to kill two remaining user processes first (pulseaudio and... I forgot 
the other one).  I didn't try the same with / and /var because I was eager to 
get back to a normally running system ;-) .

>Of course, if lib_users reports nothing further still holding references
>to deleted files, and a remount read-only STILL fails, that's a major
>note of trouble and an important finding in itself.

I don't expect that, but I'll make note of it if I encounter it.

>Meanwhile, as explained in the systemd docs (specifically the systemd for
>administrators series, IIRC), systemd dropping back to the initr* is
>actually its way of automatically doing effectively the same thing we
>were using lib_users and all those restarts to do, getting rid of all
>possible still running on root executables, including systemd itself, by
>reexecing systemd itself back in the initr*, as a way to help eliminate
>*everything* running on root, so it can not only be remounted read-only,
>but 

Re: corrupted RAID1: unsuccessful recovery / help needed

2015-10-29 Thread Lukas Pirl

TL;DR: thanks but recovery still preferred over recreation.

Hello Duncan and thanks for your reply!

On 10/26/2015 09:31 PM, Duncan wrote:

> FWIW... Older btrfs userspace such as your v3.17 is "OK" for normal
> runtime use, assuming you don't need any newer features, as in normal
> runtime, it's the kernel code doing the real work and userspace for the
> most part simply makes the appropriate kernel calls to do that work.
>
> But, once you get into a recovery situation like the one you're in now,
> current userspace becomes much more important, as the various things
> you'll do to attempt recovery rely far more on userspace code directly
> accessing the filesystem, and it's only the newest userspace code that
> has the latest fixes.
>
> So for a recovery situation, the newest userspace release (4.2.2 at
> present) as well as a recent kernel is recommended, and depending on the
> problem, you may at times need to run integration or apply patches on top
> of that.


I am willing to update before trying further repairs. Is e.g. "balance"
also influenced by the userspace tools or does the kernel do the actual work?



> General note about btrfs and btrfs raid.  Given that btrfs itself remains
> a "stabilizing, but not yet fully mature and stable filesystem", while
> btrfs raid will often let you recover from a bad device, sometimes that
> recovery is in the form of letting you mount ro, so you can access the
> data and copy it elsewhere, before blowing away the filesystem and
> starting over.


If there is one subvolume that contains all other (read only) snapshots 
and there is insufficient storage to copy them all separately:

Is there an elegant way to preserve those when moving the data across disks?


> Back to the problem at hand.  Current btrfs has a known limitation when
> operating in degraded mode.  That being, a btrfs raid may be write-
> mountable only once, degraded, after which it can only be read-only
> mounted.  This is because under certain circumstances in degraded mode,
> btrfs will fall back from its normal raid mode to single mode chunk
> allocation for new writes, and once there's single-mode chunks on the
> filesystem, btrfs mount isn't currently smart enough to check that all
> chunks are actually available on present devices, and simply jumps to the
> conclusion that there's single mode chunks on the missing device(s) as
> well, so refuses to mount writable after that in order to prevent
> further damage to the filesystem and preserve the ability to mount at
> least ro, to copy off what isn't damaged.
>
> There's a patch in the pipeline for this problem, that checks individual
> chunks instead of leaping to conclusions based on the presence of single-
> mode chunks on a degraded filesystem with missing devices.  If that's
> your only problem (which the backtraces might reveal but I as a non-dev
> btrfs user can't tell), the patches should let you mount writable.
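
The difference between the two mount checks described above can be sketched in a few lines. This is a simplified model under stated assumptions, not btrfs code: `struct chunk_model`, `tolerated_missing` and both function names are hypothetical, and real chunk profiles carry more state than this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model: each chunk lists the devices that
 * hold its stripes; a device is either present or missing. */
#define MAX_STRIPES 4

struct chunk_model {
	int num_stripes;
	int devid[MAX_STRIPES];   /* index into the present[] array */
	int tolerated_missing;    /* 0 for single, 1 for raid1 */
};

/* Per-chunk policy: writable only if every chunk is missing no more
 * devices than its profile tolerates. */
bool all_chunks_available(const struct chunk_model *chunks, size_t nr,
			  const bool *present)
{
	for (size_t i = 0; i < nr; i++) {
		int missing = 0;

		for (int s = 0; s < chunks[i].num_stripes; s++)
			if (!present[chunks[i].devid[s]])
				missing++;
		if (missing > chunks[i].tolerated_missing)
			return false;
	}
	return true;
}

/* Coarse policy as described above: any single-profile chunk on a
 * filesystem with a missing device blocks a writable mount. */
bool coarse_check_allows_rw(const struct chunk_model *chunks, size_t nr,
			    const bool *present, size_t nr_devs)
{
	bool any_missing = false;

	for (size_t d = 0; d < nr_devs; d++)
		if (!present[d])
			any_missing = true;
	if (!any_missing)
		return true;
	for (size_t i = 0; i < nr; i++)
		if (chunks[i].tolerated_missing == 0)
			return false;
	return true;
}
```

With one raid1 chunk spanning both devices and one single chunk written to the surviving device while degraded, the per-chunk check passes while the coarse check refuses, which is the situation the paragraph above describes.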


Interesting, thanks for the insights.


> But that patch isn't in kernel 4.2.  You'll need at least kernel 4.3-rc,
> and possibly btrfs integration, or to cherrypick the patches onto 4.2.


Well, before digging into that, a hint that this is actually the case 
would be appreciated. :)



> Meanwhile, in keeping with the admin's rule on backups, by definition, if
> you valued the data more than the time and resources necessary for a
> backup, by definition, you have a backup available, otherwise, by
> definition, you valued the data less than the time and resources
> necessary to back it up.
>
> Therefore, no worries.  Regardless of the fate of the data, you saved
> what your actions declared of most valuable to you, either the data, or
> the hassle and resources cost of the backup you didn't do.  As such, if
> you don't have a backup (or if you do but it's outdated), the data at
> risk of loss is by definition of very limited value.
>
> That said, it appears you don't even have to worry about loss of that
> very limited value data, since mounting degraded,recovery,ro gives you
> stable access to it, and you can use the opportunity provided to copy it
> elsewhere, at least to the extent that the data we already know is of
> limited value is even worth the hassle of doing that.
>
> Which is exactly what I'd do.  Actually, I've had to resort to btrfs
> restore[1] a couple times when the filesystem wouldn't mount at all, so
> the fact that you can mount it degraded,recovery,ro, already puts you
> ahead of the game. =:^)
>
> So yeah, first thing, since you have the opportunity, unless your backups
> are sufficiently current that it's not worth the trouble, copy off the
> data while you can.
>
> Then, unless you wish to keep the filesystem around in case the devs want
> to use it to improve btrfs' recovery system, I'd just blow it away and
> start over, restoring the data from backup once you have a fresh
> filesystem to restore to.  That's the simplest and fastest way to a fully
> working system once again, and what I did here after using btrfs restore
> to recover the delta between current and my backups.


Thanks for all the elaborations. I guess there are 

Re: [PATCH v3 06/21] btrfs: delayed_ref: Add new function to record reserved space into delayed ref

2015-10-29 Thread Chris Mason
On Wed, Oct 28, 2015 at 02:36:42PM +0100, Holger Hoffstätte wrote:
> On Tue, Oct 27, 2015 at 12:34 PM, Chris Mason  wrote:
> > On Tue, Oct 27, 2015 at 05:05:56PM +0800, Qu Wenruo wrote:
> >>
> >>
> >> Chris Mason wrote on 2015/10/27 02:12 -0400:
> >> >On Tue, Oct 27, 2015 at 01:48:34PM +0800, Qu Wenruo wrote:
> >> Are you testing integration-4.4 from Chris repo?
> >> Or 4.3-rc from mainline repo with my qgroup reserve patchset applied?
> >> 
> >> Although integration-4.4 already merged qgroup reserve patchset, but
> >> it's causing some strange bug like over decrease data
> >> sinfo->bytes_may_use, mainly in generic/127 testcase.
> >>
> >> But if qgroup reserve patchset is rebased to integration-4.3 (I did
> >> all my old tests based on that), no generic/127 problem at all.
> >> >>>
> >> >>>Did I mismerge things?
> >> >>>
> >> >>>-chris
> >> >>>
> >> >>Not sure yet.
> >> >>
> >> >>But at least some patches in 4.3 is not in integration-4.4, like the
> >> >>following patch:
> >> >>btrfs: Avoid truncate tailing page if fallocate range doesn't exceed
> >> >>inode size
> >> >
> >> >Have you tried testing integration-4.4 merged with current Linus git?
> 
> Chris, something went definitely wrong with the 4.4-integration
> branch, and it's not the point where you merged from Josef. Mainline
> has: 0f6925fa2907df58496cabc33fa4677c635e2223 ("btrfs: Avoid truncate
> tailing page if fallocate range doesn't exceed inode size"), and that
> commit just doesn't exist in 4.4-integration any more. Neither did any
> merges touch file.c, so it
> seems this just got lost for some reason (rebase? forced push?).
> It's difficult to say what else might have gone missing.

Hi Holger,

integration-4.4 is based on 4.3-rc5, and it doesn't include any of the
btrfs commits that went in after rc5.  So if you want the latest commits
from 4.3, you just need to merge integration-4.4 with a more recent
Linus rc.

This isn't completely intuitive ;)  I could merge in 4.3-rc7, but for the
trees that I send to Linus, he prefers I not add extra merges unless it
solves some dependency (like a new API, or highly critical bug).

So when I test integration, I test it merged into Linus' latest rc, but
I apply patches on top of the older base.  It makes the resulting graph
of merges look much nicer when Linus pulls from me, and if you scroll
through the commits with git log or gitweb, it's more clear where the
new commits are.

-chris



Re: [PATCH] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread Luke Dashjr
On Friday, May 15, 2015 11:19:22 AM David Sterba wrote:
> On Thu, May 14, 2015 at 04:27:54PM +0000, Luke Dashjr wrote:
> > On Thursday, May 14, 2015 2:06:17 PM David Sterba wrote:
> > > On Wed, May 13, 2015 at 05:15:26PM +0000, Luke Dashjr wrote:
> > > > 32-bit ioctl uses these rather than the regular FS_IOC_* versions.
> > > > They can be handled in btrfs using the same code. Without this,
> > > > 32-bit {ch,ls}attr fail.
> > > 
> > > Yes, but this has to be implemented in another way. See eg.
> > > https://git.kernel.org/linus/e9750824114ff
> > 
> > I don't see what is different with that implementation. All
> > f2fs_compat_ioctl does is change cmd to the plain-IOC equivalent and
> > call f2fs_ioctl with the same arg (compat_ptr merely causes a cast to
> > void* and back, which AFAIK is a noop on 64-bit?). Am I missing
> > something?
> 
> No, that's the idea. Add new callback for compat_ioctl, put it under
> #ifdef CONFIG_COMPAT and do the same number switch.

Ok, someone else explained this to me. Please let me know if PATCHv2 (sent 
separately) does not address the needed changes.

> > I could try to just imitate it, but
> > I'd rather know what is significant/going on to ensure I don't waste your
> > time with code I don't even properly understand myself.
> > 
> > Perhaps by coincidence, the patch does at least in practice work
> > (although at least `btrfs send` appears to be broken still, and I'm at a
> > loss for how to approach fixing that).
> 
> The 'receive' 32bit/64bit was broken due to size difference in the ioctl
> structure that led to different ioctl. This is transparently fixed, see
> BTRFS_IOC_SET_RECEIVED_SUBVOL_32 at the top of ioctl.c.
> 
> In what way is SEND broken? There are only u64/s64 members in
> btrfs_ioctl_send_args, I don't see how this could break on 32/64
> userspace/kernel.

I've investigated this now, and it seems to be the pointer-type clone_sources 
member of struct btrfs_ioctl_send_args. I can't think of a perfect way to fix 
this, but it might not be *too* ugly to:
- replace the current clone_sources with a u64 that must always be (u64)-1;
  this causes older kernels to error cleanly if called with a new ioctl data
- use the top 1 or 2 bits of flags to indicate sizeof(void*) as it appears to
  userspace OR just use up reserved[0] for pointer size:
  io_send.ptr_size = sizeof(void*);
- replace one of the reserved fields with the new clone_sources
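
The three steps above could look roughly like this. It is only a sketch of the idea: `struct send_args_sketch`, `legacy_clone_sources`, and the two helper functions are invented for illustration and do not appear in any kernel header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout only; field names after `flags` are invented for
 * illustration and are not part of any kernel header. */
struct send_args_sketch {
	int64_t  send_fd;
	uint64_t clone_sources_count;
	uint64_t legacy_clone_sources;  /* must always be (u64)-1 */
	uint64_t parent_root;
	uint64_t flags;
	uint64_t ptr_size;              /* takes the place of reserved[0] */
	uint64_t clone_sources;         /* takes the place of reserved[1] */
	uint64_t reserved[2];
};

/* Every member is 64 bits wide, so the size does not depend on
 * sizeof(void *). */
static_assert(sizeof(struct send_args_sketch) == 72,
	      "layout must not depend on pointer width");

/* Userspace side: fill the pointer-related fields. */
void set_clone_sources(struct send_args_sketch *args,
		       const uint64_t *sources, uint64_t count)
{
	args->legacy_clone_sources = UINT64_MAX;
	args->ptr_size = sizeof(void *);
	args->clone_sources = (uint64_t)(uintptr_t)sources;
	args->clone_sources_count = count;
}

/* Receiving side: accept the structure only if it uses the new form. */
int uses_new_layout(const struct send_args_sketch *args)
{
	return args->legacy_clone_sources == UINT64_MAX &&
	       (args->ptr_size == 4 || args->ptr_size == 8);
}
```

Because every member is a fixed 64-bit type, a 32-bit and a 64-bit userspace would produce the same structure size and therefore the same ioctl number.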

The way it was done for receive seems like it might not work for non-x86 
compat interfaces (eg, MIPS n32) - but I could be wrong.

Thoughts?

Luke


[PATCH] btrfs: Fix a data space underflow warning

2015-10-29 Thread Qu Wenruo
Even with quota disabled, generic/127 will trigger a kernel warning by
underflowing the data space info.

The bug is caused by buffered write: in case of a short copy, the
start parameter passed to btrfs_delalloc_release_space() is wrong, and
round_up/down() in btrfs_delalloc_release() extends the range to page
alignment, decreasing the reserved space by one more page than expected.

This patch fixes it by passing the correct start.

Signed-off-by: Qu Wenruo 
---
 fs/btrfs/file.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index dce93dc..933a723 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1607,12 +1607,17 @@ again:
BTRFS_I(inode)->outstanding_extents++;
spin_unlock(&BTRFS_I(inode)->lock);
}
-   if (only_release_metadata)
+   if (only_release_metadata) {
btrfs_delalloc_release_metadata(inode,
release_bytes);
-   else
-   btrfs_delalloc_release_space(inode, pos,
+   } else {
+   u64 __pos;
+
+   __pos = round_down(pos, root->sectorsize) +
+   (dirty_pages << PAGE_CACHE_SHIFT);
+   btrfs_delalloc_release_space(inode, __pos,
 release_bytes);
+   }
}
 
release_bytes = dirty_pages << PAGE_CACHE_SHIFT;
-- 
2.6.2
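
The arithmetic behind the underflow in the patch above can be checked with a small user-space model. This is a sketch under assumptions: page size and sector size are both taken to be 4096, and `bytes_released()` only models the rounding the commit message attributes to the release helper; none of these function names exist in the kernel.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL  /* assumed page and sector size */

uint64_t round_down_u64(uint64_t x, uint64_t align)
{
	return x - (x % align);
}

uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	return round_down_u64(x + align - 1, align);
}

/* Model of the release helper's rounding: the range [start, start+len)
 * is widened to page boundaries and that many bytes are released. */
uint64_t bytes_released(uint64_t start, uint64_t len)
{
	uint64_t first = round_down_u64(start, SKETCH_PAGE_SIZE);
	uint64_t last = round_up_u64(start + len, SKETCH_PAGE_SIZE);

	return last - first;
}

/* Start of the range to release after a short copy, as computed by
 * the patch: aligned write position plus the pages that were dirtied. */
uint64_t release_start(uint64_t pos, uint64_t dirty_pages)
{
	return round_down_u64(pos, SKETCH_PAGE_SIZE) +
	       dirty_pages * SKETCH_PAGE_SIZE;
}
```

For a write at pos 4196 where one page was dirtied and 8192 bytes remain to be released, passing `pos` as the start widens the range to three pages (12288 bytes), while passing `release_start(4196, 1)` releases exactly the two remaining pages.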



[PATCH 4/6] btrfs-progs: Fix uninitialized key.type for btrfs_find_free_objectid

2015-10-29 Thread Zhao Lei
Initialize key.type to avoid using an uninitialized value in
btrfs_search_slot().

Signed-off-by: Zhao Lei 
---
 inode-map.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/inode-map.c b/inode-map.c
index 1321bfb..346952b 100644
--- a/inode-map.c
+++ b/inode-map.c
@@ -44,6 +44,7 @@ int btrfs_find_free_objectid(struct btrfs_trans_handle *trans,
BTRFS_FIRST_FREE_OBJECTID);
search_key.objectid = search_start;
search_key.offset = 0;
+   search_key.type = 0;
 
btrfs_init_path(path);
start_found = 0;
-- 
1.8.5.1
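
Why the missing initialization matters: a tree search compares keys field by field (objectid, then type, then offset), so an indeterminate `type` can change the result whenever the objectids are equal. The sketch below uses a simplified, hypothetical `struct key_sketch` in place of the real key structure.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a three-part search key. */
struct key_sketch {
	uint64_t objectid;
	uint8_t  type;
	uint64_t offset;
};

/* Order keys by objectid, then type, then offset. */
int compare_keys(const struct key_sketch *a, const struct key_sketch *b)
{
	if (a->objectid != b->objectid)
		return a->objectid < b->objectid ? -1 : 1;
	if (a->type != b->type)
		return a->type < b->type ? -1 : 1;
	if (a->offset != b->offset)
		return a->offset < b->offset ? -1 : 1;
	return 0;
}

/* Build a search key with every field set, so the comparison never
 * reads an indeterminate value. */
struct key_sketch make_search_key(uint64_t objectid)
{
	struct key_sketch key;

	memset(&key, 0, sizeof(key));
	key.objectid = objectid;
	return key;
}
```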



[PATCHv2] btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl

2015-10-29 Thread Luke Dashjr
32-bit ioctl uses these rather than the regular FS_IOC_* versions. They can
be handled in btrfs using the same code. Without this, 32-bit {ch,ls}attr
fail.

Signed-off-by: Luke Dashjr 
Cc: sta...@vger.kernel.org
---
 fs/btrfs/ctree.h |  1 +
 fs/btrfs/file.c  |  2 +-
 fs/btrfs/inode.c |  2 +-
 fs/btrfs/ioctl.c | 21 +
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 6f364e1..880604d 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3956,6 +3956,7 @@ void btrfs_test_inode_set_ops(struct inode *inode);
 
 /* ioctl.c */
 long btrfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
+long btrfs_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 void btrfs_update_iflags(struct inode *inode);
 void btrfs_inherit_iflags(struct inode *inode, struct inode *dir);
 int btrfs_is_empty_uuid(u8 *uuid);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index b072e17..fa366f4 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -2811,7 +2811,7 @@ const struct file_operations btrfs_file_operations = {
.fallocate  = btrfs_fallocate,
.unlocked_ioctl = btrfs_ioctl,
 #ifdef CONFIG_COMPAT
-   .compat_ioctl   = btrfs_ioctl,
+   .compat_ioctl   = btrfs_compat_ioctl,
 #endif
 };
 
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 8bb0136..3cde738 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9812,7 +9812,7 @@ static const struct file_operations btrfs_dir_file_operations = {
.iterate= btrfs_real_readdir,
.unlocked_ioctl = btrfs_ioctl,
 #ifdef CONFIG_COMPAT
-   .compat_ioctl   = btrfs_ioctl,
+   .compat_ioctl   = btrfs_compat_ioctl,
 #endif
.release= btrfs_release_file,
.fsync  = btrfs_sync_file,
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 1c22c65..55baee9 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -5357,3 +5357,24 @@ long btrfs_ioctl(struct file *file, unsigned int
 
return -ENOTTY;
 }
+
+#ifdef CONFIG_COMPAT
+long btrfs_compat_ioctl(struct file *file, unsigned int
+   cmd, unsigned long arg)
+{
+   switch (cmd) {
+   case FS_IOC32_GETFLAGS:
+   cmd = FS_IOC_GETFLAGS;
+   break;
+   case FS_IOC32_SETFLAGS:
+   cmd = FS_IOC_SETFLAGS;
+   break;
+   case FS_IOC32_GETVERSION:
+   cmd = FS_IOC_GETVERSION;
+   break;
+   default:
+   return -ENOIOCTLCMD;
+   }
+   return btrfs_ioctl(file, cmd, (unsigned long) compat_ptr(arg));
+}
+#endif
-- 
2.4.10
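
For readers wondering why the 32-bit commands differ at all: the ioctl number encodes the size of the argument type, and the flags ioctls are declared with `long` natively but `int` for 32-bit callers. The user-space sketch below assumes a Linux system with the uapi header `<linux/fs.h>`; the two function names are made up for illustration, and returning 0 for an unhandled command is a simplification of what the kernel patch does.

```c
#include <sys/ioctl.h>
#include <linux/fs.h>

/* Map a 32-bit flags/version command to its native counterpart.
 * Returns 0 when the command is not one of the three handled here. */
unsigned int native_cmd_for_compat(unsigned int cmd)
{
	switch (cmd) {
	case FS_IOC32_GETFLAGS:
		return FS_IOC_GETFLAGS;
	case FS_IOC32_SETFLAGS:
		return FS_IOC_SETFLAGS;
	case FS_IOC32_GETVERSION:
		return FS_IOC_GETVERSION;
	default:
		return 0;
	}
}

/* Size of the argument type encoded in an ioctl command number. */
unsigned int encoded_arg_size(unsigned int cmd)
{
	return _IOC_SIZE(cmd);
}
```

On a 64-bit build, `encoded_arg_size(FS_IOC32_GETFLAGS)` is `sizeof(int)` while `encoded_arg_size(FS_IOC_GETFLAGS)` is `sizeof(long)`, so the two command numbers differ and a handler that only knows the native one does not match the 32-bit caller.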


[PATCH 3/6] btrfs-progs: free fslabel for btrfs-convert

2015-10-29 Thread Zhao Lei
fslabel needs to be freed before exit.

Signed-off-by: Zhao Lei 
---
 btrfs-convert.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/btrfs-convert.c b/btrfs-convert.c
index 5b9171e..1693d03 100644
--- a/btrfs-convert.c
+++ b/btrfs-convert.c
@@ -3027,6 +3027,9 @@ int main(int argc, char *argv[])
ret = do_convert(file, datacsum, packing, noxattr, nodesize,
copylabel, fslabel, progress, features);
}
+
+   free(fslabel);
+
if (ret)
return 1;
return 0;
-- 
1.8.5.1



[PATCH 2/6] btrfs-progs: Fix negative eb's ref_cnt in btrfs-calc-size

2015-10-29 Thread Zhao Lei
btrfs-calc-size shows the following warning:
 # btrfs-calc-size /dev/sda6
 Calculating size of root tree
 ...
 extent_io.c:582: free_extent_buffer: Assertion `eb->refs < 0` failed.
 ./btrfs-calc-size[0x41d642]
 ./btrfs-calc-size(free_extent_buffer+0x70)[0x41e1c1]
 ./btrfs-calc-size(btrfs_free_fs_root+0x11)[0x40e1e8]
 ./btrfs-calc-size[0x40e215]
 ./btrfs-calc-size(rb_free_nodes+0x1d)[0x4326fe]
 ./btrfs-calc-size(close_ctree+0x3f3)[0x40f9ea]
 ./btrfs-calc-size(main+0x200)[0x431b4e]
 /lib64/libc.so.6(__libc_start_main+0xf5)[0x3858621d65]
 ./btrfs-calc-size[0x407009]

Reason:
 The path in calc_root_size() is only used to save node data;
 it doesn't hold a ref_cnt for each eb in it.
 Using btrfs_free_path() to free the path drops these ebs' refs
 again and causes problems such as a negative ref_cnt or
 invalid memory access.

Signed-off-by: Zhao Lei 
---
 btrfs-calc-size.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/btrfs-calc-size.c b/btrfs-calc-size.c
index 17d44ae..b84cda9 100644
--- a/btrfs-calc-size.c
+++ b/btrfs-calc-size.c
@@ -417,7 +417,14 @@ out:
free(seek);
}
 
-   btrfs_free_path(path);
+   /*
+* We only use path to save node data in iterating,
+* without holding eb's ref_cnt in path.
+* Don't use btrfs_free_path() here, it will free these
+* eb again, and cause many problems, as negative ref_cnt
+* or invalid memory access.
+*/
+   free(path);
return ret;
 }
 
-- 
1.8.5.1
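
The ownership rule behind this fix can be shown with a toy model. The types and functions below (`buf_sketch`, `path_sketch`, `path_release_owned`, `path_release_borrowed`) are hypothetical and much simpler than the real extent buffer and path code; they only illustrate that a container which borrowed its pointers must not drop references when it is freed.

```c
#include <stdlib.h>

/* Hypothetical, simplified types: a reference-counted buffer and a
 * path that only borrows pointers to such buffers. */
struct buf_sketch {
	int refs;
};

struct path_sketch {
	struct buf_sketch *nodes[8];
};

/* Drop one reference; the caller must own the reference it drops. */
void buf_put(struct buf_sketch *buf)
{
	buf->refs--;
}

/* Correct only when the path took a reference on every node. */
void path_release_owned(struct path_sketch *path)
{
	for (int i = 0; i < 8; i++)
		if (path->nodes[i])
			buf_put(path->nodes[i]);
	free(path);
}

/* Correct when the path merely borrowed its node pointers. */
void path_release_borrowed(struct path_sketch *path)
{
	free(path);
}
```

Calling the "owned" variant on a path that never took references is what drives the count below zero in the backtrace shown in the commit message.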



[PATCH 1/6] btrfs-progs: fix floating point exception for btrfs-calc-size

2015-10-29 Thread Zhao Lei
Current code exits with a floating point exception on a blank fs:
 # btrfs-calc-size -b /dev/sda6
 Calculating size of root tree
 Total size: 16384
 Inline data: 0
 Total seeks: 0
 Forward seeks: 0
 Backward seeks: 0
 Floating point exception

This patch adds a condition check for the above case.

Signed-off-by: Zhao Lei 
---
 btrfs-calc-size.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/btrfs-calc-size.c b/btrfs-calc-size.c
index b756693..17d44ae 100644
--- a/btrfs-calc-size.c
+++ b/btrfs-calc-size.c
@@ -372,8 +372,8 @@ out_print:
printf("\tTotal seeks: %Lu\n", stat.total_seeks);
printf("\t\tForward seeks: %Lu\n", stat.forward_seeks);
printf("\t\tBackward seeks: %Lu\n", stat.backward_seeks);
-   printf("\t\tAvg seek len: %Lu\n", stat.total_seek_len /
-  stat.total_seeks);
+   printf("\t\tAvg seek len: %llu\n", stat.total_seeks ?
+   stat.total_seek_len / stat.total_seeks : 0);
print_seek_histogram();
printf("\tTotal clusters: %Lu\n", stat.total_clusters);
printf("\t\tAvg cluster size: %Lu\n", stat.total_cluster_size /
-- 
1.8.5.1
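
The guard in this patch is the usual pattern for integer averages: on x86 Linux an integer division by zero is reported as "Floating point exception" even though no floating point is involved. A stand-alone sketch of the same check, with hypothetical function names:

```c
#include <stdio.h>

/* Integer average that yields 0 instead of dividing by zero. */
unsigned long long average_or_zero(unsigned long long total,
				   unsigned long long count)
{
	return count ? total / count : 0;
}

/* Print the value with the standard %llu conversion. */
void print_avg_seek_len(unsigned long long total_len,
			unsigned long long seeks)
{
	printf("\t\tAvg seek len: %llu\n", average_or_zero(total_len, seeks));
}
```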



[PATCH 5/6] btrfs-progs: free comparer_set in cmd_qgroup_show

2015-10-29 Thread Zhao Lei
comparer_set, which was allocated by malloc(), should be freed before
the function returns.

Signed-off-by: Zhao Lei 
---
 cmds-qgroup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/cmds-qgroup.c b/cmds-qgroup.c
index a64b716..f069d32 100644
--- a/cmds-qgroup.c
+++ b/cmds-qgroup.c
@@ -290,7 +290,7 @@ static int cmd_qgroup_show(int argc, char **argv)
int filter_flag = 0;
unsigned unit_mode;
 
-   struct btrfs_qgroup_comparer_set *comparer_set;
+   struct btrfs_qgroup_comparer_set *comparer_set = NULL;
struct btrfs_qgroup_filter_set *filter_set;
filter_set = btrfs_qgroup_alloc_filter_set();
comparer_set = btrfs_qgroup_alloc_comparer_set();
@@ -372,6 +372,8 @@ static int cmd_qgroup_show(int argc, char **argv)
fprintf(stderr, "ERROR: can't list qgroups: %s\n",
strerror(e));
 
+   free(comparer_set);
+
return !!ret;
 }
 
-- 
1.8.5.1



[PATCH 6/6] btrfs-progs: Avoid use pointer in handle_options

2015-10-29 Thread Zhao Lei
We use pointers to argc and argv in handle_options() because they
were necessary in very old code which no longer exists.

This patch switches to using argc and argv directly in handle_options(),
along with the following updates:
1: rename handle_options() to check_options()
   to fit its function.
2: clean up the condition in handle_options() to keep lines short.

Signed-off-by: Zhao Lei 
---
 btrfs.c | 30 ++++++++++++++++--------------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 9416a29..f881c18 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -172,20 +172,22 @@ static int cmd_version(int argc, char **argv)
return 0;
 }
 
-static void handle_options(int *argc, char ***argv)
+static void check_options(int argc, char **argv)
 {
-   if (*argc > 0) {
-   const char *arg = (*argv)[0];
-   if (arg[0] != '-' ||
-   !strcmp(arg, "--help") ||
-   !strcmp(arg, "--version"))
-   return;
-   fprintf(stderr, "Unknown option: %s\n", arg);
-   fprintf(stderr, "usage: %s\n",
-   btrfs_cmd_group.usagestr[0]);
-   exit(129);
-   }
-   return;
+   if (argc == 0)
+   return;
+
+   const char *arg = argv[0];
+
+   if (arg[0] != '-' ||
+   !strcmp(arg, "--help") ||
+   !strcmp(arg, "--version"))
+   return;
+
+   fprintf(stderr, "Unknown option: %s\n", arg);
+   fprintf(stderr, "usage: %s\n",
+   btrfs_cmd_group.usagestr[0]);
+   exit(129);
 }
 
 static const struct cmd_group btrfs_cmd_group = {
@@ -227,7 +229,7 @@ int main(int argc, char **argv)
} else {
argc--;
argv++;
-   handle_options(&argc, &argv);
+   check_options(argc, argv);
if (argc > 0) {
if (!prefixcmp(argv[0], "--"))
argv[0] += 2;
-- 
1.8.5.1
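
The same check can be written as a side-effect-free predicate, which makes the by-value signature easy to see in isolation. This is only a sketch: `first_arg_is_acceptable()` is a hypothetical name, and unlike the patched function it returns a status instead of printing usage and calling exit(129).

```c
#include <string.h>

/* Returns 1 if the first argument may be passed on to command lookup,
 * 0 if it is an unknown option. Takes argc/argv by value because it
 * never modifies them. */
int first_arg_is_acceptable(int argc, char **argv)
{
	const char *arg;

	if (argc == 0)
		return 1;

	arg = argv[0];
	if (arg[0] != '-' ||
	    !strcmp(arg, "--help") ||
	    !strcmp(arg, "--version"))
		return 1;

	return 0;
}
```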



Re: [3/3] btrfs: qgroup: Fix a rebase bug which will cause qgroup double free

2015-10-29 Thread Johannes Henninger
Tested-by: Johannes Henninger 


Re: How to replicate a Xen VM using BTRFS as the root filesystem.

2015-10-29 Thread Austin S Hemmelgarn

On 2015-10-28 22:39, Russell Coker wrote:

> On Wed, 28 Oct 2015 11:07:20 PM Austin S Hemmelgarn wrote:
>> Using this methodology, I can have a new Gentoo PV domain running in
>> about half an hour, whereas it takes me at least two and a half hours
>> (and often much longer than that) when using the regular install process
>> for Gentoo.
>
> On my virtual servers I have a BTRFS subvol /xenstore for the block devices of
> virtual machines.  When I want to duplicate a VM I run
> "cp -a --reflink=always /xenstore/A /xenstore/B" which takes a few seconds.

Well yes, that works quickly too, but is kind of hard to do when using
LVM or other non-file-backed block storage, and the usual recommendation
when not using blktap for storage is to use normal block storage devices
(that, and I've had some horrible experience WRT performance with
running VM's with BTRFS filesystems on files on a BTRFS filesystem on
the host system).







bad handling of unpartitioned device in sysfs_devno_to_wholedisk() (which breaks mkfs.btrfs)

2015-10-29 Thread Tom Yan
So I noticed that SSD detection does not work on unpartitioned devices in
mkfs.btrfs somehow:
https://bugzilla.kernel.org/show_bug.cgi?id=102921

Later I found out that it breaks at blkid_devno_to_wholedisk() in is_ssd():
http://git.kernel.org/cgit/linux/kernel/git/kdave/btrfs-progs.git/tree/mkfs.c?h=v4.2.3#n1103

for which Elliot had shown an example with strace:
https://lists.01.org/pipermail/linux-nvdimm/2015-September/002109.html

And I think the problem occurs in the sysfs_get_devname() here:
https://git.kernel.org/cgit/utils/util-linux/util-linux.git/tree/lib/sysfs.c?h=v2.27#n785

Since sysfs_get_devname() has to call sysfs_readlink() later, which
outputs a long full device path in /sys, I don't think we should call
it directly with the buffer "diskname": callers won't expect that the
buffer has to be large enough to carry the full path in the middle of the
process. For example in is_ssd(), a char array of size 32 is used
("wholedisk").