btrfs device delete /dev/sdc1 /mnt/raid1 user experience

2016-06-05 Thread Kai Hendry
Hi there,


I planned to remove one of my disks, so that I can take it from
Singapore to the UK and then re-establish another remote RAID1 store.

delete is an alias of remove, so I added a new disk (devid 3) and
proceeded to run:
btrfs device delete /dev/sdc1 /mnt/raid1 (devid 1)


nuc:~$ uname -a
Linux nuc 4.5.4-1-ARCH #1 SMP PREEMPT Wed May 11 22:21:28 CEST 2016
x86_64 GNU/Linux
nuc:~$   btrfs --version
btrfs-progs v4.5.3
nuc:~$ sudo btrfs fi show /mnt/raid1/
Label: 'extraid1'  uuid: 5cab2a4a-e282-4931-b178-bec4c73cdf77
Total devices 2 FS bytes used 776.56GiB
devid    2 size 931.48GiB used 778.03GiB path /dev/sdb1
devid    3 size 1.82TiB used 778.03GiB path /dev/sdd1

nuc:~$ sudo btrfs fi df /mnt/raid1/
Data, RAID1: total=775.00GiB, used=774.39GiB
System, RAID1: total=32.00MiB, used=144.00KiB
Metadata, RAID1: total=3.00GiB, used=2.17GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


First off, I was expecting btrfs to release the drive pretty much
immediately; instead, the command took about half a day to complete. I
watched `btrfs fi show`, waiting for the size of devid 1 (the one I am
trying to remove) to reach zero, and saw its used space slowly go down
while the used space of devid 3 (the new disk) slowly went up.

Secondly, and most importantly, my /dev/sdc1 can't be mounted any more.
Why?

sudo mount -t btrfs /dev/sdc1 /mnt/test/
mount: wrong fs type, bad option, bad superblock on /dev/sdc1,
   missing codepage or helper program, or other error

   In some cases useful info is found in syslog - try
   dmesg | tail or so.

There is nothing in dmesg nor in my journal. I wasn't expecting my drive
to be rendered unusable by removing it, or am I missing something?

nuc:~$ sudo fdisk -l /dev/sdc
Disk /dev/sdc: 931.5 GiB, 1000204885504 bytes, 1953525167 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 33553920 bytes
Disklabel type: gpt
Disk identifier: 19938642-3B10-4220-BF99-3E12AF1D1CF6

Device      Start        End    Sectors   Size Type
/dev/sdc1    2048 1953525133 1953523086 931.5G Linux filesystem

On the #btrfs IRC channel I was told:

 hendry: breaking multi-disk filesystems in half is not a
recommended way to do "normal operations" :D

I'm still keen to take a TB on a flight with me the day after tomorrow.
What is the recommended course of action? Run mkfs.btrfs on /dev/sdc1
again and send the data to it from /mnt/raid1?
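That mkfs-plus-send approach would look roughly like this (a sketch only; the device and mount points come from earlier in this message, the snapshot name is made up):

```shell
# Sketch: recreate a filesystem on the removed-from-RAID partition and
# seed it with a send/receive of the existing data.
mkfs.btrfs -L travel /dev/sdc1
mount /dev/sdc1 /mnt/test

# btrfs send requires a read-only snapshot as its source
btrfs subvolume snapshot -r /mnt/raid1 /mnt/raid1/travel-snap
btrfs send /mnt/raid1/travel-snap | btrfs receive /mnt/test

umount /mnt/test    # the disk now carries an independent filesystem
```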

Still, I hope the experience of removing a disk sanely could be improved.

Many thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Is it possible to block any writer inside a fs?

2016-06-05 Thread Qu Wenruo

Hi,

Is it possible to block any new writer to a fs?
Just like what we do when remounting a fs read-only, although that's
done in VFS.


The case is, for example, that we have an ioctl to control how buffered
writes work (in-band dedupe).


And when changing/disabling such behavior, we need to ensure that all
current writers have finished and that no new writers come in until we
have finished the work, just like a remount to RO.


In VFS, it's done by sb_prepare_remount_readonly(), but that's an
internally used function and shouldn't be called directly by a filesystem.


So, is there any method for a fs to block incoming writes?

Thanks,
Qu




RE: Recommended way to use btrfs for production?

2016-06-05 Thread James Johnston
On 06/06/2016 at 01:47, Chris Murphy wrote:
> On Sun, Jun 5, 2016 at 4:45 AM, Mladen Milinkovic  
> wrote:
> > On 06/03/2016 04:05 PM, Chris Murphy wrote:
> >> Make certain the kernel command timer value is greater than the driver
> >> error recovery timeout. The former is found in sysfs, per block
> >> device, the latter can be get and set with smartctl. Wrong
> >> configuration is common (it's actually the default) when using
> >> consumer drives, and inevitably leads to problems, even the loss of
> >> the entire array. It really is a terrible default.
> >
> > Since it's first time i've heard of this I did some googling.
> >
> > Here's some nice article about these timeouts:
> > http://strugglers.net/~andy/blog/2015/11/09/linux-software-raid-and-drive-timeouts/comment-page-1/
> >
> > And some udev rules that should apply this automatically:
> > http://comments.gmane.org/gmane.linux.raid/48193
> 
> Yes it's a constant problem that pops up on the linux-raid list.
> Sometimes the list is quiet on this issue but it really seems like
> it's once a week. From last week...
> 
> http://www.spinics.net/lists/raid/msg52447.html

It seems like it would be useful if the distributions or the kernel could
automatically set the kernel timeout to an appropriate value.  If the TLER
can indeed be queried via smartctl, then it would be easy to automatically
read it and calculate a suitable timeout.  A RAID-oriented drive would end
up leaving the current 30 seconds, while if TLER can't successfully be
queried or the drive just doesn't support it, then assume a consumer drive
and set the timeout to 180 seconds.

That way, zero user configuration would be needed in the common case.  Or is it
not that simple?
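The heuristic sketched above is simple enough to express directly (a hypothetical helper; `smartctl -l scterc` is the real query interface, and the 30/180-second values are the rule-of-thumb figures from this thread):

```shell
#!/bin/sh
# pick_timeout: choose a kernel command timeout in seconds, given the
# output of `smartctl -l scterc /dev/sdX`.  Drives with SCT ERC (TLER)
# enabled report a bounded recovery time, e.g. "Read: 70 (7.0 seconds)".
pick_timeout() {
    if printf '%s\n' "$1" | grep -q 'seconds'; then
        echo 30     # drive bounds its own recovery: stock timeout is fine
    else
        echo 180    # unbounded recovery: assume a consumer drive
    fi
}

pick_timeout "SCT Error Recovery Control command not supported"  # prints 180
# Applying it would then be:  echo 180 > /sys/block/sdX/device/timeout
```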

James




Re: [PATCH v2] btrfs: fix check_shared for fiemap ioctl

2016-06-05 Thread Lu Fengqi

At 06/03/2016 10:02 PM, Josef Bacik wrote:

On 06/01/2016 01:48 AM, Lu Fengqi wrote:

Only in the case of a different root_id or a different object_id did
check_shared identify an extent as shared. However, if an extent is
referred to by different offsets of the same file, it should also be
identified as shared. In addition, check_shared's loop is at least O(n^3),
so an extent with too many references can even cause a soft lockup.

First, add all delayed refs to the ref_tree and calculate unique_refs;
if unique_refs is greater than one, return BACKREF_FOUND_SHARED.
Then individually add the on-disk references (inline/keyed) to the
ref_tree and recalculate unique_refs to check whether it is greater than
one. Because we return SHARED as soon as there are two references, the
time complexity is close to constant.

Reported-by: Tsutomu Itoh 
Signed-off-by: Lu Fengqi 


This is a lot of work for just wanting to know if something is shared.
Instead let's adjust this slightly.  Instead of passing down a
root_objectid/inum and noticing this and returning shared, add a new way
to iterate refs.  Currently we go gather all the refs and then do the
iterate dance, which is what takes so long.  So instead add another
helper that calls the provided function every time it has a match, and
then we can pass in whatever context we want, and we return when
something matches.  This way we don't have all this extra accounting,
and we're no longer passing root_objectid/inum around and testing for
some magic scenario.  Thanks,

Josef





With this patch, we can quickly find extents that have more than one
reference (delayed, inline, or keyed) and return SHARED immediately.
However, for indirect refs, we have to continue to resolve indirect refs
to their parent bytenr and check whether that parent bytenr is shared. So
the original function is necessary. The original refs list reduces the
efficiency of the search, so maybe we can replace it with an rb_tree to
optimize the original function in the future. For now, we just want to
solve the check_shared problem.


--
Thanks,
Lu




Re: Recommended way to use btrfs for production?

2016-06-05 Thread Chris Murphy
On Sun, Jun 5, 2016 at 4:45 AM, Mladen Milinkovic  wrote:
> On 06/03/2016 04:05 PM, Chris Murphy wrote:
>> Make certain the kernel command timer value is greater than the driver
>> error recovery timeout. The former is found in sysfs, per block
>> device, the latter can be get and set with smartctl. Wrong
>> configuration is common (it's actually the default) when using
>> consumer drives, and inevitably leads to problems, even the loss of
>> the entire array. It really is a terrible default.
>
> Since it's first time i've heard of this I did some googling.
>
> Here's some nice article about these timeouts:
> http://strugglers.net/~andy/blog/2015/11/09/linux-software-raid-and-drive-timeouts/comment-page-1/
>
> And some udev rules that should apply this automatically:
> http://comments.gmane.org/gmane.linux.raid/48193

Yes it's a constant problem that pops up on the linux-raid list.
Sometimes the list is quiet on this issue but it really seems like
it's once a week. From last week...

http://www.spinics.net/lists/raid/msg52447.html

And you wouldn't know it, because the subject is "raid 5 crashed", so you
wouldn't think: bad sectors are accumulating because they're not getting
fixed up, and they're not getting fixed up because the kernel command
timer is resetting the link, preventing the drive from reporting a read
error and the associated sector LBA. It starts with that; then you get a
single disk failure, and when doing the rebuild you hit the bad sector on
an otherwise good drive. In effect that's like a second drive failure, and
the raid5 implodes. It's fixable, sometimes, but really tedious.
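A udev rule of the kind linked above is only a few lines; this sketch (the file name and the 180-second figure are illustrative, following the rule of thumb discussed in this thread) raises the command timeout for all SATA disks:

```
# /etc/udev/rules.d/60-block-timeout.rules (sketch)
# Make the kernel command timeout exceed the drive's internal error
# recovery time, so consumer (non-TLER) drives can report bad sectors.
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd[a-z]", \
  RUN+="/bin/sh -c 'echo 180 > /sys/block/%k/device/timeout'"
```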



-- 
Chris Murphy


Re: btrfs

2016-06-05 Thread Chris Murphy
On Fri, Jun 3, 2016 at 7:51 PM, Christoph Anton Mitterer
 wrote:

> I think I remember that you've claimed that last time already, and as
> I've said back then:
> - what counts is probably the common understanding of the term, which
>   is N disks RAID1 = N disks mirrored
> - if there is something like an "official definition", it's probably
>   the original paper that introduced RAID:
>   http://www.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-391.pdf
>   PDF page 11, respectively content page 9 describes RAID1 as:
>   "This is the most expensive option since *all* disks are
>   duplicated..."


You've misread the paper.

It defines what it means by "all disks are duplicated" as G=1 and C=1.
That is, every data disk has one check disk. That is, two copies.
There is no mention of n-copies.

Further in table 2 "Characteristics of Level 1 RAID" the overhead is
described as 100%, and the usable storage capacity is 50%. Again, that
is consistent with duplication.

The definition of duplicate is "one of two or more identical things."

The etymology of duplicate is "1400-50; late Middle English < Latin
duplicātus (past participle of duplicāre, to make double), equivalent
to duplic- (stem of duplex) duplex + -ātus -ate1":
http://www.dictionary.com/browse/duplicate

There is no possible reading of this that suggests n-way RAID is intended.




-- 
Chris Murphy


Re: btrfs

2016-06-05 Thread Chris Murphy
On Sun, Jun 5, 2016 at 3:31 PM, Christoph Anton Mitterer
 wrote:
> On Sun, 2016-06-05 at 21:07 +, Hugo Mills wrote:
>>The problem is that you can't guarantee consistency with
>> nodatacow+checksums. If you have nodatacow, then data is overwritten,
>> in place. If you do that, then you can't have a fully consistent
>> checksum -- there are always race conditions between the checksum and
>> the data being written (or the data and the checksum, depending on
>> which way round you do it).
>
> I'm not an expert in the btrfs internals... but I had a pretty long
> discussion back then when I brought this up first, and everything that
> came out of that - to my understanding - indicated, that it should be
> simply possible.
>
> a) nodatacow just means "no data cow", but not "no metadata cow".
>And isn't the checksumming data metadata? So AFAIU, this is itself
>anyway COWed.
> b) What you refer to above is, AFAIU, that data may be written (not
>COWed) and there is of course no guarantee that the written data
>matches the checksum (which may e.g. still be the old sum).
>=> So what?

For  a file like a VM image constantly being modified, essentially at
no time will the csums on disk ever reflect the state of the file.

>   This anyway only happens in case of crash/etc. and in that case
>   we anyway have no idea, whether the written not COWed block is
>   consistent or not, whether we do checksumming or not.

If the file is cow'd and checksummed, and there's a crash, there is
supposed to be consistency: either the old state or new state for the
data is on-disk and the current valid metadata correctly describes
which state that data is in.

If the file is not cow'd and not checksummed, its consistency is
unknown but also ignored, when doing normal reads, balance or scrubs.

If the file is not cow'd but were checksummed, there would always be
some inconsistency if the file is actively being modified. Only when
it's not being modified, and metadata writes for that file are
committed to disk and the superblock updated, is there consistency. At
any other time, there's inconsistency. So if there's a crash, a
balance or scrub or normal read will say the file is corrupt. And the
normal way Btrfs deals with corruption on reads from a mounted fs is
to complain and it does not pass the corrupt data to user space,
instead there's an i/o error. You have to use restore to scrape it off
the volume; or alternatively use btrfsck to recompute checksums.
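In command terms, those two escape hatches are (a sketch; the device name is hypothetical, and the checksum-tree rebuild must be run on an unmounted filesystem):

```shell
# Scrape readable files off the volume without mounting it:
btrfs restore /dev/sdX1 /mnt/recovery

# Or drop and rebuild the checksum tree in place (unmounted, destructive):
btrfs check --repair --init-csum-tree /dev/sdX1
```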

Presumably you'd ask for an exception for this kind of file, where it
can still be read even though there's a checksum mismatch, can be
scrubbed and balanced which will report there's corruption even if
there isn't any, and you've gained, insofar as I can tell, a lot of
confusion and ambiguity.

It's fine you want a change in behavior for Btrfs. But when a
developer responds, more than once, about how this is somewhere
between difficult and not possible, and you say it should simply be
possible, I think that's annoying, bordering on irritating.


-- 
Chris Murphy


Re: RAID1 vs RAID10 and best way to set up 6 disks

2016-06-05 Thread Chris Murphy
On Sun, Jun 5, 2016 at 2:31 PM, Christoph Anton Mitterer
 wrote:
> On Sun, 2016-06-05 at 09:36 -0600, Chris Murphy wrote:
>> That's ridiculous. It isn't incorrect to refer to only 2 copies as
>> raid1.
> No, if there are only two devices then not.
> But obviously we're talking about how btrfs does RAID1, in which even
> with n>2 devices there are only 2 copies - that's incorrect.

OK, and I think the assertion is asinine. You reject the only neutral
party's definition and distinction of RAID-1 types, and then claim on
the basis of opinion that Btrfs's raid1 is not merely different from
traditional/classic/common understandings of RAID-1, but that they were
incorrect to have called it raid1. It's just nonsense. I find your
argument uncompelling.
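Whatever one calls it, the behavioral rule is concrete: btrfs raid1 places exactly two copies of every chunk on two different devices. A rough capacity estimate under that rule can be sketched as (sizes in any consistent unit; ignores metadata overhead):

```shell
#!/bin/sh
# Usable space of a btrfs raid1 array: each chunk needs two devices, so
# capacity is bounded both by half the total and by how much the
# remaining devices can pair up with the largest one.
raid1_usable() {
    total=0; largest=0
    for size in "$@"; do
        total=$((total + size))
        if [ "$size" -gt "$largest" ]; then largest=$size; fi
    done
    half=$((total / 2)); rest=$((total - largest))
    if [ "$half" -lt "$rest" ]; then echo "$half"; else echo "$rest"; fi
}

raid1_usable 1000 1000        # prints 1000: the classic two-disk mirror
raid1_usable 1000 1000 1000   # prints 1500: three disks, still two copies
```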


>>  You have to explicitly ask both mdadm
> Aha, and which option would that be?

For mdadm it's implied as a combination of -n -x and number of
devices. For lvcreate it's explicit with -m. This is in the man page,
so I don't understand why you're asking.
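Concretely (device and volume-group names made up), both tools take the copy count explicitly:

```shell
# mdadm: a three-way RAID-1 mirror, i.e. three copies of the data
mdadm --create /dev/md0 --level=1 --raid-devices=3 \
      /dev/sda1 /dev/sdb1 /dev/sdc1

# lvm: -m 2 means two mirrors in addition to the original, three copies
lvcreate --type raid1 -m 2 -L 100G -n mirrored vg0
```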


>
>>  and lvcreate for the
>> number of copies you want, it doesn't automatically happen.
> I've said that before, but at least it allows you to use the full
> number of disks, so we're again back to that it's closer to the
> original and common meaning of RAID1 than what btrfs does.

The original and common meaning defined by whom, where? You're welcome
to go take it up with Wikipedia but they're using SNIA definitions for
the standard RAID levels.




>
>
>>  The man
>> page for mkfs.btrfs is very clear you only get two copies.
>
> I haven't denied that... but one shouldn't use terms that are commonly
> understood in a different manner and require people to read all the
> small print.

And I disagree because what you required is more reading by the user
to understand entirely new nomenclature.



> One could also have swapped RAID0 and RAID1, and I guess people
> wouldn't be too delighted if the excuse was "well, it's in the manpage".

Except nothing that crazy has been done so I fail to see the point.



>
>
>>
>> > Well I'd say, for btrfs: do away with the term "RAID" at all, use
>> > e.g.:
>> >
>> > linear = just a bunch of devices put together, no striping
>> >  basically what MD's linear is
>> Except this isn't really how Btrfs single works. The difference
>> between mdadm linear and Btrfs single is more different in behavior
>> than the difference between mdadm raid1 and btrfs raid1. So you're
>> proposing tolerating a bigger difference, while criticizing a smaller
>> one. *shrug*
>
> What's the big difference? Would you care to explain?

It's not linear. The archives detail how block groups are allocated to
devices. There are rules, linearity isn't one of them.



> But I'm happy
> with "single" either, it just doesn't really tell that there is no
> striping, I mean "single" points more towards "we have no resilience
> but only 1 copy", whether this is striped or not.
>
>
>
>> If a metaphor is going to be used for a technical thing, it would be
>> mirrors or mirroring. Mirror would mean exactly two (the original and
>> the mirror). See lvcreate --mirrors. Also, the lvm mirror segment
>> type
>> is legacy, having been replaced with raid1 (man lvcreate uses the
>> term
>> raid1, not RAID1 or RAID-1). So I'm not a big fan of this term.
>
> Admittedly, I didn't like the "mirror(s)" either... I was just trying
> to show that different names could be used that are already a bit
> better.
>
>
>> > striped = basically what RAID0 is
>>
>> lvcreate uses only striped, not raid0. mdadm uses only RAID0, not
>> striped. Since striping is also employed with RAIDs 4, 5, 6, 7, it
>> seems ambiguous even though without further qualification whether
>> parity exists, it's considered to mean non-parity striping. The
>> ambiguity is probably less of a problem than the contradiction that
>> is
>> RAID0.
>
> Mhh,.. well or one makes schema names that contain all possible
> properties of a "RAID", something like:
> replicasN-parityN-[not]striped



SNIA has created such a schema.




>
> SINGLE would be something like "replicas1-parity0-notstriped".
> RAID5 would be something like "replicas1-parity1-striped".
>
>
>> > And just mention in the manpage, which of these names comes closest
>> > to
>> > what people understand by RAID level i.
>>
>> It already does this. What version of btrfs-progs are you basing your
>> criticism on that there's some inconsistency, deficiency, or
>> ambiguity
>> when it comes to these raid levels?
>
> Well first, the terminology thing is the least serious issue from my
> original list ;-) ... TBH I don't know why such a large discussion came
> out of that point.
>
> Even though I'm not reading along all mails here, we have probably at
> least every month someone who wasn't aware that RAID1 is not what he
> assumes it to be.
> And I don't think these people can be blamed for not RTFM, because IMHO
> this is a term commonly understood as mirror all available 

Re: [PATCH 42/45] block, fs, drivers: remove REQ_OP compat defs and related code

2016-06-05 Thread kbuild test robot
Hi,

[auto build test ERROR on v4.7-rc1]
[cannot apply to dm/for-next md/for-next next-20160603]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/mchristi-redhat-com/v8-separate-operations-from-flags-in-the-bio-request-structs/20160606-040240
config: blackfin-BF526-EZBRD_defconfig (attached as .config)
compiler: bfin-uclinux-gcc (GCC) 4.6.3
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=blackfin 

All errors (new ones prefixed by >>):

   drivers/built-in.o: In function `sd_init_command':
>> drivers/scsi/sd.c:1141: undefined reference to `__ucmpdi2'

vim +1141 drivers/scsi/sd.c

^1da177e Linus Torvalds    2005-04-16  1135  }
^1da177e Linus Torvalds    2005-04-16  1136  
87949eee Christoph Hellwig 2014-06-28  1137  static int sd_init_command(struct scsi_cmnd *cmd)
87949eee Christoph Hellwig 2014-06-28  1138  {
87949eee Christoph Hellwig 2014-06-28  1139  	struct request *rq = cmd->request;
87949eee Christoph Hellwig 2014-06-28  1140  
b826ba83 Mike Christie     2016-06-05 @1141  	switch (req_op(rq)) {
b826ba83 Mike Christie     2016-06-05  1142  	case REQ_OP_DISCARD:
87949eee Christoph Hellwig 2014-06-28  1143  		return sd_setup_discard_cmnd(cmd);
b826ba83 Mike Christie     2016-06-05  1144  	case REQ_OP_WRITE_SAME:

:: The code at line 1141 was first introduced by commit
:: b826ba83985b86029288d8cc24fb93ce96947b18 drivers: use req op accessor

:: TO: Mike Christie 
:: CC: 0day robot 

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data


Re: [PATCH 01/45] block/fs/drivers: remove rw argument from submit_bio

2016-06-05 Thread kbuild test robot
Hi,

[auto build test WARNING on v4.7-rc1]
[cannot apply to dm/for-next md/for-next next-20160603]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/mchristi-redhat-com/v8-separate-operations-from-flags-in-the-bio-request-structs/20160606-040240
config: i386-allyesconfig (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All warnings (new ones prefixed by >>):

   fs/ext4/crypto.c: In function 'ext4_encrypted_zeroout':
>> fs/ext4/crypto.c:442:25: warning: passing argument 1 of 'submit_bio_wait' makes pointer from integer without a cast [-Wint-conversion]
      err = submit_bio_wait(WRITE, bio);
            ^
   In file included from include/linux/blkdev.h:19:0,
                    from fs/ext4/ext4.h:20,
                    from fs/ext4/ext4_extents.h:22,
                    from fs/ext4/crypto.c:37:
   include/linux/bio.h:476:12: note: expected 'struct bio *' but argument is of type 'long long unsigned int'
    extern int submit_bio_wait(struct bio *bio);
               ^~~
   fs/ext4/crypto.c:442:9: error: too many arguments to function 'submit_bio_wait'
      err = submit_bio_wait(WRITE, bio);
            ^~~
   In file included from include/linux/blkdev.h:19:0,
                    from fs/ext4/ext4.h:20,
                    from fs/ext4/ext4_extents.h:22,
                    from fs/ext4/crypto.c:37:
   include/linux/bio.h:476:12: note: declared here
    extern int submit_bio_wait(struct bio *bio);
               ^~~

vim +/submit_bio_wait +442 fs/ext4/crypto.c

b30ab0e0 Michael Halcrow 2015-04-12  426			goto errout;
b30ab0e0 Michael Halcrow 2015-04-12  427		}
b30ab0e0 Michael Halcrow 2015-04-12  428		bio->bi_bdev = inode->i_sb->s_bdev;
36086d43 Theodore Ts'o   2015-10-03  429		bio->bi_iter.bi_sector =
36086d43 Theodore Ts'o   2015-10-03  430			pblk << (inode->i_sb->s_blocksize_bits - 9);
36086d43 Theodore Ts'o   2015-10-03  431		ret = bio_add_page(bio, ciphertext_page,
b30ab0e0 Michael Halcrow 2015-04-12  432				   inode->i_sb->s_blocksize, 0);
36086d43 Theodore Ts'o   2015-10-03  433		if (ret != inode->i_sb->s_blocksize) {
36086d43 Theodore Ts'o   2015-10-03  434			/* should never happen! */
36086d43 Theodore Ts'o   2015-10-03  435			ext4_msg(inode->i_sb, KERN_ERR,
36086d43 Theodore Ts'o   2015-10-03  436				 "bio_add_page failed: %d", ret);
36086d43 Theodore Ts'o   2015-10-03  437			WARN_ON(1);
b30ab0e0 Michael Halcrow 2015-04-12  438			bio_put(bio);
36086d43 Theodore Ts'o   2015-10-03  439			err = -EIO;
b30ab0e0 Michael Halcrow 2015-04-12  440			goto errout;
b30ab0e0 Michael Halcrow 2015-04-12  441		}
b30ab0e0 Michael Halcrow 2015-04-12 @442		err = submit_bio_wait(WRITE, bio);
36086d43 Theodore Ts'o   2015-10-03  443		if ((err == 0) && bio->bi_error)
36086d43 Theodore Ts'o   2015-10-03  444			err = -EIO;
95ea68b4 Theodore Ts'o   2015-05-31  445		bio_put(bio);
b30ab0e0 Michael Halcrow 2015-04-12  446		if (err)
b30ab0e0 Michael Halcrow 2015-04-12  447			goto errout;
36086d43 Theodore Ts'o   2015-10-03  448		lblk++; pblk++;
b30ab0e0 Michael Halcrow 2015-04-12  449	}
b30ab0e0 Michael Halcrow 2015-04-12  450	err = 0;

:: The code at line 442 was first introduced by commit
:: b30ab0e03407d2aa2d9316cba199c757e4bfc8ad ext4 crypto: add ext4 encryption facilities

:: TO: Michael Halcrow 
:: CC: Theodore Ts'o 

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data


Re: [PATCH 42/45] block, fs, drivers: remove REQ_OP compat defs and related code

2016-06-05 Thread kbuild test robot
Hi,

[auto build test ERROR on v4.7-rc1]
[cannot apply to dm/for-next md/for-next next-20160603]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/mchristi-redhat-com/v8-separate-operations-from-flags-in-the-bio-request-structs/20160606-040240
config: m32r-opsput_defconfig (attached as .config)
compiler: m32r-linux-gcc (GCC) 4.9.0
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=m32r 

All errors (new ones prefixed by >>):

>> ERROR: "__ucmpdi2" [drivers/scsi/sd_mod.ko] undefined!

---
0-DAY kernel test infrastructure            Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data


Re: btrfs

2016-06-05 Thread Christoph Anton Mitterer
On Sun, 2016-06-05 at 21:07 +, Hugo Mills wrote:
>    The problem is that you can't guarantee consistency with
> nodatacow+checksums. If you have nodatacow, then data is overwritten,
> in place. If you do that, then you can't have a fully consistent
> checksum -- there are always race conditions between the checksum and
> the data being written (or the data and the checksum, depending on
> which way round you do it).

I'm not an expert in the btrfs internals... but I had a pretty long
discussion back then when I brought this up first, and everything that
came out of that - to my understanding - indicated, that it should be
simply possible.

a) nodatacow just means "no data cow", but not "no metadata cow".
   And isn't the checksumming data metadata? So AFAIU, this is itself
   anyway COWed.
b) What you refer to above is, AFAIU, that data may be written (not
   COWed) and there is of course no guarantee that the written data
   matches the checksum (which may e.g. still be the old sum).
   => So what?
      This anyway only happens in case of crash/etc. and in that case
      we anyway have no idea, whether the written not COWed block is
      consistent or not, whether we do checksumming or not.
      We rather get the benefit that we now know: it may be garbage
      The only "bad" thing that could happen was:
      the block is fully written and actually consistent, but the
      checksum hasn't been written yet - IMHO much less likely than
      the other case(s). And I rather get one false positive in an
      more unlikely case, than corrupted blocks in all other possible
      situations (silent block errors, etc. pp.)
      And in principle, nothing would prevent a future btrfs to get a
      journal for the nodatacow-ed writes.

Look for the past thread "dear developers, can we have notdatacow +
checksumming, plz?",... I think I wrote about much more cases there,
any why - even it may not be perfect as datacow+checksumming - it would
always still be better to have checksumming with nodatacow.

> > Wasn't it said, that autodefrag performs bad for anything larger
> > than
> > ~1G?
> 
>    I don't recall ever seeing someone saying that. Of course, I may
> have forgotten seeing it...
I think it was mentioned below this thread:
http://thread.gmane.org/gmane.comp.file-systems.btrfs/50444/focus=50586
and also implied here:
http://article.gmane.org/gmane.comp.file-systems.btrfs/51399/match=autodefrag+large+files


> > Well the fragmentation has also many other consequences and not
> > just
> > seeks (assuming everyone would use SSDs, which is and probably
> > won't be
> > the case for quite a while).
> > Most obviously you get much more IOPS and btrfs itself will, AFAIU,
> > also suffer from some issues due to the fragmentation.
>    This is a fundamental problem with all CoW filesystems. There are
> some mititgations that can be put in place (true CoW rather than
> btrfs's redirect-on-write, like some databases do, where the original
> data is copied elsewhere before overwriting; cache aggressively and
> with knowledge of the CoW nature of the FS, like ZFS does), but they
> all have their drawbacks and pathological cases.
Sure... but defrag (if it would generally work) or nodatacow (if it
didn't make you lose the ability to determine whether you're
consistent or not) would already be quite helpful here.


Cheers,
Chris.

smime.p7s
Description: S/MIME cryptographic signature


Re: [PATCH 01/45] block/fs/drivers: remove rw argument from submit_bio

2016-06-05 Thread kbuild test robot
Hi,

[auto build test WARNING on v4.7-rc1]
[cannot apply to dm/for-next md/for-next next-20160603]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/mchristi-redhat-com/v8-separate-operations-from-flags-in-the-bio-request-structs/20160606-040240
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   include/linux/compiler.h:232:8: sparse: attribute 'no_sanitize_address': unknown attribute
>> fs/ext4/crypto.c:442:38: sparse: too many arguments for function submit_bio_wait
   In file included from include/linux/fs.h:31:0,
                    from include/linux/seq_file.h:10,
                    from include/linux/pinctrl/consumer.h:17,
                    from include/linux/pinctrl/devinfo.h:21,
                    from include/linux/device.h:24,
                    from include/linux/genhd.h:64,
                    from include/linux/blkdev.h:9,
                    from fs/ext4/ext4.h:20,
                    from fs/ext4/ext4_extents.h:22,
                    from fs/ext4/crypto.c:37:
   fs/ext4/crypto.c: In function 'ext4_encrypted_zeroout':
   include/linux/blk_types.h:194:20: warning: passing argument 1 of 'submit_bio_wait' makes pointer from integer without a cast [-Wint-conversion]
    #define REQ_WRITE  (1ULL << __REQ_WRITE)
                       ^
   include/linux/fs.h:196:19: note: in expansion of macro 'REQ_WRITE'
    #define RW_MASK   REQ_WRITE
                      ^
   include/linux/fs.h:200:17: note: in expansion of macro 'RW_MASK'
    #define WRITE   RW_MASK
                    ^~~
   fs/ext4/crypto.c:442:25: note: in expansion of macro 'WRITE'
      err = submit_bio_wait(WRITE, bio);
            ^
   In file included from include/linux/blkdev.h:19:0,
                    from fs/ext4/ext4.h:20,
                    from fs/ext4/ext4_extents.h:22,
                    from fs/ext4/crypto.c:37:
   include/linux/bio.h:476:12: note: expected 'struct bio *' but argument is of type 'long long unsigned int'
    extern int submit_bio_wait(struct bio *bio);
               ^~~
   fs/ext4/crypto.c:442:9: error: too many arguments to function 'submit_bio_wait'
      err = submit_bio_wait(WRITE, bio);
            ^~~
   In file included from include/linux/blkdev.h:19:0,
                    from fs/ext4/ext4.h:20,
                    from fs/ext4/ext4_extents.h:22,
                    from fs/ext4/crypto.c:37:
   include/linux/bio.h:476:12: note: declared here
    extern int submit_bio_wait(struct bio *bio);
               ^~~

vim +442 fs/ext4/crypto.c

b30ab0e0 Michael Halcrow 2015-04-12  426		goto errout;
b30ab0e0 Michael Halcrow 2015-04-12  427	}
b30ab0e0 Michael Halcrow 2015-04-12  428	bio->bi_bdev = inode->i_sb->s_bdev;
36086d43 Theodore Ts'o   2015-10-03  429	bio->bi_iter.bi_sector =
36086d43 Theodore Ts'o   2015-10-03  430		pblk << (inode->i_sb->s_blocksize_bits - 9);
36086d43 Theodore Ts'o   2015-10-03  431	ret = bio_add_page(bio, ciphertext_page,
b30ab0e0 Michael Halcrow 2015-04-12  432			   inode->i_sb->s_blocksize, 0);
36086d43 Theodore Ts'o   2015-10-03  433	if (ret != inode->i_sb->s_blocksize) {
36086d43 Theodore Ts'o   2015-10-03  434		/* should never happen! */
36086d43 Theodore Ts'o   2015-10-03  435		ext4_msg(inode->i_sb, KERN_ERR,
36086d43 Theodore Ts'o   2015-10-03  436			 "bio_add_page failed: %d", ret);
36086d43 Theodore Ts'o   2015-10-03  437		WARN_ON(1);
b30ab0e0 Michael Halcrow 2015-04-12  438		bio_put(bio);
36086d43 Theodore Ts'o   2015-10-03  439		err = -EIO;
b30ab0e0 Michael Halcrow 2015-04-12  440		goto errout;
b30ab0e0 Michael Halcrow 2015-04-12  441	}
b30ab0e0 Michael Halcrow 2015-04-12 @442	err = submit_bio_wait(WRITE, bio);
36086d43 Theodore Ts'o   2015-10-03  443	if ((err == 0) && bio->bi_error)
36086d43 Theodore Ts'o   2015-10-03  444		err = -EIO;
95ea68b4 Theodore Ts'o   2015-05-31  445	bio_put(bio);
b30ab0e0 Michael Halcrow 2015-04-12  446	if (err)
b30ab0e0 Michael Halcrow 2015-04-12  447		goto errout;
36086d43 Theodore Ts'o   2015-10-03  448	lblk++; pblk++;
b30ab0e0 Michael Halcrow 2015-04-12  449	}
b30ab0e0 Michael Halcrow 2015-04-12  450	err = 0;

:: The code at line 442 was first introduced by commit
:: b30ab0e03407d2aa2d9316cba199c757e4bfc8ad ext4 crypto: add ext4 encryption facilities

:: TO: Michael Halcrow 

Re: btrfs

2016-06-05 Thread Hugo Mills
On Sun, Jun 05, 2016 at 10:56:45PM +0200, Christoph Anton Mitterer wrote:
> On Sun, 2016-06-05 at 22:39 +0200, Henk Slager wrote:
> > > So the point I'm trying to make:
> > > People do probably not care so much whether their VM image/etc. is
> > > COWed or not, snapshots/etc. still work with that,... but they may
> > > likely care if the integrity feature is lost.
> > > So IMHO, nodatacow + checksumming deserves to be amongst the top
> > > priorities.
> > Have you tried blockdevice/HDD caching like bcache or dmcache in
> > combination with VMs on BTRFS?
> Not yet,... my personal use case is just some VMs on the notebook, and
> for this, the above would seem a bit overkill.
> For the larger VM cluster at the institute,... puh, to be honest I don't
> know by heart what we do there.
> 
> 
> >   Or ZVOL for VMs in ZFS with L2ARC?
> Well but all this is an alternative solution,...
> 
> 
> > I assume the primary reason for wanting nodatacow + checksumming is
> > to
> > avoid long seektimes on HDDs due to growing fragmentation of the VM
> > images over time.
> Well the primary reason is wanting to have overall checksumming in the
> fs, regardless of which features one uses.

   The problem is that you can't guarantee consistency with
nodatacow+checksums. If you have nodatacow, then data is overwritten,
in place. If you do that, then you can't have a fully consistent
checksum -- there are always race conditions between the checksum and
the data being written (or the data and the checksum, depending on
which way round you do it).

> I think we already have some situations where tools use/set btrfs
> features by themselves (i.e. automatically)... wasn't systemd creating
> subvols per default in some locations, when there's btrfs?
> So it's no big step to postgresql/etc. setting nodatacow, making people
> lose integrity without them even knowing.
> 
> Of course, avoiding the fragmentation is the reason for the desire to
> have nodatacow.
> 
> 
> >  But even if you have nodatacow + checksumming
> > implemented, it is then still HDD access and a VM imagefile itself is
> > not guaranteed to be continuous.
> Uhm... sure, but that's no difference to other filesystems?!
> 
> 
> > It is clear that for VM images the amount of extents will be large
> > over time (like 50k or so, autodefrag on),
> Wasn't it said that autodefrag performs badly for anything larger than
> ~1G?

   I don't recall ever seeing someone saying that. Of course, I may
have forgotten seeing it...

> >  but with a modern SSD used
> > as cache, it doesn't matter. It is still way faster than just HDD(s),
> > even with freshly copied image with <100 extents.
> Well the fragmentation has also many other consequences and not just
> seeks (assuming everyone would use SSDs, which is and probably won't be
> the case for quite a while).
> Most obviously you get much more IOPS and btrfs itself will, AFAIU,
> also suffer from some issues due to the fragmentation.

   This is a fundamental problem with all CoW filesystems. There are
some mitigations that can be put in place (true CoW rather than
btrfs's redirect-on-write, like some databases do, where the original
data is copied elsewhere before overwriting; cache aggressively and
with knowledge of the CoW nature of the FS, like ZFS does), but they
all have their drawbacks and pathological cases.

   Hugo.

-- 
Hugo Mills | How do you become King? You stand in the marketplace
hugo@... carfax.org.uk | and announce you're going to tax everyone. If you
http://carfax.org.uk/  | get out alive, you're King.
PGP: E2AB1DE4  |Harry Harrison


signature.asc
Description: Digital signature


Re: btrfs

2016-06-05 Thread Christoph Anton Mitterer
On Sun, 2016-06-05 at 22:39 +0200, Henk Slager wrote:
> > So the point I'm trying to make:
> > People do probably not care so much whether their VM image/etc. is
> > COWed or not, snapshots/etc. still work with that,... but they may
> > likely care if the integrity feature is lost.
> > So IMHO, nodatacow + checksumming deserves to be amongst the top
> > priorities.
> Have you tried blockdevice/HDD caching like bcache or dmcache in
> combination with VMs on BTRFS?
Not yet,... my personal use case is just some VMs on the notebook, and
for this, the above would seem a bit overkill.
For the larger VM cluster at the institute,... puh, to be honest I don't
know by heart what we do there.


>   Or ZVOL for VMs in ZFS with L2ARC?
Well but all this is an alternative solution,...


> I assume the primary reason for wanting nodatacow + checksumming is
> to
> avoid long seektimes on HDDs due to growing fragmentation of the VM
> images over time.
Well the primary reason is wanting to have overall checksumming in the
fs, regardless of which features one uses.

I think we already have some situations where tools use/set btrfs
features by themselves (i.e. automatically)... wasn't systemd creating
subvols per default in some locations, when there's btrfs?
So it's no big step to postgresql/etc. setting nodatacow, making people
lose integrity without them even knowing.


Of course, avoiding the fragmentation is the reason for the desire to
have nodatacow.


>  But even if you have nodatacow + checksumming
> implemented, it is then still HDD access and a VM imagefile itself is
> not guaranteed to be continuous.
Uhm... sure, but that's no difference to other filesystems?!


> It is clear that for VM images the amount of extents will be large
> over time (like 50k or so, autodefrag on),
Wasn't it said that autodefrag performs badly for anything larger than
~1G?


>  but with a modern SSD used
> as cache, it doesn't matter. It is still way faster than just HDD(s),
> even with freshly copied image with <100 extents.
Well the fragmentation has also many other consequences and not just
seeks (assuming everyone would use SSDs, which is and probably won't be
the case for quite a while).
Most obviously you get much more IOPS and btrfs itself will, AFAIU,
also suffer from some issues due to the fragmentation.


Cheers,
Chris.

smime.p7s
Description: S/MIME cryptographic signature


Re: [PATCH 42/45] block, fs, drivers: remove REQ_OP compat defs and related code

2016-06-05 Thread kbuild test robot
Hi,

[auto build test WARNING on v4.7-rc1]
[cannot apply to dm/for-next md/for-next next-20160603]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/mchristi-redhat-com/v8-separate-operations-from-flags-in-the-bio-request-structs/20160606-040240
config: x86_64-randconfig-i0-201623 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   In file included from include/linux/genhd.h:67:0,
from include/linux/blkdev.h:9,
from fs/ext4/ext4.h:20,
from fs/ext4/ext4_extents.h:22,
from fs/ext4/crypto.c:37:
   fs/ext4/crypto.c: In function 'ext4_encrypted_zeroout':
>> include/linux/fs.h:197:19: warning: passing argument 1 of 'submit_bio_wait' makes pointer from integer without a cast [-Wint-conversion]
#define RW_MASK   REQ_OP_WRITE
  ^
   include/linux/fs.h:201:17: note: in expansion of macro 'RW_MASK'
#define WRITE   RW_MASK
^~~
   fs/ext4/crypto.c:442:25: note: in expansion of macro 'WRITE'
  err = submit_bio_wait(WRITE, bio);
^
   In file included from include/linux/blkdev.h:19:0,
from fs/ext4/ext4.h:20,
from fs/ext4/ext4_extents.h:22,
from fs/ext4/crypto.c:37:
   include/linux/bio.h:471:12: note: expected 'struct bio *' but argument is of type 'int'
extern int submit_bio_wait(struct bio *bio);
   ^~~
   fs/ext4/crypto.c:442:9: error: too many arguments to function 'submit_bio_wait'
  err = submit_bio_wait(WRITE, bio);
^~~
   In file included from include/linux/blkdev.h:19:0,
from fs/ext4/ext4.h:20,
from fs/ext4/ext4_extents.h:22,
from fs/ext4/crypto.c:37:
   include/linux/bio.h:471:12: note: declared here
extern int submit_bio_wait(struct bio *bio);
   ^~~

vim +/submit_bio_wait +197 include/linux/fs.h

   181	 * READA		Used for read-ahead operations. Lower priority, and the
   182	 *			block layer could (in theory) choose to ignore this
   183	 *			request if it runs into resource problems.
   184	 * WRITE		A normal async write. Device will be plugged.
   185	 * WRITE_SYNC		Synchronous write. Identical to WRITE, but passes down
   186	 *			the hint that someone will be waiting on this IO
   187	 *			shortly. The write equivalent of READ_SYNC.
   188	 * WRITE_ODIRECT	Special case write for O_DIRECT only.
   189	 * WRITE_FLUSH		Like WRITE_SYNC but with preceding cache flush.
   190	 * WRITE_FUA		Like WRITE_SYNC but data is guaranteed to be on
   191	 *			non-volatile media on completion.
   192	 * WRITE_FLUSH_FUA	Combination of WRITE_FLUSH and FUA. The IO is preceded
   193	 *			by a cache flush and data is guaranteed to be on
   194	 *			non-volatile media on completion.
   195	 *
   196	 */
 > 197  #define RW_MASK REQ_OP_WRITE
   198  #define RWA_MASKREQ_RAHEAD
   199  
   200  #define READREQ_OP_READ
   201  #define WRITE   RW_MASK
   202  #define READA   RWA_MASK
   203  
   204  #define READ_SYNC   REQ_SYNC
   205  #define WRITE_SYNC  (REQ_SYNC | REQ_NOIDLE)

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: Binary data


Re: [PATCH 01/45] block/fs/drivers: remove rw argument from submit_bio

2016-06-05 Thread kbuild test robot
Hi,

[auto build test ERROR on v4.7-rc1]
[cannot apply to dm/for-next md/for-next next-20160603]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:
https://github.com/0day-ci/linux/commits/mchristi-redhat-com/v8-separate-operations-from-flags-in-the-bio-request-structs/20160606-040240
config: x86_64-randconfig-i0-201623 (attached as .config)
compiler: gcc-6 (Debian 6.1.1-1) 6.1.1 20160430
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64 

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/fs.h:31:0,
from include/linux/genhd.h:67,
from include/linux/blkdev.h:9,
from fs/ext4/ext4.h:20,
from fs/ext4/ext4_extents.h:22,
from fs/ext4/crypto.c:37:
   fs/ext4/crypto.c: In function 'ext4_encrypted_zeroout':
   include/linux/blk_types.h:194:20: warning: passing argument 1 of 'submit_bio_wait' makes pointer from integer without a cast [-Wint-conversion]
#define REQ_WRITE  (1ULL << __REQ_WRITE)
   ^
   include/linux/fs.h:196:19: note: in expansion of macro 'REQ_WRITE'
#define RW_MASK   REQ_WRITE
  ^
>> include/linux/fs.h:200:17: note: in expansion of macro 'RW_MASK'
#define WRITE   RW_MASK
^~~
>> fs/ext4/crypto.c:442:25: note: in expansion of macro 'WRITE'
  err = submit_bio_wait(WRITE, bio);
^
   In file included from include/linux/blkdev.h:19:0,
from fs/ext4/ext4.h:20,
from fs/ext4/ext4_extents.h:22,
from fs/ext4/crypto.c:37:
   include/linux/bio.h:476:12: note: expected 'struct bio *' but argument is of type 'long long unsigned int'
extern int submit_bio_wait(struct bio *bio);
   ^~~
>> fs/ext4/crypto.c:442:9: error: too many arguments to function 'submit_bio_wait'
  err = submit_bio_wait(WRITE, bio);
^~~
   In file included from include/linux/blkdev.h:19:0,
from fs/ext4/ext4.h:20,
from fs/ext4/ext4_extents.h:22,
from fs/ext4/crypto.c:37:
   include/linux/bio.h:476:12: note: declared here
extern int submit_bio_wait(struct bio *bio);
   ^~~

vim +/submit_bio_wait +442 fs/ext4/crypto.c

36086d43 Theodore Ts'o   2015-10-03  436			 "bio_add_page failed: %d", ret);
36086d43 Theodore Ts'o   2015-10-03  437		WARN_ON(1);
b30ab0e0 Michael Halcrow 2015-04-12  438		bio_put(bio);
36086d43 Theodore Ts'o   2015-10-03  439		err = -EIO;
b30ab0e0 Michael Halcrow 2015-04-12  440		goto errout;
b30ab0e0 Michael Halcrow 2015-04-12  441	}
b30ab0e0 Michael Halcrow 2015-04-12 @442	err = submit_bio_wait(WRITE, bio);
36086d43 Theodore Ts'o   2015-10-03  443	if ((err == 0) && bio->bi_error)
36086d43 Theodore Ts'o   2015-10-03  444		err = -EIO;
95ea68b4 Theodore Ts'o   2015-05-31  445	bio_put(bio);

:: The code at line 442 was first introduced by commit
:: b30ab0e03407d2aa2d9316cba199c757e4bfc8ad ext4 crypto: add ext4 encryption facilities

:: TO: Michael Halcrow 
:: CC: Theodore Ts'o 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation




Re: btrfs

2016-06-05 Thread Henk Slager
>> > - OTOH, defrag seems to be viable for important use cases (VM
>> > images,
>> >   DBs,... everything where large files are internally re-written
>> >   randomly).
>> >   Sure there is nodatacow, but with that one effectively completely
>> >   loses one of the core features/promises of btrfs (integrity by
>> >   checksumming)... and as I've showed in an earlier large
>> > discussion,
>> >   none of the typical use cases for nodatacow has any high-level
>> >   checksumming, and even if, it's not used per default, or doesn't
>> > give
>> >   the same benefits at it would on the fs level, like using it for
>> > RAID
>> >   recovery).
>> The argument of nodatacow being viable for anything is a pretty
>> significant secondary discussion that is itself entirely orthogonal
>> to
>> the point you appear to be trying to make here.
>
> Well the point here was:
> - many people (including myself) like btrfs, it's
>   (promised/future/current) features
> - it's intended as a general purpose fs
> - this includes the case of having such file/IO patterns as e.g. for VM
>   images or DBs
> - this is currently not really doable without losing one of the
>   promises (integrity)
>
> So the point I'm trying to make:
> People do probably not care so much whether their VM image/etc. is
> COWed or not, snapshots/etc. still work with that,... but they may
> likely care if the integrity feature is lost.
> So IMHO, nodatacow + checksumming deserves to be amongst the top
> priorities.

Have you tried blockdevice/HDD caching like bcache or dmcache in
combination with VMs on BTRFS?  Or ZVOL for VMs in ZFS with L2ARC?
I assume the primary reason for wanting nodatacow + checksumming is to
avoid long seektimes on HDDs due to growing fragmentation of the VM
images over time. But even if you have nodatacow + checksumming
implemented, it is then still HDD access and a VM imagefile itself is
not guaranteed to be continuous.
It is clear that for VM images the amount of extents will be large
over time (like 50k or so, autodefrag on), but with a modern SSD used
as cache, it doesn't matter. It is still way faster than just HDD(s),
even with freshly copied image with <100 extents.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: btrfs

2016-06-05 Thread Christoph Anton Mitterer
On Sun, 2016-06-05 at 09:51 -0600, Chris Murphy wrote:
> Why is mdadm the reference point for terminology?
I haven't said it is,... I just said that mdadm, the original paper, and
WP use it the common/historic way.
And since all of these were there before btrfs, and in the case of
mdadm/MD "in" the kernel,... one should probably try to follow that, if
possible.



> There's actually better consistency in terminology usage outside
> Linux
> because of SNIA and DDF than within Linux where the most basic terms
> aren't agreed upon by various upstream maintainers.

Does anyone in the Linux world really care much about DDF? Even
outside? ;-)
Seriously,... as I tried to show in one of my previous posts, I think
the terminology of DDF, at least WRT RAID1, is a bit awkward.


>  mdadm and lvm use
> different terms even though they're both now using the same md
> backend
> in the kernel.
Depending on whether one chooses to use the "raid1" or the "mirror"
segment type...


Anyway,... I think that discussion gets a bit pointless,... It's clear
that the current terminology may easily cause confusion, and for a term
like "RAID1", which is an artificial name, the situation is quite
different than for terms like "stripe", "chunk", etc., which are rather
common words where one must expect them to be used for different things
in different areas.

And as I've said just before... the other points on my bucket list,
like the UUID collision (security) issues, the lack of checksumming with
nodatacow, etc., deserve IMHO much more attention than the terminology
:)

So I'm kinda out of this specific part of the discussion.

Cheers,
Chris.



Re: check if hardware checksumming works or not

2016-06-05 Thread Nicholas D Steeves
Hi Alberto,

On 5 June 2016 at 15:37, Alberto Bursi  wrote:
>
> Hi, I'm running Debian ARM on a Marvell Kirkwood-based 2-disk NAS.
>
> Kirkwood SoCs have a XOR engine that can hardware-accelerate crc32c
> checksumming, and from what I see in kernel mailing lists it seems to have a
> linux driver and should be supported.
>
> I wanted to ask if there is a way to test if it is working at all.
>
> How do I force btrfs to use software checksumming for testing purposes?

Is there a mv_xor.ko module you can blacklist?  I'm not familiar with
the platform, but I imagine you'll have to blacklist it and reboot,
because I'm guessing the module can't be removed once it's loaded.

'just a guess,
Nicholas


Re: RAID1 vs RAID10 and best way to set up 6 disks

2016-06-05 Thread Christoph Anton Mitterer
On Sun, 2016-06-05 at 09:36 -0600, Chris Murphy wrote:
> That's ridiculous. It isn't incorrect to refer to only 2 copies as
> raid1.
No, not if there are only two devices.
But obviously we're talking about how btrfs does RAID1, where even
with n>2 devices there are only 2 copies; that's what's incorrect.


>  You have to explicitly ask both mdadm
Aha, and which option would that be?

>  and lvcreate for the
> number of copies you want, it doesn't automatically happen.
I've said that before, but at least it allows you to use the full
number of disks, so we're again back to that it's closer to the
original and common meaning of RAID1 than what btrfs does.


>  The man
> page for mkfs.btrfs is very clear you only get two copies.

I haven't denied that... but one shouldn't use terms that are commonly
understood in a different manner and require people to read all the
small print.
One could also have swapped RAID0 and RAID1, and I guess people
wouldn't be too delighted if the excuse was "well, it's in the manpage".


> 
> > Well I'd say, for btrfs: do away with the term "RAID" at all, use
> > e.g.:
> > 
> > linear = just a bunch of devices put together, no striping
> >  basically what MD's linear is
> Except this isn't really how Btrfs single works. The difference
> between mdadm linear and Btrfs single is more different in behavior
> than the difference between mdadm raid1 and btrfs raid1. So you're
> proposing tolerating a bigger difference, while criticizing a smaller
> one. *shrug*

What's the big difference? Would you care to explain? But I'm happy
with "single" either way; it just doesn't really tell you that there is
no striping. I mean, "single" points more towards "we have no resilience,
only 1 copy", whether that is striped or not.



> If a metaphor is going to be used for a technical thing, it would be
> mirrors or mirroring. Mirror would mean exactly two (the original and
> the mirror). See lvcreate --mirrors. Also, the lvm mirror segment
> type
> is legacy, having been replaced with raid1 (man lvcreate uses the
> term
> raid1, not RAID1 or RAID-1). So I'm not a big fan of this term.

Admittedly, I didn't like the "mirror(s)" either... I was just trying
to show that different names could be used that are already a bit
better.


> > striped = basically what RAID0 is
> 
> lvcreate uses only striped, not raid0. mdadm uses only RAID0, not
> striped. Since striping is also employed with RAIDs 4, 5, 6, 7, it
> seems ambiguous even though without further qualification whether
> parity exists, it's considered to mean non-parity striping. The
> ambiguity is probably less of a problem than the contradiction that
> is
> RAID0.

Mhh,.. well or one makes schema names that contain all possible
properties of a "RAID", something like:
replicasN-parityN-[not]striped

SINGLE would be something like "replicas1-parity0-notstriped".
RAID5 would be something like "replicas1-parity1-striped".


> > And just mention in the manpage, which of these names comes closest
> > to
> > what people understand by RAID level i.
> 
> It already does this. What version of btrfs-progs are you basing your
> criticism on that there's some inconsistency, deficiency, or
> ambiguity
> when it comes to these raid levels?

Well first, the terminology thing is the least serious issue from my
original list ;-) ... TBH I don't know why such a large discussion came
out of that point.

Even though I'm not reading along all mails here, we have probably at
least every month someone who wasn't aware that RAID1 is not what he
assumes it to be.
And I don't think these people can be blamed for not RTFM, because IMHO
this is a term commonly understood as mirroring across all available
devices.
That's how the original paper describes it, it's how Wikipedia
describes it and all other sources I've ever read to the topic.


>  The one that's unequivocally
> problematic alone without reading the man page is raid10. The
> historic
> understanding is that it's a stripe of mirrors, and this suggests you
> can lose a mirror of each stripe i.e. multiple disks and not lose
> data, which is not true for Btrfs raid10. But the man page makes that
> clear, you have 2 copies for redundancy, that's it.
Yes, same basic problem.


> On the CLI? Not worth it. If the user is that ignorant, too bad, use
> a
> GUI program to help build the storage stack from scratch. I'm really
> not sympathetic if a user creates a raid1 from two partitions of the
> same block device anymore than if it's ultimately the same physical
> device managed by a device mapper variant.

Well, I have no strong opinion on that... if testing for it (or at
least for the simple cases) were easy, why not.
Not every situation may be as easily visible as creating a RAID1 on
/dev/sda1 and /dev/sda2.
One may use LABELs or UUIDs and accidentally catch the wrong device,
and in such cases a check may help.


Cheers,
Chris.



[PATCH 11/45] btrfs: have submit_one_bio users use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

This patch has btrfs's submit_one_bio users set the bio op using
bio_set_op_attrs and get the op using bio_op.

The next patches will continue to convert btrfs,
so submit_bio_hook and merge_bio_hook
related code will be modified to take only the bio. I did
not do it in this patch to try and keep it smaller.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 fs/btrfs/extent_io.c | 88 +---
 1 file changed, 43 insertions(+), 45 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index c1e6f20..48f0302 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2377,7 +2377,7 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
int read_mode;
int ret;
 
-   BUG_ON(failed_bio->bi_rw & REQ_WRITE);
+   BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
 
 	ret = btrfs_get_io_failure_record(inode, start, end, &failrec);
if (ret)
@@ -2403,6 +2403,7 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
free_io_failure(inode, failrec);
return -EIO;
}
+   bio_set_op_attrs(bio, REQ_OP_READ, read_mode);
 
 	pr_debug("Repair Read Error: submitting new read[%#x] to this_mirror=%d, in_validation=%d\n",
 		 read_mode, failrec->this_mirror, failrec->in_validation);
@@ -2714,8 +2715,8 @@ struct bio *btrfs_io_bio_alloc(gfp_t gfp_mask, unsigned int nr_iovecs)
 }
 
 
-static int __must_check submit_one_bio(int rw, struct bio *bio,
-  int mirror_num, unsigned long bio_flags)
+static int __must_check submit_one_bio(struct bio *bio, int mirror_num,
+  unsigned long bio_flags)
 {
int ret = 0;
struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
@@ -2726,12 +2727,12 @@ static int __must_check submit_one_bio(int rw, struct bio *bio,
start = page_offset(page) + bvec->bv_offset;
 
bio->bi_private = NULL;
-   bio->bi_rw = rw;
bio_get(bio);
 
if (tree->ops && tree->ops->submit_bio_hook)
-   ret = tree->ops->submit_bio_hook(page->mapping->host, rw, bio,
-  mirror_num, bio_flags, start);
+   ret = tree->ops->submit_bio_hook(page->mapping->host,
+bio->bi_rw, bio, mirror_num,
+bio_flags, start);
else
btrfsic_submit_bio(bio);
 
@@ -2739,20 +2740,20 @@ static int __must_check submit_one_bio(int rw, struct bio *bio,
return ret;
 }
 
-static int merge_bio(int rw, struct extent_io_tree *tree, struct page *page,
+static int merge_bio(struct extent_io_tree *tree, struct page *page,
 unsigned long offset, size_t size, struct bio *bio,
 unsigned long bio_flags)
 {
int ret = 0;
if (tree->ops && tree->ops->merge_bio_hook)
-   ret = tree->ops->merge_bio_hook(rw, page, offset, size, bio,
-   bio_flags);
+   ret = tree->ops->merge_bio_hook(bio_op(bio), page, offset, size,
+   bio, bio_flags);
BUG_ON(ret < 0);
return ret;
 
 }
 
-static int submit_extent_page(int rw, struct extent_io_tree *tree,
+static int submit_extent_page(int op, int op_flags, struct extent_io_tree *tree,
  struct writeback_control *wbc,
  struct page *page, sector_t sector,
  size_t size, unsigned long offset,
@@ -2780,10 +2781,9 @@ static int submit_extent_page(int rw, struct extent_io_tree *tree,
 
if (prev_bio_flags != bio_flags || !contig ||
force_bio_submit ||
-	    merge_bio(rw, tree, page, offset, page_size, bio, bio_flags) ||
+   merge_bio(tree, page, offset, page_size, bio, bio_flags) ||
bio_add_page(bio, page, page_size, offset) < page_size) {
-   ret = submit_one_bio(rw, bio, mirror_num,
-prev_bio_flags);
+   ret = submit_one_bio(bio, mirror_num, prev_bio_flags);
if (ret < 0) {
*bio_ret = NULL;
return ret;
@@ -2804,6 +2804,7 @@ static int submit_extent_page(int rw, struct extent_io_tree *tree,
bio_add_page(bio, page, page_size, offset);
bio->bi_end_io = end_io_func;
bio->bi_private = tree;
+   bio_set_op_attrs(bio, op, op_flags);
if (wbc) {
wbc_init_bio(wbc, bio);
wbc_account_io(wbc, page, page_size);
@@ -2812,7 +2813,7 @@ static int 

[PATCH 09/45] block discard: use bio set op accessor

2016-06-05 Thread mchristi
From: Mike Christie 

This converts the block issue discard helper and users to use
the bio_set_op_attrs accessor and only pass in the operation flags
like REQ_SECURE.

Signed-off-by: Mike Christie 
---
 block/blk-lib.c| 13 +++--
 drivers/md/dm-thin.c   |  2 +-
 include/linux/blkdev.h |  3 ++-
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index c614eaa..ff2a7f0 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -23,7 +23,8 @@ static struct bio *next_bio(struct bio *bio, unsigned int nr_pages,
 }
 
 int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
-   sector_t nr_sects, gfp_t gfp_mask, int type, struct bio **biop)
+   sector_t nr_sects, gfp_t gfp_mask, int op_flags,
+   struct bio **biop)
 {
struct request_queue *q = bdev_get_queue(bdev);
struct bio *bio = *biop;
@@ -34,7 +35,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
return -ENXIO;
if (!blk_queue_discard(q))
return -EOPNOTSUPP;
-   if ((type & REQ_SECURE) && !blk_queue_secdiscard(q))
+   if ((op_flags & REQ_SECURE) && !blk_queue_secdiscard(q))
return -EOPNOTSUPP;
 
/* Zero-sector (unknown) and one-sector granularities are the same.  */
@@ -65,7 +66,7 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
bio = next_bio(bio, 1, gfp_mask);
bio->bi_iter.bi_sector = sector;
bio->bi_bdev = bdev;
-   bio->bi_rw = type;
+   bio_set_op_attrs(bio, REQ_OP_DISCARD, op_flags);
 
bio->bi_iter.bi_size = req_sects << 9;
nr_sects -= req_sects;
@@ -99,16 +100,16 @@ EXPORT_SYMBOL(__blkdev_issue_discard);
 int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask, unsigned long flags)
 {
-   int type = REQ_WRITE | REQ_DISCARD;
+   int op_flags = 0;
struct bio *bio = NULL;
struct blk_plug plug;
int ret;
 
if (flags & BLKDEV_DISCARD_SECURE)
-   type |= REQ_SECURE;
+   op_flags |= REQ_SECURE;
 
 	blk_start_plug(&plug);
-	ret = __blkdev_issue_discard(bdev, sector, nr_sects, gfp_mask, type,
+	ret = __blkdev_issue_discard(bdev, sector, nr_sects, gfp_mask, op_flags,
 				     &bio);
if (!ret && bio) {
ret = submit_bio_wait(bio);
diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 8c070ee..e8661c2 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -360,7 +360,7 @@ static int issue_discard(struct discard_op *op, dm_block_t data_b, dm_block_t da
sector_t len = block_to_sectors(tc->pool, data_e - data_b);
 
return __blkdev_issue_discard(tc->pool_dev->bdev, s, len,
- GFP_NOWAIT, REQ_WRITE | REQ_DISCARD, 
>bio);
+ GFP_NOWAIT, 0, >bio);
 }
 
 static void end_discard(struct discard_op *op, int r)
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 49c2dbc..8c78aca 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1149,7 +1149,8 @@ extern int blkdev_issue_flush(struct block_device *, gfp_t, sector_t *);
 extern int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask, unsigned long flags);
 extern int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
-   sector_t nr_sects, gfp_t gfp_mask, int type, struct bio **biop);
+   sector_t nr_sects, gfp_t gfp_mask, int op_flags,
+   struct bio **biop);
 extern int blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask, struct page *page);
 extern int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
-- 
2.7.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 08/45] block, fs, mm, drivers: use bio set/get op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

This patch converts the simple bi_rw use cases in the block,
drivers, mm and fs code to set/get the bio operation using the
bio_set_op_attrs/bio_op accessors.

These should be simple one- or two-line cases, so I just did them
in one patch. The next patches handle the more complicated
cases, one module per patch.
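
For readers following along, the accessor idea can be sketched as a
standalone C fragment. Everything here (the op values, REQ_OP_BITS, the
struct) is an illustrative stand-in rather than the kernel's exact
definitions; v8 of the series stores the op in the top bits of bi_rw and
the rq_flag_bits below it:

```c
/* Illustrative op codes and bit layout -- stand-ins, not the kernel's. */
enum req_op { REQ_OP_READ, REQ_OP_WRITE, REQ_OP_DISCARD, REQ_OP_WRITE_SAME };

#define REQ_OP_BITS	3
#define BIO_OP_SHIFT	(8 * sizeof(unsigned long) - REQ_OP_BITS)

struct bio_sketch {
	unsigned long bi_rw;	/* op in the top REQ_OP_BITS, flags below */
};

/* Mirror of bio_set_op_attrs(): pack op and flags into one word. */
static inline void bio_set_op_attrs(struct bio_sketch *bio,
				    enum req_op op, unsigned long op_flags)
{
	bio->bi_rw = ((unsigned long)op << BIO_OP_SHIFT) | op_flags;
}

/* Mirror of bio_op(): recover just the operation, ignoring the flags. */
static inline enum req_op bio_op(const struct bio_sketch *bio)
{
	return (enum req_op)(bio->bi_rw >> BIO_OP_SHIFT);
}
```

Because callers go through the accessors, the storage layout can change
later without touching every call site again.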

Signed-off-by: Mike Christie 
---

v5:
1. Add missed crypto call.
2. Change nfs bi_rw check to bi_op.

 block/bio.c | 13 ++---
 block/blk-core.c|  6 +++---
 block/blk-flush.c   |  2 +-
 block/blk-lib.c |  4 ++--
 block/blk-map.c |  2 +-
 block/blk-merge.c   | 12 ++--
 drivers/block/brd.c |  2 +-
 drivers/block/floppy.c  |  2 +-
 drivers/block/pktcdvd.c |  4 ++--
 drivers/block/rsxx/dma.c|  2 +-
 drivers/block/zram/zram_drv.c   |  2 +-
 drivers/lightnvm/rrpc.c |  6 +++---
 drivers/scsi/osd/osd_initiator.c|  8 
 drivers/staging/lustre/lustre/llite/lloop.c |  6 +++---
 fs/crypto/crypto.c  |  2 +-
 fs/exofs/ore.c  |  2 +-
 fs/ext4/page-io.c   |  6 +++---
 fs/ext4/readpage.c  |  2 +-
 fs/jfs/jfs_logmgr.c |  4 ++--
 fs/jfs/jfs_metapage.c   |  4 ++--
 fs/logfs/dev_bdev.c | 12 ++--
 fs/nfs/blocklayout/blocklayout.c|  4 ++--
 include/linux/bio.h | 15 ++-
 mm/page_io.c|  4 ++--
 24 files changed, 65 insertions(+), 61 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index fc779eb..848cd35 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -656,16 +656,15 @@ struct bio *bio_clone_bioset(struct bio *bio_src, gfp_t 
gfp_mask,
bio = bio_alloc_bioset(gfp_mask, bio_segments(bio_src), bs);
if (!bio)
return NULL;
-
bio->bi_bdev= bio_src->bi_bdev;
bio->bi_rw  = bio_src->bi_rw;
bio->bi_iter.bi_sector  = bio_src->bi_iter.bi_sector;
bio->bi_iter.bi_size= bio_src->bi_iter.bi_size;
 
-   if (bio->bi_rw & REQ_DISCARD)
+   if (bio_op(bio) == REQ_OP_DISCARD)
goto integrity_clone;
 
-   if (bio->bi_rw & REQ_WRITE_SAME) {
+   if (bio_op(bio) == REQ_OP_WRITE_SAME) {
bio->bi_io_vec[bio->bi_vcnt++] = bio_src->bi_io_vec[0];
goto integrity_clone;
}
@@ -1166,7 +1165,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
goto out_bmd;
 
if (iter->type & WRITE)
-   bio->bi_rw |= REQ_WRITE;
+   bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
 
ret = 0;
 
@@ -1336,7 +1335,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
 * set data direction, and check if mapped pages need bouncing
 */
if (iter->type & WRITE)
-   bio->bi_rw |= REQ_WRITE;
+   bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
 
bio_set_flag(bio, BIO_USER_MAPPED);
 
@@ -1529,7 +1528,7 @@ struct bio *bio_copy_kern(struct request_queue *q, void 
*data, unsigned int len,
bio->bi_private = data;
} else {
bio->bi_end_io = bio_copy_kern_endio;
-   bio->bi_rw |= REQ_WRITE;
+   bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
}
 
return bio;
@@ -1784,7 +1783,7 @@ struct bio *bio_split(struct bio *bio, int sectors,
 * Discards need a mutable bio_vec to accommodate the payload
 * required by the DSM TRIM and UNMAP commands.
 */
-   if (bio->bi_rw & REQ_DISCARD)
+   if (bio_op(bio) == REQ_OP_DISCARD)
split = bio_clone_bioset(bio, gfp, bs);
else
split = bio_clone_fast(bio, gfp, bs);
diff --git a/block/blk-core.c b/block/blk-core.c
index e8e5865..7e943dc 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1973,14 +1973,14 @@ generic_make_request_checks(struct bio *bio)
}
}
 
-   if ((bio->bi_rw & REQ_DISCARD) &&
+   if ((bio_op(bio) == REQ_OP_DISCARD) &&
(!blk_queue_discard(q) ||
 ((bio->bi_rw & REQ_SECURE) && !blk_queue_secdiscard(q {
err = -EOPNOTSUPP;
goto end_io;
}
 
-   if (bio->bi_rw & REQ_WRITE_SAME && !bdev_write_same(bio->bi_bdev)) {
+   if (bio_op(bio) == REQ_OP_WRITE_SAME && !bdev_write_same(bio->bi_bdev)) 
{
err = -EOPNOTSUPP;
goto end_io;
}
@@ -2110,7 +2110,7 @@ blk_qc_t submit_bio(struct bio *bio)
if (bio_has_data(bio)) {
unsigned int count;
 
-   if 

[PATCH 10/45] direct-io: use bio set/get op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

This patch has the dio code use a REQ_OP for the op and rq_flag_bits
for bi_rw flags. To set/get the op it uses the bio_set_op_attrs/bio_op
accessors.

It also begins to convert btrfs's dio_submit_t because of the dio
submit_io callout use. The next patches will completely convert
this code and the rest of the btrfs code paths.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 fs/btrfs/inode.c   |  8 
 fs/direct-io.c | 34 --
 include/linux/fs.h |  2 +-
 3 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 2704995..96f9192 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8422,14 +8422,14 @@ out_err:
return 0;
 }
 
-static void btrfs_submit_direct(int rw, struct bio *dio_bio,
-   struct inode *inode, loff_t file_offset)
+static void btrfs_submit_direct(struct bio *dio_bio, struct inode *inode,
+   loff_t file_offset)
 {
struct btrfs_dio_private *dip = NULL;
struct bio *io_bio = NULL;
struct btrfs_io_bio *btrfs_bio;
int skip_sum;
-   int write = rw & REQ_WRITE;
+   bool write = (bio_op(dio_bio) == REQ_OP_WRITE);
int ret = 0;
 
skip_sum = BTRFS_I(inode)->flags & BTRFS_INODE_NODATASUM;
@@ -8480,7 +8480,7 @@ static void btrfs_submit_direct(int rw, struct bio 
*dio_bio,
dio_data->unsubmitted_oe_range_end;
}
 
-   ret = btrfs_submit_direct_hook(rw, dip, skip_sum);
+   ret = btrfs_submit_direct_hook(dio_bio->bi_rw, dip, skip_sum);
if (!ret)
return;
 
diff --git a/fs/direct-io.c b/fs/direct-io.c
index 1bcdd5d..7c3ce73 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -108,7 +108,8 @@ struct dio_submit {
 /* dio_state communicated between submission path and end_io */
 struct dio {
int flags;  /* doesn't change */
-   int rw;
+   int op;
+   int op_flags;
blk_qc_t bio_cookie;
struct block_device *bio_bdev;
struct inode *inode;
@@ -163,7 +164,7 @@ static inline int dio_refill_pages(struct dio *dio, struct 
dio_submit *sdio)
ret = iov_iter_get_pages(sdio->iter, dio->pages, LONG_MAX, DIO_PAGES,
&sdio->from);
 
-   if (ret < 0 && sdio->blocks_available && (dio->rw & WRITE)) {
+   if (ret < 0 && sdio->blocks_available && (dio->op == REQ_OP_WRITE)) {
struct page *page = ZERO_PAGE(0);
/*
 * A memory fault, but the filesystem has some outstanding
@@ -242,7 +243,8 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, 
bool is_async)
transferred = dio->result;
 
/* Check for short read case */
-   if ((dio->rw == READ) && ((offset + transferred) > dio->i_size))
+   if ((dio->op == REQ_OP_READ) &&
+   ((offset + transferred) > dio->i_size))
transferred = dio->i_size - offset;
}
 
@@ -273,7 +275,7 @@ static ssize_t dio_complete(struct dio *dio, ssize_t ret, 
bool is_async)
 */
dio->iocb->ki_pos += transferred;
 
-   if (dio->rw & WRITE)
+   if (dio->op == REQ_OP_WRITE)
ret = generic_write_sync(dio->iocb,  transferred);
dio->iocb->ki_complete(dio->iocb, ret, 0);
}
@@ -375,7 +377,7 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio,
 
bio->bi_bdev = bdev;
bio->bi_iter.bi_sector = first_sector;
-   bio->bi_rw = dio->rw;
+   bio_set_op_attrs(bio, dio->op, dio->op_flags);
if (dio->is_async)
bio->bi_end_io = dio_bio_end_aio;
else
@@ -403,14 +405,13 @@ static inline void dio_bio_submit(struct dio *dio, struct 
dio_submit *sdio)
dio->refcount++;
spin_unlock_irqrestore(&dio->bio_lock, flags);
 
-   if (dio->is_async && dio->rw == READ && dio->should_dirty)
+   if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty)
bio_set_pages_dirty(bio);
 
dio->bio_bdev = bio->bi_bdev;
 
if (sdio->submit_io) {
-   sdio->submit_io(dio->rw, bio, dio->inode,
-  sdio->logical_offset_in_bio);
+   sdio->submit_io(bio, dio->inode, sdio->logical_offset_in_bio);
dio->bio_cookie = BLK_QC_T_NONE;
} else
dio->bio_cookie = submit_bio(bio);
@@ -479,14 +480,14 @@ static int dio_bio_complete(struct dio *dio, struct bio 
*bio)
if (bio->bi_error)
dio->io_error = -EIO;
 
-   if (dio->is_async && dio->rw == READ && dio->should_dirty) {
+   if (dio->is_async && dio->op == REQ_OP_READ && dio->should_dirty) {
err = 

[PATCH 15/45] f2fs: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have f2fs
set/get the bio using bio_set_op_attrs/bio_op.
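
The shape of the f2fs_io_info change can be sketched standalone; the flag
values and op codes are illustrative stand-ins, not f2fs's real
definitions. The payoff is that masking a flag off (as the !is_meta and
META_POR cases do with REQ_META) can no longer disturb the operation:

```c
/* Illustrative flag bits and op values -- stand-ins, not f2fs's. */
#define REQ_META	(1u << 4)
#define REQ_PRIO	(1u << 5)

enum req_op { REQ_OP_READ, REQ_OP_WRITE };

/* The old single .rw field becomes an op plus separate modifier flags,
 * so flag arithmetic is confined to .op_flags. */
struct f2fs_io_info_sketch {
	enum req_op op;		/* what the bio does */
	unsigned int op_flags;	/* how it should be issued */
};
```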

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 fs/f2fs/checkpoint.c| 10 ++
 fs/f2fs/data.c  | 47 ++---
 fs/f2fs/f2fs.h  |  5 +++--
 fs/f2fs/gc.c|  9 ++---
 fs/f2fs/inline.c|  3 ++-
 fs/f2fs/node.c  |  8 +---
 fs/f2fs/segment.c   | 12 +++-
 fs/f2fs/trace.c |  7 ---
 include/trace/events/f2fs.h | 34 +++-
 9 files changed, 81 insertions(+), 54 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 3891600..b6d600e 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -63,14 +63,15 @@ static struct page *__get_meta_page(struct f2fs_sb_info 
*sbi, pgoff_t index,
struct f2fs_io_info fio = {
.sbi = sbi,
.type = META,
-   .rw = READ_SYNC | REQ_META | REQ_PRIO,
+   .op = REQ_OP_READ,
+   .op_flags = READ_SYNC | REQ_META | REQ_PRIO,
.old_blkaddr = index,
.new_blkaddr = index,
.encrypted_page = NULL,
};
 
if (unlikely(!is_meta))
-   fio.rw &= ~REQ_META;
+   fio.op_flags &= ~REQ_META;
 repeat:
page = f2fs_grab_cache_page(mapping, index, false);
if (!page) {
@@ -157,13 +158,14 @@ int ra_meta_pages(struct f2fs_sb_info *sbi, block_t 
start, int nrpages,
struct f2fs_io_info fio = {
.sbi = sbi,
.type = META,
-   .rw = sync ? (READ_SYNC | REQ_META | REQ_PRIO) : READA,
+   .op = REQ_OP_READ,
+   .op_flags = sync ? (READ_SYNC | REQ_META | REQ_PRIO) : READA,
.encrypted_page = NULL,
};
struct blk_plug plug;
 
if (unlikely(type == META_POR))
-   fio.rw &= ~REQ_META;
+   fio.op_flags &= ~REQ_META;
 
blk_start_plug(&plug);
for (; nrpages-- > 0; blkno++) {
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index c595c8f..8769e83 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -97,12 +97,10 @@ static struct bio *__bio_alloc(struct f2fs_sb_info *sbi, 
block_t blk_addr,
return bio;
 }
 
-static inline void __submit_bio(struct f2fs_sb_info *sbi, int rw,
-   struct bio *bio)
+static inline void __submit_bio(struct f2fs_sb_info *sbi, struct bio *bio)
 {
-   if (!is_read_io(rw))
+   if (!is_read_io(bio_op(bio)))
atomic_inc(>nr_wb_bios);
-   bio->bi_rw = rw;
submit_bio(bio);
 }
 
@@ -113,12 +111,14 @@ static void __submit_merged_bio(struct f2fs_bio_info *io)
if (!io->bio)
return;
 
-   if (is_read_io(fio->rw))
+   if (is_read_io(fio->op))
trace_f2fs_submit_read_bio(io->sbi->sb, fio, io->bio);
else
trace_f2fs_submit_write_bio(io->sbi->sb, fio, io->bio);
 
-   __submit_bio(io->sbi, fio->rw, io->bio);
+   bio_set_op_attrs(io->bio, fio->op, fio->op_flags);
+
+   __submit_bio(io->sbi, io->bio);
io->bio = NULL;
 }
 
@@ -184,10 +184,12 @@ static void __f2fs_submit_merged_bio(struct f2fs_sb_info 
*sbi,
/* change META to META_FLUSH in the checkpoint procedure */
if (type >= META_FLUSH) {
io->fio.type = META_FLUSH;
+   io->fio.op = REQ_OP_WRITE;
if (test_opt(sbi, NOBARRIER))
-   io->fio.rw = WRITE_FLUSH | REQ_META | REQ_PRIO;
+   io->fio.op_flags = WRITE_FLUSH | REQ_META | REQ_PRIO;
else
-   io->fio.rw = WRITE_FLUSH_FUA | REQ_META | REQ_PRIO;
+   io->fio.op_flags = WRITE_FLUSH_FUA | REQ_META |
+   REQ_PRIO;
}
__submit_merged_bio(io);
 out:
@@ -229,14 +231,16 @@ int f2fs_submit_page_bio(struct f2fs_io_info *fio)
f2fs_trace_ios(fio, 0);
 
/* Allocate a new bio */
-   bio = __bio_alloc(fio->sbi, fio->new_blkaddr, 1, is_read_io(fio->rw));
+   bio = __bio_alloc(fio->sbi, fio->new_blkaddr, 1, is_read_io(fio->op));
 
if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
bio_put(bio);
return -EFAULT;
}
+   bio->bi_rw = fio->op_flags;
+   bio_set_op_attrs(bio, fio->op, fio->op_flags);
 
-   __submit_bio(fio->sbi, fio->rw, bio);
+   __submit_bio(fio->sbi, bio);
return 0;
 }
 
@@ -245,7 +249,7 @@ void f2fs_submit_page_mbio(struct f2fs_io_info *fio)
struct f2fs_sb_info *sbi = fio->sbi;
enum page_type btype = PAGE_TYPE_OF_BIO(fio->type);
struct f2fs_bio_info 

[PATCH 13/45] btrfs: update __btrfs_map_block for REQ_OP transition

2016-06-05 Thread mchristi
From: Mike Christie 

We no longer pass in a bitmap of rq_flag_bits to __btrfs_map_block.
It will always be a REQ_OP, or the btrfs specific REQ_GET_READ_MIRRORS,
so this drops the bit tests.
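
Why equality tests must replace the bit tests can be seen in a tiny
standalone sketch. The op values below are assumed for illustration; the
point is that sequential op numbers share low bits, so the old mask idiom
misfires:

```c
/* Ops are now small sequential numbers, not one-bit flags (values
 * assumed for illustration). */
enum req_op { REQ_OP_READ = 0, REQ_OP_WRITE = 1, REQ_OP_DISCARD = 2,
	      REQ_OP_WRITE_SAME = 3 };

/* Old-style mask test: wrong once ops share low bits. */
static int is_write_masked(int op)
{
	return (op & REQ_OP_WRITE) != 0;
}

/* New-style equality test, as used throughout this patch. */
static int is_write(int op)
{
	return op == REQ_OP_WRITE;
}
```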

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 fs/btrfs/extent-tree.c |  2 +-
 fs/btrfs/inode.c   |  2 +-
 fs/btrfs/volumes.c | 55 +++---
 fs/btrfs/volumes.h |  4 ++--
 4 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a400951..70af591 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2043,7 +2043,7 @@ int btrfs_discard_extent(struct btrfs_root *root, u64 
bytenr,
 
 
/* Tell the block device(s) that the sectors can be discarded */
-   ret = btrfs_map_block(root->fs_info, REQ_DISCARD,
+   ret = btrfs_map_block(root->fs_info, REQ_OP_DISCARD,
  bytenr, &num_bytes, &bbio, 0);
/* Error condition is -ENOMEM */
if (!ret) {
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index b07e1d9..1575944 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1838,7 +1838,7 @@ int btrfs_merge_bio_hook(int rw, struct page *page, 
unsigned long offset,
 
length = bio->bi_iter.bi_size;
map_length = length;
-   ret = btrfs_map_block(root->fs_info, rw, logical,
+   ret = btrfs_map_block(root->fs_info, bio_op(bio), logical,
  &map_length, NULL, 0);
/* Will always return 0 with map_multi == NULL */
BUG_ON(ret < 0);
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 4dc2249..345b183 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -5248,7 +5248,7 @@ void btrfs_put_bbio(struct btrfs_bio *bbio)
kfree(bbio);
 }
 
-static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int rw,
+static int __btrfs_map_block(struct btrfs_fs_info *fs_info, int op,
 u64 logical, u64 *length,
 struct btrfs_bio **bbio_ret,
 int mirror_num, int need_raid_map)
@@ -5334,7 +5334,7 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
raid56_full_stripe_start *= full_stripe_len;
}
 
-   if (rw & REQ_DISCARD) {
+   if (op == REQ_OP_DISCARD) {
/* we don't discard raid56 yet */
if (map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) {
ret = -EOPNOTSUPP;
@@ -5347,7 +5347,7 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
   For other RAID types and for RAID[56] reads, just allow a 
single
   stripe (on a single disk). */
if ((map->type & BTRFS_BLOCK_GROUP_RAID56_MASK) &&
-   (rw & REQ_WRITE)) {
+   (op == REQ_OP_WRITE)) {
max_len = stripe_len * nr_data_stripes(map) -
(offset - raid56_full_stripe_start);
} else {
@@ -5372,8 +5372,8 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
btrfs_dev_replace_set_lock_blocking(dev_replace);
 
if (dev_replace_is_ongoing && mirror_num == map->num_stripes + 1 &&
-   !(rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS)) &&
-   dev_replace->tgtdev != NULL) {
+   op != REQ_OP_WRITE && op != REQ_OP_DISCARD &&
+   op != REQ_GET_READ_MIRRORS && dev_replace->tgtdev != NULL) {
/*
 * in dev-replace case, for repair case (that's the only
 * case where the mirror is selected explicitly when
@@ -5460,15 +5460,17 @@ static int __btrfs_map_block(struct btrfs_fs_info 
*fs_info, int rw,
(offset + *length);
 
if (map->type & BTRFS_BLOCK_GROUP_RAID0) {
-   if (rw & REQ_DISCARD)
+   if (op == REQ_OP_DISCARD)
num_stripes = min_t(u64, map->num_stripes,
stripe_nr_end - stripe_nr_orig);
stripe_nr = div_u64_rem(stripe_nr, map->num_stripes,
_index);
-   if (!(rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS)))
+   if (op != REQ_OP_WRITE && op != REQ_OP_DISCARD &&
+   op != REQ_GET_READ_MIRRORS)
mirror_num = 1;
} else if (map->type & BTRFS_BLOCK_GROUP_RAID1) {
-   if (rw & (REQ_WRITE | REQ_DISCARD | REQ_GET_READ_MIRRORS))
+   if (op == REQ_OP_WRITE || op == REQ_OP_DISCARD ||
+   op == REQ_GET_READ_MIRRORS)
num_stripes = map->num_stripes;
else if (mirror_num)
stripe_index = mirror_num - 1;
@@ -5481,7 +5483,8 @@ static int 

[PATCH 12/45] btrfs: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

These should be the easier cases to convert btrfs to
bio_set_op_attrs/bio_op.
They are mostly just cut and replace type of changes.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
v5:
- Misset bi_rw to REQ_OP_WRITE in finish_parity_scrub

 fs/btrfs/check-integrity.c | 19 +--
 fs/btrfs/compression.c |  4 
 fs/btrfs/disk-io.c |  8 
 fs/btrfs/inode.c   | 21 -
 fs/btrfs/raid56.c  | 10 +-
 fs/btrfs/scrub.c   | 10 +-
 fs/btrfs/volumes.c | 15 ---
 7 files changed, 47 insertions(+), 40 deletions(-)

diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index 0d3748b..80a4389 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -1673,7 +1673,7 @@ static int btrfsic_read_block(struct btrfsic_state *state,
}
bio->bi_bdev = block_ctx->dev->bdev;
bio->bi_iter.bi_sector = dev_bytenr >> 9;
-   bio->bi_rw = READ;
+   bio_set_op_attrs(bio, REQ_OP_READ, 0);
 
for (j = i; j < num_pages; j++) {
ret = bio_add_page(bio, block_ctx->pagev[j],
@@ -2922,7 +2922,6 @@ int btrfsic_submit_bh(int op, int op_flags, struct 
buffer_head *bh)
 static void __btrfsic_submit_bio(struct bio *bio)
 {
struct btrfsic_dev_state *dev_state;
-   int rw = bio->bi_rw;
 
if (!btrfsic_is_initialized)
return;
@@ -2932,7 +2931,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
 * btrfsic_mount(), this might return NULL */
dev_state = btrfsic_dev_state_lookup(bio->bi_bdev);
if (NULL != dev_state &&
-   (rw & WRITE) && NULL != bio->bi_io_vec) {
+   (bio_op(bio) == REQ_OP_WRITE) && NULL != bio->bi_io_vec) {
unsigned int i;
u64 dev_bytenr;
u64 cur_bytenr;
@@ -2944,9 +2943,9 @@ static void __btrfsic_submit_bio(struct bio *bio)
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bio(rw=0x%x, bi_vcnt=%u,"
+  "submit_bio(rw=%d,0x%lx, bi_vcnt=%u,"
   " bi_sector=%llu (bytenr %llu), bi_bdev=%p)\n",
-  rw, bio->bi_vcnt,
+  bio_op(bio), bio->bi_rw, bio->bi_vcnt,
   (unsigned long long)bio->bi_iter.bi_sector,
   dev_bytenr, bio->bi_bdev);
 
@@ -2977,18 +2976,18 @@ static void __btrfsic_submit_bio(struct bio *bio)
btrfsic_process_written_block(dev_state, dev_bytenr,
  mapped_datav, bio->bi_vcnt,
  bio, _is_patched,
- NULL, rw);
+ NULL, bio->bi_rw);
while (i > 0) {
i--;
kunmap(bio->bi_io_vec[i].bv_page);
}
kfree(mapped_datav);
-   } else if (NULL != dev_state && (rw & REQ_FLUSH)) {
+   } else if (NULL != dev_state && (bio->bi_rw & REQ_FLUSH)) {
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bio(rw=0x%x FLUSH, bdev=%p)\n",
-  rw, bio->bi_bdev);
+  "submit_bio(rw=%d,0x%lx FLUSH, bdev=%p)\n",
+  bio_op(bio), bio->bi_rw, bio->bi_bdev);
if (!dev_state->dummy_block_for_bio_bh_flush.is_iodone) {
if ((dev_state->state->print_mask &
 (BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH |
@@ -3006,7 +3005,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
block->never_written = 0;
block->iodone_w_error = 0;
block->flush_gen = dev_state->last_flush_gen + 1;
-   block->submit_bio_bh_rw = rw;
+   block->submit_bio_bh_rw = bio->bi_rw;
block->orig_bio_bh_private = bio->bi_private;
block->orig_bio_bh_end_io.bio = bio->bi_end_io;
block->next_in_same_bio = NULL;
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 658c39b..029bd79 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -363,6 +363,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 
start,
kfree(cb);
return -ENOMEM;
}
+   bio_set_op_attrs(bio, REQ_OP_WRITE, 0);

[PATCH 00/45] v8: separate operations from flags in the bio/request structs

2016-06-05 Thread mchristi
The following patches begin to cleanup the request->cmd_flags and
bio->bi_rw mess. We currently use cmd_flags to specify the operation,
attributes and state of the request. For bi_rw we use it for similar
info and also the priority but then also have another bi_flags field
for state. At some point, we abused them so much we just made cmd_flags
64 bits, so we could add more.

The following patches separate the operation (read, write, discard,
flush, etc.) from cmd_flags/bi_rw. They were made against Linus's
tree.

I put a git tree here:

https://github.com/mikechristie/linux-kernel/tree/op

The patches are in the op branch.

Note that I made it against Linus's tree, but right now the only
major conflicts with -next are in the dm tree from the dm-rq related
changes. I have patches for that and can submit them. I was just not
sure how to coordinate everything.

v8:
1. Handle Jens's review comments from LSF.  Instead of adding a op
field, store the value in bi_rw/cmd_flags and access via accessors.

v7:
1. Fix broken feature_flush/fua use.

v6, and hopefully the last version:
1. Adapt patch 41 to Jens's QUEUE_FLAG_WC/FUA patchset.

v5:
1. Missed crypto fs submit_bio_wait call.
2. Change nfs bi_rw check to bi_op.
3. btrfs. Convert finish_parity_scrub.
4. Reworked against Jens's QUEUE_FLAG patches so I could drop my similar
code.
5. Separated the core block layer change into multiple patches for
merging, elevator, stats, mq and non mq request allocation to try
and make it easier to read.

v4:
1. Rebased to current linux-next tree.

v3:

1. Used "=" instead of "|=" to set up bio bi_rw.
2. Removed __get_request cmd_flags compat code.
3. Merged initial dm related changes requested by Mike Snitzer.
4. Fixed ubd kbuild errors in flush related patches.
5. Fix 80 char col issues in several patches.
6. Fix issue with one of the btrfs patches where it looks like I reverted
a patch when trying to fix a merge error.

v2

1. Dropped arguments from submit_bio, and had callers setup
bio.
2. Add REQ_OP_FLUSH for request_fn users and renamed REQ_FLUSH
to REQ_PREFLUSH for make_request_fn users.
3. Dropped bio/rq_data_dir functions, and added a op_is_write
function instead.
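
Item 3's op_is_write() can be sketched as follows. The op numbering
(write-type ops odd, read-type ops even) is an assumption made for
illustration; it is the convention such a helper relies on so that the
data direction is just the low bit of the op:

```c
/* Assumed numbering: read-type ops even, write-type ops odd. */
enum req_op { REQ_OP_READ = 0, REQ_OP_WRITE = 1, REQ_OP_DISCARD = 3,
	      REQ_OP_WRITE_SAME = 5 };

/* Single direction helper replacing the bio/rq_data_dir pair. */
static int op_is_write(unsigned int op)
{
	return op & 1;	/* direction is the low bit of the op */
}
```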



Diffstat for the set:

Documentation/block/writeback_cache_control.txt |   28 +++---
 Documentation/device-mapper/log-writes.txt  |   10 +-
 arch/um/drivers/ubd_kern.c  |2
 block/bio.c |   20 ++--
 block/blk-core.c|  105 +++
 block/blk-flush.c   |   25 ++---
 block/blk-lib.c |   37 
 block/blk-map.c |2
 block/blk-merge.c   |   24 ++---
 block/blk-mq.c  |   42 -
 block/cfq-iosched.c |   55 +++-
 block/elevator.c|7 -
 drivers/ata/libata-scsi.c   |2
 drivers/block/brd.c |2
 drivers/block/drbd/drbd_actlog.c|   34 ---
 drivers/block/drbd/drbd_bitmap.c|   10 +-
 drivers/block/drbd/drbd_int.h   |4
 drivers/block/drbd/drbd_main.c  |   22 ++--
 drivers/block/drbd/drbd_protocol.h  |2
 drivers/block/drbd/drbd_receiver.c  |   38 +---
 drivers/block/drbd/drbd_req.c   |2
 drivers/block/drbd/drbd_worker.c|7 -
 drivers/block/floppy.c  |5 -
 drivers/block/loop.c|   16 +--
 drivers/block/mtip32xx/mtip32xx.c   |2
 drivers/block/nbd.c |4
 drivers/block/osdblk.c  |2
 drivers/block/pktcdvd.c |4
 drivers/block/ps3disk.c |4
 drivers/block/rbd.c |4
 drivers/block/rsxx/dma.c|2
 drivers/block/skd_main.c|2
 drivers/block/umem.c|2
 drivers/block/virtio_blk.c  |2
 drivers/block/xen-blkback/blkback.c |   31 --
 drivers/block/xen-blkfront.c|   67 +++
 drivers/block/zram/zram_drv.c   |2
 drivers/ide/ide-cd_ioctl.c  |3
 drivers/ide/ide-disk.c  |2
 drivers/ide/ide-floppy.c|2
 drivers/lightnvm/rrpc.c |6 -
 drivers/md/bcache/btree.c   |4
 drivers/md/bcache/debug.c   |   10 +-
 drivers/md/bcache/io.c  |2
 drivers/md/bcache/journal.c |   11 +-
 drivers/md/bcache/movinggc.c|2
 drivers/md/bcache/request.c |   28 

[PATCH 18/45] hfsplus: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have hfsplus
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 fs/hfsplus/hfsplus_fs.h |  2 +-
 fs/hfsplus/part_tbl.c   |  5 +++--
 fs/hfsplus/super.c  |  6 --
 fs/hfsplus/wrapper.c| 14 --
 4 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/fs/hfsplus/hfsplus_fs.h b/fs/hfsplus/hfsplus_fs.h
index fdc3446..047245b 100644
--- a/fs/hfsplus/hfsplus_fs.h
+++ b/fs/hfsplus/hfsplus_fs.h
@@ -526,7 +526,7 @@ int hfsplus_compare_dentry(const struct dentry *parent,
 
 /* wrapper.c */
 int hfsplus_submit_bio(struct super_block *sb, sector_t sector, void *buf,
-  void **data, int rw);
+  void **data, int op, int op_flags);
 int hfsplus_read_wrapper(struct super_block *sb);
 
 /* time macros */
diff --git a/fs/hfsplus/part_tbl.c b/fs/hfsplus/part_tbl.c
index eb355d8..63164eb 100644
--- a/fs/hfsplus/part_tbl.c
+++ b/fs/hfsplus/part_tbl.c
@@ -112,7 +112,8 @@ static int hfs_parse_new_pmap(struct super_block *sb, void 
*buf,
if ((u8 *)pm - (u8 *)buf >= buf_size) {
res = hfsplus_submit_bio(sb,
 *part_start + HFS_PMAP_BLK + i,
-buf, (void **)&pm, READ);
+buf, (void **)&pm, REQ_OP_READ,
+0);
if (res)
return res;
}
@@ -136,7 +137,7 @@ int hfs_part_find(struct super_block *sb,
return -ENOMEM;
 
res = hfsplus_submit_bio(sb, *part_start + HFS_PMAP_BLK,
-buf, &data, READ);
+buf, &data, REQ_OP_READ, 0);
if (res)
goto out;
 
diff --git a/fs/hfsplus/super.c b/fs/hfsplus/super.c
index 755bf30..11854dd 100644
--- a/fs/hfsplus/super.c
+++ b/fs/hfsplus/super.c
@@ -220,7 +220,8 @@ static int hfsplus_sync_fs(struct super_block *sb, int wait)
 
error2 = hfsplus_submit_bio(sb,
   sbi->part_start + HFSPLUS_VOLHEAD_SECTOR,
-  sbi->s_vhdr_buf, NULL, WRITE_SYNC);
+  sbi->s_vhdr_buf, NULL, REQ_OP_WRITE,
+  WRITE_SYNC);
if (!error)
error = error2;
if (!write_backup)
@@ -228,7 +229,8 @@ static int hfsplus_sync_fs(struct super_block *sb, int wait)
 
error2 = hfsplus_submit_bio(sb,
  sbi->part_start + sbi->sect_count - 2,
- sbi->s_backup_vhdr_buf, NULL, WRITE_SYNC);
+ sbi->s_backup_vhdr_buf, NULL, REQ_OP_WRITE,
+ WRITE_SYNC);
if (!error)
error2 = error;
 out:
diff --git a/fs/hfsplus/wrapper.c b/fs/hfsplus/wrapper.c
index d026bb3..ebb85e5 100644
--- a/fs/hfsplus/wrapper.c
+++ b/fs/hfsplus/wrapper.c
@@ -30,7 +30,8 @@ struct hfsplus_wd {
  * @sector: block to read or write, for blocks of HFSPLUS_SECTOR_SIZE bytes
  * @buf: buffer for I/O
  * @data: output pointer for location of requested data
- * @rw: direction of I/O
+ * @op: direction of I/O
+ * @op_flags: request op flags
  *
  * The unit of I/O is hfsplus_min_io_size(sb), which may be bigger than
  * HFSPLUS_SECTOR_SIZE, and @buf must be sized accordingly. On reads
@@ -44,7 +45,7 @@ struct hfsplus_wd {
  * will work correctly.
  */
 int hfsplus_submit_bio(struct super_block *sb, sector_t sector,
-   void *buf, void **data, int rw)
+  void *buf, void **data, int op, int op_flags)
 {
struct bio *bio;
int ret = 0;
@@ -65,9 +66,9 @@ int hfsplus_submit_bio(struct super_block *sb, sector_t 
sector,
bio = bio_alloc(GFP_NOIO, 1);
bio->bi_iter.bi_sector = sector;
bio->bi_bdev = sb->s_bdev;
-   bio->bi_rw = rw;
+   bio_set_op_attrs(bio, op, op_flags);
 
-   if (!(rw & WRITE) && data)
+   if (op != WRITE && data)
*data = (u8 *)buf + offset;
 
while (io_size > 0) {
@@ -182,7 +183,7 @@ int hfsplus_read_wrapper(struct super_block *sb)
 reread:
error = hfsplus_submit_bio(sb, part_start + HFSPLUS_VOLHEAD_SECTOR,
sbi->s_vhdr_buf, (void **)&sbi->s_vhdr,
-  READ);
+  REQ_OP_READ, 0);
if (error)
goto out_free_backup_vhdr;
 
@@ -214,7 +215,8 @@ reread:
 
error = hfsplus_submit_bio(sb, part_start + part_size - 2,
   sbi->s_backup_vhdr_buf,
-  (void **)&sbi->s_backup_vhdr, READ);
+

[PATCH 30/45] block: copy bio op to request op

2016-06-05 Thread mchristi
From: Mike Christie 

The bio users should now always be setting up the bio op. This patch
has the block layer copy that to the request.

Signed-off-by: Mike Christie 
---
 block/blk-core.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 7e943dc..3c45254 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2976,8 +2976,7 @@ EXPORT_SYMBOL_GPL(__blk_end_request_err);
 void blk_rq_bio_prep(struct request_queue *q, struct request *rq,
 struct bio *bio)
 {
-   /* Bit 0 (R/W) is identical in rq->cmd_flags and bio->bi_rw */
-   rq->cmd_flags |= bio->bi_rw & REQ_WRITE;
+   req_set_op(rq, bio_op(bio));
 
if (bio_has_data(bio))
rq->nr_phys_segments = bio_phys_segments(q, bio);
@@ -3062,7 +3061,8 @@ EXPORT_SYMBOL_GPL(blk_rq_unprep_clone);
 static void __blk_rq_prep_clone(struct request *dst, struct request *src)
 {
dst->cpu = src->cpu;
-   dst->cmd_flags |= (src->cmd_flags & REQ_CLONE_MASK) | REQ_NOMERGE;
+   req_set_op_attrs(dst, req_op(src),
+(src->cmd_flags & REQ_CLONE_MASK) | REQ_NOMERGE);
dst->cmd_type = src->cmd_type;
dst->__sector = blk_rq_pos(src);
dst->__data_len = blk_rq_bytes(src);
-- 
2.7.2



[PATCH 32/45] block: prepare mq request creation to use REQ_OPs

2016-06-05 Thread mchristi
From: Mike Christie 

This patch modifies the blk mq request creation code to use
separate variables for the operation and flags, because in the
next patches the struct request users will be converted, as was
done for bios.
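
The (op | op_flags) argument to rw_is_sync() in the hunk below is a
transitional trick: it only works while the REQ_OP_WRITE value still
lines up with the old REQ_WRITE direction bit. A minimal sketch, with
all bit values assumed for illustration:

```c
/* Assumed bit values for illustration only. */
#define REQ_WRITE	(1u << 0)	/* old-style direction bit */
#define REQ_SYNC	(1u << 4)
#define REQ_OP_READ	0u
#define REQ_OP_WRITE	1u		/* deliberately aliases REQ_WRITE */

/* Reads are always treated as synchronous; writes only with REQ_SYNC. */
static int rw_is_sync(unsigned int rw_flags)
{
	return !(rw_flags & REQ_WRITE) || (rw_flags & REQ_SYNC);
}
```

Once the struct request users are fully converted, the caller can pass
the op and flags separately instead of OR-ing them back together.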

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 block/blk-mq.c | 30 --
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 29cbc1b..3393f29 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -159,16 +159,17 @@ bool blk_mq_can_queue(struct blk_mq_hw_ctx *hctx)
 EXPORT_SYMBOL(blk_mq_can_queue);
 
 static void blk_mq_rq_ctx_init(struct request_queue *q, struct blk_mq_ctx *ctx,
-  struct request *rq, unsigned int rw_flags)
+  struct request *rq, int op,
+  unsigned int op_flags)
 {
if (blk_queue_io_stat(q))
-   rw_flags |= REQ_IO_STAT;
+   op_flags |= REQ_IO_STAT;
 
INIT_LIST_HEAD(&rq->queuelist);
/* csd/requeue_work/fifo_time is initialized before use */
rq->q = q;
rq->mq_ctx = ctx;
-   rq->cmd_flags |= rw_flags;
+   req_set_op_attrs(rq, op, op_flags);
/* do not touch atomic flags, it needs atomic ops against the timer */
rq->cpu = -1;
INIT_HLIST_NODE(&rq->hash);
@@ -203,11 +204,11 @@ static void blk_mq_rq_ctx_init(struct request_queue *q, 
struct blk_mq_ctx *ctx,
rq->end_io_data = NULL;
rq->next_rq = NULL;
 
-   ctx->rq_dispatched[rw_is_sync(rw_flags)]++;
+   ctx->rq_dispatched[rw_is_sync(op | op_flags)]++;
 }
 
 static struct request *
-__blk_mq_alloc_request(struct blk_mq_alloc_data *data, int rw)
+__blk_mq_alloc_request(struct blk_mq_alloc_data *data, int op, int op_flags)
 {
struct request *rq;
unsigned int tag;
@@ -222,7 +223,7 @@ __blk_mq_alloc_request(struct blk_mq_alloc_data *data, int 
rw)
}
 
rq->tag = tag;
-   blk_mq_rq_ctx_init(data->q, data->ctx, rq, rw);
+   blk_mq_rq_ctx_init(data->q, data->ctx, rq, op, op_flags);
return rq;
}
 
@@ -246,7 +247,7 @@ struct request *blk_mq_alloc_request(struct request_queue 
*q, int rw,
hctx = q->mq_ops->map_queue(q, ctx->cpu);
blk_mq_set_alloc_data(&alloc_data, q, flags, ctx, hctx);
 
-   rq = __blk_mq_alloc_request(&alloc_data, rw);
+   rq = __blk_mq_alloc_request(&alloc_data, rw, 0);
if (!rq && !(flags & BLK_MQ_REQ_NOWAIT)) {
__blk_mq_run_hw_queue(hctx);
blk_mq_put_ctx(ctx);
@@ -254,7 +255,7 @@ struct request *blk_mq_alloc_request(struct request_queue 
*q, int rw,
ctx = blk_mq_get_ctx(q);
hctx = q->mq_ops->map_queue(q, ctx->cpu);
blk_mq_set_alloc_data(&alloc_data, q, flags, ctx, hctx);
-   rq =  __blk_mq_alloc_request(&alloc_data, rw);
+   rq =  __blk_mq_alloc_request(&alloc_data, rw, 0);
ctx = alloc_data.ctx;
}
blk_mq_put_ctx(ctx);
@@ -1169,7 +1170,8 @@ static struct request *blk_mq_map_request(struct 
request_queue *q,
struct blk_mq_hw_ctx *hctx;
struct blk_mq_ctx *ctx;
struct request *rq;
-   int rw = bio_data_dir(bio);
+   int op = bio_data_dir(bio);
+   int op_flags = 0;
struct blk_mq_alloc_data alloc_data;
 
blk_queue_enter_live(q);
@@ -1177,20 +1179,20 @@ static struct request *blk_mq_map_request(struct 
request_queue *q,
hctx = q->mq_ops->map_queue(q, ctx->cpu);
 
if (rw_is_sync(bio->bi_rw))
-   rw |= REQ_SYNC;
+   op_flags |= REQ_SYNC;
 
-   trace_block_getrq(q, bio, rw);
+   trace_block_getrq(q, bio, op);
	blk_mq_set_alloc_data(&alloc_data, q, BLK_MQ_REQ_NOWAIT, ctx, hctx);
-   rq = __blk_mq_alloc_request(&alloc_data, rw);
+   rq = __blk_mq_alloc_request(&alloc_data, op, op_flags);
if (unlikely(!rq)) {
__blk_mq_run_hw_queue(hctx);
blk_mq_put_ctx(ctx);
-   trace_block_sleeprq(q, bio, rw);
+   trace_block_sleeprq(q, bio, op);
 
ctx = blk_mq_get_ctx(q);
hctx = q->mq_ops->map_queue(q, ctx->cpu);
	blk_mq_set_alloc_data(&alloc_data, q, 0, ctx, hctx);
-   rq = __blk_mq_alloc_request(&alloc_data, rw);
+   rq = __blk_mq_alloc_request(&alloc_data, op, op_flags);
ctx = alloc_data.ctx;
hctx = alloc_data.hctx;
}
-- 
2.7.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 22/45] pm: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have the pm code
set/get the bio using bio_set_op_attrs/bio_op.
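The series carries the operation (REQ_OP_READ, REQ_OP_WRITE, ...) and the rq_flag_bits modifiers separately instead of one mixed bitmap. As a rough user-space sketch of the accessor pattern (simplified struct and illustrative flag values; the real definitions live in include/linux/blk_types.h and include/linux/bio.h):

```c
#include <assert.h>

/* Simplified model of the new accessors; values are illustrative,
 * not the kernel's actual encodings. */
enum req_op {
	REQ_OP_READ  = 0,
	REQ_OP_WRITE = 1,
};
#define REQ_OP_MASK	0xffU
#define REQ_SYNC	(1U << 8)	/* illustrative flag value */

struct bio {
	unsigned int bi_opf;	/* op in the low bits, rq_flag_bits above */
};

/* Model of bio_set_op_attrs(): record the op and its modifier flags. */
void bio_set_op_attrs(struct bio *bio, unsigned int op, unsigned int op_flags)
{
	bio->bi_opf = op | op_flags;
}

/* Model of bio_op(): recover just the operation, ignoring the flags. */
unsigned int bio_op(const struct bio *bio)
{
	return bio->bi_opf & REQ_OP_MASK;
}
```

This is why callers like hib_submit_io() below gain separate op and op_flags arguments, and why completion paths can test bio_op() instead of masking bi_rw by hand.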

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 kernel/power/swap.c | 30 ++
 1 file changed, 18 insertions(+), 12 deletions(-)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index be227f5..c1aaac4 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -261,7 +261,7 @@ static void hib_end_io(struct bio *bio)
bio_put(bio);
 }
 
-static int hib_submit_io(int rw, pgoff_t page_off, void *addr,
+static int hib_submit_io(int op, int op_flags, pgoff_t page_off, void *addr,
struct hib_bio_batch *hb)
 {
struct page *page = virt_to_page(addr);
@@ -271,7 +271,7 @@ static int hib_submit_io(int rw, pgoff_t page_off, void 
*addr,
bio = bio_alloc(__GFP_RECLAIM | __GFP_HIGH, 1);
bio->bi_iter.bi_sector = page_off * (PAGE_SIZE >> 9);
bio->bi_bdev = hib_resume_bdev;
-   bio->bi_rw = rw;
+   bio_set_op_attrs(bio, op, op_flags);
 
if (bio_add_page(bio, page, PAGE_SIZE, 0) < PAGE_SIZE) {
printk(KERN_ERR "PM: Adding page to bio failed at %llu\n",
@@ -307,7 +307,8 @@ static int mark_swapfiles(struct swap_map_handle *handle, 
unsigned int flags)
 {
int error;
 
-   hib_submit_io(READ_SYNC, swsusp_resume_block, swsusp_header, NULL);
+   hib_submit_io(REQ_OP_READ, READ_SYNC, swsusp_resume_block,
+ swsusp_header, NULL);
if (!memcmp("SWAP-SPACE",swsusp_header->sig, 10) ||
!memcmp("SWAPSPACE2",swsusp_header->sig, 10)) {
memcpy(swsusp_header->orig_sig,swsusp_header->sig, 10);
@@ -316,8 +317,8 @@ static int mark_swapfiles(struct swap_map_handle *handle, 
unsigned int flags)
swsusp_header->flags = flags;
if (flags & SF_CRC32_MODE)
swsusp_header->crc32 = handle->crc32;
-   error = hib_submit_io(WRITE_SYNC, swsusp_resume_block,
-   swsusp_header, NULL);
+   error = hib_submit_io(REQ_OP_WRITE, WRITE_SYNC,
+ swsusp_resume_block, swsusp_header, NULL);
} else {
printk(KERN_ERR "PM: Swap header not found!\n");
error = -ENODEV;
@@ -390,7 +391,7 @@ static int write_page(void *buf, sector_t offset, struct 
hib_bio_batch *hb)
} else {
src = buf;
}
-   return hib_submit_io(WRITE_SYNC, offset, src, hb);
+   return hib_submit_io(REQ_OP_WRITE, WRITE_SYNC, offset, src, hb);
 }
 
 static void release_swap_writer(struct swap_map_handle *handle)
@@ -993,7 +994,8 @@ static int get_swap_reader(struct swap_map_handle *handle,
return -ENOMEM;
}
 
-   error = hib_submit_io(READ_SYNC, offset, tmp->map, NULL);
+   error = hib_submit_io(REQ_OP_READ, READ_SYNC, offset,
+ tmp->map, NULL);
if (error) {
release_swap_reader(handle);
return error;
@@ -1017,7 +1019,7 @@ static int swap_read_page(struct swap_map_handle *handle, 
void *buf,
offset = handle->cur->entries[handle->k];
if (!offset)
return -EFAULT;
-   error = hib_submit_io(READ_SYNC, offset, buf, hb);
+   error = hib_submit_io(REQ_OP_READ, READ_SYNC, offset, buf, hb);
if (error)
return error;
if (++handle->k >= MAP_PAGE_ENTRIES) {
@@ -1526,7 +1528,8 @@ int swsusp_check(void)
if (!IS_ERR(hib_resume_bdev)) {
set_blocksize(hib_resume_bdev, PAGE_SIZE);
clear_page(swsusp_header);
-   error = hib_submit_io(READ_SYNC, swsusp_resume_block,
+   error = hib_submit_io(REQ_OP_READ, READ_SYNC,
+   swsusp_resume_block,
swsusp_header, NULL);
if (error)
goto put;
@@ -1534,7 +1537,8 @@ int swsusp_check(void)
if (!memcmp(HIBERNATE_SIG, swsusp_header->sig, 10)) {
memcpy(swsusp_header->sig, swsusp_header->orig_sig, 10);
/* Reset swap signature now */
-   error = hib_submit_io(WRITE_SYNC, swsusp_resume_block,
+   error = hib_submit_io(REQ_OP_WRITE, WRITE_SYNC,
+   swsusp_resume_block,
swsusp_header, NULL);
} else {
error = -EINVAL;
@@ -1578,10 +1582,12 @@ int swsusp_unmark(void)
 {
int error;
 
-   hib_submit_io(READ_SYNC, swsusp_resume_block, swsusp_header, NULL);
+   hib_submit_io(REQ_OP_READ, READ_SYNC, 

[PATCH 06/45] dm: use op_is_write instead of checking for REQ_WRITE

2016-06-05 Thread mchristi
From: Mike Christie 

We currently set REQ_WRITE/WRITE for all non-READ IOs,
like discard, flush, writesame, etc. In the later patches, where we
no longer set up the op as a bitmap, we will not be able to
detect the direction of an operation like writesame by testing
whether REQ_WRITE is set.

This has dm use the op_is_write helper which will do the right
thing.
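One way to see why the helper is needed: once ops are enumerated rather than flag bits, direction can no longer be read off REQ_WRITE. The sketch below models op_is_write() as a parity check over enumerated op codes (the odd-codes-for-writes encoding matches where the kernel eventually landed, but the values here are shown for illustration only, not taken from this series):

```c
#include <assert.h>

/* Illustrative op codes: data-out ("write direction") ops are odd. */
enum req_op {
	REQ_OP_READ	  = 0,
	REQ_OP_WRITE	  = 1,
	REQ_OP_DISCARD	  = 3,
	REQ_OP_WRITE_SAME = 7,
};

/* Model of op_is_write(): true for every write-direction op, even
 * ones that never carried the legacy REQ_WRITE bit. */
int op_is_write(unsigned int op)
{
	return op & 1;
}
```

A check like `(rw & RW_MASK) != WRITE` only recognizes the plain WRITE op, so a discard or writesame would be misclassified; op_is_write() stays correct for all of them.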

Signed-off-by: Mike Christie 
---
 drivers/md/dm-io.c | 4 ++--
 drivers/md/dm-kcopyd.c | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 50f17e3..26e9a85 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -409,7 +409,7 @@ static int sync_io(struct dm_io_client *client, unsigned 
int num_regions,
struct io *io;
struct sync_io sio;
 
-   if (num_regions > 1 && (rw & RW_MASK) != WRITE) {
+   if (num_regions > 1 && !op_is_write(rw)) {
WARN_ON(1);
return -EIO;
}
@@ -442,7 +442,7 @@ static int async_io(struct dm_io_client *client, unsigned 
int num_regions,
 {
struct io *io;
 
-   if (num_regions > 1 && (rw & RW_MASK) != WRITE) {
+   if (num_regions > 1 && !op_is_write(rw)) {
WARN_ON(1);
fn(1, context);
return -EIO;
diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index 1452ed9..9f390e4 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -465,7 +465,7 @@ static void complete_io(unsigned long error, void *context)
io_job_finish(kc->throttle);
 
if (error) {
-   if (job->rw & WRITE)
+   if (op_is_write(job->rw))
job->write_err |= error;
else
job->read_err = 1;
@@ -477,7 +477,7 @@ static void complete_io(unsigned long error, void *context)
}
}
 
-   if (job->rw & WRITE)
+   if (op_is_write(job->rw))
	push(&kc->complete_jobs, job);
 
else {
@@ -550,7 +550,7 @@ static int process_jobs(struct list_head *jobs, struct 
dm_kcopyd_client *kc,
 
if (r < 0) {
/* error this rogue job */
-   if (job->rw & WRITE)
+   if (op_is_write(job->rw))
job->write_err = (unsigned long) -1L;
else
job->read_err = 1;
-- 
2.7.2



[PATCH 17/45] xfs: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have xfs
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
---

v8:
1. Handled changes due to rebase and dropped signed offs due to
upstream changes since last review.

 fs/xfs/xfs_aops.c | 12 
 fs/xfs/xfs_buf.c  | 26 ++
 2 files changed, 18 insertions(+), 20 deletions(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 0cd1603..87d2b21 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -438,10 +438,8 @@ xfs_submit_ioend(
 
ioend->io_bio->bi_private = ioend;
ioend->io_bio->bi_end_io = xfs_end_bio;
-   if (wbc->sync_mode == WB_SYNC_ALL)
-   ioend->io_bio->bi_rw = WRITE_SYNC;
-   else
-   ioend->io_bio->bi_rw = WRITE;
+   bio_set_op_attrs(ioend->io_bio, REQ_OP_WRITE,
+(wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : 0);
/*
 * If we are failing the IO now, just mark the ioend with an
 * error and finish it. This will run IO completion immediately
@@ -512,10 +510,8 @@ xfs_chain_bio(
 
bio_chain(ioend->io_bio, new);
bio_get(ioend->io_bio); /* for xfs_destroy_ioend */
-   if (wbc->sync_mode == WB_SYNC_ALL)
-   ioend->io_bio->bi_rw = WRITE_SYNC;
-   else
-   ioend->io_bio->bi_rw = WRITE;
+   bio_set_op_attrs(ioend->io_bio, REQ_OP_WRITE,
+ (wbc->sync_mode == WB_SYNC_ALL) ? WRITE_SYNC : 0);
submit_bio(ioend->io_bio);
ioend->io_bio = new;
 }
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 0777c67..d8acd37 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1127,7 +1127,8 @@ xfs_buf_ioapply_map(
int map,
int *buf_offset,
int *count,
-   int rw)
+   int op,
+   int op_flags)
 {
int page_index;
int total_nr_pages = bp->b_page_count;
@@ -1166,7 +1167,7 @@ next_chunk:
bio->bi_iter.bi_sector = sector;
bio->bi_end_io = xfs_buf_bio_end_io;
bio->bi_private = bp;
-   bio->bi_rw = rw;
+   bio_set_op_attrs(bio, op, op_flags);
 
for (; size && nr_pages; nr_pages--, page_index++) {
int rbytes, nbytes = PAGE_SIZE - offset;
@@ -1210,7 +1211,8 @@ _xfs_buf_ioapply(
struct xfs_buf  *bp)
 {
struct blk_plug plug;
-   int rw;
+   int op;
+   int op_flags = 0;
int offset;
int size;
int i;
@@ -1229,14 +1231,13 @@ _xfs_buf_ioapply(
bp->b_ioend_wq = bp->b_target->bt_mount->m_buf_workqueue;
 
if (bp->b_flags & XBF_WRITE) {
+   op = REQ_OP_WRITE;
if (bp->b_flags & XBF_SYNCIO)
-   rw = WRITE_SYNC;
-   else
-   rw = WRITE;
+   op_flags = WRITE_SYNC;
if (bp->b_flags & XBF_FUA)
-   rw |= REQ_FUA;
+   op_flags |= REQ_FUA;
if (bp->b_flags & XBF_FLUSH)
-   rw |= REQ_FLUSH;
+   op_flags |= REQ_FLUSH;
 
/*
 * Run the write verifier callback function if it exists. If
@@ -1266,13 +1267,14 @@ _xfs_buf_ioapply(
}
}
} else if (bp->b_flags & XBF_READ_AHEAD) {
-   rw = READA;
+   op = REQ_OP_READ;
+   op_flags = REQ_RAHEAD;
} else {
-   rw = READ;
+   op = REQ_OP_READ;
}
 
/* we only use the buffer cache for meta-data */
-   rw |= REQ_META;
+   op_flags |= REQ_META;
 
/*
 * Walk all the vectors issuing IO on them. Set up the initial offset
@@ -1284,7 +1286,7 @@ _xfs_buf_ioapply(
size = BBTOB(bp->b_io_length);
	blk_start_plug(&plug);
for (i = 0; i < bp->b_map_count; i++) {
-   xfs_buf_ioapply_map(bp, i, &offset, &count, rw);
+   xfs_buf_ioapply_map(bp, i, &offset, &count, op, op_flags);
if (bp->b_error)
break;
if (size <= 0)
-- 
2.7.2



[PATCH 27/45] md: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have md
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 drivers/md/bitmap.c  |  2 +-
 drivers/md/dm-raid.c |  5 +++--
 drivers/md/linear.c  |  2 +-
 drivers/md/md.c  | 12 ++--
 drivers/md/md.h  |  3 ++-
 drivers/md/raid0.c   |  2 +-
 drivers/md/raid1.c   | 32 +++-
 drivers/md/raid10.c  | 48 +++-
 drivers/md/raid5-cache.c | 26 --
 drivers/md/raid5.c   | 40 
 10 files changed, 88 insertions(+), 84 deletions(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index bc6dced..6fff794 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -162,7 +162,7 @@ static int read_sb_page(struct mddev *mddev, loff_t offset,
 
if (sync_page_io(rdev, target,
 roundup(size, 
bdev_logical_block_size(rdev->bdev)),
-page, READ, true)) {
+page, REQ_OP_READ, 0, true)) {
page->index = index;
return 0;
}
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index 5253274..8cbac62 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -792,7 +792,7 @@ static int read_disk_sb(struct md_rdev *rdev, int size)
if (rdev->sb_loaded)
return 0;
 
-   if (!sync_page_io(rdev, 0, size, rdev->sb_page, READ, 1)) {
+   if (!sync_page_io(rdev, 0, size, rdev->sb_page, REQ_OP_READ, 0, 1)) {
DMERR("Failed to read superblock of device at position %d",
  rdev->raid_disk);
md_error(rdev->mddev, rdev);
@@ -1651,7 +1651,8 @@ static void attempt_restore_of_faulty_devices(struct 
raid_set *rs)
for (i = 0; i < rs->md.raid_disks; i++) {
	r = &rs->dev[i].rdev;
if (test_bit(Faulty, >flags) && r->sb_page &&
-   sync_page_io(r, 0, r->sb_size, r->sb_page, READ, 1)) {
+   sync_page_io(r, 0, r->sb_size, r->sb_page, REQ_OP_READ, 0,
+1)) {
DMINFO("Faulty %s device #%d has readable super block."
   "  Attempting to revive it.",
   rs->raid_type->name, i);
diff --git a/drivers/md/linear.c b/drivers/md/linear.c
index b7fe7e9..1ad3f48 100644
--- a/drivers/md/linear.c
+++ b/drivers/md/linear.c
@@ -252,7 +252,7 @@ static void linear_make_request(struct mddev *mddev, struct 
bio *bio)
split->bi_iter.bi_sector = split->bi_iter.bi_sector -
start_sector + data_offset;
 
-   if (unlikely((split->bi_rw & REQ_DISCARD) &&
+   if (unlikely((bio_op(split) == REQ_OP_DISCARD) &&
 !blk_queue_discard(bdev_get_queue(split->bi_bdev {
/* Just ignore it */
bio_endio(split);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index fb3950b..bd4844f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -394,7 +394,7 @@ static void submit_flushes(struct work_struct *ws)
bi->bi_end_io = md_end_flush;
bi->bi_private = rdev;
bi->bi_bdev = rdev->bdev;
-   bi->bi_rw = WRITE_FLUSH;
+   bio_set_op_attrs(bi, REQ_OP_WRITE, WRITE_FLUSH);
	atomic_inc(&mddev->flush_pending);
submit_bio(bi);
rcu_read_lock();
@@ -743,7 +743,7 @@ void md_super_write(struct mddev *mddev, struct md_rdev 
*rdev,
bio_add_page(bio, page, size, 0);
bio->bi_private = rdev;
bio->bi_end_io = super_written;
-   bio->bi_rw = WRITE_FLUSH_FUA;
+   bio_set_op_attrs(bio, REQ_OP_WRITE, WRITE_FLUSH_FUA);
 
	atomic_inc(&rdev->mddev->pending_writes);
submit_bio(bio);
@@ -756,14 +756,14 @@ void md_super_wait(struct mddev *mddev)
 }
 
 int sync_page_io(struct md_rdev *rdev, sector_t sector, int size,
-struct page *page, int rw, bool metadata_op)
+struct page *page, int op, int op_flags, bool metadata_op)
 {
struct bio *bio = bio_alloc_mddev(GFP_NOIO, 1, rdev->mddev);
int ret;
 
bio->bi_bdev = (metadata_op && rdev->meta_bdev) ?
rdev->meta_bdev : rdev->bdev;
-   bio->bi_rw = rw;
+   bio_set_op_attrs(bio, op, op_flags);
if (metadata_op)
bio->bi_iter.bi_sector = sector + rdev->sb_start;
else if (rdev->mddev->reshape_position != MaxSector &&
@@ -789,7 +789,7 @@ static int read_disk_sb(struct md_rdev *rdev, int size)
if 

[PATCH 24/45] dm: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have dm
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 

v8:
- Moved op_is_write changes to its own patch.
- Dropped signed offs due to changes in dm.

---
 drivers/md/dm-bufio.c   |  8 +++---
 drivers/md/dm-cache-target.c| 10 +---
 drivers/md/dm-crypt.c   |  8 +++---
 drivers/md/dm-io.c  | 56 ++---
 drivers/md/dm-kcopyd.c  |  5 ++--
 drivers/md/dm-log-writes.c  |  8 +++---
 drivers/md/dm-log.c |  5 ++--
 drivers/md/dm-raid1.c   | 19 --
 drivers/md/dm-region-hash.c |  4 +--
 drivers/md/dm-snap-persistent.c | 24 ++
 drivers/md/dm-stripe.c  |  4 +--
 drivers/md/dm-thin.c| 17 +++--
 drivers/md/dm.c |  8 +++---
 include/linux/dm-io.h   |  3 ++-
 14 files changed, 99 insertions(+), 80 deletions(-)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index 9d3ee7f..6571c81 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -574,7 +574,8 @@ static void use_dmio(struct dm_buffer *b, int rw, sector_t 
block,
 {
int r;
struct dm_io_request io_req = {
-   .bi_rw = rw,
+   .bi_op = rw,
+   .bi_op_flags = 0,
.notify.fn = dmio_complete,
.notify.context = b,
.client = b->c->dm_io,
@@ -634,7 +635,7 @@ static void use_inline_bio(struct dm_buffer *b, int rw, 
sector_t block,
 * the dm_buffer's inline bio is local to bufio.
 */
b->bio.bi_private = end_io;
-   b->bio.bi_rw = rw;
	bio_set_op_attrs(&b->bio, rw, 0);
 
/*
 * We assume that if len >= PAGE_SIZE ptr is page-aligned.
@@ -1327,7 +1328,8 @@ EXPORT_SYMBOL_GPL(dm_bufio_write_dirty_buffers);
 int dm_bufio_issue_flush(struct dm_bufio_client *c)
 {
struct dm_io_request io_req = {
-   .bi_rw = WRITE_FLUSH,
+   .bi_op = REQ_OP_WRITE,
+   .bi_op_flags = WRITE_FLUSH,
.mem.type = DM_IO_KMEM,
.mem.ptr.addr = NULL,
.client = c->dm_io,
diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index ee0510f..540e80e 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -788,7 +788,8 @@ static void check_if_tick_bio_needed(struct cache *cache, 
struct bio *bio)
 
	spin_lock_irqsave(&cache->lock, flags);
if (cache->need_tick_bio &&
-   !(bio->bi_rw & (REQ_FUA | REQ_FLUSH | REQ_DISCARD))) {
+   !(bio->bi_rw & (REQ_FUA | REQ_FLUSH)) &&
+   bio_op(bio) != REQ_OP_DISCARD) {
pb->tick = true;
cache->need_tick_bio = false;
}
@@ -851,7 +852,7 @@ static void inc_ds(struct cache *cache, struct bio *bio,
 static bool accountable_bio(struct cache *cache, struct bio *bio)
 {
return ((bio->bi_bdev == cache->origin_dev->bdev) &&
-   !(bio->bi_rw & REQ_DISCARD));
+   bio_op(bio) != REQ_OP_DISCARD);
 }
 
 static void accounted_begin(struct cache *cache, struct bio *bio)
@@ -1067,7 +1068,8 @@ static void dec_io_migrations(struct cache *cache)
 
 static bool discard_or_flush(struct bio *bio)
 {
-   return bio->bi_rw & (REQ_FLUSH | REQ_FUA | REQ_DISCARD);
+   return bio_op(bio) == REQ_OP_DISCARD ||
+  bio->bi_rw & (REQ_FLUSH | REQ_FUA);
 }
 
 static void __cell_defer(struct cache *cache, struct dm_bio_prison_cell *cell)
@@ -1980,7 +1982,7 @@ static void process_deferred_bios(struct cache *cache)
 
if (bio->bi_rw & REQ_FLUSH)
process_flush_bio(cache, bio);
-   else if (bio->bi_rw & REQ_DISCARD)
+   else if (bio_op(bio) == REQ_OP_DISCARD)
	process_discard_bio(cache, &structs, bio);
	else
	process_bio(cache, &structs, bio);
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 4f3cb35..057d19b 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1136,7 +1136,7 @@ static void clone_init(struct dm_crypt_io *io, struct bio 
*clone)
clone->bi_private = io;
clone->bi_end_io  = crypt_endio;
clone->bi_bdev= cc->dev->bdev;
-   clone->bi_rw  = io->base_bio->bi_rw;
+   bio_set_op_attrs(clone, bio_op(io->base_bio), io->base_bio->bi_rw);
 }
 
 static int kcryptd_io_read(struct dm_crypt_io *io, gfp_t gfp)
@@ -1911,11 +1911,11 @@ static int crypt_map(struct dm_target *ti, struct bio 
*bio)
struct crypt_config *cc = ti->private;
 
/*
-* If bio is REQ_FLUSH or REQ_DISCARD, just bypass crypt queues.
+* If bio is REQ_FLUSH or REQ_OP_DISCARD, just bypass crypt queues.
 * - for REQ_FLUSH device-mapper core ensures that no IO is in-flight
-* - for REQ_DISCARD caller must 

[PATCH 19/45] mpage: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have the mpage code
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 fs/mpage.c | 40 
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/fs/mpage.c b/fs/mpage.c
index 2c251ec..37b2828 100644
--- a/fs/mpage.c
+++ b/fs/mpage.c
@@ -56,11 +56,11 @@ static void mpage_end_io(struct bio *bio)
bio_put(bio);
 }
 
-static struct bio *mpage_bio_submit(int rw, struct bio *bio)
+static struct bio *mpage_bio_submit(int op, int op_flags, struct bio *bio)
 {
bio->bi_end_io = mpage_end_io;
-   bio->bi_rw = rw;
-   guard_bio_eod(rw, bio);
+   bio_set_op_attrs(bio, op, op_flags);
+   guard_bio_eod(op, bio);
submit_bio(bio);
return NULL;
 }
@@ -270,7 +270,7 @@ do_mpage_readpage(struct bio *bio, struct page *page, 
unsigned nr_pages,
 * This page will go to BIO.  Do we need to send this BIO off first?
 */
if (bio && (*last_block_in_bio != blocks[0] - 1))
-   bio = mpage_bio_submit(READ, bio);
+   bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
 
 alloc_new:
if (bio == NULL) {
@@ -287,7 +287,7 @@ alloc_new:
 
length = first_hole << blkbits;
if (bio_add_page(bio, page, length, 0) < length) {
-   bio = mpage_bio_submit(READ, bio);
+   bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
goto alloc_new;
}
 
@@ -295,7 +295,7 @@ alloc_new:
nblocks = map_bh->b_size >> blkbits;
if ((buffer_boundary(map_bh) && relative_block == nblocks) ||
(first_hole != blocks_per_page))
-   bio = mpage_bio_submit(READ, bio);
+   bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
else
*last_block_in_bio = blocks[blocks_per_page - 1];
 out:
@@ -303,7 +303,7 @@ out:
 
 confused:
if (bio)
-   bio = mpage_bio_submit(READ, bio);
+   bio = mpage_bio_submit(REQ_OP_READ, 0, bio);
if (!PageUptodate(page))
block_read_full_page(page, get_block);
else
@@ -385,7 +385,7 @@ mpage_readpages(struct address_space *mapping, struct 
list_head *pages,
}
BUG_ON(!list_empty(pages));
if (bio)
-   mpage_bio_submit(READ, bio);
+   mpage_bio_submit(REQ_OP_READ, 0, bio);
return 0;
 }
 EXPORT_SYMBOL(mpage_readpages);
@@ -406,7 +406,7 @@ int mpage_readpage(struct page *page, get_block_t get_block)
	bio = do_mpage_readpage(bio, page, 1, &last_block_in_bio,
			&map_bh, &first_logical_block, get_block, gfp);
if (bio)
-   mpage_bio_submit(READ, bio);
+   mpage_bio_submit(REQ_OP_READ, 0, bio);
return 0;
 }
 EXPORT_SYMBOL(mpage_readpage);
@@ -487,7 +487,7 @@ static int __mpage_writepage(struct page *page, struct 
writeback_control *wbc,
struct buffer_head map_bh;
loff_t i_size = i_size_read(inode);
int ret = 0;
-   int wr = (wbc->sync_mode == WB_SYNC_ALL ?  WRITE_SYNC : WRITE);
+   int op_flags = (wbc->sync_mode == WB_SYNC_ALL ?  WRITE_SYNC : 0);
 
if (page_has_buffers(page)) {
struct buffer_head *head = page_buffers(page);
@@ -596,7 +596,7 @@ page_is_mapped:
 * This page will go to BIO.  Do we need to send this BIO off first?
 */
if (bio && mpd->last_block_in_bio != blocks[0] - 1)
-   bio = mpage_bio_submit(wr, bio);
+   bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio);
 
 alloc_new:
if (bio == NULL) {
@@ -623,7 +623,7 @@ alloc_new:
wbc_account_io(wbc, page, PAGE_SIZE);
length = first_unmapped << blkbits;
if (bio_add_page(bio, page, length, 0) < length) {
-   bio = mpage_bio_submit(wr, bio);
+   bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio);
goto alloc_new;
}
 
@@ -633,7 +633,7 @@ alloc_new:
set_page_writeback(page);
unlock_page(page);
if (boundary || (first_unmapped != blocks_per_page)) {
-   bio = mpage_bio_submit(wr, bio);
+   bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio);
if (boundary_block) {
write_boundary_block(boundary_bdev,
boundary_block, 1 << blkbits);
@@ -645,7 +645,7 @@ alloc_new:
 
 confused:
if (bio)
-   bio = mpage_bio_submit(wr, bio);
+   bio = mpage_bio_submit(REQ_OP_WRITE, op_flags, bio);
 
if (mpd->use_writepage) {
ret = mapping->a_ops->writepage(page, wbc);
@@ -702,9 +702,9 @@ mpage_writepages(struct address_space *mapping,
 
	ret = write_cache_pages(mapping, wbc, __mpage_writepage, &mpd);

[PATCH 26/45] drbd: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have drbd
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---

v8:
1. Combined this patch with what was the cleanup/completion
path handling patch so all bio op drbd changes are now only
in this patch.

 drivers/block/drbd/drbd_actlog.c   | 28 +++-
 drivers/block/drbd/drbd_bitmap.c   |  6 +++---
 drivers/block/drbd/drbd_int.h  |  4 ++--
 drivers/block/drbd/drbd_main.c | 20 +++-
 drivers/block/drbd/drbd_receiver.c | 36 
 drivers/block/drbd/drbd_worker.c   |  7 ---
 6 files changed, 59 insertions(+), 42 deletions(-)

diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index 6069e15..f236a31 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -137,19 +137,19 @@ void wait_until_done_or_force_detached(struct drbd_device 
*device, struct drbd_b
 
 static int _drbd_md_sync_page_io(struct drbd_device *device,
 struct drbd_backing_dev *bdev,
-sector_t sector, int rw)
+sector_t sector, int op)
 {
struct bio *bio;
/* we do all our meta data IO in aligned 4k blocks. */
const int size = 4096;
-   int err;
+   int err, op_flags = 0;
 
device->md_io.done = 0;
device->md_io.error = -ENODEV;
 
-   if ((rw & WRITE) && !test_bit(MD_NO_FUA, >flags))
-   rw |= REQ_FUA | REQ_FLUSH;
-   rw |= REQ_SYNC | REQ_NOIDLE;
+   if ((op == REQ_OP_WRITE) && !test_bit(MD_NO_FUA, >flags))
+   op_flags |= REQ_FUA | REQ_FLUSH;
+   op_flags |= REQ_SYNC | REQ_NOIDLE;
 
bio = bio_alloc_drbd(GFP_NOIO);
bio->bi_bdev = bdev->md_bdev;
@@ -159,9 +159,9 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
goto out;
bio->bi_private = device;
bio->bi_end_io = drbd_md_endio;
-   bio->bi_rw = rw;
+   bio_set_op_attrs(bio, op, op_flags);
 
-   if (!(rw & WRITE) && device->state.disk == D_DISKLESS && device->ldev 
== NULL)
+   if (op != REQ_OP_WRITE && device->state.disk == D_DISKLESS && 
device->ldev == NULL)
/* special case, drbd_md_read() during drbd_adm_attach(): no 
get_ldev */
;
else if (!get_ldev_if_state(device, D_ATTACHING)) {
@@ -174,7 +174,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
bio_get(bio); /* one bio_put() is in the completion handler */
	atomic_inc(&device->md_io.in_use); /* drbd_md_put_buffer() is in the 
completion handler */
device->md_io.submit_jif = jiffies;
-   if (drbd_insert_fault(device, (rw & WRITE) ? DRBD_FAULT_MD_WR : 
DRBD_FAULT_MD_RD))
+   if (drbd_insert_fault(device, (op == REQ_OP_WRITE) ? DRBD_FAULT_MD_WR : 
DRBD_FAULT_MD_RD))
bio_io_error(bio);
else
submit_bio(bio);
@@ -188,7 +188,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
 }
 
 int drbd_md_sync_page_io(struct drbd_device *device, struct drbd_backing_dev 
*bdev,
-sector_t sector, int rw)
+sector_t sector, int op)
 {
int err;
D_ASSERT(device, atomic_read(>md_io.in_use) == 1);
@@ -197,19 +197,21 @@ int drbd_md_sync_page_io(struct drbd_device *device, 
struct drbd_backing_dev *bd
 
dynamic_drbd_dbg(device, "meta_data io: %s [%d]:%s(,%llus,%s) %pS\n",
 current->comm, current->pid, __func__,
-(unsigned long long)sector, (rw & WRITE) ? "WRITE" : "READ",
+(unsigned long long)sector, (op == REQ_OP_WRITE) ? "WRITE" : 
"READ",
 (void*)_RET_IP_ );
 
if (sector < drbd_md_first_sector(bdev) ||
sector + 7 > drbd_md_last_sector(bdev))
drbd_alert(device, "%s [%d]:%s(,%llus,%s) out of range md 
access!\n",
 current->comm, current->pid, __func__,
-(unsigned long long)sector, (rw & WRITE) ? "WRITE" : 
"READ");
+(unsigned long long)sector,
+(op == REQ_OP_WRITE) ? "WRITE" : "READ");
 
-   err = _drbd_md_sync_page_io(device, bdev, sector, rw);
+   err = _drbd_md_sync_page_io(device, bdev, sector, op);
if (err) {
drbd_err(device, "drbd_md_sync_page_io(,%llus,%s) failed with 
error %d\n",
-   (unsigned long long)sector, (rw & WRITE) ? "WRITE" : 
"READ", err);
+   (unsigned long long)sector,
+   (op == REQ_OP_WRITE) ? "WRITE" : "READ", err);
}
return err;
 }
diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index e8959fe..e5d89f6 100644
--- 

[PATCH 01/45] block/fs/drivers: remove rw argument from submit_bio

2016-06-05 Thread mchristi
From: Mike Christie 

This has callers of submit_bio/submit_bio_wait set bio->bi_rw
themselves instead of passing it in. That matches how
generic_make_request works and how we set the other bio fields.
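A minimal model of the interface change (names and flag values are illustrative stand-ins, not the kernel's definitions):

```c
#include <assert.h>

#define WRITE		1U
#define REQ_SYNC	(1U << 8)	/* illustrative flag value */

struct bio {
	unsigned int bi_rw;
	unsigned int seen_rw;	/* records what the lower layer observed */
};

/* New-style submit_bio(): one argument; bi_rw was already set by the
 * caller, just as generic_make_request() has always expected. */
void submit_bio(struct bio *bio)
{
	bio->seen_rw = bio->bi_rw;
}

/* Converted submit_bio_wait(): no rw parameter either; it only ORs
 * in REQ_SYNC before submitting. */
void submit_bio_wait(struct bio *bio)
{
	bio->bi_rw |= REQ_SYNC;
	submit_bio(bio);
}
```

Call sites therefore change from `submit_bio(WRITE, bio)` to setting `bio->bi_rw = WRITE;` first and then calling `submit_bio(bio);`.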

Signed-off-by: Mike Christie 
---

v8:
1. Fix bug in xfs code introduced in v6 due to ioend changes.
2. Dropped signed-offs due to so many upstream changes since
last review.

v5:
1. Missed crypto fs submit_bio_wait call.

v2:

1. Set bi_rw instead of ORing it. For cloned bios, I still OR it
to keep the old behavior in case there are bits we wanted to keep.

 block/bio.c |  7 +++
 block/blk-core.c| 11 ---
 block/blk-flush.c   |  3 ++-
 block/blk-lib.c | 20 +++-
 drivers/block/drbd/drbd_actlog.c|  2 +-
 drivers/block/drbd/drbd_bitmap.c|  4 ++--
 drivers/block/floppy.c  |  3 ++-
 drivers/block/xen-blkback/blkback.c |  4 +++-
 drivers/block/xen-blkfront.c|  4 ++--
 drivers/md/bcache/debug.c   |  6 --
 drivers/md/bcache/journal.c |  2 +-
 drivers/md/bcache/super.c   |  4 ++--
 drivers/md/dm-bufio.c   |  3 ++-
 drivers/md/dm-io.c  |  3 ++-
 drivers/md/dm-log-writes.c  |  9 ++---
 drivers/md/dm-thin.c|  3 ++-
 drivers/md/md.c | 10 +++---
 drivers/md/raid1.c  |  3 ++-
 drivers/md/raid10.c |  4 +++-
 drivers/md/raid5-cache.c|  7 ---
 drivers/target/target_core_iblock.c | 24 +---
 fs/btrfs/check-integrity.c  | 18 ++
 fs/btrfs/check-integrity.h  |  4 ++--
 fs/btrfs/disk-io.c  |  3 ++-
 fs/btrfs/extent_io.c|  7 ---
 fs/btrfs/raid56.c   | 17 -
 fs/btrfs/scrub.c| 15 ++-
 fs/btrfs/volumes.c  | 14 +++---
 fs/buffer.c |  3 ++-
 fs/crypto/crypto.c  |  3 ++-
 fs/direct-io.c  |  3 ++-
 fs/ext4/page-io.c   |  3 ++-
 fs/ext4/readpage.c  |  9 +
 fs/f2fs/data.c  |  4 +++-
 fs/f2fs/segment.c   |  6 --
 fs/gfs2/lops.c  |  3 ++-
 fs/gfs2/meta_io.c   |  3 ++-
 fs/gfs2/ops_fstype.c|  3 ++-
 fs/hfsplus/wrapper.c|  3 ++-
 fs/jfs/jfs_logmgr.c |  6 --
 fs/jfs/jfs_metapage.c   | 10 ++
 fs/logfs/dev_bdev.c | 15 ++-
 fs/mpage.c  |  3 ++-
 fs/nfs/blocklayout/blocklayout.c| 22 --
 fs/nilfs2/segbuf.c  |  3 ++-
 fs/ocfs2/cluster/heartbeat.c| 12 +++-
 fs/xfs/xfs_aops.c   | 15 ++-
 fs/xfs/xfs_buf.c|  4 ++--
 include/linux/bio.h |  2 +-
 include/linux/fs.h  |  2 +-
 kernel/power/swap.c |  5 +++--
 mm/page_io.c| 10 ++
 52 files changed, 219 insertions(+), 147 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 0e4aa42..fc779eb 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -854,21 +854,20 @@ static void submit_bio_wait_endio(struct bio *bio)
 
 /**
  * submit_bio_wait - submit a bio, and wait until it completes
- * @rw: whether to %READ or %WRITE, or maybe to %READA (read ahead)
  * @bio: The  bio which describes the I/O
  *
  * Simple wrapper around submit_bio(). Returns 0 on success, or the error from
  * bio_endio() on failure.
  */
-int submit_bio_wait(int rw, struct bio *bio)
+int submit_bio_wait(struct bio *bio)
 {
struct submit_bio_ret ret;
 
-   rw |= REQ_SYNC;
	init_completion(&ret.event);
	bio->bi_private = &ret;
bio->bi_end_io = submit_bio_wait_endio;
-   submit_bio(rw, bio);
+   bio->bi_rw |= REQ_SYNC;
+   submit_bio(bio);
	wait_for_completion_io(&ret.event);
 
return ret.error;
diff --git a/block/blk-core.c b/block/blk-core.c
index 2475b1c7..e953407 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2094,7 +2094,6 @@ EXPORT_SYMBOL(generic_make_request);
 
 /**
  * submit_bio - submit a bio to the block device layer for I/O
- * @rw: whether to %READ or %WRITE, or maybe to %READA (read ahead)
  * @bio: The  bio which describes the I/O
  *
  * submit_bio() is very similar in purpose to generic_make_request(), and
@@ -2102,10 +2101,8 @@ EXPORT_SYMBOL(generic_make_request);
  * interfaces; @bio must be presetup and ready for I/O.
  *
  */
-blk_qc_t submit_bio(int rw, struct bio *bio)
+blk_qc_t submit_bio(struct bio *bio)
 {
-   bio->bi_rw |= rw;
-
/*
 * If it's a regular read/write or a barrier with data attached,
 * go through the normal accounting stuff before submission.
@@ -2113,12 +2110,12 

[PATCH 14/45] btrfs: use bio fields for op and flags

2016-06-05 Thread mchristi
From: Mike Christie 

The bio REQ_OP and bi_rw rq_flag_bits are now always set up, so there
is no need to pass the rq_flag_bits around as well. btrfs users
should access the bio instead.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
v2:

1. Fix merge_bio issue where instead of removing rw/op argument
I passed it in again to the merge_bio related functions.

 fs/btrfs/compression.c | 13 ++---
 fs/btrfs/ctree.h   |  2 +-
 fs/btrfs/disk-io.c | 30 --
 fs/btrfs/disk-io.h |  2 +-
 fs/btrfs/extent_io.c   | 12 +---
 fs/btrfs/extent_io.h   |  8 
 fs/btrfs/inode.c   | 41 +++--
 fs/btrfs/volumes.c | 11 +--
 fs/btrfs/volumes.h |  2 +-
 9 files changed, 54 insertions(+), 67 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 029bd79..cefedab 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -374,7 +374,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 
start,
page = compressed_pages[pg_index];
page->mapping = inode->i_mapping;
if (bio->bi_iter.bi_size)
-   ret = io_tree->ops->merge_bio_hook(WRITE, page, 0,
+   ret = io_tree->ops->merge_bio_hook(page, 0,
   PAGE_SIZE,
   bio, 0);
else
@@ -402,7 +402,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 
start,
BUG_ON(ret); /* -ENOMEM */
}
 
-   ret = btrfs_map_bio(root, WRITE, bio, 0, 1);
+   ret = btrfs_map_bio(root, bio, 0, 1);
BUG_ON(ret); /* -ENOMEM */
 
bio_put(bio);
@@ -433,7 +433,7 @@ int btrfs_submit_compressed_write(struct inode *inode, u64 
start,
BUG_ON(ret); /* -ENOMEM */
}
 
-   ret = btrfs_map_bio(root, WRITE, bio, 0, 1);
+   ret = btrfs_map_bio(root, bio, 0, 1);
BUG_ON(ret); /* -ENOMEM */
 
bio_put(bio);
@@ -659,7 +659,7 @@ int btrfs_submit_compressed_read(struct inode *inode, 
struct bio *bio,
page->index = em_start >> PAGE_SHIFT;
 
if (comp_bio->bi_iter.bi_size)
-   ret = tree->ops->merge_bio_hook(READ, page, 0,
+   ret = tree->ops->merge_bio_hook(page, 0,
PAGE_SIZE,
comp_bio, 0);
else
@@ -690,8 +690,7 @@ int btrfs_submit_compressed_read(struct inode *inode, 
struct bio *bio,
sums += DIV_ROUND_UP(comp_bio->bi_iter.bi_size,
 root->sectorsize);
 
-   ret = btrfs_map_bio(root, READ, comp_bio,
-   mirror_num, 0);
+   ret = btrfs_map_bio(root, comp_bio, mirror_num, 0);
if (ret) {
bio->bi_error = ret;
bio_endio(comp_bio);
@@ -721,7 +720,7 @@ int btrfs_submit_compressed_read(struct inode *inode, 
struct bio *bio,
BUG_ON(ret); /* -ENOMEM */
}
 
-   ret = btrfs_map_bio(root, READ, comp_bio, mirror_num, 0);
+   ret = btrfs_map_bio(root, comp_bio, mirror_num, 0);
if (ret) {
bio->bi_error = ret;
bio_endio(comp_bio);
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 101c3cf..4088d7f 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3091,7 +3091,7 @@ int btrfs_create_subvol_root(struct btrfs_trans_handle 
*trans,
 struct btrfs_root *new_root,
 struct btrfs_root *parent_root,
 u64 new_dirid);
-int btrfs_merge_bio_hook(int rw, struct page *page, unsigned long offset,
+int btrfs_merge_bio_hook(struct page *page, unsigned long offset,
 size_t size, struct bio *bio,
 unsigned long bio_flags);
 int btrfs_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf);
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 93278c2..e80ef6e 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -124,7 +124,6 @@ struct async_submit_bio {
struct list_head list;
extent_submit_bio_hook_t *submit_bio_start;
extent_submit_bio_hook_t *submit_bio_done;
-   int rw;
int mirror_num;
unsigned long bio_flags;
/*
@@ -797,7 +796,7 @@ static void run_one_async_start(struct btrfs_work *work)
int ret;
 
async = container_of(work, struct  async_submit_bio, work);
-   

[PATCH 04/45] fs: have ll_rw_block users pass in op and flags separately

2016-06-05 Thread mchristi
From: Mike Christie 

This has ll_rw_block users pass in the operation and flags separately,
so ll_rw_block can set up the bio op and bi_rw flags on the bio that
is submitted.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---

v2:

1. Fix for kbuild error in ll_rw_block comments.

 fs/buffer.c | 19 ++-
 fs/ext4/inode.c |  6 +++---
 fs/ext4/namei.c |  3 ++-
 fs/ext4/super.c |  2 +-
 fs/gfs2/bmap.c  |  2 +-
 fs/gfs2/meta_io.c   |  4 ++--
 fs/gfs2/quota.c |  2 +-
 fs/isofs/compress.c |  2 +-
 fs/jbd2/journal.c   |  2 +-
 fs/jbd2/recovery.c  |  4 ++--
 fs/ocfs2/aops.c |  2 +-
 fs/ocfs2/super.c|  2 +-
 fs/reiserfs/journal.c   |  8 
 fs/reiserfs/stree.c |  4 ++--
 fs/reiserfs/super.c |  2 +-
 fs/squashfs/block.c |  4 ++--
 fs/udf/dir.c|  2 +-
 fs/udf/directory.c  |  2 +-
 fs/udf/inode.c  |  2 +-
 fs/ufs/balloc.c |  2 +-
 include/linux/buffer_head.h |  2 +-
 21 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 881d336..373aacb 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -588,7 +588,7 @@ void write_boundary_block(struct block_device *bdev,
struct buffer_head *bh = __find_get_block(bdev, bblock + 1, blocksize);
if (bh) {
if (buffer_dirty(bh))
-			ll_rw_block(WRITE, 1, &bh);
+			ll_rw_block(REQ_OP_WRITE, 0, 1, &bh);
put_bh(bh);
}
 }
@@ -1395,7 +1395,7 @@ void __breadahead(struct block_device *bdev, sector_t 
block, unsigned size)
 {
struct buffer_head *bh = __getblk(bdev, block, size);
if (likely(bh)) {
-		ll_rw_block(READA, 1, &bh);
+		ll_rw_block(REQ_OP_READ, READA, 1, &bh);
brelse(bh);
}
 }
@@ -1955,7 +1955,7 @@ int __block_write_begin(struct page *page, loff_t pos, 
unsigned len,
if (!buffer_uptodate(bh) && !buffer_delay(bh) &&
!buffer_unwritten(bh) &&
 (block_start < from || block_end > to)) {
-			ll_rw_block(READ, 1, &bh);
+			ll_rw_block(REQ_OP_READ, 0, 1, &bh);
*wait_bh++=bh;
}
}
@@ -2852,7 +2852,7 @@ int block_truncate_page(struct address_space *mapping,
 
if (!buffer_uptodate(bh) && !buffer_delay(bh) && !buffer_unwritten(bh)) 
{
err = -EIO;
-		ll_rw_block(READ, 1, &bh);
+		ll_rw_block(REQ_OP_READ, 0, 1, &bh);
wait_on_buffer(bh);
/* Uhhuh. Read error. Complain and punt. */
if (!buffer_uptodate(bh))
@@ -3051,7 +3051,8 @@ EXPORT_SYMBOL(submit_bh);
 
 /**
  * ll_rw_block: low-level access to block devices (DEPRECATED)
- * @rw: whether to %READ or %WRITE or maybe %READA (readahead)
+ * @op: whether to %READ or %WRITE
+ * @op_flags: rq_flag_bits or %READA (readahead)
  * @nr: number of  buffer_heads in the array
  * @bhs: array of pointers to  buffer_head
  *
@@ -3074,7 +3075,7 @@ EXPORT_SYMBOL(submit_bh);
  * All of the buffers must be for the same device, and must also be a
  * multiple of the current approved size for the device.
  */
-void ll_rw_block(int rw, int nr, struct buffer_head *bhs[])
+void ll_rw_block(int op, int op_flags,  int nr, struct buffer_head *bhs[])
 {
int i;
 
@@ -3083,18 +3084,18 @@ void ll_rw_block(int rw, int nr, struct buffer_head 
*bhs[])
 
if (!trylock_buffer(bh))
continue;
-   if (rw == WRITE) {
+   if (op == WRITE) {
if (test_clear_buffer_dirty(bh)) {
bh->b_end_io = end_buffer_write_sync;
get_bh(bh);
-   submit_bh(rw, 0, bh);
+   submit_bh(op, op_flags, bh);
continue;
}
} else {
if (!buffer_uptodate(bh)) {
bh->b_end_io = end_buffer_read_sync;
get_bh(bh);
-   submit_bh(rw, 0, bh);
+   submit_bh(op, op_flags, bh);
continue;
}
}
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ee3c7d8..ae44916 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -981,7 +981,7 @@ struct buffer_head *ext4_bread(handle_t *handle, struct 
inode *inode,
return bh;
if (!bh || buffer_uptodate(bh))
return bh;
-	ll_rw_block(READ | REQ_META | REQ_PRIO, 1, &bh);
+   ll_rw_block(REQ_OP_READ, REQ_META | 

[PATCH 16/45] gfs2: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have gfs2
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 fs/gfs2/log.c|  8 
 fs/gfs2/lops.c   | 11 ++-
 fs/gfs2/lops.h   |  2 +-
 fs/gfs2/meta_io.c|  7 ---
 fs/gfs2/ops_fstype.c |  2 +-
 5 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/fs/gfs2/log.c b/fs/gfs2/log.c
index 0ff028c..e58ccef0 100644
--- a/fs/gfs2/log.c
+++ b/fs/gfs2/log.c
@@ -657,7 +657,7 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 
flags)
struct gfs2_log_header *lh;
unsigned int tail;
u32 hash;
-   int rw = WRITE_FLUSH_FUA | REQ_META;
+   int op_flags = WRITE_FLUSH_FUA | REQ_META;
struct page *page = mempool_alloc(gfs2_page_pool, GFP_NOIO);
	enum gfs2_freeze_state state = atomic_read(&sdp->sd_freeze_state);
lh = page_address(page);
@@ -682,12 +682,12 @@ static void log_write_header(struct gfs2_sbd *sdp, u32 
flags)
	if (test_bit(SDF_NOBARRIERS, &sdp->sd_flags)) {
gfs2_ordered_wait(sdp);
log_flush_wait(sdp);
-   rw = WRITE_SYNC | REQ_META | REQ_PRIO;
+   op_flags = WRITE_SYNC | REQ_META | REQ_PRIO;
}
 
sdp->sd_log_idle = (tail == sdp->sd_log_flush_head);
gfs2_log_write_page(sdp, page);
-   gfs2_log_flush_bio(sdp, rw);
+   gfs2_log_flush_bio(sdp, REQ_OP_WRITE, op_flags);
log_flush_wait(sdp);
 
if (sdp->sd_log_tail != tail)
@@ -738,7 +738,7 @@ void gfs2_log_flush(struct gfs2_sbd *sdp, struct gfs2_glock 
*gl,
 
gfs2_ordered_write(sdp);
lops_before_commit(sdp, tr);
-   gfs2_log_flush_bio(sdp, WRITE);
+   gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0);
 
if (sdp->sd_log_head != sdp->sd_log_flush_head) {
log_flush_wait(sdp);
diff --git a/fs/gfs2/lops.c b/fs/gfs2/lops.c
index ce28242..58d1c98 100644
--- a/fs/gfs2/lops.c
+++ b/fs/gfs2/lops.c
@@ -230,17 +230,18 @@ static void gfs2_end_log_write(struct bio *bio)
 /**
  * gfs2_log_flush_bio - Submit any pending log bio
  * @sdp: The superblock
- * @rw: The rw flags
+ * @op: REQ_OP
+ * @op_flags: rq_flag_bits
  *
  * Submit any pending part-built or full bio to the block device. If
  * there is no pending bio, then this is a no-op.
  */
 
-void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw)
+void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int op, int op_flags)
 {
if (sdp->sd_log_bio) {
		atomic_inc(&sdp->sd_log_in_flight);
-   sdp->sd_log_bio->bi_rw = rw;
+   bio_set_op_attrs(sdp->sd_log_bio, op, op_flags);
submit_bio(sdp->sd_log_bio);
sdp->sd_log_bio = NULL;
}
@@ -300,7 +301,7 @@ static struct bio *gfs2_log_get_bio(struct gfs2_sbd *sdp, 
u64 blkno)
nblk >>= sdp->sd_fsb2bb_shift;
if (blkno == nblk)
return bio;
-   gfs2_log_flush_bio(sdp, WRITE);
+   gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0);
}
 
return gfs2_log_alloc_bio(sdp, blkno);
@@ -329,7 +330,7 @@ static void gfs2_log_write(struct gfs2_sbd *sdp, struct 
page *page,
bio = gfs2_log_get_bio(sdp, blkno);
ret = bio_add_page(bio, page, size, offset);
if (ret == 0) {
-   gfs2_log_flush_bio(sdp, WRITE);
+   gfs2_log_flush_bio(sdp, REQ_OP_WRITE, 0);
bio = gfs2_log_alloc_bio(sdp, blkno);
ret = bio_add_page(bio, page, size, offset);
WARN_ON(ret == 0);
diff --git a/fs/gfs2/lops.h b/fs/gfs2/lops.h
index a65a7ba..e529f53 100644
--- a/fs/gfs2/lops.h
+++ b/fs/gfs2/lops.h
@@ -27,7 +27,7 @@ extern const struct gfs2_log_operations gfs2_databuf_lops;
 
 extern const struct gfs2_log_operations *gfs2_log_ops[];
 extern void gfs2_log_write_page(struct gfs2_sbd *sdp, struct page *page);
-extern void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int rw);
+extern void gfs2_log_flush_bio(struct gfs2_sbd *sdp, int op, int op_flags);
 extern void gfs2_pin(struct gfs2_sbd *sdp, struct buffer_head *bh);
 
 static inline unsigned int buf_limit(struct gfs2_sbd *sdp)
diff --git a/fs/gfs2/meta_io.c b/fs/gfs2/meta_io.c
index b718447..052c113 100644
--- a/fs/gfs2/meta_io.c
+++ b/fs/gfs2/meta_io.c
@@ -213,7 +213,8 @@ static void gfs2_meta_read_endio(struct bio *bio)
  * Submit several consecutive buffer head I/O requests as a single bio I/O
  * request.  (See submit_bh_wbc.)
  */
-static void gfs2_submit_bhs(int rw, struct buffer_head *bhs[], int num)
+static void gfs2_submit_bhs(int op, int op_flags, struct buffer_head *bhs[],
+   int num)
 {
struct buffer_head *bh = bhs[0];
struct bio *bio;
@@ -230,7 +231,7 @@ static void gfs2_submit_bhs(int rw, struct buffer_head 
*bhs[], int num)
   

[PATCH 03/45] fs: have submit_bh users pass in op and flags separately

2016-06-05 Thread mchristi
From: Mike Christie 

This has submit_bh users pass in the operation and flags separately,
so submit_bh_wbc can set up the bio op and bi_rw flags on the bio that
is submitted.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 drivers/md/bitmap.c |  4 ++--
 fs/btrfs/check-integrity.c  | 24 ++--
 fs/btrfs/check-integrity.h  |  2 +-
 fs/btrfs/disk-io.c  |  4 ++--
 fs/buffer.c | 53 +++--
 fs/ext4/balloc.c|  2 +-
 fs/ext4/ialloc.c|  2 +-
 fs/ext4/inode.c |  2 +-
 fs/ext4/mmp.c   |  4 ++--
 fs/fat/misc.c   |  2 +-
 fs/gfs2/bmap.c  |  2 +-
 fs/gfs2/dir.c   |  2 +-
 fs/gfs2/meta_io.c   |  6 ++---
 fs/jbd2/commit.c|  6 ++---
 fs/jbd2/journal.c   |  8 +++
 fs/nilfs2/btnode.c  |  6 ++---
 fs/nilfs2/btnode.h  |  2 +-
 fs/nilfs2/btree.c   |  6 +++--
 fs/nilfs2/gcinode.c |  5 +++--
 fs/nilfs2/mdt.c | 11 +-
 fs/ntfs/aops.c  |  6 ++---
 fs/ntfs/compress.c  |  2 +-
 fs/ntfs/file.c  |  2 +-
 fs/ntfs/logfile.c   |  2 +-
 fs/ntfs/mft.c   |  4 ++--
 fs/ocfs2/buffer_head_io.c   |  8 +++
 fs/reiserfs/inode.c |  4 ++--
 fs/reiserfs/journal.c   |  6 ++---
 fs/ufs/util.c   |  2 +-
 include/linux/buffer_head.h |  9 
 30 files changed, 102 insertions(+), 96 deletions(-)

diff --git a/drivers/md/bitmap.c b/drivers/md/bitmap.c
index d8129ec..bc6dced 100644
--- a/drivers/md/bitmap.c
+++ b/drivers/md/bitmap.c
@@ -297,7 +297,7 @@ static void write_page(struct bitmap *bitmap, struct page 
*page, int wait)
		atomic_inc(&bitmap->pending_writes);
set_buffer_locked(bh);
set_buffer_mapped(bh);
-   submit_bh(WRITE | REQ_SYNC, bh);
+   submit_bh(REQ_OP_WRITE, REQ_SYNC, bh);
bh = bh->b_this_page;
}
 
@@ -392,7 +392,7 @@ static int read_page(struct file *file, unsigned long index,
		atomic_inc(&bitmap->pending_writes);
set_buffer_locked(bh);
set_buffer_mapped(bh);
-   submit_bh(READ, bh);
+   submit_bh(REQ_OP_READ, 0, bh);
}
block++;
bh = bh->b_this_page;
diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index 50f8191..0d3748b 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -2856,12 +2856,12 @@ static struct btrfsic_dev_state 
*btrfsic_dev_state_lookup(
return ds;
 }
 
-int btrfsic_submit_bh(int rw, struct buffer_head *bh)
+int btrfsic_submit_bh(int op, int op_flags, struct buffer_head *bh)
 {
struct btrfsic_dev_state *dev_state;
 
if (!btrfsic_is_initialized)
-   return submit_bh(rw, bh);
+   return submit_bh(op, op_flags, bh);
 
	mutex_lock(&btrfsic_mutex);
/* since btrfsic_submit_bh() might also be called before
@@ -2870,26 +2870,26 @@ int btrfsic_submit_bh(int rw, struct buffer_head *bh)
 
/* Only called to write the superblock (incl. FLUSH/FUA) */
if (NULL != dev_state &&
-   (rw & WRITE) && bh->b_size > 0) {
+   (op == REQ_OP_WRITE) && bh->b_size > 0) {
u64 dev_bytenr;
 
dev_bytenr = 4096 * bh->b_blocknr;
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bh(rw=0x%x, blocknr=%llu (bytenr %llu),"
-  " size=%zu, data=%p, bdev=%p)\n",
-  rw, (unsigned long long)bh->b_blocknr,
+  "submit_bh(op=0x%x,0x%x, blocknr=%llu "
+  "(bytenr %llu), size=%zu, data=%p, bdev=%p)\n",
+  op, op_flags, (unsigned long long)bh->b_blocknr,
   dev_bytenr, bh->b_size, bh->b_data, bh->b_bdev);
btrfsic_process_written_block(dev_state, dev_bytenr,
					      &bh->b_data, 1, NULL,
- NULL, bh, rw);
-   } else if (NULL != dev_state && (rw & REQ_FLUSH)) {
+ NULL, bh, op_flags);
+   } else if (NULL != dev_state && (op_flags & REQ_FLUSH)) {
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bh(rw=0x%x FLUSH, bdev=%p)\n",
-  rw, bh->b_bdev);
+  

[PATCH 20/45] nilfs: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have nilfs
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
Acked-by: Ryusuke Konishi 
---
 fs/nilfs2/segbuf.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/nilfs2/segbuf.c b/fs/nilfs2/segbuf.c
index 0f62909..a962d7d 100644
--- a/fs/nilfs2/segbuf.c
+++ b/fs/nilfs2/segbuf.c
@@ -346,7 +346,8 @@ static void nilfs_end_bio_write(struct bio *bio)
 }
 
 static int nilfs_segbuf_submit_bio(struct nilfs_segment_buffer *segbuf,
-  struct nilfs_write_info *wi, int mode)
+  struct nilfs_write_info *wi, int mode,
+  int mode_flags)
 {
struct bio *bio = wi->bio;
int err;
@@ -364,7 +365,7 @@ static int nilfs_segbuf_submit_bio(struct 
nilfs_segment_buffer *segbuf,
 
bio->bi_end_io = nilfs_end_bio_write;
bio->bi_private = segbuf;
-   bio->bi_rw = mode;
+   bio_set_op_attrs(bio, mode, mode_flags);
submit_bio(bio);
segbuf->sb_nbio++;
 
@@ -438,7 +439,7 @@ static int nilfs_segbuf_submit_bh(struct 
nilfs_segment_buffer *segbuf,
return 0;
}
/* bio is FULL */
-   err = nilfs_segbuf_submit_bio(segbuf, wi, mode);
+   err = nilfs_segbuf_submit_bio(segbuf, wi, mode, 0);
/* never submit current bh */
if (likely(!err))
goto repeat;
@@ -462,19 +463,19 @@ static int nilfs_segbuf_write(struct nilfs_segment_buffer 
*segbuf,
 {
struct nilfs_write_info wi;
struct buffer_head *bh;
-   int res = 0, rw = WRITE;
+   int res = 0;
 
wi.nilfs = nilfs;
	nilfs_segbuf_prepare_write(segbuf, &wi);
 
	list_for_each_entry(bh, &segbuf->sb_segsum_buffers, b_assoc_buffers) {
-		res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, rw);
+		res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, REQ_OP_WRITE);
if (unlikely(res))
goto failed_bio;
}
 
	list_for_each_entry(bh, &segbuf->sb_payload_buffers, b_assoc_buffers) {
-		res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, rw);
+		res = nilfs_segbuf_submit_bh(segbuf, &wi, bh, REQ_OP_WRITE);
if (unlikely(res))
goto failed_bio;
}
@@ -484,8 +485,8 @@ static int nilfs_segbuf_write(struct nilfs_segment_buffer 
*segbuf,
 * Last BIO is always sent through the following
 * submission.
 */
-   rw |= REQ_SYNC;
-		res = nilfs_segbuf_submit_bio(segbuf, &wi, rw);
+		res = nilfs_segbuf_submit_bio(segbuf, &wi, REQ_OP_WRITE,
+					      REQ_SYNC);
}
 
  failed_bio:
-- 
2.7.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 05/45] block, drivers, cgroup: use op_is_write helper instead of checking for REQ_WRITE

2016-06-05 Thread mchristi
From: Mike Christie 

We currently set REQ_WRITE/WRITE for all non-READ IOs
like discard, flush, writesame, etc. In the next patches, where we
no longer set up the op as a bitmap, we will not be able to
detect the direction of an operation like writesame by testing if
REQ_WRITE is set.

This patch converts the drivers and cgroup to use the
op_is_write helper. This should just cover the simple
cases. I did dm, md and bcache in their own patches
because they were more involved.

Signed-off-by: Mike Christie 
---
 block/blk-core.c | 4 ++--
 block/blk-merge.c| 2 +-
 drivers/ata/libata-scsi.c| 2 +-
 drivers/block/loop.c | 6 +++---
 drivers/block/umem.c | 2 +-
 drivers/scsi/osd/osd_initiator.c | 4 ++--
 include/linux/blk-cgroup.h   | 2 +-
 7 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index e953407..e8e5865 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2115,7 +2115,7 @@ blk_qc_t submit_bio(struct bio *bio)
else
count = bio_sectors(bio);
 
-   if (bio->bi_rw & WRITE) {
+   if (op_is_write(bio_op(bio))) {
count_vm_events(PGPGOUT, count);
} else {
task_io_account_read(bio->bi_iter.bi_size);
@@ -2126,7 +2126,7 @@ blk_qc_t submit_bio(struct bio *bio)
char b[BDEVNAME_SIZE];
printk(KERN_DEBUG "%s(%d): %s block %Lu on %s (%u 
sectors)\n",
current->comm, task_pid_nr(current),
-   (bio->bi_rw & WRITE) ? "WRITE" : "READ",
+   op_is_write(bio_op(bio)) ? "WRITE" : "READ",
(unsigned long long)bio->bi_iter.bi_sector,
bdevname(bio->bi_bdev, b),
count);
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 2613531..b198070 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -439,7 +439,7 @@ int blk_rq_map_sg(struct request_queue *q, struct request 
*rq,
}
 
if (q->dma_drain_size && q->dma_drain_needed(rq)) {
-   if (rq->cmd_flags & REQ_WRITE)
+   if (op_is_write(req_op(rq)))
memset(q->dma_drain_buffer, 0, q->dma_drain_size);
 
sg_unmark_end(sg);
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index bfec66f..4c6eb22 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -1190,7 +1190,7 @@ static int atapi_drain_needed(struct request *rq)
if (likely(rq->cmd_type != REQ_TYPE_BLOCK_PC))
return 0;
 
-   if (!blk_rq_bytes(rq) || (rq->cmd_flags & REQ_WRITE))
+   if (!blk_rq_bytes(rq) || op_is_write(req_op(rq)))
return 0;
 
return atapi_cmd_type(rq->cmd[0]) == ATAPI_MISC;
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 1fa8cc2..e9f1701 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -447,7 +447,7 @@ static int lo_req_flush(struct loop_device *lo, struct 
request *rq)
 
 static inline void handle_partial_read(struct loop_cmd *cmd, long bytes)
 {
-   if (bytes < 0 || (cmd->rq->cmd_flags & REQ_WRITE))
+   if (bytes < 0 || op_is_write(req_op(cmd->rq)))
return;
 
if (unlikely(bytes < blk_rq_bytes(cmd->rq))) {
@@ -541,7 +541,7 @@ static int do_req_filebacked(struct loop_device *lo, struct 
request *rq)
 
pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
 
-   if (rq->cmd_flags & REQ_WRITE) {
+   if (op_is_write(req_op(rq))) {
if (rq->cmd_flags & REQ_FLUSH)
ret = lo_req_flush(lo, rq);
else if (rq->cmd_flags & REQ_DISCARD)
@@ -1672,7 +1672,7 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 static void loop_handle_cmd(struct loop_cmd *cmd)
 {
-   const bool write = cmd->rq->cmd_flags & REQ_WRITE;
+   const bool write = op_is_write(req_op(cmd->rq));
struct loop_device *lo = cmd->rq->q->queuedata;
int ret = 0;
 
diff --git a/drivers/block/umem.c b/drivers/block/umem.c
index 7939b9f..4b3ba74 100644
--- a/drivers/block/umem.c
+++ b/drivers/block/umem.c
@@ -462,7 +462,7 @@ static void process_page(unsigned long data)
le32_to_cpu(desc->local_addr)>>9,
le32_to_cpu(desc->transfer_size));
dump_dmastat(card, control);
-   } else if ((bio->bi_rw & REQ_WRITE) &&
+   } else if (op_is_write(bio_op(bio)) &&
   le32_to_cpu(desc->local_addr) >> 9 ==
card->init_size) {
card->init_size += le32_to_cpu(desc->transfer_size) >> 
9;
diff --git a/drivers/scsi/osd/osd_initiator.c b/drivers/scsi/osd/osd_initiator.c
index 

[PATCH 25/45] bcache: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have bcache
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
---
 drivers/md/bcache/btree.c |  4 ++--
 drivers/md/bcache/debug.c |  4 ++--
 drivers/md/bcache/journal.c   |  7 ---
 drivers/md/bcache/movinggc.c  |  2 +-
 drivers/md/bcache/request.c   | 14 +++---
 drivers/md/bcache/super.c | 24 +---
 drivers/md/bcache/writeback.c |  4 ++--
 7 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index eab505e..76f7534 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -294,10 +294,10 @@ static void bch_btree_node_read(struct btree *b)
	closure_init_stack(&cl);
 
bio = bch_bbio_alloc(b->c);
-   bio->bi_rw  = REQ_META|READ_SYNC;
bio->bi_iter.bi_size = KEY_SIZE(>key) << 9;
bio->bi_end_io  = btree_node_read_endio;
	bio->bi_private = &cl;
+   bio_set_op_attrs(bio, REQ_OP_READ, REQ_META|READ_SYNC);
 
bch_bio_map(bio, b->keys.set[0].data);
 
@@ -396,8 +396,8 @@ static void do_btree_node_write(struct btree *b)
 
b->bio->bi_end_io   = btree_node_write_endio;
b->bio->bi_private  = cl;
-   b->bio->bi_rw   = REQ_META|WRITE_SYNC|REQ_FUA;
b->bio->bi_iter.bi_size = roundup(set_bytes(i), block_bytes(b->c));
+   bio_set_op_attrs(b->bio, REQ_OP_WRITE, REQ_META|WRITE_SYNC|REQ_FUA);
bch_bio_map(b->bio, i);
 
/*
diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index 52b6bcf..c28df164 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -52,7 +52,7 @@ void bch_btree_verify(struct btree *b)
bio->bi_bdev= PTR_CACHE(b->c, >key, 0)->bdev;
bio->bi_iter.bi_sector  = PTR_OFFSET(>key, 0);
bio->bi_iter.bi_size= KEY_SIZE(>key) << 9;
-   bio->bi_rw  = REQ_META|READ_SYNC;
+   bio_set_op_attrs(bio, REQ_OP_READ, REQ_META|READ_SYNC);
bch_bio_map(bio, sorted);
 
submit_bio_wait(bio);
@@ -114,7 +114,7 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
check = bio_clone(bio, GFP_NOIO);
if (!check)
return;
-   check->bi_rw |= READ_SYNC;
+   bio_set_op_attrs(check, REQ_OP_READ, READ_SYNC);
 
if (bio_alloc_pages(check, GFP_NOIO))
goto out_put;
diff --git a/drivers/md/bcache/journal.c b/drivers/md/bcache/journal.c
index af3f9f7..a3c3b30 100644
--- a/drivers/md/bcache/journal.c
+++ b/drivers/md/bcache/journal.c
@@ -54,11 +54,11 @@ reread: left = ca->sb.bucket_size - offset;
bio_reset(bio);
bio->bi_iter.bi_sector  = bucket + offset;
bio->bi_bdev= ca->bdev;
-   bio->bi_rw  = READ;
bio->bi_iter.bi_size= len << 9;
 
bio->bi_end_io  = journal_read_endio;
		bio->bi_private = &cl;
+   bio_set_op_attrs(bio, REQ_OP_READ, 0);
bch_bio_map(bio, data);
 
		closure_bio_submit(bio, &cl);
@@ -449,10 +449,10 @@ static void do_journal_discard(struct cache *ca)
	atomic_set(&ja->discard_in_flight, DISCARD_IN_FLIGHT);
 
bio_init(bio);
+   bio_set_op_attrs(bio, REQ_OP_DISCARD, 0);
bio->bi_iter.bi_sector  = bucket_to_sector(ca->set,
ca->sb.d[ja->discard_idx]);
bio->bi_bdev= ca->bdev;
-   bio->bi_rw  = REQ_WRITE|REQ_DISCARD;
bio->bi_max_vecs= 1;
bio->bi_io_vec  = bio->bi_inline_vecs;
bio->bi_iter.bi_size= bucket_bytes(ca);
@@ -626,11 +626,12 @@ static void journal_write_unlocked(struct closure *cl)
bio_reset(bio);
bio->bi_iter.bi_sector  = PTR_OFFSET(k, i);
bio->bi_bdev= ca->bdev;
-   bio->bi_rw  = REQ_WRITE|REQ_SYNC|REQ_META|REQ_FLUSH|REQ_FUA;
bio->bi_iter.bi_size = sectors << 9;
 
bio->bi_end_io  = journal_write_endio;
bio->bi_private = w;
+   bio_set_op_attrs(bio, REQ_OP_WRITE,
+REQ_SYNC|REQ_META|REQ_FLUSH|REQ_FUA);
bch_bio_map(bio, w->data);
 
trace_bcache_journal_write(bio);
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index b929fc9..1881319 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -163,7 +163,7 @@ static void read_moving(struct cache_set *c)
moving_init(io);
		bio = &io->bio.bio;
 
-   bio->bi_rw  = READ;
+   bio_set_op_attrs(bio, REQ_OP_READ, 0);
bio->bi_end_io  = read_moving_endio;
 
if 

[PATCH 29/45] xen: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have xen
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 drivers/block/xen-blkback/blkback.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c 
b/drivers/block/xen-blkback/blkback.c
index 79fe493..4a80ee7 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -501,7 +501,7 @@ static int xen_vbd_translate(struct phys_req *req, struct 
xen_blkif *blkif,
	struct xen_vbd *vbd = &blkif->vbd;
int rc = -EACCES;
 
-   if ((operation != READ) && vbd->readonly)
+   if ((operation != REQ_OP_READ) && vbd->readonly)
goto out;
 
if (likely(req->nr_sects)) {
@@ -1014,7 +1014,7 @@ static int dispatch_discard_io(struct xen_blkif_ring 
*ring,
preq.sector_number = req->u.discard.sector_number;
preq.nr_sects  = req->u.discard.nr_sectors;
 
-   err = xen_vbd_translate(, blkif, WRITE);
+   err = xen_vbd_translate(, blkif, REQ_OP_WRITE);
if (err) {
pr_warn("access denied: DISCARD [%llu->%llu] on dev=%04x\n",
preq.sector_number,
@@ -1229,6 +1229,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring 
*ring,
struct bio **biolist = pending_req->biolist;
int i, nbio = 0;
int operation;
+   int operation_flags = 0;
struct blk_plug plug;
bool drain = false;
struct grant_page **pages = pending_req->segments;
@@ -1247,17 +1248,19 @@ static int dispatch_rw_block_io(struct xen_blkif_ring 
*ring,
switch (req_operation) {
case BLKIF_OP_READ:
ring->st_rd_req++;
-   operation = READ;
+   operation = REQ_OP_READ;
break;
case BLKIF_OP_WRITE:
ring->st_wr_req++;
-   operation = WRITE_ODIRECT;
+   operation = REQ_OP_WRITE;
+   operation_flags = WRITE_ODIRECT;
break;
case BLKIF_OP_WRITE_BARRIER:
drain = true;
case BLKIF_OP_FLUSH_DISKCACHE:
ring->st_f_req++;
-   operation = WRITE_FLUSH;
+   operation = REQ_OP_WRITE;
+   operation_flags = WRITE_FLUSH;
break;
default:
operation = 0; /* make gcc happy */
@@ -1269,7 +1272,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring 
*ring,
nseg = req->operation == BLKIF_OP_INDIRECT ?
   req->u.indirect.nr_segments : req->u.rw.nr_segments;
 
-   if (unlikely(nseg == 0 && operation != WRITE_FLUSH) ||
+   if (unlikely(nseg == 0 && operation_flags != WRITE_FLUSH) ||
unlikely((req->operation != BLKIF_OP_INDIRECT) &&
 (nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) ||
unlikely((req->operation == BLKIF_OP_INDIRECT) &&
@@ -1310,7 +1313,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring 
*ring,
 
if (xen_vbd_translate(, ring->blkif, operation) != 0) {
pr_debug("access denied: %s of [%llu,%llu] on dev=%04x\n",
-operation == READ ? "read" : "write",
+operation == REQ_OP_READ ? "read" : "write",
 preq.sector_number,
 preq.sector_number + preq.nr_sects,
 ring->blkif->vbd.pdevice);
@@ -1369,7 +1372,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring 
*ring,
bio->bi_private = pending_req;
bio->bi_end_io  = end_block_io_op;
bio->bi_iter.bi_sector  = preq.sector_number;
-   bio->bi_rw  = operation;
+   bio_set_op_attrs(bio, operation, operation_flags);
}
 
preq.sector_number += seg[i].nsec;
@@ -1377,7 +1380,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring 
*ring,
 
/* This will be hit if the operation was a flush or discard. */
if (!bio) {
-   BUG_ON(operation != WRITE_FLUSH);
+   BUG_ON(operation_flags != WRITE_FLUSH);
 
bio = bio_alloc(GFP_KERNEL, 0);
if (unlikely(bio == NULL))
@@ -1387,7 +1390,7 @@ static int dispatch_rw_block_io(struct xen_blkif_ring 
*ring,
bio->bi_bdev= preq.bdev;
bio->bi_private = pending_req;
bio->bi_end_io  = end_block_io_op;
-   bio->bi_rw  = operation;
+   bio_set_op_attrs(bio, operation, operation_flags);
}
 
atomic_set(_req->pendcnt, nbio);
@@ -1399,9 +1402,9 @@ static int dispatch_rw_block_io(struct xen_blkif_ring 
*ring,
/* Let the I/Os go.. */

[PATCH 02/45] block: add REQ_OP definitions and helpers

2016-06-05 Thread mchristi
From: Mike Christie 

The following patches separate the operation (WRITE, READ, DISCARD,
etc) from the rq_flag_bits flags. This patch adds definitions for
request/bio operations (REQ_OPs) and adds request/bio accessors to
get/set the op.

In this patch the REQ_OPs match the REQ rq_flag_bits ones
for compat reasons while all the code is converted to use the
op accessors in the set. In the last patches the op will become a
number and the accessors and helpers in this patch will be dropped
or updated.

Signed-off-by: Mike Christie 
---
 include/linux/bio.h   |  3 +++
 include/linux/blk_types.h | 24 
 include/linux/blkdev.h| 10 +-
 include/linux/fs.h| 26 --
 4 files changed, 60 insertions(+), 3 deletions(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 3bde942..09c5308 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -44,6 +44,9 @@
 #define BIO_MAX_SIZE   (BIO_MAX_PAGES << PAGE_SHIFT)
 #define BIO_MAX_SECTORS(BIO_MAX_SIZE >> 9)
 
+#define bio_op(bio)(op_from_rq_bits((bio)->bi_rw))
+#define bio_set_op_attrs(bio, op, flags)   ((bio)->bi_rw |= (op | flags))
+
 /*
  * upper 16 bits of bi_rw define the io priority of this bio
  */
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 77e5d81..6e60baa 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -242,6 +242,30 @@ enum rq_flag_bits {
 #define REQ_HASHED (1ULL << __REQ_HASHED)
 #define REQ_MQ_INFLIGHT(1ULL << __REQ_MQ_INFLIGHT)
 
+enum req_op {
+   REQ_OP_READ,
+   REQ_OP_WRITE= REQ_WRITE,
+   REQ_OP_DISCARD  = REQ_DISCARD,
+   REQ_OP_WRITE_SAME   = REQ_WRITE_SAME,
+};
+
+/*
+ * tmp compat. Users used to set the write bit for all non reads, but
+ * we will be dropping the bitmap use for ops. Support both until
+ * the end of the patchset.
+ */
+static inline int op_from_rq_bits(u64 flags)
+{
+   if (flags & REQ_OP_DISCARD)
+   return REQ_OP_DISCARD;
+   else if (flags & REQ_OP_WRITE_SAME)
+   return REQ_OP_WRITE_SAME;
+   else if (flags & REQ_OP_WRITE)
+   return REQ_OP_WRITE;
+   else
+   return REQ_OP_READ;
+}
+
 typedef unsigned int blk_qc_t;
 #define BLK_QC_T_NONE  -1U
 #define BLK_QC_T_SHIFT 16
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3d9cf32..49c2dbc 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -200,6 +200,13 @@ struct request {
struct request *next_rq;
 };
 
+#define req_op(req)(op_from_rq_bits((req)->cmd_flags))
+#define req_set_op(req, op)((req)->cmd_flags |= op)
+#define req_set_op_attrs(req, op, flags) do {  \
+   req_set_op(req, op);\
+   (req)->cmd_flags |= flags;  \
+} while (0)
+
 static inline unsigned short req_get_ioprio(struct request *req)
 {
return req->ioprio;
@@ -597,7 +604,8 @@ static inline void queue_flag_clear(unsigned int flag, 
struct request_queue *q)
 
 #define list_entry_rq(ptr) list_entry((ptr), struct request, queuelist)
 
-#define rq_data_dir(rq)((int)((rq)->cmd_flags & 1))
+#define rq_data_dir(rq) \
+   (op_is_write(op_from_rq_bits(rq->cmd_flags)) ? WRITE : READ)
 
 /*
  * Driver can handle struct request, if it either has an old style
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 65e4c51..62ca2f9 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2465,14 +2465,36 @@ extern bool is_bad_inode(struct inode *);
 
 #ifdef CONFIG_BLOCK
 /*
+ * tmp compat. Users used to set the write bit for all non-reads, but
+ * we will be dropping the bitmap use for ops. Support both until
+ * the end of the patchset.
+ */
+static inline bool op_is_write(unsigned long flags)
+{
+   if (flags & (REQ_OP_WRITE | REQ_OP_WRITE_SAME | REQ_OP_DISCARD))
+   return true;
+   else
+   return false;
+}
+
+/*
  * return READ, READA, or WRITE
  */
-#define bio_rw(bio)((bio)->bi_rw & (RW_MASK | RWA_MASK))
+static inline int bio_rw(struct bio *bio)
+{
+   if (op_is_write(op_from_rq_bits(bio->bi_rw)))
+   return WRITE;
+
+   return bio->bi_rw & RWA_MASK;
+}
 
 /*
  * return data direction, READ or WRITE
  */
-#define bio_data_dir(bio)  ((bio)->bi_rw & 1)
+static inline int bio_data_dir(struct bio *bio)
+{
+   return op_is_write(op_from_rq_bits(bio->bi_rw)) ? WRITE : READ;
+}
 
 extern void check_disk_size_change(struct gendisk *disk,
   struct block_device *bdev);
-- 
2.7.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 31/45] block: prepare request creation/destruction code to use REQ_OPs

2016-06-05 Thread mchristi
From: Mike Christie 

This patch prepares *_get_request/*_put_request and freed_request
to use separate variables for the operation and flags. In the
next patches the struct request users will be converted, as
was done for bios, so that the op and flags are set separately.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 block/blk-core.c | 54 +-
 1 file changed, 29 insertions(+), 25 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 3c45254..a68dc07 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -959,10 +959,10 @@ static void __freed_request(struct request_list *rl, int 
sync)
  * A request has just been released.  Account for it, update the full and
  * congestion status, wake up any waiters.   Called under q->queue_lock.
  */
-static void freed_request(struct request_list *rl, unsigned int flags)
+static void freed_request(struct request_list *rl, int op, unsigned int flags)
 {
struct request_queue *q = rl->q;
-   int sync = rw_is_sync(flags);
+   int sync = rw_is_sync(op | flags);
 
q->nr_rqs[sync]--;
rl->count[sync]--;
@@ -1054,7 +1054,8 @@ static struct io_context *rq_ioc(struct bio *bio)
 /**
  * __get_request - get a free request
  * @rl: request list to allocate from
- * @rw_flags: RW and SYNC flags
+ * @op: REQ_OP_READ/REQ_OP_WRITE
+ * @op_flags: rq_flag_bits
  * @bio: bio to allocate request for (can be %NULL)
  * @gfp_mask: allocation mask
  *
@@ -1065,21 +1066,22 @@ static struct io_context *rq_ioc(struct bio *bio)
  * Returns ERR_PTR on failure, with @q->queue_lock held.
  * Returns request pointer on success, with @q->queue_lock *not held*.
  */
-static struct request *__get_request(struct request_list *rl, int rw_flags,
-struct bio *bio, gfp_t gfp_mask)
+static struct request *__get_request(struct request_list *rl, int op,
+int op_flags, struct bio *bio,
+gfp_t gfp_mask)
 {
struct request_queue *q = rl->q;
struct request *rq;
struct elevator_type *et = q->elevator->type;
struct io_context *ioc = rq_ioc(bio);
struct io_cq *icq = NULL;
-   const bool is_sync = rw_is_sync(rw_flags) != 0;
+   const bool is_sync = rw_is_sync(op | op_flags) != 0;
int may_queue;
 
if (unlikely(blk_queue_dying(q)))
return ERR_PTR(-ENODEV);
 
-   may_queue = elv_may_queue(q, rw_flags);
+   may_queue = elv_may_queue(q, op | op_flags);
if (may_queue == ELV_MQUEUE_NO)
goto rq_starved;
 
@@ -1123,7 +1125,7 @@ static struct request *__get_request(struct request_list 
*rl, int rw_flags,
 
/*
 * Decide whether the new request will be managed by elevator.  If
-* so, mark @rw_flags and increment elvpriv.  Non-zero elvpriv will
+* so, mark @op_flags and increment elvpriv.  Non-zero elvpriv will
 * prevent the current elevator from being destroyed until the new
 * request is freed.  This guarantees icq's won't be destroyed and
 * makes creating new ones safe.
@@ -1132,14 +1134,14 @@ static struct request *__get_request(struct 
request_list *rl, int rw_flags,
 * it will be created after releasing queue_lock.
 */
if (blk_rq_should_init_elevator(bio) && !blk_queue_bypass(q)) {
-   rw_flags |= REQ_ELVPRIV;
+   op_flags |= REQ_ELVPRIV;
q->nr_rqs_elvpriv++;
if (et->icq_cache && ioc)
icq = ioc_lookup_icq(ioc, q);
}
 
if (blk_queue_io_stat(q))
-   rw_flags |= REQ_IO_STAT;
+   op_flags |= REQ_IO_STAT;
spin_unlock_irq(q->queue_lock);
 
/* allocate and init request */
@@ -1149,10 +1151,10 @@ static struct request *__get_request(struct 
request_list *rl, int rw_flags,
 
blk_rq_init(q, rq);
blk_rq_set_rl(rq, rl);
-   rq->cmd_flags = rw_flags | REQ_ALLOCED;
+   req_set_op_attrs(rq, op, op_flags | REQ_ALLOCED);
 
/* init elvpriv */
-   if (rw_flags & REQ_ELVPRIV) {
+   if (op_flags & REQ_ELVPRIV) {
if (unlikely(et->icq_cache && !icq)) {
if (ioc)
icq = ioc_create_icq(ioc, q, gfp_mask);
@@ -1178,7 +1180,7 @@ out:
if (ioc_batching(q, ioc))
ioc->nr_batch_requests--;
 
-   trace_block_getrq(q, bio, rw_flags & 1);
+   trace_block_getrq(q, bio, op);
return rq;
 
 fail_elvpriv:
@@ -1208,7 +1210,7 @@ fail_alloc:
 * queue, but this is pretty rare.
 */
spin_lock_irq(q->queue_lock);
-   freed_request(rl, rw_flags);
+   freed_request(rl, op, op_flags);
 
/*
 * in the very unlikely event that allocation 

check if hardware checksumming works or not

2016-06-05 Thread Alberto Bursi


Hi, I'm running Debian ARM on a Marvell Kirkwood-based 2-disk NAS.

Kirkwood SoCs have an XOR engine that can hardware-accelerate crc32c 
checksumming, and from what I see on the kernel mailing lists it seems to 
have a Linux driver and should be supported.


I wanted to ask if there is a way to test if it is working at all.

How do I force btrfs to use software checksumming for testing purposes?


Thanks


-Albert


[PATCH 07/45] bcache: use op_is_write instead of checking for REQ_WRITE

2016-06-05 Thread mchristi
From: Mike Christie 

We currently set REQ_WRITE/WRITE for all non-READ I/Os
like discard, flush, writesame, etc. In the next patches, where we
no longer set up the op as a bitmap, we will not be able to
detect the direction of an operation like writesame by testing
whether REQ_WRITE is set.

This has bcache use the op_is_write helper which will do the right
thing.

Signed-off-by: Mike Christie 
---
 drivers/md/bcache/io.c  | 2 +-
 drivers/md/bcache/request.c | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index 86a0bb8..fd885cc 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -111,7 +111,7 @@ void bch_bbio_count_io_errors(struct cache_set *c, struct 
bio *bio,
struct bbio *b = container_of(bio, struct bbio, bio);
struct cache *ca = PTR_CACHE(c, &b->key, 0);
 
-   unsigned threshold = bio->bi_rw & REQ_WRITE
+   unsigned threshold = op_is_write(bio_op(bio))
? c->congested_write_threshold_us
: c->congested_read_threshold_us;
 
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 25fa844..6b85a23 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -383,7 +383,7 @@ static bool check_should_bypass(struct cached_dev *dc, 
struct bio *bio)
 
if (mode == CACHE_MODE_NONE ||
(mode == CACHE_MODE_WRITEAROUND &&
-(bio->bi_rw & REQ_WRITE)))
+op_is_write(bio_op(bio))))
goto skip;
 
if (bio->bi_iter.bi_sector & (c->sb.block_size - 1) ||
@@ -404,7 +404,7 @@ static bool check_should_bypass(struct cached_dev *dc, 
struct bio *bio)
 
if (!congested &&
mode == CACHE_MODE_WRITEBACK &&
-   (bio->bi_rw & REQ_WRITE) &&
+   op_is_write(bio_op(bio)) &&
(bio->bi_rw & REQ_SYNC))
goto rescale;
 
@@ -657,7 +657,7 @@ static inline struct search *search_alloc(struct bio *bio,
s->cache_miss   = NULL;
s->d= d;
s->recoverable  = 1;
-   s->write= (bio->bi_rw & REQ_WRITE) != 0;
+   s->write= op_is_write(bio_op(bio));
s->read_dirty_data  = 0;
s->start_time   = jiffies;
 
-- 
2.7.2



[PATCH 35/45] block: convert merge/insert code to check for REQ_OPs.

2016-06-05 Thread mchristi
From: Mike Christie 

This patch converts the block layer merging code to use separate variables
for the operation and flags, and to check req_op for the REQ_OP.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 block/blk-core.c   |  2 +-
 block/blk-merge.c  | 10 ++
 include/linux/blkdev.h | 20 ++--
 3 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 090e55d..1333bb7 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2161,7 +2161,7 @@ EXPORT_SYMBOL(submit_bio);
 static int blk_cloned_rq_check_limits(struct request_queue *q,
  struct request *rq)
 {
-   if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, rq->cmd_flags)) {
+   if (blk_rq_sectors(rq) > blk_queue_get_max_sectors(q, req_op(rq))) {
printk(KERN_ERR "%s: over max size limit.\n", __func__);
return -EIO;
}
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 5a03f96..c265348 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -649,7 +649,8 @@ static int attempt_merge(struct request_queue *q, struct 
request *req,
if (!rq_mergeable(req) || !rq_mergeable(next))
return 0;
 
-   if (!blk_check_merge_flags(req->cmd_flags, next->cmd_flags))
+   if (!blk_check_merge_flags(req->cmd_flags, req_op(req), next->cmd_flags,
+  req_op(next)))
return 0;
 
/*
@@ -663,7 +664,7 @@ static int attempt_merge(struct request_queue *q, struct 
request *req,
|| req_no_special_merge(next))
return 0;
 
-   if (req->cmd_flags & REQ_WRITE_SAME &&
+   if (req_op(req) == REQ_OP_WRITE_SAME &&
!blk_write_same_mergeable(req->bio, next->bio))
return 0;
 
@@ -751,7 +752,8 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
if (!rq_mergeable(rq) || !bio_mergeable(bio))
return false;
 
-   if (!blk_check_merge_flags(rq->cmd_flags, bio->bi_rw))
+   if (!blk_check_merge_flags(rq->cmd_flags, req_op(rq), bio->bi_rw,
+  bio_op(bio)))
return false;
 
/* different data direction or already started, don't merge */
@@ -767,7 +769,7 @@ bool blk_rq_merge_ok(struct request *rq, struct bio *bio)
return false;
 
/* must be using the same buffer */
-   if (rq->cmd_flags & REQ_WRITE_SAME &&
+   if (req_op(rq) == REQ_OP_WRITE_SAME &&
!blk_write_same_mergeable(rq->bio, bio))
return false;
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8c78aca..25f01ff 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -666,16 +666,16 @@ static inline bool rq_mergeable(struct request *rq)
return true;
 }
 
-static inline bool blk_check_merge_flags(unsigned int flags1,
-unsigned int flags2)
+static inline bool blk_check_merge_flags(unsigned int flags1, unsigned int op1,
+unsigned int flags2, unsigned int op2)
 {
-   if ((flags1 & REQ_DISCARD) != (flags2 & REQ_DISCARD))
+   if ((op1 == REQ_OP_DISCARD) != (op2 == REQ_OP_DISCARD))
return false;
 
if ((flags1 & REQ_SECURE) != (flags2 & REQ_SECURE))
return false;
 
-   if ((flags1 & REQ_WRITE_SAME) != (flags2 & REQ_WRITE_SAME))
+   if ((op1 == REQ_OP_WRITE_SAME) != (op2 == REQ_OP_WRITE_SAME))
return false;
 
return true;
@@ -887,12 +887,12 @@ static inline unsigned int blk_rq_cur_sectors(const 
struct request *rq)
 }
 
 static inline unsigned int blk_queue_get_max_sectors(struct request_queue *q,
-unsigned int cmd_flags)
+int op)
 {
-   if (unlikely(cmd_flags & REQ_DISCARD))
+   if (unlikely(op == REQ_OP_DISCARD))
return min(q->limits.max_discard_sectors, UINT_MAX >> 9);
 
-   if (unlikely(cmd_flags & REQ_WRITE_SAME))
+   if (unlikely(op == REQ_OP_WRITE_SAME))
return q->limits.max_write_same_sectors;
 
return q->limits.max_sectors;
@@ -919,11 +919,11 @@ static inline unsigned int blk_rq_get_max_sectors(struct 
request *rq)
if (unlikely(rq->cmd_type != REQ_TYPE_FS))
return q->limits.max_hw_sectors;
 
-   if (!q->limits.chunk_sectors || (rq->cmd_flags & REQ_DISCARD))
-   return blk_queue_get_max_sectors(q, rq->cmd_flags);
+   if (!q->limits.chunk_sectors || (req_op(rq) == REQ_OP_DISCARD))
+   return blk_queue_get_max_sectors(q, req_op(rq));
 
return min(blk_max_size_offset(q, blk_rq_pos(rq)),
-   blk_queue_get_max_sectors(q, 

[PATCH 33/45] block: prepare elevator to use REQ_OPs.

2016-06-05 Thread mchristi
From: Mike Christie 

This patch converts the elevator code to use separate variables
for the operation and flags, and to check req_op for the REQ_OP.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 block/blk-core.c | 2 +-
 block/cfq-iosched.c  | 4 ++--
 block/elevator.c | 7 +++
 include/linux/elevator.h | 4 ++--
 4 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index a68dc07..090e55d 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1081,7 +1081,7 @@ static struct request *__get_request(struct request_list 
*rl, int op,
if (unlikely(blk_queue_dying(q)))
return ERR_PTR(-ENODEV);
 
-   may_queue = elv_may_queue(q, op | op_flags);
+   may_queue = elv_may_queue(q, op, op_flags);
if (may_queue == ELV_MQUEUE_NO)
goto rq_starved;
 
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 4a34978..3fcc598 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -4285,7 +4285,7 @@ static inline int __cfq_may_queue(struct cfq_queue *cfqq)
return ELV_MQUEUE_MAY;
 }
 
-static int cfq_may_queue(struct request_queue *q, int rw)
+static int cfq_may_queue(struct request_queue *q, int op, int op_flags)
 {
struct cfq_data *cfqd = q->elevator->elevator_data;
struct task_struct *tsk = current;
@@ -4302,7 +4302,7 @@ static int cfq_may_queue(struct request_queue *q, int rw)
if (!cic)
return ELV_MQUEUE_MAY;
 
-   cfqq = cic_to_cfqq(cic, rw_is_sync(rw));
+   cfqq = cic_to_cfqq(cic, rw_is_sync(op | op_flags));
if (cfqq) {
cfq_init_prio_data(cfqq, cic);
 
diff --git a/block/elevator.c b/block/elevator.c
index c3555c9..ea9319d 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -366,8 +366,7 @@ void elv_dispatch_sort(struct request_queue *q, struct 
request *rq)
list_for_each_prev(entry, &q->queue_head) {
struct request *pos = list_entry_rq(entry);
 
-   if ((rq->cmd_flags & REQ_DISCARD) !=
-   (pos->cmd_flags & REQ_DISCARD))
+   if ((req_op(rq) == REQ_OP_DISCARD) != (req_op(pos) == 
REQ_OP_DISCARD))
break;
if (rq_data_dir(rq) != rq_data_dir(pos))
break;
@@ -717,12 +716,12 @@ void elv_put_request(struct request_queue *q, struct 
request *rq)
e->type->ops.elevator_put_req_fn(rq);
 }
 
-int elv_may_queue(struct request_queue *q, int rw)
+int elv_may_queue(struct request_queue *q, int op, int op_flags)
 {
struct elevator_queue *e = q->elevator;
 
if (e->type->ops.elevator_may_queue_fn)
-   return e->type->ops.elevator_may_queue_fn(q, rw);
+   return e->type->ops.elevator_may_queue_fn(q, op, op_flags);
 
return ELV_MQUEUE_MAY;
 }
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 638b324..953d286 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -26,7 +26,7 @@ typedef int (elevator_dispatch_fn) (struct request_queue *, 
int);
 typedef void (elevator_add_req_fn) (struct request_queue *, struct request *);
 typedef struct request *(elevator_request_list_fn) (struct request_queue *, 
struct request *);
 typedef void (elevator_completed_req_fn) (struct request_queue *, struct 
request *);
-typedef int (elevator_may_queue_fn) (struct request_queue *, int);
+typedef int (elevator_may_queue_fn) (struct request_queue *, int, int);
 
 typedef void (elevator_init_icq_fn) (struct io_cq *);
 typedef void (elevator_exit_icq_fn) (struct io_cq *);
@@ -134,7 +134,7 @@ extern struct request *elv_former_request(struct 
request_queue *, struct request
 extern struct request *elv_latter_request(struct request_queue *, struct 
request *);
 extern int elv_register_queue(struct request_queue *q);
 extern void elv_unregister_queue(struct request_queue *q);
-extern int elv_may_queue(struct request_queue *, int);
+extern int elv_may_queue(struct request_queue *, int, int);
 extern void elv_completed_request(struct request_queue *, struct request *);
 extern int elv_set_request(struct request_queue *q, struct request *rq,
   struct bio *bio, gfp_t gfp_mask);
-- 
2.7.2



[PATCH 21/45] ocfs2: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have ocfs2
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 fs/ocfs2/cluster/heartbeat.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/ocfs2/cluster/heartbeat.c b/fs/ocfs2/cluster/heartbeat.c
index 8b1d86e..636abcb 100644
--- a/fs/ocfs2/cluster/heartbeat.c
+++ b/fs/ocfs2/cluster/heartbeat.c
@@ -530,7 +530,8 @@ static void o2hb_bio_end_io(struct bio *bio)
 static struct bio *o2hb_setup_one_bio(struct o2hb_region *reg,
  struct o2hb_bio_wait_ctxt *wc,
  unsigned int *current_slot,
- unsigned int max_slots, int rw)
+ unsigned int max_slots, int op,
+ int op_flags)
 {
int len, current_page;
unsigned int vec_len, vec_start;
@@ -556,7 +557,7 @@ static struct bio *o2hb_setup_one_bio(struct o2hb_region 
*reg,
bio->bi_bdev = reg->hr_bdev;
bio->bi_private = wc;
bio->bi_end_io = o2hb_bio_end_io;
-   bio->bi_rw = rw;
+   bio_set_op_attrs(bio, op, op_flags);
 
vec_start = (cs << bits) % PAGE_SIZE;
while(cs < max_slots) {
@@ -593,7 +594,7 @@ static int o2hb_read_slots(struct o2hb_region *reg,
 
while(current_slot < max_slots) {
bio = o2hb_setup_one_bio(reg, &wc, &current_slot, max_slots,
-READ);
+REQ_OP_READ, 0);
if (IS_ERR(bio)) {
status = PTR_ERR(bio);
mlog_errno(status);
@@ -625,7 +626,8 @@ static int o2hb_issue_node_write(struct o2hb_region *reg,
 
slot = o2nm_this_node();
 
-   bio = o2hb_setup_one_bio(reg, write_wc, &slot, slot+1, WRITE_SYNC);
+   bio = o2hb_setup_one_bio(reg, write_wc, &slot, slot+1, REQ_OP_WRITE,
+WRITE_SYNC);
if (IS_ERR(bio)) {
status = PTR_ERR(bio);
mlog_errno(status);
-- 
2.7.2



[PATCH 39/45] ide cd: do not set REQ_WRITE on requests.

2016-06-05 Thread mchristi
From: Mike Christie 

The block layer will set the correct READ/WRITE operation flags/fields
when creating a request, so there is no need for drivers to set the
REQ_WRITE flag.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 drivers/ide/ide-cd_ioctl.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/ide/ide-cd_ioctl.c b/drivers/ide/ide-cd_ioctl.c
index 474173e..5887a7a 100644
--- a/drivers/ide/ide-cd_ioctl.c
+++ b/drivers/ide/ide-cd_ioctl.c
@@ -459,9 +459,6 @@ int ide_cdrom_packet(struct cdrom_device_info *cdi,
   layer. the packet must be complete, as we do not
   touch it at all. */
 
-   if (cgc->data_direction == CGC_DATA_WRITE)
-   flags |= REQ_WRITE;
-
if (cgc->sense)
memset(cgc->sense, 0, sizeof(struct request_sense));
 
-- 
2.7.2



[PATCH 23/45] dm: pass dm stats data dir instead of bi_rw

2016-06-05 Thread mchristi
From: Mike Christie 

It looks like dm stats cares about the data direction
(READ vs WRITE) and does not need the bio/request flags.
Commands like REQ_FLUSH, REQ_DISCARD and REQ_WRITE_SAME
are currently always set with REQ_WRITE, so the extra check for
REQ_DISCARD in dm_stats_account_io is not needed.

This patch has it use the bio and request data_dir helpers
instead of accessing the bi_rw/cmd_flags directly. This makes
the next patches that remove the operation from the cmd_flags
and bi_rw easier, because we will no longer have the REQ_WRITE
bit set for operations like discards.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
v2:

1. Merged Mike Snitzer's fixes to pass in int instead of
unsigned long.

2. Fix 80 char col issues.

 drivers/md/dm-stats.c |  9 -
 drivers/md/dm.c   | 21 -
 2 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/md/dm-stats.c b/drivers/md/dm-stats.c
index 8289804..4fba26c 100644
--- a/drivers/md/dm-stats.c
+++ b/drivers/md/dm-stats.c
@@ -514,11 +514,10 @@ static void dm_stat_round(struct dm_stat *s, struct 
dm_stat_shared *shared,
 }
 
 static void dm_stat_for_entry(struct dm_stat *s, size_t entry,
- unsigned long bi_rw, sector_t len,
+ int idx, sector_t len,
  struct dm_stats_aux *stats_aux, bool end,
  unsigned long duration_jiffies)
 {
-   unsigned long idx = bi_rw & REQ_WRITE;
struct dm_stat_shared *shared = &s->stat_shared[entry];
struct dm_stat_percpu *p;
 
@@ -584,7 +583,7 @@ static void dm_stat_for_entry(struct dm_stat *s, size_t 
entry,
 #endif
 }
 
-static void __dm_stat_bio(struct dm_stat *s, unsigned long bi_rw,
+static void __dm_stat_bio(struct dm_stat *s, int bi_rw,
  sector_t bi_sector, sector_t end_sector,
  bool end, unsigned long duration_jiffies,
  struct dm_stats_aux *stats_aux)
@@ -645,8 +644,8 @@ void dm_stats_account_io(struct dm_stats *stats, unsigned 
long bi_rw,
last = raw_cpu_ptr(stats->last);
stats_aux->merged =
(bi_sector == (ACCESS_ONCE(last->last_sector) &&
-  ((bi_rw & (REQ_WRITE | REQ_DISCARD)) ==
-   (ACCESS_ONCE(last->last_rw) & 
(REQ_WRITE | REQ_DISCARD)))
+  ((bi_rw == WRITE) ==
+   (ACCESS_ONCE(last->last_rw) == WRITE))
   ));
ACCESS_ONCE(last->last_sector) = end_sector;
ACCESS_ONCE(last->last_rw) = bi_rw;
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1b2f962..f5ac0a3 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -723,8 +723,9 @@ static void start_io_acct(struct dm_io *io)
atomic_inc_return(&md->pending[rw]));
 
if (unlikely(dm_stats_used(&md->stats)))
-   dm_stats_account_io(&md->stats, bio->bi_rw, 
bio->bi_iter.bi_sector,
-   bio_sectors(bio), false, 0, &io->stats_aux);
+   dm_stats_account_io(&md->stats, bio_data_dir(bio),
+   bio->bi_iter.bi_sector, bio_sectors(bio),
+   false, 0, &io->stats_aux);
 }
 
 static void end_io_acct(struct dm_io *io)
@@ -738,8 +739,9 @@ static void end_io_acct(struct dm_io *io)
generic_end_io_acct(rw, &dm_disk(md)->part0, io->start_time);
 
if (unlikely(dm_stats_used(&md->stats)))
-   dm_stats_account_io(&md->stats, bio->bi_rw, 
bio->bi_iter.bi_sector,
-   bio_sectors(bio), true, duration, 
&io->stats_aux);
+   dm_stats_account_io(&md->stats, bio_data_dir(bio),
+   bio->bi_iter.bi_sector, bio_sectors(bio),
+   true, duration, &io->stats_aux);
 
/*
 * After this is decremented the bio must not be touched if it is
@@ -1121,9 +1123,9 @@ static void rq_end_stats(struct mapped_device *md, struct 
request *orig)
if (unlikely(dm_stats_used(&md->stats))) {
struct dm_rq_target_io *tio = tio_from_request(orig);
tio->duration_jiffies = jiffies - tio->duration_jiffies;
-   dm_stats_account_io(&md->stats, orig->cmd_flags, 
blk_rq_pos(orig),
-   tio->n_sectors, true, tio->duration_jiffies,
-   &tio->stats_aux);
+   dm_stats_account_io(&md->stats, rq_data_dir(orig),
+   blk_rq_pos(orig), tio->n_sectors, true,
+   tio->duration_jiffies, &tio->stats_aux);
}
 }
 
@@ -2082,8 +2084,9 @@ static void dm_start_request(struct mapped_device *md, 
struct request *orig)
 

[PATCH 28/45] target: use bio op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Separate the op from the rq_flag_bits and have the target layer
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie 
---
 drivers/target/target_core_iblock.c | 29 ++---
 drivers/target/target_core_pscsi.c  |  2 +-
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/drivers/target/target_core_iblock.c 
b/drivers/target/target_core_iblock.c
index c25109c..22af12f 100644
--- a/drivers/target/target_core_iblock.c
+++ b/drivers/target/target_core_iblock.c
@@ -312,7 +312,8 @@ static void iblock_bio_done(struct bio *bio)
 }
 
 static struct bio *
-iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int rw)
+iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 sg_num, int op,
+  int op_flags)
 {
struct iblock_dev *ib_dev = IBLOCK_DEV(cmd->se_dev);
struct bio *bio;
@@ -334,7 +335,7 @@ iblock_get_bio(struct se_cmd *cmd, sector_t lba, u32 
sg_num, int rw)
bio->bi_private = cmd;
bio->bi_end_io = &iblock_bio_done;
bio->bi_iter.bi_sector = lba;
-   bio->bi_rw = rw;
+   bio_set_op_attrs(bio, op, op_flags);
 
return bio;
 }
@@ -480,7 +481,7 @@ iblock_execute_write_same(struct se_cmd *cmd)
goto fail;
cmd->priv = ibr;
 
-   bio = iblock_get_bio(cmd, block_lba, 1, WRITE);
+   bio = iblock_get_bio(cmd, block_lba, 1, REQ_OP_WRITE, 0);
if (!bio)
goto fail_free_ibr;
 
@@ -493,7 +494,8 @@ iblock_execute_write_same(struct se_cmd *cmd)
while (bio_add_page(bio, sg_page(sg), sg->length, sg->offset)
!= sg->length) {
 
-   bio = iblock_get_bio(cmd, block_lba, 1, WRITE);
+   bio = iblock_get_bio(cmd, block_lba, 1, REQ_OP_WRITE,
+0);
if (!bio)
goto fail_put_bios;
 
@@ -679,8 +681,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist 
*sgl, u32 sgl_nents,
struct scatterlist *sg;
u32 sg_num = sgl_nents;
unsigned bio_cnt;
-   int rw = 0;
-   int i;
+   int i, op, op_flags = 0;
 
if (data_direction == DMA_TO_DEVICE) {
struct iblock_dev *ib_dev = IBLOCK_DEV(dev);
@@ -689,18 +690,15 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist 
*sgl, u32 sgl_nents,
 * Force writethrough using WRITE_FUA if a volatile write cache
 * is not enabled, or if initiator set the Force Unit Access 
bit.
 */
+   op = REQ_OP_WRITE;
if (test_bit(QUEUE_FLAG_FUA, &q->queue_flags)) {
if (cmd->se_cmd_flags & SCF_FUA)
-   rw = WRITE_FUA;
+   op_flags = WRITE_FUA;
else if (!test_bit(QUEUE_FLAG_WC, &q->queue_flags))
-   rw = WRITE_FUA;
-   else
-   rw = WRITE;
-   } else {
-   rw = WRITE;
+   op_flags = WRITE_FUA;
}
} else {
-   rw = READ;
+   op = REQ_OP_READ;
}
 
ibr = kzalloc(sizeof(struct iblock_req), GFP_KERNEL);
@@ -714,7 +712,7 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist 
*sgl, u32 sgl_nents,
return 0;
}
 
-   bio = iblock_get_bio(cmd, block_lba, sgl_nents, rw);
+   bio = iblock_get_bio(cmd, block_lba, sgl_nents, op, op_flags);
if (!bio)
goto fail_free_ibr;
 
@@ -738,7 +736,8 @@ iblock_execute_rw(struct se_cmd *cmd, struct scatterlist 
*sgl, u32 sgl_nents,
bio_cnt = 0;
}
 
-   bio = iblock_get_bio(cmd, block_lba, sg_num, rw);
+   bio = iblock_get_bio(cmd, block_lba, sg_num, op,
+op_flags);
if (!bio)
goto fail_put_bios;
 
diff --git a/drivers/target/target_core_pscsi.c 
b/drivers/target/target_core_pscsi.c
index de18790..81564c8 100644
--- a/drivers/target/target_core_pscsi.c
+++ b/drivers/target/target_core_pscsi.c
@@ -922,7 +922,7 @@ pscsi_map_sg(struct se_cmd *cmd, struct scatterlist *sgl, 
u32 sgl_nents,
goto fail;
 
if (rw)
-   bio->bi_rw |= REQ_WRITE;
+   bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
 
pr_debug("PSCSI: Allocated bio: %p,"
" dir: %s nr_vecs: %d\n", bio,
-- 
2.7.2


[PATCH 41/45] block, drivers, fs: shrink bi_rw from long to int

2016-06-05 Thread mchristi
From: Mike Christie 

We don't need bi_rw to be so large on 64-bit archs, so
reduce it to unsigned int.

Signed-off-by: Mike Christie 
---
 block/blk-core.c   | 2 +-
 drivers/md/dm-flakey.c | 2 +-
 drivers/md/raid5.c | 6 +++---
 fs/btrfs/check-integrity.c | 4 ++--
 fs/btrfs/inode.c   | 2 +-
 include/linux/blk_types.h  | 2 +-
 6 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index f9f4228..c7d66c2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1853,7 +1853,7 @@ static void handle_bad_sector(struct bio *bio)
char b[BDEVNAME_SIZE];
 
printk(KERN_INFO "attempt to access beyond end of device\n");
-   printk(KERN_INFO "%s: rw=%ld, want=%Lu, limit=%Lu\n",
+   printk(KERN_INFO "%s: rw=%d, want=%Lu, limit=%Lu\n",
bdevname(bio->bi_bdev, b),
bio->bi_rw,
(unsigned long long)bio_end_sector(bio),
diff --git a/drivers/md/dm-flakey.c b/drivers/md/dm-flakey.c
index b7341de..29b99fb 100644
--- a/drivers/md/dm-flakey.c
+++ b/drivers/md/dm-flakey.c
@@ -266,7 +266,7 @@ static void corrupt_bio_data(struct bio *bio, struct 
flakey_c *fc)
data[fc->corrupt_bio_byte - 1] = fc->corrupt_bio_value;
 
DMDEBUG("Corrupting data bio=%p by writing %u to byte %u "
-   "(rw=%c bi_rw=%lu bi_sector=%llu cur_bytes=%u)\n",
+   "(rw=%c bi_rw=%u bi_sector=%llu cur_bytes=%u)\n",
bio, fc->corrupt_bio_value, fc->corrupt_bio_byte,
(bio_data_dir(bio) == WRITE) ? 'w' : 'r', bio->bi_rw,
(unsigned long long)bio->bi_iter.bi_sector, bio_bytes);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e35c163..b9122e2 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -1001,7 +1001,7 @@ again:
: raid5_end_read_request;
bi->bi_private = sh;
 
-   pr_debug("%s: for %llu schedule op %ld on disc %d\n",
+   pr_debug("%s: for %llu schedule op %d on disc %d\n",
__func__, (unsigned long long)sh->sector,
bi->bi_rw, i);
atomic_inc(>count);
@@ -1052,7 +1052,7 @@ again:
rbi->bi_end_io = raid5_end_write_request;
rbi->bi_private = sh;
 
-   pr_debug("%s: for %llu schedule op %ld on "
+   pr_debug("%s: for %llu schedule op %d on "
 "replacement disc %d\n",
__func__, (unsigned long long)sh->sector,
rbi->bi_rw, i);
@@ -1087,7 +1087,7 @@ again:
if (!rdev && !rrdev) {
if (op_is_write(op))
set_bit(STRIPE_DEGRADED, &sh->state);
-   pr_debug("skip op %ld on disc %d for sector %llu\n",
+   pr_debug("skip op %d on disc %d for sector %llu\n",
bi->bi_rw, i, (unsigned long long)sh->sector);
clear_bit(R5_LOCKED, &sh->dev[i].flags);
set_bit(STRIPE_HANDLE, &sh->state);
diff --git a/fs/btrfs/check-integrity.c b/fs/btrfs/check-integrity.c
index 80a4389..da944ff 100644
--- a/fs/btrfs/check-integrity.c
+++ b/fs/btrfs/check-integrity.c
@@ -2943,7 +2943,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bio(rw=%d,0x%lx, bi_vcnt=%u,"
+  "submit_bio(rw=%d,0x%x, bi_vcnt=%u,"
   " bi_sector=%llu (bytenr %llu), bi_bdev=%p)\n",
   bio_op(bio), bio->bi_rw, bio->bi_vcnt,
   (unsigned long long)bio->bi_iter.bi_sector,
@@ -2986,7 +2986,7 @@ static void __btrfsic_submit_bio(struct bio *bio)
if (dev_state->state->print_mask &
BTRFSIC_PRINT_MASK_SUBMIT_BIO_BH)
printk(KERN_INFO
-  "submit_bio(rw=%d,0x%lx FLUSH, bdev=%p)\n",
+  "submit_bio(rw=%d,0x%x FLUSH, bdev=%p)\n",
   bio_op(bio), bio->bi_rw, bio->bi_bdev);
if (!dev_state->dummy_block_for_bio_bh_flush.is_iodone) {
if ((dev_state->state->print_mask &
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 128b02b..412e582 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -8173,7 +8173,7 @@ static void btrfs_end_dio_bio(struct bio *bio)
 
if (err)
btrfs_warn(BTRFS_I(dip->inode)->root->fs_info,
-  "direct IO failed 

[PATCH 44/45] block: do not use REQ_FLUSH for tracking flush support

2016-06-05 Thread mchristi
From: Mike Christie 

The last patch added a REQ_OP_FLUSH for request_fn drivers,
and the next patch renames REQ_FLUSH to REQ_PREFLUSH, which
will be used by file systems and make_request_fn drivers so
they can send a write/flush combo.

This patch drops xen's use of REQ_FLUSH to track if it supports
REQ_OP_FLUSH requests, so REQ_FLUSH can be deleted.
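The resulting selection logic can be sketched as a stand-alone function (the op names below are hypothetical stand-ins for this sketch, not the real BLKIF_OP_* constants): the old switch on feature_flush & (REQ_FLUSH|REQ_FUA) becomes two independent feature bits.

```c
#include <assert.h>

/* Hypothetical stand-ins for the ring operations chosen below. */
enum ring_op { OP_NONE = 0, OP_FLUSH_DISKCACHE = 1, OP_WRITE_BARRIER = 2 };

/* Mirrors the rewritten blkif_queue_rw_req() selection: two separate
 * feature bits replace the old REQ_FLUSH|REQ_FUA bitmask switch. */
enum ring_op pick_flush_op(unsigned int feature_flush, unsigned int feature_fua)
{
	if (feature_flush && feature_fua)
		return OP_WRITE_BARRIER;	/* write/flush combo, ordered */
	else if (feature_flush)
		return OP_FLUSH_DISKCACHE;	/* cache flush only */
	else
		return OP_NONE;			/* neither feature advertised */
}
```

Note that FUA without flush support falls through to OP_NONE, matching the if/else chain in the hunk below.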

Signed-off-by: Mike Christie 
Reviewed-by: Hannes Reinecke 
Signed-off-by: Juergen Gross 
---

v7:
- Fix feature_flush/fua use.

v6:
- Dropped parts of patch handled by Jens's QUEUE_FLAG_WC/FUA
patches and modified patch to check feature_flush/fua bits.

 drivers/block/xen-blkfront.c | 47 ++--
 1 file changed, 24 insertions(+), 23 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 3aeb25b..343ef7a 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -196,6 +196,7 @@ struct blkfront_info
unsigned int nr_ring_pages;
struct request_queue *rq;
unsigned int feature_flush;
+   unsigned int feature_fua;
unsigned int feature_discard:1;
unsigned int feature_secdiscard:1;
unsigned int discard_granularity;
@@ -763,19 +764,14 @@ static int blkif_queue_rw_req(struct request *req, struct blkfront_ring_info *ri
 * implement it the same way.  (It's also a FLUSH+FUA,
 * since it is guaranteed ordered WRT previous writes.)
 */
-   switch (info->feature_flush &
-   ((REQ_FLUSH|REQ_FUA))) {
-   case REQ_FLUSH|REQ_FUA:
+   if (info->feature_flush && info->feature_fua)
ring_req->operation =
BLKIF_OP_WRITE_BARRIER;
-   break;
-   case REQ_FLUSH:
+   else if (info->feature_flush)
ring_req->operation =
BLKIF_OP_FLUSH_DISKCACHE;
-   break;
-   default:
+   else
ring_req->operation = 0;
-   }
}
ring_req->u.rw.nr_segments = num_grant;
if (unlikely(require_extra_req)) {
@@ -866,9 +862,9 @@ static inline bool blkif_request_flush_invalid(struct request *req,
 {
return ((req->cmd_type != REQ_TYPE_FS) ||
((req_op(req) == REQ_OP_FLUSH) &&
-!(info->feature_flush & REQ_FLUSH)) ||
+!info->feature_flush) ||
((req->cmd_flags & REQ_FUA) &&
-!(info->feature_flush & REQ_FUA)));
+!info->feature_fua));
 }
 
 static int blkif_queue_rq(struct blk_mq_hw_ctx *hctx,
@@ -985,24 +981,22 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size,
return 0;
 }
 
-static const char *flush_info(unsigned int feature_flush)
+static const char *flush_info(struct blkfront_info *info)
 {
-   switch (feature_flush & ((REQ_FLUSH | REQ_FUA))) {
-   case REQ_FLUSH|REQ_FUA:
+   if (info->feature_flush && info->feature_fua)
return "barrier: enabled;";
-   case REQ_FLUSH:
+   else if (info->feature_flush)
return "flush diskcache: enabled;";
-   default:
+   else
return "barrier or flush: disabled;";
-   }
 }
 
 static void xlvbd_flush(struct blkfront_info *info)
 {
-   blk_queue_write_cache(info->rq, info->feature_flush & REQ_FLUSH,
-   info->feature_flush & REQ_FUA);
+   blk_queue_write_cache(info->rq, info->feature_flush ? true : false,
+ info->feature_fua ? true : false);
pr_info("blkfront: %s: %s %s %s %s %s\n",
-   info->gd->disk_name, flush_info(info->feature_flush),
+   info->gd->disk_name, flush_info(info),
"persistent grants:", info->feature_persistent ?
"enabled;" : "disabled;", "indirect descriptors:",
info->max_indirect_segments ? "enabled;" : "disabled;");
@@ -1621,6 +1615,7 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id)
if (unlikely(error)) {
if (error == -EOPNOTSUPP)
error = 0;
+   info->feature_fua = 0;
info->feature_flush = 0;
xlvbd_flush(info);
}
@@ -2315,6 +2310,7 @@ static void blkfront_gather_backend_features(struct blkfront_info *info)
unsigned int indirect_segments;
 
info->feature_flush = 0;
+   info->feature_fua = 0;
 
err = 

[PATCH 37/45] drivers: use req op accessor

2016-06-05 Thread mchristi
From: Mike Christie 

The req operation REQ_OP is separated from the rq_flag_bits
definition. This converts the block layer drivers to
use req_op to get the op from the request struct.
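The conversion pattern, reduced to a toy example (simplified types for this sketch, not the kernel structs): flag tests like rq->cmd_flags & REQ_DISCARD become comparisons against a dedicated op value read through an accessor.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel's request and op definitions. */
enum req_op { REQ_OP_READ, REQ_OP_WRITE, REQ_OP_DISCARD };

struct request {
	enum req_op op;		/* the operation, now a value of its own */
	unsigned int cmd_flags;	/* remaining modifier flags */
};

enum req_op req_op(const struct request *rq)
{
	return rq->op;
}

/* Old style: if (rq->cmd_flags & REQ_DISCARD) ...
 * New style: if (req_op(rq) == REQ_OP_DISCARD) ... */
int rq_is_discard(const struct request *rq)
{
	return req_op(rq) == REQ_OP_DISCARD;
}
```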

Signed-off-by: Mike Christie 
---
 drivers/block/loop.c  |  6 +++---
 drivers/block/mtip32xx/mtip32xx.c |  2 +-
 drivers/block/nbd.c   |  2 +-
 drivers/block/rbd.c   |  4 ++--
 drivers/block/xen-blkfront.c  |  8 +---
 drivers/ide/ide-floppy.c  |  2 +-
 drivers/md/dm.c   |  2 +-
 drivers/mmc/card/block.c  |  7 +++
 drivers/mmc/card/queue.c  |  6 ++
 drivers/mmc/card/queue.h  |  5 -
 drivers/mtd/mtd_blkdevs.c |  2 +-
 drivers/nvme/host/core.c  |  2 +-
 drivers/nvme/host/nvme.h  |  4 ++--
 drivers/scsi/sd.c | 25 -
 14 files changed, 43 insertions(+), 34 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index e9f1701..b9b737c 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -544,7 +544,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
if (op_is_write(req_op(rq))) {
if (rq->cmd_flags & REQ_FLUSH)
ret = lo_req_flush(lo, rq);
-   else if (rq->cmd_flags & REQ_DISCARD)
+   else if (req_op(rq) == REQ_OP_DISCARD)
ret = lo_discard(lo, rq, pos);
else if (lo->transfer)
ret = lo_write_transfer(lo, rq, pos);
@@ -1659,8 +1659,8 @@ static int loop_queue_rq(struct blk_mq_hw_ctx *hctx,
if (lo->lo_state != Lo_bound)
return -EIO;
 
-   if (lo->use_dio && !(cmd->rq->cmd_flags & (REQ_FLUSH |
-   REQ_DISCARD)))
+   if (lo->use_dio && (!(cmd->rq->cmd_flags & REQ_FLUSH) ||
+   req_op(cmd->rq) == REQ_OP_DISCARD))
cmd->use_aio = true;
else
cmd->use_aio = false;
diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
index 6053e46..8e3e708 100644
--- a/drivers/block/mtip32xx/mtip32xx.c
+++ b/drivers/block/mtip32xx/mtip32xx.c
@@ -3765,7 +3765,7 @@ static int mtip_submit_request(struct blk_mq_hw_ctx *hctx, struct request *rq)
return -ENODATA;
}
 
-   if (rq->cmd_flags & REQ_DISCARD) {
+   if (req_op(rq) == REQ_OP_DISCARD) {
int err;
 
err = mtip_send_trim(dd, blk_rq_pos(rq), blk_rq_sectors(rq));
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 31e73a7..6c2c28d 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -282,7 +282,7 @@ static int nbd_send_req(struct nbd_device *nbd, struct request *req)
 
if (req->cmd_type == REQ_TYPE_DRV_PRIV)
type = NBD_CMD_DISC;
-   else if (req->cmd_flags & REQ_DISCARD)
+   else if (req_op(req) == REQ_OP_DISCARD)
type = NBD_CMD_TRIM;
else if (req->cmd_flags & REQ_FLUSH)
type = NBD_CMD_FLUSH;
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 81666a5..4506620 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3286,9 +3286,9 @@ static void rbd_queue_workfn(struct work_struct *work)
goto err;
}
 
-   if (rq->cmd_flags & REQ_DISCARD)
+   if (req_op(rq) == REQ_OP_DISCARD)
op_type = OBJ_OP_DISCARD;
-   else if (rq->cmd_flags & REQ_WRITE)
+   else if (req_op(rq) == REQ_OP_WRITE)
op_type = OBJ_OP_WRITE;
else
op_type = OBJ_OP_READ;
diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 52963a2..6fd1601 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -844,7 +844,8 @@ static int blkif_queue_request(struct request *req, struct blkfront_ring_info *r
if (unlikely(rinfo->dev_info->connected != BLKIF_STATE_CONNECTED))
return 1;
 
-   if (unlikely(req->cmd_flags & (REQ_DISCARD | REQ_SECURE)))
+   if (unlikely(req_op(req) == REQ_OP_DISCARD ||
+req->cmd_flags & REQ_SECURE))
return blkif_queue_discard_req(req, rinfo);
else
return blkif_queue_rw_req(req, rinfo);
@@ -2054,8 +2055,9 @@ static int blkif_recover(struct blkfront_info *info)
/*
 * Get the bios in the request so we can re-queue them.
 */
-   if (copy[i].request->cmd_flags &
-   (REQ_FLUSH | REQ_FUA | REQ_DISCARD | REQ_SECURE)) {
+   if (copy[i].request->cmd_flags & REQ_FLUSH ||
+   req_op(copy[i].request) == REQ_OP_DISCARD ||
+   copy[i].request->cmd_flags & (REQ_FUA | REQ_SECURE)) {
/*

[PATCH 40/45] block: move bio io prio to a new field

2016-06-05 Thread mchristi
From: Mike Christie 

In the next patch, we drop the compat code and make
the op a separate value that is carried in bi_rw. To give
the op and the rq flag bits room to grow, this patch moves prio to
its own field.
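For reference, the packing being removed can be modelled as plain functions (IOPRIO_BITS width assumed for this sketch, and functions instead of the kernel macros); after this patch, bio_set_prio() is simply an assignment to the new bi_ioprio field.

```c
#include <assert.h>

#define IOPRIO_BITS	16	/* assumed priority width for this sketch */

/* Old scheme: the priority lived in the top IOPRIO_BITS of bi_rw. */
#define BIO_PRIO_SHIFT	(8 * sizeof(unsigned long) - IOPRIO_BITS)

unsigned long old_set_prio(unsigned long bi_rw, unsigned int prio)
{
	bi_rw &= (1UL << BIO_PRIO_SHIFT) - 1;		/* clear the old prio */
	bi_rw |= (unsigned long)prio << BIO_PRIO_SHIFT;	/* pack the new prio */
	return bi_rw;
}

unsigned int old_get_prio(unsigned long bi_rw)
{
	return bi_rw >> BIO_PRIO_SHIFT;
}
```

The round trip shows why the field was cramped: the priority permanently reserved the top bits, which the op now needs.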

Signed-off-by: Mike Christie 
---
 include/linux/bio.h   | 14 ++
 include/linux/blk_types.h |  5 ++---
 2 files changed, 4 insertions(+), 15 deletions(-)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 4568647..35108c2 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -47,18 +47,8 @@
 #define bio_op(bio)(op_from_rq_bits((bio)->bi_rw))
 #define bio_set_op_attrs(bio, op, flags)   ((bio)->bi_rw |= (op | flags))
 
-/*
- * upper 16 bits of bi_rw define the io priority of this bio
- */
-#define BIO_PRIO_SHIFT (8 * sizeof(unsigned long) - IOPRIO_BITS)
-#define bio_prio(bio)  ((bio)->bi_rw >> BIO_PRIO_SHIFT)
-#define bio_prio_valid(bio)ioprio_valid(bio_prio(bio))
-
-#define bio_set_prio(bio, prio)do {\
-   WARN_ON(prio >= (1 << IOPRIO_BITS));\
-   (bio)->bi_rw &= ((1UL << BIO_PRIO_SHIFT) - 1);  \
-   (bio)->bi_rw |= ((unsigned long) (prio) << BIO_PRIO_SHIFT); \
-} while (0)
+#define bio_prio(bio)  (bio)->bi_ioprio
+#define bio_set_prio(bio, prio)((bio)->bi_ioprio = prio)
 
 /*
  * various member access, note that bio_data should of course not be used
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 6e60baa..2738413 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -48,9 +48,8 @@ struct bio {
struct block_device *bi_bdev;
unsigned intbi_flags;   /* status, command, etc */
int bi_error;
-   unsigned long   bi_rw;  /* bottom bits READ/WRITE,
-* top bits priority
-*/
+   unsigned long   bi_rw;  /* READ/WRITE */
+   unsigned short  bi_ioprio;
 
struct bvec_iterbi_iter;
 
-- 
2.7.2

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 38/45] blktrace: use op accessors

2016-06-05 Thread mchristi
From: Mike Christie 

Have blktrace use the req/bio op accessor to get the REQ_OP.
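A toy model of the new calling convention (not the real blk_fill_rwbs(), which handles many more ops and flags): the op arrives as its own argument and picks the action character, while the remaining flags drive the modifier characters.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins for the kernel op and flag definitions. */
enum req_op { REQ_OP_READ, REQ_OP_WRITE, REQ_OP_DISCARD };
#define REQ_SYNC	(1U << 0)

/* Toy blk_fill_rwbs(): op picks the action character, flags append
 * modifiers -- previously both were dug out of a single rw bitmap. */
void fill_rwbs(char *rwbs, enum req_op op, unsigned int rw)
{
	int i = 0;

	if (op == REQ_OP_DISCARD)
		rwbs[i++] = 'D';
	else if (op == REQ_OP_WRITE)
		rwbs[i++] = 'W';
	else
		rwbs[i++] = 'R';
	if (rw & REQ_SYNC)
		rwbs[i++] = 'S';
	rwbs[i] = '\0';
}
```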

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---

v8:
1. Fix REQ_OP_WRITE_SAME handling, so it is not reported as an N.

 include/linux/blktrace_api.h  |  2 +-
 include/trace/events/bcache.h | 12 ++---
 include/trace/events/block.h  | 31 ++
 kernel/trace/blktrace.c   | 62 +--
 4 files changed, 65 insertions(+), 42 deletions(-)

diff --git a/include/linux/blktrace_api.h b/include/linux/blktrace_api.h
index 0f3172b..cceb72f 100644
--- a/include/linux/blktrace_api.h
+++ b/include/linux/blktrace_api.h
@@ -118,7 +118,7 @@ static inline int blk_cmd_buf_len(struct request *rq)
 }
 
 extern void blk_dump_cmd(char *buf, struct request *rq);
-extern void blk_fill_rwbs(char *rwbs, u32 rw, int bytes);
+extern void blk_fill_rwbs(char *rwbs, int op, u32 rw, int bytes);
 
 #endif /* CONFIG_EVENT_TRACING && CONFIG_BLOCK */
 
diff --git a/include/trace/events/bcache.h b/include/trace/events/bcache.h
index 981acf7..65673d8 100644
--- a/include/trace/events/bcache.h
+++ b/include/trace/events/bcache.h
@@ -27,7 +27,8 @@ DECLARE_EVENT_CLASS(bcache_request,
__entry->sector = bio->bi_iter.bi_sector;
__entry->orig_sector= bio->bi_iter.bi_sector - 16;
__entry->nr_sector  = bio->bi_iter.bi_size >> 9;
-   blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size);
+   blk_fill_rwbs(__entry->rwbs, bio_op(bio), bio->bi_rw,
+ bio->bi_iter.bi_size);
),
 
TP_printk("%d,%d %s %llu + %u (from %d,%d @ %llu)",
@@ -101,7 +102,8 @@ DECLARE_EVENT_CLASS(bcache_bio,
__entry->dev= bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_iter.bi_sector;
__entry->nr_sector  = bio->bi_iter.bi_size >> 9;
-   blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size);
+   blk_fill_rwbs(__entry->rwbs, bio_op(bio), bio->bi_rw,
+ bio->bi_iter.bi_size);
),
 
TP_printk("%d,%d  %s %llu + %u",
@@ -136,7 +138,8 @@ TRACE_EVENT(bcache_read,
__entry->dev= bio->bi_bdev->bd_dev;
__entry->sector = bio->bi_iter.bi_sector;
__entry->nr_sector  = bio->bi_iter.bi_size >> 9;
-   blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size);
+   blk_fill_rwbs(__entry->rwbs, bio_op(bio), bio->bi_rw,
+ bio->bi_iter.bi_size);
__entry->cache_hit = hit;
__entry->bypass = bypass;
),
@@ -167,7 +170,8 @@ TRACE_EVENT(bcache_write,
__entry->inode  = inode;
__entry->sector = bio->bi_iter.bi_sector;
__entry->nr_sector  = bio->bi_iter.bi_size >> 9;
-   blk_fill_rwbs(__entry->rwbs, bio->bi_rw, bio->bi_iter.bi_size);
+   blk_fill_rwbs(__entry->rwbs, bio_op(bio), bio->bi_rw,
+ bio->bi_iter.bi_size);
__entry->writeback = writeback;
__entry->bypass = bypass;
),
diff --git a/include/trace/events/block.h b/include/trace/events/block.h
index e8a5eca..5a2a759 100644
--- a/include/trace/events/block.h
+++ b/include/trace/events/block.h
@@ -84,7 +84,8 @@ DECLARE_EVENT_CLASS(block_rq_with_error,
0 : blk_rq_sectors(rq);
__entry->errors= rq->errors;
 
-   blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, blk_rq_bytes(rq));
+   blk_fill_rwbs(__entry->rwbs, req_op(rq), rq->cmd_flags,
+ blk_rq_bytes(rq));
blk_dump_cmd(__get_str(cmd), rq);
),
 
@@ -162,7 +163,7 @@ TRACE_EVENT(block_rq_complete,
__entry->nr_sector = nr_bytes >> 9;
__entry->errors= rq->errors;
 
-   blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, nr_bytes);
+   blk_fill_rwbs(__entry->rwbs, req_op(rq), rq->cmd_flags, nr_bytes);
blk_dump_cmd(__get_str(cmd), rq);
),
 
@@ -198,7 +199,8 @@ DECLARE_EVENT_CLASS(block_rq,
__entry->bytes = (rq->cmd_type == REQ_TYPE_BLOCK_PC) ?
blk_rq_bytes(rq) : 0;
 
-   blk_fill_rwbs(__entry->rwbs, rq->cmd_flags, blk_rq_bytes(rq));
+   blk_fill_rwbs(__entry->rwbs, req_op(rq), rq->cmd_flags,
+ blk_rq_bytes(rq));
blk_dump_cmd(__get_str(cmd), rq);
memcpy(__entry->comm, current->comm, TASK_COMM_LEN);
),
@@ -272,7 +274,8 @@ TRACE_EVENT(block_bio_bounce,
  

[PATCH 34/45] blkg_rwstat: separate op from flags

2016-06-05 Thread mchristi
From: Mike Christie 

The bio and request operation and flags are going to be separate
definitions, so we cannot pass them in as a bitmap. This patch
converts the blkg_rwstat code and its caller, cfq, to pass in the
values separately.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 block/cfq-iosched.c| 49 +++---
 include/linux/blk-cgroup.h | 13 ++--
 2 files changed, 36 insertions(+), 26 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 3fcc598..3dafdba 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -667,9 +667,10 @@ static inline void cfqg_put(struct cfq_group *cfqg)
 } while (0)
 
 static inline void cfqg_stats_update_io_add(struct cfq_group *cfqg,
-   struct cfq_group *curr_cfqg, int rw)
+   struct cfq_group *curr_cfqg, int op,
+   int op_flags)
 {
-   blkg_rwstat_add(&cfqg->stats.queued, rw, 1);
+   blkg_rwstat_add(&cfqg->stats.queued, op, op_flags, 1);
cfqg_stats_end_empty_time(&cfqg->stats);
cfqg_stats_set_start_group_wait_time(cfqg, curr_cfqg);
 }
@@ -683,26 +684,30 @@ static inline void cfqg_stats_update_timeslice_used(struct cfq_group *cfqg,
 #endif
 }
 
-static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg, int rw)
+static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg, int op,
+  int op_flags)
 {
-   blkg_rwstat_add(&cfqg->stats.queued, rw, -1);
+   blkg_rwstat_add(&cfqg->stats.queued, op, op_flags, -1);
 }
 
-static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg, int rw)
+static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg, int op,
+  int op_flags)
 {
-   blkg_rwstat_add(&cfqg->stats.merged, rw, 1);
+   blkg_rwstat_add(&cfqg->stats.merged, op, op_flags, 1);
 }
 
 static inline void cfqg_stats_update_completion(struct cfq_group *cfqg,
-   uint64_t start_time, uint64_t io_start_time, int rw)
+   uint64_t start_time, uint64_t io_start_time, int op,
+   int op_flags)
 {
struct cfqg_stats *stats = &cfqg->stats;
unsigned long long now = sched_clock();
 
if (time_after64(now, io_start_time))
-   blkg_rwstat_add(&stats->service_time, rw, now - io_start_time);
+   blkg_rwstat_add(&stats->service_time, op, op_flags,
+   now - io_start_time);
if (time_after64(io_start_time, start_time))
-   blkg_rwstat_add(&stats->wait_time, rw,
+   blkg_rwstat_add(&stats->wait_time, op, op_flags,
io_start_time - start_time);
 }
 
@@ -781,13 +786,16 @@ static inline void cfqg_put(struct cfq_group *cfqg) { }
 #define cfq_log_cfqg(cfqd, cfqg, fmt, args...) do {} while (0)
 
 static inline void cfqg_stats_update_io_add(struct cfq_group *cfqg,
-   struct cfq_group *curr_cfqg, int rw) { }
+   struct cfq_group *curr_cfqg, int op, int op_flags) { }
 static inline void cfqg_stats_update_timeslice_used(struct cfq_group *cfqg,
unsigned long time, unsigned long unaccounted_time) { }
-static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg, int rw) { }
-static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg, int rw) { }
+static inline void cfqg_stats_update_io_remove(struct cfq_group *cfqg, int op,
+   int op_flags) { }
+static inline void cfqg_stats_update_io_merged(struct cfq_group *cfqg, int op,
+   int op_flags) { }
 static inline void cfqg_stats_update_completion(struct cfq_group *cfqg,
-   uint64_t start_time, uint64_t io_start_time, int rw) { }
+   uint64_t start_time, uint64_t io_start_time, int op,
+   int op_flags) { }
 
 #endif /* CONFIG_CFQ_GROUP_IOSCHED */
 
@@ -2461,10 +2469,10 @@ static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq)
 {
elv_rb_del(&cfqq->sort_list, rq);
cfqq->queued[rq_is_sync(rq)]--;
-   cfqg_stats_update_io_remove(RQ_CFQG(rq), rq->cmd_flags);
+   cfqg_stats_update_io_remove(RQ_CFQG(rq), req_op(rq), rq->cmd_flags);
cfq_add_rq_rb(rq);
cfqg_stats_update_io_add(RQ_CFQG(rq), cfqq->cfqd->serving_group,
-rq->cmd_flags);
+req_op(rq), rq->cmd_flags);
 }
 
 static struct request *
@@ -2517,7 +2525,7 @@ static void cfq_remove_request(struct request *rq)
cfq_del_rq_rb(rq);
 
cfqq->cfqd->rq_queued--;
-   cfqg_stats_update_io_remove(RQ_CFQG(rq), rq->cmd_flags);
+   cfqg_stats_update_io_remove(RQ_CFQG(rq), req_op(rq), 

[PATCH 36/45] block: convert is_sync helpers to use REQ_OPs.

2016-06-05 Thread mchristi
From: Mike Christie 

This patch converts the is_sync helpers to use separate variables
for the operation and flags.
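The converted helper is small enough to sketch in isolation (simplified constants, assuming only the read op and the REQ_SYNC flag matter for this sketch): a read is always sync, a write only when REQ_SYNC is set.

```c
#include <assert.h>

/* Simplified stand-ins: one op value plus one modifier flag. */
enum req_op { REQ_OP_READ, REQ_OP_WRITE };
#define REQ_SYNC	(1U << 0)

/* Mirrors the converted rw_is_sync(): op and flags arrive as separate
 * arguments instead of being OR'd together into one bitmap. */
int rw_is_sync(enum req_op op, unsigned int rw_flags)
{
	return op == REQ_OP_READ || (rw_flags & REQ_SYNC);
}
```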

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 block/blk-core.c   | 6 +++---
 block/blk-mq.c | 8 
 block/cfq-iosched.c| 2 +-
 include/linux/blkdev.h | 6 +++---
 4 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 1333bb7..f9f4228 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -962,7 +962,7 @@ static void __freed_request(struct request_list *rl, int sync)
 static void freed_request(struct request_list *rl, int op, unsigned int flags)
 {
struct request_queue *q = rl->q;
-   int sync = rw_is_sync(op | flags);
+   int sync = rw_is_sync(op, flags);
 
q->nr_rqs[sync]--;
rl->count[sync]--;
@@ -1075,7 +1075,7 @@ static struct request *__get_request(struct request_list *rl, int op,
struct elevator_type *et = q->elevator->type;
struct io_context *ioc = rq_ioc(bio);
struct io_cq *icq = NULL;
-   const bool is_sync = rw_is_sync(op | op_flags) != 0;
+   const bool is_sync = rw_is_sync(op, op_flags) != 0;
int may_queue;
 
if (unlikely(blk_queue_dying(q)))
@@ -1244,7 +1244,7 @@ static struct request *get_request(struct request_queue *q, int op,
   int op_flags, struct bio *bio,
   gfp_t gfp_mask)
 {
-   const bool is_sync = rw_is_sync(op | op_flags) != 0;
+   const bool is_sync = rw_is_sync(op, op_flags) != 0;
DEFINE_WAIT(wait);
struct request_list *rl;
struct request *rq;
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3393f29..29bcd9c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -204,7 +204,7 @@ static void blk_mq_rq_ctx_init(struct request_queue *q, 
struct blk_mq_ctx *ctx,
rq->end_io_data = NULL;
rq->next_rq = NULL;
 
-   ctx->rq_dispatched[rw_is_sync(op | op_flags)]++;
+   ctx->rq_dispatched[rw_is_sync(op, op_flags)]++;
 }
 
 static struct request *
@@ -1178,7 +1178,7 @@ static struct request *blk_mq_map_request(struct request_queue *q,
ctx = blk_mq_get_ctx(q);
hctx = q->mq_ops->map_queue(q, ctx->cpu);
 
-   if (rw_is_sync(bio->bi_rw))
+   if (rw_is_sync(bio_op(bio), bio->bi_rw))
op_flags |= REQ_SYNC;
 
trace_block_getrq(q, bio, op);
@@ -1246,7 +1246,7 @@ static int blk_mq_direct_issue_request(struct request *rq, blk_qc_t *cookie)
  */
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 {
-   const int is_sync = rw_is_sync(bio->bi_rw);
+   const int is_sync = rw_is_sync(bio_op(bio), bio->bi_rw);
const int is_flush_fua = bio->bi_rw & (REQ_FLUSH | REQ_FUA);
struct blk_map_ctx data;
struct request *rq;
@@ -1343,7 +1343,7 @@ done:
  */
 static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
 {
-   const int is_sync = rw_is_sync(bio->bi_rw);
+   const int is_sync = rw_is_sync(bio_op(bio), bio->bi_rw);
const int is_flush_fua = bio->bi_rw & (REQ_FLUSH | REQ_FUA);
struct blk_plug *plug;
unsigned int request_count = 0;
diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 3dafdba..b115486 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -4311,7 +4311,7 @@ static int cfq_may_queue(struct request_queue *q, int op, int op_flags)
if (!cic)
return ELV_MQUEUE_MAY;
 
-   cfqq = cic_to_cfqq(cic, rw_is_sync(op | op_flags));
+   cfqq = cic_to_cfqq(cic, rw_is_sync(op, op_flags));
if (cfqq) {
cfq_init_prio_data(cfqq, cic);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 25f01ff..4937c05 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -624,14 +624,14 @@ static inline unsigned int blk_queue_cluster(struct request_queue *q)
 /*
  * We regard a request as sync, if either a read or a sync write
  */
-static inline bool rw_is_sync(unsigned int rw_flags)
+static inline bool rw_is_sync(int op, unsigned int rw_flags)
 {
-   return !(rw_flags & REQ_WRITE) || (rw_flags & REQ_SYNC);
+   return op == REQ_OP_READ || (rw_flags & REQ_SYNC);
 }
 
 static inline bool rq_is_sync(struct request *rq)
 {
-   return rw_is_sync(rq->cmd_flags);
+   return rw_is_sync(req_op(rq), rq->cmd_flags);
 }
 
 static inline bool blk_rl_full(struct request_list *rl, bool sync)
-- 
2.7.2



[PATCH 45/45] block, drivers, fs: rename REQ_FLUSH to REQ_PREFLUSH

2016-06-05 Thread mchristi
From: Mike Christie 

To avoid confusion between REQ_OP_FLUSH, which is handled by
request_fn drivers, and upper layers requesting that the block layer
perform a flush sequence along with possibly a WRITE, this patch
renames REQ_FLUSH to REQ_PREFLUSH.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---
 Documentation/block/writeback_cache_control.txt | 22 +++---
 Documentation/device-mapper/log-writes.txt  | 10 +-
 block/blk-core.c| 12 ++--
 block/blk-flush.c   | 16 
 block/blk-mq.c  |  4 ++--
 drivers/block/drbd/drbd_actlog.c|  4 ++--
 drivers/block/drbd/drbd_main.c  |  2 +-
 drivers/block/drbd/drbd_protocol.h  |  2 +-
 drivers/block/drbd/drbd_receiver.c  |  2 +-
 drivers/block/drbd/drbd_req.c   |  2 +-
 drivers/md/bcache/journal.c |  2 +-
 drivers/md/bcache/request.c |  8 
 drivers/md/dm-cache-target.c| 12 ++--
 drivers/md/dm-crypt.c   |  7 ---
 drivers/md/dm-era-target.c  |  4 ++--
 drivers/md/dm-io.c  |  2 +-
 drivers/md/dm-log-writes.c  |  2 +-
 drivers/md/dm-raid1.c   |  5 +++--
 drivers/md/dm-region-hash.c |  4 ++--
 drivers/md/dm-snap.c|  6 +++---
 drivers/md/dm-stripe.c  |  2 +-
 drivers/md/dm-thin.c|  8 
 drivers/md/dm.c | 12 ++--
 drivers/md/linear.c |  2 +-
 drivers/md/md.c |  2 +-
 drivers/md/md.h |  2 +-
 drivers/md/multipath.c  |  2 +-
 drivers/md/raid0.c  |  2 +-
 drivers/md/raid1.c  |  3 ++-
 drivers/md/raid10.c |  2 +-
 drivers/md/raid5-cache.c|  2 +-
 drivers/md/raid5.c  |  2 +-
 fs/btrfs/check-integrity.c  |  8 
 fs/jbd2/journal.c   |  2 +-
 fs/xfs/xfs_buf.c|  2 +-
 include/linux/blk_types.h   |  8 
 include/linux/fs.h  |  4 ++--
 include/trace/events/f2fs.h |  2 +-
 kernel/trace/blktrace.c |  5 +++--
 39 files changed, 102 insertions(+), 98 deletions(-)

diff --git a/Documentation/block/writeback_cache_control.txt b/Documentation/block/writeback_cache_control.txt
index da70bda..8a6bdad 100644
--- a/Documentation/block/writeback_cache_control.txt
+++ b/Documentation/block/writeback_cache_control.txt
@@ -20,11 +20,11 @@ a forced cache flush, and the Force Unit Access (FUA) flag for requests.
 Explicit cache flushes
 --
 
-The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from
+The REQ_PREFLUSH flag can be OR ed into the r/w flags of a bio submitted from
 the filesystem and will make sure the volatile cache of the storage device
 has been flushed before the actual I/O operation is started.  This explicitly
 guarantees that previously completed write requests are on non-volatile
-storage before the flagged bio starts. In addition the REQ_FLUSH flag can be
+storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be
 set on an otherwise empty bio structure, which causes only an explicit cache
 flush without any dependent I/O.  It is recommend to use
 the blkdev_issue_flush() helper for a pure cache flush.
@@ -41,21 +41,21 @@ signaled after the data has been committed to non-volatile storage.
 Implementation details for filesystems
 --
 
-Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to
+Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
 worry if the underlying devices need any explicit cache flushing and how
-the Forced Unit Access is implemented.  The REQ_FLUSH and REQ_FUA flags
+the Forced Unit Access is implemented.  The REQ_PREFLUSH and REQ_FUA flags
 may both be set on a single bio.
 
 
 Implementation details for make_request_fn based block drivers
 --
 
-These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit
+These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
 directly below the submit_bio interface.  For remapping drivers the REQ_FUA
 bits need to be propagated to underlying devices, and a global flush needs
-to be implemented for bios 

[PATCH 42/45] block, fs, drivers: remove REQ_OP compat defs and related code

2016-06-05 Thread mchristi
From: Mike Christie 

This patch drops the compat definition of req_op where it matches
the rq_flag_bits definitions, and drops the related old and compat
code that allowed users to set either the op or flags for the operation.

We then store the operation in the bi_rw/cmd_flags field, similar
to how the bio ioprio used to sit in the upper bits
of that field.
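The packing can be sketched as plain functions (REQ_OP_BITS width assumed for this sketch; the real macros also WARN on an out-of-range op):

```c
#include <assert.h>

#define REQ_OP_BITS	3	/* assumed op width for this sketch */
#define BIO_OP_SHIFT	(8 * sizeof(unsigned int) - REQ_OP_BITS)

/* Model of bio_set_op_attrs()/bio_op(): the op occupies the top
 * REQ_OP_BITS of bi_rw, the modifier flags sit in the bits below. */
unsigned int set_op_attrs(unsigned int bi_rw, unsigned int op,
			  unsigned int op_flags)
{
	bi_rw &= (1U << BIO_OP_SHIFT) - 1;	/* clear the old op */
	bi_rw |= op << BIO_OP_SHIFT;		/* store op in the top bits */
	bi_rw |= op_flags;			/* OR in the request flags */
	return bi_rw;
}

unsigned int get_op(unsigned int bi_rw)
{
	return bi_rw >> BIO_OP_SHIFT;
}
```

The round trip mirrors the BIO_OP_SHIFT macros added in the blk_types.h hunk of this patch.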

Signed-off-by: Mike Christie 
---
 drivers/scsi/sd.c   |  2 +-
 include/linux/bio.h |  3 ---
 include/linux/blk_types.h   | 52 +
 include/linux/blkdev.h  | 14 
 include/linux/fs.h  | 37 +---
 include/trace/events/f2fs.h |  1 -
 6 files changed, 46 insertions(+), 63 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index c8dc221..fad86ad 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1012,7 +1012,7 @@ static int sd_setup_read_write_cmnd(struct scsi_cmnd 
*SCpnt)
} else if (rq_data_dir(rq) == READ) {
SCpnt->cmnd[0] = READ_6;
} else {
-   scmd_printk(KERN_ERR, SCpnt, "Unknown command %d,%llx\n",
+   scmd_printk(KERN_ERR, SCpnt, "Unknown command %llu,%llx\n",
req_op(rq), (unsigned long long) rq->cmd_flags);
goto out;
}
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 35108c2..0bbb2e3 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -44,9 +44,6 @@
 #define BIO_MAX_SIZE   (BIO_MAX_PAGES << PAGE_SHIFT)
 #define BIO_MAX_SECTORS(BIO_MAX_SIZE >> 9)
 
-#define bio_op(bio)(op_from_rq_bits((bio)->bi_rw))
-#define bio_set_op_attrs(bio, op, flags)   ((bio)->bi_rw |= (op | flags))
-
 #define bio_prio(bio)  (bio)->bi_ioprio
 #define bio_set_prio(bio, prio)((bio)->bi_ioprio = prio)
 
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 5efb6f1..23c1ab2 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -48,7 +48,9 @@ struct bio {
struct block_device *bi_bdev;
unsigned intbi_flags;   /* status, command, etc */
int bi_error;
-   unsigned intbi_rw;  /* READ/WRITE */
+   unsigned intbi_rw;  /* bottom bits req flags,
+* top bits REQ_OP
+*/
unsigned short  bi_ioprio;
 
struct bvec_iterbi_iter;
@@ -106,6 +108,16 @@ struct bio {
struct bio_vec  bi_inline_vecs[0];
 };
 
+#define BIO_OP_SHIFT   (8 * sizeof(unsigned int) - REQ_OP_BITS)
+#define bio_op(bio)((bio)->bi_rw >> BIO_OP_SHIFT)
+
+#define bio_set_op_attrs(bio, op, op_flags) do {   \
+   WARN_ON(op >= (1 << REQ_OP_BITS));  \
+   (bio)->bi_rw &= ((1 << BIO_OP_SHIFT) - 1);  \
+   (bio)->bi_rw |= ((unsigned int) (op) << BIO_OP_SHIFT);  \
+   (bio)->bi_rw |= op_flags;   \
+} while (0)
+
 #define BIO_RESET_BYTESoffsetof(struct bio, bi_max_vecs)
 
 /*
@@ -144,7 +156,6 @@ struct bio {
  */
 enum rq_flag_bits {
/* common flags */
-   __REQ_WRITE,/* not set, read. set, write */
__REQ_FAILFAST_DEV, /* no driver retries of device errors */
__REQ_FAILFAST_TRANSPORT, /* no driver retries of transport errors */
__REQ_FAILFAST_DRIVER,  /* no driver retries of driver errors */
@@ -152,9 +163,7 @@ enum rq_flag_bits {
__REQ_SYNC, /* request is sync (sync write or read) */
__REQ_META, /* metadata io request */
__REQ_PRIO, /* boost priority in cfq */
-   __REQ_DISCARD,  /* request to discard sectors */
-   __REQ_SECURE,   /* secure discard (used with __REQ_DISCARD) */
-   __REQ_WRITE_SAME,   /* write same block many times */
+   __REQ_SECURE,   /* secure discard (used with REQ_OP_DISCARD) */
 
__REQ_NOIDLE,   /* don't anticipate more IO after this one */
__REQ_INTEGRITY,/* I/O includes block integrity payload */
@@ -190,28 +199,22 @@ enum rq_flag_bits {
__REQ_NR_BITS,  /* stops here */
 };
 
-#define REQ_WRITE  (1ULL << __REQ_WRITE)
 #define REQ_FAILFAST_DEV   (1ULL << __REQ_FAILFAST_DEV)
 #define REQ_FAILFAST_TRANSPORT (1ULL << __REQ_FAILFAST_TRANSPORT)
 #define REQ_FAILFAST_DRIVER(1ULL << __REQ_FAILFAST_DRIVER)
 #define REQ_SYNC   (1ULL << __REQ_SYNC)
 #define REQ_META   (1ULL << __REQ_META)
 #define REQ_PRIO   (1ULL << __REQ_PRIO)
-#define REQ_DISCARD(1ULL << __REQ_DISCARD)
-#define REQ_WRITE_SAME (1ULL << __REQ_WRITE_SAME)
 #define REQ_NOIDLE   

[PATCH 43/45] block, drivers: add REQ_OP_FLUSH operation

2016-06-05 Thread mchristi
From: Mike Christie 

This adds a REQ_OP_FLUSH operation that is sent to request_fn
based drivers by the block layer's flush code, instead of
sending requests with the request->cmd_flags REQ_FLUSH bit set.

Signed-off-by: Mike Christie 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Hannes Reinecke 
---

v2.

1. Fix kbuild failures. Forgot to update ubd driver.

 Documentation/block/writeback_cache_control.txt | 6 +++---
 arch/um/drivers/ubd_kern.c  | 2 +-
 block/blk-flush.c   | 4 ++--
 drivers/block/loop.c| 4 ++--
 drivers/block/nbd.c | 2 +-
 drivers/block/osdblk.c  | 2 +-
 drivers/block/ps3disk.c | 4 ++--
 drivers/block/skd_main.c| 2 +-
 drivers/block/virtio_blk.c  | 2 +-
 drivers/block/xen-blkfront.c| 8 
 drivers/ide/ide-disk.c  | 2 +-
 drivers/md/dm.c | 2 +-
 drivers/mmc/card/block.c| 6 +++---
 drivers/mmc/card/queue.h| 3 ++-
 drivers/mtd/mtd_blkdevs.c   | 2 +-
 drivers/nvme/host/core.c| 2 +-
 drivers/scsi/sd.c   | 7 +++
 include/linux/blk_types.h   | 3 ++-
 include/linux/blkdev.h  | 3 +++
 kernel/trace/blktrace.c | 5 +
 20 files changed, 40 insertions(+), 31 deletions(-)

diff --git a/Documentation/block/writeback_cache_control.txt b/Documentation/block/writeback_cache_control.txt
index 59e0516..da70bda 100644
--- a/Documentation/block/writeback_cache_control.txt
+++ b/Documentation/block/writeback_cache_control.txt
@@ -73,9 +73,9 @@ doing:
 
blk_queue_write_cache(sdkp->disk->queue, true, false);
 
-and handle empty REQ_FLUSH requests in its prep_fn/request_fn.  Note that
+and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn.  Note that
 REQ_FLUSH requests with a payload are automatically turned into a sequence
-of an empty REQ_FLUSH request followed by the actual write by the block
+of an empty REQ_OP_FLUSH request followed by the actual write by the block
 layer.  For devices that also support the FUA bit the block layer needs
 to be told to pass through the REQ_FUA bit using:
 
@@ -83,4 +83,4 @@ to be told to pass through the REQ_FUA bit using:
 
 and the driver must handle write requests that have the REQ_FUA bit set
 in prep_fn/request_fn.  If the FUA bit is not natively supported the block
-layer turns it into an empty REQ_FLUSH request after the actual write.
+layer turns it into an empty REQ_OP_FLUSH request after the actual write.
diff --git a/arch/um/drivers/ubd_kern.c b/arch/um/drivers/ubd_kern.c
index 17e96dc..ef6b4d9 100644
--- a/arch/um/drivers/ubd_kern.c
+++ b/arch/um/drivers/ubd_kern.c
@@ -1286,7 +1286,7 @@ static void do_ubd_request(struct request_queue *q)
 
req = dev->request;
 
-   if (req->cmd_flags & REQ_FLUSH) {
+   if (req_op(req) == REQ_OP_FLUSH) {
io_req = kmalloc(sizeof(struct io_thread_req),
 GFP_ATOMIC);
if (io_req == NULL) {
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 9fd1f63..21f0d5b 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -29,7 +29,7 @@
  * The actual execution of flush is double buffered.  Whenever a request
  * needs to execute PRE or POSTFLUSH, it queues at
  * fq->flush_queue[fq->flush_pending_idx].  Once certain criteria are met, a
- * flush is issued and the pending_idx is toggled.  When the flush
+ * REQ_OP_FLUSH is issued and the pending_idx is toggled.  When the flush
  * completes, all the requests which were pending are proceeded to the next
  * step.  This allows arbitrary merging of different types of FLUSH/FUA
  * requests.
@@ -330,7 +330,7 @@ static bool blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq)
}
 
flush_rq->cmd_type = REQ_TYPE_FS;
-   flush_rq->cmd_flags = WRITE_FLUSH | REQ_FLUSH_SEQ;
+   req_set_op_attrs(flush_rq, REQ_OP_FLUSH, WRITE_FLUSH | REQ_FLUSH_SEQ);
flush_rq->rq_disk = first_rq->rq_disk;
flush_rq->end_io = flush_end_io;
 
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index b9b737c..364d491 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -542,7 +542,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
pos = ((loff_t) blk_rq_pos(rq) << 9) + lo->lo_offset;
 
if (op_is_write(req_op(rq))) {
-   if (rq->cmd_flags & REQ_FLUSH)
+   if (req_op(rq) == REQ_OP_FLUSH)
ret = lo_req_flush(lo, rq);
else if (req_op(rq) == REQ_OP_DISCARD)
 

Re: Recommended way to use btrfs for production?

2016-06-05 Thread Andrei Borzenkov
On 05.06.2016 19:33, James Johnston wrote:
> On 06/05/2016 10:46 AM, Mladen Milinkovic wrote:
>> On 06/03/2016 04:05 PM, Chris Murphy wrote:
>>> Make certain the kernel command timer value is greater than the driver
>>> error recovery timeout. The former is found in sysfs, per block
>>> device, the latter can be get and set with smartctl. Wrong
>>> configuration is common (it's actually the default) when using
>>> consumer drives, and inevitably leads to problems, even the loss of
>>> the entire array. It really is a terrible default.
>>
>> Since it's the first time I've heard of this, I did some googling.
>>
>> Here's some nice article about these timeouts:
>> http://strugglers.net/~andy/blog/2015/11/09/linux-software-raid-and-drive-timeouts/comment-page-1/
>>
>> And some udev rules that should apply this automatically:
>> http://comments.gmane.org/gmane.linux.raid/48193
> 
> I think the first link there is a good one.  On my system:
> 
> /sys/block/sdX/device/timeout
> 
> defaults to 30 seconds, which is long enough for a drive with a short
> TLER setting but too short for a consumer drive.
> 
> There is a Red Hat link on setting up a udev rule for it here:
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/task_controlling-scsi-command-timer-onlining-devices.html
> 
> I thought it looked a little funny, so I combined the above with one of the
> VMware udev rules pre-installed on my Ubuntu system and came up with this:
> 
> # Update timeout from 180 to one of your choosing:
> ACTION=="add|change", SUBSYSTEMS=="scsi", ATTRS{type}=="0|7|14", \
> RUN+="/bin/sh -c 'echo 180 >/sys$DEVPATH/device/timeout'"
> 

The last line should actually be

ATTR{device/timeout}="100"

to avoid spawning an extra process for every device.
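Putting this together with the rule quoted above, a complete sketch (the rule filename and the 180-second value are arbitrary choices of mine; the script writes to the current directory by default so it can be tried safely, and in real use you would point it at /etc/udev/rules.d/):

```shell
#!/bin/sh
# Sketch: generate a udev rule that sets the SCSI command timer via a
# direct sysfs attribute assignment, so no shell is spawned per device.
# Default output path is the current directory for safe testing; pass
# e.g. /etc/udev/rules.d/60-disk-timeout.rules for real use.
RULES_FILE="${1:-60-disk-timeout.rules}"
cat > "$RULES_FILE" <<'EOF'
ACTION=="add|change", SUBSYSTEMS=="scsi", ATTRS{type}=="0|7|14", ATTR{device/timeout}="180"
EOF
echo "wrote $RULES_FILE"
```

After installing it under /etc/udev/rules.d/, `udevadm control --reload` followed by `udevadm trigger` (or a reboot) applies it to already-attached devices.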
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Recommended way to use btrfs for production?

2016-06-05 Thread James Johnston
On 06/05/2016 10:46 AM, Mladen Milinkovic wrote:
> On 06/03/2016 04:05 PM, Chris Murphy wrote:
> > Make certain the kernel command timer value is greater than the driver
> > error recovery timeout. The former is found in sysfs, per block
> > device, the latter can be get and set with smartctl. Wrong
> > configuration is common (it's actually the default) when using
> > consumer drives, and inevitably leads to problems, even the loss of
> > the entire array. It really is a terrible default.
> 
> Since it's the first time I've heard of this, I did some googling.
> 
> Here's some nice article about these timeouts:
> http://strugglers.net/~andy/blog/2015/11/09/linux-software-raid-and-drive-timeouts/comment-page-1/
> 
> And some udev rules that should apply this automatically:
> http://comments.gmane.org/gmane.linux.raid/48193

I think the first link there is a good one.  On my system:

/sys/block/sdX/device/timeout

defaults to 30 seconds, which is long enough for a drive with a short
TLER setting but too short for a consumer drive.
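To make the relationship concrete: ERC values reported by `smartctl -l scterc` are in tenths of a second, while the sysfs timeout is in seconds. A small sketch of the comparison (the `timer_ok` helper name and the example numbers are mine):

```shell
# Return success when the kernel command timer exceeds the drive's
# error recovery timeout (ERC given in tenths of a second, as smartctl
# reports it).
timer_ok() {
    erc_ds="$1"     # drive ERC, e.g. 70 = 7.0 seconds
    timeout_s="$2"  # kernel command timer in seconds
    [ $(( timeout_s * 10 )) -gt "$erc_ds" ]
}

# TLER drive at 7.0 s vs. the 30 s default: fine.
timer_ok 70 30 && echo "70ds ERC, 30s timer: ok"
# Consumer drive that can retry for ~120 s: the default is too short.
timer_ok 1200 30 || echo "1200ds ERC, 30s timer: too short"
```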

There is a Red Hat link on setting up a udev rule for it here:
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Online_Storage_Reconfiguration_Guide/task_controlling-scsi-command-timer-onlining-devices.html

I thought it looked a little funny, so I combined the above with one of the
VMware udev rules pre-installed on my Ubuntu system and came up with this:

# Update timeout from 180 to one of your choosing:
ACTION=="add|change", SUBSYSTEMS=="scsi", ATTRS{type}=="0|7|14", \
RUN+="/bin/sh -c 'echo 180 >/sys$DEVPATH/device/timeout'"

Now my attached drives automatically get this timeout without any scripting
or manual setting of the timeout.
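If you just want to see what values your drives currently have, something like this works (the function name and the optional root parameter are my own, the latter so the sketch can be exercised against a fake sysfs tree):

```shell
# Print the SCSI command timer for every device under a sysfs root.
# Defaults to /sys/block; pass another directory for testing.
print_timeouts() {
    root="${1:-/sys/block}"
    for dev in "$root"/*; do
        f="$dev/device/timeout"
        if [ -r "$f" ]; then
            printf '%s: %ss\n' "$(basename "$dev")" "$(cat "$f")"
        fi
    done
    return 0
}

print_timeouts
```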

James




Re: btrfs

2016-06-05 Thread Chris Murphy
On Sat, Jun 4, 2016 at 4:43 PM, Christoph Anton Mitterer
 wrote:
> On Sat, 2016-06-04 at 13:13 -0600, Chris Murphy wrote:
>> mdadm supports DDF.
>
> Sure... it also supports IMSM,... so what? Neither of them are the
> default for mdadm, nor does it change the used terminology :)

Why is mdadm the reference point for terminology?

There's actually better consistency in terminology usage outside Linux
because of SNIA and DDF than within Linux where the most basic terms
aren't agreed upon by various upstream maintainers. mdadm and lvm use
different terms even though they're both now using the same md backend
in the kernel.

mdadm chunk = lvm segment = btrfs stripe = ddf strip = ddf stripe
element. Some things have no equivalents, like the Btrfs chunk. When
someone hears chunk they may wonder if it's the same thing as the mdadm
chunk, but it isn't; and Btrfs also uses the term block group for
chunk, because...

So if you want to create a decoder ring for terminology that's great
and would be useful; but just asking everyone in Btrfs land to come up
with Btrfs terminology 2.0 merely adds to the list of inconsistent
term usage; it doesn't actually fix any problems.


-- 
Chris Murphy


Re: RAID1 vs RAID10 and best way to set up 6 disks

2016-06-05 Thread Chris Murphy
On Sat, Jun 4, 2016 at 7:10 PM, Christoph Anton Mitterer
 wrote:

> Well the RAID1 was IMHO still bad choice as it's pretty ambiguous.

That's ridiculous. It isn't incorrect to refer to only 2 copies as
raid1. You have to explicitly ask both mdadm and lvcreate for the
number of copies you want; it doesn't happen automatically. The man
page for mkfs.btrfs is very clear that you only get two copies.

What's ambiguous is raid10 expectations with multiple device failures.


> Well I'd say, for btrfs: do away with the term "RAID" at all, use e.g.:
>
> linear = just a bunch of devices put together, no striping
>  basically what MD's linear is

Except this isn't really how Btrfs single works. The behavioral
difference between mdadm linear and Btrfs single is larger than the
difference between mdadm raid1 and btrfs raid1. So you're proposing
tolerating a bigger difference while criticizing a smaller one. *shrug*



> mirror (or perhaps something like clones) = each device in the fs
> contains a copy of everything (i.e. classic RAID1)


If a metaphor is going to be used for a technical thing, it would be
mirrors or mirroring. Mirror would mean exactly two (the original and
the mirror). See lvcreate --mirrors. Also, the lvm mirror segment type
is legacy, having been replaced with raid1 (man lvcreate uses the term
raid1, not RAID1 or RAID-1). So I'm not a big fan of this term.


> striped = basically what RAID0 is

lvcreate uses only striped, not raid0. mdadm uses only RAID0, not
striped. Since striping is also employed by RAIDs 4, 5, 6, and 7, the
term is somewhat ambiguous about whether parity is involved, though
without further qualification it is usually taken to mean non-parity
striping. That ambiguity is probably less of a problem than the
contradiction that is RAID0.



> replicaN = N replicas of each chunk on distinct devices
> -replicaN = N replicas of each chunk NOT necessarily on
>   distinct devices

This is kinda interesting. At least it's a new term, so all the new
rules can be stuffed into it, which helps distinguish it from other
implementations; not entirely different from how ZFS does this with
raidz.



> parityN = n parity chunks i.e. parity1 ~= RAID5, parity2 ~= RAID6
> or perhaps better: striped-parityN or striped+parityN ??

It's not easy, is it?


>
> And just mention in the manpage, which of these names comes closest to
> what people understand by RAID level i.

It already does this. What version of btrfs-progs are you basing your
criticism on that there's some inconsistency, deficiency, or ambiguity
when it comes to these raid levels? The one that's unequivocally
problematic on its own, without reading the man page, is raid10. The
historic understanding is that it's a stripe of mirrors, which suggests
you can lose a mirror of each stripe, i.e. multiple disks, and not lose
data; that is not true for Btrfs raid10. But the man page makes it
clear: you have 2 copies for redundancy, that's it.





>
>
>>
>> The reason I say "naively" is that there is little to stop you from
>> creating a 2-device "raid1" using two partitions on the same
>> physical
>> device. This is especially difficult to detect if you add
>> abstraction
>> layers (lvm, dm-crypt, etc). This same problem does apply to mdadm
>> however.
> Sure... I think software should try to prevent people from doing stupid
> things, but not by all means ;-)
> If one makes n partitions on the same device and puts a RAID on that,
> one probably doesn't deserve it any better ;-)
>
> I'd guess it's probably doable to detect such stupidness for e.g.
> partitions and dm-crypt (because these are linearly on one device)...
> but for lvm/MD it really depends on the actual block allocation/layout,
> whether it's safe or not.
> Maybe the tools could detect *if* lvm/MD is in between and just give a
> general warning what that could mean.

On the CLI? Not worth it. If the user is that ignorant, too bad, use a
GUI program to help build the storage stack from scratch. I'm really
not sympathetic if a user creates a raid1 from two partitions of the
same block device anymore than if it's ultimately the same physical
device managed by a device mapper variant.

Anyway, I think there's a whole separate github discussion on Btrfs
UI/UX that presumably also includes terminology concerns like this.


-- 
Chris Murphy


Re: Recommended way to use btrfs for production?

2016-06-05 Thread Mladen Milinkovic
On 06/03/2016 04:05 PM, Chris Murphy wrote:
> Make certain the kernel command timer value is greater than the driver
> error recovery timeout. The former is found in sysfs, per block
> device, the latter can be get and set with smartctl. Wrong
> configuration is common (it's actually the default) when using
> consumer drives, and inevitably leads to problems, even the loss of
> the entire array. It really is a terrible default.

Since it's the first time I've heard of this, I did some googling.

Here's some nice article about these timeouts:
http://strugglers.net/~andy/blog/2015/11/09/linux-software-raid-and-drive-timeouts/comment-page-1/

And some udev rules that should apply this automatically:
http://comments.gmane.org/gmane.linux.raid/48193

Cheers

-- 
Mladen Milinkovic
GPG: EF9D9B26
