Re: [PATCH V2 10/11] block-throttle: add a simple idle detection

2016-09-15 Thread kbuild test robot
Hi Shaohua, [auto build test ERROR on block/for-next] [also build test ERROR on v4.8-rc6 next-20160915] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] [Suggest to use git(>=2.9.0) format-patch --base= (or --base=auto for convenience) to rec

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Paolo Bonzini
On 15/09/2016 17:23, Alex Bligh wrote: > Paolo, > >> On 15 Sep 2016, at 15:07, Paolo Bonzini wrote: >> >> I don't think QEMU forbids multiple clients to the single server, and >> guarantees consistency as long as there is no overlap between writes and >> reads. These are

Counting I/O errors in /sys/block//stat

2016-09-15 Thread Jim Shankland
[Resent without html] Dear colleagues, The attached patch keeps a count of block device I/O errors -- any error event that generates a klog message in blk_update_request -- and reports the count as a 12th field in /sys/block//stat. That allows, e.g., monitoring systems to detect and count block
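
As a rough illustration of how a monitoring system might consume such a counter, the sketch below parses one line of a block device stat file. The first 11 field names follow the kernel's Documentation/block/stat.txt of that era; the 12th name, `io_errors`, is a hypothetical label for the field the patch proposes and is not taken from the patch itself.

```python
# Names of the 11 fields documented in Documentation/block/stat.txt.
STAT_FIELDS = [
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
]

# Hypothetical name for the proposed 12th field (assumption, not from the patch).
ERROR_FIELD = "io_errors"


def parse_block_stat(line):
    """Parse one whitespace-separated stat line into a dict of integers.

    The error counter is None when the kernel does not report a 12th field.
    """
    values = [int(tok) for tok in line.split()]
    if len(values) < len(STAT_FIELDS):
        raise ValueError("expected at least %d fields, got %d"
                         % (len(STAT_FIELDS), len(values)))
    stats = dict(zip(STAT_FIELDS, values))
    stats[ERROR_FIELD] = values[11] if len(values) > 11 else None
    return stats


def new_errors(prev, curr):
    """Return how many errors were counted between two samples, or None
    if either sample lacks the error field."""
    if prev[ERROR_FIELD] is None or curr[ERROR_FIELD] is None:
        return None
    return curr[ERROR_FIELD] - prev[ERROR_FIELD]
```

A monitor would read the stat file periodically, call `parse_block_stat()` on its contents, and alert when `new_errors()` between two samples is greater than zero.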

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Eric Blake
On 09/15/2016 11:27 AM, Wouter Verhelst wrote: > On Thu, Sep 15, 2016 at 05:08:21PM +0100, Alex Bligh wrote: >> Wouter, >> >>> The server can always refuse to allow multiple connections. >> >> Sure, but it would be neater to warn the client of that at negotiation >> stage (it would only be one

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Alex Bligh
Wouter, > On 15 Sep 2016, at 17:27, Wouter Verhelst wrote: > > On Thu, Sep 15, 2016 at 05:08:21PM +0100, Alex Bligh wrote: >> Wouter, >> >>> The server can always refuse to allow multiple connections. >> >> Sure, but it would be neater to warn the client of that at negotiation

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Wouter Verhelst
On Thu, Sep 15, 2016 at 05:08:21PM +0100, Alex Bligh wrote: > Wouter, > > > The server can always refuse to allow multiple connections. > > Sure, but it would be neater to warn the client of that at negotiation > stage (it would only be one flag, e.g. 'multiple connections > unsafe'). I

[PATCH V2 02/11] block-throttle: add .high interface

2016-09-15 Thread Shaohua Li
Add high limit for cgroup and corresponding cgroup interface. Signed-off-by: Shaohua Li --- block/blk-throttle.c | 139 +++ 1 file changed, 107 insertions(+), 32 deletions(-) diff --git a/block/blk-throttle.c b/block/blk-throttle.c

[PATCH V2 10/11] block-throttle: add a simple idle detection

2016-09-15 Thread Shaohua Li
A cgroup gets assigned a high limit, but the cgroup could never dispatch enough IO to cross the high limit. In such a case, the queue state machine will remain in LIMIT_HIGH state and all other cgroups will be throttled according to the high limit. This is unfair for other cgroups. We should treat the

[PATCH V2 04/11] block-throttle: add upgrade logic for LIMIT_HIGH state

2016-09-15 Thread Shaohua Li
When the queue is in LIMIT_HIGH state and all cgroups with a high limit cross the bps/iops limitation, we will upgrade the queue's state to LIMIT_MAX. For a cgroup hierarchy, there are two cases. Children have a lower high limit than the parent. The parent's high limit is meaningless. If children's bps/iops cross high

[PATCH V2 11/11] blk-throttle: ignore idle cgroup limit

2016-09-15 Thread Shaohua Li
The last patch introduces a way to detect an idle cgroup. We use it to make upgrade/downgrade decisions. Signed-off-by: Shaohua Li --- block/blk-throttle.c | 30 ++ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/block/blk-throttle.c

[PATCH V2 08/11] blk-throttle: detect completed idle cgroup

2016-09-15 Thread Shaohua Li
A cgroup could be assigned a limit but not dispatch enough IO, e.g. because the cgroup is idle. When this happens, the cgroup doesn't hit its limit, so we can't move the state machine to a higher level and all cgroups will be throttled to their lower limit, so we waste bandwidth. Detecting idle cgroup is

[PATCH V2 09/11] block-throttle: make bandwidth change smooth

2016-09-15 Thread Shaohua Li
When cgroups all reach high limit, cgroups can dispatch more IO. This could make some cgroups dispatch more IO but others not, and even some cgroups could dispatch less IO than their high limit. For example, cg1 high limit 10MB/s, cg2 limit 80MB/s, assume disk maximum bandwidth is 120M/s for the

[PATCH V2 00/11] block-throttle: add .high limit

2016-09-15 Thread Shaohua Li
Hi, The background is we don't have an ioscheduler for blk-mq yet, so we can't prioritize processes/cgroups. This patch set tries to add basic arbitration between cgroups with blk-throttle. It adds a new limit io.high for blk-throttle. It's only for cgroup2. io.max is a hard limit throttling.

[PATCH V2 07/11] blk-throttle: make throtl_slice tunable

2016-09-15 Thread Shaohua Li
throtl_slice is important for blk-throttling. A lot of things depend on it, for example, throughput measurement. It has a default value of 100ms, which is not appropriate for all disks. For example, for SSD we might use a smaller value to make the throughput smoother. This patch makes it tunable.

[PATCH V2 06/11] blk-throttle: make sure expire time isn't too big

2016-09-15 Thread Shaohua Li
cgroup could be throttled to a limit but when all cgroups cross high limit, queue enters a higher state and so the group should be throttled to a higher limit. It's possible the cgroup is sleeping because of throttle and other cgroups don't dispatch IO any more. In this case, nobody can trigger

[GIT PULL] Block fixes for 4.8-rc

2016-09-15 Thread Jens Axboe
Hi Linus, A set of fixes for the current series in the realm of block. Like the previous pull request, the meat of it are fixes for the nvme fabrics/target code. Outside of that, just one fix from Gabriel for not doing a queue suspend if we didn't get the admin queue setup in the first place.

Re: [PATCH 13/13] blk-mq: get rid of the cpumask in struct blk_mq_tags

2016-09-15 Thread Christoph Hellwig
> +static int blk_mq_create_mq_map(struct blk_mq_tag_set *set, > + const struct cpumask *affinity_mask) > { > + int queue = -1, cpu = 0; > + > + set->mq_map = kzalloc_node(sizeof(*set->mq_map) * nr_cpu_ids, > + GFP_KERNEL, set->numa_node); > + if

Re: blk-mq: allow passing in an external queue mapping V3

2016-09-15 Thread Christoph Hellwig
On Thu, Sep 15, 2016 at 08:34:42AM -0600, Jens Axboe wrote: > I was going to ask about splitting it, but that looks fine, I can pull > that in. > > The series looks fine to me. My only real concern is giving drivers the > flexibility to define mappings, I don't want that to evolve into drivers >

Re: blk-mq: allow passing in an external queue mapping V3

2016-09-15 Thread Christoph Hellwig
Thanks for all the testing and the review Keith, as well as the fixes earlier. Jens, what do you think of the series? Thomas has added the first 5 patches to https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/log/?h=irq/for-block so it would be great if we could pull that into a block

Re: blk-mq: allow passing in an external queue mapping V3

2016-09-15 Thread Keith Busch
On Wed, Sep 14, 2016 at 04:18:46PM +0200, Christoph Hellwig wrote: > This series is the remainder of the earlier "automatic interrupt affinity for > MSI/MSI-X capable devices" series, and makes use of the new irq-level > interrupt / queue mapping code in blk-mq, as well as allowing the driver > to

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Josef Bacik
On 09/15/2016 09:17 AM, Wouter Verhelst wrote: On Thu, Sep 15, 2016 at 01:44:29PM +0100, Alex Bligh wrote: On 15 Sep 2016, at 13:41, Christoph Hellwig wrote: On Thu, Sep 15, 2016 at 01:39:11PM +0100, Alex Bligh wrote: That's probably right in the case of file-based back

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Wouter Verhelst
On Thu, Sep 15, 2016 at 01:44:29PM +0100, Alex Bligh wrote: > > > On 15 Sep 2016, at 13:41, Christoph Hellwig wrote: > > > > On Thu, Sep 15, 2016 at 01:39:11PM +0100, Alex Bligh wrote: > >> That's probably right in the case of file-based back ends that > >> are running on a

Re: [PATCH v2 2/6] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests

2016-09-15 Thread Mike Snitzer
On Thu, Sep 15 2016 at 2:14am -0400, Hannes Reinecke wrote: > On 09/14/2016 06:29 PM, Mike Snitzer wrote: > > Otherwise blk-mq will immediately dispatch requests that are requeued > > via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq. > > > > Delayed requeue is

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Alex Bligh
> On 15 Sep 2016, at 13:41, Christoph Hellwig wrote: > > On Thu, Sep 15, 2016 at 01:39:11PM +0100, Alex Bligh wrote: >> That's probably right in the case of file-based back ends that >> are running on a Linux OS. But gonbdserver for instance supports >> (e.g.) Ceph based

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Christoph Hellwig
On Thu, Sep 15, 2016 at 01:39:11PM +0100, Alex Bligh wrote: > That's probably right in the case of file-based back ends that > are running on a Linux OS. But gonbdserver for instance supports > (e.g.) Ceph based backends, where each connection might be talking > to a completely separate ceph node,

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Alex Bligh
> On 15 Sep 2016, at 13:36, Christoph Hellwig wrote: > > On Thu, Sep 15, 2016 at 01:33:20PM +0100, Alex Bligh wrote: >> At an implementation level that is going to be a little difficult >> for some NBD servers, e.g. ones that fork() a different process per >> connection.

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Alex Bligh
> On 15 Sep 2016, at 13:23, Christoph Hellwig wrote: > > On Thu, Sep 15, 2016 at 02:21:20PM +0200, Wouter Verhelst wrote: >> Right. So do I understand you correctly that blk-mq currently doesn't >> look at multiple queues, and just assumes that if a FLUSH is sent over >> any

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Alex Bligh
> On 15 Sep 2016, at 13:18, Christoph Hellwig wrote: > > Yes, please do that. A "barrier" implies draining of the queue. Done -- Alex Bligh

Re: [PATCHv3 29/41] ext4: make ext4_mpage_readpages() hugepage-aware

2016-09-15 Thread Andreas Dilger
On Sep 15, 2016, at 5:55 AM, Kirill A. Shutemov wrote: > > This patch modifies ext4_mpage_readpages() to deal with huge pages. > > We read out 2M at once, so we have to alloc (HPAGE_PMD_NR * > blocks_per_page) sector_t for that. I'm not entirely happy with

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Wouter Verhelst
On Thu, Sep 15, 2016 at 05:20:08AM -0700, Christoph Hellwig wrote: > On Thu, Sep 15, 2016 at 02:01:59PM +0200, Wouter Verhelst wrote: > > Yes. There was some discussion on that part, and we decided that setting > > the flag doesn't hurt, but the spec also clarifies that using it on READ > > does

[PATCHv3 04/41] radix-tree: Add radix_tree_split

2016-09-15 Thread Kirill A. Shutemov
From: Matthew Wilcox This new function splits a larger multiorder entry into smaller entries (potentially multi-order entries). These entries are initialised to RADIX_TREE_RETRY to ensure that RCU walkers who see this state aren't confused. The caller should then call

[PATCHv3 03/41] radix-tree: Add radix_tree_join

2016-09-15 Thread Kirill A. Shutemov
From: Matthew Wilcox This new function allows for the replacement of many smaller entries in the radix tree with one larger multiorder entry. From the point of view of an RCU walker, they may see a mixture of the smaller entries and the large entry during the same walk,

[PATCHv3 07/41] mm, shmem: swich huge tmpfs to multi-order radix-tree entries

2016-09-15 Thread Kirill A. Shutemov
We would need to use multi-order radix-tree entries for ext4 and other filesystems to have a coherent view on tags (dirty/towrite) in the tree. This patch converts the huge tmpfs implementation to multi-order entries, so we will be able to use the same code path for all filesystems. Signed-off-by:

[PATCHv3 09/41] page-flags: relax page flag policy for few flags

2016-09-15 Thread Kirill A. Shutemov
These flags are in use for filesystems with backing storage: PG_error, PG_writeback and PG_readahead. Signed-off-by: Kirill A. Shutemov --- include/linux/page-flags.h | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git

[PATCHv3 10/41] mm, rmap: account file thp pages

2016-09-15 Thread Kirill A. Shutemov
Let's add FileHugePages and FilePmdMapped fields into meminfo and smaps. It indicates how many times we allocate and map file THP. Signed-off-by: Kirill A. Shutemov --- drivers/base/node.c| 6 ++ fs/proc/meminfo.c | 4 fs/proc/task_mmu.c

[PATCHv3 17/41] filemap: handle huge pages in filemap_fdatawait_range()

2016-09-15 Thread Kirill A. Shutemov
We write back a whole huge page at a time. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 5 + 1 file changed, 5 insertions(+) diff --git a/mm/filemap.c b/mm/filemap.c index 05b42d3e5ed8..53da93156e60 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -372,9

[PATCHv3 26/41] truncate: make truncate_inode_pages_range() aware about huge pages

2016-09-15 Thread Kirill A. Shutemov
As with shmem_undo_range(), truncate_inode_pages_range() removes huge pages if they are fully within the range. A partial truncate of a huge page zeroes out that part of the THP. Unlike with shmem, this doesn't prevent us from having holes in the middle of a huge page: we can still skip writeback of untouched buffers. With

[PATCHv3 35/41] ext4: make ext4_da_page_release_reservation() aware about huge pages

2016-09-15 Thread Kirill A. Shutemov
For huge pages 'stop' must be within HPAGE_PMD_SIZE. Let's use hpage_size() in the BUG_ON(). We also need to change how we calculate lblk for cluster deallocation. Signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 5 +++-- 1 file changed, 3 insertions(+),

[PATCHv3 32/41] ext4: handle huge pages in __ext4_block_zero_page_range()

2016-09-15 Thread Kirill A. Shutemov
As the function handles zeroing a range only within one block, the required changes are trivial: just remove the assumption on page size. Signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git

[PATCHv3 33/41] ext4: make ext4_block_write_begin() aware about huge pages

2016-09-15 Thread Kirill A. Shutemov
It simply matches changes to __block_write_begin_int(). Signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 24 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index

[PATCHv3 36/41] ext4: handle writeback with huge pages

2016-09-15 Thread Kirill A. Shutemov
Modify mpage_map_and_submit_buffers() and mpage_release_unused_pages() to deal with huge pages. Mostly the result of trial and error. A critical view would be appreciated. Signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 61

[PATCHv3 29/41] ext4: make ext4_mpage_readpages() hugepage-aware

2016-09-15 Thread Kirill A. Shutemov
This patch modifies ext4_mpage_readpages() to deal with huge pages. We read out 2M at once, so we have to alloc (HPAGE_PMD_NR * blocks_per_page) sector_t for that. I'm not entirely happy with kmalloc in this codepath, but don't see any other option. Signed-off-by: Kirill A. Shutemov
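
To give a feel for the size of that allocation, here is a small back-of-the-envelope sketch. It assumes x86-64 values that are not stated in the message (4 KiB base pages, 2 MiB huge pages, an 8-byte `sector_t`), and the helper name is made up for illustration.

```python
def sector_array_size(huge_page_size, page_size, block_size, sizeof_sector_t=8):
    """Return (entries, bytes) for an array of HPAGE_PMD_NR * blocks_per_page
    sector_t values.

    All sizes are in bytes. sizeof_sector_t=8 is an assumption (64-bit sector_t).
    """
    hpage_pmd_nr = huge_page_size // page_size    # base pages per huge page
    blocks_per_page = page_size // block_size     # filesystem blocks per base page
    entries = hpage_pmd_nr * blocks_per_page
    return entries, entries * sizeof_sector_t


KIB = 1024
MIB = 1024 * KIB

# 4 KiB filesystem blocks: one block per base page.
entries_4k, bytes_4k = sector_array_size(2 * MIB, 4 * KIB, 4 * KIB)

# 1 KiB filesystem blocks: four blocks per base page.
entries_1k, bytes_1k = sector_array_size(2 * MIB, 4 * KIB, 1 * KIB)
```

Under these assumptions the array holds 512 entries (4 KiB) for 4 KiB blocks and 2048 entries (16 KiB) for 1 KiB blocks, which suggests why a dynamic allocation is being discussed for this code path.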

[PATCHv3 11/41] thp: try to free page's buffers before attempt split

2016-09-15 Thread Kirill A. Shutemov
We want the page to be isolated from the rest of the system before splitting it. We rely on page count to be 2 for file pages to make sure nobody uses the page: one pin to caller, one to radix-tree. Filesystems with backing storage can have page count increased if it has buffers. Let's try to free

[PATCHv3 24/41] fs: make block_write_{begin,end}() be able to handle huge pages

2016-09-15 Thread Kirill A. Shutemov
It's more or less straightforward. Most changes are around getting the offset/len within the page right and zeroing out the desired part of the page. Signed-off-by: Kirill A. Shutemov --- fs/buffer.c | 53 +++-- 1 file

[PATCHv3 15/41] filemap: handle huge pages in do_generic_file_read()

2016-09-15 Thread Kirill A. Shutemov
Most of the work happens on the head page. Only when we need to copy data to userspace do we find the relevant subpage. We are still limited by PAGE_SIZE per iteration. Lifting this limitation would require some more work. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Wouter Verhelst
On Thu, Sep 15, 2016 at 04:38:07AM -0700, Christoph Hellwig wrote: > On Thu, Sep 15, 2016 at 12:49:35PM +0200, Wouter Verhelst wrote: > > A while back, we spent quite some time defining the semantics of the > > various commands in the face of the NBD_CMD_FLUSH and NBD_CMD_FLAG_FUA > > write

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Christoph Hellwig
On Thu, Sep 15, 2016 at 01:55:14PM +0200, Wouter Verhelst wrote: > Maybe I'm not using the correct terminology here. The point is that > after a FLUSH, the server asserts that all write commands *for which a > reply has already been sent to the client* will also have reached > permanent storage.
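
The FLUSH guarantee quoted above can be sketched with a toy model. This is not NBD server code: the class, its method names, and the split into "volatile" and "durable" dictionaries are all assumptions made for illustration. The point is only that a flush covers writes whose reply was already sent, and promises nothing about writes still in flight.

```python
class ToyServer:
    """Toy model of a write-back cache in front of permanent storage."""

    def __init__(self):
        self.in_flight = {}   # handle -> (offset, data): received, no reply sent yet
        self.volatile = {}    # offset -> data: replied to, but only in cache
        self.durable = {}     # offset -> data: on permanent storage

    def submit_write(self, handle, offset, data):
        """The server has received a write but not yet replied to it."""
        self.in_flight[handle] = (offset, data)

    def reply_write(self, handle):
        """The server sends the reply; the data now sits in the volatile cache."""
        offset, data = self.in_flight.pop(handle)
        self.volatile[offset] = data

    def flush(self):
        """Make every write that was already replied to durable.

        Writes still in flight are outside the guarantee; this model leaves
        them untouched, though a real server would be free to persist them too.
        """
        self.durable.update(self.volatile)
        self.volatile.clear()


server = ToyServer()
server.submit_write(handle=1, offset=0, data=b"A")
server.reply_write(handle=1)                         # reply sent before the flush
server.submit_write(handle=2, offset=4096, data=b"B")  # no reply yet
server.flush()
```

After the flush, the write at offset 0 is durable while the write at offset 4096 is not covered, because no reply for it had been sent when the flush was processed.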

[PATCHv3 14/41] filemap: allocate huge page in page_cache_read(), if allowed

2016-09-15 Thread Kirill A. Shutemov
This patch adds basic functionality to put huge page into page cache. At the moment we only put huge pages into radix-tree if the range covered by the huge page is empty. We ignore shadow entries for now, just remove them from the tree before inserting huge page. Later we can add logic to

[PATCHv3 41/41] ext4, vfs: add huge= mount option

2016-09-15 Thread Kirill A. Shutemov
The same four values as in tmpfs case. Encryption code is not yet ready to handle huge page, so we disable huge pages support if the inode has EXT4_INODE_ENCRYPT. Signed-off-by: Kirill A. Shutemov --- fs/ext4/ext4.h | 5 + fs/ext4/inode.c | 26

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Wouter Verhelst
On Thu, Sep 15, 2016 at 04:52:17AM -0700, Christoph Hellwig wrote: > On Thu, Sep 15, 2016 at 12:46:07PM +0100, Alex Bligh wrote: > > Essentially NBD does support FLUSH/FUA like this: > > > > https://www.kernel.org/doc/Documentation/block/writeback_cache_control.txt > > > > IE supports the same

[PATCHv3 38/41] ext4: fix SEEK_DATA/SEEK_HOLE for huge pages

2016-09-15 Thread Kirill A. Shutemov
ext4_find_unwritten_pgoff() needs a few tweaks to work with huge pages. Mostly trivial page_mapping()/page_to_pgoff() and adjustment to how we find the relevant block. Signed-off-by: Kirill A. Shutemov --- fs/ext4/file.c | 18 ++ 1 file changed, 14

[PATCHv3 25/41] fs: make block_page_mkwrite() aware about huge pages

2016-09-15 Thread Kirill A. Shutemov
Adjust check on whether part of the page beyond file size and apply compound_head() and page_mapping() where appropriate. Signed-off-by: Kirill A. Shutemov --- fs/buffer.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/buffer.c

[PATCHv3 12/41] thp: handle write-protection faults for file THP

2016-09-15 Thread Kirill A. Shutemov
For filesystems that want to be write-notified (have mkwrite), we will encounter write-protection faults for huge PMDs in shared mappings. The easiest way to handle them is to clear the PMD and let it refault as writable. Signed-off-by: Kirill A. Shutemov ---

[PATCHv3 34/41] ext4: handle huge pages in ext4_da_write_end()

2016-09-15 Thread Kirill A. Shutemov
Call ext4_da_should_update_i_disksize() for head page with offset relative to head page. Signed-off-by: Kirill A. Shutemov --- fs/ext4/inode.c | 7 +++ 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index

[PATCHv3 22/41] thp: do not threat slab pages as huge in hpage_{nr_pages,size,mask}

2016-09-15 Thread Kirill A. Shutemov
Slab pages can be compound, but we shouldn't treat them as THP for the purpose of hpage_* helpers, otherwise it would lead to confusing results. For instance, ext4 uses slab pages for journal pages and we shouldn't confuse them with THPs. The easiest way is to exclude them in hpage_* helpers.

[PATCHv3 16/41] filemap: allocate huge page in pagecache_get_page(), if allowed

2016-09-15 Thread Kirill A. Shutemov
The write path allocates pages using pagecache_get_page(). We should be able to allocate huge pages there, if it's allowed. As usual, fall back to small pages on failure. Signed-off-by: Kirill A. Shutemov --- mm/filemap.c | 18 -- 1 file changed, 16

[PATCHv3 05/41] radix-tree: Add radix_tree_split_preload()

2016-09-15 Thread Kirill A. Shutemov
From: Matthew Wilcox Calculate how many nodes we need to allocate to split an old_order entry into multiple entries, each of size new_order. The test suite checks that we allocated exactly the right number of nodes; neither too many (checked by rtp->nr == 0), nor too few

[PATCHv3 08/41] Revert "radix-tree: implement radix_tree_maybe_preload_order()"

2016-09-15 Thread Kirill A. Shutemov
This reverts commit 356e1c23292a4f63cfdf1daf0e0ddada51f32de8. After conversion of huge tmpfs to multi-order entries, we don't need this anymore. Signed-off-by: Kirill A. Shutemov --- include/linux/radix-tree.h | 1 - lib/radix-tree.c | 74

[PATCHv3 06/41] radix-tree: Handle multiorder entries being deleted by replace_clear_tags

2016-09-15 Thread Kirill A. Shutemov
From: Matthew Wilcox radix_tree_replace_clear_tags() can be called with NULL as the replacement value; in this case we need to delete sibling entries which point to the slot. Signed-off-by: Matthew Wilcox Signed-off-by: Kirill A. Shutemov

[PATCHv3 27/41] truncate: make invalidate_inode_pages2_range() aware about huge pages

2016-09-15 Thread Kirill A. Shutemov
For huge pages we need to unmap whole range covered by the huge page. Signed-off-by: Kirill A. Shutemov --- mm/truncate.c | 27 +++ 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/mm/truncate.c b/mm/truncate.c index

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Christoph Hellwig
On Thu, Sep 15, 2016 at 12:43:35PM +0100, Alex Bligh wrote: > Sure, it's at: > > https://github.com/yoe/nbd/blob/master/doc/proto.md#ordering-of-messages-and-writes > > and that link takes you to the specific section. > > The treatment of FLUSH and FUA is meant to mirror exactly the > linux

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Alex Bligh
> On 15 Sep 2016, at 12:40, Christoph Hellwig wrote: > > On Thu, Sep 15, 2016 at 01:29:36PM +0200, Wouter Verhelst wrote: >> Yes, and that is why I was asking about this. If the write barriers >> are expected to be shared across connections, we have a problem. If, >>

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Alex Bligh
Christoph, > On 15 Sep 2016, at 12:38, Christoph Hellwig wrote: > > On Thu, Sep 15, 2016 at 12:49:35PM +0200, Wouter Verhelst wrote: >> A while back, we spent quite some time defining the semantics of the >> various commands in the face of the NBD_CMD_FLUSH and

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Christoph Hellwig
On Thu, Sep 15, 2016 at 01:29:36PM +0200, Wouter Verhelst wrote: > Yes, and that is why I was asking about this. If the write barriers > are expected to be shared across connections, we have a problem. If, > however, they are not, then it doesn't matter that the commands may be > processed out of

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Christoph Hellwig
On Thu, Sep 15, 2016 at 12:09:28PM +0100, Alex Bligh wrote: > A more general point is that with multiple queues requests > may be processed in a different order even by those servers that > currently process the requests in strict order, or in something > similar to strict order. The server is

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Wouter Verhelst
On Thu, Sep 15, 2016 at 12:09:28PM +0100, Alex Bligh wrote: > Wouter, Josef, (& Eric) > > > On 15 Sep 2016, at 11:49, Wouter Verhelst wrote: > > > > Hi, > > > > On Fri, Sep 09, 2016 at 10:02:03PM +0200, Wouter Verhelst wrote: > >> I see some practical problems with this: > >

Re: [Nbd] [RESEND][PATCH 0/5] nbd improvements

2016-09-15 Thread Wouter Verhelst
Hi, On Fri, Sep 09, 2016 at 10:02:03PM +0200, Wouter Verhelst wrote: > I see some practical problems with this: [...] One more that I didn't think about earlier: A while back, we spent quite some time defining the semantics of the various commands in the face of the NBD_CMD_FLUSH and

Re: [bug report] fs/block_dev.c: add bdev_read_page() and bdev_write_page()

2016-09-15 Thread Dan Carpenter
Let's try reporting this again to new email addresses... Btw, belated thanks for creating a linux-block mailing list Jens. :) regards, dan carpenter On Thu, Aug 04, 2016 at 05:02:06PM +0300, Dan Carpenter wrote: > Hello Matthew Wilcox, > > The patch 47a191fd38eb: "fs/block_dev.c: add

Re: [PATCH v2 4/6] dm rq: introduce dm_mq_kick_requeue_list()

2016-09-15 Thread Hannes Reinecke
On 09/14/2016 06:29 PM, Mike Snitzer wrote: > Make it possible for a request-based target to kick the DM device's > blk-mq request_queue's requeue_list. > > Signed-off-by: Mike Snitzer > --- > drivers/md/dm-rq.c | 17 + > drivers/md/dm-rq.h | 2 ++ > 2 files

Re: [PATCH v2 2/6] dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requests

2016-09-15 Thread Hannes Reinecke
On 09/14/2016 06:29 PM, Mike Snitzer wrote: > Otherwise blk-mq will immediately dispatch requests that are requeued > via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq. > > Delayed requeue is implemented using blk_mq_delay_kick_requeue_list() > with a delay of 5 secs. In the context of