[PATCH 22/30] drbd: introduce WRITE_SAME support

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg We will support WRITE_SAME, if * all peers support WRITE_SAME (both in kernel and DRBD version), * all peer devices support WRITE_SAME * logical_block_size is identical on all peers. We may at some point introduce a fallback on the receiving side for devices/kernels that
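
The three conditions listed in this snippet can be read as a single predicate. The following is only a user-space sketch, not DRBD code; the `Peer` class, its field names, and `write_same_allowed` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    # Hypothetical fields, one per condition named in the snippet.
    protocol_supports_write_same: bool  # kernel and DRBD version on the peer
    device_supports_write_same: bool    # the peer's backing device
    logical_block_size: int             # in bytes


def write_same_allowed(local_block_size, peers):
    """True only if every peer meets all three conditions."""
    return all(
        p.protocol_supports_write_same
        and p.device_supports_write_same
        and p.logical_block_size == local_block_size
        for p in peers
    )
```

With an empty peer list the predicate is trivially true; whether that matches the real behavior is not stated in the snippet.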

Re: [RFC 11/18] limits: track and present RLIMIT_NPROC actual max

2016-06-13 Thread Jann Horn
On Mon, Jun 13, 2016 at 10:44:18PM +0300, Topi Miettinen wrote: > Track maximum number of processes per user and present it > in /proc/self/limits. > > Signed-off-by: Topi Miettinen > --- > fs/proc/base.c| 4 > include/linux/sched.h | 1 + > kernel/fork.c | 5 + >

[PATCH 11/30] drbd: when receiving P_TRIM, zero-out partial unaligned chunks

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg We can avoid spurious data divergence caused by partially-ignored discards on certain backends with discard_zeroes_data=0, if we translate partial unaligned discard requests into explicit zero-out. The relevant use case is LVM/DM thin. If on different nodes, DRBD is backed
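
As a rough user-space illustration of the idea in this snippet (not the DRBD implementation), the sketch below splits a discard request into an unaligned head and tail, which would be zeroed out explicitly, and an aligned middle, which can be passed down as a real discard. The function name `split_discard` and the sector-based `granularity` parameter are assumptions made for the example.

```python
def split_discard(start, nr_sectors, granularity):
    """Split [start, start + nr_sectors) into zero-out and discard ranges.

    Returns a list of (op, start, length) tuples, where op is "zeroout"
    for partial unaligned chunks and "discard" for the aligned middle.
    All values are in sectors; granularity must be positive.
    """
    if granularity <= 0:
        raise ValueError("granularity must be positive")
    end = start + nr_sectors
    # First aligned boundary at or after start, last one at or before end.
    aligned_start = -(-start // granularity) * granularity
    aligned_end = (end // granularity) * granularity

    if aligned_start >= aligned_end:
        # No full aligned chunk inside the request: zero out everything.
        return [("zeroout", start, nr_sectors)] if nr_sectors else []

    ops = []
    if start < aligned_start:
        ops.append(("zeroout", start, aligned_start - start))
    ops.append(("discard", aligned_start, aligned_end - aligned_start))
    if aligned_end < end:
        ops.append(("zeroout", aligned_end, end - aligned_end))
    return ops
```

For example, with a granularity of 128 sectors, a request starting at sector 10 with a length of 300 sectors is split into a zero-out of sectors 10..127, a discard of sectors 128..255, and a zero-out of sectors 256..309.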

[PATCH 05/30] drbd: Introduce new disk config option rs-discard-granularity

2016-06-13 Thread Philipp Reisner
As long as the value is 0 the feature is disabled. With setting it to a positive value, DRBD limits and aligns its resync requests to the rs-discard-granularity setting. If the sync source detects all zeros in such a block, the resync target discards the range on disk. Signed-off-by: Philipp
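
The behavior described here can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the DRBD resync code: the buffer is assumed to start on a granularity boundary, and `plan_resync` and its return format are hypothetical.

```python
def plan_resync(data, granularity):
    """Return a list of (action, offset, length) for resyncing `data`.

    granularity == 0 models the disabled feature: one plain write.
    Otherwise the buffer is cut into granularity-sized chunks, and any
    full chunk that contains only zero bytes is marked for discard.
    """
    if granularity == 0:
        return [("write", 0, len(data))]
    plan = []
    for offset in range(0, len(data), granularity):
        chunk = data[offset:offset + granularity]
        all_zero = len(chunk) == granularity and not any(chunk)
        plan.append(("discard" if all_zero else "write", offset, len(chunk)))
    return plan
```

A short trailing chunk is always written in this model, since it does not cover a full granularity-sized block.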

[PATCH 06/30] drbd: Create the protocol feature THIN_RESYNC

2016-06-13 Thread Philipp Reisner
If thinly provisioned volumes are used, during a resync the sync source tries to find out if a block is deallocated. If it is deallocated, then the resync target uses block_dev_issue_zeroout() on the range in question. Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg ---

[PATCH 10/30] drbd: allow parallel flushes for multi-volume resources

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg To maintain write-order fidelity across all volumes in a DRBD resource, the receiver of a P_BARRIER needs to issue flushes to all volumes. We used to do this by calling blkdev_issue_flush(), synchronously, one volume at a time. We now submit all flushes to all volumes in
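
The difference between flushing one volume at a time and submitting all flushes before waiting can be sketched in user space as follows. This is only an analogy using threads, not kernel code; `flush_all` and the `flush` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def flush_all(volumes, flush):
    """Submit flush(volume) for every volume, then wait for all of them.

    Returns a dict mapping each volume to None on success or to the
    exception raised by its flush.
    """
    results = {}
    if not volumes:
        return results
    with ThreadPoolExecutor(max_workers=len(volumes)) as pool:
        # Submit everything first so the flushes can overlap ...
        futures = {vol: pool.submit(flush, vol) for vol in volumes}
        # ... and only then wait for each completion.
        for vol, fut in futures.items():
            results[vol] = fut.exception()
    return results
```

The total wait is then bounded by the slowest flush rather than by the sum of all flushes.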

[PATCH 09/30] drbd: fix for truncated minor number in callback command line

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg The command line parameter the kernel module uses to communicate the device minor to the userland helper is flawed in a way that the device identifier "minor-%d" is being truncated to minors with a maximum of 5 digits. But DRBD 8.4 allows 2^20 == 1048576 minors, thus a minimum
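
The truncation can be reproduced with a small model of a fixed-size C buffer. This is a sketch only: `format_minor` imitates the truncating behavior of `snprintf`, and the buffer sizes used below are assumptions chosen to show the effect, not values taken from the patch.

```python
def format_minor(minor, bufsize):
    """Mimic C snprintf(buf, bufsize, "minor-%d", minor).

    At most bufsize - 1 characters are kept, since one byte of the C
    buffer is reserved for the terminating NUL.
    """
    return ("minor-%d" % minor)[:bufsize - 1]


MAX_MINOR = 2**20 - 1          # highest minor when 2^20 minors are allowed
DIGITS = len(str(MAX_MINOR))   # decimal digits needed for that minor
```

A buffer sized for "minor-" plus 5 digits silently drops the trailing digits of larger minors, while a buffer sized for `DIGITS` digits keeps the identifier intact.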

[PATCH 12/30] drbd: possibly disable discard support, if backend has discard_zeroes_data=0

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg Now that we have the discard_zeroes_if_aligned setting, we should also check it when setting up our queue parameters on the primary, not only on the receiving side. We announce discard support, UNLESS * we are connected to a peer that does not support TRIM on the DRBD

[PATCH 02/30] drbd: change bitmap write-out when leaving resync states

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg When leaving resync states because of disconnect, do the bitmap write-out synchronously in the drbd_disconnected() path. When leaving resync states because we go back to AHEAD/BEHIND, or because resync actually finished, or some disk was lost during resync, trigger the

[PATCH 03/30] drbd: Kill code duplication

2016-06-13 Thread Philipp Reisner
Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_nl.c | 18 ++ 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c index 0bac9c8..fad03e4 100644 ---

[PATCH 01/30] drbd: bitmap bulk IO: do not always suspend IO

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg The intention was to only suspend IO if some normal bitmap operation is supposed to be locked out, not always. If the bulk operation is flagged as BM_LOCKED_CHANGE_ALLOWED, we do not need to suspend IO. Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg ---

[PATCH 1/6] genhd: Add GENHD_FL_DAX to gendisk flags

2016-06-13 Thread Toshi Kani
Currently, presence of direct_access() in block_device_operations indicates support of DAX on its block device. Because block_device_operations is instantiated with 'const', this DAX capability may not be enabled conditionally. In preparation for supporting DAX to device-mapper devices, add

[PATCH 6/6] dm: Enable DAX support for mapper device

2016-06-13 Thread Toshi Kani
Add a new dm type, DM_TYPE_DAX_BIO_BASED, which indicates that mapped device supports DAX and is bio based. This new type is used to assure that all target devices have DAX support and remain that way after GENHD_FL_DAX is set to mapped device. At initial table load, GENHD_FL_DAX is set to

[PATCH 4/6] dm-linear: Add linear_direct_access()

2016-06-13 Thread Toshi Kani
Change dm-linear to implement direct_access function, linear_direct_access(), which maps sector and calls direct_access function of its target device. Signed-off-by: Toshi Kani Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Dan Williams Cc: Ross Zwisler --- drivers/md/dm-linear.c | 17
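
The "map the sector, then call the underlying device" pattern described here can be sketched as a toy model. This is not the dm-linear code: the `LinearTarget` class, its field names, and the callback signature are hypothetical, and only the offset arithmetic of a linear mapping is shown.

```python
class LinearTarget:
    """Toy model of a linear mapping onto an underlying device."""

    def __init__(self, begin, length, dev_start, dev_direct_access):
        self.begin = begin            # first sector of this target in the mapped device
        self.length = length          # number of sectors covered by the target
        self.dev_start = dev_start    # where the target starts on the underlying device
        self.dev_direct_access = dev_direct_access

    def map_sector(self, sector):
        if not self.begin <= sector < self.begin + self.length:
            raise ValueError("sector outside this target")
        return self.dev_start + (sector - self.begin)

    def direct_access(self, sector):
        # Translate the sector, then delegate to the underlying device.
        return self.dev_direct_access(self.map_sector(sector))
```

For a target that begins at sector 1000 of the mapped device and at sector 2048 of the underlying device, sector 1010 is forwarded as sector 2058.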

[PATCH 5/6] dm, dm-linear: Add dax_supported to dm_target

2016-06-13 Thread Toshi Kani
Extend 'struct dm_target' to have dax_supported bit, which allows dm-table to check if a dm-target supports DAX. Change dm-linear to set this bit when its target physical device supports DAX. Signed-off-by: Toshi Kani Cc: Alasdair Kergon Cc: Mike Snitzer Cc: Dan Williams Cc: Ross Zwisler

[PATCH 2/6] block: Check GENHD_FL_DAX for DAX capability

2016-06-13 Thread Toshi Kani
Now that GENHD_FL_DAX is set to all drivers supporting DAX, change bdev_direct_access() and __blkdev_get() to check this GENHD_FL_DAX flag. Signed-off-by: Toshi Kani Cc: Alexander Viro Cc: Dan Williams Cc: Ross Zwisler Cc: --- fs/block_dev.c |5 +++-- 1 file changed, 3 insertions(+), 2

[PATCH 0/6] Support DAX for device-mapper dm-linear devices

2016-06-13 Thread Toshi Kani
This patch-set adds DAX support to device-mapper dm-linear devices used by LVM. It works with LVM commands as follows: - Creation of a logical volume with all DAX capable devices (such as pmem) sets the logical volume DAX capable as well. - Once a logical volume is set to DAX capable, the

[PATCH 3/6] dm: Add dm_blk_direct_access() for mapped device

2016-06-13 Thread Toshi Kani
Change mapped device to implement direct_access function, dm_blk_direct_access(), which calls a target direct_access function. 'struct target_type' is extended to have target direct_access interface. This function limits direct accessible size to the dm_target's limit with max_io_len().

[PATCH 23/30] drbd: sync_handshake: handle identical uuids with current (frozen) Primary

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg If in a two-primary scenario, we lost our peer, freeze IO, and are still frozen (no UUID rotation) when the peer comes back as Secondary after a hard crash, we will see identical UUIDs. The "rule_nr = 40" chose to use the "CRASHED_PRIMARY" bit as arbitration, but that would

[PATCH 18/30] drbd: if there is no good data accessible, writes should be IO errors

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg If DRBD lost all paths to good data, and the on-no-data-accessible policy is OND_SUSPEND_IO, all pending and new IO requests are suspended (will block). If that setting is OND_IO_ERROR, IO will still be completed. READ to "clean" areas (e.g. on a D_INCONSISTENT device, and

[PATCH 19/30] drbd: only restart frozen disk io when D_UP_TO_DATE

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg When re-attaching the local backend device to a C_STANDALONE D_DISKLESS R_PRIMARY with OND_SUSPEND_IO, we may only resume IO if we recognize the backend that is being attached as D_UP_TO_DATE. Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg ---

[PATCH 13/30] drbd: zero-out partial unaligned discards on local backend

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg For consistency, also zero-out partial unaligned chunks of discard requests on the local backend. Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_int.h | 2 ++ drivers/block/drbd/drbd_req.c | 29 +++-- 2

[PATCH 20/30] drbd: discard_zeroes_if_aligned allows "thin" resync for discard_zeroes_data=0

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg Even if discard_zeroes_data == 0, if discard_zeroes_if_aligned is set, we assume we can reliably zero-out/discard using the drbd_issue_peer_discard() helper. Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_nl.c | 9 ++--- 1

[PATCH 15/30] drbd: finish resync on sync source only by notification from sync target

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg If the replication link breaks exactly during "resync finished" detection, finishing too early on the sync source could again lead to UUIDs rotated too fast, and potentially a spurious full resync on next handshake. Always wait for explicit resync finished state change

[PATCH 14/30] drbd: allow larger max_discard_sectors

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg Make sure we have at least 67 (> AL_UPDATES_PER_TRANSACTION) al-extents available, and allow up to half of that to be discarded in one bio. Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_actlog.c | 2 +-

[PATCH 21/30] drbd: report sizes if rejecting too small peer disk

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg --- drivers/block/drbd/drbd_receiver.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index cb80fb4..367b8e9

[PATCH 16/30] drbd: introduce unfence-peer handler

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg When resync is finished, we already call the "after-resync-target" handler (on the former sync target, obviously), once per volume. Paired with the before-resync-target handler, you can create snapshots, before the resync causes the volumes to become inconsistent, and

[PATCH 29/30] drbd: al_write_transaction: skip re-scanning of bitmap page pointer array

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg For larger devices, the array of bitmap page pointers can grow very large (8000 pointers per TB of storage). For each activity log transaction, we need to flush the associated bitmap pages to stable storage. Currently, we just "mark" the respective pages while setting up
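
The figure of roughly 8000 page pointers per TB can be checked with a short calculation. The sketch assumes one bitmap bit per 4 KiB of storage and 4 KiB pages; both are defaults of the hypothetical helper `bitmap_pages` and are not stated in the snippet itself.

```python
def bitmap_pages(storage_bytes, bytes_per_bit=4096, page_size=4096):
    """Number of bitmap pages (and thus page pointers) for a device."""
    bits = -(-storage_bytes // bytes_per_bit)   # one bit per block, rounded up
    bitmap_bytes = -(-bits // 8)                # eight bits per byte, rounded up
    return -(-bitmap_bytes // page_size)        # whole pages, rounded up
```

Under these assumptions, 1 TiB of storage needs 2^28 bits, which is 32 MiB of bitmap, or 8192 pages.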

[PATCH 30/30] drbd: correctly handle failed crypto_alloc_hash

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg crypto_alloc_hash returns an ERR_PTR(), not NULL. Also reset peer_integrity_tfm to NULL, to not call crypto_free_hash() on an errno in the cleanup path. Reported-by: Insu Yun Signed-off-by: Philipp Reisner Signed-off-by: Lars Ellenberg ---

[PATCH 17/30] drbd: don't forget error completion when "unsuspending" IO

2016-06-13 Thread Philipp Reisner
From: Lars Ellenberg Possible sequence of events: SyncTarget is made Primary, then loses replication link (only path to good data on SyncSource). Behavior is then controlled by the on-no-data-accessible policy, which defaults to OND_IO_ERROR (may be set to OND_SUSPEND_IO). If OND_IO_ERROR is

Linux 3.10.102

2016-06-13 Thread Willy Tarreau
Linux 3.10.102 was just released. The patch and changelog will appear soon at the following locations: https://www.kernel.org/pub/linux/kernel/v3.x/ https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.10.102.xz https://www.kernel.org/pub/linux/kernel/v3.x/patch-3.10.102.gz

Re: [PATCH 00/30] DRBD updates

2016-06-13 Thread Philipp Reisner
[...] > If you want me to add it to that branch (which is where it should go), > then why aren't the patches against that branch? I get rejects on > several of the patches, mainly because they are not done on top of this > particular branch. > > We can do two things here. I can skip patches, I

[PART2 RFC v2 06/10] iommu/amd: Implements irq_set_vcpu_affinity() hook to setup vapic mode for pass-through devices

2016-06-13 Thread Suravee Suthikulpanit
This patch implements irq_set_vcpu_affinity() function to set up interrupt remapping table entry with vapic mode for pass-through devices. In case requirements for vapic mode are not met, it falls back to set up the IRTE in legacy mode. Signed-off-by: Suravee Suthikulpanit ---

[PART2 RFC v2 10/10] svm: Update AMD IOMMU IRTE with vcpu scheduling information when enable AVIC

2016-06-13 Thread Suravee Suthikulpanit
In case AVIC is enabled, during vcpu_load/unload, SVM needs to update IOMMU IRTE with appropriate host physical APIC ID. Also, when vcpu_blocking/unblocking, SVM needs to update the is-running bit in the IOMMU IRTE. Both are achieved via calling amd_iommu_update_ga(). However, if GA mode is not

[PART2 RFC v2 00/10] iommu/AMD: Introduce IOMMU AVIC support

2016-06-13 Thread Suravee Suthikulpanit
CHANGES FROM V1 === * Rename the new mode from ga to vapic to be consistent with the current terminology used in KVM for guest virtual APIC (1/10 and 2/10). * Rename the member enum amd_iommu_intr_mode_type, and provide helper functions (1/10 and 2/10). * Re-factor

[PART2 RFC v2 09/10] svm: Implements update_pi_irte hook to setup posted interrupt

2016-06-13 Thread Suravee Suthikulpanit
This patch implements the update_pi_irte function hook to allow SVM to communicate with the IOMMU driver regarding how to set up the IRTE for handling posted interrupts. Signed-off-by: Suravee Suthikulpanit --- arch/x86/kvm/svm.c | 107 + 1 file changed, 107

[PART2 RFC v2 01/10] iommu/amd: Detect and enable guest vAPIC support

2016-06-13 Thread Suravee Suthikulpanit
This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir, which can be used to specify different interrupt remapping modes for devices passed through to a VM guest: * legacy: Legacy interrupt remapping (w/ 32-bit IRTE) * vapic : Guest vAPIC interrupt remapping (w/ GA mode 128-bit

[PART2 RFC v2 05/10] iommu/amd: Introduce amd_iommu_update_ga()

2016-06-13 Thread Suravee Suthikulpanit
Introduces a new IOMMU API, amd_iommu_update_ga(), which allows KVM (SVM) to update an existing posted interrupt IOMMU IRTE when loading/unloading a vcpu. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd_iommu.c | 67 +

[PART2 RFC v2 03/10] iommu/amd: Detect and initialize guest vAPIC log

2016-06-13 Thread Suravee Suthikulpanit
This patch adds support to detect and initialize the IOMMU Guest vAPIC log (GALOG). By default, it also enables the GALOG interrupt to notify the IOMMU driver when a GA Log entry is created. Signed-off-by: Suravee Suthikulpanit --- drivers/iommu/amd_iommu_init.c | 82 +

[PART2 RFC v2 07/10] iommu/amd: Enable vAPIC interrupt remapping mode by default

2016-06-13 Thread Suravee Suthikulpanit
Introduce the struct iommu_dev_data.use_vapic flag, which the IOMMU driver uses to determine if it should enable vAPIC support, by setting the ga_mode bit in the device's interrupt remapping table entry. Currently, it is enabled for all pass-through devices if vAPIC mode is enabled. Signed-off-by:

[PART2 RFC v2 02/10] iommu/amd: Add support for 128-bit IRTE

2016-06-13 Thread Suravee Suthikulpanit
This patch adds a new data structure for the 128-bit IOMMU IRTE format, which can support both legacy and vapic interrupt remapping modes. It also provides helper functions for setting up, accessing, and updating interrupt remapping table entries in different modes. Signed-off-by: Suravee

[PART2 RFC v2 08/10] svm: Introduce AMD IOMMU avic_ga_log_notifier

2016-06-13 Thread Suravee Suthikulpanit
This patch introduces avic_ga_log_notifier, which will be called by IOMMU driver whenever it handles the Guest vAPIC (GA) log entry. Signed-off-by: Suravee Suthikulpanit --- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/kvm/svm.c | 58 -

[PART2 RFC v2 04/10] iommu/amd: Adding GALOG interrupt handler

2016-06-13 Thread Suravee Suthikulpanit
This patch adds the AMD IOMMU guest virtual APIC log (GALOG) handler. When IOMMU hardware receives an interrupt targeting a blocking vcpu, it creates an entry in the GALOG, and generates an interrupt to notify the AMD IOMMU driver. At this point, the driver processes the log entry, and notifies the SVM

Re: [PATCH v6 3/6] crypto: AF_ALG -- add asymmetric cipher interface

2016-06-13 Thread Andrew Zaborowski
Hi, On 8 June 2016 at 21:14, Mat Martineau wrote: > On Wed, 8 Jun 2016, Stephan Mueller wrote: >> What is your concern? > Userspace must allocate larger buffers than it knows are necessary for > expected results. > > It looks like the software rsa implementation handles shorter output buffers >

Re: [PATCH] writeback: inode cgroup wb switch should skip inode with zero i_count

2016-06-13 Thread Tejun Heo
On Wed, Jun 08, 2016 at 07:59:28PM -0700, Tahsin Erdogan wrote: > Asynchronous wb switching of inodes takes an additional ref count on an > inode to make sure inode remains valid until switchover is completed. > > However, it is possible that inode->i_count has already reached zero > while inode
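
The idea in the quoted description, taking an extra reference only while the count is still non-zero and skipping the object otherwise, can be sketched in user-space C11 as below. This is a minimal illustration and is not taken from the patch: `struct obj`, `obj_get_unless_zero()` and `obj_put()` are hypothetical names, and the kernel's inode handling involves locking and state flags that are not modeled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted object; this is NOT struct inode. */
struct obj {
    atomic_int refcount;
};

/*
 * Take a reference only if the count is currently non-zero.
 * Returns true if a reference was taken, false if the caller should skip
 * the object because its count had already dropped to zero.
 */
static bool obj_get_unless_zero(struct obj *o)
{
    int old = atomic_load(&o->refcount);

    while (old != 0) {
        assert(old > 0); /* a negative count would indicate a bug elsewhere */
        /* On failure, 'old' is refreshed with the current value and we retry. */
        if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference previously taken with obj_get_unless_zero(). */
static void obj_put(struct obj *o)
{
    atomic_fetch_sub(&o->refcount, 1);
}
```

The compare-and-exchange loop means the count is never raised from zero, so an object that is already on its way to being freed is left alone.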

Re: [PATCH v2 3/3] ARM64: dts: meson-gxbb: Add Hardware Random Generator node

2016-06-13 Thread Kevin Hilman
Hi Neil, Neil Armstrong writes: > Signed-off-by: Neil Armstrong > --- > arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi > b/arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi > index

Re: [PATCH v2 0/3] hw_random: Add Amlogic Meson SoCs Random Generator driver

2016-06-13 Thread Kevin Hilman
Hi Herbert, Herbert Xu writes: > On Fri, Jun 10, 2016 at 10:21:52AM +0200, Neil Armstrong wrote: >> Add support for the Amlogic Meson SoCs Hardware Random generator as a >> hw_random char driver. >> The generator is a single 32bit wide register. >> Also adds the Meson GXBB SoC DTSI node and

Re: [PATCH] x86 / hibernate: Fix 64-bit code passing control to image kernel

2016-06-13 Thread Rafael J. Wysocki
On Monday, June 13, 2016 02:58:57 PM Kees Cook wrote: > On Mon, Jun 13, 2016 at 6:42 AM, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki > > > > Logan Gunthorpe reports that hibernation stopped working reliably for > > him after commit ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table

[PATCH] x86/KASLR: remove x86 hibernation restrictions

2016-06-13 Thread Kees Cook
With the commit "Fix 64-bit code passing control to image kernel", there is no longer a problem with hibernation resuming a KASLR-booted kernel image. Signed-off-by: Kees Cook --- Depends on: https://lkml.org/lkml/2016/6/13/442 --- Documentation/kernel-parameters.txt | 10 --

Re: [RFC 02/18] cgroup_pids: track maximum pids

2016-06-13 Thread Tejun Heo
On Mon, Jun 13, 2016 at 09:59:32PM +, Topi Miettinen wrote: > On 06/13/16 21:33, Tejun Heo wrote: > > Hello, > > > > On Mon, Jun 13, 2016 at 09:29:32PM +, Topi Miettinen wrote: > >> I used fork callback as I don't want to lower the watermark in all cases > >> where the charge can be

Re: [RFC 02/18] cgroup_pids: track maximum pids

2016-06-13 Thread Topi Miettinen
On 06/13/16 21:33, Tejun Heo wrote: > Hello, > > On Mon, Jun 13, 2016 at 09:29:32PM +, Topi Miettinen wrote: >> I used fork callback as I don't want to lower the watermark in all cases >> where the charge can be lowered, so I'd update the watermark only when >> the fork really happens. > > I
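
The approach described in the quoted text, updating the watermark only once a fork has really happened rather than at every charge, might look roughly like the following single-threaded sketch. It is an assumption-laden illustration, not the RFC's code: `struct pids_counter` and the three helper functions are hypothetical names, and the real pids controller uses atomic counters and a cgroup hierarchy that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-group counter; not the kernel's pids controller state. */
struct pids_counter {
    long count;     /* pids currently charged */
    long limit;     /* maximum allowed charge */
    long watermark; /* highest count observed at a completed fork */
};

/* Charge one pid ahead of a fork attempt; fails if the limit would be exceeded. */
static bool pids_try_charge(struct pids_counter *c)
{
    if (c->count + 1 > c->limit)
        return false;
    c->count++;
    return true;
}

/* Undo a charge when the fork fails after charging. The watermark is untouched. */
static void pids_cancel_charge(struct pids_counter *c)
{
    assert(c->count > 0);
    c->count--;
}

/* Called only once the fork has really happened: record a new high-water mark. */
static void pids_fork_done(struct pids_counter *c)
{
    if (c->count > c->watermark)
        c->watermark = c->count;
}
```

Because the watermark is only raised in `pids_fork_done()`, a charge that is later cancelled never shows up in the recorded maximum.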
