Re: [PATCH v3 08/17] KVM: PPC: Book3S HV: XIVE: add a control to sync the sources

2019-03-17 Thread David Gibson
On Fri, Mar 15, 2019 at 01:06:00PM +0100, Cédric Le Goater wrote:
> This control will be used by the H_INT_SYNC hcall from QEMU to flush
> event notifications on the XIVE IC owning the source.
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 

> ---
> 
>  Changes since v2 :
> 
>  - fixed locking on source block
> 
>  arch/powerpc/include/uapi/asm/kvm.h|  1 +
>  arch/powerpc/kvm/book3s_xive_native.c  | 36 ++
>  Documentation/virtual/kvm/devices/xive.txt |  8 +
>  3 files changed, 45 insertions(+)
> 
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
> b/arch/powerpc/include/uapi/asm/kvm.h
> index 95e82ab57c03..fc9211dbfec8 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -681,6 +681,7 @@ struct kvm_ppc_cpu_char {
>  #define KVM_DEV_XIVE_GRP_SOURCE  2   /* 64-bit source 
> identifier */
>  #define KVM_DEV_XIVE_GRP_SOURCE_CONFIG   3   /* 64-bit source 
> identifier */
>  #define KVM_DEV_XIVE_GRP_EQ_CONFIG   4   /* 64-bit EQ identifier */
> +#define KVM_DEV_XIVE_GRP_SOURCE_SYNC 5   /* 64-bit source identifier */
>  
>  /* Layout of 64-bit XIVE source attribute values */
>  #define KVM_XIVE_LEVEL_SENSITIVE (1ULL << 0)
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c 
> b/arch/powerpc/kvm/book3s_xive_native.c
> index 3385c336fd89..26ac3c505cd2 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -340,6 +340,38 @@ static int kvmppc_xive_native_set_source_config(struct 
> kvmppc_xive *xive,
>  priority, masked, eisn);
>  }
>  
> +static int kvmppc_xive_native_sync_source(struct kvmppc_xive *xive,
> +   long irq, u64 addr)
> +{
> + struct kvmppc_xive_src_block *sb;
> + struct kvmppc_xive_irq_state *state;
> + struct xive_irq_data *xd;
> + u32 hw_num;
> + u16 src;
> + int rc = 0;
> +
> + pr_devel("%s irq=0x%lx", __func__, irq);
> +
> + sb = kvmppc_xive_find_source(xive, irq, &src);
> + if (!sb)
> + return -ENOENT;
> +
> + state = &sb->irq_state[src];
> +
> + rc = -EINVAL;
> +
> + arch_spin_lock(&sb->lock);
> +
> + if (state->valid) {
> + kvmppc_xive_select_irq(state, &hw_num, &xd);
> + xive_native_sync_source(hw_num);
> + rc = 0;
> + }
> +
> + arch_spin_unlock(&sb->lock);
> + return rc;
> +}
> +
>  static int xive_native_validate_queue_size(u32 qsize)
>  {
>   /*
> @@ -658,6 +690,9 @@ static int kvmppc_xive_native_set_attr(struct kvm_device 
> *dev,
>   case KVM_DEV_XIVE_GRP_EQ_CONFIG:
>   return kvmppc_xive_native_set_queue_config(xive, attr->attr,
>  attr->addr);
> + case KVM_DEV_XIVE_GRP_SOURCE_SYNC:
> + return kvmppc_xive_native_sync_source(xive, attr->attr,
> +   attr->addr);
>   }
>   return -ENXIO;
>  }
> @@ -687,6 +722,7 @@ static int kvmppc_xive_native_has_attr(struct kvm_device 
> *dev,
>   break;
>   case KVM_DEV_XIVE_GRP_SOURCE:
>   case KVM_DEV_XIVE_GRP_SOURCE_CONFIG:
> + case KVM_DEV_XIVE_GRP_SOURCE_SYNC:
>   if (attr->attr >= KVMPPC_XIVE_FIRST_IRQ &&
>   attr->attr < KVMPPC_XIVE_NR_IRQS)
>   return 0;
> diff --git a/Documentation/virtual/kvm/devices/xive.txt 
> b/Documentation/virtual/kvm/devices/xive.txt
> index e1893d303ab7..055aed0c2abb 100644
> --- a/Documentation/virtual/kvm/devices/xive.txt
> +++ b/Documentation/virtual/kvm/devices/xive.txt
> @@ -89,3 +89,11 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
>  -EINVAL: Invalid queue address
>  -EFAULT: Invalid user pointer for attr->addr.
>  -EIO:Configuration of the underlying HW failed
> +
> +  5. KVM_DEV_XIVE_GRP_SOURCE_SYNC (write only)
> +  Synchronize the source to flush event notifications
> +  Attributes:
> +Interrupt source number  (64-bit)
> +  Errors:
> +-ENOENT: Unknown source number
> +-EINVAL: Not initialized source number
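For anyone wiring this up from userspace: the control is driven through the usual KVM_SET_DEVICE_ATTR ioctl on the XIVE device fd. A minimal sketch, using a local stand-in for struct kvm_device_attr and the group value from this patch so it compiles without updated uapi headers:

```c
#include <stdint.h>
#include <string.h>

/* Group value from this patch; defined locally as a stand-in for
 * updated uapi headers. */
#define KVM_DEV_XIVE_GRP_SOURCE_SYNC 5

/* Local stand-in mirroring the layout of struct kvm_device_attr. */
struct xive_dev_attr {
	uint32_t flags;
	uint32_t group;
	uint64_t attr;
	uint64_t addr;
};

/* Build the attribute QEMU would hand to ioctl(dev_fd,
 * KVM_SET_DEVICE_ATTR, &a) when servicing an H_INT_SYNC hcall. */
static struct xive_dev_attr xive_sync_source_attr(uint64_t irq)
{
	struct xive_dev_attr a;

	memset(&a, 0, sizeof(a));
	a.group = KVM_DEV_XIVE_GRP_SOURCE_SYNC;
	a.attr = irq;	/* 64-bit source identifier */
	/* a.addr is unused for this write-only control */
	return a;
}
```

The kernel side then resolves the source block, takes the block lock, and calls xive_native_sync_source() on the HW irq number, as in the hunk above.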

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH v3 06/17] KVM: PPC: Book3S HV: XIVE: add controls for the EQ configuration

2019-03-17 Thread David Gibson
On Fri, Mar 15, 2019 at 01:05:58PM +0100, Cédric Le Goater wrote:
> These controls will be used by the H_INT_SET_QUEUE_CONFIG and
> H_INT_GET_QUEUE_CONFIG hcalls from QEMU to configure the underlying
> Event Queue in the XIVE IC. They will also be used to restore the
> configuration of the XIVE EQs and to capture the internal run-time
> state of the EQs. Both 'get' and 'set' rely on an OPAL call to access
> the EQ toggle bit and EQ index which are updated by the XIVE IC when
> event notifications are enqueued in the EQ.
> 
> The value of the guest physical address of the event queue is saved in
> the XIVE internal xive_q structure for later use, that is, when
> migration needs to mark the EQ pages dirty to capture a consistent
> memory state of the VM.
> 
> Note that H_INT_SET_QUEUE_CONFIG does not require the extra OPAL call
> that sets the EQ toggle bit and EQ index to configure the EQ, but
> restoring the EQ state does.
> 
> Signed-off-by: Cédric Le Goater 
> ---
> 
>  Changes since v2 :
>  
>  - fixed comments on the KVM device attribute definitions
>  - fixed check on supported EQ size to restrict to 64K pages
>  - checked that kvm_eq.flags is zero
>  - removed the OPAL call when EQ qtoggle bit and index are zero. 
> 
>  arch/powerpc/include/asm/xive.h|   2 +
>  arch/powerpc/include/uapi/asm/kvm.h|  21 ++
>  arch/powerpc/kvm/book3s_xive.h |   2 +
>  arch/powerpc/kvm/book3s_xive.c |  15 +-
>  arch/powerpc/kvm/book3s_xive_native.c  | 232 +
>  Documentation/virtual/kvm/devices/xive.txt |  31 +++
>  6 files changed, 297 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
> index b579a943407b..46891f321606 100644
> --- a/arch/powerpc/include/asm/xive.h
> +++ b/arch/powerpc/include/asm/xive.h
> @@ -73,6 +73,8 @@ struct xive_q {
>   u32 esc_irq;
>   atomic_tcount;
>   atomic_tpending_count;
> + u64 guest_qpage;
> + u32 guest_qsize;
>  };
>  
>  /* Global enable flags for the XIVE support */
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
> b/arch/powerpc/include/uapi/asm/kvm.h
> index 12bb01baf0ae..1cd728c87d7c 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -679,6 +679,7 @@ struct kvm_ppc_cpu_char {
>  #define KVM_DEV_XIVE_GRP_CTRL        1
>  #define KVM_DEV_XIVE_GRP_SOURCE  2   /* 64-bit source 
> identifier */
>  #define KVM_DEV_XIVE_GRP_SOURCE_CONFIG   3   /* 64-bit source 
> identifier */
> +#define KVM_DEV_XIVE_GRP_EQ_CONFIG   4   /* 64-bit EQ identifier */
>  
>  /* Layout of 64-bit XIVE source attribute values */
>  #define KVM_XIVE_LEVEL_SENSITIVE (1ULL << 0)
> @@ -694,4 +695,24 @@ struct kvm_ppc_cpu_char {
>  #define KVM_XIVE_SOURCE_EISN_SHIFT   33
>  #define KVM_XIVE_SOURCE_EISN_MASK    0xfffffffe00000000ULL
>  
> +/* Layout of 64-bit EQ identifier */
> +#define KVM_XIVE_EQ_PRIORITY_SHIFT   0
> +#define KVM_XIVE_EQ_PRIORITY_MASK    0x7
> +#define KVM_XIVE_EQ_SERVER_SHIFT     3
> +#define KVM_XIVE_EQ_SERVER_MASK      0xfffffff8ULL
> +
> +/* Layout of EQ configuration values (64 bytes) */
> +struct kvm_ppc_xive_eq {
> + __u32 flags;
> + __u32 qsize;
> + __u64 qpage;
> + __u32 qtoggle;
> + __u32 qindex;
> + __u8  pad[40];
> +};
> +
> +#define KVM_XIVE_EQ_FLAG_ENABLED 0x0001
> +#define KVM_XIVE_EQ_FLAG_ALWAYS_NOTIFY   0x0002
> +#define KVM_XIVE_EQ_FLAG_ESCALATE0x0004
> +
>  #endif /* __LINUX_KVM_POWERPC_H */
> diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
> index ae26fe653d98..622f594d93e1 100644
> --- a/arch/powerpc/kvm/book3s_xive.h
> +++ b/arch/powerpc/kvm/book3s_xive.h
> @@ -272,6 +272,8 @@ struct kvmppc_xive_src_block 
> *kvmppc_xive_create_src_block(
>   struct kvmppc_xive *xive, int irq);
>  void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb);
>  int kvmppc_xive_select_target(struct kvm *kvm, u32 *server, u8 prio);
> +int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
> +   bool single_escalation);
>  
>  #endif /* CONFIG_KVM_XICS */
>  #endif /* _KVM_PPC_BOOK3S_XICS_H */
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index e09f3addffe5..c1b7aa7dbc28 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -166,7 +166,8 @@ static irqreturn_t xive_esc_irq(int irq, void *data)
>   return IRQ_HANDLED;
>  }
>  
> -static int xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio)
> +int kvmppc_xive_attach_escalation(struct kvm_vcpu *vcpu, u8 prio,
> +   bool single_escalation)
>  {
>   struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
> + struct xive_q *q = &xc->queues[prio];
> @@ -185,7 +186,7 @@ static int 

Re: [PATCH v3 09/17] KVM: PPC: Book3S HV: XIVE: add a control to dirty the XIVE EQ pages

2019-03-17 Thread David Gibson
On Fri, Mar 15, 2019 at 01:06:01PM +0100, Cédric Le Goater wrote:
> When migration of a VM is initiated, a first copy of the RAM is
> transferred to the destination before the VM is stopped, but there is
> no guarantee that the EQ pages in which the event notifications are
> queued have not been modified.
> 
> To make sure migration will capture a consistent memory state, the
> XIVE device should perform a XIVE quiesce sequence to stop the flow of
> event notifications and stabilize the EQs. This is the purpose of the
> KVM_DEV_XIVE_EQ_SYNC control which also marks the EQ pages dirty
> to force their transfer.
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 

> ---
> 
>  Changes since v2 :
> 
>  - Extra comments
>  - fixed locking on source block
> 
>  arch/powerpc/include/uapi/asm/kvm.h|  1 +
>  arch/powerpc/kvm/book3s_xive_native.c  | 85 ++
>  Documentation/virtual/kvm/devices/xive.txt | 29 
>  3 files changed, 115 insertions(+)
> 
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
> b/arch/powerpc/include/uapi/asm/kvm.h
> index fc9211dbfec8..caf52be89494 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -678,6 +678,7 @@ struct kvm_ppc_cpu_char {
>  /* POWER9 XIVE Native Interrupt Controller */
>  #define KVM_DEV_XIVE_GRP_CTRL        1
>  #define   KVM_DEV_XIVE_RESET 1
> +#define   KVM_DEV_XIVE_EQ_SYNC   2
>  #define KVM_DEV_XIVE_GRP_SOURCE  2   /* 64-bit source 
> identifier */
>  #define KVM_DEV_XIVE_GRP_SOURCE_CONFIG   3   /* 64-bit source 
> identifier */
>  #define KVM_DEV_XIVE_GRP_EQ_CONFIG   4   /* 64-bit EQ identifier */
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c 
> b/arch/powerpc/kvm/book3s_xive_native.c
> index 26ac3c505cd2..ea091c0a8fb6 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -669,6 +669,88 @@ static int kvmppc_xive_reset(struct kvmppc_xive *xive)
>   return 0;
>  }
>  
> +static void kvmppc_xive_native_sync_sources(struct kvmppc_xive_src_block *sb)
> +{
> + int j;
> +
> + for (j = 0; j < KVMPPC_XICS_IRQ_PER_ICS; j++) {
> + struct kvmppc_xive_irq_state *state = &sb->irq_state[j];
> + struct xive_irq_data *xd;
> + u32 hw_num;
> +
> + if (!state->valid)
> + continue;
> +
> + /*
> +  * The struct kvmppc_xive_irq_state reflects the state
> +  * of the EAS configuration and not the state of the
> +  * source. The source is masked setting the PQ bits to
> +  * '-Q', which is what is being done before calling
> +  * the KVM_DEV_XIVE_EQ_SYNC control.
> +  *
> +  * If a source EAS is configured, OPAL syncs the XIVE
> +  * IC of the source and the XIVE IC of the previous
> +  * target if any.
> +  *
> +  * So it should be fine ignoring MASKED sources as
> +  * they have been synced already.
> +  */
> + if (state->act_priority == MASKED)
> + continue;
> +
> + kvmppc_xive_select_irq(state, &hw_num, &xd);
> + xive_native_sync_source(hw_num);
> + xive_native_sync_queue(hw_num);
> + }
> +}
> +
> +static int kvmppc_xive_native_vcpu_eq_sync(struct kvm_vcpu *vcpu)
> +{
> + struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
> + unsigned int prio;
> +
> + if (!xc)
> + return -ENOENT;
> +
> + for (prio = 0; prio < KVMPPC_XIVE_Q_COUNT; prio++) {
> + struct xive_q *q = &xc->queues[prio];
> +
> + if (!q->qpage)
> + continue;
> +
> + /* Mark EQ page dirty for migration */
> + mark_page_dirty(vcpu->kvm, gpa_to_gfn(q->guest_qpage));
> + }
> + return 0;
> +}
> +
> +static int kvmppc_xive_native_eq_sync(struct kvmppc_xive *xive)
> +{
> + struct kvm *kvm = xive->kvm;
> + struct kvm_vcpu *vcpu;
> + unsigned int i;
> +
> + pr_devel("%s\n", __func__);
> +
> + mutex_lock(&kvm->lock);
> + for (i = 0; i <= xive->max_sbid; i++) {
> + struct kvmppc_xive_src_block *sb = xive->src_blocks[i];
> +
> + if (sb) {
> + arch_spin_lock(&sb->lock);
> + kvmppc_xive_native_sync_sources(sb);
> + arch_spin_unlock(&sb->lock);
> + }
> + }
> +
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + kvmppc_xive_native_vcpu_eq_sync(vcpu);
> + }
> + mutex_unlock(&kvm->lock);
> +
> + return 0;
> +}
> +
>  static int kvmppc_xive_native_set_attr(struct kvm_device *dev,
>  struct kvm_device_attr *attr)
>  {
> @@ -679,6 +761,8 @@ static int kvmppc_xive_native_set_attr(struct kvm_device 
> *dev,
>   switch (attr->attr) {
>   case 
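The dirty-marking in kvmppc_xive_native_vcpu_eq_sync() boils down to a guest-physical-address to frame-number conversion before mark_page_dirty(). A trivial sketch; the 64K PAGE_SHIFT is an assumption for the POWER9 hosts targeted here:

```c
#include <stdint.h>

#define PAGE_SHIFT 16	/* 64K pages; assumption for this sketch */

/* Equivalent of gpa_to_gfn(): the guest EQ page address saved in
 * xive_q::guest_qpage becomes a guest frame number. */
static uint64_t gpa_to_gfn(uint64_t gpa)
{
	return gpa >> PAGE_SHIFT;
}
```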

Re: [PATCH v3 11/17] KVM: introduce a 'mmap' method for KVM devices

2019-03-17 Thread David Gibson
On Fri, Mar 15, 2019 at 01:06:03PM +0100, Cédric Le Goater wrote:
> Some KVM devices will want to handle special mappings related to the
> underlying HW. For instance, the XIVE interrupt controller of the
> POWER9 processor has MMIO pages for thread interrupt management and
> for interrupt source control that need to be exposed to the guest when
> the OS has the required support.
> 
> Cc: Paolo Bonzini 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 

> ---
>  include/linux/kvm_host.h |  1 +
>  virt/kvm/kvm_main.c  | 11 +++
>  2 files changed, 12 insertions(+)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index c38cc5eb7e73..cbf81487b69f 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1223,6 +1223,7 @@ struct kvm_device_ops {
>   int (*has_attr)(struct kvm_device *dev, struct kvm_device_attr *attr);
>   long (*ioctl)(struct kvm_device *dev, unsigned int ioctl,
> unsigned long arg);
> + int (*mmap)(struct kvm_device *dev, struct vm_area_struct *vma);
>  };
>  
>  void kvm_device_get(struct kvm_device *dev);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 076bc38963bf..e4881a8c2a6f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2878,6 +2878,16 @@ static long kvm_vcpu_compat_ioctl(struct file *filp,
>  }
>  #endif
>  
> +static int kvm_device_mmap(struct file *filp, struct vm_area_struct *vma)
> +{
> + struct kvm_device *dev = filp->private_data;
> +
> + if (dev->ops->mmap)
> + return dev->ops->mmap(dev, vma);
> +
> + return -ENODEV;
> +}
> +
>  static int kvm_device_ioctl_attr(struct kvm_device *dev,
>int (*accessor)(struct kvm_device *dev,
>struct kvm_device_attr *attr),
> @@ -2927,6 +2937,7 @@ static const struct file_operations kvm_device_fops = {
>   .unlocked_ioctl = kvm_device_ioctl,
>   .release = kvm_device_release,
>   KVM_COMPAT(kvm_device_ioctl),
> + .mmap = kvm_device_mmap,
>  };
>  
>  struct kvm_device *kvm_device_from_filp(struct file *filp)
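The dispatch added in kvm_device_mmap() is the standard optional-op pattern: forward to the per-device handler if one is registered, otherwise fail with -ENODEV. A self-contained userspace rendering of the same logic (names hypothetical):

```c
#include <errno.h>
#include <stddef.h>

struct dev_ops {
	int (*mmap)(void *dev, void *vma);
};

struct dev {
	const struct dev_ops *ops;
};

/* Mirror of kvm_device_mmap(): only devices that registered an mmap
 * handler support the operation. */
static int dev_mmap(struct dev *d, void *vma)
{
	if (d->ops->mmap)
		return d->ops->mmap(d, vma);
	return -ENODEV;
}

static int stub_mmap(void *dev, void *vma)
{
	(void)dev;
	(void)vma;
	return 0;	/* pretend the mapping succeeded */
}

static const struct dev_ops no_mmap_ops = { .mmap = NULL };
static const struct dev_ops stub_ops = { .mmap = stub_mmap };

static int try_mmap(const struct dev_ops *ops)
{
	struct dev d = { ops };

	return dev_mmap(&d, NULL);
}
```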



Re: [PATCH v3 07/17] KVM: PPC: Book3S HV: XIVE: add a global reset control

2019-03-17 Thread David Gibson
On Fri, Mar 15, 2019 at 01:05:59PM +0100, Cédric Le Goater wrote:
> This control is to be used by the H_INT_RESET hcall from QEMU. Its
> purpose is to clear all configuration of the sources and EQs. This is
> necessary in case of a kexec (for a kdump kernel for instance) to make
> sure that no remaining configuration is left from the previous boot
> setup so that the new kernel can start safely from a clean state.
> 
> Queue 7 is ignored when the XIVE device is configured to run in
> single escalation mode, as prio 7 is then used by escalations.
> 
> The XIVE VP is kept enabled as the vCPU is still active and connected
> to the XIVE device.
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 

> ---
> 
>  Changes since v2 :
> 
>  - fixed locking on source block
> 
>  arch/powerpc/include/uapi/asm/kvm.h|  1 +
>  arch/powerpc/kvm/book3s_xive_native.c  | 85 ++
>  Documentation/virtual/kvm/devices/xive.txt |  5 ++
>  3 files changed, 91 insertions(+)
> 
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
> b/arch/powerpc/include/uapi/asm/kvm.h
> index 1cd728c87d7c..95e82ab57c03 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -677,6 +677,7 @@ struct kvm_ppc_cpu_char {
>  
>  /* POWER9 XIVE Native Interrupt Controller */
>  #define KVM_DEV_XIVE_GRP_CTRL        1
> +#define   KVM_DEV_XIVE_RESET 1
>  #define KVM_DEV_XIVE_GRP_SOURCE  2   /* 64-bit source 
> identifier */
>  #define KVM_DEV_XIVE_GRP_SOURCE_CONFIG   3   /* 64-bit source 
> identifier */
>  #define KVM_DEV_XIVE_GRP_EQ_CONFIG   4   /* 64-bit EQ identifier */
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c 
> b/arch/powerpc/kvm/book3s_xive_native.c
> index 42e824658a30..3385c336fd89 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -560,6 +560,83 @@ static int kvmppc_xive_native_get_queue_config(struct 
> kvmppc_xive *xive,
>   return 0;
>  }
>  
> +static void kvmppc_xive_reset_sources(struct kvmppc_xive_src_block *sb)
> +{
> + int i;
> +
> + for (i = 0; i < KVMPPC_XICS_IRQ_PER_ICS; i++) {
> + struct kvmppc_xive_irq_state *state = &sb->irq_state[i];
> +
> + if (!state->valid)
> + continue;
> +
> + if (state->act_priority == MASKED)
> + continue;
> +
> + state->eisn = 0;
> + state->act_server = 0;
> + state->act_priority = MASKED;
> + xive_vm_esb_load(&state->ipi_data, XIVE_ESB_SET_PQ_01);
> + xive_native_configure_irq(state->ipi_number, 0, MASKED, 0);
> + if (state->pt_number) {
> + xive_vm_esb_load(state->pt_data, XIVE_ESB_SET_PQ_01);
> + xive_native_configure_irq(state->pt_number,
> +   0, MASKED, 0);
> + }
> + }
> +}
> +
> +static int kvmppc_xive_reset(struct kvmppc_xive *xive)
> +{
> + struct kvm *kvm = xive->kvm;
> + struct kvm_vcpu *vcpu;
> + unsigned int i;
> +
> + pr_devel("%s\n", __func__);
> +
> + mutex_lock(&kvm->lock);
> +
> + kvm_for_each_vcpu(i, vcpu, kvm) {
> + struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
> + unsigned int prio;
> +
> + if (!xc)
> + continue;
> +
> + kvmppc_xive_disable_vcpu_interrupts(vcpu);
> +
> + for (prio = 0; prio < KVMPPC_XIVE_Q_COUNT; prio++) {
> +
> + /* Single escalation, no queue 7 */
> + if (prio == 7 && xive->single_escalation)
> + break;
> +
> + if (xc->esc_virq[prio]) {
> + free_irq(xc->esc_virq[prio], vcpu);
> + irq_dispose_mapping(xc->esc_virq[prio]);
> + kfree(xc->esc_virq_names[prio]);
> + xc->esc_virq[prio] = 0;
> + }
> +
> + kvmppc_xive_native_cleanup_queue(vcpu, prio);
> + }
> + }
> +
> + for (i = 0; i <= xive->max_sbid; i++) {
> + struct kvmppc_xive_src_block *sb = xive->src_blocks[i];
> +
> + if (sb) {
> + arch_spin_lock(&sb->lock);
> + kvmppc_xive_reset_sources(sb);
> + arch_spin_unlock(&sb->lock);
> + }
> + }
> +
> + mutex_unlock(&kvm->lock);
> +
> + return 0;
> +}
> +
>  static int kvmppc_xive_native_set_attr(struct kvm_device *dev,
>  struct kvm_device_attr *attr)
>  {
> @@ -567,6 +644,10 @@ static int kvmppc_xive_native_set_attr(struct kvm_device 
> *dev,
>  
>   switch (attr->group) {
>   case KVM_DEV_XIVE_GRP_CTRL:
> + switch (attr->attr) {
> + case KVM_DEV_XIVE_RESET:
> + return kvmppc_xive_reset(xive);
> + 

[RESEND 7/7] IB/mthca: Use the new FOLL_LONGTERM flag to get_user_pages_fast()

2019-03-17 Thread ira . weiny
From: Ira Weiny 

Use the new FOLL_LONGTERM to get_user_pages_fast() to protect against
FS DAX pages being mapped.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/hw/mthca/mthca_memfree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_memfree.c 
b/drivers/infiniband/hw/mthca/mthca_memfree.c
index 112d2f38e0de..8ff0e90d7564 100644
--- a/drivers/infiniband/hw/mthca/mthca_memfree.c
+++ b/drivers/infiniband/hw/mthca/mthca_memfree.c
@@ -472,7 +472,8 @@ int mthca_map_user_db(struct mthca_dev *dev, struct 
mthca_uar *uar,
goto out;
}
 
-   ret = get_user_pages_fast(uaddr & PAGE_MASK, 1, FOLL_WRITE, pages);
+   ret = get_user_pages_fast(uaddr & PAGE_MASK, 1,
+ FOLL_WRITE | FOLL_LONGTERM, pages);
if (ret < 0)
goto out;
 
-- 
2.20.1



[RESEND 6/7] IB/qib: Use the new FOLL_LONGTERM flag to get_user_pages_fast()

2019-03-17 Thread ira . weiny
From: Ira Weiny 

Use the new FOLL_LONGTERM to get_user_pages_fast() to protect against
FS DAX pages being mapped.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/hw/qib/qib_user_sdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/qib/qib_user_sdma.c 
b/drivers/infiniband/hw/qib/qib_user_sdma.c
index 31c523b2a9f5..b53cc0240e02 100644
--- a/drivers/infiniband/hw/qib/qib_user_sdma.c
+++ b/drivers/infiniband/hw/qib/qib_user_sdma.c
@@ -673,7 +673,7 @@ static int qib_user_sdma_pin_pages(const struct qib_devdata 
*dd,
else
j = npages;
 
-   ret = get_user_pages_fast(addr, j, 0, pages);
+   ret = get_user_pages_fast(addr, j, FOLL_LONGTERM, pages);
if (ret != j) {
i = 0;
j = ret;
-- 
2.20.1



[RESEND 5/7] IB/hfi1: Use the new FOLL_LONGTERM flag to get_user_pages_fast()

2019-03-17 Thread ira . weiny
From: Ira Weiny 

Use the new FOLL_LONGTERM to get_user_pages_fast() to protect against
FS DAX pages being mapped.

Signed-off-by: Ira Weiny 
---
 drivers/infiniband/hw/hfi1/user_pages.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/user_pages.c 
b/drivers/infiniband/hw/hfi1/user_pages.c
index 78ccacaf97d0..6a7f9cd5a94e 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -104,9 +104,11 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned 
long vaddr, size_t np
bool writable, struct page **pages)
 {
int ret;
+   unsigned int gup_flags = writable ? FOLL_WRITE : 0;
 
-   ret = get_user_pages_fast(vaddr, npages, writable ? FOLL_WRITE : 0,
- pages);
+   gup_flags |= FOLL_LONGTERM;
+
+   ret = get_user_pages_fast(vaddr, npages, gup_flags, pages);
if (ret < 0)
return ret;
 
-- 
2.20.1



[RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-17 Thread ira . weiny
From: Ira Weiny 

DAX pages were previously unprotected from longterm pins when users
called get_user_pages_fast().

Use the new FOLL_LONGTERM flag to check for DEVMAP pages and fall
back to regular GUP processing if a DEVMAP page is encountered.
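The fast-path decision described above is just a flag test at each page-table level: bail out of the irq-disabled walk when a DEVMAP entry meets FOLL_LONGTERM, so the slow path (which can inspect VMAs for FS DAX) takes over. A minimal model of that predicate follows; the flag values are copied from kernels of this era and should be treated as illustrative:

```c
#include <stdbool.h>

#define FOLL_WRITE	0x01	/* illustrative; see include/linux/mm.h */
#define FOLL_LONGTERM	0x10000	/* illustrative */

/* Models the check added to gup_pte_range()/gup_huge_pmd()/
 * gup_huge_pud(): a device-mapped page cannot be pinned long term
 * on the fast path. */
static bool fast_path_can_pin(unsigned int gup_flags, bool is_devmap)
{
	if (is_devmap && (gup_flags & FOLL_LONGTERM))
		return false;	/* fall back to __gup_longterm_locked() */
	return true;
}
```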

Signed-off-by: Ira Weiny 
---
 mm/gup.c | 29 +
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 0684a9536207..173db0c44678 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1600,6 +1600,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
goto pte_unmap;
 
if (pte_devmap(pte)) {
+   if (unlikely(flags & FOLL_LONGTERM))
+   goto pte_unmap;
+
pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
if (unlikely(!pgmap)) {
undo_dev_pagemap(nr, nr_start, pages);
@@ -1739,8 +1742,11 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, 
unsigned long addr,
if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
return 0;
 
-   if (pmd_devmap(orig))
+   if (pmd_devmap(orig)) {
+   if (unlikely(flags & FOLL_LONGTERM))
+   return 0;
return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
+   }
 
refs = 0;
page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
@@ -1777,8 +1783,11 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, 
unsigned long addr,
if (!pud_access_permitted(orig, flags & FOLL_WRITE))
return 0;
 
-   if (pud_devmap(orig))
+   if (pud_devmap(orig)) {
+   if (unlikely(flags & FOLL_LONGTERM))
+   return 0;
return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr);
+   }
 
refs = 0;
page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
@@ -2066,8 +2075,20 @@ int get_user_pages_fast(unsigned long start, int 
nr_pages,
start += nr << PAGE_SHIFT;
pages += nr;
 
-   ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
- gup_flags);
+   if (gup_flags & FOLL_LONGTERM) {
> +   down_read(&current->mm->mmap_sem);
+   ret = __gup_longterm_locked(current, current->mm,
+   start, nr_pages - nr,
+   pages, NULL, gup_flags);
> +   up_read(&current->mm->mmap_sem);
+   } else {
+   /*
+* retain FAULT_FOLL_ALLOW_RETRY optimization if
+* possible
+*/
+   ret = get_user_pages_unlocked(start, nr_pages - nr,
+ pages, gup_flags);
+   }
 
/* Have to be a bit careful with return values */
if (nr > 0) {
-- 
2.20.1



[RESEND 3/7] mm/gup: Change GUP fast to use flags rather than a write 'bool'

2019-03-17 Thread ira . weiny
From: Ira Weiny 

To facilitate additional options to get_user_pages_fast() change the
singular write parameter to be gup_flags.

This patch does not change any functionality.  New functionality will
follow in subsequent patches.

Some of the get_user_pages_fast() call sites were unchanged because they
already passed FOLL_WRITE or 0 for the write parameter.
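For callers, the mechanical conversion in this patch is simply: the old 'int write' argument becomes a FOLL_WRITE bit in gup_flags. Sketch (FOLL_WRITE value illustrative):

```c
#include <stdbool.h>

#define FOLL_WRITE 0x01	/* illustrative; see include/linux/mm.h */

/* The per-call-site rewrite: write bool -> gup_flags bit. */
static unsigned int gup_flags_from_write(bool write)
{
	return write ? FOLL_WRITE : 0;
}
```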

Signed-off-by: Ira Weiny 

---
Changes from V1:
Rebase to current merge tree
arch/powerpc/mm/mmu_context_iommu.c no longer calls gup_fast
The gup_longterm was converted in patch 1

 arch/mips/mm/gup.c | 11 ++-
 arch/powerpc/kvm/book3s_64_mmu_hv.c|  4 ++--
 arch/powerpc/kvm/e500_mmu.c|  2 +-
 arch/s390/kvm/interrupt.c  |  2 +-
 arch/s390/mm/gup.c | 12 ++--
 arch/sh/mm/gup.c   | 11 ++-
 arch/sparc/mm/gup.c|  9 +
 arch/x86/kvm/paging_tmpl.h |  2 +-
 arch/x86/kvm/svm.c |  2 +-
 drivers/fpga/dfl-afu-dma-region.c  |  2 +-
 drivers/gpu/drm/via/via_dmablit.c  |  3 ++-
 drivers/infiniband/hw/hfi1/user_pages.c|  3 ++-
 drivers/misc/genwqe/card_utils.c   |  2 +-
 drivers/misc/vmw_vmci/vmci_host.c  |  2 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c|  6 --
 drivers/platform/goldfish/goldfish_pipe.c  |  3 ++-
 drivers/rapidio/devices/rio_mport_cdev.c   |  4 +++-
 drivers/sbus/char/oradax.c |  2 +-
 drivers/scsi/st.c  |  3 ++-
 drivers/staging/gasket/gasket_page_table.c |  4 ++--
 drivers/tee/tee_shm.c  |  2 +-
 drivers/vfio/vfio_iommu_spapr_tce.c|  3 ++-
 drivers/vhost/vhost.c  |  2 +-
 drivers/video/fbdev/pvr2fb.c   |  2 +-
 drivers/virt/fsl_hypervisor.c  |  2 +-
 drivers/xen/gntdev.c   |  2 +-
 fs/orangefs/orangefs-bufmap.c  |  2 +-
 include/linux/mm.h |  4 ++--
 kernel/futex.c |  2 +-
 lib/iov_iter.c |  7 +--
 mm/gup.c   | 10 +-
 mm/util.c  |  8 
 net/ceph/pagevec.c |  2 +-
 net/rds/info.c |  2 +-
 net/rds/rdma.c |  3 ++-
 35 files changed, 79 insertions(+), 63 deletions(-)

diff --git a/arch/mips/mm/gup.c b/arch/mips/mm/gup.c
index 0d14e0d8eacf..4c2b4483683c 100644
--- a/arch/mips/mm/gup.c
+++ b/arch/mips/mm/gup.c
@@ -235,7 +235,7 @@ int __get_user_pages_fast(unsigned long start, int 
nr_pages, int write,
  * get_user_pages_fast() - pin user pages in memory
  * @start: starting user address
  * @nr_pages:  number of pages from start to pin
- * @write: whether pages will be written to
+ * @gup_flags: flags modifying pin behaviour
  * @pages: array that receives pointers to the pages pinned.
  * Should be at least nr_pages long.
  *
@@ -247,8 +247,8 @@ int __get_user_pages_fast(unsigned long start, int 
nr_pages, int write,
  * requested. If nr_pages is 0 or negative, returns 0. If no pages
  * were pinned, returns -errno.
  */
-int get_user_pages_fast(unsigned long start, int nr_pages, int write,
-   struct page **pages)
+int get_user_pages_fast(unsigned long start, int nr_pages,
+   unsigned int gup_flags, struct page **pages)
 {
struct mm_struct *mm = current->mm;
unsigned long addr, len, end;
@@ -273,7 +273,8 @@ int get_user_pages_fast(unsigned long start, int nr_pages, 
int write,
next = pgd_addr_end(addr, end);
if (pgd_none(pgd))
goto slow;
> -   if (!gup_pud_range(pgd, addr, next, write, pages, &nr))
+   if (!gup_pud_range(pgd, addr, next, gup_flags & FOLL_WRITE,
> +  pages, &nr))
goto slow;
} while (pgdp++, addr = next, addr != end);
local_irq_enable();
@@ -289,7 +290,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages, 
int write,
pages += nr;
 
ret = get_user_pages_unlocked(start, (end - start) >> PAGE_SHIFT,
- pages, write ? FOLL_WRITE : 0);
+ pages, gup_flags);
 
/* Have to be a bit careful with return values */
if (nr > 0) {
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index be7bc070eae5..ab3d484c5e2e 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -600,7 +600,7 @@ int kvmppc_book3s_hv_page_fault(struct kvm_run *run, struct 
kvm_vcpu *vcpu,
/* If writing != 0, then the HPTE must allow writing, if we get here */
write_ok = writing;
hva = 

[RESEND 2/7] mm/gup: Change write parameter to flags in fast walk

2019-03-17 Thread ira . weiny
From: Ira Weiny 

In order to support more options in the GUP fast walk, change
the write parameter to flags throughout the call stack.

This patch does not change functionality and passes FOLL_WRITE
where write was previously used.

Signed-off-by: Ira Weiny 
---
 mm/gup.c | 52 ++--
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 8cb4cff067bc..2b21eeaf8cc8 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1578,7 +1578,7 @@ static void undo_dev_pagemap(int *nr, int nr_start, 
struct page **pages)
 
 #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
 static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-int write, struct page **pages, int *nr)
+unsigned int flags, struct page **pages, int *nr)
 {
struct dev_pagemap *pgmap = NULL;
int nr_start = *nr, ret = 0;
@@ -1596,7 +1596,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
if (pte_protnone(pte))
goto pte_unmap;
 
-   if (!pte_access_permitted(pte, write))
+   if (!pte_access_permitted(pte, flags & FOLL_WRITE))
goto pte_unmap;
 
if (pte_devmap(pte)) {
@@ -1648,7 +1648,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, 
unsigned long end,
  * useful to have gup_huge_pmd even if we can't operate on ptes.
  */
 static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-int write, struct page **pages, int *nr)
+unsigned int flags, struct page **pages, int *nr)
 {
return 0;
 }
@@ -1731,12 +1731,12 @@ static int __gup_device_huge_pud(pud_t pud, pud_t 
*pudp, unsigned long addr,
 #endif
 
 static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
-   unsigned long end, int write, struct page **pages, int *nr)
+   unsigned long end, unsigned int flags, struct page **pages, int 
*nr)
 {
struct page *head, *page;
int refs;
 
-   if (!pmd_access_permitted(orig, write))
+   if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
return 0;
 
if (pmd_devmap(orig))
@@ -1769,12 +1769,12 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, 
unsigned long addr,
 }
 
 static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
-   unsigned long end, int write, struct page **pages, int *nr)
+   unsigned long end, unsigned int flags, struct page **pages, int 
*nr)
 {
struct page *head, *page;
int refs;
 
-   if (!pud_access_permitted(orig, write))
+   if (!pud_access_permitted(orig, flags & FOLL_WRITE))
return 0;
 
if (pud_devmap(orig))
@@ -1807,13 +1807,13 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, 
unsigned long addr,
 }
 
 static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
-   unsigned long end, int write,
+   unsigned long end, unsigned int flags,
struct page **pages, int *nr)
 {
int refs;
struct page *head, *page;
 
-   if (!pgd_access_permitted(orig, write))
+   if (!pgd_access_permitted(orig, flags & FOLL_WRITE))
return 0;
 
BUILD_BUG_ON(pgd_devmap(orig));
@@ -1844,7 +1844,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned 
long addr,
 }
 
 static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
-   int write, struct page **pages, int *nr)
+   unsigned int flags, struct page **pages, int *nr)
 {
unsigned long next;
pmd_t *pmdp;
@@ -1867,7 +1867,7 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
if (pmd_protnone(pmd))
return 0;
 
-   if (!gup_huge_pmd(pmd, pmdp, addr, next, write,
+   if (!gup_huge_pmd(pmd, pmdp, addr, next, flags,
pages, nr))
return 0;
 
@@ -1877,9 +1877,9 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 * pmd format and THP pmd format
 */
if (!gup_huge_pd(__hugepd(pmd_val(pmd)), addr,
-PMD_SHIFT, next, write, pages, nr))
+PMD_SHIFT, next, flags, pages, nr))
return 0;
-   } else if (!gup_pte_range(pmd, addr, next, write, pages, nr))
+   } else if (!gup_pte_range(pmd, addr, next, flags, pages, nr))
return 0;
} while (pmdp++, addr = next, addr != end);
 
@@ -1887,7 +1887,7 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 }
 
 static int 

[RESEND 1/7] mm/gup: Replace get_user_pages_longterm() with FOLL_LONGTERM

2019-03-17 Thread ira . weiny
From: Ira Weiny 

Rather than have a separate get_user_pages_longterm() call,
introduce FOLL_LONGTERM and change the longterm callers to use
it.

This patch does not change any functionality.

FOLL_LONGTERM can only be supported with get_user_pages() as it
requires vmas to determine if DAX is in use.

CC: Aneesh Kumar K.V 
CC: Andrew Morton 
CC: Michal Hocko 
Signed-off-by: Ira Weiny 

---
Changes from V1:
Rebased on 5.1 merge
Adjusted for changes introduced by CONFIG_CMA
Convert new users of GUP longterm
io_uring.c
xdp_umem.c

 arch/powerpc/mm/mmu_context_iommu.c|   3 +-
 drivers/infiniband/core/umem.c |   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |   8 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   9 +-
 drivers/media/v4l2-core/videobuf-dma-sg.c  |   6 +-
 drivers/vfio/vfio_iommu_type1.c|   3 +-
 fs/io_uring.c  |   5 +-
 include/linux/mm.h |  14 +-
 mm/gup.c   | 171 -
 mm/gup_benchmark.c |   5 +-
 net/xdp/xdp_umem.c |   4 +-
 11 files changed, 129 insertions(+), 104 deletions(-)

diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index e7a9c4f6bfca..2bd48998765e 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -148,7 +148,8 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
}
 
down_read(>mmap_sem);
-   ret = get_user_pages_longterm(ua, entries, FOLL_WRITE, mem->hpages, NULL);
+   ret = get_user_pages(ua, entries, FOLL_WRITE | FOLL_LONGTERM,
+mem->hpages, NULL);
up_read(>mmap_sem);
if (ret != entries) {
/* free the reference taken */
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index fe5551562dbc..31191f098e73 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -189,10 +189,11 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
 
while (npages) {
down_read(>mmap_sem);
-   ret = get_user_pages_longterm(cur_base,
+   ret = get_user_pages(cur_base,
 min_t(unsigned long, npages,
   PAGE_SIZE / sizeof (struct page *)),
-gup_flags, page_list, vma_list);
+gup_flags | FOLL_LONGTERM,
+page_list, vma_list);
if (ret < 0) {
up_read(>mmap_sem);
goto umem_release;
diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
index 123ca8f64f75..f712fb7fa82f 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -114,10 +114,10 @@ int qib_get_user_pages(unsigned long start_page, size_t num_pages,
 
down_read(>mm->mmap_sem);
for (got = 0; got < num_pages; got += ret) {
-   ret = get_user_pages_longterm(start_page + got * PAGE_SIZE,
- num_pages - got,
- FOLL_WRITE | FOLL_FORCE,
- p + got, NULL);
+   ret = get_user_pages(start_page + got * PAGE_SIZE,
+num_pages - got,
+FOLL_LONGTERM | FOLL_WRITE | FOLL_FORCE,
+p + got, NULL);
if (ret < 0) {
up_read(>mm->mmap_sem);
goto bail_release;
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 06862a6af185..1d9a182ac163 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -143,10 +143,11 @@ static int usnic_uiom_get_pages(unsigned long addr, size_t size, int writable,
ret = 0;
 
while (npages) {
-   ret = get_user_pages_longterm(cur_base,
-   min_t(unsigned long, npages,
-   PAGE_SIZE / sizeof(struct page *)),
-   gup_flags, page_list, NULL);
+   ret = get_user_pages(cur_base,
+min_t(unsigned long, npages,
+PAGE_SIZE / sizeof(struct page *)),
+gup_flags | FOLL_LONGTERM,
+page_list, NULL);
 
if (ret < 0)
goto out;
diff --git a/drivers/media/v4l2-core/videobuf-dma-sg.c b/drivers/media/v4l2-core/videobuf-dma-sg.c

[RESEND PATCH 0/7] Add FOLL_LONGTERM to GUP fast and use it

2019-03-17 Thread ira . weiny
From: Ira Weiny 

Resending after rebasing to the latest mm tree.

HFI1, qib, and mthca use get_user_pages_fast() due to its performance
advantages.  These pages can be held for a significant time.  But
get_user_pages_fast() does not protect against mapping FS DAX pages.

Introduce FOLL_LONGTERM and use this flag in get_user_pages_fast(), which
retains the performance while also adding the FS DAX checks.  XDP has also
shown interest in using this functionality.[1]

In addition, we change get_user_pages() to use the new FOLL_LONGTERM flag and
remove the specialized get_user_pages_longterm() call.

[1] https://lkml.org/lkml/2019/2/11/1789

Ira Weiny (7):
  mm/gup: Replace get_user_pages_longterm() with FOLL_LONGTERM
  mm/gup: Change write parameter to flags in fast walk
  mm/gup: Change GUP fast to use flags rather than a write 'bool'
  mm/gup: Add FOLL_LONGTERM capability to GUP fast
  IB/hfi1: Use the new FOLL_LONGTERM flag to get_user_pages_fast()
  IB/qib: Use the new FOLL_LONGTERM flag to get_user_pages_fast()
  IB/mthca: Use the new FOLL_LONGTERM flag to get_user_pages_fast()

 arch/mips/mm/gup.c  |  11 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c |   4 +-
 arch/powerpc/kvm/e500_mmu.c |   2 +-
 arch/powerpc/mm/mmu_context_iommu.c |   3 +-
 arch/s390/kvm/interrupt.c   |   2 +-
 arch/s390/mm/gup.c  |  12 +-
 arch/sh/mm/gup.c|  11 +-
 arch/sparc/mm/gup.c |   9 +-
 arch/x86/kvm/paging_tmpl.h  |   2 +-
 arch/x86/kvm/svm.c  |   2 +-
 drivers/fpga/dfl-afu-dma-region.c   |   2 +-
 drivers/gpu/drm/via/via_dmablit.c   |   3 +-
 drivers/infiniband/core/umem.c  |   5 +-
 drivers/infiniband/hw/hfi1/user_pages.c |   5 +-
 drivers/infiniband/hw/mthca/mthca_memfree.c |   3 +-
 drivers/infiniband/hw/qib/qib_user_pages.c  |   8 +-
 drivers/infiniband/hw/qib/qib_user_sdma.c   |   2 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c|   9 +-
 drivers/media/v4l2-core/videobuf-dma-sg.c   |   6 +-
 drivers/misc/genwqe/card_utils.c|   2 +-
 drivers/misc/vmw_vmci/vmci_host.c   |   2 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c |   6 +-
 drivers/platform/goldfish/goldfish_pipe.c   |   3 +-
 drivers/rapidio/devices/rio_mport_cdev.c|   4 +-
 drivers/sbus/char/oradax.c  |   2 +-
 drivers/scsi/st.c   |   3 +-
 drivers/staging/gasket/gasket_page_table.c  |   4 +-
 drivers/tee/tee_shm.c   |   2 +-
 drivers/vfio/vfio_iommu_spapr_tce.c |   3 +-
 drivers/vfio/vfio_iommu_type1.c |   3 +-
 drivers/vhost/vhost.c   |   2 +-
 drivers/video/fbdev/pvr2fb.c|   2 +-
 drivers/virt/fsl_hypervisor.c   |   2 +-
 drivers/xen/gntdev.c|   2 +-
 fs/io_uring.c   |   5 +-
 fs/orangefs/orangefs-bufmap.c   |   2 +-
 include/linux/mm.h  |  18 +-
 kernel/futex.c  |   2 +-
 lib/iov_iter.c  |   7 +-
 mm/gup.c| 258 
 mm/gup_benchmark.c  |   5 +-
 mm/util.c   |   8 +-
 net/ceph/pagevec.c  |   2 +-
 net/rds/info.c  |   2 +-
 net/rds/rdma.c  |   3 +-
 net/xdp/xdp_umem.c  |   4 +-
 46 files changed, 262 insertions(+), 197 deletions(-)

-- 
2.20.1



Re: [PATCH v3 04/17] KVM: PPC: Book3S HV: XIVE: add a control to initialize a source

2019-03-17 Thread David Gibson
On Fri, Mar 15, 2019 at 01:05:56PM +0100, Cédric Le Goater wrote:
> The XIVE KVM device maintains a list of interrupt sources for the VM
> which are allocated in the pool of generic interrupts (IPIs) of the
> main XIVE IC controller. These are used for the CPU IPIs as well as
> for virtual device interrupts. The IRQ number space is defined by
> QEMU.
> 
> The XIVE device reuses the source structures of the XICS-on-XIVE
> device for the source blocks (2-level tree) and for the source
> interrupts. Under XIVE native, the source interrupt caches mostly
> configuration information and is less used than under the XICS-on-XIVE
> device in which hcalls are still necessary at run-time.
> 
> When a source is initialized in KVM, an IPI interrupt source is simply
> allocated at the OPAL level and then MASKED. KVM only needs to know
> about its type: LSI or MSI.
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 

> ---
> 
>  Changes since v2:
> 
>  - extra documentation in commit log
>  - fixed comments on XIVE IRQ number space
>  - removed usage of the __x_* macros
>  - fixed locking on source block
> 
>  arch/powerpc/include/uapi/asm/kvm.h|   5 +
>  arch/powerpc/kvm/book3s_xive.h |  10 ++
>  arch/powerpc/kvm/book3s_xive.c |   8 +-
>  arch/powerpc/kvm/book3s_xive_native.c  | 106 +
>  Documentation/virtual/kvm/devices/xive.txt |  15 +++
>  5 files changed, 140 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
> b/arch/powerpc/include/uapi/asm/kvm.h
> index b002c0c67787..11985148073f 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -677,5 +677,10 @@ struct kvm_ppc_cpu_char {
>  
>  /* POWER9 XIVE Native Interrupt Controller */
>  #define KVM_DEV_XIVE_GRP_CTRL1
> +#define KVM_DEV_XIVE_GRP_SOURCE  2   /* 64-bit source 
> identifier */
> +
> +/* Layout of 64-bit XIVE source attribute values */
> +#define KVM_XIVE_LEVEL_SENSITIVE (1ULL << 0)
> +#define KVM_XIVE_LEVEL_ASSERTED  (1ULL << 1)
>  
>  #endif /* __LINUX_KVM_POWERPC_H */
> diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
> index d366df69b9cb..1be921cb5dcb 100644
> --- a/arch/powerpc/kvm/book3s_xive.h
> +++ b/arch/powerpc/kvm/book3s_xive.h
> @@ -12,6 +12,13 @@
>  #ifdef CONFIG_KVM_XICS
>  #include "book3s_xics.h"
>  
> +/*
> + * The XIVE Interrupt source numbers are within the range 0 to
> + * KVMPPC_XICS_NR_IRQS.
> + */
> +#define KVMPPC_XIVE_FIRST_IRQ0
> +#define KVMPPC_XIVE_NR_IRQS  KVMPPC_XICS_NR_IRQS
> +
>  /*
>   * State for one guest irq source.
>   *
> @@ -258,6 +265,9 @@ extern int (*__xive_vm_h_eoi)(struct kvm_vcpu *vcpu, 
> unsigned long xirr);
>   */
>  void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu);
>  int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu);
> +struct kvmppc_xive_src_block *kvmppc_xive_create_src_block(
> + struct kvmppc_xive *xive, int irq);
> +void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb);
>  
>  #endif /* CONFIG_KVM_XICS */
>  #endif /* _KVM_PPC_BOOK3S_XICS_H */
> diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
> index e7f1ada1c3de..6c9f9fd0855f 100644
> --- a/arch/powerpc/kvm/book3s_xive.c
> +++ b/arch/powerpc/kvm/book3s_xive.c
> @@ -1480,8 +1480,8 @@ static int xive_get_source(struct kvmppc_xive *xive, 
> long irq, u64 addr)
>   return 0;
>  }
>  
> -static struct kvmppc_xive_src_block *xive_create_src_block(struct 
> kvmppc_xive *xive,
> -int irq)
> +struct kvmppc_xive_src_block *kvmppc_xive_create_src_block(
> + struct kvmppc_xive *xive, int irq)
>  {
>   struct kvm *kvm = xive->kvm;
>   struct kvmppc_xive_src_block *sb;
> @@ -1560,7 +1560,7 @@ static int xive_set_source(struct kvmppc_xive *xive, 
> long irq, u64 addr)
>   sb = kvmppc_xive_find_source(xive, irq, );
>   if (!sb) {
>   pr_devel("No source, creating source block...\n");
> - sb = xive_create_src_block(xive, irq);
> + sb = kvmppc_xive_create_src_block(xive, irq);
>   if (!sb) {
>   pr_devel("Failed to create block...\n");
>   return -ENOMEM;
> @@ -1784,7 +1784,7 @@ static void kvmppc_xive_cleanup_irq(u32 hw_num, struct 
> xive_irq_data *xd)
>   xive_cleanup_irq_data(xd);
>  }
>  
> -static void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb)
> +void kvmppc_xive_free_sources(struct kvmppc_xive_src_block *sb)
>  {
>   int i;
>  
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c 
> b/arch/powerpc/kvm/book3s_xive_native.c
> index a078f99bc156..99c04d5c5566 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -31,6 +31,17 @@
>  
>  #include "book3s_xive.h"
>  
> +static u8 xive_vm_esb_load(struct xive_irq_data 

Re: [PATCH v3 03/17] KVM: PPC: Book3S HV: XIVE: introduce a new capability KVM_CAP_PPC_IRQ_XIVE

2019-03-17 Thread David Gibson
On Fri, Mar 15, 2019 at 01:05:55PM +0100, Cédric Le Goater wrote:
> The user interface exposes a new capability KVM_CAP_PPC_IRQ_XIVE to
> let QEMU connect the vCPU presenters to the XIVE KVM device if
> required. The capability is not advertised for now as the full support
> for the XIVE native exploitation mode is not yet available. When this
> is case, the capability will be advertised on PowerNV Hypervisors
> only. Nested guests (pseries KVM Hypervisor) are not supported.
> 
> Internally, the interface to the new KVM device is protected with a
> new interrupt mode: KVMPPC_IRQ_XIVE.
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 

Though a couple of minor nits are noted below.

> ---
> 
>  Changes since v2:
> 
>  - made use of the xive_vp() macro to compute VP identifiers
>  - reworked locking in kvmppc_xive_native_connect_vcpu() to fix races 
>  - stop advertising KVM_CAP_PPC_IRQ_XIVE as support is not fully
>available yet 
>  
>  arch/powerpc/include/asm/kvm_host.h   |   1 +
>  arch/powerpc/include/asm/kvm_ppc.h|  13 +++
>  arch/powerpc/kvm/book3s_xive.h|  11 ++
>  include/uapi/linux/kvm.h  |   1 +
>  arch/powerpc/kvm/book3s_xive.c|  88 ---
>  arch/powerpc/kvm/book3s_xive_native.c | 150 ++
>  arch/powerpc/kvm/powerpc.c|  36 +++
>  Documentation/virtual/kvm/api.txt |   9 ++
>  8 files changed, 268 insertions(+), 41 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index 9f75a75a07f2..eb8581be0ee8 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -448,6 +448,7 @@ struct kvmppc_passthru_irqmap {
>  #define KVMPPC_IRQ_DEFAULT   0
>  #define KVMPPC_IRQ_MPIC  1
>  #define KVMPPC_IRQ_XICS  2 /* Includes a XIVE option */
> +#define KVMPPC_IRQ_XIVE  3 /* XIVE native exploitation mode */
>  
>  #define MMIO_HPTE_CACHE_SIZE 4
>  
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> b/arch/powerpc/include/asm/kvm_ppc.h
> index 4b72ddde7dc1..1e61877fe147 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -594,6 +594,14 @@ extern int kvmppc_xive_set_irq(struct kvm *kvm, int 
> irq_source_id, u32 irq,
>  int level, bool line_status);
>  extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
>  
> +static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
> +{
> + return vcpu->arch.irq_type == KVMPPC_IRQ_XIVE;
> +}
> +
> +extern int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
> +struct kvm_vcpu *vcpu, u32 cpu);
> +extern void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu);
>  extern void kvmppc_xive_native_init_module(void);
>  extern void kvmppc_xive_native_exit_module(void);
>  
> @@ -621,6 +629,11 @@ static inline int kvmppc_xive_set_irq(struct kvm *kvm, 
> int irq_source_id, u32 ir
> int level, bool line_status) { return 
> -ENODEV; }
>  static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
>  
> +static inline int kvmppc_xive_enabled(struct kvm_vcpu *vcpu)
> + { return 0; }
> +static inline int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
> +   struct kvm_vcpu *vcpu, u32 cpu) { return -EBUSY; }
> +static inline void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) { }
>  static inline void kvmppc_xive_native_init_module(void) { }
>  static inline void kvmppc_xive_native_exit_module(void) { }
>  
> diff --git a/arch/powerpc/kvm/book3s_xive.h b/arch/powerpc/kvm/book3s_xive.h
> index a08ae6fd4c51..d366df69b9cb 100644
> --- a/arch/powerpc/kvm/book3s_xive.h
> +++ b/arch/powerpc/kvm/book3s_xive.h
> @@ -198,6 +198,11 @@ static inline struct kvmppc_xive_src_block 
> *kvmppc_xive_find_source(struct kvmpp
>   return xive->src_blocks[bid];
>  }
>  
> +static inline u32 kvmppc_xive_vp(struct kvmppc_xive *xive, u32 server)
> +{
> + return xive->vp_base + kvmppc_pack_vcpu_id(xive->kvm, server);
> +}
> +
>  /*
>   * Mapping between guest priorities and host priorities
>   * is as follow.
> @@ -248,5 +253,11 @@ extern int (*__xive_vm_h_ipi)(struct kvm_vcpu *vcpu, 
> unsigned long server,
>  extern int (*__xive_vm_h_cppr)(struct kvm_vcpu *vcpu, unsigned long cppr);
>  extern int (*__xive_vm_h_eoi)(struct kvm_vcpu *vcpu, unsigned long xirr);
>  
> +/*
> + * Common Xive routines for XICS-over-XIVE and XIVE native
> + */
> +void kvmppc_xive_disable_vcpu_interrupts(struct kvm_vcpu *vcpu);
> +int kvmppc_xive_debug_show_queues(struct seq_file *m, struct kvm_vcpu *vcpu);
> +
>  #endif /* CONFIG_KVM_XICS */
>  #endif /* _KVM_PPC_BOOK3S_XICS_H */
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index e6368163d3a0..52bf74a1616e 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -988,6 +988,7 @@ struct 

Re: [PATCH 1/5] ocxl: Rename struct link to ocxl_link

2019-03-17 Thread Andrew Donnellan

On 13/3/19 3:06 pm, Alastair D'Silva wrote:

From: Alastair D'Silva 

The term 'link' is ambiguous (especially when the struct is used for a
list), so rename it for clarity.

Signed-off-by: Alastair D'Silva 
Reviewed-by: Greg Kurz 


Acked-by: Andrew Donnellan 

(In future please include a changelog to explain what's changed from v1->v2)


---
  drivers/misc/ocxl/file.c |  5 ++---
  drivers/misc/ocxl/link.c | 36 ++--
  2 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/drivers/misc/ocxl/file.c b/drivers/misc/ocxl/file.c
index e6a607488f8a..009e09b7ded5 100644
--- a/drivers/misc/ocxl/file.c
+++ b/drivers/misc/ocxl/file.c
@@ -151,10 +151,9 @@ static long afu_ioctl_enable_p9_wait(struct ocxl_context 
*ctx,
mutex_unlock(>status_mutex);
  
 		if (status == ATTACHED) {
-   int rc;
-   struct link *link = ctx->afu->fn->link;
+   int rc = ocxl_link_update_pe(ctx->afu->fn->link,
+   ctx->pasid, ctx->tidr);
 
-			rc = ocxl_link_update_pe(link, ctx->pasid, ctx->tidr);
if (rc)
return rc;
}
diff --git a/drivers/misc/ocxl/link.c b/drivers/misc/ocxl/link.c
index d50b861d7e57..8d2690a1a9de 100644
--- a/drivers/misc/ocxl/link.c
+++ b/drivers/misc/ocxl/link.c
@@ -76,7 +76,7 @@ struct spa {
   * limited number of opencapi slots on a system and lookup is only
   * done when the device is probed
   */
-struct link {
+struct ocxl_link {
struct list_head list;
struct kref ref;
int domain;
@@ -179,7 +179,7 @@ static void xsl_fault_handler_bh(struct work_struct 
*fault_work)
  
  static irqreturn_t xsl_fault_handler(int irq, void *data)

  {
-   struct link *link = (struct link *) data;
+   struct ocxl_link *link = (struct ocxl_link *) data;
struct spa *spa = link->spa;
u64 dsisr, dar, pe_handle;
struct pe_data *pe_data;
@@ -256,7 +256,7 @@ static int map_irq_registers(struct pci_dev *dev, struct 
spa *spa)
>reg_tfc, >reg_pe_handle);
  }
  
-static int setup_xsl_irq(struct pci_dev *dev, struct link *link)
+static int setup_xsl_irq(struct pci_dev *dev, struct ocxl_link *link)
  {
struct spa *spa = link->spa;
int rc;
@@ -311,7 +311,7 @@ static int setup_xsl_irq(struct pci_dev *dev, struct link 
*link)
return rc;
  }
  
-static void release_xsl_irq(struct link *link)
+static void release_xsl_irq(struct ocxl_link *link)
  {
struct spa *spa = link->spa;
  
@@ -323,7 +323,7 @@ static void release_xsl_irq(struct link *link)

unmap_irq_registers(spa);
  }
  
-static int alloc_spa(struct pci_dev *dev, struct link *link)
+static int alloc_spa(struct pci_dev *dev, struct ocxl_link *link)
  {
struct spa *spa;
  
@@ -350,7 +350,7 @@ static int alloc_spa(struct pci_dev *dev, struct link *link)

return 0;
  }
  
-static void free_spa(struct link *link)
+static void free_spa(struct ocxl_link *link)
  {
struct spa *spa = link->spa;
  
@@ -364,12 +364,12 @@ static void free_spa(struct link *link)

}
  }
  
-static int alloc_link(struct pci_dev *dev, int PE_mask, struct link **out_link)
+static int alloc_link(struct pci_dev *dev, int PE_mask, struct ocxl_link **out_link)
  {
-   struct link *link;
+   struct ocxl_link *link;
int rc;
  
-	link = kzalloc(sizeof(struct link), GFP_KERNEL);
+   link = kzalloc(sizeof(struct ocxl_link), GFP_KERNEL);
if (!link)
return -ENOMEM;
  
@@ -405,7 +405,7 @@ static int alloc_link(struct pci_dev *dev, int PE_mask, struct link **out_link)

return rc;
  }
  
-static void free_link(struct link *link)
+static void free_link(struct ocxl_link *link)
  {
release_xsl_irq(link);
free_spa(link);
@@ -415,7 +415,7 @@ static void free_link(struct link *link)
  int ocxl_link_setup(struct pci_dev *dev, int PE_mask, void **link_handle)
  {
int rc = 0;
-   struct link *link;
+   struct ocxl_link *link;
  
  	mutex_lock(_list_lock);

list_for_each_entry(link, _list, list) {
@@ -442,7 +442,7 @@ EXPORT_SYMBOL_GPL(ocxl_link_setup);
  
  static void release_xsl(struct kref *ref)

  {
-   struct link *link = container_of(ref, struct link, ref);
+   struct ocxl_link *link = container_of(ref, struct ocxl_link, ref);
  
  	list_del(>list);

/* call platform code before releasing data */
@@ -452,7 +452,7 @@ static void release_xsl(struct kref *ref)
  
  void ocxl_link_release(struct pci_dev *dev, void *link_handle)

  {
-   struct link *link = (struct link *) link_handle;
+   struct ocxl_link *link = (struct ocxl_link *) link_handle;
  
  	mutex_lock(_list_lock);

kref_put(>ref, release_xsl);
@@ -488,7 +488,7 @@ int ocxl_link_add_pe(void *link_handle, int pasid, u32 
pidr, u32 tidr,
void (*xsl_err_cb)(void *data, u64 

Re: [PATCH v3 02/17] KVM: PPC: Book3S HV: add a new KVM device for the XIVE native exploitation mode

2019-03-17 Thread David Gibson
On Fri, Mar 15, 2019 at 01:05:54PM +0100, Cédric Le Goater wrote:
> This is the basic framework for the new KVM device supporting the XIVE
> native exploitation mode. The user interface exposes a new KVM device
> to be created by QEMU, only available when running on a L0 hypervisor
> only. Support for nested guests is not available yet.
> 
> The XIVE device reuses the device structure of the XICS-on-XIVE device
> as they have a lot in common. That could possibly change in the future
> if the need arises.
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: David Gibson 

> ---
> 
>  Changes since v2:
> 
>  - removed ->q_order setting. Only useful in the XICS-on-XIVE KVM
>device which allocates the EQs on behalf of the guest.
>  - returned -ENXIO when VP base is invalid
> 
>  arch/powerpc/include/asm/kvm_host.h|   1 +
>  arch/powerpc/include/asm/kvm_ppc.h |   8 +
>  arch/powerpc/include/uapi/asm/kvm.h|   3 +
>  include/uapi/linux/kvm.h   |   2 +
>  arch/powerpc/kvm/book3s.c  |   7 +-
>  arch/powerpc/kvm/book3s_xive_native.c  | 184 +
>  Documentation/virtual/kvm/devices/xive.txt |  19 +++
>  arch/powerpc/kvm/Makefile  |   2 +-
>  8 files changed, 224 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/kvm/book3s_xive_native.c
>  create mode 100644 Documentation/virtual/kvm/devices/xive.txt
> 
> diff --git a/arch/powerpc/include/asm/kvm_host.h 
> b/arch/powerpc/include/asm/kvm_host.h
> index 091430339db1..9f75a75a07f2 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -220,6 +220,7 @@ extern struct kvm_device_ops kvm_xics_ops;
>  struct kvmppc_xive;
>  struct kvmppc_xive_vcpu;
>  extern struct kvm_device_ops kvm_xive_ops;
> +extern struct kvm_device_ops kvm_xive_native_ops;
>  
>  struct kvmppc_passthru_irqmap;
>  
> diff --git a/arch/powerpc/include/asm/kvm_ppc.h 
> b/arch/powerpc/include/asm/kvm_ppc.h
> index b3bf4f61b30c..4b72ddde7dc1 100644
> --- a/arch/powerpc/include/asm/kvm_ppc.h
> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> @@ -593,6 +593,10 @@ extern int kvmppc_xive_set_icp(struct kvm_vcpu *vcpu, 
> u64 icpval);
>  extern int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, u32 irq,
>  int level, bool line_status);
>  extern void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu);
> +
> +extern void kvmppc_xive_native_init_module(void);
> +extern void kvmppc_xive_native_exit_module(void);
> +
>  #else
>  static inline int kvmppc_xive_set_xive(struct kvm *kvm, u32 irq, u32 server,
>  u32 priority) { return -1; }
> @@ -616,6 +620,10 @@ static inline int kvmppc_xive_set_icp(struct kvm_vcpu 
> *vcpu, u64 icpval) { retur
>  static inline int kvmppc_xive_set_irq(struct kvm *kvm, int irq_source_id, 
> u32 irq,
> int level, bool line_status) { return 
> -ENODEV; }
>  static inline void kvmppc_xive_push_vcpu(struct kvm_vcpu *vcpu) { }
> +
> +static inline void kvmppc_xive_native_init_module(void) { }
> +static inline void kvmppc_xive_native_exit_module(void) { }
> +
>  #endif /* CONFIG_KVM_XIVE */
>  
>  #ifdef CONFIG_PPC_POWERNV
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h 
> b/arch/powerpc/include/uapi/asm/kvm.h
> index 8c876c166ef2..b002c0c67787 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -675,4 +675,7 @@ struct kvm_ppc_cpu_char {
>  #define  KVM_XICS_PRESENTED  (1ULL << 43)
>  #define  KVM_XICS_QUEUED (1ULL << 44)
>  
> +/* POWER9 XIVE Native Interrupt Controller */
> +#define KVM_DEV_XIVE_GRP_CTRL1
> +
>  #endif /* __LINUX_KVM_POWERPC_H */
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 6d4ea4b6c922..e6368163d3a0 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1211,6 +1211,8 @@ enum kvm_device_type {
>  #define KVM_DEV_TYPE_ARM_VGIC_V3 KVM_DEV_TYPE_ARM_VGIC_V3
>   KVM_DEV_TYPE_ARM_VGIC_ITS,
>  #define KVM_DEV_TYPE_ARM_VGIC_ITSKVM_DEV_TYPE_ARM_VGIC_ITS
> + KVM_DEV_TYPE_XIVE,
> +#define KVM_DEV_TYPE_XIVEKVM_DEV_TYPE_XIVE
>   KVM_DEV_TYPE_MAX,
>  };
>  
> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> index 601c094f15ab..96d43f091255 100644
> --- a/arch/powerpc/kvm/book3s.c
> +++ b/arch/powerpc/kvm/book3s.c
> @@ -1040,6 +1040,9 @@ static int kvmppc_book3s_init(void)
>   if (xics_on_xive()) {
>   kvmppc_xive_init_module();
>   kvm_register_device_ops(_xive_ops, KVM_DEV_TYPE_XICS);
> + kvmppc_xive_native_init_module();
> + kvm_register_device_ops(_xive_native_ops,
> + KVM_DEV_TYPE_XIVE);
>   } else
>  #endif
>   kvm_register_device_ops(_xics_ops, KVM_DEV_TYPE_XICS);
> @@ -1050,8 +1053,10 @@ static int kvmppc_book3s_init(void)
>  static void 

[PATCH v2 13/13] syscall_get_arch: add "struct task_struct *" argument

2019-03-17 Thread Dmitry V. Levin
This argument is required to extend the generic ptrace API with
PTRACE_GET_SYSCALL_INFO request: syscall_get_arch() is going
to be called from ptrace_request() along with syscall_get_nr(),
syscall_get_arguments(), syscall_get_error(), and
syscall_get_return_value() functions with a tracee as their argument.

The primary intent is that the triple (audit_arch, syscall_nr, arg1..arg6)
should describe what system call is being called and what its arguments
are.

Reverts: 5e937a9ae913 ("syscall_get_arch: remove useless function arguments")
Reverts: 1002d94d3076 ("syscall.h: fix doc text for syscall_get_arch()")
Reviewed-by: Andy Lutomirski  # for x86
Reviewed-by: Palmer Dabbelt 
Acked-by: Paul Moore 
Acked-by: Paul Burton  # MIPS parts
Acked-by: Michael Ellerman  (powerpc)
Acked-by: Kees Cook  # seccomp parts
Acked-by: Mark Salter  # for the c6x bit
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Oleg Nesterov 
Cc: x...@kernel.org
Cc: linux-al...@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-hexa...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: nios2-...@lists.rocketboards.org
Cc: openr...@lists.librecores.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux-a...@vger.kernel.org
Cc: linux-au...@redhat.com
Signed-off-by: Dmitry V. Levin 
---

Notes:
v2: unchanged

 arch/alpha/include/asm/syscall.h  |  2 +-
 arch/arc/include/asm/syscall.h|  2 +-
 arch/arm/include/asm/syscall.h|  2 +-
 arch/arm64/include/asm/syscall.h  |  4 ++--
 arch/c6x/include/asm/syscall.h|  2 +-
 arch/csky/include/asm/syscall.h   |  2 +-
 arch/h8300/include/asm/syscall.h  |  2 +-
 arch/hexagon/include/asm/syscall.h|  2 +-
 arch/ia64/include/asm/syscall.h   |  2 +-
 arch/m68k/include/asm/syscall.h   |  2 +-
 arch/microblaze/include/asm/syscall.h |  2 +-
 arch/mips/include/asm/syscall.h   |  6 +++---
 arch/mips/kernel/ptrace.c |  2 +-
 arch/nds32/include/asm/syscall.h  |  2 +-
 arch/nios2/include/asm/syscall.h  |  2 +-
 arch/openrisc/include/asm/syscall.h   |  2 +-
 arch/parisc/include/asm/syscall.h |  4 ++--
 arch/powerpc/include/asm/syscall.h| 10 --
 arch/riscv/include/asm/syscall.h  |  2 +-
 arch/s390/include/asm/syscall.h   |  4 ++--
 arch/sh/include/asm/syscall_32.h  |  2 +-
 arch/sh/include/asm/syscall_64.h  |  2 +-
 arch/sparc/include/asm/syscall.h  |  5 +++--
 arch/unicore32/include/asm/syscall.h  |  2 +-
 arch/x86/include/asm/syscall.h|  8 +---
 arch/x86/um/asm/syscall.h |  2 +-
 arch/xtensa/include/asm/syscall.h |  2 +-
 include/asm-generic/syscall.h |  5 +++--
 kernel/auditsc.c  |  4 ++--
 kernel/seccomp.c  |  4 ++--
 30 files changed, 52 insertions(+), 42 deletions(-)

diff --git a/arch/alpha/include/asm/syscall.h b/arch/alpha/include/asm/syscall.h
index d73a6fcb519c..11c688c1d7ec 100644
--- a/arch/alpha/include/asm/syscall.h
+++ b/arch/alpha/include/asm/syscall.h
@@ -4,7 +4,7 @@
 
 #include 
 
-static inline int syscall_get_arch(void)
+static inline int syscall_get_arch(struct task_struct *task)
 {
return AUDIT_ARCH_ALPHA;
 }
diff --git a/arch/arc/include/asm/syscall.h b/arch/arc/include/asm/syscall.h
index c7fc4c0c3bcb..caf2697ef5b7 100644
--- a/arch/arc/include/asm/syscall.h
+++ b/arch/arc/include/asm/syscall.h
@@ -70,7 +70,7 @@ syscall_get_arguments(struct task_struct *task, struct pt_regs *regs,
 }
 
 static inline int
-syscall_get_arch(void)
+syscall_get_arch(struct task_struct *task)
 {
return IS_ENABLED(CONFIG_ISA_ARCOMPACT)
? (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)
diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h
index 06dea6bce293..3940ceac0bdc 100644
--- a/arch/arm/include/asm/syscall.h
+++ b/arch/arm/include/asm/syscall.h
@@ -104,7 +104,7 @@ static inline void syscall_set_arguments(struct task_struct *task,
memcpy(>ARM_r0 + i, args, n * sizeof(args[0]));
 }
 
-static inline int syscall_get_arch(void)
+static inline int syscall_get_arch(struct task_struct *task)
 {
/* ARM tasks don't change audit architectures on the fly. */
return AUDIT_ARCH_ARM;
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index ad8be16a39c9..1870df03f774 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -117,9 +117,9 @@ static inline void syscall_set_arguments(struct task_struct *task,
  * We don't care about endianness (__AUDIT_ARCH_LE bit) here because
  * AArch64 has the same system 

[RESEND PATCH v2] powerpc: mute unused-but-set-variable warnings

2019-03-17 Thread Qian Cai
pte_unmap() compiles away on some powerpc platforms, so silence the
warnings below by making it a static inline function.

mm/memory.c: In function 'copy_pte_range':
mm/memory.c:820:24: warning: variable 'orig_dst_pte' set but not used
[-Wunused-but-set-variable]
mm/memory.c:820:9: warning: variable 'orig_src_pte' set but not used
[-Wunused-but-set-variable]
mm/madvise.c: In function 'madvise_free_pte_range':
mm/madvise.c:318:9: warning: variable 'orig_pte' set but not used
[-Wunused-but-set-variable]
mm/swap_state.c: In function 'swap_ra_info':
mm/swap_state.c:634:15: warning: variable 'orig_pte' set but not used
[-Wunused-but-set-variable]

Suggested-by: Christophe Leroy 
Reviewed-by: Christophe Leroy 
Signed-off-by: Qian Cai 
---

v2: make it a static inline function.

 arch/powerpc/include/asm/book3s/64/pgtable.h | 3 ++-
 arch/powerpc/include/asm/nohash/64/pgtable.h | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 868fcaf56f6b..d798e33a0c86 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1006,7 +1006,8 @@ extern struct page *pgd_page(pgd_t pgd);
(((pte_t *) pmd_page_vaddr(*(dir))) + pte_index(addr))
 
 #define pte_offset_map(dir,addr)   pte_offset_kernel((dir), (addr))
-#define pte_unmap(pte) do { } while(0)
+
+static inline void pte_unmap(pte_t *pte) { }
 
 /* to find an entry in a kernel page-table-directory */
 /* This now only contains the vmalloc pages */
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h 
b/arch/powerpc/include/asm/nohash/64/pgtable.h
index e77ed9761632..0384a3302fb6 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -205,7 +205,8 @@ static inline void pgd_set(pgd_t *pgdp, unsigned long val)
   (((pte_t *) pmd_page_vaddr(*(dir))) + (((addr) >> PAGE_SHIFT) & 
(PTRS_PER_PTE - 1)))
 
 #define pte_offset_map(dir,addr)   pte_offset_kernel((dir), (addr))
-#define pte_unmap(pte) do { } while(0)
+
+static inline void pte_unmap(pte_t *pte) { }
 
 /* to find an entry in a kernel page-table-directory */
 /* This now only contains the vmalloc pages */
-- 
2.17.2 (Apple Git-113)



Re: [PATCH v7 4/4] hugetlb: allow to free gigantic pages regardless of the configuration

2019-03-17 Thread christophe leroy




On 17/03/2019 17:28, Alexandre Ghiti wrote:

On systems without CONTIG_ALLOC activated but that support gigantic pages,
boottime reserved gigantic pages cannot be freed at all. This patch
simply enables the possibility to hand those pages back to the memory
allocator.

Signed-off-by: Alexandre Ghiti 
Acked-by: David S. Miller  [sparc]
---
  arch/arm64/Kconfig   |  2 +-
  arch/arm64/include/asm/hugetlb.h |  4 --
  arch/powerpc/include/asm/book3s/64/hugetlb.h |  7 ---
  arch/powerpc/platforms/Kconfig.cputype   |  2 +-
  arch/s390/Kconfig|  2 +-
  arch/s390/include/asm/hugetlb.h  |  3 --
  arch/sh/Kconfig  |  2 +-
  arch/sparc/Kconfig   |  2 +-
  arch/x86/Kconfig |  2 +-
  arch/x86/include/asm/hugetlb.h   |  4 --
  include/asm-generic/hugetlb.h| 14 +
  include/linux/gfp.h  |  2 +-
  mm/hugetlb.c | 54 ++--
  mm/page_alloc.c  |  4 +-
  14 files changed, 61 insertions(+), 43 deletions(-)



[...]


diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 71d7b77eea50..aaf14974ee5f 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -126,4 +126,18 @@ static inline pte_t huge_ptep_get(pte_t *ptep)
  }
  #endif
  
+#ifndef __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED

+#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
+static inline bool gigantic_page_runtime_supported(void)
+{
+   return true;
+}
+#else
+static inline bool gigantic_page_runtime_supported(void)
+{
+   return false;
+}
+#endif /* CONFIG_ARCH_HAS_GIGANTIC_PAGE */


What about the following instead:

static inline bool gigantic_page_runtime_supported(void)
{
return IS_ENABLED(CONFIG_ARCH_HAS_GIGANTIC_PAGE);
}



+#endif /* __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED */
+
  #endif /* _ASM_GENERIC_HUGETLB_H */
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 1f1ad9aeebb9..58ea44bf75de 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -589,8 +589,8 @@ static inline bool pm_suspended_storage(void)
  /* The below functions must be run on a range from a single zone. */
  extern int alloc_contig_range(unsigned long start, unsigned long end,
  unsigned migratetype, gfp_t gfp_mask);
-extern void free_contig_range(unsigned long pfn, unsigned nr_pages);
  #endif
+extern void free_contig_range(unsigned long pfn, unsigned int nr_pages);


'extern' is unneeded and should be avoided (iaw checkpatch)

Christophe

  
  #ifdef CONFIG_CMA

  /* CMA stuff */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index afef61656c1e..4e55aa38704f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1058,6 +1058,7 @@ static void free_gigantic_page(struct page *page, 
unsigned int order)
free_contig_range(page_to_pfn(page), 1 << order);
  }
  
+#ifdef CONFIG_CONTIG_ALLOC

  static int __alloc_gigantic_page(unsigned long start_pfn,
unsigned long nr_pages, gfp_t gfp_mask)
  {
@@ -1142,11 +1143,20 @@ static struct page *alloc_gigantic_page(struct hstate 
*h, gfp_t gfp_mask,
  
  static void prep_new_huge_page(struct hstate *h, struct page *page, int nid);

  static void prep_compound_gigantic_page(struct page *page, unsigned int 
order);
+#else /* !CONFIG_CONTIG_ALLOC */
+static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
+   int nid, nodemask_t *nodemask)
+{
+   return NULL;
+}
+#endif /* CONFIG_CONTIG_ALLOC */
  
  #else /* !CONFIG_ARCH_HAS_GIGANTIC_PAGE */

-static inline bool gigantic_page_supported(void) { return false; }
  static struct page *alloc_gigantic_page(struct hstate *h, gfp_t gfp_mask,
-   int nid, nodemask_t *nodemask) { return NULL; }
+   int nid, nodemask_t *nodemask)
+{
+   return NULL;
+}
  static inline void free_gigantic_page(struct page *page, unsigned int order) 
{ }
  static inline void destroy_compound_gigantic_page(struct page *page,
unsigned int order) { }
@@ -1156,7 +1166,7 @@ static void update_and_free_page(struct hstate *h, struct 
page *page)
  {
int i;
  
-	if (hstate_is_gigantic(h) && !gigantic_page_supported())

+   if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
return;
  
  	h->nr_huge_pages--;

@@ -2276,13 +2286,27 @@ static int adjust_pool_surplus(struct hstate *h, 
nodemask_t *nodes_allowed,
  }
  
  #define persistent_huge_pages(h) (h->nr_huge_pages - h->surplus_huge_pages)

-static unsigned long set_max_huge_pages(struct hstate *h, unsigned long count,
-   nodemask_t *nodes_allowed)
+static int set_max_huge_pages(struct hstate *h, unsigned long count,
+   

Re: Mac Mini G4 hang on boot with git master

2019-03-17 Thread Mark Cave-Ayland
On 17/03/2019 16:25, christophe leroy wrote:

>> This was a weird one: bisecting directly from git master gave a nonsense 
>> result,
>> however by manually rebasing Michael's PR onto my last known good commit 
>> from master
>> I was able to finally pin it down to this commit:
>>
>>
>> 7a0d6955f3f7a4250da63d528bfff7a9c91b5725 is the first bad commit
>> commit 7a0d6955f3f7a4250da63d528bfff7a9c91b5725
>> Author: Christophe Leroy 
>> Date:   Thu Feb 21 10:37:55 2019 +
>>
>>  powerpc/6xx: Store PGDIR physical address in a SPRG
>>
>>  Use SPRN_SPRG2 to store the current thread PGDIR and
>>  avoid reading thread_struct.pgdir at every TLB miss.
>>
>>  Signed-off-by: Christophe Leroy 
>>  Signed-off-by: Michael Ellerman 
>>
>>
> 
> Hi,
> 
> The fix is there:
> 
> https://patchwork.ozlabs.org/patch/1053385/
> 
> Christophe

Hi Christophe,

Thank you! I've tried the patch here and have confirmed that it fixes the 
problem
with the hang on boot.


ATB,

Mark.


[PATCH v7 4/4] hugetlb: allow to free gigantic pages regardless of the configuration

2019-03-17 Thread Alexandre Ghiti
On systems without CONTIG_ALLOC activated but that support gigantic pages,
boottime reserved gigantic pages cannot be freed at all. This patch
simply enables the possibility to hand those pages back to the memory
allocator.

Signed-off-by: Alexandre Ghiti 
Acked-by: David S. Miller  [sparc]
---
 arch/arm64/Kconfig   |  2 +-
 arch/arm64/include/asm/hugetlb.h |  4 --
 arch/powerpc/include/asm/book3s/64/hugetlb.h |  7 ---
 arch/powerpc/platforms/Kconfig.cputype   |  2 +-
 arch/s390/Kconfig|  2 +-
 arch/s390/include/asm/hugetlb.h  |  3 --
 arch/sh/Kconfig  |  2 +-
 arch/sparc/Kconfig   |  2 +-
 arch/x86/Kconfig |  2 +-
 arch/x86/include/asm/hugetlb.h   |  4 --
 include/asm-generic/hugetlb.h| 14 +
 include/linux/gfp.h  |  2 +-
 mm/hugetlb.c | 54 ++--
 mm/page_alloc.c  |  4 +-
 14 files changed, 61 insertions(+), 43 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 091a513b93e9..af687eff884a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -18,7 +18,7 @@ config ARM64
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
-   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
+   select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_PTE_SPECIAL
diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h
index fb6609875455..59893e766824 100644
--- a/arch/arm64/include/asm/hugetlb.h
+++ b/arch/arm64/include/asm/hugetlb.h
@@ -65,8 +65,4 @@ extern void set_huge_swap_pte_at(struct mm_struct *mm, 
unsigned long addr,
 
 #include 
 
-#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-static inline bool gigantic_page_supported(void) { return true; }
-#endif
-
 #endif /* __ASM_HUGETLB_H */
diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h 
b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index 5b0177733994..d04a0bcc2f1c 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -32,13 +32,6 @@ static inline int hstate_get_psize(struct hstate *hstate)
}
 }
 
-#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-static inline bool gigantic_page_supported(void)
-{
-   return true;
-}
-#endif
-
 /* hugepd entry valid bit */
 #define HUGEPD_VAL_BITS(0x8000UL)
 
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index f677c8974212..dc0328de20cd 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -319,7 +319,7 @@ config ARCH_ENABLE_SPLIT_PMD_PTLOCK
 config PPC_RADIX_MMU
bool "Radix MMU Support"
depends on PPC_BOOK3S_64
-   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
+   select ARCH_HAS_GIGANTIC_PAGE
default y
help
  Enable support for the Power ISA 3.0 Radix style MMU. Currently this
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 1c57b83c76f5..d84e536796b1 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -69,7 +69,7 @@ config S390
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
-   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
+   select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_MEMORY
diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index 2d1afa58a4b6..bd191560efcf 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -116,7 +116,4 @@ static inline pte_t huge_pte_modify(pte_t pte, pgprot_t 
newprot)
return pte_modify(pte, newprot);
 }
 
-#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-static inline bool gigantic_page_supported(void) { return true; }
-#endif
 #endif /* _ASM_S390_HUGETLB_H */
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index c7266302691c..404b12a0d871 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -53,7 +53,7 @@ config SUPERH
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_NMI
select NEED_SG_DMA_LENGTH
-   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
+   select ARCH_HAS_GIGANTIC_PAGE
 
help
  The SuperH is a RISC processor targeted for use in embedded systems
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index ca33c80870e2..234a6bd46e89 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -90,7 +90,7 @@ config SPARC64
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_PTE_SPECIAL
select PCI_DOMAINS if PCI
-   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
+   select ARCH_HAS_GIGANTIC_PAGE
 
 config 

[PATCH v7 3/4] mm: Simplify MEMORY_ISOLATION && COMPACTION || CMA into CONTIG_ALLOC

2019-03-17 Thread Alexandre Ghiti
This condition is what allows alloc_contig_range to be defined, so
simplify it into a more accurate name.

Suggested-by: Vlastimil Babka 
Signed-off-by: Alexandre Ghiti 
Acked-by: Vlastimil Babka 
---
 arch/arm64/Kconfig | 2 +-
 arch/powerpc/platforms/Kconfig.cputype | 2 +-
 arch/s390/Kconfig  | 2 +-
 arch/sh/Kconfig| 2 +-
 arch/sparc/Kconfig | 2 +-
 arch/x86/Kconfig   | 2 +-
 arch/x86/mm/hugetlbpage.c  | 2 +-
 include/linux/gfp.h| 2 +-
 mm/Kconfig | 3 +++
 mm/page_alloc.c| 3 +--
 10 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a4168d366127..091a513b93e9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -18,7 +18,7 @@ config ARM64
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
-   select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
+   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
select ARCH_HAS_KCOV
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_PTE_SPECIAL
diff --git a/arch/powerpc/platforms/Kconfig.cputype 
b/arch/powerpc/platforms/Kconfig.cputype
index 8c7464c3f27f..f677c8974212 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -319,7 +319,7 @@ config ARCH_ENABLE_SPLIT_PMD_PTLOCK
 config PPC_RADIX_MMU
bool "Radix MMU Support"
depends on PPC_BOOK3S_64
-   select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
+   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
default y
help
  Enable support for the Power ISA 3.0 Radix style MMU. Currently this
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index ed554b09eb3f..1c57b83c76f5 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -69,7 +69,7 @@ config S390
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
-   select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
+   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
select ARCH_HAS_KCOV
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_MEMORY
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 299a17bed67c..c7266302691c 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -53,7 +53,7 @@ config SUPERH
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_NMI
select NEED_SG_DMA_LENGTH
-   select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
+   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
 
help
  The SuperH is a RISC processor targeted for use in embedded systems
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 0b7f0e0fefa5..ca33c80870e2 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -90,7 +90,7 @@ config SPARC64
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_PTE_SPECIAL
select PCI_DOMAINS if PCI
-   select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
+   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
 
 config ARCH_DEFCONFIG
string
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 68261430fe6e..8ba90f3e0038 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -23,7 +23,7 @@ config X86_64
def_bool y
depends on 64BIT
# Options that are inherently 64-bit kernel only:
-   select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
+   select ARCH_HAS_GIGANTIC_PAGE if CONTIG_ALLOC
select ARCH_SUPPORTS_INT128
select ARCH_USE_CMPXCHG_LOCKREF
select HAVE_ARCH_SOFT_DIRTY
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 92e4c4b85bba..fab095362c50 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -203,7 +203,7 @@ static __init int setup_hugepagesz(char *opt)
 }
 __setup("hugepagesz=", setup_hugepagesz);
 
-#if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || 
defined(CONFIG_CMA)
+#ifdef CONFIG_CONTIG_ALLOC
 static __init int gigantic_pages_init(void)
 {
/* With compaction or CMA we can allocate gigantic pages at runtime */
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 5f5e25fd6149..1f1ad9aeebb9 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -585,7 +585,7 @@ static inline bool pm_suspended_storage(void)
 }
 #endif /* CONFIG_PM_SLEEP */
 
-#if (defined(CONFIG_MEMORY_ISOLATION) && defined(CONFIG_COMPACTION)) || 
defined(CONFIG_CMA)
+#ifdef CONFIG_CONTIG_ALLOC
 /* The below functions must be run on a range from a single zone. */
 extern int alloc_contig_range(unsigned long start, unsigned long end,
  unsigned migratetype, gfp_t gfp_mask);
diff --git a/mm/Kconfig 

[PATCH v7 2/4] sparc: Advertise gigantic page support

2019-03-17 Thread Alexandre Ghiti
sparc actually supports gigantic pages and selecting
ARCH_HAS_GIGANTIC_PAGE allows it to allocate and free
gigantic pages at runtime.

sparc allows configuration such as huge pages of 16GB,
pages of 8KB and MAX_ORDER = 13 (default):
HPAGE_SHIFT (34) - PAGE_SHIFT (13) = 21 >= MAX_ORDER (13)

Signed-off-by: Alexandre Ghiti 
Acked-by: David S. Miller 
---
 arch/sparc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index d5dd652fb8cc..0b7f0e0fefa5 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -90,6 +90,7 @@ config SPARC64
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_PTE_SPECIAL
select PCI_DOMAINS if PCI
+   select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
 
 config ARCH_DEFCONFIG
string
-- 
2.20.1



[PATCH v7 1/4] sh: Advertise gigantic page support

2019-03-17 Thread Alexandre Ghiti
sh actually supports gigantic pages and selecting
ARCH_HAS_GIGANTIC_PAGE allows it to allocate and free
gigantic pages at runtime.

At least sdk7786_defconfig exposes such a configuration with
huge pages of 64MB, pages of 4KB and MAX_ORDER = 11:
HPAGE_SHIFT (26) - PAGE_SHIFT (12) = 14 >= MAX_ORDER (11)

Signed-off-by: Alexandre Ghiti 
---
 arch/sh/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index a9c36f95744a..299a17bed67c 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -53,6 +53,7 @@ config SUPERH
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_NMI
select NEED_SG_DMA_LENGTH
+   select ARCH_HAS_GIGANTIC_PAGE if (MEMORY_ISOLATION && COMPACTION) || CMA
 
help
  The SuperH is a RISC processor targeted for use in embedded systems
-- 
2.20.1



[PATCH v7 0/4] Fix free/allocation of runtime gigantic pages

2019-03-17 Thread Alexandre Ghiti
This series fixes sh and sparc, which did not advertise their gigantic page
support and therefore were not able to allocate and free those pages at runtime.
It renames the MEMORY_ISOLATION && COMPACTION || CMA condition to the more
accurate CONTIG_ALLOC, since that condition is what allows the
alloc_contig_range function to be defined.
Finally, it fixes the definition of the ARCH_HAS_GIGANTIC_PAGE config, which,
without MEMORY_ISOLATION && COMPACTION || CMA defined, did not allow
architectures to free boottime allocated gigantic pages, although freeing is
unrelated to that condition.

Changes in v7:
  I thought gigantic page support was settled at compile time, but Aneesh
  and Michael have just come up with a patch proving me wrong for
  powerpc: https://patchwork.ozlabs.org/patch/1047003/. So this version:
  - reintroduces gigantic_page_supported renamed into
gigantic_page_runtime_supported
  - reintroduces the corresponding gigantic page support checks (not
everywhere though: set_max_huge_pages check was redundant with
__nr_hugepages_store_common)
  - introduces the possibility for arch to override this function
by using asm-generic/hugetlb.h current semantics although Aneesh
proposed something else.

Changes in v6:
- Remove unnecessary goto since the fallthrough path does the same and is
  the 'normal' behaviour, as suggested by Dave Hansen
- Be more explicit in comment in set_max_huge_page: we return an error
  if alloc_contig_range is not defined and the user tries to allocate a
  gigantic page (we keep the same behaviour as before this patch), but we
  now let her free boottime gigantic pages, as suggested by Dave Hansen
- Add Acked-by, thanks. 

Changes in v5:
- Fix bug in previous version thanks to Mike Kravetz
- Fix block comments that did not respect coding style thanks to Dave Hansen
- Define ARCH_HAS_GIGANTIC_PAGE only for sparc64 as advised by David Miller
- Factorize "def_bool" and "depends on" thanks to Vlastimil Babka

Changes in v4 as suggested by Dave Hansen:
- Split previous version into small patches
- Do not compile alloc_gigantic** functions for architectures that do not
  support those pages
- Define correct ARCH_HAS_GIGANTIC_PAGE in all arch that support them to avoid
  useless runtime check
- Add comment in set_max_huge_pages to explain that freeing is possible even
  without CONTIG_ALLOC defined
- Remove gigantic_page_supported function across all archs

Changes in v3 as suggested by Vlastimil Babka and Dave Hansen:
- config definition was wrong and is now in mm/Kconfig
- COMPACTION_CORE was renamed in CONTIG_ALLOC

Changes in v2 as suggested by Vlastimil Babka:
- Get rid of ARCH_HAS_GIGANTIC_PAGE
- Get rid of architecture specific gigantic_page_supported
- Factorize CMA or (MEMORY_ISOLATION && COMPACTION) into COMPACTION_CORE 

Alexandre Ghiti (4):
  sh: Advertise gigantic page support
  sparc: Advertise gigantic page support
  mm: Simplify MEMORY_ISOLATION && COMPACTION || CMA into CONTIG_ALLOC
  hugetlb: allow to free gigantic pages regardless of the configuration

 arch/arm64/Kconfig   |  2 +-
 arch/arm64/include/asm/hugetlb.h |  4 --
 arch/powerpc/include/asm/book3s/64/hugetlb.h |  7 ---
 arch/powerpc/platforms/Kconfig.cputype   |  2 +-
 arch/s390/Kconfig|  2 +-
 arch/s390/include/asm/hugetlb.h  |  3 --
 arch/sh/Kconfig  |  1 +
 arch/sparc/Kconfig   |  1 +
 arch/x86/Kconfig |  2 +-
 arch/x86/include/asm/hugetlb.h   |  4 --
 arch/x86/mm/hugetlbpage.c|  2 +-
 include/asm-generic/hugetlb.h| 14 +
 include/linux/gfp.h  |  4 +-
 mm/Kconfig   |  3 ++
 mm/hugetlb.c | 54 ++--
 mm/page_alloc.c  |  7 ++-
 16 files changed, 67 insertions(+), 45 deletions(-)

-- 
2.20.1



Re: Mac Mini G4 hang on boot with git master

2019-03-17 Thread christophe leroy




On 17/03/2019 15:15, Mark Cave-Ayland wrote:

On 15/03/2019 13:37, Mark Cave-Ayland wrote:


Hi all,

I've just done a git pull and rebuilt master on my Mac Mini G4 in order to test
Michael's merge of my KVM PR fix, and unfortunately my kernel now hangs on boot 
:(

My last working git checkout was somewhere around the 5.0-rc stage, so I 
suspect it's
something that's been merged for 5.1.

The hang occurs just after the boot console is disabled which makes me wonder if
something is going wrong during PCI bus enumeration. Does anyone have an idea 
as to
what may be causing this? I can obviously bisect it down, but on slow hardware 
it can
take some time...


This was a weird one: bisecting directly from git master gave a nonsense result,
however by manually rebasing Michael's PR onto my last known good commit from 
master
I was able to finally pin it down to this commit:


7a0d6955f3f7a4250da63d528bfff7a9c91b5725 is the first bad commit
commit 7a0d6955f3f7a4250da63d528bfff7a9c91b5725
Author: Christophe Leroy 
Date:   Thu Feb 21 10:37:55 2019 +

 powerpc/6xx: Store PGDIR physical address in a SPRG

 Use SPRN_SPRG2 to store the current thread PGDIR and
 avoid reading thread_struct.pgdir at every TLB miss.

 Signed-off-by: Christophe Leroy 
 Signed-off-by: Michael Ellerman 




Hi,

The fix is there:

https://patchwork.ozlabs.org/patch/1053385/

Christophe




[PATCH v2 5/5] powerpc/mm/hash: Simplify the region id calculation.

2019-03-17 Thread Aneesh Kumar K.V
This reduces multiple comparisons in get_region_id to a bit shift operation.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |  4 ++-
 arch/powerpc/include/asm/book3s/64/hash-64k.h |  1 +
 arch/powerpc/include/asm/book3s/64/hash.h | 31 +--
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  2 +-
 4 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 0dd62287f56c..64eaf187f891 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -13,12 +13,14 @@
  */
 #define MAX_EA_BITS_PER_CONTEXT46
 
+#define REGION_SHIFT   (MAX_EA_BITS_PER_CONTEXT - 2)
+
 /*
  * Our page table limit us to 64TB. Hence for the kernel mapping,
  * each MAP area is limited to 16 TB.
  * The four map areas are:  linear mapping, vmap, IO and vmemmap
  */
-#define H_KERN_MAP_SIZE		(ASM_CONST(1) << (MAX_EA_BITS_PER_CONTEXT - 2))
+#define H_KERN_MAP_SIZE(ASM_CONST(1) << REGION_SHIFT)
 
 /*
  * Define the address range of the kernel non-linear virtual area
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index e392cf17b457..24ca63beba14 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,6 +12,7 @@
  * is handled in the hotpath.
  */
 #define MAX_EA_BITS_PER_CONTEXT49
+#define REGION_SHIFT   MAX_EA_BITS_PER_CONTEXT
 
 /*
  * We use one context for each MAP area.
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 523b9191a1e2..d1f0d7332b84 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -87,26 +87,26 @@
 #define H_VMEMMAP_SIZE H_KERN_MAP_SIZE
 #define H_VMEMMAP_END  (H_VMEMMAP_START + H_VMEMMAP_SIZE)
 
+#define REGION_ID(ea)	((((unsigned long)ea - H_KERN_VIRT_START) >> REGION_SHIFT) + 2)
+
 /*
  * Region IDs
  */
-#define USER_REGION_ID 1
-#define KERNEL_REGION_ID   2
-#define VMALLOC_REGION_ID  3
-#define IO_REGION_ID   4
-#define VMEMMAP_REGION_ID  5
+#define USER_REGION_ID 0
+#define KERNEL_REGION_ID   1
+#define VMALLOC_REGION_ID  REGION_ID(H_VMALLOC_START)
+#define IO_REGION_ID   REGION_ID(H_KERN_IO_START)
+#define VMEMMAP_REGION_ID  REGION_ID(H_VMEMMAP_START)
 
 /*
  * Defines the address of the vmemap area, in its own region on
  * hash table CPUs.
  */
-
 #ifdef CONFIG_PPC_MM_SLICES
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
 #endif /* CONFIG_PPC_MM_SLICES */
 
-
 /* PTEIDX nibble */
 #define _PTEIDX_SECONDARY  0x8
 #define _PTEIDX_GROUP_IX   0x7
@@ -117,22 +117,21 @@
 #ifndef __ASSEMBLY__
 static inline int get_region_id(unsigned long ea)
 {
+   int region_id;
int id = (ea >> 60UL);
 
if (id == 0)
return USER_REGION_ID;
 
-   VM_BUG_ON(id != 0xc);
-   VM_BUG_ON(ea >= H_VMEMMAP_END);
+   if (ea < H_KERN_VIRT_START)
+   return KERNEL_REGION_ID;
 
-   if (ea >= H_VMEMMAP_START)
-   return VMEMMAP_REGION_ID;
-   else if (ea >= H_KERN_IO_START)
-   return IO_REGION_ID;
-   else if (ea >= H_VMALLOC_START)
-   return VMALLOC_REGION_ID;
+   VM_BUG_ON(id != 0xc);
+   BUILD_BUG_ON(REGION_ID(H_VMALLOC_START) != 2);
 
-   return KERNEL_REGION_ID;
+   region_id = REGION_ID(ea);
+   VM_BUG_ON(region_id > VMEMMAP_REGION_ID);
+   return region_id;
 }
 
 #definehash__pmd_bad(pmd)  (pmd_val(pmd) & H_PMD_BAD_BITS)
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h 
b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index b3f256c042aa..b146448109fd 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -794,7 +794,7 @@ static inline unsigned long get_kernel_context(unsigned 
long ea)
 */
ctx =  1 + ((ea & EA_MASK) >> MAX_EA_BITS_PER_CONTEXT);
} else
-   ctx = region_id + MAX_KERNEL_CTX_CNT - 2;
+   ctx = region_id + MAX_KERNEL_CTX_CNT - 1;
return ctx;
 }
 
-- 
2.20.1



[PATCH v2 4/5] powerpc/mm: Drop the unnecessary region check

2019-03-17 Thread Aneesh Kumar K.V
All the regions are now mapped with top nibble 0xc. Hence the region id
check is not needed for virt_addr_valid()

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/page.h | 12 
 1 file changed, 12 deletions(-)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 918228f2205b..748f5db2e2b7 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -132,19 +132,7 @@ static inline bool pfn_valid(unsigned long pfn)
 #define virt_to_page(kaddr)pfn_to_page(virt_to_pfn(kaddr))
 #define pfn_to_kaddr(pfn)  __va((pfn) << PAGE_SHIFT)
 
-#ifdef CONFIG_PPC_BOOK3S_64
-/*
- * On hash the vmalloc and other regions alias to the kernel region when passed
- * through __pa(), which virt_to_pfn() uses. That means virt_addr_valid() can
- * return true for some vmalloc addresses, which is incorrect. So explicitly
- * check that the address is in the kernel region.
- */
-/* may be can drop get_region_id */
-#define virt_addr_valid(kaddr) (get_region_id((unsigned long)kaddr) == 
KERNEL_REGION_ID && \
-   pfn_valid(virt_to_pfn(kaddr)))
-#else
 #define virt_addr_valid(kaddr) pfn_valid(virt_to_pfn(kaddr))
-#endif
 
 /*
  * On Book-E parts we need __va to parse the device tree and we can't
-- 
2.20.1



[PATCH v2 3/5] powerpc/mm: Validate address values against different region limits

2019-03-17 Thread Aneesh Kumar K.V
This adds an explicit check in various functions.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/hash_utils_64.c  | 18 +++---
 arch/powerpc/mm/pgtable-hash64.c | 13 ++---
 arch/powerpc/mm/pgtable-radix.c  | 16 
 arch/powerpc/mm/pgtable_64.c |  5 +
 4 files changed, 46 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index c6b39e7694ba..ef0ca3bf555d 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -786,9 +786,16 @@ void resize_hpt_for_hotplug(unsigned long new_mem_size)
 
 int hash__create_section_mapping(unsigned long start, unsigned long end, int 
nid)
 {
-   int rc = htab_bolt_mapping(start, end, __pa(start),
-  pgprot_val(PAGE_KERNEL), mmu_linear_psize,
-  mmu_kernel_ssize);
+   int rc;
+
+   if (end >= H_VMALLOC_START) {
+   pr_warn("Outside the supported range\n");
+   return -1;
+   }
+
+   rc = htab_bolt_mapping(start, end, __pa(start),
+  pgprot_val(PAGE_KERNEL), mmu_linear_psize,
+  mmu_kernel_ssize);
 
if (rc < 0) {
int rc2 = htab_remove_mapping(start, end, mmu_linear_psize,
@@ -929,6 +936,11 @@ static void __init htab_initialize(void)
DBG("creating mapping for region: %lx..%lx (prot: %lx)\n",
base, size, prot);
 
+   if ((base + size) >= H_VMALLOC_START) {
+   pr_warn("Outside the supported range\n");
+   continue;
+   }
+
BUG_ON(htab_bolt_mapping(base, base + size, __pa(base),
prot, mmu_linear_psize, mmu_kernel_ssize));
}
diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
index c08d49046a96..d934de4e2b3a 100644
--- a/arch/powerpc/mm/pgtable-hash64.c
+++ b/arch/powerpc/mm/pgtable-hash64.c
@@ -112,9 +112,16 @@ int __meminit hash__vmemmap_create_mapping(unsigned long 
start,
   unsigned long page_size,
   unsigned long phys)
 {
-   int rc = htab_bolt_mapping(start, start + page_size, phys,
-  pgprot_val(PAGE_KERNEL),
-  mmu_vmemmap_psize, mmu_kernel_ssize);
+   int rc;
+
+   if ((start + page_size) >= H_VMEMMAP_END) {
+   pr_warn("Outside the supported range\n");
+   return -1;
+   }
+
+   rc = htab_bolt_mapping(start, start + page_size, phys,
+  pgprot_val(PAGE_KERNEL),
+  mmu_vmemmap_psize, mmu_kernel_ssize);
if (rc < 0) {
int rc2 = htab_remove_mapping(start, start + page_size,
  mmu_vmemmap_psize,
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index ba485fbd81f1..c9b24bf78819 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -334,6 +334,12 @@ void __init radix_init_pgtable(void)
 * page tables will be allocated within the range. No
 * need or a node (which we don't have yet).
 */
+
+   if ((reg->base + reg->size) >= RADIX_VMALLOC_START) {
+   pr_warn("Outside the supported range\n");
+   continue;
+   }
+
WARN_ON(create_physical_mapping(reg->base,
reg->base + reg->size,
-1));
@@ -866,6 +872,11 @@ static void __meminit remove_pagetable(unsigned long 
start, unsigned long end)
 
 int __meminit radix__create_section_mapping(unsigned long start, unsigned long 
end, int nid)
 {
+   if (end >= RADIX_VMALLOC_START) {
+   pr_warn("Outside the supported range\n");
+   return -1;
+   }
+
return create_physical_mapping(start, end, nid);
 }
 
@@ -893,6 +904,11 @@ int __meminit radix__vmemmap_create_mapping(unsigned long 
start,
int nid = early_pfn_to_nid(phys >> PAGE_SHIFT);
int ret;
 
+   if ((start + page_size) >= RADIX_VMEMMAP_END) {
+   pr_warn("Outside the supported range\n");
+   return -1;
+   }
+
ret = __map_kernel_page_nid(start, phys, __pgprot(flags), page_size, 
nid);
BUG_ON(ret);
 
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 56068cac2a3c..72f58c076e26 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -121,6 +121,11 @@ void __iomem *__ioremap_at(phys_addr_t pa, void *ea, 
unsigned long size, pgprot_
if (pgprot_val(prot) & H_PAGE_4K_PFN)
return NULL;
 
+   if ((ea + size) >= (void *)IOREMAP_END) {
+  

[PATCH v2 2/5] powerpc/mm/hash64: Map all the kernel regions in the same 0xc range

2019-03-17 Thread Aneesh Kumar K.V
This patch maps the vmap, IO and vmemmap regions in the 0xc address range
instead of the current 0xd and 0xf ranges. This brings the mapping closer
to radix translation mode.

With the hash 64K page size each of these regions is 512TB, whereas with
the 4K config we are limited by the max page table range of 64TB and hence
these regions are 16TB in size.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 13 +++
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 11 +++
 arch/powerpc/include/asm/book3s/64/hash.h | 95 ---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 31 +++---
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  1 -
 arch/powerpc/include/asm/book3s/64/radix.h| 41 
 arch/powerpc/include/asm/page.h   |  3 +-
 arch/powerpc/kvm/book3s_hv_rm_xics.c  |  2 +-
 arch/powerpc/mm/copro_fault.c | 14 ++-
 arch/powerpc/mm/hash_utils_64.c   | 26 ++---
 arch/powerpc/mm/pgtable-radix.c   |  3 +-
 arch/powerpc/mm/pgtable_64.c  |  2 -
 arch/powerpc/mm/ptdump/hashpagetable.c|  2 +-
 arch/powerpc/mm/ptdump/ptdump.c   |  3 +-
 arch/powerpc/mm/slb.c | 22 +++--
 arch/powerpc/platforms/cell/spu_base.c|  4 +-
 drivers/misc/cxl/fault.c  |  2 +-
 drivers/misc/ocxl/link.c  |  2 +-
 18 files changed, 170 insertions(+), 107 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h 
b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index cf5ba5254299..0dd62287f56c 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -13,6 +13,19 @@
  */
 #define MAX_EA_BITS_PER_CONTEXT46
 
+/*
+ * Our page tables limit us to 64TB. Hence for the kernel mapping,
+ * each MAP area is limited to 16 TB.
+ * The four map areas are:  linear mapping, vmap, IO and vmemmap
+ */
+#define H_KERN_MAP_SIZE(ASM_CONST(1) << 
(MAX_EA_BITS_PER_CONTEXT - 2))
+
+/*
+ * Define the address range of the kernel non-linear virtual area
+ * 16TB
+ */
+#define H_KERN_VIRT_START  ASM_CONST(0xc0001000)
+
 #ifndef __ASSEMBLY__
 #define H_PTE_TABLE_SIZE   (sizeof(pte_t) << H_PTE_INDEX_SIZE)
 #define H_PMD_TABLE_SIZE   (sizeof(pmd_t) << H_PMD_INDEX_SIZE)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h 
b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index f82ee8a3b561..e392cf17b457 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -13,6 +13,17 @@
  */
 #define MAX_EA_BITS_PER_CONTEXT49
 
+/*
+ * We use one context for each MAP area.
+ */
+#define H_KERN_MAP_SIZE(1UL << MAX_EA_BITS_PER_CONTEXT)
+
+/*
+ * Define the address range of the kernel non-linear virtual area
+ * 2PB
+ */
+#define H_KERN_VIRT_START  ASM_CONST(0xc008)
+
 /*
 * A 64k aligned address frees up a few of the lower bits of the RPN for us
 * We steal that here. For more details look at pte_pfn/pfn_pte()
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 8cbc4106d449..523b9191a1e2 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -29,6 +29,10 @@
 #define H_PGTABLE_EADDR_SIZE   (H_PTE_INDEX_SIZE + H_PMD_INDEX_SIZE + \
 H_PUD_INDEX_SIZE + H_PGD_INDEX_SIZE + 
PAGE_SHIFT)
 #define H_PGTABLE_RANGE(ASM_CONST(1) << H_PGTABLE_EADDR_SIZE)
+/*
+ * Top 2 bits are ignored in page table walk.
+ */
+#define EA_MASK(~(0xcUL << 60))
 
 /*
  * We store the slot details in the second half of page table.
@@ -42,53 +46,60 @@
 #endif
 
 /*
- * Define the address range of the kernel non-linear virtual area. In contrast
- * to the linear mapping, this is managed using the kernel page tables and then
- * inserted into the hash page table to actually take effect, similarly to user
- * mappings.
+ * One context each will be used for vmap, IO and vmemmap
  */
-#define H_KERN_VIRT_START ASM_CONST(0xD000)
-
+#define H_KERN_VIRT_SIZE   (H_KERN_MAP_SIZE * 3)
 /*
- * Allow virtual mapping of one context size.
- * 512TB for 64K page size
- * 64TB for 4K page size
+ * +--+
+ * |  |
+ * |  |
+ * |  |
+ * +--+  Kernel virtual map end 
(0xc00e)
+ * |  |
+ * |  |
+ * |  512TB/16TB of vmemmap   |
+ * |  |
+ * |  |
+ * +--+  Kernel vmemmap  start
+ * |  |
+ * |  512TB/16TB of IO map|
+ * |  |
+ * +--+  Kernel IO map start
+ * 

[PATCH v2 1/5] powerpc/mm/hash64: Add a variable to track the end of IO mapping

2019-03-17 Thread Aneesh Kumar K.V
This makes it easy to update the region mapping in the later patch.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/include/asm/book3s/64/hash.h| 3 ++-
 arch/powerpc/include/asm/book3s/64/pgtable.h | 8 +---
 arch/powerpc/include/asm/book3s/64/radix.h   | 1 +
 arch/powerpc/mm/hash_utils_64.c  | 1 +
 arch/powerpc/mm/pgtable-radix.c  | 1 +
 arch/powerpc/mm/pgtable_64.c | 2 ++
 6 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h 
b/arch/powerpc/include/asm/book3s/64/hash.h
index 54b7af6cd27f..8cbc4106d449 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -69,7 +69,8 @@
 #define H_VMALLOC_SIZE (H_KERN_VIRT_SIZE - H_KERN_IO_SIZE)
 #define H_VMALLOC_END  (H_VMALLOC_START + H_VMALLOC_SIZE)
 
-#define H_KERN_IO_START H_VMALLOC_END
+#define H_KERN_IO_STARTH_VMALLOC_END
+#define H_KERN_IO_END  (H_KERN_VIRT_START + H_KERN_VIRT_SIZE)
 
 /*
  * Region IDs
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 581f91be9dd4..51190a6d1c8a 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -277,9 +277,12 @@ extern unsigned long __vmalloc_end;
 extern unsigned long __kernel_virt_start;
 extern unsigned long __kernel_virt_size;
 extern unsigned long __kernel_io_start;
+extern unsigned long __kernel_io_end;
 #define KERN_VIRT_START __kernel_virt_start
 #define KERN_VIRT_SIZE  __kernel_virt_size
 #define KERN_IO_START  __kernel_io_start
+#define KERN_IO_END __kernel_io_end
+
 extern struct page *vmemmap;
 extern unsigned long ioremap_bot;
 extern unsigned long pci_io_base;
@@ -296,8 +299,7 @@ extern unsigned long pci_io_base;
 
 #include 
 /*
- * The second half of the kernel virtual space is used for IO mappings,
- * it's itself carved into the PIO region (ISA and PHB IO space) and
+ * IO space itself carved into the PIO region (ISA and PHB IO space) and
  * the ioremap space
  *
  *  ISA_IO_BASE = KERN_IO_START, 64K reserved area
@@ -310,7 +312,7 @@ extern unsigned long pci_io_base;
 #define  PHB_IO_BASE   (ISA_IO_END)
 #define  PHB_IO_END(KERN_IO_START + FULL_IO_SIZE)
 #define IOREMAP_BASE   (PHB_IO_END)
-#define IOREMAP_END(KERN_VIRT_START + KERN_VIRT_SIZE)
+#define IOREMAP_END(KERN_IO_END)
 
 /* Advertise special mapping type for AGP */
 #define HAVE_PAGE_AGP
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h 
b/arch/powerpc/include/asm/book3s/64/radix.h
index 5ab134eeed20..6d760a083d62 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -111,6 +111,7 @@
 #define RADIX_VMEMMAP_BASE (RADIX_VMALLOC_END)
 
 #define RADIX_KERN_IO_START(RADIX_KERN_VIRT_START + (RADIX_KERN_VIRT_SIZE 
>> 1))
+#define RADIX_KERN_IO_END   (RADIX_KERN_VIRT_START + RADIX_KERN_VIRT_SIZE)
 
 #ifndef __ASSEMBLY__
 #define RADIX_PTE_TABLE_SIZE   (sizeof(pte_t) << RADIX_PTE_INDEX_SIZE)
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0a4f939a8161..394dd969002f 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1017,6 +1017,7 @@ void __init hash__early_init_mmu(void)
__vmalloc_start = H_VMALLOC_START;
__vmalloc_end = H_VMALLOC_END;
__kernel_io_start = H_KERN_IO_START;
+   __kernel_io_end = H_KERN_IO_END;
vmemmap = (struct page *)H_VMEMMAP_BASE;
ioremap_bot = IOREMAP_BASE;
 
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 154472a28c77..bca1bf66c56e 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -578,6 +578,7 @@ void __init radix__early_init_mmu(void)
__vmalloc_start = RADIX_VMALLOC_START;
__vmalloc_end = RADIX_VMALLOC_END;
__kernel_io_start = RADIX_KERN_IO_START;
+   __kernel_io_end = RADIX_KERN_IO_END;
vmemmap = (struct page *)RADIX_VMEMMAP_BASE;
ioremap_bot = IOREMAP_BASE;
 
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index fb1375c07e8c..7cea39bdf05f 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -98,6 +98,8 @@ unsigned long __vmalloc_end;
 EXPORT_SYMBOL(__vmalloc_end);
 unsigned long __kernel_io_start;
 EXPORT_SYMBOL(__kernel_io_start);
+unsigned long __kernel_io_end;
+EXPORT_SYMBOL(__kernel_io_end);
 struct page *vmemmap;
 EXPORT_SYMBOL(vmemmap);
 unsigned long __pte_frag_nr;
-- 
2.20.1



[PATCH v2 0/5] Update hash MMU kernel mapping to be in sync with radix.

2019-03-17 Thread Aneesh Kumar K.V
This patch series maps all the kernel regions (vmalloc, IO and vmemmap)
using a 0xc top-nibble address. This brings the hash translation kernel
mapping in sync with radix.
Each of these regions can now map 512TB. We use one context to map these
regions, hence the 512TB limit. We also update radix to use the same
limit, even though we don't have context-related restrictions there.

Aneesh Kumar K.V (5):
  powerpc/mm/hash64: Add a variable to track the end of IO mapping
  powerpc/mm/hash64: Map all the kernel regions in the same 0xc range
  powerpc/mm: Validate address values against different region limits
  powerpc/mm: Drop the unnecessary region check
  powerpc/mm/hash: Simplify the region id calculation.

 arch/powerpc/include/asm/book3s/64/hash-4k.h  | 15 +++
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 12 +++
 arch/powerpc/include/asm/book3s/64/hash.h | 97 ---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h | 31 +++---
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  9 +-
 arch/powerpc/include/asm/book3s/64/radix.h| 40 
 arch/powerpc/include/asm/page.h   | 11 ---
 arch/powerpc/kvm/book3s_hv_rm_xics.c  |  2 +-
 arch/powerpc/mm/copro_fault.c | 14 ++-
 arch/powerpc/mm/hash_utils_64.c   | 45 ++---
 arch/powerpc/mm/pgtable-hash64.c  | 13 ++-
 arch/powerpc/mm/pgtable-radix.c   | 20 +++-
 arch/powerpc/mm/pgtable_64.c  |  9 +-
 arch/powerpc/mm/ptdump/hashpagetable.c|  2 +-
 arch/powerpc/mm/ptdump/ptdump.c   |  3 +-
 arch/powerpc/mm/slb.c | 22 +++--
 arch/powerpc/platforms/cell/spu_base.c|  4 +-
 drivers/misc/cxl/fault.c  |  2 +-
 drivers/misc/ocxl/link.c  |  2 +-
 19 files changed, 227 insertions(+), 126 deletions(-)

-- 
2.20.1



Re: Mac Mini G4 hang on boot with git master

2019-03-17 Thread Mark Cave-Ayland
On 15/03/2019 13:37, Mark Cave-Ayland wrote:

> Hi all,
> 
> I've just done a git pull and rebuilt master on my Mac Mini G4 in order to 
> test
> Michael's merge of my KVM PR fix, and unfortunately my kernel now hangs on 
> boot :(
> 
> My last working git checkout was somewhere around the 5.0-rc stage, so I 
> suspect it's
> something that's been merged for 5.1.
> 
> The hang occurs just after the boot console is disabled which makes me wonder 
> if
> something is going wrong during PCI bus enumeration. Does anyone have an idea 
> as to
> what may be causing this? I can obviously bisect it down, but on slow 
> hardware it can
> take some time...

This was a weird one: bisecting directly from git master gave a nonsense
result; however, by manually rebasing Michael's PR onto my last known good
commit from master I was finally able to pin it down to this commit:


7a0d6955f3f7a4250da63d528bfff7a9c91b5725 is the first bad commit
commit 7a0d6955f3f7a4250da63d528bfff7a9c91b5725
Author: Christophe Leroy 
Date:   Thu Feb 21 10:37:55 2019 +

powerpc/6xx: Store PGDIR physical address in a SPRG

Use SPRN_SPRG2 to store the current thread PGDIR and
avoid reading thread_struct.pgdir at every TLB miss.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 


ATB,

Mark.


Re: [PATCH v2 10/45] drivers: tty: serial: zs: use devm_* functions

2019-03-17 Thread Greg KH
On Sat, Mar 16, 2019 at 10:17:11AM +0100, Enrico Weigelt, metux IT consult 
wrote:
> On 16.03.19 04:26, Greg KH wrote:
> 
> > No, it's just that those systems do not allow those devices to be
> > removed because they are probably not on a removable bus.
> 
> Ok, devices (hw) might not be removable - that's also the case for UARTs
> built into some SoCs, or the good old PC w/ 8250. But does that also mean
> that the driver should not be removable?

No, but 'rmmod' is not a normal operation that anyone ever does in a
working system.  It is only for developers' ease-of-use.

> IMHO, even if that's the case, it's still inconsistent. The driver then
> shouldn't support a remove at all (or even builtin only), not just
> incomplete remove.

Cleaning up properly when the module is unloaded is a good idea, but so
far the patches you submitted did not change anything from a logic point
of view.  They all just cleaned up memory the same way it was cleaned up
before, so I really do not understand what you are trying to do here.

> >> Okay, I was on a wrong track here - I had the silly idea that it would
> >> make things easier if we'd do it the same way everywhere.
> > 
> > "Consistent" is good, and valid, but touching old drivers that have few
> > users is always risky, and you need a solid reason to do so.
> 
> Understood.
> 
> By the way: do we have some people who have those old hw and could test?
> Should we (try to) create some ? Perhaps some "tester" entry in
> MAINTAINERS file ? (I could ask around several people who might have
> lots of old / rare hardware.)

Let's not clutter up MAINTAINERS with anything else please.

> >> Understood. Assuming I've found some of these cases, shall I use devm
> >> oder just add the missing release ?
> > 
> > If it actually makes the code "simpler" or "more obvious", sure, that's
> > fine.  But churn for churns sake is not ok.
> 
> Ok.
> 
> > I put the review of new patch submissions on hold, yes.  Almost all
> > maintainers do that as we can not add new patches to our trees at that
> > point in time.
> 
> hmm, looks like a pipeline stall ;-)
> why not collecting in a separate branch, which later gets rebased to
> mainline when rc is out ?

I do do that for subsystems that actually have a high patch rate.  The
tty/serial subsystem is not such a thing, and it can handle 2 weeks of
delay just fine.

> > And I do have other things I do during that period so it's not like I'm
> > just sitting around doing nothing :)
> 
> So it's also a fixed schedule for your other work. Understood.
> 
> It seems that this workflow can confuse people. Few days ago, somebody
> became nervous about missing reactions on patches. Your autoresponder
> worked for me, but maybe not for everybody.

Why would it not work for everybody?  Kernel development has been done
in this manner for over a decade.  Having a 2-week window like this is
good for the maintainers; remember, they are the most limited resource we
have, not developers.

thanks,

greg k-h


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.1-2 tag

2019-03-17 Thread christophe leroy

Hi Michael,

Le 16/03/2019 à 12:28, Michael Ellerman a écrit :

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi Linus,

Please pull a few small powerpc updates for 5.1:

The following changes since commit 9580b71b5a7863c24a9bd18bcd2ad759b86b1eff:

   powerpc/32: Clear on-stack exception marker upon exception return 
(2019-03-04 00:37:23 +1100)

are available in the git repository at:

   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-5.1-2

for you to fetch changes up to de3c83c2fd2b87cf68214eda76dfa66989d78cb6:

   powerpc/64s: Include  header file to fix a warning (2019-03-13 
15:03:13 +1100)

- --
powerpc fixes for 5.1 #2

One fix to prevent runtime allocation of 16GB pages when running in a VM (as
opposed to bare metal), because it doesn't work.

A small fix to our recently added KCOV support to exempt some more code from
being instrumented.

Plus a few minor build fixes, a small dead code removal and a defconfig update.

Thanks to:
   Alexey Kardashevskiy, Aneesh Kumar K.V, Christophe Leroy, Jason Yan, Joel
   Stanley, Mahesh Salgaonkar, Mathieu Malaterre.


Looks like the fix for 6xx/hash use of SPRN_SPRG_PGDIR is not there.

Patch at https://patchwork.ozlabs.org/patch/1053385/
(Or at https://patchwork.ozlabs.org/patch/1054213/ with better commit text)

Without that patch, 6xx with a hash table silently loops forever in the
page fault handler while trying to start init, as reported from
linux-next by Guenter Roeck (see
https://patchwork.ozlabs.org/patch/1046041/)


Thanks
Christophe



- --
Alexey Kardashevskiy (1):
   powerpc/powernv: Fix compile without CONFIG_TRACEPOINTS

Aneesh Kumar K.V (1):
   powerpc/hugetlb: Don't do runtime allocation of 16G pages in LPAR 
configuration

Jason Yan (1):
   powerpc: remove dead code in head_fsl_booke.S

Joel Stanley (1):
   powerpc/configs: Sync skiroot defconfig

Mahesh Salgaonkar (1):
   powerpc/mm: Disable kcov for SLB routines

Mathieu Malaterre (1):
   powerpc/64s: Include  header file to fix a warning


  arch/powerpc/configs/skiroot_defconfig   | 12 +---
  arch/powerpc/include/asm/book3s/64/hugetlb.h |  8 
  arch/powerpc/kernel/head_fsl_booke.S |  7 ---
  arch/powerpc/kernel/traps.c  |  1 +
  arch/powerpc/mm/Makefile |  3 +++
  arch/powerpc/platforms/powernv/opal-call.c   |  1 +
  6 files changed, 22 insertions(+), 10 deletions(-)
-BEGIN PGP SIGNATURE-

iQIcBAEBAgAGBQJcjN2tAAoJEFHr6jzI4aWA/jIQAK4J2VQD8Sw+2kSm3h7wW18U
+BDyc7fbbigQyBHFkMAdybRKsXMSCbco7jK12yUbh5xqYlo8Hc40DbKI32f0D3WE
7rRotjalxR9tF+u0+m8Pdge42bPmEyt6p/7w5Ys+wVj/KXqlwTJinqSvp5Qmrilk
19qOTaTCXEMJ7dFTXqlFNpBW+0kaahCZ6f767dPPKkiYSm/qMZjKG/KCejLDAGQL
x5ouTpPos8sOjts7dwJuBGCxTfU7usKpy1EbguIklzYjedk1MSh5sg6STTQsH8Y4
kAgd8T12Wo4cQPaBmjwTkD7BrCdWbjNcK5U61kKAByshM3ZyPo+xARyQMdIWVZJQ
pX51tjmKwzGk3nf1UiMP/jdx55Cj6rhr3EsfQepjocMa9t6IWNVpasAvFRPlw8ca
Xmhbqsjwy9wKroAYgITq1L+VfeDe+dXBgK7yrChpqSdU89iOYgjoUbnwI+OeSCbk
Hm8w1p5+7CNxRxNzBieqBCtYUIlEwjP3rOwuNEpb0dJ4USD4jEr/8Mk0YXWJj4yg
mplyFwUXBrWVQHlKRI2tabO8rY7KN8H+SC/EczvERxpLRc3m7dH3DIlMi8A4a9Lk
QyvoQY7n9fZw1lm+/6ORMCNSc8lkIrsDu44rgn7WpaZDbN1woia0q5AtsNDU9jr2
HwUoI/HIHq5FWRu4m6LN
=TNMo
-END PGP SIGNATURE-


