Re: [PATCHv4 0/6] printk/ia64/ppc64/parisc64: let's deprecate %pF/%pf printk specifiers

2017-11-28 Thread Sergey Senozhatsky
On (11/28/17 16:47), Petr Mladek wrote:
> On Fri 2017-11-10 08:48:24, Sergey Senozhatsky wrote:
> > Hello,
> > 
> > A reworked version. There is a new dereference_symbol_descriptor()
> > function now, where "the magic happens", so I don't touch kallsyms_lookup()
> > and module_address_lookup() anymore.
> 
> The new version looks good to me. Thanks a lot for reworking it.
> I feel much better now. For the whole series:
> 
> Reviewed-by: Petr Mladek 
> 
> > All Ack-s/Tested-by-s were dropped, since the patch set has been
> > reworked. I'm kindly asking arch-s maintainers and developers to test it
> > once again. Sorry for any inconveniences and thanks for your help in
> > advance.
> 
> I see that it was tested on all affected architectures. Thanks a lot
> all testers.
> 
> It seems that we are ready to go. I am going to push this into
> for-4.16 branch in printk.git.

thanks.

-ss


Re: [PATCH] powerpc/powernv : Add support to enable sensor groups

2017-11-28 Thread Shilpasri G Bhat
Hi,

On 11/28/2017 05:07 PM, Michael Ellerman wrote:
> Shilpasri G Bhat  writes:
> 
>> Adds support to enable/disable a sensor group. This can be used to
>> select the sensor groups that need to be copied to main memory by
>> OCC. Sensor groups like power, temperature, current, voltage,
>> frequency, utilization can be enabled/disabled at runtime.
>>
>> Signed-off-by: Shilpasri G Bhat 
>> ---
>> The skiboot patch for the opal call is posted below:
>> https://lists.ozlabs.org/pipermail/skiboot/2017-November/009713.html
> 
> Can you remind me why we're doing this with a completely bespoke sysfs
> API, rather than using some generic sensors API?
> 

Disabling/enabling sensor groups is not supported in the current generic sensors
API. Also, we don't export all types of sensors in HWMON, as not all of them are
environment sensors (performance sensors, for example).

> And if we must do it that way, please add documentation for the sysfs
> file(s) in Documentation/ABI/.
> 

Will do.

Thanks and Regards,
Shilpa

> cheers
> 



Re: [PATCH][V2] crypto/nx: fix spelling mistake: "availavle" -> "available"

2017-11-28 Thread Herbert Xu
On Tue, Nov 14, 2017 at 02:32:17PM +, Colin King wrote:
> From: Colin Ian King 
> 
> Trivial fix to spelling mistake in pr_err error message text. Also
> fix spelling mistake in preceding comment.
> 
> Signed-off-by: Colin Ian King 

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: a3b2cb30 broken panic reporting for qemu guests

2017-11-28 Thread Nicholas Piggin
On Wed, 29 Nov 2017 15:06:52 +1100
David Gibson  wrote:

> a3b2cb30 "powerpc: Do not call ppc_md.panic in fadump panic notifier"
> purports to fix a problem when the kernel panics with fadump not
> registered, but it breaks something else instead.  I _think_ it was
> working on the incorrect assumption that ppc_md.panic was (or should
> be) only used with fadump, but I'm not really sure.
> 
> Panic works with kdump enabled, and (I think) with fadump enabled.
> However, with neither of these enabled, we always go to the generic
> panic logic.

Yeah thanks, I can't remember what assumption I was working on tbh.
 
> That's incorrect for PAPR guests - they should call ibm,os-term via
> RTAS.  Under qemu this leads to a "GUEST_PANICKED" event notification
> which higher-level management pays attention to.  Since a3b2cb30 we
> now reboot instead of reporting that.
> 
> I believe it will also break panic for PS3 machines, but since that
> platform basically no longer exists, we probably don't care.

I (hope) it should just go down to the normal panic path and not do
much worse than it already does -- although it won't print out that
message.

> I'm not entirely sure how to fix this.  I _think_ what we want is to
> call ppc_md.panic from a late panic notifier, the way this patch does
> for fadump_panic_event() if fadump is registered.

The problem I had there is that some of the printk and console stuff
wasn't getting flushed out, so I was getting a blank screen. This was
probably in conjunction with panicking from NMI context that we're now
starting to introduce.

So it's a bit annoying. There's other ugliness we have for being unable
to control panic code well enough from arch code
(arch/powerpc/platforms/powernv/opal.c)

I guess a really minimal fix is to put an #ifdef powerpc down the bottom
there (/me *cries*).



a3b2cb30 broken panic reporting for qemu guests

2017-11-28 Thread David Gibson
a3b2cb30 "powerpc: Do not call ppc_md.panic in fadump panic notifier"
purports to fix a problem when the kernel panics with fadump not
registered, but it breaks something else instead.  I _think_ it was
working on the incorrect assumption that ppc_md.panic was (or should
be) only used with fadump, but I'm not really sure.

Panic works with kdump enabled, and (I think) with fadump enabled.
However, with neither of these enabled, we always go to the generic
panic logic.

That's incorrect for PAPR guests - they should call ibm,os-term via
RTAS.  Under qemu this leads to a "GUEST_PANICKED" event notification
which higher-level management pays attention to.  Since a3b2cb30 we
now reboot instead of reporting that.

I believe it will also break panic for PS3 machines, but since that
platform basically no longer exists, we probably don't care.

I'm not entirely sure how to fix this.  I _think_ what we want is to
call ppc_md.panic from a late panic notifier, the way this patch does
for fadump_panic_event() if fadump is registered.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH v2 2/2] ASoC: fsl_ssi: call _fsl_ssi_set_dai_fmt() just once in AC'97 mode

2017-11-28 Thread Nicolin Chen
On Wed, Nov 22, 2017 at 12:55:14AM +0100, Maciej S. Szmigiero wrote:
> In AC'97 mode we configure and start SSI RX / TX on probe path via
> a call to _fsl_ssi_set_dai_fmt() function.
> We don't need to call this function again later and in fact don't want to
> do it since this function temporarily sets STCR, SRCR and SCR to some
> intermediate values.
> 
> Signed-off-by: Maciej S. Szmigiero 

Acked-by: Nicolin Chen 

> ---
> Changes from v1: The SACCST setup code was split out into a separate
> commit.
> 
>  sound/soc/fsl/fsl_ssi.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index 375aaaf6080d..70df6bf832df 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -1116,6 +1116,9 @@ static int fsl_ssi_set_dai_fmt(struct snd_soc_dai 
> *cpu_dai, unsigned int fmt)
>  {
>   struct fsl_ssi_private *ssi_private = snd_soc_dai_get_drvdata(cpu_dai);
>  
> + if (fsl_ssi_is_ac97(ssi_private))
> + return 0;
> +
>   return _fsl_ssi_set_dai_fmt(cpu_dai->dev, ssi_private, fmt);
>  }
>  
> 


Re: [alsa-devel] [PATCH v2 1/2] ASoC: fsl_ssi: only enable proper channel slots in AC'97 mode

2017-11-28 Thread Nicolin Chen
On Wed, Nov 22, 2017 at 12:54:26AM +0100, Maciej S. Szmigiero wrote:
> We need to make sure that only proper channel slots (in SACCST register)
> are enabled at playback start time since some AC'97 CODECs (like VT1613 on
> UDOO board) were observed requesting via SLOTREQ spurious ones just after
> an AC'97 link is started but before the CODEC is configured by its driver.
> When a bit for some channel slot is set in a SLOTREQ request then SSI sets
> the relevant bit in SACCST automatically, which then 'sticks' until it is
> manually unset.
> The SACCST register is not writable directly, we have to use SACCDIS and
> SACCEN registers to configure it instead (these aren't normal registers:
> writing a '1' bit at some position in SACCEN sets the relevant bit in
> SACCST; SACCDIS operates in a similar way but allows unsetting bits in
> SACCST).
> 
> Theoretically, this should be necessary only for the very first playback
> but since some CODECs are so untrustworthy and extra channel slots enabled
> mean ruined playback let's play safe here and make sure that no extra
> slots are enabled in SACCST every time a playback is started.
> 
> Signed-off-by: Maciej S. Szmigiero 

The inline comments feel over descriptive but not critical. Anyway,
I plan to do some clean up to this driver after all pending changes
get finalized. So,

Acked-by: Nicolin Chen 

> ---
> Changes from v1: Split out this part from
> "fsl_ssi: call _fsl_ssi_set_dai_fmt() just once in AC'97 mode" commit,
> describe the problem and its solution better both in the commit message and
> in the code, move the SACCST setup code into a separate function and call
> it from TX config instead of doing it from trigger handler function.
> 
>  sound/soc/fsl/fsl_ssi.c | 52 
> +++--
>  1 file changed, 46 insertions(+), 6 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index 48bb850a34d9..375aaaf6080d 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -574,8 +574,54 @@ static void fsl_ssi_rx_config(struct fsl_ssi_private 
> *ssi_private, bool enable)
>   fsl_ssi_config(ssi_private, enable, &ssi_private->rxtx_reg_val.rx);
>  }
>  
> +static void fsl_ssi_tx_ac97_saccst_setup(struct fsl_ssi_private *ssi_private)
> +{
> + struct regmap *regs = ssi_private->regs;
> +
> + /* no SACC{ST,EN,DIS} regs on imx21-class SSI */
> + if (!ssi_private->soc->imx21regs) {
> + /*
> +  * Note that these below aren't just normal registers.
> +  * They are a way to disable or enable bits in SACCST
> +  * register:
> +  * - writing a '1' bit at some position in SACCEN sets the
> +  * relevant bit in SACCST,
> +  * - writing a '1' bit at some position in SACCDIS unsets
> +  * the relevant bit in SACCST register.
> +  *
> +  * The two writes below first disable all channels slots,
> +  * then enable just slots 3 & 4 ("PCM Playback Left Channel"
> +  * and "PCM Playback Right Channel").
> +  */
> + regmap_write(regs, CCSR_SSI_SACCDIS, 0xff);
> + regmap_write(regs, CCSR_SSI_SACCEN, 0x300);
> + }
> +}
> +
>  static void fsl_ssi_tx_config(struct fsl_ssi_private *ssi_private, bool 
> enable)
>  {
> + /*
> +  * Why are we setting up SACCST everytime we are starting a
> +  * playback?
> +  * Some CODECs (like VT1613 CODEC on UDOO board) like to
> +  * (sometimes) set extra bits in their SLOTREQ requests.
> +  * When a bit is set in a SLOTREQ request then SSI sets the
> +  * relevant bit in SACCST automatically (it is enough if a bit was
> +  * set in a SLOTREQ just once, bits in SACCST are 'sticky').
> +  * If an extra slot gets enabled that's a disaster for playback
> +  * because some of normal left or right channel samples are
> +  * redirected instead to this extra slot.
> +  *
> +  * A workaround implemented in fsl-asoc-card of setting an
> +  * appropriate CODEC register so that slots 3 & 4 (the normal
> +  * stereo playback slots) are used for S/PDIF seems to mostly fix
> +  * this issue on the UDOO board but since this CODEC is so
> +  * untrustworthy let's play safe here and make sure that no extra
> +  * slots are enabled every time a playback is started.
> +  */
> + if (enable && fsl_ssi_is_ac97(ssi_private))
> + fsl_ssi_tx_ac97_saccst_setup(ssi_private);
> +
>   fsl_ssi_config(ssi_private, enable, &ssi_private->rxtx_reg_val.tx);
>  }
>  
> @@ -630,12 +676,6 @@ static void fsl_ssi_setup_ac97(struct fsl_ssi_private 
> *ssi_private)
>   regmap_write(regs, CCSR_SSI_SACNT,
>   CCSR_SSI_SACNT_AC97EN | CCSR_SSI_SACNT_FV);
>  
> - /* no SACC{ST,EN,DIS} regs on imx21-class SSI */
> - if (!ssi_private->soc->imx21regs) {
> -   
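
The set/clear behaviour described in the commit message above (writes to
SACCEN set bits in SACCST, writes to SACCDIS clear them) can be modelled in
plain C. This is only a sketch of the register semantics as the patch
describes them, not driver code: struct saccst_model and the model_*()
helpers are hypothetical names, and the values 0xff and 0x300 are simply the
two constants the patch writes.

```c
#include <stdint.h>

/* Hypothetical software model of the slot registers; not driver code. */
struct saccst_model {
	uint32_t saccst;	/* currently enabled AC'97 channel slots */
};

/* Writing '1' bits to SACCEN sets the matching bits in SACCST. */
static void model_write_saccen(struct saccst_model *m, uint32_t val)
{
	m->saccst |= val;
}

/* Writing '1' bits to SACCDIS clears the matching bits in SACCST. */
static void model_write_saccdis(struct saccst_model *m, uint32_t val)
{
	m->saccst &= ~val;
}

/*
 * Replays the two writes from the patch on top of whatever "sticky"
 * bits a CODEC's SLOTREQ may have left behind in SACCST.
 */
static uint32_t model_tx_ac97_saccst_setup(uint32_t spurious)
{
	struct saccst_model m = { .saccst = spurious };

	model_write_saccdis(&m, 0xff);	/* same constant as the patch */
	model_write_saccen(&m, 0x300);	/* slots 3 & 4 per the patch comment */
	return m.saccst;
}
```

In this model, spurious low-order bits left in SACCST are cleared by the
first write, and only the two playback slots remain set after the second.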

Resend: [PATCH V5 0/4] powerpc/devtree: Add support for 'ibm,drc-info' property

2017-11-28 Thread Michael Bringmann
Several properties in the DRC device tree format are replaced by
more compact representations to allow, for example, the encoding
of vast amounts of memory and/or reduced duplication of information
in related data structures.

"ibm,drc-info": This property, when present, replaces the following
four properties: "ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains".  This property is defined for all
dynamically reconfigurable platform nodes.  The "ibm,drc-info" elements
are intended to provide a more compact representation, and reduce some
search overhead.

"ibm,architecture.vec": Bidirectional communication mechanism between
the host system and the front end processor indicating what features
the host system supports and what features the front end processor will
actually provide.  In this case, we are indicating that the host system
can support the new device tree structure "ibm,drc-info".

Signed-off-by: Michael Bringmann 

Michael Bringmann (4):
  powerpc/firmware: Add definitions for new drc-info firmware feature.
  pseries/drc-info: Search new DRC properties for CPU indexes
  hotplug/drc-info: Add code to search new devtree property
  powerpc: Enable support for new DRC devtree property
---
Changes in V5:
  -- Simplify of_prop_next_u32 invocation
  -- Remove unnecessary WARN_ON() tests



Re: [PATCH v2 00/10] posix_clocks: Prepare syscalls for 64 bit time_t conversion

2017-11-28 Thread Deepa Dinamani
On Tue, Nov 28, 2017 at 6:17 AM, Arnd Bergmann  wrote:
> On Mon, Nov 27, 2017 at 11:29 PM, Deepa Dinamani  
> wrote:
>>>> I decided against using LEGACY_TIME_SYSCALLS to conditionally compile
>>>> legacy time syscalls such as sys_nanosleep because this will need to
>>>> enclose compat_sys_nanosleep as well. So, defining it as
>>>>
>>>> config LEGACY_TIME_SYSCALLS
>>>>  def_bool 64BIT || !64BIT_TIME
>>>>
>>>> will not include compat_sys_nanosleep. We will instead need a new config to
>>>> exclusively mark legacy syscalls.
>>>
>>> Do you mean we would need to do this separately for native and compat
>>> syscalls, and have yet another option, like LEGACY_TIME_SYSCALLS
>>> and LEGACY_TIME_COMPAT_SYSCALLS, to cover all cases? I would
>>> think that CONFIG_COMPAT_32BIT_TIME handles all the compat versions,
>>> while CONFIG_LEGACY_TIME_SYSCALLS handles all the native ones.
>>
>> I meant sys_nanosleep would be covered by LEGACY_TIME_SYSCALLS, but
>> compat_sys_nanosleep would be covered by CONFIG_COMPAT_32BIT_TIME
>> along with other compat syscalls.
>> So, if we define the LEGACY_TIME_SYSCALLS as
>>
>>
>> "This controls the compilation of the following system calls:
>> time, stime, gettimeofday, settimeofday, adjtimex, nanosleep,
>> alarm, getitimer,
>> setitimer, select, utime, utimes, futimesat, and
>> {old,new}{l,f,}stat{,64}.
>> These all pass 32-bit time_t arguments on 32-bit architectures and
>> are replaced by other interfaces (e.g. posix timers and clocks, 
>> statx).
>> C libraries implementing 64-bit time_t in 32-bit architectures have 
>> to
>> implement the handles by wrapping around the newer interfaces.
>> New architectures should not explicitly enable this."
>>
>> This would not be really true as compat interfaces have nothing to do
>> with this config.
>>
>> I was proposing that we could have LEGACY_TIME_SYSCALLS config, but
>> then have all these "deprecated" syscalls be enclosed within this,
>> compat or not.
>> This will also mean that we will have to come up with a way of representing
>> these syscalls in the syscall header files.
>> This can be a separate patch and this series can be merged as is if
>> everyone agrees.
>
> I think doing this separately  would be good, I don't see any interdependency
> with the other patches, we just need to decide what we want in the long
> run.

Right. There are three options:

1. Use two configs to identify which syscalls need not be supported by
new architectures.
In this case it makes sense to say LEGACY_TIME_SYSCALLS and
COMPAT_32BIT_TIME both need to be disabled for new architectures. And,
I can reword the config to what you mention below.

2. Make the LEGACY_TIME_SYSCALLS eliminate non y2038 safe syscalls
mentioned below only.
In this case only the native and compat functions of the below
mentioned syscalls need to be identified by the config. I like this
option as this clearly identifies which syscalls are deprecated and do
not have a 64 bit counterpart. Not all architectures need to support
turning this off.

3. If we don't need either 1 or 2, then we could stick with what we
have today in the series as CONFIG_64BIT_TIME will be deleted and they
only need #ifdef CONFIG_64BIT.

Let me know if anyone prefers something else.

> I agree my text that you cited doesn't capture the situation correctly,
> as this is really about the obsolete system calls that take 64-bit time_t
> arguments on architectures that are converted to allow 64-bit time_t
> for non-obsolete system calls.
>
> Maybe it's better to just reword this to
>
>   "This controls the compilation of the following system calls:
>   time, stime, gettimeofday, settimeofday, adjtimex, nanosleep,
> alarm, getitimer,
>   setitimer, select, utime, utimes, futimesat, and 
> {old,new}{l,f,}stat{,64}.
>   These are all replaced by other interfaces (e.g. posix timers and 
> clocks,
>   statx) on architectures that got converted from 32-bit time_t to
> 64-bit time_t.
>   C libraries implementing 64-bit time_t in 32-bit architectures have to
>   implement the handles by wrapping around the newer interfaces.
>   New architectures should not explicitly enable this."
>
> That would clarify that it's not about the compat system calls, while
> also allowing the two options to be set independently.

-Deepa
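
The "def_bool 64BIT || !64BIT_TIME" expression quoted earlier in this message
can be written out as a small truth function to see why it cannot, on its
own, describe the compat entry points. This is a sketch under stated
assumptions: sketch_legacy_time_syscalls() is a hypothetical helper, and
the two booleans merely stand in for the Kconfig symbols named in the thread.

```c
#include <stdbool.h>

/*
 * Mirrors "def_bool 64BIT || !64BIT_TIME" from the quoted fragment.
 * The arguments stand in for CONFIG_64BIT and CONFIG_64BIT_TIME.
 */
static bool sketch_legacy_time_syscalls(bool cfg_64bit, bool cfg_64bit_time)
{
	return cfg_64bit || !cfg_64bit_time;
}
```

With this definition the symbol is true for every 64-bit configuration,
whatever the value of 64BIT_TIME. Read that way, it says nothing about
whether compat_sys_nanosleep should be built on a 64-bit kernel, which seems
to be why the thread discusses a separate symbol such as COMPAT_32BIT_TIME.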


Resend: [PATCH V5 4/4] powerpc: Enable support for ibm,drc-info devtree property

2017-11-28 Thread Michael Bringmann
prom_init.c: Enable support for new DRC device tree property
"ibm,drc-info" in initial handshake between the Linux kernel and
the front end processor.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/kernel/prom_init.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 02190e9..f962908 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -873,6 +873,7 @@ struct ibm_arch_vec __cacheline_aligned 
ibm_architecture_vec = {
.mmu = 0,
.hash_ext = 0,
.radix_ext = 0,
+   .byte22 = OV5_FEAT(OV5_DRC_INFO),
},
 
/* option vector 6: IBM PAPR hints */



Resend: [PATCH V5 3/4] hotplug/drc-info: Add code to search ibm,drc-info property

2017-11-28 Thread Michael Bringmann
rpadlpar_core.c: Provide parallel routines to search the older device-
tree properties ("ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains"), or the new property "ibm,drc-info".

The interface to examine the DRC information is changed from a "get"
function that returns values for local verification elsewhere, to a
"check" function that validates the 'name' and/or 'type' of a device
node.  This update hides the format of the underlying device-tree
properties, and concentrates the value checks into a single function
without requiring the user to verify whether a search was successful.

Signed-off-by: Michael Bringmann 
---
Changes in V5:
  -- Simplify of_prop_next_u32 invocation
  -- Fix some spacing within arguments
---
 drivers/pci/hotplug/rpadlpar_core.c |   13 ++--
 drivers/pci/hotplug/rpaphp.h|4 +
 drivers/pci/hotplug/rpaphp_core.c   |  109 +++
 3 files changed, 91 insertions(+), 35 deletions(-)

diff --git a/drivers/pci/hotplug/rpadlpar_core.c 
b/drivers/pci/hotplug/rpadlpar_core.c
index a3449d7..fc01d7d 100644
--- a/drivers/pci/hotplug/rpadlpar_core.c
+++ b/drivers/pci/hotplug/rpadlpar_core.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "../pci.h"
 #include "rpaphp.h"
@@ -44,15 +45,14 @@ static struct device_node *find_vio_slot_node(char 
*drc_name)
 {
struct device_node *parent = of_find_node_by_name(NULL, "vdevice");
struct device_node *dn = NULL;
-   char *name;
int rc;
 
if (!parent)
return NULL;
 
while ((dn = of_get_next_child(parent, dn))) {
-   rc = rpaphp_get_drc_props(dn, NULL, &name, NULL, NULL);
-   if ((rc == 0) && (!strcmp(drc_name, name)))
+   rc = rpaphp_check_drc_props(dn, drc_name, NULL);
+   if (rc == 0)
break;
}
 
@@ -64,15 +64,12 @@ static struct device_node *find_php_slot_pci_node(char 
*drc_name,
  char *drc_type)
 {
struct device_node *np = NULL;
-   char *name;
-   char *type;
int rc;
 
while ((np = of_find_node_by_name(np, "pci"))) {
-   rc = rpaphp_get_drc_props(np, NULL, &name, &type, NULL);
+   rc = rpaphp_check_drc_props(np, drc_name, drc_type);
if (rc == 0)
-   if (!strcmp(drc_name, name) && !strcmp(drc_type, type))
-   break;
+   break;
}
 
return np;
diff --git a/drivers/pci/hotplug/rpaphp.h b/drivers/pci/hotplug/rpaphp.h
index 7db024e..8db5f2e 100644
--- a/drivers/pci/hotplug/rpaphp.h
+++ b/drivers/pci/hotplug/rpaphp.h
@@ -91,8 +91,8 @@ struct slot {
 
 /* rpaphp_core.c */
 int rpaphp_add_slot(struct device_node *dn);
-int rpaphp_get_drc_props(struct device_node *dn, int *drc_index,
-   char **drc_name, char **drc_type, int *drc_power_domain);
+int rpaphp_check_drc_props(struct device_node *dn, char *drc_name,
+   char *drc_type);
 
 /* rpaphp_slot.c */
 void dealloc_slot_struct(struct slot *slot);
diff --git a/drivers/pci/hotplug/rpaphp_core.c 
b/drivers/pci/hotplug/rpaphp_core.c
index 1e29aba..6da613a 100644
--- a/drivers/pci/hotplug/rpaphp_core.c
+++ b/drivers/pci/hotplug/rpaphp_core.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include/* for eeh_add_device() */
 #include   /* rtas_call */
 #include /* for pci_controller */
@@ -196,25 +197,21 @@ static int get_children_props(struct device_node *dn, 
const int **drc_indexes,
return 0;
 }
 
-/* To get the DRC props describing the current node, first obtain it's
- * my-drc-index property.  Next obtain the DRC list from it's parent.  Use
- * the my-drc-index for correlation, and obtain the requested properties.
+
+/* Verify the existence of 'drc_name' and/or 'drc_type' within the
+ * current node.  First obtain it's my-drc-index property.  Next,
+ * obtain the DRC info from it's parent.  Use the my-drc-index for
+ * correlation, and obtain/validate the requested properties.
  */
-int rpaphp_get_drc_props(struct device_node *dn, int *drc_index,
-   char **drc_name, char **drc_type, int *drc_power_domain)
+
+static int rpaphp_check_drc_props_v1(struct device_node *dn, char *drc_name,
+   char *drc_type, unsigned int my_index)
 {
+   char *name_tmp, *type_tmp;
const int *indexes, *names;
const int *types, *domains;
-   const unsigned int *my_index;
-   char *name_tmp, *type_tmp;
int i, rc;
 
-   my_index = of_get_property(dn, "ibm,my-drc-index", NULL);
-   if (!my_index) {
-   /* Node isn't DLPAR/hotplug capable */
-   return -EINVAL;
-   }
-
rc = get_children_props(dn->parent, &indexes, &names, &types, &domains);
if (rc < 0) {
return -EINVAL;
@@ -225,24 +222,86 @@ int 
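
The move from a "get" interface to a "check" interface described in the
commit message can be sketched outside the kernel as follows. The struct and
function names below are hypothetical stand-ins, and the entry is a flattened
view invented for illustration. The sketch only shows the calling convention,
where a NULL argument means "do not compare this field" and the caller tests
a single return code.

```c
#include <string.h>

/* Hypothetical flattened view of one DRC entry; not a kernel structure. */
struct sketch_drc_entry {
	const char *name;
	const char *type;
};

/*
 * "check"-style lookup: returns 0 when the entry matches every non-NULL
 * argument and -1 otherwise. The comparison that callers used to do
 * themselves after a "get" call now lives in one place.
 */
static int sketch_check_drc_props(const struct sketch_drc_entry *e,
				  const char *drc_name, const char *drc_type)
{
	if (!e)
		return -1;
	if (drc_name && strcmp(drc_name, e->name))
		return -1;
	if (drc_type && strcmp(drc_type, e->type))
		return -1;
	return 0;
}
```

A caller that only cares about the name passes NULL for the type, as
find_vio_slot_node() does in the diff above.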

Resend: [PATCH V5 2/4] pseries/drc-info: Search DRC properties for CPU indexes

2017-11-28 Thread Michael Bringmann
pseries/drc-info: Provide parallel routines to convert between
drc_index and CPU numbers at runtime, using the older device-tree
properties ("ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains"), or the new property "ibm,drc-info".

Signed-off-by: Michael Bringmann 
---
Changes in V5:
  -- Simplify of_prop_next_u32 invocation
  -- Remove unnecessary WARN_ON() tests
---
 arch/powerpc/include/asm/prom.h |   15 +++
 arch/powerpc/platforms/pseries/of_helpers.c |   60 +++
 arch/powerpc/platforms/pseries/pseries_energy.c |  126 ++-
 3 files changed, 173 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 3243455..0ef41b1 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -96,6 +96,21 @@ struct of_drconf_cell {
 #define DRCONF_MEM_AI_INVALID  0x0040
 #define DRCONF_MEM_RESERVED0x0080
 
+struct of_drc_info {
+   char *drc_type;
+   char *drc_name_prefix;
+   u32 drc_index_start;
+   u32 drc_name_suffix_start;
+   u32 num_sequential_elems;
+   u32 sequential_inc;
+   u32 drc_power_domain;
+   u32 last_drc_index;
+};
+
+extern int of_read_drc_info_cell(struct property **prop,
+   const __be32 **curval, struct of_drc_info *data);
+
+
 /*
  * There are two methods for telling firmware what our capabilities are.
  * Newer machines have an "ibm,client-architecture-support" method on the
diff --git a/arch/powerpc/platforms/pseries/of_helpers.c 
b/arch/powerpc/platforms/pseries/of_helpers.c
index 7e75101..6df192f 100644
--- a/arch/powerpc/platforms/pseries/of_helpers.c
+++ b/arch/powerpc/platforms/pseries/of_helpers.c
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "of_helpers.h"
 
@@ -37,3 +38,62 @@ struct device_node *pseries_of_derive_parent(const char 
*path)
kfree(parent_path);
return parent ? parent : ERR_PTR(-EINVAL);
 }
+
+
+/* Helper Routines to convert between drc_index to cpu numbers */
+
+int of_read_drc_info_cell(struct property **prop, const __be32 **curval,
+   struct of_drc_info *data)
+{
+   const char *p;
+   const __be32 *p2;
+
+   if (!data)
+   return -EINVAL;
+
+   /* Get drc-type:encode-string */
+   p = data->drc_type = (char*) (*curval);
+   p = of_prop_next_string(*prop, p);
+   if (!p)
+   return -EINVAL;
+
+   /* Get drc-name-prefix:encode-string */
+   data->drc_name_prefix = (char *)p;
+   p = of_prop_next_string(*prop, p);
+   if (!p)
+   return -EINVAL;
+
+   /* Get drc-index-start:encode-int */
+   p2 = (const __be32 *)p;
+   p2 = of_prop_next_u32(*prop, p2, &data->drc_index_start);
+   if (!p2)
+   return -EINVAL;
+
+   /* Get drc-name-suffix-start:encode-int */
+   p2 = of_prop_next_u32(*prop, p2, &data->drc_name_suffix_start);
+   if (!p2)
+   return -EINVAL;
+
+   /* Get number-sequential-elements:encode-int */
+   p2 = of_prop_next_u32(*prop, p2, &data->num_sequential_elems);
+   if (!p2)
+   return -EINVAL;
+
+   /* Get sequential-increment:encode-int */
+   p2 = of_prop_next_u32(*prop, p2, &data->sequential_inc);
+   if (!p2)
+   return -EINVAL;
+
+   /* Get drc-power-domain:encode-int */
+   p2 = of_prop_next_u32(*prop, p2, &data->drc_power_domain);
+   if (!p2)
+   return -EINVAL;
+
+   /* Should now know end of current entry */
+   (*curval) = (void *)p2;
+   data->last_drc_index = data->drc_index_start +
+   ((data->num_sequential_elems - 1) * data->sequential_inc);
+
+   return 0;
+}
+EXPORT_SYMBOL(of_read_drc_info_cell);
diff --git a/arch/powerpc/platforms/pseries/pseries_energy.c 
b/arch/powerpc/platforms/pseries/pseries_energy.c
index 35c891a..f96677b 100644
--- a/arch/powerpc/platforms/pseries/pseries_energy.c
+++ b/arch/powerpc/platforms/pseries/pseries_energy.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 #define MODULE_VERS "1.0"
@@ -38,26 +39,58 @@
 static u32 cpu_to_drc_index(int cpu)
 {
struct device_node *dn = NULL;
-   const int *indexes;
-   int i;
+   int thread_index;
int rc = 1;
u32 ret = 0;
 
dn = of_find_node_by_path("/cpus");
if (dn == NULL)
goto err;
-   indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
-   if (indexes == NULL)
-   goto err_of_node_put;
+
/* Convert logical cpu number to core number */
-   i = cpu_core_index_of_thread(cpu);
-   /*
-* The first element indexes[0] is the number of drc_indexes
-* returned in the list.  Hence i+1 will get the drc_index
-* corresponding to core number i.
-*/
-   WARN_ON(i > indexes[0]);
-   ret = indexes[i + 1];
+   
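
The last_drc_index arithmetic in of_read_drc_info_cell() above suggests how a
single "ibm,drc-info" entry can stand for a whole run of DRC indexes. The
following is a minimal user-space sketch of that idea. struct drc_range is a
reduced, hypothetical copy of the numeric fields of struct of_drc_info, and
the membership rule in drc_range_contains() is an assumption made for
illustration, not something taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced, hypothetical copy of the numeric fields of struct of_drc_info. */
struct drc_range {
	uint32_t drc_index_start;
	uint32_t num_sequential_elems;
	uint32_t sequential_inc;
};

/* Same arithmetic as the patch: index of the final element in the run. */
static uint32_t drc_range_last_index(const struct drc_range *r)
{
	return r->drc_index_start +
	       (r->num_sequential_elems - 1) * r->sequential_inc;
}

/* Assumed rule: an index belongs to the entry if it lies on the stride. */
static bool drc_range_contains(const struct drc_range *r, uint32_t idx)
{
	if (r->num_sequential_elems == 0)
		return false;
	if (idx < r->drc_index_start || idx > drc_range_last_index(r))
		return false;
	if (r->sequential_inc == 0)
		return idx == r->drc_index_start;
	return (idx - r->drc_index_start) % r->sequential_inc == 0;
}
```

Three integers per entry replace a list with one element per index, which is
where the more compact representation mentioned in the cover letter comes
from.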

Resend: [PATCH V5 1/4] powerpc/firmware: Add definitions for new drc-info firmware feature

2017-11-28 Thread Michael Bringmann
Firmware Features: Define new bit flag representing the presence of
new device tree property "ibm,drc-info".  The flag is used to tell
the front end processor whether the Linux kernel supports the new
property, and by the front end processor to tell the Linux kernel
that the new property is present in the device tree.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/include/asm/firmware.h   |3 ++-
 arch/powerpc/include/asm/prom.h   |1 +
 arch/powerpc/platforms/pseries/firmware.c |1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/firmware.h 
b/arch/powerpc/include/asm/firmware.h
index 8645897..329d537 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -51,6 +51,7 @@
 #define FW_FEATURE_BEST_ENERGY ASM_CONST(0x8000)
 #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0001)
 #define FW_FEATURE_PRRNASM_CONST(0x0002)
+#define FW_FEATURE_DRC_INFOASM_CONST(0x0004)
 
 #ifndef __ASSEMBLY__
 
@@ -67,7 +68,7 @@ enum {
FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
-   FW_FEATURE_HPT_RESIZE,
+   FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRC_INFO,
FW_FEATURE_PSERIES_ALWAYS = 0,
FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
FW_FEATURE_POWERNV_ALWAYS = 0,
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 825bd59..3243455 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -175,6 +175,7 @@ struct of_drconf_cell {
 #define OV5_HASH_GTSE  0x1940  /* Guest Translation Shoot Down Avail */
 /* Radix Table Extensions */
 #define OV5_RADIX_GTSE 0x1A40  /* Guest Translation Shoot Down Avail */
+#define OV5_DRC_INFO   0x1640  /* Redef Prop Structures: drc-info   */
 
 /* Option Vector 6: IBM PAPR hints */
 #define OV6_LINUX  0x02/* Linux is our OS */
diff --git a/arch/powerpc/platforms/pseries/firmware.c 
b/arch/powerpc/platforms/pseries/firmware.c
index 63cc82a..757d757 100644
--- a/arch/powerpc/platforms/pseries/firmware.c
+++ b/arch/powerpc/platforms/pseries/firmware.c
@@ -114,6 +114,7 @@ struct vec5_fw_feature {
 vec5_fw_features_table[] = {
{FW_FEATURE_TYPE1_AFFINITY, OV5_TYPE1_AFFINITY},
{FW_FEATURE_PRRN,   OV5_PRRN},
+   {FW_FEATURE_DRC_INFO,   OV5_DRC_INFO},
 };
 
 static void __init fw_vec5_feature_init(const char *vec5, unsigned long len)
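
The table entry added above pairs a firmware feature flag with an option
vector 5 constant. Judging from the value 0x1640 and from the
".byte22 = OV5_FEAT(OV5_DRC_INFO)" assignment in patch 4/4 (0x16 is 22), the
constant appears to pack a byte offset in the upper byte and a bit mask in
the lower byte. The sketch below assumes that encoding. All SKETCH_* and
sketch_* names are hypothetical, and the flag value is a placeholder rather
than the real FW_FEATURE_DRC_INFO constant.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: high byte = offset into vector 5, low byte = bit mask. */
#define SKETCH_OV5_INDX(x)	((x) >> 8)
#define SKETCH_OV5_FEAT(x)	((x) & 0xff)

#define SKETCH_OV5_DRC_INFO	0x1640		/* value taken from the patch */
#define SKETCH_FW_DRC_INFO	(1ull << 0)	/* placeholder flag bit */

struct sketch_vec5_feature {
	uint64_t fw_feature;
	unsigned int ov5;
};

static const struct sketch_vec5_feature sketch_table[] = {
	{ SKETCH_FW_DRC_INFO, SKETCH_OV5_DRC_INFO },
};

/* Collect the feature flags whose bit is set in the negotiated vector. */
static uint64_t sketch_vec5_features(const uint8_t *vec5, size_t len)
{
	uint64_t found = 0;
	size_t i;

	for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
		unsigned int index = SKETCH_OV5_INDX(sketch_table[i].ov5);
		unsigned int feat = SKETCH_OV5_FEAT(sketch_table[i].ov5);

		/* Ignore entries that point past the end of the vector. */
		if (index < len && (vec5[index] & feat))
			found |= sketch_table[i].fw_feature;
	}
	return found;
}
```

Under this assumption, the flag is reported only when bit 0x40 of byte 22 is
set in the vector returned by firmware.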



[PATCH V8 3/3] hotplug/cpu: Fix crash with memoryless nodes

2017-11-28 Thread Michael Bringmann
On powerpc systems with shared configurations of CPUs and memory and
memoryless nodes at boot, an event ordering problem was observed on
a SLES12 build platform with the hot-add of CPUs to the memoryless
nodes.

* The most common error occurred when the memory SLAB driver attempted
  to reference the memoryless node to which a CPU was being added
  before the kernel had finished initializing all of the data structures
  for the CPU and exited 'device_online' under DLPAR/hot-add.

  Normally the memoryless node would be initialized through the call
  path device_online ... arch_update_cpu_topology ... find_cpu_nid
  ...  try_online_node.  This patch ensures that the powerpc node will
  be initialized as early as possible, even if it was memoryless and
  CPU-less at the point when we are trying to hot-add a new CPU to it.

Signed-off-by: Michael Bringmann 
---
Changes in V8:
  -- Change a 'printk(KERN_INFO ...)' statement to be a pr_debug()
 statement.
  -- Rename 'find_cpu_nid' to 'find_and_online_cpu_nid' for better
 clarity of its function.
---
 arch/powerpc/mm/numa.c   |4 +++-
 arch/powerpc/platforms/pseries/hotplug-cpu.c |3 +++
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 6b08dd8..a182f9e 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1307,7 +1307,7 @@ static long vphn_get_associativity(unsigned long cpu,
return rc;
 }
 
-static inline int find_and_online_cpu_nid(int cpu)
+int find_and_online_cpu_nid(int cpu)
 {
__be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
int new_nid;
@@ -1340,6 +1340,8 @@ static inline int find_and_online_cpu_nid(int cpu)
 #endif
}
 
+   pr_debug("%s:%d cpu %d nid %d\n", __FUNCTION__, __LINE__,
+   cpu, new_nid);
return new_nid;
 }
 
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c 
b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index a7d14aa7..dceb514 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -340,6 +340,8 @@ static void pseries_remove_processor(struct device_node *np)
cpu_maps_update_done();
 }
 
+extern int find_and_online_cpu_nid(int cpu);
+
 static int dlpar_online_cpu(struct device_node *dn)
 {
int rc = 0;
@@ -364,6 +366,7 @@ static int dlpar_online_cpu(struct device_node *dn)
!= CPU_STATE_OFFLINE);
cpu_maps_update_done();
timed_topology_update(1);
+   find_and_online_cpu_nid(cpu);
rc = device_online(get_cpu_device(cpu));
if (rc)
goto out;



[PATCH V8 2/3] powerpc/initnodes: Ensure nodes initialized for hotplug

2017-11-28 Thread Michael Bringmann
On powerpc systems which allow 'hot-add' of CPUs, it may occur that
the new resources are to be inserted into nodes that were not used
for memory resources at bootup.  Many different configurations of
PowerPC resources may need to be supported depending upon the
environment.  Important characteristics of the nodes and operating
environment include:

* Dedicated vs. shared CPUs.  Shared CPUs require information such
  as the VPHN hcall for CPU assignment to nodes, since shared CPUs
  have their affinity set to node 0 at boot and when hot-added.
  Associativity decisions made based on dedicated resource rules,
  such as associativity properties in the device tree, may vary from
  decisions made using the values returned by the VPHN hcall.
* memoryless nodes at boot.  Nodes need to be defined as 'possible'
  at boot for operation with other code modules.  Previously, the
  powerpc code would limit the set of possible nodes to those which
  have memory assigned at boot, and were thus online.  Subsequent
  add/remove of CPUs or memory would only work with this subset of
  possible nodes.
* memoryless nodes with CPUs at boot.  Due to the previous restriction
  on nodes, nodes that had CPUs but no memory were being collapsed
  into other nodes that did have memory at boot.  In practice this
  meant that the node assignment presented by the runtime kernel
  differed from the affinity and associativity attributes presented
  by the device tree or VPHN hcalls.  Nodes that might be known to
  the pHyp were not 'possible' in the runtime kernel because they did
  not have memory at boot.

This patch fixes some problems encountered at runtime with
configurations that support memory-less nodes, or that hot-add CPUs
into nodes that are memoryless during system execution after boot.
The problems of interest include,

* Nodes known to powerpc to be memoryless at boot, but to have
  CPUs in them are allowed to be 'possible' and 'online'.  Memory
  allocations for those nodes are taken from another node that does
  have memory until and if memory is hot-added to the node.
* Nodes which have no resources assigned at boot, but which may still
  be referenced subsequently by affinity or associativity attributes,
  are kept in the list of 'possible' nodes for powerpc.  Hot-add of
  memory or CPUs to the system can reference these nodes and bring
  them online instead of redirecting the references to one of the set
  of nodes known to have memory at boot.
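
The fallback behaviour in the first bullet can be sketched in userspace C
as follows. This is only a model under stated assumptions: struct
node_state and pick_alloc_node() are hypothetical names, and "first online
node with memory" is a simplification of how the kernel actually orders
fallback nodes.

```c
#include <stdbool.h>

#define MAX_NODES 8

/* Hypothetical per-node state; the kernel tracks this in node masks. */
struct node_state {
	bool online;
	bool has_memory;
};

/*
 * Pick the node that satisfies an allocation requested for 'nid'.  A
 * memoryless node falls back to the first online node with memory.
 * Returns -1 if no node has memory.
 */
static int pick_alloc_node(const struct node_state *nodes, int nid)
{
	int i;

	if (nid >= 0 && nid < MAX_NODES &&
	    nodes[nid].online && nodes[nid].has_memory)
		return nid;

	for (i = 0; i < MAX_NODES; i++)
		if (nodes[i].online && nodes[i].has_memory)
			return i;

	return -1;
}
```

Once memory is hot-added to the node (has_memory becomes true in this
model), allocations for that node are served locally again.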

Note that this software operates under the context of CPU hotplug.
We are not doing memory hotplug in this code, but rather updating
the kernel's CPU topology (i.e. arch_update_cpu_topology /
numa_update_cpu_topology).  We are initializing a node that may be
used by CPUs or memory before it can be referenced as invalid by a
CPU hotplug operation.  CPU hotplug operations are protected by a
range of APIs including cpu_maps_update_begin/cpu_maps_update_done,
cpus_read/write_lock / cpus_read/write_unlock, device locks, and more.
Memory hotplug operations, including try_online_node, are protected
by mem_hotplug_begin/mem_hotplug_done, device locks, and more.  In
the case of CPUs being hot-added to a previously memoryless node, the
try_online_node operation occurs wholly within the CPU locks with no
overlap.  Using HMC hot-add/hot-remove operations, we have been able
to add and remove CPUs to any possible node without failures.  HMC
operations involve a degree of self-serialization, though.

Signed-off-by: Michael Bringmann 
---
Changes in V8:
  -- Clarify 'resources' as CPUs in patch description regarding
 VPHN call.  Add another clause to statement mentioning that
 shared CPUs start in node 0, and are finally assigned per
 VPHN information.
  -- Rename 'find_cpu_nid' to 'find_and_online_cpu_nid' for better
 clarity of its function.
  -- Restore '__init' tag to definition of 'setup_node_data'
---
 arch/powerpc/mm/numa.c |   49 ++--
 1 file changed, 39 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 735e3fd..6b08dd8 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -551,7 +551,7 @@ static int numa_setup_cpu(unsigned long lcpu)
nid = of_node_to_nid_single(cpu);
 
 out_present:
-   if (nid < 0 || !node_online(nid))
+   if (nid < 0 || !node_possible(nid))
nid = first_online_node;
 
map_cpu_to_node(lcpu, nid);
@@ -910,10 +910,8 @@ static void __init find_possible_nodes(void)
goto out;
 
for (i = 0; i < numnodes; i++) {
-   if (!node_possible(i)) {
-   setup_node_data(i, 0, 0);
+   if (!node_possible(i))
node_set(i, node_possible_map);
-   }
}
 
 out:
@@ -1309,6 +1307,42 @@ static long vphn_get_associativity(unsigned long cpu,
return rc;
 }
 
+static inline int 

[PATCH V8 1/3] powerpc/nodes: Ensure enough nodes avail for operations

2017-11-28 Thread Michael Bringmann
On powerpc systems which allow 'hot-add' of CPU or memory resources,
it may occur that the new resources are to be inserted into nodes
that were not used for these resources at bootup.  In the kernel,
any node that is used must be defined and initialized.  These empty
nodes may occur when,

* Dedicated vs. shared resources.  Shared resources require
  information such as the VPHN hcall for CPU assignment to nodes.
  Associativity decisions made based on dedicated resource rules,
  such as associativity properties in the device tree, may vary
  from decisions made using the values returned by the VPHN hcall.
* memoryless nodes at boot.  Nodes need to be defined as 'possible'
  at boot for operation with other code modules.  Previously, the
  powerpc code would limit the set of possible nodes to those which
  have memory assigned at boot, and were thus online.  Subsequent
  add/remove of CPUs or memory would only work with this subset of
  possible nodes.
* memoryless nodes with CPUs at boot.  Due to the previous restriction
  on nodes, nodes that had CPUs but no memory were being collapsed
  into other nodes that did have memory at boot.  In practice this
  meant that the node assignment presented by the runtime kernel
  differed from the affinity and associativity attributes presented
  by the device tree or VPHN hcalls.  Nodes that might be known to
  the pHyp were not 'possible' in the runtime kernel because they did
  not have memory at boot.

This patch ensures that sufficient nodes are defined to support
configuration requirements after boot, as well as at boot.  This
patch set fixes a couple of problems.

* Nodes known to powerpc to be memoryless at boot, but to have
  CPUs in them are allowed to be 'possible' and 'online'.  Memory
  allocations for those nodes are taken from another node that does
  have memory until and if memory is hot-added to the node.
* Nodes which have no resources assigned at boot, but which may still
  be referenced subsequently by affinity or associativity attributes,
  are kept in the list of 'possible' nodes for powerpc.  Hot-add of
  memory or CPUs to the system can reference these nodes and bring
  them online instead of redirecting to one of the set of nodes that
  were known to have memory at boot.

This patch extracts the value of the lowest domain level (number of
allocable resources) from the device tree property
"ibm,max-associativity-domains" to use as the maximum number of nodes
to setup as possibly available in the system.  This new setting will
override the instruction,

nodes_and(node_possible_map, node_possible_map, node_online_map);

presently seen in the function arch/powerpc/mm/numa.c:initmem_init().

If the "ibm,max-associativity-domains" property is not present at boot,
no operation will be performed to define or enable additional nodes, or
enable the above 'nodes_and()'.
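
A minimal userspace sketch of the idea, assuming the property can be
modelled as a plain array of 32-bit cells and the possible-node map as a
64-bit mask. The names nodemask and extend_possible_nodes() are
hypothetical and do not appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one bit per node ID, bit set means 'possible'. */
typedef uint64_t nodemask;

/*
 * prop[] models the cells of "ibm,max-associativity-domains"; depth is
 * the index of the domain level that defines NUMA nodes.  Returns the
 * updated mask, or the input mask unchanged if the property is absent
 * or too short.
 */
static nodemask extend_possible_nodes(nodemask possible, const uint32_t *prop,
				      size_t len, size_t depth)
{
	uint32_t numnodes, i;

	if (!prop || depth >= len)
		return possible;

	numnodes = prop[depth];
	if (numnodes > 64)
		numnodes = 64;	/* clamp to the width of this toy mask */

	for (i = 0; i < numnodes; i++)
		possible |= (nodemask)1 << i;

	return possible;
}
```

When the property is missing, the mask computed from the online nodes is
left untouched, which mirrors the "no operation will be performed" case
described above.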

Signed-off-by: Michael Bringmann 
---
Changes in V8:
  -- Remove unneeded pr_info() statement
---
 arch/powerpc/mm/numa.c |   37 ++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index adb6364f..735e3fd 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -892,6 +892,34 @@ static void __init setup_node_data(int nid, u64 start_pfn, u64 end_pfn)
NODE_DATA(nid)->node_spanned_pages = spanned_pages;
 }
 
+static void __init find_possible_nodes(void)
+{
+   struct device_node *rtas;
+   u32 numnodes, i;
+
+   if (min_common_depth <= 0)
+   return;
+
+   rtas = of_find_node_by_path("/rtas");
+   if (!rtas)
+   return;
+
+   if (of_property_read_u32_index(rtas,
+   "ibm,max-associativity-domains",
+   min_common_depth, &numnodes))
+   goto out;
+
+   for (i = 0; i < numnodes; i++) {
+   if (!node_possible(i)) {
+   setup_node_data(i, 0, 0);
+   node_set(i, node_possible_map);
+   }
+   }
+
+out:
+   of_node_put(rtas);
+}
+
 void __init initmem_init(void)
 {
int nid, cpu;
@@ -905,12 +933,15 @@ void __init initmem_init(void)
memblock_dump_all();
 
/*
-* Reduce the possible NUMA nodes to the online NUMA nodes,
-* since we do not support node hotplug. This ensures that  we
-* lower the maximum NUMA node ID to what is actually present.
+* Modify the set of possible NUMA nodes to reflect information
+* available about the set of online nodes, and the set of nodes
+* that we expect to make use of for this platform's affinity
+* calculations.
 */
nodes_and(node_possible_map, node_possible_map, node_online_map);
 
+   find_possible_nodes();
+
for_each_online_node(nid) {
unsigned long start_pfn, end_pfn;
 



[PATCH V8 0/3] powerpc/nodes: Fix issues with memoryless nodes

2017-11-28 Thread Michael Bringmann
powerpc/nodes: Ensure enough nodes avail for operations

powerpc/initnodes: Ensure nodes initialized for hotplug

hotplug/cpu: Fix crash with memoryless nodes

Signed-off-by: Michael Bringmann 

Michael Bringmann (3):
  powerpc/nodes: Ensure enough nodes avail for operations
  powerpc/initnodes: Ensure nodes initialized for hotplug
  hotplug/cpu: Fix crash with memoryless nodes
---
Changes in V8:
  -- Remove unneeded pr_info() statement
  -- Clarify 'resources' as 'CPUs' in patch description regarding
 VPHN call.  Add another clause to statement mentioning that
 shared CPUs start in node 0, and are finally assigned per
 VPHN information.
  -- Change a 'printk(KERN_INFO ...)' statement to be a pr_debug()
 statement.
  -- Rename 'find_cpu_nid' to 'find_and_online_cpu_nid' for better
 clarity of its function.
  -- Restore '__init' tag to definition of 'setup_node_data'



Re: [PATCH V2 0/3] powerpc/hotplug: Fix affinity assoc for LPAR migration

2017-11-28 Thread Michael Bringmann
Hello:
I would like to pull / defer further consideration of this patch set
for a while.  I will be discussing changes here with respect to the
LMB optimizations that Nathan Fontenot is working upon.
A revision of this patch set will be sent out somewhat later.
Thanks for your attention and assistance.

Michael

On 11/16/2017 11:50 AM, Michael Bringmann wrote:
> The migration of LPARs across Power systems affects many attributes
> including that of the associativity of memory blocks and CPUs.  The
> patches in this set execute when a system is coming up fresh upon a
> migration target.  They are intended to,
> 
> * Recognize changes to the associativity of memory and CPUs recorded
>   in internal data structures when compared to the latest copies in
>   the device tree (e.g. ibm,dynamic-memory, ibm,dynamic-memory-v2,
>   cpus),
> * Recognize changes to the associativity mapping (e.g.
>   ibm,associativity-lookup-arrays), locate all assigned memory blocks
>   corresponding to each changed row, and re-add all such blocks.
> * Generate calls to other code layers to reset the data structures
>   related to associativity of the CPUs and memory.
> * Re-register the 'changed' entities into the target system.
>   Re-registration of CPUs and memory blocks mostly entails acting as
>   if they have been newly hot-added into the target system.
> 
> Signed-off-by: Michael Bringmann 
> 
> Michael Bringmann (3):
>   hotplug/mobility: Apply assoc lookup updates for Post Migration Topo
>   postmigration/memory: Review assoc lookup array changes
>   postmigration/memory: Associativity & 'ibm,dynamic-memory-v2'
> ---
> Changes in V2:
>   -- Try to improve patch header documentation.
>   -- Remove unnecessary spacing changes from patch
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com



[PATCH V5 2/4] pseries/drc-info: Search DRC properties for CPU indexes

2017-11-28 Thread Michael Bringmann
pseries/drc-info: Provide parallel routines to convert between
drc_index and CPU numbers at runtime, using the older device-tree
properties ("ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains"), or the new property "ibm,drc-info".
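
For illustration, the range arithmetic behind an "ibm,drc-info" entry can
be sketched in userspace C as below. struct drc_range is a cut-down,
hypothetical version of the patch's struct of_drc_info, and
drc_index_to_pos() is an assumed helper that is not part of the patch;
only the last-index formula is taken from of_read_drc_info_cell().

```c
#include <stdint.h>

/* Subset of the fields in the patch's struct of_drc_info. */
struct drc_range {
	uint32_t drc_index_start;
	uint32_t num_sequential_elems;
	uint32_t sequential_inc;
};

/* Last index in the range, as computed in of_read_drc_info_cell(). */
static uint32_t drc_last_index(const struct drc_range *r)
{
	return r->drc_index_start +
	       (r->num_sequential_elems - 1) * r->sequential_inc;
}

/*
 * Hypothetical helper: return the position of drc_index within the
 * range, or -1 if the index does not belong to it.
 */
static int drc_index_to_pos(const struct drc_range *r, uint32_t drc_index)
{
	uint32_t off;

	if (r->num_sequential_elems == 0 || r->sequential_inc == 0)
		return -1;
	if (drc_index < r->drc_index_start || drc_index > drc_last_index(r))
		return -1;

	off = drc_index - r->drc_index_start;
	if (off % r->sequential_inc)
		return -1;

	return (int)(off / r->sequential_inc);
}
```

One compact entry therefore stands in for a whole run of values that the
older properties list one by one.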

Signed-off-by: Michael Bringmann 
---
Changes in V5:
  -- Simplify of_prop_next_u32 invocation
  -- Remove unnecessary WARN_ON() tests
---
 arch/powerpc/include/asm/prom.h |   15 +++
 arch/powerpc/platforms/pseries/of_helpers.c |   60 +++
 arch/powerpc/platforms/pseries/pseries_energy.c |  126 ++-
 3 files changed, 173 insertions(+), 28 deletions(-)

diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 3243455..0ef41b1 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -96,6 +96,21 @@ struct of_drconf_cell {
 #define DRCONF_MEM_AI_INVALID  0x0040
 #define DRCONF_MEM_RESERVED 0x0080
 
+struct of_drc_info {
+   char *drc_type;
+   char *drc_name_prefix;
+   u32 drc_index_start;
+   u32 drc_name_suffix_start;
+   u32 num_sequential_elems;
+   u32 sequential_inc;
+   u32 drc_power_domain;
+   u32 last_drc_index;
+};
+
+extern int of_read_drc_info_cell(struct property **prop,
+   const __be32 **curval, struct of_drc_info *data);
+
+
 /*
  * There are two methods for telling firmware what our capabilities are.
  * Newer machines have an "ibm,client-architecture-support" method on the
diff --git a/arch/powerpc/platforms/pseries/of_helpers.c b/arch/powerpc/platforms/pseries/of_helpers.c
index 7e75101..6df192f 100644
--- a/arch/powerpc/platforms/pseries/of_helpers.c
+++ b/arch/powerpc/platforms/pseries/of_helpers.c
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "of_helpers.h"
 
@@ -37,3 +38,62 @@ struct device_node *pseries_of_derive_parent(const char *path)
kfree(parent_path);
return parent ? parent : ERR_PTR(-EINVAL);
 }
+
+
+/* Helper Routines to convert between drc_index to cpu numbers */
+
+int of_read_drc_info_cell(struct property **prop, const __be32 **curval,
+   struct of_drc_info *data)
+{
+   const char *p;
+   const __be32 *p2;
+
+   if (!data)
+   return -EINVAL;
+
+   /* Get drc-type:encode-string */
+   p = data->drc_type = (char*) (*curval);
+   p = of_prop_next_string(*prop, p);
+   if (!p)
+   return -EINVAL;
+
+   /* Get drc-name-prefix:encode-string */
+   data->drc_name_prefix = (char *)p;
+   p = of_prop_next_string(*prop, p);
+   if (!p)
+   return -EINVAL;
+
+   /* Get drc-index-start:encode-int */
+   p2 = (const __be32 *)p;
+   p2 = of_prop_next_u32(*prop, p2, &data->drc_index_start);
+   if (!p2)
+   return -EINVAL;
+
+   /* Get drc-name-suffix-start:encode-int */
+   p2 = of_prop_next_u32(*prop, p2, &data->drc_name_suffix_start);
+   if (!p2)
+   return -EINVAL;
+
+   /* Get number-sequential-elements:encode-int */
+   p2 = of_prop_next_u32(*prop, p2, &data->num_sequential_elems);
+   if (!p2)
+   return -EINVAL;
+
+   /* Get sequential-increment:encode-int */
+   p2 = of_prop_next_u32(*prop, p2, &data->sequential_inc);
+   if (!p2)
+   return -EINVAL;
+
+   /* Get drc-power-domain:encode-int */
+   p2 = of_prop_next_u32(*prop, p2, &data->drc_power_domain);
+   if (!p2)
+   return -EINVAL;
+
+   /* Should now know end of current entry */
+   (*curval) = (void *)p2;
+   data->last_drc_index = data->drc_index_start +
+   ((data->num_sequential_elems - 1) * data->sequential_inc);
+
+   return 0;
+}
+EXPORT_SYMBOL(of_read_drc_info_cell);
diff --git a/arch/powerpc/platforms/pseries/pseries_energy.c b/arch/powerpc/platforms/pseries/pseries_energy.c
index 35c891a..f96677b 100644
--- a/arch/powerpc/platforms/pseries/pseries_energy.c
+++ b/arch/powerpc/platforms/pseries/pseries_energy.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 
 #define MODULE_VERS "1.0"
@@ -38,26 +39,58 @@
 static u32 cpu_to_drc_index(int cpu)
 {
struct device_node *dn = NULL;
-   const int *indexes;
-   int i;
+   int thread_index;
int rc = 1;
u32 ret = 0;
 
dn = of_find_node_by_path("/cpus");
if (dn == NULL)
goto err;
-   indexes = of_get_property(dn, "ibm,drc-indexes", NULL);
-   if (indexes == NULL)
-   goto err_of_node_put;
+
/* Convert logical cpu number to core number */
-   i = cpu_core_index_of_thread(cpu);
-   /*
-* The first element indexes[0] is the number of drc_indexes
-* returned in the list.  Hence i+1 will get the drc_index
-* corresponding to core number i.
-*/
-   WARN_ON(i > indexes[0]);
-   ret = indexes[i + 1];
+   

[PATCH V5 4/4] powerpc: Enable support for ibm,drc-info devtree property

2017-11-28 Thread Michael Bringmann
prom_init.c: Enable support for new DRC device tree property
"ibm,drc-info" in initial handshake between the Linux kernel and
the front end processor.

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/kernel/prom_init.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 02190e9..f962908 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -873,6 +873,7 @@ struct ibm_arch_vec __cacheline_aligned ibm_architecture_vec = {
.mmu = 0,
.hash_ext = 0,
.radix_ext = 0,
+   .byte22 = OV5_FEAT(OV5_DRC_INFO),
},
 
/* option vector 6: IBM PAPR hints */



[PATCH V5 3/4] hotplug/drc-info: Add code to search ibm,drc-info property

2017-11-28 Thread Michael Bringmann
rpadlpar_core.c: Provide parallel routines to search the older device-
tree properties ("ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains"), or the new property "ibm,drc-info".

The interface to examine the DRC information is changed from a "get"
function that returns values for local verification elsewhere, to a
"check" function that validates the 'name' and/or 'type' of a device
node.  This update hides the format of the underlying device-tree
properties, and concentrates the value checks into a single function
without requiring the user to verify whether a search was successful.
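
The difference between the two interface styles can be sketched as
follows. This is a hypothetical userspace model, not the patch's
rpaphp_check_drc_props(): check_drc_props() compares a single entry, and
treating a NULL argument as "do not care" is an assumption based on the
'name' and/or 'type' wording above.

```c
#include <string.h>

/*
 * Hypothetical model of a "check" style lookup: compare one DRC entry
 * against the wanted name and/or type.  A NULL argument means "do not
 * care".  Returns 0 on match, -1 otherwise.
 */
static int check_drc_props(const char *entry_name, const char *entry_type,
			   const char *want_name, const char *want_type)
{
	if (want_name && strcmp(want_name, entry_name) != 0)
		return -1;
	if (want_type && strcmp(want_type, entry_type) != 0)
		return -1;
	return 0;
}
```

With a "get" interface each caller repeats the strcmp() logic itself; with
a "check" interface the caller only tests the return code.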

Signed-off-by: Michael Bringmann 
---
Changes in V5:
  -- Simplify of_prop_next_u32 invocation
  -- Fix some spacing within arguments
---
 drivers/pci/hotplug/rpadlpar_core.c |   13 ++--
 drivers/pci/hotplug/rpaphp.h|4 +
 drivers/pci/hotplug/rpaphp_core.c   |  109 +++
 3 files changed, 91 insertions(+), 35 deletions(-)

diff --git a/drivers/pci/hotplug/rpadlpar_core.c b/drivers/pci/hotplug/rpadlpar_core.c
index a3449d7..fc01d7d 100644
--- a/drivers/pci/hotplug/rpadlpar_core.c
+++ b/drivers/pci/hotplug/rpadlpar_core.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "../pci.h"
 #include "rpaphp.h"
@@ -44,15 +45,14 @@ static struct device_node *find_vio_slot_node(char *drc_name)
 {
struct device_node *parent = of_find_node_by_name(NULL, "vdevice");
struct device_node *dn = NULL;
-   char *name;
int rc;
 
if (!parent)
return NULL;
 
while ((dn = of_get_next_child(parent, dn))) {
-   rc = rpaphp_get_drc_props(dn, NULL, &name, NULL, NULL);
-   if ((rc == 0) && (!strcmp(drc_name, name)))
+   rc = rpaphp_check_drc_props(dn, drc_name, NULL);
+   if (rc == 0)
break;
}
 
@@ -64,15 +64,12 @@ static struct device_node *find_php_slot_pci_node(char *drc_name,
  char *drc_type)
 {
struct device_node *np = NULL;
-   char *name;
-   char *type;
int rc;
 
while ((np = of_find_node_by_name(np, "pci"))) {
-   rc = rpaphp_get_drc_props(np, NULL, &name, &type, NULL);
+   rc = rpaphp_check_drc_props(np, drc_name, drc_type);
if (rc == 0)
-   if (!strcmp(drc_name, name) && !strcmp(drc_type, type))
-   break;
+   break;
}
 
return np;
diff --git a/drivers/pci/hotplug/rpaphp.h b/drivers/pci/hotplug/rpaphp.h
index 7db024e..8db5f2e 100644
--- a/drivers/pci/hotplug/rpaphp.h
+++ b/drivers/pci/hotplug/rpaphp.h
@@ -91,8 +91,8 @@ struct slot {
 
 /* rpaphp_core.c */
 int rpaphp_add_slot(struct device_node *dn);
-int rpaphp_get_drc_props(struct device_node *dn, int *drc_index,
-   char **drc_name, char **drc_type, int *drc_power_domain);
+int rpaphp_check_drc_props(struct device_node *dn, char *drc_name,
+   char *drc_type);
 
 /* rpaphp_slot.c */
 void dealloc_slot_struct(struct slot *slot);
diff --git a/drivers/pci/hotplug/rpaphp_core.c b/drivers/pci/hotplug/rpaphp_core.c
index 1e29aba..6da613a 100644
--- a/drivers/pci/hotplug/rpaphp_core.c
+++ b/drivers/pci/hotplug/rpaphp_core.c
@@ -30,6 +30,7 @@
 #include 
 #include 
 #include 
+#include 
 #include/* for eeh_add_device() */
 #include   /* rtas_call */
 #include /* for pci_controller */
@@ -196,25 +197,21 @@ static int get_children_props(struct device_node *dn, const int **drc_indexes,
return 0;
 }
 
-/* To get the DRC props describing the current node, first obtain it's
- * my-drc-index property.  Next obtain the DRC list from it's parent.  Use
- * the my-drc-index for correlation, and obtain the requested properties.
+
+/* Verify the existence of 'drc_name' and/or 'drc_type' within the
+ * current node.  First obtain it's my-drc-index property.  Next,
+ * obtain the DRC info from it's parent.  Use the my-drc-index for
+ * correlation, and obtain/validate the requested properties.
  */
-int rpaphp_get_drc_props(struct device_node *dn, int *drc_index,
-   char **drc_name, char **drc_type, int *drc_power_domain)
+
+static int rpaphp_check_drc_props_v1(struct device_node *dn, char *drc_name,
+   char *drc_type, unsigned int my_index)
 {
+   char *name_tmp, *type_tmp;
const int *indexes, *names;
const int *types, *domains;
-   const unsigned int *my_index;
-   char *name_tmp, *type_tmp;
int i, rc;
 
-   my_index = of_get_property(dn, "ibm,my-drc-index", NULL);
-   if (!my_index) {
-   /* Node isn't DLPAR/hotplug capable */
-   return -EINVAL;
-   }
-
rc = get_children_props(dn->parent, &indexes, &names, &types, &domains);
if (rc < 0) {
return -EINVAL;
@@ -225,24 +222,86 @@ int 

[PATCH V5 1/4] powerpc/firmware: Add definitions for new drc-info firmware feature

2017-11-28 Thread Michael Bringmann
Firmware Features: Define new bit flag representing the presence of
new device tree property "ibm,drc-info".  The flag is used to tell
the front end processor whether the Linux kernel supports the new
property, and by the front end processor to tell the Linux kernel
that the new property is present in the device tree.
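
As an illustration of how such a flag can be located in option vector 5,
here is a small userspace sketch. The value of OV5_DRC_INFO is taken from
this patch; the OV5_INDX()/OV5_FEAT() decoding (high byte selects the
vector byte, low byte is the bit mask) is an assumption inferred from the
'.byte22 = OV5_FEAT(OV5_DRC_INFO)' usage in patch 4/4, and
ov5_has_feature() is a hypothetical helper.

```c
#include <stdint.h>

/* Value taken from the patch; high byte = vector byte, low byte = bit. */
#define OV5_DRC_INFO	0x1640

/* Assumed decoding, modelled on how .byte22 = OV5_FEAT(...) is used. */
#define OV5_INDX(x)	((x) >> 8)
#define OV5_FEAT(x)	((x) & 0xff)

/* Nonzero if the option vector 5 bytes advertise the given feature. */
static int ov5_has_feature(const uint8_t *vec5, unsigned long len,
			   unsigned int feature)
{
	unsigned int index = OV5_INDX(feature);

	if (index >= len)
		return 0;
	return (vec5[index] & OV5_FEAT(feature)) != 0;
}
```

Under this decoding, 0x1640 selects bit 0x40 of byte 22 (0x16), which
matches the field set in the architecture vector.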

Signed-off-by: Michael Bringmann 
---
 arch/powerpc/include/asm/firmware.h   |3 ++-
 arch/powerpc/include/asm/prom.h   |1 +
 arch/powerpc/platforms/pseries/firmware.c |1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
index 8645897..329d537 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -51,6 +51,7 @@
 #define FW_FEATURE_BEST_ENERGY ASM_CONST(0x8000)
 #define FW_FEATURE_TYPE1_AFFINITY ASM_CONST(0x0001)
 #define FW_FEATURE_PRRN ASM_CONST(0x0002)
+#define FW_FEATURE_DRC_INFO ASM_CONST(0x0004)
 
 #ifndef __ASSEMBLY__
 
@@ -67,7 +68,7 @@ enum {
FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
-   FW_FEATURE_HPT_RESIZE,
+   FW_FEATURE_HPT_RESIZE | FW_FEATURE_DRC_INFO,
FW_FEATURE_PSERIES_ALWAYS = 0,
FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
FW_FEATURE_POWERNV_ALWAYS = 0,
diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 825bd59..3243455 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -175,6 +175,7 @@ struct of_drconf_cell {
 #define OV5_HASH_GTSE  0x1940  /* Guest Translation Shoot Down Avail */
 /* Radix Table Extensions */
 #define OV5_RADIX_GTSE 0x1A40  /* Guest Translation Shoot Down Avail */
+#define OV5_DRC_INFO   0x1640  /* Redef Prop Structures: drc-info   */
 
 /* Option Vector 6: IBM PAPR hints */
 #define OV6_LINUX  0x02/* Linux is our OS */
diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c
index 63cc82a..757d757 100644
--- a/arch/powerpc/platforms/pseries/firmware.c
+++ b/arch/powerpc/platforms/pseries/firmware.c
@@ -114,6 +114,7 @@ struct vec5_fw_feature {
 vec5_fw_features_table[] = {
{FW_FEATURE_TYPE1_AFFINITY, OV5_TYPE1_AFFINITY},
{FW_FEATURE_PRRN,   OV5_PRRN},
+   {FW_FEATURE_DRC_INFO,   OV5_DRC_INFO},
 };
 
 static void __init fw_vec5_feature_init(const char *vec5, unsigned long len)



[PATCH V5 0/4] powerpc/devtree: Add support for 'ibm,drc-info' property

2017-11-28 Thread Michael Bringmann
Several properties in the DRC device tree format are replaced by
more compact representations to allow, for example, the encoding
of vast amounts of memory and/or reduced duplication of information
in related data structures.

"ibm,drc-info": This property, when present, replaces the following
four properties: "ibm,drc-indexes", "ibm,drc-names", "ibm,drc-types"
and "ibm,drc-power-domains".  This property is defined for all
dynamically reconfigurable platform nodes.  The "ibm,drc-info" elements
are intended to provide a more compact representation, and reduce some
search overhead.

"ibm,architecture.vec": Bidirectional communication mechanism between
the host system and the front end processor indicating what features
the host system supports and what features the front end processor will
actually provide.  In this case, we are indicating that the host system
can support the new device tree structure "ibm,drc-info".

Signed-off-by: Michael Bringmann 

Michael Bringmann (4):
  powerpc/firmware: Add definitions for new drc-info firmware feature.
  pseries/drc-info: Search new DRC properties for CPU indexes
  hotplug/drc-info: Add code to search new devtree property
  powerpc: Enable support for new DRC devtree property
---
Changes in V5:
  -- Simplify of_prop_next_u32 invocation
  -- Remove unnecessary WARN_ON() tests



[PATCH] scripts: Add ppc64le support for checkstack.pl

2017-11-28 Thread Breno Leitao
The 64-bit ELF v2 ABI specification for POWER describes, in the section
"General Stack Frame Requirements", that the stack should use the following
instructions when compiled with backchain:

  mflr r0
  std  r0, 16(r1)
  stdu r1, -XX(r1)

Where XX is the frame size for that function. This is the value that
checkstack.pl reports as the stack size for each function.

This patch also simplifies the entire PowerPC section, since just two
types of instructions are used: 'stdu' on 64-bit and 'stwu' on 32-bit
platforms.
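
The matching that the combined pattern performs can be sketched in C as
below. This is not how checkstack.pl works internally (it uses a Perl
regular expression); ppc_frame_size() is a hypothetical helper that looks
for the same 'stdu'/'stwu' store-with-update on r1 and assumes the
displacement is printed in decimal.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the frame size from a disassembly line such as
 * "stdu r1,-128(r1)" or "stwu r1,-208(r1)", or -1 if the line does not
 * allocate a stack frame this way.
 */
static long ppc_frame_size(const char *line)
{
	const char *p;
	char *end;
	long size;

	/* Either mnemonic is accepted, like st[dw]u in the script. */
	p = strstr(line, "stdu");
	if (!p)
		p = strstr(line, "stwu");
	if (!p)
		return -1;

	/* The update must be on r1 with a negative displacement. */
	p = strstr(p, "r1,-");
	if (!p)
		return -1;
	p += strlen("r1,-");

	if (!isdigit((unsigned char)*p))
		return -1;

	size = strtol(p, &end, 10);
	if (strncmp(end, "(r1)", 4) != 0)
		return -1;

	return size;
}
```

A plain 'std r0, 16(r1)' does not match, so only the instruction that
actually moves the stack pointer contributes a frame size.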

Signed-off-by: Breno Leitao 
---
 scripts/checkstack.pl | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/scripts/checkstack.pl b/scripts/checkstack.pl
index 7f4c41717e26..8ed217ddf2c9 100755
--- a/scripts/checkstack.pl
+++ b/scripts/checkstack.pl
@@ -14,6 +14,7 @@
 #  M68k port by Geert Uytterhoeven and Andreas Schwab
 #  AArch64, PARISC ports by Kyle McMartin
 #  sparc port by Martin Habets 
+#  ppc64le port by Breno Leitao 
 #
 #  Usage:
 #  objdump -d vmlinux | scripts/checkstack.pl [arch]
@@ -81,13 +82,9 @@ my (@stack, $re, $dre, $x, $xs, $funcre);
$re = qr/.*l\.addi.*r1,r1,-(([0-9]{2}|[3-9])[0-9]{2})/o;
} elsif ($arch eq 'parisc' || $arch eq 'parisc64') {
$re = qr/.*ldo ($x{1,8})\(sp\),sp/o;
-   } elsif ($arch eq 'ppc') {
-   #c00029f4:   94 21 ff 30 stwu    r1,-208(r1)
-   $re = qr/.*stwu.*r1,-($x{1,8})\(r1\)/o;
-   } elsif ($arch eq 'ppc64') {
-   #XXX
-   $re = qr/.*stdu.*r1,-($x{1,8})\(r1\)/o;
-   } elsif ($arch eq 'powerpc') {
+   } elsif ($arch eq 'powerpc' || $arch =~ /^ppc(64)?(le)?$/ ) {
+   # powerpc: 94 21 ff 30 stwu    r1,-208(r1)
+   # ppc64(le)  : 81 ff 21 f8 stdu    r1,-128(r1)
$re = qr/.*st[dw]u.*r1,-($x{1,8})\(r1\)/o;
} elsif ($arch =~ /^s390x?$/) {
#   11160:   a7 fb ff 60 aghi   %r15,-160
-- 
2.15.0



Re: [PATCHv4 0/6] printk/ia64/ppc64/parisc64: let's deprecate %pF/%pf printk specifiers

2017-11-28 Thread Petr Mladek
On Fri 2017-11-10 08:48:24, Sergey Senozhatsky wrote:
>   Hello,
> 
>   A reworked version. There is a new dereference_symbol_descriptor()
> function now, where "the magic happens", so I don't touch kallsyms_lookup()
> and module_address_lookup() anymore.

The new version looks good to me. Thanks a lot for reworking it.
I feel much better now. For the whole series:

Reviewed-by: Petr Mladek 

>   All Ack-s/Tested-by-s were dropped, since the patch set has been
> reworked. I'm kindly asking arch-s maintainers and developers to test it
> once again. Sorry for any inconveniences and thanks for your help in
> advance.

I see that it was tested on all affected architectures. Thanks a lot
all testers.

It seems that we are ready to go. I am going to push this into
for-4.16 branch in printk.git.

Best Regards,
Petr


Re: [PATCHv4 5/6] symbol lookup: introduce dereference_symbol_descriptor()

2017-11-28 Thread Petr Mladek
On Sat 2017-11-11 13:49:32, Sergey Senozhatsky wrote:
> On (11/10/17 10:09), Luck, Tony wrote:
> > On Fri, Nov 10, 2017 at 08:48:29AM +0900, Sergey Senozhatsky wrote:
> > > -Examples::
> > > -
> > > - printk("Going to call: %pF\n", gettimeofday);
> > > - printk("Going to call: %pF\n", p->func);
> > > - printk("%s: called from %pS\n", __func__, (void *)_RET_IP_);
> > > - printk("%s: called from %pS\n", __func__,
> > > - (void *)__builtin_return_address(0));
> > > - printk("Faulted at %pS\n", (void *)regs->ip);
> > > - printk(" %s%pB\n", (reliable ? "" : "? "), (void *)*stack);
> > 
> > Did you mean to delete the Examples completely?  Wouldn't it
> > be better to just update (s/%pF/%pS/g)?
> 
> good question. yes, I think I did it deliberately :) we still
> kinda have some sort of "examples", right at the beginning of
> section "Symbols/Function Pointers"

These extra examples were added just recently (v4.14-rc1)
by the commit fd46cd55fbc5a8e8c ("printk-formats.txt: Add examples
for %pF and %pS usage"). They were supposed to help using
%pF and %pS correctly according to the situation. But we
have a better solution now. %pF is obsoleted by this
patchset.

IMHO, it is perfectly fine to remove the extra examples.

Best Regards,
Petr


Re: [PATCH] powerpc/powernv: Add queue mechanism for early messages

2017-11-28 Thread Deb McLemore
Hi Michael,

Thanks for the comments, I'll respin the patch and send another version.

Summary on the problem being solved:

When issuing a BMC soft poweroff during IPL, the poweroff was being lost,
so the machine would not power off.

OPAL messages were being received before the opal-power code registered its
notifiers.

A few alternatives were discussed (option #3 was chosen):

1 - Have opal_message_init() explicitly call opal_power_control_init() before it
dequeues any OPAL messages (i.e. before we register the opal-msg IRQ handler).

2 - Introduce the concept of "critical" message types: when we register handlers
we track which message types have a registered handler, then defer the opal-msg
IRQ registration until we have a handler registered for all the critical types.

3 - Buffer messages: if we receive a message and do not yet have a handler
for that type, store the message and replay it when a handler for that type is
registered.
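
To make the buffering alternative concrete, here is a minimal user-space
sketch of "store now, replay on registration". It is not the patch itself:
all names (msg_deliver, msg_register, PENDING_MAX, record_handler) are
hypothetical, a fixed array stands in for the kernel list, and the locking
the real code needs is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MSG_TYPE_MAX  4    /* hypothetical number of message types */
#define PENDING_MAX  16    /* hypothetical cap on buffered messages */

typedef void (*msg_handler_t)(int type, int payload);

struct pending_msg {
    int type;
    int payload;
};

static msg_handler_t handlers[MSG_TYPE_MAX];
static struct pending_msg pending[PENDING_MAX];
static size_t pending_len;

/* Example handler used for illustration: records what it last saw. */
static int calls, last_payload;
static void record_handler(int type, int payload)
{
    (void)type;
    calls++;
    last_payload = payload;
}

/* Deliver a message now, or buffer it if no handler is registered yet.
 * Returns 0 on success, -1 on a bad type or a full buffer. */
static int msg_deliver(int type, int payload)
{
    if (type < 0 || type >= MSG_TYPE_MAX)
        return -1;
    if (handlers[type]) {
        handlers[type](type, payload);
        return 0;
    }
    if (pending_len >= PENDING_MAX)
        return -1;              /* surface the problem instead of hiding it */
    pending[pending_len].type = type;
    pending[pending_len].payload = payload;
    pending_len++;
    return 0;
}

/* Register a handler, then replay buffered messages of that type in
 * arrival order; messages of other types stay buffered. */
static int msg_register(int type, msg_handler_t fn)
{
    size_t i, kept = 0;

    if (!fn || type < 0 || type >= MSG_TYPE_MAX)
        return -1;
    handlers[type] = fn;

    for (i = 0; i < pending_len; i++) {
        if (pending[i].type != type) {
            pending[kept++] = pending[i];
            continue;
        }
        fn(pending[i].type, pending[i].payload);
    }
    assert(kept <= pending_len);
    pending_len = kept;
    return 0;
}
```

Calling msg_deliver() before msg_register() leaves the message in the
buffer; the later msg_register() call for the same type hands it to the
handler and empties the slot.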


There was also a patch submitted for Busybox to close an exposed path there.

http://lists.busybox.net/pipermail/busybox/2017-November/085980.html


On 11/28/2017 07:30 AM, Michael Ellerman wrote:
> Hi Deb,
>
> Thanks for the patch.
>
> Some comments below ...
>
> Deb McLemore  writes:
>> Add a check for do_notify to confirm that a message handler
>> has been registered before an attempt is made to call notifier
>> call chain.
>>
>> If the message handler has not been registered queue up the message
>> to be replayed when the proper registration is called.
> Can you give me a bit more detail here on why we want to do this,
> what the alternatives are (if any), and what problem it solves.
>
>> diff --git a/arch/powerpc/platforms/powernv/opal.c 
>> b/arch/powerpc/platforms/powernv/opal.c
>> index 65c79ec..0e3b464 100644
>> --- a/arch/powerpc/platforms/powernv/opal.c
>> +++ b/arch/powerpc/platforms/powernv/opal.c
>> @@ -40,6 +40,16 @@
>>  
>>  #include "powernv.h"
>>  
>> +#define OPAL_MSG_QUEUE_MAX 16
> Why 16?
Arbitrary limit. Not having a handler registered is a short-lived
window, and the replay queue is not meant to hide bugs: if messages are
being sent and no one has registered, there is a problem, and surfacing it
earlier rather than later seems more helpful in identifying exposures.
If future use cases need a larger replay queue limit, that can be revisited.
>
> It seems a bit arbitrary. You're kzalloc'ing them, and they're < 100
> bytes or so, so I don't see any reason to restrict it so much?
>
> Having some sort of limit is probably good, but it could be 1024 or
> something, just to catch the case where nothing ever registers for that
> message type due to a bug.
>
>> +
>> +struct OpalMsgNode {
> Please use snake case, rather than camel case. I know some of the
> existing opal code uses camel case, but it's still wrong :)
>
> So that'd be opal_msg_node.
>
>> +struct list_head opal_queue_list_node;
> It's usual practice to just use "list" as the name for these. It doesn't
> need to be fully qualified like that, and "list" will look familiar to
> people.
>
>> +struct opal_msg msg;
>> +uint32_t queue_msg_type;
> The type is in the struct opal_msg, so I don't think we need it here do
> we? You will have to endian-convert it though.
>
>> +};
>> +
>> +static LIST_HEAD(opal_msg_queue_pending);
> Being a list head this would usually have "list" in the name, so it
> could just be "msg_list".
>
>> @@ -55,11 +65,15 @@ struct mcheck_recoverable_range {
>>  u64 recover_addr;
>>  };
>>  
>> +static unsigned long opal_msg_notify_reg_mask;
>> +static int opal_active_queue_elements;
> And then this could just be "msg_list_size" or "len".
>
>
>>  static struct mcheck_recoverable_range *mc_recoverable_range;
>>  static int mc_recoverable_range_len;
>>  
>>  struct device_node *opal_node;
>>  static DEFINE_SPINLOCK(opal_write_lock);
>> +static DEFINE_SPINLOCK(opal_msg_lock);
> You've grouped this with the other lock, but it would actually be better
> with the list head that it protects. And if you accept my suggestion of
> renaming the list to "msg_list" then this would be "msg_list_lock".
>
>> @@ -238,14 +252,47 @@ machine_early_initcall(powernv, 
>> opal_register_exception_handlers);
>>  int opal_message_notifier_register(enum opal_msg_type msg_type,
>>  struct notifier_block *nb)
>>  {
>> +struct OpalMsgNode *msg_node, *tmp;
>> +int ret;
>> +unsigned long flags;
> I prefer this style:
>
>> +struct OpalMsgNode *msg_node, *tmp;
>> +unsigned long flags;
>> +int ret;
>> +
>> +spin_lock_irqsave(&opal_msg_lock, flags);
>> +
>> +opal_msg_notify_reg_mask |= 1 << msg_type;
>> +
>> +spin_unlock_irqrestore(&opal_msg_lock, flags);
> So setting the bit in the mask here, before you check the args below is
> a bit fishy. It's also a bit fishy to take the lock, then drop it, then
> take it again below, though I don't think there's 

Re: [PATCH v2 00/10] posix_clocks: Prepare syscalls for 64 bit time_t conversion

2017-11-28 Thread Arnd Bergmann
On Mon, Nov 27, 2017 at 11:29 PM, Deepa Dinamani  wrote:
>>> I decided against using LEGACY_TIME_SYSCALLS to conditionally compile
>>> legacy time syscalls such as sys_nanosleep because this will need to
>>> enclose compat_sys_nanosleep as well. So, defining it as
>>>
>>> config LEGACY_TIME_SYSCALLS
>>>  def_bool 64BIT || !64BIT_TIME
>>>
>>> will not include compat_sys_nanosleep. We will instead need a new config to
>>> exclusively mark legacy syscalls.
>>
>> Do you mean we would need to do this separately for native and compat
>> syscalls, and have yet another option, like LEGACY_TIME_SYSCALLS
>> and LEGACY_TIME_COMPAT_SYSCALLS, to cover all cases? I would
>> think that CONFIG_COMPAT_32BIT_TIME handles all the compat versions,
>> while CONFIG_LEGACY_TIME_SYSCALLS handles all the native ones.
>
> I meant sys_nanosleep would be covered by LEGACY_TIME_SYSCALLS, but
> compat_sys_nanosleep would be covered by CONFIG_COMPAT_32BIT_TIME
> along with other compat syscalls.
> So, if we define the LEGACY_TIME_SYSCALLS as
>
>
> "This controls the compilation of the following system calls:
> time, stime, gettimeofday, settimeofday, adjtimex, nanosleep,
> alarm, getitimer,
> setitimer, select, utime, utimes, futimesat, and
> {old,new}{l,f,}stat{,64}.
> These all pass 32-bit time_t arguments on 32-bit architectures and
> are replaced by other interfaces (e.g. posix timers and clocks, 
> statx).
> C libraries implementing 64-bit time_t in 32-bit architectures have to
> implement the handles by wrapping around the newer interfaces.
> New architectures should not explicitly enable this."
>
> This would not be really true as compat interfaces have nothing to do
> with this config.
>
> I was proposing that we could have LEGACY_TIME_SYSCALLS config, but
> then have all these "deprecated" syscalls be enclosed within this,
> compat or not.
> This will also mean that we will have to come up representing these
> syscalls in the syscall header files.
> This can be a separate patch and this series can be merged as is if
> everyone agrees.

I think doing this separately would be good; I don't see any interdependency
with the other patches. We just need to decide what we want in the long
run.

I agree my text that you cited doesn't capture the situation correctly,
as this is really about the obsolete system calls that take 64-bit time_t
arguments on architectures that are converted to allow 64-bit time_t
for non-obsolete system calls.

Maybe it's better to just reword this to

  "This controls the compilation of the following system calls:
  time, stime, gettimeofday, settimeofday, adjtimex, nanosleep,
  alarm, getitimer, setitimer, select, utime, utimes, futimesat,
  and {old,new}{l,f,}stat{,64}.
  These are all replaced by other interfaces (e.g. posix timers and clocks,
  statx) on architectures that got converted from 32-bit time_t to
  64-bit time_t.
  C libraries implementing 64-bit time_t in 32-bit architectures have to
  implement the handles by wrapping around the newer interfaces.
  New architectures should not explicitly enable this."

That would clarify that it's not about the compat system calls, while
also allowing the two options to be set independently.

Arnd


Re: [PATCH] powerpc/powernv: Add queue mechanism for early messages

2017-11-28 Thread Michael Ellerman
Hi Deb,

Thanks for the patch.

Some comments below ...

Deb McLemore  writes:
> Add a check for do_notify to confirm that a message handler
> has been registered before an attempt is made to call notifier
> call chain.
>
> If the message handler has not been registered queue up the message
> to be replayed when the proper registration is called.

Can you give me a bit more detail here on why we want to do this,
what the alternatives are (if any), and what problem it solves.

> diff --git a/arch/powerpc/platforms/powernv/opal.c 
> b/arch/powerpc/platforms/powernv/opal.c
> index 65c79ec..0e3b464 100644
> --- a/arch/powerpc/platforms/powernv/opal.c
> +++ b/arch/powerpc/platforms/powernv/opal.c
> @@ -40,6 +40,16 @@
>  
>  #include "powernv.h"
>  
> +#define OPAL_MSG_QUEUE_MAX 16

Why 16?

It seems a bit arbitrary. You're kzalloc'ing them, and they're < 100
bytes or so, so I don't see any reason to restrict it so much?

Having some sort of limit is probably good, but it could be 1024 or
something, just to catch the case where nothing ever registers for that
message type due to a bug.

> +
> +struct OpalMsgNode {

Please use snake case, rather than camel case. I know some of the
existing opal code uses camel case, but it's still wrong :)

So that'd be opal_msg_node.

> + struct list_head opal_queue_list_node;

It's usual practice to just use "list" as the name for these. It doesn't
need to be fully qualified like that, and "list" will look familiar to
people.

> + struct opal_msg msg;
> + uint32_t queue_msg_type;

The type is in the struct opal_msg, so I don't think we need it here do
we? You will have to endian-convert it though.
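
As a rough illustration of the endian conversion mentioned above: the sketch
below assumes the type field arrives as a big-endian 32-bit value and
converts it to host order byte by byte. In the kernel this would be a
be32_to_cpu() call; struct demo_msg and the demo_* helpers are hypothetical
stand-ins, not the real struct opal_msg.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the message header: the type field holds
 * the raw big-endian bytes as written by firmware. */
struct demo_msg {
    uint32_t msg_type_be;
};

/* Portable user-space equivalent of a big-endian to host conversion. */
static uint32_t demo_be32_to_cpu(uint32_t be)
{
    unsigned char b[4];

    memcpy(b, &be, sizeof(b));
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Read the message type in host byte order, so no separate copy of the
 * type needs to be stored alongside the message. */
static uint32_t demo_msg_type(const struct demo_msg *m)
{
    assert(m);
    return demo_be32_to_cpu(m->msg_type_be);
}
```

With a helper like this, the comparison against msg_type can be done on the
value taken straight from the stored message.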

> +};
> +
> +static LIST_HEAD(opal_msg_queue_pending);

Being a list head this would usually have "list" in the name, so it
could just be "msg_list".

> @@ -55,11 +65,15 @@ struct mcheck_recoverable_range {
>   u64 recover_addr;
>  };
>  
> +static unsigned long opal_msg_notify_reg_mask;
> +static int opal_active_queue_elements;

And then this could just be "msg_list_size" or "len".


>  static struct mcheck_recoverable_range *mc_recoverable_range;
>  static int mc_recoverable_range_len;
>  
>  struct device_node *opal_node;
>  static DEFINE_SPINLOCK(opal_write_lock);
> +static DEFINE_SPINLOCK(opal_msg_lock);

You've grouped this with the other lock, but it would actually be better
with the list head that it protects. And if you accept my suggestion of
renaming the list to "msg_list" then this would be "msg_list_lock".

> @@ -238,14 +252,47 @@ machine_early_initcall(powernv, 
> opal_register_exception_handlers);
>  int opal_message_notifier_register(enum opal_msg_type msg_type,
>   struct notifier_block *nb)
>  {
> + struct OpalMsgNode *msg_node, *tmp;
> + int ret;
> + unsigned long flags;

I prefer this style:

> + struct OpalMsgNode *msg_node, *tmp;
> + unsigned long flags;
> + int ret;

> +
> + spin_lock_irqsave(&opal_msg_lock, flags);
> +
> + opal_msg_notify_reg_mask |= 1 << msg_type;
> +
> + spin_unlock_irqrestore(&opal_msg_lock, flags);

So setting the bit in the mask here, before you check the args below is
a bit fishy. It's also a bit fishy to take the lock, then drop it, then
take it again below, though I don't think there's actually a bug.

But, do we even need the mask? The only place it's used is in
opal_message_do_notify(), and I think that could just be replaced with a
list_empty() check of the notifier chain?


>   if (!nb || msg_type >= OPAL_MSG_TYPE_MAX) {
>   pr_warning("%s: Invalid arguments, msg_type:%d\n",
>  __func__, msg_type);
>   return -EINVAL;
>   }
>  
> - return atomic_notifier_chain_register(
> - &opal_msg_notifier_head[msg_type], nb);
> + ret = atomic_notifier_chain_register(
> + &opal_msg_notifier_head[msg_type], nb);
> +
> + if (ret)
> + return ret;
 
The logic below should probably be in a helper function.

> + spin_lock_irqsave(&opal_msg_lock, flags);
> + list_for_each_entry_safe(msg_node,
> + tmp,
> + &opal_msg_queue_pending,
> + opal_queue_list_node) {
> + if (msg_node->queue_msg_type == msg_type) {

You can reduce the indentation by doing:

	if (msg_node->queue_msg_type != msg_type)
		continue;

	atomic_notifier_call_chain(
			&opal_msg_notifier_head[msg_type],
			msg_type,
			&msg_node->msg);
	list_del(&msg_node->opal_queue_list_node);
	kfree(msg_node);
	opal_active_queue_elements--;
}

> + spin_unlock_irqrestore(&opal_msg_lock, flags);
> +
> + return ret;

ret can only be 0 here, so it's clearer to just return 0.

> +
>  }
>  

Re: [PATCH v2 5/5] of/fdt: only store the device node basename in full_name

2017-11-28 Thread Rob Herring
On Tue, Nov 28, 2017 at 7:13 AM, Geert Uytterhoeven
 wrote:
> Hi Rob,
>
> On Mon, Aug 21, 2017 at 5:16 PM, Rob Herring  wrote:
>> With dependencies on a statically allocated full path name converted to
>> use %pOF format specifier, we can store just the basename of node, and
>> the unflattening of the FDT can be simplified.
>>
>> This commit will affect the remaining users of full_name. After
>> analyzing these users, the remaining cases should only change some print
>> messages. The main users of full_name are providing a name for struct
>> resource. The resource names shouldn't be important other than providing
>> /proc/iomem names.
>
> I guess the plan is to get rid in a subsequent step of all calls to 
> kbasename()
> on a full name, which is now futile?

No. Sparc (PDT) is still the full path and I don't plan to change that.

Rob


Re: [PATCH v2 5/5] of/fdt: only store the device node basename in full_name

2017-11-28 Thread Geert Uytterhoeven
Hi Rob,

On Mon, Aug 21, 2017 at 5:16 PM, Rob Herring  wrote:
> With dependencies on a statically allocated full path name converted to
> use %pOF format specifier, we can store just the basename of node, and
> the unflattening of the FDT can be simplified.
>
> This commit will affect the remaining users of full_name. After
> analyzing these users, the remaining cases should only change some print
> messages. The main users of full_name are providing a name for struct
> resource. The resource names shouldn't be important other than providing
> /proc/iomem names.

I guess the plan is to get rid in a subsequent step of all calls to kbasename()
on a full name, which is now futile?

Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH] powerpc/powernv : Add support to enable sensor groups

2017-11-28 Thread Michael Ellerman
Shilpasri G Bhat  writes:

> Adds support to enable/disable a sensor group. This can be used to
> select the sensor groups that needs to be copied to main memory by
> OCC. Sensor groups like power, temperature, current, voltage,
> frequency, utilization can be enabled/disabled at runtime.
>
> Signed-off-by: Shilpasri G Bhat 
> ---
> The skiboot patch for the opal call is posted below:
> https://lists.ozlabs.org/pipermail/skiboot/2017-November/009713.html

Can you remind me why we're doing this with a completely bespoke sysfs
API, rather than using some generic sensors API?

And if we must do it that way, please add documentation for the sysfs
file(s) in Documentation/ABI/.

cheers


Re: what is the state about "[v2] ppc64 boot: Wait for boot cpu to show up if nr_cpus limit is about to hit"

2017-11-28 Thread Michael Ellerman
Liu ping fan  writes:

> Hi,
>
> I can not find the history about:
> https://patchwork.ozlabs.org/patch/577193/
>
>
> Can we have this patch?

I strongly dislike it.

Our CPU discovery code is already a big mess, in two separate places,
and this makes it worse.

In theory we have a split between logical and hardware CPU numbers, so
we should just be able to call whatever CPU we boot on CPU 0. Why
does that not work? And should we fix that?

cheers


Re: Qoriq P5020 PowerPC board doesn't boot with the latest git version anymore

2017-11-28 Thread Michael Ellerman
Christian Zigotzky  writes:

> Hi All,
>
> I compiled the latest git kernel today. Unfortunately my Varisys Cyrus 
> Plus board still doesn't boot with the latest git kernel.
>
> After that I patched the kernel source code with the spinlock patch and 
> compiled the kernel again. With the spinlock patch, the latest git 
> kernel boots without any problems.
>
> Please find attached the spinlock patch.

Can you remove the patch, and then build your kernel with
CONFIG_PREEMPT=n.

cheers


[PATCH] selftest/powerpc: Add additional option to mmap_bench test

2017-11-28 Thread Aneesh Kumar K.V
This patch adds --pgfault and --iterations options to the mmap_bench test. With
--pgfault we touch every page mapped. This helps in measuring the impact of a
patch series on the page fault path.

Signed-off-by: Aneesh Kumar K.V 
---
 .../selftests/powerpc/benchmarks/mmap_bench.c  | 53 --
 1 file changed, 50 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c 
b/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c
index 8d084a2d6e74..7a0a462a2272 100644
--- a/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c
+++ b/tools/testing/selftests/powerpc/benchmarks/mmap_bench.c
@@ -7,17 +7,34 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "utils.h"
 
 #define ITERATIONS 500
 
-#define MEMSIZE (128 * 1024 * 1024)
+#define MEMSIZE (1UL << 27)
+#define PAGE_SIZE (1UL << 16)
+#define CHUNK_COUNT (MEMSIZE/PAGE_SIZE)
+
+static int pg_fault;
+static int iterations = ITERATIONS;
+
+static struct option options[] = {
+   { "pgfault", no_argument, &pg_fault, 1 },
+   { "iterations", required_argument, 0, 'i' },
+   { 0, },
+};
+
+static void usage(void)
+{
+   printf("mmap_bench <--pgfault> <--iterations count>\n");
+}
 
 int test_mmap(void)
 {
struct timespec ts_start, ts_end;
-   unsigned long i = ITERATIONS;
+   unsigned long i = iterations;
 
clock_gettime(CLOCK_MONOTONIC, &ts_start);
 
@@ -25,6 +42,11 @@ int test_mmap(void)
char *c = mmap(NULL, MEMSIZE, PROT_READ|PROT_WRITE,
   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
FAIL_IF(c == MAP_FAILED);
+   if (pg_fault) {
+   int count;
+   for (count = 0; count < CHUNK_COUNT; count++)
+   c[count << 16] = 'c';
+   }
munmap(c, MEMSIZE);
}
 
@@ -35,7 +57,32 @@ int test_mmap(void)
return 0;
 }
 
-int main(void)
+int main(int argc, char *argv[])
 {
+   signed char c;
+   while (1) {
+   int option_index = 0;
+
+   c = getopt_long(argc, argv, "", options, &option_index);
+
+   if (c == -1)
+   break;
+
+   switch (c) {
+   case 0:
+   if (options[option_index].flag != 0)
+   break;
+
+   usage();
+   exit(1);
+   break;
+   case 'i':
+   iterations = atoi(optarg);
+   break;
+   default:
+   usage();
+   exit(1);
+   }
+   }
return test_harness(test_mmap, "mmap_bench");
 }
-- 
2.14.3



[PATCH] powerpc/hash: Skip non initialized page size in init_hpte_page_sizes

2017-11-28 Thread Aneesh Kumar K.V
One of the easiest ways to test a config with 4K HPTE is to disable the 64K
hardware page size, as below.

int __init htab_dt_scan_page_sizes(unsigned long node,

size -= 3; prop += 3;
base_idx = get_idx_from_shift(base_shift);
-   if (base_idx < 0) {
+   if (base_idx < 0 || base_idx == MMU_PAGE_64K) {
/* skip the pte encoding also */
prop += lpnum * 2; size -= lpnum * 2;

But this then results in errors in other parts of the code, such as MPSS
parsing, where we look at 4K base page size and 64K actual page size support.

This patch fixes MPSS parsing by ignoring the actual page sizes marked
unsupported. In reality this can happen only with a corrupt device tree. But it
is good to tighten the error check.
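
The tightened check can be sketched outside the kernel as follows. This is
only an illustration under assumed names: struct demo_psize_def is a
hypothetical, trimmed-down analogue of mmu_psize_defs[], with shift == 0
meaning "page size not initialized" and penc == -1 meaning "no encoding".

```c
#include <assert.h>

#define DEMO_PAGE_COUNT 3   /* hypothetical: 4K, 64K, 16M */

struct demo_psize_def {
    unsigned int shift;          /* 0: page size not initialized */
    int penc[DEMO_PAGE_COUNT];   /* -1: no encoding for that actual size */
};

/* Count usable (base, actual) page size pairs, skipping any actual page
 * size that was never initialized even if an encoding is present. */
static int demo_count_valid_pairs(const struct demo_psize_def *defs)
{
    int bp, ap, count = 0;

    assert(defs);
    for (bp = 0; bp < DEMO_PAGE_COUNT; bp++) {
        if (!defs[bp].shift)
            continue;   /* base page size not supported */
        for (ap = bp; ap < DEMO_PAGE_COUNT; ap++) {
            if (defs[bp].penc[ap] == -1 || !defs[ap].shift)
                continue;   /* no encoding, or actual size not set up */
            count++;
        }
    }
    return count;
}
```

With the 64K entry left uninitialized, the 4K-base/64K-actual pair is
skipped rather than processed with a zero shift.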

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/hash_utils_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index e700660459c4..2ae18ff91390 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -601,7 +601,7 @@ static void init_hpte_page_sizes(void)
continue;   /* not a supported page size */
for (ap = bp; ap < MMU_PAGE_COUNT; ++ap) {
penc = mmu_psize_defs[bp].penc[ap];
-   if (penc == -1)
+   if (penc == -1 || !mmu_psize_defs[ap].shift)
continue;
shift = mmu_psize_defs[ap].shift - LP_SHIFT;
if (shift <= 0)
-- 
2.14.3



Re: [PATCH v4] powerpc: Avoid signed to unsigned conversion in set_thread_tidr()

2017-11-28 Thread christophe lombard

On 28/11/2017 at 03:53, Vaibhav Jain wrote:

There is an unsafe signed to unsigned conversion in set_thread_tidr()
that may cause an error value to be assigned to the SPRN_TIDR register and
used as a thread-id.

The issue happens because assign_thread_tidr() returns an int and
thread.tidr is an unsigned long. So a negative error code returned
from assign_thread_tidr() slips past the error check and gets assigned
to tidr as a large positive value.

To fix this the patch assigns the return value of assign_thread_tidr()
to a temporary int and assigns it to thread.tidr iff it is '> 0'.

The patch shouldn't impact the calling convention of set_thread_tidr(),
i.e. all negative return values are error codes and a return value of '0'
indicates success.
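
A small user-space sketch of the problem and of the fix described above.
The demo_* functions are hypothetical stand-ins for assign_thread_tidr()
and set_thread_tidr(); they only model the return-value handling, not the
register write.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for assign_thread_tidr(): returns an id greater
 * than 0 on success or a negative errno on failure. */
static int demo_assign_tidr(int fail)
{
    return fail ? -ENOMEM : 7;
}

/* What the unsafe pattern ends up storing: the int result converted to
 * unsigned long, so an error code becomes a huge positive value. */
static unsigned long demo_unsafe_store(int fail)
{
    return (unsigned long)demo_assign_tidr(fail);
}

/* Fixed pattern: keep the result in a temporary int, and only store it
 * in the unsigned field when it is a valid (positive) id. */
static int demo_set_tidr(unsigned long *tidr, int fail)
{
    int rc;

    assert(tidr);
    rc = demo_assign_tidr(fail);
    if (rc > 0) {
        *tidr = (unsigned long)rc;
        return 0;
    }
    return rc;  /* negative error code, tidr left untouched */
}
```

On failure, demo_unsafe_store() yields a value near ULONG_MAX, whereas
demo_set_tidr() returns the negative error code and leaves the stored id
unchanged.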

Fixes: ec233ede4c86 ("powerpc: Add support for setting SPRN_TIDR")
Signed-off-by: Vaibhav Jain 

---


Sounds good to me.
Thanks

Reviewed-by: Christophe Lombard clomb...@linux.vnet.ibm.com