Re: [PATCH v3] arm64: v8.4: Support for new floating point multiplication instructions

2017-12-13 Thread Suzuki K Poulose

On 13/12/17 10:13, Dongjiu Geng wrote:

ARM v8.4 extensions add new neon instructions for performing a
multiplication of each FP16 element of one vector with the corresponding
FP16 element of a second vector, and to add or subtract this without an
intermediate rounding to the corresponding FP32 element in a third vector.

This patch detects this feature and let the userspace know about it via a
HWCAP bit and MRS emulation.

Cc: Dave Martin <dave.mar...@arm.com>
Cc: Suzuki K Poulose <suzuki.poul...@arm.com>
Signed-off-by: Dongjiu Geng <gengdong...@huawei.com>
Reviewed-by: Dave Martin <dave.mar...@arm.com>


Looks good to me.

Reviewed-by: Suzuki K Poulose <suzuki.poul...@arm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RESEND PATCH] arm64: v8.4: Support for new floating point multiplication variant

2017-12-11 Thread Suzuki K Poulose

Hi gengdongjiu

Sorry for the late response. I have a similar patch to add the support 
for "FHM", which I was about to post it this week.


On 11/12/17 13:29, Dave Martin wrote:

On Mon, Dec 11, 2017 at 08:47:00PM +0800, gengdongjiu wrote:


On 2017/12/11 19:59, Dave P Martin wrote:

On Sat, Dec 09, 2017 at 03:28:42PM +, Dongjiu Geng wrote:

ARM v8.4 extensions include support for new floating point
multiplication variant instructions to the AArch64 SIMD


Do we have any human-readable description of what the new instructions
do?

Since the v8.4 spec itself only describes these as "New Floating
Point Multiplication Variant", I wonder what "FHM" actually stands
for.

Thanks for the point out.
In fact, this feature only adds two instructions:
FP16 * FP16 + FP32
FP16 * FP16 - FP32

The spec call this bit to ID_AA64ISAR0_EL1.FHM, I do not know why it
will call "FHM", I  think call it "FMLXL" may be better, which can
stand for FMLAL/FMLSL instructions.


Although "FHM" is cryptic, I think it makes sense to keep this as "FHM"
to match the ISAR0 field name -- we've tended to follow this policy
for other extension names unless there's a much better or more obvious
name available.

For "FMLXL", new instructions might be added in the future that match
the same pattern, and then "FMLXL" could become ambiguous.  So maybe
this is not the best choice.


I think the FHM stands for "FP Half precision Multiplication 
instructions". I vote for keeping the feature bit in sync with the 
register bit definition. i.e, FHM.


However, my version of the patch names the HWCAP bit "asimdfml", 
following the compiler name for the feature option "fp16fml", which

is not perfect either. I think FHM is the safe option here.




Maybe something like "widening half-precision floating-point multiply
accumulate" is acceptable wording consistent with the existing
architecture, but I just made that up, so it's not official ;)


how about something like "performing a multiplication of each FP16
element of one vector with the corresponding FP16 element of a second
vector, and to add or subtract this without an intermediate rounding
to the corresponding FP32 element in a third vector."?


We could have that, I guess.



I agree, and that matches the feature description.
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v10 10/10] arm64: dts: juno: Add Coresight CPU debug nodes

2017-05-19 Thread Suzuki K Poulose

On 19/05/17 18:44, Sudeep Holla wrote:

Hi Suzuki, Leo,

On 19/05/17 05:25, Leo Yan wrote:

From: Suzuki K Poulose <suzuki.poul...@arm.com>

Add Coresight CPU debug nodes for Juno r0, r1 & r2. The CPU
debug areas are mapped at the same address for all revisions,
like the ETM, even though the CPUs have changed from r1 to r2.

Cc: Sudeep Holla <sudeep.ho...@arm.com>
Cc: Leo Yan <leo@linaro.org>
Cc: Mathieu Poirier <mathieu.por...@linaro.org>
Cc: Liviu Dudau <liviu.du...@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poul...@arm.com>


ARM-SoC team expect DTS to be routed via platform tree.
So I have queued this for v4.13[1]


Sudeep

Thanks for picking that up.

Suzuki

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v9 7/9] coresight: add support for CPU debug module

2017-05-11 Thread Suzuki K Poulose

On 09/05/17 03:50, Leo Yan wrote:

Coresight includes debug module and usually the module connects with CPU
debug logic. ARMv8 architecture reference manual (ARM DDI 0487A.k) has
description for related info in "Part H: External Debug".

Chapter H7 "The Sample-based Profiling Extension" introduces several
sampling registers, e.g. we can check program counter value with
combined CPU exception level, secure state, etc. So this is helpful for
analysis CPU lockup scenarios, e.g. if one CPU has run into infinite
loop with IRQ disabled. In this case the CPU cannot switch context and
handle any interrupt (including IPIs), as the result it cannot handle
SMP call for stack dump.

This patch is to enable coresight debug module, so firstly this driver
is to bind apb clock for debug module and this is to ensure the debug
module can be accessed from program or external debugger. And the driver
uses sample-based registers for debug purpose, e.g. when system triggers
panic, the driver will dump program counter and combined context
registers (EDCIDSR, EDVIDSR); by parsing context registers so can
quickly get to know CPU secure state, exception level, etc.

Some of the debug module registers are located in CPU power domain, so
this requires the CPU power domain stays on when access related debug
registers, but the power management for CPU power domain is quite
dependent on SoC integration for power management. For the platforms
which with sane power controller implementations, this driver follows
the method to set EDPRCR to try to pull the CPU out of low power state
and then set 'no power down request' bit so the CPU has no chance to
lose power.

If the SoC has not followed up this design well for power management
controller, the user should use the command line parameter or sysfs
to constrain all or partial idle states to ensure the CPU power
domain is enabled and access coresight CPU debug component safely.

Signed-off-by: Leo Yan <leo@linaro.org>


With comments from Mathieu addressed,

Reviewed-by: Suzuki K Poulose <suzuki.poul...@arm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 6/8] coresight: add support for CPU debug module

2017-04-19 Thread Suzuki K Poulose

On 19/04/17 15:28, Leo Yan wrote:

Hi Suzuki,

On Wed, Apr 19, 2017 at 02:23:04PM +0100, Suzuki K Poulose wrote:

Hi Leo,

This version looks good to me. I have two minor comments below.


Thanks for reviewing. Will take the suggestions. Just check a bit for
last comment.

[...]


+static int debug_probe(struct amba_device *adev, const struct amba_id *id)
+{
+   void __iomem *base;
+   struct device *dev = >dev;
+   struct debug_drvdata *drvdata;
+   struct resource *res = >res;
+   struct device_node *np = adev->dev.of_node;
+   int ret;
+
+   drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
+   if (!drvdata)
+   return -ENOMEM;
+
+   drvdata->cpu = np ? of_coresight_get_cpu(np) : 0;
+   if (per_cpu(debug_drvdata, drvdata->cpu)) {
+   dev_err(dev, "CPU%d drvdata has been initialized\n",
+   drvdata->cpu);


May be we could warn about a possible issue in the DT ?


So can I understand the suggestion is to add warning in function
of_coresight_get_cpu() when cannot find CPU number, and here directly
bail out?


No. One could have single CPU DT, where he doesn't need to provide the CPU 
number.
Hence, it doesn't make sense to WARN  in of_coresight_get_cpu().

But when we hit the case above, we find that the some node was registered for
the given CPU (be it 0 or any other), which is definitely an error in DT. Due to

1) Hasn't specified the CPU number for more than one node

OR

2) CPU number duplicated in the more than one nodes.

Cheers
Suzuki

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 6/8] coresight: add support for CPU debug module

2017-04-19 Thread Suzuki K Poulose

On 06/04/17 14:30, Leo Yan wrote:

Coresight includes debug module and usually the module connects with CPU
debug logic. ARMv8 architecture reference manual (ARM DDI 0487A.k) has
description for related info in "Part H: External Debug".

Chapter H7 "The Sample-based Profiling Extension" introduces several
sampling registers, e.g. we can check program counter value with
combined CPU exception level, secure state, etc. So this is helpful for
analysis CPU lockup scenarios, e.g. if one CPU has run into infinite
loop with IRQ disabled. In this case the CPU cannot switch context and
handle any interrupt (including IPIs), as the result it cannot handle
SMP call for stack dump.

This patch is to enable coresight debug module, so firstly this driver
is to bind apb clock for debug module and this is to ensure the debug
module can be accessed from program or external debugger. And the driver
uses sample-based registers for debug purpose, e.g. when system triggers
panic, the driver will dump program counter and combined context
registers (EDCIDSR, EDVIDSR); by parsing context registers so can
quickly get to know CPU secure state, exception level, etc.

Some of the debug module registers are located in CPU power domain, so
this requires the CPU power domain stays on when access related debug
registers, but the power management for CPU power domain is quite
dependent on SoC integration for power management. For the platforms
which with sane power controller implementations, this driver follows
the method to set EDPRCR to try to pull the CPU out of low power state
and then set 'no power down request' bit so the CPU has no chance to
lose power.

If the SoC has not followed up this design well for power management
controller, the user should use the command line parameter or sysfs
to constrain all or partial idle states to ensure the CPU power
domain is enabled and access coresight CPU debug component safely.


Hi Leo,

This version looks good to me. I have two minor comments below.



+
+static struct notifier_block debug_notifier = {
+   .notifier_call = debug_notifier_call,
+};
+
+static int debug_enable_func(void)
+{
+   struct debug_drvdata *drvdata;
+   int cpu;
+
+   for_each_possible_cpu(cpu) {
+   drvdata = per_cpu(debug_drvdata, cpu);
+   if (!drvdata)
+   continue;
+
+   pm_runtime_get_sync(drvdata->dev);
+   }
+
+   return atomic_notifier_chain_register(_notifier_list,
+ _notifier);
+}
+
+static int debug_disable_func(void)
+{
+   struct debug_drvdata *drvdata;
+   int cpu;
+
+   for_each_possible_cpu(cpu) {
+   drvdata = per_cpu(debug_drvdata, cpu);
+   if (!drvdata)
+   continue;
+
+   pm_runtime_put(drvdata->dev);
+   }
+
+   return atomic_notifier_chain_unregister(_notifier_list,
+   _notifier);
+}


I believe you should, reverse the order of these operations in 
debug_disable_func()
to prevent getting a panic notifier after we have released the power domain for 
the
debug.
i.e, :
atomic_notifier_chain_unregister(...);

for_each_possible_cpu(cpu) {}




+
+static ssize_t debug_func_knob_write(struct file *f,
+   const char __user *buf, size_t count, loff_t *ppos)
+{
+   u8 val;
+   int ret;
+
+   ret = kstrtou8_from_user(buf, count, 2, );
+   if (ret)
+   return ret;
+
+   mutex_lock(_lock);
+
+   if (val == debug_enable)
+   goto out;
+
+   if (val)
+   ret = debug_enable_func();
+   else
+   ret = debug_disable_func();
+
+   if (ret) {
+   pr_err("%s: unable to %s debug function: %d\n",
+  __func__, val ? "enable" : "disable", ret);
+   goto err;
+   }
+
+   debug_enable = val;
+out:
+   ret = count;
+err:
+   mutex_unlock(_lock);
+   return ret;
+}
+
+static ssize_t debug_func_knob_read(struct file *f,
+   char __user *ubuf, size_t count, loff_t *ppos)
+{
+   ssize_t ret;
+   char buf[2];
+
+   mutex_lock(_lock);
+
+   buf[0] = '0' + debug_enable;
+   buf[1] = '\n';
+   ret = simple_read_from_buffer(ubuf, count, ppos, buf, sizeof(buf));
+
+   mutex_unlock(_lock);
+   return ret;
+}
+
+static const struct file_operations debug_func_knob_fops = {
+   .open   = simple_open,
+   .read   = debug_func_knob_read,
+   .write  = debug_func_knob_write,
+};
+
+static int debug_func_init(void)
+{
+   struct dentry *file;
+   int ret;
+
+   /* Create debugfs node */
+   debug_debugfs_dir = debugfs_create_dir("coresight_cpu_debug", NULL);
+   if (!debug_debugfs_dir) {
+   pr_err("%s: unable to create debugfs directory\n", __func__);
+   return -ENOMEM;
+   }
+
+   file = 

Re: [PATCH v5 1/9] coresight: bindings for CPU debug module

2017-03-31 Thread Suzuki K Poulose

On 25/03/17 18:23, Leo Yan wrote:

According to ARMv8 architecture reference manual (ARM DDI 0487A.k)
Chapter 'Part H: External debug', the CPU can integrate debug module
and it can support self-hosted debug and external debug. Especially
for supporting self-hosted debug, this means the program can access
the debug module from mmio region; and usually the mmio region is
integrated with coresight.

So add document for binding debug component, includes binding to APB
clock; and also need specify the CPU node which the debug module is
dedicated to specific CPU.

Suggested-by: Mike Leach <mike.le...@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poir...@linaro.org>
Signed-off-by: Leo Yan <leo@linaro.org>


Reviewed-by: Suzuki K Poulose <suzuki.poul...@arm.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 4/9] coresight: refactor with function of_coresight_get_cpu

2017-03-31 Thread Suzuki K Poulose

On 25/03/17 18:23, Leo Yan wrote:

This is refactor to add function of_coresight_get_cpu(), so it's used to
retrieve CPU id for coresight component. Finally can use it as a common
function for multiple places.

Suggested-by: Mathieu Poirier <mathieu.poir...@linaro.org>
Signed-off-by: Leo Yan <leo@linaro.org>


Reviewed-by: Suzuki K Poulose <suzuki.poul...@arm.com>

-   if (found)
-   break;
-   }
-   of_node_put(dn);
-
-   /* Affinity to CPU0 if no cpu nodes are found */
-   pdata->cpu = found ? cpu : 0;
+   pdata->cpu = of_coresight_get_cpu(node);

return pdata;
 }
diff --git a/include/linux/coresight.h b/include/linux/coresight.h
index 2a5982c..bf96678 100644
--- a/include/linux/coresight.h
+++ b/include/linux/coresight.h
@@ -263,9 +263,11 @@ static inline int coresight_timeout(void __iomem *addr, 
u32 offset,
 #endif

 #ifdef CONFIG_OF
+extern int of_coresight_get_cpu(struct device_node *node);
 extern struct coresight_platform_data *of_get_coresight_platform_data(
struct device *dev, struct device_node *node);
 #else
+static inline int of_coresight_get_cpu(struct device_node *node) { return 0; }
 static inline struct coresight_platform_data *of_get_coresight_platform_data(
struct device *dev, struct device_node *node) { return NULL; }
 #endif



--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 6/9] coresight: add support for CPU debug module

2017-03-30 Thread Suzuki K Poulose

On 30/03/17 02:03, Leo Yan wrote:

On Wed, Mar 29, 2017 at 03:56:23PM +0100, Mike Leach wrote:

[...]


+   /*
+* Unfortunately the CPU cannot be powered up, so return
+* back and later has no permission to access other
+* registers. For this case, should set 'idle_constraint'
+* to ensure CPU power domain is enabled!
+*/
+   if (!(drvdata->edprsr & EDPRSR_PU)) {
+   pr_err("%s: power up request for CPU%d failed\n",
+   __func__, drvdata->cpu);
+   goto out;
+   }
+
+out_powered_up:
+   debug_os_unlock(drvdata);
+
+   /*
+* At this point the CPU is powered up, so set the no powerdown
+* request bit so we don't lose power and emulate power down.
+*/
+   drvdata->edprsr = readl(drvdata->base + EDPRCR);
+   drvdata->edprsr |= EDPRCR_COREPURQ | EDPRCR_CORENPDRQ;


If we are here the core is already up.  Shouldn't we need to set
EDPRCR_CORENPDRQ only?


Yeah. Will fix.


No - EDPRCR_COREPURQ and EDPRCR_CORENPDRQ have different semantics and purposes

EDPRCR_COREPURQ is in the debug power domain an is tied to an external
debug request that should be an input to the external (to the PE)
system power controller.
The requirement is that the system power controller powers up the core
domain and does not power it down while it remains asserted.

EDPRCR_CORENPDRQ is in the core power domain and thus to the specific
core only. This ensures that any power control software running on
that core should emulate a power down if this is set to one.


I'm curious the exact meaning for "power control software".

Does it mean EDPRCR_CORENPDRQ should be checked by kernel or PSCI
liked code in ARM trusted firmware to avoid to run CPU power off flow?

Or will EDPRCR_CORENPDRQ assert CPU external signal to notify power
controller so power controller emulate a power down?


We cannot know the power control design of the system, so the safe
solution is to set both bits.


Thanks a lot for the suggestion. Will set both bits.


Leo,

Also, it would be good to restore the PRCR register back to the original state
after we read the registers (if we changed them). I am exploring ways to make
use of this feature on demand (e.g, tie it to sysrq-t or sysrq-l) and not just
at panic time. So it would be good to have the state restored to not affect the
normal functioning even after dumping the registers.

PS: I have a small hack to hook this into a sysrq-x at runtime, but on a second 
thought
it is better not to consume the key and rather tie it to something which 
already exist,
as mentioned above.

Thanks
Suzuki



Thanks,
Leo Yan



--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 6/9] coresight: add support for CPU debug module

2017-03-29 Thread Suzuki K Poulose

On 29/03/17 11:37, Leo Yan wrote:

On Wed, Mar 29, 2017 at 11:31:03AM +0100, Suzuki K Poulose wrote:

On 29/03/17 11:27, Leo Yan wrote:

On Wed, Mar 29, 2017 at 10:07:07AM +0100, Suzuki K Poulose wrote:

[...]


+   if (mode == EDDEVID_IMPL_NONE) {
+   drvdata->edpcsr_present  = false;
+   drvdata->edcidsr_present = false;
+   drvdata->edvidsr_present = false;
+   } else if (mode == EDDEVID_IMPL_EDPCSR) {
+   drvdata->edpcsr_present  = true;
+   drvdata->edcidsr_present = false;
+   drvdata->edvidsr_present = false;
+   } else if (mode == EDDEVID_IMPL_EDPCSR_EDCIDSR) {
+   if (!IS_ENABLED(CONFIG_64BIT) &&
+   (pcsr_offset == EDDEVID1_PCSR_NO_OFFSET_DIS_AARCH32))
+   drvdata->edpcsr_present = false;
+   else
+   drvdata->edpcsr_present = true;


Sorry, I forgot why we do this check only in this mode. Shouldn't this be
common to all modes (of course which implies PCSR is present) ?


No. PCSROffset is defined differently in ARMv7 and ARMv8; So finally we
simplize PCSROffset value :
 - Sample offset applies based on the instruction state (indicated by 
PCSR[0])
0001 - No offset applies.
0010 - No offset applies, but do not use in AArch32 mode!

So we need handle the corner case is when CPU runs AArch32 mode and
PCSRoffset = 'b0010. Other cases the pcsr should be present.


I understand that reasoning. But my question is, why do we check for PCSROffset
only when mode == EDDEVID_IMPL_EDPCSR_EDCIDSR and not for say mode == 
EDDEVID_IMPL_EDPCSR or
any other mode where PCSR is present.


Sorry I misunderstood your question.

I made mistake when I analyzed the possbile combination for mode and
PCSROffset so I thought it's the only case should handle:
{ EDDEVID_IMPL_EDPCSR_EDCIDSR, EDDEVID1_PCSR_NO_OFFSET_DIS_AARCH32 }

Below three combinations are possible to exist; so you are right, I
should move this out for the checking:
{ EDDEVID_IMPL_NONE,   EDDEVID1_PCSR_NO_OFFSET_DIS_AARCH32 }


That need not be covered, as IMPL_NONE says PCSR is not implemented hence you
don't worry about anything as the functionality is missing. This should rather 
be:
EDDEVID_IMPL_EDPCSR, where only PCSR is implemented.


I think below combination doesn't really exist:
{ EDDEVID_IMPL_EDPCSR, EDDEVID1_PCSR_NO_OFFSET_DIS_AARCH32 };

EDDEVID_IMPL_EDPCSR is only defined in ARMv7 ARM, and
EDDEVID1_PCSR_NO_OFFSET_DIS_AARCH32 is only defined in ARMv8 ARM.


It is not wrong to check the PCSROffset in all cases where PCSR is available, 
as if
we hit PCSR on ARMv7 then PCSROffset shouldn't be DIS_AARCH32. And in fact that
would make the code a bit more cleaner. Anyways, I am not particular about this.

Suzuki
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 6/9] coresight: add support for CPU debug module

2017-03-29 Thread Suzuki K Poulose

On 29/03/17 11:27, Leo Yan wrote:

On Wed, Mar 29, 2017 at 10:07:07AM +0100, Suzuki K Poulose wrote:

[...]


+   if (mode == EDDEVID_IMPL_NONE) {
+   drvdata->edpcsr_present  = false;
+   drvdata->edcidsr_present = false;
+   drvdata->edvidsr_present = false;
+   } else if (mode == EDDEVID_IMPL_EDPCSR) {
+   drvdata->edpcsr_present  = true;
+   drvdata->edcidsr_present = false;
+   drvdata->edvidsr_present = false;
+   } else if (mode == EDDEVID_IMPL_EDPCSR_EDCIDSR) {
+   if (!IS_ENABLED(CONFIG_64BIT) &&
+   (pcsr_offset == EDDEVID1_PCSR_NO_OFFSET_DIS_AARCH32))
+   drvdata->edpcsr_present = false;
+   else
+   drvdata->edpcsr_present = true;


Sorry, I forgot why we do this check only in this mode. Shouldn't this be
common to all modes (of course which implies PCSR is present) ?


No. PCSROffset is defined differently in ARMv7 and ARMv8; So finally we
simplize PCSROffset value :
 - Sample offset applies based on the instruction state (indicated by 
PCSR[0])
0001 - No offset applies.
0010 - No offset applies, but do not use in AArch32 mode!

So we need handle the corner case is when CPU runs AArch32 mode and
PCSRoffset = 'b0010. Other cases the pcsr should be present.


I understand that reasoning. But my question is, why do we check for PCSROffset
only when mode == EDDEVID_IMPL_EDPCSR_EDCIDSR and not for say mode == 
EDDEVID_IMPL_EDPCSR or
any other mode where PCSR is present.


Sorry I misunderstood your question.

I made mistake when I analyzed the possbile combination for mode and
PCSROffset so I thought it's the only case should handle:
{ EDDEVID_IMPL_EDPCSR_EDCIDSR, EDDEVID1_PCSR_NO_OFFSET_DIS_AARCH32 }

Below three combinations are possible to exist; so you are right, I
should move this out for the checking:
{ EDDEVID_IMPL_NONE,   EDDEVID1_PCSR_NO_OFFSET_DIS_AARCH32 }


That need not be covered, as IMPL_NONE says PCSR is not implemented hence you
don't worry about anything as the functionality is missing. This should rather 
be:
EDDEVID_IMPL_EDPCSR, where only PCSR is implemented.

My switch...case suggestion makes it easier to do all this checking.


Thanks
Suzuki
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v5 6/9] coresight: add support for CPU debug module

2017-03-29 Thread Suzuki K Poulose

On 29/03/17 04:07, Leo Yan wrote:

Hi Suzuki,

On Mon, Mar 27, 2017 at 05:34:57PM +0100, Suzuki K Poulose wrote:

On 25/03/17 18:23, Leo Yan wrote:


[...]


Leo,

Thanks a lot for the quick rework. I don't fully understand (yet!) why we need 
the
idle_constraint. I will leave it for Sudeep to comment on it, as he is the 
expert
in that area. Some minor comments below.


Thanks a lot for quick reviewing :)


Signed-off-by: Leo Yan <leo@linaro.org>
---
drivers/hwtracing/coresight/Kconfig   |  11 +
drivers/hwtracing/coresight/Makefile  |   1 +
drivers/hwtracing/coresight/coresight-cpu-debug.c | 704 ++
3 files changed, 716 insertions(+)
create mode 100644 drivers/hwtracing/coresight/coresight-cpu-debug.c

diff --git a/drivers/hwtracing/coresight/Kconfig 
b/drivers/hwtracing/coresight/Kconfig
index 130cb21..18d7931 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -89,4 +89,15 @@ config CORESIGHT_STM
  logging useful software events or data coming from various entities
  in the system, possibly running different OSs

+config CORESIGHT_CPU_DEBUG
+   tristate "CoreSight CPU Debug driver"
+   depends on ARM || ARM64
+   depends on DEBUG_FS
+   help
+ This driver provides support for coresight debugging module. This
+ is primarily used to dump sample-based profiling registers when
+ system triggers panic, the driver will parse context registers so
+ can quickly get to know program counter (PC), secure state,
+ exception level, etc.


May be we should mention/warn the user about the possible caveats of using
this feature to help him make a better decision ? And / Or we should add a 
documentation
for it. We have collected some real good information over the discussions and
it is a good idea to capture it somewhere.


Sure, I will add a documentation for this.

[...]


+static struct pm_qos_request debug_qos_req;
+static int idle_constraint = PM_QOS_DEFAULT_VALUE;
+module_param(idle_constraint, int, 0600);
+MODULE_PARM_DESC(idle_constraint, "Latency requirement in microseconds for CPU 
"
+"idle states (default is -1, which means have no limiation "
+"to CPU idle states; 0 means disabling all idle states; user "
+"can choose other platform dependent values so can disable "
+"specific idle states for the platform)");


Correct me if I am wrong,

All we want to do is disable the CPUIdle explicitly if the user knows that this
could be a problem to use CPU debug on his platform. So, in effect, we should
only be using idle_constraint = 0 or -1.

In which case, we could make it easier for the user to tell us, either

 0 - Don't do anything with CPUIdle (default)
 1 - Disable CPUIdle for me as I know the platform has issues with CPU debug 
and CPUidle.


The reason for not using bool flag is: usually SoC may have many idle
states, so if user wants to partially enable some states then can set
the latency to constraint.

But of course, we can change this to binary value as you suggested,
this means turn on of turn off all states. The only one reason to use
latency value is it is more friendly for hardware design, e.g. some
platforms can enable partial states to save power and avoid overheat
after using this driver.

If you guys think this is a bit over design, I will follow up your
suggestion. I also have some replying in Mathieu's reviewing, please
help review as well.


than explaining the miscrosecond latency etc and make the appropriate calls 
underneath.
something like (not necessarily the same name) :

module_param(broken_with_cpuidle, bool, 0600);
MODULE_PARAM_DESC(broken_with_cpuidle, "Specifies whether the CPU debug has issues 
with CPUIdle on"
   " the platform. Non-zero value implies 
CPUIdle has to be"
   " explicitly disabled.",);


[...]


+   /*
+* Unfortunately the CPU cannot be powered up, so return
+* back and later has no permission to access other
+* registers. For this case, should set 'idle_constraint'
+* to ensure CPU power domain is enabled!
+*/
+   if (!(drvdata->edprsr & EDPRSR_PU)) {
+   pr_err("%s: power up request for CPU%d failed\n",
+   __func__, drvdata->cpu);
+   goto out;
+   }
+
+out_powered_up:
+   debug_os_unlock(drvdata);


Question: Do we need a matching debug_os_lock() once we are done ?


I have checked ARM ARMv8, but there have no detailed description for
this. I refered coresight-etmv4 code and Mike's pseudo code, ther have
no debug_os_lock() related operations.

Mike, Mathieu, could you also help confirm this?

[...]


+static void debug_init_arch_data(void *inf

Re: [PATCH v5 6/9] coresight: add support for CPU debug module

2017-03-27 Thread Suzuki K Poulose

On 25/03/17 18:23, Leo Yan wrote:

Coresight includes debug module and usually the module connects with CPU
debug logic. ARMv8 architecture reference manual (ARM DDI 0487A.k) has
description for related info in "Part H: External Debug".

Chapter H7 "The Sample-based Profiling Extension" introduces several
sampling registers, e.g. we can check program counter value with
combined CPU exception level, secure state, etc. So this is helpful for
analysis CPU lockup scenarios, e.g. if one CPU has run into infinite
loop with IRQ disabled. In this case the CPU cannot switch context and
handle any interrupt (including IPIs), as the result it cannot handle
SMP call for stack dump.

This patch is to enable coresight debug module, so firstly this driver
is to bind apb clock for debug module and this is to ensure the debug
module can be accessed from program or external debugger. And the driver
uses sample-based registers for debug purpose, e.g. when system triggers
panic, the driver will dump program counter and combined context
registers (EDCIDSR, EDVIDSR); by parsing context registers so can
quickly get to know CPU secure state, exception level, etc.

Some of the debug module registers are located in CPU power domain, so
this requires the CPU power domain stays on when access related debug
registers, but the power management for CPU power domain is quite
dependent on SoC integration for power management. For the platforms
which with sane power controller implementations, this driver follows
the method to set EDPRCR to try to pull the CPU out of low power state
and then set 'no power down request' bit so the CPU has no chance to
lose power.

If the SoC has not followed up this design well for power management
controller, the driver introduces module parameter "idle_constraint".
Setting this parameter for latency requirement in microseconds, finally
we can constrain all or partial idle states to ensure the CPU power
domain is enabled, this is a backup method to access coresight CPU
debug component safely.


Leo,

Thanks a lot for the quick rework. I don't fully understand (yet!) why we need 
the
idle_constraint. I will leave it for Sudeep to comment on it, as he is the 
expert
in that area. Some minor comments below.



Signed-off-by: Leo Yan 
---
 drivers/hwtracing/coresight/Kconfig   |  11 +
 drivers/hwtracing/coresight/Makefile  |   1 +
 drivers/hwtracing/coresight/coresight-cpu-debug.c | 704 ++
 3 files changed, 716 insertions(+)
 create mode 100644 drivers/hwtracing/coresight/coresight-cpu-debug.c

diff --git a/drivers/hwtracing/coresight/Kconfig 
b/drivers/hwtracing/coresight/Kconfig
index 130cb21..18d7931 100644
--- a/drivers/hwtracing/coresight/Kconfig
+++ b/drivers/hwtracing/coresight/Kconfig
@@ -89,4 +89,15 @@ config CORESIGHT_STM
  logging useful software events or data coming from various entities
  in the system, possibly running different OSs

+config CORESIGHT_CPU_DEBUG
+   tristate "CoreSight CPU Debug driver"
+   depends on ARM || ARM64
+   depends on DEBUG_FS
+   help
+ This driver provides support for coresight debugging module. This
+ is primarily used to dump sample-based profiling registers when
+ system triggers panic, the driver will parse context registers so
+ can quickly get to know program counter (PC), secure state,
+ exception level, etc.


May be we should mention/warn the user about the possible caveats of using
this feature to help him make a better decision ? And / Or we should add a 
documentation
for it. We have collected some real good information over the discussions and
it is a good idea to capture it somewhere.


+
 endif
diff --git a/drivers/hwtracing/coresight/Makefile 
b/drivers/hwtracing/coresight/Makefile
index af480d9..433d590 100644
--- a/drivers/hwtracing/coresight/Makefile
+++ b/drivers/hwtracing/coresight/Makefile
@@ -16,3 +16,4 @@ obj-$(CONFIG_CORESIGHT_SOURCE_ETM4X) += coresight-etm4x.o \
coresight-etm4x-sysfs.o
 obj-$(CONFIG_CORESIGHT_QCOM_REPLICATOR) += coresight-replicator-qcom.o
 obj-$(CONFIG_CORESIGHT_STM) += coresight-stm.o
+obj-$(CONFIG_CORESIGHT_CPU_DEBUG) += coresight-cpu-debug.o
diff --git a/drivers/hwtracing/coresight/coresight-cpu-debug.c 
b/drivers/hwtracing/coresight/coresight-cpu-debug.c
new file mode 100644
index 000..fbec1d1
--- /dev/null
+++ b/drivers/hwtracing/coresight/coresight-cpu-debug.c



+#define EDPCSR 0x0A0
+#define EDCIDSR0x0A4
+#define EDVIDSR0x0A8
+#define EDPCSR_HI  0x0AC
+#define EDOSLAR0x300
+#define EDPRCR 0x310
+#define EDPRSR 0x314
+#define EDDEVID1   0xFC4
+#define EDDEVID0xFC8
+
+#define EDPCSR_PROHIBITED

Re: [PATCH] arm64: erratum: Workaround for Kryo reserved system register read

2016-04-08 Thread Suzuki K Poulose

On 08/04/16 11:24, Marc Zyngier wrote:

On 08/04/16 10:58, Suzuki K Poulose wrote:

On 07/04/16 18:31, Marc Zyngier wrote:


+   All system register encodings above use the form
+
+   Op0, Op1, CRn, CRm, Op2.
+
+   Note that some of the encodings listed above include
+   the system register space reserved for the following
+   identification registers which may appear in future revisions
+   of the ARM architecture beyond ARMv8.0.
+   This space includes:
+   ID_AA64PFR[2-7]_EL1
+   ID_AA64DFR[2-3]_EL1
+   ID_AA64AFR[2-3]_EL1
+   ID_AA64ISAR[2-7]_EL1
+   ID_AA64MMFR[2-7]_EL1



AFAIK, the id space is unassigned. So the naming above could cause confusion
if the register is named something else.


It is reserved *at the moment*, but already has a defined behaviour. My


Absolutely, they do need to be RAZ.  My point was assigning names to the 
reserved
space where the names are unassigned.


worry is that when some new architecture revision comes around, we start
using these registers without thinking much about it (because we should
be able to). At this point, your SoC will catch fire and nobody will
have a clue about the problem because it is not apparent in the code.

I'd really like to see something a bit more forward looking that covers
that space for good.


I agree, the patch definitely needs to take care of handling the entire space.

Cheers
Suzuki
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] arm64: erratum: Workaround for Kryo reserved system register read

2016-04-08 Thread Suzuki K Poulose

On 07/04/16 18:31, Marc Zyngier wrote:


+   All system register encodings above use the form
+
+   Op0, Op1, CRn, CRm, Op2.
+
+   Note that some of the encodings listed above include
+   the system register space reserved for the following
+   identification registers which may appear in future revisions
+   of the ARM architecture beyond ARMv8.0.
+   This space includes:
+   ID_AA64PFR[2-7]_EL1
+   ID_AA64DFR[2-3]_EL1
+   ID_AA64AFR[2-3]_EL1
+   ID_AA64ISAR[2-7]_EL1
+   ID_AA64MMFR[2-7]_EL1



AFAIK, the id space is unassigned. So the naming above could cause confusion
if the register is named something else.


+
+   check_local_cpu_errata();
+


What is the impact of moving this around? Suzuki, was there any
particular reason why this check was done later rather than earlier?


All the existing errata look for MIDR to match, which is read separately
using read_cpuid_id(). The moment we need to do something w.r.t an ID register,
this will break. So at the moment moving this doesn't have much of an impact.

Thanks
Suzuki
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html