Re: [PATCH v2 2/2] perf: riscv: Add Document for Future Porting Guide

2018-04-04 Thread Alan Kao
Hi Alex,

On Tue, Apr 03, 2018 at 07:08:43PM -0700, Alex Solomatnikov wrote:
> Doc fixes:
> 
>

Thanks for these fixes.  I'll edit this patch and send a v3 once I am done
with the PMU patch.

I suppose appending a "Reviewed-by: Alex Solomatnikov" tag at the end of the
commit message would be appropriate, right?

Alan

> diff --git a/Documentation/riscv/pmu.txt b/Documentation/riscv/pmu.txt
> index a3e930e..ae90a5e 100644
> --- a/Documentation/riscv/pmu.txt
> +++ b/Documentation/riscv/pmu.txt
> @@ -20,7 +20,7 @@ the lack of the following general architectural performance monitoring features:
>  * Enabling/Disabling counters
>    Counters are just free-running all the time in our case.
>  * Interrupt caused by counter overflow
> -  No such design in the spec.
> +  No such feature in the spec.
>  * Interrupt indicator
>    It is not possible to have many interrupt ports for all counters, so an
>    interrupt indicator is required for software to tell which counter has
> @@ -159,14 +159,14 @@ interrupt for perf, so the details are to be completed in the future.
>
>  They seem symmetric but perf treats them quite differently.  For reading, there
>  is a *read* interface in *struct pmu*, but it serves more than just reading.
> -According to the context, the *read* function not only read the content of the
> -counter (event->count), but also update the left period to the next interrupt
> +According to the context, the *read* function not only reads the content of the
> +counter (event->count), but also updates the left period for the next interrupt
>  (event->hw.period_left).
>
>  But the core of perf does not need direct write to counters.  Writing counters
> -hides behind the abstraction of 1) *pmu->start*, literally start counting so one
> +is hidden behind the abstraction of 1) *pmu->start*, literally start counting so one
>  has to set the counter to a good value for the next interrupt; 2) inside the IRQ
> -it should set the counter with the same reason.
> +it should set the counter to the same reasonable value.
> 
>  Reading is not a problem in RISC-V but writing would need some effort, since
>  counters are not allowed to be written by S-mode.
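
For anyone following this hunk, here is a minimal userspace sketch of what
"*read* also updates the left period" can look like.  It is not the code from
the PMU patch: struct mock_event_state, its fields, and mock_event_update() are
hypothetical stand-ins for event->count, the previously observed raw counter
value, and event->hw.period_left.  The shift-based subtraction assumes the
hardware counter may be narrower than 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the perf_event fields that matter here. */
struct mock_event_state {
	uint64_t prev_count;	/* raw counter value seen at the last update */
	uint64_t count;		/* accumulated count, like event->count */
	int64_t period_left;	/* events left until the next sample */
};

/*
 * Fold the hardware delta since the last update into the software state.
 * counter_bits is the implemented counter width (1..64).  Shifting both
 * values up before subtracting makes the difference wrap correctly when
 * the counter is narrower than 64 bits.
 */
static uint64_t mock_event_update(struct mock_event_state *ev,
				  uint64_t raw_now, unsigned int counter_bits)
{
	unsigned int shift;
	uint64_t delta;

	assert(counter_bits >= 1 && counter_bits <= 64);
	shift = 64 - counter_bits;
	delta = ((raw_now << shift) - (ev->prev_count << shift)) >> shift;

	ev->prev_count = raw_now;
	ev->count += delta;			/* "reads the content" */
	ev->period_left -= (int64_t)delta;	/* "updates the left period" */
	return delta;
}
```

With a 48-bit counter that wraps from 0xFFFFFFFFFFF0 to 0x10, the delta is 32,
so the count grows by 32 and period_left shrinks by 32.
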
> @@ -190,37 +190,37 @@ Three states (event->hw.state) are defined:
>  A normal flow of these state transitions are as follows:
> 
>  * A user launches a perf event, resulting in calling to *event_init*.
> -* When being context-switched in, *add* is called by the perf core, with flag
> -  PERF_EF_START, which mean that the event should be started after it is added.
> -  In this stage, an general event is binded to a physical counter, if any.
> +* When being context-switched in, *add* is called by the perf core, with a flag
> +  PERF_EF_START, which means that the event should be started after it is added.
> +  At this stage, a general event is bound to a physical counter, if any.
>    The state changes to PERF_HES_STOPPED and PERF_HES_UPTODATE, because it is now
>    stopped, and the (software) event count does not need updating.
>  ** *start* is then called, and the counter is enabled.
> -   With flag PERF_EF_RELOAD, it write the counter to an appropriate value (check
> -   previous section for detail).
> -   No writing is made if the flag does not contain PERF_EF_RELOAD.
> -   The state now is reset to none, because it is neither stopped nor update
> -   (the counting already starts)
> -* When being context-switched out, *del* is called.  It then checkout all the
> -  events in the PMU and call *stop* to update their counts.
> +   With flag PERF_EF_RELOAD, it writes an appropriate value to the counter (check
> +   the previous section for details).
> +   Nothing is written if the flag does not contain PERF_EF_RELOAD.
> +   The state now is reset to none, because it is neither stopped nor updated
> +   (the counting already started)
> +* When being context-switched out, *del* is called.  It then checks out all the
> +  events in the PMU and calls *stop* to update their counts.
>  ** *stop* is called by *del*
>     and the perf core with flag PERF_EF_UPDATE, and it often shares the same
>     subroutine as *read* with the same logic.
>     The state changes to PERF_HES_STOPPED and PERF_HES_UPTODATE, again.
>
> -** Life cycles of these two pairs: *add* and *del* are called repeatedly as
> +** Life cycle of these two pairs: *add* and *del* are called repeatedly as
>    tasks switch in-and-out; *start* and *stop* is also called when the perf core
>    needs a quick stop-and-start, for instance, when the interrupt period is being
>    adjusted.
> 
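
The flow in this hunk can be sketched as a small userspace mock.  The flag
values are the ones from include/linux/perf_event.h; struct mock_hwc and the
mock_* functions are hypothetical and only track state, they do not touch any
hardware or reflect the actual RISC-V implementation.

```c
#include <assert.h>

/* Flag values as defined in include/linux/perf_event.h. */
#define PERF_EF_START		0x01	/* start the counter when adding */
#define PERF_EF_RELOAD		0x02	/* reload the counter when starting */
#define PERF_EF_UPDATE		0x04	/* update the counter when stopping */

#define PERF_HES_STOPPED	0x01	/* the counter is stopped */
#define PERF_HES_UPTODATE	0x02	/* event->count is up to date */

/* Hypothetical stand-in for struct hw_perf_event plus some bookkeeping. */
struct mock_hwc {
	int state;	/* mirrors event->hw.state */
	int idx;	/* bound physical counter, -1 if none */
	int reloads;	/* times the counter would have been written */
	int updates;	/* times the software count was refreshed */
};

static void mock_start(struct mock_hwc *hwc, int flags)
{
	if (flags & PERF_EF_RELOAD)
		hwc->reloads++;		/* real code writes the counter here */
	hwc->state = 0;			/* neither stopped nor up to date */
}

static void mock_stop(struct mock_hwc *hwc, int flags)
{
	hwc->state |= PERF_HES_STOPPED;
	if ((flags & PERF_EF_UPDATE) && !(hwc->state & PERF_HES_UPTODATE)) {
		hwc->updates++;		/* real code shares this with *read* */
		hwc->state |= PERF_HES_UPTODATE;
	}
}

static int mock_add(struct mock_hwc *hwc, int flags)
{
	hwc->idx = 0;			/* bind to a (pretend) free counter */
	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
	if (flags & PERF_EF_START)
		mock_start(hwc, PERF_EF_RELOAD);
	return 0;
}

static void mock_del(struct mock_hwc *hwc)
{
	assert(hwc->idx >= 0);		/* must have been added first */
	mock_stop(hwc, PERF_EF_UPDATE);
	hwc->idx = -1;			/* release the counter */
}
```

Calling mock_add() with PERF_EF_START leaves the state at 0 (running), and a
following mock_del() brings it back to PERF_HES_STOPPED | PERF_HES_UPTODATE
after one count update.
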
> -Current implementation is sufficient for now and can be easily extend to
> +Current implementation is sufficient for now and can be easily extended with new
>  features in the future.
>
>  A. Related Structures
>  ---------------------
> 
> -* struct pmu: include/linux/perf_events.h
> -* struct riscv_pmu: arch/riscv/include/asm/perf_events.h
> +* struct pmu: include/linux/perf_event.h
> +* 

Re: [PATCH v2 2/2] perf: riscv: Add Document for Future Porting Guide

2018-04-03 Thread Alex Solomatnikov
Doc fixes:


diff --git a/Documentation/riscv/pmu.txt b/Documentation/riscv/pmu.txt
index a3e930e..ae90a5e 100644
--- a/Documentation/riscv/pmu.txt
+++ b/Documentation/riscv/pmu.txt
@@ -20,7 +20,7 @@ the lack of the following general architectural performance monitoring features:
 * Enabling/Disabling counters
   Counters are just free-running all the time in our case.
 * Interrupt caused by counter overflow
-  No such design in the spec.
+  No such feature in the spec.
 * Interrupt indicator
   It is not possible to have many interrupt ports for all counters, so an
   interrupt indicator is required for software to tell which counter has
@@ -159,14 +159,14 @@ interrupt for perf, so the details are to be completed in the future.

 They seem symmetric but perf treats them quite differently.  For reading, there
 is a *read* interface in *struct pmu*, but it serves more than just reading.
-According to the context, the *read* function not only read the content of the
-counter (event->count), but also update the left period to the next interrupt
+According to the context, the *read* function not only reads the content of the
+counter (event->count), but also updates the left period for the next interrupt
 (event->hw.period_left).

 But the core of perf does not need direct write to counters.  Writing counters
-hides behind the abstraction of 1) *pmu->start*, literally start counting so one
+is hidden behind the abstraction of 1) *pmu->start*, literally start counting so one
 has to set the counter to a good value for the next interrupt; 2) inside the IRQ
-it should set the counter with the same reason.
+it should set the counter to the same reasonable value.

 Reading is not a problem in RISC-V but writing would need some effort, since
 counters are not allowed to be written by S-mode.
@@ -190,37 +190,37 @@ Three states (event->hw.state) are defined:
 A normal flow of these state transitions are as follows:

 * A user launches a perf event, resulting in calling to *event_init*.
-* When being context-switched in, *add* is called by the perf core, with flag
-  PERF_EF_START, which mean that the event should be started after it is added.
-  In this stage, an general event is binded to a physical counter, if any.
+* When being context-switched in, *add* is called by the perf core, with a flag
+  PERF_EF_START, which means that the event should be started after it is added.
+  At this stage, a general event is bound to a physical counter, if any.
   The state changes to PERF_HES_STOPPED and PERF_HES_UPTODATE, because it is now
   stopped, and the (software) event count does not need updating.
 ** *start* is then called, and the counter is enabled.
-   With flag PERF_EF_RELOAD, it write the counter to an appropriate value (check
-   previous section for detail).
-   No writing is made if the flag does not contain PERF_EF_RELOAD.
-   The state now is reset to none, because it is neither stopped nor update
-   (the counting already starts)
-* When being context-switched out, *del* is called.  It then checkout all the
-  events in the PMU and call *stop* to update their counts.
+   With flag PERF_EF_RELOAD, it writes an appropriate value to the counter (check
+   the previous section for details).
+   Nothing is written if the flag does not contain PERF_EF_RELOAD.
+   The state now is reset to none, because it is neither stopped nor updated
+   (the counting already started)
+* When being context-switched out, *del* is called.  It then checks out all the
+  events in the PMU and calls *stop* to update their counts.
 ** *stop* is called by *del*
    and the perf core with flag PERF_EF_UPDATE, and it often shares the same
    subroutine as *read* with the same logic.
    The state changes to PERF_HES_STOPPED and PERF_HES_UPTODATE, again.

-** Life cycles of these two pairs: *add* and *del* are called repeatedly as
+** Life cycle of these two pairs: *add* and *del* are called repeatedly as
   tasks switch in-and-out; *start* and *stop* is also called when the perf core
   needs a quick stop-and-start, for instance, when the interrupt period is being
   adjusted.

-Current implementation is sufficient for now and can be easily extend to
+Current implementation is sufficient for now and can be easily extended with new
 features in the future.

 A. Related Structures
 ---------------------

-* struct pmu: include/linux/perf_events.h
-* struct riscv_pmu: arch/riscv/include/asm/perf_events.h
+* struct pmu: include/linux/perf_event.h
+* struct riscv_pmu: arch/riscv/include/asm/perf_event.h

   Both structures are designed to be read-only.

@@ -231,13 +231,13 @@ perf's internal state machine (check kernel/events/core.c for details).
   *struct riscv_pmu* defines PMU-specific parameters.  The naming follows the
 convention of all other architectures.

-* struct perf_event: include/linux/perf_events.h
+* struct perf_event: include/linux/perf_event.h
 * struct hw_perf_event

   The generic structure that represents perf events, 

[PATCH v2 2/2] perf: riscv: Add Document for Future Porting Guide

2018-04-02 Thread Alan Kao
Cc: Nick Hu 
Cc: Greentime Hu 
Signed-off-by: Alan Kao 
---
 Documentation/riscv/pmu.txt | 249 
 1 file changed, 249 insertions(+)
 create mode 100644 Documentation/riscv/pmu.txt

diff --git a/Documentation/riscv/pmu.txt b/Documentation/riscv/pmu.txt
new file mode 100644
index ..a3e930ed5141
--- /dev/null
+++ b/Documentation/riscv/pmu.txt
@@ -0,0 +1,249 @@
+Supporting PMUs on RISC-V platforms
+===================================
+Alan Kao , Mar 2018
+
+Introduction
+------------
+
+As of this writing, perf_event-related features mentioned in The RISC-V ISA
+Privileged Version 1.10 are as follows:
+(please check the manual for more details)
+
+* [m|s]counteren
+* mcycle[h], cycle[h]
+* minstret[h], instret[h]
+* mhpeventx, mhpcounterx[h]
+
+With only this set of features, porting perf would require a lot of work, due to
+the lack of the following general architectural performance monitoring features:
+
+* Enabling/Disabling counters
+  Counters are just free-running all the time in our case.
+* Interrupt caused by counter overflow
+  No such design in the spec.
+* Interrupt indicator
+  It is not possible to have many interrupt ports for all counters, so an
+  interrupt indicator is required for software to tell which counter has
+  just overflowed.
+* Writing to counters
+  There will be an SBI to support this since the kernel cannot modify the
+  counters [1].  Alternatively, some vendors are considering implementing a
+  hardware extension for M-S-U model machines to write counters directly.
+
+This document aims to provide developers a quick guide on supporting their
+PMUs in the kernel.  The following sections briefly explain perf's mechanism
+and the remaining to-dos.
+
+You may check previous discussions here [1][2].  Also, it might be helpful
+to check the appendix for related kernel structures.
+
+
+1. Initialization
+-----------------
+
+*riscv_pmu* is a global pointer of type *struct riscv_pmu*, which contains
+various methods according to perf's internal convention and PMU-specific
+parameters.  One should declare such an instance to represent the PMU.  By
+default, *riscv_pmu* points to a constant structure *riscv_base_pmu*, which
+has very basic support for a baseline QEMU model.
+
+One can then either assign the instance's pointer to *riscv_pmu* so that
+the minimal and already-implemented logic can be leveraged, or write a custom
+*riscv_init_platform_pmu* implementation.
+
+In other words, existing sources of *riscv_base_pmu* merely provide a
+reference implementation.  Developers can flexibly decide how many parts they
+can leverage, and in the most extreme case, they can customize every function
+according to their needs.
+
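
A rough userspace sketch of the two options described in this section:
keeping the baseline description, or pointing the global pointer at a
platform-specific one from an init hook.  struct mock_riscv_pmu, its fields,
and every value below are assumptions made for illustration; they are not the
layout or contents of the real *struct riscv_pmu* or *riscv_base_pmu*.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct riscv_pmu. */
struct mock_riscv_pmu {
	const char *name;
	int num_counters;	/* how many counters the platform exposes */
	int counter_width;	/* implemented bits per counter */
};

/* Baseline description, playing the role of *riscv_base_pmu*. */
static const struct mock_riscv_pmu mock_base_pmu = {
	.name		= "base",
	.num_counters	= 2,	/* illustrative value only */
	.counter_width	= 63,	/* illustrative value only */
};

/* Global pointer, playing the role of *riscv_pmu*; defaults to the baseline. */
static const struct mock_riscv_pmu *mock_pmu = &mock_base_pmu;

/* A vendor-specific description (all values are made up). */
static const struct mock_riscv_pmu mock_vendor_pmu = {
	.name		= "vendor",
	.num_counters	= 8,
	.counter_width	= 64,
};

/* Switch the global pointer to another read-only description. */
static void mock_set_pmu(const struct mock_riscv_pmu *pmu)
{
	assert(pmu != NULL && pmu->num_counters > 0);
	mock_pmu = pmu;
}

/* Platform hook in the spirit of *riscv_init_platform_pmu*. */
static void mock_init_platform_pmu(void)
{
	mock_set_pmu(&mock_vendor_pmu);
}
```

Keeping both descriptions const mirrors the idea that the structures are
meant to be read-only; only the pointer changes.
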
+
+2. Event Initialization
+-----------------------
+
+When a user launches a perf command to monitor some events, it is first
+interpreted by the userspace perf tool into multiple *perf_event_open*
+system calls, and then each of them calls to the body of *event_init*
+member function that was assigned in the previous step.  In *riscv_base_pmu*'s
+case, it is *riscv_event_init*.
+
+The main purpose of this function is to translate the event provided by the
+user into a bitmap, so that HW-related control registers or counters can be
+manipulated directly.  The translation is based on the mappings and methods
+provided in *riscv_pmu*.
+
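
As an illustration of the translation step, here is a minimal userspace
sketch of a table-based mapping from generic perf hardware event IDs to
hardware codes.  The PERF_COUNT_HW_* values follow
include/uapi/linux/perf_event.h; the MOCK_* codes, struct mock_event, and the
mock_* functions are hypothetical, and what a returned code means to the
hardware (a bit position, a selector value) is left open here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Generic hardware event IDs, values as in include/uapi/linux/perf_event.h. */
enum {
	PERF_COUNT_HW_CPU_CYCLES	= 0,
	PERF_COUNT_HW_INSTRUCTIONS	= 1,
	PERF_COUNT_HW_CACHE_REFERENCES	= 2,
};

#define MOCK_HW_CYCLE		0	/* hypothetical hardware event code */
#define MOCK_HW_INSTRET		2	/* hypothetical hardware event code */
#define MOCK_UNSUPPORTED	(-1)	/* marks events the PMU cannot count */

/* Mapping table that a PMU description could carry. */
static const int mock_hw_event_map[] = {
	[PERF_COUNT_HW_CPU_CYCLES]		= MOCK_HW_CYCLE,
	[PERF_COUNT_HW_INSTRUCTIONS]		= MOCK_HW_INSTRET,
	[PERF_COUNT_HW_CACHE_REFERENCES]	= MOCK_UNSUPPORTED,
};

/* Translate a generic event ID; returns a hardware code or -ENOENT. */
static int mock_map_hw_event(uint64_t config)
{
	size_t n = sizeof(mock_hw_event_map) / sizeof(mock_hw_event_map[0]);
	int code;

	if (config >= n)
		return -ENOENT;
	code = mock_hw_event_map[config];
	return code == MOCK_UNSUPPORTED ? -ENOENT : code;
}

/* Hypothetical event: the user-supplied config plus the translated code. */
struct mock_event {
	uint64_t config;
	int hw_code;
};

/* Event-init step: translate once and remember the result in the event. */
static int mock_event_init(struct mock_event *ev)
{
	int code;

	assert(ev != NULL);
	code = mock_map_hw_event(ev->config);
	if (code < 0)
		return code;	/* leave the event untouched on failure */
	ev->hw_code = code;
	return 0;
}
```

Unsupported or out-of-range events are rejected with -ENOENT before any
hardware state would be touched.
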
+Note that some features can be done in this stage as well:
+
+(1) interrupt setting, which is stated in the next section;
+(2) privilege level setting (user space only, kernel space only, both);
+(3) destructor setting.  Normally it is sufficient to apply *riscv_destroy_event*;
+(4) tweaks for non-sampling events, which will be utilized by functions such as
+*perf_adjust_period*, usually something like the following:
+
+  if (!is_sampling_event(event)) {
+          hwc->sample_period = x86_pmu.max_period;
+          hwc->last_period = hwc->sample_period;
+          local64_set(&hwc->period_left, hwc->sample_period);
+  }
+
+In the case of *riscv_base_pmu*, only (3) is provided for now.
+
+
+3. Interrupt
+------------
+
+3.1. Interrupt Initialization
+
+This often occurs at the beginning of the *event_init* method. In common
+practice, this should be a code segment like
+
+  int x86_reserve_hardware(void)
+  {
+          int err = 0;
+
+          if (!atomic_inc_not_zero(&pmc_refcount)) {
+                  mutex_lock(&pmc_reserve_mutex);
+                  if (atomic_read(&pmc_refcount) == 0) {
+                          if (!reserve_pmc_hardware())
+                                  err = -EBUSY;
+                          else
+                                  reserve_ds_buffers();
+                  }
+                  if (!err)
+                          atomic_inc(&pmc_refcount);
+                  mutex_unlock(&pmc_reserve_mutex);
+          }
+
+          return err;
+  }
+
+And the magic is in *reserve_pmc_hardware*, which usually does atomic
+operations to make implemented IRQ accessible from some global function pointer.
+*release_pmc_hardware* serves the