date:20210321

This patch adds support for HPB 2.0.

HPB 2.0 supports reads of varying sizes from 4KB to 512KB.
A read of up to 32KB is issued as a single HPB read.
A read of 36KB to 512KB is issued as a combination of a write buffer
command and an HPB read command, so that more PPNs can be delivered.
The write buffer command may not be issued immediately because of busy tags.
To use HPB read more aggressively, the driver can requeue the write buffer
command. The requeue threshold is implemented as a timeout and can be
modified through the requeue_timeout_ms entry in sysfs.
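
The size rules above can be sketched as a small classifier. This is only an
illustration: the enum, the macro names and hpb_classify_read() are
hypothetical and do not appear in the driver. Only the 4KB, 32KB and 512KB
limits come from the description, and treating sizes that are not a multiple
of 4KB as "no HPB" is an assumption.

```c
#include <assert.h>

/* Hypothetical names; only the size limits come from the description. */
#define HPB_BLOCK_KB       4u   /* smallest HPB read */
#define HPB_SINGLE_MAX_KB  32u  /* largest read served by one HPB read */
#define HPB_MULTI_MAX_KB   512u /* largest read HPB 2.0 can serve */

static_assert(HPB_SINGLE_MAX_KB % HPB_BLOCK_KB == 0,
	      "limits must be multiples of the block size");

enum hpb_read_kind {
	HPB_READ_NONE,      /* fall back to a normal READ */
	HPB_READ_SINGLE,    /* one HPB read command */
	HPB_READ_WITH_PREP, /* write buffer command, then HPB read */
};

/* Decide how a read of len_kb kilobytes would be issued. */
static enum hpb_read_kind hpb_classify_read(unsigned int len_kb)
{
	/* Assumption: unaligned or empty reads are not converted. */
	if (len_kb < HPB_BLOCK_KB || len_kb % HPB_BLOCK_KB != 0)
		return HPB_READ_NONE;
	if (len_kb <= HPB_SINGLE_MAX_KB)
		return HPB_READ_SINGLE;
	if (len_kb <= HPB_MULTI_MAX_KB)
		return HPB_READ_WITH_PREP;
	return HPB_READ_NONE;
}
```

With these limits, a 32KB read maps to HPB_READ_SINGLE and a 36KB read to
HPB_READ_WITH_PREP.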

Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs |  47 +-
 drivers/scsi/ufs/ufs-sysfs.c   |   4 +
 drivers/scsi/ufs/ufs.h |   3 +-
 drivers/scsi/ufs/ufshcd.c  |  25 +-
 drivers/scsi/ufs/ufshcd.h  |   7 +
 drivers/scsi/ufs/ufshpb.c  | 626 +++--
 drivers/scsi/ufs/ufshpb.h  |  67 ++-
 7 files changed, 698 insertions(+), 81 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index 528bf89fc98b..419adf450b89 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1253,14 +1253,14 @@ Description:  This entry shows the number of HPB pinned regions assigned to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/hit_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/hit_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that changed to HPB read.
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/miss_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/miss_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that cannot be changed to
@@ -1268,7 +1268,7 @@ Description:  This entry shows the number of reads that cannot be changed to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_noti_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_noti_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of response UPIUs that has
@@ -1276,7 +1276,7 @@ Description:  This entry shows the number of response UPIUs that has
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_active_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_active_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of active sub-regions recommended by
@@ -1284,7 +1284,7 @@ Description:  This entry shows the number of active sub-regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_inactive_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_inactive_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of inactive regions recommended by
@@ -1292,10 +1292,45 @@ Description:  This entry shows the number of inactive regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/map_req_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/map_req_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of read buffer commands for
activating sub-regions recommended by response UPIUs.
 
The file is read only.
+
+What:  /sys/class/scsi_device/*/device/hpb_param_sysfs/requeue_timeout_ms
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the requeue timeout threshold for write buffer
+   command in ms. This value can be changed by writing proper integer to
+   this entry.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/attributes/max_data_size_hpb_single_cmd
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the maximum HPB data size for using single HPB
+   command.
+
+   ===  ======
+   00h  4KB
+   01h  8KB
+   02h  12KB
+   ...
+   FFh  1024KB
+   ===  ======
+
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/flags/hpb_enable
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the status of HPB.
+
+   ==  ===================
+   0   HPB is not enabled.
+   1   HPB is enabled.
+   ==  ===================
+
+

[PATCH v31 3/4] scsi: ufs: Prepare HPB read for cached sub-region

This patch changes the read I/O to the HPB read I/O.

If the logical address of the read I/O belongs to an active sub-region, the
HPB driver modifies the read I/O command to an HPB read. It modifies the UPIU
command of UFS instead of modifying the existing SCSI command.

In HPB version 1.0, the maximum read I/O size that can be converted to
HPB read is 4KB.

The dirty map of the active sub-region prevents an incorrect HPB read that
would carry a stale physical page number, that is, one invalidated by a
previous write I/O.
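
The role of the dirty map can be sketched with a single 64-entry sub-region.
This is a simplified illustration, not the driver code: struct srgn_cache,
srgn_mark_dirty() and srgn_test_dirty() are hypothetical names, and the real
driver uses kernel bitmaps that span sub-region and region boundaries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ENTRIES_PER_SRGN 64u /* hypothetical; real sub-regions are far larger */

struct srgn_cache {
	uint64_t ppn[ENTRIES_PER_SRGN]; /* cached L2P entries */
	uint64_t dirty;                 /* bit i set: ppn[i] may be stale */
};

/* A write or discard covering [offset, offset + cnt) marks the entries stale. */
static void srgn_mark_dirty(struct srgn_cache *c, unsigned int offset,
			    unsigned int cnt)
{
	assert(offset + cnt <= ENTRIES_PER_SRGN);
	for (unsigned int i = offset; i < offset + cnt; i++)
		c->dirty |= UINT64_C(1) << i;
}

/* True if any entry in the range is stale, so HPB read must not be used. */
static bool srgn_test_dirty(const struct srgn_cache *c, unsigned int offset,
			    unsigned int cnt)
{
	assert(offset + cnt <= ENTRIES_PER_SRGN);
	for (unsigned int i = offset; i < offset + cnt; i++)
		if (c->dirty & (UINT64_C(1) << i))
			return true;
	return false;
}
```

A read is converted to HPB read only when srgn_test_dirty() returns false for
every entry it touches; otherwise it goes out as a normal read.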

Reviewed-by: Can Guo 
Reviewed-by: Bart Van Assche 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufshcd.c |   2 +
 drivers/scsi/ufs/ufshpb.c | 257 +-
 drivers/scsi/ufs/ufshpb.h |   2 +
 3 files changed, 258 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 88280310bb64..a7cf9278965c 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2653,6 +2653,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 
lrbp->req_abort_skip = false;
 
+   ufshpb_prep(hba, lrbp);
+
ufshcd_comp_scsi_upiu(hba, lrbp);
 
err = ufshcd_map_sg(hba, lrbp);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index c67acfc8c6bf..f789339f68d9 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -46,6 +46,29 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int state)
atomic_set(&hpb->hpb_state, state);
 }
 
+static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn,
+   struct ufshpb_subregion *srgn)
+{
+   return rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID;
+}
+
+static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd)
+{
+   return req_op(cmd->request) == REQ_OP_READ;
+}
+
+static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd)
+{
+   return op_is_write(req_op(cmd->request)) ||
+  op_is_discard(req_op(cmd->request));
+}
+
+static bool ufshpb_is_support_chunk(int transfer_len)
+{
+   return transfer_len <= HPB_MULTI_CHUNK_HIGH;
+}
+
 static bool ufshpb_is_general_lun(int lun)
 {
return lun < UFS_UPIU_MAX_UNIT_NUM_ID;
@@ -80,8 +103,8 @@ static void ufshpb_kick_map_work(struct ufshpb_lu *hpb)
 }
 
 static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
-struct ufshcd_lrb *lrbp,
-struct utp_hpb_rsp *rsp_field)
+   struct ufshcd_lrb *lrbp,
+   struct utp_hpb_rsp *rsp_field)
 {
/* Check HPB_UPDATE_ALERT */
if (!(lrbp->ucd_rsp_ptr->header.dword_2 &
@@ -107,6 +130,234 @@ static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
return true;
 }
 
+static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int set_bit_len;
+   int bitmap_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if ((srgn_offset + cnt) > bitmap_len)
+   set_bit_len = bitmap_len - srgn_offset;
+   else
+   set_bit_len = cnt;
+
+   if (rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID)
+   bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len);
+
+   srgn_offset = 0;
+   if (++srgn_idx == hpb->srgns_per_rgn) {
+   srgn_idx = 0;
+   rgn_idx++;
+   }
+
+   cnt -= set_bit_len;
+   if (cnt > 0)
+   goto next_srgn;
+}
+
+static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+ int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int bitmap_len;
+   int bit_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if (!ufshpb_is_valid_srgn(rgn, srgn))
+   return true;
+
+   /*
+* If the region state is active, mctx must be allocated.
+* In this case, check whether the region is evicted or
+* mctx allocation failed.
+*/
+   if (unlikely(!srgn->mctx)) {
+   dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+   "no mctx in region %d subregion %d.\n",
+   srgn->rgn_idx, srgn->srgn_idx);
+   return true;

[PATCH v31 2/4] scsi: ufs: L2P map management for HPB read

This patch manages the L2P map in the HPB module.

HPB divides the logical address space into several regions. A region consists
of several sub-regions. The sub-region is the basic unit in which L2P mapping
is managed. The driver loads the L2P mapping data of each sub-region; a loaded
sub-region is said to be in the active state. The HPB driver unloads L2P
mapping data in units of regions; an unloaded region is said to be in the
inactive state.
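
The region/sub-region split can be illustrated by mapping a logical page
number to its position in the map. This is a minimal sketch under stated
assumptions: struct hpb_geometry, struct hpb_pos and hpb_locate() are
hypothetical names, and plain division is used for clarity where a driver
would more likely use shifts and masks.

```c
#include <assert.h>

/* Hypothetical description of how the logical space is partitioned. */
struct hpb_geometry {
	unsigned int entries_per_srgn; /* L2P entries in one sub-region */
	unsigned int srgns_per_rgn;    /* sub-regions in one region */
};

struct hpb_pos {
	unsigned int rgn_idx;     /* region containing the page */
	unsigned int srgn_idx;    /* sub-region within that region */
	unsigned int srgn_offset; /* entry within that sub-region */
};

/* Locate logical page number lpn in the region/sub-region hierarchy. */
static struct hpb_pos hpb_locate(const struct hpb_geometry *g,
				 unsigned long lpn)
{
	struct hpb_pos pos;
	unsigned long srgn_abs;

	assert(g->entries_per_srgn > 0 && g->srgns_per_rgn > 0);

	srgn_abs = lpn / g->entries_per_srgn; /* sub-region counted from 0 */
	pos.rgn_idx = (unsigned int)(srgn_abs / g->srgns_per_rgn);
	pos.srgn_idx = (unsigned int)(srgn_abs % g->srgns_per_rgn);
	pos.srgn_offset = (unsigned int)(lpn % g->entries_per_srgn);
	return pos;
}
```

Loading happens per (rgn_idx, srgn_idx) pair, while eviction drops every
sub-region that shares the same rgn_idx.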

Sub-region/region candidates to be loaded and unloaded are delivered from
the UFS device. The UFS device delivers the recommended active sub-regions
and inactive regions to the driver using sense data.
The HPB module performs L2P mapping management on the host based on the
delivered information.

A pinned region is a preset region on the UFS device that is always in the
active state.

The data structures for map data requests and the L2P map use the mempool API,
minimizing allocation overhead while avoiding static allocation.

The minimum size of the memory pool used by HPB is implemented
as a module parameter, so that it can be configured by the user.

To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096
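
As a rough guide to choosing this value, the comment in the patch states that
a 2MB cache covers a 1GB logical range. The sketch below reproduces that
arithmetic. It assumes 8-byte map entries that each describe one 4KB logical
block, which is consistent with the 2MB-to-1GB ratio; the macro names and
hpb_covered_kbytes() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HPB_ENTRY_BYTES 8u    /* assumed size of one cached L2P entry */
#define HPB_BLOCK_BYTES 4096u /* assumed logical block covered per entry */

static_assert(HPB_BLOCK_BYTES % 1024u == 0, "block size must be whole KB");

/* Logical range, in KB, that a map cache of map_kbytes can describe. */
static uint64_t hpb_covered_kbytes(uint64_t map_kbytes)
{
	uint64_t entries = map_kbytes * 1024u / HPB_ENTRY_BYTES;

	return entries * (HPB_BLOCK_BYTES / 1024u);
}
```

Under these assumptions the default of 2048 covers 1GB, and
ufshpb_host_map_kbytes=4096 covers 2GB.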

The map_work manages activation and inactivation through two "to-do" lists.
Each HPB LUN maintains two "to-do" lists:
  hpb->lh_inact_rgn - regions to be inactivated, and
  hpb->lh_act_srgn - subregions to be activated
Those lists are maintained on IO completion.

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufs.h|   36 ++
 drivers/scsi/ufs/ufshcd.c |4 +
 drivers/scsi/ufs/ufshpb.c | 1088 -
 drivers/scsi/ufs/ufshpb.h |   65 +++
 4 files changed, 1178 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
index 4eee7e31d08d..bfb84d2ba990 100644
--- a/drivers/scsi/ufs/ufs.h
+++ b/drivers/scsi/ufs/ufs.h
@@ -478,6 +478,41 @@ struct utp_cmd_rsp {
u8 sense_data[UFS_SENSE_SIZE];
 };
 
+struct ufshpb_active_field {
+   __be16 active_rgn;
+   __be16 active_srgn;
+};
+#define HPB_ACT_FIELD_SIZE 4
+
+/**
+ * struct utp_hpb_rsp - Response UPIU structure
+ * @residual_transfer_count: Residual transfer count DW-3
+ * @reserved1: Reserved double words DW-4 to DW-7
+ * @sense_data_len: Sense data length DW-8 U16
+ * @desc_type: Descriptor type of sense data
+ * @additional_len: Additional length of sense data
+ * @hpb_op: HPB operation type
+ * @lun: LUN of response UPIU
+ * @active_rgn_cnt: Active region count
+ * @inactive_rgn_cnt: Inactive region count
+ * @hpb_active_field: Recommended to read HPB region and subregion
+ * @hpb_inactive_field: To be inactivated HPB region and subregion
+ */
+struct utp_hpb_rsp {
+   __be32 residual_transfer_count;
+   __be32 reserved1[4];
+   __be16 sense_data_len;
+   u8 desc_type;
+   u8 additional_len;
+   u8 hpb_op;
+   u8 lun;
+   u8 active_rgn_cnt;
+   u8 inactive_rgn_cnt;
+   struct ufshpb_active_field hpb_active_field[2];
+   __be16 hpb_inactive_field[2];
+};
+#define UTP_HPB_RSP_SIZE 40
+
 /**
  * struct utp_upiu_rsp - general upiu response structure
  * @header: UPIU header structure DW-0 to DW-2
@@ -488,6 +523,7 @@ struct utp_upiu_rsp {
struct utp_upiu_header header;
union {
struct utp_cmd_rsp sr;
+   struct utp_hpb_rsp hr;
struct utp_upiu_query qr;
};
 };
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index ddeb5bb9fb88..88280310bb64 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5018,6 +5018,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
 */
pm_runtime_get_noresume(hba->dev);
}
+
+   if (scsi_status == SAM_STAT_GOOD)
+   ufshpb_rsp_upiu(hba, lrbp);
break;
case UPIU_TRANSACTION_REJECT_UPIU:
/* TODO: handle Reject UPIU Response */
@@ -9233,6 +9236,7 @@ EXPORT_SYMBOL(ufshcd_shutdown);
 void ufshcd_remove(struct ufs_hba *hba)
 {
ufs_bsg_remove(hba);
+   ufshpb_remove(hba);
ufs_sysfs_remove_nodes(hba->dev);
blk_cleanup_queue(hba->tmf_queue);
blk_mq_free_tag_set(&hba->tmf_tag_set);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 1a72f6541510..c67acfc8c6bf 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -16,6 +16,16 @@
 #include "ufshpb.h"
 #include "../sd.h"
 
+/* memory management */
+static struct kmem_cache *ufshpb_mctx_cache;
+static mempool_t *ufshpb_mctx_pool;
+static mempool_t *ufshpb_page_pool;
+/* A cache size of 2MB can cache ppn in the 1GB range. */
+static unsigned int ufshpb_host_map_kbytes = 2048;
+static int tot_active_srgn_pages;
+
+static struct workqueue_struct *ufshpb_wq;
+
 bool ufshpb_is_allo

[PATCH v31 1/4] scsi: ufs: Introduce HPB feature

This is a patch for the HPB initialization and adds HPB function calls to
UFS core driver.

NAND flash-based storage devices, including UFS, have mechanisms to
translate logical addresses of IO requests to the corresponding physical
addresses of the flash storage.
In UFS, Logical-address-to-Physical-address (L2P) map data, which is
required to identify the physical address for the requested IOs, can only
be partially stored in SRAM from NAND flash. Due to this partial loading,
accessing the flash address area where the L2P information for that address
is not loaded in the SRAM can result in serious performance degradation.

The basic concept of HPB is to cache L2P mapping entries in host system
memory so that both physical block address (PBA) and logical block address
(LBA) can be delivered in HPB read command.
The HPB READ command reads data faster than a regular UFS read command
because it provides the physical address (HPB Entry) of the desired logical
block in addition to its logical address. The UFS device can access the
physical block in NAND directly without searching and uploading L2P mapping
table. This improves read performance because the NAND read operation for
uploading L2P mapping table is removed.

In HPB initialization, the host checks if the UFS device supports HPB
feature and retrieves related device capabilities. Then, some HPB
parameters are configured in the device.

We measured the total start-up time of popular applications and observed
the difference made by enabling HPB.
The popular applications are 12 game apps and 24 non-game apps. The target
applications were launched in order. One cycle consists of running the 36
applications in sequence. We repeated the cycle to observe the performance
improvement from L2P mapping cache hits in HPB.

The following is the experiment environment:
 - kernel version: 4.4.0
 - RAM: 8GB
 - UFS 2.1 (64GB)

Result:
+-------+----------+----------+-------+
| cycle | baseline | with HPB | diff  |
+-------+----------+----------+-------+
| 1     | 272.4    | 264.9    | -7.5  |
| 2     | 250.4    | 248.2    | -2.2  |
| 3     | 226.2    | 215.6    | -10.6 |
| 4     | 230.6    | 214.8    | -15.8 |
| 5     | 232.0    | 218.1    | -13.9 |
| 6     | 231.9    | 212.6    | -19.3 |
+-------+----------+----------+-------+

We also measured HPB performance using iozone.
Here is my iozone script:
iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16
-s $IO_RANGE/16 -F mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5
mnt/tmp_6 mnt/tmp_7 mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12
mnt/tmp_13 mnt/tmp_14 mnt/tmp_15 mnt/tmp_16

Result:
+----------+--------+---------+
| IO range | HPB on | HPB off |
+----------+--------+---------+
|   1 GB   | 294.8  | 300.87  |
|   4 GB   | 293.51 | 179.35  |
|   8 GB   | 294.85 | 162.52  |
|  16 GB   | 293.45 | 156.26  |
|  32 GB   | 277.4  | 153.25  |
+----------+--------+---------+

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Reported-by: kernel test robot 
Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs | 127 +
 drivers/scsi/ufs/Kconfig   |   9 +
 drivers/scsi/ufs/Makefile  |   1 +
 drivers/scsi/ufs/ufs-sysfs.c   |  18 +
 drivers/scsi/ufs/ufs.h |  15 +
 drivers/scsi/ufs/ufshcd.c  |  49 ++
 drivers/scsi/ufs/ufshcd.h  |  22 +
 drivers/scsi/ufs/ufshpb.c  | 569 +
 drivers/scsi/ufs/ufshpb.h  | 167 ++
 9 files changed, 977 insertions(+)
 create mode 100644 drivers/scsi/ufs/ufshpb.c
 create mode 100644 drivers/scsi/ufs/ufshpb.h

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index d1bc23cb6a9d..528bf89fc98b 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1172,3 +1172,130 @@ Description:  This node is used to set or display whether UFS WriteBooster is
(if the platform supports UFSHCD_CAP_CLK_SCALING). For a
platform that doesn't support UFSHCD_CAP_CLK_SCALING, we can
disable/enable WriteBooster through this sysfs node.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_version
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the HPB specification version.
+   The full information about the descriptor could be found at UFS
+   HPB (Host Performance Booster) Extension specifications.
+   Example: version 1.2.3 = 0123h
+
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_control
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows an indication of the HPB control mode.
+   00h: Host control mode
+   01h: Device control mode
+
+   The

[PATCH] watchdog: fix syntactic kernel-doc issues

2021-03-21 Thread Lukas Bulwahn

The command 'find drivers/watchdog | xargs ./scripts/kernel-doc -none'
reports a number of kernel-doc warnings in the watchdog subsystem.

Address the kernel-doc warnings that were purely syntactic issues with
kernel-doc comments.

The remaining kernel-doc warnings are of type "Excess function parameter"
and "Function parameter or member not described". These warnings would
need to be addressed in a second pass with a bit more insight into the
APIs and purpose of the functions in the watchdog subsystem.
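
For reference, the syntactic form that kernel-doc expects is sketched below:
the comment opens with `/**`, its first line names the function followed by
` - ` and a short description, and each parameter gets an `@name:` line. The
function example_wdt_clamp_timeout() is hypothetical and exists only to carry
the comment; it is not part of the watchdog subsystem.

```c
#include <assert.h>

/**
 * example_wdt_clamp_timeout - clamp a requested timeout to a valid range
 * @timeout: requested timeout in seconds
 * @max: largest timeout the hardware supports, in seconds
 *
 * Hypothetical helper, shown only for the comment layout.
 *
 * Return: @timeout limited to the range 1..@max.
 */
static unsigned int example_wdt_clamp_timeout(unsigned int timeout,
					      unsigned int max)
{
	assert(max >= 1);
	if (timeout < 1)
		return 1;
	return timeout > max ? max : timeout;
}
```

Several hunks in this patch fall outside that form: a comment that opens
with `/**` but has no name line, a name line that lacks the ` - ` separator,
or a doc comment that opens with a plain `/*`.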

Signed-off-by: Lukas Bulwahn 
---
Guenter, Wim, please pick this minor clean-up patch.

 drivers/watchdog/booke_wdt.c   |  2 +-
 drivers/watchdog/eurotechwdt.c |  2 +-
 drivers/watchdog/mei_wdt.c |  8 
 drivers/watchdog/octeon-wdt-main.c | 12 +++-
 drivers/watchdog/pc87413_wdt.c |  2 +-
 drivers/watchdog/wdt.c |  4 ++--
 drivers/watchdog/wdt_pci.c |  2 +-
 7 files changed, 17 insertions(+), 15 deletions(-)

diff --git a/drivers/watchdog/booke_wdt.c b/drivers/watchdog/booke_wdt.c
index 7817fb976f9c..5e4dc1a0f2c6 100644
--- a/drivers/watchdog/booke_wdt.c
+++ b/drivers/watchdog/booke_wdt.c
@@ -148,7 +148,7 @@ static void __booke_wdt_enable(void *data)
 }
 
 /**
- * booke_wdt_disable - disable the watchdog on the given CPU
+ * __booke_wdt_disable - disable the watchdog on the given CPU
  *
  * This function is called on each CPU.  It disables the watchdog on that CPU.
  *
diff --git a/drivers/watchdog/eurotechwdt.c b/drivers/watchdog/eurotechwdt.c
index 2418ebb707bd..ce682942662c 100644
--- a/drivers/watchdog/eurotechwdt.c
+++ b/drivers/watchdog/eurotechwdt.c
@@ -392,7 +392,7 @@ static struct notifier_block eurwdt_notifier = {
 };
 
 /**
- * cleanup_module:
+ * eurwdt_exit:
  *
  * Unload the watchdog. You cannot do this with any file handles open.
  * If your watchdog is set to continue ticking on close and you unload
diff --git a/drivers/watchdog/mei_wdt.c b/drivers/watchdog/mei_wdt.c
index e023d7d90d66..c7a7235e6224 100644
--- a/drivers/watchdog/mei_wdt.c
+++ b/drivers/watchdog/mei_wdt.c
@@ -105,7 +105,7 @@ struct mei_wdt {
 #endif /* CONFIG_DEBUG_FS */
 };
 
-/*
+/**
  * struct mei_mc_hdr - Management Control Command Header
  *
  * @command: Management Control (0x2)
@@ -121,7 +121,7 @@ struct mei_mc_hdr {
 };
 
 /**
- * struct mei_wdt_start_request watchdog start/ping
+ * struct mei_wdt_start_request - watchdog start/ping
  *
  * @hdr: Management Control Command Header
  * @timeout: timeout value
@@ -134,7 +134,7 @@ struct mei_wdt_start_request {
 } __packed;
 
 /**
- * struct mei_wdt_start_response watchdog start/ping response
+ * struct mei_wdt_start_response - watchdog start/ping response
  *
  * @hdr: Management Control Command Header
  * @status: operation status
@@ -474,7 +474,7 @@ static void mei_wdt_rx(struct mei_cl_device *cldev)
complete(&wdt->response);
 }
 
-/*
+/**
  * mei_wdt_notif - callback for event notification
  *
  * @cldev: bus device
diff --git a/drivers/watchdog/octeon-wdt-main.c b/drivers/watchdog/octeon-wdt-main.c
index fde9e739b436..298c070884c4 100644
--- a/drivers/watchdog/octeon-wdt-main.c
+++ b/drivers/watchdog/octeon-wdt-main.c
@@ -119,7 +119,7 @@ static int cpu2core(int cpu)
 }
 
 /**
- * Poke the watchdog when an interrupt is received
+ * octeon_wdt_poke_irq - Poke the watchdog when an interrupt is received
  *
  * @cpl:
  * @dev_id:
@@ -153,7 +153,7 @@ static irqreturn_t octeon_wdt_poke_irq(int cpl, void *dev_id)
 extern int prom_putchar(char c);
 
 /**
- * Write a string to the uart
+ * octeon_wdt_write_string - Write a string to the uart
  *
  * @str:String to write
  */
@@ -165,7 +165,7 @@ static void octeon_wdt_write_string(const char *str)
 }
 
 /**
- * Write a hex number out of the uart
+ * octeon_wdt_write_hex() - Write a hex number out of the uart
  *
  * @value:  Number to display
  * @digits: Number of digits to print (1 to 16)
@@ -192,6 +192,8 @@ static const char reg_name[][3] = {
 };
 
 /**
+ * octeon_wdt_nmi_stage3:
+ *
  * NMI stage 3 handler. NMIs are handled in the following manner:
  * 1) The first NMI handler enables CVMSEG and transfers from
  * the bootbus region into normal memory. It is careful to not
@@ -513,7 +515,7 @@ static struct watchdog_device octeon_wdt = {
 
 static enum cpuhp_state octeon_wdt_online;
 /**
- * Module/ driver initialization.
+ * octeon_wdt_init - Module/ driver initialization.
  *
  * Returns Zero on success
  */
@@ -585,7 +587,7 @@ static int __init octeon_wdt_init(void)
 }
 
 /**
- * Module / driver shutdown
+ * octeon_wdt_cleanup - Module / driver shutdown
  */
 static void __exit octeon_wdt_cleanup(void)
 {
diff --git a/drivers/watchdog/pc87413_wdt.c b/drivers/watchdog/pc87413_wdt.c
index 2d4504302c9e..9f9a340427fc 100644
--- a/drivers/watchdog/pc87413_wdt.c
+++ b/drivers/watchdog/pc87413_wdt.c
@@ -445,7 +445,7 @@ static long pc87413_ioctl(struct file *file, unsigned int cmd,
 /* -- Notifier funtions

[PATCH] ASoC: Intel: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 sound/soc/intel/atom/sst-mfld-dsp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/atom/sst-mfld-dsp.h b/sound/soc/intel/atom/sst-mfld-dsp.h
index 102b0e7eafb0..8d9e29b16e57 100644
--- a/sound/soc/intel/atom/sst-mfld-dsp.h
+++ b/sound/soc/intel/atom/sst-mfld-dsp.h
@@ -256,7 +256,7 @@ struct snd_sst_tstamp {
u32 channel_peak[8];
 } __packed;

-/* Stream type params struture for Alloc stream */
+/* Stream type params structure for Alloc stream */
 struct snd_sst_str_type {
u8 codec_type;  /* Codec type */
u8 str_type;/* 1 = voice 2 = music */
--
2.31.0

[PATCH v31 0/4] scsi: ufs: Add Host Performance Booster Support

Changelog:

v30 -> v31
Delete unnecessary debug message.

v29 -> v30
1. Add support to reuse bio of pre-request.
2. Delete unreached code in the ufshpb_issue_map_req.

v28 -> v29
1. Remove unused variable reported by kernel test robot.

v27 -> v28
1. Fix wrong return value of ufshpb_prep.

v26 -> v27
1. Fix wrong reference of sense buffer in pre_req complete function.
2. Fix read_id error.
3. Fix chunk size checking for HPB 1.0.
4. Mute unnecessary messages before HPB initialization.

v25 -> v26
1. Fix wrong chunk size checking for HPB 1.0.
2. Fix wrong max data size for HPB single command.
3. Fix typo error.

v24 -> v25
1. Change write buffer API for unmap region.
2. Add checking hpb_enable for avoiding unnecessary memory allocation.
3. Change pr_info to dev_info.
4. Change default requeue timeout value for HPB read.
5. Fix wrong offset manipulation on ufshpb_prep_entry.

v23 -> v24
1. Fix build error reported by kernel test robot.

v22 -> v23
1. Add support compatibility of HPB 1.0.
2. Fix read id for single HPB read command.
3. Fix number of pre-allocated requests for write buffer.
4. Add fast path for response UPIU that has same LUN in sense data.
5. Remove WARN_ON for preventing kernel crash.
7. Fix wrong argument for read buffer command.

v21 -> v22
1. Add support processing response UPIU in suspend state.
2. Add support HPB hint from other LU.
3. Add sending write buffer with 0x03 after HPB init.

v20 -> v21
1. Add bMAX_DATA_SIZE_FOR_HPB_SINGLE_CMD attr. and fHPBen flag support.

v19 -> v20
1. Add documentation for sysfs entries of hpb->stat.
2. Fix read buffer command for under-sized sub-region.
3. Fix wrong condition checking for kick map work.
4. Delete redundant response UPIU checking.
5. Add LUN checking in response UPIU.
6. Fix possible deadlock problem due to runtime PM.
7. Add instant changing of sub-region state from response UPIU.
8. Fix endian problem in prefetched PPN.
9. Add JESD220-3A (HPB v2.0) support.

v18 -> v19
1. Fix null pointer error when printing sysfs from non-HPB LU.
2. Apply HPB read opcode in lrbp->cmd->cmnd (from Can Guo's review).
3. Rebase the patch on 5.12/scsi-queue.

v17 -> v18
Fix build error reported by kernel test robot.

v16 -> v17
1. Rename hpb_state_lock to rgn_state_lock and move it to corresponding
patch.
2. Remove redundant information messages.

v15 -> v16
1. Add missed sysfs ABI documentation.

v14 -> v15
1. Remove duplicated sysfs ABI entries in documentation.
2. Add experiment result of HPB performance testing with iozone.

v13 -> v14
1. Clean up code as commented in Greg's review.
2. Add documentation for sysfs entries (from Greg's review).
3. Add experiment result of HPB performance testing.

v12 -> v13
1. Clean up code per comments from Can Guo.
2. Add HPB related descriptor/flag/attributes in sysfs.
3. Change base commit from 5.10/scsi-queue to 5.11/scsi-queue.

v11 -> v12
1. Fixed to return error value when HPB fails to initialize pinned active
region.
2. Fixed to disable HPB feature if HPB fails to allocate essential memory
and workqueue.
3. Fixed to change proper sub-region state when region is already evicted.

v10 -> v11
Add a newline at the end of the last line of the Kconfig file.

v9 -> v10
1. Fixed 64-bit division error
2. Fixed problems commented in Bart's review.

v8 -> v9
1. Change sysfs initialization.
2. Change reading descriptor during HPB initialization
3. Fixed problems commented in Bart's review.
4. Change base commit from 5.9/scsi-queue to 5.10/scsi-queue.

v7 -> v8
Remove wrongly added tags.

v6 -> v7
1. Remove UFS feature layer.
2. Cleanup for sparse error.

v5 -> v6
Change base commit to b53293fa662e28ae0cdd40828dc641c09f133405

v4 -> v5
Delete unused macro define.

v3 -> v4
1. Cleanup.

v2 -> v3
1. Add checking input module parameter value.
2. Change base commit from 5.8/scsi-queue to 5.9/scsi-queue.
3. Cleanup for unused variables and label.

v1 -> v2
1. Change the full boilerplate text to SPDX style.
2. Adopt dynamic allocation for sub-region data structure.
3. Cleanup.

NAND flash memory-based storage devices use Flash Translation Layer (FTL)
to translate logical addresses of I/O requests to corresponding flash
memory addresses. Mobile storage devices typically have RAM of
constrained size and thus lack the memory to keep the whole mapping table.
Therefore, mapping tables are partially retrieved from NAND flash on
demand, causing random-read performance degradation.

To improve random read performance, JESD220-3 (HPB v1.0) proposes HPB
(Host Performance Booster) which uses host system memory as a cache for the
FTL mapping table. By using HPB, FTL data can be read from host memory
faster than from NAND flash memory. 

The current version only supports the DCM (device control mode).
This patch series consists of 3 parts to support the HPB feature.

1) HPB probe and initialization process
2) READ -> HPB READ using cached map information
3) L2P (logical to physical) map management

In the HPB probe and init process, the device informa

Re: [PATCH] s390/kernel: Fix a typo

2021-03-21 Thread Heiko Carstens

On Mon, Mar 22, 2021 at 11:55:00AM +0530, Bhaskar Chowdhury wrote:
> 
> s/struture/structure/
> 
> Signed-off-by: Bhaskar Chowdhury 
> ---
>  arch/s390/kernel/os_info.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/s390/kernel/os_info.c b/arch/s390/kernel/os_info.c
> index 0a5e4bafb6ad..5a7420b23aa8 100644
> --- a/arch/s390/kernel/os_info.c
> +++ b/arch/s390/kernel/os_info.c
> @@ -52,7 +52,7 @@ void os_info_entry_add(int nr, void *ptr, u64 size)
>  }
> 
>  /*
> - * Initialize OS info struture and set lowcore pointer
> + * Initialize OS info structure and set lowcore pointer

Applied, thanks.

[PATCH] ASoC: Intel: Fix a typo

s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 sound/soc/intel/atom/sst/sst.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/atom/sst/sst.h b/sound/soc/intel/atom/sst/sst.h
index 4d37d39fd8f4..978bf4255888 100644
--- a/sound/soc/intel/atom/sst/sst.h
+++ b/sound/soc/intel/atom/sst/sst.h
@@ -344,7 +344,7 @@ struct sst_fw_save {
  * @block_lock : spin lock to add block to block_list and assign pvt_id
  * @rx_msg_lock : spin lock to handle the rx messages from the DSP
  * @scard_ops : sst card ops
- * @pci : sst pci device struture
+ * @pci : sst pci device structure
  * @dev : pointer to current device struct
  * @sst_lock : sst device lock
  * @pvt_id : sst private id
--
2.31.0

Re: [PATCH 2/5] cifsd: add server-side procedures for SMB3

2021-03-21 Thread Christoph Hellwig

On Mon, Mar 22, 2021 at 09:47:13AM +0300, Dan Carpenter wrote:
> On Mon, Mar 22, 2021 at 02:13:41PM +0900, Namjae Jeon wrote:
> > +static unsigned char
> > +asn1_octet_decode(struct asn1_ctx *ctx, unsigned char *ch)
> > +{
> > +   if (ctx->pointer >= ctx->end) {
> > +   ctx->error = ASN1_ERR_DEC_EMPTY;
> > +   return 0;
> > +   }
> > +   *ch = *(ctx->pointer)++;
> > +   return 1;
> > +}
> 
> 
> Make this bool.
>

More importantly don't add another ASN.1 parser, but use the generic
one in lib/asn1_decoder.c instead.  CIFS should also really use it.

[PATCH] ASoC: Intel: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 sound/soc/intel/atom/sst-mfld-dsp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/intel/atom/sst-mfld-dsp.h b/sound/soc/intel/atom/sst-mfld-dsp.h
index 5795f98e04d4..102b0e7eafb0 100644
--- a/sound/soc/intel/atom/sst-mfld-dsp.h
+++ b/sound/soc/intel/atom/sst-mfld-dsp.h
@@ -358,7 +358,7 @@ struct snd_wma_params {
u8 reserved;/* reserved */
 } __packed;

-/* Codec params struture */
+/* Codec params structure */
 union  snd_sst_codec_params {
struct snd_pcm_params pcm_params;
struct snd_mp3_params mp3_params;
--
2.31.0

Re: [PATCH 2/5] cifsd: add server-side procedures for SMB3

2021-03-21 Thread Dan Carpenter

On Mon, Mar 22, 2021 at 02:13:41PM +0900, Namjae Jeon wrote:
> +static unsigned char
> +asn1_octet_decode(struct asn1_ctx *ctx, unsigned char *ch)
> +{
> + if (ctx->pointer >= ctx->end) {
> + ctx->error = ASN1_ERR_DEC_EMPTY;
> + return 0;
> + }
> + *ch = *(ctx->pointer)++;
> + return 1;
> +}


Make this bool.
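
A minimal sketch of what the requested change could look like for this first
helper is shown below. It is an illustration, not the submitted code: the
struct is a cut-down stand-in for the patch's struct asn1_ctx, and the
numeric value given to ASN1_ERR_DEC_EMPTY is assumed.

```c
#include <assert.h>
#include <stdbool.h>

#define ASN1_ERR_DEC_EMPTY 2 /* value assumed for illustration */

/* Cut-down stand-in for the patch's struct asn1_ctx. */
struct asn1_ctx {
	int error;              /* last decode error, 0 if none */
	unsigned char *pointer; /* current read position */
	unsigned char *end;     /* one past the last octet */
};

/* Read one octet into *ch; returns false once the buffer is exhausted. */
static bool asn1_octet_decode(struct asn1_ctx *ctx, unsigned char *ch)
{
	if (ctx->pointer >= ctx->end) {
		ctx->error = ASN1_ERR_DEC_EMPTY;
		return false;
	}
	*ch = *(ctx->pointer)++;
	return true;
}
```

Callers that test the result with `if (!asn1_octet_decode(ctx, &ch))` keep
working unchanged, since only the declared return type and the literals
change.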

> +
> +static unsigned char
> +asn1_tag_decode(struct asn1_ctx *ctx, unsigned int *tag)
> +{
> + unsigned char ch;
> +
> + *tag = 0;
> +
> + do {
> + if (!asn1_octet_decode(ctx, &ch))
> + return 0;
> + *tag <<= 7;
> + *tag |= ch & 0x7F;
> + } while ((ch & 0x80) == 0x80);
> + return 1;
> +}

Bool.

> +
> +static unsigned char
> +asn1_id_decode(struct asn1_ctx *ctx,
> +unsigned int *cls, unsigned int *con, unsigned int *tag)
> +{
> + unsigned char ch;
> +
> + if (!asn1_octet_decode(ctx, &ch))
> + return 0;
> +
> + *cls = (ch & 0xC0) >> 6;
> + *con = (ch & 0x20) >> 5;
> + *tag = (ch & 0x1F);
> +
> + if (*tag == 0x1F) {
> + if (!asn1_tag_decode(ctx, tag))
> + return 0;
> + }
> + return 1;
> +}


Same.

> +
> +static unsigned char
> +asn1_length_decode(struct asn1_ctx *ctx, unsigned int *def, unsigned int 
> *len)
> +{
> + unsigned char ch, cnt;
> +
> + if (!asn1_octet_decode(ctx, &ch))
> + return 0;
> +
> + if (ch == 0x80)
> + *def = 0;
> + else {
> + *def = 1;
> +
> + if (ch < 0x80)
> + *len = ch;
> + else {
> + cnt = (unsigned char) (ch & 0x7F);
> + *len = 0;
> +
> + while (cnt > 0) {
> + if (!asn1_octet_decode(ctx, &ch))
> + return 0;
> + *len <<= 8;
> + *len |= ch;
> + cnt--;
> + }
> + }
> + }
> +
> + /* don't trust len bigger than ctx buffer */
> + if (*len > ctx->end - ctx->pointer)
> + return 0;
> +
> + return 1;
> +}


Same etc for all.

> +
> +static unsigned char
> +asn1_header_decode(struct asn1_ctx *ctx,
> +unsigned char **eoc,
> +unsigned int *cls, unsigned int *con, unsigned int *tag)
> +{
> + unsigned int def = 0;
> + unsigned int len = 0;
> +
> + if (!asn1_id_decode(ctx, cls, con, tag))
> + return 0;
> +
> + if (!asn1_length_decode(ctx, &def, &len))
> + return 0;
> +
> + /* primitive shall be definite, indefinite shall be constructed */
> + if (*con == ASN1_PRI && !def)
> + return 0;
> +
> + if (def)
> + *eoc = ctx->pointer + len;
> + else
> + *eoc = NULL;
> + return 1;
> +}
> +
> +static unsigned char
> +asn1_eoc_decode(struct asn1_ctx *ctx, unsigned char *eoc)
> +{
> + unsigned char ch;
> +
> + if (!eoc) {
> + if (!asn1_octet_decode(ctx, &ch))
> + return 0;
> +
> + if (ch != 0x00) {
> + ctx->error = ASN1_ERR_DEC_EOC_MISMATCH;
> + return 0;
> + }
> +
> + if (!asn1_octet_decode(ctx, &ch))
> + return 0;
> +
> + if (ch != 0x00) {
> + ctx->error = ASN1_ERR_DEC_EOC_MISMATCH;
> + return 0;
> + }
> + } else {
> + if (ctx->pointer != eoc) {
> + ctx->error = ASN1_ERR_DEC_LENGTH_MISMATCH;
> + return 0;
> + }
> + }
> + return 1;
> +}
> +
> +static unsigned char
> +asn1_subid_decode(struct asn1_ctx *ctx, unsigned long *subid)
> +{
> + unsigned char ch;
> +
> + *subid = 0;
> +
> + do {
> + if (!asn1_octet_decode(ctx, &ch))
> + return 0;
> +
> + *subid <<= 7;
> + *subid |= ch & 0x7F;
> + } while ((ch & 0x80) == 0x80);
> + return 1;
> +}
> +
> +static int
> +asn1_oid_decode(struct asn1_ctx *ctx,
> + unsigned char *eoc, unsigned long **oid, unsigned int *len)
> +{
> + unsigned long subid;
> + unsigned int size;
> + unsigned long *optr;
> +
> + size = eoc - ctx->pointer + 1;
> +
> + /* first subid actually encodes first two subids */
> + if (size < 2 || size > UINT_MAX/sizeof(unsigned long))
> + return 0;
> +
> + *oid = kmalloc(size * sizeof(unsigned long), GFP_KERNEL);
> + if (!*oid)
> + return 0;
> +
> + optr = *oid;
> +
> + if (!asn1_subid_decode(ctx, &subid)) {
> + kfree(*oid);
> + *oid = NULL;
> + return 0;
> + }
> +
> + if (subid < 40) {
> + optr[0] = 0;
> + optr[1] = subid;
> + } else if (subid < 80) {
> + optr[0

[PATCH] scsi: scsi_dh: Fix a typo



s/infrastruture/infrastructure/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/scsi/scsi_dh.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_dh.c b/drivers/scsi/scsi_dh.c
index 6f41e4b5a2b8..7b56e00c7df6 100644
--- a/drivers/scsi/scsi_dh.c
+++ b/drivers/scsi/scsi_dh.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
 /*
- * SCSI device handler infrastruture.
+ * SCSI device handler infrastructure.
  *
  * Copyright IBM Corporation, 2007
  *  Authors:
--
2.31.0

Re: [RFC PATCH v4 6/9] KVM: selftests: Add a helper to get system default hugetlb page size

2021-03-21 Thread wangyanan (Y)




On 2021/3/12 19:40, Andrew Jones wrote:

On Tue, Mar 02, 2021 at 08:57:48PM +0800, Yanan Wang wrote:

If HUGETLB is configured in the host kernel, then we can know the system
default hugetlb page size through *cat /proc/meminfo*. Otherwise, the
hugetlb page information does not appear in /proc/meminfo at all.
So add a helper to determine whether HUGETLB is configured and
then get the default page size by reading /proc/meminfo.

This helper can be useful when a program wants to use the default hugetlb
pages of the system and doesn't know the default page size.

Signed-off-by: Yanan Wang 
---
  .../testing/selftests/kvm/include/test_util.h |  1 +
  tools/testing/selftests/kvm/lib/test_util.c   | 27 +++
  2 files changed, 28 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/test_util.h 
b/tools/testing/selftests/kvm/include/test_util.h
index ef24c76ba89a..e087174eefe5 100644
--- a/tools/testing/selftests/kvm/include/test_util.h
+++ b/tools/testing/selftests/kvm/include/test_util.h
@@ -80,6 +80,7 @@ struct vm_mem_backing_src_alias {
  
  bool thp_configured(void);

  size_t get_trans_hugepagesz(void);
+size_t get_def_hugetlb_pagesz(void);
  void backing_src_help(void);
  enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name);
  
diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c

index f2d133f76c67..80d68dbd72d2 100644
--- a/tools/testing/selftests/kvm/lib/test_util.c
+++ b/tools/testing/selftests/kvm/lib/test_util.c
@@ -153,6 +153,33 @@ size_t get_trans_hugepagesz(void)
return size;
  }
  
+size_t get_def_hugetlb_pagesz(void)

+{
+   char buf[64];
+   const char *tag = "Hugepagesize:";
+   FILE *f;
+
+   f = fopen("/proc/meminfo", "r");
+   TEST_ASSERT(f != NULL, "Error in opening /proc/meminfo: %d", errno);
+
+   while (fgets(buf, sizeof(buf), f) != NULL) {
+   if (strstr(buf, tag) == buf) {
+   fclose(f);
+   return strtoull(buf + strlen(tag), NULL, 10) << 10;
+   }
+   }
+
+   if (feof(f)) {
+   fclose(f);
+   TEST_FAIL("HUGETLB is not configured in host kernel");
+   } else {
+   fclose(f);
+   TEST_FAIL("Error in reading /proc/meminfo: %d", errno);
+   }

fclose() can be factored out.


+
+   return 0;
+}
+
  void backing_src_help(void)
  {
int i;
--
2.23.0


Besides the fclose comment and the same errno comment as the previous
patch

I will fix it and add your R-b in this patch.

Thanks,
Yanan

Reviewed-by: Andrew Jones 
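
One possible shape for the "fclose() can be factored out" comment is sketched
below. This is a standalone variant for illustration only: the function name
get_def_hugetlb_pagesz_from() and its path parameter are hypothetical, and it
returns 0 on failure instead of calling TEST_ASSERT()/TEST_FAIL() so that it
can be compiled outside the selftest framework.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical variant of the helper: the meminfo path is a parameter and
 * failures are reported by returning 0.
 */
size_t get_def_hugetlb_pagesz_from(const char *path)
{
	char buf[64];
	const char *tag = "Hugepagesize:";
	size_t size = 0;
	FILE *f;

	f = fopen(path, "r");
	if (!f)
		return 0;

	while (fgets(buf, sizeof(buf), f) != NULL) {
		if (strstr(buf, tag) == buf) {
			/* The value is reported in kB; convert to bytes. */
			size = strtoull(buf + strlen(tag), NULL, 10) << 10;
			break;
		}
	}

	fclose(f);	/* single close shared by every exit path */
	return size;
}
```

In the selftest itself, the TEST_FAIL() calls would follow the single
fclose(), with feof()/ferror() state captured before the close.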


[PATCH] scsi_dh: Fix a typo



s/infrastruture/infrastructure/

Signed-off-by: Bhaskar Chowdhury 
---
 include/scsi/scsi_dh.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/scsi/scsi_dh.h b/include/scsi/scsi_dh.h
index 2852e470a8ed..a9f782fe732a 100644
--- a/include/scsi/scsi_dh.h
+++ b/include/scsi/scsi_dh.h
@@ -1,6 +1,6 @@
 /* SPDX-License-Identifier: GPL-2.0-or-later */
 /*
- * Header file for SCSI device handler infrastruture.
+ * Header file for SCSI device handler infrastructure.
  *
  * Modified version of patches posted by Mike Christie 
  *
--
2.31.0

[PATCH v30 4/4] scsi: ufs: Add HPB 2.0 support

This patch adds support for HPB 2.0.

HPB 2.0 supports reads of varying sizes from 4KB to 512KB.
A read of up to 32KB is issued as a single HPB read.
A read of 36KB to 512KB is issued as a combination of a write buffer
command and an HPB read command, in order to deliver more PPNs.
The write buffer command may not be issued immediately due to busy tags.
To use HPB read more aggressively, the driver can requeue the write buffer
command. The requeue threshold is implemented as a timeout and can be
modified through the requeue_timeout_ms entry in sysfs.
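
The size thresholds above can be illustrated with a small userspace sketch.
All names here (hpb_classify_read, the HPB_READ_* enum, the macros) are
hypothetical and not part of the driver; the 4KB alignment check is also an
assumption made for the example.

```c
#include <stddef.h>

#define KB(x) ((size_t)(x) * 1024)

/* Thresholds taken from the commit message; the names are hypothetical. */
#define HPB_PAGE_SIZE        KB(4)
#define HPB_SINGLE_READ_MAX  KB(32)
#define HPB_MULTI_READ_MAX   KB(512)

enum hpb_read_kind {
	HPB_READ_NONE,   /* not eligible, issue a regular read */
	HPB_READ_SINGLE, /* one HPB read command */
	HPB_READ_MULTI,  /* write buffer command followed by HPB read */
};

/* Classify a read of len bytes according to the HPB 2.0 size ranges. */
static enum hpb_read_kind hpb_classify_read(size_t len)
{
	/* Assumption: only whole 4KB multiples are considered. */
	if (len == 0 || len % HPB_PAGE_SIZE != 0 || len > HPB_MULTI_READ_MAX)
		return HPB_READ_NONE;
	if (len <= HPB_SINGLE_READ_MAX)
		return HPB_READ_SINGLE;
	return HPB_READ_MULTI;
}
```

Only the HPB_READ_MULTI case depends on a write buffer command being issued
first, which is where the requeue timeout comes into play.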

Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs |  47 +-
 drivers/scsi/scsi_lib.c|   4 +-
 drivers/scsi/ufs/ufs-sysfs.c   |   4 +
 drivers/scsi/ufs/ufs.h |   3 +-
 drivers/scsi/ufs/ufshcd.c  |  25 +-
 drivers/scsi/ufs/ufshcd.h  |   7 +
 drivers/scsi/ufs/ufshpb.c  | 626 +++--
 drivers/scsi/ufs/ufshpb.h  |  67 ++-
 8 files changed, 701 insertions(+), 82 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs 
b/Documentation/ABI/testing/sysfs-driver-ufs
index 528bf89fc98b..419adf450b89 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1253,14 +1253,14 @@ Description:This entry shows the number of HPB 
pinned regions assigned to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/hit_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/hit_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that changed to HPB read.
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/miss_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/miss_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that cannot be changed to
@@ -1268,7 +1268,7 @@ Description:  This entry shows the number of reads 
that cannot be changed to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_noti_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_noti_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of response UPIUs that has
@@ -1276,7 +1276,7 @@ Description:  This entry shows the number of response 
UPIUs that has
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_active_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_active_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of active sub-regions recommended by
@@ -1284,7 +1284,7 @@ Description:  This entry shows the number of active 
sub-regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_inactive_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_inactive_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of inactive regions recommended by
@@ -1292,10 +1292,45 @@ Description:This entry shows the number of inactive 
regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/map_req_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/map_req_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of read buffer commands for
activating sub-regions recommended by response UPIUs.
 
The file is read only.
+
+What:  
/sys/class/scsi_device/*/device/hpb_param_sysfs/requeue_timeout_ms
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the requeue timeout threshold for write buffer
+   command in ms. This value can be changed by writing proper 
integer to
+   this entry.
+
+What:  
/sys/bus/platform/drivers/ufshcd/*/attributes/max_data_size_hpb_single_cmd
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the maximum HPB data size for using single HPB
+   command.
+
+   ===  
+   00h  4KB
+   01h  8KB
+   02h  12KB
+   ...
+   FFh  1024KB
+   ===  
+
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/flags/wb_enable
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the status of HPB.
+
+   == 
+   0  HPB is not enabled.
+   1  HPB is enabled
+

Re: [PATCH] tools: include: nolibc: Fix a typo occured to occurred in the file nolibc.h

2021-03-21 Thread Willy Tarreau

On Sat, Feb 27, 2021 at 02:58:18PM -0800, Randy Dunlap wrote:
> On 2/27/21 2:44 PM, Bhaskar Chowdhury wrote:
> > 
> > s/occured/occurred/
> > 
> > Signed-off-by: Bhaskar Chowdhury 
> 
> Acked-by: Randy Dunlap 

Oops, seems like I missed this one, now queued, thanks to you both!
Willy

[PATCH v30 3/4] scsi: ufs: Prepare HPB read for cached sub-region

This patch converts eligible read I/O to HPB read I/O.

If the logical address of the read I/O belongs to an active sub-region, the
HPB driver modifies the read I/O command to HPB read. It modifies the UPIU
command of UFS instead of modifying the existing SCSI command.

In HPB version 1.0, the maximum read I/O size that can be converted to
HPB read is 4KB.

The dirty map of the active sub-region prevents an incorrect HPB read that
would use a stale physical page number, that is, one invalidated by a
previous write I/O.
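
The dirty-map idea can be sketched in userspace as one bit per cached entry:
a write sets the bits it covers, and a read falls back to a regular read if
any covered bit is set. The struct and function names below are hypothetical,
and ENTRIES_PER_SRGN is an arbitrary value chosen for the example; the driver
itself uses the kernel bitmap API.

```c
#include <limits.h>
#include <stdbool.h>

#define ENTRIES_PER_SRGN 1024 /* hypothetical number of entries per sub-region */
#define BITS_PER_WORD    (sizeof(unsigned long) * CHAR_BIT)
#define MAP_WORDS        ((ENTRIES_PER_SRGN + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct srgn_map {
	unsigned long ppn_dirty[MAP_WORDS]; /* one bit per cached entry */
};

/* Mark cnt entries starting at off as stale (called on write or discard). */
static void srgn_set_dirty(struct srgn_map *m, unsigned int off, unsigned int cnt)
{
	for (unsigned int i = off; i < off + cnt && i < ENTRIES_PER_SRGN; i++)
		m->ppn_dirty[i / BITS_PER_WORD] |= 1UL << (i % BITS_PER_WORD);
}

/* Return true if any of the cnt entries starting at off is stale. */
static bool srgn_test_dirty(const struct srgn_map *m, unsigned int off, unsigned int cnt)
{
	for (unsigned int i = off; i < off + cnt && i < ENTRIES_PER_SRGN; i++)
		if (m->ppn_dirty[i / BITS_PER_WORD] & (1UL << (i % BITS_PER_WORD)))
			return true;
	return false;
}
```

A read whose range tests dirty is simply not converted to HPB read.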

Reviewed-by: Can Guo 
Reviewed-by: Bart Van Assche 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufshcd.c |   2 +
 drivers/scsi/ufs/ufshpb.c | 257 +-
 drivers/scsi/ufs/ufshpb.h |   2 +
 3 files changed, 258 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 88280310bb64..a7cf9278965c 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2653,6 +2653,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, 
struct scsi_cmnd *cmd)
 
lrbp->req_abort_skip = false;
 
+   ufshpb_prep(hba, lrbp);
+
ufshcd_comp_scsi_upiu(hba, lrbp);
 
err = ufshcd_map_sg(hba, lrbp);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index c67acfc8c6bf..f789339f68d9 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -46,6 +46,29 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int 
state)
atomic_set(&hpb->hpb_state, state);
 }
 
+static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn,
+   struct ufshpb_subregion *srgn)
+{
+   return rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID;
+}
+
+static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd)
+{
+   return req_op(cmd->request) == REQ_OP_READ;
+}
+
+static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd)
+{
+   return op_is_write(req_op(cmd->request)) ||
+  op_is_discard(req_op(cmd->request));
+}
+
+static bool ufshpb_is_support_chunk(int transfer_len)
+{
+   return transfer_len <= HPB_MULTI_CHUNK_HIGH;
+}
+
 static bool ufshpb_is_general_lun(int lun)
 {
return lun < UFS_UPIU_MAX_UNIT_NUM_ID;
@@ -80,8 +103,8 @@ static void ufshpb_kick_map_work(struct ufshpb_lu *hpb)
 }
 
 static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
-struct ufshcd_lrb *lrbp,
-struct utp_hpb_rsp *rsp_field)
+   struct ufshcd_lrb *lrbp,
+   struct utp_hpb_rsp *rsp_field)
 {
/* Check HPB_UPDATE_ALERT */
if (!(lrbp->ucd_rsp_ptr->header.dword_2 &
@@ -107,6 +130,234 @@ static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
return true;
 }
 
+static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int set_bit_len;
+   int bitmap_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if ((srgn_offset + cnt) > bitmap_len)
+   set_bit_len = bitmap_len - srgn_offset;
+   else
+   set_bit_len = cnt;
+
+   if (rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID)
+   bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len);
+
+   srgn_offset = 0;
+   if (++srgn_idx == hpb->srgns_per_rgn) {
+   srgn_idx = 0;
+   rgn_idx++;
+   }
+
+   cnt -= set_bit_len;
+   if (cnt > 0)
+   goto next_srgn;
+}
+
+static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+ int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int bitmap_len;
+   int bit_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if (!ufshpb_is_valid_srgn(rgn, srgn))
+   return true;
+
+   /*
+* If the region state is active, mctx must be allocated.
+* In this case, check whether the region is evicted or
+* mctx allcation fail.
+*/
+   if (unlikely(!srgn->mctx)) {
+   dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+   "no mctx in region %d subregion %d.\n",
+   srgn->rgn_idx, srgn->srgn_idx);
+   return true;

[PATCH v30 2/4] scsi: ufs: L2P map management for HPB read

This patch adds L2P map management to the HPB module.

HPB divides the logical address space into several regions. A region consists
of several sub-regions. The sub-region is the basic unit in which L2P mapping
is managed. The driver loads the L2P mapping data of each sub-region; a loaded
sub-region is said to be in the active state. The HPB driver unloads L2P
mapping data in units of regions; an unloaded region is said to be in the
inactive state.
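
As an illustration of the region/sub-region split, the sketch below maps a
logical page number to a region index, a sub-region index, and an offset
within the sub-region. The field names entries_per_srgn and srgns_per_rgn
mirror names used in the driver; the structs and hpb_locate() are hypothetical
and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical geometry description for the example. */
struct hpb_geometry {
	uint32_t entries_per_srgn; /* L2P entries per sub-region */
	uint32_t srgns_per_rgn;    /* sub-regions per region */
};

struct hpb_pos {
	uint32_t rgn_idx;     /* region index */
	uint32_t srgn_idx;    /* sub-region index within the region */
	uint32_t srgn_offset; /* entry offset within the sub-region */
};

/* Locate the entry for logical page number lpn. */
static struct hpb_pos hpb_locate(const struct hpb_geometry *g, uint64_t lpn)
{
	uint64_t entries_per_rgn = (uint64_t)g->entries_per_srgn * g->srgns_per_rgn;
	struct hpb_pos p;

	p.rgn_idx = (uint32_t)(lpn / entries_per_rgn);
	p.srgn_idx = (uint32_t)((lpn % entries_per_rgn) / g->entries_per_srgn);
	p.srgn_offset = (uint32_t)(lpn % g->entries_per_srgn);
	return p;
}
```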

Sub-region/region candidates to be loaded and unloaded are delivered from
the UFS device. The UFS device delivers the recommended active sub-regions
and inactive regions to the driver using sense data.
The HPB module performs L2P mapping management on the host through the
delivered information.

A pinned region is a preset region on the UFS device that is always in the
active state.

The data structures for map data requests and the L2P map use the mempool API,
minimizing allocation overhead while avoiding static allocation.

The minimum size of the memory pool used in the HPB is implemented
as a module parameter, so that it can be configured by the user.

To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096
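
To get a feel for what a given ufshpb_host_map_kbytes value buys, the sketch
below estimates the logical range whose L2P entries fit in the pool. It
assumes an 8-byte entry per 4KB logical block; this assumption is consistent
with the driver comment that a 2MB cache covers a 1GB range, but the macro
and function names are hypothetical.

```c
#include <stdint.h>

#define HPB_ENTRY_BYTES 8u    /* assumed size of one L2P entry */
#define HPB_BLOCK_BYTES 4096u /* assumed logical block size covered per entry */

/* Logical range (in bytes) whose L2P entries fit in a pool of map_kbytes KB. */
static uint64_t hpb_cached_range_bytes(uint32_t map_kbytes)
{
	uint64_t entries = (uint64_t)map_kbytes * 1024 / HPB_ENTRY_BYTES;

	return entries * HPB_BLOCK_BYTES;
}
```

Under these assumptions, ufshpb_host_map_kbytes=4096 corresponds to a cached
range of 2GB.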

The map_work manages activation and inactivation through 2 "to-do" lists.
Each HPB LUN maintains 2 "to-do" lists:
  hpb->lh_inact_rgn - regions to be inactivated, and
  hpb->lh_act_srgn - subregions to be activated
Those lists are maintained on IO completion.

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufs.h|   36 ++
 drivers/scsi/ufs/ufshcd.c |4 +
 drivers/scsi/ufs/ufshpb.c | 1088 -
 drivers/scsi/ufs/ufshpb.h |   65 +++
 4 files changed, 1178 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
index 4eee7e31d08d..bfb84d2ba990 100644
--- a/drivers/scsi/ufs/ufs.h
+++ b/drivers/scsi/ufs/ufs.h
@@ -478,6 +478,41 @@ struct utp_cmd_rsp {
u8 sense_data[UFS_SENSE_SIZE];
 };
 
+struct ufshpb_active_field {
+   __be16 active_rgn;
+   __be16 active_srgn;
+};
+#define HPB_ACT_FIELD_SIZE 4
+
+/**
+ * struct utp_hpb_rsp - Response UPIU structure
+ * @residual_transfer_count: Residual transfer count DW-3
+ * @reserved1: Reserved double words DW-4 to DW-7
+ * @sense_data_len: Sense data length DW-8 U16
+ * @desc_type: Descriptor type of sense data
+ * @additional_len: Additional length of sense data
+ * @hpb_op: HPB operation type
+ * @lun: LUN of response UPIU
+ * @active_rgn_cnt: Active region count
+ * @inactive_rgn_cnt: Inactive region count
+ * @hpb_active_field: Recommended to read HPB region and subregion
+ * @hpb_inactive_field: To be inactivated HPB region and subregion
+ */
+struct utp_hpb_rsp {
+   __be32 residual_transfer_count;
+   __be32 reserved1[4];
+   __be16 sense_data_len;
+   u8 desc_type;
+   u8 additional_len;
+   u8 hpb_op;
+   u8 lun;
+   u8 active_rgn_cnt;
+   u8 inactive_rgn_cnt;
+   struct ufshpb_active_field hpb_active_field[2];
+   __be16 hpb_inactive_field[2];
+};
+#define UTP_HPB_RSP_SIZE 40
+
 /**
  * struct utp_upiu_rsp - general upiu response structure
  * @header: UPIU header structure DW-0 to DW-2
@@ -488,6 +523,7 @@ struct utp_upiu_rsp {
struct utp_upiu_header header;
union {
struct utp_cmd_rsp sr;
+   struct utp_hpb_rsp hr;
struct utp_upiu_query qr;
};
 };
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index ddeb5bb9fb88..88280310bb64 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5018,6 +5018,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct 
ufshcd_lrb *lrbp)
 */
pm_runtime_get_noresume(hba->dev);
}
+
+   if (scsi_status == SAM_STAT_GOOD)
+   ufshpb_rsp_upiu(hba, lrbp);
break;
case UPIU_TRANSACTION_REJECT_UPIU:
/* TODO: handle Reject UPIU Response */
@@ -9233,6 +9236,7 @@ EXPORT_SYMBOL(ufshcd_shutdown);
 void ufshcd_remove(struct ufs_hba *hba)
 {
ufs_bsg_remove(hba);
+   ufshpb_remove(hba);
ufs_sysfs_remove_nodes(hba->dev);
blk_cleanup_queue(hba->tmf_queue);
blk_mq_free_tag_set(&hba->tmf_tag_set);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 1a72f6541510..c67acfc8c6bf 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -16,6 +16,16 @@
 #include "ufshpb.h"
 #include "../sd.h"
 
+/* memory management */
+static struct kmem_cache *ufshpb_mctx_cache;
+static mempool_t *ufshpb_mctx_pool;
+static mempool_t *ufshpb_page_pool;
+/* A cache size of 2MB can cache ppn in the 1GB range. */
+static unsigned int ufshpb_host_map_kbytes = 2048;
+static int tot_active_srgn_pages;
+
+static struct workqueue_struct *ufshpb_wq;
+
 bool ufshpb_is_allo

[PATCH] RDMA: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 include/rdma/rdma_vt.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h
index 9fd217b24916..d6611f2dd6a5 100644
--- a/include/rdma/rdma_vt.h
+++ b/include/rdma/rdma_vt.h
@@ -245,7 +245,7 @@ struct rvt_driver_provided {
void * (*qp_priv_alloc)(struct rvt_dev_info *rdi, struct rvt_qp *qp);

/*
-* Init a struture allocated with qp_priv_alloc(). This should be
+* Init a structure allocated with qp_priv_alloc(). This should be
 * called after all qp fields have been initialized in rdmavt.
 */
int (*qp_priv_init)(struct rvt_dev_info *rdi, struct rvt_qp *qp,
--
2.31.0

Re: [PATCH -tip v3 00/12] kprobes: Fix stacktrace with kretprobes on x86

Oops, please ignore this. I missed updating the version.

Thanks,

On Mon, 22 Mar 2021 15:36:46 +0900
Masami Hiramatsu  wrote:

> Hello,
> 
> Here is the 4th version of the series to fix the stacktrace with kretprobe
> on x86. After merging this, I'll fix other architectures.
> 
> The previous version is;
> 
> https://lore.kernel.org/bpf/161615650355.306069.17260992641363840330.stgit@devnote2/
> 
> This version fixes some build warnings/errors and a bug on arm. (I think
> arm's kretprobe implementation is a bit odd. anyway, that is off topic.)
> [5/12] fixes objtool warning when CONFIG_FRAME_POINTER=y. [7/12] fixes a
> build error on ia64. And add [8/12] for avoiding stack corruption by
> instruction_pointer_set() in kretprobe_trampoline_handler on arm.
> 
> With this series, unwinder can unwind stack correctly from ftrace as below;
> 
>   # cd /sys/kernel/debug/tracing
>   # echo > trace
>   # echo r vfs_read >> kprobe_events
>   # echo r full_proxy_read >> kprobe_events
>   # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
>   # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
>   # echo 1 > events/kprobes/enable
>   # echo 1 > options/sym-offset
>   # cat /sys/kernel/debug/kprobes/list
> 8133b740  r  full_proxy_read+0x0[FTRACE]
> 812560b0  r  vfs_read+0x0[FTRACE]
>   # echo 0 > events/kprobes/enable
>   # cat trace
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 3/3   #P:8
> #
> #_-=> irqs-off
> #   / _=> need-resched
> #  | / _---=> hardirq/softirq
> #  || / _--=> preempt-depth
> #  ||| / delay
> #   TASK-PID CPU#     TIMESTAMP  FUNCTION
> #  | | |     | |
><...>-135 [005] ...1 9.422114: r_full_proxy_read_0: 
> (vfs_read+0xab/0x1a0 <- full_proxy_read)
><...>-135 [005] ...1 9.422158: 
>  => kretprobe_trace_func+0x209/0x2f0
>  => kretprobe_dispatcher+0x4a/0x70
>  => __kretprobe_trampoline_handler+0xca/0x150
>  => trampoline_handler+0x44/0x70
>  => kretprobe_trampoline+0x2a/0x50
>  => vfs_read+0xab/0x1a0
>  => ksys_read+0x5f/0xe0
>  => do_syscall_64+0x33/0x40
>  => entry_SYSCALL_64_after_hwframe+0x44/0xae
>  => 0
> 
> This shows the double return probes (vfs_read and full_proxy_read) on the
> stack correctly unwound. (vfs_read was called from ksys_read+0x5f and
> full_proxy_read was called from vfs_read+0xab)
> 
> This actually changes the kretprobe behavior a bit; now the instruction
> pointer in the pt_regs passed to the kretprobe user handler is correctly set
> to the real return address. So user handlers can get it via the
> instruction_pointer() API.
> 
> You can also get this series from 
>  git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git 
> kprobes/kretprobe-stackfix-v4
> 
> 
> Thank you,
> 
> ---
> 
> Josh Poimboeuf (1):
>   x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline code
> 
> Masami Hiramatsu (11):
>   ia64: kprobes: Fix to pass correct trampoline address to the handler
>   kprobes: treewide: Replace arch_deref_entry_point() with 
> dereference_function_descriptor()
>   kprobes: treewide: Remove trampoline_address from 
> kretprobe_trampoline_handler()
>   kprobes: Add kretprobe_find_ret_addr() for searching return address
>   ARC: Add instruction_pointer_set() API
>   ia64: Add instruction_pointer_set() API
>   arm: kprobes: Make a space for regs->ARM_pc at kretprobe_trampoline
>   kprobes: Setup instruction pointer in __kretprobe_trampoline_handler
>   x86/kprobes: Push a fake return address at kretprobe_trampoline
>   x86/unwind: Recover kretprobe trampoline entry
>   tracing: Show kretprobe unknown indicator only for kretprobe_trampoline
> 
> 
>  arch/arc/include/asm/ptrace.h   |5 ++
>  arch/arc/kernel/kprobes.c   |2 -
>  arch/arm/probes/kprobes/core.c  |5 +-
>  arch/arm64/kernel/probes/kprobes.c  |3 -
>  arch/csky/kernel/probes/kprobes.c   |2 -
>  arch/ia64/include/asm/ptrace.h  |5 ++
>  arch/ia64/kernel/kprobes.c  |   15 ++---
>  arch/mips/kernel/kprobes.c  |3 -
>  arch/parisc/kernel/kprobes.c|4 +
>  arch/powerpc/kernel/kprobes.c   |   13 -
>  arch/riscv/kernel/probes/kprobes.c  |2 -
>  arch/s390/kernel/kprobes.c  |2 -
>  arch/sh/kernel/kprobes.c|2 -
>  arch/sparc/kernel/kprobes.c |2 -
>  arch/x86/include/asm/kprobes.h  |1 
>  arch/x86/include/asm/unwind.h   |   17 ++
>  arch/x86/include/asm/unwind_hints.h |5 ++
>  arch/x86/kernel/kprobes/core.c  |   44 
>  arch/x86/kernel/unwind_frame.c  |4 +
>  arch/x86/kernel/unwind_guess.c  |3 -
>  arch/x86/kernel/unwind_orc.c|6 +-
>  include/linux/kprobes.h |   41 ++

[PATCH v30 1/4] scsi: ufs: Introduce HPB feature

This patch implements HPB initialization and adds HPB function calls to
the UFS core driver.

NAND flash-based storage devices, including UFS, have mechanisms to
translate logical addresses of IO requests to the corresponding physical
addresses of the flash storage.
In UFS, Logical-address-to-Physical-address (L2P) map data, which is
required to identify the physical address for the requested IOs, can only
be partially stored in SRAM from NAND flash. Due to this partial loading,
accessing the flash address area where the L2P information for that address
is not loaded in the SRAM can result in serious performance degradation.

The basic concept of HPB is to cache L2P mapping entries in host system
memory so that both the physical block address (PBA) and the logical block
address (LBA) can be delivered in the HPB read command.
The HPB READ command allows data to be read faster than with a regular read
command in UFS, since it provides the physical address (HPB Entry) of the
desired logical block in addition to its logical address. The UFS device can
access the physical block in NAND directly without searching for and loading
the L2P mapping table. This improves read performance because the NAND read
operation for loading the L2P mapping table is eliminated.

In HPB initialization, the host checks if the UFS device supports HPB
feature and retrieves related device capabilities. Then, some HPB
parameters are configured in the device.

We measured the total start-up time of popular applications and observed
the difference made by enabling HPB.
The popular applications are 12 game apps and 24 non-game apps. Each target
application was launched in order. One cycle consists of running the 36
applications in sequence. We repeated the cycle to observe the performance
improvement from L2P mapping cache hits in HPB.

The following is the experiment environment:
 - kernel version: 4.4.0
 - RAM: 8GB
 - UFS 2.1 (64GB)

Result:
+-------+----------+----------+-------+
| cycle | baseline | with HPB | diff  |
+-------+----------+----------+-------+
| 1     | 272.4    | 264.9    | -7.5  |
| 2     | 250.4    | 248.2    | -2.2  |
| 3     | 226.2    | 215.6    | -10.6 |
| 4     | 230.6    | 214.8    | -15.8 |
| 5     | 232.0    | 218.1    | -13.9 |
| 6     | 231.9    | 212.6    | -19.3 |
+-------+----------+----------+-------+

We also measured HPB performance using iozone.
Here is my iozone script:
iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16
-s $IO_RANGE/16 -F mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5
mnt/tmp_6 mnt/tmp_7 mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12
mnt/tmp_13 mnt/tmp_14 mnt/tmp_15 mnt/tmp_16

Result:
+----------+--------+---------+
| IO range | HPB on | HPB off |
+----------+--------+---------+
|   1 GB   | 294.8  | 300.87  |
|   4 GB   | 293.51 | 179.35  |
|   8 GB   | 294.85 | 162.52  |
|  16 GB   | 293.45 | 156.26  |
|  32 GB   | 277.4  | 153.25  |
+----------+--------+---------+

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Reported-by: kernel test robot 
Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs | 127 +
 drivers/scsi/ufs/Kconfig   |   9 +
 drivers/scsi/ufs/Makefile  |   1 +
 drivers/scsi/ufs/ufs-sysfs.c   |  18 +
 drivers/scsi/ufs/ufs.h |  15 +
 drivers/scsi/ufs/ufshcd.c  |  49 ++
 drivers/scsi/ufs/ufshcd.h  |  22 +
 drivers/scsi/ufs/ufshpb.c  | 569 +
 drivers/scsi/ufs/ufshpb.h  | 167 ++
 9 files changed, 977 insertions(+)
 create mode 100644 drivers/scsi/ufs/ufshpb.c
 create mode 100644 drivers/scsi/ufs/ufshpb.h

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs 
b/Documentation/ABI/testing/sysfs-driver-ufs
index d1bc23cb6a9d..528bf89fc98b 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1172,3 +1172,130 @@ Description:This node is used to set or display 
whether UFS WriteBooster is
(if the platform supports UFSHCD_CAP_CLK_SCALING). For a
platform that doesn't support UFSHCD_CAP_CLK_SCALING, we can
disable/enable WriteBooster through this sysfs node.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_version
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the HPB specification version.
+   The full information about the descriptor could be found at UFS
+   HPB (Host Performance Booster) Extension specifications.
+   Example: version 1.2.3 = 0123h
+
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_control
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows an indication of the HPB control mode.
+   00h: Host control mode
+   01h: Device control mode
+
+   The

Re: [RFC PATCH v4 5/9] KVM: selftests: Add a helper to get system configured THP page size

2021-03-21 Thread wangyanan (Y)


Hi Drew,

Thanks for your attention to this series!
On 2021/3/12 19:31, Andrew Jones wrote:

On Tue, Mar 02, 2021 at 08:57:47PM +0800, Yanan Wang wrote:

If we want to have tests for transparent hugepages, the tests should know
the system-configured THP hugepage size, which can be used for various kinds
of alignment or for guest memory accesses by vcpus.
So it makes sense to add a helper to get the transparent hugepage size.

With VM_MEM_SRC_ANONYMOUS_THP specified in vm_userspace_mem_region_add(),
we now stat /sys/kernel/mm/transparent_hugepage to check whether THP is
configured in the host kernel before madvise(). Based on this, we can also
read file /sys/kernel/mm/transparent_hugepage/hpage_pmd_size to get THP
hugepage size.

Signed-off-by: Yanan Wang 
Reviewed-by: Ben Gardon 
---
  .../testing/selftests/kvm/include/test_util.h |  2 ++
  tools/testing/selftests/kvm/lib/test_util.c   | 36 +++
  2 files changed, 38 insertions(+)

diff --git a/tools/testing/selftests/kvm/include/test_util.h 
b/tools/testing/selftests/kvm/include/test_util.h
index b7f41399f22c..ef24c76ba89a 100644
--- a/tools/testing/selftests/kvm/include/test_util.h
+++ b/tools/testing/selftests/kvm/include/test_util.h
@@ -78,6 +78,8 @@ struct vm_mem_backing_src_alias {
enum vm_mem_backing_src_type type;
  };
  
+bool thp_configured(void);

+size_t get_trans_hugepagesz(void);
  void backing_src_help(void);
  enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name);
  
diff --git a/tools/testing/selftests/kvm/lib/test_util.c b/tools/testing/selftests/kvm/lib/test_util.c

index c7c0627c6842..f2d133f76c67 100644
--- a/tools/testing/selftests/kvm/lib/test_util.c
+++ b/tools/testing/selftests/kvm/lib/test_util.c
@@ -10,6 +10,7 @@
  #include 
  #include 
  #include 
+#include 
  #include "linux/kernel.h"
  
  #include "test_util.h"

@@ -117,6 +118,41 @@ const struct vm_mem_backing_src_alias backing_src_aliases[] = {
{"anonymous_hugetlb", VM_MEM_SRC_ANONYMOUS_HUGETLB,},
  };
  
+bool thp_configured(void)

+{
+   int ret;
+   struct stat statbuf;
+
+   ret = stat("/sys/kernel/mm/transparent_hugepage", &statbuf);
+   TEST_ASSERT(ret == 0 || (ret == -1 && errno == ENOENT),
+   "Error in stating /sys/kernel/mm/transparent_hugepage: %d",
+   errno);

TEST_ASSERT will already output errno's string. Is that not sufficient? If
not, I think extending TEST_ASSERT to output errno too would be fine.
I think it's a good idea to output the errno together with its string in
TEST_ASSERT. It will explicitly indicate that the string is error
information, and the errno is much easier to use for debugging than the
string. I will make this change a separate patch in the next version and
add your S-b tag.

+
+   return ret == 0;
+}
+
+size_t get_trans_hugepagesz(void)
+{
+   size_t size;
+   char buf[16];
+   FILE *f;
+
+   TEST_ASSERT(thp_configured(), "THP is not configured in host kernel");
+
+   f = fopen("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", "r");
+   TEST_ASSERT(f != NULL,
+   "Error in opening transparent_hugepage/hpage_pmd_size: %d",
+   errno);

Same comment as above.


+
+   if (fread(buf, sizeof(char), sizeof(buf), f) == 0) {
+   fclose(f);
+   TEST_FAIL("Unable to read transparent_hugepage/hpage_pmd_size");
+   }
+
+   size = strtoull(buf, NULL, 10);

fscanf with %lld?

This makes sense. But it should be %ld corresponding to size_t.

Thanks,
Yanan.

+   return size;
+}
+
  void backing_src_help(void)
  {
int i;
--
2.23.0


Thanks,
drew

.

[PATCH -tip v4 11/12] x86/unwind: Recover kretprobe trampoline entry

Since kretprobe replaces the function return address on the stack with
kretprobe_trampoline, the x86 unwinders either cannot continue
unwinding at that point or record kretprobe_trampoline instead of the
correct return address.

To fix this issue, find the correct return address from the task's
kretprobe_instances, as the function-graph tracer does.

With this fix, the unwinder can correctly unwind the stack
from kretprobe event on x86, as below.

   <...>-135 [003] ...1 6.722338: r_full_proxy_read_0: 
(vfs_read+0xab/0x1a0 <- full_proxy_read)
   <...>-135 [003] ...1 6.722377: 
 => kretprobe_trace_func+0x209/0x2f0
 => kretprobe_dispatcher+0x4a/0x70
 => __kretprobe_trampoline_handler+0xca/0x150
 => trampoline_handler+0x44/0x70
 => kretprobe_trampoline+0x2a/0x50
 => vfs_read+0xab/0x1a0
 => ksys_read+0x5f/0xe0
 => do_syscall_64+0x33/0x40
 => entry_SYSCALL_64_after_hwframe+0x44/0xae
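
The recovery idea can be sketched in userspace C. This is a simplified
model under stated assumptions, not the kernel implementation: struct
kr_instance, struct unwind_cursor, recover_ret_addr() and TRAMPOLINE_ADDR
are hypothetical stand-ins, and the frame-pointer matching that a real
lookup would need is left out.

```c
#include <stddef.h>

#define TRAMPOLINE_ADDR 0x1000UL /* stand-in for kretprobe_trampoline */

/* One saved return address per active kretprobe instance, newest first. */
struct kr_instance {
	unsigned long ret_addr;
	struct kr_instance *next;
};

/* Unwinder state: a cursor into the task's instance list. */
struct unwind_cursor {
	struct kr_instance *kr_cur;
};

/*
 * If the address read from the stack is the trampoline, replace it with
 * the next saved return address and advance the cursor, so nested
 * kretprobes on the same stack resolve in order.
 */
static unsigned long recover_ret_addr(struct unwind_cursor *state,
				      unsigned long addr)
{
	if (addr != TRAMPOLINE_ADDR || !state->kr_cur)
		return addr;

	addr = state->kr_cur->ret_addr;
	state->kr_cur = state->kr_cur->next;
	return addr;
}
```

Keeping the cursor in the unwinder state is what lets repeated trampoline
hits on one stack map to successive saved addresses instead of the same one.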


Reported-by: Daniel Xu 
Signed-off-by: Masami Hiramatsu 
Suggested-by: Josh Poimboeuf 
---
  Changes in v3:
   - Split out the kretprobe side patch
   - Fix build error when CONFIG_KRETPROBES=n.
  Changes in v2:
   - Remove kretprobe wrapper functions from unwind_orc.c
   - Do not fixup state->ip when unwinding with regs because
 kretprobe fixup instruction pointer before calling handler.
---
 arch/x86/include/asm/unwind.h  |   17 +
 arch/x86/kernel/unwind_frame.c |4 ++--
 arch/x86/kernel/unwind_guess.c |3 +--
 arch/x86/kernel/unwind_orc.c   |6 +++---
 4 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index 70fc159ebe69..332aa6174b10 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -15,6 +16,7 @@ struct unwind_state {
unsigned long stack_mask;
struct task_struct *task;
int graph_idx;
+   struct llist_node *kr_cur;
bool error;
 #if defined(CONFIG_UNWINDER_ORC)
bool signal, full_regs;
@@ -99,6 +101,21 @@ void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
void *orc, size_t orc_size) {}
 #endif
 
+/* Recover the return address modified by instrumentation (e.g. kretprobe) */
+static inline
+unsigned long unwind_recover_ret_addr(struct unwind_state *state,
+unsigned long addr, unsigned long *addr_p)
+{
+   unsigned long ret;
+
+   ret = ftrace_graph_ret_addr(state->task, &state->graph_idx,
+   addr, addr_p);
+   if (is_kretprobe_trampoline(ret))
+   ret = kretprobe_find_ret_addr(state->task, addr_p,
+ &state->kr_cur);
+   return ret;
+}
+
 /*
  * This disables KASAN checking when reading a value from another task's stack,
  * since the other task could be running on another CPU and could have poisoned
diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c
index d7c44b257f7f..24e33b44b2be 100644
--- a/arch/x86/kernel/unwind_frame.c
+++ b/arch/x86/kernel/unwind_frame.c
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -240,8 +241,7 @@ static bool update_stack_state(struct unwind_state *state,
else {
addr_p = unwind_get_return_address_ptr(state);
addr = READ_ONCE_TASK_STACK(state->task, *addr_p);
-   state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
- addr, addr_p);
+   state->ip = unwind_recover_ret_addr(state, addr, addr_p);
}
 
/* Save the original stack pointer for unwind_dump(): */
diff --git a/arch/x86/kernel/unwind_guess.c b/arch/x86/kernel/unwind_guess.c
index c49f10ffd8cd..884d68a6e714 100644
--- a/arch/x86/kernel/unwind_guess.c
+++ b/arch/x86/kernel/unwind_guess.c
@@ -15,8 +15,7 @@ unsigned long unwind_get_return_address(struct unwind_state 
*state)
 
addr = READ_ONCE_NOCHECK(*state->sp);
 
-   return ftrace_graph_ret_addr(state->task, &state->graph_idx,
-addr, state->sp);
+   return unwind_recover_ret_addr(state, addr, state->sp);
 }
 EXPORT_SYMBOL_GPL(unwind_get_return_address);
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index a1202536fc57..839a0698342a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -2,6 +2,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -534,9 +535,8 @@ bool unwind_next_frame(struct unwind_state *state)
if (!deref_stack_reg(state, ip_p, &state->ip))
goto err;
 
-   state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
- state->ip, (void *)ip_p);
-
+

[PATCH v30 0/4] scsi: ufs: Add Host Performance Booster Support

Changelog:

v29 -> v30
1. Add support to reuse bio of pre-request.
2. Delete unreached code in the ufshpb_issue_map_req.

v28 -> v29
1. Remove unused variable reported by kernel test robot.

v27 -> v28
1. Fix wrong return value of ufshpb_prep.

v26 -> v27
1. Fix wrong reference of sense buffer in pre_req complete function.
2. Fix read_id error.
3. Fix chunk size checking for HPB 1.0.
4. Mute unnecessary messages before HPB initialization.

v25 -> v26
1. Fix wrong chunk size checking for HPB 1.0.
2. Fix wrong max data size for HPB single command.
3. Fix typo error.

v24 -> v25
1. Change write buffer API for unmap region.
2. Add checking hpb_enable for avoiding unnecessary memory allocation.
3. Change pr_info to dev_info.
4. Change default requeue timeout value for HPB read.
5. Fix wrong offset manipulation on ufshpb_prep_entry.

v23 -> v24
1. Fix build error reported by kernel test robot.

v22 -> v23
1. Add support compatibility of HPB 1.0.
2. Fix read id for single HPB read command.
3. Fix number of pre-allocated requests for write buffer.
4. Add fast path for response UPIU that has same LUN in sense data.
5. Remove WARN_ON for preventing kernel crash.
7. Fix wrong argument for read buffer command.

v21 -> v22
1. Add support processing response UPIU in suspend state.
2. Add support HPB hint from other LU.
3. Add sending write buffer with 0x03 after HPB init.

v20 -> v21
1. Add bMAX_DATA_SIZE_FOR_HPB_SINGLE_CMD attr. and fHPBen flag support.

v19 -> v20
1. Add documentation for sysfs entries of hpb->stat.
2. Fix read buffer command for under-sized sub-region.
3. Fix wrong condition checking for kick map work.
4. Delete redundant response UPIU checking.
5. Add LUN checking in response UPIU.
6. Fix possible deadlock problem due to runtime PM.
7. Add instant changing of sub-region state from response UPIU.
8. Fix endian problem in prefetched PPN.
9. Add JESD220-3A (HPB v2.0) support.

v18 -> v19
1. Fix null pointer error when printing sysfs from non-HPB LU.
2. Apply HPB read opcode in lrbp->cmd->cmnd (from Can Guo's review).
3. Rebase the patch on 5.12/scsi-queue.

v17 -> v18
Fix build error reported by kernel test robot.

v16 -> v17
1. Rename hpb_state_lock to rgn_state_lock and move it to corresponding
patch.
2. Remove redundant information messages.

v15 -> v16
1. Add missed sysfs ABI documentation.

v14 -> v15
1. Remove duplicated sysfs ABI entries in documentation.
2. Add experiment result of HPB performance testing with iozone.

v13 -> v14
1. Clean up code as commented in Greg's review.
2. Add documentation for sysfs entries (from Greg's review).
3. Add experiment result of HPB performance testing.

v12 -> v13
1. Cleanup codes by comments from Can Guo.
2. Add HPB related descriptor/flag/attributes in sysfs.
3. Change base commit from 5.10/scsi-queue to 5.11/scsi-queue.

v11 -> v12
1. Fixed to return error value when HPB fails to initialize pinned active 
region.
2. Fixed to disable HPB feature if HPB fails to allocate essential memory
and workqueue.
3. Fixed to change proper sub-region state when region is already evicted.

v10 -> v11
Add a newline at the end of the last line of the Kconfig file.

v9 -> v10
1. Fixed 64-bit division error
2. Fixed problems commented on in Bart's review.

v8 -> v9
1. Change sysfs initialization.
2. Change reading descriptor during HPB initialization
3. Fixed problems commented on in Bart's review.
4. Change base commit from 5.9/scsi-queue to 5.10/scsi-queue.

v7 -> v8
Remove wrongly added tags.

v6 -> v7
1. Remove UFS feature layer.
2. Cleanup for sparse error.

v5 -> v6
Change base commit to b53293fa662e28ae0cdd40828dc641c09f133405

v4 -> v5
Delete unused macro define.

v3 -> v4
1. Cleanup.

v2 -> v3
1. Add checking input module parameter value.
2. Change base commit from 5.8/scsi-queue to 5.9/scsi-queue.
3. Cleanup for unused variables and label.

v1 -> v2
1. Change the full boilerplate text to SPDX style.
2. Adopt dynamic allocation for sub-region data structure.
3. Cleanup.

NAND flash memory-based storage devices use a Flash Translation Layer (FTL)
to translate logical addresses of I/O requests to corresponding flash
memory addresses. Mobile storage devices typically have limited RAM and
thus lack the memory to hold the whole mapping table. Therefore, mapping
tables are partially retrieved from NAND flash on demand, causing
random-read performance degradation.

To improve random read performance, JESD220-3 (HPB v1.0) proposes HPB
(Host Performance Booster) which uses host system memory as a cache for the
FTL mapping table. By using HPB, FTL data can be read from host memory
faster than from NAND flash memory. 

The current version only supports the DCM (device control mode).
This patch consists of 3 parts to support HPB feature.

1) HPB probe and initialization process
2) READ -> HPB READ using cached map information
3) L2P (logical to physical) map management
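
The READ -> HPB READ decision in part 2 can be pictured with a small
host-side cache model. This is a sketch under stated assumptions only: the
names (struct l2p_cache, struct subregion, l2p_lookup) and the tiny sizes
are hypothetical and do not come from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

#define ENTRIES_PER_SUBREGION 4 /* tiny value, for illustration only */
#define NUM_SUBREGIONS 2

struct subregion {
	bool valid;                          /* map data loaded from device? */
	uint64_t ppn[ENTRIES_PER_SUBREGION]; /* cached physical page numbers */
};

struct l2p_cache {
	struct subregion srgn[NUM_SUBREGIONS];
};

/*
 * Look up a logical page number. On a hit, store the cached PPN in *ppn
 * and return true (the read could be sent as an HPB READ). On a miss,
 * return false (the read goes out as a normal READ).
 */
static bool l2p_lookup(const struct l2p_cache *c, uint32_t lpn, uint64_t *ppn)
{
	uint32_t idx = lpn / ENTRIES_PER_SUBREGION;
	uint32_t off = lpn % ENTRIES_PER_SUBREGION;

	if (idx >= NUM_SUBREGIONS || !c->srgn[idx].valid)
		return false;

	*ppn = c->srgn[idx].ppn[off];
	return true;
}
```

A miss costs nothing beyond the lookup itself, since the request simply
falls back to the ordinary read path.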

In the HPB probe and init process, the device information of the UFS is
queried. After checking supported

Re: [PATCH] prctl: fix overwrite last but one entry in auxv vector

2021-03-21 Thread Cyrill Gorcunov

On Sun, Mar 21, 2021 at 11:36:42PM +0300, Cyrill Gorcunov wrote:
> Alexey reported that current PR_SET_MM_AUXV (and PR_SET_MM_MAP) overwrite
> too many entries on non 64bit kernels. This is because auxv is defined
> as an array of longs and in result access to AT_VECTOR_SIZE - 2 entry
> is not a type of auxv entry but rather an entry before the last one.

Drop this patch, please. I'll make a new version.

[PATCH -tip v4 12/12] tracing: Show kretprobe unknown indicator only for kretprobe_trampoline

ftrace shows the "[unknown/kretprobe'd]" indicator for all addresses in
kretprobe_trampoline, but the only address modified by kretprobe is
kretprobe_trampoline+0.
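
The difference between the two checks can be sketched in userspace C. The
function names and the tramp_entry parameter are hypothetical; the first
function mirrors the name comparison being removed, the second models an
address comparison against the trampoline entry.

```c
#include <stdbool.h>
#include <string.h>

/* Old-style check: matches on the resolved symbol name. */
static bool name_is_trampoline(const char *name)
{
	static const char tramp_name[] = "kretprobe_trampoline";

	return strncmp(tramp_name, name, sizeof(tramp_name)) == 0;
}

/* New-style check: matches only the trampoline's entry address. */
static bool addr_is_trampoline(unsigned long addr, unsigned long tramp_entry)
{
	return addr == tramp_entry;
}
```

When a symbol lookup returns the bare function name, every address inside
the trampoline resolves to the same string, so the name check cannot tell
trampoline+0 apart from, say, trampoline+0x2a. The address check can.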

Signed-off-by: Masami Hiramatsu 
---
 kernel/trace/trace_output.c |   17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 61255bad7e01..e12437388686 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -346,22 +347,12 @@ int trace_output_call(struct trace_iterator *iter, char *name, char *fmt, ...)
 }
 EXPORT_SYMBOL_GPL(trace_output_call);
 
-#ifdef CONFIG_KRETPROBES
-static inline const char *kretprobed(const char *name)
+static inline const char *kretprobed(const char *name, unsigned long addr)
 {
-   static const char tramp_name[] = "kretprobe_trampoline";
-   int size = sizeof(tramp_name);
-
-   if (strncmp(tramp_name, name, size) == 0)
+   if (is_kretprobe_trampoline(addr))
return "[unknown/kretprobe'd]";
return name;
 }
-#else
-static inline const char *kretprobed(const char *name)
-{
-   return name;
-}
-#endif /* CONFIG_KRETPROBES */
 
 void
 trace_seq_print_sym(struct trace_seq *s, unsigned long address, bool offset)
@@ -374,7 +365,7 @@ trace_seq_print_sym(struct trace_seq *s, unsigned long address, bool offset)
sprint_symbol(str, address);
else
kallsyms_lookup(address, NULL, NULL, NULL, str);
-   name = kretprobed(str);
+   name = kretprobed(str, address);
 
if (name && strlen(name)) {
trace_seq_puts(s, name);

Re: [PATCH v9] i2c: virtio: add a virtio i2c frontend driver

2021-03-21 Thread Viresh Kumar

On 22-03-21, 21:35, Jie Deng wrote:
> diff --git a/drivers/i2c/busses/i2c-virtio.c b/drivers/i2c/busses/i2c-virtio.c
> new file mode 100644
> index 000..316986e
> --- /dev/null
> +++ b/drivers/i2c/busses/i2c-virtio.c
> @@ -0,0 +1,286 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Virtio I2C Bus Driver
> + *
> + * The Virtio I2C Specification:
> + * https://raw.githubusercontent.com/oasis-tcs/virtio-spec/master/virtio-i2c.tex
> + *
> + * Copyright (c) 2021 Intel Corporation. All rights reserved.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +/**
> + * struct virtio_i2c - virtio I2C data
> + * @vdev: virtio device for this controller
> + * @completion: completion of virtio I2C message
> + * @adap: I2C adapter for this controller
> + * @i2c_lock: lock for virtqueue processing

Name mismatch here.

> + * @vq: the virtio virtqueue for communication
> + */
> +struct virtio_i2c {
> + struct virtio_device *vdev;
> + struct completion completion;
> + struct i2c_adapter adap;
> + struct mutex lock;
> + struct virtqueue *vq;
> +};


> +static int virtio_i2c_complete_reqs(struct virtqueue *vq,
> + struct virtio_i2c_req *reqs,
> + struct i2c_msg *msgs, int nr,
> + bool timeout)
> +{
> + struct virtio_i2c_req *req;
> + bool err_found = false;
> + unsigned int len;
> + int i, j = 0;
> +
> + for (i = 0; i < nr; i++) {
> + /* Detach the ith request from the vq */
> + req = virtqueue_get_buf(vq, &len);
> +
> + if (timeout || err_found)  {
> + i2c_put_dma_safe_msg_buf(reqs[i].buf, &msgs[i], false);
> + continue;
> + }
> +
> + /*
> +  * Condition (req && req == &reqs[i]) should always meet since
> +  * we have total nr requests in the vq.
> +  */
> + if (WARN_ON(!(req && req == &reqs[i])) ||
> + (req->in_hdr.status != VIRTIO_I2C_MSG_OK)) {
> + err_found = true;
> + i2c_put_dma_safe_msg_buf(reqs[i].buf, &msgs[i], false);
> + continue;
> + }
> +
> + i2c_put_dma_safe_msg_buf(reqs[i].buf, &msgs[i], true);
> + ++j;
> + }

I think you can simplify the code like this here:

	bool err_found = timeout;

	for (i = 0; i < nr; i++) {
		/* Detach the ith request from the vq */
		req = virtqueue_get_buf(vq, &len);

		/*
		 * Condition (req && req == &reqs[i]) should always meet since
		 * we have total nr requests in the vq.
		 */
		if (!err_found &&
		    (WARN_ON(!(req && req == &reqs[i])) ||
		     (req->in_hdr.status != VIRTIO_I2C_MSG_OK)))
			err_found = true;

		/* Always release the buffer; copy back only on success */
		i2c_put_dma_safe_msg_buf(reqs[i].buf, &msgs[i], !err_found);
		if (!err_found)
			++j;
	}

> +
> + return (timeout ? -ETIMEDOUT : j);
> +}
> +
> +static int virtio_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg *msgs, int num)
> +{
> + struct virtio_i2c *vi = i2c_get_adapdata(adap);
> + struct virtqueue *vq = vi->vq;
> + struct virtio_i2c_req *reqs;
> + unsigned long time_left;
> + int ret, nr;
> +
> + reqs = kcalloc(num, sizeof(*reqs), GFP_KERNEL);
> + if (!reqs)
> + return -ENOMEM;
> +
> + mutex_lock(&vi->lock);
> +
> + ret = virtio_i2c_send_reqs(vq, reqs, msgs, num);
> + if (ret == 0)
> + goto err_unlock_free;
> +
> + nr = ret;
> + reinit_completion(&vi->completion);
> + virtqueue_kick(vq);
> +
> + time_left = wait_for_completion_timeout(&vi->completion, adap->timeout);
> + if (!time_left) {
> + dev_err(&adap->dev, "virtio i2c backend timeout.\n");
> + ret = virtio_i2c_complete_reqs(vq, reqs, msgs, nr, true);
> + goto err_unlock_free;
> + }
> +
> + ret = virtio_i2c_complete_reqs(vq, reqs, msgs, nr, false);

And this can be optimized as well:

time_left = wait_for_completion_timeout(&vi->completion, adap->timeout);
if (!time_left)
dev_err(&adap->dev, "virtio i2c backend timeout.\n");

ret = virtio_i2c_complete_reqs(vq, reqs, msgs, nr, !time_left);

-- 
viresh

[PATCH -tip v4 09/12] kprobes: Setup instruction pointer in __kretprobe_trampoline_handler

To simplify the stacktrace with pt_regs from kretprobe handler,
set the correct return address to the instruction pointer in
the pt_regs before calling kretprobe handlers.

Suggested-by: Josh Poimboeuf 
Signed-off-by: Masami Hiramatsu 
---
 Changes in v3:
  - Cast the correct_ret_addr to unsigned long.
---
 kernel/kprobes.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index cf19edc038e4..4ce3e6f5d28d 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1914,6 +1914,9 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
BUG_ON(1);
}
 
+   /* Set the instruction pointer to the correct address */
+   instruction_pointer_set(regs, (unsigned long)correct_ret_addr);
+
/* Run them. */
first = current->kretprobe_instances.first;
while (first) {

[PATCH -tip v4 10/12] x86/kprobes: Push a fake return address at kretprobe_trampoline

This changes the x86 kretprobe stack frame on kretprobe_trampoline
slightly: it now pushes kretprobe_trampoline as a fake return
address at the bottom of the stack frame. With this fix, the ORC
unwinder will see kretprobe_trampoline as a return address.

Signed-off-by: Masami Hiramatsu 
Suggested-by: Josh Poimboeuf 
---
 arch/x86/kernel/kprobes/core.c |   31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 23255663c434..d7b90541eda1 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -782,28 +782,31 @@ asm(
".global kretprobe_trampoline\n"
".type kretprobe_trampoline, @function\n"
"kretprobe_trampoline:\n"
-   /* We don't bother saving the ss register */
 #ifdef CONFIG_X86_64
-   "   pushq %rsp\n"
+   /* Push fake return address to tell the unwinder it's a kretprobe */
+   "   pushq $kretprobe_trampoline\n"
UNWIND_HINT_FUNC
+   /* Save the sp-8, this will be fixed later */
+   "   pushq %rsp\n"
"   pushfq\n"
SAVE_REGS_STRING
"   movq %rsp, %rdi\n"
"   call trampoline_handler\n"
-   /* Replace saved sp with true return address. */
-   "   movq %rax, 19*8(%rsp)\n"
RESTORE_REGS_STRING
+   "   addq $8, %rsp\n"
"   popfq\n"
 #else
-   "   pushl %esp\n"
+   /* Push fake return address to tell the unwinder it's a kretprobe */
+   "   pushl $kretprobe_trampoline\n"
UNWIND_HINT_FUNC
+   /* Save the sp-4, this will be fixed later */
+   "   pushl %esp\n"
"   pushfl\n"
SAVE_REGS_STRING
"   movl %esp, %eax\n"
"   call trampoline_handler\n"
-   /* Replace saved sp with true return address. */
-   "   movl %eax, 15*4(%esp)\n"
RESTORE_REGS_STRING
+   "   addl $4, %esp\n"
"   popfl\n"
 #endif
"   ret\n"
@@ -814,8 +817,10 @@ NOKPROBE_SYMBOL(kretprobe_trampoline);
 /*
  * Called from kretprobe_trampoline
  */
-__used __visible void *trampoline_handler(struct pt_regs *regs)
+__used __visible void trampoline_handler(struct pt_regs *regs)
 {
+   unsigned long *frame_pointer;
+
/* fixup registers */
regs->cs = __KERNEL_CS;
 #ifdef CONFIG_X86_32
@@ -823,8 +828,16 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
 #endif
regs->ip = (unsigned long)&kretprobe_trampoline;
regs->orig_ax = ~0UL;
+   regs->sp += sizeof(long);
+   frame_pointer = ((unsigned long *)&regs->sp) + 1;
 
-   return (void *)kretprobe_trampoline_handler(regs, &regs->sp);
+   /* Replace fake return address with real one. */
+   *frame_pointer = kretprobe_trampoline_handler(regs, frame_pointer);
+   /*
+* Move flags to sp so that kretprobe_trampoline can return
+* right after popf.
+*/
+   regs->sp = regs->flags;
 }
 NOKPROBE_SYMBOL(trampoline_handler);

[PATCH -tip v4 08/12] arm: kprobes: Make a space for regs->ARM_pc at kretprobe_trampoline

Change kretprobe_trampoline to make a space for regs->ARM_pc so that
kretprobe_trampoline_handler can call instruction_pointer_set()
safely.

Signed-off-by: Masami Hiramatsu 
---
 arch/arm/probes/kprobes/core.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index 1782b41df095..5f3c2b42787f 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -397,11 +397,13 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 void __naked __kprobes kretprobe_trampoline(void)
 {
__asm__ __volatile__ (
+   "sub    sp, sp, #16 \n\t"
"stmdb  sp!, {r0 - r11} \n\t"
"mov    r0, sp  \n\t"
"bl trampoline_handler  \n\t"
"mov    lr, r0  \n\t"
"ldmia  sp!, {r0 - r11} \n\t"
+   "add    sp, sp, #16 \n\t"
 #ifdef CONFIG_THUMB2_KERNEL
"bx lr  \n\t"
 #else

Re: [PATCH v7 1/3] dmaengine: ptdma: Initial driver for the AMD PTDMA

2021-03-21 Thread Sanjay R Mehta




On 3/22/2021 11:34 AM, Vinod Koul wrote:
> [CAUTION: External Email]
> 
> On 18-03-21, 16:16, Sanjay R Mehta wrote:
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
 +#include 
>>>
>>> why do you need sched.h here?
>>>
 +
 +#include "ptdma.h"
 +
 +/* Ever-increasing value to produce unique unit numbers */
 +static atomic_t pt_ordinal;
>>>
>>> What is the need of that?
>>>
>>
> 
> [please wrap your emails within 80 chars]
> 
Sure Vinod.

>> The "pt_ordinal" is incremented for each DMA instances and its number
>> is used only to assign device name for each instances.  This same
>> device name is passed as a string parameter in many places in code
>> like while using request_irq(), dma_pool_create() and in debugfs.
> 
> Why do you need that, why not use device name which is unique..?
> 
Can we take this as part of bug fixes series in future?

>> Also, I have implemented all of the comments for this patch except
>> this. if this is fine, will send the next version for review.
> 
> Am not sure I remember all the comments I gave, it has been _quite_ a
> while since the feedback was provided. In order to have effective review
> it would be great to revert back on a reasonable timeline and discuss...
> 
Apologies from my side. I understand that I have taken a long time, but I
assure you it won't happen again.
I have already sent out v8. Can you please have a look at it and provide your
valuable feedback?


> Thanks
> --
> ~Vinod
>

[PATCH -tip v4 07/12] ia64: Add instruction_pointer_set() API

Add instruction_pointer_set() API for ia64.
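
The split performed by the macro can be illustrated with a userspace model.
This is a sketch with hypothetical names: struct fake_regs flattens the two
fields into plain integers, whereas in the kernel ri is reached through
ia64_psr(regs).

```c
#include <stdint.h>

/* Simplified stand-in for the two ia64 register fields involved. */
struct fake_regs {
	uint64_t cr_iip; /* 16-byte aligned bundle address */
	uint64_t ri;     /* slot within the bundle */
};

/* Canonical instruction pointer: bundle address plus slot number. */
static uint64_t ip_get(const struct fake_regs *regs)
{
	return regs->cr_iip + regs->ri;
}

/* Inverse of ip_get(): low 4 bits go to ri, the rest to cr_iip. */
static void ip_set(struct fake_regs *regs, uint64_t val)
{
	regs->ri = val & 0xf;
	regs->cr_iip = val & ~0xfULL;
}
```

Setting and then reading back the pointer round-trips as long as the value
was produced by the same bundle-plus-slot encoding.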

Signed-off-by: Masami Hiramatsu 
---
  Changes in v4:
   - Make the API macro for avoiding a build error.
---
 arch/ia64/include/asm/ptrace.h |5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/ia64/include/asm/ptrace.h b/arch/ia64/include/asm/ptrace.h
index b3aa46090101..4c2f838b2e77 100644
--- a/arch/ia64/include/asm/ptrace.h
+++ b/arch/ia64/include/asm/ptrace.h
@@ -51,6 +51,11 @@
  * the canonical representation by adding to instruction pointer.
  */
 # define instruction_pointer(regs) ((regs)->cr_iip + ia64_psr(regs)->ri)
+# define instruction_pointer_set(regs, val)\
+  ({   \
+   ia64_psr(regs)->ri = (val & 0xf);   \
+   regs->cr_iip = (val & ~0xfULL); \
+  })
 
 static inline unsigned long user_stack_pointer(struct pt_regs *regs)
 {

[PATCH -tip v4 06/12] ARC: Add instruction_pointer_set() API

Add instruction_pointer_set() API for arc.

Signed-off-by: Masami Hiramatsu 
---
 arch/arc/include/asm/ptrace.h |5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/arc/include/asm/ptrace.h b/arch/arc/include/asm/ptrace.h
index 4c3c9be5bd16..cca8d6583e31 100644
--- a/arch/arc/include/asm/ptrace.h
+++ b/arch/arc/include/asm/ptrace.h
@@ -149,6 +149,11 @@ static inline long regs_return_value(struct pt_regs *regs)
return (long)regs->r0;
 }
 
+static inline void instruction_pointer_set(struct pt_regs *regs,
+  unsigned long val)
+{
+   instruction_pointer(regs) = val;
+}
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_PTRACE_H */

[PATCH -tip v4 05/12] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline code

From: Josh Poimboeuf 

Add UNWIND_HINT_FUNC on kretprobe_trampoline code so that ORC
information is generated on the kretprobe_trampoline correctly.

Note that when CONFIG_FRAME_POINTER=y, since the
kretprobe_trampoline skips updating the frame pointer, the stack frame
of the kretprobe_trampoline seems non-standard. So this marks it
as STACK_FRAME_NON_STANDARD() and undefines UNWIND_HINT_FUNC.
Anyway, with the frame pointer, the FP unwinder can unwind the stack
frame correctly without that hint.

Signed-off-by: Josh Poimboeuf 
Signed-off-by: Masami Hiramatsu 
---
 Changes in v4:
  - Apply UNWIND_HINT_FUNC only if CONFIG_FRAME_POINTER=n.
---
 arch/x86/include/asm/unwind_hints.h |5 +
 arch/x86/kernel/kprobes/core.c  |   17 +++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index 8e574c0afef8..8b33674288ea 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -52,6 +52,11 @@
UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=8 type=UNWIND_HINT_TYPE_FUNC
 .endm
 
+#else
+
+#define UNWIND_HINT_FUNC \
+   UNWIND_HINT(ORC_REG_SP, 8, UNWIND_HINT_TYPE_FUNC, 0)
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_X86_UNWIND_HINTS_H */
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 427d648fffcd..23255663c434 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -760,6 +760,19 @@ int kprobe_int3_handler(struct pt_regs *regs)
 }
 NOKPROBE_SYMBOL(kprobe_int3_handler);
 
+#ifdef CONFIG_FRAME_POINTER
+/*
+ * kretprobe_trampoline skips updating frame pointer. The frame pointer
+ * saved in trampoline_handler points to the real caller function's
+ * frame pointer. Thus the kretprobe_trampoline doesn't seems to have a
+ * standard stack frame with CONFIG_FRAME_POINTER=y.
+ * Let's mark it non-standard function. Anyway, FP unwinder can correctly
+ * unwind without the hint.
+ */
+STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
+#undef UNWIND_HINT_FUNC
+#define UNWIND_HINT_FUNC
+#endif
 /*
  * When a retprobed function returns, this code saves registers and
  * calls trampoline_handler() runs, which calls the kretprobe's handler.
@@ -772,6 +785,7 @@ asm(
/* We don't bother saving the ss register */
 #ifdef CONFIG_X86_64
"   pushq %rsp\n"
+   UNWIND_HINT_FUNC
"   pushfq\n"
SAVE_REGS_STRING
"   movq %rsp, %rdi\n"
@@ -782,6 +796,7 @@ asm(
"   popfq\n"
 #else
"   pushl %esp\n"
+   UNWIND_HINT_FUNC
"   pushfl\n"
SAVE_REGS_STRING
"   movl %esp, %eax\n"
@@ -795,8 +810,6 @@ asm(
".size kretprobe_trampoline, .-kretprobe_trampoline\n"
 );
 NOKPROBE_SYMBOL(kretprobe_trampoline);
-STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
-
 
 /*
  * Called from kretprobe_trampoline

Re: [Linuxarm] Re: [PATCH v14 07/13] iommu/smmuv3: Implement cache_invalidate

2021-03-21 Thread chenxiang (M)


Hi Eric,


On 2021/3/20 1:36, Auger Eric wrote:

Hi Chenxiang,

On 3/4/21 8:55 AM, chenxiang (M) wrote:

Hi Eric,


On 2021/2/24 4:56, Eric Auger wrote:

Implement domain-selective, pasid selective and page-selective
IOTLB invalidations.

Signed-off-by: Eric Auger 

---

v13 -> v14:
- Add domain invalidation
- do global inval when asid is not provided with addr
   granularity

v7 -> v8:
- ASID based invalidation using iommu_inv_pasid_info
- check ARCHID/PASID flags in addr based invalidation
- use __arm_smmu_tlb_inv_context and __arm_smmu_tlb_inv_range_nosync

v6 -> v7
- check the uapi version

v3 -> v4:
- adapt to changes in the uapi
- add support for leaf parameter
- do not use arm_smmu_tlb_inv_range_nosync or arm_smmu_tlb_inv_context
   anymore

v2 -> v3:
- replace __arm_smmu_tlb_sync by arm_smmu_cmdq_issue_sync

v1 -> v2:
- properly pass the asid
---
  drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 74 +
  1 file changed, 74 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 4c19a1114de4..df3adc49111c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2949,6 +2949,79 @@ static void arm_smmu_detach_pasid_table(struct iommu_domain *domain)
mutex_unlock(&smmu_domain->init_mutex);
  }
  
+static int

+arm_smmu_cache_invalidate(struct iommu_domain *domain, struct device *dev,
+ struct iommu_cache_invalidate_info *inv_info)
+{
+   struct arm_smmu_cmdq_ent cmd = {.opcode = CMDQ_OP_TLBI_NSNH_ALL};
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+
+   if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
+   return -EINVAL;
+
+   if (!smmu)
+   return -EINVAL;
+
+   if (inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1)
+   return -EINVAL;
+
+   if (inv_info->cache & IOMMU_CACHE_INV_TYPE_PASID ||
+   inv_info->cache & IOMMU_CACHE_INV_TYPE_DEV_IOTLB) {
+   return -ENOENT;
+   }
+
+   if (!(inv_info->cache & IOMMU_CACHE_INV_TYPE_IOTLB))
+   return -EINVAL;
+
+   /* IOTLB invalidation */
+
+   switch (inv_info->granularity) {
+   case IOMMU_INV_GRANU_PASID:
+   {
+   struct iommu_inv_pasid_info *info =
+   &inv_info->granu.pasid_info;
+
+   if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID)
+   return -ENOENT;
+   if (!(info->flags & IOMMU_INV_PASID_FLAGS_ARCHID))
+   return -EINVAL;
+
+   __arm_smmu_tlb_inv_context(smmu_domain, info->archid);
+   return 0;
+   }
+   case IOMMU_INV_GRANU_ADDR:
+   {
+   struct iommu_inv_addr_info *info = &inv_info->granu.addr_info;
+   size_t size = info->nb_granules * info->granule_size;
+   bool leaf = info->flags & IOMMU_INV_ADDR_FLAGS_LEAF;
+
+   if (info->flags & IOMMU_INV_ADDR_FLAGS_PASID)
+   return -ENOENT;
+
+   if (!(info->flags & IOMMU_INV_ADDR_FLAGS_ARCHID))
+   break;
+
+   arm_smmu_tlb_inv_range_domain(info->addr, size,
+ info->granule_size, leaf,
+ info->archid, smmu_domain);

Is it possible to add a check before the function to make sure that
info->granule_size can be recognized by the SMMU?
There is a scenario that causes a TLBI issue: the RIL feature is enabled
in the guest but disabled on the host, so the host
invalidates only 4K/2M/1G at a time, but from QEMU,
info->nb_granules is always 1 and info->granule_size = size.
If size is not equal to 4K, 2M or 1G (for example size = granule_size
is 5M), it will invalidate only part of the range it wants to invalidate.


Do you have any idea about this issue?



I think maybe we can add a check here: if RIL is not enabled and
size is not a granule_size (4K/2M/1G) supported by the
SMMU hardware, can we simply always use the smallest granule_size
supported by the hardware?
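
The check proposed above can be pictured with a minimal user-space sketch.
This is not SMMU driver code: the helper name pick_inv_granule() and the
pgsize_bitmap argument are assumptions made for illustration (the bitmap
stands for the page sizes the host hardware supports). If the requested
granule is not one of those sizes, the sketch falls back to the smallest
supported granule and recomputes the granule count so that the whole range
is still covered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host capabilities: 4K, 2M and 1G granules. */
#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

struct inv_range {
	uint64_t granule_size;	/* granule actually used for the TLBI */
	uint64_t nb_granules;	/* number of granules covering the range */
};

/*
 * Keep the requested granule if the hardware supports it; otherwise fall
 * back to the smallest supported granule and round the count up.
 */
struct inv_range pick_inv_granule(uint64_t granule_size, uint64_t nb_granules,
				  uint64_t pgsize_bitmap)
{
	struct inv_range r = { granule_size, nb_granules };
	uint64_t size = granule_size * nb_granules;
	uint64_t smallest;

	assert(pgsize_bitmap != 0);

	/* A supported granule is a single bit that is set in the bitmap. */
	if (granule_size && (granule_size & (granule_size - 1)) == 0 &&
	    (granule_size & pgsize_bitmap))
		return r;

	/* The lowest set bit is the smallest supported page size. */
	smallest = pgsize_bitmap & (~pgsize_bitmap + 1);
	r.granule_size = smallest;
	r.nb_granules = (size + smallest - 1) / smallest;
	return r;
}
```

With granule_size = 5M, nb_granules = 1 and the bitmap 4K | 2M | 1G, this
yields 1280 granules of 4K instead of a single command that covers only part
of the range.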


+
+   arm_smmu_cmdq_issue_sync(smmu);
+   return 0;
+   }
+   case IOMMU_INV_GRANU_DOMAIN:
+   break;

I checked the QEMU code
(https://github.com/eauger/qemu/tree/v5.2.0-2stage-rfcv8): for opcode
CMD_TLBI_NH_ALL or CMD_TLBI_NSNH_ALL from the guest OS,
it calls smmu_inv_notifiers_all() to unmap all notifiers of all MRs,
but it does not seem to set event.entry.granularity, which I think should be
set to IOMMU_INV_GRAN_ADDR.

this is because IOMMU_INV_GRAN_ADDR = 0. But for clarity I should rather
set it explicitly ;-)


ok


BTW, for opcode CMD_TLBI_NH_ALL or CMD_TLBI_NSNH_ALL, it needs to unmap
size = 0x1 on a 48-bit system (which may take a long time), so maybe
it is better
to set it as IOMMU_INV_GRANU_DOMAIN; then, in the host OS, send

[PATCH -tip v4 04/12] kprobes: Add kretprobe_find_ret_addr() for searching return address

Add kretprobe_find_ret_addr() for searching the correct return address
from the kretprobe instance list.

Signed-off-by: Masami Hiramatsu 
---
 Changes in v3:
  - Remove generic stacktrace fixup. Instead, it should be solved in
each unwinder. This just provides the generic interface.
 Changes in v2:
  - Add is_kretprobe_trampoline() for checking address outside of
kretprobe_find_ret_addr()
  - Remove unneeded addr from kretprobe_find_ret_addr()
  - Rename fixup_kretprobe_tramp_addr() to fixup_kretprobe_trampoline()
---
 include/linux/kprobes.h |   22 +++
 kernel/kprobes.c|   90 +--
 2 files changed, 86 insertions(+), 26 deletions(-)
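
The search described above can be modelled in user space as follows. This is
only a sketch under assumptions: struct instance, TRAMPOLINE_ADDR and the
function names are simplified stand-ins for struct kretprobe_instance, the
trampoline address and the helpers added by the patch, not the kernel code
itself. The cursor argument lets a caller resume the walk where the previous
lookup stopped.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kretprobe trampoline address (illustrative value). */
#define TRAMPOLINE_ADDR 0xdeadUL

/* Simplified user-space model of struct kretprobe_instance. */
struct instance {
	struct instance *next;	/* next (older) instance of the task */
	unsigned long ret_addr;	/* saved return address */
	void *fp;		/* frame pointer recorded at function entry */
};

/*
 * Walk the list starting after *cur (or from head when *cur is NULL) and
 * return the first saved address that is not the trampoline itself.
 * Returns 0 when the list is exhausted.
 */
unsigned long find_ret_addr_any(struct instance *head, struct instance **cur)
{
	struct instance *node = *cur ? (*cur)->next : head;

	for (; node; node = node->next) {
		if (node->ret_addr != TRAMPOLINE_ADDR) {
			*cur = node;
			return node->ret_addr;
		}
	}
	return 0;
}

/* Same search, but skip real return addresses that belong to other frames. */
unsigned long find_ret_addr(struct instance *head, void *fp,
			    struct instance **cur)
{
	unsigned long ret;

	do {
		ret = find_ret_addr_any(head, cur);
		if (!ret)
			return 0;
	} while ((*cur)->fp != fp);

	return ret;
}
```

An unwinder-like caller would start with a NULL cursor and call
find_ret_addr() once per trampoline entry it meets on the stack.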

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 65dadd4238a2..f530f82a046d 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -215,6 +215,14 @@ static nokprobe_inline void *kretprobe_trampoline_addr(void)
return dereference_function_descriptor(kretprobe_trampoline);
 }
 
+static nokprobe_inline bool is_kretprobe_trampoline(unsigned long addr)
+{
+   return (void *)addr == kretprobe_trampoline_addr();
+}
+
+unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
+ struct llist_node **cur);
+
 /* If the trampoline handler called from a kprobe, use this version */
 unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
 void *frame_pointer);
@@ -514,6 +522,20 @@ static inline bool is_kprobe_optinsn_slot(unsigned long addr)
 }
 #endif
 
+#if !defined(CONFIG_KRETPROBES)
+static nokprobe_inline bool is_kretprobe_trampoline(unsigned long addr)
+{
+   return false;
+}
+
+static nokprobe_inline
+unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
+ struct llist_node **cur)
+{
+   return 0;
+}
+#endif
+
 /* Returns true if kprobes handled the fault */
 static nokprobe_inline bool kprobe_page_fault(struct pt_regs *regs,
  unsigned int trap)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 75c0a58c19c2..cf19edc038e4 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1858,45 +1858,68 @@ static struct notifier_block kprobe_exceptions_nb = {
 
 #ifdef CONFIG_KRETPROBES
 
-unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
-void *frame_pointer)
+/* This assumes the tsk is current or the task which is not running. */
+static unsigned long __kretprobe_find_ret_addr(struct task_struct *tsk,
+  struct llist_node **cur)
 {
-   kprobe_opcode_t *correct_ret_addr = NULL;
struct kretprobe_instance *ri = NULL;
-   struct llist_node *first, *node;
-   struct kretprobe *rp;
+   struct llist_node *node = *cur;
+
+   if (!node)
+   node = tsk->kretprobe_instances.first;
+   else
+   node = node->next;
 
-   /* Find all nodes for this frame. */
-   first = node = current->kretprobe_instances.first;
while (node) {
ri = container_of(node, struct kretprobe_instance, llist);
-
-   BUG_ON(ri->fp != frame_pointer);
-
if (ri->ret_addr != kretprobe_trampoline_addr()) {
-   correct_ret_addr = ri->ret_addr;
-   /*
-* This is the real return address. Any other
-* instances associated with this task are for
-* other calls deeper on the call stack
-*/
-   goto found;
+   *cur = node;
+   return (unsigned long)ri->ret_addr;
}
-
node = node->next;
}
-   pr_err("Oops! Kretprobe fails to find correct return address.\n");
-   BUG_ON(1);
+   return 0;
+}
+NOKPROBE_SYMBOL(__kretprobe_find_ret_addr);
 
-found:
-   /* Unlink all nodes for this frame. */
-   current->kretprobe_instances.first = node->next;
-   node->next = NULL;
+unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
+ struct llist_node **cur)
+{
+   struct kretprobe_instance *ri = NULL;
+   unsigned long ret;
+
+   do {
+   ret = __kretprobe_find_ret_addr(tsk, cur);
+   if (!ret)
+   return ret;
+   ri = container_of(*cur, struct kretprobe_instance, llist);
+   } while (ri->fp != fp);
+
+   return ret;
+}
+NOKPROBE_SYMBOL(kretprobe_find_ret_addr);
 
-   /* Run them..  */
+unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
+void *frame_pointer)
+{
+   kprobe_opcode_t *correct_ret_addr = NULL;
+   struct kretprobe_instance *ri = NULL;
+   struct llist_nod

linux-next: manual merge of the akpm tree with the arm64 tree

Hi all,

Today's linux-next merge of the akpm tree got a conflict in:

  arch/arm64/mm/mmu.c

between commit:

  87143f404f33 ("arm64: mm: use XN table mapping attributes for the linear region")

from the arm64 tree and commit:

  0a2634348ef8 ("set_memory: allow querying whether set_direct_map_*() is actually enabled")

from the akpm tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/arm64/mm/mmu.c
index 4c2305cca6d2,fb675069a3b7..
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@@ -503,20 -490,11 +504,20 @@@ static void __init map_mem(pgd_t *pgdp
phys_addr_t kernel_start = __pa_symbol(_stext);
phys_addr_t kernel_end = __pa_symbol(__init_begin);
phys_addr_t start, end;
 -  int flags = 0;
 +  int flags = NO_EXEC_MAPPINGS;
u64 i;
  
 +  /*
 +   * Setting hierarchical PXNTable attributes on table entries covering
 +   * the linear region is only possible if it is guaranteed that no table
 +   * entries at any level are being shared between the linear region and
 +   * the vmalloc region. Check whether this is true for the PGD level, in
 +   * which case it is guaranteed to be true for all other levels as well.
 +   */
 +  BUILD_BUG_ON(pgd_index(direct_map_end - 1) == pgd_index(direct_map_end));
 +
-   if (rodata_full || crash_mem_map || debug_pagealloc_enabled())
+   if (can_set_direct_map() || crash_mem_map)
 -  flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 +  flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
  
/*
 * Take care not to create a writable alias for the
@@@ -1468,9 -1446,8 +1469,8 @@@ int arch_add_memory(int nid, u64 start
 * KFENCE requires linear map to be mapped at page granularity, so that
 * it is possible to protect/unprotect single pages in the KFENCE pool.
 */
-   if (rodata_full || debug_pagealloc_enabled() ||
-   IS_ENABLED(CONFIG_KFENCE))
+   if (can_set_direct_map() || IS_ENABLED(CONFIG_KFENCE))
 -  flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
 +  flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
  
__create_pgd_mapping(swapper_pg_dir, start, __phys_to_virt(start),
 size, params->pgprot, __pgd_pgtable_alloc,
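
The key point of the resolution above is that the assignment becomes an OR,
so the NO_EXEC_MAPPINGS default from the arm64 tree survives the condition
changed by the akpm tree. A small stand-alone sketch of the difference
follows. The bit values and the page_granular parameter (which stands for a
condition such as can_set_direct_map() || crash_mem_map) are illustrative
assumptions, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the real definitions live in arch/arm64/mm/mmu.c. */
#define NO_BLOCK_MAPPINGS (1 << 0)
#define NO_CONT_MAPPINGS  (1 << 1)
#define NO_EXEC_MAPPINGS  (1 << 2)

/* Resolution as merged: start from NO_EXEC_MAPPINGS and OR in the rest. */
int map_mem_flags_merged(bool page_granular)
{
	int flags = NO_EXEC_MAPPINGS;

	if (page_granular)
		flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
	return flags;
}

/* A naive resolution that kept the plain assignment drops NO_EXEC_MAPPINGS. */
int map_mem_flags_naive(bool page_granular)
{
	int flags = NO_EXEC_MAPPINGS;

	if (page_granular)
		flags = NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
	return flags;
}
```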



[PATCH -tip v4 03/12] kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler()

Remove trampoline_address from kretprobe_trampoline_handler().
Instead of passing the address, kretprobe_trampoline_handler()
can use the new kretprobe_trampoline_addr().

Signed-off-by: Masami Hiramatsu 
---
 Changes in v3:
   - Remove wrong kretprobe_trampoline declaration from
 arch/x86/include/asm/kprobes.h.
 Changes in v2:
   - Remove arch_deref_entry_point() from comment.
---
 arch/arc/kernel/kprobes.c  |2 +-
 arch/arm/probes/kprobes/core.c |3 +--
 arch/arm64/kernel/probes/kprobes.c |3 +--
 arch/csky/kernel/probes/kprobes.c  |2 +-
 arch/ia64/kernel/kprobes.c |5 ++---
 arch/mips/kernel/kprobes.c |3 +--
 arch/parisc/kernel/kprobes.c   |4 ++--
 arch/powerpc/kernel/kprobes.c  |2 +-
 arch/riscv/kernel/probes/kprobes.c |2 +-
 arch/s390/kernel/kprobes.c |2 +-
 arch/sh/kernel/kprobes.c   |2 +-
 arch/sparc/kernel/kprobes.c|2 +-
 arch/x86/include/asm/kprobes.h |1 -
 arch/x86/kernel/kprobes/core.c |2 +-
 include/linux/kprobes.h|   18 +-
 kernel/kprobes.c   |3 +--
 16 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/arch/arc/kernel/kprobes.c b/arch/arc/kernel/kprobes.c
index cabef45f11df..3ae01bb5820c 100644
--- a/arch/arc/kernel/kprobes.c
+++ b/arch/arc/kernel/kprobes.c
@@ -397,7 +397,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 static int __kprobes trampoline_probe_handler(struct kprobe *p,
  struct pt_regs *regs)
 {
-   regs->ret = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+   regs->ret = __kretprobe_trampoline_handler(regs, NULL);
 
/* By returning a non zero value, we are telling the kprobe handler
 * that we don't want the post_handler to run
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index a9653117ca0d..1782b41df095 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -413,8 +413,7 @@ void __naked __kprobes kretprobe_trampoline(void)
 /* Called from kretprobe_trampoline */
 static __used __kprobes void *trampoline_handler(struct pt_regs *regs)
 {
-   return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline,
-   (void *)regs->ARM_fp);
+   return (void *)kretprobe_trampoline_handler(regs, (void *)regs->ARM_fp);
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index 66aac2881ba8..fce681fdfce6 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -412,8 +412,7 @@ int __init arch_populate_kprobe_blacklist(void)
 
 void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
 {
-   return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline,
-   (void *)kernel_stack_pointer(regs));
+   return (void *)kretprobe_trampoline_handler(regs, (void *)kernel_stack_pointer(regs));
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
index 589f090f48b9..cc589bc11904 100644
--- a/arch/csky/kernel/probes/kprobes.c
+++ b/arch/csky/kernel/probes/kprobes.c
@@ -404,7 +404,7 @@ int __init arch_populate_kprobe_blacklist(void)
 
 void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
 {
-   return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+   return (void *)kretprobe_trampoline_handler(regs, NULL);
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index 15871eb170c0..a008df8e7203 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -392,14 +392,13 @@ static void __kprobes set_current_kprobe(struct kprobe *p,
__this_cpu_write(current_kprobe, p);
 }
 
-static void kretprobe_trampoline(void)
+void kretprobe_trampoline(void)
 {
 }
 
 int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
-   regs->cr_iip = __kretprobe_trampoline_handler(regs,
-   dereference_function_descriptor(kretprobe_trampoline), NULL);
+   regs->cr_iip = __kretprobe_trampoline_handler(regs, NULL);
/*
 * By returning a non-zero value, we are telling
 * kprobe_handler() that we don't want the post_handler
diff --git a/arch/mips/kernel/kprobes.c b/arch/mips/kernel/kprobes.c
index 54dfba8fa77c..001a2f07ef44 100644
--- a/arch/mips/kernel/kprobes.c
+++ b/arch/mips/kernel/kprobes.c
@@ -489,8 +489,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 static int __kprobes trampoline_probe_handler(struct kprobe *p,

Re: [PATCH net-next v2 2/2] net: ipa: fix IPA validation

2021-03-21 Thread Leon Romanovsky

On Sun, Mar 21, 2021 at 12:19:02PM -0500, Alex Elder wrote:
> On 3/21/21 8:49 AM, Leon Romanovsky wrote:
> > On Sun, Mar 21, 2021 at 08:21:24AM -0500, Alex Elder wrote:
> >> On 3/21/21 3:21 AM, Leon Romanovsky wrote:
> >>> On Sat, Mar 20, 2021 at 09:17:29AM -0500, Alex Elder wrote:
> >>>> There are blocks of IPA code that sanity-check various values, at
> >>>> compile time where possible.  Most of these checks can be done once
> >>>> during development but skipped for normal operation.  These checks
> >>>> permit the driver to make certain assumptions, thereby avoiding the
> >>>> need for runtime error checking.
> >>>>
> >>>> The checks are defined conditionally, but not consistently.  In
> >>>> some cases IPA_VALIDATION enables the optional checks, while in
> >>>> others IPA_VALIDATE is used.
> >>>>
> >>>> Fix this by using IPA_VALIDATION consistently.
> >>>>
> >>>> Signed-off-by: Alex Elder
> >>>> ---
> >>>>   drivers/net/ipa/Makefile   | 2 +-
> >>>>   drivers/net/ipa/gsi_trans.c| 8 
> >>>>   drivers/net/ipa/ipa_cmd.c  | 4 ++--
> >>>>   drivers/net/ipa/ipa_cmd.h  | 6 +++---
> >>>>   drivers/net/ipa/ipa_endpoint.c | 6 +++---
> >>>>   drivers/net/ipa/ipa_main.c | 6 +++---
> >>>>   drivers/net/ipa/ipa_mem.c  | 6 +++---
> >>>>   drivers/net/ipa/ipa_table.c| 6 +++---
> >>>>   drivers/net/ipa/ipa_table.h| 6 +++---
> >>>>   9 files changed, 25 insertions(+), 25 deletions(-)
> >>>>
> >>>> diff --git a/drivers/net/ipa/Makefile b/drivers/net/ipa/Makefile
> >>>> index afe5df1e6..014ae36ac6004 100644
> >>>> --- a/drivers/net/ipa/Makefile
> >>>> +++ b/drivers/net/ipa/Makefile
> >>>> @@ -1,5 +1,5 @@
> >>>>   # Un-comment the next line if you want to validate configuration data
> >>>> -#ccflags-y  +=  -DIPA_VALIDATE
> >>>> +# ccflags-y +=  -DIPA_VALIDATION
> >>>
> >>> Maybe netdev folks think differently here, but the general rule is that dead
> >>> code, and closed code as such, is not acceptable in the Linux kernel.
> >>>
> >>> <...>
> >>
> >> What is the purpose of CONFIG_KGDB?  Or CONFIG_DEBUG_KERNEL?
> >> Would you prefer I expose this through a kconfig option?  I
> >> intentionally did not do that, because I really intended it
> >> to be only for development, so defined it in the Makefile.
> >> But I have no objection to making it configurable that way.
> > 
> > I prefer you to follow netdev/linux kernel rules of development.
> > The upstream repository and drivers/net/* folder especially are not
> > the place to put code used for the development.
> 
> How do I add support for new versions of the hardware as
> it evolves?

Exactly like all other driver developers do. You are not different here.

1. Clean your driver to have stable base without dead/debug code.
2. Send patch series per-feature/hardware enablement on top of this base.

> 
> What I started supporting (v3.5.1) was in some respects
> relatively old.  Version 4.2 is newer, and the v4.5 and
> beyond are for products that are relatively new on the
> market.

I see that it was submitted in 2018; we have many large drivers
that are far older than IPA. For example, mlx5 supports more than 5
generations of hardware and was added in 2013.

> 
> Some updates to IPA (like 4.0+ after 3.5.1, or 4.5+
> after 4.2) include substantial updates to the way the
> hardware works.  The code can't support the new hardware
> without being adapted and generalized to support both
> old and new.

It is ok.

> 
> My goal is to get upstream support for IPA for all
> Qualcomm SoCs that have it.  But the hardware design
> is evolving; Qualcomm is actively developing their
> architecture so they can support new technologies
> (e.g. cellular 5G).  Development of the driver is
> simply *necessary*.

No argument here.

> 
> The assertions I proposed and checks like this are
> intended as an *aid* to the active development I
> have been doing.

They need to be local to your development environment.
It is perfectly fine if you keep an extra debug patch internally.

> 
> They may look like hacky debugging--checking errors
> that can't happen.  They aren't that at all--they're
> intended to let the compiler help me develop correct code,
> given I *know* it will be evolving.

It is a wrong assumption that you are the only one reading this code.
I presented my view as a casual developer who sometimes needs to change
code that is outside my expertise.

Extra checks that are unreachable and/or guard non-existing code are very similar
to bad comments: they make simple tasks too complex and cause us to
wonder why they exist, whether something got broken, etc.

Unreachable code and checks sometimes serve as a hint for code deletion,
and this is exactly how I spotted your driver.
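
For readers following the thread, the pattern under discussion looks roughly
like the sketch below. It is only an illustration: the function name
table_size_valid() is hypothetical, and the check itself mirrors the
"!size || size % 8" test from the quoted patch. When IPA_VALIDATION is not
defined, the check compiles away, which is exactly why such code is never
exercised in normal builds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef IPA_VALIDATION
/* Development-only check: the size must be a nonzero multiple of 8. */
bool table_size_valid(size_t size)
{
	return size != 0 && size % 8 == 0;
}
#else
/* Normal builds: the check is compiled out and always succeeds. */
static inline bool table_size_valid(size_t size)
{
	(void)size;
	return true;
}
#endif
```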

> 
> But the assertions are gone, and I accept/agree that
> these specific checks "look funny."  More below.
> 
> >>>> -#ifdef IPA_VALIDATE
> >>>> +#ifdef IPA_VALIDATION
> >>>>  if (!size || size % 8)
> >>>>  return -EINVAL;
> >>>>  if (count

[PATCH -tip v4 02/12] kprobes: treewide: Replace arch_deref_entry_point() with dereference_function_descriptor()

Replace arch_deref_entry_point() with dereference_function_descriptor()
because they do the same thing.

Signed-off-by: Masami Hiramatsu 
---
 arch/ia64/kernel/kprobes.c|5 -
 arch/powerpc/kernel/kprobes.c |   11 ---
 include/linux/kprobes.h   |1 -
 kernel/kprobes.c  |7 +--
 lib/error-inject.c|3 ++-
 5 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index 006fbc1d7ae9..15871eb170c0 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -907,11 +907,6 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
return ret;
 }
 
-unsigned long arch_deref_entry_point(void *entry)
-{
-   return ((struct fnptr *)entry)->ip;
-}
-
 static struct kprobe trampoline_p = {
.pre_handler = trampoline_probe_handler
 };
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 01ab2163659e..eb0460949e1b 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -539,17 +539,6 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
 }
 NOKPROBE_SYMBOL(kprobe_fault_handler);
 
-unsigned long arch_deref_entry_point(void *entry)
-{
-#ifdef PPC64_ELF_ABI_v1
-   if (!kernel_text_address((unsigned long)entry))
-   return ppc_global_function_entry(entry);
-   else
-#endif
-   return (unsigned long)entry;
-}
-NOKPROBE_SYMBOL(arch_deref_entry_point);
-
 static struct kprobe trampoline_p = {
.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
.pre_handler = trampoline_probe_handler
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 1883a4a9f16a..d65c041b5c22 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -390,7 +390,6 @@ int register_kprobe(struct kprobe *p);
 void unregister_kprobe(struct kprobe *p);
 int register_kprobes(struct kprobe **kps, int num);
 void unregister_kprobes(struct kprobe **kps, int num);
-unsigned long arch_deref_entry_point(void *);
 
 int register_kretprobe(struct kretprobe *rp);
 void unregister_kretprobe(struct kretprobe *rp);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 745f08fdd7a6..2913de07f4a3 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1856,11 +1856,6 @@ static struct notifier_block kprobe_exceptions_nb = {
.priority = 0x7fff /* we need to be notified first */
 };
 
-unsigned long __weak arch_deref_entry_point(void *entry)
-{
-   return (unsigned long)entry;
-}
-
 #ifdef CONFIG_KRETPROBES
 
 unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
@@ -2324,7 +2319,7 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
int ret;
 
for (iter = start; iter < end; iter++) {
-   entry = arch_deref_entry_point((void *)*iter);
+   entry = (unsigned long)dereference_function_descriptor((void *)*iter);
ret = kprobe_add_ksym_blacklist(entry);
if (ret == -EINVAL)
continue;
diff --git a/lib/error-inject.c b/lib/error-inject.c
index c73651b15b76..f71875ac5f9f 100644
--- a/lib/error-inject.c
+++ b/lib/error-inject.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Whitelist of symbols that can be overridden for error injection. */
 static LIST_HEAD(error_injection_list);
@@ -64,7 +65,7 @@ static void populate_error_injection_list(struct error_injection_entry *start,
 
mutex_lock(&ei_mutex);
for (iter = start; iter < end; iter++) {
-   entry = arch_deref_entry_point((void *)iter->addr);
+   entry = (unsigned long)dereference_function_descriptor((void *)iter->addr);
 
if (!kernel_text_address(entry) ||
!kallsyms_lookup_size_offset(entry, &size, &offset)) {

[PATCH -tip v4 00/12] kprobes: Fix stacktrace with kretprobes on x86

Hello,

Here is the 4th version of the series to fix the stacktrace with kretprobe
on x86. After merging this, I'll fix other architectures.

The previous version is:

https://lore.kernel.org/bpf/161615650355.306069.17260992641363840330.stgit@devnote2/

This version fixes some build warnings/errors and a bug on arm. (I think
arm's kretprobe implementation is a bit odd. Anyway, that is off topic.)
[5/12] fixes an objtool warning when CONFIG_FRAME_POINTER=y. [7/12] fixes a
build error on ia64. It also adds [8/12] to avoid stack corruption by
instruction_pointer_set() in kretprobe_trampoline_handler on arm.

With this series, the unwinder can unwind the stack correctly from ftrace, as below:

  # cd /sys/kernel/debug/tracing
  # echo > trace
  # echo r vfs_read >> kprobe_events
  # echo r full_proxy_read >> kprobe_events
  # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
  # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
  # echo 1 > events/kprobes/enable
  # echo 1 > options/sym-offset
  # cat /sys/kernel/debug/kprobes/list
8133b740  r  full_proxy_read+0x0[FTRACE]
812560b0  r  vfs_read+0x0[FTRACE]
  # echo 0 > events/kprobes/enable
  # cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 3/3   #P:8
#
#_-=> irqs-off
#   / _=> need-resched
#  | / _---=> hardirq/softirq
#  || / _--=> preempt-depth
#  ||| / delay
#   TASK-PID CPU#     TIMESTAMP  FUNCTION
#  | | |     | |
   <...>-135 [005] ...1 9.422114: r_full_proxy_read_0: 
(vfs_read+0xab/0x1a0 <- full_proxy_read)
   <...>-135 [005] ...1 9.422158: 
 => kretprobe_trace_func+0x209/0x2f0
 => kretprobe_dispatcher+0x4a/0x70
 => __kretprobe_trampoline_handler+0xca/0x150
 => trampoline_handler+0x44/0x70
 => kretprobe_trampoline+0x2a/0x50
 => vfs_read+0xab/0x1a0
 => ksys_read+0x5f/0xe0
 => do_syscall_64+0x33/0x40
 => entry_SYSCALL_64_after_hwframe+0x44/0xae
 => 0

This shows that the double return probes (vfs_read and full_proxy_read) on the stack
are correctly unwound. (vfs_read was called from ksys_read+0x5f and full_proxy_read
was called from vfs_read+0xab)

This actually changes the kretprobe behavior a bit: now the instruction pointer in
the pt_regs passed to the kretprobe user handler is correctly set to the real return
address, so user handlers can get it via the instruction_pointer() API.

You can also get this series from
 git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v4


Thank you,

---

Josh Poimboeuf (1):
  x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline code

Masami Hiramatsu (11):
  ia64: kprobes: Fix to pass correct trampoline address to the handler
  kprobes: treewide: Replace arch_deref_entry_point() with dereference_function_descriptor()
  kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler()
  kprobes: Add kretprobe_find_ret_addr() for searching return address
  ARC: Add instruction_pointer_set() API
  ia64: Add instruction_pointer_set() API
  arm: kprobes: Make a space for regs->ARM_pc at kretprobe_trampoline
  kprobes: Setup instruction pointer in __kretprobe_trampoline_handler
  x86/kprobes: Push a fake return address at kretprobe_trampoline
  x86/unwind: Recover kretprobe trampoline entry
  tracing: Show kretprobe unknown indicator only for kretprobe_trampoline


 arch/arc/include/asm/ptrace.h   |5 ++
 arch/arc/kernel/kprobes.c   |2 -
 arch/arm/probes/kprobes/core.c  |5 +-
 arch/arm64/kernel/probes/kprobes.c  |3 -
 arch/csky/kernel/probes/kprobes.c   |2 -
 arch/ia64/include/asm/ptrace.h  |5 ++
 arch/ia64/kernel/kprobes.c  |   15 ++---
 arch/mips/kernel/kprobes.c  |3 -
 arch/parisc/kernel/kprobes.c|4 +
 arch/powerpc/kernel/kprobes.c   |   13 -
 arch/riscv/kernel/probes/kprobes.c  |2 -
 arch/s390/kernel/kprobes.c  |2 -
 arch/sh/kernel/kprobes.c|2 -
 arch/sparc/kernel/kprobes.c |2 -
 arch/x86/include/asm/kprobes.h  |1 
 arch/x86/include/asm/unwind.h   |   17 ++
 arch/x86/include/asm/unwind_hints.h |5 ++
 arch/x86/kernel/kprobes/core.c  |   44 
 arch/x86/kernel/unwind_frame.c  |4 +
 arch/x86/kernel/unwind_guess.c  |3 -
 arch/x86/kernel/unwind_orc.c|6 +-
 include/linux/kprobes.h |   41 --
 kernel/kprobes.c|   99 ---
 kernel/trace/trace_output.c |   17 +-
 lib/error-inject.c  |3 +
 25 files changed, 200 insertions(+), 105 deletions(-)

--
Masami Hiramatsu (Linaro)

[PATCH -tip v4 01/12] ia64: kprobes: Fix to pass correct trampoline address to the handler

Commit e792ff804f49 ("ia64: kprobes: Use generic kretprobe trampoline handler")
passed the wrong trampoline address (it passes the descriptor address
instead of the function entry address).
Fix it to pass the correct trampoline address to
__kretprobe_trampoline_handler().
Also use the correct symbol dereference function to get the
function address from kretprobe_trampoline.

Fixes: e792ff804f49 ("ia64: kprobes: Use generic kretprobe trampoline handler")
Signed-off-by: Masami Hiramatsu 
---
 arch/ia64/kernel/kprobes.c |9 +
 1 file changed, 5 insertions(+), 4 deletions(-)
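
Why the descriptor address is the wrong value can be seen in a small
user-space model. This is a sketch under assumptions: struct fnptr here is a
simplified two-field model of an ia64 function descriptor, and
deref_descriptor() and is_trampoline() are hypothetical stand-ins for
dereference_function_descriptor() and the trampoline comparison done by the
generic handler.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of an ia64 function descriptor. */
struct fnptr {
	uintptr_t ip;	/* address of the first instruction (entry point) */
	uintptr_t gp;	/* global pointer for the function */
};

/* Model of dereferencing a descriptor to get the function entry address. */
uintptr_t deref_descriptor(const struct fnptr *desc)
{
	return desc->ip;
}

/*
 * Model of the trampoline check: a saved return address counts as "the
 * trampoline" only if it equals the trampoline's entry address.
 */
int is_trampoline(uintptr_t ret_addr, uintptr_t trampoline_addr)
{
	return ret_addr == trampoline_addr;
}
```

Return addresses replaced at function entry hold the entry address (ip), so
comparing them against the descriptor's own address never matches; comparing
against the dereferenced entry address does.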

diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index fc1ff8a4d7de..006fbc1d7ae9 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -398,7 +398,8 @@ static void kretprobe_trampoline(void)
 
 int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
-   regs->cr_iip = __kretprobe_trampoline_handler(regs, kretprobe_trampoline, NULL);
+   regs->cr_iip = __kretprobe_trampoline_handler(regs,
+   dereference_function_descriptor(kretprobe_trampoline), NULL);
/*
 * By returning a non-zero value, we are telling
 * kprobe_handler() that we don't want the post_handler
@@ -414,7 +415,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
ri->fp = NULL;
 
/* Replace the return addr with trampoline addr */
-   regs->b0 = ((struct fnptr *)kretprobe_trampoline)->ip;
+   regs->b0 = (unsigned long)dereference_function_descriptor(kretprobe_trampoline);
 }
 
 /* Check the instruction in the slot is break */
@@ -918,14 +919,14 @@ static struct kprobe trampoline_p = {
 int __init arch_init_kprobes(void)
 {
trampoline_p.addr =
-   (kprobe_opcode_t *)((struct fnptr *)kretprobe_trampoline)->ip;
+   dereference_function_descriptor(kretprobe_trampoline);
return register_kprobe(&trampoline_p);
 }
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 {
if (p->addr ==
-   (kprobe_opcode_t *)((struct fnptr *)kretprobe_trampoline)->ip)
+   dereference_function_descriptor(kretprobe_trampoline))
return 1;
 
return 0;

[PATCH] xfs: Fix a typo



s/strutures/structures/

Signed-off-by: Bhaskar Chowdhury 
---
 fs/xfs/xfs_aops.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index b4186d666157..1cc7c36d98e9 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -158,7 +158,7 @@ xfs_end_ioend(
nofs_flag = memalloc_nofs_save();

/*
-* Just clean up the in-memory strutures if the fs has been shut down.
+* Just clean up the in-memory structures if the fs has been shut down.
 */
if (XFS_FORCED_SHUTDOWN(ip->i_mount)) {
error = -EIO;
--
2.31.0

[PATCH] scsi: bnx2fc: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c 
b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index 16bb6d2f98de..8863a74e6c57 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -1796,7 +1796,7 @@ static void bnx2fc_unbind_pcidev(struct bnx2fc_hba *hba)
 /**
  * bnx2fc_ulp_get_stats - cnic callback to populate FCoE stats
  *
- * @handle:transport handle pointing to adapter struture
+ * @handle:transport handle pointing to adapter structure
  */
 static int bnx2fc_ulp_get_stats(void *handle)
 {
--
2.31.0

[PATCH] net: ethernet: Fix a typo



s/datastruture/"data structure"/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/net/ethernet/mediatek/mtk_eth_soc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index fd3cec8f06ba..79c9c6bd2e4f 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -908,7 +908,7 @@ struct mtk_eth {
  * @id:The number of the MAC
  * @interface: Interface mode kept for detecting change in hw settings
  * @of_node:   Our devicetree node
- * @hw:Backpointer to our main datastruture
+ * @hw:Backpointer to our main data structure
  * @hw_stats:  Packet statistics counter
  */
 struct mtk_mac {
--
2.31.0

[PATCH] liquidio: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/net/ethernet/cavium/liquidio/octeon_device.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_device.h 
b/drivers/net/ethernet/cavium/liquidio/octeon_device.h
index fb380b4f3e02..b402facfdc04 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_device.h
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_device.h
@@ -880,7 +880,7 @@ void octeon_set_droq_pkt_op(struct octeon_device *oct, u32 
q_no, u32 enable);
 void *oct_get_config_info(struct octeon_device *oct, u16 card_type);

 /** Gets the octeon device configuration
- *  @return - pointer to the octeon configuration struture
+ *  @return - pointer to the octeon configuration structure
  */
 struct octeon_config *octeon_get_conf(struct octeon_device *oct);

--
2.31.0

[PATCH] IB/hfi1: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/infiniband/hw/hfi1/iowait.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/iowait.h 
b/drivers/infiniband/hw/hfi1/iowait.h
index d580aa17ae37..377e00a109c2 100644
--- a/drivers/infiniband/hw/hfi1/iowait.h
+++ b/drivers/infiniband/hw/hfi1/iowait.h
@@ -321,7 +321,7 @@ static inline void iowait_drain_wakeup(struct iowait *wait)
 /**
  * iowait_get_txhead() - get packet off of iowait list
  *
- * @wait iowait_work struture
+ * @wait iowait_work structure
  */
 static inline struct sdma_txreq *iowait_get_txhead(struct iowait_work *wait)
 {
--
2.31.0

[PATCH] drm/msm/dpu: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
index 09a3fb3e89f5..bb9ceadeb0bb 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_mdss.h
@@ -343,7 +343,7 @@ enum dpu_3d_blend_mode {

 /** struct dpu_format - defines the format configuration which
  * allows DPU HW to correctly fetch and decode the format
- * @base: base msm_format struture containing fourcc code
+ * @base: base msm_format structure containing fourcc code
  * @fetch_planes: how the color components are packed in pixel format
  * @element: element color ordering
  * @bits: element bit widths
--
2.31.0

[RFC PATCH] arm64: dts: allwinner: a64/h5: Add CPU idle states

2021-03-21 Thread Samuel Holland

Powering off idle CPUs saves about 33 mW compared to using WFI only.
Additional power savings are possible by idling the L2 and downclocking
the cluster when all CPUs are idle.

Entry and exit latency were measured using a logic analyzer, with GPIO
pins toggled in Linux after the calls to trace_cpu_idle() in
cpuidle_enter_state(), and in the power management firmware after CPU
power-off completes and immediately after detecting an interrupt.

800 us and 1500 us are worst-case values, largely driven by the fact
that the power management firmware is single threaded. It can only
handle commands to power off CPUs one at a time, and it cannot process
any commands while powering on a CPU in response to an interrupt.

The cluster suspend process reliably takes 36 us; I rounded this up to
50 us. If all CPUs enter the cluster idle state at the same time, exit
latency is actually reduced, because there is no contention in that
case. However, if only some CPUs enter the cluster idle state, behavior
is the same as for CPU idle.

Polling delay for the power management firmware to detect a pending
interrupt is insignificant; it is less than 20 us.

min-residency was chosen as the point where enabling the idle state
consumed no more average power than disabling the idle state at a
variety of interrupt rates.

Signed-off-by: Samuel Holland
---

I'm sending this patch as an RFC because it raises questions about how
we handle firmware versioning. How far back does (or should) our support
for old TF-A and Crust versions go?

cpuidle has a problem that without working firmware support, CPUs will
enter idle states and be unable to wake up. As a result, the system will
hang at some point during boot, usually before getting to userspace.

For over a year[0], TF-A has exposed the PSCI CPU_SUSPEND function when
a SCPI implementation is present[1]. Implementing CPU_SUSPEND is
required for implementing SYSTEM_SUSPEND[2], even if CPU_SUSPEND is not
itself used for anything.

However, there was no code to actually wake up a CPU once it called the
CPU_SUSPEND function, because I could not find the register providing
the necessary information. The fact that CPU_SUSPEND was broken affected
nobody, because nothing ever called it -- there were no idle states in
the DTS. In hindsight, what I should have done was always return failure
from sunxi_validate_power_state(), but that ship has long sailed.

I finally found the elusive register and implemented the wakeup code
earlier this month[3]. So now, CPU_SUSPEND actually works, if all of
your firmware is up to date, and cpuidle works if you add the states in
your device tree.

Unfortunately, there is currently nothing verifying that compatibility.
So you can get into four possible scenarios:
1) No idle states in DTS, any firmware => Linux works, with baseline
power consumption.
2) Idle states added to DTS, no Crust/SCPI => Linux works, but every
attempt to enter an idle state is rejected because CPU_SUSPEND is
not hooked up. So power consumption increases by a sizable amount.
3) Idle states added to DTS, "old" Crust/SCPI (before [3]) => Linux
fails to boot, because CPUs never return from idle states.
4) Idle states added to DTS, "new" Crust/SCPI (after [3]) => Linux
works, with improved power consumption compared to the baseline.
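
The four scenarios above can be read as a small decision table. The
following is only an illustrative sketch: the enum and function names are
hypothetical and do not exist in Linux, TF-A, or Crust.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum scpi_firmware {
	FW_NONE,	/* no Crust/SCPI implementation present */
	FW_OLD,		/* Crust/SCPI without the wakeup code from [3] */
	FW_NEW,		/* Crust/SCPI with the wakeup code from [3] */
};

enum outcome {
	BOOTS_BASELINE_POWER,	/* scenario 1 */
	BOOTS_HIGHER_POWER,	/* scenario 2: every CPU_SUSPEND call is rejected */
	FAILS_TO_BOOT,		/* scenario 3: CPUs never return from idle */
	BOOTS_LOWER_POWER,	/* scenario 4 */
};

static enum outcome idle_outcome(bool idle_states_in_dts, enum scpi_firmware fw)
{
	/* Without idle states in the DTS, the firmware version is irrelevant. */
	if (!idle_states_in_dts)
		return BOOTS_BASELINE_POWER;

	switch (fw) {
	case FW_NONE:
		return BOOTS_HIGHER_POWER;
	case FW_OLD:
		return FAILS_TO_BOOT;
	case FW_NEW:
		return BOOTS_LOWER_POWER;
	}

	assert(!"unreachable: unknown firmware kind");
	return FAILS_TO_BOOT;
}
```

Only the (idle states present, FW_OLD) combination fails to boot, which is
the case the rest of this message tries to avoid.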

Obviously, we want to prevent scenario 3 if possible.

Enter the current patch: I chose the arm,psci-suspend-param values
specifically so they would be _rejected_ by the current TF-A code. This
makes scenario 3 behave like scenario 2. I then have some follow-up TF-A
patches (not yet submitted) to switch to the new parameter encoding[4].

This brings me back to my original question. Once the TF-A patches in
[4] are merged, scenario 3 (with an updated TF-A but an old Crust) would
fail to boot again. Do we care?

Should I implement some kind of runtime version checking, so TF-A can
disable CPU_SUSPEND if it would be broken? Or instead, should we wait
some amount of time to merge this patch (or the patches at [4]) and
assume people have upgraded?

Where would people expect this sort of possibly-breaking change to be
documented?

Separately, since I assume most A64/H5 users (outside of LibreELEC and
the PinePhone) are not using Crust, scenario 2 would be very common. If
merging this patch increases their idle power draw by 500 mW, is that an
acceptable cost for decreasing other users' idle power draw by 50 mW?

Sorry for the wall of text,
Samuel

[0]:
https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/commit/plat/allwinner/common/sunxi_pm.c?id=e382c88e2a26995099bb931d49e754dcaebc5593
[1]:
https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/plat/allwinner/common/sunxi_scpi_pm.c?id=2e0e51f42586826a1f6f6c1e532f90e6df642cf5#n190
[2]:
https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/tree/lib/psci/psci_setup.c?id=2e0e51f42586826a1f6f6c1e532f90e6df642cf5#n251
[3]: https://github.com/crust-firmware/crust/commi

[PATCH] s390/kernel: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 arch/s390/kernel/os_info.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/s390/kernel/os_info.c b/arch/s390/kernel/os_info.c
index 0a5e4bafb6ad..5a7420b23aa8 100644
--- a/arch/s390/kernel/os_info.c
+++ b/arch/s390/kernel/os_info.c
@@ -52,7 +52,7 @@ void os_info_entry_add(int nr, void *ptr, u64 size)
 }

 /*
- * Initialize OS info struture and set lowcore pointer
+ * Initialize OS info structure and set lowcore pointer
  */
 void __init os_info_init(void)
 {
--
2.31.0

Re: [PATCH v14 05/13] iommu/smmuv3: Implement attach/detach_pasid_table

2021-03-21 Thread Keqian Zhu

Hi Eric,

On 2021/3/19 21:15, Auger Eric wrote:
> Hi Keqian,
> 
> On 3/2/21 9:35 AM, Keqian Zhu wrote:
>> Hi Eric,
>>
>> On 2021/2/24 4:56, Eric Auger wrote:
>>> On attach_pasid_table() we program STE S1 related info set
>>> by the guest into the actual physical STEs. At minimum
>>> we need to program the context descriptor GPA and compute
>>> whether the stage1 is translated/bypassed or aborted.
>>>
>>> On detach, the stage 1 config is unset and the abort flag is
>>> unset.
>>>
>>> Signed-off-by: Eric Auger 
>>>
>> [...]
>>
>>> +
>>> +   /*
>>> +* we currently support a single CD so s1fmt and s1dss
>>> +* fields are also ignored
>>> +*/
>>> +   if (cfg->pasid_bits)
>>> +   goto out;
>>> +
>>> +   smmu_domain->s1_cfg.cdcfg.cdtab_dma = cfg->base_ptr;
>> only the "cdtab_dma" field of "cdcfg" is set, we are not able to locate a 
>> specific cd using arm_smmu_get_cd_ptr().
>>
>> Maybe we'd better use a specialized function to fill other fields of "cdcfg" 
>> or add a sanity check in arm_smmu_get_cd_ptr()
>> to prevent calling it under nested mode?
>>
>> As now we just call arm_smmu_get_cd_ptr() during finalise_s1(), no problem 
>> found. Just a suggestion ;-)
> 
> forgive me for the delay. yes I can indeed make sure that code is not
> called in nested mode. Please could you detail why you would need to
> call arm_smmu_get_cd_ptr()?
I accidentally called this function in nested mode while verifying the SMMU MPAM 
feature. :)

Yes, in nested mode the context descriptor is owned by the guest, so the hypervisor 
does not need to care about its content.
Maybe we'd better add an explicit comment to arm_smmu_get_cd_ptr() so that 
developers pay attention to this? :)

Thanks,
Keqian

> 
> Thanks
> 
> Eric
>>
>> Thanks,
>> Keqian
>>
>>
>>> +   smmu_domain->s1_cfg.set = true;
>>> +   smmu_domain->abort = false;
>>> +   break;
>>> +   default:
>>> +   goto out;
>>> +   }
>>> +   spin_lock_irqsave(&smmu_domain->devices_lock, flags);
>>> +   list_for_each_entry(master, &smmu_domain->devices, domain_head)
>>> +   arm_smmu_install_ste_for_dev(master);
>>> +   spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>>> +   ret = 0;
>>> +out:
>>> +   mutex_unlock(&smmu_domain->init_mutex);
>>> +   return ret;
>>> +}
>>> +
>>> +static void arm_smmu_detach_pasid_table(struct iommu_domain *domain)
>>> +{
>>> +   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>> +   struct arm_smmu_master *master;
>>> +   unsigned long flags;
>>> +
>>> +   mutex_lock(&smmu_domain->init_mutex);
>>> +
>>> +   if (smmu_domain->stage != ARM_SMMU_DOMAIN_NESTED)
>>> +   goto unlock;
>>> +
>>> +   smmu_domain->s1_cfg.set = false;
>>> +   smmu_domain->abort = false;
>>> +
>>> +   spin_lock_irqsave(&smmu_domain->devices_lock, flags);
>>> +   list_for_each_entry(master, &smmu_domain->devices, domain_head)
>>> +   arm_smmu_install_ste_for_dev(master);
>>> +   spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
>>> +
>>> +unlock:
>>> +   mutex_unlock(&smmu_domain->init_mutex);
>>> +}
>>> +
>>>  static bool arm_smmu_dev_has_feature(struct device *dev,
>>>  enum iommu_dev_features feat)
>>>  {
>>> @@ -2939,6 +3026,8 @@ static struct iommu_ops arm_smmu_ops = {
>>> .of_xlate   = arm_smmu_of_xlate,
>>> .get_resv_regions   = arm_smmu_get_resv_regions,
>>> .put_resv_regions   = generic_iommu_put_resv_regions,
>>> +   .attach_pasid_table = arm_smmu_attach_pasid_table,
>>> +   .detach_pasid_table = arm_smmu_detach_pasid_table,
>>> .dev_has_feat   = arm_smmu_dev_has_feature,
>>> .dev_feat_enabled   = arm_smmu_dev_feature_enabled,
>>> .dev_enable_feat= arm_smmu_dev_enable_feature,
>>>
>>
> 
> .
>

linux-next: build warnings after merge of the cifsd tree

Hi all,

After merging the cifsd tree, today's linux-next build (htmldocs)
produced these warnings:

Documentation/filesystems/cifs/cifsd.rst:13: WARNING: Inline 
substitution_reference start-string without end-string.
Documentation/filesystems/cifs/cifsd.rst:14: WARNING: Block quote ends without 
a blank line; unexpected unindent.
Documentation/filesystems/cifs/cifsd.rst:14: WARNING: Inline 
substitution_reference start-string without end-string.
Documentation/filesystems/cifs/cifsd.rst:18: WARNING: Block quote ends without 
a blank line; unexpected unindent.
Documentation/filesystems/cifs/cifsd.rst:23: WARNING: Inline 
substitution_reference start-string without end-string.
Documentation/filesystems/cifs/cifsd.rst:23: WARNING: Inline 
substitution_reference start-string without end-string.
Documentation/filesystems/cifs/cifsd.rst:24: WARNING: Inline 
substitution_reference start-string without end-string.
Documentation/filesystems/cifs/cifsd.rst:25: WARNING: Definition list ends 
without a blank line; unexpected unindent.
Documentation/filesystems/cifs/cifsd.rst:28: WARNING: Unexpected indentation.
Documentation/filesystems/cifs/cifsd.rst:31: WARNING: Block quote ends without 
a blank line; unexpected unindent.
Documentation/filesystems/cifs/cifsd.rst:38: WARNING: Unexpected indentation.
Documentation/filesystems/cifs/cifsd.rst:32: WARNING: Inline 
substitution_reference start-string without end-string.
Documentation/filesystems/cifs/cifsd.rst:32: WARNING: Inline 
substitution_reference start-string without end-string.
Documentation/filesystems/cifs/cifsd.rst:39: WARNING: Block quote ends without 
a blank line; unexpected unindent.
Documentation/filesystems/cifs/cifsd.rst:14: WARNING: Undefined substitution 
referenced: "--- ksmbd/3 - Client 3 |---".
Documentation/filesystems/cifs/cifsd.rst:0: WARNING: Undefined substitution 
referenced: "".
Documentation/filesystems/cifs/cifsd.rst:25: WARNING: Undefined substitution 
referenced: "--- ksmbd/0(forker kthread) ---|".
Documentation/filesystems/cifs/cifsd.rst:32: WARNING: Undefined substitution 
referenced: "__".

Introduced by commit

  30f44e929aa6 ("cifsd: update cifsd.rst document")

-- 
Cheers,
Stephen Rothwell



[PATCH] docs: powerpc: Fix a typo



s/struture/structure/

Signed-off-by: Bhaskar Chowdhury 
---
 Documentation/powerpc/firmware-assisted-dump.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/powerpc/firmware-assisted-dump.rst 
b/Documentation/powerpc/firmware-assisted-dump.rst
index 20ea8cdee0aa..6c0ae070ba67 100644
--- a/Documentation/powerpc/firmware-assisted-dump.rst
+++ b/Documentation/powerpc/firmware-assisted-dump.rst
@@ -171,7 +171,7 @@ that were present in CMA region::
(meta area)|
   |
   |
-  Metadata: This area holds a metadata struture whose
+  Metadata: This area holds a metadata structure whose
   address is registered with f/w and retrieved in the
   second kernel after crash, on platforms that support
   tags (OPAL). Having such structure with info needed
--
2.31.0

Re: [PATCH v4 00/11] KVM: x86/pmu: Guest Architectural LBR Enabling

2021-03-21 Thread Xu, Like


Hi, do we have any comments on this patch set?

On 2021/3/14 23:52, Like Xu wrote:

Hi geniuses,

Please help review the new version of Arch LBR enabling patch set.

The Architectural Last Branch Records (LBRs) feature is published
in the 319433-040 release of the Intel Architecture Instruction
Set Extensions and Future Features Programming Reference[0].
---
v3->v4 Changelog:
- Add one more host patch to reuse ARCH_LBR_CTL_MASK;
- Add reserve_lbr_buffers() instead of using GFP_ATOMIC;
- Fix a bug in arch_lbr_depth_is_valid();
- Add LBR_CTL_EN to unify DEBUGCTLMSR_LBR and ARCH_LBR_CTL_LBREN;
- Add vmx->host_lbrctlmsr to save/restore host values;
- Add KVM_SUPPORTED_XSS to refactor supported_xss;
- Clear Arch_LBR and its XSS bit if it's not supported;
- Add negative testing to the related kvm-unit-tests;
- Refine code and commit messages;

Previous:
https://lore.kernel.org/kvm/20210303135756.1546253-1-like...@linux.intel.com/

Like Xu (11):
   KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR
   KVM: vmx/pmu: Add MSR_ARCH_LBR_CTL emulation for Arch LBR
   KVM: vmx/pmu: Add Arch LBR emulation and its VMCS field
   KVM: x86: Expose Architectural LBR CPUID leaf
   KVM: x86: Refine the matching and clearing logic for supported_xss
   KVM: x86: Add XSAVE Support for Architectural LBRs

  arch/x86/events/core.c   |   8 ++-
  arch/x86/events/intel/bts.c  |   2 +-
  arch/x86/events/intel/core.c |   6 +-
  arch/x86/events/intel/lbr.c  |  28 +
  arch/x86/events/perf_event.h |   8 ++-
  arch/x86/include/asm/msr-index.h |   1 +
  arch/x86/include/asm/vmx.h   |   4 ++
  arch/x86/kvm/cpuid.c |  25 +++-
  arch/x86/kvm/vmx/capabilities.h  |  25 +---
  arch/x86/kvm/vmx/pmu_intel.c | 103 ---
  arch/x86/kvm/vmx/vmx.c   |  50 +--
  arch/x86/kvm/vmx/vmx.h   |   4 ++
  arch/x86/kvm/x86.c   |   6 +-
  13 files changed, 227 insertions(+), 43 deletions(-)

Re: [PATCH 5.10 267/290] powerpc: Fix missing declaration of [en/dis]able_kernel_vsx()

2021-03-21 Thread Christophe Leroy





On 15/03/2021 at 15:15, Geert Uytterhoeven wrote:

On Mon, Mar 15, 2021 at 3:04 PM  wrote:

From: Greg Kroah-Hartman 

From: Christophe Leroy 

commit bd73758803c2eedc037c2268b65a19542a832594 upstream.

Add stub instances of enable_kernel_vsx() and disable_kernel_vsx()
when CONFIG_VSX is not set, to avoid following build failure.


Please note that this is not sufficient, and will just turn the build error
into another, different build error.


Not exactly: the fix is sufficient in most cases. It is only with ancient versions of gcc (e.g. 4.9) or 
with CONFIG_CC_OPTIMIZE_FOR_SIZE that we now get a build bug. Building with gcc 10 now works.



Waiting for the subsequent fix to enter v5.12-rc4...
https://lore.kernel.org/lkml/2c123f94-ceae-80c0-90e2-21909795e...@csgroup.eu/


This has now landed in mainline as commit eed5fae00593ab9d261a0c1ffc1bdb786a87a55a, see 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/powerpc/include/asm/cpu_has_feature.h?h=v5.12-rc4&id=eed5fae00593ab9d261a0c1ffc1bdb786a87a55a


Christophe





   CC [M]  drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o
   In file included from 
./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services_types.h:29,
from 
./drivers/gpu/drm/amd/amdgpu/../display/dc/dm_services.h:37,
from 
drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:27:
   drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c: In function 
'dcn_bw_apply_registry_override':
   ./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:64:3: error: implicit 
declaration of function 'enable_kernel_vsx'; did you mean 'enable_kernel_fp'? 
[-Werror=implicit-function-declaration]
  64 |   enable_kernel_vsx(); \
 |   ^
   drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:640:2: note: in 
expansion of macro 'DC_FP_START'
 640 |  DC_FP_START();
 |  ^~~
   ./drivers/gpu/drm/amd/amdgpu/../display/dc/os_types.h:75:3: error: implicit 
declaration of function 'disable_kernel_vsx'; did you mean 'disable_kernel_fp'? 
[-Werror=implicit-function-declaration]
  75 |   disable_kernel_vsx(); \
 |   ^~
   drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.c:676:2: note: in 
expansion of macro 'DC_FP_END'
 676 |  DC_FP_END();
 |  ^
   cc1: some warnings being treated as errors
   make[5]: *** [drivers/gpu/drm/amd/amdgpu/../display/dc/calcs/dcn_calcs.o] 
Error 1

This works because the caller is checking if VSX is available using
cpu_has_feature():

   #define DC_FP_START() { \
 if (cpu_has_feature(CPU_FTR_VSX_COMP)) { \
 preempt_disable(); \
 enable_kernel_vsx(); \
 } else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP)) { \
 preempt_disable(); \
 enable_kernel_altivec(); \
 } else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE)) { \
 preempt_disable(); \
 enable_kernel_fp(); \
 } \

When CONFIG_VSX is not selected, cpu_has_feature(CPU_FTR_VSX_COMP)
constant folds to 'false' so the call to enable_kernel_vsx() is
discarded and the build succeeds.
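
The constant-folding argument above can be sketched in plain user-space C.
This is an illustration under stated assumptions, not the kernel code:
HAS_VSX, enable_vsx_stub(), enable_fp() and fp_start() are hypothetical
stand-ins, and the stub only counts calls where the kernel stub uses
BUILD_BUG() to fail the build if the call is not optimized away.

```c
#include <assert.h>

/* Stand-in for CONFIG_VSX being unset: the feature test is a constant. */
#define HAS_VSX 0

static_assert(HAS_VSX == 0, "this sketch models the CONFIG_VSX=n case");

static int vsx_enable_calls;
static int fp_enable_calls;

/*
 * Stand-in for the stub added by the patch. The kernel version calls
 * BUILD_BUG(); here we only count calls so the sketch stays runnable.
 */
static inline void enable_vsx_stub(void)
{
	vsx_enable_calls++;
}

static inline void enable_fp(void)
{
	fp_enable_calls++;
}

/* Loosely mirrors the shape of DC_FP_START(). */
static void fp_start(void)
{
	if (HAS_VSX)
		enable_vsx_stub();	/* dead code: the condition is constant false */
	else
		enable_fp();
}
```

Because the condition is a compile-time constant, an optimizing compiler can
drop the call to the stub entirely. When the call is not eliminated, the
real BUILD_BUG() stub turns into a build error, which is the follow-up
problem discussed earlier in this thread.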

Fixes: 16a9dea110a6 ("amdgpu: Enable initial DCN support on POWER")
Cc: sta...@vger.kernel.org # v5.6+
Reported-by: Geert Uytterhoeven 
Reported-by: kernel test robot 
Signed-off-by: Christophe Leroy 
[mpe: Incorporate some discussion comments into the change log]
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/8d7d285a027e9d21f5ff7f850fa71a2655b0c4af.1615279170.git.christophe.le...@csgroup.eu
Signed-off-by: Greg Kroah-Hartman 
---
  arch/powerpc/include/asm/switch_to.h |   10 ++
  1 file changed, 10 insertions(+)

--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -71,6 +71,16 @@ static inline void disable_kernel_vsx(vo
  {
 msr_check_and_clear(MSR_FP|MSR_VEC|MSR_VSX);
  }
+#else
+static inline void enable_kernel_vsx(void)
+{
+   BUILD_BUG();
+}
+
+static inline void disable_kernel_vsx(void)
+{
+   BUILD_BUG();
+}
  #endif

  #ifdef CONFIG_SPE


Gr{oetje,eeting}s,

 Geert

[PATCH v4 RESEND 5/5] perf/x86: Move ARCH_LBR_CTL_MASK definition to include/asm/msr-index.h

The ARCH_LBR_CTL_MASK will be reused for Arch LBR emulation in the KVM.

Signed-off-by: Like Xu 
---
 arch/x86/events/intel/lbr.c  | 2 --
 arch/x86/include/asm/msr-index.h | 1 +
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 237876733e12..f60339ff0c13 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -168,8 +168,6 @@ enum {
 ARCH_LBR_RETURN|\
 ARCH_LBR_OTHER_BRANCH)
 
-#define ARCH_LBR_CTL_MASK  0x7f000e
-
 static void intel_pmu_lbr_filter(struct cpu_hw_events *cpuc);
 
 static __always_inline bool is_lbr_call_stack_bit_set(u64 config)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 546d6ecf0a35..8f3375961efc 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -169,6 +169,7 @@
 #define LBR_INFO_BR_TYPE   (0xfull << LBR_INFO_BR_TYPE_OFFSET)
 
 #define MSR_ARCH_LBR_CTL   0x14ce
+#define ARCH_LBR_CTL_MASK  0x7f000e
 #define ARCH_LBR_CTL_LBREN BIT(0)
 #define ARCH_LBR_CTL_CPL_OFFSET1
 #define ARCH_LBR_CTL_CPL   (0x3ull << ARCH_LBR_CTL_CPL_OFFSET)
-- 
2.29.2
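
For readers wondering how a shared ARCH_LBR_CTL_MASK might be used by an
emulator, here is a minimal user-space sketch. The two constants are taken
from the diff above; the helper arch_lbr_ctl_is_valid() is hypothetical, and
it is only an assumption that KVM validates guest writes in this way.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ARCH_LBR_CTL_LBREN	(UINT64_C(1) << 0)
#define ARCH_LBR_CTL_MASK	UINT64_C(0x7f000e)	/* bits 1-3 and 16-22 */

static_assert((ARCH_LBR_CTL_MASK & ARCH_LBR_CTL_LBREN) == 0,
	      "the mask does not include the enable bit");

/* Hypothetical helper: true if no bit outside LBREN | MASK is set. */
static bool arch_lbr_ctl_is_valid(uint64_t data)
{
	return (data & ~(ARCH_LBR_CTL_LBREN | ARCH_LBR_CTL_MASK)) == 0;
}
```

A value such as 0x7f000f (all mask bits plus LBREN) passes this check, while
a value with bit 4 or bit 23 set does not.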

[PATCH v4 RESEND 0/5] x86: The perf/x86 changes to support guest Arch LBR

Hi Peter,

Please help review these minor perf/x86 changes in this patch set,
and we need some of them to support Guest Architectural LBR in KVM.

If you are interested in the KVM emulation, please check
https://lore.kernel.org/kvm/20210314155225.206661-1-like...@linux.intel.com/

Please check more details in each commit and feel free to comment.

Like Xu (5):
  perf/x86/intel: Fix the comment about guest LBR support on KVM
  perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers
  perf/x86/lbr: Move cpuc->lbr_xsave allocation out of sleeping region
  perf/x86/lbr: Skip checking for the existence of LBR_TOS for Arch LBR
  perf/x86: Move ARCH_LBR_CTL_MASK definition to include/asm/msr-index.h

 arch/x86/events/core.c   |  8 +---
 arch/x86/events/intel/bts.c  |  2 +-
 arch/x86/events/intel/core.c |  6 +++---
 arch/x86/events/intel/lbr.c  | 28 +---
 arch/x86/events/perf_event.h |  8 +++-
 arch/x86/include/asm/msr-index.h |  1 +
 6 files changed, 34 insertions(+), 19 deletions(-)

-- 
2.29.2

[PATCH v4 RESEND 1/5] perf/x86/intel: Fix the comment about guest LBR support on KVM

Starting from v5.12, KVM reports guest LBR and extra_regs support
when the host has relevant support. Just delete this part of the
comment and fix a typo incidentally.

Signed-off-by: Like Xu 
Reviewed-by: Kan Liang 
Reviewed-by: Andi Kleen 
---
 arch/x86/events/intel/core.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 37ce38403cb8..382dd3994463 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5737,8 +5737,7 @@ __init int intel_pmu_init(void)
 
/*
 * Access LBR MSR may cause #GP under certain circumstances.
-* E.g. KVM doesn't support LBR MSR
-* Check all LBT MSR here.
+* Check all LBR MSR here.
 * Disable LBR access if any LBR MSRs can not be accessed.
 */
if (x86_pmu.lbr_nr && !check_msr(x86_pmu.lbr_tos, 0x3UL))
-- 
2.29.2

[PATCH v4 RESEND 4/5] perf/x86/lbr: Skip checking for the existence of LBR_TOS for Arch LBR

Architectural LBR does not have MSR_LBR_TOS (0x01c9). KVM will
generate #GP for this MSR access, thereby preventing the initialization
of the guest LBR.

Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR")
Signed-off-by: Like Xu 
Reviewed-by: Kan Liang 
Reviewed-by: Andi Kleen 
---
 arch/x86/events/intel/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 382dd3994463..7f6d748421f2 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5740,7 +5740,8 @@ __init int intel_pmu_init(void)
 * Check all LBR MSR here.
 * Disable LBR access if any LBR MSRs can not be accessed.
 */
-   if (x86_pmu.lbr_nr && !check_msr(x86_pmu.lbr_tos, 0x3UL))
+   if (x86_pmu.lbr_nr && !boot_cpu_has(X86_FEATURE_ARCH_LBR) &&
+   !check_msr(x86_pmu.lbr_tos, 0x3UL))
x86_pmu.lbr_nr = 0;
for (i = 0; i < x86_pmu.lbr_nr; i++) {
if (!(check_msr(x86_pmu.lbr_from + i, 0xUL) &&
-- 
2.29.2
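
The effect of the changed condition can be modeled outside the kernel. The
sketch below is an assumption-laden illustration: lbr_nr_after_tos_check()
and its parameters are hypothetical names, and it covers only the TOS step,
not the per-entry MSR checks in the loop that follows it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Returns the LBR depth that survives the TOS check.
 * has_arch_lbr models boot_cpu_has(X86_FEATURE_ARCH_LBR);
 * tos_msr_ok models check_msr(x86_pmu.lbr_tos, 0x3UL).
 */
static unsigned int lbr_nr_after_tos_check(unsigned int lbr_nr,
					   bool has_arch_lbr,
					   bool tos_msr_ok)
{
	if (lbr_nr && !has_arch_lbr && !tos_msr_ok)
		return 0;
	return lbr_nr;
}

static void check_examples(void)
{
	/* Arch LBR: the TOS MSR is absent, but LBR stays enabled. */
	assert(lbr_nr_after_tos_check(32, true, false) == 32);
	/* Legacy LBR with an inaccessible TOS MSR: LBR is disabled. */
	assert(lbr_nr_after_tos_check(32, false, false) == 0);
	/* Legacy LBR with a working TOS MSR: LBR stays enabled. */
	assert(lbr_nr_after_tos_check(16, false, true) == 16);
}
```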

[PATCH v4 RESEND 2/5] perf/x86/lbr: Simplify the exposure check for the LBR_INFO registers

If the platform supports LBR_INFO register, the x86_pmu.lbr_info will
be assigned in intel_pmu_?_lbr_init_?() and it's safe to expose LBR_INFO
in the x86_perf_get_lbr() directly, instead of relying on lbr_format check.

Also Architectural LBR has IA32_LBR_x_INFO instead of LBR_FORMAT_INFO_x
to hold metadata for the operation, including mispredict, TSX, and
elapsed cycle time information.

Signed-off-by: Like Xu 
Reviewed-by: Kan Liang 
Reviewed-by: Andi Kleen 
---
 arch/x86/events/intel/lbr.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 21890dacfcfe..355ea70f1879 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1832,12 +1832,10 @@ void __init intel_pmu_arch_lbr_init(void)
  */
 int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
 {
-   int lbr_fmt = x86_pmu.intel_cap.lbr_format;
-
lbr->nr = x86_pmu.lbr_nr;
lbr->from = x86_pmu.lbr_from;
lbr->to = x86_pmu.lbr_to;
-   lbr->info = (lbr_fmt == LBR_FORMAT_INFO) ? x86_pmu.lbr_info : 0;
+   lbr->info = x86_pmu.lbr_info;
 
return 0;
 }
-- 
2.29.2

[PATCH v4 RESEND 3/5] perf/x86/lbr: Move cpuc->lbr_xsave allocation out of sleeping region

If the kernel is compiled with the CONFIG_LOCKDEP option, the conditional
might_sleep_if() deep in kmem_cache_alloc() will generate the following
trace, and potentially cause a deadlock when another LBR event is added:

[  243.115549] BUG: sleeping function called from invalid context at 
include/linux/sched/mm.h:196
[  243.117576] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 839, 
name: perf
[  243.119326] INFO: lockdep is turned off.
[  243.120249] irq event stamp: 0
[  243.120967] hardirqs last  enabled at (0): [<>] 0x0
[  243.122415] hardirqs last disabled at (0): [] 
copy_process+0xa45/0x1dc0
[  243.124302] softirqs last  enabled at (0): [] 
copy_process+0xa45/0x1dc0
[  243.126255] softirqs last disabled at (0): [<>] 0x0
[  243.128119] CPU: 0 PID: 839 Comm: perf Not tainted 5.11.0-rc4-guest+ #8
[  243.129654] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
0.0.0 02/06/2015
[  243.131520] Call Trace:
[  243.132112]  dump_stack+0x8d/0xb5
[  243.132896]  ___might_sleep.cold.106+0xb3/0xc3
[  243.133984]  slab_pre_alloc_hook.constprop.85+0x96/0xd0
[  243.135208]  ? intel_pmu_lbr_add+0x152/0x170
[  243.136207]  kmem_cache_alloc+0x36/0x250
[  243.137126]  intel_pmu_lbr_add+0x152/0x170
[  243.138088]  x86_pmu_add+0x83/0xd0
[  243.138889]  ? lock_acquire+0x158/0x350
[  243.139791]  ? lock_acquire+0x158/0x350
[  243.140694]  ? lock_acquire+0x158/0x350
[  243.141625]  ? lock_acquired+0x1e3/0x360
[  243.142544]  ? lock_release+0x1bf/0x340
[  243.143726]  ? trace_hardirqs_on+0x1a/0xd0
[  243.144823]  ? lock_acquired+0x1e3/0x360
[  243.145742]  ? lock_release+0x1bf/0x340
[  243.147107]  ? __slab_free+0x49/0x540
[  243.147966]  ? trace_hardirqs_on+0x1a/0xd0
[  243.148924]  event_sched_in.isra.129+0xf8/0x2a0
[  243.149989]  merge_sched_in+0x261/0x3e0
[  243.150889]  ? trace_hardirqs_on+0x1a/0xd0
[  243.151869]  visit_groups_merge.constprop.135+0x130/0x4a0
[  243.153122]  ? sched_clock_cpu+0xc/0xb0
[  243.154023]  ctx_sched_in+0x101/0x210
[  243.154884]  ctx_resched+0x6f/0xc0
[  243.155686]  perf_event_exec+0x21e/0x2e0
[  243.156641]  begin_new_exec+0x5e5/0xbd0
[  243.157540]  load_elf_binary+0x6af/0x1770
[  243.158478]  ? __kernel_read+0x19d/0x2b0
[  243.159977]  ? lock_acquire+0x158/0x350
[  243.160876]  ? __kernel_read+0x19d/0x2b0
[  243.161796]  bprm_execve+0x3c8/0x840
[  243.162638]  do_execveat_common.isra.38+0x1a5/0x1c0
[  243.163776]  __x64_sys_execve+0x32/0x40
[  243.164676]  do_syscall_64+0x33/0x40
[  243.165514]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  243.166746] RIP: 0033:0x7f6180a26feb
[  243.167590] Code: Unable to access opcode bytes at RIP 0x7f6180a26fc1.
[  243.169097] RSP: 002b:7ffc6558ce18 EFLAGS: 0202 ORIG_RAX: 
003b
[  243.170844] RAX: ffda RBX: 7ffc65592d30 RCX: 7f6180a26feb
[  243.172514] RDX: 55657f408dc0 RSI: 7ffc65592410 RDI: 7ffc65592d30
[  243.174162] RBP: 7ffc6558ce80 R08: 7ffc6558cde0 R09: 
[  243.176042] R10: 0008 R11: 0202 R12: 7ffc65592410
[  243.177696] R13: 55657f408dc0 R14: 0001 R15: 7ffc65592410

One solution is to use GFP_ATOMIC, but that would make the code less
reliable under memory pressure. Let's move the memory allocation out of
the sleeping region and put it into x86_reserve_hardware().

The disadvantage of this fix is that the cpuc->lbr_xsave memory
will be allocated for each CPU, like the legacy ds_buffer.

Fixes: c085fb8774 ("perf/x86/intel/lbr: Support XSAVES for arch LBR read")
Suggested-by: Kan Liang 
Signed-off-by: Like Xu 
---
 arch/x86/events/core.c   |  8 +---
 arch/x86/events/intel/bts.c  |  2 +-
 arch/x86/events/intel/lbr.c  | 22 --
 arch/x86/events/perf_event.h |  8 +++-
 4 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 18df17129695..a4ce669cc78d 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -373,7 +373,7 @@ set_ext_hw_attr(struct hw_perf_event *hwc, struct perf_event *event)
return x86_pmu_extra_regs(val, event);
 }
 
-int x86_reserve_hardware(void)
+int x86_reserve_hardware(struct perf_event *event)
 {
int err = 0;
 
@@ -382,8 +382,10 @@ int x86_reserve_hardware(void)
if (atomic_read(&pmc_refcount) == 0) {
if (!reserve_pmc_hardware())
err = -EBUSY;
-   else
+   else {
reserve_ds_buffers();
+   reserve_lbr_buffers(event);
+   }
}
if (!err)
atomic_inc(&pmc_refcount);
@@ -634,7 +636,7 @@ static int __x86_pmu_event_init(struct perf_event *event)
if (!x86_pmu_initialized())
return -ENODEV;
 
-   err = x86_reserve_hardware();
+   err = x86_reserve_hardware(event);

linux-next: manual merge of the akpm-current tree with the tip tree

Hi all,

Today's linux-next merge of the akpm-current tree got a conflict in:

  arch/x86/mm/init_64.c

between commit:

  d9f6e12fb0b7 ("x86: Fix various typos in comments")

from the tip tree and commit:

  68f7bf6e7e98 ("x86/vmemmap: drop handling of 4K unaligned vmemmap range")

from the akpm-current tree.

I fixed it up (the latter removed the comments fixed up by the former)
and can carry the fix as necessary. This is now fixed as far as linux-next
is concerned, but any non trivial conflicts should be mentioned to your
upstream maintainer when your tree is submitted for merging.  You may
also want to consider cooperating with the maintainer of the conflicting
tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell



Re: [PATCH v1 14/14] mm: multigenerational lru: documentation

2021-03-21 Thread Yu Zhao

On Fri, Mar 19, 2021 at 05:31:20PM +0800, Alex Shi wrote:
> 
> 
> On 2021/3/13 at 3:57 PM, Yu Zhao wrote:
> > +Recipes
> > +---
> > +:Android on ARMv8.1+: ``X=4``, ``N=0``
> > +
> > +:Android on pre-ARMv8.1 CPUs: Not recommended due to the lack of
> > + ``ARM64_HW_AFDBM``
> > +
> > +:Laptops running Chrome on x86_64: ``X=7``, ``N=2``
> > +
> > +:Working set estimation: Write ``+ memcg_id node_id gen [swappiness]``
> > + to ``/sys/kernel/debug/lru_gen`` to account referenced pages to
> > + generation ``max_gen`` and create the next generation ``max_gen+1``.
> > + ``gen`` must be equal to ``max_gen`` in order to avoid races. A swap
> > + file and a non-zero swappiness value are required to scan anon pages.
> > + If swapping is not desired, set ``vm.swappiness`` to ``0`` and
> > + overwrite it with a non-zero ``swappiness``.
> > +
> > +:Proactive reclaim: Write ``- memcg_id node_id gen [swappiness]
> > + [nr_to_reclaim]`` to ``/sys/kernel/debug/lru_gen`` to evict
> > + generations less than or equal to ``gen``. ``gen`` must be less than
> > + ``max_gen-1`` as ``max_gen`` and ``max_gen-1`` are active generations
> > + and therefore protected from the eviction. ``nr_to_reclaim`` can be
> > + used to limit the number of pages to be evicted. Multiple command
> > + lines are supported, so does concatenation with delimiters ``,`` and
> > + ``;``.
> > +
> 
> These are difficult options for users, especially for 'races' involving.
> Is it possible to simplify them for end users?

They look simple for a few lruvecs, but do become human-unfriendly on
servers that have thousands of lruvecs.

It's certainly possible to simplify them, but we'd have to sacrifice
some flexibility. Any particular idea in mind?
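
For servers with many lruvecs, the commands quoted above could be
generated by a tool rather than typed by hand. A minimal sketch in C,
assuming only the ``+ memcg_id node_id gen [swappiness]`` format from the
quoted documentation. The function name lru_gen_format_aging_cmd and the
convention that a negative swappiness omits the optional field are
hypothetical and not part of the patch:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the working-set estimation command
 * "+ memcg_id node_id gen [swappiness]" into buf.
 * A negative swappiness omits the optional field (an assumption of
 * this sketch). Returns 0 on success, -1 if buf is too small.
 */
int lru_gen_format_aging_cmd(char *buf, size_t len, unsigned int memcg_id,
                             unsigned int node_id, unsigned long gen,
                             int swappiness)
{
    int n;

    if (swappiness < 0)
        n = snprintf(buf, len, "+ %u %u %lu", memcg_id, node_id, gen);
    else
        n = snprintf(buf, len, "+ %u %u %lu %d", memcg_id, node_id, gen,
                     swappiness);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string would then be written to /sys/kernel/debug/lru_gen,
once per lruvec of interest.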

[PATCH] usb: cdnsp: Fixes issue with Configure Endpoint command

2021-03-21 Thread Pawel Laszczak

From: Pawel Laszczak 

This patch adds the EP_UNCONFIGURED flag to detect whether an endpoint
was unconfigured. The flag is set in cdnsp_reset_device after the Reset
Device command, which, among other things, disables all non-control
endpoints. The flag is then used in cdnsp_gadget_ep_disable to protect
the controller against invoking the Configure Endpoint command on a
disabled endpoint. Without this protection, the Configure Endpoint
command in some cases completed with a Context State Error completion
code.

Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Signed-off-by: Pawel Laszczak 
---
 drivers/usb/cdns3/cdnsp-gadget.c | 18 +-
 drivers/usb/cdns3/cdnsp-gadget.h | 11 ++-
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/usb/cdns3/cdnsp-gadget.c b/drivers/usb/cdns3/cdnsp-gadget.c
index d7d4bdd57f46..de17cc4ad91a 100644
--- a/drivers/usb/cdns3/cdnsp-gadget.c
+++ b/drivers/usb/cdns3/cdnsp-gadget.c
@@ -727,7 +727,7 @@ int cdnsp_reset_device(struct cdnsp_device *pdev)
 * are in Disabled state.
 */
for (i = 1; i < CDNSP_ENDPOINTS_NUM; ++i)
-   pdev->eps[i].ep_state |= EP_STOPPED;
+   pdev->eps[i].ep_state |= EP_STOPPED | EP_UNCONFIGURED;
 
trace_cdnsp_handle_cmd_reset_dev(slot_ctx);
 
@@ -942,6 +942,7 @@ static int cdnsp_gadget_ep_enable(struct usb_ep *ep,
 
pep = to_cdnsp_ep(ep);
pdev = pep->pdev;
+   pep->ep_state &= ~EP_UNCONFIGURED;
 
if (dev_WARN_ONCE(pdev->dev, pep->ep_state & EP_ENABLED,
  "%s is already enabled\n", pep->name))
@@ -1023,9 +1024,13 @@ static int cdnsp_gadget_ep_disable(struct usb_ep *ep)
goto finish;
}
 
-   cdnsp_cmd_stop_ep(pdev, pep);
pep->ep_state |= EP_DIS_IN_RROGRESS;
-   cdnsp_cmd_flush_ep(pdev, pep);
+
+   /* Endpoint was unconfigured by Reset Device command. */
+   if (!(pep->ep_state & EP_UNCONFIGURED)) {
+   cdnsp_cmd_stop_ep(pdev, pep);
+   cdnsp_cmd_flush_ep(pdev, pep);
+   }
 
/* Remove all queued USB requests. */
while (!list_empty(&pep->pending_list)) {
@@ -1036,6 +1041,7 @@ static int cdnsp_gadget_ep_disable(struct usb_ep *ep)
cdnsp_invalidate_ep_events(pdev, pep);
 
pep->ep_state &= ~EP_DIS_IN_RROGRESS;
+
drop_flag = cdnsp_get_endpoint_flag(pep->endpoint.desc);
ctrl_ctx = cdnsp_get_input_control_ctx(&pdev->in_ctx);
ctrl_ctx->drop_flags = cpu_to_le32(drop_flag);
@@ -1043,10 +1049,12 @@ static int cdnsp_gadget_ep_disable(struct usb_ep *ep)
 
cdnsp_endpoint_zero(pdev, pep);
 
-   ret = cdnsp_update_eps_configuration(pdev, pep);
+   if (!(pep->ep_state & EP_UNCONFIGURED))
+   ret = cdnsp_update_eps_configuration(pdev, pep);
+
cdnsp_free_endpoint_rings(pdev, pep);
 
-   pep->ep_state &= ~EP_ENABLED;
+   pep->ep_state &= ~(EP_ENABLED | EP_UNCONFIGURED);
pep->ep_state |= EP_STOPPED;
 
 finish:
diff --git a/drivers/usb/cdns3/cdnsp-gadget.h b/drivers/usb/cdns3/cdnsp-gadget.h
index 6bbb26548c04..e628bd539e23 100644
--- a/drivers/usb/cdns3/cdnsp-gadget.h
+++ b/drivers/usb/cdns3/cdnsp-gadget.h
@@ -830,11 +830,12 @@ struct cdnsp_ep {
unsigned int ep_state;
 #define EP_ENABLED BIT(0)
 #define EP_DIS_IN_RROGRESS BIT(1)
-#define EP_HALTED  BIT(2)
-#define EP_STOPPED BIT(3)
-#define EP_WEDGE   BIT(4)
-#define EP0_HALTED_STATUS  BIT(5)
-#define EP_HAS_STREAMS BIT(6)
+#define EP_UNCONFIGUREDBIT(2)
+#define EP_HALTED  BIT(3)
+#define EP_STOPPED BIT(4)
+#define EP_WEDGE   BIT(5)
+#define EP0_HALTED_STATUS  BIT(6)
+#define EP_HAS_STREAMS BIT(7)
 
bool skip;
 };
-- 
2.25.1
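
The flag lifecycle introduced by this patch can be modelled outside the
driver. The sketch below is only an illustration: the bit positions
mirror the patched header, but struct ep_model, the model_* helpers and
the hw_cmds counter (a stand-in for the stop, flush and configure
commands sent to the controller) are hypothetical, and the enable path
is simplified to set EP_ENABLED directly.

```c
#define BIT(n) (1U << (n))

/* Bit positions as in the patched cdnsp-gadget.h. */
#define EP_ENABLED      BIT(0)
#define EP_UNCONFIGURED BIT(2)
#define EP_STOPPED      BIT(4)

/* Hypothetical model of one endpoint; not the driver's struct cdnsp_ep. */
struct ep_model {
    unsigned int ep_state;
    int hw_cmds; /* counts commands that would reach the controller */
};

/* Reset Device disables the endpoint in hardware, so mark it unconfigured. */
void model_reset_device(struct ep_model *ep)
{
    ep->ep_state |= EP_STOPPED | EP_UNCONFIGURED;
}

/* Enabling configures the endpoint again (simplified). */
void model_ep_enable(struct ep_model *ep)
{
    ep->ep_state &= ~EP_UNCONFIGURED;
    ep->ep_state |= EP_ENABLED;
}

/* Disabling only talks to the controller if the endpoint is configured. */
void model_ep_disable(struct ep_model *ep)
{
    if (!(ep->ep_state & EP_UNCONFIGURED))
        ep->hw_cmds++;

    ep->ep_state &= ~(EP_ENABLED | EP_UNCONFIGURED);
    ep->ep_state |= EP_STOPPED;
}
```

In this model, a disable that follows a reset issues no command, while a
disable that follows a normal enable issues one.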

Re: [PATCH v4 3/5] RISC-V: Initial DTS for Microchip ICICLE board

2021-03-21 Thread Bin Meng

On Thu, Mar 4, 2021 at 8:48 PM Atish Patra  wrote:
>
> Add initial DTS for Microchip ICICLE board having only
> essential devices (clocks, sdhci, ethernet, serial, etc).
> The device tree is based on the U-Boot patch.
>
> https://patchwork.ozlabs.org/project/uboot/patch/20201110103414.10142-6-padmarao.beg...@microchip.com/
>
> Signed-off-by: Atish Patra 
> ---
>  arch/riscv/boot/dts/Makefile  |   1 +
>  arch/riscv/boot/dts/microchip/Makefile|   2 +
>  .../microchip/microchip-mpfs-icicle-kit.dts   |  72 
>  .../boot/dts/microchip/microchip-mpfs.dtsi| 329 ++
>  4 files changed, 404 insertions(+)
>  create mode 100644 arch/riscv/boot/dts/microchip/Makefile
>  create mode 100644 arch/riscv/boot/dts/microchip/microchip-mpfs-icicle-kit.dts
>  create mode 100644 arch/riscv/boot/dts/microchip/microchip-mpfs.dtsi
>

Reviewed-by: Bin Meng

linux-next: build warning after merge of the net-next tree

Hi all,

After merging the net-next tree, today's linux-next build (htmldocs)
produced this warning:

include/linux/netdevice.h:2191: warning: Function parameter or member 'dev_refcnt' not described in 'net_device'

Introduced by commit

  919067cc845f ("net: add CONFIG_PCPU_DEV_REFCNT")

-- 
Cheers,
Stephen Rothwell



Re: [PATCH v7 1/3] dmaengine: ptdma: Initial driver for the AMD PTDMA

2021-03-21 Thread Vinod Koul

On 18-03-21, 16:16, Sanjay R Mehta wrote:
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> >> +#include 
> > 
> > why do you need sched.h here?
> > 
> >> +
> >> +#include "ptdma.h"
> >> +
> >> +/* Ever-increasing value to produce unique unit numbers */
> >> +static atomic_t pt_ordinal;
> > 
> > What is the need of that?
> > 
> 

[please wrap your emails within 80 chars]

> The "pt_ordinal" is incremented for each DMA instances and its number
> is used only to assign device name for each instances.  This same
> device name is passed as a string parameter in many places in code
> like while using request_irq(), dma_pool_create() and in debugfs.

Why do you need that, why not use device name which is unique..?

> Also, I have implemented all of the comments for this patch except
> this. if this is fine, will send the next version for review.

I am not sure I remember all the comments I gave; it has been _quite_ a
while since the feedback was provided. For an effective review, it would
be great to get back within a reasonable time and discuss...

Thanks
-- 
~Vinod

[PATCH V2] KVM: x86: A typo fix



s/resued/reused/


Signed-off-by: Bhaskar Chowdhury 
---
 Changes from V1:
 Incorporated the correct replacement word that Ingo found.

 arch/x86/include/asm/kvm_host.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3768819693e5..e37c2ebc02e5 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1488,7 +1488,7 @@ extern u64 kvm_mce_cap_supported;
 /*
  * EMULTYPE_NO_DECODE - Set when re-emulating an instruction (after completing
  * userspace I/O) to indicate that the emulation context
- * should be resued as is, i.e. skip initialization of
+ * should be reused as is, i.e. skip initialization of
  * emulation context, instruction fetch and decode.
  *
  * EMULTYPE_TRAP_UD - Set when emulating an intercepted #UD from hardware.
--
2.31.0

Re: BUG: Out of bounds read in hci_le_ext_adv_report_evt()

2021-03-21 Thread Luiz Augusto von Dentz

Hi Emil,

On Sun, Mar 21, 2021 at 4:23 PM Emil Lenngren  wrote:
>
> Hi,
>
> Den mån 22 mars 2021 kl 00:01 skrev Luiz Augusto von Dentz
> :
> > Or we do something like
> > https://lore.kernel.org/linux-bluetooth/20201024002251.1389267-1-luiz.de...@gmail.com/,
> > that said the reason we didn't applied my patches was that the
> > controller would be the one generating invalid data, but it seems you
> > are reproducing with vhci controller which is only used for emulating
> > a controller and requires root privileges so it is unlikely these
> > conditions would happens with hardware itself, in the other hand as
> > there seems to be more and more reports using vhci to emulate broken
> > events it perhaps more productive to introduce proper checks for all
> > events so we don't have to deal with more reports like this in the
> > future.
>
> Keep in mind that when using the H4 uart protocol without any error
> correction (as H5 has), it is possible that random bit errors occur on
> the wire. I wouldn't like my kernel to crash due to this. Bit errors
> happen all the time on RPi 4 for example at the default baud rate if
> you just do some heavy stress testing, or use an application that
> transfers a lot of data over Bluetooth.

While we can catch some errors like that, and possibly avoid crashes,
this should be limited to boundary checks and not actual error
correction. That, I'm afraid, is out of our hands, since we can still
receive an event that matches the original packet size but means
something else, which may break the synchronization of state between
the controller and the host. Perhaps we also need to report this type
of error, since even discarding such events can cause the states to go
out of sync, in which case the controller will need to be reset in
order to recover.

-- 
Luiz Augusto von Dentz
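
To make "boundary checks only" concrete, here is a minimal sketch in C.
The event layout is a deliberately simplified assumption (one count byte,
then per report one length byte followed by that many data bytes); it is
not the real LE Extended Advertising Report format, and parse_reports is
a hypothetical name rather than a function in the Bluetooth stack.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a buffer laid out as: num_reports, then for each report a
 * one-byte length followed by that many data bytes (assumed layout).
 * Returns the number of reports if all of them fit inside len,
 * or -1 as soon as a field would be read past the end of the buffer.
 */
int parse_reports(const uint8_t *data, size_t len)
{
    size_t off = 0;
    uint8_t num, i;

    if (len < 1)
        return -1;
    num = data[off++];

    for (i = 0; i < num; i++) {
        uint8_t rlen;

        /* Is there room for the length byte? */
        if (len - off < 1)
            return -1;
        rlen = data[off++];

        /* Is there room for the advertised payload? */
        if (len - off < rlen)
            return -1;
        off += rlen;
    }

    return num;
}
```

A check like this rejects truncated or inconsistent lengths, but it
cannot tell whether a well-formed event carries the data the controller
meant to send.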

[PATCH] Bluetooth: verify AMP hci_chan before amp_destroy

2021-03-21 Thread Archie Pusaka

From: Archie Pusaka 

An hci_chan can be created in two places: hci_loglink_complete_evt() if
it is an AMP hci_chan, or l2cap_conn_add() otherwise. In theory, only
an AMP hci_chan should be removed by a call to
hci_disconn_loglink_complete_evt(). However, a misbehaving controller
might trigger that function and destroy an hci_chan that was not
created by hci_loglink_complete_evt().

This patch adds a check that the hci_chan being destroyed was
created by hci_loglink_complete_evt().

Example crash call trace:
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0xe3/0x144 lib/dump_stack.c:118
 print_address_description+0x67/0x22a mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report mm/kasan/report.c:412 [inline]
 kasan_report+0x251/0x28f mm/kasan/report.c:396
 hci_send_acl+0x3b/0x56e net/bluetooth/hci_core.c:4072
 l2cap_send_cmd+0x5af/0x5c2 net/bluetooth/l2cap_core.c:877
 l2cap_send_move_chan_cfm_icid+0x8e/0xb1 net/bluetooth/l2cap_core.c:4661
 l2cap_move_fail net/bluetooth/l2cap_core.c:5146 [inline]
 l2cap_move_channel_rsp net/bluetooth/l2cap_core.c:5185 [inline]
 l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:5464 [inline]
 l2cap_sig_channel net/bluetooth/l2cap_core.c:5799 [inline]
 l2cap_recv_frame+0x1d12/0x51aa net/bluetooth/l2cap_core.c:7023
 l2cap_recv_acldata+0x2ea/0x693 net/bluetooth/l2cap_core.c:7596
 hci_acldata_packet net/bluetooth/hci_core.c:4606 [inline]
 hci_rx_work+0x2bd/0x45e net/bluetooth/hci_core.c:4796
 process_one_work+0x6f8/0xb50 kernel/workqueue.c:2175
 worker_thread+0x4fc/0x670 kernel/workqueue.c:2321
 kthread+0x2f0/0x304 kernel/kthread.c:253
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

Allocated by task 38:
 set_track mm/kasan/kasan.c:460 [inline]
 kasan_kmalloc+0x8d/0x9a mm/kasan/kasan.c:553
 kmem_cache_alloc_trace+0x102/0x129 mm/slub.c:2787
 kmalloc include/linux/slab.h:515 [inline]
 kzalloc include/linux/slab.h:709 [inline]
 hci_chan_create+0x86/0x26d net/bluetooth/hci_conn.c:1674
 l2cap_conn_add.part.0+0x1c/0x814 net/bluetooth/l2cap_core.c:7062
 l2cap_conn_add net/bluetooth/l2cap_core.c:7059 [inline]
 l2cap_connect_cfm+0x134/0x852 net/bluetooth/l2cap_core.c:7381
 hci_connect_cfm+0x9d/0x122 include/net/bluetooth/hci_core.h:1404
 hci_remote_ext_features_evt net/bluetooth/hci_event.c:4161 [inline]
 hci_event_packet+0x463f/0x72fa net/bluetooth/hci_event.c:5981
 hci_rx_work+0x197/0x45e net/bluetooth/hci_core.c:4791
 process_one_work+0x6f8/0xb50 kernel/workqueue.c:2175
 worker_thread+0x4fc/0x670 kernel/workqueue.c:2321
 kthread+0x2f0/0x304 kernel/kthread.c:253
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

Freed by task 1732:
 set_track mm/kasan/kasan.c:460 [inline]
 __kasan_slab_free mm/kasan/kasan.c:521 [inline]
 __kasan_slab_free+0x106/0x128 mm/kasan/kasan.c:493
 slab_free_hook mm/slub.c:1409 [inline]
 slab_free_freelist_hook+0xaa/0xf6 mm/slub.c:1436
 slab_free mm/slub.c:3009 [inline]
 kfree+0x182/0x21e mm/slub.c:3972
 hci_disconn_loglink_complete_evt net/bluetooth/hci_event.c:4891 [inline]
 hci_event_packet+0x6a1c/0x72fa net/bluetooth/hci_event.c:6050
 hci_rx_work+0x197/0x45e net/bluetooth/hci_core.c:4791
 process_one_work+0x6f8/0xb50 kernel/workqueue.c:2175
 worker_thread+0x4fc/0x670 kernel/workqueue.c:2321
 kthread+0x2f0/0x304 kernel/kthread.c:253
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

The buggy address belongs to the object at 8881d7af9180
 which belongs to the cache kmalloc-128 of size 128
The buggy address is located 24 bytes inside of
 128-byte region [8881d7af9180, 8881d7af9200)
The buggy address belongs to the page:
page:ea00075ebe40 count:1 mapcount:0 mapping:8881da403200 index:0x0
flags: 0x8200(slab)
raw: 8200 dead0100 dead0200 8881da403200
raw:  80150015 0001 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 8881d7af9080: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
 8881d7af9100: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>8881d7af9180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
 8881d7af9200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 8881d7af9280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Signed-off-by: Archie Pusaka 
Reported-by: syzbot+98228e7407314d2d4...@syzkaller.appspotmail.com
Reviewed-by: Alain Michaud 
Reviewed-by: Abhishek Pandit-Subedi 
---

 include/net/bluetooth/hci_core.h | 1 +
 net/bluetooth/hci_event.c| 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
index ebdd4afe30d2..ca4ac6603b9a 100644
--- a/include/net/bluetooth/hci_core.h
+++ b/include/net/bluetooth/hci_core.h
@@ -704,6 +704,7 @@ struct hci_chan {
struct sk_buff_head data_q;
unsigned intsent;
__u8state;
+   boolamp;
 };
 
 struct hci_

[PATCH v5 6/6] KVM: arm64: GICv4.1: Give a chance to save VLPI state

Before GICv4.1, we had no direct access to the VLPI state, so the save
operation simply failed early when it encountered any VLPI.

On GICv4.1 we no longer have to return -EACCES unconditionally. Let's
relax the hard-coded check and give the save a chance to capture the
VLPI state (while preserving the UAPI).

Signed-off-by: Shenming Lu 
---
 Documentation/virt/kvm/devices/arm-vgic-its.rst | 2 +-
 arch/arm64/kvm/vgic/vgic-its.c  | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/devices/arm-vgic-its.rst b/Documentation/virt/kvm/devices/arm-vgic-its.rst
index 6c304fd2b1b4..d257eddbae29 100644
--- a/Documentation/virt/kvm/devices/arm-vgic-its.rst
+++ b/Documentation/virt/kvm/devices/arm-vgic-its.rst
@@ -80,7 +80,7 @@ KVM_DEV_ARM_VGIC_GRP_CTRL
 -EFAULT  Invalid guest ram access
 -EBUSY   One or more VCPUS are running
 -EACCES  The virtual ITS is backed by a physical GICv4 ITS, and the
-state is not available
+state is not available without GICv4.1
 ===  ==
 
 KVM_DEV_ARM_VGIC_GRP_ITS_REGS
diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
index 40cbaca81333..ec7543a9617c 100644
--- a/arch/arm64/kvm/vgic/vgic-its.c
+++ b/arch/arm64/kvm/vgic/vgic-its.c
@@ -2218,10 +2218,10 @@ static int vgic_its_save_itt(struct vgic_its *its, struct its_device *device)
/*
 * If an LPI carries the HW bit, this means that this
 * interrupt is controlled by GICv4, and we do not
-* have direct access to that state. Let's simply fail
-* the save operation...
+* have direct access to that state without GICv4.1.
+* Let's simply fail the save operation...
 */
-   if (ite->irq->hw)
+   if (ite->irq->hw && !kvm_vgic_global_state.has_gicv4_1)
return -EACCES;
 
ret = vgic_its_save_ite(its, device, ite, gpa, ite_esz);
-- 
2.19.1

[PATCH v5 2/6] irqchip/gic-v3-its: Drop the setting of PTZ altogether

GICv4.1 gives a way to get the VLPI state, which requires the vPE to be
unmapped first. After the state has been read, we may remap the vPE
while the VPT is not empty, so we can't assume that the VPT is empty at
the first map. Besides, the benefit of the PTZ optimization is probably
limited, since the HW should be fairly efficient at parsing an empty
VPT. Let's drop the setting of PTZ altogether.

Signed-off-by: Shenming Lu 
---
 drivers/irqchip/irq-gic-v3-its.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 4eb907f65bd0..c8b5a88ac31c 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -794,8 +794,16 @@ static struct its_vpe *its_build_vmapp_cmd(struct its_node *its,
 
its_encode_alloc(cmd, alloc);
 
-   /* We can only signal PTZ when alloc==1. Why do we have two bits? */
-   its_encode_ptz(cmd, alloc);
+   /*
+* We can only signal PTZ when alloc==1. Why do we have two bits?
+* GICv4.1 gives a way to get the VLPI state, which needs the vPE
+* to be unmapped first, and in this case, we may remap the vPE
+* back while the VPT is not empty. So we can't assume that the
+* VPT is empty at the first map. Besides, the optimization of PTZ
+* is probably limited since the HW should be fairly efficient to
+* parse the empty VPT. Let's drop the setting of PTZ altogether.
+*/
+   its_encode_ptz(cmd, false);
its_encode_vconf_addr(cmd, vconf_addr);
its_encode_vmapp_default_db(cmd, desc->its_vmapp_cmd.vpe->vpe_db_lpi);
 
-- 
2.19.1

[PATCH v5 4/6] KVM: arm64: GICv4.1: Try to save VLPI state in save_pending_tables

After pausing all vCPUs and all devices capable of interrupting, we
save the states of all interrupts. Besides flushing the states in
kvm's vgic, we also try to flush the states of VLPIs in the virtual
pending tables into guest RAM. This requires GICv4.1, and the vPEs
must be safely unmapped first.

Saving VSGIs requires the vPEs to be mapped and might therefore
conflict with saving VLPIs. However, since we map the vPEs back at the
end of save_pending_tables and both operations require kvm->lock to be
held (and thus only happen serially), this works fine.

Signed-off-by: Shenming Lu 
---
 arch/arm64/kvm/vgic/vgic-v3.c | 66 +++
 1 file changed, 60 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 6f530925a231..41ecf219c333 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -1,6 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -356,6 +358,32 @@ int vgic_v3_lpi_sync_pending_status(struct kvm *kvm, struct vgic_irq *irq)
return 0;
 }
 
+/*
+ * The deactivation of the doorbell interrupt will trigger the
+ * unmapping of the associated vPE.
+ */
+static void unmap_all_vpes(struct vgic_dist *dist)
+{
+   struct irq_desc *desc;
+   int i;
+
+   for (i = 0; i < dist->its_vm.nr_vpes; i++) {
+   desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
+   irq_domain_deactivate_irq(irq_desc_get_irq_data(desc));
+   }
+}
+
+static void map_all_vpes(struct vgic_dist *dist)
+{
+   struct irq_desc *desc;
+   int i;
+
+   for (i = 0; i < dist->its_vm.nr_vpes; i++) {
+   desc = irq_to_desc(dist->its_vm.vpes[i]->irq);
+   irq_domain_activate_irq(irq_desc_get_irq_data(desc), false);
+   }
+}
+
 /**
  * vgic_v3_save_pending_tables - Save the pending tables into guest RAM
  * kvm lock and all vcpu lock must be held
@@ -365,13 +393,28 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
struct vgic_dist *dist = &kvm->arch.vgic;
struct vgic_irq *irq;
gpa_t last_ptr = ~(gpa_t)0;
-   int ret;
+   bool vlpi_avail = false;
+   int ret = 0;
u8 val;
 
+   if (unlikely(!vgic_initialized(kvm)))
+   return -ENXIO;
+
+   /*
+* A preparation for getting any VLPI states.
+* The above vgic initialized check also ensures that the allocation
+* and enabling of the doorbells have already been done.
+*/
+   if (kvm_vgic_global_state.has_gicv4_1) {
+   unmap_all_vpes(dist);
+   vlpi_avail = true;
+   }
+
list_for_each_entry(irq, &dist->lpi_list_head, lpi_list) {
int byte_offset, bit_nr;
struct kvm_vcpu *vcpu;
gpa_t pendbase, ptr;
+   bool is_pending;
bool stored;
 
vcpu = irq->target_vcpu;
@@ -387,24 +430,35 @@ int vgic_v3_save_pending_tables(struct kvm *kvm)
if (ptr != last_ptr) {
ret = kvm_read_guest_lock(kvm, ptr, &val, 1);
if (ret)
-   return ret;
+   goto out;
last_ptr = ptr;
}
 
stored = val & (1U << bit_nr);
-   if (stored == irq->pending_latch)
+
+   is_pending = irq->pending_latch;
+
+   if (irq->hw && vlpi_avail)
+   vgic_v4_get_vlpi_state(irq, &is_pending);
+
+   if (stored == is_pending)
continue;
 
-   if (irq->pending_latch)
+   if (is_pending)
val |= 1 << bit_nr;
else
val &= ~(1 << bit_nr);
 
ret = kvm_write_guest_lock(kvm, ptr, &val, 1);
if (ret)
-   return ret;
+   goto out;
}
-   return 0;
+
+out:
+   if (vlpi_avail)
+   map_all_vpes(dist);
+
+   return ret;
 }
 
 /**
-- 
2.19.1
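
The error-handling change in vgic_v3_save_pending_tables() follows the
usual single-exit pattern: once the vPEs have been unmapped, every exit
path has to map them back. A stand-alone sketch of that pattern, in
which struct model, save_entries and fail_at are hypothetical stand-ins
for the real vgic state and guest memory accesses:

```c
#include <errno.h>

/* Hypothetical stand-in for the vgic state touched by the save path. */
struct model {
    int mapped;  /* 1 while the vPEs are mapped */
    int fail_at; /* index at which the simulated guest access fails, or -1 */
};

/*
 * Unmap, process n entries, and always map back before returning,
 * whether the loop finished or bailed out on an error.
 */
int save_entries(struct model *m, int n)
{
    int ret = 0;
    int i;

    m->mapped = 0; /* corresponds to unmap_all_vpes() */

    for (i = 0; i < n; i++) {
        if (i == m->fail_at) {
            ret = -EIO; /* simulated read/write failure */
            goto out;
        }
    }

out:
    m->mapped = 1; /* corresponds to map_all_vpes() */
    return ret;
}
```

Returning directly from inside the loop, as the code did before the
patch, would leave the model unmapped on the error path.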

[PATCH v5 3/6] KVM: arm64: GICv4.1: Add function to get VLPI state

With GICv4.1 and the vPE unmapped, which indicates the invalidation
of any VPT caches associated with the vPE, we can get the VLPI state
by peeking at the VPT. So we add a function for this.

Signed-off-by: Shenming Lu 
---
 arch/arm64/kvm/vgic/vgic-v4.c | 19 +++
 arch/arm64/kvm/vgic/vgic.h|  1 +
 2 files changed, 20 insertions(+)

diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
index 66508b03094f..ac029ba3d337 100644
--- a/arch/arm64/kvm/vgic/vgic-v4.c
+++ b/arch/arm64/kvm/vgic/vgic-v4.c
@@ -203,6 +203,25 @@ void vgic_v4_configure_vsgis(struct kvm *kvm)
kvm_arm_resume_guest(kvm);
 }
 
+/*
+ * Must be called with GICv4.1 and the vPE unmapped, which
+ * indicates the invalidation of any VPT caches associated
+ * with the vPE, thus we can get the VLPI state by peeking
+ * at the VPT.
+ */
+void vgic_v4_get_vlpi_state(struct vgic_irq *irq, bool *val)
+{
+   struct its_vpe *vpe = &irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe;
+   int mask = BIT(irq->intid % BITS_PER_BYTE);
+   void *va;
+   u8 *ptr;
+
+   va = page_address(vpe->vpt_page);
+   ptr = va + irq->intid / BITS_PER_BYTE;
+
+   *val = !!(*ptr & mask);
+}
+
 /**
  * vgic_v4_init - Initialize the GICv4 data structures
  * @kvm:   Pointer to the VM being initialized
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index 64fcd750..d8cfd360838c 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -317,5 +317,6 @@ bool vgic_supports_direct_msis(struct kvm *kvm);
 int vgic_v4_init(struct kvm *kvm);
 void vgic_v4_teardown(struct kvm *kvm);
 void vgic_v4_configure_vsgis(struct kvm *kvm);
+void vgic_v4_get_vlpi_state(struct vgic_irq *irq, bool *val);
 
 #endif
-- 
2.19.1
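
The lookup in vgic_v4_get_vlpi_state() is plain bitmap indexing: the
byte is intid / 8 and the bit within it is intid % 8. A user-space
sketch of the same arithmetic, assuming the table is an ordinary byte
array; pending_bit is a hypothetical name and does not exist in the
kernel:

```c
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_BYTE 8

/*
 * Return whether the bit for intid is set in a pending table that
 * stores one bit per interrupt, least significant bit first.
 * The caller must ensure the table covers intid.
 */
bool pending_bit(const uint8_t *table, unsigned int intid)
{
    uint8_t mask = (uint8_t)(1U << (intid % BITS_PER_BYTE));

    return (table[intid / BITS_PER_BYTE] & mask) != 0;
}
```

For example, interrupt 10 lives in byte 1, bit 2 of the table.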

[PATCH v5 1/6] irqchip/gic-v3-its: Add a cache invalidation right after vPE unmapping

From: Marc Zyngier 

Since there may be a direct read from the CPU side to the VPT after
unmapping the vPE, we add a cache coherency maintenance at the end
of its_vpe_irq_domain_deactivate() to ensure the validity of the VPT
read later.

Signed-off-by: Marc Zyngier 
Signed-off-by: Shenming Lu 
---
 drivers/irqchip/irq-gic-v3-its.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index ed46e6057e33..4eb907f65bd0 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -4554,6 +4554,15 @@ static void its_vpe_irq_domain_deactivate(struct irq_domain *domain,
 
its_send_vmapp(its, vpe, false);
}
+
+   /*
+* There may be a direct read to the VPT after unmapping the
+* vPE, to guarantee the validity of this, we make the VPT
+* memory coherent with the CPU caches here.
+*/
+   if (find_4_1_its() && !atomic_read(&vpe->vmapp_count))
+   gic_flush_dcache_to_poc(page_address(vpe->vpt_page),
+   LPI_PENDBASE_SZ);
 }
 
 static const struct irq_domain_ops its_vpe_domain_ops = {
-- 
2.19.1

[PATCH v5 5/6] KVM: arm64: GICv4.1: Restore VLPI pending state to physical side

From: Zenghui Yu 

When setting the forwarding path of a VLPI (switch to the HW mode),
we can also transfer the pending state from irq->pending_latch to
VPT (especially in migration, the pending states of VLPIs are restored
into kvm’s vgic first). And we currently send "INT+VSYNC" to trigger
a VLPI to pending.

Signed-off-by: Zenghui Yu 
Signed-off-by: Shenming Lu 
---
 arch/arm64/kvm/vgic/vgic-v4.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/kvm/vgic/vgic-v4.c b/arch/arm64/kvm/vgic/vgic-v4.c
index ac029ba3d337..c1845d8f5f7e 100644
--- a/arch/arm64/kvm/vgic/vgic-v4.c
+++ b/arch/arm64/kvm/vgic/vgic-v4.c
@@ -404,6 +404,7 @@ int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int virq,
struct vgic_its *its;
struct vgic_irq *irq;
struct its_vlpi_map map;
+   unsigned long flags;
int ret;
 
if (!vgic_supports_direct_msis(kvm))
@@ -449,6 +450,24 @@ int kvm_vgic_v4_set_forwarding(struct kvm *kvm, int virq,
irq->host_irq   = virq;
atomic_inc(&map.vpe->vlpi_count);
 
+   /* Transfer pending state */
+   raw_spin_lock_irqsave(&irq->irq_lock, flags);
+   if (irq->pending_latch) {
+   ret = irq_set_irqchip_state(irq->host_irq,
+   IRQCHIP_STATE_PENDING,
+   irq->pending_latch);
+   WARN_RATELIMIT(ret, "IRQ %d", irq->host_irq);
+
+   /*
+* Clear pending_latch and communicate this state
+* change via vgic_queue_irq_unlock.
+*/
+   irq->pending_latch = false;
+   vgic_queue_irq_unlock(kvm, irq, flags);
+   } else {
+   raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+   }
+
 out:
mutex_unlock(&its->its_lock);
return ret;
-- 
2.19.1

RE: linux-next: Fixes tag needs some work in the usb-chipidea-fixes tree

2021-03-21 Thread Pawel Laszczak

Hi Stephen,

I've sent the new version.

Thanks,

>
>Hi all,
>
>In commit
>
>  67a788c7c3e7 ("usb: cdnsp: Fixes issue with dequeuing requests after 
> disabling endpoint")
>
>Fixes tag
>
>  Fixes: commit 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence 
> USBSSP DRD Driver")
>
>has these problem(s):
>
>  - leading word 'commit' unexpected
>
>--
>Cheers,
>Stephen Rothwell

Regards,
Pawel Laszczak

[PATCH v5 0/6] KVM: arm64: Add VLPI migration support on GICv4.1

Hi,

In GICv4.1, migration is already supported except for (directly-injected)
VLPIs. The GICv4.1 spec explicitly gives a way to get a VLPI's pending
state (which was crucially missing in GICv4.0), so this series enables
VLPI migration on GICv4.1.

In order to support VLPI migration, we need to save and restore all
required configuration information and pending states of VLPIs. But
in fact, the configuration information of VLPIs has already been saved
(or will be reallocated on the dst host...) in vgic(kvm) migration.
So we only have to migrate the pending states of VLPIs specially.

Below is the related workflow in migration.

On the save path:
    In migration completion:
        pause all vCPUs
                |
        call each VM state change handler:
            pause other devices (just keep from sending interrupts, and
            such as VFIO migration protocol has already realized it [1])
                    |
            flush ITS tables into guest RAM
                    |
            flush RDIST pending tables (also flush VLPI pending states here)
                |
        ...
On the resume path:
    load each device's state:
        restore ITS tables (include pending tables) from guest RAM
                |
        for other (PCI) devices (paused), if configured to have VLPIs,
        establish the forwarding paths of their VLPIs (and transfer
        the pending states from kvm's vgic to VPT here)

We have tested this series in VFIO migration, and found some related
issues in QEMU [2].

Links:
[1] vfio: UAPI for migration interface for device state:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a8a24f3f6e38103b77cf399c38eb54e1219d00d6
[2] vfio: Some fixes and optimizations for VFIO migration:

https://patchwork.ozlabs.org/project/qemu-devel/cover/20210310030233.1133-1-lushenm...@huawei.com/

History:

v4 -> v5
 - Lock the whole pending state read/write sequence. (in Patch 5, from Marc)

v3 -> v4
 - Nit fixes.
 - Add a CPU cache invalidation right after unmapping the vPE. (Patch 1)
 - Drop the setting of PTZ altogether. (Patch 2)
 - Bail out if spot !vgic_initialized(). (in Patch 4)
 - Communicate the state change (clear pending_latch) via
   vgic_queue_irq_unlock. (in Patch 5)

Thanks a lot for the suggestions from Marc!

v2 -> v3
 - Add the vgic initialized check to ensure that the allocation and enabling
   of the doorbells have already been done before unmapping the vPEs.
 - Check all get_vlpi_state related conditions in save_pending_tables in one
   place.
 - Nit fixes.

v1 -> v2:
 - Get the VLPI state from the KVM side.
 - Nit fixes.

Thanks,
Shenming


Marc Zyngier (1):
  irqchip/gic-v3-its: Add a cache invalidation right after vPE unmapping

Shenming Lu (4):
  irqchip/gic-v3-its: Drop the setting of PTZ altogether
  KVM: arm64: GICv4.1: Add function to get VLPI state
  KVM: arm64: GICv4.1: Try to save VLPI state in save_pending_tables
  KVM: arm64: GICv4.1: Give a chance to save VLPI state

Zenghui Yu (1):
  KVM: arm64: GICv4.1: Restore VLPI pending state to physical side

 .../virt/kvm/devices/arm-vgic-its.rst |  2 +-
 arch/arm64/kvm/vgic/vgic-its.c|  6 +-
 arch/arm64/kvm/vgic/vgic-v3.c | 66 +--
 arch/arm64/kvm/vgic/vgic-v4.c | 38 +++
 arch/arm64/kvm/vgic/vgic.h|  1 +
 drivers/irqchip/irq-gic-v3-its.c  | 21 +-
 6 files changed, 122 insertions(+), 12 deletions(-)

-- 
2.19.1

[PATCH] Bluetooth: Set CONF_NOT_COMPLETE as l2cap_chan default

2021-03-21 Thread Archie Pusaka

From: Archie Pusaka 

Currently l2cap_chan_set_defaults() resets chan->conf_state to zero.
However, there is a flag CONF_NOT_COMPLETE which is set when
creating the l2cap_chan. It is suggested that the flag should be
cleared when l2cap_chan is ready, but when l2cap_chan_set_defaults()
is called, l2cap_chan is not yet ready. Therefore, we must set this
flag as the default.

Example crash call trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0xc4/0x118 lib/dump_stack.c:56
panic+0x1c6/0x38b kernel/panic.c:117
__warn+0x170/0x1b9 kernel/panic.c:471
warn_slowpath_fmt+0xc7/0xf8 kernel/panic.c:494
debug_print_object+0x175/0x193 lib/debugobjects.c:260
debug_object_assert_init+0x171/0x1bf lib/debugobjects.c:614
debug_timer_assert_init kernel/time/timer.c:629 [inline]
debug_assert_init kernel/time/timer.c:677 [inline]
del_timer+0x7c/0x179 kernel/time/timer.c:1034
try_to_grab_pending+0x81/0x2e5 kernel/workqueue.c:1230
cancel_delayed_work+0x7c/0x1c4 kernel/workqueue.c:2929
l2cap_clear_timer+0x1e/0x41 include/net/bluetooth/l2cap.h:834
l2cap_chan_del+0x2d8/0x37e net/bluetooth/l2cap_core.c:640
l2cap_chan_close+0x532/0x5d8 net/bluetooth/l2cap_core.c:756
l2cap_sock_shutdown+0x806/0x969 net/bluetooth/l2cap_sock.c:1174
l2cap_sock_release+0x64/0x14d net/bluetooth/l2cap_sock.c:1217
__sock_release+0xda/0x217 net/socket.c:580
sock_close+0x1b/0x1f net/socket.c:1039
__fput+0x322/0x55c fs/file_table.c:208
fput+0x17/0x19 fs/file_table.c:244
task_work_run+0x19b/0x1d3 kernel/task_work.c:115
exit_task_work include/linux/task_work.h:21 [inline]
do_exit+0xe4c/0x204a kernel/exit.c:766
do_group_exit+0x291/0x291 kernel/exit.c:891
get_signal+0x749/0x1093 kernel/signal.c:2396
do_signal+0xa5/0xcdb arch/x86/kernel/signal.c:737
exit_to_usermode_loop arch/x86/entry/common.c:243 [inline]
prepare_exit_to_usermode+0xed/0x235 arch/x86/entry/common.c:277
syscall_return_slowpath+0x3a7/0x3b3 arch/x86/entry/common.c:348
int_ret_from_sys_call+0x25/0xa3

Signed-off-by: Archie Pusaka 
Reported-by: syzbot+338f014a98367a08a...@syzkaller.appspotmail.com
Reviewed-by: Alain Michaud 
Reviewed-by: Abhishek Pandit-Subedi 
Reviewed-by: Guenter Roeck 
---

 net/bluetooth/l2cap_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 374cc32d7138..59ab9689b37d 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -516,7 +516,9 @@ void l2cap_chan_set_defaults(struct l2cap_chan *chan)
chan->flush_to = L2CAP_DEFAULT_FLUSH_TO;
chan->retrans_timeout = L2CAP_DEFAULT_RETRANS_TO;
chan->monitor_timeout = L2CAP_DEFAULT_MONITOR_TO;
+
chan->conf_state = 0;
+   set_bit(CONF_NOT_COMPLETE, &chan->conf_state);
 
set_bit(FLAG_FORCE_ACTIVE, &chan->flags);
 }
-- 
2.31.0.rc2.261.g7f71774620-goog

Re: [PATCH 2/3] arm64: dts: qcom: sm8150: add iommus to qups

2021-03-21 Thread Vinod Koul

On 20-03-21, 17:16, Caleb Connolly wrote:
> Hi Vinod,
> 
> On 16/03/2021 6:15 am, Vinod Koul wrote:
> > On 10-03-21, 16:31, Caleb Connolly wrote:
> >> Hook up the SMMU for doing DMA over i2c. Some peripherals like
> >> touchscreens easily exceed 32-bytes per transfer, causing errors and
> >> lockups without this.
> > Why not squash this to patch 1..?
> 
> I thought it made more sense to separate these patches to keep the 
> history a bit cleaner. I can squash them if you'd prefer.

The nodes should be typically added in a single patch, maybe Bjorn is
fine with this ;-)

-- 
~Vinod

[PATCH] Bluetooth: check for zapped sk before connecting

2021-03-21 Thread Archie Pusaka

From: Archie Pusaka 

There is a possibility of receiving a zapped sock on
l2cap_sock_connect(). This could lead to interesting crashes; one
such case is tearing down an already torn-down l2cap_sock, as happened
with this call trace:

__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0xc4/0x118 lib/dump_stack.c:56
register_lock_class kernel/locking/lockdep.c:792 [inline]
register_lock_class+0x239/0x6f6 kernel/locking/lockdep.c:742
__lock_acquire+0x209/0x1e27 kernel/locking/lockdep.c:3105
lock_acquire+0x29c/0x2fb kernel/locking/lockdep.c:3599
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:137 [inline]
_raw_spin_lock_bh+0x38/0x47 kernel/locking/spinlock.c:175
spin_lock_bh include/linux/spinlock.h:307 [inline]
lock_sock_nested+0x44/0xfa net/core/sock.c:2518
l2cap_sock_teardown_cb+0x88/0x2fb net/bluetooth/l2cap_sock.c:1345
l2cap_chan_del+0xa3/0x383 net/bluetooth/l2cap_core.c:598
l2cap_chan_close+0x537/0x5dd net/bluetooth/l2cap_core.c:756
l2cap_chan_timeout+0x104/0x17e net/bluetooth/l2cap_core.c:429
process_one_work+0x7e3/0xcb0 kernel/workqueue.c:2064
worker_thread+0x5a5/0x773 kernel/workqueue.c:2196
kthread+0x291/0x2a6 kernel/kthread.c:211
ret_from_fork+0x4e/0x80 arch/x86/entry/entry_64.S:604

Signed-off-by: Archie Pusaka 
Reported-by: syzbot+abfc0f5e668d4099a...@syzkaller.appspotmail.com
Reviewed-by: Alain Michaud 
Reviewed-by: Abhishek Pandit-Subedi 
Reviewed-by: Guenter Roeck 
---

 net/bluetooth/l2cap_sock.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index f1b1edd0b697..b86fd8cc4dc1 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -182,6 +182,13 @@ static int l2cap_sock_connect(struct socket *sock, struct sockaddr *addr,
 
BT_DBG("sk %p", sk);
 
+   lock_sock(sk);
+   if (sock_flag(sk, SOCK_ZAPPED)) {
+   release_sock(sk);
+   return -EINVAL;
+   }
+   release_sock(sk);
+
if (!addr || alen < offsetofend(struct sockaddr, sa_family) ||
addr->sa_family != AF_BLUETOOTH)
return -EINVAL;
-- 
2.31.0.rc2.261.g7f71774620-goog

Re: GTE - The hardware timestamping engine

2021-03-21 Thread Kent Gibson

On Sat, Mar 20, 2021 at 12:56:36PM +0100, Linus Walleij wrote:
> Hi Dipen,
> 
> thanks for your mail!
> 
> I involved some other kernel people to get some discussion.
> I think Kent Gibson can be of great help because he is using
> GPIOs with high precision.
> 

Actually I just extended the cdev uAPI to provide the REALTIME option,
which was the event clock until we changed to MONOTONIC in Linux 5.7,
as there were some users that were requiring the REALTIME clock.

> We actually discussed this a bit when adding support for
> realtime timestamps.
> 
> On Wed, Mar 17, 2021 at 11:29 PM Dipen Patel  wrote:
> 
> > Nvidia Tegra SoCs have generic timestamping engine (GTE) hardware module
> > which can monitor SoC signals like IRQ lines and GPIO lines for state
> > change, upon detecting the change, it can timestamp and store in its
> > internal hardware FIFO. The advantage of the GTE module can be realized
> > in applications like robotics or autonomous vehicle where it can help
> > record events with precise timestamp.
> 
> That sounds very useful.
> 

Indeed - it could remove the latency and jitter that results from
timestamping events in the IRQ handler.

> Certainly the kernel shall be able to handle this.
> 
> > 
> > For GPIO:
> > 
> > 1.  GPIO has to be configured as input and IRQ must be enabled.
> > 2.  Ask GPIO controller driver to set corresponding timestamp bit in the
> > specified GPIO config register.
> > 3.  Translate GPIO specified by the client to its internal bitmap.
> > 3.a For example, If client specifies GPIO line 31, it could be bit 13 of GTE
> > register.
> > 4.  Set internal bits to enable monitoring in GTE module
> > 5.  Additionally GTE driver can open up lanes for the user space
> > application as a client and can send timestamping events directly to
> > the application.
> 
> I have some concerns:
> 
> 1. GPIO should for all professional applications be used with the character
> device /dev/gpiochipN, under no circumstances shall the old sysfs
> ABI be used for this. In this case it is necessary because the
> character device provides events in a FIFO to userspace, which is
> what we need.
> 

The cdev uAPI would certainly be the most sensible place to expose
this to userspace - its line events being a direct analog to what the GTE
provides.

> The timestamp provided to userspace is an opaque 64bit
> unsigned value. I suppose we assume it is monotonic but
> you can actually augment the semantics for your specific
> stamp, as long as 64 bits is gonna work.
> 
> 2. The timestamp for the chardev is currently obtained in
> drivers/gpio/gpiolib-cdev.c like this:
> 
> static u64 line_event_timestamp(struct line *line)
> {
> if (test_bit(FLAG_EVENT_CLOCK_REALTIME, &line->desc->flags))
> return ktime_get_real_ns();
> 
> return ktime_get_ns();
> }
> 
> What you want to do is to add a new flag for hardware timestamps
> and use that if available. FLAG_EVENT_CLOCK_HARDWARE?
> FLAG_EVENT_CLOCK_NATIVE?
> 

HARDWARE looks better to me, as NATIVE is more vague.

> Then you need to figure out a mechanism so we can obtain
> the right timestamp from the hardware event right here,
> you can hook into the GPIO driver if need be, we can
> figure out the gpio_chip for a certain line for sure.
> 

Firstly, line_event_timestamp() is called from the IRQ handler context.
That is obviously more constraining than if it were only called from the
IRQ thread. If the GTE is providing the timestamp then that could be
put off until the IRQ thread.
So you probably want to refactor line_event_timestamp() into two flavours
- one for IRQ handler that returns 0 if HARDWARE is set, and the other for
IRQ thread, where there is already a fallback call to
line_event_timestamp() for the nested threaded interrupt case, that gets
the timestamp from the GTE.
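
A rough user-space sketch of that split, purely for illustration. The
FLAG_EVENT_CLOCK_HARDWARE bit, the hw_read_timestamp() hook, the helper
names and the stubbed clocks are all hypothetical; nothing here mirrors
the actual layout of gpiolib-cdev.c:

```c
#include <stdint.h>

/* Illustrative flag bits; the HARDWARE flag does not exist yet. */
#define FLAG_EVENT_CLOCK_REALTIME (1UL << 0)
#define FLAG_EVENT_CLOCK_HARDWARE (1UL << 1)

struct line {
	unsigned long flags;
	/* Hypothetical hook: fetch the stamp the GTE stored for this line. */
	uint64_t (*hw_read_timestamp)(struct line *line);
};

/* Stand-ins for ktime_get_real_ns() and ktime_get_ns(). */
static uint64_t stub_real_ns(void) { return 2000; }
static uint64_t stub_mono_ns(void) { return 1000; }

/* Example GTE backend used only to exercise the sketch. */
static uint64_t fake_gte_read(struct line *line)
{
	(void)line;
	return 42;
}

static uint64_t software_timestamp(const struct line *line)
{
	if (line->flags & FLAG_EVENT_CLOCK_REALTIME)
		return stub_real_ns();
	return stub_mono_ns();
}

/* IRQ handler flavour: returns 0 when the hardware will supply the stamp. */
static uint64_t line_event_timestamp_irq(const struct line *line)
{
	if (line->flags & FLAG_EVENT_CLOCK_HARDWARE)
		return 0;
	return software_timestamp(line);
}

/* IRQ thread flavour: keeps a stamp taken earlier, else asks the GTE. */
static uint64_t line_event_timestamp_thread(struct line *line, uint64_t irq_ts)
{
	if (irq_ts)
		return irq_ts;
	if ((line->flags & FLAG_EVENT_CLOCK_HARDWARE) && line->hw_read_timestamp)
		return line->hw_read_timestamp(line);
	/* Nested threaded interrupt case: fall back to a software clock. */
	return software_timestamp(line);
}
```

The thread flavour only touches the GTE when the handler left the stamp
at 0, so lines that do not request HARDWARE behave as they do today.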

But my primary concern here would be keeping the two event FIFOs (GTE and
cdev) in sync.  Masking and unmasking in hardware and the kernel needs to
be coordinated to prevent races that would result in sync loss.
So this probably needs to be configured in the GTE driver via the irq
path, rather than pinctrl?

Is every event detected by the GTE guaranteed to trigger an interrupt in
the kernel?

How to handle GTE FIFO overflows?  Can they be detected or prevented?

> So you first need to augment the userspace
> ABI and the character device code to add this. See
> commit 26d060e47e25f2c715a1b2c48fea391f67907a30
> "gpiolib: cdev: allow edge event timestamps to be configured as REALTIME"
> by Kent Gibson to see what needs to be done.
> 

You should also extend gpio_v2_line_flags_validate() to disallow setting
of multiple event clock flags, similar to the bias flag checks.
Currently there is only the one event clock flag, so no need to check.
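
For illustration, such a check could look roughly like the sketch below.
It is a stand-alone stand-in for the relevant part of
gpio_v2_line_flags_validate(): the HARDWARE flag is hypothetical, the
helper name is made up, and the bit positions are placeholders rather
than the values from the uAPI header:

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder bit positions, not the values from the uAPI header. */
#define EVENT_CLOCK_REALTIME (UINT64_C(1) << 0)
#define EVENT_CLOCK_HARDWARE (UINT64_C(1) << 1) /* hypothetical new flag */

/* Reject requests that select more than one event clock. */
static int event_clock_flags_validate(uint64_t flags)
{
	if ((flags & EVENT_CLOCK_REALTIME) && (flags & EVENT_CLOCK_HARDWARE))
		return -EINVAL;

	return 0;
}
```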

> 3. Also patch tools/gpio/gpio-event-mon.c to support this flag and use that
> for prototyping and proof of concept.
> 

The corresponding commit for the REALTIM

linux-next: build warning after merge of the amdgpu tree

Hi all,

After merging the amdgpu tree, today's linux-next build (htmldocs)
produced this warning:

drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h:405: warning: Function parameter or member 'dmcub_trace_event_en' not described in 'amdgpu_display_manager'

Introduced by commit

  4057828a1283 ("drm/amd/display: Add debugfs to control DMUB trace buffer events")

-- 
Cheers,
Stephen Rothwell



Re: [PATCH v4 0/3] Fix pinctrl-single pcs_pin_dbg_show()

2021-03-21 Thread Drew Fustini

On Fri, Mar 19, 2021 at 05:21:30PM +0200, Hanna Hawa wrote:
> These patches fix the pcs_pin_dbg_show() function for the scenario where
> a single register controls multiple pins (i.e. bits_per_mux is not zero)
> Additionally, the common formula is moved to a separate function to
> allow reuse.
> 
> Changes since v3:
> -
> - define and set variable 'mux_bytes' in one line
> - update commit message
> 
> Changes since v2:
> -
> - move read() register to be outside of if condition (as it common
>   read()).
> - Remove extra parentheses
> - replace offset variable by direct return statements
> 
> Changes since v1:
> -
> - remove unused variable in In function 'pcs_allocate_pin_table'
>   (Reported-by: kernel test robot )
> 
> Hanna Hawa (3):
>   pinctrl: pinctrl-single: remove unused variable
>   pinctrl: pinctrl-single: remove unused parameter
>   pinctrl: pinctrl-single: fix pcs_pin_dbg_show() when bits_per_mux is
> not zero
> 
>  drivers/pinctrl/pinctrl-single.c | 65 ++--
>  1 file changed, 37 insertions(+), 28 deletions(-)
> 
> -- 
> 2.17.1
> 

I'm curious, which SoC are you using?

It's good to know who has hardware to test bits_per_mux in the future.

I pay attention to pinctrl-single as that is the driver used for the TI
AM3358 SoC used in a variety of BeagleBone boards.  It does not use 
bits_per_mux, but I can verify that this does not cause any regression
for the AM3358 SoC:

  /sys/kernel/debug/pinctrl/44e10800.pinmux-pinctrl-single# cat pins
  registered pins: 142
  pin 0 (PIN0) 0:? 44e10800 0027 pinctrl-single
  pin 1 (PIN1) 0:? 44e10804 0027 pinctrl-single
  pin 2 (PIN2) 0:? 44e10808 0027 pinctrl-single
  pin 3 (PIN3) 0:? 44e1080c 0027 pinctrl-single
  pin 4 (PIN4) 0:? 44e10810 0027 pinctrl-single
  pin 5 (PIN5) 0:? 44e10814 0027 pinctrl-single
  pin 6 (PIN6) 0:? 44e10818 0027 pinctrl-single
  pin 7 (PIN7) 0:? 44e1081c 0027 pinctrl-single
  pin 8 (PIN8) 22:gpio-96-127 44e10820 0027 pinctrl-single
  pin 9 (PIN9) 23:gpio-96-127 44e10824 0037 pinctrl-single
  pin 10 (PIN10) 26:gpio-96-127 44e10828 0037 pinctrl-single
  pin 11 (PIN11) 27:gpio-96-127 44e1082c 0037 pinctrl-single
  pin 12 (PIN12) 0:? 44e10830 0037 pinctrl-single
  
  pin 140 (PIN140) 0:? 44e10a30 0028 pinctrl-single
  pin 141 (PIN141) 13:gpio-64-95 44e10a34 0020 pinctrl-single
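
As a cross-check of the addresses above, the per-pin register offset can
be modelled with the stand-alone sketch below. It is not the code from
the series: the function name and parameters are hypothetical, and the
branch for shared registers is only an assumed reading of the "common
formula" mentioned in the cover letter.

```c
#include <stdint.h>

#define BITS_PER_BYTE 8u

/*
 * Hypothetical helper: byte offset of the register holding a pin's mux field.
 * width        - register width in bits (32 on AM3358)
 * bits_per_pin - bits each pin occupies when one register serves several
 *                pins; 0 means one register per pin
 */
static uint32_t pin_reg_offset(uint32_t pin, uint32_t width, uint32_t bits_per_pin)
{
	uint32_t mux_bytes = width / BITS_PER_BYTE;

	if (bits_per_pin) {
		/* Round the pin's bit position down to its containing register. */
		uint32_t pin_offset_bytes = (bits_per_pin * pin) / BITS_PER_BYTE;

		return (pin_offset_bytes / mux_bytes) * mux_bytes;
	}

	return pin * mux_bytes;
}
```

With a 32-bit register per pin, pin 141 lands at offset 0x234 from the
base 44e10800, which matches the 44e10a34 shown in the listing.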

Reviewed-by: Drew Fustini 

Thanks,
Drew

[PATCH 3/4] spi: mediatek: add mtk_spi_compatible support

This patch adds max_fifo_size and must_rx compat support.

Signed-off-by: Leilk Liu 
---
 drivers/spi/spi-slave-mt27xx.c | 28 
 1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/drivers/spi/spi-slave-mt27xx.c b/drivers/spi/spi-slave-mt27xx.c
index 44edaa360405..7e6fadc88cef 100644
--- a/drivers/spi/spi-slave-mt27xx.c
+++ b/drivers/spi/spi-slave-mt27xx.c
@@ -10,6 +10,8 @@
 #include 
 #include 
 #include 
+#include 
+
 
 #define SPIS_IRQ_EN_REG0x0
 #define SPIS_IRQ_CLR_REG   0x4
@@ -61,8 +63,6 @@
 #define SPIS_DMA_ADDR_EN   BIT(1)
 #define SPIS_SOFT_RST  BIT(0)
 
-#define MTK_SPI_SLAVE_MAX_FIFO_SIZE 512U
-
 struct mtk_spi_slave {
struct device *dev;
void __iomem *base;
@@ -70,10 +70,19 @@ struct mtk_spi_slave {
struct completion xfer_done;
struct spi_transfer *cur_transfer;
bool slave_aborted;
+   const struct mtk_spi_compatible *dev_comp;
 };
 
+struct mtk_spi_compatible {
+   const u32 max_fifo_size;
+   bool must_rx;
+};
+static const struct mtk_spi_compatible mt2712_compat = {
+   .max_fifo_size = 512,
+};
 static const struct of_device_id mtk_spi_slave_of_match[] = {
-   { .compatible = "mediatek,mt2712-spi-slave", },
+   { .compatible = "mediatek,mt2712-spi-slave",
+ .data = (void *)&mt2712_compat,},
{}
 };
 MODULE_DEVICE_TABLE(of, mtk_spi_slave_of_match);
@@ -272,7 +281,7 @@ static int mtk_spi_slave_transfer_one(struct spi_controller *ctlr,
mdata->slave_aborted = false;
mdata->cur_transfer = xfer;
 
-   if (xfer->len > MTK_SPI_SLAVE_MAX_FIFO_SIZE)
+   if (xfer->len > mdata->dev_comp->max_fifo_size)
return mtk_spi_slave_dma_transfer(ctlr, spi, xfer);
else
return mtk_spi_slave_fifo_transfer(ctlr, spi, xfer);
@@ -369,6 +378,7 @@ static int mtk_spi_slave_probe(struct platform_device *pdev)
struct spi_controller *ctlr;
struct mtk_spi_slave *mdata;
int irq, ret;
+   const struct of_device_id *of_id;
 
ctlr = spi_alloc_slave(&pdev->dev, sizeof(*mdata));
if (!ctlr) {
@@ -386,7 +396,17 @@ static int mtk_spi_slave_probe(struct platform_device *pdev)
ctlr->setup = mtk_spi_slave_setup;
ctlr->slave_abort = mtk_slave_abort;
 
+   of_id = of_match_node(mtk_spi_slave_of_match, pdev->dev.of_node);
+   if (!of_id) {
+   dev_err(&pdev->dev, "failed to probe of_node\n");
+   ret = -EINVAL;
+   goto err_put_ctlr;
+   }
mdata = spi_controller_get_devdata(ctlr);
+   mdata->dev_comp = of_id->data;
+
+   if (mdata->dev_comp->must_rx)
+   ctlr->flags = SPI_MASTER_MUST_RX;
 
platform_set_drvdata(pdev, ctlr);
 
-- 
2.18.0

[PATCH 4/4] spi: mediatek: add mt8195 spi slave support

This patch adds mt8195 spi slave compatible support.

Signed-off-by: Leilk Liu 
---
 drivers/spi/spi-slave-mt27xx.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/spi/spi-slave-mt27xx.c b/drivers/spi/spi-slave-mt27xx.c
index 7e6fadc88cef..f199a6c4738a 100644
--- a/drivers/spi/spi-slave-mt27xx.c
+++ b/drivers/spi/spi-slave-mt27xx.c
@@ -77,12 +77,20 @@ struct mtk_spi_compatible {
const u32 max_fifo_size;
bool must_rx;
 };
+
 static const struct mtk_spi_compatible mt2712_compat = {
.max_fifo_size = 512,
 };
+static const struct mtk_spi_compatible mt8195_compat = {
+   .max_fifo_size = 128,
+   .must_rx = true,
+};
+
 static const struct of_device_id mtk_spi_slave_of_match[] = {
{ .compatible = "mediatek,mt2712-spi-slave",
  .data = (void *)&mt2712_compat,},
+   { .compatible = "mediatek,mt8195-spi-slave",
+ .data = (void *)&mt8195_compat,},
{}
 };
 MODULE_DEVICE_TABLE(of, mtk_spi_slave_of_match);
-- 
2.18.0

[PATCH 0/4] Add Mediatek MT8195 SPI driver support

This series is based on spi/for-next and provides 4 patches to add MT8195 SPI
support.

Leilk Liu (4):
  spi: update spi master bindings for MT8195 SoC
  spi: update spi slave bindings for MT8195 SoC
  spi: mediatek: add mtk_spi_compatible support
  spi: mediatek: add mt8195 spi slave support

 .../devicetree/bindings/spi/spi-mt65xx.txt|  1 +
 .../bindings/spi/spi-slave-mt27xx.txt |  1 +
 drivers/spi/spi-slave-mt27xx.c| 36 ---
 3 files changed, 34 insertions(+), 4 deletions(-)

-- 
2.25.1

[PATCH 2/4] spi: update spi slave bindings for MT8195 SoC

Add DT binding documentation for the MT8195 SoC.

Signed-off-by: Leilk Liu 
---
 Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt b/Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt
index c37e5a179b21..9192724540fd 100644
--- a/Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt
+++ b/Documentation/devicetree/bindings/spi/spi-slave-mt27xx.txt
@@ -3,6 +3,7 @@ Binding for MTK SPI Slave controller
 Required properties:
 - compatible: should be one of the following.
 - mediatek,mt2712-spi-slave: for mt2712 platforms
+- mediatek,mt8195-spi-slave: for mt8195 platforms
 - reg: Address and length of the register set for the device.
 - interrupts: Should contain spi interrupt.
 - clocks: phandles to input clocks.
-- 
2.18.0

[PATCH 1/4] spi: update spi master bindings for MT8195 SoC