date:20210312

Changelog:

v26 -> v27
1. Fix wrong reference of sense buffer in pre_req complete function.
2. Fix read_id error.
3. Fix chunk size checking for HPB 1.0.
4. Mute unnecessary messages before HPB initialization.

v25 -> v26
1. Fix wrong chunk size checking for HPB 1.0.
2. Fix wrong max data size for HPB single command.
3. Fix typo error.

v24 -> v25
1. Change write buffer API for unmap region.
2. Add checking hpb_enable for avoiding unnecessary memory allocation.
3. Change pr_info to dev_info.
4. Change default requeue timeout value for HPB read.
5. Fix wrong offset manipulation on ufshpb_prep_entry.

v23 -> v24
1. Fix build error reported by kernel test robot.

v22 -> v23
1. Add support compatibility of HPB 1.0.
2. Fix read id for single HPB read command.
3. Fix number of pre-allocated requests for write buffer.
4. Add fast path for response UPIU that has same LUN in sense data.
5. Remove WARN_ON for preventing kernel crash.
6. Fix wrong argument for read buffer command.

v21 -> v22
1. Add support processing response UPIU in suspend state.
2. Add support HPB hint from other LU.
3. Add sending write buffer with 0x03 after HPB init.

v20 -> v21
1. Add bMAX_DATA_SIZE_FOR_HPB_SINGLE_CMD attr. and fHPBen flag support.

v19 -> v20
1. Add documentation for sysfs entries of hpb->stat.
2. Fix read buffer command for under-sized sub-region.
3. Fix wrong condition checking for kick map work.
4. Delete redundant response UPIU checking.
5. Add LUN checking in response UPIU.
6. Fix possible deadlock problem due to runtime PM.
7. Add instant changing of sub-region state from response UPIU.
8. Fix endian problem in prefetched PPN.
9. Add JESD220-3A (HPB v2.0) support.

v18 -> v19
1. Fix null pointer error when printing sysfs from non-HPB LU.
2. Apply HPB read opcode in lrbp->cmd->cmnd (from Can Guo's review).
3. Rebase the patch on 5.12/scsi-queue.

v17 -> v18
Fix build error reported by kernel test robot.

v16 -> v17
1. Rename hpb_state_lock to rgn_state_lock and move it to corresponding
patch.
2. Remove redundant information messages.

v15 -> v16
1. Add missed sysfs ABI documentation.

v14 -> v15
1. Remove duplicated sysfs ABI entries in documentation.
2. Add experiment result of HPB performance testing with iozone.

v13 -> v14
1. Clean up code as commented in Greg's review.
2. Add documentation for sysfs entries (from Greg's review).
3. Add experiment result of HPB performance testing.

v12 -> v13
1. Clean up code following comments from Can Guo.
2. Add HPB related descriptor/flag/attributes in sysfs.
3. Change base commit from 5.10/scsi-queue to 5.11/scsi-queue.

v11 -> v12
1. Return an error value when HPB fails to initialize the pinned active
region.
2. Disable the HPB feature if HPB fails to allocate essential memory
and workqueue.
3. Set the proper sub-region state when the region is already evicted.

v10 -> v11
Add a newline at the end of the last line of the Kconfig file.

v9 -> v10
1. Fixed 64-bit division error.
2. Fixed problems commented in Bart's review.

v8 -> v9
1. Change sysfs initialization.
2. Change reading descriptor during HPB initialization.
3. Fixed problems commented in Bart's review.
4. Change base commit from 5.9/scsi-queue to 5.10/scsi-queue.

v7 -> v8
Remove wrongly added tags.

v6 -> v7
1. Remove UFS feature layer.
2. Cleanup for sparse error.

v5 -> v6
Change base commit to b53293fa662e28ae0cdd40828dc641c09f133405

v4 -> v5
Delete unused macro define.

v3 -> v4
1. Cleanup.

v2 -> v3
1. Add checking input module parameter value.
2. Change base commit from 5.8/scsi-queue to 5.9/scsi-queue.
3. Cleanup for unused variables and label.

v1 -> v2
1. Change the full boilerplate text to SPDX style.
2. Adopt dynamic allocation for sub-region data structure.
3. Cleanup.

NAND flash memory-based storage devices use a Flash Translation Layer (FTL)
to translate the logical addresses of I/O requests to the corresponding
flash memory addresses. Mobile storage devices typically have RAM of
constrained size and thus lack the memory to keep the whole mapping table.
Therefore, mapping tables are partially retrieved from NAND flash on
demand, causing random-read performance degradation.

To improve random read performance, JESD220-3 (HPB v1.0) proposes HPB
(Host Performance Booster) which uses host system memory as a cache for the
FTL mapping table. By using HPB, FTL data can be read from host memory
faster than from NAND flash memory. 

The current version only supports DCM (device control mode).
This patch series consists of three parts that support the HPB feature.

1) HPB probe and initialization process
2) READ -> HPB READ using cached map information
3) L2P (logical to physical) map management

In the HPB probe and init process, the device information of the UFS is
queried. After checking supported features, the data structure for the HPB
is initialized according to the device information.

A read I/O in the active sub-region where the map is cached is changed to
HPB READ by the HPB.

The HPB manages the L2P map using information r

[PATCH v27 1/4] scsi: ufs: Introduce HPB feature

This is a patch for the HPB initialization and adds HPB function calls to
UFS core driver.

NAND flash-based storage devices, including UFS, have mechanisms to
translate logical addresses of IO requests to the corresponding physical
addresses of the flash storage.
In UFS, Logical-address-to-Physical-address (L2P) map data, which is
required to identify the physical address for the requested IOs, can only
be partially stored in SRAM from NAND flash. Due to this partial loading,
accessing the flash address area where the L2P information for that address
is not loaded in the SRAM can result in serious performance degradation.

The basic concept of HPB is to cache L2P mapping entries in host system
memory so that both the physical block address (PBA) and the logical block
address (LBA) can be delivered in the HPB READ command.
The HPB READ command reads data faster than a regular UFS read command
because it provides the physical address (HPB Entry) of the desired logical
block in addition to its logical address. The UFS device can access the
physical block in NAND directly without searching for and loading the L2P
mapping table. This improves read performance because the NAND read
operation for loading the L2P mapping table is removed.
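
The lookup described above can be sketched in a few lines of C. This is
only an illustration under stated assumptions, not the driver's code: the
names struct l2p_cache, struct read_cmd and build_read() are hypothetical,
and the cache is modelled as a flat array covering one contiguous range of
logical pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side cache of L2P entries for one contiguous range. */
struct l2p_cache {
	uint64_t base_lpn;    /* first logical page covered by the cache */
	size_t nr_entries;    /* number of cached entries */
	const uint64_t *ppn;  /* ppn[i] is the physical page of base_lpn + i */
};

/* Hypothetical command descriptor; a normal read carries only the LPN. */
struct read_cmd {
	bool is_hpb_read;
	uint64_t lpn;
	uint64_t ppn;         /* meaningful only when is_hpb_read is true */
};

/* Build a read command, upgrading it to an HPB read on a cache hit. */
static struct read_cmd build_read(const struct l2p_cache *c, uint64_t lpn)
{
	struct read_cmd cmd = { .is_hpb_read = false, .lpn = lpn, .ppn = 0 };

	if (lpn >= c->base_lpn && lpn - c->base_lpn < c->nr_entries) {
		/* Hit: hand the device the physical page with the request. */
		cmd.is_hpb_read = true;
		cmd.ppn = c->ppn[lpn - c->base_lpn];
	}
	return cmd;
}
```

On a miss the command stays a normal read, so the device falls back to its
own L2P lookup.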

In HPB initialization, the host checks if the UFS device supports HPB
feature and retrieves related device capabilities. Then, some HPB
parameters are configured in the device.

We measured the total start-up time of popular applications and observed
the difference made by enabling HPB.
The popular applications are 12 game apps and 24 non-game apps. Each target
application was launched in order. One cycle consists of running the 36
applications in sequence. We repeated the cycle to observe the performance
improvement from L2P mapping cache hits in HPB.

The following is the experiment environment:
 - kernel version: 4.4.0
 - RAM: 8GB
 - UFS 2.1 (64GB)

Result:
+-------+----------+----------+-------+
| cycle | baseline | with HPB | diff  |
+-------+----------+----------+-------+
| 1     | 272.4    | 264.9    | -7.5  |
| 2     | 250.4    | 248.2    | -2.2  |
| 3     | 226.2    | 215.6    | -10.6 |
| 4     | 230.6    | 214.8    | -15.8 |
| 5     | 232.0    | 218.1    | -13.9 |
| 6     | 231.9    | 212.6    | -19.3 |
+-------+----------+----------+-------+

We also measured HPB performance using iozone.
Here is my iozone script:
iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16
-s $IO_RANGE/16 -F mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5
mnt/tmp_6 mnt/tmp_7 mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12
mnt/tmp_13 mnt/tmp_14 mnt/tmp_15 mnt/tmp_16

Result:
+----------+--------+---------+
| IO range | HPB on | HPB off |
+----------+--------+---------+
|   1 GB   | 294.8  | 300.87  |
|   4 GB   | 293.51 | 179.35  |
|   8 GB   | 294.85 | 162.52  |
|  16 GB   | 293.45 | 156.26  |
|  32 GB   | 277.4  | 153.25  |
+----------+--------+---------+

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Reported-by: kernel test robot 
Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs | 127 +
 drivers/scsi/ufs/Kconfig   |   9 +
 drivers/scsi/ufs/Makefile  |   1 +
 drivers/scsi/ufs/ufs-sysfs.c   |  18 +
 drivers/scsi/ufs/ufs.h |  15 +
 drivers/scsi/ufs/ufshcd.c  |  49 ++
 drivers/scsi/ufs/ufshcd.h  |  22 +
 drivers/scsi/ufs/ufshpb.c  | 569 +
 drivers/scsi/ufs/ufshpb.h  | 167 ++
 9 files changed, 977 insertions(+)
 create mode 100644 drivers/scsi/ufs/ufshpb.c
 create mode 100644 drivers/scsi/ufs/ufshpb.h

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index d1bc23cb6a9d..528bf89fc98b 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1172,3 +1172,130 @@ Description:This node is used to set or display whether UFS WriteBooster is
(if the platform supports UFSHCD_CAP_CLK_SCALING). For a
platform that doesn't support UFSHCD_CAP_CLK_SCALING, we can
disable/enable WriteBooster through this sysfs node.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_version
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the HPB specification version.
+   The full information about the descriptor could be found at UFS
+   HPB (Host Performance Booster) Extension specifications.
+   Example: version 1.2.3 = 0123h
+
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_control
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows an indication of the HPB control mode.
+   00h: Host control mode
+   01h: Device control mode
+
+   The

[PATCH v27 2/4] scsi: ufs: L2P map management for HPB read

This is a patch for managing L2P map in HPB module.

The HPB divides logical addresses into several regions. A region consists
of several sub-regions. The sub-region is the basic unit in which L2P
mapping is managed. The driver loads the L2P mapping data of each
sub-region. A loaded sub-region is said to be in the active state. The HPB
driver unloads L2P mapping data in units of regions. An unloaded region is
said to be in the inactive state.
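
As a rough illustration of this layout, the sketch below splits a logical
page number into a region index, a sub-region index and an offset inside
the sub-region. The struct and function names are hypothetical, and the
geometry values are parameters chosen by the caller; the driver itself
derives them from the device descriptors and may compute the position
differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical geometry; real values come from the UFS device. */
struct hpb_geometry {
	uint32_t entries_per_srgn;  /* L2P entries in one sub-region */
	uint32_t srgns_per_rgn;     /* sub-regions in one region */
};

struct hpb_pos {
	uint32_t rgn_idx;      /* region containing the logical page */
	uint32_t srgn_idx;     /* sub-region inside that region */
	uint32_t srgn_offset;  /* entry offset inside that sub-region */
};

/* Locate the L2P entry of a logical page number within the layout. */
static struct hpb_pos hpb_pos_from_lpn(const struct hpb_geometry *g,
				       uint64_t lpn)
{
	uint64_t entries_per_rgn =
		(uint64_t)g->entries_per_srgn * g->srgns_per_rgn;
	uint64_t in_rgn = lpn % entries_per_rgn;
	struct hpb_pos pos = {
		.rgn_idx = (uint32_t)(lpn / entries_per_rgn),
		.srgn_idx = (uint32_t)(in_rgn / g->entries_per_srgn),
		.srgn_offset = (uint32_t)(in_rgn % g->entries_per_srgn),
	};

	return pos;
}
```

Plain division and modulo are used here for readability; when both
geometry values are powers of two, shifts and masks give the same result.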

Sub-region/region candidates to be loaded and unloaded are delivered from
the UFS device. The UFS device delivers the recommended sub-regions to
activate and regions to inactivate to the driver using sense data.
The HPB module performs L2P mapping management on the host through the
delivered information.

A pinned region is a pre-set region on the UFS device that is always in
the active state.

The data structure for map data request and L2P map uses mempool API,
minimizing allocation overhead while avoiding static allocation.

The minimum size of the memory pool used in the HPB is implemented
as a module parameter, so that it can be configured by the user.

To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096
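
To relate the pool size to the logical range it can cover, the arithmetic
can be sketched as follows. The 8-byte entry size and the 4KB logical
block size are assumptions made for this example, and the helper name is
hypothetical; with those values a 2MB pool covers 1GB, which matches the
comment on the default value in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one cached PPN entry occupies 8 bytes. */
#define ASSUMED_ENTRY_BYTES 8u
/* Assumption: one entry maps a 4KB logical block. */
#define ASSUMED_PAGE_BYTES 4096u

/* Logical range, in bytes, whose entries fit in a pool of map_kbytes KB. */
static uint64_t hpb_covered_bytes(uint32_t map_kbytes)
{
	uint64_t entries = (uint64_t)map_kbytes * 1024u / ASSUMED_ENTRY_BYTES;

	return entries * ASSUMED_PAGE_BYTES;
}
```

Under the same assumptions, ufshpb_host_map_kbytes=4096 would cover a 2GB
logical range.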

The map_work manages active/inactive by 2 "to-do" lists.
Each hpb lun maintains 2 "to-do" lists:
  hpb->lh_inact_rgn - regions to be inactivated, and
  hpb->lh_act_srgn - subregions to be activated
Those lists are maintained on IO completion.

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufs.h|   36 ++
 drivers/scsi/ufs/ufshcd.c |4 +
 drivers/scsi/ufs/ufshpb.c | 1094 -
 drivers/scsi/ufs/ufshpb.h |   65 +++
 4 files changed, 1184 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
index 4eee7e31d08d..bfb84d2ba990 100644
--- a/drivers/scsi/ufs/ufs.h
+++ b/drivers/scsi/ufs/ufs.h
@@ -478,6 +478,41 @@ struct utp_cmd_rsp {
u8 sense_data[UFS_SENSE_SIZE];
 };
 
+struct ufshpb_active_field {
+   __be16 active_rgn;
+   __be16 active_srgn;
+};
+#define HPB_ACT_FIELD_SIZE 4
+
+/**
+ * struct utp_hpb_rsp - Response UPIU structure
+ * @residual_transfer_count: Residual transfer count DW-3
+ * @reserved1: Reserved double words DW-4 to DW-7
+ * @sense_data_len: Sense data length DW-8 U16
+ * @desc_type: Descriptor type of sense data
+ * @additional_len: Additional length of sense data
+ * @hpb_op: HPB operation type
+ * @lun: LUN of response UPIU
+ * @active_rgn_cnt: Active region count
+ * @inactive_rgn_cnt: Inactive region count
+ * @hpb_active_field: Recommended to read HPB region and subregion
+ * @hpb_inactive_field: To be inactivated HPB region and subregion
+ */
+struct utp_hpb_rsp {
+   __be32 residual_transfer_count;
+   __be32 reserved1[4];
+   __be16 sense_data_len;
+   u8 desc_type;
+   u8 additional_len;
+   u8 hpb_op;
+   u8 lun;
+   u8 active_rgn_cnt;
+   u8 inactive_rgn_cnt;
+   struct ufshpb_active_field hpb_active_field[2];
+   __be16 hpb_inactive_field[2];
+};
+#define UTP_HPB_RSP_SIZE 40
+
 /**
  * struct utp_upiu_rsp - general upiu response structure
  * @header: UPIU header structure DW-0 to DW-2
@@ -488,6 +523,7 @@ struct utp_upiu_rsp {
struct utp_upiu_header header;
union {
struct utp_cmd_rsp sr;
+   struct utp_hpb_rsp hr;
struct utp_upiu_query qr;
};
 };
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 70a567ea7d5a..e10984fd8d47 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5021,6 +5021,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct ufshcd_lrb *lrbp)
 */
pm_runtime_get_noresume(hba->dev);
}
+
+   if (scsi_status == SAM_STAT_GOOD)
+   ufshpb_rsp_upiu(hba, lrbp);
break;
case UPIU_TRANSACTION_REJECT_UPIU:
/* TODO: handle Reject UPIU Response */
@@ -9241,6 +9244,7 @@ EXPORT_SYMBOL(ufshcd_shutdown);
 void ufshcd_remove(struct ufs_hba *hba)
 {
ufs_bsg_remove(hba);
+   ufshpb_remove(hba);
ufs_sysfs_remove_nodes(hba->dev);
blk_cleanup_queue(hba->tmf_queue);
blk_mq_free_tag_set(&hba->tmf_tag_set);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 1a72f6541510..489c8b1ac580 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -16,6 +16,16 @@
 #include "ufshpb.h"
 #include "../sd.h"
 
+/* memory management */
+static struct kmem_cache *ufshpb_mctx_cache;
+static mempool_t *ufshpb_mctx_pool;
+static mempool_t *ufshpb_page_pool;
+/* A cache size of 2MB can cache ppn in the 1GB range. */
+static unsigned int ufshpb_host_map_kbytes = 2048;
+static int tot_active_srgn_pages;
+
+static struct workqueue_struct *ufshpb_wq;
+
 bool ufshpb_is_allo

[PATCH v27 3/4] scsi: ufs: Prepare HPB read for cached sub-region

This patch changes the read I/O to the HPB read I/O.

If the logical address of the read I/O belongs to active sub-region, the
HPB driver modifies the read I/O command to HPB read. It modifies the UPIU
command of UFS instead of modifying the existing SCSI command.

In the HPB version 1.0, the maximum read I/O size that can be converted to
HPB read is 4KB.

The dirty map of the active sub-region prevents an incorrect HPB read that
would carry a stale physical page number, one that a previous write I/O
has since updated.
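
The idea behind the dirty map can be shown with a small, self-contained
bitmap. This is a simplified sketch and not the code from the patch: it
covers a single sub-region of a hypothetical, very small size, whereas the
driver walks across sub-region and region boundaries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SRGN_ENTRIES 64u  /* hypothetical, tiny size for illustration */

/* One dirty bit per cached L2P entry of a single sub-region. */
struct srgn_map {
	uint8_t dirty[SRGN_ENTRIES / 8];
};

/* Called for a write or discard: the cached PPNs may now be stale. */
static void srgn_set_dirty(struct srgn_map *m, unsigned int off,
			   unsigned int cnt)
{
	for (unsigned int i = off; i < off + cnt && i < SRGN_ENTRIES; i++)
		m->dirty[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Called for a read: true means at least one entry must not be trusted. */
static bool srgn_test_dirty(const struct srgn_map *m, unsigned int off,
			    unsigned int cnt)
{
	for (unsigned int i = off; i < off + cnt && i < SRGN_ENTRIES; i++)
		if (m->dirty[i / 8] & (1u << (i % 8)))
			return true;
	return false;
}
```

A read whose range tests dirty is sent as a normal read, so the device
resolves the address itself instead of using the stale cached entry.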

Reviewed-by: Can Guo 
Reviewed-by: Bart Van Assche 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufshcd.c |   2 +
 drivers/scsi/ufs/ufshpb.c | 256 +-
 drivers/scsi/ufs/ufshpb.h |   2 +
 3 files changed, 257 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index e10984fd8d47..88dd0f34fa09 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2656,6 +2656,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
 
lrbp->req_abort_skip = false;
 
+   ufshpb_prep(hba, lrbp);
+
ufshcd_comp_scsi_upiu(hba, lrbp);
 
err = ufshcd_map_sg(hba, lrbp);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 489c8b1ac580..201dc24d55b3 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -46,6 +46,29 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int state)
atomic_set(&hpb->hpb_state, state);
 }
 
+static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn,
+   struct ufshpb_subregion *srgn)
+{
+   return rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID;
+}
+
+static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd)
+{
+   return req_op(cmd->request) == REQ_OP_READ;
+}
+
+static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd)
+{
+   return op_is_write(req_op(cmd->request)) ||
+  op_is_discard(req_op(cmd->request));
+}
+
+static bool ufshpb_is_support_chunk(int transfer_len)
+{
+   return transfer_len <= HPB_MULTI_CHUNK_HIGH;
+}
+
 static bool ufshpb_is_general_lun(int lun)
 {
return lun < UFS_UPIU_MAX_UNIT_NUM_ID;
@@ -80,8 +103,8 @@ static void ufshpb_kick_map_work(struct ufshpb_lu *hpb)
 }
 
 static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
-struct ufshcd_lrb *lrbp,
-struct utp_hpb_rsp *rsp_field)
+   struct ufshcd_lrb *lrbp,
+   struct utp_hpb_rsp *rsp_field)
 {
/* Check HPB_UPDATE_ALERT */
if (!(lrbp->ucd_rsp_ptr->header.dword_2 &
@@ -107,6 +130,233 @@ static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
return true;
 }
 
+static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int set_bit_len;
+   int bitmap_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if ((srgn_offset + cnt) > bitmap_len)
+   set_bit_len = bitmap_len - srgn_offset;
+   else
+   set_bit_len = cnt;
+
+   if (rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID)
+   bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len);
+
+   srgn_offset = 0;
+   if (++srgn_idx == hpb->srgns_per_rgn) {
+   srgn_idx = 0;
+   rgn_idx++;
+   }
+
+   cnt -= set_bit_len;
+   if (cnt > 0)
+   goto next_srgn;
+}
+
+static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+ int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int bitmap_len;
+   int bit_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if (!ufshpb_is_valid_srgn(rgn, srgn))
+   return true;
+
+   /*
+* If the region state is active, mctx must be allocated.
+* In this case, check whether the region is evicted or
+* mctx allcation fail.
+*/
+   if (unlikely(!srgn->mctx)) {
+   dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+   "no mctx in region %d subregion %d.\n",
+   srgn->rgn_idx, srgn->srgn_idx);
+   return true;

[PATCH net,v2] net: dsa: mt7530: setup core clock even in TRGMII mode

2021-03-12 Thread Ilya Lipnitskiy

A recent change to MIPS ralink reset logic made it so mt7530 actually
resets the switch on platforms such as mt7621 (where bit 2 is the reset
line for the switch). That exposed an issue where the switch would not
function properly in TRGMII mode after a reset.

Reconfigure core clock in TRGMII mode to fix the issue.

Tested on Ubiquiti ER-X (MT7621) with TRGMII mode enabled.

Fixes: 3f9ef7785a9c ("MIPS: ralink: manage low reset lines")
Signed-off-by: Ilya Lipnitskiy 
---
 drivers/net/dsa/mt7530.c | 52 +++-
 1 file changed, 25 insertions(+), 27 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index f06f5fa2f898..9871d7cff93a 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -436,34 +436,32 @@ mt7530_pad_clk_setup(struct dsa_switch *ds, phy_interface_t interface)
 TD_DM_DRVP(8) | TD_DM_DRVN(8));
 
/* Setup core clock for MT7530 */
-   if (!trgint) {
-   /* Disable MT7530 core clock */
-   core_clear(priv, CORE_TRGMII_GSW_CLK_CG, REG_GSWCK_EN);
-
-   /* Disable PLL, since phy_device has not yet been created
-* provided for phy_[read,write]_mmd_indirect is called, we
-* provide our own core_write_mmd_indirect to complete this
-* function.
-*/
-   core_write_mmd_indirect(priv,
-   CORE_GSWPLL_GRP1,
-   MDIO_MMD_VEND2,
-   0);
-
-   /* Set core clock into 500Mhz */
-   core_write(priv, CORE_GSWPLL_GRP2,
-  RG_GSWPLL_POSDIV_500M(1) |
-  RG_GSWPLL_FBKDIV_500M(25));
+   /* Disable MT7530 core clock */
+   core_clear(priv, CORE_TRGMII_GSW_CLK_CG, REG_GSWCK_EN);
 
-   /* Enable PLL */
-   core_write(priv, CORE_GSWPLL_GRP1,
-  RG_GSWPLL_EN_PRE |
-  RG_GSWPLL_POSDIV_200M(2) |
-  RG_GSWPLL_FBKDIV_200M(32));
-
-   /* Enable MT7530 core clock */
-   core_set(priv, CORE_TRGMII_GSW_CLK_CG, REG_GSWCK_EN);
-   }
+   /* Disable PLL, since phy_device has not yet been created
+* provided for phy_[read,write]_mmd_indirect is called, we
+* provide our own core_write_mmd_indirect to complete this
+* function.
+*/
+   core_write_mmd_indirect(priv,
+   CORE_GSWPLL_GRP1,
+   MDIO_MMD_VEND2,
+   0);
+
+   /* Set core clock into 500Mhz */
+   core_write(priv, CORE_GSWPLL_GRP2,
+  RG_GSWPLL_POSDIV_500M(1) |
+  RG_GSWPLL_FBKDIV_500M(25));
+
+   /* Enable PLL */
+   core_write(priv, CORE_GSWPLL_GRP1,
+  RG_GSWPLL_EN_PRE |
+  RG_GSWPLL_POSDIV_200M(2) |
+  RG_GSWPLL_FBKDIV_200M(32));
+
+   /* Enable MT7530 core clock */
+   core_set(priv, CORE_TRGMII_GSW_CLK_CG, REG_GSWCK_EN);
 
/* Setup the MT7530 TRGMII Tx Clock */
core_set(priv, CORE_TRGMII_GSW_CLK_CG, REG_GSWCK_EN);
-- 
2.30.2

[PATCH v27 4/4] scsi: ufs: Add HPB 2.0 support

This patch adds support for HPB 2.0.

HPB 2.0 supports reads of varying sizes from 4KB to 512KB.
A read of 32KB or less is issued as a single HPB read.
A read of 36KB to 512KB is issued as a combination of a write buffer
command and an HPB read command to deliver more PPNs.
The write buffer commands may not be issued immediately due to busy tags.
To use HPB read more aggressively, the driver can requeue the write buffer
command. The requeue threshold is implemented as timeout and can be
modified with requeue_timeout_ms entry in sysfs.
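
The size rule described above can be summarised in a short sketch. The
enum and function names are hypothetical, and the thresholds are taken
directly from this description; a real device may report a different
single-command limit, so treat the constants as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the paths described in the text above. */
enum hpb_read_path {
	HPB_PATH_NORMAL_READ,  /* outside the 4KB to 512KB range */
	HPB_PATH_SINGLE,       /* one HPB read command */
	HPB_PATH_PRE_REQ,      /* write buffer command, then HPB read */
};

/* Classify a read by its transfer length in KB. */
static enum hpb_read_path hpb2_classify(uint32_t len_kb)
{
	if (len_kb < 4 || len_kb > 512)
		return HPB_PATH_NORMAL_READ;
	if (len_kb <= 32)
		return HPB_PATH_SINGLE;
	return HPB_PATH_PRE_REQ;
}
```

The sketch leaves out the requeue timeout, which only affects when a
pending write buffer command is retried and not which path a read takes.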

Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs |  47 +-
 drivers/scsi/ufs/ufs-sysfs.c   |   4 +
 drivers/scsi/ufs/ufs.h |   3 +-
 drivers/scsi/ufs/ufshcd.c  |  25 +-
 drivers/scsi/ufs/ufshcd.h  |   7 +
 drivers/scsi/ufs/ufshpb.c  | 627 +++--
 drivers/scsi/ufs/ufshpb.h  |  66 ++-
 7 files changed, 699 insertions(+), 80 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs b/Documentation/ABI/testing/sysfs-driver-ufs
index 528bf89fc98b..419adf450b89 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1253,14 +1253,14 @@ Description:This entry shows the number of HPB pinned regions assigned to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/hit_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/hit_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that changed to HPB read.
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/miss_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/miss_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that cannot be changed to
@@ -1268,7 +1268,7 @@ Description:  This entry shows the number of reads that cannot be changed to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_noti_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_noti_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of response UPIUs that has
@@ -1276,7 +1276,7 @@ Description:  This entry shows the number of response UPIUs that has
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_active_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_active_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of active sub-regions recommended by
@@ -1284,7 +1284,7 @@ Description:  This entry shows the number of active sub-regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_inactive_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_inactive_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of inactive regions recommended by
@@ -1292,10 +1292,45 @@ Description:This entry shows the number of inactive regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/map_req_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/map_req_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of read buffer commands for
activating sub-regions recommended by response UPIUs.
 
The file is read only.
+
+What:  /sys/class/scsi_device/*/device/hpb_param_sysfs/requeue_timeout_ms
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the requeue timeout threshold for write buffer
+   command in ms. This value can be changed by writing proper integer to
+   this entry.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/attributes/max_data_size_hpb_single_cmd
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the maximum HPB data size for using single HPB
+   command.
+
+   ===  
+   00h  4KB
+   01h  8KB
+   02h  12KB
+   ...
+   FFh  1024KB
+   ===  
+
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/flags/wb_enable
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the status of HPB.
+
+   == 
+   0  HPB is not enabled.
+   1  HPB is enabled
+   == 
+
+

Re: [PATCH v7] i2c: virtio: add a virtio i2c frontend driver

2021-03-12 Thread Viresh Kumar

On 12-03-21, 15:51, Jie Deng wrote:
> 
> On 2021/3/12 14:10, Viresh Kumar wrote:
> > I saw your email about wrong version being sent, I already wrote some
> > reviews. Sending them anyway for FWIW :)
> > 
> > On 12-03-21, 21:33, Jie Deng wrote:
> > > +struct virtio_i2c {
> > > + struct virtio_device *vdev;
> > > + struct completion completion;
> > > + struct i2c_adapter adap;
> > > + struct mutex lock;
> > As I said in the previous version (Yes, we were both waiting for
> > Wolfram to answer that), this lock shouldn't be required at all.
> > 
> > And since none of us have a use-case at hand where we will have a
> > problem without this lock, we should really skip it. We can always
> > come back and add it if we find an issue somewhere. Until then, better
> > to keep it simple.

> The problem is you can't guarantee that adap->algo->master_xfer
> is only called from i2c_transfer. Any function who holds the adapter can
> call
> adap->algo->master_xfer directly.

See my last reply here, (almost) no one in the mainline kernel calls it
directly. And perhaps you can consider the caller broken in that case
and so there is no need of an extra lock, unless you have a case that
is broken.

https://lore.kernel.org/lkml/20210305072903.wtw645rukmqr5hx5@vireshk-i7/

> I prefer to avoid potential issues rather
> than
> find a issue then fix.

This is a very hypothetical issue IMHO as the kernel code doesn't have
such a user. There is no need of locks here, else the i2c core won't
have handled it by itself.

> > 
> > > +
> > > +static struct i2c_adapter virtio_adapter = {
> > > + .owner = THIS_MODULE,
> > > + .name = "Virtio I2C Adapter",
> > > + .class = I2C_CLASS_DEPRECATED,
> > What happened to this discussion ?
> > 
> > https://lore.kernel.org/lkml/20210305072903.wtw645rukmqr5hx5@vireshk-i7/
> 
> My understanding is that new driver sets this to warn users that the adapter
> doesn't support classes anymore.

I think the warning is relevant for old drivers who used to support
classes and not for new ones.

> I'm not sure if Wolfram can explain it more clear for you.

Okay, lemme dig in a bit then.

$ git grep -l i2c_add_adapter drivers/i2c/busses/ | wc -l
77

$ git grep -l I2C_CLASS_DEPRECATED drivers/i2c/busses/
drivers/i2c/busses/i2c-at91-core.c
drivers/i2c/busses/i2c-bcm2835.c
drivers/i2c/busses/i2c-davinci.c
drivers/i2c/busses/i2c-designware-platdrv.c
drivers/i2c/busses/i2c-mv64xxx.c
drivers/i2c/busses/i2c-nomadik.c
drivers/i2c/busses/i2c-ocores.c
drivers/i2c/busses/i2c-omap.c
drivers/i2c/busses/i2c-rcar.c
drivers/i2c/busses/i2c-s3c2410.c
drivers/i2c/busses/i2c-tegra.c
drivers/i2c/busses/i2c-virtio.c
drivers/i2c/busses/i2c-xiic.c

i.e. only 13 of 77 drivers are using this flag.

The latest addition among these drivers is i2c-bcm2835.c and it was
added back in 2013 and the flag was added to it in 2014:

commit 37888f71e2c9 ("i2c: i2c-bcm2835: deprecate class based instantiation")

FWIW, I also checked all the new drivers added since kernel release
v4.0 and none of them set this flag.

-- 
viresh

Re: [PATCH v5 2/2] tty/serial: Add rx-tx-swap OF option to stm32-usart

2021-03-12 Thread Greg Kroah-Hartman

On Thu, Mar 11, 2021 at 10:51:53PM +0100, Martin Devera wrote:
> STM32 F7/H7 usarts supports RX & TX pin swapping.
> Add option to turn it on.
> Tested on STM32MP157.
> 
> Signed-off-by: Martin Devera 
> Acked-by: Fabrice Gasnier 
> ---
>  drivers/tty/serial/stm32-usart.c | 11 ++-
>  drivers/tty/serial/stm32-usart.h |  4 
>  2 files changed, 14 insertions(+), 1 deletion(-)

What changed from v4-v1 on this patch series?  That needs to go below
the --- line as documented.

Please fix up and send v6.

thanks,

greg k-h

Re: [PATCH v18 4/9] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page

2021-03-12 Thread Michal Hocko

On Thu 11-03-21 14:53:08, Mike Kravetz wrote:
> On 3/11/21 9:59 AM, Mike Kravetz wrote:
> > On 3/11/21 4:17 AM, Michal Hocko wrote:
> >>> Yeah per cpu preempt counting shouldn't be noticeable but I have to
> >>> confess I haven't benchmarked it.
> >>
> >> But all this seems moot now 
> >> http://lkml.kernel.org/r/yeoa08n60+jzs...@hirez.programming.kicks-ass.net
> >>
> > 
> > The proper fix for free_huge_page independent of this series would
> > involve:
> > 
> > - Make hugetlb_lock and subpool lock irq safe
> > - Hand off freeing to a workque if the freeing could sleep
> > 
> > Today, the only time we can sleep in free_huge_page is for gigantic
> > pages allocated via cma.  I 'think' the concern about undesirable
> > user visible side effects in this case is minimal as freeing/allocating
> > 1G pages is not something that is going to happen at a high frequency.
> > My thinking could be wrong?
> > 
> > Of more concern, is the introduction of this series.  If this feature
> > is enabled, then ALL free_huge_page requests must be sent to a workqueue.
> > Any ideas on how to address this?
> > 
> 
> Thinking about this more ...
> 
> A call to free_huge_page has two distinct outcomes
> 1) Page is freed back to the original allocator: buddy or cma
> 2) Page is put on hugetlb free list
> 
> We can only possibly sleep in case 1.  In addition, freeing a
> page back to the original allocator involves these steps:
> 1) Removing page from hugetlb lists
> 2) Updating hugetlb counts: nr_hugepages, surplus
> 3) Updating page fields
> 4) Allocate vmemmap pages if needed as in this series
> 5) Calling free routine of original allocator
> 
> If hugetlb_lock is irq safe, we can perform the first 3 steps under that
> lock without issue.  We would then use a workqueue to perform the last
> two steps.  Since we are updating hugetlb user visible data under the
> lock, there should be no delays.  Of course, giving those pages back to
> the original allocator could still be delayed, and a user may notice
> that.  Not sure if that would be acceptable?

Well, having many in-flight huge pages can certainly be visible. Say you
are freeing hundreds of huge pages and your echo n > nr_hugepages will
return just for you to find out that the memory hasn't been freed and
therefore cannot be reused for another use - recently there was somebody
mentioning their usecase to free up huge pages to prevent OOM for
example. I do expect more people doing something like that.

Now, nr_hugepages can be handled by blocking on the same WQ until all
pre-existing items are processed. Maybe we will need to have a more
generic API to achieve the same for in kernel users but let's wait for
those requests.

> I think Muchun had a
> similar setup just for vmemmap allocation in an early version of this
> series.
> 
> This would also require changes to where accounting is done in
> dissolve_free_huge_page and update_and_free_page as mentioned elsewhere.

Normalizing dissolve_free_huge_page is definitely a good idea. It is
really tricky how it sticks out and does half of the job of
update_and_free_page.

That being said, if it is possible to have a fully consistent h state
before handing over to WQ for sleeping operation then we should be all
fine. I am slightly worried about potential tricky situations where the
sleeping operation fails because that would require that page to be
added back to the pool again. As said above we would need some sort of
sync with in-flight operations before returning to the userspace.

-- 
Michal Hocko
SUSE Labs

[PATCH v28 0/4] scsi: ufs: Add Host Performance Booster Support

Changelog:

v27 -> v28
1. Fix wrong return value of ufshpb_prep.

v26 -> v27
1. Fix wrong reference of sense buffer in pre_req complete function.
2. Fix read_id error.
3. Fix chunk size checking for HPB 1.0.
4. Mute unnecessary messages before HPB initialization.

v25 -> v26
1. Fix wrong chunk size checking for HPB 1.0.
2. Fix wrong max data size for HPB single command.
3. Fix typo error.

v24 -> v25
1. Change write buffer API for unmap region.
2. Add checking hpb_enable for avoiding unnecessary memory allocation.
3. Change pr_info to dev_info.
4. Change default requeue timeout value for HPB read.
5. Fix wrong offset manipulation on ufshpb_prep_entry.

v23 -> v24
1. Fix build error reported by kernel test robot.

v22 -> v23
1. Add support compatibility of HPB 1.0.
2. Fix read id for single HPB read command.
3. Fix number of pre-allocated requests for write buffer.
4. Add fast path for response UPIU that has same LUN in sense data.
5. Remove WARN_ON for preventing kernel crash.
7. Fix wrong argument for read buffer command.

v21 -> v22
1. Add support processing response UPIU in suspend state.
2. Add support HPB hint from other LU.
3. Add sending write buffer with 0x03 after HPB init.

v20 -> v21
1. Add bMAX_DATA_SIZE_FOR_HPB_SINGLE_CMD attr. and fHPBen flag support.

v19 -> v20
1. Add documentation for sysfs entries of hpb->stat.
2. Fix read buffer command for under-sized sub-region.
3. Fix wrong condition checking for kick map work.
4. Delete redundant response UPIU checking.
5. Add LUN checking in response UPIU.
6. Fix possible deadlock problem due to runtime PM.
7. Add instant changing of sub-region state from response UPIU.
8. Fix endian problem in prefetched PPN.
9. Add JESD220-3A (HPB v2.0) support.

v18 -> v19
1. Fix null pointer error when printing sysfs from non-HPB LU.
2. Apply HPB read opcode in lrbp->cmd->cmnd (from Can Guo's review).
3. Rebase the patch on 5.12/scsi-queue.

v17 -> v18
Fix build error reported by kernel test robot.

v16 -> v17
1. Rename hpb_state_lock to rgn_state_lock and move it to corresponding
patch.
2. Remove redundant information messages.

v15 -> v16
1. Add missed sysfs ABI documentation.

v14 -> v15
1. Remove duplicated sysfs ABI entries in documentation.
2. Add experiment result of HPB performance testing with iozone.

v13 -> v14
1. Clean up code as commented in Greg's review.
2. Add documentation for sysfs entries (from Greg's review).
3. Add experiment result of HPB performance testing.

v12 -> v13
1. Clean up code per comments from Can Guo.
2. Add HPB related descriptor/flag/attributes in sysfs.
3. Change base commit from 5.10/scsi-queue to 5.11/scsi-queue.

v11 -> v12
1. Fixed to return error value when HPB fails to initialize pinned active 
region.
2. Fixed to disable HPB feature if HPB fails to allocate essential memory
and workqueue.
3. Fixed to change proper sub-region state when region is already evicted.

v10 -> v11
Add a newline at the end of the last line of the Kconfig file.

v9 -> v10
1. Fixed 64-bit division error
2. Fixed problems commented in Bart's review.

v8 -> v9
1. Change sysfs initialization.
2. Change reading descriptor during HPB initialization
3. Fixed problems commented in Bart's review.
4. Change base commit from 5.9/scsi-queue to 5.10/scsi-queue.

v7 -> v8
Remove wrongly added tags.

v6 -> v7
1. Remove UFS feature layer.
2. Cleanup for sparse error.

v5 -> v6
Change base commit to b53293fa662e28ae0cdd40828dc641c09f133405

v4 -> v5
Delete unused macro define.

v3 -> v4
1. Cleanup.

v2 -> v3
1. Add checking input module parameter value.
2. Change base commit from 5.8/scsi-queue to 5.9/scsi-queue.
3. Cleanup for unused variables and label.

v1 -> v2
1. Change the full boilerplate text to SPDX style.
2. Adopt dynamic allocation for sub-region data structure.
3. Cleanup.

NAND flash memory-based storage devices use Flash Translation Layer (FTL)
to translate logical addresses of I/O requests to corresponding flash
memory addresses. Mobile storage devices typically have RAM with
constrained size, and thus lack the memory to keep the whole mapping table.
Therefore, mapping tables are partially retrieved from NAND flash on
demand, causing random-read performance degradation.

To improve random read performance, JESD220-3 (HPB v1.0) proposes HPB
(Host Performance Booster) which uses host system memory as a cache for the
FTL mapping table. By using HPB, FTL data can be read from host memory
faster than from NAND flash memory. 

The current version only supports the DCM (device control mode).
This patch consists of 3 parts to support HPB feature.

1) HPB probe and initialization process
2) READ -> HPB READ using cached map information
3) L2P (logical to physical) map management

In the HPB probe and init process, the device information of the UFS is
queried. After checking supported features, the data structure for the HPB
is initialized according to the device information.

A read I/O in the active sub-region where the map is cached is changed to
HPB READ by the

Re: [PATCH net-next v5 2/2] net: Add Qcom WWAN control driver

2021-03-12 Thread Greg KH

On Thu, Mar 11, 2021 at 09:41:04PM +0100, Loic Poulain wrote:
> The MHI WWAN control driver allows MHI QCOM-based modems to expose
> different modem control protocols/ports to userspace, so that userspace
> modem tools or daemon (e.g. ModemManager) can control WWAN config
> and state (APN config, SMS, provider selection...). A QCOM-based
> modem can expose one or several of the following protocols:
> - AT: Well known AT commands interactive protocol (microcom, minicom...)
> - MBIM: Mobile Broadband Interface Model (libmbim, mbimcli)
> - QMI: QCOM MSM/Modem Interface (libqmi, qmicli)
> - QCDM: QCOM Modem diagnostic interface (libqcdm)
> - FIREHOSE: XML-based protocol for Modem firmware management
> (qmi-firmware-update)
> 
> The different interfaces are exposed as character devices through the
> WWAN subsystem, in the same way as for USB modem variants.
> 
> Note that this patch is mostly a rework of the earlier MHI UCI
> tentative that was a generic interface for accessing MHI bus from
> userspace. As suggested, this new version is WWAN specific and is
> dedicated to only expose channels used for controlling a modem, and
> for which related opensource user support exist. Other MHI channels
> not fitting the requirements will request either to be plugged to
> the right Linux subsystem (when available) or to be discussed as a
> new MHI driver (e.g AI accelerator, WiFi debug channels, etc...).
> 
> Signed-off-by: Loic Poulain 
> ---
>  v2: update copyright (2021)
>  v3: Move driver to dedicated drivers/net/wwan directory
>  v4: Rework to use wwan framework instead of self cdev management
>  v5: Fix errors/typos in Kconfig
> 
>  drivers/net/wwan/Kconfig |  14 ++
>  drivers/net/wwan/Makefile|   1 +
>  drivers/net/wwan/mhi_wwan_ctrl.c | 497 +++
>  3 files changed, 512 insertions(+)
>  create mode 100644 drivers/net/wwan/mhi_wwan_ctrl.c
> 
> diff --git a/drivers/net/wwan/Kconfig b/drivers/net/wwan/Kconfig
> index 545fe54..ce0bbfb 100644
> --- a/drivers/net/wwan/Kconfig
> +++ b/drivers/net/wwan/Kconfig
> @@ -19,4 +19,18 @@ config WWAN_CORE
> To compile this driver as a module, choose M here: the module will be
> called wwan.
>  
> +config MHI_WWAN_CTRL
> + tristate "MHI WWAN control driver for QCOM-based PCIe modems"
> + select WWAN_CORE
> + depends on MHI_BUS
> + help
> +   MHI WWAN CTRL allows QCOM-based PCIe modems to expose different modem
> +   control protocols/ports to userspace, including AT, MBIM, QMI, DIAG
> +   and FIREHOSE. These protocols can be accessed directly from userspace
> +   (e.g. AT commands) or via libraries/tools (e.g. libmbim, libqmi,
> +   libqcdm...).
> +
> +   To compile this driver as a module, choose M here: the module will be
> +   called mhi_wwan_ctrl
> +
>  endif # WWAN
> diff --git a/drivers/net/wwan/Makefile b/drivers/net/wwan/Makefile
> index ca8bb5a..e18ecda 100644
> --- a/drivers/net/wwan/Makefile
> +++ b/drivers/net/wwan/Makefile
> @@ -6,3 +6,4 @@
>  obj-$(CONFIG_WWAN_CORE) += wwan.o
>  wwan-objs += wwan_core.o wwan_port.o
>  
> +obj-$(CONFIG_MHI_WWAN_CTRL) += mhi_wwan_ctrl.o
> diff --git a/drivers/net/wwan/mhi_wwan_ctrl.c 
> b/drivers/net/wwan/mhi_wwan_ctrl.c
> new file mode 100644
> index 000..abda4b0
> --- /dev/null
> +++ b/drivers/net/wwan/mhi_wwan_ctrl.c
> @@ -0,0 +1,497 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Copyright (c) 2018-2021, The Linux Foundation. All rights reserved.*/
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "wwan_core.h"
> +
> +#define MHI_WWAN_CTRL_DRIVER_NAME "mhi_wwan_ctrl"
> +#define MHI_WWAN_CTRL_MAX_MINORS 128
> +#define MHI_WWAN_MAX_MTU 0x8000
> +
> +/* MHI wwan device flags */
> +#define MHI_WWAN_DL_CAP  BIT(0)
> +#define MHI_WWAN_UL_CAP  BIT(1)
> +#define MHI_WWAN_CONNECTED   BIT(2)
> +
> +struct mhi_wwan_buf {
> + struct list_head node;
> + void *data;
> + size_t len;
> + size_t consumed;
> +};
> +
> +struct mhi_wwan_dev {
> + unsigned int minor;

You never use this, why is it here?

{sigh}

Who reviewed this series before sending it out?

greg k-h

[PATCH v28 1/4] scsi: ufs: Introduce HPB feature

This is a patch for the HPB initialization and adds HPB function calls to
UFS core driver.

NAND flash-based storage devices, including UFS, have mechanisms to
translate logical addresses of IO requests to the corresponding physical
addresses of the flash storage.
In UFS, Logical-address-to-Physical-address (L2P) map data, which is
required to identify the physical address for the requested IOs, can only
be partially stored in SRAM from NAND flash. Due to this partial loading,
accessing the flash address area where the L2P information for that address
is not loaded in the SRAM can result in serious performance degradation.

The basic concept of HPB is to cache L2P mapping entries in host system
memory so that both physical block address (PBA) and logical block address
(LBA) can be delivered in HPB read command.
The HPB READ command allows data to be read faster than a read command in UFS
since it provides the physical address (HPB Entry) of the desired logical
block in addition to its logical address. The UFS device can access the
physical block in NAND directly without searching and uploading L2P mapping
table. This improves read performance because the NAND read operation for
uploading L2P mapping table is removed.
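
The lookup that makes an HPB READ possible can be sketched roughly as follows. This is only an illustration of the idea described above, not the driver's code: `struct l2p_cache`, `l2p_cache_lookup()` and the flat array layout are hypothetical names and simplifications introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side cache of L2P entries for one contiguous LBA window. */
struct l2p_cache {
	uint64_t first_lba;   /* first logical block covered by the cache */
	size_t nr_entries;    /* number of cached entries */
	const uint64_t *ppn;  /* ppn[i] maps logical block first_lba + i */
};

/*
 * Look up the physical page number for @lba.
 * On a hit, store the entry in @ppn_out and return true, so the caller
 * could send both the LBA and the physical address to the device.
 * On a miss, return false; the caller would fall back to a normal READ.
 */
static bool l2p_cache_lookup(const struct l2p_cache *c, uint64_t lba,
			     uint64_t *ppn_out)
{
	assert(c != NULL && ppn_out != NULL);

	if (lba < c->first_lba || lba - c->first_lba >= c->nr_entries)
		return false;

	*ppn_out = c->ppn[lba - c->first_lba];
	return true;
}
```

With a three-entry table starting at LBA 10, a lookup of LBA 11 returns the second entry, while LBA 9 and LBA 13 miss and would be served by an ordinary read.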

In HPB initialization, the host checks if the UFS device supports HPB
feature and retrieves related device capabilities. Then, some HPB
parameters are configured in the device.

We measured the total start-up time of popular applications and observed
the difference by enabling the HPB.
Popular applications are 12 game apps and 24 non-game apps. Each target
application was launched in order. The cycle consists of running 36
applications in sequence. We repeated the cycle for observing performance
improvement by L2P mapping cache hit in HPB.

The following is the experiment environment:
 - kernel version: 4.4.0
 - RAM: 8GB
 - UFS 2.1 (64GB)

Result:
+-------+----------+----------+-------+
| cycle | baseline | with HPB | diff  |
+-------+----------+----------+-------+
| 1     | 272.4    | 264.9    | -7.5  |
| 2     | 250.4    | 248.2    | -2.2  |
| 3     | 226.2    | 215.6    | -10.6 |
| 4     | 230.6    | 214.8    | -15.8 |
| 5     | 232.0    | 218.1    | -13.9 |
| 6     | 231.9    | 212.6    | -19.3 |
+-------+----------+----------+-------+

We also measured HPB performance using iozone.
Here is my iozone script:
iozone -r 4k -+n -i2 -ecI -t 16 -l 16 -u 16
-s $IO_RANGE/16 -F mnt/tmp_1 mnt/tmp_2 mnt/tmp_3 mnt/tmp_4 mnt/tmp_5
mnt/tmp_6 mnt/tmp_7 mnt/tmp_8 mnt/tmp_9 mnt/tmp_10 mnt/tmp_11 mnt/tmp_12
mnt/tmp_13 mnt/tmp_14 mnt/tmp_15 mnt/tmp_16

Result:
+----------+--------+---------+
| IO range | HPB on | HPB off |
+----------+--------+---------+
|   1 GB   | 294.8  | 300.87  |
|   4 GB   | 293.51 | 179.35  |
|   8 GB   | 294.85 | 162.52  |
|  16 GB   | 293.45 | 156.26  |
|  32 GB   | 277.4  | 153.25  |
+----------+--------+---------+

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Reported-by: kernel test robot 
Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs | 127 +
 drivers/scsi/ufs/Kconfig   |   9 +
 drivers/scsi/ufs/Makefile  |   1 +
 drivers/scsi/ufs/ufs-sysfs.c   |  18 +
 drivers/scsi/ufs/ufs.h |  15 +
 drivers/scsi/ufs/ufshcd.c  |  49 ++
 drivers/scsi/ufs/ufshcd.h  |  22 +
 drivers/scsi/ufs/ufshpb.c  | 569 +
 drivers/scsi/ufs/ufshpb.h  | 167 ++
 9 files changed, 977 insertions(+)
 create mode 100644 drivers/scsi/ufs/ufshpb.c
 create mode 100644 drivers/scsi/ufs/ufshpb.h

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs 
b/Documentation/ABI/testing/sysfs-driver-ufs
index d1bc23cb6a9d..528bf89fc98b 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1172,3 +1172,130 @@ Description:This node is used to set or display 
whether UFS WriteBooster is
(if the platform supports UFSHCD_CAP_CLK_SCALING). For a
platform that doesn't support UFSHCD_CAP_CLK_SCALING, we can
disable/enable WriteBooster through this sysfs node.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_version
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the HPB specification version.
+   The full information about the descriptor could be found at UFS
+   HPB (Host Performance Booster) Extension specifications.
+   Example: version 1.2.3 = 0123h
+
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/device_descriptor/hpb_control
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows an indication of the HPB control mode.
+   00h: Host control mode
+   01h: Device control mode
+
+   The

[PATCH v28 2/4] scsi: ufs: L2P map management for HPB read

This is a patch for managing L2P map in HPB module.

The HPB divides logical addresses into several regions. A region consists
of several sub-regions. The sub-region is a basic unit where L2P mapping is
managed. The driver loads L2P mapping data of each sub-region. A loaded
sub-region is said to be in the active state. The HPB driver unloads L2P
mapping data in units of regions. An unloaded region is said to be in the
inactive state.
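
As a rough sketch of how a logical page could be located in this region/sub-region layout: the helper below is illustrative only. `struct hpb_pos` and `hpb_locate()` are hypothetical names, and the two size parameters merely mirror the `entries_per_srgn` and `srgns_per_rgn` fields that appear in the patches; their real values come from the device.

```c
#include <assert.h>
#include <stdint.h>

/* Position of one logical page inside the region / sub-region layout. */
struct hpb_pos {
	uint32_t rgn_idx;      /* index of the region */
	uint32_t srgn_idx;     /* index of the sub-region within that region */
	uint32_t srgn_offset;  /* entry offset within that sub-region */
};

/*
 * Split a logical page number into region index, sub-region index and
 * offset, given how many map entries one sub-region holds and how many
 * sub-regions make up one region.
 */
static struct hpb_pos hpb_locate(uint64_t lpn, uint32_t entries_per_srgn,
				 uint32_t srgns_per_rgn)
{
	struct hpb_pos pos;
	uint64_t srgn_global;

	assert(entries_per_srgn > 0 && srgns_per_rgn > 0);

	srgn_global = lpn / entries_per_srgn;  /* sub-region counted from 0 */
	pos.rgn_idx = (uint32_t)(srgn_global / srgns_per_rgn);
	pos.srgn_idx = (uint32_t)(srgn_global % srgns_per_rgn);
	pos.srgn_offset = (uint32_t)(lpn % entries_per_srgn);
	return pos;
}
```

For example, with 4096 entries per sub-region and 2 sub-regions per region (values chosen only for illustration), logical page 12293 falls in region 1, sub-region 1, at offset 5.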

Sub-region/region candidates to be loaded and unloaded are delivered from
the UFS device. The UFS device delivers the recommended active sub-region
and inactive region to the driver using sense data.
The HPB module performs L2P mapping management on the host through the
delivered information.

A pinned region is a pre-set region on the UFS device that is always in
the active state.

The data structure for map data request and L2P map uses mempool API,
minimizing allocation overhead while avoiding static allocation.

The minimum size of the memory pool used in the HPB is implemented
as a module parameter, so that it can be configurable by the user.

To guarantee a minimum memory pool size of 4MB: ufshpb_host_map_kbytes=4096
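
The relation between the pool size and the logical range it can cover can be worked out with a little arithmetic. The sketch below assumes one 8-byte map entry per 4KB logical page; both constants and both function names are assumptions made for illustration, chosen so that the result agrees with the code comment in this patch stating that a 2MB cache covers a 1GB range.

```c
#include <assert.h>
#include <stdint.h>

#define HPB_ENTRY_BYTES 8u     /* assumed size of one cached L2P entry */
#define HPB_PAGE_BYTES  4096u  /* assumed logical page covered per entry */

/* Bytes of host memory needed to cache the map of @range_bytes of storage. */
static uint64_t hpb_map_bytes_for_range(uint64_t range_bytes)
{
	assert(range_bytes % HPB_PAGE_BYTES == 0);
	return range_bytes / HPB_PAGE_BYTES * HPB_ENTRY_BYTES;
}

/* Logical range, in bytes, that a pool of @map_kbytes KB can cover. */
static uint64_t hpb_range_for_map_kbytes(uint64_t map_kbytes)
{
	return map_kbytes * 1024u / HPB_ENTRY_BYTES * HPB_PAGE_BYTES;
}
```

Under these assumptions, 1GB of storage needs 262144 entries, or 2MB of map data, and a pool of 4096 KB would cover a 2GB range.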

The map_work manages active/inactive by 2 "to-do" lists.
Each hpb lun maintains 2 "to-do" lists:
  hpb->lh_inact_rgn - regions to be inactivated, and
  hpb->lh_act_srgn - subregions to be activated
Those lists are maintained on IO completion.

Reviewed-by: Bart Van Assche 
Reviewed-by: Can Guo 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufs.h|   36 ++
 drivers/scsi/ufs/ufshcd.c |4 +
 drivers/scsi/ufs/ufshpb.c | 1094 -
 drivers/scsi/ufs/ufshpb.h |   65 +++
 4 files changed, 1184 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/ufs/ufs.h b/drivers/scsi/ufs/ufs.h
index 4eee7e31d08d..bfb84d2ba990 100644
--- a/drivers/scsi/ufs/ufs.h
+++ b/drivers/scsi/ufs/ufs.h
@@ -478,6 +478,41 @@ struct utp_cmd_rsp {
u8 sense_data[UFS_SENSE_SIZE];
 };
 
+struct ufshpb_active_field {
+   __be16 active_rgn;
+   __be16 active_srgn;
+};
+#define HPB_ACT_FIELD_SIZE 4
+
+/**
+ * struct utp_hpb_rsp - Response UPIU structure
+ * @residual_transfer_count: Residual transfer count DW-3
+ * @reserved1: Reserved double words DW-4 to DW-7
+ * @sense_data_len: Sense data length DW-8 U16
+ * @desc_type: Descriptor type of sense data
+ * @additional_len: Additional length of sense data
+ * @hpb_op: HPB operation type
+ * @lun: LUN of response UPIU
+ * @active_rgn_cnt: Active region count
+ * @inactive_rgn_cnt: Inactive region count
+ * @hpb_active_field: Recommended to read HPB region and subregion
+ * @hpb_inactive_field: To be inactivated HPB region and subregion
+ */
+struct utp_hpb_rsp {
+   __be32 residual_transfer_count;
+   __be32 reserved1[4];
+   __be16 sense_data_len;
+   u8 desc_type;
+   u8 additional_len;
+   u8 hpb_op;
+   u8 lun;
+   u8 active_rgn_cnt;
+   u8 inactive_rgn_cnt;
+   struct ufshpb_active_field hpb_active_field[2];
+   __be16 hpb_inactive_field[2];
+};
+#define UTP_HPB_RSP_SIZE 40
+
 /**
  * struct utp_upiu_rsp - general upiu response structure
  * @header: UPIU header structure DW-0 to DW-2
@@ -488,6 +523,7 @@ struct utp_upiu_rsp {
struct utp_upiu_header header;
union {
struct utp_cmd_rsp sr;
+   struct utp_hpb_rsp hr;
struct utp_upiu_query qr;
};
 };
diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 70a567ea7d5a..e10984fd8d47 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -5021,6 +5021,9 @@ ufshcd_transfer_rsp_status(struct ufs_hba *hba, struct 
ufshcd_lrb *lrbp)
 */
pm_runtime_get_noresume(hba->dev);
}
+
+   if (scsi_status == SAM_STAT_GOOD)
+   ufshpb_rsp_upiu(hba, lrbp);
break;
case UPIU_TRANSACTION_REJECT_UPIU:
/* TODO: handle Reject UPIU Response */
@@ -9241,6 +9244,7 @@ EXPORT_SYMBOL(ufshcd_shutdown);
 void ufshcd_remove(struct ufs_hba *hba)
 {
ufs_bsg_remove(hba);
+   ufshpb_remove(hba);
ufs_sysfs_remove_nodes(hba->dev);
blk_cleanup_queue(hba->tmf_queue);
blk_mq_free_tag_set(&hba->tmf_tag_set);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 1a72f6541510..489c8b1ac580 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -16,6 +16,16 @@
 #include "ufshpb.h"
 #include "../sd.h"
 
+/* memory management */
+static struct kmem_cache *ufshpb_mctx_cache;
+static mempool_t *ufshpb_mctx_pool;
+static mempool_t *ufshpb_page_pool;
+/* A cache size of 2MB can cache ppn in the 1GB range. */
+static unsigned int ufshpb_host_map_kbytes = 2048;
+static int tot_active_srgn_pages;
+
+static struct workqueue_struct *ufshpb_wq;
+
 bool ufshpb_is_allo

[PATCH v28 3/4] scsi: ufs: Prepare HPB read for cached sub-region

This patch changes the read I/O to the HPB read I/O.

If the logical address of the read I/O belongs to active sub-region, the
HPB driver modifies the read I/O command to HPB read. It modifies the UPIU
command of UFS instead of modifying the existing SCSI command.

In the HPB version 1.0, the maximum read I/O size that can be converted to
HPB read is 4KB.

The dirty map of the active sub-region prevents an incorrect HPB read that
would use a stale physical page number invalidated by a previous write I/O.
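
The dirty-map idea can be illustrated with a small standalone bitmap. This is a simplified sketch, not the driver's implementation (which uses the kernel bitmap helpers and spans sub-region boundaries); `struct dirty_map`, `MAP_ENTRIES` and both functions are hypothetical names for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAP_ENTRIES 128u  /* hypothetical number of entries in one sub-region */

/* One bit per cached entry; a set bit means the cached PPN may be stale. */
struct dirty_map {
	uint8_t bits[MAP_ENTRIES / 8];
};

/* Called for a write or discard: mark @cnt entries from @start as dirty. */
static void dirty_map_set(struct dirty_map *m, unsigned int start,
			  unsigned int cnt)
{
	assert(start <= MAP_ENTRIES && cnt <= MAP_ENTRIES - start);

	for (unsigned int i = start; i < start + cnt; i++)
		m->bits[i / 8] |= (uint8_t)(1u << (i % 8));
}

/*
 * Called for a read: return true if any entry in the range is dirty, in
 * which case the read should stay a normal READ instead of an HPB READ.
 */
static bool dirty_map_test(const struct dirty_map *m, unsigned int start,
			   unsigned int cnt)
{
	assert(start <= MAP_ENTRIES && cnt <= MAP_ENTRIES - start);

	for (unsigned int i = start; i < start + cnt; i++)
		if (m->bits[i / 8] & (1u << (i % 8)))
			return true;
	return false;
}
```

After a write to entries 10 through 13, a read that touches any of those entries is reported as dirty, while reads entirely before entry 10 or after entry 13 remain eligible for conversion.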

Reviewed-by: Can Guo 
Reviewed-by: Bart Van Assche 
Acked-by: Avri Altman 
Tested-by: Bean Huo 
Signed-off-by: Daejun Park 
---
 drivers/scsi/ufs/ufshcd.c |   2 +
 drivers/scsi/ufs/ufshpb.c | 256 +-
 drivers/scsi/ufs/ufshpb.h |   2 +
 3 files changed, 257 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index e10984fd8d47..88dd0f34fa09 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2656,6 +2656,8 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, 
struct scsi_cmnd *cmd)
 
lrbp->req_abort_skip = false;
 
+   ufshpb_prep(hba, lrbp);
+
ufshcd_comp_scsi_upiu(hba, lrbp);
 
err = ufshcd_map_sg(hba, lrbp);
diff --git a/drivers/scsi/ufs/ufshpb.c b/drivers/scsi/ufs/ufshpb.c
index 489c8b1ac580..201dc24d55b3 100644
--- a/drivers/scsi/ufs/ufshpb.c
+++ b/drivers/scsi/ufs/ufshpb.c
@@ -46,6 +46,29 @@ static void ufshpb_set_state(struct ufshpb_lu *hpb, int 
state)
atomic_set(&hpb->hpb_state, state);
 }
 
+static int ufshpb_is_valid_srgn(struct ufshpb_region *rgn,
+   struct ufshpb_subregion *srgn)
+{
+   return rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID;
+}
+
+static bool ufshpb_is_read_cmd(struct scsi_cmnd *cmd)
+{
+   return req_op(cmd->request) == REQ_OP_READ;
+}
+
+static bool ufshpb_is_write_or_discard_cmd(struct scsi_cmnd *cmd)
+{
+   return op_is_write(req_op(cmd->request)) ||
+  op_is_discard(req_op(cmd->request));
+}
+
+static bool ufshpb_is_support_chunk(int transfer_len)
+{
+   return transfer_len <= HPB_MULTI_CHUNK_HIGH;
+}
+
 static bool ufshpb_is_general_lun(int lun)
 {
return lun < UFS_UPIU_MAX_UNIT_NUM_ID;
@@ -80,8 +103,8 @@ static void ufshpb_kick_map_work(struct ufshpb_lu *hpb)
 }
 
 static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
-struct ufshcd_lrb *lrbp,
-struct utp_hpb_rsp *rsp_field)
+   struct ufshcd_lrb *lrbp,
+   struct utp_hpb_rsp *rsp_field)
 {
/* Check HPB_UPDATE_ALERT */
if (!(lrbp->ucd_rsp_ptr->header.dword_2 &
@@ -107,6 +130,233 @@ static bool ufshpb_is_hpb_rsp_valid(struct ufs_hba *hba,
return true;
 }
 
+static void ufshpb_set_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int set_bit_len;
+   int bitmap_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if ((srgn_offset + cnt) > bitmap_len)
+   set_bit_len = bitmap_len - srgn_offset;
+   else
+   set_bit_len = cnt;
+
+   if (rgn->rgn_state != HPB_RGN_INACTIVE &&
+   srgn->srgn_state == HPB_SRGN_VALID)
+   bitmap_set(srgn->mctx->ppn_dirty, srgn_offset, set_bit_len);
+
+   srgn_offset = 0;
+   if (++srgn_idx == hpb->srgns_per_rgn) {
+   srgn_idx = 0;
+   rgn_idx++;
+   }
+
+   cnt -= set_bit_len;
+   if (cnt > 0)
+   goto next_srgn;
+}
+
+static bool ufshpb_test_ppn_dirty(struct ufshpb_lu *hpb, int rgn_idx,
+ int srgn_idx, int srgn_offset, int cnt)
+{
+   struct ufshpb_region *rgn;
+   struct ufshpb_subregion *srgn;
+   int bitmap_len;
+   int bit_len;
+
+next_srgn:
+   rgn = hpb->rgn_tbl + rgn_idx;
+   srgn = rgn->srgn_tbl + srgn_idx;
+
+   if (likely(!srgn->is_last))
+   bitmap_len = hpb->entries_per_srgn;
+   else
+   bitmap_len = hpb->last_srgn_entries;
+
+   if (!ufshpb_is_valid_srgn(rgn, srgn))
+   return true;
+
+   /*
+* If the region state is active, mctx must be allocated.
+* In this case, check whether the region is evicted or
+* mctx allcation fail.
+*/
+   if (unlikely(!srgn->mctx)) {
+   dev_err(&hpb->sdev_ufs_lu->sdev_dev,
+   "no mctx in region %d subregion %d.\n",
+   srgn->rgn_idx, srgn->srgn_idx);
+   return true;

[PATCH v28 4/4] scsi: ufs: Add HPB 2.0 support

This patch supports the HPB 2.0.

HPB 2.0 supports reads of varying sizes from 4KB to 512KB.
Reads of up to 32KB are issued as a single HPB read.
Reads of 36KB to 512KB are issued as a combination of a
write buffer command and an HPB read command, to deliver more PPNs.
The write buffer commands may not be issued immediately due to busy tags.
To use HPB read more aggressively, the driver can requeue the write buffer
command. The requeue threshold is implemented as timeout and can be
modified with requeue_timeout_ms entry in sysfs.
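
The size classes described above, together with the encoding documented for max_data_size_hpb_single_cmd later in this patch (00h = 4KB, 01h = 8KB, ..., FFh = 1024KB), can be sketched as follows. The enum, the function names and the fixed 32KB boundary are assumptions for illustration; in practice the single-command limit would be read from the device attribute.

```c
#include <assert.h>
#include <stdint.h>

/* How a read of a given size would be handled under HPB 2.0. */
enum hpb_read_kind {
	HPB_NOT_ELIGIBLE,  /* issued as a normal READ */
	HPB_SINGLE_READ,   /* single HPB READ command */
	HPB_PREREQ_READ,   /* write buffer command followed by HPB READ */
};

/* Classify a read by its length in KB, using the sizes from the text. */
static enum hpb_read_kind hpb_classify(unsigned int len_kb)
{
	if (len_kb < 4 || len_kb > 512 || len_kb % 4 != 0)
		return HPB_NOT_ELIGIBLE;
	return len_kb <= 32 ? HPB_SINGLE_READ : HPB_PREREQ_READ;
}

/* Decode the single-command limit attribute: value N means (N + 1) * 4 KB. */
static unsigned int hpb_single_cmd_max_kb(uint8_t attr)
{
	unsigned int kb = ((unsigned int)attr + 1u) * 4u;

	assert(kb >= 4u && kb <= 1024u);
	return kb;
}
```

With these rules a 32KB read is a single HPB read, a 36KB read needs the write buffer pre-request, and an attribute value of 07h decodes to the 32KB limit used in the description.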

Signed-off-by: Daejun Park 
---
 Documentation/ABI/testing/sysfs-driver-ufs |  47 +-
 drivers/scsi/ufs/ufs-sysfs.c   |   4 +
 drivers/scsi/ufs/ufs.h |   3 +-
 drivers/scsi/ufs/ufshcd.c  |  25 +-
 drivers/scsi/ufs/ufshcd.h  |   7 +
 drivers/scsi/ufs/ufshpb.c  | 629 +++--
 drivers/scsi/ufs/ufshpb.h  |  66 ++-
 7 files changed, 700 insertions(+), 81 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-driver-ufs 
b/Documentation/ABI/testing/sysfs-driver-ufs
index 528bf89fc98b..419adf450b89 100644
--- a/Documentation/ABI/testing/sysfs-driver-ufs
+++ b/Documentation/ABI/testing/sysfs-driver-ufs
@@ -1253,14 +1253,14 @@ Description:This entry shows the number of HPB 
pinned regions assigned to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/hit_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/hit_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that changed to HPB read.
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/miss_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/miss_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of reads that cannot be changed to
@@ -1268,7 +1268,7 @@ Description:  This entry shows the number of reads 
that cannot be changed to
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_noti_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_noti_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of response UPIUs that has
@@ -1276,7 +1276,7 @@ Description:  This entry shows the number of response 
UPIUs that has
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_active_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_active_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of active sub-regions recommended by
@@ -1284,7 +1284,7 @@ Description:  This entry shows the number of active 
sub-regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/rb_inactive_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/rb_inactive_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of inactive regions recommended by
@@ -1292,10 +1292,45 @@ Description:This entry shows the number of inactive 
regions recommended by
 
The file is read only.
 
-What:  /sys/class/scsi_device/*/device/hpb_sysfs/map_req_cnt
+What:  /sys/class/scsi_device/*/device/hpb_stat_sysfs/map_req_cnt
 Date:  March 2021
 Contact:   Daejun Park 
 Description:   This entry shows the number of read buffer commands for
activating sub-regions recommended by response UPIUs.
 
The file is read only.
+
+What:  
/sys/class/scsi_device/*/device/hpb_param_sysfs/requeue_timeout_ms
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the requeue timeout threshold for write buffer
+   command in ms. This value can be changed by writing proper 
integer to
+   this entry.
+
+What:  
/sys/bus/platform/drivers/ufshcd/*/attributes/max_data_size_hpb_single_cmd
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the maximum HPB data size for using single HPB
+   command.
+
+   ===  
+   00h  4KB
+   01h  8KB
+   02h  12KB
+   ...
+   FFh  1024KB
+   ===  
+
+   The file is read only.
+
+What:  /sys/bus/platform/drivers/ufshcd/*/flags/wb_enable
+Date:  March 2021
+Contact:   Daejun Park 
+Description:   This entry shows the status of HPB.
+
+   == 
+   0  HPB is not enabled.
+   1  HPB is enabled
+   == 
+
+

Re: [PATCH v3 1/3] mm: replace migrate_prep with lru_add_drain_all

2021-03-12 Thread Michal Hocko

On Wed 10-03-21 08:14:27, Minchan Kim wrote:
> Currently, migrate_prep is merely a wrapper of lru_add_drain_all.
> There is not much to gain from having additional abstraction.
> 
> Use lru_add_drain_all instead of migrate_prep, which would be more
> descriptive.
> 
> note: migrate_prep_local in compaction.c changed into lru_add_drain
> to avoid the CPU scheduling cost of involving many other CPUs, keeping
> the old behavior.
> 
> Signed-off-by: Minchan Kim 

Acked-by: Michal Hocko 

Btw. that migrate_prep_local likely needs revisiting. I really fail to
see why it is useful. It looks like a just-in-case thing to me. If it is
needed then the comment should be describing why. Something for a
separate patch though.

> ---
>  include/linux/migrate.h |  5 -
>  mm/compaction.c |  3 ++-
>  mm/mempolicy.c  |  4 ++--
>  mm/migrate.c| 24 +---
>  mm/page_alloc.c |  2 +-
>  mm/swap.c   |  5 +
>  6 files changed, 11 insertions(+), 32 deletions(-)
> 
> diff --git a/include/linux/migrate.h b/include/linux/migrate.h
> index 3a389633b68f..6155d97ec76c 100644
> --- a/include/linux/migrate.h
> +++ b/include/linux/migrate.h
> @@ -45,8 +45,6 @@ extern struct page *alloc_migration_target(struct page 
> *page, unsigned long priv
>  extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
>  extern void putback_movable_page(struct page *page);
>  
> -extern void migrate_prep(void);
> -extern void migrate_prep_local(void);
>  extern void migrate_page_states(struct page *newpage, struct page *page);
>  extern void migrate_page_copy(struct page *newpage, struct page *page);
>  extern int migrate_huge_page_move_mapping(struct address_space *mapping,
> @@ -66,9 +64,6 @@ static inline struct page *alloc_migration_target(struct 
> page *page,
>  static inline int isolate_movable_page(struct page *page, isolate_mode_t 
> mode)
>   { return -EBUSY; }
>  
> -static inline int migrate_prep(void) { return -ENOSYS; }
> -static inline int migrate_prep_local(void) { return -ENOSYS; }
> -
>  static inline void migrate_page_states(struct page *newpage, struct page 
> *page)
>  {
>  }
> diff --git a/mm/compaction.c b/mm/compaction.c
> index e04f4476e68e..3be017ececc0 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -2319,7 +2319,8 @@ compact_zone(struct compact_control *cc, struct 
> capture_control *capc)
>   trace_mm_compaction_begin(start_pfn, cc->migrate_pfn,
>   cc->free_pfn, end_pfn, sync);
>  
> - migrate_prep_local();
> + /* lru_add_drain_all could be expensive with involving other CPUs */
> + lru_add_drain();
>  
>   while ((ret = compact_finished(cc)) == COMPACT_CONTINUE) {
>   int err;
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index ab51132547b8..fc024e97be37 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1124,7 +1124,7 @@ int do_migrate_pages(struct mm_struct *mm, const 
> nodemask_t *from,
>   int err = 0;
>   nodemask_t tmp;
>  
> - migrate_prep();
> + lru_add_drain_all();
>  
>   mmap_read_lock(mm);
>  
> @@ -1323,7 +1323,7 @@ static long do_mbind(unsigned long start, unsigned long 
> len,
>  
>   if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
>  
> - migrate_prep();
> + lru_add_drain_all();
>   }
>   {
>   NODEMASK_SCRATCH(scratch);
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 62b81d5257aa..45f925e10f5a 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -57,28 +57,6 @@
>  
>  #include "internal.h"
>  
> -/*
> - * migrate_prep() needs to be called before we start compiling a list of 
> pages
> - * to be migrated using isolate_lru_page(). If scheduling work on other CPUs 
> is
> - * undesirable, use migrate_prep_local()
> - */
> -void migrate_prep(void)
> -{
> - /*
> -  * Clear the LRU lists so pages can be isolated.
> -  * Note that pages may be moved off the LRU after we have
> -  * drained them. Those pages will fail to migrate like other
> -  * pages that may be busy.
> -  */
> - lru_add_drain_all();
> -}
> -
> -/* Do the necessary work of migrate_prep but not if it involves other CPUs */
> -void migrate_prep_local(void)
> -{
> - lru_add_drain();
> -}
> -
>  int isolate_movable_page(struct page *page, isolate_mode_t mode)
>  {
>   struct address_space *mapping;
> @@ -1769,7 +1747,7 @@ static int do_pages_move(struct mm_struct *mm, 
> nodemask_t task_nodes,
>   int start, i;
>   int err = 0, err1;
>  
> - migrate_prep();
> + lru_add_drain_all();
>  
>   for (i = start = 0; i < nr_pages; i++) {
>   const void __user *p;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2e8348936df8..f05a8db741ca 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8467,7 +8467,7 @@ static int __alloc_contig_migrate_range(struct 
> compact_control *cc,
>   .gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_R

Re: [PATCH v3 2/3] mm: disable LRU pagevec during the migration temporarily

2021-03-12 Thread Michal Hocko

On Wed 10-03-21 08:14:28, Minchan Kim wrote:
> LRU pagevec holds refcount of pages until the pagevec is drained.
> It could prevent migration since the refcount of the page is greater
> than the expectation in migration logic. To mitigate the issue,
> callers of migrate_pages drain the LRU pagevec via migrate_prep or
> lru_add_drain_all before the migrate_pages call.
> 
> However, it's not enough because pages coming into the pagevec after
> the draining call could still stay in the pagevec, so they could keep
> preventing page migration. Since some callers of migrate_pages have
> retry logic with LRU draining, the page would migrate at the next
> trial, but it is still fragile in that it doesn't close the fundamental
> race between upcoming LRU pages entering the pagevec and migration, so
> the migration failure could cause contiguous memory allocation failure
> in the end.
> 
> To close the race, this patch disables LRU caches (i.e., pagevec)
> during ongoing migration until the migration is done.
> 
> Since it's really hard to reproduce, I measured how many times
> migrate_pages retried in force mode (i.e., fell back to a sync
> migration) with the debug code below.
> 
> int migrate_pages(struct list_head *from, new_page_t get_new_page,
>   ..
>   ..
> 
> if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
>         printk(KERN_ERR "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
>         dump_page(page, "fail to migrate");
> }
> 
> The test repeatedly launched Android apps while CMA allocation ran
> in the background every five seconds. The total CMA allocation count
> was about 500 during the testing. With this patch, the dump_page count
> was reduced from 400 to 30.
> 
> The new interface is also useful for memory hotplug which currently
> drains lru pcp caches after each migration failure. This is rather
> suboptimal as it has to disrupt others running during the operation.
> With the new interface the operation happens only once. This is also in
> line with pcp allocator cache which are disabled for the offlining as
> well.
> 
> Signed-off-by: Minchan Kim 

Looks good to me
Acked-by: Michal Hocko 

Thanks

-- 
Michal Hocko
SUSE Labs
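
The race described in the quoted commit message can be illustrated with a small userspace model. This is only a sketch: `toy_page`, `toy_pagevec`, and the `toy_*` helpers are hypothetical stand-ins, not the kernel's `struct page`, `struct pagevec`, or the interface the patch adds. It shows only that a page cached after a drain carries an extra reference, and that disabling the cache for the duration of the migration avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGEVEC_SIZE 15

/* Toy stand-ins; NOT the kernel's struct page or struct pagevec. */
struct toy_page {
	int refcount;		/* references currently held on the page */
};

struct toy_pagevec {
	bool enabled;		/* false while a migration is in progress */
	size_t nr;		/* number of cached pages */
	struct toy_page *pages[TOY_PAGEVEC_SIZE];
};

/*
 * Add a page to the LRU. When the cache is enabled and has room, the
 * page is batched and the cache takes an extra reference. Returns
 * true if the page was cached.
 */
static bool toy_lru_add(struct toy_pagevec *pvec, struct toy_page *page)
{
	if (!pvec->enabled || pvec->nr == TOY_PAGEVEC_SIZE)
		return false;	/* goes straight to the LRU, no extra ref */
	page->refcount++;
	pvec->pages[pvec->nr++] = page;
	return true;
}

/* Drain the cache, dropping the reference held on every cached page. */
static void toy_lru_drain(struct toy_pagevec *pvec)
{
	while (pvec->nr > 0) {
		struct toy_page *page = pvec->pages[--pvec->nr];

		assert(page->refcount > 0);
		page->refcount--;
	}
}

/* Migration only proceeds when nobody holds an unexpected reference. */
static bool toy_can_migrate(const struct toy_page *page, int expected_refs)
{
	return page->refcount == expected_refs;
}
```

In this model, calling `toy_lru_drain()` and then `toy_lru_add()` leaves the page unmigratable until the next drain, which is the window the patch aims to close. Clearing `enabled` before the drain removes the window.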

[PATCHSET] Remove 153 typedefs in staging/rtl8723bs

2021-03-12 Thread Marco Cesati

This set of patches removes 153 useless typedef declarations from the
staging/rtl8723bs source code, as identified by the checkpatch.pl
script. Every patch is purely syntactical: it does not change the
generated machine code. Furthermore, every single patch leaves the
source code fully compilable, so that 'git bisect' will not be affected.

[PATCH 01/33] staging: rtl8723bs: remove typedefs in HalBtcOutSrc.h
[PATCH 02/33] staging: rtl8723bs: remove typedefs in rtw_mlme.h
[PATCH 03/33] staging: rtl8723bs: remove typedefs in odm.h
[PATCH 04/33] staging: rtl8723bs: remove typedefs in odm_CfoTracking.h
[PATCH 05/33] staging: rtl8723bs: remove typedefs in odm_NoiseMonitor.h
[PATCH 06/33] staging: rtl8723bs: remove typedefs in odm_interface.h
[PATCH 07/33] staging: rtl8723bs: remove typedefs in odm_EdcaTurboCheck.h
[PATCH 08/33] staging: rtl8723bs: remove typedefs in odm_HWConfig.h
[PATCH 09/33] staging: rtl8723bs: remove typedefs in odm_types.h
[PATCH 10/33] staging: rtl8723bs: remove typedefs in rtw_eeprom.h
[PATCH 11/33] staging: rtl8723bs: remove typedefs in hal_com.h
[PATCH 12/33] staging: rtl8723bs: remove typedefs in drv_types.h
[PATCH 13/33] staging: rtl8723bs: remove typedefs in rtw_ht.h
[PATCH 14/33] staging: rtl8723bs: remove typedefs in rtw_ioctl_set.h
[PATCH 15/33] staging: rtl8723bs: remove typedefs in wlan_bssdef.h
[PATCH 16/33] staging: rtl8723bs: remove typedefs in rtw_mp.h
[PATCH 17/33] staging: rtl8723bs: remove typedefs in osdep_service.h
[PATCH 18/33] staging: rtl8723bs: remove typedefs in rtw_security.h
[PATCH 19/33] staging: rtl8723bs: remove typedefs in hal_com_h2c.h
[PATCH 20/33] staging: rtl8723bs: remove typedefs in rtl8723b_xmit.h
[PATCH 21/33] staging: rtl8723bs: remove typedefs in HalVerDef.h
[PATCH 22/33] staging: rtl8723bs: remove typedefs in rtl8723b_hal.h
[PATCH 23/33] staging: rtl8723bs: remove typedefs in rtw_mlme_ext.h
[PATCH 24/33] staging: rtl8723bs: remove typedefs in HalPwrSeqCmd.h
[PATCH 25/33] staging: rtl8723bs: remove typedefs in sta_info.h
[PATCH 26/33] staging: rtl8723bs: remove typedefs in ieee80211.h
[PATCH 27/33] staging: rtl8723bs: remove typedefs in basic_types.h
[PATCH 28/33] staging: rtl8723bs: remove typedefs in osdep_service_linux.h
[PATCH 29/33] staging: rtl8723bs: remove typedefs in rtw_efuse.h
[PATCH 30/33] staging: rtl8723bs: remove typedefs in hal_btcoex.h
[PATCH 31/33] staging: rtl8723bs: remove typedefs in odm_DIG.h
[PATCH 32/33] staging: rtl8723bs: remove typedefs in hal_btcoex.c
[PATCH 33/33] staging: rtl8723bs: remove typedefs in odm_DynamicBBPowerSaving.h

Signed-off-by: Marco Cesati
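
As a rough illustration of why such a conversion is purely syntactical, the sketch below declares the same layout twice: once behind a typedef pair in the style checkpatch.pl warns about, and once as a plain struct. All names (`TOY_COEXIST`, `PTOY_COEXIST`, `struct toy_coexist`, and the accessors) are hypothetical and do not come from the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the typedef hides both the struct keyword and the pointer. */
typedef struct TOY_COEXIST_TAG {
	int cur_ra_mask;
	unsigned char force_exec;
} TOY_COEXIST, *PTOY_COEXIST;

static int toy_get_mask_old(PTOY_COEXIST coexist)
{
	assert(coexist != NULL);
	return coexist->cur_ra_mask;
}

/* After: the same members, spelled out as a plain struct. */
struct toy_coexist {
	int cur_ra_mask;
	unsigned char force_exec;
};

static int toy_get_mask_new(struct toy_coexist *coexist)
{
	assert(coexist != NULL);
	return coexist->cur_ra_mask;
}
```

Both declarations have the same size and member offsets, so only the spelling at each use site changes. The explicit `struct ... *` form also makes it visible that a parameter is a pointer.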

[PATCH] cpuset: Modify the type of use_parent_ecpus from int to bool

2021-03-12 Thread Li Feng

Since the use_parent_ecpus in cpuset is only used as bool type, change
the type from int to bool.

Signed-off-by: Li Feng 
---
 kernel/cgroup/cpuset.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 5258b68153e0..ab0bf3cc7093 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -158,7 +158,7 @@ struct cpuset {
 * use_parent_ecpus - set if using parent's effective_cpus
 * child_ecpus_count - # of children with use_parent_ecpus set
 */
-   int use_parent_ecpus;
+   bool use_parent_ecpus;
int child_ecpus_count;
 };
 
-- 
2.25.1

[PATCH 01/33] staging: rtl8723bs: remove typedefs in HalBtcOutSrc.h

2021-03-12 Thread Marco Cesati

This commit fixes the following checkpatch.pl warnings:

WARNING: do not add new typedefs
#47: FILE: hal/HalBtcOutSrc.h:47:
+typedef enum _BTC_POWERSAVE_TYPE {

WARNING: do not add new typedefs
#54: FILE: hal/HalBtcOutSrc.h:54:
+typedef enum _BTC_BT_REG_TYPE {

WARNING: do not add new typedefs
#63: FILE: hal/HalBtcOutSrc.h:63:
+typedef enum _BTC_CHIP_INTERFACE {

WARNING: do not add new typedefs
#71: FILE: hal/HalBtcOutSrc.h:71:
+typedef enum _BTC_CHIP_TYPE {

WARNING: do not add new typedefs
#81: FILE: hal/HalBtcOutSrc.h:81:
+typedef enum _BTC_MSG_TYPE {

WARNING: do not add new typedefs
#167: FILE: hal/HalBtcOutSrc.h:167:
+typedef struct _BTC_BOARD_INFO {

WARNING: do not add new typedefs
#177: FILE: hal/HalBtcOutSrc.h:177:
+typedef enum _BTC_DBG_OPCODE {

WARNING: do not add new typedefs
#187: FILE: hal/HalBtcOutSrc.h:187:
+typedef enum _BTC_RSSI_STATE {

WARNING: do not add new typedefs
#200: FILE: hal/HalBtcOutSrc.h:200:
+typedef enum _BTC_WIFI_ROLE {

WARNING: do not add new typedefs
#208: FILE: hal/HalBtcOutSrc.h:208:
+typedef enum _BTC_WIFI_BW_MODE {

WARNING: do not add new typedefs
#215: FILE: hal/HalBtcOutSrc.h:215:
+typedef enum _BTC_WIFI_TRAFFIC_DIR {

WARNING: do not add new typedefs
#221: FILE: hal/HalBtcOutSrc.h:221:
+typedef enum _BTC_WIFI_PNP {

WARNING: do not add new typedefs
#228: FILE: hal/HalBtcOutSrc.h:228:
+typedef enum _BT_WIFI_COEX_STATE {

WARNING: do not add new typedefs
#239: FILE: hal/HalBtcOutSrc.h:239:
+typedef enum _BTC_GET_TYPE {

WARNING: do not add new typedefs
#281: FILE: hal/HalBtcOutSrc.h:281:
+typedef enum _BTC_SET_TYPE {

WARNING: do not add new typedefs
#321: FILE: hal/HalBtcOutSrc.h:321:
+typedef enum _BTC_DBG_DISP_TYPE {

WARNING: do not add new typedefs
#328: FILE: hal/HalBtcOutSrc.h:328:
+typedef enum _BTC_NOTIFY_TYPE_IPS {

WARNING: do not add new typedefs
#334: FILE: hal/HalBtcOutSrc.h:334:
+typedef enum _BTC_NOTIFY_TYPE_LPS {

WARNING: do not add new typedefs
#340: FILE: hal/HalBtcOutSrc.h:340:
+typedef enum _BTC_NOTIFY_TYPE_SCAN {

WARNING: do not add new typedefs
#346: FILE: hal/HalBtcOutSrc.h:346:
+typedef enum _BTC_NOTIFY_TYPE_ASSOCIATE {

WARNING: do not add new typedefs
#352: FILE: hal/HalBtcOutSrc.h:352:
+typedef enum _BTC_NOTIFY_TYPE_MEDIA_STATUS {

WARNING: do not add new typedefs
#358: FILE: hal/HalBtcOutSrc.h:358:
+typedef enum _BTC_NOTIFY_TYPE_SPECIAL_PACKET {

WARNING: do not add new typedefs
#366: FILE: hal/HalBtcOutSrc.h:366:
+typedef enum _BTC_NOTIFY_TYPE_STACK_OPERATION {

WARNING: do not add new typedefs
#374: FILE: hal/HalBtcOutSrc.h:374:
+typedef enum _BTC_ANTENNA_POS {

WARNING: do not add new typedefs
#412: FILE: hal/HalBtcOutSrc.h:412:
+typedef struct _BTC_BT_INFO {

WARNING: do not add new typedefs
#440: FILE: hal/HalBtcOutSrc.h:440:
+typedef struct _BTC_STACK_INFO {

WARNING: do not add new typedefs
#455: FILE: hal/HalBtcOutSrc.h:455:
+typedef struct _BTC_BT_LINK_INFO {

WARNING: do not add new typedefs
#468: FILE: hal/HalBtcOutSrc.h:468:
+typedef struct _BTC_STATISTICS {

WARNING: do not add new typedefs
#487: FILE: hal/HalBtcOutSrc.h:487:
+typedef struct _BTC_COEXIST {

Signed-off-by: Marco Cesati 
---
 .../staging/rtl8723bs/hal/HalBtc8723b1Ant.c   | 148 
 .../staging/rtl8723bs/hal/HalBtc8723b1Ant.h   |  28 ++--
 .../staging/rtl8723bs/hal/HalBtc8723b2Ant.c   | 138 +++
 .../staging/rtl8723bs/hal/HalBtc8723b2Ant.h   |  28 ++--
 drivers/staging/rtl8723bs/hal/HalBtcOutSrc.h  | 158 +-
 drivers/staging/rtl8723bs/hal/hal_btcoex.c| 122 +++---
 6 files changed, 311 insertions(+), 311 deletions(-)

diff --git a/drivers/staging/rtl8723bs/hal/HalBtc8723b1Ant.c 
b/drivers/staging/rtl8723bs/hal/HalBtc8723b1Ant.c
index ef8c6a0f31ae..87dc63408133 100644
--- a/drivers/staging/rtl8723bs/hal/HalBtc8723b1Ant.c
+++ b/drivers/staging/rtl8723bs/hal/HalBtc8723b1Ant.c
@@ -151,7 +151,7 @@ static u8 halbtc8723b1ant_BtRssiState(
 }
 
 static void halbtc8723b1ant_UpdateRaMask(
-   PBTC_COEXIST pBtCoexist, bool bForceExec, u32 disRateMask
+   struct BTC_COEXIST * pBtCoexist, bool bForceExec, u32 disRateMask
 )
 {
pCoexDm->curRaMask = disRateMask;
@@ -166,7 +166,7 @@ static void halbtc8723b1ant_UpdateRaMask(
 }
 
 static void halbtc8723b1ant_AutoRateFallbackRetry(
-   PBTC_COEXIST pBtCoexist, bool bForceExec, u8 type
+   struct BTC_COEXIST * pBtCoexist, bool bForceExec, u8 type
 )
 {
bool bWifiUnderBMode = false;
@@ -204,7 +204,7 @@ static void halbtc8723b1ant_AutoRateFallbackRetry(
 }
 
 static void halbtc8723b1ant_RetryLimit(
-   PBTC_COEXIST pBtCoexist, bool bForceExec, u8 type
+   struct BTC_COEXIST * pBtCoexist, bool bFo

[tip: objtool/urgent] objtool,x86: Fix uaccess PUSHF/POPF validation

2021-03-12 Thread tip-bot2 for Peter Zijlstra

The following commit has been merged into the objtool/urgent branch of tip:

Commit-ID: ba08abca66d46381df60842f64f70099d5482b92
Gitweb:
https://git.kernel.org/tip/ba08abca66d46381df60842f64f70099d5482b92
Author:Peter Zijlstra 
AuthorDate:Mon, 08 Mar 2021 15:46:04 +01:00
Committer: Peter Zijlstra 
CommitterDate: Fri, 12 Mar 2021 09:15:49 +01:00

objtool,x86: Fix uaccess PUSHF/POPF validation

Commit ab234a260b1f ("x86/pv: Rework arch_local_irq_restore() to not
use popf") replaced "push %reg; popf" with something like: "test
$0x200, %reg; jz 1f; sti; 1:", which breaks the pushf/popf symmetry
that commit ea24213d8088 ("objtool: Add UACCESS validation") relies
on.

The result is:

  drivers/gpu/drm/amd/amdgpu/si.o: warning: objtool: si_common_hw_init()+0xf36: 
PUSHF stack exhausted

Meanwhile, commit c9c324dc22aa ("objtool: Support stack layout changes
in alternatives") makes it possible to use stack-ops in
alternatives, which means we can revert 1ff865e343c2 ("x86,smap: Fix
smap_{save,restore}() alternatives").

That in turn means we can limit the PUSHF/POPF handling of
ea24213d8088 to those instructions that are in alternatives.

Fixes: ab234a260b1f ("x86/pv: Rework arch_local_irq_restore() to not use popf")
Reported-by: Borislav Petkov 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Josh Poimboeuf 
Link: https://lkml.kernel.org/r/yey4ribqya5fn...@hirez.programming.kicks-ass.net
---
 arch/x86/include/asm/smap.h | 10 --
 tools/objtool/check.c   |  3 +++
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h
index 8b58d69..0bc9b08 100644
--- a/arch/x86/include/asm/smap.h
+++ b/arch/x86/include/asm/smap.h
@@ -58,9 +58,8 @@ static __always_inline unsigned long smap_save(void)
unsigned long flags;
 
asm volatile ("# smap_save\n\t"
- ALTERNATIVE("jmp 1f", "", X86_FEATURE_SMAP)
- "pushf; pop %0; " __ASM_CLAC "\n\t"
- "1:"
+ ALTERNATIVE("", "pushf; pop %0; " __ASM_CLAC "\n\t",
+ X86_FEATURE_SMAP)
  : "=rm" (flags) : : "memory", "cc");
 
return flags;
@@ -69,9 +68,8 @@ static __always_inline unsigned long smap_save(void)
 static __always_inline void smap_restore(unsigned long flags)
 {
asm volatile ("# smap_restore\n\t"
- ALTERNATIVE("jmp 1f", "", X86_FEATURE_SMAP)
- "push %0; popf\n\t"
- "1:"
+ ALTERNATIVE("", "push %0; popf\n\t",
+ X86_FEATURE_SMAP)
  : : "g" (flags) : "memory", "cc");
 }
 
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 068cdb4..5e5388a 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2442,6 +2442,9 @@ static int handle_insn_ops(struct instruction *insn, 
struct insn_state *state)
if (update_cfi_state(insn, &state->cfi, op))
return 1;
 
+   if (!insn->alt_group)
+   continue;
+
if (op->dest.type == OP_DEST_PUSHF) {
if (!state->uaccess_stack) {
state->uaccess_stack = 1;
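
The "PUSHF stack exhausted" warning can be understood with a small model of the bookkeeping this hunk touches. The sketch below is an assumption-laden illustration, not objtool's code: `toy_state`, `toy_pushf()`, and `toy_popf()` are hypothetical names, and the encoding (saved uaccess bits shifted in below a sentinel 1 bit, as the `uaccess_stack = 1` line above suggests) is modeled rather than copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_state {
	bool uaccess;		/* is user access currently enabled? */
	uint32_t uaccess_stack;	/* saved uaccess bits below a sentinel 1 bit */
};

/* Model of a PUSHF: save the current uaccess bit. False means exhausted. */
static bool toy_pushf(struct toy_state *s)
{
	assert(s != NULL);
	if (!s->uaccess_stack)
		s->uaccess_stack = 1;	/* sentinel marks the stack bottom */
	else if (s->uaccess_stack >> 31)
		return false;		/* no room left: "stack exhausted" */
	s->uaccess_stack = (s->uaccess_stack << 1) | (s->uaccess ? 1u : 0u);
	return true;
}

/* Model of a POPF: restore the most recently saved uaccess bit. */
static void toy_popf(struct toy_state *s)
{
	assert(s != NULL);
	if (!s->uaccess_stack)
		return;			/* nothing was saved */
	s->uaccess = s->uaccess_stack & 1;
	s->uaccess_stack >>= 1;
	if (s->uaccess_stack == 1)
		s->uaccess_stack = 0;	/* only the sentinel is left */
}
```

In this model a PUSHF without a matching POPF permanently consumes one bit, so a long enough run of unbalanced pushes trips the exhaustion check. That matches the symptom in the commit message once the popf side was replaced by a test/sti sequence.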

Re: [PATCH v2 25/43] powerpc/32: Replace ASM exception exit by C exception exit from ppc64

Le 12/03/2021 à 00:26, Michael Ellerman a écrit :

Christophe Leroy writes:

Le 11/03/2021 à 14:46, Michael Ellerman a écrit :

Christophe Leroy writes:

This patch replaces the PPC32 ASM exception exit by C exception exit.

Signed-off-by: Christophe Leroy
---
arch/powerpc/kernel/entry_32.S | 481 +---
arch/powerpc/kernel/interrupt.c | 4 +
2 files changed, 132 insertions(+), 353 deletions(-)

Bisect points to this breaking qemu mac99 for me, with pmac32_defconfig.

I haven't had time to dig any deeper sorry.

Embarrassing ...

Nah, these things happen.

I don't get this problem on the 8xx (nohash/32) or the 83xx (book3s/32).
I don't get this problem with qemu mac99 when using my klibc-based initramfs.

I managed to reproduce it with the rootfs.cpio that I got some time ago from
linuxppc github Wiki.

OK.

I'm using the ppc-rootfs.cpio.gz from here:

https://github.com/linuxppc/ci-scripts/blob/master/root-disks/Makefile

And the boot script is:

https://github.com/linuxppc/ci-scripts/blob/master/scripts/boot/qemu-mac99

I've been meaning to write docs on how to use those scripts, but haven't
got around to it.

There's nothing really special though it's just a wrapper around qemu -M mac99.

I'll investigate it tomorrow.

The problem is fast_interrupt_return: not all registers are saved yet on ppc32 (msr, nip, xer, ctr),
so we can't restore them all as ppc64 does.

The problem happens only when userspace uses floating point or altivec.

For the time being, I'll keep the original fast_interrupt_return.

I will likely send a new version of the series later today, taking into account
Nick's comments.

Christophe

[PATCH] optee: enable acpi support

2021-03-12 Thread Ran Wang

This patch adds ACPI support for the optee driver.

Signed-off-by: Ran Wang 
---
 drivers/tee/optee/core.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/tee/optee/core.c b/drivers/tee/optee/core.c
index cf4718c6d35d..8fb261f4b9db 100644
--- a/drivers/tee/optee/core.c
+++ b/drivers/tee/optee/core.c
@@ -5,6 +5,7 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include 
 #include 
 #include 
 #include 
@@ -735,12 +736,21 @@ static const struct of_device_id optee_dt_match[] = {
 };
 MODULE_DEVICE_TABLE(of, optee_dt_match);
 
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id optee_acpi_match[] = {
+   { "OPTEE01",},
+   { },
+};
+MODULE_DEVICE_TABLE(acpi, optee_acpi_match);
+#endif
+
 static struct platform_driver optee_driver = {
.probe  = optee_probe,
.remove = optee_remove,
.driver = {
.name = "optee",
.of_match_table = optee_dt_match,
+   .acpi_match_table = ACPI_PTR(optee_acpi_match),
},
 };
 module_platform_driver(optee_driver);
-- 
2.25.1
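
A sentinel-terminated ID table like the one added above is typically walked until the empty entry is reached. The following userspace sketch models that lookup. `toy_acpi_id`, `toy_optee_ids`, and `toy_match()` are hypothetical, the real `struct acpi_device_id` has a different layout, and the NULL-table case stands in for `ACPI_PTR()` evaluating to NULL when ACPI support is compiled out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy ID entry; a NULL id marks the end of the table. */
struct toy_acpi_id {
	const char *id;
};

static const struct toy_acpi_id toy_optee_ids[] = {
	{ "OPTEE01" },
	{ NULL },		/* sentinel */
};

/* Return true if hid appears in table; a NULL table never matches. */
static bool toy_match(const struct toy_acpi_id *table, const char *hid)
{
	assert(hid != NULL);
	if (!table)
		return false;	/* models a build without ACPI support */
	for (; table->id; table++)
		if (strcmp(table->id, hid) == 0)
			return true;
	return false;
}
```

The table name passed to the module macro has to be the table that is actually defined. Otherwise the build fails or exports the wrong IDs, which is why the name in `MODULE_DEVICE_TABLE()` matters.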

Re: [PATCH] ASoC: core: Don't set platform name when of_node is set

2021-03-12 Thread Daniel Baluta

On Tue, Mar 9, 2021 at 5:38 PM Mark Brown  wrote:
>
> On Tue, Mar 09, 2021 at 10:23:28AM +0200, Daniel Baluta wrote:
> > From: Daniel Baluta 
> >
> > Platform may be specified by either name or OF node but not
> > both.
> >
> > For OF node platforms (e.g i.MX) we end up with both platform name
> > and of_node set and sound card registration will fail with the error:
> >
> >   asoc-simple-card sof-sound-wm8960: ASoC: Neither/both
> >   platform name/of_node are set for sai1-wm8960-hifi
>
> This doesn't actually say what the change does.

Will send v2 with a better explanation.

>
> > - dai_link->platforms->name = component->name;
> > +
> > + if (!dai_link->platforms->of_node)
> > + dai_link->platforms->name = component->name;
>
> Why would we prefer the node name over something explicitly configured?

Not sure I follow your question. I think the difference lies in the
way we treat OF vs non-OF platforms.

With OF platforms, dai_link->platforms->of_node is always set, so we
no longer need to set dai_link->platforms->name.

Actually, setting both of_node and name makes sound card registration
fail. This is the case I'm trying to fix here.
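
The "neither/both" rule from the error message, and the effect of the proposed guard, can be sketched in a few lines. This is a simplified model with hypothetical names (`toy_dlc`, `toy_platform_ok()`, `toy_set_platform_name()`); it is not the ASoC core code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy link component: a platform is identified by name OR by OF node. */
struct toy_dlc {
	const char *name;
	const void *of_node;
};

/* Valid only when exactly one of name and of_node is set. */
static bool toy_platform_ok(const struct toy_dlc *platform)
{
	assert(platform != NULL);
	return (platform->name != NULL) != (platform->of_node != NULL);
}

/* Guarded assignment: leave the name alone when an OF node is present. */
static void toy_set_platform_name(struct toy_dlc *platform,
				  const char *component_name)
{
	assert(platform != NULL);
	if (!platform->of_node)
		platform->name = component_name;
}
```

With the guard, an OF platform keeps only its `of_node` and a non-OF platform gets only its `name`, so both pass the exactly-one check.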

Re: [PATCH] drivers: tty: vt: vt.c: fix NULL dereference crash

2021-03-12 Thread Greg KH

On Sun, Mar 07, 2021 at 12:56:43PM +0200, Hassan Shahbazi wrote:
> Fix a NULL dereference crash on hiding the cursor.
> 
> Reported by: syzbot
> https://syzkaller.appspot.com/bug?id=defb47bf56e1c14d5687280c7bb91ce7b608b94b
> 
> Signed-off-by: Hassan Shahbazi 
> ---
>  drivers/tty/vt/vt.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
> index 284b07224c55..8c3e83c81341 100644
> --- a/drivers/tty/vt/vt.c
> +++ b/drivers/tty/vt/vt.c
> @@ -904,7 +904,9 @@ static void hide_cursor(struct vc_data *vc)
>   if (vc_is_sel(vc))
>   clear_selection();
>  
> - vc->vc_sw->con_cursor(vc, CM_ERASE);
> + if (vc->vc_sw)
> + vc->vc_sw->con_cursor(vc, CM_ERASE);
> +
>   hide_softcursor(vc);
>  }
>  
> -- 
> 2.26.2
> 

Are you sure this actually fixes the problem?  How did you test it?  Did
syzbot test this?

I had a few reports of this patch _not_ solving the problem, so getting
confirmation of this would be good.

thanks,

greg k-h

[GIT PULL] xen: branch for v5.12-rc3

2021-03-12 Thread Juergen Gross

Linus,

Please git pull the following tag:

 git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git 
for-linus-5.12b-rc3-tag

xen: branch for v5.12-rc3

It contains two patch series and a single patch:

- a small cleanup patch to remove unneeded symbol exports
- a series to clean up Xen grant handling (avoiding allocations in some
  cases, and using common defines for "invalid" values)
- a series to address a race issue in Xen event channel handling


Thanks.

Juergen

 arch/arm/xen/p2m.c   |   5 +-
 arch/x86/xen/p2m.c   |   6 +-
 drivers/pci/xen-pcifront.c   |   4 +-
 drivers/xen/events/events_2l.c   |  22 --
 drivers/xen/events/events_base.c | 130 +++
 drivers/xen/events/events_fifo.c |   7 --
 drivers/xen/events/events_internal.h |  14 ++--
 drivers/xen/gntdev.c |  54 +--
 include/xen/grant_table.h|   7 ++
 include/xen/xenbus.h |   1 -
 10 files changed, 169 insertions(+), 81 deletions(-)

Jan Beulich (4):
  Xen: drop exports of {set,clear}_foreign_p2m_mapping()
  Xen/gntdev: don't needlessly allocate k{,un}map_ops[]
  Xen/gnttab: introduce common INVALID_GRANT_{HANDLE,REF}
  Xen/gntdev: don't needlessly use kvcalloc()

Juergen Gross (3):
  xen/events: reset affinity of 2-level event when tearing it down
  xen/events: don't unmask an event channel when an eoi is pending
  xen/events: avoid handling the same event on two cpus at the same time

[PATCH 1/4] mfd/power: ab8500: Require device tree

The core AB8500 driver and the whole platform are completely
dependent on being probed from device tree, so remove the
non-DT probe paths.

Signed-off-by: Linus Walleij 
---
 drivers/power/supply/Kconfig   |  2 +-
 drivers/power/supply/ab8500_btemp.c| 10 --
 drivers/power/supply/ab8500_charger.c  | 15 ++-
 drivers/power/supply/ab8500_fg.c   | 10 --
 drivers/power/supply/abx500_chargalg.c | 10 --
 5 files changed, 19 insertions(+), 28 deletions(-)

diff --git a/drivers/power/supply/Kconfig b/drivers/power/supply/Kconfig
index 006b95eca673..a910571e8d4f 100644
--- a/drivers/power/supply/Kconfig
+++ b/drivers/power/supply/Kconfig
@@ -698,7 +698,7 @@ config BATTERY_GAUGE_LTC2941
 
 config AB8500_BM
bool "AB8500 Battery Management Driver"
-   depends on AB8500_CORE && AB8500_GPADC && (IIO = y)
+   depends on AB8500_CORE && AB8500_GPADC && (IIO = y) && OF
help
  Say Y to include support for AB8500 battery management.
 
diff --git a/drivers/power/supply/ab8500_btemp.c 
b/drivers/power/supply/ab8500_btemp.c
index 7095ea4c68d6..ca5153c82c81 100644
--- a/drivers/power/supply/ab8500_btemp.c
+++ b/drivers/power/supply/ab8500_btemp.c
@@ -1008,12 +1008,10 @@ static int ab8500_btemp_probe(struct platform_device 
*pdev)
}
di->bm = plat;
 
-   if (np) {
-   ret = ab8500_bm_of_probe(dev, np, di->bm);
-   if (ret) {
-   dev_err(dev, "failed to get battery information\n");
-   return ret;
-   }
+   ret = ab8500_bm_of_probe(dev, np, di->bm);
+   if (ret) {
+   dev_err(dev, "failed to get battery information\n");
+   return ret;
}
 
/* get parent data */
diff --git a/drivers/power/supply/ab8500_charger.c 
b/drivers/power/supply/ab8500_charger.c
index ac77c8882d17..aa573cd299e2 100644
--- a/drivers/power/supply/ab8500_charger.c
+++ b/drivers/power/supply/ab8500_charger.c
@@ -3360,15 +3360,12 @@ static int ab8500_charger_probe(struct platform_device 
*pdev)
}
di->bm = plat;
 
-   if (np) {
-   ret = ab8500_bm_of_probe(dev, np, di->bm);
-   if (ret) {
-   dev_err(dev, "failed to get battery information\n");
-   return ret;
-   }
-   di->autopower_cfg = of_property_read_bool(np, "autopower_cfg");
-   } else
-   di->autopower_cfg = false;
+   ret = ab8500_bm_of_probe(dev, np, di->bm);
+   if (ret) {
+   dev_err(dev, "failed to get battery information\n");
+   return ret;
+   }
+   di->autopower_cfg = of_property_read_bool(np, "autopower_cfg");
 
/* get parent data */
di->dev = dev;
diff --git a/drivers/power/supply/ab8500_fg.c b/drivers/power/supply/ab8500_fg.c
index 06ff42c71f24..079e11325a81 100644
--- a/drivers/power/supply/ab8500_fg.c
+++ b/drivers/power/supply/ab8500_fg.c
@@ -3043,12 +3043,10 @@ static int ab8500_fg_probe(struct platform_device *pdev)
}
di->bm = plat;
 
-   if (np) {
-   ret = ab8500_bm_of_probe(dev, np, di->bm);
-   if (ret) {
-   dev_err(dev, "failed to get battery information\n");
-   return ret;
-   }
+   ret = ab8500_bm_of_probe(dev, np, di->bm);
+   if (ret) {
+   dev_err(dev, "failed to get battery information\n");
+   return ret;
}
 
mutex_init(&di->cc_lock);
diff --git a/drivers/power/supply/abx500_chargalg.c 
b/drivers/power/supply/abx500_chargalg.c
index a9d84d845f24..591ddd2987a3 100644
--- a/drivers/power/supply/abx500_chargalg.c
+++ b/drivers/power/supply/abx500_chargalg.c
@@ -1997,12 +1997,10 @@ static int abx500_chargalg_probe(struct platform_device 
*pdev)
}
di->bm = plat;
 
-   if (np) {
-   ret = ab8500_bm_of_probe(&pdev->dev, np, di->bm);
-   if (ret) {
-   dev_err(&pdev->dev, "failed to get battery 
information\n");
-   return ret;
-   }
+   ret = ab8500_bm_of_probe(&pdev->dev, np, di->bm);
+   if (ret) {
+   dev_err(&pdev->dev, "failed to get battery information\n");
+   return ret;
}
 
/* get device struct and parent */
-- 
2.29.2

[PATCH 3/4] mfd/power: ab8500: Push algorithm to power supply code

The charging algorithm header is only used locally in the
power supply subsystem so push this down into
drivers/power/supply and rename from the confusing
"ux500_chargalg.h" to "ab8500-chargalg.h" for clarity:
it is only used with the AB8500.

This is another remnant of non-DT code needing to pass
data from boardfiles, which we don't do anymore.

Signed-off-by: Linus Walleij 
---
 .../power/supply/ab8500-chargalg.h  | 6 +++---
 drivers/power/supply/ab8500_charger.c   | 2 +-
 drivers/power/supply/abx500_chargalg.c  | 2 +-
 drivers/power/supply/pm2301_charger.c   | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)
 rename include/linux/mfd/abx500/ux500_chargalg.h => 
drivers/power/supply/ab8500-chargalg.h (93%)

diff --git a/include/linux/mfd/abx500/ux500_chargalg.h 
b/drivers/power/supply/ab8500-chargalg.h
similarity index 93%
rename from include/linux/mfd/abx500/ux500_chargalg.h
rename to drivers/power/supply/ab8500-chargalg.h
index 9b97d284d0ce..94a6f9068bc5 100644
--- a/include/linux/mfd/abx500/ux500_chargalg.h
+++ b/drivers/power/supply/ab8500-chargalg.h
@@ -4,8 +4,8 @@
  * Author: Johan Gardsmark  for ST-Ericsson.
  */
 
-#ifndef _UX500_CHARGALG_H
-#define _UX500_CHARGALG_H
+#ifndef _AB8500_CHARGALG_H_
+#define _AB8500_CHARGALG_H_
 
 #include 
 
@@ -48,4 +48,4 @@ struct ux500_charger {
 
 extern struct blocking_notifier_head charger_notifier_list;
 
-#endif
+#endif /* _AB8500_CHARGALG_H_ */
diff --git a/drivers/power/supply/ab8500_charger.c 
b/drivers/power/supply/ab8500_charger.c
index 50989a5ec95c..a9be10eb2c22 100644
--- a/drivers/power/supply/ab8500_charger.c
+++ b/drivers/power/supply/ab8500_charger.c
@@ -28,12 +28,12 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 
 #include "ab8500-bm.h"
+#include "ab8500-chargalg.h"
 
 /* Charger constants */
 #define NO_PW_CONN 0
diff --git a/drivers/power/supply/abx500_chargalg.c 
b/drivers/power/supply/abx500_chargalg.c
index 5b28d58041b4..f5b792243727 100644
--- a/drivers/power/supply/abx500_chargalg.c
+++ b/drivers/power/supply/abx500_chargalg.c
@@ -28,10 +28,10 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
 #include "ab8500-bm.h"
+#include "ab8500-chargalg.h"
 
 /* Watchdog kick interval */
 #define CHG_WD_INTERVAL(6 * HZ)
diff --git a/drivers/power/supply/pm2301_charger.c 
b/drivers/power/supply/pm2301_charger.c
index 5aeff75db33b..d53e0c37c059 100644
--- a/drivers/power/supply/pm2301_charger.c
+++ b/drivers/power/supply/pm2301_charger.c
@@ -18,13 +18,13 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
 
 #include "ab8500-bm.h"
+#include "ab8500-chargalg.h"
 #include "pm2301_charger.h"
 
 #define to_pm2xxx_charger_ac_device_info(x) container_of((x), \
-- 
2.29.2

[PATCH 2/4] mfd/power: ab8500: Push data to power supply code

The global definition of platform data for the battery
management code has no utility after the OF conversion.
Move <linux/mfd/abx500/ab8500-bm.h> to be a local
file in drivers/power/supply, and stop defining the
platform data in drivers/power/supply/ab8500_bmdata.c
and broadcasting it to the kernel, only to have it assigned
as platform data to the MFD cells and then picked back
into the same subsystem that defined it in the first
place. This kills off a layer of indirection.

Acked-for-MFD-by: Lee Jones 
Signed-off-by: Linus Walleij 
---
 drivers/mfd/ab8500-core.c | 17 +
 .../power/supply}/ab8500-bm.h | 19 ++
 drivers/power/supply/ab8500_bmdata.c  |  3 +-
 drivers/power/supply/ab8500_btemp.c   | 35 +++
 drivers/power/supply/ab8500_charger.c | 10 ++
 drivers/power/supply/ab8500_fg.c  | 10 ++
 drivers/power/supply/abx500_chargalg.c| 10 ++
 drivers/power/supply/pm2301_charger.c |  2 +-
 8 files changed, 27 insertions(+), 79 deletions(-)
 rename {include/linux/mfd/abx500 => drivers/power/supply}/ab8500-bm.h (96%)

diff --git a/drivers/mfd/ab8500-core.c b/drivers/mfd/ab8500-core.c
index 2dde3a7532c4..7bb23c0d1efd 100644
--- a/drivers/mfd/ab8500-core.c
+++ b/drivers/mfd/ab8500-core.c
@@ -19,7 +19,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -603,14 +602,14 @@ int ab8500_suspend(struct ab8500 *ab8500)
 }
 
 static const struct mfd_cell ab8500_bm_devs[] = {
-   MFD_CELL_OF("ab8500-charger", NULL, &ab8500_bm_data,
-   sizeof(ab8500_bm_data), 0, "stericsson,ab8500-charger"),
-   MFD_CELL_OF("ab8500-btemp", NULL, &ab8500_bm_data,
-   sizeof(ab8500_bm_data), 0, "stericsson,ab8500-btemp"),
-   MFD_CELL_OF("ab8500-fg", NULL, &ab8500_bm_data,
-   sizeof(ab8500_bm_data), 0, "stericsson,ab8500-fg"),
-   MFD_CELL_OF("ab8500-chargalg", NULL, &ab8500_bm_data,
-   sizeof(ab8500_bm_data), 0, "stericsson,ab8500-chargalg"),
+   MFD_CELL_OF("ab8500-charger", NULL, NULL, 0, 0,
+   "stericsson,ab8500-charger"),
+   MFD_CELL_OF("ab8500-btemp", NULL, NULL, 0, 0,
+   "stericsson,ab8500-btemp"),
+   MFD_CELL_OF("ab8500-fg", NULL, NULL, 0, 0,
+   "stericsson,ab8500-fg"),
+   MFD_CELL_OF("ab8500-chargalg", NULL, NULL, 0, 0,
+   "stericsson,ab8500-chargalg"),
 };
 
 static const struct mfd_cell ab8500_devs[] = {
diff --git a/include/linux/mfd/abx500/ab8500-bm.h b/drivers/power/supply/ab8500-bm.h
similarity index 96%
rename from include/linux/mfd/abx500/ab8500-bm.h
rename to drivers/power/supply/ab8500-bm.h
index 903e94c189d8..a1b31c971a45 100644
--- a/include/linux/mfd/abx500/ab8500-bm.h
+++ b/drivers/power/supply/ab8500-bm.h
@@ -1,12 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright ST-Ericsson 2012.
- *
- * Author: Arun Murthy 
- */
 
-#ifndef _AB8500_BM_H
-#define _AB8500_BM_H
+#ifndef _AB8500_CHARGER_H_
+#define _AB8500_CHARGER_H_
 
 #include 
 #include 
@@ -453,16 +448,11 @@ struct ab8500_bm_data {
 };
 
 struct ab8500_btemp;
-struct ab8500_gpadc;
 struct ab8500_fg;
 
-#ifdef CONFIG_AB8500_BM
 extern struct abx500_bm_data ab8500_bm_data;
 
 void ab8500_charger_usb_state_changed(u8 bm_usb_state, u16 mA);
-struct ab8500_btemp *ab8500_btemp_get(void);
-int ab8500_btemp_get_batctrl_temp(struct ab8500_btemp *btemp);
-int ab8500_btemp_get_temp(struct ab8500_btemp *btemp);
 struct ab8500_fg *ab8500_fg_get(void);
 int ab8500_fg_inst_curr_blocking(struct ab8500_fg *dev);
 int ab8500_fg_inst_curr_start(struct ab8500_fg *di);
@@ -470,7 +460,4 @@ int ab8500_fg_inst_curr_finalize(struct ab8500_fg *di, int *res);
 int ab8500_fg_inst_curr_started(struct ab8500_fg *di);
 int ab8500_fg_inst_curr_done(struct ab8500_fg *di);
 
-#else
-static struct abx500_bm_data ab8500_bm_data;
-#endif
-#endif /* _AB8500_BM_H */
+#endif /* _AB8500_CHARGER_H_ */
diff --git a/drivers/power/supply/ab8500_bmdata.c b/drivers/power/supply/ab8500_bmdata.c
index f6a66979cbb5..c2b8c0bb77e2 100644
--- a/drivers/power/supply/ab8500_bmdata.c
+++ b/drivers/power/supply/ab8500_bmdata.c
@@ -4,7 +4,8 @@
 #include 
 #include 
 #include 
-#include 
+
+#include "ab8500-bm.h"
 
 /*
  * These are the defined batteries that uses a NTC and ID resistor placed
diff --git a/drivers/power/supply/ab8500_btemp.c b/drivers/power/supply/ab8500_btemp.c
index ca5153c82c81..33fb6f65749c 100644
--- a/drivers/power/supply/ab8500_btemp.c
+++ b/drivers/power/supply/ab8500_btemp.c
@@ -25,9 +25,10 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
+#include "ab8500-bm.h"
+
 #define VTVOUT_V   1800
 
 #define BTEMP_THERMAL_LOW_LIMIT -10
@@ -119,16 +120,6 @@ static enum power_supply_property ab8500_btemp_props[] = {
 
 static LIST_HEAD(ab8500_btemp_list);
 
-/**
- * ab8500_btemp_get() - returns a reference to the primary AB8500 BTEMP
- * (i.e. the f

Re: [PATCH] perf-stat: introduce bperf, share hardware PMCs with BPF

2021-03-12 Thread Namhyung Kim

Hi,

On Fri, Mar 12, 2021 at 11:03 AM Song Liu  wrote:
>
> perf uses performance monitoring counters (PMCs) to monitor system
> performance. The PMCs are limited hardware resources. For example,
> Intel CPUs have 3x fixed PMCs and 4x programmable PMCs per cpu.
>
> Modern data center systems use these PMCs in many different ways:
> system level monitoring, (maybe nested) container level monitoring, per
> process monitoring, profiling (in sample mode), etc. In some cases,
> there are more active perf_events than available hardware PMCs. To allow
> all perf_events to have a chance to run, it is necessary to do expensive
> time multiplexing of events.
>
> On the other hand, many monitoring tools count the common metrics (cycles,
> instructions). It is a waste to have multiple tools create multiple
> perf_events of "cycles" and occupy multiple PMCs.
>
> bperf tries to reduce such wastes by allowing multiple perf_events of
> "cycles" or "instructions" (at different scopes) to share PMUs. Instead
> of having each perf-stat session to read its own perf_events, bperf uses
> BPF programs to read the perf_events and aggregate readings to BPF maps.
> Then, the perf-stat session(s) reads the values from these BPF maps.
>
> Please refer to the comment before the definition of bperf_ops for the
> description of bperf architecture.

Interesting!  Actually I thought about something similar before,
but my BPF knowledge is outdated.  I need to catch up, but I
haven't found time for it so far. ;-)

>
> bperf is off by default. To enable it, pass --use-bpf option to perf-stat.
> bperf uses a BPF hashmap to share information about BPF programs and maps
> used by bperf. This map is pinned to bpffs. The default address is
> /sys/fs/bpf/bperf_attr_map. The user could change the address with option
> --attr-map.
>
> ---
> Known limitations:
> 1. Do not support per cgroup events;
> 2. Do not support monitoring of BPF program (perf-stat -b);
> 3. Do not support event groups.

In my case, per cgroup event counting is very important.
And I'd like to do that with lots of cpus and cgroups.
So I'm working on an in-kernel solution (without BPF),
I hope to share it soon.

And for event groups, it seems the current implementation
cannot handle more than one event (not even in a group).
That could be a serious limitation..

>
> The following commands have been tested:
>
>perf stat --use-bpf -e cycles -a
>perf stat --use-bpf -e cycles -C 1,3,4
>perf stat --use-bpf -e cycles -p 123
>perf stat --use-bpf -e cycles -t 100,101

Hmm... so it loads both leader and follower programs if needed, right?
Does it support multiple followers with different targets at the same time?

Thanks,
Namhyung
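
For readers following the thread, the sharing idea in the quoted
description can be modeled in user space roughly as below. This is only
a toy sketch under stated assumptions, not bperf's code: struct
shared_counter, add_follower(), leader_sample() and follower_read() are
hypothetical names, the "hardware counter" is a plain integer supplied
by the caller, and each follower filters by a task id only.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_FOLLOWERS 4
#define ANY_TASK (-1)

/* Hypothetical stand-in for one map slot owned by a perf-stat session. */
struct follower {
	int tid_filter;   /* ANY_TASK, or the only task this session counts */
	uint64_t accum;   /* aggregated reading the session reads back later */
};

/* Hypothetical leader state: one real counter shared by all followers. */
struct shared_counter {
	uint64_t prev;    /* last raw counter value seen by the leader */
	int nr_followers;
	struct follower followers[MAX_FOLLOWERS];
};

/* Register a session; returns its index, or -1 if the table is full. */
static int add_follower(struct shared_counter *sc, int tid_filter)
{
	if (sc->nr_followers >= MAX_FOLLOWERS)
		return -1;
	sc->followers[sc->nr_followers].tid_filter = tid_filter;
	sc->followers[sc->nr_followers].accum = 0;
	return sc->nr_followers++;
}

/*
 * Leader: read the counter once, attribute the delta to the task that
 * just ran, and hand it to every follower whose filter matches.
 */
static void leader_sample(struct shared_counter *sc, uint64_t raw, int tid)
{
	uint64_t delta = raw - sc->prev;

	sc->prev = raw;
	for (int i = 0; i < sc->nr_followers; i++) {
		struct follower *f = &sc->followers[i];

		if (f->tid_filter == ANY_TASK || f->tid_filter == tid)
			f->accum += delta;
	}
}

/* Session side: read the aggregated value instead of its own event. */
static uint64_t follower_read(const struct shared_counter *sc, int idx)
{
	assert(idx >= 0 && idx < sc->nr_followers);
	return sc->followers[idx].accum;
}
```

In this model several followers with different filters can coexist on
one counter; whether the posted patch allows that is exactly the open
question above.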

[PATCH 0/4] mfd/power: Push data into power supply

This series pushes some AB8500 power supply headers down
into the power supply subsystem so the power supply code
becomes independent from the other AB8500 stuff.

The first patch makes the code require device tree so
that the series makes sense: once all data for the power
supply comes from device tree, it makes sense for that
code to not require global headers for platform data
etc.

This is in preparation for some finalization of the AB8500
power code. As merge strategy I think it is best if:

- The power maintainer (Sebastian) provides an ACK

- The MFD maintainer (Lee) merges this and provides an
  immutable branch that the power maintainer can possibly
  pull as a base for his tree

I hope both subsystems are happy with the changes.

One of the patches already has Lee's Acked-for-MFD, but I
got a bit stressed out in the last kernel cycle. Let's
take this stepwise, first these four patches. No hurry.

Linus Walleij (4):
  mfd/power: ab8500: Require device tree
  mfd/power: ab8500: Push data to power supply code
  mfd/power: ab8500: Push algorithm to power supply code
  mfd/power: ab8500: Push data to power supply code

 drivers/mfd/ab8500-core.c |  17 +-
 drivers/power/supply/Kconfig  |   2 +-
 .../power/supply}/ab8500-bm.h | 297 --
 .../power/supply/ab8500-chargalg.h|   6 +-
 drivers/power/supply/ab8500_bmdata.c  |   3 +-
 drivers/power/supply/ab8500_btemp.c   |  45 +--
 drivers/power/supply/ab8500_charger.c |  27 +-
 drivers/power/supply/ab8500_fg.c  |  20 +-
 drivers/power/supply/abx500_chargalg.c|  22 +-
 drivers/power/supply/pm2301_charger.c |   4 +-
 include/linux/mfd/abx500.h| 276 
 11 files changed, 326 insertions(+), 393 deletions(-)
 rename {include/linux/mfd/abx500 => drivers/power/supply}/ab8500-bm.h (58%)
 rename include/linux/mfd/abx500/ux500_chargalg.h => drivers/power/supply/ab8500-chargalg.h (93%)

-- 
2.29.2

[PATCH 4/4] mfd/power: ab8500: Push data to power supply code

There is a slew of defines, structs and enums, and even a
function call, only relevant for the charging code that
still lives in <linux/mfd/abx500.h>. Push it down to the
"ab8500-bm.h" header in the power supply subsystem where
it is actually used.

Signed-off-by: Linus Walleij 
---
 drivers/power/supply/ab8500-bm.h | 278 ++-
 include/linux/mfd/abx500.h   | 276 --
 2 files changed, 274 insertions(+), 280 deletions(-)

diff --git a/drivers/power/supply/ab8500-bm.h b/drivers/power/supply/ab8500-bm.h
index a1b31c971a45..41c69a4f2a1f 100644
--- a/drivers/power/supply/ab8500-bm.h
+++ b/drivers/power/supply/ab8500-bm.h
@@ -4,7 +4,6 @@
 #define _AB8500_CHARGER_H_
 
 #include 
-#include 
 
 /*
  * System control 2 register offsets.
@@ -268,6 +267,277 @@ enum bup_vch_sel {
 #define BUS_PP_PRECHG_CURRENT_MASK 0x0E
 #define BUS_POWER_PATH_PRECHG_ENA  0x01
 
+/*
+ * ADC for the battery thermistor.
+ * When using the ABx500_ADC_THERM_BATCTRL the battery ID resistor is combined
+ * with a NTC resistor to both identify the battery and to measure its
+ * temperature. Different phone manufactures uses different techniques to both
+ * identify the battery and to read its temperature.
+ */
+enum abx500_adc_therm {
+   ABx500_ADC_THERM_BATCTRL,
+   ABx500_ADC_THERM_BATTEMP,
+};
+
+/**
+ * struct abx500_res_to_temp - defines one point in a temp to res curve. To
+ * be used in battery packs that combines the identification resistor with a
+ * NTC resistor.
+ * @temp:   battery pack temperature in Celsius
+ * @resist: NTC resistor net total resistance
+ */
+struct abx500_res_to_temp {
+   int temp;
+   int resist;
+};
+
+/**
+ * struct abx500_v_to_cap - Table for translating voltage to capacity
+ * @voltage:   Voltage in mV
+ * @capacity:  Capacity in percent
+ */
+struct abx500_v_to_cap {
+   int voltage;
+   int capacity;
+};
+
+/* Forward declaration */
+struct abx500_fg;
+
+/**
+ * struct abx500_fg_parameters - Fuel gauge algorithm parameters, in seconds
+ * if not specified
+ * @recovery_sleep_timer:   Time between measurements while recovering
+ * @recovery_total_time:    Total recovery time
+ * @init_timer:             Measurement interval during startup
+ * @init_discard_time:      Time we discard voltage measurement at startup
+ * @init_total_time:        Total init time during startup
+ * @high_curr_time:         Time current has to be high to go to recovery
+ * @accu_charging:          FG accumulation time while charging
+ * @accu_high_curr:         FG accumulation time in high current mode
+ * @high_curr_threshold:    High current threshold, in mA
+ * @lowbat_threshold:       Low battery threshold, in mV
+ * @overbat_threshold:      Over battery threshold, in mV
+ * @battok_falling_th_sel0  Threshold in mV for battOk signal sel0
+ *                          Resolution in 50 mV step.
+ * @battok_raising_th_sel1  Threshold in mV for battOk signal sel1
+ *                          Resolution in 50 mV step.
+ * @user_cap_limit          Capacity reported from user must be within this
+ *                          limit to be considered as sane, in percentage
+ *                          points.
+ * @maint_thres             This is the threshold where we stop reporting
+ *                          battery full while in maintenance, in per cent
+ * @pcut_enable:            Enable power cut feature in ab8505
+ * @pcut_max_time:          Max time threshold
+ * @pcut_flag_time:         Flagtime threshold
+ * @pcut_max_restart:       Max number of restarts
+ * @pcut_debounce_time:     Sets battery debounce time
+ */
+struct abx500_fg_parameters {
+   int recovery_sleep_timer;
+   int recovery_total_time;
+   int init_timer;
+   int init_discard_time;
+   int init_total_time;
+   int high_curr_time;
+   int accu_charging;
+   int accu_high_curr;
+   int high_curr_threshold;
+   int lowbat_threshold;
+   int overbat_threshold;
+   int battok_falling_th_sel0;
+   int battok_raising_th_sel1;
+   int user_cap_limit;
+   int maint_thres;
+   bool pcut_enable;
+   u8 pcut_max_time;
+   u8 pcut_flag_time;
+   u8 pcut_max_restart;
+   u8 pcut_debounce_time;
+};
+
+/**
+ * struct abx500_charger_maximization - struct used by the board config.
+ * @use_maxi:  Enable maximization for this battery type
+ * @maxi_chg_curr: Maximum charger current allowed
+ * @maxi_wait_cycles:  cycles to wait before setting charger current
+ * @charger_curr_step  delta between two charger current settings (mA)
+ */
+struct abx500_maxim_parameters {
+   bool ena_maxi;
+   int chg_curr;
+   int wait_cycles;
+   int charger_curr_step;
+};
+
+/**
+ * struct abx500_battery_type - different batteries supported
+ * @n

Re: [PATCH v4] USB: serial: cp210x: Make the CP210x driver work with GPIOs of CP2108

2021-03-12 Thread Johan Hovold

On Fri, Mar 12, 2021 at 04:27:57AM +, Pho Tran wrote:
> Similar to other CP210x devices, GPIO interfaces (gpiochip) should be
> supported for CP2108.
> 
> CP2108 has 4 serial interfaces but only 1 set of GPIO pins are shared
> to all of those interfaces. So, just need to initialize GPIOs of CP2108
> with only one interface (I use interface 0). It means just only 1 gpiochip
> device file will be created for CP2108.
> 
> CP2108 has 16 GPIOs, So data types of several variables need to be is u16
> instead of u8(in struct cp210x_serial_private). This doesn't affect other
> CP210x devices.
> 
> Because CP2108 has 16 GPIO pins, the parameter passed by cp210x functions
> will be different from other CP210x devices. So need to check part number
> of the device to use correct data format  before sending commands to
> devices.
> 
> Like CP2104, CP2108 have GPIO pins with configurable options. Therefore,
> should be mask all pins which are not in GPIO mode in cp2108_gpio_init()
> function.
> 
> Signed-off-by: Pho Tran mailto:pho.t...@silabs.com>>
> —
> 03/05/2021: Patch v3 modified format and contents of changelog follow feedback
> from Jonhan Hovold mailto:jo...@kernel.org>>.
> 03/04/2021: Patch v2 modified format patch as comment from
> Johan Hovold mailto:jo...@kernel.org>>:
> 1. Break commit message lines at 80 cols
> 2. Use kernel u8 and u16 types instead of the c99 ones.
> 03/01/2021: Initialed submission of patch “Make the CP210x driver work with
> GPIOs of CP2108.”.

Why are you resending the v4 that you submitted only four days ago?

Note that this version is again white space corrupted, and something
happened to your SOB tag above.

Johan

Re: [PATCH v2 40/43] powerpc/64s: Make kuap_check_amr() and kuap_get_and_check_amr() generic





On 10/03/2021 at 02:37, Nicholas Piggin wrote:

Excerpts from Christophe Leroy's message of March 9, 2021 10:10 pm:

In preparation of porting powerpc32 to C syscall entry/exit,
rename kuap_check_amr() and kuap_get_and_check_amr() as kuap_check()
and kuap_get_and_check(), and move in the generic asm/kup.h the stub
for when CONFIG_PPC_KUAP is not selected.


Looks pretty straightforward to me.

While you're renaming things, could kuap_check_amr() be changed to
kuap_assert_locked() or similar? Otherwise,


Ok, renamed kuap_assert_locked() and kuap_get_and_assert_locked()



Reviewed-by: Nicholas Piggin 



Signed-off-by: Christophe Leroy 
---
  arch/powerpc/include/asm/book3s/64/kup.h | 24 ++--
  arch/powerpc/include/asm/kup.h   | 10 +-
  arch/powerpc/kernel/interrupt.c  | 12 ++--
  arch/powerpc/kernel/irq.c|  2 +-
  4 files changed, 18 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/kup.h b/arch/powerpc/include/asm/book3s/64/kup.h
index 8bd905050896..d9b07e9998be 100644
--- a/arch/powerpc/include/asm/book3s/64/kup.h
+++ b/arch/powerpc/include/asm/book3s/64/kup.h
@@ -287,7 +287,7 @@ static inline void kuap_kernel_restore(struct pt_regs *regs,
 */
  }
  
-static inline unsigned long kuap_get_and_check_amr(void)

+static inline unsigned long kuap_get_and_check(void)
  {
if (mmu_has_feature(MMU_FTR_BOOK3S_KUAP)) {
unsigned long amr = mfspr(SPRN_AMR);
@@ -298,27 +298,7 @@ static inline unsigned long kuap_get_and_check_amr(void)
return 0;
  }
  
-#else /* CONFIG_PPC_PKEY */

-
-static inline void kuap_user_restore(struct pt_regs *regs)
-{
-}
-
-static inline void kuap_kernel_restore(struct pt_regs *regs, unsigned long amr)
-{
-}
-
-static inline unsigned long kuap_get_and_check_amr(void)
-{
-   return 0;
-}
-
-#endif /* CONFIG_PPC_PKEY */
-
-
-#ifdef CONFIG_PPC_KUAP
-
-static inline void kuap_check_amr(void)
+static inline void kuap_check(void)
  {
if (IS_ENABLED(CONFIG_PPC_KUAP_DEBUG) && mmu_has_feature(MMU_FTR_BOOK3S_KUAP))
WARN_ON_ONCE(mfspr(SPRN_AMR) != AMR_KUAP_BLOCKED);
diff --git a/arch/powerpc/include/asm/kup.h b/arch/powerpc/include/asm/kup.h
index 25671f711ec2..b7efa46b3109 100644
--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -74,7 +74,15 @@ bad_kuap_fault(struct pt_regs *regs, unsigned long address, bool is_write)
return false;
  }
  
-static inline void kuap_check_amr(void) { }

+static inline void kuap_check(void) { }
+static inline void kuap_save_and_lock(struct pt_regs *regs) { }
+static inline void kuap_user_restore(struct pt_regs *regs) { }
+static inline void kuap_kernel_restore(struct pt_regs *regs, unsigned long amr) { }
+
+static inline unsigned long kuap_get_and_check(void)
+{
+   return 0;
+}
  
  /*

   * book3s/64/kup-radix.h defines these functions for the !KUAP case to flush
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index 727b7848c9cc..40ed55064e54 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -76,7 +76,7 @@ notrace long system_call_exception(long r3, long r4, long r5,
} else
  #endif
  #ifdef CONFIG_PPC64
-   kuap_check_amr();
+   kuap_check();
  #endif
  
  	booke_restore_dbcr0();

@@ -254,7 +254,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
CT_WARN_ON(ct_state() == CONTEXT_USER);
  
  #ifdef CONFIG_PPC64

-   kuap_check_amr();
+   kuap_check();
  #endif
  
  	regs->result = r3;

@@ -380,7 +380,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned
 * AMR can only have been unlocked if we interrupted the kernel.
 */
  #ifdef CONFIG_PPC64
-   kuap_check_amr();
+   kuap_check();
  #endif
  
  	local_irq_save(flags);

@@ -451,7 +451,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsign
unsigned long flags;
unsigned long ret = 0;
  #ifdef CONFIG_PPC64
-   unsigned long amr;
+   unsigned long kuap;
  #endif
  
  	if (!IS_ENABLED(CONFIG_BOOKE) && !IS_ENABLED(CONFIG_40x) &&

@@ -467,7 +467,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsign
CT_WARN_ON(ct_state() == CONTEXT_USER);
  
  #ifdef CONFIG_PPC64

-   amr = kuap_get_and_check_amr();
+   kuap = kuap_get_and_check();
  #endif
  
  	if (unlikely(current_thread_info()->flags & _TIF_EMULATE_STACK_STORE)) {

@@ -511,7 +511,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsign
 * value from the check above.
 */
  #ifdef CONFIG_PPC64
-   kuap_kernel_restore(regs, amr);
+   kuap_kernel_restore(regs, kuap);
  #endif
  
  	return ret;

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index d71fd10a1dd4..3b18d2b2c702 100644
--- a/arch/powerpc/k

Re: [PATCH v7] i2c: virtio: add a virtio i2c frontend driver

2021-03-12 Thread Jie Deng




On 2021/3/12 16:11, Viresh Kumar wrote:

On 12-03-21, 15:51, Jie Deng wrote:

On 2021/3/12 14:10, Viresh Kumar wrote:

I saw your email about wrong version being sent, I already wrote some
reviews. Sending them anyway for FWIW :)

On 12-03-21, 21:33, Jie Deng wrote:

+struct virtio_i2c {
+   struct virtio_device *vdev;
+   struct completion completion;
+   struct i2c_adapter adap;
+   struct mutex lock;

As I said in the previous version (Yes, we were both waiting for
Wolfram to answer that), this lock shouldn't be required at all.

And since none of us have a use-case at hand where we will have a
problem without this lock, we should really skip it. We can always
come back and add it if we find an issue somewhere. Until then, better
to keep it simple.

The problem is you can't guarantee that adap->algo->master_xfer
is only called from i2c_transfer. Any function that holds the adapter can
call adap->algo->master_xfer directly.

See my last reply here, (almost) no one in the mainline kernel calls it
directly. And perhaps you can consider the caller broken in that case
and so there is no need of an extra lock, unless you have a case that
is broken.

https://lore.kernel.org/lkml/20210305072903.wtw645rukmqr5hx5@vireshk-i7/


I prefer to avoid potential issues rather than find an issue and then
fix it.

This is a very hypothetical issue IMHO as the kernel code doesn't have
such a user. There is no need of locks here, else the i2c core won't
have handled it by itself.


I'd like to see Wolfram's opinion.
Is it safe to remove lock in adap->algo->master_xfer ?
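
To make the disagreement concrete, here is a toy model of the two call
paths being discussed. It is a sketch under stated assumptions rather
than kernel code: struct toy_adapter, toy_master_xfer() and
toy_i2c_transfer() are hypothetical names, and the core's bus lock is
modeled as a plain flag instead of a real mutex. It only illustrates
that a caller going through the core entry point reaches the driver
callback with the bus lock held, while a caller invoking the callback
directly does not.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_adapter;

/* Hypothetical stand-in for the driver's master_xfer callback type. */
typedef int (*toy_xfer_fn)(struct toy_adapter *adap, int num_msgs);

struct toy_adapter {
	bool bus_locked;      /* models the core's per-adapter bus lock */
	bool seen_unlocked;   /* set if the callback ever ran without it */
	toy_xfer_fn master_xfer;
};

/* Driver callback: records whether the core lock was held on entry. */
static int toy_master_xfer(struct toy_adapter *adap, int num_msgs)
{
	if (!adap->bus_locked)
		adap->seen_unlocked = true;
	return num_msgs;  /* pretend every message was transferred */
}

/* Models the core entry point: lock, call the driver, unlock. */
static int toy_i2c_transfer(struct toy_adapter *adap, int num_msgs)
{
	int ret;

	assert(!adap->bus_locked);  /* single-threaded toy: no contention */
	adap->bus_locked = true;
	ret = adap->master_xfer(adap, num_msgs);
	adap->bus_locked = false;
	return ret;
}
```

In the model, only a direct call to adap->master_xfer() sets
seen_unlocked, which is the hypothetical caller the extra driver lock
would be guarding against.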

Re: [PATCH] leds: leds-dual-gpio: Add dual GPIO LEDs driver

2021-03-12 Thread Alexander Dahl

Hello Hermes,

thanks for your effort.

On Thursday, 11 March 2021, 14:04:08 CET, Hermes Zhang wrote:
> From: Hermes Zhang 
> 
> Introduce a new Dual GPIO LED driver. These two GPIOs LED will act as
> one LED as normal GPIO LED but give the possibility to change the
> intensity in four levels: OFF, LOW, MIDDLE and HIGH.

Interesting use case. Is there any real world hardware wired like that you 
could point to?

> +config LEDS_DUAL_GPIO
> + tristate "LED Support for Dual GPIO connected LEDs"
> + depends on LEDS_CLASS
> + depends on GPIOLIB || COMPILE_TEST
> + help
> +   This option enables support for the two LEDs connected to GPIO
> +   outputs. These two GPIO LEDs act as one LED in the sysfs and
> +   perform different intensity by enable either one of them or both.

Well, although I never had time to implement that, I suspect that could 
conflict if someone will eventually write a driver for two pin dual color LEDs 
connected to GPIO pins.  We actually do that on our hardware and I know others 
do, too.

I asked about that back in 2019, see this thread:

https://www.spinics.net/lists/linux-leds/msg11665.html

At the time the multicolor framework was not yet merged, so today I would 
probably make something which either uses the multicolor framework or at least 
has a similar interface to userspace. However, it probably won't surprise you 
all, this is not highest priority on my ToDo list. ;-)

(What we actually do is pretend those are separate LEDs and ignore the 
conflicting case where both GPIOs are on and the LED is dark then.)

Greets
Alex

[PATCH] arm: plat-pxa: delete redundant printing of the error

2021-03-12 Thread Wan Jiabing

platform_get_irq() already checks for and prints the error, so
the printing here is not necessary at all.

Signed-off-by: Wan Jiabing 
---
 arch/arm/plat-pxa/ssp.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/arm/plat-pxa/ssp.c b/arch/arm/plat-pxa/ssp.c
index 563440315acd..9e77b3392c1e 100644
--- a/arch/arm/plat-pxa/ssp.c
+++ b/arch/arm/plat-pxa/ssp.c
@@ -146,10 +146,8 @@ static int pxa_ssp_probe(struct platform_device *pdev)
}
 
ssp->irq = platform_get_irq(pdev, 0);
-   if (ssp->irq < 0) {
-   dev_err(dev, "no IRQ resource defined\n");
+   if (ssp->irq < 0)
return -ENODEV;
-   }
 
if (dev->of_node) {
const struct of_device_id *id =
-- 
2.25.1
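
The pattern this patch relies on can be sketched in user space as
follows. It is a minimal illustration under stated assumptions:
toy_platform_get_irq() and toy_probe() are hypothetical stubs, the
message text and the IRQ number 42 are made up, and the only point shown
is that when the helper reports its own failure the caller needs just
the check.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

static int error_lines;  /* how many error messages were emitted */

/* Hypothetical stub: like the real helper, it reports its own failure. */
static int toy_platform_get_irq(int have_irq)
{
	if (!have_irq) {
		fprintf(stderr, "IRQ index 0 not found\n");
		error_lines++;
		return -ENXIO;
	}
	return 42;  /* arbitrary IRQ number for the sketch */
}

/* Caller after the patch: check the result, print nothing more. */
static int toy_probe(int have_irq)
{
	int irq = toy_platform_get_irq(have_irq);

	if (irq < 0)
		return -ENODEV;
	return 0;
}
```

A failing toy_probe(0) produces exactly one message, the one from the
helper, which is the behaviour the patch is after.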

[PATCH v6 1/8] mm: Remove special swap entry functions

Remove the migration and device private entry_to_page() and
entry_to_pfn() inline functions and instead open code them directly.
This results in shorter code which is easier to understand.

Signed-off-by: Alistair Popple 
Reviewed-by: Ralph Campbell 

---

v6:
* Removed redundant compound_page() call from inside PageLocked()
* Fixed a minor build issue for s390 reported by kernel test bot

v4:
* Added pfn_swap_entry_to_page()
* Reinstated check that migration entries point to locked pages
* Removed #define swapcache_prepare which isn't needed for CONFIG_SWAP=0
  builds
---
 arch/s390/mm/pgtable.c  |  2 +-
 fs/proc/task_mmu.c  | 23 +-
 include/linux/swap.h|  4 +--
 include/linux/swapops.h | 69 ++---
 mm/hmm.c|  5 ++-
 mm/huge_memory.c|  4 +--
 mm/memcontrol.c |  2 +-
 mm/memory.c | 10 +++---
 mm/migrate.c|  6 ++--
 mm/page_vma_mapped.c|  6 ++--
 10 files changed, 50 insertions(+), 81 deletions(-)

diff --git a/arch/s390/mm/pgtable.c b/arch/s390/mm/pgtable.c
index 18205f851c24..eec3a9d7176e 100644
--- a/arch/s390/mm/pgtable.c
+++ b/arch/s390/mm/pgtable.c
@@ -691,7 +691,7 @@ static void ptep_zap_swap_entry(struct mm_struct *mm, swp_entry_t entry)
if (!non_swap_entry(entry))
dec_mm_counter(mm, MM_SWAPENTS);
else if (is_migration_entry(entry)) {
-   struct page *page = migration_entry_to_page(entry);
+   struct page *page = pfn_swap_entry_to_page(entry);
 
dec_mm_counter(mm, mm_counter(page));
}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3cec6fbef725..08ee59d945c0 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -514,10 +514,8 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
} else {
mss->swap_pss += (u64)PAGE_SIZE << PSS_SHIFT;
}
-   } else if (is_migration_entry(swpent))
-   page = migration_entry_to_page(swpent);
-   else if (is_device_private_entry(swpent))
-   page = device_private_entry_to_page(swpent);
+   } else if (is_pfn_swap_entry(swpent))
+   page = pfn_swap_entry_to_page(swpent);
} else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap
&& pte_none(*pte))) {
page = xa_load(&vma->vm_file->f_mapping->i_pages,
@@ -549,7 +547,7 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
swp_entry_t entry = pmd_to_swp_entry(*pmd);
 
if (is_migration_entry(entry))
-   page = migration_entry_to_page(entry);
+   page = pfn_swap_entry_to_page(entry);
}
if (IS_ERR_OR_NULL(page))
return;
@@ -691,10 +689,8 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask,
} else if (is_swap_pte(*pte)) {
swp_entry_t swpent = pte_to_swp_entry(*pte);
 
-   if (is_migration_entry(swpent))
-   page = migration_entry_to_page(swpent);
-   else if (is_device_private_entry(swpent))
-   page = device_private_entry_to_page(swpent);
+   if (is_pfn_swap_entry(swpent))
+   page = pfn_swap_entry_to_page(swpent);
}
if (page) {
int mapcount = page_mapcount(page);
@@ -1383,11 +1379,8 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
frame = swp_type(entry) |
(swp_offset(entry) << MAX_SWAPFILES_SHIFT);
flags |= PM_SWAP;
-   if (is_migration_entry(entry))
-   page = migration_entry_to_page(entry);
-
-   if (is_device_private_entry(entry))
-   page = device_private_entry_to_page(entry);
+   if (is_pfn_swap_entry(entry))
+   page = pfn_swap_entry_to_page(entry);
}
 
if (page && !PageAnon(page))
@@ -1444,7 +1437,7 @@ static int pagemap_pmd_range(pmd_t *pmdp, unsigned long addr, unsigned long end,
if (pmd_swp_soft_dirty(pmd))
flags |= PM_SOFT_DIRTY;
VM_BUG_ON(!is_pmd_migration_entry(pmd));
-   page = migration_entry_to_page(entry);
+   page = pfn_swap_entry_to_page(entry);
}
 #endif
 
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4cc6ec3bf0ab..516104b9334b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -523,8 +523,8 @@ static inline void show_swap_cache_info(void)
 {
 }
 
-#define free_swap_and_cache(e) ({(is_migration_entry(e) || is_device_private_entry(e));})
-#define swapcache_prepare(e) ({(is_migration_entry

[PATCH v6 0/8] Add support for SVM atomics in Nouveau

This is the sixth version of a series to add support to Nouveau for atomic
memory operations on OpenCL shared virtual memory (SVM) regions.

There are no significant changes for version six other than correcting a
minor s390 build and bisectability issue and removing a redundant call to
compound_page() when checking for PageLocked in patch 1.

Exclusive device access is implemented by adding a new swap entry type
(SWAP_DEVICE_EXCLUSIVE) which is similar to a migration entry. The main
difference is that on fault the original entry is immediately restored by
the fault handler instead of waiting.

Restoring the entry triggers calls to MMU notifiers, which allows a device
driver to revoke the atomic access permission from the GPU prior to the CPU
finalising the entry.

Patches 1 & 2 refactor existing migration and device private entry
functions.

Patches 3 & 4 rework try_to_unmap_one() by splitting out unrelated
functionality into separate functions - try_to_migrate_one() and
try_to_munlock_one(). These should not change any functionality, but any
help testing would be much appreciated as I have not been able to test
every usage of try_to_unmap_one().

Patch 5 contains the bulk of the implementation for device exclusive
memory.

Patch 6 contains some additions to the HMM selftests to ensure everything
works as expected.

Patch 7 is a cleanup for the Nouveau SVM implementation.

Patch 8 contains the implementation of atomic access for the Nouveau
driver.

This has been tested using the latest upstream Mesa userspace with a simple
OpenCL test program which checks the results of atomic GPU operations on a
SVM buffer whilst also writing to the same buffer from the CPU.

Alistair Popple (8):
  mm: Remove special swap entry functions
  mm/swapops: Rework swap entry manipulation code
  mm/rmap: Split try_to_munlock from try_to_unmap
  mm/rmap: Split migration into its own function
  mm: Device exclusive memory access
  mm: Selftests for exclusive device memory
  nouveau/svm: Refactor nouveau_range_fault
  nouveau/svm: Implement atomic SVM access

 Documentation/vm/hmm.rst  |  19 +-
 arch/s390/mm/pgtable.c|   2 +-
 drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
 drivers/gpu/drm/nouveau/nouveau_svm.c | 130 +++-
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c|   6 +
 fs/proc/task_mmu.c|  23 +-
 include/linux/mmu_notifier.h  |  25 +-
 include/linux/rmap.h  |   9 +-
 include/linux/swap.h  |   8 +-
 include/linux/swapops.h   | 123 ++--
 lib/test_hmm.c| 126 +++-
 lib/test_hmm_uapi.h   |   2 +
 mm/debug_vm_pgtable.c |  12 +-
 mm/hmm.c  |  12 +-
 mm/huge_memory.c  |  45 +-
 mm/hugetlb.c  |  10 +-
 mm/memcontrol.c   |   2 +-
 mm/memory.c   | 127 +++-
 mm/migrate.c  |  41 +-
 mm/mprotect.c |  18 +-
 mm/page_vma_mapped.c  |  15 +-
 mm/rmap.c | 597 +++---
 tools/testing/selftests/vm/hmm-tests.c| 219 +++
 24 files changed, 1313 insertions(+), 260 deletions(-)

-- 
2.20.1
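
The fault-time difference between a migration entry and the new device
exclusive entry, as described in the cover letter above, can be modeled
with a small user-space sketch. Everything here is an assumption made
for illustration: the toy_* types and functions are hypothetical, the
notifier is a direct function call, and "waiting" is represented by a
return code instead of actually sleeping.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_entry { TOY_PRESENT, TOY_MIGRATION, TOY_DEVICE_EXCLUSIVE };
enum toy_fault_result { TOY_FAULT_DONE, TOY_FAULT_WAIT };

struct toy_pte {
	enum toy_entry kind;
	bool gpu_has_atomic_access;
};

/* Hypothetical notifier: the driver drops the GPU's atomic permission. */
static void toy_notify_driver(struct toy_pte *pte)
{
	pte->gpu_has_atomic_access = false;
}

/* Models what a CPU fault does for each kind of entry. */
static enum toy_fault_result toy_handle_cpu_fault(struct toy_pte *pte)
{
	assert(pte);

	switch (pte->kind) {
	case TOY_MIGRATION:
		/* The CPU has to wait until the migration completes. */
		return TOY_FAULT_WAIT;
	case TOY_DEVICE_EXCLUSIVE:
		/* Revoke device access first, then restore the mapping. */
		toy_notify_driver(pte);
		pte->kind = TOY_PRESENT;
		return TOY_FAULT_DONE;
	case TOY_PRESENT:
	default:
		return TOY_FAULT_DONE;
	}
}
```

The ordering inside the TOY_DEVICE_EXCLUSIVE case mirrors the
description: the driver is told to revoke access before the CPU mapping
is finalised.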

[PATCH v6 3/8] mm/rmap: Split try_to_munlock from try_to_unmap

The behaviour of try_to_unmap_one() is difficult to follow because it
performs different operations based on a fairly large set of flags used
in different combinations.

TTU_MUNLOCK is one such flag. However, it is exclusively used by
try_to_munlock(), which specifies no other flags. Therefore, rather than
overload try_to_unmap_one() with unrelated behaviour, split this out into
its own function and remove the flag.

Signed-off-by: Alistair Popple 
Reviewed-by: Ralph Campbell 

---

Christoph - I didn't add your Reviewed-by from v3 because removal of the
extra VM_LOCKED check in v4 changed things slightly. Let me know if
you're still ok for me to add it. Thanks.

v4:
* Removed redundant check for VM_LOCKED
---
 include/linux/rmap.h |  1 -
 mm/rmap.c| 40 
 2 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index def5c62c93b3..e26ac2d71346 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -87,7 +87,6 @@ struct anon_vma_chain {
 
 enum ttu_flags {
TTU_MIGRATION   = 0x1,  /* migration mode */
-   TTU_MUNLOCK = 0x2,  /* munlock mode */
 
TTU_SPLIT_HUGE_PMD  = 0x4,  /* split huge PMD if any */
TTU_IGNORE_MLOCK= 0x8,  /* ignore mlock */
diff --git a/mm/rmap.c b/mm/rmap.c
index 977e70803ed8..d02bade5245b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1405,10 +1405,6 @@ static bool try_to_unmap_one(struct page *page, struct 
vm_area_struct *vma,
struct mmu_notifier_range range;
enum ttu_flags flags = (enum ttu_flags)(long)arg;
 
-   /* munlock has nothing to gain from examining un-locked vmas */
-   if ((flags & TTU_MUNLOCK) && !(vma->vm_flags & VM_LOCKED))
-   return true;
-
if (IS_ENABLED(CONFIG_MIGRATION) && (flags & TTU_MIGRATION) &&
is_zone_device_page(page) && !is_device_private_page(page))
return true;
@@ -1469,8 +1465,6 @@ static bool try_to_unmap_one(struct page *page, struct 
vm_area_struct *vma,
page_vma_mapped_walk_done(&pvmw);
break;
}
-   if (flags & TTU_MUNLOCK)
-   continue;
}
 
/* Unexpected PMD-mapped THP? */
@@ -1784,6 +1778,37 @@ bool try_to_unmap(struct page *page, enum ttu_flags 
flags)
return !page_mapcount(page) ? true : false;
 }
 
+static bool try_to_munlock_one(struct page *page, struct vm_area_struct *vma,
+unsigned long address, void *arg)
+{
+   struct page_vma_mapped_walk pvmw = {
+   .page = page,
+   .vma = vma,
+   .address = address,
+   };
+
+   /* munlock has nothing to gain from examining un-locked vmas */
+   if (!(vma->vm_flags & VM_LOCKED))
+   return true;
+
+   while (page_vma_mapped_walk(&pvmw)) {
+   /* PTE-mapped THP are never mlocked */
+   if (!PageTransCompound(page)) {
+   /*
+* Holding pte lock, we do *not* need
+* mmap_lock here
+*/
+   mlock_vma_page(page);
+   }
+   page_vma_mapped_walk_done(&pvmw);
+
+   /* found a mlocked page, no point continuing munlock check */
+   return false;
+   }
+
+   return true;
+}
+
 /**
  * try_to_munlock - try to munlock a page
  * @page: the page to be munlocked
@@ -1796,8 +1821,7 @@ bool try_to_unmap(struct page *page, enum ttu_flags flags)
 void try_to_munlock(struct page *page)
 {
struct rmap_walk_control rwc = {
-   .rmap_one = try_to_unmap_one,
-   .arg = (void *)TTU_MUNLOCK,
+   .rmap_one = try_to_munlock_one,
.done = page_not_mapped,
.anon_lock = page_lock_anon_vma_read,
 
-- 
2.20.1

[PATCH v6 2/8] mm/swapops: Rework swap entry manipulation code

Both migration and device private pages use special swap entries that
are manipulated by a range of inline functions. The arguments to these
are somewhat inconsistent, so rework them to remove flag type arguments
and to make the arguments similar for both read and write entry
creation.
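
To make the new calling convention concrete, below is a small userspace
sketch. The swp_entry_t layout, TOY_SWP_TYPE_SHIFT and the type numbers are
assumptions made for the example (the real encoding depends on architecture
and config), and toy_downgrade() is a hypothetical caller. The point is only
that the constructors take an offset and return a fresh entry instead of
taking a page plus a write flag or modifying an entry in place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins: the real swp_entry_t layout is arch and config dependent. */
typedef struct { uint64_t val; } swp_entry_t;
typedef uint64_t pgoff_t;

#define TOY_SWP_TYPE_SHIFT	58
#define TOY_SWP_OFFSET_MASK	((UINT64_C(1) << TOY_SWP_TYPE_SHIFT) - 1)

/* Arbitrary type numbers chosen for this sketch. */
enum {
	TOY_SWP_MIGRATION_READ = 1,
	TOY_SWP_MIGRATION_WRITE = 2,
};

static inline swp_entry_t swp_entry(unsigned int type, pgoff_t offset)
{
	swp_entry_t e = {
		.val = ((uint64_t)type << TOY_SWP_TYPE_SHIFT) |
		       (offset & TOY_SWP_OFFSET_MASK),
	};

	return e;
}

static inline unsigned int swp_type(swp_entry_t e)
{
	return (unsigned int)(e.val >> TOY_SWP_TYPE_SHIFT);
}

static inline pgoff_t swp_offset(swp_entry_t e)
{
	return e.val & TOY_SWP_OFFSET_MASK;
}

/* New style: one constructor per permission, both taking an offset. */
static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
{
	return swp_entry(TOY_SWP_MIGRATION_READ, offset);
}

static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
{
	return swp_entry(TOY_SWP_MIGRATION_WRITE, offset);
}

static inline bool is_writable_migration_entry(swp_entry_t e)
{
	return swp_type(e) == TOY_SWP_MIGRATION_WRITE;
}

/* Hypothetical caller: downgrade a writable entry by building a new one. */
static inline swp_entry_t toy_downgrade(swp_entry_t e)
{
	if (is_writable_migration_entry(e))
		return make_readable_migration_entry(swp_offset(e));

	return e;
}
```

A caller that previously passed a pointer to an in-place "make read" helper
would, under this scheme, assign the returned value back to its local entry.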

Signed-off-by: Alistair Popple 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Ralph Campbell 
---
 include/linux/swapops.h | 56 ++---
 mm/debug_vm_pgtable.c   | 12 -
 mm/hmm.c|  2 +-
 mm/huge_memory.c| 26 +--
 mm/hugetlb.c| 10 +---
 mm/memory.c | 10 +---
 mm/migrate.c| 26 ++-
 mm/mprotect.c   | 10 +---
 mm/rmap.c   | 10 +---
 9 files changed, 100 insertions(+), 62 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 139be8235ad2..4dfd807ae52a 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -100,35 +100,35 @@ static inline void *swp_to_radix_entry(swp_entry_t entry)
 }
 
 #if IS_ENABLED(CONFIG_DEVICE_PRIVATE)
-static inline swp_entry_t make_device_private_entry(struct page *page, bool 
write)
+static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 {
-   return swp_entry(write ? SWP_DEVICE_WRITE : SWP_DEVICE_READ,
-page_to_pfn(page));
+   return swp_entry(SWP_DEVICE_READ, offset);
 }
 
-static inline bool is_device_private_entry(swp_entry_t entry)
+static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset)
 {
-   int type = swp_type(entry);
-   return type == SWP_DEVICE_READ || type == SWP_DEVICE_WRITE;
+   return swp_entry(SWP_DEVICE_WRITE, offset);
 }
 
-static inline void make_device_private_entry_read(swp_entry_t *entry)
+static inline bool is_device_private_entry(swp_entry_t entry)
 {
-   *entry = swp_entry(SWP_DEVICE_READ, swp_offset(*entry));
+   int type = swp_type(entry);
+   return type == SWP_DEVICE_READ || type == SWP_DEVICE_WRITE;
 }
 
-static inline bool is_write_device_private_entry(swp_entry_t entry)
+static inline bool is_writable_device_private_entry(swp_entry_t entry)
 {
return unlikely(swp_type(entry) == SWP_DEVICE_WRITE);
 }
 #else /* CONFIG_DEVICE_PRIVATE */
-static inline swp_entry_t make_device_private_entry(struct page *page, bool 
write)
+static inline swp_entry_t make_readable_device_private_entry(pgoff_t offset)
 {
return swp_entry(0, 0);
 }
 
-static inline void make_device_private_entry_read(swp_entry_t *entry)
+static inline swp_entry_t make_writable_device_private_entry(pgoff_t offset)
 {
+   return swp_entry(0, 0);
 }
 
 static inline bool is_device_private_entry(swp_entry_t entry)
@@ -136,35 +136,32 @@ static inline bool is_device_private_entry(swp_entry_t 
entry)
return false;
 }
 
-static inline bool is_write_device_private_entry(swp_entry_t entry)
+static inline bool is_writable_device_private_entry(swp_entry_t entry)
 {
return false;
 }
 #endif /* CONFIG_DEVICE_PRIVATE */
 
 #ifdef CONFIG_MIGRATION
-static inline swp_entry_t make_migration_entry(struct page *page, int write)
-{
-   BUG_ON(!PageLocked(compound_head(page)));
-
-   return swp_entry(write ? SWP_MIGRATION_WRITE : SWP_MIGRATION_READ,
-   page_to_pfn(page));
-}
-
 static inline int is_migration_entry(swp_entry_t entry)
 {
return unlikely(swp_type(entry) == SWP_MIGRATION_READ ||
swp_type(entry) == SWP_MIGRATION_WRITE);
 }
 
-static inline int is_write_migration_entry(swp_entry_t entry)
+static inline int is_writable_migration_entry(swp_entry_t entry)
 {
return unlikely(swp_type(entry) == SWP_MIGRATION_WRITE);
 }
 
-static inline void make_migration_entry_read(swp_entry_t *entry)
+static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
 {
-   *entry = swp_entry(SWP_MIGRATION_READ, swp_offset(*entry));
+   return swp_entry(SWP_MIGRATION_READ, offset);
+}
+
+static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
+{
+   return swp_entry(SWP_MIGRATION_WRITE, offset);
 }
 
 extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
@@ -174,21 +171,28 @@ extern void migration_entry_wait(struct mm_struct *mm, 
pmd_t *pmd,
 extern void migration_entry_wait_huge(struct vm_area_struct *vma,
struct mm_struct *mm, pte_t *pte);
 #else
+static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
+{
+   return swp_entry(0, 0);
+}
+
+static inline swp_entry_t make_writable_migration_entry(pgoff_t offset)
+{
+   return swp_entry(0, 0);
+}
 
-#define make_migration_entry(page, write) swp_entry(0, 0)
 static inline int is_migration_entry(swp_entry_t swp)
 {
return 0;
 }
 
-static inline void make_migration_entry_read(swp_entry_t *entryp) { }
 static inline void __migration_entry_wait(struct mm_struct *mm, pt

Re: [PATCH 1/1] ARM: owl: Add Actions Semi Owl S500 SoC machine

2021-03-12 Thread Andreas Färber

Hi Cristian,

On 11.03.21 20:19, Cristian Ciocaltea wrote:
> Add machine entry for the S500 variant of the Actions Semi Owl SoCs
> family.
> 
> For the moment the only purpose is to provide the system serial
> information which will be used by the Owl Ethernet MAC driver to
> generate a stable MAC address.

Can't that be done in either a sys_soc driver or U-Boot?

Regards,
Andreas

-- 
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer
HRB 36809 (AG Nürnberg)

[PATCH v6 4/8] mm/rmap: Split migration into its own function

Migration is currently implemented as a mode of operation for
try_to_unmap_one() generally specified by passing the TTU_MIGRATION flag
or in the case of splitting a huge anonymous page TTU_SPLIT_FREEZE.

However it does not have much in common with the rest of the unmap
functionality of try_to_unmap_one() and thus splitting it into a
separate function reduces the complexity of try_to_unmap_one() making it
more readable.

Several simplifications can also be made in try_to_migrate_one() based
on the following observations:

 - All users of TTU_MIGRATION also set TTU_IGNORE_MLOCK.
 - No users of TTU_MIGRATION ever set TTU_IGNORE_HWPOISON.
 - No users of TTU_MIGRATION ever set TTU_BATCH_FLUSH.

TTU_SPLIT_FREEZE is a special case of migration used when splitting an
anonymous page. This is most easily dealt with by calling the correct
function from unmap_page() in mm/huge_memory.c  - either
try_to_migrate() for PageAnon or try_to_unmap().
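
The observations above, together with the v5 note about making the flag
check explicit, suggest a check along the following lines. This is only a
sketch: the flag values are copied from the enum ttu_flags hunk in this
patch, but TOY_MIGRATE_SUPPORTED and toy_migrate_flags_ok() are hypothetical,
and the supported set is inferred from the callers visible in the diff rather
than from the body of try_to_migrate() itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the enum ttu_flags hunk in this patch. */
enum ttu_flags {
	TTU_SPLIT_HUGE_PMD	= 0x4,
	TTU_IGNORE_MLOCK	= 0x8,
	TTU_IGNORE_HWPOISON	= 0x20,
	TTU_RMAP_LOCKED		= 0x80,
};

/*
 * Assumption: the migration path accepts only these two flags, inferred
 * from the callers in the diff, which pass 0, TTU_RMAP_LOCKED, or
 * TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD.
 */
#define TOY_MIGRATE_SUPPORTED	(TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD)

/* Returns true when no unsupported flag bit is set. */
static bool toy_migrate_flags_ok(unsigned int flags)
{
	return (flags & ~(unsigned int)TOY_MIGRATE_SUPPORTED) == 0;
}
```

Since migration always ignores mlock and never uses the hwpoison or batch
flush behaviour, rejecting those bits up front keeps the function from
silently accepting flags it does not act on.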

Signed-off-by: Alistair Popple 
Reviewed-by: Christoph Hellwig 
Reviewed-by: Ralph Campbell 

---

v5:
* Added comments about how PMD splitting works for migration vs.
  unmapping
* Tightened up the flag check in try_to_migrate() to be explicit about
  which TTU_XXX flags are supported.
---
 include/linux/rmap.h |   4 +-
 mm/huge_memory.c |  15 +-
 mm/migrate.c |   9 +-
 mm/rmap.c| 358 ---
 4 files changed, 280 insertions(+), 106 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index e26ac2d71346..6062e0cfca2d 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -86,8 +86,6 @@ struct anon_vma_chain {
 };
 
 enum ttu_flags {
-   TTU_MIGRATION   = 0x1,  /* migration mode */
-
TTU_SPLIT_HUGE_PMD  = 0x4,  /* split huge PMD if any */
TTU_IGNORE_MLOCK= 0x8,  /* ignore mlock */
TTU_IGNORE_HWPOISON = 0x20, /* corrupted page is recoverable */
@@ -96,7 +94,6 @@ enum ttu_flags {
 * do a final flush if necessary */
TTU_RMAP_LOCKED = 0x80, /* do not grab rmap lock:
 * caller holds it */
-   TTU_SPLIT_FREEZE= 0x100,/* freeze pte under 
splitting thp */
 };
 
 #ifdef CONFIG_MMU
@@ -193,6 +190,7 @@ static inline void page_dup_rmap(struct page *page, bool 
compound)
 int page_referenced(struct page *, int is_locked,
struct mem_cgroup *memcg, unsigned long *vm_flags);
 
+bool try_to_migrate(struct page *page, enum ttu_flags flags);
 bool try_to_unmap(struct page *, enum ttu_flags flags);
 
 /* Avoid racy checks */
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 89af065cea5b..eab004331b97 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2357,16 +2357,21 @@ void vma_adjust_trans_huge(struct vm_area_struct *vma,
 
 static void unmap_page(struct page *page)
 {
-   enum ttu_flags ttu_flags = TTU_IGNORE_MLOCK |
-   TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD;
+   enum ttu_flags ttu_flags = TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD;
bool unmap_success;
 
VM_BUG_ON_PAGE(!PageHead(page), page);
 
if (PageAnon(page))
-   ttu_flags |= TTU_SPLIT_FREEZE;
-
-   unmap_success = try_to_unmap(page, ttu_flags);
+   unmap_success = try_to_migrate(page, ttu_flags);
+   else
+   /*
+* Don't install migration entries for file backed pages. This
+* helps handle cases when i_size is in the middle of the page
+* as there is no need to unmap pages beyond i_size manually.
+*/
+   unmap_success = try_to_unmap(page, ttu_flags |
+   TTU_IGNORE_MLOCK);
VM_BUG_ON_PAGE(!unmap_success, page);
 }
 
diff --git a/mm/migrate.c b/mm/migrate.c
index b752543adb64..cc4612e2a246 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1130,7 +1130,7 @@ static int __unmap_and_move(struct page *page, struct 
page *newpage,
/* Establish migration ptes */
VM_BUG_ON_PAGE(PageAnon(page) && !PageKsm(page) && !anon_vma,
page);
-   try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK);
+   try_to_migrate(page, 0);
page_was_mapped = 1;
}
 
@@ -1332,7 +1332,7 @@ static int unmap_and_move_huge_page(new_page_t 
get_new_page,
 
if (page_mapped(hpage)) {
bool mapping_locked = false;
-   enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK;
+   enum ttu_flags ttu = 0;
 
if (!PageAnon(hpage)) {
/*
@@ -1349,7 +1349,7 @@ static int unmap_and_move_huge_page(new_page_t 
get_new_page,
ttu |= TTU_RMAP_LOCKED;
}
 
-   try_to_unmap(hpage, ttu);
+   try_to_migrate(hpage, ttu);
page_was_mapped

[PATCH v6 5/8] mm: Device exclusive memory access

Some devices require exclusive write access to shared virtual
memory (SVM) ranges to perform atomic operations on that memory. This
requires CPU page tables to be updated to deny access whilst atomic
operations are occurring.

In order to do this introduce a new swap entry
type (SWP_DEVICE_EXCLUSIVE). When a SVM range needs to be marked for
exclusive access by a device all page table mappings for the particular
range are replaced with device exclusive swap entries. This causes any
CPU access to the page to result in a fault.

Faults are resolved by replacing the faulting entry with the original
mapping. This results in MMU notifiers being called which a driver uses
to update access permissions such as revoking atomic access. After
notifiers have been called the device will no longer have exclusive
access to the region.
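
The life cycle described above can be modelled as a tiny state machine. The
sketch below is a userspace toy, not the kernel implementation: struct
toy_pte, struct toy_device and both functions are invented for illustration,
and the MMU notifier call is reduced to clearing a flag on the device.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_pte_kind {
	TOY_PTE_PRESENT,		/* normal CPU mapping */
	TOY_PTE_DEVICE_EXCLUSIVE,	/* special swap entry, CPU access faults */
};

struct toy_pte {
	enum toy_pte_kind kind;
	unsigned long pfn;	/* kept so the original mapping can be restored */
};

struct toy_device {
	bool has_atomic_access;
};

/* Replace the CPU mapping with an exclusive entry and grant device access. */
static void toy_make_device_exclusive(struct toy_pte *pte,
				      struct toy_device *dev)
{
	pte->kind = TOY_PTE_DEVICE_EXCLUSIVE;
	dev->has_atomic_access = true;
}

/*
 * Model a CPU access. Returns true if the access faulted. On a fault the
 * driver is notified (here: its atomic access is revoked) and the original
 * mapping is restored, so the next access proceeds without faulting.
 */
static bool toy_cpu_access(struct toy_pte *pte, struct toy_device *dev)
{
	if (pte->kind == TOY_PTE_PRESENT)
		return false;

	dev->has_atomic_access = false;	/* stands in for the MMU notifier */
	pte->kind = TOY_PTE_PRESENT;
	return true;
}
```

The property worth noting is that the device never holds exclusive access
across a completed CPU access: the first CPU touch faults, revokes, and
restores the mapping.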

Signed-off-by: Alistair Popple 

---

v6:
* Fixed a bisectability issue due to incorrectly applying the rename of
  migrate_pgmap_owner to the wrong patches for Nouveau and hmm_test.

v5:
* Renamed range->migrate_pgmap_owner to range->owner.
* Added MMU_NOTIFY_EXCLUSIVE to allow passing of a driver cookie which
  allows notifiers called as a result of make_device_exclusive_range() to
  be ignored.
* Added a check to try_to_protect_one() to detect if the pages originally
  returned from get_user_pages() have been unmapped or not.
* Removed check_device_exclusive_range() as it is no longer required with
  the other changes.
* Documentation update.

v4:
* Add function to check that mappings are still valid and exclusive.
* s/long/unsigned long/ in make_device_exclusive_entry().
---
 Documentation/vm/hmm.rst  |  19 ++-
 drivers/gpu/drm/nouveau/nouveau_svm.c |   2 +-
 include/linux/mmu_notifier.h  |  25 +++-
 include/linux/rmap.h  |   4 +
 include/linux/swap.h  |   4 +-
 include/linux/swapops.h   |  44 +-
 lib/test_hmm.c|   2 +-
 mm/hmm.c  |   5 +
 mm/memory.c   | 107 +-
 mm/mprotect.c |   8 +
 mm/page_vma_mapped.c  |   9 +-
 mm/rmap.c | 203 ++
 12 files changed, 419 insertions(+), 13 deletions(-)

diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst
index 09e28507f5b2..a5fdee82c037 100644
--- a/Documentation/vm/hmm.rst
+++ b/Documentation/vm/hmm.rst
@@ -332,7 +332,7 @@ between device driver specific code and shared common code:
walks to fill in the ``args->src`` array with PFNs to be migrated.
The ``invalidate_range_start()`` callback is passed a
``struct mmu_notifier_range`` with the ``event`` field set to
-   ``MMU_NOTIFY_MIGRATE`` and the ``migrate_pgmap_owner`` field set to
+   ``MMU_NOTIFY_MIGRATE`` and the ``owner`` field set to
the ``args->pgmap_owner`` field passed to migrate_vma_setup(). This is
allows the device driver to skip the invalidation callback and only
invalidate device private MMU mappings that are actually migrating.
@@ -405,6 +405,23 @@ between device driver specific code and shared common code:
 
The lock can now be released.
 
+Exclusive access memory
+=======================
+
+Not all devices support atomic access to system memory. To support atomic
+operations to a shared virtual memory page such a device needs access to that
+page which is exclusive of any userspace access from the CPU. The
+``make_device_exclusive_range()`` function can be used to make a memory range
+inaccessible from userspace.
+
+This replaces all mappings for pages in the given range with special swap
+entries. Any attempt to access the swap entry results in a fault which is
+resolved by replacing the entry with the original mapping. A driver gets
+notified that the mapping has been changed by MMU notifiers, after which point
+it will no longer have exclusive access to the page. Exclusive access is
+guaranteed to last until the driver drops the page lock and page reference, at
+which point any CPU faults on the page may proceed as described.
+
 Memory cgroup (memcg) and rss accounting
 
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index f18bd53da052..94f841026c3b 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -265,7 +265,7 @@ nouveau_svmm_invalidate_range_start(struct mmu_notifier *mn,
 * the invalidation is handled as part of the migration process.
 */
if (update->event == MMU_NOTIFY_MIGRATE &&
-   update->migrate_pgmap_owner == svmm->vmm->cli->drm->dev)
+   update->owner == svmm->vmm->cli->drm->dev)
goto out;
 
if (limit > svmm->unmanaged.start && start < svmm->unmanaged.limit) {
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index b8200782dede..455e269bf825 100644
--- a/include/linux/mm

[PATCH v6 6/8] mm: Selftests for exclusive device memory

Adds some selftests for exclusive device memory.

Signed-off-by: Alistair Popple 
Acked-by: Jason Gunthorpe 
Tested-by: Ralph Campbell 
Reviewed-by: Ralph Campbell 
---
 lib/test_hmm.c | 124 ++
 lib/test_hmm_uapi.h|   2 +
 tools/testing/selftests/vm/hmm-tests.c | 219 +
 3 files changed, 345 insertions(+)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 5c9f5a020c1d..305a9d9e2b4c 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "test_hmm_uapi.h"
 
@@ -46,6 +47,7 @@ struct dmirror_bounce {
unsigned long   cpages;
 };
 
+#define DPT_XA_TAG_ATOMIC 1UL
 #define DPT_XA_TAG_WRITE 3UL
 
 /*
@@ -619,6 +621,54 @@ static void dmirror_migrate_alloc_and_copy(struct 
migrate_vma *args,
}
 }
 
+static int dmirror_check_atomic(struct dmirror *dmirror, unsigned long start,
+unsigned long end)
+{
+   unsigned long pfn;
+
+   for (pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); pfn++) {
+   void *entry;
+   struct page *page;
+
+   entry = xa_load(&dmirror->pt, pfn);
+   page = xa_untag_pointer(entry);
+   if (xa_pointer_tag(entry) == DPT_XA_TAG_ATOMIC)
+   return -EPERM;
+   }
+
+   return 0;
+}
+
+static int dmirror_atomic_map(unsigned long start, unsigned long end,
+ struct page **pages, struct dmirror *dmirror)
+{
+   unsigned long pfn, mapped = 0;
+   int i;
+
+   /* Map the migrated pages into the device's page tables. */
+   mutex_lock(&dmirror->mutex);
+
+   for (i = 0, pfn = start >> PAGE_SHIFT; pfn < (end >> PAGE_SHIFT); 
pfn++, i++) {
+   void *entry;
+
+   if (!pages[i])
+   continue;
+
+   entry = pages[i];
+   entry = xa_tag_pointer(entry, DPT_XA_TAG_ATOMIC);
+   entry = xa_store(&dmirror->pt, pfn, entry, GFP_ATOMIC);
+   if (xa_is_err(entry)) {
+   mutex_unlock(&dmirror->mutex);
+   return xa_err(entry);
+   }
+
+   mapped++;
+   }
+
+   mutex_unlock(&dmirror->mutex);
+   return mapped;
+}
+
 static int dmirror_migrate_finalize_and_map(struct migrate_vma *args,
struct dmirror *dmirror)
 {
@@ -661,6 +711,71 @@ static int dmirror_migrate_finalize_and_map(struct 
migrate_vma *args,
return 0;
 }
 
+static int dmirror_exclusive(struct dmirror *dmirror,
+struct hmm_dmirror_cmd *cmd)
+{
+   unsigned long start, end, addr;
+   unsigned long size = cmd->npages << PAGE_SHIFT;
+   struct mm_struct *mm = dmirror->notifier.mm;
+   struct page *pages[64];
+   struct dmirror_bounce bounce;
+   unsigned long next;
+   int ret;
+
+   start = cmd->addr;
+   end = start + size;
+   if (end < start)
+   return -EINVAL;
+
+   /* Since the mm is for the mirrored process, get a reference first. */
+   if (!mmget_not_zero(mm))
+   return -EINVAL;
+
+   mmap_read_lock(mm);
+   for (addr = start; addr < end; addr = next) {
+   int i, mapped;
+
+   if (end < addr + (ARRAY_SIZE(pages) << PAGE_SHIFT))
+   next = end;
+   else
+   next = addr + (ARRAY_SIZE(pages) << PAGE_SHIFT);
+
+   ret = make_device_exclusive_range(mm, addr, next, pages, NULL);
+   mapped = dmirror_atomic_map(addr, next, pages, dmirror);
+   for (i = 0; i < ret; i++) {
+   if (pages[i]) {
+   unlock_page(pages[i]);
+   put_page(pages[i]);
+   }
+   }
+
+   if (addr + (mapped << PAGE_SHIFT) < next) {
+   mmap_read_unlock(mm);
+   mmput(mm);
+   return -EBUSY;
+   }
+   }
+   mmap_read_unlock(mm);
+   mmput(mm);
+
+   /* Return the migrated data for verification. */
+   ret = dmirror_bounce_init(&bounce, start, size);
+   if (ret)
+   return ret;
+   mutex_lock(&dmirror->mutex);
+   ret = dmirror_do_read(dmirror, start, end, &bounce);
+   mutex_unlock(&dmirror->mutex);
+   if (ret == 0) {
+   if (copy_to_user(u64_to_user_ptr(cmd->ptr), bounce.ptr,
+bounce.size))
+   ret = -EFAULT;
+   }
+
+   cmd->cpages = bounce.cpages;
+   dmirror_bounce_fini(&bounce);
+   return ret;
+}
+
 static int dmirror_migrate(struct dmirror *dmirror,
   struct hmm_dmirror_cmd *cmd)
 {
@@ -949,6 +1064,15 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,

[PATCH v6 8/8] nouveau/svm: Implement atomic SVM access

Some NVIDIA GPUs do not support direct atomic access to system memory
via PCIe. Instead this must be emulated by granting the GPU exclusive
access to the memory. This is achieved by replacing CPU page table
entries with special swap entries that fault on userspace access.

The driver then grants the GPU permission to update the page undergoing
atomic access via the GPU page tables. When CPU access to the page is
required a CPU fault is raised which calls into the device driver via
MMU notifiers to revoke the atomic access. The original page table
entries are then restored allowing CPU access to proceed.
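
The diff below also changes nouveau_svm_fault_cmp() so that atomic faults
sort ahead of other access types at the same address. The following
userspace sketch reproduces just that ranking with qsort(). struct toy_fault
and the helper names are made up, and the only access code whose meaning is
taken from the patch is 2 (atomic); the real comparator also orders on
fields not modelled here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_fault {
	uint64_t addr;
	int access;	/* 2 is atomic according to the patch comment */
};

/* Same expression as the patch: atomic -> -1, codes 0 and 3 -> +1, else 0. */
static int toy_access_rank(int access)
{
	return -1 * (access == 2) + (access == 0 || access == 3);
}

/* Order by address first, then by access rank (lowest rank first). */
static int toy_fault_cmp(const void *a, const void *b)
{
	const struct toy_fault *fa = a;
	const struct toy_fault *fb = b;

	if (fa->addr != fb->addr)
		return fa->addr < fb->addr ? -1 : 1;

	return toy_access_rank(fa->access) - toy_access_rank(fb->access);
}
```

Sorting an array of struct toy_fault with qsort() and toy_fault_cmp() puts
the atomic fault for a given address first, so it is the one that decides
how the page gets mapped.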

Signed-off-by: Alistair Popple 

---

v4:
* Check that page table entries haven't changed before mapping on the
  device
---
 drivers/gpu/drm/nouveau/include/nvif/if000c.h |   1 +
 drivers/gpu/drm/nouveau/nouveau_svm.c | 100 --
 drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h |   1 +
 .../drm/nouveau/nvkm/subdev/mmu/vmmgp100.c|   6 ++
 4 files changed, 100 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/include/nvif/if000c.h 
b/drivers/gpu/drm/nouveau/include/nvif/if000c.h
index d6dd40f21eed..9c7ff56831c5 100644
--- a/drivers/gpu/drm/nouveau/include/nvif/if000c.h
+++ b/drivers/gpu/drm/nouveau/include/nvif/if000c.h
@@ -77,6 +77,7 @@ struct nvif_vmm_pfnmap_v0 {
 #define NVIF_VMM_PFNMAP_V0_APER   0x00f0ULL
 #define NVIF_VMM_PFNMAP_V0_HOST   0xULL
 #define NVIF_VMM_PFNMAP_V0_VRAM   0x0010ULL
+#define NVIF_VMM_PFNMAP_V0_A 0x0004ULL
 #define NVIF_VMM_PFNMAP_V0_W  0x0002ULL
 #define NVIF_VMM_PFNMAP_V0_V  0x0001ULL
 #define NVIF_VMM_PFNMAP_V0_NONE   0xULL
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index a195e48c9aee..16b07d7589d2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct nouveau_svm {
struct nouveau_drm *drm;
@@ -421,9 +422,9 @@ nouveau_svm_fault_cmp(const void *a, const void *b)
return ret;
if ((ret = (s64)fa->addr - fb->addr))
return ret;
-   /*XXX: atomic? */
-   return (fa->access == 0 || fa->access == 3) -
-  (fb->access == 0 || fb->access == 3);
+   /* Atomic access (2) has highest priority */
+   return (-1*(fa->access == 2) + (fa->access == 0 || fa->access == 3)) -
+  (-1*(fb->access == 2) + (fb->access == 0 || fb->access == 3));
 }
 
 static void
@@ -487,6 +488,10 @@ static bool nouveau_svm_range_invalidate(struct 
mmu_interval_notifier *mni,
struct svm_notifier *sn =
container_of(mni, struct svm_notifier, notifier);
 
+   if (range->event == MMU_NOTIFY_EXCLUSIVE &&
+   range->owner == sn->svmm->vmm->cli->drm->dev)
+   return true;
+
/*
 * serializes the update to mni->invalidate_seq done by caller and
 * prevents invalidation of the PTE from progressing while HW is being
@@ -555,6 +560,73 @@ static void nouveau_hmm_convert_pfn(struct nouveau_drm 
*drm,
args->p.phys[0] |= NVIF_VMM_PFNMAP_V0_W;
 }
 
+static int nouveau_atomic_range_fault(struct nouveau_svmm *svmm,
+  struct nouveau_drm *drm,
+  struct nouveau_pfnmap_args *args, u32 size,
+  struct svm_notifier *notifier)
+{
+   unsigned long timeout =
+   jiffies + msecs_to_jiffies(HMM_RANGE_DEFAULT_TIMEOUT);
+   struct mm_struct *mm = svmm->notifier.mm;
+   struct page *page;
+   unsigned long start = args->p.addr;
+   unsigned long notifier_seq;
+   int ret = 0;
+
+   ret = mmu_interval_notifier_insert(&notifier->notifier, mm,
+   args->p.addr, args->p.size,
+   &nouveau_svm_mni_ops);
+   if (ret)
+   return ret;
+
+   while (true) {
+   if (time_after(jiffies, timeout)) {
+   ret = -EBUSY;
+   goto out;
+   }
+
+   notifier_seq = mmu_interval_read_begin(&notifier->notifier);
+   mmap_read_lock(mm);
+   make_device_exclusive_range(mm, start, start + PAGE_SIZE,
+   &page, drm->dev);
+   mmap_read_unlock(mm);
+   if (!page) {
+   ret = -EINVAL;
+   goto out;
+   }
+
+   mutex_lock(&svmm->mutex);
+   if (mmu_interval_read_retry(&notifier->notifier,
+   notifier_seq)) {
+   mutex_unlock(&svmm->mutex);
+

Re: [PATCH][next] usb: mtu3: Fix spelling mistake "disabed" -> "disabled"

2021-03-12 Thread Chunfeng Yun

On Thu, 2021-03-11 at 09:25 +, Colin King wrote:
> From: Colin Ian King 
> 
> The variable u3_ports_disabed contains a spelling mistake,
> rename it to u3_ports_disabled.
> 
> Signed-off-by: Colin Ian King 
> ---
>  drivers/usb/mtu3/mtu3_host.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/usb/mtu3/mtu3_host.c b/drivers/usb/mtu3/mtu3_host.c
> index c871b94f3e6f..41a5675ac5ca 100644
> --- a/drivers/usb/mtu3/mtu3_host.c
> +++ b/drivers/usb/mtu3/mtu3_host.c
> @@ -109,7 +109,7 @@ int ssusb_host_enable(struct ssusb_mtk *ssusb)
>   void __iomem *ibase = ssusb->ippc_base;
>   int num_u3p = ssusb->u3_ports;
>   int num_u2p = ssusb->u2_ports;
> - int u3_ports_disabed;
> + int u3_ports_disabled;
>   u32 check_clk;
>   u32 value;
>   int i;
> @@ -118,10 +118,10 @@ int ssusb_host_enable(struct ssusb_mtk *ssusb)
>   mtu3_clrbits(ibase, U3D_SSUSB_IP_PW_CTRL1, SSUSB_IP_HOST_PDN);
>  
>   /* power on and enable u3 ports except skipped ones */
> - u3_ports_disabed = 0;
> + u3_ports_disabled = 0;
>   for (i = 0; i < num_u3p; i++) {
>   if ((0x1 << i) & ssusb->u3p_dis_msk) {
> - u3_ports_disabed++;
> + u3_ports_disabled++;
>   continue;
>   }
>  
> @@ -140,7 +140,7 @@ int ssusb_host_enable(struct ssusb_mtk *ssusb)
>   }
>  
>   check_clk = SSUSB_XHCI_RST_B_STS;
> - if (num_u3p > u3_ports_disabed)
> + if (num_u3p > u3_ports_disabled)
>   check_clk = SSUSB_U3_MAC_RST_B_STS;
Reviewed-by: Chunfeng Yun 

Thanks a lot


>  
>   return ssusb_check_clocks(ssusb, check_clk);

Re: [PATCH v2 28/43] powerpc/64e: Call bad_page_fault() from do_page_fault()





On 10/03/2021 at 02:29, Nicholas Piggin wrote:

Excerpts from Christophe Leroy's message of March 9, 2021 10:09 pm:

book3e/64 is the last one calling __bad_page_fault()
from assembly.

Save non volatile registers before calling do_page_fault()
and modify do_page_fault() to call __bad_page_fault()
for all platforms.

Then it can be refactored by the call of bad_page_fault()
which avoids the duplication of the exception table search.


This can go in with the 64e change after your series. I think it should
be ready for the next merge window as well.


Yes, I thought it would pull in more optimisation, but in the end it doesn't bring anything, so I'll 
drop it for now and leave it to you for your series.




Thanks,
Nick



Signed-off-by: Christophe Leroy 
---
  arch/powerpc/kernel/exceptions-64e.S |  8 +---
  arch/powerpc/mm/fault.c  | 17 -
  2 files changed, 5 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S 
b/arch/powerpc/kernel/exceptions-64e.S
index e8eb9992a270..b60f89078a3f 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1010,15 +1010,9 @@ storage_fault_common:
addir3,r1,STACK_FRAME_OVERHEAD
ld  r14,PACA_EXGEN+EX_R14(r13)
ld  r15,PACA_EXGEN+EX_R15(r13)
+   bl  save_nvgprs
bl  do_page_fault
-   cmpdi   r3,0
-   bne-1f
b   ret_from_except_lite
-1: bl  save_nvgprs
-   mr  r4,r3
-   addir3,r1,STACK_FRAME_OVERHEAD
-   bl  __bad_page_fault
-   b   ret_from_except
  
  /*

   * Alignment exception doesn't fit entirely in the 0x100 bytes so it
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 2e54bac99a22..7bcff3fca110 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -541,24 +541,15 @@ NOKPROBE_SYMBOL(___do_page_fault);
  
  static long __do_page_fault(struct pt_regs *regs)

  {
-   const struct exception_table_entry *entry;
long err;
  
  	err = ___do_page_fault(regs, regs->dar, regs->dsisr);

if (likely(!err))
-   return err;
-
-   entry = search_exception_tables(regs->nip);
-   if (likely(entry)) {
-   instruction_pointer_set(regs, extable_fixup(entry));
return 0;
-   } else if (!IS_ENABLED(CONFIG_PPC_BOOK3E_64)) {
-   __bad_page_fault(regs, err);
-   return 0;
-   } else {
-   /* 32 and 64e handle the bad page fault in asm */
-   return err;
-   }
+
+   bad_page_fault(regs, err);
+
+   return 0;
  }
  NOKPROBE_SYMBOL(__do_page_fault);
  
--

2.25.0

[PATCH v6 7/8] nouveau/svm: Refactor nouveau_range_fault

Call mmu_interval_notifier_insert() as part of nouveau_range_fault().
This doesn't introduce any functional change but makes it easier for a
subsequent patch to alter the behaviour of nouveau_range_fault() to
support GPU atomic operations.
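
The control-flow change this implies is the usual single-exit cleanup
pattern: once the notifier is inserted inside the function, every later exit
has to pass through its removal. A minimal userspace sketch follows. The
toy_ names are hypothetical, attempts_allowed stands in for the jiffies
timeout, and busy_rounds for how often the fault step would report -EBUSY.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical notifier: it only counts live registrations. */
struct toy_notifier {
	int registered;
};

static int toy_notifier_insert(struct toy_notifier *n)
{
	n->registered++;
	return 0;
}

static void toy_notifier_remove(struct toy_notifier *n)
{
	n->registered--;
}

static int toy_range_fault(struct toy_notifier *n, int attempts_allowed,
			   int busy_rounds)
{
	int ret;

	ret = toy_notifier_insert(n);
	if (ret)
		return ret;	/* nothing registered yet, plain return is fine */

	while (1) {
		if (attempts_allowed-- <= 0) {
			ret = -EBUSY;	/* timed out: must still unregister */
			goto out;
		}

		if (busy_rounds-- > 0)
			continue;	/* transient failure, retry */

		ret = 0;
		break;
	}

out:
	toy_notifier_remove(n);
	return ret;
}
```

Both the success path and the timeout path leave the registration count
where it started, which is the invariant the added "out:" label in the diff
is there to keep.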

Signed-off-by: Alistair Popple 
---
 drivers/gpu/drm/nouveau/nouveau_svm.c | 34 ---
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c 
b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 94f841026c3b..a195e48c9aee 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -567,18 +567,27 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
unsigned long hmm_pfns[1];
struct hmm_range range = {
.notifier = &notifier->notifier,
-   .start = notifier->notifier.interval_tree.start,
-   .end = notifier->notifier.interval_tree.last + 1,
.default_flags = hmm_flags,
.hmm_pfns = hmm_pfns,
.dev_private_owner = drm->dev,
};
-   struct mm_struct *mm = notifier->notifier.mm;
+   struct mm_struct *mm = svmm->notifier.mm;
int ret;
 
+   ret = mmu_interval_notifier_insert(&notifier->notifier, mm,
+   args->p.addr, args->p.size,
+   &nouveau_svm_mni_ops);
+   if (ret)
+   return ret;
+
+   range.start = notifier->notifier.interval_tree.start;
+   range.end = notifier->notifier.interval_tree.last + 1;
+
while (true) {
-   if (time_after(jiffies, timeout))
-   return -EBUSY;
+   if (time_after(jiffies, timeout)) {
+   ret = -EBUSY;
+   goto out;
+   }
 
range.notifier_seq = mmu_interval_read_begin(range.notifier);
mmap_read_lock(mm);
@@ -587,7 +596,7 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
if (ret) {
if (ret == -EBUSY)
continue;
-   return ret;
+   goto out;
}
 
mutex_lock(&svmm->mutex);
@@ -606,6 +615,9 @@ static int nouveau_range_fault(struct nouveau_svmm *svmm,
svmm->vmm->vmm.object.client->super = false;
mutex_unlock(&svmm->mutex);
 
+out:
+   mmu_interval_notifier_remove(&notifier->notifier);
+
return ret;
 }
 
@@ -727,14 +739,8 @@ nouveau_svm_fault(struct nvif_notify *notify)
}
 
notifier.svmm = svmm;
-   ret = mmu_interval_notifier_insert(&notifier.notifier, mm,
-  args.i.p.addr, args.i.p.size,
-  &nouveau_svm_mni_ops);
-   if (!ret) {
-   ret = nouveau_range_fault(svmm, svm->drm, &args.i,
-   sizeof(args), hmm_flags, &notifier);
-   mmu_interval_notifier_remove(&notifier.notifier);
-   }
+   ret = nouveau_range_fault(svmm, svm->drm, &args.i,
+   sizeof(args), hmm_flags, &notifier);
mmput(mm);
 
limit = args.i.p.addr + args.i.p.size;
-- 
2.20.1

Re: arm64 syzbot instances

2021-03-12 Thread Arnd Bergmann

On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov  wrote:
> On Thu, Mar 11, 2021 at 2:30 PM Arnd Bergmann  wrote:
> > >
> > > The instances found few arm64-specific issues that we have not
> > > observed on other instances:
> >
> > I've had a brief look at these:
> >
> > > https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
> >
> > This one doesn't seem arm64 specific at all. While the KASAN report has
> > shown up on arm64, the link to
> > https://syzkaller.appspot.com/bug?id=aa8808729c0a3540e6a29f0d45394665caf79dca
> > seems to be for x86 machines running into the same problem.
> >
> > Looking deeper into the log, I see that fw_load_sysfs_fallback() finds
> > an existing list entry on the global "pending_fw_head" list, which seems
> > to have been freed earlier (the allocation listed here is not for a
> > firmware load, so presumably it was recycled in the meantime). The log
> > shows that this is the second time that loading the regulatory database
> > failed in that run, so my guess is that it was the first failed load that
> > left the freed firmware private data on the list, but I don't see how
> > that happened.
> >
> > > https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
> >
> > This one rings a bell: opening a 8250 uart on a well-known port must fail
> > when no I/O ports are registered in the system, or when the PCI I/O ports
> > are mapped to an invalid area.
> >
> > It seems to be attempting a register access at I/O port '1' (virtual
> > address 0xfbfffe81 is one byte into the well-known PCI_IOBASE),
> > which is an unusual place for a UART, traditional PCs had it at 0x3F8.
> >
> > This could be either a result of qemu claiming to support a PIO based UART
> > at the first available address, or the table of UARTS being uninitialized
> > .bss memory.
> >
> > Definitely an arm64 specific bug.
>
> I can reproduce this with just:
>
> #include <sys/syscall.h>
> #include <unistd.h>
>
> int main(void)
> {
>   int fd = syscall(__NR_openat, 0xff9cul, "/dev/ttyS3", 0ul, 0ul);
>   char ch = 0;
>   syscall(__NR_ioctl, fd, 0x5412, &ch); // TIOCSTI
>   return 0;
> }
>
>
> It does not even do any tty setup... does it point to a qemu bug?

There are at least two bugs here, but both could be either in the
kernel or in qemu:

a) accessing a legacy ISA/LPC port should not result in an oops,
but should instead return values with all bits set. There could
be a ratelimited console warning about broken drivers, but we
can't assume that all drivers work correctly, as some ancient
PC style drivers still rely on this.
John Garry has recently worked on a related bugfix, so maybe
either this is the same bug he encountered (and hasn't merged
yet), or if his fix got merged there is still a remaining problem.

b) It should not be possible to open /dev/ttyS3 if the device is
not initialized. What is the output of 'cat /proc/tty/driver/serial'
on this machine? Do you see any messages from the serial
driver in the boot log?
Unfortunately there are so many different ways to probe devices
in the 8250 driver that I don't know where this comes from.
Your config file has
   CONFIG_SERIAL_8250_PNP=y
   CONFIG_SERIAL_8250_NR_UARTS=32
   CONFIG_SERIAL_8250_RUNTIME_UARTS=4
   CONFIG_SERIAL_8250_EXTENDED=y
   I guess it's probably the preconfigured uarts that somehow
   become probed without initialization, but it could also be
   an explicit device incorrectly described by qemu.

Arnd
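
Point (a) above can be illustrated with a small user-space model: a read from a port outside any registered window returns all bits set instead of faulting. This is only a sketch under stated assumptions, not the kernel's port I/O implementation; struct pio_window and model_inb() are hypothetical names, and the backing array merely stands in for device registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one registered I/O port window; not a kernel structure. */
struct pio_window {
    uint16_t base;        /* first valid port number */
    uint16_t size;        /* number of valid ports */
    const uint8_t *regs;  /* backing storage standing in for device registers */
};

/* Return the register value if the port lies inside the window, else 0xff. */
uint8_t model_inb(const struct pio_window *win, uint16_t port)
{
    if (win == NULL || win->size == 0)
        return 0xff;      /* no I/O space registered at all */

    assert(win->regs != NULL);

    if (port < win->base || (uint32_t)(port - win->base) >= win->size)
        return 0xff;      /* outside the window: all bits set, no fault */

    return win->regs[port - win->base];
}
```

In this model a read of port 1 with a window at 0x3f8 simply yields 0xff, which is the value a legacy driver probing for absent hardware would expect to see.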

Re: [PATCH] perf annotate: Fix sample events lost in stdio mode

2021-03-12 Thread Namhyung Kim

On Fri, Mar 12, 2021 at 4:19 PM Yang Jihong  wrote:
>
>
> Hello,
> On 2021/3/12 13:49, Namhyung Kim wrote:
> > Hi,
> >
> > On Fri, Mar 12, 2021 at 12:24 PM Yang Jihong  wrote:
> >>
> >> Hello, Namhyung
> >>
> >> On 2021/3/11 22:42, Namhyung Kim wrote:
> >>> Hi,
> >>>
> >>> On Thu, Mar 11, 2021 at 5:48 PM Yang Jihong wrote:
> >>>>
> >>>> Hello,
> >>>>
> >>>> On 2021/3/6 16:28, Yang Jihong wrote:
> >>>>> In the hist__find_annotations function, since we have a hist_entry per
> >>>>> IP for the same symbol, we free notes->src to signal that this symbol
> >>>>> has already been processed in stdio mode; when annotating, an entry is
> >>>>> skipped if notes->src is NULL to avoid repeated output.
> >>>
> >>> I'm not sure it's still true that we have a hist_entry per IP.
> >>> Afaik the default sort key is comm,dso,sym which means it should have a
> >>> single hist_entry for each symbol.  It seems like an old comment..
> >>>
> >> Emm, yes, we have a hist_entry per IP.
> >> A member named "sym" in struct "hist_entry" points to the symbol;
> >> different IPs may point to the same symbol.
> >
> > Are you sure about this?  It seems like a bug then.
> >
> Yes, now each IP corresponds to a hist_entry :)
>
> Last week I found that some sample events were missing when perf
> annotate in stdio mode, so I went through the annotate code carefully.
>
> The event handling process is as follows:
> process_sample_event
>   evsel_add_sample
>     hists__add_entry
>       __hists__add_entry
>         hists__findnew_entry
>           hist_entry__new  -> here alloc new hist_entry

Yeah, so this is for a symbol.

>
>     hist_entry__inc_addr_samples
>       symbol__inc_addr_samples
>         symbol__hists
>           annotated_source__new -> here alloc annotate source
>           annotated_source__alloc_histograms -> here alloc histograms

This should be for each IP (ideally it should be per instruction).

>
> By bugs, do you mean there's something wrong?

No. I think we were talking about different things.  :)


> >>> diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
> >>> index a23ba6bb99b6..a91fe45bd69f 100644
> >>> --- a/tools/perf/builtin-annotate.c
> >>> +++ b/tools/perf/builtin-annotate.c
> >>> @@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists,
> >>>   } else {
> >>>   hist_entry__tty_annotate(he, evsel, ann);
> >>>   nd = rb_next(nd);
> >>> -   /*
> >>> -* Since we have a hist_entry per IP for the same
> >>> -* symbol, free he->ms.sym->src to signal we already
> >>> -* processed this symbol.
> >>> -*/
> >>> -   zfree(&notes->src->cycles_hist);
> >>> -   zfree(&notes->src);
> >>>   }
> >>>   }
> >>>}
> >>>
> >> This solution may have the following problem:
> >> For example, if two sample events are in two different processes but in
> >> the same symbol, repeated output may occur.
> >> Therefore, a flag is required to indicate whether the symbol has been
> >> processed to avoid repeated output.
> >
> > Hmm.. ok.  Yeah we don't care about the processes here.
> > Then we should remove it from the sort key like below:
> >
> > @@ -624,6 +617,7 @@ int cmd_annotate(int argc, const char **argv)
> >  if (setup_sorting(annotate.session->evlist) < 0)
> >  usage_with_options(annotate_usage, options);
> >  } else {
> > +   sort_order = "dso,symbol";
> >  if (setup_sorting(NULL) < 0)
> >  usage_with_options(annotate_usage, options);
> >  }
> >
> >
> Are you referring to this solution?
> --- a/tools/perf/builtin-annotate.c
> +++ b/tools/perf/builtin-annotate.c
> @@ -374,13 +374,6 @@ static void hists__find_annotations(struct hists *hists,
>  } else {
>  hist_entry__tty_annotate(he, evsel, ann);
>  nd = rb_next(nd);
> -   /*
> -* Since we have a hist_entry per IP for the same
> -* symbol, free he->ms.sym->src to signal we already
> -* processed this symbol.
> -*/
> -   zfree(&notes->src->cycles_hist);
> -   zfree(&notes->src);
>  }
>  }
>   }
> @@ -624,6 +617,7 @@ int cmd_annotate(int argc, const char **argv)
>  if (setup_sorting(annotate.session->evlist) < 0)
>  usage_with_options(annotate_usage, options);
>  } else {
> +   sort_order = "dso,symbol";
>  if (setup_sorting(NULL) < 0)
>  usage_with_options(annotate_usage, options);
>  }
> It seems to
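
The sort-key point discussed in this thread can be illustrated with a toy model: when the key includes the process name, two processes hitting the same symbol produce two entries, and dropping it leaves one entry per symbol. This is a rough sketch, not perf code; struct sample, same_key() and count_entries() are hypothetical names, and the quadratic scan is used only for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sample record: only the fields that form the sort key. */
struct sample {
    const char *comm;
    const char *dso;
    const char *sym;
};

/* Compare two samples under either a "comm,dso,sym" or a "dso,sym" key. */
static bool same_key(const struct sample *a, const struct sample *b, bool use_comm)
{
    if (use_comm && strcmp(a->comm, b->comm) != 0)
        return false;
    return strcmp(a->dso, b->dso) == 0 && strcmp(a->sym, b->sym) == 0;
}

/* Count how many distinct entries the samples collapse into (O(n^2)). */
size_t count_entries(const struct sample *s, size_t n, bool use_comm)
{
    size_t count = 0;

    assert(s != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        bool seen = false;

        for (size_t j = 0; j < i; j++) {
            if (same_key(&s[i], &s[j], use_comm)) {
                seen = true;
                break;
            }
        }
        if (!seen)
            count++;
    }
    return count;
}
```

With samples from two different processes in the same symbol, the count differs between the two keys, which is the repeated-output case raised earlier in the thread.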

Re: [PATCH 0/4] mfd/power: Push data into power supply

On Fri, Mar 12, 2021 at 9:36 AM Linus Walleij  wrote:

> - The power maintainer (Sebastian) provide an ACK

Oops, I noticed that Sebastian actually already gave an ACK for these
patches:
https://lore.kernel.org/linux-pm/20210128001700.pkuyfpq6uzcjb5ud@earth.universe/

Sorry for keeping bad track.

This means Lee can add Sebastian's ACK and
merge these patches at will. (I can also resend
with the ACKs if need be.)

Yours,
Linus Walleij

Re: [PATCH v2 02/43] powerpc/traps: Declare unrecoverable_exception() as __noreturn





On 10/03/2021 at 02:22, Nicholas Piggin wrote:

Excerpts from Christophe Leroy's message of March 9, 2021 10:09 pm:

unrecoverable_exception() is never expected to return, most callers
have an infinite loop in case it returns.

Ensure it really never returns by terminating it with a BUG(), and
declare it __noreturn.

It allows GCC to really simplify functions calling it. In the example
below, it avoids the stack frame in the likely fast path and avoids
code duplication for the exit.

With this patch:


[snip]

Nice.


diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index a44a30b0688c..d5c9d9ddd186 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -2170,11 +2170,15 @@ DEFINE_INTERRUPT_HANDLER(SPEFloatingPointRoundException)
   * in the MSR is 0.  This indicates that SRR0/1 are live, and that
   * we therefore lost state by taking this exception.
   */
-void unrecoverable_exception(struct pt_regs *regs)
+void __noreturn unrecoverable_exception(struct pt_regs *regs)
  {
pr_emerg("Unrecoverable exception %lx at %lx (msr=%lx)\n",
 regs->trap, regs->nip, regs->msr);
die("Unrecoverable exception", regs, SIGABRT);
+   /* die() should not return */
+   WARN(true, "die() unexpectedly returned");
+   for (;;)
+   ;
  }


I don't think the WARN should be added because that will cause another
interrupt after something is already badly wrong, so this might just
make it harder to debug.

For example if die() is falling through for some reason, we warn and
cause a program check here, and that might also be unrecoverable so it
might come through here and fall through again and warn again, etc.

Putting the infinite loop is good enough I think (and better than there
was previously).


Ok, dropped the WARN()



Otherwise

Reviewed-by: Nicholas Piggin 

Thanks,
Nick
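
The outcome agreed on above (a noreturn function that ends in a plain infinite loop, without the extra WARN) can be sketched in user space as follows. This is an illustration under assumptions, not the powerpc code: report_fatal() is a hypothetical stand-in for die() that is allowed to return, and the GCC/Clang attribute syntax is used in place of the kernel's __noreturn macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for die(): it reports the problem but may return. */
static void report_fatal(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
}

/* Never returns: if the reporting helper falls through, spin forever. */
__attribute__((noreturn)) void unrecoverable(const char *msg)
{
    assert(msg != NULL);
    report_fatal(msg);
    for (;;)
        ;
}

/*
 * Because unrecoverable() is declared noreturn, the compiler knows the
 * division below is only reached with a non-zero divisor and needs no
 * extra exit path after the call.
 */
int checked_div(int a, int b)
{
    if (b == 0)
        unrecoverable("division by zero");
    return a / b;
}
```

Only the non-failing path is exercised in practice here; calling checked_div() with a zero divisor would spin forever by design.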

Re: arm64 syzbot instances

2021-03-12 Thread Dmitry Vyukov

On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann  wrote:
>
> On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov  wrote:
> > On Thu, Mar 11, 2021 at 2:30 PM Arnd Bergmann  wrote:
> > > >
> > > > The instances found few arm64-specific issues that we have not
> > > > observed on other instances:
> > >
> > > I've had a brief look at these:
> > >
> > > > https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
> > >
> > > This one doesn't seem arm64 specific at all. While the KASAN report has
> > > shown up on arm64, the link to
> > > https://syzkaller.appspot.com/bug?id=aa8808729c0a3540e6a29f0d45394665caf79dca
> > > seems to be for x86 machines running into the same problem.
> > >
> > > Looking deeper into the log, I see that fw_load_sysfs_fallback() finds
> > > an existing list entry on the global "pending_fw_head" list, which seems
> > > to have been freed earlier (the allocation listed here is not for a
> > > firmware load, so presumably it was recycled in the meantime). The log
> > > shows that this is the second time that loading the regulatory database
> > > failed in that run, so my guess is that it was the first failed load that
> > > left the freed firmware private data on the list, but I don't see how
> > > that happened.
> > >
> > > > https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
> > >
> > > This one rings a bell: opening a 8250 uart on a well-known port must fail
> > > when no I/O ports are registered in the system, or when the PCI I/O ports
> > > are mapped to an invalid area.
> > >
> > > It seems to be attempting a register access at I/O port '1' (virtual
> > > address 0xfbfffe81 is one byte into the well-known PCI_IOBASE),
> > > which is an unusual place for a UART, traditional PCs had it at 0x3F8.
> > >
> > > This could be either a result of qemu claiming to support a PIO based UART
> > > at the first available address, or the table of UARTS being uninitialized
> > > .bss memory.
> > >
> > > Definitely an arm64 specific bug.
> >
> > I can reproduce this with just:
> >
> > > #include <sys/syscall.h>
> > > #include <unistd.h>
> >
> > int main(void)
> > {
> > >   int fd = syscall(__NR_openat, 0xff9cul, "/dev/ttyS3", 0ul, 0ul);
> >   char ch = 0;
> >   syscall(__NR_ioctl, fd, 0x5412, &ch); // TIOCSTI
> >   return 0;
> > }
> >
> >
> > It does not even do any tty setup... does it point to a qemu bug?
>
> There are at least two bugs here, but both could be either in the
> kernel or in qemu:
>
> a) accessing a legacy ISA/LPC port should not result in an oops,
> but should instead return values with all bits set. There could
> be a ratelimited console warning about broken drivers, but we
> can't assume that all drivers work correctly, as some ancient
> PC style drivers still rely on this.
> John Garry has recently worked on a related bugfix, so maybe
> either this is the same bug he encountered (and hasn't merged
> yet), or if his fix got merged there is still a remaining problem.
>
> b) It should not be possible to open /dev/ttyS3 if the device is
> not initialized. What is the output of 'cat /proc/tty/driver/serial'
> on this machine? Do you see any messages from the serial
> driver in the boot log?
> Unfortunately there are so many different ways to probe devices
> in the 8250 driver that I don't know where this comes from.
> Your config file has
>CONFIG_SERIAL_8250_PNP=y
>CONFIG_SERIAL_8250_NR_UARTS=32
>CONFIG_SERIAL_8250_RUNTIME_UARTS=4
>CONFIG_SERIAL_8250_EXTENDED=y
>I guess it's probably the preconfigured uarts that somehow
>become probed without initialization, but it could also be
>an explicit device incorrectly described by qemu.


Here is the full boot log, /proc/tty/driver/serial and the crash:
https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt

RE: [PATCH] leds: leds-dual-gpio: Add dual GPIO LEDs driver

2021-03-12 Thread Hermes Zhang

Hi Alexander,

> Am Donnerstag, 11. März 2021, 14:04:08 CET schrieb Hermes Zhang:
> > From: Hermes Zhang 
> >
> > Introduce a new Dual GPIO LED driver. These two GPIOs LED will act as
> > one LED as normal GPIO LED but give the possibility to change the
> > intensity in four levels: OFF, LOW, MIDDLE and HIGH.
> 
> Interesting use case. Is there any real world hardware wired like that you
> could point to?
> 

Yes, we have the HW; it's not a chip, just a simple circuit.
 
> > +config LEDS_DUAL_GPIO
> > +   tristate "LED Support for Dual GPIO connected LEDs"
> > +   depends on LEDS_CLASS
> > +   depends on GPIOLIB || COMPILE_TEST
> > +   help
> > + This option enables support for the two LEDs connected to GPIO
> > + outputs. These two GPIO LEDs act as one LED in the sysfs and
> > + perform different intensity by enable either one of them or both.
> 
> Well, although I never had time to implement that, I suspect that could
> conflict if someone will eventually write a driver for two pin dual color LEDs
> connected to GPIO pins.  We actually do that on our hardware and I know
> others do, too.
> 
> I asked about that back in 2019, see this thread:
> 
> https://www.spinics.net/lists/linux-leds/msg11665.html
> 
> At the time the multicolor framework was not yet merged, so today I would
> probably make something which either uses the multicolor framework or at
> least has a similar interface to userspace. However, it probably won't
> surprise you all, this is not highest priority on my ToDo list. ;-)
> 
> (What we actually do is pretend those are separate LEDs and ignore the
> conflicting case where both GPIOs are on and the LED is dark then.)
> 

Yes, that case seems to conflict with mine; the pattern for me is:

P1 | P2 | LED
---+----+----------
 0 |  0 | off
 0 |  1 | Any color
 1 |  0 | Any color
 1 |  1 | both on

I'm now investigating another approach based on Marek's suggestion, using
REGULATOR_GPIO, to see if it can meet my requirements. If so, then I think no
new driver is needed.

Best Regards,
Hermes
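
For reference, the four-level idea from the patch description (OFF, LOW, MIDDLE, HIGH driven by two GPIOs) can be sketched as a simple mapping. The thread does not say which GPIO corresponds to which level, so the assignment below (LOW drives only the first pin, MIDDLE only the second, HIGH both) is purely an assumption, and enum dual_led_level, struct gpio_pair and level_to_gpios() are hypothetical names, not taken from the proposed driver.

```c
#include <assert.h>

/* Hypothetical intensity levels matching the four states in the description. */
enum dual_led_level {
    LEVEL_OFF = 0,
    LEVEL_LOW = 1,
    LEVEL_MIDDLE = 2,
    LEVEL_HIGH = 3
};

/* Logical values to write to the two GPIO lines. */
struct gpio_pair {
    int p1;
    int p2;
};

/*
 * Assumed encoding: bit 0 of the level drives P1 and bit 1 drives P2,
 * so OFF = 0/0, LOW = 1/0, MIDDLE = 0/1, HIGH = 1/1.
 */
struct gpio_pair level_to_gpios(enum dual_led_level level)
{
    struct gpio_pair out;

    assert(level >= LEVEL_OFF && level <= LEVEL_HIGH);

    out.p1 = (int)level & 1;
    out.p2 = ((int)level >> 1) & 1;
    return out;
}
```

A real driver would write these two values to its GPIO descriptors; whether that is better done in a new LED driver or through an existing regulator-based approach is exactly the question left open above.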

fuse: kernel BUG at mm/truncate.c:763!

2021-03-12 Thread Luis Henriques

Hi Miklos,

I've seen a bug report (5.10.16 kernel splat below) that seems to be
reproducible in kernels as early as 5.4.

The commit that caught my attention when looking at what was merged in 5.4
was e4648309b85a ("fuse: truncate pending writes on O_TRUNC") but I didn't
dig much deeper into that -- I was wondering if you have seen something
similar before.

There's another splat in the bug report[1] for a 5.4.14 kernel (which may
be for a different bug, but the traces don't look as reliable as the one
below).

[1] https://bugzilla.opensuse.org/show_bug.cgi?id=1182929

[97604.721590] kernel BUG at mm/truncate.c:763!
[97604.721601] invalid opcode:  [#1] SMP PTI
[97604.721613] CPU: 18 PID: 1584438 Comm: g++ Tainted: P   O 
 5.10.16-1-default #1 openSUSE Tumbleweed
[97604.721618] Hardware name: Supermicro X11DPi-N(T)/X11DPi-N, BIOS 3.1a
10/16/2019
[97604.721631] RIP: 0010:invalidate_inode_pages2_range+0x366/0x4e0
[97604.721637] Code: 0f 48 f0 e9 19 ff ff ff 31 c9 4c 89 e7 ba 01 00 00 00
48 89 ee e8 1a c5 02 00 4c 89 ff e8 02 1b 01 00 84 c0 0f 84 ca fe ff ff <0f>
0b 49 8b 57 18 49 39 d4 0f 85 e2 fe ff ff 49 f7 07 00 60 00 00
[97604.721645] RSP: 0018:a613aa54ba40 EFLAGS: 00010202
[97604.721651] RAX: 0001 RBX: 000a RCX:
0200
[97604.721656] RDX: 0090 RSI: 00a800010037 RDI:
d880718e
[97604.721660] RBP: 1400 R08: 1400 R09:
1a73
[97604.721664] R10:  R11: 04a684da R12:
8a28d4549d78
[97604.721669] R13:  R14:  R15:
d880718e
[97604.721674] FS:  7f9cdd7fb740() GS:8a5c7f98()
knlGS:
[97604.721679] CS:  0010 DS:  ES:  CR0: 80050033
[97604.721683] CR2: 7f89d3d78d80 CR3: 004d8a14e005 CR4:
007706e0
[97604.721688] DR0:  DR1:  DR2:

[97604.721692] DR3:  DR6: fffe0ff0 DR7:
0400
[97604.721696] PKRU: 5554
[97604.721699] Call Trace:
[97604.721719]  ? request_wait_answer+0x11a/0x210 [fuse]
[97604.721729]  ? fuse_dentry_delete+0xb/0x20 [fuse]
[97604.721740]  fuse_finish_open+0x85/0x150 [fuse]
[97604.721750]  fuse_open_common+0x1a8/0x1b0 [fuse]
[97604.721759]  ? fuse_open_common+0x1b0/0x1b0 [fuse]
[97604.721766]  do_dentry_open+0x14e/0x380
[97604.721775]  path_openat+0x600/0x10d0
[97604.721782]  ? handle_mm_fault+0x103c/0x1a00
[97604.721791]  ? follow_page_pte+0x314/0x5f0
[97604.721795]  do_filp_open+0x88/0x130
[97604.721803]  ? security_prepare_creds+0x6d/0x90
[97604.721808]  ? __kmalloc+0x11d/0x2a0
[97604.721814]  do_open_execat+0x6d/0x1a0
[97604.721819]  bprm_execve+0x190/0x6b0
[97604.721825]  do_execveat_common+0x192/0x1c0
[97604.721830]  __x64_sys_execve+0x39/0x50
[97604.721836]  do_syscall_64+0x33/0x80
[97604.721843]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[97604.721848] RIP: 0033:0x7f9cdcfe2c37
[97604.721853] Code: ff ff 76 df 89 c6 f7 de 64 41 89 32 eb d5 89 c6 f7 de
64 41 89 32 eb db 66 2e 0f 1f 84 00 00 00 00 00 90 b8 3b 00 00 00 0f 05 <48>
3d 00 f0 ff ff 77 02 f3 c3 48 8b 15 08 12 30 00 f7 d8 64 89 02
[97604.721862] RSP: 002b:7ffe444f5758 EFLAGS: 0202 ORIG_RAX:
003b
[97604.721867] RAX: ffda RBX: 7f9cdd7fb6a0 RCX:
7f9cdcfe2c37
[97604.721872] RDX: 020f5300 RSI: 020f3bf8 RDI:
020f36a0
[97604.721876] RBP: 0001 R08:  R09:

[97604.721880] R10: 7ffe444f4b60 R11: 0202 R12:

[97604.721884] R13: 0001 R14: 020f36a0 R15:

[97604.721890] Modules linked in: overlay rpcsec_gss_krb5 nfsv4 dns_resolver
nfsv3 nfs fscache libafs(PO) iscsi_ibft iscsi_boot_sysfs rfkill
vboxnetadp(O) vboxnetflt(O) vboxdrv(O) dmi_sysfs intel_rapl_msr
intel_rapl_common isst_if_common joydev ipmi_ssif i40iw ib_uverbs iTCO_wdt
intel_pmc_bxt ib_core hid_generic iTCO_vendor_support skx_edac nfit
libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel acpi_ipmi
usbhid kvm i40e ipmi_si ioatdma mei_me i2c_i801 irqbypass ipmi_devintf mei
i2c_smbus lpc_ich dca efi_pstore pcspkr ipmi_msghandler tiny_power_button
acpi_pad button nls_iso8859_1 nls_cp437 vfat fat nfsd nfs_acl lockd
auth_rpcgss grace sunrpc fuse configfs nfs_ssc ast i2c_algo_bit
drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
cec rc_core drm_ttm_helper xhci_pci ttm xhci_pci_renesas xhci_hcd
crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel
drm glue_helper crypto_simd cryptd usbcore wmi sg br_netfilter bridge stp
llc
[97604.721991]  dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua
msr efivarfs
[97604.722031] ---[ end trace edcabaccd35272e2 ]---
[97604.727773] RIP: 0010:invalidate_inode_pages2_range+0x366/0x4e0

Cheers,
--
Luís

[PATCH v6] soc: fsl: enable acpi support in RCPM driver

2021-03-12 Thread Ran Wang

From: Peng Ma 

This patch enables ACPI support in RCPM driver.

Signed-off-by: Peng Ma 
Signed-off-by: Ran Wang 
---
Change in v6:
 - Remove copyright update to rebase on latest mainline

Change in v5:
 - Fix panic when dev->of_node is null

Change in v4:
 - Make commit subject more accurate
 - Remove unrelated new blank line

Change in v3:
 - Add #ifdef CONFIG_ACPI for acpi_device_id
 - Rename rcpm_acpi_imx_ids to rcpm_acpi_ids

Change in v2:
 - Update acpi_device_id to fix conflict with other driver

 drivers/soc/fsl/rcpm.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/soc/fsl/rcpm.c b/drivers/soc/fsl/rcpm.c
index 4ace28cab314..7aa997b932d1 100644
--- a/drivers/soc/fsl/rcpm.c
+++ b/drivers/soc/fsl/rcpm.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include <linux/acpi.h>
 
 #define RCPM_WAKEUP_CELL_MAX_SIZE  7
 
@@ -78,10 +79,14 @@ static int rcpm_pm_prepare(struct device *dev)
"fsl,rcpm-wakeup", value,
rcpm->wakeup_cells + 1);
 
-   /*  Wakeup source should refer to current rcpm device */
-   if (ret || (np->phandle != value[0]))
+   if (ret)
continue;
 
+   if (is_of_node(dev->fwnode))
+   /*  Should refer to current rcpm device */
+   if (np->phandle != value[0])
+   continue;
+
/* Property "#fsl,rcpm-wakeup-cells" of rcpm node defines the
 * number of IPPDEXPCR register cells, and "fsl,rcpm-wakeup"
 * of wakeup source IP contains an integer array:

Re: [syzbot] WARNING in huge_pmd_set_accessed

2021-03-12 Thread Dmitry Vyukov

On Fri, Mar 12, 2021 at 8:07 AM syzbot
 wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit:05a59d79 Merge git://git.kernel.org:/pub/scm/linux/kernel/..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15b8820ad0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=750735fdbc630971
> dashboard link: https://syzkaller.appspot.com/bug?extid=edb1179c837e79cc2fc3
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+edb1179c837e79cc2...@syzkaller.appspotmail.com

The kernel produced corrupted output; there is actually a kvm_wait frame.

#syz dup: WARNING in kvm_wait

> ------------[ cut here ]------------
> raw_local_irq_restore() called with IRQs enabled
> WARNING: CPU: 1 PID: 8400 at kernel/locking/irqflag-debug.c:10 
> warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10
> Modules linked in:
> CPU: 1 PID: 8400 Comm: syz-fuzzer Not tainted 5.12.0-rc2-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> RIP: 0010:warn_bogus_irq_restore+0x1d/0x20 kernel/locking/irqflag-debug.c:10
> Code: be ff cc cc cc cc cc cc cc cc cc cc cc 80 3d 11 d1 ad 04 00 74 01 c3 48 
> c7 c7 20 79 6b 89 c6 05 00 d1 ad 04 01 e8 75 5b be ff <0f> 0b c3 48 39 77 10 
> 0f 84 97 00 00 00 66 f7 47 22 f0 ff 74 4b 48
> RSP: :c90001737ac8 EFLAGS: 00010282
> RAX:  RBX: 88801992a840 RCX: 
> RDX: 8880223d0200 RSI: 815b4435 RDI: f520002e6f4b
> RBP: 0200 R08:  R09: 
> R10: 815ad19e R11:  R12: 0003
> R13: ed1003325508 R14: 0001 R15: 8880b9d36000
> FS:  00c2e890() GS:8880b9d0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> DR0:  DR1:  DR2: 
> Call Trace:
>  pv_wait arch/x86/include/asm/paravirt.h:564 [inline]
>  pv_wait_head_or_lock kernel/locking/qspinlock_paravirt.h:470 [inline]
>  __pv_queued_spin_lock_slowpath+0x8b8/0xb40 kernel/locking/qspinlock.c:508
>  pv_queued_spin_lock_slowpath arch/x86/include/asm/paravirt.h:554 [inline]
>  queued_spin_lock_slowpath arch/x86/include/asm/qspinlock.h:51 [inline]
>  queued_spin_lock include/asm-generic/qspinlock.h:85 [inline]
>  do_raw_spin_lock+0x200/0x2b0 kernel/locking/spinlock_debug.c:113
>  spin_lock include/linux/spinlock.h:354 [inline]
>  pmd_lock include/linux/mm.h:2264 [inline]
>  huge_pmd_set_accessed+0x103/0x320 mm/huge_memory.c:1265
>  handle_mm_fault+0x1bc/0x7e0 mm/memory.c:4549
>  handle_page_fault arch/x86/mm/fault.c:1475 [inline]
>  exc_page_fault+0x9e/0x180 arch/x86/mm/fault.c:1531
> RIP: 0033:0x59072c
> Code: 48 8d 05 97 25 3e 00 48 89 44 24 08 e8 6d 54 ea ff 90 e8 07 a1 ed ff eb 
> a5 cc cc cc cc cc 8b 44 24 10 48 8b 4c 24 08 89 41 24  cc cc cc cc cc cc 
> cc cc cc cc cc cc cc cc cc cc cc cc cc 48 8b
> RSP: 002b:00c0002e97b0 EFLAGS: 00010286
> RAX: 4ef5 RBX: 4ef5 RCX: 00d85fe0
> RBP: 00c0002e9890 R08: 4ef4 R09: 0059c5a0
> R13: 00aa R14: 0093f064 R15: 0038
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkal...@googlegroups.com.
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Re: [RFC 0/2] virtio-pmem: Asynchronous flush


On 12.03.21 07:02, Dan Williams wrote:

On Thu, Mar 11, 2021 at 8:21 PM Pankaj Gupta
 wrote:


Hi David,


   Jeff reported a preflush ordering issue with the existing implementation
   of virtio pmem preflush. Dan suggested[1] implementing asynchronous flush
   for virtio pmem using a work queue, as done in md/RAID. This patch series
   intends to solve the preflush ordering issue and also makes the flush
   asynchronous from the submitting thread's POV.

   Submitting this patch series for feedback; it is still WIP. I have
   done basic testing and am currently doing more testing.

Pankaj Gupta (2):
pmem: make nvdimm_flush asynchronous
virtio_pmem: Async virtio-pmem flush

   drivers/nvdimm/nd_virtio.c   | 66 ++--
   drivers/nvdimm/pmem.c| 15 
   drivers/nvdimm/region_devs.c |  3 +-
   drivers/nvdimm/virtio_pmem.c |  9 +
   drivers/nvdimm/virtio_pmem.h | 12 +++
   5 files changed, 78 insertions(+), 27 deletions(-)

[1] https://marc.info/?l=linux-kernel&m=157446316409937&w=2



Just wondering, was there any follow up of this or are we still waiting
for feedback? :)


Thank you for bringing this up.

My apologies I could not follow up on this. I have another version in my local
tree but could not post it as I was not sure if I solved the problem
correctly. I will clean it up and post for feedback as soon as I can.

P.S: Due to serious personal/family health issues I am not able to
devote much time on this with other professional commitments. I feel bad
that I have this unfinished task.
Just in last one year things have not been stable for me & my family
and still not getting :(


No worries Pankaj. Take care of yourself and your family. The
community can handle this for you. I'm open to coaching somebody
through what's involved to get this fix landed.


Absolutely, no need to worry for now - take care of yourself and your 
loved ones! I was merely stumbling over this series while cleaning up my 
inbox, wondering if this is still stuck waiting for review/feedback. No 
need to rush anything or be stressed.


In case I have time to look into this in the future, I'd coordinate in 
this thread (especially, asking for feedback again so I know where this 
series stands)!


--
Thanks,

David / dhildenb

Re: [PATCH v3 1/3] mm: replace migrate_prep with lru_add_drain_all


On 10.03.21 17:14, Minchan Kim wrote:

Currently, migrate_prep is merely a wrapper of lru_add_drain_all.
There is not much to gain from having additional abstraction.

Use lru_add_drain_all instead of migrate_prep, which would be more
descriptive.

note: migrate_prep_local in compaction.c is changed into lru_add_drain
to avoid the CPU scheduling cost of involving many other CPUs, which
keeps the old behavior.

Signed-off-by: Minchan Kim 
---
  include/linux/migrate.h |  5 -
  mm/compaction.c |  3 ++-
  mm/mempolicy.c  |  4 ++--
  mm/migrate.c| 24 +---
  mm/page_alloc.c |  2 +-
  mm/swap.c   |  5 +
  6 files changed, 11 insertions(+), 32 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 3a389633b68f..6155d97ec76c 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -45,8 +45,6 @@ extern struct page *alloc_migration_target(struct page *page, 
unsigned long priv
  extern int isolate_movable_page(struct page *page, isolate_mode_t mode);
  extern void putback_movable_page(struct page *page);
  
-extern void migrate_prep(void);
-extern void migrate_prep_local(void);
  extern void migrate_page_states(struct page *newpage, struct page *page);
  extern void migrate_page_copy(struct page *newpage, struct page *page);
  extern int migrate_huge_page_move_mapping(struct address_space *mapping,
@@ -66,9 +64,6 @@ static inline struct page *alloc_migration_target(struct page 
*page,
  static inline int isolate_movable_page(struct page *page, isolate_mode_t mode)
{ return -EBUSY; }
  
-static inline int migrate_prep(void) { return -ENOSYS; }
-static inline int migrate_prep_local(void) { return -ENOSYS; }
-
  static inline void migrate_page_states(struct page *newpage, struct page 
*page)
  {
  }
diff --git a/mm/compaction.c b/mm/compaction.c
index e04f4476e68e..3be017ececc0 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -2319,7 +2319,8 @@ compact_zone(struct compact_control *cc, struct 
capture_control *capc)
trace_mm_compaction_begin(start_pfn, cc->migrate_pfn,
cc->free_pfn, end_pfn, sync);
  
-	migrate_prep_local();
+   /* lru_add_drain_all could be expensive with involving other CPUs */
+   lru_add_drain();
  
  	while ((ret = compact_finished(cc)) == COMPACT_CONTINUE) {
int err;
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index ab51132547b8..fc024e97be37 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1124,7 +1124,7 @@ int do_migrate_pages(struct mm_struct *mm, const 
nodemask_t *from,
int err = 0;
nodemask_t tmp;
  
-	migrate_prep();
+   lru_add_drain_all();
  
  	mmap_read_lock(mm);
  
@@ -1323,7 +1323,7 @@ static long do_mbind(unsigned long start, unsigned long len,
  
  	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
  
-		migrate_prep();
+   lru_add_drain_all();
}
{
NODEMASK_SCRATCH(scratch);
diff --git a/mm/migrate.c b/mm/migrate.c
index 62b81d5257aa..45f925e10f5a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -57,28 +57,6 @@
  
  #include "internal.h"
  
-/*
- * migrate_prep() needs to be called before we start compiling a list of pages
- * to be migrated using isolate_lru_page(). If scheduling work on other CPUs is
- * undesirable, use migrate_prep_local()
- */
-void migrate_prep(void)
-{
-   /*
-* Clear the LRU lists so pages can be isolated.
-* Note that pages may be moved off the LRU after we have
-* drained them. Those pages will fail to migrate like other
-* pages that may be busy.
-*/
-   lru_add_drain_all();
-}
-
-/* Do the necessary work of migrate_prep but not if it involves other CPUs */
-void migrate_prep_local(void)
-{
-   lru_add_drain();
-}
-
  int isolate_movable_page(struct page *page, isolate_mode_t mode)
  {
struct address_space *mapping;
@@ -1769,7 +1747,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t 
task_nodes,
int start, i;
int err = 0, err1;
  
-	migrate_prep();
+   lru_add_drain_all();
  
  	for (i = start = 0; i < nr_pages; i++) {
const void __user *p;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2e8348936df8..f05a8db741ca 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8467,7 +8467,7 @@ static int __alloc_contig_migrate_range(struct 
compact_control *cc,
.gfp_mask = GFP_USER | __GFP_MOVABLE | __GFP_RETRY_MAYFAIL,
};
  
-	migrate_prep();
+   lru_add_drain_all();
  
  	while (pfn < end || !list_empty(&cc->migratepages)) {
if (fatal_signal_pending(current)) {
diff --git a/mm/swap.c b/mm/swap.c
index 31b844d4ed94..441d1ae1f285 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -729,6 +729,11 @@ static void lru_add_drain_per_cpu(struct work_struct 
*dummy)
  }
  
  /*
+ * lru_add_drain_all() usually needs to be called before we start compiling
+ * a list of

Re: [PATCH v3 08/11] gpio: sim: new testing module

2021-03-12 Thread Bartosz Golaszewski

On Wed, Mar 10, 2021 at 1:28 PM Andy Shevchenko
 wrote:
>

[snip]

> > +
> > +static ssize_t gpio_sim_sysfs_line_show(struct device *dev,
> > + struct device_attribute *attr,
> > + char *buf)
> > +{
> > + struct gpio_sim_attribute *line_attr = to_gpio_sim_attr(attr);
> > + struct gpio_sim_chip *chip = dev_get_drvdata(dev);
> > + int ret;
> > +
> > + mutex_lock(&chip->lock);
> > + ret = sprintf(buf, "%u\n", !!test_bit(line_attr->offset, 
> > chip->values));
>
> Shouldn't we use sysfs_emit() in a new code?
>

TIL it exists. :) I'll use it.
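
The point of a bounded emit helper can be shown outside the kernel. The following is a minimal user-space sketch, not the kernel implementation: `emit_bounded()`, `line_show()` and `EMIT_BUF_SIZE` are hypothetical names, and the 4096-byte bound is an assumption standing in for the one-page buffer that sysfs passes to show() callbacks.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Assumption: the output buffer is one 4096-byte page. */
#define EMIT_BUF_SIZE 4096

/*
 * Hypothetical stand-in for a bounded "emit" helper: format into buf,
 * never store more than EMIT_BUF_SIZE bytes (including the NUL), and
 * return the number of characters actually stored.
 */
static int emit_bounded(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(buf != NULL);

	va_start(args, fmt);
	len = vsnprintf(buf, EMIT_BUF_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	/* vsnprintf reports the untruncated length; clamp to what fits. */
	if (len >= EMIT_BUF_SIZE)
		len = EMIT_BUF_SIZE - 1;
	return len;
}

/* Same shape as the quoted show() callback: print one bit as 0 or 1. */
static int line_show(char *buf, unsigned long values, unsigned int offset)
{
	return emit_bounded(buf, "%u\n",
			    (unsigned int)((values >> offset) & 1UL));
}
```

Unlike a bare sprintf(), the helper cannot run past the end of the buffer even if a later change makes the formatted string longer than expected.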

[snip]

> > +
> > +static ssize_t gpio_sim_config_dev_name_show(struct config_item *item,
> > +  char *page)
> > +{
> > + struct gpio_sim_chip_config *config = to_gpio_sim_chip_config(item);
> > + struct platform_device *pdev;
> > + int ret;
> > +
> > + mutex_lock(&config->lock);
> > + pdev = config->pdev;
> > + if (pdev)
> > + ret = sprintf(page, "%s\n", dev_name(&pdev->dev));
> > + else
> > + ret = sprintf(page, "n/a\n");
>
> I dunno if '/' (slash) is a good character to be handled in a shell.
> I would prefer 'none' or 'not available' (I think space is easier,
> because the rules to escape it are much simpler: you just need to put it
> in quotes, while / needs to be escaped separately).
>

My test cases work fine with 'n/a' but I can change it to 'none' if
it's less controversial.

[snip]

>
> Also, I don't know what the rules are about using s*printf() in configfs.
> Maybe we have a sysfs_emit() analogue, or it isn't applicable here at all.
> Greg?
>

There's no configfs_emit() or anything similar. Output for simple
attributes must simply not exceed 4096 bytes. It used to be PAGE_SIZE,
now it's defined in fs/configfs/file.c as SIMPLE_ATTR_SIZE. There's no
need to check the length of the string here though as we're only
showing what we received from the user-space anyway and configfs makes
sure we don't get more than SIMPLE_ATTR_SIZE in the store callback.

[snip]

> > +
> > +static int gpio_sim_config_commit_item(struct config_item *item)
> > +{
> > + struct gpio_sim_chip_config *config = to_gpio_sim_chip_config(item);
> > + struct property_entry properties[GPIO_SIM_MAX_PROP];
> > + struct platform_device_info pdevinfo;
> > + struct platform_device *pdev;
> > + unsigned int prop_idx = 0;
> > +
> > + memset(&pdevinfo, 0, sizeof(pdevinfo));
> > + memset(properties, 0, sizeof(properties));
> > +
> > + mutex_lock(&config->lock);
> > +
> > + properties[prop_idx++] = PROPERTY_ENTRY_U32("gpio-sim,nr-gpios",
> > + config->num_lines);
>
> > + if (config->label[0] != '\0')
>
> I'm wondering if we need this check. Isn't core taking care of it?
>
> > + properties[prop_idx++] = 
> > PROPERTY_ENTRY_STRING("gpio-sim,label",
> > +config->label);
>
> > + if (config->line_names)
>
> Ditto.
>
> > + properties[prop_idx++] = PROPERTY_ENTRY_STRING_ARRAY_LEN(
> > + "gpio-line-names",
> > + config->line_names,
> > + config->num_line_names);
> > +

But I would be creating empty properties for nothing. Better to just
not have them at all.
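
The "only add a property when there is something to add" approach described above can be sketched in plain C. This is an illustration under assumptions, not the driver code: `struct demo_prop`, `struct demo_config`, `demo_fill_props()` and `DEMO_MAX_PROP` are hypothetical stand-ins for the kernel's property_entry machinery.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_PROP 4	/* hypothetical upper bound on entries */

/* Hypothetical, simplified property entry (name plus one value). */
struct demo_prop {
	const char *name;
	const char *str;
	unsigned int u32;
};

/* Hypothetical, trimmed-down chip configuration. */
struct demo_config {
	unsigned int num_lines;
	char label[32];
};

/*
 * Fill props[] from cfg and return the number of real entries.
 * Optional properties are skipped entirely when they are empty, and
 * the list is closed with an all-zero terminating entry.
 */
static size_t demo_fill_props(const struct demo_config *cfg,
			      struct demo_prop props[DEMO_MAX_PROP])
{
	size_t idx = 0;

	props[idx++] = (struct demo_prop){ .name = "gpio-sim,nr-gpios",
					   .u32 = cfg->num_lines };

	if (cfg->label[0] != '\0')
		props[idx++] = (struct demo_prop){ .name = "gpio-sim,label",
						   .str = cfg->label };

	assert(idx < DEMO_MAX_PROP);	/* room for the terminator */
	props[idx] = (struct demo_prop){ 0 };
	return idx;
}
```

With an empty label the resulting list holds a single entry, so a consumer never has to distinguish "property absent" from "property present but empty".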

[snip]

Bartosz

[RFC PATCH 0/7] iommu/amd: Add Generic IO Page Table Framework Support for v2 Page Table

This series introduces a new usage model for the v2 page table, where it
can be used to implement support for DMA-API by adopting the generic
IO page table framework.

One of the target usecases is to support nested IO page tables
where the guest uses the guest IO page table (v2) for translating
GVA to GPA, and the hypervisor uses the host I/O page table (v1) for
translating GPA to SPA. This is a pre-requisite for supporting the new
HW-assisted vIOMMU presented at the KVM Forum 2020.

  
https://static.sched.com/hosted_files/kvmforum2020/26/vIOMMU%20KVM%20Forum%202020.pdf

The following components are introduced in this series:

- Part 1 (patch 1-4 and 7)
  Refactor the current IOMMU page table v2 code
  to adopt the generic IO page table framework, and add
  AMD IOMMU Guest (v2) page table management code.

- Part 2 (patch 5)
  Add support for the AMD IOMMU Guest IO Protection feature (GIOV)
  where requests from the I/O device without a PASID are treated as
  if they have PASID of 0.

- Part 3 (patch 6)
  Introduce a new amd_iommu_pgtable command-line option to allow users
  to select the mode of operation (v1 or v2).

See AMD I/O Virtualization Technology Specification for more detail.

  http://www.amd.com/system/files/TechDocs/48882_IOMMU_3.05_PUB.pdf

Thanks,
Suravee

Suravee Suthikulpanit (7):
  iommu/amd: Refactor amd_iommu_domain_enable_v2
  iommu/amd: Update sanity check when enable PRI/ATS
  iommu/amd: Decouple the logic to enable PPR and GT
  iommu/amd: Initial support for AMD IOMMU v2 page table
  iommu/amd: Add support for Guest IO protection
  iommu/amd: Introduce amd_iommu_pgtable command-line option
  iommu/amd: Add support for using AMD IOMMU v2 page table for DMA-API

 .../admin-guide/kernel-parameters.txt |   6 +
 drivers/iommu/amd/Makefile|   2 +-
 drivers/iommu/amd/amd_iommu_types.h   |   5 +
 drivers/iommu/amd/init.c  |  42 ++-
 drivers/iommu/amd/io_pgtable_v2.c | 239 ++
 drivers/iommu/amd/iommu.c |  81 --
 drivers/iommu/io-pgtable.c|   1 +
 include/linux/io-pgtable.h|   2 +
 8 files changed, 345 insertions(+), 33 deletions(-)
 create mode 100644 drivers/iommu/amd/io_pgtable_v2.c

-- 
2.17.1

[RFC PATCH 3/7] iommu/amd: Decouple the logic to enable PPR and GT

Currently, the function to enable iommu v2 (GT) assumes the PPR log
must also be enabled. This is no longer the case since the IOMMU
v2 page table can be enabled without PPR support (for the DMA-API
use case).

Therefore, separate the enabling logic for PPR and GT.
There is no functional change.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/init.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9126efcbaf2c..5def566de6f6 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -898,14 +898,6 @@ static void iommu_enable_xt(struct amd_iommu *iommu)
 #endif /* CONFIG_IRQ_REMAP */
 }
 
-static void iommu_enable_gt(struct amd_iommu *iommu)
-{
-   if (!iommu_feature(iommu, FEATURE_GT))
-   return;
-
-   iommu_feature_enable(iommu, CONTROL_GT_EN);
-}
-
 /* sets a specific bit in the device table entry. */
 static void set_dev_entry_bit(u16 devid, u8 bit)
 {
@@ -1882,6 +1874,7 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
amd_iommu_max_glx_val = glxval;
else
amd_iommu_max_glx_val = min(amd_iommu_max_glx_val, 
glxval);
+   iommu_feature_enable(iommu, CONTROL_GT_EN);
}
 
if (iommu_feature(iommu, FEATURE_GT) &&
@@ -2530,21 +2523,19 @@ static void early_enable_iommus(void)
 #endif
 }
 
-static void enable_iommus_v2(void)
+static void enable_iommus_ppr(void)
 {
struct amd_iommu *iommu;
 
-   for_each_iommu(iommu) {
+   for_each_iommu(iommu)
iommu_enable_ppr_log(iommu);
-   iommu_enable_gt(iommu);
-   }
 }
 
 static void enable_iommus(void)
 {
early_enable_iommus();
 
-   enable_iommus_v2();
+   enable_iommus_ppr();
 }
 
 static void disable_iommus(void)
@@ -2935,7 +2926,7 @@ static int __init state_next(void)
register_syscore_ops(&amd_iommu_syscore_ops);
ret = amd_iommu_init_pci();
init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT;
-   enable_iommus_v2();
+   enable_iommus_ppr();
break;
case IOMMU_PCI_INIT:
ret = amd_iommu_enable_interrupts();
-- 
2.17.1

[RFC PATCH 5/7] iommu/amd: Add support for Guest IO protection

AMD IOMMU introduces support for Guest I/O protection where requests
from the I/O device without a PASID are treated as if they have PASID 0.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/amd_iommu_types.h | 3 +++
 drivers/iommu/amd/init.c| 8 
 drivers/iommu/amd/iommu.c   | 4 
 3 files changed, 15 insertions(+)

diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 25062eb86c8b..876ba1adf73e 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -93,6 +93,7 @@
 #define FEATURE_HE (1ULL<<8)
 #define FEATURE_PC (1ULL<<9)
 #define FEATURE_GAM_VAPIC  (1ULL<<21)
+#define FEATURE_GIOSUP (1ULL<<48)
 #define FEATURE_EPHSUP (1ULL<<50)
 #define FEATURE_SNP(1ULL<<63)
 
@@ -366,6 +367,7 @@
 #define DTE_FLAG_IW (1ULL << 62)
 
 #define DTE_FLAG_IOTLB (1ULL << 32)
+#define DTE_FLAG_GIOV  (1ULL << 54)
 #define DTE_FLAG_GV(1ULL << 55)
 #define DTE_FLAG_MASK  (0x3ffULL << 32)
 #define DTE_GLX_SHIFT  (56)
@@ -519,6 +521,7 @@ struct protection_domain {
spinlock_t lock;/* mostly used to lock the page table*/
u16 id; /* the domain id written to the device table */
int glx;/* Number of levels for GCR3 table */
+   bool giov;  /* guest IO protection domain */
u64 *gcr3_tbl;  /* Guest CR3 table */
unsigned long flags;/* flags to find out type of domain */
unsigned dev_cnt;   /* devices assigned to this domain */
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 5def566de6f6..9265c1bf1d84 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1895,6 +1895,12 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
 
init_iommu_perf_ctr(iommu);
 
+   if (amd_iommu_pgtable == AMD_IOMMU_V2 &&
+   !iommu_feature(iommu, FEATURE_GIOSUP)) {
+   pr_warn("Cannot enable v2 page table for DMA-API. Fallback to 
v1.\n");
+   amd_iommu_pgtable = AMD_IOMMU_V1;
+   }
+
if (is_rd890_iommu(iommu->dev)) {
int i, j;
 
@@ -1969,6 +1975,8 @@ static void print_iommu_info(void)
if (amd_iommu_xt_mode == IRQ_REMAP_X2APIC_MODE)
pr_info("X2APIC enabled\n");
}
+   if (amd_iommu_pgtable == AMD_IOMMU_V2)
+   pr_info("GIOV enabled\n");
 }
 
 static int __init amd_iommu_init_pci(void)
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index f3800efdbb29..e29ece6e1e68 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1405,6 +1405,10 @@ static void set_dte_entry(u16 devid, struct 
protection_domain *domain,
 
pte_root |= (domain->iop.mode & DEV_ENTRY_MODE_MASK)
<< DEV_ENTRY_MODE_SHIFT;
+
+   if (domain->giov && (domain->flags & PD_IOMMUV2_MASK))
+   pte_root |= DTE_FLAG_GIOV;
+
pte_root |= DTE_FLAG_IR | DTE_FLAG_IW | DTE_FLAG_V | DTE_FLAG_TV;
 
flags = amd_iommu_dev_table[devid].data[1];
-- 
2.17.1

[RFC PATCH 7/7] iommu/amd: Add support for using AMD IOMMU v2 page table for DMA-API

Introduce init function for setting up DMA domain for DMA-API with
the IOMMU v2 page table.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index e29ece6e1e68..bd26de8764bd 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1937,6 +1937,24 @@ static int protection_domain_init_v1(struct 
protection_domain *domain, int mode)
return 0;
 }
 
+static int protection_domain_init_v2(struct protection_domain *domain)
+{
+   spin_lock_init(&domain->lock);
+   domain->id = domain_id_alloc();
+   if (!domain->id)
+   return -ENOMEM;
+   INIT_LIST_HEAD(&domain->dev_list);
+
+   domain->giov = true;
+
+   if (amd_iommu_pgtable == AMD_IOMMU_V2 &&
+   domain_enable_v2(domain, 1, false)) {
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
 static struct protection_domain *protection_domain_alloc(unsigned int type)
 {
struct io_pgtable_ops *pgtbl_ops;
@@ -1964,6 +1982,9 @@ static struct protection_domain 
*protection_domain_alloc(unsigned int type)
case AMD_IOMMU_V1:
ret = protection_domain_init_v1(domain, mode);
break;
+   case AMD_IOMMU_V2:
+   ret = protection_domain_init_v2(domain);
+   break;
default:
ret = -EINVAL;
}
-- 
2.17.1

[RFC PATCH 6/7] iommu/amd: Introduce amd_iommu_pgtable command-line option

Allow specifying whether to use the v1 or v2 IOMMU page table for
DMA remapping when calling the kernel DMA-API.

Signed-off-by: Suravee Suthikulpanit 
---
 Documentation/admin-guide/kernel-parameters.txt |  6 ++
 drivers/iommu/amd/init.c| 15 +++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 04545725f187..466e807369ea 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -319,6 +319,12 @@
 This mode requires kvm-amd.avic=1.
 (Default when IOMMU HW support is present.)
 
+   amd_iommu_pgtable= [HW,X86-64]
+   Specifies one of the following AMD IOMMU page table to
+   be used for DMA remapping for DMA-API:
+   v1 - Use v1 page table (Default)
+   v2 - Use v2 page table
+
amijoy.map= [HW,JOY] Amiga joystick support
Map of devices attached to JOY0DAT and JOY1DAT
Format: ,
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9265c1bf1d84..6d5163bfb87e 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3123,6 +3123,20 @@ static int __init parse_amd_iommu_dump(char *str)
return 1;
 }
 
+static int __init parse_amd_iommu_pgtable(char *str)
+{
+   for (; *str; ++str) {
+   if (strncmp(str, "v1", 2) == 0) {
+   amd_iommu_pgtable = AMD_IOMMU_V1;
+   break;
+   } else if (strncmp(str, "v2", 2) == 0) {
+   amd_iommu_pgtable = AMD_IOMMU_V2;
+   break;
+   }
+   }
+   return 1;
+}
+
 static int __init parse_amd_iommu_intr(char *str)
 {
for (; *str; ++str) {
@@ -3246,6 +3260,7 @@ static int __init parse_ivrs_acpihid(char *str)
 
 __setup("amd_iommu_dump",  parse_amd_iommu_dump);
 __setup("amd_iommu=",  parse_amd_iommu_options);
+__setup("amd_iommu_pgtable=",  parse_amd_iommu_pgtable);
 __setup("amd_iommu_intr=", parse_amd_iommu_intr);
 __setup("ivrs_ioapic", parse_ivrs_ioapic);
 __setup("ivrs_hpet",   parse_ivrs_hpet);
-- 
2.17.1
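
The behaviour of the parser added above can be tried in user space. This is a sketch that mirrors the loop in parse_amd_iommu_pgtable(); `enum demo_pgtable` and `demo_parse_pgtable()` are hypothetical names, and returning the selected mode (instead of writing a global and returning 1) is a simplification made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for AMD_IOMMU_V1 / AMD_IOMMU_V2. */
enum demo_pgtable { DEMO_IOMMU_V1, DEMO_IOMMU_V2 };

/*
 * Scan str for "v1" or "v2" and return the matching mode. If neither
 * token is found, the current setting is returned unchanged.
 */
static enum demo_pgtable demo_parse_pgtable(const char *str,
					    enum demo_pgtable current)
{
	assert(str != NULL);

	for (; *str; ++str) {
		if (strncmp(str, "v1", 2) == 0)
			return DEMO_IOMMU_V1;
		if (strncmp(str, "v2", 2) == 0)
			return DEMO_IOMMU_V2;
	}
	return current;
}
```

Because the loop advances one character at a time, the token is accepted anywhere in the argument (for example "xxv2" selects v2), and unrecognised input silently keeps the previous setting.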

[RFC PATCH 4/7] iommu/amd: Initial support for AMD IOMMU v2 page table

Introduce IO page table framework support for AMD IOMMU v2 page table.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/Makefile  |   2 +-
 drivers/iommu/amd/amd_iommu_types.h |   2 +
 drivers/iommu/amd/io_pgtable_v2.c   | 239 
 drivers/iommu/io-pgtable.c  |   1 +
 include/linux/io-pgtable.h  |   2 +
 5 files changed, 245 insertions(+), 1 deletion(-)
 create mode 100644 drivers/iommu/amd/io_pgtable_v2.c

diff --git a/drivers/iommu/amd/Makefile b/drivers/iommu/amd/Makefile
index a935f8f4b974..773d8aa00283 100644
--- a/drivers/iommu/amd/Makefile
+++ b/drivers/iommu/amd/Makefile
@@ -1,4 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
-obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o
+obj-$(CONFIG_AMD_IOMMU) += iommu.o init.o quirks.o io_pgtable.o io_pgtable_v2.o
 obj-$(CONFIG_AMD_IOMMU_DEBUGFS) += debugfs.o
 obj-$(CONFIG_AMD_IOMMU_V2) += iommu_v2.o
diff --git a/drivers/iommu/amd/amd_iommu_types.h 
b/drivers/iommu/amd/amd_iommu_types.h
index 6937e3674a16..25062eb86c8b 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -265,6 +265,7 @@
  * 512GB Pages are not supported due to a hardware bug
  */
 #define AMD_IOMMU_PGSIZES  ((~0xFFFUL) & ~(2ULL << 38))
+#define AMD_IOMMU_PGSIZES_V2   (PAGE_SIZE | (1ULL << 12) | (1ULL << 30))
 
 /* Bit value definition for dte irq remapping fields*/
 #define DTE_IRQ_PHYS_ADDR_MASK (((1ULL << 45)-1) << 6)
@@ -503,6 +504,7 @@ struct amd_io_pgtable {
int mode;
u64 *root;
atomic64_t  pt_root;/* pgtable root and pgtable mode */
+   struct mm_structv2_mm;
 };
 
 /*
diff --git a/drivers/iommu/amd/io_pgtable_v2.c 
b/drivers/iommu/amd/io_pgtable_v2.c
new file mode 100644
index ..b0b6ba2d8d35
--- /dev/null
+++ b/drivers/iommu/amd/io_pgtable_v2.c
@@ -0,0 +1,239 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * CPU-agnostic AMD IO page table v2 allocator.
+ *
+ * Copyright (C) 2020 Advanced Micro Devices, Inc.
+ * Author: Suravee Suthikulpanit 
+ */
+
+#define pr_fmt(fmt) "AMD-Vi: " fmt
+#define dev_fmt(fmt)pr_fmt(fmt)
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include "amd_iommu_types.h"
+#include "amd_iommu.h"
+
+static pte_t *fetch_pte(struct amd_io_pgtable *pgtable,
+ unsigned long iova,
+ unsigned long *page_size)
+{
+   int level;
+   pte_t *ptep;
+
+   ptep = lookup_address_in_mm(&pgtable->v2_mm, iova, &level);
+   if (!ptep || pte_none(*ptep) || (level == PG_LEVEL_NONE))
+   return NULL;
+
+   *page_size = PTE_LEVEL_PAGE_SIZE(level-1);
+   return ptep;
+}
+
+static pte_t *v2_pte_alloc_map(struct mm_struct *mm, unsigned long vaddr)
+{
+   pgd_t *pgd;
+   p4d_t *p4d;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+
+   pgd = pgd_offset(mm, vaddr);
+   p4d = p4d_alloc(mm, pgd, vaddr);
+   if (!p4d)
+   return NULL;
+   pud = pud_alloc(mm, p4d, vaddr);
+   if (!pud)
+   return NULL;
+   pmd = pmd_alloc(mm, pud, vaddr);
+   if (!pmd)
+   return NULL;
+   pte = pte_alloc_map(mm, pmd, vaddr);
+   return pte;
+}
+
+static int iommu_v2_map_page(struct io_pgtable_ops *ops, unsigned long iova,
+ phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+   struct protection_domain *dom = io_pgtable_ops_to_domain(ops);
+   struct amd_io_pgtable *pgtable = io_pgtable_ops_to_data(ops);
+   pte_t *pte;
+   int ret, i, count;
+   bool updated = false;
+   unsigned long o_iova = iova;
+   unsigned long pte_pgsize;
+
+   BUG_ON(!IS_ALIGNED(iova, size) || !IS_ALIGNED(paddr, size));
+
+   ret = -EINVAL;
+   if (!(prot & IOMMU_PROT_MASK))
+   goto out;
+
+   count = PAGE_SIZE_PTE_COUNT(size);
+
+   for (i = 0; i < count; ++i, iova += PAGE_SIZE, paddr += PAGE_SIZE) {
+   pte = fetch_pte(pgtable, iova, &pte_pgsize);
+   if (!pte || pte_none(*pte)) {
+   pte = v2_pte_alloc_map(&dom->iop.v2_mm, iova);
+   if (!pte)
+   goto out;
+   } else {
+   updated = true;
+   }
+   set_pte(pte, __pte((paddr & 
PAGE_MASK)|_PAGE_PRESENT|_PAGE_USER));
+   if (prot & IOMMU_PROT_IW)
+   *pte = pte_mkwrite(*pte);
+   }
+
+   if (updated) {
+   if (count > 1)
+   amd_iommu_flush_tlb(&dom->domain, 0);
+   else
+   amd_iommu_flush_page(&dom->domain, 0, o_iova);
+   }
+
+   ret = 0;
+out:
+   return ret;
+}
+
+static unsigned long iommu_v2_unmap_page(struct io_pgtable_ops *ops,
+

[RFC PATCH 2/7] iommu/amd: Update sanity check when enable PRI/ATS

Currently, PPR/ATS can be enabled only if the domain is of type
identity mapping. However, when we allow the IOMMU v2 page table
to be used for DMA-API, the sanity check needs to be updated to
apply only when using the AMD_IOMMU_V1 page table mode.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 6f3e42495709..f3800efdbb29 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1549,7 +1549,7 @@ static int pri_reset_while_enabled(struct pci_dev *pdev)
return 0;
 }
 
-static int pdev_iommuv2_enable(struct pci_dev *pdev)
+static int pdev_pri_ats_enable(struct pci_dev *pdev)
 {
bool reset_enable;
int reqs, ret;
@@ -1624,11 +1624,19 @@ static int attach_device(struct device *dev,
struct iommu_domain *def_domain = iommu_get_dma_domain(dev);
 
ret = -EINVAL;
-   if (def_domain->type != IOMMU_DOMAIN_IDENTITY)
+
+   /*
+* In case of using AMD_IOMMU_V1 page table mode, and the device
+* is enabling for PPR/ATS support (using v2 table),
+* we need to make sure that the domain type is identity map.
+*/
+   if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
+   def_domain->type != IOMMU_DOMAIN_IDENTITY) {
goto out;
+   }
 
if (dev_data->iommu_v2) {
-   if (pdev_iommuv2_enable(pdev) != 0)
+   if (pdev_pri_ats_enable(pdev) != 0)
goto out;
 
dev_data->ats.enabled = true;
-- 
2.17.1

[RFC PATCH 1/7] iommu/amd: Refactor amd_iommu_domain_enable_v2

The current function to enable IOMMU v2 also locks the domain.
In order to reuse the same code in a different code path, in which
the domain has already been locked, refactor the function to separate
the locking from the enabling logic.

Signed-off-by: Suravee Suthikulpanit 
---
 drivers/iommu/amd/iommu.c | 42 +--
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index a69a8b573e40..6f3e42495709 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -88,6 +88,7 @@ struct iommu_cmd {
 struct kmem_cache *amd_iommu_irq_cache;
 
 static void detach_device(struct device *dev);
+static int domain_enable_v2(struct protection_domain *domain, int pasids, bool 
has_ppr);
 
 /
  *
@@ -2304,10 +2305,9 @@ void amd_iommu_domain_direct_map(struct iommu_domain 
*dom)
 }
 EXPORT_SYMBOL(amd_iommu_domain_direct_map);
 
-int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
+/* Note: This function expects iommu_domain->lock to be held prior calling the 
function. */
+static int domain_enable_v2(struct protection_domain *domain, int pasids, bool 
has_ppr)
 {
-   struct protection_domain *domain = to_pdomain(dom);
-   unsigned long flags;
int levels, ret;
 
if (pasids <= 0 || pasids > (PASID_MASK + 1))
@@ -2320,17 +2320,6 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, 
int pasids)
if (levels > amd_iommu_max_glx_val)
return -EINVAL;
 
-   spin_lock_irqsave(&domain->lock, flags);
-
-   /*
-* Save us all sanity checks whether devices already in the
-* domain support IOMMUv2. Just force that the domain has no
-* devices attached when it is switched into IOMMUv2 mode.
-*/
-   ret = -EBUSY;
-   if (domain->dev_cnt > 0 || domain->flags & PD_IOMMUV2_MASK)
-   goto out;
-
ret = -ENOMEM;
domain->gcr3_tbl = (void *)get_zeroed_page(GFP_ATOMIC);
if (domain->gcr3_tbl == NULL)
@@ -2344,8 +2333,31 @@ int amd_iommu_domain_enable_v2(struct iommu_domain *dom, 
int pasids)
ret = 0;
 
 out:
-   spin_unlock_irqrestore(&domain->lock, flags);
+   return ret;
+}
 
+int amd_iommu_domain_enable_v2(struct iommu_domain *dom, int pasids)
+{
+   int ret;
+   unsigned long flags;
+   struct protection_domain *pdom = to_pdomain(dom);
+
+   spin_lock_irqsave(&pdom->lock, flags);
+
+   /*
+* Save us all sanity checks whether devices already in the
+* domain support IOMMUv2. Just force that the domain has no
+* devices attached when it is switched into IOMMUv2 mode.
+*/
+   ret = -EBUSY;
+   if (pdom->dev_cnt > 0 || pdom->flags & PD_IOMMUV2_MASK)
+   goto out;
+
+   if (pdom->dev_cnt == 0 && !(pdom->gcr3_tbl))
+   ret = domain_enable_v2(pdom, pasids, true);
+
+out:
+   spin_unlock_irqrestore(&pdom->lock, flags);
return ret;
 }
 EXPORT_SYMBOL(amd_iommu_domain_enable_v2);
-- 
2.17.1

Re: [PATCH 2/2] crypto: qat: ADF_STATUS_PF_RUNNING should be set after adf_dev_init

2021-03-12 Thread Andy Shevchenko

On Fri, Mar 12, 2021 at 9:50 AM Tong Zhang  wrote:
>
> ADF_STATUS_PF_RUNNING is (only) used and checked by adf_vf2pf_shutdown()
> before calling adf_iov_putmsg()->mutex_lock(vf2pf_lock). However, the
> vf2pf_lock is initialized in adf_dev_init(), which can fail; when it
> fails, the vf2pf_lock is either not initialized or destroyed, and a
> subsequent use of vf2pf_lock will cause an issue.
> To fix this issue, only set this flag if adf_dev_init() returns 0.

Makes sense, but please leave only ~2-3 (significant) lines from the
below noisy dump
Reviewed-by: Andy Shevchenko 
(after making commit message neat)
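
The ordering described in the commit message can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the QAT driver: every `demo_*` name and `DEMO_STATUS_RUNNING` are hypothetical, and a boolean stands in for "the mutex has been initialised".

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_STATUS_RUNNING (1u << 0)	/* hypothetical status bit */

struct demo_dev {
	unsigned int status;
	bool lock_ready;	/* stands in for an initialised mutex */
};

/* Initialise the device; on failure the lock is never set up. */
static int demo_dev_init(struct demo_dev *dev, bool fail)
{
	if (fail)
		return -1;
	dev->lock_ready = true;
	return 0;
}

/* Set the running flag only after init has succeeded. */
static int demo_probe(struct demo_dev *dev, bool init_fails)
{
	int ret = demo_dev_init(dev, init_fails);

	if (ret)
		return ret;

	dev->status |= DEMO_STATUS_RUNNING;
	return 0;
}

/* Returns true if the shutdown path went on to use the lock. */
static bool demo_shutdown(struct demo_dev *dev)
{
	if (!(dev->status & DEMO_STATUS_RUNNING))
		return false;

	assert(dev->lock_ready);	/* safe: flag implies init succeeded */
	return true;
}
```

If the flag were set before demo_dev_init() ran, a failed init followed by demo_shutdown() would reach the lock while it is still uninitialised, which is the situation the report describes.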

> [7.178008] 
> ==
> [7.178404] BUG: KASAN: user-memory-access in 
> __mutex_lock.isra.0+0x1ac/0x7c0
> [7.178800] Read of size 4 at addr f434 by task modprobe/96
> [7.179169]
> [7.179257] CPU: 0 PID: 96 Comm: modprobe Not tainted 
> 5.12.0-rc2-00338-gf78d76e72a46-dirty #86
> [7.179730] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> rel-1.13.0-48-gd9c812dda519-4
> [7.180345] Call Trace:
> [7.180485]  dump_stack+0x8a/0xb5
> [7.180676]  kasan_report.cold+0x10f/0x111
> [7.180907]  ? __mutex_lock.isra.0+0x1ac/0x7c0
> [7.181156]  __mutex_lock.isra.0+0x1ac/0x7c0
> [7.181394]  ? finish_task_switch+0x12f/0x360
> [7.181640]  ? __switch_to+0x339/0x6b0
> [7.181852]  ? ww_mutex_lock_interruptible+0x150/0x150
> [7.182138]  ? __sched_text_start+0x8/0x8
> [7.182363]  ? vprintk_emit+0x91/0x170
> [7.182576]  mutex_lock+0xc9/0xd0
> [7.182765]  ? __mutex_lock_slowpath+0x10/0x10
> [7.183016]  ? swsusp_write.cold+0x208/0x208
> [7.183257]  adf_iov_putmsg+0x118/0x1a0 [intel_qat]
> [7.183541]  adf_vf2pf_shutdown+0x4d/0x7b [intel_qat]
> [7.183834]  adf_dev_shutdown+0x172/0x2b0 [intel_qat]
> [7.184127]  adf_probe+0x5e9/0x600 [qat_dh895xccvf]
> [7.184403]  ? adf_remove+0x70/0x70 [qat_dh895xccvf]
> [7.184681]  ? _raw_spin_lock_irqsave+0x7b/0xd0
> [7.184936]  ? _raw_spin_unlock_irqrestore+0xd/0x20
> [7.185209]  ? adf_remove+0x70/0x70 [qat_dh895xccvf]
> [7.185489]  local_pci_probe+0x6f/0xb0
> [7.185702]  pci_device_probe+0x1e9/0x2f0
> [7.185928]  ? pci_device_remove+0xf0/0xf0
> [7.186159]  ? sysfs_do_create_link_sd.isra.0+0x76/0xe0
> [7.186458]  really_probe+0x161/0x420
> [7.186665]  driver_probe_device+0x6d/0xd0
> [7.186894]  device_driver_attach+0x82/0x90
> [7.187131]  ? device_driver_attach+0x90/0x90
> [7.187375]  __driver_attach+0x60/0x100
> [7.187591]  ? device_driver_attach+0x90/0x90
> [7.187835]  bus_for_each_dev+0xe1/0x140
> [7.188057]  ? subsys_dev_iter_exit+0x10/0x10
> [7.188302]  ? klist_node_init+0x61/0x80
> [7.188524]  bus_add_driver+0x254/0x2a0
> [7.188740]  driver_register+0xd3/0x150
> [7.188956]  ? 0xc005
> [7.189143]  adfdrv_init+0x2b/0x1000 [qat_dh895xccvf]
> [7.189427]  do_one_initcall+0x84/0x250
> [7.189644]  ? trace_event_raw_event_initcall_finish+0x150/0x150
> [7.189977]  ? _raw_spin_unlock_irqrestore+0xd/0x20
> [7.190250]  ? create_object+0x395/0x510
> [7.190472]  ? kasan_unpoison+0x21/0x50
> [7.190689]  do_init_module+0xf8/0x350
> [7.190901]  load_module+0x40c5/0x4410
> [7.191121]  ? module_frob_arch_sections+0x20/0x20
> [7.191390]  ? kernel_read_file+0x1cd/0x3e0
> [7.191626]  ? __do_sys_finit_module+0x108/0x170
> [7.191885]  __do_sys_finit_module+0x108/0x170
> [7.192134]  ? __ia32_sys_init_module+0x40/0x40
> [7.192389]  ? file_open_root+0x200/0x200
> [7.192615]  ? do_sys_open+0x85/0xe0
> [7.192817]  ? filp_open+0x50/0x50
> [7.193010]  ? exit_to_user_mode_prepare+0xfc/0x130
> [7.193283]  do_syscall_64+0x33/0x40
> [7.193486]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [7.193770] RIP: 0033:0x7f6ad21b4cf7
> [7.193971] Code: 48 89 57 30 48 8b 04 24 48 89 47 38 e9 1d a0 02 00 48 89 
> f8 48 89 f7 48 89 d6 41
> [7.194991] RSP: 002b:7ffe2a5d9028 EFLAGS: 0246 ORIG_RAX: 
> 0139
> [7.195408] RAX: ffda RBX: 00882a70 RCX: 
> 7f6ad21b4cf7
> [7.195801] RDX:  RSI: 008819e0 RDI: 
> 0003
> [7.196193] RBP: 0003 R08:  R09: 
> 0001
> [7.196588] R10: 7f6ad2218300 R11: 0246 R12: 
> 008819e0
> [7.196979] R13:  R14: 00881dd0 R15: 
> 0001
> [7.197372] 
> ==
>
> Signed-off-by: Tong Zhang 
> ---
>  drivers/crypto/qat/qat_c3xxxvf/adf_drv.c| 4 ++--
>  drivers/crypto/qat/qat_c62xvf/adf_drv.c | 4 ++--
>  drivers/crypto/qat/qat_dh895xccvf/adf_drv.c | 4 ++--
>  3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/crypto/qat/qat_c3xxxvf/adf_drv.c 
> b/drivers/crypto/qat/qat_c3xxxvf/adf_drv.c
> index 1d1

Re: [PATCH v7] i2c: virtio: add a virtio i2c frontend driver

2021-03-12 Thread Arnd Bergmann

On Fri, Mar 12, 2021 at 2:33 PM Jie Deng  wrote:

> +
> +/**
> + * struct virtio_i2c_req - the virtio I2C request structure
> + * @out_hdr: the OUT header of the virtio I2C message
> + * @buf: the buffer into which data is read, or from which it's written
> + * @in_hdr: the IN header of the virtio I2C message
> + */
> +struct virtio_i2c_req {
> +   struct virtio_i2c_out_hdr out_hdr;
> +   uint8_t *buf;
> +   struct virtio_i2c_in_hdr in_hdr;
> +};

The simpler request structure clearly looks better than the previous version,
but I think I found another problem here, at least a theoretical one:

When you map the headers into the DMA address space, they should
be in separate cache lines, to allow the DMA mapping interfaces to
perform cache management on each one without accidentally clobbering
another member.

So far I think there is an assumption that virtio buffers are always
on cache-coherent devices, but if you ever have a virtio-i2c device
backend on a physical interconnect that is not cache coherent (e.g. a
microcontroller that shares the memory bus), this breaks down.

You could avoid this by either allocating arrays of each type separately,
or by marking each member that you pass to the device as
cacheline_aligned.
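
The second option can be illustrated outside the kernel. The sketch below is a userspace C11 model, not driver code: it assumes a 64-byte cache line, uses `alignas` in place of the kernel's cache-line alignment attribute, and every `demo_*` name and header field is a placeholder invented for the illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size, varies by platform */

/* Placeholder headers; the field layout is hypothetical. */
struct demo_out_hdr {
    uint16_t addr;
    uint16_t padding;
    uint32_t flags;
};

struct demo_in_hdr {
    uint8_t status;
};

/* Each member starts on its own cache line, so cache maintenance
 * on one member cannot touch bytes belonging to another. */
struct demo_i2c_req {
    alignas(DEMO_CACHE_LINE) struct demo_out_hdr out_hdr;
    alignas(DEMO_CACHE_LINE) uint8_t *buf;
    alignas(DEMO_CACHE_LINE) struct demo_in_hdr in_hdr;
};

/* Index of the cache line (relative to the struct start) holding an offset. */
static size_t demo_cache_line_of(size_t offset)
{
    return offset / DEMO_CACHE_LINE;
}

/* Compile-time check that the headers start on cache-line boundaries. */
static_assert(offsetof(struct demo_i2c_req, out_hdr) % DEMO_CACHE_LINE == 0,
              "out_hdr must start on a cache line");
static_assert(offsetof(struct demo_i2c_req, in_hdr) % DEMO_CACHE_LINE == 0,
              "in_hdr must start on a cache line");
```

The cost is padding: under these assumptions the structure grows to three cache lines, which is the trade-off against allocating separate arrays per type.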

  Arnd

[PATCH] ata: Trivial spelling fixes in the file pata_ns87415.c

2021-03-12 Thread Bhaskar Chowdhury



Trivial spelling fixes.

Signed-off-by: Bhaskar Chowdhury 
---
 drivers/ata/pata_ns87415.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/pata_ns87415.c b/drivers/ata/pata_ns87415.c
index 1532b2e3c672..f4949e704356 100644
--- a/drivers/ata/pata_ns87415.c
+++ b/drivers/ata/pata_ns87415.c
@@ -113,7 +113,7 @@ static void ns87415_set_piomode(struct ata_port *ap, struct 
ata_device *adev)
  * ns87415_bmdma_setup -   Set up DMA
  * @qc: Command block
  *
- * Set up for bus masterng DMA. We have to do this ourselves
+ * Set up for bus mastering DMA. We have to do this ourselves
  * rather than use the helper due to a chip erratum
  */

@@ -174,7 +174,7 @@ static void ns87415_bmdma_stop(struct ata_queued_cmd *qc)
  * ns87415_irq_clear   -   Clear interrupt
  * @ap: Channel to clear
  *
- * Erratum: Due to a chip bug regisers 02 and 0A bit 1 and 2 (the
+ * Erratum: Due to a chip bug registers 02 and 0A bit 1 and 2 (the
  * error bits) are reset by writing to register 00 or 08.
  */

--
2.26.2

Re: [PATCH v3 2/3] mm: disable LRU pagevec during the migration temporarily


On 10.03.21 17:14, Minchan Kim wrote:

LRU pagevec holds refcount of pages until the pagevec are drained.
It could prevent migration since the refcount of the page is greater
than the expectation in migration logic. To mitigate the issue,
callers of migrate_pages drain LRU pagevec via migrate_prep or
lru_add_drain_all before migrate_pages call.

However, it's not enough because pages coming into pagevec after the
draining call still could stay at the pagevec so it could keep
preventing page migration. Since some callers of migrate_pages have
retrial logic with LRU draining, the page would migrate at next trial
but it is still fragile in that it doesn't close the fundamental race
between upcoming LRU pages into pagevec and migration so the migration
failure could cause contiguous memory allocation failure in the end.

To close the race, this patch disables lru caches (i.e., pagevec)
during ongoing migration until migration is done.
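
The idea can be modeled in userspace. The sketch below is a simplified, single-threaded illustration and not the kernel implementation: a disable counter forces every newly added page straight through to the LRU list instead of letting it sit in the batch. All `demo_*` names and the batch size are assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BATCH_SIZE 15 /* hypothetical batch capacity */

struct demo_lru {
    atomic_int disable_count;   /* > 0 means batching is disabled */
    int batch[DEMO_BATCH_SIZE]; /* pages parked in the batch (extra ref held) */
    size_t batched;             /* number of pages currently in the batch */
    size_t on_lru;              /* pages already moved to the LRU list */
};

static bool demo_lru_cache_disabled(struct demo_lru *lru)
{
    return atomic_load(&lru->disable_count) > 0;
}

/* Move everything from the batch to the LRU list. */
static void demo_lru_drain(struct demo_lru *lru)
{
    lru->on_lru += lru->batched;
    lru->batched = 0;
}

/* Disable batching first, then drain, so nothing can linger afterwards. */
static void demo_lru_cache_disable(struct demo_lru *lru)
{
    atomic_fetch_add(&lru->disable_count, 1);
    demo_lru_drain(lru);
}

static void demo_lru_cache_enable(struct demo_lru *lru)
{
    atomic_fetch_sub(&lru->disable_count, 1);
}

/* Add a page; flush immediately when the batch is full or caching is off. */
static void demo_lru_cache_add(struct demo_lru *lru, int page)
{
    lru->batch[lru->batched++] = page;
    if (lru->batched == DEMO_BATCH_SIZE || demo_lru_cache_disabled(lru))
        demo_lru_drain(lru);
}
```

Using a counter rather than a flag lets disable/enable pairs nest, which matters if more than one caller wants the cache off at the same time.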

Since it's really hard to reproduce, I measured how many times
migrate_pages retried with force mode(it is about a fallback to a
sync migration) with below debug code.

int migrate_pages(struct list_head *from, new_page_t get_new_page,
..
..

	if (rc && reason == MR_CONTIG_RANGE && pass > 2) {
		printk(KERN_ERR "pfn 0x%lx reason %d\n", page_to_pfn(page), rc);
		dump_page(page, "fail to migrate");
	}

The test was repeating android apps launching with cma allocation
in background every five seconds. Total cma allocation count was
about 500 during the testing. With this patch, the dump_page count
was reduced from 400 to 30.

The new interface is also useful for memory hotplug which currently
drains lru pcp caches after each migration failure. This is rather
suboptimal as it has to disrupt others running during the operation.
With the new interface the operation happens only once. This is also in
line with pcp allocator cache which are disabled for the offlining as
well.

Signed-off-by: Minchan Kim 
---
  include/linux/swap.h |  3 ++
  mm/memory_hotplug.c  |  3 +-
  mm/mempolicy.c   |  4 ++-
  mm/migrate.c |  3 +-
  mm/swap.c| 79 
  5 files changed, 75 insertions(+), 17 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 32f665b1ee85..a3e258335a7f 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -339,6 +339,9 @@ extern void lru_note_cost(struct lruvec *lruvec, bool file,
  extern void lru_note_cost_page(struct page *);
  extern void lru_cache_add(struct page *);
  extern void mark_page_accessed(struct page *);
+extern void lru_cache_disable(void);
+extern void lru_cache_enable(void);
+extern bool lru_cache_disabled(void);
  extern void lru_add_drain(void);
  extern void lru_add_drain_cpu(int cpu);
  extern void lru_add_drain_cpu_zone(struct zone *zone);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 5ba51a8bdaeb..959f659ef085 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1611,6 +1611,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned 
long nr_pages)
 * in a way that pages from isolated pageblock are left on pcplists.
 */
zone_pcp_disable(zone);
+   lru_cache_disable();


Did you also experiment with the effects zone_pcp_disable() might have on
alloc_contig_range()?

Feels like both calls could be abstracted somehow and used in both
(memory offlining/alloc_contig_range) cases. It's essentially disabling
some kind of caching.

Looks sane to me, but I am not experienced enough with the migration code to
give this a real RB.


--
Thanks,

David / dhildenb

Re: [PATCH v2 2/2] usb: typec: tcpci_maxim: configure charging & data paths

2021-03-12 Thread Heikki Krogerus

On Thu, Mar 11, 2021 at 09:24:43PM -0800, Badhri Jagan Sridharan wrote:
> The change exposes the data_role and the orientation as a extcon
> interface for configuring the USB data controller.
> 
> Signed-off-by: Badhri Jagan Sridharan 
> ---
> Changes since V1:
> - Dropped changes related to get_/set_current_limit and pd_capable
>   callback. Will send them in as separate patches.
> ---
>  drivers/usb/typec/tcpm/tcpci_maxim.c | 56 
>  1 file changed, 56 insertions(+)
> 
> diff --git a/drivers/usb/typec/tcpm/tcpci_maxim.c 
> b/drivers/usb/typec/tcpm/tcpci_maxim.c
> index 041a1c393594..1210445713ee 100644
> --- a/drivers/usb/typec/tcpm/tcpci_maxim.c
> +++ b/drivers/usb/typec/tcpm/tcpci_maxim.c
> @@ -7,6 +7,8 @@
>  
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -46,6 +48,8 @@ struct max_tcpci_chip {
>   struct device *dev;
>   struct i2c_client *client;
>   struct tcpm_port *port;
> + bool attached;
> + struct extcon_dev *extcon;
>  };
>  
>  static const struct regmap_range max_tcpci_tcpci_range[] = {
> @@ -439,6 +443,39 @@ static int tcpci_init(struct tcpci *tcpci, struct 
> tcpci_data *data)
>   return -1;
>  }
>  
> +static void max_tcpci_set_roles(struct tcpci *tcpci, struct tcpci_data 
> *data, bool attached,
> + enum typec_role role, enum typec_data_role 
> data_role)
> +{
> + struct max_tcpci_chip *chip = tdata_to_max_tcpci(data);
> +
> + chip->attached = attached;
> +
> + if (!attached) {
> + extcon_set_state_sync(chip->extcon, EXTCON_USB_HOST, 0);
> + extcon_set_state_sync(chip->extcon, EXTCON_USB, 0);
> + return;
> + }
> +
> + extcon_set_state_sync(chip->extcon, data_role == TYPEC_HOST ? 
> EXTCON_USB_HOST : EXTCON_USB,
> +   1);
> +}
> +
> +static void max_tcpci_set_cc_polarity(struct tcpci *tcpci, struct tcpci_data 
> *data,
> +   enum typec_cc_polarity polarity)
> +{
> + struct max_tcpci_chip *chip = tdata_to_max_tcpci(data);
> +
> + extcon_set_property(chip->extcon, EXTCON_USB, 
> EXTCON_PROP_USB_TYPEC_POLARITY,
> + (union extcon_property_value)(int)polarity);
> + extcon_set_property(chip->extcon, EXTCON_USB_HOST, 
> EXTCON_PROP_USB_TYPEC_POLARITY,
> + (union extcon_property_value)(int)polarity);
> +}
> +
> +static const unsigned int usbpd_extcon[] = {
> + EXTCON_USB,
> + EXTCON_USB_HOST,
> +};
> +
>  static int max_tcpci_probe(struct i2c_client *client, const struct 
> i2c_device_id *i2c_id)
>  {
>   int ret;
> @@ -472,6 +509,8 @@ static int max_tcpci_probe(struct i2c_client *client, 
> const struct i2c_device_id
>   chip->data.auto_discharge_disconnect = true;
>   chip->data.vbus_vsafe0v = true;
>   chip->data.set_partner_usb_comm_capable = 
> max_tcpci_set_partner_usb_comm_capable;
> + chip->data.set_roles = max_tcpci_set_roles;
> + chip->data.set_cc_polarity = max_tcpci_set_cc_polarity;
>  
>   max_tcpci_init_regs(chip);
>   chip->tcpci = tcpci_register_port(chip->dev, &chip->data);
> @@ -484,6 +523,23 @@ static int max_tcpci_probe(struct i2c_client *client, 
> const struct i2c_device_id
>   if (ret < 0)
>   goto unreg_port;
>  
> + chip->extcon = devm_extcon_dev_allocate(&client->dev, usbpd_extcon);
> + if (IS_ERR(chip->extcon)) {
> + dev_err(&client->dev, "Error allocating extcon: %ld\n", 
> PTR_ERR(chip->extcon));
> + ret = PTR_ERR(chip->extcon);
> + goto unreg_port;
> + }
> +
> + ret = devm_extcon_dev_register(&client->dev, chip->extcon);
> + if (ret < 0) {
> + dev_err(&client->dev, "failed to register extcon device");
> + goto unreg_port;
> + }

Why do you need this? We have the dedicated USB role class because
extcon could not handle every type of system. Things are simple enough
when you have a single dual-role capable USB controller, but when you
start having more bits and pieces like muxes in between, the
consumer/supplier extcon roles get twisted.

So in case you did not know this, our goal was originally to use
extcon for handling the data role (and orientation too), but some of
the drivers were rejected by the extcon maintainers for the above
reason.

Most USB controller drivers for dual-role capable USB controllers
already register a role switch, and tcpm.c always requests a handle to
one, which it uses to report the current data role, so this part should
not require any new code.


> + extcon_set_property_capability(chip->extcon, EXTCON_USB, 
> EXTCON_PROP_USB_TYPEC_POLARITY);
> + extcon_set_property_capability(chip->extcon, EXTCON_USB_HOST,
> +EXTCON_PROP_USB_TYPEC_POLARITY);
> +
>   device_init_wakeup(chip->dev, true);
>   return 0;
>  
> -- 
> 2.31.0.rc2.261.g7f71774620-goog

thanks,

--

Re: 'make O=' indigestion with module signing

2021-03-12 Thread David Howells

Valdis Klētnieks  wrote:

> So the root cause was: 'make mrproper doesn't clean certs/' out enough,
> and this chunk of certs/Makefile
> ...
> I admit not being sure how (or if) this should be fixed

It's tricky because CONFIG_MODULE_SIG_KEY may not point to a file, let alone a
file that was autogenerated - it can be given a PKCS#11 URI, for instance.  I
had to put in the autogeneration based on a magic config string value to stop
randconfig blowing up - but it only does the autogeneration if you don't put
in your own file there before building.

Possibly I can add something like:

clean-files := signing_key.pem x509.genkey

inside the

ifeq ($(CONFIG_MODULE_SIG_KEY),"certs/signing_key.pem")
...
endif

section.

David

RE: [PATCH v2 02/10] fsdax: Factor helper: dax_fault_actor()

2021-03-12 Thread ruansy.f...@fujitsu.com



> > +   /* if we are reading UNWRITTEN and HOLE, return a hole. */
> > +   if (!write &&
> > +   (iomap->type == IOMAP_UNWRITTEN || iomap->type == IOMAP_HOLE)) {
> > +   if (!pmd)
> > +   return dax_load_hole(xas, mapping, &entry, vmf);
> > +   else
> > +   return dax_pmd_load_hole(xas, vmf, iomap, &entry);
> > +   }
> > +
> > +   if (iomap->type != IOMAP_MAPPED) {
> > +   WARN_ON_ONCE(1);
> > +   return VM_FAULT_SIGBUS;
> > +   }
> 
> Nit: I'd use a switch statement here for a clarity:
> 
>   switch (iomap->type) {
>   case IOMAP_MAPPED:
>   break;
>   case IOMAP_UNWRITTEN:
>   case IOMAP_HOLE:
>   if (!write) {
>   if (!pmd)
>   return dax_load_hole(xas, mapping, &entry, vmf);
>   return dax_pmd_load_hole(xas, vmf, iomap, &entry);
>   }
>   break;
>   default:
>   WARN_ON_ONCE(1);
>   return VM_FAULT_SIGBUS;
>   }
> 
Hi, Christoph

I did not use a switch-case here because I would still have to introduce a
'goto' for CoW (writing on IOMAP_UNWRITTEN with two different iomaps indicates
that it is a CoW operation, which then has to go to the IOMAP_MAPPED branch to
do the data copy and pfn insertion).  You said the 'goto' makes the code
convoluted, so I avoided it and refactored this part into a series of if-else
branches, which looks similar to dax_iomap_actor().  So, what's your opinion now?


--
Thanks,
Ruan Shiyang.

> 
> > +   err = dax_iomap_pfn(iomap, pos, size, &pfn);
> > +   if (err)
> > +   goto error_fault;
> > +
> > +   entry = dax_insert_entry(xas, mapping, vmf, entry, pfn, 0,
> > +write && !sync);
> > +
> > +   if (sync)
> > +   return dax_fault_synchronous_pfnp(pfnp, pfn);
> > +
> > +   ret = dax_fault_insert_pfn(vmf, pfn, pmd, write);
> > +
> > +error_fault:
> > +   if (err)
> > +   ret = dax_fault_return(err);
> > +
> > +   return ret;
> 
> It seems like the only place that sets err is the dax_iomap_pfn case above.
> So I'd move the dax_fault_return there, which then allows a direct return for
> everyone else, including the open coded version of dax_fault_insert_pfn.
> 
> I really like where this is going!

[PATCH v2] dt-bindings: display: sitronix,st7789v-dbi: Add Waveshare 2inch LCD module

2021-03-12 Thread Carlis

From: "Carlis" 

Document support for the Waveshare 2inch LCD module display, which is a
240x320 2" TFT display driven by a Sitronix ST7789V TFT Controller.

Signed-off-by: Carlis 
---
v2:change compatible name.
---
 .../display/sitronix,st7789v-dbi.yaml | 72 +++
 1 file changed, 72 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml

diff --git 
a/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml 
b/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml
new file mode 100644
index ..6abf82966230
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/sitronix,st7789v-dbi.yaml
@@ -0,0 +1,72 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/sitronix,st7789v-dbi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Sitronix ST7789V Display Panels Device Tree Bindings
+
+maintainers:
+  - Carlis 
+
+description:
+  This binding is for display panels using a Sitronix ST7789V
+  controller in SPI mode.
+
+allOf:
+  - $ref: panel/panel-common.yaml#
+
+properties:
+  compatible:
+    oneOf:
+      - description:
+          Waveshare 2" 240x320 Color TFT LCD
+        items:
+          - enum:
+              - waveshare,ws2inch
+          - const: sitronix,st7789v-dbi
+
+  spi-max-frequency:
+    maximum: 3200
+
+  dc-gpios:
+    maxItems: 1
+    description: Display data/command selection (D/CX)
+
+  backlight: true
+  reg: true
+  reset-gpios: true
+  rotation: true
+
+required:
+  - compatible
+  - reg
+  - dc-gpios
+  - reset-gpios
+
+additionalProperties: false
+
+examples:
+  - |
+    #include 
+
+    backlight: backlight {
+        compatible = "gpio-backlight";
+        gpios = <&gpio 18 GPIO_ACTIVE_HIGH>;
+    };
+
+    spi {
+        #address-cells = <1>;
+        #size-cells = <0>;
+
+        display@0{
+            compatible = "waveshare,ws2inch", "sitronix,st7789v-dbi";
+            reg = <0>;
+            spi-max-frequency = <3200>;
+            dc-gpios = <&gpio 25 GPIO_ACTIVE_HIGH>;
+            reset-gpios = <&gpio 27 GPIO_ACTIVE_HIGH>;
+            rotation = <270>;
+        };
+    };
+
+...
-- 
2.25.1

Re: [PATCH] leds: leds-dual-gpio: Add dual GPIO LEDs driver

2021-03-12 Thread Marek Behun

On Fri, 12 Mar 2021 08:48:55 +
Hermes Zhang  wrote:

> Hi Alexander,
> 
> > On Thursday, 11 March 2021, 14:04:08 CET, Hermes Zhang wrote:
> > > From: Hermes Zhang 
> > >
> > > Introduce a new Dual GPIO LED driver. These two GPIOs LED will act as
> > > one LED as normal GPIO LED but give the possibility to change the
> > > intensity in four levels: OFF, LOW, MIDDLE and HIGH.  
> > 
> > Interesting use case. Is there any real world hardware wired like that you
> > could point to?
> >   
> 
> Yes, we have the HW; it's not a chip, just some circuitry.
>  
> > > +config LEDS_DUAL_GPIO
> > > + tristate "LED Support for Dual GPIO connected LEDs"
> > > + depends on LEDS_CLASS
> > > + depends on GPIOLIB || COMPILE_TEST
> > > + help
> > > +   This option enables support for the two LEDs connected to GPIO
> > > +   outputs. These two GPIO LEDs act as one LED in the sysfs and
> > > +   perform different intensity by enable either one of them or both.  
> > 
> > Well, although I never had time to implement that, I suspect that could
> > conflict if someone will eventually write a driver for two pin dual color
> > LEDs connected to GPIO pins.  We actually do that on our hardware and I
> > know others do, too.
> > 
> > I asked about that back in 2019, see this thread:
> > 
> > https://www.spinics.net/lists/linux-leds/msg11665.html
> > 
> > At the time the multicolor framework was not yet merged, so today I would
> > probably make something which either uses the multicolor framework or at
> > least has a similar interface to userspace. However, it probably won't
> > surprise you all, this is not highest priority on my ToDo list. ;-)
> > 
> > (What we actually do is pretend those are separate LEDs and ignore the
> > conflicting case where both GPIOs are on and the LED is dark then.)
> >   
> 
> Yes, that case seems to conflict with mine; the pattern for me is:
> 
> P1 | P2 | LED
> -- + -- + -
>  0 |  0 | off
>  0 |  1 | Any color
>  1 |  0 | Any color
>  1 |  1 | both on
> 
> Now I'm investigating another approach, Marek's suggestion of using
> REGULATOR_GPIO, to see if it can meet my requirement. If yes, then I think
> no new driver is needed.
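
For reference, the four-level scheme from the patch description (OFF, LOW, MIDDLE, HIGH) can be modeled in a few lines of userspace C. This is only a sketch: the thread does not say which pin corresponds to which intensity, so the mapping below (P1 alone is LOW, P2 alone is MIDDLE, both is HIGH) and all `demo_*` names are assumptions.

```c
#include <stdbool.h>

/* Four intensity levels encoded so that bit 0 drives P1 and bit 1 drives P2.
 * The assignment of pins to levels is an assumption for illustration. */
enum demo_led_level {
    DEMO_LED_OFF    = 0,
    DEMO_LED_LOW    = 1,
    DEMO_LED_MIDDLE = 2,
    DEMO_LED_HIGH   = 3,
};

struct demo_gpio_pair {
    bool p1;
    bool p2;
};

/* Translate an intensity level into the state of the two GPIO lines. */
static struct demo_gpio_pair demo_level_to_pins(enum demo_led_level level)
{
    struct demo_gpio_pair pins = {
        .p1 = (level & 1) != 0,
        .p2 = (level & 2) != 0,
    };
    return pins;
}
```

With this encoding the brightness value written by userspace doubles as the pin bitmask, so no lookup table is needed.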

Maybe you could even implement multicolor-gpio, now that we have
multicolor LED class :)

Marek

Re: [PATCH v3 3/3] mm: fs: Invalidate BH LRU during page migration