Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.9-1 tag
Linus Torvalds writes:
> On Fri, Aug 7, 2020 at 6:14 AM Michael Ellerman wrote:
>>
>> Just one minor conflict, in a comment in drivers/misc/ocxl/config.c.
>
> Well, this morning I merged the ptrace ->regset_get() updates from Al,
> and that brought in a different conflict.

Ah fooey.

> I _think_ I resolved it correctly, but while the new model is fairly
> readable, the old one sure wasn't, and who knows how messed up my
> attempt to sort it out was. I don't know the pkey details on powerpc..

The old API was horrible, nice to see it gone.

> So I'd appreciate it if both Al and Aneesh Kumar would check that what
> I did to pkey_get() in arch/powerpc/kernel/ptrace/ptrace-view.c makes
> sense and works..

It looks right to me, except it doesn't build due to ret now being
unused:

  /linux/arch/powerpc/kernel/ptrace/ptrace-view.c: In function ‘pkey_get’:
  /linux/arch/powerpc/kernel/ptrace/ptrace-view.c:473:6: error: unused variable ‘ret’ [-Werror=unused-variable]
    473 |  int ret;

Patch below, do you mind taking it directly?

With that fixed our pkey selftests pass and show the expected values in
those regs.

> Side note - it might have been cleaner to just make it do
>
>         membuf_store(&to, target->thread.amr);
>         membuf_store(&to, target->thread.iamr);
>         return membuf_store(&to, default_uamor);
>
> instead of doing that membuf_write() for the first two ones and then
> the membuf_store() for the uamor field, but I did what I did to keep
> the logic as close to what it used to be as possible.

Yep fair enough.

> If I messed up, I apologize.
>
> And if you agree that making it three membuf_store() instead of that
> odd "depend on the exact order of the thread struct and pick two
> consecutive values", I'll leave that to you as a separate cleanup.

Will do.

cheers


From a280ae69f248a0f87b36170a94c5665ef5353f51 Mon Sep 17 00:00:00 2001
From: Michael Ellerman
Date: Sat, 8 Aug 2020 09:12:03 +1000
Subject: [PATCH] powerpc/ptrace: Fix build error in pkey_get()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The merge resolution in commit 25d8d4eecace left ret no longer used,
leading to:

  /linux/arch/powerpc/kernel/ptrace/ptrace-view.c: In function ‘pkey_get’:
  /linux/arch/powerpc/kernel/ptrace/ptrace-view.c:473:6: error: unused variable ‘ret’
    473 |  int ret;

Fix it by removing ret.

Fixes: 25d8d4eecace ("Merge tag 'powerpc-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux")
Signed-off-by: Michael Ellerman
---
 arch/powerpc/kernel/ptrace/ptrace-view.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index 19823a250aa0..7e6478e7ed07 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -470,8 +470,6 @@ static int pkey_active(struct task_struct *target, const struct user_regset *reg
 static int pkey_get(struct task_struct *target, const struct user_regset *regset,
 		    struct membuf to)
 {
-	int ret;
-
 	BUILD_BUG_ON(TSO(amr) + sizeof(unsigned long) != TSO(iamr));

 	if (!arch_pkeys_enabled())
--
2.25.1
Re: [PATCH 21/22] crypto: qce - add check for xts input length equal to zero
Hi,

Thanks for the patch!

On 8/7/20 7:20 PM, Andrei Botila wrote:
> From: Andrei Botila
>
> Standardize the way input lengths equal to 0 are handled in all skcipher
> algorithms. All the algorithms return 0 for input lengths equal to zero.
>
> Signed-off-by: Andrei Botila
> ---
>  drivers/crypto/qce/skcipher.c | 3 +++
>  1 file changed, 3 insertions(+)

Reviewed-by: Stanimir Varbanov

>
> diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
> index 5630c5addd28..887fd4dc9b43 100644
> --- a/drivers/crypto/qce/skcipher.c
> +++ b/drivers/crypto/qce/skcipher.c
> @@ -223,6 +223,9 @@ static int qce_skcipher_crypt(struct skcipher_request *req, int encrypt)
>  	int keylen;
>  	int ret;
>
> +	if (!req->cryptlen && IS_XTS(rctx->flags))
> +		return 0;
> +
>  	rctx->flags = tmpl->alg_flags;
>  	rctx->flags |= encrypt ? QCE_ENCRYPT : QCE_DECRYPT;
>  	keylen = IS_XTS(rctx->flags) ? ctx->enc_keylen >> 1 : ctx->enc_keylen;
> --

regards,
Stan
Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.9-1 tag
On Fri, Aug 07, 2020 at 10:46:13AM -0700, Linus Torvalds wrote:
> On Fri, Aug 7, 2020 at 6:14 AM Michael Ellerman wrote:
> >
> > Just one minor conflict, in a comment in drivers/misc/ocxl/config.c.
>
> Well, this morning I merged the ptrace ->regset_get() updates from Al,
> and that brought in a different conflict.
>
> I _think_ I resolved it correctly, but while the new model is fairly
> readable, the old one sure wasn't, and who knows how messed up my
> attempt to sort it out was. I don't know the pkey details on powerpc..
>
> So I'd appreciate it if both Al and Aneesh Kumar would check that what
> I did to pkey_get() in arch/powerpc/kernel/ptrace/ptrace-view.c makes
> sense and works..

Grabbing...

Looks sane and yes, 3 membuf_store() instead of membuf_write() +
membuf_store() would make sense (might even yield better code).
Up to ppc folks...

> Side note - it might have been cleaner to just make it do
>
>         membuf_store(&to, target->thread.amr);
>         membuf_store(&to, target->thread.iamr);
>         return membuf_store(&to, default_uamor);
>
> instead of doing that membuf_write() for the first two ones and then
> the membuf_store() for the uamor field, but I did what I did to keep
> the logic as close to what it used to be as possible.
>
> If I messed up, I apologize.
>
> And if you agree that making it three membuf_store() instead of that
> odd "depend on the exact order of the thread struct and pick two
> consecutive values", I'll leave that to you as a separate cleanup.
>
>                Linus
Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.9-1 tag
The pull request you sent on Fri, 07 Aug 2020 23:13:37 +1000:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.9-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/25d8d4eecace9de5a6a2193e4df1917afbdd3052

Thank you!

--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html
Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.9-1 tag
On Fri, Aug 7, 2020 at 6:14 AM Michael Ellerman wrote:
>
> Just one minor conflict, in a comment in drivers/misc/ocxl/config.c.

Well, this morning I merged the ptrace ->regset_get() updates from Al,
and that brought in a different conflict.

I _think_ I resolved it correctly, but while the new model is fairly
readable, the old one sure wasn't, and who knows how messed up my
attempt to sort it out was. I don't know the pkey details on powerpc..

So I'd appreciate it if both Al and Aneesh Kumar would check that what
I did to pkey_get() in arch/powerpc/kernel/ptrace/ptrace-view.c makes
sense and works..

Side note - it might have been cleaner to just make it do

        membuf_store(&to, target->thread.amr);
        membuf_store(&to, target->thread.iamr);
        return membuf_store(&to, default_uamor);

instead of doing that membuf_write() for the first two ones and then
the membuf_store() for the uamor field, but I did what I did to keep
the logic as close to what it used to be as possible.

If I messed up, I apologize.

And if you agree that making it three membuf_store() instead of that
odd "depend on the exact order of the thread struct and pick two
consecutive values", I'll leave that to you as a separate cleanup.

               Linus
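The side note above contrasts one membuf_write() spanning two adjacent
thread fields with three separate membuf_store() calls. A rough
userspace sketch of why the two should fill the buffer identically
follows. Everything named mock_* here is a hypothetical stand-in, not
the kernel's membuf API, and the struct layout is an assumption that
mirrors the BUILD_BUG_ON() check in pkey_get().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct membuf. */
struct mock_membuf {
    char *p;        /* next byte to fill */
    size_t left;    /* bytes still available */
};

/* Copy up to size bytes, clamped to the remaining space; return what is left. */
static size_t mock_write(struct mock_membuf *s, const void *v, size_t size)
{
    assert(s && v);
    if (size > s->left)
        size = s->left;
    memcpy(s->p, v, size);
    s->p += size;
    s->left -= size;
    return s->left;
}

/* Append a single unsigned long value. */
static size_t mock_store(struct mock_membuf *s, unsigned long v)
{
    return mock_write(s, &v, sizeof(v));
}

/* Assumed layout: the two registers sit next to each other. */
struct mock_thread {
    unsigned long amr;
    unsigned long iamr;
};

/* Same adjacency requirement that the BUILD_BUG_ON() expresses. */
_Static_assert(offsetof(struct mock_thread, amr) + sizeof(unsigned long) ==
               offsetof(struct mock_thread, iamr),
               "amr and iamr must be adjacent");

/* Variant A: one write covering both fields, then one store. */
static size_t fill_write_then_store(struct mock_membuf to,
                                    const struct mock_thread *t,
                                    unsigned long uamor)
{
    const char *base = (const char *)t + offsetof(struct mock_thread, amr);

    mock_write(&to, base, 2 * sizeof(unsigned long));
    return mock_store(&to, uamor);
}

/* Variant B: three stores, no dependence on field order. */
static size_t fill_three_stores(struct mock_membuf to,
                                const struct mock_thread *t,
                                unsigned long uamor)
{
    mock_store(&to, t->amr);
    mock_store(&to, t->iamr);
    return mock_store(&to, uamor);
}

/* Returns 1 when both variants fill the buffer completely and identically. */
int variants_match(void)
{
    struct mock_thread t = { .amr = 0x11, .iamr = 0x22 };
    unsigned long a[3] = { 0 }, b[3] = { 0 };
    struct mock_membuf ma = { (char *)a, sizeof(a) };
    struct mock_membuf mb = { (char *)b, sizeof(b) };
    size_t la = fill_write_then_store(ma, &t, 0x33);
    size_t lb = fill_three_stores(mb, &t, 0x33);

    return la == 0 && lb == 0 && memcmp(a, b, sizeof(a)) == 0;
}
```

Variant B needs no layout assertion at all, which is the readability
argument made in the mail; whether it also generates better code is not
something this sketch can show.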
[PATCH] arch/powerpc: use simple i2c probe function
The i2c probe functions here don't use the id information provided in
their second argument, so the single-parameter i2c probe function
("probe_new") can be used instead.

This avoids scanning the identifier tables during probes.

Signed-off-by: Stephen Kitt
---
 arch/powerpc/platforms/44x/ppc476.c            | 5 ++---
 arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c | 4 ++--
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/44x/ppc476.c b/arch/powerpc/platforms/44x/ppc476.c
index cba83eee685c..07f7e3ce67b5 100644
--- a/arch/powerpc/platforms/44x/ppc476.c
+++ b/arch/powerpc/platforms/44x/ppc476.c
@@ -86,8 +86,7 @@ static void __noreturn avr_reset_system(char *cmd)
 	avr_halt_system(AVR_PWRCTL_RESET);
 }

-static int avr_probe(struct i2c_client *client,
-			    const struct i2c_device_id *id)
+static int avr_probe(struct i2c_client *client)
 {
 	avr_i2c_client = client;
 	ppc_md.restart = avr_reset_system;
@@ -104,7 +103,7 @@ static struct i2c_driver avr_driver = {
 	.driver = {
 		.name = "akebono-avr",
 	},
-	.probe = avr_probe,
+	.probe_new = avr_probe,
 	.id_table = avr_id,
 };

diff --git a/arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c b/arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c
index 0967bdfb1691..409481016928 100644
--- a/arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c
+++ b/arch/powerpc/platforms/83xx/mcu_mpc8349emitx.c
@@ -142,7 +142,7 @@ static int mcu_gpiochip_remove(struct mcu *mcu)
 	return 0;
 }

-static int mcu_probe(struct i2c_client *client, const struct i2c_device_id *id)
+static int mcu_probe(struct i2c_client *client)
 {
 	struct mcu *mcu;
 	int ret;
@@ -221,7 +221,7 @@ static struct i2c_driver mcu_driver = {
 		.name = "mcu-mpc8349emitx",
 		.of_match_table = mcu_of_match_table,
 	},
-	.probe = mcu_probe,
+	.probe_new = mcu_probe,
 	.remove = mcu_remove,
 	.id_table = mcu_ids,
 };
--
2.25.4
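To illustrate the "avoids scanning the identifier tables" remark, here
is a simplified userspace model of a bus core that prefers the
single-argument callback and only walks the id table for the
two-argument one. All mock_* names, the "mock-dev" id, and the dispatch
function are assumptions made for this sketch; it is not the i2c core's
actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for i2c_device_id, i2c_client and i2c_driver. */
struct mock_id { const char *name; };
struct mock_client { const char *name; };

struct mock_driver {
    int (*probe)(struct mock_client *client, const struct mock_id *id);
    int (*probe_new)(struct mock_client *client);
    const struct mock_id *id_table;   /* terminated by a NULL name */
};

static int id_table_scans;   /* counts how often the table is walked */

/* Linear lookup of the client's name in the driver's id table. */
static const struct mock_id *mock_match_id(const struct mock_id *table,
                                           const struct mock_client *client)
{
    id_table_scans++;
    for (; table && table->name; table++)
        if (strcmp(table->name, client->name) == 0)
            return table;
    return NULL;
}

/* Assumed dispatch order: single-argument callback first, no lookup needed. */
static int mock_bus_probe(const struct mock_driver *drv,
                          struct mock_client *client)
{
    assert(drv && client);
    if (drv->probe_new)
        return drv->probe_new(client);
    if (drv->probe)
        return drv->probe(client, mock_match_id(drv->id_table, client));
    return -1;
}

static int old_style_probe(struct mock_client *client,
                           const struct mock_id *id)
{
    (void)client;
    (void)id;    /* unused, as in the drivers touched by the patch */
    return 0;
}

static int new_style_probe(struct mock_client *client)
{
    (void)client;
    return 0;
}

/* Returns the number of id-table scans one probe triggers, or -1 on failure. */
int scans_for(int use_probe_new)
{
    static const struct mock_id ids[] = { { "mock-dev" }, { NULL } };
    struct mock_client client = { "mock-dev" };
    struct mock_driver drv = { .id_table = ids };

    if (use_probe_new)
        drv.probe_new = new_style_probe;
    else
        drv.probe = old_style_probe;

    id_table_scans = 0;
    if (mock_bus_probe(&drv, &client) != 0)
        return -1;
    return id_table_scans;
}
```

In this model scans_for(0) walks the table once and scans_for(1) never
does, which is the saving the commit message describes.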
[PATCH 22/22] crypto: vmx - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: "Breno Leitão"
Cc: Nayna Jain
Cc: Paulo Flabiano Smorigo
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Signed-off-by: Andrei Botila
---
 drivers/crypto/vmx/aes_xts.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/crypto/vmx/aes_xts.c b/drivers/crypto/vmx/aes_xts.c
index 9fee1b1532a4..33107c9e2656 100644
--- a/drivers/crypto/vmx/aes_xts.c
+++ b/drivers/crypto/vmx/aes_xts.c
@@ -84,6 +84,9 @@ static int p8_aes_xts_crypt(struct skcipher_request *req, int enc)
 	u8 tweak[AES_BLOCK_SIZE];
 	int ret;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

--
2.17.1
[PATCH 21/22] crypto: qce - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Signed-off-by: Andrei Botila
---
 drivers/crypto/qce/skcipher.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/crypto/qce/skcipher.c b/drivers/crypto/qce/skcipher.c
index 5630c5addd28..887fd4dc9b43 100644
--- a/drivers/crypto/qce/skcipher.c
+++ b/drivers/crypto/qce/skcipher.c
@@ -223,6 +223,9 @@ static int qce_skcipher_crypt(struct skcipher_request *req, int encrypt)
 	int keylen;
 	int ret;

+	if (!req->cryptlen && IS_XTS(rctx->flags))
+		return 0;
+
 	rctx->flags = tmpl->alg_flags;
 	rctx->flags |= encrypt ? QCE_ENCRYPT : QCE_DECRYPT;
 	keylen = IS_XTS(rctx->flags) ? ctx->enc_keylen >> 1 : ctx->enc_keylen;
--
2.17.1
[PATCH 20/22] crypto: octeontx - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Boris Brezillon
Cc: Arnaud Ebalard
Cc: Srujana Challa
Signed-off-by: Andrei Botila
---
 drivers/crypto/marvell/octeontx/otx_cptvf_algs.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/crypto/marvell/octeontx/otx_cptvf_algs.c b/drivers/crypto/marvell/octeontx/otx_cptvf_algs.c
index 90bb31329d4b..ec13bc3f1766 100644
--- a/drivers/crypto/marvell/octeontx/otx_cptvf_algs.c
+++ b/drivers/crypto/marvell/octeontx/otx_cptvf_algs.c
@@ -340,11 +340,16 @@ static inline int cpt_enc_dec(struct skcipher_request *req, u32 enc)
 {
 	struct crypto_skcipher *stfm = crypto_skcipher_reqtfm(req);
 	struct otx_cpt_req_ctx *rctx = skcipher_request_ctx(req);
+	struct crypto_tfm *tfm = crypto_skcipher_tfm(stfm);
+	struct otx_cpt_enc_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct otx_cpt_req_info *req_info = &rctx->cpt_req;
 	u32 enc_iv_len = crypto_skcipher_ivsize(stfm);
 	struct pci_dev *pdev;
 	int status, cpu_num;

+	if (!req->cryptlen && ctx->cipher_type == OTX_CPT_AES_XTS)
+		return 0;
+
 	/* Validate that request doesn't exceed maximum CPT supported size */
 	if (req->cryptlen > OTX_CPT_MAX_REQ_SIZE)
 		return -E2BIG;
--
2.17.1
[PATCH 19/22] crypto: inside-secure - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Antoine Tenart
Signed-off-by: Andrei Botila
---
 drivers/crypto/inside-secure/safexcel_cipher.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index 1ac3253b7903..03d06556ea98 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -2533,6 +2533,9 @@ static int safexcel_skcipher_aes_xts_cra_init(struct crypto_tfm *tfm)

 static int safexcel_encrypt_xts(struct skcipher_request *req)
 {
+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < XTS_BLOCK_SIZE)
 		return -EINVAL;
 	return safexcel_queue_req(&req->base, skcipher_request_ctx(req),
@@ -2541,6 +2544,9 @@ static int safexcel_encrypt_xts(struct skcipher_request *req)

 static int safexcel_decrypt_xts(struct skcipher_request *req)
 {
+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < XTS_BLOCK_SIZE)
 		return -EINVAL;
 	return safexcel_queue_req(&req->base, skcipher_request_ctx(req),
--
2.17.1
[PATCH 18/22] crypto: hisilicon/sec - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Signed-off-by: Andrei Botila
---
 drivers/crypto/hisilicon/sec/sec_algs.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/crypto/hisilicon/sec/sec_algs.c b/drivers/crypto/hisilicon/sec/sec_algs.c
index 8ca945ac297e..419ec4f23164 100644
--- a/drivers/crypto/hisilicon/sec/sec_algs.c
+++ b/drivers/crypto/hisilicon/sec/sec_algs.c
@@ -723,6 +723,10 @@ static int sec_alg_skcipher_crypto(struct skcipher_request *skreq,
 	bool split = skreq->src != skreq->dst;
 	gfp_t gfp = skreq->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP ? GFP_KERNEL : GFP_ATOMIC;

+	if (!skreq->cryptlen && (ctx->cipher_alg == SEC_C_AES_XTS_128 ||
+				 ctx->cipher_alg == SEC_C_AES_XTS_256))
+		return 0;
+
 	mutex_init(&sec_req->lock);
 	sec_req->req_base = &skreq->base;
 	sec_req->err = 0;
--
2.17.1
[PATCH 17/22] crypto: chelsio - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Ayush Sawal
Cc: Vinay Kumar Yadav
Cc: Rohit Maheshwari
Signed-off-by: Andrei Botila
---
 drivers/crypto/chelsio/chcr_algo.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/crypto/chelsio/chcr_algo.c b/drivers/crypto/chelsio/chcr_algo.c
index 13b908ea4873..e9746580870a 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -1372,8 +1372,12 @@ static int chcr_aes_encrypt(struct skcipher_request *req)
 	int err;
 	struct uld_ctx *u_ctx = ULD_CTX(c_ctx(tfm));
 	struct chcr_context *ctx = c_ctx(tfm);
+	int subtype = get_cryptoalg_subtype(tfm);
 	unsigned int cpu;

+	if (!req->cryptlen && subtype == CRYPTO_ALG_SUB_TYPE_XTS)
+		return 0;
+
 	cpu = get_cpu();
 	reqctx->txqidx = cpu % ctx->ntxq;
 	reqctx->rxqidx = cpu % ctx->nrxq;
--
2.17.1
[PATCH 16/22] crypto: ccree - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.
This change has implications not only for xts(aes) but also for
cts(cbc(aes)) and cts(cbc(paes)).

Cc: Gilad Ben-Yossef
Signed-off-by: Andrei Botila
---
 drivers/crypto/ccree/cc_cipher.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/ccree/cc_cipher.c b/drivers/crypto/ccree/cc_cipher.c
index 076669dc1035..112bb8b4dce6 100644
--- a/drivers/crypto/ccree/cc_cipher.c
+++ b/drivers/crypto/ccree/cc_cipher.c
@@ -912,17 +912,18 @@ static int cc_cipher_process(struct skcipher_request *req,

 	/* STAT_PHASE_0: Init and sanity checks */

-	if (validate_data_size(ctx_p, nbytes)) {
-		dev_dbg(dev, "Unsupported data size %d.\n", nbytes);
-		rc = -EINVAL;
-		goto exit_process;
-	}
 	if (nbytes == 0) {
 		/* No data to process is valid */
 		rc = 0;
 		goto exit_process;
 	}

+	if (validate_data_size(ctx_p, nbytes)) {
+		dev_dbg(dev, "Unsupported data size %d.\n", nbytes);
+		rc = -EINVAL;
+		goto exit_process;
+	}
+
 	if (ctx_p->fallback_on) {
 		struct skcipher_request *subreq = skcipher_request_ctx(req);

--
2.17.1
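The ccree change only swaps the order of two existing checks. A small
userspace model of why that order matters is sketched below. It assumes
a size validator that rejects anything shorter than one block, which is
a guess at how validate_data_size() behaves for the affected modes; the
mock_* and process_* names are invented for the sketch.

```c
#include <assert.h>
#include <errno.h>

#define MOCK_BLOCK_SIZE 16u
#define MOCK_QUEUED 1   /* placeholder for "request would be processed" */

/* Assumed validator: lengths below one block are unsupported. */
static int mock_validate_data_size(unsigned int nbytes)
{
    return nbytes >= MOCK_BLOCK_SIZE ? 0 : -EINVAL;
}

/* Old order: size validation runs before the zero-length shortcut. */
int process_validate_first(unsigned int nbytes)
{
    if (mock_validate_data_size(nbytes))
        return -EINVAL;
    if (nbytes == 0)
        return 0;       /* unreachable under the assumed validator */
    return MOCK_QUEUED;
}

/* New order: zero length is accepted before any size validation. */
int process_zero_first(unsigned int nbytes)
{
    if (nbytes == 0)
        return 0;
    if (mock_validate_data_size(nbytes))
        return -EINVAL;
    return MOCK_QUEUED;
}

/* Spells out the expected return values for both orders. */
void ordering_selfcheck(void)
{
    assert(process_validate_first(0) == -EINVAL);
    assert(process_zero_first(0) == 0);
    assert(process_zero_first(8) == -EINVAL);
    assert(process_zero_first(32) == MOCK_QUEUED);
}
```

Under that assumption, only the zero-length case changes its result;
short non-zero requests are still rejected in both orders.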
[PATCH 15/22] crypto: ccp - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Tom Lendacky
Cc: John Allen
Signed-off-by: Andrei Botila
---
 drivers/crypto/ccp/ccp-crypto-aes-xts.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/crypto/ccp/ccp-crypto-aes-xts.c b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
index 6849261ca47d..6a93b54d388a 100644
--- a/drivers/crypto/ccp/ccp-crypto-aes-xts.c
+++ b/drivers/crypto/ccp/ccp-crypto-aes-xts.c
@@ -113,6 +113,9 @@ static int ccp_aes_xts_crypt(struct skcipher_request *req,
 	u32 unit_size;
 	int ret;

+	if (!req->cryptlen)
+		return 0;
+
 	if (!ctx->u.aes.key_len)
 		return -EINVAL;

--
2.17.1
[PATCH 14/22] crypto: cavium/nitrox - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Srikanth Jampala
Cc: Nagadheeraj Rottela
Signed-off-by: Andrei Botila
---
 drivers/crypto/cavium/nitrox/nitrox_skcipher.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/crypto/cavium/nitrox/nitrox_skcipher.c b/drivers/crypto/cavium/nitrox/nitrox_skcipher.c
index a553ac65f324..d76589ebe354 100644
--- a/drivers/crypto/cavium/nitrox/nitrox_skcipher.c
+++ b/drivers/crypto/cavium/nitrox/nitrox_skcipher.c
@@ -249,10 +249,16 @@ static int nitrox_skcipher_crypt(struct skcipher_request *skreq, bool enc)
 	struct crypto_skcipher *cipher = crypto_skcipher_reqtfm(skreq);
 	struct nitrox_crypto_ctx *nctx = crypto_skcipher_ctx(cipher);
 	struct nitrox_kcrypt_request *nkreq = skcipher_request_ctx(skreq);
+	struct crypto_tfm *tfm = crypto_skcipher_tfm(cipher);
 	int ivsize = crypto_skcipher_ivsize(cipher);
 	struct se_crypto_request *creq;
+	const char *name;
 	int ret;

+	name = crypto_tfm_alg_name(tfm);
+	if (!skreq->cryptlen && flexi_cipher_type(name) == CIPHER_AES_XTS)
+		return 0;
+
 	creq = &nkreq->creq;
 	creq->flags = skreq->base.flags;
 	creq->gfp = (skreq->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) ?
--
2.17.1
[PATCH 13/22] crypto: cavium/cpt - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: George Cherian
Signed-off-by: Andrei Botila
---
 drivers/crypto/cavium/cpt/cptvf_algs.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/crypto/cavium/cpt/cptvf_algs.c b/drivers/crypto/cavium/cpt/cptvf_algs.c
index 5af0dc2a8909..edc18c8dd571 100644
--- a/drivers/crypto/cavium/cpt/cptvf_algs.c
+++ b/drivers/crypto/cavium/cpt/cptvf_algs.c
@@ -193,6 +193,7 @@ static inline void create_output_list(struct skcipher_request *req,
 static inline int cvm_enc_dec(struct skcipher_request *req, u32 enc)
 {
 	struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req);
+	struct cvm_enc_ctx *ctx = crypto_skcipher_ctx(tfm);
 	struct cvm_req_ctx *rctx = skcipher_request_ctx(req);
 	u32 enc_iv_len = crypto_skcipher_ivsize(tfm);
 	struct fc_context *fctx = &rctx->fctx;
@@ -200,6 +201,9 @@ static inline int cvm_enc_dec(struct skcipher_request *req, u32 enc)
 	void *cdev = NULL;
 	int status;

+	if (!req->cryptlen && ctx->cipher_type == AES_XTS)
+		return 0;
+
 	memset(req_info, 0, sizeof(struct cpt_request_info));
 	req_info->may_sleep = (req->base.flags & CRYPTO_TFM_REQ_MAY_SLEEP) != 0;
 	memset(fctx, 0, sizeof(struct fc_context));
--
2.17.1
[PATCH 12/22] crypto: bcm - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Zhang Shengju
Cc: Tang Bin
Signed-off-by: Andrei Botila
---
 drivers/crypto/bcm/cipher.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/crypto/bcm/cipher.c b/drivers/crypto/bcm/cipher.c
index 8a7fa1ae1ade..8a6f225f4db7 100644
--- a/drivers/crypto/bcm/cipher.c
+++ b/drivers/crypto/bcm/cipher.c
@@ -1754,6 +1754,9 @@ static int skcipher_enqueue(struct skcipher_request *req, bool encrypt)
 	    crypto_skcipher_ctx(crypto_skcipher_reqtfm(req));
 	int err;

+	if (!req->cryptlen && ctx->cipher.mode == CIPHER_MODE_XTS)
+		return 0;
+
 	flow_log("%s() enc:%u\n", __func__, encrypt);

 	rctx->gfp = (req->base.flags & (CRYPTO_TFM_REQ_MAY_BACKLOG |
--
2.17.1
[PATCH 11/22] crypto: artpec6 - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Jesper Nilsson
Cc: Lars Persson
Signed-off-by: Andrei Botila
---
 drivers/crypto/axis/artpec6_crypto.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/crypto/axis/artpec6_crypto.c b/drivers/crypto/axis/artpec6_crypto.c
index 1a46eeddf082..243880c97629 100644
--- a/drivers/crypto/axis/artpec6_crypto.c
+++ b/drivers/crypto/axis/artpec6_crypto.c
@@ -1090,6 +1090,9 @@ static int artpec6_crypto_encrypt(struct skcipher_request *req)
 	void (*complete)(struct crypto_async_request *req);
 	int ret;

+	if (!req->cryptlen)
+		return 0;
+
 	req_ctx = skcipher_request_ctx(req);

 	switch (ctx->crypto_type) {
@@ -1135,6 +1138,9 @@ static int artpec6_crypto_decrypt(struct skcipher_request *req)
 	struct artpec6_crypto_request_context *req_ctx = NULL;
 	void (*complete)(struct crypto_async_request *req);

+	if (!req->cryptlen)
+		return 0;
+
 	req_ctx = skcipher_request_ctx(req);

 	switch (ctx->crypto_type) {
--
2.17.1
[PATCH 10/22] crypto: atmel-aes - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Nicolas Ferre
Cc: Alexandre Belloni
Cc: Ludovic Desroches
Signed-off-by: Andrei Botila
---
 drivers/crypto/atmel-aes.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
index a6e14491e080..af789ac73478 100644
--- a/drivers/crypto/atmel-aes.c
+++ b/drivers/crypto/atmel-aes.c
@@ -1107,6 +1107,10 @@ static int atmel_aes_crypt(struct skcipher_request *req, unsigned long mode)
 		ctx->block_size = CFB64_BLOCK_SIZE;
 		break;

+	case AES_FLAGS_XTS:
+		if (!req->cryptlen)
+			return 0;
+
 	default:
 		ctx->block_size = AES_BLOCK_SIZE;
 		break;
--
2.17.1
[PATCH 09/22] crypto: xts - add check for block length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Signed-off-by: Andrei Botila
---
 crypto/xts.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/crypto/xts.c b/crypto/xts.c
index 3c3ed02c7663..7df68f52fddc 100644
--- a/crypto/xts.c
+++ b/crypto/xts.c
@@ -263,6 +263,9 @@ static int xts_encrypt(struct skcipher_request *req)
 	struct skcipher_request *subreq = &rctx->subreq;
 	int err;

+	if (!req->cryptlen)
+		return 0;
+
 	err = xts_init_crypt(req, xts_encrypt_done) ?:
 	      xts_xor_tweak_pre(req, true) ?:
 	      crypto_skcipher_encrypt(subreq) ?:
@@ -280,6 +283,9 @@ static int xts_decrypt(struct skcipher_request *req)
 	struct skcipher_request *subreq = &rctx->subreq;
 	int err;

+	if (!req->cryptlen)
+		return 0;
+
 	err = xts_init_crypt(req, xts_decrypt_done) ?:
 	      xts_xor_tweak_pre(req, false) ?:
 	      crypto_skcipher_decrypt(subreq) ?:
--
2.17.1
[PATCH 08/22] crypto: x86/glue_helper - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: "H. Peter Anvin"
Signed-off-by: Andrei Botila
---
 arch/x86/crypto/glue_helper.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/crypto/glue_helper.c b/arch/x86/crypto/glue_helper.c
index d3d91a0abf88..cc5042c72910 100644
--- a/arch/x86/crypto/glue_helper.c
+++ b/arch/x86/crypto/glue_helper.c
@@ -275,6 +275,9 @@ int glue_xts_req_128bit(const struct common_glue_ctx *gctx,
 	unsigned int nbytes, tail;
 	int err;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < XTS_BLOCK_SIZE)
 		return -EINVAL;

--
2.17.1
[PATCH 07/22] crypto: s390/paes - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Christian Borntraeger
Signed-off-by: Andrei Botila
---
 arch/s390/crypto/paes_s390.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/s390/crypto/paes_s390.c b/arch/s390/crypto/paes_s390.c
index f3caeb17c85b..7f0861c6f019 100644
--- a/arch/s390/crypto/paes_s390.c
+++ b/arch/s390/crypto/paes_s390.c
@@ -494,6 +494,9 @@ static int xts_paes_crypt(struct skcipher_request *req, unsigned long modifier)
 		u8 init[16];
 	} xts_param;

+	if (!req->cryptlen)
+		return 0;
+
 	ret = skcipher_walk_virt(&walk, req, false);
 	if (ret)
 		return ret;
--
2.17.1
[PATCH 06/22] crypto: s390/aes - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Christian Borntraeger
Signed-off-by: Andrei Botila
---
 arch/s390/crypto/aes_s390.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/s390/crypto/aes_s390.c b/arch/s390/crypto/aes_s390.c
index 73044634d342..bc8855f4b7d1 100644
--- a/arch/s390/crypto/aes_s390.c
+++ b/arch/s390/crypto/aes_s390.c
@@ -437,6 +437,9 @@ static int xts_aes_crypt(struct skcipher_request *req, unsigned long modifier)
 		u8 init[16];
 	} xts_param;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

--
2.17.1
[PATCH 05/22] crypto: powerpc/aes-spe - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Signed-off-by: Andrei Botila
---
 arch/powerpc/crypto/aes-spe-glue.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/crypto/aes-spe-glue.c b/arch/powerpc/crypto/aes-spe-glue.c
index c2b23b69d7b1..f37d8bef322b 100644
--- a/arch/powerpc/crypto/aes-spe-glue.c
+++ b/arch/powerpc/crypto/aes-spe-glue.c
@@ -327,6 +327,9 @@ static int ppc_xts_encrypt(struct skcipher_request *req)
 	u8 b[2][AES_BLOCK_SIZE];
 	int err;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

@@ -366,6 +369,9 @@ static int ppc_xts_decrypt(struct skcipher_request *req)
 	le128 twk;
 	int err;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

--
2.17.1
[PATCH 04/22] crypto: arm64/aes-neonbs - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Catalin Marinas
Cc: Will Deacon
Signed-off-by: Andrei Botila
---
 arch/arm64/crypto/aes-neonbs-glue.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index fb507d569922..197bf24e7dae 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -330,6 +330,9 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
 	int first = 1;
 	u8 *out, *in;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

--
2.17.1
[PATCH 03/22] crypto: arm64/aes - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Catalin Marinas
Cc: Will Deacon
Signed-off-by: Andrei Botila
---
 arch/arm64/crypto/aes-glue.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/crypto/aes-glue.c b/arch/arm64/crypto/aes-glue.c
index 395bbf64b2ab..44c9644c74b1 100644
--- a/arch/arm64/crypto/aes-glue.c
+++ b/arch/arm64/crypto/aes-glue.c
@@ -515,6 +515,9 @@ static int __maybe_unused xts_encrypt(struct skcipher_request *req)
 	struct scatterlist *src, *dst;
 	struct skcipher_walk walk;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

@@ -587,6 +590,9 @@ static int __maybe_unused xts_decrypt(struct skcipher_request *req)
 	struct scatterlist *src, *dst;
 	struct skcipher_walk walk;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

--
2.17.1
[PATCH 02/22] crypto: arm/aes-neonbs - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Russell King
Signed-off-by: Andrei Botila
---
 arch/arm/crypto/aes-neonbs-glue.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm/crypto/aes-neonbs-glue.c b/arch/arm/crypto/aes-neonbs-glue.c
index e6fd32919c81..98ca6e6cca90 100644
--- a/arch/arm/crypto/aes-neonbs-glue.c
+++ b/arch/arm/crypto/aes-neonbs-glue.c
@@ -339,6 +339,9 @@ static int __xts_crypt(struct skcipher_request *req, bool encrypt,
 	struct skcipher_walk walk;
 	int err;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

--
2.17.1
[PATCH 01/22] crypto: arm/aes-ce - add check for xts input length equal to zero
From: Andrei Botila

Standardize the way input lengths equal to 0 are handled in all skcipher
algorithms. All the algorithms return 0 for input lengths equal to zero.

Cc: Russell King
Signed-off-by: Andrei Botila
---
 arch/arm/crypto/aes-ce-glue.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm/crypto/aes-ce-glue.c b/arch/arm/crypto/aes-ce-glue.c
index b668c97663ec..57a9cf7fe98a 100644
--- a/arch/arm/crypto/aes-ce-glue.c
+++ b/arch/arm/crypto/aes-ce-glue.c
@@ -452,6 +452,9 @@ static int xts_encrypt(struct skcipher_request *req)
 	struct scatterlist *src, *dst;
 	struct skcipher_walk walk;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

@@ -524,6 +527,9 @@ static int xts_decrypt(struct skcipher_request *req)
 	struct scatterlist *src, *dst;
 	struct skcipher_walk walk;

+	if (!req->cryptlen)
+		return 0;
+
 	if (req->cryptlen < AES_BLOCK_SIZE)
 		return -EINVAL;

--
2.17.1
[PATCH 00/22] crypto: add check for xts input length equal to zero
From: Andrei Botila This patch set is a follow-up on the previous RFC discussion which can be found here: https://lore.kernel.org/r/4145904.a5p2xsn...@tauon.chronox.de This series converts all XTS implementations to return 0 when the input length is equal to 0. This change is necessary in order to standardize the way skcipher algorithms handle this corner case. This check is made for other algorithms such as CBC, ARC4, CFB, OFB, SALSA20, CTR, ECB and PCBC, XTS being the outlier here. Although some drivers do not explicitly check for requests with zero input length, their implementations might be able to deal with this case. Since we don't have the HW to test which ones are able and which ones are not we rely on the maintainers of these drivers to verify and comment if the changes are necessary in their driver or not. One important thing to keep in mind is that in some implementations we make this check only for XTS algorithms although probably all skcipher algorithms should return 0 in case of zero input length. This fix has been tested only on ARMv8 CE, the rest of the patches have been build tested *only*, and should be tested on actual hardware before being merged. 
Andrei Botila (22):
  crypto: arm/aes-ce - add check for xts input length equal to zero
  crypto: arm/aes-neonbs - add check for xts input length equal to zero
  crypto: arm64/aes - add check for xts input length equal to zero
  crypto: arm64/aes-neonbs - add check for xts input length equal to zero
  crypto: powerpc/aes-spe - add check for xts input length equal to zero
  crypto: s390/aes - add check for xts input length equal to zero
  crypto: s390/paes - add check for xts input length equal to zero
  crypto: x86/glue_helper - add check for xts input length equal to zero
  crypto: xts - add check for block length equal to zero
  crypto: atmel-aes - add check for xts input length equal to zero
  crypto: artpec6 - add check for xts input length equal to zero
  crypto: bcm - add check for xts input length equal to zero
  crypto: cavium/cpt - add check for xts input length equal to zero
  crypto: cavium/nitrox - add check for xts input length equal to zero
  crypto: ccp - add check for xts input length equal to zero
  crypto: ccree - add check for xts input length equal to zero
  crypto: chelsio - add check for xts input length equal to zero
  crypto: hisilicon/sec - add check for xts input length equal to zero
  crypto: inside-secure - add check for xts input length equal to zero
  crypto: octeontx - add check for xts input length equal to zero
  crypto: qce - add check for xts input length equal to zero
  crypto: vmx - add check for xts input length equal to zero

 arch/arm/crypto/aes-ce-glue.c                    |  6 ++
 arch/arm/crypto/aes-neonbs-glue.c                |  3 +++
 arch/arm64/crypto/aes-glue.c                     |  6 ++
 arch/arm64/crypto/aes-neonbs-glue.c              |  3 +++
 arch/powerpc/crypto/aes-spe-glue.c               |  6 ++
 arch/s390/crypto/aes_s390.c                      |  3 +++
 arch/s390/crypto/paes_s390.c                     |  3 +++
 arch/x86/crypto/glue_helper.c                    |  3 +++
 crypto/xts.c                                     |  6 ++
 drivers/crypto/atmel-aes.c                       |  4
 drivers/crypto/axis/artpec6_crypto.c             |  6 ++
 drivers/crypto/bcm/cipher.c                      |  3 +++
 drivers/crypto/cavium/cpt/cptvf_algs.c           |  4
 drivers/crypto/cavium/nitrox/nitrox_skcipher.c   |  6 ++
 drivers/crypto/ccp/ccp-crypto-aes-xts.c          |  3 +++
 drivers/crypto/ccree/cc_cipher.c                 | 11 ++-
 drivers/crypto/chelsio/chcr_algo.c               |  4
 drivers/crypto/hisilicon/sec/sec_algs.c          |  4
 drivers/crypto/inside-secure/safexcel_cipher.c   |  6 ++
 drivers/crypto/marvell/octeontx/otx_cptvf_algs.c |  5 +
 drivers/crypto/qce/skcipher.c                    |  3 +++
 drivers/crypto/vmx/aes_xts.c                     |  3 +++
 22 files changed, 96 insertions(+), 5 deletions(-)

--
2.17.1
Re: [RFC PATCH 1/2] powerpc/numa: Introduce logical numa id
"Aneesh Kumar K.V" writes: > On 8/7/20 9:54 AM, Nathan Lynch wrote: >> "Aneesh Kumar K.V" writes: >>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c >>> index e437a9ac4956..6c659aada55b 100644 >>> --- a/arch/powerpc/mm/numa.c >>> +++ b/arch/powerpc/mm/numa.c >>> @@ -221,25 +221,51 @@ static void initialize_distance_lookup_table(int nid, >>> } >>> } >>> >>> +static u32 nid_map[MAX_NUMNODES] = {[0 ... MAX_NUMNODES - 1] = >>> NUMA_NO_NODE}; >> >> It's odd to me to use MAX_NUMNODES for this array when it's going to be >> indexed not by Linux's logical node IDs but by the platform-provided >> domain number, which has no relation to MAX_NUMNODES. > > > I didn't want to dynamically allocate this. We could fetch > "ibm,max-associativity-domains" to find the size for that. The current > code do assume firmware group id to not exceed MAX_NUMNODES. Hence kept > the array size to be MAX_NUMNODEs. I do agree that it is confusing. May > be we can do #define MAX_AFFINITY_DOMAIN MAX_NUMNODES? Well, consider: - ibm,max-associativity-domains can change at runtime with LPM. This doesn't happen in practice yet, but we should probably start thinking about how to support that. - The domain numbering isn't clearly specified to have any particular properties such as beginning at zero or a contiguous range. While the current code likely contains assumptions contrary to these points, a change such as this is an opportunity to think about whether those assumptions can be reduced or removed. In particular I think it would be good to gracefully degrade when the number of NUMA affinity domains can exceed MAX_NUMNODES. Using the platform-supplied domain numbers to directly index Linux data structures will make that impossible. So, maybe genradix or even xarray wouldn't actually be overengineering here.
Re: [Latest Git kernel/Linux-next kernel] Xorg doesn't start after the seccomp updates v5.9-rc1
Hi Kees,

Thanks a lot for your patch! It applies to the Git source code, but the resulting kernel doesn’t boot. In my view your modifications aren’t responsible for this second issue: the kernel can’t initialize the graphics card anymore, and I think the latest DRM updates are the cause. Because of this second issue I can’t test your patch.

Please test the latest Git kernel.

Thanks,
Christian

> On 7. Aug 2020, at 19:45, Kees Cook wrote:
>
> On Fri, Aug 07, 2020 at 04:45:14PM +0200, Christian Zigotzky wrote:
>> But Xorg works on Ubuntu 10.04.4 (PowerPC 32-bit), openSUSE Tumbleweed
>> 20190722 PPC64 and on Fedora 27 PPC64 with the latest Git kernel.
>>
>> I bisected today [4].
>>
>> Result: net/scm: Regularize compat handling of scm_detach_fds()
>> (c0029de50982c1fb215330a5f9d433cec0cfd8cc) [5] is the first bad commit.
>>
>> This commit has been merged with the seccomp updates v5.9-rc1 on 2020-08-04
>> 14:11:08 -0700 [1]. Since these updates, Xorg doesn't start anymore on some
>> Linux distributions.
>
> Hi! Thanks for bisecting; yes, sorry for the trouble (I'm still trying
> to understand why my compat tests _passed_...). Regardless, can you try
> this patch:
>
> https://lore.kernel.org/lkml/20200807173609.GJ4402@mussarela/
>
> --
> Kees Cook
Re: [PATCH 10/22] crypto: atmel-aes - add check for xts input length equal to zero
Hi Andrei,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on cryptodev/master]
[also build test WARNING on crypto/master next-20200807]
[cannot apply to powerpc/next sparc-next/master v5.8]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Andrei-Botila/crypto-add-check-for-xts-input-length-equal-to-zero/20200808-002648
base:   https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master
config: arm-defconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arm

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot

All warnings (new ones prefixed by >>):

   drivers/crypto/atmel-aes.c: In function 'atmel_aes_crypt':
>> drivers/crypto/atmel-aes.c:1111:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
    1111 |   if (!req->cryptlen)
         |      ^
   drivers/crypto/atmel-aes.c:1114:2: note: here
    1114 |  default:
         |  ^~~

vim +1111 drivers/crypto/atmel-aes.c

  1085
  1086  static int atmel_aes_crypt(struct skcipher_request *req, unsigned long mode)
  1087  {
  1088          struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(req);
  1089          struct atmel_aes_base_ctx *ctx = crypto_skcipher_ctx(skcipher);
  1090          struct atmel_aes_reqctx *rctx;
  1091          struct atmel_aes_dev *dd;
  1092
  1093          switch (mode & AES_FLAGS_OPMODE_MASK) {
  1094          case AES_FLAGS_CFB8:
  1095                  ctx->block_size = CFB8_BLOCK_SIZE;
  1096                  break;
  1097
  1098          case AES_FLAGS_CFB16:
  1099                  ctx->block_size = CFB16_BLOCK_SIZE;
  1100                  break;
  1101
  1102          case AES_FLAGS_CFB32:
  1103                  ctx->block_size = CFB32_BLOCK_SIZE;
  1104                  break;
  1105
  1106          case AES_FLAGS_CFB64:
  1107                  ctx->block_size = CFB64_BLOCK_SIZE;
  1108                  break;
  1109
  1110          case AES_FLAGS_XTS:
> 1111                  if (!req->cryptlen)
  1112                          return 0;
  1113
  1114          default:
  1115                  ctx->block_size = AES_BLOCK_SIZE;
  1116                  break;
  1117          }
  1118          ctx->is_aead = false;
  1119
  1120          dd = atmel_aes_find_dev(ctx);
  1121          if (!dd)
  1122                  return -ENODEV;
  1123
  1124          rctx = skcipher_request_ctx(req);
  1125          rctx->mode = mode;
  1126
  1127          if ((mode & AES_FLAGS_OPMODE_MASK) != AES_FLAGS_ECB &&
  1128              !(mode & AES_FLAGS_ENCRYPT) && req->src == req->dst) {
  1129                  unsigned int ivsize = crypto_skcipher_ivsize(skcipher);
  1130
  1131                  if (req->cryptlen >= ivsize)
  1132                          scatterwalk_map_and_copy(rctx->lastc, req->src,
  1133                                                   req->cryptlen - ivsize,
  1134                                                   ivsize, 0);
  1135          }
  1136
  1137          return atmel_aes_handle_queue(dd, &req->base);
  1138  }
  1139

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org

.config.gz
Description: application/gzip
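The warning in the report above fires because the new early return sits inside the AES_FLAGS_XTS arm, which then runs into default: without an annotation. One possible restructuring, sketched here in userspace, is to test for the empty XTS request before the switch so that XTS can simply share the default: arm. The mode values (MODE_*), the BLOCK_SIZE_* constants and the helper pick_block_size() are invented for this sketch and do not match the driver's real definitions.

```c
#include <stdbool.h>

/* Made-up mode values for this sketch; the driver uses AES_FLAGS_* bits. */
enum mode { MODE_CFB8, MODE_CFB64, MODE_XTS, MODE_ECB };

#define BLOCK_SIZE_CFB8  1u
#define BLOCK_SIZE_CFB64 8u
#define BLOCK_SIZE_AES   16u

/*
 * Hypothetical helper: returns true and stores the block size when the
 * request should be processed, or false when an empty XTS request can
 * complete immediately.
 */
bool pick_block_size(enum mode mode, unsigned int cryptlen,
		     unsigned int *block_size)
{
	/* Handle the empty XTS request up front, outside the switch. */
	if (mode == MODE_XTS && !cryptlen)
		return false;

	switch (mode) {
	case MODE_CFB8:
		*block_size = BLOCK_SIZE_CFB8;
		break;
	case MODE_CFB64:
		*block_size = BLOCK_SIZE_CFB64;
		break;
	default:	/* XTS and every other mode use the AES block size */
		*block_size = BLOCK_SIZE_AES;
		break;
	}
	return true;
}
```

In kernel code the other option would be to leave the check where it is and end the arm with the fallthrough; annotation. Either way, only the XTS mode is short-circuited here, which mirrors the scope of the patch.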
Re: [Latest Git kernel/Linux-next kernel] Xorg doesn't start after the seccomp updates v5.9-rc1
On Fri, Aug 07, 2020 at 04:45:14PM +0200, Christian Zigotzky wrote: > But Xorg works on Ubuntu 10.04.4 (PowerPC 32-bit), openSUSE Tumbleweed > 20190722 PPC64 and on Fedora 27 PPC64 with the latest Git kernel. > > I bisected today [4]. > > Result: net/scm: Regularize compat handling of scm_detach_fds() > (c0029de50982c1fb215330a5f9d433cec0cfd8cc) [5] is the first bad commit. > > This commit has been merged with the seccomp updates v5.9-rc1 on 2020-08-04 > 14:11:08 -0700 [1]. Since these updates, Xorg doesn't start anymore on some > Linux distributions. Hi! Thanks for bisecting; yes, sorry for the trouble (I'm still trying to understand why my compat tests _passed_...). Regardless, can you try this patch: https://lore.kernel.org/lkml/20200807173609.GJ4402@mussarela/ -- Kees Cook
[RFC PATCH v1] power: don't manage floating point regs when no FPU
There is no point in copying floating point regs when there is no FPU and MATH_EMULATION is not selected. Create a new CONFIG_PPC_FPU_REGS bool that is selected by CONFIG_MATH_EMULATION and CONFIG_PPC_FPU, and use it to opt out everything related to fp_state in thread_struct. The following app runs in approx 10.50 seconds on an 8xx without the patch, and in 9.45 seconds with the patch. void sigusr1(int sig) { } int main(int argc, char **argv) { int i = 10; signal(SIGUSR1, sigusr1); for (;i--;) raise(SIGUSR1); exit(0); } Signed-off-by: Christophe Leroy --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/processor.h | 2 ++ arch/powerpc/kernel/asm-offsets.c | 2 ++ arch/powerpc/kernel/process.c | 4 arch/powerpc/kernel/ptrace/ptrace-novsx.c | 8 arch/powerpc/kernel/ptrace/ptrace.c | 4 arch/powerpc/kernel/signal.c | 12 +++- arch/powerpc/kernel/signal_32.c | 4 arch/powerpc/kernel/traps.c | 4 arch/powerpc/platforms/Kconfig.cputype| 4 10 files changed, 44 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 1f48bbfb3ce9..a2611880b904 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -416,6 +416,7 @@ config HUGETLB_PAGE_SIZE_VARIABLE config MATH_EMULATION bool "Math emulation" depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE + select PPC_FPU_REGS help Some PowerPC chips designed for embedded applications do not have a floating-point unit and therefore do not implement the diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index ed0d633ab5aa..e20b0c5abe62 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -175,8 +175,10 @@ struct thread_struct { #endif /* Debug Registers */ struct debug_reg debug; +#ifdef CONFIG_PPC_FPU_REGS struct thread_fp_state fp_state; struct thread_fp_state *fp_save_area; +#endif int fpexc_mode; /* floating-point exception mode */ unsigned intalign_ctl; /* alignment handling control */ #ifdef 
CONFIG_HAVE_HW_BREAKPOINT diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c index 8711c2164b45..6cb36c341c70 100644 --- a/arch/powerpc/kernel/asm-offsets.c +++ b/arch/powerpc/kernel/asm-offsets.c @@ -110,9 +110,11 @@ int main(void) #ifdef CONFIG_BOOKE OFFSET(THREAD_NORMSAVES, thread_struct, normsave[0]); #endif +#ifdef CONFIG_PPC_FPU OFFSET(THREAD_FPEXC_MODE, thread_struct, fpexc_mode); OFFSET(THREAD_FPSTATE, thread_struct, fp_state.fpr); OFFSET(THREAD_FPSAVEAREA, thread_struct, fp_save_area); +#endif OFFSET(FPSTATE_FPSCR, thread_fp_state, fpscr); OFFSET(THREAD_LOAD_FP, thread_struct, load_fp); #ifdef CONFIG_ALTIVEC diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index 016bd831908e..7e0082ac0a39 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -1694,7 +1694,9 @@ int copy_thread(unsigned long clone_flags, unsigned long usp, p->thread.ptrace_bps[i] = NULL; #endif +#ifdef CONFIG_PPC_FPU_REGS p->thread.fp_save_area = NULL; +#endif #ifdef CONFIG_ALTIVEC p->thread.vr_save_area = NULL; #endif @@ -1821,8 +1823,10 @@ void start_thread(struct pt_regs *regs, unsigned long start, unsigned long sp) #endif current->thread.load_slb = 0; current->thread.load_fp = 0; +#ifdef CONFIG_PPC_FPU_REGS memset(&current->thread.fp_state, 0, sizeof(current->thread.fp_state)); current->thread.fp_save_area = NULL; +#endif #ifdef CONFIG_ALTIVEC memset(&current->thread.vr_state, 0, sizeof(current->thread.vr_state)); current->thread.vr_state.vscr.u[3] = 0x0001; /* Java mode disabled */ diff --git a/arch/powerpc/kernel/ptrace/ptrace-novsx.c b/arch/powerpc/kernel/ptrace/ptrace-novsx.c index b2dc4e92d11a..8f87a11f3f8c 100644 --- a/arch/powerpc/kernel/ptrace/ptrace-novsx.c +++ b/arch/powerpc/kernel/ptrace/ptrace-novsx.c @@ -21,6 +21,7 @@ int fpr_get(struct task_struct *target, const struct user_regset *regset, unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf) { +#ifdef CONFIG_PPC_FPU_REGS
BUILD_BUG_ON(offsetof(struct thread_fp_state, fpscr) != offsetof(struct thread_fp_state, fpr[32])); @@ -28,6 +29,9 @@ int fpr_get(struct task_struct *target, const struct user_regset *regset, return user_regset_copyout(&pos, &count, &kbuf, &ubuf, &target->thread.fp_state, 0, -1); +#else + return 0; +#endif } /* @@ -47,6 +51,7 @@ int fpr_set(struct task_struct *target, con
Re: [PATCH] powerpc:entry_32: correct the path and function name in the comment
On 07/08/2020 at 12:19, chenzefeng wrote:
> Update the comment to reflect the changed file path and function name.
>
> Fixes: facd04a904ff ("powerpc: convert to copy_thread_tls")
> Fixes: 14cf11af6cf6 ("powerpc: Merge enough to start building in arch/powerpc.")
> Signed-off-by: chenzefeng
> ---
>  arch/powerpc/kernel/entry_32.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index 8420abd4ea1c..9937593d3a33 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -696,8 +696,8 @@ handle_dabr_fault:
>   * to the "_switch" path. If you change this , you'll have to
>   * change the fork code also.
>   *
> - * The code which creates the new task context is in 'copy_thread'
> - * in arch/ppc/kernel/process.c
> + * The code which creates the new task context is in 'copy_thread_tls'
> + * in arch/powerpc/kernel/process.c

Does it matter at all where the function is? I'm sure people can find it themselves.

Christophe

>   */
>  _GLOBAL(_switch)
>  	stwu	r1,-INT_FRAME_SIZE(r1)
[Latest Git kernel/Linux-next kernel] Xorg doesn't start after the seccomp updates v5.9-rc1
Hello,

Xorg doesn't start with the latest Git kernel anymore on some Linux distributions after the seccomp updates v5.9-rc1 [1]. For example on Fienix (Debian Sid PowerPC 32-bit) and on Ubuntu MATE 16.04.6 (PowerPC 32-bit). I tested these distributions on the A-EON AmigaOne X1000 [2], A-EON AmigaOne X5000 [3], and in a virtual e5500 QEMU machine with a virtio_gpu.

Error messages:

systemd-journald[2238]: Failed to send WATCHDOG-1 notification message: Connection refused
systemd-journald[2238]: Failed to send WATCHDOG-1 notification message: Transport endpoint is not connected
systemd-journald[2238]: Failed to send WATCHDOG-1 notification message: Transport endpoint is not connected
systemd-journald[2238]: Failed to send WATCHDOG-1 notification message: Transport endpoint is not connected
systemd-journald[2238]: Failed to send WATCHDOG-1 notification message: Transport endpoint is not connected
systemd-journald[2238]: Failed to send WATCHDOG-1 notification message: Transport endpoint is not connected

---

But Xorg works on Ubuntu 10.04.4 (PowerPC 32-bit), openSUSE Tumbleweed 20190722 PPC64 and on Fedora 27 PPC64 with the latest Git kernel.

I bisected today [4].

Result: net/scm: Regularize compat handling of scm_detach_fds() (c0029de50982c1fb215330a5f9d433cec0cfd8cc) [5] is the first bad commit.

This commit has been merged with the seccomp updates v5.9-rc1 on 2020-08-04 14:11:08 -0700 [1]. Since these updates, Xorg doesn't start anymore on some Linux distributions.

Unfortunately I wasn't able to revert the first bad commit. It depends on many other commits, and I don't know which ones. I tried to remove the modifications of the files from the first bad commit but without any success. There are just too many dependencies.

Additionally I compiled a linux-next kernel because of the issue with the latest Git kernel. Unfortunately this kernel doesn't boot. It can't initialize the graphics card.
Could you please test Xorg with the latest Git kernel on some Linux distributions? Thanks, Christian [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9ecc6ea491f0c0531ad81ef9466284df260b2227 [2] https://en.wikipedia.org/wiki/AmigaOne_X1000 [3] http://wiki.amiga.org/index.php?title=X5000 [4] https://forum.hyperion-entertainment.com/viewtopic.php?p=51317#p51317 [5] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c0029de50982c1fb215330a5f9d433cec0cfd8cc
[PATCH] powerpc:entry_32: correct the path and function name in the comment
Update the comment to reflect the changed file path and function name.

Fixes: facd04a904ff ("powerpc: convert to copy_thread_tls")
Fixes: 14cf11af6cf6 ("powerpc: Merge enough to start building in arch/powerpc.")
Signed-off-by: chenzefeng
---
 arch/powerpc/kernel/entry_32.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 8420abd4ea1c..9937593d3a33 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -696,8 +696,8 @@ handle_dabr_fault:
  * to the "_switch" path. If you change this , you'll have to
  * change the fork code also.
  *
- * The code which creates the new task context is in 'copy_thread'
- * in arch/ppc/kernel/process.c
+ * The code which creates the new task context is in 'copy_thread_tls'
+ * in arch/powerpc/kernel/process.c
  */
 _GLOBAL(_switch)
 	stwu	r1,-INT_FRAME_SIZE(r1)
--
2.12.3
[GIT PULL] Please pull powerpc/linux.git powerpc-5.9-1 tag
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Hi Linus, Please pull powerpc updates for 5.9. Just one minor conflict, in a comment in drivers/misc/ocxl/config.c. Notable out of area changes: arch/m68k/include/asm/adb_iop.h # c66da95a39ec macintosh/adb-iop: Implement SRQ autopolling drivers/md/dm-writecache.c# 3e79f082ebfc libnvdimm/nvdimm/flush: Allow architecture to override the flush barrier drivers/nvdimm/region_devs.c include/asm-generic/barrier.h drivers/nvdimm/of_pmem.c # 8c26ab72663b powerpc/pmem: Initialize pmem device on newer hardware include/asm-generic/qspinlock.h # 20c0e8269e9d powerpc/pseries: Implement paravirt qspinlocks for SPLPAR include/linux/cpuhotplug.h# 1a8f0886a600 powerpc/perf/hv-24x7: Add cpu hotplug support include/linux/kexec.h # f891f19736bd kexec_file: Allow archs to handle special regions while locating memory hole kernel/kexec_file.c include/trace/events/mmflags.h# 5c9fa16e8abd powerpc/64s: Remove PROT_SAO support include/linux/mm.h mm/ksm.c cheers The following changes since commit 48778464bb7d346b47157d21ffde2af6b2d39110: Linux 5.8-rc2 (2020-06-21 15:45:29 -0700) are available in the git repository at: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git tags/powerpc-5.9-1 for you to fetch changes up to a7aaa2f26bfd932a654706b19859e7adf802bee2: selftests/powerpc: Fix pkey syscall redefinitions (2020-08-05 10:14:03 +1000) - -- powerpc updates for 5.9 - Add support for (optionally) using queued spinlocks & rwlocks. - Support for a new faster system call ABI using the scv instruction on Power9 or later. - Drop support for the PROT_SAO mmap/mprotect flag as it will be unsupported on Power10 and future processors, leaving us with no way to implement the functionality it requests. This risks breaking userspace, though we believe it is unused in practice. - A bug fix for, and then the removal of, our custom stack expansion checking. We now allow stack expansion up to the rlimit, like other architectures. 
- Remove the remnants of our (previously disabled) topology update code, which tried to react to NUMA layout changes on virtualised systems, but was prone to crashes and other problems. - Add PMU support for Power10 CPUs. - A change to our signal trampoline so that we don't unbalance the link stack (branch return predictor) in the signal delivery path. - Lots of other cleanups, refactorings, smaller features and so on as usual. Thanks to: Abhishek Goel, Alastair D'Silva, Alexander A. Klimov, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Balamuruhan S, Bharata B Rao, Bill Wendling, Bin Meng, Cédric Le Goater, Chris Packham, Christophe Leroy, Christoph Hellwig, Daniel Axtens, Dan Williams, David Lamparter, Desnes A. Nunes do Rosario, Erhard F., Finn Thain, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geoff Levand, Greg Kurz, Gustavo A. R. Silva, Hari Bathini, Harish, Imre Kaloz, Joel Stanley, Joe Perches, John Crispin, Jordan Niethe, Kajol Jain, Kamalesh Babulal, Kees Cook, Laurent Dufour, Leonardo Bras, Li RongQing, Madhavan Srinivasan, Mahesh Salgaonkar, Mark Cave-Ayland, Michal Suchanek, Milton Miller, Mimi Zohar, Murilo Opsfelder Araujo, Nathan Chancellor, Nathan Lynch, Naveen N. Rao, Nayna Jain, Nicholas Piggin, Oliver O'Halloran, Palmer Dabbelt, Pedro Miraglia Franco de Carvalho, Philippe Bergheaud, Pingfan Liu, Pratik Rajesh Sampat, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sam Bobroff, Sandipan Das, Santosh Sivaraj, Satheesh Rajendran, Shirisha Ganta, Sourabh Jain, Srikar Dronamraju, Stan Johnson, Stephen Rothwell, Thadeu Lima de Souza Cascardo, Thiago Jung Bauermann, Tom Lane, Vaibhav Jain, Vladis Dronov, Wei Yongjun, Wen Xiong, YueHaibing. - -- Abhishek Goel (1): cpuidle/powernv : Remove dead code block Alastair D'Silva (2): ocxl: Remove unnecessary externs ocxl: Address kernel doc errors & warnings Alexander A. 
Klimov (5): ocxl: Replace HTTP links with HTTPS ones powerpc/Kconfig: Replace HTTP links with HTTPS ones powerpc: Replace HTTP links with HTTPS ones macintosh/adb: Replace HTTP links with HTTPS ones macintosh/therm_adt746x: Replace HTTP links with HTTPS ones Alexey Kardashevskiy (2): powerpc/xive: Ignore kmemleak false positives powerpc/powernv/ioda: Return correct error if TCE level allocation failed Aneesh Kumar K.V (37): powerpc/mm/book3s64: Skip 16G page reservation with radix powerpc/pmem: Restrict papr_scm to P8 and above. powerpc/pmem: Add new instructions for persistent storage and sync powerpc/pmem: Add flu
[PATCH] powerpc/papr_scm: Make access mode of 'perf_stats' attribute file to '0400'
The newly introduced 'perf_stats' attribute uses the default access mode of 0444, letting non-root users access performance stats of an nvdimm and potentially force the kernel into issuing a large number of expensive HCALLs. Since the information exposed by this attribute cannot be cached, it is better to ward off access to this attribute from non-root users. Hence this patch updates the access mode of the 'perf_stats' sysfs attribute file to 0400 to make it readable only by root.

Reported-by: Aneesh Kumar K.V
Signed-off-by: Vaibhav Jain
---
 arch/powerpc/platforms/pseries/papr_scm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f439f0dfea7d1..31864d167a2ce 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -822,7 +822,7 @@ static ssize_t perf_stats_show(struct device *dev,
 	kfree(stats);
 	return rc ? rc : seq_buf_used(&s);
 }
-DEVICE_ATTR_RO(perf_stats);
+DEVICE_ATTR(perf_stats, 0400, perf_stats_show, NULL);

 static ssize_t flags_show(struct device *dev,
 			  struct device_attribute *attr, char *buf)
--
2.26.2
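The change above boils down to dropping the group and other read bits: DEVICE_ATTR_RO() creates the attribute with mode 0444, while the explicit DEVICE_ATTR() form passes 0400. A small userspace sketch of that difference follows, using the POSIX S_IRGRP and S_IROTH macros. The helper non_owner_can_read() is hypothetical and only inspects the mode bits; it ignores ownership and capabilities.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: true if group or other has read permission. */
bool non_owner_can_read(mode_t mode)
{
	return (mode & (S_IRGRP | S_IROTH)) != 0;
}
```

Here non_owner_can_read(0444) is true and non_owner_can_read(0400) is false. Because sysfs attribute files are owned by root, the 0400 mode leaves root as the only reader.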
Re: [PATCH V6 0/2] tools/perf: Add extended regs support for powerpc
On Fri, Aug 07, 2020 at 06:11:17AM -0400, Athira Rajeev wrote:
> Patch set to add perf tools support for perf extended register capability
> in powerpc.
>
> Patch 1/2 adds extended regs for power9 ( mmcr0, mmcr1 and mmcr2 )
> to sample_reg_mask in the tool side to use with `-I?`.
> Patch 2/2 adds extended regs for power10 ( mmcr3, sier2, sier3)
> to sample_reg_mask in the tool side.
>
> Ravi Bangoria found an issue with `perf record -I` while testing the
> changes. The same issue is currently being worked on here:
> https://lkml.org/lkml/2020/7/19/413 and will be resolved once fix
> from Jin Yao is merged.
>
> This includes the perf tools side changes to support extended regs.
> kernel side changes are sent as separate patchset.

Thanks, applied.

- Arnaldo
Re: [PATCH 1/2] lockdep: improve current->(hard|soft)irqs_enabled synchronisation with actual irq state
What's wrong with something like this? AFAICT there's no reason to actually try and add IRQ tracing here, it's just a hand full of instructions at the most. --- diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h index 3a0db7b0b46e..6be22c1838e2 100644 --- a/arch/powerpc/include/asm/hw_irq.h +++ b/arch/powerpc/include/asm/hw_irq.h @@ -196,33 +196,6 @@ static inline bool arch_irqs_disabled(void) arch_local_irq_restore(flags); \ } while(0) -#ifdef CONFIG_TRACE_IRQFLAGS -#define powerpc_local_irq_pmu_save(flags) \ -do { \ - raw_local_irq_pmu_save(flags); \ - trace_hardirqs_off(); \ - } while(0) -#define powerpc_local_irq_pmu_restore(flags) \ - do {\ - if (raw_irqs_disabled_flags(flags)) { \ - raw_local_irq_pmu_restore(flags); \ - trace_hardirqs_off(); \ - } else {\ - trace_hardirqs_on();\ - raw_local_irq_pmu_restore(flags); \ - } \ - } while(0) -#else -#define powerpc_local_irq_pmu_save(flags) \ - do {\ - raw_local_irq_pmu_save(flags); \ - } while(0) -#define powerpc_local_irq_pmu_restore(flags) \ - do {\ - raw_local_irq_pmu_restore(flags); \ - } while (0) -#endif /* CONFIG_TRACE_IRQFLAGS */ - #endif /* CONFIG_PPC_BOOK3S */ #ifdef CONFIG_PPC_BOOK3E diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h index bc4bd19b7fc2..b357a35672b1 100644 --- a/arch/powerpc/include/asm/local.h +++ b/arch/powerpc/include/asm/local.h @@ -32,9 +32,9 @@ static __inline__ void local_##op(long i, local_t *l) \ { \ unsigned long flags;\ \ - powerpc_local_irq_pmu_save(flags); \ + raw_powerpc_local_irq_pmu_save(flags); \ l->v c_op i;\ - powerpc_local_irq_pmu_restore(flags); \ + raw_powerpc_local_irq_pmu_restore(flags); \ } #define LOCAL_OP_RETURN(op, c_op) \ @@ -43,9 +43,9 @@ static __inline__ long local_##op##_return(long a, local_t *l)\ long t; \ unsigned long flags;\ \ - powerpc_local_irq_pmu_save(flags); \ + raw_powerpc_local_irq_pmu_save(flags); \ t = (l->v c_op a); \ - powerpc_local_irq_pmu_restore(flags); \ + 
raw_powerpc_local_irq_pmu_restore(flags); \ \ return t; \ } @@ -81,11 +81,11 @@ static __inline__ long local_cmpxchg(local_t *l, long o, long n) long t; unsigned long flags; - powerpc_local_irq_pmu_save(flags); + raw_powerpc_local_irq_pmu_save(flags); t = l->v; if (t == o) l->v = n; - powerpc_local_irq_pmu_restore(flags); + raw_powerpc_local_irq_pmu_restore(flags); return t; } @@ -95,10 +95,10 @@ static __inline__ long local_xchg(local_t *l, long n) long t; unsigned long flags; - powerpc_local_irq_pmu_save(flags); + raw_powerpc_local_irq_pmu_save(flags); t = l->v; l->v = n; - powerpc_local_irq_pmu_restore(flags); + raw_powerpc_local_irq_pmu_restore(flags); return t; } @@ -117,12 +117,12 @@ static __inline__ int local_add_unless(local_t *l, long a, long u) unsigned long flags; int ret = 0; - powerpc_local_irq_pmu_save(flags); + raw_powerpc_local_irq_pmu_save(flags); if (l->v != u) { l->v += a; ret = 1; } - powerpc_local_irq_pmu_restore(flags); + raw_powerpc_local_irq_pmu_restore(flags);
[PATCH v2] powerpc/pci: unmap legacy INTx interrupts when a PHB is removed
When a passthrough IO adapter is removed from a pseries machine using hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the guest OS to clear all page table entries related to the adapter. If some are still present, the RTAS call which isolates the PCI slot returns error 9001 "valid outstanding translations" and the removal of the IO adapter fails. This is because when the PHBs are scanned, Linux maps automatically the INTx interrupts in the Linux interrupt number space but these are never removed. To solve this problem, we introduce a PPC platform specific pcibios_remove_bus() routine which clears all interrupt mappings when the bus is removed. This also clears the associated page table entries of the ESB pages when using XIVE. For this purpose, we record the logical interrupt numbers of the mapped interrupt under the PHB structure and let pcibios_remove_bus() do the clean up. Since some PCI adapters, like GPUs, use the "interrupt-map" property to describe interrupt mappings other than the legacy INTx interrupts, we can not restrict the size of the mapping array to PCI_NUM_INTX. The number of interrupt mappings is computed from the "interrupt-map" property and the mapping array is allocated accordingly. Cc: "Oliver O'Halloran" Cc: Alexey Kardashevskiy Signed-off-by: Cédric Le Goater --- Changes since v2: - merged 2 patches. 
arch/powerpc/include/asm/pci-bridge.h | 6 ++ arch/powerpc/kernel/pci-common.c | 114 ++ 2 files changed, 120 insertions(+) diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index b92e81b256e5..ca75cf264ddf 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -48,6 +48,9 @@ struct pci_controller_ops { /* * Structure of a PCI controller (host bridge) + * + * @irq_count: number of interrupt mappings + * @irq_map: interrupt mappings */ struct pci_controller { struct pci_bus *bus; @@ -127,6 +130,9 @@ struct pci_controller { void *private_data; struct npu *npu; + + unsigned int irq_count; + unsigned int *irq_map; }; /* These are used for config access before all the PCI probing diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index be108616a721..deb831f0ae13 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -353,6 +353,115 @@ struct pci_controller *pci_find_controller_for_domain(int domain_nr) return NULL; } +/* + * Assumption is made on the interrupt parent. All interrupt-map + * entries are considered to have the same parent. 
+ */ +static int pcibios_irq_map_count(struct pci_controller *phb) +{ + const __be32 *imap; + int imaplen; + struct device_node *parent; + u32 intsize, addrsize, parintsize, paraddrsize; + + if (of_property_read_u32(phb->dn, "#interrupt-cells", &intsize)) + return 0; + if (of_property_read_u32(phb->dn, "#address-cells", &addrsize)) + return 0; + + imap = of_get_property(phb->dn, "interrupt-map", &imaplen); + if (!imap) { + pr_debug("%pOF : no interrupt-map\n", phb->dn); + return 0; + } + imaplen /= sizeof(u32); + pr_debug("%pOF : imaplen=%d\n", phb->dn, imaplen); + + if (imaplen < (addrsize + intsize + 1)) + return 0; + + imap += intsize + addrsize; + parent = of_find_node_by_phandle(be32_to_cpup(imap)); + if (!parent) { + pr_debug("%pOF : no imap parent found !\n", phb->dn); + return 0; + } + + if (of_property_read_u32(parent, "#interrupt-cells", &parintsize)) { + pr_debug("%pOF : parent lacks #interrupt-cells!\n", phb->dn); + return 0; + } + + if (of_property_read_u32(parent, "#address-cells", &paraddrsize)) + paraddrsize = 0; + + return imaplen / (addrsize + intsize + 1 + paraddrsize + parintsize); +} + +static void pcibios_irq_map_init(struct pci_controller *phb) +{ + phb->irq_count = pcibios_irq_map_count(phb); + if (phb->irq_count < PCI_NUM_INTX) + phb->irq_count = PCI_NUM_INTX; + + pr_debug("%pOF : interrupt map #%d\n", phb->dn, phb->irq_count); + + phb->irq_map = kcalloc(phb->irq_count, sizeof(unsigned int), + GFP_KERNEL); +} + +static void pci_irq_map_register(struct pci_dev *pdev, unsigned int virq) +{ + struct pci_controller *phb = pci_bus_to_host(pdev->bus); + int i; + + if (!phb->irq_map) + return; + + for (i = 0; i < phb->irq_count; i++) { + /* +* Look for an empty or an equivalent slot, as INTx +* interrupts can be shared between adapters. +*/ + if (phb->irq_map[i] == virq || !phb->irq_map[i]) { + phb->irq_map[i] = virq; +
[PATCH V6 2/2] tools/perf: Add perf tools support for extended regs in power10
Added support for supported regs which are new in power10 ( MMCR3, SIER2, SIER3 ) to sample_reg_mask in the tool side to use with `-I?` option. Also added PVR check to send extended mask for power10 at kernel while capturing extended regs in each sample.

Signed-off-by: Athira Rajeev
Reviewed-by: Kajol Jain
Reviewed-and-tested-by: Ravi Bangoria
---
 tools/arch/powerpc/include/uapi/asm/perf_regs.h | 6 ++++++
 tools/perf/arch/powerpc/include/perf_regs.h     | 3 +++
 tools/perf/arch/powerpc/util/perf_regs.c        | 6 ++++++
 3 files changed, 15 insertions(+)

diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index 225c64c..bdf5f10 100644
--- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -52,6 +52,9 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_MMCR0,
 	PERF_REG_POWERPC_MMCR1,
 	PERF_REG_POWERPC_MMCR2,
+	PERF_REG_POWERPC_MMCR3,
+	PERF_REG_POWERPC_SIER2,
+	PERF_REG_POWERPC_SIER3,
 	/* Max regs without the extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
@@ -60,6 +63,9 @@ enum perf_event_powerpc_regs {

 /* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
 #define PERF_REG_PMU_MASK_300	(((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK)
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
+#define PERF_REG_PMU_MASK_31	(((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) - PERF_REG_PMU_MASK)

 #define PERF_REG_MAX_ISA_300	(PERF_REG_POWERPC_MMCR2 + 1)
+#define PERF_REG_MAX_ISA_31	(PERF_REG_POWERPC_SIER3 + 1)
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
index 46ed00d..63f3ac9 100644
--- a/tools/perf/arch/powerpc/include/perf_regs.h
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -68,6 +68,9 @@
 	[PERF_REG_POWERPC_MMCR0] = "mmcr0",
 	[PERF_REG_POWERPC_MMCR1] = "mmcr1",
 	[PERF_REG_POWERPC_MMCR2] = "mmcr2",
+	[PERF_REG_POWERPC_MMCR3] = "mmcr3",
+	[PERF_REG_POWERPC_SIER2] = "sier2",
+	[PERF_REG_POWERPC_SIER3] = "sier3",
 };

 static inline const char *perf_reg_name(int id)
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index d64ba0c..2b6d470 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -14,6 +14,7 @@
 #include

 #define PVR_POWER9		0x004E
+#define PVR_POWER10		0x0080

 const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG(r0, PERF_REG_POWERPC_R0),
@@ -64,6 +65,9 @@
 	SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0),
 	SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1),
 	SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2),
+	SMPL_REG(mmcr3, PERF_REG_POWERPC_MMCR3),
+	SMPL_REG(sier2, PERF_REG_POWERPC_SIER2),
+	SMPL_REG(sier3, PERF_REG_POWERPC_SIER3),
 	SMPL_REG_END
 };

@@ -194,6 +198,8 @@ uint64_t arch__intr_reg_mask(void)
 	version = (((mfspr(SPRN_PVR)) >> 16) & 0x);
 	if (version == PVR_POWER9)
 		extended_mask = PERF_REG_PMU_MASK_300;
+	else if (version == PVR_POWER10)
+		extended_mask = PERF_REG_PMU_MASK_31;
 	else
 		return mask;

--
1.8.3.1
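The PVR check in the last hunk can be looked at in isolation. The following is only a sketch, not the code from the patch: select_extended_mask() is a hypothetical helper, it takes the raw PVR value as a parameter instead of reading the SPR, and it assumes the processor version sits in the upper 16 bits of a 32-bit PVR. The two mask arguments stand in for PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31.

```c
#include <assert.h>
#include <stdint.h>

#define PVR_POWER9	0x004E
#define PVR_POWER10	0x0080

/*
 * Hypothetical helper: choose the extended register mask for a raw
 * PVR value. Assumes the version field is the upper 16 bits of the
 * PVR. Returns 0 when the processor has no extended regs, where the
 * real function would fall back to the base mask.
 */
static uint64_t select_extended_mask(uint32_t pvr, uint64_t mask_isa_300,
				     uint64_t mask_isa_31)
{
	uint32_t version = (pvr >> 16) & 0xFFFF;

	if (version == PVR_POWER9)
		return mask_isa_300;
	if (version == PVR_POWER10)
		return mask_isa_31;
	return 0;
}

/* Example with made-up revision bits in the lower half of the PVR. */
static void select_extended_mask_example(void)
{
	assert(select_extended_mask(0x004E1202u, 0x7, 0x3f) == 0x7);
	assert(select_extended_mask(0x00800100u, 0x7, 0x3f) == 0x3f);
}
```

Keeping the decode separate from the mfspr read makes the selection testable on any host, which is the only reason the sketch is shaped this way.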
[PATCH V6 1/2] tools/perf: Add perf tools support for extended register capability in powerpc
From: Anju T Sudhakar Add extended regs to sample_reg_mask in the tool side to use with `-I?` option. Perf tools side uses extended mask to display the platform supported register names (with -I? option) to the user and also send this mask to the kernel to capture the extended registers in each sample. Hence decide the mask value based on the processor version. Currently definitions for `mfspr`, `SPRN_PVR` are part of `arch/powerpc/util/header.c`. Move this to a header file so that these definitions can be re-used in other source files as well. Signed-off-by: Anju T Sudhakar [Decide extended mask at run time based on platform] Signed-off-by: Athira Rajeev Reviewed-by: Madhavan Srinivasan Reviewed-by: Kajol Jain Reviewed-and-tested-by: Ravi Bangoria --- tools/arch/powerpc/include/uapi/asm/perf_regs.h | 14 ++- tools/perf/arch/powerpc/include/perf_regs.h | 5 ++- tools/perf/arch/powerpc/util/header.c | 9 + tools/perf/arch/powerpc/util/perf_regs.c| 49 + tools/perf/arch/powerpc/util/utils_header.h | 15 5 files changed, 82 insertions(+), 10 deletions(-) create mode 100644 tools/perf/arch/powerpc/util/utils_header.h diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h index f599064..225c64c 100644 --- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h +++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h @@ -48,6 +48,18 @@ enum perf_event_powerpc_regs { PERF_REG_POWERPC_DSISR, PERF_REG_POWERPC_SIER, PERF_REG_POWERPC_MMCRA, - PERF_REG_POWERPC_MAX, + /* Extended registers */ + PERF_REG_POWERPC_MMCR0, + PERF_REG_POWERPC_MMCR1, + PERF_REG_POWERPC_MMCR2, + /* Max regs without the extended regs */ + PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1, }; + +#define PERF_REG_PMU_MASK ((1ULL << PERF_REG_POWERPC_MAX) - 1) + +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */ +#define PERF_REG_PMU_MASK_300 (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK) + +#define PERF_REG_MAX_ISA_300 
(PERF_REG_POWERPC_MMCR2 + 1) #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */ diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h index e18a355..46ed00d 100644 --- a/tools/perf/arch/powerpc/include/perf_regs.h +++ b/tools/perf/arch/powerpc/include/perf_regs.h @@ -64,7 +64,10 @@ [PERF_REG_POWERPC_DAR] = "dar", [PERF_REG_POWERPC_DSISR] = "dsisr", [PERF_REG_POWERPC_SIER] = "sier", - [PERF_REG_POWERPC_MMCRA] = "mmcra" + [PERF_REG_POWERPC_MMCRA] = "mmcra", + [PERF_REG_POWERPC_MMCR0] = "mmcr0", + [PERF_REG_POWERPC_MMCR1] = "mmcr1", + [PERF_REG_POWERPC_MMCR2] = "mmcr2", }; static inline const char *perf_reg_name(int id) diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c index d487007..1a95017 100644 --- a/tools/perf/arch/powerpc/util/header.c +++ b/tools/perf/arch/powerpc/util/header.c @@ -7,17 +7,10 @@ #include #include #include "header.h" +#include "utils_header.h" #include "metricgroup.h" #include -#define mfspr(rn) ({unsigned long rval; \ -asm volatile("mfspr %0," __stringify(rn) \ - : "=r" (rval)); rval; }) - -#define SPRN_PVR0x11F /* Processor Version Register */ -#define PVR_VER(pvr)(((pvr) >> 16) & 0x) /* Version field */ -#define PVR_REV(pvr)(((pvr) >> 0) & 0x) /* Revison field */ - int get_cpuid(char *buffer, size_t sz) { diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c index 0a52429..d64ba0c 100644 --- a/tools/perf/arch/powerpc/util/perf_regs.c +++ b/tools/perf/arch/powerpc/util/perf_regs.c @@ -6,9 +6,15 @@ #include "../../../util/perf_regs.h" #include "../../../util/debug.h" +#include "../../../util/event.h" +#include "../../../util/header.h" +#include "../../../perf-sys.h" +#include "utils_header.h" #include +#define PVR_POWER9 0x004E + const struct sample_reg sample_reg_masks[] = { SMPL_REG(r0, PERF_REG_POWERPC_R0), SMPL_REG(r1, PERF_REG_POWERPC_R1), @@ -55,6 +61,9 @@ SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR), SMPL_REG(sier, 
PERF_REG_POWERPC_SIER), SMPL_REG(mmcra, PERF_REG_POWERPC_MMCRA), + SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0), + SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1), + SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2), SMPL_REG_END }; @@ -163,3 +172,43 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op) return SDT_ARG_VALID; } + +uint64_t arch__intr_reg_mask(void) +{ + struct perf_event_attr attr = { + .type = PERF_TYPE_HARDWARE, + .config = PERF_COUNT_HW_CPU_CYCLES, + .sample_type= PERF_SAMPLE_REGS_INTR, +
[PATCH V6 0/2] tools/perf: Add extended regs support for powerpc
Patch set to add perf tools support for perf extended register capability in powerpc.

Patch 1/2 adds extended regs for power9 (mmcr0, mmcr1 and mmcr2) to sample_reg_mask in the tool side to use with `-I?`.
Patch 2/2 adds extended regs for power10 (mmcr3, sier2, sier3) to sample_reg_mask in the tool side.

Ravi Bangoria found an issue with `perf record -I` while testing the changes. The same issue is currently being worked on here: https://lkml.org/lkml/2020/7/19/413 and will be resolved once the fix from Jin Yao is merged.

This includes the perf tools side changes to support extended regs. Kernel side changes are sent as a separate patchset.

Changelog:

Changes from v5 -> v6
- Split perf tools side changes to one patchset as suggested by Arnaldo

Link to previous series: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=192624

Changes from v4 -> v5
- Initialize `perf_reg_extended_max` to work on all platforms as suggested by Ravi Bangoria
- Added Reviewed-and-Tested-by from Ravi Bangoria

Changes from v3 -> v4
- Split the series and send extended regs as a separate patch set here. Link to previous series: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=190462&state=* Other PMU patches are already merged in powerpc/next.
- Fixed kernel build issue, reported by the kernel build bot, when using a config with CONFIG_PERF_EVENTS set and without CONFIG_PPC_PERF_CTRS.
- Included Reviewed-by from Kajol Jain.
- Addressed review comments from Ravi Bangoria to initialize `perf_reg_extended_max` and define it in lowercase since it is a local variable.
Anju T Sudhakar (1): tools/perf: Add perf tools support for extended register capability in powerpc Athira Rajeev (1): tools/perf: Add perf tools support for extended regs in power10 tools/arch/powerpc/include/uapi/asm/perf_regs.h | 20 - tools/perf/arch/powerpc/include/perf_regs.h | 8 +++- tools/perf/arch/powerpc/util/header.c | 9 +--- tools/perf/arch/powerpc/util/perf_regs.c| 55 + tools/perf/arch/powerpc/util/utils_header.h | 15 +++ 5 files changed, 97 insertions(+), 10 deletions(-) create mode 100644 tools/perf/arch/powerpc/util/utils_header.h -- 1.8.3.1
[PATCH V6 1/2] powerpc/perf: Add support for outputting extended regs in perf intr_regs
From: Anju T Sudhakar Add support for perf extended register capability in powerpc. The capability flag PERF_PMU_CAP_EXTENDED_REGS, is used to indicate the PMU which support extended registers. The generic code define the mask of extended registers as 0 for non supported architectures. Patch adds extended regs support for power9 platform by exposing MMCR0, MMCR1 and MMCR2 registers. REG_RESERVED mask needs update to include extended regs. `PERF_REG_EXTENDED_MASK`, contains mask value of the supported registers, is defined at runtime in the kernel based on platform since the supported registers may differ from one processor version to another and hence the MASK value. with patch -- available registers: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26 r27 r28 r29 r30 r31 nip msr orig_r3 ctr link xer ccr softe trap dar dsisr sier mmcra mmcr0 mmcr1 mmcr2 PERF_RECORD_SAMPLE(IP, 0x1): 4784/4784: 0 period: 1 addr: 0 ... intr regs: mask 0x ABI 64-bit r00xc012b77c r10xc03fe5e03930 r20xc1b0e000 r30xc03fdcddf800 r40xc03fc788 r50x9c422724be r60xc03fe5e03908 r70xff63bddc8706 r80x9e4 r90x0 r10 0x1 r11 0x0 r12 0xc01299c0 r13 0xc03c4800 r14 0x0 r15 0x7fffdd8b8b00 r16 0x0 r17 0x7fffdd8be6b8 r18 0x7e7076607730 r19 0x2f r20 0xc0001fc26c68 r21 0xc0002041e4227e00 r22 0xc0002018fb60 r23 0x1 r24 0xc03ffec4d900 r25 0x8000 r26 0x0 r27 0x1 r28 0x1 r29 0xc1be1260 r30 0x6008010 r31 0xc03ffebb7218 nip 0xc012b910 msr 0x90009033 orig_r3 0xc012b86c ctr 0xc01299c0 link 0xc012b77c xer 0x0 ccr 0x2800 softe 0x1 trap 0xf00 dar 0x0 dsisr 0x800 sier 0x0 mmcra 0x800 mmcr0 0x82008090 mmcr1 0x1e00 mmcr2 0x0 ... 
thread: perf:4784 Signed-off-by: Anju T Sudhakar [Defined PERF_REG_EXTENDED_MASK at run time to add support for different platforms ] Signed-off-by: Athira Rajeev Reviewed-by: Madhavan Srinivasan [Fix build issue using CONFIG_PERF_EVENTS without CONFIG_PPC_PERF_CTRS] Reported-by: kernel test robot Reviewed-by: Kajol Jain Tested-by: Nageswara R Sastry Reviewed-and-tested-by: Ravi Bangoria --- arch/powerpc/include/asm/perf_event.h| 3 +++ arch/powerpc/include/asm/perf_event_server.h | 5 arch/powerpc/include/uapi/asm/perf_regs.h| 14 +++- arch/powerpc/perf/core-book3s.c | 1 + arch/powerpc/perf/perf_regs.c| 34 +--- arch/powerpc/perf/power9-pmu.c | 6 + 6 files changed, 59 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/perf_event.h b/arch/powerpc/include/asm/perf_event.h index 1e8b2e1..daec64d 100644 --- a/arch/powerpc/include/asm/perf_event.h +++ b/arch/powerpc/include/asm/perf_event.h @@ -40,4 +40,7 @@ /* To support perf_regs sier update */ extern bool is_sier_available(void); +/* To define perf extended regs mask value */ +extern u64 PERF_REG_EXTENDED_MASK; +#define PERF_REG_EXTENDED_MASK PERF_REG_EXTENDED_MASK #endif diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h index 86c9eb06..f6acabb 100644 --- a/arch/powerpc/include/asm/perf_event_server.h +++ b/arch/powerpc/include/asm/perf_event_server.h @@ -62,6 +62,11 @@ struct power_pmu { int *blacklist_ev; /* BHRB entries in the PMU */ int bhrb_nr; + /* +* set this flag with `PERF_PMU_CAP_EXTENDED_REGS` if +* the pmu supports extended perf regs capability +*/ + int capabilities; }; /* diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h index f599064..225c64c 100644 --- a/arch/powerpc/include/uapi/asm/perf_regs.h +++ b/arch/powerpc/include/uapi/asm/perf_regs.h @@ -48,6 +48,18 @@ enum perf_event_powerpc_regs { PERF_REG_POWERPC_DSISR, PERF_REG_POWERPC_SIER, PERF_REG_POWERPC_MMCRA, - PERF_REG_POWERPC_MAX, 
+ /* Extended registers */ + PERF_REG_POWERPC_MMCR0, + PERF_REG_POWERPC_MMCR1, + PERF_REG_POWERPC_MMCR2, + /* Max regs without the extended regs */ + PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1, }; + +#define PERF_REG_PMU_MASK ((1ULL << PERF_REG_POWERPC_MAX) - 1) + +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */ +#define PERF_REG_PMU_MASK_300 (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK) + +#define PERF_REG_MAX_ISA_300 (PERF_REG_POWERPC_MMCR2 + 1) #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */ diff
[PATCH V6 2/2] powerpc/perf: Add extended regs support for power10 platform
Include capability flag `PERF_PMU_CAP_EXTENDED_REGS` for power10 and expose MMCR3, SIER2, SIER3 registers as part of extended regs. Also introduce `PERF_REG_PMU_MASK_31` to define extended mask value at runtime for power10.

Signed-off-by: Athira Rajeev
[Fix build failure on PPC32 platform]
Suggested-by: Ryan Grimm
Reported-by: kernel test robot
Reviewed-by: Kajol Jain
Tested-by: Nageswara R Sastry
Reviewed-and-tested-by: Ravi Bangoria
---
 arch/powerpc/include/uapi/asm/perf_regs.h |  6 ++++++
 arch/powerpc/perf/perf_regs.c             | 12 +++++++++++-
 arch/powerpc/perf/power10-pmu.c           |  6 ++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
index 225c64c..bdf5f10 100644
--- a/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -52,6 +52,9 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_MMCR0,
 	PERF_REG_POWERPC_MMCR1,
 	PERF_REG_POWERPC_MMCR2,
+	PERF_REG_POWERPC_MMCR3,
+	PERF_REG_POWERPC_SIER2,
+	PERF_REG_POWERPC_SIER3,
 	/* Max regs without the extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
@@ -60,6 +63,9 @@ enum perf_event_powerpc_regs {

 /* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
 #define PERF_REG_PMU_MASK_300	(((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK)
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
+#define PERF_REG_PMU_MASK_31	(((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) - PERF_REG_PMU_MASK)

 #define PERF_REG_MAX_ISA_300	(PERF_REG_POWERPC_MMCR2 + 1)
+#define PERF_REG_MAX_ISA_31	(PERF_REG_POWERPC_SIER3 + 1)
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index 9301e68..8e53f2f 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -81,6 +81,14 @@ static u64 get_ext_regs_value(int idx)
 		return mfspr(SPRN_MMCR1);
 	case PERF_REG_POWERPC_MMCR2:
 		return mfspr(SPRN_MMCR2);
+#ifdef CONFIG_PPC64
+	case PERF_REG_POWERPC_MMCR3:
+		return mfspr(SPRN_MMCR3);
+	case PERF_REG_POWERPC_SIER2:
+		return mfspr(SPRN_SIER2);
+	case PERF_REG_POWERPC_SIER3:
+		return mfspr(SPRN_SIER3);
+#endif
 	default: return 0;
 	}
 }
@@ -89,7 +97,9 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
 {
 	u64 perf_reg_extended_max = PERF_REG_POWERPC_MAX;

-	if (cpu_has_feature(CPU_FTR_ARCH_300))
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		perf_reg_extended_max = PERF_REG_MAX_ISA_31;
+	else if (cpu_has_feature(CPU_FTR_ARCH_300))
 		perf_reg_extended_max = PERF_REG_MAX_ISA_300;

 	if (idx == PERF_REG_POWERPC_SIER &&
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index f7cff7f..8314865 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -87,6 +87,8 @@
 #define POWER10_MMCRA_IFM3		0xC000UL
 #define POWER10_MMCRA_BHRB_MASK		0xC000UL

+extern u64 PERF_REG_EXTENDED_MASK;
+
 /* Table of alternatives, sorted by column 0 */
 static const unsigned int power10_event_alternatives[][MAX_ALT] = {
 	{ PM_RUN_CYC_ALT, PM_RUN_CYC },
@@ -397,6 +399,7 @@ static void power10_config_bhrb(u64 pmu_bhrb_filter)
 	.cache_events		= &power10_cache_events,
 	.attr_groups		= power10_pmu_attr_groups,
 	.bhrb_nr		= 32,
+	.capabilities		= PERF_PMU_CAP_EXTENDED_REGS,
 };

 int init_power10_pmu(void)
@@ -408,6 +411,9 @@ int init_power10_pmu(void)
 	    strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power10"))
 		return -ENODEV;

+	/* Set the PERF_REG_EXTENDED_MASK here */
+	PERF_REG_EXTENDED_MASK = PERF_REG_PMU_MASK_31;
+
 	rc = register_power_pmu(&power10_pmu);
 	if (rc)
 		return rc;
--
1.8.3.1
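To see which bits PERF_REG_PMU_MASK_300 and PERF_REG_PMU_MASK_31 actually select, the mask macros can be evaluated outside the kernel. The sketch below copies the macro definitions from the patch but uses a reduced enum: the value MMCRA = 44 is an assumption derived from the register order in the "available registers" listing (r0..r31, nip, msr, orig_r3, ctr, link, xer, ccr, softe, trap, dar, dsisr, sier, mmcra). mask_weight() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reduced stand-in for enum perf_event_powerpc_regs. Only the tail is
 * listed; MMCRA = 44 is an assumption based on the register listing.
 */
enum {
	PERF_REG_POWERPC_MMCRA = 44,
	/* Extended registers */
	PERF_REG_POWERPC_MMCR0,
	PERF_REG_POWERPC_MMCR1,
	PERF_REG_POWERPC_MMCR2,
	PERF_REG_POWERPC_MMCR3,
	PERF_REG_POWERPC_SIER2,
	PERF_REG_POWERPC_SIER3,
	/* Max regs without the extended regs */
	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
};

#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
#define PERF_REG_PMU_MASK_300	(((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK)
#define PERF_REG_PMU_MASK_31	(((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) - PERF_REG_PMU_MASK)

/* Hypothetical helper: number of registers selected by a mask. */
static int mask_weight(uint64_t mask)
{
	int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

static void extended_mask_example(void)
{
	/* ISA v3.0 adds three registers, ISA v3.1 adds six. */
	assert(mask_weight(PERF_REG_PMU_MASK_300) == 3);
	assert(mask_weight(PERF_REG_PMU_MASK_31) == 6);
	/* Extended bits never overlap the base register bits. */
	assert((PERF_REG_PMU_MASK & PERF_REG_PMU_MASK_31) == 0);
}
```

Subtracting PERF_REG_PMU_MASK is what leaves only the bits above MMCRA, so the extended mask for power10 is a superset of the one for power9.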
[PATCH V6 0/2] powerpc/perf: Add support for perf extended regs in powerpc
Patch set to add support for perf extended register capability in powerpc. The capability flag PERF_PMU_CAP_EXTENDED_REGS is used to indicate PMUs which support extended registers. The generic code defines the mask of extended registers as 0 for unsupported architectures.

Patch 1/2 defines the PERF_PMU_CAP_EXTENDED_REGS mask to output the values of mmcr0, mmcr1 and mmcr2 for POWER9. It defines `PERF_REG_EXTENDED_MASK` at runtime, which contains the mask value of the supported registers under extended regs.

Patch 2/2 adds the extended regs support for power10 and exposes MMCR3, SIER2, SIER3 registers as part of extended regs.

This patch series is based on powerpc/next and includes the kernel side changes to support extended regs. Perf tools side changes will be sent as a separate patchset.

Changelog:

Changes from v5 -> v6
- Split kernel changes to one patchset as suggested by Arnaldo

Link to previous series: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=192624

Changes from v4 -> v5
- Initialize `perf_reg_extended_max` to work on all platforms as suggested by Ravi Bangoria
- Added Reviewed-and-Tested-by from Ravi Bangoria

Changes from v3 -> v4
- Split the series and send extended regs as a separate patch set here. Link to previous series: https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=190462&state=* Other PMU patches are already merged in powerpc/next.
- Fixed kernel build issue, reported by the kernel build bot, when using a config with CONFIG_PERF_EVENTS set and without CONFIG_PPC_PERF_CTRS.
- Included Reviewed-by from Kajol Jain.
- Addressed review comments from Ravi Bangoria to initialize `perf_reg_extended_max` and define it in lowercase since it is a local variable.
Anju T Sudhakar (1): powerpc/perf: Add support for outputting extended regs in perf intr_regs Athira Rajeev (1): powerpc/perf: Add extended regs support for power10 platform arch/powerpc/include/asm/perf_event.h| 3 ++ arch/powerpc/include/asm/perf_event_server.h | 5 arch/powerpc/include/uapi/asm/perf_regs.h| 20 - arch/powerpc/perf/core-book3s.c | 1 + arch/powerpc/perf/perf_regs.c| 44 ++-- arch/powerpc/perf/power10-pmu.c | 6 arch/powerpc/perf/power9-pmu.c | 6 7 files changed, 81 insertions(+), 4 deletions(-) -- 1.8.3.1
Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
On Fri, Aug 07, 2020 at 08:58:09AM +0200, David Hildenbrand wrote: > On 07.08.20 06:32, Andrew Morton wrote: > > On Fri, 3 Jul 2020 18:28:23 +0530 Srikar Dronamraju > > wrote: > > > >>> The memory hotplug changes that somehow because you can hotremove numa > >>> nodes and therefore make the nodemask sparse but that is not a common > >>> case. I am not sure what would happen if a completely new node was added > >>> and its corresponding node was already used by the renumbered one > >>> though. It would likely conflate the two I am afraid. But I am not sure > >>> this is really possible with x86 and a lack of a bug report would > >>> suggest that nobody is doing that at least. > >>> > >> > >> JFYI, > >> Satheesh copied in this mailchain had opened a bug a year on crash with > >> vcpu > >> hotplug on memoryless node. > >> > >> https://bugzilla.kernel.org/show_bug.cgi?id=202187 > > > > So... do we merge this patch or not? Seems that the overall view is > > "risky but nobody is likely to do anything better any time soon"? > > I recall the issue Michal saw was "fix powerpc" vs. "break other > architectures". @Michal how should we proceed? At least x86-64 won't be > affected IIUC. There is a patch to introduce the node remapping on ppc as well which should eliminate the empty node 0. https://patchwork.ozlabs.org/project/linuxppc-dev/patch/2020073916.243569-1-aneesh.ku...@linux.ibm.com/ Thanks Michal
Re: [PATCH v2 2/2] powerpc/pci: unmap all interrupts when a PHB is removed
On 8/7/20 8:01 AM, Alexey Kardashevskiy wrote: > > > On 18/06/2020 02:29, Cédric Le Goater wrote: >> Some PCI adapters, like GPUs, use the "interrupt-map" property to >> describe interrupt mappings other than the legacy INTx interrupts. >> There can be more than 4 mappings. >> >> To clear all interrupts when a PHB is removed, we need to increase the >> 'irq_map' array in which mappings are recorded. Compute the number of >> interrupt mappings from the "interrupt-map" property and allocate a >> bigger 'irq_map' array. >> >> Signed-off-by: Cédric Le Goater >> --- >> arch/powerpc/kernel/pci-common.c | 49 +++- >> 1 file changed, 48 insertions(+), 1 deletion(-) >> >> diff --git a/arch/powerpc/kernel/pci-common.c >> b/arch/powerpc/kernel/pci-common.c >> index 515480a4bac6..deb831f0ae13 100644 >> --- a/arch/powerpc/kernel/pci-common.c >> +++ b/arch/powerpc/kernel/pci-common.c >> @@ -353,9 +353,56 @@ struct pci_controller >> *pci_find_controller_for_domain(int domain_nr) >> return NULL; >> } >> >> +/* >> + * Assumption is made on the interrupt parent. All interrupt-map >> + * entries are considered to have the same parent. >> + */ >> +static int pcibios_irq_map_count(struct pci_controller *phb) > > I wonder if > int of_irq_count(struct device_node *dev) > could work here too. If it does not, then never mind. I wished it would, but no. > Other than that, the only other comment is - merge this one into 1/2 as > 1/2 alone won't properly fix the problem but it may look like that it does: > > for phyp, the test machine just happens to have 4 entries in the map but > this is the phyp implementation detail; yes > for qemu, there are more but we only unregister 4 but kvm does not care > in general so it is ok which is also implementation detail; > > and 2/2 just makes these details not matter. Thanks, OK. It will ease backport. Sending a v2. Thanks for the review Alexey ! C. 
> > >> +{ >> +const __be32 *imap; >> +int imaplen; >> +struct device_node *parent; >> +u32 intsize, addrsize, parintsize, paraddrsize; >> + >> +if (of_property_read_u32(phb->dn, "#interrupt-cells", &intsize)) >> +return 0; >> +if (of_property_read_u32(phb->dn, "#address-cells", &addrsize)) >> +return 0; >> + >> +imap = of_get_property(phb->dn, "interrupt-map", &imaplen); >> +if (!imap) { >> +pr_debug("%pOF : no interrupt-map\n", phb->dn); >> +return 0; >> +} >> +imaplen /= sizeof(u32); >> +pr_debug("%pOF : imaplen=%d\n", phb->dn, imaplen); >> + >> +if (imaplen < (addrsize + intsize + 1)) >> +return 0; >> + >> +imap += intsize + addrsize; >> +parent = of_find_node_by_phandle(be32_to_cpup(imap)); >> +if (!parent) { >> +pr_debug("%pOF : no imap parent found !\n", phb->dn); >> +return 0; >> +} >> + >> +if (of_property_read_u32(parent, "#interrupt-cells", &parintsize)) { >> +pr_debug("%pOF : parent lacks #interrupt-cells!\n", phb->dn); >> +return 0; >> +} >> + >> +if (of_property_read_u32(parent, "#address-cells", ¶ddrsize)) >> +paraddrsize = 0; >> + >> +return imaplen / (addrsize + intsize + 1 + paraddrsize + parintsize); >> +} >> + >> static void pcibios_irq_map_init(struct pci_controller *phb) >> { >> -phb->irq_count = PCI_NUM_INTX; >> +phb->irq_count = pcibios_irq_map_count(phb); >> +if (phb->irq_count < PCI_NUM_INTX) >> +phb->irq_count = PCI_NUM_INTX; >> >> pr_debug("%pOF : interrupt map #%d\n", phb->dn, phb->irq_count); >> >> >
[PATCH v2 2/2] powerpc/topology: Override cpu_smt_mask
On Power9, a pair of SMT4 cores can be presented by the firmware as a SMT8 core for backward compatibility reasons, with the fusion of two SMT4 cores. Powerpc allows LPARs to be live migrated from Power8 to Power9. Existing software developed/configured for Power8, expects to see a SMT8 core. In order to maintain userspace backward compatibility (with Power8 chips in case of Power9) in enterprise Linux systems, the topology_sibling_cpumask has to be set to SMT8 core. cpu_smt_mask() should generally point to the cpu mask of the SMT4 core. Hence override the default cpu_smt_mask() to be powerpc specific allowing for better scheduling behaviour on Power. schbench (latency measured in usecs, so lesser is better) Without patch With patch Latency percentiles (usec) Latency percentiles (usec) 50.th: 34 50.th: 38 75.th: 47 75.th: 52 90.th: 54 90.th: 60 95.th: 57 95.th: 64 *99.th: 62 *99.th: 72 99.5000th: 65 99.5000th: 75 99.9000th: 76 99.9000th: 3452 min=0, max=9205 min=0, max=9344 schbench (With Cede disabled) Without patch With patch Latency percentiles (usec) Latency percentiles (usec) 50.th: 20 50.th: 21 75.th: 28 75.th: 29 90.th: 33 90.th: 34 95.th: 35 95.th: 37 *99.th: 40 *99.th: 40 99.5000th: 48 99.5000th: 42 99.9000th: 94 99.9000th: 79 min=0, max=791 min=0, max=791 perf bench sched pipe usec/ops : lesser is better Without patch N Min MaxMedian AvgStddev 101 5.095113 5.595269 5.204842 5.22987760.10762713 5.10 - 5.15 : ## 23% (24) 5.15 - 5.20 : #21% (22) 5.20 - 5.25 : ## 23% (24) 5.25 - 5.30 : #11% (12) 5.30 - 5.35 : ##4% (5) 5.35 - 5.40 : 3% (4) 5.40 - 5.45 : 3% (4) 5.45 - 5.50 : 1% (2) 5.50 - 5.55 : ##0% (1) 5.55 - 5.60 : 1% (2) With patch N Min MaxMedian AvgStddev 101 5.134675 8.524719 5.207658 5.27809850.34911969 5.1 - 5.5 : ## 94% (95) 5.5 - 5.8 : ##3% (4) 5.8 - 6.2 : 0% (1) 6.2 - 6.5 : 6.5 - 6.8 : 6.8 - 7.2 : 7.2 - 7.5 : 7.5 - 7.8 : 7.8 - 8.2 : 8.2 - 8.5 : perf bench sched pipe (cede disabled) usec/ops : lesser is better Without patch N Min MaxMedian AvgStddev 
101 7.884227 12.576538 7.956474 8.01707220.46159054 7.9 - 8.4 : ## 99% (100) 8.4 - 8.8 : 8.8 - 9.3 : 9.3 - 9.8 : 9.8 - 10.2 : 10.2 - 10.7 : 10.7 - 11.2 : 11.2 - 11.6 : 11.6 - 12.1 : 12.1 - 12.6 : With patch N Min MaxMedian AvgStddev 101 7.956021 8.217284 8.015615 8.0283866 0.049844967 7.96 - 7.98 : ## 12% (13) 7.98 - 8.01 : ## 28% (29) 8.01 - 8.03 : 20% (21) 8.03 - 8.06 : #14% (15) 8.06 - 8.09 : ## 12% (13) 8.09 - 8.11 : ##3% (4) 8.11 - 8.14 : ### 1% (2) 8.14 - 8.17 : ### 1% (2) 8.17 - 8.19 : 8.19 - 8.22 : # 0% (1) Observations: With the patch, the initial run/iteration takes a slight longer time. This can be attributed to the fact that now we pick a CPU from a idle core which could be sleep mode. Once we remove the cede, state the numbers improve in favour of the patch. ebizzy: transactions per second (higher is better) without patch N Min MaxMedian AvgStddev 100 1018433 1304470 1193208 1182315
[PATCH v2 1/2] sched/topology: Allow archs to override cpu_smt_mask
cpu_smt_mask tracks topology_sibling_cpumask. This would be good for most architectures. One of the users of cpu_smt_mask() would be to identify idle cores. On Power9, a pair of SMT4 cores can be presented by the firmware as an SMT8 core for backward compatibility reasons.

Powerpc allows LPARs to be live migrated from Power8 to Power9. Do note Power8 had only SMT8 cores. Existing software which has been developed/configured for Power8 would expect to see an SMT8 core. Maintaining the illusion of an SMT8 core is a requirement to make that work.

In order to maintain the above userspace backward compatibility with previous versions of the processor, from Power9 onwards there is an option for the firmware to advertise a pair of SMT4 cores as a fused core, aka SMT8 core. On Power9 this pair shares the L2 cache as well. However, from the scheduler's point of view, a core should be determined by SMT4, since it's a completely independent unit of compute. Hence allow the powerpc architecture to override the default cpu_smt_mask() to point to the SMT4 cores in SMT8 mode. This will ensure the scheduler is always given the right information.

Cc: linuxppc-dev
Cc: LKML
Cc: Michael Ellerman
Cc: Michael Neuling
Cc: Gautham R Shenoy
Cc: Ingo Molnar
Cc: Peter Zijlstra
Cc: Valentin Schneider
Cc: Dietmar Eggemann
Cc: Mel Gorman
Cc: Vincent Guittot
Cc: Vaidyanathan Srinivasan
Acked-by: Peter Zijlstra (Intel)
Signed-off-by: Srikar Dronamraju
---
Changelog v1->v2:
	Update the commit msg based on the discussion in community esp with Peter Zijlstra and Michael Ellerman.
 include/linux/topology.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/topology.h b/include/linux/topology.h
index 608fa4aadf0e..ad03df1cc266 100644
--- a/include/linux/topology.h
+++ b/include/linux/topology.h
@@ -198,7 +198,7 @@ static inline int cpu_to_mem(int cpu)
 #define topology_die_cpumask(cpu)	cpumask_of(cpu)
 #endif

-#ifdef CONFIG_SCHED_SMT
+#if defined(CONFIG_SCHED_SMT) && !defined(cpu_smt_mask)
 static inline const struct cpumask *cpu_smt_mask(int cpu)
 {
 	return topology_sibling_cpumask(cpu);
--
2.18.2
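The `!defined(cpu_smt_mask)` guard relies on the usual kernel idiom where an arch header provides its own function and then defines a macro of the same name. The toy program below illustrates that idiom only. It is not kernel code: masks are plain 64-bit values instead of `struct cpumask` pointers, the SMT4/SMT8 layout is hard-coded, and the way the arch side announces its override is an assumption about how patch 2/2 uses this hook.

```c
#include <assert.h>

#define CONFIG_SCHED_SMT 1

/* Toy cpumask: bit N set means CPU N is a member (CPUs 0..63 only). */
typedef unsigned long long toy_cpumask_t;

/* What firmware advertises: eight threads per fused "big core". */
static inline toy_cpumask_t topology_sibling_cpumask(int cpu)
{
	return 0xffULL << (cpu / 8 * 8);
}

/* ---- stand-in for an arch header (hypothetical) ---- */
static inline toy_cpumask_t cpu_smt_mask(int cpu)
{
	/* What the scheduler should see: the four-thread small core. */
	return 0xfULL << (cpu / 4 * 4);
}
#define cpu_smt_mask cpu_smt_mask

/* ---- stand-in for the generic header after this patch ---- */
#if defined(CONFIG_SCHED_SMT) && !defined(cpu_smt_mask)
static inline toy_cpumask_t cpu_smt_mask(int cpu)
{
	return topology_sibling_cpumask(cpu);
}
#endif

static void smt_mask_example(void)
{
	/* CPU 5 sits in big core 0 but in the second small core. */
	assert(topology_sibling_cpumask(5) == 0xffULL);
	assert(cpu_smt_mask(5) == 0xf0ULL);
}
```

Without the `!defined()` check the generic definition would collide with the arch one at compile time, which is why the one-line change in topology.h is enough to open the hook.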
Re: [PATCH] powerpc/pseries/hotplug-cpu: increase wait time for vCPU death
Hi everyone, Michael Ellerman writes: > Greg Kurz writes: >> On Tue, 04 Aug 2020 23:35:10 +1000 >> Michael Ellerman wrote: >>> Spinning forever seems like a bad idea, but as has been demonstrated at >>> least twice now, continuing when we don't know the state of the other >>> CPU can lead to straight up crashes. >>> >>> So I think I'm persuaded that it's preferable to have the kernel stuck >>> spinning rather than oopsing. >>> >> >> +1 >> >>> I'm 50/50 on whether we should have a cond_resched() in the loop. My >>> first instinct is no, if we're stuck here for 20s a stack trace would be >>> good. But then we will probably hit that on some big and/or heavily >>> loaded machine. >>> >>> So possibly we should call cond_resched() but have some custom logic in >>> the loop to print a warning if we are stuck for more than some >>> sufficiently long amount of time. >> >> How long should that be ? > > Yeah good question. > > I guess step one would be seeing how long it can take on the 384 vcpu > machine. And we can probably test on some other big machines. > > Hopefully Nathan can give us some idea of how long he's seen it take on > large systems? I know he was concerned about the 20s timeout of the > softlockup detector. Maybe I'm not quite clear what this is referring to, but I don't think stop-self/query-stopped-state latency increases with processor count, at least not on PowerVM. And IIRC I was concerned with the earlier patch's potential for causing the softlockup watchdog to rightly complain by polling the stopped state without ever scheduling away. The fact that smp_query_cpu_stopped() kind of collapses the two distinct results from the query-cpu-stopped-state RTAS call into one return value may make it harder than necessary to reason about the questions around cond_resched() and whether to warn. Sorry to pull this stunt but I have had some code sitting in a neglected branch that I think gets the logic around this right. 
What we should have is a simple C wrapper for the RTAS call that reflects
the architected inputs and outputs:

(-- rtas.c --)

/**
 * rtas_query_cpu_stopped_state() - Call RTAS query-cpu-stopped-state.
 * @hwcpu: Identifies the processor thread to be queried.
 * @status: Pointer to status, valid only on success.
 *
 * Determine whether the given processor thread is in the stopped
 * state. If successful and @status is non-NULL, the thread's status
 * is stored to @status.
 *
 * Return:
 * * 0 - Success
 * * -1 - Hardware error
 * * -2 - Busy, try again later
 */
int rtas_query_cpu_stopped_state(unsigned int hwcpu, unsigned int *status)
{
	unsigned int cpu_status;
	int token;
	int fwrc;

	token = rtas_token("query-cpu-stopped-state");

	fwrc = rtas_call(token, 1, 2, &cpu_status, hwcpu);
	if (fwrc != 0)
		goto out;

	if (status != NULL)
		*status = cpu_status;
out:
	return fwrc;
}

And then a utility function that waits for the remote thread to enter
stopped state, with higher-level logic for rescheduling and warning. The
fact that smp_query_cpu_stopped() currently does not handle a -2/busy
status is a bug, fixed below by using rtas_busy_delay(). Note the
justification for the explicit cond_resched() in the outer loop:

(-- rtas.h --)

/* query-cpu-stopped-state CPU_status */
#define RTAS_QCSS_STATUS_STOPPED     0
#define RTAS_QCSS_STATUS_IN_PROGRESS 1
#define RTAS_QCSS_STATUS_NOT_STOPPED 2

(-- pseries/hotplug-cpu.c --)

/**
 * wait_for_cpu_stopped() - Wait for a cpu to enter RTAS stopped state.
 */
static void wait_for_cpu_stopped(unsigned int cpu)
{
	unsigned int status;
	unsigned int hwcpu;

	hwcpu = get_hard_smp_processor_id(cpu);

	do {
		int fwrc;

		/*
		 * rtas_busy_delay() will yield only if RTAS returns a
		 * busy status. Since query-cpu-stopped-state can
		 * yield RTAS_QCSS_STATUS_IN_PROGRESS or
		 * RTAS_QCSS_STATUS_NOT_STOPPED for an unbounded
		 * period before the target thread stops, we must take
		 * care to explicitly reschedule while polling.
		 */
		cond_resched();

		do {
			fwrc = rtas_query_cpu_stopped_state(hwcpu, &status);
		} while (rtas_busy_delay(fwrc));

		if (fwrc == 0)
			continue;

		pr_err_ratelimited("query-cpu-stopped-state for "
				   "thread 0x%x returned %d\n",
				   hwcpu, fwrc);
		goto out;

	} while (status == RTAS_QCSS_STATUS_NOT_STOPPED ||
		 status == RTAS_QCSS_STATUS_IN_PROGRESS);

	if (status != RTAS_QCSS_STATUS_STOPPED) {
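
For illustration only, the "call cond_resched() but warn if we are stuck
too long" idea raised earlier in the thread could look roughly like the
userspace sketch below. This is not the kernel code from the message above:
struct fake_cpu, fake_query_stopped_state(), wait_for_stopped() and the
warn_after_sec parameter are all hypothetical stand-ins, and sched_yield()
only plays the role that cond_resched() would play in the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <time.h>

/* CPU_status values, mirroring the RTAS_QCSS_STATUS_* defines above. */
enum qcss_status {
	QCSS_STOPPED = 0,
	QCSS_IN_PROGRESS = 1,
	QCSS_NOT_STOPPED = 2,
};

/* Hypothetical stand-in for a remote thread that is being stopped. */
struct fake_cpu {
	unsigned int polls_until_stopped; /* poll count at which it reports stopped */
	unsigned int polls;               /* polls performed so far */
};

/* Hypothetical stand-in for the firmware query; always succeeds (returns 0). */
static int fake_query_stopped_state(struct fake_cpu *cpu, unsigned int *status)
{
	cpu->polls++;
	if (cpu->polls >= cpu->polls_until_stopped)
		*status = QCSS_STOPPED;
	else
		*status = QCSS_NOT_STOPPED;
	return 0;
}

/* Seconds elapsed on the monotonic clock since *start. */
static double elapsed_sec(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (double)(now.tv_sec - start->tv_sec) +
	       (double)(now.tv_nsec - start->tv_nsec) / 1e9;
}

/*
 * Poll until the fake cpu reports stopped. Yield on every iteration so
 * other work can run, and print a warning each time another
 * warn_after_sec seconds pass without the cpu stopping.
 * Returns the number of warnings printed.
 */
static unsigned int wait_for_stopped(struct fake_cpu *cpu, double warn_after_sec)
{
	struct timespec start;
	unsigned int status;
	unsigned int warnings = 0;
	double next_warn = warn_after_sec;

	assert(cpu != NULL && warn_after_sec >= 0.0);
	clock_gettime(CLOCK_MONOTONIC, &start);

	do {
		sched_yield(); /* userspace analogue of cond_resched() */
		fake_query_stopped_state(cpu, &status);

		if (status != QCSS_STOPPED && elapsed_sec(&start) >= next_warn) {
			fprintf(stderr, "still waiting after %u polls\n", cpu->polls);
			warnings++;
			next_warn += warn_after_sec;
		}
	} while (status == QCSS_NOT_STOPPED || status == QCSS_IN_PROGRESS);

	return warnings;
}
```

The point of the sketch is only that yielding and warning are independent
decisions: the loop never gives up, but it stays schedulable and leaves a
trace when the wait is unexpectedly long. What a sensible threshold would
be is exactly the open question in the thread.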
Re: [PATCH v5 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline
On 07.08.20 06:32, Andrew Morton wrote:
> On Fri, 3 Jul 2020 18:28:23 +0530 Srikar Dronamraju
> wrote:
>
>>> The memory hotplug changes that somehow because you can hotremove numa
>>> nodes and therefore make the nodemask sparse but that is not a common
>>> case. I am not sure what would happen if a completely new node was added
>>> and its corresponding node was already used by the renumbered one
>>> though. It would likely conflate the two I am afraid. But I am not sure
>>> this is really possible with x86 and a lack of a bug report would
>>> suggest that nobody is doing that at least.
>>>
>>
>> JFYI,
>> Satheesh copied in this mailchain had opened a bug a year on crash with vcpu
>> hotplug on memoryless node.
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=202187
>
> So... do we merge this patch or not? Seems that the overall view is
> "risky but nobody is likely to do anything better any time soon"?

I recall the issue Michal saw was "fix powerpc" vs. "break other
architectures". @Michal how should we proceed? At least x86-64 won't be
affected IIUC.

-- 
Thanks,

David / dhildenb