Re: [Linux-stm32] [PATCH v8 08/10] drm: stm: dw-mipi-dsi: let the bridge handle the HW version check

2020-06-01 Thread Adrian Ratiu

On Fri, 29 May 2020, Philippe CORNU wrote:
Hi Adrian, and thank you very much for the patchset.
Thank you also for having tested it on STM32F769 and STM32MP1.

Sorry for the late response, Yannick and I will review it as soon as
possible and we will keep you posted.

Note: Do not hesitate to put us in copy for the next version
(philippe.co...@st.com, yannick.fer...@st.com)

Regards,
Philippe :-)


Hi Philippe,

Thank you very much for your previous and future STM testing, 
I really appreciate it! I've CC'd Yannick until now, and I'll be 
sure to CC you as well :)


It's been over a month since I posted v8 and I was just gearing up 
to address all feedback, rebase and retest to prepare v9, but I'll 
wait a little longer. No problem, there's no rush.


Have an awesome day,
Adrian




On 4/27/20 10:19 AM, Adrian Ratiu wrote:

The stm mipi-dsi platform driver added a version test in
commit fa6251a747b7 ("drm/stm: dsi: check hardware version")
so that HW revisions other than v1.3x get rejected. The rockchip
driver had no such check and just assumed register layouts are
v1.3x compatible.

Having such tests was a good idea because only v130/v131 layouts
were supported at the time, however since adding multiple layout
support in the bridge, the version is automatically checked for
all drivers, compatible layouts get picked and unsupported HW is
automatically rejected by the bridge, so there's no use keeping
the test in the stm driver.

The main reason prompting this change is that the stm driver
test immediately disabled the peripheral clock after reading
the version, making the bridge read version 0x0 immediately
after in its own probe(), so we move the clock disabling after
the bridge does the version test.

Tested on STM32F769 and STM32MP1.

Cc: linux-st...@st-md-mailman.stormreply.com
Reported-by: Adrian Pop 
Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Adrian Ratiu 
---
New in v6.
---
  drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 12 +++-
  1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
index 2e1f2664495d0..7218e405d7e2b 100644
--- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
+++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
@@ -402,15 +402,6 @@ static int dw_mipi_dsi_stm_probe(struct platform_device *pdev)
 		goto err_dsi_probe;
 	}
 
-	dsi->hw_version = dsi_read(dsi, DSI_VERSION) & VERSION;
-	clk_disable_unprepare(pclk);
-
-	if (dsi->hw_version != HWVER_130 && dsi->hw_version != HWVER_131) {
-		ret = -ENODEV;
-		DRM_ERROR("bad dsi hardware version\n");
-		goto err_dsi_probe;
-	}
-
 	dw_mipi_dsi_stm_plat_data.base = dsi->base;
 	dw_mipi_dsi_stm_plat_data.priv_data = dsi;
 
@@ -423,6 +414,9 @@ static int dw_mipi_dsi_stm_probe(struct platform_device *pdev)
 		goto err_dsi_probe;
 	}
 
+	dsi->hw_version = dsi_read(dsi, DSI_VERSION) & VERSION;
+	clk_disable_unprepare(pclk);
+
 	return 0;
  
  err_dsi_probe:
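
For readers following the ordering argument in the commit message, here is a
minimal user-space sketch of it. This is not driver code: struct fake_dsi, the
fake_* helpers and the two probe_*_order() functions are hypothetical names
made up for illustration, and the model simply assumes that a register read
returns 0 while the peripheral clock is gated, which is the behaviour the
commit message reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the DSI host: reads only work while pclk runs. */
struct fake_dsi {
	bool pclk_enabled;
	uint32_t version_reg;	/* arbitrary non-zero value on real hardware */
};

static uint32_t fake_dsi_read_version(const struct fake_dsi *dsi)
{
	assert(dsi != NULL);
	return dsi->pclk_enabled ? dsi->version_reg : 0;
}

/* Stand-in for the bridge probe: a version of 0 matches no known layout. */
static int fake_bridge_probe(const struct fake_dsi *dsi)
{
	return fake_dsi_read_version(dsi) ? 0 : -1;
}

/* Old ordering: the platform driver gates the clock before the bridge probes. */
static int probe_old_order(struct fake_dsi *dsi)
{
	dsi->pclk_enabled = true;
	(void)fake_dsi_read_version(dsi);	/* platform-side version test */
	dsi->pclk_enabled = false;		/* clock gated too early */
	return fake_bridge_probe(dsi);		/* bridge now reads 0 */
}

/* New ordering: the clock stays on until the bridge has read the version. */
static int probe_new_order(struct fake_dsi *dsi)
{
	int ret;

	dsi->pclk_enabled = true;
	ret = fake_bridge_probe(dsi);
	dsi->pclk_enabled = false;
	return ret;
}
```

With any non-zero version_reg, probe_old_order() fails in this model while
probe_new_order() succeeds, which mirrors why the patch moves
clk_disable_unprepare() after the bridge probe.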




[PATCH v6] char: tpm: add i2c driver for cr50

2020-12-07 Thread Adrian Ratiu
From: "dlau...@chromium.org" 

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 bytes of status register must be read/written at once.
- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.
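
To make the constraints above concrete, a small stand-alone sketch of decoding
a 4-byte status read follows. It is an illustration only, not code from the
driver: the struct and function names are hypothetical, and the byte layout
(status in byte 0, little-endian burst count in bytes 1-2) is assumed from the
usual TIS status register layout rather than taken from this patch.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_MAX_BURST 63	/* limit quoted in the commit message */

/* Hypothetical decoded view of one 4-byte status register read. */
struct example_sts {
	uint8_t status;
	uint16_t burst;
};

/*
 * Decode a status word that was read in a single 4-byte transfer.
 * Assumed layout: raw[0] = status bits, raw[1..2] = burst count (LE).
 * Returns 0 when the burst count is usable, -1 otherwise.
 */
static int example_decode_status(const uint8_t raw[4], struct example_sts *out)
{
	out->status = raw[0];
	out->burst = (uint16_t)(raw[1] | (raw[2] << 8));

	if (out->burst == 0 || out->burst > EXAMPLE_MAX_BURST)
		return -1;
	return 0;
}

/* Size of the next FIFO chunk: never more than the reported burst count. */
static size_t example_chunk_len(size_t remaining, uint16_t burst)
{
	return remaining < burst ? remaining : burst;
}
```

A caller draining a 100-byte response with a burst count of 63 would transfer
63 bytes, re-read the status, then transfer the rest.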

Cc: Helen Koike 
Cc: Jarkko Sakkinen 
Cc: Ezequiel Garcia 
Signed-off-by: Duncan Laurie 
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilerplate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd 
Signed-off-by: Fabien Lahoudere 
Signed-off-by: Adrian Ratiu 
---
Changes in v6:
  - Whitespace, code style and kdoc fixes (Jarkko)

Changes in v5:
  - Fix copyright notice (Jarkko)
  - Drop CR50_NO/FORCE defines (Jarkko)
  - Rename irq handler arg dev_id -> tpm_info (Jarkko)
  - Whitespace, brackets, christmas tree, `checkpatch --strict`, W=n fixes

Changes in v4:
  - Replace force_release enum with defines (Jarkko)

Changes in v3:
  - Misc small fixes (typos/renamings, comments, default values)
  - Moved i2c_write memcpy before lock to minimize critical section (Helen)
  - Dropped priv->locality because it stored a constant value (Helen)
  - Many kdoc, function name and style fixes in general (Jarkko)
  - Kept the force release enum instead of defines or bool (Ezequiel)

Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.
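
The i2c_wait_tpm_ready() return value rework in the v2 notes can be sketched
roughly as below. This is a simplified illustration, not the driver code: the
function name is hypothetical, and it only assumes the usual convention of
kernel wait helpers such as wait_for_completion_timeout(), which return 0 on
timeout and the remaining jiffies (at least 1) otherwise.

```c
#include <errno.h>

/*
 * Map a "remaining jiffies" style result to a plain status code.
 * The remaining time itself is dropped because callers have no use for it:
 * 0 means success, -ETIMEDOUT means the wait expired.
 */
static int example_wait_result_to_errno(unsigned long remaining)
{
	return remaining ? 0 : -ETIMEDOUT;
}
```

Callers can then use the common `if (ret) return ret;` pattern instead of
treating a non-zero value as success.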

Applies on next-20201207, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 790 
 3 files changed, 802 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
  To compile this driver as a module, choose  M here;
  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+   tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+   depends on I2C
+   select TCG_CR50
+   help
+ This is a driver for the Google cr50 I2C TPM interface which is a
+ custom microcontroller and requires a custom i2c protocol interface
+ to handle the limitations of the hardware.  To compile this driver
+ as a module, choose M here; the module will be called tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..ec9a65e7887d
--- /dev/null
+++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
@@ -0,0 +1,790 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2020 Google Inc.
+ *
+ * Based on Infineon TPM driver by Peter Huewe.
+ *
+ * cr50 is a firmware for H1 secure modules that requires special
+ * handling for the I2C interface.
+ *
+ * - Use an interrupt for transaction status instead of hardcoded delays.
+ * - Must use write+wait+read read protocol.
+ * - All 4 bytes of status register must be read/written at once.
+ * - Burst count max is 63 bytes, and burst count behaves slightly differently
+ *   than other I2C TPMs.
+ * - When reading from FIFO the full burstcnt m

[PATCH] media: rkvdec: silence ktest bot build warning

2020-12-08 Thread Adrian Ratiu
Some configurations built by the ktest bot produce the following
warning, so mark the struct as __maybe_unused to avoid unnecessary
ML spam.

>> drivers/staging/media/rkvdec/rkvdec.c:967:34: warning: unused variable 'of_rkvdec_match' [-Wunused-const-variable]
   static const struct of_device_id of_rkvdec_match[] = {
                                    ^
   1 warning generated.

vim +/of_rkvdec_match +967 drivers/staging/media/rkvdec/rkvdec.c

   966
 > 967  static const struct of_device_id of_rkvdec_match[] = {
   968  { .compatible = "rockchip,rk3399-vdec" },
   969  { /* sentinel */ }
   970  };
   971  MODULE_DEVICE_TABLE(of, of_rkvdec_match);
   972

Cc: Boris Brezillon 
Cc: Ezequiel Garcia 
Cc: Mauro Carvalho Chehab 
Reported-by: kernel test robot 
Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/rkvdec/rkvdec.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
index aa4f8c287618..3af0f02ec59b 100644
--- a/drivers/staging/media/rkvdec/rkvdec.c
+++ b/drivers/staging/media/rkvdec/rkvdec.c
@@ -992,7 +992,7 @@ static void rkvdec_watchdog_func(struct work_struct *work)
}
 }
 
-static const struct of_device_id of_rkvdec_match[] = {
+static const struct of_device_id __maybe_unused of_rkvdec_match[] = {
{ .compatible = "rockchip,rk3399-vdec" },
{ /* sentinel */ }
 };
-- 
2.29.2
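
As a side illustration of why the annotation silences the warning, here is a
small user-space sketch. The names (MAYBE_UNUSED, example_match_ptr,
USE_MATCH_TABLE, struct example_device_id) are hypothetical; the sketch assumes
the table is referenced through a macro that collapses to NULL in some
configurations, similar in spirit to how of_match_ptr() behaves when CONFIG_OF
is disabled.

```c
#include <stddef.h>

/* User-space equivalent of the kernel's __maybe_unused annotation. */
#define MAYBE_UNUSED __attribute__((unused))

struct example_device_id {
	const char *compatible;
};

/* Without MAYBE_UNUSED, builds that drop the reference below would warn. */
static const struct example_device_id MAYBE_UNUSED example_match[] = {
	{ .compatible = "rockchip,rk3399-vdec" },
	{ NULL } /* sentinel */
};

#ifdef USE_MATCH_TABLE
#define example_match_ptr(table) (table)
#else
#define example_match_ptr(table) NULL	/* table becomes unreferenced */
#endif

const struct example_device_id *example_driver_match_table(void)
{
	return example_match_ptr(example_match);
}
```

Built without -DUSE_MATCH_TABLE, the table has no remaining users, and the
attribute tells the compiler that this is expected.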



[PATCH v4 2/2] arm: lib: xor-neon: move pragma options to makefile

2021-01-19 Thread Adrian Ratiu
Using a pragma like GCC optimize is a bad idea because it tags
all functions with an __attribute__((optimize)) which replaces
optimization options rather than appending so could result in
dropping important flags. Not recommended for production use.

Because these options should always be enabled for this file,
it's better to set them via command line. tree-vectorize is on
by default in Clang, but it doesn't hurt to make it explicit.

Suggested-by: Arvind Sankar 
Suggested-by: Ard Biesheuvel 
Reviewed-by: Nick Desaulniers 
Reviewed-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 10 --
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b..12d31d1a7630 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
 
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
-  CFLAGS_xor-neon.o+= $(NEON_FLAGS)
+  CFLAGS_xor-neon.o+= $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index f9f3601cc2d1..65125ce69044 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -23,16 +23,6 @@ MODULE_LICENSE("GPL");
 #warning Clang does not vectorize code in this file.
 #endif
 
-/*
- * Pull in the reference implementations while instructing GCC (through
- * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
- * NEON instructions.
- */
-#ifdef CONFIG_CC_IS_GCC
-#pragma GCC optimize "tree-vectorize"
-#endif
-
-#pragma GCC diagnostic ignored "-Wunused-variable"
 #include 
 
 struct xor_block_template const xor_block_neon_inner = {
-- 
2.30.0
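
A minimal sketch of the approach taken here, for illustration: the source file
carries no optimization pragma, and vectorization is requested from the build
command instead, so the flag is appended to the other options rather than
replacing them. The function name xor_2_example and the compile command are
assumptions for this example, not taken from the kernel sources; something
like `cc -O2 -ftree-vectorize -c xor_example.c` would play the role of the
CFLAGS_xor-neon.o line in the Makefile.

```c
#include <stddef.h>

/*
 * Plain C reference loop with no "#pragma GCC optimize" in the file.
 * Whether it gets vectorized is decided purely by the command-line flags.
 */
void xor_2_example(size_t words, unsigned long *restrict p1,
		   const unsigned long *restrict p2)
{
	for (size_t i = 0; i < words; i++)
		p1[i] ^= p2[i];
}
```

The result is the same with or without vectorization; only the generated
instructions differ.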



[PATCH v4 0/2] xor-neon: Remove GCC warn & pragmas

2021-01-19 Thread Adrian Ratiu
Dear all,

In v4 a Clang-specific vectorization warning was added at
Arnd's suggestion.

This series does not address the Clang vectorization bug
itself, which is a known pre-existing issue documented
at [1] [2] [3]. Clang vectorization needs to be investigated
in more depth and fixed separately. The purpose of this series
is only to fix some low-hanging-fruit GCC-related issues.

Tested on next-20210118 using GCC 10.2.0 and Clang 10.0.1.

[1] https://bugs.llvm.org/show_bug.cgi?id=40976
[2] https://github.com/ClangBuiltLinux/linux/issues/503
[3] https://github.com/ClangBuiltLinux/linux/issues/496

Kind regards,
Adrian

Adrian Ratiu (1):
  arm: lib: xor-neon: move pragma options to makefile

Nathan Chancellor (1):
  arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 18 +-
 2 files changed, 6 insertions(+), 14 deletions(-)

-- 
2.30.0



[PATCH v4 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2021-01-19 Thread Adrian Ratiu
From: Nathan Chancellor 

Drop warning because kernel now requires GCC >= v4.9 after
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9")
and clarify that -ftree-vectorize now always needs enabling
for GCC by directly testing the presence of CONFIG_CC_IS_GCC.

Another reason to remove the warning is that Clang exposes
itself as GCC < 4.6 so it triggers the warning about GCC
which doesn't make much sense and misleads Clang users by
telling them to update GCC.

Because Clang is now supported by the kernel, print a clear
Clang-specific warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503
Reported-by: Nick Desaulniers 
Reviewed-by: Nick Desaulniers 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/xor-neon.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..f9f3601cc2d1 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,20 +14,22 @@ MODULE_LICENSE("GPL");
 #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif
 
+/*
+ * TODO: Even though -ftree-vectorize is enabled by default in Clang, the
+ * compiler does not produce vectorized code due to its cost model.
+ * See: https://github.com/ClangBuiltLinux/linux/issues/503
+ */
+#ifdef CONFIG_CC_IS_CLANG
+#warning Clang does not vectorize code in this file.
+#endif
+
 /*
  * Pull in the reference implementations while instructing GCC (through
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"
-- 
2.30.0
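
To illustrate the point about Clang tripping the old version test, a small
stand-alone sketch follows. The helper names are hypothetical. It relies on
the fact that Clang defines __GNUC__ and __GNUC_MINOR__ for compatibility
(as 4 and 2), so a bare __GNUC__ comparison cannot tell the two compilers
apart, whereas __clang__ can.

```c
/* Identify the compiler explicitly instead of inferring it from __GNUC__. */
#if defined(__clang__)
#define EXAMPLE_COMPILER_NAME "clang"
#elif defined(__GNUC__)
#define EXAMPLE_COMPILER_NAME "gcc"
#else
#define EXAMPLE_COMPILER_NAME "unknown"
#endif

/* The version test that the patch removes, reproduced for comparison. */
#if defined(__GNUC__) && \
	(__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6))
#define EXAMPLE_OLD_CHECK_PASSES 1
#else
#define EXAMPLE_OLD_CHECK_PASSES 0
#endif

const char *example_compiler_name(void)
{
	return EXAMPLE_COMPILER_NAME;
}

int example_old_check_passes(void)
{
	return EXAMPLE_OLD_CHECK_PASSES;
}
```

Under Clang the old check evaluates to false, which is why Clang users saw a
message asking them to update GCC.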



Re: [PATCH v6] char: tpm: add i2c driver for cr50

2020-12-09 Thread Adrian Ratiu

On Tue, 08 Dec 2020, Jarkko Sakkinen  wrote:
On Mon, Dec 07, 2020 at 04:20:16PM +0200, Adrian Ratiu wrote:
From: "dlau...@chromium.org"

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 bytes of status register must be read/written at once.
- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.


Starts to look legit. Has anyone tested this? 


I tested on an x86_64 Chromebook EVE (aka Google Pixelbook) by 
chainloading in legacy mode and booting into a Yocto-based 
userspace (meta-chromebook) where I used tpm2-tools to communicate 
with the chip and also built and tested a ChromiumOS userspace in 
developer mode.


I do not have access to other HW which has this chip, so this is 
about as much testing as I can do to confirm the driver works on 
this HW.


Adrian



/Jarkko



Cc: Helen Koike 
Cc: Jarkko Sakkinen 
Cc: Ezequiel Garcia 
Signed-off-by: Duncan Laurie 
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilerplate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd 
Signed-off-by: Fabien Lahoudere 
Signed-off-by: Adrian Ratiu 
---
Changes in v6:
  - Whitespace, code style and kdoc fixes (Jarkko)

Changes in v5:
  - Fix copyright notice (Jarkko)
  - Drop CR50_NO/FORCE defines (Jarkko)
  - Rename irq handler arg dev_id -> tpm_info (Jarkko)
  - Whitespace, brackets, christmas tree, `checkpatch --strict`, W=n fixes

Changes in v4:
  - Replace force_release enum with defines (Jarkko)

Changes in v3:
  - Misc small fixes (typos/renamings, comments, default values)
  - Moved i2c_write memcpy before lock to minimize critical section (Helen)
  - Dropped priv->locality because it stored a constant value (Helen)
  - Many kdoc, function name and style fixes in general (Jarkko)
  - Kept the force release enum instead of defines or bool (Ezequiel)

Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.

Applies on next-20201207, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 790 
 3 files changed, 802 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
  To compile this driver as a module, choose  M here;
  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+   tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+   depends on I2C
+   select TCG_CR50
+   help
+ This is a driver for the Google cr50 I2C TPM interface which is a
+ custom microcontroller and requires a custom i2c protocol interface
+ to handle the limitations of the hardware.  To compile this driver
+ as a module, choose M here; the module will be called tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..ec9a65e7887d
--- /dev/null
+++ b/drivers/char/tpm/tpm_

Re: [PATCH v2] char: tpm: add i2c driver for cr50

2020-11-23 Thread Adrian Ratiu
On Fri, 20 Nov 2020, Helen Koike wrote:
Hello Adrian, 

I just spotted small things (nothing major), please see below. 


Hi Helen,

I've addressed all the points you raised and left some minor 
comments below, but I will wait a little longer before posting v3 
so others will also have a chance to review. Thank you very much!




On 11/20/20 2:23 PM, Adrian Ratiu wrote: 
From: "dlau...@chromium.org"

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 byes of status register must be read/written at once.


s/byes/bytes

- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.

Cc: Helen Koike
Signed-off-by: Duncan Laurie
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilier plate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd
Signed-off-by: Fabien Lahoudere
Signed-off-by: Adrian Ratiu
---
Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.

v1 posted at https://lkml.org/lkml/2020/2/25/349

Applies on next-20201120, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 768 
 3 files changed, 780 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
 	  To compile this driver as a module, choose  M here;
 	  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+	tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+	depends on I2C
+	select TCG_CR50
+	help
+	  This is a driver for the Google cr50 I2C TPM interface which is a
+	  custom microcontroller and requires a custom i2c protocol interface
+	  to handle the limitations of the hardware.  To compile this driver
+	  as a module, choose M here; the module will be called tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
 	tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
 	depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..37555dafdca0
--- /dev/null
+++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
@@ -0,0 +1,768 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2016 Google Inc.
+ *
+ * Based on Linux Kernel TPM driver by
+ * Peter Huewe
+ * Copyright (C) 2011 Infineon Technologies
+ */
+
+/*
+ * cr50 is a firmware for H1 secure modules that requires special
+ * handling for the I2C interface.
+ *
+ * - Use an interrupt for transaction status instead of hardcoded delays
+ * - Must use write+wait+read read protocol
+ * - All 4 bytes of status register must be read/written at once
+ * - Burst count max is 63 bytes, and burst count behaves
+ *   slightly differently than other I2C TPMs
+ * - When reading from FIFO the full burstcnt must be read
+ *   instead of just reading header and determining the remainder
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "tpm_tis_core.h"
+
+#define CR50_MAX_BUFSIZE	64

Re: [PATCH v2] char: tpm: add i2c driver for cr50

2020-11-24 Thread Adrian Ratiu

On Tue, 24 Nov 2020, Jarkko Sakkinen  wrote:
On Fri, Nov 20, 2020 at 07:23:45PM +0200, Adrian Ratiu wrote: 
From: "dlau...@chromium.org"

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 byes of status register must be read/written at once.
- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.

Cc: Helen Koike
Signed-off-by: Duncan Laurie
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilier plate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd
Signed-off-by: Fabien Lahoudere
Signed-off-by: Adrian Ratiu
---
Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.

v1 posted at https://lkml.org/lkml/2020/2/25/349

Applies on next-20201120, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 768 
 3 files changed, 780 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
 	  To compile this driver as a module, choose  M here;
 	  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+	tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+	depends on I2C
+	select TCG_CR50
+	help
+	  This is a driver for the Google cr50 I2C TPM interface which is a
+	  custom microcontroller and requires a custom i2c protocol interface
+	  to handle the limitations of the hardware.  To compile this driver
+	  as a module, choose M here; the module will be called tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
 	tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
 	depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..37555dafdca0
--- /dev/null
+++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
@@ -0,0 +1,768 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2016 Google Inc.
+ *
+ * Based on Linux Kernel TPM driver by
+ * Peter Huewe
+ * Copyright (C) 2011 Infineon Technologies
+ */
+
+/*
+ * cr50 is a firmware for H1 secure modules that requires special
+ * handling for the I2C interface.
+ *
+ * - Use an interrupt for transaction status instead of hardcoded delays
+ * - Must use write+wait+read read protocol
+ * - All 4 bytes of status register must be read/written at once
+ * - Burst count max is 63 bytes, and burst count behaves
+ *   slightly differently than other I2C TPMs
+ * - When reading from FIFO the full burstcnt must be read
+ *   instead of just reading header and determining the remainder
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "tpm_tis_core.h"
+
+#define CR50_MAX_BUFSIZE	64
+#define CR50_TIMEOUT_SHORT_MS	2	/* Short timeout during transactions */
+#define CR50_TIMEOUT_NOIRQ_MS	20	/* Timeout for TPM ready without IRQ */
+#define CR50_I2C_DID_VID	0x00281ae0L
+#define CR50_I2C_MAX_RETRIES	3	/* Max retries due to I2C errors */
+#define CR50_I2C_RETRY_DEL

Re: [PATCH v2] char: tpm: add i2c driver for cr50

2020-11-26 Thread Adrian Ratiu
On Thu, 26 Nov 2020, Ezequiel Garcia wrote:
On Thu, 2020-11-26 at 05:30 +0200, Jarkko Sakkinen wrote:
On Tue, 2020-11-24 at 10:14 -0300, Ezequiel Garcia wrote:
> Hi Jarkko,
>
> Thanks for your review.
>
> On Tue, 2020-11-24 at 00:06 +0200, Jarkko Sakkinen wrote:
> > On Fri, Nov 20, 2020 at 07:23:45PM +0200, Adrian Ratiu wrote:
> > > From: "dlau...@chromium.org"
> > >
> > > Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.
> > >
> > > The firmware running on the currently supported H1 MCU requires a
> > > special driver to handle its specific protocol, and this makes it
> > > unsuitable to use tpm_tis_core_* and instead it must implement the
> > > underlying TPM protocol similar to the other I2C TPM drivers.
> > >
> > > - All 4 byes of status register must be read/written at once.
> > > - FIFO and burst count is limited to 63 and must be drained by AP.
> > > - Provides an interrupt to indicate when read response data is ready
> > > and when the TPM is finished processing write data.
> > >
> > > This driver is based on the existing infineon I2C TPM driver, which
> > > most closely matches the cr50 i2c protocol behavior.
> > >
> > > Cc: Helen Koike
> > > Signed-off-by: Duncan Laurie
> > > [swb...@chromium.org: Depend on i2c even if it's a module, replace
> > > boilier plate with SPDX tag, drop asm/byteorder.h include, simplify
> > > return from probe]
> > > Signed-off-by: Stephen Boyd
> > > Signed-off-by: Fabien Lahoudere
> > > Signed-off-by: Adrian Ratiu
> > > ---
> > > Changes in v2:
> > >   - Various small fixes all over (reorder includes, MAX_BUFSIZE,
> > > comments, etc)
> > >   - Reworked return values of i2c_wait_tpm_ready() to fix timeout
> > > mis-handling so ret == 0 now means success, the wait period jiffies
> > > is ignored because that number is meaningless and return a proper
> > > timeout error in case jiffies == 0.
> > >   - Make i2c default to 1 message per transfer (requested by Helen)
> > >   - Move -EIO error reporting to transfer function to cleanup
> > > transfer() itself and its R/W callers
> > >   - Remove magic value hardcodings and introduce enum force_release.
> > >
> > > v1 posted at https://lkml.org/lkml/2020/2/25/349
> > >
> > > Applies on next-20201120, tested on Chromebook EVE.
> > > ---
> > >  drivers/char/tpm/Kconfig|  10 +
> > >  drivers/char/tpm/Makefile   |   2 +
> > >  drivers/char/tpm/tpm_tis_i2c_cr50.c | 768 
> > >  3 files changed, 780 insertions(+)
> > >  create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c
> > >
> > > diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
> > > index a18c314da211..4308f9ca7a43 100644
> > > --- a/drivers/char/tpm/Kconfig
> > > +++ b/drivers/char/tpm/Kconfig
> > > @@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
> > >  	  To compile this driver as a module, choose  M here;
> > >  	  the module will be called tpm_tis_synquacer.
> > >  
> > > +config TCG_TIS_I2C_CR50
> > > +	tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
> > > +	depends on I2C
> > > +	select TCG_CR50
> > > +	help
> > > +	  This is a driver for the Google cr50 I2C TPM interface which is a
> > > +	  custom microcontroller and requires a custom i2c protocol interface
> > > +	  to handle the limitations of the hardware.  To compile this driver
> > > +	  as a module, choose M here; the module will be called tcg_tis_i2c_cr50.
> > > +
> > >  config TCG_TIS_I2C_ATMEL
> > >  	tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
> > >  	depends on I2C
> > > diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
> > > index 84db4fb3a9c9..66d39ea6bd10 100644
> > > --- a/drivers/char/tpm/Makefile
> > > +++ b/drivers/char/tpm/Makefile
> > > @@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
> > >  tpm_tis_spi-y := tpm_tis_spi_main.o
> > >  tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
> > >  
> > > +obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
> > > +
> &

[PATCH v4] char: tpm: add i2c driver for cr50

2020-12-02 Thread Adrian Ratiu
From: "dlau...@chromium.org" 

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 bytes of status register must be read/written at once.
- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.
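
As a rough illustration of the 63-byte burst limit listed above, the following user-space sketch splits a transfer into bursts. It is not the driver code: SKETCH_MAX_BURST, sketch_next_burst() and sketch_count_bursts() are hypothetical names introduced only for this example.

```c
#include <stddef.h>

/* Hypothetical constant: the commit message states a limit of 63. */
#define SKETCH_MAX_BURST 63

/* Size of the next burst for a transfer with 'remaining' bytes left. */
static size_t sketch_next_burst(size_t remaining)
{
	return remaining < SKETCH_MAX_BURST ? remaining : SKETCH_MAX_BURST;
}

/* Number of bursts needed to move 'len' bytes through the FIFO. */
static size_t sketch_count_bursts(size_t len)
{
	size_t bursts = 0;

	while (len > 0) {
		len -= sketch_next_burst(len);
		bursts++;
	}
	return bursts;
}
```

Under these assumptions a 64-byte message needs two bursts, which is why the AP has to drain the FIFO between chunks.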

Cc: Helen Koike 
Cc: Jarkko Sakkinen 
Cc: Ezequiel Garcia 
Signed-off-by: Duncan Laurie 
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilerplate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd 
Signed-off-by: Fabien Lahoudere 
Signed-off-by: Adrian Ratiu 
---
Changes in v4:
  - Replace force_release enum with defines (Jarkko)

Changes in v3:
  - Misc small fixes (typos/renamings, comments, default values)
  - Moved i2c_write memcpy before lock to minimize critical section (Helen)
  - Dropped priv->locality because it stored a constant value (Helen)
  - Many kdoc, function name and style fixes in general (Jarkko)
  - Kept the force release enum instead of defines or bool (Ezequiel)

Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.
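
The reworked i2c_wait_tpm_ready() return convention from the v2 notes (0 means success, a timeout error otherwise, remaining jiffies not propagated) could look roughly like the sketch below. It assumes a kernel-style wait helper that returns the remaining jiffies, with 0 meaning the wait timed out; sketch_wait_result() is a hypothetical user-space stand-in, not the function from the patch.

```c
#include <errno.h>

/*
 * Hypothetical helper: map the "remaining jiffies" result of a
 * kernel-style timed wait to 0 on success or -ETIMEDOUT on timeout.
 * The number of remaining jiffies is deliberately dropped.
 */
static int sketch_wait_result(unsigned long remaining_jiffies)
{
	if (remaining_jiffies == 0)
		return -ETIMEDOUT;

	return 0;
}
```

Callers can then test the result with a plain "if (ret)" and do not need to interpret a jiffies count.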

Applies on next-20201201, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 767 
 3 files changed, 779 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
  To compile this driver as a module, choose  M here;
  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+   tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+   depends on I2C
+   select TCG_CR50
+   help
+ This is a driver for the Google cr50 I2C TPM interface which is a
+ custom microcontroller and requires a custom i2c protocol interface
+ to handle the limitations of the hardware.  To compile this driver
+ as a module, choose M here; the module will be called tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..a374853a3b4b
--- /dev/null
+++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
@@ -0,0 +1,767 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2016 Google Inc.
+ *
+ * Based on Linux Kernel TPM driver by
+ * Peter Huewe 
+ * Copyright (C) 2011 Infineon Technologies
+ *
+ * cr50 is a firmware for H1 secure modules that requires special
+ * handling for the I2C interface.
+ *
+ * - Use an interrupt for transaction status instead of hardcoded delays.
+ * - Must use write+wait+read read protocol.
+ * - All 4 bytes of status register must be read/written at once.
+ * - Burst count max is 63 bytes, and burst count behaves slightly differently
+ *   than other I2C TPMs.
+ * - When reading from FIFO the full burstcnt must be read instead of just
+ *   reading header and determining the remainder.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "tpm_tis_core.h"
+
+#d

Re: [PATCH v4] char: tpm: add i2c driver for cr50

2020-12-02 Thread Adrian Ratiu

On Wed, 02 Dec 2020, Jarkko Sakkinen  wrote:
On Wed, Dec 02, 2020 at 12:58:05PM +0200, Adrian Ratiu wrote:
From: "dlau...@chromium.org"

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 bytes of status register must be read/written at once.
- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.

Cc: Helen Koike
Cc: Jarkko Sakkinen
Cc: Ezequiel Garcia
Signed-off-by: Duncan Laurie
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilerplate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd
Signed-off-by: Fabien Lahoudere
Signed-off-by: Adrian Ratiu
---
Changes in v4:
  - Replace force_release enum with defines (Jarkko)

Changes in v3:
  - Misc small fixes (typos/renamings, comments, default values)
  - Moved i2c_write memcpy before lock to minimize critical section (Helen)
  - Dropped priv->locality because it stored a constant value (Helen)
  - Many kdoc, function name and style fixes in general (Jarkko)
  - Kept the force release enum instead of defines or bool (Ezequiel)

Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.

Applies on next-20201201, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 767 
 3 files changed, 779 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
 	  To compile this driver as a module, choose  M here;
 	  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+	tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+	depends on I2C
+	select TCG_CR50
+	help
+	  This is a driver for the Google cr50 I2C TPM interface which is a
+	  custom microcontroller and requires a custom i2c protocol interface
+	  to handle the limitations of the hardware.  To compile this driver
+	  as a module, choose M here; the module will be called tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
 	tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
 	depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..a374853a3b4b
--- /dev/null
+++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
@@ -0,0 +1,767 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2016 Google Inc.


Should be 2020. 

+ *
+ * Based on Linux Kernel TPM driver by
+ * Peter Huewe 
+ * Copyright (C) 2011 Infineon Technologies


Not sure how this was derived. 



Indeed, I think we should just mention the original author and 
driver like the Infineon driver itself does. Thanks!


+ *
+ * cr50 is a firmware for H1 secure modules that requires special
+ * handling for the I2C interface.
+ *
+ * - Use an interrupt for transaction status instead of hardcoded delays.
+ * - Must use write+wait+read read protocol.
+ * - All 4 bytes of status register must be read/written at once.
+ * - Burst count max is 63 bytes, and burst count behaves slightly differently
+ *   th

[PATCH v5] char: tpm: add i2c driver for cr50

2020-12-03 Thread Adrian Ratiu
From: "dlau...@chromium.org" 

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 bytes of status register must be read/written at once.
- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.

Cc: Helen Koike 
Cc: Jarkko Sakkinen 
Cc: Ezequiel Garcia 
Signed-off-by: Duncan Laurie 
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilerplate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd 
Signed-off-by: Fabien Lahoudere 
Signed-off-by: Adrian Ratiu 
---
Changes in v5:
  - Fix copyright notice (Jarkko)
  - Drop CR50_NO/FORCE defines (Jarkko)
  - Rename irq handler arg dev_id -> tpm_info (Jarkko)
  - Whitespace, brackets, christmas tree, `checkpatch --strict`, W=n fixes

Changes in v4:
  - Replace force_release enum with defines (Jarkko)

Changes in v3:
  - Misc small fixes (typos/renamings, comments, default values)
  - Moved i2c_write memcpy before lock to minimize critical section (Helen)
  - Dropped priv->locality because it stored a constant value (Helen)
  - Many kdoc, function name and style fixes in general (Jarkko)
  - Kept the force release enum instead of defines or bool (Ezequiel)

Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.

Applies on next-20201201, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 777 
 3 files changed, 789 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
  To compile this driver as a module, choose  M here;
  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+   tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+   depends on I2C
+   select TCG_CR50
+   help
+ This is a driver for the Google cr50 I2C TPM interface which is a
+ custom microcontroller and requires a custom i2c protocol interface
+ to handle the limitations of the hardware.  To compile this driver
+ as a module, choose M here; the module will be called tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..0e9d2da9dcf5
--- /dev/null
+++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
@@ -0,0 +1,777 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2020 Google Inc.
+ *
+ * Based on Infineon TPM driver by Peter Huewe.
+ *
+ * cr50 is a firmware for H1 secure modules that requires special
+ * handling for the I2C interface.
+ *
+ * - Use an interrupt for transaction status instead of hardcoded delays.
+ * - Must use write+wait+read read protocol.
+ * - All 4 bytes of status register must be read/written at once.
+ * - Burst count max is 63 bytes, and burst count behaves slightly differently
+ *   than other I2C TPMs.
+ * - When reading from FIFO the full burstcnt must be read instead of just
+ *   reading header and determining t

Re: [PATCH v4 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2021-01-20 Thread Adrian Ratiu
On Tue, 19 Jan 2021, Nathan Chancellor  
wrote:
On Tue, Jan 19, 2021 at 03:17:23PM +0200, Adrian Ratiu wrote:
From: Nathan Chancellor

Drop warning because kernel now requires GCC >= v4.9 after
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9")
and clarify that -ftree-vectorize now always needs enabling for
GCC by directly testing the presence of CONFIG_CC_IS_GCC.

Another reason to remove the warning is that Clang exposes
itself as GCC < 4.6 so it triggers the warning about GCC which
doesn't make much sense and misleads Clang users by telling
them to update GCC.

Because Clang is now supported by the kernel print a clear
Clang-specific warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503
Reported-by: Nick Desaulniers
Reviewed-by: Nick Desaulniers
Signed-off-by: Nathan Chancellor
Signed-off-by: Adrian Ratiu


The commit message looks like it is written by me but I never 
added a Clang specific warning. I appreciate wanting to give me 
credit but when you change things about my original commit 
message, please make it clear that you did the edits, something 
like: 

Signed-off-by: Nathan Chancellor
[adrian: Add clang specific warning]
Signed-off-by: Adrian Ratiu



Thanks for the suggestion. Makes sense. I contemplated adding 
another patch by me on top but thought it was too much 
churn. Sorry if my edits were unclear.


---
 arch/arm/lib/xor-neon.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..f9f3601cc2d1 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,20 +14,22 @@ MODULE_LICENSE("GPL");
 #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif

+/*
+ * TODO: Even though -ftree-vectorize is enabled by default in Clang, the
+ * compiler does not produce vectorized code due to its cost model.
+ * See: https://github.com/ClangBuiltLinux/linux/issues/503
+ */
+#ifdef CONFIG_CC_IS_CLANG
+#warning Clang does not vectorize code in this file.
+#endif


I really do not like this. With the GCC specific warning, the 
user could just upgrade their GCC. With this warning, it is 
basically telling them don't use clang, in which case, it would 
just be better to disable this code altogether. I would rather 
see: 

1. Just don't build this file with clang altogether, which I 
   believe was v1's 2/2 patch. 

OR 

2. Use the pragma: 

#pragma clang loop vectorize(enable) 

as Nick suggests in v1's 2/2 patch. 

Alternatively, __restrict__ sounds like it might be beneficial 
for both GCC and clang: 

https://lore.kernel.org/lkml/20201112215033.ga438...@rani.riverdale.lan/ 
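
A minimal sketch of the two hints mentioned above, the Clang loop pragma and __restrict__, applied to a plain XOR loop. sketch_xor_2() is a hypothetical stand-in for the reference implementation and is not taken from the kernel sources; whether either hint actually yields NEON code depends on the compiler version and flags.

```c
#include <stddef.h>

/*
 * Hypothetical XOR helper: p1[i] ^= p2[i] for 'words' elements.
 * __restrict__ promises the compiler that the buffers do not overlap,
 * and the pragma asks Clang to vectorize the loop regardless of its
 * cost model. The pragma is guarded so other compilers do not see it.
 */
static void sketch_xor_2(size_t words, unsigned long *__restrict__ p1,
			 const unsigned long *__restrict__ p2)
{
	size_t i;

#ifdef __clang__
#pragma clang loop vectorize(enable)
#endif
	for (i = 0; i < words; i++)
		p1[i] ^= p2[i];
}
```

The result is the same with or without the hints; they only influence code generation.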



Option 1 from v1 got clearly NACKed by Nick a while back so the 
only option going forward is to also fix clang vectorization 
together with these changes so the warning becomes unnecessary.



 /*
  * Pull in the reference implementations while instructing GCC (through
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"

--
2.30.0



Re: [PATCH v4 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2021-01-20 Thread Adrian Ratiu
On Tue, 19 Jan 2021, Nick Desaulniers  
wrote:
On Tue, Jan 19, 2021 at 5:17 AM Adrian Ratiu 
 wrote: 


From: Nathan Chancellor  

Drop warning because kernel now requires GCC >= v4.9 after 
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9") 
and clarify that -ftree-vectorize now always needs enabling for 
GCC by directly testing the presence of CONFIG_CC_IS_GCC. 

Another reason to remove the warning is that Clang exposes 
itself as GCC < 4.6 so it triggers the warning about GCC which 
doesn't make much sense and misleads Clang users by telling 
them to update GCC. 

Because Clang is now supported by the kernel print a clear 
Clang-specific warning. 

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503 
Reported-by: Nick Desaulniers  
Reviewed-by: Nick Desaulniers  


This is not the version of the patch I had reviewed; please drop 
my reviewed-by tag when you change a patch significantly, as 
otherwise it looks like I approved this patch. 

Nacked-by: Nick Desaulniers  



Sorry for not removing the reviewed-by tags from the previous 
versions in this v4. I guess the only way forward with this is to 
actually make clang vectorization work. Also thanks for the patch 
suggestion in the other e-mail!



Signed-off-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/xor-neon.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..f9f3601cc2d1 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,20 +14,22 @@ MODULE_LICENSE("GPL");
 #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif

+/*
+ * TODO: Even though -ftree-vectorize is enabled by default in Clang, the
+ * compiler does not produce vectorized code due to its cost model.
+ * See: https://github.com/ClangBuiltLinux/linux/issues/503
+ */
+#ifdef CONFIG_CC_IS_CLANG
+#warning Clang does not vectorize code in this file.
+#endif


Arnd, remind me again why it's a bug that the compiler's cost model
says it's faster to not produce a vectorized version of these loops?
I stand by my previous comment: https://bugs.llvm.org/show_bug.cgi?id=40976#c8


+
 /*
  * Pull in the reference implementations while instructing GCC (through
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif

 #pragma GCC diagnostic ignored "-Wunused-variable"
--
2.30.0




--
Thanks,
~Nick Desaulniers


[PATCH v3 RESEND 0/2] xor-neon: Remove GCC warn & pragmas

2021-01-18 Thread Adrian Ratiu
Dear all,

This is a resend of v3 of the patch series started at
id:20201106051436.2384842-1-adrian.ra...@collabora.com

This series does not address the Clang -ftree-vectorize not
working bug which is a known pre-existing issue documented
at [1] [2] [3]. Clang vectorization needs to be investigated
in more depth and fixed separately. The purpose of this is
to only fix some low-hanging-fruit GCC related issues.

Tested on next-20210118 using GCC 10.2.0 and Clang 10.0.1.

[1] https://bugs.llvm.org/show_bug.cgi?id=40976
[2] https://github.com/ClangBuiltLinux/linux/issues/503
[3] https://github.com/ClangBuiltLinux/linux/issues/496

Kind regards,
Adrian

Adrian Ratiu (1):
  arm: lib: xor-neon: move pragma options to makefile

Nathan Chancellor (1):
  arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 17 -
 2 files changed, 1 insertion(+), 18 deletions(-)

-- 
2.30.0



[PATCH v3 RESEND 2/2] arm: lib: xor-neon: move pragma options to makefile

2021-01-18 Thread Adrian Ratiu
Using a pragma like GCC optimize is a bad idea because it tags
all functions with an __attribute__((optimize)) which replaces
optimization options rather than appending so could result in
dropping important flags. Not recommended for production use.

Because these options should always be enabled for this file,
it's better to set them via command line. tree-vectorize is on
by default in Clang, but it doesn't hurt to make it explicit.

Suggested-by: Arvind Sankar 
Suggested-by: Ard Biesheuvel 
Reviewed-by: Nick Desaulniers 
Reviewed-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 10 --
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b..12d31d1a7630 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
 
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
-  CFLAGS_xor-neon.o+= $(NEON_FLAGS)
+  CFLAGS_xor-neon.o+= $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..62b493e386c4 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
 #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif
 
-/*
- * Pull in the reference implementations while instructing GCC (through
- * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
- * NEON instructions.
- */
-#ifdef CONFIG_CC_IS_GCC
-#pragma GCC optimize "tree-vectorize"
-#endif
-
-#pragma GCC diagnostic ignored "-Wunused-variable"
 #include 
 
 struct xor_block_template const xor_block_neon_inner = {
-- 
2.30.0



[PATCH v3 RESEND 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2021-01-18 Thread Adrian Ratiu
From: Nathan Chancellor 

Drop warning because kernel now requires GCC >= v4.9 after
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9")
and clarify that -ftree-vectorize now always needs enabling
for GCC by directly testing the presence of CONFIG_CC_IS_GCC.

Another reason to remove the warning is that Clang exposes
itself as GCC < 4.6 so it triggers the warning about GCC
which doesn't make much sense and risks misleading users.
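
To illustrate why the old version test misfires, here is a small user-space sketch. It relies on Clang defining __GNUC__ and __GNUC_MINOR__ as 4 and 2 by default, so the removed "GCC >= 4.6" test evaluates false under Clang. The sketch_* helpers are hypothetical names; in the kernel the explicit identity test is CONFIG_CC_IS_GCC, for which the __clang__ check below is only a stand-in.

```c
/* Returns 1 when built with Clang, 0 otherwise. */
static int sketch_is_clang(void)
{
#ifdef __clang__
	return 1;
#else
	return 0;
#endif
}

/* The version test removed by the patch: true only for GCC >= 4.6. */
static int sketch_legacy_gcc_check(void)
{
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
	return 1;
#else
	return 0;
#endif
}

/*
 * Explicit identity test: a compiler that defines __GNUC__ but is not
 * Clang. Other GCC-compatible compilers may also land here.
 */
static int sketch_is_gcc(void)
{
#if defined(__GNUC__) && !defined(__clang__)
	return 1;
#else
	return 0;
#endif
}
```

With default settings, a Clang build takes the "old GCC" branch of the legacy test, which is the misleading warning the commit message describes.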

As a side note, -ftree-vectorize is on by default in
Clang, but it currently does not work (see linked issues).

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503
Reported-by: Nick Desaulniers 
Reviewed-by: Nick Desaulniers 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/xor-neon.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..e1e76186ec23 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"
-- 
2.30.0



Re: [PATCH v3 RESEND 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2021-01-18 Thread Adrian Ratiu

On Mon, 18 Jan 2021, Arnd Bergmann  wrote:
On Mon, Jan 18, 2021 at 11:56 AM Adrian Ratiu 
 wrote: 


From: Nathan Chancellor  

Drop warning because kernel now requires GCC >= v4.9 after 
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9") 
and clarify that -ftree-vectorize now always needs enabling for 
GCC by directly testing the presence of CONFIG_CC_IS_GCC. 

Another reason to remove the warning is that Clang exposes 
itself as GCC < 4.6 so it triggers the warning about GCC which 
doesn't make much sense and risks misleading users. 

As a side note, -ftree-vectorize is on by default in 
Clang, but it currently does not work (see linked issues). 

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503 
Reported-by: Nick Desaulniers  
Reviewed-by: Nick Desaulniers  
Signed-off-by: Nathan Chancellor  
Signed-off-by: Adrian Ratiu  


Shouldn't there be a check for whatever minimum version of clang 
produces optimized code now? As I understand it, the warning was 
originally meant to complain about both old gcc and any version 
of clang, while waiting for a new version of clang to produce 
vectorized code. 

Has that happened now? 


No, clang does not produce vectorized code by default, not even 
with the -ftree-vectorize flag explicitly added like in the next 
patch in this series (that flag is enabled by default in clang 
anyway, so it has no effect).


Clang needs more investigation and testing because with additional 
code changes it can be "forced" to output vectorized code, but 
that is outside the scope of this series.


If you think it's a good idea I can add a warning only for Clang 
which makes more sense than telling clang users to upgrade their 
GCC, since now Clang is officially supported. What do you think?





   Arnd


Re: [PATCH v6] char: tpm: add i2c driver for cr50

2020-12-14 Thread Adrian Ratiu

On Fri, 11 Dec 2020, Jarkko Sakkinen  wrote:
On Wed, Dec 09, 2020 at 02:41:45PM +0200, Adrian Ratiu wrote:
On Tue, 08 Dec 2020, Jarkko Sakkinen  wrote:
> On Mon, Dec 07, 2020 at 04:20:16PM +0200, Adrian Ratiu wrote:
> > From: "dlau...@chromium.org"
> >
> > Add TPM 2.0 compatible I2C interface for chips with cr50
> > firmware.
> >
> > The firmware running on the currently supported H1 MCU
> > requires a special driver to handle its specific protocol,
> > and this makes it unsuitable to use tpm_tis_core_* and
> > instead it must implement the underlying TPM protocol
> > similar to the other I2C TPM drivers.
> >
> > - All 4 bytes of status register must be read/written at once.
> > - FIFO and burst count is limited to 63 and must be drained
> >   by AP.
> > - Provides an interrupt to indicate when read response data
> >   is ready and when the TPM is finished processing write data.
> >
> > This driver is based on the existing infineon I2C TPM driver,
> > which most closely matches the cr50 i2c protocol behavior.
>
> Starts to look legit. Has anyone tested this?

I tested on an x86_64 Chromebook EVE (aka Google Pixelbook) by
chainloading in legacy mode and booting into a Yocto-based
userspace (meta-chromebook) where I used tpm2-tools to
communicate with the chip and also built and tested a
ChromiumOS userspace in developer mode.

I do not have access to other HW which has this chip, so it is
about as much testing as I can do to confirm the driver works on
this HW.

Adrian


So can you respond to this with tested-by. It's sufficient 
because collateral effects of driver failing are insignificant 
for the kernel as whole. 


Tested-by: Adrian Ratiu 

 
 /Jarkko


[PATCH v3] char: tpm: add i2c driver for cr50

2020-11-27 Thread Adrian Ratiu
From: "dlau...@chromium.org" 

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 bytes of status register must be read/written at once.
- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.
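
As a rough illustration of the 63-byte burst limit listed above, here is
a minimal userspace sketch of how a transfer could be split into bursts.
The names SKETCH_MAX_BURST, sketch_burst_count() and sketch_next_burst()
are hypothetical and are not taken from the driver; only the limit of 63
comes from the description above.

```c
#include <stddef.h>

/* Assumption for this sketch: one burst carries at most 63 bytes. */
#define SKETCH_MAX_BURST 63

/* Number of bursts needed to move len bytes (0 bytes need 0 bursts). */
size_t sketch_burst_count(size_t len)
{
	return (len + SKETCH_MAX_BURST - 1) / SKETCH_MAX_BURST;
}

/* Size of the next burst, given how many bytes are still pending. */
size_t sketch_next_burst(size_t remaining)
{
	return remaining < SKETCH_MAX_BURST ? remaining : SKETCH_MAX_BURST;
}
```

A caller would loop, transferring sketch_next_burst(remaining) bytes and
subtracting that from remaining until it reaches zero.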

Cc: Helen Koike 
Cc: Jarkko Sakkinen 
Cc: Ezequiel Garcia 
Signed-off-by: Duncan Laurie 
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilerplate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd 
Signed-off-by: Fabien Lahoudere 
Signed-off-by: Adrian Ratiu 
---
Changes in v3:
  - Misc small fixes (typos/renamings, comments, default values)
  - Moved i2c_write memcpy before lock to minimize critical section (Helen)
  - Dropped priv->locality because it stored a constant value (Helen)
  - Many kdoc, function name and style fixes in general (Jarkko)
  - Kept the force release enum instead of defines or bool (Ezequiel)

Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.

Applies on next-20201127, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 770 
 3 files changed, 782 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
  To compile this driver as a module, choose  M here;
  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+   tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+   depends on I2C
+   select TCG_CR50
+   help
+ This is a driver for the Google cr50 I2C TPM interface which is a
+ custom microcontroller and requires a custom i2c protocol interface
+ to handle the limitations of the hardware.  To compile this driver
+ as a module, choose M here; the module will be called 
tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c 
b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..896bf0163150
--- /dev/null
+++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
@@ -0,0 +1,770 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2016 Google Inc.
+ *
+ * Based on Linux Kernel TPM driver by
+ * Peter Huewe 
+ * Copyright (C) 2011 Infineon Technologies
+ *
+ * cr50 is a firmware for H1 secure modules that requires special
+ * handling for the I2C interface.
+ *
+ * - Use an interrupt for transaction status instead of hardcoded delays.
+ * - Must use write+wait+read read protocol.
+ * - All 4 bytes of status register must be read/written at once.
+ * - Burst count max is 63 bytes, and burst count behaves slightly differently
+ *   than other I2C TPMs.
+ * - When reading from FIFO the full burstcnt must be read instead of just
+ *   reading header and determining the remainder.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "tpm_tis_core.h"
+
+#define TPM_CR50_MAX_BUFSIZE   64
+#define TPM_CR50_TIMEOUT_SHORT_MS  2   /

[PATCH v2] char: tpm: add i2c driver for cr50

2020-11-20 Thread Adrian Ratiu
From: "dlau...@chromium.org" 

Add TPM 2.0 compatible I2C interface for chips with cr50 firmware.

The firmware running on the currently supported H1 MCU requires a
special driver to handle its specific protocol, and this makes it
unsuitable to use tpm_tis_core_* and instead it must implement the
underlying TPM protocol similar to the other I2C TPM drivers.

- All 4 bytes of status register must be read/written at once.
- FIFO and burst count is limited to 63 and must be drained by AP.
- Provides an interrupt to indicate when read response data is ready
and when the TPM is finished processing write data.

This driver is based on the existing infineon I2C TPM driver, which
most closely matches the cr50 i2c protocol behavior.

Cc: Helen Koike 
Signed-off-by: Duncan Laurie 
[swb...@chromium.org: Depend on i2c even if it's a module, replace
boilerplate with SPDX tag, drop asm/byteorder.h include, simplify
return from probe]
Signed-off-by: Stephen Boyd 
Signed-off-by: Fabien Lahoudere 
Signed-off-by: Adrian Ratiu 
---
Changes in v2:
  - Various small fixes all over (reorder includes, MAX_BUFSIZE, comments, etc)
  - Reworked return values of i2c_wait_tpm_ready() to fix timeout mis-handling
so ret == 0 now means success, the wait period jiffies is ignored because that
number is meaningless and return a proper timeout error in case jiffies == 0.
  - Make i2c default to 1 message per transfer (requested by Helen)
  - Move -EIO error reporting to transfer function to cleanup transfer() itself
and its R/W callers
  - Remove magic value hardcodings and introduce enum force_release.

v1 posted at https://lkml.org/lkml/2020/2/25/349

Applies on next-20201120, tested on Chromebook EVE.
---
 drivers/char/tpm/Kconfig|  10 +
 drivers/char/tpm/Makefile   |   2 +
 drivers/char/tpm/tpm_tis_i2c_cr50.c | 768 
 3 files changed, 780 insertions(+)
 create mode 100644 drivers/char/tpm/tpm_tis_i2c_cr50.c

diff --git a/drivers/char/tpm/Kconfig b/drivers/char/tpm/Kconfig
index a18c314da211..4308f9ca7a43 100644
--- a/drivers/char/tpm/Kconfig
+++ b/drivers/char/tpm/Kconfig
@@ -86,6 +86,16 @@ config TCG_TIS_SYNQUACER
  To compile this driver as a module, choose  M here;
  the module will be called tpm_tis_synquacer.
 
+config TCG_TIS_I2C_CR50
+   tristate "TPM Interface Specification 2.0 Interface (I2C - CR50)"
+   depends on I2C
+   select TCG_CR50
+   help
+ This is a driver for the Google cr50 I2C TPM interface which is a
+ custom microcontroller and requires a custom i2c protocol interface
+ to handle the limitations of the hardware.  To compile this driver
+ as a module, choose M here; the module will be called 
tcg_tis_i2c_cr50.
+
 config TCG_TIS_I2C_ATMEL
tristate "TPM Interface Specification 1.2 Interface (I2C - Atmel)"
depends on I2C
diff --git a/drivers/char/tpm/Makefile b/drivers/char/tpm/Makefile
index 84db4fb3a9c9..66d39ea6bd10 100644
--- a/drivers/char/tpm/Makefile
+++ b/drivers/char/tpm/Makefile
@@ -27,6 +27,8 @@ obj-$(CONFIG_TCG_TIS_SPI) += tpm_tis_spi.o
 tpm_tis_spi-y := tpm_tis_spi_main.o
 tpm_tis_spi-$(CONFIG_TCG_TIS_SPI_CR50) += tpm_tis_spi_cr50.o
 
+obj-$(CONFIG_TCG_TIS_I2C_CR50) += tpm_tis_i2c_cr50.o
+
 obj-$(CONFIG_TCG_TIS_I2C_ATMEL) += tpm_i2c_atmel.o
 obj-$(CONFIG_TCG_TIS_I2C_INFINEON) += tpm_i2c_infineon.o
 obj-$(CONFIG_TCG_TIS_I2C_NUVOTON) += tpm_i2c_nuvoton.o
diff --git a/drivers/char/tpm/tpm_tis_i2c_cr50.c 
b/drivers/char/tpm/tpm_tis_i2c_cr50.c
new file mode 100644
index ..37555dafdca0
--- /dev/null
+++ b/drivers/char/tpm/tpm_tis_i2c_cr50.c
@@ -0,0 +1,768 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2016 Google Inc.
+ *
+ * Based on Linux Kernel TPM driver by
+ * Peter Huewe 
+ * Copyright (C) 2011 Infineon Technologies
+ */
+
+/*
+ * cr50 is a firmware for H1 secure modules that requires special
+ * handling for the I2C interface.
+ *
+ * - Use an interrupt for transaction status instead of hardcoded delays
+ * - Must use write+wait+read read protocol
+ * - All 4 bytes of status register must be read/written at once
+ * - Burst count max is 63 bytes, and burst count behaves
+ *   slightly differently than other I2C TPMs
+ * - When reading from FIFO the full burstcnt must be read
+ *   instead of just reading header and determining the remainder
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "tpm_tis_core.h"
+
+#define CR50_MAX_BUFSIZE   64
+#define CR50_TIMEOUT_SHORT_MS  2   /* Short timeout during transactions */
+#define CR50_TIMEOUT_NOIRQ_MS  20  /* Timeout for TPM ready without IRQ */
+#define CR50_I2C_DID_VID   0x00281ae0L
+#define CR50_I2C_MAX_RETRIES   3   /* Max retries due to I2C errors */
+#define CR50_I2C_RETRY_DELAY_LO55  /* Min usecs between retries on 
I2C */
+#define CR50_I2C_RETRY_DELAY_HI   

[PATCH 2/2] brcmfmac: fix suspend/resume when power is cut off

2019-09-25 Thread Adrian Ratiu
brcmfmac assumed the wifi device always remains powered on and thus
hardcoded the MMC_PM_KEEP_POWER flag expecting the wifi device to
remain on even during suspend/resume cycles.

This is not always the case: some appliances cut power to everything
connected via SDIO for efficiency reasons and this leads to wifi not
being usable after coming out of suspend because the device was not
correctly reinitialized.

So we check for the keep_power capability and if it's not present then
we remove the device and probe it again during resume to mirror what's
happening in hardware and ensure correct reinitialization in the case
when MMC_PM_KEEP_POWER is not supported.

Suggested-by: Gustavo Padovan 
Signed-off-by: Adrian Ratiu 
---
 .../broadcom/brcm80211/brcmfmac/bcmsdh.c  | 53 ++-
 1 file changed, 39 insertions(+), 14 deletions(-)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c 
b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
index fc12598b2dd3..96fd8e2bf773 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/bcmsdh.c
@@ -1108,7 +1108,8 @@ static int brcmf_ops_sdio_suspend(struct device *dev)
struct sdio_func *func;
struct brcmf_bus *bus_if;
struct brcmf_sdio_dev *sdiodev;
-   mmc_pm_flag_t sdio_flags;
+   mmc_pm_flag_t pm_caps, sdio_flags;
+   int ret = 0;
 
func = container_of(dev, struct sdio_func, dev);
brcmf_dbg(SDIO, "Enter: F%d\n", func->num);
@@ -1119,19 +1120,33 @@ static int brcmf_ops_sdio_suspend(struct device *dev)
bus_if = dev_get_drvdata(dev);
sdiodev = bus_if->bus_priv.sdio;
 
-   brcmf_sdiod_freezer_on(sdiodev);
-   brcmf_sdio_wd_timer(sdiodev->bus, 0);
+   pm_caps = sdio_get_host_pm_caps(func);
+
+   if (pm_caps & MMC_PM_KEEP_POWER) {
+   /* preserve card power during suspend */
+   brcmf_sdiod_freezer_on(sdiodev);
+   brcmf_sdio_wd_timer(sdiodev->bus, 0);
+
+   sdio_flags = MMC_PM_KEEP_POWER;
+   if (sdiodev->wowl_enabled) {
+   if (sdiodev->settings->bus.sdio.oob_irq_supported)
+   
enable_irq_wake(sdiodev->settings->bus.sdio.oob_irq_nr);
+   else
+   sdio_flags |= MMC_PM_WAKE_SDIO_IRQ;
+   }
+
+   if (sdio_set_host_pm_flags(sdiodev->func1, sdio_flags))
+   brcmf_err("Failed to set pm_flags %x\n", sdio_flags);
 
-   sdio_flags = MMC_PM_KEEP_POWER;
-   if (sdiodev->wowl_enabled) {
-   if (sdiodev->settings->bus.sdio.oob_irq_supported)
-   enable_irq_wake(sdiodev->settings->bus.sdio.oob_irq_nr);
-   else
-   sdio_flags |= MMC_PM_WAKE_SDIO_IRQ;
+   } else {
+   /* power will be cut so remove device, probe again in resume */
+   brcmf_sdiod_intr_unregister(sdiodev);
+   ret = brcmf_sdiod_remove(sdiodev);
+   if (ret)
+   brcmf_err("Failed to remove device on suspend\n");
}
-   if (sdio_set_host_pm_flags(sdiodev->func1, sdio_flags))
-   brcmf_err("Failed to set pm_flags %x\n", sdio_flags);
-   return 0;
+
+   return ret;
 }
 
 static int brcmf_ops_sdio_resume(struct device *dev)
@@ -1139,13 +1154,23 @@ static int brcmf_ops_sdio_resume(struct device *dev)
struct brcmf_bus *bus_if = dev_get_drvdata(dev);
struct brcmf_sdio_dev *sdiodev = bus_if->bus_priv.sdio;
struct sdio_func *func = container_of(dev, struct sdio_func, dev);
+   mmc_pm_flag_t pm_caps = sdio_get_host_pm_caps(func);
+   int ret = 0;
 
brcmf_dbg(SDIO, "Enter: F%d\n", func->num);
if (func->num != 2)
return 0;
 
-   brcmf_sdiod_freezer_off(sdiodev);
-   return 0;
+   if (!(pm_caps & MMC_PM_KEEP_POWER)) {
+   /* bus was powered off and device removed, probe again */
+   ret = brcmf_sdiod_probe(sdiodev);
+   if (ret)
+   brcmf_err("Failed to probe device on resume\n");
+   } else {
+   brcmf_sdiod_freezer_off(sdiodev);
+   }
+
+   return ret;
 }
 
 static const struct dev_pm_ops brcmf_sdio_pm_ops = {
-- 
2.23.0
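
The suspend/resume decision made by the patch above can be modelled in
plain userspace C. This is only a sketch: SKETCH_PM_KEEP_POWER stands in
for the kernel's MMC_PM_KEEP_POWER capability bit (its value here is
chosen for illustration), and the enum and function names are
hypothetical, not part of brcmfmac.

```c
#include <stdbool.h>

/* Stand-in for MMC_PM_KEEP_POWER; the value is an assumption. */
#define SKETCH_PM_KEEP_POWER (1u << 0)

enum sketch_action {
	SKETCH_KEEP_POWERED,	/* freeze, set pm flags, stay bound */
	SKETCH_REMOVE_REPROBE,	/* remove on suspend, probe on resume */
};

/* Pick the suspend path from the host's PM capabilities. */
enum sketch_action sketch_suspend_action(unsigned int pm_caps)
{
	if (pm_caps & SKETCH_PM_KEEP_POWER)
		return SKETCH_KEEP_POWERED;
	return SKETCH_REMOVE_REPROBE;
}

/* Resume mirrors suspend: probe again only if the device was removed. */
bool sketch_resume_needs_probe(unsigned int pm_caps)
{
	return sketch_suspend_action(pm_caps) == SKETCH_REMOVE_REPROBE;
}
```

Both paths derive from the same capability check, so suspend and resume
cannot disagree about whether the device was removed.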



[PATCH 1/2] brcmfmac: don't WARN when there are no requests

2019-09-25 Thread Adrian Ratiu
When n_reqs == 0 there is nothing to do so it doesn't make sense to
search for requests and issue a warning because none is found.

Signed-off-by: Martyn Welch 
Signed-off-by: Adrian Ratiu 
---
 drivers/net/wireless/broadcom/brcm80211/brcmfmac/pno.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pno.c 
b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pno.c
index 14e530601ef3..fabfbb0b40b0 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pno.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/pno.c
@@ -57,6 +57,10 @@ static int brcmf_pno_remove_request(struct brcmf_pno_info 
*pi, u64 reqid)
 
mutex_lock(&pi->req_lock);
 
+   /* Nothing to do if we have no requests */
+   if (pi->n_reqs == 0)
+   goto done;
+
/* find request */
for (i = 0; i < pi->n_reqs; i++) {
if (pi->reqs[i]->reqid == reqid)
-- 
2.23.0
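
The early exit added above can be sketched in userspace as follows. The
struct, the warning counter and the return values (0 and -1) are
assumptions made for this example; they are not copied from pno.c, where
the warning is issued by the kernel and locking is involved.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified request table. */
struct sketch_pno {
	size_t n_reqs;
	uint64_t reqids[4];
	int warnings;		/* counts "request not found" warnings */
};

/* Returns 0 on success or when there is nothing to do, -1 if not found. */
int sketch_remove_request(struct sketch_pno *pi, uint64_t reqid)
{
	size_t i;

	/* Nothing to do if we have no requests: skip the search and warning. */
	if (pi->n_reqs == 0)
		return 0;

	/* Find the request. */
	for (i = 0; i < pi->n_reqs; i++)
		if (pi->reqids[i] == reqid)
			break;

	if (i == pi->n_reqs) {
		pi->warnings++;
		return -1;
	}

	/* Close the gap left by the removed entry. */
	for (; i + 1 < pi->n_reqs; i++)
		pi->reqids[i] = pi->reqids[i + 1];
	pi->n_reqs--;
	return 0;
}
```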



Re: [PATCH v7 4/8] drm: imx: Add i.MX 6 MIPI DSI host platform driver

2020-04-28 Thread Adrian Ratiu

Hi Daniel,

On Tue, 28 Apr 2020, Daniel Vetter  wrote:
On Wed, Apr 22, 2020 at 04:07:27AM +0300, Laurent Pinchart 
wrote: 
Hi Adrian,  On Tue, Apr 21, 2020 at 07:16:06PM +0300, Adrian 
Ratiu wrote: 
> This adds support for the Synopsys DesignWare MIPI DSI v1.01 
> host controller which is embedded in i.MX 6 SoCs.   Based on 
> following patches, but updated/extended to work with existing 
> support found in the kernel:  - drm: imx: Support Synopsys 
> DesignWare MIPI DSI host controller 
>   Signed-off-by: Liu Ying  
>  Cc: Fabio Estevam  Cc: Enric Balletbo 
> Serra  Reviewed-by: Emil Velikov 
>  Tested-by: Adrian Pop 
>  Tested-by: Arnaud Ferraris 
>  Signed-off-by: Sjoerd Simons 
>  Signed-off-by: Martyn Welch 
>  Signed-off-by: Adrian Ratiu 
>  --- Changes since v6: 
>   - Replaced custom noop encoder with the simple drm encoder 
>   (Enric) - Added CONFIG_DRM_IMX6_MIPI_DSI depends on 
>   CONFIG_OF (Enric) - Dropped imx_mipi_dsi_register() because 
>   now it only creates the dummy encoder which can easily be 
>   done directly in imx_dsi_bind() 
>  Changes since v5: 
>   - Reword to remove unrelated device tree patch mention 
>   (Fabio) - Move pllref_clk enable/disable to bind/unbind 
>   (Ezequiel) - Fix freescale.com -> nxp.com email addresses 
>   (Fabio) - Also added myself as module author (Fabio) - Use 
>   DRM_DEV_* macros for consistency, print more error msg 
>  Changes since v4: 
>   - Split off driver-specific configuration of phy timings 
>   due to new upstream API.  - Move regmap infrastructure 
>   logic to separate commit (Ezequiel) - Move dsi v1.01 layout 
>   addition to a separate commit (Ezequiel) - Minor warnings 
>   and driver name fixes 
>  Changes since v3: 
>   - Renamed platform driver to reflect it's i.MX6 
>   only. (Fabio) 
>  Changes since v2: 
>   - Fixed commit tags. (Emil) 
>  Changes since v1: 
>   - Moved register definitions & regmap initialization into 
>   bridge module. Platform drivers get the regmap via 
>   plat_data after calling the bridge probe. (Emil) 
> --- 
>  drivers/gpu/drm/imx/Kconfig|   8 + 
>  drivers/gpu/drm/imx/Makefile   |   1 + 
>  drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c | 391 
>  + 3 files changed, 400 insertions(+) 
>  create mode 100644 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c 
>  diff --git a/drivers/gpu/drm/imx/Kconfig 
> b/drivers/gpu/drm/imx/Kconfig index 
> 207bf7409dfba..0dffc72df7922 100644 --- 
> a/drivers/gpu/drm/imx/Kconfig +++ 
> b/drivers/gpu/drm/imx/Kconfig @@ -39,3 +39,11 @@ config 
> DRM_IMX_HDMI 
>  	depends on DRM_IMX help Choose this if you want to use 
>  HDMI on i.MX6. 
> + +config DRM_IMX6_MIPI_DSI +	tristate "Freescale i.MX6 
> DRM MIPI DSI" +	select DRM_DW_MIPI_DSI +	depends on 
> DRM_IMX +	depends on OF +	help +	  Choose this if you want 
> to use MIPI DSI on i.MX6.  diff --git 
> a/drivers/gpu/drm/imx/Makefile b/drivers/gpu/drm/imx/Makefile 
> index 21cdcc2faabc8..9a7843c593478 100644 --- 
> a/drivers/gpu/drm/imx/Makefile +++ 
> b/drivers/gpu/drm/imx/Makefile @@ -9,3 +9,4 @@ 
> obj-$(CONFIG_DRM_IMX_TVE) += imx-tve.o 
>  obj-$(CONFIG_DRM_IMX_LDB) += imx-ldb.o 
>  obj-$(CONFIG_DRM_IMX_HDMI) += dw_hdmi-imx.o 
> +obj-$(CONFIG_DRM_IMX6_MIPI_DSI) += dw_mipi_dsi-imx6.o diff 
> --git a/drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c 
> b/drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c new file mode 100644 
> index 0..f8a0a4fe16e21 --- /dev/null +++ 
> b/drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c @@ -0,0 +1,391 @@ 
> +// SPDX-License-Identifier: GPL-2.0+ +/* + * i.MX6 drm 
> driver - MIPI DSI Host Controller + * + * Copyright (C) 
> 2011-2015 Freescale Semiconductor, Inc.  + * Copyright (C) 
> 2019-2020 Collabora, Ltd.  + */ + +#include  
> +#include  +#include  
> +#include  +#include 
>  +#include  +#include 
>  +#include  +#include 
>  +#include  
> +#include  +#include  
> +#include  + +#include "imx-drm.h" + 
> +#define DSI_PWR_UP			0x04 +#define 
> RESET0 +#define POWERUP 
> BIT(0) + +#define DSI_PHY_IF_CTRL			0x5c 
> +#define PHY_IF_CTRL_RESET		0x0 + +#define 
> DSI_PHY_TST_CTRL0		0x64 +#define PHY_TESTCLK 
> BIT(1) +#define PHY_UNTESTCLK			0 +#define 
> PHY_TESTCLR			BIT(0) +#define 
> PHY_UNTESTCLR			0 + +#define 
> DSI_PHY_TST_CTRL1		0x68 +#define PHY_TESTEN 
> BIT(16) +#define PHY_UNTESTEN			0 +#define 
> PHY_TESTDOUT(n)			(((n) & 0xff) << 8) 
> +#define PHY_TESTDIN(n)			(((n) & 0xff) << 
> 0) + +struct imx_mipi_dsi { +	struct drm_encoder 
> encoder; +	struct device *dev; +	struct regmap *mux_sel; + 
> struct dw_mipi_dsi *mipi_dsi; +	struct clk *pllref_clk; + 
> +	void __iomem *base; +	unsigned int lane_mbps; +}; + 
> +struct dph

[PATCH] selftests/bpf: Add arm target register definitions

2019-03-04 Thread Adrian Ratiu
eBPF "restricted C" code can be compiled with LLVM/clang using target
triplets like armv7l-unknown-linux-gnueabihf and loaded/run with small
cross-compiled gobpf/elf [1] programs without requiring a full BCC
port which is also undesirable on small embedded systems due to its
size footprint. The only missing pieces are these helper macros which
otherwise have to be redefined by each eBPF arm program.

[1] https://github.com/iovisor/gobpf/tree/master/elf

Signed-off-by: Adrian Ratiu 
---
 tools/testing/selftests/bpf/bpf_helpers.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/tools/testing/selftests/bpf/bpf_helpers.h 
b/tools/testing/selftests/bpf/bpf_helpers.h
index 6c77cf7bedce..f7883576f445 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -232,6 +232,9 @@ static int (*bpf_skb_pull_data)(void *, int len) =
 #elif defined(__TARGET_ARCH_s930x)
#define bpf_target_s930x
#define bpf_target_defined
+#elif defined(__TARGET_ARCH_arm)
+   #define bpf_target_arm
+   #define bpf_target_defined
 #elif defined(__TARGET_ARCH_arm64)
#define bpf_target_arm64
#define bpf_target_defined
@@ -254,6 +257,8 @@ static int (*bpf_skb_pull_data)(void *, int len) =
#define bpf_target_x86
 #elif defined(__s390x__)
#define bpf_target_s930x
+#elif defined(__arm__)
+   #define bpf_target_arm
 #elif defined(__aarch64__)
#define bpf_target_arm64
 #elif defined(__mips__)
@@ -291,6 +296,19 @@ static int (*bpf_skb_pull_data)(void *, int len) =
 #define PT_REGS_SP(x) ((x)->gprs[15])
 #define PT_REGS_IP(x) ((x)->psw.addr)
 
+#elif defined(bpf_target_arm)
+
+#define PT_REGS_PARM1(x) ((x)->uregs[0])
+#define PT_REGS_PARM2(x) ((x)->uregs[1])
+#define PT_REGS_PARM3(x) ((x)->uregs[2])
+#define PT_REGS_PARM4(x) ((x)->uregs[3])
+#define PT_REGS_PARM5(x) ((x)->uregs[4])
+#define PT_REGS_RET(x) ((x)->uregs[14])
+#define PT_REGS_FP(x) ((x)->uregs[11]) /* Works only with CONFIG_FRAME_POINTER 
*/
+#define PT_REGS_RC(x) ((x)->uregs[0])
+#define PT_REGS_SP(x) ((x)->uregs[13])
+#define PT_REGS_IP(x) ((x)->uregs[12])
+
 #elif defined(bpf_target_arm64)
 
 #define PT_REGS_PARM1(x) ((x)->regs[0])
-- 
2.20.1
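
To show how the new accessors are meant to be used, here is a small
userspace sketch. struct sketch_pt_regs is a stand-in for the 32-bit ARM
struct pt_regs (an array of 18 saved words), and
sketch_sum_first_two_args() is a hypothetical helper; only a subset of
the macros from the patch is copied here.

```c
/* Userspace stand-in for the 32-bit ARM struct pt_regs. */
struct sketch_pt_regs {
	unsigned long uregs[18];
};

/* Subset of the accessors added by the patch above. */
#define PT_REGS_PARM1(x) ((x)->uregs[0])
#define PT_REGS_PARM2(x) ((x)->uregs[1])
#define PT_REGS_RC(x) ((x)->uregs[0])
#define PT_REGS_SP(x) ((x)->uregs[13])

/* Hypothetical helper: add the first two call arguments. */
unsigned long sketch_sum_first_two_args(const struct sketch_pt_regs *regs)
{
	return PT_REGS_PARM1(regs) + PT_REGS_PARM2(regs);
}
```

Note that PT_REGS_PARM1() and PT_REGS_RC() read the same slot, so which
one is meaningful depends on whether the probe fires on entry or return.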



[PATCH 02/18] media: hantro: make consistent use of decimal register notation

2020-10-12 Thread Adrian Ratiu
This header used a combination of direct hex offsets and decimal register
notation - via the G1_SWREG() macro - which is annoying when comparing with
the ref manuals which always use the equivalent of G1_SWREG(), so convert
the entire file to G1_SWREG() notation.
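
For reference, the mapping between the two notations is just a multiply
by four. The sketch below reuses the G1_SWREG() macro from the header;
sketch_swreg_index() is a hypothetical helper added only to show the
reverse direction.

```c
/* Same definition as in hantro_g1_regs.h: register number to byte offset. */
#define G1_SWREG(nr) ((nr) * 4)

/* Hypothetical inverse: byte offset back to the manual's register number. */
unsigned int sketch_swreg_index(unsigned int offset)
{
	return offset / 4;
}
```

So the old 0x004 becomes G1_SWREG(1), 0x010 becomes G1_SWREG(4) and
0x01c becomes G1_SWREG(7), matching the conversions in the diff below.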

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro_g1_regs.h | 52 +--
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_regs.h 
b/drivers/staging/media/hantro/hantro_g1_regs.h
index 80ff297f6f68..073b64cbe295 100644
--- a/drivers/staging/media/hantro/hantro_g1_regs.h
+++ b/drivers/staging/media/hantro/hantro_g1_regs.h
@@ -9,10 +9,10 @@
 #ifndef HANTRO_G1_REGS_H_
 #define HANTRO_G1_REGS_H_
 
-#define G1_SWREG(nr) ((nr) * 4)
+#define G1_SWREG(nr)   ((nr) * 4)
 
 /* Decoder registers. */
-#define G1_REG_INTERRUPT   0x004
+#define G1_REG_INTERRUPT   G1_SWREG(1)
 /* Interrupt bits. Some are present in:
  *- all core versions (">= g1")
  *- g1, missing in g2, but added back starting with vc8000d ("not in g2")
@@ -41,7 +41,7 @@
 #define G1_REG_INTERRUPT_DEC_BUS_INT_DIS   BIT(2) /* >= vc8000d */
 #define G1_REG_INTERRUPT_DEC_STRM_CORRUPTEDBIT(1) /* >= 
vc8000d */
 #define G1_REG_INTERRUPT_DEC_E BIT(0) /* >= g1 */
-#define G1_REG_CONFIG  0x008
+#define G1_REG_CONFIG  G1_SWREG(2)
 #define G1_REG_CONFIG_DEC_AXI_RD_ID(x) (((x) & 0xff) << 24)
 #define G1_REG_CONFIG_DEC_TIMEOUT_EBIT(23)
 #define G1_REG_CONFIG_DEC_STRSWAP32_E  BIT(22)
@@ -60,7 +60,7 @@
 #define G1_REG_CONFIG_DEC_ADV_PRE_DIS  BIT(6)
 #define G1_REG_CONFIG_DEC_SCMD_DIS BIT(5)
 #define G1_REG_CONFIG_DEC_MAX_BURST(x) (((x) & 0x1f) << 0)
-#define G1_REG_DEC_CTRL0   0x00c
+#define G1_REG_DEC_CTRL0   G1_SWREG(3)
 #define G1_REG_DEC_CTRL0_DEC_MODE(x)   (((x) & 0xf) << 28)
 #define G1_REG_DEC_CTRL0_RLC_MODE_EBIT(27)
 #define G1_REG_DEC_CTRL0_SKIP_MODE BIT(26)
@@ -85,7 +85,7 @@
 #define G1_REG_DEC_CTRL0_PICORD_COUNT_EBIT(9)
 #define G1_REG_DEC_CTRL0_DEC_AHB_HLOCK_E   BIT(8)
 #define G1_REG_DEC_CTRL0_DEC_AXI_WR_ID(x)  (((x) & 0xff) << 0)
-#define G1_REG_DEC_CTRL1   0x010
+#define G1_REG_DEC_CTRL1   G1_SWREG(4)
 #define G1_REG_DEC_CTRL1_PIC_MB_WIDTH(x)   (((x) & 0x1ff) << 23)
 #define G1_REG_DEC_CTRL1_MB_WIDTH_OFF(x)   (((x) & 0xf) << 19)
 #define G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(x)(((x) & 0xff) 
<< 11)
@@ -96,7 +96,7 @@
 #define G1_REG_DEC_CTRL1_PIC_MB_W_EXT(x)   (((x) & 0x7) << 3)
 #define G1_REG_DEC_CTRL1_PIC_MB_H_EXT(x)   (((x) & 0x7) << 0)
 #define G1_REG_DEC_CTRL1_PIC_REFER_FLAGBIT(0)
-#define G1_REG_DEC_CTRL2   0x014
+#define G1_REG_DEC_CTRL2   G1_SWREG(5)
 #define G1_REG_DEC_CTRL2_STRM_START_BIT(x) (((x) & 0x3f) << 26)
 #define G1_REG_DEC_CTRL2_SYNC_MARKER_E BIT(25)
 #define G1_REG_DEC_CTRL2_TYPE1_QUANT_E BIT(24)
@@ -139,13 +139,13 @@
 #define G1_REG_DEC_CTRL2_BOOLEAN_RANGE(x)  (((x) & 0xff) << 0)
 #define G1_REG_DEC_CTRL2_ALPHA_OFFSET(x)   (((x) & 0x1f) << 5)
 #define G1_REG_DEC_CTRL2_BETA_OFFSET(x)(((x) & 0x1f) << 0)
-#define G1_REG_DEC_CTRL3   0x018
+#define G1_REG_DEC_CTRL3   G1_SWREG(6)
 #define G1_REG_DEC_CTRL3_START_CODE_E  BIT(31)
 #define G1_REG_DEC_CTRL3_INIT_QP(x)(((x) & 0x3f) 
<< 25)
 #define G1_REG_DEC_CTRL3_CH_8PIX_ILEAV_E   BIT(24)
 #define G1_REG_DEC_CTRL3_STREAM_LEN_EXT(x) (((x) & 0xff) << 24)
 #define G1_REG_DEC_CTRL3_STREAM_LEN(x) (((x) & 0xff) << 0)
-#define G1_REG_DEC_CTRL4   0x01c
+#define G1_REG_DEC_CTRL4   G1_SWREG(7)
 #define G1_REG_DEC_CTRL4_CABAC_E   BIT(31)
 #define G1_REG_DEC_CTRL4_BLACKWHITE_E  BIT(30)
 #define G1_REG_DEC_CTRL4_DIR_8X8_INFER_E   BIT(29)
@@ -182,7 +182,7 @@
 #define G1_REG_DEC_CTRL4_INIT_DC_MATCH0(x) (((x) & 0x7) << 9)
 #define G1_REG_DEC_CTRL4_INIT_DC_MATCH1(x) (((x) & 0x7) << 6)
 #define G1_REG_DEC_CTRL4_VP7_VERSIO

[PATCH 01/18] media: hantro: document all int reg bits up to vc8000

2020-10-12 Thread Adrian Ratiu
These do not all strictly belong to the g1 core and even the majority
of previously documented bits were not used (yet) by the driver irq
handlers, but it's still very useful to have an overview of all IRQs,
especially since starting with core versions vc8000 and later the irq
bits previously used by G1 and G2 have been merged at the same address.

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro_g1_regs.h | 39 +--
 1 file changed, 28 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_regs.h 
b/drivers/staging/media/hantro/hantro_g1_regs.h
index c1756e3d5391..80ff297f6f68 100644
--- a/drivers/staging/media/hantro/hantro_g1_regs.h
+++ b/drivers/staging/media/hantro/hantro_g1_regs.h
@@ -13,17 +13,34 @@
 
 /* Decoder registers. */
 #define G1_REG_INTERRUPT   0x004
-#define G1_REG_INTERRUPT_DEC_PIC_INF   BIT(24)
-#define G1_REG_INTERRUPT_DEC_TIMEOUT   BIT(18)
-#define G1_REG_INTERRUPT_DEC_SLICE_INT BIT(17)
-#define G1_REG_INTERRUPT_DEC_ERROR_INT BIT(16)
-#define G1_REG_INTERRUPT_DEC_ASO_INT   BIT(15)
-#define G1_REG_INTERRUPT_DEC_BUFFER_INTBIT(14)
-#define G1_REG_INTERRUPT_DEC_BUS_INT   BIT(13)
-#define G1_REG_INTERRUPT_DEC_RDY_INT   BIT(12)
-#define G1_REG_INTERRUPT_DEC_IRQ   BIT(8)
-#define G1_REG_INTERRUPT_DEC_IRQ_DIS   BIT(4)
-#define G1_REG_INTERRUPT_DEC_E BIT(0)
+/* Interrupt bits. Some are present in:
+ *- all core versions (">= g1")
+ *- g1, missing in g2, but added back starting with vc8000d ("not in g2")
+ *- vc8000d and later (">= vc8000d")
+ */
+#define G1_REG_INTERRUPT_DEC_PIC_INF   BIT(24) /* not in g2 */
+#define G1_REG_INTERRUPT_DEC_TILE_INT  BIT(23) /* >= vc8000d */
+#define G1_REG_INTERRUPT_DEC_LINE_CNT_INT  BIT(22) /* >= vc8000d */
+#define G1_REG_INTERRUPT_DEC_EXT_TIMEOUT_INT   BIT(21) /* >= vc8000d */
+#define G1_REG_INTERRUPT_DEC_NO_SLICE_INT  BIT(20) /* >= vc8000d */
+#define G1_REG_INTERRUPT_DEC_LAST_SLICE_INTBIT(19) /* >= 
vc8000d */
+#define G1_REG_INTERRUPT_DEC_TIMEOUT   BIT(18) /* >= g1 */
+#define G1_REG_INTERRUPT_DEC_SLICE_INT BIT(17) /* not in g2 */
+#define G1_REG_INTERRUPT_DEC_ERROR_INT BIT(16) /* >= g1 */
+#define G1_REG_INTERRUPT_DEC_ASO_INT   BIT(15) /* not in g2 */
+#define G1_REG_INTERRUPT_DEC_BUFFER_INTBIT(14) /* >= g1 */
+#define G1_REG_INTERRUPT_DEC_BUS_INT   BIT(13) /* >= g1 */
+#define G1_REG_INTERRUPT_DEC_RDY_INT   BIT(12) /* >= g1 */
+#define G1_REG_INTERRUPT_DEC_ABORT_INT BIT(11) /* >= g2 */
+#define G1_REG_INTERRUPT_DEC_IRQ   BIT(8) /* >= g1 */
+#define G1_REG_INTERRUPT_DEC_TILE_INT_EBIT(7) /* >= vc8000d */
+#define G1_REG_INTERRUPT_DEC_SELF_RESET_DISBIT(6) /* >= 
vc8000d */
+#define G1_REG_INTERRUPT_DEC_ABORT_E   BIT(5) /* >= vc8000d */
+#define G1_REG_INTERRUPT_DEC_IRQ_DIS   BIT(4) /* >= g1 */
+#define G1_REG_INTERRUPT_DEC_TIMEOUT_SOURCEBIT(3) /* >= 
vc8000d */
+#define G1_REG_INTERRUPT_DEC_BUS_INT_DIS   BIT(2) /* >= vc8000d */
+#define G1_REG_INTERRUPT_DEC_STRM_CORRUPTEDBIT(1) /* >= 
vc8000d */
+#define G1_REG_INTERRUPT_DEC_E BIT(0) /* >= g1 */
 #define G1_REG_CONFIG  0x008
 #define G1_REG_CONFIG_DEC_AXI_RD_ID(x) (((x) & 0xff) << 24)
 #define G1_REG_CONFIG_DEC_TIMEOUT_EBIT(23)
-- 
2.28.0



[PATCH 00/18] Add Hantro regmap and VC8000 h264 decode support

2020-10-12 Thread Adrian Ratiu
Dear all,

This series introduces a regmap infrastructure for the Hantro driver
which is used to compensate for different HW-revision register layouts.
To justify it h264 decoding capability is added for newer VC8000 chips.

This is a gradual conversion to the new infra - a complete conversion
would have been very big and I do not have all the HW yet to test (I'm
expecting a RK3399 shipment next week though ;). I think converting the
h264 decoder provides a nice blueprint for how the other codecs can be
converted and enabled for different HW revisions.

The end goal of this is to make the driver more generic and eliminate
entirely custom boilerplate like `struct hantro_reg` or headers with
core-specific bit manipulations like `hantro_g1_regs.h` and instead rely
on the well-tested albeit more verbose regmap subsystem.

To give just two examples of bugs which are easily discovered by using
more verbose regmap fields (very easy to compare with the datasheets)
instead of relying on bit-magic tricks: G1_REG_DEC_CTRL3_INIT_QP(x) was
off-by-1 and the wrong .clk_gate bit was set in hantro_postproc.c.
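
To illustrate why a field described by its bit range is easier to check
against a datasheet than a hand-written shift and mask, here is a
minimal userspace sketch. It is not the kernel regmap API: struct
sketch_field and the helpers are hypothetical, and the bit range used in
the usage note is arbitrary rather than taken from any Hantro register.

```c
#include <stdint.h>

/* Hypothetical field descriptor: inclusive bit range inside a register. */
struct sketch_field {
	unsigned int lsb;
	unsigned int msb;
};

/* Build the mask covering bits lsb..msb. */
uint32_t sketch_field_mask(struct sketch_field f)
{
	unsigned int width = f.msb - f.lsb + 1;
	uint32_t ones = width >= 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1);

	return ones << f.lsb;
}

/* Return reg with the field replaced by val; other bits are untouched. */
uint32_t sketch_field_update(uint32_t reg, struct sketch_field f, uint32_t val)
{
	uint32_t mask = sketch_field_mask(f);

	return (reg & ~mask) | ((val << f.lsb) & mask);
}

/* Extract the field value from reg. */
uint32_t sketch_field_read(uint32_t reg, struct sketch_field f)
{
	return (reg & sketch_field_mask(f)) >> f.lsb;
}
```

With a field declared as bits 4..9, both the shift and the width follow
from the two numbers printed in the datasheet, so an off-by-1 shift shows
up as a mismatch in the declaration instead of hiding inside a macro.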

Anyway, this series also extends the MMIO regmap API to allow relaxed
writes for the theoretical reason that avoiding unnecessary membarriers
leads to less CPU usage and small improvements to battery life. However,
in practice I could not measure differences between relaxed/non-relaxed
IO, so I'm on the fence whether to keep or remove the relaxed calls.

What I could measure is the performance impact of adding more sub-reg
field accesses: a constant ~20 microsecond bump per G1 h264 frame. Since
decoding a frame takes three orders of magnitude longer, i.e. in the
millisecond range depending on the frame size and bitstream params, this
is an acceptable trade-off for a more generic driver.

This has been tested on next-20201009 with imx8mq for G1 and an SoC with
VC8000 which has not yet been added (hopefully support lands soon).

Kind regards,
Adrian

Adrian Ratiu (18):
  media: hantro: document all int reg bits up to vc8000
  media: hantro: make consistent use of decimal register notation
  media: hantro: make G1_REG_SOFT_RESET Rockchip specific
  media: hantro: add reset controller support
  media: hantro: prepare clocks before variant inits are run
  media: hantro: imx8mq: simplify ctrlblk reset logic
  regmap: mmio: add config option to allow relaxed MMIO accesses
  media: hantro: add initial MMIO regmap infrastructure
  media: hantro: default regmap to relaxed MMIO
  media: hantro: convert G1 h264 decoder to regmap fields
  media: hantro: convert G1 postproc to regmap
  media: hantro: add VC8000D h264 decoding
  media: hantro: add VC8000D postproc support
  media: hantro: make PP enablement logic a bit smarter
  media: hantro: add user-selectable, platform-selectable H264 High10
  media: hantro: rename h264_dec as it's not G1 specific anymore
  media: hantro: add dump registers debug option before decode start
  media: hantro: document encoder reg fields

 drivers/base/regmap/regmap-mmio.c |   34 +-
 drivers/staging/media/hantro/Makefile |3 +-
 drivers/staging/media/hantro/hantro.h |   79 +-
 drivers/staging/media/hantro/hantro_drv.c |   41 +-
 drivers/staging/media/hantro/hantro_g1_regs.h |   92 +-
 ...hantro_g1_h264_dec.c => hantro_h264_dec.c} |  237 +++-
 drivers/staging/media/hantro/hantro_hw.h  |   23 +-
 .../staging/media/hantro/hantro_postproc.c|  144 ++-
 drivers/staging/media/hantro/hantro_regmap.c  | 1015 +
 drivers/staging/media/hantro/hantro_regmap.h  |  295 +
 drivers/staging/media/hantro/hantro_v4l2.c|3 +-
 drivers/staging/media/hantro/imx8m_vpu_hw.c   |   75 +-
 drivers/staging/media/hantro/rk3288_vpu_hw.c  |5 +-
 include/linux/regmap.h|5 +
 14 files changed, 1795 insertions(+), 256 deletions(-)
 rename drivers/staging/media/hantro/{hantro_g1_h264_dec.c => hantro_h264_dec.c} (58%)
 create mode 100644 drivers/staging/media/hantro/hantro_regmap.c
 create mode 100644 drivers/staging/media/hantro/hantro_regmap.h

-- 
2.28.0



[PATCH 14/18] media: hantro: make PP enablement logic a bit smarter

2020-10-12 Thread Adrian Ratiu
Now that we support two cores with different PP operations we need
to make the condition to enable PP a bit smarter based on what is
actually supported by each core.

While doing this also move the needs_postproc() test inside the
postproc .c file instead of cluttering the header.
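
The per-core decision described above can be tried out in userspace with
a minimal sketch like the one below. It only mirrors the logic of the
patch: the mock_* names, the enum and the fourcc helpers are hypothetical
stand-ins, not the driver's types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's core revisions. */
enum mock_core_rev { MOCK_G1_REV, MOCK_VC8000_REV, MOCK_UNKNOWN_REV };

/* Little-endian fourcc packing, same idea as the V4L2 pixel format codes. */
#define MOCK_FOURCC(a, b, c, d) \
	((uint32_t)(a) | ((uint32_t)(b) << 8) | \
	 ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
#define MOCK_PIX_FMT_NV12 MOCK_FOURCC('N', 'V', '1', '2')
#define MOCK_PIX_FMT_YUYV MOCK_FOURCC('Y', 'U', 'Y', 'V')

/* Returns true when the post-processor should be enabled. */
static bool mock_needs_postproc(bool is_encoder, enum mock_core_rev rev,
				uint32_t dst_fourcc)
{
	/* postproc is only available for decoders */
	if (is_encoder)
		return false;

	switch (rev) {
	case MOCK_G1_REV:
		/* G1 PP converts NV12 to something else, skip it for NV12 */
		return dst_fourcc != MOCK_PIX_FMT_NV12;
	case MOCK_VC8000_REV:
		/* VC8000D PP only de-tiles NV12, skip it for anything else */
		return dst_fourcc == MOCK_PIX_FMT_NV12;
	default:
		/* unknown core: leave the PP off */
		return false;
	}
}
```

Note how the same destination format gives opposite answers on the two
cores, which is why a single format comparison in the header was no
longer enough.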

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro.h | 10 ++-
 .../staging/media/hantro/hantro_postproc.c| 29 +++
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 2d507f8d3a1d..05e59bc83b71 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -393,6 +393,9 @@ static inline void hantro_reg_write_s(struct hantro_dev *vpu,
vdpu_write(vpu, vdpu_read_mask(vpu, reg, val), reg->base);
 }
 
+bool hantro_needs_postproc(const struct hantro_ctx *ctx,
+  const struct hantro_fmt *fmt);
+
 void *hantro_get_ctrl(struct hantro_ctx *ctx, u32 id);
 dma_addr_t hantro_get_ref(struct hantro_ctx *ctx, u64 ts);
 
@@ -408,13 +411,6 @@ hantro_get_dst_buf(struct hantro_ctx *ctx)
return v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
 }
 
-static inline bool
-hantro_needs_postproc(const struct hantro_ctx *ctx,
- const struct hantro_fmt *fmt)
-{
-   return !ctx->is_encoder && fmt->fourcc != V4L2_PIX_FMT_NV12;
-}
-
 static inline dma_addr_t
 hantro_get_dec_buf_addr(struct hantro_ctx *ctx, struct vb2_buffer *vb)
 {
diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
index a6b3e243dc39..653bae37eed9 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -22,6 +22,35 @@
 
 #define VC8000D_PP_OUT_NV120x0
 
+bool hantro_needs_postproc(const struct hantro_ctx *ctx,
+  const struct hantro_fmt *fmt)
+{
+   bool ret = false;
+
+   /* postproc is only available for decoders */
+   if (ctx->is_encoder)
+   return false;
+
+   switch (ctx->dev->core_hw_dec_rev) {
+   case HANTRO_G1_REV:
+   /*
+* for now the G1 PP is only used for NV12 -> YUYV conversion
+* so if the dst format is already NV12 we don't need it
+*/
+   ret = fmt->fourcc != V4L2_PIX_FMT_NV12;
+   break;
+   case HANTRO_VC8000_REV:
+   /*
+* for now the VC8000D PP is only used to de-tile 4x4 NV12, so
+* enabling it for something else doesn't make sense.
+*/
+   ret = fmt->fourcc == V4L2_PIX_FMT_NV12;
+   break;
+   }
+
+   return ret;
+}
+
 void hantro_postproc_enable(struct hantro_ctx *ctx)
 {
struct hantro_regmap_fields_dec *fields = ctx->dev->reg_fields_dec;
-- 
2.28.0



[PATCH 04/18] media: hantro: add reset controller support

2020-10-12 Thread Adrian Ratiu
Some SoCs might have a reset controller which keeps clocks disabled
while in reset state, so drivers need to take the block out of reset
before being able to ungate a specific clock.

In this specific case, the hantro driver needs to ensure the
peripheral clock can be properly ungated otherwise MMIO reg
values can't be accessed.

If the SoC has no reset controller or there is no "resets" DT
property defined, this new code will have no effect.

Signed-off-by: Adrian Ratiu 
Signed-off-by: Ezequiel Garcia 
---
 drivers/staging/media/hantro/hantro.h | 1 +
 drivers/staging/media/hantro/hantro_drv.c | 8 
 2 files changed, 9 insertions(+)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 65f9f7ea7dcf..bb442eb1974e 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -183,6 +183,7 @@ struct hantro_dev {
struct platform_device *pdev;
struct device *dev;
struct clk_bulk_data *clocks;
+   struct reset_control *reset;
void __iomem **reg_bases;
void __iomem *enc_base;
void __iomem *dec_base;
diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index 3cd00cc0a364..c2ea54552ce9 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -747,6 +748,13 @@ static int hantro_probe(struct platform_device *pdev)
 
INIT_DELAYED_WORK(&vpu->watchdog_work, hantro_watchdog);
 
+   vpu->reset = devm_reset_control_get_optional_exclusive(&pdev->dev,
+  NULL);
+   if (IS_ERR(vpu->reset))
+   vpu->reset = NULL;
+
+   reset_control_reset(vpu->reset);
+
vpu->clocks = devm_kcalloc(&pdev->dev, vpu->variant->num_clocks,
   sizeof(*vpu->clocks), GFP_KERNEL);
if (!vpu->clocks)
-- 
2.28.0



[PATCH 03/18] media: hantro: make G1_REG_SOFT_RESET Rockchip specific

2020-10-12 Thread Adrian Ratiu
This register is not documented in either the G1 or VC8000D register
maps and on VC8000D there is a conflict because at the same offset the
VPU IP defines another register with a very different meaning.

What likely happened is the HW integrator which uses only the G1 IP
core added some reset/control logic at the end of the VPU map, so
it makes sense to make this register RK-specific.

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro_g1_regs.h | 1 -
 drivers/staging/media/hantro/rk3288_vpu_hw.c  | 4 +++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_regs.h b/drivers/staging/media/hantro/hantro_g1_regs.h
index 073b64cbe295..a482a2ba6dfe 100644
--- a/drivers/staging/media/hantro/hantro_g1_regs.h
+++ b/drivers/staging/media/hantro/hantro_g1_regs.h
@@ -315,7 +315,6 @@
 #define G1_REG_REF_BUF_CTRL2_REFBU2_THR(x) (((x) & 0xfff) << 19)
 #define G1_REG_REF_BUF_CTRL2_REFBU2_PICID(x)   (((x) & 0x1f) << 14)
 #define G1_REG_REF_BUF_CTRL2_APF_THRESHOLD(x)  (((x) & 0x3fff) << 0)
-#define G1_REG_SOFT_RESET  0x194
 
 /* Post-processor registers. */
 #define G1_REG_PP_INTERRUPTG1_SWREG(60)
diff --git a/drivers/staging/media/hantro/rk3288_vpu_hw.c b/drivers/staging/media/hantro/rk3288_vpu_hw.c
index 7b299ee3e93d..4ad578b1236e 100644
--- a/drivers/staging/media/hantro/rk3288_vpu_hw.c
+++ b/drivers/staging/media/hantro/rk3288_vpu_hw.c
@@ -13,6 +13,8 @@
 #include "hantro_g1_regs.h"
 #include "hantro_h1_regs.h"
 
+#define VDPU_REG_SOFT_RESET 0x194
+
 #define RK3288_ACLK_MAX_FREQ (400 * 1000 * 1000)
 
 /*
@@ -167,7 +169,7 @@ static void rk3288_vpu_dec_reset(struct hantro_ctx *ctx)
 
vdpu_write(vpu, G1_REG_INTERRUPT_DEC_IRQ_DIS, G1_REG_INTERRUPT);
vdpu_write(vpu, G1_REG_CONFIG_DEC_CLK_GATE_E, G1_REG_CONFIG);
-   vdpu_write(vpu, 1, G1_REG_SOFT_RESET);
+   vdpu_write(vpu, 1, VDPU_REG_SOFT_RESET);
 }
 
 /*
-- 
2.28.0



[PATCH 08/18] media: hantro: add initial MMIO regmap infrastructure

2020-10-12 Thread Adrian Ratiu
This creates regmaps on top of the memory mapped regions for encoders
and decoders and converts the helpers in hantro.h to do their R/W via
these regmaps.

In itself this indirection layer is quite useless, but the key is the
field API, also initialized using the regmaps, which is currently empty.

Further changes can define any necessary regmap field APIs for various
HW revisions like G1, G2 or configure the fields for different HW reg
layouts to support newer HW revisions like VC8000D.
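
To illustrate why a field API helps with diverging register layouts,
here is a minimal userspace sketch. It is not the kernel regmap API:
the mock_* helpers, the fake register file and both example layouts
(offsets and bit ranges) are made up for illustration. The idea is that
callers write a logical field while a per-revision table decides where
the bits land.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for a register field: byte offset plus bit range. */
struct mock_reg_field {
	uint32_t reg;		/* byte offset of the 32-bit register */
	unsigned int lsb;	/* lowest bit of the field */
	unsigned int msb;	/* highest bit of the field */
};

#define MOCK_NUM_REGS 64u

/* Fake 32-bit register file replacing the memory mapped region. */
static uint32_t mock_regs[MOCK_NUM_REGS];

static uint32_t mock_field_mask(const struct mock_reg_field *f)
{
	unsigned int width;

	assert(f->msb >= f->lsb && f->msb < 32);
	width = f->msb - f->lsb + 1;
	if (width == 32)
		return UINT32_MAX;
	return ((UINT32_C(1) << width) - 1) << f->lsb;
}

/* Read-modify-write of one field, leaving neighbouring bits untouched. */
static void mock_field_write(const struct mock_reg_field *f, uint32_t val)
{
	uint32_t mask = mock_field_mask(f);

	assert(f->reg / 4 < MOCK_NUM_REGS);
	mock_regs[f->reg / 4] = (mock_regs[f->reg / 4] & ~mask) |
				((val << f->lsb) & mask);
}

static uint32_t mock_field_read(const struct mock_reg_field *f)
{
	assert(f->reg / 4 < MOCK_NUM_REGS);
	return (mock_regs[f->reg / 4] & mock_field_mask(f)) >> f->lsb;
}

/* The same logical field in two hypothetical HW layouts. */
static const struct mock_reg_field layout_a_pic_width = { 0x10, 23, 31 };
static const struct mock_reg_field layout_b_pic_width = { 0x20, 19, 31 };
```

A caller would pick one of the two tables at probe time and then only
ever call mock_field_write(), so the decoder code stays the same for
both layouts. The price is a read-modify-write per field instead of one
combined register write.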

No regmap is defined for the ctrl registers of imx8m because their
usage is very simple and there is no known register layout divergence.

Signed-off-by: Adrian Ratiu 
Signed-off-by: Ezequiel Garcia 
---
 drivers/staging/media/hantro/Makefile|   1 +
 drivers/staging/media/hantro/hantro.h|  35 +++--
 drivers/staging/media/hantro/hantro_drv.c|  15 +-
 drivers/staging/media/hantro/hantro_regmap.c | 144 +++
 drivers/staging/media/hantro/hantro_regmap.h |  23 +++
 5 files changed, 206 insertions(+), 12 deletions(-)
 create mode 100644 drivers/staging/media/hantro/hantro_regmap.c
 create mode 100644 drivers/staging/media/hantro/hantro_regmap.h

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index 743ce08eb184..52bc0ee73569 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -9,6 +9,7 @@ hantro-vpu-y += \
hantro_h1_jpeg_enc.o \
hantro_g1_h264_dec.o \
hantro_g1_mpeg2_dec.o \
+   hantro_regmap.o \
hantro_g1_vp8_dec.o \
rk3399_vpu_hw_jpeg_enc.o \
rk3399_vpu_hw_mpeg2_dec.o \
diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 2dd4362d4080..c5425cd5ac84 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -28,6 +29,8 @@
 
 struct hantro_ctx;
 struct hantro_codec_ops;
+struct hantro_regmap_dec_fields;
+struct hantro_regmap_enc_fields;
 
 #define HANTRO_JPEG_ENCODERBIT(0)
 #define HANTRO_ENCODERS0x
@@ -165,8 +168,12 @@ hantro_vdev_to_func(struct video_device *vdev)
  * dev_ macros.
  * @clocks:Array of clock handles.
  * @reg_bases: Mapped addresses of VPU registers.
- * @enc_base:  Mapped address of VPU encoder register for convenience.
- * @dec_base:  Mapped address of VPU decoder register for convenience.
+ * @regs_enc:  MMIO regmap of VPU encoder block for convenience.
+ * @regs_dec:  MMIO regmap of VPU decoder block for convenience.
+ * @reg_fields_dec: Decoder regfields inside above regmap region.
+ * @reg_fields_enc: Encoder regfields inside above regmap region.
+ * @core_hw_dec_rev: Runtime detected HW decoder core revision.
+ * @core_hw_enc_rev: Runtime detected HW encoder core revision.
  * @vpu_mutex: Mutex to synchronize V4L2 calls.
  * @irqlock:   Spinlock to synchronize access to data structures
  * shared with interrupt handlers.
@@ -184,8 +191,12 @@ struct hantro_dev {
struct clk_bulk_data *clocks;
struct reset_control *reset;
void __iomem **reg_bases;
-   void __iomem *enc_base;
-   void __iomem *dec_base;
+   struct regmap *regs_dec;
+   struct regmap *regs_enc;
+   struct hantro_regmap_fields_dec *reg_fields_dec;
+   struct hantro_regmap_fields_enc *reg_fields_enc;
+   u32 core_hw_dec_rev;
+   u32 core_hw_enc_rev;
 
struct mutex vpu_mutex; /* video_device lock */
spinlock_t irqlock;
@@ -329,20 +340,22 @@ static inline void vepu_write_relaxed(struct hantro_dev *vpu,
  u32 val, u32 reg)
 {
vpu_debug(6, "0x%04x = 0x%08x\n", reg / 4, val);
-   writel_relaxed(val, vpu->enc_base + reg);
+   regmap_write(vpu->regs_enc, reg, val);
 }
 
 static inline void vepu_write(struct hantro_dev *vpu, u32 val, u32 reg)
 {
vpu_debug(6, "0x%04x = 0x%08x\n", reg / 4, val);
-   writel(val, vpu->enc_base + reg);
+   regmap_write(vpu->regs_enc, reg, val);
 }
 
 static inline u32 vepu_read(struct hantro_dev *vpu, u32 reg)
 {
-   u32 val = readl(vpu->enc_base + reg);
+   u32 val;
 
+   regmap_read(vpu->regs_enc, reg, &val);
vpu_debug(6, "0x%04x = 0x%08x\n", reg / 4, val);
+
return val;
 }
 
@@ -350,20 +363,22 @@ static inline void vdpu_write_relaxed(struct hantro_dev *vpu,
  u32 val, u32 reg)
 {
vpu_debug(6, "0x%04x = 0x%08x\n", reg / 4, val);
-   writel_relaxed(val, vpu->dec_base + reg);
+   regmap_write(vpu->regs_dec, reg, val);
 }
 
 static inline void vdpu_write(struct hantro_dev *vpu, u32 val, u32 reg)
 {
   

[PATCH 05/18] media: hantro: prepare clocks before variant inits are run

2020-10-12 Thread Adrian Ratiu
The fundamental idea is: clocks are prepared in the driver probe() then
each use-case will enable/disable them as needed.

Some variants like imx8mq need to have the clocks enabled during the
HW init phase, so they will benefit from having the clocks prepared
before the variant init callback to avoid doing a full prepare_enable/
disable_unprepare, so move the clk prepare a bit earlier.

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro_drv.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
index c2ea54552ce9..3734efa80a7e 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -813,22 +813,22 @@ static int hantro_probe(struct platform_device *pdev)
}
}
 
+   ret = clk_bulk_prepare(vpu->variant->num_clocks, vpu->clocks);
+   if (ret) {
+   dev_err(&pdev->dev, "Failed to prepare clocks\n");
+   return ret;
+   }
+
ret = vpu->variant->init(vpu);
if (ret) {
dev_err(&pdev->dev, "Failed to init VPU hardware\n");
-   return ret;
+   goto err_clk_unprepare;
}
 
pm_runtime_set_autosuspend_delay(vpu->dev, 100);
pm_runtime_use_autosuspend(vpu->dev);
pm_runtime_enable(vpu->dev);
 
-   ret = clk_bulk_prepare(vpu->variant->num_clocks, vpu->clocks);
-   if (ret) {
-   dev_err(&pdev->dev, "Failed to prepare clocks\n");
-   return ret;
-   }
-
ret = v4l2_device_register(&pdev->dev, &vpu->v4l2_dev);
if (ret) {
dev_err(&pdev->dev, "Failed to register v4l2 device\n");
-- 
2.28.0



[PATCH 16/18] media: hantro: rename h264_dec as it's not G1 specific anymore

2020-10-12 Thread Adrian Ratiu
The h264 decoder is now capable of decoding on both G1 and VC8000 and
other HW revisions can be added in the future by extending the hantro
regmap config, so we rename it to reflect the new status.

All other core-specific files like "hantro_g1_mpeg2_dec.c" should be
renamed as well after they have been ported to the new regmap API.

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/Makefile   | 2 +-
 .../media/hantro/{hantro_g1_h264_dec.c => hantro_h264_dec.c}| 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename drivers/staging/media/hantro/{hantro_g1_h264_dec.c => hantro_h264_dec.c} (100%)

diff --git a/drivers/staging/media/hantro/Makefile b/drivers/staging/media/hantro/Makefile
index 52bc0ee73569..94f1e454c495 100644
--- a/drivers/staging/media/hantro/Makefile
+++ b/drivers/staging/media/hantro/Makefile
@@ -7,7 +7,7 @@ hantro-vpu-y += \
hantro_v4l2.o \
hantro_postproc.o \
hantro_h1_jpeg_enc.o \
-   hantro_g1_h264_dec.o \
+   hantro_h264_dec.o \
hantro_g1_mpeg2_dec.o \
hantro_regmap.o \
hantro_g1_vp8_dec.o \
diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_h264_dec.c
similarity index 100%
rename from drivers/staging/media/hantro/hantro_g1_h264_dec.c
rename to drivers/staging/media/hantro/hantro_h264_dec.c
-- 
2.28.0



[PATCH 11/18] media: hantro: convert G1 postproc to regmap

2020-10-12 Thread Adrian Ratiu
Postprocessing used the custom hantro_reg structure, but now we have
regmap fields which describe reg layouts and do the same thing, so PP
can be moved to regmap. In the future all hantro_reg references can be
removed; this is just a beginning.

This converts only the existing G1 PP support, but the fields can be
used for other core revisions like VC8000D which will be added shortly.

While we're at it also document a few more important PP registers,
e.g. for scaling, cropping and rotation.

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro.h | 19 -
 drivers/staging/media/hantro/hantro_hw.h  |  2 -
 .../staging/media/hantro/hantro_postproc.c| 72 +-
 drivers/staging/media/hantro/hantro_regmap.c  | 75 +++
 drivers/staging/media/hantro/hantro_regmap.h  | 26 +++
 drivers/staging/media/hantro/imx8m_vpu_hw.c   |  1 -
 drivers/staging/media/hantro/rk3288_vpu_hw.c  |  1 -
 7 files changed, 119 insertions(+), 77 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
index 5b7fbdc3779d..2d507f8d3a1d 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -71,7 +71,6 @@ struct hantro_irq {
  * @num_clocks:number of clocks in the array
  * @reg_names: array of register range names
  * @num_regs:  number of register range names in the array
- * @postproc_regs: &struct hantro_postproc_regs pointer
  */
 struct hantro_variant {
unsigned int enc_offset;
@@ -92,7 +91,6 @@ struct hantro_variant {
int num_clocks;
const char * const *reg_names;
int num_regs;
-   const struct hantro_postproc_regs *postproc_regs;
 };
 
 /**
@@ -283,23 +281,6 @@ struct hantro_reg {
u32 mask;
 };
 
-struct hantro_postproc_regs {
-   struct hantro_reg pipeline_en;
-   struct hantro_reg max_burst;
-   struct hantro_reg clk_gate;
-   struct hantro_reg out_swap32;
-   struct hantro_reg out_endian;
-   struct hantro_reg out_luma_base;
-   struct hantro_reg input_width;
-   struct hantro_reg input_height;
-   struct hantro_reg output_width;
-   struct hantro_reg output_height;
-   struct hantro_reg input_fmt;
-   struct hantro_reg output_fmt;
-   struct hantro_reg orig_width;
-   struct hantro_reg display_width;
-};
-
 /* Logging helpers */
 
 /**
diff --git a/drivers/staging/media/hantro/hantro_hw.h b/drivers/staging/media/hantro/hantro_hw.h
index 219283a06f52..e0039a15fe85 100644
--- a/drivers/staging/media/hantro/hantro_hw.h
+++ b/drivers/staging/media/hantro/hantro_hw.h
@@ -155,8 +155,6 @@ extern const struct hantro_variant rk3328_vpu_variant;
 extern const struct hantro_variant rk3288_vpu_variant;
 extern const struct hantro_variant imx8mq_vpu_variant;
 
-extern const struct hantro_postproc_regs hantro_g1_postproc_regs;
-
 extern const u32 hantro_vp8_dec_mc_filter[8][6];
 
 void hantro_watchdog(struct work_struct *work);
diff --git a/drivers/staging/media/hantro/hantro_postproc.c b/drivers/staging/media/hantro/hantro_postproc.c
index 6d2a8f2a8f0b..6d1705a60d36 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -11,20 +11,7 @@
 #include "hantro.h"
 #include "hantro_hw.h"
 #include "hantro_g1_regs.h"
-
-#define HANTRO_PP_REG_WRITE(vpu, reg_name, val) \
-{ \
-   hantro_reg_write(vpu, \
-&(vpu)->variant->postproc_regs->reg_name, \
-val); \
-}
-
-#define HANTRO_PP_REG_WRITE_S(vpu, reg_name, val) \
-{ \
-   hantro_reg_write_s(vpu, \
-  &(vpu)->variant->postproc_regs->reg_name, \
-  val); \
-}
+#include "hantro_regmap.h"
 
 #define VPU_PP_IN_YUYV 0x0
 #define VPU_PP_IN_NV12 0x1
@@ -33,35 +20,15 @@
 #define VPU_PP_OUT_RGB 0x0
 #define VPU_PP_OUT_YUYV0x3
 
-const struct hantro_postproc_regs hantro_g1_postproc_regs = {
-   .pipeline_en = {G1_REG_PP_INTERRUPT, 1, 0x1},
-   .max_burst = {G1_REG_PP_DEV_CONFIG, 0, 0x1f},
-   .clk_gate = {G1_REG_PP_DEV_CONFIG, 1, 0x1},
-   .out_swap32 = {G1_REG_PP_DEV_CONFIG, 5, 0x1},
-   .out_endian = {G1_REG_PP_DEV_CONFIG, 6, 0x1},
-   .out_luma_base = {G1_REG_PP_OUT_LUMA_BASE, 0, 0x},
-   .input_width = {G1_REG_PP_INPUT_SIZE, 0, 0x1ff},
-   .input_height = {G1_REG_PP_INPUT_SIZE, 9, 0x1ff},
-   .output_width = {G1_REG_PP_CONTROL, 4, 0x7ff},
-   .output_height = {G1_REG_PP_CONTROL, 15, 0x7ff},
-   .input_fmt = {G1_REG_PP_CONTROL, 29, 0x7},
-   .output_fmt = {G1_REG_PP_CONTROL, 26, 0x7},
-   .orig_width = {G1_REG_PP_MASK1_ORIG_WIDTH, 23, 0x1ff},
-   .display_width = {G1_REG_PP_DISPLAY_WIDTH, 0, 0xfff},
-};
-
 void hantr

[PATCH 10/18] media: hantro: convert G1 h264 decoder to regmap fields

2020-10-12 Thread Adrian Ratiu
Populate the regmap field API for G1 h264 decoding and convert the
G1 h264 decoder source to use the new API. This is done because we
will add support for the newer VC8000D core which will configure
the regmap API fields differently to match its own hwreg layout.

Signed-off-by: Adrian Ratiu 
---
 .../staging/media/hantro/hantro_g1_h264_dec.c | 71 ++---
 drivers/staging/media/hantro/hantro_regmap.c  | 79 ++-
 drivers/staging/media/hantro/hantro_regmap.h  | 26 +-
 3 files changed, 145 insertions(+), 31 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index 845bef73d218..8592dfabbc5e 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -18,6 +18,9 @@
 #include "hantro_g1_regs.h"
 #include "hantro_hw.h"
 #include "hantro_v4l2.h"
+#include "hantro_regmap.h"
+
+extern struct regmap_config hantro_regmap_dec;
 
 static void set_params(struct hantro_ctx *ctx)
 {
@@ -27,10 +30,15 @@ static void set_params(struct hantro_ctx *ctx)
const struct v4l2_ctrl_h264_pps *pps = ctrls->pps;
struct vb2_v4l2_buffer *src_buf = hantro_get_src_buf(ctx);
struct hantro_dev *vpu = ctx->dev;
+   struct hantro_regmap_fields_dec *fields = vpu->reg_fields_dec;
+   u32 width = MB_WIDTH(ctx->src_fmt.width);
+   u32 height = MB_HEIGHT(ctx->src_fmt.height);
u32 reg;
 
+   regmap_field_write(fields->dec_axi_wr_id, 0x0);
+
/* Decoder control register 0. */
-   reg = G1_REG_DEC_CTRL0_DEC_AXI_WR_ID(0x0);
+   reg = 0;
if (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)
reg |= G1_REG_DEC_CTRL0_SEQ_MBAFF_E;
if (sps->profile_idc > 66) {
@@ -50,10 +58,11 @@ static void set_params(struct hantro_ctx *ctx)
vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL0);
 
/* Decoder control register 1. */
-   reg = G1_REG_DEC_CTRL1_PIC_MB_WIDTH(MB_WIDTH(ctx->src_fmt.width)) |
- G1_REG_DEC_CTRL1_PIC_MB_HEIGHT_P(MB_HEIGHT(ctx->src_fmt.height)) |
- G1_REG_DEC_CTRL1_REF_FRAMES(sps->max_num_ref_frames);
-   vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL1);
+   regmap_field_write(fields->dec_pic_width, width);
+   regmap_field_write(fields->dec_pic_height, height);
+
+   regmap_field_write(fields->dec_num_ref_frames,
+  sps->max_num_ref_frames);
 
/* Decoder control register 2. */
reg = G1_REG_DEC_CTRL2_CH_QP_OFFSET(pps->chroma_qp_index_offset) |
@@ -66,10 +75,11 @@ static void set_params(struct hantro_ctx *ctx)
vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL2);
 
/* Decoder control register 3. */
-   reg = G1_REG_DEC_CTRL3_START_CODE_E |
- G1_REG_DEC_CTRL3_INIT_QP(pps->pic_init_qp_minus26 + 26) |
- G1_REG_DEC_CTRL3_STREAM_LEN(vb2_get_plane_payload(&src_buf->vb2_buf, 0));
-   vdpu_write_relaxed(vpu, reg, G1_REG_DEC_CTRL3);
+   regmap_field_write(fields->dec_start_code_e, 1);
+   regmap_field_write(fields->dec_init_qp,
+  pps->pic_init_qp_minus26 + 26);
+   regmap_field_write(fields->dec_stream_len,
+  vb2_get_plane_payload(&src_buf->vb2_buf, 0));
 
/* Decoder control register 4. */
 reg = G1_REG_DEC_CTRL4_FRAMENUM_LEN(sps->log2_max_frame_num_minus4 + 4) |
@@ -121,8 +131,7 @@ static void set_params(struct hantro_ctx *ctx)
vdpu_write_relaxed(vpu, 0, G1_REG_REF_BUF_CTRL);
 
/* Reference picture buffer control register 2. */
-   vdpu_write_relaxed(vpu, G1_REG_REF_BUF_CTRL2_APF_THRESHOLD(8),
-  G1_REG_REF_BUF_CTRL2);
+   regmap_field_write(fields->dec_apf_threshold, 8);
 }
 
 static void set_ref(struct hantro_ctx *ctx)
@@ -221,7 +230,6 @@ static void set_ref(struct hantro_ctx *ctx)
/* Set up addresses of DPB buffers. */
for (i = 0; i < HANTRO_H264_DPB_SIZE; i++) {
dma_addr_t dma_addr = hantro_h264_get_ref_buf(ctx, i);
-
vdpu_write_relaxed(vpu, dma_addr, G1_REG_ADDR_REF(i));
}
 }
@@ -231,6 +239,7 @@ static void set_buffers(struct hantro_ctx *ctx)
const struct hantro_h264_dec_ctrls *ctrls = &ctx->h264_dec.ctrls;
struct vb2_v4l2_buffer *src_buf, *dst_buf;
struct hantro_dev *vpu = ctx->dev;
+   struct hantro_regmap_fields_dec *fields = vpu->reg_fields_dec;
dma_addr_t src_dma, dst_dma;
size_t offset = 0;
 
@@ -239,14 +248,14 @@ static void set_buffers(struct hantro_ctx *ctx)
 
/* Source (stream) buffer. */
src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
-   vdpu_write_relaxed(vpu, src_dma, G1_REG_ADDR_STR);
+   regmap_field_write(fields->dec_addr_str,

[PATCH 12/18] media: hantro: add VC8000D h264 decoding

2020-10-12 Thread Adrian Ratiu
VC8000D is a newer core combining both previous G1 and G2 cores into
one chip. As a result the register layouts changed, but the HW still
functions mostly the same, so we can use regmap fields to compensate.
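
As a worked example of the size arithmetic this patch adds to
set_params(), here is a small userspace sketch. It assumes that
MB_WIDTH()/MB_HEIGHT() round up to 16-pixel macroblocks. Reading the
doubled values as counts of 8-pixel units is also an assumption, based
only on the "from MB to CBS" comment in the patch. All mock_* names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_MB_DIM 16u	/* H.264 macroblock edge in pixels */

/* Round-up division, as MB_WIDTH()/MB_HEIGHT() are assumed to do. */
static uint32_t mock_mb_count(uint32_t pixels)
{
	return (pixels + MOCK_MB_DIM - 1) / MOCK_MB_DIM;
}

struct mock_vc8000_dims {
	uint32_t stride;	/* value for both Y and C stride fields */
	uint32_t width;		/* picture width after doubling */
	uint32_t height;	/* picture height after doubling */
};

/* Mirrors the VC8000 branch: stride from MB width, sizes doubled. */
static struct mock_vc8000_dims mock_vc8000_compute_dims(uint32_t px_w,
							 uint32_t px_h)
{
	struct mock_vc8000_dims d;
	uint32_t mb_w, mb_h;

	assert(px_w > 0 && px_h > 0);
	mb_w = mock_mb_count(px_w);
	mb_h = mock_mb_count(px_h);

	d.stride = mb_w * 4 * 16;	/* same formula as the patch */
	d.width = mb_w << 1;
	d.height = mb_h << 1;
	return d;
}
```

For a 1920x1080 stream this gives 120x68 macroblocks, so the stride
fields would get 7680 and the picture size fields 240 and 136.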

Signed-off-by: Adrian Ratiu 
---
 .../staging/media/hantro/hantro_g1_h264_dec.c | 29 +++-
 drivers/staging/media/hantro/hantro_regmap.c  | 69 +++
 drivers/staging/media/hantro/hantro_regmap.h  | 22 ++
 3 files changed, 117 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index 8592dfabbc5e..a04cb616d628 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -20,6 +20,8 @@
 #include "hantro_v4l2.h"
 #include "hantro_regmap.h"
 
+#define VC8KD_TIMEOUT 0x50
+
 extern struct regmap_config hantro_regmap_dec;
 
 static void set_params(struct hantro_ctx *ctx)
@@ -33,10 +35,23 @@ static void set_params(struct hantro_ctx *ctx)
struct hantro_regmap_fields_dec *fields = vpu->reg_fields_dec;
u32 width = MB_WIDTH(ctx->src_fmt.width);
u32 height = MB_HEIGHT(ctx->src_fmt.height);
-   u32 reg;
+   u32 reg, stride;
 
regmap_field_write(fields->dec_axi_wr_id, 0x0);
 
+   if (vpu->core_hw_dec_rev == HANTRO_VC8000_REV) {
+   /* stride should be computed in hantro_try_fmt() and set here */
+   stride = width * 4 * 16;
+   regmap_field_write(fields->dec_out_y_stride, stride);
+   regmap_field_write(fields->dec_out_c_stride, stride);
+
+   /* on VC8KD the pic sizes changed from MB to CBS */
+   regmap_field_write(fields->dec_min_cb_size, 3);
+   regmap_field_write(fields->dec_max_cb_size, 4);
+   width <<= 1;
+   height <<= 1;
+   }
+
/* Decoder control register 0. */
reg = 0;
if (sps->flags & V4L2_H264_SPS_FLAG_MB_ADAPTIVE_FRAME_FIELD)
@@ -230,7 +245,7 @@ static void set_ref(struct hantro_ctx *ctx)
/* Set up addresses of DPB buffers. */
for (i = 0; i < HANTRO_H264_DPB_SIZE; i++) {
dma_addr_t dma_addr = hantro_h264_get_ref_buf(ctx, i);
-   vdpu_write_relaxed(vpu, dma_addr, G1_REG_ADDR_REF(i));
+   vdpu_write_relaxed(vpu, dma_addr, REG_ADDR_REF(i));
}
 }
 
@@ -309,7 +324,15 @@ void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
G1_REG_CONFIG_DEC_STRSWAP32_E;
vdpu_write_relaxed(vpu, reg, G1_REG_CONFIG);
break;
-   /* TODO: add VC8000 support */
+   case HANTRO_VC8000_REV:
+   regmap_field_write(fields->dec_ext_timeout_e, 1);
+   regmap_field_write(fields->dec_ext_timeout_cycles, VC8KD_TIMEOUT);
+   regmap_field_write(fields->dec_timeout_e, 1);
+   regmap_field_write(fields->dec_timeout_cycles, VC8KD_TIMEOUT);
+   regmap_field_write(fields->dec_buswidth, 2);
+   regmap_field_write(fields->dec_tab_swap, 3);
+   regmap_field_write(fields->dec_tiled_mode_lsb, 1);
+   break;
}
 
regmap_field_write(fields->dec_clk_gate_e, 1);
diff --git a/drivers/staging/media/hantro/hantro_regmap.c b/drivers/staging/media/hantro/hantro_regmap.c
index c0344b0ec8de..0e74ba69034f 100644
--- a/drivers/staging/media/hantro/hantro_regmap.c
+++ b/drivers/staging/media/hantro/hantro_regmap.c
@@ -37,8 +37,13 @@ struct hantro_field_dec {
struct reg_field cfg_dec_axi_rd_id;
struct reg_field cfg_dec_axi_wr_id;
struct reg_field cfg_dec_rlc_mode_e;
+   struct reg_field cfg_dec_strm_swap;
+   struct reg_field cfg_dec_pic_swap;
+   struct reg_field cfg_dec_dirmv_swap;
struct reg_field cfg_dec_mode;
+   struct reg_field cfg_dec_buffer_empty_int_e;
struct reg_field cfg_dec_max_burst;
+   struct reg_field cfg_dec_buswidth;
struct reg_field cfg_dec_apf_threshold;
struct reg_field cfg_dec_stream_len;
struct reg_field cfg_dec_init_qp;
@@ -51,9 +56,18 @@ struct hantro_field_dec {
struct reg_field cfg_dec_addr_dst;
struct reg_field cfg_dec_ilace_mode;
struct reg_field cfg_dec_addr_qtable;
+   struct reg_field cfg_dec_max_cb_size;
+   struct reg_field cfg_dec_min_cb_size;
+   struct reg_field cfg_dec_out_y_stride;
+   struct reg_field cfg_dec_out_c_stride;
struct reg_field cfg_dec_addr_dir_mv;
struct reg_field cfg_dec_tiled_mode_lsb;
struct reg_field cfg_dec_clk_gate_e;
+   struct reg_field cfg_dec_tab_swap;
+   struct reg_field cfg_dec_ext_timeout_cycles;
+   struct reg_field cfg_dec_ext_timeout_e;
+   struct reg_field cfg_dec_timeout_cycles;
+   struct reg_field cfg_dec_timeout_e;
 
struct reg_fie

[PATCH 18/18] media: hantro: document encoder reg fields

2020-10-12 Thread Adrian Ratiu
Even though these fields are currently unused it is still a good
idea to have them documented for future encoder implementations.

Signed-off-by: Ezequiel Garcia 
Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro_regmap.c | 580 ++-
 drivers/staging/media/hantro/hantro_regmap.h | 177 +-
 2 files changed, 754 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_regmap.c b/drivers/staging/media/hantro/hantro_regmap.c
index 62280b873859..f15884f29ed6 100644
--- a/drivers/staging/media/hantro/hantro_regmap.c
+++ b/drivers/staging/media/hantro/hantro_regmap.c
@@ -115,7 +115,394 @@ struct hantro_field_dec {
 };
 
 struct hantro_field_enc {
-   /* TODO: populate encoder fields */
+   struct reg_field cfg_enc_timeout_e;
+   struct reg_field cfg_enc_timeout_cycles;
+   struct reg_field cfg_enc_mode;
+   struct reg_field cfg_enc_stream_mode;
+   struct reg_field cfg_enc_enable;
+   struct reg_field cfg_enc_pic_type;
+   struct reg_field cfg_enc_pic_width;
+   struct reg_field cfg_enc_pic_height;
+   struct reg_field cfg_enc_burst_len;
+   struct reg_field cfg_enc_clk_gate_en;
+   struct reg_field cfg_enc_TODO_swap;
+   struct reg_field cfg_enc_stream_buf_limit;
+   struct reg_field cfg_enc_row_len;
+   struct reg_field cfg_enc_overfill_r;
+   struct reg_field cfg_enc_overfill_b;
+   struct reg_field cfg_enc_src_format;
+   struct reg_field cfg_enc_init_qp;
+   struct reg_field cfg_enc_chroma_qp_offset;
+   struct reg_field cfg_enc_idr_pic_id;
+   struct reg_field cfg_enc_nal_ref_idc;
+   struct reg_field cfg_enc_pps_id;
+   struct reg_field cfg_enc_nal_unit_type;
+   struct reg_field cfg_enc_frame_num;
+   struct reg_field cfg_enc_min_cb_size;
+   struct reg_field cfg_enc_max_cb_size;
+   struct reg_field cfg_enc_max_trb_size;
+   struct reg_field cfg_enc_min_trb_size;
+   struct reg_field cfg_enc_deblocking_filter_dis;
+   struct reg_field cfg_enc_slice_deblocking_filter_override;
+   struct reg_field cfg_enc_slice_deblocking_filter_dis;
+   struct reg_field cfg_enc_pps_deblocking_filter_override;
+   struct reg_field cfg_enc_slice_alpha_div2;
+   struct reg_field cfg_enc_slice_beta_div2;
+   struct reg_field cfg_enc_slice_size;
+   struct reg_field cfg_enc_nal_size_write;
+   struct reg_field cfg_enc_cabac_init_idc;
+   struct reg_field cfg_enc_pic_qp;
+   struct reg_field cfg_enc_qp_frac;
+   struct reg_field cfg_enc_entropy_coding_mode;
+   struct reg_field cfg_enc_axi_r_outstanding_num;
+   struct reg_field cfg_enc_axi_w_outstanding_num;
+   struct reg_field cfg_enc_trans8x8_mode_en;
+   struct reg_field cfg_enc_inter4x4_mode;
+   struct reg_field cfg_enc_quarter_pixmv_dis;
+   struct reg_field cfg_enc_addr_cabac;
+   struct reg_field cfg_enc_addr_str;
+   struct reg_field cfg_enc_addr_size_table;
+   struct reg_field cfg_enc_addr_rec_luma;
+   struct reg_field cfg_enc_addr_rec_luma_4n;
+   struct reg_field cfg_enc_addr_ref_luma_l0_4n0;
+   struct reg_field cfg_enc_addr_rec_chroma;
+   struct reg_field cfg_enc_addr_ref_luma;
+   struct reg_field cfg_enc_addr_ref_chroma;
+   struct reg_field cfg_enc_addr_src_y;
+   struct reg_field cfg_enc_addr_src_cb;
+   struct reg_field cfg_enc_addr_src_cr;
+   struct reg_field cfg_enc_log2_max_pic_order_cnt_lsb;
+   struct reg_field cfg_enc_log2_max_frame_num;
+   struct reg_field cfg_enc_pic_order_cnt_type;
+   struct reg_field cfg_enc_l0_delta_framenum0;
+   struct reg_field cfg_enc_l0_used_by_next_pic0;
+   struct reg_field cfg_enc_l0_used_by_next_pic1;
+
+   struct reg_field cfg_enc_lu_stride;
+   struct reg_field cfg_enc_cr_stride;
+   struct reg_field cfg_enc_ref_lu_stride;
+   struct reg_field cfg_enc_ref_ds_lu_stride;
+   struct reg_field cfg_enc_ref_cr_stride;
+   struct reg_field cfg_enc_ipcm2_left;
+   struct reg_field cfg_enc_ipcm2_right;
+   struct reg_field cfg_enc_ipcm2_top;
+   struct reg_field cfg_enc_ipcm2_bottom;
+
+   struct reg_field cfg_enc_slice_qp_offset;
+   struct reg_field cfg_enc_qp_min;
+   struct reg_field cfg_enc_qp_max;
+
+   struct reg_field cfg_enc_lambda_satd_me_0;
+   struct reg_field cfg_enc_lambda_satd_me_1;
+   struct reg_field cfg_enc_lambda_satd_me_2;
+   struct reg_field cfg_enc_lambda_satd_me_3;
+   struct reg_field cfg_enc_lambda_satd_me_4;
+   struct reg_field cfg_enc_lambda_satd_me_5;
+   struct reg_field cfg_enc_lambda_satd_me_6;
+   struct reg_field cfg_enc_lambda_satd_me_7;
+   struct reg_field cfg_enc_lambda_satd_me_8;
+   struct reg_field cfg_enc_lambda_satd_me_9;
+   struct reg_field cfg_enc_lambda_satd_me_10;
+   struct reg_field cfg_enc_lambda_satd_me_11;
+   struct reg_field cfg_enc_lambda_satd_me_12

[PATCH 09/18] media: hantro: default regmap to relaxed MMIO

2020-10-12 Thread Adrian Ratiu
This is done to match the pre-regmap membarrier behaviour, ensuring
default regmap_write calls in _relaxed() are indeed relaxed while
the non-relaxed versions include an explicit mem-barrier call.

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro.h| 4 
 drivers/staging/media/hantro/hantro_regmap.c | 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/staging/media/hantro/hantro.h 
b/drivers/staging/media/hantro/hantro.h
index c5425cd5ac84..5b7fbdc3779d 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -346,6 +346,7 @@ static inline void vepu_write_relaxed(struct hantro_dev 
*vpu,
 static inline void vepu_write(struct hantro_dev *vpu, u32 val, u32 reg)
 {
vpu_debug(6, "0x%04x = 0x%08x\n", reg / 4, val);
+   wmb(); /* flush encoder previous relaxed writes */
regmap_write(vpu->regs_enc, reg, val);
 }
 
@@ -354,6 +355,7 @@ static inline u32 vepu_read(struct hantro_dev *vpu, u32 reg)
u32 val;
 
regmap_read(vpu->regs_enc, reg, &val);
+   rmb(); /* read encoder swreg data in order */
vpu_debug(6, "0x%04x = 0x%08x\n", reg / 4, val);
 
return val;
@@ -369,6 +371,7 @@ static inline void vdpu_write_relaxed(struct hantro_dev 
*vpu,
 static inline void vdpu_write(struct hantro_dev *vpu, u32 val, u32 reg)
 {
vpu_debug(6, "0x%04x = 0x%08x\n", reg / 4, val);
+   wmb();/* flush decoder previous relaxed writes */
regmap_write(vpu->regs_dec, reg, val);
 }
 
@@ -377,6 +380,7 @@ static inline u32 vdpu_read(struct hantro_dev *vpu, u32 reg)
u32 val;
 
regmap_read(vpu->regs_dec, reg, &val);
+   rmb(); /* read decoder swreg data in order */
vpu_debug(6, "0x%04x = 0x%08x\n", reg / 4, val);
 
return val;
diff --git a/drivers/staging/media/hantro/hantro_regmap.c 
b/drivers/staging/media/hantro/hantro_regmap.c
index 890e443688e2..2fc409cbd797 100644
--- a/drivers/staging/media/hantro/hantro_regmap.c
+++ b/drivers/staging/media/hantro/hantro_regmap.c
@@ -21,6 +21,7 @@ struct regmap_config hantro_regmap_dec = {
.reg_stride = 4,
/* all hantro accesses are sequential, even with respect to irq ctx */
.disable_locking = true,
+   .use_relaxed_mmio = true,
.name = "hantro_regmap_dec",
 };
 
-- 
2.28.0
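
For readers following along, here is a rough userspace model of the ordering scheme described above; it is not driver code. The idea sketched is that the backend is always relaxed and the non-relaxed wrappers add one explicit barrier. All `fake_*` names are made up for illustration, and the barrier is only a counter standing in for wmb()/rmb().

```c
#include <assert.h>
#include <stdint.h>

/* Simulated device: a small register file plus a count of barriers issued. */
struct fake_vpu {
	uint32_t regs[16];
	unsigned int barriers;
};

/* Stand-in for wmb()/rmb(); a real driver would use the kernel macros. */
static void fake_barrier(struct fake_vpu *vpu)
{
	vpu->barriers++;
}

/* Relaxed write: plain store, no ordering guarantee. */
static void fake_write_relaxed(struct fake_vpu *vpu, uint32_t val, uint32_t reg)
{
	vpu->regs[reg / 4] = val;
}

/* Ordered write: barrier first, so earlier relaxed writes are flushed. */
static void fake_write(struct fake_vpu *vpu, uint32_t val, uint32_t reg)
{
	fake_barrier(vpu);
	fake_write_relaxed(vpu, val, reg);
}

/* Ordered read: relaxed load followed by a barrier. */
static uint32_t fake_read(struct fake_vpu *vpu, uint32_t reg)
{
	uint32_t val = vpu->regs[reg / 4];

	fake_barrier(vpu);
	return val;
}

/* Program three config registers relaxed, then start with one ordered write. */
static unsigned int fake_program_and_start(struct fake_vpu *vpu)
{
	fake_write_relaxed(vpu, 0x10, 0x04);
	fake_write_relaxed(vpu, 0x20, 0x08);
	fake_write_relaxed(vpu, 0x30, 0x0c);
	fake_write(vpu, 0x1, 0x00);
	return vpu->barriers;
}
```

In this model a whole configuration sequence costs a single barrier, paid only by the final "start" write.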



[PATCH 15/18] media: hantro: add user-selectable, platform-selectable H264 High10

2020-10-12 Thread Adrian Ratiu
VPU cores starting with VC8000D feature a separate decoding mode named
"high10", capable of decoding both 8bit and 10bit streams, alongside the
previous (still supported) "normal / classic" h264 decoding mode.

The new kernel module param h264_high10 can be used to switch modes,
otherwise the driver will use the platform configured default.

Currently only 8bit decoding is implemented in the high10 mode.

Signed-off-by: Ezequiel Garcia 
Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro.h |   7 +
 drivers/staging/media/hantro/hantro_drv.c |  10 ++
 .../staging/media/hantro/hantro_g1_h264_dec.c | 142 ++
 drivers/staging/media/hantro/hantro_hw.h  |  21 ++-
 .../staging/media/hantro/hantro_postproc.c|   3 +-
 drivers/staging/media/hantro/hantro_regmap.c  |  36 +
 drivers/staging/media/hantro/hantro_regmap.h  |  17 +++
 drivers/staging/media/hantro/hantro_v4l2.c|   3 +-
 8 files changed, 203 insertions(+), 36 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h 
b/drivers/staging/media/hantro/hantro.h
index 05e59bc83b71..70aeb11b1149 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -71,6 +71,7 @@ struct hantro_irq {
  * @num_clocks:number of clocks in the array
  * @reg_names: array of register range names
  * @num_regs:  number of register range names in the array
+ * @has_h264_high10:   platform has support for high10 decoding mode
  */
 struct hantro_variant {
unsigned int enc_offset;
@@ -91,6 +92,8 @@ struct hantro_variant {
int num_clocks;
const char * const *reg_names;
int num_regs;
+
+   bool has_h264_high10;
 };
 
 /**
@@ -177,6 +180,8 @@ hantro_vdev_to_func(struct video_device *vdev)
  * shared with interrupt handlers.
  * @variant:   Hardware variant-specific parameters.
  * @watchdog_work: Delayed work for hardware timeout handling.
+ *
+ * @h264_hw_mode:  H264 mode: legacy, high10 supported.
  */
 struct hantro_dev {
struct v4l2_device v4l2_dev;
@@ -200,6 +205,8 @@ struct hantro_dev {
spinlock_t irqlock;
const struct hantro_variant *variant;
struct delayed_work watchdog_work;
+
+   enum hantro_h264_hw_mode h264_hw_mode;
 };
 
 /**
diff --git a/drivers/staging/media/hantro/hantro_drv.c 
b/drivers/staging/media/hantro/hantro_drv.c
index e225515d6985..afb4e201fa42 100644
--- a/drivers/staging/media/hantro/hantro_drv.c
+++ b/drivers/staging/media/hantro/hantro_drv.c
@@ -32,6 +32,10 @@
 
 #define DRIVER_NAME "hantro-vpu"
 
+static bool hantro_h264_high10 = true;
+module_param_named(h264_high10, hantro_h264_high10, bool, 0444);
+MODULE_PARM_DESC(h264_high10, "Enable High10 decoding mode");
+
 int hantro_debug;
 module_param_named(debug, hantro_debug, int, 0644);
 MODULE_PARM_DESC(debug,
@@ -824,6 +828,12 @@ static int hantro_probe(struct platform_device *pdev)
goto err_clk_unprepare;
}
 
+   /* Small quirk: check if H264 High10 mode can be used */
+   if (hantro_h264_high10 && vpu->variant->has_h264_high10)
+   vpu->h264_hw_mode = HANTRO_H264_HIGH10;
+   else
+   vpu->h264_hw_mode = HANTRO_H264_LEGACY;
+
pm_runtime_set_autosuspend_delay(vpu->dev, 100);
pm_runtime_use_autosuspend(vpu->dev);
pm_runtime_enable(vpu->dev);
diff --git a/drivers/staging/media/hantro/hantro_g1_h264_dec.c 
b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
index a04cb616d628..e64b59c84111 100644
--- a/drivers/staging/media/hantro/hantro_g1_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_g1_h264_dec.c
@@ -2,6 +2,8 @@
 /*
  * Rockchip RK3288 VPU codec driver
  *
+ * Copyright (c) 2020 Collabora, Ltd.
+ *
  * Copyright (c) 2014 Rockchip Electronics Co., Ltd.
  * Hertz Wong 
  * Herman Chen 
@@ -10,6 +12,7 @@
  * Tomasz Figa 
  */
 
+#include 
 #include 
 #include 
 
@@ -20,6 +23,8 @@
 #include "hantro_v4l2.h"
 #include "hantro_regmap.h"
 
+/* TODO: remove this hardcoded pixel size when adding 10bit streams */
+#define VC8KD_PIXEL_SIZE 8
 #define VC8KD_TIMEOUT 0x50
 
 extern struct regmap_config hantro_regmap_dec;
@@ -30,7 +35,6 @@ static void set_params(struct hantro_ctx *ctx)
const struct v4l2_ctrl_h264_decode_params *dec_param = ctrls->decode;
const struct v4l2_ctrl_h264_sps *sps = ctrls->sps;
const struct v4l2_ctrl_h264_pps *pps = ctrls->pps;
-   struct vb2_v4l2_buffer *src_buf = hantro_get_src_buf(ctx);
struct hantro_dev *vpu = ctx->dev;
struct hantro_regmap_fields_dec *fields = vpu->reg_fields_dec;
u32 width = MB_WIDTH(ctx->src_fmt.width);
@@ -40,8 +44,26 @@ static void set_params(struct hantro_ctx *ctx)
regmap_field_write(fields->dec_axi_wr_id, 0x0);
 
i

[PATCH 17/18] media: hantro: add dump registers debug option before decode start

2020-10-12 Thread Adrian Ratiu
It is very useful to know the status of all the decoder configuration
registers right before starting a decode operation, so add an option
to print them if register debugging is enabled (debug bit 7 is set).

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro.h  | 1 +
 drivers/staging/media/hantro/hantro_h264_dec.c | 9 -
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/hantro/hantro.h 
b/drivers/staging/media/hantro/hantro.h
index 70aeb11b1149..1b0c441ff15a 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -304,6 +304,7 @@ struct hantro_reg {
  * bit 4 - detail fmt, ctrl, buffer q/dq information
  * bit 5 - detail function enter/leave trace information
  * bit 6 - register write/read information
+ * bit 7 - dump
  */
 extern int hantro_debug;
 
diff --git a/drivers/staging/media/hantro/hantro_h264_dec.c 
b/drivers/staging/media/hantro/hantro_h264_dec.c
index e64b59c84111..2c53394cbb0c 100644
--- a/drivers/staging/media/hantro/hantro_h264_dec.c
+++ b/drivers/staging/media/hantro/hantro_h264_dec.c
@@ -381,7 +381,9 @@ void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
struct hantro_dev *vpu = ctx->dev;
struct hantro_regmap_fields_dec *fields = vpu->reg_fields_dec;
bool do_high10 = (vpu->h264_hw_mode == HANTRO_H264_HIGH10);
-   int reg;
+   u32 max_reg = hantro_regmap_dec.max_register;
+   u32 reg_stride = hantro_regmap_dec.reg_stride;
+   int reg, i;
 
/* Prepare the H264 decoder context. */
if (hantro_h264_dec_prepare_run(ctx))
@@ -421,6 +423,11 @@ void hantro_g1_h264_dec_run(struct hantro_ctx *ctx)
regmap_field_write(fields->dec_max_burst, 16);
regmap_field_write(fields->dec_axi_rd_id, 16);
 
+   vpu_debug(7, "Reg dump at decoding start\n");
+   for (i = 0; hantro_debug & BIT(7) && i <= max_reg; i += reg_stride)
+   vpu_debug(7, "swreg %03d: %08x\n", i / 4, vdpu_read(vpu, i));
+   vpu_debug(7, "Reg dump end\n");
+
/* Start decoding! */
vdpu_write(vpu, G1_REG_INTERRUPT_DEC_E, G1_REG_INTERRUPT);
 }
-- 
2.28.0
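
The dump loop above can be tried outside the kernel with a small sketch like the following. It assumes 32-bit registers (hence the `i / 4` index, as in the patch) and formats into a buffer instead of calling vpu_debug(). `dump_regs` and its parameters are hypothetical names, and `regs[]` is just an in-memory copy of a register file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format registers at byte offsets 0..max_reg (inclusive) into buf, one
 * "swreg NNN: XXXXXXXX" line each. Returns the number of registers dumped.
 */
static unsigned int dump_regs(const uint32_t *regs, uint32_t max_reg,
			      uint32_t reg_stride, char *buf, size_t len)
{
	unsigned int count = 0;
	size_t used = 0;
	uint32_t i;

	if (!reg_stride)
		return 0;	/* a zero stride would never terminate */

	for (i = 0; i <= max_reg; i += reg_stride) {
		int n = snprintf(buf + used, len - used, "swreg %03u: %08x\n",
				 (unsigned int)(i / 4),
				 (unsigned int)regs[i / 4]);

		if (n < 0 || (size_t)n >= len - used)
			break;	/* out of buffer space, stop early */
		used += (size_t)n;
		count++;
	}
	return count;
}
```

With max_reg = 0x10 and a stride of 4 this walks offsets 0x00 through 0x10, so the bound is inclusive just like the `i <= max_reg` condition in the patch.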



[PATCH 13/18] media: hantro: add VC8000D postproc support

2020-10-12 Thread Adrian Ratiu
VC8000D decodes only to 4x4 tiled NV12 format and the attached PP
can be used to de-tile its output. This can be done in two modes:

1. Pipeline mode, using the same decoder "done" irq
2. External mode, with a separate irq and input setup.

This adds the relevant postprocessor fields and support for pipeline
mode de-tiling.
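
As a worked example of the linear NV12 layout the PP writes to, here is a small sketch of the plane arithmetic. It assumes 8-bit samples, even dimensions and no row padding; `nv12_layout` and `nv12_linear_layout` are hypothetical names, not driver symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct nv12_layout {
	uint64_t luma_base;	/* start of the Y plane */
	uint64_t chroma_base;	/* start of the interleaved CbCr plane */
	uint32_t y_stride;	/* bytes per luma row */
	uint32_t c_stride;	/* bytes per chroma row */
	size_t total_size;	/* bytes needed for the whole frame */
};

/* Compute plane addresses for a linear (de-tiled) NV12 buffer. */
static struct nv12_layout nv12_linear_layout(uint64_t base, uint32_t width,
					     uint32_t height)
{
	struct nv12_layout l;

	l.luma_base = base;
	l.y_stride = width;
	/* width / 2 CbCr pairs of 2 bytes each: same row length as luma */
	l.c_stride = width;
	/* chroma starts right after width * height luma bytes */
	l.chroma_base = base + (uint64_t)width * height;
	/* chroma plane has half as many rows as luma */
	l.total_size = (size_t)width * height * 3 / 2;
	return l;
}
```

For a 1920x1080 frame this puts the chroma plane 2073600 bytes after the luma base, with 3110400 bytes for the whole frame.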

Signed-off-by: Adrian Ratiu 
Signed-off-by: Ezequiel Garcia 
---
 .../staging/media/hantro/hantro_postproc.c| 58 ---
 drivers/staging/media/hantro/hantro_regmap.c  | 41 +
 drivers/staging/media/hantro/hantro_regmap.h  |  8 +++
 3 files changed, 98 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro_postproc.c 
b/drivers/staging/media/hantro/hantro_postproc.c
index 6d1705a60d36..a6b3e243dc39 100644
--- a/drivers/staging/media/hantro/hantro_postproc.c
+++ b/drivers/staging/media/hantro/hantro_postproc.c
@@ -20,22 +20,35 @@
 #define VPU_PP_OUT_RGB 0x0
 #define VPU_PP_OUT_YUYV0x3
 
+#define VC8000D_PP_OUT_NV120x0
+
 void hantro_postproc_enable(struct hantro_ctx *ctx)
 {
struct hantro_regmap_fields_dec *fields = ctx->dev->reg_fields_dec;
struct vb2_v4l2_buffer *dst_buf;
-   u32 src_pp_fmt, dst_pp_fmt;
+   u32 src_pp_fmt, dst_pp_fmt, in_width, in_height;
dma_addr_t dst_dma;
 
/* Turn on pipeline mode. Must be done first. */
regmap_field_write(fields->pp_pipeline_en, 1);
 
+   /*
+* use NV12 as input format for pipeline mode as that's what decoder
+* outputs, on VC8000D it is 4x4 tiled NV12.
+*/
src_pp_fmt = VPU_PP_IN_NV12;
 
switch (ctx->vpu_dst_fmt->fourcc) {
case V4L2_PIX_FMT_YUYV:
dst_pp_fmt = VPU_PP_OUT_YUYV;
break;
+   case V4L2_PIX_FMT_NV12:
+   /* src == dst == NV12 only makes sense to de-tile on VC8000D */
+   if (ctx->dev->core_hw_dec_rev == HANTRO_VC8000_REV) {
+   dst_pp_fmt = VC8000D_PP_OUT_NV12;
+   break;
+   }
+   fallthrough;
default:
WARN(1, "output format %d not supported by the post-processor, 
this wasn't expected.",
 ctx->vpu_dst_fmt->fourcc);
@@ -46,19 +59,46 @@ void hantro_postproc_enable(struct hantro_ctx *ctx)
dst_buf = v4l2_m2m_next_dst_buf(ctx->fh.m2m_ctx);
dst_dma = vb2_dma_contig_plane_dma_addr(&dst_buf->vb2_buf, 0);
 
-   regmap_field_write(fields->pp_clk_gate, 1);
-   regmap_field_write(fields->pp_out_endian, 1);
-   regmap_field_write(fields->pp_out_swap32, 1);
-   regmap_field_write(fields->pp_max_burst, 16);
+   switch (ctx->dev->core_hw_dec_rev) {
+   case HANTRO_G1_REV:
+   regmap_field_write(fields->pp_clk_gate, 1);
+   regmap_field_write(fields->pp_out_endian, 1);
+   regmap_field_write(fields->pp_out_swap32, 1);
+   regmap_field_write(fields->pp_max_burst, 16);
+   regmap_field_write(fields->pp_orig_width, 
MB_WIDTH(ctx->dst_fmt.width));
+   regmap_field_write(fields->pp_display_width, 
ctx->dst_fmt.width);
+   in_width = MB_WIDTH(ctx->src_fmt.width);
+   in_height = MB_WIDTH(ctx->src_fmt.height);
+   break;
+   case HANTRO_VC8000_REV:
+   /* on VC8000D the PP is used to de-tile decoder output */
+   regmap_field_write(fields->pp_out_tile_e, 0);
+
+   regmap_field_write(fields->pp_out_y_stride, ctx->dst_fmt.width);
+   regmap_field_write(fields->pp_out_c_stride, ctx->dst_fmt.width);
+
+   regmap_field_write(fields->pp_out_chroma_base, dst_dma +
+  ctx->dst_fmt.width * ctx->dst_fmt.height);
+
+   /* VC8000D input resolution is a 2-pixels length. */
+   in_width = ctx->src_fmt.width / 2;
+   in_height = ctx->src_fmt.height / 2;
+
+   break;
+   default:
+   vpu_err("PP does not recognize HW revision: %x, disabling\n",
+   ctx->dev->core_hw_dec_rev);
+   hantro_postproc_disable(ctx);
+   return;
+   }
+
regmap_field_write(fields->pp_out_luma_base, dst_dma);
-   regmap_field_write(fields->pp_input_width, 
MB_WIDTH(ctx->dst_fmt.width));
-   regmap_field_write(fields->pp_input_height, 
MB_HEIGHT(ctx->dst_fmt.height));
+   regmap_field_write(fields->pp_input_width, in_width);
+   regmap_field_write(fields->pp_input_height, in_height);
regmap_field_write(fields->pp_input_fmt, src_pp_fmt);
regmap_field_write(fields->pp_output_fmt, dst_pp_fmt);
regmap_field_write(fields->pp_output_width, ctx->dst_

[PATCH 06/18] media: hantro: imx8mq: simplify ctrlblk reset logic

2020-10-12 Thread Adrian Ratiu
The G1 and G2 cores on imx8mq share a common "control block" used
to reset and enable the core clocks as well as enable functioning
via ctrl FUSE registers (these are not the FUSEs on the VPU cores,
they are just used to enable/disable the cores and allow the real
VPU FUSE regs to become available).

The problem is that, while the cores can be operated independently
from one another (different config reg mem regions, separate IRQs),
they can not be reset or powered down independently as the current
code implies. This has been a source for many bugs and frustration
when trying to enable G2 which this driver does not support yet.

So we simplify the ctrlblk reset logic to always reset both cores,
exactly like the vendor linux-imx provided driver "hantrodec" does
for this SoC.

Going forward, this simplified code should be moved to its own
reset controller driver, as the reset framework also supports
shared reset resources so the runtime PM logic can disable both cores
when none of them are in use (this is not done yet because only G1
is supported in the driver so there is no need to account for G2).

Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/hantro/hantro.h   |  2 -
 drivers/staging/media/hantro/imx8m_vpu_hw.c | 74 +++--
 2 files changed, 24 insertions(+), 52 deletions(-)

diff --git a/drivers/staging/media/hantro/hantro.h 
b/drivers/staging/media/hantro/hantro.h
index bb442eb1974e..2dd4362d4080 100644
--- a/drivers/staging/media/hantro/hantro.h
+++ b/drivers/staging/media/hantro/hantro.h
@@ -167,7 +167,6 @@ hantro_vdev_to_func(struct video_device *vdev)
  * @reg_bases: Mapped addresses of VPU registers.
  * @enc_base:  Mapped address of VPU encoder register for convenience.
  * @dec_base:  Mapped address of VPU decoder register for convenience.
- * @ctrl_base: Mapped address of VPU control block.
  * @vpu_mutex: Mutex to synchronize V4L2 calls.
  * @irqlock:   Spinlock to synchronize access to data structures
  * shared with interrupt handlers.
@@ -187,7 +186,6 @@ struct hantro_dev {
void __iomem **reg_bases;
void __iomem *enc_base;
void __iomem *dec_base;
-   void __iomem *ctrl_base;
 
struct mutex vpu_mutex; /* video_device lock */
spinlock_t irqlock;
diff --git a/drivers/staging/media/hantro/imx8m_vpu_hw.c 
b/drivers/staging/media/hantro/imx8m_vpu_hw.c
index c222de075ef4..b2a401a33992 100644
--- a/drivers/staging/media/hantro/imx8m_vpu_hw.c
+++ b/drivers/staging/media/hantro/imx8m_vpu_hw.c
@@ -24,34 +24,13 @@
 #define CTRL_G1_PP_FUSE0x0c
 #define CTRL_G2_DEC_FUSE   0x10
 
-static void imx8m_soft_reset(struct hantro_dev *vpu, u32 reset_bits)
-{
-   u32 val;
-
-   /* Assert */
-   val = readl(vpu->ctrl_base + CTRL_SOFT_RESET);
-   val &= ~reset_bits;
-   writel(val, vpu->ctrl_base + CTRL_SOFT_RESET);
-
-   udelay(2);
-
-   /* Release */
-   val = readl(vpu->ctrl_base + CTRL_SOFT_RESET);
-   val |= reset_bits;
-   writel(val, vpu->ctrl_base + CTRL_SOFT_RESET);
-}
-
-static void imx8m_clk_enable(struct hantro_dev *vpu, u32 clock_bits)
-{
-   u32 val;
-
-   val = readl(vpu->ctrl_base + CTRL_CLOCK_ENABLE);
-   val |= clock_bits;
-   writel(val, vpu->ctrl_base + CTRL_CLOCK_ENABLE);
-}
-
-static int imx8mq_runtime_resume(struct hantro_dev *vpu)
+/*
+ * Due to a HW limitation, both G1 and G2 VPU cores on imx8mq need to be reset
+ * together via their unified ctrl block.
+ */
+static int imx8mq_ctrlblk_reset(struct hantro_dev *vpu)
 {
+   void __iomem *ctrl_base = vpu->reg_bases[vpu->variant->num_regs - 1];
int ret;
 
ret = clk_bulk_prepare_enable(vpu->variant->num_clocks, vpu->clocks);
@@ -60,13 +39,18 @@ static int imx8mq_runtime_resume(struct hantro_dev *vpu)
return ret;
}
 
-   imx8m_soft_reset(vpu, RESET_G1 | RESET_G2);
-   imx8m_clk_enable(vpu, CLOCK_G1 | CLOCK_G2);
+   /* reset HW and ungate clocks via ctrl block */
+   writel(RESET_G1 | RESET_G2, ctrl_base + CTRL_SOFT_RESET);
+   writel(CLOCK_G1 | CLOCK_G2, ctrl_base + CTRL_CLOCK_ENABLE);
 
-   /* Set values of the fuse registers */
-   writel(0x, vpu->ctrl_base + CTRL_G1_DEC_FUSE);
-   writel(0x, vpu->ctrl_base + CTRL_G1_PP_FUSE);
-   writel(0x, vpu->ctrl_base + CTRL_G2_DEC_FUSE);
+   /*
+* enable fuse functionalities for each core, these are not real fuses
+* but registers which enable the cores and make accessible their real
+* read-only fuse registers describing supported features.
+*/
+   writel(0x, ctrl_base + CTRL_G1_DEC_FUSE);
+   writel(0x, ctrl_base + CTRL_G1_PP_FUSE);
+   writel(0x, ctrl_base + CTRL_G2_DEC_FUSE);
 
clk_bulk_disable_unprepare(vpu->vari

[PATCH 07/18] regmap: mmio: add config option to allow relaxed MMIO accesses

2020-10-12 Thread Adrian Ratiu
On some platforms (eg armv7 due to the CONFIG_ARM_DMA_MEM_BUFFERABLE)
MMIO R/W operations always add memory barriers which can increase load,
decrease battery life or in general reduce performance unnecessarily
on devices which access a lot of configuration registers and where
ordering does not matter (eg. media accelerators like the Verisilicon /
Hantro video decoders).

Drivers used to call the relaxed MMIO variants directly but since they
are now accessing the MMIO registers via regmaps (to compensate
for different VPU HW reg layouts via regmap fields), there is a need
for a relaxed API / config to preserve their existing behaviour.

Signed-off-by: Adrian Ratiu 
---
 drivers/base/regmap/regmap-mmio.c | 34 +++
 include/linux/regmap.h|  5 +
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/base/regmap/regmap-mmio.c 
b/drivers/base/regmap/regmap-mmio.c
index af967d8f975e..21193ef2a923 100644
--- a/drivers/base/regmap/regmap-mmio.c
+++ b/drivers/base/regmap/regmap-mmio.c
@@ -16,6 +16,7 @@
 struct regmap_mmio_context {
void __iomem *regs;
unsigned val_bytes;
+   bool relaxed_mmio;
 
bool attached_clk;
struct clk *clk;
@@ -72,14 +73,20 @@ static void regmap_mmio_write8(struct regmap_mmio_context 
*ctx,
unsigned int reg,
unsigned int val)
 {
-   writeb(val, ctx->regs + reg);
+   if (ctx->relaxed_mmio)
+   writeb_relaxed(val, ctx->regs + reg);
+   else
+   writeb(val, ctx->regs + reg);
 }
 
 static void regmap_mmio_write16le(struct regmap_mmio_context *ctx,
  unsigned int reg,
  unsigned int val)
 {
-   writew(val, ctx->regs + reg);
+   if (ctx->relaxed_mmio)
+   writew_relaxed(val, ctx->regs + reg);
+   else
+   writew(val, ctx->regs + reg);
 }
 
 static void regmap_mmio_write16be(struct regmap_mmio_context *ctx,
@@ -93,7 +100,10 @@ static void regmap_mmio_write32le(struct 
regmap_mmio_context *ctx,
  unsigned int reg,
  unsigned int val)
 {
-   writel(val, ctx->regs + reg);
+   if (ctx->relaxed_mmio)
+   writel_relaxed(val, ctx->regs + reg);
+   else
+   writel(val, ctx->regs + reg);
 }
 
 static void regmap_mmio_write32be(struct regmap_mmio_context *ctx,
@@ -108,7 +118,10 @@ static void regmap_mmio_write64le(struct 
regmap_mmio_context *ctx,
  unsigned int reg,
  unsigned int val)
 {
-   writeq(val, ctx->regs + reg);
+   if (ctx->relaxed_mmio)
+   writeq_relaxed(val, ctx->regs + reg);
+   else
+   writeq(val, ctx->regs + reg);
 }
 #endif
 
@@ -134,12 +147,18 @@ static int regmap_mmio_write(void *context, unsigned int 
reg, unsigned int val)
 static unsigned int regmap_mmio_read8(struct regmap_mmio_context *ctx,
  unsigned int reg)
 {
+   if (ctx->relaxed_mmio)
+   return readb_relaxed(ctx->regs + reg);
+
return readb(ctx->regs + reg);
 }
 
 static unsigned int regmap_mmio_read16le(struct regmap_mmio_context *ctx,
 unsigned int reg)
 {
+   if (ctx->relaxed_mmio)
+   return readw_relaxed(ctx->regs + reg);
+
return readw(ctx->regs + reg);
 }
 
@@ -152,6 +171,9 @@ static unsigned int regmap_mmio_read16be(struct 
regmap_mmio_context *ctx,
 static unsigned int regmap_mmio_read32le(struct regmap_mmio_context *ctx,
 unsigned int reg)
 {
+   if (ctx->relaxed_mmio)
+   return readl_relaxed(ctx->regs + reg);
+
return readl(ctx->regs + reg);
 }
 
@@ -165,6 +187,9 @@ static unsigned int regmap_mmio_read32be(struct 
regmap_mmio_context *ctx,
 static unsigned int regmap_mmio_read64le(struct regmap_mmio_context *ctx,
 unsigned int reg)
 {
+   if (ctx->relaxed_mmio)
+   return readq_relaxed(ctx->regs + reg);
+
return readq(ctx->regs + reg);
 }
 #endif
@@ -237,6 +262,7 @@ static struct regmap_mmio_context 
*regmap_mmio_gen_context(struct device *dev,
 
ctx->regs = regs;
ctx->val_bytes = config->val_bits / 8;
+   ctx->relaxed_mmio = config->use_relaxed_mmio;
ctx->clk = ERR_PTR(-ENODEV);
 
switch (regmap_get_val_endian(dev, &regmap_mmio, config)) {
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index e7834d98207f..126fe700d1d8 100644
--- a/include/linux/regmap.h
+++ b/include/linux/regmap.h
@@ -315,6 +315,10 @@ typedef void (*regmap_unlock)(void *);
  *   masks are used.
  * @zero_flag_mask: If set, read_flag_mask and

Re: [PATCH 00/18] Add Hantro regmap and VC8000 h264 decode support

2020-10-12 Thread Adrian Ratiu

Hi Jonas,

On Mon, 12 Oct 2020, Jonas Karlman  wrote:
Hi, 

On 2020-10-12 22:59, Adrian Ratiu wrote: 
Dear all,

This series introduces a regmap infrastructure for the Hantro
driver which is used to compensate for different HW-revision
register layouts. To justify it, h264 decoding capability is
added for newer VC8000 chips.

This is a gradual conversion to the new infra - a complete
conversion would have been very big and I do not have all the HW
yet to test (I'm expecting a RK3399 shipment next week though ;).
I think converting the h264 decoder provides a nice blueprint for
how the other codecs can be converted and enabled for different
HW revisions.

The end goal of this is to make the driver more generic and
eliminate entirely custom boilerplate like `struct hantro_reg` or
headers with core-specific bit manipulations like
`hantro_g1_regs.h` and instead rely on the well-tested albeit
more verbose regmap subsystem.

To give just two examples of bugs which are easily discovered by
using more verbose regmap fields (very easy to compare with the
datasheets) instead of relying on bit-magic tricks:
G1_REG_DEC_CTRL3_INIT_QP(x) was off-by-1 and the wrong .clk_gate
bit was set in hantro_postproc.c.

Anyway, this series also extends the MMIO regmap API to allow
relaxed writes for the theoretical reason that avoiding
unnecessary membarriers leads to less CPU usage and small
improvements to battery life. However, in practice I could not
measure differences between relaxed/non-relaxed IO, so I'm on the
fence whether to keep or remove the relaxed calls.

What I could measure is the performance impact of adding more
sub-reg field accesses: a constant ~ 20 microsecond bump per G1
h264 frame. This is acceptable considering the total time to
decode a frame is three orders of magnitude longer, i.e. in the
millisecond range, depending on the frame size and bitstream
params, so it is an acceptable trade-off to have a more generic
driver.


In the RK3399 variant all fields use completely different
positions, so in order to make the driver fully generic all of
the roughly 145 sub-reg fields used for h264 need to be
converted, see [1] for a quick generation of field mappings used
for h264 decoding.

Any indication on how the performance will be impacted with 145 
fields compared to around 20 fields used in this series? 


I'm aware of the RK3399 bigger layout divergence and have some 
commits converting more of the reg fields, but not all that is 
required for h264 on rk3399. I haven't seen a huge perf 
degradation but more measurements are needed, basically it depends 
on how often we go from writing a reg once to multiple times due 
to splitting.
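
To make the "once vs. multiple times" cost concrete, here is a toy userspace model, not a measurement: each field write is modelled as a read-modify-write that skips the bus write when nothing changes, and bus accesses are simply counted. The register offset and bit positions are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated bus: a few registers plus access counters. */
struct fake_bus {
	uint32_t regs[4];
	unsigned int reads;
	unsigned int writes;
};

static uint32_t bus_read(struct fake_bus *bus, unsigned int reg)
{
	bus->reads++;
	return bus->regs[reg / 4];
}

static void bus_write(struct fake_bus *bus, unsigned int reg, uint32_t val)
{
	bus->writes++;
	bus->regs[reg / 4] = val;
}

/* Read-modify-write of bits [msb:lsb]; no write if the value is unchanged. */
static void field_write(struct fake_bus *bus, unsigned int reg,
			unsigned int lsb, unsigned int msb, uint32_t val)
{
	uint32_t mask = (~0u >> (31 - msb)) & (~0u << lsb);
	uint32_t old = bus_read(bus, reg);
	uint32_t updated = (old & ~mask) | ((val << lsb) & mask);

	if (updated != old)
		bus_write(bus, reg, updated);
}

/* Old style: compose the whole register and write it once. */
static void program_combined(struct fake_bus *bus)
{
	bus_write(bus, 0x4, (0x3u << 0) | (0x1u << 4) | (0x1au << 8));
}

/* Field style: the same three values, one field at a time. */
static void program_split(struct fake_bus *bus)
{
	field_write(bus, 0x4, 0, 1, 0x3);
	field_write(bus, 0x4, 4, 4, 0x1);
	field_write(bus, 0x4, 8, 13, 0x1a);
}
```

In this model the split version ends with the same register contents but needs three reads and three writes instead of one write, which is the kind of per-register overhead that would scale with the number of converted fields.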


I tried some benchmarks using regmap caching (both the default 
backends provided by the regmap subsystem, and a custom one I 
wrote) but they were not helping, perhaps if we had more fields 
then that would have more of an impact.


(btw some good news is I'm having a RK3399 SoC in the mail for an 
unrelated project and expect to receive it soon :D)


IMO there will always be a trade-off between optimizing the driver 
to squeeze the most perf out of the HW, eg optimize reg writes at 
low microsec level (which I think here is unnecessary) and making 
it more generic to support more HW.


In this case a fundamental question we need to ask ourselves is if 
the RK3399 "looks like another/different-enough HW" due to its 
bigger reg shuffling to warrant a separate driver or 
driver-within-a-driver architecture instead of trying to bring it 
into the fold with the others, possibly degrading perf for 
everyone. I guess we'll have to see some benchmark numbers and an 
actual h264 implementation before deciding how to proceed with 
RK3399.




Another issue with the RK3399 variant is that some fields use 
different positions depending on the codec used, e.g. two 
dec_ref_frames in [2].  Should we use codec specific field maps? 
or any other suggestion on how we can handle such case?


Yes, codec specific fields would be one idea, but I'd try to avoid 
it if possible to avoid unnecessary field definitions.


The regmap field API and config we currently use are just flat 
structs (see hantro_regmap.[h|c]) but they don't have to be like 
that. Maybe we could organize it a bit better and in the future 
have some codec-level configs going on due to the regmap subsystem 
allowing de-coupling of the API (struct regmap_field) from the reg 
defs/configs (struct reg_field).
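
A very rough sketch of what such a split could look like, in plain userspace C rather than the regmap API: the codec code only names a logical field, and a per-codec table says where it lives. Every name and bit position below is hypothetical and only meant to illustrate the de-coupling.

```c
#include <assert.h>
#include <stdint.h>

/* Where a logical field lives in one layout (same idea as struct reg_field). */
struct field_def {
	unsigned int reg;	/* byte offset */
	unsigned int lsb;
	unsigned int msb;
};

/* Logical fields the codec code cares about; names are illustrative. */
enum field_id { FIELD_PIC_WIDTH, FIELD_PIC_HEIGHT, FIELD_COUNT };

/* Two made-up layouts, e.g. one per codec on the same HW variant. */
static const struct field_def layout_codec_a[FIELD_COUNT] = {
	[FIELD_PIC_WIDTH]  = { 0x10, 23, 31 },
	[FIELD_PIC_HEIGHT] = { 0x10, 11, 18 },
};

static const struct field_def layout_codec_b[FIELD_COUNT] = {
	[FIELD_PIC_WIDTH]  = { 0x20, 0, 8 },
	[FIELD_PIC_HEIGHT] = { 0x24, 0, 7 },
};

/* Write a logical field through whichever layout table is passed in. */
static void layout_field_write(uint32_t *regs, const struct field_def *layout,
			       enum field_id id, uint32_t val)
{
	const struct field_def *f = &layout[id];
	uint32_t mask = (~0u >> (31 - f->msb)) & (~0u << f->lsb);

	regs[f->reg / 4] = (regs[f->reg / 4] & ~mask) | ((val << f->lsb) & mask);
}

/* Read a logical field back through the same table. */
static uint32_t layout_field_read(const uint32_t *regs,
				  const struct field_def *layout,
				  enum field_id id)
{
	const struct field_def *f = &layout[id];
	uint32_t mask = (~0u >> (31 - f->msb)) & (~0u << f->lsb);

	return (regs[f->reg / 4] & mask) >> f->lsb;
}
```

The caller's code is identical for both tables; only the table pointer changes, which is the property that would let codec-level configs coexist without duplicating the accessors.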


That is just an idea off the top of my head :) Will have to think a 
bit more about how to handle that specific use case in the 
future. Thanks!




[1] 
https://github.com/Kwiboo/rockchip-vpu-regtool/commit/8b88d94d2ed966c7d88d9a735c0c97368eb6c92d
[2] 
https://github.com/Kwiboo/rockchip-vpu-regtool/blob/master/rk3399_dec_regs.c#L1065
[3] 
https://github.com/Kwiboo/rockchip-vpu-regtool/commit/9498326296445a9ce153b585cc48e0cea05d3c93

Best regards,
J

Re: [PATCH v12 07/29] media: v4l2-mem2mem: add v4l2_m2m_suspend, v4l2_m2m_resume

2020-09-25 Thread Adrian Ratiu

Hi,

I'm having a problem with this patch which landed in linux-next.

On Fri, 14 Aug 2020, Xia Jiang  wrote:
From: Pi-Hsun Shih

Add two functions that can be used to stop new jobs from being
queued / continue running queued job. This can be used while a
driver using m2m helper is going to suspend / wake up from
resume, and can ensure that there's no job running in suspend
process.

BUG=b:143046833
TEST=build

Signed-off-by: Pi-Hsun Shih
Signed-off-by: Jerry-ch Chen
Reviewed-by: Tomasz Figa
---
v12: add this relied patch to the series
---
 drivers/media/v4l2-core/v4l2-mem2mem.c | 41 ++
 include/media/v4l2-mem2mem.h           | 22 ++
 2 files changed, 63 insertions(+)

diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c b/drivers/media/v4l2-core/v4l2-mem2mem.c
index 62ac9424c92a..ddfdb6375064 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -43,6 +43,10 @@ module_param(debug, bool, 0644);
 #define TRANS_ABORT		(1 << 2)
 
+/* The job queue is not running new jobs */
+#define QUEUE_PAUSED		(1 << 0)
+
+
 /* Offset base for buffers on the destination queue - used to distinguish
  * between source and destination buffers when mmapping - they receive the same
  * offsets but for different queues */
@@ -84,6 +88,7 @@ static const char * const m2m_entity_name[] = {
  * @job_queue:		instances queued to run
  * @job_spinlock:	protects job_queue
  * @job_work:		worker to run queued jobs.
+ * @job_queue_flags:	flags of the queue status, %QUEUE_PAUSED.
  * @m2m_ops:		driver callbacks
  */
 struct v4l2_m2m_dev {
@@ -101,6 +106,7 @@ struct v4l2_m2m_dev {
 	struct list_head	job_queue;
 	spinlock_t		job_spinlock;
 	struct work_struct	job_work;
+	unsigned long		job_queue_flags;
 	const struct v4l2_m2m_ops *m2m_ops;
 };
@@ -263,6 +269,12 @@ static void v4l2_m2m_try_run(struct v4l2_m2m_dev *m2m_dev)
 		return;
 	}
 
+	if (m2m_dev->job_queue_flags & QUEUE_PAUSED) {
+		spin_unlock_irqrestore(&m2m_dev->job_spinlock, flags);
+		dprintk("Running new jobs is paused\n");
+		return;
+	}
+
 	m2m_dev->curr_ctx = list_first_entry(&m2m_dev->job_queue,
 				   struct v4l2_m2m_ctx, queue);
 	m2m_dev->curr_ctx->job_flags |= TRANS_RUNNING;
@@ -504,6 +516,7 @@ void v4l2_m2m_buf_done_and_job_finish(struct v4l2_m2m_dev *m2m_dev,
 	if (WARN_ON(!src_buf || !dst_buf))
 		goto unlock;
+	v4l2_m2m_buf_done(src_buf, state);


This line looks out of place in this commit and is causing a lot 
of warnings (1 per frame). Any reason in particular why we need 
this?


[   87.825061] [ cut here ]
[   87.829695] WARNING: CPU: 0 PID: 0 at drivers/media/common/videobuf2/videobuf2-core.c:986 vb2_buffer_done+0x208/0x2a0
[   87.840302] Modules linked in:
[   87.843364] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 5.9.0-rc6-next-20200924+ #472
[   87.852407] Hardware name: NXP i.MX8MQ EVK (DT)
[   87.856942] pstate: 2085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
[   87.862953] pc : vb2_buffer_done+0x208/0x2a0
[   87.867224] lr : v4l2_m2m_buf_done_and_job_finish+0x94/0x140
[   87.872882] sp : 80001183bd50
[   87.876195] x29: 80001183bd50 x28: 8000115d1500
[   87.881512] x27: 80001128e018 x26: 9fb4a828
[   87.886828] x25: a4e13a08 x24: 0080
[   87.892143] x23: 0005 x22: a253bc00
[   87.897457] x21: 9fb4aa98 x20: 9fb4a800
[   87.902772] x19: a24f x18:
[   87.908086] x17:  x16:
[   87.913400] x15:  x14: 0500
[   87.918714] x13: 0003 x12:
[   87.924028] x11: 0040 x10: 800011658520
[   87.929340] x9 : 800010998464 x8 : a5800270
[   87.934655] x7 :  x6 : 9fb4aa20
[   87.939969] x5 : 80001183bd10 x4 :
[   87.945285] x3 :  x2 :
[   87.950599] x1 : 0005 x0 : 0005
[   87.955914] Call trace:
[   87.958364]  vb2_buffer_done+0x208/0x2a0
[   87.962288]  v4l2_m2m_buf_done_and_job_finish+0x94/0x140
[   87.967601]  hantro_job_finish+0xa8/0xe0
[   87.971524]  hantro_irq_done+0x58/0x90
[   87.975275]  imx8m_vpu_g1_irq+0x8c/0x160
[   87.979201]  __handle_irq_event_percpu+0x68/0x2a0
[   87.983905]  handle_irq_event_percpu+0x3c/0xa0
[   87.988347]  handle_irq_event+0x50/0xf0
[   87.992185]  handle_fasteoi_irq+0xc0/0x180
[   87.996283]  generic_handle_irq+0x38/0x50
[   88.000296]  __handle_domain_irq+0x6c/0xd0
[   88.004393]  gic_handle_irq+0x60/0x12c
[   88.008143]  el1_irq+0xbc/0x180
[   88.011287]  arch_cpu_idle+0x1c/0x30
[   88.014864]  do_idle+0x220/0x270
[   88.018093]  cpu_startup_entry+0x30/0x70
[   88.022019]  rest_init+0xe0/0xf0
[   88.025250]  arch_call_rest_init+0x18/0x24
[   88.029347]  start_kernel+0x7a4/0x7e0
[   88.033013] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW

Re: [PATCH] media: v4l2-mem2mem: Fix spurious v4l2_m2m_buf_done

2020-09-28 Thread Adrian Ratiu

Thank you Ezequiel,

Tested-by: Adrian Ratiu 

On Mon, 28 Sep 2020, Ezequiel Garcia  
wrote:

A seemingly bad rebase introduced a spurious v4l2_m2m_buf_done,
which releases a buffer twice and therefore triggers a
noisy warning on each job:

WARNING: CPU: 0 PID: 0 at drivers/media/common/videobuf2/videobuf2-core.c:986 
vb2_buffer_done+0x208/0x2a0

Fix it by removing the spurious v4l2_m2m_buf_done.

Reported-by: Adrian Ratiu 
Fixes: 911ea8ec42dea ("media: v4l2-mem2mem: add v4l2_m2m_suspend, 
v4l2_m2m_resume")
Signed-off-by: Ezequiel Garcia 
---
 drivers/media/v4l2-core/v4l2-mem2mem.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/media/v4l2-core/v4l2-mem2mem.c 
b/drivers/media/v4l2-core/v4l2-mem2mem.c
index f626ba5ee3d9..b221b4e438a1 100644
--- a/drivers/media/v4l2-core/v4l2-mem2mem.c
+++ b/drivers/media/v4l2-core/v4l2-mem2mem.c
@@ -516,7 +516,6 @@ void v4l2_m2m_buf_done_and_job_finish(struct v4l2_m2m_dev *m2m_dev,
 
 	if (WARN_ON(!src_buf || !dst_buf))
 		goto unlock;
-	v4l2_m2m_buf_done(src_buf, state);
 	dst_buf->is_held = src_buf->flags & V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF;
 	if (!dst_buf->is_held) {
 		v4l2_m2m_dst_buf_remove(m2m_ctx);
--
2.27.0
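
For illustration only, the double release described in the commit message can be modelled in userspace. This is a minimal sketch, not videobuf2 code: `toy_buf`, `toy_buf_done()` and `toy_job_finish()` are hypothetical names, and the assumption that the real warning fires because the buffer is no longer in an active state when it is completed a second time is mine, not something stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical two-state buffer; real vb2 buffers have more states. */
enum toy_buf_state { TOY_BUF_ACTIVE, TOY_BUF_DONE };

struct toy_buf {
	enum toy_buf_state state;
	unsigned int warnings;	/* how many times completion was refused */
};

/* Complete a buffer. Returns 0 on success, -1 if it was not active. */
static int toy_buf_done(struct toy_buf *buf)
{
	assert(buf != NULL);

	if (buf->state != TOY_BUF_ACTIVE) {
		buf->warnings++;
		fprintf(stderr, "WARNING: buffer completed while not active\n");
		return -1;
	}
	buf->state = TOY_BUF_DONE;
	return 0;
}

/*
 * Simulate finishing one job. With 'spurious' set, the source buffer is
 * completed twice, mimicking the extra call the fix removes.
 * Returns the number of warnings raised for this job.
 */
static unsigned int toy_job_finish(int spurious)
{
	struct toy_buf src = { TOY_BUF_ACTIVE, 0 };

	if (spurious)
		toy_buf_done(&src);	/* the extra completion */
	toy_buf_done(&src);		/* the legitimate completion */

	return src.warnings;
}
```

Calling `toy_job_finish(1)` raises one warning per job, matching the "1 per frame" noise reported above, while `toy_job_finish(0)` raises none.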


Re: [Linux-stm32] [PATCH v8 08/10] drm: stm: dw-mipi-dsi: let the bridge handle the HW version check

2020-06-03 Thread Adrian Ratiu
On Tue, 02 Jun 2020, Emil Velikov  
wrote:
Hi Adrian, 


Hi Emil,



On Mon, 1 Jun 2020 at 10:14, Adrian Ratiu 
 wrote: 


On Fri, 29 May 2020, Philippe CORNU  
wrote: 
> Hi Adrian, and thank you very much for the patchset.  Thank 
> you also for having tested it on STM32F769 and STM32MP1. 
> Sorry for the late response, Yannick and I will review it as 
> soon as possible and we will keep you posted.  Note: Do not 
> hesitate to put us in copy for the next version 
> (philippe.co...@st.com, yannick.fer...@st.com) Regards, 
> Philippe :-) 

Hi Philippe, 

Thank you very much for your previous and future STM testing, 
really appreciate it! I've CC'd Yannick until now but I'll also 
CC you sure :) 

It's been over a month since I posted v8 and I was just gearing 
up to address all feedback, rebase & retest to prepare v9 but 
I'll wait a little longer, no problem, it's no rush. 

Small idea, pardon for joining so late: 

Might be a good idea to add inline comment, why the clocks are 
disabled so late.  Effectively a 2 line version of the commit 
summary. 


Feel free to make that a separate/follow-up patch.


Thanks, I'll add the comment to this patch in v9.



-Emil

___
Linux-rockchip mailing list
linux-rockc...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-rockchip


Re: [PATCH v8 04/10] drm: bridge: dw_mipi_dsi: allow bridge daisy chaining

2020-06-03 Thread Adrian Ratiu
On Wed, 03 Jun 2020, Laurent Pinchart 
 wrote:
Hi Adrian, 


Hi Laurent,



Thank you for the patch. 

On Mon, Apr 27, 2020 at 11:19:46AM +0300, Adrian Ratiu wrote: 
Up until now the assumption was that the Synopsys DSI bridge
will directly connect to an encoder provided by the platform
driver, but the current practice for drivers is to leave the
encoder empty via the simple encoder API and add their logic to
their own drm_bridge.

Thus we need an ability to connect the DSI bridge to another
bridge provided by the platform driver, so we extend the
dw_mipi_dsi bind() API with a new "previous bridge" arg instead
of just hardcoding NULL.

Cc: Laurent Pinchart
Signed-off-by: Adrian Ratiu
---
New in v8.
---
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c   | 6 --
 drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c | 2 +-
 include/drm/bridge/dw_mipi_dsi.h                | 5 -
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index 16fd87055e7b7..140ff40fa1b62 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -1456,11 +1456,13 @@ EXPORT_SYMBOL_GPL(dw_mipi_dsi_remove);
 /*
  * Bind/unbind API, used from platforms based on the component framework.
  */
-int dw_mipi_dsi_bind(struct dw_mipi_dsi *dsi, struct drm_encoder *encoder)
+int dw_mipi_dsi_bind(struct dw_mipi_dsi *dsi,
+		     struct drm_encoder *encoder,
+		     struct drm_bridge *prev_bridge)
 {
 	int ret;

-	ret = drm_bridge_attach(encoder, &dsi->bridge, NULL, 0);
+	ret = drm_bridge_attach(encoder, &dsi->bridge, prev_bridge, 0);


Please note that chaining of bridges doesn't work well if 
multiple bridges in the chain try to create a connector. This is 
why a DRM_BRIDGE_ATTACH_NO_CONNECTOR flag has been added, with a 
helper to create a connector for a chain of bridges 
(drm_bridge_connector_init()).  This won't play well with the 
component framework. I would recommend using the 
of_drm_find_bridge() instead in the rockchip driver, and 
deprecating dw_mipi_dsi_bind(). 



Thank you for this insight, indeed the bridge dw_mipi_dsi_bind() 
is clunky and we're making it even more so by possibly 
re-inventing drm_bridge_connector_init() with it in a way which 
can't work (well it does work but can lead to those nasty 
multiple-encoder corner-cases you mention).


I'll address this before posting v9, to try to move to 
of_drm_find_bridge() and remove dw_mipi_dsi_bind().



 	if (ret) {
 		DRM_ERROR("Failed to initialize bridge with drm\n");
 		return ret;
diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c 
b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
index 3feff0c45b3f7..83ef43be78135 100644
--- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
@@ -929,7 +929,7 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev,
 		return ret;
 	}
 
-	ret = dw_mipi_dsi_bind(dsi->dmd, &dsi->encoder);
+	ret = dw_mipi_dsi_bind(dsi->dmd, &dsi->encoder, NULL);
 	if (ret) {
 		DRM_DEV_ERROR(dev, "Failed to bind: %d\n", ret);
 		return ret;
diff --git a/include/drm/bridge/dw_mipi_dsi.h b/include/drm/bridge/dw_mipi_dsi.h
index b0e390b3288e8..699b3531f5b36 100644
--- a/include/drm/bridge/dw_mipi_dsi.h
+++ b/include/drm/bridge/dw_mipi_dsi.h
@@ -14,6 +14,7 @@
 #include 
 
 struct drm_display_mode;
+struct drm_bridge;
 struct drm_encoder;
 struct dw_mipi_dsi;
 struct mipi_dsi_device;
@@ -62,7 +63,9 @@ struct dw_mipi_dsi *dw_mipi_dsi_probe(struct platform_device 
*pdev,
  const struct dw_mipi_dsi_plat_data
  *plat_data);
 void dw_mipi_dsi_remove(struct dw_mipi_dsi *dsi);
-int dw_mipi_dsi_bind(struct dw_mipi_dsi *dsi, struct drm_encoder *encoder);
+int dw_mipi_dsi_bind(struct dw_mipi_dsi *dsi,
+struct drm_encoder *encoder,
+struct drm_bridge *prev_bridge);
 void dw_mipi_dsi_unbind(struct dw_mipi_dsi *dsi);
 void dw_mipi_dsi_set_slave(struct dw_mipi_dsi *dsi, struct dw_mipi_dsi *slave);
 


--
Regards,

Laurent Pinchart


Re: [PATCH 07/18] regmap: mmio: add config option to allow relaxed MMIO accesses

2020-10-14 Thread Adrian Ratiu

Hello Mark,

On Tue, 13 Oct 2020, Mark Brown  wrote:
On Mon, Oct 12, 2020 at 11:59:46PM +0300, Adrian Ratiu wrote: 

-	writeb(val, ctx->regs + reg);
+	if (ctx->relaxed_mmio)
+		writeb_relaxed(val, ctx->regs + reg);
+	else
+		writeb(val, ctx->regs + reg);


There is no point in doing a conditional operation on every I/O, 
it'd be better to register a different set of ops when doing 
relaxed I/O. 


Indeed I have considered adding new functions but went with this 
solution because it's easier for the users to only have to define 
a "relaxed" config then test the regmap ctx as above.


Thinking a bit more about it, yes, it makes more sense to have 
dedicated ops: this way users don't have to be explicit about 
adding memory barriers and can combine relaxed and non-relaxed more
easily, so it's also a better API trade-off in addition to 
avoiding the conditional. Thanks!


Question: Do you want me to split this patch from the series and 
send it separately just for the regmap subsystem to be easier to 
review / apply?


Kind regards,
Adrian


Re: [PATCH 07/18] regmap: mmio: add config option to allow relaxed MMIO accesses

2020-10-14 Thread Adrian Ratiu

On Wed, 14 Oct 2020, Mark Brown  wrote:
On Wed, Oct 14, 2020 at 02:51:14PM +0300, Adrian Ratiu wrote: 
On Tue, 13 Oct 2020, Mark Brown  wrote: 
> On Mon, Oct 12, 2020 at 11:59:46PM +0300, Adrian Ratiu wrote: 


> > -	writeb(val, ctx->regs + reg);
> > +	if (ctx->relaxed_mmio)
> > +		writeb_relaxed(val, ctx->regs + reg);
> > +	else
> > +		writeb(val, ctx->regs + reg);


> There is no point in doing a conditional operation on every 
> I/O, it'd be better to register a different set of ops when 
> doing relaxed I/O. 


Indeed I have considered adding new functions but went with 
this solution because it's easier for the users to only have to 
define a "relaxed" config then test the regmap ctx as above. 


It seems like you've taken this in a direction other than what 
I was thinking of here - defining separate ops doesn't mean we 
have to do anything which has any impact on the interface seen 
by users.  The regmap config is supplied at registration time, 
it's just as available then as it is when doing I/O.
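
As a rough illustration of the point above, the relaxed/non-relaxed choice can be made once, when the context is set up, by picking a function pointer. The sketch below is plain userspace C and not regmap-mmio code: `toy_ctx`, `toy_ctx_init()` and the `barriers` counter are hypothetical stand-ins, and the "barrier" is only simulated by incrementing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_REGS 4

/* Hypothetical bus context; not the regmap_mmio_context structure. */
struct toy_ctx {
	uint32_t regs[TOY_NUM_REGS];
	unsigned int barriers;	/* counts simulated memory barriers */
	void (*reg_write)(struct toy_ctx *ctx, unsigned int reg, uint32_t val);
};

/* Ordered write: the counter stands in for the barrier it would imply. */
static void toy_write_ordered(struct toy_ctx *ctx, unsigned int reg,
			      uint32_t val)
{
	assert(reg < TOY_NUM_REGS);
	ctx->barriers++;
	ctx->regs[reg] = val;
}

/* Relaxed write: no simulated barrier. */
static void toy_write_relaxed(struct toy_ctx *ctx, unsigned int reg,
			      uint32_t val)
{
	assert(reg < TOY_NUM_REGS);
	ctx->regs[reg] = val;
}

/* The config flag is inspected once here, not on every access. */
static void toy_ctx_init(struct toy_ctx *ctx, bool relaxed)
{
	*ctx = (struct toy_ctx){ 0 };
	ctx->reg_write = relaxed ? toy_write_relaxed : toy_write_ordered;
}

/* Write every register once and report how many barriers were simulated. */
static unsigned int toy_demo(bool relaxed)
{
	struct toy_ctx ctx;

	toy_ctx_init(&ctx, relaxed);
	for (unsigned int i = 0; i < TOY_NUM_REGS; i++)
		ctx.reg_write(&ctx, i, i + 1);

	return ctx.barriers;
}
```

The I/O path contains no branch on the flag, and callers see the same `reg_write` interface either way, which is the property being asked for in the review.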


Right. I got confused by the meaning of ops :) Sorry about that.



Thinking a bit more about it, yes, it makes more sense to have 
dedicated ops: this way users don't have to be explicit about 
adding memory barriers and can combine relaxed and non-relaxed more 
easily, so it's also a better API trade-off in addition to 
avoiding the conditional. Thanks! 


I'm not sure what you're proposing here - it does seem useful to 
be able to combine relaxed and non-relaxed I/O but that seems 
like it'd break down the abstraction for regmap since that's not 
really a concept other buses are going to have?  Unless we 
provide an operation to switch by setting flags or something 
possibly and integrate it with the cache perhaps.  Could you be 
a bit more specific about what you were thinking of here please?


I was thinking about exposing a relaxed API like 
regmap_write_relaxed but now that I know what you meant by ops and 
also that it doesn't make sense for other busses / violates the 
abstraction, I realize that is a bad idea and I will continue 
improving this to avoid the conditional and send a separate 
patch. Thanks again!





Question: Do you want me to split this patch from the series and send it
separately just for the regmap subsystem to be easier to review / apply?


Sure.


[PATCH v2] regmap: mmio: add config option to allow relaxed MMIO accesses

2020-10-14 Thread Adrian Ratiu
On some platforms (eg armv7 due to the CONFIG_ARM_DMA_MEM_BUFFERABLE)
MMIO R/W operations always add memory barriers which can increase load,
decrease battery life or in general reduce performance unnecessarily
on devices which access a lot of configuration registers and where
ordering does not matter (eg. media accelerators like the Verisilicon /
Hantro video decoders).

Drivers used to call the relaxed MMIO variants directly but since they
are now accessing the MMIO registers via regmaps (to compensate for
different VPU HW reg layouts via regmap fields), there is a need for a
relaxed API / config to preserve existing behaviour.

Cc: Mark Brown 
Signed-off-by: Adrian Ratiu 
---
Changes in v2:
  - Moved conditional outside of I/O call path, to be done just
  once during context initialization (Mark)
---
 drivers/base/regmap/regmap-mmio.c | 90 ---
 include/linux/regmap.h|  5 ++
 2 files changed, 87 insertions(+), 8 deletions(-)

diff --git a/drivers/base/regmap/regmap-mmio.c 
b/drivers/base/regmap/regmap-mmio.c
index af967d8f975e..f9cd51afb9d2 100644
--- a/drivers/base/regmap/regmap-mmio.c
+++ b/drivers/base/regmap/regmap-mmio.c
@@ -16,6 +16,7 @@
 struct regmap_mmio_context {
void __iomem *regs;
unsigned val_bytes;
+   bool relaxed_mmio;
 
bool attached_clk;
struct clk *clk;
@@ -75,6 +76,13 @@ static void regmap_mmio_write8(struct regmap_mmio_context 
*ctx,
writeb(val, ctx->regs + reg);
 }
 
+static void regmap_mmio_write8_relaxed(struct regmap_mmio_context *ctx,
+   unsigned int reg,
+   unsigned int val)
+{
+   writeb_relaxed(val, ctx->regs + reg);
+}
+
 static void regmap_mmio_write16le(struct regmap_mmio_context *ctx,
  unsigned int reg,
  unsigned int val)
@@ -82,6 +90,13 @@ static void regmap_mmio_write16le(struct regmap_mmio_context 
*ctx,
writew(val, ctx->regs + reg);
 }
 
+static void regmap_mmio_write16le_relaxed(struct regmap_mmio_context *ctx,
+ unsigned int reg,
+ unsigned int val)
+{
+   writew_relaxed(val, ctx->regs + reg);
+}
+
 static void regmap_mmio_write16be(struct regmap_mmio_context *ctx,
  unsigned int reg,
  unsigned int val)
@@ -96,6 +111,13 @@ static void regmap_mmio_write32le(struct 
regmap_mmio_context *ctx,
writel(val, ctx->regs + reg);
 }
 
+static void regmap_mmio_write32le_relaxed(struct regmap_mmio_context *ctx,
+ unsigned int reg,
+ unsigned int val)
+{
+   writel_relaxed(val, ctx->regs + reg);
+}
+
 static void regmap_mmio_write32be(struct regmap_mmio_context *ctx,
  unsigned int reg,
  unsigned int val)
@@ -110,6 +132,13 @@ static void regmap_mmio_write64le(struct 
regmap_mmio_context *ctx,
 {
writeq(val, ctx->regs + reg);
 }
+
+static void regmap_mmio_write64le_relaxed(struct regmap_mmio_context *ctx,
+ unsigned int reg,
+ unsigned int val)
+{
+   writeq_relaxed(val, ctx->regs + reg);
+}
 #endif
 
 static int regmap_mmio_write(void *context, unsigned int reg, unsigned int val)
@@ -137,12 +166,24 @@ static unsigned int regmap_mmio_read8(struct 
regmap_mmio_context *ctx,
return readb(ctx->regs + reg);
 }
 
+static unsigned int regmap_mmio_read8_relaxed(struct regmap_mmio_context *ctx,
+ unsigned int reg)
+{
+   return readb_relaxed(ctx->regs + reg);
+}
+
 static unsigned int regmap_mmio_read16le(struct regmap_mmio_context *ctx,
 unsigned int reg)
 {
return readw(ctx->regs + reg);
 }
 
+static unsigned int regmap_mmio_read16le_relaxed(struct regmap_mmio_context 
*ctx,
+unsigned int reg)
+{
+   return readw_relaxed(ctx->regs + reg);
+}
+
 static unsigned int regmap_mmio_read16be(struct regmap_mmio_context *ctx,
 unsigned int reg)
 {
@@ -155,6 +196,12 @@ static unsigned int regmap_mmio_read32le(struct 
regmap_mmio_context *ctx,
return readl(ctx->regs + reg);
 }
 
+static unsigned int regmap_mmio_read32le_relaxed(struct regmap_mmio_context 
*ctx,
+unsigned int reg)
+{
+   return readl_relaxed(ctx->regs + reg);
+}
+
 static unsigned int regmap_mmio_read32be(struct regmap_mmio_context *ctx,
 unsigned int reg)
 {
@@ -167,6 +214,12 @@ static unsigned int regmap_mmio_read64le(struct 
regmap_mmio_context *ctx,
 {
return readq(ctx->regs + reg);
 }
+
+static unsigned int regmap_mmio_read

Re: [PATCH v9 00/11] Genericize DW MIPI DSI bridge and add i.MX 6 driver

2020-10-23 Thread Adrian Ratiu

Hi Neil,

On Tue, 15 Sep 2020, Neil Armstrong  
wrote:
Hi Adrian, 

Gentle ping. 

can you rebase on drm-misc-next so I can apply the IMX and STM 
patches ?


Sorry for the late reply, somehow missed this e-mail chain.

I have a rebase of the series but further investigation revealed 
we might regress Rockchip with a partial integration, so I'm 
getting a panel for RK to test to be sure and will re-submit.




On 24/08/2020 11:47, Neil Armstrong wrote:

Hi,


On 15/08/2020 15:05, Ezequiel Garcia wrote:

Hi Neil,

On Wed, 2020-07-01 at 09:35 +0300, Adrian Ratiu wrote:

Hi Neil,

On Mon, 29 Jun 2020, Neil Armstrong  
wrote:
Hi Adrian, 

On 09/06/2020 19:49, Adrian Ratiu wrote: 

[...]




It's been a month so I think it's a good idea to go forward
applying IMX and STM patches (probably with the usual
rebase dance).

As for Rockchip...

The binding API removal change which directly touches RK can also 
be applied separately, but unfortunately I do not have access to a 
RK board with a DSI display to test it (or the bridge regmap logic 
on RK btw...), I just "eye-balled" the RK code based on the public 
docs and it LGTM.




... I'll be getting some DSI hardware to help with the pending
Rockchip issues, so we can tackle Rockchip as well. I'm quite sure
we'll loop Heiko as well if needed :-)


Sure, Adrian, can you rebase on drm-misc-next so I can apply the IMX and STM 
patches ?



Cheers,
Ezequiel


Neil


Big thank you to everyone who has contributed to this up to now,
Adrian

Adrian Ratiu (11):
  drm: bridge: dw_mipi_dsi: add initial regmap infrastructure
  drm: bridge: dw_mipi_dsi: abstract register access using reg_fields
  drm: bridge: dw_mipi_dsi: add dsi v1.01 support
  drm: bridge: dw_mipi_dsi: remove bind/unbind API
  dt-bindings: display: add i.MX6 MIPI DSI host controller doc
  ARM: dts: imx6qdl: add missing mipi dsi properties
  drm: imx: Add i.MX 6 MIPI DSI host platform driver
  drm: stm: dw-mipi-dsi: let the bridge handle the HW version check
  drm: bridge: dw-mipi-dsi: split low power cfg register into fields
  drm: bridge: dw-mipi-dsi: fix bad register field offsets
  Documentation: gpu: todo: Add dw-mipi-dsi consolidation plan

 .../display/imx/fsl,mipi-dsi-imx6.yaml| 112 +++
 Documentation/gpu/todo.rst|  25 +
 arch/arm/boot/dts/imx6qdl.dtsi|   8 +
 drivers/gpu/drm/bridge/synopsys/Kconfig   |   1 +
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 713 --
 drivers/gpu/drm/imx/Kconfig   |   8 +
 drivers/gpu/drm/imx/Makefile  |   1 +
 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c| 399 ++
 .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c   |   7 +-
 drivers/gpu/drm/stm/dw_mipi_dsi-stm.c |  16 +-
 10 files changed, 1059 insertions(+), 231 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/imx/fsl,mipi-dsi-imx6.yaml
 create mode 100644 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c







Re: [PATCH v5 0/3] media: rkvdec: Add a VP9 backend

2020-11-10 Thread Adrian Ratiu

Hi Ezequiel,

On Tue, 10 Nov 2020, Ezequiel Garcia 
 wrote:
On Mon, 2 Nov 2020 at 16:04, Adrian Ratiu 
 wrote: 


Dear all, 

This is v5 of the series adding VP9 profile 0 decoding to 
rkvdec. 

All feedback from v4 should be addressed, there's just one 
thing I did not address: ref_frame_sign_biases in the uAPI. The 
userspace tool I'm 


I believe that Hantro G2 VP9 needs ref_frame_sign_biases. 

I think that it's also needed for the MTK decoder.  Might be 
worth checking that as well, if the code is publicly available 
somewhere.


I consulted the imx8m app ref manual for the Hantro G2 core and 
indeed there's not one, but three fields at SWREG11 and 13 (last, 
gold, alt) to signify sign biases for ref frames. Thanks for the 
hint!




Coming to think about it, I think we are really close to having 
this uAPI directly upstream. 

Let's take a step back on why we have these uAPIs in the staging 
area. Couple years ago, there were some doubts in the media 
community about these uAPIs, and we wanted to wait a bit for 
more users before moving to public land. 

The uAPIs were meant to be in staging until enough users 
appeared and we were confident enough to move to stable. 

For VP9, given the feedback received through the year was 
already addressed, I think all that's left is to check the 
interface and make sure it can support Rockchip (RK3399, RK3326, 
etc), Hantro G2 and Mediatek, 

We will be very close to having a public API, and we could even 
merge it directly there.


Thank you very much for this background. I understand that the 
uAPI is independent from the driver implementations, so having a 
good stable uAPI is beneficial when (for example) adding support 
for VP9 on G2 in  hantro or for upstream adoption of these 
drivers.


Given this rkvdec driver implementation is also adding the VP9 
uAPI and it's very close to stability (maybe only missing ref 
frame sign bias, but who knows?) would you like to block its 
submission until the uAPI is finalized or would it make sense to 
treat the uAPI de-staging process separately because the uAPI is 
independent from the driver? 


Thanks,
Adrian



Thanks,
Ezequiel


using [1] apparently doesn't need it or the default hwreg value for it
is capable of decoding the bitstreams I used on the driver, so I don't
really have a use-case to change and test that. :)

Considering the uAPI is a work in progress and expected to be modified,
ref_frame_sign_biases can be added later with others which might be
required to enable more functionality (for eg profiles >= 1).

Series tested on rk3399 and applies on next-20201030.

[1] https://github.com/Kwiboo/FFmpeg/tree/v4l2-request-hwaccel-4.2.2-rkvdec

Changelog
-

v5:

* Drop unnecessary OUTPUT buffer payload set in .buf_prepare.
* Drop obsolete .per_request ctrl flag
* Added new vp9 ctrls to v4l2_ctrl_ptr
* Fix pahole detected padding issues
* Send userspace an error if it tries to reconfigure decode resolution
  as v4l2 or rkvdec-vp9 backend do not support dynamic res changes yet
* Allow frame ctx probability tables to be non-mandatory so users can
  set them directly during frame decoding in cases where no defaults
  have been set previously (eg. ffmpeg vp9 backend)
* Some comments and documentation clarifications
* Minor checkpatch fixes

v4:

* Drop color_space field from the VP9 interface.
  V4L2 API should be used for it.
* Clarified Segment-ID comments.
* Moved motion vector probabilities to a separate
  struct.

v3:

* Fix documentation issues found by Hans.
* Fix smatch detected issues as pointed out by Hans.
* Added patch to fix wrong bytesused set on .buf_prepare.

v2:

* Documentation style issues pointed out by Nicolas internally.
* s/VP9_PROFILE_MAX/V4L2_VP9_PROFILE_MAX/
* Fix wrong kfree(ctx).
* constify a couple structs on rkvdec-vp9.c


Boris Brezillon (2):
  media: uapi: Add VP9 stateless decoder controls
  media: rkvdec: Add the VP9 backend

Ezequiel Garcia (1):
  media: rkvdec: Fix .buf_prepare

 .../userspace-api/media/v4l/biblio.rst|   10 +
 .../media/v4l/ext-ctrls-codec.rst |  550 ++
 drivers/media/v4l2-core/v4l2-ctrls.c  |  239 +++
 drivers/media/v4l2-core/v4l2-ioctl.c  |1 +
 drivers/staging/media/rkvdec/Makefile |2 +-
 drivers/staging/media/rkvdec/rkvdec-vp9.c | 1577 +
 drivers/staging/media/rkvdec/rkvdec.c |   72 +-
 drivers/staging/media/rkvdec/rkvdec.h |6 +
 include/media/v4l2-ctrls.h|5 +
 include/media/vp9-ctrls.h |  486 +
 10 files changed, 2942 insertions(+), 6 deletions(-)
 create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
 create mode 100644 include/media/vp9-ctrls.h

--
2.29.0



Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization

2020-11-10 Thread Adrian Ratiu
On Tue, 10 Nov 2020, Nick Desaulniers  
wrote:
On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu 
 wrote: 


On Fri, 06 Nov 2020, Nick Desaulniers  
wrote: 
> +#pragma clang loop vectorize(enable) 
>          do { 
>                  p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0]; 
>                  p1[1] ^= p2[1] ^ p3[1] ^ p4[1] ^ p5[1]; 
> ``` 
> seems to generate the vectorized code. 
> 
> Why don't we find a way to make those pragma's more toolchain 
> portable, rather than open coding them like I have above 
> rather than this series? 

Hi again Nick, 

How did you verify the above pragmas generate correct 
vectorized code?  Have you tested this specific use case? 


I read the disassembly before and after my suggested use of 
pragmas; look for vld/vstr.  You can also add 
-Rpass-missed=loop-vectorize to CFLAGS_xor-neon.o in 
arch/arm/lib/Makefile and rebuild arch/arm/lib/xor-neon.o with 
CONFIG_BTRFS enabled. 



I'm asking because overruling the cost model might not be 
enough, the only thing I can confirm is that the generated code 
is changed, but not that it is correct in any way. The object 
disasm also looks weird, but I don't have enough knowledge to 
start debugging what's happening within LLVM/Clang itself. 


It doesn't "look weird" to me. The loop is versioned based on a 
comparison whether the parameters alias or not. There's a 
non-vectorized version if the parameters are equal or close 
enough to overlap.  There's another version of the loop that's 
vectorized.  If you want just the vectorized version, then you 
have to mark the parameters as __restrict qualified, then check 
that all callers are ok with that. 
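
To make the aliasing point concrete, here is a minimal standalone sketch of an XOR loop whose pointer parameters are restrict-qualified. It is not the kernel's `xor_8regs_*` code: `toy_xor_2()` and `toy_xor_demo()` are hypothetical names, and standard C99 `restrict` is used in place of the `__restrict` spelling mentioned above. Whether a given compiler then drops the runtime overlap check and the scalar fallback loop is up to its optimizer, so the disassembly should still be inspected.

```c
#include <assert.h>
#include <stddef.h>

/*
 * XOR 'words' unsigned longs from p2 into p1.
 * The restrict qualifiers are a promise from the caller that p1 and p2
 * never overlap; passing overlapping buffers is undefined behaviour.
 */
static void toy_xor_2(size_t words, unsigned long *restrict p1,
		      const unsigned long *restrict p2)
{
	for (size_t i = 0; i < words; i++)
		p1[i] ^= p2[i];
}

/* Small self-check using two distinct (non-overlapping) arrays. */
static unsigned long toy_xor_demo(void)
{
	unsigned long a[8], b[8];

	for (size_t i = 0; i < 8; i++) {
		a[i] = i;
		b[i] = 0xffUL;
	}
	toy_xor_2(8, a, b);

	return a[3];	/* 3 ^ 0xff */
}
```

The qualifier shifts responsibility to the callers, which is why every call site has to be audited before adding it.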



Thank you for the explanation, that does make sense now. I'm just 
a compiler optimization noob, sorry. All your help is much 
appreciated.




I also get some new warnings with your code [1], besides the 
previously 'vectorization was possible but not beneficial' 
which is still present. It is quite funny because these two 
warnings seem to contradict themselves. :) 


From which compiler?
```
$ clang -Wpass-failed=transform-warning -c -x c /dev/null
warning: unknown warning option '-Wpass-failed=transform-warning';
did you mean '-Wprofile-instr-missing'? [-Wunknown-warning-option]
```


I'm using Clang 10.0.1-1 from the Arch Linux repo.

In the LLVM sources that transform-warning appears to be 
documented under 
llvm-10.0.1.src/docs/Passes.rst:1227:-transform-warning


Here's a build log: http://ix.io/2DIc

I always get those warnings with the pragma change you suggested, 
even on clean builds on latest linux-next.


I looked at the Arch PKGBUILD and they don't appear to do anything 
special other than patching to enable SSP and PIE by default (eg 
llvm bug 13410).




The pragma is clang specific, hence my recommendation to wrap it 
in an #ifdef __clang__. 



Yes, I understand that. :)



At this point I do not trust the compiler and am inclined to do 


Nonsense. 

like was done for GCC when it was broken: disable the 
optimization and warn users to upgrade after the compiler is 
fixed and confirmed to work. 

If you agree I can send a v2 with this and also drop the GCC 
pragma as Arvind and Ard suggested. 


If you resend "this" as in 2/2, I will NACK it.  There's nothing 
wrong with the cost model; it's saying there's little point in 
generating the vectorized version because you're still going to 
need a non-vectorized loop version anyways.  Claiming there is a 
compiler bug here is dubious just because the cost models 
between two compilers differ slightly.


Ok, so that "remark" from the compiler is safe to ignore.



Resend the patch removing the warning, remove the GCC pragma, 
but if you want to change anything here for Clang, use `#pragma 
clang loop vectorize(enable)` wrapped in an `#ifdef __clang__`. 
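
A minimal sketch of the guarded pragma suggested here, written as standalone C rather than as a patch to the kernel's xor header. `toy_xor_3()` and `toy_xor3_demo()` are hypothetical names; the only thing taken from the thread is the `#pragma clang loop vectorize(enable)` hint wrapped in `#ifdef __clang__`. The hint is a request, so the compiler may still report that it could not vectorize the loop.

```c
#include <assert.h>
#include <stddef.h>

/* XOR 'words' unsigned longs from p2 and p3 into p1. */
static void toy_xor_3(size_t words, unsigned long *p1,
		      const unsigned long *p2, const unsigned long *p3)
{
	size_t i;

	/* Other compilers never see the clang-specific pragma. */
#ifdef __clang__
#pragma clang loop vectorize(enable)
#endif
	for (i = 0; i < words; i++)
		p1[i] ^= p2[i] ^ p3[i];
}

/* Small self-check; the result is the same with or without the hint. */
static unsigned long toy_xor3_demo(void)
{
	unsigned long a[8], b[8], c[8];
	size_t i;

	for (i = 0; i < 8; i++) {
		a[i] = i;
		b[i] = 0x0fUL;
		c[i] = 0xf0UL;
	}
	toy_xor_3(8, a, b, c);

	return a[5];	/* 5 ^ 0x0f ^ 0xf0 */
}
```

The guard keeps GCC builds free of unknown-pragma warnings while leaving the loop body itself identical for both toolchains.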



Thanks for making the NACK clear, so the way forward is to either 
use the pragma if I can figure out the new 'loop not vectorized' 
warning (which might also be a red herring) or just leave Clang as 
is. :)




Kind regards,
Adrian

[1]
./include/asm-generic/xor.h:11:1: warning: loop not vectorized:
the optimizer was unable to perform the requested transformation;
the transformation might be disabled or specified as part of an
unsupported transformation ordering
[-Wpass-failed=transform-warning] xor_8regs_2(unsigned long bytes,
unsigned long *p1, unsigned long *p2)



--
Thanks,
~Nick Desaulniers


Re: [PATCH v5 0/3] media: rkvdec: Add a VP9 backend

2020-11-10 Thread Adrian Ratiu
On Tue, 10 Nov 2020, Ezequiel Garcia  
wrote:
On Wed, 2020-11-11 at 00:28 +0200, Adrian Ratiu wrote: 
Hi Ezequiel, 
  
On Tue, 10 Nov 2020, Ezequiel Garcia 
 wrote: 
> On Mon, 2 Nov 2020 at 16:04, Adrian Ratiu 
>  wrote:  
> > Dear all,
> >
> > This is v5 of the series adding VP9 profile 0 decoding to
> > rkvdec.
> >
> > All feedback from v4 should be addressed, there's just one
> > thing I did not address: ref_frame_sign_biases in the uAPI. The
> > userspace tool I'm
>
> I believe that Hantro G2 VP9 needs ref_frame_sign_biases.
>
> I think that it's also needed for the MTK decoder.  Might be
> worth checking that as well, if the code is publicly
> available somewhere.

I consulted the imx8m app ref manual for the Hantro G2 core
and indeed there's not one, but three fields at SWREG11 and 13
(last, gold, alt) to signify sign biases for ref frames.
Thanks for the hint!

> Coming to think about it, I think we are really close to
> having this uAPI directly upstream.
>
> Let's take a step back on why we have these uAPIs in the
> staging area. Couple years ago, there were some doubts in the
> media community about these uAPIs, and we wanted to wait a bit
> for more users before moving to public land.
>
> The uAPIs were meant to be in staging until enough users
> appeared and we were confident enough to move to stable.
>
> For VP9, given the feedback received through the year was
> already addressed, I think all that's left is to check the
> interface and make sure it can support Rockchip (RK3399,
> RK3326, etc), Hantro G2 and Mediatek,
>
> We will be very close to having a public API, and we could
> even merge it directly there.

Thank you very much for this background. I understand that the
uAPI is independent from the driver implementations, so having
a good stable uAPI is beneficial when (for example) adding
support for VP9 on G2 in hantro or for upstream adoption of
these drivers.

Given this rkvdec driver implementation is also adding the VP9
uAPI and it's very close to stability (maybe only missing ref
frame sign bias, but who knows?) would you like to block its
submission until the uAPI is finalized or would it make sense
to treat the uAPI de-staging process separately because the
uAPI is independent from the driver?


I don't mean to block it, quite the opposite, to make sure we 
take this opportunity to go through Rockchip, Hantro and 
Mediatek, double-check the uAPI is covering all the VP9 syntax, 
and then target for public API.


That makes sense. I'm just cautious to not directly botch the 
public API, but that's what reviews are for, right? :) Thanks 
again for helping with background & direction.




Cheers,
Ezequiel


Thanks,
Adrian

> Thanks,
> Ezequiel
> 
> > using [1] apparently doesn't need it or the default hwreg value for it

> > is capable of decoding the bitstreams I used on the driver, so I don't
> > really have a use-case to change and test that. :)
> > 
> > Considering the uAPI is a work in progress and expected to be modified,

> > ref_frame_sign_biases can be added later with others which might be
> > required to enable more functionality (for eg profiles >= 1).
> > 
> > Series tested on rk3399 and applies on next-20201030.
> > 
> > [1] https://github.com/Kwiboo/FFmpeg/tree/v4l2-request-hwaccel-4.2.2-rkvdec
> > 
> > Changelog

> > -
> > 
> > v5:
> > 
> > * Drop unnecessary OUTPUT buffer payload set in .buf_prepare.

> > * Drop obsolete .per_request ctrl flag
> > * Added new vp9 ctrls to v4l2_ctrl_ptr
> > * Fix pahole detected padding issues
> > * Send userspace an error if it tries to reconfigure decode resolution
> >   as v4l2 or rkvdec-vp9 backend do not support dynamic res changes yet
> > * Allow frame ctx probability tables to be non-mandatory so users can
> >   set them directly during frame decoding in cases where no defaults
> >   have been set previously (eg. ffmpeg vp9 backend)
> > * Some comments and documentation clarifications
> > * Minor checkpatch fixes
> > 
> > v4:
> > 
> > * Drop color_space field from the VP9 interface.

> >   V4L2 API should be used for it.
> > * Clarified Segment-ID comments.
> > * Moved motion vector probabilities to a separate
> >   struct.
> > 
> > v3:
> > 
> > * Fix documentation issues found by Hans.

> > * Fix smatch detected issues as pointed out by Hans.
> > * Added patch to fix wrong bytesused set on .buf_prepare.
> > 
> > v2:
> > 
> > * Documentation style issues pointed out by Nicolas internally.

> > * s/VP9_PROFILE_MAX/V4L2_VP9_PROFILE_MAX/

Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization

2020-11-11 Thread Adrian Ratiu
On Tue, 10 Nov 2020, Nick Desaulniers  
wrote:
On Tue, Nov 10, 2020 at 3:54 PM Adrian Ratiu 
 wrote: 


On Tue, 10 Nov 2020, Nick Desaulniers  
wrote: 
> On Mon, Nov 9, 2020 at 11:51 AM Adrian Ratiu 
>  wrote: 
>> 
>> On Fri, 06 Nov 2020, Nick Desaulniers 
>>  wrote: 
>> > +#pragma clang loop vectorize(enable) 
>> > do { 
>> > p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0]; 
>> > p1[1] ^= p2[1] ^ p3[1] ^ p4[1] ^ p5[1]; 
>> > ``` seems to generate the vectorized code. 
>> > 
>> > Why don't we find a way to make those pragmas more 
>> > toolchain portable, rather than open coding them like I 
>> > have above, instead of this series? 
>> 
>> Hi again Nick, 
>> 
>> How did you verify the above pragmas generate correct 
>> vectorized code?  Have you tested this specific use case? 
> 
> I read the disassembly before and after my suggested use of 
> pragmas; look for vld/vstr.  You can also add 
> -Rpass-missed=loop-vectorize to CFLAGS_xor-neon.o in 
> arch/arm/lib/Makefile and rebuild arch/arm/lib/xor-neon.o 
> with CONFIG_BTRFS enabled. 
> 
>> 
>> I'm asking because overruling the cost model might not be 
>> enough, the only thing I can confirm is that the generated 
>> code is changed, but not that it is correct in any way. The 
>> object disasm also looks weird, but I don't have enough 
>> knowledge to start debugging what's happening within 
>> LLVM/Clang itself. 
> 
> It doesn't "look weird" to me. The loop is versioned based on 
> a comparison whether the parameters alias or not. There's a 
> non-vectorized version if the parameters are equal or close 
> enough to overlap.  There's another version of the loop 
> that's vectorized.  If you want just the vectorized version, 
> then you have to mark the parameters as __restrict qualified, 
> then check that all callers are ok with that. 
> 

Thank you for the explanation, that does make sense now. I'm 
just a compiler optimization noob, sorry. All your help is much 
appreciated. 


Don't worry about it; you'll get the hang of it in no time, just 
stick with it. 
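
To make the __restrict point concrete, here is a minimal sketch, not the
kernel code itself. xor_words_2() is a hypothetical stand-in for
xor_8regs_2() in include/asm-generic/xor.h, and it uses the C99
"restrict" spelling rather than the kernel's __restrict. With the
qualifiers in place the compiler may assume the two buffers never
overlap, so it has no reason to emit the runtime alias check and the
scalar fallback loop. Whether it then actually vectorizes still depends
on the flags and the cost model.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for xor_8regs_2(); the name and signature are
 * assumptions made for illustration only.
 *
 * "restrict" is a promise from the caller that p1 and p2 do not alias.
 * Breaking that promise is undefined behaviour, which is why every
 * caller has to be audited before adding the qualifier.
 */
static void xor_words_2(size_t nwords, unsigned long *restrict p1,
                        const unsigned long *restrict p2)
{
    for (size_t i = 0; i < nwords; i++)
        p1[i] ^= p2[i];
}
```

Calling it with two distinct arrays XORs the second into the first in
place; passing overlapping buffers would break the contract, which is
the "check that all callers are ok with that" part.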



>> 
>> I also get some new warnings with your code [1], besides the 
>> previously 'vectorization was possible but not beneficial' 
>> which is still present. It is quite funny because these two 
>> warnings seem to contradict themselves. :) 
> 
> From which compiler?
> ```
> $ clang -Wpass-failed=transform-warning -c -x c /dev/null
> warning: unknown warning option '-Wpass-failed=transform-warning'; did you mean '-Wprofile-instr-missing'? [-Wunknown-warning-option]
> ```

I'm using Clang 10.0.1-1 from the Arch Linux repo. 

In the LLVM sources that transform-warning appears to be 
documented under 
llvm-10.0.1.src/docs/Passes.rst:1227:-transform-warning 

Here's a build log: http://ix.io/2DIc 

I always get those warnings with the pragma change you 
suggested, even on clean builds on latest linux-next. 

I looked at the Arch PKGBUILD and they don't appear to do 
anything special other than patching to enable SSP and PIE by 
default (eg llvm bug 13410). 


Ah, custom builds of LLVM.  Grepping for transform-warning in 
LLVM's sources, I can indeed see such a pass. I'm curious 
whether Arch is turning on that pass by default or if you 
manually enabled -Wpass-failed=transform-warning in the 
Makefile?  Maybe I need to do an assertions enabled build of 
LLVM or a debug build. Reading through llvm/docs/Passes.rst and 
llvm/docs/TransformMetadata.rst, it sounds like this should be 
triggered when a "forced optimization has failed."  So I wonder 
what's the missing variable between it working for me, vs 
warning for you?


I did not build clang myself, I just did "pacman -S clang" to get 
the official distro binary package. Here's the PKGBUILD they used; 
I'm sending the commit link because the package was recently 
upgraded to clang 11.


I also tested clang 11.0.0 where I get the same warnings / 
remarks.


https://github.com/archlinux/svntogit-packages/blob/8ff1bb4e4be5c6e5bede60c6b259a89f0cee6e6a/trunk/PKGBUILD



Godbolt seems to agree with me here: 
https://godbolt.org/z/Wf6YKv.  Maybe related to the "New Pass 
Manager" ... digging into that... 



> 
> The pragma is clang specific, hence my recommendation to wrap 
> it in an #ifdef __clang__. 
> 

Yes, I understand that. :) 
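
For reference, one way the Clang-only pragma could be wrapped so that
the same source still builds with GCC. This is only a sketch:
XOR_VECTORIZE_LOOP and xor_words_3() are hypothetical names, not
existing kernel macros or functions, and the non-Clang branch simply
expands to nothing on the assumption that -ftree-vectorize is passed on
the command line instead.

```c
#include <stddef.h>

/*
 * Hypothetical wrapper macro. _Pragma() is the C99 operator form of
 * #pragma, which is what allows a pragma to live inside a macro.
 */
#ifdef __clang__
#define XOR_VECTORIZE_LOOP _Pragma("clang loop vectorize(enable)")
#else
#define XOR_VECTORIZE_LOOP /* nothing: rely on -ftree-vectorize */
#endif

/* Hypothetical three-source XOR, loosely modelled on xor_8regs_3(). */
static void xor_words_3(size_t nwords, unsigned long *p1,
                        const unsigned long *p2, const unsigned long *p3)
{
    /* The loop hint must sit directly in front of the loop it applies to. */
    XOR_VECTORIZE_LOOP
    for (size_t i = 0; i < nwords; i++)
        p1[i] ^= p2[i] ^ p3[i];
}
```

The hint only overrules the cost model; it does not change what the loop
computes, so the result is the same with either compiler.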

>> 
>> At this point I do not trust the compiler and am inclined to 
>> do 
> 
> Nonsense. 
> 
>> like was done for GCC when it was broken: disable the 
>> optimization and warn users to upgrade after the compil

[PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2020-11-12 Thread Adrian Ratiu
From: Nathan Chancellor 

Drop warning because kernel now requires GCC >= v4.9 after
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9").

Reported-by: Nick Desaulniers 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/xor-neon.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..e1e76186ec23 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"
-- 
2.29.2



[PATCH v2 0/2] xor-neon: Remove GCC warn & pragmas

2020-11-12 Thread Adrian Ratiu
Dear all,

This is v2 of the patch series at
id:20201106051436.2384842-1-adrian.ra...@collabora.com

Tested on next-20201112 using GCC 10.2.0 and Clang 10.0.1.

Kind regards,
Adrian

Changes in v2:
  - Dropped the patch which disabled Clang vectorization (Nick)
  - Added new patch to move pragmas to makefile cmdline options
  (Arvind and Ard)
  
Adrian Ratiu (1):
  arm: lib: xor-neon: move pragma options to makefile

Nathan Chancellor (1):
  arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 17 -
 2 files changed, 1 insertion(+), 18 deletions(-)

-- 
2.29.2



[PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile

2020-11-12 Thread Adrian Ratiu
Using a pragma like GCC optimize is a bad idea because it tags
all functions with an __attribute__((optimize)) which replaces
optimization options rather than appending so could result in
dropping important flags. Not recommended for production use.

Because these options should always be enabled for this file,
it's better to set them via command line. tree-vectorize is on
by default in Clang, but it doesn't hurt to make it explicit.

Suggested-by: Arvind Sankar 
Suggested-by: Ard Biesheuvel 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 10 --
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b..12d31d1a7630 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
 
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
-  CFLAGS_xor-neon.o+= $(NEON_FLAGS)
+  CFLAGS_xor-neon.o+= $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..62b493e386c4 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
 #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif
 
-/*
- * Pull in the reference implementations while instructing GCC (through
- * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
- * NEON instructions.
- */
-#ifdef CONFIG_CC_IS_GCC
-#pragma GCC optimize "tree-vectorize"
-#endif
-
-#pragma GCC diagnostic ignored "-Wunused-variable"
 #include 
 
 struct xor_block_template const xor_block_neon_inner = {
-- 
2.29.2



Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2020-11-13 Thread Adrian Ratiu

Hi Ard,

On Fri, 13 Nov 2020, Ard Biesheuvel  wrote:
On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu 
 wrote: 


From: Nathan Chancellor  

Drop warning because kernel now requires GCC >= v4.9 after 
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9"). 

Reported-by: Nick Desaulniers  
Signed-off-by: Nathan Chancellor  
Signed-off-by: Adrian Ratiu  


Again, this does not do what it says on the tin. 

If you want to disable the pragma for Clang, call that out in 
the commit log, and don't hide it under a GCC version change.


I am not doing anything for Clang in this series.

The option to auto-vectorize in Clang is enabled by default but 
doesn't work for some reason (likely to do with how it computes 
the cost model, so maybe not even a bug at all) and if we enable 
it explicitly (e.g. via a Clang-specific pragma) we get some 
warnings we currently do not understand, so I am not changing the 
Clang behaviour at the recommendation of Nick.


So this is only for GCC as the "tin" says :) We can fix clang 
separately as the Clang bug has always been present and is 
unrelated.




Without the pragma, the generated code is the same as the 
generic code, so it makes no sense to build xor-neon.ko at all, 
right? 



Yes that is correct and that is the reason why in v1 I opted to 
not build xor-neon.ko for Clang anymore, but that got NACKed, so 
here I'm fixing the low hanging fruit: the very obvious & clear 
GCC problems.




---
 arch/arm/lib/xor-neon.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..e1e76186ec23 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif

 #pragma GCC diagnostic ignored "-Wunused-variable"
--
2.29.2



Re: [PATCH v2 2/2] arm: lib: xor-neon: move pragma options to makefile

2020-11-13 Thread Adrian Ratiu

On Fri, 13 Nov 2020, Ard Biesheuvel  wrote:
On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu 
 wrote: 


Using a pragma like GCC optimize is a bad idea because it tags 
all functions with an __attribute__((optimize)) which replaces 
optimization options rather than appending so could result in 
dropping important flags. Not recommended for production use. 

Because these options should always be enabled for this file, 
it's better to set them via command line. tree-vectorize is on 
by default in Clang, but it doesn't hurt to make it explicit. 

Suggested-by: Arvind Sankar  
Suggested-by: Ard Biesheuvel  
Signed-off-by: Adrian Ratiu  
---
 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 10 --
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b..12d31d1a7630 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S

 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
-  CFLAGS_xor-neon.o+= $(NEON_FLAGS)
+  CFLAGS_xor-neon.o+= $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..62b493e386c4 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
 #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif

-/*
- * Pull in the reference implementations while instructing GCC (through
- * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
- * NEON instructions.
- */
-#ifdef CONFIG_CC_IS_GCC
-#pragma GCC optimize "tree-vectorize"
-#endif
-
-#pragma GCC diagnostic ignored "-Wunused-variable"
 #include 

 struct xor_block_template const xor_block_neon_inner = {
-- 
2.29.2



So what is the status now here? How does putting 
-ftree-vectorize on the command line interact with Clang? 


Clang needs to be fixed separately: -ftree-vectorize does not 
change anything there because the option is already enabled by default.


I know it sucks to have such a silent failure, but it's always 
been there (the "upgrade your GCC" warning during Clang builds was 
bogus) and I do not want to rush a Clang fix without fully 
understanding it.


Warning Clang users that the optimization doesn't work was 
discussed but dropped because users can't do anything about it.


If we are positively certain this is a kernel bug and not a Clang 
bug (i.e. the xor-neon use case is not enabling/triggering the 
optimization properly) I could add a TODO comment in the code 
FWIW.


Adrian


Re: [PATCH v2 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2020-11-13 Thread Adrian Ratiu

On Fri, 13 Nov 2020, Ard Biesheuvel  wrote:
On Fri, 13 Nov 2020 at 12:05, Adrian Ratiu 
 wrote: 


Hi Ard, 

On Fri, 13 Nov 2020, Ard Biesheuvel  wrote: 
> On Thu, 12 Nov 2020 at 22:23, Adrian Ratiu 
>  wrote: 
>> 
>> From: Nathan Chancellor  
>> 
>> Drop warning because kernel now requires GCC >= v4.9 after 
>> commit 6ec4476ac825 ("Raise gcc version requirement to 
>> 4.9"). 
>> 
>> Reported-by: Nick Desaulniers  
>> Signed-off-by: Nathan Chancellor  
>> Signed-off-by: Adrian Ratiu  
> 
> Again, this does not do what it says on the tin. 
> 
> If you want to disable the pragma for Clang, call that out in 
> the commit log, and don't hide it under a GCC version change. 

I am not doing anything for Clang in this series. 

The option to auto-vectorize in Clang is enabled by default but 
doesn't work for some reason (likely to do with how it computes 
the cost model, so maybe not even a bug at all) and if we 
enable it explicitly (e.g. via a Clang-specific pragma) we get 
some warnings we currently do not understand, so I am not 
changing the Clang behaviour at the recommendation of Nick. 

So this is only for GCC as the "tin" says :) We can fix clang 
separately as the Clang bug has always been present and is 
unrelated. 



But you are adding the IS_GCC check here, no? Is that 
equivalent? IOW, does Clang today identify as GCC <= 4.6? 



I see what you mean now. Thanks.

Clang identifies as GCC <= 4.6 yes, so the code is not strictly 
speaking equivalent. The warning to upgrade GCC doesn't make sense 
for Clang but I should mention removing it in the commit message 
as well.
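
A small sketch of why the old version test misfires on Clang. Clang
defines __GNUC__ for compatibility and by default reports itself as GCC
4.2.1, so the removed "#if" takes the "#else" branch there.
LOOKS_LIKE_GCC_4_6_OR_NEWER and compiler_family() are hypothetical names
used only for illustration; in the kernel the compiler is identified via
CONFIG_CC_IS_GCC / CONFIG_CC_IS_CLANG from Kconfig rather than by macros
like these.

```c
/*
 * Same condition the removed kernel code used. Under Clang (with its
 * default GCC-compatibility version of 4.2.1) this evaluates to 0,
 * even though the compiler is not an old GCC at all.
 */
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
#define LOOKS_LIKE_GCC_4_6_OR_NEWER 1
#else
#define LOOKS_LIKE_GCC_4_6_OR_NEWER 0
#endif

/*
 * Hypothetical helper: tell real GCC apart from compilers that only
 * mimic it. __clang__ has to be tested first because Clang also
 * defines __GNUC__.
 */
static const char *compiler_family(void)
{
#if defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#else
    return "other";
#endif
}
```

This is why testing CONFIG_CC_IS_GCC is not strictly equivalent to the
old check: it also stops Clang from ever reaching the "#warning" branch.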



>
> Without the pragma, the generated code is the same as the
> generic code, so it makes no sense to build xor-neon.ko at all,
> right?
>

Yes that is correct and that is the reason why in v1 I opted to
not build xor-neon.ko for Clang anymore, but that got NACKed, so
here I'm fixing the low hanging fruit: the very obvious & clear
GCC problems.




Fair enough.


>> ---
>>  arch/arm/lib/xor-neon.c | 9 +
>>  1 file changed, 1 insertion(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
>> index b99dd8e1c93f..e1e76186ec23 100644
>> --- a/arch/arm/lib/xor-neon.c
>> +++ b/arch/arm/lib/xor-neon.c
>> @@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
>>   * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
>>   * NEON instructions.
>>   */
>> -#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
>> +#ifdef CONFIG_CC_IS_GCC
>>  #pragma GCC optimize "tree-vectorize"
>> -#else
>> -/*
>> - * While older versions of GCC do not generate incorrect code, they fail to
>> - * recognize the parallel nature of these functions, and emit plain ARM code,
>> - * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
>> - */
>> -#warning This code requires at least version 4.6 of GCC
>>  #endif
>>
>>  #pragma GCC diagnostic ignored "-Wunused-variable"
>> --
>> 2.29.2
>>


[PATCH v3 2/2] arm: lib: xor-neon: move pragma options to makefile

2020-11-13 Thread Adrian Ratiu
Using a pragma like GCC optimize is a bad idea because it tags
all functions with an __attribute__((optimize)) which replaces
optimization options rather than appending so could result in
dropping important flags. Not recommended for production use.

Because these options should always be enabled for this file,
it's better to set them via command line. tree-vectorize is on
by default in Clang, but it doesn't hurt to make it explicit.

Suggested-by: Arvind Sankar 
Suggested-by: Ard Biesheuvel 
Reviewed-by: Nick Desaulniers 
Reviewed-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 10 --
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b..12d31d1a7630 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -45,6 +45,6 @@ $(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
 
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
-  CFLAGS_xor-neon.o+= $(NEON_FLAGS)
+  CFLAGS_xor-neon.o+= $(NEON_FLAGS) -ftree-vectorize -Wno-unused-variable
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..62b493e386c4 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -14,16 +14,6 @@ MODULE_LICENSE("GPL");
 #error You should compile this file with '-march=armv7-a -mfloat-abi=softfp -mfpu=neon'
 #endif
 
-/*
- * Pull in the reference implementations while instructing GCC (through
- * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
- * NEON instructions.
- */
-#ifdef CONFIG_CC_IS_GCC
-#pragma GCC optimize "tree-vectorize"
-#endif
-
-#pragma GCC diagnostic ignored "-Wunused-variable"
 #include 
 
 struct xor_block_template const xor_block_neon_inner = {
-- 
2.29.2



[PATCH v3 0/2] xor-neon: Remove GCC warn & pragmas

2020-11-13 Thread Adrian Ratiu
Dear all,

This is v3 of the patch series started at
id:20201106051436.2384842-1-adrian.ra...@collabora.com

This series does not address the Clang -ftree-vectorize not
working bug, which is a known pre-existing issue documented
at [1] [2] [3]. Clang vectorization needs to be investigated
in more depth and fixed separately. The purpose of this series
is only to fix some low-hanging-fruit GCC-related issues.

Tested on next-20201112 using GCC 10.2.0 and Clang 10.0.1.

[1] https://bugs.llvm.org/show_bug.cgi?id=40976
[2] https://github.com/ClangBuiltLinux/linux/issues/503
[3] https://github.com/ClangBuiltLinux/linux/issues/496

Kind regards,
Adrian

Changes in v3:
  - Reworded first commit (Ard)
  - Added tags by Nick and Nathan

Changes in v2:
  - Dropped the patch which disabled Clang vectorization (Nick)
  - Added new patch to move pragmas to makefile cmdline options
  (Arvind and Ard)

Adrian Ratiu (1):
  arm: lib: xor-neon: move pragma options to makefile

Nathan Chancellor (1):
  arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

 arch/arm/lib/Makefile   |  2 +-
 arch/arm/lib/xor-neon.c | 17 -
 2 files changed, 1 insertion(+), 18 deletions(-)

-- 
2.29.2



[PATCH v3 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2020-11-13 Thread Adrian Ratiu
From: Nathan Chancellor 

Drop warning because kernel now requires GCC >= v4.9 after
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9")
and clarify that -ftree-vectorize now always needs enabling
for GCC by directly testing the presence of CONFIG_CC_IS_GCC.

Another reason to remove the warning is that Clang exposes
itself as GCC < 4.6 so it triggers the warning about GCC
which doesn't make much sense and risks misleading users.

As a side note, -ftree-vectorize is on by default in
Clang, but it currently does not work (see linked issues).

Link: https://github.com/ClangBuiltLinux/linux/issues/496
Link: https://github.com/ClangBuiltLinux/linux/issues/503
Reported-by: Nick Desaulniers 
Reviewed-by: Nick Desaulniers 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/xor-neon.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..e1e76186ec23 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"
-- 
2.29.2



Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization

2020-11-09 Thread Adrian Ratiu
On Fri, 06 Nov 2020, Nick Desaulniers  
wrote:
On Fri, Nov 6, 2020 at 3:50 AM Adrian Ratiu 
 wrote: 


Hi Nathan, 

On Fri, 06 Nov 2020, Nathan Chancellor 
 wrote: 
> + Ard, who wrote this code. 
> 
> On Fri, Nov 06, 2020 at 07:14:36AM +0200, Adrian Ratiu wrote: 
>> Due to a Clang bug [1] neon autoloop vectorization does not 
>> happen or happens badly with no gains and considering 
>> previous GCC experiences which generated unoptimized code 
>> which was worse than the default asm implementation, it is 
>> safer to default clang builds to the known good generic 
>> implementation. 
>> 
>> The kernel currently supports a minimum Clang version of 
>> v10.0.1, see commit 1f7a44f63e6c ("compiler-clang: add build 
>> check for clang 10.0.1"). 
>> 
>> When the bug gets eventually fixed, this commit could be 
>> reverted or, if the minimum clang version bump takes a long 
>> time, a warning could be added for users to upgrade their 
>> compilers like was done for GCC. 
>> 
>> [1] https://bugs.llvm.org/show_bug.cgi?id=40976 
>> 
>> Signed-off-by: Adrian Ratiu  
> 
> Thank you for the patch! We are also tracking this here: 
> 
> https://github.com/ClangBuiltLinux/linux/issues/496 
> 
> It was on my TODO to revisit getting the warning eliminated, 
> which likely would have involved a patch like this as well. 
> 
> I am curious if it is worth revisiting or dusting off Arnd's 
> patch in the LLVM bug tracker first. I have not tried it 
> personally. If that is not a worthwhile option, I am fine 
> with this for now. It would be nice to try and get a fix 
> pinned down on the LLVM side at some point but alas, finite 
> amount of resources and people :( 

I tested Arnd's kernel patch from the LLVM bugtracker [1], but 
with the Clang v10.0.1 I still get warnings like the following 
even though the __restrict workaround seems to affect the 
generated instructions: 

./include/asm-generic/xor.h:15:2: remark: the cost-model indicates that interleaving is not beneficial [-Rpass-missed=loop-vectorize] 
./include/asm-generic/xor.h:11:1: remark: List vectorization was possible but not beneficial with cost 0 >= 0 [-Rpass-missed=slp-vectorizer] 
xor_8regs_2(unsigned long bytes, unsigned long *__restrict p1, unsigned long *__restrict p2) 


If it's just a matter of overruling the cost model, 
#pragma clang loop vectorize(enable) 
will do the trick. 

Indeed, 
```
diff --git a/include/asm-generic/xor.h b/include/asm-generic/xor.h
index b62a2a56a4d4..8796955498b7 100644
--- a/include/asm-generic/xor.h
+++ b/include/asm-generic/xor.h
@@ -12,6 +12,7 @@ xor_8regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
 {
        long lines = bytes / (sizeof (long)) / 8;

+#pragma clang loop vectorize(enable)
        do {
                p1[0] ^= p2[0];
                p1[1] ^= p2[1];
@@ -32,6 +33,7 @@ xor_8regs_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 {
        long lines = bytes / (sizeof (long)) / 8;

+#pragma clang loop vectorize(enable)
        do {
                p1[0] ^= p2[0] ^ p3[0];
                p1[1] ^= p2[1] ^ p3[1];
@@ -53,6 +55,7 @@ xor_8regs_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 {
        long lines = bytes / (sizeof (long)) / 8;

+#pragma clang loop vectorize(enable)
        do {
                p1[0] ^= p2[0] ^ p3[0] ^ p4[0];
                p1[1] ^= p2[1] ^ p3[1] ^ p4[1];
@@ -75,6 +78,7 @@ xor_8regs_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 {
        long lines = bytes / (sizeof (long)) / 8;

+#pragma clang loop vectorize(enable)
        do {
                p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0];
                p1[1] ^= p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
```
seems to generate the vectorized code. 

Why don't we find a way to make those pragmas more toolchain 
portable, rather than open coding them like I have above, 
instead of this series? 


Hi again Nick,

How did you verify the above pragmas generate correct vectorized 
code?  Have you tested this specific use case?


I'm asking because overruling the cost model might not be enough. 
The only thing I can confirm is that the generated code is 
changed, but not that it is correct in any way. The object disasm 
also looks weird, but I don't have enough knowledge to start 
debugging what's happening within LLVM/Clang itself.


I also get some new warnings with your code [1], besides the 
previously 'vectorization was possible but not beneficial' which 
is still present. It is quite funny because these two warnings 
seem to contradict themselves. :)


At this point I do not trust the compiler and am inclined to do 
like was done for GCC when it was broken: disable the optimization 
and warn users to upgrade after the compiler is fixed and 
confirmed to work.


If you agree I can send a v2 with this and

[PATCH v5 2/3] media: uapi: Add VP9 stateless decoder controls

2020-11-02 Thread Adrian Ratiu
From: Boris Brezillon 

Add the VP9 stateless decoder controls plus the documentation that goes
with it.

Signed-off-by: Boris Brezillon 
Signed-off-by: Ezequiel Garcia 
Signed-off-by: Adrian Ratiu 
---
 .../userspace-api/media/v4l/biblio.rst|  10 +
 .../media/v4l/ext-ctrls-codec.rst | 550 ++
 drivers/media/v4l2-core/v4l2-ctrls.c  | 239 
 drivers/media/v4l2-core/v4l2-ioctl.c  |   1 +
 include/media/v4l2-ctrls.h|   5 +
 include/media/vp9-ctrls.h | 486 
 6 files changed, 1291 insertions(+)
 create mode 100644 include/media/vp9-ctrls.h

diff --git a/Documentation/userspace-api/media/v4l/biblio.rst b/Documentation/userspace-api/media/v4l/biblio.rst
index 7869b6f6ff72..6b4a83b053f5 100644
--- a/Documentation/userspace-api/media/v4l/biblio.rst
+++ b/Documentation/userspace-api/media/v4l/biblio.rst
@@ -407,3 +407,13 @@ VP8
 :title: RFC 6386: "VP8 Data Format and Decoding Guide"
 
 :author:J. Bankoski et al.
+
+.. _vp9:
+
+VP9
+===
+
+
+:title: VP9 Bitstream & Decoding Process Specification
+
+:author:Adrian Grange (Google), Peter de Rivaz (Argon Design), Jonathan Hunt (Argon Design)
diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
index ce728c757eaf..456488f2b5ca 100644
--- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
+++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
@@ -2730,6 +2730,556 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
   - ``padding[3]``
   - Applications and drivers must set this to zero.
 
+.. _v4l2-mpeg-vp9:
+
+``V4L2_CID_MPEG_VIDEO_VP9_FRAME_CONTEXT(0..3) (struct)``
+Stores VP9 probabilities attached to a specific frame context. The VP9
+specification allows using a maximum of 4 contexts. Each frame being
+decoded refers to one of those contexts. See section '7.1.2 Refresh
+probs semantics' of :ref:`vp9` for more details about these
+contexts.
+
+This control is bi-directional:
+
+* all 4 contexts must be initialized by userspace just after the
+  stream is started and before the first decoding request is submitted.
+* the referenced context might be read by the kernel when a decoding
+  request is submitted, and will be updated after the decoder is done
+  decoding the frame if the `V4L2_VP9_FRAME_FLAG_REFRESH_FRAME_CTX` flag
+  is set.
+* contexts will be read back by user space before each decoding request
+  to retrieve the updated probabilities.
+* userspace will re-initialize the context to their default values when
+  a reset context is required.
+
+.. note::
+
+   This compound control is not yet part of the public kernel API and
+   it is expected to change.
+
+.. c:type:: v4l2_ctrl_vp9_frame_ctx
+
+.. cssclass:: longtable
+
+.. tabularcolumns:: |p{5.8cm}|p{4.8cm}|p{6.6cm}|
+
+.. flat-table:: struct v4l2_ctrl_vp9_frame_ctx
+:header-rows:  0
+:stub-columns: 0
+:widths:   1 1 2
+
+* - struct :c:type:`v4l2_vp9_probabilities`
+  - ``probs``
+  - Structure with VP9 probabilities attached to the context.
+
+.. c:type:: v4l2_vp9_probabilities
+
+.. cssclass:: longtable
+
+.. tabularcolumns:: |p{1.5cm}|p{6.3cm}|p{9.4cm}|
+
+.. flat-table:: struct v4l2_vp9_probabilities
+:header-rows:  0
+:stub-columns: 0
+:widths:   1 1 2
+
+* - __u8
+  - ``tx8[2][1]``
+  - TX 8x8 probabilities.
+* - __u8
+  - ``tx16[2][2]``
+  - TX 16x16 probabilities.
+* - __u8
+  - ``tx32[2][3]``
+  - TX 32x32 probabilities.
+* - __u8
+  - ``coef[4][2][2][6][6][3]``
+  - Coefficient probabilities.
+* - __u8
+  - ``skip[3]``
+  - Skip probabilities.
+* - __u8
+  - ``inter_mode[7][3]``
+  - Inter prediction mode probabilities.
+* - __u8
+  - ``interp_filter[4][2]``
+  - Interpolation filter probabilities.
+* - __u8
+  - ``is_inter[4]``
+  - Is inter-block probabilities.
+* - __u8
+  - ``comp_mode[5]``
+  - Compound prediction mode probabilities.
+* - __u8
+  - ``single_ref[5][2]``
+  - Single reference probabilities.
+* - __u8
+  - ``comp_ref[5]``
+  - Compound reference probabilities.
+* - __u8
+  - ``y_mode[4][9]``
+  - Y prediction mode probabilities.
+* - __u8
+  - ``uv_mode[10][9]``
+  - UV prediction mode probabilities.
+* - __u8
+  - ``partition[16][3]``
+  - Partition probabilities.
+* - __u8
+  - ``mv.joint[3]``
+  - Motion vector joint probabilities.
+* - __u8
+  - ``mv.sign[2]``
+  - Motion vector sign probabilities.
+* - __u8
+  - ``mv.class[2][10]``
+  - Motion vector class probabilities.
+* - __u8
+  - ``mv.class0_bit[2]``
+  - Motion vector class0 bit probabilities.
+* - __u8

[PATCH v5 3/3] media: rkvdec: Add the VP9 backend

2020-11-02 Thread Adrian Ratiu
From: Boris Brezillon 

The Rockchip VDEC supports VP9 profile 0 up to 4096x2304@30fps. Add
a backend for this new format.

Signed-off-by: Boris Brezillon 
Signed-off-by: Ezequiel Garcia 
Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/rkvdec/Makefile |2 +-
 drivers/staging/media/rkvdec/rkvdec-vp9.c | 1577 +
 drivers/staging/media/rkvdec/rkvdec.c |   62 +-
 drivers/staging/media/rkvdec/rkvdec.h |6 +
 4 files changed, 1642 insertions(+), 5 deletions(-)
 create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c

diff --git a/drivers/staging/media/rkvdec/Makefile b/drivers/staging/media/rkvdec/Makefile
index c08fed0a39f9..cb86b429cfaa 100644
--- a/drivers/staging/media/rkvdec/Makefile
+++ b/drivers/staging/media/rkvdec/Makefile
@@ -1,3 +1,3 @@
 obj-$(CONFIG_VIDEO_ROCKCHIP_VDEC) += rockchip-vdec.o
 
-rockchip-vdec-y += rkvdec.o rkvdec-h264.o
+rockchip-vdec-y += rkvdec.o rkvdec-h264.o rkvdec-vp9.o
diff --git a/drivers/staging/media/rkvdec/rkvdec-vp9.c b/drivers/staging/media/rkvdec/rkvdec-vp9.c
new file mode 100644
index ..8b443ed511c9
--- /dev/null
+++ b/drivers/staging/media/rkvdec/rkvdec-vp9.c
@@ -0,0 +1,1577 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Rockchip Video Decoder VP9 backend
+ *
+ * Copyright (C) 2019 Collabora, Ltd.
+ * Boris Brezillon 
+ *
+ * Copyright (C) 2016 Rockchip Electronics Co., Ltd.
+ * Alpha Lin 
+ */
+
+#include 
+#include 
+#include 
+
+#include "rkvdec.h"
+#include "rkvdec-regs.h"
+
+#define RKVDEC_VP9_PROBE_SIZE  4864
+#define RKVDEC_VP9_COUNT_SIZE  13232
+#define RKVDEC_VP9_MAX_SEGMAP_SIZE 73728
+
+struct rkvdec_vp9_intra_mode_probs {
+   u8 y_mode[105];
+   u8 uv_mode[23];
+};
+
+struct rkvdec_vp9_intra_only_frame_probs {
+   u8 coef_intra[4][2][128];
+   struct rkvdec_vp9_intra_mode_probs intra_mode[10];
+};
+
+struct rkvdec_vp9_inter_frame_probs {
+   u8 y_mode[4][9];
+   u8 comp_mode[5];
+   u8 comp_ref[5];
+   u8 single_ref[5][2];
+   u8 inter_mode[7][3];
+   u8 interp_filter[4][2];
+   u8 padding0[11];
+   u8 coef[2][4][2][128];
+   u8 uv_mode_0_2[3][9];
+   u8 padding1[5];
+   u8 uv_mode_3_5[3][9];
+   u8 padding2[5];
+   u8 uv_mode_6_8[3][9];
+   u8 padding3[5];
+   u8 uv_mode_9[9];
+   u8 padding4[7];
+   u8 padding5[16];
+   struct {
+   u8 joint[3];
+   u8 sign[2];
+   u8 class[2][10];
+   u8 class0_bit[2];
+   u8 bits[2][10];
+   u8 class0_fr[2][2][3];
+   u8 fr[2][3];
+   u8 class0_hp[2];
+   u8 hp[2];
+   } mv;
+};
+
+struct rkvdec_vp9_probs {
+   u8 partition[16][3];
+   u8 pred[3];
+   u8 tree[7];
+   u8 skip[3];
+   u8 tx32[2][3];
+   u8 tx16[2][2];
+   u8 tx8[2][1];
+   u8 is_inter[4];
+   /* 128 bit alignment */
+   u8 padding0[3];
+   union {
+   struct rkvdec_vp9_inter_frame_probs inter;
+   struct rkvdec_vp9_intra_only_frame_probs intra_only;
+   };
+};
+
+/* Data structure describing auxiliary buffer format. */
+struct rkvdec_vp9_priv_tbl {
+   struct rkvdec_vp9_probs probs;
+   u8 segmap[2][RKVDEC_VP9_MAX_SEGMAP_SIZE];
+};
+
+struct rkvdec_vp9_refs_counts {
+   u32 eob[2];
+   u32 coeff[3];
+};
+
+struct rkvdec_vp9_inter_frame_symbol_counts {
+   u32 partition[16][4];
+   u32 skip[3][2];
+   u32 inter[4][2];
+   u32 tx32p[2][4];
+   u32 tx16p[2][4];
+   u32 tx8p[2][2];
+   u32 y_mode[4][10];
+   u32 uv_mode[10][10];
+   u32 comp[5][2];
+   u32 comp_ref[5][2];
+   u32 single_ref[5][2][2];
+   u32 mv_mode[7][4];
+   u32 filter[4][3];
+   u32 mv_joint[4];
+   u32 sign[2][2];
+   /* add 1 element for align */
+   u32 classes[2][11 + 1];
+   u32 class0[2][2];
+   u32 bits[2][10][2];
+   u32 class0_fp[2][2][4];
+   u32 fp[2][4];
+   u32 class0_hp[2][2];
+   u32 hp[2][2];
+   struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6];
+};
+
+struct rkvdec_vp9_intra_frame_symbol_counts {
+   u32 partition[4][4][4];
+   u32 skip[3][2];
+   u32 intra[4][2];
+   u32 tx32p[2][4];
+   u32 tx16p[2][4];
+   u32 tx8p[2][2];
+   struct rkvdec_vp9_refs_counts ref_cnt[2][4][2][6][6];
+};
+
+struct rkvdec_vp9_run {
+   struct rkvdec_run base;
+   const struct v4l2_ctrl_vp9_frame_decode_params *decode_params;
+};
+
+struct rkvdec_vp9_frame_info {
+   u32 valid : 1;
+   u32 segmapid : 1;
+   u32 frame_context_idx : 2;
+   u32 reference_mode : 2;
+   u32 tx_mode : 3;
+   u32 interpolation_filter : 3;
+   u32 flags;
+   u64 timestamp;
+   struct v4l2_vp9_segmentation seg;
+   struct v4l2_vp9_loop_filter lf;
+};
+
+struct rkvdec_vp9_ctx {
+   struct rkvdec_aux_buf priv_tbl;
+   struct rkvdec_aux_buf count
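
The "128 bit alignment" comment in struct rkvdec_vp9_probs above can be
checked with a small userspace sketch. It copies only the leading fields
from the patch; the struct name, the `body` member (a stand-in for the
inter/intra_only union) and the helper function are hypothetical and not
part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Leading fields of struct rkvdec_vp9_probs, as listed in the patch. */
struct vp9_probs_head_sketch {
	uint8_t partition[16][3];
	uint8_t pred[3];
	uint8_t tree[7];
	uint8_t skip[3];
	uint8_t tx32[2][3];
	uint8_t tx16[2][2];
	uint8_t tx8[2][1];
	uint8_t is_inter[4];
	uint8_t padding0[3];	/* 128 bit alignment */
	uint8_t body[16];	/* hypothetical stand-in for the union */
};

/* The union must start on a 16-byte (128-bit) boundary. */
static_assert(offsetof(struct vp9_probs_head_sketch, body) % 16 == 0,
	      "probability tables must start 128-bit aligned");

/* Byte offset at which the inter/intra-only tables would begin. */
static size_t vp9_probs_body_offset(void)
{
	return offsetof(struct vp9_probs_head_sketch, body);
}
```

All members are single bytes, so the compiler inserts no padding of its
own and the three explicit padding bytes are what round the header up to
a multiple of 16.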

[PATCH v5 0/3] media: rkvdec: Add a VP9 backend

2020-11-02 Thread Adrian Ratiu
Dear all,

This is v5 of the series adding VP9 profile 0 decoding to rkvdec.

All feedback from v4 should be addressed, there's just one thing I did
not address: ref_frame_sign_biases in the uAPI. The userspace tool I'm
using [1] apparently doesn't need it or the default hwreg value for it
is capable of decoding the bitstreams I used on the driver, so I don't
really have a use-case to change and test that. :)

Considering the uAPI is a work in progress and expected to be modified,
ref_frame_sign_biases can be added later with others which might be
required to enable more functionality (e.g. profiles >= 1).

Series tested on rk3399 and applies on next-20201030.

[1] https://github.com/Kwiboo/FFmpeg/tree/v4l2-request-hwaccel-4.2.2-rkvdec

Changelog
-

v5:

* Drop unnecessary OUTPUT buffer payload set in .buf_prepare.
* Drop obsolete .per_request ctrl flag
* Added new vp9 ctrls to v4l2_ctrl_ptr
* Fix pahole detected padding issues
* Send userspace an error if it tries to reconfigure decode resolution
  as v4l2 or rkvdec-vp9 backend do not support dynamic res changes yet
* Allow frame ctx probability tables to be non-mandatory so users can
  set them directly during frame decoding in cases where no defaults
  have been set previously (eg. ffmpeg vp9 backend)
* Some comments and documentation clarifications
* Minor checkpatch fixes

v4:

* Drop color_space field from the VP9 interface.
  V4L2 API should be used for it.
* Clarified Segment-ID comments.
* Moved motion vector probabilities to a separate
  struct.

v3:

* Fix documentation issues found by Hans.
* Fix smatch detected issues as pointed out by Hans.
* Added patch to fix wrong bytesused set on .buf_prepare.

v2:

* Documentation style issues pointed out by Nicolas internally.
* s/VP9_PROFILE_MAX/V4L2_VP9_PROFILE_MAX/
* Fix wrong kfree(ctx).
* constify a couple structs on rkvdec-vp9.c


Boris Brezillon (2):
  media: uapi: Add VP9 stateless decoder controls
  media: rkvdec: Add the VP9 backend

Ezequiel Garcia (1):
  media: rkvdec: Fix .buf_prepare

 .../userspace-api/media/v4l/biblio.rst|   10 +
 .../media/v4l/ext-ctrls-codec.rst |  550 ++
 drivers/media/v4l2-core/v4l2-ctrls.c  |  239 +++
 drivers/media/v4l2-core/v4l2-ioctl.c  |1 +
 drivers/staging/media/rkvdec/Makefile |2 +-
 drivers/staging/media/rkvdec/rkvdec-vp9.c | 1577 +
 drivers/staging/media/rkvdec/rkvdec.c |   72 +-
 drivers/staging/media/rkvdec/rkvdec.h |6 +
 include/media/v4l2-ctrls.h|5 +
 include/media/vp9-ctrls.h |  486 +
 10 files changed, 2942 insertions(+), 6 deletions(-)
 create mode 100644 drivers/staging/media/rkvdec/rkvdec-vp9.c
 create mode 100644 include/media/vp9-ctrls.h

-- 
2.29.0



[PATCH v5 1/3] media: rkvdec: Fix .buf_prepare

2020-11-02 Thread Adrian Ratiu
From: Ezequiel Garcia 

The driver should only set the payload on .buf_prepare if the
buffer is CAPTURE type. If an OUTPUT buffer has a zero bytesused
set by userspace then v4l2-core will set it to buffer length.

Fixes: cd33c830448ba ("media: rkvdec: Add the rkvdec driver")
Signed-off-by: Ezequiel Garcia 
Signed-off-by: Adrian Ratiu 
---
 drivers/staging/media/rkvdec/rkvdec.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/media/rkvdec/rkvdec.c b/drivers/staging/media/rkvdec/rkvdec.c
index d25c4a37e2af..0adc3828a4ba 100644
--- a/drivers/staging/media/rkvdec/rkvdec.c
+++ b/drivers/staging/media/rkvdec/rkvdec.c
@@ -471,7 +471,15 @@ static int rkvdec_buf_prepare(struct vb2_buffer *vb)
if (vb2_plane_size(vb, i) < sizeimage)
return -EINVAL;
}
-   vb2_set_plane_payload(vb, 0, f->fmt.pix_mp.plane_fmt[0].sizeimage);
+
+   /*
+* Buffer bytesused is written by driver for CAPTURE buffers.
+* (if userspace passes 0 bytesused for OUTPUT buffers, v4l2-core sets
+* it to buffer length).
+*/
+   if (!V4L2_TYPE_IS_OUTPUT(vq->type))
+       vb2_set_plane_payload(vb, 0, f->fmt.pix_mp.plane_fmt[0].sizeimage);
+
return 0;
 }
 
-- 
2.29.0
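
The rule this patch enforces can be illustrated outside the kernel with
a minimal sketch. Every type and function name below is hypothetical and
only models the idea; the real code uses vb2_set_plane_payload() and
V4L2_TYPE_IS_OUTPUT() as shown in the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two V4L2 queue directions. */
enum sketch_buf_type { SKETCH_BUF_OUTPUT, SKETCH_BUF_CAPTURE };

struct sketch_buf {
	enum sketch_buf_type type;
	size_t length;		/* allocated plane size */
	size_t bytesused;	/* payload size */
};

/*
 * Model of the fixed .buf_prepare: reject planes that are too small and
 * write the payload size only for CAPTURE buffers. OUTPUT buffers keep
 * whatever bytesused userspace supplied.
 */
static int sketch_buf_prepare(struct sketch_buf *buf, size_t sizeimage)
{
	assert(buf != NULL);

	if (buf->length < sizeimage)
		return -EINVAL;

	if (buf->type == SKETCH_BUF_CAPTURE)
		buf->bytesused = sizeimage;

	return 0;
}
```

Calling sketch_buf_prepare() on an OUTPUT buffer with bytesused already
set to 100 leaves that value untouched, which is the behaviour the
commit message asks for.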



[PATCH 1/2] arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

2020-11-05 Thread Adrian Ratiu
From: Nathan Chancellor 

Drop warning because kernel now requires GCC >= v4.9 after
commit 6ec4476ac825 ("Raise gcc version requirement to 4.9").

Reported-by: Nick Desaulniers 
Signed-off-by: Nathan Chancellor 
Signed-off-by: Adrian Ratiu 
---
 arch/arm/lib/xor-neon.c | 9 +
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index b99dd8e1c93f..e1e76186ec23 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -19,15 +19,8 @@ MODULE_LICENSE("GPL");
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
  */
-#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
+#ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-#else
-/*
- * While older versions of GCC do not generate incorrect code, they fail to
- * recognize the parallel nature of these functions, and emit plain ARM code,
- * which is known to be slower than the optimized ARM code in asm-arm/xor.h.
- */
-#warning This code requires at least version 4.6 of GCC
 #endif
 
 #pragma GCC diagnostic ignored "-Wunused-variable"
-- 
2.29.0
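
To see why the removed check is redundant, here is a small runtime
mirror of the old preprocessor condition. The helper name is
hypothetical; it only restates the comparison from the deleted #if.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Runtime equivalent of
 *   __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 6)
 * generalised to an arbitrary required version (hypothetical helper).
 */
static bool gcc_at_least(int major, int minor, int want_major, int want_minor)
{
	assert(major >= 0 && minor >= 0);

	return major > want_major ||
	       (major == want_major && minor >= want_minor);
}
```

With the kernel minimum at GCC 4.9, gcc_at_least(4, 9, 4, 6) is already
true, so any supported GCC satisfies the old 4.6 test and the #else
branch can never be reached.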



[PATCH 2/2] arm: lib: xor-neon: disable clang vectorization

2020-11-05 Thread Adrian Ratiu
Due to a Clang bug [1], NEON loop auto-vectorization either does not
happen or happens badly with no gains. Considering that older GCC
versions previously generated unoptimized code here which was slower
than the default asm implementation, it is safer to default Clang
builds to the known good generic implementation.

The kernel currently supports a minimum Clang version of v10.0.1, see
commit 1f7a44f63e6c ("compiler-clang: add build check for clang 10.0.1").

When the bug eventually gets fixed, this commit could be reverted or,
if the minimum clang version bump takes a long time, a warning could
be added for users to upgrade their compilers, as was done for GCC.

[1] https://bugs.llvm.org/show_bug.cgi?id=40976

Signed-off-by: Adrian Ratiu 
---
 arch/arm/include/asm/xor.h | 3 ++-
 arch/arm/lib/Makefile  | 3 +++
 arch/arm/lib/xor-neon.c| 4 
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/xor.h b/arch/arm/include/asm/xor.h
index aefddec79286..49937dafaa71 100644
--- a/arch/arm/include/asm/xor.h
+++ b/arch/arm/include/asm/xor.h
@@ -141,7 +141,8 @@ static struct xor_block_template xor_block_arm4regs = {
NEON_TEMPLATES; \
} while (0)
 
-#ifdef CONFIG_KERNEL_MODE_NEON
+/* disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976 */
+#if defined(CONFIG_KERNEL_MODE_NEON) && !defined(CONFIG_CC_IS_CLANG)
 
 extern struct xor_block_template const xor_block_neon_inner;
 
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index 6d2ba454f25b..53f9e7dd9714 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -43,8 +43,11 @@ endif
 $(obj)/csumpartialcopy.o:  $(obj)/csumpartialcopygeneric.S
 $(obj)/csumpartialcopyuser.o:  $(obj)/csumpartialcopygeneric.S
 
+# disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976
+ifndef CONFIG_CC_IS_CLANG
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
   CFLAGS_xor-neon.o+= $(NEON_FLAGS)
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
+endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..84c91c48dfa2 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -18,6 +18,10 @@ MODULE_LICENSE("GPL");
  * Pull in the reference implementations while instructing GCC (through
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
+
+ * On Clang the loop vectorizer is enabled by default, but due to a bug
+ * (https://bugs.llvm.org/show_bug.cgi?id=40976) vectorization is broken
+ * so xor-neon is disabled in favor of the default reg implementations.
  */
 #ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
-- 
2.29.0



[PATCH 0/2] arm: lib: xor-neon: Remove warn & disable neon vect

2020-11-05 Thread Adrian Ratiu
Dear all,

This is my attempt to close the loop on a relatively old discussion
[1] caused by a compiler bug [2]. In a nutshell, the Clang build issues
a bogus warning about GCC while it silently botches the neon auto-loop
vectorization. :)

Many thanks to all who have investigated this issue before me. Arnd
posted a workaround for xor.h [3], but I very much like his first
suggestion of disabling the broken feature until the compiler is fixed.

Tested on latest linux next-20201105 using bcm2835 & versatile configs
and Clang 10.0.1

P.S: While testing aarch64/imx8m I also noticed vectorization is broken
there as well, but that deserves its own patch because it's a separate
xor-neon implementation (if this approach is deemed sensible).

[1] 
https://patchwork.kernel.org/project/linux-arm-kernel/patch/20190528235742.105510-1-natechancel...@gmail.com/
[2] https://bugs.llvm.org/show_bug.cgi?id=40976
[3] https://bugs.llvm.org/show_bug.cgi?id=40976#c6

Kind regards,
Adrian

Adrian Ratiu (1):
  arm: lib: xor-neon: disable clang vectorization

Nathan Chancellor (1):
  arm: lib: xor-neon: remove unnecessary GCC < 4.6 warning

 arch/arm/include/asm/xor.h |  3 ++-
 arch/arm/lib/Makefile  |  3 +++
 arch/arm/lib/xor-neon.c| 13 +
 3 files changed, 10 insertions(+), 9 deletions(-)

-- 
2.29.0



Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization

2020-11-06 Thread Adrian Ratiu

Hi Nathan,

On Fri, 06 Nov 2020, Nathan Chancellor  
wrote:
+ Ard, who wrote this code. 

On Fri, Nov 06, 2020 at 07:14:36AM +0200, Adrian Ratiu wrote: 
Due to a Clang bug [1] neon autoloop vectorization does not 
happen or happens badly with no gains and considering previous 
GCC experiences which generated unoptimized code which was 
worse than the default asm implementation, it is safer to 
default clang builds to the known good generic implementation. 
The kernel currently supports a minimum Clang version of 
v10.0.1, see commit 1f7a44f63e6c ("compiler-clang: add build 
check for clang 10.0.1").   When the bug gets eventually fixed, 
this commit could be reverted or, if the minimum clang version 
bump takes a long time, a warning could be added for users to 
upgrade their compilers like was done for GCC.   [1] 
https://bugs.llvm.org/show_bug.cgi?id=40976  Signed-off-by: 
Adrian Ratiu  


Thank you for the patch! We are also tracking this here: 

https://github.com/ClangBuiltLinux/linux/issues/496 

It was on my TODO to revisit getting the warning eliminated, 
which likely would have involved a patch like this as well. 

I am curious if it is worth revisiting or dusting off Arnd's 
patch in the LLVM bug tracker first. I have not tried it 
personally. If that is not a worthwhile option, I am fine with 
this for now. It would be nice to try and get a fix pinned down 
on the LLVM side at some point but alas, finite amount of 
resources and people :( 


I tested Arnd's kernel patch from the LLVM bugtracker [1], but 
with the Clang v10.0.1 I still get warnings like the following 
even though the __restrict workaround seems to affect the 
generated instructions:


./include/asm-generic/xor.h:15:2: remark: the cost-model indicates 
that interleaving is not beneficial [-Rpass-missed=loop-vectorize] 
./include/asm-generic/xor.h:11:1: remark: List vectorization was 
possible but not beneficial with cost 0 >= 0 
[-Rpass-missed=slp-vectorizer] xor_8regs_2(unsigned long bytes, 
unsigned long *__restrict p1, unsigned long *__restrict p2)


[1] https://bugs.llvm.org/show_bug.cgi?id=40976#c6



Should no other options come to fruition from further 
discussions, you can carry my tag forward: 

Acked-by: Nathan Chancellor  


Hopefully others can comment soon.


In my opinion we have 3 ways to go regarding this:

1. Leave it as is and try to notify the user of the breakage (eg 
add a new warning). You previously said this is not a good idea 
because the user can't do anything about it. I agree.


2. Somehow work around the compiler bug in the kernel which is 
what the LLVM bugtracker patch tries to do. This is a slippery 
slope even if we somehow get it right, especially since multiple 
Clang versions might be supported in the future and we don't know 
when the bug will be properly fixed by the compiler. In addition 
we're enabling and "hiding" possibly undefined behaviour.


3. Disable the broken feature and once the compiler bug is fixed 
enable it back warning users of old compilers that there is an 
action they can take: upgrade. This is exactly how this was 
handled for GCC previously, so there is a precedent.


This implements the third option, which is also the first thing 
Arnd suggested in the original patch. :)


Adrian




---
 arch/arm/include/asm/xor.h | 3 ++-
 arch/arm/lib/Makefile  | 3 +++
 arch/arm/lib/xor-neon.c| 4 
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/xor.h b/arch/arm/include/asm/xor.h
index aefddec79286..49937dafaa71 100644
--- a/arch/arm/include/asm/xor.h
+++ b/arch/arm/include/asm/xor.h
@@ -141,7 +141,8 @@ static struct xor_block_template xor_block_arm4regs = {
NEON_TEMPLATES; \
} while (0)
 
-#ifdef CONFIG_KERNEL_MODE_NEON

+/* disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976 */
+#if defined(CONFIG_KERNEL_MODE_NEON) && !defined(CONFIG_CC_IS_CLANG)
 
 extern struct xor_block_template const xor_block_neon_inner;
 
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile

index 6d2ba454f25b..53f9e7dd9714 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -43,8 +43,11 @@ endif
 $(obj)/csumpartialcopy.o:  $(obj)/csumpartialcopygeneric.S
 $(obj)/csumpartialcopyuser.o:  $(obj)/csumpartialcopygeneric.S
 
+# disabled on clang/arm due to https://bugs.llvm.org/show_bug.cgi?id=40976

+ifndef CONFIG_CC_IS_CLANG
 ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
   NEON_FLAGS   := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
   CFLAGS_xor-neon.o+= $(NEON_FLAGS)
   obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
 endif
+endif
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..84c91c48dfa2 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -18,6 +18,10 @@ MODULE_LICENSE("GPL");
  * Pull in the reference implementations while instructing GCC (t

Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization

2020-11-07 Thread Adrian Ratiu
On Fri, 06 Nov 2020, Nick Desaulniers  
wrote:
On Fri, Nov 6, 2020 at 3:50 AM Adrian Ratiu 
 wrote: 


Hi Nathan, 

On Fri, 06 Nov 2020, Nathan Chancellor 
 wrote: 
> + Ard, who wrote this code. 
> 
> On Fri, Nov 06, 2020 at 07:14:36AM +0200, Adrian Ratiu wrote: 
>> Due to a Clang bug [1] neon autoloop vectorization does not 
>> happen or happens badly with no gains and considering 
>> previous GCC experiences which generated unoptimized code 
>> which was worse than the default asm implementation, it is 
>> safer to default clang builds to the known good generic 
>> implementation.  The kernel currently supports a minimum 
>> Clang version of v10.0.1, see commit 1f7a44f63e6c 
>> ("compiler-clang: add build check for clang 10.0.1").   When 
>> the bug gets eventually fixed, this commit could be reverted 
>> or, if the minimum clang version bump takes a long time, a 
>> warning could be added for users to upgrade their compilers 
>> like was done for GCC.   [1] 
>> https://bugs.llvm.org/show_bug.cgi?id=40976  Signed-off-by: 
>> Adrian Ratiu  
> 
> Thank you for the patch! We are also tracking this here: 
> 
> https://github.com/ClangBuiltLinux/linux/issues/496 
> 
> It was on my TODO to revisit getting the warning eliminated, 
> which likely would have involved a patch like this as well. 
> 
> I am curious if it is worth revisiting or dusting off Arnd's 
> patch in the LLVM bug tracker first. I have not tried it 
> personally. If that is not a worthwhile option, I am fine 
> with this for now. It would be nice to try and get a fix 
> pinned down on the LLVM side at some point but alas, finite 
> amount of resources and people :( 

I tested Arnd's kernel patch from the LLVM bugtracker [1], but 
with the Clang v10.0.1 I still get warnings like the following 
even though the __restrict workaround seems to affect the 
generated instructions: 

./include/asm-generic/xor.h:15:2: remark: the cost-model 
indicates that interleaving is not beneficial 
[-Rpass-missed=loop-vectorize] 
./include/asm-generic/xor.h:11:1: remark: List vectorization 
was possible but not beneficial with cost 0 >= 0 
[-Rpass-missed=slp-vectorizer] xor_8regs_2(unsigned long bytes, 
unsigned long *__restrict p1, unsigned long *__restrict p2) 


If it's just a matter of overruling the cost model #pragma clang 
loop vectorize(enable) 

will do the trick. 

Indeed,
```
diff --git a/include/asm-generic/xor.h b/include/asm-generic/xor.h
index b62a2a56a4d4..8796955498b7 100644
--- a/include/asm-generic/xor.h
+++ b/include/asm-generic/xor.h
@@ -12,6 +12,7 @@ xor_8regs_2(unsigned long bytes, unsigned long *p1, unsigned long *p2)
 {
 	long lines = bytes / (sizeof (long)) / 8;
 
+#pragma clang loop vectorize(enable)
 	do {
 		p1[0] ^= p2[0];
 		p1[1] ^= p2[1];
@@ -32,6 +33,7 @@ xor_8regs_3(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 {
 	long lines = bytes / (sizeof (long)) / 8;
 
+#pragma clang loop vectorize(enable)
 	do {
 		p1[0] ^= p2[0] ^ p3[0];
 		p1[1] ^= p2[1] ^ p3[1];
@@ -53,6 +55,7 @@ xor_8regs_4(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 {
 	long lines = bytes / (sizeof (long)) / 8;
 
+#pragma clang loop vectorize(enable)
 	do {
 		p1[0] ^= p2[0] ^ p3[0] ^ p4[0];
 		p1[1] ^= p2[1] ^ p3[1] ^ p4[1];
@@ -75,6 +78,7 @@ xor_8regs_5(unsigned long bytes, unsigned long *p1, unsigned long *p2,
 {
 	long lines = bytes / (sizeof (long)) / 8;
 
+#pragma clang loop vectorize(enable)
 	do {
 		p1[0] ^= p2[0] ^ p3[0] ^ p4[0] ^ p5[0];
 		p1[1] ^= p2[1] ^ p3[1] ^ p4[1] ^ p5[1];
```
seems to generate the vectorized code. 

Why don't we find a way to make those pragma's more toolchain 
portable, rather than open coding them like I have above rather 
than this series?


Hi Nick,

Thank you very much for the suggestion.

I agree. If a toolchain portable way can be found to reliably 
trigger the optimization, I will gladly replace this patch. :)
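
One possible shape for such a portable hint, as a rough userspace
sketch only: wrap the pragma in a macro that expands to the Clang loop
hint on Clang and to nothing elsewhere. The macro name and the function
name are hypothetical, and whether this really produces good NEON code
in the kernel build is exactly what still needs testing.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper macro (not an existing kernel macro): a loop
 * vectorization hint on Clang, nothing on other compilers.
 */
#if defined(__clang__)
#define XOR_VECTORIZE_HINT _Pragma("clang loop vectorize(enable)")
#else
#define XOR_VECTORIZE_HINT
#endif

/* XOR p2 into p1, eight longs per iteration, modelled on xor_8regs_2(). */
static void xor_8regs_2_sketch(unsigned long bytes, unsigned long *p1,
			       const unsigned long *p2)
{
	long lines = bytes / sizeof(long) / 8;

	assert(lines > 0);	/* the do/while body runs at least once */

	XOR_VECTORIZE_HINT
	do {
		p1[0] ^= p2[0];
		p1[1] ^= p2[1];
		p1[2] ^= p2[2];
		p1[3] ^= p2[3];
		p1[4] ^= p2[4];
		p1[5] ^= p2[5];
		p1[6] ^= p2[6];
		p1[7] ^= p2[7];
		p1 += 8;
		p2 += 8;
	} while (--lines > 0);
}
```

The _Pragma operator is used because a plain #pragma line cannot be
produced by a macro expansion.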


Will work on it starting Monday then report back my findings or, 
if I can get it to work in a satisfying manner, send a v2 series 
directly.


The first patch is still needed because it's more of a general 
cleanup as Nathan correctly observed.


Regards,
Adrian



--
Thanks,
~Nick Desaulniers


Re: [PATCH 2/2] arm: lib: xor-neon: disable clang vectorization

2020-11-07 Thread Adrian Ratiu
On Sat, 07 Nov 2020, Russell King - ARM Linux admin 
 wrote:
On Fri, Nov 06, 2020 at 07:14:36AM +0200, Adrian Ratiu wrote: 
diff --git a/arch/arm/lib/xor-neon.c b/arch/arm/lib/xor-neon.c
index e1e76186ec23..84c91c48dfa2 100644
--- a/arch/arm/lib/xor-neon.c
+++ b/arch/arm/lib/xor-neon.c
@@ -18,6 +18,10 @@ MODULE_LICENSE("GPL");
  * Pull in the reference implementations while instructing GCC (through
  * -ftree-vectorize) to attempt to exploit implicit parallelism and emit
  * NEON instructions.
+


Please tidy this up before submission; we normally continue the 
"*" for blank lines in comment blocks. Thanks. 


Indeed, thank you. I will fix it if I don't replace this patch 
entirely with something similar to what Nick suggested.


Perhaps adding a checkpatch test for this is a good idea?

Adrian




+ * On Clang the loop vectorizer is enabled by default, but due to a bug
+ * (https://bugs.llvm.org/show_bug.cgi?id=40976) vectorization is broken
+ * so xor-neon is disabled in favor of the default reg implementations.
  */
 #ifdef CONFIG_CC_IS_GCC
 #pragma GCC optimize "tree-vectorize"
--
2.29.0




--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


Re: [PATCH v9 00/11] Genericize DW MIPI DSI bridge and add i.MX 6 driver

2020-06-30 Thread Adrian Ratiu

Hi Neil,

On Mon, 29 Jun 2020, Neil Armstrong  
wrote:
Hi Adrian, 

On 09/06/2020 19:49, Adrian Ratiu wrote: 
[Re-submitting to cc dri-devel, sorry about the noise]

Hello all,

v9 cleanly applies on top of latest next-20200609 tree. v9 does
not depend on other patches as the last binding doc has been
merged.

All feedback up to this point has been addressed. Specific
details in individual patch changelogs.

The biggest changes are the deprecation of the Synopsys DW
bridge bind() API in favor of of_drm_find_bridge() and .attach
callbacks, the addition of a TODO entry which outlines future
planned bridge driver refactorings and a reordering of some
i.MX 6 patches to appease checkpatch.

The idea behind the TODO is to get this regmap and i.MX 6 driver
merged and then do the rest of refactorings in-tree because it's
easier and the refactorings themselves are out-of-scope of this
series which is adding i.MX 6 support and is quite big already,
so please, if there are more refactoring ideas, let's add them
to the TODO doc. :) I intend to tackle those after this series
is merged to avoid two complex inter-dependent simultaneous
series.


This has been around here for a long time and you seem to have 
addressed all the reviews. 

 As always more testing is welcome especially on Rockchip and 
STM SoCs. 


It has been tested on STM, but I'd like a feedback on RK 
platform before applying the bridge parts. 

Can the imx & stm patches be applied separately ? 



Yes the IMX and STM patches can be applied separately, they just 
both depend on the common regmap patches.


The binding API removal change which directly touches RK can also 
be applied separately, but unfortunately I do not have access to a 
RK board with a DSI display to test it (or the bridge regmap logic 
on RK btw...), I just "eye-balled" the RK code based on the public 
docs and it LGTM.



Neil



Big thank you to everyone who has contributed to this up to now,
Adrian

Adrian Ratiu (11):
  drm: bridge: dw_mipi_dsi: add initial regmap infrastructure
  drm: bridge: dw_mipi_dsi: abstract register access using reg_fields
  drm: bridge: dw_mipi_dsi: add dsi v1.01 support
  drm: bridge: dw_mipi_dsi: remove bind/unbind API
  dt-bindings: display: add i.MX6 MIPI DSI host controller doc
  ARM: dts: imx6qdl: add missing mipi dsi properties
  drm: imx: Add i.MX 6 MIPI DSI host platform driver
  drm: stm: dw-mipi-dsi: let the bridge handle the HW version check
  drm: bridge: dw-mipi-dsi: split low power cfg register into fields
  drm: bridge: dw-mipi-dsi: fix bad register field offsets
  Documentation: gpu: todo: Add dw-mipi-dsi consolidation plan

 .../display/imx/fsl,mipi-dsi-imx6.yaml| 112 +++
 Documentation/gpu/todo.rst|  25 +
 arch/arm/boot/dts/imx6qdl.dtsi|   8 +
 drivers/gpu/drm/bridge/synopsys/Kconfig   |   1 +
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 713 --
 drivers/gpu/drm/imx/Kconfig   |   8 +
 drivers/gpu/drm/imx/Makefile  |   1 +
 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c| 399 ++
 .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c   |   7 +-
 drivers/gpu/drm/stm/dw_mipi_dsi-stm.c |  16 +-
 10 files changed, 1059 insertions(+), 231 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/imx/fsl,mipi-dsi-imx6.yaml
 create mode 100644 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c



Re: [PATCH v9 00/11] Genericize DW MIPI DSI bridge and add i.MX 6 driver

2020-07-01 Thread Adrian Ratiu

Hi Heiko,

On Wed, 01 Jul 2020, Heiko Stübner  wrote:
Hi Adrian, 

Am Dienstag, 9. Juni 2020, 19:49:48 CEST schrieb Adrian Ratiu: 
[Re-submitting to cc dri-devel, sorry about the noise]  Hello 
all,  v9 cleanly applies on top of latest next-20200609 tree. 


at least it doesn't apply on top of current drm-misc-next for me 
which I really don't understand. 

Like patch 2/11 does 

@@ -31,6 +31,7 @@ 
 #include  
. 
 #define HWVER_131<><--><-->0x31333100<>/* IP 
 version 1.31 */ 
+#define HWVER_130<><--><-->0x31333000<>/* IP 
version 1.30 */ . 
 #define DSI_VERSION<--><--><-->0x00 #define 
 VERSION<--><--><--><-->GENMASK(31, 8) 

where the file currently looks like 

#include  #include  
#include  #include  #include 
 #include  #include 
 

#define HWVER_131			0x31333100	/* IP 
version 1.31 */ 

#define DSI_VERSION			0x00 #define VERSION 
GENMASK(31, 8) 
 
even in Linux-next 
 
So I guess ideally rebase on top of drm-misc-next


I will send a rebase on top of drm-misc-next soon (with the last 
DTS nitpick fixed and the latest acks and reviewed-by tags added).


In the meantime I also found someone within Collabora who has a RK 
with a DSI panel and found a bug (likely clock is not enabled 
early enough to access the cfg registers to get the version for 
regmap).


I'm super happy this is getting tested on RK, thank you!




Thanks
Heiko


[PATCH v9 05/11] dt-bindings: display: add i.MX6 MIPI DSI host controller doc

2020-06-09 Thread Adrian Ratiu
This provides an example DT binding for the MIPI DSI host controller
present on the i.MX6 SoC based on Synopsys DesignWare v1.01 IP.

Cc: Rob Herring 
Cc: Neil Armstrong 
Cc: Fabio Estevam 
Cc: Laurent Pinchart 
Cc: devicet...@vger.kernel.org
Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Sjoerd Simons 
Signed-off-by: Martyn Welch 
Signed-off-by: Adrian Ratiu 
---
Changes since v8:
  - Fixed small compatible string typo caught by checkpatch
  - Added custom select for 'fsl,imx6-mipi-dsi' (Rob)
  - Replaced additionalProperties -> unevaluatedProperties (Rob)
  - Dropped all nodes not adding any new constraints apart from
  the recently upstreamed snps,dw-mipi-dsi.yaml (Rob)

Changes since v7:
  - Clarified port@0,1 descriptions, marked them as required and
  added missing port@0 in example (Laurent)

Changes since v6:
  - Added ref to the newly created snps,dw-mipi-dsi.yaml (Laurent)
  - Moved *-cells properties outside patternProperties (Laurent)
  - Removed the panel port documentation (Laurent)
  - Wrapped lines at 80 chars, typo fixes, sort includes (Laurent)

Changes since v5:
  - Fixed missing reg warning (Fabio)
  - Updated dt-schema and fixed warnings (Rob)

Changes since v4:
  - Fixed yaml binding to pass `make dt_binding_check dtbs_check`
  and addressed received binding feedback (Rob)

Changes since v3:
  - Added commit message (Neil)
  - Converted to yaml format (Neil)
  - Minor dt node + driver fixes (Rob)
  - Added small panel example to the host controller binding

Changes since v2:
  - Fixed commit tags (Emil)
---
 .../display/imx/fsl,mipi-dsi-imx6.yaml| 112 ++
 1 file changed, 112 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/display/imx/fsl,mipi-dsi-imx6.yaml

diff --git a/Documentation/devicetree/bindings/display/imx/fsl,mipi-dsi-imx6.yaml b/Documentation/devicetree/bindings/display/imx/fsl,mipi-dsi-imx6.yaml
new file mode 100644
index 0..86093729fd5f9
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/imx/fsl,mipi-dsi-imx6.yaml
@@ -0,0 +1,112 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/display/imx/fsl,mipi-dsi-imx6.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale i.MX6 DW MIPI DSI Host Controller
+
+maintainers:
+  - Adrian Ratiu 
+
+description: |
+  The i.MX6 DSI host controller is a Synopsys DesignWare MIPI DSI v1.01
+  IP block with a companion PHY IP.
+
+  These DT bindings follow the Synopsys DW MIPI DSI bindings defined in
+  Documentation/devicetree/bindings/display/bridge/dw_mipi_dsi.txt with
+  the following device-specific properties.
+
+allOf:
+  - $ref: ../bridge/snps,dw-mipi-dsi.yaml#
+
+# Need a custom select here or 'snps,dw-mipi-dsi' will match lots of nodes
+select:
+  properties:
+compatible:
+  contains:
+enum:
+  - fsl,imx6-mipi-dsi
+  required:
+- compatible
+
+properties:
+  '#address-cells':
+const: 1
+
+  '#size-cells':
+const: 0
+
+  compatible:
+items:
+  - const: fsl,imx6-mipi-dsi
+  - const: snps,dw-mipi-dsi
+
+  interrupts:
+maxItems: 1
+
+  fsl,gpr:
+description:
+      Phandle to the iomuxc-gpr region containing the multiplexer ctrl register.
+$ref: /schemas/types.yaml#/definitions/phandle
+
+unevaluatedProperties: false
+
+required:
+  - "#address-cells"
+  - "#size-cells"
+  - compatible
+  - interrupts
+
+examples:
+  - |+
+    #include 
+    #include 
+    #include 
+
+    dsi: dsi@21e {
+        #address-cells = <1>;
+        #size-cells = <0>;
+        compatible = "fsl,imx6-mipi-dsi", "snps,dw-mipi-dsi";
+        reg = <0x021e 0x4000>;
+        interrupts = <0 102 IRQ_TYPE_LEVEL_HIGH>;
+        fsl,gpr = <&gpr>;
+        clocks = <&clks IMX6QDL_CLK_MIPI_CORE_CFG>,
+                 <&clks IMX6QDL_CLK_MIPI_IPG>;
+        clock-names = "ref", "pclk";
+
+        ports {
+            #address-cells = <1>;
+            #size-cells = <0>;
+            port@0 {
+                reg = <0>;
+                mipi_mux_0: endpoint {
+                    remote-endpoint = <&ipu1_di0_mipi>;
+                };
+            };
+            port@1 {
+                reg = <1>;
+                dsi_out: endpoint {
+                    remote-endpoint = <&panel_in>;
+                };
+            };
+        };
+
+        panel@0 {
+            compatible = "sharp,ls032b3sx01";
+            reg = <0>;
+            reset-gpios = <&gpio6 8 GPIO_ACTIVE_LOW>;
+            ports {
+                #address-cells = <1>;
+                #size-cells = <0>;
+                port@0 {
+                    reg = <0>;
+                    panel_in: endpoint {
+                        remote-endpoint = <&dsi_out>;
+                    };
+                };
+            };
+        };
+    };
+
+...
-- 
2.27.0



[PATCH v9 02/11] drm: bridge: dw_mipi_dsi: abstract register access using reg_fields

2020-06-09 Thread Adrian Ratiu
Register existence, address/offsets, field layouts, reserved bits and
so on differ between MIPI-DSI versions and between SoC vendor boards.
Despite these differences the HW IP and protocols are mostly the same,
so the generic bridge can be made to compensate for these differences.

The current Rockchip and STM drivers hardcoded a lot of their common
definitions in the bridge code because they're based on DSI v1.30 and
1.31 which are relatively close, but in order to support older/future
versions with more diverging layouts like the v1.01 present on imx6,
we abstract some of the register accesses via the regmap field APIs.

The bridge detects the DSI core version and initializes the required
regmap register layout. Other DSI versions / register layouts can
easily be added in the future by only changing the bridge code.

The platform drivers don't require any changes; DSI register layout
versioning will be handled transparently by the bridge. If in the
future the regmap or layouts need to be exposed to the drivers,
it could easily be done via plat_data or a new API in dw_mipi_dsi.h.
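
As a rough userspace illustration of the idea above (not the bridge code
itself), the per-version layouts can be pictured as tables of field
descriptors selected from the version register. The struct and function
names below are hypothetical stand-ins; the offsets and bit ranges for the
DPI color coding field are the ones quoted in this series
(DSI_DPI_COLOR_CODING 0x10 bits [0,3] for v1.30/v1.31, DSI_DPI_CFG 0x0c
bits [2,4] for v1.01).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HWVER_131    0x31333100u
#define HWVER_130    0x31333000u
#define HWVER_101    0x31303000u
#define VERSION_MASK 0xffffff00u /* same bits as GENMASK(31, 8) */

/* Hypothetical stand-in for the kernel's struct reg_field. */
struct field_desc {
	uint32_t reg;     /* register byte offset */
	unsigned int lsb; /* lowest bit of the field */
	unsigned int msb; /* highest bit of the field */
};

/* One layout per register map; a single field is enough for the sketch. */
struct dsi_layout {
	struct field_desc dpi_color_coding;
};

static const struct dsi_layout layout_v130_v131 = {
	.dpi_color_coding = { 0x10, 0, 3 },
};

static const struct dsi_layout layout_v101 = {
	.dpi_color_coding = { 0x0c, 2, 4 },
};

/* Pick a layout from the raw version register, NULL if unsupported. */
static const struct dsi_layout *pick_layout(uint32_t version_reg)
{
	switch (version_reg & VERSION_MASK) {
	case HWVER_130:
	case HWVER_131:
		return &layout_v130_v131;
	case HWVER_101:
		return &layout_v101;
	default:
		return NULL;
	}
}

/* Read-modify-write of one field in a fake register file. */
static void field_write(uint32_t *regs, const struct field_desc *f,
			uint32_t val)
{
	uint32_t width, mask;

	assert(f->lsb <= f->msb && f->msb < 32);
	width = f->msb - f->lsb + 1;
	mask = (width == 32 ? 0xffffffffu : (1u << width) - 1u) << f->lsb;
	regs[f->reg / 4] = (regs[f->reg / 4] & ~mask) |
			   ((val << f->lsb) & mask);
}
```

A caller only ever names the field, e.g.
field_write(regs, &layout->dpi_color_coding, 0x5), and the layout decides
which register and bits are touched.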

Suggested-by: Boris Brezillon 
Reviewed-by: Emil Velikov 
Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Adrian Ratiu 
---
Changes since v5:
  - Fix CONFIG_DEBUG_FS build (Adrian)
  - Fix DRM_MODE_FLAG_* test negation (Adrian)
  - Fixed cfg_phy_status range from [0,0] to [0,2]
  - Replace do {} while(0) with GCC extension ({}) (Andrzej)
  - Fixed payload no-op writes on STM devices (Adrian & Arnaud)

Changes since v4:
  - Move regmap infrastructure logic to a separate commit (Ezequiel)
  - Consolidate field infrastructure in this commit (Ezequiel)
  - Move the dsi v1.01 layout logic to a separate commit (Ezequiel)

Changes since v2:
  - Added const declarations to dw_mipi_dsi structs (Emil)
  - Fixed commit tags (Emil)

Changes since v1:
  - Moved register definitions & regmap initialization into bridge
  module. Platform drivers get the regmap via plat_data after calling
  the bridge probe (Emil).
---
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 499 --
 1 file changed, 347 insertions(+), 152 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index 34b8668ae24ea..f453df4eb5072 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -31,6 +31,7 @@
 #include 
 
 #define HWVER_131  0x31333100  /* IP version 1.31 */
+#define HWVER_130  0x31333000  /* IP version 1.30 */
 
 #define DSI_VERSION  0x00
 #define VERSION  GENMASK(31, 8)
@@ -47,7 +48,6 @@
 #define DPI_VCID(vcid) ((vcid) & 0x3)
 
 #define DSI_DPI_COLOR_CODING   0x10
-#define LOOSELY18_EN   BIT(8)
 #define DPI_COLOR_CODING_16BIT_1   0x0
 #define DPI_COLOR_CODING_16BIT_2   0x1
 #define DPI_COLOR_CODING_16BIT_3   0x2
@@ -56,11 +56,6 @@
 #define DPI_COLOR_CODING_24BIT 0x5
 
 #define DSI_DPI_CFG_POL  0x14
-#define COLORM_ACTIVE_LOW  BIT(4)
-#define SHUTD_ACTIVE_LOW   BIT(3)
-#define HSYNC_ACTIVE_LOW   BIT(2)
-#define VSYNC_ACTIVE_LOW   BIT(1)
-#define DATAEN_ACTIVE_LOW  BIT(0)
 
 #define DSI_DPI_LP_CMD_TIM 0x18
 #define OUTVACT_LPCMD_TIME(p)  (((p) & 0xff) << 16)
@@ -81,27 +76,19 @@
 #define DSI_GEN_VCID   0x30
 
 #define DSI_MODE_CFG   0x34
-#define ENABLE_VIDEO_MODE  0
-#define ENABLE_CMD_MODE  BIT(0)
 
 #define DSI_VID_MODE_CFG   0x38
-#define ENABLE_LOW_POWER   (0x3f << 8)
-#define ENABLE_LOW_POWER_MASK  (0x3f << 8)
+#define ENABLE_LOW_POWER   0x3f
+
 #define VID_MODE_TYPE_NON_BURST_SYNC_PULSES  0x0
 #define VID_MODE_TYPE_NON_BURST_SYNC_EVENTS  0x1
 #define VID_MODE_TYPE_BURST  0x2
-#define VID_MODE_TYPE_MASK 0x3
-#define VID_MODE_VPG_ENABLE  BIT(16)
-#define VID_MODE_VPG_HORIZONTAL  BIT(24)
 
 #define DSI_VID_PKT_SIZE   0x3c
-#define VID_PKT_SIZE(p)  ((p) & 0x3fff)
 
 #define DSI_VID_NUM_CHUNKS 0x40
-#define VID_NUM_CHUNKS(c)  ((c) & 0x1fff)
 
 #define DSI_VID_NULL_SIZE  0x44
-#define VID_NULL_SIZE(b)   ((b) & 0x1fff)
 
 #define DSI_VID_HSA_TIME   0x48
 #define DSI_VID_HBP_TIME   0x4c
@@ -125,7 +112,6 @@
 #define GEN_SW_2P_TX_LP  BIT(10)
 #define GEN_SW_1P_TX_LP  BIT(9)
 #define GEN_SW_0P_TX_LP  BIT(8)
-#define ACK_RQST_EN  BIT(1)
 #define TEAR_FX_EN BIT(0)
 
 #define CMD_MODE_ALL_LP(MAX_RD_PK

[PATCH v9 06/11] ARM: dts: imx6qdl: add missing mipi dsi properties

2020-06-09 Thread Adrian Ratiu
Now that we have a proper driver for the imx6 mipi dsi host controller
we can fill in the missing properties to get it working.

Cc: Laurent Pinchart 
Cc: Rob Herring 
Cc: devicet...@vger.kernel.org
Signed-off-by: Adrian Ratiu 
---
New in v8.
---
 arch/arm/boot/dts/imx6qdl.dtsi | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/arm/boot/dts/imx6qdl.dtsi b/arch/arm/boot/dts/imx6qdl.dtsi
index 7eec1122e5d74..d2f4fdfe4a252 100644
--- a/arch/arm/boot/dts/imx6qdl.dtsi
+++ b/arch/arm/boot/dts/imx6qdl.dtsi
@@ -1222,7 +1222,15 @@ mipi_csi: mipi@21dc000 {
};
 
mipi_dsi: mipi@21e {
+   compatible = "fsl,imx6-mipi-dsi", "snps,dw-mipi-dsi";
+   #address-cells = <1>;
+   #size-cells = <0>;
reg = <0x021e 0x4000>;
+   interrupts = <0 102 IRQ_TYPE_LEVEL_HIGH>;
+   fsl,gpr = <&gpr>;
+   clocks = <&clks IMX6QDL_CLK_MIPI_CORE_CFG>,
+<&clks IMX6QDL_CLK_MIPI_IPG>;
+   clock-names = "ref", "pclk";
status = "disabled";
 
ports {
-- 
2.27.0



[PATCH v9 09/11] drm: bridge: dw-mipi-dsi: split low power cfg register into fields

2020-06-09 Thread Adrian Ratiu
According to the Host Registers documentation for IMX, STM and RK,
the LP cfg register should not be written entirely in one go because
some bits are reserved and should be kept at their reset values, e.g.
BIT(15), which is reserved in all versions.

This also cleans up the code by removing the ugly defines
and making the field ranges & values written more explicit.
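
A minimal userspace sketch of why the field-wise writes matter, assuming a
hypothetical register value in which reserved BIT(15) is set. The helper
names are invented for illustration. The DCS and max-read-packet ranges are
the v1.30/v1.31 ones quoted in this series ([16,18], [19,19], [24,24]); the
generic ranges [8,13] and [14,14] are inferred from the removed GEN_*
defines and should be treated as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Bits lsb..msb inclusive, a userspace stand-in for GENMASK(msb, lsb). */
static uint32_t field_mask(unsigned int lsb, unsigned int msb)
{
	assert(lsb <= msb && msb < 32);
	return (0xffffffffu >> (31 - msb)) & (0xffffffffu << lsb);
}

/* Return reg with only the bits of one field replaced by val. */
static uint32_t field_update(uint32_t reg, unsigned int lsb,
			     unsigned int msb, uint32_t val)
{
	uint32_t mask = field_mask(lsb, msb);

	return (reg & ~mask) | ((val << lsb) & mask);
}

/* Enable low-power transmission for every command type, field by field. */
static uint32_t cmd_mode_all_lp(uint32_t reg)
{
	reg = field_update(reg, 8, 13, 0x3f); /* generic short write/read */
	reg = field_update(reg, 14, 14, 0x1); /* generic long write */
	reg = field_update(reg, 16, 18, 0x7); /* DCS short write/read */
	reg = field_update(reg, 19, 19, 0x1); /* DCS long write */
	reg = field_update(reg, 24, 24, 0x1); /* max read packet size */
	return reg;
}
```

Starting from 0x00008000, the result keeps BIT(15) untouched, whereas
writing the old CMD_MODE_ALL_LP constant to the whole register would have
cleared it.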

Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Adrian Ratiu 
---
New in v6.
---
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 105 ++
 1 file changed, 33 insertions(+), 72 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index 70df0578cbe7b..1e47d40b5becb 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -120,60 +120,6 @@
 #define DSI_TO_CNT_CFG_V101  0x40
 #define DSI_PCKHDL_CFG_V101  0x18
 
-#define MAX_RD_PKT_SIZE_LP BIT(24)
-#define DCS_LW_TX_LP   BIT(19)
-#define DCS_SR_0P_TX_LP  BIT(18)
-#define DCS_SW_1P_TX_LP  BIT(17)
-#define DCS_SW_0P_TX_LP  BIT(16)
-#define GEN_LW_TX_LP   BIT(14)
-#define GEN_SR_2P_TX_LP  BIT(13)
-#define GEN_SR_1P_TX_LP  BIT(12)
-#define GEN_SR_0P_TX_LP  BIT(11)
-#define GEN_SW_2P_TX_LP  BIT(10)
-#define GEN_SW_1P_TX_LP  BIT(9)
-#define GEN_SW_0P_TX_LP  BIT(8)
-#define TEAR_FX_EN BIT(0)
-
-#define CMD_MODE_ALL_LP  (MAX_RD_PKT_SIZE_LP | \
-DCS_LW_TX_LP | \
-DCS_SR_0P_TX_LP | \
-DCS_SW_1P_TX_LP | \
-DCS_SW_0P_TX_LP | \
-GEN_LW_TX_LP | \
-GEN_SR_2P_TX_LP | \
-GEN_SR_1P_TX_LP | \
-GEN_SR_0P_TX_LP | \
-GEN_SW_2P_TX_LP | \
-GEN_SW_1P_TX_LP | \
-GEN_SW_0P_TX_LP)
-
-#define EN_TEAR_FX_V101  BIT(14)
-#define DCS_LW_TX_LP_V101  BIT(12)
-#define GEN_LW_TX_LP_V101  BIT(11)
-#define MAX_RD_PKT_SIZE_LP_V101  BIT(10)
-#define DCS_SW_2P_TX_LP_V101   BIT(9)
-#define DCS_SW_1P_TX_LP_V101   BIT(8)
-#define DCS_SW_0P_TX_LP_V101   BIT(7)
-#define GEN_SR_2P_TX_LP_V101   BIT(6)
-#define GEN_SR_1P_TX_LP_V101   BIT(5)
-#define GEN_SR_0P_TX_LP_V101   BIT(4)
-#define GEN_SW_2P_TX_LP_V101   BIT(3)
-#define GEN_SW_1P_TX_LP_V101   BIT(2)
-#define GEN_SW_0P_TX_LP_V101   BIT(1)
-
-#define CMD_MODE_ALL_LP_V101   (DCS_LW_TX_LP_V101 | \
-GEN_LW_TX_LP_V101 | \
-MAX_RD_PKT_SIZE_LP_V101 | \
-DCS_SW_2P_TX_LP_V101 | \
-DCS_SW_1P_TX_LP_V101 | \
-DCS_SW_0P_TX_LP_V101 | \
-GEN_SR_2P_TX_LP_V101 | \
-GEN_SR_1P_TX_LP_V101 | \
-GEN_SR_0P_TX_LP_V101 | \
-GEN_SW_2P_TX_LP_V101 | \
-GEN_SW_1P_TX_LP_V101 | \
-GEN_SW_0P_TX_LP_V101)
-
 #define DSI_GEN_HDR  0x6c
 #define DSI_GEN_PLD_DATA   0x70
 
@@ -257,7 +203,11 @@ struct dw_mipi_dsi {
struct regmap_field *field_dpi_vsync_active_low;
struct regmap_field *field_dpi_hsync_active_low;
struct regmap_field *field_cmd_mode_ack_rqst_en;
-   struct regmap_field *field_cmd_mode_all_lp_en;
+   struct regmap_field *field_cmd_mode_gen_sw_sr_en;
+   struct regmap_field *field_cmd_mode_dcs_sw_sr_en;
+   struct regmap_field *field_cmd_mode_gen_lw_en;
+   struct regmap_field *field_cmd_mode_dcs_lw_en;
+   struct regmap_field *field_cmd_mode_max_rd_pkt_size;
struct regmap_field *field_cmd_mode_en;
struct regmap_field *field_cmd_pkt_status;
struct regmap_field *field_vid_mode_en;
@@ -315,7 +265,11 @@ struct dw_mipi_dsi_variant {
    struct reg_field cfg_dpi_hsync_active_low;
    struct reg_field cfg_cmd_mode_en;
    struct reg_field cfg_cmd_mode_ack_rqst_en;
-   struct reg_field cfg_cmd_mode_all_lp_en;
+   struct reg_field cfg_cmd_mode_gen_sw_sr_en;
+   st

[PATCH v9 08/11] drm: stm: dw-mipi-dsi: let the bridge handle the HW version check

2020-06-09 Thread Adrian Ratiu
The stm mipi-dsi platform driver added a version test in
commit fa6251a747b7 ("drm/stm: dsi: check hardware version")
so that HW revisions other than v1.3x get rejected. The rockchip
driver had no such check and just assumed register layouts are
v1.3x compatible.

Having such tests was a good idea because only v130/v131 layouts
were supported at the time, however since adding multiple layout
support in the bridge, the version is automatically checked for
all drivers, compatible layouts get picked and unsupported HW is
automatically rejected by the bridge, so there's no use keeping
the test in the stm driver.

The main reason prompting this change is that the stm driver
test immediately disabled the peripheral clock after reading
the version, making the bridge read version 0x0 immediately
after in its own probe(), so we move the clock disabling after
the bridge does the version test.
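
The ordering problem can be modelled in a few lines of userspace C. This is
only a toy, assuming a fake device whose registers read back 0x0 while the
peripheral clock is gated; all names are hypothetical and the error path is
simplified (the clock is always switched off before returning).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HWVER_131    0x31333100u
#define HWVER_130    0x31333000u
#define VERSION_MASK 0xffffff00u /* same bits as GENMASK(31, 8) */

/* Fake controller: the version register is only readable with pclk on. */
struct fake_dsi {
	bool pclk_on;
	uint32_t version_reg;
};

static uint32_t fake_read_version(const struct fake_dsi *d)
{
	assert(d);
	return d->pclk_on ? d->version_reg : 0x0;
}

/* Stand-in for the bridge probe: rejects anything but v1.30/v1.31. */
static int fake_bridge_probe(const struct fake_dsi *d)
{
	uint32_t v = fake_read_version(d) & VERSION_MASK;

	return (v == HWVER_130 || v == HWVER_131) ? 0 : -1;
}

/* Old order: read version, gate the clock, then probe the bridge. */
static int probe_old_order(struct fake_dsi *d)
{
	d->pclk_on = true;
	(void)fake_read_version(d);
	d->pclk_on = false;
	return fake_bridge_probe(d); /* bridge sees 0x0 and bails out */
}

/* New order: keep the clock on until the bridge has done its check. */
static int probe_new_order(struct fake_dsi *d)
{
	int ret;

	d->pclk_on = true;
	ret = fake_bridge_probe(d);
	d->pclk_on = false;
	return ret;
}
```

With a v1.30 device the old order fails in the bridge while the new order
succeeds, which matches the behaviour described above.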

Tested on STM32F769 and STM32MP1.

Cc: linux-st...@st-md-mailman.stormreply.com
Cc: Emil Velikov 
Reported-by: Adrian Pop 
Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Adrian Ratiu 
---
New in v6.
---
 drivers/gpu/drm/stm/dw_mipi_dsi-stm.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
index 2e1f2664495d0..45f67f8a5f6c8 100644
--- a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
+++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
@@ -396,26 +396,19 @@ static int dw_mipi_dsi_stm_probe(struct platform_device *pdev)
goto err_dsi_probe;
}
 
+   /* enable pclk so MMIO register values can be read, else reads == 0x0 */
ret = clk_prepare_enable(pclk);
if (ret) {
DRM_ERROR("%s: Failed to enable peripheral clk\n", __func__);
goto err_dsi_probe;
}
 
-   dsi->hw_version = dsi_read(dsi, DSI_VERSION) & VERSION;
-   clk_disable_unprepare(pclk);
-
-   if (dsi->hw_version != HWVER_130 && dsi->hw_version != HWVER_131) {
-   ret = -ENODEV;
-   DRM_ERROR("bad dsi hardware version\n");
-   goto err_dsi_probe;
-   }
-
dw_mipi_dsi_stm_plat_data.base = dsi->base;
dw_mipi_dsi_stm_plat_data.priv_data = dsi;
 
platform_set_drvdata(pdev, dsi);
 
+   /* setup the bridge, this will also access MMIO registers via regmap */
dsi->dsi = dw_mipi_dsi_probe(pdev, &dw_mipi_dsi_stm_plat_data);
if (IS_ERR(dsi->dsi)) {
ret = PTR_ERR(dsi->dsi);
@@ -423,6 +416,11 @@ static int dw_mipi_dsi_stm_probe(struct platform_device *pdev)
goto err_dsi_probe;
}
 
+   dsi->hw_version = dsi_read(dsi, DSI_VERSION) & VERSION;
+
+   /* initial MMIO config done, disable clk to save power */
+   clk_disable_unprepare(pclk);
+
return 0;
 
 err_dsi_probe:
-- 
2.27.0



[PATCH v9 07/11] drm: imx: Add i.MX 6 MIPI DSI host platform driver

2020-06-09 Thread Adrian Ratiu
This adds support for the Synopsys DesignWare MIPI DSI v1.01 host
controller which is embedded in i.MX 6 SoCs.

Based on the following patch, but updated/extended to work with existing
support found in the kernel:

- drm: imx: Support Synopsys DesignWare MIPI DSI host controller
  Signed-off-by: Liu Ying 

Cc: Fabio Estevam 
Cc: Enric Balletbo i Serra 
Reviewed-by: Emil Velikov 
Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Sjoerd Simons 
Signed-off-by: Martyn Welch 
Signed-off-by: Adrian Ratiu 
---
Changes since v8:
  - Changed Enric's email in the CC tag to his work address
  - pllref_clk != 27 MHz promoted from WARN to DRM_DEV_ERROR and
  is tested ASAP after clk enable in probe() (Enric)
  - Multiple small bridge init cleanups (Enric)
  - Multiple debug/error msg fixes to make them more clear or
  avoid redundancy (Enric)
  - Simplified imx_mipi_dsi_bind() as a result of the DW bridge
  bind() removal (Laurent)
  - Added a bridge .attach callback
  - Dropped unnecessary bridge_to_imxdsi() conversion
  - Fixed minor whitespace checkpatch warnings and moved the
  bindings doc before this patch to appease checkpatch

Changes since v7:
  - Removed encoder helper ops and added drm_bridge (Laurent)
  - Brought back drm_simple_encoder_init and dropped dependency on
  external unify encoder creation patch (Laurent)
  - Minor typo fixes

Changes since v6:
  - Replaced custom noop encoder with the simple drm encoder (Enric)
  - Added CONFIG_DRM_IMX6_MIPI_DSI depends on CONFIG_OF (Enric)
  - Dropped imx_mipi_dsi_register() because now it only creates the
  dummy encoder which can easily be done directly in imx_dsi_bind()

Changes since v5:
  - Reword to remove unrelated device tree patch mention (Fabio)
  - Move pllref_clk enable/disable to bind/unbind (Ezequiel)
  - Fix freescale.com -> nxp.com email addresses (Fabio)
  - Also added myself as module author (Fabio)
  - Use DRM_DEV_* macros for consistency, print more error msg

Changes since v4:
  - Split off driver-specific configuration of phy timings due
  to new upstream API.
  - Move regmap infrastructure logic to separate commit (Ezequiel)
  - Move dsi v1.01 layout addition to a separate commit (Ezequiel)
  - Minor warnings and driver name fixes

Changes since v3:
  - Renamed platform driver to reflect it's i.MX6 only. (Fabio)

Changes since v2:
  - Fixed commit tags. (Emil)

Changes since v1:
  - Moved register definitions & regmap initialization into bridge
  module. Platform drivers get the regmap via plat_data after
  calling the bridge probe. (Emil)
---
 drivers/gpu/drm/imx/Kconfig|   8 +
 drivers/gpu/drm/imx/Makefile   |   1 +
 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c | 399 +
 3 files changed, 408 insertions(+)
 create mode 100644 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c

diff --git a/drivers/gpu/drm/imx/Kconfig b/drivers/gpu/drm/imx/Kconfig
index 874cc532eabad..5f5b925ed09ca 100644
--- a/drivers/gpu/drm/imx/Kconfig
+++ b/drivers/gpu/drm/imx/Kconfig
@@ -46,3 +46,11 @@ config DRM_IMX_HDMI
depends on DRM_IMX
help
  Choose this if you want to use HDMI on i.MX6.
+
+config DRM_IMX6_MIPI_DSI
+   tristate "Freescale i.MX6 DRM MIPI DSI"
+   select DRM_DW_MIPI_DSI
+   depends on DRM_IMX
+   depends on OF
+   help
+ Choose this if you want to use MIPI DSI on i.MX6.
diff --git a/drivers/gpu/drm/imx/Makefile b/drivers/gpu/drm/imx/Makefile
index 94fc8e2d92344..205d7e65aa170 100644
--- a/drivers/gpu/drm/imx/Makefile
+++ b/drivers/gpu/drm/imx/Makefile
@@ -10,3 +10,4 @@ obj-$(CONFIG_DRM_IMX_LDB) += imx-ldb.o
 obj-$(CONFIG_DRM_IMX_DCIC) += imx-dcic.o
 
 obj-$(CONFIG_DRM_IMX_HDMI) += dw_hdmi-imx.o
+obj-$(CONFIG_DRM_IMX6_MIPI_DSI) += dw_mipi_dsi-imx6.o
diff --git a/drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c b/drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c
new file mode 100644
index 0..f19b355c6c06d
--- /dev/null
+++ b/drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c
@@ -0,0 +1,399 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * i.MX6 drm driver - MIPI DSI Host Controller
+ *
+ * Copyright (C) 2011-2015 Freescale Semiconductor, Inc.
+ * Copyright (C) 2019-2020 Collabora, Ltd.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "imx-drm.h"
+
+#define DSI_PWR_UP 0x04
+#define RESET  0
+#define POWERUP  BIT(0)
+
+#define DSI_PHY_IF_CTRL  0x5c
+#define PHY_IF_CTRL_RESET  0x0
+
+#define DSI_PHY_TST_CTRL0  0x64
+#define PHY_TESTCLK  BIT(1)
+#define PHY_UNTESTCLK  0
+#define PHY_TESTCLR  BIT(0)
+#define PHY_UNTESTCLR  0
+
+#define DSI_PHY_TST_CTRL1  0x68
+#define PHY_TESTEN BIT(16)
+#define PHY_

[PATCH v9 03/11] drm: bridge: dw_mipi_dsi: add dsi v1.01 support

2020-06-09 Thread Adrian Ratiu
The Synopsys MIPI DSI v1.01 host controller is quite widely used
on platforms like i.MX6 and is not very different from the other
versions like the 1.31/1.30 used on rockchip/stm. The protocols
appear to be the same, only the register layout is different and
the newer versions have new features represented by new registers,
so adding support for it is just a matter of defining the new
layout and adding a couple of dsi version checks.

Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Adrian Ratiu 
---
Changes since v7:
  - Minor commit msg rewording for consistency

Changes since v5:
  - Fixed cfg_phy_status range from [0,0] to [0,2]

New in v5.
---
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 125 +-
 1 file changed, 119 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index f453df4eb5072..16fd87055e7b7 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -32,6 +32,7 @@
 
 #define HWVER_131  0x31333100  /* IP version 1.31 */
 #define HWVER_130  0x31333000  /* IP version 1.30 */
+#define HWVER_101  0x31303000  /* IP version 1.01 */
 
 #define DSI_VERSION  0x00
 #define VERSION  GENMASK(31, 8)
@@ -100,6 +101,25 @@
 #define DSI_EDPI_CMD_SIZE  0x64
 
 #define DSI_CMD_MODE_CFG   0x68
+
+#define DSI_DPI_CFG  0x0c
+#define DSI_TMR_LINE_CFG   0x28
+#define DSI_VTIMING_CFG  0x2c
+#define DSI_PHY_TMR_CFG_V101   0x30
+#define DSI_PHY_IF_CFG_V101  0x58
+#define DSI_PHY_IF_CTRL  0x5c
+#define DSI_PHY_RSTZ_V101  0x54
+#define DSI_PHY_STATUS_V101  0x60
+#define DSI_PHY_TST_CTRL0_V101 0x64
+#define DSI_GEN_HDR_V101   0x34
+#define DSI_GEN_PLD_DATA_V101  0x38
+#define DSI_CMD_MODE_CFG_V101  0x24
+#define DSI_CMD_PKT_STATUS_V101  0x3c
+#define DSI_VID_PKT_CFG  0x20
+#define DSI_VID_MODE_CFG_V101  0x1c
+#define DSI_TO_CNT_CFG_V101  0x40
+#define DSI_PCKHDL_CFG_V101  0x18
+
 #define MAX_RD_PKT_SIZE_LP BIT(24)
 #define DCS_LW_TX_LP   BIT(19)
 #define DCS_SR_0P_TX_LP  BIT(18)
@@ -127,6 +147,33 @@
 GEN_SW_1P_TX_LP | \
 GEN_SW_0P_TX_LP)
 
+#define EN_TEAR_FX_V101  BIT(14)
+#define DCS_LW_TX_LP_V101  BIT(12)
+#define GEN_LW_TX_LP_V101  BIT(11)
+#define MAX_RD_PKT_SIZE_LP_V101  BIT(10)
+#define DCS_SW_2P_TX_LP_V101   BIT(9)
+#define DCS_SW_1P_TX_LP_V101   BIT(8)
+#define DCS_SW_0P_TX_LP_V101   BIT(7)
+#define GEN_SR_2P_TX_LP_V101   BIT(6)
+#define GEN_SR_1P_TX_LP_V101   BIT(5)
+#define GEN_SR_0P_TX_LP_V101   BIT(4)
+#define GEN_SW_2P_TX_LP_V101   BIT(3)
+#define GEN_SW_1P_TX_LP_V101   BIT(2)
+#define GEN_SW_0P_TX_LP_V101   BIT(1)
+
+#define CMD_MODE_ALL_LP_V101   (DCS_LW_TX_LP_V101 | \
+GEN_LW_TX_LP_V101 | \
+MAX_RD_PKT_SIZE_LP_V101 | \
+DCS_SW_2P_TX_LP_V101 | \
+DCS_SW_1P_TX_LP_V101 | \
+DCS_SW_0P_TX_LP_V101 | \
+GEN_SR_2P_TX_LP_V101 | \
+GEN_SR_1P_TX_LP_V101 | \
+GEN_SR_0P_TX_LP_V101 | \
+GEN_SW_2P_TX_LP_V101 | \
+GEN_SW_1P_TX_LP_V101 | \
+GEN_SW_0P_TX_LP_V101)
+
 #define DSI_GEN_HDR  0x6c
 #define DSI_GEN_PLD_DATA   0x70
 
@@ -165,6 +212,11 @@
 #define DSI_INT_MSK0   0xc4
 #define DSI_INT_MSK1   0xc8
 
+#define DSI_ERROR_ST0_V101 0x44
+#define DSI_ERROR_ST1_V101 0x48
+#define DSI_ERROR_MSK0_V101  0x4c
+#define DSI_ERROR_MSK1_V101  0x50
+
 #define DSI_PHY_TMR_RD_CFG 0xf4
 
 #define PHY_STATUS_TIMEOUT_US  1
@@ -359,6 +411,49 @@ static const struct dw_mipi_dsi_variant dw_mipi_dsi_v130_v131_layout = {
.cfg_gen_payload =  REG_FIELD(DSI_GEN_PLD_DATA, 0, 31),
 };
 
+static const struct dw_mipi_dsi_variant dw_mipi_dsi_v101_layout = {
+   .cfg_dpi_vid =  REG_FIELD(DSI_DPI_CFG, 0, 1),
+   .cfg_dpi_color_coding = REG_FIELD(DSI_DPI_CFG, 2, 4),
+   .cfg_dpi_18loosely_en = REG_FIELD(DSI_DPI_CFG, 10, 10

[PATCH v9 04/11] drm: bridge: dw_mipi_dsi: remove bind/unbind API

2020-06-09 Thread Adrian Ratiu
The DW mipi-dsi bind/unbind API was only used to attach the bridge to
the encoder in the Rockchip driver, but with the addition of i.MX6 it
gets more complicated because the i.MX6 part of the bridge is another
bridge in itself which needs to daisy chain to the dw-mipi-dsi core.

So, instead of extending this API to allow daisy-chaining bridges and
risking trouble with multiple connectors added by various bridges,
just delete it and let the DW core bridge be accessed by SoC-specific
parts via the of_drm_find_bridge() API.

This just fixes the Rockchip driver for the bind() deprecation, it
doesn't convert it to a proper bridge daisy-chain with simple encoder
and bridge .attach call-backs, that refactoring work should be done
separately (and the i.MX6 driver can be used as reference).

Suggested-by: Laurent Pinchart 
Signed-off-by: Adrian Ratiu 
---
New in v9.
---
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 22 ---
 .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c   |  7 +++---
 2 files changed, 3 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index 16fd87055e7b7..70df0578cbe7b 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -1453,28 +1453,6 @@ void dw_mipi_dsi_remove(struct dw_mipi_dsi *dsi)
 }
 EXPORT_SYMBOL_GPL(dw_mipi_dsi_remove);
 
-/*
- * Bind/unbind API, used from platforms based on the component framework.
- */
-int dw_mipi_dsi_bind(struct dw_mipi_dsi *dsi, struct drm_encoder *encoder)
-{
-   int ret;
-
-   ret = drm_bridge_attach(encoder, &dsi->bridge, NULL, 0);
-   if (ret) {
-   DRM_ERROR("Failed to initialize bridge with drm\n");
-   return ret;
-   }
-
-   return ret;
-}
-EXPORT_SYMBOL_GPL(dw_mipi_dsi_bind);
-
-void dw_mipi_dsi_unbind(struct dw_mipi_dsi *dsi)
-{
-}
-EXPORT_SYMBOL_GPL(dw_mipi_dsi_unbind);
-
 MODULE_AUTHOR("Chris Zhong ");
 MODULE_AUTHOR("Philippe Cornu ");
 MODULE_DESCRIPTION("DW MIPI DSI host controller driver");
diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
index 3feff0c45b3f7..86f87c7ea03cf 100644
--- a/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi-rockchip.c
@@ -876,6 +876,7 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev,
 {
struct dw_mipi_dsi_rockchip *dsi = dev_get_drvdata(dev);
struct drm_device *drm_dev = data;
+   struct drm_bridge *dw_bridge = of_drm_find_bridge(dev->of_node);
struct device *second;
bool master1, master2;
int ret;
@@ -929,9 +930,9 @@ static int dw_mipi_dsi_rockchip_bind(struct device *dev,
return ret;
}
 
-   ret = dw_mipi_dsi_bind(dsi->dmd, &dsi->encoder);
+   ret = drm_bridge_attach(&dsi->encoder, dw_bridge, NULL, 0);
if (ret) {
-   DRM_DEV_ERROR(dev, "Failed to bind: %d\n", ret);
+   DRM_DEV_ERROR(dev, "Failed to attach DW DSI bridge: %d\n", ret);
return ret;
}
 
@@ -947,8 +948,6 @@ static void dw_mipi_dsi_rockchip_unbind(struct device *dev,
if (dsi->is_slave)
return;
 
-   dw_mipi_dsi_unbind(dsi->dmd);
-
clk_disable_unprepare(dsi->pllref_clk);
 }
 
-- 
2.27.0



[PATCH v9 11/11] Documentation: gpu: todo: Add dw-mipi-dsi consolidation plan

2020-06-09 Thread Adrian Ratiu
This documents the longer-term plan to clean up the dw-mipi-dsi bridge
based drivers after the regmap refactor and i.MX6 driver have landed.

The goal is to get the entire bridge logic in one place and continue
the refactorings under the drm/bridge tree.

Cc: Laurent Pinchart 
Cc: Boris Brezillon 
Cc: Sam Ravnborg 
Cc: Daniel Vetter 
Signed-off-by: Adrian Ratiu 
---
 Documentation/gpu/todo.rst | 25 +
 1 file changed, 25 insertions(+)

diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 658b52f7ffc6c..2b142980a4b16 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -548,6 +548,31 @@ See drivers/gpu/drm/amd/display/TODO for tasks.
 
 Contact: Harry Wentland, Alex Deucher
 
+Reorganize dw-mipi-dsi bridge-based host-controller drivers
+-----------------------------------------------------------
+
+The Synopsys DW MIPI DSI bridge is used by a number of SoC platform drivers
+(STM, Rockchip, i.MX) which don't cleanly encapsulate their bridge logic which
+gets split between the Synopsys bridge (drm/bridge/synopsys/dw-mipi-dsi.c) and
+platform drivers like drm/imx/dw_mipi_dsi-imx6.c by passing around the bridge
+configuration regmap, creating new bridges / daisy chaining in platform drivers,
+duplicating encoder creation, having too much encoder logic instead of using the
+simple encoder interface and so on.
+
+The goal of this rework is to make the dw-mipi-dsi driver a better encapsulated
+bridge by moving all bridge-related logic under drm/bridge, including the SoC
+bindings which chain to the core Synopsys code under drm/bridge/dw-mipi-dsi/
+from which they can be further consolidated and cleaned up.
+
+If this goal proves to be impossible then drm_bridge might not be the correct
+abstraction for these host controllers and unifying their logic into a helper
+library encapsulating a drm_encoder might be more desirable, in other words to
+move away from drm_bridge entirely.
+
+Contact: Adrian Ratiu, Daniel Vetter, Laurent Pinchart
+
+Level: Intermediate
+
 Bootsplash
 ==========
 
-- 
2.27.0



[PATCH v9 10/11] drm: bridge: dw-mipi-dsi: fix bad register field offsets

2020-06-09 Thread Adrian Ratiu
According to the DSI Host Registers sections available in the IMX,
STM and RK ref manuals for 1.01, 1.30 and 1.31, the register fields
are smaller or bigger than what's coded in the driver, leading to
r/w in reserved spaces which might cause undefined behaviours.
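
Both failure modes can be shown with a small userspace sketch. The helper
names are hypothetical and the value 2560 is just an example; the bit
ranges are taken from the hunks in this patch (vid_pkt_size going from
[0,10] to [0,13], vid_hsa_time from [0,31] to [0,11]).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits lsb..msb inclusive, a userspace stand-in for GENMASK(msb, lsb). */
static uint32_t bits(unsigned int lsb, unsigned int msb)
{
	assert(lsb <= msb && msb < 32);
	return (0xffffffffu >> (31 - msb)) & (0xffffffffu << lsb);
}

/* Value that actually lands in the register after masking to the field. */
static uint32_t field_store(unsigned int lsb, unsigned int msb, uint32_t val)
{
	return (val << lsb) & bits(lsb, msb);
}

/* True if a field declared as [lsb, decl_msb] reaches above hw_msb. */
static bool field_overlaps_reserved(unsigned int lsb, unsigned int decl_msb,
				    unsigned int hw_msb)
{
	return (bits(lsb, decl_msb) & ~bits(lsb, hw_msb)) != 0;
}
```

A field declared too narrow silently truncates (2560 stored through [0,10]
becomes 512), while one declared too wide lets writes spill into reserved
bits ([0,31] versus a real width of [0,11]).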

Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Adrian Ratiu 
---
New in v6.
---
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 46 +--
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index 1e47d40b5becb..d274216c5a7c2 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -316,7 +316,7 @@ struct dw_mipi_dsi_variant {
 static const struct dw_mipi_dsi_variant dw_mipi_dsi_v130_v131_layout = {
.cfg_dpi_color_coding = REG_FIELD(DSI_DPI_COLOR_CODING, 0, 3),
.cfg_dpi_18loosely_en = REG_FIELD(DSI_DPI_COLOR_CODING, 8, 8),
-   .cfg_dpi_vid =  REG_FIELD(DSI_DPI_VCID, 0, 2),
+   .cfg_dpi_vid =  REG_FIELD(DSI_DPI_VCID, 0, 1),
.cfg_dpi_vsync_active_low = REG_FIELD(DSI_DPI_CFG_POL, 1, 1),
.cfg_dpi_hsync_active_low = REG_FIELD(DSI_DPI_CFG_POL, 2, 2),
.cfg_cmd_mode_ack_rqst_en = REG_FIELD(DSI_CMD_MODE_CFG, 1, 1),
@@ -325,29 +325,29 @@ static const struct dw_mipi_dsi_variant dw_mipi_dsi_v130_v131_layout = {
    .cfg_cmd_mode_dcs_sw_sr_en = REG_FIELD(DSI_CMD_MODE_CFG, 16, 18),
.cfg_cmd_mode_dcs_lw_en =   REG_FIELD(DSI_CMD_MODE_CFG, 19, 19),
.cfg_cmd_mode_max_rd_pkt_size = REG_FIELD(DSI_CMD_MODE_CFG, 24, 24),
-   .cfg_cmd_mode_en =  REG_FIELD(DSI_MODE_CFG, 0, 31),
-   .cfg_cmd_pkt_status =   REG_FIELD(DSI_CMD_PKT_STATUS, 0, 31),
-   .cfg_vid_mode_en =  REG_FIELD(DSI_MODE_CFG, 0, 31),
+   .cfg_cmd_mode_en =  REG_FIELD(DSI_MODE_CFG, 0, 0),
+   .cfg_cmd_pkt_status =   REG_FIELD(DSI_CMD_PKT_STATUS, 0, 6),
+   .cfg_vid_mode_en =  REG_FIELD(DSI_MODE_CFG, 0, 0),
    .cfg_vid_mode_type = REG_FIELD(DSI_VID_MODE_CFG, 0, 1),
.cfg_vid_mode_low_power =   REG_FIELD(DSI_VID_MODE_CFG, 8, 13),
.cfg_vid_mode_vpg_en =  REG_FIELD(DSI_VID_MODE_CFG, 16, 16),
.cfg_vid_mode_vpg_horiz =   REG_FIELD(DSI_VID_MODE_CFG, 24, 24),
-   .cfg_vid_pkt_size = REG_FIELD(DSI_VID_PKT_SIZE, 0, 10),
-   .cfg_vid_hsa_time = REG_FIELD(DSI_VID_HSA_TIME, 0, 31),
-   .cfg_vid_hbp_time = REG_FIELD(DSI_VID_HBP_TIME, 0, 31),
-   .cfg_vid_hline_time =   REG_FIELD(DSI_VID_HLINE_TIME, 0, 31),
-   .cfg_vid_vsa_time = REG_FIELD(DSI_VID_VSA_LINES, 0, 31),
-   .cfg_vid_vbp_time = REG_FIELD(DSI_VID_VBP_LINES, 0, 31),
-   .cfg_vid_vfp_time = REG_FIELD(DSI_VID_VFP_LINES, 0, 31),
-   .cfg_vid_vactive_time = REG_FIELD(DSI_VID_VACTIVE_LINES, 0, 31),
+   .cfg_vid_pkt_size = REG_FIELD(DSI_VID_PKT_SIZE, 0, 13),
+   .cfg_vid_hsa_time = REG_FIELD(DSI_VID_HSA_TIME, 0, 11),
+   .cfg_vid_hbp_time = REG_FIELD(DSI_VID_HBP_TIME, 0, 11),
+   .cfg_vid_hline_time =   REG_FIELD(DSI_VID_HLINE_TIME, 0, 14),
+   .cfg_vid_vsa_time = REG_FIELD(DSI_VID_VSA_LINES, 0, 9),
+   .cfg_vid_vbp_time = REG_FIELD(DSI_VID_VBP_LINES, 0, 9),
+   .cfg_vid_vfp_time = REG_FIELD(DSI_VID_VFP_LINES, 0, 9),
+   .cfg_vid_vactive_time = REG_FIELD(DSI_VID_VACTIVE_LINES, 0, 13),
.cfg_phy_txrequestclkhs =   REG_FIELD(DSI_LPCLK_CTRL, 0, 0),
-   .cfg_phy_bta_time = REG_FIELD(DSI_BTA_TO_CNT, 0, 31),
-   .cfg_phy_max_rd_time =  REG_FIELD(DSI_PHY_TMR_CFG, 0, 15),
+   .cfg_phy_bta_time = REG_FIELD(DSI_BTA_TO_CNT, 0, 15),
+   .cfg_phy_max_rd_time =  REG_FIELD(DSI_PHY_TMR_CFG, 0, 14),
.cfg_phy_lp2hs_time =   REG_FIELD(DSI_PHY_TMR_CFG, 16, 23),
.cfg_phy_hs2lp_time =   REG_FIELD(DSI_PHY_TMR_CFG, 24, 31),
-   .cfg_phy_max_rd_time_v131 = REG_FIELD(DSI_PHY_TMR_RD_CFG, 0, 15),
-   .cfg_phy_lp2hs_time_v131 =  REG_FIELD(DSI_PHY_TMR_CFG, 0, 15),
-   .cfg_phy_hs2lp_time_v131 =  REG_FIELD(DSI_PHY_TMR_CFG, 16, 31),
+   .cfg_phy_max_rd_time_v131 = REG_FIELD(DSI_PHY_TMR_RD_CFG, 0, 14),
+   .cfg_phy_lp2hs_time_v131 =  REG_FIELD(DSI_PHY_TMR_CFG, 0, 9),
+   .cfg_phy_hs2lp_time_v131 =  REG_FIELD(DSI_PHY_TMR_CFG, 16, 25),
    .cfg_phy_clklp2hs_time = REG_FIELD(DSI_PHY_TMR_LPCLK_CFG, 0, 15),
    .cfg_phy_clkhs2lp_time = REG_FIELD(DSI_PHY_TMR_LPCLK_CFG, 16, 31),
.cfg_phy_testclr =  REG_FIELD(DSI_PHY_TST_CTRL0, 0, 0),
@@ -361,11 +361,11 @@ static const struct dw_mipi_dsi_va

[PATCH v9 00/11] Genericize DW MIPI DSI bridge and add i.MX 6 driver

2020-06-09 Thread Adrian Ratiu
[Re-submitting to cc dri-devel, sorry about the noise]

Hello all,

v9 cleanly applies on top of latest next-20200609 tree.

v9 does not depend on other patches as the last binding doc has been merged.

All feedback up to this point has been addressed. Specific details in
individual patch changelogs.

The biggest changes are the deprecation of the Synopsys DW bridge bind()
API in favor of of_drm_find_bridge() and .attach callbacks, the addition
of a TODO entry which outlines future planned bridge driver refactorings
and a reordering of some i.MX 6 patches to appease checkpatch.

The idea behind the TODO is to get this regmap and i.MX 6 driver merged
and then do the rest of refactorings in-tree because it's easier and the
refactorings themselves are out-of-scope of this series which is adding
i.MX 6 support and is quite big already, so please, if there are more
refactoring ideas, let's add them to the TODO doc. :) I intend to tackle
those after this series is merged to avoid two complex inter-dependent
simultaneous series.

As always more testing is welcome especially on Rockchip and STM SoCs.

Big thank you to everyone who has contributed to this up to now,
Adrian

Adrian Ratiu (11):
  drm: bridge: dw_mipi_dsi: add initial regmap infrastructure
  drm: bridge: dw_mipi_dsi: abstract register access using reg_fields
  drm: bridge: dw_mipi_dsi: add dsi v1.01 support
  drm: bridge: dw_mipi_dsi: remove bind/unbind API
  dt-bindings: display: add i.MX6 MIPI DSI host controller doc
  ARM: dts: imx6qdl: add missing mipi dsi properties
  drm: imx: Add i.MX 6 MIPI DSI host platform driver
  drm: stm: dw-mipi-dsi: let the bridge handle the HW version check
  drm: bridge: dw-mipi-dsi: split low power cfg register into fields
  drm: bridge: dw-mipi-dsi: fix bad register field offsets
  Documentation: gpu: todo: Add dw-mipi-dsi consolidation plan

 .../display/imx/fsl,mipi-dsi-imx6.yaml| 112 +++
 Documentation/gpu/todo.rst|  25 +
 arch/arm/boot/dts/imx6qdl.dtsi|   8 +
 drivers/gpu/drm/bridge/synopsys/Kconfig   |   1 +
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 713 --
 drivers/gpu/drm/imx/Kconfig   |   8 +
 drivers/gpu/drm/imx/Makefile  |   1 +
 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c| 399 ++
 .../gpu/drm/rockchip/dw-mipi-dsi-rockchip.c   |   7 +-
 drivers/gpu/drm/stm/dw_mipi_dsi-stm.c |  16 +-
 10 files changed, 1059 insertions(+), 231 deletions(-)
 create mode 100644 
Documentation/devicetree/bindings/display/imx/fsl,mipi-dsi-imx6.yaml
 create mode 100644 drivers/gpu/drm/imx/dw_mipi_dsi-imx6.c

-- 
2.27.0



[PATCH v9 01/11] drm: bridge: dw_mipi_dsi: add initial regmap infrastructure

2020-06-09 Thread Adrian Ratiu
In order to support multiple versions of the Synopsys MIPI DSI host
controller, which have different register layouts but almost identical
HW protocols, we add a regmap infrastructure which can abstract away
register accesses for platform drivers using the bridge.

The controller HW revision is detected during bridge probe. Future
commits will use it to load the relevant register layout, which the
bridge will then use transparently to the platform drivers.
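
As a rough illustration of the idea described above (not the bridge's
actual code), the sketch below models it in plain userspace C: a table of
per-version layouts, a lookup keyed on the detected revision, and
mask/shift helpers for one field. The names struct field, struct layout,
pick_layout(), field_get() and field_set(), as well as every version
number and register offset, are assumptions made up for this example; the
in-kernel series relies on the regmap and reg_field APIs instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for a register field: register offset plus bit range. */
struct field {
	uint32_t reg;
	unsigned int lsb;
	unsigned int msb;
};

/* One layout per supported controller revision (a single field shown). */
struct layout {
	uint32_t version;	/* value expected in the version register */
	struct field lp2hs;	/* example timing field */
};

/* Placeholder versions and offsets, for illustration only. */
static const struct layout layouts[] = {
	{ 0x0101, { 0x98, 0, 7 } },
	{ 0x0131, { 0x9c, 0, 9 } },
};

/* Return the layout matching hw_version, or NULL for unsupported hardware. */
const struct layout *pick_layout(uint32_t hw_version)
{
	for (size_t i = 0; i < sizeof(layouts) / sizeof(layouts[0]); i++)
		if (layouts[i].version == hw_version)
			return &layouts[i];
	return NULL;
}

/* Build the bit mask covering lsb..msb inclusive. */
static uint32_t field_mask(const struct field *f)
{
	assert(f->lsb <= f->msb && f->msb < 32);
	uint32_t width = f->msb - f->lsb + 1;
	uint32_t ones = width == 32 ? 0xffffffffu : ((1u << width) - 1u);
	return ones << f->lsb;
}

/* Extract the field value from a raw register value. */
uint32_t field_get(const struct field *f, uint32_t regval)
{
	return (regval & field_mask(f)) >> f->lsb;
}

/* Return regval with the field replaced by v; other bits are preserved. */
uint32_t field_set(const struct field *f, uint32_t regval, uint32_t v)
{
	uint32_t m = field_mask(f);
	return (regval & ~m) | ((v << f->lsb) & m);
}
```

A caller would read the version register once, pass the value to
pick_layout(), and fail early if it returns NULL. Every later access then
goes through the selected layout's fields instead of hard-coded masks.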

Suggested-by: Ezequiel Garcia 
Reviewed-by: Enric Balletbo i Serra 
Tested-by: Adrian Pop 
Tested-by: Arnaud Ferraris 
Signed-off-by: Adrian Ratiu 
---
Changes since v8:
  - Minor typo fix
  - Added Reviewed-by Enric tag

Changes since v7:
  - Minor checkpatch line fix

Changes since v6:
  - Select REGMAP_MMIO in Kconfig (Enric)
  - Drop unnecessary stack variable inits (Enric)
  - Make bridge error ASAP after a bad revision read (Enric)
  - Drop redundant read of hw_version in dphy_timing_config (Enric)

New in v5.
---
 drivers/gpu/drm/bridge/synopsys/Kconfig   |   1 +
 drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c | 210 ++
 2 files changed, 121 insertions(+), 90 deletions(-)

diff --git a/drivers/gpu/drm/bridge/synopsys/Kconfig 
b/drivers/gpu/drm/bridge/synopsys/Kconfig
index 21a1be3ced0f3..080146093b68e 100644
--- a/drivers/gpu/drm/bridge/synopsys/Kconfig
+++ b/drivers/gpu/drm/bridge/synopsys/Kconfig
@@ -39,3 +39,4 @@ config DRM_DW_MIPI_DSI
select DRM_KMS_HELPER
select DRM_MIPI_DSI
select DRM_PANEL_BRIDGE
+   select REGMAP_MMIO
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c 
b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
index 5ef0f154aa7bd..34b8668ae24ea 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -227,6 +228,7 @@ struct dw_mipi_dsi {
struct drm_bridge *panel_bridge;
struct device *dev;
void __iomem *base;
+   struct regmap *regs;
 
struct clk *pclk;
 
@@ -235,6 +237,7 @@ struct dw_mipi_dsi {
u32 lanes;
u32 format;
unsigned long mode_flags;
+   u32 hw_version;
 
 #ifdef CONFIG_DEBUG_FS
struct dentry *debugfs;
@@ -249,6 +252,13 @@ struct dw_mipi_dsi {
const struct dw_mipi_dsi_plat_data *plat_data;
 };
 
+static const struct regmap_config dw_mipi_dsi_regmap_cfg = {
+   .reg_bits = 32,
+   .val_bits = 32,
+   .reg_stride = 4,
+   .name = "dw-mipi-dsi",
+};
+
 /*
  * Check if either a link to a master or slave is present
  */
@@ -280,16 +290,6 @@ static inline struct dw_mipi_dsi *bridge_to_dsi(struct 
drm_bridge *bridge)
return container_of(bridge, struct dw_mipi_dsi, bridge);
 }
 
-static inline void dsi_write(struct dw_mipi_dsi *dsi, u32 reg, u32 val)
-{
-   writel(val, dsi->base + reg);
-}
-
-static inline u32 dsi_read(struct dw_mipi_dsi *dsi, u32 reg)
-{
-   return readl(dsi->base + reg);
-}
-
 static int dw_mipi_dsi_host_attach(struct mipi_dsi_host *host,
   struct mipi_dsi_device *device)
 {
@@ -366,8 +366,8 @@ static void dw_mipi_message_config(struct dw_mipi_dsi *dsi,
if (lpm)
val |= CMD_MODE_ALL_LP;
 
-   dsi_write(dsi, DSI_LPCLK_CTRL, lpm ? 0 : PHY_TXREQUESTCLKHS);
-   dsi_write(dsi, DSI_CMD_MODE_CFG, val);
+   regmap_write(dsi->regs, DSI_LPCLK_CTRL, lpm ? 0 : PHY_TXREQUESTCLKHS);
+   regmap_write(dsi->regs, DSI_CMD_MODE_CFG, val);
 }
 
 static int dw_mipi_dsi_gen_pkt_hdr_write(struct dw_mipi_dsi *dsi, u32 hdr_val)
@@ -375,20 +375,20 @@ static int dw_mipi_dsi_gen_pkt_hdr_write(struct 
dw_mipi_dsi *dsi, u32 hdr_val)
int ret;
u32 val, mask;
 
-   ret = readl_poll_timeout(dsi->base + DSI_CMD_PKT_STATUS,
-val, !(val & GEN_CMD_FULL), 1000,
-CMD_PKT_STATUS_TIMEOUT_US);
+   ret = regmap_read_poll_timeout(dsi->regs, DSI_CMD_PKT_STATUS,
+  val, !(val & GEN_CMD_FULL), 1000,
+  CMD_PKT_STATUS_TIMEOUT_US);
if (ret) {
dev_err(dsi->dev, "failed to get available command FIFO\n");
return ret;
}
 
-   dsi_write(dsi, DSI_GEN_HDR, hdr_val);
+   regmap_write(dsi->regs, DSI_GEN_HDR, hdr_val);
 
mask = GEN_CMD_EMPTY | GEN_PLD_W_EMPTY;
-   ret = readl_poll_timeout(dsi->base + DSI_CMD_PKT_STATUS,
-val, (val & mask) == mask,
-1000, CMD_PKT_STATUS_TIMEOUT_US);
+   ret = regmap_read_poll_timeout(dsi->regs, DSI_CMD_PKT_STATUS,
+  val, (val & mask) == mask,
+  1000, CMD_PKT_STATUS_TIMEOUT_US);
if (ret) {
dev_err(dsi-&

[PATCH 1/2] Revert "HID: dragonrise: fix HID Descriptor for 0x0006 PID"

2016-09-25 Thread Ioan-Adrian Ratiu
7
Item(Global): Physical Maximum, data= [ 0xff 0x00 ] 255
Item(Global): Logical Maximum, data= [ 0xff 0x00 ] 255
Item(Local ): Usage, data= [ 0x02 ] 2
(null)
Item(Main  ): Output, data= [ 0x02 ] 2
Data Variable Absolute No_Wrap Linear
Preferred_State No_Null_Position Non_Volatile 
Bitfield
Item(Main  ): End Collection, data=none
Item(Main  ): End Collection, data=none

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/hid-dr.c | 58 
 1 file changed, 58 deletions(-)

diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
index 8fd4bf7..2523f8a 100644
--- a/drivers/hid/hid-dr.c
+++ b/drivers/hid/hid-dr.c
@@ -234,58 +234,6 @@ static __u8 pid0011_rdesc_fixed[] = {
0xC0/*  End Collection  */
 };
 
-static __u8 pid0006_rdesc_fixed[] = {
-   0x05, 0x01,/* Usage Page (Generic Desktop)  */
-   0x09, 0x04,/* Usage (Joystick)  */
-   0xA1, 0x01,/* Collection (Application)  */
-   0xA1, 0x02,/*   Collection (Logical)*/
-   0x75, 0x08,/* Report Size (8)   */
-   0x95, 0x05,/* Report Count (5)  */
-   0x15, 0x00,/* Logical Minimum (0)   */
-   0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
-   0x35, 0x00,/* Physical Minimum (0)  */
-   0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
-   0x09, 0x30,/* Usage (X) */
-   0x09, 0x33,/* Usage (Ry)*/
-   0x09, 0x32,/* Usage (Z) */
-   0x09, 0x31,/* Usage (Y) */
-   0x09, 0x34,/* Usage (Ry)*/
-   0x81, 0x02,/* Input (Variable)  */
-   0x75, 0x04,/* Report Size (4)   */
-   0x95, 0x01,/* Report Count (1)  */
-   0x25, 0x07,/* Logical Maximum (7)   */
-   0x46, 0x3B, 0x01,  /* Physical Maximum (315)*/
-   0x65, 0x14,/* Unit (Centimeter) */
-   0x09, 0x39,/* Usage (Hat switch)*/
-   0x81, 0x42,/* Input (Variable)  */
-   0x65, 0x00,/* Unit (None)   */
-   0x75, 0x01,/* Report Size (1)   */
-   0x95, 0x0C,/* Report Count (12) */
-   0x25, 0x01,/* Logical Maximum (1)   */
-   0x45, 0x01,/* Physical Maximum (1)  */
-   0x05, 0x09,/* Usage Page (Button)   */
-   0x19, 0x01,/* Usage Minimum (0x01)  */
-   0x29, 0x0C,/* Usage Maximum (0x0C)  */
-   0x81, 0x02,/* Input (Variable)  */
-   0x06, 0x00, 0xFF,  /* Usage Page (Vendor Defined)   */
-   0x75, 0x01,/* Report Size (1)   */
-   0x95, 0x08,/* Report Count (8)  */
-   0x25, 0x01,/* Logical Maximum (1)   */
-   0x45, 0x01,/* Physical Maximum (1)  */
-   0x09, 0x01,/* Usage (0x01)  */
-   0x81, 0x02,/* Input (Variable)  */
-   0xC0,  /*   End Collection  */
-   0xA1, 0x02,/*   Collection (Logical)*/
-   0x75, 0x08,/* Report Size (8)   */
-   0x95, 0x07,/* Report Count (7)  */
-   0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
-   0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
-   0x09, 0x02,/* Usage (0x02)  */
-   0x91, 0x02,/* Output (Variable) */
-   0xC0,  /*   End Collection  */
-   0xC0   /* End Collection*/
-};
-
 static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 *rdesc,
unsigned int *rsize)
 {
@@ -296,12 +244,6 @@ static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 
*rdesc,
*rsize = sizeof(pid0011_rdesc_fixed);
}
break;
-   case 0x0006:
-   if (*rsize == sizeof(pid0006_rdesc_fixed)) {
-   rdesc = pid0006_rdesc_fixed;
-   *rsize = sizeof(pid0006_rdesc_fixed);
-   }
-   break;
}
return rdesc;
 }
-- 
2.10.0



[PATCH 2/2] hid: input: add HID_QUIRK_REUSE_AXES and fix dragonrise

2016-09-25 Thread Ioan-Adrian Ratiu
Commit 79346d620e9d ("HID: input: force generic axis to be mapped to their
user space axis") made mapping generic axes to their userspace equivalents
mandatory and some lower end gamepads which were depending on the previous
behaviour suffered severe regressions because they were reusing axes and
expecting hid-input to multiplex their map to the respective userspace axis
by always searching for and using the next available axis.

Now different device axes appear on a single axis in userspace, which is
clearly a regression in the hid-input driver: it needs to keep handling
this hardware the way it did before the forced mapping, so that userspace
sees the same interface.

Since these lower-end gamepads like 0079:0006 are definitely the exception,
create a quirk to fix them.
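
To make the difference between the two policies concrete, here is a small
userspace model (a simplified sketch, not the hid-input code). It assumes
the axis code is the low nibble of the usage, as in the
"usage->hid & 0xf" expression in the diff below, and that the older
behaviour walked forward to the next unused axis on a clash. AXIS_COUNT,
map_forced() and map_next_free() are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define AXIS_COUNT 16	/* illustrative size, not the kernel's ABS range */

/* Generic Desktop usages: usage page 0x01, usage IDs 0x30 and 0x31. */
#define USAGE_X 0x00010030u
#define USAGE_Y 0x00010031u

/* Forced policy: every usage lands on its own axis, even when reused. */
void map_forced(const unsigned int *usages, size_t n, unsigned int *codes)
{
	assert(usages && codes);
	for (size_t i = 0; i < n; i++)
		codes[i] = usages[i] & 0xf;
}

/*
 * Older policy: start at the usage's own axis and, if it is already
 * taken, move on to the next free one. A result equal to AXIS_COUNT
 * means no free axis was left.
 */
void map_next_free(const unsigned int *usages, size_t n, unsigned int *codes)
{
	bool taken[AXIS_COUNT] = { false };

	assert(usages && codes);
	for (size_t i = 0; i < n; i++) {
		unsigned int code = usages[i] & 0xf;

		while (code < AXIS_COUNT && taken[code])
			code++;
		if (code < AXIS_COUNT)
			taken[code] = true;
		codes[i] = code;
	}
}
```

For a hypothetical device reporting X, Y, X, Y, the forced policy yields
axis codes 0, 1, 0, 1 (two pairs collapsed onto the same axes), while the
next-free policy yields 0, 1, 2, 3.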

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/hid-dr.c|  2 ++
 drivers/hid/hid-input.c | 16 +++-
 include/linux/hid.h |  1 +
 3 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
index 2523f8a..27fc826 100644
--- a/drivers/hid/hid-dr.c
+++ b/drivers/hid/hid-dr.c
@@ -274,6 +274,8 @@ static int dr_probe(struct hid_device *hdev, const struct 
hid_device_id *id)
hid_hw_stop(hdev);
goto err;
}
+   /* has only 5 axes and reuses X, Y */
+   hdev->quirks |= HID_QUIRK_REUSE_AXES;
break;
}
 
diff --git a/drivers/hid/hid-input.c b/drivers/hid/hid-input.c
index fb9ace1..1cc6fe4 100644
--- a/drivers/hid/hid-input.c
+++ b/drivers/hid/hid-input.c
@@ -633,11 +633,17 @@ static void hidinput_configure_usage(struct hid_input 
*hidinput, struct hid_fiel
/* These usage IDs map directly to the usage codes. */
case HID_GD_X: case HID_GD_Y: case HID_GD_Z:
case HID_GD_RX: case HID_GD_RY: case HID_GD_RZ:
-   if (field->flags & HID_MAIN_ITEM_RELATIVE)
-   map_rel(usage->hid & 0xf);
-   else
-   map_abs_clear(usage->hid & 0xf);
-   break;
+
+   /* if quirk is active don't force the userspace mapping,
+* instead search and use the next available axis.
+*/
+   if (!(device->quirks & HID_QUIRK_REUSE_AXES)) {
+   if (field->flags & HID_MAIN_ITEM_RELATIVE)
+   map_rel(usage->hid & 0xf);
+   else
+   map_abs_clear(usage->hid & 0xf);
+   break;
+   }
 
case HID_GD_SLIDER: case HID_GD_DIAL: case HID_GD_WHEEL:
if (field->flags & HID_MAIN_ITEM_RELATIVE)
diff --git a/include/linux/hid.h b/include/linux/hid.h
index 75b66ec..0979920 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -320,6 +320,7 @@ struct hid_item {
 #define HID_QUIRK_NO_EMPTY_INPUT   0x00000100
 #define HID_QUIRK_NO_INIT_INPUT_REPORTS0x00000200
 #define HID_QUIRK_ALWAYS_POLL  0x00000400
+#define HID_QUIRK_REUSE_AXES   0x00000800
 #define HID_QUIRK_SKIP_OUTPUT_REPORTS  0x00010000
 #define HID_QUIRK_SKIP_OUTPUT_REPORT_ID0x00020000
 #define HID_QUIRK_NO_OUTPUT_REPORTS_ON_INTR_EP 0x00040000
-- 
2.10.0



Re: [PATCH 1/2] Revert "HID: dragonrise: fix HID Descriptor for 0x0006 PID"

2016-09-26 Thread Ioan-Adrian Ratiu
Hi

On Mon, 26 Sep 2016, Benjamin Tissoires  wrote:
> Thanks for the patch series. I am not against it, but I'd rather see the
> commit message of this one amended, and the second patch changed.

Sorry if I came out too aggressive (I'll amend), I'm just annoyed that I
had to spend a weekend night digging through this crap because my fiancee
came crying to me that her gamepad stopped working after a kernel update.

>
> On Sep 26 2016 or thereabouts, Ioan-Adrian Ratiu wrote:
>> This reverts commit 18339f59c3a6 ("HID: dragonrise: fix HID...")
>> because the "fix" is bogus. That report descriptor is different in
>
> I am pretty sure this "fix" works for many. You seem to have a different
> hardware (generation probably) that makes the "fix" to fail for you.
> The issue is more that the manufacturer doesn't bother to reallocate a
> new PID for the new device when they change something in it, so please
> don't blame the author of the fix (which I am not).

We don't actually know that it's a new generation of hardware without a
new PID and I really doubt manufacturers are *that* stupid, they should
at least have increased the device revision number, but... China.

>
>> hardware (see below) and it's the way the hardware works, it can't be
>
> Well, it's kind of hard to compare the lsusb output to the fixup in the
> kernel. I'd like to know what changed, but I can't...
>
>> fixed at this level because it reuses axes by design.
>
> It can be fixed in hid-dr. See my comments on the next patch.

Thanks, I'll try to add an input mapping in the driver. However if
indeed there are different hardware with the same PID and this "fix" is
for another issue than the one I'm having then I'd really rather not
revert this if possible to not break other people's hardware. But I also
can't keep it because it breaks my hardware.

Does anyone have any suggestions what to do in this case?

>
>> 
>> What this change tried to fix is a regression caused by commit 20aef664f139
>
> As mentioned in your next patch in the series, the correct commit id is
> 79346d620e9de87912de73337f6df8b7f9a46888

Thanks, this was a slip-up on my part.

>
>> ("HID: input: force generic axis to be mapped to their user space axis") by
>> working around the problem and trying to change the report descriptor in
>> hid-dr, which obviously can't work and introduces more breakage because it
>> adds another unnecessary layer of multiplexing/indirection, making the
>> dragonrise gamepad practically unusable in userspace.
>
> Not sure who you are blaming here. Is it me (the author of 79346d620e or
> the author of the fix in hid-dr)? If it's me, I agree, the patch was a
> little too aggressive, though I must say only one hardware maker has
> such a crappy device that we need to care of. So this is why I just let
> the patch in place without trying to have a better solution.
>
> If the blame is on the author of the hid-dr, I must say that I find the
> tone of this paragraph quite aggressive for nothing. Your device is
> different than the one that was used for the original fix, so it breaks.
> But I can guarantee you that the fix works for the intended device (I
> happen to have one I tested recently). So please blame the hardware
> maker, not the people involved in the community who are doing their
> best.

I'm not assigning blame because it's counterproductive. Of course, I
agree if we are to find a scapegoat it's the crappy manufacturers
because they make the crappy hardware with crappy or no Linux support
(btw I am in no way affiliated with dragonrise or any other vendor).

The only thing that annoys me is that this known hid-input kernel
regression has been ignored for all this time, leaving users like me
dead in the water with a broken driver upon kernel update.

>
>> 
>> This needs to be fixed where the regression was initially introduced in
>> hid-input (the next patch does this).
>
> Actually, again, I tend to disagree :)
> I'll go more in details in the next patch.
>
> Cheers,
> Benjamin
>
>> 
>> Here's the descriptor taken directly from the device via lsusb:
>> HID Device Descriptor:
>>   bLength 9
>>   bDescriptorType33
>>   bcdHID   1.10
>>   bCountryCode   33 US
>>   bNumDescriptors 1
>>   bDescriptorType34 Report
>>   wDescriptorLength 101
>>   Report Descriptor: (length is 101)
>> Item(Global): Usage Page, data= [ 0x01 ] 1
>>

Re: [PATCH 2/2] hid: input: add HID_QUIRK_REUSE_AXES and fix dragonrise

2016-09-27 Thread Ioan-Adrian Ratiu
On Tue, 27 Sep 2016, Vladislav Naumov  wrote:
> Yes, I still have one of those!
> 0079:0011 DragonRise Inc. Gamepad
> Left shift buttons are broken now, but axis and main buttons are still 
> working.
> Axis is handled properly with 3.16.0-4-686-pae #1 SMP Debian
> 3.16.7-ckt25-2 (2016-04-08) i686 GNU/Linux from debian/stable.
> I can test what you want.

Can you please wait a little until I post v2 later today and test v2
directly? Because the change in its current form has no effect on
0079:0011 (the current quirk is enabled only for 0006).

When I add the input mapping in the hid-dr driver then it will affect
both 0006 and 0011 so that's the patch really worth testing.

Thanks a lot for taking time to test this,
Ionel

> Should I apply the patch from forwarded message to upstream kernel, or
> I can just pull it from some host with everything applied?
>
> On Mon, Sep 26, 2016 at 4:53 PM, Nikolai Kondrashov  wrote:
>> Hi Benjamin,
>>
>> On 09/26/2016 12:29 PM, Benjamin Tissoires wrote:
>>>
>>> Ideally, we need to have Dragon Rise 0x0011 tested too. Nick, would you
>>> mind checking it if you still have this particular device?
>>
>>
>> I never had it, but perhaps Vladislav still has some.
>>
>> Vladislav, would you be able to test a change to the kernel module for your
>> Dragonrise gamepads?
>>
>> Please see below for context.
>>
>> Thank you.
>>
>> Nick
>>
>> On 09/26/2016 12:29 PM, Benjamin Tissoires wrote:
>>>
>>> On Sep 26 2016 or thereabouts, Ioan-Adrian Ratiu wrote:
>>>>
>>>> Commit 79346d620e9d ("HID: input: force generic axis to be mapped to
>>>> their
>>>> user space axis") made mapping generic axes to their userspace
>>>> equivalents
>>>> mandatory and some lower end gamepads which were depending on the
>>>> previous
>>>> behaviour suffered severe regressions because they were reusing axes and
>>>> expecting hid-input to multiplex their map to the respective userspace
>>>> axis
>>>> by always searching for and using the next available axis.
>>>
>>>
>>> Yes, I apologize for the breakage and the regression, though I must say
>>> that for now, only one hardware maker and one device (or range of devices
>>> from the look of it) has needed to be quirked.
>>>
>>>>
>>>> Now the result is that different device axes appear on a single axis in
>>>> userspace, which is clearly a regression in the hid-input driver because
>>>> it
>>>> needs to continue to handle this hardware as expected before the forcing
>>>> to provide the same interface to userspace.
>>>>
>>>> Since these lower-end gamepads like 0079:0006 are definitely the
>>>> exception,
>>>> create a quirk to fix them.
>>>
>>>
>>> Given that we only have this particular vendor that is an issue, I'd
>>> rather see the fix in hid-dr.c. The reason being that you actually don't
>>> need to have a global quirk and this simplifies the path in hid-input.
>>> Plus for users, they can just upgrade hid-dr without having to recompile
>>> their kernel when hid-core is not compiled as a module.
>>>
>>> The cleanest solution that wouldn't require any quirk in hid-core is to
>>> simply add an .input_mapping() callback in hid-dr.c.
>>>
>>> The code of the callback could be something like (untested):
>>>
>>> static int dr_input_mapping(struct hid_device *hdev, struct hid_input *hi,
>>> struct hid_field *field, struct hid_usage *usage,
>>> unsigned long **bit, int *max)
>>> {
>>> switch (usage->hid) {
>>> /*
>>>  * revert the old hid-input behavior where axes
>>>  * can be randomly assigned when the hid usage is
>>>  * reused.
>>>  */
>>> case HID_GD_X: case HID_GD_Y: case HID_GD_Z:
>>> case HID_GD_RX: case HID_GD_RY: case HID_GD_RZ:
>>> if (field->flags & HID_MAIN_ITEM_RELATIVE)
>>> map_rel(usage->hid & 0xf);
>>> else
>>> map_abs(usage->hid & 0xf);
>>> return 1;
>>> }
>>>
>>> return 0;
>>> }
>>>
>>> Hopefully, something like this should revert the old behavior for all
>>> hid-dr touchpads.
>>&

[PATCH v2 1/2] Revert "HID: dragonrise: fix HID Descriptor for 0x0006 PID"

2016-09-27 Thread Ioan-Adrian Ratiu
This reverts commit 18339f59c3a6 ("HID: dragonrise: fix HID...") because it
breaks certain dragonrise 0079:0006 gamepads. While it may fix a breakage
caused by commit 79346d620e9d ("HID: input: force generic axis to be mapped
to their user space axis"), it is probable that the manufacturer released
different hardware with the same PID so this fix works for only a subset
and breaks the other gamepads sharing the PID.

What is needed is another more generic solution which fixes 79346d620e9d
("HID: input: force generic axis ...") breakage for this controller: we
need to add an exception for this driver to make it keep the old behaviour
previous to the initial breakage (this is done in patch 2 of this series).

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/hid-dr.c | 58 
 1 file changed, 58 deletions(-)

diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
index 8fd4bf7..2523f8a 100644
--- a/drivers/hid/hid-dr.c
+++ b/drivers/hid/hid-dr.c
@@ -234,58 +234,6 @@ static __u8 pid0011_rdesc_fixed[] = {
0xC0/*  End Collection  */
 };
 
-static __u8 pid0006_rdesc_fixed[] = {
-   0x05, 0x01,/* Usage Page (Generic Desktop)  */
-   0x09, 0x04,/* Usage (Joystick)  */
-   0xA1, 0x01,/* Collection (Application)  */
-   0xA1, 0x02,/*   Collection (Logical)*/
-   0x75, 0x08,/* Report Size (8)   */
-   0x95, 0x05,/* Report Count (5)  */
-   0x15, 0x00,/* Logical Minimum (0)   */
-   0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
-   0x35, 0x00,/* Physical Minimum (0)  */
-   0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
-   0x09, 0x30,/* Usage (X) */
-   0x09, 0x33,/* Usage (Ry)*/
-   0x09, 0x32,/* Usage (Z) */
-   0x09, 0x31,/* Usage (Y) */
-   0x09, 0x34,/* Usage (Ry)*/
-   0x81, 0x02,/* Input (Variable)  */
-   0x75, 0x04,/* Report Size (4)   */
-   0x95, 0x01,/* Report Count (1)  */
-   0x25, 0x07,/* Logical Maximum (7)   */
-   0x46, 0x3B, 0x01,  /* Physical Maximum (315)*/
-   0x65, 0x14,/* Unit (Centimeter) */
-   0x09, 0x39,/* Usage (Hat switch)*/
-   0x81, 0x42,/* Input (Variable)  */
-   0x65, 0x00,/* Unit (None)   */
-   0x75, 0x01,/* Report Size (1)   */
-   0x95, 0x0C,/* Report Count (12) */
-   0x25, 0x01,/* Logical Maximum (1)   */
-   0x45, 0x01,/* Physical Maximum (1)  */
-   0x05, 0x09,/* Usage Page (Button)   */
-   0x19, 0x01,/* Usage Minimum (0x01)  */
-   0x29, 0x0C,/* Usage Maximum (0x0C)  */
-   0x81, 0x02,/* Input (Variable)  */
-   0x06, 0x00, 0xFF,  /* Usage Page (Vendor Defined)   */
-   0x75, 0x01,/* Report Size (1)   */
-   0x95, 0x08,/* Report Count (8)  */
-   0x25, 0x01,/* Logical Maximum (1)   */
-   0x45, 0x01,/* Physical Maximum (1)  */
-   0x09, 0x01,/* Usage (0x01)  */
-   0x81, 0x02,/* Input (Variable)  */
-   0xC0,  /*   End Collection  */
-   0xA1, 0x02,/*   Collection (Logical)*/
-   0x75, 0x08,/* Report Size (8)   */
-   0x95, 0x07,/* Report Count (7)  */
-   0x46, 0xFF, 0x00,  /* Physical Maximum (255)*/
-   0x26, 0xFF, 0x00,  /* Logical Maximum (255) */
-   0x09, 0x02,/* Usage (0x02)  */
-   0x91, 0x02,/* Output (Variable) */
-   0xC0,  /*   End Collection  */
-   0xC0   /* End Collection*/
-};
-
 static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 *rdesc,
unsigned int *rsize)
 {
@@ -296,12 +244,6 @@ static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 
*rdesc,
*rsize = sizeof(pid0011_rdesc_fixed);
}
break;
-   case 0x0006:
-   if (*rsize == sizeof(pid0006_rdesc_fixed)) {
-   rdesc = pid0006_rdesc_fixed;
-   *rsiz

[PATCH v2 2/2] hid: hid-dr: add input mapping for axis selection

2016-09-27 Thread Ioan-Adrian Ratiu
Commit 79346d620e9d ("HID: input: force generic axis to be mapped to their
user space axis") made mapping generic axes to their userspace equivalents
mandatory and some lower end gamepads which were depending on the previous
behaviour suffered severe regressions because they were reusing axes and
expecting hid-input to multiplex their map to the respective userspace axis
by always searching for and using the next available axis.

One solution is to add a hid quirk for this type of "previous" behaviour in
hid-input to bypass the new axes policy in favour of the old one, but since
only one hardware vendor seems to be affected negatively we're better off
making an exception and adding the mapping in the driver for now; if more vendors or
drivers turn out to experience the problem we should reconsider the quirk
solution.

Signed-off-by: Ioan-Adrian Ratiu 
---
 drivers/hid/hid-dr.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/drivers/hid/hid-dr.c b/drivers/hid/hid-dr.c
index 2523f8a..818ea7d9 100644
--- a/drivers/hid/hid-dr.c
+++ b/drivers/hid/hid-dr.c
@@ -248,6 +248,30 @@ static __u8 *dr_report_fixup(struct hid_device *hdev, __u8 
*rdesc,
return rdesc;
 }
 
+#define map_abs(c)  hid_map_usage(hi, usage, bit, max, EV_ABS, (c))
+#define map_rel(c)  hid_map_usage(hi, usage, bit, max, EV_REL, (c))
+
+static int dr_input_mapping(struct hid_device *hdev, struct hid_input *hi,
+   struct hid_field *field, struct hid_usage *usage,
+   unsigned long **bit, int *max)
+{
+   switch (usage->hid) {
+   /*
+* revert to the old hid-input behavior where axes
+* can be randomly assigned when hid->usage is reused.
+*/
+   case HID_GD_X: case HID_GD_Y: case HID_GD_Z:
+   case HID_GD_RX: case HID_GD_RY: case HID_GD_RZ:
+   if (field->flags & HID_MAIN_ITEM_RELATIVE)
+   map_rel(usage->hid & 0xf);
+   else
+   map_abs(usage->hid & 0xf);
+   return 1;
+   }
+
+   return 0;
+}
+
 static int dr_probe(struct hid_device *hdev, const struct hid_device_id *id)
 {
int ret;
@@ -294,6 +318,7 @@ static struct hid_driver dr_driver = {
.id_table = dr_devices,
.report_fixup = dr_report_fixup,
.probe = dr_probe,
+   .input_mapping = dr_input_mapping,
 };
 module_hid_driver(dr_driver);
 
-- 
2.10.0



[RFC][PATCH] Revert "ARM: dts: bcm2837: Fix polarity of wifi reset GPIOs"

2019-01-27 Thread Ioan-Adrian Ratiu
This reverts commit bea8a160c621d19f7f78b13e14e03f4b8e44cd4b.

Contrary to what the commit message says, on my rpi 3 b v1.2 changing
the polarity causes the exact behaviour this commit intends to fix, as
described at the referenced link below (wlan0 disappears).

With reset-gpios = ... GPIO_ACTIVE_HIGH, brcmfmac errors in dmesg:

[7.977512] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep 
state -110
[7.977623] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978007] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978377] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978724] brcmfmac: brcmf_sdio_dpc: failed backplane access over SDIO, 
halting operation
[7.978734] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg 
failed w/status -110
[7.978747] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[7.982817] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep 
state -110
[7.982880] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.983255] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame

The only solution I currently have is to revert and everything works
as expected and as before changing the polarity.
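
For readers unfamiliar with the polarity flag, the sketch below shows how
a logical GPIO value translates to an electrical pin level under each
setting. It is a userspace model only, and it takes no position on which
polarity is correct for this board. It assumes that the power sequence
asserts reset (logical 1) before power-on and releases it (logical 0)
afterwards; the enum and function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

enum polarity { ACTIVE_HIGH, ACTIVE_LOW };
enum phase { PRE_POWER_ON, POST_POWER_ON };

/* Translate a logical value into the electrical level driven on the pin. */
bool pin_level(enum polarity pol, bool logical)
{
	return pol == ACTIVE_LOW ? !logical : logical;
}

/* Assumed sequence: reset asserted before power-on, released afterwards. */
bool reset_logical(enum phase p)
{
	assert(p == PRE_POWER_ON || p == POST_POWER_ON);
	return p == PRE_POWER_ON;
}

/* Electrical level of the reset pin for a given polarity and phase. */
bool reset_pin_level(enum polarity pol, enum phase p)
{
	return pin_level(pol, reset_logical(p));
}
```

Under these assumptions the pin ends up high after power-on with
ACTIVE_LOW and low with ACTIVE_HIGH, so flipping the flag inverts the
level the chip sees once the sequence has finished.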

Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911443
Signed-off-by: Ioan-Adrian Ratiu 
---
 arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts | 2 +-
 arch/arm/boot/dts/bcm2837-rpi-3-b.dts  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts 
b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
index 93762244be7f..4adb85e66be3 100644
--- a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
+++ b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
@@ -31,7 +31,7 @@
 
wifi_pwrseq: wifi-pwrseq {
compatible = "mmc-pwrseq-simple";
-   reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
+   reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
};
 };
 
diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b.dts 
b/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
index 89e6fd547c75..c318bcbc6ba7 100644
--- a/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
+++ b/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
@@ -26,7 +26,7 @@
 
wifi_pwrseq: wifi-pwrseq {
compatible = "mmc-pwrseq-simple";
-   reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
+   reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
};
 };
 
-- 
2.20.1



Re: [RFC][PATCH] Revert "ARM: dts: bcm2837: Fix polarity of wifi reset GPIOs"

2019-01-27 Thread Ioan-Adrian Ratiu

Link to the full 4.19.18 config I'm using:

https://drive.google.com/open?id=1ZI3MeGB2fkYMsEjzGQYXUk2wqr0h9h7R

On Sun, 27 Jan 2019, Ioan-Adrian Ratiu  wrote:

This reverts commit bea8a160c621d19f7f78b13e14e03f4b8e44cd4b.

Contrary to what the commit message says, on my rpi 3 b v1.2 changing
the polarity causes the exact behaviour this commit intends to fix, as
described at the referenced link below (wlan0 disappears).

With reset-gpios = ... GPIO_ACTIVE_HIGH, brcmfmac errors in dmesg:

[7.977512] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep 
state -110
[7.977623] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978007] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978377] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.978724] brcmfmac: brcmf_sdio_dpc: failed backplane access over SDIO, 
halting operation
[7.978734] brcmfmac: brcmf_proto_bcdc_query_dcmd: brcmf_proto_bcdc_msg 
failed w/status -110
[7.978747] brcmfmac: brcmf_cfg80211_get_channel: chanspec failed (-110)
[7.982817] brcmfmac: brcmf_sdio_bus_sleep: error while changing bus sleep 
state -110
[7.982880] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame
[7.983255] brcmfmac: brcmf_sdio_txfail: sdio error, abort command and 
terminate frame

The only solution I currently have is to revert and everything works
as expected and as before changing the polarity.

Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=911443
Signed-off-by: Ioan-Adrian Ratiu 
---
 arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts | 2 +-
 arch/arm/boot/dts/bcm2837-rpi-3-b.dts  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts 
b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
index 93762244be7f..4adb85e66be3 100644
--- a/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
+++ b/arch/arm/boot/dts/bcm2837-rpi-3-b-plus.dts
@@ -31,7 +31,7 @@
 
 	wifi_pwrseq: wifi-pwrseq {

compatible = "mmc-pwrseq-simple";
-   reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
+   reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
};
 };
 
diff --git a/arch/arm/boot/dts/bcm2837-rpi-3-b.dts b/arch/arm/boot/dts/bcm2837-rpi-3-b.dts

index 89e6fd547c75..c318bcbc6ba7 100644
--- a/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
+++ b/arch/arm/boot/dts/bcm2837-rpi-3-b.dts
@@ -26,7 +26,7 @@
 
 	wifi_pwrseq: wifi-pwrseq {

compatible = "mmc-pwrseq-simple";
-   reset-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
+   reset-gpios = <&expgpio 1 GPIO_ACTIVE_HIGH>;
};
 };
 
--

2.20.1

