Re: [PATCH AUTOSEL 5.6 16/62] most: core: use function subsys_initcall()

2020-05-14 Thread Greg Kroah-Hartman
On Thu, May 14, 2020 at 02:51:01PM -0400, Sasha Levin wrote:
> From: Christian Gromm 
> 
> [ Upstream commit 5e56bc06e18dfc8a66180fa369384b36e2ab621a ]
> 
> This patch replaces function module_init() with subsys_initcall().
> It is needed to ensure that the core module of the driver is
> initialized before a component tries to register with the core. This
> leads to a NULL pointer dereference if the driver is configured as
> in-tree.
> 
> Signed-off-by: Christian Gromm 
> Reported-by: kernel test robot 
> Link: 
> https://lore.kernel.org/r/1587741394-22021-1-git-send-email-christian.gr...@microchip.com
> Signed-off-by: Greg Kroah-Hartman 
> Signed-off-by: Sasha Levin 
> ---
>  drivers/staging/most/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/most/core.c b/drivers/staging/most/core.c
> index 0c4ae6920d77d..409c48c597f2f 100644
> --- a/drivers/staging/most/core.c
> +++ b/drivers/staging/most/core.c
> @@ -1484,7 +1484,7 @@ static void __exit most_exit(void)
>   ida_destroy(&mdev_id);
>  }
>  
> -module_init(most_init);
> +subsys_initcall(most_init);
>  module_exit(most_exit);
>  MODULE_LICENSE("GPL");
>  MODULE_AUTHOR("Christian Gromm ");

This is not needed in 5.6 and older kernels due to the most/core.c code
being in staging for these releases.  It only became an issue when it
moved out of staging.

So please drop this from here and any older trees you might have
selected it for.

thanks,

greg k-h
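
For reference, the ordering problem described in the quoted commit
message can be sketched in userspace. This is only a model, assuming
that built-in module_init() code runs at the device initcall level and
that subsys initcalls run earlier; the struct, the level constants and
all function names below are made up for illustration and are not the
kernel's initcall machinery.

```c
#include <stddef.h>

/* Illustrative level numbers: subsys runs before device. */
enum { LEVEL_SUBSYS = 4, LEVEL_DEVICE = 6, LEVEL_MAX = 7 };

struct initcall {
	int level;
	int (*fn)(void);
};

static int core_ready;	/* hypothetical stand-in for the core's state */

static int core_init(void)
{
	core_ready = 1;
	return 0;
}

static int component_init(void)
{
	/* Registering before the core is ready is the failure mode. */
	return core_ready ? 0 : -1;
}

/*
 * Run the calls level by level; within one level keep table order
 * (the analogue of link order). Returns the number of failed calls.
 */
static int run_initcalls(const struct initcall *calls, size_t n)
{
	int level, failures = 0;
	size_t i;

	for (level = 0; level <= LEVEL_MAX; level++)
		for (i = 0; i < n; i++)
			if (calls[i].level == level && calls[i].fn() != 0)
				failures++;
	return failures;
}
```

With both calls at LEVEL_DEVICE and the component listed first, the
component fails; moving core_init() to LEVEL_SUBSYS makes the table
order irrelevant.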


[PATCH 0/8] Copy hashmap to libapi, use in perf expr

2020-05-14 Thread Ian Rogers
Perf's expr code currently builds an array of strings then removes
duplicates. The array is larger than necessary and has recently been
increased in size. When this was done it was commented that a hashmap
would be preferable.

libbpf has a hashmap but libbpf isn't currently required to build
perf. To satisfy various concerns this change copies libbpf's hashmap
into libapi, it then adds a check in perf that the two are in sync.

Andrii's patch to hashmap from bpf-next is brought into this set to
fix issues with hashmap__clear.

Three minor changes to libbpf's hashmap are made that remove an unused
dependency, fix a compiler warning and make sure the hashmap isn't
part of the symbols in a static build of libbpf (dsos are handled by
the existing version script).

Two perf tests are also brought in as they need refactoring to account
for the expr API change and it is expected they will land ahead of
this.
https://lore.kernel.org/lkml/20200513062236.854-2-irog...@google.com/

Tested with 'perf test' and make -C tools/perf build-test.

The hashmap change was originally part of an RFC:
https://lore.kernel.org/lkml/20200508053629.210324-1-irog...@google.com/

Andrii Nakryiko (1):
  libbpf: Fix memory leak and possible double-free in hashmap__clear

Ian Rogers (7):
  libbpf hashmap: Remove unused #include
  libbpf hashmap: Fix signedness warnings
  libbpf hashmap: Localize static hashmap__* symbols
  tools lib/api: Copy libbpf hashmap to libapi
  perf test: Provide a subtest callback to ask for the reason for
    skipping a subtest
  perf test: Improve pmu event metric testing
  perf expr: Migrate expr ids table to a hashmap

 tools/lib/api/Build |   1 +
 tools/lib/api/hashmap.c | 238 
 tools/lib/api/hashmap.h | 177 
 tools/lib/bpf/Makefile  |   2 +
 tools/lib/bpf/hashmap.c |  10 +-
 tools/lib/bpf/hashmap.h |   1 -
 tools/perf/check-headers.sh |   4 +
 tools/perf/tests/builtin-test.c |  18 ++-
 tools/perf/tests/expr.c |  40 +++---
 tools/perf/tests/pmu-events.c   | 169 ++-
 tools/perf/tests/tests.h|   4 +
 tools/perf/util/expr.c  | 129 +
 tools/perf/util/expr.h  |  22 ++-
 tools/perf/util/expr.y  |  22 +--
 tools/perf/util/metricgroup.c   |  87 ++--
 tools/perf/util/stat-shadow.c   |  49 ---
 16 files changed, 793 insertions(+), 180 deletions(-)
 create mode 100644 tools/lib/api/hashmap.c
 create mode 100644 tools/lib/api/hashmap.h

-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH 3/8] libbpf hashmap: Fix signedness warnings

2020-05-14 Thread Ian Rogers
Fixes the following warnings:

hashmap.c: In function ‘hashmap__clear’:
hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
  150 |  for (bkt = 0; bkt < map->cap; bkt++)\

hashmap.c: In function ‘hashmap_grow’:
hashmap.h:150:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
  150 |  for (bkt = 0; bkt < map->cap; bkt++)\

Signed-off-by: Ian Rogers 
---
 tools/lib/bpf/hashmap.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/hashmap.c b/tools/lib/bpf/hashmap.c
index cffb96202e0d..a405dad068f5 100644
--- a/tools/lib/bpf/hashmap.c
+++ b/tools/lib/bpf/hashmap.c
@@ -60,7 +60,7 @@ struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
 void hashmap__clear(struct hashmap *map)
 {
struct hashmap_entry *cur, *tmp;
-   int bkt;
+   size_t bkt;
 
hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
free(cur);
@@ -100,8 +100,7 @@ static int hashmap_grow(struct hashmap *map)
struct hashmap_entry **new_buckets;
struct hashmap_entry *cur, *tmp;
size_t new_cap_bits, new_cap;
-   size_t h;
-   int bkt;
+   size_t h, bkt;
 
new_cap_bits = map->cap_bits + 1;
if (new_cap_bits < HASHMAP_MIN_CAP_BITS)
-- 
2.26.2.761.g0e0b3e54be-goog
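
For reference, the reason the compiler flags the int/size_t comparison
can be shown in a few lines. A minimal sketch, not taken from
hashmap.c; both helper names are hypothetical. The explicit cast
performs the same conversion the compiler applies implicitly when an
int is compared against a size_t.

```c
#include <stddef.h>

/*
 * Returns 1 if bkt compares as less than cap once it has been
 * converted to size_t, 0 otherwise. A negative bkt becomes a very
 * large unsigned value, so it never compares as "less than".
 */
static int int_index_less_than(int bkt, size_t cap)
{
	return (size_t)bkt < cap;
}

/* With a size_t index there is no conversion and no warning. */
static size_t count_buckets(size_t cap)
{
	size_t bkt, n = 0;

	for (bkt = 0; bkt < cap; bkt++)
		n++;
	return n;
}
```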



[PATCH 5/8] tools lib/api: Copy libbpf hashmap to libapi

2020-05-14 Thread Ian Rogers
Allow use of hashmap in more than just libbpf, where it isn't an
exported symbol. Place in libapi as that is a required part of
tools/perf whereas libbpf is currently optional.
Modify perf's check-headers.sh script to check that the files are kept
in sync, in the same way kernel headers are checked. This will warn if
they are out of sync at the start of a perf build.

Signed-off-by: Ian Rogers 
---
 tools/lib/api/Build |   1 +
 tools/lib/api/hashmap.c | 238 
 tools/lib/api/hashmap.h | 177 +++
 tools/perf/check-headers.sh |   4 +
 4 files changed, 420 insertions(+)
 create mode 100644 tools/lib/api/hashmap.c
 create mode 100644 tools/lib/api/hashmap.h

diff --git a/tools/lib/api/Build b/tools/lib/api/Build
index 6e2373db5598..2c8787a88080 100644
--- a/tools/lib/api/Build
+++ b/tools/lib/api/Build
@@ -2,6 +2,7 @@ libapi-y += fd/
 libapi-y += fs/
 libapi-y += cpu.o
 libapi-y += debug.o
+libapi-y += hashmap.o
 libapi-y += str_error_r.o
 
 $(OUTPUT)str_error_r.o: ../str_error_r.c FORCE
diff --git a/tools/lib/api/hashmap.c b/tools/lib/api/hashmap.c
new file mode 100644
index ..a405dad068f5
--- /dev/null
+++ b/tools/lib/api/hashmap.c
@@ -0,0 +1,238 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+/*
+ * Generic non-thread safe hash map implementation.
+ *
+ * Copyright (c) 2019 Facebook
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "hashmap.h"
+
+/* make sure libbpf doesn't use kernel-only integer typedefs */
+#pragma GCC poison u8 u16 u32 u64 s8 s16 s32 s64
+
+/* start with 4 buckets */
+#define HASHMAP_MIN_CAP_BITS 2
+
+static void hashmap_add_entry(struct hashmap_entry **pprev,
+ struct hashmap_entry *entry)
+{
+   entry->next = *pprev;
+   *pprev = entry;
+}
+
+static void hashmap_del_entry(struct hashmap_entry **pprev,
+ struct hashmap_entry *entry)
+{
+   *pprev = entry->next;
+   entry->next = NULL;
+}
+
+void hashmap__init(struct hashmap *map, hashmap_hash_fn hash_fn,
+  hashmap_equal_fn equal_fn, void *ctx)
+{
+   map->hash_fn = hash_fn;
+   map->equal_fn = equal_fn;
+   map->ctx = ctx;
+
+   map->buckets = NULL;
+   map->cap = 0;
+   map->cap_bits = 0;
+   map->sz = 0;
+}
+
+struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
+hashmap_equal_fn equal_fn,
+void *ctx)
+{
+   struct hashmap *map = malloc(sizeof(struct hashmap));
+
+   if (!map)
+   return ERR_PTR(-ENOMEM);
+   hashmap__init(map, hash_fn, equal_fn, ctx);
+   return map;
+}
+
+void hashmap__clear(struct hashmap *map)
+{
+   struct hashmap_entry *cur, *tmp;
+   size_t bkt;
+
+   hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
+   free(cur);
+   }
+   free(map->buckets);
+   map->buckets = NULL;
+   map->cap = map->cap_bits = map->sz = 0;
+}
+
+void hashmap__free(struct hashmap *map)
+{
+   if (!map)
+   return;
+
+   hashmap__clear(map);
+   free(map);
+}
+
+size_t hashmap__size(const struct hashmap *map)
+{
+   return map->sz;
+}
+
+size_t hashmap__capacity(const struct hashmap *map)
+{
+   return map->cap;
+}
+
+static bool hashmap_needs_to_grow(struct hashmap *map)
+{
+   /* grow if empty or more than 75% filled */
+   return (map->cap == 0) || ((map->sz + 1) * 4 / 3 > map->cap);
+}
+
+static int hashmap_grow(struct hashmap *map)
+{
+   struct hashmap_entry **new_buckets;
+   struct hashmap_entry *cur, *tmp;
+   size_t new_cap_bits, new_cap;
+   size_t h, bkt;
+
+   new_cap_bits = map->cap_bits + 1;
+   if (new_cap_bits < HASHMAP_MIN_CAP_BITS)
+   new_cap_bits = HASHMAP_MIN_CAP_BITS;
+
+   new_cap = 1UL << new_cap_bits;
+   new_buckets = calloc(new_cap, sizeof(new_buckets[0]));
+   if (!new_buckets)
+   return -ENOMEM;
+
+   hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
+   h = hash_bits(map->hash_fn(cur->key, map->ctx), new_cap_bits);
+   hashmap_add_entry(&new_buckets[h], cur);
+   }
+
+   map->cap = new_cap;
+   map->cap_bits = new_cap_bits;
+   free(map->buckets);
+   map->buckets = new_buckets;
+
+   return 0;
+}
+
+static bool hashmap_find_entry(const struct hashmap *map,
+  const void *key, size_t hash,
+  struct hashmap_entry ***pprev,
+  struct hashmap_entry **entry)
+{
+   struct hashmap_entry *cur, **prev_ptr;
+
+   if (!map->buckets)
+   return false;
+
+   for (prev_ptr = &map->buckets[hash], cur = *prev_ptr;
+cur;
+prev_ptr = &cur->next, cur = cur->next) {
+   if (map->equal_fn(cur->key, key, map->ctx)) {
+   if (pprev)
+
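
For reference, the growth rule in hashmap_needs_to_grow() from the
patch can be tried on plain numbers. A minimal sketch, assuming only
the expression quoted in the diff; inserts_before_grow() is a
hypothetical helper and not part of hashmap.c.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same expression as in the patch, applied to plain sizes. */
static bool needs_to_grow(size_t sz, size_t cap)
{
	/* grow if empty or more than 75% filled */
	return (cap == 0) || ((sz + 1) * 4 / 3 > cap);
}

/* Hypothetical helper: entries that fit in cap before growth is asked for. */
static size_t inserts_before_grow(size_t cap)
{
	size_t sz = 0;

	while (!needs_to_grow(sz, cap))
		sz++;
	return sz;
}
```

With integer division this gives 3 entries for a capacity of 4 and 6
entries for a capacity of 8, which matches the 75% comment.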

[PATCH 7/8] perf test: Improve pmu event metric testing

2020-05-14 Thread Ian Rogers
Break pmu-events test into 2 and add a test to verify that all pmu
metric expressions simply parse. Try to parse all metric ids/events,
skip/warn if metrics for the current architecture fail to parse. To
support warning on a skip, add the ability for a subtest to describe why
it skips.

Tested on power9, skylakex, haswell, broadwell, westmere, sandybridge and
ivybridge.

May skip/warn on other architectures if metrics are invalid. In
particular s390 is untested, but its expressions are trivial. The
untested architectures with expressions are power8, cascadelakex,
tremontx, skylake, jaketown, ivytown and variants of haswell and
broadwell.

v3. addresses review comments from John Garry, Jiri Olsa and Arnaldo
Carvalho de Melo.
v2. changes the commit message as event parsing errors no longer cause
the test to fail.

Signed-off-by: Ian Rogers 
Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: John Garry 
Cc: Kajol Jain 
Cc: Kan Liang 
Cc: Leo Yan 
Cc: Mark Rutland 
Cc: Namhyung Kim 
Cc: Paul Clarke 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Link: http://lore.kernel.org/lkml/20200513212933.41273-1-irog...@google.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/builtin-test.c |   7 ++
 tools/perf/tests/pmu-events.c   | 168 ++--
 tools/perf/tests/tests.h|   3 +
 3 files changed, 172 insertions(+), 6 deletions(-)

diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index baee735e6aa5..9553f8061772 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -75,6 +75,13 @@ static struct test generic_tests[] = {
{
.desc = "PMU events",
.func = test__pmu_events,
+   .subtest = {
+   .skip_if_fail   = false,
+   .get_nr = test__pmu_events_subtest_get_nr,
+   .get_desc   = test__pmu_events_subtest_get_desc,
+   .skip_reason= test__pmu_events_subtest_skip_reason,
+   },
+
},
{
.desc = "DSO data read",
diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
index d64261da8bf7..e21f0addcfbb 100644
--- a/tools/perf/tests/pmu-events.c
+++ b/tools/perf/tests/pmu-events.c
@@ -8,6 +8,9 @@
 #include 
 #include "debug.h"
 #include "../pmu-events/pmu-events.h"
+#include "util/evlist.h"
+#include "util/expr.h"
+#include "util/parse-events.h"
 
 struct perf_pmu_test_event {
struct pmu_event event;
@@ -144,7 +147,7 @@ static struct pmu_events_map *__test_pmu_get_events_map(void)
 }
 
 /* Verify generated events from pmu-events.c is as expected */
-static int __test_pmu_event_table(void)
+static int test_pmu_event_table(void)
 {
struct pmu_events_map *map = __test_pmu_get_events_map();
struct pmu_event *table;
@@ -347,14 +350,11 @@ static int __test__pmu_event_aliases(char *pmu_name, int *count)
return res;
 }
 
-int test__pmu_events(struct test *test __maybe_unused,
-int subtest __maybe_unused)
+
+static int test_aliases(void)
 {
struct perf_pmu *pmu = NULL;
 
-   if (__test_pmu_event_table())
-   return -1;
-
while ((pmu = perf_pmu__scan(pmu)) != NULL) {
int count = 0;
 
@@ -377,3 +377,159 @@ int test__pmu_events(struct test *test __maybe_unused,
 
return 0;
 }
+
+static bool is_number(const char *str)
+{
+   char *end_ptr;
+
+   strtod(str, &end_ptr);
+   return end_ptr != str;
+}
+
+static int check_parse_id(const char *id, bool same_cpu, struct pmu_event *pe)
+{
+   struct parse_events_error error;
+   struct evlist *evlist;
+   int ret;
+
+   /* Numbers are always valid. */
+   if (is_number(id))
+   return 0;
+
+   evlist = evlist__new();
+   memset(&error, 0, sizeof(error));
+   ret = parse_events(evlist, id, &error);
+   if (ret && same_cpu) {
+   pr_warning("Parse event failed metric '%s' id '%s' expr '%s'\n",
+   pe->metric_name, id, pe->metric_expr);
+   pr_warning("Error string '%s' help '%s'\n", error.str,
+   error.help);
+   } else if (ret) {
+   pr_debug3("Parse event failed, but for an event that may not be supported by this CPU.\nid '%s' metric '%s' expr '%s'\n",
+ id, pe->metric_name, pe->metric_expr);
+   ret = 0;
+   }
+   evlist__delete(evlist);
+   free(error.str);
+   free(error.help);
+   free(error.first_str);
+   free(error.first_help);
+   return ret;
+}
+
+static void expr_failure(const char *msg,
+const struct pmu_events_map *map,
+const struct pmu_event *pe)
+{
+   pr_debug("%s for map %s %s %s\n",
+   msg, map->cpuid, map->version, map->type);
+ 
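
For reference, the is_number() helper in the diff accepts any string
that strtod() can start converting, so a numeric prefix is enough. A
standalone copy of the helper, a sketch for experimenting outside perf
(the sample strings in the note below are made up):

```c
#include <stdbool.h>
#include <stdlib.h>

/* Copy of the helper from the patch: true if strtod() consumed anything. */
static bool is_number(const char *str)
{
	char *end_ptr;

	(void)strtod(str, &end_ptr);
	return end_ptr != str;
}
```

"3.5" and "10abc" both count as numbers here, while "cycles" and the
empty string do not.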

[PATCH 2/8] libbpf hashmap: Remove unused #include

2020-05-14 Thread Ian Rogers
Remove #include of libbpf_internal.h that is unused.
Discussed in this thread:
https://lore.kernel.org/lkml/caef4bzzrmieds_8r8g4vaaewvjzpb4xylnpf0x2vny8otzk...@mail.gmail.com/

Signed-off-by: Ian Rogers 
---
 tools/lib/bpf/hashmap.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/lib/bpf/hashmap.h b/tools/lib/bpf/hashmap.h
index bae8879cdf58..e823b35e7371 100644
--- a/tools/lib/bpf/hashmap.h
+++ b/tools/lib/bpf/hashmap.h
@@ -15,7 +15,6 @@
 #else
 #include 
 #endif
-#include "libbpf_internal.h"
 
 static inline size_t hash_bits(size_t h, int bits)
 {
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH 8/8] perf expr: Migrate expr ids table to a hashmap

2020-05-14 Thread Ian Rogers
Use a hashmap between a char* string and a double* value. While bpf's
hashmap entries are size_t in size, we can't guarantee sizeof(size_t) >=
sizeof(double). Avoid a memory allocation when gathering ids by making 0.0
a special value encoded as NULL.

Original map suggestion by Andi Kleen:
https://lore.kernel.org/lkml/20200224210308.gq160...@tassilo.jf.intel.com/
and seconded by Jiri Olsa:
https://lore.kernel.org/lkml/20200423112915.GH1136647@krava/

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/expr.c   |  40 ++-
 tools/perf/tests/pmu-events.c |  25 +++
 tools/perf/util/expr.c| 129 +++---
 tools/perf/util/expr.h|  22 +++---
 tools/perf/util/expr.y|  22 +-
 tools/perf/util/metricgroup.c |  87 +++
 tools/perf/util/stat-shadow.c |  49 -
 7 files changed, 193 insertions(+), 181 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 3f742612776a..5e606fd5a2c6 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -19,11 +19,9 @@ static int test(struct expr_parse_ctx *ctx, const char *e, double val2)
 int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
 {
const char *p;
-   const char **other;
-   double val;
-   int i, ret;
+   double val, *val_ptr;
+   int ret;
struct expr_parse_ctx ctx;
-   int num_other;
 
expr__ctx_init(&ctx);
expr__add_id(&ctx, "FOO", 1);
@@ -52,25 +50,29 @@ int test__expr(struct test *t __maybe_unused, int subtest __maybe_unused)
ret = expr__parse(&val, &ctx, p, 1);
TEST_ASSERT_VAL("missing operand", ret == -1);
 
+   hashmap__clear(&ctx.ids);
TEST_ASSERT_VAL("find other",
-   expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", &other, &num_other, 1) == 0);
-   TEST_ASSERT_VAL("find other", num_other == 3);
-   TEST_ASSERT_VAL("find other", !strcmp(other[0], "BAR"));
-   TEST_ASSERT_VAL("find other", !strcmp(other[1], "BAZ"));
-   TEST_ASSERT_VAL("find other", !strcmp(other[2], "BOZO"));
-   TEST_ASSERT_VAL("find other", other[3] == NULL);
+   expr__find_other("FOO + BAR + BAZ + BOZO", "FOO",
+&ctx, 1) == 0);
+   TEST_ASSERT_VAL("find other", hashmap__size(&ctx.ids) == 3);
+   TEST_ASSERT_VAL("find other", hashmap__find(&ctx.ids, "BAR",
+   (void **)&val_ptr));
+   TEST_ASSERT_VAL("find other", hashmap__find(&ctx.ids, "BAZ",
+   (void **)&val_ptr));
+   TEST_ASSERT_VAL("find other", hashmap__find(&ctx.ids, "BOZO",
+   (void **)&val_ptr));
 
+   hashmap__clear(&ctx.ids);
TEST_ASSERT_VAL("find other",
-   expr__find_other("EVENT1\\,param\\=?@ + EVENT2\\,param\\=?@", NULL,
-  &other, &num_other, 3) == 0);
-   TEST_ASSERT_VAL("find other", num_other == 2);
-   TEST_ASSERT_VAL("find other", !strcmp(other[0], "EVENT1,param=3/"));
-   TEST_ASSERT_VAL("find other", !strcmp(other[1], "EVENT2,param=3/"));
-   TEST_ASSERT_VAL("find other", other[2] == NULL);
+   expr__find_other("EVENT1\\,param\\=?@ + EVENT2\\,param\\=?@",
+NULL, &ctx, 3) == 0);
+   TEST_ASSERT_VAL("find other", hashmap__size(&ctx.ids) == 2);
+   TEST_ASSERT_VAL("find other", hashmap__find(&ctx.ids, "EVENT1,param=3/",
+   (void **)&val_ptr));
+   TEST_ASSERT_VAL("find other", hashmap__find(&ctx.ids, "EVENT2,param=3/",
+   (void **)&val_ptr));
 
-   for (i = 0; i < num_other; i++)
-   zfree(&other[i]);
-   free((void *)other);
+   expr__ctx_clear(&ctx);
 
return 0;
 }
diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
index e21f0addcfbb..3de59564deb0 100644
--- a/tools/perf/tests/pmu-events.c
+++ b/tools/perf/tests/pmu-events.c
@@ -433,8 +433,6 @@ static int test_parsing(void)
struct pmu_events_map *map;
struct pmu_event *pe;
int i, j, k;
-   const char **ids;
-   int idnum;
int ret = 0;
struct expr_parse_ctx ctx;
double result;
@@ -446,29 +444,34 @@ static int test_parsing(void)
break;
j = 0;
for (;;) {
+   struct hashmap_entry *cur;
+   size_t bkt;
+
pe = &map->table[j++];
if (!pe->name && !pe->metric_group && !pe->metric_name)
break;
if (!pe->metric_expr)
continue;
-   if (expr__find_other(pe->metric_expr, NULL,
-   
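
For reference, the "0.0 encoded as NULL" idea from the commit message
can be sketched on its own. This is an illustration under assumptions,
not the code in util/expr.c; encode_value() and decode_value() are
hypothetical names.

```c
#include <stdlib.h>

/*
 * Store val behind a pointer, but spend no allocation on 0.0:
 * that value is represented by a NULL pointer.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int encode_value(double val, double **out)
{
	double *p;

	if (val == 0.0) {
		*out = NULL;
		return 0;
	}
	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	*p = val;
	*out = p;
	return 0;
}

/* NULL decodes back to 0.0, anything else is dereferenced. */
static double decode_value(const double *p)
{
	return p ? *p : 0.0;
}
```

The error is reported separately from the pointer so that an
allocation failure cannot be mistaken for an encoded 0.0.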

[PATCH 6/8] perf test: Provide a subtest callback to ask for the reason for skipping a subtest

2020-05-14 Thread Ian Rogers
Now subtests can report why a test was skipped. The upcoming patch
improving PMU event metric testing will use it.

Signed-off-by: Ian Rogers 
Cc: Adrian Hunter 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Jin Yao 
Cc: Jiri Olsa 
Cc: John Garry 
Cc: Kajol Jain 
Cc: Kan Liang 
Cc: Leo Yan 
Cc: Mark Rutland 
Cc: Namhyung Kim 
Cc: Paul Clarke 
Cc: Peter Zijlstra 
Cc: Stephane Eranian 
Link: http://lore.kernel.org/lkml/20200513212933.41273-1-irog...@google.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/builtin-test.c | 11 +--
 tools/perf/tests/tests.h|  1 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 3471ec52ea11..baee735e6aa5 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -429,8 +429,15 @@ static int test_and_print(struct test *t, bool force_skip, int subtest)
case TEST_OK:
pr_info(" Ok\n");
break;
-   case TEST_SKIP:
-   color_fprintf(stderr, PERF_COLOR_YELLOW, " Skip\n");
+   case TEST_SKIP: {
+   const char *skip_reason = NULL;
+   if (t->subtest.skip_reason)
+   skip_reason = t->subtest.skip_reason(subtest);
+   if (skip_reason)
+   color_fprintf(stderr, PERF_COLOR_YELLOW, " Skip (%s)\n", skip_reason);
+   else
+   color_fprintf(stderr, PERF_COLOR_YELLOW, " Skip\n");
+   }
break;
case TEST_FAIL:
default:
diff --git a/tools/perf/tests/tests.h b/tools/perf/tests/tests.h
index d6d4ac34eeb7..88e45aeab94f 100644
--- a/tools/perf/tests/tests.h
+++ b/tools/perf/tests/tests.h
@@ -34,6 +34,7 @@ struct test {
bool skip_if_fail;
int (*get_nr)(void);
const char *(*get_desc)(int subtest);
+   const char *(*skip_reason)(int subtest);
} subtest;
bool (*is_supported)(void);
void *priv;
-- 
2.26.2.761.g0e0b3e54be-goog
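
For reference, the optional-callback pattern this patch adds can be
sketched outside perf. The struct and function names below are
hypothetical and only mirror the shape of the change: the callback may
be absent, and it may also return NULL.

```c
#include <stdio.h>
#include <stddef.h>

struct subtest_ops {
	const char *(*skip_reason)(int subtest);	/* may be NULL */
};

/* Example callback; the reason text is made up. */
static const char *metrics_skip_reason(int subtest)
{
	(void)subtest;
	return "some metrics failed";
}

/* Write " Skip (reason)" if a reason is available, " Skip" otherwise. */
static int format_skip(const struct subtest_ops *ops, int subtest,
		       char *buf, size_t len)
{
	const char *reason = NULL;

	if (ops->skip_reason)
		reason = ops->skip_reason(subtest);
	if (reason)
		return snprintf(buf, len, " Skip (%s)", reason);
	return snprintf(buf, len, " Skip");
}
```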



[PATCH 1/8] libbpf: Fix memory leak and possible double-free in hashmap__clear

2020-05-14 Thread Ian Rogers
From: Andrii Nakryiko 

Fix memory leak in hashmap__clear() not freeing hashmap_entry structs for each
of the remaining entries. Also NULL-out bucket list to prevent possible
double-free between hashmap__clear() and hashmap__free().

Running test_progs-asan flavor clearly showed this problem.

Reported-by: Alston Tang 
Signed-off-by: Andrii Nakryiko 
Signed-off-by: Alexei Starovoitov 
Link: https://lore.kernel.org/bpf/20200429012111.277390-5-andr...@fb.com
Signed-off-by: Ian Rogers 
---
 tools/lib/bpf/hashmap.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/lib/bpf/hashmap.c b/tools/lib/bpf/hashmap.c
index 54c30c802070..cffb96202e0d 100644
--- a/tools/lib/bpf/hashmap.c
+++ b/tools/lib/bpf/hashmap.c
@@ -59,7 +59,14 @@ struct hashmap *hashmap__new(hashmap_hash_fn hash_fn,
 
 void hashmap__clear(struct hashmap *map)
 {
+   struct hashmap_entry *cur, *tmp;
+   int bkt;
+
+   hashmap__for_each_entry_safe(map, cur, tmp, bkt) {
+   free(cur);
+   }
free(map->buckets);
+   map->buckets = NULL;
map->cap = map->cap_bits = map->sz = 0;
 }
 
-- 
2.26.2.761.g0e0b3e54be-goog
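
For reference, the two parts of this fix (free every entry, then NULL
out the bucket array) can be sketched with a reduced table. This is a
simplified model with hypothetical names, not the libbpf hashmap.

```c
#include <stdlib.h>

struct entry {
	struct entry *next;
};

struct table {
	struct entry **buckets;
	size_t cap;
	size_t sz;
};

static int table_init(struct table *t, size_t cap)
{
	t->buckets = calloc(cap, sizeof(t->buckets[0]));
	t->cap = t->buckets ? cap : 0;
	t->sz = 0;
	return t->buckets ? 0 : -1;
}

/* Caller must pass bkt < t->cap. */
static int table_push(struct table *t, size_t bkt)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->next = t->buckets[bkt];
	t->buckets[bkt] = e;
	t->sz++;
	return 0;
}

static void table_clear(struct table *t)
{
	size_t bkt;

	/* Free every chained entry, not just the bucket array. */
	for (bkt = 0; bkt < t->cap; bkt++) {
		struct entry *cur = t->buckets[bkt], *tmp;

		while (cur) {
			tmp = cur->next;
			free(cur);
			cur = tmp;
		}
	}
	free(t->buckets);
	t->buckets = NULL;	/* a second clear then only calls free(NULL) */
	t->cap = t->sz = 0;
}
```

Calling table_clear() twice is harmless in this model because the
second call sees cap == 0 and a NULL bucket pointer.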



[PATCH 4/8] libbpf hashmap: Localize static hashmap__* symbols

2020-05-14 Thread Ian Rogers
Localize the hashmap__* symbols in libbpf.a to allow for a version in
libapi.

Before:
$ nm libbpf.a
...
0002088a t hashmap_add_entry
0001712a t hashmap__append
00020aa3 T hashmap__capacity
0002099c T hashmap__clear
000208b3 t hashmap_del_entry
00020fc1 T hashmap__delete
00020f29 T hashmap__find
00020c6c t hashmap_find_entry
00020a61 T hashmap__free
00020b08 t hashmap_grow
000208dd T hashmap__init
00020d35 T hashmap__insert
00020ab5 t hashmap_needs_to_grow
00020947 T hashmap__new
0775 t hashmap__set
000212f8 t hashmap__set
00020a91 T hashmap__size
...

After:
$ nm libbpf.a
...
0002088a t hashmap_add_entry
0001712a t hashmap__append
00020aa3 t hashmap__capacity
0002099c t hashmap__clear
000208b3 t hashmap_del_entry
00020fc1 t hashmap__delete
00020f29 t hashmap__find
00020c6c t hashmap_find_entry
00020a61 t hashmap__free
00020b08 t hashmap_grow
000208dd t hashmap__init
00020d35 t hashmap__insert
00020ab5 t hashmap_needs_to_grow
00020947 t hashmap__new
0775 t hashmap__set
000212f8 t hashmap__set
00020a91 t hashmap__size
...

Signed-off-by: Ian Rogers 
---
 tools/lib/bpf/Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
index aee7f1a83c77..4a1cdbceb04e 100644
--- a/tools/lib/bpf/Makefile
+++ b/tools/lib/bpf/Makefile
@@ -20,6 +20,7 @@ srctree := $(patsubst %/,%,$(dir $(srctree)))
 endif
 
 INSTALL = install
+OBJCOPY ?= objcopy
 
 # Use DESTDIR for installing into a different root directory.
 # This is useful for building a package. The program will be
@@ -181,6 +182,7 @@ $(BPF_IN_SHARED): force elfdep zdep bpfdep $(BPF_HELPER_DEFS)
 
 $(BPF_IN_STATIC): force elfdep zdep bpfdep $(BPF_HELPER_DEFS)
$(Q)$(MAKE) $(build)=libbpf OUTPUT=$(STATIC_OBJDIR)
+   $(Q)$(OBJCOPY) -w -L hashmap__\* $@
 
 $(BPF_HELPER_DEFS): $(srctree)/tools/include/uapi/linux/bpf.h
$(QUIET_GEN)$(srctree)/scripts/bpf_helpers_doc.py --header \
-- 
2.26.2.761.g0e0b3e54be-goog



Re: [PATCH v2 0/5] mtd: spi-nor: Add support for Octal 8D-8D-8D mode

2020-05-14 Thread Pratyush Yadav
Hi Mason,

On 15/05/20 10:26AM, masonccy...@mxic.com.tw wrote:
> 
> Hi Pratyush,
> 
> > > > > I can't apply your patches to enable xSPI Octal mode for 
> > > > > mx25uw51245g because your patches set up Octal protocol first and 
> > > > > then using Octal protocol to write Configuration Register 2(CFG 
> > > > > Reg2). I think driver
> > > > > should write CFG Reg2 in SPI 1-1-1 mode (power on state) and make 
> sure
> > > > > write CFG Reg 2 is success and then setup Octa protocol in the 
> last.
> > > > 
> > > > Register writes should work in 1S mode, because nor->reg_proto is 
> only 
> > > > set _after_ 8D mode is enabled (see spi_nor_octal_dtr_enable()). In 
> > > > fact, both patch 15 and 16 in my series use register writes in 1S 
> mode.
> > > 
> > > but I didn't see driver roll back "nor->read/write_proto = 1" 
> > > if xxx->octal_dtr_enable() return failed!
> > 
> > I copied what spi_nor_quad_enable() did, and made failure fatal. So if 
> > xxx->octal_dtr_enable() fails, the probe would fail and the flash would 
> > be unusable. You can try your hand at a fallback system where you try 
> 
> IMHO, it's not a good for system booting from SPI-NOR, 
> driver should still keep system alive in SPI 1-1-1 mode in case of 
> enable Octal/Quad failed.

I agree.
 
> Therefore, my patches is to setup nor->read/write_proto = 8 in case 
> driver enable Octal mode is success. And to enable Octal mode in
> spi_nor_late_init_params()rather than as spi_nor_quad_enable()did.

Like I mentioned before, spi_nor_late_init_params() is called _before_ 
we call spi_nor_spimem_adjust_hwcaps(). That call is needed to make sure 
the controller also supports octal mode operations. Otherwise, you'd end 
up enabling octal mode on a controller that doesn't support it with no 
way of going back now.

But we can still have a fallback mechanism even in spi_nor_init() that 
would switch to a "less preferred" mode (like 1-1-1 mode) if "more 
preferred" ones like octal or quad fail.

That said, I think it would be a good idea to not keep tacking features 
on this series. This makes it harder for reviewers because now they are 
trying to shoot a moving target. Let basic 8D support stabilize and get 
merged in, and then a fallback system can be submitted as a separate 
patch series.

-- 
Regards,
Pratyush Yadav
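
For reference, the fallback idea mentioned above (try the more
preferred mode first, fall back to a less preferred one on failure)
could look roughly like this. It is a sketch only: none of these names
exist in spi-nor, and the enable callbacks are stand-ins that simply
succeed or fail.

```c
#include <stddef.h>

struct proto_mode {
	const char *name;
	int (*enable)(void *flash);	/* returns 0 on success */
};

/* Stand-in callbacks for the sketch. */
static int enable_fail(void *flash)
{
	(void)flash;
	return -1;
}

static int enable_ok(void *flash)
{
	(void)flash;
	return 0;
}

/*
 * Try the modes in preference order. Returns the index of the first
 * mode that could be enabled, or -1 if none could.
 */
static int select_mode(const struct proto_mode *modes, size_t n, void *flash)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (modes[i].enable(flash) == 0)
			return (int)i;
	return -1;
}
```

With a table ordered 8D-8D-8D, 1-1-4, 1-1-1 where only the last entry
succeeds, select_mode() returns index 2, so the flash stays usable in
1-1-1 mode.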


Re: [PATCH 5/5] PCI: uniphier: Add error message when failed to get phy

2020-05-14 Thread kbuild test robot
Hi Kunihiko,

I love your patch! Perhaps something to improve:

[auto build test WARNING on pci/next]
[also build test WARNING on robh/for-next v5.7-rc5 next-20200514]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Kunihiko-Hayashi/PCI-uniphier-Add-features-for-UniPhier-PCIe-host-controller/20200515-125031
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
config: i386-allyesconfig (attached as .config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All warnings (new ones prefixed by >>, old ones prefixed by <<):

In file included from include/linux/device.h:15:0,
from include/linux/pci.h:37,
from drivers/pci/controller/dwc/pcie-uniphier.c:18:
drivers/pci/controller/dwc/pcie-uniphier.c: In function 'uniphier_pcie_probe':
>> drivers/pci/controller/dwc/pcie-uniphier.c:470:16: warning: format '%d' expects argument of type 'int', but argument 3 has type 'long int' [-Wformat=]
dev_err(dev, "Failed to get phy (%d)\n", PTR_ERR(priv->phy));
^
include/linux/dev_printk.h:19:22: note: in definition of macro 'dev_fmt'
#define dev_fmt(fmt) fmt
^~~
>> drivers/pci/controller/dwc/pcie-uniphier.c:470:3: note: in expansion of macro 'dev_err'
dev_err(dev, "Failed to get phy (%d)\n", PTR_ERR(priv->phy));
^~~

vim +470 drivers/pci/controller/dwc/pcie-uniphier.c

   430  
   431  static int uniphier_pcie_probe(struct platform_device *pdev)
   432  {
   433  struct device *dev = &pdev->dev;
   434  struct uniphier_pcie_priv *priv;
   435  struct resource *res;
   436  int ret;
   437  
   438  priv = devm_kzalloc(dev, sizeof(*priv), GFP_KERNEL);
   439  if (!priv)
   440  return -ENOMEM;
   441  
   442  priv->pci.dev = dev;
   443  priv->pci.ops = &dw_pcie_ops;
   444  
   445  res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "dbi");
   446  priv->pci.dbi_base = devm_pci_remap_cfg_resource(dev, res);
   447  if (IS_ERR(priv->pci.dbi_base))
   448  return PTR_ERR(priv->pci.dbi_base);
   449  
   450  res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "atu");
   451  priv->pci.atu_base = devm_pci_remap_cfg_resource(dev, res);
   452  if (IS_ERR(priv->pci.atu_base))
   453  priv->pci.atu_base = NULL;
   454  
   455  res = platform_get_resource_byname(pdev, IORESOURCE_MEM, 
"link");
   456  priv->base = devm_ioremap_resource(dev, res);
   457  if (IS_ERR(priv->base))
   458  return PTR_ERR(priv->base);
   459  
   460  priv->clk = devm_clk_get(dev, NULL);
   461  if (IS_ERR(priv->clk))
   462  return PTR_ERR(priv->clk);
   463  
   464  priv->rst = devm_reset_control_get_shared(dev, NULL);
   465  if (IS_ERR(priv->rst))
   466  return PTR_ERR(priv->rst);
   467  
   468  priv->phy = devm_phy_optional_get(dev, "pcie-phy");
   469  if (IS_ERR(priv->phy)) {
 > 470  dev_err(dev, "Failed to get phy (%d)\n", 
 > PTR_ERR(priv->phy));
   471  return PTR_ERR(priv->phy);
   472  }
   473  
   474  platform_set_drvdata(pdev, priv);
   475  
   476  ret = uniphier_pcie_host_enable(priv);
   477  if (ret)
   478  return ret;
   479  
   480  return uniphier_add_pcie_port(priv, pdev);
   481  }
   482  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
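
For reference, the warning comes from PTR_ERR() returning a long while
the format string uses %d. A userspace sketch of the matching
specifier; err_ptr() and ptr_err() are stand-ins written for this
example, not the kernel helpers, and format_phy_error() is a
hypothetical wrapper around snprintf().

```c
#include <stdio.h>
#include <stdint.h>

/* Userspace stand-ins: encode an error code in a pointer and back. */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int format_phy_error(char *buf, size_t len, const void *phy)
{
	/* %ld matches the long returned by ptr_err() */
	return snprintf(buf, len, "Failed to get phy (%ld)", ptr_err(phy));
}
```

Casting the value to int and keeping %d would silence the warning as
well; which of the two the driver should use is up to the patch author.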




Re: [PATCH v2 07/10] block: use sectors_to_npage() and PAGE_SECTORS to clean up code

2020-05-14 Thread Leizhen (ThunderTown)



On 2020/5/15 12:19, Matthew Wilcox wrote:
> On Thu, May 07, 2020 at 03:50:57PM +0800, Zhen Lei wrote:
>> +++ b/block/blk-settings.c
>> @@ -150,7 +150,7 @@ void blk_queue_max_hw_sectors(struct request_queue *q, 
>> unsigned int max_hw_secto
>>  unsigned int max_sectors;
>>  
>>  if ((max_hw_sectors << 9) < PAGE_SIZE) {
>> -max_hw_sectors = 1 << (PAGE_SHIFT - 9);
>> +max_hw_sectors = PAGE_SECTORS;
> 
> Surely this should be:
> 
>   if (max_hw_sectors < PAGE_SECTORS) {
>   max_hw_sectors = PAGE_SECTORS;
> 
> ... no?

I've noticed this place before. "(max_hw_sectors << 9) < PAGE_SIZE" also
catches the case where max_hw_sectors is too large, that is, where
(max_hw_sectors << 9) overflows.

> 
>> -page = read_mapping_page(mapping,
>> -(pgoff_t)(n >> (PAGE_SHIFT - 9)), NULL);
>> +page = read_mapping_page(mapping, (pgoff_t)sectors_to_npage(n), NULL);
> 
> ... again, get the type right, and you won't need the cast.
OK, I'll consider it.
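
For reference, the difference between the two checks discussed above
can be seen with fixed-width arithmetic. A minimal sketch, assuming a
32-bit unsigned value and a 4 KiB page; the SKETCH_* macros and both
function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_PAGE_SECTORS	(SKETCH_PAGE_SIZE >> 9)

/* Form currently in the code: the shift can wrap around to a small value. */
static bool below_page_shifted(uint32_t max_hw_sectors)
{
	return (uint32_t)(max_hw_sectors << 9) < SKETCH_PAGE_SIZE;
}

/* Form suggested in the review: compares sectors directly, no shift. */
static bool below_page_direct(uint32_t max_hw_sectors)
{
	return max_hw_sectors < SKETCH_PAGE_SECTORS;
}
```

Both forms agree for small values. For max_hw_sectors = 1 << 23 the
shifted value wraps to 0, so only the shifted form reports "below one
page".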

> 
> 
> .
> 



Re: [PATCH] xhci: Fix log mistake of xhci_start

2020-05-14 Thread Mathias Nyman
On 15.5.2020 8.45, jiahao wrote:
> It is obvious that XHCI_MAX_HALT_USEC is usec,
>  not milliseconds; Replace 'milliseconds' with
> 'usec' of the debug message.
> 
> Signed-off-by: jiahao 
> ---
>  drivers/usb/host/xhci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index bee5dec..d011472 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -147,7 +147,7 @@ int xhci_start(struct xhci_hcd *xhci)
>   STS_HALT, 0, XHCI_MAX_HALT_USEC);
>   if (ret == -ETIMEDOUT)
>   xhci_err(xhci, "Host took too long to start, "
> - "waited %u microseconds.\n",
> + "waited %u usec.\n",

It already says "microseconds", no need to change it

-Mathias



Possibility of conflicting memory types in lazier TLB mode?

2020-05-14 Thread Nicholas Piggin
Hi Rik,

Commit 145f573b89a62 ("Make lazy TLB mode lazier").

A couple of questions here (and I don't know the x86 architecture too 
well let alone the ASID stuff, so bear with me). I'm assuming, and it 
appears to be in the x86 manual that you can't map the same physical 
page with conflicting memory types on different processors in general
(or in different ASIDs on the same processor?)

Firstly, the freed_tables check, that's to prevent CPUs in the lazy mode 
with this mm loaded in their ASID from bringing in new translations 
based on random "stuff" if they happen to speculatively load userspace 
addresses (but in lazy mode they would never explicitly load such 
addresses), right?

I'm guessing the reason that's a problem but the changed pte case is not
is that the table walker is going to barf if it sees garbage, but a
valid pte is okay.

Now the Intel manual says conflicting attributes are bad because you'll
lose cache coherency on stores. But the speculative accesses from the
lazy thread will never push stores to cache coherency and the result of
the loads doesn't matter, so maybe that's how this special case avoids
the problem.

But what about if there are (real, not speculative) stores in the store 
queue still on the lazy thread from when it was switched, that have not 
yet become coherent? The page is freed by another CPU and reallocated
for something that maps it as nocache. Do you have a coherency problem 
there?

Ensuring the store queue is drained when switching to lazy seems like it 
would fix it, maybe context switch code does that already or you have 
some other trick or reason it's not a problem. Am I way off base here?

Thanks,
Nick


Re: [PATCH v2 1/6] dt-bindings: mfd: add Khadas Microcontroller bindings

2020-05-14 Thread Amit Kucheria
On Tue, May 12, 2020 at 6:56 PM Neil Armstrong  wrote:
>
> This Microcontroller is present on the Khadas VIM1, VIM2, VIM3 and Edge
> boards.
>
> It has multiple boot control features like password check, power-on
> options, power-off control and system FAN control on recent boards.
>
> Signed-off-by: Neil Armstrong 
> Reviewed-by: Rob Herring 

Reviewed-by: Amit Kucheria 

> ---
>  .../devicetree/bindings/mfd/khadas,mcu.yaml   | 44 +++
>  1 file changed, 44 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/mfd/khadas,mcu.yaml
>
> diff --git a/Documentation/devicetree/bindings/mfd/khadas,mcu.yaml 
> b/Documentation/devicetree/bindings/mfd/khadas,mcu.yaml
> new file mode 100644
> index ..a3b976f101e8
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/mfd/khadas,mcu.yaml
> @@ -0,0 +1,44 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/mfd/khadas,mcu.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Khadas on-board Microcontroller Device Tree Bindings
> +
> +maintainers:
> +  - Neil Armstrong 
> +
> +description: |
> +  Khadas embeds a microcontroller on their VIM and Edge boards adding some
> +  system feature as PWM Fan control (for VIM2 rev14 or VIM3), User memory
> +  storage, IR/Key resume control, system power LED control and more.
> +
> +properties:
> +  compatible:
> +enum:
> +  - khadas,mcu # MCU revision is discoverable
> +
> +  "#cooling-cells": # Only needed for boards having FAN control feature
> +const: 2
> +
> +  reg:
> +maxItems: 1
> +
> +required:
> +  - compatible
> +  - reg
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +i2c {
> +  #address-cells = <1>;
> +  #size-cells = <0>;
> +  khadas_mcu: system-controller@18 {
> +compatible = "khadas,mcu";
> +reg = <0x18>;
> +#cooling-cells = <2>;
> +  };
> +};
> --
> 2.22.0
>


Re: [PATCH v2 06/10] mm/swap: use npage_to_sectors() and PAGE_SECTORS to clean up code

2020-05-14 Thread Leizhen (ThunderTown)



On 2020/5/15 12:14, Matthew Wilcox wrote:
> On Thu, May 07, 2020 at 03:50:56PM +0800, Zhen Lei wrote:
>> +++ b/mm/page_io.c
>> @@ -38,7 +38,7 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,
>>  
>>  bio->bi_iter.bi_sector = map_swap_page(page, &bdev);
>>  bio_set_dev(bio, bdev);
>> -bio->bi_iter.bi_sector <<= PAGE_SHIFT - 9;
>> +bio->bi_iter.bi_sector *= PAGE_SECTORS;
>>  bio->bi_end_io = end_io;
> 
> This just doesn't look right.  Why is map_swap_page() returning a sector_t
> which isn't actually a sector_t?

I tried to understand map_swap_page(). There may be a bug here. Otherwise, it
would be better to add a temporary variable to hold the return value of
map_swap_page(page, &bdev).

> 
> 
> .
> 



Re: [PATCH] ceph: don't return -ESTALE if there's still an open file

2020-05-14 Thread Amir Goldstein
+CC: fstests

On Thu, May 14, 2020 at 4:15 PM Jeff Layton  wrote:
>
> On Thu, 2020-05-14 at 13:48 +0100, Luis Henriques wrote:
> > On Thu, May 14, 2020 at 08:10:09AM -0400, Jeff Layton wrote:
> > > On Thu, 2020-05-14 at 12:14 +0100, Luis Henriques wrote:
> > > > Similarly to commit 03f219041fdb ("ceph: check i_nlink while converting
> > > > a file handle to dentry"), this fixes another corner case with
> > > > name_to_handle_at/open_by_handle_at.  The issue has been detected by
> > > > xfstest generic/467, when doing:
> > > >
> > > >  - name_to_handle_at("/cephfs/myfile")
> > > >  - open("/cephfs/myfile")
> > > >  - unlink("/cephfs/myfile")
> > > >  - open_by_handle_at()
> > > >
> > > > The call to open_by_handle_at should not fail because the file still
> > > > exists and we do have a valid handle to it.
> > > >
> > > > Signed-off-by: Luis Henriques 
> > > > ---
> > > >  fs/ceph/export.c | 13 +++--
> > > >  1 file changed, 11 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> > > > index 79dc06881e78..8556df9d94d0 100644
> > > > --- a/fs/ceph/export.c
> > > > +++ b/fs/ceph/export.c
> > > > @@ -171,12 +171,21 @@ struct inode *ceph_lookup_inode(struct 
> > > > super_block *sb, u64 ino)
> > > >
> > > >  static struct dentry *__fh_to_dentry(struct super_block *sb, u64 ino)
> > > >  {
> > > > + struct ceph_inode_info *ci;
> > > >   struct inode *inode = __lookup_inode(sb, ino);
> > > > +
> > > >   if (IS_ERR(inode))
> > > >   return ERR_CAST(inode);
> > > >   if (inode->i_nlink == 0) {
> > > > - iput(inode);
> > > > - return ERR_PTR(-ESTALE);
> > > > + bool is_open;
> > > > + ci = ceph_inode(inode);
> > > > + spin_lock(&ci->i_ceph_lock);
> > > > + is_open = __ceph_is_file_opened(ci);
> > > > + spin_unlock(&ci->i_ceph_lock);
> > > > + if (!is_open) {
> > > > + iput(inode);
> > > > + return ERR_PTR(-ESTALE);
> > > > + }
> > > >   }
> > > >   return d_obtain_alias(inode);
> > > >  }
> > >
> > > Thanks Luis. Out of curiousity, is there any reason we shouldn't ignore
> > > the i_nlink value here? Does anything obviously break if we do?
> >
> > Yes, the scenario described in commit 03f219041fdb is still valid, which
> > is basically the same but without the extra open(2):
> >
> >   - name_to_handle_at("/cephfs/myfile")
> >   - unlink("/cephfs/myfile")
> >   - open_by_handle_at()
> >
>
> Ok, I guess we end up doing some delayed cleanup, and that allows the
> inode to be found in that situation.
>
> > The open_by_handle_at man page isn't really clear about these 2 scenarios,
> > but generic/426 will fail if -ESTALE isn't returned.  Want me to add a
> > comment to the code, describing these 2 scenarios?
> >
>
> (cc'ing Amir since he added this test)
>
> I don't think there is any hard requirement that open_by_handle_at
> should fail in that situation. It generally does for most filesystems
> due to the way they handle cleaning up unlinked inodes, but I don't
> think it's technically illegal to allow the inode to still be found. If
> the caller cares about whether it has been unlinked it can always test
> i_nlink itself.
>
> Amir, is this required for some reason that I'm not aware of?

Hi Jeff,

The origin of this test is in fstests commit:
794798fa xfsqa: test open_by_handle() on unlinked and freed inode clusters

It was introduced to catch an xfs bug, so this behavior is the expectation
of xfs filesystem, but note that it is not a general expectation to fail
open_by_handle() after unlink(), it is an expectation to fail open_by_handle()
after unlink() + sync() + drop_caches.

I have later converted the test to generic, because I needed to check the
same expectation for overlayfs use case, which is:
The original inode is always there (in lower layer), unlink creates a whiteout
mark and open_by_handle should treat that as ESTALE, otherwise the
unlinked files would be accessible to nfs clients forever.

In overlayfs, we handle the open file case by returning a dentry only
in case the inode with deletion mark in question is already in inode cache,
but we take care not to populate inode cache with the check.
It is easier, because we do not need to get inode into cache for checking
the delete marker.

Maybe you could instead check in __fh_to_dentry():

if (inode->i_nlink == 0 && atomic_read(&inode->i_count) == 1) {
iput(inode);
return ERR_PTR(-ESTALE);
}

The above is untested, so I don't know if it's enough to pass generic/426.
Note that generic/467 also checks the same behavior for rmdir().
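
As a rough illustration of the decision the suggested check makes, here is a user-space toy model. It is only a sketch: struct toy_inode and toy_fh_lookup() are hypothetical stand-ins, not ceph or VFS code, and the model assumes that the lookup itself holds exactly one reference.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for the two inode fields the check looks at. */
struct toy_inode {
	unsigned int nlink;  /* models inode->i_nlink */
	int count;           /* models inode->i_count, including our own reference */
};

/*
 * Returns 0 if a handle lookup should succeed, -ESTALE otherwise.
 * An unlinked inode is treated as stale only when the lookup holds
 * the sole reference, i.e. nobody else still has it in use.
 */
static int toy_fh_lookup(const struct toy_inode *inode)
{
	if (inode->nlink == 0 && inode->count == 1)
		return -ESTALE;
	return 0;
}
```

In this model an unlinked inode that is still held elsewhere (count of 2 or more) stays reachable, while an unlinked inode with no other user is reported as stale.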

If you decide that ceph does not need to comply to this behavior,
then we probably need to whitelist/blocklist the filesystems that
want to test this behavior, which will be a shame.

Thanks,
Amir.


Re: [PATCH v2 3/6] thermal: add support for the MCU controlled FAN on Khadas boards

2020-05-14 Thread Amit Kucheria
On Tue, May 12, 2020 at 6:56 PM Neil Armstrong  wrote:
>
> The new Khadas VIM2 and VIM3 boards control the cooling fan via the
> on-board microcontroller.
>
> This implements the FAN control as thermal devices and as cell of the Khadas
> MCU MFD driver.
>
> Signed-off-by: Neil Armstrong 
> ---
>  drivers/thermal/Kconfig  |  10 ++
>  drivers/thermal/Makefile |   1 +
>  drivers/thermal/khadas_mcu_fan.c | 174 +++
>  3 files changed, 185 insertions(+)
>  create mode 100644 drivers/thermal/khadas_mcu_fan.c
>
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 91af271e9bb0..72b3960cc5ac 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -490,4 +490,14 @@ config SPRD_THERMAL
> help
>   Support for the Spreadtrum thermal sensor driver in the Linux 
> thermal
>   framework.
> +
> +config KHADAS_MCU_FAN_THERMAL
> +   tristate "Khadas MCU controller FAN cooling support"
> +   depends on OF || COMPILE_TEST

Could you add a depends on some board/SoC Kconfig option here so
this doesn't show up for non-Amlogic/non-Khadas boards?

Looks OK otherwise.

> +   select MFD_CORE
> +   select REGMAP
> +   help
> + If you say yes here you get support for the FAN controlled
> + by the Microcontroller found on the Khadas VIM boards.
> +
>  endif
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index 8c8ed7b79915..460428c2122c 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -60,3 +60,4 @@ obj-$(CONFIG_ZX2967_THERMAL)  += zx2967_thermal.o
>  obj-$(CONFIG_UNIPHIER_THERMAL) += uniphier_thermal.o
>  obj-$(CONFIG_AMLOGIC_THERMAL) += amlogic_thermal.o
>  obj-$(CONFIG_SPRD_THERMAL) += sprd_thermal.o
> +obj-$(CONFIG_KHADAS_MCU_FAN_THERMAL)   += khadas_mcu_fan.o
> diff --git a/drivers/thermal/khadas_mcu_fan.c 
> b/drivers/thermal/khadas_mcu_fan.c
> new file mode 100644
> index ..044d4aba8be2
> --- /dev/null
> +++ b/drivers/thermal/khadas_mcu_fan.c
> @@ -0,0 +1,174 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Khadas MCU Controlled FAN driver
> + *
> + * Copyright (C) 2020 BayLibre SAS
> + * Author(s): Neil Armstrong 
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define MAX_LEVEL 3
> +
> +struct khadas_mcu_fan_ctx {
> +   struct khadas_mcu *mcu;
> +   unsigned int level;
> +   struct thermal_cooling_device *cdev;
> +};
> +
> +static int khadas_mcu_fan_set_level(struct khadas_mcu_fan_ctx *ctx,
> +   unsigned int level)
> +{
> +   int ret;
> +
> +   ret = regmap_write(ctx->mcu->map, KHADAS_MCU_CMD_FAN_STATUS_CTRL_REG,
> +  level);
> +   if (ret)
> +   return ret;
> +
> +   ctx->level = level;
> +
> +   return 0;
> +}
> +
> +static int khadas_mcu_fan_get_max_state(struct thermal_cooling_device *cdev,
> +   unsigned long *state)
> +{
> +   struct khadas_mcu_fan_ctx *ctx = cdev->devdata;
> +
> +   if (!ctx)
> +   return -EINVAL;
> +
> +   *state = MAX_LEVEL;
> +
> +   return 0;
> +}
> +
> +static int khadas_mcu_fan_get_cur_state(struct thermal_cooling_device *cdev,
> +   unsigned long *state)
> +{
> +   struct khadas_mcu_fan_ctx *ctx = cdev->devdata;
> +
> +   if (!ctx)
> +   return -EINVAL;
> +
> +   *state = ctx->level;
> +
> +   return 0;
> +}
> +
> +static int
> +khadas_mcu_fan_set_cur_state(struct thermal_cooling_device *cdev,
> +unsigned long state)
> +{
> +   struct khadas_mcu_fan_ctx *ctx = cdev->devdata;
> +
> +   if (!ctx || (state > MAX_LEVEL))
> +   return -EINVAL;
> +
> +   if (state == ctx->level)
> +   return 0;
> +
> +   return khadas_mcu_fan_set_level(ctx, state);
> +}
> +
> +static const struct thermal_cooling_device_ops khadas_mcu_fan_cooling_ops = {
> +   .get_max_state = khadas_mcu_fan_get_max_state,
> +   .get_cur_state = khadas_mcu_fan_get_cur_state,
> +   .set_cur_state = khadas_mcu_fan_set_cur_state,
> +};
> +
> +static int khadas_mcu_fan_probe(struct platform_device *pdev)
> +{
> +   struct khadas_mcu *mcu = dev_get_drvdata(pdev->dev.parent);
> +   struct thermal_cooling_device *cdev;
> +   struct device *dev = &pdev->dev;
> +   struct khadas_mcu_fan_ctx *ctx;
> +   int ret;
> +
> +   ctx = devm_kzalloc(dev, sizeof(*ctx), GFP_KERNEL);
> +   if (!ctx)
> +   return -ENOMEM;
> +   ctx->mcu = mcu;
> +   platform_set_drvdata(pdev, ctx);
> +
> +   cdev = devm_thermal_of_cooling_device_register(dev->parent,
> +   dev->parent->of_node, "khadas-mcu-fan", ctx,
> +   &khadas_mcu_fan_cooling_ops);
> +   if (IS_ERR(cdev)) {
> +   

Re: [PATCH v2 5/6] dmaengine: dw: Introduce max burst length hw config

2020-05-14 Thread Vinod Koul
On 12-05-20, 22:12, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 05:08:20PM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:41:53PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:03PM +0300, Serge Semin wrote:
> > > > IP core of the DW DMA controller may be synthesized with different
> > > > max burst length of the transfers per each channel. According to 
> > > > Synopsis
> > > > having the fixed maximum burst transactions length may provide some
> > > > performance gain. At the same time setting up the source and destination
> > > > multi size exceeding the max burst length limitation may cause a serious
> > > > problems. In our case the system just hangs up. In order to fix this
> > > > lets introduce the max burst length platform config of the DW DMA
> > > > controller device and don't let the DMA channels configuration code
> > > > exceed the burst length hardware limitation. Depending on the IP core
> > > > configuration the maximum value can vary from channel to channel.
> > > > It can be detected either in runtime from the DWC parameter registers
> > > > or from the dedicated dts property.
> > > 
> > > I'm wondering what can be the scenario when your peripheral will ask 
> > > something
> > > which is not supported by DMA controller?
> > 
> > I may misunderstood your statement, because seeing your activity around my
> > patchsets including the SPI patchset and sometimes very helpful comments,
> > this question answer seems too obvious to see you asking it.
> > 
> > No need to go far for an example. See the DW APB SSI driver. Its DMA module
> > specifies the burst length to be 16, while not all of ours channels 
> > supports it.
> > Yes, originally it has been developed for the Intel Midfield SPI, but since 
> > I
> > converted the driver into a generic code we can't use a fixed value. For 
> > instance
> > in our hardware only two DMA channels of total 16 are capable of bursting 
> > up to
> > 16 bytes (data items) at a time, the rest of them are limited with up to 4 
> > bytes
> > burst length. While there are two SPI interfaces, each of which need to 
> > have two
> > DMA channels for communications. So I need four channels in total to 
> > allocate to
> > provide the DMA capability for all interfaces. In order to set the SPI 
> > controller
> > up with valid optimized parameters the max-burst-length is required. 
> > Otherwise we
> > can end up with buffers overrun/underrun.
> 
> Right, and we come to the question which channel better to be used by SPI and
> the rest devices. Without specific filter function you can easily get into a
> case of inverted optimizations, when SPI got channels with burst = 4, while
> it's needed 16, and other hardware otherwise. Performance wise it's worse
> scenario which we may avoid in the first place, right?

If one has channels which are different and described as such in DT,
then I think it does make sense to specify in your board-dt about the
specific channels you would require...
> 
> > > Peripheral needs to supply a lot of configuration parameters specific to 
> > > the
> > > DMA controller in use (that's why we have struct dw_dma_slave).
> > > So, seems to me the feasible approach is supply correct data in the first 
> > > place.
> > 
> > How to supply a valid data if clients don't know the DMA controller 
> > limitations
> > in general?
> 
> This is a good question. DMA controllers are quite different and having 
> unified
> capabilities structure for all is almost impossible task to fulfil. That's why
> custom filter function(s) can help here. Based on compatible string you can
> implement whatever customized quirks like two functions, for example, to try 
> 16
> burst size first and fallback to 4 if none was previously found.
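
The "try 16 first, fall back to 4" selection described above could look roughly like the following sketch. struct toy_chan and pick_chan() are hypothetical and are not the dw_dma driver's structures or a real dmaengine filter function; the sketch only shows the selection order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-channel description; not the dw_dma driver's structures. */
struct toy_chan {
	unsigned int max_burst;  /* largest burst the channel supports */
	int busy;                /* non-zero if already allocated */
};

/*
 * Prefer a free channel that supports the wanted burst; otherwise fall
 * back to the free channel with the largest supported burst.
 * Returns the channel index, or -1 if every channel is busy.
 */
static int pick_chan(const struct toy_chan *ch, size_t n, unsigned int want)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (ch[i].busy)
			continue;
		if (ch[i].max_burst >= want)
			return (int)i;
		if (best < 0 || ch[i].max_burst > ch[best].max_burst)
			best = (int)i;
	}
	return best;
}
```

A client that gets a fallback channel would still have to lower its own burst setting to that channel's max_burst, which is the piece of information the thread is arguing about how to expose.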
> 
> > > If you have specific channels to acquire then you probably need to 
> > > provide a
> > > custom xlate / filter functions. Because above seems a bit hackish 
> > > workaround
> > > of dynamic channel allocation mechanism.
> > 
> > No, I don't have a specific channel to acquire and in general you may use 
> > any
> > returned from the DMA subsystem (though some platforms may need a dedicated
> > channels to use, in this case xlate / filter is required). In our SoC any 
> > DW DMAC
> > channel can be used for any DMA-capable peripherals like SPI, I2C, UART. 
> > But the
> > their DMA settings must properly and optimally configured. It can be only 
> > done
> > if you know the DMA controller parameters like max burst length, max 
> > block-size,
> > etc.
> > 
> > So no. The change proposed by this patch isn't workaround, but a useful 
> > feature,
> > moreover expected to be supported by the generic DMA subsystem.
> 
> See above.
> 
> > > But let's see what we can do better. Since maximum is defined on the 
> > > slave side
> > > device, it probably needs to define minimum as well, otherwise it's 
> > > possible
> > > that some hardware can't cope underrun bursts.
> > 
> > There is no need to de

Re: [RFC PATCH 1/8] qaic: Add skeleton driver

2020-05-14 Thread Greg KH
On Thu, May 14, 2020 at 06:43:06PM -0600, Jeffrey Hugo wrote:
> On 5/14/2020 8:07 AM, Jeffrey Hugo wrote:
> > +#define QAIC_NAME  "Qualcomm Cloud AI 100"
> 
> > +static struct pci_driver qaic_pci_driver = {
> > +   .name = QAIC_NAME,
> 
> A question about the community's preference here.
> 
> Our driver name is very verbose due to the desire to have the proper
> "branding".  However, I can see it being a bit obtuse, particularly in logs.
> 
> Would the community prefer something more succinct here, such as "qaic"?

Make it the same as KBUILD_MODNAME and then no one can complain :)
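
A minimal sketch of that suggestion follows. struct toy_pci_driver is a hypothetical stand-in for struct pci_driver, and the "qaic" fallback is an assumption that only exists so the fragment builds outside kbuild, which normally supplies KBUILD_MODNAME.

```c
#include <assert.h>
#include <string.h>

/* kbuild defines KBUILD_MODNAME per module; fall back for a standalone build. */
#ifndef KBUILD_MODNAME
#define KBUILD_MODNAME "qaic"
#endif

/* Toy stand-in for the one field of struct pci_driver that matters here. */
struct toy_pci_driver {
	const char *name;
};

static const struct toy_pci_driver qaic_pci_driver = {
	.name = KBUILD_MODNAME,
};
```

With this, the name shown in logs follows the module name automatically, and the longer branding string can stay in the module description.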

thanks,

greg k-h


Re: [PATCH 4/4] ipv6: symbol_get to access a sit symbol

2020-05-14 Thread Christoph Hellwig
On Thu, May 14, 2020 at 05:53:55PM -0700, David Miller wrote:
> You're not undoing one, but two levels of abstraction here.
> 
> Is this "ipip6_tunnel_locate()" call part of the SIT ioctl implementation?

Yes.  Take a look at the convoluted case handling the
SIOCADDTUNNEL and SIOCCHGTUNNEL commands in ipip6_tunnel_ioctl.

> Where did it come from?   Why are ->ndo_do_ioctl() implementations no longer
> allowed from here?

The problem is that we feed kernel pointers to it, which requires
set_fs address space overrides that I plan to kill off entirely.

> Honestly, this feels like a bit much.

My initial plan was to add a ->tunnel_ctl method to the net_device_ops,
and lift the copy_{to,from}_user for SIOCADDTUNNEL, SIOCCHGTUNNEL,
SIOCDELTUNNEL and maybe SIOCGETTUNNEL to net/socket.c.  But that turned
out to have two problems:

 - first, these ioctl names use the SIOCDEVPRIVATE range, which can also
   be implemented by other drivers
 - the ip_tunnel_parm structure is only used by the ipv4 tunneling
   drivers (including sit), the "real" ipv6 tunnels use a
   ip6_tnl_parm or ip6_tnl_parm2 structure instead

But if you don't like the symbol_get approach, I could do the
tunnel_ctl operation, just for the ipv4-ish tunnels, and only for
the kernel callers.

---end quoted text---


Re: [PATCH v2 4/6] dmaengine: dw: Print warning if multi-block is unsupported

2020-05-14 Thread Vinod Koul
Hi Serge,

On 12-05-20, 15:42, Serge Semin wrote:
> Vinod,
> 
> Could you join the discussion for a little bit?
> 
> In order to properly fix the problem discussed in this topic, we need to
> introduce an additional capability exported by DMA channel handlers on 
> per-channel
> basis. It must be a number, which would indicate an upper limitation of the 
> SG list
> entries amount.
> Something like this would do it:
> struct dma_slave_caps {
> ...
>   unsigned int max_sg_nents;
> ...

Looking at the discussion, I agree we should can this up in the
interface. The max_dma_len suggests the length of a descriptor allowed,
it does not convey the sg_nents supported which in the case of nollp is
one.

Btw is this a real hardware issue? I have found that the value of such
hardware is very low and people did fix it up in subsequent revs to add
llp support.

Also, another question is why this cannot be handled in the driver. I agree
your hardware does not support llp but that does not stop you from
breaking a multi_sg list into N hardware descriptors and keep submitting
them (for this to work submission should be done in isr and not in bh,
unfortunately very few drivers take that route). TBH the max_sg_nents or
max_dma_len are HW restrictions and SW *can* deal with them :-)

In an ideal world, you should break the sw descriptor submitted into N hw
descriptors and submit to hardware and let user know when the sw
descriptor is completed. Of course we do not do that :(
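
As a small worked example of the splitting described above, the sketch below counts how many hardware descriptors one software descriptor would need. count_hw_descs() and its parameters are hypothetical and are not part of dmaengine; the sketch assumes each scatterlist entry is cut independently into blocks of at most max_block bytes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of hardware descriptors needed when each scatterlist entry of
 * length sg_len[i] must be cut into blocks of at most max_block bytes.
 * Zero-length entries contribute nothing. max_block must be non-zero.
 */
static size_t count_hw_descs(const size_t *sg_len, size_t nents,
			     size_t max_block)
{
	size_t total = 0;

	assert(max_block > 0);
	for (size_t i = 0; i < nents; i++)
		total += sg_len[i] / max_block + (sg_len[i] % max_block != 0);
	return total;
}
```

For entries of 10000, 4096 and 1 bytes with a 4096-byte block limit this gives 3 + 1 + 1 = 5 hardware descriptors, submitted one after another, with completion reported to the user once after the last one.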

> };
> As Andy suggested it's value should be interpreted as:
> 0  - unlimited number of entries,
> 1:MAX_UINT - actual limit to the number of entries.

Hmm why 0, why not MAX_UINT for unlimited?

> In addition to that seeing the dma_get_slave_caps() method provide the caps 
> only
> by getting them from the DMA device descriptor, while we need to have an info 
> on
> per-channel basis, it would be good to introduce a new DMA-device callback 
> like:
> struct dma_device {
> ...
>   int (*device_caps)(struct dma_chan *chan,
>  struct dma_slave_caps *caps);

Do you have a controller where channel caps are on per-channel basis?

> ...
> };
> So the DMA driver could override the generic DMA device capabilities with the
> values specific to the DMA channels. Such functionality will be also helpful 
> for
> the max-burst-len parameter introduced by this patchset, since depending on 
> the
> IP-core synthesis parameters it may be channel-specific.
> 
> Alternatively we could just introduce a new fields to the dma_chan structure 
> and
> retrieve the new caps values from them in the dma_get_slave_caps() method.
> Though the solution with callback I like better.
> 
> What is your opinion about this? What solution you'd prefer?
> 
> On Tue, May 12, 2020 at 12:08:00AM +0300, Andy Shevchenko wrote:
> > On Tue, May 12, 2020 at 12:07:14AM +0300, Andy Shevchenko wrote:
> > > On Mon, May 11, 2020 at 10:32:55PM +0300, Serge Semin wrote:
> > > > On Mon, May 11, 2020 at 04:58:53PM +0300, Andy Shevchenko wrote:
> > > > > On Mon, May 11, 2020 at 4:48 PM Serge Semin
> > > > >  wrote:
> > > > > >
> > > > > > On Mon, May 11, 2020 at 12:58:13PM +0100, Mark Brown wrote:
> > > > > > > On Mon, May 11, 2020 at 05:10:16AM +0300, Serge Semin wrote:
> > > > > > >
> > > > > > > > Alas linearizing the SPI messages won't help in this case 
> > > > > > > > because the DW DMA
> > > > > > > > driver will split it into the max transaction chunks anyway.
> > > > > > >
> > > > > > > That sounds like you need to also impose a limit on the maximum 
> > > > > > > message
> > > > > > > size as well then, with that you should be able to handle 
> > > > > > > messages up
> > > > > > > to whatever that limit is.  There's code for that bit already, so 
> > > > > > > long
> > > > > > > as the limit is not too low it should be fine for most devices and
> > > > > > > client drivers can see the limit so they can be updated to work 
> > > > > > > with it
> > > > > > > if needed.
> > > > > >
> > > > > > Hmm, this might work. The problem will be with imposing such 
> > > > > > limitation through
> > > > > > the DW APB SSI driver. In order to do this I need to know:
> > > > > > 1) Whether multi-block LLP is supported by the DW DMA controller.
> > > > > > 2) Maximum DW DMA transfer block size.
> > > > > > Then I'll be able to use this information in the can_dma() callback 
> > > > > > to enable
> > > > > > the DMA xfers only for the safe transfers. Did you mean something 
> > > > > > like this when
> > > > > > you said "There's code for that bit already" ? If you meant the 
> > > > > > max_dma_len
> > > > > > parameter, then setting it won't work, because it just limits the 
> > > > > > SG items size
> > > > > > not the total length of a single transfer.
> > > > > >
> > > > > > So the question is of how to export the multi-block LLP flag from 
> > > > > > DW DMAc
> > > > > > driver. Andy?
> > > > > 
> > > > > I'm not sure I understand why do you need this being exported. Just

Re: [PATCH v7 2/4] usb: dwc3: qcom: Add interconnect support in dwc3 driver

2020-05-14 Thread Felipe Balbi

Hi,

Georgi Djakov  writes:
>> Sandeep Maheswaram  writes:
>>> +static int dwc3_qcom_interconnect_init(struct dwc3_qcom *qcom)
>>> +{
>>> +   struct device *dev = qcom->dev;
>>> +   int ret;
>>> +
>>> +   if (!device_is_bound(&qcom->dwc3->dev))
>>> +   return -EPROBE_DEFER;
>>
>> this breaks allmodconfig. I'm dropping this series from my queue for
>> this merge window.
>
> Sorry, I meant this patch ;-)

>>>> I guess that's due to INTERCONNECT being a module. There is currently a
>>>
>>> I believe it's because of this:
>>> ERROR: modpost: "device_is_bound" [drivers/usb/dwc3/dwc3-qcom.ko] undefined!
>>>
>>>> discussion about this  with Viresh and Georgi in response to another
>>>> automated build failure. Viresh suggests changing CONFIG_INTERCONNECT
>>>> from tristate to bool, which seems sensible to me given that interconnect
>>>> is a core subsystem.
>>>
>>> The problem you are talking about would arise when INTERCONNECT=m and
>>> USB_DWC3_QCOM=y and it definitely exists here and could be triggered with
>>> randconfig build. So i suggest to squash also the diff below.
>>>
>>> Thanks,
>>> Georgi
>>>
>>> ---8<---
>>> diff --git a/drivers/usb/dwc3/Kconfig b/drivers/usb/dwc3/Kconfig
>>> index 206caa0ea1c6..6661788b1a76 100644
>>> --- a/drivers/usb/dwc3/Kconfig
>>> +++ b/drivers/usb/dwc3/Kconfig
>>> @@ -129,6 +129,7 @@ config USB_DWC3_QCOM
>>> tristate "Qualcomm Platform"
>>> depends on ARCH_QCOM || COMPILE_TEST
>>> depends on EXTCON || !EXTCON
>>> +   depends on INTERCONNECT || !INTERCONNECT
>> 
>> I would prefer to see a patch adding EXPORT_SYMBOL_GPL() to device_is_bound()
>
> Agree, but just to clarify, that these are two separate issues that need to
> be fixed. The device_is_bound() is the first one and USB_DWC3_QCOM=y combined
> with INTERCONNECT=m is the second one.

If INTERCONNECT=m, QCOM3 shouldn't be y. I think the following is
enough:

depends on INTERCONNECT=y || INTERCONNECT=USB_DWC3_QCOM
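
For reference, the effect of the "depends on INTERCONNECT || !INTERCONNECT" line in the quoted diff can be worked through with a small model of Kconfig's tristate arithmetic, where n, m and y are 0, 1 and 2, "!" is 2 minus the value and "||" is the maximum. The C names below are made up for the illustration.

```c
#include <assert.h>

/* Kconfig tristate values: n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig semantics: !x is (2 - x), a || b is max(a, b). */
static enum tristate tri_not(enum tristate x)
{
	return (enum tristate)(2 - x);
}

static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* Upper limit that "depends on DEP || !DEP" places on the dependent symbol. */
static enum tristate limit_dep_or_not_dep(enum tristate dep)
{
	return tri_or(dep, tri_not(dep));
}
```

The limit is y when the dependency is n or y, and m when the dependency is m, so the dependent driver cannot be built in while the subsystem it calls into is a module.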

-- 
balbi




Re: [f2fs-dev] [PATCH] f2fs: flush dirty meta pages when flushing them

2020-05-14 Thread Chao Yu
On 2020/5/15 10:15, Jaegeuk Kim wrote:
> Let's guarantee flushing dirty meta pages to avoid infinite loop.

What's the root cause? Race case or meta page flush failure?

Thanks,

> 
> Signed-off-by: Jaegeuk Kim 
> ---
>  fs/f2fs/checkpoint.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> index 620a386d82c1a..9a7f695d5adb3 100644
> --- a/fs/f2fs/checkpoint.c
> +++ b/fs/f2fs/checkpoint.c
> @@ -1266,6 +1266,9 @@ void f2fs_wait_on_all_pages(struct f2fs_sb_info *sbi, 
> int type)
>   if (unlikely(f2fs_cp_error(sbi)))
>   break;
>  
> + if (type == F2FS_DIRTY_META)
> + f2fs_sync_meta_pages(sbi, META, LONG_MAX,
> + FS_CP_META_IO);
>   io_schedule_timeout(DEFAULT_IO_TIMEOUT);
>   }
>   finish_wait(&sbi->cp_wait, &wait);
> @@ -1493,8 +1496,6 @@ static int do_checkpoint(struct f2fs_sb_info *sbi, 
> struct cp_control *cpc)
>   sbi->last_valid_block_count = sbi->total_valid_block_count;
>   percpu_counter_set(&sbi->alloc_valid_block_count, 0);
>  
> - /* Here, we have one bio having CP pack except cp pack 2 page */
> - f2fs_sync_meta_pages(sbi, META, LONG_MAX, FS_CP_META_IO);
>   /* Wait for all dirty meta pages to be submitted for IO */
>   f2fs_wait_on_all_pages(sbi, F2FS_DIRTY_META);
>  
> 


Re: [PATCH v2 06/10] mm/swap: use npage_to_sectors() and PAGE_SECTORS to clean up code

2020-05-14 Thread Leizhen (ThunderTown)



On 2020/5/15 12:06, Matthew Wilcox wrote:
> On Thu, May 07, 2020 at 03:50:56PM +0800, Zhen Lei wrote:
>> @@ -266,7 +266,7 @@ int swap_writepage(struct page *page, struct 
>> writeback_control *wbc)
>>  
>>  static sector_t swap_page_sector(struct page *page)
>>  {
>> -return (sector_t)__page_file_index(page) << (PAGE_SHIFT - 9);
>> +return npage_to_sectors((sector_t)__page_file_index(page));
> 
> If you make npage_to_sectors() a proper function instead of a macro,
> you can do the casting inside the function instead of in the callers
> (which is prone to bugs).

Oh, yes. __page_file_index(page) may be evaluated many times in a macro, although
currently it is not. So not all call sites are suitable for page_to_sector().
And for this example, I still need to use "<< PAGE_SECTORS_SHIFT".

> 
> Also, this is a great example of why page_to_sector() was a better name
> than npage_to_sectors().  This function doesn't return a count of sectors,
> it returns a sector number within the swap device.
OK, so I will change to page_to_sector()/sector_to_page().
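
A minimal user-space sketch of the "proper function instead of a macro" idea follows. It assumes 4 KiB pages; sector_t is a stand-in typedef, and pgidx_t is a hypothetical 32-bit index type chosen only to show why the widening must happen before the shift.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB pages, 512-byte sectors. */
#define PAGE_SHIFT         12
#define PAGE_SECTORS_SHIFT (PAGE_SHIFT - 9)

typedef uint64_t sector_t;   /* stand-in for the kernel type */
typedef uint32_t pgidx_t;    /* hypothetical narrow page-index type */

/* The widening happens once, inside the helper, before the shift. */
static inline sector_t page_to_sector(pgidx_t index)
{
	return (sector_t)index << PAGE_SECTORS_SHIFT;
}

/* Inverse mapping: the sector number truncated to its page index. */
static inline pgidx_t sector_to_page(sector_t sector)
{
	return (pgidx_t)(sector >> PAGE_SECTORS_SHIFT);
}
```

Because the argument is a function parameter, it is evaluated exactly once, and callers no longer need their own casts: page_to_sector(0x20000000) yields 0x100000000 rather than wrapping in 32 bits.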

> 
> 
> .
> 



[PATCH] mmc: sdhci-of-dwcmshc: add suspend/resume support

2020-05-14 Thread Jisheng Zhang
Add dwcmshc specific system-level suspend and resume support.

Signed-off-by: Jisheng Zhang 
---
 drivers/mmc/host/sdhci-of-dwcmshc.c | 43 +
 1 file changed, 43 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-dwcmshc.c 
b/drivers/mmc/host/sdhci-of-dwcmshc.c
index a9ed0e006e06..64ac0dbee95c 100644
--- a/drivers/mmc/host/sdhci-of-dwcmshc.c
+++ b/drivers/mmc/host/sdhci-of-dwcmshc.c
@@ -163,6 +163,48 @@ static int dwcmshc_remove(struct platform_device *pdev)
return 0;
 }
 
+#ifdef CONFIG_PM_SLEEP
+static int dwcmshc_suspend(struct device *dev)
+{
+   struct sdhci_host *host = dev_get_drvdata(dev);
+   struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
+   struct dwcmshc_priv *priv = sdhci_pltfm_priv(pltfm_host);
+   int ret;
+
+   ret = sdhci_suspend_host(host);
+   if (ret)
+   return ret;
+
+   clk_disable_unprepare(pltfm_host->clk);
+   if (!IS_ERR(priv->bus_clk))
+   clk_disable_unprepare(priv->bus_clk);
+
+   return ret;
+}
+
+static int dwcmshc_resume(struct device *dev)
+{
+   struct sdhci_host *host = dev_get_drvdata(dev);
+   struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
+   struct dwcmshc_priv *priv = sdhci_pltfm_priv(pltfm_host);
+   int ret;
+
+   ret = clk_prepare_enable(pltfm_host->clk);
+   if (ret)
+   return ret;
+
+   if (!IS_ERR(priv->bus_clk)) {
+   ret = clk_prepare_enable(priv->bus_clk);
+   if (ret)
+   return ret;
+   }
+
+   return sdhci_resume_host(host);
+}
+#endif
+
+static SIMPLE_DEV_PM_OPS(dwcmshc_pmops, dwcmshc_suspend, dwcmshc_resume);
+
 static const struct of_device_id sdhci_dwcmshc_dt_ids[] = {
{ .compatible = "snps,dwcmshc-sdhci" },
{}
@@ -173,6 +215,7 @@ static struct platform_driver sdhci_dwcmshc_driver = {
.driver = {
.name   = "sdhci-dwcmshc",
.of_match_table = sdhci_dwcmshc_dt_ids,
+   .pm = &dwcmshc_pmops,
},
.probe  = dwcmshc_probe,
.remove = dwcmshc_remove,
-- 
2.26.2



[PATCH v8] Add matrix keypad driver support for Mediatek SoCs

2020-05-14 Thread Fengping Yu

Change since v7:
- specify compatible property as const string
- add maxItem in required property
- squash keypad example nodes
- sort header file with alphabetic order
- align all define values and add MTK_ prefix to make more uniform
- change debounce value to default 16ms if not specified in dts
- remove extra braces
- separate clk prepare as an internal driver function
- add special compatible string
- modify CONFIG_KEYBOARD_MTK_KPD=m to build keypad as ko module

fengping.yu (3):
  dt-bindings: Add keypad devicetree documentation
  drivers: input: keyboard: Add mtk keypad driver
  configs: defconfig: Add CONFIG_KEYBOARD_MTK_KPD=m

 .../devicetree/bindings/input/mtk-kpd.yaml|  94 +++
 arch/arm64/configs/defconfig  |   1 +
 drivers/input/keyboard/Kconfig|   9 +
 drivers/input/keyboard/Makefile   |   1 +
 drivers/input/keyboard/mtk-kpd.c  | 231 ++
 5 files changed, 336 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/input/mtk-kpd.yaml
 create mode 100644 drivers/input/keyboard/mtk-kpd.c

--
2.18.0



Re: [PATCH] crypto: blake2b - Fix clang optimization for ARMv7-M

2020-05-14 Thread Herbert Xu
On Tue, May 05, 2020 at 03:53:45PM +0200, Arnd Bergmann wrote:
> When building for ARMv7-M, clang-9 or higher tries to unroll some loops,
> which ends up confusing the register allocator to the point of generating
> rather bad code and using more than the warning limit for stack frames:
> 
> warning: stack frame size of 1200 bytes in function 'blake2b_compress' 
> [-Wframe-larger-than=]
> 
> Forcing it to not unroll the final loop avoids this problem.
> 
> Fixes: 91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
> Signed-off-by: Arnd Bergmann 
> ---
>  crypto/blake2b_generic.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
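
As a rough illustration of "forcing it to not unroll the final loop", the sketch below marks a small loop with clang's nounroll pragma. The function name and the surrounding code are hypothetical and are not taken from crypto/blake2b_generic.c; only the shape of the loop (folding two halves of a working vector into the state) follows the BLAKE2b finalization step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's blake2b_compress(): folds the
 * 16-word working vector v into the 8-word state h and returns the XOR
 * of the updated state words.
 */
uint64_t sketch_finalize(uint64_t h[8], const uint64_t v[16])
{
	uint64_t acc = 0;
	int i;

	assert(h != NULL && v != NULL);

	/*
	 * Ask clang to keep this loop rolled. The guard keeps other
	 * compilers from warning about an unknown pragma.
	 */
#if defined(__clang__)
#pragma nounroll
#endif
	for (i = 0; i < 8; ++i) {
		h[i] ^= v[i] ^ v[i + 8];
		acc ^= h[i];
	}

	return acc;
}
```

The pragma only affects code generation; the result of the loop is the same with or without it.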


Re: [PATCH 0/4] crypto: constify struct debugfs_reg32

2020-05-14 Thread Herbert Xu
On Sat, May 09, 2020 at 12:34:58AM +0200, Rikard Falkeborn wrote:
> A small series constifying struct debugfs_reg32 where it can be made
> const. There's no dependency between the patches.
> 
> Rikard Falkeborn (4):
>   crypto: ccree - constify struct debugfs_reg32
>   crypto: hisilicon/hpre - constify struct debugfs_reg32
>   crypto: hisilicon/zip - constify struct debugfs_reg32
>   crypto: hisilicon/sec2 - constify sec_dfx_regs
> 
>  drivers/crypto/ccree/cc_debugfs.c | 4 ++--
>  drivers/crypto/hisilicon/hpre/hpre_main.c | 4 ++--
>  drivers/crypto/hisilicon/sec2/sec_main.c  | 2 +-
>  drivers/crypto/hisilicon/zip/zip_main.c   | 2 +-
>  4 files changed, 6 insertions(+), 6 deletions(-)

All applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH 5.7-rc5] HID: apple: Swap the Fn and Left Control keys on Apple keyboards

2020-05-14 Thread free5lot

This patch allows users to swap the Fn and left Control keys on all Apple
keyboards: internal (e.g. Macbooks) and external (both wired and wireless).
The patch adds a new hid-apple module param: swap_fn_leftctrl (off by default).

Signed-off-by: Zakhar Semenov 
---
This patch was created to eliminate the inconvenience of having an unusual
order of 4 left-bottom keys on Apple keyboards for GNU+Linux users.
Now it's possible to swap the Fn and left Control keys on Macbooks and
external Apple keyboards and have the same experience as on usual PC layout.

The patch has been heavily tested for about 5 years by community at:
https://github.com/free5lot/hid-apple-patched
The patch is small and straightforward. The modified version of hid-apple
is currently mentioned in wiki-documentation of several major GNU/Linux
distributions (e.g. Ubuntu, Arch, openSUSE).


--- hid-apple.c.orig2020-05-12 11:06:26.0 +0300
+++ hid-apple.c 2020-05-15 20:00:00.0 +0300
@@ -51,6 +51,12 @@ MODULE_PARM_DESC(swap_opt_cmd, "Swap the
"(For people who want to keep Windows PC keyboard muscle memory. 
"
"[0] = as-is, Mac layout. 1 = swapped, Windows layout.)");

+static unsigned int swap_fn_leftctrl;
+module_param(swap_fn_leftctrl, uint, 0644);
+MODULE_PARM_DESC(swap_fn_leftctrl, "Swap the Fn and left Control keys. "
+   "(For people who want to keep PC keyboard muscle memory. "
+   "[0] = as-is, Mac layout, 1 = swapped, PC layout)");
+
 struct apple_sc {
unsigned long quirks;
unsigned int fn_on;
@@ -162,6 +168,11 @@ static const struct apple_key_translatio
{ }
 };

+static const struct apple_key_translation swapped_fn_leftctrl_keys[] = {
+   { KEY_FN, KEY_LEFTCTRL },
+   { }
+};
+
 static const struct apple_key_translation *apple_find_translation(
const struct apple_key_translation *table, u16 from)
 {
@@ -183,9 +194,11 @@ static int hidinput_apple_event(struct h
bool do_translate;
u16 code = 0;

-   if (usage->code == KEY_FN) {
+   u16 fn_keycode = (swap_fn_leftctrl) ? (KEY_LEFTCTRL) : (KEY_FN);
+
+   if (usage->code == fn_keycode) {
asc->fn_on = !!value;
-   input_event(input, usage->type, usage->code, value);
+   input_event(input, usage->type, KEY_FN, value);
return 1;
}

@@ -270,6 +283,14 @@ static int hidinput_apple_event(struct h
}
}

+   if (swap_fn_leftctrl) {
+   trans = apple_find_translation(swapped_fn_leftctrl_keys, 
usage->code);
+   if (trans) {
+   input_event(input, usage->type, trans->to, value);
+   return 1;
+   }
+   }
+
return 0;
 }

@@ -333,6 +354,11 @@ static void apple_setup_input(struct inp

for (trans = apple_iso_keyboard; trans->from; trans++)
set_bit(trans->to, input->keybit);
+
+   if (swap_fn_leftctrl) {
+   for (trans = swapped_fn_leftctrl_keys; trans->from; trans++)
+   set_bit(trans->to, input->keybit);
+   }
 }

 static int apple_input_mapping(struct hid_device *hdev, struct hid_input *hi,
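
The remapping behaviour of the patch above can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions: the sketch_ names are made up for the example, the key code values are placeholders standing in for KEY_FN and KEY_LEFTCTRL, and the real hidinput_apple_event() also tracks Fn state and emits input events, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder key codes standing in for KEY_LEFTCTRL and KEY_FN. */
enum { SKETCH_KEY_LEFTCTRL = 29, SKETCH_KEY_FN = 0x1d0 };

struct sketch_key_translation {
	uint16_t from;
	uint16_t to;
};

/* Zero-terminated table: an entry with from == 0 marks the end. */
static const struct sketch_key_translation sketch_swapped_fn_leftctrl[] = {
	{ SKETCH_KEY_FN, SKETCH_KEY_LEFTCTRL },
	{ 0, 0 }
};

/* Linear search, mirroring the shape of apple_find_translation(). */
static const struct sketch_key_translation *
sketch_find_translation(const struct sketch_key_translation *table,
			uint16_t from)
{
	const struct sketch_key_translation *trans;

	assert(table != NULL);
	for (trans = table; trans->from; trans++)
		if (trans->from == from)
			return trans;
	return NULL;
}

/* Returns the key code that would be reported for a raw key code. */
uint16_t sketch_remap(uint16_t code, int swap_fn_leftctrl)
{
	uint16_t fn_keycode = swap_fn_leftctrl ? SKETCH_KEY_LEFTCTRL
					       : SKETCH_KEY_FN;
	const struct sketch_key_translation *trans;

	/* The physical key acting as Fn is always reported as Fn. */
	if (code == fn_keycode)
		return SKETCH_KEY_FN;

	/* With the swap enabled, the physical Fn key becomes left Control. */
	if (swap_fn_leftctrl) {
		trans = sketch_find_translation(sketch_swapped_fn_leftctrl,
						code);
		if (trans)
			return trans->to;
	}

	return code;
}
```

With the swap disabled every code passes through unchanged; with it enabled only the two keys involved trade places.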


Re: [PATCH v2 3/6] dmaengine: dw: Set DMA device max segment size parameter

2020-05-14 Thread Vinod Koul
On 12-05-20, 15:35, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 12:16:22AM +0300, Serge Semin wrote:
> > On Fri, May 08, 2020 at 02:21:52PM +0300, Andy Shevchenko wrote:
> > > On Fri, May 08, 2020 at 01:53:01PM +0300, Serge Semin wrote:
> > > > Maximum block size DW DMAC configuration corresponds to the max segment
> > > > size DMA parameter in the DMA core subsystem notation. Lets set it with a
> > > > value specific to the probed DW DMA controller. It shall help the DMA
> > > > clients to create size-optimized SG-list items for the controller. This in
> > > > turn will cause less dw_desc allocations, less LLP reinitializations,
> > > > better DMA device performance.
> 
> > > Yeah, I have locally something like this and I didn't dare to upstream because
> > > there is an issue. We have this information per DMA controller, while we
> > > actually need this on per DMA channel basis.
> > > 
> > > Above will work only for synthesized DMA with all channels having same block
> > > size. That's why above conditional is not needed anyway.
> > 
> > Hm, I don't really see why the conditional isn't needed and this won't work. As
> > you can see in the loop above Initially I find a maximum of all channels maximum
> > block sizes and use it then as a max segment size parameter for the whole device.
> > If the DW DMA controller has the same max block size of all channels, then it
> > will be found. If the channels've been synthesized with different block sizes,
> > then the optimization will work for the one with greatest block size. The SG
> > list entries of the channels with lesser max block size will be split up
> > by the DW DMAC driver, which would have been done anyway without
> > max_segment_size being set. Here we at least provide the optimization for the
> > channels with greatest max block size.
> > 
> > I do understand that it would be good to have this parameter setup on per generic
> > DMA channel descriptor basis. But DMA core and device descriptor doesn't provide
> > such facility, so setting at least some justified value is a good idea.
> > 
> > > 
> > > OTOH, I never saw the DesignWare DMA to be synthesized differently (I remember
> > > that Intel Medfield has interesting settings, but I don't remember if DMA
> > > channels are different inside the same controller).
> > > 
> > > Vineet, do you have any information that Synopsys customers synthesized DMA
> > > controllers with different channel characteristics inside one DMA IP?
> > 
> > AFAICS the DW DMAC channels can be synthesized with different max block size.
> > The IP core supports such configuration. So we can't assume that such DMAC
> > release can't be found in a real hardware just because we've never seen one.
> > No matter what Vineet will have to say in response to your question.
> 
> My point here that we probably can avoid complications till we have real
> hardware where it's different. As I said I don't remember a such, except
> *maybe* Intel Medfield, which is quite outdated and not supported for wider
> audience anyway.

IIRC Intel Medfield has a couple of DMA controller instances, each one with
different parameters, *but* each instance has the same channel configuration.

I do not recall seeing synthesis parameters on a per-channel basis... But I
may be wrong, it's been a while.

-- 
~Vinod
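
The approach discussed in the quoted text (take the largest per-channel block size as the device-wide max segment size, and let the driver split segments for channels with a smaller limit) can be sketched as below. The function names and the plain array of block sizes are assumptions for the example; this is not the dw_dmac driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the largest per-channel block size, or 0 for an empty array. */
size_t sketch_max_segment_size(const size_t *chan_block_size,
			       size_t nr_channels)
{
	size_t max = 0;
	size_t i;

	assert(chan_block_size != NULL || nr_channels == 0);
	for (i = 0; i < nr_channels; i++)
		if (chan_block_size[i] > max)
			max = chan_block_size[i];

	return max;
}

/*
 * Number of hardware descriptors needed when a segment of len bytes is
 * split into blocks of at most block_size bytes (rounded up).
 */
size_t sketch_descs_needed(size_t len, size_t block_size)
{
	assert(block_size > 0);
	return (len + block_size - 1) / block_size;
}
```

For channels limited to 1024, 4095 and 2048 bytes the device-wide value would be 4095. A 4095-byte segment then needs one descriptor on the largest channel and four on the 1024-byte channel, which is the splitting the driver would have to do anyway.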


[git pull] drm fixes for 5.7-rc6

2020-05-14 Thread Dave Airlie
Hey Linus,

As mentioned last week an i915 PR came in late, but I left it, so the
i915 bits of this cover 2 weeks, which is why it's likely a bit larger
than usual. Otherwise it's mostly amdgpu fixes, one tegra fix, one
meson fix.

Regards,
Dave.

drm-fixes-2020-05-15:
drm fixes for v5.7-rc6

i915 (two weeks):
- Handle idling during i915_gem_evict_something busy loops (Chris)
- Mark current submissions with a weak-dependency (Chris)
- Propagate error from completed fences (Chris)
- Fixes on execlist to avoid GPU hang situation (Chris)
- Fixes couple deadlocks (Chris)
- Timeslice preemption fixes (Chris)
- Fix Display Port interrupt handling on Tiger Lake (Imre)
- Reduce debug noise around Frame Buffer Compression (Peter)
- Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan)
- Avoid dereferencing a dead context (Chris)

tegra:
- tegra120/4 smmu fixes

 amdgpu:
- Clockgating fixes
- Fix fbdev with scatter/gather display
- S4 fix for navi
- Soft recovery for gfx10
- Freesync fixes
- Atomic check cursor fix
- Add a gfxoff quirk
- MST fix

amdkfd:
- Fix GEM reference counting

meson:
- error code propagation fix

The following changes since commit 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8:

  Linux 5.7-rc5 (2020-05-10 15:16:58 -0700)

are available in the Git repository at:

  git://anongit.freedesktop.org/drm/drm tags/drm-fixes-2020-05-15

for you to fetch changes up to 1d2a1eb13610a9c8ec95f6f1e02cef55000f28e3:

  Merge tag 'drm-misc-fixes-2020-05-14' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes (2020-05-15
16:00:57 +1000)


drm fixes for v5.7-rc6

i915 (two weeks):
- Handle idling during i915_gem_evict_something busy loops (Chris)
- Mark current submissions with a weak-dependency (Chris)
- Propagate error from completed fences (Chris)
- Fixes on execlist to avoid GPU hang situation (Chris)
- Fixes couple deadlocks (Chris)
- Timeslice preemption fixes (Chris)
- Fix Display Port interrupt handling on Tiger Lake (Imre)
- Reduce debug noise around Frame Buffer Compression (Peter)
- Fix logic around IPC W/a for Coffee Lake and Kaby Lake (Sultan)
- Avoid dereferencing a dead context (Chris)

tegra:
- tegra120/4 smmu fixes

 amdgpu:
- Clockgating fixes
- Fix fbdev with scatter/gather display
- S4 fix for navi
- Soft recovery for gfx10
- Freesync fixes
- Atomic check cursor fix
- Add a gfxoff quirk
- MST fix

amdkfd:
- Fix GEM reference counting

meson:
- error code propagation fix


Alex Deucher (2):
  drm/amdgpu: force fbdev into vram
  drm/amdgpu: implement soft_recovery for gfx10

Bernard Zhao (1):
  drm/meson: pm resume add return errno branch

Chris Wilson (10):
  drm/i915: Avoid dereferencing a dead context
  drm/i915/gt: Make timeslicing an explicit engine property
  drm/i915: Check current i915_vma.pin_count status first on unbind
  drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore
  drm/i915/gem: Remove object_is_locked assertion from
unpin_from_display_plane
  drm/i915/execlists: Avoid reusing the same logical CCID
  drm/i915/execlists: Track inflight CCID
  drm/i915: Propagate error from completed fences
  drm/i915: Mark concurrent submissions with a weak-dependency
  drm/i915: Handle idling during i915_gem_evict_something busy loops

Colin Xu (1):
  drm/i915/gvt: Init DPLL/DDI vreg for virtual display instead of
inheritance.

Dave Airlie (5):
  Merge tag 'drm-intel-fixes-2020-05-07' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
  Merge tag 'drm/tegra/for-5.7-fixes' of
git://anongit.freedesktop.org/tegra/linux into drm-fixes
  Merge tag 'amd-drm-fixes-5.7-2020-05-13' of
git://people.freedesktop.org/~agd5f/linux into drm-fixes
  Merge tag 'drm-intel-fixes-2020-05-13-1' of
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
  Merge tag 'drm-misc-fixes-2020-05-14' of
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Evan Quan (4):
  drm/amdgpu: disable MGCG/MGLS also on gfx CG ungate
  drm/amdgpu: drop unnecessary cancel_delayed_work_sync on PG ungate
  drm/amd/powerplay: perform PG ungate prior to CG ungate
  drm/amdgpu: enable hibernate support on Navi1X

Felix Kuehling (1):
  drm/amdgpu: Use GEM obj reference for KFD BOs

Imre Deak (1):
  drm/i915/tgl+: Fix interrupt handling for DP AUX transactions

Leo (Hanghong) Ma (1):
  drm/amd/amdgpu: Update update_config() logic

Nicholas Kazlauskas (1):
  drm/amd/display: Fix vblank and pageflip event handling for FreeSync

Peter Jones (1):
  Make the "Reducing compressed framebufer size" message be DRM_INFO_ONCE()

Rodrigo Vivi (1):
  Merge tag 'gvt-fixes-2020-05-12' of
https://github.com/intel/gvt-linux into drm-intel-fixes

Simon Ser (1):
  drm/amd/display: add basic atomic check for cursor plane

Sultan Alsawaf (1):
  drm/i915: Don't e

Re: [PATCH v7 2/4] usb: dwc3: qcom: Add interconnect support in dwc3 driver

2020-05-14 Thread Georgi Djakov
Hi,

On 5/15/20 08:54, Felipe Balbi wrote:
> 
> Hi,
> 
> Georgi Djakov  writes:
>> On 5/14/20 20:13, Matthias Kaehlcke wrote:
>>> On Thu, May 14, 2020 at 02:30:28PM +0300, Felipe Balbi wrote:
>>>> Felipe Balbi  writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> Sandeep Maheswaram  writes:
>>>>>> +static int dwc3_qcom_interconnect_init(struct dwc3_qcom *qcom)
>>>>>> +{
>>>>>> +struct device *dev = qcom->dev;
>>>>>> +int ret;
>>>>>> +
>>>>>> +if (!device_is_bound(&qcom->dwc3->dev))
>>>>>> +return -EPROBE_DEFER;
>>>>>
>>>>> this breaks allmodconfig. I'm dropping this series from my queue for
>>>>> this merge window.
>>>>
>>>> Sorry, I meant this patch ;-)
>>>
>>> I guess that's due to INTERCONNECT being a module. There is currently a
>>
>> I believe it's because of this:
>> ERROR: modpost: "device_is_bound" [drivers/usb/dwc3/dwc3-qcom.ko] undefined!
>>
>>> discussion about this  with Viresh and Georgi in response to another
>>> automated build failure. Viresh suggests changing CONFIG_INTERCONNECT
>>> from tristate to bool, which seems sensible to me given that interconnect
>>> is a core subsystem.
>>
>> The problem you are talking about would arise when INTERCONNECT=m and
>> USB_DWC3_QCOM=y and it definitely exists here and could be triggered with
>> randconfig build. So I suggest also squashing in the diff below.
>>
>> Thanks,
>> Georgi
>>
>> ---8<---
>> diff --git a/drivers/usb/dwc3/Kconfig b/drivers/usb/dwc3/Kconfig
>> index 206caa0ea1c6..6661788b1a76 100644
>> --- a/drivers/usb/dwc3/Kconfig
>> +++ b/drivers/usb/dwc3/Kconfig
>> @@ -129,6 +129,7 @@ config USB_DWC3_QCOM
>>  tristate "Qualcomm Platform"
>>  depends on ARCH_QCOM || COMPILE_TEST
>>  depends on EXTCON || !EXTCON
>> +depends on INTERCONNECT || !INTERCONNECT
> 
> I would prefer to see a patch adding EXPORT_SYMBOL_GPL() to device_is_bound()

Agreed, but just to clarify: these are two separate issues that need to be
fixed. The missing export of device_is_bound() is the first one, and
USB_DWC3_QCOM=y combined with INTERCONNECT=m is the second one.

Thanks,
Georgi
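
For readers unfamiliar with the "depends on INTERCONNECT || !INTERCONNECT" idiom, the tristate arithmetic behind it can be modelled in a few lines of C. Kconfig orders the values n < m < y and evaluates "!" as 2 - x, "||" as max and "&&" as min; the sketch_ function names are made up for the example.

```c
#include <assert.h>

/* Kconfig tristate values, ordered n < m < y. */
enum { SKETCH_N = 0, SKETCH_M = 1, SKETCH_Y = 2 };

/* Kconfig "!A" is evaluated as 2 - A. */
static int sketch_not(int a)
{
	assert(a >= SKETCH_N && a <= SKETCH_Y);
	return SKETCH_Y - a;
}

/* Kconfig "A || B" is max(A, B). */
static int sketch_or(int a, int b)
{
	return a > b ? a : b;
}

/* Kconfig "A && B" is min(A, B). */
static int sketch_and(int a, int b)
{
	return a < b ? a : b;
}

/* Upper bound that "depends on DEP || !DEP" places on a tristate symbol. */
int sketch_dep_limit(int dep)
{
	return sketch_or(dep, sketch_not(dep));
}

/* Value the dependent symbol ends up with when the user asks for wanted. */
int sketch_effective(int wanted, int dep)
{
	return sketch_and(wanted, sketch_dep_limit(dep));
}
```

The only case where the limit drops below y is DEP=m, so a request for USB_DWC3_QCOM=y with INTERCONNECT=m is reduced to m, while INTERCONNECT=n or =y leaves the driver unrestricted.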


Re: [PATCH v2 2/6] dt-bindings: dma: dw: Add max burst transaction length property

2020-05-14 Thread Vinod Koul
On 12-05-20, 15:38, Andy Shevchenko wrote:
> On Tue, May 12, 2020 at 02:49:46PM +0300, Serge Semin wrote:
> > On Tue, May 12, 2020 at 12:08:04PM +0300, Andy Shevchenko wrote:
> > > On Tue, May 12, 2020 at 12:35:31AM +0300, Serge Semin wrote:
> > > > On Tue, May 12, 2020 at 12:01:38AM +0300, Andy Shevchenko wrote:
> > > > > On Mon, May 11, 2020 at 11:05:28PM +0300, Serge Semin wrote:
> > > > > > On Fri, May 08, 2020 at 02:12:42PM +0300, Andy Shevchenko wrote:
> > > > > > > On Fri, May 08, 2020 at 01:53:00PM +0300, Serge Semin wrote:
> 
> ...
> 
> I leave it to Rob and Vinod.
> It won't break our case, so, feel free with your approach.

I agree the DT is about describing the hardware, and it looks like a value of
1 is not allowed. If it is allowed, it should be added.

> P.S. Perhaps at some point we need to
> 1) convert properties to be u32 (it will simplify things);
> 2) convert legacy ones to proper format ('-' instead of '_', vendor prefix 
> added);
> 3) parse them in core with device property API.

These suggestions are good and should be done.

-- 
~Vinod


Re: [PATCH] crypto: Replace zero-length array with flexible-array

2020-05-14 Thread Herbert Xu
Gustavo A. R. Silva  wrote:
> The current codebase makes use of the zero-length array language
> extension to the C90 standard, but the preferred mechanism to declare
> variable-length types such as these ones is a flexible array member[1][2],
> introduced in C99:
> 
> struct foo {
>int stuff;
>struct boo array[];
> };
> 
> By making use of the mechanism above, we will get a compiler warning
> in case the flexible array does not occur last in the structure, which
> will help us prevent some kind of undefined behavior bugs from being
> inadvertently introduced[3] to the codebase from now on.
> 
> Also, notice that, dynamic memory allocations won't be affected by
> this change:
> 
> "Flexible array members have incomplete type, and so the sizeof operator
> may not be applied. As a quirk of the original implementation of
> zero-length arrays, sizeof evaluates to zero."[1]
> 
> sizeof(flexible-array-member) triggers a warning because flexible array
> members have incomplete type[1]. There are some instances of code in
> which the sizeof operator is being incorrectly/erroneously applied to
> zero-length arrays and the result is zero. Such instances may be hiding
> some bugs. So, this work (flexible-array member conversions) will also
> help to get completely rid of those sorts of issues.
> 
> This issue was found with the help of Coccinelle.
> 
> [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
> [2] https://github.com/KSPP/linux/issues/21
> [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
> 
> Signed-off-by: Gustavo A. R. Silva 
> ---
> drivers/crypto/chelsio/chcr_crypto.h |8 
> 1 file changed, 4 insertions(+), 4 deletions(-)

Please resend via netdev.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
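
A small userspace sketch of a C99 flexible array member and its allocation follows. The struct and helper names are hypothetical; kernel code would normally size such an allocation with the struct_size() helper rather than open-coded arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_msg {
	size_t count;
	int items[];	/* flexible array member: must be the last member */
};

/* Allocates a zeroed message with room for count items, or returns NULL. */
struct sketch_msg *sketch_msg_alloc(size_t count)
{
	struct sketch_msg *m;

	/* Reject sizes whose byte count would overflow size_t. */
	if (count > (SIZE_MAX - sizeof(*m)) / sizeof(m->items[0]))
		return NULL;

	/*
	 * sizeof(*m) does not include the flexible array, so the element
	 * storage is added explicitly.
	 */
	m = calloc(1, sizeof(*m) + count * sizeof(m->items[0]));
	if (m)
		m->count = count;

	return m;
}

/* Bounds-checked read of one item. */
int sketch_msg_get(const struct sketch_msg *m, size_t i)
{
	assert(m != NULL && i < m->count);
	return m->items[i];
}
```

Moving items away from the end of the struct makes the compiler reject the declaration, which is the diagnostic the commit message refers to.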


Re: [PATCH] perf evsel: Get group fd from CPU0 for system wide event

2020-05-14 Thread Jin, Yao

Hi Jiri,

On 5/9/2020 3:37 PM, Jin, Yao wrote:

Hi Jiri,

On 5/5/2020 8:03 AM, Jiri Olsa wrote:

On Sat, May 02, 2020 at 10:33:59AM +0800, Jin, Yao wrote:

SNIP


@@ -1461,6 +1461,9 @@ static int get_group_fd(struct evsel *evsel, int cpu, int 
thread)
   BUG_ON(!leader->core.fd);
   fd = FD(leader, cpu, thread);
+    if (fd == -1 && leader->core.system_wide)


fd does not need to be -1 in here.. in my setup cstate_pkg/c2-residency/
has cpumask 0, so other cpus never get open and are 0, and the whole thing
ends up with:

sys_perf_event_open: pid -1  cpu 1  group_fd 0  flags 0
sys_perf_event_open failed, error -9

I actually thought we put -1 to fd array but couldn't find it.. perhaps we
should do that




I have tested on two platforms. On the KBL desktop fd is 0 for this case, but
on the cascadelakex server fd is -1, so the BUG_ON(fd == -1) is triggered.


+    fd = FD(leader, 0, thread);
+


so how do we group following events?

    cstate_pkg/c2-residency/ - cpumask 0
    msr/tsc/ - all cpus



Not sure if it's enough to only use cpumask 0 because
cstate_pkg/c2-residency/ should be per-socket.


cpu 0 is fine.. the rest I have no idea ;-)



Perhaps we directly remove the BUG_ON(fd == -1) assertion?


I think we need to make clear how to deal with grouping over
events that come from different cpus

so how do we group following events?

   cstate_pkg/c2-residency/ - cpumask 0
   msr/tsc/ - all cpus


what's the reason/expected output of groups with above events?
seems to make sense only if we limit msr/tsc/ to cpumask 0 as well

jirka



On a 2-socket machine (e.g. cascadelakex), "cstate_pkg/c2-residency/" is a
per-socket event and the cpumask is 0 and 24.


root@lkp-csl-2sp5 /sys/devices/cstate_pkg# cat cpumask
0,24

We can't limit it to cpumask 0. It should be programmed on CPU0 and CPU24 (the
first CPU on each socket).


"msr/tsc/" is a per-cpu event, so it should be programmed on all cpus. So I
don't think we can limit msr/tsc to cpumask 0.


The issue is how we deal with get_group_fd().

static int get_group_fd(struct evsel *evsel, int cpu, int thread)
{
     struct evsel *leader = evsel->leader;
     int fd;

     if (evsel__is_group_leader(evsel))
     return -1;

     /*
  * Leader must be already processed/open,
  * if not it's a bug.
  */
     BUG_ON(!leader->core.fd);

     fd = FD(leader, cpu, thread);
     BUG_ON(fd == -1);

     return fd;
}

When evsel is "msr/tsc/",

FD(leader, 0, 0) is 3 (3 is the fd of "cstate_pkg/c2-residency/" on CPU0)
FD(leader, 1, 0) is -1
BUG_ON asserted.

If we just return group_fd(-1) for "msr/tsc", it looks like it's not a problem, 
is it?

Thanks
Jin Yao


I think I have found the root cause. It is a serious bug in get_group_fd: an
access violation!

For a group that mixes a system-wide event with per-core events, where the
group leader is the system-wide event, the access violation will happen.


perf_evsel__alloc_fd allocates one FD member for a system-wide event (only
FD(evsel, 0, 0) is valid).

But for a per-core event, perf_evsel__alloc_fd allocates N FD members
(N = ncpus). For example, when ncpus is 8, FD(evsel, 0, 0) to FD(evsel, 7, 0)
are valid.


get_group_fd(struct evsel *evsel, int cpu, int thread)
{
struct evsel *leader = evsel->leader;

fd = FD(leader, cpu, thread);/* access violation may happen here */
}

If leader is system-wide event, only the FD(leader, 0, 0) is valid.

When get_group_fd accesses FD(leader, 1, 0), access violation happens.

My fix is:

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 28683b0eb738..db05b8a1e1a8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1440,6 +1440,9 @@ static int get_group_fd(struct evsel *evsel, int cpu, int 
thread)
if (evsel__is_group_leader(evsel))
return -1;

+   if (leader->core.system_wide && !evsel->core.system_wide)
+   return -2;
+
/*
 * Leader must be already processed/open,
 * if not it's a bug.
@@ -1665,6 +1668,11 @@ static int evsel__open_cpu(struct evsel *evsel, struct 
perf_cpu_map *cpus,
pid = perf_thread_map__pid(threads, thread);

group_fd = get_group_fd(evsel, cpu, thread);
+   if (group_fd == -2) {
+   errno = EINVAL;
+   err = -EINVAL;
+   goto out_close;
+   }
 retry_open:
test_attr__ready();

It enables the perf_evlist__reset_weak_group. And in the second_pass (in __run_perf_stat), the 
events will be opened successfully.


I have tested OK for this fix on cascadelakex.

Thanks
Jin Yao
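
The out-of-bounds read described above, and the guard proposed in the fix, can be modelled in userspace as follows. This is a simplified sketch: the struct layout, the sketch_ names and the flat one-dimensional fd array are assumptions for the example and do not match perf's struct evsel or its FD() macro.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NO_GROUP    (-1)	/* evsel is its own leader */
#define SKETCH_MIXED_GROUP (-2)	/* leader has fewer fd slots than evsel */

struct sketch_evsel {
	const struct sketch_evsel *leader;	/* itself for a group leader */
	int system_wide;			/* as in the proposed check */
	int nr_fds;				/* valid entries in fd[] */
	const int *fd;				/* one fd per opened cpu */
};

/* Returns the leader's fd for cpu, or one of the negative codes above. */
int sketch_get_group_fd(const struct sketch_evsel *evsel, int cpu)
{
	const struct sketch_evsel *leader;

	assert(evsel != NULL && evsel->leader != NULL);
	leader = evsel->leader;

	if (leader == evsel)
		return SKETCH_NO_GROUP;

	/*
	 * Mixed group: the leader has a single fd slot, so indexing it by
	 * an arbitrary cpu would read past the end of its array.
	 */
	if (leader->system_wide && !evsel->system_wide)
		return SKETCH_MIXED_GROUP;

	assert(cpu >= 0 && cpu < leader->nr_fds);
	return leader->fd[cpu];
}
```

In this model a caller that sees SKETCH_MIXED_GROUP would fail the open with EINVAL, which is what lets the weak-group fallback described in the message take over.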


linux-next: build failure after merge of the tip tree

2020-05-14 Thread Stephen Rothwell
Hi all,

After merging the tip tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

arch/x86/kernel/ftrace.c: In function 'set_ftrace_ops_ro':
arch/x86/kernel/ftrace.c:444:32: error: 'ftrace_epilogue' undeclared (first use 
in this function)
  444 |end_offset = (unsigned long)ftrace_epilogue;
  |^~~

Caused by commit

  0298739b7983 ("x86,ftrace: Fix ftrace_regs_caller() unwind")

from the tip tree interacting with commit

  59566b0b622e ("x86/ftrace: Have ftrace trampolines turn read-only at the end 
of system boot up")

from Linus' tree.

I applied the following merge fix patch (I don't know if this is
correct, but it seemed reasonable):

From: Stephen Rothwell 
Date: Fri, 15 May 2020 15:51:17 +1000
Subject: [PATCH] fixup for "x86/ftrace: Have ftrace trampolines turn read-only
 at the end of system boot up"

Signed-off-by: Stephen Rothwell 
---
 arch/x86/kernel/ftrace.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index f8917a6f25b7..c84d28e90a58 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -441,7 +441,7 @@ void set_ftrace_ops_ro(void)
end_offset = (unsigned long)ftrace_regs_caller_end;
} else {
start_offset = (unsigned long)ftrace_caller;
-   end_offset = (unsigned long)ftrace_epilogue;
+   end_offset = (unsigned long)ftrace_caller_end;
}
size = end_offset - start_offset;
size = size + RET_SIZE + sizeof(void *);
-- 
2.26.2

-- 
Cheers,
Stephen Rothwell



Re: [PATCH] Revert "mmc: sdhci-xenon: add runtime pm support and reimplement standby"

2020-05-14 Thread Jisheng Zhang
On Thu, 14 May 2020 12:18:58 +0200 Ulf Hansson wrote:

> 
> 
> On Thu, 14 May 2020 at 07:45, Jisheng Zhang  
> wrote:
> >
> > On Wed, 13 May 2020 14:15:21 +0200 Ulf Hansson wrote:
> >  
> > >
> > >
> > > On Wed, 13 May 2020 at 11:47, Jisheng Zhang  
> > > wrote:  
> > > >
> > > > This reverts commit a027b2c5fed78851e69fab395b02d127a7759fc7.
> > > >
> > > > The HW supports auto clock gating, so it's useless to do runtime pm
> > > > in software.  
> > >
> > > Runtime PM isn't solely about clock gating. Moreover it manages the  
> >
> > Per my understanding, current xenon rpm implementation is just clock gating.

What's your opinion about this? My point is that the HW can auto clock gate,
so what's the benefit of the current rpm implementation, given that it only
does clock gating? FWICT, when submitting the xenon rpm patch, I don't think
the author compared the power consumption. If the comparison is done, it's
easy to find that the rpm doesn't bring any power consumption benefit at all.

> >  
> > > "pltfm_host->clk", which means even if the controller supports auto
> > > clock gating, gating/ungating the externally provided clock still
> > > makes sense.  
> >
> >clock ---  xenon IP
> >   |___ rpm   |__ HW Auto clock gate
> >
> > Per my understanding, with rpm, both clock and IP is clock gated; while with
> > Auto clock gate, the IP is clock gated. So the only difference is clock itself.
> > Considering the gain(suspect we have power consumption gain, see below), the
> > pay -- 56 LoCs and latency doesn't deserve gain.
> >
> > Even if considering from power consumption POV, sdhci_runtime_suspend_host(),
> > sdhci_runtime_resume_host(), and the retune process could be more than the clock
> > itself.  
> 
> Right.
> 
> The re-tune may be costly, yes. However, whether the re-tune is
> *really* needed actually varies depending on the sdhci variant and the
> SoC. Additionally, re-tune isn't done for all types of (e)MMC/SD/SDIO
> cards.
> 
> I see a few options that you can explore.
> 
> 1. There is no requirement to call sdhci_runtime_suspend|resume_host()
> from sdhci-xenon's ->runtime_suspend|resume() callbacks - if that's
> not explicitly needed. The point is, you can do other things there,
> that suits your variant/SoC better.

Yes, there's no requirement to call sdhci_runtime_suspend|resume_host().
But simply removing the calls would break system suspend. How to handle
this situation?

> 
> 2. Perhaps for embedded eMMCs, with a non-removable slot, the
> re-tuning is costly. If you want to prevent the device from entering
> runtime suspend for that slot, for example, just do an additional
> pm_runtime_get_noresume() during ->probe().
> 
> [...]
> 
> Kind regards
> Uffe



Re: [PATCH 7/9] x86: Add support for function granular KASLR

2020-05-14 Thread Baoquan He
On 04/15/20 at 02:04pm, Kristen Carlson Accardi wrote:
...
> diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
> index 9652d5c2afda..2e108fdc7757 100644
> --- a/arch/x86/boot/compressed/misc.c
> +++ b/arch/x86/boot/compressed/misc.c
> @@ -26,9 +26,6 @@
>   * it is not safe to place pointers in static structures.
>   */
>  
> -/* Macros used by the included decompressor code below. */
> -#define STATIC   static
> -

Here removing STATIC definition might be the reason why the LKP
reported build error to patch 7/9.

>  /*
>   * Use normal definitions of mem*() from string.c. There are already
>   * included header files which expect a definition of memset() and by
> @@ -49,6 +46,8 @@ struct boot_params *boot_params;
>  
>  memptr free_mem_ptr;
>  memptr free_mem_end_ptr;
> +unsigned long malloc_ptr;
> +int malloc_count;
>  
>  static char *vidmem;
>  static int vidport;
> @@ -203,10 +202,20 @@ static void handle_relocations(void *output, unsigned 
> long output_len,
>   if (IS_ENABLED(CONFIG_X86_64))
>   delta = virt_addr - LOAD_PHYSICAL_ADDR;
>  
> - if (!delta) {
> - debug_putstr("No relocation needed... ");
> - return;
> + /*
> +  * it is possible to have delta be zero and
> +  * still have enabled fg kaslr. We need to perform relocations
> +  * for fgkaslr regardless of whether the base address has moved.
> +  */
> + if (!IS_ENABLED(CONFIG_FG_KASLR) || nokaslr) {
> + if (!delta) {
> + debug_putstr("No relocation needed... ");
> + return;
> + }
>   }
> +
> + pre_relocations_cleanup(map);
> +
>   debug_putstr("Performing relocations... ");

I tested this patchset on an x86_64 machine, and it works well. It seems the
debug printing needs a little adjustment.

-   debug_putstr("Performing relocations... ");
+   debug_putstr("\nPerforming relocations... ");


Decompressing Linux... Parsing ELF... 
Parsing ELF section headers... 
Looking for symbols... 
Re-sorting kallsyms ...Performing relocations... 
   ^
Updating exception table...

Re-sorting exception table...

>  
>   /*
> @@ -230,35 +239,106 @@ static void handle_relocations(void *output, unsigned 
> long output_len,
>*/
>   for (reloc = output + output_len - sizeof(*reloc); *reloc; reloc--) {
>   long extended = *reloc;
> + long value;
> +
> + /*
> +  * if using fgkaslr, we might have moved the address
> +  * of the relocation. Check it to see if it needs adjusting
> +  * from the original address.
> +  */
> + (void) adjust_address(&extended);
> +
>   extended += map;
>  
>   ptr = (unsigned long)extended;
>   if (ptr < min_addr || ptr > max_addr)
>   error("32-bit relocation outside of kernel!\n");
>  
> - *(uint32_t *)ptr += delta;
> + value = *(int32_t *)ptr;
> +
> + /*
> +  * If using fgkaslr, the value of the relocation
> +  * might need to be changed because it referred
> +  * to an address that has moved.
> +  */
> + adjust_address(&value);
> +
> + value += delta;
> +
> + *(uint32_t *)ptr = value;
>   }
>  #ifdef CONFIG_X86_64
>   while (*--reloc) {
>   long extended = *reloc;
> + long value;
> + long oldvalue;
> + Elf64_Shdr *s;
> +
> + /*
> +  * if using fgkaslr, we might have moved the address
> +  * of the relocation. Check it to see if it needs adjusting
> +  * from the original address.
> +  */
> + s = adjust_address(&extended);
> +
>   extended += map;
>  
>   ptr = (unsigned long)extended;
>   if (ptr < min_addr || ptr > max_addr)
>   error("inverse 32-bit relocation outside of kernel!\n");
>  
> - *(int32_t *)ptr -= delta;
> + value = *(int32_t *)ptr;
> + oldvalue = value;
> +
> + /*
> +  * If using fgkaslr, these relocs will contain
> +  * relative offsets which might need to be
> +  * changed because it referred
> +  * to an address that has moved.
> +  */
> + adjust_relative_offset(*reloc, &value, s);
> +
> + /*
> +  * only percpu symbols need to have their values adjusted for
> +  * base address kaslr since relative offsets within the .text
> +  * and .text.* sections are ok wrt each other.
> +  */
> + if (is_percpu_addr(*reloc, oldvalue))
> + value -= delta;
> +
> + *(int32_t *)ptr = value;
> 
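
A note on the handle_relocations() hunk quoted above: the early return is
now taken only when fine-grained KASLR is out of the picture. Below is a
minimal userspace sketch of that decision, assuming plain booleans in place
of IS_ENABLED(CONFIG_FG_KASLR) and the nokaslr flag. The helper name
skip_relocations() is hypothetical and does not appear in the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper mirroring the condition in the quoted hunk:
 * relocations may be skipped only when fgkaslr is not in effect
 * (not built in, or disabled via nokaslr) AND the base did not move.
 */
static bool skip_relocations(bool fg_kaslr_built_in, bool nokaslr,
			     unsigned long delta)
{
	if (!fg_kaslr_built_in || nokaslr)
		return delta == 0;

	/* fgkaslr active: sections may have been shuffled, so always relocate. */
	return false;
}
```

With fgkaslr active the function never reports "skip", even for a zero
delta, which matches the comment added by the patch.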

Re: [PATCH v7 2/4] usb: dwc3: qcom: Add interconnect support in dwc3 driver

2020-05-14 Thread Felipe Balbi

Hi,

Georgi Djakov  writes:
> On 5/14/20 20:13, Matthias Kaehlcke wrote:
>> On Thu, May 14, 2020 at 02:30:28PM +0300, Felipe Balbi wrote:
>>> Felipe Balbi  writes:
>>>
>>>> Hi,
>>>>
>>>> Sandeep Maheswaram  writes:
>>>>> +static int dwc3_qcom_interconnect_init(struct dwc3_qcom *qcom)
>>>>> +{
>>>>> + struct device *dev = qcom->dev;
>>>>> + int ret;
>>>>> +
>>>>> + if (!device_is_bound(&qcom->dwc3->dev))
>>>>> + return -EPROBE_DEFER;
>>>>
>>>> this breaks allmodconfig. I'm dropping this series from my queue for
>>>> this merge window.
>>>
>>> Sorry, I meant this patch ;-)
>> 
>> I guess that's due to INTERCONNECT being a module. There is currently a
>
> I believe it's because of this:
> ERROR: modpost: "device_is_bound" [drivers/usb/dwc3/dwc3-qcom.ko] undefined!
>
>> discussion about this  with Viresh and Georgi in response to another
>> automated build failure. Viresh suggests changing CONFIG_INTERCONNECT
>> from tristate to bool, which seems sensible to me given that interconnect
>> is a core subsystem.
>
> The problem you are talking about would arise when INTERCONNECT=m and
> USB_DWC3_QCOM=y and it definitely exists here and could be triggered with
> randconfig build. So I suggest also squashing the diff below.
>
> Thanks,
> Georgi
>
> ---8<---
> diff --git a/drivers/usb/dwc3/Kconfig b/drivers/usb/dwc3/Kconfig
> index 206caa0ea1c6..6661788b1a76 100644
> --- a/drivers/usb/dwc3/Kconfig
> +++ b/drivers/usb/dwc3/Kconfig
> @@ -129,6 +129,7 @@ config USB_DWC3_QCOM
>   tristate "Qualcomm Platform"
>   depends on ARCH_QCOM || COMPILE_TEST
>   depends on EXTCON || !EXTCON
> + depends on INTERCONNECT || !INTERCONNECT

I would prefer to see a patch adding EXPORT_SYMBOL_GPL() to device_is_bound().
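
For context on the "depends on INTERCONNECT || !INTERCONNECT" line quoted
above: in Kconfig's tristate logic n, m and y behave like 0, 1 and 2, "!X"
evaluates to 2 - X and "||" takes the maximum. A small userspace sketch of
that arithmetic follows. The enum and function names are made up for
illustration and are not kernel code.

```c
/* Tristate values as Kconfig orders them: n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig "!": 2 - value. */
static enum tristate tri_not(enum tristate x)
{
	return (enum tristate)(2 - x);
}

/* Kconfig "||": maximum of both operands. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/* Upper limit imposed on a symbol by "depends on X || !X". */
static enum tristate dep_limit(enum tristate x)
{
	return tri_or(x, tri_not(x));
}
```

The limit is y for X=n and X=y but only m for X=m, so with INTERCONNECT=m
the combination USB_DWC3_QCOM=y is ruled out.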

-- 
balbi




Re: [patch V4 part 5 01/31] genirq: Provide irq_enter/exit_rcu()

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> irq_enter()/exit() include the RCU handling. To properly separate the RCU
> handling provide variants which contain only the non-RCU related
> functionality.

Acked-by: Andy Lutomirski 


[PATCH v11 2/2] drm/bridge: anx7625: Add anx7625 MIPI DSI/DPI to DP bridge driver

2020-05-14 Thread Xin Ji
The ANX7625 is an ultra-low power 4K Mobile HD Transmitter designed
for portable devices. It converts MIPI DSI/DPI to DisplayPort 1.3 4K.

The ANX7625 supports both the USB Type-C PD feature and the MIPI DSI/DPI
to DP feature. This driver only enables the MIPI DSI/DPI to DP feature.

Signed-off-by: Xin Ji 
---
 drivers/gpu/drm/bridge/Makefile   |2 +-
 drivers/gpu/drm/bridge/analogix/Kconfig   |8 +
 drivers/gpu/drm/bridge/analogix/Makefile  |1 +
 drivers/gpu/drm/bridge/analogix/anx7625.c | 1961 +
 drivers/gpu/drm/bridge/analogix/anx7625.h |  397 ++
 5 files changed, 2368 insertions(+), 1 deletion(-)
 create mode 100644 drivers/gpu/drm/bridge/analogix/anx7625.c
 create mode 100644 drivers/gpu/drm/bridge/analogix/anx7625.h

diff --git a/drivers/gpu/drm/bridge/Makefile b/drivers/gpu/drm/bridge/Makefile
index 4934fcf..bcd388a 100644
--- a/drivers/gpu/drm/bridge/Makefile
+++ b/drivers/gpu/drm/bridge/Makefile
@@ -12,8 +12,8 @@ obj-$(CONFIG_DRM_SII9234) += sii9234.o
 obj-$(CONFIG_DRM_THINE_THC63LVD1024) += thc63lvd1024.o
 obj-$(CONFIG_DRM_TOSHIBA_TC358764) += tc358764.o
 obj-$(CONFIG_DRM_TOSHIBA_TC358767) += tc358767.o
-obj-$(CONFIG_DRM_ANALOGIX_DP) += analogix/
 obj-$(CONFIG_DRM_I2C_ADV7511) += adv7511/
 obj-$(CONFIG_DRM_TI_SN65DSI86) += ti-sn65dsi86.o
 obj-$(CONFIG_DRM_TI_TFP410) += ti-tfp410.o
+obj-y += analogix/
 obj-y += synopsys/
diff --git a/drivers/gpu/drm/bridge/analogix/Kconfig 
b/drivers/gpu/drm/bridge/analogix/Kconfig
index e930ff9..c772be2 100644
--- a/drivers/gpu/drm/bridge/analogix/Kconfig
+++ b/drivers/gpu/drm/bridge/analogix/Kconfig
@@ -2,3 +2,11 @@
 config DRM_ANALOGIX_DP
tristate
depends on DRM
+
+config DRM_ANALOGIX_ANX7625
+   tristate "Analogix Anx7625 MIPI to DP interface support"
+   depends on DRM
+   depends on OF
+   help
+   ANX7625 is an ultra-low power 4K mobile HD transmitter designed
+   for portable devices. It converts MIPI/DPI to DisplayPort1.3 4K.
diff --git a/drivers/gpu/drm/bridge/analogix/Makefile 
b/drivers/gpu/drm/bridge/analogix/Makefile
index fdbf3fd..b6c4a19 100644
--- a/drivers/gpu/drm/bridge/analogix/Makefile
+++ b/drivers/gpu/drm/bridge/analogix/Makefile
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_DRM_ANALOGIX_ANX7625) += anx7625.o
 analogix_dp-objs := analogix_dp_core.o analogix_dp_reg.o
 obj-$(CONFIG_DRM_ANALOGIX_DP) += analogix_dp.o
diff --git a/drivers/gpu/drm/bridge/analogix/anx7625.c 
b/drivers/gpu/drm/bridge/analogix/anx7625.c
new file mode 100644
index 000..2afa869
--- /dev/null
+++ b/drivers/gpu/drm/bridge/analogix/anx7625.c
@@ -0,0 +1,1961 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright(c) 2020, Analogix Semiconductor. All rights reserved.
+ *
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include "anx7625.h"
+
+/*
+ * There is a synchronization issue when the AP (CPU) and the internal
+ * firmware (OCM) both access I2C registers. To avoid the race, the AP
+ * should access the reserved register of a slave address first whenever
+ * it switches to a different slave address.
+ */
+static int i2c_access_workaround(struct anx7625_data *ctx,
+struct i2c_client *client)
+{
+   u8 offset;
+   struct device *dev = &client->dev;
+   int ret;
+
+   if (client == ctx->last_client)
+   return 0;
+
+   ctx->last_client = client;
+
+   if (client == ctx->i2c.tcpc_client)
+   offset = RSVD_00_ADDR;
+   else if (client == ctx->i2c.tx_p0_client)
+   offset = RSVD_D1_ADDR;
+   else if (client == ctx->i2c.tx_p1_client)
+   offset = RSVD_60_ADDR;
+   else if (client == ctx->i2c.rx_p0_client)
+   offset = RSVD_39_ADDR;
+   else if (client == ctx->i2c.rx_p1_client)
+   offset = RSVD_7F_ADDR;
+   else
+   offset = RSVD_00_ADDR;
+
+   ret = i2c_smbus_write_byte_data(client, offset, 0x00);
+   if (ret < 0)
+   DRM_DEV_ERROR(dev,
+ "fail to access i2c id=%x:%x\n",
+ client->addr, offset);
+
+   return ret;
+}
+
+static int anx7625_reg_read(struct anx7625_data *ctx,
+   struct i2c_client *client, u8 reg_addr)
+{
+   int ret;
+   struct device *dev = &client->dev;
+
+   i2c_access_workaround(ctx, client);
+
+   ret = i2c_smbus_read_byte_data(client, reg_addr);
+   if (ret < 0)
+   DRM_DEV_ERROR(dev, "read i2c fail id=%x:%x\n",
+ client->addr, reg_addr);
+
+   return ret;
+}
+
+static int anx7625_reg_block_read(struct anx7625_data *ctx,
+ struct i2c_client *client,
+ u8 reg_addr, u
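
The i2c_access_workaround() helper quoted above boils down to "remember the
last client and touch a reserved register only when the client changes".
Here is a userspace sketch of just that caching behaviour. The fake_client
and fake_ctx types and the dummy_writes counter are hypothetical stand-ins,
not the driver's real structures, and no I2C traffic happens here.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct i2c_client. */
struct fake_client {
	unsigned char rsvd_offset;	/* reserved register for this client */
};

/* Hypothetical stand-in for struct anx7625_data. */
struct fake_ctx {
	const struct fake_client *last_client;
	unsigned int dummy_writes;	/* counts reserved-register accesses */
};

/*
 * Touch the client's reserved register only when the target client
 * differs from the one used for the previous transfer.
 */
static int access_workaround(struct fake_ctx *ctx,
			     const struct fake_client *client)
{
	if (client == ctx->last_client)
		return 0;

	ctx->last_client = client;
	ctx->dummy_writes++;	/* stands in for i2c_smbus_write_byte_data() */
	return 0;
}
```

Back-to-back accesses to the same client cost no extra write; only a switch
between clients does.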

Re: [PATCH V2] dma: zynqmp_dma: Move list_del inside zynqmp_dma_free_descriptor.

2020-05-14 Thread Vinod Koul
On 06-05-20, 12:28, Rafał Hibner wrote:
> List elements are not formally removed from list during zynqmp_dma_reset.

Applied after fixing subsystem name to dmaengine, thanks
-- 
~Vinod


[PATCH v11 1/2] dt-bindings: drm/bridge: anx7625: MIPI to DP transmitter DT schema

2020-05-14 Thread Xin Ji
anx7625: MIPI to DP transmitter DT schema

Signed-off-by: Xin Ji 
---
 .../bindings/display/bridge/analogix,anx7625.yaml  | 95 ++
 1 file changed, 95 insertions(+)
 create mode 100644 
Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml

diff --git 
a/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml 
b/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
new file mode 100644
index 000..60585a4
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
@@ -0,0 +1,95 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+# Copyright 2019 Analogix Semiconductor, Inc.
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/display/bridge/analogix,anx7625.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: Analogix ANX7625 SlimPort (4K Mobile HD Transmitter)
+
+maintainers:
+  - Xin Ji 
+
+description: |
+  The ANX7625 is an ultra-low power 4K Mobile HD Transmitter
+  designed for portable devices.
+
+properties:
+  compatible:
+    items:
+      - const: analogix,anx7625
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    description: used for interrupt pin B8.
+    maxItems: 1
+
+  enable-gpios:
+    description: used for power on chip control, POWER_EN pin D2.
+    maxItems: 1
+
+  reset-gpios:
+    description: used for reset chip control, RESET_N pin B7.
+    maxItems: 1
+
+  ports:
+    type: object
+
+    properties:
+      port@0:
+        type: object
+        description:
+          Video port for MIPI DSI input.
+
+      port@1:
+        type: object
+        description:
+          Video port for panel or connector.
+
+    required:
+      - port@0
+      - port@1
+
+required:
+  - compatible
+  - reg
+  - ports
+
+additionalProperties: false
+
+examples:
+  - |
+    #include 
+
+    i2c0 {
+        #address-cells = <1>;
+        #size-cells = <0>;
+
+        encoder@58 {
+            compatible = "analogix,anx7625";
+            reg = <0x58>;
+            enable-gpios = <&pio 45 GPIO_ACTIVE_HIGH>;
+            reset-gpios = <&pio 73 GPIO_ACTIVE_HIGH>;
+
+            ports {
+                #address-cells = <1>;
+                #size-cells = <0>;
+
+                mipi2dp_bridge_in: port@0 {
+                    reg = <0>;
+                    anx7625_in: endpoint {
+                        remote-endpoint = <&mipi_dsi>;
+                    };
+                };
+
+                mipi2dp_bridge_out: port@1 {
+                    reg = <1>;
+                    anx7625_out: endpoint {
+                        remote-endpoint = <&panel_in>;
+                    };
+                };
+            };
+        };
+    };
-- 
2.7.4



[PATCH v11 0/2] Add initial support for slimport anx7625

2020-05-14 Thread Xin Ji
Hi all,

The following series adds support for the Slimport ANX7625 transmitter, an
ultra-low power Full-HD 4K MIPI to DP transmitter designed for portable devices.


This is v11. If you find any mistakes, please let me know and I will fix them
in the next version.

Change history:
v11: Fix comments from Rob Herring
 - Update commit message.
 - Remove unused label.

v10: Fix comments from Rob Herring, Daniel.
 - Fix dt_binding_check warning.
 - Update description.

v9: Fix comments from Sam, Nicolas, Daniel
 - Remove extcon interface.
 - Remove DPI support.
 - Fix dt_binding_check complaints.
 - Code clean up and update description.

v8: Fix comments from Nicolas.
 - Fix several coding format issues.
 - Update description.

v7:
 - Fix critical timing(eg:odd hfp/hbp) in "mode_fixup" interface,
   enhance MIPI RX tolerance by setting register MIPI_DIGITAL_ADJ_1 to 0x3D.


Xin Ji (2):
  dt-bindings: drm/bridge: anx7625: MIPI to DP transmitter DT schema
  drm/bridge: anx7625: Add anx7625 MIPI DSI/DPI to DP bridge driver

 .../bindings/display/bridge/analogix,anx7625.yaml  |   95 +
 drivers/gpu/drm/bridge/Makefile|2 +-
 drivers/gpu/drm/bridge/analogix/Kconfig|8 +
 drivers/gpu/drm/bridge/analogix/Makefile   |1 +
 drivers/gpu/drm/bridge/analogix/anx7625.c  | 1961 
 drivers/gpu/drm/bridge/analogix/anx7625.h  |  397 
 6 files changed, 2463 insertions(+), 1 deletion(-)
 create mode 100644 
Documentation/devicetree/bindings/display/bridge/analogix,anx7625.yaml
 create mode 100644 drivers/gpu/drm/bridge/analogix/anx7625.c
 create mode 100644 drivers/gpu/drm/bridge/analogix/anx7625.h

-- 
2.7.4



Re: [PATCH] kgdb: Fix broken handling of printk() in NMI context

2020-05-14 Thread Sumit Garg
Hi Petr,

On Thu, 14 May 2020 at 14:13, Petr Mladek  wrote:
>
> On Wed 2020-05-13 19:04:48, Sumit Garg wrote:
> > On Tue, 12 May 2020 at 19:55, Daniel Thompson
> >  wrote:
> > >
> > > On Tue, May 12, 2020 at 02:18:34PM +0530, Sumit Garg wrote:
> > > > Since commit 42a0bb3f7138 ("printk/nmi: generic solution for safe printk
> > > > in NMI"), kgdb entry in NMI context defaults to use safe NMI printk()
> > >
> > > I didn't see the author on Cc: nor any of the folks whose hands it
> > > passed through. It would definitely be good to involve them in this
> > > discussion.
> > >
> >
> > Thanks for updating the Cc: list.
> >
> > >
> > > > which involves CPU specific buffers and deferred printk() until exit from
> > > > NMI context.
> > > >
> > > > But kgdb being a stop-the-world debugger, we don't want to defer printk()
> > > > especially backtrace on corresponding CPUs. So instead switch to normal
> > > > printk() mode in kgdb_cpu_enter() if entry is in NMI context.
> > >
> > > So, firstly I should *definitely* take a mea culpa for not shouting
> > > about this at the time (I was on Cc:... twice). Only thing I can say
> > > confidently is that the test suite didn't yell about this and so I
> > > didn't look at this as closely as I should have done (and that it
> > > didn't yell is mostly because I'm still building out the test suite
> > > coverage).
> > >
> > > Anyhow...
> > >
> > > This feels a little like we are smearing the printk() interception logic
> > > across the kernel in ways that make things hard to read. If we accepted
> > > this patch we then have, the new NMI interception logic, the old kdb
> > > interception logic and some hacks in the kgdb trap handler to defang the
> > > NMI interception logic and force the kdb logic to kick in.
> > >
> > > Wouldn't it be better to migrate kdb interception logic up a couple of
> > > levels so that it continues to function even when we are in nmi printk
> > > mode. That way *all* the printk() interception code would end up in
> > > one place.
> > >
> >
> > Yes it would be better to have all printk() interception code at one
> > place. Let me see if I can come up with an integrated logic.
>
> It might be enough to move the kdb_check from vprintk_default()
> to vprintk_func().
>
> I have never used kdb. I did not know that it was able to stop
> kernel in any context.
>
> Would this work? It is only compile tested!

Thanks for this fix patch. It did resolve the issue and now I am able
to see the backtrace in the kdb shell. So we can go ahead with your
patch and drop mine.

>
> From 14ae6c9f0cbd1479cb898c864c7ab46e20f3cf6f Mon Sep 17 00:00:00 2001
> From: Petr Mladek 
> Date: Thu, 14 May 2020 10:37:44 +0200
> Subject: [PATCH] printk/kdb: Redirect printk messages into kdb in any context
>
> kdb is able to stop kernel even in NMI context where printk() is redirected
> to the printk_safe() lockless variant. Move the check and redirect to kdb
> even in this case.
>
> Signed-off-by: Petr Mladek 
> ---
>  kernel/printk/printk.c  | 14 +-
>  kernel/printk/printk_safe.c |  8 
>  2 files changed, 9 insertions(+), 13 deletions(-)
>

Reported-by: Sumit Garg 
Tested-by: Sumit Garg 

-Sumit

> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 9a9b6156270b..63a1aa377cd9 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -35,7 +35,6 @@
>  #include 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -2036,18 +2035,7 @@ EXPORT_SYMBOL(vprintk);
>
>  int vprintk_default(const char *fmt, va_list args)
>  {
> -   int r;
> -
> -#ifdef CONFIG_KGDB_KDB
> -   /* Allow to pass printk() to kdb but avoid a recursion. */
> -   if (unlikely(kdb_trap_printk && kdb_printf_cpu < 0)) {
> -   r = vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
> -   return r;
> -   }
> -#endif
> -   r = vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
> -
> -   return r;
> +   return vprintk_emit(0, LOGLEVEL_DEFAULT, NULL, 0, fmt, args);
>  }
>  EXPORT_SYMBOL_GPL(vprintk_default);
>
> diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c
> index d9a659a686f3..81734497c625 100644
> --- a/kernel/printk/printk_safe.c
> +++ b/kernel/printk/printk_safe.c
> @@ -6,6 +6,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -359,6 +360,13 @@ void __printk_safe_exit(void)
>
>  __printf(1, 0) int vprintk_func(const char *fmt, va_list args)
>  {
> +#ifdef CONFIG_KGDB_KDB
> +   /* Allow to pass printk() to kdb but avoid a recursion. */
> +   if (unlikely(kdb_trap_printk && kdb_printf_cpu < 0)) {
> +   return vkdb_printf(KDB_MSGSRC_PRINTK, fmt, args);
> +   }
> +#endif
> +
> /*
>  * Try to use the main logbuf even in NMI. But avoid calling console
>  * drivers that might have their own locks.
> --
> 2.26.1
>
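
To make the ordering in the quoted patch concrete: the kdb check now runs
before any context-specific routing, so it applies in NMI context too, and
the kdb_printf_cpu test keeps kdb's own output from being trapped again.
A userspace sketch of that decision, assuming simple globals in place of
kdb_trap_printk and kdb_printf_cpu (the names trap_printk, printf_cpu and
route_message() are hypothetical):

```c
#include <stdbool.h>

enum sink { SINK_KDB, SINK_LOGBUF };

/* Hypothetical stand-ins for kdb_trap_printk and kdb_printf_cpu. */
static bool trap_printk;
static int printf_cpu = -1;	/* -1: no CPU is inside the kdb printer */

/*
 * Decide where a message goes. The kdb check comes first and ignores
 * the calling context, which is the point of moving it.
 */
static enum sink route_message(bool in_nmi)
{
	(void)in_nmi;	/* deliberately unused: kdb wins in any context */

	if (trap_printk && printf_cpu < 0)
		return SINK_KDB;

	return SINK_LOGBUF;
}
```

When a CPU is already inside the kdb printer (printf_cpu >= 0), the message
falls through to the normal path instead of recursing.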


[PATCH] xhci: Fix log mistake of xhci_start

2020-05-14 Thread jiahao
XHCI_MAX_HALT_USEC is clearly in microseconds, not milliseconds.
Replace 'milliseconds' with 'usec' in the debug message.

Signed-off-by: jiahao 
---
 drivers/usb/host/xhci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index bee5dec..d011472 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -147,7 +147,7 @@ int xhci_start(struct xhci_hcd *xhci)
STS_HALT, 0, XHCI_MAX_HALT_USEC);
if (ret == -ETIMEDOUT)
xhci_err(xhci, "Host took too long to start, "
-   "waited %u microseconds.\n",
+   "waited %u usec.\n",
XHCI_MAX_HALT_USEC);
if (!ret)
/* clear state flags. Including dying, halted or removing */
-- 
2.7.4



Re: [patch V4 part 4 24/24] x86/entry: Convert double fault exception to IDTENTRY_DF

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Convert #DF to IDTENTRY_DF
>   - Implement the C entry point with DEFINE_IDTENTRY_DF
>   - Emit the ASM stub with DECLARE_IDTENTRY_DF on 64bit
>   - Remove the ASM idtentry in 64bit
>   - Adjust the 32bit shim code
>   - Fixup the XEN/PV code
>   - Remove the old prototypes
>
> No functional change.
>

Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 23/24] x86/entry: Provide IDTENTRY_DF

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Provide a separate macro for #DF as this needs to emit paranoid only code
> and has also a special ASM stub in 32bit.

Acked-by: Andy Lutomirski 

but... maybe it would be cleaner just to open-code all of this in the
next patch?  This is a lot of macro to do nothing at all.


Re: [patch V4 part 4 21/24] x86/traps: Address objtool noinstr complaints in #DB

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> The functions invoked from handle_debug() can be instrumented. Tell objtool
> about it.

Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 22/24] x86/mce: Address objtools noinstr complaints

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Mark the relevant functions noinstr, use the plain non-instrumented MSR
> accessors. The only odd part is the instr_begin()/end() pair around the
> indirect machine_check_vector() call as objtool can't figure that out. The
> possible invoked functions are annotated correctly.

Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 20/24] x86/traps: Restructure #DB handling

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Now that there are separate entry points, move the kernel/user_mode specific
> checks into the entry functions so the common handling code does not need
> the extra mode checks. Make the code more readable while at it.

Acked-by: Andy Lutomirski 


RE: [PATCH v7 1/2] PCI: xilinx-cpm: Add YAML schemas for Versal CPM Root Port

2020-05-14 Thread Bharat Kumar Gogada
Hi Rob,

Could you please let us know if you have any comments on this?

Regards,
Bharat

> -----Original Message-----
> From: Bharat Kumar Gogada 
> Sent: Thursday, May 7, 2020 5:29 PM
> To: linux-...@vger.kernel.org; linux-kernel@vger.kernel.org
> Cc: lorenzo.pieral...@arm.com; bhelg...@google.com; r...@kernel.org;
> Ravikiran Gummaluri ; Bharat Kumar Gogada
> 
> Subject: [PATCH v7 1/2] PCI: xilinx-cpm: Add YAML schemas for Versal CPM
> Root Port
> 
> Add YAML schemas documentation for Versal CPM Root Port driver.
> 
> Signed-off-by: Bharat Kumar Gogada 
> ---
>  .../devicetree/bindings/pci/xilinx-versal-cpm.yaml | 105
> +
>  1 file changed, 105 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/pci/xilinx-versal-
> cpm.yaml
> 
> diff --git a/Documentation/devicetree/bindings/pci/xilinx-versal-cpm.yaml
> b/Documentation/devicetree/bindings/pci/xilinx-versal-cpm.yaml
> new file mode 100644
> index 000..5fc5c3f
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/pci/xilinx-versal-cpm.yaml
> @@ -0,0 +1,105 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/pci/xilinx-versal-cpm.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: CPM Host Controller device tree for Xilinx Versal SoCs
> +
> +maintainers:
> +  - Bharat Kumar Gogada 
> +
> +properties:
> +  compatible:
> +    const: xlnx,versal-cpm-host-1.00
> +
> +  "#address-cells":
> +    const: 3
> +
> +  "#size-cells":
> +    const: 2
> +
> +  reg:
> +    items:
> +      - description: Configuration space region and bridge registers.
> +      - description: CPM system level control and status registers.
> +
> +  reg-names:
> +    items:
> +      - const: cfg
> +      - const: cpm_slcr
> +
> +  interrupts:
> +    maxItems: 1
> +
> +  msi-map:
> +    description:
> +      Maps a Requester ID to an MSI controller and associated MSI sideband data.
> +
> +  ranges:
> +    maxItems: 2
> +
> +  "#interrupt-cells":
> +    const: 1
> +
> +  interrupt-map-mask:
> +    description: Standard PCI IRQ mapping properties.
> +
> +  interrupt-map:
> +    description: Standard PCI IRQ mapping properties.
> +
> +  interrupt_controller:
> +    description: Interrupt controller child node.
> +
> +  bus-range:
> +    description: Range of bus numbers associated with this controller.
> +
> +required:
> +  - compatible
> +  - "#address-cells"
> +  - "#size-cells"
> +  - reg
> +  - reg-names
> +  - "#interrupt-cells"
> +  - interrupts
> +  - interrupt-parent
> +  - interrupt-map
> +  - interrupt-map-mask
> +  - ranges
> +  - bus-range
> +  - msi-map
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +
> +versal {
> +   #address-cells = <2>;
> +   #size-cells = <2>;
> +   cpm_pcie: pci@fca1 {
> +   compatible = "xlnx,versal-cpm-host-1.00";
> +   #address-cells = <3>;
> +   #interrupt-cells = <1>;
> +   #size-cells = <2>;
> +   interrupts = <0 72 4>;
> +   interrupt-parent = <&gic>;
> +   interrupt-map-mask = <0 0 0 7>;
> +   interrupt-map = <0 0 0 1 &pcie_intc_0 0>,
> +   <0 0 0 2 &pcie_intc_0 1>,
> +   <0 0 0 3 &pcie_intc_0 2>,
> +   <0 0 0 4 &pcie_intc_0 3>;
> +   bus-range = <0x00 0xff>;
> +   ranges = <0x0200 0x0 0xe000 0x0 0xe000 0x0
> 0x1000>,
> +<0x4300 0x80 0x 0x80 0x 
> 0x0
> 0x8000>;
> +   msi-map = <0x0 &its_gic 0x0 0x1>;
> +   reg = <0x6 0x 0x0 0x1000>,
> + <0x0 0xfca1 0x0 0x1000>;
> +   reg-names = "cfg", "cpm_slcr";
> +   pcie_intc_0: interrupt_controller {
> +   #address-cells = <0>;
> +   #interrupt-cells = <1>;
> +   interrupt-controller ;
> +   };
> +};
> +};
> --
> 2.7.4



Re: [patch V4 part 4 15/24] x86/db: Split out dr6/7 handling

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> From: Peter Zijlstra 
>
> DR6/7 should be handled before nmi_enter() is invoked and restored after
> nmi_exit() to minimize the exposure.
>
> Split it out into helper inlines and bring it into the correct order.

> +*
> +* Entry text is excluded for HW_BP_X and cpu_entry_area, which
> +* includes the entry stack is excluded for everything.
> +*/
> +   get_debugreg(*dr7, 6);
> +   set_debugreg(0, 7);

Fortunately, PeterZ is hiding in a brown paper bag, so I don't have to
comment :)

Other than that:

Acked-by: Andy Lutomirski 


[PATCH v1 3/4] driver core: fw_devlink: Add support for batching fwnode parsing

2020-05-14 Thread Saravana Kannan
The amount of time spent parsing fwnodes of devices can become really
high if the devices are added in a non-ideal order. Worst case can be
O(N^2) when N devices are added. But this can be optimized to O(N) by
adding all the devices and then parsing all their fwnodes in one batch.

This commit adds fw_devlink_pause() and fw_devlink_resume() to allow
doing this.

Signed-off-by: Saravana Kannan 
---
 drivers/base/base.h|   1 +
 drivers/base/core.c| 116 ++---
 drivers/base/dd.c  |   8 +++
 include/linux/fwnode.h |   2 +
 4 files changed, 120 insertions(+), 7 deletions(-)

diff --git a/drivers/base/base.h b/drivers/base/base.h
index 40fb069a8a7e..95c22c0f9036 100644
--- a/drivers/base/base.h
+++ b/drivers/base/base.h
@@ -153,6 +153,7 @@ extern char *make_class_name(const char *name, struct 
kobject *kobj);
 extern int devres_release_all(struct device *dev);
 extern void device_block_probing(void);
 extern void device_unblock_probing(void);
+extern void driver_deferred_probe_force_trigger(void);
 
 /* /sys/devices directory */
 extern struct kset *devices_kset;
diff --git a/drivers/base/core.c b/drivers/base/core.c
index f585d92e09d0..84c569726d75 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -49,6 +49,9 @@ static LIST_HEAD(wait_for_suppliers);
 static DEFINE_MUTEX(wfs_lock);
 static LIST_HEAD(deferred_sync);
 static unsigned int defer_sync_state_count = 1;
+static unsigned int defer_fw_devlink_count;
+static DEFINE_MUTEX(defer_fw_devlink_lock);
+static bool fw_devlink_is_permissive(void);
 
 #ifdef CONFIG_SRCU
 static DEFINE_MUTEX(device_links_lock);
@@ -527,7 +530,7 @@ static void device_link_add_missing_supplier_links(void)
int ret = fwnode_call_int_op(dev->fwnode, add_links, dev);
if (!ret)
list_del_init(&dev->links.needs_suppliers);
-   else if (ret != -ENODEV)
+   else if (ret != -ENODEV || fw_devlink_is_permissive())
dev->links.need_for_probe = false;
}
mutex_unlock(&wfs_lock);
@@ -1177,17 +1180,116 @@ static void fw_devlink_link_device(struct device *dev)
 {
int fw_ret;
 
-   device_link_add_missing_supplier_links();
+   if (!fw_devlink_flags)
+   return;
+
+   mutex_lock(&defer_fw_devlink_lock);
+   if (!defer_fw_devlink_count)
+   device_link_add_missing_supplier_links();
+
+   /*
+* The device's fwnode not having add_links() doesn't affect if other
+* consumers can find this device as a supplier.  So, this check is
+* intentionally placed after device_link_add_missing_supplier_links().
+*/
+   if (!fwnode_has_op(dev->fwnode, add_links))
+   goto out;
 
-   if (fw_devlink_flags && fwnode_has_op(dev->fwnode, add_links)) {
+   /*
+* If fw_devlink is being deferred, assume all devices have mandatory
+* suppliers they need to link to later. Then, when the fw_devlink is
+* resumed, all these devices will get a chance to try and link to any
+* suppliers they have.
+*/
+   if (!defer_fw_devlink_count) {
fw_ret = fwnode_call_int_op(dev->fwnode, add_links, dev);
-   if (fw_ret == -ENODEV && !fw_devlink_is_permissive())
-   device_link_wait_for_mandatory_supplier(dev);
-   else if (fw_ret)
-   device_link_wait_for_optional_supplier(dev);
+   if (fw_ret == -ENODEV && fw_devlink_is_permissive())
+   fw_ret = -EAGAIN;
+   } else {
+   fw_ret = -ENODEV;
}
+
+   if (fw_ret == -ENODEV)
+   device_link_wait_for_mandatory_supplier(dev);
+   else if (fw_ret)
+   device_link_wait_for_optional_supplier(dev);
+
+out:
+   mutex_unlock(&defer_fw_devlink_lock);
 }
 
+/**
+ * fw_devlink_pause - Pause parsing of fwnode to create device links
+ *
+ * Calling this function defers any fwnode parsing to create device links until
+ * fw_devlink_resume() is called. Both these functions are ref counted and the
+ * caller needs to match the calls.
+ *
+ * While fw_devlink is paused:
+ * - Any device that is added won't have its fwnode parsed to create device
+ *   links.
+ * - The probe of the device will also be deferred during this period.
+ * - Any devices that were already added, but waiting for suppliers won't be
+ *   able to link to newly added devices.
+ *
+ * Once fw_devlink_resume():
+ * - All the fwnodes that were not parsed will be parsed.
+ * - All the devices that were deferred probing will be reattempted if they
+ *   aren't waiting for any more suppliers.
+ *
+ * This pair of functions is mainly meant to optimize the parsing of fwnodes
+ * when a lot of devices that need to link to each other are added in a short
+ * interval of time. For example, adding all the top level devices in a system.
+ *
+ * For example, if N de

[PATCH v1 2/4] driver core: Look for waiting consumers only for a fwnode's primary device

2020-05-14 Thread Saravana Kannan
Commit 4dbe191c046e ("driver core: Add device links from fwnode only for
the primary device") skipped linking a fwnode's secondary device to
the suppliers listed in its fwnode.

However, a fwnode's secondary device can't be found using
get_dev_from_fwnode(). So, there's no point in trying to see if devices
waiting for suppliers might want to link to a fwnode's secondary device.

This commit removes that unnecessary step for devices that aren't a
fwnode's primary device and also moves the code to a more appropriate
part of the file.

Signed-off-by: Saravana Kannan 
---
 drivers/base/core.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 2b454aae64b5..f585d92e09d0 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1173,6 +1173,21 @@ static bool fw_devlink_is_permissive(void)
return fw_devlink_flags == DL_FLAG_SYNC_STATE_ONLY;
 }
 
+static void fw_devlink_link_device(struct device *dev)
+{
+   int fw_ret;
+
+   device_link_add_missing_supplier_links();
+
+   if (fw_devlink_flags && fwnode_has_op(dev->fwnode, add_links)) {
+   fw_ret = fwnode_call_int_op(dev->fwnode, add_links, dev);
+   if (fw_ret == -ENODEV && !fw_devlink_is_permissive())
+   device_link_wait_for_mandatory_supplier(dev);
+   else if (fw_ret)
+   device_link_wait_for_optional_supplier(dev);
+   }
+}
+
 /* Device links support end. */
 
 int (*platform_notify)(struct device *dev) = NULL;
@@ -2407,7 +2422,7 @@ int device_add(struct device *dev)
struct device *parent;
struct kobject *kobj;
struct class_interface *class_intf;
-   int error = -EINVAL, fw_ret;
+   int error = -EINVAL;
struct kobject *glue_dir = NULL;
bool is_fwnode_dev = false;
 
@@ -2524,16 +2539,8 @@ int device_add(struct device *dev)
 * waiting consumers can link to it before the driver is bound to the
 * device and the driver sync_state callback is called for this device.
 */
-   device_link_add_missing_supplier_links();
-
-   if (fw_devlink_flags && is_fwnode_dev &&
-   fwnode_has_op(dev->fwnode, add_links)) {
-   fw_ret = fwnode_call_int_op(dev->fwnode, add_links, dev);
-   if (fw_ret == -ENODEV && !fw_devlink_is_permissive())
-   device_link_wait_for_mandatory_supplier(dev);
-   else if (fw_ret)
-   device_link_wait_for_optional_supplier(dev);
-   }
+   if (is_fwnode_dev)
+   fw_devlink_link_device(dev);
 
bus_probe_device(dev);
if (parent)
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH v1 4/4] of: platform: Batch fwnode parsing when adding all top level devices

2020-05-14 Thread Saravana Kannan
The fw_devlink_pause() and fw_devlink_resume() APIs allow batching the
parsing of the device tree nodes when a lot of devices are added. This
will significantly cut down parsing time (as much as 1 second on some
systems). So, use them when adding devices for all the top level device
tree nodes in a system.

Signed-off-by: Saravana Kannan 
---
 drivers/of/platform.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 3371e4a06248..55d719347810 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -538,7 +538,9 @@ static int __init of_platform_default_populate_init(void)
}
 
/* Populate everything else. */
+   fw_devlink_pause();
of_platform_default_populate(NULL, NULL, NULL);
+   fw_devlink_resume();
 
return 0;
 }
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH v1 1/4] driver core: Move code to the right part of the file

2020-05-14 Thread Saravana Kannan
This commit just moves around code to match the general organization of
the file.

Signed-off-by: Saravana Kannan 
---
 drivers/base/core.c | 60 ++---
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index c9045521596f..2b454aae64b5 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1143,6 +1143,36 @@ static void device_links_purge(struct device *dev)
device_links_write_unlock();
 }
 
+static u32 fw_devlink_flags = DL_FLAG_SYNC_STATE_ONLY;
+static int __init fw_devlink_setup(char *arg)
+{
+   if (!arg)
+   return -EINVAL;
+
+   if (strcmp(arg, "off") == 0) {
+   fw_devlink_flags = 0;
+   } else if (strcmp(arg, "permissive") == 0) {
+   fw_devlink_flags = DL_FLAG_SYNC_STATE_ONLY;
+   } else if (strcmp(arg, "on") == 0) {
+   fw_devlink_flags = DL_FLAG_AUTOPROBE_CONSUMER;
+   } else if (strcmp(arg, "rpm") == 0) {
+   fw_devlink_flags = DL_FLAG_AUTOPROBE_CONSUMER |
+  DL_FLAG_PM_RUNTIME;
+   }
+   return 0;
+}
+early_param("fw_devlink", fw_devlink_setup);
+
+u32 fw_devlink_get_flags(void)
+{
+   return fw_devlink_flags;
+}
+
+static bool fw_devlink_is_permissive(void)
+{
+   return fw_devlink_flags == DL_FLAG_SYNC_STATE_ONLY;
+}
+
 /* Device links support end. */
 
 int (*platform_notify)(struct device *dev) = NULL;
@@ -2345,36 +2375,6 @@ static int device_private_init(struct device *dev)
return 0;
 }
 
-static u32 fw_devlink_flags = DL_FLAG_SYNC_STATE_ONLY;
-static int __init fw_devlink_setup(char *arg)
-{
-   if (!arg)
-   return -EINVAL;
-
-   if (strcmp(arg, "off") == 0) {
-   fw_devlink_flags = 0;
-   } else if (strcmp(arg, "permissive") == 0) {
-   fw_devlink_flags = DL_FLAG_SYNC_STATE_ONLY;
-   } else if (strcmp(arg, "on") == 0) {
-   fw_devlink_flags = DL_FLAG_AUTOPROBE_CONSUMER;
-   } else if (strcmp(arg, "rpm") == 0) {
-   fw_devlink_flags = DL_FLAG_AUTOPROBE_CONSUMER |
-  DL_FLAG_PM_RUNTIME;
-   }
-   return 0;
-}
-early_param("fw_devlink", fw_devlink_setup);
-
-u32 fw_devlink_get_flags(void)
-{
-   return fw_devlink_flags;
-}
-
-static bool fw_devlink_is_permissive(void)
-{
-   return fw_devlink_flags == DL_FLAG_SYNC_STATE_ONLY;
-}
-
 /**
  * device_add - add device to device hierarchy.
  * @dev: device.
-- 
2.26.2.761.g0e0b3e54be-goog



[PATCH v1 0/4] Optimize fw_devlink parsing

2020-05-14 Thread Saravana Kannan
When fw_devlink is enabled on hardware with a large number of device
tree nodes, the initial device addition done in
of_platform_default_populate_init() can be very inefficient. This is
because most devices will fail to find all their suppliers when they are
added and will keep trying to parse their device tree nodes and link to
any newly added devices.

This was an item on my TODO list that I'm finally getting around to. On
hardware I'm testing on, this saved 1.216 _seconds_!  Another SoC vendor
was also able to test a similar but hacky patch series and confirmed
that it saved them around 1 second.
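
To illustrate where the time goes, here is a minimal userspace sketch (not
kernel code) of the retry behaviour described above. Everything in it is a
hypothetical model: the supplier table, model_device_add(), model_pause(),
model_resume() and the parse counter are invented for illustration and only
mimic the idea of batching; the real implementation is in the patches of
this series.

```c
#include <stdbool.h>
#include <string.h>

#define NDEV 8

/* Hypothetical topology: device i depends on supplier[i]; -1 means none. */
static const int supplier[NDEV] = { 3, 3, 5, -1, 7, 7, 5, -1 };

static bool added[NDEV];
static bool linked[NDEV];
static bool paused;
static unsigned int parse_count;

/* Model of one firmware node parse: link dev if its supplier exists. */
static void parse_and_link(int dev)
{
	parse_count++;
	if (supplier[dev] < 0 || added[supplier[dev]])
		linked[dev] = true;
}

/* Re-parse every added device that is still waiting for its supplier. */
static void retry_waiting(void)
{
	for (int i = 0; i < NDEV; i++)
		if (added[i] && !linked[i])
			parse_and_link(i);
}

/* While paused, additions are only recorded; no parsing happens. */
static void model_pause(void)
{
	paused = true;
}

static void model_resume(void)
{
	paused = false;
	retry_waiting();	/* one pass now that every device is present */
}

static void model_device_add(int dev)
{
	added[dev] = true;
	if (!paused)
		retry_waiting();	/* also parses the new device itself */
}

/* True once every device has been linked to its supplier. */
static bool all_linked(void)
{
	for (int i = 0; i < NDEV; i++)
		if (!linked[i])
			return false;
	return true;
}

/* Add all devices in index order; return how many parses were needed. */
static unsigned int run_model(bool batch)
{
	memset(added, 0, sizeof(added));
	memset(linked, 0, sizeof(linked));
	parse_count = 0;

	if (batch)
		model_pause();
	for (int i = 0; i < NDEV; i++)
		model_device_add(i);
	if (batch)
		model_resume();

	return parse_count;
}
```

In this model run_model(true) parses each device exactly once, while
run_model(false) re-parses all still-waiting devices on every addition, so
the unbatched count grows faster than the number of devices.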

Thanks,
Saravana
P.S: It took me longer to write the comments than the code!

Saravana Kannan (4):
  driver core: Move code to the right part of the file
  driver core: Look for waiting consumers only for a fwnode's primary
device
  driver core: fw_devlink: Add support for batching fwnode parsing
  of: platform: Batch fwnode parsing when adding all top level devices

 drivers/base/base.h|   1 +
 drivers/base/core.c| 193 -
 drivers/base/dd.c  |   8 ++
 drivers/of/platform.c  |   2 +
 include/linux/fwnode.h |   2 +
 5 files changed, 164 insertions(+), 42 deletions(-)

-- 
2.26.2.761.g0e0b3e54be-goog



drivers/vhost/vhost.c:1014:16: sparse: sparse: cast to restricted __virtio16

2020-05-14 Thread kbuild test robot
Hi Jason,

First bad commit (maybe != root cause):

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   1ae7efb388540adc1653a51a3bc3b2c9cef5ec1a
commit: 20c384f1ea1a0bc7320bc445c72dd02d2970d594 vhost: refine vhost and vringh 
kconfig
date:   6 weeks ago
reproduce:
# apt-get install sparse
# sparse version: v0.6.1-193-gb8fad4bc-dirty
git checkout 20c384f1ea1a0bc7320bc445c72dd02d2970d594
make ARCH=x86_64 allmodconfig
make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 


sparse warnings: (new ones prefixed by >>)

   drivers/vhost/vhost.c:753:17: sparse: sparse: incorrect type in return 
expression (different address spaces) @@expected void [noderef]  * 
@@got n:1> * @@
   drivers/vhost/vhost.c:753:17: sparse:expected void [noderef]  *
   drivers/vhost/vhost.c:753:17: sparse:got void *
   drivers/vhost/vhost.c:753:17: sparse: sparse: incorrect type in return 
expression (different address spaces) @@expected void [noderef]  * 
@@got n:1> * @@
   drivers/vhost/vhost.c:753:17: sparse:expected void [noderef]  *
   drivers/vhost/vhost.c:753:17: sparse:got void *
   drivers/vhost/vhost.c:753:17: sparse: sparse: incorrect type in return 
expression (different address spaces) @@expected void [noderef]  * 
@@got n:1> * @@
   drivers/vhost/vhost.c:753:17: sparse:expected void [noderef]  *
   drivers/vhost/vhost.c:753:17: sparse:got void *
   drivers/vhost/vhost.c:937:16: sparse: sparse: incorrect type in argument 2 
(different address spaces) @@expected void *addr @@got restricted 
__virtio16 [noderef]  *
   drivers/vhost/vhost.c:900:42: sparse: sparse: incorrect type in argument 2 
(different address spaces) @@expected void [noderef]  *addr @@
got n:1> *addr @@
   drivers/vhost/vhost.c:900:42: sparse:expected void [noderef]  
*addr
   drivers/vhost/vhost.c:900:42: sparse:got void *addr
   drivers/vhost/vhost.c:753:17: sparse: sparse: incorrect type in return 
expression (different address spaces) @@expected void [noderef]  * 
@@got n:1> * @@
   drivers/vhost/vhost.c:753:17: sparse:expected void [noderef]  *
   drivers/vhost/vhost.c:753:17: sparse:got void *
   drivers/vhost/vhost.c:922:16: sparse: sparse: incorrect type in argument 2 
(different address spaces) @@expected void *addr @@got restricted 
__virtio16 [noderef] [usertype]  *
   drivers/vhost/vhost.c:900:42: sparse: sparse: incorrect type in argument 2 
(different address spaces) @@expected void [noderef]  *addr @@
got n:1> *addr @@
   drivers/vhost/vhost.c:900:42: sparse:expected void [noderef]  
*addr
   drivers/vhost/vhost.c:900:42: sparse:got void *addr
   drivers/vhost/vhost.c:753:17: sparse: sparse: incorrect type in return 
expression (different address spaces) @@expected void [noderef]  * 
@@got n:1> * @@
   drivers/vhost/vhost.c:753:17: sparse:expected void [noderef]  *
   drivers/vhost/vhost.c:753:17: sparse:got void *
>> drivers/vhost/vhost.c:1014:16: sparse: sparse: cast to restricted __virtio16
   drivers/vhost/vhost.c:1014:16: sparse: sparse: incorrect type in argument 2 
(different address spaces) @@expected void *addr @@got restricted 
__virtio16 [noderef]  *
>> drivers/vhost/vhost.c:1014:16: sparse: sparse: cast to restricted __virtio16
   drivers/vhost/vhost.c:900:42: sparse: sparse: incorrect type in argument 2 
(different address spaces) @@expected void [noderef]  *addr @@
got n:1> *addr @@
   drivers/vhost/vhost.c:900:42: sparse:expected void [noderef]  
*addr
   drivers/vhost/vhost.c:900:42: sparse:got void *addr
   drivers/vhost/vhost.c:753:17: sparse: sparse: incorrect type in return 
expression (different address spaces) @@expected void [noderef]  * 
@@got n:1> * @@
   drivers/vhost/vhost.c:753:17: sparse:expected void [noderef]  *
   drivers/vhost/vhost.c:753:17: sparse:got void *
   drivers/vhost/vhost.c:989:16: sparse: sparse: cast to restricted __virtio16
   drivers/vhost/vhost.c:989:16: sparse: sparse: incorrect type in argument 2 
(different address spaces) @@expected void *addr @@got restricted 
__virtio16 [noderef]  *
   drivers/vhost/vhost.c:989:16: sparse: sparse: cast to restricted __virtio16
   drivers/vhost/vhost.c:900:42: sparse: sparse: incorrect type in argument 2 
(different address spaces) @@expected void [noderef]  *addr @@
got n:1> *addr @@
   drivers/vhost/vhost.c:900:42: sparse:expected void [noderef]  
*addr
   drivers/vhost/vhost.c:900:42: sparse:got void *addr
   drivers/vhost/vhost.c:753:17: sparse: sparse: incorrect type in return 
expression (different address spaces) @@expected void [noderef]  * 
@@got n:1> * @@
   drivers/vhost/vhost.c:753:17: sparse:expected void [noderef]  *
   drivers/vhost/vhost.c:753:17: sparse:got void *
  

Re: [patch V4 part 4 19/24] x86/entry: Implement user mode C entry points for #DB and #MCE

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> The MCE entry point uses the same mechanism as the IST entry point for
> now. For #DB split the inner workings and just keep the ist_enter/exit
> magic in the IST variant. Fixup the ASM code to emit the proper
> noist_##cfunc call.
>


Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 18/24] x86/entry: Provide IDTRENTRY_NOIST variants for #DB and #MC

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Provide NOIST entry point macros which allow implementing NOIST variants
> of the C entry points. These are invoked when #DB or #MC enter from user
> space. This allows explicit handling of the difference between user mode
> and kernel mode entry later.


Acked-by: Andy Lutomirski 


RE: [PATCH v7 0/6] Add new series Micron SPI NAND devices

2020-05-14 Thread Poonam Aggrwal
Adding Ashish.

Regards
Poonam

> -Original Message-
> From: Naresh Kamboju 
> Sent: Friday, May 15, 2020 10:57 AM
> To: shiva.linuxwo...@gmail.com; Miquel Raynal ;
> Shivamurthy Shastri 
> Cc: Richard Weinberger ; Vignesh Raghavendra
> ; Boris Brezillon ;
> Chuanhong Guo ; Frieder Schrempf
> ; linux-...@lists.infradead.org; open list 
>  ker...@vger.kernel.org>; Poonam Aggrwal ;
> Suram Suram ; lkft-tri...@lists.linaro.org
> Subject: Re: [PATCH v7 0/6] Add new series Micron SPI NAND devices
> 
> On Wed, 11 Mar 2020 at 23:28,  wrote:
> >
> > From: Shivamurthy Shastri 
> >
> > This patchset is for the new series of Micron SPI NAND devices, and
> > the following links are their datasheets.
> 
> While booting an NXP ls2088 device with the mainline kernel, the following
> NAND warning was noticed. How critical is this warning?
> 
> [1.357722] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0x48
> [1.364085] nand: Micron MT29F16G08ABACAWP
> [1.368181] nand: 2048 MiB, SLC, erase size: 512 KiB, page size:
> 4096, OOB size: 224
> [1.375932] nand: WARNING: 53000.flash: the ECC used on your
> system is too weak compared to the one required by the NAND chip
> 
> [1.388767] Bad block table found at page 524160, version 0x01
> [1.396833] Bad block table found at page 524032, version 0x01
> [1.403781] nand_read_bbt: bad block at 0x02d0
> [1.408921] nand_read_bbt: bad block at 0x02d8
> [1.414750] fsl,ifc-nand 53000.nand: IFC NAND device at
> 0x53000, bank 2
> 
> 
> Full test log,
> https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v5.7-rc5-55-g1ae7efb38854/testrun/18254/log
> 
> - Naresh


linux-next: manual merge of the devicetree tree with the net-next tree

2020-05-14 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the devicetree tree got a conflict in:

  Documentation/devicetree/bindings/net/qcom,ipa.yaml

between commit:

  8456c54408a2 ("dt-bindings: net: add IPA iommus property")

from the net-next tree and commit:

  fba5618451d2 ("dt-bindings: Fix incorrect 'reg' property sizes")

from the devicetree tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc Documentation/devicetree/bindings/net/qcom,ipa.yaml
index 7b749fc04c32,b2ac7606095b..
--- a/Documentation/devicetree/bindings/net/qcom,ipa.yaml
+++ b/Documentation/devicetree/bindings/net/qcom,ipa.yaml
@@@ -171,10 -162,9 +169,10 @@@ examples
  modem-init;
  modem-remoteproc = <&mss_pil>;
  
 +iommus = <&apps_smmu 0x720 0x3>;
- reg = <0 0x1e4 0 0x7000>,
- <0 0x1e47000 0 0x2000>,
- <0 0x1e04000 0 0x2c000>;
+ reg = <0x1e4 0x7000>,
+ <0x1e47000 0x2000>,
+ <0x1e04000 0x2c000>;
  reg-names = "ipa-reg",
  "ipa-shared",
  "gsi";




Re: [patch V4 part 4 17/24] x86/entry/64: Remove error code clearing from #DB and #MCE ASM stub

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> The C entry points do not expect an error code.


Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 16/24] x86/entry: Convert Debug exception to IDTENTRY_DB

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Convert #DB to IDTENTRY_DB:
>   - Implement the C entry point with DEFINE_IDTENTRY_DB
>   - Emit the ASM stub with DECLARE_IDTENTRY
>   - Remove the ASM idtentry in 64bit
>   - Remove the open coded ASM entry code in 32bit
>   - Fixup the XEN/PV code
>   - Remove the old prototypes
>
> No functional change.


Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 14/24] x86/nmi: Protect NMI entry against instrumentation

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Mark all functions in the fragile code parts noinstr or force inlining so
> they can't be instrumented.


Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 13/24] x86/entry: Convert NMI to IDTENTRY_NMI

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Convert #NMI to IDTENTRY_NMI:
>   - Implement the C entry point with DEFINE_IDTENTRY_NMI
>   - Fixup the XEN/PV code
>   - Remove the old prototypes
>
> No functional change.


Acked-by: Andy Lutomirski 


Re: [PATCH v7 0/6] Add new series Micron SPI NAND devices

2020-05-14 Thread Naresh Kamboju
On Wed, 11 Mar 2020 at 23:28,  wrote:
>
> From: Shivamurthy Shastri 
>
> This patchset is for the new series of Micron SPI NAND devices, and the
> following links are their datasheets.

While booting an NXP ls2088 device with the mainline kernel, the following
NAND warning was noticed. How critical is this warning?

[1.357722] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0x48
[1.364085] nand: Micron MT29F16G08ABACAWP
[1.368181] nand: 2048 MiB, SLC, erase size: 512 KiB, page size:
4096, OOB size: 224
[1.375932] nand: WARNING: 53000.flash: the ECC used on your
system is too weak compared to the one required by the NAND chip

[1.388767] Bad block table found at page 524160, version 0x01
[1.396833] Bad block table found at page 524032, version 0x01
[1.403781] nand_read_bbt: bad block at 0x02d0
[1.408921] nand_read_bbt: bad block at 0x02d8
[1.414750] fsl,ifc-nand 53000.nand: IFC NAND device at
0x53000, bank 2


Full test log,
https://qa-reports.linaro.org/lkft/linux-mainline-oe/build/v5.7-rc5-55-g1ae7efb38854/testrun/18254/log

- Naresh


Re: [patch V4 part 4 12/24] x86/idtentry: Provide IDTENTRY_XEN for XEN/PV

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> XEN/PV has special wrappers for NMI and DB exceptions. They redirect these
> exceptions through regular IDTENTRY points. Provide the necessary IDTENTRY
> macros to make this work


Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 10/24] x86/entry: Convert Machine Check to IDTENTRY_IST

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> Convert #MC to IDTENTRY_MCE:
>   - Implement the C entry points with DEFINE_IDTENTRY_MCE
>   - Emit the ASM stub with DECLARE_IDTENTRY_MCE
>   - Remove the ASM idtentry in 64bit
>   - Remove the open coded ASM entry code in 32bit
>   - Fixup the XEN/PV code
>   - Remove the old prototypes
>   - Remove the error code from *machine_check_vector() as
> it is always 0 and not used by any of the functions
> it can point to. Fixup all the functions as well.
>
> No functional change.


Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 11/24] x86/mce: Use untraced rd/wrmsr in the MCE offline/crash check

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> mce_check_crashing_cpu() is called right at the entry of the MCE
> handler. It uses mce_rdmsr() and mce_wrmsr() which are wrappers around
> rdmsr() and wrmsr() to handle the MCE error injection mechanism, which is
> pointless in this context, i.e. when the MCE hits an offline CPU or the
> system is already marked crashing.
>
> The MSR access can also be traced, so use the untraceable variants. This
> is also safe vs. XEN paravirt as these MSRs are not affected by XEN PV
> modifications.


Acked-by: Andy Lutomirski 


Re: [patch V4 part 4 09/24] x86/mce: Move nmi_enter/exit() into the entry point

2020-05-14 Thread Andy Lutomirski
On Tue, May 5, 2020 at 7:16 AM Thomas Gleixner  wrote:
>
> There is no reason to have nmi_enter/exit() in the actual MCE
> handlers. Move it to the entry point. This also covers the until now
> uncovered initial handler which only prints.

Acked-by: Andy Lutomirski 


Re: [PATCH v2 1/2] [media] mtk-mdp: add driver to probe mdp components

2020-05-14 Thread Eizan Miyamoto
On Thu, May 7, 2020 at 11:44 PM Enric Balletbo i Serra
 wrote:
>
> Hi Eizan,
>
> On 7/5/20 13:11, Eizan Miyamoto wrote:
> > On Thu, May 7, 2020 at 2:54 AM Enric Balletbo Serra  
> > wrote:
> >>
> >> Hi Eizan,
> >>
> >> Thank you for the patch.
> >>
> >> Missatge de Eizan Miyamoto  del dia dc., 6 de maig
> >> 2020 a les 10:41:
> >>>
> >>> Broadly, this patch (1) adds a driver for various MTK MDP components to
> >>> go alongside the main MTK MDP driver, and (2) hooks them all together
> >>> using the component framework.
> >>>
> >>> (1) Up until now, the MTK MDP driver controls 8 devices in the device
> >>> tree on its own. When running tests for the hardware video decoder, we
> >>> found that the iommus and LARBs were not being properly configured. To
> >>> configure them, a driver for each be added to mtk_mdp_comp so that
> >>> mtk_iommu_add_device() can (eventually) be called from dma_configure()
> >>> inside really_probe().
> >>>
> >>> (2) The integration into the component framework allows us to defer the
> >>> registration with the v4l2 subsystem until all the MDP-related devices
> >>> have been probed, so that the relevant device node does not become
> >>> available until initialization of all the components is complete.
> >>>
> >>> Some notes about how the component framework has been integrated:
> >>>
> >>> - The driver for the rdma0 component serves double duty as the "master"
> >>>   (aggregate) driver as well as a component driver. This is a non-ideal
> >>>   compromise until a better solution is developed. This device is
> >>>   differentiated from the rest by checking for a "mediatek,vpu" property
> >>>   in the device node.
> >>>
> >>> - The list of mdp components remains hard-coded as mtk_mdp_comp_dt_ids[]
> >>>   in mtk_mdp_core.c, and as mtk_mdp_comp_driver_dt_match[] in
> >>>   mtk_mdp_comp.c. This unfortunate duplication of information is
> >>>   addressed in a following patch in this series.
> >>>
> >>> - The component driver calls component_add() for each device that is
> >>>   probed.
> >>>
> >>> - In mtk_mdp_probe (the "master" device), we scan the device tree for
> >>>   any matching nodes against mtk_mdp_comp_dt_ids, and add component
> >>>   matches for them. The match criteria is a matching device node
> >>>   pointer.
> >>>
> >>> - When the set of components devices that have been probed corresponds
> >>>   with the list that is generated by the "master", the callback to
> >>>   mtk_mdp_master_bind() is made, which then calls the component bind
> >>>   functions.
> >>>
> >>> - Inside mtk_mdp_master_bind(), once all the component bind functions
> >>>   have been called, we can then register our device to the v4l2
> >>>   subsystem.
> >>>
> >>> - The call to pm_runtime_enable() in the master device is called after
> >>>   all the components have been registered by their bind() functions
> >>>   called by mtk_mtp_master_bind(). As a result, the list of components
> >>>   will not change while power management callbacks mtk_mdp_suspend()/
> >>>   resume() are accessing the list of components.
> >>>
> >>> Signed-off-by: Eizan Miyamoto 
> >>> ---
> >>>
> >>> Changes in v2: None
> >>>
> >>
> >> Not really true :-)
> >>
> >>>  drivers/media/platform/mtk-mdp/mtk_mdp_comp.c | 150 +--
> >>>  drivers/media/platform/mtk-mdp/mtk_mdp_comp.h |  26 +--
> >>>  drivers/media/platform/mtk-mdp/mtk_mdp_core.c | 176 +-
> >>>  drivers/media/platform/mtk-mdp/mtk_mdp_core.h |   1 +
> >>>  4 files changed, 263 insertions(+), 90 deletions(-)
> >>>
> >>> diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c 
> >>> b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
> >>> index 362fff924aef..5b4d482df778 100644
> >>> --- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
> >>> +++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
> >>> @@ -5,14 +5,53 @@
> >>>   */
> >>>
> >>>  #include 
> >>> +#include 
> >>>  #include 
> >>> -#include 
> >>> +#include 
> >>>  #include 
> >>> +#include 
> >>> +#include 
> >>> +#include 
> >>>  #include 
> >>>  #include 
> >>> +#include 
> >>>
> >>>  #include "mtk_mdp_comp.h"
> >>> -
> >>> +#include "mtk_mdp_core.h"
> >>> +
> >>> +/**
> >>> + * enum mtk_mdp_comp_type - the MDP component
> >>> + * @MTK_MDP_RDMA:  Read DMA
> >>> + * @MTK_MDP_RSZ:   Reszer
> >>> + * @MTK_MDP_WDMA:  Write DMA
> >>> + * @MTK_MDP_WROT:  Write DMA with rotation
> >>> + * @MTK_MDP_COMP_TYPE_MAX: Placeholder for num elems in this enum
> >>> + */
> >>> +enum mtk_mdp_comp_type {
> >>> +   MTK_MDP_RDMA,
> >>> +   MTK_MDP_RSZ,
> >>> +   MTK_MDP_WDMA,
> >>> +   MTK_MDP_WROT,
> >>> +   MTK_MDP_COMP_TYPE_MAX,
> >>> +};
> >>> +
> >>> +static const struct of_device_id mtk_mdp_comp_driver_dt_match[] = {
> >>> +   {
> >>> +   .compatible = "mediatek,mt8173-mdp-rdma",
> >>> +   .data = (void *)MTK_MDP_RDMA
> >>> +   }, {
> >>> +   .compatible = "mediatek,mt8173-mdp-rsz",
> >>>

linux-kernel@vger.kernel.org

2020-05-14 Thread Qian Cai
Lockdep is screwed here in next-20200514 due to "BUG: MAX_LOCKDEP_ENTRIES too 
low". One of the traces below pointed to this linux-next commit,

8c8e824d4ef0 watch_queue: Introduce a non-repeating system-unique superblock ID

which accidentally just showed up in next-20200514 along with,

46896d79c514 watch_queue: Add superblock notifications

I did have here,

CONFIG_SB_NOTIFICATIONS=y
CONFIG_MOUNT_NOTIFICATIONS=y
CONFIG_FSINFO=y

While MAX_LOCKDEP_ENTRIES is 32768, I noticed that one type of lock has a 
lot of entries on its own,

# grep 'type->s_umount_key' /proc/lockdep_chains | wc -l
6979

type->s_umount_key is from alloc_super(),

lockdep_set_class(&s->s_umount, &type->s_umount_key);

Any thoughts before I bury myself bisecting this?

[15323.316234] LTP: starting isofs (isofs.sh)
[15369.549302] LTP: starting fs_fill
[15369.837565] /dev/zero: Can't open blockdev
[15378.107150] EXT4-fs (loop0): mounting ext3 file system using the ext4 
subsystem
[15378.180704] EXT4-fs (loop0): mounted filesystem with ordered data mode. 
Opts: (null)
[15378.191630] ext3 filesystem being mounted at 
/tmp/ltp-M6YHxqgN9o/mIisOB/mntpoint supports timestamps until 2038 (0x7fff)
[15448.581853] EXT4-fs (loop0): mounted filesystem with ordered data mode. 
Opts: (null)
[15448.592515] ext4 filesystem being mounted at 
/tmp/ltp-M6YHxqgN9o/mIisOB/mntpoint supports timestamps until 2038 (0x7fff)
[15482.370368] XFS (loop0): Mounting V5 Filesystem
[15482.413544] XFS (loop0): Ending clean mount
[15482.427896] xfs filesystem being mounted at 
/tmp/ltp-M6YHxqgN9o/mIisOB/mntpoint supports timestamps until 2038 (0x7fff)
[15495.280716] XFS (loop0): Unmounting Filesystem
[15613.223513] LTP: starting binfmt_misc01 (binfmt_misc01.sh)
[15615.253089] BUG: MAX_LOCKDEP_ENTRIES too low!
[15615.258178] turning off the locking correctness validator.
[15615.264402] CPU: 4 PID: 80930 Comm: mount Tainted: G   O  
5.7.0-rc5-next-20200514 #1
[15615.273942] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, 
BIOS A40 07/10/2019
[15615.283218] Call Trace:
[15615.286388]  dump_stack+0xa7/0xea
[15615.290429]  alloc_list_entry.cold.37+0x11/0x18
[15615.295689]  __lock_acquire+0x2aad/0x3260
[15615.300428]  ? register_lock_class+0xb90/0xb90
[15615.305603]  ? __kasan_check_read+0x11/0x20
[15615.310514]  ? mark_lock+0x160/0xfe0
[15615.314814]  ? check_chain_key+0x1df/0x2e0
[15615.319637]  ? print_irqtrace_events+0x110/0x110
[15615.324984]  lock_acquire+0x1a2/0x680
[15615.329373]  ? vfs_generate_unique_id+0x23/0x70
[15615.334632]  ? check_flags.part.28+0x220/0x220
[15615.339806]  ? ktime_get+0xf2/0x150
[15615.344016]  ? lockdep_hardirqs_on+0x1b0/0x2c0
[15615.349190]  ? vfs_generate_unique_id+0x14/0x70
[15615.354453]  ? trace_hardirqs_on+0x3a/0x160
[15615.359366]  _raw_spin_lock+0x2f/0x40
[15615.363754]  ? vfs_generate_unique_id+0x23/0x70
[15615.369014]  vfs_generate_unique_id+0x23/0x70
vfs_generate_unique_id at fs/super.c:1890
[15615.374099]  alloc_super+0x531/0x5b0
alloc_super at fs/super.c:286
[15615.378398]  ? alloc_file.cold.7+0x19/0x19
[15615.383220]  sget_fc+0xb9/0x3a0
sget_fc at fs/super.c:539
[15615.387082]  ? compare_single+0x10/0x10
[15615.391645]  ? bm_get_tree+0x20/0x20 [binfmt_misc]
[15615.397167]  vfs_get_super+0x4e/0x1a0
vfs_get_super at fs/super.c:1197
[15615.401554]  get_tree_single+0x13/0x20
[15615.406031]  bm_get_tree+0x15/0x20 [binfmt_misc]
[15615.411380]  vfs_get_tree+0x54/0x150
[15615.415681]  do_mount+0xef4/0x11b0
[15615.419807]  ? copy_mount_string+0x20/0x20
[15615.424630]  ? __kasan_check_write+0x14/0x20
[15615.429630]  ? _copy_from_user+0x95/0xd0
[15615.434282]  ? memdup_user+0x58/0x90
[15615.438579]  __x64_sys_mount+0x100/0x120
[15615.443231]  do_syscall_64+0xcc/0xaf0
[15615.447617]  ? trace_hardirqs_on_thunk+0x1a/0x1c
[15615.452965]  ? syscall_return_slowpath+0x580/0x580
[15615.458487]  ? entry_SYSCALL_64_after_hwframe+0x3e/0xb3
[15615.464447]  ? trace_hardirqs_off_caller+0x3a/0x150
[15615.470055]  ? trace_hardirqs_off_thunk+0x1a/0x1c
[15615.475489]  entry_SYSCALL_64_after_hwframe+0x49/0xb3
[15615.481271] RIP: 0033:0x7f9721fb79ee
[15615.485572] Code: 48 8b 0d 9d f4 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 
0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 
f0 ff ff 73 01 c3 48 8b 0d 6a f4 2b 00 f7 d8 64 89 01 48
[15615.505155] RSP: 002b:7ffd555d4ac8 EFLAGS: 0246 ORIG_RAX: 
00a5
[15615.513474] RAX: ffda RBX: 5591d34c19c0 RCX: 7f9721fb79ee
[15615.521355] RDX: 5591d34c1ba0 RSI: 5591d34c38c0 RDI: 5591d34c1bc0
[15615.529232] RBP: 7f9722d63184 R08:  R09: 0003
[15615.537113] R10: c0ed R11: 0246 R12: 
[15615.544991] R13: c0ed R14: 5591d34c1bc0 R15: 5591d34c1ba0

=== the second run ===

[28392.003312] LTP: starting read_all_dev (read_all -d /dev -p -q -r 10)
[28392.125739] BUG: MAX_LOCKDEP_ENTRIES t

[RFC] dt-bindings: mailbox: add doorbell support to ARM MHU

2020-05-14 Thread Viresh Kumar
From: Sudeep Holla 

Hi Rob, Arnd and Jassi,

This stuff has been doing the rounds on the mailing list for several years
now with no conclusion agreed by all the parties. Here is another
attempt to get some feedback from everyone involved to close this once
and for all. Your comments will be very much appreciated.

The ARM MHU is defined here in the TRM [1] for your reference, which
states the following:

"The MHU drives the signal using a 32-bit register, with all 32
bits logically ORed together. The MHU provides a set of
registers to enable software to set, clear, and check the status
of each of the bits of this register independently.  The use of
32 bits for each interrupt line enables software to provide more
information about the source of the interrupt. For example, each
bit of the register can be associated with a type of event that
can contribute to raising the interrupt."
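
As a rough illustration of the behaviour the TRM describes, the following
userspace C sketch models one such 32-bit register. It is only a model under
stated assumptions: struct mhu_model and the mhu_model_*() helpers are
hypothetical names, not the register layout or the API of the real driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of one MHU status register. */
struct mhu_model {
	uint32_t stat;
};

/* Set the given bits; all other bits keep their current value. */
static void mhu_model_set(struct mhu_model *m, uint32_t bits)
{
	m->stat |= bits;
}

/* Clear the given bits; all other bits keep their current value. */
static void mhu_model_clear(struct mhu_model *m, uint32_t bits)
{
	m->stat &= ~bits;
}

/* Check whether any of the given bits is currently set. */
static bool mhu_model_test(const struct mhu_model *m, uint32_t bits)
{
	return (m->stat & bits) != 0;
}

/* The single interrupt line is the logical OR of all 32 bits. */
static bool mhu_model_irq(const struct mhu_model *m)
{
	return m->stat != 0;
}
```

With this model, two senders can raise different bits without affecting each
other, and the interrupt stays asserted until the receiver has cleared every
bit that was set.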

On a few other platforms, like qcom, a similar doorbell mechanism is present
with a separate interrupt for each of the bits (that's how I understood
it), while in the case of ARM MHU, there is a single interrupt line for all
the 32 bits. Also in the case of ARM MHU, these registers and interrupts
have 3 copies for different priority levels, i.e. low priority
non-secure, high priority non-secure and secure channels.

For ARM MHU, both the dt bindings and the Linux driver support 3
channels for the different priorities right now and support sending 32
bits of data on every transfer in a locked fashion, i.e. only one transfer
can be done at once and the others have to wait for it to finish first.

Here are the points of view of the parties involved on this subject:

Jassi's viewpoint:

- Virtualization of channels should be discouraged in software based on
  specific usecases of the application. This may invite other mailbox
  driver authors to ask for doing virtualization in their drivers.

- In mailbox's terminology, every channel is equivalent to a signal,
  since there is only one signal generated here by the MHU, there should
  be only one channel per priority level.

- The clients should send data (of just setting 1 bit or many in the 32
  bit word) using the existing mechanism as the delays due to
  serialization shouldn't be significant anyway.

- The driver supports 90% of the users with the current implementation
  and it shouldn't be extended to support doorbell and implement two
  different modes by changing value of #mbox-cells field in bindings.

Sudeep (ARM) and myself as well to some extent:

- The hardware gives us the capability to write the register in
  parallel, i.e. we can write 0x800 and 0x400 together without any
  software locks, and so these 32 bits should be considered as separate
  channels even if only one interrupt is issued by the hardware finally.
  This shouldn't be called virtualization of the channels, as the
  hardware supports this (as clearly mentioned in the TRM) and it takes
  care of handling the signal properly.

- With serialization, if we use only one channel as today at every
  priority, and there are 5 requests to send a signal to the receiver with
  the dvfs request last in the queue (which may be called from the
  scheduler's hot path with fast switching), it unnecessarily needs to
  wait for the first four transfers to finish due to the software
  locking imposed by the mailbox framework. This adds additional delay,
  maybe of only a few ms, which isn't required by the hardware but just by
  the software, and a few ms can be important in the scheduler's hot path.

- With the current approach it isn't possible to assign different bits
  (or doorbell numbers) to clients from DT and the only way of doing
  that without adding new bindings is by extending #mbox-cells to accept
  a value of 2 as done in this patch.
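
To make the doorbell argument above concrete, here is a minimal
user-space sketch of one MHU priority slot. It is only a model under
stated assumptions, not the driver: struct mhu_slot and the mhu_*()
helpers are hypothetical names, and the SET/CLEAR behaviour is reduced
to plain bit operations on a variable standing in for the STAT register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of one MHU priority slot (not the driver). */
struct mhu_slot {
	uint32_t stat;	/* stands in for the STAT register */
};

/* Sender side: writing a mask to SET raises those bits in STAT. */
static void mhu_set(struct mhu_slot *s, uint32_t mask)
{
	s->stat |= mask;
}

/* Receiver side: writing a mask to CLEAR drops those bits in STAT. */
static void mhu_clear(struct mhu_slot *s, uint32_t mask)
{
	s->stat &= ~mask;
}

/* The interrupt line is the logical OR of all 32 STAT bits. */
static bool mhu_irq_asserted(const struct mhu_slot *s)
{
	return s->stat != 0;
}

/* A doorbell (bit 0..31) is busy only while its own bit is still set. */
static bool mhu_doorbell_busy(const struct mhu_slot *s, unsigned int bit)
{
	return (s->stat >> bit) & 1u;
}
```

In this model, ringing 0x800 and 0x400 touches two independent bits, so
neither sender has to wait for the other; the receiver still sees a
single interrupt until every bit has been cleared.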

Jassi and Sudeep, I hope I was able to represent both viewpoints
properly here. Please correct me if I have made a mistake.

That's it. It would be nice to get everyone's views on this now, and on
how it should be handled.

Thanks.

[1] http://infocenter.arm.com/help/topic/com.arm.doc.ddi0515f/DDI0515F_juno_arm_development_platform_soc_trm.pdf
    (section 3.4.4, page 3-38)

Signed-off-by: Sudeep Holla 
Signed-off-by: Viresh Kumar 
---
 .../devicetree/bindings/mailbox/arm-mhu.txt   | 39 ++-
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/mailbox/arm-mhu.txt b/Documentation/devicetree/bindings/mailbox/arm-mhu.txt
index 4971f03f0b33..ba659bcc7109 100644
--- a/Documentation/devicetree/bindings/mailbox/arm-mhu.txt
+++ b/Documentation/devicetree/bindings/mailbox/arm-mhu.txt
@@ -10,6 +10,15 @@ STAT register and the remote clears it after having read the data.
 The last channel is specified to be a 'Secure' resource, hence can't be
 used by Linux running NS.
 
+The MHU drives the interrupt signal using a 32-bit register, with all
+32-bits logically ORed together. It pr

RE: [PATCH V2] pwm: tegra: dynamic clk freq configuration by PWM driver

2020-05-14 Thread Sandipan Patra
Hello Uwe,

Gentle reminder to review my replies.
It will help me push the next, cleaner patch.


Thanks & Regards,
Sandipan

> -Original Message-
> From: Sandipan Patra
> Sent: Thursday, May 7, 2020 1:40 PM
> To: Uwe Kleine-König 
> Cc: Thierry Reding ; robh...@kernel.org; Jonathan
> Hunter ; Bibek Basu ; Laxman
> Dewangan ; linux-...@vger.kernel.org;
> devicet...@vger.kernel.org; linux-te...@vger.kernel.org; linux-
> ker...@vger.kernel.org
> Subject: RE: [PATCH V2] pwm: tegra: dynamic clk freq configuration by PWM
> driver
> 
> Hello,
> 
> > -Original Message-
> > From: Uwe Kleine-König 
> > Sent: Tuesday, May 5, 2020 1:42 AM
> > To: Sandipan Patra 
> > Cc: Thierry Reding ; robh...@kernel.org; Jonathan
> > Hunter ; Bibek Basu ; Laxman
> > Dewangan ; linux-...@vger.kernel.org;
> > devicet...@vger.kernel.org; linux-te...@vger.kernel.org; linux-
> > ker...@vger.kernel.org
> > Subject: Re: [PATCH V2] pwm: tegra: dynamic clk freq configuration by
> > PWM driver
> >
> > External email: Use caution opening links or attachments
> >
> >
> > Hello,
> >
> > On Mon, Apr 20, 2020 at 09:24:03PM +0530, Sandipan Patra wrote:
> > > Added support for dynamic clock freq configuration in pwm kernel driver.
> > > Earlier the pwm driver used to cache boot time clock rate by pwm
> > > clock parent during probe. Hence dynamically changing pwm frequency
> > > was not possible for all the possible ranges. With this change,
> > > dynamic calculation is enabled and it is able to set the requested
> > > period from sysfs knob provided the value is supported by clock source.
> > >
> > > Changes mainly have 2 parts:
> > >   - T186 and later chips [1]
> > >   - T210 and prior chips [2]
> > >
> > > For [1] - Changes implemented to set pwm period dynamically and
> > >   also checks added to allow only if requested period(ns) is
> > >   below or equals to higher range.
> > >
> > > For [2] - Only checks if the requested period(ns) is below or equals
> > >   to higher range defined by max clock limit. The limitation
> > >   in T210 or prior chips are due to the reason of having only
> > >   one pwm-controller supporting multiple channels. But later
> > >   chips have multiple pwm controller instances each having
> > > single channel support.
> > >
> > > Signed-off-by: Sandipan Patra 
> > > ---
> > > V2:
> > > 1. Min period_ns calculation is moved to probe.
> > > 2. Added descriptioins for PWM register bits and regarding behaviour
> > >of the controller when new configuration is applied or pwm is disabled.
> > > 3. Setting period with possible value when supplied period is below limit.
> > > 4. Corrected the earlier code comment:
> > >plus 1 instead of minus 1 during pwm calculation
> > >
> > >  drivers/pwm/pwm-tegra.c | 110
> > > +---
> > >  1 file changed, 94 insertions(+), 16 deletions(-)
> > >
> > > diff --git a/drivers/pwm/pwm-tegra.c b/drivers/pwm/pwm-tegra.c index
> > > d26ed8f..7a36325 100644
> > > --- a/drivers/pwm/pwm-tegra.c
> > > +++ b/drivers/pwm/pwm-tegra.c
> > > @@ -4,8 +4,39 @@
> > >   *
> > >   * Tegra pulse-width-modulation controller driver
> > >   *
> > > - * Copyright (c) 2010, NVIDIA Corporation.
> > > - * Based on arch/arm/plat-mxc/pwm.c by Sascha Hauer
> > > 
> > > + * Copyright (c) 2010-2020, NVIDIA Corporation.
> > > + *
> > > + * Overview of Tegra Pulse Width Modulator Register:
> > > + * 1. 13-bit: Frequency division (SCALE)
> > > + * 2. 8-bit : Puls division (DUTY)
> > > + * 3. 1-bit : Enable bit
> > > + *
> > > + * The PWM clock frequency is divided by 256 before subdividing it
> > > + based
> > > + * on the programmable frequency division value to generate the
> > > + required
> > > + * frequency for PWM output. The maximum output frequency that can
> > > + be
> > > + * achieved is (max rate of source clock) / 256.
> > > + * i.e. if source clock rate is 408 MHz, maximum output frequency cab be:
> >
> > s/i.e./e.g./, s/cab/can/
> 
> Noted, correction in next patch.
> 
> >
> > > + * 408 MHz/256 = 1.6 MHz.
> > > + * This 1.6 MHz frequency can further be divided using SCALE value in 
> > > PWM.
> > > + *
> > > + * PWM pulse width: 8 bits are usable [23:16] for varying pulse width.
> > > + * To achieve 100% duty cycle, program Bit [24] of this register to
> > > + * 1’b1. In which case the other bits [23:16] are set to don't care.
> > > + *
> > > + * Limitations and known facts:
> >
> > Please use "Limitations:" here to make this easier greppable.
> 
> Will update in next patch.
> 
> >
> > > + * - When PWM is disabled, the output is driven to 0.
> >
> > 0 or inactive?
> 
> Yes, Inactive. When it is 0, it is disabled.
> Will update it to "inactive".
> 
> >
> > > + * - It does not allow the current PWM period to complete and
> > > + *   stops abruptly.
> > > + *
> > > + * - If the register is reconfigured while pwm is running,
> >
> > s/pwm/PWM/
> 
> Noted, correction in next patch.
> 
> >
> > > + *

Re: [PATCH v3 1/4] dma-buf: add support for virtio exported objects

2020-05-14 Thread David Stevens
On Thu, May 14, 2020 at 9:30 PM Daniel Vetter  wrote:
> On Thu, May 14, 2020 at 05:19:40PM +0900, David Stevens wrote:
> > Sorry for the duplicate reply, didn't notice this until now.
> >
> > > Just storing
> > > the uuid should be doable (assuming this doesn't change during the
> > > lifetime of the buffer), so no need for a callback.
> >
> > Directly storing the uuid doesn't work that well because of
> > synchronization issues. The uuid needs to be shared between multiple
> > virtio devices with independent command streams, so to prevent races
> > between importing and exporting, the exporting driver can't share the
> > uuid with other drivers until it knows that the device has finished
> > registering the uuid. That requires a round trip to and then back from
> > the device. Using a callback allows the latency from that round trip
> > registration to be hidden.
>
> Uh, that means you actually do something and there's locking involved.
> Makes stuff more complicated, invariant attributes are a lot easier
> generally. Registering that uuid just always doesn't work, and blocking
> when you're exporting?

Registering the id at creation and blocking in gem export is feasible,
but it doesn't work well for systems with a centralized buffer
allocator that doesn't support batch allocations (e.g. gralloc). In
such a system, the round trip latency would almost certainly be
included in the buffer allocation time. At least on the system I'm
working on, I suspect that would add 10s of milliseconds of startup
latency to video pipelines (although I haven't benchmarked the
difference). Doing the blocking as late as possible means most or all
of the latency can be hidden behind other pipeline setup work.

In terms of complexity, I think the synchronization would be basically
the same in either approach, just in different locations. All it would
do is alleviate the need for a callback to fetch the UUID.

-David


Re: [PATCH] optee: don't fail on unsuccessful device enumeration

2020-05-14 Thread Sumit Garg
Hi Volodymyr,

On Fri, 15 May 2020 at 06:32, Volodymyr Babchuk  wrote:
>
> Hi Sumit,
>
> On Thu, 14 May 2020 at 08:38, Sumit Garg  wrote:
> >
> > Hi Volodymyr,
> >
> > On Thu, 14 May 2020 at 06:48, Volodymyr Babchuk  
> > wrote:
> > >
> > > Hi Sumit,
> > >
> > > On Wed, 13 May 2020 at 11:24, Sumit Garg  wrote:
> > > >
> > > > Hi Volodymyr,
> > > >
> > > > On Wed, 13 May 2020 at 13:30, Jens Wiklander 
> > > >  wrote:
> > > > >
> > > > > Hi Volodymyr,
> > > > >
> > > > > On Wed, May 13, 2020 at 2:36 AM Volodymyr Babchuk
> > > > >  wrote:
> > > > > >
> > > > > > optee_enumerate_devices() can fail for multiple of reasons. For
> > > > > > example, I encountered issue when Xen OP-TEE mediator NACKed
> > > > > > PTA_CMD_GET_DEVICES call.
> > > >
> > > > Could you share a detailed description of the issue which you are
> > > > facing? optee_enumerate_devices() is a simple invocation of pseudo TA
> > > > and cases where OP-TEE doesn't provide corresponding pseudo TA are
> > > > handled very well.
> > >
> > > Yes, I did some research and looks like issue is broader, than I
> > > expected.  It is my fault, that I wasn't paying attention to the tee
> > > client support in the kernel.  Basically, it is incompatible with the
> > > virtualization feature. You see, the main issue with virtual machines
> > > is the second stage MMU. Intermediate physical address, that appear to
> > > be contiguous for the kernel may be not contiguous in the real
> > > physical memory due to 2nd stage MMU mappings. This is the reason I
> > > introduced OPTEE_MSG_ATTR_NONCONTIG in the kernel driver.
> > >
> > > But, looks like kernel-side optee client does not use this feature. It
> > > tries to provide SHM buffer as a simple contiguous span of memory. Xen
> > > blocks calls with OPTEE_MSG_ATTR_TYPE_TMEM_*   but without
> > > OPTEE_MSG_ATTR_NONCONTIG , because it can't translate IPAs to PAs for
> > > such buffers. This is why call to  PTA_CMD_GET_DEVICES fails.
> > >
> > > Valid fix would be to use OPTEE_MSG_ATTR_NONCONTIG whenever possible.
> > >
> >
> > Thanks for the detailed analysis. It looks like you are missing the
> > following fix patch in your tree which basically fixed broken
> > tee_shm_alloc() in case dynamic shared memory is enabled (IIRC
> > virtualization only supports dynamic shared memory).
> >
> > commit a249dd200d03791cab23e47571f3e13d9c72af6c
>
> Actually, I have this patch in my tree. So, it does not fixes the
> issue. Which is weird, actually. I'm planning to look deeper into
> this.

AFAICT, the only difference here is that kernel memory is registered
rather than user-space memory. But I am not very conversant with the
Xen environment, so I hope you will be able to find the root cause of
why Xen is blocking this invocation.

>
> >
> > > >
> > > > > > This should not result in driver
> > > > > > initialization error because this is an optional feature.
> > > >
> > > > I wouldn't call it an optional feature as there might be real kernel
> > > > drivers dependent on this enumeration. Also, it is a simple example to
> > > > self test OP-TEE functionality too. So I am not sure how much
> > > > functional OP-TEE would be if this basic TA invocation fails.
> > >
> > > Well, it fixed case when Xen is involved. I think, this is valid
> > > combination, when platform have the newest OP-TEE, but slightly older
> > > kernel. So, imagine that OP-TEE provides PTA_CMD_GET_DEVICES, but
> > > kernel can't use because it uses plain TMEM arguments,which are not
> > > supported in virtualized environment.
> > >
> > > If there are kernel drivers, that depend on this PTA, they would not
> > > work in any case. But at least userspace clients still be able to use
> > > OP-TEE. This is why I call this feature "optional".
> >
> > As you can see above, tee_shm_alloc() being broken in your case was
> > detected via this simple pseudo TA invocation. So IMO, it would be
> > better to keep the existing behaviour as it provides a kind of basic
> > OP-TEE driver runtime self test too. Also, I think it would be a
> > better user experience to have every OP-TEE interface working rather
> > than a partially broken interface.
>
> I can see your point. But I think, that it is good to not to break
> backward- and forward- compatibility. Imagine, that user upgrades
> OP-TEE without changing the kernel. Previously it worked well, but new
> OP-TEE provides new PTA and kernel refuses to load the optee driver
> because driver fails to initialize that PTA.
>
> This is basically what happened with me. Platform that I am using does
> not provide any OP-TEE devices so I assumed that I can safely ignore
> this feature. But, when I flashed the latest OP-TEE build I got dead
> optee driver. This is confusing from a user standpoint. You don't
> expect that firmware upgrade to another minor version will break
> existing setup. My proposed patch at least prints the warning, so user
> would know where to look...

Warning prints aren't much useful in the sense tha

Re: [PATCH v4] Make initramfs honor CONFIG_DEVTMPFS_MOUNT

2020-05-14 Thread Randy Dunlap
On 5/14/20 9:04 PM, Rob Landley wrote:
> FYI I dug up my old https://lkml.org/lkml/2017/9/13/651 and ported it to
> current, because I needed it for a thing.
> 
> From: Rob Landley 
> 
> Make initramfs honor CONFIG_DEVTMPFS_MOUNT, and move
> /dev/console open after devtmpfs mount.
> 
> Add workaround for Debian bug that was copied by Ubuntu.
> 
> Signed-off-by: Rob Landley 
> 

Hi Rob,

You need to send this patch to some maintainer who could merge it.

And it uses the wrong multi-line comment format.

cheers.
-- 
~Randy



[PATCH 0/5] PCI: uniphier: Add features for UniPhier PCIe host controller

2020-05-14 Thread Kunihiko Hayashi
This series adds some features for UniPhier PCIe host controller.

- Add support for PME and AER invoked by MSI interrupt
- Add iATU register view support for PCIe version >= 4.80
- Add an error message when failing to get phy driver

This adds a new function called by the MSI handler in the DesignWare PCIe
framework, which invokes the PME and AER functions after detecting the
cause from SoC-dependent registers.

---

Kunihiko Hayashi (5):
  PCI: dwc: Add msi_host_isr() callback
  PCI: uniphier: Add misc interrupt handler to invoke PME and AER
  dt-bindings: PCI: uniphier: Add iATU register description
  PCI: uniphier: Add iATU register support
  PCI: uniphier: Add error message when failed to get phy

 .../devicetree/bindings/pci/uniphier-pcie.txt  |  1 +
 drivers/pci/controller/dwc/pcie-designware-host.c  |  8 +--
 drivers/pci/controller/dwc/pcie-designware.h   |  1 +
 drivers/pci/controller/dwc/pcie-uniphier.c | 62 +-
 4 files changed, 56 insertions(+), 16 deletions(-)

-- 
2.7.4



[PATCH 4/5] PCI: uniphier: Add iATU register support

2020-05-14 Thread Kunihiko Hayashi
This gets the iATU register area from the reg property. In Synopsys DWC
version 4.80 or later, the iATU register area is separated from the core
register area, so it has to be obtained from DT independently.
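
The optional-resource handling in the hunk below can be sketched in
user space as follows. This is only an illustration under assumptions:
err_ptr()/is_err() mimic the kernel's ERR_PTR()/IS_ERR() convention,
while struct named_region, map_region_byname() and
map_optional_region() are hypothetical stand-ins for the platform
resource lookup and remap calls.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ERRNO 4095

/* User-space imitation of the kernel's error-pointer convention. */
static void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static bool is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for one named "reg" entry in the device tree. */
struct named_region {
	const char *name;
	void *base;
};

/* Look a region up by name; a missing entry yields an error pointer. */
static void *map_region_byname(const struct named_region *regs, size_t n,
			       const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(regs[i].name, name) == 0)
			return regs[i].base;
	return err_ptr(-ENOENT);
}

/* Optional region: any lookup failure collapses to NULL, not an error. */
static void *map_optional_region(const struct named_region *regs, size_t n,
				 const char *name)
{
	void *base = map_region_byname(regs, n, name);

	return is_err(base) ? NULL : base;
}
```

With this shape, a device tree that has no "atu" entry still probes:
the base simply stays NULL and the caller can fall back to whatever it
did before the separate region existed.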

Signed-off-by: Kunihiko Hayashi 
---
 drivers/pci/controller/dwc/pcie-uniphier.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-uniphier.c b/drivers/pci/controller/dwc/pcie-uniphier.c
index 508fc7b..6180d50 100644
--- a/drivers/pci/controller/dwc/pcie-uniphier.c
+++ b/drivers/pci/controller/dwc/pcie-uniphier.c
@@ -461,6 +461,11 @@ static int uniphier_pcie_probe(struct platform_device *pdev)
if (IS_ERR(priv->pci.dbi_base))
return PTR_ERR(priv->pci.dbi_base);
 
+   res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "atu");
+   priv->pci.atu_base = devm_pci_remap_cfg_resource(dev, res);
+   if (IS_ERR(priv->pci.atu_base))
+   priv->pci.atu_base = NULL;
+
res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "link");
priv->base = devm_ioremap_resource(dev, res);
if (IS_ERR(priv->base))
-- 
2.7.4



[PATCH 5/5] PCI: uniphier: Add error message when failed to get phy

2020-05-14 Thread Kunihiko Hayashi
If the phy driver fails to probe, the resulting error can't be
distinguished from other errors. This explicitly prints an error
message when getting the phy fails.

Signed-off-by: Kunihiko Hayashi 
---
 drivers/pci/controller/dwc/pcie-uniphier.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/controller/dwc/pcie-uniphier.c b/drivers/pci/controller/dwc/pcie-uniphier.c
index 6180d50..2bcf394 100644
--- a/drivers/pci/controller/dwc/pcie-uniphier.c
+++ b/drivers/pci/controller/dwc/pcie-uniphier.c
@@ -480,8 +480,10 @@ static int uniphier_pcie_probe(struct platform_device *pdev)
return PTR_ERR(priv->rst);
 
priv->phy = devm_phy_optional_get(dev, "pcie-phy");
-   if (IS_ERR(priv->phy))
+   if (IS_ERR(priv->phy)) {
+   dev_err(dev, "Failed to get phy (%ld)\n", PTR_ERR(priv->phy));
return PTR_ERR(priv->phy);
+   }
 
platform_set_drvdata(pdev, priv);
 
-- 
2.7.4



[PATCH 2/5] PCI: uniphier: Add misc interrupt handler to invoke PME and AER

2020-05-14 Thread Kunihiko Hayashi
The misc interrupts, consisting of PME, AER, and Link events, are handled
by the INTx handler; however, these interrupts should also be handled by
the MSI handler.

This adds the function uniphier_pcie_misc_isr() that handles misc
interrupts and is called from both the INTx and MSI handlers.
This function detects PME and AER interrupts from the status register
and invokes the PME and AER drivers related to INTx or MSI.

This also masks the misc interrupts from INTx if MSI is enabled, and
masks the misc interrupts from MSI if MSI is disabled.
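
The mask selection described above can be checked with a small
user-space sketch. The three bit ranges are taken from the hunk below;
GENMASK32() is an assumed stand-in for the kernel's GENMASK(), and
pcl_rcv_int_value() is a hypothetical helper that only computes the
value the patch writes to PCL_RCV_INT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed user-space equivalent of GENMASK(h, l) for 32-bit values. */
#define GENMASK32(h, l) \
	((uint32_t)((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h)))))

/* Bit ranges as defined in the patch. */
#define PCL_RCV_INT_ALL_INT_MASK	GENMASK32(28, 25)
#define PCL_RCV_INT_ALL_ENABLE		GENMASK32(20, 17)
#define PCL_RCV_INT_ALL_MSI_MASK	GENMASK32(12, 9)

/*
 * Hypothetical helper: always enable the misc interrupts, then mask the
 * delivery path that is not in use (INTx when MSI is on, MSI otherwise).
 */
static uint32_t pcl_rcv_int_value(bool msi_enabled)
{
	uint32_t val = PCL_RCV_INT_ALL_ENABLE;

	if (msi_enabled)
		val |= PCL_RCV_INT_ALL_INT_MASK;
	else
		val |= PCL_RCV_INT_ALL_MSI_MASK;

	return val;
}
```

Exactly one of the two mask fields is set in either case, so a misc
event is only ever reported through a single path.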

Signed-off-by: Kunihiko Hayashi 
---
 drivers/pci/controller/dwc/pcie-uniphier.c | 53 +++---
 1 file changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-uniphier.c b/drivers/pci/controller/dwc/pcie-uniphier.c
index 8fd7bad..508fc7b 100644
--- a/drivers/pci/controller/dwc/pcie-uniphier.c
+++ b/drivers/pci/controller/dwc/pcie-uniphier.c
@@ -44,7 +44,9 @@
 #define PCL_SYS_AUX_PWR_DETBIT(8)
 
 #define PCL_RCV_INT0x8108
+#define PCL_RCV_INT_ALL_INT_MASK   GENMASK(28, 25)
 #define PCL_RCV_INT_ALL_ENABLE GENMASK(20, 17)
+#define PCL_RCV_INT_ALL_MSI_MASK   GENMASK(12, 9)
 #define PCL_CFG_BW_MGT_STATUS  BIT(4)
 #define PCL_CFG_LINK_AUTO_BW_STATUSBIT(3)
 #define PCL_CFG_AER_RC_ERR_MSI_STATUS  BIT(2)
@@ -167,7 +169,15 @@ static void uniphier_pcie_stop_link(struct dw_pcie *pci)
 
 static void uniphier_pcie_irq_enable(struct uniphier_pcie_priv *priv)
 {
-   writel(PCL_RCV_INT_ALL_ENABLE, priv->base + PCL_RCV_INT);
+   u32 val;
+
+   val = PCL_RCV_INT_ALL_ENABLE;
+   if (pci_msi_enabled())
+   val |= PCL_RCV_INT_ALL_INT_MASK;
+   else
+   val |= PCL_RCV_INT_ALL_MSI_MASK;
+
+   writel(val, priv->base + PCL_RCV_INT);
writel(PCL_RCV_INTX_ALL_ENABLE, priv->base + PCL_RCV_INTX);
 }
 
@@ -237,28 +247,48 @@ static const struct irq_domain_ops uniphier_intx_domain_ops = {
.map = uniphier_pcie_intx_map,
 };
 
-static void uniphier_pcie_irq_handler(struct irq_desc *desc)
+static void uniphier_pcie_misc_isr(struct pcie_port *pp)
 {
-   struct pcie_port *pp = irq_desc_get_handler_data(desc);
struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
struct uniphier_pcie_priv *priv = to_uniphier_pcie(pci);
-   struct irq_chip *chip = irq_desc_get_chip(desc);
-   unsigned long reg;
-   u32 val, bit, virq;
+   u32 val, virq;
 
-   /* INT for debug */
val = readl(priv->base + PCL_RCV_INT);
 
if (val & PCL_CFG_BW_MGT_STATUS)
dev_dbg(pci->dev, "Link Bandwidth Management Event\n");
+
if (val & PCL_CFG_LINK_AUTO_BW_STATUS)
dev_dbg(pci->dev, "Link Autonomous Bandwidth Event\n");
-   if (val & PCL_CFG_AER_RC_ERR_MSI_STATUS)
-   dev_dbg(pci->dev, "Root Error\n");
-   if (val & PCL_CFG_PME_MSI_STATUS)
-   dev_dbg(pci->dev, "PME Interrupt\n");
+
+   if (pci_msi_enabled()) {
+   if (val & PCL_CFG_AER_RC_ERR_MSI_STATUS) {
+   dev_dbg(pci->dev, "Root Error Status\n");
+   virq = irq_linear_revmap(pp->irq_domain, 0);
+   generic_handle_irq(virq);
+   }
+
+   if (val & PCL_CFG_PME_MSI_STATUS) {
+   dev_dbg(pci->dev, "PME Interrupt\n");
+   virq = irq_linear_revmap(pp->irq_domain, 0);
+   generic_handle_irq(virq);
+   }
+   }
 
writel(val, priv->base + PCL_RCV_INT);
+}
+
+static void uniphier_pcie_irq_handler(struct irq_desc *desc)
+{
+   struct pcie_port *pp = irq_desc_get_handler_data(desc);
+   struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
+   struct uniphier_pcie_priv *priv = to_uniphier_pcie(pci);
+   struct irq_chip *chip = irq_desc_get_chip(desc);
+   unsigned long reg;
+   u32 val, bit, virq;
+
+   /* misc interrupt */
+   uniphier_pcie_misc_isr(pp);
 
/* INTx */
chained_irq_enter(chip, desc);
@@ -336,6 +366,7 @@ static int uniphier_pcie_host_init(struct pcie_port *pp)
 
 static const struct dw_pcie_host_ops uniphier_pcie_host_ops = {
.host_init = uniphier_pcie_host_init,
+   .msi_host_isr = uniphier_pcie_misc_isr,
 };
 
 static int uniphier_add_pcie_port(struct uniphier_pcie_priv *priv,
-- 
2.7.4



[PATCH 3/5] dt-bindings: PCI: uniphier: Add iATU register description

2020-05-14 Thread Kunihiko Hayashi
In the dt-bindings, the "atu" reg-names entry is required to get the
register space for iATU in Synopsys DWC version 4.80 or later.

Signed-off-by: Kunihiko Hayashi 
---
 Documentation/devicetree/bindings/pci/uniphier-pcie.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/pci/uniphier-pcie.txt b/Documentation/devicetree/bindings/pci/uniphier-pcie.txt
index 1fa2c59..c4b7381 100644
--- a/Documentation/devicetree/bindings/pci/uniphier-pcie.txt
+++ b/Documentation/devicetree/bindings/pci/uniphier-pcie.txt
@@ -16,6 +16,7 @@ Required properties:
 "dbi"- controller configuration registers
 "link"   - SoC-specific glue layer registers
 "config" - PCIe configuration space
+"atu"- iATU registers for DWC version 4.80 or later
 - clocks: A phandle to the clock gate for PCIe glue layer including
the host controller.
 - resets: A phandle to the reset line for PCIe glue layer including
-- 
2.7.4



[PATCH 1/5] PCI: dwc: Add msi_host_isr() callback

2020-05-14 Thread Kunihiko Hayashi
This adds support for an msi_host_isr() callback to implement
SoC-dependent service triggered by MSI.

For example, when an AER interrupt is triggered by MSI, the callback
reads SoC-dependent registers, detects that the interrupt is from AER,
and invokes the AER interrupts related to MSI.
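
The optional-hook pattern described above can be sketched outside the
kernel like this. All names are hypothetical simplifications of the
structures in the hunks below: struct host_ops keeps only the new hook,
and the two counters in struct port exist purely to make the call order
observable.

```c
#include <stddef.h>

struct port;

/* Hypothetical, trimmed-down ops table: only the optional hook is kept. */
struct host_ops {
	void (*msi_host_isr)(struct port *pp);	/* may be NULL */
};

struct port {
	const struct host_ops *ops;
	int soc_events;	/* incremented by the SoC-specific hook */
	int msi_events;	/* incremented by the generic MSI dispatch */
};

/* Stand-in for the generic MSI handling done by the framework. */
static void generic_msi_dispatch(struct port *pp)
{
	pp->msi_events++;
}

/* Chained handler: run the SoC hook first if one is registered. */
static void chained_msi_isr(struct port *pp)
{
	if (pp->ops->msi_host_isr)
		pp->ops->msi_host_isr(pp);

	generic_msi_dispatch(pp);
}

/* Example SoC hook; a real one would read SoC status registers here. */
static void example_soc_isr(struct port *pp)
{
	pp->soc_events++;
}
```

Controllers that leave the hook NULL keep their previous behaviour,
which is why the NULL check sits in the framework and not in each
driver.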

Signed-off-by: Kunihiko Hayashi 
---
 drivers/pci/controller/dwc/pcie-designware-host.c | 8 
 drivers/pci/controller/dwc/pcie-designware.h  | 1 +
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 0f36a92..491b7a8 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -110,13 +110,13 @@ irqreturn_t dw_handle_msi_irq(struct pcie_port *pp)
 static void dw_chained_msi_isr(struct irq_desc *desc)
 {
struct irq_chip *chip = irq_desc_get_chip(desc);
-   struct pcie_port *pp;
+   struct pcie_port *pp = irq_desc_get_handler_data(desc);
 
-   chained_irq_enter(chip, desc);
+   if (pp->ops->msi_host_isr)
+   pp->ops->msi_host_isr(pp);
 
-   pp = irq_desc_get_handler_data(desc);
+   chained_irq_enter(chip, desc);
dw_handle_msi_irq(pp);
-
chained_irq_exit(chip, desc);
 }
 
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 5a18e94..27fee10 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -160,6 +160,7 @@ struct dw_pcie_host_ops {
void (*scan_bus)(struct pcie_port *pp);
void (*set_num_vectors)(struct pcie_port *pp);
int (*msi_host_init)(struct pcie_port *pp);
+   void (*msi_host_isr)(struct pcie_port *pp);
 };
 
 struct pcie_port {
-- 
2.7.4



Re: [PATCH] interconnect: Disallow interconnect core to be built as a module

2020-05-14 Thread Georgi Djakov
On 9/12/19 19:33, Bjorn Andersson wrote:
> On Thu, Aug 29, 2019 at 1:07 AM Viresh Kumar  wrote:
>>
>> Building individual drivers as modules is fine but allowing a core
>> framework to be built as a module makes it really complex and should be
>> avoided.
>>
>> Whatever uses the interconnect core APIs must also be built as a module
>> if interconnect core is built as module, else we will see compilation
>> failures.
>>
>> If another core framework (like cpufreq, clk, etc), that can't be built
>> as module, needs to use interconnect APIs then we will start seeing
>> compilation failures with allmodconfig configurations as the symbols
>> (like of_icc_get()) used in other frameworks will not be available in
>> the built-in image.
>>
>> Disallow the interconnect core to be built as a module to avoid all
>> these issues.

Hi Greg,

We had a discussion [1] a few months back about frameworks being built as
modules. IIUC, you initially expressed some doubts about this patch, so I
wanted to check with you again on this.

While I think that the possibility for a framework core to be a module is a
nice feature, and we should try to be as modular as possible, it seems that
handling dependencies between the different core frameworks becomes difficult
when one of them is tristate.

This of course affects the drivers which use it (every client should express
the dependency in Kconfig as a "depends on framework || !framework") in order
to avoid build failures in the case when framework=m and client=y. However, this
is not a big issue.

But it gets more complex when another framework2 becomes a client of the modular
framework, and especially when framework2 is "select"-ed in Kconfig by its
users. When selects are used in Kconfig, the option is forced without ever
visiting the dependencies. I am not sure what we should do in this case; maybe
we can continue and sprinkle more "depends on framework || !framework" for
every single user which selects framework2. But I believe that this is very
inconvenient.

Well, the above is not impossible, but other frameworks (regulator, clk, reset,
pinctrl, etc.) are solving this problem by just being bool instead of tristate.
This makes life much easier for everyone. So I am wondering if it wouldn't be
more appropriate to use the same approach here too?

Thanks,
Georgi

[1] https://lore.kernel.org/linux-pm/20191107142111.gb109...@kroah.com/

>>
> 
> Reviewed-by: Bjorn Andersson 
> 
>> Signed-off-by: Viresh Kumar 
>> ---
>>  drivers/interconnect/Kconfig | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/interconnect/Kconfig b/drivers/interconnect/Kconfig
>> index bfa4ca3ab7a9..b6ea8f0a6122 100644
>> --- a/drivers/interconnect/Kconfig
>> +++ b/drivers/interconnect/Kconfig
>> @@ -1,6 +1,6 @@
>>  # SPDX-License-Identifier: GPL-2.0-only
>>  menuconfig INTERCONNECT
>> -   tristate "On-Chip Interconnect management support"
>> +   bool "On-Chip Interconnect management support"
>> help
>>   Support for management of the on-chip interconnects.
>>
>> --
>> 2.21.0.rc0.269.g1a574e7a288b
>>


[PATCH v2 8/9] fs/ext4: Introduce DAX inode flag

2020-05-14 Thread ira . weiny
From: Ira Weiny 

Add a flag to preserve FS_XFLAG_DAX in the ext4 inode.

Set the flag to be user visible and changeable.  Set the flag to be
inherited.  Allow applications to change the flag at any time.

Finally, on regular files, flag the inode to not be cached to facilitate
changing S_DAX on the next creation of the inode.
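
The "don't cache" decision described above can be sketched in user
space as follows. This is a model under assumptions, not the ext4 code:
DAX_FL is assumed to be 0x01000000, which is the one bit by which the
new EXT4_FL_USER_VISIBLE mask in this patch (0x715BDFFF) differs from
the old one (0x705BDFFF), and enum dax_mount_mode plus
should_mark_dontcache() are hypothetical names for the mount-option
checks and the flag comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag value: new visible mask XOR old visible mask. */
#define DAX_FL	UINT32_C(0x01000000)

/* Hypothetical model of the tri-state DAX mount option. */
enum dax_mount_mode {
	DAX_MODE_INODE,		/* per-inode flag decides */
	DAX_MODE_ALWAYS,	/* mount option forces DAX on */
	DAX_MODE_NEVER,		/* mount option forces DAX off */
};

/* XOR exposes the bits that differ; masking keeps only the DAX bit. */
static bool dax_flag_changed(uint32_t old_flags, uint32_t new_flags)
{
	return ((old_flags ^ new_flags) & DAX_FL) != 0;
}

/*
 * The inode only needs to be dropped from the cache when the flag can
 * actually change its DAX state: not for directories, and not when the
 * mount option overrides the per-inode flag.
 */
static bool should_mark_dontcache(bool is_dir, enum dax_mount_mode mode,
				  uint32_t old_flags, uint32_t new_flags)
{
	if (is_dir)
		return false;

	if (mode != DAX_MODE_INODE)
		return false;

	return dax_flag_changed(old_flags, new_flags);
}
```

Setting or clearing the flag both count as a change, while rewriting
the same value, or touching unrelated flags, leaves the inode cached.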

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Change from V0:
	Add FS_DAX_FL to include/uapi/linux/fs.h
		to be consistent
	Move ext4_dax_dontcache() to ext4_ioctl_setflags()
		This ensures that it is only set when the flags are going to be
		set and not if there is an error
		Also this sets don't cache in the FS_IOC_SETFLAGS case

Change from RFC:
	use new d_mark_dontcache()
	Allow caching if ALWAYS/NEVER is set
	Rebased to latest Linus master
	Change flag to unused 0x0100
	update ext4_should_enable_dax()
---
 fs/ext4/ext4.h  | 13 +
 fs/ext4/inode.c |  4 +++-
 fs/ext4/ioctl.c | 24 +++-
 include/uapi/linux/fs.h |  1 +
 4 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 6235440e4c39..e8fce4fc50c4 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -415,13 +415,16 @@ struct flex_groups {
 #define EXT4_VERITY_FL 0x0010 /* Verity protected inode */
 #define EXT4_EA_INODE_FL   0x0020 /* Inode used for large EA */
 /* 0x0040 was formerly EXT4_EOFBLOCKS_FL */
+
+#define EXT4_DAX_FL0x0100 /* Inode is DAX */
+
 #define EXT4_INLINE_DATA_FL0x1000 /* Inode has inline data. */
 #define EXT4_PROJINHERIT_FL0x2000 /* Create with parents projid */
 #define EXT4_CASEFOLD_FL   0x4000 /* Casefolded file */
 #define EXT4_RESERVED_FL   0x8000 /* reserved for ext4 lib */
 
-#define EXT4_FL_USER_VISIBLE   0x705BDFFF /* User visible flags */
-#define EXT4_FL_USER_MODIFIABLE0x604BC0FF /* User modifiable flags */
+#define EXT4_FL_USER_VISIBLE   0x715BDFFF /* User visible flags */
+#define EXT4_FL_USER_MODIFIABLE0x614BC0FF /* User modifiable flags */
 
 /* Flags we can manipulate with through EXT4_IOC_FSSETXATTR */
 #define EXT4_FL_XFLAG_VISIBLE  (EXT4_SYNC_FL | \
@@ -429,14 +432,16 @@ struct flex_groups {
 EXT4_APPEND_FL | \
 EXT4_NODUMP_FL | \
 EXT4_NOATIME_FL | \
-EXT4_PROJINHERIT_FL)
+EXT4_PROJINHERIT_FL | \
+EXT4_DAX_FL)
 
 /* Flags that should be inherited by new inodes from their parent. */
 #define EXT4_FL_INHERITED (EXT4_SECRM_FL | EXT4_UNRM_FL | EXT4_COMPR_FL |\
   EXT4_SYNC_FL | EXT4_NODUMP_FL | EXT4_NOATIME_FL |\
   EXT4_NOCOMPR_FL | EXT4_JOURNAL_DATA_FL |\
   EXT4_NOTAIL_FL | EXT4_DIRSYNC_FL |\
-  EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL)
+  EXT4_PROJINHERIT_FL | EXT4_CASEFOLD_FL |\
+  EXT4_DAX_FL)
 
 /* Flags that are appropriate for regular files (all but dir-specific ones). */
 #define EXT4_REG_FLMASK (~(EXT4_DIRSYNC_FL | EXT4_TOPDIR_FL | EXT4_CASEFOLD_FL |\
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 140b1930e2f4..105cf04f7940 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,6 +4400,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_enable_dax(struct inode *inode)
 {
+   unsigned int flags = EXT4_I(inode)->i_flags;
+
if (test_opt2(inode->i_sb, DAX_NEVER))
return false;
if (!S_ISREG(inode->i_mode))
@@ -4418,7 +4420,7 @@ static bool ext4_should_enable_dax(struct inode *inode)
if (test_opt(inode->i_sb, DAX_ALWAYS))
return true;
 
-   return false;
+   return flags & EXT4_DAX_FL;
 }
 
 void ext4_set_inode_flags(struct inode *inode, bool init)
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 145083e8cd1e..d6d018ea8e94 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -292,6 +292,21 @@ static int ext4_ioctl_check_immutable(struct inode *inode, __u32 new_projid,
return 0;
 }
 
+static void ext4_dax_dontcache(struct inode *inode, unsigned int flags)
+{
+   struct ext4_inode_info *ei = EXT4_I(inode);
+
+   if (S_ISDIR(inode->i_mode))
+   return;
+
+   if (test_opt2(inode->i_sb, DAX_NEVER) ||
+   test_opt(inode->i_sb, DAX_ALWAYS))
+   return;
+
+   if ((ei->i_flags ^ flags) & EXT4_DAX_FL)
+   d_mark_dontcache(inode);
+}
+
 static int ext4_ioctl_setflags(struct inode *inode,
   unsigned 

[PATCH v2 0/9] Enable ext4 support for per-file/directory DAX operations

2020-05-14 Thread ira . weiny
From: Ira Weiny 

Enable the same per-file DAX support in ext4 as was done for xfs.  This series
builds on and depends on the V11 series for xfs.[1]

This passes the same xfstests tests as XFS.

The only issue is that this modifies the old mount option parsing code rather
than waiting for the new parsing code to be finalized.

This series starts with 3 fixes which include making Verity and Encrypt truly
mutually exclusive with DAX.  I think these first 3 patches should be picked up
for 5.8 regardless of what is decided regarding the mount parsing.

[1] https://lore.kernel.org/lkml/20200428002142.404144-1-ira.we...@intel.com/

---
Changes from V1:
Fix up mount options
Pick up reviews

To: linux-kernel@vger.kernel.org
Cc: "Darrick J. Wong" 
Cc: Dan Williams 
Cc: Dave Chinner 
Cc: Christoph Hellwig 
Cc: "Theodore Y. Ts'o" 
Cc: Jan Kara 
Cc: linux-e...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org


Ira Weiny (9):
  fs/ext4: Narrow scope of DAX check in setflags
  fs/ext4: Disallow verity if inode is DAX
  fs/ext4: Disallow encryption if inode is DAX
  fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS
  fs/ext4: Update ext4_should_use_dax()
  fs/ext4: Only change S_DAX on inode load
  fs/ext4: Make DAX mount option a tri-state
  fs/ext4: Introduce DAX inode flag
  Documentation/dax: Update DAX enablement for ext4

 Documentation/filesystems/dax.txt         |  6 +-
 Documentation/filesystems/ext4/verity.rst |  7 ++
 Documentation/filesystems/fscrypt.rst     |  4 +-
 fs/ext4/ext4.h                            | 21 --
 fs/ext4/ialloc.c                          |  2 +-
 fs/ext4/inode.c                           | 27 +--
 fs/ext4/ioctl.c                           | 31 ++--
 fs/ext4/super.c                           | 87 ---
 fs/ext4/verity.c                          |  5 +-
 include/uapi/linux/fs.h                   |  1 +
 10 files changed, 144 insertions(+), 47 deletions(-)

-- 
2.25.1



[PATCH v2 9/9] Documentation/dax: Update DAX enablement for ext4

2020-05-14 Thread ira . weiny
From: Ira Weiny 

Update the document to reflect that ext4 and xfs now behave the same.

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes from RFC:
Update with ext2 text...
---
 Documentation/filesystems/dax.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/filesystems/dax.txt b/Documentation/filesystems/dax.txt
index 735fb4b54117..265c4f808dbf 100644
--- a/Documentation/filesystems/dax.txt
+++ b/Documentation/filesystems/dax.txt
@@ -25,7 +25,7 @@ size when creating the filesystem.
 Currently 3 filesystems support DAX: ext2, ext4 and xfs.  Enabling DAX on them
 is different.
 
-Enabling DAX on ext4 and ext2
+Enabling DAX on ext2
 -----------------------------
 
 When mounting the filesystem, use the "-o dax" option on the command line or
@@ -33,8 +33,8 @@ add 'dax' to the options in /etc/fstab.  This works to enable DAX on all files
 within the filesystem.  It is equivalent to the '-o dax=always' behavior below.
 
 
-Enabling DAX on xfs
--------------------
+Enabling DAX on xfs and ext4
+----------------------------
 
 Summary
 -------
-- 
2.25.1



[PATCH v2 7/9] fs/ext4: Make DAX mount option a tri-state

2020-05-14 Thread ira . weiny
From: Ira Weiny 

We add 'always', 'never', and 'inode' (default).  '-o dax' continues to
operate as before, which is equivalent to 'always'.  This new
functionality is limited to ext4 only.

Specifically we introduce a 2nd DAX mount flag EXT4_MOUNT2_DAX_NEVER and set
it and EXT4_MOUNT_DAX_ALWAYS appropriately for the mode.

We also force EXT4_MOUNT2_DAX_NEVER if !CONFIG_FS_DAX.

Finally, EXT4_MOUNT2_DAX_INODE is used solely to detect if the user
specified that option for printing.
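
For illustration only (not part of the patch): a minimal userspace sketch
of the option-to-mode mapping described above.  The enum and function names
are hypothetical stand-ins, not the kernel's identifiers; only the option
strings and the described behaviour come from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the three modes; not the kernel's own types. */
enum dax_mode {
    DAX_MODE_INODE,   /* default: follow the per-inode flag */
    DAX_MODE_ALWAYS,  /* models EXT4_MOUNT_DAX_ALWAYS */
    DAX_MODE_NEVER,   /* models EXT4_MOUNT2_DAX_NEVER */
    DAX_MODE_INVALID  /* unrecognised option string */
};

/* Map an option string to a mode; plain "dax" stays an alias of "dax=always". */
enum dax_mode parse_dax_option(const char *opt)
{
    if (strcmp(opt, "dax") == 0 || strcmp(opt, "dax=always") == 0)
        return DAX_MODE_ALWAYS;
    if (strcmp(opt, "dax=inode") == 0)
        return DAX_MODE_INODE;
    if (strcmp(opt, "dax=never") == 0)
        return DAX_MODE_NEVER;
    return DAX_MODE_INVALID;
}

/* Without DAX support built in (!CONFIG_FS_DAX), a valid mode is forced to never. */
enum dax_mode effective_dax_mode(enum dax_mode requested, bool fs_dax_built_in)
{
    if (requested == DAX_MODE_INVALID)
        return DAX_MODE_INVALID;
    return fs_dax_built_in ? requested : DAX_MODE_NEVER;
}

/* Self-check of the mapping described in the commit message. */
void check_dax_modes(void)
{
    assert(parse_dax_option("dax") == parse_dax_option("dax=always"));
    assert(parse_dax_option("dax=inode") == DAX_MODE_INODE);
    assert(effective_dax_mode(DAX_MODE_ALWAYS, false) == DAX_MODE_NEVER);
}
```

The sketch keeps parsing and the !CONFIG_FS_DAX override separate so each
rule from the description can be checked on its own.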

Signed-off-by: Ira Weiny 

---
Changes from V1:
Fix up mounting options to only show an option if specified
Fix remount to prevent dax changes
Isolate behavior to ext4 only

Changes from RFC:
Combine remount check for DAX_NEVER with DAX_ALWAYS
Update ext4_should_enable_dax()
---
 fs/ext4/ext4.h  |  2 ++
 fs/ext4/inode.c |  2 ++
 fs/ext4/super.c | 67 +
 3 files changed, 61 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 86a0994332ce..6235440e4c39 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1168,6 +1168,8 @@ struct ext4_inode_info {
  blocks */
 #define EXT4_MOUNT2_HURD_COMPAT 0x0004 /* Support HURD-castrated
  file systems */
+#define EXT4_MOUNT2_DAX_NEVER  0x0008 /* Do not allow Direct Access */
+#define EXT4_MOUNT2_DAX_INODE  0x0010 /* For printing options only */
 
 #define EXT4_MOUNT2_EXPLICIT_JOURNAL_CHECKSUM  0x0008 /* User explicitly
specified journal checksum */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 23e42a223235..140b1930e2f4 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,6 +4400,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_enable_dax(struct inode *inode)
 {
+   if (test_opt2(inode->i_sb, DAX_NEVER))
+   return false;
if (!S_ISREG(inode->i_mode))
return false;
if (ext4_should_journal_data(inode))
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 5ec900fdf73c..4753de53b186 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1504,7 +1504,8 @@ enum {
Opt_usrjquota, Opt_grpjquota, Opt_offusrjquota, Opt_offgrpjquota,
Opt_jqfmt_vfsold, Opt_jqfmt_vfsv0, Opt_jqfmt_vfsv1, Opt_quota,
Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
-   Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version, Opt_dax,
+   Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
+   Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
Opt_nowarn_on_error, Opt_mblk_io_submit,
Opt_lazytime, Opt_nolazytime, Opt_debug_want_extra_isize,
@@ -1571,6 +1572,9 @@ static const match_table_t tokens = {
{Opt_nobarrier, "nobarrier"},
{Opt_i_version, "i_version"},
{Opt_dax, "dax"},
+   {Opt_dax_always, "dax=always"},
+   {Opt_dax_inode, "dax=inode"},
+   {Opt_dax_never, "dax=never"},
{Opt_stripe, "stripe=%u"},
{Opt_delalloc, "delalloc"},
{Opt_warn_on_error, "warn_on_error"},
@@ -1718,6 +1722,7 @@ static int clear_qf_name(struct super_block *sb, int qtype)
 #define MOPT_NO_EXT3   0x0200
 #define MOPT_EXT4_ONLY (MOPT_NO_EXT2 | MOPT_NO_EXT3)
 #define MOPT_STRING    0x0400
+#define MOPT_SKIP  0x0800
 
 static const struct mount_opts {
int token;
@@ -1767,7 +1772,13 @@ static const struct mount_opts {
{Opt_min_batch_time, 0, MOPT_GTE0},
{Opt_inode_readahead_blks, 0, MOPT_GTE0},
{Opt_init_itable, 0, MOPT_GTE0},
-   {Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
+   {Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET | MOPT_SKIP},
+   {Opt_dax_always, EXT4_MOUNT_DAX_ALWAYS,
+   MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
+   {Opt_dax_inode, EXT4_MOUNT2_DAX_INODE,
+   MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
+   {Opt_dax_never, EXT4_MOUNT2_DAX_NEVER,
+   MOPT_EXT4_ONLY | MOPT_SET | MOPT_SKIP},
{Opt_stripe, 0, MOPT_GTE0},
{Opt_resuid, 0, MOPT_GTE0},
{Opt_resgid, 0, MOPT_GTE0},
@@ -2076,13 +2087,32 @@ static int handle_mount_opt(struct super_block *sb, char *opt, int token,
}
sbi->s_jquota_fmt = m->mount_opt;
 #endif
-   } else if (token == Opt_dax) {
+   } else if (token == Opt_dax || token == Opt_dax_always ||
+  token == Opt_dax_inode || token == Opt_dax_never) {
 #ifdef CONFIG_FS_DAX
-   ext4_msg(sb, KERN_WARNING,
-   "DAX enabled. Warning: EXPERIMENTAL, use at your own risk");
-   sbi->s_mount_opt |= m->mount_opt;
+   switch (token) {
+   case Opt_dax:
+   case Opt_dax_always:
+  

[PATCH v2 4/9] fs/ext4: Change EXT4_MOUNT_DAX to EXT4_MOUNT_DAX_ALWAYS

2020-05-14 Thread ira . weiny
From: Ira Weiny 

Rename the flag in preparation for the new tri-state mount option, which
then introduces EXT4_MOUNT2_DAX_NEVER.

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes:
New patch
---
 fs/ext4/ext4.h  |  4 ++--
 fs/ext4/inode.c |  2 +-
 fs/ext4/super.c | 12 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 91eb4381cae5..1a3daf2d18ef 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1123,9 +1123,9 @@ struct ext4_inode_info {
 #define EXT4_MOUNT_MINIX_DF    0x00080 /* Mimics the Minix statfs */
 #define EXT4_MOUNT_NOLOAD  0x00100 /* Don't use existing journal*/
 #ifdef CONFIG_FS_DAX
-#define EXT4_MOUNT_DAX 0x00200 /* Direct Access */
+#define EXT4_MOUNT_DAX_ALWAYS  0x00200 /* Direct Access */
 #else
-#define EXT4_MOUNT_DAX 0
+#define EXT4_MOUNT_DAX_ALWAYS  0
 #endif
 #define EXT4_MOUNT_DATA_FLAGS  0x00C00 /* Mode for data writes: */
 #define EXT4_MOUNT_JOURNAL_DATA    0x00400 /* Write data to journal */
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2a4aae6acdcb..a10ff12194db 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4400,7 +4400,7 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
 
 static bool ext4_should_use_dax(struct inode *inode)
 {
-   if (!test_opt(inode->i_sb, DAX))
+   if (!test_opt(inode->i_sb, DAX_ALWAYS))
return false;
if (!S_ISREG(inode->i_mode))
return false;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9873ab27e3fa..d0434b513919 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1767,7 +1767,7 @@ static const struct mount_opts {
{Opt_min_batch_time, 0, MOPT_GTE0},
{Opt_inode_readahead_blks, 0, MOPT_GTE0},
{Opt_init_itable, 0, MOPT_GTE0},
-   {Opt_dax, EXT4_MOUNT_DAX, MOPT_SET},
+   {Opt_dax, EXT4_MOUNT_DAX_ALWAYS, MOPT_SET},
{Opt_stripe, 0, MOPT_GTE0},
{Opt_resuid, 0, MOPT_GTE0},
{Opt_resgid, 0, MOPT_GTE0},
@@ -3974,7 +3974,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 "both data=journal and dioread_nolock");
goto failed_mount;
}
-   if (test_opt(sb, DAX)) {
+   if (test_opt(sb, DAX_ALWAYS)) {
ext4_msg(sb, KERN_ERR, "can't mount with "
 "both data=journal and dax");
goto failed_mount;
@@ -4084,7 +4084,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
goto failed_mount;
}
 
-   if (sbi->s_mount_opt & EXT4_MOUNT_DAX) {
+   if (sbi->s_mount_opt & EXT4_MOUNT_DAX_ALWAYS) {
if (ext4_has_feature_inline_data(sb)) {
ext4_msg(sb, KERN_ERR, "Cannot use DAX on a filesystem"
" that may contain inline data");
@@ -5404,7 +5404,7 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
err = -EINVAL;
goto restore_opts;
}
-   if (test_opt(sb, DAX)) {
+   if (test_opt(sb, DAX_ALWAYS)) {
ext4_msg(sb, KERN_ERR, "can't mount with "
 "both data=journal and dax");
err = -EINVAL;
@@ -5425,10 +5425,10 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data)
goto restore_opts;
}
 
-   if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX) {
+   if ((sbi->s_mount_opt ^ old_opts.s_mount_opt) & EXT4_MOUNT_DAX_ALWAYS) {
ext4_msg(sb, KERN_WARNING, "warning: refusing change of "
"dax flag with busy inodes while remounting");
-   sbi->s_mount_opt ^= EXT4_MOUNT_DAX;
+   sbi->s_mount_opt ^= EXT4_MOUNT_DAX_ALWAYS;
}
 
if (sbi->s_mount_flags & EXT4_MF_FS_ABORTED)
-- 
2.25.1



[PATCH v2 6/9] fs/ext4: Only change S_DAX on inode load

2020-05-14 Thread ira . weiny
From: Ira Weiny 

To prevent complications with in-memory inodes we only set S_DAX on
inode load.  FS_XFLAG_DAX can be changed at any time and S_DAX will
change after inode eviction and reload.

Add init bool to ext4_set_inode_flags() to indicate if the inode is
being newly initialized.

Assert that S_DAX is not set on an inode which is just being loaded.
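
For illustration only (not part of the patch): a small userspace model of
the rule above, assuming a single flag bit.  S_DAX_MODEL, struct
inode_model and the function names are hypothetical; the real function also
handles the other inode flags.

```c
#include <assert.h>
#include <stdbool.h>

#define S_DAX_MODEL 0x1u  /* hypothetical stand-in for S_DAX */

struct inode_model {
    unsigned int i_flags;  /* in-memory flags */
    bool dax_wanted;       /* what the mount option / on-disk state asks for */
};

/* Recompute the flags; S_DAX is only decided when init is true. */
void set_inode_flags_model(struct inode_model *inode, bool init)
{
    unsigned int new_fl = 0;

    /* Preserve S_DAX if it is already set on the in-memory inode. */
    new_fl |= inode->i_flags & S_DAX_MODEL;
    if (init && inode->dax_wanted)
        new_fl |= S_DAX_MODEL;

    inode->i_flags = new_fl;
}

/* A later flag update must not flip S_DAX on a live inode. */
void check_inode_flags_model(void)
{
    struct inode_model ino = { 0, false };

    set_inode_flags_model(&ino, true);   /* load: DAX not wanted */
    ino.dax_wanted = true;               /* request changes afterwards */
    set_inode_flags_model(&ino, false);  /* non-init update */
    assert((ino.i_flags & S_DAX_MODEL) == 0);
}
```

In this model the change only takes effect once the inode is dropped and
loaded again, which corresponds to calling the function with init set on a
fresh inode_model.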

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes from RFC:
Change J_ASSERT() to WARN_ON_ONCE()
Fix bug which would clear S_DAX incorrectly
---
 fs/ext4/ext4.h   |  2 +-
 fs/ext4/ialloc.c |  2 +-
 fs/ext4/inode.c  | 13 ++---
 fs/ext4/ioctl.c  |  3 ++-
 fs/ext4/super.c  |  4 ++--
 fs/ext4/verity.c |  2 +-
 6 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 1a3daf2d18ef..86a0994332ce 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2692,7 +2692,7 @@ extern int ext4_can_truncate(struct inode *inode);
 extern int ext4_truncate(struct inode *);
 extern int ext4_break_layouts(struct inode *);
 extern int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length);
-extern void ext4_set_inode_flags(struct inode *);
+extern void ext4_set_inode_flags(struct inode *, bool init);
 extern int ext4_alloc_da_blocks(struct inode *inode);
 extern void ext4_set_aops(struct inode *inode);
 extern int ext4_writepage_trans_blocks(struct inode *);
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 4b8c9a9bdf0c..7941c140723f 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1116,7 +1116,7 @@ struct inode *__ext4_new_inode(handle_t *handle, struct inode *dir,
ei->i_block_group = group;
ei->i_last_alloc_group = ~0;
 
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, true);
if (IS_DIRSYNC(inode))
ext4_handle_sync(handle);
if (insert_inode_locked(inode) < 0) {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d3a4c2ed7a1c..23e42a223235 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4419,11 +4419,13 @@ static bool ext4_should_enable_dax(struct inode *inode)
return false;
 }
 
-void ext4_set_inode_flags(struct inode *inode)
+void ext4_set_inode_flags(struct inode *inode, bool init)
 {
unsigned int flags = EXT4_I(inode)->i_flags;
unsigned int new_fl = 0;
 
+   WARN_ON_ONCE(IS_DAX(inode) && init);
+
if (flags & EXT4_SYNC_FL)
new_fl |= S_SYNC;
if (flags & EXT4_APPEND_FL)
@@ -4434,8 +4436,13 @@ void ext4_set_inode_flags(struct inode *inode)
new_fl |= S_NOATIME;
if (flags & EXT4_DIRSYNC_FL)
new_fl |= S_DIRSYNC;
-   if (ext4_should_enable_dax(inode))
+
+   /* Because of the way inode_set_flags() works we must preserve S_DAX
+* here if already set. */
+   new_fl |= (inode->i_flags & S_DAX);
+   if (init && ext4_should_enable_dax(inode))
new_fl |= S_DAX;
+
if (flags & EXT4_ENCRYPT_FL)
new_fl |= S_ENCRYPTED;
if (flags & EXT4_CASEFOLD_FL)
@@ -4649,7 +4656,7 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 * not initialized on a new filesystem. */
}
ei->i_flags = le32_to_cpu(raw_inode->i_flags);
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, true);
inode->i_blocks = ext4_inode_blocks(raw_inode, ei);
ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl_lo);
if (ext4_has_feature_64bit(sb))
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 5813e5e73eab..145083e8cd1e 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -381,7 +381,8 @@ static int ext4_ioctl_setflags(struct inode *inode,
ext4_clear_inode_flag(inode, i);
}
 
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, false);
+
inode->i_ctime = current_time(inode);
 
err = ext4_mark_iloc_dirty(handle, inode, &iloc);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index d0434b513919..5ec900fdf73c 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1344,7 +1344,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
ext4_clear_inode_state(inode,
EXT4_STATE_MAY_INLINE_DATA);
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, false);
}
return res;
}
@@ -1367,7 +1367,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
ctx, len, 0);
if (!res) {
ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
-   ext4_set_inode_flags(inode);
+   ext4_set_inode_flags(inode, false);
res = ext4_mark_inode_dirty(handle, inode);
if (res)

[PATCH v2 5/9] fs/ext4: Update ext4_should_use_dax()

2020-05-14 Thread ira . weiny
From: Ira Weiny 

S_DAX should only be enabled when the underlying block device supports
dax.

Change ext4_should_use_dax() to check for device support prior to the
overriding mount option.

While we are at it change the function to ext4_should_enable_dax() as
this better reflects the ask as well as matches xfs.
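
For illustration only (not part of the patch): a userspace sketch of the
check order described above.  struct dax_inputs and its field names are
hypothetical and cover only the conditions visible in this diff; the point
is that device support is tested before the mount option is consulted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs standing in for the kernel's inode/superblock state. */
struct dax_inputs {
    bool is_regular_file;    /* S_ISREG() */
    bool journals_data;      /* data=journal applies to this inode */
    bool is_verity;          /* verity flag set on the inode */
    bool bdev_supports_dax;  /* block device can do DAX */
    bool mount_dax_always;   /* "dax" mount option given */
};

bool should_enable_dax_model(const struct dax_inputs *in)
{
    /* Per-inode disqualifiers come first. */
    if (!in->is_regular_file)
        return false;
    if (in->journals_data)
        return false;
    if (in->is_verity)
        return false;
    /* Device support is checked before the mount option. */
    if (!in->bdev_supports_dax)
        return false;
    if (in->mount_dax_always)
        return true;

    return false;
}

/* The mount option alone is not enough without device support. */
void check_should_enable_dax_model(void)
{
    struct dax_inputs in = { true, false, false, false, true };

    assert(!should_enable_dax_model(&in));
    in.bdev_supports_dax = true;
    assert(should_enable_dax_model(&in));
}
```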

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes from RFC
Change function name to 'should enable'
Clean up bool conversion
Reorder this for better bisect-ability
---
 fs/ext4/inode.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index a10ff12194db..d3a4c2ed7a1c 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4398,10 +4398,8 @@ int ext4_get_inode_loc(struct inode *inode, struct ext4_iloc *iloc)
!ext4_test_inode_state(inode, EXT4_STATE_XATTR));
 }
 
-static bool ext4_should_use_dax(struct inode *inode)
+static bool ext4_should_enable_dax(struct inode *inode)
 {
-   if (!test_opt(inode->i_sb, DAX_ALWAYS))
-   return false;
if (!S_ISREG(inode->i_mode))
return false;
if (ext4_should_journal_data(inode))
@@ -4412,7 +4410,13 @@ static bool ext4_should_use_dax(struct inode *inode)
return false;
if (ext4_test_inode_flag(inode, EXT4_INODE_VERITY))
return false;
-   return true;
+   if (!bdev_dax_supported(inode->i_sb->s_bdev,
+   inode->i_sb->s_blocksize))
+   return false;
+   if (test_opt(inode->i_sb, DAX_ALWAYS))
+   return true;
+
+   return false;
 }
 
 void ext4_set_inode_flags(struct inode *inode)
@@ -4430,7 +4434,7 @@ void ext4_set_inode_flags(struct inode *inode)
new_fl |= S_NOATIME;
if (flags & EXT4_DIRSYNC_FL)
new_fl |= S_DIRSYNC;
-   if (ext4_should_use_dax(inode))
+   if (ext4_should_enable_dax(inode))
new_fl |= S_DAX;
if (flags & EXT4_ENCRYPT_FL)
new_fl |= S_ENCRYPTED;
-- 
2.25.1



[PATCH v2 3/9] fs/ext4: Disallow encryption if inode is DAX

2020-05-14 Thread ira . weiny
From: Ira Weiny 

Encryption and DAX are incompatible.  Changing the DAX mode due to a
change in Encryption mode is wrong without a corresponding
address_space_operations update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

Furthermore, clarify the documentation of the exclusivity and how that
will work.
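
For illustration only (not part of the patch): a userspace sketch of the
exclusivity rule, assuming two booleans per file.  struct file_model and
the function names are hypothetical; only the "fail with -EINVAL if DAX was
set first" behaviour is taken from this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-file state. */
struct file_model {
    bool is_dax;     /* DAX currently in effect */
    bool encrypted;  /* encryption enabled */
};

/* Enabling encryption fails if DAX was set first. */
int set_encryption_model(struct file_model *f)
{
    if (f->is_dax)
        return -EINVAL;
    f->encrypted = true;
    return 0;
}

/* DAX is never enabled on an encrypted file, whatever the mount option says. */
bool may_enable_dax_model(const struct file_model *f)
{
    return !f->encrypted;
}

/* Whichever feature is set first wins; the other is refused. */
void check_encryption_model(void)
{
    struct file_model dax_file = { true, false };
    struct file_model plain_file = { false, false };

    assert(set_encryption_model(&dax_file) == -EINVAL);
    assert(set_encryption_model(&plain_file) == 0);
    assert(!may_enable_dax_model(&plain_file));
}
```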

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes:
remove WARN_ON_ONCE
Add documentation to the encrypt doc WRT DAX
---
 Documentation/filesystems/fscrypt.rst |  4 +++-
 fs/ext4/super.c                       | 10 +-
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index aa072112cfff..1475b8d52fef 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -1038,7 +1038,9 @@ astute users may notice some differences in behavior:
 - The ext4 filesystem does not support data journaling with encrypted
   regular files.  It will fall back to ordered data mode instead.
 
-- DAX (Direct Access) is not supported on encrypted files.
+- DAX (Direct Access) is not supported on encrypted files.  Attempts to enable
+  DAX on an encrypted file will fail.  Mount options will _not_ enable DAX on
+  encrypted files.
 
 - The st_size of an encrypted symlink will not necessarily give the
   length of the symlink target as required by POSIX.  It will actually
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index bf5fcb477f66..9873ab27e3fa 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1320,7 +1320,7 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
if (inode->i_ino == EXT4_ROOT_INO)
return -EPERM;
 
-   if (WARN_ON_ONCE(IS_DAX(inode) && i_size_read(inode)))
+   if (IS_DAX(inode))
return -EINVAL;
 
res = ext4_convert_inline_data(inode);
@@ -1344,10 +1344,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
ext4_clear_inode_state(inode,
EXT4_STATE_MAY_INLINE_DATA);
-   /*
-* Update inode->i_flags - S_ENCRYPTED will be enabled,
-* S_DAX may be disabled
-*/
ext4_set_inode_flags(inode);
}
return res;
@@ -1371,10 +1367,6 @@ static int ext4_set_context(struct inode *inode, const void *ctx, size_t len,
ctx, len, 0);
if (!res) {
ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
-   /*
-* Update inode->i_flags - S_ENCRYPTED will be enabled,
-* S_DAX may be disabled
-*/
ext4_set_inode_flags(inode);
res = ext4_mark_inode_dirty(handle, inode);
if (res)
-- 
2.25.1



[PATCH v2 1/9] fs/ext4: Narrow scope of DAX check in setflags

2020-05-14 Thread ira . weiny
From: Ira Weiny 

When preventing DAX and journaling on an inode, use the effective DAX
check rather than the mount option.

This will be required to support per inode DAX flags.
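
For illustration only (not part of the patch): a userspace sketch of the
check in the hunk below.  JOURNAL_DATA_FL_MODEL and its bit value, and the
function names, are hypothetical; the XOR test detects whether the
journaling flag is being changed, and the change is refused only when the
inode itself is DAX.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define JOURNAL_DATA_FL_MODEL 0x4000u  /* hypothetical bit value */

/* Return -EBUSY if the journaling flag would change on a DAX inode. */
int check_journal_flag_change(unsigned int oldflags, unsigned int newflags,
                              bool inode_is_dax)
{
    /* XOR leaves only the bits that differ between old and new. */
    if ((newflags ^ oldflags) & JOURNAL_DATA_FL_MODEL) {
        if (inode_is_dax)
            return -EBUSY;
    }
    return 0;
}

/* The same flag change is fine on a non-DAX inode. */
void check_journal_flag_model(void)
{
    assert(check_journal_flag_change(0, JOURNAL_DATA_FL_MODEL, true) == -EBUSY);
    assert(check_journal_flag_change(0, JOURNAL_DATA_FL_MODEL, false) == 0);
}
```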

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 
---
 fs/ext4/ioctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index bfc1281fc4cb..5813e5e73eab 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -393,9 +393,9 @@ static int ext4_ioctl_setflags(struct inode *inode,
if ((jflag ^ oldflags) & (EXT4_JOURNAL_DATA_FL)) {
/*
 * Changes to the journaling mode can cause unsafe changes to
-* S_DAX if we are using the DAX mount option.
+* S_DAX if the inode is DAX
 */
-   if (test_opt(inode->i_sb, DAX)) {
+   if (IS_DAX(inode)) {
err = -EBUSY;
goto flags_out;
}
-- 
2.25.1



[PATCH v2 2/9] fs/ext4: Disallow verity if inode is DAX

2020-05-14 Thread ira . weiny
From: Ira Weiny 

Verity and DAX are incompatible.  Changing the DAX mode due to a verity
flag change is wrong without a corresponding address_space_operations
update.

Make the 2 options mutually exclusive by returning an error if DAX was
set first.

(Setting DAX is already disabled if Verity is set first.)

Reviewed-by: Jan Kara 
Signed-off-by: Ira Weiny 

---
Changes:
remove WARN_ON_ONCE
Add documentation for DAX/Verity exclusivity
---
 Documentation/filesystems/ext4/verity.rst | 7 +++
 fs/ext4/verity.c                          | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/Documentation/filesystems/ext4/verity.rst b/Documentation/filesystems/ext4/verity.rst
index 3e4c0ee0e068..51ab1aa17e59 100644
--- a/Documentation/filesystems/ext4/verity.rst
+++ b/Documentation/filesystems/ext4/verity.rst
@@ -39,3 +39,10 @@ is encrypted as well as the data itself.
 
 Verity files cannot have blocks allocated past the end of the verity
 metadata.
+
+Verity and DAX
+--
+
+Verity and DAX are not compatible and attempts to set both of these flags on a
+file will fail.
+
diff --git a/fs/ext4/verity.c b/fs/ext4/verity.c
index dc5ec724d889..f05a09fb2ae4 100644
--- a/fs/ext4/verity.c
+++ b/fs/ext4/verity.c
@@ -113,6 +113,9 @@ static int ext4_begin_enable_verity(struct file *filp)
handle_t *handle;
int err;
 
+   if (IS_DAX(inode))
+   return -EINVAL;
+
if (ext4_verity_in_progress(inode))
return -EBUSY;
 
-- 
2.25.1



[PATCH 0/4] Move the sysctl interface to the corresponding feature code file

2020-05-14 Thread Xiaoming Ni
Use register_sysctl() to register the sysctl interface to avoid
merge conflicts when different features modify sysctl.c at the same time.

Here, the sysctl interfaces of hung task and watchdog are moved to the
corresponding feature code files.

https://lkml.org/lkml/2020/5/11/1419

Xiaoming Ni (4):
  hung_task: Move hung_task sysctl interface to hung_task_sysctl.c
  proc/sysctl: add shared variables -1
  watchdog: move watchdog sysctl to watchdog.c
  sysctl: Add register_sysctl_init() interface

 fs/proc/proc_sysctl.c        |   2 +-
 include/linux/sched/sysctl.h |   8 +--
 include/linux/sysctl.h       |   3 +
 kernel/Makefile              |   4 +-
 kernel/hung_task.c           |   6 +-
 kernel/hung_task.h           |  21 ++
 kernel/hung_task_sysctl.c    |  66 +
 kernel/sysctl.c              | 168 ++-
 kernel/watchdog.c            | 101 ++
 9 files changed, 219 insertions(+), 160 deletions(-)
 create mode 100644 kernel/hung_task.h
 create mode 100644 kernel/hung_task_sysctl.c

-- 
1.8.5.6



[PATCH 3/4] watchdog: move watchdog sysctl to watchdog.c

2020-05-14 Thread Xiaoming Ni
Move watchdog sysctl interface to watchdog.c.
Use register_sysctl() to register the sysctl interface to avoid
merge conflicts when different features modify sysctl.c at the same time.

Signed-off-by: Xiaoming Ni 
---
 kernel/sysctl.c   |  96 
 kernel/watchdog.c | 117 ++
 2 files changed, 117 insertions(+), 96 deletions(-)

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 01fc559..e394990 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -97,9 +97,6 @@
 #ifdef CONFIG_STACKLEAK_RUNTIME_DISABLE
 #include 
 #endif
-#ifdef CONFIG_LOCKUP_DETECTOR
-#include 
-#endif
 
 #if defined(CONFIG_SYSCTL)
 
@@ -120,9 +117,6 @@
 #endif
 
 /* Constants used for minimum and  maximum */
-#ifdef CONFIG_LOCKUP_DETECTOR
-static int sixty = 60;
-#endif
 
 static int __maybe_unused two = 2;
 static int __maybe_unused four = 4;
@@ -887,96 +881,6 @@ static int sysrq_sysctl_handler(struct ctl_table *table, int write,
.mode   = 0444,
.proc_handler   = proc_dointvec,
},
-#if defined(CONFIG_LOCKUP_DETECTOR)
-   {
-   .procname   = "watchdog",
-   .data   = &watchdog_user_enabled,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_watchdog,
-   .extra1 = SYSCTL_ZERO,
-   .extra2 = SYSCTL_ONE,
-   },
-   {
-   .procname   = "watchdog_thresh",
-   .data   = &watchdog_thresh,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_watchdog_thresh,
-   .extra1 = SYSCTL_ZERO,
-   .extra2 = &sixty,
-   },
-   {
-   .procname   = "nmi_watchdog",
-   .data   = &nmi_watchdog_user_enabled,
-   .maxlen = sizeof(int),
-   .mode   = NMI_WATCHDOG_SYSCTL_PERM,
-   .proc_handler   = proc_nmi_watchdog,
-   .extra1 = SYSCTL_ZERO,
-   .extra2 = SYSCTL_ONE,
-   },
-   {
-   .procname   = "watchdog_cpumask",
-   .data   = &watchdog_cpumask_bits,
-   .maxlen = NR_CPUS,
-   .mode   = 0644,
-   .proc_handler   = proc_watchdog_cpumask,
-   },
-#ifdef CONFIG_SOFTLOCKUP_DETECTOR
-   {
-   .procname   = "soft_watchdog",
-   .data   = &soft_watchdog_user_enabled,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_soft_watchdog,
-   .extra1 = SYSCTL_ZERO,
-   .extra2 = SYSCTL_ONE,
-   },
-   {
-   .procname   = "softlockup_panic",
-   .data   = &softlockup_panic,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_dointvec_minmax,
-   .extra1 = SYSCTL_ZERO,
-   .extra2 = SYSCTL_ONE,
-   },
-#ifdef CONFIG_SMP
-   {
-   .procname   = "softlockup_all_cpu_backtrace",
-   .data   = &sysctl_softlockup_all_cpu_backtrace,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_dointvec_minmax,
-   .extra1 = SYSCTL_ZERO,
-   .extra2 = SYSCTL_ONE,
-   },
-#endif /* CONFIG_SMP */
-#endif
-#ifdef CONFIG_HARDLOCKUP_DETECTOR
-   {
-   .procname   = "hardlockup_panic",
-   .data   = &hardlockup_panic,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_dointvec_minmax,
-   .extra1 = SYSCTL_ZERO,
-   .extra2 = SYSCTL_ONE,
-   },
-#ifdef CONFIG_SMP
-   {
-   .procname   = "hardlockup_all_cpu_backtrace",
-   .data   = &sysctl_hardlockup_all_cpu_backtrace,
-   .maxlen = sizeof(int),
-   .mode   = 0644,
-   .proc_handler   = proc_dointvec_minmax,
-   .extra1 = SYSCTL_ZERO,
-   .extra2 = SYSCTL_ONE,
-   },
-#endif /* CONFIG_SMP */
-#endif
-#endif
-
 #if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
{
.procname   = "unknown_nmi_panic",
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b6b1f54..05e1d58 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -23,6 +23,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_SYSCTL
+#include 
+#endif
 
 #include 
 #include 
@@ -756,10 +759,124 @@ int proc_watchdog_cpu
