Re: [PATCH] irqchip: plic: Fix priority base offset

2019-03-26 Thread Alistair Francis
On Fri, Mar 22, 2019 at 6:27 AM Christoph Hellwig  wrote:
>
> On Wed, Mar 20, 2019 at 05:04:58PM -0700, Alistair Francis wrote:
> > > Well, it starts at 0x00, but the first one is reserved.  If you think
> > > that is too confusing I'd rather throw in a comment explaining this
> > > fact rather than making the calculation more complicated.
> >
> > It doesn't mention that it starts at 0 when you look here:
> > https://sifive.cdn.prismic.io/sifive%2F834354f0-08e6-423c-bf1f-0cb58ef14061_fu540-c000-v1.0.pdf
>
> It doesn't say that.  But it is completely obvious from the map,
> and from how everything else works.  In this case I think the
> documentation is simply written in a confusing way, and we need to fix
> it once we have an official riscv spec level documentation of this
> hardware.

I agree that the documentation is written in a confusing way. That
said, we make things even more confusing by not following the
documentation and doing something different, which is what we are
doing now.

If the documentation changes in the future we can update the code to
match the new documentation, but at the moment I think it makes more
sense to match the current documentation. It is a lot easier to compare
the code and the documentation when they match. Hopefully that can
avoid any off-by-one index issues like those we have seen recently.
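
To make the two positions concrete, here is a minimal sketch of both ways
to compute a source's priority register offset. The macro and function
names are hypothetical and not taken from the driver; the only assumptions
are 4-byte priority registers and a reserved slot for interrupt source 0,
as discussed above.

```c
#include <stdint.h>

/* Hypothetical constants, for illustration only. */
#define PRIO_PER_ID	4u	/* one 32-bit priority register per source */
#define PRIO_BASE_ZERO	0x0u	/* map-style: slot 0 exists but is reserved */
#define PRIO_BASE_DOC	0x4u	/* doc-style: first documented register (source 1) */

/* Map-style: index directly by source ID; ID 0 is simply never used. */
static uint32_t prio_offset_from_zero(uint32_t hwirq)
{
	return PRIO_BASE_ZERO + hwirq * PRIO_PER_ID;
}

/* Doc-style: the base points at source 1, so subtract one from the ID. */
static uint32_t prio_offset_from_doc(uint32_t hwirq)
{
	return PRIO_BASE_DOC + (hwirq - 1u) * PRIO_PER_ID;
}
```

Both helpers give the same offset for every valid source ID (1 and up);
the difference is only in which form is easier to compare against the
documentation.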

Alistair


Re: [RFC PATCH v4 0/8] This patch-set is to enable Guest CET support

2019-03-26 Thread Sean Christopherson
On Tue, Mar 26, 2019 at 04:45:34AM +0800, Yang Weijiang wrote:
> Hi, Paolo and Sean,
> Do you have any comments on v4 patches?

My backlog is a bit full at the moment, I'll try to review the series
later this week.


[PATCH 4/4] perf: arm_spe: Enable ACPI/Platform automatic module loading

2019-03-26 Thread Jeremy Linton
Let's add the MODULE_DEVICE_TABLE and platform id_table entries so that
the SPE driver can attach to the ACPI platform device created by
the core PMU code.

Signed-off-by: Jeremy Linton 
Reviewed-by: Sudeep Holla 
---
 drivers/perf/arm_spe_pmu.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 7cb766dafe85..ffa2c76c08bb 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -1176,7 +1176,13 @@ static const struct of_device_id arm_spe_pmu_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, arm_spe_pmu_of_match);
 
-static int arm_spe_pmu_device_dt_probe(struct platform_device *pdev)
+static const struct platform_device_id arm_spe_match[] = {
+   { "arm,spe-v1", 0},
+   { }
+};
+MODULE_DEVICE_TABLE(platform, arm_spe_match);
+
+static int arm_spe_pmu_device_probe(struct platform_device *pdev)
 {
int ret;
struct arm_spe_pmu *spe_pmu;
@@ -1236,11 +1242,12 @@ static int arm_spe_pmu_device_remove(struct platform_device *pdev)
 }
 
 static struct platform_driver arm_spe_pmu_driver = {
+   .id_table = arm_spe_match,
.driver = {
.name   = DRVNAME,
.of_match_table = of_match_ptr(arm_spe_pmu_of_match),
},
-   .probe  = arm_spe_pmu_device_dt_probe,
+   .probe  = arm_spe_pmu_device_probe,
.remove = arm_spe_pmu_device_remove,
 };
 
-- 
2.20.1



[PATCH 3/4] arm_pmu: acpi: spe: Add initial MADT/SPE probing

2019-03-26 Thread Jeremy Linton
ACPI 6.3 adds additional fields to the MADT GICC
structure to describe SPE PPIs. We pick these out
of the cached reference to the madt_gicc structure
similarly to the core PMU code. We then create a platform
device referring to the IRQ and let the user/module loader
decide whether to load the SPE driver.

Signed-off-by: Jeremy Linton 
---
 arch/arm64/include/asm/acpi.h |  3 ++
 drivers/perf/arm_pmu_acpi.c   | 69 +++
 2 files changed, 72 insertions(+)

diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 7628efbe6c12..d10399b9f998 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -41,6 +41,9 @@
(!(entry) || (entry)->header.length < ACPI_MADT_GICC_MIN_LENGTH || \
(unsigned long)(entry) + (entry)->header.length > (end))
 
+#define ACPI_MADT_GICC_SPE  (ACPI_OFFSET(struct acpi_madt_generic_interrupt, \
+   spe_interrupt) + sizeof(u16))
+
 /* Basic configuration for ACPI */
 #ifdef CONFIG_ACPI
 pgprot_t __acpi_get_mem_attribute(phys_addr_t addr);
diff --git a/drivers/perf/arm_pmu_acpi.c b/drivers/perf/arm_pmu_acpi.c
index 0f197516d708..a2418108eab2 100644
--- a/drivers/perf/arm_pmu_acpi.c
+++ b/drivers/perf/arm_pmu_acpi.c
@@ -74,6 +74,73 @@ static void arm_pmu_acpi_unregister_irq(int cpu)
acpi_unregister_gsi(gsi);
 }
 
+static struct resource spe_resources[] = {
+   {
+   /* irq */
+   .flags  = IORESOURCE_IRQ,
+   }
+};
+
+static struct platform_device spe_dev = {
+   .name = "arm,spe-v1",
+   .id = -1,
+   .resource = spe_resources,
+   .num_resources = ARRAY_SIZE(spe_resources)
+};
+
+/*
+ * For lack of a better place, hook the normal PMU MADT walk
+ * and create a SPE device if we detect a recent MADT with
+ * a homogeneous PPI mapping.
+ */
+static int arm_spe_acpi_parse_irqs(void)
+{
+   int cpu, ret, irq;
+   int hetid;
+   u16 gsi = 0;
+   bool first = true;
+
+   struct acpi_madt_generic_interrupt *gicc;
+
+   /*
+* sanity check all the GICC tables for the same interrupt number
+* for now we only support homogeneous ACPI/SPE machines.
+*/
+   for_each_possible_cpu(cpu) {
+   gicc = acpi_cpu_get_madt_gicc(cpu);
+
+   if (gicc->header.length < ACPI_MADT_GICC_SPE)
+   return -ENODEV;
+   if (first) {
+   gsi = gicc->spe_interrupt;
+   if (!gsi)
+   return -ENODEV;
+   hetid = find_acpi_cpu_topology_hetero_id(cpu);
+   first = false;
+   } else if ((gsi != gicc->spe_interrupt) ||
+  (hetid != find_acpi_cpu_topology_hetero_id(cpu))) {
+   pr_warn("ACPI: SPE must be homogeneous\n");
+   return -EINVAL;
+   }
+   }
+
+   irq = acpi_register_gsi(NULL, gsi, ACPI_LEVEL_SENSITIVE,
+   ACPI_ACTIVE_HIGH);
+   if (irq < 0) {
+   pr_warn("ACPI: SPE Unable to register interrupt: %d\n", gsi);
+   return irq;
+   }
+
+   spe_resources[0].start = irq;
+   ret = platform_device_register(&spe_dev);
+   if (ret < 0) {
+   pr_warn("ACPI: SPE: Unable to register device\n");
+   acpi_unregister_gsi(gsi);
+   }
+
+   return ret;
+}
+
 static int arm_pmu_acpi_parse_irqs(void)
 {
int irq, cpu, irq_cpu, err;
@@ -279,6 +346,8 @@ static int arm_pmu_acpi_init(void)
if (acpi_disabled)
return 0;
 
+   arm_spe_acpi_parse_irqs(); /* failures are expected */
+
ret = arm_pmu_acpi_parse_irqs();
if (ret)
return ret;
-- 
2.20.1



[PATCH 0/4] arm64: SPE ACPI enablement

2019-03-26 Thread Jeremy Linton
This patch series enables the Arm Statistical Profiling
Extension (SPE) on ACPI platforms.

This is possible because ACPI 6.3 uses a previously
reserved field in the MADT to store the SPE interrupt
number, similarly to how the normal PMU is described.
If a consistent valid interrupt exists across all the
cores in the system, a platform device is registered.
That then triggers the SPE module, which runs as normal.
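
The "consistent valid interrupt" rule can be sketched in plain C as below.
This is only an illustration under stated assumptions: the struct and
function names are hypothetical, the per-CPU data is passed in as an array
rather than read from the MADT, and the PPTT hetero-id and GICC length
checks done by the actual patch are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the per-CPU MADT GICC data; the patch reads
 * the SPE interrupt field from each CPU's cached GICC entry instead.
 */
struct cpu_spe_info {
	uint16_t spe_gsi;	/* 0 means "no SPE interrupt described" */
};

/*
 * Return the shared SPE interrupt number if every CPU reports the same
 * non-zero value, -ENODEV if SPE is not described at all, and -EINVAL
 * if the CPUs disagree.
 */
static int spe_common_gsi(const struct cpu_spe_info *cpus, size_t ncpus)
{
	size_t i;

	if (ncpus == 0 || cpus[0].spe_gsi == 0)
		return -ENODEV;

	for (i = 1; i < ncpus; i++)
		if (cpus[i].spe_gsi != cpus[0].spe_gsi)
			return -EINVAL;

	return cpus[0].spe_gsi;
}
```

Only a non-negative return value would lead to a platform device being
registered in this model.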

This version also adds the ability to parse the PPTT for
IDENTICAL cores. We then use this to sanity check the
single SPE device we create. This creates a bit of a
problem with respect to the specification though. The
specification says that it's legal for multiple trees
to exist in the PPTT. We handle this fine, but what
happens in the case of multiple trees is that the lack
of a common node with IDENTICAL set forces us to assume
that there are multiple non-IDENTICAL cores in the
machine.

Jeremy Linton (4):
  ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens
  ACPI/PPTT: Modify node flag detection to find last IDENTICAL
  arm_pmu: acpi: spe: Add initial MADT/SPE probing
  perf: arm_spe: Enable ACPI/Platform automatic module loading

 arch/arm64/include/asm/acpi.h |  3 ++
 drivers/acpi/pptt.c   | 82 ++-
 drivers/perf/arm_pmu_acpi.c   | 69 +
 drivers/perf/arm_spe_pmu.c| 11 -
 include/linux/acpi.h  |  5 +++
 5 files changed, 157 insertions(+), 13 deletions(-)

-- 
2.20.1



[PATCH 1/4] ACPI/PPTT: Add function to return ACPI 6.3 Identical tokens

2019-03-26 Thread Jeremy Linton
ACPI 6.3 adds a flag to indicate that child nodes are all
identical cores. This is useful to authoritatively determine
if a set of (possibly offline) cores are identical or not.

Since the flag doesn't give us a unique id we can generate
one and use it to create bitmaps of sibling nodes, or simply
in a loop to determine if a subset of cores are identical.
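
As a rough sketch of the "bitmap of sibling nodes" use, the helper below
groups CPUs by a per-CPU ID. The function name and the array of IDs are
hypothetical; in the kernel the IDs would come from the function added by
this patch, and a cpumask would be used rather than a 64-bit word.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Build a bitmask of all CPUs whose ID matches that of 'cpu'.
 * Negative IDs model a failed lookup and yield an empty mask.
 * This sketch supports at most 64 CPUs.
 */
static uint64_t identical_sibling_mask(const int *hetero_id, size_t ncpus,
				       size_t cpu)
{
	uint64_t mask = 0;
	size_t i;

	if (cpu >= ncpus || ncpus > 64 || hetero_id[cpu] < 0)
		return 0;

	for (i = 0; i < ncpus; i++)
		if (hetero_id[i] == hetero_id[cpu])
			mask |= UINT64_C(1) << i;

	return mask;
}
```

The same comparison inside a plain loop is enough when the caller only
needs to know whether a subset of cores is identical.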

Signed-off-by: Jeremy Linton 
---
 drivers/acpi/pptt.c  | 26 ++
 include/linux/acpi.h |  5 +
 2 files changed, 31 insertions(+)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 065c4fc245d1..472c95ec816b 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -660,3 +660,29 @@ int find_acpi_cpu_topology_package(unsigned int cpu)
return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
  ACPI_PPTT_PHYSICAL_PACKAGE);
 }
+
+/**
+ * find_acpi_cpu_topology_hetero_id() - Determine a unique implementation
+ * @cpu: Kernel logical cpu number
+ *
+ * Determine a unique heterogeneous ID for the given CPU. CPUs with the same
+ * implementation should have matching IDs. Since this is a tree we can only
+ * detect implementations where the heterogeneous flag is the parent to all
+ * matching cores. AKA if a two socket machine has two different core types
+ * in each socket this will end up being represented as four unique core types
+ * rather than two.
+ *
+ * The returned ID can be used to group peers with identical implementation.
+ *
+ * The search terminates when a level is found with the identical implementation
+ * flag set or we reach a root node.
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the cpu cannot be found.
+ * Otherwise returns a value which represents a group of identical cores
+ * similar to this cpu.
+ */
+int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
+{
+   return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
+ ACPI_PPTT_ACPI_IDENTICAL);
+}
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index d5dcebd7aad3..1444fb042898 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1309,6 +1309,7 @@ static inline int lpit_read_residency_count_address(u64 *address)
 #ifdef CONFIG_ACPI_PPTT
 int find_acpi_cpu_topology(unsigned int cpu, int level);
 int find_acpi_cpu_topology_package(unsigned int cpu);
+int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
 int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
 #else
 static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
@@ -1319,6 +1320,10 @@ static inline int find_acpi_cpu_topology_package(unsigned int cpu)
 {
return -EINVAL;
 }
+static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
+{
+   return -EINVAL;
+}
 static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
 {
return -EINVAL;
-- 
2.20.1



[PATCH 2/4] ACPI/PPTT: Modify node flag detection to find last IDENTICAL

2019-03-26 Thread Jeremy Linton
The ACPI specification implies that the IDENTICAL flag should be
set on all non-leaf nodes where the children are identical.
This means that we need to be searching for the last node with
the identical flag set rather than the first one.

To achieve this with the existing code we need to pass a
function through the tree traversal logic so we can check
the next node to ensure that IDENTICAL isn't set before returning
a node with IDENTICAL set.
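
The search for the last IDENTICAL node can be illustrated outside the
kernel with a simple parent-linked tree. The types, flag value and function
names below are hypothetical and only model the idea; the patch itself
works on PPTT nodes and parent offsets.

```c
#include <stdbool.h>
#include <stddef.h>

#define NODE_IDENTICAL 0x1u	/* hypothetical flag bit for this sketch */

struct topo_node {
	const struct topo_node *parent;	/* NULL at a root */
	unsigned int flags;
};

/* True if this node has IDENTICAL set and its parent (if any) does not. */
static bool is_last_identical(const struct topo_node *n)
{
	if (!(n->flags & NODE_IDENTICAL))
		return false;
	return !(n->parent && (n->parent->flags & NODE_IDENTICAL));
}

/*
 * Walk from a leaf towards the root and return the first node that has
 * IDENTICAL set while its parent does not. If no node qualifies, the
 * root is returned.
 */
static const struct topo_node *find_last_identical(const struct topo_node *leaf)
{
	const struct topo_node *n = leaf;

	while (n->parent && !is_last_identical(n))
		n = n->parent;
	return n;
}
```

With IDENTICAL set on both a cluster and its package, the walk skips the
cluster and stops at the package, which is the behaviour described above.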

Signed-off-by: Jeremy Linton 
---
 drivers/acpi/pptt.c | 62 +++--
 1 file changed, 48 insertions(+), 14 deletions(-)

diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 472c95ec816b..db18510346f9 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -432,17 +432,51 @@ static void cache_setup_acpi_cpu(struct acpi_table_header *table,
}
 }
 
+
+typedef bool (*node_check)(struct acpi_table_header *table_hdr,
+  struct acpi_pptt_processor *cpu);
+static bool flag_package(struct acpi_table_header *table_hdr,
+struct acpi_pptt_processor *cpu)
+{
+   return cpu->flags & ACPI_PPTT_PHYSICAL_PACKAGE;
+}
+
+static bool flag_identical(struct acpi_table_header *table_hdr,
+  struct acpi_pptt_processor *cpu)
+{
+   struct acpi_pptt_processor *next;
+
+   /* heterogeneous machines must use PPTT revision > 1 */
+   if (table_hdr->revision < 2)
+   return false;
+
+   /* Locate the last node in the tree with IDENTICAL set */
+   if (cpu->flags & ACPI_PPTT_ACPI_IDENTICAL) {
+   next = fetch_pptt_node(table_hdr, cpu->parent);
+   if (!(next && next->flags & ACPI_PPTT_ACPI_IDENTICAL))
+   return true;
+   }
+
+   return false;
+}
+
+static bool flag_none(struct acpi_table_header *table_hdr,
+ struct acpi_pptt_processor *cpu)
+{
+   return false;
+}
+
 /* Passing level values greater than this will result in search termination */
 #define PPTT_ABORT_PACKAGE 0xFF
 
-static struct acpi_pptt_processor *acpi_find_processor_package_id(struct acpi_table_header *table_hdr,
- struct acpi_pptt_processor *cpu,
- int level, int flag)
+static struct acpi_pptt_processor *acpi_find_processor_tag_id(struct acpi_table_header *table_hdr,
+ struct acpi_pptt_processor *cpu,
+ int level, node_check chk)
 {
struct acpi_pptt_processor *prev_node;
 
while (cpu && level) {
-   if (cpu->flags & flag)
+   if (chk(table_hdr, cpu))
break;
pr_debug("level %d\n", level);
prev_node = fetch_pptt_node(table_hdr, cpu->parent);
@@ -473,15 +507,15 @@ static void acpi_pptt_warn_missing(void)
  * Return: Unique value, or -ENOENT if unable to locate cpu
  */
 static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
-unsigned int cpu, int level, int flag)
+unsigned int cpu, int level, node_check chk)
 {
struct acpi_pptt_processor *cpu_node;
u32 acpi_cpu_id = get_acpi_id_for_cpu(cpu);
 
cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
if (cpu_node) {
-   cpu_node = acpi_find_processor_package_id(table, cpu_node,
- level, flag);
+   cpu_node = acpi_find_processor_tag_id(table, cpu_node,
+ level, chk);
/*
 * As per specification if the processor structure represents
 * an actual processor, then ACPI processor ID must be valid.
@@ -498,7 +532,7 @@ static int topology_get_acpi_cpu_tag(struct acpi_table_header *table,
return -ENOENT;
 }
 
-static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
+static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, node_check chk)
 {
struct acpi_table_header *table;
acpi_status status;
@@ -509,7 +543,7 @@ static int find_acpi_cpu_topology_tag(unsigned int cpu, int level, int flag)
acpi_pptt_warn_missing();
return -ENOENT;
}
-   retval = topology_get_acpi_cpu_tag(table, cpu, level, flag);
+   retval = topology_get_acpi_cpu_tag(table, cpu, level, chk);
pr_debug("Topology Setup ACPI cpu %d, level %d ret = %d\n",
 cpu, level, retval);
acpi_put_table(table);
@@ -601,7 +635,7 @@ int cache_setup_acpi(unsigned int cpu)
  */
 int find_acpi_cpu_topology(unsigned int cpu, int level)
 {
-   return find_acpi_cpu_topology_tag(cpu, level, 0);
+   return find_acpi_cpu_topolog

RE: [PATCH v3] HID: core: move Usage Page concatenation to Main item

2019-03-26 Thread Junge, Terry
Hi Nicolas,

This patch looks good except for one comment/question below.

Thanks,
Terry

On Tuesday, March 26, 2019 1:04 PM Nicolas Saenz Julienne 
 wrote:
>
>As seen on some USB wireless keyboards manufactured by Primax, the HID
>parser was using some assumptions that are not always true. In this case it's
>the fact that, inside the scope of a main item, a Usage Page will always
>precede a Usage.
>
>The spec is not very clear, as 6.2.2.7 states "Any usage that follows is
>interpreted as a Usage ID and concatenated with the Usage Page",
>while 6.2.2.8 states "When the parser encounters a main item it concatenates
>the last declared Usage Page with a Usage to form a complete usage value."
>Since the two are somewhat contradictory, it was decided to match Windows'
>implementation, which follows 6.2.2.8.
>
>In summary, the patch moves the Usage Page concatenation from the local
>item parsing function to the main item parsing function.
>
>Signed-off-by: Nicolas Saenz Julienne 
>---
>
>v2->v3: - Update patch title
>
>v1->v2: - Add usage concatenation to hid_scan_main()
>   - Rework tests in hid-tools, making sure no-one is failing
>
> drivers/hid/hid-core.c | 40 
> include/linux/hid.h|  1 +
> 2 files changed, 29 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
>index 9993b692598f..40c836ce3248 100644
>--- a/drivers/hid/hid-core.c
>+++ b/drivers/hid/hid-core.c
>@@ -218,13 +218,14 @@ static unsigned hid_lookup_collection(struct hid_parser *parser, unsigned type)
>  * Add a usage to the temporary parser table.
>  */
>
>-static int hid_add_usage(struct hid_parser *parser, unsigned usage)
>+static int hid_add_usage(struct hid_parser *parser, unsigned usage,
>+__u8 size)
> {
>   if (parser->local.usage_index >= HID_MAX_USAGES) {
>   hid_err(parser->device, "usage index exceeded\n");
>   return -1;
>   }
>   parser->local.usage[parser->local.usage_index] = usage;
>+  parser->local.usage_size[parser->local.usage_index] = size;
>   parser->local.collection_index[parser->local.usage_index] =
>   parser->collection_stack_ptr ?
>   parser->collection_stack[parser->collection_stack_ptr - 1] : 0;
>@@ -486,10 +487,7 @@ static int hid_parser_local(struct hid_parser *parser, struct hid_item *item)
>   return 0;
>   }
>
>-  if (item->size <= 2)
>-  data = (parser->global.usage_page << 16) + data;
>-
>-  return hid_add_usage(parser, data);
>+  return hid_add_usage(parser, data, item->size);
>
>   case HID_LOCAL_ITEM_TAG_USAGE_MINIMUM:
>
>@@ -498,9 +496,6 @@ static int hid_parser_local(struct hid_parser *parser, struct hid_item *item)
>   return 0;
>   }
>
>-  if (item->size <= 2)
>-  data = (parser->global.usage_page << 16) + data;
>-
>   parser->local.usage_minimum = data;
>   return 0;
>
>@@ -511,9 +506,6 @@ static int hid_parser_local(struct hid_parser *parser, struct hid_item *item)
>   return 0;
>   }
>
>-  if (item->size <= 2)
>-  data = (parser->global.usage_page << 16) + data;
>-
>   count = data - parser->local.usage_minimum;
>   if (count + parser->local.usage_index >= HID_MAX_USAGES) {
>   /*
>@@ -533,7 +525,7 @@ static int hid_parser_local(struct hid_parser *parser, struct hid_item *item)
>   }
>
>   for (n = parser->local.usage_minimum; n <= data; n++)
>-  if (hid_add_usage(parser, n)) {
>+  if (hid_add_usage(parser, n, item->size)) {
>   dbg_hid("hid_add_usage failed\n");
>   return -1;
>   }
>@@ -547,6 +539,26 @@ static int hid_parser_local(struct hid_parser *parser, struct hid_item *item)
>   return 0;
> }
>
>+/*
>+ * Concatenate Usage Pages into Usages where relevant:
>+ * As per specification, 6.2.2.8: "When the parser encounters a main item it
>+ * concatenates the last declared Usage Page with a Usage to form a complete
>+ * usage value."
>+ */
>+
>+static void hid_concatenate_usage_page(struct hid_parser *parser)
>+{
>+  unsigned usages;
>+  int i;
>+
>+  usages = max_t(unsigned, parser->local.usage_index,
>+   parser->global.report_count);

I don't think we need to worry about global.report_count here. We only need
to concatenate for the usages currently in the local queue, so could this be
simplified by removing usages and just using local.usage_index?

for (i = 0; i < local.usage_index; i++)
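
As a standalone illustration of that simplification, a sketch with
hypothetical, cut-down types (not the real struct hid_parser) could look
like this. It assumes, as in the patch, that only usages declared with
a size of 2 bytes or less still need the Usage Page added.

```c
#include <stdint.h>

#define MAX_USAGES 16	/* hypothetical bound for this sketch */

/* Cut-down model of the parser's local item state. */
struct local_state {
	uint32_t usage[MAX_USAGES];
	uint8_t usage_size[MAX_USAGES];	/* item size in bytes as parsed */
	unsigned int usage_index;	/* number of usages currently queued */
};

/*
 * Called when a main item is reached: prepend the current Usage Page to
 * every queued usage that was declared without one (size <= 2 bytes).
 */
static void concatenate_usage_page(struct local_state *local,
				   uint32_t usage_page)
{
	unsigned int i;

	for (i = 0; i < local->usage_index; i++)
		if (local->usage_size[i] <= 2)
			local->usage[i] += usage_page << 16;
}
```

Entries beyond usage_index are never touched, which is why report_count
does not need to enter the loop bound in this model.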

>+
>+  for (i = 0; i < usages; i++)
>+  if (parser->local.usage_size[i] <= 2)
>+  parser->local.usage[i] += parser->global.usage_page << 16;
>+}
>+
> /*
>  * Proce

Re: [PATCH 22/27] Lock down kprobes

2019-03-26 Thread Masami Hiramatsu
On Tue, 26 Mar 2019 10:41:23 -0700
Matthew Garrett  wrote:

> On Tue, Mar 26, 2019 at 5:30 AM Masami Hiramatsu  wrote:
> >
> > On Mon, 25 Mar 2019 15:09:49 -0700
> > Matthew Garrett  wrote:
> >
> > > From: David Howells 
> > >
> > > Disallow the creation of kprobes when the kernel is locked down by
> > > preventing their registration.  This prevents kprobes from being used to
> > > > access kernel memory, either to make modifications or to steal crypto data.
> >
> > Hmm, if you enforce signature check of modules, those modules
> > should be allowed to use kprobes?
> > I think we should introduce some kind of trust inheritance from
> > signed (trusted) modules.
> 
> Is there any way to install a kprobe /without/ it coming from a
> module? The presumption in lockdown mode is that module signing is
> enforced, so I'll admit to not being entirely clear on why this patch
> is needed in that case.

Yes, there are 2 paths, ftrace and perf (bpf). If you want to disable the ftrace
path (which starts from the user's input via tracefs), this should be done in
trace_kprobe_create()@kernel/trace/trace_kprobe.c.
If you want to disable both, __register_trace_kprobe()@kernel/trace/trace_kprobe.c
is the best place.

Thank you,

-- 
Masami Hiramatsu 


Re: [RFC PATCH v2 1/3] resource: Request IO port regions from children of ioport_resource

2019-03-26 Thread Bjorn Helgaas
[+cc Catalin, Will, linux-arm-kernel]

On Tue, Mar 26, 2019 at 04:33:55PM +0000, John Garry wrote:
> On 25/03/2019 23:32, Bjorn Helgaas wrote:
> > On Thu, Mar 21, 2019 at 02:14:08AM +0800, John Garry wrote:
> > > Currently when we request an IO port region, the request is made directly
> > > to the top resource, ioport_resource.
> > 
> > Let's be explicit here, e.g.,
> > 
> >   Currently request_region() requests an IO port region directly from the
> >   top resource, ioport_resource.
> 
> ok
> 
> > > There is an issue here, in that drivers may successfully request an IO
> > > port region even if the IO port region has not even been mapped in
> > > (in pci_remap_iospace()).
> > > 
> > > This may lead to crashes when the system has no PCI host, or, has a host
> > > but it has failed enumeration, while drivers still attempt to access PCI
> > > IO ports, as below:
> > 
> > I don't understand the strategy here.  f71882fg is not a driver for a
> > PCI device, so it should work even if there is no PCI host in the
> > system.
> 
> From my checking, the f71882fg hwmon is accessed via the super-io interface
> on the PCH on x86. The super-io interface is at fixed addresses, those being
> 0x2e and 0x4e.
> 
> Please see the following:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/hwmon/f71805f.c?h=v5.1-rc2#n1621
> 
> and
> 
> https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/8-series-chipset-pch-datasheet.pdf
> (Table 9.2).
> 
> On x86 systems, these PCH IO ports will be mapped on a PCI bus, like:
> 
> $more /proc/ioports
> 0000-0cf7 : PCI Bus 0000:00
>   0000-001f : dma1
>   0020-0021 : pic1
>   0040-0043 : timer0
>   0050-0053 : timer1
>   0060-0060 : keyboard
>   0064-0064 : keyboard
>   0070-0077 : rtc0
>   0080-008f : dma page reg
>   00a0-00a1 : pic2
>   00c0-00df : dma2
>   00f0-00ff : fpu
> 
> So, the idea in the patch is that if PCI Bus 0000:00 does not exist because
> of no PCI host, then we should fail a request to an IO port region.

I'm not convinced about this last sentence.

It's true that on most modern systems, including that Intel PCH, the
Super I/O controller is attached via an LPC bridge on a PCI bus.

But I don't think it's an actual requirement that PCI be involved.
There certainly once were systems, e.g., PC/104, that had ISA devices
but no PCI.  Maybe Super I/O attached via ISA is obsolete enough that
we don't care any more, but I really don't know.

> > On x86, I think inb/inw/inl from a port where nothing responds
> > probably just returns ~0, and outb/outw/outl just get dropped.
> > Shouldn't arm64 do the same, without crashing?
> 
> That would be ideal and we're doing something similar in patch 2/3.
> 
> So on ARM64 we have to IO remap the PCI IO resource. If this mapping is not
> done (due to no PCI host), then any inb/inw/inl calls will crash the system.

My take is that ARM64 is responsible for implementing inb/inw/inl in
such a way that they don't crash.  I don't think it's practical to
update all the old ISA drivers or even the core code to work around
that.
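
As a rough userspace model of what I mean (all names here are
hypothetical, and this is not the arm64 or logical PIO implementation),
accessors that tolerate a missing mapping could behave like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an I/O port window that may never get mapped. */
struct pio_window {
	bool mapped;
	uint8_t regs[256];	/* stands in for the remapped I/O space */
};

/* Reads from an unmapped or out-of-range port return all ones. */
static uint8_t safe_inb(const struct pio_window *w, unsigned long port)
{
	if (!w->mapped || port >= sizeof(w->regs))
		return 0xff;
	return w->regs[port];
}

/* Writes to an unmapped or out-of-range port are silently dropped. */
static void safe_outb(struct pio_window *w, uint8_t val, unsigned long port)
{
	if (!w->mapped || port >= sizeof(w->regs))
		return;
	w->regs[port] = val;
}
```

A probe routine that sees 0xff where it expects a chip ID would then just
conclude that no device is present, as I believe happens on x86.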

> So in patch 2/3, I am also making the change to the logical PIO inb/inw/inl
> accessors to discard accesses when no PCI MMIO regions are registered in
> logical PIO space.
> 
> This is really a second line of defense (this patch being the first).
> 
> > > root@(none)$root@(none)$ insmod f71882fg.ko
> > > [  152.215377] Unable to handle kernel paging request at virtual address 
> > > 7dfffee0002e
> > > [  152.231299] Mem abort info:
> > > [  152.236898]   ESR = 0x9646
> > > [  152.243019]   Exception class = DABT (current EL), IL = 32 bits
> > > [  152.254905]   SET = 0, FnV = 0
> > > [  152.261024]   EA = 0, S1PTW = 0
> > > [  152.267320] Data abort info:
> > > [  152.273091]   ISV = 0, ISS = 0x0046
> > > [  152.280784]   CM = 0, WnR = 1
> > > [  152.286730] swapper pgtable: 4k pages, 48-bit VAs, pgdp = 
> > > (ptrval)
> > > [  152.300537] [7dfffee0002e] pgd=0141c003, 
> > > pud=0141d003, pmd=
> > > [  152.318016] Internal error: Oops: 9646 [#1] PREEMPT SMP
> > > [  152.329199] Modules linked in: f71882fg(+)
> > > [  152.337415] CPU: 8 PID: 2732 Comm: insmod Not tainted 
> > > 5.1.0-rc1-2-gab1a0e9200b8-dirty #102
> > > [  152.354712] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon 
> > > D05 IT21 Nemo 2.0 RC0 04/18/2018
> > > [  152.373058] pstate: 8005 (Nzcv daif -PAN -UAO)
> > > [  152.382675] pc : logic_outb+0x54/0xb8
> > > [  152.390017] lr : f71882fg_find+0x64/0x390 [f71882fg]
> > > [  152.399977] sp : 13393aa0
> > > [  152.406618] x29: 13393aa0 x28: 08b98b10
> > > [  152.417278] x27: 13393df0 x26: 0100
> > > [  152.427938] x25: 801f8c872d30 x24: 1142
> > > [  152.438598] x23: 801fb49d2940 x22: 11291000
> > > [  152.449257] x21: 002e x20: 0087
> > > [  152.459917] x19: 13393b44 x18: 

[PATCH 1/3] fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock

2019-03-26 Thread Kirill Smelkov
Commit 9c225f2655 (vfs: atomic f_pos accesses as per POSIX) added locking for
file.f_pos access and in particular made concurrent read and write impossible:
both functions now take the f_pos lock for the whole run, so if e.g. a read is
blocked waiting for data, a write will deadlock waiting for that read to
complete. This caused a regression for stream-like files, where read and write
could previously run simultaneously but no longer can after that patch. See
e.g. 581d21a2d0 (xenbus: fix deadlock on writes to /proc/xen/xenbus), which
fixes this regression for the particular case of /proc/xen/xenbus.
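
The locking problem can be modelled in a few lines of plain C. This is
only a sketch with hypothetical names: a boolean stands in for the f_pos
mutex, and the stream flag models an open mode that skips position locking
altogether, which is the idea the patch title refers to.

```c
#include <stdbool.h>

/* Hypothetical model of an open file for this sketch. */
struct sim_file {
	bool stream;		/* opened stream-like: no f_pos locking */
	bool pos_locked;	/* stands in for the f_pos mutex */
};

/*
 * Try to enter read() or write(). Returns false when the caller would
 * have to block on the position lock.
 */
static bool sim_enter_io(struct sim_file *f)
{
	if (f->stream)
		return true;
	if (f->pos_locked)
		return false;
	f->pos_locked = true;
	return true;
}

/* Leave read() or write(), releasing the position lock if it was taken. */
static void sim_exit_io(struct sim_file *f)
{
	if (!f->stream)
		f->pos_locked = false;
}
```

With stream unset, a second sim_enter_io() fails for as long as the first
caller has not left, which is the read-blocks-write situation; with stream
set, both calls succeed.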

The patch that added the f_pos lock in 2014 (see https://lkml.org/lkml/2014/2/17/324
for background discussion) did so to guarantee POSIX thread safety for
read/write/lseek and added the locking to file descriptors of all regular
files. In 2014 that thread-safety problem was not new, as it had already been
discussed in 2006: https://lwn.net/Articles/180387. However, even though the 2006
version of Linus's patch (https://lwn.net/Articles/180396) added f_pos
locking "only for files that are marked seekable with FMODE_LSEEK (thus avoiding
the stream-like objects like pipes and sockets)", the 2014 version, the one that
actually made it into the tree as 9c225f2655, does so regardless of whether
a file is seekable or not. The reason is probably that there are many files that
are marked non-seekable but whose read implementation, for example, actually
depends on knowing the current position to handle the read correctly. Some
examples:

    kernel/power/user.c             snapshot_read
    fs/debugfs/file.c               u32_array_read
    fs/fuse/control.c               fuse_conn_waiting_read + ...
    drivers/hwmon/asus_atk0110.c    atk_debugfs_ggrp_read
    arch/s390/hypfs/inode.c         hypfs_read_iter
    ...

Despite that, many nonseekable_open users implement read and write with pure
stream semantics: they don't depend on the passed ppos at all. For those cases
where read could wait for something inside, this creates a situation similar to
xenbus: the write can never proceed until the read is done, and the read is
waiting for some, potentially external, event, for a potentially unbounded time
-> deadlock. Besides xenbus, there are 14 such places in the kernel that I've
found with a semantic patch (see below):

    drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write()
    drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write()
    drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write()
    drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write()
    net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write()
    drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write()
    drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write()
    drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write()
    net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write()
    drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write()
    drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write()
    drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write()
    drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write()
    drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write()

In addition to the cases above, another regression caused by f_pos locking is
that FUSE filesystems that implement open with the FOPEN_NONSEEKABLE flag can
no longer implement bidirectional stream-like files, for the same reason
as above: e.g. read can deadlock write by holding the file.f_pos lock in the kernel.
FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990f7 (fuse: implement
nonseekable open) to support OSSPD (https://github.com/libfuse/osspd;
https://lwn.net/Articles/308445). OSSPD implements /dev/dsp in userspace with
FOPEN_NONSEEKABLE flag, with corresponding read and write routines not
depending on current position at all, and with both read and write being
potentially blocking operations:

https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406
https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477
https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510

Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as
"somewhat pipe-like files ..." with read handler not using offset. However
that test implements only read without write and cannot exercise the deadlock
scenario:


https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131

https://github.com/libfuse/libfu

Re: [PATCH v3] kmemleak: survive in a low-memory situation

2019-03-26 Thread Qian Cai
On 3/26/19 12:06 PM, Catalin Marinas wrote:
> I wonder whether we'd be better off to replace the metadata allocator
> with gen_pool. This way we'd also get rid of early logging/replaying of
> the memory allocations since we can populate the gen_pool early with a
> static buffer.

I suppose this is not going to work well, as DMA_API_DEBUG uses a similar
approach [1] but I still saw it struggling in a low-memory situation and
disabling itself occasionally.

[1] https://lkml.org/lkml/2018/12/10/383



Re: New feature/ABI review process [was Re: [RESEND PATCH v6 04/12] x86/fsgsbase/64:..]

2019-03-26 Thread Andi Kleen
> 
> If you want to advocate the more complex design of mixed SWAPGS/FSGSBASE
> then provide numbers and not hand-waving. Numbers of real-world workloads,
> not numbers of artificial test cases which exercise the rare worst case.

Well you're proposing the much more complicated solution, not me.

SWAPGS is simple and it works everywhere except for paranoid.

> Yes, it's extra work and it's well spent. If the numbers are not
> significantly different then the simpler and consistent design is a clear
> win.

As long as everything is cache hot it's likely only a couple
of cycles difference (as Intel CPUs are very good at executing
crappy code too), but if it's not then you end up with a huge cache miss
cost, causing jitter. That's a problem for real time for example.

>   > Accessing user GSBASE needs a couple of SWAPGS operations. It is
>   > avoidable if the user GSBASE is saved at kernel entry, being updated as
>   > changes, and restored back at kernel exit. However, it seems to spend
>   > more cycles for savings and restorations. Little or no benefit was
>   > measured from experiments.
> 
> So little or no benefit was measured. I don't see how that maps to your
> 'SWAPGS will be a lot faster' claim. One of those claims is obviously
> wrong.

If everything is cache hot it won't make much difference,
but if you have a cache miss you end up eating the cost.

> 
> Aside of this needs more than numbers:
> 
>   1) Proper documentation how the mixed bag is managed.

How SWAPGS is managed?

Like it always has been in the 20+ years since the x86_64
port was originally born.

The only case which has to do two SWAPGS is the
context switch when it switches the base. Everything else
just does SWAPGS at the edges for kernel entries.

> You have a track record of not caring much about either of these, but I
> very much care for good reasons. I've been bitten by glued on and half
> baked patches from Intel in the past 10 years so many times, that I'm
> simply refusing to take anything which is not properly structured and
> documented.

In this case you're proposing the change, the Intel patch just leaves
SWAPGS alone. So you have to describe why it's a good idea.
At least what you proposed on this wasn't convincing
and would be rejected by a proper code review.

-Andi



Re: [PATCH v2 2/4] mm/sparse: Optimize sparse_add_one_section()

2019-03-26 Thread Baoquan He
Hi Michal,

On 03/26/19 at 03:31pm, Michal Hocko wrote:
> > > > OK, I am fine to drop it. Or only put the section existence checking
> > > > earlier to avoid unnecessary usemap/memmap allocation?
> > > 
> > > Do you have any data on how often that happens? That should basically
> > > never happen, right?
> > 
> > Oh, you think about it in this aspect. Yes, it rarely happens.
> > Always allocating firstly can increase efficiency. Then I will just drop
> > it.
> 
> OK, let me try once more. Doing a check early is something that makes
> sense in general. Another question is whether the check is needed at
> all. So rather than fiddling with its placement I would go whether it is
> actually failing at all. I suspect it doesn't because the memory hotplug
> is currently enforced to be section aligned. There are people who would
> like to allow subsection or section unaligned aware hotplug and then
> this would be much more relevant but without any solid justification
> such a patch is not really helpful because it might cause code conflicts
> with other work or obscure the git blame tracking by an additional hop.
> 
> In short, if you want to optimize something then make sure you describe
> what you are optimizing and how it helps.

I must have been dizzy last night when thinking and replying to mail. I thought
about it for a while and got the point you may have meant. Now that I check the
mail and rethink it, that reply may have caused a misunderstanding. It doesn't
actually make sense as an optimization; it is just moving a small code block. I
now agree with you that it doesn't optimize anything and may impact other
people's code changes. Sorry about that.

Thanks
Baoquan


Re: [PATCH] EDAC/amd64: Use maximum channel count for the EDAC channel layer size

2019-03-26 Thread Borislav Petkov
On Tue, Mar 26, 2019 at 07:15:29PM +, Ghannam, Yazen wrote:
> Just tested on a fully populated system. Everything seems to be okay.

Thanks, queued.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH] timekeeping: Force upper bound for setting CLOCK_REALTIME

2019-03-26 Thread Thomas Gleixner
On Tue, 26 Mar 2019, Arnd Bergmann wrote:
> On Tue, Mar 26, 2019 at 1:31 PM Thomas Gleixner  wrote:
> >
> > On Tue, 26 Mar 2019, Miroslav Lichvar wrote:
> > > On Sat, Mar 23, 2019 at 11:36:19AM +0100, Thomas Gleixner wrote:
> > > > > It is reasonable to force an upper bound for the various methods of
> > > > > setting CLOCK_REALTIME. Year 2262 is the absolute upper bound. Assume a
> > > > > maximum uptime of 30 years which is plenty enough even for esoteric
> > > > > embedded systems. That results in an upper bound of year 2232 for setting
> > > > > the time.
> > >
> > > The patch looks good to me.
> > >
> > > I like this approach better than using a larger value closer to the
> > > overflow (e.g. one week) and stepping the clock back automatically
> > > when the clock reaches that time, but I suspect it might possibly
> > > break more tests (or any unusual applications messing with time) as a
> > > much larger interval is now EINVAL.
> >
> > I'm fine with breaking a few tests on the way rather than having undefined
> > behaviour and the constant flow of patches tackling the wrong end of the
> > stick.
> 
> I think the one downside of your approach is that it introduces a second
> arbitrary cut-off point after which the system almost functions perfectly,
> but is no longer able to do ntp updates or set the right time after a reboot.

Yes, I'm aware of that. But we talk about 213 years from now. Assume we can
fix that properly before the two of us retire. Then you'd need a system which
runs a 180-200 years old kernel in 2232 to run into that problem for real.
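
The arithmetic behind the 2262 and 2232 figures can be checked with a short
userspace sketch. This is an illustration only: the macro and function names
below (NS64_SEC_MAX, UPTIME_SEC_MAX, SETTIME_SEC_MAX, approx_year,
settime_is_valid) are hypothetical and are not taken from the patch. It assumes
the internal clock is a signed 64-bit nanosecond count since 1970 and that
"30 years" means 30 * 365 days, which is also why the cutoff lands a few days
short of exactly 30 calendar years.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC     INT64_C(1000000000)
/* Largest whole-second value that fits in signed 64-bit nanoseconds. */
#define NS64_SEC_MAX     (INT64_MAX / NSEC_PER_SEC)
/* Assumed maximum uptime: 30 years of 365 days each (leap days ignored). */
#define UPTIME_SEC_MAX   (INT64_C(30) * 365 * 24 * 3600)
/* Hypothetical upper bound for values accepted when setting the clock. */
#define SETTIME_SEC_MAX  (NS64_SEC_MAX - UPTIME_SEC_MAX)
/* Average Gregorian year: 365.2425 days = 31556952 seconds. */
#define AVG_YEAR_SEC     INT64_C(31556952)

/* Rough calendar year for a count of seconds since 1970. */
static int approx_year(int64_t secs_since_epoch)
{
    return 1970 + (int)(secs_since_epoch / AVG_YEAR_SEC);
}

/* Accept only values that leave UPTIME_SEC_MAX of headroom before overflow. */
static int settime_is_valid(int64_t tv_sec)
{
    return tv_sec >= 0 && tv_sec < SETTIME_SEC_MAX;
}
```

With these definitions, approx_year(NS64_SEC_MAX) evaluates to 2262 and
approx_year(SETTIME_SEC_MAX) to 2232, matching the years quoted above.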

There is actually a proper solution for this (ignore RTCs). All user space
interfaces are going to be timespec64 based soon. So they can accommodate
more than 1e11 years.

Now if the kernel internally uses special functions to convert from and to
timespec64 for all interfaces which deal with CLOCK_REALTIME absolute time,
then we still can manage the internal representation in u64 nanoseconds and
have an offset added/subtracted on the relevant interfaces.

That's going to be a bit hairy when time is set back or forth so it needs
to adjust that internal offset, but for regular operation it might be good
enough to have the possible time setting limited to a fixed range depending
on the initial offset.

But even updating the offset should be manageable. The conversion functions
would need a seqcount loop and the resulting internal values would be a
struct containing the value and the offset at conversion time. That'd allow
to fix them up at any boundary later on. Not that I want to do that, but if
absolutely necessary, it can be done.
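
A minimal userspace sketch of the idea described above, for illustration only.
It uses C11 atomics to imitate a sequence-counter retry loop; it is not the
kernel's seqcount API, and every name in it (ts64, internal_time, set_offset,
to_internal, from_internal) is hypothetical. It assumes the wall-clock value is
never earlier than the current offset.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/* Hypothetical 64-bit timespec, modeled loosely on timespec64. */
struct ts64 {
    int64_t tv_sec;
    long    tv_nsec;
};

/* Internal value plus the offset that was in effect at conversion time. */
struct internal_time {
    uint64_t ns;
    int64_t  offset_sec;
};

static atomic_uint     offset_seq;  /* even: stable, odd: update in progress */
static _Atomic int64_t offset_sec;  /* wall-clock second mapped to internal 0 */

/* Writer side: bump the sequence around the offset update. */
static void set_offset(int64_t new_offset_sec)
{
    atomic_fetch_add(&offset_seq, 1);           /* sequence becomes odd */
    atomic_store(&offset_sec, new_offset_sec);
    atomic_fetch_add(&offset_seq, 1);           /* sequence becomes even */
}

/* Reader side: retry until a consistent offset was observed. */
static struct internal_time to_internal(struct ts64 ts)
{
    struct internal_time it;
    unsigned int seq;

    do {
        seq = atomic_load(&offset_seq);
        it.offset_sec = atomic_load(&offset_sec);
        it.ns = (uint64_t)(ts.tv_sec - it.offset_sec) * NSEC_PER_SEC
                + (uint64_t)ts.tv_nsec;
    } while ((seq & 1) || seq != atomic_load(&offset_seq));

    return it;
}

/* The stored offset lets the value be converted back (or fixed up) later. */
static struct ts64 from_internal(struct internal_time it)
{
    struct ts64 ts;

    ts.tv_sec  = (int64_t)(it.ns / NSEC_PER_SEC) + it.offset_sec;
    ts.tv_nsec = (long)(it.ns % NSEC_PER_SEC);
    return ts;
}
```

Because each converted value carries the offset it was computed with, a later
change of the offset does not invalidate values that are already stored.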

> That said, all other ideas I've managed to come up with are worse,
>  so I agree on going ahead with this version.
> 
> We could still bikeshed over the exact cutoff time, as the one you
> picked isn't particularly intuitive. It's almost exactly 30 years before
> the final end point, but your calculation is off by a few days because
> of leap years. And no, I don't have a particular preference for any
> other color of this bikeshed either, it's probably as good as any other
> time within 20 years of what you suggested.

Haha, we surely could bikeshed that until retirement and then hand it over
to the next generations which might come to an agreement shortly before
2262 :)

Thanks,

tglx


[PATCH tip/core/rcu 0/2] straggling consolidation cleanups for v5.2

2019-03-26 Thread Paul E. McKenney
Hello!

This series contains a few straggling RCU consolidation updates:

1.  Update kprobes's documentation of obsolete RCU update functions.

2.  Update netfilter comment from call_rcu_bh() to call_rcu()

Thanx, Paul



 Documentation/kprobes.txt  |6 +++---
 net/ipv4/netfilter/ipt_CLUSTERIP.c |2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)



[PATCH tip/core/rcu 2/2] net/ipv4/netfilter: Update comment from call_rcu_bh() to call_rcu()

2019-03-26 Thread Paul E. McKenney
The RCU flavors have been consolidated, so this commit replaces a
comment's mention of call_rcu_bh() with call_rcu().

Signed-off-by: Paul E. McKenney 
Cc: Pablo Neira Ayuso 
Cc: Florian Westphal 
Cc: "David S. Miller" 
Cc: Alexey Kuznetsov 
Cc: Hideaki YOSHIFUJI 
---
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 835d50b279f5..a2a88ab07f7b 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -56,7 +56,7 @@ struct clusterip_config {
 #endif
enum clusterip_hashmode hash_mode;  /* which hashing mode */
u_int32_t hash_initval; /* hash initialization */
-   struct rcu_head rcu;/* for call_rcu_bh */
+   struct rcu_head rcu;/* for call_rcu */
struct net *net;/* netns for pernet list */
char ifname[IFNAMSIZ];  /* device ifname */
 };
-- 
2.17.1



linux-next: build failure after merge of the sound-asoc tree

2019-03-26 Thread Stephen Rothwell
Hi all,

After merging the sound-asoc tree, today's linux-next build (x86_64
allmodconfig) failed like this:

In file included from include/linux/printk.h:330,
 from include/linux/kernel.h:15,
 from include/linux/clk.h:16,
 from sound/soc/fsl/fsl_audmix.c:8:
sound/soc/fsl/fsl_audmix.c: In function 'fsl_audmix_state_trans':
include/linux/dynamic_debug.h:80:13: error: initializer element is not constant
   .format = (fmt),\
 ^
include/linux/dynamic_debug.h:116:2: note: in expansion of macro 'DEFINE_DYNAMIC_DEBUG_METADATA'
  DEFINE_DYNAMIC_DEBUG_METADATA(id, fmt);  \
  ^
include/linux/dynamic_debug.h:136:2: note: in expansion of macro '__dynamic_func_call'
  __dynamic_func_call(__UNIQUE_ID(ddebug), fmt, func, ##__VA_ARGS__)
  ^~~
include/linux/dynamic_debug.h:150:2: note: in expansion of macro '_dynamic_func_call'
  _dynamic_func_call(fmt,__dynamic_dev_dbg,   \
  ^~
include/linux/device.h:1493:2: note: in expansion of macro 'dynamic_dev_dbg'
  dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
  ^~~
sound/soc/fsl/fsl_audmix.c:93:3: note: in expansion of macro 'dev_dbg'
   dev_dbg(comp->dev, prm.msg);
   ^~~
include/linux/dynamic_debug.h:80:13: note: (near initialization for '__UNIQUE_ID_ddebug374.format')
   .format = (fmt),\
 ^
include/linux/dynamic_debug.h:116:2: note: in expansion of macro 'DEFINE_DYNAMIC_DEBUG_METADATA'
  DEFINE_DYNAMIC_DEBUG_METADATA(id, fmt);  \
  ^
include/linux/dynamic_debug.h:136:2: note: in expansion of macro '__dynamic_func_call'
  __dynamic_func_call(__UNIQUE_ID(ddebug), fmt, func, ##__VA_ARGS__)
  ^~~
include/linux/dynamic_debug.h:150:2: note: in expansion of macro '_dynamic_func_call'
  _dynamic_func_call(fmt,__dynamic_dev_dbg,   \
  ^~
include/linux/device.h:1493:2: note: in expansion of macro 'dynamic_dev_dbg'
  dynamic_dev_dbg(dev, dev_fmt(fmt), ##__VA_ARGS__)
  ^~~
sound/soc/fsl/fsl_audmix.c:93:3: note: in expansion of macro 'dev_dbg'
   dev_dbg(comp->dev, prm.msg);
   ^~~

Caused by commit

  be1df61cf06e ("ASoC: fsl: Add Audio Mixer CPU DAI driver")

I have reverted that commit (and its 2 following ones) for today.
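
The failure pattern in the log above can be reproduced outside the kernel. The
sketch below is an illustration only: dbg_meta, dbg_emit, dbg_print, state_msg
and report_state are hypothetical names, not the kernel's dynamic-debug API. It
assumes, as the log suggests, that the format argument ends up in a static
initializer and therefore has to be a string literal. Passing the runtime
string through a constant "%s" format is one conventional way to satisfy that.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for per-callsite debug metadata: the format lives in
 * a static object, so its initializer must be a constant expression. */
struct dbg_meta {
    const char *format;
};

static char dbg_last[128];  /* last formatted message, kept for inspection */

static void dbg_emit(const struct dbg_meta *meta, ...)
{
    va_list ap;

    va_start(ap, meta);
    vsnprintf(dbg_last, sizeof(dbg_last), meta->format, ap);
    va_end(ap);
}

/* Needs at least one argument after fmt in this simplified form. */
#define dbg_print(fmt, ...) do {                                        \
        static const struct dbg_meta meta = { .format = (fmt) };        \
        dbg_emit(&meta, __VA_ARGS__);                                   \
    } while (0)

struct state_msg {
    const char *msg;  /* runtime-selected message text */
};

static const char *report_state(const struct state_msg *prm)
{
    /*
     * Using prm->msg directly as the format would fail to compile with
     * "initializer element is not constant", because prm->msg is only
     * known at run time. A literal format with the message as argument
     * keeps the static initializer constant.
     */
    dbg_print("%s", prm->msg);
    return dbg_last;
}
```

A side benefit of the "%s" form is that any '%' characters in the message are
printed as-is instead of being interpreted as conversion specifiers.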

-- 
Cheers,
Stephen Rothwell




Re: INFO: rcu detected stall in __perf_sw_event

2019-03-26 Thread syzbot

syzbot has bisected this bug to:

commit cf85d89562f39cc7ae73de54639f1915a9195b7a
Author: Finn Thain 
Date:   Fri May 25 07:34:36 2018 +

m68k/mac: Enable PDMA for PowerBook 500 series

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1226cb8b20
start commit:   b0314565 Merge tag 'for_linus' of git://git.kernel.org/pub..
git tree:   upstream
final crash:https://syzkaller.appspot.com/x/report.txt?x=1126cb8b20
console output: https://syzkaller.appspot.com/x/log.txt?x=1626cb8b20
kernel config:  https://syzkaller.appspot.com/x/.config?x=8f00801d7b7c4fe6
dashboard link: https://syzkaller.appspot.com/bug?extid=a41ac89a0712acde0e84
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1707cd2f40

Reported-by: syzbot+a41ac89a0712acde0...@syzkaller.appspotmail.com
Fixes: cf85d89562f3 ("m68k/mac: Enable PDMA for PowerBook 500 series")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection


[PATCH tip/core/rcu 0/4] Documentation updates for v5.2

2019-03-26 Thread Paul E. McKenney
Hello!

This series contains documentation updates:

1.  Remove obsolete RCU update functions from RCU documentation.

2.  Repair some whitespace damage, courtesy of Tycho Andersen.

3.  Describe choice of rcu_dereference() APIs and __rcu usage.

4.  Fix typos and otherwise modernize checklist.txt.

Thanx, Paul



 Design/Data-Structures/Data-Structures.html |3 
 Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html |4 
 Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html|5 
 NMI-RCU.txt |   13 -
 UP.txt  |6 
 checklist.txt   |  119 +---
 rcu.txt |8 
 rcu_dereference.txt |  103 ++
 rcubarrier.txt  |   27 +-
 whatisRCU.txt   |   10 -
 10 files changed, 199 insertions(+), 99 deletions(-)



[PATCH tip/core/rcu 3/4] doc: Describe choice of rcu_dereference() APIs and __rcu usage

2019-03-26 Thread Paul E. McKenney
Reported-by: Andrew Morton 
Signed-off-by: Paul E. McKenney 
---
 Documentation/RCU/rcu_dereference.txt | 103 ++
 1 file changed, 103 insertions(+)

diff --git a/Documentation/RCU/rcu_dereference.txt b/Documentation/RCU/rcu_dereference.txt
index ab96227bad42..bf699e8cfc75 100644
--- a/Documentation/RCU/rcu_dereference.txt
+++ b/Documentation/RCU/rcu_dereference.txt
@@ -351,3 +351,106 @@ garbage values.
 
 In short, rcu_dereference() is -not- optional when you are going to
 dereference the resulting pointer.
+
+
+WHICH MEMBER OF THE rcu_dereference() FAMILY SHOULD YOU USE?
+
+First, please avoid using rcu_dereference_raw() and also please avoid
+using rcu_dereference_check() and rcu_dereference_protected() with a
+second argument with a constant value of 1 (or true, for that matter).
+With that caution out of the way, here is some guidance for which
+member of the rcu_dereference() family to use in various situations:
+
+1. If the access needs to be within an RCU read-side critical
+   section, use rcu_dereference().  With the new consolidated
+   RCU flavors, an RCU read-side critical section is entered
+   using rcu_read_lock(), anything that disables bottom halves,
+   anything that disables interrupts, or anything that disables
+   preemption.
+
+2. If the access might be within an RCU read-side critical section
+   on the one hand, or protected by (say) my_lock on the other,
+   use rcu_dereference_check(), for example:
+
+   p1 = rcu_dereference_check(p->rcu_protected_pointer,
+  lockdep_is_held(&my_lock));
+
+
+3. If the access might be within an RCU read-side critical section
+   on the one hand, or protected by either my_lock or your_lock on
+   the other, again use rcu_dereference_check(), for example:
+
+   p1 = rcu_dereference_check(p->rcu_protected_pointer,
+  lockdep_is_held(&my_lock) ||
+  lockdep_is_held(&your_lock));
+
+4. If the access is on the update side, so that it is always protected
+   by my_lock, use rcu_dereference_protected():
+
+   p1 = rcu_dereference_protected(p->rcu_protected_pointer,
+  lockdep_is_held(&my_lock));
+
+   This can be extended to handle multiple locks as in #3 above,
+   and both can be extended to check other conditions as well.
+
+5. If the protection is supplied by the caller, and is thus unknown
+   to this code, that is the rare case when rcu_dereference_raw()
+   is appropriate.  In addition, rcu_dereference_raw() might be
+   appropriate when the lockdep expression would be excessively
+   complex, except that a better approach in that case might be to
+   take a long hard look at your synchronization design.  Still,
+   there are data-locking cases where any one of a very large number
+   of locks or reference counters suffices to protect the pointer,
+   so rcu_dereference_raw() does have its place.
+
+   However, its place is probably quite a bit smaller than one
+   might expect given the number of uses in the current kernel.
+   Ditto for its synonym, rcu_dereference_check( ... , 1), and
+   its close relative, rcu_dereference_protected(... , 1).
+
+
+SPARSE CHECKING OF RCU-PROTECTED POINTERS
+
+The sparse static-analysis tool checks for direct access to RCU-protected
+pointers, which can result in "interesting" bugs due to compiler
+optimizations involving invented loads and perhaps also load tearing.
+For example, suppose someone mistakenly does something like this:
+
+   p = q->rcu_protected_pointer;
+   do_something_with(p->a);
+   do_something_else_with(p->b);
+
+If register pressure is high, the compiler might optimize "p" out
+of existence, transforming the code to something like this:
+
+   do_something_with(q->rcu_protected_pointer->a);
+   do_something_else_with(q->rcu_protected_pointer->b);
+
+This could fatally disappoint your code if q->rcu_protected_pointer
+changed in the meantime.  Nor is this a theoretical problem:  Exactly
+this sort of bug cost Paul E. McKenney (and several of his innocent
+colleagues) a three-day weekend back in the early 1990s.
+
+Load tearing could of course result in dereferencing a mashup of a pair
+of pointers, which also might fatally disappoint your code.
+
+These problems could have been avoided simply by making the code instead
+read as follows:
+
+   p = rcu_dereference(q->rcu_protected_pointer);
+   do_something_with(p->a);
+   do_something_else_with(p->b);
+
+Unfortunately, these sorts of bugs can be extremely hard to spot during
+review.  This is where the sparse tool comes into play, along with the
+"__rcu" marker.  If you mark a pointer declaration, whether in a structure
+or as a formal parameter, with "__rcu", which tells sparse to compl

[PATCH tip/core/rcu 2/4] doc: Repair some whitespace damage

2019-03-26 Thread Paul E. McKenney
From: Tycho Andersen 

A diagram in whatisRCU.txt has space character before tabs.  This commit
therefore makes this diagram consistent with elsewhere in the document:
Use one leading tab, followed by spaces for any additional whitespace
required.

Signed-off-by: Tycho Andersen 
Signed-off-by: Paul E. McKenney 
---
 Documentation/RCU/whatisRCU.txt | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 1ace20815bb1..981651a8b65d 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -310,7 +310,7 @@ reader, updater, and reclaimer.
 
 
rcu_assign_pointer()
-   ++
+   ++
+-->| reader |-+
|   ++ |
|   |  |
@@ -318,12 +318,12 @@ reader, updater, and reclaimer.
|   |  | rcu_read_lock()
|   |  | rcu_read_unlock()
|rcu_dereference()  |  |
-   +-+  |  |
-   | updater |<-+  |
-   +-+ V
+   +-+ |  |
+   | updater |<+  |
+   +-+V
|+---+
+--->| reclaimer |
-+---+
++---+
  Defer:
  synchronize_rcu() & call_rcu()
 
-- 
2.17.1



[PATCH tip/core/rcu 1/4] doc: Remove obsolete RCU update functions from RCU documentation

2019-03-26 Thread Paul E. McKenney
Now that synchronize_rcu_bh, synchronize_rcu_bh_expedited, call_rcu_bh,
rcu_barrier_bh, synchronize_sched, synchronize_sched_expedited,
call_rcu_sched, rcu_barrier_sched, get_state_synchronize_sched,
and cond_synchronize_sched are obsolete, let's remove them from the
documentation aside from a small historical section.

Signed-off-by: Paul E. McKenney 
---
 .../Data-Structures/Data-Structures.html  |  3 +-
 .../Expedited-Grace-Periods.html  |  4 +-
 .../Tree-RCU-Memory-Ordering.html |  5 +-
 Documentation/RCU/NMI-RCU.txt | 13 ++--
 Documentation/RCU/UP.txt  |  6 +-
 Documentation/RCU/checklist.txt   | 76 +--
 Documentation/RCU/rcu.txt |  8 +-
 Documentation/RCU/rcubarrier.txt  | 27 ---
 8 files changed, 66 insertions(+), 76 deletions(-)

diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
index 18f179807563..c30c1957c7e6 100644
--- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html
+++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
@@ -155,8 +155,7 @@ keeping lock contention under control at all tree levels regardless
 of the level of loading on the system.
 
 RCU updaters wait for normal grace periods by registering
-RCU callbacks, either directly via call_rcu() and
-friends (namely call_rcu_bh() and call_rcu_sched()),
+RCU callbacks, either directly via call_rcu()
 or indirectly via synchronize_rcu() and friends.
 RCU callbacks are represented by rcu_head structures,
 which are queued on rcu_data structures while they are
diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html
index 19e7a5fb6b73..57300db4b5ff 100644
--- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html
+++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.html
@@ -56,6 +56,7 @@ sections.
 RCU-preempt Expedited Grace Periods
 
 
+CONFIG_PREEMPT=y kernels implement RCU-preempt.
 The overall flow of the handling of a given CPU by an RCU-preempt
 expedited grace period is shown in the following diagram:
 
@@ -139,6 +140,7 @@ or offline, among other things.
 RCU-sched Expedited Grace Periods
 
 
+CONFIG_PREEMPT=n kernels implement RCU-sched.
 The overall flow of the handling of a given CPU by an RCU-sched
 expedited grace period is shown in the following diagram:
 
@@ -146,7 +148,7 @@ expedited grace period is shown in the following diagram:
 
 
 As with RCU-preempt, RCU-sched's
-synchronize_sched_expedited() ignores offline and
+synchronize_rcu_expedited() ignores offline and
 idle CPUs, again because they are in remotely detectable
 quiescent states.
 However, because the
diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
index 8d21af02b1f0..c64f8d26609f 100644
--- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
+++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.html
@@ -34,12 +34,11 @@ Similarly, any code that happens before the beginning of a given RCU grace
 period is guaranteed to see the effects of all accesses following the end
 of that grace period that are within RCU read-side critical sections.
 
-This guarantee is particularly pervasive for synchronize_sched(),
-for which RCU-sched read-side critical sections include any region
+Note well that RCU-sched read-side critical sections include any region
 of code for which preemption is disabled.
 Given that each individual machine instruction can be thought of as
 an extremely small region of preemption-disabled code, one can think of
-synchronize_sched() as smp_mb() on steroids.
+synchronize_rcu() as smp_mb() on steroids.
 
 RCU updaters use this guarantee by splitting their updates into
 two phases, one of which is executed before the grace period and
diff --git a/Documentation/RCU/NMI-RCU.txt b/Documentation/RCU/NMI-RCU.txt
index 68f83b23..881353fd5bff 100644
--- a/Documentation/RCU/NMI-RCU.txt
+++ b/Documentation/RCU/NMI-RCU.txt
@@ -81,18 +81,19 @@ currently executing on some other CPU.  We therefore cannot free
 up any data structures used by the old NMI handler until execution
 of it completes on all other CPUs.
 
-One way to accomplish this is via synchronize_sched(), perhaps as
+One way to accomplish this is via synchronize_rcu(), perhaps as
 follows:
 
unset_nmi_callback();
-   synchronize_sched();
+   synchronize_rcu();
kfree(my_nmi_data);
 
-This works because synchronize_sched() blocks until all CPUs complete
-any preemption-disabled segments of code that they were executing.
-Since NMI handlers disable preemption, synchronize_sched() is guaranteed
+This works because (as of v4.2

[PATCH tip/core/rcu 4/4] doc: Fix typos and otherwise modernize checklist.txt

2019-03-26 Thread Paul E. McKenney
This commit fixes some issues with Documentation/RCU/checklist.txt.

Signed-off-by: Paul E. McKenney 
---
 Documentation/RCU/checklist.txt | 43 +++--
 1 file changed, 25 insertions(+), 18 deletions(-)

diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index fcc59fea5cd4..e98ff261a438 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -318,7 +318,7 @@ over a rather long period of time, but improvements are always welcome!
 
 11.Any lock acquired by an RCU callback must be acquired elsewhere
with softirq disabled, e.g., via spin_lock_irqsave(),
-   spin_lock_bh(), etc.  Failing to disable irq on a given
+   spin_lock_bh(), etc.  Failing to disable softirq on a given
acquisition of that lock will result in deadlock as soon as
the RCU softirq handler happens to run your RCU callback while
interrupting that acquisition's critical section.
@@ -331,13 +331,16 @@ over a rather long period of time, but improvements are always welcome!
must use whatever locking or other synchronization is required
to safely access and/or modify that data structure.
 
-   RCU callbacks are -usually- executed on the same CPU that executed
-   the corresponding call_rcu() or call_srcu().  but are by -no-
-   means guaranteed to be.  For example, if a given CPU goes offline
-   while having an RCU callback pending, then that RCU callback
-   will execute on some surviving CPU.  (If this was not the case,
-   a self-spawning RCU callback would prevent the victim CPU from
-   ever going offline.)
+   Do not assume that RCU callbacks will be executed on the same
+   CPU that executed the corresponding call_rcu() or call_srcu().
+   For example, if a given CPU goes offline while having an RCU
+   callback pending, then that RCU callback will execute on some
+   surviving CPU.  (If this was not the case, a self-spawning RCU
+   callback would prevent the victim CPU from ever going offline.)
+   Furthermore, CPUs designated by rcu_nocbs= might well -always-
+   have their RCU callbacks executed on some other CPUs, in fact,
+   for some  real-time workloads, this is the whole point of using
+   the rcu_nocbs= kernel boot parameter.
 
 13.Unlike other forms of RCU, it -is- permissible to block in an
SRCU read-side critical section (demarked by srcu_read_lock()
@@ -379,8 +382,9 @@ over a rather long period of time, but improvements are always welcome!
never sends IPIs to other CPUs, so it is easier on
real-time workloads than is synchronize_rcu_expedited().
 
-   Note that rcu_dereference() and rcu_assign_pointer() relate to
-   SRCU just as they do to other forms of RCU.
+   Note that rcu_assign_pointer() relates to SRCU just as it does to
+   other forms of RCU, but instead of rcu_dereference() you should
+   use srcu_dereference() in order to avoid lockdep splats.
 
 14.The whole point of call_rcu(), synchronize_rcu(), and friends
is to wait until all pre-existing readers have finished before
@@ -400,6 +404,9 @@ over a rather long period of time, but improvements are always welcome!
read-side critical sections.  It is the responsibility of the
RCU update-side primitives to deal with this.
 
+   For SRCU readers, you can use smp_mb__after_srcu_read_unlock()
+   immediately after an srcu_read_unlock() to get a full barrier.
+
 16.Use CONFIG_PROVE_LOCKING, CONFIG_DEBUG_OBJECTS_RCU_HEAD, and the
__rcu sparse checks to validate your RCU code.  These can help
find problems as follows:
@@ -423,15 +430,15 @@ over a rather long period of time, but improvements are always welcome!
These debugging aids can help you find problems that are
otherwise extremely difficult to spot.
 
-17.If you register a callback using call_rcu() or call_srcu(),
-   and pass in a function defined within a loadable module,
-   then it in necessary to wait for all pending callbacks to
-   be invoked after the last invocation and before unloading
-   that module.  Note that it is absolutely -not- sufficient to
-   wait for a grace period!  The current (say) synchronize_rcu()
-   implementation waits only for all previous callbacks registered
-   on the CPU that synchronize_rcu() is running on, but it is -not-
+17.If you register a callback using call_rcu() or call_srcu(), and
+   pass in a function defined within a loadable module, then it in
+   necessary to wait for all pending callbacks to be invoked after
+   the last invocation and before unloading that module.  Note that
+   it is absolutely -not- sufficient to wait for a grace period!
+   The current (say) synchronize_rcu() implementation is -not-
guaranteed to wait for callbacks registered on other CPUs.
+   Or eve

[PATCH] Yama: mark local symbols as static

2019-03-26 Thread Jann Horn
sparse complains that Yama defines functions and a variable as non-static
even though they don't exist in any header. Fix it by making them static.

Signed-off-by: Jann Horn 
---
 security/yama/yama_lsm.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/security/yama/yama_lsm.c b/security/yama/yama_lsm.c
index 57cc60722dd3..06b14a57b0a4 100644
--- a/security/yama/yama_lsm.c
+++ b/security/yama/yama_lsm.c
@@ -206,7 +206,7 @@ static void yama_ptracer_del(struct task_struct *tracer,
  * yama_task_free - check for task_pid to remove from exception list
  * @task: task being removed
  */
-void yama_task_free(struct task_struct *task)
+static void yama_task_free(struct task_struct *task)
 {
yama_ptracer_del(task, task);
 }
@@ -401,7 +401,7 @@ static int yama_ptrace_access_check(struct task_struct *child,
  *
  * Returns 0 if following the ptrace is allowed, -ve on error.
  */
-int yama_ptrace_traceme(struct task_struct *parent)
+static int yama_ptrace_traceme(struct task_struct *parent)
 {
int rc = 0;
 
@@ -452,7 +452,7 @@ static int yama_dointvec_minmax(struct ctl_table *table, int write,
 static int zero;
 static int max_scope = YAMA_SCOPE_NO_ATTACH;
 
-struct ctl_path yama_sysctl_path[] = {
+static struct ctl_path yama_sysctl_path[] = {
{ .procname = "kernel", },
{ .procname = "yama", },
{ }
-- 
2.21.0.392.gf8f6787159e-goog



[PATCH tip/core/rcu 16/18] rcu: Eliminate redundant NULL-pointer check

2019-03-26 Thread Paul E. McKenney
Because rcu_wake_cond() checks for a null task_struct pointer, there is
no need for its callers to do so.  This commit eliminates the redundant
check.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree_plugin.h | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index f0aeb7416dcc..81d3cd821891 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1191,8 +1191,6 @@ static int rcu_boost_kthread(void *arg)
 static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
__releases(rnp->lock)
 {
-   struct task_struct *t;
-
raw_lockdep_assert_held_rcu_node(rnp);
if (!rcu_preempt_blocked_readers_cgp(rnp) && rnp->exp_tasks == NULL) {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
@@ -1206,9 +1204,8 @@ static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
if (rnp->exp_tasks == NULL)
rnp->boost_tasks = rnp->gp_tasks;
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-   t = rnp->boost_kthread_task;
-   if (t)
-   rcu_wake_cond(t, rnp->boost_kthread_status);
+   rcu_wake_cond(rnp->boost_kthread_task,
+ rnp->boost_kthread_status);
} else {
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
-- 
2.17.1
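
The reasoning in this commit, that a NULL check inside the callee makes the
same check in each caller redundant, can be sketched in plain C. This is an
illustration under assumed names only: struct task, struct node, wake_cond and
initiate_boost are hypothetical and greatly simplified compared with the
kernel functions touched by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel task; counts wakeups for inspection. */
struct task {
    int wakeups;
};

/* Hypothetical stand-in for a tree node that may or may not have a kthread. */
struct node {
    struct task *boost_task;
};

/* Wake the task unless there is none; returns 1 if a wakeup was issued. */
static int wake_cond(struct task *t)
{
    if (t == NULL)
        return 0;
    t->wakeups++;
    return 1;
}

static int initiate_boost(struct node *n)
{
    /* No "if (n->boost_task)" here: wake_cond() already handles NULL. */
    return wake_cond(n->boost_task);
}
```

Keeping the check in one place means a caller cannot forget it, at the cost of
one extra function call when the pointer is NULL.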



[PATCH tip/core/rcu 0/18] Miscellaneous fixes for v5.2

2019-03-26 Thread Paul E. McKenney
Hello!

This series contains miscellaneous fixes.

1.  Unconditionally expedite during suspend/hibernate (unless the
real-time guys have disabled expediting altogether, that is).

2.  Avoid unnecessary softirq when system is idle, courtesy of
Joel Fernandes.

3.  rcu_qs -- Use raise_softirq_irqoff to not save irqs twice,
courtesy of Cyrill Gorcunov.

4.  Make exit_rcu() handle non-preempted RCU readers.

5.  Set rcutree.kthread_prio sysfs access to read-only, courtesy
of Liu Song.

6.  MAINTAINERS: RCU now has its own email list.

7.  MAINTAINERS: Add -rcu branch name ("dev").

8.  rcu: Move common code out of if-else block, courtesy of Akira
Yokosawa.

9.  Allow rcu_nocbs= to specify all CPUs.

10. Report error for bad rcu_nocbs= parameter values.

11. Fix self-wakeups for grace-period kthread, courtesy of Neeraj
Upadhyay.

12. Default jiffies_to_sched_qs to jiffies_till_sched_qs, courtesy
of Neeraj Upadhyay.

13. Do a single rhp->func read in rcu_head_after_call_rcu(),
courtesy of Neeraj Upadhyay.

14. Update jiffies_to_sched_qs and adjust_jiffies_till_sched_qs()
comments.

15. Fix force_qs_rnp() header comment, courtesy of Zhouyi Zhou.

16. Eliminate redundant NULL-pointer check.

17. Fix typo in tree_exp.h comment.

18. Correct READ_ONCE()/WRITE_ONCE() for ->rcu_read_unlock_special.

Thanx, Paul



 Documentation/admin-guide/kernel-parameters.txt |4 +-
 MAINTAINERS |   16 
 include/linux/rcupdate.h|6 ++-
 kernel/rcu/tiny.c   |2 -
 kernel/rcu/tree.c   |   31 +++
 kernel/rcu/tree_exp.h   |4 +-
 kernel/rcu/tree_plugin.h|   48 +++-
 7 files changed, 63 insertions(+), 48 deletions(-)



[PATCH tip/core/rcu 07/18] MAINTAINERS: Add -rcu branch name ("dev")

2019-03-26 Thread Paul E. McKenney
Signed-off-by: Paul E. McKenney 
---
 MAINTAINERS | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 1924b52937a6..a9b5270d006e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8983,7 +8983,7 @@ R:Daniel Lustig 
 L: linux-kernel@vger.kernel.org
 L: linux-a...@vger.kernel.org
 S: Supported
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
 F: tools/memory-model/
 F: Documentation/atomic_bitops.txt
 F: Documentation/atomic_t.txt
@@ -13033,7 +13033,7 @@ R:  Mathieu Desnoyers 

 R: Lai Jiangshan 
 L: r...@vger.kernel.org
 S: Supported
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
 F: tools/testing/selftests/rcutorture
 
 RDC R-321X SoC
@@ -13082,7 +13082,7 @@ R:  Joel Fernandes 
 L: r...@vger.kernel.org
 W: http://www.rdrop.com/users/paulmck/RCU/
 S: Supported
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
 F: Documentation/RCU/
 X: Documentation/RCU/torture.txt
 F: include/linux/rcu*
@@ -14237,7 +14237,7 @@ R:  Mathieu Desnoyers 

 L: r...@vger.kernel.org
 W: http://www.rdrop.com/users/paulmck/RCU/
 S: Supported
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
 F: include/linux/srcu*.h
 F: kernel/rcu/srcu*.c
 
@@ -15684,7 +15684,7 @@ M:  "Paul E. McKenney" 
 M: Josh Triplett 
 L: linux-kernel@vger.kernel.org
 S: Supported
-T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
 F: Documentation/RCU/torture.txt
 F: kernel/torture.c
 F: kernel/rcu/rcutorture.c
-- 
2.17.1



Re: [PATCH 09/14] bus: ti-sysc: Move rstctrl reset to happen later

2019-03-26 Thread Tony Lindgren
* Tony Lindgren  [190325 22:00]:
> We should not do the reset until the clocks are enabled. Let's only init
> restctrl in sysc_init_resets() and do the reset later on in sysc_reset().
...

>  static int sysc_reset(struct sysc *ddata)
>  {
>   int offset = ddata->offsets[SYSC_SYSCONFIG];
> - int val;
> + int error, val;
>  
>   if (ddata->legacy_mode || offset < 0 ||
>   ddata->cfg.quirks & SYSC_QUIRK_NO_RESET_ON_INIT)
> - return 0;
> + return sysc_rstctrl_reset_deassert(ddata, false);
> +
> + error = sysc_rstctrl_reset_deassert(ddata, true);
> + if (error)
> + return error;

This change is wrong, we need to deassert rstctrl reset before
we enable clocks, not after. Updated version below.

Regards,

Tony

8< 
>From tony Mon Sep 17 00:00:00 2001
From: Tony Lindgren 
Date: Thu, 21 Mar 2019 11:00:21 -0700
Subject: [PATCH] bus: ti-sysc: Move rstctrl reset to happen later

We can do the rstctrl reset a bit later, but need to deassert rstctrl
reset before the clocks are enabled if asserted. Let's only init rstctrl
in sysc_init_resets() and do the reset later on just before we enable
the device clocks.

Signed-off-by: Tony Lindgren 
---
 drivers/bus/ti-sysc.c | 61 +++
 1 file changed, 39 insertions(+), 22 deletions(-)

diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
--- a/drivers/bus/ti-sysc.c
+++ b/drivers/bus/ti-sysc.c
@@ -339,38 +339,18 @@ static void sysc_disable_opt_clocks(struct sysc *ddata)
 }
 
 /**
- * sysc_init_resets - reset module on init
+ * sysc_init_resets - init rstctrl reset line if configured
  * @ddata: device driver data
  *
- * A module can have both OCP softreset control and external rstctrl.
- * If more complicated rstctrl resets are needed, please handle these
- * directly from the child device driver and map only the module reset
- * for the parent interconnect target module device.
- *
- * Automatic reset of the module on init can be skipped with the
- * "ti,no-reset-on-init" device tree property.
+ * See sysc_rstctrl_reset_deassert().
  */
 static int sysc_init_resets(struct sysc *ddata)
 {
-   int error;
-
ddata->rsts =
devm_reset_control_array_get_optional_exclusive(ddata->dev);
if (IS_ERR(ddata->rsts))
return PTR_ERR(ddata->rsts);
 
-   if (ddata->cfg.quirks & SYSC_QUIRK_NO_RESET_ON_INIT)
-   goto deassert;
-
-   error = reset_control_assert(ddata->rsts);
-   if (error)
-   return error;
-
-deassert:
-   error = reset_control_deassert(ddata->rsts);
-   if (error)
-   return error;
-
return 0;
 }
 
@@ -1031,6 +1011,35 @@ static int sysc_legacy_init(struct sysc *ddata)
return error;
 }
 
+/**
+ * sysc_rstctrl_reset_deassert - deassert rstctrl reset
+ * @ddata: device driver data
+ * @reset: reset before deassert
+ *
+ * A module can have both OCP softreset control and external rstctrl.
+ * If more complicated rstctrl resets are needed, please handle these
+ * directly from the child device driver and map only the module reset
+ * for the parent interconnect target module device.
+ *
+ * Automatic reset of the module on init can be skipped with the
+ * "ti,no-reset-on-init" device tree property.
+ */
+static int sysc_rstctrl_reset_deassert(struct sysc *ddata, bool reset)
+{
+   int error;
+
+   if (!ddata->rsts)
+   return 0;
+
+   if (reset) {
+   error = reset_control_assert(ddata->rsts);
+   if (error)
+   return error;
+   }
+
+   return reset_control_deassert(ddata->rsts);
+}
+
 static int sysc_reset(struct sysc *ddata)
 {
int offset = ddata->offsets[SYSC_SYSCONFIG];
@@ -1071,6 +1080,14 @@ static int sysc_init_module(struct sysc *ddata)
 {
int error = 0;
bool manage_clocks = true;
+   bool reset = true;
+
+   if (ddata->cfg.quirks & SYSC_QUIRK_NO_RESET_ON_INIT)
+   reset = false;
+
+   error = sysc_rstctrl_reset_deassert(ddata, reset);
+   if (error)
+   return error;
 
if (ddata->cfg.quirks &
(SYSC_QUIRK_NO_IDLE | SYSC_QUIRK_NO_IDLE_ON_INIT))
-- 
2.21.0
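
The control flow of the new sysc_rstctrl_reset_deassert() helper can be
tried out in user space. The following is only a sketch under stated
assumptions: struct mock_reset and the mock_reset_*() helpers are
hypothetical stand-ins invented for illustration, and unlike the real
reset_control_assert()/reset_control_deassert() calls they never fail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reset line; not the kernel's reset_control. */
struct mock_reset {
	bool asserted;     /* current state of the line */
	int assert_calls;  /* how many times the line was asserted */
};

/* Mock helpers that always succeed; the real calls can return errors. */
static int mock_reset_assert(struct mock_reset *rst)
{
	rst->asserted = true;
	rst->assert_calls++;
	return 0;
}

static int mock_reset_deassert(struct mock_reset *rst)
{
	rst->asserted = false;
	return 0;
}

/*
 * Same shape as sysc_rstctrl_reset_deassert() in the patch: no reset
 * line means nothing to do, the assert step is optional, and the line
 * always ends up deasserted.
 */
int rstctrl_reset_deassert(struct mock_reset *rst, bool reset)
{
	int error;

	if (!rst)
		return 0;

	if (reset) {
		error = mock_reset_assert(rst);
		if (error)
			return error;
	}

	return mock_reset_deassert(rst);
}
```

Passing reset = false models the "ti,no-reset-on-init" case: a line
that was left asserted is released without being pulsed again.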


[PATCH tip/core/rcu 05/18] rcu: Set rcutree.kthread_prio sysfs access to read-only

2019-03-26 Thread Paul E. McKenney
From: Liu Song 

The rcutree.kthread_prio kernel-boot parameter is used to set the
priority for boost (rcub), per-CPU (rcuc), and grace-period (rcu_preempt
or rcu_sched) kthreads.  It is also used by rcutorture to check whether
it is possible to meaningfully test RCU priority boosting.  However,
all of these cases will either ignore or be confused by any post-boot
changes to rcutree.kthread_prio.

Note that the user really can change the priorities of all of these
kthreads using chrt, given sufficient privileges.  Therefore, the
read-write nature of sysfs access to rcutree.kthread_prio is at
best an attractive nuisance.

This commit therefore changes sysfs access to rcutree.kthread_prio to
be read-only.

Signed-off-by: Liu Song 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2f78a115d34c..296131450414 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -149,7 +149,7 @@ static void sync_sched_exp_online_cleanup(int cpu);
 
 /* rcuc/rcub kthread realtime priority */
 static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
-module_param(kthread_prio, int, 0644);
+module_param(kthread_prio, int, 0444);
 
 /* Delay in jiffies for grace-period initialization delays, debug only. */
 
-- 
2.17.1
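
The third argument to module_param() is the permission mode of the
parameter's file under /sys/module, so the change from 0644 to 0444
simply drops the owner's write bit. A minimal user-space sketch of that
arithmetic follows; PERM_ANY_WRITE and param_is_writable() are names
made up for this illustration and are not kernel APIs.

```c
#include <stdbool.h>

/* Write-permission bits for owner, group, and other (octal). */
#define PERM_ANY_WRITE 0222u

/* Returns true if any class of user may write a file with this mode. */
bool param_is_writable(unsigned int mode)
{
	return (mode & PERM_ANY_WRITE) != 0;
}
```

With this helper, param_is_writable(0644) is true and
param_is_writable(0444) is false.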



[PATCH tip/core/rcu 18/18] rcu: Correct READ_ONCE()/WRITE_ONCE() for ->rcu_read_unlock_special

2019-03-26 Thread Paul E. McKenney
The task_struct structure's ->rcu_read_unlock_special field is only ever
read or written by the owning task, but it is accessed both at process
and interrupt levels.  It may therefore be accessed using plain reads
and writes while interrupts are disabled, but must be accessed using
READ_ONCE() and WRITE_ONCE() or better otherwise.  This commit makes a
few adjustments to align with this discipline.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree_exp.h| 2 +-
 kernel/rcu/tree_plugin.h | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index ec4fb93a5dbe..1ee0782213b8 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -633,7 +633,7 @@ static void rcu_exp_handler(void *unused)
raw_spin_lock_irqsave_rcu_node(rnp, flags);
if (rnp->expmask & rdp->grpmask) {
rdp->deferred_qs = true;
-   WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);
+   t->rcu_read_unlock_special.b.exp_hint = true;
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return;
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 81d3cd821891..6ddb3c05e88f 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -285,7 +285,7 @@ static void rcu_qs(void)
   TPS("cpuqs"));
__this_cpu_write(rcu_data.cpu_no_qs.b.norm, false);
barrier(); /* Coordinate with rcu_flavor_sched_clock_irq(). */
-   current->rcu_read_unlock_special.b.need_qs = false;
+   WRITE_ONCE(current->rcu_read_unlock_special.b.need_qs, false);
}
 }
 
@@ -817,7 +817,7 @@ void exit_rcu(void)
if (unlikely(!list_empty(&current->rcu_node_entry))) {
t->rcu_read_lock_nesting = 1;
barrier();
-   t->rcu_read_unlock_special.b.blocked = true;
+   WRITE_ONCE(t->rcu_read_unlock_special.b.blocked, true);
} else if (unlikely(t->rcu_read_lock_nesting)) {
t->rcu_read_lock_nesting = 1;
} else {
-- 
2.17.1
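
The access discipline described in this patch can be illustrated with
simplified user-space stand-ins for the two macros. This is a sketch
only: the macro definitions below assume GCC or Clang (for __typeof__)
and are far simpler than the kernel's, and struct special_flags and the
three functions are hypothetical names for illustration.

```c
#include <stdbool.h>

/*
 * Simplified stand-ins for the kernel macros.  The volatile access asks
 * the compiler to emit exactly one load or store for the expression.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical miniature of the ->rcu_read_unlock_special flags. */
struct special_flags {
	bool blocked;
	bool need_qs;
	bool exp_hint;
};

/*
 * Models an update made while interrupts are enabled: an interrupt
 * handler on the same CPU might access the flag at any point, so the
 * store is marked.
 */
void clear_need_qs_irqs_on(struct special_flags *s)
{
	WRITE_ONCE(s->need_qs, false);
}

/*
 * Models an update made while interrupts are disabled: nothing can
 * interleave with the owning task, so a plain store is sufficient.
 */
void set_exp_hint_irqs_off(struct special_flags *s)
{
	s->exp_hint = true;
}

/* Marked read, for use where an interrupt might change the flag. */
bool need_qs_is_set(struct special_flags *s)
{
	return READ_ONCE(s->need_qs);
}
```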



[PATCH tip/core/rcu 13/18] rcu: Do a single rhp->func read in rcu_head_after_call_rcu()

2019-03-26 Thread Paul E. McKenney
From: Neeraj Upadhyay 

The rcu_head_after_call_rcu() function reads the rhp->func pointer twice,
which can result in a false-positive WARN_ON_ONCE() if the callback
were passed to call_rcu() between the two reads.  Although racing
rcu_head_after_call_rcu() with call_rcu() is a dubious use case
(the return value is not reliable in that case), intermittent and
irreproducible warnings are also quite dubious.  This commit therefore
uses a single READ_ONCE() to pick up the value of rhp->func once, then
tests that value twice, thus guaranteeing consistent processing within
rcu_head_after_call_rcu().

Nevertheless, racing rcu_head_after_call_rcu() with call_rcu() is still
a dubious use case.

Signed-off-by: Neeraj Upadhyay 
[ paulmck: Add blank line after declaration per checkpatch.pl. ]
Signed-off-by: Paul E. McKenney 
---
 include/linux/rcupdate.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 6cdb1db776cf..922bb6848813 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -878,9 +878,11 @@ static inline void rcu_head_init(struct rcu_head *rhp)
 static inline bool
 rcu_head_after_call_rcu(struct rcu_head *rhp, rcu_callback_t f)
 {
-   if (READ_ONCE(rhp->func) == f)
+   rcu_callback_t func = READ_ONCE(rhp->func);
+
+   if (func == f)
return true;
-   WARN_ON_ONCE(READ_ONCE(rhp->func) != (rcu_callback_t)~0L);
+   WARN_ON_ONCE(func != (rcu_callback_t)~0L);
return false;
 }
 
-- 
2.17.1
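
The read-once-then-test-twice pattern from this patch can be exercised
outside the kernel. The sketch below rests on a few assumptions: struct
mini_head, HEAD_POISON, warn_count, and demo_cb() are hypothetical
stand-ins, a counter replaces WARN_ON_ONCE(), and the integer-to-pointer
cast relies on the usual GCC/Clang behavior.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*callback_t)(void *);

/* Hypothetical miniature of struct rcu_head; only ->func matters here. */
struct mini_head {
	callback_t func;
};

/* Poison value marking "initialized, not yet queued", like ~0L in the patch. */
#define HEAD_POISON ((callback_t)~(uintptr_t)0)

/* Counts would-be warnings; stands in for WARN_ON_ONCE(). */
int warn_count;

bool head_after_call(struct mini_head *hp, callback_t f)
{
	/* One volatile load: both tests below see the same snapshot. */
	callback_t func = *(callback_t volatile *)&hp->func;

	if (func == f)
		return true;
	if (func != HEAD_POISON)
		warn_count++;
	return false;
}

/* Example callback, used only as a comparison value. */
void demo_cb(void *unused)
{
	(void)unused;
}
```

Because both comparisons use the local copy, a concurrent store to
->func between them can no longer make the first test fail and the
second one warn.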



[PATCH tip/core/rcu 11/18] rcu: Fix self-wakeups for grace-period kthread

2019-03-26 Thread Paul E. McKenney
From: Neeraj Upadhyay 

The current rcu_gp_kthread_wake() function uses in_interrupt()
and thus does a self-wakeup from all interrupt contexts, including
the pointless case where the GP kthread happens to be running with
bottom halves disabled, along with the impossible case where the GP
kthread is running within an NMI handler (you are not supposed to invoke
rcu_gp_kthread_wake() from within an NMI handler).  This commit therefore
replaces the in_interrupt() with in_irq(), so that the self-wakeups
happen only from handlers for hardware interrupts and softirqs.
This also makes the code match the comment.

Signed-off-by: Neeraj Upadhyay 
Signed-off-by: Paul E. McKenney 
Acked-by: Steven Rostedt (VMware) 
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 5aefd36ac648..139fa1f5c537 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1585,7 +1585,7 @@ static bool rcu_future_gp_cleanup(struct rcu_node *rnp)
 static void rcu_gp_kthread_wake(void)
 {
if ((current == rcu_state.gp_kthread &&
-!in_interrupt() && !in_serving_softirq()) ||
+!in_irq() && !in_serving_softirq()) ||
!READ_ONCE(rcu_state.gp_flags) ||
!rcu_state.gp_kthread)
return;
-- 
2.17.1
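
The effect of swapping in_interrupt() for in_irq() can be seen in a
small truth-table model. This is a sketch under loud assumptions:
struct ctx flattens what the kernel derives from preempt_count() into
booleans, the NMI case is left out, and the gp_flags and gp_kthread
tests from rcu_gp_kthread_wake() are omitted. All names are
hypothetical.

```c
#include <stdbool.h>

/* Hypothetical, flattened view of the current execution context. */
struct ctx {
	bool is_gp_kthread;    /* models current == rcu_state.gp_kthread */
	bool in_hardirq;       /* running a hardware interrupt handler */
	bool serving_softirq;  /* running a softirq handler */
	bool bh_disabled;      /* bottom halves disabled in task context */
};

/* Model of in_interrupt(): hardirq, softirq, or BH disabled. */
static bool model_in_interrupt(const struct ctx *c)
{
	return c->in_hardirq || c->serving_softirq || c->bh_disabled;
}

/* Model of in_irq(): hardware interrupt handlers only. */
static bool model_in_irq(const struct ctx *c)
{
	return c->in_hardirq;
}

/* Old condition: is the self-wakeup skipped? */
bool skip_wakeup_old(const struct ctx *c)
{
	return c->is_gp_kthread && !model_in_interrupt(c) &&
	       !c->serving_softirq;
}

/* New condition from the patch. */
bool skip_wakeup_new(const struct ctx *c)
{
	return c->is_gp_kthread && !model_in_irq(c) &&
	       !c->serving_softirq;
}
```

The two versions differ only when the GP kthread runs with bottom
halves disabled: the old check performs a pointless self-wakeup there,
and the new one skips it.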



[PATCH tip/core/rcu 04/18] rcu: Make exit_rcu() handle non-preempted RCU readers

2019-03-26 Thread Paul E. McKenney
The purpose of exit_rcu() is to handle cases where buggy code causes a
task to exit within an RCU read-side critical section.  It currently
does that in the case where said RCU read-side critical section was
preempted at least once, but fails to handle cases where preemption did
not occur.  This case needs to be handled because otherwise the final
context switch away from the exiting task will incorrectly behave as if
task exit were instead a preemption of an RCU read-side critical section,
and will therefore queue the exiting task.  The exiting task will have
exited, and thus won't ever execute rcu_read_unlock(), which means that
it will remain queued forever, blocking all subsequent grace periods,
and eventually resulting in OOM.

Although this is arguably better than letting grace periods proceed
and having a later rcu_read_unlock() access the now-freed task
structure that once belonged to the exiting task, it would obviously
be better to correctly handle this case.  This commit therefore sets
->rcu_read_lock_nesting to 1 in that case, so that the subsequent call
to __rcu_read_unlock() causes the exiting task to exit its dangling RCU
read-side critical section.

Note that deferred quiescent states need not be considered.  The reason
is that removing the task from the ->blkd_tasks[] list in the call to
rcu_preempt_deferred_qs() handles the per-task component of any deferred
quiescent state, and all other components of any deferred quiescent state
are associated with the CPU, which isn't going anywhere until some later
CPU-hotplug operation, which will report any remaining deferred quiescent
states from within the rcu_report_dead() function.

Note also that negative values of ->rcu_read_lock_nesting need not be
considered.  First, these won't show up in exit_rcu() unless there is
a serious bug in RCU, and second, setting ->rcu_read_lock_nesting sets
the state so that the RCU read-side critical section will be exited
normally.

Again, this code has no effect unless there has been some prior bug
that prevents a task from leaving an RCU read-side critical section
before exiting.  Furthermore, there have been no reports of the bug
fixed by this commit appearing in production.  This commit is therefore
absolutely -not- recommended for backporting to -stable.

Reported-by: ABHISHEK DUBEY 
Reported-by: BHARATH Y MOURYA 
Reported-by: Aravinda Prasad 
Signed-off-by: Paul E. McKenney 
Tested-by: ABHISHEK DUBEY 
---
 kernel/rcu/tree_plugin.h | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 97dba50f6fb2..d408661d5fb7 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -804,19 +804,25 @@ static void rcu_flavor_sched_clock_irq(int user)
 
 /*
  * Check for a task exiting while in a preemptible-RCU read-side
- * critical section, clean up if so.  No need to issue warnings,
- * as debug_check_no_locks_held() already does this if lockdep
- * is enabled.
+ * critical section, clean up if so.  No need to issue warnings, as
+ * debug_check_no_locks_held() already does this if lockdep is enabled.
+ * Besides, if this function does anything other than just immediately
+ * return, there was a bug of some sort.  Spewing warnings from this
+ * function is like as not to simply obscure important prior warnings.
  */
 void exit_rcu(void)
 {
struct task_struct *t = current;
 
-   if (likely(list_empty(&current->rcu_node_entry)))
+   if (unlikely(!list_empty(&current->rcu_node_entry))) {
+   t->rcu_read_lock_nesting = 1;
+   barrier();
+   t->rcu_read_unlock_special.b.blocked = true;
+   } else if (unlikely(t->rcu_read_lock_nesting)) {
+   t->rcu_read_lock_nesting = 1;
+   } else {
return;
-   t->rcu_read_lock_nesting = 1;
-   barrier();
-   t->rcu_read_unlock_special.b.blocked = true;
+   }
__rcu_read_unlock();
rcu_preempt_deferred_qs(current);
 }
-- 
2.17.1



[PATCH tip/core/rcu 12/18] rcu: Default jiffies_to_sched_qs to jiffies_till_sched_qs

2019-03-26 Thread Paul E. McKenney
From: Neeraj Upadhyay 

The current code only calls adjust_jiffies_till_sched_qs() if
jiffies_till_sched_qs is left at its default value, so when the
jiffies_till_sched_qs kernel-boot parameter actually is specified,
jiffies_to_sched_qs will be left with the value zero, which
will result in useless slowdowns of cond_resched().  This commit
therefore changes rcu_init_geometry() to unconditionally invoke
adjust_jiffies_till_sched_qs(), which ensures that jiffies_to_sched_qs
will be initialized in all cases, thus maintaining good cond_resched()
performance.

Signed-off-by: Neeraj Upadhyay 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 139fa1f5c537..466299c3d2da 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3739,8 +3739,7 @@ static void __init rcu_init_geometry(void)
jiffies_till_first_fqs = d;
if (jiffies_till_next_fqs == ULONG_MAX)
jiffies_till_next_fqs = d;
-   if (jiffies_till_sched_qs == ULONG_MAX)
-   adjust_jiffies_till_sched_qs();
+   adjust_jiffies_till_sched_qs();
 
/* If the compile-time values are accurate, just leave. */
if (rcu_fanout_leaf == RCU_FANOUT_LEAF &&
-- 
2.17.1



[PATCH tip/core/rcu 03/18] rcu: rcu_qs -- Use raise_softirq_irqoff to not save irqs twice

2019-03-26 Thread Paul E. McKenney
From: Cyrill Gorcunov 

rcu_qs() already disables IRQs itself, so there is no need for
raise_softirq() to do the same again.  Instead, we can save some cycles
by calling raise_softirq_irqoff() directly.

CC: Paul E. McKenney 
Signed-off-by: Cyrill Gorcunov 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tiny.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index 911bd9076d43..477b4eb44af5 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -52,7 +52,7 @@ void rcu_qs(void)
local_irq_save(flags);
if (rcu_ctrlblk.donetail != rcu_ctrlblk.curtail) {
rcu_ctrlblk.donetail = rcu_ctrlblk.curtail;
-   raise_softirq(RCU_SOFTIRQ);
+   raise_softirq_irqoff(RCU_SOFTIRQ);
}
local_irq_restore(flags);
 }
-- 
2.17.1



[PATCH tip/core/rcu 10/18] rcu: Report error for bad rcu_nocbs= parameter values

2019-03-26 Thread Paul E. McKenney
This commit prints a console message when cpulist_parse() reports a
bad list of CPUs, and sets all CPUs' bits in that case.  The reason for
setting all CPUs' bits is that this is the safe(r) choice for real-time
workloads, which would normally be the ones using the rcu_nocbs= kernel
boot parameter.  Either way, later RCU console log messages list the
actual set of CPUs whose RCU callbacks will be offloaded.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree_plugin.h | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index ed4a6dabf31d..f0aeb7416dcc 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1772,14 +1772,22 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp)
  */
 
 
-/* Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters. */
+/*
+ * Parse the boot-time rcu_nocb_mask CPU list from the kernel parameters.
+ * The string after the "rcu_nocbs=" is either "all" for all CPUs, or a
+ * comma-separated list of CPUs and/or CPU ranges.  If an invalid list is
+ * given, a warning is emitted and all CPUs are offloaded.
+ */
 static int __init rcu_nocb_setup(char *str)
 {
alloc_bootmem_cpumask_var(&rcu_nocb_mask);
if (!strcasecmp(str, "all"))
cpumask_setall(rcu_nocb_mask);
else
-   cpulist_parse(str, rcu_nocb_mask);
+   if (cpulist_parse(str, rcu_nocb_mask)) {
+   pr_warn("rcu_nocbs= bad CPU range, all CPUs set\n");
+   cpumask_setall(rcu_nocb_mask);
+   }
return 1;
 }
 __setup("rcu_nocbs=", rcu_nocb_setup);
-- 
2.17.1
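
The fallback policy in rcu_nocb_setup() ("all" or an unparsable list
selects every CPU) can be tried in user space. The sketch below makes
several assumptions: parse_cpu_list() is a hypothetical and much
simplified stand-in for cpulist_parse(), a 64-bit word replaces the
cpumask, MAX_CPUS is an arbitrary limit, and the pr_warn() message is
left out.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

#define MAX_CPUS 64

/*
 * Accepts "N" and "N-M" items separated by commas.  Returns 0 on
 * success and -1 on malformed or out-of-range input.
 */
static int parse_cpu_list(const char *s, uint64_t *mask)
{
	*mask = 0;
	if (*s == '\0')
		return -1;
	for (;;) {
		char *end;
		unsigned long lo, hi;

		if (!isdigit((unsigned char)*s))
			return -1;
		lo = hi = strtoul(s, &end, 10);
		if (*end == '-') {
			s = end + 1;
			if (!isdigit((unsigned char)*s))
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (lo > hi || hi >= MAX_CPUS)
			return -1;
		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			*mask |= UINT64_C(1) << cpu;
		if (*end == '\0')
			return 0;
		if (*end != ',')
			return -1;
		s = end + 1;
	}
}

/* "all" (any case) or a bad list selects every CPU, as in the patch. */
uint64_t nocb_mask_from_param(const char *str)
{
	uint64_t mask;

	if (!strcasecmp(str, "all") || parse_cpu_list(str, &mask))
		mask = UINT64_MAX;
	return mask;
}
```

For example, "0-3,8" selects CPUs 0 through 3 plus CPU 8, while the
reversed range "3-1" is rejected by the parser and therefore falls
back to all CPUs.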



[PATCH tip/core/rcu 15/18] rcu: Fix force_qs_rnp() header comment

2019-03-26 Thread Paul E. McKenney
From: Zhouyi Zhou 

Previously, threads blocked on offlining CPUs were migrated to the
root rcu_node structure, thus requiring RCU priority boosting on this
structure.  However, since commit d19fb8d1f3f6 ("rcu: Don't migrate
blocked tasks even if all corresponding CPUs offline"), RCU does not
migrate blocked tasks.  Consequently, RCU no longer does RCU priority
boosting on the root rcu_node structure as of commit 1be0085b515e ("rcu:
Don't initiate RCU priority boosting on root rcu_node").

This commit therefore brings the force_qs_rnp() function's header
comment into line with this new no-root-boosting reality.

Signed-off-by: Zhouyi Zhou 
[ paulmck: Also remove obsolete comment on suppressing new grace periods. ]
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index e117732bcd5d..abc8512ceb5f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2548,11 +2548,11 @@ void rcu_sched_clock_irq(int user)
 }
 
 /*
- * Scan the leaf rcu_node structures, processing dyntick state for any that
- * have not yet encountered a quiescent state, using the function specified.
- * Also initiate boosting for any threads blocked on the root rcu_node.
- *
- * The caller must have suppressed start of new grace periods.
+ * Scan the leaf rcu_node structures.  For each structure on which all
+ * CPUs have reported a quiescent state and on which there are tasks
+ * blocking the current grace period, initiate RCU priority boosting.
+ * Otherwise, invoke the specified function to check dyntick state for
+ * each CPU that has not yet reported a quiescent state.
  */
 static void force_qs_rnp(int (*f)(struct rcu_data *rdp))
 {
-- 
2.17.1



[PATCH tip/core/rcu 14/18] rcu: Update jiffies_to_sched_qs and adjust_jiffies_till_sched_qs() comments

2019-03-26 Thread Paul E. McKenney
This commit better documents the jiffies_to_sched_qs default-value
strategy used by adjust_jiffies_till_sched_qs().

Reported-by: Joel Fernandes 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 466299c3d2da..e117732bcd5d 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -406,7 +406,7 @@ static bool rcu_kick_kthreads;
  */
 static ulong jiffies_till_sched_qs = ULONG_MAX;
 module_param(jiffies_till_sched_qs, ulong, 0444);
-static ulong jiffies_to_sched_qs; /* Adjusted version of above if not default */
+static ulong jiffies_to_sched_qs; /* See adjust_jiffies_till_sched_qs(). */
 module_param(jiffies_to_sched_qs, ulong, 0444); /* Display only! */
 
 /*
@@ -424,6 +424,7 @@ static void adjust_jiffies_till_sched_qs(void)
WRITE_ONCE(jiffies_to_sched_qs, jiffies_till_sched_qs);
return;
}
+   /* Otherwise, set to third fqs scan, but bound below on large system. */
j = READ_ONCE(jiffies_till_first_fqs) +
  2 * READ_ONCE(jiffies_till_next_fqs);
if (j < HZ / 10 + nr_cpu_ids / RCU_JIFFIES_FQS_DIV)
-- 
2.17.1



[PATCH tip/core/rcu 02/18] rcu: Avoid unnecessary softirq when system is idle

2019-03-26 Thread Paul E. McKenney
From: "Joel Fernandes (Google)" 

When there are no callbacks pending on an idle system, I noticed that
RCU softirq is continuously firing. During this the cpu_no_qs is set to
false, and core_needs_qs is set to true indefinitely. This causes
rcu_process_callbacks to be repeatedly called, even though the node
corresponding to the CPU has that CPU's mask bit cleared and the system
is idle. I believe the race is when such mask clearing is done during
idle CPU scan of the quiescent state forcing stage in the kthread
instead of the softirq. Since the rnp mask is cleared, but the flags on
the CPU's rdp are not cleared, the CPU thinks it still needs to report
to core RCU.

Cure this by clearing the core_needs_qs flag when the CPU detects that
its node is already updated which will avoid the unwanted softirq raises
to the benefit of real-time systems.

Test: Ran rcutorture for various tree RCU configs.

Signed-off-by: Joel Fernandes (Google) 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 95e3250b7b6e..2f78a115d34c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2296,6 +2296,7 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
}
mask = rdp->grpmask;
if ((rnp->qsmask & mask) == 0) {
+   rdp->core_needs_qs = false;
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
} else {
rdp->core_needs_qs = false;
-- 
2.17.1



[PATCH tip/core/rcu 06/18] MAINTAINERS: RCU now has its own email list

2019-03-26 Thread Paul E. McKenney
This commit makes r...@vger.kernel.org be the official list for RCU-related
topics.

Signed-off-by: Paul E. McKenney 
---
 MAINTAINERS | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index e17ebf70b548..1924b52937a6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13031,7 +13031,7 @@ M:  Josh Triplett 
 R: Steven Rostedt 
 R: Mathieu Desnoyers 
 R: Lai Jiangshan 
-L: linux-kernel@vger.kernel.org
+L: r...@vger.kernel.org
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
 F: tools/testing/selftests/rcutorture
@@ -13079,7 +13079,7 @@ R:  Steven Rostedt 
 R: Mathieu Desnoyers 
 R: Lai Jiangshan 
 R: Joel Fernandes 
-L: linux-kernel@vger.kernel.org
+L: r...@vger.kernel.org
 W: http://www.rdrop.com/users/paulmck/RCU/
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
@@ -14234,7 +14234,7 @@ M:  "Paul E. McKenney" 
 M: Josh Triplett 
 R: Steven Rostedt 
 R: Mathieu Desnoyers 
-L: linux-kernel@vger.kernel.org
+L: r...@vger.kernel.org
 W: http://www.rdrop.com/users/paulmck/RCU/
 S: Supported
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
-- 
2.17.1



[PATCH tip/core/rcu 01/18] rcu: Unconditionally expedite during suspend/hibernate

2019-03-26 Thread Paul E. McKenney
The rcu_pm_notify() function refuses to switch to/from expedited grace
periods on systems with more than 256 CPUs due to the serialized
initialization of expedited grace periods.  However, expedited grace
periods are now initialized in parallel, removing this concern.
This commit therefore removes the checks from rcu_pm_notify(), so that
expedited grace periods are used unconditionally during suspend/resume
and hibernate/wake operations.

As always, real-time workloads wishing to completely avoid expedited
grace periods can use the rcupdate.rcu_normal= kernel parameter.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index acd6ccf56faf..95e3250b7b6e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3559,13 +3559,11 @@ static int rcu_pm_notify(struct notifier_block *self,
switch (action) {
case PM_HIBERNATION_PREPARE:
case PM_SUSPEND_PREPARE:
-   if (nr_cpu_ids <= 256) /* Expediting bad for large systems. */
-   rcu_expedite_gp();
+   rcu_expedite_gp();
break;
case PM_POST_HIBERNATION:
case PM_POST_SUSPEND:
-   if (nr_cpu_ids <= 256) /* Expediting bad for large systems. */
-   rcu_unexpedite_gp();
+   rcu_unexpedite_gp();
break;
default:
break;
-- 
2.17.1



[PATCH tip/core/rcu 08/18] rcu: Move common code out of if-else block

2019-03-26 Thread Paul E. McKenney
From: Akira Yokosawa 

As a result of the recent addition of "rdp->core_needs_qs = false;" in
the "if" block, both branches of the if-else now have the same
assignment.

Factor it out and reduce line count.

Signed-off-by: Akira Yokosawa 
Cc: Joel Fernandes 
Signed-off-by: Paul E. McKenney 
Acked-by: Joel Fernandes (Google) 
---
 kernel/rcu/tree.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 296131450414..5aefd36ac648 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2295,12 +2295,10 @@ rcu_report_qs_rdp(int cpu, struct rcu_data *rdp)
return;
}
mask = rdp->grpmask;
+   rdp->core_needs_qs = false;
if ((rnp->qsmask & mask) == 0) {
-   rdp->core_needs_qs = false;
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
} else {
-   rdp->core_needs_qs = false;
-
/*
 * This GP can't end until cpu checks in, so all of our
 * callbacks can be processed during the next GP.
-- 
2.17.1



[PATCH tip/core/rcu 0/2] SRCU updates for v5.2

2019-03-26 Thread Paul E. McKenney
Hello!

This series contains SRCU updates:

1.  Check for in-flight callbacks in _cleanup_srcu_struct().

2.  Remove cleanup_srcu_struct_quiesced().

Thanx, Paul



 drivers/nvme/host/core.c |2 +-
 include/linux/srcu.h |   36 +---
 kernel/rcu/rcutorture.c  |7 +--
 kernel/rcu/srcutiny.c|9 +++--
 kernel/rcu/srcutree.c|   32 ++--
 5 files changed, 20 insertions(+), 66 deletions(-)



[PATCH tip/core/rcu 17/18] rcu: Fix typo in tree_exp.h comment

2019-03-26 Thread Paul E. McKenney
This commit changes a rcu_exp_handler() comment from rcu_preempt_defer_qs()
to rcu_preempt_deferred_qs() in order to better match reality.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree_exp.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 4c2a0189e748..ec4fb93a5dbe 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -648,7 +648,7 @@ static void rcu_exp_handler(void *unused)
 *
 * If the CPU is fully enabled (or if some buggy RCU-preempt
 * read-side critical section is being used from idle), just
-* invoke rcu_preempt_defer_qs() to immediately report the
+* invoke rcu_preempt_deferred_qs() to immediately report the
 * quiescent state.  We cannot use rcu_read_unlock_special()
 * because we are in an interrupt handler, which will cause that
 * function to take an early exit without doing anything.
-- 
2.17.1



[PATCH tip/core/rcu 09/18] rcu: Allow rcu_nocbs= to specify all CPUs

2019-03-26 Thread Paul E. McKenney
Currently, the rcu_nocbs= kernel boot parameter requires that a specific
list of CPUs be specified, and has no way to say "all of them".
As noted by user RavFX in a comment to Phoronix topic 1002538, this
is an inconvenient side effect of the removal of the RCU_NOCB_CPU_ALL
Kconfig option.  This commit therefore enables the rcu_nocbs= kernel boot
parameter to be given the string "all", as in "rcu_nocbs=all" to specify
that all CPUs on the system are to have their RCU callbacks offloaded.

Another approach would be to make cpulist_parse() check for "all", but
there are uses of cpulist_parse() that do other checking, which could
conflict with an "all".  This commit therefore focuses on the specific
use of cpulist_parse() in rcu_nocb_setup().
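
For readers who want to experiment outside the kernel, the special-casing
described above can be modelled in user space roughly as follows.  This is
only a sketch: parse_nocb_arg(), parse_cpu_list(), the 64-CPU bitmask and
the simplified list grammar are all assumptions made for this example.
The kernel itself uses cpumask_setall() and cpulist_parse() on a cpumask.

```c
#include <stdint.h>
#include <stdlib.h>
#include <strings.h>	/* strcasecmp() */

#define SKETCH_NR_CPUS 64	/* hypothetical CPU count for this model */

/*
 * Hypothetical stand-in for cpulist_parse(): accepts "a", "a-b" and
 * comma-separated combinations.  Returns 0 on success, -1 on bad input.
 */
static int parse_cpu_list(const char *str, uint64_t *mask)
{
	uint64_t m = 0;
	const char *p = str;

	while (*p) {
		char *end;
		long lo, hi;

		lo = strtol(p, &end, 10);
		if (end == p)
			return -1;	/* no digits where a CPU number belongs */
		hi = lo;
		p = end;
		if (*p == '-') {	/* range "lo-hi" */
			p++;
			hi = strtol(p, &end, 10);
			if (end == p)
				return -1;
			p = end;
		}
		if (lo < 0 || hi < lo || hi >= SKETCH_NR_CPUS)
			return -1;
		for (long c = lo; c <= hi; c++)
			m |= UINT64_C(1) << c;
		if (*p == ',')
			p++;
		else if (*p != '\0')
			return -1;	/* trailing junk */
	}
	*mask = m;
	return 0;
}

/* Check for "all" first, and only then fall back to the list parser. */
static int parse_nocb_arg(const char *str, uint64_t *mask)
{
	if (!strcasecmp(str, "all")) {
		*mask = UINT64_MAX;	/* every CPU in this 64-CPU model */
		return 0;
	}
	return parse_cpu_list(str, mask);
}
```

Keeping the "all" check in the caller, rather than in the list parser,
mirrors the reasoning above: other users of the list parser see no change
in behavior.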

Just a note to other people who would like changes to Linux-kernel RCU:
If you send your requests to me directly, they might get fixed somewhat
faster.  RavFX's comment was posted on January 22, 2018 and I first saw
it on March 5, 2019.  And the only reason that I found it -at- -all- was
that I was looking for projects using RCU, and my search engine showed
me that Phoronix comment quite by accident.  Your choice, though!  ;-)

Signed-off-by: Paul E. McKenney 
---
 Documentation/admin-guide/kernel-parameters.txt | 4 +++-
 kernel/rcu/tree_plugin.h| 5 -
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2b8ee90bb644..d377a2166b79 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3623,7 +3623,9 @@
see CONFIG_RAS_CEC help text.
 
rcu_nocbs=  [KNL]
-   The argument is a cpu list, as described above.
+   The argument is a cpu list, as described above,
+   except that the string "all" can be used to
+   specify every CPU on the system.
 
In kernels built with CONFIG_RCU_NOCB_CPU=y, set
the specified list of CPUs to be no-callback CPUs.
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index d408661d5fb7..ed4a6dabf31d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1776,7 +1776,10 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp)
 static int __init rcu_nocb_setup(char *str)
 {
alloc_bootmem_cpumask_var(&rcu_nocb_mask);
-   cpulist_parse(str, rcu_nocb_mask);
+   if (!strcasecmp(str, "all"))
+   cpumask_setall(rcu_nocb_mask);
+   else
+   cpulist_parse(str, rcu_nocb_mask);
return 1;
 }
 __setup("rcu_nocbs=", rcu_nocb_setup);
-- 
2.17.1



Re: [RFC 4/4] net/ipv4/fib: Don't synchronise_rcu() every 512Kb

2019-03-26 Thread Dmitry Safonov
On 3/26/19 3:39 PM, David Ahern wrote:
> On 3/26/19 9:30 AM, Dmitry Safonov wrote:
>> Fib trie has a hard-coded sync_pages limit to call synchronise_rcu().
>> The limit is 128 pages or 512Kb (considering common case with 4Kb
>> pages).
>>
>> Unfortunately, at Arista we have use-scenarios with full view software
>> forwarding. At the scale of 100K and more routes even on 2 core boxes
>> the hard-coded limit starts actively shooting in the leg: lockup
>> detector notices that rtnl_lock is held for seconds.
>> First reason is previously broken MAX_WORK, that didn't limit pending
>> balancing work. While fixing it, I've noticed that the bottle-neck is
>> actually in the number of synchronise_rcu() calls.
>>
>> I've tried to fix it with a patch to decrement number of tnodes in rcu
>> callback, but it hasn't much affected performance.
>>
>> One possible way to "fix" it - provide another sysctl to control
>> sync_pages, but in my POV it's nasty - exposing another realisation
>> detail into user-space.
> 
> well, that was accepted last week. ;-)
> 
> commit 9ab948a91b2c2abc8e82845c0e61f4b1683e3a4f
> Author: David Ahern 
> Date:   Wed Mar 20 09:18:59 2019 -0700
> 
> ipv4: Allow amount of dirty memory from fib resizing to be controllable
> 
> 
> Can you see how that change (should backport easily) affects your test
> case? From my perspective 16MB was the sweet spot.

FWIW, I would like to +Cc Paul here.

TL;DR: David and I are looking into ways to improve the hardcoded limit
on tnode_free_size in net/ipv4/fib_trie.c: currently it's way too low
(512KB). David created a patch that adds a sysctl to control the limit,
and it would solve the problem for both of us. In parallel, I thought that
exposing this to userspace is not much fun and added a shrinker with
synchronize_rcu(). I'm not at all sure that the latter is actually a sane
solution.
Is there any guarantee that memory to be freed by call_rcu() will get
freed in OOM conditions? Might there be a chance that we don't need any
limit here at all?
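
To make the numbers concrete, the threshold logic under discussion can be
modelled in user space roughly like this. It is only a sketch: struct
free_batch, batch_init(), batch_note_free() and the grace_periods counter
are names invented for the example, the 4KB page size is an assumption,
and the real code calls synchronize_rcu() where this model merely counts.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed 4KB pages */

/* Hypothetical model of the "dirty memory" accounting. */
struct free_batch {
	size_t pending_bytes;		/* memory queued for freeing */
	size_t limit_bytes;		/* sync_pages * page size */
	unsigned int grace_periods;	/* how often we would have blocked */
};

static void batch_init(struct free_batch *b, size_t sync_pages)
{
	b->pending_bytes = 0;
	b->limit_bytes = sync_pages * SKETCH_PAGE_SIZE;
	b->grace_periods = 0;
}

/* Account for one freed node; "synchronize" once the limit is reached. */
static void batch_note_free(struct free_batch *b, size_t bytes)
{
	b->pending_bytes += bytes;
	if (b->pending_bytes >= b->limit_bytes) {
		b->pending_bytes = 0;
		b->grace_periods++;	/* kernel: synchronize_rcu() here */
	}
}
```

With 128 pages the limit is 128 * 4096 = 524288 bytes, so freeing 4MB in
page-sized chunks waits for a grace period eight times. With a 16MB limit
(4096 pages) the same workload never waits.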

It is worth mentioning that I'm not arguing against David's patch: as I
pointed out, it would (will) solve the problem for us both. I'm just
wondering, with good intentions, whether we can do something here other
than adding a new sysctl knob.

Thanks,
  Dmitry


[PATCH tip/core/rcu 2/2] srcu: Remove cleanup_srcu_struct_quiesced()

2019-03-26 Thread Paul E. McKenney
The cleanup_srcu_struct_quiesced() function was added because NVME
used WQ_MEM_RECLAIM workqueues and SRCU did not, which meant that
NVME workqueues waiting on SRCU workqueues could result in deadlocks
during low-memory conditions.  However, SRCU now also has WQ_MEM_RECLAIM
workqueues, so there is no longer a potential for deadlock.  Furthermore,
it turns out to be extremely hard to use cleanup_srcu_struct_quiesced()
correctly due to the fact that SRCU callback invocation accesses the
srcu_struct structure's per-CPU data area just after callbacks are
invoked.  Therefore, the usual practice of using srcu_barrier() to wait
for callbacks to be invoked before invoking cleanup_srcu_struct_quiesced()
fails because SRCU's callback-invocation workqueue handler might be
delayed, which can result in cleanup_srcu_struct_quiesced() being invoked
(and thus freeing the per-CPU data) before the SRCU's callback-invocation
workqueue handler is finished using that per-CPU data.  Nor is this a
theoretical problem: KASAN emitted use-after-free warnings because of
this problem on actual runs.

In short, NVME can now safely invoke cleanup_srcu_struct(), which
avoids the use-after-free scenario.  And cleanup_srcu_struct_quiesced()
is quite difficult to use safely.  This commit therefore removes
cleanup_srcu_struct_quiesced(), switching its sole user back to
cleanup_srcu_struct().  This effectively reverts the following pair
of commits:

f7194ac32ca2 ("srcu: Add cleanup_srcu_struct_quiesced()")
4317228ad9b8 ("nvme: Avoid flush dependency in delete controller flow")

Reported-by: Bart Van Assche 
Signed-off-by: Paul E. McKenney 
Reviewed-by: Bart Van Assche 
Tested-by: Bart Van Assche 
---
 drivers/nvme/host/core.c |  2 +-
 include/linux/srcu.h | 36 +---
 kernel/rcu/rcutorture.c  |  7 +--
 kernel/rcu/srcutiny.c|  9 +++--
 kernel/rcu/srcutree.c| 30 --
 5 files changed, 18 insertions(+), 66 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 470601980794..739c5b4830d7 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -388,7 +388,7 @@ static void nvme_free_ns_head(struct kref *ref)
nvme_mpath_remove_disk(head);
ida_simple_remove(&head->subsys->ns_ida, head->instance);
list_del_init(&head->entry);
-   cleanup_srcu_struct_quiesced(&head->srcu);
+   cleanup_srcu_struct(&head->srcu);
nvme_put_subsystem(head->subsys);
kfree(head);
 }
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index c495b2d51569..e432cc92c73d 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -56,45 +56,11 @@ struct srcu_struct { };
 
 void call_srcu(struct srcu_struct *ssp, struct rcu_head *head,
void (*func)(struct rcu_head *head));
-void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced);
+void cleanup_srcu_struct(struct srcu_struct *ssp);
 int __srcu_read_lock(struct srcu_struct *ssp) __acquires(ssp);
 void __srcu_read_unlock(struct srcu_struct *ssp, int idx) __releases(ssp);
 void synchronize_srcu(struct srcu_struct *ssp);
 
-/**
- * cleanup_srcu_struct - deconstruct a sleep-RCU structure
- * @ssp: structure to clean up.
- *
- * Must invoke this after you are finished using a given srcu_struct that
- * was initialized via init_srcu_struct(), else you leak memory.
- */
-static inline void cleanup_srcu_struct(struct srcu_struct *ssp)
-{
-   _cleanup_srcu_struct(ssp, false);
-}
-
-/**
- * cleanup_srcu_struct_quiesced - deconstruct a quiesced sleep-RCU structure
- * @ssp: structure to clean up.
- *
- * Must invoke this after you are finished using a given srcu_struct that
- * was initialized via init_srcu_struct(), else you leak memory.  Also,
- * all grace-period processing must have completed.
- *
- * "Completed" means that the last synchronize_srcu() and
- * synchronize_srcu_expedited() calls must have returned before the call
- * to cleanup_srcu_struct_quiesced().  It also means that the callback
- * from the last call_srcu() must have been invoked before the call to
- * cleanup_srcu_struct_quiesced(), but you can use srcu_barrier() to help
- * with this last.  Violating these rules will get you a WARN_ON() splat
- * (with high probability, anyway), and will also cause the srcu_struct
- * to be leaked.
- */
-static inline void cleanup_srcu_struct_quiesced(struct srcu_struct *ssp)
-{
-   _cleanup_srcu_struct(ssp, true);
-}
-
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
 /**
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index f14d1b18a74f..d2b226110835 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -592,12 +592,7 @@ static void srcu_torture_init(void)
 
 static void srcu_torture_cleanup(void)
 {
-   static DEFINE_TORTURE_RANDOM(rand);
-
-   if (torture_random(&rand) & 0x800)
-   cleanup_srcu_struct(&srcu_ctld);
-   else
-   cleanup_srcu_struct_quiesced(&srcu_ctld);
+

Re: [PATCH 4.14 00/41] 4.14.109-stable review

2019-03-26 Thread shuah

On 3/26/19 12:29 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.14.109 release.
> There are 41 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Mar 28 04:26:32 UTC 2019.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.14.109-rc1.gz
> or in the git tree and branch at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.14.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h



Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


[PATCH tip/core/rcu 1/2] srcu: Check for in-flight callbacks in _cleanup_srcu_struct()

2019-03-26 Thread Paul E. McKenney
If someone fails to drain the corresponding SRCU callbacks (for
example, by failing to invoke srcu_barrier()) before invoking either
cleanup_srcu_struct() or cleanup_srcu_struct_quiesced(), the resulting
diagnostic is an ambiguous use-after-free diagnostic, and even then
only if you are running something like KASAN.  This commit therefore
improves SRCU diagnostics by adding checks for in-flight callbacks at
_cleanup_srcu_struct() time.

Note that these diagnostics can still be defeated, for example, by
invoking call_srcu() concurrently with cleanup_srcu_struct().  Which is
a really bad idea, but sometimes all too easy to do.  But even then,
these diagnostics have at least some probability of catching the problem.
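
The intent of the new check can be illustrated with a small user-space
model.  This is only a sketch under stated assumptions: struct model_srcu
and the model_*() helpers are invented for this example and do not exist
in the kernel, the callback count stands in for rcu_segcblist_n_cbs(),
and a message on stderr stands in for WARN_ON().

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an srcu_struct. */
struct model_srcu {
	int n_cbs;	/* callbacks queued but not yet invoked */
	int *percpu;	/* stands in for the per-CPU data area */
};

static void model_init(struct model_srcu *s)
{
	s->n_cbs = 0;
	s->percpu = calloc(4, sizeof(*s->percpu));
}

/* Models call_srcu(): one more callback is now in flight. */
static void model_call(struct model_srcu *s)
{
	s->n_cbs++;
}

/* Models srcu_barrier(): returns once all callbacks have been invoked. */
static void model_barrier(struct model_srcu *s)
{
	s->n_cbs = 0;
}

/*
 * Models the cleanup path: returns true if the per-CPU data was freed,
 * false if it was deliberately leaked because callbacks were in flight.
 */
static bool model_cleanup(struct model_srcu *s)
{
	if (s->n_cbs) {
		fprintf(stderr, "WARNING: %d callback(s) still in flight\n",
			s->n_cbs);
		return false;	/* Forgot the barrier, so just leak it! */
	}
	free(s->percpu);
	s->percpu = NULL;
	return true;
}
```

Leaking is the safer failure mode in this model: a callback that runs
later still finds valid memory instead of touching freed data.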

Reported-by: Sagi Grimberg 
Reported-by: Bart Van Assche 
Signed-off-by: Paul E. McKenney 
Tested-by: Bart Van Assche 
---
 kernel/rcu/srcutree.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index a60b8ba9e1ac..4f30f3ecabc1 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -387,6 +387,8 @@ void _cleanup_srcu_struct(struct srcu_struct *ssp, bool quiesced)
del_timer_sync(&sdp->delay_work);
flush_work(&sdp->work);
}
+   if (WARN_ON(rcu_segcblist_n_cbs(&sdp->srcu_cblist)))
+   return; /* Forgot srcu_barrier(), so just leak it! */
}
if (WARN_ON(rcu_seq_state(READ_ONCE(ssp->srcu_gp_seq)) != SRCU_STATE_IDLE) ||
WARN_ON(srcu_readers_active(ssp))) {
-- 
2.17.1



Re: [PATCH 4.19 00/45] 4.19.32-stable review

2019-03-26 Thread shuah

On 3/26/19 12:29 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.19.32 release.
> There are 45 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Mar 28 04:26:41 UTC 2019.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.32-rc1.gz
> or in the git tree and branch at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h



Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah



Re: [PATCH 4.9 00/30] 4.9.166-stable review

2019-03-26 Thread shuah

On 3/26/19 12:29 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.166 release.
> There are 30 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Mar 28 04:25:51 UTC 2019.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.166-rc1.gz
> or in the git tree and branch at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h



Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah



Re: [PATCH 5.0 00/52] 5.0.5-stable review

2019-03-26 Thread shuah

On 3/26/19 12:29 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.0.5 release.
> There are 52 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Mar 28 04:26:38 UTC 2019.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
>
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.0.5-rc1.gz
> or in the git tree and branch at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.0.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h



Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah



RE: [PATCH v19,RESEND 08/27] x86/cpu/intel: Detect SGX support and update caps appropriately

2019-03-26 Thread Huang, Kai
> On Tue, Mar 26, 2019 at 02:25:52PM -0700, Huang, Kai wrote:
> > >
> > > That being said, this in no way impacts KVM's ability to virtualize SGX, 
> > > e.g.
> > > KVM can directly do CPUID and {RD,WR}MSR to probe the capabilities
> > > of the platform as needed.
> >
> > I am not following. KVM can do whatever it wants, but it cannot change
> > the fact that KVM guest cannot run intel enclave if platform's MSRs
> > are configured to 3rd party and locked.
> >
> > Or am I misunderstanding?
> 
> What does that have to do with this patch?  The only thing this patch does is
> clear a *software* bit that says "SGX LC is enabled" so that the kernel can
> make the reasonable assumption that the MSRs are writable when
> X86_FEATURE_SGX_LC=1.

Sorted out offline discussion with you. Will let you handle :)

Thanks,
-Kai


[PATCH tip/core/rcu 0/11] RCU CPU stall-warning changes for v5.2

2019-03-26 Thread Paul E. McKenney
Hello!

This series is primarily code movement for RCU CPU stall warnings.
If I am having a hard time finding the various scattered pieces of
this code, it is in need of consolidation!

1-3. Move RCU CPU stall-warning code into kernel/rcu/tree_stall.h.

4.  Inline RCU task stall-warning helper functions.

5.  Move rcu_print_task_exp_stall() to tree_exp.h.

6.  Inline RCU stall-warning info helper functions.

7.  Move FAST_NO_HZ stall-warning code to tree_stall.h.

8.  Organize functions in tree_stall.h.

9.  Move irq-disabled stall-warning checking to tree_stall.h.

10. Move forward-progress checkers into tree_stall.h.

11. Fix nohz status in stall warning, courtesy of Neeraj Upadhyay.

Thanx, Paul



 rcu.h |1 
 tree.c|  479 -
 tree.h|   18 -
 tree_exp.h|   32 +
 tree_plugin.h |  213 
 tree_stall.h  |  951 ++
 update.c  |   59 ---
 7 files changed, 875 insertions(+), 878 deletions(-)



[PATCH tip/core/rcu 01/11] rcu: Move RCU CPU stall-warning code out of update.c

2019-03-26 Thread Paul E. McKenney
The RCU CPU stall-warning code for normal grace periods is currently
scattered across three files, due to earlier Tiny RCU support for RCU
CPU stall warnings and for old Kconfig options that have long since
been retired.  Given that it is hard for the lead RCU maintainer to
find relevant stall-warning code, it would be good to consolidate it.
This commit starts this process by moving stall-warning code from
kernel/rcu/update.c to a new kernel/rcu/tree_stall.h file.

Note that the definitions of rcu_cpu_stall_suppress and
rcu_cpu_stall_timeout must remain in kernel/rcu/update.c to provide
compatibility for kernel boot parameter lists.
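
As a quick illustration of what the moved rcu_jiffies_till_stall_check()
computes, here is a user-space sketch of its clamping logic.  The names
sketch_jiffies_till_stall_check(), SKETCH_HZ and SKETCH_STALL_DELAY_DELTA
are assumptions made for this example, as are the tick rate of 250 and
the zero delta (the CONFIG_PROVE_RCU=n case).  The kernel version also
uses READ_ONCE()/WRITE_ONCE() on the module parameter, which this model
omits.

```c
#define SKETCH_HZ 250			/* assumed tick rate */
#define SKETCH_STALL_DELAY_DELTA 0	/* assumed CONFIG_PROVE_RCU=n */

/*
 * Clamp the timeout (in seconds) to the 3..300 range, write the clamped
 * value back through the pointer, and return the timeout in jiffies.
 */
static int sketch_jiffies_till_stall_check(int *timeout_secs)
{
	int till_stall_check = *timeout_secs;

	if (till_stall_check < 3)
		till_stall_check = 3;
	else if (till_stall_check > 300)
		till_stall_check = 300;
	*timeout_secs = till_stall_check;
	return till_stall_check * SKETCH_HZ + SKETCH_STALL_DELAY_DELTA;
}
```

Under these assumptions a requested timeout of 1 second is raised to 3
seconds (750 jiffies), and a request of 1000 seconds is cut to 300
seconds (75000 jiffies).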

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/rcu.h|  1 +
 kernel/rcu/tree.c   |  1 +
 kernel/rcu/tree_stall.h | 63 +
 kernel/rcu/update.c | 59 +-
 4 files changed, 66 insertions(+), 58 deletions(-)
 create mode 100644 kernel/rcu/tree_stall.h

diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index acee72c0b24b..4b58c907b4b7 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -233,6 +233,7 @@ static inline bool __rcu_reclaim(const char *rn, struct rcu_head *head)
 #ifdef CONFIG_RCU_STALL_COMMON
 
 extern int rcu_cpu_stall_suppress;
+extern int rcu_cpu_stall_timeout;
 int rcu_jiffies_till_stall_check(void);
 
 #define rcu_ftrace_dump_stall_suppress() \
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index acd6ccf56faf..424d50ccf9e6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3858,5 +3858,6 @@ void __init rcu_init(void)
srcu_init();
 }
 
+#include "tree_stall.h"
 #include "tree_exp.h"
 #include "tree_plugin.h"
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
new file mode 100644
index 000000000000..682189f4d083
--- /dev/null
+++ b/kernel/rcu/tree_stall.h
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * RCU CPU stall warnings for normal RCU grace periods
+ *
+ * Copyright IBM Corporation, 2019
+ *
+ * Author: Paul E. McKenney 
+ */
+
+
+#ifdef CONFIG_PROVE_RCU
+#define RCU_STALL_DELAY_DELTA (5 * HZ)
+#else
+#define RCU_STALL_DELAY_DELTA 0
+#endif
+
+int rcu_jiffies_till_stall_check(void)
+{
+   int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout);
+
+   /*
+* Limit check must be consistent with the Kconfig limits
+* for CONFIG_RCU_CPU_STALL_TIMEOUT.
+*/
+   if (till_stall_check < 3) {
+   WRITE_ONCE(rcu_cpu_stall_timeout, 3);
+   till_stall_check = 3;
+   } else if (till_stall_check > 300) {
+   WRITE_ONCE(rcu_cpu_stall_timeout, 300);
+   till_stall_check = 300;
+   }
+   return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
+}
+EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
+
+void rcu_sysrq_start(void)
+{
+   if (!rcu_cpu_stall_suppress)
+   rcu_cpu_stall_suppress = 2;
+}
+
+void rcu_sysrq_end(void)
+{
+   if (rcu_cpu_stall_suppress == 2)
+   rcu_cpu_stall_suppress = 0;
+}
+
+static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
+{
+   rcu_cpu_stall_suppress = 1;
+   return NOTIFY_DONE;
+}
+
+static struct notifier_block rcu_panic_block = {
+   .notifier_call = rcu_panic,
+};
+
+static int __init check_cpu_stall_init(void)
+{
+   atomic_notifier_chain_register(&panic_notifier_list, &rcu_panic_block);
+   return 0;
+}
+early_initcall(check_cpu_stall_init);
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index cbaa976c5945..c3bf44ba42e5 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -424,68 +424,11 @@ EXPORT_SYMBOL_GPL(do_trace_rcu_torture_read);
 #endif
 
 #ifdef CONFIG_RCU_STALL_COMMON
-
-#ifdef CONFIG_PROVE_RCU
-#define RCU_STALL_DELAY_DELTA (5 * HZ)
-#else
-#define RCU_STALL_DELAY_DELTA 0
-#endif
-
 int rcu_cpu_stall_suppress __read_mostly; /* 1 = suppress stall warnings. */
 EXPORT_SYMBOL_GPL(rcu_cpu_stall_suppress);
-static int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
-
 module_param(rcu_cpu_stall_suppress, int, 0644);
+int rcu_cpu_stall_timeout __read_mostly = CONFIG_RCU_CPU_STALL_TIMEOUT;
 module_param(rcu_cpu_stall_timeout, int, 0644);
-
-int rcu_jiffies_till_stall_check(void)
-{
-   int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout);
-
-   /*
-* Limit check must be consistent with the Kconfig limits
-* for CONFIG_RCU_CPU_STALL_TIMEOUT.
-*/
-   if (till_stall_check < 3) {
-   WRITE_ONCE(rcu_cpu_stall_timeout, 3);
-   till_stall_check = 3;
-   } else if (till_stall_check > 300) {
-   WRITE_ONCE(rcu_cpu_stall_timeout, 300);
-   till_stall_check = 300;
-   }
-   return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;
-}
-EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
-
-void rcu_sysrq_start(void)
-{
-   if (!rcu_cpu_stall_suppress)
-   rcu_cpu_stall_supp

[PATCH tip/core/rcu 07/11] rcu: Move FAST_NO_HZ stall-warning code to tree_stall.h

2019-03-26 Thread Paul E. McKenney
This commit further consolidates the stall-warning code by moving
print_cpu_stall_info() and its helper functions along with
zero_cpu_stall_ticks() to kernel/rcu/tree_stall.h.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.h|  1 -
 kernel/rcu/tree_plugin.h | 80 
 kernel/rcu/tree_stall.h  | 80 
 3 files changed, 80 insertions(+), 81 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index d73472af49e7..49bf3b00bb50 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -416,7 +416,6 @@ static void rcu_prepare_for_idle(void);
 static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
 static bool rcu_preempt_need_deferred_qs(struct task_struct *t);
 static void rcu_preempt_deferred_qs(struct task_struct *t);
-static void print_cpu_stall_info(int cpu);
 static void zero_cpu_stall_ticks(struct rcu_data *rdp);
 static bool rcu_nocb_cpu_needs_barrier(int cpu);
 static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 2df5bb04fd7a..a1f9d7c15bd8 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1528,86 +1528,6 @@ static void rcu_cleanup_after_idle(void)
 
 #endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
 
-#ifdef CONFIG_RCU_FAST_NO_HZ
-
-static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
-{
-   struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
-
-   sprintf(cp, "last_accelerate: %04lx/%04lx, Nonlazy posted: %c%c%c",
-   rdp->last_accelerate & 0xffff, jiffies & 0xffff,
-   ".l"[rdp->all_lazy],
-   ".L"[!rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)],
-   ".D"[!rdp->tick_nohz_enabled_snap]);
-}
-
-#else /* #ifdef CONFIG_RCU_FAST_NO_HZ */
-
-static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
-{
-   *cp = '\0';
-}
-
-#endif /* #else #ifdef CONFIG_RCU_FAST_NO_HZ */
-
-/*
- * Print out diagnostic information for the specified stalled CPU.
- *
- * If the specified CPU is aware of the current RCU grace period, then
- * print the number of scheduling clock interrupts the CPU has taken
- * during the time that it has been aware.  Otherwise, print the number
- * of RCU grace periods that this CPU is ignorant of, for example, "1"
- * if the CPU was aware of the previous grace period.
- *
- * Also print out idle and (if CONFIG_RCU_FAST_NO_HZ) idle-entry info.
- */
-static void print_cpu_stall_info(int cpu)
-{
-   unsigned long delta;
-   char fast_no_hz[72];
-   struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
-   char *ticks_title;
-   unsigned long ticks_value;
-
-   /*
-* We could be printing a lot while holding a spinlock.  Avoid
-* triggering hard lockup.
-*/
-   touch_nmi_watchdog();
-
-   ticks_value = rcu_seq_ctr(rcu_state.gp_seq - rdp->gp_seq);
-   if (ticks_value) {
-   ticks_title = "GPs behind";
-   } else {
-   ticks_title = "ticks this GP";
-   ticks_value = rdp->ticks_this_gp;
-   }
-   print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
-   delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
-   pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
-  cpu,
-  "O."[!!cpu_online(cpu)],
-  "o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
-  "N."[!!(rdp->grpmask & rdp->mynode->qsmaskinitnext)],
-  !IS_ENABLED(CONFIG_IRQ_WORK) ? '?' :
-   rdp->rcu_iw_pending ? (int)min(delta, 9UL) + '0' :
-   "!."[!delta],
-  ticks_value, ticks_title,
-  rcu_dynticks_snap(rdp) & 0xfff,
-  rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
-  rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
-  READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
-  fast_no_hz);
-}
-
-/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */
-static void zero_cpu_stall_ticks(struct rcu_data *rdp)
-{
-   rdp->ticks_this_gp = 0;
-   rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id());
-   WRITE_ONCE(rdp->last_fqs_resched, jiffies);
-}
-
 #ifdef CONFIG_RCU_NOCB_CPU
 
 /*
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 7ef3b596e45f..19b915380a6f 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -223,6 +223,86 @@ static void panic_on_rcu_stall(void)
panic("RCU Stall\n");
 }
 
+#ifdef CONFIG_RCU_FAST_NO_HZ
+
+static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
+{
+   struct rcu_data *rdp = &per_cpu(rcu_data, cpu);
+
+   sprintf(cp, "last_accelerate: %04lx/%04lx, Nonlazy posted: %c%c%c",
+   rdp->last_accelerate & 0xffff, jiffies & 0xffff,
+   ".l"[rdp->all_lazy],
+   ".L"

[PATCH tip/core/rcu 03/11] rcu: Move RCU CPU stall-warning code out of tree.c

2019-03-26 Thread Paul E. McKenney
This commit completes the process of consolidating the code for RCU CPU
stall warnings for normal grace periods by moving the remaining such
code from kernel/rcu/tree.c to kernel/rcu/tree_stall.h.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c   | 291 ---
 kernel/rcu/tree.h   |  10 +-
 kernel/rcu/tree_stall.h | 292 
 3 files changed, 299 insertions(+), 294 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 424d50ccf9e6..001dd05f6e38 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -102,8 +102,6 @@ int rcu_num_lvls __read_mostly = RCU_NUM_LVLS;
 /* Number of rcu_nodes at specified level. */
 int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
 int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
-/* panic() on RCU Stall sysctl. */
-int sysctl_panic_on_rcu_stall __read_mostly;
 /* Commandeer a sysrq key to dump RCU's tree. */
 static bool sysrq_rcu;
 module_param(sysrq_rcu, bool, 0444);
@@ -1167,295 +1165,6 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
return 0;
 }
 
-static void record_gp_stall_check_time(void)
-{
-   unsigned long j = jiffies;
-   unsigned long j1;
-
-   rcu_state.gp_start = j;
-   j1 = rcu_jiffies_till_stall_check();
-   /* Record ->gp_start before ->jiffies_stall. */
-   smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */
-   rcu_state.jiffies_resched = j + j1 / 2;
-   rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
-}
-
-/*
- * Complain about starvation of grace-period kthread.
- */
-static void rcu_check_gp_kthread_starvation(void)
-{
-   struct task_struct *gpk = rcu_state.gp_kthread;
-   unsigned long j;
-
-   j = jiffies - READ_ONCE(rcu_state.gp_activity);
-   if (j > 2 * HZ) {
-   pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
-  rcu_state.name, j,
-  (long)rcu_seq_current(&rcu_state.gp_seq),
-  READ_ONCE(rcu_state.gp_flags),
-  gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
-  gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1);
-   if (gpk) {
-   pr_err("RCU grace-period kthread stack dump:\n");
-   sched_show_task(gpk);
-   wake_up_process(gpk);
-   }
-   }
-}
-
-/*
- * Dump stacks of all tasks running on stalled CPUs.  First try using
- * NMIs, but fall back to manual remote stack tracing on architectures
- * that don't support NMI-based stack dumps.  The NMI-triggered stack
- * traces are more accurate because they are printed by the target CPU.
- */
-static void rcu_dump_cpu_stacks(void)
-{
-   int cpu;
-   unsigned long flags;
-   struct rcu_node *rnp;
-
-   rcu_for_each_leaf_node(rnp) {
-   raw_spin_lock_irqsave_rcu_node(rnp, flags);
-   for_each_leaf_node_possible_cpu(rnp, cpu)
-   if (rnp->qsmask & leaf_node_cpu_bit(rnp, cpu))
-   if (!trigger_single_cpu_backtrace(cpu))
-   dump_cpu_task(cpu);
-   raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-   }
-}
-
-/*
- * If too much time has passed in the current grace period, and if
- * so configured, go kick the relevant kthreads.
- */
-static void rcu_stall_kick_kthreads(void)
-{
-   unsigned long j;
-
-   if (!rcu_kick_kthreads)
-   return;
-   j = READ_ONCE(rcu_state.jiffies_kick_kthreads);
-   if (time_after(jiffies, j) && rcu_state.gp_kthread &&
-   (rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) {
-   WARN_ONCE(1, "Kicking %s grace-period kthread\n",
- rcu_state.name);
-   rcu_ftrace_dump(DUMP_ALL);
-   wake_up_process(rcu_state.gp_kthread);
-   WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ);
-   }
-}
-
-static void panic_on_rcu_stall(void)
-{
-   if (sysctl_panic_on_rcu_stall)
-   panic("RCU Stall\n");
-}
-
-static void print_other_cpu_stall(unsigned long gp_seq)
-{
-   int cpu;
-   unsigned long flags;
-   unsigned long gpa;
-   unsigned long j;
-   int ndetected = 0;
-   struct rcu_node *rnp = rcu_get_root();
-   long totqlen = 0;
-
-   /* Kick and suppress, if so configured. */
-   rcu_stall_kick_kthreads();
-   if (rcu_cpu_stall_suppress)
-   return;
-
-   /*
-* OK, time to rat on our buddy...
-* See Documentation/RCU/stallwarn.txt for info on how to debug
-* RCU CPU stall warnings.
-*/
-   pr_err("INFO: %s detected stalls on CPUs/tasks:", rcu_state.name);
-   print_cpu_stall_info_begin();
-   rcu_for_each_leaf_node(rnp) {
-   raw_spin_lock_irqsave_rcu_

[PATCH tip/core/rcu 05/11] rcu: Move rcu_print_task_exp_stall() to tree_exp.h

2019-03-26 Thread Paul E. McKenney
Because expedited CPU stall warnings are contained within the
kernel/rcu/tree_exp.h file, rcu_print_task_exp_stall() should live
there too.  This commit carries out the required code motion.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree_exp.h| 32 
 kernel/rcu/tree_plugin.h | 31 ---
 2 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 4c2a0189e748..7be3e085ddd6 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -10,6 +10,7 @@
 #include <linux/lockdep.h>
 
 static void rcu_exp_handler(void *unused);
+static int rcu_print_task_exp_stall(struct rcu_node *rnp);
 
 /*
  * Record the start of an expedited grace period.
@@ -670,6 +671,27 @@ static void sync_sched_exp_online_cleanup(int cpu)
 {
 }
 
+/*
+ * Scan the current list of tasks blocked within RCU read-side critical
+ * sections, printing out the tid of each that is blocking the current
+ * expedited grace period.
+ */
+static int rcu_print_task_exp_stall(struct rcu_node *rnp)
+{
+   struct task_struct *t;
+   int ndetected = 0;
+
+   if (!rnp->exp_tasks)
+   return 0;
+   t = list_entry(rnp->exp_tasks->prev,
+  struct task_struct, rcu_node_entry);
+   list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
+   pr_cont(" P%d", t->pid);
+   ndetected++;
+   }
+   return ndetected;
+}
+
 #else /* #ifdef CONFIG_PREEMPT_RCU */
 
 /* Invoked on each online non-idle CPU for expedited quiescent state. */
@@ -709,6 +731,16 @@ static void sync_sched_exp_online_cleanup(int cpu)
WARN_ON_ONCE(ret);
 }
 
+/*
+ * Because preemptible RCU does not exist, we never have to check for
+ * tasks blocked within RCU read-side critical sections that are
+ * blocking the current expedited grace period.
+ */
+static int rcu_print_task_exp_stall(struct rcu_node *rnp)
+{
+   return 0;
+}
+
 #endif /* #else #ifdef CONFIG_PREEMPT_RCU */
 
 /**
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 7fa3bc4d481b..72519c57f656 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -642,27 +642,6 @@ static void rcu_read_unlock_special(struct task_struct *t)
rcu_preempt_deferred_qs_irqrestore(t, flags);
 }
 
-/*
- * Scan the current list of tasks blocked within RCU read-side critical
- * sections, printing out the tid of each that is blocking the current
- * expedited grace period.
- */
-static int rcu_print_task_exp_stall(struct rcu_node *rnp)
-{
-   struct task_struct *t;
-   int ndetected = 0;
-
-   if (!rnp->exp_tasks)
-   return 0;
-   t = list_entry(rnp->exp_tasks->prev,
-  struct task_struct, rcu_node_entry);
-   list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
-   pr_cont(" P%d", t->pid);
-   ndetected++;
-   }
-   return ndetected;
-}
-
 /*
  * Check that the list of blocked tasks for the newly completed grace
  * period is in fact empty.  It is a serious bug to complete a grace
@@ -906,16 +885,6 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
 }
 static void rcu_preempt_deferred_qs(struct task_struct *t) { }
 
-/*
- * Because preemptible RCU does not exist, we never have to check for
- * tasks blocked within RCU read-side critical sections that are
- * blocking the current expedited grace period.
- */
-static int rcu_print_task_exp_stall(struct rcu_node *rnp)
-{
-   return 0;
-}
-
 /*
  * Because there is no preemptible RCU, there can be no readers blocked,
  * so there is no need to check for blocked tasks.  So check only for
-- 
2.17.1
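
The traversal in rcu_print_task_exp_stall() starts at the entry just before ->exp_tasks and then uses list_for_each_entry_continue(), so the first entry visited is ->exp_tasks itself and the walk stops at the list head. The following is a minimal user-space sketch of that idiom, not kernel code. The struct node type and the list_init(), list_add_tail(), and count_from() helpers are hypothetical stand-ins written for this illustration.

```c
#include <stddef.h>

/* Hypothetical circular doubly linked list with a sentinel head. */
struct node {
	struct node *prev, *next;
	int pid;	/* meaningful only for non-head nodes */
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Visit every entry from "first" through the end of the list, recording
 * up to "max" pids and returning the number of entries visited.  A NULL
 * "first" plays the role of an empty ->exp_tasks pointer.
 */
static int count_from(struct node *head, struct node *first,
		      int *pids, int max)
{
	struct node *t;
	int n = 0;

	if (!first)
		return 0;
	/* Step back one entry, then advance before each visit. */
	t = first->prev;
	for (t = t->next; t != head; t = t->next) {
		if (n < max)
			pids[n] = t->pid;
		n++;
	}
	return n;
}
```

Stepping back to the previous entry works even when "first" is the first element, because the previous entry is then the sentinel head and the loop advances past it before visiting anything.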



Re: [PATCH 09/14] bus: ti-sysc: Move rstctrl reset to happen later

2019-03-26 Thread Suman Anna
Hi Tony,

On 3/26/19 6:13 PM, Tony Lindgren wrote:
> * Tony Lindgren  [190325 22:00]:
>> We should not do the reset until the clocks are enabled. Let's only init
>> rstctrl in sysc_init_resets() and do the reset later on in sysc_reset().
> ...
> 
>>  static int sysc_reset(struct sysc *ddata)
>>  {
>>  int offset = ddata->offsets[SYSC_SYSCONFIG];
>> -int val;
>> +int error, val;
>>  
>>  if (ddata->legacy_mode || offset < 0 ||
>>  ddata->cfg.quirks & SYSC_QUIRK_NO_RESET_ON_INIT)
>> -return 0;
>> +return sysc_rstctrl_reset_deassert(ddata, false);
>> +
>> +error = sysc_rstctrl_reset_deassert(ddata, true);
>> +if (error)
>> +return error;
> 
> This change is wrong, we need to deassert rstctrl reset before
> we enable clocks, not after. Updated version below.

Hmm, are you envisioning the SYSC reset (OCP SoftReset) here or the PRCM
RSTCTRL hardresets? The latter in general require the clocks to be
running first (the module won't be in ready status until you deassert the
hardresets with clocks running). You can look up the Warm-reset or
Cold-reset sequences in the TRMs for any of the processors.

I am working on preparing the next version of PRUSS patches with ti-sysc
on AM33xx/AM437x/AM57xx platforms, so will pick up these patches for my
testing.

regards
Suman

> 
> Regards,
> 
> Tony
> 
> 8< 
> From tony Mon Sep 17 00:00:00 2001
> From: Tony Lindgren 
> Date: Thu, 21 Mar 2019 11:00:21 -0700
> Subject: [PATCH] bus: ti-sysc: Move rstctrl reset to happen later
> 
> We can do the rstctrl a bit later, but need to deassert rstctrl reset
> before the clocks are enabled if asserted. Let's only init rstctrl
> in sysc_init_resets() and do the reset later on just before we enable
> the device clocks.
> 
> Signed-off-by: Tony Lindgren 
> ---
>  drivers/bus/ti-sysc.c | 61 +++
>  1 file changed, 39 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
> --- a/drivers/bus/ti-sysc.c
> +++ b/drivers/bus/ti-sysc.c
> @@ -339,38 +339,18 @@ static void sysc_disable_opt_clocks(struct sysc *ddata)
>  }
>  
>  /**
> - * sysc_init_resets - reset module on init
> + * sysc_init_resets - init rstctrl reset line if configured
>   * @ddata: device driver data
>   *
> - * A module can have both OCP softreset control and external rstctrl.
> - * If more complicated rstctrl resets are needed, please handle these
> - * directly from the child device driver and map only the module reset
> - * for the parent interconnect target module device.
> - *
> - * Automatic reset of the module on init can be skipped with the
> - * "ti,no-reset-on-init" device tree property.
> + * See sysc_rstctrl_reset_deassert().
>   */
>  static int sysc_init_resets(struct sysc *ddata)
>  {
> - int error;
> -
>   ddata->rsts =
>   devm_reset_control_array_get_optional_exclusive(ddata->dev);
>   if (IS_ERR(ddata->rsts))
>   return PTR_ERR(ddata->rsts);
>  
> - if (ddata->cfg.quirks & SYSC_QUIRK_NO_RESET_ON_INIT)
> - goto deassert;
> -
> - error = reset_control_assert(ddata->rsts);
> - if (error)
> - return error;
> -
> -deassert:
> - error = reset_control_deassert(ddata->rsts);
> - if (error)
> - return error;
> -
>   return 0;
>  }
>  
> @@ -1031,6 +1011,35 @@ static int sysc_legacy_init(struct sysc *ddata)
>   return error;
>  }
>  
> +/**
> + * sysc_rstctrl_reset_deassert - deassert rstctrl reset
> + * @ddata: device driver data
> + * @reset: reset before deassert
> + *
> + * A module can have both OCP softreset control and external rstctrl.
> + * If more complicated rstctrl resets are needed, please handle these
> + * directly from the child device driver and map only the module reset
> + * for the parent interconnect target module device.
> + *
> + * Automatic reset of the module on init can be skipped with the
> + * "ti,no-reset-on-init" device tree property.
> + */
> +static int sysc_rstctrl_reset_deassert(struct sysc *ddata, bool reset)
> +{
> + int error;
> +
> + if (!ddata->rsts)
> + return 0;
> +
> + if (reset) {
> + error = reset_control_assert(ddata->rsts);
> + if (error)
> + return error;
> + }
> +
> + return reset_control_deassert(ddata->rsts);
> +}
> +
>  static int sysc_reset(struct sysc *ddata)
>  {
>   int offset = ddata->offsets[SYSC_SYSCONFIG];
> @@ -1071,6 +1080,14 @@ static int sysc_init_module(struct sysc *ddata)
>  {
>   int error = 0;
>   bool manage_clocks = true;
> + bool reset = true;
> +
> + if (ddata->cfg.quirks & SYSC_QUIRK_NO_RESET_ON_INIT)
> + reset = false;
> +
> + error = sysc_rstctrl_reset_deassert(ddata, reset);
> + if (error)
> + return error;
>  
>   if (ddata->cfg.quirks &
>   (SYSC_QUIRK_NO_IDLE | SYSC_QUIRK_NO_IDLE_ON_INIT
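
The control flow of the sysc_rstctrl_reset_deassert() helper quoted above can be modeled in user space as follows. This is only a sketch of the helper's logic (skip when no reset line exists, optionally assert, always deassert), and it takes no position on whether the deassert belongs before or after the clocks are enabled. The struct mock_reset type and the mock_assert(), mock_deassert(), and rstctrl_reset_deassert() functions are hypothetical stand-ins, not the kernel reset-controller API.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reset control line. */
struct mock_reset {
	int asserted;		/* 1 while the line holds the module in reset */
	int assert_calls;	/* number of successful assert operations */
	int fail_assert;	/* if nonzero, mock_assert() returns this error */
};

static int mock_assert(struct mock_reset *rst)
{
	if (rst->fail_assert)
		return rst->fail_assert;
	rst->asserted = 1;
	rst->assert_calls++;
	return 0;
}

static int mock_deassert(struct mock_reset *rst)
{
	rst->asserted = 0;
	return 0;
}

/*
 * Mirror of the quoted helper's logic: do nothing when there is no
 * reset line, pulse the line only when "reset" is set, and always end
 * with the line deasserted unless the assert step failed.
 */
static int rstctrl_reset_deassert(struct mock_reset *rst, int reset)
{
	int error;

	if (!rst)
		return 0;

	if (reset) {
		error = mock_assert(rst);
		if (error)
			return error;
	}

	return mock_deassert(rst);
}
```

Passing reset as 0 corresponds to the "ti,no-reset-on-init" case in the quoted patch: the line is released without first being pulsed.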

[PATCH tip/core/rcu 02/11] rcu: Move RCU CPU stall-warning code out of tree_plugin.h

2019-03-26 Thread Paul E. McKenney
The RCU CPU stall-warning code for normal grace periods is currently
scattered across two files, due to earlier Tiny RCU support for RCU
CPU stall warnings and for old Kconfig options that have long since
been retired.  Given that it is hard for the lead RCU maintainer to
find relevant stall-warning code, it would be good to consolidate it.
This commit continues this process by moving stall-warning code from
kernel/rcu/tree_plugin.h to a new kernel/rcu/tree_stall.h file.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree_plugin.h | 90 -
 kernel/rcu/tree_stall.h  | 95 
 2 files changed, 95 insertions(+), 90 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 97dba50f6fb2..7fa3bc4d481b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -642,79 +642,6 @@ static void rcu_read_unlock_special(struct task_struct *t)
rcu_preempt_deferred_qs_irqrestore(t, flags);
 }
 
-/*
- * Dump detailed information for all tasks blocking the current RCU
- * grace period on the specified rcu_node structure.
- */
-static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
-{
-   unsigned long flags;
-   struct task_struct *t;
-
-   raw_spin_lock_irqsave_rcu_node(rnp, flags);
-   if (!rcu_preempt_blocked_readers_cgp(rnp)) {
-   raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-   return;
-   }
-   t = list_entry(rnp->gp_tasks->prev,
-  struct task_struct, rcu_node_entry);
-   list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
-   /*
-* We could be printing a lot while holding a spinlock.
-* Avoid triggering hard lockup.
-*/
-   touch_nmi_watchdog();
-   sched_show_task(t);
-   }
-   raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
-}
-
-/*
- * Dump detailed information for all tasks blocking the current RCU
- * grace period.
- */
-static void rcu_print_detail_task_stall(void)
-{
-   struct rcu_node *rnp = rcu_get_root();
-
-   rcu_print_detail_task_stall_rnp(rnp);
-   rcu_for_each_leaf_node(rnp)
-   rcu_print_detail_task_stall_rnp(rnp);
-}
-
-static void rcu_print_task_stall_begin(struct rcu_node *rnp)
-{
-   pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
-  rnp->level, rnp->grplo, rnp->grphi);
-}
-
-static void rcu_print_task_stall_end(void)
-{
-   pr_cont("\n");
-}
-
-/*
- * Scan the current list of tasks blocked within RCU read-side critical
- * sections, printing out the tid of each.
- */
-static int rcu_print_task_stall(struct rcu_node *rnp)
-{
-   struct task_struct *t;
-   int ndetected = 0;
-
-   if (!rcu_preempt_blocked_readers_cgp(rnp))
-   return 0;
-   rcu_print_task_stall_begin(rnp);
-   t = list_entry(rnp->gp_tasks->prev,
-  struct task_struct, rcu_node_entry);
-   list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
-   pr_cont(" P%d", t->pid);
-   ndetected++;
-   }
-   rcu_print_task_stall_end();
-   return ndetected;
-}
-
 /*
  * Scan the current list of tasks blocked within RCU read-side critical
  * sections, printing out the tid of each that is blocking the current
@@ -979,23 +906,6 @@ static bool rcu_preempt_need_deferred_qs(struct task_struct *t)
 }
 static void rcu_preempt_deferred_qs(struct task_struct *t) { }
 
-/*
- * Because preemptible RCU does not exist, we never have to check for
- * tasks blocked within RCU read-side critical sections.
- */
-static void rcu_print_detail_task_stall(void)
-{
-}
-
-/*
- * Because preemptible RCU does not exist, we never have to check for
- * tasks blocked within RCU read-side critical sections.
- */
-static int rcu_print_task_stall(struct rcu_node *rnp)
-{
-   return 0;
-}
-
 /*
  * Because preemptible RCU does not exist, we never have to check for
  * tasks blocked within RCU read-side critical sections that are
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 682189f4d083..6f5f94944f49 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -61,3 +61,98 @@ static int __init check_cpu_stall_init(void)
return 0;
 }
 early_initcall(check_cpu_stall_init);
+
+#ifdef CONFIG_PREEMPT
+
+/*
+ * Dump detailed information for all tasks blocking the current RCU
+ * grace period on the specified rcu_node structure.
+ */
+static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
+{
+   unsigned long flags;
+   struct task_struct *t;
+
+   raw_spin_lock_irqsave_rcu_node(rnp, flags);
+   if (!rcu_preempt_blocked_readers_cgp(rnp)) {
+   raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+   return;
+   }
+   t = list_entry(rnp->gp_tasks->prev,
+  struct task_struct, rcu_node_entry

[PATCH tip/core/rcu 10/11] rcu: Move forward-progress checkers into tree_stall.h

2019-03-26 Thread Paul E. McKenney
This commit further consolidates stall-warning functionality by moving
forward-progress checkers into kernel/rcu/tree_stall.h, updating a
comment or two while in the area.  More specifically, this commit moves
show_rcu_gp_kthreads(), rcu_check_gp_start_stall(), rcu_fwd_progress_check(),
sysrq_rcu, sysrq_show_rcu(), sysrq_rcudump_op, and rcu_sysrq_init() from
kernel/rcu/tree.c to kernel/rcu/tree_stall.h.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c   | 166 --
 kernel/rcu/tree.h   |   2 +
 kernel/rcu/tree_stall.h | 171 
 3 files changed, 173 insertions(+), 166 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 929531ed168c..4cb1ebd93b0c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -102,9 +102,6 @@ int rcu_num_lvls __read_mostly = RCU_NUM_LVLS;
 /* Number of rcu_nodes at specified level. */
 int num_rcu_lvl[] = NUM_RCU_LVL_INIT;
 int rcu_num_nodes __read_mostly = NUM_RCU_NODES; /* Total # rcu_nodes in use. */
-/* Commandeer a sysrq key to dump RCU's tree. */
-static bool sysrq_rcu;
-module_param(sysrq_rcu, bool, 0444);
 
 /*
  * The rcu_scheduler_active variable is initialized to the value
@@ -510,74 +507,6 @@ static const char *gp_state_getname(short gs)
return gp_state_names[gs];
 }
 
-/*
- * Show the state of the grace-period kthreads.
- */
-void show_rcu_gp_kthreads(void)
-{
-   int cpu;
-   unsigned long j;
-   unsigned long ja;
-   unsigned long jr;
-   unsigned long jw;
-   struct rcu_data *rdp;
-   struct rcu_node *rnp;
-
-   j = jiffies;
-   ja = j - READ_ONCE(rcu_state.gp_activity);
-   jr = j - READ_ONCE(rcu_state.gp_req_activity);
-   jw = j - READ_ONCE(rcu_state.gp_wake_time);
-   pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
-   rcu_state.name, gp_state_getname(rcu_state.gp_state),
-   rcu_state.gp_state,
-   rcu_state.gp_kthread ? rcu_state.gp_kthread->state : 0x1L,
-   ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq),
-   (long)READ_ONCE(rcu_state.gp_seq),
-   (long)READ_ONCE(rcu_get_root()->gp_seq_needed),
-   READ_ONCE(rcu_state.gp_flags));
-   rcu_for_each_node_breadth_first(rnp) {
-   if (ULONG_CMP_GE(rcu_state.gp_seq, rnp->gp_seq_needed))
-   continue;
-   pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld\n",
-   rnp->grplo, rnp->grphi, (long)rnp->gp_seq,
-   (long)rnp->gp_seq_needed);
-   if (!rcu_is_leaf_node(rnp))
-   continue;
-   for_each_leaf_node_possible_cpu(rnp, cpu) {
-   rdp = per_cpu_ptr(&rcu_data, cpu);
-   if (rdp->gpwrap ||
-   ULONG_CMP_GE(rcu_state.gp_seq,
-rdp->gp_seq_needed))
-   continue;
-   pr_info("\tcpu %d ->gp_seq_needed %ld\n",
-   cpu, (long)rdp->gp_seq_needed);
-   }
-   }
-   /* sched_show_task(rcu_state.gp_kthread); */
-}
-EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
-
-/* Dump grace-period-request information due to commandeered sysrq. */
-static void sysrq_show_rcu(int key)
-{
-   show_rcu_gp_kthreads();
-}
-
-static struct sysrq_key_op sysrq_rcudump_op = {
-   .handler = sysrq_show_rcu,
-   .help_msg = "show-rcu(y)",
-   .action_msg = "Show RCU tree",
-   .enable_mask = SYSRQ_ENABLE_DUMP,
-};
-
-static int __init rcu_sysrq_init(void)
-{
-   if (sysrq_rcu)
-   return register_sysrq_key('y', &sysrq_rcudump_op);
-   return 0;
-}
-early_initcall(rcu_sysrq_init);
-
 /*
  * Send along grace-period-related data for rcutorture diagnostics.
  */
@@ -2323,101 +2252,6 @@ void rcu_force_quiescent_state(void)
 }
 EXPORT_SYMBOL_GPL(rcu_force_quiescent_state);
 
-/*
- * This function checks for grace-period requests that fail to motivate
- * RCU to come out of its idle mode.
- */
-void
-rcu_check_gp_start_stall(struct rcu_node *rnp, struct rcu_data *rdp,
-const unsigned long gpssdelay)
-{
-   unsigned long flags;
-   unsigned long j;
-   struct rcu_node *rnp_root = rcu_get_root();
-   static atomic_t warned = ATOMIC_INIT(0);
-
-   if (!IS_ENABLED(CONFIG_PROVE_RCU) || rcu_gp_in_progress() ||
-   ULONG_CMP_GE(rnp_root->gp_seq, rnp_root->gp_seq_needed))
-   return;
-   j = jiffies; /* Expensive access, and in common case don't get here. */
-   if (time_before(j, READ_ONCE(rcu_state.gp_req_activity) + gpssdelay) ||
-   time_before(j, READ_ONCE(rcu_state.gp_activity) + gpssdelay) ||
-   atomic_read(&warned))
- 

[PATCH tip/core/rcu 11/11] rcu: Fix nohz status in stall warning

2019-03-26 Thread Paul E. McKenney
From: Neeraj Upadhyay 

The Documentation/RCU/stallwarn.txt file says that stall warnings
print "D" if dyntick-idle processing is enabled, but the code in
print_cpu_stall_fast_no_hz() prints "." instead.  This commit therefore
reverses the sense of the test to make the code match the documentation.

Signed-off-by: Neeraj Upadhyay 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree_stall.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 9e3db08d02bc..f65a73a97323 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -267,7 +267,7 @@ static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
 rdp->last_accelerate & 0xffff, jiffies & 0xffff,
".l"[rdp->all_lazy],
".L"[!rcu_segcblist_n_nonlazy_cbs(&rdp->cblist)],
-   ".D"[!rdp->tick_nohz_enabled_snap]);
+   ".D"[!!rdp->tick_nohz_enabled_snap]);
 }
 
 #else /* #ifdef CONFIG_RCU_FAST_NO_HZ */
-- 
2.17.1
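
The fix above relies on indexing a two-character string literal with a 0-or-1 value. A minimal standalone sketch of that idiom follows. The two helper functions are hypothetical names used only for this illustration and do not exist in the kernel.

```c
/*
 * Map a flag to a status character by indexing a string literal:
 * index 0 selects '.', index 1 selects 'D'.  The "!!" normalizes any
 * nonzero value to exactly 1, so the index can never run past 'D'.
 */
static char nohz_status_char(int tick_nohz_enabled_snap)
{
	return ".D"[!!tick_nohz_enabled_snap];
}

/*
 * The pre-patch form: a single "!" inverts the sense of the test, so
 * an enabled flag prints '.' and a disabled one prints 'D'.
 */
static char nohz_status_char_inverted(int tick_nohz_enabled_snap)
{
	return ".D"[!tick_nohz_enabled_snap];
}
```

Both forms keep the index within the literal. The only difference is which flag value maps to 'D', and that mapping is exactly what the one-line patch changes.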



[PATCH tip/core/rcu 08/11] rcu: Organize functions in tree_stall.h

2019-03-26 Thread Paul E. McKenney
This commit does only code movement, removal of now-unneeded forward
declarations, and addition of comments.  It organizes the functions
that implement RCU CPU stall warnings for normal grace periods into
three categories:

1.  Control of RCU CPU stall warnings, including computing timeouts.

2.  Interaction of stall warnings with grace periods.

3.  Actual printing of the RCU CPU stall-warning messages.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.h   |   1 -
 kernel/rcu/tree_stall.h | 180 ++--
 2 files changed, 97 insertions(+), 84 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 49bf3b00bb50..099410dbcbe9 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -442,6 +442,5 @@ static void rcu_dynticks_task_enter(void);
 static void rcu_dynticks_task_exit(void);
 
 /* Forward declarations for tree_stall.h */
-static int rcu_print_task_stall(struct rcu_node *rnp);
 static void record_gp_stall_check_time(void);
 static void check_cpu_stall(struct rcu_data *rdp);
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 19b915380a6f..03ed47883d8a 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -7,6 +7,9 @@
  * Author: Paul E. McKenney 
  */
 
+//
+//
+// Controlling CPU stall warnings, including delay calculation.
 
 /* panic() on RCU Stall sysctl. */
 int sysctl_panic_on_rcu_stall __read_mostly;
@@ -17,6 +20,7 @@ int sysctl_panic_on_rcu_stall __read_mostly;
 #define RCU_STALL_DELAY_DELTA 0
 #endif
 
+/* Limit-check stall timeouts specified at boottime and runtime. */
 int rcu_jiffies_till_stall_check(void)
 {
int till_stall_check = READ_ONCE(rcu_cpu_stall_timeout);
@@ -36,6 +40,7 @@ int rcu_jiffies_till_stall_check(void)
 }
 EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
 
+/* Don't do RCU CPU stall warnings during long sysrq printouts. */
 void rcu_sysrq_start(void)
 {
if (!rcu_cpu_stall_suppress)
@@ -48,6 +53,7 @@ void rcu_sysrq_end(void)
rcu_cpu_stall_suppress = 0;
 }
 
+/* Don't print RCU CPU stall warnings during a kernel panic. */
 static int rcu_panic(struct notifier_block *this, unsigned long ev, void *ptr)
 {
rcu_cpu_stall_suppress = 1;
@@ -65,6 +71,78 @@ static int __init check_cpu_stall_init(void)
 }
 early_initcall(check_cpu_stall_init);
 
+/* If so specified via sysctl, panic, yielding cleaner stall-warning output. */
+static void panic_on_rcu_stall(void)
+{
+   if (sysctl_panic_on_rcu_stall)
+   panic("RCU Stall\n");
+}
+
+/**
+ * rcu_cpu_stall_reset - prevent further stall warnings in current grace period
+ *
+ * Set the stall-warning timeout way off into the future, thus preventing
+ * any RCU CPU stall-warning messages from appearing in the current set of
+ * RCU grace periods.
+ *
+ * The caller must disable hard irqs.
+ */
+void rcu_cpu_stall_reset(void)
+{
+   WRITE_ONCE(rcu_state.jiffies_stall, jiffies + ULONG_MAX / 2);
+}
+
+//
+//
+// Interaction with RCU grace periods
+
+/* Start of new grace period, so record stall time (and forcing times). */
+static void record_gp_stall_check_time(void)
+{
+   unsigned long j = jiffies;
+   unsigned long j1;
+
+   rcu_state.gp_start = j;
+   j1 = rcu_jiffies_till_stall_check();
+   /* Record ->gp_start before ->jiffies_stall. */
+   smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */
+   rcu_state.jiffies_resched = j + j1 / 2;
+   rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
+}
+
+/* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */
+static void zero_cpu_stall_ticks(struct rcu_data *rdp)
+{
+   rdp->ticks_this_gp = 0;
+   rdp->softirq_snap = kstat_softirqs_cpu(RCU_SOFTIRQ, smp_processor_id());
+   WRITE_ONCE(rdp->last_fqs_resched, jiffies);
+}
+
+/*
+ * If too much time has passed in the current grace period, and if
+ * so configured, go kick the relevant kthreads.
+ */
+static void rcu_stall_kick_kthreads(void)
+{
+   unsigned long j;
+
+   if (!rcu_kick_kthreads)
+   return;
+   j = READ_ONCE(rcu_state.jiffies_kick_kthreads);
+   if (time_after(jiffies, j) && rcu_state.gp_kthread &&
+   (rcu_gp_in_progress() || READ_ONCE(rcu_state.gp_flags))) {
+   WARN_ONCE(1, "Kicking %s grace-period kthread\n",
+ rcu_state.name);
+   rcu_ftrace_dump(DUMP_ALL);
+   wake_up_process(rcu_state.gp_kthread);
+   WRITE_ONCE(rcu_state.jiffies_kick_kthreads, j + HZ);
+   }
+}
+
+//
+//
+// Printing RCU CPU stall warnings
+
 #ifdef CONFIG_PREEMPT
 
 /*
@@ -137,43 +215,6 @@ static int rcu_print_task_stall(struct rcu_node *rnp)
 }
 #endif /* #else #ifdef C
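
The rcu_cpu_stall_reset() function shown in this patch pushes ->jiffies_stall to jiffies + ULONG_MAX / 2, which works because the stall checks compare times with wraparound-safe arithmetic. The sketch below illustrates that arithmetic in user space. The time_after_ul() and far_future() helpers are hypothetical names modeled on the kernel's time_after() macro, and the sketch assumes the usual two's-complement conversion from unsigned long to long.

```c
#include <limits.h>

/*
 * Wraparound-safe "a is after b": the unsigned difference is
 * reinterpreted as signed, so the result stays correct as long as the
 * two times are less than half the counter range apart.
 */
static int time_after_ul(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * A deadline about half the counter range ahead of "now", which is as
 * far in the future as the signed comparison above can represent.
 */
static unsigned long far_future(unsigned long now)
{
	return now + ULONG_MAX / 2;
}
```

With a deadline produced by far_future(), no realistic value of the current time compares as later than the deadline, which matches the stated goal of suppressing further stall warnings in the current grace period.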

[PATCH tip/core/rcu 06/11] rcu: Inline RCU stall-warning info helper functions

2019-03-26 Thread Paul E. McKenney
The print_cpu_stall_info_begin() and print_cpu_stall_info_end() functions
each print a single character onto the console, and are a holdover from a
time when RCU CPU stall warning messages could be abbreviated using a
long-gone Kconfig option.  This commit therefore adds these single
characters to already-printed strings in the calling functions, and then
eliminates both print_cpu_stall_info_begin() and print_cpu_stall_info_end().

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.h|  2 --
 kernel/rcu/tree_plugin.h | 12 
 kernel/rcu/tree_stall.h  | 12 
 3 files changed, 4 insertions(+), 22 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index c6df9a13dd06..d73472af49e7 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -416,9 +416,7 @@ static void rcu_prepare_for_idle(void);
 static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
 static bool rcu_preempt_need_deferred_qs(struct task_struct *t);
 static void rcu_preempt_deferred_qs(struct task_struct *t);
-static void print_cpu_stall_info_begin(void);
 static void print_cpu_stall_info(int cpu);
-static void print_cpu_stall_info_end(void);
 static void zero_cpu_stall_ticks(struct rcu_data *rdp);
 static bool rcu_nocb_cpu_needs_barrier(int cpu);
 static struct swait_queue_head *rcu_nocb_gp_get(struct rcu_node *rnp);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 72519c57f656..2df5bb04fd7a 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1550,12 +1550,6 @@ static void print_cpu_stall_fast_no_hz(char *cp, int cpu)
 
 #endif /* #else #ifdef CONFIG_RCU_FAST_NO_HZ */
 
-/* Initiate the stall-info list. */
-static void print_cpu_stall_info_begin(void)
-{
-   pr_cont("\n");
-}
-
 /*
  * Print out diagnostic information for the specified stalled CPU.
  *
@@ -1606,12 +1600,6 @@ static void print_cpu_stall_info(int cpu)
   fast_no_hz);
 }
 
-/* Terminate the stall-info list. */
-static void print_cpu_stall_info_end(void)
-{
-   pr_err("\t");
-}
-
 /* Zero ->ticks_this_gp and snapshot the number of RCU softirq handlers. */
 static void zero_cpu_stall_ticks(struct rcu_data *rdp)
 {
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index b476786b8ef7..7ef3b596e45f 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -243,8 +243,7 @@ static void print_other_cpu_stall(unsigned long gp_seq)
 * See Documentation/RCU/stallwarn.txt for info on how to debug
 * RCU CPU stall warnings.
 */
-   pr_err("INFO: %s detected stalls on CPUs/tasks:", rcu_state.name);
-   print_cpu_stall_info_begin();
+   pr_err("INFO: %s detected stalls on CPUs/tasks:\n", rcu_state.name);
rcu_for_each_leaf_node(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
ndetected += rcu_print_task_stall(rnp);
@@ -258,10 +257,9 @@ static void print_other_cpu_stall(unsigned long gp_seq)
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
 
-   print_cpu_stall_info_end();
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
-   pr_cont("(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n",
+   pr_cont("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n",
   smp_processor_id(), (long)(jiffies - rcu_state.gp_start),
   (long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
if (ndetected) {
@@ -314,15 +312,13 @@ static void print_cpu_stall(void)
 * See Documentation/RCU/stallwarn.txt for info on how to debug
 * RCU CPU stall warnings.
 */
-   pr_err("INFO: %s self-detected stall on CPU", rcu_state.name);
-   print_cpu_stall_info_begin();
+   pr_err("INFO: %s self-detected stall on CPU\n", rcu_state.name);
raw_spin_lock_irqsave_rcu_node(rdp->mynode, flags);
print_cpu_stall_info(smp_processor_id());
raw_spin_unlock_irqrestore_rcu_node(rdp->mynode, flags);
-   print_cpu_stall_info_end();
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
-   pr_cont(" (t=%lu jiffies g=%ld q=%lu)\n",
+   pr_cont("\t(t=%lu jiffies g=%ld q=%lu)\n",
jiffies - rcu_state.gp_start,
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
 
-- 
2.17.1



[PATCH tip/core/rcu 09/11] rcu: Move irq-disabled stall-warning checking to tree_stall.h

2019-03-26 Thread Paul E. McKenney
The rcu_iw_handler() function's sole purpose in life is to indicate
whether a stalled CPU had interrupts disabled, so it belongs in
kernel/rcu/tree_stall.h.  This commit therefore makes that move,
clarifying its header comment while in the area.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.c   | 21 -
 kernel/rcu/tree.h   |  1 +
 kernel/rcu/tree_stall.h | 20 
 3 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 001dd05f6e38..929531ed168c 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1031,27 +1031,6 @@ static int dyntick_save_progress_counter(struct rcu_data *rdp)
return 0;
 }
 
-/*
- * Handler for the irq_work request posted when a grace period has
- * gone on for too long, but not yet long enough for an RCU CPU
- * stall warning.  Set state appropriately, but just complain if
- * there is unexpected state on entry.
- */
-static void rcu_iw_handler(struct irq_work *iwp)
-{
-   struct rcu_data *rdp;
-   struct rcu_node *rnp;
-
-   rdp = container_of(iwp, struct rcu_data, rcu_iw);
-   rnp = rdp->mynode;
-   raw_spin_lock_rcu_node(rnp);
-   if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
-   rdp->rcu_iw_gp_seq = rnp->gp_seq;
-   rdp->rcu_iw_pending = false;
-   }
-   raw_spin_unlock_rcu_node(rnp);
-}
-
 /*
  * Return true if the specified CPU has passed through a quiescent
  * state by virtue of being in or having passed through an dynticks
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 099410dbcbe9..f882ce3ca5a5 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -443,4 +443,5 @@ static void rcu_dynticks_task_exit(void);
 
 /* Forward declarations for tree_stall.h */
 static void record_gp_stall_check_time(void);
+static void rcu_iw_handler(struct irq_work *iwp);
 static void check_cpu_stall(struct rcu_data *rdp);
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index 03ed47883d8a..526e223e41ce 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -139,6 +139,26 @@ static void rcu_stall_kick_kthreads(void)
}
 }
 
+/*
+ * Handler for the irq_work request posted about halfway into the RCU CPU
+ * stall timeout, and used to detect excessive irq disabling.  Set state
+ * appropriately, but just complain if there is unexpected state on entry.
+ */
+static void rcu_iw_handler(struct irq_work *iwp)
+{
+   struct rcu_data *rdp;
+   struct rcu_node *rnp;
+
+   rdp = container_of(iwp, struct rcu_data, rcu_iw);
+   rnp = rdp->mynode;
+   raw_spin_lock_rcu_node(rnp);
+   if (!WARN_ON_ONCE(!rdp->rcu_iw_pending)) {
+   rdp->rcu_iw_gp_seq = rnp->gp_seq;
+   rdp->rcu_iw_pending = false;
+   }
+   raw_spin_unlock_rcu_node(rnp);
+}
+
 //
 //
 // Printing RCU CPU stall warnings
-- 
2.17.1
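
The rcu_iw_handler() function recovers its enclosing rcu_data structure from the embedded irq_work pointer using container_of(). A minimal user-space sketch of that pattern follows. The container_of() macro here is a simplified stand-in for the kernel's, struct work_item and struct cpu_data are hypothetical stand-ins for struct irq_work and struct rcu_data, and the rcu_node locking is omitted.

```c
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct irq_work. */
struct work_item {
	int unused;
};

/* Hypothetical stand-in for the relevant fields of struct rcu_data. */
struct cpu_data {
	unsigned long iw_gp_seq;	/* grace period seen by the handler */
	int iw_pending;			/* 1 while a work item is posted */
	struct work_item iw;		/* embedded work item */
};

/*
 * Record the current grace-period number and clear the pending flag.
 * Returns 0 on success, or -1 if no work item was pending, which is
 * the case the kernel handler complains about via WARN_ON_ONCE().
 */
static int iw_handler(struct work_item *iwp, unsigned long cur_gp_seq)
{
	struct cpu_data *cd = container_of(iwp, struct cpu_data, iw);

	if (!cd->iw_pending)
		return -1;
	cd->iw_gp_seq = cur_gp_seq;
	cd->iw_pending = 0;
	return 0;
}
```

Because the work item is embedded in the per-CPU structure, the handler needs no lookup table: subtracting the member offset from the work-item pointer yields the enclosing structure.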



[PATCH tip/core/rcu 04/11] rcu: Inline RCU task stall-warning helper functions

2019-03-26 Thread Paul E. McKenney
The rcu_print_detail_task_stall(), rcu_print_task_stall_begin(), and
rcu_print_task_stall_end() functions were defined to allow long-gone
Kconfig options to provide an abbreviated RCU CPU stall warning printout.
This commit saves a few lines of code by inlining them into their sole
callers.

While in the area, a useless call of rcu_print_detail_task_stall_rnp()
on the root rcu_node structure was eliminated.  If there is only one
rcu_node structure, its tasks get printed twice, but if there are more,
the root rcu_node structure is guaranteed to have an empty list of blocked
tasks, hence the uselessness.  (Long ago, root rcu_node structures with
non-empty ->blkd_tasks lists could happen, but no longer.)

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/tree.h   |  1 -
 kernel/rcu/tree_stall.h | 36 +++-
 2 files changed, 7 insertions(+), 30 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 3c4e26fff806..c6df9a13dd06 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -445,7 +445,6 @@ static void rcu_dynticks_task_enter(void);
 static void rcu_dynticks_task_exit(void);
 
 /* Forward declarations for tree_stall.h */
-static void rcu_print_detail_task_stall(void);
 static int rcu_print_task_stall(struct rcu_node *rnp);
 static void record_gp_stall_check_time(void);
 static void check_cpu_stall(struct rcu_data *rdp);
diff --git a/kernel/rcu/tree_stall.h b/kernel/rcu/tree_stall.h
index e0e73f493363..b476786b8ef7 100644
--- a/kernel/rcu/tree_stall.h
+++ b/kernel/rcu/tree_stall.h
@@ -94,30 +94,6 @@ static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 }
 
-/*
- * Dump detailed information for all tasks blocking the current RCU
- * grace period.
- */
-static void rcu_print_detail_task_stall(void)
-{
-   struct rcu_node *rnp = rcu_get_root();
-
-   rcu_print_detail_task_stall_rnp(rnp);
-   rcu_for_each_leaf_node(rnp)
-   rcu_print_detail_task_stall_rnp(rnp);
-}
-
-static void rcu_print_task_stall_begin(struct rcu_node *rnp)
-{
-   pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
-  rnp->level, rnp->grplo, rnp->grphi);
-}
-
-static void rcu_print_task_stall_end(void)
-{
-   pr_cont("\n");
-}
-
 /*
  * Scan the current list of tasks blocked within RCU read-side critical
  * sections, printing out the tid of each.
@@ -129,14 +105,15 @@ static int rcu_print_task_stall(struct rcu_node *rnp)
 
if (!rcu_preempt_blocked_readers_cgp(rnp))
return 0;
-   rcu_print_task_stall_begin(rnp);
+   pr_err("\tTasks blocked on level-%d rcu_node (CPUs %d-%d):",
+  rnp->level, rnp->grplo, rnp->grphi);
t = list_entry(rnp->gp_tasks->prev,
   struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
pr_cont(" P%d", t->pid);
ndetected++;
}
-   rcu_print_task_stall_end();
+   pr_cont("\n");
return ndetected;
 }
 
@@ -146,7 +123,7 @@ static int rcu_print_task_stall(struct rcu_node *rnp)
  * Because preemptible RCU does not exist, we never have to check for
  * tasks blocked within RCU read-side critical sections.
  */
-static void rcu_print_detail_task_stall(void)
+static void rcu_print_detail_task_stall_rnp(struct rcu_node *rnp)
 {
 }
 
@@ -253,7 +230,7 @@ static void print_other_cpu_stall(unsigned long gp_seq)
unsigned long gpa;
unsigned long j;
int ndetected = 0;
-   struct rcu_node *rnp = rcu_get_root();
+   struct rcu_node *rnp;
long totqlen = 0;
 
/* Kick and suppress, if so configured. */
@@ -291,7 +268,8 @@ static void print_other_cpu_stall(unsigned long gp_seq)
rcu_dump_cpu_stacks();
 
/* Complain about tasks blocking the grace period. */
-   rcu_print_detail_task_stall();
+   rcu_for_each_leaf_node(rnp)
+   rcu_print_detail_task_stall_rnp(rnp);
} else {
if (rcu_seq_current(&rcu_state.gp_seq) != gp_seq) {
pr_err("INFO: Stall ended before state dump start\n");
-- 
2.17.1



Re: INFO: rcu detected stall in __perf_sw_event

2019-03-26 Thread Finn Thain
On Tue, 26 Mar 2019, syzbot wrote:

> syzbot has bisected this bug to:
> 
> commit cf85d89562f39cc7ae73de54639f1915a9195b7a
> Author: Finn Thain 
> Date:   Fri May 25 07:34:36 2018 +
> 
>m68k/mac: Enable PDMA for PowerBook 500 series
> 

Looks like a false positive. But if you really are running syzkaller on a 
PowerBook 500, you'll want to check that you have an FPU, otherwise a CPU 
erratum will probably mess up your results.

-- 

> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1226cb8b20
> start commit:   b0314565 Merge tag 'for_linus' of git://git.kernel.org/pub..
> git tree:   upstream
> final crash:https://syzkaller.appspot.com/x/report.txt?x=1126cb8b20
> console output: https://syzkaller.appspot.com/x/log.txt?x=1626cb8b20
> kernel config:  https://syzkaller.appspot.com/x/.config?x=8f00801d7b7c4fe6
> dashboard link: https://syzkaller.appspot.com/bug?extid=a41ac89a0712acde0e84
> syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1707cd2f40
> 
> Reported-by: syzbot+a41ac89a0712acde0...@syzkaller.appspotmail.com
> Fixes: cf85d89562f3 ("m68k/mac: Enable PDMA for PowerBook 500 series")
> 
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection


[PATCH tip/core/rcu 4/9] rcutorture: Remove ->ext_irq_conflict field

2019-03-26 Thread Paul E. McKenney
Back when there was a separate RCU-bh flavor, the ->ext_irq_conflict
field was used to prevent executing local_bh_enable() while interrupts
were disabled.  However, there is no longer an RCU-bh flavor, so this
commit removes the no-longer-needed ->ext_irq_conflict field.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/rcutorture.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 2453229ba15a..21ab3c7eb221 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -299,7 +299,6 @@ struct rcu_torture_ops {
 	int irq_capable;
 	int can_boost;
 	int extendables;
-	int ext_irq_conflict;
 	const char *name;
 };
 
@@ -1170,10 +1169,6 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
 	    ((!(mask & RCUTORTURE_RDR_BH) && (oldmask & RCUTORTURE_RDR_BH)) ||
 	     (!(mask & RCUTORTURE_RDR_RBH) && (oldmask & RCUTORTURE_RDR_RBH))))
 		mask |= RCUTORTURE_RDR_BH | RCUTORTURE_RDR_RBH;
-	if ((mask & RCUTORTURE_RDR_IRQ) &&
-	    !(mask & cur_ops->ext_irq_conflict) &&
-	    (oldmask & cur_ops->ext_irq_conflict))
-		mask |= cur_ops->ext_irq_conflict; /* Or if readers object. */
 	return mask ?: RCUTORTURE_RDR_RCU;
 }
 
-- 
2.17.1



[PATCH tip/core/rcu 7/9] rcuperf: Fix cleanup path for invalid perf_type strings

2019-03-26 Thread Paul E. McKenney
If the specified rcuperf.perf_type is not in the rcu_perf_init()
function's perf_ops[] array, rcuperf prints some console messages and
then invokes rcu_perf_cleanup() to set state so that a future torture
test can run.  However, rcu_perf_cleanup() also attempts to end the
test that didn't actually start, and in doing so relies on the value
of cur_ops, a value that is not particularly relevant in this case.
This can result in confusing output or even follow-on failures due to
attempts to use facilities that have not been properly initialized.

This commit therefore sets the value of cur_ops to NULL in this case and
inserts a check near the beginning of rcu_perf_cleanup(), thus avoiding
relying on an irrelevant cur_ops value.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/rcuperf.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index c29761152874..7a6890b23c5f 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -494,6 +494,10 @@ rcu_perf_cleanup(void)
 
 	if (torture_cleanup_begin())
 		return;
+	if (!cur_ops) {
+		torture_cleanup_end();
+		return;
+	}
 
 	if (reader_tasks) {
 		for (i = 0; i < nrealreaders; i++)
@@ -614,6 +618,7 @@ rcu_perf_init(void)
 		pr_cont("\n");
 		WARN_ON(!IS_MODULE(CONFIG_RCU_PERF_TEST));
 		firsterr = -EINVAL;
+		cur_ops = NULL;
 		goto unwind;
 	}
 	if (cur_ops->init)
-- 
2.17.1
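
For readers following along, here is a minimal userspace sketch of the pattern the commit log describes. It is not the rcuperf code: sketch_ops, ops_table, sketch_init() and sketch_cleanup() are hypothetical names, and the lookup loop is only an assumption about how a failed search can leave cur_ops pointing at an unrelated entry.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an ops structure; not the kernel's. */
struct sketch_ops {
	const char *name;
};

static const struct sketch_ops ops_a = { "rcu" };
static const struct sketch_ops ops_b = { "srcu" };
static const struct sketch_ops *const ops_table[] = { &ops_a, &ops_b };

static const struct sketch_ops *cur_ops;
static int teardown_count;	/* counts real teardown passes */

static void sketch_cleanup(void)
{
	/* The added check: no test was selected, so there is nothing to end. */
	if (!cur_ops)
		return;
	teardown_count++;
}

static int sketch_init(const char *type)
{
	size_t n = sizeof(ops_table) / sizeof(ops_table[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		cur_ops = ops_table[i];
		if (strcmp(type, cur_ops->name) == 0)
			break;
	}
	if (i == n) {
		/* Without this, cur_ops would still point at the last entry. */
		cur_ops = NULL;
		sketch_cleanup();
		return -1;
	}
	return 0;
}
```

Calling sketch_init() with an unknown type leaves teardown_count untouched, whereas a valid type followed by sketch_cleanup() runs the teardown once.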



RE: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver

2019-03-26 Thread Sonal Santan



> -Original Message-
> From: Daniel Vetter [mailto:daniel.vet...@ffwll.ch] On Behalf Of Daniel Vetter
> Sent: Monday, March 25, 2019 1:28 PM
> To: Sonal Santan 
> Cc: dri-de...@lists.freedesktop.org; gre...@linuxfoundation.org; Cyril
> Chemparathy ; linux-kernel@vger.kernel.org; Lizhi Hou
> ; Michal Simek ; airl...@redhat.com
> Subject: Re: [RFC PATCH Xilinx Alveo 0/6] Xilinx PCIe accelerator driver
> 
> On Tue, Mar 19, 2019 at 02:53:55PM -0700, sonal.san...@xilinx.com wrote:
> > From: Sonal Santan 
> >
> > Hello,
> >
> > This patch series adds drivers for Xilinx Alveo PCIe accelerator cards.
> > These drivers are part of Xilinx Runtime (XRT) open source stack and
> > have been deployed by leading FaaS vendors and many enterprise
> customers.
> 
> Cool, first fpga driver submitted to drm! And from a high level I think this
> makes a lot of sense.
> 
> > PLATFORM ARCHITECTURE
> >
> > Alveo PCIe platforms have a static shell and a reconfigurable
> > (dynamic) region. The shell is automatically loaded from PROM when
> > host is booted and PCIe is enumerated by BIOS. Shell cannot be changed
> > till next cold reboot. The shell exposes two physical functions:
> > management physical function and user physical function.
> >
> > Users compile their high level design in C/C++/OpenCL or RTL into FPGA
> > image using SDx compiler. The FPGA image packaged as xclbin file can
> > be loaded onto reconfigurable region. The image may contain one or
> > more compute unit. Users can dynamically swap the full image running
> > on the reconfigurable region in order to switch between different
> > workloads.
> >
> > XRT DRIVERS
> >
> > XRT Linux kernel driver xmgmt binds to mgmt pf. The driver is modular
> > and organized into several platform drivers which primarily handle the
> > following functionality:
> > 1.  ICAP programming (FPGA bitstream download with FPGA Mgr
> >     integration)
> > 2.  Clock scaling
> > 3.  Loading firmware container also called dsabin (embedded Microblaze
> >     firmware for ERT and XMC, optional clearing bitstream)
> > 4.  In-band sensors: temp, voltage, power, etc.
> > 5.  AXI Firewall management
> > 6.  Device reset and rescan
> > 7.  Hardware mailbox for communication between two physical functions
> >
> > XRT Linux kernel driver xocl binds to user pf. Like its peer, this
> > driver is also modular and organized into several platform drivers
> > which handle the following functionality:
> > 1.  Device memory topology discovery and memory management
> > 2.  Buffer object abstraction and management for client process
> > 3.  XDMA MM PCIe DMA engine programming
> > 4.  Multi-process aware context management
> > 5.  Compute unit execution management (optionally with help of ERT) for
> >     client processes
> > 6.  Hardware mailbox for communication between two physical functions
> >
> > The drivers export ioctls and sysfs nodes for various services. xocl
> > driver makes heavy use of DRM GEM features for device memory
> > management, reference counting, mmap support and export/import. xocl
> > also includes a simple scheduler called KDS which schedules compute
> > units and interacts with hardware scheduler running ERT firmware. The
> > scheduler understands custom opcodes packaged into command objects and
> > provides an asynchronous command done notification via POSIX poll.
> >
> > More details on architecture, software APIs, ioctl definitions,
> > execution model, etc. is available as Sphinx documentation--
> >
> > https://xilinx.github.io/XRT/2018.3/html/index.html
> >
> > The complete runtime software stack (XRT) which includes out of tree
> > kernel drivers, user space libraries, board utilities and firmware for
> > the hardware scheduler is open source and available at
> > https://github.com/Xilinx/XRT
> 
> Before digging into the implementation side more I looked into the userspace
> here. I admit I got lost a bit, since there's lots of indirections and
> abstractions going on, but it seems like this is just a fancy ioctl
> wrapper/driver backend abstractions. Not really something applications
> would use.

[PATCH tip/core/rcu 2/9] tools/.../rcutorture: Convert to SPDX license identifier

2019-03-26 Thread Paul E. McKenney
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address and add copyright notices.

Signed-off-by: Paul E. McKenney 
---
 .../selftests/rcutorture/bin/configNR_CPUS.sh  | 17 ++---
 .../rcutorture/bin/config_override.sh  | 17 ++---
 .../selftests/rcutorture/bin/configcheck.sh| 18 +++---
 .../selftests/rcutorture/bin/configinit.sh | 17 ++---
 .../selftests/rcutorture/bin/cpus2use.sh   | 17 ++---
 .../selftests/rcutorture/bin/functions.sh  | 17 ++---
 .../testing/selftests/rcutorture/bin/jitter.sh | 17 ++---
 .../selftests/rcutorture/bin/kvm-build.sh  | 17 ++---
 .../rcutorture/bin/kvm-find-errors.sh  |  5 +
 .../rcutorture/bin/kvm-recheck-lock.sh | 17 ++---
 .../rcutorture/bin/kvm-recheck-rcu.sh  | 17 ++---
 .../bin/kvm-recheck-rcuperf-ftrace.sh  | 17 ++---
 .../rcutorture/bin/kvm-recheck-rcuperf.sh  | 17 ++---
 .../selftests/rcutorture/bin/kvm-recheck.sh| 17 ++---
 .../selftests/rcutorture/bin/kvm-test-1-run.sh | 17 ++---
 tools/testing/selftests/rcutorture/bin/kvm.sh  | 17 ++---
 .../selftests/rcutorture/bin/mkinitrd.sh   | 15 +--
 .../selftests/rcutorture/bin/parse-build.sh| 17 ++---
 .../selftests/rcutorture/bin/parse-console.sh  | 17 ++---
 .../rcutorture/configs/lock/ver_functions.sh   | 17 ++---
 .../rcutorture/configs/rcu/ver_functions.sh| 17 ++---
 .../configs/rcuperf/ver_functions.sh   | 17 ++---
 22 files changed, 47 insertions(+), 314 deletions(-)

diff --git a/tools/testing/selftests/rcutorture/bin/configNR_CPUS.sh b/tools/testing/selftests/rcutorture/bin/configNR_CPUS.sh
index 43540f1828cc..2deea2169fc2 100755
--- a/tools/testing/selftests/rcutorture/bin/configNR_CPUS.sh
+++ b/tools/testing/selftests/rcutorture/bin/configNR_CPUS.sh
@@ -1,4 +1,5 @@
 #!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
 #
 # Extract the number of CPUs expected from the specified Kconfig-file
 # fragment by checking CONFIG_SMP and CONFIG_NR_CPUS.  If the specified
@@ -7,23 +8,9 @@
 #
 # Usage: configNR_CPUS.sh config-frag
 #
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 2 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program; if not, you can access it online at
-# http://www.gnu.org/licenses/gpl-2.0.html.
-#
 # Copyright (C) IBM Corporation, 2013
 #
-# Authors: Paul E. McKenney 
+# Authors: Paul E. McKenney 
 
 cf=$1
 if test ! -r $cf
diff --git a/tools/testing/selftests/rcutorture/bin/config_override.sh b/tools/testing/selftests/rcutorture/bin/config_override.sh
index ef7fcbac3d42..90016c359e83 100755
--- a/tools/testing/selftests/rcutorture/bin/config_override.sh
+++ b/tools/testing/selftests/rcutorture/bin/config_override.sh
@@ -1,4 +1,5 @@
 #!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
 #
 # config_override.sh base override
 #
@@ -6,23 +7,9 @@
 # that conflict with any in override, concatenating what remains and
 # sending the result to standard output.
 #
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 2 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program; if not, you can access it online at
-# http://www.gnu.org/licenses/gpl-2.0.html.
-#
 # Copyright (C) IBM Corporation, 2017
 #
-# Authors: Paul E. McKenney 
+# Authors: Paul E. McKenney 
 
 base=$1
 if test -r $base
diff --git a/tools/testing/selftests/rcutorture/bin/configcheck.sh b/tools/testing/selftests/rcutorture/bin/configcheck.sh
index 197deece7c7c..5b25524d0366 100755
--- a/tools/testing/selftests/rcutorture/bin/configcheck.sh
+++ b/tools/testing/selftests/rcutorture/bin/configcheck.sh
@@ -1,23 +1,11 @@
 #!/bin/bash
-# Usage: configcheck.sh .config .config-template
-#
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Publ

[PATCH tip/core/rcu 5/9] rcutorture: Fix expected forward progress duration in OOM notifier

2019-03-26 Thread Paul E. McKenney
From: Neeraj Upadhyay 

The rcutorture_oom_notify() function has a misplaced close parenthesis
that results in increasingly long delays in rcu_fwd_progress_check()'s
checking for various RCU forward-progress problems.  This commit therefore
puts the parenthesis in the right place.

Signed-off-by: Neeraj Upadhyay 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/rcutorture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 21ab3c7eb221..b42682b94cb7 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1843,7 +1843,7 @@ static int rcutorture_oom_notify(struct notifier_block *self,
 	WARN(1, "%s invoked upon OOM during forward-progress testing.\n",
 	     __func__);
 	rcu_torture_fwd_cb_hist();
-	rcu_fwd_progress_check(1 + (jiffies - READ_ONCE(rcu_fwd_startat) / 2));
+	rcu_fwd_progress_check(1 + (jiffies - READ_ONCE(rcu_fwd_startat)) / 2);
 	WRITE_ONCE(rcu_fwd_emergency_stop, true);
 	smp_mb(); /* Emergency stop before free and wait to avoid hangs. */
 	pr_info("%s: Freed %lu RCU callbacks.\n",
-- 
2.17.1
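
To see why the parenthesis placement matters, here is a small sketch of the two expressions with plain integers standing in for jiffies and rcu_fwd_startat. The function names are made up for illustration and are not kernel symbols.

```c
/* Old form: only the start time is halved, so the result tracks "now". */
static unsigned long sketch_buggy_duration(unsigned long now,
					   unsigned long startat)
{
	return 1 + (now - startat / 2);
}

/* Fixed form: half of the elapsed time since the start, plus one. */
static unsigned long sketch_fixed_duration(unsigned long now,
					   unsigned long startat)
{
	return 1 + (now - startat) / 2;
}
```

With now = 1100 and startat = 1000, the fixed form gives 51 while the old form gives 601, and the gap keeps growing as the jiffies counter advances, which matches the "increasingly long delays" in the commit log.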



[PATCH tip/core/rcu 0/9] Torture-test updates for v5.2

2019-03-26 Thread Paul E. McKenney
Hello!

This series contains torture-test updates:

1.  Don't try to offline the last CPU.

2.  Convert rcutorture scripting to SPDX license identifier.

3.  Make rcutorture_extend_mask() comment match the code.

4.  Remove ->ext_irq_conflict field.

5.  Fix expected forward progress duration in OOM notifier, courtesy
of Neeraj Upadhyay.

6.  Fix cleanup path for invalid torture_type strings.

7.  Fix cleanup path for invalid perf_type strings.

8.  NULL cxt.lwsa and cxt.lrsa to allow locktorture bad-arg detection.

9.  Suppress false-positive CONFIG_INITRAMFS_SOURCE complaint.

Thanx, Paul



 kernel/locking/locktorture.c                                         |    2 +
 kernel/rcu/rcuperf.c                                                 |    5 ++
 kernel/rcu/rcutorture.c                                              |   14 +++
 kernel/torture.c                                                     |    2 +
 tools/testing/selftests/rcutorture/bin/configNR_CPUS.sh              |   17 +---
 tools/testing/selftests/rcutorture/bin/config_override.sh            |   17 +---
 tools/testing/selftests/rcutorture/bin/configcheck.sh                |   19 ++
 tools/testing/selftests/rcutorture/bin/configinit.sh                 |   17 +---
 tools/testing/selftests/rcutorture/bin/cpus2use.sh                   |   17 +---
 tools/testing/selftests/rcutorture/bin/functions.sh                  |   17 +---
 tools/testing/selftests/rcutorture/bin/jitter.sh                     |   17 +---
 tools/testing/selftests/rcutorture/bin/kvm-build.sh                  |   17 +---
 tools/testing/selftests/rcutorture/bin/kvm-find-errors.sh            |    5 ++
 tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh           |   17 +---
 tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh            |   17 +---
 tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh |   17 +---
 tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh        |   17 +---
 tools/testing/selftests/rcutorture/bin/kvm-recheck.sh                |   17 +---
 tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh             |   17 +---
 tools/testing/selftests/rcutorture/bin/kvm.sh                        |   17 +---
 tools/testing/selftests/rcutorture/bin/mkinitrd.sh                   |   15 ---
 tools/testing/selftests/rcutorture/bin/parse-build.sh                |   17 +---
 tools/testing/selftests/rcutorture/bin/parse-console.sh              |   17 +---
 tools/testing/selftests/rcutorture/configs/lock/ver_functions.sh     |   17 +---
 tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh      |   17 +---
 tools/testing/selftests/rcutorture/configs/rcuperf/ver_functions.sh  |   17 +---
 26 files changed, 64 insertions(+), 321 deletions(-)



[PATCH tip/core/rcu 8/9] locktorture: NULL cxt.lwsa and cxt.lrsa to allow bad-arg detection

2019-03-26 Thread Paul E. McKenney
Currently, lock_torture_cleanup() uses the values of cxt.lwsa and cxt.lrsa
to detect bad parameters that prevented locktorture from initializing,
let alone running.  In this case, lock_torture_cleanup() does no cleanup
aside from invoking torture_cleanup_begin() and torture_cleanup_end(),
as required to permit future torture tests to run.  However, this
heuristic fails if the run with bad parameters was preceded by a previous
run that actually ran:  In this case, both cxt.lwsa and cxt.lrsa will
remain non-zero, which means that the current lock_torture_cleanup()
invocation will be unable to detect the fact that it should skip cleanup,
which can result in charming outcomes such as double frees.

This commit therefore NULLs out both cxt.lwsa and cxt.lrsa at the end
of any run that actually ran.

Signed-off-by: Paul E. McKenney 
Cc: Davidlohr Bueso 
Cc: Josh Triplett 
---
 kernel/locking/locktorture.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index ad40a2617063..80a463d31a8d 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -829,7 +829,9 @@ static void lock_torture_cleanup(void)
"End of test: SUCCESS");
 
 	kfree(cxt.lwsa);
+	cxt.lwsa = NULL;
 	kfree(cxt.lrsa);
+	cxt.lrsa = NULL;
 
 end:
 	torture_cleanup_end();
-- 
2.17.1
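
A rough userspace sketch of the failure mode and the fix, under the assumption that cleanup decides whether a run happened by testing the two pointers. The struct and function names are hypothetical, and free() stands in for kfree().

```c
#include <stdlib.h>

/* Hypothetical context; only the two pointers matter for this sketch. */
struct sketch_ctx {
	long *lwsa;
	long *lrsa;
};

static struct sketch_ctx cxt;
static int frees_done;	/* number of cleanup passes that freed memory */

/* A run with good arguments allocates; a bad-argument run allocates nothing. */
static int sketch_run(int good_args)
{
	if (!good_args)
		return -1;
	cxt.lwsa = calloc(4, sizeof(*cxt.lwsa));
	cxt.lrsa = calloc(4, sizeof(*cxt.lrsa));
	return (cxt.lwsa && cxt.lrsa) ? 0 : -1;
}

static void sketch_cleanup(void)
{
	/* Heuristic: both pointers NULL means the run never initialized. */
	if (!cxt.lwsa && !cxt.lrsa)
		return;
	free(cxt.lwsa);
	cxt.lwsa = NULL;	/* the fix: do not leave stale pointers behind */
	free(cxt.lrsa);
	cxt.lrsa = NULL;
	frees_done++;
}
```

Without the two NULL assignments, a good run followed by a bad-argument run would pass the heuristic with stale pointers and free them a second time.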



[PATCH tip/core/rcu 9/9] torture: Suppress false-positive CONFIG_INITRAMFS_SOURCE complaint

2019-03-26 Thread Paul E. McKenney
The scripting must supply the CONFIG_INITRAMFS_SOURCE Kconfig option
so that kbuild can find the desired initrd, but the configcheck.sh
script gets confused by this option because it takes a string instead
of the expected y/n/m.  This causes configcheck.sh to complain about
CONFIG_INITRAMFS_SOURCE in the torture-test output (though not in the
summary).  As more people use rcutorture, the resulting confusion is
an increasing concern.

This commit therefore suppresses this false-positive warning by filtering
CONFIG_INITRAMFS_SOURCE from within the configcheck.sh script.

Reported-by: Joel Fernandes 
Signed-off-by: Paul E. McKenney 
---
 tools/testing/selftests/rcutorture/bin/configcheck.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/rcutorture/bin/configcheck.sh b/tools/testing/selftests/rcutorture/bin/configcheck.sh
index 5b25524d0366..31584cee84d7 100755
--- a/tools/testing/selftests/rcutorture/bin/configcheck.sh
+++ b/tools/testing/selftests/rcutorture/bin/configcheck.sh
@@ -14,6 +14,7 @@ mkdir $T
 cat $1 > $T/.config
 
 cat $2 | sed -e 's/\(.*\)=n/# \1 is not set/' -e 's/^#CHECK#//' |
+grep -v '^CONFIG_INITRAMFS_SOURCE' |
 awk	'
 {
 	print "if grep -q \"" $0 "\" < '"$T/.config"'";
-- 
2.17.1



[PATCH tip/core/rcu 6/9] rcutorture: Fix cleanup path for invalid torture_type strings

2019-03-26 Thread Paul E. McKenney
If the specified rcutorture.torture_type is not in the rcu_torture_init()
function's torture_ops[] array, rcutorture prints some console messages
and then invokes rcu_torture_cleanup() to set state so that a future
torture test can run.  However, rcu_torture_cleanup() also attempts to
end the test that didn't actually start, and in doing so relies on the
value of cur_ops, a value that is not particularly relevant in this case.
This can result in confusing output or even follow-on failures due to
attempts to use facilities that have not been properly initialized.

This commit therefore sets the value of cur_ops to NULL in this case
and inserts a check near the beginning of rcu_torture_cleanup(),
thus avoiding relying on an irrelevant cur_ops value.

Reported-by: kernel test robot 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/rcutorture.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index b42682b94cb7..e3c0f57ab0aa 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -2089,6 +2089,10 @@ rcu_torture_cleanup(void)
 			cur_ops->cb_barrier();
 		return;
 	}
+	if (!cur_ops) {
+		torture_cleanup_end();
+		return;
+	}
 
 	rcu_torture_barrier_cleanup();
 	torture_stop_kthread(rcu_torture_fwd_prog, fwd_prog_task);
@@ -2262,6 +2266,7 @@ rcu_torture_init(void)
 		pr_cont("\n");
 		WARN_ON(!IS_MODULE(CONFIG_RCU_TORTURE_TEST));
 		firsterr = -EINVAL;
+		cur_ops = NULL;
 		goto unwind;
 	}
 	if (cur_ops->fqs == NULL && fqs_duration != 0) {
-- 
2.17.1



[PATCH tip/core/rcu 1/9] torture: Don't try to offline the last CPU

2019-03-26 Thread Paul E. McKenney
If there is only one online CPU, it doesn't make sense to try to offline
it, as any such attempt is guaranteed to fail.  This commit therefore
checks for this condition and refuses to attempt the nonsensical.

Reported-by: Su Yue 
Signed-off-by: Paul E. McKenney 
Tested-By: Su Yue 
---
 kernel/torture.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/torture.c b/kernel/torture.c
index 8faa1a9aaeb9..17b2be9bde12 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -88,6 +88,8 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
 
 	if (!cpu_online(cpu) || !cpu_is_hotpluggable(cpu))
 		return false;
+	if (num_online_cpus() <= 1)
+		return false;  /* Can't offline the last CPU. */
 
 	if (verbose > 1)
 		pr_alert("%s" TORTURE_FLAG
-- 
2.17.1
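
The early-exit logic can be sketched as a pure predicate. This is only an illustration: sketch_may_offline() and its parameters are hypothetical and replace the kernel's cpu_online(), cpu_is_hotpluggable() and num_online_cpus() queries with plain arguments.

```c
#include <stdbool.h>

/* Returns true only when an offline attempt has a chance of succeeding. */
static bool sketch_may_offline(bool cpu_is_online, bool hotpluggable,
			       unsigned int online_cpus)
{
	if (!cpu_is_online || !hotpluggable)
		return false;
	if (online_cpus <= 1)
		return false;	/* Can't offline the last CPU. */
	return true;
}
```

The second check is the one this patch adds; with a single online CPU the function declines before any attempt is counted.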



[PATCH tip/core/rcu 3/9] rcutorture: Make rcutorture_extend_mask() comment match the code

2019-03-26 Thread Paul E. McKenney
The code actually rarely uses more than one type of RCU read-side
protection, as is actually desired given that we need some reasonable
probability of preempting RCU read-side critical sections, which cannot
happen with multiple types of protection.  This commit therefore adjusts
the comment to match the code.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/rcutorture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index f14d1b18a74f..2453229ba15a 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -1160,7 +1160,7 @@ rcutorture_extend_mask(int oldmask, struct torture_random_state *trsp)
 	unsigned long randmask2 = randmask1 >> 3;
 
 	WARN_ON_ONCE(mask >> RCUTORTURE_RDR_SHIFT);
-	/* Most of the time lots of bits, half the time only one bit. */
+	/* Mostly only one bit (need preemption!), sometimes lots of bits. */
 	if (!(randmask1 & 0x7))
 		mask = mask & randmask2;
 	else
-- 
2.17.1
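
As a quick sanity check of the new comment, the sketch below counts how many of the eight possible values of the low three bits satisfy the !(randmask1 & 0x7) test from the hunk. The helper name is made up, and the conclusion assumes those three bits are uniformly random.

```c
/*
 * Hypothetical helper, not kernel code: count the low-three-bit patterns
 * that take the "lots of bits" branch guarded by !(randmask1 & 0x7).
 */
static int sketch_multi_bit_cases(void)
{
	int hits = 0;
	unsigned int low3;

	for (low3 = 0; low3 < 8; low3++)
		if (!(low3 & 0x7))
			hits++;
	return hits;
}
```

Only the all-zeros pattern qualifies, so under that assumption the multi-bit branch is taken about one time in eight and the other branch the remaining seven, consistent with "mostly only one bit".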



[PATCH 3/3] fuse: Add FOPEN_STREAM and use stream_open() if filesystem returned that from open handler

2019-03-26 Thread Kirill Smelkov
Starting from 9c225f2655 (vfs: atomic f_pos accesses as per POSIX), files
opened even via nonseekable_open gate read and write via a lock and do
not allow them to run simultaneously. This can create a read vs write
deadlock if a filesystem is trying to implement a socket-like file that
is intended to be used simultaneously for both read and write by a
filesystem client. See the previous patch "fs: stream_open - opener for
stream-like files so that read and write can run simultaneously without
deadlock" for details, and e.g. 581d21a2d0 (xenbus: fix deadlock on
writes to /proc/xen/xenbus) for a similar deadlock example on
/proc/xen/xenbus.

To avoid such a deadlock it was tempting to make fuse_finish_open use
stream_open instead of nonseekable_open whenever just FOPEN_NONSEEKABLE
is set, but grepping through Debian codesearch shows users of
FOPEN_NONSEEKABLE, and in particular GVFS, which actually uses the offset
in its read and write handlers:

https://codesearch.debian.net/search?q=-%3Enonseekable+%3D

https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080

https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346

https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481

so making such a change would break a real user.

-> Add another flag (FOPEN_STREAM) for filesystem servers to indicate
that the opened handle has stream-like semantics: it does not use the
file position, and thus the kernel is free to issue simultaneous read
and write requests on the opened file handle.

This patch together with stream_open should be added to stable kernels
starting from v3.14 (the kernel where 9c225f2655 first appeared). This
will make it possible to patch OSSPD and other FUSE filesystems that
provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in
their open handlers and this way avoid the deadlock on all kernel
versions. This should work because fuse_finish_open ignores unknown open
flags returned from a filesystem, so passing FOPEN_STREAM to a kernel
that is not aware of this flag cannot hurt. In turn, a kernel that is
not aware of FOPEN_STREAM will be older than v3.14, where
FOPEN_NONSEEKABLE alone is sufficient to implement streams without the
read vs write deadlock.

Cc: Al Viro 
Cc: Linus Torvalds 
Cc: Michael Kerrisk 
Cc: Yongzhi Pan 
Cc: Jonathan Corbet 
Cc: David Vrabel 
Cc: Juergen Gross 
Cc: Tejun Heo 
Cc: Kirill Tkhai 
Cc: Arnd Bergmann 
Cc: Christoph Hellwig 
Cc: Greg Kroah-Hartman 
Cc: Julia Lawall 
Cc: Nikolaus Rath 
Cc: Han-Wen Nienhuys 
Signed-off-by: Kirill Smelkov 
---
 fs/fuse/file.c| 4 +++-
 include/uapi/linux/fuse.h | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index ffaffe18352a..7ea4099cde16 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -181,7 +181,9 @@ void fuse_finish_open(struct inode *inode, struct file *file)
 		file->f_op = &fuse_direct_io_file_operations;
 	if (!(ff->open_flags & FOPEN_KEEP_CACHE))
 		invalidate_inode_pages2(inode->i_mapping);
-	if (ff->open_flags & FOPEN_NONSEEKABLE)
+	if (ff->open_flags & FOPEN_STREAM)
+		stream_open(inode, file);
+	else if (ff->open_flags & FOPEN_NONSEEKABLE)
 		nonseekable_open(inode, file);
 	if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC)) {
 		struct fuse_inode *fi = get_fuse_inode(inode);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index b4967d48bfda..93ac72a1e4ff 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -226,11 +226,13 @@ struct fuse_file_lock {
  * FOPEN_KEEP_CACHE: don't invalidate the data cache on open
  * FOPEN_NONSEEKABLE: the file is not seekable
  * FOPEN_CACHE_DIR: allow caching this directory
+ * FOPEN_STREAM: the file is stream-like
  */
 #define FOPEN_DIRECT_IO		(1 << 0)
 #define FOPEN_KEEP_CACHE	(1 << 1)
 #define FOPEN_NONSEEKABLE	(1 << 2)
 #define FOPEN_CACHE_DIR		(1 << 3)
+#define FOPEN_STREAM		(1 << 4)
 
 /**
  * INIT request/reply flags
-- 
2.21.0.392.gf8f6787159
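
The compatibility argument in the commit log can be sketched as two tiny dispatch functions, one for a kernel with this patch and one for a kernel that does not know the new bit. The flag values are taken from the hunk above; the enum and function names are hypothetical and only model which opener would be chosen.

```c
#define FOPEN_NONSEEKABLE	(1 << 2)
#define FOPEN_STREAM		(1 << 4)

/* Which opener a kernel would pick; illustrative only. */
enum sketch_open_mode {
	SKETCH_OPEN_DEFAULT,
	SKETCH_OPEN_NONSEEKABLE,
	SKETCH_OPEN_STREAM,
};

/* Kernel with this patch: FOPEN_STREAM takes priority. */
static enum sketch_open_mode sketch_patched_kernel(unsigned int open_flags)
{
	if (open_flags & FOPEN_STREAM)
		return SKETCH_OPEN_STREAM;
	if (open_flags & FOPEN_NONSEEKABLE)
		return SKETCH_OPEN_NONSEEKABLE;
	return SKETCH_OPEN_DEFAULT;
}

/* Kernel unaware of FOPEN_STREAM: the unknown bit is simply ignored. */
static enum sketch_open_mode sketch_unaware_kernel(unsigned int open_flags)
{
	if (open_flags & FOPEN_NONSEEKABLE)
		return SKETCH_OPEN_NONSEEKABLE;
	return SKETCH_OPEN_DEFAULT;
}
```

A server returning FOPEN_STREAM | FOPEN_NONSEEKABLE therefore gets the stream opener on a patched kernel and falls back to the nonseekable opener elsewhere, while a server that sets only FOPEN_NONSEEKABLE (the GVFS case) sees no change in behavior.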


Re: [PATCH 09/14] bus: ti-sysc: Move rstctrl reset to happen later

2019-03-26 Thread Tony Lindgren
Hi,

* Suman Anna  [190326 23:22]:
> On 3/26/19 6:13 PM, Tony Lindgren wrote:
> Hmm, are you envisioning the SYSC reset (OCP SoftReset) here or the PRCM
> RSTCTRL hardresets here? The latter in general requires the clocks to be
> running first (module won't be in ready status until you deassert the
> hardresets with clocks running). You can look up the Warm-reset or
> Cold-reset sequences in the TRMs for any of the processors.

That's for rstctrl. I just did a quick test with my earlier
reset-simple patch and I noticed sgx on am33xx produces a
clock error unless we deassert its rstctrl before enabling
clocks:

gfx-l3-clkctrl:0004:0: failed to enable

> I am working on preparing the next version of PRUSS patches with ti-sysc
> on AM33xx/AM437x/AM57xx platforms, so will pick up these patches for my
> testing.

OK great, yes please check and test with your rstctrl use case.
I guess you still need to use the reset-simple patch for now
until we have a proper prm rstctrl driver.

Note that you probably also want to leave out the struct
omap_hwmod data from omap_hwmod_*_data.c files with rstctrl
entries.

Regards,

Tony


[PATCH RFC memory-model 0/21] LKMM updates for review

2019-03-26 Thread Paul E. McKenney
Hello!

This series contains LKMM updates:

1.  Make scripts be executable.

2.  Fix comment in MP+poonceonces.litmus, courtesy of Andrea Parri.

3.  Do not use "herd" to refer to "herd7", courtesy of Andrea Parri.

4.  Rewrite "KERNEL I/O BARRIER EFFECTS" section of memory-barriers.txt,
courtesy of Will Deacon.

5-6.  Make LKMM scripts note timeouts instead of just saying that
the validation was bad.

7.  Make LKMM scripts identify litmus-test typos and use of
unsupported primitives instead of just saying that the validation
was bad.

8.  Add support for synchronize_srcu_expedited().

9.  Make LKMM scripts detect unconditional deadlocks.

10-21.  Leverage Boqun Feng's C-to-assembly litmus-test-translation
capability to allow verifying LKMM against hardware models
for checkalllitmus.sh.  This is a work in progress.

Thanx, Paul



 Documentation/memory-barriers.txt |  115 +--
 tools/memory-model/linux-kernel.def   |1 
 tools/memory-model/litmus-tests/.gitignore|4 
 tools/memory-model/litmus-tests/MP+poonceonces.litmus |2 
 tools/memory-model/litmus-tests/README|2 
 tools/memory-model/lock.cat   |2 
 tools/memory-model/scripts/README |   12 -
 tools/memory-model/scripts/checkalllitmus.sh  |   29 +--
 tools/memory-model/scripts/checklitmus.sh |  101 +
 tools/memory-model/scripts/cmplitmushist.sh   |   53 ++
 tools/memory-model/scripts/judgelitmus.sh |  114 +++---
 tools/memory-model/scripts/parseargs.sh   |   11 +
 tools/memory-model/scripts/runlitmus.sh   |  137 ++
 tools/memory-model/scripts/runlitmushist.sh   |3 
 tools/memory-model/scripts/simpletest.sh  |   35 
 15 files changed, 425 insertions(+), 196 deletions(-)



[PATCH tip/core/rcu 01/21] tools/memory-model: Make scripts be executable

2019-03-26 Thread Paul E. McKenney
This commit simplifies life a bit by making all of the scripts in
tools/memory-model/scripts be executable.

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/checkghlitmus.sh   | 0
 tools/memory-model/scripts/checklitmushist.sh | 0
 tools/memory-model/scripts/cmplitmushist.sh   | 0
 tools/memory-model/scripts/initlitmushist.sh  | 0
 tools/memory-model/scripts/judgelitmus.sh | 0
 tools/memory-model/scripts/newlitmushist.sh   | 0
 tools/memory-model/scripts/parseargs.sh   | 0
 tools/memory-model/scripts/runlitmushist.sh   | 0
 8 files changed, 0 insertions(+), 0 deletions(-)
 mode change 100644 => 100755 tools/memory-model/scripts/checkghlitmus.sh
 mode change 100644 => 100755 tools/memory-model/scripts/checklitmushist.sh
 mode change 100644 => 100755 tools/memory-model/scripts/cmplitmushist.sh
 mode change 100644 => 100755 tools/memory-model/scripts/initlitmushist.sh
 mode change 100644 => 100755 tools/memory-model/scripts/judgelitmus.sh
 mode change 100644 => 100755 tools/memory-model/scripts/newlitmushist.sh
 mode change 100644 => 100755 tools/memory-model/scripts/parseargs.sh
 mode change 100644 => 100755 tools/memory-model/scripts/runlitmushist.sh

diff --git a/tools/memory-model/scripts/checkghlitmus.sh b/tools/memory-model/scripts/checkghlitmus.sh
old mode 100644
new mode 100755
diff --git a/tools/memory-model/scripts/checklitmushist.sh b/tools/memory-model/scripts/checklitmushist.sh
old mode 100644
new mode 100755
diff --git a/tools/memory-model/scripts/cmplitmushist.sh b/tools/memory-model/scripts/cmplitmushist.sh
old mode 100644
new mode 100755
diff --git a/tools/memory-model/scripts/initlitmushist.sh b/tools/memory-model/scripts/initlitmushist.sh
old mode 100644
new mode 100755
diff --git a/tools/memory-model/scripts/judgelitmus.sh b/tools/memory-model/scripts/judgelitmus.sh
old mode 100644
new mode 100755
diff --git a/tools/memory-model/scripts/newlitmushist.sh b/tools/memory-model/scripts/newlitmushist.sh
old mode 100644
new mode 100755
diff --git a/tools/memory-model/scripts/parseargs.sh b/tools/memory-model/scripts/parseargs.sh
old mode 100644
new mode 100755
diff --git a/tools/memory-model/scripts/runlitmushist.sh b/tools/memory-model/scripts/runlitmushist.sh
old mode 100644
new mode 100755
-- 
2.17.1



[PATCH tip/core/rcu 17/21] tools/memory-model: Make runlitmus.sh generate .litmus.out for --hw

2019-03-26 Thread Paul E. McKenney
In the absence of "Result:" comments, the runlitmus.sh script relies on
.litmus.out files from prior LKMM runs.  This can be a bit user-hostile,
so this commit makes runlitmus.sh generate any needed .litmus.out files
that don't already exist.

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/runlitmus.sh | 54 ++---
 1 file changed, 30 insertions(+), 24 deletions(-)

diff --git a/tools/memory-model/scripts/runlitmus.sh b/tools/memory-model/scripts/runlitmus.sh
index 0ae46ab08cbd..186944a7a528 100755
--- a/tools/memory-model/scripts/runlitmus.sh
+++ b/tools/memory-model/scripts/runlitmus.sh
@@ -28,42 +28,48 @@ if test -f "$litmus" -a -r "$litmus"
 then
:
 else
-   echo ' --- ' error: \"$litmus\" is not a readable file
+   echo ' !!! ' error: \"$litmus\" is not a readable file
exit 255
 fi
 
-if test -z "$LKMM_HW_MAP_FILE"
+if test -z "$LKMM_HW_MAP_FILE" -o ! -e $LKMM_DESTDIR/$litmus.out
 then
# LKMM run
herdoptions=${LKMM_HERD_OPTIONS--conf linux-kernel.cfg}
echo Herd options: $herdoptions > $LKMM_DESTDIR/$litmus.out
/usr/bin/time $LKMM_TIMEOUT_CMD herd7 $herdoptions $litmus >> $LKMM_DESTDIR/$litmus.out 2>&1
-else
-   # Hardware run
+   ret=$?
+   if test -z "$LKMM_HW_MAP_FILE"
+   then
+   exit $ret
+   fi
+   echo " --- " Automatically generated LKMM output for '"'--hw $LKMM_HW_MAP_FILE'"' run
+fi
 
-   T=/tmp/checklitmushw.sh.$$
-   trap 'rm -rf $T' 0 2
-   mkdir $T
+# Hardware run
 
-   # Generate filenames
-   catfile="`echo $LKMM_HW_MAP_FILE | tr '[A-Z]' '[a-z]'`.cat"
-   mapfile="Linux2${LKMM_HW_MAP_FILE}.map"
-   themefile="$T/${LKMM_HW_MAP_FILE}.theme"
-   herdoptions="-model $LKMM_HW_CAT_FILE"
-   hwlitmus=`echo $litmus | sed -e 's/\.litmus$/.'${LKMM_HW_MAP_FILE}'.litmus/'`
-   hwlitmusfile=`echo $hwlitmus | sed -e 's,^.*/,,'`
+T=/tmp/checklitmushw.sh.$$
+trap 'rm -rf $T' 0 2
+mkdir $T
 
-   # Don't run on litmus tests with complex synchronization
-   if ! scripts/simpletest.sh $litmus
-   then
-   echo ' --- ' error: \"$litmus\" contains locking, RCU, or SRCU
-   exit 254
-   fi
+# Generate filenames
+catfile="`echo $LKMM_HW_MAP_FILE | tr '[A-Z]' '[a-z]'`.cat"
+mapfile="Linux2${LKMM_HW_MAP_FILE}.map"
+themefile="$T/${LKMM_HW_MAP_FILE}.theme"
+herdoptions="-model $LKMM_HW_CAT_FILE"
+hwlitmus=`echo $litmus | sed -e 's/\.litmus$/.'${LKMM_HW_MAP_FILE}'.litmus/'`
+hwlitmusfile=`echo $hwlitmus | sed -e 's,^.*/,,'`
 
-   # Generate the assembly code and run herd on it.
-   gen_theme7 -n 10 -map $mapfile -call Linux.call > $themefile
-   jingle7 -theme $themefile $litmus > $T/$hwlitmusfile 2> $T/$hwlitmusfile.jingle7.out
-   /usr/bin/time $LKMM_TIMEOUT_CMD herd7 -model $catfile $T/$hwlitmusfile > $LKMM_DESTDIR/$hwlitmus.out 2>&1
+# Don't run on litmus tests with complex synchronization
+if ! scripts/simpletest.sh $litmus
+then
+   echo ' --- ' error: \"$litmus\" contains locking, RCU, or SRCU
+   exit 254
 fi
 
+# Generate the assembly code and run herd on it.
+gen_theme7 -n 10 -map $mapfile -call Linux.call > $themefile
+jingle7 -theme $themefile $litmus > $T/$hwlitmusfile 2> $T/$hwlitmusfile.jingle7.out
+/usr/bin/time $LKMM_TIMEOUT_CMD herd7 -model $catfile $T/$hwlitmusfile > $LKMM_DESTDIR/$hwlitmus.out 2>&1
+
 exit $?
-- 
2.17.1
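
For readers following along, the new condition at the top of runlitmus.sh can be looked at in isolation.  What follows is only an illustrative sketch, not code from the patch: need_lkmm_run is a made-up helper name, and the sketch assumes that LKMM_DESTDIR falls back to "." as it does in parseargs.sh.

```shell
#!/bin/sh
# Illustration only: need_lkmm_run is a hypothetical helper, not part of
# runlitmus.sh.  It succeeds when an LKMM (non-hardware) herd7 run is needed
# for the litmus test whose pathname is passed as $1.
need_lkmm_run () {
	# No --hw argument was given: this is a plain LKMM run.
	test -z "${LKMM_HW_MAP_FILE-}" && return 0
	# --hw was given: an LKMM run is needed only if its output is missing.
	test ! -e "${LKMM_DESTDIR-.}/$1.out"
}
```

With this shape, a --hw run on a test that already has a .litmus.out file skips straight to the hardware run, which appears to be the behavior the commit log describes.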



[PATCH tip/core/rcu 03/21] tools/memory-model: Do not use "herd" to refer to "herd7"

2019-03-26 Thread Paul E. McKenney
From: Andrea Parri 

Use "herd7" in each such reference.

Signed-off-by: Andrea Parri 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
Cc: Daniel Lustig 
Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/litmus-tests/README   | 2 +-
 tools/memory-model/lock.cat  | 2 +-
 tools/memory-model/scripts/README| 4 ++--
 tools/memory-model/scripts/checkalllitmus.sh | 2 +-
 tools/memory-model/scripts/checklitmus.sh| 2 +-
 tools/memory-model/scripts/parseargs.sh  | 2 +-
 tools/memory-model/scripts/runlitmushist.sh  | 2 +-
 7 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/memory-model/litmus-tests/README b/tools/memory-model/litmus-tests/README
index 5ee08f129094..681f9067fa9e 100644
--- a/tools/memory-model/litmus-tests/README
+++ b/tools/memory-model/litmus-tests/README
@@ -244,7 +244,7 @@ produce the name:
 Adding the ".litmus" suffix: SB+rfionceonce-poonceonces.litmus
 
 The descriptors that describe connections between consecutive accesses
-within the cycle through a given litmus test can be provided by the herd
+within the cycle through a given litmus test can be provided by the herd7
 tool (Rfi, Po, Fre, and so on) or by the linux-kernel.bell file (Once,
 Release, Acquire, and so on).
 
diff --git a/tools/memory-model/lock.cat b/tools/memory-model/lock.cat
index a059d1a6d8a2..6b52f365d73a 100644
--- a/tools/memory-model/lock.cat
+++ b/tools/memory-model/lock.cat
@@ -11,7 +11,7 @@
 include "cross.cat"
 
 (*
- * The lock-related events generated by herd are as follows:
+ * The lock-related events generated by herd7 are as follows:
  *
  * LKR Lock-Read: the read part of a spin_lock() or successful
  * spin_trylock() read-modify-write event pair
diff --git a/tools/memory-model/scripts/README b/tools/memory-model/scripts/README
index 29375a1fbbfa..095c7eb36f9f 100644
--- a/tools/memory-model/scripts/README
+++ b/tools/memory-model/scripts/README
@@ -22,7 +22,7 @@ checklitmushist.sh
 
Run all litmus tests having .litmus.out files from previous
initlitmushist.sh or newlitmushist.sh runs, comparing the
-   herd output to that of the original runs.
+   herd7 output to that of the original runs.
 
 checklitmus.sh
 
@@ -43,7 +43,7 @@ initlitmushist.sh
 
 judgelitmus.sh
 
-   Given a .litmus file and its .litmus.out herd output, check the
+   Given a .litmus file and its .litmus.out herd7 output, check the
.litmus.out file against the .litmus file's "Result:" comment to
judge whether the test ran correctly.  Not normally run manually,
provided instead for use by other scripts.
diff --git a/tools/memory-model/scripts/checkalllitmus.sh b/tools/memory-model/scripts/checkalllitmus.sh
index b35fcd61ecf6..3c0c7fbbd223 100755
--- a/tools/memory-model/scripts/checkalllitmus.sh
+++ b/tools/memory-model/scripts/checkalllitmus.sh
@@ -1,7 +1,7 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0+
 #
-# Run herd tests on all .litmus files in the litmus-tests directory
+# Run herd7 tests on all .litmus files in the litmus-tests directory
 # and check each file's result against a "Result:" comment within that
 # litmus test.  If the verification result does not match that specified
 # in the litmus test, this script prints an error message prefixed with
diff --git a/tools/memory-model/scripts/checklitmus.sh b/tools/memory-model/scripts/checklitmus.sh
index dd08801a30b0..11461ed40b5e 100755
--- a/tools/memory-model/scripts/checklitmus.sh
+++ b/tools/memory-model/scripts/checklitmus.sh
@@ -1,7 +1,7 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0+
 #
-# Run a herd test and invokes judgelitmus.sh to check the result against
+# Run a herd7 test and invokes judgelitmus.sh to check the result against
 # a "Result:" comment within the litmus test.  It also outputs verification
 # results to a file whose name is that of the specified litmus test, but
 # with ".out" appended.
diff --git a/tools/memory-model/scripts/parseargs.sh b/tools/memory-model/scripts/parseargs.sh
index 859e1d581e05..40f52080fdbd 100755
--- a/tools/memory-model/scripts/parseargs.sh
+++ b/tools/memory-model/scripts/parseargs.sh
@@ -91,7 +91,7 @@ do
shift
;;
--herdopts|--herdopt)
-   checkarg --destdir "(herd options)" "$#" "$2" '.*' '^--'
+   checkarg --destdir "(herd7 options)" "$#" "$2" '.*' '^--'
LKMM_HERD_OPTIONS="$2"
shift
;;
diff --git a/tools/memory-model/scripts/runlitmushist.sh b/tools/memory-model/scripts/runlitmushist.sh
index e507f5f933d5..6ed376f495bb 100755
--- a/tools/memory-model/scripts/runlitmushist.sh
+++ b/tools/memory-model/scripts/runlitmushist.sh
@@ -79,7 +79,7 @@ then
echo ' ---' Summary: 1>&2
grep '!!!' $T/*.sh.out 1>&2
nfail=

[PATCH tip/core/rcu 10/21] tools/memory-model: Update parseargs.sh for hardware verification

2019-03-26 Thread Paul E. McKenney
This commit adds a --hw argument to parseargs.sh to specify the CPU
family for a hardware verification.  For example, "--hw AArch64" will
specify that a C-language litmus test is to be translated to ARMv8 and
the result verified.  This will set the LKMM_HW_MAP_FILE environment
variable accordingly.  If there is no --hw argument, this environment
variable will be set to the empty string.

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/parseargs.sh | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/memory-model/scripts/parseargs.sh b/tools/memory-model/scripts/parseargs.sh
index 40f52080fdbd..0dd087a8410e 100755
--- a/tools/memory-model/scripts/parseargs.sh
+++ b/tools/memory-model/scripts/parseargs.sh
@@ -27,6 +27,7 @@ initparam () {
 
 initparam LKMM_DESTDIR "."
 initparam LKMM_HERD_OPTIONS "-conf linux-kernel.cfg"
+initparam LKMM_HW_MAP_FILE ""
 initparam LKMM_JOBS `getconf _NPROCESSORS_ONLN`
 initparam LKMM_PROCS "3"
 initparam LKMM_TIMEOUT "1m"
@@ -37,10 +38,11 @@ usagehelp () {
echo "Usage $scriptname [ arguments ]"
echo "  --destdir path (place for .litmus.out, default by .litmus)"
echo "  --herdopts -conf linux-kernel.cfg ..."
+   echo "  --hw AArch64"
echo "  --jobs N (number of jobs, default one per CPU)"
echo "  --procs N (litmus tests with at most this many processes)"
echo "  --timeout N (herd7 timeout (e.g., 10s, 1m, 2hr, 1d, '')"
-   echo "Defaults: --destdir '$LKMM_DESTDIR_DEF' --herdopts '$LKMM_HERD_OPTIONS_DEF' --jobs '$LKMM_JOBS_DEF' --procs '$LKMM_PROCS_DEF' --timeout '$LKMM_TIMEOUT_DEF'"
+   echo "Defaults: --destdir '$LKMM_DESTDIR_DEF' --herdopts '$LKMM_HERD_OPTIONS_DEF' --hw '$LKMM_HW_MAP_FILE' --jobs '$LKMM_JOBS_DEF' --procs '$LKMM_PROCS_DEF' --timeout '$LKMM_TIMEOUT_DEF'"
exit 1
 }
 
@@ -95,6 +97,11 @@ do
LKMM_HERD_OPTIONS="$2"
shift
;;
+   --hw)
+   checkarg --hw "(.map file architecture name)" "$#" "$2" '^[A-Za-z0-9_-]\+' '^--'
+   LKMM_HW_MAP_FILE="$2"
+   shift
+   ;;
-j[1-9]*)
njobs="`echo $1 | sed -e 's/^-j//'`"
trailchars="`echo $njobs | sed -e 's/[0-9]\+\(.*\)$/\1/'`"
-- 
2.17.1
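
As a rough, self-contained illustration of the --hw handling described in the commit log (not the code from the patch), the option loop might be sketched as below.  The function name parse_hw and its error messages are invented for this example; the real script goes through its own checkarg helper, which is not reproduced here.

```shell
#!/bin/sh
# Hypothetical, simplified stand-in for the --hw handling in parseargs.sh.
# parse_hw is an illustrative name, not a function from the real script.
parse_hw () {
	# With no --hw argument the variable ends up as the empty string.
	LKMM_HW_MAP_FILE=""
	while test "$#" -gt 0
	do
		case "$1" in
		--hw)
			# Require a value that does not look like another option.
			if test "$#" -lt 2 || printf '%s\n' "$2" | grep -q '^--'
			then
				echo "--hw requires an architecture name" 1>&2
				return 1
			fi
			# Accept only letters, digits, "_" and "-".
			if ! printf '%s\n' "$2" | grep -q '^[A-Za-z0-9_-][A-Za-z0-9_-]*$'
			then
				echo "--hw: bad architecture name: $2" 1>&2
				return 1
			fi
			LKMM_HW_MAP_FILE="$2"
			shift
			;;
		esac
		shift
	done
	return 0
}
```

For example, "parse_hw --hw AArch64" leaves LKMM_HW_MAP_FILE set to AArch64, and any invocation without --hw leaves it empty, matching the default that the patch passes to initparam.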



[PATCH tip/core/rcu 14/21] tools/memory-model: Hardware checking for check{,all}litmus.sh

2019-03-26 Thread Paul E. McKenney
This commit makes checklitmus.sh and checkalllitmus.sh check to see
if a hardware verification was specified (via the --hw command-line
argument, which sets the LKMM_HW_MAP_FILE environment variable).
If so, the C-language litmus test is converted to the specified type
of assembly-language litmus test and herd is run on it.  Hardware is
permitted to be stronger than LKMM requires, so "Always" and "Never"
verifications of "Sometimes" C-language litmus tests are forgiven.

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/checkalllitmus.sh | 23 +--
 tools/memory-model/scripts/checklitmus.sh| 42 ++--
 2 files changed, 49 insertions(+), 16 deletions(-)

diff --git a/tools/memory-model/scripts/checkalllitmus.sh b/tools/memory-model/scripts/checkalllitmus.sh
index d43c6b91db30..1e7afd3a51e4 100755
--- a/tools/memory-model/scripts/checkalllitmus.sh
+++ b/tools/memory-model/scripts/checkalllitmus.sh
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/bin/bash
 # SPDX-License-Identifier: GPL-2.0+
 #
 # Run herd7 tests on all .litmus files in the litmus-tests directory
@@ -8,6 +8,11 @@
 # "^^^".  It also outputs verification results to a file whose name is
 # that of the specified litmus test, but with ".out" appended.
 #
+# If the --hw argument is specified, this script translates the .litmus
+# C-language file to the specified type of assembly and verifies that.
+# But in this case, litmus tests using complex synchronization (such as
+# locking, RCU, and SRCU) are cheerfully ignored.
+#
 # Usage:
 #  checkalllitmus.sh
 #
@@ -38,21 +43,15 @@ then
( cd "$LKMM_DESTDIR"; sed -e 's/^/mkdir -p /' | sh )
 fi
 
-# Find the checklitmus script.  If it is not where we expect it, then
-# assume that the caller has the PATH environment variable set
-# appropriately.
-if test -x scripts/checklitmus.sh
-then
-   clscript=scripts/checklitmus.sh
-else
-   clscript=checklitmus.sh
-fi
-
 # Run the script on all the litmus tests in the specified directory
 ret=0
 for i in $litmusdir/*.litmus
 do
-   if ! $clscript $i
+   if test -n "$LKMM_HW_MAP_FILE" && ! scripts/simpletest.sh $i
+   then
+   continue
+   fi
+   if ! scripts/checklitmus.sh $i
then
ret=1
fi
diff --git a/tools/memory-model/scripts/checklitmus.sh b/tools/memory-model/scripts/checklitmus.sh
index 11461ed40b5e..1922a5af2c17 100755
--- a/tools/memory-model/scripts/checklitmus.sh
+++ b/tools/memory-model/scripts/checklitmus.sh
@@ -6,6 +6,11 @@
 # results to a file whose name is that of the specified litmus test, but
 # with ".out" appended.
 #
+# If the --hw argument is specified, this script translates the .litmus
+# C-language file to the specified type of assembly and verifies that.
+# But in this case, litmus tests using complex synchronization (such as
+# locking, RCU, and SRCU) are cheerfully ignored.
+#
 # Usage:
 #  checklitmus.sh file.litmus
 #
@@ -18,8 +23,6 @@
 # Author: Paul E. McKenney 
 
 litmus=$1
-herdoptions=${LKMM_HERD_OPTIONS--conf linux-kernel.cfg}
-
 if test -f "$litmus" -a -r "$litmus"
 then
:
@@ -28,7 +31,38 @@ else
exit 255
 fi
 
-echo Herd options: $herdoptions > $LKMM_DESTDIR/$litmus.out
-/usr/bin/time $LKMM_TIMEOUT_CMD herd7 $herdoptions $litmus >> $LKMM_DESTDIR/$litmus.out 2>&1
+if test -z "$LKMM_HW_MAP_FILE"
+then
+   # LKMM run
+   herdoptions=${LKMM_HERD_OPTIONS--conf linux-kernel.cfg}
+   echo Herd options: $herdoptions > $LKMM_DESTDIR/$litmus.out
+   /usr/bin/time $LKMM_TIMEOUT_CMD herd7 $herdoptions $litmus >> $LKMM_DESTDIR/$litmus.out 2>&1
+else
+   # Hardware run
+
+   T=/tmp/checklitmushw.sh.$$
+   trap 'rm -rf $T' 0 2
+   mkdir $T
+
+   # Generate filenames
+   catfile="`echo $LKMM_HW_MAP_FILE | tr '[A-Z]' '[a-z]'`.cat"
+   mapfile="Linux2${LKMM_HW_MAP_FILE}.map"
+   themefile="$T/${LKMM_HW_MAP_FILE}.theme"
+   herdoptions="-model $LKMM_HW_CAT_FILE"
+   hwlitmus=`echo $litmus | sed -e 's/\.litmus$/.'${LKMM_HW_MAP_FILE}'.litmus/'`
+   hwlitmusfile=`echo $hwlitmus | sed -e 's,^.*/,,'`
+
+   # Don't run on litmus tests with complex synchronization
+   if ! scripts/simpletest.sh $litmus
+   then
+   echo ' --- ' error: \"$litmus\" contains locking, RCU, or SRCU
+   exit 254
+   fi
+
+   # Generate the assembly code and run herd on it.
+   gen_theme7 -n 10 -map $mapfile -call Linux.call > $themefile
+   jingle7 -theme $themefile $litmus > $T/$hwlitmusfile 2> $T/$hwlitmusfile.jingle7.out
+   /usr/bin/time $LKMM_TIMEOUT_CMD herd7 -model $catfile $T/$hwlitmusfile > $LKMM_DESTDIR/$hwlitmus.out 2>&1
+fi
 
 scripts/judgelitmus.sh $litmus
-- 
2.17.1
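
The filename handling in the hardware branch above is easy to misread in diff form, so here is a small standalone sketch of just that step.  The helper names hw_litmus_path and hw_litmus_file are invented for illustration; the sed expressions are the ones used in the patch.

```shell
#!/bin/sh
# Illustration only: these helper names do not exist in checklitmus.sh.

# hw_litmus_path LITMUS ARCH: insert ".ARCH" before the ".litmus" suffix,
# giving the pathname used for the assembly-language litmus test.
hw_litmus_path () {
	printf '%s\n' "$1" | sed -e 's/\.litmus$/.'"$2"'.litmus/'
}

# hw_litmus_file LITMUS ARCH: same as above, with the directory stripped.
hw_litmus_file () {
	hw_litmus_path "$1" "$2" | sed -e 's,^.*/,,'
}
```

So for "--hw AArch64" and litmus-tests/MP+poonceonces.litmus, the hardware output lands in litmus-tests/MP+poonceonces.AArch64.litmus.out under $LKMM_DESTDIR, while the temporary translated test is named MP+poonceonces.AArch64.litmus.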



[PATCH tip/core/rcu 06/21] tools/memory-model: Make cmplitmushist.sh note timeouts

2019-03-26 Thread Paul E. McKenney
Currently, cmplitmushist.sh treats timeouts (as in the "--timeout"
argument) as "Missing Observation line".  This can be misleading because
it is quite possible that running the test longer would have produced
a verification.  This commit therefore changes cmplitmushist.sh to check
for timeouts and to report them with "Timed out".

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/cmplitmushist.sh | 22 +
 1 file changed, 22 insertions(+)

diff --git a/tools/memory-model/scripts/cmplitmushist.sh b/tools/memory-model/scripts/cmplitmushist.sh
index 0f498aeeccf5..b9c174dd8004 100755
--- a/tools/memory-model/scripts/cmplitmushist.sh
+++ b/tools/memory-model/scripts/cmplitmushist.sh
@@ -12,12 +12,30 @@ trap 'rm -rf $T' 0
 mkdir $T
 
 # comparetest oldpath newpath
+timedout=0
 perfect=0
 obsline=0
 noobsline=0
 obsresult=0
 badcompare=0
 comparetest () {
+   if grep -q '^Command exited with non-zero status 124' $1 ||
+  grep -q '^Command exited with non-zero status 124' $2
+   then
+   if grep -q '^Command exited with non-zero status 124' $1 &&
+  grep -q '^Command exited with non-zero status 124' $2
+   then
+   echo Both runs timed out: $2
+   elif grep -q '^Command exited with non-zero status 124' $1
+   then
+   echo Old run timed out: $2
+   elif grep -q '^Command exited with non-zero status 124' $2
+   then
+   echo New run timed out: $2
+   fi
+   timedout=`expr "$timedout" + 1`
+   return 0
+   fi
grep -v 'maxresident)k\|minor)pagefaults\|^Time' $1 > $T/oldout
grep -v 'maxresident)k\|minor)pagefaults\|^Time' $2 > $T/newout
if cmp -s $T/oldout $T/newout && grep -q '^Observation' $1
@@ -78,6 +96,10 @@ if test "$obsresult" -ne 0
 then
echo Matching Observation Always/Sometimes/Never result: $obsresult 1>&2
 fi
+if test "$timedout" -ne 0
+then
+   echo "!!!" Timed out: $timedout 1>&2
+fi
 if test "$badcompare" -ne 0
 then
echo "!!!" Result changed: $badcompare 1>&2
-- 
2.17.1
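
The timeout check added to comparetest() above can be tried out on its own.  The sketch below is an illustration under stated assumptions, not code from the patch: timed_out and classify_timeout are made-up names, and it relies on GNU time reporting "Command exited with non-zero status 124" when the timeout command kills herd7.

```shell
#!/bin/sh
# Illustration only: timed_out and classify_timeout are hypothetical helpers.

# timed_out FILE: succeed if FILE records exit status 124, which is what
# timeout(1) returns when the command it runs exceeds its time limit.
timed_out () {
	grep -q '^Command exited with non-zero status 124' "$1"
}

# classify_timeout OLD NEW: print which run timed out, mirroring the
# messages in the patch.  Fails (and prints nothing) if neither did.
classify_timeout () {
	if timed_out "$1" && timed_out "$2"
	then
		echo "Both runs timed out: $2"
	elif timed_out "$1"
	then
		echo "Old run timed out: $2"
	elif timed_out "$2"
	then
		echo "New run timed out: $2"
	else
		return 1
	fi
}
```

Factoring the grep into one helper avoids repeating the pattern six times, at the cost of one extra function; whether that is worth it for the real script is a matter of taste.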



[PATCH tip/core/rcu 08/21] tools/memory-model: Add support for synchronize_srcu_expedited()

2019-03-26 Thread Paul E. McKenney
Given that synchronize_rcu_expedited() is supported, this commit adds
support for synchronize_srcu_expedited().

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/linux-kernel.def | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/memory-model/linux-kernel.def b/tools/memory-model/linux-kernel.def
index 0c3f0ef486f4..551eeaa389d4 100644
--- a/tools/memory-model/linux-kernel.def
+++ b/tools/memory-model/linux-kernel.def
@@ -51,6 +51,7 @@ synchronize_rcu_expedited() { __fence{sync-rcu}; }
 srcu_read_lock(X)  __srcu{srcu-lock}(X)
 srcu_read_unlock(X,Y) { __srcu{srcu-unlock}(X,Y); }
 synchronize_srcu(X)  { __srcu{sync-srcu}(X); }
+synchronize_srcu_expedited(X)  { __srcu{sync-srcu}(X); }
 
 // Atomic
 atomic_read(X) READ_ONCE(*X)
-- 
2.17.1



[PATCH tip/core/rcu 15/21] tools/memory-model: Make judgelitmus.sh ransack .litmus.out files

2019-03-26 Thread Paul E. McKenney
The judgelitmus.sh script currently relies solely on the "Result:"
comment in the .litmus file.  This is problematic when using the --hw
argument, because it is necessary to check the hardware model against
LKMM even in the absence of "Result:" comments.

This commit therefore modifies judgelitmus.sh to check the observation
in a .litmus.out file, in case one was generated by a previous LKMM run.

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/judgelitmus.sh | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/memory-model/scripts/judgelitmus.sh b/tools/memory-model/scripts/judgelitmus.sh
index a5a865620f14..a1313f9960c3 100755
--- a/tools/memory-model/scripts/judgelitmus.sh
+++ b/tools/memory-model/scripts/judgelitmus.sh
@@ -8,7 +8,9 @@
 # is provided, this is assumed to be a hardware test, and the output is
 # assumed to be in file.HW.litmus.out, where "HW" is the --hw argument.
 # In addition, non-Sometimes verification results will be noted, but
-# forgiven.
+# forgiven.  Furthermore, if there is no "Result:" comment but there is
+# an LKMM .litmus.out file, the observation in that file will be used
+# to judge the assembly-language verification.
 #
 # Usage:
 #  judgelitmus.sh file.litmus
@@ -32,9 +34,11 @@ fi
 if test -z "$LKMM_HW_MAP_FILE"
 then
litmusout=$litmus.out
+   lkmmout=
 else
litmusout="`echo $litmus |
sed -e 's/\.litmus$/.'${LKMM_HW_MAP_FILE}'.litmus/'`.out"
+   lkmmout=$litmus.out
 fi
 if test -f "$LKMM_DESTDIR/$litmusout" -a -r "$LKMM_DESTDIR/$litmusout"
 then
@@ -46,6 +50,9 @@ fi
 if grep -q '^ \* Result: ' $litmus
 then
outcome=`grep -m 1 '^ \* Result: ' $litmus | awk '{ print $3 }'`
+elif test -n "$LKMM_HW_MAP_FILE" && grep -q '^Observation' $LKMM_DESTDIR/$lkmmout > /dev/null 2>&1
+then
+   outcome=`grep -m 1 '^Observation ' $LKMM_DESTDIR/$lkmmout | awk '{ print $3 }'`
 else
outcome=specified
 fi
-- 
2.17.1



[PATCH tip/core/rcu 04/21] docs/memory-barriers.txt: Rewrite "KERNEL I/O BARRIER EFFECTS" section

2019-03-26 Thread Paul E. McKenney
From: Will Deacon 

The "KERNEL I/O BARRIER EFFECTS" section of memory-barriers.txt is vague,
x86-centric, out-of-date, incomplete and demonstrably incorrect in places.
This is largely because I/O ordering is a horrible can of worms, but also
because the document has stagnated as our understanding has evolved.

Attempt to address some of that, by rewriting the section based on
recent(-ish) discussions with Arnd, BenH and others. Maybe one day we'll
find a way to formalise this stuff, but for now let's at least try to
make the English easier to understand.

Cc: "Paul E. McKenney" 
Cc: Benjamin Herrenschmidt 
Cc: Michael Ellerman 
Cc: Arnd Bergmann 
Cc: Peter Zijlstra 
Cc: Andrea Parri 
Cc: Palmer Dabbelt 
Cc: Daniel Lustig 
Cc: David Howells 
Cc: Alan Stern 
Cc: Linus Torvalds 
Cc: "Maciej W. Rozycki" 
Cc: Mikulas Patocka 
Signed-off-by: Will Deacon 
Signed-off-by: Paul E. McKenney 
---
 Documentation/memory-barriers.txt | 115 ++
 1 file changed, 70 insertions(+), 45 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 1c22b21ae922..158947ae78c2 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -2599,72 +2599,97 @@ likely, then interrupt-disabling locks should be used to guarantee ordering.
 KERNEL I/O BARRIER EFFECTS
 ==
 
-When accessing I/O memory, drivers should use the appropriate accessor
-functions:
+Interfacing with peripherals via I/O accesses is deeply architecture and device
+specific. Therefore, drivers which are inherently non-portable may rely on
+specific behaviours of their target systems in order to achieve synchronization
+in the most lightweight manner possible. For drivers intending to be portable
+between multiple architectures and bus implementations, the kernel offers a
+series of accessor functions that provide various degrees of ordering
+guarantees:
 
- (*) inX(), outX():
+ (*) readX(), writeX():
 
- These are intended to talk to I/O space rather than memory space, but
- that's primarily a CPU-specific concept.  The i386 and x86_64 processors
- do indeed have special I/O space access cycles and instructions, but many
- CPUs don't have such a concept.
+ The readX() and writeX() MMIO accessors take a pointer to the peripheral
+ being accessed as an __iomem * parameter. For pointers mapped with the
+ default I/O attributes (e.g. those returned by ioremap()), then the
+ ordering guarantees are as follows:
 
- The PCI bus, amongst others, defines an I/O space concept which - on such
- CPUs as i386 and x86_64 - readily maps to the CPU's concept of I/O
- space.  However, it may also be mapped as a virtual I/O space in the CPU's
- memory map, particularly on those CPUs that don't support alternate I/O
- spaces.
+ 1. All readX() and writeX() accesses to the same peripheral are ordered
+with respect to each other. For example, this ensures that MMIO register
+   writes by the CPU to a particular device will arrive in program order.
 
- Accesses to this space may be fully synchronous (as on i386), but
- intermediary bridges (such as the PCI host bridge) may not fully honour
- that.
+ 2. A writeX() by the CPU to the peripheral will first wait for the
+completion of all prior CPU writes to memory. For example, this ensures
+that writes by the CPU to an outbound DMA buffer allocated by
+dma_alloc_coherent() will be visible to a DMA engine when the CPU writes
+to its MMIO control register to trigger the transfer.
 
- They are guaranteed to be fully ordered with respect to each other.
+ 3. A readX() by the CPU from the peripheral will complete before any
+   subsequent CPU reads from memory can begin. For example, this ensures
+   that reads by the CPU from an incoming DMA buffer allocated by
+   dma_alloc_coherent() will not see stale data after reading from the DMA
+   engine's MMIO status register to establish that the DMA transfer has
+   completed.
 
- They are not guaranteed to be fully ordered with respect to other types of
- memory and I/O operation.
+ 4. A readX() by the CPU from the peripheral will complete before any
+   subsequent delay() loop can begin execution. For example, this ensures
+   that two MMIO register writes by the CPU to a peripheral will arrive at
+   least 1us apart if the first write is immediately read back with readX()
+   and udelay(1) is called prior to the second writeX().
 
- (*) readX(), writeX():
+ __iomem pointers obtained with non-default attributes (e.g. those returned
+ by ioremap_wc()) are unlikely to provide many of these guarantees.
 
- Whether these are guaranteed to be fully ordered and uncombined with
- respect to each other on the issuing CPU depends on the characteristics
- defined for the memory window through whic

[PATCH tip/core/rcu 16/21] tools/memory-model: Split runlitmus.sh out of checklitmus.sh

2019-03-26 Thread Paul E. McKenney
This commit prepares for adding --hw capability to github litmus-test
scripts by splitting runlitmus.sh (which simply runs the verification)
out of checklitmus.sh (which also judges the results).

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/checklitmus.sh | 57 ++-
 tools/memory-model/scripts/runlitmus.sh   | 69 +++
 2 files changed, 73 insertions(+), 53 deletions(-)
 create mode 100755 tools/memory-model/scripts/runlitmus.sh

diff --git a/tools/memory-model/scripts/checklitmus.sh b/tools/memory-model/scripts/checklitmus.sh
index 1922a5af2c17..31a120d68a79 100755
--- a/tools/memory-model/scripts/checklitmus.sh
+++ b/tools/memory-model/scripts/checklitmus.sh
@@ -1,15 +1,8 @@
 #!/bin/sh
 # SPDX-License-Identifier: GPL-2.0+
 #
-# Run a herd7 test and invokes judgelitmus.sh to check the result against
-# a "Result:" comment within the litmus test.  It also outputs verification
-# results to a file whose name is that of the specified litmus test, but
-# with ".out" appended.
-#
-# If the --hw argument is specified, this script translates the .litmus
-# C-language file to the specified type of assembly and verifies that.
-# But in this case, litmus tests using complex synchronization (such as
-# locking, RCU, and SRCU) are cheerfully ignored.
+# Invokes runlitmus.sh and judgelitmus.sh on its arguments to run the
+# specified litmus test and pass judgment on the results.
 #
 # Usage:
 #  checklitmus.sh file.litmus
@@ -22,47 +15,5 @@
 #
 # Author: Paul E. McKenney 
 
-litmus=$1
-if test -f "$litmus" -a -r "$litmus"
-then
-   :
-else
-   echo ' --- ' error: \"$litmus\" is not a readable file
-   exit 255
-fi
-
-if test -z "$LKMM_HW_MAP_FILE"
-then
-   # LKMM run
-   herdoptions=${LKMM_HERD_OPTIONS--conf linux-kernel.cfg}
-   echo Herd options: $herdoptions > $LKMM_DESTDIR/$litmus.out
-   /usr/bin/time $LKMM_TIMEOUT_CMD herd7 $herdoptions $litmus >> $LKMM_DESTDIR/$litmus.out 2>&1
-else
-   # Hardware run
-
-   T=/tmp/checklitmushw.sh.$$
-   trap 'rm -rf $T' 0 2
-   mkdir $T
-
-   # Generate filenames
-   catfile="`echo $LKMM_HW_MAP_FILE | tr '[A-Z]' '[a-z]'`.cat"
-   mapfile="Linux2${LKMM_HW_MAP_FILE}.map"
-   themefile="$T/${LKMM_HW_MAP_FILE}.theme"
-   herdoptions="-model $LKMM_HW_CAT_FILE"
-   hwlitmus=`echo $litmus | sed -e 's/\.litmus$/.'${LKMM_HW_MAP_FILE}'.litmus/'`
-   hwlitmusfile=`echo $hwlitmus | sed -e 's,^.*/,,'`
-
-   # Don't run on litmus tests with complex synchronization
-   if ! scripts/simpletest.sh $litmus
-   then
-   echo ' --- ' error: \"$litmus\" contains locking, RCU, or SRCU
-   exit 254
-   fi
-
-   # Generate the assembly code and run herd on it.
-   gen_theme7 -n 10 -map $mapfile -call Linux.call > $themefile
-   jingle7 -theme $themefile $litmus > $T/$hwlitmusfile 2> $T/$hwlitmusfile.jingle7.out
-   /usr/bin/time $LKMM_TIMEOUT_CMD herd7 -model $catfile $T/$hwlitmusfile > $LKMM_DESTDIR/$hwlitmus.out 2>&1
-fi
-
-scripts/judgelitmus.sh $litmus
+scripts/runlitmus.sh $1
+scripts/judgelitmus.sh $1
diff --git a/tools/memory-model/scripts/runlitmus.sh b/tools/memory-model/scripts/runlitmus.sh
new file mode 100755
index 000000000000..0ae46ab08cbd
--- /dev/null
+++ b/tools/memory-model/scripts/runlitmus.sh
@@ -0,0 +1,69 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0+
+#
+# Without the --hw argument, runs a herd7 test and outputs verification
+# results to a file whose name is that of the specified litmus test,
+# but with ".out" appended.
+#
+# If the --hw argument is specified, this script translates the .litmus
+# C-language file to the specified type of assembly and verifies that.
+# But in this case, litmus tests using complex synchronization (such as
+# locking, RCU, and SRCU) are cheerfully ignored.
+#
+# Either way, return the status of the herd command.
+#
+# Usage:
+#  runlitmus.sh file.litmus
+#
+# Run this in the directory containing the memory model, specifying the
+# pathname of the litmus test to check.  The caller is expected to have
+# properly set up the LKMM environment variables.
+#
+# Copyright IBM Corporation, 2019
+#
+# Author: Paul E. McKenney 
+
+litmus=$1
+if test -f "$litmus" -a -r "$litmus"
+then
+   :
+else
+   echo ' --- ' error: \"$litmus\" is not a readable file
+   exit 255
+fi
+
+if test -z "$LKMM_HW_MAP_FILE"
+then
+   # LKMM run
+   herdoptions=${LKMM_HERD_OPTIONS--conf linux-kernel.cfg}
+   echo Herd options: $herdoptions > $LKMM_DESTDIR/$litmus.out
+   /usr/bin/time $LKMM_TIMEOUT_CMD herd7 $herdoptions $litmus >> $LKMM_DESTDIR/$litmus.out 2>&1
+else
+   # Hardware run
+
+   T=/tmp/checklitmushw.sh.$$
+   trap 'rm -rf $T' 0 2
+   mkdir $T
+
+   # Generate filenames
+   catfile="`echo $LKMM_HW_MAP_FILE | tr '[A-Z]' '[a-z]'`.cat"
+   mapfile="Linux2${LKMM_HW_MAP_FILE}.map"
+   themefile="$T/${LKMM_HW_MAP_FI

[PATCH tip/core/rcu 20/21] tools/memory-model: Allow herd to deduce CPU type

2019-03-26 Thread Paul E. McKenney
Currently, the scripts specify the CPU's .cat file to herd.  But this is
pointless because herd will select a good and sufficient .cat file from
the assembly-language litmus test itself.  This commit therefore removes
the -model argument to herd, allowing herd to figure the CPU family out
itself.

Note that the user can override herd's choice using the "--herdopts"
argument to the scripts.

Suggested-by: Luc Maranget 
Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/runlitmus.sh | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/memory-model/scripts/runlitmus.sh b/tools/memory-model/scripts/runlitmus.sh
index 71d13bb0c764..482ed7a08951 100755
--- a/tools/memory-model/scripts/runlitmus.sh
+++ b/tools/memory-model/scripts/runlitmus.sh
@@ -53,7 +53,6 @@ trap 'rm -rf $T' 0 2
 mkdir $T
 
 # Generate filenames
-catfile="`echo $LKMM_HW_MAP_FILE | tr '[A-Z]' '[a-z]'`.cat"
 mapfile="Linux2${LKMM_HW_MAP_FILE}.map"
 themefile="$T/${LKMM_HW_MAP_FILE}.theme"
 herdoptions="-model $LKMM_HW_CAT_FILE"
@@ -70,6 +69,6 @@ fi
 # Generate the assembly code and run herd on it.
 gen_theme7 -n 10 -map $mapfile -call Linux.call > $themefile
 jingle7 -theme $themefile $litmus > $LKMM_DESTDIR/$hwlitmus 2> $T/$hwlitmusfile.jingle7.out
-/usr/bin/time $LKMM_TIMEOUT_CMD herd7 -model $catfile $LKMM_DESTDIR/$hwlitmus > $LKMM_DESTDIR/$hwlitmus.out 2>&1
+/usr/bin/time $LKMM_TIMEOUT_CMD herd7 $LKMM_DESTDIR/$hwlitmus > $LKMM_DESTDIR/$hwlitmus.out 2>&1
 
 exit $?
-- 
2.17.1



[PATCH tip/core/rcu 09/21] tools/memory-model: Make judgelitmus.sh detect hard deadlocks

2019-03-26 Thread Paul E. McKenney
If a litmus test specifies "Result: Never" and if it contains an
unconditional ("hard") deadlock, then running checklitmus.sh on it will
not flag any errors, despite the fact that there are no executions.
This commit therefore updates judgelitmus.sh to complain about tests
that have no executions but are marked with a result other than
"Result: DEADLOCK".
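The check can be illustrated with a minimal sketch.  It assumes that
herd's "Observation" line ends with two counts and that "Never 0 0"
therefore means there were no executions at all.  The helper name
is_hard_deadlock and the sample lines are made up for illustration and
are not part of the patch:

```shell
#!/bin/sh
# Sketch only: is_hard_deadlock is a hypothetical helper, not part of
# the patch.  It reads herd output on stdin and succeeds when the
# Observation line ends in "Never 0 0", that is, no executions at all.
is_hard_deadlock () {
    grep '^Observation' | grep -q 'Never 0 0$'
}

# Sample Observation lines (test names invented for illustration).
printf 'Observation deadlock-test Never 0 0\n' | is_hard_deadlock &&
    echo "deadlock"
printf 'Observation other-test Never 0 3\n' | is_hard_deadlock ||
    echo "not a deadlock"
```

The second sample has a non-zero count of executions, so it is treated
as an ordinary "Never" result rather than as a deadlock.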

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/judgelitmus.sh | 8 
 1 file changed, 8 insertions(+)

diff --git a/tools/memory-model/scripts/judgelitmus.sh b/tools/memory-model/scripts/judgelitmus.sh
index d40439c7b71e..84c62eee321b 100755
--- a/tools/memory-model/scripts/judgelitmus.sh
+++ b/tools/memory-model/scripts/judgelitmus.sh
@@ -83,6 +83,14 @@ then
fi
ret=1
fi
+elif grep '^Observation' $LKMM_DESTDIR/$litmus.out | grep -q 'Never 0 0$'
+then
+   echo " !!! Unexpected non-$outcome deadlock" $litmus
+   if ! grep -q '!!!' $LKMM_DESTDIR/$litmus.out
+   then
+   echo " !!! Unexpected non-$outcome deadlock" $litmus >> $LKMM_DESTDIR/$litmus.out 2>&1
+   fi
+   ret=1
 elif grep '^Observation' $LKMM_DESTDIR/$litmus.out | grep -q $outcome || test "$outcome" = Maybe
 then
ret=0
-- 
2.17.1



[PATCH tip/core/rcu 18/21] tools/memory-model: Move from .AArch64.litmus.out to .litmus.AArch.out

2019-03-26 Thread Paul E. McKenney
When the github scripts see ".litmus.out", they assume that there must be
a corresponding C-language ".litmus" file.  Won't they be disappointed
when they instead see nothing, or, worse yet, the corresponding
assembly-language litmus test?  This commit therefore swaps the hardware
tag with the "litmus" to avoid this sort of disappointment.

This commit also adjusts the .gitignore file so as to avoid adding these
new ".out" files to git.
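As a rough sketch of the new naming scheme, the sed expression from the
patch can be wrapped in a small helper.  The name hw_out_name and the
example arguments are hypothetical and serve only to show the resulting
filename:

```shell
#!/bin/sh
# Sketch only: hw_out_name is a hypothetical helper that mirrors the
# sed expression in the patch and then appends ".out".
# $1: path of the C-language litmus test
# $2: hardware tag (example value: AArch64); it must not contain
#     characters that are special in a sed replacement, such as / or &.
hw_out_name () {
    echo "$1" | sed -e 's/\.litmus$/.litmus.'"$2"'/' | sed -e 's/$/.out/'
}

hw_out_name litmus-tests/MP+poonceonces.litmus AArch64
# prints: litmus-tests/MP+poonceonces.litmus.AArch64.out
```

A name of this form no longer ends in ".litmus.out", which is why the
.gitignore pattern is widened to "*.out".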

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/litmus-tests/.gitignore | 2 +-
 tools/memory-model/scripts/judgelitmus.sh  | 2 +-
 tools/memory-model/scripts/runlitmus.sh| 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/memory-model/litmus-tests/.gitignore b/tools/memory-model/litmus-tests/.gitignore
index 6e2ddc54152f..f47cb2045f13 100644
--- a/tools/memory-model/litmus-tests/.gitignore
+++ b/tools/memory-model/litmus-tests/.gitignore
@@ -1 +1 @@
-*.litmus.out
+*.out
diff --git a/tools/memory-model/scripts/judgelitmus.sh b/tools/memory-model/scripts/judgelitmus.sh
index a1313f9960c3..6ecd223c0f4c 100755
--- a/tools/memory-model/scripts/judgelitmus.sh
+++ b/tools/memory-model/scripts/judgelitmus.sh
@@ -37,7 +37,7 @@ then
lkmmout=
 else
litmusout="`echo $litmus |
-   sed -e 's/\.litmus$/.'${LKMM_HW_MAP_FILE}'.litmus/'`.out"
+   sed -e 's/\.litmus$/.litmus.'${LKMM_HW_MAP_FILE}'/'`.out"
lkmmout=$litmus.out
 fi
 if test -f "$LKMM_DESTDIR/$litmusout" -a -r "$LKMM_DESTDIR/$litmusout"
diff --git a/tools/memory-model/scripts/runlitmus.sh b/tools/memory-model/scripts/runlitmus.sh
index 186944a7a528..154c95ce79da 100755
--- a/tools/memory-model/scripts/runlitmus.sh
+++ b/tools/memory-model/scripts/runlitmus.sh
@@ -57,7 +57,7 @@ catfile="`echo $LKMM_HW_MAP_FILE | tr '[A-Z]' '[a-z]'`.cat"
 mapfile="Linux2${LKMM_HW_MAP_FILE}.map"
 themefile="$T/${LKMM_HW_MAP_FILE}.theme"
 herdoptions="-model $LKMM_HW_CAT_FILE"
-hwlitmus=`echo $litmus | sed -e 's/\.litmus$/.'${LKMM_HW_MAP_FILE}'.litmus/'`
+hwlitmus=`echo $litmus | sed -e 's/\.litmus$/.litmus.'${LKMM_HW_MAP_FILE}'/'`
 hwlitmusfile=`echo $hwlitmus | sed -e 's,^.*/,,'`
 
 # Don't run on litmus tests with complex synchronization
-- 
2.17.1



[PATCH tip/core/rcu 07/21] tools/memory-model: Make judgelitmus.sh identify bad macros

2019-03-26 Thread Paul E. McKenney
Currently, judgelitmus.sh treats use of unknown primitives (such as
srcu_read_lock() prior to SRCU support) as "!!! Verification error".
This can be misleading because it fails to call out typos and running
a version of LKMM on a litmus test requiring a feature not provided by
that version.  This commit therefore changes judgelitmus.sh to check
for unknown primitives and to report them, for example, with:

'!!! Current LKMM version does not know "rcu_write_lock"'.
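The name extraction can be tried in isolation with a small sketch.  The
sample error line below is only an assumed shape for herd's message,
inferred from the sed expressions in the patch, and unknown_macro is a
hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: unknown_macro is a hypothetical helper.  It reads herd
# output on stdin and prints the name of each unknown macro, using the
# same grep and sed expressions as the patch.
unknown_macro () {
    grep ': Unknown macro ' |
        sed -e 's/^.*: Unknown macro //' |
        sed -e 's/ (User error).*$//'
}

# Assumed message shape; the file name is made up.
echo 'File "bad.litmus": Unknown macro rcu_write_lock (User error)' |
    unknown_macro
# prints: rcu_write_lock
```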

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/cmplitmushist.sh | 31 ++---
 tools/memory-model/scripts/judgelitmus.sh   | 12 
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/tools/memory-model/scripts/cmplitmushist.sh b/tools/memory-model/scripts/cmplitmushist.sh
index b9c174dd8004..ca1ac8b64614 100755
--- a/tools/memory-model/scripts/cmplitmushist.sh
+++ b/tools/memory-model/scripts/cmplitmushist.sh
@@ -12,6 +12,7 @@ trap 'rm -rf $T' 0
 mkdir $T
 
 # comparetest oldpath newpath
+badmacnam=0
 timedout=0
 perfect=0
 obsline=0
@@ -19,8 +20,26 @@ noobsline=0
 obsresult=0
 badcompare=0
 comparetest () {
-   if grep -q '^Command exited with non-zero status 124' $1 ||
-  grep -q '^Command exited with non-zero status 124' $2
+   if grep -q ': Unknown macro ' $1 || grep -q ': Unknown macro ' $2
+   then
+   if grep -q ': Unknown macro ' $1
+   then
+   badname=`grep ': Unknown macro ' $1 |
+   sed -e 's/^.*: Unknown macro //' |
+   sed -e 's/ (User error).*$//'`
+   echo 'Current LKMM version does not know "'$badname'"' $1
+   fi
+   if grep -q ': Unknown macro ' $2
+   then
+   badname=`grep ': Unknown macro ' $2 |
+   sed -e 's/^.*: Unknown macro //' |
+   sed -e 's/ (User error).*$//'`
+   echo 'Current LKMM version does not know "'$badname'"' $2
+   fi
+   badmacnam=`expr "$badmacnam" + 1`
+   return 0
+   elif grep -q '^Command exited with non-zero status 124' $1 ||
+grep -q '^Command exited with non-zero status 124' $2
then
if grep -q '^Command exited with non-zero status 124' $1 &&
   grep -q '^Command exited with non-zero status 124' $2
@@ -56,7 +75,7 @@ comparetest () {
return 0
fi
else
-   echo Missing Observation line "(e.g., herd7 timeout)": $2
+   echo Missing Observation line "(e.g., syntax error)": $2
noobsline=`expr "$noobsline" + 1`
return 0
fi
@@ -90,7 +109,7 @@ then
 fi
 if test "$noobsline" -ne 0
 then
-   echo Missing Observation line "(e.g., herd7 timeout)": $noobsline 1>&2
+   echo Missing Observation line "(e.g., syntax error)": $noobsline 1>&2
 fi
 if test "$obsresult" -ne 0
 then
@@ -100,6 +119,10 @@ if test "$timedout" -ne 0
 then
echo "!!!" Timed out: $timedout 1>&2
 fi
+if test "$badmacnam" -ne 0
+then
+   echo "!!!" Unknown primitive: $badmacnam 1>&2
+fi
 if test "$badcompare" -ne 0
 then
echo "!!!" Result changed: $badcompare 1>&2
diff --git a/tools/memory-model/scripts/judgelitmus.sh b/tools/memory-model/scripts/judgelitmus.sh
index d3c313b9a458..d40439c7b71e 100755
--- a/tools/memory-model/scripts/judgelitmus.sh
+++ b/tools/memory-model/scripts/judgelitmus.sh
@@ -42,6 +42,18 @@ grep '^Observation' $LKMM_DESTDIR/$litmus.out
 if grep -q '^Observation' $LKMM_DESTDIR/$litmus.out
 then
:
+elif grep ': Unknown macro ' $LKMM_DESTDIR/$litmus.out
+then
+   badname=`grep ': Unknown macro ' $LKMM_DESTDIR/$litmus.out |
+   sed -e 's/^.*: Unknown macro //' |
+   sed -e 's/ (User error).*$//'`
+   badmsg=' !!! Current LKMM version does not know "'$badname'"'" $litmus"
+   echo $badmsg
+   if ! grep -q '!!!' $LKMM_DESTDIR/$litmus.out
+   then
+   echo "$badmsg" >> $LKMM_DESTDIR/$litmus.out 2>&1
+   fi
+   exit 254
 elif grep '^Command exited with non-zero status 124' $LKMM_DESTDIR/$litmus.out
 then
echo ' !!! Timeout' $litmus
-- 
2.17.1



[PATCH tip/core/rcu 02/21] tools/memory-model: Fix comment in MP+poonceonces.litmus

2019-03-26 Thread Paul E. McKenney
From: Andrea Parri 

The comment should say "Sometimes" for the result.

Signed-off-by: Andrea Parri 
Cc: Alan Stern 
Cc: Will Deacon 
Cc: Peter Zijlstra 
Cc: Boqun Feng 
Cc: Nicholas Piggin 
Cc: David Howells 
Cc: Jade Alglave 
Cc: Luc Maranget 
Cc: "Paul E. McKenney" 
Cc: Akira Yokosawa 
Cc: Daniel Lustig 
Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/litmus-tests/MP+poonceonces.litmus | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/memory-model/litmus-tests/MP+poonceonces.litmus b/tools/memory-model/litmus-tests/MP+poonceonces.litmus
index b2b60b84fb9d..172f0145301c 100644
--- a/tools/memory-model/litmus-tests/MP+poonceonces.litmus
+++ b/tools/memory-model/litmus-tests/MP+poonceonces.litmus
@@ -1,7 +1,7 @@
 C MP+poonceonces
 
 (*
- * Result: Maybe
+ * Result: Sometimes
  *
  * Can the counter-intuitive message-passing outcome be prevented with
  * no ordering at all?
-- 
2.17.1



[PATCH tip/core/rcu 05/21] tools/memory-model: Make judgelitmus.sh note timeouts

2019-03-26 Thread Paul E. McKenney
Currently, judgelitmus.sh treats timeouts (as in the "--timeout" argument)
as "!!! Verification error".  This can be misleading because it is quite
possible that running the test longer would have produced a verification.
This commit therefore changes judgelitmus.sh to check for timeouts and
to report them with "!!! Timeout".
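The order of the checks can be sketched as follows.  The sketch relies
on two facts: timeout(1) exits with status 124 when the time limit
expires, and GNU /usr/bin/time reports a failing command with a line
reading "Command exited with non-zero status N".  The helper name
classify_out is hypothetical:

```shell
#!/bin/sh
# Sketch only: classify_out is a hypothetical helper that mimics the
# order of the checks in judgelitmus.sh on herd output read from stdin.
classify_out () {
    out=$(cat)
    if echo "$out" | grep -q '^Observation'
    then
        echo ok         # herd produced a result
    elif echo "$out" | grep -q '^Command exited with non-zero status 124'
    then
        echo timeout    # the timeout command killed herd
    else
        echo error      # anything else is a verification error
    fi
}

printf 'Command exited with non-zero status 124\n' | classify_out
# prints: timeout
```

Checking for the Observation line first means that a run that finished
in time is never misreported as a timeout.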

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/judgelitmus.sh | 8 
 1 file changed, 8 insertions(+)

diff --git a/tools/memory-model/scripts/judgelitmus.sh b/tools/memory-model/scripts/judgelitmus.sh
index 0cc63875e395..d3c313b9a458 100755
--- a/tools/memory-model/scripts/judgelitmus.sh
+++ b/tools/memory-model/scripts/judgelitmus.sh
@@ -42,6 +42,14 @@ grep '^Observation' $LKMM_DESTDIR/$litmus.out
 if grep -q '^Observation' $LKMM_DESTDIR/$litmus.out
 then
:
+elif grep '^Command exited with non-zero status 124' $LKMM_DESTDIR/$litmus.out
+then
+   echo ' !!! Timeout' $litmus
+   if ! grep -q '!!!' $LKMM_DESTDIR/$litmus.out
+   then
+   echo ' !!! Timeout' >> $LKMM_DESTDIR/$litmus.out 2>&1
+   fi
+   exit 124
 else
echo ' !!! Verification error' $litmus
if ! grep -q '!!!' $LKMM_DESTDIR/$litmus.out
-- 
2.17.1



[PATCH tip/core/rcu 12/21] tools/memory-model: Add simpletest.sh to check locking, RCU, and SRCU

2019-03-26 Thread Paul E. McKenney
This commit abstracts out the common functionality for checking a given
litmus test for locking, RCU, and SRCU in order to avoid duplicating code.
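The idea can be sketched with a shortened exclusion pattern.  The
helper name is_simple is hypothetical, the pattern lists only three of
the primitives, and the "\|" alternation is a GNU grep extension to
basic regular expressions, just as in the script itself:

```shell
#!/bin/sh
# Sketch only: is_simple is a hypothetical helper using a shortened
# version of the exclusion pattern.  It reads a litmus test on stdin
# and succeeds only if no line begins (after optional whitespace) with
# one of the listed primitives.
is_simple () {
    exclude='^[[:space:]]*\(spin_lock(\|rcu_read_lock(\|srcu_read_lock(\)'
    if grep -q "$exclude"
    then
        return 1
    fi
    return 0
}

printf '\tWRITE_ONCE(*x, 1);\n' | is_simple && echo "simple"
printf '\tspin_lock(mylock);\n' | is_simple || echo "not simple"
```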

Signed-off-by: Paul E. McKenney 
---
 tools/memory-model/scripts/simpletest.sh | 35 
 1 file changed, 35 insertions(+)
 create mode 100755 tools/memory-model/scripts/simpletest.sh

diff --git a/tools/memory-model/scripts/simpletest.sh b/tools/memory-model/scripts/simpletest.sh
new file mode 100755
index ..b03420e0dbb6
--- /dev/null
+++ b/tools/memory-model/scripts/simpletest.sh
@@ -0,0 +1,35 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0+
+#
+# Give zero status if this is a simple test and non-zero otherwise.
+# Simple tests do not contain locking, RCU, or SRCU.
+#
+# Usage:
+#  simpletest.sh file.litmus
+#
+# Copyright IBM Corporation, 2019
+#
+# Author: Paul E. McKenney 
+
+
+litmus=$1
+
+if test -f "$litmus" -a -r "$litmus"
+then
+   :
+else
+   echo ' --- ' error: \"$litmus\" is not a readable file
+   exit 255
+fi
+exclude="^[[:space:]]*\("
+exclude="${exclude}spin_lock(\|spin_unlock(\|spin_trylock(\|spin_is_locked("
+exclude="${exclude}\|rcu_read_lock(\|rcu_read_unlock("
+exclude="${exclude}\|synchronize_rcu(\|synchronize_rcu_expedited("
+exclude="${exclude}\|srcu_read_lock(\|srcu_read_unlock("
+exclude="${exclude}\|synchronize_srcu(\|synchronize_srcu_expedited("
+exclude="${exclude}\)"
+if grep -q "$exclude" "$litmus"
+then
+   exit 255
+fi
+exit 0
-- 
2.17.1


