[PATCH] Input: synaptics - Lenovo ThinkPad T25 and T480 devices should use RMI

2018-07-06 Thread kitsunyan
The touchpads on both T25 and T480 are accessible over SMBUS/RMI.

Signed-off-by: kitsunyan 
---
 drivers/input/mouse/synaptics.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c
index 55d33500d55e..be934a082424 100644
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -175,7 +175,9 @@ static const char * const smbus_pnp_ids[] = {
 	"LEN0071", /* T480 */
 	"LEN0072", /* X1 Carbon Gen 5 (2017) - Elan/ALPS trackpoint */
 	"LEN0073", /* X1 Carbon G5 (Elantech) */
+	"LEN008e", /* T25 */
 	"LEN0092", /* X1 Carbon 6 */
+	"LEN0093", /* T480 */
 	"LEN0096", /* X280 */
 	"LEN0097", /* X280 -> ALPS trackpoint */
 	"LEN200f", /* T450s */
-- 
2.18.0



Re: [PATCH v9 5/7] tracing: Centralize preemptirq tracepoints and unify their usage

2018-07-06 Thread Joel Fernandes
On Fri, Jul 06, 2018 at 06:06:10PM -0400, Steven Rostedt wrote:
> 
> Peter,
> 
> Want to ack this? It touches Lockdep.
> 
> Joel,
> 
> I got to this patch and I'm still reviewing it. I'll hopefully have my
> full review done by next week. I'll make it a priority. But I still
> would like Peter's ack on this one, as he's the maintainer of lockdep.

Thanks a lot Steven.

Peter, the lockdep calls are just small changes to the calling of the irq
on/off hooks and minor clean ups. Also I ran full locking API selftests with
all tests passing. I hope you are Ok with this change. Appreciate an Ack for
the lockdep bits and thanks.

-Joel
 

> Thanks,
> 
> -- Steve
> 
> 
> On Thu, 28 Jun 2018 11:21:47 -0700
> Joel Fernandes  wrote:
> 
> > From: "Joel Fernandes (Google)" 
> > 
> > This patch detaches the preemptirq tracepoints from the tracers and
> > keeps it separate.
> > 
> > Advantages:
> > * Lockdep and irqsoff event can now run in parallel since they no longer
> > have their own calls.
> > 
> > * This unifies the usecase of adding hooks to an irqsoff and irqson
> > event, and a preemptoff and preempton event.
> >   3 users of the events exist:
> >   - Lockdep
> >   - irqsoff and preemptoff tracers
> >   - irqs and preempt trace events
> > 
> > The unification cleans up several ifdefs and makes the code in preempt
> > tracer and irqsoff tracers simpler. It gets rid of all the horrific
> > ifdeferry around PROVE_LOCKING and makes configuration of the different
> > users of the tracepoints more easy and understandable. It also gets rid
> > of the time_* function calls from the lockdep hooks used to call into
> > the preemptirq tracer which is not needed anymore. The negative delta in
> > lines of code in this patch is quite large too.
> > 
> > In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS
> > as a single point for registering probes onto the tracepoints. With
> > this, the web of config options for preempt/irq toggle tracepoints and
> > its users becomes:
> > 
> >  PREEMPT_TRACER   PREEMPTIRQ_EVENTS  IRQSOFF_TRACER PROVE_LOCKING
> >        |                 |     \         |           |
> >        \    (selects)    /      \        \ (selects) /
> >       TRACE_PREEMPT_TOGGLE       ----> TRACE_IRQFLAGS
> >                       \                  /
> >                        \ (depends on)   /
> >                      PREEMPTIRQ_TRACEPOINTS
> > 
> > One note, I have to check for lockdep recursion in the code that calls
> > the trace events API and bail out if we're in lockdep recursion
> > protection to prevent something like the following case: a spin_lock is
> > taken. Then lockdep_acquired is called.  That does a raw_local_irq_save
> > and then sets lockdep_recursion, and then calls __lockdep_acquired. In
> > this function, a call to get_lock_stats happens which calls
> > preempt_disable, which calls trace IRQS off somewhere which enters my
> > tracepoint code and sets the tracing_irq_cpu flag to prevent recursion.
> > This flag is then never cleared causing lockdep paths to never be
> > entered and thus causing splats and other bad things.
> > 
> > Other than the performance tests mentioned in the previous patch, I also
> > ran the locking API test suite. I verified that all tests cases are
> > passing.
> > 
> > I also injected issues by not registering lockdep probes onto the
> > tracepoints and I see failures to confirm that the probes are indeed
> > working.
> > 
> > This series + lockdep probes not registered (just to inject errors):
> > [0.00]  hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
> > [0.00]  soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
> > [0.00]sirq-safe-A => hirqs-on/12:FAILED|FAILED|  ok  |
> > [0.00]sirq-safe-A => hirqs-on/21:FAILED|FAILED|  ok  |
> > [0.00]  hard-safe-A + irqs-on/12:FAILED|FAILED|  ok  |
> > [0.00]  soft-safe-A + irqs-on/12:FAILED|FAILED|  ok  |
> > [0.00]  hard-safe-A + irqs-on/21:FAILED|FAILED|  ok  |
> > [0.00]  soft-safe-A + irqs-on/21:FAILED|FAILED|  ok  |
> > [0.00] hard-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
> > [0.00] soft-safe-A + unsafe-B #1/123:  ok  |  ok  |  ok  |
> > 
> > With this series + lockdep probes registered, all locking tests pass:
> > 
> > [0.00]  hard-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
> > [0.00]  soft-irqs-on + irq-safe-A/21:  ok  |  ok  |  ok  |
> > [0.00]sirq-safe-A => hirqs-on/12:  ok  |  ok  |  ok  |
> > [0.00]sirq-safe-A => hirqs-on/21:  ok  |  ok  |  ok  |
> > [0.00]  hard-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
> > [0.00]  soft-safe-A + irqs-on/12:  ok  |  ok  |  ok  |
> > [0.00]  hard-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
> > [0.00]  soft-safe-A + irqs-on/21:  ok  |  ok  |  ok  |
> > [0.00] hard-safe-A + 
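
The lockdep recursion note in the quoted description can be pictured with a small userspace model. This is only a sketch under stated assumptions: the globals below reuse the names lockdep_recursion and tracing_irq_cpu from the description, but they are plain variables, and model_irqs_off(), model_irqs_on() and probe_calls are hypothetical names that do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical single-CPU, single-task model. The names echo the ones in
 * the description, but these are plain globals, not the kernel's per-CPU
 * variable or task_struct field.
 */
static bool lockdep_recursion;  /* set while lockdep runs its own code */
static bool tracing_irq_cpu;    /* set between an "off" and an "on" event */
static int probe_calls;         /* how many times a probe would have fired */

static void model_irqs_off(void)
{
	/* Bail out first: never touch the flag from inside lockdep. */
	if (lockdep_recursion || tracing_irq_cpu)
		return;
	tracing_irq_cpu = true;
	probe_calls++;
}

static void model_irqs_on(void)
{
	if (lockdep_recursion || !tracing_irq_cpu)
		return;
	/* An "on" is only recorded after a matching "off" was recorded. */
	assert(probe_calls > 0);
	probe_calls++;
	tracing_irq_cpu = false;
}
```

With the early bail-out, an "off" event seen while lockdep_recursion is set leaves tracing_irq_cpu untouched, so the flag cannot be left set by a path that will never run the matching "on" event.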

Re: [Bisect] ext4_validate_inode_bitmap:98: comm stress-ng: Corrupt inode bitmap

2018-07-06 Thread Theodore Y. Ts'o
On Fri, Jul 06, 2018 at 11:43:24AM -0600, dann frazier wrote:
> Hi,
>   We're seeing a regression triggered by the stress-ng[*] "chdir" test
> that I've bisected to:
> 
> 044e6e3d74a3 ext4: don't update checksum of new initialized bitmaps
> 
> So far we've only seen failures on servers based on HiSilicon's family
> of ARM64 SoCs (D05/Hi1616 SoC, D06/Hi1620 SoC). On these systems it is
> very reproducible.

Thanks for the report.  Can you verify whether or not this patch fixes
things for you?

- Ted

diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index da6c10c1e37a..1cfb74bc4dca 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -90,6 +90,8 @@ static int ext4_validate_inode_bitmap(struct super_block *sb,
 		return -EFSCORRUPTED;
 
 	ext4_lock_group(sb, block_group);
+	if (buffer_verified(bh))
+		goto verified;
 	blk = ext4_inode_bitmap(sb, desc);
 	if (!ext4_inode_bitmap_csum_verify(sb, block_group, desc, bh,
 					   EXT4_INODES_PER_GROUP(sb) / 8)) {
@@ -101,6 +103,7 @@ static int ext4_validate_inode_bitmap(struct super_block *sb,
 		return -EFSBADCRC;
 	}
 	set_buffer_verified(bh);
+verified:
 	ext4_unlock_group(sb, block_group);
 	return 0;
 }
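
The hunks above re-check buffer_verified() once the group lock is held and skip the checksum when the flag is already set. A minimal userspace sketch of that "verify once" shape follows. Everything in it is hypothetical: struct bitmap_buf stands in for the buffer head, toy_csum() is a byte sum rather than the ext4 checksum, and the comment marks where the group lock would be held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bitmap buffer; not struct buffer_head. */
struct bitmap_buf {
	const uint8_t *data;
	size_t len;
	uint32_t stored_csum;   /* checksum recorded for this bitmap */
	bool verified;          /* models the buffer_verified() flag */
	unsigned csum_runs;     /* how often the checksum was computed */
};

/* Toy checksum (byte sum), used only to keep the sketch self-contained. */
static uint32_t toy_csum(const uint8_t *p, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += p[i];
	return sum;
}

/* Returns 0 if the bitmap is (or already was) verified, -1 on mismatch. */
static int validate_bitmap(struct bitmap_buf *b)
{
	assert(b != NULL);
	/* In the kernel this body would run with the group lock held. */
	if (b->verified)
		return 0;       /* someone verified it first: skip the checksum */
	b->csum_runs++;
	if (toy_csum(b->data, b->len) != b->stored_csum)
		return -1;
	b->verified = true;
	return 0;
}
```

In this model a second call after a successful verification returns 0 without recomputing the checksum, even if the bitmap contents have changed in the meantime.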


[PATCH] platform/x86/toshiba_acpi.c: fix defined but not used build warnings

2018-07-06 Thread Randy Dunlap
From: Randy Dunlap 

Fix a build warning in toshiba_acpi.c when CONFIG_PROC_FS is not enabled
by marking the unused function as __maybe_unused.

../drivers/platform/x86/toshiba_acpi.c:1685:12: warning: 'version_proc_show' defined but not used [-Wunused-function]

Signed-off-by: Randy Dunlap 
Cc: Azael Avalos 
Cc: platform-driver-...@vger.kernel.org
Cc: Darren Hart 
Cc: Andy Shevchenko 
---
 drivers/platform/x86/toshiba_acpi.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- linux-next-20180706.orig/drivers/platform/x86/toshiba_acpi.c
+++ linux-next-20180706/drivers/platform/x86/toshiba_acpi.c
@@ -34,6 +34,7 @@
 #define TOSHIBA_ACPI_VERSION   "0.24"
 #define PROC_INTERFACE_VERSION 1
 
+#include 
 #include 
 #include 
 #include 
@@ -1682,7 +1683,7 @@ static const struct file_operations keys
.write  = keys_proc_write,
 };
 
-static int version_proc_show(struct seq_file *m, void *v)
+static int __maybe_unused version_proc_show(struct seq_file *m, void *v)
 {
seq_printf(m, "driver:  %s\n", TOSHIBA_ACPI_VERSION);
seq_printf(m, "proc_interface:  %d\n", PROC_INTERFACE_VERSION);
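
For readers outside the kernel tree, the effect of the annotation can be reproduced in a small standalone file. This is a sketch, not the driver code: MAYBE_UNUSED is a local macro wrapping the GCC/Clang unused attribute (the kernel's __maybe_unused expands to the same attribute), HAVE_PROC is a hypothetical stand-in for CONFIG_PROC_FS, and version_show() and report_version() are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Local stand-in for the kernel's __maybe_unused (GCC/Clang attribute). */
#define MAYBE_UNUSED __attribute__((unused))

/*
 * Only referenced when HAVE_PROC is defined. Without the attribute, a
 * build without HAVE_PROC would warn "defined but not used" under
 * -Wunused-function.
 */
static int MAYBE_UNUSED version_show(char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "driver: %s\n", "0.24");
}

/* Hypothetical caller whose use of version_show() is config dependent. */
int report_version(char *buf, size_t len)
{
#ifdef HAVE_PROC
	return version_show(buf, len);
#else
	(void)buf;
	(void)len;
	return -1;      /* feature compiled out */
#endif
}
```

Compiling this with -Wall and without -DHAVE_PROC should produce no unused-function warning; removing MAYBE_UNUSED brings the warning back.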




Re: [PATCH v2] platform/x86: intel-hid: Add support for Device Specific Methods

2018-07-06 Thread Mario.Limonciello
>I strongly advocate for vendors to have more control over their drivers,
>but this scenario really frustrates me. I don't think I can justify this
>to Linus as a fix. But before we just say "no" (because hey, I want
>these fixes available as early as possible too), let's ask Rafael if he
>has an opinion or if there is precedent for this in his experience with
>ACPI drivers in general:

Full disclosure - an updated FW has since been rolled out that reverted this
behavior back to previous FW behavior due to lack of Linux support for the
new _DSM.  There is desire to use the new interface (as it did fix actual
problems with the old one) so at some point it may return.  When that happens
it would be ideal that people who are (for example) running an LTS kernel
or distro kernel that tracks stable can pick it up too.


[PATCH] mtdchar: fix overflows in adjustment of `count`

2018-07-06 Thread Jann Horn
The first checks in mtdchar_read() and mtdchar_write() attempt to limit
`count` such that `*ppos + count <= mtd->size`. However, they ignore the
possibility of `*ppos > mtd->size`, allowing the calculation of `count` to
wrap around. `mtdchar_lseek()` prevents seeking beyond mtd->size, but the
pread/pwrite syscalls bypass this.

I haven't found any codepath on which this actually causes dangerous
behavior, but it seems like a sensible change anyway.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jann Horn 
---
 drivers/mtd/mtdchar.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/mtd/mtdchar.c b/drivers/mtd/mtdchar.c
index cd67c85cc87d..02389528f622 100644
--- a/drivers/mtd/mtdchar.c
+++ b/drivers/mtd/mtdchar.c
@@ -160,8 +160,12 @@ static ssize_t mtdchar_read(struct file *file, char __user *buf, size_t count,
 
 	pr_debug("MTD_read\n");
 
-	if (*ppos + count > mtd->size)
-		count = mtd->size - *ppos;
+	if (*ppos + count > mtd->size) {
+		if (*ppos < mtd->size)
+			count = mtd->size - *ppos;
+		else
+			count = 0;
+	}
 
 	if (!count)
 		return 0;
@@ -246,7 +250,7 @@ static ssize_t mtdchar_write(struct file *file, const char __user *buf, size_t c
 
 	pr_debug("MTD_write\n");
 
-	if (*ppos == mtd->size)
+	if (*ppos >= mtd->size)
 		return -ENOSPC;
 
 	if (*ppos + count > mtd->size)
-- 
2.18.0.203.gfac676dfb9-goog
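
The wrap-around described in the commit message can be reproduced in userspace. The sketch below is a model, not the driver code: clamp_count() and clamp_count_old() are hypothetical helpers, pos stands in for *ppos as an unsigned 64-bit value, and size stands in for mtd->size. Unlike the patch, clamp_count() compares pos against size first and never adds pos and count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp count so that pos + count never exceeds size. */
static size_t clamp_count(uint64_t pos, size_t count, uint64_t size)
{
	size_t n = count;

	if (pos >= size)
		return 0;               /* at or past the end: nothing to do */
	if (n > size - pos)             /* size - pos cannot wrap here */
		n = (size_t)(size - pos);
	assert(n <= size - pos);
	return n;
}

/* The pre-patch arithmetic, kept only to show the wrap-around. */
static size_t clamp_count_old(uint64_t pos, size_t count, uint64_t size)
{
	if (pos + count > size)
		count = (size_t)(size - pos);   /* wraps when pos > size */
	return count;
}
```

With pos = 200 and size = 100, clamp_count() returns 0, while clamp_count_old() returns a value far larger than the device, which is the case the patch closes for offsets supplied through pread/pwrite.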



pull request: linux-firmware: update cxgb4 firmware

2018-07-06 Thread Ganesh Goudar
Hi,

Kindly pull the new firmware from the following URL.
git://git.chelsio.net/pub/git/linux-firmware.git for-upstream

Thanks
Ganesh

The following changes since commit d1147327232ec4616a66ab898df84f9700c816c1:

  Merge branch 'for-upstreaming-v1.7.2-vsw' of https://github.com/felix-cavium/linux-firmware (2018-06-06 13:23:36 -0400)

are available in the git repository at:


  git://git.chelsio.net/pub/git/linux-firmware.git for-upstream

for you to fetch changes up to 6213586dc3bc830cb27ce726a3002bb312bfa567:

  cxgb4: update firmware to revision 1.20.8.0 (2018-07-06 09:59:41 -0700)


Ganesh Goudar (1):
  cxgb4: update firmware to revision 1.20.8.0

 WHENCE  |  12 ++--
 cxgb4/t4fw-1.19.1.0.bin | Bin 553984 -> 0 bytes
 cxgb4/t4fw-1.20.8.0.bin | Bin 0 -> 559616 bytes
 cxgb4/t4fw.bin  |   2 +-
 cxgb4/t5fw-1.19.1.0.bin | Bin 651776 -> 0 bytes
 cxgb4/t5fw-1.20.8.0.bin | Bin 0 -> 646144 bytes
 cxgb4/t5fw.bin  |   2 +-
 cxgb4/t6fw-1.19.1.0.bin | Bin 698368 -> 0 bytes
 cxgb4/t6fw-1.20.8.0.bin | Bin 0 -> 692736 bytes
 cxgb4/t6fw.bin  |   2 +-
 10 files changed, 9 insertions(+), 9 deletions(-)
 delete mode 100644 cxgb4/t4fw-1.19.1.0.bin
 create mode 100644 cxgb4/t4fw-1.20.8.0.bin
 delete mode 100644 cxgb4/t5fw-1.19.1.0.bin
 create mode 100644 cxgb4/t5fw-1.20.8.0.bin
 delete mode 100644 cxgb4/t6fw-1.19.1.0.bin
 create mode 100644 cxgb4/t6fw-1.20.8.0.bin


Re: kernel BUG at mm/shmem.c:LINE!

2018-07-06 Thread Matthew Wilcox
On Fri, Jul 06, 2018 at 06:19:02PM -0700, syzbot wrote:
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+b8e0dfee3fd8c9012...@syzkaller.appspotmail.com
> 
> raw: 02fffc001028 ea0007011dc8 ea0007058b48 8801a7576ab8
> raw: 016e 8801a7588930 0003 8801d9a44c80
> page dumped because: VM_BUG_ON_PAGE(page_to_pgoff(page) != index)
> page->mem_cgroup:8801d9a44c80
> [ cut here ]
> kernel BUG at mm/shmem.c:815!
> invalid opcode:  [#1] SMP KASAN
> CPU: 0 PID: 4429 Comm: syz-executor697 Not tainted 4.18.0-rc3-next-20180706+
> #1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:shmem_undo_range+0xdaa/0x29a0 mm/shmem.c:815

Pretty sure this one's mine.  At least I spotted a codepath earlier
today which could lead to it.  I'll fix it in the morning.


[PATCH 3/3] i2c: mediatek: Use DMA safe buffers for i2c transactions

2018-07-06 Thread Jun Gao
From: Jun Gao 

DMA mode is always used for i2c transactions, so try to allocate a DMA
safe buffer when the buf of the struct i2c_msg in use is not DMA safe.

Signed-off-by: Jun Gao 
---
 drivers/i2c/busses/i2c-mt65xx.c | 62 -
 1 file changed, 55 insertions(+), 7 deletions(-)

diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c
index 806e8b90..dd014ee 100644
--- a/drivers/i2c/busses/i2c-mt65xx.c
+++ b/drivers/i2c/busses/i2c-mt65xx.c
@@ -441,6 +441,8 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, struct i2c_msg *msgs,
u16 control_reg;
u16 restart_flag = 0;
u32 reg_4g_mode;
+   u8 *dma_rd_buf;
+   u8 *dma_wr_buf;
dma_addr_t rpaddr = 0;
dma_addr_t wpaddr = 0;
int ret;
@@ -500,10 +502,18 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, struct i2c_msg *msgs,
if (i2c->op == I2C_MASTER_RD) {
writel(I2C_DMA_INT_FLAG_NONE, i2c->pdmabase + OFFSET_INT_FLAG);
writel(I2C_DMA_CON_RX, i2c->pdmabase + OFFSET_CON);
-   rpaddr = dma_map_single(i2c->dev, msgs->buf,
+
+   dma_rd_buf = i2c_get_dma_safe_msg_buf(msgs, 0);
+   if (!dma_rd_buf)
+   return -ENOMEM;
+
+   rpaddr = dma_map_single(i2c->dev, dma_rd_buf,
msgs->len, DMA_FROM_DEVICE);
-   if (dma_mapping_error(i2c->dev, rpaddr))
+   if (dma_mapping_error(i2c->dev, rpaddr)) {
+   i2c_free_dma_safe_msg_buf(msgs, dma_rd_buf);
+
return -ENOMEM;
+   }
 
if (i2c->dev_comp->support_33bits) {
reg_4g_mode = mtk_i2c_set_4g_mode(rpaddr);
@@ -515,10 +525,18 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, struct i2c_msg *msgs,
} else if (i2c->op == I2C_MASTER_WR) {
writel(I2C_DMA_INT_FLAG_NONE, i2c->pdmabase + OFFSET_INT_FLAG);
writel(I2C_DMA_CON_TX, i2c->pdmabase + OFFSET_CON);
-   wpaddr = dma_map_single(i2c->dev, msgs->buf,
+
+   dma_wr_buf = i2c_get_dma_safe_msg_buf(msgs, 0);
+   if (!dma_wr_buf)
+   return -ENOMEM;
+
+   wpaddr = dma_map_single(i2c->dev, dma_wr_buf,
msgs->len, DMA_TO_DEVICE);
-   if (dma_mapping_error(i2c->dev, wpaddr))
+   if (dma_mapping_error(i2c->dev, wpaddr)) {
+   i2c_free_dma_safe_msg_buf(msgs, dma_wr_buf);
+
return -ENOMEM;
+   }
 
if (i2c->dev_comp->support_33bits) {
reg_4g_mode = mtk_i2c_set_4g_mode(wpaddr);
@@ -530,16 +548,39 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, struct i2c_msg *msgs,
} else {
writel(I2C_DMA_CLR_FLAG, i2c->pdmabase + OFFSET_INT_FLAG);
writel(I2C_DMA_CLR_FLAG, i2c->pdmabase + OFFSET_CON);
-   wpaddr = dma_map_single(i2c->dev, msgs->buf,
+
+   dma_wr_buf = i2c_get_dma_safe_msg_buf(msgs, 0);
+   if (!dma_wr_buf)
+   return -ENOMEM;
+
+   wpaddr = dma_map_single(i2c->dev, dma_wr_buf,
msgs->len, DMA_TO_DEVICE);
-   if (dma_mapping_error(i2c->dev, wpaddr))
+   if (dma_mapping_error(i2c->dev, wpaddr)) {
+   i2c_free_dma_safe_msg_buf(msgs, dma_wr_buf);
+
return -ENOMEM;
-   rpaddr = dma_map_single(i2c->dev, (msgs + 1)->buf,
+   }
+
+   dma_rd_buf = i2c_get_dma_safe_msg_buf((msgs + 1), 0);
+   if (!dma_rd_buf) {
+   dma_unmap_single(i2c->dev, wpaddr,
+msgs->len, DMA_TO_DEVICE);
+
+   i2c_free_dma_safe_msg_buf(msgs, dma_wr_buf);
+
+   return -ENOMEM;
+   }
+
+   rpaddr = dma_map_single(i2c->dev, dma_rd_buf,
(msgs + 1)->len,
DMA_FROM_DEVICE);
if (dma_mapping_error(i2c->dev, rpaddr)) {
dma_unmap_single(i2c->dev, wpaddr,
 msgs->len, DMA_TO_DEVICE);
+
+   i2c_free_dma_safe_msg_buf(msgs, dma_wr_buf);
+   i2c_free_dma_safe_msg_buf((msgs + 1), dma_rd_buf);
+
return -ENOMEM;
}
 
@@ -578,14 +619,21 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, struct i2c_msg *msgs,
if (i2c->op == I2C_MASTER_WR) {
dma_unmap_single(i2c->dev, wpaddr,
 msgs->len, DMA_TO_DEVICE);
+
+   i2c_release_dma_safe_msg_buf(msgs, dma_wr_buf);
} else if (i2c->op == I2C_MASTER_RD) {

[PATCH 0/3] Register i2c adapter driver earlier and use DMA safe buffers

2018-07-06 Thread Jun Gao
This patch series is based on v4.18-rc1. It changes when the i2c adapter
driver is registered, adds a function to free DMA safe buffers, and uses DMA
safe buffers for i2c transactions.

Jun Gao (3):
  i2c: mediatek: Register i2c adapter driver earlier
  i2c: Add helper to ease DMA handling
  i2c: mediatek: Use DMA safe buffers for i2c transactions

 drivers/i2c/busses/i2c-mt65xx.c | 74 -
 drivers/i2c/i2c-core-base.c | 14 
 include/linux/i2c.h |  1 +
 3 files changed, 81 insertions(+), 8 deletions(-)

--
1.8.1.1
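
The "DMA safe buffer" idea in this series is a bounce-buffer pattern, which can be modelled in userspace. The sketch below is not the i2c core API: struct msg, get_safe_buf(), release_safe_buf() and free_safe_buf() are hypothetical names, and the split between a "release" that copies read data back and a "free" that does not is an assumption about how the helper from patch 2/3 (not shown in this thread) is meant to be used on error paths.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct i2c_msg; not the kernel structure. */
struct msg {
	unsigned char *buf;
	size_t len;
	bool is_read;    /* the device writes into the buffer */
	bool dma_safe;   /* caller promises buf is already DMA safe */
};

/* Return a buffer usable for DMA: m->buf itself, or a heap bounce copy. */
static unsigned char *get_safe_buf(const struct msg *m)
{
	unsigned char *bounce;

	assert(m != NULL);
	if (m->dma_safe)
		return m->buf;
	bounce = calloc(1, m->len ? m->len : 1);
	if (bounce && !m->is_read)
		memcpy(bounce, m->buf, m->len);   /* writes need the payload */
	return bounce;
}

/* After a completed transfer: copy read data back, then drop the bounce. */
static void release_safe_buf(struct msg *m, unsigned char *safe)
{
	if (!safe || safe == m->buf)
		return;
	if (m->is_read)
		memcpy(m->buf, safe, m->len);
	free(safe);
}

/* On an error path: drop the bounce buffer without copying anything back. */
static void free_safe_buf(const struct msg *m, unsigned char *safe)
{
	if (safe && safe != m->buf)
		free(safe);
}
```

A read into a buffer that is not DMA safe then goes through get_safe_buf(), the transfer, and release_safe_buf(), after which the caller's original buffer holds the received bytes.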



[PATCH 3/3] i2c: mediatek: Use DMA safe buffers for i2c transactions

2018-07-06 Thread Jun Gao
From: Jun Gao 

DMA mode will always be used in i2c transactions, try to allocate
a DMA safe buffer if the buf of struct i2c_msg used is not DMA safe.

Signed-off-by: Jun Gao 
---
 drivers/i2c/busses/i2c-mt65xx.c | 62 -
 1 file changed, 55 insertions(+), 7 deletions(-)

diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c
index 806e8b90..dd014ee 100644
--- a/drivers/i2c/busses/i2c-mt65xx.c
+++ b/drivers/i2c/busses/i2c-mt65xx.c
@@ -441,6 +441,8 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, struct 
i2c_msg *msgs,
u16 control_reg;
u16 restart_flag = 0;
u32 reg_4g_mode;
+   u8 *dma_rd_buf;
+   u8 *dma_wr_buf;
dma_addr_t rpaddr = 0;
dma_addr_t wpaddr = 0;
int ret;
@@ -500,10 +502,18 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, 
struct i2c_msg *msgs,
if (i2c->op == I2C_MASTER_RD) {
writel(I2C_DMA_INT_FLAG_NONE, i2c->pdmabase + OFFSET_INT_FLAG);
writel(I2C_DMA_CON_RX, i2c->pdmabase + OFFSET_CON);
-   rpaddr = dma_map_single(i2c->dev, msgs->buf,
+
+   dma_rd_buf = i2c_get_dma_safe_msg_buf(msgs, 0);
+   if (!dma_rd_buf)
+   return -ENOMEM;
+
+   rpaddr = dma_map_single(i2c->dev, dma_rd_buf,
msgs->len, DMA_FROM_DEVICE);
-   if (dma_mapping_error(i2c->dev, rpaddr))
+   if (dma_mapping_error(i2c->dev, rpaddr)) {
+   i2c_free_dma_safe_msg_buf(msgs, dma_rd_buf);
+
return -ENOMEM;
+   }
 
if (i2c->dev_comp->support_33bits) {
reg_4g_mode = mtk_i2c_set_4g_mode(rpaddr);
@@ -515,10 +525,18 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, 
struct i2c_msg *msgs,
} else if (i2c->op == I2C_MASTER_WR) {
writel(I2C_DMA_INT_FLAG_NONE, i2c->pdmabase + OFFSET_INT_FLAG);
writel(I2C_DMA_CON_TX, i2c->pdmabase + OFFSET_CON);
-   wpaddr = dma_map_single(i2c->dev, msgs->buf,
+
+   dma_wr_buf = i2c_get_dma_safe_msg_buf(msgs, 0);
+   if (!dma_wr_buf)
+   return -ENOMEM;
+
+   wpaddr = dma_map_single(i2c->dev, dma_wr_buf,
msgs->len, DMA_TO_DEVICE);
-   if (dma_mapping_error(i2c->dev, wpaddr))
+   if (dma_mapping_error(i2c->dev, wpaddr)) {
+   i2c_free_dma_safe_msg_buf(msgs, dma_wr_buf);
+
return -ENOMEM;
+   }
 
if (i2c->dev_comp->support_33bits) {
reg_4g_mode = mtk_i2c_set_4g_mode(wpaddr);
@@ -530,16 +548,39 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, 
struct i2c_msg *msgs,
} else {
writel(I2C_DMA_CLR_FLAG, i2c->pdmabase + OFFSET_INT_FLAG);
writel(I2C_DMA_CLR_FLAG, i2c->pdmabase + OFFSET_CON);
-   wpaddr = dma_map_single(i2c->dev, msgs->buf,
+
+   dma_wr_buf = i2c_get_dma_safe_msg_buf(msgs, 0);
+   if (!dma_wr_buf)
+   return -ENOMEM;
+
+   wpaddr = dma_map_single(i2c->dev, dma_wr_buf,
msgs->len, DMA_TO_DEVICE);
-   if (dma_mapping_error(i2c->dev, wpaddr))
+   if (dma_mapping_error(i2c->dev, wpaddr)) {
+   i2c_free_dma_safe_msg_buf(msgs, dma_wr_buf);
+
return -ENOMEM;
-   rpaddr = dma_map_single(i2c->dev, (msgs + 1)->buf,
+   }
+
+   dma_rd_buf = i2c_get_dma_safe_msg_buf((msgs + 1), 0);
+   if (!dma_rd_buf) {
+   dma_unmap_single(i2c->dev, wpaddr,
+msgs->len, DMA_TO_DEVICE);
+
+   i2c_free_dma_safe_msg_buf(msgs, dma_wr_buf);
+
+   return -ENOMEM;
+   }
+
+   rpaddr = dma_map_single(i2c->dev, dma_rd_buf,
(msgs + 1)->len,
DMA_FROM_DEVICE);
if (dma_mapping_error(i2c->dev, rpaddr)) {
dma_unmap_single(i2c->dev, wpaddr,
 msgs->len, DMA_TO_DEVICE);
+
+   i2c_free_dma_safe_msg_buf(msgs, dma_wr_buf);
+   i2c_free_dma_safe_msg_buf((msgs + 1), dma_rd_buf);
+
return -ENOMEM;
}
 
@@ -578,14 +619,21 @@ static int mtk_i2c_do_transfer(struct mtk_i2c *i2c, 
struct i2c_msg *msgs,
if (i2c->op == I2C_MASTER_WR) {
dma_unmap_single(i2c->dev, wpaddr,
 msgs->len, DMA_TO_DEVICE);
+
+   i2c_release_dma_safe_msg_buf(msgs, dma_wr_buf);
} else if (i2c->op == I2C_MASTER_RD) {

[PATCH 0/3] Register i2c adapter driver earlier and use DMA safe buffers

2018-07-06 Thread Jun Gao
This patch series is based on v4.18-rc1. It changes when the i2c adapter
driver is registered, adds a function to free DMA safe buffers, and uses
DMA safe buffers for i2c transactions.

Jun Gao (3):
  i2c: mediatek: Register i2c adapter driver earlier
  i2c: Add helper to ease DMA handling
  i2c: mediatek: Use DMA safe buffers for i2c transactions

 drivers/i2c/busses/i2c-mt65xx.c | 74 -
 drivers/i2c/i2c-core-base.c | 14 
 include/linux/i2c.h |  1 +
 3 files changed, 81 insertions(+), 8 deletions(-)

--
1.8.1.1
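
For readers following the series, here is a minimal userspace sketch of
the error-unwinding order that the DMA safe buffer patch applies in the
combined write-then-read case: if the second buffer cannot be obtained,
the first one is given back before returning. The names xfer,
xfer_prepare, xfer_finish and the fail_at fault-injection parameter are
hypothetical and exist only for illustration; malloc()/free() stand in
for the i2c DMA helpers and no real mapping is performed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical transfer context holding the two buffers of a write-then-read. */
struct xfer {
	void *wr_buf;
	void *rd_buf;
};

/*
 * Acquire both buffers or neither. fail_at is a fault-injection knob for
 * illustration only: 1 fails the write buffer, 2 fails the read buffer.
 */
int xfer_prepare(struct xfer *x, size_t wr_len, size_t rd_len, int fail_at)
{
	assert(x != NULL);
	x->wr_buf = NULL;
	x->rd_buf = NULL;

	x->wr_buf = (fail_at == 1) ? NULL : malloc(wr_len);
	if (!x->wr_buf)
		return -ENOMEM;

	x->rd_buf = (fail_at == 2) ? NULL : malloc(rd_len);
	if (!x->rd_buf) {
		/* Undo the earlier step before bailing out. */
		free(x->wr_buf);
		x->wr_buf = NULL;
		return -ENOMEM;
	}

	return 0;
}

/* Normal completion: release in reverse order of acquisition. */
void xfer_finish(struct xfer *x)
{
	free(x->rd_buf);
	free(x->wr_buf);
	x->rd_buf = NULL;
	x->wr_buf = NULL;
}
```

Each failure path releases exactly what was acquired before it, which is
why the later error branches in the driver patch are longer than the
earlier ones.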



[PATCH] ARM: dts: imx: Add ZII SCU2 Mezz board

2018-07-06 Thread Andrey Smirnov
Add support for the Zodiac Inflight Innovations SCU2 Mezz
board (i.MX51-based).

Cc: Fabio Estevam 
Cc: Nikita Yushchenko 
Cc: Lucas Stach 
Cc: cphe...@gmail.com
Cc: Shawn Guo 
Cc: Rob Herring 
Cc: Mark Rutland 
Cc: linux-arm-ker...@lists.infradead.org
Cc: devicet...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Fabio Estevam 
Signed-off-by: Andrey Gusakov 
Signed-off-by: Andrey Smirnov 
---

Shawn:

This is a spin-off of SCU2 Mezz board support originally found in [v0],
as per our off-list discussion (Fabio, Chris, Nikita, myself) to add
support for ZII hardware first and worry about factoring out
commonalities later.

Original submission was done by Andrey Gusakov, but he is too busy on
other projects, so the honors of submitting this were delegated to me.

NOTE: RAVE SP ("zii,rave-sp-mezz") node is technically supported by
upstream, but it needs some fixes from [rave-sp-fixes] to work
correctly. If you want me to drop that node until [rave-sp-fixes] is
merged, let me know.

Changes since [v0]:

 - Patch converted to be a standalone file not dependent on any
   ZII-specific .dtsi

 - Added RAVE SP node with all the children that are currently
   supported by upstream

 - Dropped ecspi2 node. That node didn't have any child devices in
   [v0] because none of the chips connected to that bus are supported
   upstream. This node can be added later once anything attached to it
   has upstream drivers.

 - Dropped i2c_gpio. That bus was originally added for RAVE SP related
   prototyping and is unused in actual product.

 - Various newline fixes pointed out in [v0]

 - Most of the nodes should be sorted alphabetically (I might have
   missed some)

 - Collected Reviewed-by from Fabio (Fabio, I assumed you won't mind,
   but let me know if you want me to drop it)

[v0] 
lkml.kernel.org/r/1529603100-31958-4-git-send-email-andrey.gusa...@cogentembedded.com
[rave-sp-fixes] 
lkml.kernel.org/r/20180707024108.32373-1-andrew.smir...@gmail.com

 arch/arm/boot/dts/Makefile|   3 +-
 arch/arm/boot/dts/imx51-zii-scu2-mezz.dts | 448 ++
 2 files changed, 450 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/boot/dts/imx51-zii-scu2-mezz.dts

diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
index 37a3de760d40..1d6acbab7062 100644
--- a/arch/arm/boot/dts/Makefile
+++ b/arch/arm/boot/dts/Makefile
@@ -358,7 +358,8 @@ dtb-$(CONFIG_SOC_IMX51) += \
imx51-digi-connectcore-jsk.dtb \
imx51-eukrea-mbimxsd51-baseboard.dtb \
imx51-ts4800.dtb \
-   imx51-zii-rdu1.dtb
+   imx51-zii-rdu1.dtb \
+   imx51-zii-scu2-mezz.dtb
 dtb-$(CONFIG_SOC_IMX53) += \
imx53-ard.dtb \
imx53-cx9020.dtb \
diff --git a/arch/arm/boot/dts/imx51-zii-scu2-mezz.dts 
b/arch/arm/boot/dts/imx51-zii-scu2-mezz.dts
new file mode 100644
index ..26cf08549df4
--- /dev/null
+++ b/arch/arm/boot/dts/imx51-zii-scu2-mezz.dts
@@ -0,0 +1,448 @@
+// SPDX-License-Identifier: (GPL-2.0 OR MIT)
+
+/*
+ * Copyright (C) 2018 Zodiac Inflight Innovations
+ */
+
+/dts-v1/;
+
+#include "imx51.dtsi"
+
+/ {
+   model = "ZII SCU2 Mezz Board";
+   compatible = "zii,imx51-scu2-mezz", "fsl,imx51";
+
+   chosen {
+   stdout-path = 
+   };
+
+   /* Will be filled by the bootloader */
+   memory@9000 {
+   reg = <0x9000 0>;
+   };
+
+   aliases {
+   mdio-gpio0 = &mdio_gpio;
+   };
+
+   usb_vbus: regulator-usb-vbus {
+   compatible = "regulator-fixed";
+   pinctrl-names = "default";
+   pinctrl-0 = <_usb_mmc_reset>;
+   gpio = < 13 GPIO_ACTIVE_LOW>;
+   startup-delay-us = <15>;
+   regulator-name = "usb_vbus";
+   regulator-min-microvolt = <500>;
+   regulator-max-microvolt = <500>;
+   };
+
+   mdio_gpio: mdio-gpio {
+   compatible = "virtual,mdio-gpio";
+   pinctrl-names = "default";
+   pinctrl-0 = <_swmdio>;
+   gpios = < 7 GPIO_ACTIVE_HIGH>, /* mdc */
+   < 6 GPIO_ACTIVE_HIGH>; /* mdio */
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   switch@0 {
+   compatible = "marvell,mv88e6085";
+   reg = <0>;
+   dsa,member = <0 0>;
+   eeprom-length = <512>;
+   interrupt-parent = <>;
+   interrupts = <7 IRQ_TYPE_LEVEL_LOW>;
+   interrupt-controller;
+   #interrupt-cells = <2>;
+
+   ports {
+   #address-cells = <1>;
+   #size-cells = <0>;
+
+   port@0 {
+   reg = <0>;
+   label = "port4";
+   };
+
+ 

[PATCH 2/3] i2c: Add helper to ease DMA handling

2018-07-06 Thread Jun Gao
From: Jun Gao 

Callers of i2c_get_dma_safe_msg_buf() may need this function: it frees
the DMA safe buffer when a DMA operation fails.

Signed-off-by: Jun Gao 
---
 drivers/i2c/i2c-core-base.c | 14 ++
 include/linux/i2c.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
index 31d16ad..2b518ea 100644
--- a/drivers/i2c/i2c-core-base.c
+++ b/drivers/i2c/i2c-core-base.c
@@ -2288,6 +2288,20 @@ void i2c_release_dma_safe_msg_buf(struct i2c_msg *msg, 
u8 *buf)
 }
 EXPORT_SYMBOL_GPL(i2c_release_dma_safe_msg_buf);
 
+/**
+ * i2c_free_dma_safe_msg_buf - free DMA safe buffer
+ * @msg: the message related to DMA safe buffer
+ * @buf: the buffer obtained from i2c_get_dma_safe_msg_buf(). May be NULL.
+ */
+void i2c_free_dma_safe_msg_buf(struct i2c_msg *msg, u8 *buf)
+{
+   if (!buf || buf == msg->buf)
+   return;
+
+   kfree(buf);
+}
+EXPORT_SYMBOL_GPL(i2c_free_dma_safe_msg_buf);
+
 MODULE_AUTHOR("Simon G. Vogl ");
 MODULE_DESCRIPTION("I2C-Bus main module");
 MODULE_LICENSE("GPL");
diff --git a/include/linux/i2c.h b/include/linux/i2c.h
index 254cd34..6d62f93 100644
--- a/include/linux/i2c.h
+++ b/include/linux/i2c.h
@@ -860,6 +860,7 @@ static inline u8 i2c_8bit_addr_from_msg(const struct 
i2c_msg *msg)
 
 u8 *i2c_get_dma_safe_msg_buf(struct i2c_msg *msg, unsigned int threshold);
 void i2c_release_dma_safe_msg_buf(struct i2c_msg *msg, u8 *buf);
+void i2c_free_dma_safe_msg_buf(struct i2c_msg *msg, u8 *buf);
 
 int i2c_handle_smbus_host_notify(struct i2c_adapter *adap, unsigned short 
addr);
 /**
-- 
1.8.1.1
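
To make the difference between "release" and "free" concrete, here is a
small userspace model of the bounce-buffer idea. It is a sketch under
stated assumptions, not the i2c core code: struct msg, get_safe_buf,
release_safe_buf and free_safe_buf are hypothetical stand-ins, the two
bool fields model the I2C_M_DMA_SAFE and I2C_M_RD flags, and
malloc()/calloc() replace the kernel allocators.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct i2c_msg; only the fields used here. */
struct msg {
	unsigned char *buf;
	size_t len;
	bool buf_is_dma_safe;	/* models the I2C_M_DMA_SAFE flag */
	bool is_read;		/* models the I2C_M_RD flag */
};

/* Return msg->buf if it is already DMA safe, otherwise a heap bounce buffer. */
unsigned char *get_safe_buf(struct msg *m)
{
	unsigned char *bounce;

	assert(m != NULL && m->buf != NULL);
	if (m->buf_is_dma_safe)
		return m->buf;

	if (m->is_read)
		return calloc(1, m->len);	/* reads start from zeroed memory */

	bounce = malloc(m->len);
	if (bounce)
		memcpy(bounce, m->buf, m->len);	/* writes need the payload */
	return bounce;
}

/* Success path: copy read data back to the caller, then drop the bounce buffer. */
void release_safe_buf(struct msg *m, unsigned char *buf)
{
	if (!buf || buf == m->buf)
		return;
	if (m->is_read)
		memcpy(m->buf, buf, m->len);
	free(buf);
}

/* Error path: drop the bounce buffer without touching msg->buf. */
void free_safe_buf(struct msg *m, unsigned char *buf)
{
	if (!buf || buf == m->buf)
		return;
	free(buf);
}
```

In this model the error-path helper never copies anything back, so a
failed transfer cannot overwrite the caller's buffer with data that was
never received.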



[PATCH 1/3] i2c: mediatek: Register i2c adapter driver earlier

2018-07-06 Thread Jun Gao
From: Jun Gao 

i2c slave devices depend on the i2c adapter. To avoid blocking the
initialization of i2c slave devices, register the i2c adapter driver
earlier, at subsys_initcall time.

Signed-off-by: Jun Gao 
---
 drivers/i2c/busses/i2c-mt65xx.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-mt65xx.c b/drivers/i2c/busses/i2c-mt65xx.c
index 1e57f58..806e8b90 100644
--- a/drivers/i2c/busses/i2c-mt65xx.c
+++ b/drivers/i2c/busses/i2c-mt65xx.c
@@ -888,7 +888,17 @@ static int mtk_i2c_resume(struct device *dev)
},
 };
 
-module_platform_driver(mtk_i2c_driver);
+static int __init mtk_i2c_adap_init(void)
+{
+   return platform_driver_register(&mtk_i2c_driver);
+}
+subsys_initcall(mtk_i2c_adap_init);
+
+static void __exit mtk_i2c_adap_exit(void)
+{
+   platform_driver_unregister(&mtk_i2c_driver);
+}
+module_exit(mtk_i2c_adap_exit);
 
 MODULE_LICENSE("GPL v2");
 MODULE_DESCRIPTION("MediaTek I2C Bus Driver");
-- 
1.8.1.1
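
The ordering effect of moving from module_platform_driver() to
subsys_initcall() can be modelled in a few lines of userspace C. For
built-in code, module_init() runs at the device initcall level (6) while
subsys_initcall() runs at level 4, so the adapter driver is registered
before ordinary device drivers. The struct initcall table and
run_initcalls() below are hypothetical illustration names, not kernel
interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of the kernel's initcall levels; lower numbers run earlier. */
enum initcall_level {
	LEVEL_SUBSYS = 4,	/* subsys_initcall() */
	LEVEL_DEVICE = 6,	/* device_initcall(), i.e. built-in module_init() */
};

/* Hypothetical table entry: which level a driver registered itself at. */
struct initcall {
	enum initcall_level level;
	const char *name;
};

/*
 * Walk the levels from lowest to highest and record the names in the
 * order they would run. Entries at the same level keep their table order.
 */
size_t run_initcalls(const struct initcall *calls, size_t n,
		     const char **ran, size_t max)
{
	size_t count = 0;

	assert(calls != NULL && ran != NULL);
	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)calls[i].level == level && count < max)
				ran[count++] = calls[i].name;
	return count;
}
```

With a client driver listed first at LEVEL_DEVICE and the adapter listed
second at LEVEL_SUBSYS, the adapter still comes out first in ran[].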



[PATCH 3/6] mfd: rave-sp: Initialize flow control and parity of the port

2018-07-06 Thread Andrey Smirnov
Relying on serial port defaults for flow control and parity can result
in complete breakdown of communication with RAVE SP on some platforms
where defaults are not what we need them to be. One such case is
VF610-based ZII SPU3 board (not supported upstream). To avoid this
problem in the future, add code to explicitly configure both.

Cc: linux-kernel@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Signed-off-by: Andrey Smirnov 
---
 drivers/mfd/rave-sp.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/mfd/rave-sp.c b/drivers/mfd/rave-sp.c
index a90ec4986b22..aa75d5841ca0 100644
--- a/drivers/mfd/rave-sp.c
+++ b/drivers/mfd/rave-sp.c
@@ -766,6 +766,13 @@ static int rave_sp_probe(struct serdev_device *serdev)
return ret;
 
serdev_device_set_baudrate(serdev, baud);
+   serdev_device_set_flow_control(serdev, false);
+
+   ret = serdev_device_set_parity(serdev, SERDEV_PARITY_NONE);
+   if (ret) {
+   dev_err(dev, "Failed to set parity\n");
+   return ret;
+   }
 
ret = rave_sp_get_status(sp);
if (ret) {
-- 
2.17.1



[PATCH 2/6] mfd: rave-sp: Fix incorrectly specified checksum type

2018-07-06 Thread Andrey Smirnov
RAVE SP firmware covered by the "legacy" variant uses the 16-bit CCITT
checksum algorithm. Change the code to correctly reflect that.

Cc: linux-kernel@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Signed-off-by: Andrey Smirnov 
---
 drivers/mfd/rave-sp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mfd/rave-sp.c b/drivers/mfd/rave-sp.c
index dfa4f5f1c376..a90ec4986b22 100644
--- a/drivers/mfd/rave-sp.c
+++ b/drivers/mfd/rave-sp.c
@@ -697,7 +697,7 @@ static const struct rave_sp_checksum rave_sp_checksum_ccitt 
= {
 };
 
 static const struct rave_sp_variant rave_sp_legacy = {
-   .checksum = &rave_sp_checksum_8b2c,
+   .checksum = &rave_sp_checksum_ccitt,
.cmd = {
.translate = rave_sp_default_cmd_translate,
},
-- 
2.17.1
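
To show why picking the wrong checksum type breaks communication, here
is a standalone sketch of the two schemes named in the diff. The 8-bit
one is a two's-complement sum. For the 16-bit one the commit message
only says "CCITT", so the parameters used here (polynomial 0x1021,
initial value 0xFFFF, MSB first, no final XOR) are an assumption.
csum_8b2c() and csum_ccitt() are illustration names, not the driver's
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 8-bit two's-complement checksum: the value that makes the byte sum wrap to zero. */
uint8_t csum_8b2c(const uint8_t *buf, size_t len)
{
	uint8_t sum = 0;

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++)
		sum = (uint8_t)(sum + buf[i]);
	return (uint8_t)(0u - sum);
}

/* Bitwise CRC-16 with the assumed CCITT parameters described above. */
uint16_t csum_ccitt(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0xFFFF;

	assert(buf != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)((uint16_t)buf[i] << 8);
		for (int bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x1021);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}
```

The two results differ in width as well as value, so a frame built with
one scheme cannot validate on a peer that expects the other.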



[PATCH 6/6] mfd: rave-sp: Emulate CMD_GET_STATUS on device that don't support it

2018-07-06 Thread Andrey Smirnov
CMD_GET_STATUS is not supported by some devices implementing
RDU2-compatible ICD as well as "legacy" devices. To account for that
fact, add code that obtains the same information (app/bootloader FW
version) using several different commands.

Cc: linux-kernel@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Signed-off-by: Andrey Smirnov 
---
 drivers/mfd/rave-sp.c | 96 ---
 1 file changed, 63 insertions(+), 33 deletions(-)

diff --git a/drivers/mfd/rave-sp.c b/drivers/mfd/rave-sp.c
index eee62ba6874d..2a8369657e38 100644
--- a/drivers/mfd/rave-sp.c
+++ b/drivers/mfd/rave-sp.c
@@ -117,14 +117,44 @@ struct rave_sp_checksum {
void (*subroutine)(const u8 *, size_t, u8 *);
 };
 
+struct rave_sp_version {
+   u8 hardware;
+   __le16 major;
+   u8 minor;
+   u8 letter[2];
+} __packed;
+
+struct rave_sp_status {
+   struct rave_sp_version bootloader_version;
+   struct rave_sp_version firmware_version;
+   u16 rdu_eeprom_flag;
+   u16 dds_eeprom_flag;
+   u8  pic_flag;
+   u8  orientation;
+   u32 etc;
+   s16 temp[2];
+   u8  backlight_current[3];
+   u8  dip_switch;
+   u8  host_interrupt;
+   u16 voltage_28;
+   u8  i2c_device_status;
+   u8  power_status;
+   u8  general_status;
+   u8  deprecated1;
+   u8  power_led_status;
+   u8  deprecated2;
+   u8  periph_power_shutoff;
+} __packed;
+
 /**
  * struct rave_sp_variant_cmds - Variant specific command routines
  *
  * @translate: Generic to variant specific command mapping routine
- *
+ * @get_status: Variant specific implementation of CMD_GET_STATUS
  */
 struct rave_sp_variant_cmds {
int (*translate)(enum rave_sp_command);
+   int (*get_status)(struct rave_sp *sp, struct rave_sp_status *);
 };
 
 /**
@@ -170,35 +200,6 @@ struct rave_sp {
const char *part_number_bootloader;
 };
 
-struct rave_sp_version {
-   u8 hardware;
-   __le16 major;
-   u8 minor;
-   u8 letter[2];
-} __packed;
-
-struct rave_sp_status {
-   struct rave_sp_version bootloader_version;
-   struct rave_sp_version firmware_version;
-   u16 rdu_eeprom_flag;
-   u16 dds_eeprom_flag;
-   u8  pic_flag;
-   u8  orientation;
-   u32 etc;
-   s16 temp[2];
-   u8  backlight_current[3];
-   u8  dip_switch;
-   u8  host_interrupt;
-   u16 voltage_28;
-   u8  i2c_device_status;
-   u8  power_status;
-   u8  general_status;
-   u8  deprecated1;
-   u8  power_led_status;
-   u8  deprecated2;
-   u8  periph_power_shutoff;
-} __packed;
-
 static bool rave_sp_id_is_event(u8 code)
 {
return (code & 0xF0) == RAVE_SP_EVNT_BASE;
@@ -660,18 +661,44 @@ static const char *devm_rave_sp_version(struct device 
*dev,
  version->letter[1]);
 }
 
-static int rave_sp_get_status(struct rave_sp *sp)
+static int rave_sp_rdu1_get_status(struct rave_sp *sp,
+  struct rave_sp_status *status)
 {
-   struct device *dev = &sp->serdev->dev;
u8 cmd[] = {
[0] = RAVE_SP_CMD_STATUS,
[1] = 0
};
+
+   return rave_sp_exec(sp, cmd, sizeof(cmd), status, sizeof(*status));
+}
+
+static int rave_sp_emulated_get_status(struct rave_sp *sp,
+  struct rave_sp_status *status)
+{
+   u8 cmd[] = {
+   [0] = RAVE_SP_CMD_GET_FIRMWARE_VERSION,
+   [1] = 0,
+   };
+   int ret;
+
+   ret = rave_sp_exec(sp, cmd, sizeof(cmd), &status->firmware_version,
+  sizeof(status->firmware_version));
+   if (ret)
+   return ret;
+
+   cmd[0] = RAVE_SP_CMD_GET_BOOTLOADER_VERSION;
+   return rave_sp_exec(sp, cmd, sizeof(cmd), &status->bootloader_version,
+   sizeof(status->bootloader_version));
+}
+
+static int rave_sp_get_status(struct rave_sp *sp)
+{
+   struct device *dev = &sp->serdev->dev;
struct rave_sp_status status;
const char *version;
int ret;
 
-   ret = rave_sp_exec(sp, cmd, sizeof(cmd), &status, sizeof(status));
+   ret = sp->variant->cmd.get_status(sp, &status);
if (ret)
return ret;
 
@@ -704,6 +731,7 @@ static const struct rave_sp_variant rave_sp_legacy = {
.checksum = &rave_sp_checksum_ccitt,
.cmd = {
.translate = rave_sp_default_cmd_translate,
+   .get_status = rave_sp_emulated_get_status,
},
 };
 
@@ -711,6 +739,7 @@ static const struct rave_sp_variant rave_sp_rdu1 = {
.checksum = &rave_sp_checksum_8b2c,
.cmd = {
.translate = rave_sp_rdu1_cmd_translate,
+   .get_status = rave_sp_rdu1_get_status,
},
 };
 
@@ -718,6 +747,7 @@ static const struct rave_sp_variant rave_sp_rdu2 = {
.checksum = &rave_sp_checksum_ccitt,
.cmd = {

[PATCH 0/6] RAVE SP MFD driver fixes/improvements

2018-07-06 Thread Andrey Smirnov
Lee:

This series is a number of small fixes that resulted from using the RAVE SP
driver on a wider selection of ZII devices than the initial driver was
tested on. In addition to RDU1 and RDU2, the driver is now known to
work reasonably well on SCU2 Mezz (being upstreamed currently) as well
as SPU3 (not supported by upstream yet).

Hopefully all of the changes are straightforward.

Let me know if anything needs changing.

Thanks,
Andrey Smirnov

Andrey Smirnov (6):
  mfd: rave-sp: Remove unused defines
  mfd: rave-sp: Fix incorrectly specified checksum type
  mfd: rave-sp: Initialize flow control and parity of the port
  mfd: rave-sp: Add legacy EEPROM access command translation
  mfd: rave-sp: Add legacy watchdog ping command translation
  mfd: rave-sp: Emulate CMD_GET_STATUS on device that don't support it

 drivers/mfd/rave-sp.c   | 119 +++-
 include/linux/mfd/rave-sp.h |   1 +
 2 files changed, 76 insertions(+), 44 deletions(-)

-- 
2.17.1
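
As an illustration of the per-variant get_status idea from the last
patch in the series, here is a self-contained userspace sketch. It is
not the driver code: the command IDs, struct device, the exec transport
hook and fake_legacy_exec() are hypothetical, and the version layout is
reduced to bytes so the example does not depend on packing attributes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command IDs; the real driver uses its own enum. */
enum { CMD_STATUS = 1, CMD_FW_VERSION = 2, CMD_BL_VERSION = 3 };

/* Byte-only layout, so there is no padding to worry about. */
struct version {
	uint8_t hardware;
	uint8_t major_le[2];	/* little-endian 16-bit major number */
	uint8_t minor;
	uint8_t letter[2];
};

struct status {
	struct version bootloader;
	struct version firmware;
};

struct device;

struct variant_cmds {
	int (*get_status)(struct device *dev, struct status *status);
};

struct device {
	const struct variant_cmds *cmd;
	/* Hypothetical transport hook standing in for the command exchange. */
	int (*exec)(struct device *dev, uint8_t command, void *reply, size_t len);
};

/* Devices that implement the status command answer in one exchange. */
int native_get_status(struct device *dev, struct status *status)
{
	return dev->exec(dev, CMD_STATUS, status, sizeof(*status));
}

/* Others get the same two fields from two separate version queries. */
int emulated_get_status(struct device *dev, struct status *status)
{
	int ret;

	ret = dev->exec(dev, CMD_FW_VERSION, &status->firmware,
			sizeof(status->firmware));
	if (ret)
		return ret;

	return dev->exec(dev, CMD_BL_VERSION, &status->bootloader,
			 sizeof(status->bootloader));
}

const struct variant_cmds native_cmds = { .get_status = native_get_status };
const struct variant_cmds emulated_cmds = { .get_status = emulated_get_status };

/* Callers never need to know which flavour they are talking to. */
int get_status(struct device *dev, struct status *status)
{
	assert(dev && dev->cmd && dev->cmd->get_status);
	return dev->cmd->get_status(dev, status);
}

/* Decode the little-endian major number independent of host endianness. */
unsigned int version_major(const struct version *v)
{
	return (unsigned int)v->major_le[0] | ((unsigned int)v->major_le[1] << 8);
}

/* Test double: a device that only understands the two version queries. */
int fake_legacy_exec(struct device *dev, uint8_t command, void *reply, size_t len)
{
	struct version v = { .hardware = 1, .letter = { 'A', 'A' } };

	(void)dev;
	if (len != sizeof(v))
		return -1;
	if (command == CMD_FW_VERSION)
		v.major_le[0] = 2;
	else if (command == CMD_BL_VERSION)
		v.major_le[0] = 1;
	else
		return -1;
	memcpy(reply, &v, sizeof(v));
	return 0;
}
```

Pairing fake_legacy_exec() with emulated_cmds succeeds, while pairing it
with native_cmds fails, which mirrors the situation the series is
working around.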



[PATCH 5/6] mfd: rave-sp: Add legacy watchdog ping command translation

2018-07-06 Thread Andrey Smirnov
This is needed to make the rave-sp-wdt driver properly ping the
watchdog on "legacy" firmware.

Cc: linux-kernel@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Signed-off-by: Andrey Smirnov 
---
 drivers/mfd/rave-sp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/mfd/rave-sp.c b/drivers/mfd/rave-sp.c
index a999fa721b03..eee62ba6874d 100644
--- a/drivers/mfd/rave-sp.c
+++ b/drivers/mfd/rave-sp.c
@@ -631,6 +631,8 @@ static int rave_sp_default_cmd_translate(enum 
rave_sp_command command)
return 0x14;
case RAVE_SP_CMD_SW_WDT:
return 0x1C;
+   case RAVE_SP_CMD_PET_WDT:
+   return 0x1D;
case RAVE_SP_CMD_RESET:
return 0x1E;
case RAVE_SP_CMD_RESET_REASON:
-- 
2.17.1



[PATCH 1/6] mfd: rave-sp: Remove unused defines

2018-07-06 Thread Andrey Smirnov
Remove unused defines that are left over from earlier iterations of
the driver.

Cc: linux-kernel@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Signed-off-by: Andrey Smirnov 
---
 drivers/mfd/rave-sp.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/drivers/mfd/rave-sp.c b/drivers/mfd/rave-sp.c
index 36dcd98977d6..dfa4f5f1c376 100644
--- a/drivers/mfd/rave-sp.c
+++ b/drivers/mfd/rave-sp.c
@@ -63,16 +63,6 @@
 #define RAVE_SP_TX_BUFFER_SIZE \
(RAVE_SP_STX_ETX_SIZE + 2 * RAVE_SP_RX_BUFFER_SIZE)
 
-#define RAVE_SP_BOOT_SOURCE_GET        0
-#define RAVE_SP_BOOT_SOURCE_SET        1
-
-#define RAVE_SP_RDU2_BOARD_TYPE_RMB    0
-#define RAVE_SP_RDU2_BOARD_TYPE_DEB    1
-
-#define RAVE_SP_BOOT_SOURCE_SD         0
-#define RAVE_SP_BOOT_SOURCE_EMMC       1
-#define RAVE_SP_BOOT_SOURCE_NOR        2
-
 /**
  * enum rave_sp_deframer_state - Possible state for de-framer
  *
-- 
2.17.1



[PATCH 4/6] mfd: rave-sp: Add legacy EEPROM access command translation

2018-07-06 Thread Andrey Smirnov
This is needed to make the rave-sp-eeprom driver work on "legacy"
firmware.

Cc: linux-kernel@vger.kernel.org
Cc: cphe...@gmail.com
Cc: Lucas Stach 
Cc: Nikita Yushchenko 
Cc: Lee Jones 
Signed-off-by: Andrey Smirnov 
---
 drivers/mfd/rave-sp.c   | 2 ++
 include/linux/mfd/rave-sp.h | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/mfd/rave-sp.c b/drivers/mfd/rave-sp.c
index aa75d5841ca0..a999fa721b03 100644
--- a/drivers/mfd/rave-sp.c
+++ b/drivers/mfd/rave-sp.c
@@ -635,6 +635,8 @@ static int rave_sp_default_cmd_translate(enum rave_sp_command command)
return 0x1E;
case RAVE_SP_CMD_RESET_REASON:
return 0x1F;
+   case RAVE_SP_CMD_RMB_EEPROM:
+   return 0x20;
default:
return -EINVAL;
}
diff --git a/include/linux/mfd/rave-sp.h b/include/linux/mfd/rave-sp.h
index fe0ce7bc59cf..11eef77ef976 100644
--- a/include/linux/mfd/rave-sp.h
+++ b/include/linux/mfd/rave-sp.h
@@ -21,6 +21,7 @@ enum rave_sp_command {
RAVE_SP_CMD_STATUS  = 0xA0,
RAVE_SP_CMD_SW_WDT  = 0xA1,
RAVE_SP_CMD_PET_WDT = 0xA2,
+   RAVE_SP_CMD_RMB_EEPROM  = 0xA4,
RAVE_SP_CMD_SET_BACKLIGHT   = 0xA6,
RAVE_SP_CMD_RESET   = 0xA7,
RAVE_SP_CMD_RESET_REASON= 0xA8,
-- 
2.17.1
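
For readers following the two translation patches (4/6 and 5/6), here is
a minimal userspace sketch of the mapping they extend. The enum values
and legacy codes are copied from the hunks quoted in this thread; the
function name legacy_cmd_translate() and the self-check helper are
hypothetical, and only the cases visible in those hunks are listed, so
this is an illustration rather than the driver's full table.

```c
#include <assert.h>
#include <errno.h>

/* Generic command IDs, as shown in the include/linux/mfd/rave-sp.h hunk. */
enum rave_sp_command {
	RAVE_SP_CMD_STATUS        = 0xA0,
	RAVE_SP_CMD_SW_WDT        = 0xA1,
	RAVE_SP_CMD_PET_WDT       = 0xA2,
	RAVE_SP_CMD_RMB_EEPROM    = 0xA4,
	RAVE_SP_CMD_SET_BACKLIGHT = 0xA6,
	RAVE_SP_CMD_RESET         = 0xA7,
	RAVE_SP_CMD_RESET_REASON  = 0xA8,
};

/*
 * Hypothetical stand-in for rave_sp_default_cmd_translate(): maps a
 * generic command ID to the code used by "legacy" firmware. Cases not
 * visible in the quoted hunks are deliberately left to the default.
 */
static int legacy_cmd_translate(enum rave_sp_command command)
{
	switch (command) {
	case RAVE_SP_CMD_SW_WDT:
		return 0x1C;
	case RAVE_SP_CMD_PET_WDT:	/* added by patch 5/6 */
		return 0x1D;
	case RAVE_SP_CMD_RESET:
		return 0x1E;
	case RAVE_SP_CMD_RESET_REASON:
		return 0x1F;
	case RAVE_SP_CMD_RMB_EEPROM:	/* added by patch 4/6 */
		return 0x20;
	default:
		return -EINVAL;
	}
}

/* Small self-check for the sketch; not part of the driver. */
static void legacy_cmd_translate_selfcheck(void)
{
	assert(legacy_cmd_translate(RAVE_SP_CMD_PET_WDT) == 0x1D);
	assert(legacy_cmd_translate(RAVE_SP_CMD_RMB_EEPROM) == 0x20);
}
```

Without the two new cases, a watchdog ping or EEPROM access request
would fall through to the default branch and fail with -EINVAL, which
is presumably why the child drivers did not work on legacy firmware.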



[PATCH] ibmasm: don't write out of bounds in read handler

2018-07-06 Thread Jann Horn
This read handler had a lot of custom logic and wrote outside the bounds of
the provided buffer. This could lead to kernel and userspace memory
corruption. Just use simple_read_from_buffer() with a stack buffer.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: sta...@vger.kernel.org
Signed-off-by: Jann Horn 
---
NOTE: I put a "CC: stable" tag on this commit because it's a simple
change and I don't know whether bugs in this code matter; I don't
have any idea what the userland for this looks like.
If it's not important, feel free to remove the tag.

 drivers/misc/ibmasm/ibmasmfs.c | 27 +++
 1 file changed, 3 insertions(+), 24 deletions(-)

diff --git a/drivers/misc/ibmasm/ibmasmfs.c b/drivers/misc/ibmasm/ibmasmfs.c
index e05c3245930a..fa840666bdd1 100644
--- a/drivers/misc/ibmasm/ibmasmfs.c
+++ b/drivers/misc/ibmasm/ibmasmfs.c
@@ -507,35 +507,14 @@ static int remote_settings_file_close(struct inode *inode, struct file *file)
 static ssize_t remote_settings_file_read(struct file *file, char __user *buf, size_t count, loff_t *offset)
 {
void __iomem *address = (void __iomem *)file->private_data;
-   unsigned char *page;
-   int retval;
int len = 0;
unsigned int value;
-
-   if (*offset < 0)
-   return -EINVAL;
-   if (count == 0 || count > 1024)
-   return 0;
-   if (*offset != 0)
-   return 0;
-
-   page = (unsigned char *)__get_free_page(GFP_KERNEL);
-   if (!page)
-   return -ENOMEM;
+   char lbuf[20];
 
value = readl(address);
-   len = sprintf(page, "%d\n", value);
-
-   if (copy_to_user(buf, page, len)) {
-   retval = -EFAULT;
-   goto exit;
-   }
-   *offset += len;
-   retval = len;
+   len = snprintf(lbuf, sizeof(lbuf), "%d\n", value);
 
-exit:
-   free_page((unsigned long)page);
-   return retval;
+   return simple_read_from_buffer(buf, count, offset, lbuf, len);
 }
 
 static ssize_t remote_settings_file_write(struct file *file, const char __user *ubuff, size_t count, loff_t *offset)
-- 
2.18.0.203.gfac676dfb9-goog
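
To make the bounds handling concrete, the following is a userspace
model of what the patched handler relies on. It is a sketch under
stated assumptions: model_read_from_buffer() and model_settings_read()
are hypothetical names, memcpy() stands in for copy_to_user() (so the
-EFAULT path is not modelled), and the register value is passed in
instead of being read with readl().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Model of the clamping done by a simple_read_from_buffer()-style
 * helper: copy at most `count` bytes, never read past `available`,
 * and advance *ppos by the number of bytes copied.
 */
static long model_read_from_buffer(char *to, size_t count, long long *ppos,
				   const char *from, size_t available)
{
	long long pos = *ppos;

	if (pos < 0)
		return -EINVAL;
	if ((size_t)pos >= available || count == 0)
		return 0;
	if (count > available - (size_t)pos)
		count = available - (size_t)pos;
	memcpy(to, from + pos, count);
	*ppos = pos + (long long)count;
	return (long)count;
}

/* Shape of the patched read handler: format into a small stack buffer. */
static long model_settings_read(unsigned int value, char *buf, size_t count,
				long long *offset)
{
	char lbuf[20];
	int len = snprintf(lbuf, sizeof(lbuf), "%d\n", (int)value);

	assert(len > 0 && (size_t)len < sizeof(lbuf));
	return model_read_from_buffer(buf, count, offset, lbuf, (size_t)len);
}
```

With a one-byte destination, the model copies exactly one byte and
leaves the rest for a later call. The removed code instead copied all
`len` bytes regardless of `count`, which is the out-of-bounds write
described in the commit message.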



[PATCH v4 1/3] uio: use request_threaded_irq instead

2018-07-06 Thread xiubli
From: Xiubo Li 

Preparing for the change to use a mutex lock.

Signed-off-by: Xiubo Li 
---
 drivers/uio/uio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index e8f4ac9..b4b2ae1 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -902,8 +902,9 @@ int __uio_register_device(struct module *owner,
 * FDs at the time of unregister and therefore may not be
 * freed until they are released.
 */
-   ret = request_irq(info->irq, uio_interrupt,
- info->irq_flags, info->name, idev);
+   ret = request_threaded_irq(info->irq, NULL, uio_interrupt,
+  info->irq_flags, info->name, idev);
+
if (ret)
goto err_request_irq;
}
-- 
1.8.3.1



[PATCH v4 2/3] uio: change to use the mutex lock instead of the spin lock

2018-07-06 Thread xiubli
From: Xiubo Li 

We are hitting a regression with the following commit:

commit a93e7b331568227500186a465fee3c2cb5dffd1f
Author: Hamish Martin 
Date:   Mon May 14 13:32:23 2018 +1200

uio: Prevent device destruction while fds are open

The problem is the addition of spin_lock_irqsave in uio_write. This
leads to hitting uio_write -> copy_from_user -> _copy_from_user ->
might_fault and the logs filling up with sleeping warnings.

I also noticed that some uio drivers allocate memory, sleep, and grab
mutexes from callouts like open() and release(), and uio is now doing
spin_lock_irqsave while calling them.

Reported-by: Mike Christie 
CC: Hamish Martin 
Reviewed-by: Hamish Martin 
Signed-off-by: Xiubo Li 
---
 drivers/uio/uio.c  | 32 +---
 include/linux/uio_driver.h |  2 +-
 2 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index b4b2ae1..655ade4 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -433,7 +433,6 @@ static int uio_open(struct inode *inode, struct file *filep)
struct uio_device *idev;
struct uio_listener *listener;
int ret = 0;
-   unsigned long flags;
 
mutex_lock(&minor_lock);
idev = idr_find(&uio_idr, iminor(inode));
@@ -460,10 +459,10 @@ static int uio_open(struct inode *inode, struct file *filep)
listener->event_count = atomic_read(&idev->event);
filep->private_data = listener;
 
-   spin_lock_irqsave(&idev->info_lock, flags);
+   mutex_lock(&idev->info_lock);
if (idev->info && idev->info->open)
ret = idev->info->open(idev->info, inode);
-   spin_unlock_irqrestore(&idev->info_lock, flags);
+   mutex_unlock(&idev->info_lock);
if (ret)
goto err_infoopen;
 
@@ -495,12 +494,11 @@ static int uio_release(struct inode *inode, struct file *filep)
int ret = 0;
struct uio_listener *listener = filep->private_data;
struct uio_device *idev = listener->dev;
-   unsigned long flags;
 
-   spin_lock_irqsave(&idev->info_lock, flags);
+   mutex_lock(&idev->info_lock);
if (idev->info && idev->info->release)
ret = idev->info->release(idev->info, inode);
-   spin_unlock_irqrestore(&idev->info_lock, flags);
+   mutex_unlock(&idev->info_lock);
 
module_put(idev->owner);
kfree(listener);
@@ -513,12 +511,11 @@ static __poll_t uio_poll(struct file *filep, poll_table *wait)
struct uio_listener *listener = filep->private_data;
struct uio_device *idev = listener->dev;
__poll_t ret = 0;
-   unsigned long flags;
 
-   spin_lock_irqsave(&idev->info_lock, flags);
+   mutex_lock(&idev->info_lock);
if (!idev->info || !idev->info->irq)
ret = -EIO;
-   spin_unlock_irqrestore(&idev->info_lock, flags);
+   mutex_unlock(&idev->info_lock);
 
if (ret)
return ret;
@@ -537,12 +534,11 @@ static ssize_t uio_read(struct file *filep, char __user *buf,
DECLARE_WAITQUEUE(wait, current);
ssize_t retval = 0;
s32 event_count;
-   unsigned long flags;
 
-   spin_lock_irqsave(&idev->info_lock, flags);
+   mutex_lock(&idev->info_lock);
if (!idev->info || !idev->info->irq)
retval = -EIO;
-   spin_unlock_irqrestore(&idev->info_lock, flags);
+   mutex_unlock(&idev->info_lock);
 
if (retval)
return retval;
@@ -592,9 +588,8 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
struct uio_device *idev = listener->dev;
ssize_t retval;
s32 irq_on;
-   unsigned long flags;
 
-   spin_lock_irqsave(&idev->info_lock, flags);
+   mutex_lock(&idev->info_lock);
if (!idev->info || !idev->info->irq) {
retval = -EIO;
goto out;
@@ -618,7 +613,7 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
retval = idev->info->irqcontrol(idev->info, irq_on);
 
 out:
-   spin_unlock_irqrestore(&idev->info_lock, flags);
+   mutex_unlock(&idev->info_lock);
return retval ? retval : sizeof(s32);
 }
 
@@ -865,7 +860,7 @@ int __uio_register_device(struct module *owner,
 
idev->owner = owner;
idev->info = info;
-   spin_lock_init(&idev->info_lock);
+   mutex_init(&idev->info_lock);
init_waitqueue_head(&idev->wait);
atomic_set(&idev->event, 0);
 
@@ -929,7 +924,6 @@ int __uio_register_device(struct module *owner,
 void uio_unregister_device(struct uio_info *info)
 {
struct uio_device *idev;
-   unsigned long flags;
 
if (!info || !info->uio_dev)
return;
@@ -943,9 +937,9 @@ void uio_unregister_device(struct uio_info *info)
if (info->irq && info->irq != UIO_IRQ_CUSTOM)
free_irq(info->irq, idev);
 
-   spin_lock_irqsave(&idev->info_lock, flags);
+   mutex_lock(&idev->info_lock);
idev->info = NULL;
-   spin_unlock_irqrestore(&idev->info_lock, flags);
+   mutex_unlock(&idev->info_lock);
 
device_unregister(&idev->dev);
 
diff --git 

[PATCH v4 3/3] uio: fix crash after the device is unregistered

2018-07-06 Thread xiubli
From: Xiubo Li 

For the target_core_user use case, after the device is unregistered
it may still be open in user space, and the kernel will then crash like this:

[  251.163692] BUG: unable to handle kernel NULL pointer dereference at 
0008
[  251.163820] IP: [] show_name+0x23/0x40 [uio]
[  251.163965] PGD 800062694067 PUD 62696067 PMD 0
[  251.164097] Oops:  [#1] SMP
...
[  251.165605]  e1000 mptscsih mptbase drm_panel_orientation_quirks dm_mirror 
dm_region_hash dm_log dm_mod
[  251.166014] CPU: 0 PID: 13380 Comm: tcmu-runner Kdump: loaded Not tainted 
3.10.0-916.el7.test.x86_64 #1
[  251.166381] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform, BIOS 6.00 05/19/2017
[  251.166747] task: 971eb91db0c0 ti: 971e9e384000 task.ti: 
971e9e384000
[  251.167137] RIP: 0010:[]  [] 
show_name+0x23/0x40 [uio]
[  251.167563] RSP: 0018:971e9e387dc8  EFLAGS: 00010282
[  251.167978] RAX:  RBX: 971e9e3f8000 RCX: 971eb8368d98
[  251.168408] RDX: 971e9e3f8000 RSI: c0738084 RDI: 971e9e3f8000
[  251.168856] RBP: 971e9e387dd0 R08: 971eb8bc0018 R09: 
[  251.169296] R10: 1000 R11: a09d444d R12: a1076e80
[  251.169750] R13: 971e9e387f18 R14: 0001 R15: 971e9cfb1c80
[  251.170213] FS:  7ff37d175880() GS:971ebb60() 
knlGS:
[  251.170693] CS:  0010 DS:  ES:  CR0: 80050033
[  251.171248] CR2: 0008 CR3: 001f6000 CR4: 003607f0
[  251.172071] DR0:  DR1:  DR2: 
[  251.172640] DR3:  DR6: fffe0ff0 DR7: 0400
[  251.173236] Call Trace:
[  251.173789]  [] dev_attr_show+0x23/0x60
[  251.174356]  [] ? mutex_lock+0x12/0x2f
[  251.174892]  [] sysfs_kf_seq_show+0xcf/0x1f0
[  251.175433]  [] kernfs_seq_show+0x26/0x30
[  251.175981]  [] seq_read+0x110/0x3f0
[  251.176609]  [] kernfs_fop_read+0xf5/0x160
[  251.177158]  [] vfs_read+0x9f/0x170
[  251.177707]  [] SyS_read+0x7f/0xf0
[  251.178268]  [] system_call_fastpath+0x1c/0x21
[  251.178823] Code: 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 
89 d3 e8 7e 96 56 e0 48 8b 80 d8 02 00 00 48 89 df 48 c7 c6 84 80 73 c0 <48> 8b 
50 08 31 c0 e8 e2 67 44 e0 5b 48 98 5d c3 0f 1f 00 66 2e
[  251.180115] RIP  [] show_name+0x23/0x40 [uio]
[  251.180820]  RSP 
[  251.181473] CR2: 0008

CC: Hamish Martin 
CC: Mike Christie 
Reviewed-by: Hamish Martin 
Signed-off-by: Xiubo Li 
---
 drivers/uio/uio.c | 104 +-
 1 file changed, 88 insertions(+), 16 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 655ade4..5d421d7 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -215,7 +215,20 @@ static ssize_t name_show(struct device *dev,
 struct device_attribute *attr, char *buf)
 {
struct uio_device *idev = dev_get_drvdata(dev);
-   return sprintf(buf, "%s\n", idev->info->name);
+   int ret;
+
+   mutex_lock(&idev->info_lock);
+   if (!idev->info) {
+   ret = -EINVAL;
+   dev_err(dev, "the device has been unregistered\n");
+   goto out;
+   }
+
+   ret = sprintf(buf, "%s\n", idev->info->name);
+
+out:
+   mutex_unlock(&idev->info_lock);
+   return ret;
 }
 static DEVICE_ATTR_RO(name);
 
@@ -223,7 +236,20 @@ static ssize_t version_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
struct uio_device *idev = dev_get_drvdata(dev);
-   return sprintf(buf, "%s\n", idev->info->version);
+   int ret;
+
+   mutex_lock(&idev->info_lock);
+   if (!idev->info) {
+   ret = -EINVAL;
+   dev_err(dev, "the device has been unregistered\n");
+   goto out;
+   }
+
+   ret = sprintf(buf, "%s\n", idev->info->version);
+
+out:
+   mutex_unlock(&idev->info_lock);
+   return ret;
 }
 static DEVICE_ATTR_RO(version);
 
@@ -415,11 +441,15 @@ void uio_event_notify(struct uio_info *info)
 static irqreturn_t uio_interrupt(int irq, void *dev_id)
 {
struct uio_device *idev = (struct uio_device *)dev_id;
-   irqreturn_t ret = idev->info->handler(irq, idev->info);
+   irqreturn_t ret;
+
+   mutex_lock(&idev->info_lock);
 
+   ret = idev->info->handler(irq, idev->info);
if (ret == IRQ_HANDLED)
uio_event_notify(idev->info);
 
+   mutex_unlock(&idev->info_lock);
return ret;
 }
 
@@ -460,6 +490,12 @@ static int uio_open(struct inode *inode, struct file *filep)
filep->private_data = listener;
 
mutex_lock(&idev->info_lock);
+   if (!idev->info) {
+   mutex_unlock(&idev->info_lock);
+   ret = -EINVAL;
+   ret = -EINVAL;
+   goto err_alloc_listener;
+   }
+
if (idev->info && idev->info->open)
ret = idev->info->open(idev->info, inode);
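
(The diff above is cut off in this archive.) The pattern the patch
applies to each accessor can be sketched in userspace as follows. This
is only a model under stated assumptions: struct model_info, struct
model_device and both functions are hypothetical stand-ins, and a
pthread mutex plays the role of idev->info_lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct uio_info and struct uio_device. */
struct model_info {
	const char *name;
};

struct model_device {
	pthread_mutex_t info_lock;	/* plays the role of idev->info_lock */
	struct model_info *info;	/* cleared to NULL on unregister */
};

/* Shape of the patched name_show(): test the pointer under the lock. */
static int model_name_show(struct model_device *idev, char *buf, size_t len)
{
	int ret;

	assert(buf != NULL && len > 0);

	pthread_mutex_lock(&idev->info_lock);
	if (!idev->info) {
		ret = -EINVAL;	/* device has already been unregistered */
		goto out;
	}
	ret = snprintf(buf, len, "%s\n", idev->info->name);
out:
	pthread_mutex_unlock(&idev->info_lock);
	return ret;
}

/* Shape of uio_unregister_device(): clear the pointer under the same lock. */
static void model_unregister(struct model_device *idev)
{
	pthread_mutex_lock(&idev->info_lock);
	idev->info = NULL;
	pthread_mutex_unlock(&idev->info_lock);
}
```

Because both the reader and the unregister path take the same lock, a
reader either sees a valid info pointer for the whole access or sees
NULL and bails out with -EINVAL, instead of dereferencing NULL as in
the oops quoted in the commit message.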
 

[PATCH v4 0/3] uio: fix potential crash bug

2018-07-06 Thread xiubli
From: Xiubo Li 

V2:
- resend with some small fixes

V3:
- switch to use request_threaded_irq

V4:
- remove useless checking code, thanks Mike.
- Thanks very much for the review from Hamish and Mike.


Xiubo Li (3):
  uio: use request_threaded_irq instead
  uio: change to use the mutex lock instead of the spin lock
  uio: fix crash after the device is unregistered

 drivers/uio/uio.c  | 139 +
 include/linux/uio_driver.h |   2 +-
 2 files changed, 104 insertions(+), 37 deletions(-)

-- 
1.8.3.1



[RFC] Add BPF_SYNCHRONIZE bpf(2) command

2018-07-06 Thread Daniel Colascione
BPF_SYNCHRONIZE waits for any BPF programs active at the time of
BPF_SYNCHRONIZE to complete, allowing userspace to ensure atomicity of
RCU data structure operations with respect to active programs. For
example, userspace can update a map->map entry to point to a new map,
use BPF_SYNCHRONIZE to wait for any BPF programs using the old map to
complete, and then drain the old map without fear that BPF programs
may still be updating it.

Signed-off-by: Daniel Colascione 
---
 include/uapi/linux/bpf.h |  1 +
 kernel/bpf/syscall.c | 14 ++
 2 files changed, 15 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b7db3261c62d..4365c50e8055 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -98,6 +98,7 @@ enum bpf_cmd {
BPF_BTF_LOAD,
BPF_BTF_GET_FD_BY_ID,
BPF_TASK_FD_QUERY,
+   BPF_SYNCHRONIZE,
 };
 
 enum bpf_map_type {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index d10ecd78105f..60ec7811846e 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2272,6 +2272,20 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
if (sysctl_unprivileged_bpf_disabled && !capable(CAP_SYS_ADMIN))
return -EPERM;
 
+   if (cmd == BPF_SYNCHRONIZE) {
+   if (uattr != NULL || size != 0)
+   return -EINVAL;
+   err = security_bpf(cmd, NULL, 0);
+   if (err < 0)
+   return err;
+   /* BPF programs are run with preempt disabled, so
+* synchronize_sched is sufficient even with
+* RCU_PREEMPT.
+*/
+   synchronize_sched();
+   return 0;
+   }
+
err = bpf_check_uarg_tail_zero(uattr, sizeof(attr), size);
if (err)
return err;
-- 
2.18.0.203.gfac676dfb9-goog



[PATCH] staging: speakup: fix wraparound in uaccess length check

2018-07-06 Thread Jann Horn
If softsynthx_read() is called with `count < 3`, `count - 3` wraps, causing
the loop to copy as much data as available to the provided buffer. If
softsynthx_read() is invoked through sys_splice(), this causes an
unbounded kernel write; but even when userspace just reads from it
normally, a small size could cause userspace crashes.

Fixes: 425e586cf95b ("speakup: add unicode variant of /dev/softsynth")
Cc: sta...@vger.kernel.org
Signed-off-by: Jann Horn 
---

Reproducer (kernel overflows userspace stack, resulting in segfault):

root@debian:/home/user# cat test.c
#include <unistd.h>
int main(void) {
  char buf[1];
  read(0, buf, 1);
}
root@debian:/home/user# gcc -o test test.c
root@debian:/home/user# ./test < /dev/softsynth
[do some stuff on the console so that it prints text]
Segmentation fault
root@debian:/home/user# strace ./test < /dev/softsynth
execve("./test", ["./test"], [/* 21 vars */]) = 0
brk(NULL)   = 0x55d5977da000
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f2ca2cac000
access("/etc/ld.so.preload", R_OK)  = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=103509, ...}) = 0
mmap(NULL, 103509, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f2ca2c92000
close(3)= 0
access("/etc/ld.so.nohwcap", F_OK)  = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\3\2\0\0\0\0\0"..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1689360, ...}) = 0
mmap(NULL, 3795360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x7f2ca26ed000
mprotect(0x7f2ca2882000, 2097152, PROT_NONE) = 0
mmap(0x7f2ca2a82000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x195000) = 0x7f2ca2a82000
mmap(0x7f2ca2a88000, 14752, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f2ca2a88000
close(3)= 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x7f2ca2c9
arch_prctl(ARCH_SET_FS, 0x7f2ca2c90700) = 0
mprotect(0x7f2ca2a82000, 16384, PROT_READ) = 0
mprotect(0x55d596384000, 4096, PROT_READ) = 0
mprotect(0x7f2ca2caf000, 4096, PROT_READ) = 0
munmap(0x7f2ca2c92000, 103509)  = 0
read(0, "\30\0012s\0015p\0015v\0011x\0010b\0010o\0015f\n", 1) = 23
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=NULL} ---
+++ killed by SIGSEGV +++
Segmentation fault


 drivers/staging/speakup/speakup_soft.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/speakup/speakup_soft.c b/drivers/staging/speakup/speakup_soft.c
index a61bc41b82d7..f9b405bd052d 100644
--- a/drivers/staging/speakup/speakup_soft.c
+++ b/drivers/staging/speakup/speakup_soft.c
@@ -228,7 +228,7 @@ static ssize_t softsynthx_read(struct file *fp, char __user *buf, size_t count,
init = get_initstring();
 
/* Keep 3 bytes available for a 16bit UTF-8-encoded character */
-   while (chars_sent <= count - 3) {
+   while (chars_sent < count) {
if (speakup_info.flushing) {
speakup_info.flushing = 0;
ch = '\x18';
@@ -257,6 +257,8 @@ static ssize_t softsynthx_read(struct file *fp, char __user *buf, size_t count,
0x80 | (ch & 0x3f)
};
 
+   if (chars_sent + 2 > count)
+   break;
if (copy_to_user(cp, s, sizeof(s)))
return -EFAULT;
 
@@ -269,6 +271,8 @@ static ssize_t softsynthx_read(struct file *fp, char __user *buf, size_t count,
0x80 | (ch & 0x3f)
};
 
+   if (chars_sent + 3 > count)
+   break;
if (copy_to_user(cp, s, sizeof(s)))
return -EFAULT;
 
-- 
2.18.0.203.gfac676dfb9-goog
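
The wraparound in the old loop condition can be reproduced in userspace.
A minimal sketch, assuming chars_sent has the same unsigned type as the
size_t count in the function signature; old_check and room_for are
illustrative names, not functions from the driver.

```c
#include <stddef.h>

/* Old loop condition: with size_t operands, count - 3 wraps to a huge
 * value whenever count < 3, so the test passes for any chars_sent. */
int old_check(size_t chars_sent, size_t count)
{
	return chars_sent <= count - 3;
}

/* Check in the spirit of the patch: add the bytes needed to chars_sent
 * and compare against count, so nothing is subtracted from count. */
int room_for(size_t chars_sent, size_t need, size_t count)
{
	return chars_sent + need <= count;
}
```

With count = 1, old_check() still reports room after 23 bytes have been
sent, which matches the read() in the strace output returning 23 for a
1-byte buffer, while room_for() refuses a 3-byte sequence immediately.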



Re: [PATCH] KASAN: use-after-free Read in rdma_listen

2018-07-06 Thread Jason Gunthorpe
On Sat, Jul 07, 2018 at 03:41:30AM +0200, Tomas Bortoli wrote:
> I don't have a background on usage or internals of the driver at issue
> but I hope these clues will help in finding the proper fix.

I think anything is useful, thanks..

The truth is that nobody is left that seems to really understand this
code and syzkaller has shown it is full of various bugs..

If there is someone out there that would like to tackle it, let me
know. There might be a possibility to support such work.

Jason


Re: [PATCH 11/14] dt-bindings: fsi: Document binding for the fsi-master-ast-cf "device"

2018-07-06 Thread Benjamin Herrenschmidt
On Thu, 2018-07-05 at 10:08 -0600, Rob Herring wrote:
> 
> > It's not really a SOC block from a vendor, it's a pseudo-device in a
> > way. The current one that doesn't use the coldfire offload is just
> > compatible "fsi-master-gpio".
> > 
> > I can add a vendor but what should it be ? aspeed because it runs on
> > the aspeed SoCs only ? ibm because we wrote it and FSI is an IBM
> > protocol ?
> 
> I would say aspeed as it is tied to their chip.
> 
> > 
> > - doesn't make sense here though.
> 
> But you do already have  in the compatible, but in a slightly
> different form and position. And "cf" is the block.
>
> So I'd propose: aspeed,ast2500-cf-fsi-master

Ok, I'll do that.

> > 
> > > > +
> > > > + - clock-gpios = ;: GPIO for FSI clock
> > > > + - data-gpios = ; : GPIO for FSI data signal
> > > > + - enable-gpios = ;   : GPIO for enable signal
> > > > + - trans-gpios = ;: GPIO for voltage 
> > > > translator enable
> > > > + - mux-gpios = ;  : GPIO for pin multiplexing with 
> > > > other
> > > 
> > > So the gpio info is passed to the CF? Otherwise, what's the point of
> > > having these in DT?
> > 
> > In the original version you are looking at, they are not passed to the
> > CF per-se but the driver does use aspeed GPIO specific APIs to
> > configure them to be owned by the CF, so we need the references.
> 
> Okay.
> 
> > However, I've just reworked the ucode with a few tricks to avoid losing
> > significant performance, so that we can indeed pass them to the CF,
> > thus avoiding the need for a per-system image, so the above are here to
> > stay.
> > 
> > > > +  functions (eg, external FSI 
> > > > masters)
> > > > + - memory-region = ;  : Reference to the reserved 
> > > > memory for
> > > > +  the ColdFire. Must be 2M 
> > > > aligned on
> > > > + AST2400 and 1M aligned on AST2500
> > > > + - sram = ;   : Reference to the SRAM 
> > > > node.
> > > > + - cvic = ;   : Reference to the CVIC 
> > > > node.
> > > 
> > > Vendor prefixes.
> > 
> > On what ? Why would an "sram" pointer have a vendor prefix ? Or a
> > memory region pointer ?
> 
> memory-region is a standard property. sram and cvic are not, so should
> have vendor prefixes. However, perhaps we should add a common "sram"
> property to sram/sram.txt.

Hrm... originally vendor prefixes on properties were for things that
didn't have a binding afaik, i.e. a way for an f-code driver to stash
things in the DT that were vendor specific and retrieve them from the
OS driver, for example.

Here with well defined bindings it's rather bloaty don't you think ? I
don't strongly object to doing it, it's just a bit ... odd.

Cheers,
Ben.



Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered

2018-07-06 Thread Xiubo Li

On 2018/7/7 2:58, Mike Christie wrote:

On 07/05/2018 09:57 PM, xiu...@redhat.com wrote:

  void uio_event_notify(struct uio_info *info)
  {
-   struct uio_device *idev = info->uio_dev;
+   struct uio_device *idev;
+
+   if (!info)
+   return;
+
+   idev = info->uio_dev;
  

For this one too, I am not sure if it is needed.

uio_interrupt -> uio_event_notify. See other mail.

driver XYZ -> uio_event_notify. I think drivers need to handle this and
set some bits and/or perform some cleanup to make sure they are not
calling uio_event_notify after it has called uio_unregister_device. The
problem with the above test is if they do not they could have called
uio_unregister_device right after the info test so you could still hit
the problem.


When we are in tcmu_destroy_device(), if the netlink notify event to 
userspace is not successful then TCMU will call the uio unregister, 
which will set idev->info = NULL without closing and deleting the 
device in userspace. But TCMU could still queue cmds to the ring 
buffer, and then uio_event_notify will be called.


In this case the crash would only happen when idev->info is used. Currently 
there is no need for this check, so I will remove it for now.


Thanks,

BRs



[PATCH] KASAN: use-after-free Read in rdma_listen

2018-07-06 Thread Tomas Bortoli
Hi,

I spent some time debugging the issue syzkaller found, named in the subject:

https://syzkaller.appspot.com/bug?id=b8febdb3c7c8c1f1b606fb903cee66b21b2fd02f

And I've backtracked the UAF to the fact that the cma_listen_on_all()
function adds "id_priv->list" to the global var "listen_any_list" but
then such element is not removed in the rdma_destroy_id() function
(though I've seen that the call to cma_release_dev() in
rdma_destroy_id() should do the removal but doesn't get executed).

Therefore, if a program allocates a "struct rdma_cm_id" (through
ucma_open + ucma_create_id), then executes cma_listen_on_all(), then
frees the struct and repeat, during the second execution of
cma_listen_on_all() the kernel will try to update the references of the
freed node, triggering the UAF. I was able to fix the UAF with this ugly
patch:

--- b/drivers/infiniband/core/cma.c    2018-07-07 02:28:03.214589868 +0200
+++ a/drivers/infiniband/core/cma.c    2018-07-07 03:35:44.325301216 +0200
@@ -1678,6 +1678,11 @@ void rdma_destroy_id(struct rdma_cm_id *
     mutex_lock(&id_priv->handler_mutex);
     mutex_unlock(&id_priv->handler_mutex);
 
+    mutex_lock(&lock);
+    if(id_priv->list.next!=0 && id_priv->list.prev!=0)
+        list_del(&id_priv->list);
+    mutex_unlock(&lock);
+
     if (id_priv->cma_dev) {
         rdma_restrack_del(&id_priv->res);
         if (rdma_cap_ib_cm(id_priv->id.device, 1)) {

Note: I only tested this patch against the shortest reproducer for this
issue (not any other use of rdma_cm):

https://syzkaller.appspot.com/text?tag=ReproC=1334f10f80

I had to add that "if" in the patch because running the reproducer
(after several iterations) provoked a NULL dereference in the added
list_del() call: for some reason I haven't figured out yet, the next
and prev pointers of the list in question sometimes get zeroed (by what?).


Moreover, I noticed that running the reproducer for a "long" time exhausts
all the available memory. To spot the memory leaks I recompiled with:

CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK_EARLY_LOG_SIZE=1

The reproducer induces, apparently, 2 memory leaks reported by kmemleak:

unreferenced object 0x880069f49d40 (size 512):
  comm "repro", pid 4263, jiffies 4294722196 (age 688.262s)
  hex dump (first 32 bytes):
    00 b8 13 5a 00 88 ff ff 40 9d f4 69 00 88 ff ff  ...Z@..i
    0a 00 98 a6 00 00 00 00 fe 80 00 00 00 00 00 00  
  backtrace:
    [<75a2f334>] kmem_cache_alloc_trace+0x1b2/0x3d0
    [<75fd9fea>] rdma_resolve_ip+0xc0/0x6b0
    [<33592b0b>] rdma_resolve_addr+0x490/0x2580
    [] ucma_resolve_ip+0x193/0x260
    [<68f1c2b7>] ucma_write+0x2ec/0x3f0
    [<015692cc>] __vfs_write+0x107/0x920
    [<9528b010>] vfs_write+0x189/0x510
    [<1a5d169b>] ksys_write+0xfa/0x240
    [] __x64_sys_write+0x73/0xb0
    [<71590ffb>] do_syscall_64+0x18c/0x760
    [<3c31113f>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [<59247e9d>] 0x


unreferenced object 0x88006c0c0bc0 (size 576):
  comm "repro", pid 4261, jiffies 4294722191 (age 688.261s)
  hex dump (first 32 bytes):
    00 02 00 00 00 00 00 00 80 b8 07 6c 00 88 ff ff  ...l
    b0 7d 2c 6b 00 88 ff ff d8 0b 0c 6c 00 88 ff ff  .},k...l
  backtrace:
    [<39511ef2>] kmem_cache_alloc+0x1b2/0x3d0
    [<106bf668>] radix_tree_node_alloc.constprop.18+0x5e/0x2e0
    [<5b2f026d>] idr_get_free+0x9f5/0x1000
    [<445baa5a>] idr_alloc_u32+0x1bc/0x3d0
    [<7fd1b6f4>] idr_alloc+0xfd/0x190
    [] cma_alloc_port+0xb0/0x170
    [<8f968f9e>] rdma_bind_addr+0x1252/0x1f00
    [] rdma_resolve_addr+0x39e/0x2580
    [] ucma_resolve_ip+0x193/0x260
    [<68f1c2b7>] ucma_write+0x2ec/0x3f0
    [<015692cc>] __vfs_write+0x107/0x920
    [<9528b010>] vfs_write+0x189/0x510
    [<1a5d169b>] ksys_write+0xfa/0x240
    [] __x64_sys_write+0x73/0xb0
    [<71590ffb>] do_syscall_64+0x18c/0x760
    [<3c31113f>] entry_SYSCALL_64_after_hwframe+0x49/0xbe

I don't have a background on usage or internals of the driver at issue
but I hope these clues will help in finding the proper fix.

Tomas
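
The unlink-before-free pattern discussed above can be sketched in
userspace C. This is an illustration only, not the kernel's list.h and
not a proposed fix for cma.c: every name here (list_unlink,
all_listeners, struct listener, run_cycles) is hypothetical. The sketch
assumes each node is initialised to point at itself on creation, which
makes unlinking safe even for a node that was never added, so no NULL
test on next/prev is needed.

```c
#include <stdlib.h>

struct list_head { struct list_head *next, *prev; };

/* Global list standing in for a "listen on any device" list. */
struct list_head all_listeners = { &all_listeners, &all_listeners };

void list_init(struct list_head *h) { h->next = h; h->prev = h; }
int list_is_empty(const struct list_head *h) { return h->next == h; }

void list_add_head(struct list_head *n, struct list_head *h)
{
	n->next = h->next;
	n->prev = h;
	h->next->prev = n;
	h->next = n;
}

/* Unlink and re-initialise, so unlinking twice (or unlinking a node
 * that was never added) is harmless. */
void list_unlink(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

struct listener { struct list_head list; int id; };

struct listener *listener_create(int id)
{
	struct listener *l = malloc(sizeof(*l));
	if (!l)
		return NULL;
	list_init(&l->list);
	l->id = id;
	return l;
}

/* Always drop the node from the global list before freeing it, so the
 * list never keeps a pointer into freed memory. */
void listener_destroy(struct listener *l)
{
	list_unlink(&l->list);
	free(l);
}

/* Create, register and destroy n listeners; returns 1 if the global
 * list is empty afterwards, -1 on allocation failure. */
int run_cycles(int n)
{
	for (int i = 0; i < n; i++) {
		struct listener *l = listener_create(i);
		if (!l)
			return -1;
		list_add_head(&l->list, &all_listeners);
		listener_destroy(l);
	}
	return list_is_empty(&all_listeners);
}
```

If listener_destroy() skipped list_unlink(), the second iteration of
run_cycles() would write through the freed node's pointers in
list_add_head(), which is the shape of the reported use-after-free.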




Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered

2018-07-06 Thread Xiubo Li

On 2018/7/7 2:23, Mike Christie wrote:

On 07/05/2018 09:57 PM, xiu...@redhat.com wrote:

  static irqreturn_t uio_interrupt(int irq, void *dev_id)
  {
struct uio_device *idev = (struct uio_device *)dev_id;
-   irqreturn_t ret = idev->info->handler(irq, idev->info);
+   irqreturn_t ret;
+
+   mutex_lock(&idev->info_lock);
+   if (!idev->info) {
+   ret = IRQ_NONE;
+   goto out;
+   }
  
+	ret = idev->info->handler(irq, idev->info);

if (ret == IRQ_HANDLED)
uio_event_notify(idev->info);
  
+out:

+   mutex_unlock(&idev->info_lock);
return ret;
  }


Do you need the interrupt related changes in this patch and the first
one?
Actually, the NULL checking is not a must, we can remove this. But the 
lock/unlock is needed.

  When we do uio_unregister_device -> free_irq does free_irq return
when there are no longer running interrupt handlers that we requested?

If that is not the case then I think we can hit a similar bug. We do:

__uio_register_device -> device_register -> device's refcount goes to
zero so we do -> uio_device_release -> kfree(idev)

and if it is possible the interrupt handler could still run after
free_irq then we would end up doing:

uio_interrupt -> mutex_lock(&idev->info_lock) -> idev access freed memory.


I think this shouldn't happen, because free_irq() does not return 
until any executing interrupt handlers for this IRQ have completed.


Thanks,
BRs
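
The check-under-lock idea in the quoted hunk can be sketched with a
pthread mutex. This is a userspace analogy only: struct dev,
dev_interrupt, dev_unregister and dev_demo are hypothetical names, and
the sketch says nothing about which lock type is appropriate in kernel
interrupt context. The point it shows is that the NULL test and the use
of info sit in the same critical section as the code that clears info.

```c
#include <pthread.h>
#include <stddef.h>

struct info {
	int (*handler)(struct info *);
	int calls;
};

struct dev {
	pthread_mutex_t info_lock;
	struct info *info;	/* cleared on unregister */
};

/* Test and use info under the lock, so an unregister cannot slip in
 * between the check and the handler call. */
int dev_interrupt(struct dev *d)
{
	int ret = 0;

	pthread_mutex_lock(&d->info_lock);
	if (d->info)
		ret = d->info->handler(d->info);
	pthread_mutex_unlock(&d->info_lock);
	return ret;
}

/* Clear info under the same lock. */
void dev_unregister(struct dev *d)
{
	pthread_mutex_lock(&d->info_lock);
	d->info = NULL;
	pthread_mutex_unlock(&d->info_lock);
}

static int count_handler(struct info *i)
{
	return ++i->calls;
}

/* Returns 1 if the handler ran once before unregister and not after. */
int dev_demo(void)
{
	struct info inf = { count_handler, 0 };
	struct dev d = { .info_lock = PTHREAD_MUTEX_INITIALIZER, .info = &inf };
	int before, after;

	before = dev_interrupt(&d);
	dev_unregister(&d);
	after = dev_interrupt(&d);
	return before == 1 && after == 0 && inf.calls == 1;
}
```

A bare NULL test outside the lock would not give this guarantee, since
the unregister path could run right after the test.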



Re: 4.17.x won't boot due to "x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G"

2018-07-06 Thread Masahiro Yamada
2018-07-07 1:29 GMT+09:00 Kirill A. Shutemov :
> On Fri, Jul 06, 2018 at 11:13:02PM +0900, Masahiro Yamada wrote:
>> >> LDFLAGS is for internal-use.
>> >> Please do not override it from the command line.
>> >
>> > Can we generate a build error if a user try to override LDFLAGS, CFLAGS or
>> > other critical internal-use-only variables?
>>
>> Yes, Make can check where variables came from.
>
> I think we should do this.
>
>> >> make LDFLAGS_KERNEL=...  LDFLAGS_MODULE=...
>> >> will allow you to append linker flags.
>> >
>> > Okay. It makes me wonder if we should taint kernel in such cases?
>> > Custom compiler/linker flags are risky and can lead to weird bugs.
>>
>> OK.
>> So, what problem are we discussing?
>
> Users set custom LDFLAGS/CFLAGS and break kernel. Then report bug that
> hard to debug. See
>
> https://bugzilla.kernel.org/show_bug.cgi?id=200385


CFLAGS is only used under tools/.
Passing CFLAGS probably has no effect on the kernel.

In Linux makefiles,
KBUILD_ prefixed variables are used internally.

KBUILD_CFLAGS, KBUILD_CPPFLAGS, KBUILD_AFLAGS, etc.


LDFLAGS is an exception.  I do not know why.
Renaming LDFLAGS to KBUILD_LDFLAGS
will make the code consistent.

At least, it will avoid overriding flags by accident.

Of course, users still can change KBUILD_LDFLAGS
if they really want.

The build system could add belt-and-braces checks for that,
but it is arguable since
there are lots and lots of internal variables.




> and start of this thread:
>
> https://lore.kernel.org/lkml/20180701213243.ga20...@trogon.sfo.coreos.systems/
>
> It took me a while to track down the issue. I blamed linker for a while.
>
>> > I've got it wrong. *Any* LDFLAGS option passed to make this way:
>> >
>> >  make LDFLAGS="..."
>>
>> In your previous mail, I thought you were asking me how to pass
>> custom linker flags.
>>
>> If not, we do not need to think about that case.
>> Just say "Do not do that".
>
> At least we need to make user aware about risk of setting custom flags.
>
> --
>  Kirill A. Shutemov



-- 
Best Regards
Masahiro Yamada


kernel BUG at mm/shmem.c:LINE!

2018-07-06 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:     526674536360 Add linux-next specific files for 20180706
git tree:        linux-next
console output:  https://syzkaller.appspot.com/x/log.txt?x=116d16fc40
kernel config:   https://syzkaller.appspot.com/x/.config?x=c8d1cfc0cb798e48
dashboard link:  https://syzkaller.appspot.com/bug?extid=b8e0dfee3fd8c9012771
compiler:        gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro: https://syzkaller.appspot.com/x/repro.syz?x=170e462c40
C reproducer:    https://syzkaller.appspot.com/x/repro.c?x=15f1ba2c40

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b8e0dfee3fd8c9012...@syzkaller.appspotmail.com

raw: 02fffc001028 ea0007011dc8 ea0007058b48 8801a7576ab8
raw: 016e 8801a7588930 0003 8801d9a44c80
page dumped because: VM_BUG_ON_PAGE(page_to_pgoff(page) != index)
page->mem_cgroup:8801d9a44c80
[ cut here ]
kernel BUG at mm/shmem.c:815!
invalid opcode:  [#1] SMP KASAN
CPU: 0 PID: 4429 Comm: syz-executor697 Not tainted 4.18.0-rc3-next-20180706+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

RIP: 0010:shmem_undo_range+0xdaa/0x29a0 mm/shmem.c:815
Code: 00 0f 85 bd 19 00 00 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 a5 f0 d6 ff 48 c7 c6 e0 32 f1 87 4c 89 e7 e8 16 10 05 00 <0f> 0b e8 8f f0 d6 ff 49 8d 7c 24 20 48 89 f8 48 c1 e8 03 80 3c 18

RSP: 0018:8801ab88e158 EFLAGS: 00010246
RAX:  RBX: dc00 RCX: 
RDX:  RSI: 81aaab95 RDI: ed0035711c18
RBP: 8801ab88e8d0 R08: 8801a7af04c0 R09: ed003b5c4fc0
R10: ed003b5c4fc0 R11: 8801dae27e07 R12: ea0007058b00
R13: 8801ab88e8a8 R14: 0001 R15: 016e
FS:  () GS:8801dae0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 004b625c CR3: 08e6a000 CR4: 001406f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
 shmem_truncate_range+0x27/0xa0 mm/shmem.c:971
 shmem_evict_inode+0x3b2/0xcb0 mm/shmem.c:1071
 evict+0x4ae/0x990 fs/inode.c:558
 iput_final fs/inode.c:1508 [inline]
 iput+0x635/0xaa0 fs/inode.c:1534
 dentry_unlink_inode+0x4ae/0x640 fs/dcache.c:377
 __dentry_kill+0x44c/0x7a0 fs/dcache.c:569
 dentry_kill+0xc9/0x5a0 fs/dcache.c:688
 dput.part.26+0x66b/0x7a0 fs/dcache.c:849
 dput+0x15/0x20 fs/dcache.c:831
 __fput+0x558/0x930 fs/file_table.c:235
 fput+0x15/0x20 fs/file_table.c:251
 task_work_run+0x1ec/0x2a0 kernel/task_work.c:113
 exit_task_work include/linux/task_work.h:22 [inline]
 do_exit+0x1b08/0x2750 kernel/exit.c:869
 do_group_exit+0x177/0x440 kernel/exit.c:972
 get_signal+0x88e/0x1970 kernel/signal.c:2467
 do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
 exit_to_usermode_loop+0x2e0/0x370 arch/x86/entry/common.c:162
 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
 do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441c29
Code: Bad RIP value.
RSP: 002b:7fff6e973338 EFLAGS: 0246 ORIG_RAX: 0028
RAX: ffe0 RBX:  RCX: 00441c29
RDX: 2180 RSI: 0004 RDI: 0003
RBP: 7fff6e973350 R08: 0001 R09: 
R10: 0a42 R11: 0246 R12: 
R13: 0005 R14:  R15: 
Modules linked in:
Dumping ftrace buffer:
   (ftrace buffer empty)
---[ end trace 68c2f261fd3bbf54 ]---
RIP: 0010:shmem_undo_range+0xdaa/0x29a0 mm/shmem.c:815
Code: 00 0f 85 bd 19 00 00 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 a5 f0 d6 ff 48 c7 c6 e0 32 f1 87 4c 89 e7 e8 16 10 05 00 <0f> 0b e8 8f f0 d6 ff 49 8d 7c 24 20 48 89 f8 48 c1 e8 03 80 3c 18

RSP: 0018:8801ab88e158 EFLAGS: 00010246
RAX:  RBX: dc00 RCX: 
RDX:  RSI: 81aaab95 RDI: ed0035711c18
RBP: 8801ab88e8d0 R08: 8801a7af04c0 R09: ed003b5c4fc0
R10: ed003b5c4fc0 R11: 8801dae27e07 R12: ea0007058b00
R13: 8801ab88e8a8 R14: 0001 R15: 016e
FS:  () GS:8801dae0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 00441bff CR3: 08e6a000 CR4: 001406f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep tr

[PATCH 0/7] HOSTFLAGS and HOSTLDFLAGS from the environment (new approach)

2018-07-06 Thread Laura Abbott
Hi,

This is a follow up from
lkml.kernel.org/r/<20180329004805.7278-1-labb...@redhat.com> . I was
pointed to a previous suggestion (https://lkml.org/lkml/2018/2/28/178)
to rename the variables for better namespacing. I hadn't seen anyone
follow up so this is a series to do just that. I split up the rename by
variable for ease of review. This was mostly just a pretty
straight-forward search and replace. I intentionally didn't Cc all the
outside maintainers on the rename because I wanted to make sure this
approach was correct before bothering people.

This series also includes two other issues I found to finally have all
the generated files pick up the host flags. The first seems to be a typo
of CHOSTFLAGS instead of HOSTCFLAGS. The second is a missing HOSTLDFLAGS
when linking fixdep. I put these two separately in case there was
interest in picking them up for stable.

For the interested, the full command line I've been testing with:

HOSTCFLAGS="-O2 -g -pipe -Wall -Werror=format-security
-Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions
-fstack-protector-strong -grecord-gcc-switches
-specs=/usr/lib/rpm/redhat/redhat-hardened-cc1
-specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic
-fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection"

HOSTLDFLAGS="-Wl,-z,relro -Wl,-z,now
-specs=/usr/lib/rpm/redhat/redhat-hardened-ld"

Thanks,
Laura

Laura Abbott (7):
  tools: build: Fixup host c flags
  tools: build: Use HOSTLDFLAGS with fixdep
  treewide: Rename HOSTCFLAGS -> KBUILD_HOSTCFLAGS
  treewide: Rename HOSTCXXFLAGS to KBUILD_HOSTCXXFLAGS
  treewide: Rename HOSTLDFLAGS to KBUILD_HOSTLDFLAGS
  treewide: Rename HOST_LOADLIBES to KBUILD_HOSTLDLIBS
  Kbuild: Use HOST*FLAGS options from the command line

 Makefile   | 17 +
 arch/alpha/boot/Makefile   |  2 +-
 arch/s390/tools/Makefile   |  4 ++--
 arch/x86/tools/Makefile|  4 ++--
 net/bpfilter/Makefile  |  4 ++--
 samples/bpf/Makefile   | 28 ++--
 samples/connector/Makefile |  2 +-
 samples/hidraw/Makefile|  2 +-
 samples/seccomp/Makefile   | 24 
 samples/statx/Makefile |  2 +-
 samples/uhid/Makefile  |  2 +-
 scripts/Kbuild.include |  2 +-
 scripts/Makefile   |  4 ++--
 scripts/Makefile.host  | 28 ++--
 scripts/dtc/Makefile   | 24 
 scripts/genksyms/Makefile  |  4 ++--
 scripts/kconfig/Makefile   | 14 +++---
 tools/build/Build.include  |  2 +-
 tools/build/Makefile   |  2 +-
 tools/objtool/Makefile |  4 ++--
 20 files changed, 88 insertions(+), 87 deletions(-)

-- 
2.17.1



[PATCH 3/7] treewide: Rename HOSTCFLAGS -> KBUILD_HOSTCFLAGS

2018-07-06 Thread Laura Abbott
In preparation for enabling command line CFLAGS, rename HOSTCFLAGS to
KBUILD_HOSTCFLAGS, which holds the internal-use-only flags. This should not
have any visible effects.

Signed-off-by: Laura Abbott 
---
 Makefile   |  4 ++--
 arch/alpha/boot/Makefile   |  2 +-
 arch/s390/tools/Makefile   |  4 ++--
 arch/x86/tools/Makefile|  4 ++--
 net/bpfilter/Makefile  |  2 +-
 samples/bpf/Makefile   | 26 +-
 samples/connector/Makefile |  2 +-
 samples/hidraw/Makefile|  2 +-
 samples/seccomp/Makefile   | 24 
 samples/statx/Makefile |  2 +-
 samples/uhid/Makefile  |  2 +-
 scripts/Kbuild.include |  2 +-
 scripts/Makefile   |  4 ++--
 scripts/Makefile.host  |  4 ++--
 scripts/dtc/Makefile   | 24 
 scripts/genksyms/Makefile  |  4 ++--
 scripts/kconfig/Makefile   | 12 ++--
 tools/build/Build.include  |  2 +-
 tools/objtool/Makefile |  2 +-
 19 files changed, 64 insertions(+), 64 deletions(-)

diff --git a/Makefile b/Makefile
index d15ac32afbaf..16cab22b1f40 100644
--- a/Makefile
+++ b/Makefile
@@ -359,7 +359,7 @@ HOST_LFS_LIBS := $(shell getconf LFS_LIBS)
 
 HOSTCC   = gcc
 HOSTCXX  = g++
-HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
+KBUILD_HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
-fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
 HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
 HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
@@ -429,7 +429,7 @@ KBUILD_LDFLAGS_MODULE := -T $(srctree)/scripts/module-common.lds
 LDFLAGS :=
 GCC_PLUGINS_CFLAGS :=
 
-export ARCH SRCARCH CONFIG_SHELL HOSTCC HOSTCFLAGS CROSS_COMPILE AS LD CC
+export ARCH SRCARCH CONFIG_SHELL HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE AS LD CC
 export CPP AR NM STRIP OBJCOPY OBJDUMP HOSTLDFLAGS HOST_LOADLIBES
 export MAKE LEX YACC AWK GENKSYMS INSTALLKERNEL PERL PYTHON PYTHON2 PYTHON3 UTS_MACHINE
 export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
diff --git a/arch/alpha/boot/Makefile b/arch/alpha/boot/Makefile
index 0cbe4c59d3ce..dfccf0195306 100644
--- a/arch/alpha/boot/Makefile
+++ b/arch/alpha/boot/Makefile
@@ -14,7 +14,7 @@ targets   := vmlinux.gz vmlinux \
   tools/bootpzh bootloader bootpheader bootpzheader 
 OBJSTRIP   := $(obj)/tools/objstrip
 
-HOSTCFLAGS := -Wall -I$(objtree)/usr/include
+KBUILD_HOSTCFLAGS  := -Wall -I$(objtree)/usr/include
 BOOTCFLAGS += -I$(objtree)/$(obj) -I$(srctree)/$(obj)
 
 # SRM bootable image.  Copy to offset 512 of a partition.
diff --git a/arch/s390/tools/Makefile b/arch/s390/tools/Makefile
index 48cdac1143a9..40afcebf4e67 100644
--- a/arch/s390/tools/Makefile
+++ b/arch/s390/tools/Makefile
@@ -14,8 +14,8 @@ kapi: $(kapi-hdrs-y)
 hostprogs-y+= gen_facilities
 hostprogs-y+= gen_opcode_table
 
-HOSTCFLAGS_gen_facilities.o += -Wall $(LINUXINCLUDE)
-HOSTCFLAGS_gen_opcode_table.o += -Wall $(LINUXINCLUDE)
+KBUILD_HOSTCFLAGS_gen_facilities.o += -Wall $(LINUXINCLUDE)
+KBUILD_HOSTCFLAGS_gen_opcode_table.o += -Wall $(LINUXINCLUDE)
 
 # Ensure output directory exists
 _dummy := $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
diff --git a/arch/x86/tools/Makefile b/arch/x86/tools/Makefile
index 09af7ff53044..140de7589aa2 100644
--- a/arch/x86/tools/Makefile
+++ b/arch/x86/tools/Makefile
@@ -29,9 +29,9 @@ posttest: $(obj)/insn_decoder_test vmlinux $(obj)/insn_sanity
 hostprogs-y+= insn_decoder_test insn_sanity
 
 # -I needed for generated C source and C source which in the kernel tree.
-HOSTCFLAGS_insn_decoder_test.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/uapi/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/uapi/
+KBUILD_HOSTCFLAGS_insn_decoder_test.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/uapi/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/uapi/
 
-HOSTCFLAGS_insn_sanity.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/
+KBUILD_HOSTCFLAGS_insn_sanity.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/
 
 # Dependencies are also needed.
 $(obj)/insn_decoder_test.o: $(srctree)/arch/x86/lib/insn.c $(srctree)/arch/x86/lib/inat.c $(srctree)/arch/x86/include/asm/inat_types.h $(srctree)/arch/x86/include/asm/inat.h $(srctree)/arch/x86/include/asm/insn.h $(objtree)/arch/x86/lib/inat-tables.c
diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 39c6980b5d99..70beeb4ad806 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -5,7 +5,7 @@
 
 hostprogs-y := bpfilter_umh
 bpfilter_umh-objs := main.o
-HOSTCFLAGS += -I. -Itools/include/ -Itools/include/uapi
+KBUILD_HOSTCFLAGS += -I. -Itools/include/ -Itools/include/uapi
 HOSTCC := $(CC)
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git 

[PATCH 5/7] treewide: Rename HOSTLDFLAGS to KBUILD_HOSTLDFLAGS

2018-07-06 Thread Laura Abbott


In preparation for enabling command line LDFLAGS, rename HOSTLDFLAGS to
KBUILD_HOSTLDFLAGS, which holds the internal-use-only flags. This should not
have any visible effects.

Signed-off-by: Laura Abbott 
---
 Makefile   |  4 ++--
 net/bpfilter/Makefile  |  2 +-
 scripts/Makefile.host  | 10 +-
 tools/build/Makefile   |  2 +-
 tools/objtool/Makefile |  2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/Makefile b/Makefile
index 9c3f1864d678..319736b80016 100644
--- a/Makefile
+++ b/Makefile
@@ -362,7 +362,7 @@ HOSTCXX  = g++
 KBUILD_HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
-fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
 KBUILD_HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
-HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
+KBUILD_HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
 HOST_LOADLIBES := $(HOST_LFS_LIBS)
 
 # Make variables (CC, etc...)
@@ -430,7 +430,7 @@ LDFLAGS :=
 GCC_PLUGINS_CFLAGS :=
 
 export ARCH SRCARCH CONFIG_SHELL HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE AS LD CC
-export CPP AR NM STRIP OBJCOPY OBJDUMP HOSTLDFLAGS HOST_LOADLIBES
+export CPP AR NM STRIP OBJCOPY OBJDUMP KBUILD_HOSTLDFLAGS HOST_LOADLIBES
 export MAKE LEX YACC AWK GENKSYMS INSTALLKERNEL PERL PYTHON PYTHON2 PYTHON3 UTS_MACHINE
 export HOSTCXX KBUILD_HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
 
diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 70beeb4ad806..0947ee7f70d5 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -12,7 +12,7 @@ ifeq ($(CONFIG_BPFILTER_UMH), y)
 # builtin bpfilter_umh should be compiled with -static
 # since rootfs isn't mounted at the time of __init
 # function is called and do_execv won't find elf interpreter
-HOSTLDFLAGS += -static
+KBUILD_HOSTLDFLAGS += -static
 endif
 
 $(obj)/bpfilter_umh_blob.o: $(obj)/bpfilter_umh
diff --git a/scripts/Makefile.host b/scripts/Makefile.host
index 1c153cc5b530..07acf5779f55 100644
--- a/scripts/Makefile.host
+++ b/scripts/Makefile.host
@@ -84,7 +84,7 @@ hostcxx_flags  = -Wp,-MD,$(depfile) $(__hostcxx_flags)
 # Create executable from a single .c file
 # host-csingle -> Executable
 quiet_cmd_host-csingle = HOSTCC  $@
-  cmd_host-csingle = $(HOSTCC) $(hostc_flags) $(HOSTLDFLAGS) -o $@ $< \
+  cmd_host-csingle = $(HOSTCC) $(hostc_flags) $(KBUILD_HOSTLDFLAGS) -o $@ $< \
$(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
 $(host-csingle): $(obj)/%: $(src)/%.c FORCE
$(call if_changed_dep,host-csingle)
@@ -92,7 +92,7 @@ $(host-csingle): $(obj)/%: $(src)/%.c FORCE
 # Link an executable based on list of .o files, all plain c
 # host-cmulti -> executable
 quiet_cmd_host-cmulti  = HOSTLD  $@
-  cmd_host-cmulti  = $(HOSTCC) $(HOSTLDFLAGS) -o $@ \
+  cmd_host-cmulti  = $(HOSTCC) $(KBUILD_HOSTLDFLAGS) -o $@ \
  $(addprefix $(obj)/,$($(@F)-objs)) \
  $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
 $(host-cmulti): FORCE
@@ -109,7 +109,7 @@ $(host-cobjs): $(obj)/%.o: $(src)/%.c FORCE
 # Link an executable based on list of .o files, a mixture of .c and .cc
 # host-cxxmulti -> executable
 quiet_cmd_host-cxxmulti= HOSTLD  $@
-  cmd_host-cxxmulti= $(HOSTCXX) $(HOSTLDFLAGS) -o $@ \
+  cmd_host-cxxmulti= $(HOSTCXX) $(KBUILD_HOSTLDFLAGS) -o $@ \
   $(foreach o,objs cxxobjs,\
   $(addprefix $(obj)/,$($(@F)-$(o)))) \
  $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
@@ -143,7 +143,7 @@ $(host-cxxshobjs): $(obj)/%.o: $(src)/%.c FORCE
 # Link a shared library, based on position independent .o files
 # *.o -> .so shared library (host-cshlib)
 quiet_cmd_host-cshlib  = HOSTLLD -shared $@
-  cmd_host-cshlib  = $(HOSTCC) $(HOSTLDFLAGS) -shared -o $@ \
+  cmd_host-cshlib  = $(HOSTCC) $(KBUILD_HOSTLDFLAGS) -shared -o $@ \
  $(addprefix $(obj)/,$($(@F:.so=-objs))) \
  $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
 $(host-cshlib): FORCE
@@ -153,7 +153,7 @@ $(call multi_depend, $(host-cshlib), .so, -objs)
 # Link a shared library, based on position independent .o files
 # *.o -> .so shared library (host-cxxshlib)
 quiet_cmd_host-cxxshlib= HOSTLLD -shared $@
-  cmd_host-cxxshlib= $(HOSTCXX) $(HOSTLDFLAGS) -shared -o $@ \
+  cmd_host-cxxshlib= $(HOSTCXX) $(KBUILD_HOSTLDFLAGS) -shared -o $@ \
  $(addprefix $(obj)/,$($(@F:.so=-objs))) \
  $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
 $(host-cxxshlib): FORCE
diff --git a/tools/build/Makefile b/tools/build/Makefile
index cd0fd2cd165b..5f7903e8eabd 100644
--- a/tools/build/Makefile
+++ b/tools/build/Makefile
@@ -43,7 +43,7 @@ $(OUTPUT)fixdep-in.o: FORCE
$(Q)$(MAKE) $(build)=fixdep
 
 $(OUTPUT)fixdep: $(OUTPUT)fixdep-in.o
-   $(QUIET_LINK)$(HOSTCC) $(HOSTLDFLAGS) $(LDFLAGS) -o $@ $<
+   $(QUIET_LINK)$(HOSTCC) $(KBUILD_HOSTLDFLAGS) $(LDFLAGS) -o $@ $<
 
 FORCE:
 

Re: [PATCH] f2fs: split discard command in prior to block layer

2018-07-06 Thread Jaegeuk Kim
On 07/07, Chao Yu wrote:
> Hi Jaegeuk,
> 
> On 2018/7/7 6:45, Jaegeuk Kim wrote:
> > On 07/04, Chao Yu wrote:
> >> From: Chao Yu 
> >>
> >> Some devices have a small max_{hw,}discard_sectors, so in
> >> __blkdev_issue_discard() one large discard bio can be split
> >> into multiple small discard bios, resulting in heavy load on the IO
> >> scheduler and device, which can stall other sync IO for a long time.
> >>
> >> Now, f2fs is trying to control discard commands more carefully,
> >> in order to reduce conflicts between discard IO and user IO
> >> and improve application performance. So in this patch, we
> >> split the discard bio in f2fs, ahead of the block layer, to avoid
> >> issuing multiple discard bios in a short time.
> > 
> > Hi Chao,
> > 
> > In terms of # of candidates, can we control this when actually issuing
> > the discard commands?
> 
> IIUC, you mean that once we pick one discard entry in the rbtree, if
> max_{hw,}discard_sectors is smaller than the size of this discard, then we can
> split it into smaller ones by discard_sectors, and issue just one or some of
> them?

Yes, sort of.
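
The idea discussed above, splitting one discard entry into pieces no larger
than the device limit at issue time, can be sketched roughly as follows. This
is only an illustration under assumptions: `struct discard_chunk` and
`split_discard()` are hypothetical names, not f2fs code, and plain block
counts stand in for the real bio handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration type; not an f2fs structure. */
struct discard_chunk {
	unsigned int start;	/* first block of the chunk */
	unsigned int len;	/* number of blocks in the chunk */
};

/*
 * Split the range [start, start + len) into chunks of at most
 * max_blocks blocks each. Up to max_chunks entries are written to out.
 * Returns the total number of chunks the range needs, which may be
 * larger than max_chunks.
 */
size_t split_discard(unsigned int start, unsigned int len,
		     unsigned int max_blocks,
		     struct discard_chunk *out, size_t max_chunks)
{
	size_t n = 0;

	assert(max_blocks > 0);

	while (len > 0) {
		/* Each piece is capped at the device limit. */
		unsigned int cur = len < max_blocks ? len : max_blocks;

		if (n < max_chunks) {
			out[n].start = start;
			out[n].len = cur;
		}
		n++;
		start += cur;
		len -= cur;
	}
	return n;
}
```

With a 512-block limit, a 1000-block discard starting at block 100 would be
cut into two chunks, (100, 512) and (612, 488); a caller could then issue only
some of them per round.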

> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> >>
> >> Signed-off-by: Chao Yu 
> >> ---
> >>  fs/f2fs/f2fs.h| 13 ++---
> >>  fs/f2fs/segment.c | 25 ++---
> >>  2 files changed, 28 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >> index a9da5a089cb4..a09d2b2d9520 100644
> >> --- a/fs/f2fs/f2fs.h
> >> +++ b/fs/f2fs/f2fs.h
> >> @@ -178,7 +178,6 @@ enum {
> >>  
> >>  #define MAX_DISCARD_BLOCKS(sbi)   BLKS_PER_SEC(sbi)
> >>  #define DEF_MAX_DISCARD_REQUEST   8   /* issue 8 discards per round */
> >> -#define DEF_MAX_DISCARD_LEN   512 /* Max. 2MB per discard */
> >>  #define DEF_MIN_DISCARD_ISSUE_TIME   50  /* 50 ms, if exists */
> >>  #define DEF_MID_DISCARD_ISSUE_TIME   500 /* 500 ms, if device busy */
> >>  #define DEF_MAX_DISCARD_ISSUE_TIME   60000   /* 60 s, if no candidates */
> >> @@ -701,22 +700,22 @@ static inline void set_extent_info(struct extent_info *ei, unsigned int fofs,
> >>  }
> >>  
> >>  static inline bool __is_discard_mergeable(struct discard_info *back,
> >> -  struct discard_info *front)
> >> +  struct discard_info *front, unsigned int max_len)
> >>  {
> >>return (back->lstart + back->len == front->lstart) &&
> >> -  (back->len + front->len < DEF_MAX_DISCARD_LEN);
> >> +  (back->len + front->len <= max_len);
> >>  }
> >>  
> >>  static inline bool __is_discard_back_mergeable(struct discard_info *cur,
> >> -  struct discard_info *back)
> >> +  struct discard_info *back, unsigned int max_len)
> >>  {
> >> -  return __is_discard_mergeable(back, cur);
> >> +  return __is_discard_mergeable(back, cur, max_len);
> >>  }
> >>  
> >>  static inline bool __is_discard_front_mergeable(struct discard_info *cur,
> >> -  struct discard_info *front)
> >> +  struct discard_info *front, unsigned int max_len)
> >>  {
> >> -  return __is_discard_mergeable(cur, front);
> >> +  return __is_discard_mergeable(cur, front, max_len);
> >>  }
> >>  
> >>  static inline bool __is_extent_mergeable(struct extent_info *back,
> >> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >> index 4648561e2bfd..8e417a12684d 100644
> >> --- a/fs/f2fs/segment.c
> >> +++ b/fs/f2fs/segment.c
> >> @@ -1086,6 +1086,9 @@ static void __update_discard_tree_range(struct f2fs_sb_info *sbi,
> >>struct discard_cmd *dc;
> >>struct discard_info di = {0};
> >>struct rb_node **insert_p = NULL, *insert_parent = NULL;
> >> +  struct request_queue *q = bdev_get_queue(bdev);
> >> +  unsigned int max_discard_blocks =
> >> +  SECTOR_TO_BLOCK(q->limits.max_discard_sectors);
> >>block_t end = lstart + len;
> >>  
> >>mutex_lock(&dcc->cmd_lock);
> >> @@ -1129,7 +1132,8 @@ static void __update_discard_tree_range(struct f2fs_sb_info *sbi,
> >>  
> >>if (prev_dc && prev_dc->state == D_PREP &&
> >>prev_dc->bdev == bdev &&
> >> -  __is_discard_back_mergeable(&di, &prev_dc->di)) {
> >> +  __is_discard_back_mergeable(&di, &prev_dc->di,
> >> +  max_discard_blocks)) {
> >>prev_dc->di.len += di.len;
> >>dcc->undiscard_blks += di.len;
> >>__relocate_discard_cmd(dcc, prev_dc);
> >> @@ -1140,7 +1144,8 @@ static void __update_discard_tree_range(struct f2fs_sb_info *sbi,
> >>  
> >>if (next_dc && next_dc->state == D_PREP &&
> >>next_dc->bdev == bdev &&
> >> -  __is_discard_front_mergeable(&di, &next_dc->di)) {
> >> +  __is_discard_front_mergeable(&di, &next_dc->di,
> >> + 
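
The merge-check change in the quoted patch (a caller-supplied `max_len` in
place of the fixed DEF_MAX_DISCARD_LEN, and `<=` in place of `<`) can be
looked at in isolation. A minimal sketch, assuming a simplified
`struct discard_info` that carries only the two fields the check reads;
`is_discard_mergeable()` here is a stand-alone copy for illustration, not the
kernel helper itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in with only the fields the check uses (assumption). */
struct discard_info {
	unsigned int lstart;	/* logical start block */
	unsigned int len;	/* length in blocks */
};

/*
 * Two extents can be merged when "front" begins exactly where "back"
 * ends and the combined length does not exceed max_len.
 */
bool is_discard_mergeable(const struct discard_info *back,
			  const struct discard_info *front,
			  unsigned int max_len)
{
	assert(back && front);

	return (back->lstart + back->len == front->lstart) &&
		(back->len + front->len <= max_len);
}
```

With `<=`, two adjacent 256-block extents merge when max_len is 512, since the
result exactly fills the limit; the earlier `<` comparison would have rejected
that case.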

[PATCH 3/7] treewide: Rename HOSTCFLAGS -> KBUILD_HOSTCFLAGS

2018-07-06 Thread Laura Abbott
In preparation for enabling command line CFLAGS, re-name HOSTCFLAGS to
KBUILD_HOSTCFLAGS as the internal use only flags. This should not have any
visible effects.

Signed-off-by: Laura Abbott 
---
 Makefile   |  4 ++--
 arch/alpha/boot/Makefile   |  2 +-
 arch/s390/tools/Makefile   |  4 ++--
 arch/x86/tools/Makefile|  4 ++--
 net/bpfilter/Makefile  |  2 +-
 samples/bpf/Makefile   | 26 +-
 samples/connector/Makefile |  2 +-
 samples/hidraw/Makefile|  2 +-
 samples/seccomp/Makefile   | 24 
 samples/statx/Makefile |  2 +-
 samples/uhid/Makefile  |  2 +-
 scripts/Kbuild.include |  2 +-
 scripts/Makefile   |  4 ++--
 scripts/Makefile.host  |  4 ++--
 scripts/dtc/Makefile   | 24 
 scripts/genksyms/Makefile  |  4 ++--
 scripts/kconfig/Makefile   | 12 ++--
 tools/build/Build.include  |  2 +-
 tools/objtool/Makefile |  2 +-
 19 files changed, 64 insertions(+), 64 deletions(-)

diff --git a/Makefile b/Makefile
index d15ac32afbaf..16cab22b1f40 100644
--- a/Makefile
+++ b/Makefile
@@ -359,7 +359,7 @@ HOST_LFS_LIBS := $(shell getconf LFS_LIBS)
 
 HOSTCC   = gcc
 HOSTCXX  = g++
-HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
+KBUILD_HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
-fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
 HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
 HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
@@ -429,7 +429,7 @@ KBUILD_LDFLAGS_MODULE := -T 
$(srctree)/scripts/module-common.lds
 LDFLAGS :=
 GCC_PLUGINS_CFLAGS :=
 
-export ARCH SRCARCH CONFIG_SHELL HOSTCC HOSTCFLAGS CROSS_COMPILE AS LD CC
+export ARCH SRCARCH CONFIG_SHELL HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE AS LD 
CC
 export CPP AR NM STRIP OBJCOPY OBJDUMP HOSTLDFLAGS HOST_LOADLIBES
 export MAKE LEX YACC AWK GENKSYMS INSTALLKERNEL PERL PYTHON PYTHON2 PYTHON3 
UTS_MACHINE
 export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
diff --git a/arch/alpha/boot/Makefile b/arch/alpha/boot/Makefile
index 0cbe4c59d3ce..dfccf0195306 100644
--- a/arch/alpha/boot/Makefile
+++ b/arch/alpha/boot/Makefile
@@ -14,7 +14,7 @@ targets   := vmlinux.gz vmlinux \
   tools/bootpzh bootloader bootpheader bootpzheader 
 OBJSTRIP   := $(obj)/tools/objstrip
 
-HOSTCFLAGS := -Wall -I$(objtree)/usr/include
+KBUILD_HOSTCFLAGS  := -Wall -I$(objtree)/usr/include
 BOOTCFLAGS += -I$(objtree)/$(obj) -I$(srctree)/$(obj)
 
 # SRM bootable image.  Copy to offset 512 of a partition.
diff --git a/arch/s390/tools/Makefile b/arch/s390/tools/Makefile
index 48cdac1143a9..40afcebf4e67 100644
--- a/arch/s390/tools/Makefile
+++ b/arch/s390/tools/Makefile
@@ -14,8 +14,8 @@ kapi: $(kapi-hdrs-y)
 hostprogs-y+= gen_facilities
 hostprogs-y+= gen_opcode_table
 
-HOSTCFLAGS_gen_facilities.o += -Wall $(LINUXINCLUDE)
-HOSTCFLAGS_gen_opcode_table.o += -Wall $(LINUXINCLUDE)
+KBUILD_HOSTCFLAGS_gen_facilities.o += -Wall $(LINUXINCLUDE)
+KBUILD_HOSTCFLAGS_gen_opcode_table.o += -Wall $(LINUXINCLUDE)
 
 # Ensure output directory exists
 _dummy := $(shell [ -d '$(kapi)' ] || mkdir -p '$(kapi)')
diff --git a/arch/x86/tools/Makefile b/arch/x86/tools/Makefile
index 09af7ff53044..140de7589aa2 100644
--- a/arch/x86/tools/Makefile
+++ b/arch/x86/tools/Makefile
@@ -29,9 +29,9 @@ posttest: $(obj)/insn_decoder_test vmlinux $(obj)/insn_sanity
 hostprogs-y+= insn_decoder_test insn_sanity
 
 # -I needed for generated C source and C source which in the kernel tree.
-HOSTCFLAGS_insn_decoder_test.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/uapi/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/uapi/
+KBUILD_HOSTCFLAGS_insn_decoder_test.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/uapi/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/uapi/
 
-HOSTCFLAGS_insn_sanity.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/
+KBUILD_HOSTCFLAGS_insn_sanity.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/
 
 # Dependencies are also needed.
 $(obj)/insn_decoder_test.o: $(srctree)/arch/x86/lib/insn.c $(srctree)/arch/x86/lib/inat.c $(srctree)/arch/x86/include/asm/inat_types.h $(srctree)/arch/x86/include/asm/inat.h $(srctree)/arch/x86/include/asm/insn.h $(objtree)/arch/x86/lib/inat-tables.c
diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 39c6980b5d99..70beeb4ad806 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -5,7 +5,7 @@
 
 hostprogs-y := bpfilter_umh
 bpfilter_umh-objs := main.o
-HOSTCFLAGS += -I. -Itools/include/ -Itools/include/uapi
+KBUILD_HOSTCFLAGS += -I. -Itools/include/ -Itools/include/uapi
 HOSTCC := $(CC)
 
 ifeq ($(CONFIG_BPFILTER_UMH), y)
diff --git 

[PATCH 5/7] treewide: Rename HOSTLDFLAGS to KBUILD_HOSTLDFLAGS

2018-07-06 Thread Laura Abbott


In preparation for enabling command line LDFLAGS, re-name HOSTLDFLAGS to
KBUILD_HOSTLDFLAGS as the internal use only flags. This should not have any
visible effects.

Signed-off-by: Laura Abbott 
---
 Makefile   |  4 ++--
 net/bpfilter/Makefile  |  2 +-
 scripts/Makefile.host  | 10 +++++-----
 tools/build/Makefile   |  2 +-
 tools/objtool/Makefile |  2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/Makefile b/Makefile
index 9c3f1864d678..319736b80016 100644
--- a/Makefile
+++ b/Makefile
@@ -362,7 +362,7 @@ HOSTCXX  = g++
 KBUILD_HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
-fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
 KBUILD_HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
-HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
+KBUILD_HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
 HOST_LOADLIBES := $(HOST_LFS_LIBS)
 
 # Make variables (CC, etc...)
@@ -430,7 +430,7 @@ LDFLAGS :=
 GCC_PLUGINS_CFLAGS :=
 
 export ARCH SRCARCH CONFIG_SHELL HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE AS LD CC
-export CPP AR NM STRIP OBJCOPY OBJDUMP HOSTLDFLAGS HOST_LOADLIBES
+export CPP AR NM STRIP OBJCOPY OBJDUMP KBUILD_HOSTLDFLAGS HOST_LOADLIBES
 export MAKE LEX YACC AWK GENKSYMS INSTALLKERNEL PERL PYTHON PYTHON2 PYTHON3 UTS_MACHINE
 export HOSTCXX KBUILD_HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
 
diff --git a/net/bpfilter/Makefile b/net/bpfilter/Makefile
index 70beeb4ad806..0947ee7f70d5 100644
--- a/net/bpfilter/Makefile
+++ b/net/bpfilter/Makefile
@@ -12,7 +12,7 @@ ifeq ($(CONFIG_BPFILTER_UMH), y)
 # builtin bpfilter_umh should be compiled with -static
 # since rootfs isn't mounted at the time of __init
 # function is called and do_execv won't find elf interpreter
-HOSTLDFLAGS += -static
+KBUILD_HOSTLDFLAGS += -static
 endif
 
 $(obj)/bpfilter_umh_blob.o: $(obj)/bpfilter_umh
diff --git a/scripts/Makefile.host b/scripts/Makefile.host
index 1c153cc5b530..07acf5779f55 100644
--- a/scripts/Makefile.host
+++ b/scripts/Makefile.host
@@ -84,7 +84,7 @@ hostcxx_flags  = -Wp,-MD,$(depfile) $(__hostcxx_flags)
 # Create executable from a single .c file
 # host-csingle -> Executable
 quiet_cmd_host-csingle = HOSTCC  $@
-  cmd_host-csingle = $(HOSTCC) $(hostc_flags) $(HOSTLDFLAGS) -o $@ $< \
+  cmd_host-csingle = $(HOSTCC) $(hostc_flags) $(KBUILD_HOSTLDFLAGS) -o $@ $< \
$(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
 $(host-csingle): $(obj)/%: $(src)/%.c FORCE
$(call if_changed_dep,host-csingle)
@@ -92,7 +92,7 @@ $(host-csingle): $(obj)/%: $(src)/%.c FORCE
 # Link an executable based on list of .o files, all plain c
 # host-cmulti -> executable
 quiet_cmd_host-cmulti  = HOSTLD  $@
-  cmd_host-cmulti  = $(HOSTCC) $(HOSTLDFLAGS) -o $@ \
+  cmd_host-cmulti  = $(HOSTCC) $(KBUILD_HOSTLDFLAGS) -o $@ \
  $(addprefix $(obj)/,$($(@F)-objs)) \
  $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
 $(host-cmulti): FORCE
@@ -109,7 +109,7 @@ $(host-cobjs): $(obj)/%.o: $(src)/%.c FORCE
 # Link an executable based on list of .o files, a mixture of .c and .cc
 # host-cxxmulti -> executable
 quiet_cmd_host-cxxmulti= HOSTLD  $@
-  cmd_host-cxxmulti= $(HOSTCXX) $(HOSTLDFLAGS) -o $@ \
+  cmd_host-cxxmulti= $(HOSTCXX) $(KBUILD_HOSTLDFLAGS) -o $@ \
  $(foreach o,objs cxxobjs,\
   $(addprefix $(obj)/,$($(@F)-$(o)))) \
  $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
@@ -143,7 +143,7 @@ $(host-cxxshobjs): $(obj)/%.o: $(src)/%.c FORCE
 # Link a shared library, based on position independent .o files
 # *.o -> .so shared library (host-cshlib)
 quiet_cmd_host-cshlib  = HOSTLLD -shared $@
-  cmd_host-cshlib  = $(HOSTCC) $(HOSTLDFLAGS) -shared -o $@ \
+  cmd_host-cshlib  = $(HOSTCC) $(KBUILD_HOSTLDFLAGS) -shared -o $@ \
  $(addprefix $(obj)/,$($(@F:.so=-objs))) \
  $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
 $(host-cshlib): FORCE
@@ -153,7 +153,7 @@ $(call multi_depend, $(host-cshlib), .so, -objs)
 # Link a shared library, based on position independent .o files
 # *.o -> .so shared library (host-cxxshlib)
 quiet_cmd_host-cxxshlib= HOSTLLD -shared $@
-  cmd_host-cxxshlib= $(HOSTCXX) $(HOSTLDFLAGS) -shared -o $@ \
+  cmd_host-cxxshlib= $(HOSTCXX) $(KBUILD_HOSTLDFLAGS) -shared -o $@ \
  $(addprefix $(obj)/,$($(@F:.so=-objs))) \
  $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
 $(host-cxxshlib): FORCE
diff --git a/tools/build/Makefile b/tools/build/Makefile
index cd0fd2cd165b..5f7903e8eabd 100644
--- a/tools/build/Makefile
+++ b/tools/build/Makefile
@@ -43,7 +43,7 @@ $(OUTPUT)fixdep-in.o: FORCE
$(Q)$(MAKE) $(build)=fixdep
 
 $(OUTPUT)fixdep: $(OUTPUT)fixdep-in.o
-   $(QUIET_LINK)$(HOSTCC) $(HOSTLDFLAGS) $(LDFLAGS) -o $@ $<
+   $(QUIET_LINK)$(HOSTCC) $(KBUILD_HOSTLDFLAGS) $(LDFLAGS) -o $@ $<
 
 FORCE:
 

Re: [PATCH] f2fs: split discard command in prior to block layer

2018-07-06 Thread Jaegeuk Kim
On 07/07, Chao Yu wrote:
> Hi Jaegeuk,
> 
> On 2018/7/7 6:45, Jaegeuk Kim wrote:
> > On 07/04, Chao Yu wrote:
> >> From: Chao Yu 
> >>
> >> Some devices have a small max_{hw,}discard_sectors, so in
> >> __blkdev_issue_discard() one large discard bio can be split
> >> into multiple small discard bios, resulting in heavy load on the IO
> >> scheduler and device, which can stall other sync IO for a long time.
> >>
> >> Now, f2fs is trying to control discard commands more carefully,
> >> in order to reduce conflicts between discard IO and user IO
> >> and improve application performance, so in this patch we
> >> split the discard bio in f2fs before it reaches the block layer, to
> >> avoid issuing multiple discard bios in a short time.
> > 
> > Hi Chao,
> > 
> > In terms of # of candidates, can we control this when actually issuing
> > the discard commands?
> 
> IIUC, you mean once we pick one discard entry in the rbtree, if
> max_{hw,}discard_sectors is smaller than the size of this discard, then we
> can split it into smaller ones by discard_sectors, and just issue one or
> some of them?

Yes, sort of.
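
A minimal sketch of the idea being discussed, not the f2fs code itself: cap
each discard at the device limit when it is issued, and only merge adjacent
ranges while the merged length stays within that limit. The names
struct discard_range, ranges_mergeable() and split_discard() are hypothetical
and exist only for this illustration; max_len stands in for the value the
quoted patch derives from q->limits.max_discard_sectors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a discard entry: start block and length. */
struct discard_range {
	unsigned int lstart;	/* first block of the range */
	unsigned int len;	/* number of blocks in the range */
};

/*
 * Same shape as the merge test in the quoted patch: two ranges may merge
 * only if they are adjacent and the merged length does not exceed max_len.
 */
static int ranges_mergeable(const struct discard_range *back,
			    const struct discard_range *front,
			    unsigned int max_len)
{
	return back->lstart + back->len == front->lstart &&
	       back->len + front->len <= max_len;
}

/*
 * Split one range into pieces of at most max_len blocks.
 * Stores up to cap pieces in out and returns how many pieces the
 * complete split needs (this may be larger than cap).
 */
static size_t split_discard(struct discard_range in, unsigned int max_len,
			    struct discard_range *out, size_t cap)
{
	size_t n = 0;

	assert(max_len > 0);
	while (in.len > 0) {
		unsigned int chunk = in.len < max_len ? in.len : max_len;

		if (n < cap) {
			out[n].lstart = in.lstart;
			out[n].len = chunk;
		}
		in.lstart += chunk;
		in.len -= chunk;
		n++;
	}
	return n;
}
```

A caller that wants to issue only part of a large entry per round could take
the first few pieces from out and keep the remainder queued; how the remainder
is tracked is left open here, as it is in the thread.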

> 
> Thanks,
> 
> > 
> > Thanks,
> > 
> >>
> >> Signed-off-by: Chao Yu 
> >> ---
> >>  fs/f2fs/f2fs.h| 13 ++---
> >>  fs/f2fs/segment.c | 25 ++---
> >>  2 files changed, 28 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> >> index a9da5a089cb4..a09d2b2d9520 100644
> >> --- a/fs/f2fs/f2fs.h
> >> +++ b/fs/f2fs/f2fs.h
> >> @@ -178,7 +178,6 @@ enum {
> >>  
> >>  #define MAX_DISCARD_BLOCKS(sbi)   BLKS_PER_SEC(sbi)
> >>  #define DEF_MAX_DISCARD_REQUEST   8   /* issue 8 discards per round */
> >> -#define DEF_MAX_DISCARD_LEN   512 /* Max. 2MB per discard */
> >>  #define DEF_MIN_DISCARD_ISSUE_TIME    50  /* 50 ms, if exists */
> >>  #define DEF_MID_DISCARD_ISSUE_TIME    500 /* 500 ms, if device busy */
> >>  #define DEF_MAX_DISCARD_ISSUE_TIME    60000   /* 60 s, if no candidates */
> >> @@ -701,22 +700,22 @@ static inline void set_extent_info(struct extent_info *ei, unsigned int fofs,
> >>  }
> >>  
> >>  static inline bool __is_discard_mergeable(struct discard_info *back,
> >> -  struct discard_info *front)
> >> +  struct discard_info *front, unsigned int max_len)
> >>  {
> >>return (back->lstart + back->len == front->lstart) &&
> >> -  (back->len + front->len < DEF_MAX_DISCARD_LEN);
> >> +  (back->len + front->len <= max_len);
> >>  }
> >>  
> >>  static inline bool __is_discard_back_mergeable(struct discard_info *cur,
> >> -  struct discard_info *back)
> >> +  struct discard_info *back, unsigned int max_len)
> >>  {
> >> -  return __is_discard_mergeable(back, cur);
> >> +  return __is_discard_mergeable(back, cur, max_len);
> >>  }
> >>  
> >>  static inline bool __is_discard_front_mergeable(struct discard_info *cur,
> >> -  struct discard_info *front)
> >> +  struct discard_info *front, unsigned int max_len)
> >>  {
> >> -  return __is_discard_mergeable(cur, front);
> >> +  return __is_discard_mergeable(cur, front, max_len);
> >>  }
> >>  
> >>  static inline bool __is_extent_mergeable(struct extent_info *back,
> >> diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
> >> index 4648561e2bfd..8e417a12684d 100644
> >> --- a/fs/f2fs/segment.c
> >> +++ b/fs/f2fs/segment.c
> >> @@ -1086,6 +1086,9 @@ static void __update_discard_tree_range(struct f2fs_sb_info *sbi,
> >>struct discard_cmd *dc;
> >>struct discard_info di = {0};
> >>struct rb_node **insert_p = NULL, *insert_parent = NULL;
> >> +  struct request_queue *q = bdev_get_queue(bdev);
> >> +  unsigned int max_discard_blocks =
> >> +  SECTOR_TO_BLOCK(q->limits.max_discard_sectors);
> >>block_t end = lstart + len;
> >>  
> >>mutex_lock(&dcc->cmd_lock);
> >> @@ -1129,7 +1132,8 @@ static void __update_discard_tree_range(struct f2fs_sb_info *sbi,
> >>  
> >>if (prev_dc && prev_dc->state == D_PREP &&
> >>prev_dc->bdev == bdev &&
> >> -  __is_discard_back_mergeable(&di, &prev_dc->di)) {
> >> +  __is_discard_back_mergeable(&di, &prev_dc->di,
> >> +  max_discard_blocks)) {
> >>prev_dc->di.len += di.len;
> >>dcc->undiscard_blks += di.len;
> >>__relocate_discard_cmd(dcc, prev_dc);
> >> @@ -1140,7 +1144,8 @@ static void __update_discard_tree_range(struct f2fs_sb_info *sbi,
> >>  
> >>if (next_dc && next_dc->state == D_PREP &&
> >>next_dc->bdev == bdev &&
> >> -  __is_discard_front_mergeable(&di, &next_dc->di)) {
> >> +  __is_discard_front_mergeable(&di, &next_dc->di,
> >> + 

[PATCH 1/7] tools: build: Fixup host c flags

2018-07-06 Thread Laura Abbott
Commit 0c3b7e42616f ("tools build: Add support for host programs format")
introduced host_c_flags which referenced CHOSTFLAGS. The actual name of the
variable is HOSTCFLAGS. Fix this up.

Fixes: 0c3b7e42616f ("tools build: Add support for host programs format")
Signed-off-by: Laura Abbott 
---
This seemed like a typo to the best of my understanding and certain
things wouldn't link properly until this was fixed.
---
 tools/build/Build.include | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/build/Build.include b/tools/build/Build.include
index a4bbb984941d..b5c679cd441c 100644
--- a/tools/build/Build.include
+++ b/tools/build/Build.include
@@ -98,4 +98,4 @@ cxx_flags = -Wp,-MD,$(depfile) -Wp,-MT,$@ $(CXXFLAGS) -D"BUILD_STR(s)=\#s" $(CXX
 ###
 ## HOSTCC C flags
 
-host_c_flags = -Wp,-MD,$(depfile) -Wp,-MT,$@ $(CHOSTFLAGS) -D"BUILD_STR(s)=\#s" $(CHOSTFLAGS_$(basetarget).o) $(CHOSTFLAGS_$(obj))
+host_c_flags = -Wp,-MD,$(depfile) -Wp,-MT,$@ $(HOSTCFLAGS) -D"BUILD_STR(s)=\#s" $(HOSTCFLAGS_$(basetarget).o) $(HOSTCFLAGS_$(obj))
-- 
2.17.1



[PATCH 4/7] treewide: Rename HOSTCXXFLAGS to KBUILD_HOSTCXXFLAGS

2018-07-06 Thread Laura Abbott


In preparation for enabling command line CXXFLAGS, re-name HOSTCXXFLAGS to
KBUILD_HOSTCXXFLAGS as the internal use only flags. This should not have any
visible effects.

Signed-off-by: Laura Abbott 
---
 Makefile | 4 ++--
 scripts/Makefile.host| 4 ++--
 scripts/kconfig/Makefile | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
index 16cab22b1f40..9c3f1864d678 100644
--- a/Makefile
+++ b/Makefile
@@ -361,7 +361,7 @@ HOSTCC   = gcc
 HOSTCXX  = g++
 KBUILD_HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
-fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
-HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
+KBUILD_HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
 HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
 HOST_LOADLIBES := $(HOST_LFS_LIBS)
 
@@ -432,7 +432,7 @@ GCC_PLUGINS_CFLAGS :=
 export ARCH SRCARCH CONFIG_SHELL HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE AS LD CC
 export CPP AR NM STRIP OBJCOPY OBJDUMP HOSTLDFLAGS HOST_LOADLIBES
 export MAKE LEX YACC AWK GENKSYMS INSTALLKERNEL PERL PYTHON PYTHON2 PYTHON3 UTS_MACHINE
-export HOSTCXX HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
+export HOSTCXX KBUILD_HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
 
 export KBUILD_CPPFLAGS NOSTDINC_FLAGS LINUXINCLUDE OBJCOPYFLAGS LDFLAGS
 export KBUILD_CFLAGS CFLAGS_KERNEL CFLAGS_MODULE
diff --git a/scripts/Makefile.host b/scripts/Makefile.host
index 7a4c90f2b0ce..1c153cc5b530 100644
--- a/scripts/Makefile.host
+++ b/scripts/Makefile.host
@@ -64,8 +64,8 @@ host-cxxshobjs:= $(addprefix $(obj)/,$(host-cxxshobjs))
 
 _hostc_flags   = $(KBUILD_HOSTCFLAGS)   $(HOST_EXTRACFLAGS)   \
  $(KBUILD_HOSTCFLAGS_$(basetarget).o)
-_hostcxx_flags = $(HOSTCXXFLAGS) $(HOST_EXTRACXXFLAGS) \
- $(HOSTCXXFLAGS_$(basetarget).o)
+_hostcxx_flags = $(KBUILD_HOSTCXXFLAGS) $(HOST_EXTRACXXFLAGS) \
+ $(KBUILD_HOSTCXXFLAGS_$(basetarget).o)
 
 ifeq ($(KBUILD_SRC),)
 __hostc_flags  = $(_hostc_flags)
diff --git a/scripts/kconfig/Makefile b/scripts/kconfig/Makefile
index f98a22618e1f..37605d92c597 100644
--- a/scripts/kconfig/Makefile
+++ b/scripts/kconfig/Makefile
@@ -192,7 +192,7 @@ qconf-cxxobjs   := qconf.o
 qconf-objs := zconf.tab.o
 
 HOSTLOADLIBES_qconf= $(shell . $(obj)/.qconf-cfg && echo $$libs)
-HOSTCXXFLAGS_qconf.o   = $(shell . $(obj)/.qconf-cfg && echo $$cflags)
+KBUILD_HOSTCXXFLAGS_qconf.o= $(shell . $(obj)/.qconf-cfg && echo $$cflags)
 
 $(obj)/qconf.o: $(obj)/.qconf-cfg $(obj)/qconf.moc
 
-- 
2.17.1



[PATCH 7/7] Kbuild: Use HOST*FLAGS options from the command line

2018-07-06 Thread Laura Abbott
Now that we have the rename in place, reuse the HOST*FLAGS options as
something that can be set from the command line and included with the
rest of the flags.

Signed-off-by: Laura Abbott 
---
 Makefile | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index d925af1fb11b..77c74b3cf30b 100644
--- a/Makefile
+++ b/Makefile
@@ -360,10 +360,11 @@ HOST_LFS_LIBS := $(shell getconf LFS_LIBS)
 HOSTCC   = gcc
 HOSTCXX  = g++
 KBUILD_HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
-   -fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
-KBUILD_HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
-KBUILD_HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
-KBUILD_HOSTLDLIBS := $(HOST_LFS_LIBS)
+   -fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS) \
+   $(HOSTCFLAGS)
+KBUILD_HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS) $(HOSTCXXFLAGS)
+KBUILD_HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS) $(HOSTLDFLAGS)
+KBUILD_HOSTLDLIBS := $(HOST_LFS_LIBS) $(HOST_LOADLIBES)
 
 # Make variables (CC, etc...)
 AS = $(CROSS_COMPILE)as
-- 
2.17.1



[PATCH 2/7] tools: build: Use HOSTLDFLAGS with fixdep

2018-07-06 Thread Laura Abbott
The final link of fixdep uses LDFLAGS but not the existing HOSTLDFLAGS.
Fix this.

Signed-off-by: Laura Abbott 
---
This was another one where I couldn't tell if the use of LDFLAGS was a
typo or intentional, hence keeping both.
---
 tools/build/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/build/Makefile b/tools/build/Makefile
index 5eb4b5ad79cb..cd0fd2cd165b 100644
--- a/tools/build/Makefile
+++ b/tools/build/Makefile
@@ -43,7 +43,7 @@ $(OUTPUT)fixdep-in.o: FORCE
$(Q)$(MAKE) $(build)=fixdep
 
 $(OUTPUT)fixdep: $(OUTPUT)fixdep-in.o
-   $(QUIET_LINK)$(HOSTCC) $(LDFLAGS) -o $@ $<
+   $(QUIET_LINK)$(HOSTCC) $(HOSTLDFLAGS) $(LDFLAGS) -o $@ $<
 
 FORCE:
 
-- 
2.17.1



[PATCH 6/7] treewide: Rename HOST_LOADLIBES to KBUILD_HOSTLDLIBS

2018-07-06 Thread Laura Abbott
In preparation for enabling command line LDLIBS, re-name HOST_LOADLIBES to
KBUILD_HOSTLDLIBS as the internal use only flags. This should not have any
visible effects.

Signed-off-by: Laura Abbott 
---
 Makefile  |  4 ++--
 samples/bpf/Makefile  |  2 +-
 scripts/Makefile.host | 10 +++++-----
 3 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/Makefile b/Makefile
index 319736b80016..d925af1fb11b 100644
--- a/Makefile
+++ b/Makefile
@@ -363,7 +363,7 @@ KBUILD_HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
-fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
 KBUILD_HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
 KBUILD_HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)
-HOST_LOADLIBES := $(HOST_LFS_LIBS)
+KBUILD_HOSTLDLIBS := $(HOST_LFS_LIBS)
 
 # Make variables (CC, etc...)
 AS = $(CROSS_COMPILE)as
@@ -430,7 +430,7 @@ LDFLAGS :=
 GCC_PLUGINS_CFLAGS :=
 
 export ARCH SRCARCH CONFIG_SHELL HOSTCC KBUILD_HOSTCFLAGS CROSS_COMPILE AS LD CC
-export CPP AR NM STRIP OBJCOPY OBJDUMP KBUILD_HOSTLDFLAGS HOST_LOADLIBES
+export CPP AR NM STRIP OBJCOPY OBJDUMP KBUILD_HOSTLDFLAGS KBUILD_HOSTLDLIBS
 export MAKE LEX YACC AWK GENKSYMS INSTALLKERNEL PERL PYTHON PYTHON2 PYTHON3 UTS_MACHINE
 export HOSTCXX KBUILD_HOSTCXXFLAGS LDFLAGS_MODULE CHECK CHECKFLAGS
 
diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 09f90955412e..c20fb2fa9c37 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -180,7 +180,7 @@ KBUILD_HOSTCFLAGS_trace_event_user.o += -I$(srctree)/tools/lib/bpf/
 KBUILD_HOSTCFLAGS_sampleip_user.o += -I$(srctree)/tools/lib/bpf/
 KBUILD_HOSTCFLAGS_task_fd_query_user.o += -I$(srctree)/tools/lib/bpf/
 
-HOST_LOADLIBES += $(LIBBPF) -lelf
+KBUILD_HOSTLDLIBS  += $(LIBBPF) -lelf
 HOSTLOADLIBES_tracex4  += -lrt
 HOSTLOADLIBES_trace_output += -lrt
 HOSTLOADLIBES_map_perf_test+= -lrt
diff --git a/scripts/Makefile.host b/scripts/Makefile.host
index 07acf5779f55..7b784413d921 100644
--- a/scripts/Makefile.host
+++ b/scripts/Makefile.host
@@ -85,7 +85,7 @@ hostcxx_flags  = -Wp,-MD,$(depfile) $(__hostcxx_flags)
 # host-csingle -> Executable
 quiet_cmd_host-csingle = HOSTCC  $@
   cmd_host-csingle = $(HOSTCC) $(hostc_flags) $(KBUILD_HOSTLDFLAGS) -o $@ $< \
-   $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
+   $(KBUILD_HOSTLDLIBS) $(HOSTLOADLIBES_$(@F))
 $(host-csingle): $(obj)/%: $(src)/%.c FORCE
$(call if_changed_dep,host-csingle)
 
@@ -94,7 +94,7 @@ $(host-csingle): $(obj)/%: $(src)/%.c FORCE
 quiet_cmd_host-cmulti  = HOSTLD  $@
   cmd_host-cmulti  = $(HOSTCC) $(KBUILD_HOSTLDFLAGS) -o $@ \
  $(addprefix $(obj)/,$($(@F)-objs)) \
- $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
+ $(KBUILD_HOSTLDLIBS) $(HOSTLOADLIBES_$(@F))
 $(host-cmulti): FORCE
$(call if_changed,host-cmulti)
 $(call multi_depend, $(host-cmulti), , -objs)
@@ -112,7 +112,7 @@ quiet_cmd_host-cxxmulti = HOSTLD  $@
   cmd_host-cxxmulti= $(HOSTCXX) $(KBUILD_HOSTLDFLAGS) -o $@ \
  $(foreach o,objs cxxobjs,\
   $(addprefix $(obj)/,$($(@F)-$(o)))) \
- $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
+ $(KBUILD_HOSTLDLIBS) $(HOSTLOADLIBES_$(@F))
 $(host-cxxmulti): FORCE
$(call if_changed,host-cxxmulti)
 $(call multi_depend, $(host-cxxmulti), , -objs -cxxobjs)
@@ -145,7 +145,7 @@ $(host-cxxshobjs): $(obj)/%.o: $(src)/%.c FORCE
 quiet_cmd_host-cshlib  = HOSTLLD -shared $@
   cmd_host-cshlib  = $(HOSTCC) $(KBUILD_HOSTLDFLAGS) -shared -o $@ \
  $(addprefix $(obj)/,$($(@F:.so=-objs))) \
- $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
+ $(KBUILD_HOSTLDLIBS) $(HOSTLOADLIBES_$(@F))
 $(host-cshlib): FORCE
$(call if_changed,host-cshlib)
 $(call multi_depend, $(host-cshlib), .so, -objs)
@@ -155,7 +155,7 @@ $(call multi_depend, $(host-cshlib), .so, -objs)
 quiet_cmd_host-cxxshlib= HOSTLLD -shared $@
   cmd_host-cxxshlib= $(HOSTCXX) $(KBUILD_HOSTLDFLAGS) -shared -o $@ \
  $(addprefix $(obj)/,$($(@F:.so=-objs))) \
- $(HOST_LOADLIBES) $(HOSTLOADLIBES_$(@F))
+ $(KBUILD_HOSTLDLIBS) $(HOSTLOADLIBES_$(@F))
 $(host-cxxshlib): FORCE
$(call if_changed,host-cxxshlib)
 $(call multi_depend, $(host-cxxshlib), .so, -objs)
-- 
2.17.1



Re: 4.17.x won't boot due to "x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G"

2018-07-06 Thread Masahiro Yamada
2018-07-06 23:39 GMT+09:00 Gabriel C :
> 2018-07-06 16:13 GMT+02:00 Masahiro Yamada :
>> Hi.
>>
>> 2018-07-06 19:41 GMT+09:00 Kirill A. Shutemov :
>>> On Fri, Jul 06, 2018 at 03:37:58PM +0900, Masahiro Yamada wrote:
>>>> >> > > Also see https://bugzilla.kernel.org/show_bug.cgi?id=200385 ,
>>>> >> > >
>>>> >> > > 0a1756bd2897951c03c1cb671bdfd40729ac2177 is acting up
>>>> >> > > too with the same symptoms
>>>> >> >
>>>> >> > I tracked it down to -flto in LDFLAGS. I'll look more into this.
>>>> >>
>>>> >> -flto in LDFLAGS screws up this part of paging_prepare():
>>>> >
>>>> > +Masahiro, Michal.
>>>> >
>>>> > I've got it wrong. *Any* LDFLAGS option passed to make this way:
>>>> >
>>>> >   make LDFLAGS="..."
>>>> >
>>>> > would cause an issue. Even empty.
>>>> >
>>>> > It overrides all assignments to the variable in the makefile.
>>>> > As a result the image is built without -pie and the linker doesn't
>>>> > generate position-independent code.
>>>> >
>>>> > Looks like the patch below helps, but my make-fu is poor.
>>>> > I don't see many override directives in kernel makefiles.
>>>> > It makes me think that there's a better way to fix this.
>>>> >
>>>> > Hm?
>>>>
>>>>
>>>> LDFLAGS is for internal-use.
>>>> Please do not override it from the command line.
>>>
>>> Can we generate a build error if a user tries to override LDFLAGS, CFLAGS or
>>> other critical internal-use-only variables?
>>
>> Yes, Make can check where variables came from.
>>
>>
>>> This breakage was rather hard to debug. We need to have some kind of
>>> fail-safe for the future.
>>>
>>>> You want to pass your own linker flags
>>>> for building vmlinux and modules,
>>>> but do not want to pass them to
>>>> the decompressor (arch/x86/boot/compressed).
>>>>
>>>> Correct?
>>>
>>> I personally don't think that changing compiler/linker options for kernel
>>> build is a good idea in general.
>>>
>>>> Kbuild provides a way for users
>>>> to pass additional linker flags to modules.
>>>> (LDFLAGS_MODULE)
>>>>
>>>>
>>>> But, there is no way to do that for vmlinux.
>>>>
>>>> It is easy to support it, though.
>>>>
>>>> https://patchwork.kernel.org/patch/10510833/
>>>>
>>>> If this is the one you want, I can merge this.
>>>>
>>>>
>>>> make LDFLAGS_KERNEL=...  LDFLAGS_MODULE=...
>>>> will allow you to append linker flags.
>>>
>>> Okay. It makes me wonder if we should taint kernel in such cases?
>>> Custom compiler/linker flags are risky and can lead to weird bugs.
>>
>> OK.
>> So, what problem are we discussing?
>>
>>
>>> I've got it wrong. *Any* LDFLAGS option passed to make this way:
>>>
>>>  make LDFLAGS="..."
>>
>> In your previous mail, I thought you were asking me how to pass
>> custom linker flags.
>>
>> If not, we do not need to think about that case.
>> Just say "Do not do that".
>
> I am sorry but I have a hard time following your logic here.
>
> You are saying: the *env* variable LDFLAGS, as well as passing
> LDFLAGS to make, which your build allows, should not be used
> because it is for 'internal usage'..?
>
> Well, that logic is wrong, and wrong for any project,
> not just for the kernel,


Why 'my logic'?

LDFLAGS has been long used internally since the old days,
before I ever worked on the kernel.


I shared my knowledge about the kernel build system.

The current situation is not nice,
but why are you blaming me for the code I did not add ?


Note:
I have never said 'the *env* variable LDFLAGS'


> If you know 'parts' need to have particular flags then 'you' have to
> ensure nothing
> overrides these or nothing at all can change these.
>
> So swap your logic and append LDFLAGS to your private
> 'call_it_whatever_you_wish_KERNEL_NEED_BE_THERE_ANY_KIND_FLAGS'
> and don't allow these to be changed at all, when you
> *know* they have to be there.
>
>
> Telling users to not use LD/C/CXX flags is not really going to work, right?
>
>
> BR



-- 
Best Regards
Masahiro Yamada


[PATCH 2/3] x86/modules: Increase randomization for modules

2018-07-06 Thread Rick Edgecombe
This changes the behavior of the KASLR logic for allocating memory for the text
sections of loadable modules. It randomizes the location of each module text
section with about 17 bits of entropy in typical use. This is enabled on X86_64
only. For 32 bit, the behavior is unchanged.

The algorithm evenly breaks the module space in two, a random area and a backup
area. For module text allocations, it first tries to allocate at a number of
randomly located starting pages inside the random section. If this fails, it
will allocate in the backup area. The backup area base will be offset in the
same way as the current algorithm does for the base area, 1024 possible
locations.

Signed-off-by: Rick Edgecombe 
---
 arch/x86/include/asm/pgtable_64_types.h |   1 +
 arch/x86/kernel/module.c| 103 ++--
 2 files changed, 98 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 054765a..56452a0 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -141,6 +141,7 @@ extern unsigned int ptrs_per_p4d;
 /* The module sections ends with the start of the fixmap */
 #define MODULES_END        _AC(0xffffffffff000000, UL)
 #define MODULES_LEN        (MODULES_END - MODULES_VADDR)
+#define MODULES_RAND_LEN   PAGE_ALIGN((MODULES_LEN/3)*2)
 
 #define ESPFIX_PGD_ENTRY   _AC(-2, UL)
 #define ESPFIX_BASE_ADDR   (ESPFIX_PGD_ENTRY << P4D_SHIFT)
diff --git a/arch/x86/kernel/module.c b/arch/x86/kernel/module.c
index f58336a..49f 100644
--- a/arch/x86/kernel/module.c
+++ b/arch/x86/kernel/module.c
@@ -77,6 +77,93 @@ static unsigned long int get_module_load_offset(void)
 }
 #endif
 
+static unsigned long get_module_area_base(void)
+{
+	return MODULES_VADDR + get_module_load_offset();
+}
+
+#if defined(CONFIG_X86_64) && defined(CONFIG_RANDOMIZE_BASE)
+static unsigned long get_module_vmalloc_start(void)
+{
+	if (kaslr_enabled())
+		return MODULES_VADDR + MODULES_RAND_LEN
+			+ get_module_load_offset();
+	else
+		return get_module_area_base();
+}
+
+static void *try_module_alloc(unsigned long addr, unsigned long size,
+					int try_purge)
+{
+	return __vmalloc_node_try_addr(addr, size, GFP_KERNEL,
+					PAGE_KERNEL_EXEC, 0,
+					NUMA_NO_NODE, try_purge,
+					__builtin_return_address(0));
+}
+
+/*
+ * Try to allocate in the random area. First 5000 times without purging, then
+ * 5000 times with purging. If these fail, return NULL.
+ */
+static void *try_module_randomize_each(unsigned long size)
+{
+	void *p = NULL;
+	unsigned int i;
+	unsigned long offset;
+	unsigned long addr;
+	unsigned long end;
+	unsigned long last_lazy_free_blocked = 0;
+	const unsigned long nr_mod_positions = MODULES_RAND_LEN / MODULE_ALIGN;
+	const unsigned long nr_try_purge = 5000;
+	const unsigned long nr_no_purge = 5000;
+
+	if (!kaslr_enabled())
+		return NULL;
+
+	for (i = 0; i < nr_try_purge + nr_no_purge; i++) {
+		offset = (get_random_long() % nr_mod_positions) * MODULE_ALIGN;
+		addr = (unsigned long)MODULES_VADDR + offset;
+		end = addr + size;
+
+		if (end > addr && end < MODULES_END) {
+			if (i < nr_no_purge) {
+				/* First try to avoid having to purge */
+				p = try_module_alloc(addr, size, 0);
+
+				/*
+				 * Save the last value that was blocked by a
+				 * lazy purge area
+				 */
+				if (IS_ERR(p) && PTR_ERR(p) == -EBUSY)
+					last_lazy_free_blocked = addr;
+				else if (p && !IS_ERR(p))
+					return p;
+			} else {
+				/* Give up and allow for purges */
+				if (i == nr_try_purge && last_lazy_free_blocked)
+					addr = last_lazy_free_blocked;
+
+				p = try_module_alloc(addr, size, 1);
+
+				if (p)
+					return p;
+			}
+		}
+	}
+	return NULL;
+}
+#else
+static unsigned long get_module_vmalloc_start(void)
+{
+	return get_module_area_base();
+}
+
+static void *try_module_randomize_each(unsigned long size)
+{
+	return NULL;
+}
+#endif
+
 void *module_alloc(unsigned long size)
 {
void *p;
@@ -84,16 +171,20 @@ void 
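
The placement strategy described in this patch can be modelled in userspace. The following is a minimal sketch, not the kernel code: the module space is a small array of page slots, the slot counts and all function names are hypothetical, a xorshift generator stands in for get_random_long(), and lazy purging is not modelled at all. It only shows the two phases: random tries in the random area, then a fallback to the backup area.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_SLOTS      1024u                  /* hypothetical: pages in the module space */
#define NR_RAND_SLOTS ((NR_SLOTS / 3u) * 2u) /* random area: first 2/3, like MODULES_RAND_LEN */
#define NR_TRIES      10000u                 /* total random tries, as in the patch */

static bool slot_used[NR_SLOTS];
static uint64_t rng_state = 0x9e3779b97f4a7c15ull;

/* xorshift64: small deterministic stand-in for get_random_long() */
static uint64_t next_random(void)
{
	rng_state ^= rng_state << 13;
	rng_state ^= rng_state >> 7;
	rng_state ^= rng_state << 17;
	return rng_state;
}

/* Claim [start, start + n) if it is in bounds and entirely free. */
static bool try_claim(size_t start, size_t n)
{
	size_t i;

	if (n == 0 || start >= NR_SLOTS || n > NR_SLOTS - start)
		return false;
	for (i = start; i < start + n; i++)
		if (slot_used[i])
			return false;
	for (i = start; i < start + n; i++)
		slot_used[i] = true;
	return true;
}

/* Returns the first slot of the allocation, or -1 if nothing fits. */
static long sim_module_alloc(size_t n)
{
	size_t start;
	unsigned int i;

	/* Phase 1: random starting slots inside the random area. */
	for (i = 0; i < NR_TRIES; i++) {
		start = (size_t)(next_random() % NR_RAND_SLOTS);
		if (try_claim(start, n))
			return (long)start;
	}

	/* Phase 2: first fit in the backup area. */
	for (start = NR_RAND_SLOTS; start < NR_SLOTS; start++)
		if (try_claim(start, n))
			return (long)start;

	return -1;
}
```

As in the patch, only the start of an allocation is constrained to the random area; its end may reach past it as long as it stays inside the module space.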

[PATCH 3/3] vmalloc: Add debugfs modfraginfo

2018-07-06 Thread Rick Edgecombe
Add debugfs file "modfraginfo" for providing info on module space
fragmentation.  This can be used for determining if loadable module
randomization is causing any problems for extreme module loading situations,
like huge numbers of modules or extremely large modules.

Sample output when RANDOMIZE_BASE and X86_64 are configured:
Largest free space: 847253504
External Memory Fragmentation: 20%
Allocations in backup area: 0

Sample output when just X86_64:
Largest free space: 847253504
External Memory Fragmentation: 20%

Signed-off-by: Rick Edgecombe 
---
 mm/vmalloc.c | 101 ++-
 1 file changed, 100 insertions(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b6f2449..85441f2 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -33,6 +34,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 
@@ -2922,7 +2924,104 @@ static int __init proc_vmalloc_init(void)
 	proc_create_seq("vmallocinfo", 0400, NULL, &vmalloc_op);
 	return 0;
 }
-module_init(proc_vmalloc_init);
+#else
+static int proc_vmalloc_init(void)
+{
+   return 0;
+}
+#endif
 
+#if defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_64)
+#if defined(CONFIG_RANDOMIZE_BASE)
+static void print_backup_area(struct seq_file *m, unsigned long backup_cnt)
+{
+	if (kaslr_enabled())
+		seq_printf(m, "Allocations in backup area:\t%lu\n", backup_cnt);
+}
+#else
+static void print_backup_area(struct seq_file *m, unsigned long backup_cnt)
+{
+}
 #endif
+static int modulefraginfo_debug_show(struct seq_file *m, void *v)
+{
+	struct list_head *i;
+	unsigned long last_end = MODULES_VADDR;
+	unsigned long total_free = 0;
+	unsigned long largest_free = 0;
+	unsigned long backup_cnt = 0;
+	unsigned long gap;
+
+	spin_lock(&vmap_area_lock);
+
+	list_for_each(i, &vmap_area_list) {
+		struct vmap_area *obj = list_entry(i, struct vmap_area, list);
+
+		if (!(obj->flags & VM_LAZY_FREE)
+			&& obj->va_start >= MODULES_VADDR
+			&& obj->va_end <= MODULES_END) {
+
+			if (obj->va_start >= MODULES_VADDR + MODULES_RAND_LEN)
+				backup_cnt++;
+
+			gap = (obj->va_start - last_end);
+			if (gap > largest_free)
+				largest_free = gap;
+			total_free += gap;
+
+			last_end = obj->va_end;
+		}
+	}
+
+	gap = (MODULES_END - last_end);
+	if (gap > largest_free)
+		largest_free = gap;
+	total_free += gap;
+
+	spin_unlock(&vmap_area_lock);
 
+	seq_printf(m, "Largest free space:\t\t%lu\n", largest_free);
+	if (total_free)
+		seq_printf(m, "External Memory Fragmentation:\t%lu%%\n",
+			100-(100*largest_free/total_free));
+	else
+		seq_puts(m, "External Memory Fragmentation:\t0%\n");
+
+	print_backup_area(m, backup_cnt);
+
+	return 0;
+}
+
+static int proc_module_frag_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, modulefraginfo_debug_show, NULL);
+}
+
+static const struct file_operations debug_module_frag_operations = {
+	.open		= proc_module_frag_debug_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+static void debug_modfrag_init(void)
+{
+	debugfs_create_file("modfraginfo", 0400, NULL, NULL,
+		&debug_module_frag_operations);
+}
+#else /* defined(CONFIG_DEBUG_FS) && defined(CONFIG_X86_64) */
+static void debug_modfrag_init(void)
+{
+}
+#endif
+
+#if defined(CONFIG_DEBUG_FS) || defined(CONFIG_PROC_FS)
+static int __init info_vmalloc_init(void)
+{
+	proc_vmalloc_init();
+	debug_modfrag_init();
+	return 0;
+}
+
+module_init(info_vmalloc_init);
+#endif
-- 
2.7.4



[PATCH 1/3] vmalloc: Add __vmalloc_node_try_addr function

2018-07-06 Thread Rick Edgecombe
Create __vmalloc_node_try_addr function that tries to allocate at a specific
address and supports caller specified behavior for whether any lazy purging
happens if there is a collision.

Signed-off-by: Rick Edgecombe 
---
 include/linux/vmalloc.h |   3 +
 mm/vmalloc.c| 174 
 2 files changed, 177 insertions(+)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 398e9c9..c7712c8 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -82,6 +82,9 @@ extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
 			unsigned long start, unsigned long end, gfp_t gfp_mask,
 			pgprot_t prot, unsigned long vm_flags, int node,
 			const void *caller);
+extern void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, int try_purge, const void *caller);
 #ifndef CONFIG_MMU
 extern void *__vmalloc_node_flags(unsigned long size, int node, gfp_t flags);
 static inline void *__vmalloc_node_flags_caller(unsigned long size, int node,
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index cfea25b..b6f2449 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1710,6 +1710,180 @@ static void *__vmalloc_area_node(struct vm_struct *area, gfp_t gfp_mask,
 }
 
 /**
+ * __vmalloc_node_try_addr  -  try to alloc at a specific address
+ * @addr:	address to try
+ * @size:	size to try
+ * @gfp_mask:	flags for the page level allocator
+ * @prot:	protection mask for the allocated pages
+ * @vm_flags:	additional vm area flags (e.g. %VM_NO_GUARD)
+ * @node:	node to use for allocation or NUMA_NO_NODE
+ * @try_purge:	try to purge if needed to fulfill an allocation
+ * @caller:	caller's return address
+ *
+ * Try to allocate at the specific address. If it succeeds the address is
+ * returned. If it fails NULL is returned. If try_purge is zero, it will
+ * return an EBUSY ERR_PTR if it could have allocated if it was allowed to
+ * purge. It may trigger TLB flushes if a purge is needed, and try_purge is
+ * set.
+ */
+void *__vmalloc_node_try_addr(unsigned long addr, unsigned long size,
+			gfp_t gfp_mask, pgprot_t prot, unsigned long vm_flags,
+			int node, int try_purge, const void *caller)
+{
+	struct vmap_area *va;
+	struct vm_struct *area;
+	struct rb_node *n;
+	struct vmap_area *cur_va = NULL;
+	struct vmap_area *first_before = NULL;
+
+	int not_at_end = 0;
+	int need_purge = 0;
+	int blocked = 0;
+	int purged = 0;
+
+	unsigned long real_size = size;
+	unsigned long addr_end;
+
+	size = PAGE_ALIGN(size);
+	if (!size || (size >> PAGE_SHIFT) > totalram_pages)
+		return NULL;
+
+	WARN_ON(in_interrupt());
+
+	va = kmalloc_node(sizeof(struct vmap_area),
+			gfp_mask & GFP_RECLAIM_MASK, node);
+	if (unlikely(!va)) {
+		warn_alloc(gfp_mask, NULL,
+			"kmalloc: allocation failure");
+		return NULL;
+	}
+
+	area = kzalloc_node(sizeof(*area), gfp_mask & GFP_RECLAIM_MASK, node);
+	if (unlikely(!area)) {
+		warn_alloc(gfp_mask, NULL,
+			"kmalloc: allocation failure");
+		goto failva;
+	}
+	/*
+	 * Only scan the relevant parts containing pointers to other objects
+	 * to avoid false negatives.
+	 */
+	kmemleak_scan_area(&va->rb_node, SIZE_MAX, gfp_mask & GFP_RECLAIM_MASK);
+
+	if (!(vm_flags & VM_NO_GUARD))
+		size += PAGE_SIZE;
+
+	addr_end = addr + size;
+	if (addr > addr_end)
+		return NULL;
+
+retry:
+	spin_lock(&vmap_area_lock);
+
+	n = vmap_area_root.rb_node;
+	while (n) {
+		cur_va = rb_entry(n, struct vmap_area, rb_node);
+		if (addr < cur_va->va_end) {
+			not_at_end = 1;
+			if (cur_va->va_start == addr) {
+				first_before = cur_va;
+				break;
+			}
+			n = n->rb_left;
+		} else {
+			first_before = cur_va;
+			n = n->rb_right;
+		}
+	}
+
+	/*
+	 * Linearly search through to make sure there is a hole, unless we are
+	 * at the end of the VA list.
+	 */
+	if (not_at_end) {
+		/*
+		 * If there is no VA that starts before the
+		 * target address, start the check from the closest VA.
+		 */
+		if (first_before)
+			cur_va = first_before;
+
+   
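
The kernel-doc comment in this patch describes three possible outcomes: a valid address, NULL, or an EBUSY ERR_PTR when purging was not allowed. The sketch below shows how a caller might tell them apart. It is a userspace illustration only: the sketch_* helpers are re-creations of the kernel's ERR_PTR/IS_ERR/PTR_ERR idea, and the enum and classify_try() are hypothetical names, not part of the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Userspace re-creations of the kernel's ERR_PTR helpers, for illustration only. */
static void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static int sketch_is_err(const void *ptr)
{
	/* Error codes live in the last SKETCH_MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

static long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

enum try_result {
	TRY_OK,			/* allocation succeeded */
	TRY_FAILED,		/* NULL: no room at this address */
	TRY_NEEDS_PURGE,	/* -EBUSY: would fit after a lazy purge */
};

/* Classify a return value the way a caller that passes try_purge == 0 might. */
static enum try_result classify_try(const void *p)
{
	if (!p)
		return TRY_FAILED;
	if (sketch_is_err(p))
		return sketch_ptr_err(p) == -EBUSY ? TRY_NEEDS_PURGE : TRY_FAILED;
	return TRY_OK;
}
```

Encoding the "blocked only by a lazy free area" case in the pointer itself lets the caller remember such an address and retry it later with purging enabled, without adding an extra output parameter.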

[PATCH RFC V2 0/3] KASLR feature to randomize each loadable module

2018-07-06 Thread Rick Edgecombe
Hi,

This is v2 of the "KASLR feature to randomize each loadable module" patchset.
The purpose is to increase the randomization and make the modules randomized
in relation to each other instead of just the base, so that if one module leaks,
the location of the others can't be inferred.

This code needs some refactoring and simplification, but I was hoping to get
some feedback on the benchmarks and provide an update.

V2 brings the TLB flushes down to close to the existing algorithm and increases
the modules that get high randomness based on the concerns raised by Jann Horn
about the BPF JIT use case. It also tries to address Kees Cook's comments about
possible minimal boot time regression by measuring the average allocation time
to be below the existing allocator. It also addresses Matthew Wilcox's comment
on the GFP_NOWARN not being needed. There is also some data on PTE memory use
which is higher than the original algorithm, as suggested by Jann.

This is off of 4.18-RC3.

Changes since v1:
 - New implementation of __vmalloc_node_try_addr based on the
   __vmalloc_node_range implementation, that only flushes TLB when needed.
 - Modified module loading algorithm to try to reduce the TLB flushes further.
 - Increase "random area" tries in order to increase the number of modules that
   can get high randomness.
 - Increase "random area" size to 2/3 of module area in order to increase the
   number of modules that can get high randomness.
 - Fix for 0day failures on other architectures.
 - Fix for wrong debugfs permissions. (thanks to Jann)
 - Spelling fix. (thanks to Jann)
 - Data on module_alloc performance and TLB flushes. (brought up by Kees and
   Jann)
 - Data on memory usage. (suggested by Jann)
 
Todo:
 - Refactor __vmalloc_node_try_addr to be smaller and share more code with the
   normal vmalloc path, and misc cleanup
 - More real world testing other than the synthetic micro benchmarks described
   below. BPF kselftest brought up by Daniel Borkmann.

New Algorithm
=============
In addition to __vmalloc_node_try_addr only purging the lazy free areas when it
needs to, it also now supports a mode where it will fail to allocate instead of
doing any purge. In this case it reports when the allocation would have
succeeded if it was allowed to purge. It returns this information via an
ERR_PTR.

The logic for the selection of a location in the random area is changed as well.
The number of tries is increased from 10 to 1, which actually still gives
good performance. At a high level, the vmalloc algorithm quickly traverses an
RB-Tree to find a start position and then more slowly traverses a linked list to
look for an open spot. Since this new algorithm randomly picks a spot, it
mostly just needs to traverse the RB-Tree, and as a result the "tries" are fast
enough that the number can be high and still be faster than traversing the
linked list. In the data below you can see that the random algorithm is on
average actually faster than the existing one.

The increase in number of tries is also to support the BPF JIT use case, by
increasing the number of modules that can get high randomness.

Since the __vmalloc_node_try_addr now can optionally fail instead of purging,
for the first half of the tries, the algorithm tries to find a spot where it
doesn't need to do a purge. For the second half it allows purges. The 50:50
split is to try to be a happy medium between reducing TLB flushes and reducing
average allocation time.
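
The split between purge-free and purge-allowed tries can be illustrated with a
small userspace model. This is only a sketch under stated assumptions, not the
patchset's code: the slot array, NR_SLOTS, MAX_TRIES, purge_lazy(), try_slot()
and alloc_random_slot() are all hypothetical names, and TRY_NEEDS_PURGE merely
stands in for the "would have succeeded with a purge" report that the real
function returns via an ERR_PTR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_SLOTS  64	/* hypothetical: page-sized slots in the random area */
#define MAX_TRIES 16	/* hypothetical: not the value used by the patchset */

enum slot_state { SLOT_FREE, SLOT_LAZY, SLOT_USED };
enum try_result { TRY_OK, TRY_NEEDS_PURGE, TRY_BUSY };

static enum slot_state slots[NR_SLOTS];
static unsigned int nr_purges;

/* Stand-in for purging the lazy free areas, which costs a TLB flush. */
static void purge_lazy(void)
{
	for (unsigned int i = 0; i < NR_SLOTS; i++)
		if (slots[i] == SLOT_LAZY)
			slots[i] = SLOT_FREE;
	nr_purges++;
}

/* Try one slot; refuse to purge unless the caller allows it. */
static enum try_result try_slot(unsigned int idx, bool allow_purge)
{
	assert(idx < NR_SLOTS);

	if (slots[idx] == SLOT_USED)
		return TRY_BUSY;
	if (slots[idx] == SLOT_LAZY) {
		if (!allow_purge)
			return TRY_NEEDS_PURGE;
		purge_lazy();
	}
	slots[idx] = SLOT_USED;
	return TRY_OK;
}

/* First half of the tries must not purge; the second half may. */
static int alloc_random_slot(void)
{
	for (unsigned int i = 0; i < MAX_TRIES; i++) {
		bool allow_purge = i >= MAX_TRIES / 2;
		unsigned int idx = (unsigned int)rand() % NR_SLOTS;

		if (try_slot(idx, allow_purge) == TRY_OK)
			return (int)idx;
	}
	return -1;	/* caller would fall back to the non-random path */
}
```

In this model a purge only happens once the purge-free half of the tries has
been used up, which is the trade-off between TLB flushes and allocation time
described above.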

Randomness
==========
In the last patchset the size of the random area used in the calculations was
incorrect. The entropy should have been closer to 17 bits, not 18, which is why
it's lower here even though the number of random area tries is cranked up. 17.3
bits is likely maintained to a much higher number of allocations than shown here
in reality, since it seems that the BPF JIT allocations are usually smaller than
modules. If you assume the majority of allocations are 1 page, 17 bits is
maintained to 8000 modules.

Modules Min Info
1000    17.3
2000    17.3
3000    17.3
4000    17.3
5000    17.08
6000    16.30
7000    15.67
8000    14.92

Allocation Time
===============
The average module_alloc time over N modules was actually always faster with the
random algorithm:

Modules Existing(ns)    New(ns)
1000    4,761           1,134
2000    9,730           1,149
3000    15,572          1,396
4000    20,723          2,161
5000    26,206          4,349
6000    31,374          8,615
7000    36,123          14,009
8000    40,174          23,396

Average Nth allocation time was usually better than with the existing algorithm,
until the number of modules gets very high.

Module  Original(ns)    New(ns)
1000    8,800           1,288
2000    20,949          1,477
3000    31,980          2,583
4000    44,539          9,250
5000    55,212          25,986
6000    65,968          39,540
7000    74,883          57,798
8000    85,392          97,319

TLB 

Re: [RESEND PATCH v2 3/9] asm-generic: Move some macros from linux/bitops.h to a new bits.h file

2018-07-06 Thread Andrew Morton
On Tue, 19 Jun 2018 13:53:08 +0100 Will Deacon  wrote:

> In preparation for implementing the asm-generic atomic bitops in terms
> of atomic_long_*, we need to prevent asm/atomic.h implementations from
> pulling in linux/bitops.h. A common reason for this include is for the
> BITS_PER_BYTE definition, so move this and some other BIT() and masking
> macros into a new header file, linux/bits.h
> 
> --- a/include/linux/bitops.h
> +++ b/include/linux/bitops.h
> @@ -2,29 +2,9 @@
>  #ifndef _LINUX_BITOPS_H
>  #define _LINUX_BITOPS_H
>  #include 
> +#include 
>  
> -#ifdef   __KERNEL__
> -#define BIT(nr)  (1UL << (nr))
> -#define BIT_ULL(nr)  (1ULL << (nr))
> -#define BIT_MASK(nr) (1UL << ((nr) % BITS_PER_LONG))
> -#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
> -#define BIT_ULL_MASK(nr) (1ULL << ((nr) % BITS_PER_LONG_LONG))
> -#define BIT_ULL_WORD(nr) ((nr) / BITS_PER_LONG_LONG)
> -#define BITS_PER_BYTE		8
>  #define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
> -#endif

Why does it leave BITS_TO_LONGS() in place?

That becomes unfortunate with Chris's patch, so I'm moving
BITS_TO_LONGS() into bits.h.
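
For readers who want to try the macros outside the kernel, here is a minimal
userspace sketch. It is an illustration only: DIV_ROUND_UP is a local copy
written for this example, bitmap_longs() is a hypothetical helper, and none of
this says where the definitions finally live.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Local userspace copies for illustration; the kernel has its own headers. */
#define BITS_PER_BYTE		8
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define BITS_PER_TYPE(type)	(sizeof(type) * BITS_PER_BYTE)
#define BITS_TO_LONGS(nr)	DIV_ROUND_UP(nr, BITS_PER_TYPE(long))

/* The macros assume 8-bit bytes; check that at compile time. */
_Static_assert(CHAR_BIT == BITS_PER_BYTE, "8-bit bytes expected");

/* Hypothetical helper: number of longs needed to hold an nbits-wide bitmap. */
static size_t bitmap_longs(size_t nbits)
{
	return BITS_TO_LONGS(nbits);
}
```

With these definitions BITS_TO_LONGS() depends on BITS_PER_TYPE(), so the two
have to be visible from the same header.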


From: Chris Wilson 
Subject: include/linux/bitops.h: introduce BITS_PER_TYPE

net_dim.h has a rather useful extension to BITS_PER_BYTE to compute the
number of bits in a type (BITS_PER_BYTE * sizeof(T)), so promote the macro
to bitops.h, alongside BITS_PER_BYTE, for wider usage.

Link: http://lkml.kernel.org/r/20180706094458.14116-1-ch...@chris-wilson.co.uk
Signed-off-by: Chris Wilson 
Reviewed-by: Jani Nikula 
Cc: Randy Dunlap 
Cc: Andy Gospodarek 
Cc: David S. Miller 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Signed-off-by: Andrew Morton 
---

 include/linux/bitops.h  |3 ++-
 include/linux/net_dim.h |1 -
 2 files changed, 2 insertions(+), 2 deletions(-)

diff -puN include/linux/bitops.h~bitops-introduce-bits_per_type include/linux/bitops.h
--- a/include/linux/bitops.h~bitops-introduce-bits_per_type
+++ a/include/linux/bitops.h
@@ -11,7 +11,8 @@
 #define BIT_ULL_MASK(nr)   (1ULL << ((nr) % BITS_PER_LONG_LONG))
 #define BIT_ULL_WORD(nr)   ((nr) / BITS_PER_LONG_LONG)
 #define BITS_PER_BYTE  8
-#define BITS_TO_LONGS(nr)  DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long))
+#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
+#define BITS_TO_LONGS(nr)  DIV_ROUND_UP(nr, BITS_PER_TYPE(long))
 #endif
 
 /*
diff -puN include/linux/net_dim.h~bitops-introduce-bits_per_type include/linux/net_dim.h
--- a/include/linux/net_dim.h~bitops-introduce-bits_per_type
+++ a/include/linux/net_dim.h
@@ -363,7 +363,6 @@ static inline void net_dim_sample(u16 ev
 }
 
 #define NET_DIM_NEVENTS 64
-#define BITS_PER_TYPE(type) (sizeof(type) * BITS_PER_BYTE)
 #define BIT_GAP(bits, end, start) ((((end) - (start)) + BIT_ULL(bits)) & (BIT_ULL(bits) - 1))
 
 static inline void net_dim_calc_stats(struct net_dim_sample *start,
_



[PATCH v4 1/2] dt-bindings: phy: Add binding doc for Stingray PCIe PHY

2018-07-06 Thread Ray Jui
Add binding document for Stingray PCIe PHYs for both PAXB and PAXC based
root complex

Signed-off-by: Ray Jui 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/phy/brcm,sr-pcie-phy.txt   | 41 ++
 1 file changed, 41 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/brcm,sr-pcie-phy.txt

diff --git a/Documentation/devicetree/bindings/phy/brcm,sr-pcie-phy.txt b/Documentation/devicetree/bindings/phy/brcm,sr-pcie-phy.txt
new file mode 100644
index 0000000..e8d8228
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/brcm,sr-pcie-phy.txt
@@ -0,0 +1,41 @@
+Broadcom Stingray PCIe PHY
+
+Required properties:
+- compatible: must be "brcm,sr-pcie-phy"
+- reg: base address and length of the PCIe SS register space
+- brcm,sr-cdru: phandle to the CDRU syscon node
+- brcm,sr-mhb: phandle to the MHB syscon node
+- #phy-cells: Must be 1, denotes the PHY index
+
+For PAXB based root complex, one can have a configuration of up to 8 PHYs
+PHY index goes from 0 to 7
+
+For the internal PAXC based root complex, PHY index is always 8
+
+Example:
+	mhb: syscon@60401000 {
+		compatible = "brcm,sr-mhb", "syscon";
+		reg = <0 0x60401000 0 0x38c>;
+	};
+
+	cdru: syscon@6641d000 {
+		compatible = "brcm,sr-cdru", "syscon";
+		reg = <0 0x6641d000 0 0x400>;
+	};
+
+	pcie_phy: phy@4000 {
+		compatible = "brcm,sr-pcie-phy";
+		reg = <0 0x4000 0 0x800>;
+		brcm,sr-cdru = <&cdru>;
+		brcm,sr-mhb = <&mhb>;
+		#phy-cells = <1>;
+	};
+
+	/* users of the PCIe PHY */
+
+	pcie0: pcie@4800 {
+		...
+		...
+		phys = <&pcie_phy 0>;
+		phy-names = "pcie-phy";
+	};
-- 
2.1.4



[PATCH v4 2/2] phy: bcm-sr-pcie: Add Stingray PCIe PHY driver

2018-07-06 Thread Ray Jui
Add Stingray PCIe PHY driver for both PAXB and PAXC root complex

Signed-off-by: Ray Jui 
---
 drivers/phy/broadcom/Kconfig   |  10 ++
 drivers/phy/broadcom/Makefile  |   2 +
 drivers/phy/broadcom/phy-bcm-sr-pcie.c | 305 +
 3 files changed, 317 insertions(+)
 create mode 100644 drivers/phy/broadcom/phy-bcm-sr-pcie.c

diff --git a/drivers/phy/broadcom/Kconfig b/drivers/phy/broadcom/Kconfig
index 97d27b0d5..8786a96 100644
--- a/drivers/phy/broadcom/Kconfig
+++ b/drivers/phy/broadcom/Kconfig
@@ -80,3 +80,13 @@ config PHY_BRCM_USB
  This driver is required by the USB XHCI, EHCI and OHCI
  drivers.
  If unsure, say N.
+
+config PHY_BCM_SR_PCIE
+   tristate "Broadcom Stingray PCIe PHY driver"
+   depends on OF && (ARCH_BCM_IPROC || COMPILE_TEST)
+   select GENERIC_PHY
+   select MFD_SYSCON
+   default ARCH_BCM_IPROC
+   help
+ Enable this to support the Broadcom Stingray PCIe PHY
+ If unsure, say N.
diff --git a/drivers/phy/broadcom/Makefile b/drivers/phy/broadcom/Makefile
index 13e000c..0f60184 100644
--- a/drivers/phy/broadcom/Makefile
+++ b/drivers/phy/broadcom/Makefile
@@ -9,3 +9,5 @@ obj-$(CONFIG_PHY_BRCM_SATA) += phy-brcm-sata.o
 obj-$(CONFIG_PHY_BRCM_USB) += phy-brcm-usb-dvr.o
 
 phy-brcm-usb-dvr-objs := phy-brcm-usb.o phy-brcm-usb-init.o
+
+obj-$(CONFIG_PHY_BCM_SR_PCIE)  += phy-bcm-sr-pcie.o
diff --git a/drivers/phy/broadcom/phy-bcm-sr-pcie.c b/drivers/phy/broadcom/phy-bcm-sr-pcie.c
new file mode 100644
index 0000000..c10e95f
--- /dev/null
+++ b/drivers/phy/broadcom/phy-bcm-sr-pcie.c
@@ -0,0 +1,305 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2016-2018 Broadcom
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* we have up to 8 PAXB based RC. The 9th one is always PAXC */
+#define SR_NR_PCIE_PHYS               9
+#define SR_PAXC_PHY_IDX               (SR_NR_PCIE_PHYS - 1)
+
+#define PCIE_PIPEMUX_CFG_OFFSET       0x10c
+#define PCIE_PIPEMUX_SELECT_STRAP     0xf
+
+#define CDRU_STRAP_DATA_LSW_OFFSET    0x5c
+#define PCIE_PIPEMUX_SHIFT            19
+#define PCIE_PIPEMUX_MASK             0xf
+
+#define MHB_MEM_PW_PAXC_OFFSET        0x1c0
+#define MHB_PWR_ARR_POWERON           0x8
+#define MHB_PWR_ARR_POWEROK           0x4
+#define MHB_PWR_POWERON               0x2
+#define MHB_PWR_POWEROK               0x1
+#define MHB_PWR_STATUS_MASK           (MHB_PWR_ARR_POWERON | \
+                                       MHB_PWR_ARR_POWEROK | \
+                                       MHB_PWR_POWERON | \
+                                       MHB_PWR_POWEROK)
+
+struct sr_pcie_phy_core;
+
+/**
+ * struct sr_pcie_phy - Stingray PCIe PHY
+ *
+ * @core: pointer to the Stingray PCIe PHY core control
+ * @index: PHY index
+ * @phy: pointer to the kernel PHY device
+ */
+struct sr_pcie_phy {
+   struct sr_pcie_phy_core *core;
+   unsigned int index;
+   struct phy *phy;
+};
+
+/**
+ * struct sr_pcie_phy_core - Stingray PCIe PHY core control
+ *
+ * @dev: pointer to device
+ * @base: base register of PCIe SS
+ * @cdru: regmap to the CDRU device
+ * @mhb: regmap to the MHB device
+ * @pipemux: pipemux strap
+ * @phys: array of PCIe PHYs
+ */
+struct sr_pcie_phy_core {
+   struct device *dev;
+   void __iomem *base;
+   struct regmap *cdru;
+   struct regmap *mhb;
+   u32 pipemux;
+   struct sr_pcie_phy phys[SR_NR_PCIE_PHYS];
+};
+
+/*
+ * PCIe PIPEMUX lookup table
+ *
+ * Each array index represents a PIPEMUX strap setting
+ * The array element represents a bitmap where a set bit means the PCIe
+ * core and associated serdes has been enabled as RC and is available for use
+ */
+static const u8 pipemux_table[] = {
+   /* PIPEMUX = 0, EP 1x16 */
+   0x00,
+   /* PIPEMUX = 1, EP 2x8 */
+   0x00,
+   /* PIPEMUX = 2, EP 4x4 */
+   0x00,
+   /* PIPEMUX = 3, RC 2x8, cores 0, 7 */
+   0x81,
+   /* PIPEMUX = 4, RC 4x4, cores 0, 1, 6, 7 */
+   0xc3,
+   /* PIPEMUX = 5, RC 8x2, all 8 cores */
+   0xff,
+   /* PIPEMUX = 6, RC 3x4 + 2x2, cores 0, 2, 3, 6, 7 */
+   0xcd,
+   /* PIPEMUX = 7, RC 1x4 + 6x2, cores 0, 2, 3, 4, 5, 6, 7 */
+   0xfd,
+   /* PIPEMUX = 8, EP 1x8 + RC 4x2, cores 4, 5, 6, 7 */
+   0xf0,
+   /* PIPEMUX = 9, EP 1x8 + RC 2x4, cores 6, 7 */
+   0xc0,
+   /* PIPEMUX = 10, EP 2x4 + RC 2x4, cores 1, 6 */
+   0x42,
+   /* PIPEMUX = 11, EP 2x4 + RC 4x2, cores 2, 3, 4, 5 */
+   0x3c,
+   /* PIPEMUX = 12, EP 1x4 + RC 6x2, cores 2, 3, 4, 5, 6, 7 */
+   0xfc,
+   /* PIPEMUX = 13, RC 2x4 + RC 1x4 + 2x2, cores 2, 3, 6 */
+   0x4c,
+};
+
+/*
+ * Return true if the strap setting is valid
+ */
+static bool pipemux_strap_is_valid(u32 pipemux)
+{
+   return !!(pipemux < ARRAY_SIZE(pipemux_table));
+}
+
+/*
+ * Read the PCIe PIPEMUX from strap
+ */
+static u32 

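The PIPEMUX lookup table in the patch above maps each strap value to a bitmap
of usable root-complex cores. As a reading aid, here is a standalone userspace
sketch of how such a bitmap could be queried. It is not part of the posted
driver: pipemux_core_enabled() is a hypothetical helper, ARRAY_SIZE is defined
locally, and uint8_t/uint32_t stand in for the kernel's u8/u32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_SIZE(a)	(sizeof(a) / sizeof((a)[0]))
#define NR_PAXB_CORES	8	/* PAXB PHY indices go from 0 to 7 */

/* Bitmaps copied from the patch: bit N set means core N is usable as RC. */
static const uint8_t pipemux_table[] = {
	0x00, 0x00, 0x00,	/* PIPEMUX 0-2: EP-only configurations */
	0x81,			/* 3: cores 0, 7 */
	0xc3,			/* 4: cores 0, 1, 6, 7 */
	0xff,			/* 5: all 8 cores */
	0xcd,			/* 6: cores 0, 2, 3, 6, 7 */
	0xfd,			/* 7: cores 0, 2, 3, 4, 5, 6, 7 */
	0xf0,			/* 8: cores 4, 5, 6, 7 */
	0xc0,			/* 9: cores 6, 7 */
	0x42,			/* 10: cores 1, 6 */
	0x3c,			/* 11: cores 2, 3, 4, 5 */
	0xfc,			/* 12: cores 2, 3, 4, 5, 6, 7 */
	0x4c,			/* 13: cores 2, 3, 6 */
};

/* A strap value is valid only if it indexes into the table. */
static bool pipemux_strap_is_valid(uint32_t pipemux)
{
	return pipemux < ARRAY_SIZE(pipemux_table);
}

/* Hypothetical helper: is this PAXB core enabled for the given strap? */
static bool pipemux_core_enabled(uint32_t pipemux, unsigned int core)
{
	if (!pipemux_strap_is_valid(pipemux) || core >= NR_PAXB_CORES)
		return false;

	return pipemux_table[pipemux] & (1u << core);
}
```

For example, strap value 3 (RC 2x8) would report cores 0 and 7 as enabled and
every other core as unavailable.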