Re: [PATCH] acpi, numa, ia64: Parse all entries of SRAT memory affinity table

2017-11-20 Thread Ganapatrao Kulkarni
On Thu, Nov 16, 2017 at 12:16 AM, Luck, Tony  wrote:
>> This check is already added in x86 and extending same to ia64.
>
> Looks OK.
>
> Acked-by: Tony Luck 

thanks Tony.

ACPI Maintainers,
any comments on this patch? i can send rebasing to 4.15-rc1?

thanks
Ganapat


Re: [PATCH] acpi, numa, ia64: Parse all entries of SRAT memory affinity table

2017-11-20 Thread Ganapatrao Kulkarni
On Thu, Nov 16, 2017 at 12:16 AM, Luck, Tony  wrote:
>> This check is already added in x86 and extending same to ia64.
>
> Looks OK.
>
> Acked-by: Tony Luck 

thanks Tony.

ACPI Maintainers,
any comments on this patch? i can send rebasing to 4.15-rc1?

thanks
Ganapat


Re: [PATCH] 4.4.86-rt99: fix sync breakage between nr_cpus_allowed and cpus_allowed

2017-11-20 Thread Steven Rostedt
On Mon, 20 Nov 2017 11:30:40 -0500
joe.ko...@concurrent-rt.com wrote:

> Hi Steve,
> A quick perusal of 4.11.12-rt16 shows that it has an
> entirely new version of migrate_disable which to me appears
> correct.
> 
> In that new implementation, migrate_enable() recalculates
> p->nr_cpus_allowed when it switches the task back to
> using p->cpus_mask.  This brings the two back into sync
> if anything had happened to get them out of sync while
> migration was disabled (as would happen on an affinity
> change during that disable period).
> 
> 4.9.47-rt37 has the old implementation and it appears to
> have same bug as 4.4-rt though I have yet to test 4.9-rt.
> 
> The fix in  these older versions could take one of two
> forms: either we recalculate p->nr_cpus_allowed when
> migrate_enable goes back to using p->cpus_allowed,
> as the 4.11-rt version does, or the one place where we
> allow p->nr_cpus_allowed to diverge from p->cpus_allowed
> be fixed.  The patch I submitted earlier takes this second
> approach.
> 

Ideally, I would like to stay close to what upstream -rt does. Would
you be able to backport the 4.11-rt patch?

I'm currently working on releasing 4.9-rt and 4.4-rt with the latest
backports. I could easily add this one too.

-- Steve


Re: [PATCH] 4.4.86-rt99: fix sync breakage between nr_cpus_allowed and cpus_allowed

2017-11-20 Thread Steven Rostedt
On Mon, 20 Nov 2017 11:30:40 -0500
joe.ko...@concurrent-rt.com wrote:

> Hi Steve,
> A quick perusal of 4.11.12-rt16 shows that it has an
> entirely new version of migrate_disable which to me appears
> correct.
> 
> In that new implementation, migrate_enable() recalculates
> p->nr_cpus_allowed when it switches the task back to
> using p->cpus_mask.  This brings the two back into sync
> if anything had happened to get them out of sync while
> migration was disabled (as would happen on an affinity
> change during that disable period).
> 
> 4.9.47-rt37 has the old implementation and it appears to
> have same bug as 4.4-rt though I have yet to test 4.9-rt.
> 
> The fix in  these older versions could take one of two
> forms: either we recalculate p->nr_cpus_allowed when
> migrate_enable goes back to using p->cpus_allowed,
> as the 4.11-rt version does, or the one place where we
> allow p->nr_cpus_allowed to diverge from p->cpus_allowed
> be fixed.  The patch I submitted earlier takes this second
> approach.
> 

Ideally, I would like to stay close to what upstream -rt does. Would
you be able to backport the 4.11-rt patch?

I'm currently working on releasing 4.9-rt and 4.4-rt with the latest
backports. I could easily add this one too.

-- Steve


Re: mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info (crisv32 hang)

2017-11-20 Thread Nicolas Pitre
On Mon, 20 Nov 2017, Guenter Roeck wrote:

> On Mon, Nov 20, 2017 at 07:28:21PM -0500, Nicolas Pitre wrote:
> > On Mon, 20 Nov 2017, Guenter Roeck wrote:
> > 
> > > bdata->node_min_pfn=6 PFN_PHYS(bdata->node_min_pfn)=c000 
> > > start_off=536000 region=c0536000
> > 
> > If PFN_PHYS(bdata->node_min_pfn)=c000 and
> > region=c0536000 that means phys_to_virt() is a no-op.
> > 
> No, it is |= 0x8000

Then the bootmem registration looks very fishy. If you have:

> I think the problem is the 0x6 in bdata->node_min_pfn. It is shifted
> left by PFN_PHYS, making it 0xc000, which in my understanding is
> a virtual address.

Exact.

#define __pa(x) ((unsigned long)(x) & 0x7fff)
#define __va(x) ((void *)((unsigned long)(x) | 0x8000))

With that, the only possible physical address range you may have is 
0x4000 - 0x7fff, and it better start at 0x4000. If that's 
not where your RAM is then something is wrong.

This is in fact a very bad idea to define __va() and __pa() using 
bitwise operations as this hides mistakes like defining physical RAM 
address at 0xc000. Instead, it should look like:

#define __pa(x) ((unsigned long)(x) - 0x8000)
#define __va(x) ((void *)((unsigned long)(x) + 0x8000))

This way, bad physical RAM address definitions will be caught 
immediately.

> That doesn't seem to be easy to fix. It seems there is a mixup of physical
> and  virtual addresses in the architecture.

Well... I don't think there is much else to say other than this needs 
fixing.


Nicolas


Re: mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info (crisv32 hang)

2017-11-20 Thread Nicolas Pitre
On Mon, 20 Nov 2017, Guenter Roeck wrote:

> On Mon, Nov 20, 2017 at 07:28:21PM -0500, Nicolas Pitre wrote:
> > On Mon, 20 Nov 2017, Guenter Roeck wrote:
> > 
> > > bdata->node_min_pfn=6 PFN_PHYS(bdata->node_min_pfn)=c000 
> > > start_off=536000 region=c0536000
> > 
> > If PFN_PHYS(bdata->node_min_pfn)=c000 and
> > region=c0536000 that means phys_to_virt() is a no-op.
> > 
> No, it is |= 0x8000

Then the bootmem registration looks very fishy. If you have:

> I think the problem is the 0x6 in bdata->node_min_pfn. It is shifted
> left by PFN_PHYS, making it 0xc000, which in my understanding is
> a virtual address.

Exact.

#define __pa(x) ((unsigned long)(x) & 0x7fff)
#define __va(x) ((void *)((unsigned long)(x) | 0x8000))

With that, the only possible physical address range you may have is 
0x4000 - 0x7fff, and it better start at 0x4000. If that's 
not where your RAM is then something is wrong.

This is in fact a very bad idea to define __va() and __pa() using 
bitwise operations as this hides mistakes like defining physical RAM 
address at 0xc000. Instead, it should look like:

#define __pa(x) ((unsigned long)(x) - 0x8000)
#define __va(x) ((void *)((unsigned long)(x) + 0x8000))

This way, bad physical RAM address definitions will be caught 
immediately.

> That doesn't seem to be easy to fix. It seems there is a mixup of physical
> and  virtual addresses in the architecture.

Well... I don't think there is much else to say other than this needs 
fixing.


Nicolas


Re: [PATCH v3] tracing: Allocate mask_str buffer dynamically

2017-11-20 Thread Steven Rostedt
On Mon, 20 Nov 2017 21:33:23 +0800
"Du, Changbin"  wrote:

> Hi Steven,
> Have you picked up this patch or need more polish? Thanks.
> 

Neither. You sent this while I was traveling, and it was missed.

I'll look at it tomorrow.

Thanks,

-- Steve


Re: [PATCH v3] tracing: Allocate mask_str buffer dynamically

2017-11-20 Thread Steven Rostedt
On Mon, 20 Nov 2017 21:33:23 +0800
"Du, Changbin"  wrote:

> Hi Steven,
> Have you picked up this patch or need more polish? Thanks.
> 

Neither. You sent this while I was traveling, and it was missed.

I'll look at it tomorrow.

Thanks,

-- Steve


Re: [PATCH 0/13] scsi: arcmsr: add some driver options and support new adapter ARC-1884

2017-11-20 Thread Ching Huang
On Mon, 2017-11-20 at 22:03 -0500, Martin K. Petersen wrote:
> Ching,
> 
> > The following patches apply to Martin's 4.15/scsi-queue.
> 
> Applied to 4.16/scsi-queue. Thank you!
> 
Hi Martin,

Thank you for response.
These patches can apply to 4.16/scsi-queue is very good.

It will be very appreciation if you can spend a little time to review our 
driver's patch. Providing a better software driver is you and our common 
target. We will keep going.

Thanks,
Ching



Re: [PATCH 0/13] scsi: arcmsr: add some driver options and support new adapter ARC-1884

2017-11-20 Thread Ching Huang
On Mon, 2017-11-20 at 22:03 -0500, Martin K. Petersen wrote:
> Ching,
> 
> > The following patches apply to Martin's 4.15/scsi-queue.
> 
> Applied to 4.16/scsi-queue. Thank you!
> 
Hi Martin,

Thank you for response.
These patches can apply to 4.16/scsi-queue is very good.

It will be very appreciation if you can spend a little time to review our 
driver's patch. Providing a better software driver is you and our common 
target. We will keep going.

Thanks,
Ching



[PATCH] arm64: kaslr: Fix kaslr end boundary of virt addr

2017-11-20 Thread Chen Feng
With kaslr and kasan enable both, I got the follow issue.

[   16.130523s]kasan: reg->base = 1, phys_end =1c000,start = 
4000, end = ffc0
[   16.142517s]___alloc_bootmem_nopanic:257
[   16.148284s]__alloc_memory_core_early:63, addr = 197fc7fc0
[   16.155670s]__alloc_memory_core_early:65, virt = d7fc7fc0
[   16.163635s]__alloc_memory_core_early:67, toshow = ff8ffaff8ff8
[   16.171783s]__alloc_memory_core_early:69, show_phy = ffe2649f8ff8
[   16.180145s]Unable to handle kernel paging request at virtual address 
ff8ffaff8ff8
[   16.189971s]pgd = ffad9c507000
[   16.195220s][ff8ffaff8ff8] *pgd=000197fc8003, *pud=000197fc8003

*reg->base = 1, phys_end =1c000,start = 4000, end = 
ffc0*

memstart_addr 0
ARM64_MEMSTART_ALIGN 0x4000
memstart_offset_seed 0xffc7
PHYS_OFFSET = 0 - memstart_addr = 0 - 3E4000 = FFC1C000

reg->base = 0x1  -> 0x4000
phys_end  = 0x1c000  -> 0xffc0  This is confused, end less than 
start.

And In memblock it use "start_addr + size" as the end addr. So in function 
kasan_init,
if the start >= end, it will not map the hole block address. But the memory in 
this
block is valid. And it can be allocated as well.

So donot use the last memory region. Changing "range = range / 
ARM64_MEMSTART_ALIGN + 1" to
range = range / ARM64_MEMSTART_ALIGN;

Signed-off-by: Chen Feng 
Signed-off-by: Chen Xiang 
---
 arch/arm64/mm/init.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 716d122..60112c0 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -267,11 +267,8 @@ void __init arm64_memblock_init(void)
 * margin, the size of the region that the available physical
 * memory spans, randomize the linear region as well.
 */
-   if (memstart_offset_seed > 0 && range >= ARM64_MEMSTART_ALIGN) {
-   range = range / ARM64_MEMSTART_ALIGN + 1;
-   memstart_addr -= ARM64_MEMSTART_ALIGN *
-((range * memstart_offset_seed) >> 16);
-   }
+   if (memstart_offset_seed > 0 && range >= ARM64_MEMSTART_ALIGN)
+   memstart_addr -= (range * memstart_offset_seed) >> 16;
}
 
/*
-- 
1.9.1



[PATCH] arm64: kaslr: Fix kaslr end boundary of virt addr

2017-11-20 Thread Chen Feng
With kaslr and kasan enable both, I got the follow issue.

[   16.130523s]kasan: reg->base = 1, phys_end =1c000,start = 
4000, end = ffc0
[   16.142517s]___alloc_bootmem_nopanic:257
[   16.148284s]__alloc_memory_core_early:63, addr = 197fc7fc0
[   16.155670s]__alloc_memory_core_early:65, virt = d7fc7fc0
[   16.163635s]__alloc_memory_core_early:67, toshow = ff8ffaff8ff8
[   16.171783s]__alloc_memory_core_early:69, show_phy = ffe2649f8ff8
[   16.180145s]Unable to handle kernel paging request at virtual address 
ff8ffaff8ff8
[   16.189971s]pgd = ffad9c507000
[   16.195220s][ff8ffaff8ff8] *pgd=000197fc8003, *pud=000197fc8003

*reg->base = 1, phys_end =1c000,start = 4000, end = 
ffc0*

memstart_addr 0
ARM64_MEMSTART_ALIGN 0x4000
memstart_offset_seed 0xffc7
PHYS_OFFSET = 0 - memstart_addr = 0 - 3E4000 = FFC1C000

reg->base = 0x1  -> 0x4000
phys_end  = 0x1c000  -> 0xffc0  This is confused, end less than 
start.

And In memblock it use "start_addr + size" as the end addr. So in function 
kasan_init,
if the start >= end, it will not map the hole block address. But the memory in 
this
block is valid. And it can be allocated as well.

So donot use the last memory region. Changing "range = range / 
ARM64_MEMSTART_ALIGN + 1" to
range = range / ARM64_MEMSTART_ALIGN;

Signed-off-by: Chen Feng 
Signed-off-by: Chen Xiang 
---
 arch/arm64/mm/init.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 716d122..60112c0 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -267,11 +267,8 @@ void __init arm64_memblock_init(void)
 * margin, the size of the region that the available physical
 * memory spans, randomize the linear region as well.
 */
-   if (memstart_offset_seed > 0 && range >= ARM64_MEMSTART_ALIGN) {
-   range = range / ARM64_MEMSTART_ALIGN + 1;
-   memstart_addr -= ARM64_MEMSTART_ALIGN *
-((range * memstart_offset_seed) >> 16);
-   }
+   if (memstart_offset_seed > 0 && range >= ARM64_MEMSTART_ALIGN)
+   memstart_addr -= (range * memstart_offset_seed) >> 16;
}
 
/*
-- 
1.9.1



Re: [PATCH] platform/x86: Add Acer Wireless Radio Control driver

2017-11-20 Thread Chris Chiu
On Mon, Nov 20, 2017 at 8:24 PM, Andy Shevchenko
 wrote:
> On Mon, Nov 20, 2017 at 8:31 AM, Chris Chiu  wrote:
>> On Fri, Nov 17, 2017 at 10:25 PM, Andy Shevchenko
>>  wrote:
>>> On Thu, Nov 16, 2017 at 3:44 PM, Chris Chiu  wrote:
>
 +
 +struct acer_wireless_data {
 +   struct input_dev *idev;
>>>
 +   struct acpi_device *adev;
>>>
>>> Do you need this one?
>>> I suppose whenever you need struct device out of it you may find it
>>> via input->parent. Right?
>>>
>>
>> But I think it would make life easier to have the following line in
>> .notify callback
>> struct acer_wireless_data * data = acpi_driver_data(adev);
>>
>> I can get its parent acer_wireless_data and point to input_dev easier.
>
> Yes, though if I'm not mistaken, you may get rid completely of your
> custom container and use struct input_dev for accessing that via
> to_acpi_device(idev->dev.parent).
>
> It would be less code, less data structures, cleaner view.
> Right?
>
 +static int acer_wireless_remove(struct acpi_device *adev)
 +{
 +   return 0;
 +}
 +
>>>
>>> No need UI suppose.
>>>
>>
>> You mean remove this acer_wireless_remove which does nothing? Will do.
>
> Correct.
>
> --
> With Best Regards,
> Andy Shevchenko

I think I get your point. Will soon have a v2 patch later.


Re: [PATCH] platform/x86: Add Acer Wireless Radio Control driver

2017-11-20 Thread Chris Chiu
On Mon, Nov 20, 2017 at 8:24 PM, Andy Shevchenko
 wrote:
> On Mon, Nov 20, 2017 at 8:31 AM, Chris Chiu  wrote:
>> On Fri, Nov 17, 2017 at 10:25 PM, Andy Shevchenko
>>  wrote:
>>> On Thu, Nov 16, 2017 at 3:44 PM, Chris Chiu  wrote:
>
 +
 +struct acer_wireless_data {
 +   struct input_dev *idev;
>>>
 +   struct acpi_device *adev;
>>>
>>> Do you need this one?
>>> I suppose whenever you need struct device out of it you may find it
>>> via input->parent. Right?
>>>
>>
>> But I think it would make life easier to have the following line in
>> .notify callback
>> struct acer_wireless_data * data = acpi_driver_data(adev);
>>
>> I can get its parent acer_wireless_data and point to input_dev easier.
>
> Yes, though if I'm not mistaken, you may get rid completely of your
> custom container and use struct input_dev for accessing that via
> to_acpi_device(idev->dev.parent).
>
> It would be less code, less data structures, cleaner view.
> Right?
>
 +static int acer_wireless_remove(struct acpi_device *adev)
 +{
 +   return 0;
 +}
 +
>>>
>>> No need UI suppose.
>>>
>>
>> You mean remove this acer_wireless_remove which does nothing? Will do.
>
> Correct.
>
> --
> With Best Regards,
> Andy Shevchenko

I think I get your point. Will soon have a v2 patch later.


Re: [PATCH] MIPS: Fix CPS SMP NS16550 UART defaults

2017-11-20 Thread Guenter Roeck

On 11/20/2017 04:02 PM, James Hogan wrote:

From: James Hogan 

The MIPS_CPS_NS16550_BASE and MIPS_CPS_NS16550_SHIFT options have no
defaults for non-Malta platforms which select SYS_SUPPORTS_MIPS_CPS
(i.e. the pistachio and generic platforms). This is problematic for
automated allyesconfig and allmodconfig builds based on these platforms,
since make silentoldconfig tries to ask the user for values, and
especially since v4.15 where the default platform was switched to
generic.

Default these options to 0 and arrange for MIPS_CPS_NS16550 to be no
when using that default base address, so that the option only has an
effect when the default is provided (i.e. Malta) or when a value is
provided by the user.

Fixes: 609cf6f2291a ("MIPS: CPS: Early debug using an ns16550-compatible UART")
Signed-off-by: James Hogan 
Reviewed-by: Paul Burton 
Cc: Ralf Baechle 
Cc: Guenter Roeck 
Cc: linux-m...@linux-mips.org
---
Guenter: I'm guessing this is the problem you're referring to.


Yes.

Tested-by: Guenter Roeck 

mips:allmodconfig still fails to build with this patch applied, but elsewhere.

In file included from 
/opt/buildbot/slave/hwmon-testing/build/include/linux/bcma/bcma.h:10:0,
 from 
/opt/buildbot/slave/hwmon-testing/build/drivers/bcma/bcma_private.h:9,
 from 
/opt/buildbot/slave/hwmon-testing/build/drivers/bcma/main.c:8:
/opt/buildbot/slave/hwmon-testing/build/include/linux/bcma/bcma_driver_pci.h:218:24:
 error: field 'pci_controller' has incomplete type

Guenter


Re: [PATCH] MIPS: Fix CPS SMP NS16550 UART defaults

2017-11-20 Thread Guenter Roeck

On 11/20/2017 04:02 PM, James Hogan wrote:

From: James Hogan 

The MIPS_CPS_NS16550_BASE and MIPS_CPS_NS16550_SHIFT options have no
defaults for non-Malta platforms which select SYS_SUPPORTS_MIPS_CPS
(i.e. the pistachio and generic platforms). This is problematic for
automated allyesconfig and allmodconfig builds based on these platforms,
since make silentoldconfig tries to ask the user for values, and
especially since v4.15 where the default platform was switched to
generic.

Default these options to 0 and arrange for MIPS_CPS_NS16550 to be no
when using that default base address, so that the option only has an
effect when the default is provided (i.e. Malta) or when a value is
provided by the user.

Fixes: 609cf6f2291a ("MIPS: CPS: Early debug using an ns16550-compatible UART")
Signed-off-by: James Hogan 
Reviewed-by: Paul Burton 
Cc: Ralf Baechle 
Cc: Guenter Roeck 
Cc: linux-m...@linux-mips.org
---
Guenter: I'm guessing this is the problem you're referring to.


Yes.

Tested-by: Guenter Roeck 

mips:allmodconfig still fails to build with this patch applied, but elsewhere.

In file included from 
/opt/buildbot/slave/hwmon-testing/build/include/linux/bcma/bcma.h:10:0,
 from 
/opt/buildbot/slave/hwmon-testing/build/drivers/bcma/bcma_private.h:9,
 from 
/opt/buildbot/slave/hwmon-testing/build/drivers/bcma/main.c:8:
/opt/buildbot/slave/hwmon-testing/build/include/linux/bcma/bcma_driver_pci.h:218:24:
 error: field 'pci_controller' has incomplete type

Guenter


Re: [PATCH] cpufreq: mediatek: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

2017-11-20 Thread Viresh Kumar
On 20-11-17, 13:32, Jesse Chan wrote:
> This change resolves a new compile-time warning
> when built as a loadable module:
> 
> WARNING: modpost: missing MODULE_LICENSE() in 
> drivers/cpufreq/mediatek-cpufreq.o
> see include/linux/module.h for more information
> 
> This adds the license as "GPL v2", which matches the header of the file.
> 
> MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
> 
> Signed-off-by: Jesse Chan 
> ---
>  drivers/cpufreq/mediatek-cpufreq.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/cpufreq/mediatek-cpufreq.c 
> b/drivers/cpufreq/mediatek-cpufreq.c
> index 18c4bd9a5c65..e0d5090b303d 100644
> --- a/drivers/cpufreq/mediatek-cpufreq.c
> +++ b/drivers/cpufreq/mediatek-cpufreq.c
> @@ -620,3 +620,7 @@ static int __init mtk_cpufreq_driver_init(void)
>   return 0;
>  }
>  device_initcall(mtk_cpufreq_driver_init);
> +
> +MODULE_DESCRIPTION("MediaTek CPUFreq driver");
> +MODULE_AUTHOR("Pi-Cheng Chen ");
> +MODULE_LICENSE("GPL v2");

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH] cpufreq: mediatek: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

2017-11-20 Thread Viresh Kumar
On 20-11-17, 13:32, Jesse Chan wrote:
> This change resolves a new compile-time warning
> when built as a loadable module:
> 
> WARNING: modpost: missing MODULE_LICENSE() in 
> drivers/cpufreq/mediatek-cpufreq.o
> see include/linux/module.h for more information
> 
> This adds the license as "GPL v2", which matches the header of the file.
> 
> MODULE_DESCRIPTION and MODULE_AUTHOR are also added.
> 
> Signed-off-by: Jesse Chan 
> ---
>  drivers/cpufreq/mediatek-cpufreq.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/cpufreq/mediatek-cpufreq.c 
> b/drivers/cpufreq/mediatek-cpufreq.c
> index 18c4bd9a5c65..e0d5090b303d 100644
> --- a/drivers/cpufreq/mediatek-cpufreq.c
> +++ b/drivers/cpufreq/mediatek-cpufreq.c
> @@ -620,3 +620,7 @@ static int __init mtk_cpufreq_driver_init(void)
>   return 0;
>  }
>  device_initcall(mtk_cpufreq_driver_init);
> +
> +MODULE_DESCRIPTION("MediaTek CPUFreq driver");
> +MODULE_AUTHOR("Pi-Cheng Chen ");
> +MODULE_LICENSE("GPL v2");

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH] bfa: remove unused pointer 'port'

2017-11-20 Thread Martin K. Petersen

Colin,

> The pointer 'port' is being assigned but it is never read, hence it is
> redundant and can be removed. Cleans up clang warning:
>
> drivers/scsi/bfa/bfad_attr.c:505:2: warning: Value stored to 'port'
> is never read

Applied to 4.16/scsi-queue. Thanks, Colin!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] bfa: remove unused pointer 'port'

2017-11-20 Thread Martin K. Petersen

Colin,

> The pointer 'port' is being assigned but it is never read, hence it is
> redundant and can be removed. Cleans up clang warning:
>
> drivers/scsi/bfa/bfad_attr.c:505:2: warning: Value stored to 'port'
> is never read

Applied to 4.16/scsi-queue. Thanks, Colin!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: aacraid: remove unused variable managed_request_id

2017-11-20 Thread Martin K. Petersen

Colin,

> Variable managed_request_id is being assigned but it is never read,
> hence it is redundant and can be removed. Cleans up clang warning:

Applied to 4.16/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: aacraid: remove unused variable managed_request_id

2017-11-20 Thread Martin K. Petersen

Colin,

> Variable managed_request_id is being assigned but it is never read,
> hence it is redundant and can be removed. Cleans up clang warning:

Applied to 4.16/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper

2017-11-20 Thread Mike Galbraith
On Mon, 2017-11-20 at 16:33 -0500, Mikulas Patocka wrote:
> 
> Is there some specific scenario where you need to call 
> blk_schedule_flush_plug from rt_spin_lock_fastlock?

Excellent question.  What's the difference between not getting IO
started because you meet a mutex with an rt_mutex under the hood, and
not getting IO started because you meet a spinlock with an rt_mutex
under the hood?  If just doing the mutex side puts this thing back to
sleep, I'm happy.

-Mike


Re: [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper

2017-11-20 Thread Mike Galbraith
On Mon, 2017-11-20 at 16:33 -0500, Mikulas Patocka wrote:
> 
> Is there some specific scenario where you need to call 
> blk_schedule_flush_plug from rt_spin_lock_fastlock?

Excellent question.  What's the difference between not getting IO
started because you meet a mutex with an rt_mutex under the hood, and
not getting IO started because you meet a spinlock with an rt_mutex
under the hood?  If just doing the mutex side puts this thing back to
sleep, I'm happy.

-Mike


Re: [PATCH 13/16] x86/entry/64: Create a percpu SYSCALL entry trampoline

2017-11-20 Thread Andy Lutomirski
On Mon, Nov 20, 2017 at 6:34 PM, Josh Poimboeuf  wrote:
> On Mon, Nov 20, 2017 at 09:07:44AM -0800, Andy Lutomirski wrote:
>> + /* Save RDI, since we need a scratch register. */
>> + pushq   %rdi
>> +
>> + /*
>> +  * x86 lacks a near absolute jump, and we can't jump to the real
>> +  * entry text with a relative jump, so we use a double trampoline.
>> +  */
>> + movq$entry_SYSCALL_64_stage2, %rdi
>> + jmp *%rdi
>> +END(entry_SYSCALL_64_trampoline)
>> +
>> + .popsection
>> +
>> +ENTRY(entry_SYSCALL_64_stage2)
>> + /*
>> +  * Rather than polluting the normal SYSCALL path with stack switching
>> +  * nonsense, fix up our register state to match its expectations.
>> +  */
>> + UNWIND_HINT_EMPTY
>> + popq %rdi
>> + jmp entry_SYSCALL_64_after_hwframe
>> +END(entry_SYSCALL_64_stage2)
>
> Is there a reason why you couldn't just do the following?
>
> pushq   $entry_SYSCALL_64_after_hwframe
> ret
>
> Then you wouldn't need the 2nd trampoline and the %rdi clobber.
>

Good suggestion.  Thanks!

> --
> Josh


Re: [PATCH 13/16] x86/entry/64: Create a percpu SYSCALL entry trampoline

2017-11-20 Thread Andy Lutomirski
On Mon, Nov 20, 2017 at 6:34 PM, Josh Poimboeuf  wrote:
> On Mon, Nov 20, 2017 at 09:07:44AM -0800, Andy Lutomirski wrote:
>> + /* Save RDI, since we need a scratch register. */
>> + pushq   %rdi
>> +
>> + /*
>> +  * x86 lacks a near absolute jump, and we can't jump to the real
>> +  * entry text with a relative jump, so we use a double trampoline.
>> +  */
>> + movq$entry_SYSCALL_64_stage2, %rdi
>> + jmp *%rdi
>> +END(entry_SYSCALL_64_trampoline)
>> +
>> + .popsection
>> +
>> +ENTRY(entry_SYSCALL_64_stage2)
>> + /*
>> +  * Rather than polluting the normal SYSCALL path with stack switching
>> +  * nonsense, fix up our register state to match its expectations.
>> +  */
>> + UNWIND_HINT_EMPTY
>> + popq %rdi
>> + jmp entry_SYSCALL_64_after_hwframe
>> +END(entry_SYSCALL_64_stage2)
>
> Is there a reason why you couldn't just do the following?
>
> pushq   $entry_SYSCALL_64_after_hwframe
> ret
>
> Then you wouldn't need the 2nd trampoline and the %rdi clobber.
>

Good suggestion.  Thanks!

> --
> Josh


Re: Improving documentation of parent-ID field in /proc/PID/mountinfo

2017-11-20 Thread Ram Pai
On Mon, Nov 20, 2017 at 10:07:29AM +0100, Michael Kerrisk (man-pages) wrote:
> Hi Miklos,
> 
> Sorry for the slow follow-up.
> 
> On 14 November 2017 at 17:16, Miklos Szeredi  wrote:
> > On Tue, Nov 14, 2017 at 8:08 AM, Michael Kerrisk (man-pages)
> >  wrote:
> >> Hi Miklos, Ram
> >>
> >> Thanks for your comments. A question below.
> >>
> >> On 13 November 2017 at 09:11, Miklos Szeredi  wrote:
> >>> On Mon, Nov 13, 2017 at 8:55 AM, Ram Pai  wrote:
>  On Mon, Nov 13, 2017 at 07:02:21AM +0100, Michael Kerrisk (man-pages) 
>  wrote:
> > Hello Ram,
> >
> > Long ago (2.6.29) you added the /proc/PID/mountinfo file and
> > associated documentation in Documentation/filesystems/proc.txt. Later,
> > I pasted much of that documentation into the proc(5) manual page.
> >
> > That documentation says of the second field in the file:
> >
> > [[
> > This file contains lines of the form:
> >
> > 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root 
> > rw,errors=continue
> > (1)(2)(3)   (4)   (5)  (6)  (7)   (8) (9)   (10) (11)
> >
> > (1) mount ID:  unique identifier of the mount (may be reused after 
> > umount)
> > (2) parent ID:  ID of parent (or of self for the top of the mount tree)
> > ...
> > ]]
> >
> > The last piece of the description of field (2) doesn't seem to be
> > correct, or is at least rather unclear. I take this to be saying that
> > that for the root mount point, /, field (2) will have the same value
> > as field (1). I never actually looked at this detail closely, but
> > Alexander pointed out that this is obviously not so, as one can
> > immediately verify:
> >
> > $ grep '/ / ' /proc/$$/mountinfo
> > 65 0 8:2 / / rw,relatime shared:1 - ext4 /dev/sda2 
> > rw,seclabel,data=order
> >
> > I dug around in the kernel source for a bit. I do not have an exact
> > handle on the details, but I can see roughly what is going on.
> > Internally, there seems to be one ("hidden") mount ID reserved to each
> > mount namespace, and that ID is the parent of the root mount point.
> >
> > Looking through the (4.14) kernel source, mount IDs are allocated by
> > mnt_alloc_id() (in fs/namespace.c), which is in turn called by
> > alloc_vfsmnt() which is in turn called by clone_mnt().
> >
> > A new mount namespace is created by the kernel function copy_mnt_ns()
> > (in fs/namespace.c, called by create_new_namespaces() in
> > kernel/nsproxy.c). The copy_mnt_ns() function calls copy_tree() (in
> > fs/namespace.c), and copy_tree() calls clone_mnt() in *two* places.
> > The first of these is the call that creates the "hidden" mount ID that
> > becomes the parent of the root mount point. (I verified this by
> > instrumenting the kernel with a few printk() calls to display the
> > IDs.)  The second place where copy_tree() calls clone_mnt() is in a
> > loop that replicates each of the mount points (including the root
> > mount point) in the source mount namespace.
> 
>  We used to report that mount, ones upon a time.  Something has changed
>  the behavior since then and its not reported any more, thus making it
>  hidden.
> >>>
> >>> The hidden one is the initramfs, I believe.  That's the root of the
> >>> mount namespace, and the when a namespace is cloned, the tree is
> >>> copied from the namespace root.
> >>>
> >>> It is "hidden" because no process has its root there.  Note the
> >>> difference between namespace root and process root: the first is the
> >>> real root of the mount tree and is unchangeable, the second is
> >>> pointing to some place in a mount tree and can be changed (chroot).
> >>>
> >>> So there's nothing special in this rootfs, it is just hidden because
> >>> it's not the root of any task.
> >>>
> >>> The description of  field (2) is correct, it just does not make it
> >>> clear what it means by "root".
> >>
> >> Sorry -- do you mean the old description is correct, or my new
> >> description (below)?
> >
> > Well, both are correct, yours just describes the same thing at the
> > higher level.  But I think rootfs is an implementation detail, so is
> > the fact that it gets a zero mount ID, so I think the original
> > description better captures the essence of the interface.  Except it
> > needs to clarify what "top of the mount tree" means.  It doesn't mean
> > current process's root, rather it means the root of the mount tree in
> > the current mount namespace.
> 
> Thanks for the further info.
> 
> But, the problem is that the existing description is at best misleading:
> 
>   (2)  parent  ID: the ID of the parent mount (or of self for
>the top of the mount tree).
> 
> That implies that we'll find one line in the list where field 1 and
> field 2 

Re: Improving documentation of parent-ID field in /proc/PID/mountinfo

2017-11-20 Thread Ram Pai
On Mon, Nov 20, 2017 at 10:07:29AM +0100, Michael Kerrisk (man-pages) wrote:
> Hi Miklos,
> 
> Sorry for the slow follow-up.
> 
> On 14 November 2017 at 17:16, Miklos Szeredi  wrote:
> > On Tue, Nov 14, 2017 at 8:08 AM, Michael Kerrisk (man-pages)
> >  wrote:
> >> Hi Miklos, Ram
> >>
> >> Thanks for your comments. A question below.
> >>
> >> On 13 November 2017 at 09:11, Miklos Szeredi  wrote:
> >>> On Mon, Nov 13, 2017 at 8:55 AM, Ram Pai  wrote:
>  On Mon, Nov 13, 2017 at 07:02:21AM +0100, Michael Kerrisk (man-pages) 
>  wrote:
> > Hello Ram,
> >
> > Long ago (2.6.29) you added the /proc/PID/mountinfo file and
> > associated documentation in Documentation/filesystems/proc.txt. Later,
> > I pasted much of that documentation into the proc(5) manual page.
> >
> > That documentation says of the second field in the file:
> >
> > [[
> > This file contains lines of the form:
> >
> > 36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root 
> > rw,errors=continue
> > (1)(2)(3)   (4)   (5)  (6)  (7)   (8) (9)   (10) (11)
> >
> > (1) mount ID:  unique identifier of the mount (may be reused after 
> > umount)
> > (2) parent ID:  ID of parent (or of self for the top of the mount tree)
> > ...
> > ]]
> >
> > The last piece of the description of field (2) doesn't seem to be
> > correct, or is at least rather unclear. I take this to be saying that
> > that for the root mount point, /, field (2) will have the same value
> > as field (1). I never actually looked at this detail closely, but
> > Alexander pointed out that this is obviously not so, as one can
> > immediately verify:
> >
> > $ grep '/ / ' /proc/$$/mountinfo
> > 65 0 8:2 / / rw,relatime shared:1 - ext4 /dev/sda2 
> > rw,seclabel,data=order
> >
> > I dug around in the kernel source for a bit. I do not have an exact
> > handle on the details, but I can see roughly what is going on.
> > Internally, there seems to be one ("hidden") mount ID reserved to each
> > mount namespace, and that ID is the parent of the root mount point.
> >
> > Looking through the (4.14) kernel source, mount IDs are allocated by
> > mnt_alloc_id() (in fs/namespace.c), which is in turn called by
> > alloc_vfsmnt() which is in turn called by clone_mnt().
> >
> > A new mount namespace is created by the kernel function copy_mnt_ns()
> > (in fs/namespace.c, called by create_new_namespaces() in
> > kernel/nsproxy.c). The copy_mnt_ns() function calls copy_tree() (in
> > fs/namespace.c), and copy_tree() calls clone_mnt() in *two* places.
> > The first of these is the call that creates the "hidden" mount ID that
> > becomes the parent of the root mount point. (I verified this by
> > instrumenting the kernel with a few printk() calls to display the
> > IDs.)  The second place where copy_tree() calls clone_mnt() is in a
> > loop that replicates each of the mount points (including the root
> > mount point) in the source mount namespace.
> 
>  We used to report that mount, ones upon a time.  Something has changed
>  the behavior since then and its not reported any more, thus making it
>  hidden.
> >>>
> >>> The hidden one is the initramfs, I believe.  That's the root of the
> >>> mount namespace, and the when a namespace is cloned, the tree is
> >>> copied from the namespace root.
> >>>
> >>> It is "hidden" because no process has its root there.  Note the
> >>> difference between namespace root and process root: the first is the
> >>> real root of the mount tree and is unchangeable, the second is
> >>> pointing to some place in a mount tree and can be changed (chroot).
> >>>
> >>> So there's nothing special in this rootfs, it is just hidden because
> >>> it's not the root of any task.
> >>>
> >>> The description of  field (2) is correct, it just does not make it
> >>> clear what it means by "root".
> >>
> >> Sorry -- do you mean the old description is correct, or my new
> >> description (below)?
> >
> > Well, both are correct, yours just describes the same thing at the
> > higher level.  But I think rootfs is an implementation detail, so is
> > the fact that it gets a zero mount ID, so I think the original
> > description better captures the essence of the interface.  Except it
> > needs to clarify what "top of the mount tree" means.  It doesn't mean
> > current process's root, rather it means the root of the mount tree in
> > the current mount namespace.
> 
> Thanks for the further info.
> 
> But, the problem is that the existing description is at best misleading:
> 
>   (2)  parent  ID: the ID of the parent mount (or of self for
>the top of the mount tree).
> 
> That implies that we'll find one line in the list where field 1 and
> field 2 are the same. But we don't, because the mountns rootfs entry
> is not shown in 

Re: [PATCH] pinctrl: intel: Initialize GPIO properly when used through irqchip

2017-11-20 Thread Chris Chiu
On Mon, Nov 20, 2017 at 11:19 PM, Mika Westerberg
 wrote:
> When a GPIO is requested using gpiod_get_* APIs the intel pinctrl driver
> switches the pin to GPIO mode and makes sure interrupts are routed to
> the GPIO hardware instead of IOAPIC. However, if the GPIO is used
> directly through irqchip, as is the case with many I2C-HID devices where
> I2C core automatically configures interrupt for the device, the pin is
> not initialized as GPIO. Instead we rely that the BIOS configures the
> pin accordingly which seems not to be the case at least in Asus X540NA
> SKU3 with Focaltech touchpad.
>
> When the pin is not properly configured it might result weird behaviour
> like interrupts suddenly stop firing completely and the touchpad stops
> responding to user input.
>
> Fix this by properly initializing the pin to GPIO mode also when it is
> used directly through irqchip.
>
> Reported-by: Daniel Drake 
> Reported-by: Chris Chiu 
> Signed-off-by: Mika Westerberg 
> ---
> Chris, Daniel,
>
> Could you check that this still solves the issue and maybe provide your
> Tested-by? Thanks!
>

I've verified on X540NA here and it's working fine w/o any abnormal touchpad
stop.

Tested-by: Chris Chiu 


Re: [PATCH] pinctrl: intel: Initialize GPIO properly when used through irqchip

2017-11-20 Thread Chris Chiu
On Mon, Nov 20, 2017 at 11:19 PM, Mika Westerberg
 wrote:
> When a GPIO is requested using gpiod_get_* APIs the intel pinctrl driver
> switches the pin to GPIO mode and makes sure interrupts are routed to
> the GPIO hardware instead of IOAPIC. However, if the GPIO is used
> directly through irqchip, as is the case with many I2C-HID devices where
> I2C core automatically configures interrupt for the device, the pin is
> not initialized as GPIO. Instead we rely that the BIOS configures the
> pin accordingly which seems not to be the case at least in Asus X540NA
> SKU3 with Focaltech touchpad.
>
> When the pin is not properly configured it might result weird behaviour
> like interrupts suddenly stop firing completely and the touchpad stops
> responding to user input.
>
> Fix this by properly initializing the pin to GPIO mode also when it is
> used directly through irqchip.
>
> Reported-by: Daniel Drake 
> Reported-by: Chris Chiu 
> Signed-off-by: Mika Westerberg 
> ---
> Chris, Daniel,
>
> Could you check that this still solves the issue and maybe provide your
> Tested-by? Thanks!
>

I've verified on X540NA here and it's working fine w/o any abnormal touchpad
stop.

Tested-by: Chris Chiu 


Re: [PATCH] scsi: csiostor: remove unneeded DRIVER_LICENSE #define

2017-11-20 Thread Martin K. Petersen

Greg,

> There is no need to #define the license of the driver, just put it in
> the MODULE_LICENSE() line directly as a text string.
>
> This allows tools that check that the module license matches the source
> code license to work properly, as there is no need to unwind the
> unneeded dereference, especially when the string is defined in a .h file
> far away from the .c file it is used in.

Applied to 4.16/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: csiostor: remove unneeded DRIVER_LICENSE #define

2017-11-20 Thread Martin K. Petersen

Greg,

> There is no need to #define the license of the driver, just put it in
> the MODULE_LICENSE() line directly as a text string.
>
> This allows tools that check that the module license matches the source
> code license to work properly, as there is no need to unwind the
> unneeded dereference, especially when the string is defined in a .h file
> far away from the .c file it is used in.

Applied to 4.16/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 3/3] scsi: 3w-9xxx: rework lock timeouts

2017-11-20 Thread Martin K. Petersen

Arnd,

> The TW_IOCTL_GET_LOCK ioctl uses do_gettimeofday() to check whether a
> lock has expired. This can misbehave due to a concurrent
> settimeofday() call, as it is based on 'real' time, and it will
> overflow in y2038 on 32-bit architectures, producing unexpected
> results when used across the overflow time.
>
> This changes it to using monotonic time, using ktime_get() to simplify
> the code.

Applied to 4.16/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 3/3] scsi: 3w-9xxx: rework lock timeouts

2017-11-20 Thread Martin K. Petersen

Arnd,

> The TW_IOCTL_GET_LOCK ioctl uses do_gettimeofday() to check whether a
> lock has expired. This can misbehave due to a concurrent
> settimeofday() call, as it is based on 'real' time, and it will
> overflow in y2038 on 32-bit architectures, producing unexpected
> results when used across the overflow time.
>
> This changes it to using monotonic time, using ktime_get() to simplify
> the code.

Applied to 4.16/scsi-queue, thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 2/3] scsi: 3ware: use 64-bit times for FW time sync

2017-11-20 Thread Martin K. Petersen

Arnd,

> The calculation of the number of seconds since Sunday 00:00:00
> overflows in 2106, meaning that we instead will return the seconds
> since Wednesday 06:28:16 afterwards.
>
> Using 64-bit time stamps avoids this slight inconsistency, and the
> deprecated do_gettimeofday(), replacing it with the simpler
> ktime_get_real_seconds().

Applied to 4.16/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 2/3] scsi: 3ware: use 64-bit times for FW time sync

2017-11-20 Thread Martin K. Petersen

Arnd,

> The calculation of the number of seconds since Sunday 00:00:00
> overflows in 2106, meaning that we instead will return the seconds
> since Wednesday 06:28:16 afterwards.
>
> Using 64-bit time stamps avoids this slight inconsistency, and the
> deprecated do_gettimeofday(), replacing it with the simpler
> ktime_get_real_seconds().

Applied to 4.16/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


[PATCH v3] x86/umip: Warn if UMIP-protected instructions are used

2017-11-20 Thread Ricardo Neri
Print a rate-limited warning when a user space program attempts to execute
any of the instructions that UMIP protects (i.e., SGDT, SIDT, SLDT, STR
and SMSW).

This is useful because, when CONFIG_X86_INTEL_UMIP is selected and
supported by the hardware, user space programs that try to execute such
instructions will receive a SIGSEGV signal that they might not expect.
In the specific cases for which emulation is provided (instructions SGDT,
SIDT and SMSW in protected and virtual-8086 modes), no signal is
generated. However, a warning is helpful to encourage updates in such
programs to avoid the use of such instructions.

Warnings are printed via a customized printk() function that also provides
information about the program that attempted to use the affected
instructions.

Utility macros are defined to wrap umip_printk() for the error and warning
kernel log levels.

While here, replace an existing call to the generic rate-limited pr_err()
with the new umip_pr_err().

Cc: Andy Lutomirski 
Cc: H. Peter Anvin 
Cc: Borislav Petkov 
Cc: Tony Luck 
Cc: Paolo Bonzini 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Suggested-by: Linus Torvalds 
Signed-off-by: Ricardo Neri 
---
Changes since V2:
* Dropped patches 1, 2, and 3 as they were applied on the tip repository.
* Renamed umip_pr_warn() to umip_printk(). Instead, added utility macros
  for the error and warning kernel log levels.
* Implemented umip_printk() as a variadic function. This also includes
  using the __printf() macro to type-check arguments of the format
  string and arguments.
* Capitalized all the instructions' mnemonics in strings used for printing
  warnings.

Changes since V1:
* Capitalized all the instructions' mnemonics in both code and patch
  descriptions.
* Corrected documentation of umip_pr_warn() to correctly reflect the function
  name.
* Updated description of patch #4 to describe the update to the existing
  rate-limited pr_err().
---
 arch/x86/kernel/umip.c | 62 ++
 1 file changed, 58 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 1f1f2d5..dabbac3 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -82,6 +82,57 @@
 #defineUMIP_INST_SLDT  3   /* 0F 00 /0 */
 #defineUMIP_INST_STR   4   /* 0F 00 /1 */
 
+const char * const umip_insns[5] = {
+   [UMIP_INST_SGDT] = "SGDT",
+   [UMIP_INST_SIDT] = "SIDT",
+   [UMIP_INST_SMSW] = "SMSW",
+   [UMIP_INST_SLDT] = "SLDT",
+   [UMIP_INST_STR] = "STR",
+};
+
+#define umip_pr_err(regs, fmt, ...) \
+   umip_printk(regs, KERN_ERR, fmt, ##__VA_ARGS__)
+#define umip_pr_warning(regs, fmt, ...) \
+   umip_printk(regs, KERN_WARNING, fmt,  ##__VA_ARGS__)
+
+/**
+ * umip_printk() - Print a rate-limited message
+ * @regs:  Register set with the context in which the warning is printed
+ * @log_level: Kernel log level to print the message
+ * @fmt:   The text string to print
+ *
+ * Print the text contained in @fmt. The print rate is limited to bursts of 5
+ * messages every two minutes. The purpose of this customized version of
+ * printk() is to print messages when user space processes use any of the
+ * UMIP-protected instructions. Thus, the printed text is prepended with the
+ * task name and process ID number of the current task as well as the
+ * instruction and stack pointers in @regs as seen when entering kernel mode.
+ *
+ * Returns:
+ *
+ * None.
+ */
+static __printf(3, 4)
+void umip_printk(const struct pt_regs *regs, const char *log_level,
+const char *fmt, ...)
+{
+   /* Bursts of 5 messages every two minutes */
+   static DEFINE_RATELIMIT_STATE(ratelimit, 2 * 60 * HZ, 5);
+   struct task_struct *tsk = current;
+   struct va_format vaf;
+   va_list args;
+
+   if (!__ratelimit())
+   return;
+
+   va_start(args, fmt);
+   vaf.fmt = fmt;
+   vaf.va = 
+   printk("%s" pr_fmt("%s[%d] ip:%lx sp:%lx: %pV"), log_level, tsk->comm,
+  task_pid_nr(tsk), regs->ip, regs->sp, );
+   va_end(args);
+}
+
 /**
  * identify_insn() - Identify a UMIP-protected instruction
  * @insn:  Instruction structure with opcode and ModRM byte.
@@ -236,10 +287,8 @@ static void force_sig_info_umip_fault(void __user *addr, 
struct pt_regs *regs)
if (!(show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)))
return;
 
-   pr_err_ratelimited("%s[%d] umip emulation segfault ip:%lx sp:%lx 
error:%x in %lx\n",
-  tsk->comm, task_pid_nr(tsk), regs->ip,
-  regs->sp, X86_PF_USER | X86_PF_WRITE,
-  regs->ip);
+   umip_pr_err(regs, "segfault in emulation. error%x\n",
+   X86_PF_USER | 

[PATCH v3] x86/umip: Warn if UMIP-protected instructions are used

2017-11-20 Thread Ricardo Neri
Print a rate-limited warning when a user space program attempts to execute
any of the instructions that UMIP protects (i.e., SGDT, SIDT, SLDT, STR
and SMSW).

This is useful because, when CONFIG_X86_INTEL_UMIP is selected and
supported by the hardware, user space programs that try to execute such
instructions will receive a SIGSEGV signal that they might not expect.
In the specific cases for which emulation is provided (instructions SGDT,
SIDT and SMSW in protected and virtual-8086 modes), no signal is
generated. However, a warning is helpful to encourage updates in such
programs to avoid the use of such instructions.

Warnings are printed via a customized printk() function that also provides
information about the program that attempted to use the affected
instructions.

Utility macros are defined to wrap umip_printk() for the error and warning
kernel log levels.

While here, replace an existing call to the generic rate-limited pr_err()
with the new umip_pr_err().

Cc: Andy Lutomirski 
Cc: H. Peter Anvin 
Cc: Borislav Petkov 
Cc: Tony Luck 
Cc: Paolo Bonzini 
Cc: Ravi V. Shankar 
Cc: x...@kernel.org
Suggested-by: Linus Torvalds 
Signed-off-by: Ricardo Neri 
---
Changes since V2:
* Dropped patches 1, 2, and 3 as they were applied on the tip repository.
* Renamed umip_pr_warn() to umip_printk(). Instead, added utility macros
  for the error and warning kernel log levels.
* Implemented umip_printk() as a variadic function. This also includes
  using the __printf() macro to type-check arguments of the format
  string and arguments.
* Capitalized all the instructions' mnemonics in strings used for printing
  warnings.

Changes since V1:
* Capitalized all the instructions' mnemonics in both code and patch
  descriptions.
* Corrected documentation of umip_pr_warn() to correctly reflect the function
  name.
* Updated description of patch #4 to describe the update to the existing
  rate-limited pr_err().
---
 arch/x86/kernel/umip.c | 62 ++
 1 file changed, 58 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 1f1f2d5..dabbac3 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -82,6 +82,57 @@
 #defineUMIP_INST_SLDT  3   /* 0F 00 /0 */
 #defineUMIP_INST_STR   4   /* 0F 00 /1 */
 
+const char * const umip_insns[5] = {
+   [UMIP_INST_SGDT] = "SGDT",
+   [UMIP_INST_SIDT] = "SIDT",
+   [UMIP_INST_SMSW] = "SMSW",
+   [UMIP_INST_SLDT] = "SLDT",
+   [UMIP_INST_STR] = "STR",
+};
+
+#define umip_pr_err(regs, fmt, ...) \
+   umip_printk(regs, KERN_ERR, fmt, ##__VA_ARGS__)
+#define umip_pr_warning(regs, fmt, ...) \
+   umip_printk(regs, KERN_WARNING, fmt,  ##__VA_ARGS__)
+
+/**
+ * umip_printk() - Print a rate-limited message
+ * @regs:  Register set with the context in which the warning is printed
+ * @log_level: Kernel log level to print the message
+ * @fmt:   The text string to print
+ *
+ * Print the text contained in @fmt. The print rate is limited to bursts of 5
+ * messages every two minutes. The purpose of this customized version of
+ * printk() is to print messages when user space processes use any of the
+ * UMIP-protected instructions. Thus, the printed text is prepended with the
+ * task name and process ID number of the current task as well as the
+ * instruction and stack pointers in @regs as seen when entering kernel mode.
+ *
+ * Returns:
+ *
+ * None.
+ */
+static __printf(3, 4)
+void umip_printk(const struct pt_regs *regs, const char *log_level,
+const char *fmt, ...)
+{
+   /* Bursts of 5 messages every two minutes */
+   static DEFINE_RATELIMIT_STATE(ratelimit, 2 * 60 * HZ, 5);
+   struct task_struct *tsk = current;
+   struct va_format vaf;
+   va_list args;
+
+   if (!__ratelimit())
+   return;
+
+   va_start(args, fmt);
+   vaf.fmt = fmt;
+   vaf.va = 
+   printk("%s" pr_fmt("%s[%d] ip:%lx sp:%lx: %pV"), log_level, tsk->comm,
+  task_pid_nr(tsk), regs->ip, regs->sp, );
+   va_end(args);
+}
+
 /**
  * identify_insn() - Identify a UMIP-protected instruction
  * @insn:  Instruction structure with opcode and ModRM byte.
@@ -236,10 +287,8 @@ static void force_sig_info_umip_fault(void __user *addr, 
struct pt_regs *regs)
if (!(show_unhandled_signals && unhandled_signal(tsk, SIGSEGV)))
return;
 
-   pr_err_ratelimited("%s[%d] umip emulation segfault ip:%lx sp:%lx 
error:%x in %lx\n",
-  tsk->comm, task_pid_nr(tsk), regs->ip,
-  regs->sp, X86_PF_USER | X86_PF_WRITE,
-  regs->ip);
+   umip_pr_err(regs, "segfault in emulation. error%x\n",
+   X86_PF_USER | X86_PF_WRITE);
 }
 
 /**
@@ -326,10 +375,15 @@ bool fixup_umip_exception(struct pt_regs *regs)
if (umip_inst < 0)
return false;
 
+   umip_pr_warning(regs, "%s 

Re: [PATCH 1/3] scsi: 3ware: fix 32-bit time calculations

2017-11-20 Thread Martin K. Petersen

Arnd,

> twl_aen_queue_event/twa_aen_queue_event, we use do_gettimeofday()
> to read the lower 32 bits of the current time in seconds, to pass
> them to the TW_IOCTL_GET_NEXT_EVENT ioctl or the 3ware_aen_read
> sysfs file.
>
> This will overflow on all architectures in year 2106, there is
> not much we can do about that without breaking the ABI. User
> space has 90 years to learn to deal with it, so it's probably ok.
>
> I'm changing it to use ktime_get_real_seconds() with a comment
> to document what happens when.

Applied to 4.16/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 1/3] scsi: 3ware: fix 32-bit time calculations

2017-11-20 Thread Martin K. Petersen

Arnd,

> twl_aen_queue_event/twa_aen_queue_event, we use do_gettimeofday()
> to read the lower 32 bits of the current time in seconds, to pass
> them to the TW_IOCTL_GET_NEXT_EVENT ioctl or the 3ware_aen_read
> sysfs file.
>
> This will overflow on all architectures in year 2106, there is
> not much we can do about that without breaking the ABI. User
> space has 90 years to learn to deal with it, so it's probably ok.
>
> I'm changing it to use ktime_get_real_seconds() with a comment
> to document what happens when.

Applied to 4.16/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] [man-pages] adjtimex.2: document clock_adjtime

2017-11-20 Thread Richard Cochran
On Mon, Nov 20, 2017 at 11:53:02PM +0100, Arnd Bergmann wrote:
>  .B EINVAL
> +The
> +.I clk_id
> +specified is not supported on this system.

We return EINVAL when the clockid is not valid.  That can mean two
things.  Either the SYS-V style hard coded positive clockid is out of
range, or the dynamic negative clockid does not refer to a valid
instance of a clock object.

Dynamic clocks might also return ENODEV in case a hot-plugable device
(like USB) has disappeared after its character device was opened.

Thanks,
Richard


Re: [PATCH 0/7] scsi: bfa: do_gettimeofday removal

2017-11-20 Thread Martin K. Petersen

Arnd,

> The bfa driver is one of the main users of do_gettimeofday(), a
> function that I'm trying to remove as part of the y2038 cleanup.
>
> The timestamps are all uses in slightly different ways, so this has
> turned into a rather longish series for doing something that should be
> simple.
>
> The last patch in the series ("scsi: bfa: use 64-bit times in
> bfa_aen_entry_s ABI") is one that needs to be reviewed very carefully,
> and it can be skipped if the maintainers prefer to leave the 32-bit
> ABI unchanged, the rest are hopefully fairly straightforward.

Applied to 4.16/scsi-queue, thanks!

Will drop #7 if something breaks.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] [man-pages] adjtimex.2: document clock_adjtime

2017-11-20 Thread Richard Cochran
On Mon, Nov 20, 2017 at 11:53:02PM +0100, Arnd Bergmann wrote:
>  .B EINVAL
> +The
> +.I clk_id
> +specified is not supported on this system.

We return EINVAL when the clockid is not valid.  That can mean two
things.  Either the SYS-V style hard coded positive clockid is out of
range, or the dynamic negative clockid does not refer to a valid
instance of a clock object.

Dynamic clocks might also return ENODEV in case a hot-plugable device
(like USB) has disappeared after its character device was opened.

Thanks,
Richard


Re: [PATCH 0/7] scsi: bfa: do_gettimeofday removal

2017-11-20 Thread Martin K. Petersen

Arnd,

> The bfa driver is one of the main users of do_gettimeofday(), a
> function that I'm trying to remove as part of the y2038 cleanup.
>
> The timestamps are all uses in slightly different ways, so this has
> turned into a rather longish series for doing something that should be
> simple.
>
> The last patch in the series ("scsi: bfa: use 64-bit times in
> bfa_aen_entry_s ABI") is one that needs to be reviewed very carefully,
> and it can be skipped if the maintainers prefer to leave the 32-bit
> ABI unchanged, the rest are hopefully fairly straightforward.

Applied to 4.16/scsi-queue, thanks!

Will drop #7 if something breaks.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 0/13] scsi: arcmsr: add some driver options and support new adapter ARC-1884

2017-11-20 Thread Martin K. Petersen

Ching,

> The following patches apply to Martin's 4.15/scsi-queue.

Applied to 4.16/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 0/13] scsi: arcmsr: add some driver options and support new adapter ARC-1884

2017-11-20 Thread Martin K. Petersen

Ching,

> The following patches apply to Martin's 4.15/scsi-queue.

Applied to 4.16/scsi-queue. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 1/6] mmc: renesas_sdhc: remove eprobe jump label

2017-11-20 Thread Masahiro Yamada
2017-11-21 5:31 GMT+09:00 Wolfram Sang :
>
>> - ret = PTR_ERR(priv->clk);
>> - dev_err(>dev, "cannot get clock: %d\n", ret);
>> - goto eprobe;
>> + dev_err(>dev, "cannot get clock\n");
>> + return PTR_ERR(priv->clk);
>
> Why dropping the 'ret' printout? Will it be printed by the core?
>

No.
I just wanted to save "ret = PTR_ERR(priv->clk)" line.


I will restore the printout as follows:

priv->clk = devm_clk_get(>dev, NULL);
if (IS_ERR(priv->clk)) {
ret = PTR_ERR(priv->clk);
dev_err(>dev, "cannot get clock: %d\n", ret);
return ret;
}



-- 
Best Regards
Masahiro Yamada


Re: [PATCH 1/6] mmc: renesas_sdhc: remove eprobe jump label

2017-11-20 Thread Masahiro Yamada
2017-11-21 5:31 GMT+09:00 Wolfram Sang :
>
>> - ret = PTR_ERR(priv->clk);
>> - dev_err(>dev, "cannot get clock: %d\n", ret);
>> - goto eprobe;
>> + dev_err(>dev, "cannot get clock\n");
>> + return PTR_ERR(priv->clk);
>
> Why dropping the 'ret' printout? Will it be printed by the core?
>

No.
I just wanted to save "ret = PTR_ERR(priv->clk)" line.


I will restore the printout as follows:

priv->clk = devm_clk_get(>dev, NULL);
if (IS_ERR(priv->clk)) {
ret = PTR_ERR(priv->clk);
dev_err(>dev, "cannot get clock: %d\n", ret);
return ret;
}



-- 
Best Regards
Masahiro Yamada


linux-next: Tree for Nov 21

2017-11-20 Thread Stephen Rothwell
Hi all,

Please do not add any v4.16 material to your linux-next included trees
until v4.15-rc1 has been released.

Changes since 20171120:

I have removed the following trees since they have not been updated in
over a year:

backlight-fixes
berlin
binfmt_misc
borntraeger
cortex-m
dax-misc
dma-buf
drm-exynos
freevxfs
fscache
ftrace-fixes
h8300
hexagon
hwspinlock
ia64
jdelvare-hwmon
kgdb
luto-misc
mpc5xxx
omap-pending
omap_dss2
pcmcia
rpi
target-merge
unicore32
v9fs

The akpm-current tree still had its build failure for which I applied a
fix patch.

Non-merge commits (relative to Linus' tree): 807
 835 files changed, 14447 insertions(+), 7649 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 245 trees (counting Linus' and 40 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (c8a0739b185d Merge tag 'ntb-4.15' of 
git://github.com/jonmason/ntb)
Merging fixes/master (820bf5c419e4 Merge tag 'scsi-fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi)
Merging kbuild-current/fixes (bb3f38c3c5b7 kbuild: clang: fix build failures 
with sparse check)
Merging arc-current/for-curr (f3156851616b ARCv2: boot log: updates for HS48: 
dual-issue, ECC, Loop Buffer)
Merging arm-current/fixes (b9dd05c7002e ARM: 8720/1: ensure dump_instr() checks 
addr_limit)
Merging m68k-current/for-linus (5e387199c17c m68k/defconfig: Update defconfigs 
for v4.14-rc7)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (7ecb37f62fe5 powerpc/perf: Fix core-imc hotplug 
callback failure during imc initialization)
Merging sparc/master (1deab8ce2c91 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (32a72bbd5da2 net: vxge: Fix some indentation issues)
Merging ipsec/master (94802151894d Revert "xfrm: Fix stack-out-of-bounds read 
in xfrm_state_find.")
Merging netfilter/master (fbcd253d2448 netfilter: conntrack: lower timeout to 
RETRANS seconds if window is 0)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook 
mask only if set)
Merging wireless-drivers/master (1514f6fc1369 Merge tag 
'iwlwifi-for-kalle-2017-11-19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes)
Merging mac80211/master (33ddd81e2bd5 mac80211: properly free 
requested-but-not-started TX agg sessions)
Merging sound-current/for-linus (0ce48e172797 ALSA: hda/realtek: Add headset 
mic support for Intel NUC Skull Canyon)
Merging pci-current/for-linus (1b6115fbe3b3 Merge tag 'pci-v4.15-changes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci)
Merging driver-core.current/driver-core-linus (cf9b0772f2e4 Merge tag 
'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging tty.current/tty-linus (894025f24bd0 Merge tag 'usb-4.15-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb)
Merging usb.current/usb-linus (894025f24bd0 Merge tag 'usb-4.15-rc1' of 
git:

linux-next: Tree for Nov 21

2017-11-20 Thread Stephen Rothwell
Hi all,

Please do not add any v4.16 material to your linux-next included trees
until v4.15-rc1 has been released.

Changes since 20171120:

I have removed the following trees since they have not been updated in
over a year:

backlight-fixes
berlin
binfmt_misc
borntraeger
cortex-m
dax-misc
dma-buf
drm-exynos
freevxfs
fscache
ftrace-fixes
h8300
hexagon
hwspinlock
ia64
jdelvare-hwmon
kgdb
luto-misc
mpc5xxx
omap-pending
omap_dss2
pcmcia
rpi
target-merge
unicore32
v9fs

The akpm-current tree still had its build failure for which I applied a
fix patch.

Non-merge commits (relative to Linus' tree): 807
 835 files changed, 14447 insertions(+), 7649 deletions(-)



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" and checkout or reset to the new
master.

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log
files in the Next directory.  Between each merge, the tree was built
with a ppc64_defconfig for powerpc, an allmodconfig for x86_64, a
multi_v7_defconfig for arm and a native build of tools/perf. After
the final fixups (if any), I do an x86_64 modules_install followed by
builds for x86_64 allnoconfig, powerpc allnoconfig (32 and 64 bit),
ppc44x_defconfig, allyesconfig and pseries_le_defconfig and i386, sparc
and sparc64 defconfig. And finally, a simple boot test of the powerpc
pseries_le_defconfig kernel in qemu (with and without kvm enabled).

Below is a summary of the state of the merge.

I am currently merging 245 trees (counting Linus' and 40 trees of bug
fix patches pending for the current merge release).

Stats about the size of the tree over time can be seen at
http://neuling.org/linux-next-size.html .

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

-- 
Cheers,
Stephen Rothwell

$ git checkout master
$ git reset --hard stable
Merging origin/master (c8a0739b185d Merge tag 'ntb-4.15' of 
git://github.com/jonmason/ntb)
Merging fixes/master (820bf5c419e4 Merge tag 'scsi-fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi)
Merging kbuild-current/fixes (bb3f38c3c5b7 kbuild: clang: fix build failures 
with sparse check)
Merging arc-current/for-curr (f3156851616b ARCv2: boot log: updates for HS48: 
dual-issue, ECC, Loop Buffer)
Merging arm-current/fixes (b9dd05c7002e ARM: 8720/1: ensure dump_instr() checks 
addr_limit)
Merging m68k-current/for-linus (5e387199c17c m68k/defconfig: Update defconfigs 
for v4.14-rc7)
Merging metag-fixes/fixes (b884a190afce metag/usercopy: Add missing fixups)
Merging powerpc-fixes/fixes (7ecb37f62fe5 powerpc/perf: Fix core-imc hotplug 
callback failure during imc initialization)
Merging sparc/master (1deab8ce2c91 Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc)
Merging fscrypt-current/for-stable (42d97eb0ade3 fscrypt: fix renaming and 
linking special files)
Merging net/master (32a72bbd5da2 net: vxge: Fix some indentation issues)
Merging ipsec/master (94802151894d Revert "xfrm: Fix stack-out-of-bounds read 
in xfrm_state_find.")
Merging netfilter/master (fbcd253d2448 netfilter: conntrack: lower timeout to 
RETRANS seconds if window is 0)
Merging ipvs/master (f7fb77fc1235 netfilter: nft_compat: check extension hook 
mask only if set)
Merging wireless-drivers/master (1514f6fc1369 Merge tag 
'iwlwifi-for-kalle-2017-11-19' of 
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes)
Merging mac80211/master (33ddd81e2bd5 mac80211: properly free 
requested-but-not-started TX agg sessions)
Merging sound-current/for-linus (0ce48e172797 ALSA: hda/realtek: Add headset 
mic support for Intel NUC Skull Canyon)
Merging pci-current/for-linus (1b6115fbe3b3 Merge tag 'pci-v4.15-changes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci)
Merging driver-core.current/driver-core-linus (cf9b0772f2e4 Merge tag 
'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging tty.current/tty-linus (894025f24bd0 Merge tag 'usb-4.15-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb)
Merging usb.current/usb-linus (894025f24bd0 Merge tag 'usb-4.15-rc1' of 
git:

Re: [PATCH 13/16] x86/entry/64: Create a percpu SYSCALL entry trampoline

2017-11-20 Thread Josh Poimboeuf
On Mon, Nov 20, 2017 at 09:07:44AM -0800, Andy Lutomirski wrote:
> + /* Save RDI, since we need a scratch register. */
> + pushq   %rdi
> +
> + /*
> +  * x86 lacks a near absolute jump, and we can't jump to the real
> +  * entry text with a relative jump, so we use a double trampoline.
> +  */
> + movq$entry_SYSCALL_64_stage2, %rdi
> + jmp *%rdi
> +END(entry_SYSCALL_64_trampoline)
> +
> + .popsection
> +
> +ENTRY(entry_SYSCALL_64_stage2)
> + /*
> +  * Rather than polluting the normal SYSCALL path with stack switching
> +  * nonsense, fix up our register state to match its expectations.
> +  */
> + UNWIND_HINT_EMPTY
> + popq %rdi
> + jmp entry_SYSCALL_64_after_hwframe
> +END(entry_SYSCALL_64_stage2)

Is there a reason why you couldn't just do the following?

pushq   $entry_SYSCALL_64_after_hwframe
ret

Then you wouldn't need the 2nd trampoline and the %rdi clobber.

-- 
Josh


Re: [PATCH 13/16] x86/entry/64: Create a percpu SYSCALL entry trampoline

2017-11-20 Thread Josh Poimboeuf
On Mon, Nov 20, 2017 at 09:07:44AM -0800, Andy Lutomirski wrote:
> + /* Save RDI, since we need a scratch register. */
> + pushq   %rdi
> +
> + /*
> +  * x86 lacks a near absolute jump, and we can't jump to the real
> +  * entry text with a relative jump, so we use a double trampoline.
> +  */
> + movq$entry_SYSCALL_64_stage2, %rdi
> + jmp *%rdi
> +END(entry_SYSCALL_64_trampoline)
> +
> + .popsection
> +
> +ENTRY(entry_SYSCALL_64_stage2)
> + /*
> +  * Rather than polluting the normal SYSCALL path with stack switching
> +  * nonsense, fix up our register state to match its expectations.
> +  */
> + UNWIND_HINT_EMPTY
> + popq %rdi
> + jmp entry_SYSCALL_64_after_hwframe
> +END(entry_SYSCALL_64_stage2)

Is there a reason why you couldn't just do the following?

pushq   $entry_SYSCALL_64_after_hwframe
ret

Then you wouldn't need the 2nd trampoline and the %rdi clobber.

-- 
Josh


Re: [PATCH] scsi: ppa: mark expected switch fall-throughs

2017-11-20 Thread Martin K. Petersen

Gustavo A.,

> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.

Applied to 4.16/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: ppa: mark expected switch fall-throughs

2017-11-20 Thread Martin K. Petersen

Gustavo A.,

> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.

Applied to 4.16/scsi-queue.

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] fnic: use 64-bit timestamps

2017-11-20 Thread Martin K. Petersen

> struct timespec is deprecated since it overflows in 2038 on 32-bit
> architectures, so we should use timespec64 consistently.
>
> I'm slightly adapting the format strings here, to make sure we print
> the nanoseconds with the correct number of leading zeroes.

Satish: Please review/test. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] fnic: use 64-bit timestamps

2017-11-20 Thread Martin K. Petersen

> struct timespec is deprecated since it overflows in 2038 on 32-bit
> architectures, so we should use timespec64 consistently.
>
> I'm slightly adapting the format strings here, to make sure we print
> the nanoseconds with the correct number of leading zeroes.

Satish: Please review/test. Thank you!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH 02/16] x86/dumpstack: Add get_stack_info() support for the SYSENTER stack

2017-11-20 Thread Josh Poimboeuf
On Mon, Nov 20, 2017 at 05:39:34PM -0800, Andy Lutomirski wrote:
> On Mon, Nov 20, 2017 at 1:55 PM, Josh Poimboeuf  wrote:
> > On Mon, Nov 20, 2017 at 01:30:12PM -0800, Andy Lutomirski wrote:
> >> On Mon, Nov 20, 2017 at 1:27 PM, Josh Poimboeuf  
> >> wrote:
> >> > On Mon, Nov 20, 2017 at 01:07:16PM -0800, Andy Lutomirski wrote:
> >> >> >> but, more importantly, the OOPS unwinder will just bail without this
> >> >> >> patch.  With the patch, we get a valid unwind, except that everything
> >> >> >> has a ?  in front.
> >> >> >
> >> >> > Hm.  I can't even fathom how that's possible.  Are you talking about 
> >> >> > the
> >> >> > "unwind from NMI to SYSENTER stack" path?  Or any unwind to a syscall?
> >> >> > Either way I'm baffled...  If the unwinder only encounters the 
> >> >> > SYSENTER
> >> >> > stack at the end, how could that cause everything beforehand to have a
> >> >> > question mark?
> >> >>
> >> >> I mean that, if I put a ud2 or other bug in the code that runs on the
> >> >> SYSENTER stack, without this patch, I get a totally blank call trace.
> >> >
> >> > I would expect a blank call trace either way...
> >>
> >> Try making sync_regs use a few kB of stack space or, better yet, call
> >> a non-inlined function that uses too much stack.
> >
> > You mean overflow the exception stack?  I still don't see how that would
> > do it.
> >
> > If you could show a specific example, with splats from before/after,
> > that would be helpful.  Because I still have no idea how this patch
> > could possibly help.
> 
> I added BUG() to sync_regs().  With the patch, I get:
> 
> [4.211553] PANIC: double fault, error_code: 0x0
> [4.212113] CPU: 0 PID: 1 Comm: sh Not tainted 4.14.0+ #920
> [4.212741] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.10.2-1.fc26 04/01/2014
> [4.213536] task: 88001aa18000 task.stack: 88001aa2
> [4.214059] RIP: 0010:do_error_trap+0x33/0x1c0
> [4.214449] RSP: :ff1b8f78 EFLAGS: 00010096
> [4.214934] RAX: dc00 RBX: ff1b8f90 RCX: 
> 0006
> [4.215554] RDX: 82048b20 RSI:  RDI: 
> ff1b9110
> [4.216176] RBP: ff1b9088 R08: 0004 R09: 
> 
> [4.216793] R10:  R11: fbe3723f R12: 
> 0006
> [4.217419] R13:  R14: 0004 R15: 
> 
> [4.218046] FS:  () GS:88001ae0()
> knlGS:
> [4.218775] CS:  0010 DS:  ES:  CR0: 80050033
> [4.219280] CR2: ff1b8f68 CR3: 193da002 CR4: 
> 003606f0
> [4.219931] Call Trace:
> [4.220156]  
> [4.220383]  ? async_page_fault+0x36/0x60
> [4.220768]  ? invalid_op+0x22/0x40
> [4.221087]  ? async_page_fault+0x36/0x60
> [4.221442]  ? sync_regs+0x3c/0x40
> [4.221745]  ? sync_regs+0x2e/0x40
> [4.222051]  ? error_entry+0x6c/0xd0
> [4.222395]  ? async_page_fault+0x36/0x60
> [4.222748]  

Ah, page fault.  I thought you were talking about an NMI.  I get it now.

Did it overflow the stack?  I think that would explain the question
marks.

-- 
Josh


Re: [PATCH 02/16] x86/dumpstack: Add get_stack_info() support for the SYSENTER stack

2017-11-20 Thread Josh Poimboeuf
On Mon, Nov 20, 2017 at 05:39:34PM -0800, Andy Lutomirski wrote:
> On Mon, Nov 20, 2017 at 1:55 PM, Josh Poimboeuf  wrote:
> > On Mon, Nov 20, 2017 at 01:30:12PM -0800, Andy Lutomirski wrote:
> >> On Mon, Nov 20, 2017 at 1:27 PM, Josh Poimboeuf  
> >> wrote:
> >> > On Mon, Nov 20, 2017 at 01:07:16PM -0800, Andy Lutomirski wrote:
> >> >> >> but, more importantly, the OOPS unwinder will just bail without this
> >> >> >> patch.  With the patch, we get a valid unwind, except that everything
> >> >> >> has a ?  in front.
> >> >> >
> >> >> > Hm.  I can't even fathom how that's possible.  Are you talking about 
> >> >> > the
> >> >> > "unwind from NMI to SYSENTER stack" path?  Or any unwind to a syscall?
> >> >> > Either way I'm baffled...  If the unwinder only encounters the 
> >> >> > SYSENTER
> >> >> > stack at the end, how could that cause everything beforehand to have a
> >> >> > question mark?
> >> >>
> >> >> I mean that, if I put a ud2 or other bug in the code that runs on the
> >> >> SYSENTER stack, without this patch, I get a totally blank call trace.
> >> >
> >> > I would expect a blank call trace either way...
> >>
> >> Try making sync_regs use a few kB of stack space or, better yet, call
> >> a non-inlined function that uses too much stack.
> >
> > You mean overflow the exception stack?  I still don't see how that would
> > do it.
> >
> > If you could show a specific example, with splats from before/after,
> > that would be helpful.  Because I still have no idea how this patch
> > could possibly help.
> 
> I added BUG() to sync_regs().  With the patch, I get:
> 
> [4.211553] PANIC: double fault, error_code: 0x0
> [4.212113] CPU: 0 PID: 1 Comm: sh Not tainted 4.14.0+ #920
> [4.212741] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.10.2-1.fc26 04/01/2014
> [4.213536] task: 88001aa18000 task.stack: 88001aa2
> [4.214059] RIP: 0010:do_error_trap+0x33/0x1c0
> [4.214449] RSP: :ff1b8f78 EFLAGS: 00010096
> [4.214934] RAX: dc00 RBX: ff1b8f90 RCX: 
> 0006
> [4.215554] RDX: 82048b20 RSI:  RDI: 
> ff1b9110
> [4.216176] RBP: ff1b9088 R08: 0004 R09: 
> 
> [4.216793] R10:  R11: fbe3723f R12: 
> 0006
> [4.217419] R13:  R14: 0004 R15: 
> 
> [4.218046] FS:  () GS:88001ae0()
> knlGS:
> [4.218775] CS:  0010 DS:  ES:  CR0: 80050033
> [4.219280] CR2: ff1b8f68 CR3: 193da002 CR4: 
> 003606f0
> [4.219931] Call Trace:
> [4.220156]  
> [4.220383]  ? async_page_fault+0x36/0x60
> [4.220768]  ? invalid_op+0x22/0x40
> [4.221087]  ? async_page_fault+0x36/0x60
> [4.221442]  ? sync_regs+0x3c/0x40
> [4.221745]  ? sync_regs+0x2e/0x40
> [4.222051]  ? error_entry+0x6c/0xd0
> [4.222395]  ? async_page_fault+0x36/0x60
> [4.222748]  

Ah, page fault.  I thought you were talking about an NMI.  I get it now.

Did it overflow the stack?  I think that would explain the question
marks.

-- 
Josh


Re: [PATCH] scsi: bnx2i: bnx2i_hwi: use swap macro in bnx2i_send_iscsi_nopout

2017-11-20 Thread Martin K. Petersen

Gustavo A.,

> Make use of the swap macro and remove unnecessary variable tmp.
> This makes the code easier to read and maintain.

Applied to 4.16/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH] scsi: bnx2i: bnx2i_hwi: use swap macro in bnx2i_send_iscsi_nopout

2017-11-20 Thread Martin K. Petersen

Gustavo A.,

> Make use of the swap macro and remove unnecessary variable tmp.
> This makes the code easier to read and maintain.

Applied to 4.16/scsi-queue. Thanks!

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: STRICT_KERNEL_RWX on PPC32 is broken on PowerMac G4

2017-11-20 Thread Balbir Singh
On Sun, Nov 19, 2017 at 1:36 AM, LEROY Christophe
 wrote:
> Meelis Roos  a écrit :
>
>>> > > How early does it hang ? Any oops or trace ?
>>> >
>>> > Very early - instead oif kernel emssages, I see some repeated gibberish
>>> > of some characteers, and the background turns white.
>>> > I am booting from yaboot, background is normally black.
>>>
>>> Ok, could you try by replacing #ifdef CONFIG_STRICT_KERNEL_RWX by #if 0
>>> in arch/powerpc/lib/code-patching.c
>>
>>
>> With this change and CONFIG_STRICT_KERNEL_RWX=y, it still boots.
>>
>> BTW, I get these warnings (sorry for the word wrap from screen paste) -
>> may they be related or rather not?
>>
>>   WRAParch/powerpc/boot/zImage.pmac
>>   WRAParch/powerpc/boot/zImage.coff
>>   WRAParch/powerpc/boot/zImage.miboot
>> INFO: Uncompressed kernel (size 0x5d4c3c) overlaps the address of the
>> wrapper(0x40)
>> INFO: Fixing the link_address of wrapper to (0x60)
>>
>
> Then i believe there is something wrong with commit
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20171115=37bc3e5fd764fb258ff4fcbb90b6d1b67fb466c1
>
> Balbir, do you have any idea ?
>

Hmm.. interesting, so nobats works, but code-patching has issues? Any
chance you can boot with xmon=on on the command line
when you drop into the xmon> prompt, type dl to get the kernel log.

Balbir Singh.


Re: STRICT_KERNEL_RWX on PPC32 is broken on PowerMac G4

2017-11-20 Thread Balbir Singh
On Sun, Nov 19, 2017 at 1:36 AM, LEROY Christophe
 wrote:
> Meelis Roos  a écrit :
>
>>> > > How early does it hang ? Any oops or trace ?
>>> >
>>> > Very early - instead oif kernel emssages, I see some repeated gibberish
>>> > of some characteers, and the background turns white.
>>> > I am booting from yaboot, background is normally black.
>>>
>>> Ok, could you try by replacing #ifdef CONFIG_STRICT_KERNEL_RWX by #if 0
>>> in arch/powerpc/lib/code-patching.c
>>
>>
>> With this change and CONFIG_STRICT_KERNEL_RWX=y, it still boots.
>>
>> BTW, I get these warnings (sorry for the word wrap from screen paste) -
>> may they be related or rather not?
>>
>>   WRAParch/powerpc/boot/zImage.pmac
>>   WRAParch/powerpc/boot/zImage.coff
>>   WRAParch/powerpc/boot/zImage.miboot
>> INFO: Uncompressed kernel (size 0x5d4c3c) overlaps the address of the
>> wrapper(0x40)
>> INFO: Fixing the link_address of wrapper to (0x60)
>>
>
> Then i believe there is something wrong with commit
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20171115=37bc3e5fd764fb258ff4fcbb90b6d1b67fb466c1
>
> Balbir, do you have any idea ?
>

Hmm.. interesting, so nobats works, but code-patching has issues? Any
chance you can boot with xmon=on on the command line
when you drop into the xmon> prompt, type dl to get the kernel log.

Balbir Singh.


[PATCH] mm: migrate: fix an incorrect call of prep_transhuge_page()

2017-11-20 Thread Zi Yan
From: Zi Yan 

In [1], Andrea reported that during memory hotplug/hot remove
prep_transhuge_page() is called incorrectly on non-THP pages for
migration, when THP is on but THP migration is not enabled.
This leads to a bad state of target pages for migration.

This patch fixes it by only calling prep_transhuge_page() when we are
certain that the target page is THP.

[1] https://lkml.org/lkml/2017/11/20/411

Cc: sta...@vger.kernel.org # v4.14
Fixes: 8135d8926c08 ("mm: memory_hotplug: memory hotremove supports thp 
migration")
Reported-by: Andrea Reale 
Signed-off-by: Zi Yan 
Cc: Naoya Horiguchi 
Cc: "Jérôme Glisse" 
---
 include/linux/migrate.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 895ec0c4942e..a2246cf670ba 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -54,7 +54,7 @@ static inline struct page *new_page_nodemask(struct page 
*page,
new_page = __alloc_pages_nodemask(gfp_mask, order,
preferred_nid, nodemask);
 
-   if (new_page && PageTransHuge(page))
+   if (new_page && PageTransHuge(new_page))
prep_transhuge_page(new_page);
 
return new_page;
-- 
2.14.2



[PATCH] mm: migrate: fix an incorrect call of prep_transhuge_page()

2017-11-20 Thread Zi Yan
From: Zi Yan 

In [1], Andrea reported that during memory hotplug/hot remove
prep_transhuge_page() is called incorrectly on non-THP pages for
migration, when THP is on but THP migration is not enabled.
This leads to a bad state of target pages for migration.

This patch fixes it by only calling prep_transhuge_page() when we are
certain that the target page is THP.

[1] https://lkml.org/lkml/2017/11/20/411

Cc: sta...@vger.kernel.org # v4.14
Fixes: 8135d8926c08 ("mm: memory_hotplug: memory hotremove supports thp 
migration")
Reported-by: Andrea Reale 
Signed-off-by: Zi Yan 
Cc: Naoya Horiguchi 
Cc: "Jérôme Glisse" 
---
 include/linux/migrate.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 895ec0c4942e..a2246cf670ba 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -54,7 +54,7 @@ static inline struct page *new_page_nodemask(struct page 
*page,
new_page = __alloc_pages_nodemask(gfp_mask, order,
preferred_nid, nodemask);
 
-   if (new_page && PageTransHuge(page))
+   if (new_page && PageTransHuge(new_page))
prep_transhuge_page(new_page);
 
return new_page;
-- 
2.14.2



[PATCH] x86/smpboot: set topology CPU mask before use.

2017-11-20 Thread ning . a . zhang
From: Zhang Ning 

we detect topology CPU mask in tsc is used before it is set, it leads
to longer bootup time.

let's check the code.

smpboot.c:smp_callin()
 ---> calibarate.c:calibrate_delay()
  ---> tsc.c: calibrate_delay_is_known()
   ---> topology_core_cpumask(): read topology CPU mask
 ---> set_cpu_sibling_map(raw_smp_processor_id())
  ---> cpumask_set_cpu(cpu, topology_core_cpumask(cpu));

from the calling chain, we know topology CPU mask is used before it
actually set.

So move set_cpu_sibling_map before calibrate_delay.

Change-Id: I4eb8facb8751fe7aa2c6d2eac32437266d92ec00
Signed-off-by: Zhang Ning 
---
 arch/x86/kernel/smpboot.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 36171bcd91f8..acee1ca3ef43 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -181,6 +181,12 @@ static void smp_callin(void)
smp_store_cpu_info(cpuid);
 
/*
+* This must be done before setting cpu_online_mask
+* or calling notify_cpu_starting.
+*/
+   set_cpu_sibling_map(raw_smp_processor_id());
+
+   /*
 * Get our bogomips.
 * Update loops_per_jiffy in cpu_data. Previous call to
 * smp_store_cpu_info() stored a value that is close but not as
@@ -190,11 +196,7 @@ static void smp_callin(void)
cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
pr_debug("Stack at about %p\n", );
 
-   /*
-* This must be done before setting cpu_online_mask
-* or calling notify_cpu_starting.
-*/
-   set_cpu_sibling_map(raw_smp_processor_id());
+
wmb();
 
notify_cpu_starting(cpuid);
-- 
2.11.0



[PATCH] x86/smpboot: set topology CPU mask before use.

2017-11-20 Thread ning . a . zhang
From: Zhang Ning 

we detect topology CPU mask in tsc is used before it is set, it leads
to longer bootup time.

let's check the code.

smpboot.c:smp_callin()
 ---> calibarate.c:calibrate_delay()
  ---> tsc.c: calibrate_delay_is_known()
   ---> topology_core_cpumask(): read topology CPU mask
 ---> set_cpu_sibling_map(raw_smp_processor_id())
  ---> cpumask_set_cpu(cpu, topology_core_cpumask(cpu));

from the calling chain, we know topology CPU mask is used before it
actually set.

So move set_cpu_sibling_map before calibrate_delay.

Change-Id: I4eb8facb8751fe7aa2c6d2eac32437266d92ec00
Signed-off-by: Zhang Ning 
---
 arch/x86/kernel/smpboot.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 36171bcd91f8..acee1ca3ef43 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -181,6 +181,12 @@ static void smp_callin(void)
smp_store_cpu_info(cpuid);
 
/*
+* This must be done before setting cpu_online_mask
+* or calling notify_cpu_starting.
+*/
+   set_cpu_sibling_map(raw_smp_processor_id());
+
+   /*
 * Get our bogomips.
 * Update loops_per_jiffy in cpu_data. Previous call to
 * smp_store_cpu_info() stored a value that is close but not as
@@ -190,11 +196,7 @@ static void smp_callin(void)
cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
pr_debug("Stack at about %p\n", );
 
-   /*
-* This must be done before setting cpu_online_mask
-* or calling notify_cpu_starting.
-*/
-   set_cpu_sibling_map(raw_smp_processor_id());
+
wmb();
 
notify_cpu_starting(cpuid);
-- 
2.11.0



Re: [PATCH 3/3] autofs - fix AT_NO_AUTOMOUNT not being honored

2017-11-20 Thread NeilBrown
On Wed, May 10 2017, Ian Kent wrote:

> The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
> which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
> of an automount by the call. But this flag is unconditionally cleared
> for all stat family system calls except statx().
>
> stat family system calls have always triggered mount requests for the
> negative dentry case in follow_automount() which is intended but prevents
> the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.
>
> In order to handle the AT_NO_AUTOMOUNT for both system calls the
> negative dentry case in follow_automount() needs to be changed to
> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
> required flags are clear).
>
> AFAICT this change doesn't have any noticable side effects and may,
> in some use cases (although I didn't see it in testing) prevent
> unnecessary callbacks to the automount daemon.

Actually, this patch does have a noticeable side effect.

Previously, if /home were an indirect mount point so that /home/neilb
would mount my home directory, "ls -l /home/neilb" would always work.
Now it doesn't.

This happens because "ls" calls 'lstat' on the path and when that fails
with "ENOENT", it doesn't bother trying to open.

An alternate approach to the problem that this patch addresses is to
*not* change follow_automount() but instead change the automount daemon
and possibly autofs.

When a notification of access for an indirect mount point is received,
it would firstly just create the directory - not triggering a mount.
If another notification is then received (after the directory has been
created), it then triggers the mount.

I suspect this might need changes to autofs to avoid races when two
threads call lstat() on the same path at the same time.  We would need
to ensure that automount didn't see this as two requests though
maybe it already has enough locking.

Does that seem reasonable?  Should we revert this patch (as a
regression) and do something in automount instead?

Thanks,
NeilBrown


>
> It's also possible that a stat family call has been made with a
> path that is in the process of being mounted by some other process.
> But stat family calls should return the automount state of the path
> as it is "now" so it shouldn't wait for mount completion.
>
> This is the same semantic as the positive dentry case already
> handled.
>
> Signed-off-by: Ian Kent 
> Cc: David Howells 
> Cc: Colin Walters 
> Cc: Ondrej Holy 
> Cc: sta...@vger.kernel.org
> ---
>  fs/namei.c |   15 ---
>  include/linux/fs.h |3 +--
>  2 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 7286f87..cd74838 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1129,9 +1129,18 @@ static int follow_automount(struct path *path, struct 
> nameidata *nd,
>* of the daemon to instantiate them before they can be used.
>*/
>   if (!(nd->flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY |
> -LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) &&
> - path->dentry->d_inode)
> - return -EISDIR;
> +LOOKUP_OPEN | LOOKUP_CREATE |
> +LOOKUP_AUTOMOUNT))) {
> + /* Positive dentry that isn't meant to trigger an
> +  * automount, EISDIR will allow it to be used,
> +  * otherwise there's no mount here "now" so return
> +  * ENOENT.
> +  */
> + if (path->dentry->d_inode)
> + return -EISDIR;
> + else
> + return -ENOENT;
> + }
>  
>   if (path->dentry->d_sb->s_user_ns != _user_ns)
>   return -EACCES;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 26488b4..be09684 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2935,8 +2935,7 @@ static inline int vfs_lstat(const char __user *name, 
> struct kstat *stat)
>  static inline int vfs_fstatat(int dfd, const char __user *filename,
> struct kstat *stat, int flags)
>  {
> - return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT,
> -  stat, STATX_BASIC_STATS);
> + return vfs_statx(dfd, filename, flags, stat, STATX_BASIC_STATS);
>  }
>  static inline int vfs_fstat(int fd, struct kstat *stat)
>  {


signature.asc
Description: PGP signature


Re: [PATCH 3/3] autofs - fix AT_NO_AUTOMOUNT not being honored

2017-11-20 Thread NeilBrown
On Wed, May 10 2017, Ian Kent wrote:

> The fstatat(2) and statx() calls can pass the flag AT_NO_AUTOMOUNT
> which is meant to clear the LOOKUP_AUTOMOUNT flag and prevent triggering
> of an automount by the call. But this flag is unconditionally cleared
> for all stat family system calls except statx().
>
> stat family system calls have always triggered mount requests for the
> negative dentry case in follow_automount() which is intended but prevents
> the fstatat(2) and statx() AT_NO_AUTOMOUNT case from being handled.
>
> In order to handle the AT_NO_AUTOMOUNT for both system calls the
> negative dentry case in follow_automount() needs to be changed to
> return ENOENT when the LOOKUP_AUTOMOUNT flag is clear (and the other
> required flags are clear).
>
> AFAICT this change doesn't have any noticable side effects and may,
> in some use cases (although I didn't see it in testing) prevent
> unnecessary callbacks to the automount daemon.

Actually, this patch does have a noticeable side effect.

Previously, if /home were an indirect mount point so that /home/neilb
would mount my home directory, "ls -l /home/neilb" would always work.
Now it doesn't.

This happens because "ls" calls 'lstat' on the path and when that fails
with "ENOENT", it doesn't bother trying to open.

An alternate approach to the problem that this patch addresses is to
*not* change follow_automount() but instead change the automount daemon
and possibly autofs.

When a notification of access for an indirect mount point is received,
it would firstly just create the directory - not triggering a mount.
If another notification is then received (after the directory has been
created), it then triggers the mount.

I suspect this might need changes to autofs to avoid races when two
threads call lstat() on the same path at the same time.  We would need
to ensure that automount didn't see this as two requests though
maybe it already has enough locking.

Does that seem reasonable?  Should we revert this patch (as a
regression) and do something in automount instead?

Thanks,
NeilBrown


>
> It's also possible that a stat family call has been made with a
> path that is in the process of being mounted by some other process.
> But stat family calls should return the automount state of the path
> as it is "now" so it shouldn't wait for mount completion.
>
> This is the same semantic as the positive dentry case already
> handled.
>
> Signed-off-by: Ian Kent 
> Cc: David Howells 
> Cc: Colin Walters 
> Cc: Ondrej Holy 
> Cc: sta...@vger.kernel.org
> ---
>  fs/namei.c |   15 ---
>  include/linux/fs.h |3 +--
>  2 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 7286f87..cd74838 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1129,9 +1129,18 @@ static int follow_automount(struct path *path, struct 
> nameidata *nd,
>* of the daemon to instantiate them before they can be used.
>*/
>   if (!(nd->flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY |
> -LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) &&
> - path->dentry->d_inode)
> - return -EISDIR;
> +LOOKUP_OPEN | LOOKUP_CREATE |
> +LOOKUP_AUTOMOUNT))) {
> + /* Positive dentry that isn't meant to trigger an
> +  * automount, EISDIR will allow it to be used,
> +  * otherwise there's no mount here "now" so return
> +  * ENOENT.
> +  */
> + if (path->dentry->d_inode)
> + return -EISDIR;
> + else
> + return -ENOENT;
> + }
>  
>   if (path->dentry->d_sb->s_user_ns != _user_ns)
>   return -EACCES;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 26488b4..be09684 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2935,8 +2935,7 @@ static inline int vfs_lstat(const char __user *name, 
> struct kstat *stat)
>  static inline int vfs_fstatat(int dfd, const char __user *filename,
> struct kstat *stat, int flags)
>  {
> - return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT,
> -  stat, STATX_BASIC_STATS);
> + return vfs_statx(dfd, filename, flags, stat, STATX_BASIC_STATS);
>  }
>  static inline int vfs_fstat(int fd, struct kstat *stat)
>  {


signature.asc
Description: PGP signature


Re: [alsa-devel] [PATCH 2/2] ASoC: fsl_ssi: serialize AC'97 register access operations

2017-11-20 Thread Nicolin Chen
On Mon, Nov 20, 2017 at 11:16:07PM +0100, Maciej S. Szmigiero wrote:
> AC'97 register access operations (both read and write) on SSI use a one,
> shared set of SSI registers for AC'97 register address and data.
> This means that only one such access is possible at a time and so all these
> operations need to be serialized.
> 
> Since an AC'97 register access operation in this driver takes 100us+ let's
> use a mutex for this.
> 
> Use this opportunity to also change a default value returned from AC'97
> register read function from -1 to 0, since that's what AC'97 specs require
> to be returned when unknown / undefined registers are read.
> 
> Signed-off-by: Maciej S. Szmigiero 

>  static unsigned short fsl_ssi_ac97_read(struct snd_ac97 *ac97,
> @@ -1287,16 +1295,18 @@ static unsigned short fsl_ssi_ac97_read(struct 
> snd_ac97 *ac97,
>  {
>   struct regmap *regs = fsl_ac97_data->regs;
>  
> - unsigned short val = -1;
> + unsigned short val = 0;
>   u32 reg_val;
>   unsigned int lreg;
>   int ret;
>  
> + mutex_lock(_ac97_data->ac97_reg_lock);
> +
>   ret = clk_prepare_enable(fsl_ac97_data->clk);
>   if (ret) {
>   pr_err("ac97 read clk_prepare_enable failed: %d\n",
>   ret);
> - return -1;
> + goto ret_unlock;

It will return val (== 0) in this case. Will this be correctly
handled by callers? I find sound/ac97/bus.c checks if ret < 0
for ops->read().

So it might be better to add "val = ret;" before goto? Or use
val instead of ret directly?

>   }
>  
>   lreg = (reg & 0x7f) <<  12;
> @@ -1311,6 +1321,8 @@ static unsigned short fsl_ssi_ac97_read(struct snd_ac97 
> *ac97,
>  
>   clk_disable_unprepare(fsl_ac97_data->clk);
>  
> +ret_unlock:
> + mutex_unlock(_ac97_data->ac97_reg_lock);
>   return val;
>  }



Re: [alsa-devel] [PATCH 2/2] ASoC: fsl_ssi: serialize AC'97 register access operations

2017-11-20 Thread Nicolin Chen
On Mon, Nov 20, 2017 at 11:16:07PM +0100, Maciej S. Szmigiero wrote:
> AC'97 register access operations (both read and write) on SSI use a one,
> shared set of SSI registers for AC'97 register address and data.
> This means that only one such access is possible at a time and so all these
> operations need to be serialized.
> 
> Since an AC'97 register access operation in this driver takes 100us+ let's
> use a mutex for this.
> 
> Use this opportunity to also change a default value returned from AC'97
> register read function from -1 to 0, since that's what AC'97 specs require
> to be returned when unknown / undefined registers are read.
> 
> Signed-off-by: Maciej S. Szmigiero 

>  static unsigned short fsl_ssi_ac97_read(struct snd_ac97 *ac97,
> @@ -1287,16 +1295,18 @@ static unsigned short fsl_ssi_ac97_read(struct 
> snd_ac97 *ac97,
>  {
>   struct regmap *regs = fsl_ac97_data->regs;
>  
> - unsigned short val = -1;
> + unsigned short val = 0;
>   u32 reg_val;
>   unsigned int lreg;
>   int ret;
>  
> + mutex_lock(_ac97_data->ac97_reg_lock);
> +
>   ret = clk_prepare_enable(fsl_ac97_data->clk);
>   if (ret) {
>   pr_err("ac97 read clk_prepare_enable failed: %d\n",
>   ret);
> - return -1;
> + goto ret_unlock;

It will return val (== 0) in this case. Will this be correctly
handled by callers? I find sound/ac97/bus.c checks if ret < 0
for ops->read().

So it might be better to add "val = ret;" before goto? Or use
val instead of ret directly?

>   }
>  
>   lreg = (reg & 0x7f) <<  12;
> @@ -1311,6 +1321,8 @@ static unsigned short fsl_ssi_ac97_read(struct snd_ac97 
> *ac97,
>  
>   clk_disable_unprepare(fsl_ac97_data->clk);
>  
> +ret_unlock:
> + mutex_unlock(_ac97_data->ac97_reg_lock);
>   return val;
>  }



Re: Blank console but X11 works on MCP79 - old regression since 3.8

2017-11-20 Thread Ilia Mirkin
On Sat, Nov 18, 2017 at 12:23 AM, Ilia Mirkin  wrote:
> On Fri, Nov 17, 2017 at 2:37 PM, Ilia Mirkin  wrote:
>> On Fri, Nov 17, 2017 at 2:25 PM, Ondrej Zary  
>> wrote:
>>> On Friday 17 November 2017 18:41:17 Ilia Mirkin wrote:
 On Fri, Nov 17, 2017 at 12:33 PM, Ondrej Zary

  wrote:
 > @@ -483,8 +483,8 @@
 >  nouveau :02:00.0: disp:0860:  -> 0500
 >  nouveau :02:00.0: disp:0864: 
 >  nouveau :02:00.0: disp:0868:  -> 04000500
 > -nouveau :02:00.0: disp:086c:  -> 00100500
 > -nouveau :02:00.0: disp:0870: e900 -> 1e00
 > +nouveau :02:00.0: disp:086c:  -> 00100a00
 > +nouveau :02:00.0: disp:0870: e900 -> e800
 >  nouveau :02:00.0: disp:0874:  -> 
 >  nouveau :02:00.0: disp:0878: 
 >  nouveau :02:00.0: disp:0880: 0500
 >
 > Looks like it's using 8bpp (0x1e00) in 32MB case but 16bpp (0xe800) in
 > 64MB case. Why?
 >
 > I get blank screen even with 64MB with video=1280x1024-8 kernel
 > parameter. Console works with video=1280x1024-16 even with 32MB stolen
 > memory.
 >
 > Conclusions: 8-bit support is broken and bpp reduction is weird.

 OK, well that makes a *ton* of sense (8bpp being broken).

 I think the idea of bpp reduction is that when you're on your shiny
 new Riva TNT with 16MB of VRAM, you don't want to go crazy allocating
 all that to a pinned fbcon - almost half of that would go to a single
 32bpp 1600x1200 buffer, more for 1920x1200. You want to be able to
 have at least a few fb-sized buffers for backbuffer rendering, etc.

 The specific limits could probably use tweaking - I think they only
 consider VRAM size, not the fb size.

 I guess 8bpp worked prior to the change you bisected though, so we
 should figure out what we did wrong in the new code.
>>>
>>> Yes, booted 3.7 (last working kernel) and it's running in 8bpp.
>>
>> By the way, instead of booting $kernel, you can use modetest from
>> libdrm/tests. Not sure if it supports C8 though =/
>
> It didn't. But it does now - I mailed a patch to dri-devel, also (with
> slight fix) available at
>
> https://people.freedesktop.org/~imirkin/patches/0001-modetest-add-C8-support-to-generate-SMPTE-pattern.patch
>
> This works on GK208 but not on G92 (whose display unit is much closer
> to your MCP79's). You can run as
>
> ./modetest -s DVI-I-1:1920x1200@C8
>
> This should display a SMPTE pattern, and exit when you hit enter. When
> it does so, it doesn't restore fbcon, but you can swtich to another
> vty to get console back.
>
> I get a white picture on G92. Now just have to figure out how to fix
> it. Someone should also test on a G80 if possible, since that takes a
> different path as well.

Someone tested out a GF100 and it had the same issue.

I've since determined that the color is that of the first entry in the
LUT. With the above program, it's (192, 192, 192) which looks white.

  -ilia


Re: Blank console but X11 works on MCP79 - old regression since 3.8

2017-11-20 Thread Ilia Mirkin
On Sat, Nov 18, 2017 at 12:23 AM, Ilia Mirkin  wrote:
> On Fri, Nov 17, 2017 at 2:37 PM, Ilia Mirkin  wrote:
>> On Fri, Nov 17, 2017 at 2:25 PM, Ondrej Zary  
>> wrote:
>>> On Friday 17 November 2017 18:41:17 Ilia Mirkin wrote:
 On Fri, Nov 17, 2017 at 12:33 PM, Ondrej Zary

  wrote:
 > @@ -483,8 +483,8 @@
 >  nouveau :02:00.0: disp:0860:  -> 0500
 >  nouveau :02:00.0: disp:0864: 
 >  nouveau :02:00.0: disp:0868:  -> 04000500
 > -nouveau :02:00.0: disp:086c:  -> 00100500
 > -nouveau :02:00.0: disp:0870: e900 -> 1e00
 > +nouveau :02:00.0: disp:086c:  -> 00100a00
 > +nouveau :02:00.0: disp:0870: e900 -> e800
 >  nouveau :02:00.0: disp:0874:  -> 
 >  nouveau :02:00.0: disp:0878: 
 >  nouveau :02:00.0: disp:0880: 0500
 >
 > Looks like it's using 8bpp (0x1e00) in 32MB case but 16bpp (0xe800) in
 > 64MB case. Why?
 >
 > I get blank screen even with 64MB with video=1280x1024-8 kernel
 > parameter. Console works with video=1280x1024-16 even with 32MB stolen
 > memory.
 >
 > Conclusions: 8-bit support is broken and bpp reduction is weird.

 OK, well that makes a *ton* of sense (8bpp being broken).

 I think the idea of bpp reduction is that when you're on your shiny
 new Riva TNT with 16MB of VRAM, you don't want to go crazy allocating
 all that to a pinned fbcon - almost half of that would go to a single
 32bpp 1600x1200 buffer, more for 1920x1200. You want to be able to
 have at least a few fb-sized buffers for backbuffer rendering, etc.

 The specific limits could probably use tweaking - I think they only
 consider VRAM size, not the fb size.

 I guess 8bpp worked prior to the change you bisected though, so we
 should figure out what we did wrong in the new code.
>>>
>>> Yes, booted 3.7 (last working kernel) and it's running in 8bpp.
>>
>> By the way, instead of booting $kernel, you can use modetest from
>> libdrm/tests. Not sure if it supports C8 though =/
>
> It didn't. But it does now - I mailed a patch to dri-devel, also (with
> slight fix) available at
>
> https://people.freedesktop.org/~imirkin/patches/0001-modetest-add-C8-support-to-generate-SMPTE-pattern.patch
>
> This works on GK208 but not on G92 (whose display unit is much closer
> to your MCP79's). You can run as
>
> ./modetest -s DVI-I-1:1920x1200@C8
>
> This should display a SMPTE pattern, and exit when you hit enter. When
> it does so, it doesn't restore fbcon, but you can swtich to another
> vty to get console back.
>
> I get a white picture on G92. Now just have to figure out how to fix
> it. Someone should also test on a G80 if possible, since that takes a
> different path as well.

Someone tested out a GF100 and it had the same issue.

I've since determined that the color is that of the first entry in the
LUT. With the above program, it's (192, 192, 192) which looks white.

  -ilia


Re: mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info (crisv32 hang)

2017-11-20 Thread Guenter Roeck
On Mon, Nov 20, 2017 at 07:28:21PM -0500, Nicolas Pitre wrote:
> On Mon, 20 Nov 2017, Guenter Roeck wrote:
> 
> > On Mon, Nov 20, 2017 at 03:21:32PM -0500, Nicolas Pitre wrote:
> > > On Mon, 20 Nov 2017, Guenter Roeck wrote:
> > > 
> > > > On Mon, Nov 20, 2017 at 01:18:38PM -0500, Nicolas Pitre wrote:
> > > > > On Sun, 19 Nov 2017, Guenter Roeck wrote:
> > > > > 
> > > > > > On 11/19/2017 08:08 PM, Nicolas Pitre wrote:
> > > > > > > On Sun, 19 Nov 2017, Guenter Roeck wrote:
> > > > > > > > On 11/19/2017 12:36 PM, Nicolas Pitre wrote:
> > > > > > > > > On Sat, 18 Nov 2017, Guenter Roeck wrote:
> > > > > > > > > > On Tue, Oct 03, 2017 at 06:29:49PM -0400, Nicolas Pitre 
> > > > > > > > > > wrote:
> > > > > > > > > > > @@ -2295,6 +2295,7 @@ void __init 
> > > > > > > > > > > setup_per_cpu_areas(void)
> > > > > > > > > > >   if (pcpu_setup_first_chunk(ai, fc) < 0)
> > > > > > > > > > >   panic("Failed to initialize percpu 
> > > > > > > > > > > areas.");
> > > > > > > > > > > + pcpu_free_alloc_info(ai);
> > > > > > > > > > 
> > > > > > > > > > This is the culprit. Everything works fine if I remove this 
> > > > > > > > > > line.
> > > > > > > > > 
> > > > > > > > > Without this line, the memory at the ai pointer is leaked. 
> > > > > > > > > Maybe this is
> > > > > > > > > modifying the memory allocation pattern and that triggers a 
> > > > > > > > > bug later on
> > > > > > > > > in your case.
> > > > > > > > > 
> > > > > > > > > At that point the console driver is not yet initialized and 
> > > > > > > > > any error
> > > > > > > > > message won't be printed. You should enable the early console 
> > > > > > > > > mechanism
> > > > > > > > > in your kernel (see arch/cris/arch-v32/kernel/debugport.c) 
> > > > > > > > > and see what
> > > > > > > > > that might tell you.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > The problem is that BUG() on crisv32 does not yield useful 
> > > > > > > > output.
> > > > > > > > Anyway, here is the culprit.
> > > > > > > > 
> > > > > > > > diff --git a/mm/bootmem.c b/mm/bootmem.c
> > > > > > > > index 6aef64254203..2bcc8901450c 100644
> > > > > > > > --- a/mm/bootmem.c
> > > > > > > > +++ b/mm/bootmem.c
> > > > > > > > @@ -382,7 +382,8 @@ static int __init mark_bootmem(unsigned 
> > > > > > > > long start,
> > > > > > > > unsigned long end,
> > > > > > > >  return 0;
> > > > > > > >  pos = bdata->node_low_pfn;
> > > > > > > >  }
> > > > > > > > -   BUG();
> > > > > > > > +   WARN(1, "mark_bootmem(): memory range 0x%lx-0x%lx not 
> > > > > > > > found\n",
> > > > > > > > start,
> > > > > > > > end);
> > > > > > > > +   return -ENOMEM;
> > > > > > > >   }
> > > > > > > > 
> > > > > > > >   /**
> > > > > > > > diff --git a/mm/percpu.c b/mm/percpu.c
> > > > > > > > index 79e3549cab0f..c75622d844f1 100644
> > > > > > > > --- a/mm/percpu.c
> > > > > > > > +++ b/mm/percpu.c
> > > > > > > > @@ -1881,6 +1881,7 @@ struct pcpu_alloc_info * __init
> > > > > > > > pcpu_alloc_alloc_info(int nr_groups,
> > > > > > > >*/
> > > > > > > >   void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai)
> > > > > > > >   {
> > > > > > > > +   printk("pcpu_free_alloc_info(%p (0x%lx))\n", ai, 
> > > > > > > > __pa(ai));
> > > > > > > >  memblock_free_early(__pa(ai), ai->__ai_size);
> > > > > > > >
> > > > > > > > results in:
> > > > > > > > 
> > > > > > > > pcpu_free_alloc_info(c0534000 (0x40534000))
> > > > > > > > [ cut here ]
> > > > > > > > WARNING: CPU: 0 PID: 0 at mm/bootmem.c:385 
> > > > > > > > mark_bootmem+0x9a/0xaa
> > > > > > > > mark_bootmem(): memory range 0x2029a-0x2029b not found
> > > > > > > 
> > > > > > > Well... PFN_UP(0x40534000) should give 0x40534. How you might end 
> > > > > > > up
> > > > > > > with 0x2029a in mark_bootmem(), let alone not exit on the first 
> > > > > > > "if (max
> > > > > > > == end) return 0;" within the loop is rather weird.
> > > > > > > 
> > > > > > pcpu_free_alloc_info: ai=c0536000, __pa(ai)=0x40536000,
> > > > > > PFN_UP(__pa(ai))=0x2029b, PFN_UP(ai)=0x6029b
> > > > > > 
> > > > > > bootmem range is 0x6..0x61000. It doesn't get to "if (max == 
> > > > > > end)"
> > > > > > because "pos (=0x2029b) < bdata->node_min_pfn (=0x6)".
> > > > > 
> > > > > OK. the 0x2029b is the result of PAGE_SIZE being 8192 in your case.
> > > > > However the bootmem allocator deals with physical addresses not 
> > > > > virtual 
> > > > > ones. So it shouldn't give you a 0x6..0x61000 range.
> > > > > 
> > > > > Would be interesting to see what result you get on line 860 of 
> > > > > mm/bootmem.c.
> > > > > 
> > > > Nothing; __alloc_bootmem_low_node() is not called.
> > > > 
> > > > Call chain is:
> > > >   pcpu_alloc_alloc_info
> > > > memblock_virt_alloc_nopanic
> > > >   __alloc_bootmem_nopanic
> > > > ___alloc_bootmem_nopanic
> > > 
> > > But from there it should continue with: 
> > > 
> > >   

Re: mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info (crisv32 hang)

2017-11-20 Thread Guenter Roeck
On Mon, Nov 20, 2017 at 07:28:21PM -0500, Nicolas Pitre wrote:
> On Mon, 20 Nov 2017, Guenter Roeck wrote:
> 
> > On Mon, Nov 20, 2017 at 03:21:32PM -0500, Nicolas Pitre wrote:
> > > On Mon, 20 Nov 2017, Guenter Roeck wrote:
> > > 
> > > > On Mon, Nov 20, 2017 at 01:18:38PM -0500, Nicolas Pitre wrote:
> > > > > On Sun, 19 Nov 2017, Guenter Roeck wrote:
> > > > > 
> > > > > > On 11/19/2017 08:08 PM, Nicolas Pitre wrote:
> > > > > > > On Sun, 19 Nov 2017, Guenter Roeck wrote:
> > > > > > > > On 11/19/2017 12:36 PM, Nicolas Pitre wrote:
> > > > > > > > > On Sat, 18 Nov 2017, Guenter Roeck wrote:
> > > > > > > > > > On Tue, Oct 03, 2017 at 06:29:49PM -0400, Nicolas Pitre 
> > > > > > > > > > wrote:
> > > > > > > > > > > @@ -2295,6 +2295,7 @@ void __init 
> > > > > > > > > > > setup_per_cpu_areas(void)
> > > > > > > > > > >   if (pcpu_setup_first_chunk(ai, fc) < 0)
> > > > > > > > > > >   panic("Failed to initialize percpu 
> > > > > > > > > > > areas.");
> > > > > > > > > > > + pcpu_free_alloc_info(ai);
> > > > > > > > > > 
> > > > > > > > > > This is the culprit. Everything works fine if I remove this 
> > > > > > > > > > line.
> > > > > > > > > 
> > > > > > > > > Without this line, the memory at the ai pointer is leaked. 
> > > > > > > > > Maybe this is
> > > > > > > > > modifying the memory allocation pattern and that triggers a 
> > > > > > > > > bug later on
> > > > > > > > > in your case.
> > > > > > > > > 
> > > > > > > > > At that point the console driver is not yet initialized and 
> > > > > > > > > any error
> > > > > > > > > message won't be printed. You should enable the early console 
> > > > > > > > > mechanism
> > > > > > > > > in your kernel (see arch/cris/arch-v32/kernel/debugport.c) 
> > > > > > > > > and see what
> > > > > > > > > that might tell you.
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > The problem is that BUG() on crisv32 does not yield useful 
> > > > > > > > output.
> > > > > > > > Anyway, here is the culprit.
> > > > > > > > 
> > > > > > > > diff --git a/mm/bootmem.c b/mm/bootmem.c
> > > > > > > > index 6aef64254203..2bcc8901450c 100644
> > > > > > > > --- a/mm/bootmem.c
> > > > > > > > +++ b/mm/bootmem.c
> > > > > > > > @@ -382,7 +382,8 @@ static int __init mark_bootmem(unsigned 
> > > > > > > > long start,
> > > > > > > > unsigned long end,
> > > > > > > >  return 0;
> > > > > > > >  pos = bdata->node_low_pfn;
> > > > > > > >  }
> > > > > > > > -   BUG();
> > > > > > > > +   WARN(1, "mark_bootmem(): memory range 0x%lx-0x%lx not 
> > > > > > > > found\n",
> > > > > > > > start,
> > > > > > > > end);
> > > > > > > > +   return -ENOMEM;
> > > > > > > >   }
> > > > > > > > 
> > > > > > > >   /**
> > > > > > > > diff --git a/mm/percpu.c b/mm/percpu.c
> > > > > > > > index 79e3549cab0f..c75622d844f1 100644
> > > > > > > > --- a/mm/percpu.c
> > > > > > > > +++ b/mm/percpu.c
> > > > > > > > @@ -1881,6 +1881,7 @@ struct pcpu_alloc_info * __init
> > > > > > > > pcpu_alloc_alloc_info(int nr_groups,
> > > > > > > >*/
> > > > > > > >   void __init pcpu_free_alloc_info(struct pcpu_alloc_info *ai)
> > > > > > > >   {
> > > > > > > > +   printk("pcpu_free_alloc_info(%p (0x%lx))\n", ai, 
> > > > > > > > __pa(ai));
> > > > > > > >  memblock_free_early(__pa(ai), ai->__ai_size);
> > > > > > > >
> > > > > > > > results in:
> > > > > > > > 
> > > > > > > > pcpu_free_alloc_info(c0534000 (0x40534000))
> > > > > > > > [ cut here ]
> > > > > > > > WARNING: CPU: 0 PID: 0 at mm/bootmem.c:385 
> > > > > > > > mark_bootmem+0x9a/0xaa
> > > > > > > > mark_bootmem(): memory range 0x2029a-0x2029b not found
> > > > > > > 
> > > > > > > Well... PFN_UP(0x40534000) should give 0x40534. How you might end 
> > > > > > > up
> > > > > > > with 0x2029a in mark_bootmem(), let alone not exit on the first 
> > > > > > > "if (max
> > > > > > > == end) return 0;" within the loop is rather weird.
> > > > > > > 
> > > > > > pcpu_free_alloc_info: ai=c0536000, __pa(ai)=0x40536000,
> > > > > > PFN_UP(__pa(ai))=0x2029b, PFN_UP(ai)=0x6029b
> > > > > > 
> > > > > > bootmem range is 0x6..0x61000. It doesn't get to "if (max == 
> > > > > > end)"
> > > > > > because "pos (=0x2029b) < bdata->node_min_pfn (=0x6)".
> > > > > 
> > > > > OK. the 0x2029b is the result of PAGE_SIZE being 8192 in your case.
> > > > > However the bootmem allocator deals with physical addresses not 
> > > > > virtual 
> > > > > ones. So it shouldn't give you a 0x6..0x61000 range.
> > > > > 
> > > > > Would be interesting to see what result you get on line 860 of 
> > > > > mm/bootmem.c.
> > > > > 
> > > > Nothing; __alloc_bootmem_low_node() is not called.
> > > > 
> > > > Call chain is:
> > > >   pcpu_alloc_alloc_info
> > > > memblock_virt_alloc_nopanic
> > > >   __alloc_bootmem_nopanic
> > > > ___alloc_bootmem_nopanic
> > > 
> > > But from there it should continue with: 
> > > 
> > >   

Re: [PATCH 02/16] x86/dumpstack: Add get_stack_info() support for the SYSENTER stack

2017-11-20 Thread Andy Lutomirski
On Mon, Nov 20, 2017 at 1:55 PM, Josh Poimboeuf  wrote:
> On Mon, Nov 20, 2017 at 01:30:12PM -0800, Andy Lutomirski wrote:
>> On Mon, Nov 20, 2017 at 1:27 PM, Josh Poimboeuf  wrote:
>> > On Mon, Nov 20, 2017 at 01:07:16PM -0800, Andy Lutomirski wrote:
>> >> >> but, more importantly, the OOPS unwinder will just bail without this
>> >> >> patch.  With the patch, we get a valid unwind, except that everything
>> >> >> has a ?  in front.
>> >> >
>> >> > Hm.  I can't even fathom how that's possible.  Are you talking about the
>> >> > "unwind from NMI to SYSENTER stack" path?  Or any unwind to a syscall?
>> >> > Either way I'm baffled...  If the unwinder only encounters the SYSENTER
>> >> > stack at the end, how could that cause everything beforehand to have a
>> >> > question mark?
>> >>
>> >> I mean that, if I put a ud2 or other bug in the code that runs on the
>> >> SYSENTER stack, without this patch, I get a totally blank call trace.
>> >
>> > I would expect a blank call trace either way...
>>
>> Try making sync_regs use a few kB of stack space or, better yet, call
>> a non-inlined function that uses too much stack.
>
> You mean overflow the exception stack?  I still don't see how that would
> do it.
>
> If you could show a specific example, with splats from before/after,
> that would be helpful.  Because I still have no idea how this patch
> could possibly help.

I added BUG() to sync_regs().  With the patch, I get:

[4.211553] PANIC: double fault, error_code: 0x0
[4.212113] CPU: 0 PID: 1 Comm: sh Not tainted 4.14.0+ #920
[4.212741] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1.fc26 04/01/2014
[4.213536] task: 88001aa18000 task.stack: 88001aa2
[4.214059] RIP: 0010:do_error_trap+0x33/0x1c0
[4.214449] RSP: :ff1b8f78 EFLAGS: 00010096
[4.214934] RAX: dc00 RBX: ff1b8f90 RCX: 0006
[4.215554] RDX: 82048b20 RSI:  RDI: ff1b9110
[4.216176] RBP: ff1b9088 R08: 0004 R09: 
[4.216793] R10:  R11: fbe3723f R12: 0006
[4.217419] R13:  R14: 0004 R15: 
[4.218046] FS:  () GS:88001ae0()
knlGS:
[4.218775] CS:  0010 DS:  ES:  CR0: 80050033
[4.219280] CR2: ff1b8f68 CR3: 193da002 CR4: 003606f0
[4.219931] Call Trace:
[4.220156]  
[4.220383]  ? async_page_fault+0x36/0x60
[4.220768]  ? invalid_op+0x22/0x40
[4.221087]  ? async_page_fault+0x36/0x60
[4.221442]  ? sync_regs+0x3c/0x40
[4.221745]  ? sync_regs+0x2e/0x40
[4.222051]  ? error_entry+0x6c/0xd0
[4.222395]  ? async_page_fault+0x36/0x60
[4.222748]  
[4.223014] Code: 00 00 00 fc ff df 41 55 41 54 49 89 f7 55 53 48
89 fd 48 81 c7 88 00 00 00 49 89 cc 45 89 c6 48 81 ec d8 00 00 00 48
8d 5c 24 18 <48> 89 14 24 48 c7 44 24 18 b3 8a b5 41 48 c7 44 24 20 b8
5e 4d

Without the patch, I get:

[3.962218] PANIC: double fault, error_code: 0x0
[3.962728] CPU: 0 PID: 1 Comm: sh Not tainted 4.14.0+ #921
[3.963225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1.fc26 04/01/2014
[3.963998] task: 88001aa18000 task.stack: 88001aa2
[3.964526] RIP: 0010:do_error_trap+0x33/0x1c0
[3.964947] RSP: :ff1b8f78 EFLAGS: 00010096
[3.965439] RAX: dc00 RBX: ff1b8f90 RCX: 0006
[3.966067] RDX: 82048b20 RSI:  RDI: ff1b9110
[3.966696] RBP: ff1b9088 R08: 0004 R09: 
[3.967316] R10:  R11: fbe3723f R12: 0006
[3.967939] R13:  R14: 0004 R15: 
[3.968565] FS:  () GS:88001ae0()
knlGS:
[3.969272] CS:  0010 DS:  ES:  CR0: 80050033
[3.969776] CR2: ff1b8f68 CR3: 19308002 CR4: 003606f0
[3.970406] Call Trace:
[3.970632] Code: 00 00 00 fc ff df 41 55 41 54 49 89 f7 55 53 48
89 fd 48 81 c7 88 00 00 00 49 89 cc 45 89 c6 48 81 ec d8 00 00 00 48
8d 5c 24 18 <48> 89 14 24 48 c7 44 24 18 b3 8a b5 41 48 c7 44 24 20 b8
5e 4d

I prefer the former :)

>
> --
> Josh


Re: [PATCH 02/16] x86/dumpstack: Add get_stack_info() support for the SYSENTER stack

2017-11-20 Thread Andy Lutomirski
On Mon, Nov 20, 2017 at 1:55 PM, Josh Poimboeuf  wrote:
> On Mon, Nov 20, 2017 at 01:30:12PM -0800, Andy Lutomirski wrote:
>> On Mon, Nov 20, 2017 at 1:27 PM, Josh Poimboeuf  wrote:
>> > On Mon, Nov 20, 2017 at 01:07:16PM -0800, Andy Lutomirski wrote:
>> >> >> but, more importantly, the OOPS unwinder will just bail without this
>> >> >> patch.  With the patch, we get a valid unwind, except that everything
>> >> >> has a ?  in front.
>> >> >
>> >> > Hm.  I can't even fathom how that's possible.  Are you talking about the
>> >> > "unwind from NMI to SYSENTER stack" path?  Or any unwind to a syscall?
>> >> > Either way I'm baffled...  If the unwinder only encounters the SYSENTER
>> >> > stack at the end, how could that cause everything beforehand to have a
>> >> > question mark?
>> >>
>> >> I mean that, if I put a ud2 or other bug in the code that runs on the
>> >> SYSENTER stack, without this patch, I get a totally blank call trace.
>> >
>> > I would expect a blank call trace either way...
>>
>> Try making sync_regs use a few kB of stack space or, better yet, call
>> a non-inlined function that uses too much stack.
>
> You mean overflow the exception stack?  I still don't see how that would
> do it.
>
> If you could show a specific example, with splats from before/after,
> that would be helpful.  Because I still have no idea how this patch
> could possibly help.

I added BUG() to sync_regs().  With the patch, I get:

[4.211553] PANIC: double fault, error_code: 0x0
[4.212113] CPU: 0 PID: 1 Comm: sh Not tainted 4.14.0+ #920
[4.212741] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1.fc26 04/01/2014
[4.213536] task: 88001aa18000 task.stack: 88001aa2
[4.214059] RIP: 0010:do_error_trap+0x33/0x1c0
[4.214449] RSP: :ff1b8f78 EFLAGS: 00010096
[4.214934] RAX: dc00 RBX: ff1b8f90 RCX: 0006
[4.215554] RDX: 82048b20 RSI:  RDI: ff1b9110
[4.216176] RBP: ff1b9088 R08: 0004 R09: 
[4.216793] R10:  R11: fbe3723f R12: 0006
[4.217419] R13:  R14: 0004 R15: 
[4.218046] FS:  () GS:88001ae0()
knlGS:
[4.218775] CS:  0010 DS:  ES:  CR0: 80050033
[4.219280] CR2: ff1b8f68 CR3: 193da002 CR4: 003606f0
[4.219931] Call Trace:
[4.220156]  
[4.220383]  ? async_page_fault+0x36/0x60
[4.220768]  ? invalid_op+0x22/0x40
[4.221087]  ? async_page_fault+0x36/0x60
[4.221442]  ? sync_regs+0x3c/0x40
[4.221745]  ? sync_regs+0x2e/0x40
[4.222051]  ? error_entry+0x6c/0xd0
[4.222395]  ? async_page_fault+0x36/0x60
[4.222748]  
[4.223014] Code: 00 00 00 fc ff df 41 55 41 54 49 89 f7 55 53 48
89 fd 48 81 c7 88 00 00 00 49 89 cc 45 89 c6 48 81 ec d8 00 00 00 48
8d 5c 24 18 <48> 89 14 24 48 c7 44 24 18 b3 8a b5 41 48 c7 44 24 20 b8
5e 4d

Without the patch, I get:

[3.962218] PANIC: double fault, error_code: 0x0
[3.962728] CPU: 0 PID: 1 Comm: sh Not tainted 4.14.0+ #921
[3.963225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.10.2-1.fc26 04/01/2014
[3.963998] task: 88001aa18000 task.stack: 88001aa2
[3.964526] RIP: 0010:do_error_trap+0x33/0x1c0
[3.964947] RSP: :ff1b8f78 EFLAGS: 00010096
[3.965439] RAX: dc00 RBX: ff1b8f90 RCX: 0006
[3.966067] RDX: 82048b20 RSI:  RDI: ff1b9110
[3.966696] RBP: ff1b9088 R08: 0004 R09: 
[3.967316] R10:  R11: fbe3723f R12: 0006
[3.967939] R13:  R14: 0004 R15: 
[3.968565] FS:  () GS:88001ae0()
knlGS:
[3.969272] CS:  0010 DS:  ES:  CR0: 80050033
[3.969776] CR2: ff1b8f68 CR3: 19308002 CR4: 003606f0
[3.970406] Call Trace:
[3.970632] Code: 00 00 00 fc ff df 41 55 41 54 49 89 f7 55 53 48
89 fd 48 81 c7 88 00 00 00 49 89 cc 45 89 c6 48 81 ec d8 00 00 00 48
8d 5c 24 18 <48> 89 14 24 48 c7 44 24 18 b3 8a b5 41 48 c7 44 24 20 b8
5e 4d

I prefer the former :)

>
> --
> Josh


Re: [PATCH 00/13] block: assorted cleanup for bio splitting and cloning.

2017-11-20 Thread Mike Snitzer
On Mon, Nov 20 2017 at  7:34pm -0500,
NeilBrown  wrote:

> On Mon, Nov 20 2017, Mike Snitzer wrote:
> 
> > On Sun, Jun 18, 2017 at 5:36 PM, NeilBrown  wrote:
> >> On Sun, Jun 18 2017, Jens Axboe wrote:
> >>
> >>> On Sun, Jun 18 2017, NeilBrown wrote:
>  This is a resend of my series of patches working
>  towards removing the bioset work queues.
> 
>  This set is based on for-4.13/block.
> 
>  It incorporates the revised versions of all the patches that were
>  resent following feedback on the last set.
> 
>  It also includes a minor grammatic improvement to a comment, and
>  simple changes to compensate for a couple of changes to the block tree
>  since the last posting.
> 
>  I hope to eventually get rid of the new BIOSET_NEED_RESCUER flag,
>  but that needs work in dm and probably bcache first.
> >>>
> >>> Thanks Neil, applied.
> >>
> >> Thanks a lot Jens.
> >
> > I missed this line of work until now.  Not quite sure why I wasn't
> > cc'd or my review/ack required for the DM changes in this patchset.
> 
> Hi Mike,
>  I'm sorry you weren't included on those.  My thinking at the time was
>  probably that they were purely cosmetic changes which made no
>  functional difference to dm.  That is no excuse though and I do
>  apologize.
> 
> >
> > But I've now queued this patch for once Linus gets back (reverts DM
> > changes from commit 47e0fb461f):
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next=c9fdc42ba23eabd1ba7aef199fb9bb4b4fe5c545
> 
> This patch does two things.
> 1/ It removes the BIOSET_NEED_RESCUER flag from biosets created by dm.
>   This a functional changed over the code from before my patches.
>   Previously, all biosets were given a rescuer thread.
>   After my patch set, biosets only got a rescuer thread if
>   BIOSET_NEED_RESCUER was passed, and it was passed for all biosets.
>   I then removed it from places were I was certain it wasn't needed.
>   I didn't remove it from dm because I wasn't certain.  Your
>   patch does remove the flags, which I think is incorrect - see below.
> 
> 2/ It changes flush_current_bio_list() so that bios allocated from a
>bioset that does not have a rescue_workqueue are now added to
>the ->rescue_list for their bio_set, and ->rescue_work is queued
>on the NULL ->rescue_workqueue, resulting in a NULL dereference.
>I suspect you don't want this.
> 
> The patch description claims that the patch fixes something, but it
> isn't clear to me what it is meant to be fixing.
> 
> It makes reference to  dbba42d8 which is described as removing an unused
> bioset process, though what it actually does is remove an used bioset
> (and obvious the process disappears with it).  My patch doesn't change
> that behavior.

Well I looked at this because Zdenek reported that with more recent
kernels he is seeing the "bioset" per DM device again (whereas it was
thought to be removed with mikulas' commit dbba42d8 -- but that commit
removed "bioset" only in terms of q->bio_split.

Looks like the q->bio_split bioset is created with BIOSET_NEED_RESCUER
but DM calls bioset_free() for q->bio_split.  Strikes me as odd that
"bioset" was removed DM devices until it recently reappeared.
Especially if what you say is accurate (that BIOSET_NEED_RESCUER was
implied with the old code.. I believe you!)

I tried to quickly answer how "bioset" is now re-appearing for DM
devices but I obviously need to put more time to it.

(And my goodness does this bioset rescue workqueue need a better name
than "bioset"! Should include the bdevname too)

> > As is, it is my understanding that DM has no need for a bio_set's
> > rescue_workqueue.  So its removal would appear to only be gated by
> > bcache?
> >
> > But I could be mistaken, moving forward please let me know what you
> > feel needs doing in DM to make it a better citizen.
> 
> I think you are mistaken.
> Please see
>https://www.redhat.com/archives/dm-devel/2017-August/msg00310.html
> and
>https://www.redhat.com/archives/dm-devel/2017-August/msg00315.html
> for which the thread continues:
>https://www.redhat.com/archives/dm-devel/2017-September/msg1.html
> 
> These were sent to you, though most of the conversation happened with
> Mikulas.
> 
> I think that the patches in those threads explain why dm currently needs
> rescuer threads, and shows how dm can be changed to no longer need the
> rescuer.  I would appreciate your thoughts on these patches.  I can
> resend them if that would help.

No need to resend.  I'll work through the old threads.

> That would then just leave bcache  I find it a bit of a challenge to
> reason about the code in bcache, but if we can remove
> BIOSET_NEED_RESCUER from dm, that will be an extra incentive for me to learn 
> :-)

I'm all for properly removing BIOSET_NEED_RESCUER from DM.

I'll look closer at all of this in the morning (for now I'm backing the

Re: [PATCH 00/13] block: assorted cleanup for bio splitting and cloning.

2017-11-20 Thread Mike Snitzer
On Mon, Nov 20 2017 at  7:34pm -0500,
NeilBrown  wrote:

> On Mon, Nov 20 2017, Mike Snitzer wrote:
> 
> > On Sun, Jun 18, 2017 at 5:36 PM, NeilBrown  wrote:
> >> On Sun, Jun 18 2017, Jens Axboe wrote:
> >>
> >>> On Sun, Jun 18 2017, NeilBrown wrote:
>  This is a resend of my series of patches working
>  towards removing the bioset work queues.
> 
>  This set is based on for-4.13/block.
> 
>  It incorporates the revised versions of all the patches that were
>  resent following feedback on the last set.
> 
>  It also includes a minor grammatic improvement to a comment, and
>  simple changes to compensate for a couple of changes to the block tree
>  since the last posting.
> 
>  I hope to eventually get rid of the new BIOSET_NEED_RESCUER flag,
>  but that needs work in dm and probably bcache first.
> >>>
> >>> Thanks Neil, applied.
> >>
> >> Thanks a lot Jens.
> >
> > I missed this line of work until now.  Not quite sure why I wasn't
> > cc'd or my review/ack required for the DM changes in this patchset.
> 
> Hi Mike,
>  I'm sorry you weren't included on those.  My thinking at the time was
>  probably that they were purely cosmetic changes which made no
>  functional difference to dm.  That is no excuse though and I do
>  apologize.
> 
> >
> > But I've now queued this patch for once Linus gets back (reverts DM
> > changes from commit 47e0fb461f):
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next=c9fdc42ba23eabd1ba7aef199fb9bb4b4fe5c545
> 
> This patch does two things.
> 1/ It removes the BIOSET_NEED_RESCUER flag from biosets created by dm.
>   This a functional changed over the code from before my patches.
>   Previously, all biosets were given a rescuer thread.
>   After my patch set, biosets only got a rescuer thread if
>   BIOSET_NEED_RESCUER was passed, and it was passed for all biosets.
>   I then removed it from places were I was certain it wasn't needed.
>   I didn't remove it from dm because I wasn't certain.  Your
>   patch does remove the flags, which I think is incorrect - see below.
> 
> 2/ It changes flush_current_bio_list() so that bios allocated from a
>bioset that does not have a rescue_workqueue are now added to
>the ->rescue_list for their bio_set, and ->rescue_work is queued
>on the NULL ->rescue_workqueue, resulting in a NULL dereference.
>I suspect you don't want this.
> 
> The patch description claims that the patch fixes something, but it
> isn't clear to me what it is meant to be fixing.
> 
> It makes reference to  dbba42d8 which is described as removing an unused
> bioset process, though what it actually does is remove an used bioset
> (and obvious the process disappears with it).  My patch doesn't change
> that behavior.

Well I looked at this because Zdenek reported that with more recent
kernels he is seeing the "bioset" per DM device again (whereas it was
thought to be removed with mikulas' commit dbba42d8 -- but that commit
removed "bioset" only in terms of q->bio_split.

Looks like the q->bio_split bioset is created with BIOSET_NEED_RESCUER
but DM calls bioset_free() for q->bio_split.  Strikes me as odd that
"bioset" was removed DM devices until it recently reappeared.
Especially if what you say is accurate (that BIOSET_NEED_RESCUER was
implied with the old code.. I believe you!)

I tried to quickly answer how "bioset" is now re-appearing for DM
devices but I obviously need to put more time to it.

(And my goodness does this bioset rescue workqueue need a better name
than "bioset"! Should include the bdevname too)

> > As is, it is my understanding that DM has no need for a bio_set's
> > rescue_workqueue.  So its removal would appear to only be gated by
> > bcache?
> >
> > But I could be mistaken, moving forward please let me know what you
> > feel needs doing in DM to make it a better citizen.
> 
> I think you are mistaken.
> Please see
>https://www.redhat.com/archives/dm-devel/2017-August/msg00310.html
> and
>https://www.redhat.com/archives/dm-devel/2017-August/msg00315.html
> for which the thread continues:
>https://www.redhat.com/archives/dm-devel/2017-September/msg1.html
> 
> These were sent to you, though most of the conversation happened with
> Mikulas.
> 
> I think that the patches in those threads explain why dm currently needs
> rescuer threads, and shows how dm can be changed to no longer need the
> rescuer.  I would appreciate your thoughts on these patches.  I can
> resend them if that would help.

No need to resend.  I'll work through the old threads.

> That would then just leave bcache  I find it a bit of a challenge to
> reason about the code in bcache, but if we can remove
> BIOSET_NEED_RESCUER from dm, that will be an extra incentive for me to learn 
> :-)

I'm all for properly removing BIOSET_NEED_RESCUER from DM.

I'll look closer at all of this in the morning (for now I'm backing the
patch I referenced out from 

Re: [PATCH 07/16] x86/asm: Move SYSENTER_stack to the beginning of struct tss_struct

2017-11-20 Thread Andy Lutomirski
On Mon, Nov 20, 2017 at 3:37 PM, Thomas Gleixner  wrote:
> On Mon, 20 Nov 2017, Andy Lutomirski wrote:
>>  struct tss_struct {
>>   /*
>> -  * The hardware state:
>> +  * Space for the temporary SYSENTER stack.  Used for the entry
>> +  * trampoline as well.  Size it such that tss_struct ends up
>> +  * as a multiple of PAGE_SIZE.  This calculation assumes that
>> +  * io_bitmap is a multiple of PAGE_SIZE (8192 bytes) plus one
>> +  * long.
>
> I don't see how sizeof(tss_struct) is a multiple of PAGE_SIZE
>
> canary  =8
> stack   =  512
> hw_tss  =  104
> io bitmap   = 8200
> -
>   8824
>
> The alignment is what blows it up to 3 * PAGE_SIZE

Whoops!  That *was* correct in the RFC version version, but I changed
it and failed to fix the comment.


Re: [PATCH 07/16] x86/asm: Move SYSENTER_stack to the beginning of struct tss_struct

2017-11-20 Thread Andy Lutomirski
On Mon, Nov 20, 2017 at 3:37 PM, Thomas Gleixner  wrote:
> On Mon, 20 Nov 2017, Andy Lutomirski wrote:
>>  struct tss_struct {
>>   /*
>> -  * The hardware state:
>> +  * Space for the temporary SYSENTER stack.  Used for the entry
>> +  * trampoline as well.  Size it such that tss_struct ends up
>> +  * as a multiple of PAGE_SIZE.  This calculation assumes that
>> +  * io_bitmap is a multiple of PAGE_SIZE (8192 bytes) plus one
>> +  * long.
>
> I don't see how sizeof(tss_struct) is a multiple of PAGE_SIZE
>
> canary  =8
> stack   =  512
> hw_tss  =  104
> io bitmap   = 8200
> -
>   8824
>
> The alignment is what blows it up to 3 * PAGE_SIZE

Whoops!  That *was* correct in the RFC version version, but I changed
it and failed to fix the comment.


Re: [PATCH v4] iio : Add cm3218 smbus ara and acpi support

2017-11-20 Thread Phil Reid

On 20/11/2017 18:57, Mika Westerberg wrote:

+Jarkko

On Sun, Nov 19, 2017 at 04:35:51PM +, Jonathan Cameron wrote:

On Thu, 2 Nov 2017 16:04:07 +0100
Wolfram Sang  wrote:


On Thu, Nov 02, 2017 at 02:35:50PM +, Jonathan Cameron wrote:

On Fri, 27 Oct 2017 18:27:02 +0200
Marc CAPDEVILLE  wrote:
   

On asus T100, Capella cm3218 chip is implemented as ambiant light
sensor. This chip expose an smbus ARA protocol device on standard
address 0x0c. The chip is not functional before all alerts are
acknowledged.
On asus T100, this device is enumerated on ACPI bus and the
description give tow I2C connection. The first is the connection to
the ARA device and the second gives the real address of the device.

So, on device probe,If the i2c address is the ARA address and the
device is enumerated via acpi, we lookup for the real address in
the ACPI resource list and change it in the client structure.
if an interrupt resource is given, and only for cm3218 chip,
we declare an smbus_alert device.

Signed-off-by: Marc CAPDEVILLE 


Wolfram - this needs input from you on how to neatly handle
an ACPI registered ARA.


ACPI is really not my field. Try asking the I2C ACPI maintainers or
Benjamin Tissoires  who did work on SMBus
interrupts recently.


Hi Mika, Benjamin,

So we've lost most of the context in this thread, but the basic question
is how to handle smbus ARA support with ACPI.

https://patchwork.kernel.org/patch/10030309/

Has the proposal made in this driver which is really not terribly nice
(as it registers the ARA device by messing with the address and registering
a second device).

As I understood it the ARA device registration should be handled by the
i2c master, but there are very few examples.

Phil pointed out that equivalent OF support recently got taken from him..
https://www.spinics.net/lists/devicetree/msg191947.html
https://www.spinics.net/lists/linux-i2c/msg31173.html

Any thoughts on the right way to do this?


There does not seem to be any way in ACPI to tell which "connection" is
used to describe ARA so that part currently is something each driver
needs to handle as they know the device the best. I don't think we have
any means to handle it in generic way in I2C core except to provide some
helpers that work on top of i2c_setup_smbus_alert() but understand ACPI
resources. Say provide function like this:

   int acpi_i2c_setup_smbus_alert(struct i2c_adapter *adapter, int index);

Which then extracts automatically I2cSerialBus connection from "index"
and calls i2c_setup_smbus_alert() accordingly.

In the long run we could introduce _DSD property that can be used to
name the connection in the same way DT does;

 Name (_CRS, ResourceTemplate () {
 I2cSerialBus () { ... } // ARA
 I2cSerialBus () { ... } // normal device address
 })

 Name (_DSD, Package () {
 ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
 Package () {
 Package () {"smbus_alert", 0} // Where 0 means the first 
I2cSerialBus
 ...
 }
 })



I wonder if it's worth involving the smbus_alert driver in this case.
The cm3218 driver doesn't appear to be using the alert callback in strcut 
i2c_driver.
So the smbus_alert driver is not going to notifiy the cm3218 driver.
Are there more than one alert/ara capable devices on the bus?
Perhaps a workaround in this case is if that acpi entry is defined the cm3218 
driver
handles that ara request directly to clear the interrupt.


--
Regards
Phil Reid



Re: [PATCH v4] iio : Add cm3218 smbus ara and acpi support

2017-11-20 Thread Phil Reid

On 20/11/2017 18:57, Mika Westerberg wrote:

+Jarkko

On Sun, Nov 19, 2017 at 04:35:51PM +, Jonathan Cameron wrote:

On Thu, 2 Nov 2017 16:04:07 +0100
Wolfram Sang  wrote:


On Thu, Nov 02, 2017 at 02:35:50PM +, Jonathan Cameron wrote:

On Fri, 27 Oct 2017 18:27:02 +0200
Marc CAPDEVILLE  wrote:
   

On asus T100, Capella cm3218 chip is implemented as ambiant light
sensor. This chip expose an smbus ARA protocol device on standard
address 0x0c. The chip is not functional before all alerts are
acknowledged.
On asus T100, this device is enumerated on ACPI bus and the
description give tow I2C connection. The first is the connection to
the ARA device and the second gives the real address of the device.

So, on device probe,If the i2c address is the ARA address and the
device is enumerated via acpi, we lookup for the real address in
the ACPI resource list and change it in the client structure.
if an interrupt resource is given, and only for cm3218 chip,
we declare an smbus_alert device.

Signed-off-by: Marc CAPDEVILLE 


Wolfram - this needs input from you on how to neatly handle
an ACPI registered ARA.


ACPI is really not my field. Try asking the I2C ACPI maintainers or
Benjamin Tissoires  who did work on SMBus
interrupts recently.


Hi Mika, Benjamin,

So we've lost most of the context in this thread, but the basic question
is how to handle smbus ARA support with ACPI.

https://patchwork.kernel.org/patch/10030309/

Has the proposal made in this driver which is really not terribly nice
(as it registers the ARA device by messing with the address and registering
a second device).

As I understood it the ARA device registration should be handled by the
i2c master, but there are very few examples.

Phil pointed out that equivalent OF support recently got taken from him..
https://www.spinics.net/lists/devicetree/msg191947.html
https://www.spinics.net/lists/linux-i2c/msg31173.html

Any thoughts on the right way to do this?


There does not seem to be any way in ACPI to tell which "connection" is
used to describe ARA so that part currently is something each driver
needs to handle as they know the device the best. I don't think we have
any means to handle it in generic way in I2C core except to provide some
helpers that work on top of i2c_setup_smbus_alert() but understand ACPI
resources. Say provide function like this:

   int acpi_i2c_setup_smbus_alert(struct i2c_adapter *adapter, int index);

Which then extracts automatically I2cSerialBus connection from "index"
and calls i2c_setup_smbus_alert() accordingly.

In the long run we could introduce _DSD property that can be used to
name the connection in the same way DT does;

 Name (_CRS, ResourceTemplate () {
 I2cSerialBus () { ... } // ARA
 I2cSerialBus () { ... } // normal device address
 })

 Name (_DSD, Package () {
 ToUUID("daffd814-6eba-4d8c-8a91-bc9bbf4aa301"),
 Package () {
 Package () {"smbus_alert", 0} // Where 0 means the first 
I2cSerialBus
 ...
 }
 })



I wonder if it's worth involving the smbus_alert driver in this case.
The cm3218 driver doesn't appear to be using the alert callback in strcut 
i2c_driver.
So the smbus_alert driver is not going to notifiy the cm3218 driver.
Are there more than one alert/ara capable devices on the bus?
Perhaps a workaround in this case is if that acpi entry is defined the cm3218 
driver
handles that ara request directly to clear the interrupt.


--
Regards
Phil Reid



Re: [PATCH 04/16] x86/fixmap: Generalize the GDT fixmap mechanism

2017-11-20 Thread Andy Lutomirski
On Mon, Nov 20, 2017 at 2:01 PM, Thomas Gleixner  wrote:
> On Mon, 20 Nov 2017, Andy Lutomirski wrote:
>
> Just a few nits.
>
>>  /* Provide the fixmap address of the remapped GDT */
>>  static inline struct desc_struct *get_cpu_gdt_ro(int cpu)
>>  {
>> - unsigned int idx = get_cpu_gdt_ro_index(cpu);
>> - return (struct desc_struct *)__fix_to_virt(idx);
>> + return (struct desc_struct *)_cpu_entry_area(cpu)->gdt;
>>  }
>>
>>  /* Provide the current read-only GDT */
>> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
>> index b0c505fe9a95..5ff8e251772a 100644
>> --- a/arch/x86/include/asm/fixmap.h
>> +++ b/arch/x86/include/asm/fixmap.h
>> @@ -44,6 +44,17 @@ extern unsigned long __FIXADDR_TOP;
>>PAGE_SIZE)
>>  #endif
>>
>> +/*
>> + * cpu_entry_area is a percpu region in the fixmap that contains things
>> + * need by the CPU and early entry/exit code.  Real types aren't used here
>
> s/need/needed/

Yeah, fixed

>
>> + * to avoid circular header dependencies.
>
> :(

Hmm.  I could probably fix this, but it involves (at least) moving a
struct definition and adding several new includes, and I'm not sure
it'll actually converge to something  working.

>
>> + */
>> +struct cpu_entry_area
>> +{
>> + char gdt[PAGE_SIZE];
>> +};
>> +
>> +#define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE)
>
>> +static inline unsigned int __get_cpu_entry_area_page_index(int cpu, int 
>> page)
>> +{
>> + BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0);
>> +
>> + return FIX_CPU_ENTRY_AREA_BOTTOM - cpu*CPU_ENTRY_AREA_PAGES - page;
>> +}
>> +
>> +#define __get_cpu_entry_area_offset_index(cpu, offset) ({\
>> + BUILD_BUG_ON(offset % PAGE_SIZE != 0);  \
>> + __get_cpu_entry_area_page_index(cpu, offset / PAGE_SIZE);   \
>> + })
>> +
>> +#define get_cpu_entry_area_index(cpu, field) \
>> + __get_cpu_entry_area_offset_index((cpu), offsetof(struct 
>> cpu_entry_area, field))
>
> Any reason why those need to be macros?

The former is a macro because I doubt that BUILD_BUG_ON is valid in
that context in a function.  The latter is a macro because 'field' is
a name, not a value.

>
>> +static inline struct cpu_entry_area *get_cpu_entry_area(int cpu)
>> +{
>> + return (struct cpu_entry_area *)
>> + __fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0));
>> +}
>> +
>
> Thanks,
>
> tglx
>


Re: [PATCH 04/16] x86/fixmap: Generalize the GDT fixmap mechanism

2017-11-20 Thread Andy Lutomirski
On Mon, Nov 20, 2017 at 2:01 PM, Thomas Gleixner  wrote:
> On Mon, 20 Nov 2017, Andy Lutomirski wrote:
>
> Just a few nits.
>
>>  /* Provide the fixmap address of the remapped GDT */
>>  static inline struct desc_struct *get_cpu_gdt_ro(int cpu)
>>  {
>> - unsigned int idx = get_cpu_gdt_ro_index(cpu);
>> - return (struct desc_struct *)__fix_to_virt(idx);
>> + return (struct desc_struct *)_cpu_entry_area(cpu)->gdt;
>>  }
>>
>>  /* Provide the current read-only GDT */
>> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
>> index b0c505fe9a95..5ff8e251772a 100644
>> --- a/arch/x86/include/asm/fixmap.h
>> +++ b/arch/x86/include/asm/fixmap.h
>> @@ -44,6 +44,17 @@ extern unsigned long __FIXADDR_TOP;
>>PAGE_SIZE)
>>  #endif
>>
>> +/*
>> + * cpu_entry_area is a percpu region in the fixmap that contains things
>> + * need by the CPU and early entry/exit code.  Real types aren't used here
>
> s/need/needed/

Yeah, fixed

>
>> + * to avoid circular header dependencies.
>
> :(

Hmm.  I could probably fix this, but it involves (at least) moving a
struct definition and adding several new includes, and I'm not sure
it'll actually converge to something  working.

>
>> + */
>> +struct cpu_entry_area
>> +{
>> + char gdt[PAGE_SIZE];
>> +};
>> +
>> +#define CPU_ENTRY_AREA_PAGES (sizeof(struct cpu_entry_area) / PAGE_SIZE)
>
>> +static inline unsigned int __get_cpu_entry_area_page_index(int cpu, int 
>> page)
>> +{
>> + BUILD_BUG_ON(sizeof(struct cpu_entry_area) % PAGE_SIZE != 0);
>> +
>> + return FIX_CPU_ENTRY_AREA_BOTTOM - cpu*CPU_ENTRY_AREA_PAGES - page;
>> +}
>> +
>> +#define __get_cpu_entry_area_offset_index(cpu, offset) ({\
>> + BUILD_BUG_ON(offset % PAGE_SIZE != 0);  \
>> + __get_cpu_entry_area_page_index(cpu, offset / PAGE_SIZE);   \
>> + })
>> +
>> +#define get_cpu_entry_area_index(cpu, field) \
>> + __get_cpu_entry_area_offset_index((cpu), offsetof(struct 
>> cpu_entry_area, field))
>
> Any reason why those need to be macros?

The former is a macro because I doubt that BUILD_BUG_ON is valid in
that context in a function.  The latter is a macro because 'field' is
a name, not a value.

>
>> +static inline struct cpu_entry_area *get_cpu_entry_area(int cpu)
>> +{
>> + return (struct cpu_entry_area *)
>> + __fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0));
>> +}
>> +
>
> Thanks,
>
> tglx
>


Re: [PATCH] ACPI / battery: add quirk for Asus GL502VSK and UX305LA

2017-11-20 Thread Rafael J. Wysocki
On Friday, September 22, 2017 10:27:44 AM CET Kai-Heng Feng wrote:
> On Asus GL502VSK and UX305LA, ACPI incorrectly reports discharging when
> battery is full and AC is plugged.
> 
> However rate_now is correct under this circumstance, hence we can use
> "rate_now == 0" as a predicate to report battery full status correctly.
> 
> BugLink: https://bugs.launchpad.net/bugs/1482390
> Signed-off-by: Kai-Heng Feng 
> ---
>  drivers/acpi/battery.c | 29 -
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
> index 13e7b56e33ae..f9f008cf3da7 100644
> --- a/drivers/acpi/battery.c
> +++ b/drivers/acpi/battery.c
> @@ -70,6 +70,7 @@ static async_cookie_t async_cookie;
>  static bool battery_driver_registered;
>  static int battery_bix_broken_package;
>  static int battery_notification_delay_ms;
> +static int battery_full_discharging;
>  static unsigned int cache_time = 1000;
>  module_param(cache_time, uint, 0644);
>  MODULE_PARM_DESC(cache_time, "cache time in milliseconds");
> @@ -214,7 +215,10 @@ static int acpi_battery_get_property(struct power_supply 
> *psy,
>   return -ENODEV;
>   switch (psp) {
>   case POWER_SUPPLY_PROP_STATUS:
> - if (battery->state & ACPI_BATTERY_STATE_DISCHARGING)
> + if (battery_full_discharging && battery->rate_now == 0 &&
> + battery->state & ACPI_BATTERY_STATE_DISCHARGING)
> + val->intval = POWER_SUPPLY_STATUS_FULL;
> + else if (battery->state & ACPI_BATTERY_STATE_DISCHARGING)

You are adding a special sub-case to the
(battery->state & ACPI_BATTERY_STATE_DISCHARGING) case, so it would be cleaner
to do that in the following way:

if (battery->state & ACPI_BATTERY_STATE_DISCHARGING) {
if (battery_full_discharging && battery->rate_now == 0)
val->intval = POWER_SUPPLY_STATUS_FULL;
else
val->intval = POWER_SUPPLY_STATUS_DISCHARGING;
} else if (battery->state & ACPI_BATTERY_STATE_CHARGING) {
val->intval = POWER_SUPPLY_STATUS_CHARGING;
}

>   val->intval = POWER_SUPPLY_STATUS_DISCHARGING;
>   else if (battery->state & ACPI_BATTERY_STATE_CHARGING)
>   val->intval = POWER_SUPPLY_STATUS_CHARGING;
> @@ -1166,6 +1170,13 @@ battery_notification_delay_quirk(const struct 
> dmi_system_id *d)
>   return 0;
>  }
>  
> +static int __init
> +battery_full_discharging_quirk(const struct dmi_system_id *d)
> +{
> + battery_full_discharging = 1;
> + return 0;
> +}
> +
>  static const struct dmi_system_id bat_dmi_table[] __initconst = {
>   {
>   .callback = battery_bix_broken_package_quirk,
> @@ -1183,6 +1194,22 @@ static const struct dmi_system_id bat_dmi_table[] 
> __initconst = {
>   DMI_MATCH(DMI_PRODUCT_NAME, "Aspire V5-573G"),
>   },
>   },
> + {
> + .callback = battery_full_discharging_quirk,
> + .ident = "ASUS GL502VSK",
> + .matches = {
> + DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> + DMI_MATCH(DMI_PRODUCT_NAME, "GL502VSK"),
> + },
> + },
> + {
> + .callback = battery_full_discharging_quirk,
> + .ident = "ASUS UX305LA",
> + .matches = {
> + DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> + DMI_MATCH(DMI_PRODUCT_NAME, "UX305LA"),
> + },
> + },
>   {},
>  };
>  
> 

Thanks,
Rafael



Re: [PATCH] ACPI / battery: add quirk for Asus GL502VSK and UX305LA

2017-11-20 Thread Rafael J. Wysocki
On Friday, September 22, 2017 10:27:44 AM CET Kai-Heng Feng wrote:
> On Asus GL502VSK and UX305LA, ACPI incorrectly reports discharging when
> battery is full and AC is plugged.
> 
> However rate_now is correct under this circumstance, hence we can use
> "rate_now == 0" as a predicate to report battery full status correctly.
> 
> BugLink: https://bugs.launchpad.net/bugs/1482390
> Signed-off-by: Kai-Heng Feng 
> ---
>  drivers/acpi/battery.c | 29 -
>  1 file changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
> index 13e7b56e33ae..f9f008cf3da7 100644
> --- a/drivers/acpi/battery.c
> +++ b/drivers/acpi/battery.c
> @@ -70,6 +70,7 @@ static async_cookie_t async_cookie;
>  static bool battery_driver_registered;
>  static int battery_bix_broken_package;
>  static int battery_notification_delay_ms;
> +static int battery_full_discharging;
>  static unsigned int cache_time = 1000;
>  module_param(cache_time, uint, 0644);
>  MODULE_PARM_DESC(cache_time, "cache time in milliseconds");
> @@ -214,7 +215,10 @@ static int acpi_battery_get_property(struct power_supply 
> *psy,
>   return -ENODEV;
>   switch (psp) {
>   case POWER_SUPPLY_PROP_STATUS:
> - if (battery->state & ACPI_BATTERY_STATE_DISCHARGING)
> + if (battery_full_discharging && battery->rate_now == 0 &&
> + battery->state & ACPI_BATTERY_STATE_DISCHARGING)
> + val->intval = POWER_SUPPLY_STATUS_FULL;
> + else if (battery->state & ACPI_BATTERY_STATE_DISCHARGING)

You are adding a special sub-case to the
(battery->state & ACPI_BATTERY_STATE_DISCHARGING) case, so it would be cleaner
to do that in the following way:

if (battery->state & ACPI_BATTERY_STATE_DISCHARGING) {
if (battery_full_discharging && battery->rate_now == 0)
val->intval = POWER_SUPPLY_STATUS_FULL;
else
val->intval = POWER_SUPPLY_STATUS_DISCHARGING;
} else if (battery->state & ACPI_BATTERY_STATE_CHARGING) {
val->intval = POWER_SUPPLY_STATUS_CHARGING;
}

>   val->intval = POWER_SUPPLY_STATUS_DISCHARGING;
>   else if (battery->state & ACPI_BATTERY_STATE_CHARGING)
>   val->intval = POWER_SUPPLY_STATUS_CHARGING;
> @@ -1166,6 +1170,13 @@ battery_notification_delay_quirk(const struct 
> dmi_system_id *d)
>   return 0;
>  }
>  
> +static int __init
> +battery_full_discharging_quirk(const struct dmi_system_id *d)
> +{
> + battery_full_discharging = 1;
> + return 0;
> +}
> +
>  static const struct dmi_system_id bat_dmi_table[] __initconst = {
>   {
>   .callback = battery_bix_broken_package_quirk,
> @@ -1183,6 +1194,22 @@ static const struct dmi_system_id bat_dmi_table[] 
> __initconst = {
>   DMI_MATCH(DMI_PRODUCT_NAME, "Aspire V5-573G"),
>   },
>   },
> + {
> + .callback = battery_full_discharging_quirk,
> + .ident = "ASUS GL502VSK",
> + .matches = {
> + DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> + DMI_MATCH(DMI_PRODUCT_NAME, "GL502VSK"),
> + },
> + },
> + {
> + .callback = battery_full_discharging_quirk,
> + .ident = "ASUS UX305LA",
> + .matches = {
> + DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
> + DMI_MATCH(DMI_PRODUCT_NAME, "UX305LA"),
> + },
> + },
>   {},
>  };
>  
> 

Thanks,
Rafael



Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: call _fsl_ssi_set_dai_fmt() just once in AC'97 mode

2017-11-20 Thread Nicolin Chen
On Tue, Nov 21, 2017 at 01:32:09AM +0100, Maciej S. Szmigiero wrote:
> On 21.11.2017 01:00, Nicolin Chen wrote:
> > On Mon, Nov 20, 2017 at 11:13:45PM +0100, Maciej S. Szmigiero wrote:
> (..)
> >> We need to make sure, however, that only proper channel slots are enabled
> >> at playback start time since some AC'97 CODECs (like VT1613) were observed
> >> requesting via SLOTREQ (and so enabling at SSI) spurious ones just after
> >> an AC'97 link is started but before the CODEC is configured by its driver.
> > 
> > I don't really understand this part. Why do we need to *make sure*
> > and set SACCDIS and SACCEN again since they're initialized already> 
> > Could you please elaborate a bit more?
> 
> SACCDIS and SACCEN aren't just normal registers.
> They are a way to disable or enable bits in SACCST register - writing
> a '1' bit at some position in SACCDIS unsets the relevant bit in SACCST,
> writing a '1' bit at some position in SACCEN sets the relevant bit in
> SACCST.
> 
> But a bit in SACCST can also be set (and so an AC'97 channel slot
> enabled) if a CODEC sets the relevant bit in its SLOTREQ request (it is
> enough if this bit was set in SLOTREQ just once, bits in SACCST are
> 'sticky').

This makes sense now. It could be mentioned a bit in the commit log
as well.

> If an extra slot gets enabled that's a disaster for playback because some
> of normal left or right channel samples are instead redirected to this
> extra slot.
> 
> Unfortunately, the VT1613 codec on UDOO board (which is the only user of
> fsl_ssi driver in the AC'97 mode) is a bit broken and likes to (sometimes)
> set extra bits in its SLOTREQ request - I've confirmed this on two boards
> bought a few months apart directly from UDOO shop.
> 
> A workaround implemented in fsl-asoc-card of setting an appropriate
> CODEC register so that slots 3/4 (the normal stereo playback slots) are
> used for S/PDIF seems to mostly fix this issue but since this codec is so
> untrustworthy then:

If this move is also a workaround, probably it'd be better to have
an fsl_ssi_ac97_xxx_war() function with some comments/descriptions,
and include it in the config(). Since it doesn't hurt to set these
registers again, I am personally fine with that. But it needs to be
clear -- otherwise, people may try to remove it later with a change
like: Removing redundant SACCEN/SACCDIS settings since they're done
during probe().


Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: call _fsl_ssi_set_dai_fmt() just once in AC'97 mode

2017-11-20 Thread Nicolin Chen
On Tue, Nov 21, 2017 at 01:32:09AM +0100, Maciej S. Szmigiero wrote:
> On 21.11.2017 01:00, Nicolin Chen wrote:
> > On Mon, Nov 20, 2017 at 11:13:45PM +0100, Maciej S. Szmigiero wrote:
> (..)
> >> We need to make sure, however, that only proper channel slots are enabled
> >> at playback start time since some AC'97 CODECs (like VT1613) were observed
> >> requesting via SLOTREQ (and so enabling at SSI) spurious ones just after
> >> an AC'97 link is started but before the CODEC is configured by its driver.
> > 
> > I don't really understand this part. Why do we need to *make sure*
> > and set SACCDIS and SACCEN again since they're initialized already> 
> > Could you please elaborate a bit more?
> 
> SACCDIS and SACCEN aren't just normal registers.
> They are a way to disable or enable bits in SACCST register - writing
> a '1' bit at some position in SACCDIS unsets the relevant bit in SACCST,
> writing a '1' bit at some position in SACCEN sets the relevant bit in
> SACCST.
> 
> But a bit in SACCST can also be set (and so an AC'97 channel slot
> enabled) if a CODEC sets the relevant bit in its SLOTREQ request (it is
> enough if this bit was set in SLOTREQ just once, bits in SACCST are
> 'sticky').

This makes sense now. It could be mentioned a bit in the commit log
as well.

> If an extra slot gets enabled that's a disaster for playback because some
> of normal left or right channel samples are instead redirected to this
> extra slot.
> 
> Unfortunately, the VT1613 codec on UDOO board (which is the only user of
> fsl_ssi driver in the AC'97 mode) is a bit broken and likes to (sometimes)
> set extra bits in its SLOTREQ request - I've confirmed this on two boards
> bought a few months apart directly from UDOO shop.
> 
> A workaround implemented in fsl-asoc-card of setting an appropriate
> CODEC register so that slots 3/4 (the normal stereo playback slots) are
> used for S/PDIF seems to mostly fix this issue but since this codec is so
> untrustworthy then:

If this move is also a workaround, probably it'd be better to have
an fsl_ssi_ac97_xxx_war() function with some comments/descriptions,
and include it in the config(). Since it doesn't hurt to set these
registers again, I am personally fine with that. But it needs to be
clear -- otherwise, people may try to remove it later with a change
like: Removing redundant SACCEN/SACCDIS settings since they're done
during probe().


Re: [PATCH v2] mm: show total hugetlb memory consumption in /proc/meminfo

2017-11-20 Thread Mike Kravetz
On 11/20/2017 04:51 PM, Andrew Morton wrote:
> On Wed, 15 Nov 2017 23:14:09 + Roman Gushchin  wrote:
> 
>> Currently we display some hugepage statistics (total, free, etc)
>> in /proc/meminfo, but only for default hugepage size (e.g. 2Mb).
>>
>> If hugepages of different sizes are used (like 2Mb and 1Gb on x86-64),
>> /proc/meminfo output can be confusing, as non-default sized hugepages
>> are not reflected at all, and there are no signs that they are
>> existing and consuming system memory.
>>
>> To solve this problem, let's display the total amount of memory,
>> consumed by hugetlb pages of all sized (both free and used).
>> Let's call it "Hugetlb", and display size in kB to match generic
>> /proc/meminfo style.
>>
>> For example, (1024 2Mb pages and 2 1Gb pages are pre-allocated):
>>   $ cat /proc/meminfo
>>   MemTotal:8168984 kB
>>   MemFree: 3789276 kB
>>   <...>
>>   CmaFree:   0 kB
>>   HugePages_Total:1024
>>   HugePages_Free: 1024
>>   HugePages_Rsvd:0
>>   HugePages_Surp:0
>>   Hugepagesize:   2048 kB
>>   Hugetlb: 4194304 kB
>>   DirectMap4k:   32632 kB
>>   DirectMap2M: 4161536 kB
>>   DirectMap1G: 6291456 kB
>>
>> Also, this patch updates corresponding docs to reflect
>> Hugetlb entry meaning and difference between Hugetlb and
>> HugePages_Total * Hugepagesize.
>>
>> ...
>>
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2973,20 +2973,32 @@ int hugetlb_overcommit_handler(struct ctl_table 
>> *table, int write,
>>  
>>  void hugetlb_report_meminfo(struct seq_file *m)
>>  {
>> -struct hstate *h = _hstate;
>> +struct hstate *h;
>> +unsigned long total = 0;
>> +
>>  if (!hugepages_supported())
>>  return;
>> -seq_printf(m,
>> -"HugePages_Total:   %5lu\n"
>> -"HugePages_Free:%5lu\n"
>> -"HugePages_Rsvd:%5lu\n"
>> -"HugePages_Surp:%5lu\n"
>> -"Hugepagesize:   %8lu kB\n",
>> -h->nr_huge_pages,
>> -h->free_huge_pages,
>> -h->resv_huge_pages,
>> -h->surplus_huge_pages,
>> -1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
>> +
>> +for_each_hstate(h) {
>> +unsigned long count = h->nr_huge_pages;
>> +
>> +total += (PAGE_SIZE << huge_page_order(h)) * count;
>> +
>> +if (h == _hstate)
> 
> I'm not understanding this test.  Are we assuming that default_hstate
> always refers to the highest-index hstate?  If so why, and is that
> valid?

Actually default_hstate is defined as:

#define default_hstate (hstates[default_hstate_idx])

default_hstate_idx is set during hugetlb_init based upon default_hstate_size
which defaults to HPAGE_SIZE.  However, it can be overridden by the kernel
command line argument "default_hugepagesz=".

By definition and history /proc/meminfo lists information on the default
huge page size.  This code is looping through all hstates to get the total
memory consumed by hugetlb pages for the new "Hugetlb" field.  When it gets
to the default huge page size, it prints the historic fields.

Hope that helps,
-- 
Mike Kravetz

> 
>> +seq_printf(m,
>> +   "HugePages_Total:   %5lu\n"
>> +   "HugePages_Free:%5lu\n"
>> +   "HugePages_Rsvd:%5lu\n"
>> +   "HugePages_Surp:%5lu\n"
>> +   "Hugepagesize:   %8lu kB\n",
>> +   count,
>> +   h->free_huge_pages,
>> +   h->resv_huge_pages,
>> +   h->surplus_huge_pages,
>> +   (PAGE_SIZE << huge_page_order(h)) / 1024);
>> +}
>> +
>> +seq_printf(m, "Hugetlb:%8lu kB\n", total / 1024);
>>  }
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
> 


Re: [PATCH v2] mm: show total hugetlb memory consumption in /proc/meminfo

2017-11-20 Thread Mike Kravetz
On 11/20/2017 04:51 PM, Andrew Morton wrote:
> On Wed, 15 Nov 2017 23:14:09 + Roman Gushchin  wrote:
> 
>> Currently we display some hugepage statistics (total, free, etc)
>> in /proc/meminfo, but only for default hugepage size (e.g. 2Mb).
>>
>> If hugepages of different sizes are used (like 2Mb and 1Gb on x86-64),
>> /proc/meminfo output can be confusing, as non-default sized hugepages
>> are not reflected at all, and there are no signs that they are
>> existing and consuming system memory.
>>
>> To solve this problem, let's display the total amount of memory,
>> consumed by hugetlb pages of all sized (both free and used).
>> Let's call it "Hugetlb", and display size in kB to match generic
>> /proc/meminfo style.
>>
>> For example, (1024 2Mb pages and 2 1Gb pages are pre-allocated):
>>   $ cat /proc/meminfo
>>   MemTotal:8168984 kB
>>   MemFree: 3789276 kB
>>   <...>
>>   CmaFree:   0 kB
>>   HugePages_Total:1024
>>   HugePages_Free: 1024
>>   HugePages_Rsvd:0
>>   HugePages_Surp:0
>>   Hugepagesize:   2048 kB
>>   Hugetlb: 4194304 kB
>>   DirectMap4k:   32632 kB
>>   DirectMap2M: 4161536 kB
>>   DirectMap1G: 6291456 kB
>>
>> Also, this patch updates corresponding docs to reflect
>> Hugetlb entry meaning and difference between Hugetlb and
>> HugePages_Total * Hugepagesize.
>>
>> ...
>>
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -2973,20 +2973,32 @@ int hugetlb_overcommit_handler(struct ctl_table 
>> *table, int write,
>>  
>>  void hugetlb_report_meminfo(struct seq_file *m)
>>  {
>> -struct hstate *h = _hstate;
>> +struct hstate *h;
>> +unsigned long total = 0;
>> +
>>  if (!hugepages_supported())
>>  return;
>> -seq_printf(m,
>> -"HugePages_Total:   %5lu\n"
>> -"HugePages_Free:%5lu\n"
>> -"HugePages_Rsvd:%5lu\n"
>> -"HugePages_Surp:%5lu\n"
>> -"Hugepagesize:   %8lu kB\n",
>> -h->nr_huge_pages,
>> -h->free_huge_pages,
>> -h->resv_huge_pages,
>> -h->surplus_huge_pages,
>> -1UL << (huge_page_order(h) + PAGE_SHIFT - 10));
>> +
>> +for_each_hstate(h) {
>> +unsigned long count = h->nr_huge_pages;
>> +
>> +total += (PAGE_SIZE << huge_page_order(h)) * count;
>> +
>> +if (h == _hstate)
> 
> I'm not understanding this test.  Are we assuming that default_hstate
> always refers to the highest-index hstate?  If so why, and is that
> valid?

Actually default_hstate is defined as:

#define default_hstate (hstates[default_hstate_idx])

default_hstate_idx is set during hugetlb_init based upon default_hstate_size
which defaults to HPAGE_SIZE.  However, it can be overridden by the kernel
command line argument "default_hugepagesz=".

By definition and history /proc/meminfo lists information on the default
huge page size.  This code is looping through all hstates to get the total
memory consumed by hugetlb pages for the new "Hugetlb" field.  When it gets
to the default huge page size, it prints the historic fields.

Hope that helps,
-- 
Mike Kravetz

> 
>> +seq_printf(m,
>> +   "HugePages_Total:   %5lu\n"
>> +   "HugePages_Free:%5lu\n"
>> +   "HugePages_Rsvd:%5lu\n"
>> +   "HugePages_Surp:%5lu\n"
>> +   "Hugepagesize:   %8lu kB\n",
>> +   count,
>> +   h->free_huge_pages,
>> +   h->resv_huge_pages,
>> +   h->surplus_huge_pages,
>> +   (PAGE_SIZE << huge_page_order(h)) / 1024);
>> +}
>> +
>> +seq_printf(m, "Hugetlb:%8lu kB\n", total / 1024);
>>  }
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: mailto:"d...@kvack.org;> em...@kvack.org 
> 


Re: [PATCH] devres: use MACRO instead of function for devm_ioremap_nocache

2017-11-20 Thread Yisheng Xie


On 2017/11/20 17:32, Yisheng Xie wrote:
> Default ioremap is ioremap_nocache, so devm_ioremap has the same function
> with devm_ioremap_nocache, which may just be killed. However, there
> are many places which use devm_ioremap_nocache instead of devm_ioremap.
> 
> This patch is to use MACRO for devm_ioremap_nocache, which will reduce
> the size of devres.o from 206824 Bytes to 203912 Bytes.
Oh, maybe it is better to let devm_ioremap use MACRO, and user may more clear
to know that devm_ioremap is default is devm_ioremap*_nocache*.

Any suggestions?

Thanks
Yisheng Xie
> 
> Signed-off-by: Yisheng Xie 
> ---
>  include/linux/io.h |  3 +--
>  lib/devres.c   | 29 -
>  2 files changed, 1 insertion(+), 31 deletions(-)
> 
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 32e30e8..1516ccd 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -75,14 +75,13 @@ static inline void devm_ioport_unmap(struct device *dev, 
> void __iomem *addr)
>  
>  void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
>  resource_size_t size);
> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t 
> offset,
> -resource_size_t size);
>  void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset,
>  resource_size_t size);
>  void devm_iounmap(struct device *dev, void __iomem *addr);
>  int check_signature(const volatile void __iomem *io_addr,
>   const unsigned char *signature, int length);
>  void devm_ioremap_release(struct device *dev, void *res);
> +#define devm_ioremap_nocache devm_ioremap
>  
>  void *devm_memremap(struct device *dev, resource_size_t offset,
>   size_t size, unsigned long flags);
> diff --git a/lib/devres.c b/lib/devres.c
> index 5f2aedd..f818fcf 100644
> --- a/lib/devres.c
> +++ b/lib/devres.c
> @@ -44,35 +44,6 @@ void __iomem *devm_ioremap(struct device *dev, 
> resource_size_t offset,
>  EXPORT_SYMBOL(devm_ioremap);
>  
>  /**
> - * devm_ioremap_nocache - Managed ioremap_nocache()
> - * @dev: Generic device to remap IO address for
> - * @offset: Resource address to map
> - * @size: Size of map
> - *
> - * Managed ioremap_nocache().  Map is automatically unmapped on driver
> - * detach.
> - */
> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t 
> offset,
> -resource_size_t size)
> -{
> - void __iomem **ptr, *addr;
> -
> - ptr = devres_alloc(devm_ioremap_release, sizeof(*ptr), GFP_KERNEL);
> - if (!ptr)
> - return NULL;
> -
> - addr = ioremap_nocache(offset, size);
> - if (addr) {
> - *ptr = addr;
> - devres_add(dev, ptr);
> - } else
> - devres_free(ptr);
> -
> - return addr;
> -}
> -EXPORT_SYMBOL(devm_ioremap_nocache);
> -
> -/**
>   * devm_ioremap_wc - Managed ioremap_wc()
>   * @dev: Generic device to remap IO address for
>   * @offset: Resource address to map
> 



Re: [PATCH] devres: use MACRO instead of function for devm_ioremap_nocache

2017-11-20 Thread Yisheng Xie


On 2017/11/20 17:32, Yisheng Xie wrote:
> Default ioremap is ioremap_nocache, so devm_ioremap has the same function
> with devm_ioremap_nocache, which may just be killed. However, there
> are many places which use devm_ioremap_nocache instead of devm_ioremap.
> 
> This patch is to use MACRO for devm_ioremap_nocache, which will reduce
> the size of devres.o from 206824 Bytes to 203912 Bytes.
Oh, maybe it is better to let devm_ioremap use MACRO, and user may more clear
to know that devm_ioremap is default is devm_ioremap*_nocache*.

Any suggestions?

Thanks
Yisheng Xie
> 
> Signed-off-by: Yisheng Xie 
> ---
>  include/linux/io.h |  3 +--
>  lib/devres.c   | 29 -
>  2 files changed, 1 insertion(+), 31 deletions(-)
> 
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 32e30e8..1516ccd 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -75,14 +75,13 @@ static inline void devm_ioport_unmap(struct device *dev, 
> void __iomem *addr)
>  
>  void __iomem *devm_ioremap(struct device *dev, resource_size_t offset,
>  resource_size_t size);
> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t 
> offset,
> -resource_size_t size);
>  void __iomem *devm_ioremap_wc(struct device *dev, resource_size_t offset,
>  resource_size_t size);
>  void devm_iounmap(struct device *dev, void __iomem *addr);
>  int check_signature(const volatile void __iomem *io_addr,
>   const unsigned char *signature, int length);
>  void devm_ioremap_release(struct device *dev, void *res);
> +#define devm_ioremap_nocache devm_ioremap
>  
>  void *devm_memremap(struct device *dev, resource_size_t offset,
>   size_t size, unsigned long flags);
> diff --git a/lib/devres.c b/lib/devres.c
> index 5f2aedd..f818fcf 100644
> --- a/lib/devres.c
> +++ b/lib/devres.c
> @@ -44,35 +44,6 @@ void __iomem *devm_ioremap(struct device *dev, 
> resource_size_t offset,
>  EXPORT_SYMBOL(devm_ioremap);
>  
>  /**
> - * devm_ioremap_nocache - Managed ioremap_nocache()
> - * @dev: Generic device to remap IO address for
> - * @offset: Resource address to map
> - * @size: Size of map
> - *
> - * Managed ioremap_nocache().  Map is automatically unmapped on driver
> - * detach.
> - */
> -void __iomem *devm_ioremap_nocache(struct device *dev, resource_size_t 
> offset,
> -resource_size_t size)
> -{
> - void __iomem **ptr, *addr;
> -
> - ptr = devres_alloc(devm_ioremap_release, sizeof(*ptr), GFP_KERNEL);
> - if (!ptr)
> - return NULL;
> -
> - addr = ioremap_nocache(offset, size);
> - if (addr) {
> - *ptr = addr;
> - devres_add(dev, ptr);
> - } else
> - devres_free(ptr);
> -
> - return addr;
> -}
> -EXPORT_SYMBOL(devm_ioremap_nocache);
> -
> -/**
>   * devm_ioremap_wc - Managed ioremap_wc()
>   * @dev: Generic device to remap IO address for
>   * @offset: Resource address to map
> 



Re: [patches] Re: [PATCH] dt-bindings: Add a RISC-V SBI firmware node

2017-11-20 Thread Jonathan Neuschäfer
On Mon, Nov 20, 2017 at 01:28:01PM -0800, Palmer Dabbelt wrote:
[...]
> > > +++ b/Documentation/devicetree/bindings/firmware/riscv.sbi.txt
> > 
> > Nit: Other bindings use either a comma (as in the compatible string,
> > "riscv,sbi.txt") or a dash (vendor-product.txt, "riscv-sbi.txt") in the
> > file name.
> 
> That was just a typo, I'll fix it.

Ok

> > > @@ -0,0 +1,20 @@
> > > +RISC-V Supervisor Binary Interface (SBI)
> > > +
> > > +The RISC-V privileged ISA specification mandates the presence of a 
> > > supervisor
> > > +binary interface that performs some operations which might otherwise 
> > > require
> > > +particularly complicated instructions.  This interface includes
> > > +inter-processor interrupts, TLB flushes, i-cache and TLB shootdowns, a
> > > +console, and power management.
> > > +
> > > +Required properties:
> > > +- compatible: must contain one of the following
> > > + * "riscv,sbi" for the SBI defined by the privileged specification of the
> > > +   system.
> > 
> > "of the system" seems to imply that different RISC-V systems (different
> > RISC-V implementations) can have different privileged specifications.
> 
> Actually, that was intentional -- I wrote it this way because different
> RISC-V systems do have different privileged specifications.  The RISC-V
> specifications aren't frozen in time, they're just guaranteed to be
> compatible in the future.  For example, the user ISA document has been
> updated multiple times (the C spec, eliminating some unspecified behavior)
> and will continue to be updated (V and other extensions, the memory model).
> The privileged spec will be updated in a compatible way just like the user
> spec will be -- I know there's at least hypervisor support in the works, and
> I saw some things to remove undefined behavior go past as well.
> 
> In a similar fashion, the ABI and SBI will continue to evolve.  For example,
> we'll probably add new system calls to extend the user ABI and new hyper
> calls to extend the SBI.

My problem with the wording was that the OS somehow has to know which
version and variant of the SBI it is talking to -- either through
in-band communication (an SBI call to request SBI information, etc.), or
through devicetree or similar mechanisms.

> 
> > I think it's better to refer to concrete documents, that don't depend on
> > the rest of the system, instead. Either:


Thanks,
Jonathan Neuschäfer


signature.asc
Description: PGP signature


Re: [patches] Re: [PATCH] dt-bindings: Add a RISC-V SBI firmware node

2017-11-20 Thread Jonathan Neuschäfer
On Mon, Nov 20, 2017 at 01:28:01PM -0800, Palmer Dabbelt wrote:
[...]
> > > +++ b/Documentation/devicetree/bindings/firmware/riscv.sbi.txt
> > 
> > Nit: Other bindings use either a comma (as in the compatible string,
> > "riscv,sbi.txt") or a dash (vendor-product.txt, "riscv-sbi.txt") in the
> > file name.
> 
> That was just a typo, I'll fix it.

Ok

> > > @@ -0,0 +1,20 @@
> > > +RISC-V Supervisor Binary Interface (SBI)
> > > +
> > > +The RISC-V privileged ISA specification mandates the presence of a 
> > > supervisor
> > > +binary interface that performs some operations which might otherwise 
> > > require
> > > +particularly complicated instructions.  This interface includes
> > > +inter-processor interrupts, TLB flushes, i-cache and TLB shootdowns, a
> > > +console, and power management.
> > > +
> > > +Required properties:
> > > +- compatible: must contain one of the following
> > > + * "riscv,sbi" for the SBI defined by the privileged specification of the
> > > +   system.
> > 
> > "of the system" seems to imply that different RISC-V systems (different
> > RISC-V implementations) can have different privileged specifications.
> 
> Actually, that was intentional -- I wrote it this way because different
> RISC-V systems do have different privileged specifications.  The RISC-V
> specifications aren't frozen in time, they're just guaranteed to be
> compatible in the future.  For example, the user ISA document has been
> updated multiple times (the C spec, eliminating some unspecified behavior)
> and will continue to be updated (V and other extensions, the memory model).
> The privileged spec will be updated in a compatible way just like the user
> spec will be -- I know there's at least hypervisor support in the works, and
> I saw some things to remove undefined behavior go past as well.
> 
> In a similar fashion, the ABI and SBI will continue to evolve.  For example,
> we'll probably add new system calls to extend the user ABI and new hyper
> calls to extend the SBI.

My problem with the wording was that the OS somehow has to know which
version and variant of the SBI it is talking to -- either through
in-band communication (an SBI call to request SBI information, etc.), or
through devicetree or similar mechanisms.

> 
> > I think it's better to refer to concrete documents, that don't depend on
> > the rest of the system, instead. Either:


Thanks,
Jonathan Neuschäfer


signature.asc
Description: PGP signature


Re: [PATCH 2/3] x86/apic: Update the 'apic=' description of setting APIC driver

2017-11-20 Thread Dou Liyang

Hi Randy,

At 11/21/2017 01:21 AM, Randy Dunlap wrote:

On 11/20/2017 05:27 AM, Dou Liyang wrote:

There are two consumers of apic=: the APIC debug level and the low
level generic architecture code, but Linux just documented the first
one.

Append the second description.

Signed-off-by: Dou Liyang 
---
 Documentation/admin-guide/kernel-parameters.txt | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 62436bd..fdd2382 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -328,11 +328,14 @@
not play well with APC CPU idle - disable it if you have
APC and your system crashes randomly.

-   apic=   [APIC,X86-32] Advanced Programmable Interrupt Controller
+   apic=   [APIC,X86] Advanced Programmable Interrupt Controller
Change the output verbosity whilst booting
Format: { quiet (default) | verbose | debug }
Change the amount of debugging information output
when initialising the APIC and IO-APIC components.
+   If in X86-32, one more function is registering APIC
+   driver.
+   Examples: apic=bigsmp, bigsmp is an APIC driver name.


How about this?

For X86-32, this can also be used to specify an APIC
driver name.
Format: apic=drivername
Example: apic=bigsmp



Pretty good! Will use it in the next version.

Thanks.
dou.


apic_extnmi=[APIC,X86] External NMI delivery setting
Format: { bsp (default) | all | none }









Re: [PATCH 2/3] x86/apic: Update the 'apic=' description of setting APIC driver

2017-11-20 Thread Dou Liyang

Hi Randy,

At 11/21/2017 01:21 AM, Randy Dunlap wrote:

On 11/20/2017 05:27 AM, Dou Liyang wrote:

There are two consumers of apic=: the APIC debug level and the low
level generic architecture code, but Linux just documented the first
one.

Append the second description.

Signed-off-by: Dou Liyang 
---
 Documentation/admin-guide/kernel-parameters.txt | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 62436bd..fdd2382 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -328,11 +328,14 @@
not play well with APC CPU idle - disable it if you have
APC and your system crashes randomly.

-   apic=   [APIC,X86-32] Advanced Programmable Interrupt Controller
+   apic=   [APIC,X86] Advanced Programmable Interrupt Controller
Change the output verbosity whilst booting
Format: { quiet (default) | verbose | debug }
Change the amount of debugging information output
when initialising the APIC and IO-APIC components.
+   If in X86-32, one more function is registering APIC
+   driver.
+   Examples: apic=bigsmp, bigsmp is an APIC driver name.


How about this?

For X86-32, this can also be used to specify an APIC
driver name.
Format: apic=drivername
Example: apic=bigsmp



Pretty good! Will use it in the next version.

Thanks.
dou.


apic_extnmi=[APIC,X86] External NMI delivery setting
Format: { bsp (default) | all | none }









Re: [PATCH] mm, meminit: Serially initialise deferred memory if trace_buf_size is specified

2017-11-20 Thread Andrew Morton
On Fri, 17 Nov 2017 13:19:56 -0500 Pavel Tatashin  
wrote:

> On Thu, Nov 16, 2017 at 5:06 AM, Mel Gorman  
> wrote:
> > 4. Put a check into the page allocator slowpath that triggers serialised
> >init if the system is booting and an allocation is about to fail. It
> >would be such a cold path that it would never be noticable although it
> >would leave dead code in the kernel image once boot had completed
> 
> Hi Mel,
> 
> The forth approach is the best as it is seamless for admins and
> engineers, it will also work on any system configuration with any
> parameters without any special involvement.

Apart from what-mel-said, I'd be concerned that this failsafe would
almost never get tested.  We should find some way to ensure that this
code gets exercised in some people's kernels on a permanent basis and
I'm not sure how to do that.

One option might be to ask Fengguang to add the occasional
test_pavels_stuff=1 to the kernel boot commandline.  That's better
than nothing but 0-day only runs on a small number of machine types.


Re: [PATCH] mm, meminit: Serially initialise deferred memory if trace_buf_size is specified

2017-11-20 Thread Andrew Morton
On Fri, 17 Nov 2017 13:19:56 -0500 Pavel Tatashin  
wrote:

> On Thu, Nov 16, 2017 at 5:06 AM, Mel Gorman  
> wrote:
> > 4. Put a check into the page allocator slowpath that triggers serialised
> >init if the system is booting and an allocation is about to fail. It
> >would be such a cold path that it would never be noticable although it
> >would leave dead code in the kernel image once boot had completed
> 
> Hi Mel,
> 
> The forth approach is the best as it is seamless for admins and
> engineers, it will also work on any system configuration with any
> parameters without any special involvement.

Apart from what-mel-said, I'd be concerned that this failsafe would
almost never get tested.  We should find some way to ensure that this
code gets exercised in some people's kernels on a permanent basis and
I'm not sure how to do that.

One option might be to ask Fengguang to add the occasional
test_pavels_stuff=1 to the kernel boot commandline.  That's better
than nothing but 0-day only runs on a small number of machine types.


Re: [v2,12/18] kbuild: add support for clang LTO

2017-11-20 Thread Nicholas Piggin
On Mon, 20 Nov 2017 12:21:52 -0800
Sami Tolvanen  wrote:

> On Sat, Nov 18, 2017 at 01:21:39PM +1000, Nicholas Piggin wrote:
> > Do you have any kind of numbers for this, out of curiosity? Binary
> > size, performance, build time?  
> 
> I don't have performance numbers to share. Are there any specific
> benchmarks you'd be interested in seeing? Build time typically
> increases with LTO and in my experience, binary size tends to increase
> by ~10-15% as well.

By deduction, then you must see some performance improvement? :) 

I just wonder are you doing this because there is some worthwhile
performance gain? Or to enable more testing and development of LTO?
Any clues for why a user would want to enable it.

> 
> > Why is this needed? It would have been nice to get rid of the
> > !THIN_ARCHIVES option if you can make the patches work with the thin
> > archives paths.  
> 
> I believe LLVMgold doesn't know how to deal with an archive of LLVM IR
> files, but I can certainly use thin archives as an index and extract
> the path names for linking. I'll look into it.

Thanks, if you could. Possibly file a request with LLVMgold too, it
seems to be that toolchain support for archives is quite strong, so it
will be good to keep pushing for that.

Thanks,
Nick


Re: [v2,12/18] kbuild: add support for clang LTO

2017-11-20 Thread Nicholas Piggin
On Mon, 20 Nov 2017 12:21:52 -0800
Sami Tolvanen  wrote:

> On Sat, Nov 18, 2017 at 01:21:39PM +1000, Nicholas Piggin wrote:
> > Do you have any kind of numbers for this, out of curiosity? Binary
> > size, performance, build time?  
> 
> I don't have performance numbers to share. Are there any specific
> benchmarks you'd be interested in seeing? Build time typically
> increases with LTO and in my experience, binary size tends to increase
> by ~10-15% as well.

By deduction, then you must see some performance improvement? :) 

I just wonder are you doing this because there is some worthwhile
performance gain? Or to enable more testing and development of LTO?
Any clues for why a user would want to enable it.

> 
> > Why is this needed? It would have been nice to get rid of the
> > !THIN_ARCHIVES option if you can make the patches work with the thin
> > archives paths.  
> 
> I believe LLVMgold doesn't know how to deal with an archive of LLVM IR
> files, but I can certainly use thin archives as an index and extract
> the path names for linking. I'll look into it.

Thanks, if you could. Possibly file a request with LLVMgold too, it
seems to be that toolchain support for archives is quite strong, so it
will be good to keep pushing for that.

Thanks,
Nick


<    1   2   3   4   5   6   7   8   9   10   >